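1. Initialize a variable `n_bytes` to 0, which will hold the number of bytes in the current UTF-8 character.
2. If `n_bytes` is 0, the current byte starts a new character: determine the number of bytes in the UTF-8 character by counting the leading `1` bits using bitwise operations.
3. If the number of leading `1` bits is greater than 4, or the byte starts with `10`, it's invalid: return `False`.
4. While continuation bytes are still expected for a multi-byte character, check that each following byte starts with `10` using bitwise operations. If not, return `False`.
5. Decrement `n_bytes` for each continuation byte processed.
6. After all bytes are processed, if `n_bytes` is 0, return `True`. Otherwise, return `False`.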
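
The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the input is a list of integers, each in the range 0 to 255, and the names `valid_utf8` and `leading_ones` are illustrative rather than taken from the text. In this version `n_bytes` tracks only the continuation bytes still expected after the lead byte, and the check covers bit patterns only (it does not reject overlong encodings or surrogate code points).

```python
def valid_utf8(data):
    """Return True if the byte values in data follow UTF-8 bit patterns.

    Assumes data is a list of ints, each in the range 0..255.
    """
    n_bytes = 0  # continuation bytes still expected for the current character

    for byte in data:
        if n_bytes == 0:
            # Start of a new character: count the leading 1 bits by
            # walking a single-bit mask down from the most significant bit.
            mask = 0b10000000
            leading_ones = 0
            while mask & byte:
                leading_ones += 1
                mask >>= 1

            if leading_ones == 0:
                continue  # 0xxxxxxx: a complete single-byte character

            # Exactly one leading 1 means the byte starts with 10, which is
            # a continuation byte appearing where a lead byte is required.
            if leading_ones == 1 or leading_ones > 4:
                return False

            n_bytes = leading_ones - 1
        else:
            # A continuation byte must have the form 10xxxxxx.
            if (byte & 0b11000000) != 0b10000000:
                return False
            n_bytes -= 1

    # A leftover count means the last character was cut short.
    return n_bytes == 0
```

For example, `valid_utf8([197, 130, 1])` returns `True` (a two-byte character followed by a one-byte character), while `valid_utf8([235, 140, 4])` returns `False` because the third byte does not start with `10`.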