Reference for unit 'LazUTF8' (#lazutils)

UTF8CharacterLength (deprecated)

Returns the number of bytes needed for the UTF-8 codepoint starting at p.

Declaration

Source position: lazutf8.pas line 77

function UTF8CharacterLength(
  p: PChar
): Integer;

Arguments

p

Pointer to the first byte of the UTF-8 codepoint to examine. Can be nil.

Function result

Number of bytes (1..4) required for the UTF-8 codepoint, or 0 (zero) if p is nil.

Description

Remark: Deprecated. Use UTF8CodepointSize instead.
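A minimal migration sketch, assuming a project that already depends on the LazUtils package and that UTF8CodepointSize accepts the same PChar argument as the deprecated routine. The program name and the sample string are made up for illustration.

```pascal
program MigrateCodepointSize;

{$mode objfpc}{$H+}

uses
  LazUTF8; // provided by the LazUtils package

var
  S: string;
  P: PChar;
begin
  // 'a' followed by U+00E9, written as its two UTF-8 bytes C3 A9.
  S := 'a'#$C3#$A9;
  P := PChar(S);

  // Before: UTF8CharacterLength(P)   (deprecated)
  // After:  UTF8CodepointSize(P)     (same argument)
  WriteLn(UTF8CodepointSize(P));      // size of 'a'
  Inc(P, UTF8CodepointSize(P));
  WriteLn(UTF8CodepointSize(P));      // size of the C3 A9 sequence
end.
```

The string is written with byte escapes so that the example does not depend on the encoding of the source file.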

UTF8CharacterLength returns 0 if p is nil. It returns 1 if p points to a 1-byte UTF-8 codepoint or to an invalid UTF-8 sequence. Otherwise it returns a number in the range 2..4.

It does not check for malicious (overlong) encodings like #$c0#$80, nor for undefined codepoints like #$f3#$a0#$87#$b9.

Use UTF8CharacterLength to step through a string with a simple loop:

while p^ <> #0 do
begin
  inc(p, UTF8CharacterLength(p));
end;

Even if the string contains invalid UTF-8 sequences, the loop runs through to the terminating #0 without overflow, because each invalid sequence advances p by a single byte.
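The loop can be tried without LazUtils using the standalone sketch below. SketchCodepointSize is a hypothetical stand-in written only from the rules described on this page; it is not the LazUTF8 source, and real code would call the library routine instead. The sample string deliberately contains a truncated 3-byte sequence.

```pascal
program StepThroughSketch;

{$mode objfpc}{$H+}

// Hypothetical stand-in that follows the documented rules:
// 0 for nil, 1 for a single byte or an invalid sequence, else 2..4.
function SketchCodepointSize(p: PChar): Integer;
var
  Expected, i: Integer;
begin
  if p = nil then
    Exit(0);
  case Ord(p^) of
    $C0..$DF: Expected := 2;
    $E0..$EF: Expected := 3;
    $F0..$F7: Expected := 4;
  else
    Exit(1); // ASCII, stray continuation byte, or invalid lead byte
  end;
  // Every following byte must look like 10xxxxxx. The terminating #0
  // fails this test, so the result never reaches past the end.
  for i := 1 to Expected - 1 do
    if (Ord(p[i]) and $C0) <> $80 then
      Exit(1);
  Result := Expected;
end;

var
  S: string;
  P: PChar;
  Steps: Integer;
begin
  // 'a', a valid 2-byte sequence, a truncated 3-byte lead byte, then 'b'.
  S := 'a'#$C3#$A9#$E2'b';
  P := PChar(S);
  Steps := 0;
  while P^ <> #0 do
  begin
    Inc(P, SketchCodepointSize(P));
    Inc(Steps);
  end;
  WriteLn('steps: ', Steps, ', bytes: ', Length(S)); // steps: 4, bytes: 5
end.
```

The lone #$E2 byte is counted as one step of one byte, so the five bytes are covered in four steps and the loop stops exactly at the terminator.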

See also

UTF8CharacterStrictLength

Returns the length in bytes (1..4) for a valid UTF-8 character. Otherwise 0.


Version 3.2 Generated 2024-02-25