Converts a UTF-8-encoded character to its unique Unicode U+XXXX character value.
Source position: lazutf8.pas line 88
function UTF8CodepointToUnicode(
  p: PChar;
  out CodepointLen: Integer
): Cardinal;
p
    The UTF-8-encoded string value.
CodepointLen
    Number of bytes needed for the codepoint.
Unicode character value for the UTF-8 character.
UTF8CodepointToUnicode is a Cardinal function used to convert a UTF-8-encoded character to its unique Unicode character value, conventionally written in hexadecimal as U+XXXX. For example, the letter 'A' (decimal 65) is expressed in Unicode as U+0041.
CodepointLen is an output variable used to store the number of UTF-8-encoded bytes needed for the codepoint. It will normally contain a value in the range 1..4 (the number of bytes a single UTF-8 sequence can occupy). It can contain 0 (zero) when p is an empty (nil) PChar value.
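The return value for the function contains the Unicode character value (code point) as a Cardinal data type. It can contain 0 (zero) when the value in p is not a valid UTF-8-encoded character. Because a #0 character also decodes to 0, a zero result on its own does not prove that the encoding is invalid.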
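The following is a minimal sketch of how the return value and CodepointLen might be used together to walk a string one codepoint at a time. The program name, variable names, and sample bytes are illustrative assumptions and are not taken from the LazUtils sources.

```pascal
program DecodeCodepoints;

{$mode objfpc}{$H+}

uses
  SysUtils, LazUTF8;

var
  S: string;
  P, PEnd: PChar;
  CodepointLen: Integer;
  CodePoint: Cardinal;
begin
  // Sample data written as explicit UTF-8 bytes (illustrative only):
  // 'A' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes).
  S := 'A' + #$C3#$A9 + #$E2#$82#$AC;

  P := PChar(S);
  PEnd := P + Length(S);
  while P < PEnd do
  begin
    CodePoint := UTF8CodepointToUnicode(P, CodepointLen);
    // The Int64 cast selects an unambiguous IntToHex overload.
    WriteLn('U+', IntToHex(Int64(CodePoint), 4),
            ' (', CodepointLen, ' byte(s))');
    // Defensive guard: stop if no bytes were consumed,
    // so the loop can never spin in place.
    if CodepointLen < 1 then
      Break;
    Inc(P, CodepointLen);
  end;
end.
```

For this sample the loop should report U+0041, U+00E9 and U+20AC, consuming 1, 2 and 3 bytes respectively. Advancing by CodepointLen rather than by a fixed step is what keeps the pointer aligned on codepoint boundaries.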
Use UTF8FixBroken to fix invalid UTF-8 encoding in the string.
Use UnicodeToUTF8 to convert a Unicode character value to its UTF-8-encoded value.
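As a hedged illustration of how the two conversions relate, the sketch below encodes a code point with UnicodeToUTF8 and decodes it again with UTF8CodepointToUnicode. The program and variable names are assumptions chosen for the example.

```pascal
program RoundTrip;

{$mode objfpc}{$H+}

uses
  LazUTF8;

var
  Encoded: string;
  Len: Integer;
  Decoded: Cardinal;
begin
  // Encode U+20AC (EURO SIGN) to its UTF-8 byte sequence.
  Encoded := UnicodeToUTF8($20AC);

  // Decode the first codepoint of that sequence again.
  Decoded := UTF8CodepointToUnicode(PChar(Encoded), Len);

  WriteLn('Encoded length in bytes: ', Length(Encoded));
  WriteLn('Bytes consumed by decode: ', Len);
  WriteLn('Round trip matches: ', Decoded = $20AC);
end.
```

For a valid code point the decoded value is expected to equal the original, and the number of bytes consumed should match the length of the encoded string.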
Remark: UTF8CodepointToUnicode does not check whether the codepoint is actually defined in Unicode tables.
Version 3.2 | Generated 2024-02-25