UCS2BEToUTF8

Converts a string from the 2-byte UCS2 BE (Big Endian) encoding to UTF-8.

Declaration

Source position: lconvencoding.pas line 166

function UCS2BEToUTF8(const s: string): string;

Arguments

s

String value using UCS2 BE encoding.

Function result

String value after conversion to UTF-8 encoding.

Description

UCS2BEToUTF8 is a String function used to convert a value encoded using UCS2 BE (Big Endian) to its UTF-8 encoding. UCS2 is a fixed-length encoding where each character is represented using 2 bytes (16 bits). In Big Endian byte order, the most significant byte of each character is stored first.

UCS2BEToUTF8 iterates over the characters in the string value, and converts each character to the variable length multi-byte encoding used for characters in UTF-8. BEToN is called to convert the byte values to the byte order used for the platform. The UnicodeToUTF8SkipErrors routine in lazutf8.pas is called to handle code points which are malformed, require translation or are not used in UTF-8.
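The loop described above can be sketched as a small stand-alone Free Pascal program. This is a minimal illustration and not the LCL source: SketchUCS2BEToUTF8 is a hypothetical name, the UTF-8 bytes are built by hand instead of calling UnicodeToUTF8SkipErrors, and dropping surrogate values is an assumption made only for this sketch.

```pascal
program UCS2BESketch;

{$mode objfpc}{$H+}

uses
  SysUtils;  // IntToHex

// Hypothetical stand-in for the LCL routine, written for illustration only.
function SketchUCS2BEToUTF8(const s: string): string;
var
  i, len, n: Integer;
  c: Word;
begin
  Result := '';
  len := Length(s) div 2;            // number of 2-byte characters
  if len = 0 then
    Exit;
  SetLength(Result, len * 3);        // worst case: 3 UTF-8 bytes per character
  n := 0;
  for i := 0 to len - 1 do
  begin
    Move(s[2 * i + 1], c, 2);        // copy the raw big-endian byte pair
    c := BEtoN(c);                   // convert to the platform byte order
    if c < $80 then
    begin                            // 1 byte: 0xxxxxxx
      Inc(n); Result[n] := Chr(Byte(c));
    end
    else if c < $800 then
    begin                            // 2 bytes: 110xxxxx 10xxxxxx
      Inc(n); Result[n] := Chr(Byte($C0 or (c shr 6)));
      Inc(n); Result[n] := Chr(Byte($80 or (c and $3F)));
    end
    else if (c >= $D800) and (c <= $DFFF) then
      Continue                       // assumption: drop surrogate values
    else
    begin                            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      Inc(n); Result[n] := Chr(Byte($E0 or (c shr 12)));
      Inc(n); Result[n] := Chr(Byte($80 or ((c shr 6) and $3F)));
      Inc(n); Result[n] := Chr(Byte($80 or (c and $3F)));
    end;
  end;
  SetLength(Result, n);              // trim to the bytes actually written
end;

const
  // "A" (U+0041), "é" (U+00E9), "€" (U+20AC) in UCS2 BE
  Raw: array[0..5] of Byte = ($00, $41, $00, $E9, $20, $AC);
var
  src, dst: string;
  i: Integer;
begin
  SetLength(src, Length(Raw));
  Move(Raw[0], src[1], Length(Raw));
  dst := SketchUCS2BEToUTF8(src);
  for i := 1 to Length(dst) do
    Write(IntToHex(Ord(dst[i]), 2), ' ');
  WriteLn;                           // expected bytes: 41 C3 A9 E2 82 AC
end.
```

The three input characters need 1, 2 and 3 bytes respectively in UTF-8, which shows why the output buffer is sized at three bytes per 2-byte character.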

An exception is raised in UCS2BEToUTF8 if the converted string is longer than 1.5 times the original string length. A 2-byte character needs at most 3 bytes in UTF-8, so a valid conversion never exceeds this limit.

The return value is cast to a RawByteString type, and SetCodePage is called to set the code page to CP_UTF8 (65001) in the result.

When s is an empty string (''), no conversion is performed and the return value is an empty string.
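A short usage sketch follows. It assumes the LConvEncoding unit is on the unit search path (for example through the LazUtils package); the program name and the sample bytes are illustrative only.

```pascal
program UCS2BEDemo;

{$mode objfpc}{$H+}

uses
  LConvEncoding;  // assumed to be available, e.g. via the LazUtils package

const
  // "Hi" encoded as UCS2 BE: U+0048, U+0069
  Raw: array[0..3] of Byte = ($00, $48, $00, $69);
var
  src, dst: string;
begin
  // Build the input from raw bytes so no source-file encoding is involved.
  SetLength(src, Length(Raw));
  Move(Raw[0], src[1], Length(Raw));

  dst := UCS2BEToUTF8(src);
  WriteLn(dst);

  // An empty input is documented to produce an empty result.
  WriteLn(UCS2BEToUTF8('') = '');
end.
```

Building the input from a byte array avoids any dependence on how the compiler interprets string literals in the source file.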

See also

UnicodeToUTF8SkipErrors - Stores a single Unicode codepoint as a UTF-8-encoded value in the buffer.
SetCodePage
BEToN
