thp's Avatar
Posts: 1,391 | Thanked: 4,272 times | Joined on Sep 2007 @ Vienna, Austria
#78
I'm not sure whether SMS encoding adds anything on top of this (each part of a multi-part SMS has less room for text than a single standalone SMS), but in general, for UCS-2 vs UTF-16:

UCS-2 can only encode codepoints from 0 to 2^16 - 1 (U+0000 to U+FFFF); that's the reason why some Emoji characters, which lie outside this range, don't work. Each character always takes up 16 bits (2 bytes). UCS-2 is defined as big endian, so usually no byte order mark is necessary(?).

UTF-16 can encode all Unicode codepoints, using 16 bits (2 bytes) if the codepoint fits, or 32 bits (4 bytes, a "surrogate pair" of two 16-bit units) otherwise. UTF-16 can be big endian or little endian, so a byte order mark might be necessary (not sure if the endianness is defined for SMS and UTF-16/UCS-2).
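To make the size difference concrete, here is a small Python sketch using only the standard library codecs. The helper names (utf16_units, fits_in_ucs2) and the two sample characters are my own picks for illustration, not something from the README:

```python
import codecs


def utf16_units(text: str) -> int:
    """Number of 16-bit code units needed to encode text as UTF-16."""
    return len(text.encode("utf-16-be")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every codepoint lies in U+0000..U+FFFF (the UCS-2 range)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


euro = "\u20ac"        # EURO SIGN, inside the 16-bit range
emoji = "\U0001F600"   # GRINNING FACE, outside the 16-bit range

print(euro.encode("utf-16-be").hex())    # 20ac      (one 16-bit unit)
print(emoji.encode("utf-16-be").hex())   # d83dde00  (surrogate pair, big endian)
print(emoji.encode("utf-16-le").hex())   # 3dd800de  (same pair, little endian)

# The plain "utf-16" codec prepends a byte order mark in native byte order.
bom = "x".encode("utf-16")[:2]
print(bom in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))  # True
```

So the Euro sign is fine in both encodings, while the emoji needs two 16-bit units and cannot be expressed in UCS-2 at all. The big endian and little endian outputs also show why the byte order has to be known (or marked) somehow.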

The README file contains some links to SMS encoding webpages related to UTF-16/UCS-2.
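On the multi-SMS point from the top of this post, a rough sketch of how the part count could be estimated. The two limits are assumptions on my side (the commonly quoted figures of 140 bytes of user data per SMS and a 6-byte concatenation header per part), not something I checked against the README links, and sms_parts is just a name made up for this example:

```python
# Assumed limits, counted in 16-bit units (not taken from this thread):
SINGLE_SMS_UNITS = 70   # 140 bytes / 2 bytes per unit
MULTI_PART_UNITS = 67   # (140 - 6 header bytes) // 2


def sms_parts(text: str) -> int:
    """Estimate how many SMS parts a 16-bit encoded text needs."""
    units = len(text.encode("utf-16-be")) // 2
    if units <= SINGLE_SMS_UNITS:
        return 1
    # Ceiling division; ignores that a surrogate pair must not be
    # split across two parts, so this can undercount in edge cases.
    return -(-units // MULTI_PART_UNITS)


print(sms_parts("a" * 70))           # 1
print(sms_parts("a" * 71))           # 2
print(sms_parts("\U0001F600" * 36))  # 2 (72 units, each emoji counts twice)
```

Note that characters outside the 16-bit range count as two units each, so a message full of such Emoji reaches the limit twice as fast.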