
Unicode Escape Sequences Across Various Languages and Platforms (including Supplementary Characters)

Unicode escape sequences across various languages and platforms (including Supplementary Characters), with working examples


(last updated: 2020-01-10 @ 14:15 EST / 2020-01-10 @ 19:15 UTC )

ATTENTION SQL SERVER CENTRAL READERS:
If the formatting below does not look correct, then please view the original post at:
https://SqlQuantumLeap.com/2019/06/26/unicode-escape-sequences-across-various-languages-and-platforms-including-supplementary-characters/
 
 
 
For convenience, you can easily navigate to this page using the following short URL:

https://bit.ly/UnicodeEscapeSequences

 

I often need to include Unicode-only characters in my scripts, posts, etc., and have found that including such characters directly can sometimes lead to problems when there are encoding “issues”. So, as much as possible I try to escape all Code Points above U+007F (value 127 in decimal), leaving me with a highly transportable / mostly risk-free document. But, this means that I need to know how to escape Unicode characters in various languages. After looking through the documentation for a number of languages and platforms, I have noticed that the descriptions can sometimes be misleading or at least unclear, and the examples, if any are provided, are nearly always showing standard ASCII characters such as an uppercase US English “A”. Very few show Unicode-only BMP Code Points, and even fewer show how to escape Supplementary Characters. Not showing examples of escaping Supplementary Characters is a problem because they can be trickier to escape, especially if the documentation is incomplete or misleading.

The purpose of this post is to correct the overall lack of examples. All of the examples shown below are actual working examples of creating both a Unicode-only BMP character (meaning a non-Supplementary Character that would require Unicode) and a Supplementary Character. Most examples include a link to an online demo, either on db<>fiddle (for database demos) or IDE One (for non-database demos), both very cool and handy sites.

I use the same two characters across all examples to hopefully make them all easier to understand. Those two characters are:

Unicode-only BMP Character

Tibetan Mark Gter Yig Mgo -Um Rnam Bcad Ma ( U+0F02 ) ༂

Encoding                                      | Binary / hex value | Integer / decimal value
UTF-8 {Bytes}                                 | E0,BC,82           | 224,188,130
UTF-16 LE (Little Endian) {Bytes}             | 020F               | N / A
UTF-16 / UTF-16 BE (Big Endian) {Code Units}  | 0F02               | 3842
UTF-32 {Code Point}                           | 000F02             | 3842

Supplementary Character

Alien Monster ( U+1F47E ) 👾

Encoding                                      | Binary / hex value | Integer / decimal value
UTF-8 {Bytes}                                 | F0,9F,91,BE        | 240,159,145,190
UTF-16 LE (Little Endian) {Bytes}             | 3DD8,7EDC          | N / A
UTF-16 / UTF-16 BE (Big Endian) {Code Units}  | D83D,DC7E          | 55357,56446
UTF-32 {Code Point}                           | 01F47E             | 128126
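
The byte and code unit values listed above can be re-derived mechanically. The following is a minimal sketch in Python (not one of the platforms covered in this post; the helper name `encodings_of` is made up for illustration, and `bytes.hex()` with a separator assumes Python 3.8 or newer). It only prints ASCII, so it should be safe to run in a console with encoding "issues":

```python
def encodings_of(code_point):
    """Return the hex form of one code point in each encoding listed above."""
    ch = chr(code_point)
    return {
        # One comma-separated entry per byte
        "UTF-8 bytes": ch.encode("utf-8").hex(",").upper(),
        # One comma-separated entry per 2-byte code unit
        "UTF-16 LE bytes": ch.encode("utf-16-le").hex(",", 2).upper(),
        "UTF-16 BE code units": ch.encode("utf-16-be").hex(",", 2).upper(),
        # The code point itself, padded to six hex digits
        "UTF-32 code point": format(code_point, "06X"),
    }


for cp in (0x0F02, 0x1F47E):
    print("U+%04X" % cp)
    for name, value in encodings_of(cp).items():
        print("   %-22s %s" % (name, value))
```

Calling `encodings_of()` with any other code point gives the values needed for the byte-oriented and code-unit-oriented syntaxes shown in the rest of this post.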

This post will be updated in the near future to include additional platforms and languages, such as: Oracle, DB2, R, Python, and VB.NET.






 

HTML, XHTML, and XML

Decimal: &#3842;
Hex:     &#x0F02;


Decimal: &#128126;
Hex:     &#x1F47E;
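
One quick way to confirm that the decimal and hex references above resolve to the same two characters is to decode them outside of a browser. This is only a sketch using Python's standard `html` module (the variable names are arbitrary), not a statement about how any particular browser or XML parser works:

```python
import html

# Numeric character references for U+0F02 (decimal and hex forms)
bmp_decimal = html.unescape("&#3842;")
bmp_hex = html.unescape("&#x0F02;")

# Numeric character references for U+1F47E (decimal and hex forms)
supplementary_decimal = html.unescape("&#128126;")
supplementary_hex = html.unescape("&#x1F47E;")

# ascii() keeps the output escaped, so nothing depends on console encoding
print(ascii(bmp_decimal), ascii(bmp_hex))
print(ascii(supplementary_decimal), ascii(supplementary_hex))
```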



 

Microsoft SQL Server (T-SQL)

SQL Server technically does not have character escape sequences, but you can still create characters from either byte sequences or Code Points by using the CHAR() and NCHAR() functions. We are only concerned with Unicode here, so we will only be using NCHAR().

All versions of SQL Server (at least since 2005, if not earlier):

SELECT N'T' + NCHAR(9) + N'A' + NCHAR(0x9) + N'B' AS [Single Decimal or Hex Digit],

       NCHAR(0xF02) AS [Code Point (from hex)],
       NCHAR(3842) AS [Code Point (from decimal)],

       -- We are passing in "values", _not_ "escape sequences"
       NCHAR(0x0000000000000000000000F02) AS [BINARY / hex "value"],
       NCHAR(0003842.999999999) AS [INT / decimal "value"];



-- The following syntaxes work regardless of the database's collation:
SELECT NCHAR(0xD83D) + NCHAR(0xDC7E) AS [UTF-16 Surrogate Pair (BINARY/hex)],
       NCHAR(55357) + NCHAR(56446) AS [UTF-16 Surrogate Pair (INT/decimal)],
       CONVERT(NVARCHAR(10), 0x3DD87EDC) AS [UTF-16LE bytes];

Starting with SQL Server 2012:

-- The following syntax only works if the database's default collation
--   supports Supplementary Characters (starting in SQL 2012), else the
--   NCHAR() function returns NULL:
SELECT NCHAR(0x1F47E) AS [UTF-32 (BINARY / hex)],
       NCHAR(128126) AS [UTF-32 (INT / decimal)];
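
When the database's collation does not support Supplementary Characters, the surrogate pair form is the one to use, which means converting a Code Point such as 0x1F47E into its two UTF-16 code units. The arithmetic is defined by UTF-16 itself; the sketch below shows it in Python, with function names that are purely illustrative:

```python
def to_surrogate_pair(code_point):
    """Split a Supplementary Character code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a Supplementary Character")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F47E)
print("hex: %04X %04X   decimal: %d %d" % (high, low, high, low))
print("round trip: U+%X" % from_surrogate_pair(high, low))
```

The two values it prints are the ones to pass to the pair of NCHAR() calls shown earlier in this section.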

Starting with SQL Server 2019:

-- Works if current database has a "_UTF8" default collation:
SELECT CONVERT(VARCHAR(10), 0xF09F91BE); -- UTF-8 bytes

-- Works regardless of database's default collation:
DECLARE @Temp TABLE
(
  [TheValue] VARCHAR(10) COLLATE Latin1_General_100_CI_AS_SC_UTF8 NOT NULL
);

INSERT INTO @Temp ([TheValue]) VALUES (0xF09F91BE); -- UTF-8 bytes

SELECT * FROM @Temp;

See SQL Server 2017 demo on db<>fiddle


See SQL Server 2019 / UTF-8 demo on db<>fiddle





 

MySQL

There is no Unicode character escape according to the “Special Character Escape Sequences” section of the String Literals documentation. And I did try the usual ones: \x, \X, \u, \U, and \U{}.

However, you can use a hex literal instead (see the Hexadecimal Literals documentation), optionally prefixed with a character set introducer such as _utf8mb4.

The other option is the CHAR() function, which has an optional USING clause for specifying the encoding.

Two different HEX notations:

SELECT _utf8mb4 0xF09F91BE AS "UTF-8 bytes in 0x notation",
       _utf8mb4 X'F09F91BE' AS "UTF-8 bytes in X'' notation",

       _utf32 0x1F47E AS "Code Point in 0x notation",
       _utf32 X'01F47E' AS "Code Point in X'' notation";

Introducers:

# BMP Character ( U+0F02  ):
SELECT _utf8 0xE0BC82,    # 3-byte (BMP-only) UTF-8
       _utf8mb4 0xE0BC82, # Full UTF-8
       _utf16 0xF02,      # UTF-16 (implied Big Endian)
       _utf16le 0x020F,   # UTF-16 Little Endian
       _utf32 0xF02;      # Code Point / UTF-32


# Supplementary Character ( U+1F47E ):
SELECT _utf16 0xD83DDC7E,   # UTF-16 (implied Big Endian) Surrogate Pair
       _utf16le 0x3DD87EDC, # UTF-16 Little Endian Surrogate Pair
       _utf32 0x1F47E;      # Code Point / UTF-32

CHAR() function:

# CHAR(0xHEX USING encoding) function:
SELECT CHAR(0xF09F91BE USING utf8mb4), # UTF-8 bytes
       CHAR(0xD83DDC7E USING utf16),   # UTF-16 (Big Endian) Surrogate Pair
       CHAR(0x3DD87EDC USING utf16le), # UTF-16 Little Endian Surrogate Pair
       CHAR(0x0001F47E USING utf32),   # Code Point / UTF-32
       CHAR(0x1F47E USING utf32);      # Code Point (implied leading zeros)
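
To sanity-check a hex literal before pasting it into a query, the same bytes can be decoded outside of MySQL. The sketch below uses Python, and the mapping from MySQL character set names to Python codec names (utf8mb4 to "utf-8", utf16 to "utf-16-be", utf16le to "utf-16-le", utf32 to "utf-32-be") is my assumption for illustration. The helper name `decode_hex` is made up:

```python
def decode_hex(hex_digits, encoding):
    """Decode a string of hex digits as bytes in the given Python codec."""
    return bytes.fromhex(hex_digits).decode(encoding)


utf8 = decode_hex("F09F91BE", "utf-8")
utf16_be = decode_hex("D83DDC7E", "utf-16-be")
utf16_le = decode_hex("3DD87EDC", "utf-16-le")
utf32 = decode_hex("0001F47E", "utf-32-be")

# Python does not imply leading zeros, so the short form is padded
# to 8 hex digits (4 bytes) before decoding.
utf32_short = decode_hex("1F47E".zfill(8), "utf-32-be")

for value in (utf8, utf16_be, utf16_le, utf32, utf32_short):
    print(ascii(value))  # each line shows the same escaped code point
```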

See MySQL 8.0 demo on db<>fiddle

See request to add capability of using U&'' escape syntax (same as what PostgreSQL uses): WL#3529: Unicode Escape Sequences (original request linked at the bottom of the “High Level Architecture” tab, BUG 10199)




 

PostgreSQL

The “String Constants With C-Style Escapes” and “String Constants With Unicode Escapes” sections of the Lexical Structure documentation describe the escape syntaxes used below:

SELECT E'TAB\x9TAB' AS "Single Byte", E'\xF0\x9F\x91\xBE' AS "UTF-8 bytes";

SELECT E'\u0F02' AS "Code Point",
       E'\uD83D\uDC7E' AS "UTF-16 Surrogate Pair",
       E'\U0000D83D\U0000DC7E' AS "UTF-16 Surrogate Pair via UTF-32",
       E'\U0001F47E' AS "UTF-32";

SELECT E'\U0010FFFF' AS "Highest UTF-32 Code Point";

SELECT U&'\0F02' AS "Code Point",
       U&'\D83D\DC7E' AS "UTF-16 Surrogate Pair",
       U&'\+00D83D\+00DC7E' AS "UTF-16 Surrogate Pair via UTF-32",
       U&'\+01F47E' AS "UTF-32";

See PostgreSQL 11 demo on db<>fiddle
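
Not every language that offers \u and \U treats a surrogate pair the way the PostgreSQL examples above do. As a point of comparison (Python is not otherwise covered in this post), the sketch below shows that Python accepts the same \u and \U forms but keeps a pair written via \u as two separate lone surrogates unless it is explicitly round-tripped through UTF-16:

```python
bmp = "\u0F02"                # four hex digits
supplementary = "\U0001F47E"  # eight hex digits
pair = "\uD83D\uDC7E"         # two lone surrogates in Python, not one character

# Round-tripping through UTF-16 with "surrogatepass" combines the pair.
combined = pair.encode("utf-16-le", "surrogatepass").decode("utf-16-le")

print(len(bmp), len(supplementary), len(pair), len(combined))
print(pair == supplementary, combined == supplementary)
```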




 

C#

C# is a Microsoft .NET language.

The “String Escape Sequences” section of the Strings (C# Programming Guide) documentation describes the following escape sequences:

Console.WriteLine(
    "One to Four hex digits via \\x: W\x9W, X\x09X, Y\x009Y, Z\x0009Z");
Console.WriteLine("");
Console.WriteLine("Always four hex digits via \\u: TAB\u0009TAB");
Console.WriteLine("");

Console.WriteLine("Unicode-only BMP character: (\\x) \x0F02  (\\u) \u0F02");
Console.WriteLine("");

Console.WriteLine(
    "Two UTF-16 Code Units (i.e. Surrogate Pair) via \\x: \xD83D\xDC7E");
Console.WriteLine(
    "Two UTF-16 Code Units (i.e. Surrogate Pair) via \\u: \uD83D\uDC7E");
Console.WriteLine("");

Console.WriteLine("Code Point / UTF-32 via \\U: \U00000F02");
Console.WriteLine("Code Point / UTF-32 via \\U: \U0001F47E");
Console.WriteLine("");

Console.WriteLine("Highest Code Point / UTF-32 via \\U: \U0010FFFF");

WARNING: be careful when using \x with fewer than 4 hex digits:

Console.WriteLine("-------------------");

Console.WriteLine("\\xA1 followed by a ...");
Console.WriteLine("..non-alphanumeric character ([space]): \xA1 A");
Console.WriteLine("..non-hex digit (Z): \xA1Z");
Console.WriteLine(
    "..hex digit, but intended to be used as itself (A): \xA1Ay, caramba!");
// \xA1Ay returns "ਚy" instead of "¡Ay" because \xA1A produces U+0A1A

Console.WriteLine(
    "\\x00A1 followed by a hex digit (A): \x00A1Aye aye, Captain!");
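
The problem comes from \x consuming as many hex digits as it can find (up to four). The following Python sketch models that greedy behavior with a regular expression so the effect can be seen in isolation. It is only a model of the behavior demonstrated above, not the C# compiler's actual lexer, and the function name is made up:

```python
import re


def expand_variable_length_x(text):
    """Expand each \\x escape greedily, consuming one to four hex digits."""
    return re.sub(
        r"\\x([0-9A-Fa-f]{1,4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


# Raw strings keep the backslashes, so the function sees the escape text itself.
# ascii() shows which code point each escape actually produced.
print(ascii(expand_variable_length_x(r"\xA1Z")))
print(ascii(expand_variable_length_x(r"\xA1Ay, caramba!")))
print(ascii(expand_variable_length_x(r"\x00A1Aye aye, Captain!")))
```

In the second call the "A" that was meant to be a letter is swallowed as a third hex digit, giving U+0A1A. Padding to four digits, as in the third call, leaves nothing for the escape to absorb.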

See C# demo on “IDE One”





 

F#

F# is a Microsoft .NET language.

See the “Remarks” section of the Strings documentation.

printfn "UNDOCUMENTED Decimal (NOT Octal) \\DDD requires 3 digits: TAB\9TAB\09TAB\009TAB";
printfn "\\DDD notation is ISO-8859-1 (U+0000 - U+00FF): {\128-\129-\144-\152-\160-\161}";
printfn "CHAR for \\DDD = (DDD %% 256); Max = \\999 (U+00E7): {\365-\621-\6210-\176-\100-\999-\1000}";
printfn "---------------------";

printfn "UNDOCUMENTED \\x only works with two hex digits: TAB\x9TAB\x090TAB";
printfn "\\x is ISO-8859-1: 0x80 = \x80, 0x81 = \x81, 0x90 = \x90, 0x9A = \x9A, 0x9F = \x9F";
printfn "\\x is _not_ creating UTF-8: \xE0\xBC\x82"; // UTF-8 bytes for U+0F02
printfn "---------------------";

printfn "UTF-16 via \\u: \u0F02"; // U+0F02
printfn "UTF-16 Surrogate Pair via \\u: \uD83D\uDC7E"; // U+1F47E
printfn "---------------------";

printfn "Code Point / UTF-32 via \\U: \U00000F02"; // U+0F02
printfn "Code Point / UTF-32 via \\U: \U0001F47E";

See F# demo on “IDE One”
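
The "DDD % 256" note in the \DDD lines above is easy to check by hand. The Python sketch below simply models that reported behavior (it is based on the observation in this post, not on the F# specification, and the function name is made up):

```python
def trigraph_char(ddd):
    """Model of the behavior reported above: the character is chr(DDD % 256)."""
    if not 0 <= ddd <= 999:
        raise ValueError("a trigraph has exactly three decimal digits")
    return chr(ddd % 256)


for ddd in (128, 365, 621, 999):
    print("\\%03d -> U+%04X" % (ddd, ord(trigraph_char(ddd))))
```

Under this model \365 and \621 both wrap around to the same character, and \999 lands on U+00E7, matching the maximum noted in the example.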





 

Microsoft Visual C++ / C-Style

The “Escape Sequences” and “Universal character names” sections of the String and Character Literals (C++) documentation describe the following escape sequences:

#include "stdafx.h"
#include <iostream>

int main()
{
    // In Command Prompt, run the following first to get this console app to return values:
    // CHCP 65001

    std::wcout << u8"\\11 and \\011: tab\11tabby\011tab" << u8"\n";
    std::wcout << u8"\\7, \\07, and \\007: bell\7bell\07bell\007bell" << u8"\n";
    std::wcout << u8"\\176 = \176 ; \\177 = \177 ; \\200 = \200 ; \\237 = \237" << u8"\n";
    std::wcout << u8"\\242 = \242 ; \\377 = \377 ; \\777 = \777" << u8"\n"; // \777 == \u01FF
    std::wcout << u8"-------------------------------" << u8"\n";


    std::wcout << u8"\\x works with 1 or 2 hex digits: TAB\x9TAB\x09TAB" << u8"\n";
    std::wcout << u8"\\x works with 3 or 4 hex digits: Yadda\xA1Yadda\xA1AYadda\xA1AAYadda" << u8"\n";
    std::wcout << u8"\\x is ISO-8859-1: 0x80 = \x80, 0x81 = \x81, 0x90 = \x90, 0x9A = \x9A, 0x9F = \x9F" << u8"\n";
    std::wcout << u8"\\x is _not_ creating UTF-8: \xE0\xBC\x82" << u8"\n"; // UTF-8 bytes for U+0F02
    std::wcout << u8"-------------------------------" << u8"\n";


    std::wcout << u8"BMP Code Point / UTF-16 via \\u: \u0F02" << u8"\n";
    //std::wcout << L"UTF-16 Surrogate Pair via \\u: \uD83D\uDC7E" << L"\n"; // U+1F47E // compile error
    //std::wcout << u"UTF-16 Surrogate Pair via \\u: \uD83D\uDC7E" << u"\n"; // U+1F47E // compile error
    //std::wcout << u8"UTF-16 Surrogate Pair via \\u: \uD83D\uDC7E" << u8"\n"; // U+1F47E // compile error
    std::wcout << u8"-------------------------------" << u8"\n";


    std::wcout << u8"Code Point / UTF-32 via \\U: \U00000F02" << u8"\n";
    std::wcout << u8"Code Point / UTF-32 via \\U: \U0001F47E" << u8"\n";
    std::wcout << u8"Code Point / UTF-32 via \\U: \U0010FFFF" << u8"\n";
    //std::wcout << u8"Code Point / UTF-32 via \\U: \U00110000" << u8"\n";  // compile error
    std::wcout << u8"-------------------------------" << u8"\n";

    return 0;
}

I could not get the example code shown above to run on “IDE One”, but it did work as expected when compiled in Visual Studio, as a console app, and run from a Command Prompt.

NOTE: Be sure to run the following in a Command Prompt first if you are going to run the example shown above (it sets the code page to UTF-8):

C:\>CHCP 65001




 

C

printf("\\x can escape a single hex digit: TAB\x9TAB");
printf("\n\n");

printf("Three UTF-8 bytes via \\x: \xE0\xBC\x82");    // U+0F02
printf("\n");
printf("Four UTF-8 bytes via \\x: \xF0\x9F\x91\xBE"); // U+1F47E

printf("\n\n");

printf("The \\U syntax requires 8 hex digits (first two are always 0):\n");
printf("Code Point / UTF-32 via \\U: \U00000F02");
printf("\n");
printf("Code Point / UTF-32 via \\U: \U0001F47E");

See C demo on “IDE One”




 

PHP

The “Double quoted” section of the String documentation states that you can use the following sequences in double quoted, not single quoted, strings:

All versions of PHP:

echo "PHP version: ".phpversion()."\n\n";

echo "The following should work in all PHP versions:\n";
echo "\\x can escape a single hex digit: TAB\x9TAB";
echo "\n\n";

echo "Three UTF-8 bytes via \\x: \xE0\xBC\x82";    # U+0F02
echo "\n";
echo "Four UTF-8 bytes via \\x: \xF0\x9F\x91\xBE"; # U+1F47E
echo "\n\n";

echo "Octal notation is \\888 where '888' = 1 - 3 octal digits (values 0 - 7; range 0 - 377):";
echo "\n";
echo "\\11 and \\011: tab\11tabby\011tab";
echo "\n";
echo "\\7, \\07, and \\007: bell\7bell\07bell\007bell";
echo "\n";
echo "\\076 = \076 ; \\176 = \176 ; \\476 = \476 ; \\576 = \576";
echo "\n";
echo "UTF-8 bytes for U+0F02: \\340\\274\\202: \340\274\202";
echo "\n";
echo "UTF-8 bytes for U+1F47E: \\360\\237\\221\\276: \360\237\221\276";
echo "\n";
echo "UTF-8 bytes for U+1F47E: \\760\\637\\621\\676: \760\637\621\676";
echo "\n\n";

Starting in PHP 7.0.0:

echo "The following should work starting in PHP version 7.0.0:\n";
echo "Code Point / UTF-32 via \\u{}: \u{0F02}";
echo "\n";
echo "Code Point / UTF-32 via \\u{}: \u{1F47E}";

See PHP demo on “IDE One”
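
The last octal line in the "All versions" example (\760\637\621\676) works because PHP lets an octal escape above \377 silently overflow to fit in a single byte. The Python sketch below reproduces that wrap-around with a bit mask; the function name is illustrative only:

```python
def php_octal_escape_byte(octal_digits):
    """Return the byte for a \\NNN escape, wrapping values above 0o377."""
    return int(octal_digits, 8) & 0xFF


in_range = bytes(php_octal_escape_byte(d) for d in ("360", "237", "221", "276"))
overflowed = bytes(php_octal_escape_byte(d) for d in ("760", "637", "621", "676"))

# Both sequences end up as the same four UTF-8 bytes for U+1F47E.
print(in_range.hex(), overflowed.hex())
```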

More info on the “\u{}” syntax




 

JavaScript

The “Escape notation” section of the String global object documentation states that you can use the following sequences in both double quoted and single quoted strings:

// \x9 throws an error when using JavaScript (SMonkey 24.2.0).
print("\\x only works with two hex digits: TAB\x9TAB\x090TAB");
print("\\x is ISO-8859-1: 0x80 = \x80, 0x81 = \x81, 0x90 = \x90, 0x9A = \x9A, 0x9F = \x9F");
print("\\x is _not_ creating UTF-8: \xE0\xBC\x82"); // UTF-8 bytes for U+0F02
print("");

print("BMP Code Point / UTF-16 via \\u: \u0F02");
print("UTF-16 Surrogate Pair via \\u: \uD83D\uDC7E"); // U+1F47E
print("");

// \u{} throws an error when using JavaScript (SMonkey 24.2.0).
print("\\u{} is noted as being \"experimental, should not be used in production code\":");
print("Code Point / UTF-32 via \\u{}: \u{0F02}");  // NO EFFECT (YET!!!)
print("Code Point / UTF-32 via \\u{}: \u{1F47E}"); // NO EFFECT (YET!!!)
print("-------------------------------");

print("UTF-16 via String.fromCharCode(decimal): " + String.fromCharCode(3842));
print("UTF-16 via String.fromCharCode(hex): " + String.fromCharCode(0x0F02));
print("");

print("UTF-16 Surrogate Pair via String.fromCharCode(decimal): " + String.fromCharCode(55357, 56446));
print("UTF-16 Surrogate Pair via String.fromCharCode(hex): " + String.fromCharCode(0xD83D, 0xDC7E));
print("");

print("Multiple UTF-16 via String.fromCharCode(decimal): " + String.fromCharCode(3842, 32, 55357, 56446));
print("Multiple UTF-16 via String.fromCharCode(hex): " + String.fromCharCode(0x0F02, 0x20, 0xD83D, 0xDC7E));
print("-------------------------------");

// Like \x, the octal escape sequence uses the ISO-8859-1 character set
print("Octal notation is \\888 where '888' = 1 - 3 octal digits (values 0 - 7; range 0 - 377):");
print("\\11 and \\011: tab\11tabby\011tab");
print("\\7, \\07, and \\007: bell\7bell\07bell\007bell");
print("\\176 = \176 ; \\177 = \177 ; \\200 = \200 ; \\237 = \237");
print("\\242 = \242 ; \\377 = \377 ; \\504 = \504");
print("-------------------------------");

// String.fromCodePoint() raises an error in both (rhino 1.7.7) and (SMonkey 24.2.0).
//print("Code Point / UTF-32 via String.fromCodePoint(decimal): " + String.fromCodePoint(3842));
//print("Code Point / UTF-32 via String.fromCodePoint(hex): " + String.fromCodePoint(0x0F02));

See JavaScript demo on “IDE One”





 

Julia

The “Characters” and “Byte Array Literals” sections of the main “Strings” documentation state that you can use the following sequences in both double quoted strings and single quoted character literals:

Testing done with command-line julia.exe Version 1.2.0 (2019-08-20).

julia> # \x works with one or two hex digits:
julia> print("TAB\x9TAB\x09TAB")
TAB     TAB     TAB


julia> # \x is directly encoding UTF-8; it is not ISO-8859-1:
julia> print("\\xC1 should be Á, but here it's: \xC1")
\xC1 should be Á, but here it's: �


julia> # UTF-8 bytes for U+0F02:
julia> codepoint('\xE0\xBC\x82')
0x00000f02


julia> # UTF-8 bytes for U+1F47E:
julia> codepoint('\xF0\x9F\x91\xBE')
0x0001f47e

julia> ##################################################

julia> # Like \x, the octal escape sequence injects single bytes into a UTF-8 encoding.
julia> # Octal notation is \\888 where '888' = 1 - 3 octal digits (values 0 - 7; range 0 - 377):

julia> print("\\11 and \\011: tab\11tabby\011tab")
\11 and \011: tab       tabby   tab

julia> print("\\7, \\07, and \\007: Bell\7Bell\07Bell\007Bell")
\7, \07, and \007: BellBellBellBell

julia> # UTF-8 bytes for U+0F02:
julia> codepoint('\340\274\202')
0x00000f02

julia> # UTF-8 bytes for U+1F47E:
julia> codepoint('\360\237\221\276')
0x0001f47e

julia> ##################################################

julia> # BMP Code Point (U+0000 - U+FFFF) via \u:
julia> codepoint('\uF02')
0x00000f02

julia> codepoint('\u0F02')
0x00000f02

julia> # \u produces code points, not bytes:
julia> print("\xE0\xBC\x82  as opposed to: \uE0\uBC\u82")
?  as opposed to: �

julia> ##################################################

julia> # BMP and Supplementary Character Code Points (U+0000 - U+10FFFF) via \U:
julia> codepoint('\UF02')
0x00000f02

julia> codepoint('\U0F02')
0x00000f02

julia> codepoint('\U000F02')
0x00000f02

julia> codepoint('\U00000F02')
0x00000f02


julia> codepoint('\U1F47E')
0x0001f47e

julia> codepoint('\U1F47E')
0x0001f47e

julia> codepoint('\U01F47E')
0x0001f47e

julia> codepoint('\U0001F47E')
0x0001f47e


julia> # \U produces code points, not bytes:
julia> print("\xE0\xBC\x82  as opposed to: \UE0\UBC\U82")
?  as opposed to: �

julia> print("\xF0\x9F\x91\xBE  as opposed to: \UF0\U9F\U91\UBE")
�  as opposed to: ð??¾
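
The difference between "\x injects a raw byte" and "\x is ISO-8859-1" can be shown without Julia. The following Python sketch (variable names are arbitrary) takes the single byte 0xC1 and interprets it both ways:

```python
raw = b"\xC1"  # the single byte that \xC1 injects

# On its own, 0xC1 is never valid UTF-8.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# Interpreted as ISO-8859-1, the same byte is U+00C1.
as_latin1 = raw.decode("latin-1")

# Going the other way: U+00C1 encoded as UTF-8 takes two bytes.
utf8_for_code_point = "\u00C1".encode("utf-8")

print(is_valid_utf8, ascii(as_latin1), utf8_for_code_point.hex())
```

This is why the first Julia example prints the replacement character, while \u and \U (which take Code Points) produce the intended character.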





 

Java

The “3.3. Unicode Escapes” section of the “Chapter 3. Lexical Structure” documentation, as well as the “3.10.6. Escape Sequences for Character and String Literals” section, state that you can use the following escape sequences in strings:

import java.util.*;
import java.lang.*;
import java.io.*;

class SqlQuantumLeap
{
    public static void main (String[] args) throws java.lang.Exception
    {
        // The octal escape sequence uses the ISO-8859-1 character set
        System.out.println("Octal notation is \\888 where '888' = 1 - 3 octal digits (values 0 - 7; range 0 - 377):");
        System.out.println("\\11 and \\011: tab\11tabby\011tab");
        System.out.println("\\7, \\07, and \\007: bell\7bell\07bell\007bell");
        System.out.println("\\176 = \176 ; \\177 = \177 ; \\200 = \200 ; \\237 = \237");
        System.out.println("\\242 = \242 ; \\377 = \377 ; \\504 = \504");
        System.out.println("-------------------------------");

        System.out.println("BMP Code Point / UTF-16 via \\u: \u0F02");
        System.out.println("UTF-16 Surrogate Pair via \\u\\u: \uD83D\uDC7E"); // U+1F47E
        System.out.println("-------------------------------");

        //  ---------------------------------------------------------------

        System.out.println("String constructor: " + new String(
            new int[]{ 0x0F02, 32, 65, 32, 0xD83D, 0xDC7E }, 0, 6 )); // U+1F47E

        char[] tc1 = Character.toChars(0x0F02);
        System.out.println("Character.toChars(int) static method (codePoint = U+0F02):");
        System.out.println("   Size of array returned for BMP Character: " + tc1.length);
        System.out.println("   String created from char[]: " + new String(tc1));

        char[] tc2 = Character.toChars(0x1F47E);
        System.out.println("Character.toChars(int) static method (codePoint = U+1F47E):");
        System.out.println("   Size of array returned for Supplementary Character: " + tc2.length);
        System.out.println("   String created from char[]: " + new String(tc2));

        char[] tc3 = new char[] { 65, 66, 67, 68, 69, 70 };
        System.out.println("Character.toChars(int, char[], int) static method (codePoint = U+1F47E):");
        System.out.println("   Initial String created from char[]: " + new String(tc3));
        Character.toChars(0x1F47E, tc3, 2); // overwrites tc3[2] and tc3[3] with the surrogate pair
        System.out.println("   String created from char[] after Character.toChars(): " + new String(tc3));
    }
}

See Java demo on “IDE One”




 

Excel / VBA

Pre-Office 2013:

=UnicodeFromInt(3842)

=UnicodeFromHex("F02")
=UnicodeFromHex("0F02")



=UnicodeFromInt(128126)

=UnicodeFromHex("1F47E")
=UnicodeFromHex("01F47E")

Starting in Office 2013:

=UNICHAR(3842)

=UNICHAR(128126)