Unicode

Unicode and Notepad (Windows)

Opening Files

When Notepad opens a file, it attempts to determine the file's encoding automatically. The following chart lists the results observed when opening test files in various encodings, with and without a Byte Order Mark.

Encoding      Byte Order Mark  Test Data           Result
cp850         none             abc-àèìòù©          Fail
windows-1252  none             abc-àèìòù©          Pass
UTF-8         none             abc-àèìòù©-뮻뮼뮽   Pass
UTF-8         UTF-8            abc-àèìòù©-뮻뮼뮽   Pass
UTF-16LE      none             abc-àèìòù©-뮻뮼뮽   Pass
UTF-16        LE               abc-àèìòù©-뮻뮼뮽   Pass
UTF-16BE      none             abc-àèìòù©-뮻뮼뮽   Pass
UTF-16        BE               abc-àèìòù©-뮻뮼뮽   Pass
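
Test files like the ones in the chart can be produced with a short script. The following is a minimal Python sketch, not the method originally used for the chart; the case labels and the build_test_bytes helper are hypothetical names introduced for illustration.

```python
import codecs

# Test strings taken from the chart above.
LATIN_TEXT = "abc-àèìòù©"
FULL_TEXT = "abc-àèìòù©-뮻뮼뮽"

# (label, Python codec, BOM bytes to prepend, text).
# The labels are illustrative; the codec names are Python's own.
CASES = [
    ("cp850", "cp850", b"", LATIN_TEXT),
    ("windows-1252", "cp1252", b"", LATIN_TEXT),
    ("UTF-8", "utf-8", b"", FULL_TEXT),
    ("UTF-8 with BOM", "utf-8", codecs.BOM_UTF8, FULL_TEXT),
    ("UTF-16LE", "utf-16-le", b"", FULL_TEXT),
    ("UTF-16 with LE BOM", "utf-16-le", codecs.BOM_UTF16_LE, FULL_TEXT),
    ("UTF-16BE", "utf-16-be", b"", FULL_TEXT),
    ("UTF-16 with BE BOM", "utf-16-be", codecs.BOM_UTF16_BE, FULL_TEXT),
]


def build_test_bytes(codec, bom, text):
    """Return the raw file content: optional BOM followed by the encoded text."""
    return bom + text.encode(codec)


if __name__ == "__main__":
    for label, codec, bom, text in CASES:
        data = build_test_bytes(codec, bom, text)
        # Show the leading bytes so the BOM (if any) is visible.
        print(f"{label:20} {data[:6].hex(' ')}")
```

Writing each result to disk with open(path, "wb") and opening it in Notepad would reproduce the kind of test described above.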

Note that a bug in the algorithm causes some files containing certain patterns of characters to be incorrectly opened as UTF-16 files (see Notepad (Windows) Unicode detection).
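
The only unambiguous part of encoding detection is the Byte Order Mark; files without one must be guessed at, which is where misdetection can occur. The sketch below shows BOM sniffing in Python. It is not Notepad's actual algorithm, and the function names and the cp1252 fallback are assumptions made for illustration.

```python
import codecs

# BOM byte sequences and the Python codec that consumes each one.
# "utf-8-sig" and "utf-16" both strip the BOM while decoding.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, codec in _BOMS:
        if data.startswith(bom):
            return codec
    return None


def decode_with_fallback(data, fallback="cp1252"):
    """Decode using the BOM if present; otherwise use the assumed fallback."""
    codec = sniff_bom(data)
    return data.decode(codec if codec is not None else fallback)
```

A file with no BOM falls through to the fallback codec here, so the caller still has to know, or guess, the real encoding.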

Saving New Files

By default, files created in Notepad are saved using the system's current ANSI code page. However, files can be saved in other encodings by changing the "Encoding:" field of the "Save As" dialog.

[Figure: Windows "Save As" dialog]

The four choices listed in the "Encoding:" field map to these standard encoding names.

Windows Encoding Name  Standard Encoding Name  Byte Order Mark
ANSI                   active ANSI code page   none
Unicode                UTF-16                  LE
Unicode big endian     UTF-16                  BE
UTF-8                  UTF-8                   UTF-8
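
As a rough illustration of the mapping above, the following Python sketch produces the bytes each choice would be expected to write. The function name encode_like_notepad is hypothetical, and cp1252 is only an assumed stand-in for the active ANSI code page, which varies by system.

```python
import codecs


def encode_like_notepad(text, choice, ansi_codec="cp1252"):
    """Encode text according to one of the four "Encoding:" choices.

    ansi_codec is an assumption: the real ANSI code page depends on
    the Windows system locale.
    """
    if choice == "ANSI":
        return text.encode(ansi_codec)  # no BOM
    if choice == "Unicode":
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    if choice == "Unicode big endian":
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    if choice == "UTF-8":
        return text.encode("utf-8-sig")  # "utf-8-sig" prepends the UTF-8 BOM
    raise ValueError(f"unknown encoding choice: {choice!r}")


if __name__ == "__main__":
    for name in ("ANSI", "Unicode", "Unicode big endian", "UTF-8"):
        print(f"{name:20} {encode_like_notepad('a', name).hex(' ')}")
```

The BOMs are written explicitly for the two UTF-16 choices because Python's plain "utf-16" codec picks the byte order of the machine it runs on.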

Files saved with encodings other than ANSI will have a Byte Order Mark (BOM) added to the beginning of the file. According to the Unicode standard, a BOM is optional in UTF-8 files. However, some programs expect UTF-8 files to have no BOM. When such programs open a UTF-8 file that includes a BOM, they may incorrectly display the BOM (bytes EF BB BF) as these three printable characters: ï»¿.
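
The effect of the UTF-8 BOM on a reading program can be seen in a few lines of Python. This is only a sketch: the sample text "abc" and the variable names are made up for illustration, and cp1252 stands in for whatever single-byte encoding a BOM-unaware program might assume.

```python
import codecs

# A UTF-8 file with a BOM: bytes EF BB BF followed by the text.
data = codecs.BOM_UTF8 + "abc".encode("utf-8")

# A program that wrongly assumes a single-byte code page shows
# the three BOM bytes as printable characters.
misread = data.decode("cp1252")

# Plain "utf-8" decodes correctly but keeps the BOM as U+FEFF.
with_bom = data.decode("utf-8")

# "utf-8-sig" strips a leading BOM if one is present.
clean = data.decode("utf-8-sig")

if __name__ == "__main__":
    print(repr(misread))
    print(repr(with_bom))
    print(repr(clean))
```

Because "utf-8-sig" also accepts files that have no BOM, it is a reasonable default when reading UTF-8 files that may have been saved by Notepad.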



