Unicode

Unicode and Excel 2007 (Windows)

Opening Files

When you open a text file using the [Microsoft Office Button] or [CTRL-O], Excel 2007 opens the Text Import Wizard dialog. This dialog lets you identify the file's encoding by setting the "File Origin" field. The following chart lists the results observed when opening test files saved in various encodings.

Encoding     | Byte Order Mark | Test Data         | File Origin Setting Used                             | Result
-------------|-----------------|-------------------|------------------------------------------------------|-------
cp850        | (none)          | abc-àèìòù©        | 850 : Western European (DOS)                         | Pass
windows-1252 | (none)          | abc-àèìòù©        | 1252 : Western European (Windows) or Windows (ANSI)  | Pass
UTF-8        | (none)          | abc-àèìòù©-뮻뮼뮽 | 65001 : Unicode (UTF-8)                              | Pass
UTF-8        | UTF-8 BOM       | abc-àèìòù©-뮻뮼뮽 | 65001 : Unicode (UTF-8)                              | Pass
UTF-16LE     | (none)          | abc-àèìòù©-뮻뮼뮽 | none found                                           | Fail
UTF-16       | LE              | abc-àèìòù©-뮻뮼뮽 | Windows (ANSI)                                       | Pass
UTF-16BE     | (none)          | abc-àèìòù©-뮻뮼뮽 | none found                                           | Fail
UTF-16       | BE              | abc-àèìòù©-뮻뮼뮽 | none found                                           | Fail

Note: Even though you can open a UTF-8 file with Excel, you will not be able to re-save it in this format. You can only save it in one of the supported text file types listed in the next section. Alternatively, you can save the file as an Excel spreadsheet.
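To repeat the tests in the chart above, you need one file per encoding and BOM combination. The following Python sketch is one way to produce such files. The file names and the helper functions `build_test_files` and `write_test_files` are hypothetical and are not part of the original tests. The BOMs for the UTF-16 files are written explicitly so the byte order does not depend on the machine running the script.

```python
import codecs
from pathlib import Path

# Test strings copied from the chart above.
LATIN = "abc-àèìòù©"
UNICODE = LATIN + "-뮻뮼뮽"


def build_test_files():
    """Return a mapping of (hypothetical) file name -> encoded bytes."""
    return {
        "cp850.txt": LATIN.encode("cp850"),
        "windows-1252.txt": LATIN.encode("cp1252"),
        "utf8-no-bom.txt": UNICODE.encode("utf-8"),
        # "utf-8-sig" prepends the UTF-8 BOM (EF BB BF).
        "utf8-bom.txt": UNICODE.encode("utf-8-sig"),
        # "utf-16-le" / "utf-16-be" never write a BOM on their own.
        "utf16le-no-bom.txt": UNICODE.encode("utf-16-le"),
        "utf16le-bom.txt": codecs.BOM_UTF16_LE + UNICODE.encode("utf-16-le"),
        "utf16be-no-bom.txt": UNICODE.encode("utf-16-be"),
        "utf16be-bom.txt": codecs.BOM_UTF16_BE + UNICODE.encode("utf-16-be"),
    }


def write_test_files(directory):
    """Write every test file into the given directory."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    for name, data in build_test_files().items():
        (out / name).write_bytes(data)
```

Calling `write_test_files("excel-tests")` would create the eight files in a folder named `excel-tests`, ready to be opened in Excel.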

When you open a text file by dragging it from a Windows Explorer window and dropping it into an open Excel 2007 window, the Text Import Wizard dialog is not displayed. Instead, Excel 2007 appears to use an algorithm to determine the encoding. The following chart lists the results observed when dragging and dropping files that use various encodings.

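File Encoding | Byte Order Mark in File | Data in File      | Result
--------------|-------------------------|-------------------|-------
cp850         | (none)                  | abc-àèìòù©        | Fail
windows-1252  | (none)                  | abc-àèìòù©        | Pass
UTF-8         | (none)                  | abc-àèìòù©-뮻뮼뮽 | Fail
UTF-8         | UTF-8 BOM               | abc-àèìòù©-뮻뮼뮽 | Pass
UTF-16LE      | (none)                  | abc-àèìòù©-뮻뮼뮽 | Fail
UTF-16        | LE                      | abc-àèìòù©-뮻뮼뮽 | Pass
UTF-16BE      | (none)                  | abc-àèìòù©-뮻뮼뮽 | Fail
UTF-16        | BE                      | abc-àèìòù©-뮻뮼뮽 | Fail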
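The drag-and-drop results are consistent with a simple rule: recognize a UTF-8 BOM or a UTF-16 LE BOM, and otherwise fall back to the system's ANSI code page. The Python sketch below models that rule. It is only a guess that happens to match the chart, not Excel's documented algorithm. The function name `guess_encoding` and the default of cp1252 for the ANSI code page are assumptions made for illustration.

```python
import codecs


def guess_encoding(data: bytes, ansi_codepage: str = "cp1252") -> str:
    """Guess a Python codec name for raw file bytes.

    Models the behaviour suggested by the drag-and-drop chart:
    only the UTF-8 BOM and the UTF-16 LE BOM are recognized.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # decodes UTF-8 and strips the BOM
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16"      # reads the BOM to pick the byte order
    # Everything else (including a UTF-16 BE BOM) is treated as ANSI,
    # which is why those rows would fail.
    return ansi_codepage
```

Under this model a BOM-less UTF-8 file is decoded as cp1252, which would garble every non-ASCII character and matches the "Fail" rows above.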

Saving New Files

When saving new text files (.txt, .csv, or .prn), Excel 2007 lets you select one of the following file types, each of which determines the encoding used.

Save as type                            | Uses this Encoding                 | Notes
----------------------------------------|------------------------------------|------
Text (Tab delimited)(*.txt)             | the system's active ANSI code page | e.g. cp1252
Text (MS-DOS)(*.txt)                    | the system's active OEM code page  | e.g. cp850
CSV (Comma delimited)(*.csv)            | the system's active ANSI code page | e.g. cp1252
CSV (MS-DOS)(*.csv)                     | the system's active OEM code page  | e.g. cp850
Formatted Text (Space delimited)(*.prn) | the system's active ANSI code page | e.g. cp1252
Unicode Text (*.txt)                    | UTF-16 (LE BOM)                    |
HTML File                               | the system's active ANSI code page | see the notes below

Notes for HTML File:

  • includes a tag like: <meta http-equiv=Content-Type content="text/html; charset=windows-1252">
  • characters not available in the code page are saved using &#ddddd; notation
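The &#ddddd; notation used by the HTML File type is the HTML decimal numeric character reference. Python's built-in `xmlcharrefreplace` error handler produces the same kind of output, so it can illustrate the idea. The helper name `encode_for_html` is hypothetical, and cp1252 is assumed as the ANSI code page.

```python
def encode_for_html(text: str, codepage: str = "cp1252") -> bytes:
    """Encode text in a code page, writing unsupported characters as &#ddddd;.

    Characters the code page can represent become single bytes;
    all others become decimal numeric character references.
    """
    return text.encode(codepage, errors="xmlcharrefreplace")


# "©" exists in cp1252 and stays a single byte; the Hangul
# syllable does not, so it is written as a numeric reference.
print(encode_for_html("©-뮻"))
```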

Note that a UTF-8 file type is not available.
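If UTF-8 output is needed, one workaround is to save as "Unicode Text (*.txt)" and convert the resulting UTF-16 file outside Excel. The following Python sketch shows one way to do that. The function name `utf16_text_to_utf8` and the file paths are hypothetical.

```python
def utf16_text_to_utf8(src_path, dst_path):
    """Re-encode a UTF-16 file with a BOM as UTF-8 without a BOM."""
    # The "utf-16" codec uses the BOM to pick the byte order and strips it.
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

For example, `utf16_text_to_utf8("export.txt", "export-utf8.txt")` would convert a file saved from Excel as Unicode Text. Use `encoding="utf-8-sig"` for the output instead if the result must later be opened in Excel by drag and drop, since the BOM-less UTF-8 file failed that test.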



