To create a string of comma delimited values from a data set that looks like this
VAL_1 VAL_2 VAL_3 NOTE
------ ------ ------ -------------------------
a One NOT NULL, Two NULL
b One NOT NULL, Two NULL
c One NOT NULL, Two NULL
a b Two NOT NULL, One NULL
a b c Three NOT NULL, Zero NULL
Zero NOT NULL, Three NULL
simply use the concatenation operator, "||", as in the following snippet.
select val_1, val_2, val_3, val_1 || ',' || val_2 || ',' || val_3 as string from t order by key ;
VAL_1 VAL_2 VAL_3 STRING
------ ------ ------ --------------------
a a,,
b ,b,
c ,,c
a b a,b,
a b c a,b,c
,,
To exclude null values from the end result LTRIM and NVL2 can be used.
select
val_1, val_2, val_3,
LTRIM
( val_1 ||
NVL2( val_2, ',' || val_2 , null ) ||
NVL2( val_3, ',' || val_3 , null ) ,
','
) as string
from t
order by key ;
VAL_1 VAL_2 VAL_3 STRING
------ ------ ------ --------------------
a a
b b
c c
a b a,b
a b c a,b,c
How to Create a CSV File
While there is no official standard for CSV file formats, RFC 4180 - Common Format and MIME Type for Comma-Separated Values (CSV) Files describes one commonly used set of rules. We will follow this standard in the examples to follow.
When writing a query whose output will be spooled to a CSV file we need to disable a few SQL*Plus features to ensure they do not corrupt the contents. Here are the settings I typically use.
set ECHO off set FEEDBACK off set LINESIZE 60 set RECSEP off set TRIMSPOOL on set VERIFY off set PAGESIZE 0 spool temp.txt replace select val_1 || ',' || val_2 || ',' || val_3 as string from t order by key ;
a,, ,b, ,,c a,b, a,b,c ,,
spool off
The value for LINESIZE should be larger than the longest expected line in the CSV file.
How to Escape Double Quotes, CR, and LF
Rules for escaping special characters vary from product to product. For products that follow RFC 4180 the following rules affect how we write our query.
-
"4. Within the header and each record, there may be one or more fields, separated by commas. Each line should contain the same number of fields throughout the file. Spaces are considered part of a field and should not be ignored. The last field in the record must not be followed by a comma."
-
"5. Each field may or may not be enclosed in double quotes" - we will enclose all fields in double quotes since it simplifies our query
-
"6. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes" - since we will enclose all fields in double quotes we automatically obey this rule
-
"7. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote."
Before we see the solution let's create some test data with special characters first. The NOTE column will contain values with double quotes, commas, and a line feed.
insert into t values ( 'String 7', 'd', 'e', 'f', 'line 1' || chr(10) || 'line 2' ); insert into t values ( 'String 8', 'g', 'h', 'i', '"1", "2", or "3"' ); commit; set pagesize 60 select val_1, val_2, val_3, note from t order by key ;
VAL_1 VAL_2 VAL_3 NOTE
------ ------ ------ -------------------------
a One NOT NULL, Two NULL
b One NOT NULL, Two NULL
c One NOT NULL, Two NULL
a b Two NOT NULL, One NULL
a b c Three NOT NULL, Zero NULL
Zero NOT NULL, Three NULL
d e f line 1
line 2
g h i "1", "2", or "3"
Now here is the query.
set echo off
set feedback off
set linesize 60
set recsep off
set trimspool on
set verify off
set pagesize 0
select
'"' || replace( val_1, '"', '""' ) || '"'
|| ',"' || replace( val_2, '"', '""' ) || '"'
|| ',"' || replace( val_3, '"', '""' ) || '"'
|| ',"' || replace( note , '"', '""' ) || '"'
as string
from t
order by key ;
"a","","","One NOT NULL, Two NULL" "","b","","One NOT NULL, Two NULL" "","","c","One NOT NULL, Two NULL" "a","b","","Two NOT NULL, One NULL" "a","b","c","Three NOT NULL, Zero NULL" "","","","Zero NOT NULL, Three NULL" "d","e","f","line 1 line 2" "g","h","i","""1"", ""2"", or ""3"""
How to Include a Header Line
RFC 4180 states
"3. There maybe an optional header line appearing as the first line of the file with the same format as normal record lines."
Including such a header line is easily accomplished in SQL*Plus for files with less than 50,000 lines (or whatever the maximum setting for PAGESIZE is in your environment) like this.
set echo off
set feedback off
set linesize 60
set recsep off
set trimspool on
set verify off
set pagesize 50000
set newpage none
set heading on
set underline off
column string heading '"Val 1","Val 2","Val 3","Note Field"'
select
'"' || replace( val_1, '"', '""' ) || '"'
|| ',"' || replace( val_2, '"', '""' ) || '"'
|| ',"' || replace( val_3, '"', '""' ) || '"'
|| ',"' || replace( note , '"', '""' ) || '"'
as string
from t
order by key ;
"Val 1","Val 2","Val 3","Note Field" "a","","","One NOT NULL, Two NULL" "","b","","One NOT NULL, Two NULL" "","","c","One NOT NULL, Two NULL" "a","b","","Two NOT NULL, One NULL" "a","b","c","Three NOT NULL, Zero NULL" "","","","Zero NOT NULL, Three NULL" "d","e","f","line 1 line 2" "g","h","i","""1"", ""2"", or ""3"""
Unlike column names, note how the column heading can be greater than 30 characters. In the last example the column heading for STRING is 36 characters long.
For files that will contain more than 50,000 lines I use the following approach.
set echo off
set feedback off
set linesize 60
set recsep off
set trimspool on
set verify off
set pagesize 0
select
string
from
(
select
0 as sort_key_1 ,
' ' as sort_key_2 ,
'"Val 1","Val 2","Val 3","Note Field"' as string
from dual
union all
select
1 as sort_key_1 ,
key as sort_key_2 ,
'"' || replace( val_1, '"', '""' ) || '"'
|| ',"' || replace( val_2, '"', '""' ) || '"'
|| ',"' || replace( val_3, '"', '""' ) || '"'
|| ',"' || replace( note , '"', '""' ) || '"'
as string
from t
)
order by
sort_key_1 ,
sort_key_2
;
"Val 1","Val 2","Val 3","Note Field" "a","","","One NOT NULL, Two NULL" "","b","","One NOT NULL, Two NULL" "","","c","One NOT NULL, Two NULL" "a","b","","Two NOT NULL, One NULL" "a","b","c","Three NOT NULL, Zero NULL" "","","","Zero NOT NULL, Three NULL" "d","e","f","line 1 line 2" "g","h","i","""1"", ""2"", or ""3"""
Gotcha
When creating large CSV files destined for Microsoft Excel beware of limits on the maximum number of rows per worksheet (1M rows in Excel 2007, 64K rows in earlier versions).
Cleanup
delete from t where key in ( 'String 7', 'String 8' ) ; commit ;
