Exporting Bulk Data

Exporting Bulk Data zhang Fri, 09/01/2017 - 16:11

The COPY INTO command with a file name argument allows for fast dumping of a result set into an UTF-8 text file, possibly compressed. The file must be accessible by the server and a full path name may be required. The special file STDOUT can be used to direct the result to the primary output channel.

The delimiters and NULL AS arguments provide control over the layout required.

COPY select_query INTO { file_name | STDOUT }
     [ [USING] DELIMITERS field_separator [ ',' record_separator [ ',' string_quote ] ] ]
     [ NULL AS null_string ]

For the output 'file_name', one can specify a file name ending with '.gz', '.bz2' or '.xz' to use the appropriate compression library (if available). Note: '.zip' is not (yet) supported, also not on Windows, use '.gz' instead.

sql>SELECT * FROM cars;
+------+-------+--------+
| yr   | brand | model  |
+======+=======+========+
| 2000 | Ford  | Focus  |
| 2001 | Honda | Accord |
| 2004 | Honda | Pilot  |
+------+-------+--------+
3 tuples
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out.csv' USING DELIMITERS ',' , '\n' , '"';
3 affected rows
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out2.csv.gz' DELIMITERS ',','\n';
3 affected rows
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out3.csv.bz2' DELIMITERS ',','\n';
3 affected rows
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out4.tsv.xz' DELIMITERS '\t','\n';
3 affected rows

Let us check some of the results:

$ cat /tmp/cars_out.csv
2000,"Ford","Focus"
2001,"Honda","Accord"
2004,"Honda","Pilot"
$ bzip2 -d cars_out2.csv.bz2 ; cat cars_out2.csv
2000,"Ford","Focus"
2001,"Honda","Accord"
2004,"Honda","Pilot"