Postgresql : load csv data




written by Marco Ferretti on April 04, 2012, at 11:22 AM

One of the most common action that one has to do with a database is to load data into it. Usually the data comes either in a dump/sql file or in csv format. Today I am going to brag about loading the latter format into a PostgreSQL database. The first thing to keep in mind is that PostgreSQL is able to do it without magic . The second thing to put attention on is which is the user who is connecting to the database to load the data.

The easy pattern : you have superuser privileges

In this scenario, the user is superuser ( postgres ) or is inheriting from the superuser role. If that's the case then all you need to do is open a psql shell and issue the command

 copy <table> from <file> with delimiter '<delimiter>' ;

where

  • <table> is the name of the table that you want to populate with the data
  • <file> is the path to the file that contains your csv data
  • <delimiter> is the delimiter used in your csv file to separate the fields ( usually ',' or ';' )

The way you open a connection depends on what you are using : if you are using a gui ( such as pgAdmin ) you simply double click on the database that contains the table that you want to populate and then hit the SQL button in the toolbar. If you are using a console then you have to use pql command :

 psql -h <servername> -U <username> <databasename>

where

  • <servername> is the name or ip address of the PostgreSQL server you are connecting to
  • <username> is the username that you are using to connect
  • <databasename> is the name of the database that contains the table you want to work on

The not so obvious pattern : you are a regular user

In this scenario you are connecting to the database via a user that has ( at least ) connect and insert privileges on the table that you want to populate. In this situation you cannot "simply" use the copy command telling postgres to get the data from a file because you need superuser privileges in order to import from a file . The work around here is to use stdin since everybody can use it ! This is going to limit the options a little but still ... Open a terminal and issue the commad :

cat <filename> | psql -U <username> -h <servername> <database> -c "copy <table> from stdin ;"

if you are on Windows system replace cat with type

Variation of the problem

  • The file contains different number of columns than the table does
Then you need to modify the copy command in order to specify the column names that you want to manipulate :
copy <table>(field1,field2, ..., fieldn) from ...
  • The file contains headers : I don't want to import that too !
Then you need to tell the copy command that the filed contains headers ! As simple as that :
copy <table> from [stdin|<filename>] with CSV HEADER;
  • The file contains special characters :
You will have to identify the charset of the file and tell psql which charset to use when issuing the copy command;e.g. you are importing a file that has accents or umlauts (i assume you are already connected via psql) :
 \encoding ISO_8859_1  ;
 copy <table> from [stdin|<filename>] with CSV HEADER;  

If you want to deepen your knowledge of the powerful copy command here's the official documentation

Leave a comment

Name (required)
E-mail (required, will not be published)
Website
Comment

Enter value: Captcha