COPY INTO robust to empty row at end of CSV

László Sándor sandorl at gmail.com
Mon May 19 09:45:58 CEST 2014


Thanks, Stefan, on COPY:

The current script issues the following command, and it still runs into the
problem with the empty line/row in the end:

    copy ='COPY %i OFFSET %i RECORDS INTO %s FROM \'%s\' USING DELIMITERS
%r,%r,%r NULL AS \'\' LOCKED;' %
(records,offset,tname,f,dialect.delimiter,eol,dialect.quotechar)



On Mon, May 19, 2014 at 8:11 AM, Stefan Manegold <Stefan.Manegold at cwi.nl>wrote:

>
>
> ----- Original Message -----
> > Hi all,
> >
> > Hannes Mühleisen (cc'd) generously shared his
> > homepages.cwi.nl/~hannes/importer.py Python script with me last week,
> which
> > automates the creation of tables from text files: Python's csv-reader
> helps
> > recognize the dialect of the text file as well as grab the header for
> column
> > names and identify the types for the columns, and then a single COPY %i
> > OFFSET %i RECORDS INTO %s FROM \'%s\' USING DELIMITERS … is issued to
> > MonetDB (see line 404 of the script).
> >
> > I had some issues with this over the last three days, but now I see the
> > source of the error I get: There is nothing wrong with Hannes's Python
> code
> > but when COPY INTO turns back to the text file, MonetDB still complains
> > about the last row containing an "incomplete record" (it is empty) even
> if
> > Python was smart enough not to count that row into the number of records
> > beforehand.
> >
> > Is there a way to call COPY INTO that is robust to the text file ending
> in an
> > extra empty row? It is common with some generated CSV or TSV files.
>
> Use COPY n RECORDS ..., where n is the number of non-empty lines in you
> CSV file.
>
> > By the way, I also have a follow-up about how to merge some of the tables
> > together. I am happy to work on this, of course, I don't expect anyone to
> > script this for me, but I am confused about how much I can use SQL
> variables
> > for this (and how) versus how much I need to use another language to
> script
> > the repeated calls to MonetDB about each year for each series of tables
> (I
> > have dozens of dozens). I hope this is a reasonably well-posed question:
> >
> http://dba.stackexchange.com/questions/65329/dynamic-sql-merge-partitioned-tables-annual-tables-into-a-single-table-of-all
>
> Not sure whether the WHILE loops works as such, but you could easily do
> that in your application.
>
> You also might want to consider replacing
>     ALTER TABLE data_ at i ADD COLUMN "year" INTEGER; UPDATE data_ at i SET
> "year" = @i;
> by, say,
>     ALTER TABLE data_ at i ADD COLUMN "year" INTEGER DEFAULT @i;
> and
> CREATE TABLE data AS SELECT * FROM data_1990 UNION ALL SELECT * FROM
> data_1991 UNION ALL [...] WITH DATA;
> by, say,
> CREATE VIEW data AS SELECT * FROM data_1990 UNION ALL SELECT * FROM
> data_1991 UNION ALL [...];
>
> Best,
> Stefan
>
> > Thanks for any help,
> >
> > Laszlo
> >
> > _______________________________________________
> > users-list mailing list
> > users-list at monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
> >
>
> --
> | Stefan.Manegold at CWI.nl | DB Architectures   (DA) |
> | www.CWI.nl/~manegold/  | Science Park 123 (L321) |
> | +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |
>
> _______________________________________________
> users-list mailing list
> users-list at monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.monetdb.org/pipermail/users-list/attachments/20140519/e151b39d/attachment.html>


More information about the users-list mailing list