COPY INTO robust to empty row at end of CSV

Stefan Manegold Stefan.Manegold at cwi.nl
Mon May 19 09:58:32 CEST 2014


Then I suspect that "records" has the wrong value, i.e., it is also counting the empty line.
Did/could you double-check?

----- Original Message -----
> Thanks, Stefan, on COPY:
> 
> The current script issues the following command, and it still runs into the
> problem with the empty line/row in the end:
> 
> copy ='COPY %i OFFSET %i RECORDS INTO %s FROM \'%s\' USING DELIMITERS
> %r,%r,%r NULL AS \'\' LOCKED;' %
> (records,offset,tname,f,dialect.delimiter,eol,dialect.quotechar)
> 
> 
> 
> On Mon, May 19, 2014 at 8:11 AM, Stefan Manegold < Stefan.Manegold at cwi.nl >
> wrote:
> 
> 
> 
> 
> 
> ----- Original Message -----
> > Hi all,
> > 
> > Hannes Mühleisen (cc'd) generously shared his
> > homepages.cwi.nl/~hannes/importer.py Python script with me last week, which
> > automates the creation of tables from text files: Python's csv-reader helps
> > recognize the dialect of the text file as well as grab the header for
> > column
> > names and identify the types for the columns, and then a single COPY %i
> > OFFSET %i RECORDS INTO %s FROM \'%s\' USING DELIMITERS … is issued to
> > MonetDB (see line 404 of the script).
> > 
> > I had some issues with this over the last three days, but now I see the
> > source of the error I get: There is nothing wrong with Hannes's Python code
> > but when COPY INTO turns back to the text file, MonetDB still complains
> > about the last row containing an "incomplete record" (it is empty) even if
> > Python was smart enough not to count that row into the number of records
> > beforehand.
> > 
> > Is there a way to call COPY INTO that is robust to the text file ending in
> > an
> > extra empty row? It is common with some generated CSV or TSV files.
> 
> Use COPY n RECORDS ..., where n is the number of non-empty lines in you CSV
> file.
> 
> > By the way, I also have a follow-up about how to merge some of the tables
> > together. I am happy to work on this, of course, I don't expect anyone to
> > script this for me, but I am confused about how much I can use SQL
> > variables
> > for this (and how) versus how much I need to use another language to script
> > the repeated calls to MonetDB about each year for each series of tables (I
> > have dozens of dozens). I hope this is a reasonably well-posed question:
> > http://dba.stackexchange.com/questions/65329/dynamic-sql-merge-partitioned-tables-annual-tables-into-a-single-table-of-all
> 
> Not sure whether the WHILE loops works as such, but you could easily do that
> in your application.
> 
> You also might want to consider replacing
> ALTER TABLE data_ at i ADD COLUMN "year" INTEGER; UPDATE data_ at i SET "year" =
> @i;
> by, say,
> ALTER TABLE data_ at i ADD COLUMN "year" INTEGER DEFAULT @i;
> and
> CREATE TABLE data AS SELECT * FROM data_1990 UNION ALL SELECT * FROM
> data_1991 UNION ALL [...] WITH DATA;
> by, say,
> CREATE VIEW data AS SELECT * FROM data_1990 UNION ALL SELECT * FROM data_1991
> UNION ALL [...];
> 
> Best,
> Stefan
> 
> > Thanks for any help,
> > 
> > Laszlo
> > 
> > _______________________________________________
> > users-list mailing list
> > users-list at monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
> > 
> 
> --
> | Stefan.Manegold at CWI.nl | DB Architectures (DA) |
> | www.CWI.nl/~manegold/ | Science Park 123 (L321) |
> | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
> 
> _______________________________________________
> users-list mailing list
> users-list at monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
> 
> 
> _______________________________________________
> users-list mailing list
> users-list at monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
> 

-- 
| Stefan.Manegold at CWI.nl | DB Architectures   (DA) |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |



More information about the users-list mailing list