Failing to load a table

Stefan Manegold Stefan.Manegold at cwi.nl
Thu Jul 4 16:48:12 CEST 2013


> Le 04/07/2013 15:12, Stefan Manegold a écrit :
> > Please check what the monetdbd / merovingian log file says.
> Hum... /var/log/monetdb is empty. And I can't locate any monet* log
> file.

It's called merovingian.log and is located in your dbfarm.

> >> The table size on disk for Postgresql is 37 GB.
> >> Using VectorWise, the table loads, and take only 7 GB on disks,
> >> but
> >> the loading process (very similar to what we do with MonetDB)
> >> takes
> >> a very long time( 25 hours), much longer than what happens with
> >> the
> >> 109 columns tables.
> >>
> >> I am trying to make sense of that and understand why this table is
> >> giving a (very) hard time to monetdb (and somehow VectorWise as
> >> well).
> >> Does anyone have an explanation ?
> > Can you share the schemas (column types) of both tables?
> Yes, sure.
> 
> The first table is called dwhinv and has :
> 
> 89 columns of type varchar(32).
> 2 timestamps (not null)
> 2 decimal (at least one of them is set on each line)
> 19 integers
> 
> The second table is called dwhcol and has :
> 
> 83 columns of type varchar(32)
> 2 timestamps
> 420 columns of type decimal

integers are 4 byte.
timestamps are IMHO also 4 byte
I'm not sure what we use for decimals without digit specification; I assume we then use the largest, i.e., 8 byte.
strings highly depend on the actual data (due to (best effort) duplicate elimination and dictionary encoding).

> > For each type used the number of column with that type might be
> > sufficient.
> > If there are no string columns at all, that would be sufficient
> > info for us
> > to estimate the expected storage size.
> >
> >
> > Other than duplicate elimination (dictionary encoding) for string
> > columns,
> > MonetDB does not yet use compressed storage.  We do have good
> > experiences
> > with compressed file systems, though.
> What about null values ? I was under the impression that null values
> where not stored in monetdb (just non existant entries in the BATs) ?

NULL values are stored. For fixed width columns (e.g., all numbers), they take as much storage as any value of that type. For variable length types (e.g., strings, (both char & varchar)), they take 1 to 8 byte, depending on the number of distinct values in that column.

Stefan

> Because each decimal column has at most 1 million non null values,
> and
> many have a few thousands... Doesn't it matter ?
> >
> > Do you have constraints and/or keys defined on your tables prior to
> > bulk
> > loading?
> No, there are no foreign keys, nor constraints of any sort.
> 
> Franck
> 
> 
> _______________________________________________
> users-list mailing list
> users-list at monetdb.org
> http://mail.monetdb.org/mailman/listinfo/users-list
> 

-- 
| Stefan.Manegold at CWI.nl | DB Architectures   (DA) |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |




More information about the users-list mailing list