Dear all,
   Thanks for all the responses. I guess "storage()" contains most of the information I was looking for. I will analyze the data through the storage summary.
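
For the record, here is a minimal sketch of the kind of query I plan to run against the storage summary. The column names are assumed from the sys.storage view (or the sys."storage"() function in some releases) and may differ between MonetDB versions, so please verify against your installation:

    -- Per-column on-disk footprint, largest first.
    -- "schema", "table", "column", columnsize and heapsize are assumed
    -- from the sys.storage view; adjust for your MonetDB release.
    SELECT "schema", "table", "column", "type", "count",
           columnsize, heapsize, columnsize + heapsize AS totalsize
    FROM sys.storage
    ORDER BY columnsize + heapsize DESC
    LIMIT 20;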

Regards,
Manish

On Fri, Mar 24, 2017 at 6:55 AM, Manish gupta <gahoimnshg@gmail.com> wrote:
One more observation: when I compress the MonetDB database directory, I get roughly 5.5x compression (compressed size is ~30 GB). That is a surprisingly high ratio for binary data.

Regards,
Manish

On Thu, Mar 23, 2017 at 8:16 PM, Manish gupta <gahoimnshg@gmail.com> wrote:
Hi Stefan,
    Thanks for explaining it.

I am wondering whether the columnar structure will help compress the data, given the very low cardinality of more than half of these columns. In practice, I want to estimate how much physical memory is sufficient to keep this database in memory, avoid swapping, and get good performance on ad-hoc queries. 64 GB looks insufficient; is there any way to get such an estimate without multiple trial-and-error iterations?

Regards,
Manish

On Thu, Mar 23, 2017 at 7:14 PM, Stefan Manegold <Stefan.Manegold@cwi.nl> wrote:

Dear Manish,

Since MonetDB uses memory-mapped files (also on Windows),
the (only) "utility" that knows for sure which data is in physical memory at any given time
is the OS (operating system), which does the virtual memory management.

A back-of-the-envelope calculation yields:
- assuming 4-byte-wide columns (e.g., integer, real) on average:
   200M * 70 * 4 bytes ~= 56 GB
- assuming 8-byte-wide columns (e.g., bigint, double, float, timestamp) on average:
   200M * 70 * 8 bytes ~= 112 GB

Additionally, MonetDB might have decided to build (and persist) some indexes.

Note that this calculation only considers persistent base data,
i.e., no transient intermediate results that might also consume
(temporary) disk space.
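
To cross-check such an estimate against the actual persistent footprint, you could aggregate MonetDB's storage summary per table. This is only a sketch; the column names are assumed from the sys.storage view and should be verified against your release:

    -- Approximate persistent footprint per table, in GB;
    -- transient intermediates are not included.
    SELECT "schema", "table",
           SUM(columnsize + heapsize) / 1024.0 / 1024 / 1024 AS size_gb
    FROM sys.storage
    GROUP BY "schema", "table"
    ORDER BY size_gb DESC;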

Best,
Stefan

----- On Mar 23, 2017, at 2:31 PM, Manish gupta gahoimnshg@gmail.com wrote:

> Dear all,
> I am looking at some memory profiling of the MonetDB database.
> Basically, the database size on disk is ~160 GB (although I am not entirely
> convinced by this large size; there are 200M records with ~70 columns
> distributed among two big tables and several relatively smaller tables).
> Right now, I have a dedicated machine with 64 GB of physical RAM for this
> database, but soon after firing queries on these tables (with all sorts of
> permutations on columns, but no joins between tables), the memory is almost
> fully occupied, and the resource crunch kills performance.
> Is there any utility that shows which columns are currently in memory and
> how large they are? And, even better, is there a setting through which I
> can control which columns should remain in memory and which should be
> evicted immediately after a query returns?
>
> I am using a 16-core Windows machine with a 64-bit architecture.
>
> Regards,
> Manish
>

--
| Stefan.Manegold@CWI.nl | DB Architectures   (DA) |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list