23 Oct
2014
23 Oct
'14
3:43 p.m.
Hello,
On the table I just "drop " and a "create as select " . I do not perform delete DROP TABLE src_inbox__at__lm__visites ; CREATE TABLE AS src_inbox__at__lm__visites ( SELECT SRC . * BDM.id_unique FROM CBC src__at__lm__visites INNER JOIN bdm_index_client_cookies BDM ON = SRC.idvisiteur BDM.idvisiteur AND BDM.id_univers = 1) WITH DATA ;
this script is run every night
how to explain the fragmentation?
--
1G6
52 route de bischwiller
67300 Schiltigheim
Société de Services et de Formations en Logiciels Libres
http://1g6.biz
Tél : 06 64 63 70 35
----- Mail original -----
De: "Stefan Manegold" Stefan.Manegold@cwi.nl
À: "Communication channel for MonetDB users" users-list@monetdb.org
Envoyé: Jeudi 23 Octobre 2014 09:14:59
Objet: Re: heapsize, speed, and vacuum
Hi,
in addition to Radovan's perfect answer (Thanks!!),
please also see
https://www.monetdb.org/Documentation/Manuals/SQLreference/Transactions
about transactions and deletes
(and yes, the vacuum operation is (for now) still *experimental*)
and
https://www.monetdb.org/Documentation/MemoryClaim
about memory usage and requirements.
Best,
Stefan
----- Original Message -----
> Hello,
> I would guess you are doing a lot of deletes from the table. Monet does
> not delete rows physically but merely marks them as deleted.
> Vacuum command removes deleted rows physically and table scan does not
> go over 80 GB but over 3.6 GB.
> Hope that helps,
> Radovan
>
> On 10/22/2014 11:44 PM, Pierre-Adrien Coustillas wrote:
> > Hello,
> >
> > (google translate)
> >
> > I have a query that takes a long time :
> > select count(distinct idvisiteur) from src_inbox__at__lm__visites;
> >
> > After two hours of running, it is not over, I'll kill her.
> >
> > if I look at the heapsize column storage ()
> > sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites';
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+
> > | schema | table | column | type |
> > location | count | typewidth | columnsize | heapsize | indices
> > | sorted |
> > +========+============================+=====================+=========+=============+===========+===========+============+=============+=========+========+
> > | sys | src_inbox__at__lm__visites | idvisite | varchar |
> > 01/00/10025 | 315819936 | 5 | 2526559488 | 27755937792 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | idvisiteur | varchar |
> > 76/7667 | 315819936 | 19 | 2526559488 | 80449110016 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | idvisiteuridentifie | varchar |
> > 62/6201 | 315819936 | 2 | 1263279744 | 2222391296 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | siteniveau1 | varchar |
> > 77/7704 | 315819936 | 5 | 315819936 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | debutvisite | varchar |
> > 30/3064 | 315819936 | 19 | 2526559488 | 10741678080 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | catvisiteur | varchar |
> > 31/3171 | 315819936 | 0 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | region | varchar |
> > 67/6706 | 315819936 | 12 | 1263279744 | 2077556736 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | ville | varchar |
> > 40/4071 | 315819936 | 10 | 2526559488 | 22635741184 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | os | varchar |
> > 01/26/12673 | 315819936 | 8 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | browser | varchar |
> > 01/44/14410 | 315819936 | 17 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | sourcevisite | varchar |
> > 42/4234 | 315819936 | 1 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | idcampagne | varchar |
> > 01/41/14100 | 315819936 | 0 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | prefixe | varchar |
> > 01/22/12277 | 315819936 | 0 | 315819936 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | libcampagne | varchar |
> > 65/6573 | 315819936 | 0 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | emailing | varchar |
> > 01/04/10434 | 315819936 | 0 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | dateenvois | varchar |
> > 01/22/12264 | 315819936 | 0 | 631639872 | 65536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | nbpages | int |
> > 56/5615 | 315819936 | 4 | 1263279744 | 0 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | dureevisite | time |
> > 01/02/10211 | 315819936 | 4 | 1263279744 | 0 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | id_unique | int |
> > 01/23/12370 | 315819936 | 4 | 1263279744 | 0 | 0
> > | false |
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+
> >
> > Is the heapsize column represents the amount of data to be placed in RAM ?
> > I have 64GB of RAM on the server, so the idvisiteur column can not be
> > held in RAM. Is this true? ( 64 000 000 000 < 80 449 110 016 )
> >
> >
> > I execute a vacuum (experimental command ?) :
> > https://www.monetdb.org/Documentation/UserGuide/AutoloadingScript
> > (Table vacuum commands (experimental!))
> > call vacuum('sys','src_inbox__at__lm__visites');
> >
> > sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites';
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+
> > | schema | table | column | type |
> > location | count | typewidth | columnsize | heapsize | indices
> > | sorted |
> > +========+============================+=====================+=========+=============+===========+===========+============+============+=========+========+
> > | sys | src_inbox__at__lm__visites | idvisite | varchar |
> > 36 | 315819936 | 5 | 2526559488 | 4906811392 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | idvisiteur | varchar |
> > 64/6435 | 315819936 | 19 | 1263279744 | 3667394560 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | idvisiteuridentifie | varchar |
> > 66/6640 | 315819936 | 2 | 1263279744 | 40042496 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | siteniveau1 | varchar |
> > 67/6743 | 315819936 | 5 | 315819936 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | debutvisite | varchar |
> > 70/7045 | 315819936 | 19 | 2526559488 | 7576551424 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | catvisiteur | varchar |
> > 71/7152 | 315819936 | 0 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | region | varchar |
> > 72/7252 | 315819936 | 12 | 1263279744 | 106561536 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | ville | varchar |
> > 73/7353 | 315819936 | 10 | 1263279744 | 1362165760 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | os | varchar |
> > 74/7453 | 315819936 | 8 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | browser | varchar |
> > 75/7554 | 315819936 | 17 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | sourcevisite | varchar |
> > 76/7655 | 315819936 | 1 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | idcampagne | varchar |
> > 77/7761 | 315819936 | 0 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | prefixe | varchar |
> > 01/00/10064 | 315819936 | 0 | 315819936 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | libcampagne | varchar |
> > 01/01/10165 | 315819936 | 0 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | emailing | varchar |
> > 01/02/10270 | 315819936 | 0 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | dateenvois | varchar |
> > 01/03/10372 | 315819936 | 0 | 631639872 | 73728 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | nbpages | int |
> > 01/04/10476 | 315819936 | 4 | 1263279744 | 0 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | dureevisite | time |
> > 01/06/10600 | 315819936 | 4 | 1263279744 | 0 | 0
> > | false |
> > | sys | src_inbox__at__lm__visites | id_unique | int |
> > 01/07/10703 | 315819936 | 4 | 1263279744 | 0 | 0
> > | false |
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+
> >
> > heapsize of idvisteur = 3 667 394 560 (3,6GB) instead of 80 449 110
> > 016 (80 GB)
> >
> >
> > my query is present MUCH faster
> > sql>select count(distinct idvisiteur) from src_inbox__at__lm__visites;
> > +----------+
> > | L1 |
> > +==========+
> > | 75528081 |
> > +----------+
> > 1 tuple (4m 22s)
> >
> >
> > HOURA !!!
> >
> >
> > but WHY WHY WHY ? I want to understand
> >
> > Thank
> >
> > Pierre
> >
> >
> > --
> > 1G6
> > 52 route de bischwiller
> > 67300 Schiltigheim
> > Société de Services et de Formations en Logiciels Libres
> > http://1g6.biz
> > Tél : 06 64 63 70 35
> >
> >
> >
> > _______________________________________________
> > users-list mailing list
> > users-list@monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
> >
>
>
> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>
--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list