Hello,

On the table I just "drop" and a "create as select"
I do not perform delete
DROP TABLE src_inbox__at__lm__visites;
CREATE TABLE AS src_inbox__at__lm__visites (
                                 SELECT SRC. * BDM.id_unique
                                 FROM CBC src__at__lm__visites
                                 INNER JOIN bdm_index_client_cookies BDM
                                 ON = SRC.idvisiteur BDM.idvisiteur
                                 AND BDM.id_univers = 1) WITH DATA;

this script is run every night

how to explain the fragmentation?

--
1G6
52 route de bischwiller
67300 Schiltigheim

Société de Services et de Formations en Logiciels Libres
http://1g6.biz
Tél : 06 64 63 70 35



De: "Stefan Manegold" <Stefan.Manegold@cwi.nl>
À: "Communication channel for MonetDB users" <users-list@monetdb.org>
Envoyé: Jeudi 23 Octobre 2014 09:14:59
Objet: Re: heapsize, speed, and  vacuum

Hi,

in addition to Radovan's perfect answer (Thanks!!),

please also see
https://www.monetdb.org/Documentation/Manuals/SQLreference/Transactions
about transactions and deletes
(and yes, the vacuum operation is (for now) still *experimental*)

and
https://www.monetdb.org/Documentation/MemoryClaim
about memory usage and requirements.

Best,
Stefan

----- Original Message -----
> Hello,
> I would guess you are doing a lot of deletes from the table. Monet does
> not delete rows physically but merely marks them as deleted.
> Vacuum command removes deleted rows physically and table scan does not
> go over 80 GB but over 3.6 GB.
> Hope that helps,
> Radovan
>
> On 10/22/2014 11:44 PM, Pierre-Adrien Coustillas wrote:
> > Hello,
> >
> >    (google translate)
> >
> > I have a query that takes a long time :
> > select count(distinct idvisiteur) from src_inbox__at__lm__visites;
> >
> > After two hours of running, it is not over, I'll kill her.
> >
> > if I look at the heapsize column storage ()
> > sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites';
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+
> > | schema | table                      | column              | type    |
> > location    | count     | typewidth | columnsize | heapsize    | indices
> > | sorted |
> > +========+============================+=====================+=========+=============+===========+===========+============+=============+=========+========+
> > | sys    | src_inbox__at__lm__visites | idvisite            | varchar |
> > 01/00/10025 | 315819936 |         5 | 2526559488 | 27755937792 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | idvisiteur          | varchar |
> > 76/7667     | 315819936 |        19 | 2526559488 | 80449110016 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | idvisiteuridentifie | varchar |
> > 62/6201     | 315819936 |         2 | 1263279744 |  2222391296 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | siteniveau1         | varchar |
> > 77/7704     | 315819936 |         5 |  315819936 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | debutvisite         | varchar |
> > 30/3064     | 315819936 |        19 | 2526559488 | 10741678080 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | catvisiteur         | varchar |
> > 31/3171     | 315819936 |         0 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | region              | varchar |
> > 67/6706     | 315819936 |        12 | 1263279744 |  2077556736 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | ville               | varchar |
> > 40/4071     | 315819936 |        10 | 2526559488 | 22635741184 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | os                  | varchar |
> > 01/26/12673 | 315819936 |         8 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | browser             | varchar |
> > 01/44/14410 | 315819936 |        17 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | sourcevisite        | varchar |
> > 42/4234     | 315819936 |         1 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | idcampagne          | varchar |
> > 01/41/14100 | 315819936 |         0 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | prefixe             | varchar |
> > 01/22/12277 | 315819936 |         0 |  315819936 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | libcampagne         | varchar |
> > 65/6573     | 315819936 |         0 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | emailing            | varchar |
> > 01/04/10434 | 315819936 |         0 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | dateenvois          | varchar |
> > 01/22/12264 | 315819936 |         0 |  631639872 |       65536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | nbpages             | int     |
> > 56/5615     | 315819936 |         4 | 1263279744 |           0 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | dureevisite         | time    |
> > 01/02/10211 | 315819936 |         4 | 1263279744 |           0 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | id_unique           | int     |
> > 01/23/12370 | 315819936 |         4 | 1263279744 |           0 |       0
> > | false  |
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+
> >
> > Is the heapsize column represents the amount of data to be placed in RAM ?
> > I have 64GB of RAM on the server, so the idvisiteur column can not be
> > held in RAM. Is this true?  ( 64 000 000 000 <  80 449 110 016 )
> >
> >
> > I execute a vacuum (experimental command ?) :
> >    https://www.monetdb.org/Documentation/UserGuide/AutoloadingScript
> >   (Table vacuum commands (experimental!))
> > call vacuum('sys','src_inbox__at__lm__visites');
> >
> > sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites';
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+
> > | schema | table                      | column              | type    |
> > location    | count     | typewidth | columnsize | heapsize   | indices
> > | sorted |
> > +========+============================+=====================+=========+=============+===========+===========+============+============+=========+========+
> > | sys    | src_inbox__at__lm__visites | idvisite            | varchar |
> > 36          | 315819936 |         5 | 2526559488 | 4906811392 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | idvisiteur          | varchar |
> > 64/6435     | 315819936 |        19 | 1263279744 | 3667394560 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | idvisiteuridentifie | varchar |
> > 66/6640     | 315819936 |         2 | 1263279744 |   40042496 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | siteniveau1         | varchar |
> > 67/6743     | 315819936 |         5 |  315819936 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | debutvisite         | varchar |
> > 70/7045     | 315819936 |        19 | 2526559488 | 7576551424 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | catvisiteur         | varchar |
> > 71/7152     | 315819936 |         0 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | region              | varchar |
> > 72/7252     | 315819936 |        12 | 1263279744 |  106561536 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | ville               | varchar |
> > 73/7353     | 315819936 |        10 | 1263279744 | 1362165760 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | os                  | varchar |
> > 74/7453     | 315819936 |         8 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | browser             | varchar |
> > 75/7554     | 315819936 |        17 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | sourcevisite        | varchar |
> > 76/7655     | 315819936 |         1 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | idcampagne          | varchar |
> > 77/7761     | 315819936 |         0 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | prefixe             | varchar |
> > 01/00/10064 | 315819936 |         0 |  315819936 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | libcampagne         | varchar |
> > 01/01/10165 | 315819936 |         0 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | emailing            | varchar |
> > 01/02/10270 | 315819936 |         0 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | dateenvois          | varchar |
> > 01/03/10372 | 315819936 |         0 |  631639872 |      73728 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | nbpages             | int     |
> > 01/04/10476 | 315819936 |         4 | 1263279744 |          0 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | dureevisite         | time    |
> > 01/06/10600 | 315819936 |         4 | 1263279744 |          0 |       0
> > | false  |
> > | sys    | src_inbox__at__lm__visites | id_unique           | int     |
> > 01/07/10703 | 315819936 |         4 | 1263279744 |          0 |       0
> > | false  |
> > +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+
> >
> > heapsize of idvisteur =  3 667 394 560 (3,6GB)  instead of  80 449 110
> > 016  (80 GB)
> >
> >
> > my query is present MUCH faster
> > sql>select count(distinct idvisiteur) from src_inbox__at__lm__visites;
> > +----------+
> > | L1       |
> > +==========+
> > | 75528081 |
> > +----------+
> > 1 tuple (4m 22s)
> >
> >
> > HOURA !!!
> >
> >
> > but WHY WHY WHY ?  I want to understand
> >
> > Thank
> >
> > Pierre
> >
> >
> > --
> > 1G6
> > 52 route de bischwiller
> > 67300 Schiltigheim
> > Société de Services et de Formations en Logiciels Libres
> > http://1g6.biz
> > Tél : 06 64 63 70 35
> >
> >
> >
> > _______________________________________________
> > users-list mailing list
> > users-list@monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
> >
>
>
> _______________________________________________
> users-list mailing list
> users-list@monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>

--
| Stefan.Manegold@CWI.nl | DB Architectures   (DA) |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |

_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list