Hi Jennie!
I chose the msqldump option, firstly because I thought it was the way to do the backup. Secondly, the msqldump creates the table and then it does a BULK LOAD, therefore for me was a good option. 

If I had chosen to export only the data, I would have had to create the tables either manually, or through another mysqldump (structure only).


Rgds
Ariel



On Tue, Mar 31, 2020 at 6:16 PM Ying Zhang <Y.Zhang@cwi.nl> wrote:
Hai Ariel,

Just a side question, why are you creating msqldump of your data to reload into MonetDB.  Why don’t you immediately prepare CSV data and do the bulk loading?

Regards,
Jennie

> On 31 Mar 2020, at 18:50, Ariel Abadi <aabadi@starbi.com> wrote:
>
> Hi!
>
> Firstly, I would like to thank for the prompt answer!!!
>
> I will wait your answer Panos, to see wether is a bug or not.
>
> In the case this is NOT a bug, and you confirm what Arjen said regarding the format, I think it would be interesting to add such comment on the documentation (https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/LoadingBulkData) which does not refer anything related to the format of the "titles".
> Also, it would be great to use the OFFSET clause disregards of the format of the remainder.
>
> Again, thank you very much to both!!!
>
> Ariel
> PS. The way I solved in this case, was to cut the first 222 rows of each file, and I imported using the Bulk load, which worked perfectly
>
>
> On Tue, Mar 31, 2020 at 7:16 AM Panagiotis Koutsourakis <panagiotis.koutsourakis@monetdbsolutions.com> wrote:
> Hi Arjen,
>
> please see my more recent email about the subject.
>
> Best regards,
> Panos.
>
> On 3/31/20 11:13 AM, Arjen P. de Vries wrote:
> > Hi Panos,
> >
> > Great assistance here!
> >
> > One thing:
> > I can understand why Ariel tried to use OFFSET for that purpose, maybe nice
> > to add to the documentation that skipped lines also should follow the
> > format of the remainder!
> >
> > Greetings,
> >
> > Arjen
> >
> > On Tue, 31 Mar 2020 at 10:51, Panagiotis Koutsourakis <
> > panagiotis.koutsourakis@monetdbsolutions.com> wrote:
> >
> >> Hi Ariel,
> >>
> >> The COPY INTO command recognizes a series of records of the form:
> >>
> >> <data><field delimiter><data><field delimiter>...<data><record delimiter>
> >>
> >> For example the command
> >>
> >>   COPY INTO tbl FROM '/path/to/file' USING DELIMITERS '\t','\n','"';
> >>
> >> will recognize lines of the form:
> >>
> >> 1       "MIA"   7       "MIA|115331571990001804"        2000-08-01
> >>
> >> but nothing else. Specifically the first 222 lines in the monet_list.txt
> >> file be rejected by COPY INTO. You need to delete those lines before you
> >> try to load data with copy into.
> >>
> >> Take a look at the documentation of COPY INTO in the MonetDB web page [1].
> >>
> >> Hope this helps,
> >> Panos.
> >>
> >> [1]
> >> https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/CSV_bulk_loads
> >>
> >> On 3/30/20 7:42 PM, Ariel Abadi wrote:
> >>> Hi Panos!
> >>>
> >>> Enclosed you can find the file (truncated) that I used and the statement
> >> I
> >>> run on mclient.
> >>>
> >>> COPY OFFSET 223 INTO "sb_traf"."nrm_20000818" FROM '/tmp/monet_list.txt'
> >>> USING DELIMITERS '\t','\n','"';
> >>>
> >>>
> >>> Thks
> >>> Ariel
> >>>
> >>>
> >>> On Mon, Mar 30, 2020 at 8:07 AM Panagiotis Koutsourakis <
> >>> panagiotis.koutsourakis@monetdbsolutions.com> wrote:
> >>>
> >>>> Hi Ariel,
> >>>>
> >>>> The first issue is not totally surprising to me. My best guess is that
> >>>> reading from a file is using buffered input while reading from stdin is
> >>>> not. See for example [1] and [2]. I am not sure there is a lot we can
> >> do at
> >>>> the MonetDB side about this.
> >>>>
> >>>> About the second issue could you please post the exact COPY INTO command
> >>>> and the first few lines of the file you are trying to load so that we
> >> can
> >>>> figure out the exact issue?
> >>>>
> >>>> Best regards,
> >>>> Panos.
> >>>>
> >>>> [1]
> >>>>
> >> https://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python
> >>>> [2]
> >>>>
> >> https://discourse.julialang.org/t/stdin-stdout-10x-slower-than-file-open/17233
> >>>>
> >>>> On 3/26/20 11:51 AM, Ariel Abadi wrote:
> >>>>> Hi!
> >>>>>
> >>>>> After a big battle against the DB, I found 2 issues.
> >>>>>
> >>>>> 1) Running the COPY RECORDS FROM *STDIN *and from *FILE*. Is totally
> >>>>> different in terms of performance. While the first one last 24 minutes,
> >>>> the
> >>>>> second one last only *90 secs.*
> >>>>> 2) As I had generated all those files thru msqldump (which it states
> >> the
> >>>>> "CREATE TABLE" instruction and then the COPY RECORDS)... I try ro run
> >> the
> >>>>> COPY *XXXX OFFSET *RECORDS (in order to avoid the create table
> >>>>> statement)... and the query, because of the word "STDIN" in the file.
> >> It
> >>>>> was not OFFSETTING all the rows I was asking for.
> >>>>>
> >>>>> I hope I've been clear... if not please let me know!
> >>>>>
> >>>>> Ariel
> >>>>>
> >>>>> On Tue, Mar 24, 2020 at 11:33 PM Martin Kersten <martin.kersten@cwi.nl
> >>>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi
> >>>>>>
> >>>>>> I am not sure what issue you are now dealing with, but since you are a
> >>>>>> company
> >>>>>> that seems to use (or considering) MonetDB in your commercial
> >> offerings,
> >>>>>> I gladly refer you to MonetDBSolutions(*), where we can give you the
> >>>>>> commercial
> >>>>>> support needed to diagnose your issues.
> >>>>>>
> >>>>>> regards Martin
> >>>>>>
> >>>>>> (*) https://www.monetdbsolutions.com/
> >>>>>>
> >>>>>> On 24/03/2020 23:55, Ariel Abadi wrote:
> >>>>>>> Thanks Martin for your prompt response
> >>>>>>> But I did and msqldump from other database, and now the upload is
> >> being
> >>>>>> done thru the BULK LOAD.
> >>>>>>>
> >>>>>>> /COPY 14173375 RECORDS INTO "sb_traf"."nrm_20180818" FROM stdin USING
> >>>>>> DELIMITERS '\t','\n','"';/
> >>>>>>>
> >>>>>>>
> >>>>>>> Any other idea ? :'(
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Mar 24, 2020 at 7:37 PM Martin Kersten <
> >> martin.kersten@cwi.nl
> >>>>>> <mailto:martin.kersten@cwi.nl>> wrote:
> >>>>>>>
> >>>>>>>     Hi,
> >>>>>>>
> >>>>>>>     given the numbers I suspect you are loading them as SQL insert
> >>>>>> statements under autocommit mode,
> >>>>>>>     Please read the documentation on Bulk loading.
> >>>>>>>
> >>>>>>>     regards,Martin
> >>>>>>>     Sent from my iPad
> >>>>>>>
> >>>>>>>>     On 24 Mar 2020, at 23:12, Ariel Abadi <aabadi@starbi.com
> >> <mailto:
> >>>>>> aabadi@starbi.com>> wrote:
> >>>>>>>>
> >>>>>>>>     
> >>>>>>>>     Hi!
> >>>>>>>>     Maybe somebody can help me, please!
> >>>>>>>>
> >>>>>>>>     I'm trying to upload into Monet a file with 14MM records. Is
> >>>> taking
> >>>>>> more than *24 minutes.*
> >>>>>>>>     It is a fresh new database.
> >>>>>>>>
> >>>>>>>>     Enclosed you will find the server configuration.
> >>>>>>>>
> >>>>>>>>     Monetdb Version Nov2019-SP3
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>     *[root@mn tmp]$ lscpu*
> >>>>>>>>     Architecture:          x86_64
> >>>>>>>>     CPU op-mode(s):        32-bit, 64-bit
> >>>>>>>>     Byte Order:            Little Endian
> >>>>>>>>     CPU(s):                40
> >>>>>>>>     On-line CPU(s) list:   0-39
> >>>>>>>>     Thread(s) per core:    2
> >>>>>>>>     Core(s) per socket:    10
> >>>>>>>>     Socket(s):             2
> >>>>>>>>     NUMA node(s):          2
> >>>>>>>>     Vendor ID:             GenuineIntel
> >>>>>>>>     CPU family:            6
> >>>>>>>>     Model:                 85
> >>>>>>>>     Model name:            Intel(R) Xeon(R) Silver 4114 CPU @
> >> 2.20GHz
> >>>>>>>>     Stepping:              4
> >>>>>>>>     CPU MHz:               800.479
> >>>>>>>>     BogoMIPS:              4400.00
> >>>>>>>>     Virtualization:        VT-x
> >>>>>>>>     L1d cache:             32K
> >>>>>>>>     L1i cache:             32K
> >>>>>>>>     L2 cache:              1024K
> >>>>>>>>     L3 cache:              14080K
> >>>>>>>>     NUMA node0 CPU(s):
> >>>>>> 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> >>>>>>>>     NUMA node1 CPU(s):
> >>>>>> 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> >>>>>>>>     Flags:                 fpu vme de pse tsc msr pae mce cx8 apic
> >> sep
> >>>>>> mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht
> >> tm
> >>>> pbe
> >>>>>> syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts
> >>>>>> rep_good nopl xtopology nonstop_tsc
> >>>>>>>>     cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
> >>>>>> tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe
> >>>>>> popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
> >>>>>> 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3
> >>>>>>>>     invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp
> >> tpr_shadow
> >>>>>> vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2
> >> smep
> >>>>>> bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap
> >>>>>> clflushopt clwb intel_pt avx512cd avx512bw
> >>>>>>>>     avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc
> >>>>>> cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear
> >>>>>> flush_l1d
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>     *[root@mn tmp]$ cat /proc/meminfo*
> >>>>>>>>
> >>>>>>>>     MemTotal:       83886080 kB
> >>>>>>>>     MemFree:        10535812 kB
> >>>>>>>>     MemAvailable:   80159744 kB
> >>>>>>>>     Buffers:               0 kB
> >>>>>>>>     Cached:         69623932 kB
> >>>>>>>>     SwapCached:            0 kB
> >>>>>>>>     Active:          6625436 kB
> >>>>>>>>     Inactive:       63737872 kB
> >>>>>>>>     Active(anon):     324456 kB
> >>>>>>>>     Inactive(anon):   417184 kB
> >>>>>>>>     Active(file):    6300980 kB
> >>>>>>>>     Inactive(file): 63320688 kB
> >>>>>>>>     Unevictable:           0 kB
> >>>>>>>>     Mlocked:            5320 kB
> >>>>>>>>     SwapTotal:       8388604 kB
> >>>>>>>>     SwapFree:        8381988 kB
> >>>>>>>>     Dirty:               220 kB
> >>>>>>>>     Writeback:             0 kB
> >>>>>>>>     AnonPages:      13626052 kB
> >>>>>>>>     Mapped:           186368 kB
> >>>>>>>>     Shmem:              1800 kB
> >>>>>>>>     KReclaimable:    4134960 kB
> >>>>>>>>     Slab:               0 kB
> >>>>>>>>     SReclaimable:          0 kB
> >>>>>>>>     SUnreclaim:            0 kB
> >>>>>>>>     KernelStack:       16720 kB
> >>>>>>>>     PageTables:        56500 kB
> >>>>>>>>     NFS_Unstable:          0 kB
> >>>>>>>>     Bounce:                0 kB
> >>>>>>>>     WritebackTmp:          0 kB
> >>>>>>>>     CommitLimit:    74123472 kB
> >>>>>>>>     Committed_AS:   18337812 kB
> >>>>>>>>     VmallocTotal:   34359738367 kB
> >>>>>>>>     VmallocUsed:      415836 kB
> >>>>>>>>     VmallocChunk:          0 kB
> >>>>>>>>     Percpu:            39552 kB
> >>>>>>>>     HardwareCorrupted:     0 kB
> >>>>>>>>     AnonHugePages:         0 kB
> >>>>>>>>     ShmemHugePages:        0 kB
> >>>>>>>>     ShmemPmdMapped:        0 kB
> >>>>>>>>     CmaTotal:              0 kB
> >>>>>>>>     CmaFree:               0 kB
> >>>>>>>>     HugePages_Total:       0
> >>>>>>>>     HugePages_Free:        0
> >>>>>>>>     HugePages_Rsvd:        0
> >>>>>>>>     HugePages_Surp:        0
> >>>>>>>>     Hugepagesize:       2048 kB
> >>>>>>>>     Hugetlb:               0 kB
> >>>>>>>>     DirectMap4k:      816396 kB
> >>>>>>>>     DirectMap2M:    82513920 kB
> >>>>>>>>     DirectMap1G:    52428800 kB

_______________________________________________
users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list