Hi,
I was analysing the insert performance of MonetDB.
I am trying to optimize a process wherein a single tuple has to be
inserted into a table of 1200 columns, which presently takes about
7.5 seconds. The query has values for only 5 of the 1200 columns, yet
the speed is still the same. Is it possible to reduce this time?
While digging into this, I came across this thread:
https://www.monetdb.org/pipermail/users-list/2005-August/000139.html
> What is the insert speed relative to other DBs?
>
> Using SQL, with the current version we won't compete very well. We get
> much more acceptable speeds in 'raw' mode, where lots of the parsing
> overhead is removed.
Can anyone please help me understand what the 'raw' mode mentioned in
this thread is?
I am using the Oct2014 version on a CentOS machine with 128 GB RAM.
Thanks & Regards,
Vijayakrishna.P.
Mobile : (+91) 9500402305.
The MonetDB team at CWI/MonetDB BV is pleased to announce the
Jul2015 feature release of the MonetDB suite of programs.
More information about MonetDB can be found on our website at
<http://www.monetdb.org/>.
For details on this release, please see the release notes at
<http://www.monetdb.org/Downloads/ReleaseNotes>.
As usual, the download location is <http://dev.monetdb.org/downloads/>.
Jul2015 feature release
MonetDB5 Server
* Implemented batcalc.min and batcalc.max. Made calc.min and calc.max
generic so that no other implementations are needed.
* Removed function batcalc.ifthen.
Build Environment
* Upgraded the license to the Mozilla Public License Version 2.0.
Client Package
* Added a new output format to mclient: --format=expanded (or -fx).
In this format, column values are shown in full and below each
other.
MonetDB Common
* BUNtvar and BUNhvar macros no longer work for TYPE_void columns.
* Changed the interfaces of many GDK-level functions: when they
modify a BAT, they no longer return the same BAT or NULL, but
instead return GDK_SUCCEED or GDK_FAIL.
* Changed a bunch of hash-related functions to work on the tail
column. The functions that have been changed to work on the tail
column are: BAThash, BATprepareHash, HASHgonebad, HASHins, and
HASHremove.
Java Module
* Improved the JDBC driver to no longer throw a NullPointerException
when calling isNullable(), getPrecision(), getScale(),
getColumnDisplaySize(), getSchemaName(), getTableName(), or
getColumnClassName() on a ResultSetMetaData object.
* The Java classes are now compiled with the latest Java 1.8 version,
targeting Java 1.7.
SQL
* Differences between time, timestamp, and date values now return
properly typed interval types (second or month intervals) instead
of integers.
* Added support for 128-bit integers (called HUGEINT) on platforms
that support this.
* Removed support for the mseed library.
* Removed support for RDF.
* Removed DataCell. It was experimental code that was never enabled.
Bug Fixes
* 2618: Implement master slave scheme
* 3361: constants as MAL function parameters prevent intermediate
reuse
* 3364: Cannot set role back to a user's default role
* 3365: Unable to grant object privileges while having a non-default
current_role
* 3440: Sequence type errors
* 3449: mserver crash on start - Freebsd 10 amd64
* 3476: Cannot revoke object access
* 3496: autocompletion table names does not work correctly
* 3556: when 2 multiplexed functions in MAL plan, only one is mapped
correctly to bat<mod>.function primitive
* 3564: Request: add support for postgresql specific scalar function:
split_part(string text, delimiter text, field int)
* 3625: SIGSEGV because mat array can overrun in opt_mergetable.c
* 3627: SQRT in CASE does not work as of Oct2014
* 3654: configure --enable-fits requires extra commands after
creating a database instance
* 3673: mclient 'expanded' row formatter
* 3674: Obfuscate event tracing
* 3679: No error is given when incorrect timezone value is specified
for a timetz column
* 3686: Wrong associativity of multiply/divide
* 3702: Filter function not found if created in a user schema
* 3708: wrong scoping for cross-schema view references
* 3716: alter table my_merge_table drop table t1; crashes mserver5
with Segmentation fault
* 3724: Wrong size calculation in BATsubjoin
* 3730: SAMPLE function not sampling randomly
* 3732: memory leak (of InstrRecord) in opt_mergetable
* 3733: "(TRUE OR <Exp>) AND <Exp>" is evaluated incorrectly
* 3735: python connection with unix_socket
* 3736: crash if mclient disconnects abruptly during a query
* 3738: Database inconsistency when using savepoint
* 3739: CASE statements do not handle NULLs in the IN () operator
properly
* 3740: select epoch(now()); types timestamptz(7,0) and bigint(64,0)
are not equal
* 3742: Division By Zero
* 3744: cast to int gives different results for decimal than double
* 3747: joins fail in the presence of nulls
* 3748: Missing META-INF/services/java.sql.Driver in JDBC package
* 3753: Hang on json field parsing
* 3754: select from a REMOTE TABLE referring local table crashes
mserver5
* 3756: column type conversion sticks to subsequent queries
* 3758: "COPY INTO ..." doesn't work, if executing from 2 processes
concurrently.
* 3759: select data from "sys"."rejects" returns unexpected error and
when next select data from "sys"."sessions" causes an assertion
failure in mal_interpreter.c:646.
* 3760: SQL parser has problem with (position of) a scalar subquery
in a SELECT-list
* 3761: SQL executor has problem with (position of) a subquery in a
SELECT-list. Inconsistent behavior.
* 3763: JDBC PreparedStatement for a table with 14 Foreign Keys
crashing the Database
* 3764: DROPping multiple users causes a crash
* 3765: Re-granting a revoked privilege does not work
* 3766: VIEW not visible if created under a different schema
* 3767: CREATE TEMP TABLE using "LIKE" incorrectly handled
* 3769: SIGSEGV when combining a cast/column alias with a UNION ALL
view
* 3770: combined conditions on declared table in User Defined
Function definition crashes monetdb
* 3771: Owner of the schema loses rights if assumes the monetdb role.
* 3772: Any user can grant a role.
* 3773: quantile(col, 0) and quantile(col, 1) fail
* 3774: mclient is unaware of merge tables and remote tables
* 3775: COPY INTO: Backslash preceding field separator kills import
* 3778: Crash on remote table schema mismatch
* 3779: server crashes on MAX() on SELECT DISTINCT something combo
* 3783: Behavioural change in Jul2015 for 'timestamp minus timestamp'
* 3784: Assertion failed: (bn->batCapacity >= cnt), function
BAT_scanselect, file gdk_select.c, line 1008.
* 3785: sum(interval) causes overflow in conversion to bte
* 3786: ResultSet.close() never sends Xclose to free resources
* 3787: "b and g must be aligned" from complex group/union query
* 3791: HEAPextend: failed to extend to 2420077101056
Hello developers-list,
I am writing a C function (something like "cast"), and I have a
problem with proper handling of the value type "any".
Say I have this C function:
str
testFn(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
    str *ret = getArgReference_str(stk, pci, 0);
    ValPtr sourceValue = &stk->stk[pci->argv[1]];
    char tmp[256] = {0};

    (void) cntxt;
    (void) mb;
    snprintf(tmp, sizeof(tmp), "Source value type is \"%s\" (%d)\n",
             getTypeName(sourceValue->vtype), sourceValue->vtype);
    *ret = GDKstrdup(tmp);
    return MAL_SUCCEED;
}
MAL function:
pattern testFn(value:any_1):str
address testFn
comment "Test function";
Then I added the following code to sql_types.c, where all built-in
types and functions get initialized:
sql_create_func(sa, "testFn", "somemodule", "testFn", ANY, NULL, STR, SCALE_NONE);
When I pass a "float" number, I get only an integer. For example:
SELECT testFn(123.12);
yields an "int"-typed ValRecord, and the value of the passed argument
is "12312" (without the decimal point). But I was expecting a double
or float value.
Please help me solve this.
--
Best regards,
Sherzod mailto:shmutalov@gmail.com
Thanks for the links, Brian.
Dear all,
From this, I see that a database can be configured for any number of
clients, which raises a question about the number of queries processed
at a time.
If I am using an 8-core machine and connecting 50 client threads (each
executing many queries one by one) at a time to a database, will the
mserver be processing queries from ALL 50 clients at any given
instant? Or will that be restricted by the number of cores? If it is
restricted, is there a default queue maintained by the server?
Put in simpler words: is the maximum number of queries processed by
MonetDB at a given time equal to the number of client connections?
If not, how is this handled (with a queue)?
Thanks & Regards,
Vijayakrishna.P.
Mobile : (+91) 9500402305.
On Wed, Aug 12, 2015 at 6:11 PM, Brian Hood <brianh6854(a)googlemail.com>
wrote:
> Hi Vijay,
>
> I have found some interesting info in the monetdb man page actually.
>
> --- This interests me, as it allows you to limit the number of clients
> that can connect to a database, not just an mserver:
>
> nclients=number
>
> Sets the maximum amount of clients that can connect to this database at
> the same time. Setting this to a high value is discouraged. A
> multiplex-funnel may be more performant, see *MULTIPLEX-FUNNEL* below.
>
> Regards,
>
> Brian Hood
>
> On Wed, Aug 12, 2015 at 7:48 AM, Vijay Krishna <vijayakrishna55(a)gmail.com>
> wrote:
>
>> Thanks Stefan and Brian.
>>
>> When I went through monetdb man page, the usage of nthreads (set as
>> gdk_nr_threads) environmental variable was defined as
>>
>> >> Defines how many worker threads the server should use to perform main
>> processing. Normally, this number equals the number of available CPU cores
>> in the system. Reducing this number forces the server to use less
>> parallelism when executing queries, or none at all if set to *1*.
>> I am a bit confused by this statement. The variable "nthreads"
>> controls one of:
>>
>> 1. the number of threads processing a single query using multiple
>> cores;
>> 2. the number of queries concurrently being executed over all
>> databases by the mserver.
>>
>> Please help me understand which of these two is correct.
>> This doubt arises because I have read on the MonetDB home page that
>> "MonetDB is designed for multi-core parallel execution on desktops to
>> reduce response time for complex query processing".
>>
>> If the variable "nthreads" controls the latter, then does MonetDB
>> have a built-in queue to store the queries from the remaining
>> connections (controlled by the variable nclients) to the mserver?
>> I have read about the multiplex funnel and am a bit wary of the
>> transactional disadvantages mentioned there.
>> Is there any other solution?
>>
>> Thanks & Regards,
>>
>> Vijayakrishna.P.
>> Mobile : (+91) 9500402305.
>>
>> On Tue, Aug 11, 2015 at 6:56 PM, Stefan Manegold <Stefan.Manegold(a)cwi.nl>
>> wrote:
>>
>>> Hi,
>>>
>>> indeed;
>>>
>>> see also
>>> `man mserver5` and `man monetdb`
>>> or
>>> https://www.monetdb.org/Documentation/mserver5-man-page
>>> and
>>> https://www.monetdb.org/Documentation/monetdb-man-page
>>>
>>> Stefan
>>>
>>> ----- On Aug 11, 2015, at 3:11 PM, Brian Hood brianh6854(a)googlemail.com
>>> wrote:
>>>
>>> > Hi Vijay,
>>> >
>>> > Well if you take a look at your merovingian.log you can see the
>>> command that
>>> > starts the database.
>>> >
>>> > /usr/bin/mserver5 --dbpath=/data2/threatmonitor/threatmonitor --set
>>> > merovingian_uri=mapi:monetdb://orville.lan:50000/threatmonitor --set
>>> > mapi_open=false --set mapi_port=0 --set
>>> > mapi_usock=/data2/threatmonitor/threatmonitor/.mapi.sock --set
>>> > monet_vault_key=/data2/threatmonitor/threatmonitor/.vaultkey --set
>>> > gdk_nr_threads=2 --set max_clients=64 --set sql_optimizer=default_pipe
>>> --set
>>> > monet_daemon=yes
>>> >
>>> >
>>> > These two, I think, are the ones you're looking for.
>>> >
>>> > --set gdk_nr_threads=2 --set max_clients=64
>>> >
>>> > Regards,
>>> >
>>> > Brian Hood
>>> >
>>> >
>>> > On Tue, Aug 11, 2015 at 7:14 AM, Vijay Krishna <
>>> vijayakrishna55(a)gmail.com >
>>> > wrote:
>>> >
>>> >
>>> >
>>> > Oops, my bad. Thanks, Brian.
>>> >
>>> > Yes, I was indeed looking for connection pools... something like
>>> > the maximum thread concurrency option in InnoDB.
>>> >
>>> > Is anything like that available with MonetDB?
>>> >
>>> > I had gone through this option: mserver5 --set max_clients=<num>
>>> > Is this the only way to control parallel threads?
>>> >
>>> > Thanks & Regards,
>>> >
>>> > Vijayakrishna.P.
>>> > Mobile : (+91) 9500402305 .
>>> >
>>> > On Mon, Aug 10, 2015 at 7:48 PM, Brian Hood <
>>> brianh6854(a)googlemail.com > wrote:
>>> >
>>> >
>>> >
>>> > Hi Vijay,
>>> >
>>> > I think the words you're looking for are "connection pooling".
>>> >
>>> > The InnoDB buffer pool in MySQL is the amount of data stored in
>>> > memory, which you set to a certain size, and has nothing to do
>>> > with network traffic.
>>> >
>>> > Regards,
>>> >
>>> > Brian Hood
>>> >
>>> > On Mon, Aug 10, 2015 at 8:45 AM, Vijay Krishna <
>>> vijayakrishna55(a)gmail.com >
>>> > wrote:
>>> >
>>> >
>>> >
>>> > Hi,
>>> >
>>> > Does MonetDB have any built-in buffer pools to handle parallel
>>> > requests?
>>> >
>>> > I see that the default number of parallel connections accepted
>>> > and processed at a time by MonetDB is 64.
>>> > While I see that there are ways to increase it, is there any
>>> > built-in way to pool incoming requests until the live 64 requests
>>> > are processed?
>>> >
>>> > For example, our application would receive 100 requests at a
>>> > time, while we wish to limit the number of queries handled at a
>>> > time by MonetDB to 50. Is there any way to automatically queue
>>> > the remaining 50 (something like the InnoDB buffer pool)?
>>> >
>>> > Thanks & Regards,
>>> >
>>> > Vijayakrishna.P.
>>> > Mobile : (+91) 9500402305 .
>>> >
>>> > _______________________________________________
>>> > users-list mailing list
>>> > users-list(a)monetdb.org
>>> > https://www.monetdb.org/mailman/listinfo/users-list
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>> --
>>> | Stefan.Manegold(a)CWI.nl | DB Architectures (DA) |
>>> | www.CWI.nl/~manegold/ | Science Park 123 (L321) |
>>> | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
>>>
>>
>>
>>
>>
>
>
>
I am sorry, I didn't realize the mailing list is not there in my reply.
On 7/20/15, 14:24, "Ying Zhang" <Y.Zhang(a)cwi.nl> wrote:
>Hello Anu,
>
>Would you please post this to the MonetDB users mailing list? Thanks!
>
>Regards,
>
>Jennie
>
>> On Jul 20, 2015, at 19:44, Anu Raj Srivastava <Anu.Srivastava(a)rms.com>
>>wrote:
>>
>> Hi Hannes,
>>
>> I am using the Apache DBCP2 connection pool. I am running queries in
>> a constant loop with 8 concurrent invocations, and each of the 8
>> threads is running 9 queries again and again.
>>
>> The Java code is quite straightforward:
>>
>> private ConnectionPoolPG() {
>>     try {
>>         final Properties properties = new Properties();
>>         properties.load(ConnectionPoolPG.class.getClassLoader()
>>                 .getResourceAsStream("monetDB.properties"));
>>         dataSource = BasicDataSourceFactory.createDataSource(properties);
>>         schemaName = properties.getProperty("schemaname");
>>     } catch (Exception e) {
>>         throw new ExposureStoreException(
>>                 "Unable to instantiate connection pool: " + e.getMessage(),
>>                 EAPAException.ERR_STORE_CONNECTION_CONFIG_ERROR);
>>     }
>> }
>>
>> public void AllQueryPerformance() throws Exception {
>>     while (true) {
>>         if (queries != null) {
>>             for (String query : queries) {
>>                 try (Connection connection =
>>                         ConnectionPoolPG.getInstance().getConnection()) {
>>                     try (Statement statement = connection.createStatement()) {
>>                         statement.execute(query);
>>                     }
>>                 }
>>             }
>>         }
>>     }
>> }
>>
>> -----Original Message-----
>> From: Hannes Mühleisen [mailto:Hannes.Muehleisen@cwi.nl]
>> Sent: Sunday, July 19, 2015 4:47 AM
>> To: Communication channel for developers of the MonetDB suite.
>> Cc: Ying Zhang; Anu Raj Srivastava
>> Subject: Re: MonetDB causing disk space to fill up
>>
>>
>>> On 19 Jul 2015, at 13:23, Hannes Mühleisen <Hannes.Muehleisen(a)cwi.nl>
>>>wrote:
>>>
>>> Hi,
>>>
>>>> On 16 Jul 2015, at 23:11, Ying Zhang <Y.Zhang(a)cwi.nl> wrote:
>>>>
>>>> On Jul 13, 2015, at 22:32, Anu Raj Srivastava
>>>><Anu.Srivastava(a)rms.com> wrote:
>>>>>
>>>>> Hey Jennie,
>>>>>
>>>>> I did upgrade to the latest version but I am facing the same
>>>>> issue. Further debugging suggests it is a ConnectionPool issue.
>>>>> When I use a new connection for each query, MonetDB seems to be
>>>>> able to clean up the resources,
>>>>
>>>> OK, MonetDB is doing what it's supposed to do.
>>>>
>>>>> but if I use the ConnectionPool, for some reason the disk space
>>>>> keeps filling up.
>>>>
>>>> If a connection is kept open, I'd expect MonetDB to be less eager
>>>> to clean up resources.
>>>> However, we generally recommend using a ConnectionPool, to avoid too
>>>>much overhead caused by making a new connection for every query. So
>>>>my suggestion for now would be to keep using the ConnectionPool, but
>>>>try to find the reason why your disk is filled up.
>>>>
>>>>> Do you have any idea why that could happen?
>>>>
>>>> Hmmm, no. We've fixed quite a few leaks in Jul2015...
>>>> Some more thinking:
>>>> - After how many iterations of repeatedly executing the queries is
>>>>your disk filled up?
>>>> - Can you please try to run your queries with mclient to observe the
>>>>behaviour of MonetDB (and disk usage)?
>>>>
>>>> @Hannes: sorry to disturb you during holiday. But if possible, would
>>>>you please have a look at this thread. Does anything ring a bell?
>>>>Thanks!
>>>
>>> I tested this with a c3p0 connection pool (limited to 4 concurrent
>>> connections) and repeated queries from 10 threads (SELECT
>>> MIN(something) FROM …). I was not able to observe any (!) increase in
>>> storage footprint. So without a minimal reproducible example of Java
>>> code + DB schema + sample data, it is hard to tell what is going on.
>>> Is this running in auto-commit? Using temporary tables?
>>
>> Small follow-up: when forcing the allocation of a large intermediate
>> (SELECT MIN(something+1) FROM …), additional storage space is indeed
>> claimed. The attached plot shows dbfarm size over time while 10
>> threads are running 1000 queries each using a 5-connection pool. You
>> can see how the amount of additional storage drops back to exactly
>> where it was once the pool shuts down at around 350 s, and that there
>> is no accumulation of additional storage while the pool is active.
>>
>> Hannes
>>
>
Hi,
I might have missed that,
but the TRACE option / runtime feature to profile SQL queries
(cf., https://www.monetdb.org/Documentation/Manuals/SQLreference/Runtime)
appears to be gone in Jul2015.
Bug or feature?
Thanks!
Stefan
--
| Stefan.Manegold(a)CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Hi,
When browsing the source code of MonetDB, I found that functions for
operators all have a *sub* prefix, e.g., BATsubsort, BATsubjoin. Is
there any information on how to understand this *sub* naming? Thanks.
Best regards,
Wenjian