Hello,
sorry for my late replay, I have been really busy in the last few days.

> Hello Stefano,
>
> Which MonetDB version do you use?  On which OS?

Sorry, I forgot to mention this. I am using ubuntu 16.04 and I am running MonetDB inside a Docker container (which also runs ubuntu).


>> On 4 Jul 2017, at 16:32, Stefano Piani <stefano.piani@exact-lab.it> wrote:
>>
>> Dear monetdb users,
>>
>> I started to use MonetDB a few weeks ago and, until now, I have been
>> able to use it proficiently. Unfortunately, I have faced a small issue
>> that I have not been able to figure out how to fix.
>>
>> I collect data generated by different users. A unique id is associated
>> to each user. I have a specific table (called "user") where I save the
>> name and some other information about the user. In this table, the ID
>> is a primary key.
>> When I collect the data, I want to check if the specific user who
>> generated that data is already saved in the user table. If this is not
>> the case,
>> I want to add a row to the user table for this user.
>> Unfortunately, I have not been able to accomplish this operation in
>> one SQL query and, therefore, I have been forced to query for the user
>> id and,
>> if the answer is negative, add the user using another query. The
>> following is the python script that perform the operations.
>>
>> def save_user_in_database(user, cursor):
>>    db_line = 'SELECT 1 FROM users WHERE user_id = {}'.format(user.id)
>>    LOGGER.info('Executing on DB: %s', db_line)
>>    cursor.execute(db_line)
>>
>>    answer = cursor.fetchall()
>>    LOGGER.info('Get this answer %s', answer)
>>    if len(answer) == 0:
>>        LOGGER.info('Inserting new user')
>>
>>        db_line = sql_cmd_from_template(
>>            USER_TABLE_INSERT_TEMPLATE,
>>            user.id,
>>            user.name,
>>            user.description,
>>            user.location
>>        )
>>        LOGGER.info('Executing on DB: %s', db_line)
>>        cursor.execute(db_line)
>
>Do you execute the queries in a different schema than the default ?sys? schema (which is recommended for database users)?
>Because ?users? is a special VIEW in the ?sys? schema.
>

Yes, I should be in a different schema. In any case, I changed the name of the table just to be on the safe way.

>>
>> This function is executed inside a single threaded application.
>> Moreover, there are no "DELETE" or similar operations inside this
>> script (it only adds rows to the database).
>> No user (human or software) but this script connects to my MonetDB instance.
>>
>> Unfortunately, after I while that the script is running (let's say one
>> or two days), something unexpected happens. For example, this is the
>> log generated for user 42847029
>> (there are a lot of other lines in between that I removed). As you can
>> see, after a while, my check fails and I don't understand why.
>>
>> 2017/06/13 09:30:01 - INFO - save_user_in_database: Executing on DB:
>> SELECT 1 FROM users WHERE user_id = 42847029
>> 2017/06/13 09:30:01 - INFO - save_user_in_database: Get this answer [(1,)]
>>
>> 2017/06/13 10:00:04 - INFO - save_user_in_database: Executing on DB:
>> SELECT 1 FROM users WHERE user_id = 42847029
>> 2017/06/13 10:00:04 - INFO - save_user_in_database: Get this answer [(1,)]
>>
>> 2017/06/13 10:02:33 - INFO - save_user_in_database: Executing on DB:
>> SELECT 1 FROM users WHERE user_id = 42847029
>> 2017/06/13 10:02:33 - INFO - save_user_in_database: Get this answer [(1,)]
>>
>> 2017/06/13 10:30:36 - INFO - save_user_in_database: Executing on DB:
>> SELECT 1 FROM users WHERE user_id = 42847029
>> 2017/06/13 10:30:36 - INFO - save_user_in_database: Get this answer []
>> 2017/06/13 10:30:36 - INFO - save_user_in_database: Inserting new user
>
>This shouldn?t happen, as is shown by the error below that a user_id of 42847029 still exists; also, as you mentioned, there hasn?t been any delete queries.
>
>This might be a bug in MonetDB.
>
>Is this problem repeatable?  If yes, can you give us a sequence of queries to reproduce this problem.  We?d like to look into it.
>

I will try to reproduce the problem with some scripts as soon as possible. Unfortunately, I am quite busy in this period and therefore this will require some time. In any case, this happens only when the system is busy and
the table is big enough. I think that the key to reproduce the problem is to stress the database with several operations per second. In this situation, I have noticed that the db starts to use a lot of RAM (about 15 GB for a database that occupies 3 GB on the disk). At that point, the db becomes extremely slow (for example, for a query that usually takes less than 10 ms I have to wait for more than 2 seconds) and the unique key problem start to appear.

>
>> 2017/06/13 10:30:36 - INFO - save_user_in_database: Executing on DB:
>> INSERT INTO users(user_id, user_name, user_description, user_location)
>> VALUES (42847029, 'XXXX', 'XXXX', 'XXXX')
>> 2017/06/13 10:30:36 - WARNING - on_data: Error elaborating data
>> Traceback (most recent call last):
>>  File "data_collector.py", line 267, in on_data
>>    return self.elaborate_data(data)
>>  File "data_collector.py", line 354, in elaborate_data
>>    save_user_in_database(user, cursor)
>>  File "data_collector.py", line 234, in save_user_in_database
>>    cursor.execute(db_line)
>>  File "/opt/Python-3.6.0/lib/python3.6/site-packages/pymonetdb-1.0.4-py3.6.egg/pymonetdb/sql/cursors.py",
>> line 164, in execute
>>    block = self.connection.execute(query)
>>  File "/opt/Python-3.6.0/lib/python3.6/site-packages/pymonetdb-1.0.4-py3.6.egg/pymonetdb/sql/connections.py",
>> line 133, in execute
>>    return self.command('s' + query + ';')
>>  File "/opt/Python-3.6.0/lib/python3.6/site-packages/pymonetdb-1.0.4-py3.6.egg/pymonetdb/sql/connections.py",
>> line 138, in command
>>    return self.mapi.cmd(command)
>>  File "/opt/Python-3.6.0/lib/python3.6/site-packages/pymonetdb-1.0.4-py3.6.egg/pymonetdb/mapi.py",
>> line 235, in cmd
>>    raise exception(string)
>> pymonetdb.exceptions.IntegrityError: INSERT INTO: PRIMARY KEY
>> constraint 'users.users_user_id_pkey' violated
>>
>> Have you any advice? Is there a way to perform the same check in just
>> one SQL line? Is this way atomic (if a would ever move to a
>> multithreading script)?
>
>It is not atomic, but you can make it atomic by wrapping both queries into one transaction.
>
>> Am I doing something
>> wrong with my current code? Please, take into account that I do not
>> have a deep experience with other SQL tools so it is perfectly
>> possible that I miss something
>> that for you is extremely obvious.
>
>No, you?re not doing anything wrong.  But there are better alternatives:
>
>1. If this is a standalone action (i.e. not part of a transaction), the check for existing user_id is not necessary.  The simplest and most efficient way is to always do the insert and ignore the ?primary key constraint ? violated? >error (e.g. use a try-catch block).
>
>2. If this insert is going to be part of a (big, long) transaction, then you won?t want the insert to fail.  In that case, you can use a stored procedure:
>
>create procedure insert_new (id int, name varchar(10), description varchar(10), location varchar(10))
>begin
>        declare cnt int;
>        set cnt = (select count(*) from myusers where user_id=id);
>
>        if (cnt = 0 )
>        then
>                insert into myusers values (id, name, description, location);
>        end if;
>end;
>
>Then call the procedure: call insert_new  (42847029, 'XXXX', 'XXXX', 'XXXX?)
>
>I hope this helps.
>

II tried the solution number 2, but unfortunately the problem is still there. I get an Integrity error even calling the procedure that you described before. What is really strange is that, after a while, execute the procedure that you have showed me requires more than 3 seconds on my system. If I restart the monetdb instance, it takes a few milliseconds again. I think that maybe there is something broken in my setup. It could be that I have misconfigured something or I have a problem with my machine.
Unfortunately, taking also into account that I am using Docker (which adds a layer of complexity), there are plenty of things that could not work as expected. Thank you for your help!

>Regards,
>Jennie
>
>>
>> Thank you in advance!
>>
>> Best regards,
>>
>> Stefano Piani