Jul2021 (11.41)

The Jul2021 documentation can be found here.

Jul2021-SP10 Bugfix Release (11.41.33)

MonetDB Common

  • Fixed parsing of the BBP.dir files when BAT ids grow larger than 2**24 (i.e. 100000000 in octal).
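
The threshold quoted above is easy to verify: 2**24 equals 8**8, so its octal form is a 1 followed by eight zeros, one digit longer than any smaller id. The sketch below only checks that arithmetic. The helper name is illustrative and is not part of MonetDB.

```python
def to_octal(bat_id: int) -> str:
    """Render an integer id in octal without a prefix (illustrative helper)."""
    return format(bat_id, "o")


threshold = 2 ** 24
largest_below = threshold - 1

print(to_octal(largest_below))  # 77777777 (eight octal digits)
print(to_octal(threshold))      # 100000000 (nine octal digits)
```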

MonetDB5 Server

  • A bug was fixed where data from a client context was freed after the context was closed. This meant that the data being freed could belong to the next user of the context (a next client that just connected), leading to chaos (i.e. crashes).

SQL Frontend

  • When creating a hot snapshot, allow other clients to proceed, even with updating queries.

Jul2021-SP9 Bugfix Release (11.41.31)

MonetDB Common

  • When processing the WAL, if a to-be-destroyed object cannot be found, don’t stop, but keep processing the rest of the WAL.

  • A race condition was fixed where certain write-ahead log messages could get intermingled, resulting in a corrupted WAL file.

  • If opening of a file failed when it was supposed to get memory mapped, an incorrect value was returned to indicate the failure, causing crashes later on. This has been fixed.

  • When saving a bat failed for some reason during a low-level commit, this was logged in the log file, but the error was then subsequently ignored, possibly leading to files that are too short or even missing.

  • The write-ahead log (WAL) is now rotated a bit more efficiently by doing multiple log files in one go (i.e. in one low-level transaction).

  • Fixed a race condition that could lead to a bat being added to the SQL catalog but not being made persistent, causing a subsequent restart of the system to fail (and crash).

  • Fixed a race condition where a hash could have been created on a bat using the old bat count while in another thread the bat count got updated. This would make the hash be based on too small a size, causing failures later on.

  • When extending a bat failed, the capacity had been updated already and was therefore too large. This could then later cause a crash. This has been fixed by only updating the capacity if the extend succeeded.

  • A bug was fixed when dealing with copy-on-write memory maps. These can occur for some bats used by the write-ahead log code when they grow large enough.

MonetDB5 Server

  • Client connections are cleaned up better so that we get fewer instances of clients that cannot connect.

  • Fix a bug where the MAL optimizer would use the starttime of the previous query to determine whether a query timeout occurred.

SQL Frontend

  • Increased the size of a variable counting the number of changes made to the database (e.g. in case more than 2 billion rows are added to a table).

  • Improved cleanup after failures such as failed memory allocations.

  • An insert into a table from which a column was dropped in a parallel transaction was incorrectly not flagged as a transaction conflict.

  • Added some error checking to prevent crashes. Errors would mainly occur under memory pressure.

  • Fixed cleanup after a failed allocation where the data being cleaned up was uninitialized but still used as pointers to memory that also had to be freed.

  • A bug was fixed when optimizing combining of range select subexpressions.

  • If there was an error in one of the special commands to the server (e.g. setting the reply size for result sets), the server could get into an infinite loop. This has been fixed.

  • Fixed a double cleanup after a failed allocation in COPY INTO. The double cleanup could cause a crash due to a race condition it enabled.

Merovingian

  • Stop logging references to monetdbd’s logfile in said logfile.

Jul2021-SP8 Bugfix Release (11.41.27)

MonetDB Common

  • A bug was fixed when upgrading a database from the Oct2020 releases (11.39.X) or older when the write-ahead log (WAL) was not empty and contained instructions to create new tables.

  • Avoid logging of failure to backup files that didn’t need to be backed up in the first place.

  • Avoid an attempt to access a file when the database is in memory.

SQL Frontend

  • Fixed a busy loop in the code that applies the write-ahead log when there are log files that cannot yet be cleaned due to active transactions. This loop can become nasty when mserver5 is exiting.

Merovingian

  • In certain cases (when an mserver5 process exits right after producing a message) the log message was logged over and over again, causing monetdbd to use 100% CPU. This has been fixed.

Jul2021-SP7 Bugfix Release (11.41.25)

MonetDB Common

  • When destroying a bat, make sure there are no files left over in the BACKUP directory since they can cause problems when the bat id gets reused.

  • Fixed an off-by-one error in the logger which caused older log files to stick around longer in the write-ahead log than necessary.

  • When an empty BAT is committed, skip writing (and synchronizing to disk) the heap (tail and theap) files and write 0 for their sizes to the BBP.dir file. When reading the BBP.dir file, if an empty BAT is encountered, set the sizes of those files to 0. This fixes potential issues during startup of the server (BBPcheckbats reporting errors).

  • Make sure heap files of transient bats get deleted when the bat is destroyed. If the bat was a partial view (sharing the vheap but not the tail), the tail file wasn’t deleted.

  • Various changes were made to satisfy newer compilers.

  • The batDirtydesc and batDirtyflushed Boolean values have been deprecated and are no longer used. They were both holdovers from long ago.

  • Various race conditions (data races) have been fixed.

  • All accesses to the BACKUP directory need to be protected by the same lock. The lock already existed (GDKtmLock), but wasn’t used consistently. This is now fixed. Hopefully this makes the hot snapshot code more reliable.

MonetDB5 Server

  • Various race conditions (data races) have been fixed.

Merovingian

  • When multiple identical messages are written to the log, write the first one, and combine subsequent ones in a single message.

  • Fixed a leak where the log file wasn’t closed when it was reopened after a log rotation (SIGHUP signal).

  • Try to deal more gracefully with “inherited” mserver5 processes. This includes not complaining about an “impossible state”, and allowing such processes to be stopped by the monetdbd process.

  • When a transient failure occurs during processing of a new connection to the monetdbd server, sleep for half a second so that if the transient failure occurs again, the log file doesn’t get swamped with error messages.
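
The combining of identical log messages described in this list can be pictured with a small sketch. This is only an illustration of the idea, not monetdbd's implementation: the function name and the wording of the summary line ("message repeated N times") are assumptions made for the example.

```python
from typing import Iterable, Iterator


def collapse_repeats(messages: Iterable[str]) -> Iterator[str]:
    """Yield each message once, then one summary line per run of repeats."""
    previous = None
    repeats = 0
    for message in messages:
        if message == previous:
            repeats += 1  # identical to the last message: count it, do not write it
            continue
        if repeats:
            # Hypothetical summary wording; the real log text may differ.
            yield f"message repeated {repeats} times"
        yield message
        previous = message
        repeats = 0
    if repeats:
        yield f"message repeated {repeats} times"


log = ["connect failed", "connect failed", "connect failed", "started"]
for line in collapse_repeats(log):
    print(line)
```

The first occurrence is always written immediately, so nothing is lost from the log; only the subsequent duplicates are folded into a single line.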

Bug Fixes

Jul2021-SP6 Bugfix Release (11.41.23)

Bug Fixes

Jul2021-SP5 Bugfix Release (11.41.21)

MonetDB Common

  • Fixed a race condition that could cause too large a size to be written to the BBP.dir file for a .theap file after a file of the correct size had been saved to disk.

  • We now ignore the size and capacity columns in the BBP.dir file. These values are essential during run time, but not useful in the on-disk image of the database.

Merovingian

  • Disabled logging into merovingian.log of the following info message types: “proxying client <host>:<port> for database ‘<dbname>’ to <url>” and “target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying”. These messages were written to the log file at each connection. In most cases this information is not used. Disabling them reduces the log file size.

Bug Fixes

Jul2021-SP4 Bugfix Release (11.41.19)

Bug Fixes

  • 7267: Update after delete does not update some rows

Jul2021-SP3 Bugfix Release (11.41.15)

MonetDB Common

  • Fixed race condition during backup of BATs.

  • Fixed append to BATs of type msk (bit mask).

  • Fix to WAL logger when a BAT gets replaced within a transaction.

SQL Frontend

  • Add number of rows affected by output statements into the total rowcount.

  • Fix to MAL code generation.

Bug Fixes

  • 7225: Invalid memory access when extending a BAT during appends

  • 7228: COMMIT: transaction is aborted because of concurrency conflicts, will ROLLBACK instead

Jul2021-SP2 Bugfix Release (11.41.13)

Client Package

  • Dumping the database now also dumps the read-only and insert-only states of tables.

MonetDB Common

  • Sometimes when the server was restarted, it wouldn’t start anymore due to an error from BBPcheckbats. We finally found and fixed a (hopefully “the”) cause of this problem.

SQL Frontend

  • Number parsing for SQL was fixed. If a number was immediately followed by letters (i.e. without a space), the number was accepted and the alphanumeric string starting with the letter was interpreted as an alias (if aliases were allowed in that position).

Bug Fixes

  • 7163: Multiple sql.mvc() invocations in the same query
  • 7167: sys.shutdown() problems
  • 7184: Insert into query blocks all other queries
  • 7185: GROUPING SETS on groups with aliases provided in the SELECT returns empty result
  • 7186: data files created with COPY SELECT .. INTO ‘file.csv’ fail to be loaded using COPY INTO .. FROM ‘file.csv’ when double quoted string data contains the field values delimiter character
  • 7191: [MonetDBe] monetdbe_cleanup_statement() with bound NULLs on variable-sized types bug
  • 7196: BATproject2: does not match always
  • 7198: Suboptimal query plan for query containing JSON access filter and two negative string comparisons
  • 7200: PRIMARY KEY unique constraint is violated with concurrent inserts
  • 7206: Python UDF fails when returning an empty table as a dictionary

Jul2021-SP1 Bugfix Release (11.41.11)

MonetDB Common

  • Some deadlock and race condition issues were fixed.
  • Handling of the list of free bats has been improved, leading to less thread contention.
  • A problem was fixed where the server wouldn’t start with a message from BBPcheckbats about files being too small. The issue was not that the file was too small, but that BBPcheckbats was looking at the wrong file.
  • An issue was fixed where a “short read” error was produced when memory was getting tight.
  • When appending to a string bat, we made an optimization where the string heap was sometimes copied completely to avoid having to insert strings individually. This copying was still done too eagerly, so now the string heap is copied less frequently. In particular, when appending to an empty bat, the string heap is now not always copied whole.

SQL Frontend

  • If the server has been idle for a while with no active clients, the write-ahead log is now rotated.
  • A problem was fixed where files belonging to bats that had been deleted internally were not cleaned up, leading to a growing database (dbfarm) directory.
  • A leak was fixed where extra bats were created but never cleaned up, each taking up several kilobytes of memory.
  • [This feature was already released in Jul2021 (11.41.5), but the ChangeLog was missing] Grant indirect privileges. With “GRANT SELECT ON <my_view> TO <another_user>” and “GRANT EXECUTE ON FUNCTION <my_func> TO <another_user>”, one can grant access to “my_view” and “my_func” to another user who does not have access to the underlying database objects (e.g. tables, views) used in “my_view” and “my_func”. The grantee will only be able to access data revealed by “my_view” or conduct operations provided by “my_func”.
  • Improved error reporting in COPY INTO by giving the line number (starting with one) for the row in which an error was found. In particular, the sys.rejects() table now lists the line number of the CSV file on which the record started in which an error was found.

Bug Fixes

  • 7140: SQL Query Plan Non Optimal with View
  • 7165: ‘JOINIDX: missing ‘.’’ when running distributed join query on merged remote tables
  • 7172: Unexpected query result with merge tables
  • 7173: If truncate is in transaction then after restart of MonetDB the table is empty
  • 7178: Remote Table Throws Error - createExceptionInternal: !ERROR: SQLException:RAstatement2:42000!The number of projections don’t match between the generated plan and the expected one: 1 != 1200

Jul2021 Feature Release (11.41.5)

Client Package

  • The MonetDB stethoscope has been removed. There is now a separate package available with PIP (monetdb_stethoscope) or an RPM or DEB package (stethoscope) from the monetdb.org repository.

Mapi Library

  • Add optional MAPI header field which can be used to immediately set reply size, autocommit, time zone and some other options; see mapi.h. This makes client connection setup faster. Support has been added to mapilib, pymonetdb and the JDBC driver.

ODBC Driver

  • A typo that made the SQLSpecialColumns function unusable was fixed.

MonetDB Common

  • A bug in the grouping code has been fixed.
  • Hash indexes are no longer maintained at all cost: if the number of distinct values is too small compared to the total number of values, the index is dropped instead of being maintained during updates.
  • A new type, called msk, was introduced. This is a bit mask type. In a bat with type msk, each row occupies a single bit, so 8 rows are stored in a single byte. There is no NULL value for this type.
  • The function of the BAT iterator (type BATiter, function bat_iterator) has been expanded. The iterator now contains more information about the BAT, and it contains a pointer to the heaps (theap and tvheap) that are stable, at least in the sense that they will remain available even when parallel threads update the BAT and cause those heaps to grow (and therefore possibly move in memory). A call to bat_iterator must now be accompanied by a call to bat_iterator_end.
  • Implemented function BUNreplacemultiincr to replace multiple values in a BAT in one go, starting at a given position.
  • Implemented new function BUNreplacemulti to replace multiple values in a BAT in one go, at the given positions.
  • Removed function BUNinplace. Use BUNreplace instead, and check whether the BAT argument is of type TYPE_void before calling if you don’t want to materialize.
  • Implemented a function BUNappendmulti which appends an array of values to a BAT. It is a generalization of the function BUNappend.
  • Changed the interface of the atom read function. It now requires an extra pointer to a size_t value that gives the current size of the destination buffer, and when that buffer is too small, it receives the size of the reallocated buffer that is large enough. In any case, and as before, the return value is a pointer to the destination buffer.
  • Environment variables (sys.env()) must be UTF-8, but since they can contain file names which may not be UTF-8, there is now a mechanism to store the original values outside of sys.env() and store %-escaped (similar to URL escaping) values in the environment. The key must still be UTF-8.
  • We now save the location of the min and max values when known.
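
To make the storage claim for the msk type in the list above concrete (one bit per row, so 8 rows per byte), here is a minimal sketch of packing Boolean rows into bytes. The function names and the bit order (row 0 in the least significant bit) are assumptions made for the illustration; the notes do not specify MonetDB's internal layout.

```python
def pack_mask(rows):
    """Pack booleans into bytes, 8 rows per byte.

    Assumption: row i lives in bit (i % 8) of byte (i // 8),
    with bit 0 being the least significant bit.
    """
    out = bytearray((len(rows) + 7) // 8)  # round up to whole bytes
    for i, value in enumerate(rows):
        if value:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def get_row(packed, i):
    """Read row i back from the packed representation."""
    return bool((packed[i // 8] >> (i % 8)) & 1)


rows = [True, False, True, True, False, False, False, False, True]
packed = pack_mask(rows)
print(len(packed))         # 2 bytes hold the 9 rows
print(get_row(packed, 8))  # True
```

Because every bit encodes either set or unset, there is no spare pattern left to represent NULL, which matches the note that the type has no NULL value.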

MonetDB5 Server

  • When using the --in-memory option, mserver5 will run completely in memory, i.e. not create a database on disk. The server can still be connected to using the name of the in-memory database. This name is “in-memory”.
  • By using the option “--dbextra=in-memory”, mserver5 can be instructed to keep transient BATs completely in memory.

SQL Frontend

  • The system view sys.ids has been updated to include some more system IDs.
  • The sys.storage() function now only returns metadata, i.e. data that can be calculated without access to the column contents.
  • Since STREAM tables support is removed, left over STREAM tables are dropped from the catalog.
  • Fix a warning emitted by some implementations of the tar(1) command when unpacking hot snapshot files.
  • Support reading the concatenation of compressed files as a single compressed file.
  • COPY BINARY overhaul. Allow control over binary endianness using COPY [ (BIG | LITTLE | NATIVE) ENDIAN] BINARY syntax. Defaults to NATIVE. Strings are now \0 terminated rather than \n. Support for BOOL, TINYINT, SMALLINT, INT, LARGEINT, HUGEINT, with their respective “INTMIN” values as the NULL representation; 32 and 64 bit FLOAT/REAL, with NaN as the NULL representation; VARCHAR/TEXT, JSON and URL with \x80 as the NULL representation; UUID as fixed width 16 byte binary values, with (by default) all zeroes as the NULL representation; temporal type structs as defined in copybinary.h with any invalid value as the NULL representation.
  • In the Jul2021 release the storage and transaction layers have undergone major changes. The goal of these changes is robust performance under inserts, updates and deletes, and lower transaction startup costs, allowing faster (small) queries. Where the old transaction layer duplicated a lot of data structures during startup, the new layer shares the same tree. Isolation of objects is guaranteed using object timestamps. On the storage side, the timestamps also indicate to a transaction whether a row is visible (deleted or valid). The changes also slightly alter the perceived transactional behavior. The new implementation uses structures shared among all transactions, which do not allow multiple changes of the same object. We therefore follow the principle that the first writer wins: if a transaction creates a table with name ’table_name’ and another transaction concurrently does the same, the later of the two will fail with a concurrency conflict error message (even if the first writer never commits). We expect most users not to notice this change, as such schema changes aren’t usually done concurrently.
  • There is now a function sys.current_sessionid() to return the session ID of the current session. This ID corresponds with the sessionid in the sys.queue() result.
  • Merge statements could not produce correct results on complex join conditions, so a renovation was made. As a consequence, subqueries now have to be disabled on merge join conditions.
  • Preserve in-query comments.
  • Use of CTEs inside UPDATE and DELETE statements is now more restricted. Previously they could be used without any extra specification in the query (e.g. with “v1”(“c1”) as (…) delete from “t” where “t”.“c1” = “v1”.“c1”); however, this did not conform to the SQL standard. To use them, they must be specified in the FROM clause in UPDATE statements or inside a subquery.
  • Added a ‘schema path’ property to users, specifying a list of schemas to be searched to find SQL objects such as tables and functions. The scoping rules have been updated to support this feature, and SQL objects are now looked up in the following order: 1. On occasions with multiple tables (e.g. add foreign key constraint, add table to a merge table), the child will be searched on the parent’s schema. 2. For tables only, declared tables on the stack. 3. ’tmp’ schema if not listed on the ‘schema path’. 4. Session’s current schema. 5. Each schema from the ‘schema path’ in order. 6. ‘sys’ schema if not listed on the ‘schema path’. Whenever the full path is specified, i.e. “schema”.“object”, only the explicit schema is searched.
  • To update the schema path, the ALTER USER x SCHEMA PATH y; statement was added. The [SCHEMA PATH string] syntax was added to the CREATE USER statement. The schema path must be a single string where each schema must be between double quotes and separated with a single comma, e.g. ‘“sch1”,“sch2”’. For every created user, if the schema path is not given, ‘“sys”’ will be the default schema path.
  • Changes to the schema path are not applied to currently connected users; they have to reconnect to see the change. Nonexistent schemas on the path are ignored.
  • Leftover STREAM table definitions from the Datacell extension were removed from the parser. They no longer had any effect.
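
As an illustration of the COPY BINARY integer encoding described in the list above, the sketch below writes a column of 32-bit INT values with a chosen byte order and uses the type's minimum value as the NULL representation, as the note states. The function name is hypothetical, and the layout (fixed-width values stored back to back with no header) is an assumption made for the example, not something these notes confirm.

```python
import struct

INT_MIN = -2 ** 31  # minimum 32-bit value, used as the NULL representation

# struct byte-order prefixes for the three ENDIAN choices
BYTE_ORDER = {"little": "<", "big": ">", "native": "="}


def encode_int_column(values, endian="native"):
    """Encode 32-bit INT values for a binary column; None becomes INT_MIN.

    Assumption: values are stored back to back with no header.
    """
    fmt = BYTE_ORDER[endian] + "i"
    chunks = []
    for value in values:
        if value == INT_MIN:
            # INT_MIN is reserved for NULL, so it cannot be stored as data.
            raise ValueError("INT_MIN is reserved as the NULL representation")
        chunks.append(struct.pack(fmt, INT_MIN if value is None else value))
    return b"".join(chunks)


print(encode_int_column([1, None], "little").hex())  # 0100000000000080
print(encode_int_column([1, None], "big").hex())     # 0000000180000000
```

The byte order chosen when writing the data has to match the ENDIAN clause given in the COPY statement; with no clause, NATIVE is assumed, so data produced on a machine with a different byte order needs an explicit BIG or LITTLE.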

Merovingian

  • Deprecate the ‘profilerstart’ and ‘profilerstop’ commands. Since stethoscope is a separate project (https://github.com/MonetDBSolutions/monetdb-pystethoscope), its installation directory is no longer standard. The ‘profilerstart’ and ‘profilerstop’ commands assume that the stethoscope executable is in the same directory as ‘mserver5’. This is no longer necessarily true, since stethoscope can now be installed in a Python virtual environment. The commands still work if stethoscope is installed using the official MonetDB installers, or if a symbolic link is created in the directory where ‘mserver5’ is located.
  • The exittimeout value can now be set to a negative value (e.g. -1) to indicate that when stopping the dbfarm (using monetdbd stop dbfarm), any mserver5 processes are to be sent a termination signal and then waited for until they terminate. In addition, if exittimeout is greater than zero, the mserver5 processes are sent a SIGKILL signal after the specified timeout and the managing monetdbd is sent a SIGKILL signal after another five seconds (if it didn’t terminate already). The old situation was that the managing monetdbd process was sent a SIGKILL after 30 seconds, and the mserver5 processes that hadn’t terminated yet would be allowed to continue their termination sequence.

Bug Fixes

  • 2030: Temporary table is semi-persistent when transaction fails
  • 7031: I cannot start MoentDb, because the installation path has Chinese.
  • 7055: Table count returning function used inside other function gives wrong results.
  • 7075: Inconsistent Results using CTEs in Large Queries
  • 7079: WITH table AS… UPDATE ignores the WHERE conditions on table
  • 7081: Attempt to allocate too much space in UPDATE query
  • 7093: ‘current_schema’ not in sys.keywords
  • 7096: DEBUG SQL statement broken
  • 7115: Jul2021: ParseException while upgrading Oct2020 database
  • 7116: Jul2021: Cannot create filter functions
  • 7125: MonetDB Round Function issues in the latest release
  • 7126: The “lower” and “upper” functions doesn’t work for Cyrillic alphabet
  • 7127: Bug report: “write error on stream” that results in mclient crash
  • 7128: Bug report: strange error message “Subquery result missing”
  • 7129: Bug report: TypeException:user.main[19]:‘batcalc.between’ undefined
  • 7130: Bug report: TypeException:user.main[396]:‘algebra.join’ undefined
  • 7131: Bug report: TypeException:user.main[273]:‘bat.append’ undefined
  • 7133: WITH ( SELECT x ) DELETE FROM … deletes wrong tuples
  • 7136: MERGE statement is deleting rows if the column is set as NOT NULL even though it should not
  • 7137: Segmentation fault while loading data
  • 7138: Monetdb Python UDF crashes because of null aggr_group_arr
  • 7141: COUNT(DISTINCT col) does not calculate correctly distinct values
  • 7142: Aggregates returning tables should not be allowed
  • 7144: Type up-casting (INT to BIGINT) doesn’t always happen automatically
  • 7146: Query produces this error: !ERROR: Could not find %102.%102
  • 7147: Internal error occurs and is not shown on the screen
  • 7148: Select distinct is not working correctly
  • 7151: Insertion is too slow
  • 7153: System UDFs lose their indentation - Python functions broken
  • 7158: Python aggregate UDF returns garbage when run on empty table
  • 7161: fix priority