Jan2022 (11.43)

The Jan2022 documentation can be found here.

Jan2022-SP7 Bugfix Release (11.43.27)

MonetDB Common

  • Warnings and informational messages are now sent to stdout instead of stderr, which means that monetdbd will now log them with the tag MSG instead of ERR.

  • Fixed parsing of the BBP.dir files when BAT ids grow larger than 2**24 (i.e. 100000000 in octal).

  • When processing the WAL, if a to-be-destroyed object cannot be found, don’t stop, but keep processing the rest of the WAL.

  • A race condition was fixed where certain write-ahead log messages could get intermingled, resulting in a corrupted WAL file.

  • If opening of a file failed when it was supposed to get memory mapped, an incorrect value was returned to indicate the failure, causing crashes later on. This has been fixed.

  • When saving a bat failed for some reason during a low-level commit, this was logged in the log file, but the error was then subsequently ignored, possibly leading to files that are too short or even missing.

  • The write-ahead log (WAL) is now rotated a bit more efficiently by doing multiple log files in one go (i.e. in one low-level transaction).

  • Fixed a race condition that could lead to a bat being added to the SQL catalog but nog being made persistent, causing a subsequent restart of the system to fail (and crash).

  • Fixed a race condition where a hash could have been created on a bat using the old bat count while in another thread the bat count got updated. This would make the hash be based on too small a size, causing failures later on.

  • When extending a bat failed, the capacity had been updated already and was therefore too large. This could then later cause a crash. This has been fixed by only updating the capacity if the extend succeeded.

  • A bug was fixed when dealing with copy-on-write memory maps. These can occur for some bats used by the write-ahead log code when they grow large enough.

MonetDB5 Server

  • Client connections are cleaned up better so that we get fewer instances of clients that cannot connect.

SQL Frontend

  • When creating a hot snapshot, allow other clients to proceed, even with updating queries.

  • Increased the size of a variable counting the number of changes made to the database (e.g. in case more than 2 billion rows are added to a table).

  • Improved cleanup after failures such as failed memory allocations.

  • An insert into a table from which a column was dropped in a parallel transaction was incorrectly not flagged as a transaction conflict.

  • Added some error checking to prevent crashes. Errors would mainly occur under memory pressure.

  • Fixed cleanup after a failed allocation where the data being cleaned up was unitialized but still used as pointers to memory that also had to be freed.

  • If there was an error in one of the special commands to the server (e.g. setting the reply size for result sets), the server could get into an infinite loop. This has been fixed.

  • Fixed a double cleanup after a failed allocation in COPY INTO. The double cleanup could cause a crash due to a race condition it enabled.

Jan2022-SP6 Bugfix Release (11.43.25)

MonetDB Common

  • Memory leaks have been fixed.

  • Offset heaps (.tailN files) were growing too fast and unnecessarily under certain conditions. This has been fixed. Also, when such too large files are now loaded into the system, it is recognized they are too large and they are truncated to a more reasonable size.

MonetDB5 Server

  • A race condition in the SHA hash code was fixed which resulted in occasional failed connection attempts when they occurred concurrently.

  • Fix a bug where the MAL optimizer would use the starttime of the previous query to determine whether a query timeout occurred.

Merovingian

  • Stop logging references to monetdbd’s logfile in said logfile.

Bug Fixes

  • 7336: Selecting from a literal-value table returns wrong values

Jan2022-SP5 Bugfix Release (11.43.23)

MonetDB Common

  • Fixed a bug in ORDER BY with both NULLS LAST and LIMIT when the ordering was on an interger or integer-like column and NULL values are present.

  • The median_avg and quantile_avg returned bogus results in the non-grouped case (i.e. something like SELECT sys.median_avg(i) FROM t).

SQL Frontend

  • Fixed a bug in COPY BINARY INTO where the input wasn’t checked thoroughly enough.

Merovingian

  • In certain cases (when an mserver5 process exits right after producing a message) the log message was logged over and over again, causing monetdbd to use 100% CPU. This has been fixed.

Jan2022-SP4 Bugfix Release (11.43.21)

Client Package

  • Implemented dump of global grants, that is to say, grants for COPY INTO and COPY FROM which grant permission to users to write to or read from files on the server (COPY INTO queries without the ON CLIENT option).

  • Fixed a bug where when the semicolon at the end of a COPY INTO query that reads from STDIN is at exactly a 10240 byte boundary in a file, the data isn’t read as input for the COPY INTO but instead as a new SQL query.

MonetDB Common

  • A bug was fixed when upgrading a database from the Oct2020 releases (11.39.X) or older when the write-ahead log (WAL) was not empty and contained instructions to create new tables.

  • When destroying a bat, make sure there are no files left over in the BACKUP directory since they can cause problems when the bat id gets reused.

  • Fixed an off-by-one error in the logger which caused older log files to stick around longer in the write-ahead log than necessary.

  • When an empty BAT is committed, skip writing (and synchronizing to disk) the heap (tail and theap) files and write 0 for their sizes to the BBP.dir file. When reading the BBP.dir file, if an empty BAT is encountered, set the sizes of those files to 0. This fixes potential issues during startup of the server (BBPcheckbats reporting errors).

  • Make sure heap files of transient bats get deleted when the bat is destroyed. If the bat was a partial view (sharing the vheap but not the tail), the tail file wasn’t deleted.

  • Various changes were made to satisfy newer compilers.

  • The batDirtydesc and batDirtyflushed Boolean values have been deprecated and are no longer used. They were both holdovers from long ago.

  • Various race conditions (data races) have been fixed.

  • All accesses to the BACKUP directory need to be protected by the same lock. The lock already existed (GDKtmLock), but wasn’t used consistently. This is now fixed. Hopefully this makes the hot snapshot code more reliable.

MonetDB5 Server

  • Various race conditions (data races) have been fixed.

Merovingian

  • When multiple identical messages are written to the log, write the first one, and combine subsequent ones in a single message.

  • Fixed a leak where the log file wasn’t closed when it was reopened after a log rotation (SIGHUP signal).

  • Try to deal more gracefully with “inherited” mserver5 processes. This includes not complaining about an “impossible state”, and allowing such processes to be stopped by the monetdbd process.

  • When a transient failure occurs during processing of a new connection to the monetdbd server, sleep for half a second so that if the transient failure occurs again, the log file doesn’t get swamped with error messages.

Bug Fixes

  • 7040: Memory leak detected for MAPI interface

  • 7298: Irresponsive database server after reading incomplete SQL script.

  • 7308: Race condition in MVCC transaction management

Jan2022-SP3 Bugfix Release (11.43.15)

MonetDB Common

  • All accesses to the BACKUP directory need to be protected by the same lock. The lock already existed (GDKtmLock), but wasn’t used consistently. This is now fixed. Hopefully this makes the hot snapshot code more reliable.

  • When exiting, long running instructions are aborted using the same mechanism that is used for query timeouts.

SQL Frontend

  • GLOBAL TEMPORARY tables are now treated like LOCAL TEMPORARY tables as far as the table content is concerned. The schema information stays global. This fixes an issue with concurrent access and cleanup of stale data.

  • The NO CONSTRAINT option of the COPY INTO query has been removed. It didn’t work and it was never a good idea anyway.

Bug Fixes

  • 7036: Generate column names instead of labels

Jan2022-SP2 Bugfix Release (11.43.13)

Client Package

  • Improved the handling of the \r (internal pager) command in mclient. It now properly counts the header of table, and when a (very) long table is being printed and aborted part way in the built-in pager, not all data is transferred to the client (and then discarded). Instead at most 1000 rows are transferred.

MonetDB Common

  • Improved speed of BATappend to empty varsized bat: we now just copy the heaps instead of inserting individual values.

  • Improved speed of projection (BATproject) on varsized bats by sharing the data heap (vheap).

  • Fixed a race condition which could cause a too large size being written for a .theap file to the BBP.dir file after the correct size file had been saved to disk.

  • We now ignore the size and capacity columns in the BBP.dir file. These values are essential during run time, but not useful in the on-disk image of the database.

  • Fixed a bug in the append code for msk (bit mask) bats.

  • Conversions from floating point types to integral types that involve multiplication now use the “long double” as intermediate type, thereby loosing as few significant bits as is feasible.

  • Found and fixed another source for the now infamous BBPcheckbats error that sometimes occurs at startup of the server.

MonetDB5 Server

  • Improved parsing speed of blob values, especially on Windows. On Windows, using the locale-aware functions isdigit and isxdigit is comparatively very slow, so we avoid them.

Bug Fixes

  • 7252: Segmentation fault on second run

  • 7253: Extremely slow INSERT INTO <table> SELECT

  • 7254: Commit with deletions is very slow

  • 7263: PRIMARY KEY constraint is not persistent through server restarts

  • 7267: Update after delete does not update some rows

  • 7278: BUG when there is more than one field/filter in the having clause

Jan2022-SP1 Bugfix Release (11.43.9)

SQL Frontend

  • [This feature was already released in Jan2022 (11.43), but the ChangeLog was missing] Added SQL procedures sys.vacuum(sname string, tname string, cname string), sys.vacuum(sname string, tname string, cname string, interval int), sys.stop_vacuum(sname string, tname string, cname string). These can be used to vacuum string columns.

Bug Fixes

  • 7228: COMMIT: transaction is aborted because of concurrency conflicts, will ROLLBACK instead

  • 7230: Prepared statement of INSERT with SELECT fails when types difer

  • 7232: False conflicts when inserting in a not null field

  • 7237: SELECT with concurrent writes rarely returns corrupt data

  • 7238: query with system function: “index”(varchar, boolean) fails with GDK error or assertion failure.

  • 7241: Replacing a view by a query on the view itself crashes the server.

Jan2022 Feature Release (11.43.5)

  • A couple of concurrency issues have been fixed.

Windows Packaging

  • We now build Windows binaries using Visual Studio 2022.

Client Package

  • A new output formatting mode was added to mclient. Use -fcsv-noquote to produce a CSV (comma-separated values) output where the quote characters have not been escapes. This can be useful when producing a single column string output that should be saved as is, e.g. when using the sys.dump_database() function.

MonetDB Common

  • Implement string imprints (strimps for short) a pre-filter structure for strings in order to accelerate LIKE queries. If a strimp exists for a specific string column the strings are pre-filtered, rejecting strings that cannot possibly match, before the more expensive and accurate matching algorithms run.

  • On Windows, files and directories we create now get the attribute FILE_ATTIBUTE_NOT_CONTENT_INDEXED, meaning that they should not be indexed by indexing or search services.

  • Many (most) low level functions that could take a long time (such as BATjoin) can now be aborted with a timeout. When the function takes too long, the function will fail, and hence the whole SQL query will fail.

  • At some point in the past, string heaps were created where the hash value of the string was stored in the heap before the string. This hasn’t been used in a long time. Now the code that could still read those old heaps has been removed. Bats that used the old format are converted automatically.

  • Some small interface changes to the atom functions: the atomPut function now returns (var_t) -1 on error instead of 0; the atomHeap function now returns success or failure as a gdk_return value.

General

  • Reliance on the OpenSSL library has been removed. OpenSSL was used for the hash algorithms it contained (e.g. SHA512 and RIPEMD160) and for producing random numbers. The hash functions have been replaced by the original published functions, and random numbers are generated using system-specific random sources (i.e. not simply pseudo-random number generators).

MonetDB5 Server

  • The storage cleanup in the 11.41.5 (Jul2021) release made the OLTP optimizer pipeline obsolete, thus it was removed.

SQL Frontend

  • Add string imprints to the existing imprints index creation syntax. On string column “col” of a table “tbl” marked read only (“ALTER TABLE tbl SET READ ONLY”) the user can create a string imprint using the syntax: “CREATE IMPRINTS INDEX index_name ON tbl(col);”.

  • With the storage cleanup in the 11.41.5 (Jul2021) release, the ANALYZE statement was updated to accomodate those changes. The SAMPLE parameter is now ignored because ANALYZE generated statistics used by relational operators, are required to be precise.

  • In order to mitigate the I/O required to update the ‘statistics’ table, this table is no longer persisted. Alternately, it was changed into a computed view every time when queried. The ‘stamp’ and ‘sample’ fields were removed for the aforementioned reasons. The ‘schema’, ’table’ and ‘column’ fields were added for convenience.

  • In previous versions there was no check that the INCREMENT BY value of a SEQUENCE was not zero. During the automatic upgrade of a database, INCREMENT BY values that are zero are set to one.

  • The method to compute the ‘side_effect’ effect property was changed for SQL functions defined in the backend engine (eg. CREATE FUNCTION ... EXTERNAL NAME "module"."function"). It was changed from being computed by the SQL layer to the backend engine itself. As a consequence, the computed ‘side_effect’ value may be different, thus bringing incompatibilities. After an upgrade, if a ‘side_effect’ incompatibility arises, either the ‘side_effect’ value in the backend should be changed or the function should be re-created in SQL.

  • Removed deprecated system view sys.systemfunctions. It was marked as deprecated from release Apr2019 (11.33.3). Use query: select id as function_id from sys.functions where system; to get the same data as the old view.

  • Extended SQL system catalog with lookup table sys.fkey_actions and view sys.fkeys to provide user friendly querying of existing foreign keys and their ON UPDATE and ON DELETE referential action specifications.

  • Many improvements were done for REMOTE table plans. As a consequence, master or slave servers from this feature release are not compatible with older releases.

  • The view sys.ids has been changed to give more information about the objects in the system. In particular, there is an extra column added at the end that indicates whether the object is a system object.

  • The example modules opt_sql_append and udf are no longer loaded by default and no longer part of the binary release. If installed, they can be loaded using the –loadmodule option.

  • The built-in SQL functions to produce a dump that were added as a proof-of-concept in the previous release have been improved and are now usable. Use the query SELECT stmt FROM sys.dump_database(FALSE) ORDER BY o to produce a dump. The dump code built into mclient and msqldump is probably still more efficient, though.

  • The sys.epoch function has always been confusing. There are two versions, one with an INTEGER argument, and one with a BIGINT argument. The former accepted values as seconds, whereas the latter expected milliseconds. Also, the construct EXTRACT(EPOCH FROM value) returns a BIGINT with millisecond precision. This has now been overhauled. There is no longer a function sys.epoch with BIGINT argument, but instead there is a new function sys.epoch with DECIMAL(18,3) argument. The argument is seconds, but with 3 decimals, it provides millisecond accuracy. Also the EXTRACT(EPOCH FROM value) now returns a DECIMAL(18,3), again seconds with 3 decimals giving millisecond accuracy. Note that the internal, binary representation of DECIMAL(18,3) interpreted as seconds with 3 decimals and BIGINT with millisecond precision is exactly the same.

Merovingian

  • Disabled logging into merovingian.log of next info message types:

    • proxying client <host>:<port> for database ‘<dbname>’ to <url>
    • target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying These messages were written to the log file at each connection. In most cases this information is not used. The disabling reduces the log file size.
  • Removed the deprecated monetdb commands profilerstart and profilerstop.

Bug Fixes

  • 7168: Loosing the documentation

  • 7180: GROUP BY-subquery crashes MonetDb

  • 7182: Queries against sys.querylog_catalog, sys.querylog_calls or sys.querylog_history fail after restore of a db created using call sys.hot_snapshot(R’\path\file.tar’);

  • 7201: Selection of a subquery with a LEFT JOIN returns the wrong result set

  • 7202: DISTINCT does not work when sorting by additional columns

  • 7215: ODBC Driver SQLStatistics returns duplicate rows/rows for other tables