Hi,

I would appreciate some help interpreting the following memory-related issues.

I've got an mserver5 instance running under the following cgroups v1 constraints:

- memory.limit_in_bytes = 17179869184 (16g)
- memory.memsw.limit_in_bytes = 9223372036854771712

gdk_mem_maxsize is initialised as 0.815 * 17179869184 = 14001593384.
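
As a sanity check, the arithmetic above can be reproduced with a short Python sketch. It assumes the 0.815 factor is applied directly to memory.limit_in_bytes and the result is truncated to an integer. That matches the numbers in this post, but it is not necessarily the exact code path inside MonetDB. The names below are illustrative only.

```python
from fractions import Fraction

# Value of memory.limit_in_bytes quoted in this post (16 GiB).
CGROUP_LIMIT_BYTES = 17179869184

# 0.815 kept as an exact fraction to avoid floating-point rounding.
# Assumption: this factor and the truncation mirror what the server does.
FACTOR = Fraction(815, 1000)


def expected_mem_maxsize(limit_bytes: int) -> int:
    """Return the limit scaled by 0.815, truncated to whole bytes."""
    return int(limit_bytes * FACTOR)


mem_maxsize = expected_mem_maxsize(CGROUP_LIMIT_BYTES)
headroom = CGROUP_LIMIT_BYTES - mem_maxsize

print(mem_maxsize)  # 14001593384, the value reported by env()
print(headroom)     # 3178275800 bytes between gdk_mem_maxsize and the cgroup limit
```

So there are roughly 3 GB between gdk_mem_maxsize and the point at which the cgroup OOM killer steps in.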

So I get:
sql>select * from env() where name in ('gdk_mem_maxsize', 'gdk_vm_maxsize');
+-----------------+---------------+
| name            | value         |
+=================+===============+
| gdk_vm_maxsize  | 4398046511104 |
| gdk_mem_maxsize | 14001593384   |
+-----------------+---------------+

That looks good.


To my surprise, this instance frequently gets OOM-killed for reaching 16g of RSS (no swap used):

memory: usage 16777216kB, limit 16777216kB, failcnt 244063804
memory+swap: usage 16777964kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup out of memory: Kill process 975803 (mserver5) score 1003 or sacrifice child
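
To relate the kernel report to the cgroup settings, the kB figures can be converted to bytes. This is only a small sketch using the numbers copied from the log lines above; the variable names are made up for illustration.

```python
KIB = 1024  # the kernel reports these counters in units of 1024 bytes

# Figures copied from the OOM report above.
usage_kb = 16777216                # "memory: usage 16777216kB"
limit_kb = 16777216                # "memory: ... limit 16777216kB"
memsw_usage_kb = 16777964          # "memory+swap: usage 16777964kB"
memsw_limit_kb = 9007199254740988  # "memory+swap: ... limit 9007199254740988kB"

# Values configured on the cgroup, as listed at the top of this post.
cgroup_limit_bytes = 17179869184
cgroup_memsw_limit_bytes = 9223372036854771712

usage_bytes = usage_kb * KIB
swap_delta_kb = memsw_usage_kb - usage_kb

# Usage sits exactly at memory.limit_in_bytes when the kill happens.
print(usage_bytes == cgroup_limit_bytes)
# The memory+swap limit in the log is the configured memsw limit.
print(memsw_limit_kb * KIB == cgroup_memsw_limit_bytes)
# Difference between memory+swap usage and memory usage, in kB.
print(swap_delta_kb)
```

Both comparisons hold, and the memory+swap counter is only 748 kB above the memory counter, so swap plays essentially no role here.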

Now, there are two different aspects: giving a process a memory cap and making the process respect that cap without getting killed.

- if the process allocates more than the cgroups limit allows, it gets killed. That is fine and doesn't surprise me
- the question is: why did MonetDB exceed the 16g limit?

This is even more surprising given that it "prudently" initialises gdk_mem_maxsize at roughly 80% (0.815) of the available memory.

Perhaps I was under the wrong assumption that MonetDB would never allocate more than gdk_mem_maxsize. I now suspect that it simply uses this value to optimise its memory management (e.g. to decide how early to mmap).

So, am I correct that setting gdk_mem_maxsize (indirectly via cgroups or directly via the memmaxsize parameter) does not guarantee that RSS will stay under that value?

If that is true, I am back at square one in my quest for how to cap RSS usage (without getting the process killed).
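
One way to check this empirically would be to sample the cgroup v1 counters while the server is running and compare them with gdk_mem_maxsize. The sketch below is only an illustration: it reads memory.usage_in_bytes and memory.limit_in_bytes from a cgroup directory passed in by the caller, and the function names and the returned dictionary are hypothetical.

```python
from pathlib import Path


def read_counter(cgroup_dir: Path, name: str) -> int:
    """Read a single integer counter file from a cgroup v1 memory directory."""
    return int((cgroup_dir / name).read_text().strip())


def memory_snapshot(cgroup_dir, gdk_mem_maxsize: int) -> dict:
    """Compare current cgroup memory usage with the limit and gdk_mem_maxsize.

    cgroup_dir is the directory of the memory cgroup the server runs in;
    gdk_mem_maxsize is the value reported by env(). Both are supplied by
    the caller, nothing is discovered automatically.
    """
    cgroup_dir = Path(cgroup_dir)
    usage = read_counter(cgroup_dir, "memory.usage_in_bytes")
    limit = read_counter(cgroup_dir, "memory.limit_in_bytes")
    return {
        "usage_bytes": usage,
        "limit_bytes": limit,
        "headroom_to_limit": limit - usage,
        "over_gdk_mem_maxsize": usage > gdk_mem_maxsize,
    }
```

Calling memory_snapshot() periodically with the cgroup's directory and 14001593384 would show whether usage climbs past gdk_mem_maxsize well before the kill, or jumps there suddenly.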

Thanks for your help.
Roberto