Thanks again for these details, and for pointing me to the GDK estimates to look at.

To me, what matters here is not so much understanding why swap wasn't used, but the fact that no swap being used actually simplifies things.

> This 16g is a combination of what was allocated (and used) through
> malloc and through mmap.  Forget about memory being a malloc thing.
> Mmap also uses memory.

The kernel seems to tell me that in this case the 16g used are physical RAM. Not swapped, and not file-backed mmap either (anon-rss is almost the whole 16g, file-rss is only about 13 MB):

memory: usage 16777216kB, limit 16777216kB, failcnt 244063804
memory+swap: usage 16777964kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup out of memory: Kill process 975803 (mserver5) score 1003 or sacrifice child
Killed process 4134544 (mserver5), UID 0, total-vm:28382372kB, anon-rss:16776396kB, file-rss:13284kB, shmem-rss:0kB
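
To make that split explicit, here is a small Python sketch that pulls the counters out of the "Killed process" line above. The regular expression and the helper names (parse_oom_kill, rss_breakdown) are my own; I am only assuming the line layout shown in this log, which may differ between kernel versions.

```python
import re

# The "Killed process" line exactly as it appears in the log above.
OOM_LINE = (
    "Killed process 4134544 (mserver5), UID 0, total-vm:28382372kB, "
    "anon-rss:16776396kB, file-rss:13284kB, shmem-rss:0kB"
)

# Matches "<name>:<number>kB" pairs such as "anon-rss:16776396kB".
FIELD_RE = re.compile(r"([a-z-]+):(\d+)kB")


def parse_oom_kill(line):
    """Return the kB counters of an OOM 'Killed process' line as a dict."""
    return {name: int(value) for name, value in FIELD_RE.findall(line)}


def rss_breakdown(counters):
    """Return each *-rss counter as a fraction of the total resident set."""
    rss = {k: v for k, v in counters.items() if k.endswith("-rss")}
    total = sum(rss.values())
    return {k: v / total for k, v in rss.items()}


if __name__ == "__main__":
    counters = parse_oom_kill(OOM_LINE)
    for name, share in rss_breakdown(counters).items():
        print(f"{name}: {counters[name]} kB ({share:.2%} of RSS)")
```

On this line, anonymous memory accounts for more than 99.9% of the resident set. Note that anon-rss also covers anonymous mmap, so it does not by itself separate malloc from mmap; it only rules out file-backed pages.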

> GDK_mem_maxsize is not a hard limit.  It's a value on which some
> decisions having to do with allocating memory are based.

This is actually what I was asking for confirmation of.
I'm not complaining about anything, but this is not really documented and I'm just trying to understand exactly which tools I have. Is it a soft limit, in the sense that the server actually tries to stay more or less below it, or is it really only used to estimate other values?

The point is not to force some exact behaviour, but to apply some resource management *without* having the server killed. Getting killed is the big problem.
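
As a rough illustration of the numbers involved, the Python sketch below computes the gap between the gdk_mem_maxsize I observed and the cgroup limit, and checks how close the usage was to the limit at kill time. It is only a sketch under assumptions: the 0.815 factor is simply the one I observed in this setup, the helper names are made up, and the 0.95 threshold is an arbitrary example value.

```python
# Values taken from this thread.
CGROUP_LIMIT_BYTES = 17179869184   # memory.limit_in_bytes (16g)
MEM_MAXSIZE_FACTOR = 0.815         # factor observed for gdk_mem_maxsize


def default_mem_maxsize(limit_bytes, factor=MEM_MAXSIZE_FACTOR):
    """Estimate gdk_mem_maxsize from the cgroup limit (observed behaviour)."""
    return int(limit_bytes * factor)


def headroom_bytes(usage_bytes, limit_bytes):
    """Bytes left before the cgroup limit is reached (never negative)."""
    return max(limit_bytes - usage_bytes, 0)


def is_near_limit(usage_bytes, limit_bytes, threshold=0.95):
    """True when usage reaches the given fraction of the limit.

    The 0.95 default is an arbitrary example, not a recommended value.
    """
    return usage_bytes >= threshold * limit_bytes


if __name__ == "__main__":
    mem_maxsize = default_mem_maxsize(CGROUP_LIMIT_BYTES)
    print("gdk_mem_maxsize estimate:", mem_maxsize)
    print("gap to cgroup limit:", headroom_bytes(mem_maxsize, CGROUP_LIMIT_BYTES))
    # Usage reported by the kernel at kill time: 16777216 kB.
    usage_at_kill = 16777216 * 1024
    print("near limit at kill time:", is_near_limit(usage_at_kill, CGROUP_LIMIT_BYTES))
```

A check like this would only tell me when the process is getting close to the limit. It would not by itself keep the server under it, which is why I am asking what gdk_mem_maxsize actually enforces.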

On Tue, 9 Mar 2021 at 20:26, Sjoerd Mullender <sjoerd@monetdb.org> wrote:
On 09/03/2021 17.44, Roberto Cornacchia wrote:
> Sjoerd,
>
> Thanks for these details.
>
> Let me focus on these two concepts:
>
>  > Allocated address space may or may not reside in physical memory.
>  > The kernel decides that.
>
> Absolutely.
> Still, you can decide what's the max that malloc() can use:
>
>  > gdk_mem_maxsize is the maximum amount of address space we want to
>  > allocate using malloc and friends.

GDK_mem_maxsize is not a hard limit.  It's a value on which some
decisions having to do with allocating memory are based.  GDK_vm_maxsize
is a fairly hard limit.  We may go over it in critical code (during
transaction commit), but otherwise allocations (both malloc and mmap)
will fail if you were to go over this limit.

>
> Isn't malloc() using RSS + swap to back its allocations? Does that mean
> that gdk_mem_maxsize should be a cap to what we want to be able to
> allocate on RSS + swap?
> In this case, actually, swap usage was 0.
>
> So I still don't understand why mallocs for 16g happened,
> when gdk_mem_maxsize was 14g.

This 16g is a combination of what was allocated (and used) through
malloc and through mmap.  Forget about memory being a malloc thing.
Mmap also uses memory.

Why swap is unused I don't know.  As I said, it's the kernel that does
that.  We have nothing to do with that.  It may be you have kernel
parameters set that cause the kernel to not use swap.

By using the debugger you can check how much MonetDB thinks it has
allocated.  Look at the values of the variables
GDK_mallocedbytes_estimate and GDK_vm_cursize.  But again, that is
allocated address space, not memory.  And there may be fragmentation as
well, so the amount of address space in use by the process may well be
higher (of course these numbers don't take space for declared variables
into account).




>
>
> On Tue, 9 Mar 2021 at 17:27, Sjoerd Mullender <sjoerd@monetdb.org> wrote:
>
>     We do not in any way control RSS (resident set size).  That is fully
>     under control of the kernel.
>
>     gdk_mem_maxsize is the maximum amount of address space we want to
>     allocate using malloc and friends.
>     gdk_vm_maxsize is the maximum amount of address space we want to
>     allocate (malloc + mmap).
>     Neither value has anything to do with how much actual, physical memory
>     is being used.  They are just measures of how much address space is
>     used, allocated either through malloc or malloc+mmap.  Allocated
>     address space may or may not reside in physical memory.  The kernel
>     decides that.
>
>     Of course, if you're using the address space (however you got it),
>     there must be physical memory to which the address space is mapped.
>
>     The difference between malloc and mmap is mostly where the physical,
>     disk-spaced backing (if any) for the virtual memory is located, i.e.
>     where the kernel can copy the memory to if it needs space.  In the case
>     of mmap (our use of it, anyway) it is files in the file system, and in
>     the case of malloc it is swap (if you have it) or physical memory (if
>     you don't).
>
>     On 09/03/2021 16.58, Roberto Cornacchia wrote:
>      > Hi,
>      >
>      > I would appreciate some help interpreting the following memory-related
>      > issues.
>      >
>      > I've got a mserver5 instance running with cgroups v1 constraints
>      >
>      > - memory.limit_in_bytes = 17179869184 (16g)
>      > - memory.memsw.limit_in_bytes = 9223372036854771712
>      >
>      > gdk_mem_maxsize is initialised as 0.815 * 17179869184 = 14001593384.
>      >
>      > So I get:
>      > sql>select * from env() where name in ('gdk_mem_maxsize', 'gdk_vm_maxsize');
>      > +-----------------+---------------+
>      > | name            | value         |
>      > +=================+===============+
>      > | gdk_vm_maxsize  | 4398046511104 |
>      > | gdk_mem_maxsize | 14001593384   |
>      > +-----------------+---------------+
>      >
>      > That looks good.
>      >
>      >
>      > To my surprise, this instance gets frequently OOM-killed for reaching
>      > 16g of RSS (no swap used):
>      >
>      > memory: usage 16777216kB, limit 16777216kB, failcnt 244063804
>      > memory+swap: usage 16777964kB, limit 9007199254740988kB, failcnt 0
>      > kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
>      > Memory cgroup out of memory: Kill process 975803 (mserver5) score 1003 or sacrifice child
>      >
>      > Now, there are two different aspects: giving a process a memory cap and
>      > making the process respect that cap without getting killed.
>      >
>      > - if the process allocates more than defined with cgroups, then it gets
>      > killed. That is fine, it doesn't surprise me
>      > - the question is: why did MonetDB surpass the 16g limit?
>      >
>      > Even more surprising, given that it "prudently" initialises itself at
>      > 80% of the available memory.
>      >
>      > Perhaps I was under the wrong assumption that MonetDB would never
>      > allocate more than gdk_mem_maxsize, but now I seem to realise that it
>      > simply uses this value to optimise its memory management (e.g. to decide
>      > how early to mmap).
>      >
>      > So, am I correct that setting gdk_mem_maxsize (indirectly via cgroups or
>      > directly via memmaxsize parameter) does not guarantee rss memory will
>      > stay under that value?
>      >
>      > If that is true, I am back at square 1 in my quest for how to cap rss
>      > usage (without getting the process killed).
>      >
>      > Thanks for your help.
>      > Roberto
>      >
>      > _______________________________________________
>      > users-list mailing list
>      > users-list@monetdb.org <mailto:users-list@monetdb.org>
>      > https://www.monetdb.org/mailman/listinfo/users-list
>     <https://www.monetdb.org/mailman/listinfo/users-list>
>      >
>
>     --
>     Sjoerd Mullender
>
>

--
Sjoerd Mullender