Go's memory management, ulimit -v, and RSS control
By Jonathan Corbet, February 15, 2011
Many years ago, your editor ported a borrowed copy of the original BSD
vi editor to VMS; after all, using EDT was the sort of activity that lost
its charm relatively quickly. DEC's implementation of C for VMS wasn't too
bad, so most of the port went reasonably well, but there was one hitch: the
vi code assumed that two calls to sbrk() would return
virtually contiguous chunks of memory. That was true on early BSD systems,
but not on VMS. Your editor, being a fan of elegant solutions to
programming problems, solved this one by simply allocating a massive array
at the beginning, thus ensuring that the second sbrk() call would
never happen. Needless to say, this "fix" was never sent back upstream
(the VMS uucp port hadn't been done yet in any case) and has long since
vanished from memory.
That said, your editor was recently amused by a
message on the golang-dev list indicating that the developers of the Go
language have adopted a solution of equal elegance. Go has memory
management and garbage collection built into it; the developers believe
that this feature is crucial, even in a systems-level programming
language. From the FAQ:
"One of the biggest sources of bookkeeping in systems programs is
memory management. We feel it's critical to eliminate that
programmer overhead, and advances in garbage collection technology
in the last few years give us confidence that we can implement it
with low enough overhead and no significant latency."
In the process of trying to reach that goal of "low enough overhead and no
significant latency," the Go developers have made some simplifying
assumptions, one of which is that the memory being managed for a running
application comes from a single, virtually-contiguous address range. Such
assumptions can run into the same problem your editor hit with vi
(other code can allocate pieces in the middle of the range), so the Go
developers adopted the same solution: they simply allocate all the memory
they think they might need (they figured, reasonably, that 16GB should
suffice on a 64-bit system) at startup time.
That sounds like a bit of a hack, but an effort has been made to make
things work well. The memory is allocated with an mmap() call,
using PROT_NONE as the protection parameter. This call is meant
to reserve the range without instantiating any of the memory; when
a piece of that range is needed by the application, the protection
is changed to make it readable and writable. At that point, a page fault
on the pages in question will cause real memory to be allocated. Thus,
while this mmap() call will bloat the virtual address size of the
process, it should not consume much more memory until the running
program actually needs it.
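The reserve-then-commit pattern described above can be sketched in a few lines. This is a minimal illustration in Python using ctypes, not the Go runtime's own code; the helper names (reserve(), commit(), release()) are made up for the example, and the bindings assume 64-bit Linux.

```python
import ctypes
import mmap
import os

# Bind the C library's mmap(), mprotect(), and munmap(). The argument types
# assume 64-bit Linux, where off_t has the same width as long.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.mmap.restype = ctypes.c_void_p
_libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                       ctypes.c_int, ctypes.c_int, ctypes.c_long]
_libc.mprotect.restype = ctypes.c_int
_libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
_libc.munmap.restype = ctypes.c_int
_libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

PROT_NONE = 0  # Linux value, defined here rather than taken from the mmap module
MAP_FAILED = ctypes.c_void_p(-1).value


def _raise_errno(call):
    err = ctypes.get_errno()
    raise OSError(err, f"{call}: {os.strerror(err)}")


def reserve(size):
    """Reserve `size` bytes of address space without making them usable."""
    addr = _libc.mmap(None, size, PROT_NONE,
                      mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr == MAP_FAILED:
        _raise_errno("mmap")
    return addr


def commit(addr, size):
    """Make part of a reserved range readable and writable.

    Physical pages are still only allocated when the range is first touched.
    """
    if _libc.mprotect(addr, size, mmap.PROT_READ | mmap.PROT_WRITE) != 0:
        _raise_errno("mprotect")


def release(addr, size):
    """Unmap the range, returning the address space to the kernel."""
    if _libc.munmap(addr, size) != 0:
        _raise_errno("munmap")
```

A caller would reserve the full range once (reserve(16 << 30) for the 16GB case), then call commit() on each piece just before using it. Only the committed and touched pages should show up in the resident set size.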
This mechanism works fine on the developers' machines, but it runs into
trouble in the real world. It is not uncommon for users to use
ulimit -v to limit the amount of virtual memory available to
any given process; the purpose is to keep applications from getting too
large and causing the entire system to thrash. When users go to the
trouble to set such limits, they tend, for some reason, to choose numbers
rather smaller than 16GB. Go applications will fail to run in such an
environment, even though their memory use is usually far below the limit
that the user
set. The problem is that ulimit -v does not restrict memory
use; it restricts the maximum virtual address space size, which is a very
different thing.
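The difference is easy to observe. The sketch below, with a hypothetical helper name, forks a child process, optionally lowers RLIMIT_AS in it (the limit that ulimit -v sets; the shell takes its argument in kilobytes), and then attempts a PROT_NONE reservation that touches no memory at all. It assumes a 64-bit Linux system.

```python
import mmap
import os
import resource

PROT_NONE = 0  # Linux value, defined here rather than taken from the mmap module


def reservation_succeeds(reserve_bytes, limit_bytes=None):
    """Try a PROT_NONE reservation in a child process.

    If limit_bytes is given, the child first sets RLIMIT_AS to that value.
    Returns True if the mapping could be created.
    """
    pid = os.fork()
    if pid == 0:
        exit_code = 1
        try:
            if limit_bytes is not None:
                resource.setrlimit(resource.RLIMIT_AS,
                                   (limit_bytes, limit_bytes))
            region = mmap.mmap(-1, reserve_bytes,
                               flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                               prot=PROT_NONE)
            region.close()
            exit_code = 0
        except (OSError, ValueError, MemoryError):
            pass  # mapping refused (typically ENOMEM) or limit not settable
        finally:
            os._exit(exit_code)  # never fall back into the parent's code
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

With no limit in place, reservation_succeeds(16 << 30) would normally be expected to return True; with limit_bytes=1 << 30 (the equivalent of ulimit -v 1048576) the same reservation should be refused, even though the child never uses a single page of the range.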
One might argue that, given what users typically want to do with
ulimit -v, it might make more sense to have it restrict
resident set size instead of virtual address space size. Making that
change now would be an ABI change, though; it would also make Linux
inconsistent with the behavior of other Unix-like systems. Restricting
resident set size is also simply harder than restricting the virtual
address space size. But even if this change could be
made, it would not help current users of Go applications, who may not
update their kernels for a long time.
One might also argue that the Go developers should dump the contiguous-heap
assumption and implement a data structure which allows allocated memory to
be scattered throughout the virtual address space. Such a change also
appears not to be in the cards, though; evidently that assumption makes
enough things easy (and fast) that they are unwilling to drop it. So some
other kind of solution will need to be found. According to the original
message, that solution will be to shift allocations for Go programs (on
64-bit systems) up to a range of memory starting at 0xf800000000.
No memory will be allocated until it is needed; the runtime will simply
assume that nobody else will take pieces of that range in between
allocations. Should that assumption prove false, the application will die
messily.
For now, that assumption is good; the Linux kernel will not hand out memory
in that range unless the application asks for it explicitly. As with many
things that just happen to work, though, this kind of scheme could break at
any time in the future. Kernel policy could change, the C library might
begin doing surprising things, etc. That is always the hazard of relying
on accidental, undocumented behavior. For now, though, it solves the
problem and allows Go programs to run on systems where users have
restricted virtual address space sizes.
It's worth considering what a longer-term solution might look like. If one
assumes that Go will continue to need a large, virtually-contiguous heap,
then we need to find a way to make that possible. On 64-bit systems, it
should be possible; there is a lot of address space available, and the cost
of reserving unused address space should be small. The problem is that
ulimit -v is not doing exactly what users are hoping for; it
regulates the maximum amount of virtual memory an application can use, but
it has relatively little effect on how much physical memory an application
consumes. It would be nice if there were a mechanism which controlled
actual memory use - resident set sizes - instead.
As it turns out, we have such a mechanism in the memory controller. Even better, this
controller can manage whole groups of processes, meaning that an
application cannot increase its effective memory limit by forking. The
memory controller is somewhat resource-intensive to use (though work is
being done to reduce its footprint) and, like other control group-based
mechanisms, it's not set up to "just work" by default. With a bit of work,
though, the memory controller could replace ulimit -v and do
a better job as well. With a suitably configured controller running, a Go
process could run without limits on address space size and still be
prevented from driving the system into thrashing. That seems like a more
elegant solution, somehow.
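As a hedged sketch of what that "bit of work" might involve: with the version-1 memory controller mounted, a group is created with mkdir, its limit is written to memory.limit_in_bytes, and a process is moved in by writing its PID to the group's tasks file. The function name and the directory in the note below are hypothetical, and writing to a real cgroup hierarchy requires suitable privileges.

```python
import os


def confine_to_memory_cgroup(group_dir, limit_bytes, pid):
    """Place `pid` in a cgroup-v1 memory group capped at `limit_bytes`.

    group_dir is the group's directory under a mounted memory controller
    hierarchy; creating the directory is what creates the group.
    """
    os.makedirs(group_dir, exist_ok=True)
    with open(os.path.join(group_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))
    # Children forked after this point stay in the group, so forking does
    # not give the application a fresh limit.
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(str(pid))
```

A launcher might call confine_to_memory_cgroup("/sys/fs/cgroup/memory/go-apps", 512 << 20, os.getpid()) before exec()ing the Go program; the mount point shown is an assumption and varies between systems.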
It's probably a good idea to provide some warning when an application is close to the limit (it's usually much easier to cope with a "low memory" problem than with a "no memory" problem), but that's a separate issue.
WebKit's JavaScriptCore also allocates 2GB for its own use on startup. A problem I discovered while debugging crashes in a buildbot was that Linux is not happy to overcommit too much by default, so on a machine with 2GB of RAM and less than 1GB of swap, the likelihood of an mmap() failing by hitting the limit increased quite a bit when the tests were running. The limit is system-wide. I fixed it by disabling the overcommit limit. If Go goes that route I'm afraid we'll run into this problem more often, so disabling that limit by default seems to be in order.