
LSFMM: Soft reclaim

By Jonathan Corbet
April 23, 2013
LSFMM Summit 2013
Michal Hocko's 2013 LSFMM Summit session on soft reclaim was meant to be an overview of the work he has done to add soft limits to memory control groups. It turned into one of the more contentious sessions of the conference, though, as it revealed a fundamental disagreement over how soft limits should be implemented in this context.

Resource limits are often implemented in "soft" and "hard" forms. The soft limit, being the lower of the two, can be exceeded if the resource in question is not currently in short supply; the hard limit, instead, is always enforced. In the memory control group (memcg) context, one could interpret the soft limit as the amount of memory guaranteed to a group, while the hard limit is the maximum the group will ever be allowed to use. Memory usage between the soft and hard limits is only allowed if the system is not currently short of memory.
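As a rough illustration of that interpretation, the following toy model decides whether a new charge would be acceptable. It is only a sketch of the semantics described above, not of how the kernel enforces limits (the kernel reclaims memory rather than consulting a predicate like this); the function name `charge_allowed` and all of its parameters are hypothetical.

```python
def charge_allowed(usage, request, soft_limit, hard_limit, memory_tight):
    """Toy model of soft/hard limit semantics (hypothetical helper).

    usage, request, soft_limit and hard_limit share one arbitrary unit
    (say, pages); memory_tight says whether the system is short of memory.
    """
    new_usage = usage + request

    # The hard limit is always enforced.
    if new_usage > hard_limit:
        return False

    # Between the soft and hard limits, usage is tolerated only while
    # memory is plentiful.
    if new_usage > soft_limit and memory_tight:
        return False

    # At or below the soft limit: treated here as the guaranteed amount.
    return True
```

With a soft limit of 64 and a hard limit of 128, a group growing from 60 to 70 would be accepted on an idle system and refused under memory pressure, while growth past 128 would be refused in either case.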

Michal's patch set comes in two parts. The first is a relatively simple patch: when memory gets tight, the kernel scans the memcg hierarchy and reclaims memory from any group that is over its soft limit while leaving the others alone. Should memory remain tight after this pass has completed, a second pass is made in which every group is subject to reclaim. This part of the patch set did not generate a lot of discussion.
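The two-pass idea can be sketched with a small simulation. This is a simplified model rather than the patch itself: the `Group` class, the flat list of groups, and the choice to reclaim only the excess over the soft limit in the first pass are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    usage: int       # amount currently charged (hypothetical unit: pages)
    soft_limit: int  # same unit


def two_pass_reclaim(groups, needed):
    """Reclaim up to `needed` pages; return {group name: pages reclaimed}."""
    reclaimed = {}

    def take_from(group, amount):
        nonlocal needed
        if amount <= 0:
            return
        group.usage -= amount
        needed -= amount
        reclaimed[group.name] = reclaimed.get(group.name, 0) + amount

    # Pass 1: only groups above their soft limit are touched, and
    # (by assumption) only their excess is taken.
    for g in groups:
        take_from(g, min(max(g.usage - g.soft_limit, 0), needed))

    # Pass 2: memory is still tight, so every group is subject to reclaim.
    for g in groups:
        take_from(g, min(g.usage, needed))

    return reclaimed
```

If group `a` uses 100 pages against a soft limit of 60 and group `b` uses 50 against a soft limit of 80, a request for 30 pages is satisfied entirely from `a` in the first pass; `b` is only touched when the first pass cannot free enough.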

Part two gets deeper into the idea of what a soft limit actually means. Michal's implementation treats a soft limit of zero as being "unlimited"; it also assumes that if somebody does not bother to set a soft limit on a memcg, they don't care about the resources available to that memcg, so it can always be reclaimed from. The most controversial part of the implementation, though, is this: if a memcg has exceeded its soft limit, all child memcgs underneath it will be reclaimed from, regardless of whether they have exceeded their soft limits or not. So, given a memcg hierarchy in which groups B and C sit beneath group A, if group A is over its soft limit, groups B and C will be reclaimed from whether they are within their soft limits or not. Much of the session was dedicated to the discussion of this topic; indeed, that argument continues on the mailing list as of this writing.
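A small model makes the contested rule concrete. It covers only the inheritance of reclaim eligibility from an over-limit ancestor; the zero and unset soft-limit cases are left out. The `Memcg` class, the `eligible_for_reclaim` function, and the assumption that a group's usage includes that of its children are illustrative choices, not taken from the patch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memcg:
    name: str
    own_usage: int   # memory charged directly to this group (assumption)
    soft_limit: int
    children: List["Memcg"] = field(default_factory=list)

    def total_usage(self):
        # Assumed hierarchical accounting: a parent is charged for its
        # children's usage as well as its own.
        return self.own_usage + sum(c.total_usage() for c in self.children)


def eligible_for_reclaim(root):
    """Names of groups that may be reclaimed from, in pre-order.

    A group is eligible if it exceeds its soft limit or if any ancestor
    exceeds its own soft limit.
    """
    result = []

    def walk(group, ancestor_over):
        over = ancestor_over or group.total_usage() > group.soft_limit
        if over:
            result.append(group.name)
        for child in group.children:
            walk(child, over)

    walk(root, False)
    return result
```

With A holding a soft limit of 100, B using 80 against a limit of 50, and C using 30 against a limit of 40, A's total of 110 exceeds its limit, so A, B, and C all become eligible even though C is within its own limit. Raising A's soft limit to 200 leaves only B eligible.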

Those opposed to this behavior feel that it violates the meaning of a soft limit, which they interpret as a promise of a minimum amount of memory that a memcg can use. If one child memcg exceeds its limit to the point that it puts the parent over the soft limit as well, then all of its siblings will suffer, even if they remain below their soft limits. It would be better, it was argued, to simply reclaim from the specific memcg that has exceeded its soft limit while leaving the others alone. In a properly configured memcg setup, the parent should not go over its limit unless at least one child has; reclaiming from that child should bring the parent below its limit as well. Only in the case of a misconfigured control group, where no over-limit child can be found, would it make sense to reclaim from all child groups.
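The alternative policy argued for here could be sketched as follows. This is one hypothetical reading of the position, not code from any posted patch; the `Child` class and `select_victims` function are names invented for the example, and the parent's usage is assumed to be the sum of its children's usage.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Child:
    name: str
    usage: int
    soft_limit: int


def select_victims(parent_soft_limit, children: List[Child]):
    """Pick which children to reclaim from when the parent is over its limit."""
    parent_usage = sum(c.usage for c in children)  # assumed accounting
    if parent_usage <= parent_soft_limit:
        return []  # parent is within its soft limit; nothing to do

    # Preferred case: reclaim only from the children that are over limit.
    over = [c.name for c in children if c.usage > c.soft_limit]
    if over:
        return over

    # "Misconfigured" case: the parent is over its limit although no child
    # is, so fall back to reclaiming from all children.
    return [c.name for c in children]
```

Under this policy a sibling that stays below its soft limit is spared whenever an over-limit sibling can be found; it is reclaimed from only in the fallback case.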

Michal's view is a bit different, needless to say. He sees the parent group's soft limit as a sort of "gatekeeper" used to put an overall limit on a group of memcgs. In this view, it would make sense to "misconfigure" the control groups so that the parent could go over the soft limit even if all children remain below their limits; it's simply another memory management policy that the administrator can elect to use.

No consensus was reached on this particular issue, though the soft reclaim work as a whole was universally liked. As Hugh Dickins put it, everybody is happy that Michal is creating something that is better than what the kernel has now, but many of them disagree with the idea of reclaiming from child groups in this way. This has the look of a debate that won't be resolved anytime soon.

A few other memcg issues were touched on briefly. Deadlocks within the out-of-memory killer are evidently a problem at times, especially if a process runs into an out-of-memory situation while holding an inode's i_mutex lock. The suggested solution was to not go into the out-of-memory killer when certain locks are held; instead, allocation attempts should just fail. There were also some vaguely expressed concerns about dirty page accounting which, evidently, come down to "really ugly locking."

At this point, time ran out. The soft reclaim discussion appears poised to continue for some time yet, though; stay tuned.



LSFMM: Soft reclaim

Posted Apr 24, 2013 4:23 UTC (Wed) by dlang (✭ supporter ✭, #313)

I can sure see cases where the soft limit of one level could be such that all children of it can still be below their soft limits.

As a result, it seems to me that both modes of operation need to be supported:

1. if a group is over its soft limit, look inside that group.

1b. if any child is over its limit, reclaim from it.

1c. if all children are within their soft limits, but the parent is still over its soft limit, reclaim from all children.

That way the kernel is not imposing the policy that the parent's limit must be >= the sum of the limits of its children. If people believe that's the way they want to configure things, it works. If people want to configure things differently, it still works in a predictable way.

And let's face it, people are going to misconfigure systems. Even if the policy is that the parent must be >= the children, how is the kernel going to deal with these systems?

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds