Capabilities in 2.6

[Posted April 6, 2004 by corbet]

The kernel capability mechanism gives (relatively) fine-grained control over what actions any given process can perform. The various capabilities include the ability to override file permissions, send signals to other processes, bind to low-numbered ports, and many other tasks. There have been visions over the years of exporting capabilities to user space and eliminating the "all-powerful superuser" concept, but none of those visions has been implemented in any widely distributed way.
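A process's capabilities are stored as a bitmask, so checking which ones it holds comes down to testing bits. The following is a minimal sketch in Python, assuming a Linux system that exposes the effective set as the CapEff field of /proc/self/status. The bit positions are those from <linux/capability.h>; the function names are purely illustrative, and only the few capabilities mentioned in this article are listed.

```python
# Bit positions taken from <linux/capability.h>; only a handful are listed.
CAP_BITS = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",       # override file permissions
    5: "CAP_KILL",               # send signals to other processes
    10: "CAP_NET_BIND_SERVICE",  # bind to low-numbered ports
    14: "CAP_IPC_LOCK",          # lock memory into RAM
    21: "CAP_SYS_ADMIN",
}


def decode_caps(mask):
    """Return the names of the known capabilities set in mask, lowest bit first."""
    return [name for bit, name in sorted(CAP_BITS.items()) if mask & (1 << bit)]


def effective_caps(status_path="/proc/self/status"):
    """Read the effective capability mask of the current process.

    Returns None if the file or the CapEff field is not available.
    """
    try:
        with open(status_path) as status:
            for line in status:
                if line.startswith("CapEff:"):
                    return int(line.split()[1], 16)
    except OSError:
        pass
    return None
```

On a typical system, an ordinary user's shell would be expected to report an empty effective set, while a root shell reports most or all bits set.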

One of the capabilities is called CAP_IPC_LOCK; it gives a process the ability to lock a region of virtual memory into physical RAM. This capability needs to be controlled; otherwise a rogue process could lock up all of physical memory and effectively shut down the system. There are, however, legitimate reasons for giving this capability to normal users. Programs which handle encryption (such as gpg) would like to lock in some of their memory so that passphrases and clear text do not get written out to swap. Systems like Oracle need the capability to lock in their shared segments (since they do their own paging, essentially) and to be able to allocate large-page "hugetlb" segments.
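To make the gpg case concrete, here is a rough sketch of what "locking in" a passphrase buffer looks like from user space, calling mlock(2) through Python's ctypes. It is an illustration rather than how gpg itself is written; lock_secret is a hypothetical helper name, and whether the call succeeds depends on the privileges and limits of the calling process.

```python
import ctypes
import ctypes.util
import mmap


def lock_secret(secret):
    """Copy secret into an anonymous mapping and try to pin it in RAM.

    Returns (buffer, locked); locked is False when mlock(2) is unavailable
    or refused (for example, for lack of privilege).
    """
    buf = mmap.mmap(-1, max(len(secret), 1))
    locked = False
    libc_name = ctypes.util.find_library("c")
    if libc_name is not None:
        libc = ctypes.CDLL(libc_name, use_errno=True)
        mlock = getattr(libc, "mlock", None)
        if mlock is not None:
            mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
            mlock.restype = ctypes.c_int
            view = ctypes.c_char.from_buffer(buf)
            addr = ctypes.addressof(view)
            # mlock() returns 0 on success and -1 on failure.
            locked = mlock(addr, len(buf)) == 0
            del view  # release the buffer export so buf can be closed later
    # Only write the secret once the lock attempt has been made.
    buf.write(secret)
    return buf, locked
```

A careful program would refuse to continue, or at least warn, when locked comes back False, since the secret could then end up in swap.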

To this end, Andrea Arcangeli posted a patch which allows the system administrator to disable CAP_IPC_LOCK checking via a sysctl variable. With those checks disabled, any non-privileged process can lock pages into memory or allocate large-page shared memory segments. Andrea asked for the patch to be incorporated into the 2.6 mainline.

The patch inspired some thinking on how best to make certain capabilities available to users. There has been a patch in circulation for a while which simply opens up memory locking to everybody, but which puts a resource limit on the number of pages which can be locked. The default limit is a single page, which works for gpg but which does not easily threaten the system as a whole. With a suitably adjusted limit, this patch should work for Oracle as well - but it does not address the large-page shared memory issue.
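On current Linux systems a per-process locked-memory limit is visible as RLIMIT_MEMLOCK; that this corresponds exactly to the patch described here is an assumption. A small sketch, with illustrative function names, of turning that limit into a page count:

```python
import resource


def lockable_pages(limit_bytes, page_size):
    """Convert a locked-memory limit in bytes into whole pages.

    Returns None when the limit is unlimited (RLIM_INFINITY).
    """
    if limit_bytes == resource.RLIM_INFINITY:
        return None
    return limit_bytes // page_size


def current_lockable_pages():
    """Pages the calling process may lock under its soft RLIMIT_MEMLOCK."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return lockable_pages(soft, resource.getpagesize())
```

With a 4096-byte page, a 4096-byte limit allows exactly one locked page, which is the gpg-sized default the article mentions; a database would need the limit raised to cover its shared segments.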

William Lee Irwin put together a different patch which allows the administrator to turn off checks for any capability via a set of sysctl variables. It differs from Andrea's patch in its generality, but also by virtue of using the security module framework rather than direct changes to the kernel core. Some people seemed to like this patch better, though there was some nervousness about its overall security which led William to add a strong comment and a lockdown capability to the patch.

Given that the whole idea behind capabilities was to be able to give specific capabilities to individual users, however, some developers wondered why the current system couldn't be used. To this end, Andrew Morton looked into hacking login to enable it to give capabilities to users. He was not impressed with what he found once he started trying to work with kernel capabilities:

It turns out that the whole "drop capabilities and then run something" thing does not work in either 2.4 or 2.6. And hasn't done since forever. What we have in there is no more useful than suser()...

I must say that I'm fairly disappointed that we developed and merged all that fancy security stuff but nobody ever bothered to fix up the existing simple capability code. Particularly as, apparently, the new security stuff STILL cannot solve the extremely simple Oracle-wants-CAP_IPC_LOCK requirement.

It was pointed out that SELinux can, in fact, solve this problem. But that will be little comfort to those who are not yet ready to adopt SELinux for their production systems.

The problem may originate from the fact that the visions of fully capability-driven systems involve assigning capabilities to all executables and having a process's capabilities tweaked every time a new program is run. That part of the system has never been merged into the mainline, partly because nobody has ever really figured out how to deal with system administration when every file has another 32 permission bits added onto it. The end result, in any case, is that the capability subsystem has never worked quite as it should. Given that Andrew is the gatekeeper, chances are good that some sort of fix for that problem will get into the kernel before any sort of more complicated solution to the problem of giving capabilities to users.
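One way to picture why "drop capabilities and then run something" fails is a simplified model of how capability sets are recomputed when a program is executed. The sketch below follows the classic rule (new permitted = file permitted masked by the bounding set, plus file inheritable masked by process inheritable). It assumes that, with no per-file capabilities stored anywhere, the kernel treats the file sets as full for root and empty for everybody else. It is not kernel code; the names are illustrative, and the real logic has more cases (real versus effective uid, for example).

```python
FULL = (1 << 32) - 1     # all 32 capability bits
CAP_IPC_LOCK = 1 << 14   # bit position from <linux/capability.h>


def caps_after_exec(inheritable, runs_as_root, bounding=FULL):
    """Model the (permitted, effective) sets of a process after exec.

    Assumption: without per-file capabilities, the file sets are
    all-or-nothing, full for root and empty for any other user.
    """
    file_set = FULL if runs_as_root else 0
    file_permitted = file_inheritable = file_effective = file_set
    permitted = (file_permitted & bounding) | (file_inheritable & inheritable)
    effective = permitted & file_effective
    return permitted, effective
```

Under this model, caps_after_exec(CAP_IPC_LOCK, runs_as_root=False) yields (0, 0): a login program that hands CAP_IPC_LOCK to a user sees it vanish at the first exec. Conversely, caps_after_exec(0, runs_as_root=True) yields the full set again, so a root process that dropped capabilities gets them back.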

OpenBSD as a model?

Posted Apr 8, 2004 6:51 UTC (Thu) by rfunk (subscriber, #4054) [Link]

I wonder if Linux might benefit from looking at OpenBSD's systrace facility. Systrace isn't quite like Linux Capabilities, since it deals with access to system calls rather than underlying actions, but it seems fairly similar on a functional level. Maybe the interface used for systrace could benefit Linux.

SELinux vs. (capabilities + file permissions)?

Posted Apr 8, 2004 14:08 UTC (Thu) by bkw1a (subscriber, #4101) [Link]

If the problems with capabilities noted above were fixed, how would
capabilities + file permissions compare with SELinux? What extra features
does SELinux get you? Are they worth the trouble?

SELinux vs. (capabilities + file permissions)?

Posted Apr 8, 2004 20:41 UTC (Thu) by jmshh (guest, #8257) [Link]

SELinux is much more than capabilities:
- More finely configurable
- Can prevent users from sharing access given to them
- Is role based, not just user or process
- Is more complex to administer
So you can do more with SELinux, but the price is a lot more work.

Capabilities mostly useless

Posted Apr 8, 2004 21:45 UTC (Thu) by Ross (subscriber, #4065) [Link]

The whole capability system is not very fine grained. In fact, all
capabilities are super-user privileges. Being able to lock up the
system or escalate privileges is highly likely if any of the capabilities
are granted. And some of them, like CAP_SYS_ADMIN, are grab-bags of
unrelated privileges. Why can't some normal user privileges be worked
into the scheme: CAP_LISTEN, CAP_CONNECT, CAP_PTRACE,
CAP_EXECSUID, CAP_USRCHOWN, CAP_USRCHMOD, etc.? With these
I could actually use capabilities to harden systems. As they are, I can
only lock down the root account, which isn't too useful when nothing runs as
root anyway.

Capabilities in 2.6

Posted Apr 8, 2004 22:29 UTC (Thu) by Klavs (subscriber, #10563) [Link]

I would note that I've used vserver for quite some time on 2.4 (patches for 2.6 are in development and released in alpha state; I'm using them on my laptop) to enable exactly this capability handling. Vserver strips ALL capabilities, even from root, but you can very easily add a capability per vserver. As you are supposed to run each service in a separate vserver (this has no notable overhead), you could easily add the mentioned capability to the vserver running Oracle.

Vserver works rather simply and does not reserve memory for each vserver, which makes it very lightweight. See http://www.linux-vserver.org
Perhaps the kernel coders should have a look at how the capabilities are used there, as it works rather well.
