Who maintains this file?

By Jonathan Corbet
August 21, 2007
Kernel developers are generally encouraged to split patches into small pieces before posting them to the mailing lists. Making each change self-contained and easy to understand helps reviewers do their job and is thus a good thing. That said, anybody who doubted that one can get too much of a good thing surely learned the truth when Joe Perches submitted this patch set made up of almost 550 patches, all to the same file. It is fair to say that this deluge of patches was not universally welcomed.

Packaging aside, the ultimate goal of Joe's patch was not particularly controversial: he would like to make it possible to easily find out who is the maintainer of a specific file in the kernel tree. So, for each entry in the MAINTAINERS file, he added one or more lines with patterns describing which files belong to that entry. With that information in place, his get_maintainer.pl script can quickly identify who is responsible for any file in the tree. No more digging through MAINTAINERS or trying to extract email addresses from copyright notices in the source.
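As a rough illustration of the lookup described above, the following sketch matches a path against per-entry file patterns. The entry names, addresses, and glob-style pattern syntax are all assumptions invented for the example; the format used by Joe's patch is not shown here, and this is not a reimplementation of get_maintainer.pl.

```python
from fnmatch import fnmatchcase

# Hypothetical entries: section name -> (contact, list of file patterns).
# The names, addresses, and pattern syntax are invented for illustration.
ENTRIES = {
    "EXAMPLE NETWORK DRIVER": (
        "maintainer-a@example.org",
        ["drivers/net/example/*"],
    ),
    "EXAMPLE FILESYSTEM": (
        "maintainer-b@example.org",
        ["fs/examplefs/*", "include/linux/examplefs.h"],
    ),
}


def find_maintainers(path, entries=ENTRIES):
    """Return (section, contact) pairs whose patterns match the given path."""
    matches = []
    for section, (contact, patterns) in entries.items():
        # fnmatchcase treats "*" as matching any characters, including "/".
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            matches.append((section, contact))
    return matches


if __name__ == "__main__":
    for section, contact in find_maintainers("fs/examplefs/inode.c"):
        print(f"{section}: {contact}")
```

A path that matches no pattern simply yields an empty list, which is the case the central file would need to keep rare for the scheme to be useful.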

It's an appealing idea, but nobody seems to be entirely clear on how to implement it. Keeping this information in a central file has a number of obvious disadvantages. It would clearly go out of date quickly, for example. The MAINTAINERS file tends to get stale as it is; the chances of it being patched for every new or renamed file seem quite small. If developers, contrary to expectations, do keep this file up to date, one can expect large numbers of conflicts as all the resulting patches try to touch the same file.

The patch conflict problem could be mitigated by splitting up the MAINTAINERS file into per-directory versions, much like what was done with the kernel configuration file in the past. There are now over 400 Kconfig files in the mainline tree; some developers have expressed dismay at the idea of similar numbers of MAINTAINERS files being scattered around the tree. And, in any case, per-directory files aren't much more likely to be updated than the single, central file.

So along came another idea: why not just put the maintainer information into the source files? The result would be nicely split documentation which gets put in front of the relevant developers every time they edit the file. The record for maintenance of documentation in the code is far from perfect, but it is much better than the record for completely out-of-line documentation.

One question which comes up when this approach is considered is whether the resulting information should go into the binary kernel image or not. It would be easy to define a new tag like:

    MODULE_MAINTAINER("Your name here");

The provided information could then go into a special section in the kernel image where special tools could find it. Doing things this way would make it possible for people who don't have a kernel tree handy to look up a maintainer. On the other hand, it would bloat the kernel image and fix information in a binary, widely-distributed form where it could persist long after it goes out of date. So ex-maintainers could continue receiving mail for years after they have changed all of the relevant documentation.

An alternative would be to just put the maintainer information at the top of the file as a comment. Then it would only be in the source, and would, presumably, be relatively easy to keep up to date. At least, until, say, a mailing list for a major subsystem moves and all of the associated source files have to be changed. For example, Adrian Bunk noted that the move of the netdev mailing list to vger would have forced patches to about 1300 files.

Yet another approach is to find a way to store the information in the git repository. Git already maintains quite a bit of metadata about source files; to some it seems natural to add maintainer information as well. So far, the git developers have not shown a lot of appetite for adding this sort of feature. But Linus did point out that one could already use git to a similar effect with a simple command:

Do a script like this:

    #!/bin/sh
    git log --since=6.months.ago -- "$@" |
      grep -i '^    [-a-z]*by:.*@' |
      sort | uniq -c |
      sort -r -n | head

and it gives you a rather good picture of who is involved with a particular subdirectory or file.

The advantage of doing things this way is that the resulting output gives a current picture of who has actually been working on a file - a picture which requires no explicit maintenance at all. That list of people is probably a much better group to send copies of patches to than whoever might be listed in a maintainers file; they are the ones who know about what is happening in that part of the tree now.
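For readers who would rather post-process the result in a program, the same counting can be sketched in Python. This is only a loose translation of the pipeline above, not anything posted in the discussion; the function names are made up for the example, and the `recent_log()` helper assumes git is installed and that the script is run inside a repository.

```python
import re
import subprocess
import sys
from collections import Counter

# Same idea as the grep pattern in the script: a line indented by four
# spaces that carries a "...by:" tag and contains an "@".
TAG_LINE = re.compile(r"^    [-a-z]*by:.*@", re.IGNORECASE)


def count_tags(log_text, top=10):
    """Count '*by:' tag lines in git log output, most frequent first."""
    counts = Counter(
        line.strip() for line in log_text.splitlines() if TAG_LINE.match(line)
    )
    return counts.most_common(top)


def recent_log(paths, since="6.months.ago"):
    """Return git log output for the given paths (needs git and a repository)."""
    result = subprocess.run(
        ["git", "log", f"--since={since}", "--", *paths],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for tag, count in count_tags(recent_log(sys.argv[1:])):
        print(f"{count:7d} {tag}")
```

Keeping the counting separate from the git invocation means it can be tried on saved log output without a kernel tree at hand.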

No real resolution has been reached on this topic. Linus's approach may end up being the one taken by default; it already works without the need to merge any patches at all. The question may well stay around for a while, though. Approximately 2,000 developers put patches into the mainline over the course of one year; keeping track of which of those developers is the best to notify of changes to a particular file is never going to be easy.



Why not auto-generate MAINTAINERS?

Posted Aug 23, 2007 8:01 UTC (Thu) by walles (guest, #954) [Link]

Couldn't somebody just auto-generate the MAINTAINERS file from git as part of the build process?

That way we'll still have a MAINTAINERS file, but without the need for manually keeping it up to date.

Who maintains this file?

Posted Aug 23, 2007 20:24 UTC (Thu) by deweerdt (subscriber, #18159) [Link]

I use git blame. IMHO this often hits the right target.

Who maintains this file?

Posted Aug 24, 2007 0:59 UTC (Fri) by shredwheat (guest, #4188) [Link]

Does git have no 'praise' command like subversion? I would hope that this would be more necessary than its 'blame' counterpart.
:-)

Who maintains this file?

Posted Aug 24, 2007 8:44 UTC (Fri) by deweerdt (subscriber, #18159) [Link]

Adding

    [alias]
    praise = blame

to your .gitconfig allows you to choose between the two commands depending on your mood (and the nastiness of the bug) :)

Who maintains this file?

Posted Aug 25, 2007 0:23 UTC (Sat) by socket (guest, #43) [Link]

So, I used Linus's script to discover who maintains the MAINTAINERS file.

    ~/src/linux-2.6> whomaintains.sh MAINTAINERS
    43 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    38 Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    11 Signed-off-by: Jeff Garzik <jeff@garzik.org>
    8 Signed-off-by: Jean Delvare <khali@linux-fr.org>
    8 Signed-off-by: Adrian Bunk <bunk@stusta.de>
    6 Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    6 Signed-off-by: David S. Miller <davem@davemloft.net>
    6 Signed-off-by: Bryan Wu <bryan.wu@analog.com>
    5 Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
    5 Signed-off-by: John W. Linville <linville@tuxdriver.com>

But the script just scans through the 'Signed-off-by' (or other *by) lines in commits where that file is changed and counts them. And since Linus and Andrew wind up with their sign-offs on just about everything, the top few entries might not be as informative as one would hope.
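One possible workaround, sketched here in Python purely as an assumption about what might help, is to drop tag lines from addresses known to sign off on nearly everything before counting. The ignore set below uses the two addresses from the output above; who belongs in such a set is a judgment call, and the function name is invented for the example.

```python
import re
from collections import Counter

# Addresses to skip; taken from the output above. Deciding who counts as a
# "signs off on everything" gatekeeper is an assumption of this sketch.
IGNORED = {"torvalds@linux-foundation.org", "akpm@linux-foundation.org"}

EMAIL = re.compile(r"<([^>]+)>")


def filter_signoffs(tag_lines, ignored=IGNORED):
    """Count tag lines, skipping those whose address is in the ignored set."""
    counts = Counter()
    for line in tag_lines:
        match = EMAIL.search(line)
        if match and match.group(1).lower() in ignored:
            continue
        counts[line.strip()] += 1
    return counts.most_common()
```

This only hides the most obvious names; it does nothing to separate subsystem maintainers who merely forward patches from the people who actually wrote them.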

Someone with a little more shell-fu than I have could figure out what happens when this script gets run on every file in the tree and see just how much variation there is between the results.

I think some clever shell work and use of git could still give us what we're looking for, but I agree with the other comment that suggested 'git blame' is more likely to be what we want to use.
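A minimal sketch of the 'git blame' variant, again in Python and again only an illustration: it counts how many lines of a file are attributed to each author, based on the `author` header lines that `git blame --line-porcelain` prints for every line. The function names are invented, and the `blame()` helper assumes git is installed and the path lives in a repository.

```python
import subprocess
import sys
from collections import Counter


def count_blame_authors(porcelain_text):
    """Count lines per author in `git blame --line-porcelain` output."""
    counts = Counter()
    for line in porcelain_text.splitlines():
        # Header lines look like "author Some Name"; file content lines are
        # prefixed with a tab, so they cannot be mistaken for headers.
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts.most_common()


def blame(path):
    """Return line-porcelain blame output for a file (needs git and a repository)."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for author, lines in count_blame_authors(blame(sys.argv[1])):
        print(f"{lines:7d} {author}")
```

Unlike the sign-off count, this credits whoever last touched each line, so a tree-wide whitespace cleanup can dominate the result.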

Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds