Leading items
By Jonathan Corbet, March 13, 2013
One need only have a quick look at the LWN conference coverage index to
understand that our community does not lack for opportunities to get
together. A relatively recent addition to the list of Linux-related
conferences is the series of "Linaro Connect" events. Recently, your editor was
finally able to attend one of these gatherings: Linaro Connect Asia in Hong
Kong. Various talks of
interest have been covered in separate articles; this article will focus on
the event itself.
Linaro is an industry consortium dedicated
to improving the functionality and performance of Linux on the ARM
processor; its list of members includes many of
the companies working in this area. Quite a bit of engineering work is
done under the Linaro banner, to the point that it was the source of 4.6% of the changes going into
the 3.8 kernel. A lot of Linaro's developers are employed by member
companies and assigned to Linaro, but the number of developers employed by
Linaro directly has been growing steadily. All told, there are hundreds of
people whose work is related to Linaro in some way.
Given that those people work for a lot of different companies and are
spread across the world, it makes sense that they would all want to get
together on occasion. That is the purpose of the Linaro Connect events.
These conferences are open to any interested attendee, but they are focused
on Linaro employees and assignees who otherwise would almost never see each
other. The result is that, in some ways, Linaro Connect resembles an
internal corporate get-together more than a traditional Linux conference.
So, for example, the opening session was delivered by George Grey, Linaro's
CEO; he used it to update attendees on recent developments in the Linaro
organization. The Linaro Enterprise Group (LEG) was announced last November; at this point there
are 25 engineers working with LEG and 14 member companies. More recently,
the Linaro Networking Group was announced
as an initiative to support the use of ARM processors in networking
equipment. This group has 12 member companies, two of which have yet to
decloak and identify themselves.
Life is good in the ARM world, George said; some 8.7 billion ARM chips were
shipped in 2012. There are many opportunities for expansion, not the least
of which is the data center. He pointed out that, in the US, data centers
are responsible for 2.2% of all energy use; ARM provides the opportunity to
reduce power costs considerably. The "Internet of things" is also a
natural opportunity for ARM, though it brings its own challenges, not the
least of which is security: George noted that he really does not want his
heart rate to be broadcast to the world as a whole. And, he said, the
upcoming 64-bit ARMv8 architecture is "going to change everything."
The event resembled a company meeting in other ways; for example, one of
the talks on the first day was an orientation for new employees and assignees.
Others were mentoring sessions aimed at helping developers learn how to get
code merged upstream. One of the sessions on the final day was for the
handing out of awards to the people who have done the most to push Linaro's
objectives forward. And a large part of the schedule (every afternoon,
essentially) was dedicated to hacking sessions aimed at the solution of
specific problems. It was, in summary, a focused, task-oriented gathering
meant to help Linaro meet its goals.
There were also traditional talk sessions, though the hope was for them to
be highly interactive and task-focused as well. Your editor was amused to
hear the standard complaint of conference organizers everywhere: despite
their attempts to set up and facilitate discussions, more and more of the
sessions seem to be turning into lecture-style presentations with one person
talking at the audience. That said, your editor's overall impression was
of an event with about 350 focused developers doing their best to get a lot
of useful work done.
If there is a complaint to be made about Linaro Connect, it would be that
the event, like much in the mobile and embedded communities, is its own
world with limited connections to the broader community. Its sessions
offered help on how to work with upstream; your editor, in his talk,
suggested that Linaro's developers might want to work harder to be
the upstream. ARM architecture maintainer Russell King was recently heard to complain about Linaro Connect, saying
that it works outside the community and that "It can be viewed as
corporate takeover of open source." It is doubtful that many see
Linaro in that light; indeed, even Russell might not really view things in
such a harsh way. But Linaro Connect does feel just a little bit isolated
from the development community as a whole.
In any case, that is a relatively minor quibble. It is clear that the ARM
community would like to be less isolated, and Linaro, through its strong
focus on getting code upstream, is helping to make that happen.
Contributions from the mobile and embedded communities have been steadily
increasing for the last few years, to the point that they now make up a
significant fraction of the changes going into the kernel. That can be
expected to increase further as ARM developers become more confident in
their ability to work with the core kernel, and as ARM processors move into
new roles. Chances are that, in a few years, we'll have a large set of
recently established kernel developers, and that quite a few of them
will have gotten their start at events like Linaro Connect.
[Your editor would like to thank Linaro for travel assistance to attend
this event.]
By Jonathan Corbet, March 12, 2013
By any reckoning, the ARM architecture is a big success; there are more ARM
processors shipping than any other type. But, despite the talk of
ARM-based server systems over the last few years, most people still do not
take ARM seriously in that role. Jason Taylor, Facebook's Director of
Capacity Engineering & Analysis, came to the 2013 Linaro Connect Asia
event to say that it may be time for that view to change. His talk was an
interesting look into how one large, server-oriented operation thinks ARM
may fit into its data centers.
It should come as a surprise to few readers that Facebook is big. The
company claims 1 billion users across the planet. Over 350 million
photographs are uploaded to Facebook's servers every day; Jason suggested
that perhaps 25% of all photos taken end up on Facebook. The company's
servers handle 4.2 billion "likes," posts, and comments every day and
vast numbers of users checking in. To be able to handle that kind of load,
Facebook invests a lot of money into its data centers; that, in turn, has
led naturally to a high level of interest in efficiency.
Facebook sees a server rack as its basic unit of computing. Those racks
are populated with five standard types of server; each type is optimized
for the needs of one of the top five users within the company. Basic web
servers offer a lot of CPU power, but not much else, while database servers
are loaded with a lot of memory and large amounts of flash storage capable
of providing high I/O operation rates. "Hadoop" servers offer medium
levels of CPU and memory, but large amounts of rotating storage; "haystack"
servers offer lots of storage and not much of anything else. Finally,
there are "feed" servers with fast CPUs and a lot of memory; they handle
search, advertisements, and related tasks. The fact that these servers run
Linux wasn't really even deemed worth mentioning.
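Laid out as data, the five profiles are easy to compare. The following is a rough summary, in Python, of the resource emphasis of each type as described in the talk; the labels are illustrative only and are not Facebook's actual specifications.
# Rough, hypothetical summary of the five standard server types from the talk;
# the resource labels are illustrative, not Facebook's actual specifications.
SERVER_TYPES = {
    "web":      {"cpu": "high",   "memory": "low",    "storage": "minimal"},
    "database": {"cpu": "medium", "memory": "high",   "storage": "flash, high IOPS"},
    "hadoop":   {"cpu": "medium", "memory": "medium", "storage": "rotating, large"},
    "haystack": {"cpu": "low",    "memory": "low",    "storage": "rotating, very large"},
    "feed":     {"cpu": "high",   "memory": "high",   "storage": "minimal"},
}

for name, profile in SERVER_TYPES.items():
    print(f"{name:>8}: {profile}")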
There are clear advantages to focusing on a small set of server types. The
machines become cheaper as a result of volume pricing; they are also easier to
manage and easier to move from one task to another. New servers can be
allocated and placed into service in a matter of hours. On the other hand,
these servers are optimized for specific internal Facebook users; everybody
else just has to make do with servers that might not be ideal for their
needs. Those needs also tend to change over time, but the configuration of
the servers remains fixed. There would be clear value in the creation of a
more flexible alternative.
Facebook's servers are currently all built using large desktop processors
made by Intel and AMD. But, Jason noted, interesting things are happening
in the area of mobile processors. Those processors will cross a couple of
important boundaries in the next year or two: 64-bit versions will be
available, and they will start reaching clock speeds of 2.4 GHz or
so. As a result, he said, it is becoming reasonable to consider the use of
these processors for big, compute-oriented jobs.
That said, there are a couple of significant drawbacks to mobile
processors. The number of instructions executed per clock cycle is still
relatively low, so, even at a high clock rate, mobile processors cannot get
as much computational work done as desktop processors. And that hurts
because processors do not run on their own; they need to be placed in
racks, provided with power supplies, and connected to memory, storage,
networking, and so on. A big processor reduces the relative cost of those
other resources, leading to a more cost-effective package overall. In
other words, the use of "wimpy cores" can triple the other fixed costs
associated with building a complete, working system.
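The arithmetic behind that claim is straightforward. The back-of-the-envelope sketch below uses made-up numbers (none of them from the talk): if one wimpy core delivers a third of a desktop processor's throughput, three times as many servers, and therefore three times the surrounding fixed cost, are needed to do the same amount of work.
# Back-of-the-envelope sketch with made-up numbers (not from the talk):
# how the fixed cost surrounding each processor scales when the processor
# itself does less work per unit of time.
def fixed_cost_per_unit_of_work(relative_throughput, fixed_cost_per_server=400):
    # Fixed costs: rack space, power supply, network port, and so on.
    servers_needed = 1.0 / relative_throughput
    return servers_needed * fixed_cost_per_server

desktop = fixed_cost_per_unit_of_work(relative_throughput=1.0)
wimpy = fixed_cost_per_unit_of_work(relative_throughput=1 / 3)
print(f"fixed-cost multiplier for wimpy cores: {wimpy / desktop:.1f}x")  # 3.0x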
Facebook's solution to this problem is a server board called, for better or
worse, "Group Hug." This design, being put together and published through
Facebook's Open Compute Project,
puts ten ARM processor boards onto a single server board; each processor
has a 1Gb network interface which is aggregated, at the board level, into a
single 10Gb interface. The server boards have no storage or other
peripherals. The result is a server board with far more processors than a
traditional dual-socket board, but with roughly the same computing power as
a server board built with desktop processors.
These ARM server boards can then be used in a related initiative called the
"disaggregated rack." The problem Facebook is trying to address here is
the mismatch between available server resources and what a particular task
may need. A particular server may provide just the right amount of RAM,
for example, but the CPU will be idle much of the time, leading to wasted
resources. Over time, that task's CPU needs might grow, to the point that,
eventually, the CPU power on its servers may be inadequate, slowing things
down overall. With Facebook's current server architecture, it is hard to
keep up with the changing needs of this kind of task.
In a disaggregated rack, the resources required by a computational task are
split apart and provided at the rack level. CPU power is provided by boxes
with processors and little else — ARM-based "Group Hug" boards, for
example. Other boxes in the rack may provide RAM (in the form of a simple
key/value database service), high-speed storage (lots of flash), or
high-capacity storage in the form of a pile of rotating drives. Each rack
can be configured differently, depending on a specific task's needs. A
rack dedicated to the new "graph search" feature will have a lot of compute
servers and flash servers, but little rotating storage. A photo-serving rack,
instead, will be dominated by rotating storage. As needs change, the
configuration of the rack can change with it.
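As a rough illustration of the "RAM box" idea, the sketch below models rack-level memory as a key/value service reachable over the network, with a compute node as the client. Everything in it, including the toy wire protocol and all of the names, is a hypothetical simplification; it is not Facebook's or the Open Compute Project's design.
# Minimal sketch of "RAM provided as a key/value service" at rack level.
# The protocol and names are hypothetical, not Facebook's actual design.
import socket
import socketserver
import threading

_store = {}  # the memory box's contents

class MemoryBoxHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection: "SET key value" or "GET key".
        line = self.rfile.readline().decode().rstrip("\n")
        parts = line.split(maxsplit=2)
        if parts and parts[0] == "SET":
            _store[parts[1]] = parts[2] if len(parts) > 2 else ""
            self.wfile.write(b"OK\n")
        elif parts and parts[0] == "GET":
            self.wfile.write((_store.get(parts[1], "") + "\n").encode())

def start_memory_box(port=9999):
    server = socketserver.TCPServer(("127.0.0.1", port), MemoryBoxHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def remote_call(request, port=9999):
    # What a compute node would do: reach memory across the rack network.
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall((request + "\n").encode())
        return sock.makefile().readline().rstrip("\n")

if __name__ == "__main__":
    box = start_memory_box()
    remote_call("SET photo:42 cached-thumbnail-location")
    print(remote_call("GET photo:42"))  # prints: cached-thumbnail-location
    box.shutdown()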
All of this has become possible because the speed of network interfaces has
increased considerably. With networking speeds up to 100Gb/sec within the
rack, the local bandwidth begins to look nearly infinite, and the network
can become the backplane for computers built at a higher level. The result
is a high-performance computing architecture that allows systems to be
precisely tuned to specific needs and allows individual components to be
depreciated (and upgraded) on independent schedules.
Interestingly, Jason's talk did not mention power consumption — one of
ARM's biggest advantages — at all. Facebook is almost certainly concerned
about the power costs of its data centers, but Linux-based ARM servers are
apparently of interest mostly because they can offer relatively inexpensive
and flexible computing power. If the disaggregated rack experiment
succeeds, it may well demonstrate one way in which ARM-based servers can take a
significant place in the data center.
[Your editor would like to thank Linaro for travel assistance to attend
this event.]
By Nathan Willis, March 13, 2013
At SCALE 11x in Los Angeles, Bradley Kuhn of the Software Freedom
Conservancy presented
a unique look at the peculiar origin of the Affero GPL
(AGPL). The AGPL was created to solve the problem of application
service providers (such as Web-delivered services) skirting copyleft
while adhering to the letter of licenses like the GPL, but as Kuhn
explained, it is not a perfect solution.
The history of AGPL has an unpleasant beginning, middle, and end,
Kuhn said, but the community needs to understand it. Many people
think of the AGPL in conjunction with the "so-called Application
Service Provider loophole"—but it was not really a
loophole at all. Rather, the authors of the GPLv2 did not foresee the
dramatic takeoff of web applications—and that was not a failure,
strictly speaking, since no one can foresee the future.
In the late 1980s, he noted, client-server applications were not yet
the default, and in the early 1990s, client-server applications
running over the Internet were still comparatively new. In addition,
the entire "copyleft hack" that makes the GPL work is centered around
distribution, as it functions in copyright law. To the
creators of copyleft, making private modifications to a work has never
required publishing one's changes, he said, and that is the right
stance. Demanding publication in such cases would violate the user's
privacy.
Nevertheless, when web applications took off, the
copyleft community did recognize that web services represented a
problem. In early 2001, someone at an event told Kuhn, "I won't
release my web application code at all, because the GPL is the BSD
license of the web." In other words, a service can be built on
GPL code, but can incorporate changes that are never shared with the
end user, because the end user does not download the software
from the server. Henry Poole, who founded the web service
company Allseer to assist nonprofits with fundraising, also understood
how web applications inhibited user freedom, and observed that
"we have no copyleft." Poole approached the Free
Software Foundation (FSF) looking for a solution, which touched off
the development of what became the AGPL.
Searching for an approach
Allseer eventually changed its name to Affero, after which the AGPL
is named, but before that license was written, several other ideas to
address the web application problem were tossed back and forth between
Poole, Kuhn, and others. The first was the notion of "public
performance," which is a concept already well-established in copyright
law. If running the software on a public web server is a public
performance, then perhaps, the thinking went, a copyleft license's
terms could specify that such public performances would require source
distribution of the software.
The trouble with this approach is that "public performance" has
never been defined for software, so relying on it would be somewhat
unpredictable—as an undefined term, it would not be clear when
it did and did not apply. Establishing a definition for "public
performance" in software terms is a challenge in its own right, but
without a definition for software public performance, it would be
difficult to write a public performance clause into (for example) the
GPL and guarantee that it was sufficiently strong to address the web
application issue. Kuhn has long supported adding a public
performance clause anyway, saying it would be at worst a "no op," but
so far he has not persuaded anyone else.
The next idea floated was that of the Ouroboros, which in antiquity
referred to a serpent
eating its own tail, but in classic computer science
terminology also meant a program that could generate its own source
code as output. The idea is also found in programs known as quines,
Kuhn said, although he only encountered the term later.
Perhaps the GPL could add a clause requiring that the program be able
to generate its source code as output, Kuhn thought. The GPLv2
already requires in §2(c) that an interactive program produce a
copyright notice and information about obtaining the license. Thus,
there was a precedent that the GPL can require adding a "feature" for
the sole purpose of preserving software freedom.
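For those who have not run into the term, a quine is simple to demonstrate. The three-line Python program below (an illustration, not anything shown in the talk) prints its own source, comment included, which is the kind of behavior the proposed print-your-own-source clause would have required a program to be able to exhibit.
# A tiny Python quine: running this file prints its own source, comment included.
s = '# A tiny Python quine: running this file prints its own source, comment included.\ns = %r\nprint(s %% s)'
print(s % s)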
The long and winding license development path
In September 2002, Kuhn proposed adding the "print your own source
code" feature as §2(d) in a new revision of the GPL, which would then
be published as version 2.2 (and would serve as Poole's license
solution for Affero). Once the lawyers started actually drafting the
language, however, they dropped the "computer-sciencey" focus of the
print-your-own-source clause and replaced it with the AGPL's
now-familiar "download the corresponding source code" feature
requirement instead. Poole was happy with the change and incorporated
it into the AGPLv1. The initial draft was "buggy," Kuhn said, with
flaws such as specifying the use of HTTP, but it was released by the
FSF, and was the first officially sanctioned fork of the GPL.
The GPLv2.2 (which could have incorporated the new Affero-style
source code download clause) was never released, Kuhn said, even
though Richard Stallman agreed to the release in 2003. The reasons
the release was never made were mostly bad ones, Kuhn said, including
Affero (the company) entering bankruptcy. But there was also internal
division within the FSF team. Kuhn chose the "wrong fork," he said,
and spent much of his time working on license enforcement actions rather
than technical work, which distracted him from other tasks. Meanwhile, other FSF people
started working on the GPLv3, and the still-unreleased version 2.2
fell through the cracks.
Kuhn and Poole had both assumed that the Affero clause was safely
part of the GPLv3, but those working on the license development
project left it out. By the time he realized what had happened, Kuhn
said, the first drafts of GPLv3 appeared, and the Affero clause was
gone. Fortunately, however, Poole insisted on upgrading the AGPLv1,
and AGPLv3 was written to maintain compatibility with GPLv3. AGPLv3
was not released until 2007, but in the interim Richard Fontana wrote
a "transitional" AGPLv2 that projects could use to migrate from AGPLv1
to the freshly-minted AGPLv3. Regrettably, though, the release of
AGPLv3 was made with what Kuhn described as a "whimper." A lot of
factors—and people—contributed, but ultimately the upshot
is that the Affero clause did not revolutionize web development as had
been hoped.
In the time that elapsed between the Affero clause's first
incarnation (in 2002) and the release of AGPLv3 (in 2007), Kuhn said,
the computing landscape had changed considerably. Ruby on
Rails was born, for example, launching a widely popular web
development platform that had no ties to the GPL community.
"AJAX"—which is now known simply as JavaScript, but at the time
was revolutionary—became one of the most widely-adopted way to
deliver services. Finally, he said, the possibility of
venture-capital funding trained new start-ups to build their
businesses on a "release everything but your secret sauce" model.
Open source had become a buzzword-compliance checkbox to tick, but
the culture of web development did not pick copyleft licenses, opting
instead largely for the MIT License and the three-clause BSD License.
The result is what Kuhn called "trade secret software." It is not
proprietary in the old sense of the word; since it runs on a server,
it is not installed and the user never has any opportunity to get it.
The client side of the equation is no better; web services deliver
what they call "minified" JavaScript: obfuscated code that is
intentionally compressed. This sort of JavaScript should really be
considered a compiled JavaScript binary, Kuhn said, since it is
clearly not the "preferred form for modifying" the application. An
example snippet he showed illustrated the style:
try{function e(b){throw b;}var i=void 0,k=null;
function aa(){return function(b){return b}}
function m(){return function(){}}
function ba(b){return function(a){this[b]=a}}
function o(b){ return function(){return this[b]}}
function p(b){return function(){return b}}var q;
function da(b,a,c){b=b.split(".");c=c||ea;
!(b[0]in c)&&c.execScript&&c.execScript("var "+b[0]);
for(var d;b.length&&(d=b.shift());)
!b.length&&s(a)?c[d]=a:c=c[d]?c[d]:c[d]={}}
function fa(b,a){for(var c=b.split("."),d=a||ea,f;f=c.shift();)
which is not human-readable.
Microsoft understands the opportunity in this approach, he added,
noting that proprietary JavaScript can be delivered to run even on an
entirely free operating system. Today, the "trade secret" server side
plus "compiled JavaScript" client side has become the norm, even with
services that ostensibly are dedicated to software freedom, like the
OpenStack infrastructure or the git-based GitHub and Bitbucket.
In addition to the non-free software deployment itself, Kuhn
worries that software freedom advocates risk turning into a
"cloistered elite" akin to monks in the Dark Ages. The monks were
literate and preserved knowledge, but the masses outside the walls of
the monastery suffered. Free software developers, too, can live
comfortably in their own world as source code "haves" while the bulk
of computer users remain source code "have-nots."
One hundred years out
Repairing such a bifurcation would be a colossal task. Among other
factors, the rise of web application development represents a
generational change, Kuhn said. How many of today's web developers have
chased a bug from the top of the stack all the way down into the
kernel? Many of them develop on Mac OS X, which is proprietary but is
of very good quality (as opposed to Microsoft, he commented, which was
never a long term threat since its software was always terrible...).
Furthermore, if few of today's web developers have chased a bug all
the way down the stack, as he suspects, tomorrow's developers may not
ever need to. There are so many layers underneath a web application
framework that most web developers do not need to know what happens in
the lowest layers. Ironically, the success of free software has
contributed to this situation as well. Today, the best operating
system software in the world is free, and any teenager out there can
go download it and run it. Web developers can get "cool, fun jobs"
without giving much thought to the OS layer.
Perhaps this shift was inevitable, Kuhn said, and even if GPLv2.2
had rolled out the Affero clause in 2002 and he had done the best
possible advocacy, it would not have altered the situation. But the
real question is what the software freedom community should do now.
For starters, he said, the community needs to be aware that the
AGPL can be—and often is—abused. This is usually done
through "up-selling" and license enforcement carried out with a profit
motive, he said. MySQL AB (now owned by Oracle) is the most prominent
example; because it holds the copyright to the MySQL code and offers
it under both GPL and commercial proprietary licenses, it can pressure
businesses into purchasing commercial proprietary licenses by telling
them that their usage of the software violates the GPL, even if it
does not. This technique is one of the most frequent uses of
the AGPL (targeting web services), Kuhn said, and "it makes me sick," because it goes directly
against the intent of the license authors.
But although using the AGPL for web applications does not prevent
such abuses, it is still the best option. Preserving software freedom
on the web demands more, however, including building more federated
services. There are a few examples, he said, including Identi.ca and
MediaGoblin, but the problem that such services face is the
"Great Marketing Machine." When everybody else (such as Twitter and
Flickr) deploys proprietary web services, the resulting marketing
push is not something that licensing alone can overtake.
The upshot, Kuhn said, is that "we're back to catching up to
proprietary software," just as GNU had to catch up to Unix in
earlier decades. That game of catch-up took almost 20 years, he said,
but then again an immediate solution is not critical. He is resigned
to the fact that proprietary software will not disappear within his
lifetime, he said, but he still wants to think about 50 or 100 years
down the road.
Perhaps there were mistakes made in the creation and deployment of
the Affero clause, but as Kuhn's talk illustrated, the job of
protecting software freedom in web applications involves a number of
discrete challenges. The AGPL is not a magic bullet, nor can it
change today's web development culture, but the issues that it
addresses are vital for the long term preservation of software
freedom. The other wrinkle, of course, is that there is a wide range
of opinions about what constitutes software freedom on the web. Some
draw the line at whatever software runs on the user's local machine
(i.e., the JavaScript components), while others insist that public APIs and
open access to data are what really matters. The position advocated
by the FSF and by Kuhn is the most expansive, but, because of it,
developers now have another licensing option at their disposal.
Page editor: Jonathan Corbet