A look at the PyPy 2.0 release

By Jake Edge
May 15, 2013

It's hard to say why, but May appears to be the month where we look in on PyPy. Three years ago, we had a May 2010 introduction to PyPy, followed by an experiment using it in May 2011. This year, the PyPy 2.0 release was made on May 9—that, coupled with our evident tradition, makes for a good reason to look in on this Python interpreter written in Python.

One might ask: "Why write a Python interpreter in Python?" The possibly surprising answer is: "performance". It's not quite as simple as that, of course, as there are some additional pieces to the puzzle. To start with, PyPy is written in a subset of Python, called RPython, which is oriented toward making a Python dialect that is less dynamic and acts a bit more like C. PyPy also includes a just-in-time (JIT) compiler that flat out beats "normal" Python (called CPython) on a variety of benchmarks.

PyPy has been making steady progress for over ten years now, and has reached a point where it can be used in place of CPython in lots of places. For a long time, compatibility with the standard library and other Python libraries and frameworks was lacking, but that situation has improved substantially over the years. Major frameworks like Django and Twisted already work with PyPy. The 2.0 release adds support for Stackless Python with greenlets, which provide micro-threading for Python. Those two pieces should allow asynchronous programs using gevent and eventlet to work as well (though gevent requires some PyPy-specific changes).

In order to support more Python modules that call out to C (typically for performance reasons), PyPy now includes CFFI 0.6, which is a foreign function interface for calling C code from Python. Unlike other methods for calling C functions, CFFI works well for both CPython and PyPy, while also providing a "reasonable path" for IronPython (Python on .NET) or Jython (Python on the Java virtual machine).

Trying it out

Getting PyPy 2.0 is a bit tricky, at least for now. Those who are on Ubuntu 10.04 or 12.04 can pick up binaries from the download page (as can Mac OS X and Windows users). While many distributions carry PyPy in their repositories, 2.0 has not arrived yet. There are "statically linked" PyPy binaries, but the 64-bit version (at least) doesn't quite live up to the name—it requires a dozen or so shared libraries, including older versions of libssl, libcrypto, and libbz2 than those available for Fedora 18.

Normally, given constraints like that, building from source is the right approach, but the project has some fairly scary warnings about doing so. According to the docs, building on a 64-bit machine with 4G or less of RAM will "just swap forever", which didn't sound all that enticing. But there is a workaround that doesn't use CPython and instead requires PyPy using some magic incantations (an environment variable and command-line flag) to limit the memory usage—but that means making the static PyPy binary work. A little symbolic linking (for libbz2) and some library building (openssl-1.0.0j) resulted in a functioning PyPy. There is no real reason not to use that, but I was a little leery of it and curious to continue with the build process.

Running the PyPy build is a rather eye-opening experience. Beyond the voluminous output (including a colored-ASCII rendering of the Mandelbrot set, lots of status, and some warnings that are seemingly not serious), it took more than 2 hours (8384.6 seconds according to the detailed timing information spit out at the end of the build process) on my not horribly underpowered desktop (2.33 GHz Core 2 Duo). The Linux kernel only takes six minutes or so to build on that system.

Calculating the Mandelbrot set while translating is not the only whimsical touch that comes with PyPy. Starting it up leads to a fortune-like quote, though one with a PyPythonic orientation:

    $ pypy
    Python 2.7.3 (b9c3566aa0170aaa736db0491d542c309ec7a5dc, May 11 2013, 17:54:41)
    [PyPy 2.0.0 with GCC 4.7.2 20121109 (Red Hat 4.7.2-8)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    And now for something completely different: ``PyPy is a tool to keep otherwise
    dangerous minds safely occupied.''
    >>>> 

From the command line, it acts pretty much exactly like Python 2.7.3, as advertised.

As a quick test of PyPy, I ran an amusingly shaped Mandelbrot program in both CPython and PyPy. As expected, the PyPy version ran significantly faster (just over 3 minutes, compared to 8+ minutes for CPython 2.7.3). In addition, the bitmaps produced were identical.
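
For readers who want to try a similar comparison, here is a minimal sketch of a Mandelbrot timing script. It is not the program used for the timings above; the function name, grid size, and iteration limit are assumptions chosen to keep it short, and it uses only the standard library so that the same file should run unchanged under both CPython and PyPy.

```python
import platform
import time


def mandelbrot_rows(width=60, height=24, max_iter=200):
    """Return the Mandelbrot set as a list of ASCII rows ('#' = in the set)."""
    rows = []
    for j in range(height):
        # Map the row index onto the imaginary axis, from -1.2 to 1.2.
        y = -1.2 + 2.4 * j / (height - 1)
        row = []
        for i in range(width):
            # Map the column index onto the real axis, from -2.0 to 1.0.
            x = -2.0 + 3.0 * i / (width - 1)
            c = complex(x, y)
            z = 0j
            n = 0
            # Iterate z -> z*z + c until it escapes or the limit is reached.
            while n < max_iter and abs(z) <= 2.0:
                z = z * z + c
                n += 1
            row.append("#" if n == max_iter else ".")
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    start = time.time()
    rows = mandelbrot_rows()
    elapsed = time.time() - start
    print("\n".join(rows))
    print("%s: %.3f seconds" % (platform.python_implementation(), elapsed))
```

Running the file once with python and once with pypy prints the interpreter name next to the elapsed time. The default grid is probably too small to show much difference, so raising width, height, and max_iter gives the JIT more of a chance to warm up.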

PyPy comes with its own set of standard library modules, but additional modules can be picked up from an existing CPython site-packages directory (via the PYTHONPATH environment variable). Trying out a few of those (BeautifulSoup 4 for example) showed no obvious problems, though a PyPy bug report shows problems using the lxml parser, along with some other, more subtle problems. The compatibility page gives an overview of the compatibility picture, while the compatibility wiki has more information on a per-package basis. A scan of the wiki shows there's lots more work to do, but it also shows a pretty wide range of code compatible with PyPy.
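
A quick way to check which modules an interpreter can actually import is a small probe script along the following lines. This is only a sketch: the script name probe.py, the helper name probe, and the default module list are assumptions made for illustration, and a successful import says nothing about the subtler runtime problems mentioned above.

```python
import importlib
import platform
import sys


def probe(module_names):
    """Try to import each named module; return a dict of name -> status."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except Exception as exc:
            # Record only the exception type; the details vary by interpreter.
            results[name] = "failed: %s" % exc.__class__.__name__
    return results


if __name__ == "__main__":
    # Module names come from the command line; the default list is arbitrary.
    names = sys.argv[1:] or ["json", "bs4"]
    print("interpreter: %s" % platform.python_implementation())
    for name, status in sorted(probe(names).items()):
        print("%-20s %s" % (name, status))
```

It could be run as something like "PYTHONPATH=/usr/lib/python2.7/site-packages pypy probe.py bs4 lxml", where the site-packages path is hypothetical and will differ between systems. Pure-Python modules are the likeliest to import cleanly; compiled extension modules built for CPython generally cannot be loaded by PyPy as-is.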

ARM and other projects

One of the more interesting developments in the PyPy world is ARM support, for which a 2.0 alpha has been released. It supports both ARMv6 (e.g., Raspberry Pi) and v7 (e.g., Beagleboard, Chromebook) and the benchmark results look good, especially given that the ARM code "is not nearly as polished as our x86 assembler". The Raspberry Pi Foundation helped get PyPy onto ARM with "a small amount of funding".

The PyPy project is running several concurrent fundraising efforts, three for specific sub-projects, and one for overall project funding. The transactional memory/automatic mutual exclusion sub-project is an effort to use software transactional memory to allow Python programs to use multiple cores more effectively. It would remove the global interpreter lock (GIL) for PyPy for better concurrency. PyPy hackers Armin Rigo and Maciej Fijałkowski gave a presentation at PyCon 2013 on this effort.
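
To make the motivation concrete, here is a sketch of the kind of CPU-bound, multi-threaded program that effort is aimed at. It is an illustration only, not code from the PyPy project, and the function names are made up for the example. Under an interpreter with a GIL, the threads below take turns executing bytecode, so splitting the work across them is not expected to use more than one core.

```python
import threading


def count_primes(lo, hi):
    """Count the primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(max(lo, 2), hi):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def threaded_count(limit, workers=4):
    """Split [0, limit) into contiguous chunks and count primes in threads."""
    results = [0] * workers
    step = (limit + workers - 1) // workers  # chunk size, rounded up

    def work(i):
        # Each thread writes only to its own slot, so no lock is needed.
        results[i] = count_primes(i * step, min((i + 1) * step, limit))

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


if __name__ == "__main__":
    print(threaded_count(200000))
```

The program gives the same answer however many threads are used; what a GIL-free interpreter would change is how many cores are busy while it runs.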

Another ongoing sub-project is an effort to add Python 3 support to PyPy. That would allow input programs in Python 3, but would not change the PyPy implementation language (RPython based on Python 2.x). A status report back in March shows good progress. On x86-32 Linux, the CPython regression test suite "now passes 289 out of approximately 354 modules (with 39 skips)".

The third sub-project is to make NumPy work with PyPy. NumPy is an extension for doing math with matrices and multi-dimensional arrays. Much of that work is done in C code, so PyPy's JIT would need to use the vector instructions on modern CPUs. A brief status update from May 11 shows some progress, as does the 2.0 release announcement (though the temporary loss of lazy expression evaluation may not exactly be considered progress).

Overall, the PyPy project seems to be cruising along. While none of the fundraising efforts have hit their targets, some fairly significant money has been raised. Beyond that, some major technical progress has been made. The sub-projects, software transactional memory in particular, are also providing interesting new dimensions for the project. We are getting closer to the day when most Python code is runnable with PyPy, though we still aren't there yet.



A look at the PyPy 2.0 release

Posted May 16, 2013 0:22 UTC (Thu) by justincormack (subscriber, #70439) [Link]

The ffi says it is "based on LuaJIT" but seems not to be really at all, as it is a slightly odd hybrid of an ABI based ffi (which LuaJIT is) and a C API based one using a compiler. Some C code has such appalling ABI properties it is unusable, but decent C code has reasonable ABI stability.

A look at the PyPy 2.0 release

Posted May 17, 2013 16:51 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Not to mention that, AFAICT, C++ is, again, left high and dry. There certainly aren't any docs or examples that I've been able to find. So we're still messing with the messes of Boost.Python (my preferred route), SWIG, or whatever other tools have cropped up in the past months. Wrapping C is "easy", but bindings for C++ get really hairy very quickly and you sometimes need access to the inner conversion mechanisms to do what you want (think of bindings for boost::any) which only Boost.Python seems to do well at all (and even that isn't ideal).

A look at the PyPy 2.0 release

Posted May 17, 2013 17:16 UTC (Fri) by pboddie (subscriber, #50784) [Link]

Have you looked at SIP for binding C++ libraries for Python? I have no idea whether there's any PyPy interoperability, but SIP's role for CPython is definitely proven.

A look at the PyPy 2.0 release

Posted May 17, 2013 18:25 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

I haven't looked recently, but I was just drawing a blank and I knew there were likely a few I'd forget anyways. From a cursory skim, it's missing a way to ask if a given PyObject* is a type T from the C++ code (see Boost's[1] version of it). Besides that, is there a way to manage the GIL in the entries into the bindings? With a single replaced header (invoke.hpp), I can have Boost drop the GIL on every entry into C++ code (via Boost at least) to help alleviate performance hits.

[1]http://www.boost.org/doc/libs/1_53_0/libs/python/doc/v2/e...

A look at the PyPy 2.0 release

Posted May 17, 2013 20:18 UTC (Fri) by pboddie (subscriber, #50784) [Link]

Sorry, I don't have direct experience with SIP, but perhaps your GIL question is answered here:

http://pyqt.sourceforge.net/Docs/sip4/using.html#the-pyth...

A look at the PyPy 2.0 release

Posted May 18, 2013 6:52 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Well, that's good, but I'm still missing other features I get with Boost.Python[1] in minimal code (typically ~3 lines of C++ per typical method (with named args and a docstring); add ~10 extra for more involved methods including virtual methods, callback wrapping, and PyObject-as-std::iostream translations). Other features are custom getter/setter functions for properties (boost::variant bindings with named accessors which return None if the value for name isn't set and can be assigned to), protected class access[2] (we allow inheriting from a C++ class in Python and calling it from C++ again), and the extract class mentioned above (boost::any bindings). Not sure how well SIP handles using objects via a shared_ptr (or similar class) and never by instance either, but this is likely hackable with some work.

[1]My main gripe with it is the lack of PyPy support. This isn't me just ripping on other bindings systems for the sake of it; these features are just required for complete bindings and that is more important than PyPy support (as nice as it would be).
[2]"SIP does not support access to either private or protected instance variables."

A look at the PyPy 2.0 release

Posted May 18, 2013 7:13 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

> the extract class mentioned above (boost::any bindings)

Found something usable[1] for boost::any conversions. Maybe boost::variant could be done with %Property and %MethodCode, but it'd likely be messy (not that it's clean in Boost.Python either, but at least that is all C++ and not a mixture).

[1]http://pyqt.sourceforge.net/Docs/sip4/c_api.html#sipCanCo...

A look at the PyPy 2.0 release

Posted May 18, 2013 8:12 UTC (Sat) by edomaur (subscriber, #14520) [Link]

From the same people as cffi, there is also cppyy: http://doc.pypy.org/en/latest/cppyy.html

A look at the PyPy 2.0 release

Posted May 16, 2013 7:00 UTC (Thu) by Frej (subscriber, #4165) [Link]

The original pypy, I believe, was actually investigating the feasibility of generating a compiler from an interpreter using just partial evaluation (Futamura projections). I do think feasibility for them also meant real usable performance, so it's not an entirely pure approach, as the interpreter was written in restricted python instead of plain python, and I don't know how much pypy has diverged from the original research and instead started aiming to be a real alternative to CPython.

But pypy has some really interesting roots
http://en.wikipedia.org/wiki/Partial_evaluation
http://www.diku.dk/OLD/forskning/topps/neilfest/leuschel.pdf

A look at the PyPy 2.0 release

Posted May 24, 2013 10:31 UTC (Fri) by zack (subscriber, #7062) [Link]

> Getting PyPy 2.0 is a bit tricky, at least for now. Those who are on Ubuntu 10.04 or 12.04 can pick up binaries from the download page (as can Mac OS X and Windows users). While many distributions carry PyPy in their repositories, 2.0 has not arrived yet.

FWIW, on the very same day this article went out, PyPy 2.0 was uploaded to Debian unstable http://packages.qa.debian.org/p/pypy/news/20130515T180110...

Call it a race condition :)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds