OpenEnterpriseTrends.com

The Open Source Portal for Enterprise Developers

OET News

The O'Reilly Factor: How Python Grips the Enterprise - Part I
Posted on: 05/27/2003

The Python scripting language is gaining popularity for its ability to let developers extend sharing between applications and business logic. OET begins a two-part discussion with Alex Martelli, author of O'Reilly's popular Python in a Nutshell and Python Cookbook. In this discussion, commercial developers will learn just how they can safely experiment with Python in their enterprise, and work with all sorts of legacy systems, Java, C++ and Win32.

In Part I, we start with Martelli's advice on how Python scripting can make your current apps' components more easily shareable with other apps. Tips for a starter project are also included.

An OET Interview
with Alex Martelli, Author,
"Python in a Nutshell" and "The Python Cookbook"

OET: From your books, Python seems to be a very flexible and feature-rich scripting language. What's your best advice for a "starter" Python project aimed at extending some core enterprise applications/assets?

Martelli: Exactly because Python is a general-purpose scripting language, a typical starter project to integrate Python with an existing enterprise application might be to make the application itself "scriptable." By that, I mean to integrate the Python interpreter with the application, so that some or all of the application's functionality may be accessed from Python scripts.

OET: Can you give more details about that process?

Martelli: When you're extending an existing application, to make it scriptable by embedding a Python interpreter in it, you need to consider two aspects about each target application:

1. If an application is internally structured as a collection of reusable components, all you need to do is to make those components Python-accessible. In other words, Python code must be enabled to "call back" into your application, in order to get information and/or effect changes to the application's state. Once your existing components can be accessed by Python, you can use Python itself to provide the "main flow of control" for all sorts of presentation layers. This strategy is known as "extending Python" because your application's reusable components are seen from Python as "extension modules."

2. If your application isn't component-oriented (and is, instead, structured as a somewhat "monolithic" executable program), you must still cover point 1 (Python code must still be enabled to call back into your application, or else it wouldn't be able to interact with it!), but you also need to do some extra work. Specifically, if your application's main flow-of-control logic must remain at the wheel at all times, then you must ensure that, within said flow, your application can load, initialize and start the Python interpreter and pass appropriate Python code to the interpreter as and when appropriate.

The tasks implied by point 2 are not hard ones, intrinsically. For example, here's how you start a Python interpreter and pass to it a Python source statement that emits "Hello World!" on standard output. If your application is in Java, the minimal code you need is:

    import org.python.core.*;
    import org.python.util.*;
    ...
    PySystemState.initialize();
    PythonInterpreter interp = new PythonInterpreter();
    interp.exec("print 'Hello World!'\n");

If your application is in C (or in C++, or in other languages that interoperate smoothly with C, and you want to use the Python C API for this task), the minimal code you need is even simpler:

    #include <Python.h>
    ...
    Py_Initialize();
    PyRun_SimpleString("print 'Hello World!'\n");

As you can see, the difficulties connected with point 2 are definitely not ones that Python itself imposes! Rather, they're connected to finding the right ways to fit Python initialization into your application's logic (that won't be hard if you just want to initialize the Python interpreter at your program's start-up and thereafter keep Python ready throughout), and deciding what Python code to pass to the interpreter and on what occasions to do so.
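The "call back into your application" half of the picture can be sketched in Python itself. This is only an illustration under stated assumptions: it uses current Python 3 syntax, and the `Application` class and `run_script` helper are hypothetical names standing in for a real host application, not part of any Python API.

```python
class Application:
    """Hypothetical stand-in for the host application's state."""

    def __init__(self):
        self.title = "untitled"
        self.log = []

    def set_title(self, title):
        # A piece of application functionality that scripts may call.
        self.title = title
        self.log.append("title set to " + title)


def run_script(app, source):
    """Run script source with the application exposed to it as `app`."""
    namespace = {"app": app}
    exec(compile(source, "<user script>", "exec"), namespace)
    return namespace


app = Application()
# The script is ordinary Python text; it reaches the host through `app`.
run_script(app, "app.set_title('Quarterly report')\n")
print(app.title)
```

The host decides which objects go into the script's namespace, so that dictionary is the natural place to control how much of the application a script can see.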

OET: Is the concept of an application written with components a Python-specific idea?

Martelli: By no means! If you've followed Bertrand Meyer's crucial advice that "Real systems have no top," your applications are already structured as collections of reusable components providing all the key application functionality. Such components are then joined together flexibly in various ways by relatively simple "main programs" in order to provide different presentation layers (GUIs, web-oriented interfaces, data-mining systems running in the background and so on) and/or varying sets of end-user functionality.

This approach is by far the best way to write real-world applications, in any language, particularly object-oriented ones. Python is just particularly good at exploiting such component-oriented application architectures, but you should consider architecting your applications in this way no matter what languages you use for them.

OET: Is the component-oriented approach you just described considered a Python "Best Practice" for introducing Python scripting into the enterprise?

Martelli: Oh, yes. As I mentioned, this approach to integrating Python into your existing applications is known as "extending Python" because your reusable components become "Python extension modules" and Python code just uses them, exactly as it might use any other kind of extension modules and packages, and indeed, often, together with other extensions. When feasible, such "extending" is vastly preferable to the two-layered approach (points 1, 2) that is necessary when you have a monolithic application rather than reusable components: It means less work for you and more flexibility in the resulting applications.

OET: What if a target application is not written in components?

Martelli: Unfortunately, when one starts with a monolithic application executable, restructuring it into a collection of reusable components is often a major architectural effort. Therefore, point 2, also known as "embedding Python" in the strict sense, becomes important. Loading and initializing Python is quite simple, as I showed in the small Hello World examples, but you do have to deal carefully with several sub-issues:

2a. Where will your application get Python code?
Text files are often OK, but if your application hinges on a database, you might want to have the Python source live as blobs in the database -- in that case, you need to provide ways to put Python source into the database, and to get it out, edit it for modification and put it back into the database again.

In my experience, allowing the use of text files is invariably useful in the development phase, even if you intend to rely exclusively on a database for deployment in the production phase, because text files are so much easier to create, edit and modify with text editors.

In the deployment/production phase, you don't necessarily have to support the loading of Python source code: you might choose to load compiled Python bytecode instead. That means Python-specialized bytecode for the "Classic Python" version, or ordinary Java bytecode originally produced from Python sources for the JVM-reliant Jython version (I'll come back to this distinction later).

However, once again, I repeat my recommendation to ensure you do support loading of Python source code from ordinary text files, at least in the development phase: This will make your development process smoother and more productive even if, for some reason, you decide to disable it when the application is deployed in a production environment.
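One way to support both sources of Python code is to hide them behind small loader functions, as in the rough sketch below. It assumes current Python 3 and uses the standard `sqlite3` module with an in-memory database purely for illustration; the `scripts` table, its columns, and the helper names are hypothetical.

```python
import os
import sqlite3
import tempfile


def load_from_file(path):
    """Read script source from an ordinary text file."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def store_in_db(conn, name, source):
    """Put (or replace) script source in the database."""
    conn.execute(
        "INSERT OR REPLACE INTO scripts (name, source) VALUES (?, ?)",
        (name, source),
    )
    conn.commit()


def load_from_db(conn, name):
    """Fetch script source back out of the database."""
    row = conn.execute(
        "SELECT source FROM scripts WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scripts (name TEXT PRIMARY KEY, source TEXT)")
store_in_db(conn, "greet", "result = 'hello from the database'\n")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "greet.py")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("result = 'hello from a file'\n")
    file_source = load_from_file(path)

# Both loaders return plain source text, so the caller runs them the same way.
results = []
for source in (file_source, load_from_db(conn, "greet")):
    namespace = {}
    exec(source, namespace)
    results.append(namespace["result"])
print(results)
```

Because both loaders return plain text, switching from files during development to the database in production only changes which loader is called.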

For loading Python bytecode modules in a Classic Python environment, one option added in the Python 2.3 version (currently released in beta-test) that's often very handy is to load the modules directly from a zipped archive file. In a Java-based version, of course, you could similarly load bytecode-compiled classes from jar files.
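A minimal sketch of loading a module from a zipped archive follows, assuming a current Python 3 interpreter, where placing an archive on `sys.path` is enough for the standard import machinery to find modules inside it. The archive name and the `oet_demo_rules` module are hypothetical, and the archive holds source rather than bytecode only to keep the example short (compiled `.pyc` files can be stored in the same way).

```python
import importlib
import os
import sys
import tempfile
import zipfile

tmp_dir = tempfile.mkdtemp()
archive = os.path.join(tmp_dir, "scripts.zip")

# Build an archive containing one small module.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(
        "oet_demo_rules.py",
        "def discount(total):\n    return total * 0.9\n",
    )

# An archive on sys.path is searched like a directory of modules.
sys.path.insert(0, archive)
importlib.invalidate_caches()
rules = importlib.import_module("oet_demo_rules")
print(rules.discount(200))
```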

2b. When will you load the Python code?
If you load the code into memory when your application starts up, or when the code is first needed, and then keep it in memory (ideally in bytecode-compiled form), you will get good performance in deployed production-phase applications, assuming you can spare the memory for such "caching." However, you may lose some flexibility unless you allow the user to explicitly ask for a reload of the code (in case the user is actively editing and modifying it), so you normally need to architect both operating modes.
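The two operating modes can be sketched as a small cache that compiles each script once and recompiles only on an explicit request. This assumes current Python 3; `ScriptCache` and the `loader` callable are hypothetical names, and the in-memory `sources` dictionary stands in for files or a database.

```python
class ScriptCache:
    """Keep compiled code in memory; recompile only when asked to."""

    def __init__(self, loader):
        self._loader = loader      # callable: script name -> source text
        self._compiled = {}

    def get(self, name):
        # Compile on first use, then serve the cached code object.
        if name not in self._compiled:
            self._compiled[name] = compile(self._loader(name), name, "exec")
        return self._compiled[name]

    def reload(self, name):
        # Explicit reload for users who are actively editing a script.
        self._compiled.pop(name, None)
        return self.get(name)

    def run(self, name, namespace=None):
        namespace = {} if namespace is None else namespace
        exec(self.get(name), namespace)
        return namespace


sources = {"tax": "rate = 0.20\n"}
loads = []


def loader(name):
    loads.append(name)             # record how often the source is fetched
    return sources[name]


cache = ScriptCache(loader)
first = cache.run("tax")["rate"]
cache.run("tax")                   # served from the cache, no second load
sources["tax"] = "rate = 0.25\n"   # the user edits the script
stale = cache.run("tax")["rate"]   # still the cached version
cache.reload("tax")
fresh = cache.run("tax")["rate"]
print(first, stale, fresh, loads)
```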

2c. On what occasions will you pass control to Python code (or, to be precise, to the Python interpreter, pointing it to the appropriate Python source or compiled code)?
Your application probably already has some appropriate hooks. Examples are start-up phases where it reads configuration files (you can conditionally replace such reading with the loading and execution of a Python module, depending, for example, on the filename or extension of the file), and interactive phases where it gets commands and requests from users, e.g., via a GUI (one such command can then become "load and execute this Python file").
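The configuration-file hook might look roughly like the sketch below, which picks a reader based on the file extension. It assumes current Python 3; `load_config`, the simple `key = value` format, and the file names are all hypothetical choices made for illustration.

```python
import os
import tempfile


def load_config(path):
    """Return settings from a plain key=value file or from a Python file."""
    settings = {}
    if path.endswith(".py"):
        # Hook: execute the file as Python and keep the names it defines.
        with open(path, encoding="utf-8") as handle:
            exec(compile(handle.read(), path, "exec"), settings)
        settings.pop("__builtins__", None)   # added by exec, not a setting
    else:
        # Existing behavior: parse simple "key = value" lines as strings.
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line and not line.startswith("#"):
                    key, _, value = line.partition("=")
                    settings[key.strip()] = value.strip()
    return settings


with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "app.cfg")
    script_path = os.path.join(tmp, "app.py")
    with open(plain_path, "w", encoding="utf-8") as handle:
        handle.write("# plain settings\nworkers = 8\nmode = batch\n")
    with open(script_path, "w", encoding="utf-8") as handle:
        handle.write("workers = 2 * 4\nmode = 'batch'\n")
    plain_settings = load_config(plain_path)
    script_settings = load_config(script_path)

print(plain_settings, script_settings)
```

The Python variant can compute its values, which is the extra flexibility the hook buys; the plain format keeps working unchanged.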

OET: So, these existing hooks are all I need?

Martelli: To start with, these existing hooks are probably sufficient. However, by examining your application's specific architecture, you'll probably be able to identify many more appropriate "hook points," where the ability to execute general Python code (the ability to call back into your application), rather than just hard-wired operations, will substantially increase your application's usefulness.

For example, you might want to ensure the possibility of going through Python for all network operations, or for all database accesses and updates, rather than directly using calls into some existing library.

OET: Can you give some examples of where that kind of detailed work would pay off for a developer?

Martelli: If your application accesses networks and databases directly, for instance, (i.e., via direct calls to existing net-access or DB-access libraries), then, in order to run an exhaustive test of your application, you'd need to accurately simulate its network partners or DB infrastructure, which can be very problematic.

If all such accesses happen by executing Python code, it becomes much simpler to "stub out" network and DB access during testing phases by using an alternative set of Python modules that just make believe they're actually talking on the Net or to the DB, but in fact produce pre-canned test data to properly stimulate the rest of the application's logic and log the application's actions for analysis (i.e., so that the application can be shown to behave correctly, or, if it doesn't, so that the causes of misbehavior can be identified). For similar reasons, it's often a good idea to optionally interpose the ability to execute Python code where your application would normally access some specialized hardware component.
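A stub of this kind can be quite small. The sketch below assumes current Python 3 and invents everything it names: `StubOrdersDB`, its two methods, and the `update_customer_total` function standing in for application logic that would normally talk to a real database.

```python
class StubOrdersDB:
    """Pretends to be the DB layer: serves canned rows and logs every call."""

    def __init__(self, canned_rows):
        self._rows = canned_rows
        self.calls = []

    def fetch_orders(self, customer_id):
        self.calls.append(("fetch_orders", customer_id))
        return list(self._rows.get(customer_id, []))

    def save_total(self, customer_id, total):
        # Nothing is written anywhere; the action is only recorded.
        self.calls.append(("save_total", customer_id, total))


def update_customer_total(db, customer_id):
    """Application logic under test; it only sees the `db` interface."""
    total = sum(amount for _, amount in db.fetch_orders(customer_id))
    db.save_total(customer_id, total)
    return total


stub = StubOrdersDB({42: [("A-1", 100), ("A-2", 250)]})
total = update_customer_total(stub, 42)
print(total, stub.calls)
```

The recorded `calls` list is what a test would inspect afterwards to confirm that the application asked for, and wrote back, the expected data.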

OET: From this last discussion, it sounds like I could use Python to actually improve the performance and management of my core application, without really adding a Python feature?

Martelli: Indeed, if you're looking for a specific killer application that you'll enable by making your application fully scriptable in Python, I'd nominate the ability to conduct more thorough and detailed automated tests of your application's functionality and performance as one very good choice.

Easy and powerful customization of an application's behavior by such non-programmers (or not-quite-programmers) as "power users" and customer-support technicians in the field is what one normally thinks of as the key advantage of making an application fully scriptable. And I'm not dissing that advantage, mind you! However, while only some applications can truly benefit from enabling such customization, all applications can benefit by richer, more powerful, more automated testing and performance tuning. In this sense, scriptability can offer high returns along these lines even more widely than it can for customization.

OET: Let's say I like the resultant payoff from Python on my existing monolithic applications. Are there any ways I can avoid, or at least minimize, some of the steps you outline above?

Martelli: Most of points 2a, 2b and 2c can be dispensed with if your application is structured as a set of reusable components (i.e., if you can adopt the preferable strategy of extending Python rather than having to go for the strategy of embedding Python).

However, none of the issues raised under these points need be a blocking factor -- as I've tried to indicate, they've all been met quite successfully, and there are very good, field-proven strategies you can follow in each case. And if you start with a monolithic application, following the point 2 (embedding) strategy may require less effort than restructuring your application into a set of reusable components -- the end result, generally, won't be quite as flexible and powerful, but the return on investment may still be higher, because the investment required on your part is substantially less.

Next Week: In How Python Grips the Enterprise - Part II, Martelli suggests specific techniques developers can use to marry Open Source Python with common commercial software packages for Win32, C++ and Java.
