Lambda the Ultimate

The Programming Languages Weblog
   

Algebra Of Programming (Bird, De Moor)

"Algebra Of Programming" has been mentioned on LTU a few times (mostly in 2002 it seems). Unfortunately the book is not available on-line, and costs $125 on Amazon! Since the book is about 9 years old now, I'm wondering if there are is any other material that summarizes it, re-states it or supersedes it?

Secondly, from what I have read, the examples in the book are in haskell, should I expect any problems if I use Scheme to implement those examples?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Froogle lists several copies for around $30.
Addall.com shows a copy for $23.
Amazon is a decent vendor, but they're not necessarily the best source for used books.

isbn.nu can do the "shop around"ing for you. (no affiliation, just FYI)

Papers by Bird and generally members of the Algebra of Programming group at Oxford have the same stuff and new developments, of course scattered over many papers and many years. Check out their home pages.

Rumour goes that de Moor no longer works on this stuff; like all of us, he found it too abstract. :)

If you can do product and sum types in Scheme, e.g., translate this user data type declaration:

data X = X0 | X2 X X

you will be fine.

It is interesting that you called it abstract. I actually started looking into it because it seemed very practical (at least from its table of contents). My understanding is that their work lays out the foundation for famous papers on folds (the universality of fold, bananas and lenses, etc.). I specifically looked into it because functional programming seems to handle 'data processing' in a much more elegant fashion. A good example is databases (I've only looked at open source java implementations as well as the new LINQ work in C#). They have one function for addition of two numbers, and a whole different implementation for summation of a list or a collection. In functional programming, the addition function can be 'folded' over a list.

All this is available in many, many papers, articles, etc. What I am interested in is the ability to use such 'foldable' functions in streams, rather than lists (lists where data is being continously being added). Michael Stonebraker's company is selling 'stream databases' for around $100k. I don't see why such a system couldn't be built using foldable functions, streams and other things I probably haven't learned yet.

By the way, I don't mean to offend anyone by mentioning commercial products and their license costs, it is an incredibly interesting problem to me even without the shocking license fees :)

It is pretty abstract, but it's also a really great book.

The first few chapters cover material about folds and their equational reasoning rules that are well-covered in the literature. The coverage of unfolds in AoP is a lot weaker.

The second half is where the book really shines. In the second half, they start talking about optimization problems, and they specify the optimization problems in terms of relations between the input and output. The beauty of their approach is that since they work in the category of relations, the specifications are arrows in the category of sets and relations, and can be systematically transformed into arrows in the category of sets and functions -- which turns the spec into an implementation. It's a really beautiful, elegant and uniform way of deriving efficient solutions to a very wide class of problems.

What kind of optimization problems do they discuss and solve? From my own experience most mathematicians are unlikely to learn CT just for the matter of shining elegance or abstraction but because they need it as a tool for describing certain classes of algebro-geometric problems adequately, that are hard to solve - no toy problems. The way computer scientists speak about the "elegance of mathematics" and CT in particular resembles me sometimes in apologies of "elegance" by superstring theorists. I'm sceptical if this leads very far if anywhere.

The class of optimization problems they address are finding the min or max under a relation in the set of results generated by a hylomorphism (ie, the composition of a fold and an unfold). An example of an algorithm in this category is the TeX line-breaking algorithm, whose derivation is one of the examples in the book.

Incidentally, I use categorical ideas pretty much every day.

As a mathematician wannabe and enthusiast of CT: Are you familiar with anyone willing to teach CT to someone who is interested in the shining elegance and abstraction?

Ohad.

kammar in actcom point co dot il

I specifically looked into it because functional programming seems to handle 'data processing' in a much more elegant fashion. A good example is databases (I've only looked at open source java implementations as well as the new LINQ work in C#). They have one function for addition of two numbers, and a whole different implementation for summation of a list or a collection. In functional programming, the addition function can be 'folded' over a list.

When you're dealing with the relational database model, you're dealing with sets of tuples, which, unlike lists or trees, are not a recursive data type. The notion of a fold does not really apply that well, since sets just don't have any intrinsic recursive structure.

The higher-order operation that does apply is aggregation, which transforms a set by applying to its elements a reflexive and associative operation that's closed over the type of the elements of the set, using the identity element for the operation as the base value. This, of course, is entirely equivalent to if you were to take the set, impose an arbitrary order on its elements, and then fold the operation over the elements of the resulting sequence.

Since the operations we aggregate with are reflexive and associative, aggregation can be optimized in ways that folds in general can't. You can divide up the set into arbitrary partitions, aggregate those in parallel, and then aggregate the set of partial results. This is a really important property for lots of database applications.

I would not have known about any of this if it wasn't for Grust's PhD thesis "Comprehending Queries." He uses Bird-Meertens formalism (as far as I can tell) to translate between completely functional programs and SQL (and XQuery, in subsequent papers). I have also read google listings mentioning BMF and parallelism (didn't read the papers yet). Since I posted the question, I've also come accross some papers from Jeremy Gibbons which seem to talk about 'streaming algorithms.' It is all pretty interesting, but my next step is to actually write some programs :)

Hi,

the link you gave points to a brief abstract of my
thesis. The full text is available here.

Best wishes,
--Torsten

Since the operations we aggregate with are reflexive and associative, aggregation can be optimized in ways that folds in general can't.

Don't you mean "symmetric and associative" instead of "reflexive and associative"?

Don't you mean "symmetric and associative" instead of "reflexive and associative"?

Or even "commutative and associative".

When you're dealing with the relational database model, you're dealing with sets of tuples, which, unlike lists or trees, are not a recursive data type. The notion of a fold does not really apply that well, since sets just don't have any intrinsic recursive structure.

I'm tempted to disagree with that, well, at least in the case of finite countable sets. But anyway, maybe a fold is just that, a fold? Just a manner to define the functional equivalent of looping over a collection while updating an accumulator? Not to be blase and al that.

Did some searching on Algebra of Programming, found some links that others may find useful:

Notes on "Algebra of Programming"

"Algebra of Programming" course - see "Schedule" for lecture notes.

Review of the book "Algebra of Programming"

[long-overdue-update]

Unable to find any other (ie: cheaper) editions, I got the on-demand print version. Text quality is very nice, but the cover is crap; already peeling.

Highly recommended; you know it's a good book when you're reading proofs on page 8.. and then it's proofs non-stop throughout the rest of the text. It's dense; no cruft, lots of equations, examples, exercises, theory, etc. Nice work!

After a significant dive into the book, I get lost around the design of the Allegories and their subsequent utility in the derivation of the polynomial-time optimization algorithms. While I can follow the proofs and even complete some of the exercises, I'm missing the "why".. the reasons for moving into relational vs functional and the cost/benefit of doing so. It seems overly complex to me (as an implementer rather than a theorist).

I've cross-researched another line (same concepts, but held at the functional level) in the work of Zhenjiang Hu. The results are similar (albeit less general), but significantly simpler (petagogically).

Bird argues Hu's work is not any simpler to implement, which I agree with, but if no one can understand what you're doing its awfully hard to write an implementation of it!

Hu argues Bird's work is all "hand-derivation" and not amenable to automatic transformations. This appears to be true with the current level of maturity on Bird's work (2005-ish). Since I understand Hu's technique, and I'm targeting automatic application of the rules, I've been following Hu-and-company's research more closely.

Without any other end-to-end discussion of Bird's technique, and this the only reference on the subject, it appears to be a dying line. This is unfortunate in my opinion, it would be nice to see the text available freely for future researchers to build upon.

I've cross-researched another line (same concepts, but held at the functional level) in the work of Zhenjiang Hu.
Are you refering to Zhenjiang Hu's work on Program Transformation and Optimization in Calculational Form?

Yes, that's the line of research I've been studying. Specifically, I found their work enlightening in the following papers (found at your link):

Deriving Structural Hylomorphisms from Recursive Definitions
Tupling Calculation Eliminates Multiple Data Traversals
Program Transformation in Calculational Form
(this is the 'competitor' to Algebra of Programming's technique)
Make it Practical: A Generic Linear Time Algorithm for Solving Maximum Weightsum Problems
(if I recall, this is the paper he argues against Bird's work)
Iterative-free Program Analysis
(nice application of the technique)
The Third Homomorphism Theorem on Trees: Downward & Upward Lead to Divide-and-Conquer
(nice derivation of a valuable new property)

His co-authors also have similar and enlightening work:
Isao Sasano specifically his thesis Generation of Efficient Algorithms for Maximum Marking Problems.

Masato Takeichi (lots of nice program transformation work).

Prentice Hall cancelled the entire series that this book was a part of, and as far as I know the rights have reverted to the authors. Some of the books in the series are already available online in various places: perhaps it might be worth emailing Oege to ask him if he's willing to make an electronic version available?

To save others a bit of research, this is the "Prentice Hall International Series in Computer Science", started by Hoare and continued by Bird. A page detailing which texts are now available online is here:

http://vl.fmnet.info/phiscs/

(I've made much use of the Z texts as an introduction to formal specification techniques --- although I ended up doing informal specification using LaTeX, with manual proofs, instead of using Z itself.)

If you haven't seen it already, and are still interested in introductions to formal specification (especially using Z), then you may want to take a look at Jonathan Jacky's "The Way of Z". It's possibly the most practical and accessible introduction to Z I've seen.

OK, I've bitten the bullet and sent him an email myself. Don't all jump on him at once...

Any response from him yet?

I asked Oege, and he pointed out that you can find it printed on demand by Pearson Education, if you look hard enough. Not cheap, mind you...

Yes, in that I got a response, but no in that he said he'd talk to Richard (Bird) about possibly making it available & I haven't heard anything since.

Phil

I found a copy of an international edition of the book, priced at around 40 dollars. Still waiting for it to arrive, the seller is from India. Maybe there are other copies of this international edition around (the seller from which I got it had 3 copies but they were all sold, apparently).

In all probability, you will receive a softcover Indian edition. It is an Eastern Economy Edition and the listed price is around $5.

Actually an Indian publisher also has rights to publish the book and he actually published the book sometime around year 2000. The publisher does not have any copies of the book right now. Again his rights are limited to selling the book in Indian sub-continent only.

Can you please tell me the details of the Indian Edition seller. I actually live in India and want to get this book but couldn't find it.