http://people.csail.mit.edu/jaffer/Schlep
Schlep Toolchains
`free!' are ignored.
C#
C# is also known as ECMA-334 and ISO/IEC 23270. It provides
garbage collection natively; calls to `free!' are
ignored.
C for WB
For WB, all storage allocation and deallocation is explicit in the
source code. Calls to the WB APIs which pass byte-vectors also pass
their lengths.
C for Water
For Water, byte-vectors and strings (which are distinct) are
dynamically allocated and have manifest lengths.
Water-C uses the
Hans Boehm GC.
There are small differences between the WB and Water versions
of scm2c due to their different representations for
byte-vectors. The WB version is provided here. The Water version
will also be supplied in the future.
qase and vector-set-length! and
(from Common-Lisp)
#+, #-, defmacro, defvar,
and defconst. The Schlep Dialect is described in a
separate document:
● Schlep Dialect of Scheme
● Read-Syntax
● Syntax
● Procedures
● Integers as Bits
● Diagnostic Output
One might assume that in order for Schlep-dialect code to map to these
three languages, the Schlep-dialect must manifest the worst
limitations of each language; but translation can actually rectify
some of these limitations.
If I had to pick one practical feature of Scheme which elevates it
above other languages, it would be internal definitions. The
Algorithmic Language Scheme allows procedure definitions
(using define, letrec, and
named-let) inside of other procedures, which none of C,
C#, or Java does. Internal definitions allow calls of internal
procedures with a small number of arguments to replace the common
alternatives:
● duplication of code (okay for exit routines, not for internal
recursion);
● top-level definitions and calls passing large numbers of
arguments (foils tail-recursion);
● moving shared variables to top level (makes code not reentrant;
foils tail-recursion).
C and C# have a `goto' statement, enabling Schlep to
emulate calls to internal procedures in the tail position
using some variable assignments (sometimes including temporary
variable binding to emulate simultaneous assignment) followed by
a goto statement. The restriction to the tail position
does not allow internal recursion other than tail-recursion; but it
still facilitates use of internal procedures in many situations which would
otherwise force less desirable practices.
Java lacks a `goto' statement. Tail-called internal
procedures are instead implemented
using while (true), continue,
and break with labels. The resulting Java code is not as
readable as the original Scheme-dialect; but that loss in clarity is
balanced by greater expressive power.
Schlep-example gives an example of a
procedure with an internal procedure and how Schlep translates this
procedure into C, C#, and Java.
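As a rough illustration of the Java scheme described above, here is a hand-written sketch (not actual scm2java output) of how a tail-called internal procedure could be expressed with a labeled while (true) loop. The Scheme source shown in the comment, the class name TailCallSketch, and the temporary variable name newA are all assumptions made for this example.

```java
// Hand-written sketch, NOT actual scm2java output.
// Assumed Schlep-style source:
//   (define (int-gcd a b)
//     (define (loop a b)
//       (if (zero? b) a (loop b (remainder a b))))
//     (loop a b))
final class TailCallSketch {
    static int intGcd(int a, int b) {
        loop:
        while (true) {
            if (b == 0) {
                break loop;       // the internal procedure returns
            }
            // Tail call (loop b (remainder a b)): a temporary emulates
            // simultaneous assignment of both arguments.
            int newA = b;
            b = a % b;
            a = newA;
            continue loop;        // "call" loop again in tail position
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(intGcd(48, 18)); // prints 6
    }
}
```

The temporary is needed because the new value of b depends on the old value of a, and the new value of a depends on the old value of b; assigning them one after the other without it would lose a value.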
schlep-name which maps Scheme identifiers to
identifiers in the target language. This mapping is described on the
individual translator home pages:
scm2java.html,
scm2cs.html, and
scm2c.html.
Each of the target languages is statically typed, but the
Schlep-Dialect is manifestly typed. Each of the translator home pages
describes the multiple ways of declaring types for Scheme identifiers
based on glob-matching the identifier names. Two cases are handled
without declaration: identifiers ending in `!' or `?' are typed void and the native Boolean type,
respectively. The type of an identifier bound to a procedure is the
type of the return value of that procedure.
One could individually declare the type of every identifier used; but
I recommend adopting matchable conventions; this makes for less work
and more readable code. Examples of declaration files are
scm2c.typ,
scm2cs.typ, and
scm2java.typ (text from semicolon to
end-of-line is a comment).
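To make the idea of typing by name convention concrete, here is a minimal Java sketch of a lookup that combines glob rules with the two undeclared suffix cases. It is not taken from the translators: the rule patterns (`*-len`, `*-str`), the class TypeRuleSketch, and the choice to consult declared rules before the suffix defaults are all assumptions, and the sketch does not reproduce the actual .typ file syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class TypeRuleSketch {
    // Hypothetical glob rules, checked in insertion order.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("*-len", "int");
        RULES.put("*-str", "String");
    }

    // Minimal glob matcher: only '*' is special; everything else is literal.
    static boolean globMatches(String glob, String id) {
        StringBuilder re = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') re.append(".*");
            else re.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.matches(re.toString(), id);
    }

    // Returns the target type for a Scheme identifier, or null if undeclared.
    static String typeOf(String id) {
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            if (globMatches(rule.getKey(), id)) return rule.getValue();
        }
        if (id.endsWith("!")) return "void";     // default stated in the text
        if (id.endsWith("?")) return "boolean";  // default stated in the text
        return null;
    }

    public static void main(String[] args) {
        System.out.println(typeOf("blk-len")); // prints int
        System.out.println(typeOf("clear!"));  // prints void
        System.out.println(typeOf("empty?"));  // prints boolean
    }
}
```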
For arithmetic and basic data operations accessing or setting
variables, vectors, and strings, the translators emit the
corresponding statement or expression in the target language. For
utility and other procedures not handled, the translators emit
procedure calls with the names translated appropriately for the target
language. Utility procedures not intrinsic to the target language or
its libraries must be supplied by target language files to be compiled
with the translated code. Accessor-routines written in the target
language allow composite data types to be operated on by translated
code.
In C, NULL and false are conflated. They are separate in Java
and C#; their static typing allows NULL, but not false, to
be a placeholder for missing object data. False (#f) is
typically used for missing data in Scheme so that the logical
operators work on them. Our solution for this is to have scm2java and
scm2cs wrap test expressions which are not obviously boolean with the
function (method) `a2b' in generated Java and C# code.
These definitions are in SchlepRT.java
and SchlepRT.cs, respectively:
Java
public static boolean a2b(boolean b) {return b;}
public static boolean a2b(Object i) {return (i != null);}
C#
public static bool a2b(bool b) {return b;}
public static bool a2b(Object i) {return (i != null);}
This makes the semantics of Java and C# conditionals close to C. One
must still not depend on distinguishing false from NULL.
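The following Java sketch shows how a wrapped test expression might look in practice. The two a2b definitions are copied from the text above; everything else (the class A2bSketch, the table, and the lookup and describe helpers) is hypothetical and only stands in for translated code that uses null for missing data.

```java
import java.util.HashMap;
import java.util.Map;

final class A2bSketch {
    // Copied from the definitions quoted above.
    public static boolean a2b(boolean b) {return b;}
    public static boolean a2b(Object i) {return (i != null);}

    // Hypothetical data: null plays the role of Scheme's #f for a missing entry.
    static final Map<String, String> TABLE = new HashMap<>();
    static {
        TABLE.put("color", "red");
    }

    static String lookup(String key) {
        return TABLE.get(key); // null when the key is absent
    }

    // Roughly what (if (lookup key) "found" "missing") could become.
    static String describe(String key) {
        if (a2b(lookup(key))) return "found";
        return "missing";
    }

    public static void main(String[] args) {
        System.out.println(describe("color")); // prints found
        System.out.println(describe("size"));  // prints missing
        System.out.println(a2b(3 < 4));        // prints true
    }
}
```

Overload resolution picks the boolean version for expressions that are already boolean and the Object version for references, so the same wrapper can be applied to any test expression of either kind.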
scm2java.scm,
scm2cs.scm, and
scm2c.scm, are written in (full)
Scheme. The translations
from Schlep-Dialect source files to
target language files can be done by invoking the translation programs
as SCM scripts, or by loading and calling a translator from a Scheme
session. The first two lines of each program are written so that
the SCM Scheme implementation can execute them as
scripts.
#! /usr/local/bin/scm \
- !#
If your SCM binary is located in a different place, change
`/usr/local/bin/scm' to the absolute path to the
SCM executable on your computer. To try loading these files into
another implementation, you may need to remove the first two lines.
scm2java
produces one file.java file for
each file.scm file passed on the command
line.
scm2cs
creates one file concatenating the translations of all the input
files, which will be a combination of Scheme files and C# files
with the .cs extension. The one-file approach was
adopted so that methods could call methods in other classes
without explicit class prefixes. If you know how to do this
with multiple C# files, please let me know.
scm2c
when called with file.c or file.scm, scm2c produces
files file.h
and file.c. When called
with file.h, scm2c produces
just file.h
(from file.scm).
scm2c) which is nearly as tight as can be written
directly. Testing and debugging the source in SCM speeds development
and eliminates range errors which are difficult to find in compiled C.
For Water running a
spectralnorm
benchmark, Sun's Java-1.6 HotSpot Virtual Machine runs Water-J
nearly as fast (within a few percent) as GCC-4.3 compiled Water-C on
Linux. The Water implementation compiled by the Mono
JIT C# compiler (version 2.4.2.3) runs
more than 2 times slower.
The scm2java, scm2cs,
and scm2c programs
generate Texinfo
files with a .txi extension if the Scheme source file
has Schmooz format comments. Nearly any
documentation format can be generated from Texinfo files. Schmooz was
written by
Radey Shouman.
Each translator generates documentation for its target language API.
So that .txi files generated for different languages don't
overwrite each other, the translated sources should be directed to
distinct directories.
Not part of the Schlep technology, Pas2scm is a
Pascal-to-Scheme
translator I wrote to revive some nifty graphics programs I wrote
for Apollo
Computer workstations. Pas2scm demonstrates that programming
language translation can have Scheme as the target language. That
being said, Wirth's Pascal language is small, easily parsed, and not
object-oriented. Translating
from C++ would
prove more challenging.
I am a guest and not a member of the MIT Computer Science and Artificial Intelligence Laboratory.
My actions and comments do not reflect in any way on MIT.
agj @ alum.mit.edu