2004.08_Swig-How to Glue Code Together.pdf

SYSADMIN
SWIG
Sticky Stuff
About the only things as abundant as Linux distributions are cover
versions of Gershwin’s “Summertime” and programming languages. But
choice is good. One of the reasons that there are so many programming
languages is that there is no single best programming language.
BY ARMIJN HEMEL
Languages (even general-purpose languages) are designed with specific
goals in mind, and programming language inventors often have different
goals and thus design their languages in different ways. The result is
that you can often do things faster or more efficiently in one language
than you can in another.
There are people who zealously argue that serious programs have to be
written in C and that a program written in a scripting language can
never be efficient. Reality is somewhat different. A lot of scripting
languages are implemented in a very efficient way, and with minimal
effort you can write well-performing, maintainable code. The saying
“the right tool for the right job” definitely applies when it comes to
choosing which programming language you should be using.
My favorite programming language (at
the moment) is Python. I can write code
in Python more easily and certainly
faster than in C. However, not everything
that I would like to do can be done from
Python. There are heaps and heaps of
code in C or C++ for which there is no
equivalent module in Python.
One example (that we will see later
on) is code for accessing the USB bus
and USB devices from user space with
libusb (used by programs like SANE and
gPhoto). I have some weird devices that
I don’t want to write a kernel driver for
because I don’t trust the device, I don’t
have any specifications and I can’t write
kernel drivers. With libusb I can stay in user space and don’t have to
fear crashing my machine with a badly written driver.
Of course, I want to be able to write
this user space driver in Python, but
there is no libusb for Python so I will
need to use C. What if I really want to use
Python? Aren’t there any alternatives?
A solution to this particular problem would be to reimplement the
functionality of libusb in Python. This is error-prone (you might
introduce new errors that weren’t in the original library) and
time-consuming (coding and testing a whole library costs time and you
will have to maintain the code as well). From a software-reuse point of
view, reimplementing code that already works well in another language
is a definite no-no.
It would be better to be able to use the
existing C code from within a scripting
language and thus leverage all the work
that others have done. This is what
SWIG makes possible.
SWIG stands for “Simplified Wrapper and Interface Generator” and is a
utility that generates glue code, which allows the use of existing
C/C++ code from within a whole bunch of scripting languages, such as
Perl and Python, and from languages like Java and C#.
A lot of interpreters have built-in mechanisms to call C/C++ code that
lives outside the interpreter. Perl has the XS mechanism and Java uses
JNI (Java Native Interface). SWIG has the advantage of being generic
(up to a certain point) and it makes gluing a lot easier (and more
fun!) than the conventional ways.
How SWIG works
The whole process of gluing C/C++ code and a scripting language can be
broken down into four steps:
• write an interface file that specifies which of the available
functions to export
• let SWIG generate C wrapper code for the target language by using the
interface file
August 2004
www.linux-magazine.com
Gluing code with SWIG
• compile the wrapper code into an object file
• link all object files and libraries into a shared library
So, why does this work? Remember that a lot (if not all) of the
interpreters that SWIG generates code for are written in C or C++
themselves. Calling some functions in code that is written in the same
language as the interpreter itself can hardly be called magic.
The interface file describes which functions in the C/C++ code should
be made visible to the scripting language. In a simple case, this
interface file can be just a few lines; in complex cases it can get a
lot longer.
The C code SWIG generates is hardly readable (and not meant to be
read). Its job is to include the right header files for the interpreter
of choice and to take care of the other things needed to use the C code
we want to glue. The wrapper code should be considered a black box.
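The four steps above can be sketched as a small Python helper that only assembles the command lines and does not run them. This is a minimal sketch, not part of SWIG: the helper name build_commands is hypothetical, and the file names and include path are the ones this article uses for its "hello world" example. Writing the interface file (step one) remains a manual task.

```python
# Sketch: assemble the SWIG/gcc command lines for the build steps.
# Nothing is executed; build_commands is a hypothetical helper name.

def build_commands(module, source, interface, python_include):
    """Return the command lines for steps two to four, in order."""
    stem = interface.rsplit(".", 1)[0]            # "hello.i" -> "hello"
    wrapper = stem + "_wrap.c"                    # file that SWIG generates
    wrapper_obj = stem + "_wrap.o"
    source_obj = source.rsplit(".", 1)[0] + ".o"  # "hello.c" -> "hello.o"
    return [
        # step 2: let SWIG generate C wrapper code from the interface file
        ["swig", "-python", interface],
        # step 3: compile the wrapper and the original source to object files
        ["gcc", "-c", wrapper, "-I", python_include],
        ["gcc", "-c", source],
        # step 4: link the object files into one shared library
        ["gcc", "-shared", wrapper_obj, source_obj, "-o", "_" + module + ".so"],
    ]


if __name__ == "__main__":
    for command in build_commands("printhelloworld", "hello.c", "hello.i",
                                  "/usr/include/python2.2/"):
        print(" ".join(command))
```

Keeping the commands as data makes it easy to print them for inspection first, or to hand each list to subprocess.run once SWIG and gcc are known to be installed.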
A simple example
Imagine we have the following program (“hello.c”):

#include <stdio.h>
#include "hello.h"

int printhelloworld() {
    printf("hello world\n");
    return 0;
}

int main(int argc, char *argv[]) {
    printhelloworld();
    return 0;
}

with header file “hello.h”:

int printhelloworld();

This program will print “hello world” on stdout and then exit. Suppose
we now want to be able to use the function “printhelloworld” from
within a Python script.
As described earlier, we need an interface file (“hello.i”):

%module printhelloworld
%{
#include "hello.h"
%}
int printhelloworld();

In this case we’re only interested in the function printhelloworld().
Because that is the only function in the program (leaving out “main()”
for convenience), we could also have written:

%module printhelloworld
%{
#include "hello.h"
%}
%include "hello.h"

which parses the header file in its entirety and generates wrapper code
for all the functions declared in it, making them accessible from
Python.
The Python module which we will generate will be called
“printhelloworld” (defined with the “%module” directive). Everything
between the “%{” and “%}” markers on lines 2 and 4 will be included
verbatim in the generated wrapper code, which is generated by running
the following command:

$ swig -python hello.i

The result is a wrapper file called “hello_wrap.c”. For other scripting
languages another option would be used (for example, “-perl5”).
The wrapper file is then compiled into an object file:

$ gcc -c hello_wrap.c -I /usr/include/python2.2/

In this case we have to include the path to the “Python.h” header file,
because by default it’s not in the search path on Red Hat based systems
(such as Fedora). The location of this header file varies. As an
example, on Fedora Core 2 you would need to use
/usr/include/python2.3/ instead.
To be able to link with the actual program code from “hello.c” we now
need to create an object file from the source file:

$ gcc -c hello.c

As a final step we link everything into one shared object:

$ gcc -shared hello_wrap.o hello.o -o _printhelloworld.so

Tip: If you are developing a program with SWIG, typing in all these
commands soon becomes boring and irritating. To avoid this, it is
advisable to write a simple Makefile instead.
That’s all you need to do! To test it, you can either fire up the
Python interpreter and test it manually, or write a small Python
program (“test.py”) that does the job for you. The Python interpreter
will search for “_printhelloworld.so” in the Python library path and if
it cannot find it, Python will search in the current working directory.
You will have to make sure that you either put it in a global directory
(like /usr/lib/python2.2/) or that you launch the script in the
directory where the library is located.

import printhelloworld
printhelloworld.printhelloworld()

Running this program produces the following output:

[armijn@swig]$ python test.py
hello world

Exchanging data between Python and C
Of course, in real life things aren’t as easy as in the previous
example. Often there is some sort of data exchange taking place between
the calling code (the script) and the library, for example, a custom
data structure returned by a function.
SWIG has native support for converting basic C types (int, short, long,
char, bool) to and from scripting languages. SWIG treats everything
else as a pointer. This has some consequences when you are passing
around advanced data structures.
As an example, to us a string in Python and a string in C/C++ are not
that different, but if you look at the implementations they are not the
same at all. A Python string is not a char*, but a PyString, which is a
different datatype. You cannot give a C function that expects a char* a
PyString instead.
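The difference between a Python string object and a C char* can be seen without SWIG by using the standard ctypes module, which is a different mechanism from the one this article describes. The sketch below assumes a current Python 3 interpreter (the article itself targets Python 2, where PyString is the string type); the variable names are illustrative only.

```python
import ctypes

text = "hello world"             # a Python string object, not a char*

# A C function wants a NUL-terminated byte buffer, so the text is first
# encoded to bytes and then exposed as a char* through ctypes.
raw = text.encode("ascii")
c_string = ctypes.c_char_p(raw)

# Handing over the Python string object itself is rejected, because it
# is a different datatype from the byte buffer that C expects.
try:
    ctypes.c_char_p(text)
    rejected = False
except TypeError:
    rejected = True

print(c_string.value)            # the bytes seen from the C side
print(rejected)                  # whether the plain string was refused
```

The explicit encode step plays the role that a conversion in the generated wrapper code plays in SWIG: something has to turn the interpreter's object into the representation the C function expects.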
Another example: in Python the range
of integers is bigger than the range of
integers in C (and in fact it corresponds
to a “long” in C). If you do data
exchange other than with simple types
some conversion has to be done. In
SWIG this is not that hard, but it comes
at a price: the conversion is specific to a
language and you lose the ability to use
one interface file for generating glue
code for different languages.
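The mismatch in integer ranges can be made visible with the standard ctypes module. This is a minimal sketch under the assumption of a current Python 3 interpreter, where integers have arbitrary precision (in the Python 2 releases the article discusses, a plain integer corresponded to a C long). The helper fits_in_c_int is a hypothetical name.

```python
import ctypes

# Width of a C int on this platform, and the largest value it can hold.
bits = 8 * ctypes.sizeof(ctypes.c_int)
c_int_max = 2 ** (bits - 1) - 1

big = c_int_max + 1                   # a perfectly valid Python integer
wrapped = ctypes.c_int(big).value     # ctypes does no overflow checking


def fits_in_c_int(value):
    """Range check a conversion layer would need before calling into C."""
    return -c_int_max - 1 <= value <= c_int_max


print(big, wrapped)
print(fits_in_c_int(big), fits_in_c_int(c_int_max))
```

Without a check like fits_in_c_int, the value silently wraps around on its way into C, which is exactly the kind of problem a conversion layer has to guard against.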
To convert data between two datatypes SWIG uses so-called typemaps.
With typemaps you can redefine the way SWIG generates wrapper code.
There are quite a few uses for typemaps, including argument type
checking, handling exceptions (in C++), and conversion of arguments.
Here we will only look at argument conversion, for which we will look
at a beefier piece of code, namely libusb.
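Before turning to the SWIG syntax, the idea of argument checking and conversion can be mimicked in plain Python. The sketch below is not SWIG code: convert_in is a hypothetical function, and it uses the standard ctypes module under Python 3 to turn one Python object into the pair of values (a character buffer and its length) that a C function would expect.

```python
import ctypes


def convert_in(value):
    """Turn one Python object into a (char buffer, size) pair for C."""
    # Argument type checking: refuse anything that is not a byte string.
    if not isinstance(value, bytes):
        raise ValueError("Expecting a string")
    # Argument conversion: copy the data into a C character buffer and
    # derive the size argument from the Python object itself.
    buffer = ctypes.create_string_buffer(value, len(value))
    return buffer, len(value)


if __name__ == "__main__":
    buffer, size = convert_in(b"\x01\x02\x03")
    print(size, buffer.raw)
```

The caller supplies a single object and the size is worked out for it, which is the convenience a typemap brings to the scripting side.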
A common technique for exchanging data in C is to pass a function a
pointer to a piece of memory that holds the data or where the function
can store data. libusb uses this technique to transfer data to a USB
device:

int usb_bulk_write(usb_dev_handle *dev, int ep, char *bytes, int size, int timeout);

This function writes an array of chars to an endpoint on a USB device.
It does this by passing a char pointer (“array of chars”) and an
integer with the size of the array.
In Python you just want to pass an array of chars (which is exactly the
same as a string in Python). Furthermore, arrays are not statically
sized in Python, but dynamically, so it doesn’t make much sense to also
supply a size argument.
With the following typemap (see Listing 1) we can convert an array in
Python to a char* and int in C.

Listing 1: Converting an array
%typemap(in) (char *bytes, int size) {
  /* refuse anything that is not a Python string */
  if (!PyString_Check($input)) {
    PyErr_SetString(PyExc_ValueError, "Expecting a string");
    return NULL;
  }
  /* one Python argument becomes two C arguments */
  $1 = (void *) PyString_AsString($input);
  $2 = PyString_Size($input);
}

In Python you can now simply write:

libusb.usb_bulk_write(device, endpoint, bytestring, 2048)

where “bytestring” is a string, “device” is an object representation of
a USB device, “endpoint” is the number of an endpoint on that device
and the last argument is the timeout.
This typemap is an “in typemap”, which means it’s converting data that
is flowing from the scripting language into C. Typemaps work by pattern
matching on the arguments and this one works for every
(char *bytes, int size) pair it sees (and only in that precise order).
One thing you should keep in mind is that the matching in the typemap
describes the parameter sequence on the C/C++ side, not on the side of
the scripting language.
As you can see in the typemap, it checks if the data given to it is
really a PyString ($input is a special variable that holds the value to
be converted) and if it is, creates two arguments for the C function:
it converts the original PyString to a char* and calculates the int
parameter by taking the size of the PyString that was passed. The
wrapper then calls the function usb_bulk_write with all the original
parameters and the two newly created parameters in place of the
PyString.
Getting data from C works in a similar way. The C function
usb_bulk_read is defined as in Listing 2.

Listing 2: usb_bulk_read
int usb_bulk_read(usb_dev_handle *dev, int ep, char *bytes, int size, int timeout);

%typemap(argout) (char *bytes, int size) {
  /* drop the reference to any previous result object */
  Py_XDECREF($result);
  if (result < 0) {
    free($1);
    PyErr_SetFromErrno(PyExc_IOError);
    return NULL;
  }
  /* hand the bytes that were read back to Python as a string */
  $result = PyString_FromStringAndSize($1, result);
  free($1);
}

The “argout typemap” does the opposite of the “in typemap”: it converts
outgoing parameters back into Python. Apart from the first bit (which
drops the reference to any previous result object, if there is one) the
code is pretty straightforward. What happens is that the char* buffer
is converted to a Python string and returned using the special variable
$result.
In Python you can now simply say:

read_result = libusb.usb_bulk_read(device, ep, 16, 1024)

Apart from the “argout” typemap there is also an “out” typemap, which
converts the return value of the C function. In this case using an
“out” typemap does not have much effect, because the data we are
interested in is passed around via a char*. The return value of
“usb_bulk_read” is an integer indicating whether reading was successful
or not, not the data itself.
When you define a typemap, every function declaration that follows in
the SWIG interface file is covered by that typemap. Typemaps can be
overridden by defining new typemaps or by redefining old ones.

Going further
Creating a direct mapping from C to a scripting language isn’t that
difficult: basic mappings for Python for libusb cost me about a day
with hardly any prior knowledge of SWIG, except for some “hello world”
examples.
One of the real challenges when gluing a scripting language to C/C++
code is to encapsulate the functions in such a way that they feel
natural to programmers of that particular scripting language, for
example, as modules in the case of Python. This is probably the hardest
part of using SWIG, and probably the only way to get this right is a
lot of practice.
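One way to make flat, C-style functions feel more natural in Python is to wrap them in a small class. The sketch below only illustrates that idea. FakeLibusb is a hypothetical stand-in so that the example runs without hardware; it is not the real generated module, and its two functions merely copy the call shapes used in this article. The Endpoint class and its method names are likewise assumptions.

```python
class FakeLibusb:
    """Hypothetical stand-in for a generated wrapper module (not real API)."""

    def __init__(self):
        self.sent = []

    def usb_bulk_write(self, device, endpoint, data, timeout):
        self.sent.append((endpoint, data))   # pretend the data was sent
        return len(data)                     # number of bytes written

    def usb_bulk_read(self, device, endpoint, size, timeout):
        return b"\x00" * size                # pretend the device sent zeros


class Endpoint:
    """Object-style view of one endpoint on top of the flat functions."""

    def __init__(self, lib, device, number, timeout=1000):
        self._lib = lib
        self._device = device
        self._number = number
        self._timeout = timeout

    def write(self, data):
        written = self._lib.usb_bulk_write(
            self._device, self._number, data, self._timeout)
        if written < 0:                      # C-style error code -> exception
            raise IOError("bulk write failed")
        return written

    def read(self, size):
        return self._lib.usb_bulk_read(
            self._device, self._number, size, self._timeout)


if __name__ == "__main__":
    endpoint = Endpoint(FakeLibusb(), device=None, number=2)
    print(endpoint.write(b"abc"))
    print(endpoint.read(4))
```

The device handle, endpoint number and timeout are stored once, so calling code deals with objects and exceptions instead of repeating the same C-style argument list on every call.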
INFO
SWIG homepage: http://www.swig.org/