terça-feira, 15 de setembro de 2009

Kernel Modules in Haskell

If you love Haskell and Linux then today is your day – today we reconcile the two and allow you to write Linux Kernel modules in Haskell. By making GHC and the Linux build system meet in the middle we can have modules that are type safe and garbage collected. Using the copy of GHC modified for the House operating system as a base, it turns out to be relatively simple to make the modifications necessary to generate object files for the Kernel environment. Additionally, a new calling convention (regparm3) was added to make it easier to import (and export) functions from the Kernel.

I haven’t made a dry run through all these instructions, but have done each part at separate times and believe them correct – feel free to comment with bug reports!


Building GHC to Make Object files for Linux Modules

Start by downloading the House 0.8.93 [1]. Use the build system to acquire ghc-6.8.2 and apply the House patches. The House patches allow GHC to compile Haskell binaries that will run on bare metal (x86) without an underlying operating system, so this makes a good starting point.

 > wget http://web.cecs.pdx.edu/~kennyg/house/House-0.8.93.tar.bz2
> tar xjf House-0.8.93.tar.bz2
> cd House-0.8.93
> make boot
> make stamp-patch

Now acquire the extra patch which changes the RTS to use the proper Kernel calls, instead of allocating its own memory, and to respect the current interrupt level. This patch also changes the build options to avoid common area blocks for uninitilized data (-fno-common) and use frame pointers (-fno-omit-frame-pointers).

 > wget https://projects.cecs.pdx.edu/~dubuisst/hghc.patch
> patch -d ghc-6.8.2 -p1 < hghc.patch
> make stamp-configure
> make stamp-ghc # makes ghc stage 1

Next, build a custom libgmp with the -fno-common flag set. This library is needed for the Integer support in Haskell.

 > wget ftp://ftp.gnu.org/gnu/gmp/gmp-4.3.1.tar.bz2
> tar xjf gmp-4.3.1.tar.bz2
> cd gmp-4.3.1
> ./configure
# edit 'Makefile' and add '-fno-common' to the end of the 'CFLAGS = ' line.
> make
> cp .libs/libgmp.a $HOUSE_DIR/support

Apply ’support.patch’ to alter the build systems of the libtiny_{c,gcc,gmp}.a and build the libraries.

 > wget https://projects.cecs.pdx.edu/~dubuisst/support.patch
> patch -d $HOUSE_DIR < support.patch
> make -C $HOUSE_DIR/support

Build the cbits object files:

 > make -C $HOUSE_DIR/kernel cobjs

In preparation for the final linking, which is done manually, pick a working directory ($WDIR) that will serve to hold the needed libraries. Make some last minute modifications to the archives and copy libHSrts.a, libcbits.a, libtiny_gmp.a, libtiny_c.a, and libtiny_gcc.a

 > mkdir $WDIR
> ar d support/libtiny_c.a dlmalloc.o # dlmalloc assumes it manages all memory
> ar q $WDIR/libcbits.a kernel/cbits/*.o
> cp ghc-6.8.2/rts/libHSrts.a support/libtiny_{c,gcc,gmp}.a $WDIR


Build a Kernel Module

First, write the C shim that will be read by the Linux build system. While it might be possible to avoid C entirely its easier to use the build system, and its plethora of macros, than fight it. The basic components of the shim are a license declaration, function prototypes for the imported (Haskell) functions, initialization, and exit functions. All these can be seen in the example hello.c [2].

Notice that many of the standard C functions in the GHC RTS were not changed by our patches. To allow the RTS to perform key actions, such as malloc and free, the hello.c file includes shim functions such as ‘malloc’ which simply calls ‘kmalloc’. Any derivative module you make should include these functions either in the main C file or a supporting one.

Second, write a Haskell module and export the initialization and exit function so the C module may call them. Feel free to import kernel functions, just be sure to use the ‘regparm3′ key word in place of ‘ccall’ or ’stdcall’. For example:

 foreign import regparm3 unsafe foo :: CString -> IO CInt
foreign export regparm3 hello :: IO CInt

Continuing the example started by hello.c, ‘hsHello.hs’ is online [3].

Now start building the object files. Starting with building hsHello.o, you must execute:

        > $HOUSE_DIR/ghc-6.8.2/compiler/stage1/ghc-inplace -B$HOUSE_DIR/ghc-6.8.2  hsHello.hs -c

* note that this step will generate or overwrite any hsHello_stub.c file.

When GHC generates C code for exported functions there is an implicit assumption that the program will be compiled by GHC. As a result the nursery and most the RTS systems are not initialized so the proper function calls must be added to hsHello_stub.c.

Add the funcion call “startupHaskell(0, NULL, NULL);” before rts_lock() in the initializing Haskell function. Similarly, add a call to “hs_exit_nowait()” after rts_unlock().

The stub may now be compiled, producing hsHello_stub.o. This is done below via hghc, which is an alias for our version of ghc with many flags [4].

 > hghc hsHello_stub.c -c

The remaining object files, hello.o and module_name.mod.o, can be created by the Linux build system. The necessary make file should contain the following:

 obj-m := module.o  # Obviously you should name the module as you see fit
module-objs := hello.o

And the make command (assuming the kernel source is in /usr/src/kernels/):

 > make -C /usr/src/kernels/2.6.29.6-217.2.16.fc11.i586 M=`pwd` modules

This should make “hello.o” and “module.mod.o”. Everything can now be linked together with a single ld command.

 > ld -r -m elf_i386 -o module.ko hsHello_stub.o hsHello.o module.mod.o hello.o *.a libcbits.a

A successful build should not have any common block variables and the only undefined symbols should be provided by the kernel, meaning you should recognize said functions. As a result, the first command below should not result in output while the second should be minimal.

 > nm module.ko | egrep “^ +C ”
> nm module.ko | egrep “^ +U ”
U __kmalloc
U kfree
U krealloc
U mcount
U printk


Known Issue

The House-GHC 6.8.2 run-time system (RTS) does not clean up the allocated memory on shutdown, so adding and removing kernel modules results in large memory leaks which can eventually crash the system. This should be relatively easy to fix, but little investigation was done.

A good estimate of the memory leak is the number of megabytes in the heap (probably 1MB, unless your module needs lots of memory) plus 14 bytes of randomly leaked memory from two unidentified (6 and 8 byte) allocations.

[1] http://web.cecs.pdx.edu/~kennyg/house/
[2] https://projects.cecs.pdx.edu/~dubuisst/hello.c
[3] https://projects.cecs.pdx.edu/~dubuisst/hsHello.hs
[4] $HOUSE_DIR/ghc-6.8.2/compiler/stage1/ghc-inplace -B$HOUSE_DIR/ghc-6.8.2 -optc-fno-common -optc-Wa,symbolic -optc-static-libgcc -optc-nostdlib -optc-I/usr/src/kernels/2.6.29.6-213.fc11.i586/arch/x86/include/ -optc-MD -optc-mno-sse -optc-mno-mmx -optc-mno-sse2 -optc-mno-3dnow -optc-Wframe-larger-than=1024 -optc-fno-stack-protector -optc-fno-optimize-sibling-calls -optc-g -optc-fno-dwarf2-cfi-asm -optc-Wno-pointer-sign -optc-fwrapv -optc-fno-strict-aliasing -I/usr/src/kernels/2.6.29.6-213.fc11.i586/include/ -optc-mpreferred-stack-boundary=2 -optc-march=i586 -optc-Wa,-mtune=generic32 -optc-ffreestanding -optc-mtune=generic -optc-fno-asynchronous-unwind-tables -optc-pg -optc-fno-omit-frame-pointer -fvia-c

Nenhum comentário: