When renovating a building, it is common to do this floor by floor. In this sense, Ocamlnet 3.0.0 focused on the foundation and the first floor. Also, the renovation is not yet finished - many features still need to be added, like supporting SSL for more protocols. This is now easier thanks to some new basic APIs that have been introduced in the first step.
One of the parts that got most attention is Netsys, the
library adding the missing links to the operating system (OS). One of
the driving forces was the port to Win32. This led to the introduction
of generalized versions of the Unix.read and Unix.write
calls (defined in Netsys):
val gread : fd_style -> Unix.file_descr -> string -> int -> int -> int
val gwrite : fd_style -> Unix.file_descr -> string -> int -> int -> int
For getting some Win32-specific emulations right, it is sometimes required to call other functions instead of
Unix.read
and Unix.write, e.g. Netsys_win32.pipe_read
and Netsys_win32.pipe_write. To avoid scattering such
case distinctions over the whole library, the idea of
defining these generic functions was born. In fd_style the
user specifies how the descriptor is to be handled.
Usually fd_style is determined automatically
by another function, get_fd_style (this requires a few
system calls and is factored out for that reason). Although this mostly
targets Win32, there are already
some benefits for POSIX systems, e.g. the fd_style already
encodes whether a descriptor is a socket, and whether it is connected,
which is sometimes quite useful information. In the future, this system
will be extended:
Seekable files are currently not well supported by the
asynchronous I/O layer. The reason is that the select
and poll system calls cannot predict whether I/O would be
blocking or non-blocking (and thus always say non-blocking).
This can be improved by using special AIO calls of the OS.
Of course, files for which AIO is to be used need to be
flagged specially, and a new fd_style could
do so.
There are also some ideas for labeling SSL sockets with a special
fd_style. This would make it a bit easier to
support SSL throughout the library. This is a bit more work
than just calling Ssl.read and Ssl.write,
though, because the SSL protocol allows renegotiations at any
time, and a read may also require writes on the socket level,
and vice versa.
Also new at the Netsys level is a little object
definition called pollset:
class type pollset =
object
method find : Unix.file_descr -> Netsys_posix.poll_req_events
method add : Unix.file_descr -> Netsys_posix.poll_req_events -> unit
method remove : Unix.file_descr -> unit
method wait : float ->
( Unix.file_descr *
Netsys_posix.poll_req_events *
Netsys_posix.poll_act_events ) list
method dispose : unit -> unit
method cancel_wait : bool -> unit
end
A pollset represents a set of file descriptor events one
wants to poll. Again, this data structure was originally required
for the Win32 port (because Win32 is very different in this respect),
but there are also advantages for Unix systems. Nowadays, there are
various improved APIs for polling such as Linux epoll or BSD kqueue.
The pollset abstraction will make it very easy to support
these - the user simply selects one of the advanced implementations
of pollset, and thanks to dynamic binding of object
methods it is automatically used everywhere. (One of the next
versions of Ocamlnet will allow this.)
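The idea of swapping pollset implementations can be sketched with a toy model. Everything in it is an assumption made for illustration: toy_pollset is a simplified stand-in for the class type above (descriptors are plain ints, events a small variant), and the two classes are hypothetical implementations, not anything Ocamlnet ships.

```ocaml
(* Toy model, NOT the Netsys API: descriptors are plain ints and events
   a small variant, so that the sketch runs without any library. *)
type event = Readable | Writable

class type toy_pollset =
  object
    method add : int -> event list -> unit
    method remove : int -> unit
    method find : int -> event list
    method wait : float -> (int * event list) list
  end

(* First implementation: reports every registered descriptor as ready. *)
class always_ready : toy_pollset =
  object
    val tbl : (int, event list) Hashtbl.t = Hashtbl.create 16
    method add fd evs = Hashtbl.replace tbl fd evs
    method remove fd = Hashtbl.remove tbl fd
    method find fd = Hashtbl.find tbl fd
    method wait (_timeout : float) =
      List.sort compare
        (Hashtbl.fold (fun fd evs acc -> (fd, evs) :: acc) tbl [])
  end

(* Second implementation: same interface, but only reports descriptors
   that were registered for reading. *)
class readable_only : toy_pollset =
  object
    val tbl : (int, event list) Hashtbl.t = Hashtbl.create 16
    method add fd evs = Hashtbl.replace tbl fd evs
    method remove fd = Hashtbl.remove tbl fd
    method find fd = Hashtbl.find tbl fd
    method wait (_timeout : float) =
      List.sort compare
        (Hashtbl.fold
           (fun fd evs acc ->
              if List.mem Readable evs then (fd, evs) :: acc else acc)
           tbl [])
  end

(* Client code only knows the class type; which wait method runs is
   decided by dynamic binding. *)
let ready_fds (ps : toy_pollset) : int list =
  List.map fst (ps#wait 0.0)

let demo (ps : toy_pollset) : int list =
  ps#add 3 [Readable];
  ps#add 4 [Writable];
  ready_fds ps
```

Calling demo (new always_ready) and demo (new readable_only) runs the same client code against two different wait methods, which is the property the text relies on for epoll or kqueue back-ends.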
Another word about polling. The Ocaml runtime only provides
select. Although not as bad as some people claim,
it imposes artificial limitations, especially on the number of
supported file descriptors. Because of this, Netsys_posix
now includes a binding of the poll system call, which does not
suffer from this limitation. Of course, poll is now the
only poll API used throughout Ocamlnet (and, as noted, even better
APIs will be supported in one of the next releases).
Other additions on the OS level for Unix systems:
Netsys_posix.spawn is a new way of starting subprograms,
with special support for monitoring the subprocesses asynchronously
Netsys_posix
fsync and fdatasync are supported
fadvise can be invoked to control the page cache
fallocate to allocate disk space, as far as the OS provides it
Netsys_signal coordinates signal handling,
so that various users of signals do not override each other's handlers
For all systems, Netsys implements:
In Netsys_mem there is now special support for
using bigarrays of chars as efficient I/O buffers. Such
bigarray-backed buffers are called memory (reminding us
of the fact that these buffers are not relocatable like strings, but
bound to fixed memory addresses). There are functions for allocating
page-aligned or cache-line-aligned memory buffers. Also,
there is experimental support for copying Ocaml values into buffers
(used by the Camlbox module, see below). Finally, there are also
versions of read, write, recv
and send operating on memory buffers rather than strings.
These versions open the door to zero-copy network I/O (if supported by
the OS).
Netsys_oothr).
Netexn is now almost outdated,
because the Ocaml standard library recently introduced a similar
feature (yes, sometimes feature wishes are honoured :-).
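To make the "memory" buffers mentioned for Netsys_mem a bit more concrete, here is a minimal sketch using only the standard Bigarray module (part of the standard library since OCaml 4.07; older compilers need the separate bigarray library). The type alias and the three helpers are hypothetical names for illustration and are not the Netsys_mem functions. The point is that a sub-array is a view on the same fixed memory, so slicing a buffer copies nothing.

```ocaml
(* A buffer in the sense of the text: a bigarray of chars.
   The alias is an assumption for this sketch. *)
type memory =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

let create_memory (n : int) : memory =
  Bigarray.Array1.create Bigarray.char Bigarray.c_layout n

(* Copy a string into the buffer, starting at offset pos. *)
let blit_string_to_memory (s : string) (m : memory) (pos : int) : unit =
  String.iteri (fun i c -> Bigarray.Array1.set m (pos + i) c) s

(* Copy len chars starting at pos out of the buffer into a fresh string. *)
let string_of_memory (m : memory) (pos : int) (len : int) : string =
  String.init len (fun i -> Bigarray.Array1.get m (pos + i))

let demo () : string =
  let buf = create_memory 16 in
  Bigarray.Array1.fill buf ' ';
  blit_string_to_memory "hello" buf 0;
  (* sub creates a view on the same memory; nothing is copied. *)
  let view = Bigarray.Array1.sub buf 0 5 in
  Bigarray.Array1.set view 0 'j';
  (* The write through the view is visible in the original buffer. *)
  string_of_memory buf 0 5

let () = print_endline (demo ())   (* prints "jello" *)
```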
As Netsys now uses pollsets to manage
polling, Equeue had to be rewritten to take advantage of
this. In particular, there is now Unixqueue_pollset which
is a port of the old Unixqueue API around pollsets. For
the user, there is absolutely no difference.
What's more important is the extension of the engine API. Ocamlnet 2
introduced engines as a way of expressing a suspended I/O possibility,
but there was only limited support for it in the library. This has now
changed - engines are now a first-class member of Ocamlnet. In particular,
there are now many more synchronization primitives (e.g.
stream_seq_engine for executing an open number of engines
in sequence, or msync_engine for waiting for the
completion of multiple engines). This development was mostly driven by
another project of mine: Plasma (see other blog articles on this
site). Plasma uses engines for all kinds of concurrent execution of
I/O code, and while I was developing Plasma, I extended the Ocamlnet
engine API step by step.
There is also now a way to call RPC procedures with an engine:
Rpc_proxy.ManagedClient.rpc_engine. This function was
originally also developed for the Plasma project.
For simpler I/O needs, I added Uq_io. It contains
"engineered" versions of simple I/O functions like input,
input_line or flush. Uq_io is
not limited to file descriptors, but also works on top of a number
of other I/O devices (including virtual ones).
The operators ++ and >> have been
introduced as abbreviations for sequential execution, and result
mapping of engines, respectively. For example, the synchronous
code
let line1 = input_line ch_in in
let line2 = input_line ch_in in
output_string ch_out (line1 ^ line2 ^ "\n")
would now look like this in "engineered" code:
Uq_io.input_line_e d_in ++
(fun line1 ->
Uq_io.input_line_e d_in ++
(fun line2 ->
Uq_io.output_string_e d_out (line1 ^ line2 ^ "\n")
)
)
Not bad, if you compare with the previous solution (hand-coding a
scanner for lines, writing the event handler routines, etc., adding
up to 100-200 lines of code).
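To see why the two operators compose so nicely, here is a toy model of engines as continuation-passing computations. It is only a sketch under stated assumptions: the engine type, the in-memory devices and both operators are simplified stand-ins written for this example, not the Ocamlnet definitions (in particular, this toy >> maps plain results and ignores error and abort states, and nothing here is asynchronous).

```ocaml
(* Toy engine: a computation that hands its result to a continuation. *)
type 'a engine = ('a -> unit) -> unit

(* Sequential execution: run e, pass its result to f, then run the
   engine that f returns. *)
let ( ++ ) (e : 'a engine) (f : 'a -> 'b engine) : 'b engine =
  fun k -> e (fun x -> f x k)

(* Result mapping: run e and transform its result with f. *)
let ( >> ) (e : 'a engine) (f : 'a -> 'b) : 'b engine =
  fun k -> e (fun x -> k (f x))

(* Hypothetical in-memory devices: a queue of lines as input,
   a Buffer as output. *)
let input_line_e (d : string Queue.t) : string engine =
  fun k -> k (Queue.pop d)

let output_string_e (d : Buffer.t) (s : string) : unit engine =
  fun k -> Buffer.add_string d s; k ()

(* Run an engine and collect its final result, if it produced one. *)
let run (e : 'a engine) : 'a option =
  let r = ref None in
  e (fun x -> r := Some x);
  !r

let demo () : string * int option =
  let d_in = Queue.create () in
  Queue.add "foo" d_in;
  Queue.add "bar" d_in;
  let d_out = Buffer.create 16 in
  let e =
    input_line_e d_in ++
      (fun line1 ->
         input_line_e d_in ++
           (fun line2 ->
              output_string_e d_out (line1 ^ line2 ^ "\n")))
    >> (fun () -> Buffer.length d_out)
  in
  (* Nothing has been read or written yet; run executes the chain. *)
  let n = run e in
  (Buffer.contents d_out, n)
```

Here demo () evaluates to ("foobar\n", Some 7): building e only composes closures, and the reads and the write happen when run is called.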
Work in the Netplex area was focused on easing
multi-processing. With Netplex it is very easy to run
code in several worker processes, e.g. for network servers. What was
missing up to now, however, was an easy way to manage the
collaboration of the processes.
Netplex worker processes now have a number of ways to
talk to each other:
Netplex_sharedvar).
Of course, this mechanism is typed.
Netplex_mutex and Netplex_semaphore)
The implementation of these mechanisms is not yet optimal, but the APIs
are defined and backed by simple but robust modules. It is expected that
in the future more sophisticated implementations will become available,
e.g. the Netplex_sharedvar code could use a shared memory object
if the OS supports that.
Another addition is "levers". This kind of handle exists within the Netplex master process, but can be activated from the child processes. It is a kind of little RPC function for a special purpose: Sometimes the process model requires that certain functionality be executed within the scope of the master process. An example would be the start of another child process. By doing that via a lever, this action can also be triggered from any child process.
Besides that there are numerous smaller enhancements. Especially
the module Netplex_cenv has been extended, e.g. there
are now timers that can be attached to the Netplex event queue.
The improved client is called Rpc_proxy. All the experience
I gained at my Ocaml job at Mylife.com went into it - lots of RPC calls
in an unreliable environment (if you have hundreds of machines, one
box is always down). Clients can now be recycled, they can react
better to errors, and even load balancing and fail-over to alternate
endpoints are now supported. (See the other blog posting, "The next
server, please!".)
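As a rough illustration of what fail-over to alternate endpoints means, the following sketch tries a list of endpoints in order and returns the first successful reply. It is a hypothetical helper with a simulated call function, written only for this example; it does not use or mirror the Rpc_proxy API.

```ocaml
(* Hypothetical helper, not part of Rpc_proxy: try the endpoints in
   order and return the first successful result. If all fail, report
   the error of the last endpoint tried. *)
let call_with_failover
    (endpoints : string list)
    (call : string -> ('a, string) result) : ('a, string) result =
  let rec loop last_error = function
    | [] -> Error last_error
    | ep :: rest ->
        (match call ep with
         | Ok _ as ok -> ok
         | Error msg -> loop (ep ^ ": " ^ msg) rest)
  in
  loop "no endpoints configured" endpoints

(* Simulated RPC call: host "a" is down, every other host answers. *)
let fake_call (host : string) : (string, string) result =
  if host = "a" then Error "connection refused"
  else Ok ("reply from " ^ host)

let () =
  match call_with_failover ["a"; "b"; "c"] fake_call with
  | Ok reply -> print_endline reply
  | Error msg -> print_endline ("all endpoints failed, last error: " ^ msg)
```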
Performance improvements were achieved by two means: First, the XDR
encoding and decoding was optimized. This work is not finished
yet, but certain XDR types like arrays of strings are now processed a
lot faster. The other strategy was to replace many string buffers by
bigarrays of char (see under "memory" above). This makes it possible to get rid
of a number of copy operations, especially when large strings are
transmitted via RPC. This new string representation is even accessible
by user code via a new XDR type _managed string. This
may avoid even more copies.
The API of Shell is mostly the same - only a few
suspicious functions have been removed. The implementation, however,
has changed a lot.
Shell now uses the new Netsys functions for
starting subprocesses. As these functions are written in C, one gets
some immediate benefits: Shell is now officially supported
for multi-threaded programs because it is possible to do the signal
handling right in C (but still, this is notoriously difficult). Also,
there is no longer any risk that the Ocaml garbage collector wants to
clean up at the worst moment, namely between fork and exec.
Another benefit is that Shell now also works under Win32.
The C part is completely different, though.
Netcgi1
is gone now.
This works as follows: If process 1 wants to send process 2 a message, both have to map the same memory pages into their address space. The message is originally an Ocaml value somewhere in the private memory of process 1. With the help of Camlbox this value is now copied to shared memory so that, and this is the pivotal point, process 2 can directly access the value without an additional decoding step. This greatly reduces the overhead of message sending - actually only a relatively fast value copy is done, bypassing any kernel-controlled I/O devices.
For passing a short message, this takes now only a few microseconds. Most of that time is spent for synchronization, of course, not for copying. (On the hardware level, the synchronization is mostly done by moving cache lines from one CPU core to the other, so this is some kind of hidden copying. It is worth noting that Camlboxes are way faster on single-core machines than on multi-cores because this low-level synchronization is not required then.)
Camlboxes have one downside, though. They are not perfectly integrated into the garbage collecting machinery, and because of this, one has to follow some programming rules. In particular, there is no way to recognize that a message (or part of it) is no longer referenced, so messages are manually deleted, and there is of course the danger that bad code keeps references to (or into) deleted messages. For fixing this, we would need more help by the Ocaml GC.
Another problem is missing integration with Equeue.
Camlboxes are synchronous by design - that's the price for their speed.
by Matías Giovannini (noreply@blogger.com) at September 01, 2010 09:21 PM
Some careful readers of Planet OCamlCore may wonder why the OCaml packages in Debian have not yet been upgraded to 3.12.0. For the Planet Debian readers, this is the latest version of the Objective Caml programming language.
The answer is simple: Debian Squeeze froze on 6th August. This means that Debian folks focus on fixing release critical bugs and avoid doing big transitions in unstable (Sid). In particular, the Debian OCaml maintainers have decided to keep OCaml 3.11.2 for Squeeze, because the delay was really too short: OCaml 3.12 was out on 2nd August.
Great work has already been done by S. Glondu and the rest of the Debian OCaml maintainers to spot possible problems. The result was a series of bugs submitted to the Debian BTS. This effort started quite early and has been updated with the various OCaml release candidates.
S. Glondu has also built an unofficial Debian repository of OCaml 3.12.0 packages here.
Let's use it to experiment with OCaml 3.12.0.
Following my last post about schroot and CentOS, we will use a schroot to isolate our installation of unofficial OCaml 3.12.0 packages.
approx is a caching proxy server for Debian archive files. It is very effective and simple to set up. It is already on my server (Debian Lenny, approx v3.3.0). I just have to add a single line to create a proxy for OCaml 3.12 packages:
$ echo "ocaml-312 http://ocaml.debian.net/debian/ocaml-3.12.0" >> /etc/approx/approx.conf
$ invoke-rc.d approx restart
approx is written in OCaml, in case you want to know how I came across it.
We create a chroot environment with Debian Sid:
# PROXY = host where approx is installed, debian/ points to official Debian repository of
# your choice.
$ debootstrap sid sid-amd64-ocaml312 http://PROXY:9999/debian
We create a section for sid-amd64-ocaml312 in /etc/schroot/schroot.conf (Debian Lenny):
[sid-amd64-ocaml312]
description=Debian sid/amd64 with OCaml 3.12.0
type=directory
location=/srv/chroot/sid-amd64-ocaml312
priority=3
users=XXX
root-groups=root
run-setup-scripts=true
run-exec-scripts=true
Replace XXX by your login.
And we install additional software:
$ schroot -c sid-amd64-ocaml312 apt-get update
$ schroot -c sid-amd64-ocaml312 apt-get install vim-nox sudo
Now we can start the setup to access OCaml 3.12.0 packages.
The repository is signed with S. Glondu's GPG key (see here). We need to get it and inject it into apt:
$ gpg --recv-key 49881AD3
gpg: requesting key 49881AD3 from hkp server keys.gnupg.net
gpg: key 49881AD3: "Stéphane Glondu <steph@glondu.net>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1
$ gpg -a --export 49881AD3 > glondu.gpg
$ schroot -c sid-amd64-ocaml312 apt-key add glondu.gpg
The following part is done in the schroot:
$ schroot -c sid-amd64-ocaml312
# PROXY = host where approx is installed
(sid-amd64-ocaml312)$ echo "deb http://PROXY:9999/ocaml-312 sid main" >> /etc/apt/sources.list
(sid-amd64-ocaml312)$ cat <<EOF >> /etc/apt/preferences
Package: *
Pin: release l=ocaml
Pin-Priority: 1001
EOF
(sid-amd64-ocaml312)$ apt-get update
...
(sid-amd64-ocaml312)$ apt-cache policy ocaml
Installed: (none)
Candidate: 3.12.0-1~38
Version table:
3.12.0-1~38 0
1001 http://atto/ocaml-312/ sid/main amd64 Packages
3.11.2-1 0
500 http://atto/debian/ sid/main amd64 Packages
(sid-amd64-ocaml312)$ apt-get install ocaml-nox libtype-conv-camlp4-dev libounit-ocaml-dev...
That's it. The apt-cache policy command shows that OCaml 3.12 from the ocaml-312 repository has a higher priority for installation.
Good luck playing with OCaml 3.12.0.
I am happy to announce version 0.3 of ocamljs. Ocamljs is a system for compiling OCaml to Javascript. It includes a Javascript back-end for the OCaml compiler, as well as several support libraries, such as bindings to the browser DOM. Ocamljs also works with orpc for RPC over HTTP, and froc for functional reactive browser programming.
Changes since version 0.2 include:
Development of ocamljs has moved from Google Code to Github; see
Since I last did an ocamljs release, a new OCaml-to-Javascript system has arrived, js_of_ocaml. I want to say a little about how the two systems compare:
Ocamljs is a back-end to the existing OCaml compiler; it translates the “lambda” intermediate language to Javascript. (This is also where the bytecode and native code back-ends connect to the common front-end.) Js_of_ocaml post-processes ordinary OCaml bytecode (compiled and linked with the ordinary OCaml bytecode compiler) into Javascript. With ocamljs you need a special installation of the compiler (and special support for ocamlbuild and ocamlfind), you need to recompile libraries, and you need the OCaml source to build it. With js_of_ocaml you don’t need any of this.
Since ocamljs recompiles libraries, it’s possible to special-case code for the Javascript build to take advantage of Javascript facilities. For example, ocamljs implements the Buffer module on top of Javascript arrays instead of strings, for better performance. Similarly, it implements CamlinternalOO to use Javascript method dispatch directly instead of layering OCaml method dispatch on top. Js_of_ocaml can’t do this (or at least it would be necessary to recognize the compiled bytecode and replace it with the special case).
Because js_of_ocaml works from bytecode, it can’t always know the type of values (at the bytecode level, ints, bools, and chars all have the same representation, for example). This makes interoperating with native Javascript more difficult: you usually need conversion functions between the OCaml and Javascript representation of values when you call a Javascript function from OCaml. Ocamljs has more information to work with, and can represent OCaml bools as Javascript bools, for example, so you can usually call a Javascript function from OCaml without conversions.
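The claim that ints, bools and chars share one representation can be checked in ordinary OCaml. The following is a small sketch using Obj.magic from the standard library; as_int is a hypothetical helper introduced here, and it is only meaningful for immediate (unboxed) values.

```ocaml
(* Reinterpret an immediate value as the int the runtime stores for it.
   Only meaningful for unboxed values such as bool, char and unit;
   do not call it on boxed values like strings or floats. *)
let as_int (x : 'a) : int = Obj.magic x

let () =
  Printf.printf "true -> %d, 'a' -> %d, () -> %d\n"
    (as_int true) (as_int 'a') (as_int ())
(* prints: true -> 1, 'a' -> 97, () -> 0 *)
```

Once the types are erased, nothing distinguishes true from 1, which is why a translator working from bytecode cannot tell on its own whether a value should become a Javascript boolean or a number.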
Ocamljs has a mixed representation of strings: literal strings and the result of ^, Buffer.contents, and Printf.sprintf are all immutable Javascript strings; strings created with String.create are mutable strings implemented by Javascript arrays (with a toString method which returns the represented string). This is good for interoperability—you can usually pass a string directly to Javascript—but it doesn’t match regular OCaml’s semantics, and it can cause runtime failures (e.g. if you try to mutate an immutable string). Js_of_ocaml implements only mutable strings, so you need conversions when calling Javascript, but the semantics match regular OCaml.
With ocamljs, Javascript objects can be called from OCaml using the ordinary OCaml method-call syntax, and objects written in OCaml can be called using the ordinary Javascript syntax. With js_of_ocaml, a special syntax is needed to call Javascript objects, and OCaml objects can’t easily be called from Javascript. However, there is an advantage to having a special call syntax: with ocamljs it is not possible to partially apply calls to native Javascript methods, but this is not caught by the compiler, so there can be a runtime failure.
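For readers less familiar with the term, this is what partial application looks like in ordinary OCaml; add and add_one are hypothetical names for this example. According to the paragraph above, the same pattern applied to a native Javascript method under ocamljs would still pass the type checker but could fail at run time.

```ocaml
(* add takes two arguments, one at a time. *)
let add (x : int) (y : int) : int = x + y

(* Partial application: supplying only the first argument yields a new
   function that is still waiting for the second one. *)
let add_one : int -> int = add 1

let () = Printf.printf "%d\n" (add_one 41)   (* prints 42 *)
```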
Ocamljs supports inline Javascript, while js_of_ocaml does not. I think it might be possible for js_of_ocaml to do so using the same approach that ocamljs takes: use Camlp4 quotations to embed a syntax tree, then convert the syntax tree from its OCaml representation (as lambda code or bytecode) into Javascript. However, you would still need conversion functions between OCaml and Javascript values.
I haven’t compared the performance of the two systems. It seems like there must be a speed penalty to translating from bytecode compared to translating from lambda code. On the other hand, while ocamljs is very naive in its translation, js_of_ocaml makes several optimization passes. With many programs it doesn’t matter, since most of the time is spent in browser code. (For example, the planet example seems to run at the same speed in ocamljs and js_of_ocaml.) It would be interesting to compare them on something computationally intensive like Andrej Bauer’s random-art.org.
Js_of_ocaml is more complete and careful in its implementation of OCaml (e.g. it supports int64s), and it generates much more compact code than ocamljs. I hope to close the gap in these areas, possibly by borrowing some code and good ideas from js_of_ocaml.
by Jake Donham (noreply@blogger.com) at August 26, 2010 09:45 PM
OCaml compiles native executables in static mode. This keeps the set of dependencies minimal when delivering an executable. It also has disadvantages, like the size of the executable and problems arising when libraries are updated -- but this is another topic. There is still one strong dependency that you should not forget when you want to deliver a product for most Linux distributions: the dependency on the glibc version.
Trying to run OASIS compiled with Debian Lenny, on CentOS 5.5:
$ OASIS
.../OASIS: /lib64/libc.so.6: version `GLIBC_2.7' not found (required by .../OASIS)
So when compiling for delivery, one should choose the oldest distribution one targets. In my case, I chose CentOS 5, which comes with glibc v2.5. I usually choose Debian stable (Debian Lenny at the time of writing), but Debian Lenny's glibc (v2.7) is newer than the one in the CentOS 5.5 stable release. CentOS is a Red Hat-like Linux distribution.
I use a Debian Lenny amd64 host system and I decided to set up chroots of CentOS 5 i386 and amd64. I also set up schroot to use my CentOS chroots.
First of all we use rinse, which can set up an RPM-based distribution in a chroot. The version v1.3 shipped with Debian Lenny has some bugs: it doesn't install nss and other mandatory packages. So I downloaded v1.7 directly from Debian Sid. There are no dependency problems and the package is arch:all, so it is straightforward to install:
$ wget http://ftp.de.debian.org/debian/pool/main/r/rinse/rinse_1.7-1_all.deb
# Replace ftp.de.debian.org by your preferred Debian mirror
$ dpkg -i rinse_1.7-1_all.deb
Then I create the chroot directory and launch rinse:
$ mkdir /srv/chroot/centos5-amd64
$ rinse --arch amd64 --distribution centos-5 --directory /srv/chroot/centos5-amd64
# N.B. you must use --arch, the default is i386
Once installation is complete, you can add an entry for this distribution in /etc/schroot/schroot.conf:
[centos5-amd64]
description=Centos 5 (amd64)
location=/srv/chroot/centos5-amd64
priority=3
users=XXX
groups=
root-groups=root
type=directory
run-setup-scripts=true
run-exec-scripts=true
Replace XXX by your login.
If you try to login directly, you will get warnings:
$ schroot -c centos5-i386
I: [centos5-i386-a952de23-7f4b-4bae-a9b9-752ecee4a185 chroot] Running login shell: "/bin/bash"
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
-bash: /dev/null: Permission denied
This is a bit misleading because the real problem is that nothing is created in /dev/. CentOS delegates creating char/block devices to udev. There are two ways to solve this issue:
$ MAKEDEV random
$ MAKEDEV console
$ MAKEDEV zero
$ MAKEDEV null
$ MAKEDEV stdout
$ MAKEDEV stdin
$ MAKEDEV stderr
$ rsync -av /srv/chroot/lenny-amd64/dev/* /srv/chroot/centos5-amd64/dev/
That's it, you now have a functional chrooted CentOS 5 environment:
$ schroot -c centos5-amd64 cat /etc/redhat-release
I: [centos5-amd64-b9bae264-285b-4d17-a046-13386736cecd chroot] Running command: "cat /etc/redhat-release"
CentOS release 5.5 (Final)
To set up an i386 environment, we follow almost the same scheme, except that we need to fix a bug in rinse v1.7: we need to call linux32 before executing chroot. The problem is that the first installation stage of rinse installs an i386/686 environment, but as soon as you call chroot yum install ..., it guesses that the system is amd64 and installs the missing packages for that architecture. See the Debian bug report and the example patch attached to correct this behavior.
WARNING: this patch is just an example, you can apply it for creating CentOS i386 chroot on Lenny amd64 host but you should remove the patch as soon as the installation is complete.
$ mkdir /srv/chroot/centos5-i386/
$ rinse --arch i386 --distribution centos-5 --directory /srv/chroot/centos5-i386
# With /usr/lib/rinse/centos-5/post-install.sh patched
$ rsync -av /srv/chroot/lenny-i386/dev/* /srv/chroot/centos5-i386/dev/
Add this distribution to /etc/schroot/schroot.conf:
[centos5-i386]
description=Centos 5 (i386)
location=/srv/chroot/centos5-i386
priority=3
users=XXX
groups=
root-groups=root
type=directory
run-setup-scripts=true
run-exec-scripts=true
personality=linux32
You now have a schroot of CentOS 5 i386:
$ schroot -c centos5-i386 cat /etc/redhat-release
I: [centos5-i386-9acafa91-9862-4488-aaef-4ab2a482771e chroot] Running command: "cat /etc/redhat-release"
CentOS release 5.5 (Final)
Happy schroot hacking!
I'm on the program committee for CUFP this year, so I'm a bit biased, but I feel very good about this year's program. For the first time, CUFP will be broken up into three parts:
So, if you're interested, register here. Note that CUFP is being run as part of ICFP and the family of related workshops, so you go through the same registration process.
See you in Baltimore!