OCaml Planet

May 23, 2015

Daniel Bünzli

On the book « More OCaml »

I eventually read through John Whitington's latest book « More OCaml. » It's again a very good one, I made a small review here.

May 23, 2015 03:17 PM

May 22, 2015

GaGallium

ICFP programming contest 2014: a retrospective (part 1/2)

The dates for the 2015 ICFP programming contest have been announced on the contest webpage. This is a good occasion to dig out a summary of our 2014 participation, that had been written over the span of several months (mostly right after the contest), and that we had unexplainably failed to publish before.

In 2013, some Gallium members (Thibaut Balabonski, Thomas Braibant, Jacques-Henri Jourdan and Gabriel Scherer) participated to the ICFP programming contest, and it was a lot of fun. We participated again in 2014, with a bit less gallium in the team (blame the precarity of post-docs, and some badly-chosen dates for a honeymoon) and a skewed alphabetic distribution: François Bobot, Pierre Boutillier, Thomas Braibant, Damien Doligez and Gabriel Scherer.

This (double) blog post was written over the course of the last ~5 months by three different persons. With hindsigth, we could probably have phrased some of the post differently. We believe that this blog post is best enjoyed as a pot-pourri of our thoughts, much like our ragtag submission.

The subject of the 2014 contest was the following: a computer game called Lambda-Man, a variation on Pacman, running on an arcane (simulated) computer architecture with heterogeneous processors. We were asked to implement a Lambda-man AI (the player character) on a Lisp-machine-inspired processor architecture, using a functional but relatively low-level programming language, and to write ghost programs (the attackers) on very low-level co-processors, using a low-level assembly-like language. The organizers funnily enough called the functional low-level language "Gcc", and the Ghost language "Ghc".

Because it would be too long to be read comfortably in one shot, we split the document in two posts: we will now mostly focus on the Lambda-Man AI, and describe the ghost in a later post.

Our broad design

We wrote a Ghost in ghost machine code, using a very thin assembler to have named labels and simple constants. The ghost AI, due to Damien and Thomas, is relatively elaborate and is probably the best part of our submission (Editor's note: the judges specifically refered to our ghost as the best of the competition. See the annoucement.).

For the Lambda-man AI, we wrote a compiler from OCaml (what else?) to the Lisp machine. Specifically, we use the OCaml compiler API (as exposed in the not-well-documented compiler-libs ocamlfind package) to get the "lambda" representation to the Lisp machine.

"Lambda" is the representation used just after the compilation of pattern-matching, the choice of data representation, and a couple of very simple type-directed specializations; it is an untyped representation that is rather close to Lisp, and was thus a fitting choice. In particular, lower-level representations (bytecode or Cmm) have already performed closure conversion, which was unnecessary as our backend supports closures. You can inspect the lambda-representation of an OCaml module foo.ml by running ocamlc -dlambda -c foo.ml.

We then wrote the Lambda-man AI using OCaml, in the subset that our compiler supports (more on this in the "# Compiler limitations" section). The AI is rather simple; a problem in our team (and I guess many other teams) is that everyone was quite excited at writing a compiler, but nobody was really fond of writing AIs, so we delayed this part to the very last day of the contest.

PS: We spent quite a few hours implementing a simulator (interpreters for Gcc and Ghc, and game rules engine), but we couldn't make much use of them in the end. Time would probably have been better spent experimenting with the Javascript API directly from the start. On the other hand, we kept the option open to actually simulate other's ghosts in our lambda-man, which would have made use of our Ghc simulator -- written in OCaml; we didn't have the time budget to make that work, though.

In retrospect, I think that reusing/porting/hacking an existing language (OCaml) was a great choice, because it allowed us to test our IA code not only on the organizers' simulator, but also by compiling it as an usual OCaml program with the usual OCaml toolchain, which has less bugs (than our own compiler backend) and is much more efficient. This saved our sanity of mind many times.

The lambda-man AI

We iterated quite a lot on the lambda-man AI in the last few hours of the contest. One problem is that we had a lot of fun with the ghosts and the compiler, and we delayed working on the main part of the contest till the end. Here are the main features of our lambda-man, which was mostly implemented by Thomas with help from Gabriel:

  • at initialisation time, it computes the graph structure of the map (seen as a vector of vector of possible directions); this makes it possible to have a more efficient main loop in the time constrained part of the game.

  • our main loop is a depth-first search that computes the best possible path of a given length, based on a few heuristics. Broadly speaking, the value of a path is the sum of the value of the pills that are eaten along the way, plus an approximation of the value of the ghosts that are eaten, minus a huge value in case our lambda-man gets eaten. One nice thing is that Gabriel implemented a function that predicts the possible positions of the ghosts in the next few steps, which helps take good decisions in tight situations (rather than flee blindly if a ghost comes nearby).

  • since this DFS is quite costly and cannot be used to find a long path, we had to put a failsafe mechanism in place that tries to find a pill if no "good" path is available.

  • finally, our lambda-man "remembers" its decisions and try to keep its path, unless its path is blocked by a ghost or a ghost is too close.

There are quite a few problems with this basic solution. For instance, our lambda-man will quite often let a pill alone (because its AI forbids backtracking). Then, it does not try to maximise the use of power pills (and does not even give the proper value to the ghosts eaten in succession). Also, it does not computes the proper timing of the moves of the ghosts -- each ghost move at its own pace, and keeping track of this situation might be useful in tight situations.

Finally, in the heat of the last fifteen minutes, Thomas added (amond other changes) the Fruit line in the part of our code that computes the value of a path:

let content = get2 map pos in
let value, fright  =
  match content with
    | Pill -> value + 10, fright
    | Power_pill -> value + 50, fright + 20
    | Fruit -> value + 15, fright (* Blunder *)
    | _ -> value, fright
in
...

This is a real bad idea, and it was obvious soon enough. Problem is we ran out of time to correct it ("Never try to update your submission in the last 120 seconds of the contest"). The problem is that Fruit indicates a cell of the map where the fruit may be, not the position of the fruit updated in real-time. It turns out that this single line makes our lambda-man perform quite badly, because it might try to run in circle around the Fruit cell. There are plenty of one-line changes that could solve or mitigate this issue: e.g., removing the line, or changing this 15 value into 5, or just checking that the fruit is indeed present at the time of the evaluation. Too bad.

The OCaml-to-Lisp-machine compiler

The compiler, written mainly by François (which first experimented the backend by hacking on the miniML compiler of Andrej Bauer's PL zoo, thanks Andrej!), only implements the bare minimum we need to write our AIs. We made two design choices:

  1. We would only implement the minimum needed; so we have no modules, no polymorphic variants (or at least we didn't check that they were supported), and even our compilation of pattern-matching was a bit lousy: the OCaml compiler will generate various kind of code depending on whether it thinks, for example, that binary tests or jump tables are the best way to test something, and we only added support for the various lambda-code instruction as our AI examples revealed a need for. It's a compiler that is meant to be used in tandem with a backend hacker, implementing (or fixing) the features as your need evolve.

  2. We wanted the OCaml datatypes to be compiled in such a way that we would need no separate FFI layer to talk with the values given to us, in Lisp-land, by the game engine. At the beginning of each turn the lambda-man AI is passed a big tuple of linked lists that represents the map, each, and we wanted those to fit exactly the representation of some OCaml datatype. Another option would be to represent the OCaml datatypes however we like, and add the FFI types as primitives that can only be inspected through special compiler primitives.

Those two choices are inter-linked: representing OCaml records as Lisp tuples (our goal) is harder than as Lisp lists or Lisp closures (did you notice how frame environments are the only memory structure in this Lisp machine that allows reasonably compact storage of data?), because the access pattern for the very last element is not the same as the others. We ended up without records, but with good support for OCaml tuples (implemented as Lisp tuples).

Another OCaml-using team (the OCamlpro guys) made the different design choice of having the FFI as compiler primitives, and also implemented a more complete support for the rest of the OCaml language -- we'll let them describe their solution if they want to.

Data representation tricks

We initially wished to have a valid OCaml representation for Lisp lists (either 0 for the empty list or a cons-cell (x . xs) for... a cons cell x::xsq), and arrange our compilation of OCaml tuples and records to be exactly Lisp tuples (eg. (a,b,c) is (a . (b . c)), note how there is no end-marker). This is all that was needed to talk natively with the world representation passed to our Lambda-AI. We ended up with no records, OCaml tuples that were exactly Lisp tuples and, surprisingly, OCaml lists that mapped directly to Lisp lists (we would have been happy with having to define a hand-made OCaml datatype).

Focusing on tuples for now, we faced two independent issues:

  • Constructor tags: OCaml variants (sums) and tuples/records (products) come with a header word that contains some GC-specific information and a "constructor tag"; it always is 0 for tuples and records, and for variants it indicates the constructor. For example with the datatype

    type foo =
     | A
     | B
     | C of foo
     | D of bar * baz

    A and B are represented as (tagged) integers, and the heap-allocated blocks are C x (tag 0, one parameter) and D (y, z) (tag 1, two parameters).

    Directly translating this to cons-cells would give to the OCaml tuple (a, b, c) the representation (0 . (a . (b . c))), which is not good to talk to the FFI (of course we could convert first, but constraints sometime provoke creativity).

  • Irregular access: with the Lisp representation, the access pattern for the third-element of a triplet is distinct from the first or second. Unfortunately, the lambda-representation of field access in the OCaml compiler tells us which field to access, but not the total length of the structure (when it is statically known; the same field-access instruction is used for arrays and strings). We would need to compile getfield foo 3 different if foo has 3 fields, or 4 or more, and we don't know which it is.

Solving the irregular access

Thomas had a cool idea to solve the irregular access problem: let's preprocess the OCaml source-code to replace each n-tuple by a composition of 2-tuples, just as in the Lisp representation: (a, b, c) becomes (a, (b, c)), and we gain the invariant that field accesses are always to index 0 and 1, which map directly to car and cdr.

In theory this also works for records: translate record types to tuples and, on record field selection, use the type information associated to the field to know which field number it is, and translate that accordingly. (OCaml 4.00 supports distinct record types sharing field names, so you really need typing information here.) It works in theory, but processing the OCaml typedtree is painful enough in practice to be worth it; so we did not implement this for records.

The preprocessing for tuples is amazingly easy to do using Alain Frisch's visitor-pattern Ast_mapper library.

Solving the constructor tags

The two data structures we care about the most are linked lists and tuples; in both case the tag is 0. Gabriel suggested that we make that a rule: let's forbid non-0 tags, that is, sum types with strictly more than one non-constant constructor. We can then never include the tag and get exactly the Lisp tuple representation.

That may seem like an awful limitation, but as PhD students (or, for the most part, former PhD students), we get paid to know about arcane type-system tricks we would never hope to suggest in a serious discussion, but can use in situations like that: you can translate any OCaml type declaration into an isomorphic one respecting this condition (at most one non-constant constructor), using GADTs to simulate sigma-types. The type declaration above gets rewritten as:

type foo =
 | Foo : 'a foo_tag * 'a -> foo
and foo_tag =
 | A : unit foo_tag
 | B : unit foo_tag
 | C : foo foo_tag
 | D : (bar * baz) foo_tag

And we used this trick in our AI implementation, in this rushed implementation of logarithmic-access immutable arrays (we didn't implement any support for mutation, as pure and strict is clearly the way to go):

type 'a vect_tree = Tree : ('a, 'b) vect_tag * 'b -> 'a vect_tree
and ('a, _) vect_tag =
| Leaf : ('a, 'a) vect_tag
| Node : ('a, 'a vect_tree * 'a vect_tree) vect_tag
type 'a vect = 'a vect_tree * int

let get_vect (t, n) i =
  let rec find : type a . int -> int -> a vect_tree -> a =
    fun i n -> function
    | Tree (Leaf, v) -> v
    | Tree (Node, (left, right)) ->
      let mid = n / 2 in
      if i < mid
      then find i mid left
      else find (i - mid) (n - mid) right
  in find i n t

let rec vect_of_list li =
  let rec consume li n =
    if n = 1 then
      begin match li with
        | [] -> assert false
        | x::xs -> (Tree (Leaf, x), xs)
      end
    else begin
      let mid = n / 2 in
      let left, li = consume li mid in
      let right, li = consume li (n - mid) in
      Tree (Node, (left, right)), li
    end
  in
  let len = list_length li in
  match consume li len with
    | tree, [] -> (tree, len)
    | _, _::_ -> assert false

Our productions

For reference, we uploaded the sorry state of our code at the very end of the contest as this archive. (This is not an open-source release, because it would be work to prepare one.)

The ghost (readable) assembly are in:

  • ghosts/genius.g
  • ghosts/genius2.g

The lambda-man source is:

  • mlia/graph_local.ml

Actually running our code

We use

  • ocaml 4.01 (including compilerlibs (+unix +graphics))
  • ocamlfind

In code/

  • gcc/ contains compiler ./ocamlgcc.native from OCaml (minus records, references, exceptions) to gcc. « make » compiles it.

  • mlia/ is the AI of lambda man written in OCaml. In this directory do ../gcc/ocamlgcc.native « an_ia.ml » --print --no-exec to get some gcc code

  • ghc/ contains an assembler ./asm.byte of ghc. « make » compiles it.

  • ghosts/*.g contains the assembly of our ghost AI. It must go though ../ghc/asm.byte « an_ia.g » to give correct ghc code.

  • At top-level you can try ocamlbuild -use-ocamlfind main.native to get our not working own game simulator (you’ll need « cmdliner » to compile it)

by Gabriel Scherer, Thomas Braibant at May 22, 2015 08:00 AM

May 19, 2015

Shayne Fletcher

Labeled and optional arguments

I don't know why this is, but of all the features of the OCaml language, somehow remembering how to make use of labeled or optional arguments is difficult for me. I find when I need to use them, I have to go back and "relearn" the syntax every time. For that reason, I've written these summary notes for myself (based mostly on Jacques Garrigue's "Labels and Variants" chapter of the OCaml reference) and I hope they might be of help you too!

Labels

I've seen people write this function from time to time.

let flip (f : 'a -> 'b -> 'c) = fun y x -> f x y

flip applied to a function f in two arguments, produces a new function that when presented reverses their order in the application of f. For example, if sub is defined by let sub x y = x - y, then sub 3 2 will give the result 1 whereas (flip sub) 3 2 will produce the value -1.

The reason you'd find yourself wanting a function like flip is because you want to partially apply f but the argument you want to bind is inconveniently in the 'wrong' position. For example, the function that subtracts 2 from it's argument can be written (flip sub 2) (whereas sub 2 is the function that subtracts its argument from 2).

Labeled arguments do away with the need for functions like flip. You see, arguments in function types can be annotated.


val sub : (x:int) -> (y:int) -> int
In the above, x and y are labels. When you define a function with labeled arguments, you prefix the labels with a tilde ~.

let sub : (x:int -> y:int -> int) = fun ~x ~y -> x - y

Labels and variables are not the same things. A label is a name associated with a variable e.g. let u = 3 and v = 2 in sub ~x:u ~y:v (u, v are variables x, y labels) and this also works when the variables are replaced with literals as in sub ~x:3 ~y:2.

The expression ~x on it's own is shorthand for ~x:x which means that you can conveniently write let x = 3 and y = 2 in sub ~x ~y for example. This is "punning" - meaning if the variable name and the label are the same, the variable name is permitted to be elided in the application.

Labels enable making function application commutative (changing the order of the arguments does not change the result); one can write sub ~y ~x and that will yield the same result as sub ~x ~y. Where this is useful of course is partial application of a function on any any argument. For example, we can create functions from sub by binding either of x or y such as sub ~x:3 or sub ~y:2 without having to resort to ad-hoc trickery like flip sub.

Some technical details:

  • If more than one argument of a function has the same label (or no label), position prevails and they will not commute among themselves;
  • If an application is total, labels can be omitted (e.g. like in sub 3 2). Be wary though - when a function is polymorphic in its return type, it will never be considered as totally applied. Consider for example , let app ~f ~x = f x. Then app has type f:('a -> 'b') -> x:'a -> 'b. Now we attempt total application but omit the required labels app (fun x -> x) 1 and this types to f:('a -> ('b -> 'b) -> int -> 'c) -> x:'a -> 'c. You can see that the arguments have been "stacked up" in anticipation of presentation of f and x and the result although it types, is not at all what we were going for!
  • When a function is passed to a higher order function, labels must match in both types, labels may not be added or removed.

Optional arguments

In a function type involving an optional argument, the annotation ? appears.

val range : ?step:int -> int -> int -> int list
The ? appears again in the value syntax of a function definition involving an optional argument. Additionally, at that time, optional parameters can be given default values. When calling a function with a value for an optional parameter, well, you use the fact it is a labeled argument and provide a ~ as normal.

let rec range ?(step:int = 1) (m : int) (n : int) : int list =
if m >= n then []
else m :: (range ~step (m + step) n)

Now, this bit is important. A function can not take optional arguments alone. There must be some non-optional ones there too and it is when a non-optional parameter that comes after an optional one in the function type is encountered is it determined if the optional parameter has been omitted or not. For example, if we reorder the argument list as in the below, we find we can't 'erase' the optional argument anymore.


let rec range (m : int) (n : int) ?(step : int = 1): int list = ...
Warning 16: this optional argument cannot be erased.
That said, optional parameters may commute with non-optional or unlabeled ones, as long as they are applied simultaneously. That is, going back to this definition,

let rec range ?(step:int = 1) (m : int) (n : int) : int list =
if m >= n then []
else m :: (range ~step (m + step) n)
(range 0 ~step:2) 100 is valid whereas, (range 0) ~step:2 100 is not.

Optional arguments are implemented as option types. In the event no default is given, you can treat them as exactly that by matching on None or Some x to switch behaviors. Here's a schematic of what that looks like.


let f ?s y =
match s with
| None -> ...
| Some x -> ...

When you are just "passing through" an optional argument from one function call to another, you can prefix the applied argument with ? as in the following.


let g ?s x = ...
let h ?s x = g ?s x (*[s] is passed through to [g]*)

by Shayne Fletcher (noreply@blogger.com) at May 19, 2015 06:09 PM

Andrej Bauer

The troublesome reflection rule (TYPES 2015 slides)

Here are the slides of my TYPES 2015 talk “The troublesome reflection rule” with fairly detailed presenter notes. The meeting is  taking place in Tallinn, Estonia – a very cool country in many senses (it’s not quite spring yet even though we’re in the second half of May, and it’s the country that gave us Skype).

Download slides: The troublesome reflection rule (TYPES 2015) [PDF].

by Andrej Bauer at May 19, 2015 03:10 PM

May 18, 2015

Psellos

Sliding Tile OCaml iOS App

May 18, 2015

I revamped another OCaml iOS app from a few years ago to make it run under iOS 8.3. It implements a sliding tile puzzle that was a fad in the 1880s and was also popular in my childhood a few years after that.

Instructions for downloading, building, and running the app are here:

Slide24: Sliding Tile Puzzle for iOS

You can download the sources directly here:

Slide24 2.0.2, OCaml app for iOS 8.3 (111 KB)

I had a lot of fun with this puzzle as a kid, and I still find it enjoyable to play with though it’s not particularly challenging to solve. To make the app more interesting, I coded up a heuristic search that solves the puzzle.

Here are some insights I had while revamping the app:

  • iOS devices have gotten tremendously faster since I wrote the code. I should rewrite it to solve with fewer moves. I’d really love to see what an extremely short solution looks like. I’ll bet it looks impossibly wise, like those Swedish girls in First Aid Kit.

  • It’s a little strange for a puzzle to solve itself. Like what if there was a button on the jigsaw puzzle box that made the pieces crawl themselves on the table into the solution. It would be fun to watch, but would it be a puzzle any more? More like a garden I guess.

Now, reader, it’s late at night and I have to rest up. There are exciting new developments in the OCaml-on-iOS world to talk about in the coming weeks. A version of the the iOS cross compiler, much improved with help from the truly knowledgeable, and with support for 64-bit ARM, will soon be available through OPAM. I’m really looking forward to this, and indeed this is why I’m working through these apps as fast as I can. I’m testing them against the new compiler.

If you have any trouble (or success) getting Slide24 to work for you, leave a comment below or email me at jeffsco@psellos.com.

Posted by: Jeffrey

<style type="text/css"> .flowaroundimg { float: left; margin: 0em 1em 0em 0em; } .rightoffloat li { position: relative; left: 1em; } pre { white-space: pre-wrap; width: 96%; margin-bottom: 24px; overflow: hidden; padding: 3px 10px; -webkit-border-radius: 3px; background-color: #fed; border: 1px solid #dcb; } pre code { white-space: pre-wrap; border: none; padding: 0; background-color: transparent; -webkit-border-radius: 0; } td { padding: 0em 1em 0em 1em; } th { padding: 0em 1em 0em 1em; } </style>

May 18, 2015 07:00 PM

OCamlPro

Reduced Memory Allocations with ocp-memprof

In this blog post, we explain how ocp-memprof helped us identify a piece of code in Alt-Ergo that needed to be improved. Simply put, a function that merges two maps was performing a lot of unnecessary allocations, negatively impacting the garbage collector's activity. A simple patch allowed us to prevent these allocations, and thus speed up Alt-Ergo's execution.

 

The Story

Il all started with a challenging example coming from an industrial user of Alt-Ergo, our SMT solver. It was proven by Alt-Ergo in approximately 70 seconds. This seemed abnormnally long and needed to be investigated. Unfortunately, all our tests with different options (number of triggers, case-split analysis, ...) and different plugins (satML plugin, profiling plugin, fm-simplex plugin) of Alt-Ergo failed to improve the resolution time. We then profiled an execution using ocp-memprof to understand the memory behavior of this example.

 

Profiling an Execution with ocp-memprof

As usual, profiling an OCaml application with ocp-memprof is very simple (see the user manual for more details). We just compiled Alt-Ergo in the OPAM switch for ocp-memprof (version 4.01.0+ocp1) and executed the following command:

$ ocp-memprof --exec ./ae-4.01.0+ocp1-public-without-patch pb-many-GCs.why

The execution above triggers 612 garbage collections in about 114 seconds. The analysis of the generated dumps produces the evolution graph below. We notice on the graph that:

  • we have approximately 10 MB of hash-tables allocated since the beginning of the execution, which is expected;

  • the second most allocated data in the major heap are maps, and they keep growing during the execution of Alt-Ergo.

We are not able to precisely identify the allocation origins of the maps in this graph (maps are generic structures that are intensively used in Alt-Ergo). To investigate further, we wanted to know if some global value was abnormally retaining a lot of memory, or if some (non recursive-terminal) iterator was causing some trouble when applied on huge data structures. For that, we extended the analysis with the --per-root option to focus on the memory graph of the last dump. This is done by executing the following command, where 4242 is the PID of the process launched by ocp-memprof --exec in the previous command:

$ ocp-memprof --load 4242 --per-root 611

Evolution-graph-of-Alt-Ergo-memory-consumption-before-patching

Analyzing-last-dump-of-Alt-Ergo-memory-consumption-before-patching


The per-root graph (above, on the right) gives more interesting information. When expanding the stack node and sorting the sixth column in decreasing order, we notice that:

  • a bunch of these maps are still in the stack: the item Map_at_192_offset_1 in the first column means that most of the memory is retained by the fold function, at line 192 of the Map module (resolution of stack frames is only available in the commercial version of ocp-memprof);

  • the "kind" column corresponding to Map_at_192_offset_1 gives better information. It provides the signature of the function passed to fold. This information is already provided by the online version.

Uf.Make(Uf.??Make.X).LX.t ->
Explanation.t ->
Explanation.t Map.Make(Uf.Make(Uf.??Make.X).LX).t ->
Explanation.t Map.Make(Uf.Make(Uf.??Make.X).LX).t

This information allows us to see the precise origin of the allocation: the map of module LX used in uf.ml. Lucky us, there is only one fold function of LX's maps in the Uf module with the same type.

 

Optimizing the Code

Thanks to the information provided by the --per-root option, we identified the code responsible for this behavior:

(*** function extracted from module uf.ml ***)

01 | module MapL = Map.Make(LX)
02 |
03 | let update_neqs r1 r2 dep env =
04 |   let merge_disjoint_maps l1 ex1 mapl =
05 |     try
06 |       let ex2 = MapL.find l1 mapl in
07 |       let ex = Ex.union (Ex.union ex1 ex2) dep in
08 |       raise (Inconsistent (ex, cl_extract env))
09 |     with Not_found ->
10 |       MapL.add l1 (Ex.union ex1 dep) mapl
11 |   in
12 |   let nq_r1 = lookup_for_neqs env r1 in
13 |   let nq_r2 = lookup_for_neqs env r2 in
14 |   let mapl = MapL.fold merge_disjoint_maps nq_r1 nq_r2 in
15 |   MapX.add r2 mapl (MapX.add r1 mapl env.neqs)

Roughly speaking, the function above retrieves two maps nq_r1 and nq_r2 from env, and folds on the first one while providing the second map as an accumulator. The local function merge_disjoint_maps (passed to fold) raises Exception.Inconsistent if the original maps were not disjoint. Otherwise, it adds the current binding (after updating the corresponding value) to the accumulator. Finally, the result mapl of the fold is used to update the values of r1 and r2 in env.neqs.

After further debugging, we observed that one of the maps (nq_r1 and nq_r2) is always empty in our situation. A straightforward fix consists in testing whether one of these two maps is empty. If it is the case, we simply return the other map. Here is the corresponding code:

(*** first patch: testing if one of the maps is empty ***)

.. | ...
14 |   let mapl =
15 |      if MapL.is_empty nq_r1 then nq_r2
16 |      else
17 |        if MapL.is_empty nq_r2 then nq_r1
18 |        else
19 |          MapL.fold merge_disjoint_maps nq_r1 nq_r2
.. | ...

Of course, a more generic solution should not just test for emptiness, but should fold on the smallest map. In the second patch below, we used a slightly modified version of OCaml's maps that exports the height function (implemented in constant time). This way, we always fold on the smallest map while providing the biggest one as an accumulator.

(*** second (better) patch : folding on the smallest map ***)

01 | module MapL = Emap.Make(LX)
.. | ...
14 |   let small, big =
15 |     if MapL.height nq_r1 < MapL.height nq_r2 then nq_r1, nq_r2
16 |     else nq_r2, nq_r1
17 |   in
18 |   let mapl = MapL.fold merge_disjoint_maps small big in
.. |   ...
 

Checking the Efficiency of our Patch

Curious to see the result of the patch ? We regenerate the evolution and memory graphs of the patched code (fix 1), and we noticed:

  • a better resolution time: from 69 seconds to 16 seconds;

  • less garbage collection : from 53,000 minor collections to 19,000;

  • a smaller memory footprint : from 26 MB to 24 MB;

Evolution-graph-of-Alt-Ergo-memory-consumption-after-patching

Analyzing-last-dump-of-Alt-Ergo-memory-consumption-after-patching



 

Conclusion

We show in this post that ocp-memprof can also be used to optimize your code, not only by decreasing memory usage, but by improving the speed of your application. The interactive graphs are online in our gallery of examples if you want to see and explore them (without the patch and with the patch).

AE AE + patch Remarks
4.01.0 69.1 secs 16.4 secs substantial improvement on the example
4.01.0+ocp1 76.3 secs 17.1 secs when using the patched version of Alt-Ergo
dumps generation 114.3 secs (+49%) 17.6 secs (+2.8%) (important) overhead when dumping
memory snapshots
# dumps (major collections) 612 GCs 31 GCs impressive GC activity without the patch
dumps analysis
(online ocp-memprof)
759 secs 24.3 secs
dumps analysis
(commercial ocp-memprof)
153 secs 3.7 secs analysis with commercial ocp-memprof is
~ x5 faster than public version (above)
AE memory footprint 26 MB 24 MB memory consumption is comparable
minor collections 53K 19K fewer minor GCs thanks to the patch

Do not hesitate to use ocp-memprof on your applications. Of course, all feedback and suggestions are welcome, just email us !


More information:

by Çagdas Bozman (cagdas.bozman@ocamlpro.com) at May 18, 2015 12:00 AM

May 14, 2015

Functional Jobs

OCaml server-side developer at Ahrefs Research (Full-time)

Who we are

Ahrefs Research is a San Francisco branch of Ahrefs Pte Ltd (Singapore), which runs an internet-scale bot that crawls whole Web 24/7, storing huge volumes of information to be indexed and structured in timely fashion. On top of that Ahrefs is building analytical services for end-users.

Ahrefs Research develops a custom petabyte-scale distributed storage to accommodate all that data coming in at high speed, focusing on performance, robustness and ease of use. Performance-critical low-level part is implemented in C++ on top of a distributed filesystem, while all the coordination logic and communication layer, along with API library exposed to the developer is in OCaml.

We are a small team and strongly believe in better technology leading to better solutions for real-world problems. We worship functional languages and static typing, extensively employ code generation and meta-programming, value code clarity and predictability, constantly seek out to automate repetitive tasks and eliminate boilerplate, guided by DRY and following KISS. If there is any new technology that will make our life easier - no doubt, we'll give it a try. We rely heavily on opensource code (as the only viable way to build maintainable system) and contribute back, see e.g. https://github.com/ahrefs . It goes without saying that our team is all passionate and experienced OCaml programmers, ready to lend a hand or explain that intricate ocamlbuild rule.

Our motto is "first do it, then do it right, then do it better".

What we need

Ahrefs Research is looking for backend developer with deep understanding of operating systems, networks and taste for simple and efficient architectural designs. Our backend is implemented mostly in OCaml and some C++, as such proficiency in OCaml is very much appreciated, otherwise a strong inclination to intensively learn OCaml in a short term will be required. Understanding of functional programming in general and/or experience with other FP languages (F#,Haskell,Scala,Scheme,etc) will help a lot. Knowledge of C++ is a plus.

The candidate will have to deal with the following technologies on the daily basis:

  • networks & distributed systems
  • 4+ petabyte of live data
  • OCaml
  • C++
  • linux
  • git

The ideal candidate is expected to:

  • Independently deal with and investigate bugs, schedule tasks and dig code
  • Make argumented technical choice and take responsibility for it
  • Understand the whole technology stack at all levels : from network and userspace code to OS internals and hardware
  • Handle full development cycle of a single component, i.e. formalize task, write code and tests, setup and support production (devops)
  • Approach problems with practical mindset and suppress perfectionism when time is a priority

These requirements stem naturally from our approach to development with fast feedback cycle, highly-focused personal areas of responsibility and strong tendency to vertical component splitting.

What you get

We provide:

  • Competitive salary
  • Modern office in San Francisco SOMA (Embarcadero)
  • Informal and thriving atmosphere
  • First-class workplace equipment (hardware, tools)
  • No dress code

Get information on how to apply for this position.

May 14, 2015 08:07 AM

OCaml backend developer at Ahrefs Research (Full-time)

Who we are

Ahrefs Research is a San Francisco branch of Ahrefs Pte Ltd (Singapore), focusing on development of distributed storage engine. Ahrefs runs an internet-scale bot that crawls whole Web 24/7, storing huge volumes of information to be indexed and structured in timely fashion. On top of that Ahrefs is building analytical services for end-users.

Custom petabyte-scale distributed storage was developed to accommodate all that data coming in at high speed. This database is the current focus of Ahrefs Research development efforts - increasing speed, reliability, simplifying deployment and maintenance. We might as well dive into other parts of Ahrefs backend development.

Our motto is "first do it, then do it right, then do it better".

What we need

Ahrefs Research is looking for backend developer with deep understanding of operating systems, networks and taste for simple and efficient architectural designs. Our backend is implemented in OCaml and C++, as such proficiency in at least one of those languages is absolutely required.

The candidate will have to deal with the following technologies on the daily basis:

  • networks & distributed systems
  • 4+ petabyte of live data
  • OCaml
  • C++
  • linux
  • git

The ideal candidate is expected to:

  • Independently deal with and investigate bugs, schedule tasks and dig code
  • Make argumented technical choice and take responsibility for it
  • Understand the whole technology stack at all levels : from network and userspace code to OS internals and hardware
  • Handle full development cycle of a single component, i.e. formalize task, write code and tests, setup and support production (devops)
  • Approach problems with practical mindset and suppress perfectionism when time is a priority

These requirements stem naturally from our approach to development with fast feedback cycle, highly-focused personal areas of responsibility and strong tendency to vertical component splitting.

What you get

We provide:

  • Competitive salary
  • Modern office in San Francisco SOMA (Embarcadero)
  • Informal and thriving atmosphere
  • First-class workplace equipment (hardware, tools)
  • No dress code

Get information on how to apply for this position.

May 14, 2015 08:07 AM