OCaml Planet

March 05, 2015

Jane Street

Centralizing distributed version control, revisited

Seven years ago, I wrote a blog post about how we at Jane Street were using our distributed version control system (hg, though the story would be the same for git) in a partially centralized way. Essentially, we built a centralized repo and a continuous integration system whose job was to merge in new changesets. The key responsibility of this system was to make sure that a change was rejected unless it merged, compiled and tested cleanly.

This half-distributed, half-centralized approach let us enjoy the benefits of a DVCS, while still getting the coherence and ease of sharing that comes from having a central authoritative source.

Since then, our development tools have changed a lot, including the arrival of a new code review and release management system called Iron. In writing Iron we discovered that centralization was valuable in ways we hadn't considered before. In particular, despite the fact that good support for merging is central to a DVCS, centralization is actually a critical ingredient to making merges work better.

To understand how centralization can help, let's talk about one reason why merging is a fraught process to begin with.

The criss-cross merge

The basic approach to merging in a DVCS like hg or git is pretty simple. Here are the basic steps that are taken to merge two heads A and B.

  • Find the greatest common ancestor (GCA(A,B)) of the heads to be merged.
  • Compute the patch from that base point to one of the two heads, say, A.
  • Take the patch you just computed, and apply it to B. Conflicts appear when the patch, which was actually based on GCA(A,B), doesn't apply cleanly to B. The result of this process is the merge.
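To make the three steps concrete, here is a toy OCaml sketch (the names `merge3`, `Merged`, and `Conflict` are invented for illustration; real hg and git compute and apply patches at the hunk level, while this only compares files of equal line counts, line by line):

```ocaml
(* A toy three-way merge: each line of A and B is compared against the
   corresponding line of the common base. *)
type merged = Merged of string | Conflict of string * string

let merge3 ~base ~a ~b =
  let rec go base a b =
    match base, a, b with
    | [], [], [] -> []
    | x :: base', y :: a', z :: b' ->
      let line =
        if y = z then Merged y            (* both sides agree *)
        else if y = x then Merged z       (* only B changed this line *)
        else if z = x then Merged y       (* only A changed this line *)
        else Conflict (y, z)              (* both changed it, differently *)
      in
      line :: go base' a' b'
    | _ -> invalid_arg "merge3: unequal lengths"
  in
  go base a b
```

A line changed on only one side merges cleanly; a line changed differently on both sides is exactly where conflicts come from, and the result depends entirely on which base you measured the changes against.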

The above discussion oversimplifies the story by assuming there's a well-defined GCA, but this just isn't always true. To see why, consider a repository starting with a root revision R, and two revisions A and B made independently on top of R.
  A
 /
R
 \
  B


Now, imagine that two different developers each concurrently decide to merge the heads A and B and do some further development. Note that in both of the cases shown below, the GCA for the merge between A and B is R.

Developer 1                Developer 2

  A---C--D                    A     
 /   /                       / \        
R   /                       R   \       
 \ /                         \   \  
  B                           B---E--F

Now, if we bring these two separate histories together into one repo, we have something like this.

  A---C--D
 / \ /
R   \
 \ / \
  B---E--F

Now, what happens if we want to merge D and F? In particular, what is GCA(D,F)? Both A and B are common ancestors, but neither one is greater than the other. In this case, there are in some sense two different GCAs, or, more precisely, there are multiple maximal common ancestors, or MCAs. This case is often described as a criss-cross merge, and is the source of much wailing and gnashing of teeth among developers and users of DVCSs.
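To see the ambiguity concretely, here is a toy OCaml sketch of the ancestor computation for the history above (invented names, not hg or git internals); it finds both A and B as maximal common ancestors of D and F:

```ocaml
(* Parent links for the criss-cross history above. *)
let parents = function
  | "A" | "B" -> ["R"]
  | "C" | "E" -> ["A"; "B"]   (* C and E are both merges of A and B *)
  | "D" -> ["C"]
  | "F" -> ["E"]
  | _ -> []                   (* R is the root *)

(* All ancestors of a node, including the node itself. *)
let rec ancestors n = n :: List.concat_map ancestors (parents n)

(* Strict ancestors only. *)
let strict_ancestors n = List.concat_map ancestors (parents n)

let mcas x y =
  let ay = ancestors y in
  let common =
    List.sort_uniq compare
      (List.filter (fun n -> List.mem n ay) (ancestors x))
  in
  (* a common ancestor is maximal if no other common ancestor
     descends from it *)
  List.filter
    (fun n ->
       not (List.exists (fun m -> List.mem n (strict_ancestors m)) common))
    common
```

Running `mcas "D" "F"` returns both A and B: neither dominates the other, so there is no single "greatest" common ancestor to use as the merge base.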

git and hg have different ways of dealing with the case of multiple MCAs. By default, hg just picks one of the MCAs arbitrarily and does the merge based on that. Given that different choices of the merge base will lead to different results, making that choice arbitrarily is pretty disturbing.

git, on the other hand, has a strategy called recursive merge that repeatedly merges together the MCAs, and then uses that merged MCA as the basis for computing the diffs to A and B on which the final merge will be based. And hg has a new strategy called bid merge that is willing to make different choices as to the GCA to use on a file-by-file basis.

None of these approaches amounts to a principled solution, and while they work better in some cases and worse in others, they all sometimes lead to bad results. It's tempting to look for a way out of this conundrum altogether, by avoiding the possibility of criss-cross merges in the first place.

Avoiding the criss-cross merge

For those who haven't read my previous posts about how Iron approaches merges, I'll describe it briefly here. Iron organizes its branches into a hierarchy: every repository has a root feature, and that feature can have children, and those can have children as well. Thus, our main repository, called Jane, has a root feature called jane, and one can develop changes to jane in child features, such as jane/Core.Applicative or jane/quickcheck.

Critically, the merging of features is constrained. Note that in Iron, every feature is defined by its base and tip revision, where the diff between those two revisions is effectively the contents of the feature. Here are some of the key operations allowed on features.

  • fe release moves changes from a child feature into its parent. This can only be done once the child feature is fully merged with its parent; it has the effect of setting the tip of the parent to be the tip of the child, and typically deletes the child.

As an example, if the jane/quickcheck feature is based at the current tip of jane (and is fully reviewed, and all its tests pass), then calling fe release jane/quickcheck will move the tip of jane forward to be equal to the tip of jane/quickcheck, and will delete jane/quickcheck.

  • fe rebase lets you merge a feature with its parent, effectively pulling changes from a parent feature into a child. This has the effect of changing the base of the feature to be the tip of its parent, and the tip of the feature to be the result of the merge.

So, if other features have been released into jane since the jane/Core.Applicative feature was created, then the base of jane/Core.Applicative will no longer be the tip of jane. Calling fe rebase jane/Core.Applicative will merge the tip of jane/Core.Applicative with the tip of jane, and will set the base of jane/Core.Applicative to the tip of jane.

  • fe rename, which in addition to allowing you to simply change the name of a feature, also lets you introduce a parent-child relationship between features that didn't previously have one. E.g., calling fe rename jane/Core.Applicative jane/quickcheck/Core.Applicative causes the Core.Applicative feature to become a child of, and so be able to depend on the changes in, the quickcheck feature.

All of these operations are implemented against a single, centralized server which keeps track of the state of all our features. This centralization lets Iron enforce some useful invariants along the way, critically, that the GCA of a feature and its parent is well defined, and is equal to the base of the feature. This simple property turns out to outlaw criss-cross merges, which avoids all of the mess we described earlier.
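To make the invariant concrete, here is a toy model of the bookkeeping (the types, field names, and the `merge` stand-in are invented for illustration and are not Iron's actual API): a feature is just a base and a tip, and release/rebase maintain the property that a feature's base equals its parent's tip.

```ocaml
type rev = string
type feature = { mutable base : rev; mutable tip : rev }

(* stand-in for an actual DVCS merge of two tips *)
let merge a b = a ^ "+" ^ b

(* fe release: the child must already be based at the parent's tip *)
let release ~parent ~child =
  assert (child.base = parent.tip);
  parent.tip <- child.tip

(* fe rebase: pull the parent's new tip into the child *)
let rebase ~parent ~child =
  child.tip <- merge child.tip parent.tip;
  child.base <- parent.tip

let () =
  let jane = { base = "r0"; tip = "r0" } in
  let quickcheck = { base = "r0"; tip = "qc1" } in
  let applicative = { base = "r0"; tip = "ap1" } in
  release ~parent:jane ~child:quickcheck;   (* jane's tip moves to qc1 *)
  rebase ~parent:jane ~child:applicative;   (* applicative catches up *)
  assert (jane.tip = "qc1");
  assert (applicative.base = "qc1")
```

Because every merge in this model is between a feature's tip and its base's successor, the base is always the unique GCA, and the criss-cross situation never arises.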

The happy outcome turns out to depend critically on the fact that we built a central server that could enforce the invariants in question, or, more precisely, that we built a consistent service. Whether by design or, as in our case, discovered by chance, the existence of the central server is key to enforcing the necessary invariant. In particular, the scenario of two different users concurrently releasing into the same feature or rebasing the same feature simply isn't possible when there's a centralized monitor determining who goes first.

In retrospect, this shouldn't be too surprising. The criss-cross merge is really a result of concurrency, and the idea that introducing a lock (which is what a centralized server gives you) can be used to exclude unwanted concurrent executions in a distributed system should surprise no one.

In the end, you can trace it all back to the CAP theorem: if you want progress while partitioned, you need to give up on consistency in some way. And criss-cross merges are caused by a kind of inconsistency.

Centralization obviously has downsides, but I think Iron picks a nice point along the spectrum here: writing code is totally doable while disconnected, but operations like rebase and release that affect how information is shared between features require you to be connected. I think it's a small price to pay to never have to deal with a criss-cross merge.

by Yaron Minsky at March 05, 2015 01:07 AM

March 04, 2015

Functional Jobs

Hacker (Node.js, NOSQL, Data Science) at Mobitrans (Full-time)

We are looking for exceptional functional developers to join our product research and development team. It's a full stack job that requires more than anything a passion for learning and experimenting with new technologies, creating prototypes and dealing with a few million data points every day. You will be working on any technology between Linux shell scripts, high performance Web servers and beautiful UIs.

We tweak our advertising and targeting algorithms multiple times a day, and you should be able to reason about, program, discover and propose new ways of improving our results.

Our backend is supported by Node.js and Mongodb and our front-end is powered by JavaScript, TypeScript, LiveScript, AngularJS and React. You should be able to improve our architecture, bring new ideas and create POCs and lead the development of new projects.

You will be building ad servers for serving interactive mobile ads as well as realtime data analysis of the traffic; you will be tinkering with the algorithms and assisting product designers and data analysts by developing data visualization tools for UX and performance reports.

Who are we looking for?

  • Familiarity with functional programming languages including but not limited to Haskell, Clojure, F#, OCaml.
  • Hands-on experience with JavaScript, Python, Node.js, Mongodb, WebSockets, Reactive programming.
  • We know LiveScript is not the most popular language in the world, but we like its functional style and we use it all the time. If you're applying for this job, go and check it out: http://livescript.net
  • Experience with R, Mathematica, and similar software
  • Familiar with data analysis, machine learning software, MapReduce, Hadoop and related concepts and big data technologies.
  • Good UI skills with Underscore.js, d3 and React.
  • Some math
  • Practical knowledge in using Git.

About Us

Mobitrans started in 2006 and is the global leader in mobile learning and entertainment services. We are active in 38 countries. Headquartered in Dubai Media City, we have an international and diverse team, creating and marketing our products for millions of users every day.

Our offer

  • We offer a very competitive salary.
  • World-class healthcare package.
  • Complimentary gym membership.
  • Flexible working hours.
  • Opportunities for career progression within the company.
  • 365 days of sunshine!
  • Amazing office location with stunning views of the famous Palm Jumeirah.

This position is open to worldwide candidates, and in case of relocation we will assist you, ensuring a smooth transition to working and living in the beautiful city of Dubai.

Get information on how to apply for this position.

March 04, 2015 12:30 PM


Cumulus and ocp-memprof, a love story

In this blog post, we hunt for memory leaks in Cumulus using our memory profiler, ocp-memprof. Cumulus is a feed aggregator based on Eliom, a framework for programming web sites and client/server web applications that is part of the Ocsigen Project.

First, run and get the memory snapshots

To test and run the server, we use ocp-memprof to start the process:

$ ocp-memprof --exec ocsigenserver.opt -c ocsigenserver.opt.conf -v

There are several ways to obtain snapshots:

  • automatically after each GC: there is nothing to do, this is the default behavior

  • manually:

    • by sending a SIGUSR1 signal (the default signal can be changed by using --signal SIG option);

    • by editing the source code and using the dump function in the Headump module:

    val dump : string -> unit (* the string argument stands for the name of the dump *)

Here, we use the default behavior and get a snapshot after every GC.

The Memory Evolution Graph

After running the server for a long time, the server process shows an unusually high consumption of memory. ocp-memprof automatically generates some statistics on the application memory usage. Below, we show the graph of memory consumption. On the x-axis, you can see the number of GCs, and on the y-axis, the memory size in bytes used by the most popular types in memory.


Eliom expert users would quickly identify that most of the memory is used by XML nodes and attributes, together with strings and closures.

Unfortunately, it is not that easy to know which parts of the Cumulus source code cause the allocation of these XML trees. The trees are abstract values allocated by functions exported by the Eliom modules, so most of the allocation points are located in the Eliom source code.

Generally, it is hard to locate abstract values using allocation points alone. It may be useful to browse the memory graph, which can be completely reconstructed from the snapshot, to identify all paths between the globals and the blocks representing XML nodes.

From roots to leaking nodes


The approach that we chose to identify the leak is to look at the pointer graph of our application in order to identify the roots retaining a significant portion of the memory. Above, we can observe the table of retained sizes for all roots of the application. What we can tell quickly is that 92.2% of our memory is retained by values with finalizers.

Below, looking at them more closely, we can state that there is a significant amount of values of type:

'a Eliom_comet_base.channel_data Lwt_stream.t -> unit


These finalizers are probably never called, so their associated values are never freed. The leak is not trivial to track down and fix. However, a quick fix is possible in the case of Cumulus.

Identifying the source code and patching it

After further investigation into the source code of Cumulus, we found the only location where such values are allocated:

(* $ROOT/cumulus/src/base/feeds.ml *)
let (event, call_event) =
  let (private_event, call_event) = React.E.create () in
  let event = Eliom_react.Down.of_react private_event in
  (event, call_event)

The function of_react takes an optional argument ~scope to specify the way that Eliom_comet.Channel.create has to use the communication channel.

By changing the default value of the scope to Eliom_common.site_scope, we now have only one channel, and every client uses this channel to communicate with the server (the default behavior created one channel per client).

(* $ROOT/cumulus/src/base/feeds.ml *)
let (event, call_event) =
  let (private_event, call_event) = React.E.create () in
  let event =
    Eliom_react.Down.of_react
      ~scope:Eliom_common.site_scope private_event
  in
  (event, call_event)

Checking the fix

After patching the source code, we recompile our application and re-execute the process as before. Below, we can observe the new pointer graph. By changing the default value of scope, the share of memory retained by finalizers drops from 92.2% to 0%!


The new evolution graph below shows that the memory usage drops from 45MB (and still growing quickly) for a few hundred connections to 5.2MB for thousands of connections.



This example is online in our gallery of examples if you want to see and explore the graphs (with the leak and without the leak).

Do not hesitate to use ocp-memprof on your applications. Of course, all feedback and suggestions on using ocp-memprof are welcome; just send us an email!

More information:

  • Homepage: http://memprof.typerex.org/

  • Usage: http://memprof.typerex.org/free-version.php

  • Support: http://memprof.typerex.org/report-a-bug.php

  • Gallery of examples: http://memprof.typerex.org/gallery.php

  • Commercial: http://memprof.typerex.org/commercial-version.php

by Çagdas Bozman (cagdas.bozman@ocamlpro.com) at March 04, 2015 12:00 AM

March 03, 2015

Functional Jobs

Full-Stack Senior Functional Web Engineer at Front Row Education (Full-time)


Senior full-stack functional web engineer to join fast-growing education startup that changes how over a million young students learn math.

TL;DR - Reasons to care about working with Front Row

  • Our mission is important to us, and we want it to be important to you as well: hundreds of thousands of kids learn math using Front Row every month. Our early results show students improve twice as much while using Front Row as their peers who aren't using the program.
  • You’ll be one of the first engineers on the team, which means you’ll have an immense impact on our company, product, and culture; you’ll have a ton of autonomy; and you’ll have equity to match the weight of this role
  • A lot of flexibility: while we all work towards the same goals, you’ll have a lot of autonomy in what you work on, you can work from home up to one day a week, and we have a very flexible unlimited vacation days policy
  • You’ll use the most effective tools in the industry: Haskell, Postgres, Backbone.js, Ansible and more. Front Row is one of the very few organizations in the world that use Haskell in production for most of their systems and is an active member of the Haskell community.
  • In addition to doing good, we’re doing really well: in just over a year after launch, we are in more than 20% of all US elementary & middle schools.

The Business

Millions of teachers around the USA are struggling to help 30+ students in their class learn math because every student is in their own place. In a typical fourth grade classroom, there may be students learning to count, students learning to add, students learning to multiply, and students learning how exponents work - and one teacher somehow needs to address all these needs.

Front Row makes that impossible task possible, and as of today, more than a hundred thousand students use Front Row to receive personalized guidance in their learning. Thousands of teachers use Front Row every day to save hours of time and make sure their students are growing at the fastest rate achievable. Front Row active users have been growing over 25% a month for the past 6 months.

Front Row is successfully venture-funded and on the road to profitability.

The Role

As one of our very first engineers, you will be part of a team of developers who are passionate about their vocation, take pride in their craft and who push each other to grow as professionals. You will strive for pragmatism and 80/20 in your work. You will be using tools that make you most effective. By working really smart, you will produce more than the average developer ever will, but without the crazy hours.

We love generalists who can quickly bring themselves up to speed with any technology we’re using: you will have the chance to learn a lot, and fast too. You will receive continuous support and mentorship on your journey to achieving mastery. We do however expect you not to need to be hand-held and rely on others for your own growth. You will have full autonomy over your work.

You will work in an effective team that plans, executes and reflects together. Because we’re a small team, everything you create will go into production and be used by students. You will never do unimportant work: every contribution will make a clear and tangible impact on the company’s trajectory. Your personal success will be directly aligned with that of the company.

Most importantly, your work will have purpose: Front Row is a mission-driven company that takes pride in making a significant impact in the lives of hundreds of thousands of students.


The Tech

  • Front Row is a polyglot combination of multiple web applications, mobile apps and asset generation tools.
  • Web front-ends are a custom version of Backbone.js + plugins.
  • The backend is a series of Haskell+Yesod-based applications talking to PostgreSQL and 3rd party services.
  • All test, build and deployment automation relies on Ansible. AWS for hosting.
  • We have mobile apps for both iOS and Android
  • Work is continuously happening to simplify and minimize the codebases.

Must haves

  • You have experience doing full-stack web development. You understand HTTP, networking, databases and the world of distributed systems.
  • You have functional programming experience.
  • Extreme hustle: you’ll be solving a lot of problems you haven’t faced before without the resources and the support of a giant organization. You must thrive on getting things done, whatever the cost.


Nice to haves

  • You have familiarity with a functional stack (Haskell / Clojure / Scala / OCaml etc)
  • You're comfortable with the Behavior-Driven Development style
  • You have worked at a very small startup before: you thrive on having a lot of responsibility and little oversight
  • You have worked in small and effective Agile/XP teams before
  • You have delivered working software to large numbers of users before
  • You have familiarity with system and network administration


Benefits

  • Competitive salary
  • Generous equity option grants
  • Medical, Dental, and Vision
  • Lunch is on us three times a week, and half-day event every month (trip to Sonoma, BBQ, etc)
  • Equipment budget
  • Flexible work schedule
  • Flexible, untracked vacation day policy
  • Working from downtown SF, very accessible location

Front Row - our mission

It's an unfortunate reality that students from less affluent families perform worse in school than students from wealthier families. Part of the reason has to do with home environment and absentee parents, but much of it has to do with inferior resources and less experienced teachers. The worst part of this problem is that if a student falls behind in any grade, they will forever be behind in every grade.

That's the core problem Front Row solves - it doesn't let students fall behind. And if they fall behind, Front Row helps catch them up really quickly because Front Row arms teachers with the resources and knowledge to develop their students individually. Now, the probability of falling behind in any given grade is irrelevant, because it will never compound. The student who would have been the most at risk will instead be up to speed, and therefore far more motivated.

Get information on how to apply for this position.

March 03, 2015 08:52 PM

March 02, 2015

Heidi Howard

Part 2: Running your own DNS Resolver with MirageOS

Last time, we wrote a simple "dig-like" unikernel. Given a domain and the address of a nameserver, the unikernel resolved the domain by asking the nameserver and printed the result to the console.

Today, we will look at another way to resolve DNS queries: by being a DNS server ourselves. This is useful in its own right, but it also allows us to do cool things with our local DNS resolver, such as locally overriding DNS names and resolving .local names, both of which we will add to our DNS resolver another day.

Today we use features only added to the ocaml-dns library in version 0.15 (currently PR #52), so if you do not have this version or later, then update OPAM or pin the master branch on GitHub.

Building a DNS server with MirageOS is simple; take a look at the following code:

open Lwt
open V1_LWT
open Dns
open Dns_server

let port = 53
let zonefile = "test.zone"

module Main (C:CONSOLE) (K:KV_RO) (S:STACKV4) = struct

  module U = S.UDPV4
  module DNS = Dns_server_mirage.Make(K)(S)

  let start c k s =
    let t = DNS.create s k in
    DNS.serve_with_zonefile t ~port ~zonefile
end

The above code will serve DNS requests on port 53, responding with the resource records (RRs) in test.zone. We have provided an example zone file in the repo with the code from this guide. To use this unikernel, we also need to edit the config.ml file from yesterday.
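For reference, a minimal zone file has the following shape (this is an illustrative sketch in standard master-file format; the actual test.zone shipped in the repo may differ, and the serial and timer values here are made up):

```
$ORIGIN d1.signpo.st.
$TTL 3600
@       IN  SOA  ns0.d1.signpo.st. hostmaster.d1.signpo.st. (
                 2015030201 ; serial
                 3600       ; refresh
                 600        ; retry
                 86400      ; expire
                 3600 )     ; minimum TTL
        IN  NS   ns0.d1.signpo.st.
ns0     IN  A    192.168.1.2
```

Each non-directive line is one resource record: owner name, class, type, and data.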

open Mirage

let data = crunch "./data"

let handler =
  foreign "Unikernel.Main" (console @-> kv_ro @-> stackv4 @-> job)

let ip_config:ipv4_config = {
  address= Ipaddr.V4.make 192 168 1 2;
  netmask= Ipaddr.V4.make 255 255 255 0;
  gateways= [Ipaddr.V4.make 192 168 1 1];
}

let direct =
  let stack = direct_stackv4_with_static_ipv4 default_console tap0 ip_config  in
  handler $ default_console $ data $ stack

let () =
  add_to_ocamlfind_libraries ["dns.mirage";"dns.lwt-core"];
  add_to_opam_packages ["dns"];
  register "dns" [direct]

We are using crunch to access the zone file in the data directory. As explained in part 1, this config file is specific to my network setup for Xen backends and can easily be generalised.

You can now test your DNS server and see it work (using the address we configured above):

$ dig @192.168.1.2 ns0.d1.signpo.st.


by Heidi-ann at March 02, 2015 09:50 PM

February 21, 2015


Pearl No.2 - The Max Number of Surpassers


In a list of unsorted numbers (not necessarily distinct), such as


The surpassers of an element are all elements whose indices are bigger and values are larger. For example, the element 1's surpassers are 9, 5, 5, and 6, so its number of surpassers is 4.


And also we can see that 9 doesn't have any surpassers so its number of surpassers is 0.

So the problem of this pearl is:

Given an unsorted list of numbers, find the max number of surpassers; an O(n log n) solution is required.

The answer for the above example is 5, for the element -10.

An easy but not optimal solution

As usual, let's put a trivial solution on the table first ([1]). It is straightforward:

  1. For every element, we scan all elements behind it and maintain a count ns of its surpassers.
  2. If an element behind it is larger, we increase ns.
  3. After we finish with all elements, the max of all the ns values is what we are looking for.

The diagram below demonstrates the process on 1.
trivial solution

let num_surpasser p l =
  List.fold_left (fun c x -> if x > p then c + 1 else c) 0 l

let max_num_surpasser l =
  let rec aux ms l =
    match ms, l with
    | _, [] -> ms
    | None, p :: tl -> aux (Some (p, num_surpasser p tl)) tl
    | Some (_, c), p :: tl ->
      let ns = num_surpasser p tl in
      if ns > c then aux (Some (p, ns)) tl
      else aux ms tl
  in
  aux None l

(* note: treating the answer as an `option` makes the code look more complicated, but it is not bad practice, since an empty list has no max number of surpassers *)

The solution should be easy enough to obtain, but its time complexity is O(n^2), which is worse than the required O(n log n).

Introducing Divide and Conquer


The algorithm design technique divide and conquer was mentioned in Recursion Reloaded. I believe it is a good time to properly introduce it now, as it provides an elegant approach towards a better solution for pearl 2.

What is it, literally

Suppose we want to replace the dragon lady and become the king of the land below ([2]).


We are very lucky that we have a strong army, and now the only question is how to conquer the realm.

One "good" plan is no plan. We believe in our troops so much that we can just let all of them enter the land and pwn everywhere.


Maybe our army is very good in terms of both quality and quantity, and eventually this plan will lead us to win. However, is it really a good plan? Some soldiers may march to places that have already been conquered; some soldiers may leave too soon for more wins after a victory and have to come back due to local rebels... The whole process won't be efficient, and it costs too much gold on food for men and horses.

Fortunately, we have a better plan.


We divide the land into smaller regions, and further into smaller ones inside, until dividing further is unnecessary. For each small region, we put an ideal number of soldiers there for the battles. After the soldiers finish their assigned region, they don't need to move; they just make sure the region stays with us. This is more organised and more efficient in terms of both gold and time. After all, if we conquer all the tiny regions, who would say we were not the king?

What is it in algorithm design, with accumulation

divide and conquer in algorithm design is not a universal solution provider, but a problem-attacking strategy or paradigm. Moreover, although this classic term has lasted for quite a long time, personally I would like to add one more action - accumulate - to make it appear more complete. Let's check the 3 actions one by one to see how we can apply the technique.


Conceptually this action is simple: we know we need to divide a big problem into smaller ones. But how to divide is non-trivial and really context-sensitive. Generally we need to ask ourselves 2 questions first:

What are the sizes of the smaller sub-problems?

Normally we intend to halve the problem because it can lead to an O(log n) factor in our final time complexity.

But it is not a must. For example, 3-way quicksort divides the problem set into 3 smaller ones, and 3-way partitioning can reduce quicksort's recursion depth to O(log_3 n). However, do not forget that the number of comparisons also increases, as we need to check equality during partitioning.

Moreover, sometimes we may have to split the problem into a sub-problem of size 1 and another sub-problem with the rest, like what we did for the sum function. This kind of divide won't give us any performance boost, and it turns out to be ordinary recursion design.

Do we directly split the problem, or do we somehow reshape the problem first and then do the splitting?

In mergesort, we simply split the problem into 2; while in quicksort, we use partition to rearrange the list and then obtain the 2 sub-problems.

The point of this question is to bear in mind that we do not have shortcuts. We can have a very simple splitting, but then we need to face a probably more complicated accumulate, like mergesort relies on merge. Or we can do the important work during the divide phase and have a straightforward accumulate (quicksort just needs to concatenate the solutions of the two sub-problems with the pivot in the middle).


This action implies two things:

  1. Recursion. We divided the problem, so now we need to conquer. How? We apply divide and conquer and accumulate again, until we are not able to divide any more.
  2. Edge cases. If we cannot divide further, it is time to actually produce a solution. For example, say our target is a list: if we reach an empty list or a one-element list, what shall we do? Normally, if this happens, we do not need to accumulate and just return the answer for the edge case.

I believe the conquer part in the original divide and conquer term also implies accumulate. I separate out accumulate, as explained next.


After we conquer every little area of the land, we should now somehow combine all our accomplishments and really build a kingdom out of them. This is the step of accumulate.

A key way to figure out how to accumulate is to start small. In mergesort, if each of the 2 sub-problems has just one element, then the answer for each is a list containing that element, and we have finished the conquer step. Now we have two lists, each with one element; how can we accumulate them into a single sorted list? Simple: the smaller element goes first into our resulting list. What if we have two sorted lists, each with two elements? The same: the smaller element goes first again.
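Mergesort's accumulate step, in isolation, can be written as a short OCaml function (a standard textbook merge, not yet tied to pearl 2):

```ocaml
(* Merge two already-sorted lists into one sorted list by repeatedly
   taking the smaller head first. *)
let rec merge xs ys =
  match xs, ys with
  | [], l | l, [] -> l
  | x :: xs', y :: ys' ->
    if x <= y then x :: merge xs' ys
    else y :: merge xs ys'
```

Note that each call inspects only the two heads, so merging two sorted lists of total length n costs O(n); this is what keeps mergesort at O(n log n) overall.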

If we decide to divide the problem in a fairly simple way, then accumulate is normally non-trivial and also dominates the time complexity. Figuring out a cost-efficient approach to accumulate is very important.


Again, divide and conquer and accumulate is just a framework for solving an algorithm problem. All concrete solutions depend on the problem's context and can spread across all 3 steps.

In addition, a fundamental hint for using this technique is this: if we are given a problem, and we know the solution is not tied to the problem size in any special way, then we should try divide and conquer and accumulate.

Solve pearl 2

Although pearl 2 asks us to get the max number of surpassers, we can

  1. Get the number of surpassers for every element (we need to anyway).
  2. Then do a linear scan for the max one.

The second step is O(n). If we can achieve the first step in O(n log n), the overall time complexity stays O(n log n).

So our current goal is to use divide and conquer and accumulate to get all numbers of surpassers.

Divide the problem of pearl 2


We have such a list and we want to get a new list that has the same elements, where each element is associated with its number of surpassers. Now we want to divide the original list (problem set).

Can we directly halve the list?


As pearl 2 states, an element only cares about the elements behind it. So if we split the list in the middle, the numbers of surpassers for all elements in sub-problem 2 need no special treatment and those answers can directly become part of the future resulting list.

For the elements inside sub-problem 1, the answers are not fully accomplished yet, as they will be affected by the elements in sub-problem 2. But hey, how we obtain full answers for sub-problem 1 with the help of the solutions of sub-problem 2 should be the job of accumulate, right? For now, I believe halving the problem is a good choice for divide, as at least we have already solved half of the problem directly.


We of course will use recursion. For sub-problems, we further halve them into smaller sub-problems until we are not able to, which means we reach the edge cases.

There are possibly two edge cases we need to conquer:

  1. Empty list
  2. A list with only one element.


For an empty list, we just return an empty list, as there is no element for which to count surpassers. For a one-element list, we return a one-element resulting list where the only element has 0 surpassers.


Now we have finished dividing and conquering as below, and it is time to accumulate (sub-problem 1 only is shown for illustration).

accumulate 1

It is easy to combine the solutions of sp 111 and sp 112: just compare 8 from sp 111 with 1 from sp 112, update 8's number of surpassers if necessary, and leave sp 112 alone, as we discussed during divide. The same applies to sp 121 and sp 122. Then we get:

accumulate 2

Now both sp 11 and sp 12 have more than one element. In order to get the solution for sp 1, sp 12 can stay put. How about sp 11? An obvious approach is to compare every element in sp 11 with every element in sp 12, and update their numbers of surpassers accordingly. This is a candidate for the accumulate action; however, it is O(n^2). We need to accumulate better.
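For concreteness, the rejected O(n^2) candidate might look like this (a hypothetical `naive_accumulate`, using the same (x, ns) pair representation as the final code, shown only to be discarded):

```ocaml
(* naive accumulate: each element is a pair (x, ns) of a value and its
   surpasser count so far; every element of the solved left half is
   compared against every element of the solved right half *)
let naive_accumulate left right =
  List.map
    (fun (x, ns) ->
       (x, ns + List.length (List.filter (fun (y, _) -> y > x) right)))
    left
  @ right
```

It produces correct counts, but each of the O(n) left elements scans all O(n) right elements, hence the quadratic cost.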

We said at the very beginning of this post, for our trivial solution, that the original order of the list matters. However, is the order still important after we get the solution (for a sub-problem)?

accumulate 3

As we can see, once the answer of sp 11 is obtained, the order between 8 and 1 doesn't matter, as they no longer rely on each other for their numbers of surpassers.

If we can obtain the solution in a sorted manner, it will help us a lot. For example, assume the resulting lists for sp 11 and sp 12 are sorted like this:

accumulate 4

Then we can avoid comparing every pair of elements by using merge like this:

accumulate 5

We can see that 8 in the left-hand-side list doesn't have to be compared to -10 any more. However, this example does not show the full picture yet. If we keep track of the length of the resulting list on the right-hand side, we can save even more comparisons. Let's assume both sp 1 and sp 2 have been solved as sorted lists with lengths attached.

accumulate 6

We begin our merge.

accumulate 7

Have you noticed the fascinating part? Because -10 < -2, without going further down the resulting list on the right-hand side, we can directly update the number of surpassers of -10 and move it out. Why? Because -2 is the smallest element on the right, and if it is bigger than -10, then the rest of the elements on the right must all be bigger than -10 too. Through only one comparison (instead of 4), we get the number of surpassers.

Thus, as long as the solutions of all sub-problems are sorted lists with their lengths attached, we can accumulate like this:

  1. Compare the heads hd1 and hd2, from the two sorted resulting lists l1 and l2, respectively
  2. If hd1 >= hd2, then hd2 gets out; go to 1 with an updated length for l2
  3. If hd1 < hd2, then hd1 gets out, and its ns gets updated by adding the length of l2 to its existing value; go to 1 with an updated length for l1

The full process of accumulating sp 1 and sp 2 is illustrated as follows:

accumulate 8

Two things might need to be clarified:

  1. Although we assumed the resulting lists of sub-problems to be sorted, they naturally become sorted anyway, because we are doing smaller-goes-first merging.
  2. We need to attach the lengths to each resulting list on the right and keep updating them, because scanning a list for its length takes O(n).

Obviously this way of accumulating is O(n). Because we can divide at most O(logn) times, our divide and conquer and accumulate solution is O(nlogn).


At first, we divide. Note that this version of divide is actually a kind of splitting from the middle, as the original order of the elements matters before we get any solution.

(* we have a parameter n to indicate the length of l.
   it will be passed by the caller and
   in this way, we do not need to scan l for its length every time.

   it returns left, right and the length of the right. *)
let divide l n =
  let m = n / 2 in
  let rec aux left i = function
    | [] -> List.rev left, [], 0
    | right when i >= m -> List.rev left, right, n-i
    | hd::tl -> aux (hd::left) (i+1) tl
  in
  aux [] 0 l

Now accumulate. We put it before conquer because conquer calls it, so it must be defined first.

let accumulate l1 l2 len2 =
  let rec aux acc len2 = function
    | l, [] | [], l -> List.rev_append acc l
    | ((hd1,ns1)::tl1 as left), ((hd2,ns2)::tl2 as right) ->
      if hd1 >= hd2 then aux ((hd2,ns2)::acc) (len2-1) (left, tl2)
      else aux ((hd1,ns1+len2)::acc) len2 (tl1, right)
  in
  aux [] len2 (l1, l2)

conquer is the controller.

(* note the list parameter is a list of tuple, i.e., (x, ns) *)
let rec conquer n = function  
  | [] | _::[] as l -> l
  | l ->
    let left, right, right_len = divide l n in
    let solution_sp1 = conquer (n-right_len) left in
    let solution_sp2 = conquer right_len right in
    accumulate solution_sp1 solution_sp2 right_len

So if we are given a list of numbers, we can now get all numbers of surpassers:

let numbers_of_surpassers l =  
  List.map (fun x -> x, 0) l |> conquer (List.length l)

Are we done? No! We should find the max number of surpassers among them:

(* we should always consider the situation where no answer can be given, by using **option**, although it is a bit troublesome *)
let max_num_surpassers = function  
  | [] -> None
  | l ->
    let nss = numbers_of_surpassers l in
    Some (List.fold_left (fun max_ns (_, ns) -> max max_ns ns) 0 nss)
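For a quick sanity check, the whole pipeline can be assembled into one self-contained program (the definitions mirror the ones above; the sample input and expected answer are my own):

```ocaml
(* the three functions assembled in one place *)
let divide l n =
  let m = n / 2 in
  let rec aux left i = function
    | [] -> List.rev left, [], 0
    | right when i >= m -> List.rev left, right, n - i
    | hd :: tl -> aux (hd :: left) (i + 1) tl
  in
  aux [] 0 l

let accumulate l1 l2 len2 =
  let rec aux acc len2 = function
    | l, [] | [], l -> List.rev_append acc l
    | ((hd1, ns1) :: tl1 as left), ((hd2, ns2) :: tl2 as right) ->
      if hd1 >= hd2 then aux ((hd2, ns2) :: acc) (len2 - 1) (left, tl2)
      else aux ((hd1, ns1 + len2) :: acc) len2 (tl1, right)
  in
  aux [] len2 (l1, l2)

let rec conquer n = function
  | [] | _ :: [] as l -> l
  | l ->
    let left, right, right_len = divide l n in
    let solution_sp1 = conquer (n - right_len) left in
    let solution_sp2 = conquer right_len right in
    accumulate solution_sp1 solution_sp2 right_len

let numbers_of_surpassers l =
  List.map (fun x -> (x, 0)) l |> conquer (List.length l)

let max_num_surpassers = function
  | [] -> None
  | l ->
    let nss = numbers_of_surpassers l in
    Some (List.fold_left (fun max_ns (_, ns) -> max max_ns ns) 0 nss)
```

For `[7; 1; 5; 2; 9]`, `numbers_of_surpassers` returns the sorted list `[(1, 3); (2, 1); (5, 1); (7, 1); (9, 0)]`, and `max_num_surpassers` gives `Some 3`.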

[1] Unless I can see an optimal solution instantly, I always try to think of the most straightforward one, even if it sometimes sounds stupid. I believe this is not a bad habit. After all, many good solutions come out of brute-force ones. As long as we have a solution at all, we can work further from it and see whether we can make it better.

[2] Yes, I believe the dragon lady will end the game and win the throne. It is the circle and fate.

by Jackson Tale at February 21, 2015 01:24 AM

February 20, 2015

OCaml Platform

Improving the OCaml documentation toolchain

Last week, we published an alpha version of a new OCaml documentation generator, codoc 0.2.0. In the 2014 OCaml workshop presentation (abstract, slides, video), we mentioned the 'module wall' for documentation and this attempts to fix it. To try it out, simply follow the directions in the README on that repository, or browse some samples of the current, default output of the tool. Please do bear in mind codoc and its constituent libraries are still under heavy development and are not feature complete.

codoc's aim is to provide a widely useful set of tools for generating OCaml documentation. In particular, we are striving to:

  1. Cover all of OCaml's language features
  2. Provide accurate name resolution and linking
  3. Support cross-linking between different packages
  4. Expose interfaces to the components we've used to build codoc
  5. Provide a magic-free command-line interface to the tool itself
  6. Reduce external dependencies and default integration with other tools

We haven't yet achieved all of these at all levels of our tool stack but are getting close. codoc 0.2.0 is usable today (if a little rough in some areas like default CSS). This post outlines the architecture of the new system to make it easier to understand the design decisions that went into it.

The five stages of documentation

There are five stages in generating documentation from OCaml source code. Here we describe how each was handled in the past (using OCamldoc), the present (using our current prototype), and the future (using the final version of the tools we are developing).

Associating comments with definitions

The first stage is to associate the various documentation comments in an .ml or .mli file with the definitions that they correspond to.


Associating comments with definitions is handled by the OCamldoc tool, which does this in two steps. First it parses the file using the regular OCaml parser or camlp4, just as in normal compilation. It uses the syntax tree from the first step and then re-parses the file looking for comments. This second parse is guided by the location information in the syntax tree; for example if there is a definition which ends on line 5 then OCamldoc will look for comments to attach to that definition starting at line 6.

The rules used for attaching comments are quite intricate and whitespace dependent. This makes it difficult to parse the file and attach comments using a single parser. In particular, it would be difficult to do so in a way that doesn't cause a lot of problems for camlp4 extensions. This is why OCamldoc does the process in two steps.

A disadvantage of this two-step approach is that it assumes that the input to any preprocessor is something which could reasonably be read by the compiler/tool creating documentation, which may not always be the case.


Our current prototype associates comments with definitions within the compiler itself. This relies on a patch to the OCaml compiler (pull request #51 on GitHub). Comment association is activated by the -doc command-line flag. It uses (a rewritten version of) the same two-step algorithm currently used by OCamldoc. The comments are then attached to the appropriate node in the syntax tree as an attribute. These attributes are passed through the type-checker and appear in .cmt/.cmti files, where they can be read by other tools.


We intend to move away from the two-step approach taken by OCamldoc. To do this we will need to simplify the rules for associating comments with definitions. One suggestion was to use the same rules as attributes, however that seems to be overly restrictive. So the approach we hope to take is to keep quite close to what OCamldoc currently supports, but disallow some of the more ambiguous cases. For example,

val x : int
(** Is this for x or y? *)
val y : float

may well not be supported in our final version. We will take care to understand the impact of such design decisions and we hope to arrive at a robust solution for the future. By avoiding the two-step approach, it should be safe to always turn on comment association rather than requiring a -doc command-line flag.
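For contrast, a layout in which each comment is adjacent to exactly one item, with a blank line separating it from its neighbour, leaves no ambiguity to resolve. This is a hypothetical sketch of the kind of layout one would expect to remain supported, not a statement of the final rules:

```ocaml
(* each doc comment here touches exactly one item, so no
   whitespace heuristic is needed to attach it *)
module type Example = sig
  val x : int
  (** This comment is adjacent only to [x]. *)

  (** This comment is adjacent only to [y]. *)
  val y : float
end
```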

Parsing the contents of comments

Once you have associated documentation comments with definitions, you must parse the contents of these comments.


OCamldoc parses the contents of comments.


In our current prototype, the contents of comments are parsed in the compiler, so that the documentation attributes available in .cmt/.cmti files contain a structured representation of the documentation.


We intend to separate parsing the contents of documentation comments from the compiler. This means that the documentation will be stored as strings within the .cmt/.cmti files and parsed by external tools. This will allow the documentation language (and its parser) to evolve faster than the distribution cycle of the compiler.

Representing compilation units with types and documentation

The typed syntax tree stored in .cmt/.cmti files is not a convenient representation for generating documentation from, so the next stage is to convert the syntax tree and comments into some suitable intermediate form. In particular, this allows .cmt files and .cmti files to be treated uniformly.


OCamldoc generates an intermediate form from a syntax tree, a typed syntax tree, and the comments that it found and parsed in the earlier stages. The need for both an untyped and typed syntax tree is a historical artefact that is no longer necessary.


Our current prototype creates an intermediate form in the doc-ock library. This form can currently be created from .cmti files or .cmi files. .cmi files do not contain enough information for complete documentation, but you can use them to produce partial documentation if the .cmti files are not available to you.

This intermediate form can be serialised to XML using doc-ock-xml.


In the final version, doc-ock will also support reading .cmt files.

Resolving references

Once you have a representation for documentation, you need to resolve all the paths and references so that links can point to the correct locations. For example,

(** This type is used by {!Foo} *)
type t = Bar.t

The path Bar.t and the reference Foo must be resolved so that the documentation can include links to the corresponding definitions.

If you are generating documentation for a large collection of packages, there may be more than one module called Foo. So it is important to be able to work out which one of these Foos the reference is referring to.

Unlike most languages, resolving paths can be very difficult in OCaml due to the powerful module system. For example, consider the following code:

module Dep1 : sig
  module type S = sig
    class c : object
      method m : int
    end
  end
  module X : sig
    module Y : S
  end
end

module Dep2 :
  functor (Arg : sig module type S module X : sig module Y : S end end) ->
  sig
    module A : sig
      module Y : Arg.S
    end
    module B = A.Y
  end

type dep1 = Dep2(Dep1).B.c

Here it looks like Dep2(Dep1).B.c would be defined by a type definition c within the submodule B of the functor Dep2. However, Dep2.B's type actually depends on the type of Dep2's Arg parameter, so the actual definition is the class definition within the module type S of the Dep1 module.


OCamldoc does resolution using a very simple string-based lookup. This is not designed to handle collections of projects, where module names are not unique. Nor is it sophisticated enough to handle advanced uses of OCaml's module system (e.g. it fails to resolve the path Dep2(Dep1).B.c in the above example).


In our current prototype, path and reference resolution are performed by the doc-ock library. The implementation amounts to a reimplementation of OCaml's module system that tracks additional information required to produce accurate paths and references (it is also lazy to improve performance). The system uses the digests provided by .cmti/.cmi files to resolve references to other modules, rather than just relying on the module's name.


There are still some paths handled incorrectly by doc-ock-lib, which will be fixed, but mostly the final version will be the same as the current prototype.

Producing output

Finally, you are ready to produce some output from the tools.


OCamldoc supports a variety of output formats, including HTML and LaTeX. It also includes support for plugins called "custom generators" which allow users to add support for additional formats.


codoc only supports HTML and XML output at present, although extra output formats such as JSON should be very easy to add once the interfaces settle down. codoc defines a documentation index XML format for tracking package hierarchies, documentation issues, and hierarchically localized configuration.

codoc also defines a scriptable command-line interface giving users access to its internal documentation phases: extraction, linking, and rendering. The latest instructions on how to use the CLI can be found in the README. We provide an OPAM remote with all the working versions of the new libraries and compiler patches required to drive the new documentation engine.


As previously mentioned, codoc and its constituent libraries doc-ock-lib and doc-ock-xml are still under heavy development and are not yet feature complete. Notably, there are some important outstanding issues:

  1. Class and class type documentation has no generated HTML. (issue codoc#9)
  2. CSS is subpar. (issue codoc#27)
  3. codoc HTML does not understand --package. (issue codoc#42)
  4. opam doc is too invasive (temporary for demonstration purposes; tracked by (issue codoc#48))
  5. Documentation syntax errors are not reported in the correct phase or obviously enough. (issue codoc#58)
  6. Character sets are not handled correctly (issue doc-ock-lib#43)
  7. -pack and cmt extraction are not supported (issue doc-ock-lib#35 and issue doc-ock-lib#3)
  8. Inclusion/substitution is not supported (issue doc-ock-lib#2)

We are very happy to take bug reports and patches at https://github.com/dsheets/codoc/issues. For wider suggestions, comments, complaints and discussions, please join us on the Platform mailing list. We do hope that you'll let us know what you think and help us build a next generation documentation tool which will serve our community admirably.

by OCaml Platform Team at February 20, 2015 12:00 AM

February 18, 2015

Heidi Howard

Part 1: Running your own DNS Resolver with MirageOS

The following is the first part in a step-by-step guide to setting up your own DNS resolver using MirageOS. I will be running this on a low power, low cost ARM device called the Cubieboard 2. Up to date code for each version of the DNS resolver is on Github. This guide assumes some basic experience of lwt and MirageOS, up to the level of the Hello World Tutorial.

Feedback on this article and pull requests to the demo code are welcome.

Part 1.1 – Setting up the cubieboard with MirageOS

Plenty of information on setting up a cubieboard with Xen and MirageOS is available elsewhere.

For debugging, I am a big fan of Wireshark. I run a full Wireshark session on the machine which is connection-sharing to my cubieboard network, to check all external traffic.

For this guide, I will always be compiling for the Xen ARM backend, with a direct network connection via br0 and a static IP for all unikernels. My test network router is configured to give a static IP of the form 192.168.1.x to hosts with the MAC address 00:00:00:00:00:0x. As a result, my config.ml file looks like:

open Mirage

let ip_config : ipv4_config = {
  address  = Ipaddr.V4.make 192 168 1 2;
  netmask  = Ipaddr.V4.make 255 255 255 0;
  gateways = [Ipaddr.V4.make 192 168 1 1];
}

let client =
  foreign "Unikernel.Client" @@ console @-> stackv4 @-> job

let () =
  add_to_ocamlfind_libraries [ "dns.mirage" ];
  register "dns-client"
    [ client $ default_console $ direct_stackv4_with_static_ipv4 default_console tap0 ip_config ]

Since the IP address of the unikernel is 192.168.1.2, before launching the unikernel I do:

echo "vif = [ 'mac=00:00:00:00:00:02,bridge=br0' ]" >> dns-client.xl

I build the unikernel using the usual commands:

mirage configure --xen
make depend; make; make run
# edit file.xl
sudo xl create -c file.xl

Part 1.2 – Getting Started

The following is the complete code for a unikernel which queries a DNS server for a DNS domain and prints to console the IP address returned.

open Lwt
open V1_LWT

let domain = "google.com"
let server = Ipaddr.V4.make 8 8 8 8

module Client (C:CONSOLE) (S:STACKV4) = struct

  module U = S.UDPV4
  module DNS = Dns_resolver_mirage.Make(OS.Time)(S)

  let start c s =
    let t = DNS.create s in
    OS.Time.sleep 2.0
    >>= fun () ->
    C.log_s c ("Resolving " ^ domain)
    >>= fun () ->
    DNS.gethostbyname t ~server domain
    >>= fun rl ->
    Lwt_list.iter_s
      (fun r ->
         C.log_s c ("Answer " ^ (Ipaddr.to_string r))
      ) rl

end

This unikernel will query a DNS server at 8.8.8.8 (Google's public DNS resolver) for the domain google.com. Here we are using the simple function DNS.gethostbyname, which has the following type signature:

  val gethostbyname : t ->
    ?server:Ipaddr.V4.t -> ?dns_port:int ->
    ?q_class:Dns.Packet.q_class ->
    ?q_type:Dns.Packet.q_type ->
    string -> Ipaddr.t list Lwt.t

This returns a list of IPs, which we then iterate over with Lwt_list.iter_s and print to the console.

Part 1.3 – Boot time parameters

Hardcoding the server and domain is far from ideal; instead we will provide them at boot time with Bootvar. The interface of Bootvar is below:

type t
(* read boot parameter line and store in assoc list - expected format is "key1=val1 key2=val2" *)
val create: unit -> t Lwt.t

(* get boot parameter *)
val get: t -> string -> string option

(* get boot parameter; raises Not_found *)
val get_exn: t -> string -> string
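How the boot-parameter line gets turned into an assoc list is not shown above. As a hypothetical sketch (the function name parse_bootline and the plain-stdlib approach are my own, not Bootvar's actual implementation, which must also read the line from Xen), the parsing half could look like:

```ocaml
(* split a boot line of the form "key1=val1 key2=val2" into an
   assoc list, dropping any fragment without an '=' *)
let parse_bootline line =
  String.split_on_char ' ' line
  |> List.filter (fun s -> s <> "")
  |> List.filter_map (fun kv ->
      match String.index_opt kv '=' with
      | None -> None
      | Some i ->
        Some (String.sub kv 0 i,
              String.sub kv (i + 1) (String.length kv - i - 1)))
```

For example, `parse_bootline "domain=google.com server=8.8.8.8"` yields `[("domain", "google.com"); ("server", "8.8.8.8")]`.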

We can now use this to provide domain and server at boot time instead of compile time:

let start c s =
  Bootvar.create () >>= fun bootvar ->
  let domain = Bootvar.get_exn bootvar "domain" in
  let server = Ipaddr.V4.of_string_exn (Bootvar.get_exn bootvar "server") in

Part 1.4 – Using Resolve

Now, a real DNS resolver needs to accept many more parameters (for any DNS query) and return full DNS responses, not just IP addresses. Thus we need to move on from DNS.gethostbyname to the less abstract resolve function:

  val resolve :
    (module Dns.Protocol.CLIENT) ->
    t -> Ipaddr.V4.t -> int ->
    Dns.Packet.q_class ->
    Dns.Packet.q_type ->
    Dns.Name.domain_name ->
    Dns.Packet.t Lwt.t 

We can achieve the same result as gethostbyname as follows:

    DNS.resolve (module Dns.Protocol.Client) t server 53 Q_IN Q_A (string_to_domain_name domain)
    >>= fun r ->
    let ips =
    List.fold_left (fun a x ->
      match x.rdata with
      | A ip -> (Ipaddr.V4 ip) :: a
      | _ -> a ) [] r.answers in

We are now explicit about parameters such as the port, class and type. Note that we have opened the Dns.Name and Dns.Packet modules. The return value of resolve is a Dns.Packet.t; we fold over its answers to produce an Ipaddr.V4 list, as with gethostbyname. We can also use the to_string function in Packet to print the full response.

I’ve taken a break to do some refactoring work on the ocaml-dns library. In the next post, Part 2, we will expand our code to a DNS stub resolver.


by Heidi-ann at February 18, 2015 05:08 PM