OCaml Planet

May 26, 2016

Github OCaml jobs

Full Time: Software Developer (Functional Programming) at Jane Street in New York, NY; London, UK; Hong Kong

Software Developer

Jane Street is a proprietary quantitative trading firm, focusing primarily on trading equities and equity derivatives. We use innovative technology, a scientific approach, and a deep understanding of markets to stay successful in our highly competitive field. We operate around the clock and around the globe, employing over 400 people in offices in New York, London and Hong Kong.

The markets in which we trade change rapidly, but our intellectual approach changes faster still. Every day, we have new problems to solve and new theories to test. Our entrepreneurial culture is driven by our talented team of traders and programmers. At Jane Street, we don't come to work wanting to leave. We come to work excited to test new theories, have thought-provoking discussions, and maybe sneak in a game of ping-pong or two. Keeping our culture casual and our employees happy is of paramount importance to us.

We are looking to hire great software developers with an interest in functional programming. OCaml, a statically typed functional programming language with similarities to Haskell, Scheme, Erlang, F# and SML, is our language of choice. We've got the largest team of OCaml developers in any industrial setting, and probably the world's largest OCaml codebase. We use OCaml for running our entire business, supporting everything from research to systems administration to trading systems. If you're interested in seeing how functional programming plays out in the real world, there's no better place.

The atmosphere is informal and intellectual. There is a focus on education, and people learn about software and trading, both through formal classes and on the job. The work is challenging, and you get to see the practical impact of your efforts in quick and dramatic terms. Jane Street is also small enough that people have the freedom to get involved in many different areas of the business. Compensation is highly competitive, and there's a lot of room for growth.

You can learn more about Jane Street and our technology from our main site, janestreet.com. You can also look at a a talk given at CMU about why Jane Street uses functional programming (http://ocaml.janestreet.com/?q=node/61), and our programming blog (http://ocaml.janestreet.com).

We also have extensive benefits, including:

  • 90% book reimbursement for work-related books
  • 90% tuition reimbursement for continuing education
  • Excellent, zero-premium medical and dental insurance
  • Free lunch delivered daily from a selection of restaurants
  • Catered breakfasts and fresh brewed Peet's coffee
  • An on-site, private gym in New York with towel service
  • Kitchens fully stocked with a variety of snack choices
  • Full company 401(k) match up to 6% of salary, vests immediately
  • Three weeks of paid vacation for new hires in the US
  • 16 weeks fully paid maternity/paternity leave for primary caregivers, plus additional unpaid leave

More information at http://janestreet.com/culture/benefits/

May 26, 2016 01:11 PM

May 20, 2016

OCaml Labs compiler hacking

Thirteenth OCaml compiler hacking evening at Pembroke College

It's Spring, and time for the thirteenth Cambridge OCaml compiler-hacking evening! We'll be back in Pembroke College again, in the centre of Cambridge.

Pembroke

If you're planning to come along, it'd be helpful if you could indicate interest via Doodle and sign up to the mailing list to receive updates.

Where: N7, Pembroke College, Cambridge CB2 1RF. Head through the entrance on Trumpington Street, and turn left straight after the Porters Lodge.

When: 7pm, Tuesday 31st May.

Who: anyone interested in improving OCaml. Knowledge of OCaml programming will obviously be helpful, but prior experience of working on OCaml internals isn't necessary.

What: fixing bugs, implementing new features, learning about OCaml internals

Wiki: https://github.com/ocamllabs/compiler-hacking/wiki

We're defining "compiler" pretty broadly, to include anything that's part of the standard distribution, which means at least the standard library, runtime, tools (ocamldep, ocamllex, ocamlyacc, etc.), the debugger, the documentation, and the compiler itself. We'll have suggestions for mini-projects for various levels of experience, but feel free to come along and work on whatever you fancy.

We've arranged a self-service buttery dinner at 6:30pm - please head to the buttery when you arrive and help yourself to the food choices available. You can then bring your dinner to the Old Library (just next door) to sit down and eat.

by OCaml Labs (cl-ocamllabs@lists.cam.ac.uk) at May 20, 2016 10:30 PM

May 13, 2016

Marc Simpson

Tail call optimisation in (g)awk

Tail call optimisation in (g)awk

# May 13, 2016

or “Wait, what? Tail call optimisation in awk?”

Overview

This post covers tail call optimisation (TCO) behaviour in three common awk implementations1: gawk, mawk and nawk (AKA the one true awk).

None of the three implement full TCO, while gawk alone provides self-TCO. The bulk of this post will therefore be devoted to gawk’s implementation and related pitfalls.

Initial observations

Let’s begin with a simple awk script that defines a single function, recur, called from the BEGIN block:

$ nawk 'function recur() {return recur()} BEGIN {recur()}'
Segmentation fault: 11
$ mawk 'function recur() {return recur()} BEGIN {recur()}'
Segmentation fault: 11
$ gawk 'function recur() {return recur()} BEGIN {recur()}'
# ...runs indefinitely [?]...

Note the difference in behaviour here: nawk and mawk blow the stack and segfault while gawk cheerily continues running. Thanks gawk.

But wait! Gawk is actually dynamically allocating additional stack frames—so long as there’s memory (and swap) to consume, gawk will gobble it up and our script will plod on. Below, the first 30 seconds of (virtual) memory consumption are charted2:

Whoops!

The gawk optimiser

In order to obtain (self-)TCO and spare your poor swap partition, gawk provides the -O switch,

$ gawk -O 'function foo() {return recur()} BEGIN {recur()}'
# ...runs indefinitely; air conditioning no longer required...

and lo and behold,

Doubling down

What about full TCO? Let’s expand our one liner a little to include a trampoline call:

$ gawk -O 'function go() {return to()} function to() {return go()} BEGIN {go()}'

and chart memory consumption again,

Bugger. So, it looks like gawk isn’t keen on full blown TCO. Time to find out why.

The secret sauce

We’ve just seen that gawk seems to optimise self-calls in tail position when the -O flag is specified. To better understand this functionality we can dump opcodes from the trampoline case and take a look under the hood:

$ echo 'function go() {return to()} function to() {return go()} BEGIN {go()}' > /tmp/trampoline.awk
$ gawk --debug -O -f /tmp/trampoline.awk
gawk> dump

	# BEGIN

[     1:0x7fc00bd022e0] Op_rule             : [in_rule = BEGIN] [source_file = /tmp/trampoline.awk]
[     1:0x7fc00bd02360] Op_func_call        : [func_name = go] [arg_count = 0]
[      :0x7fc00c800f60] Op_pop              :
[      :0x7fc00c800e20] Op_no_op            :
[      :0x7fc00c800ea0] Op_atexit           :
[      :0x7fc00c800f80] Op_stop             :
[      :0x7fc00c800e60] Op_no_op            :
[      :0x7fc00bd01e00] Op_after_beginfile  :
[      :0x7fc00c800e40] Op_no_op            :
[      :0x7fc00c800e80] Op_after_endfile    :

	# Function: go ()

[     1:0x7fc00bd01f20] Op_func             : [param_cnt = 0] [source_file = /tmp/trampoline.awk]
[     1:0x7fc00bd020a0] Op_func_call        : [func_name = to] [arg_count = 0]
[     1:0x7fc00bd01fb0] Op_K_return         :
[      :0x7fc00c800ee0] Op_push_i           : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[      :0x7fc00c800f00] Op_K_return         :

	# Function: to ()

[     1:0x7fc00bd02130] Op_func             : [param_cnt = 0] [source_file = /tmp/trampoline.awk]
[     1:0x7fc00bd02270] Op_func_call        : [func_name = go] [arg_count = 0]
[     1:0x7fc00bd021f0] Op_K_return         :
[      :0x7fc00c800f20] Op_push_i           : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[      :0x7fc00c800f40] Op_K_return         :

Note the lack of a distinct jump or tailcall opcode; instead, even with the optimiser turned on, go and to are performing Op_func_calls. Hmm, okay; we’ll see a different opcode in our original recur case, though, right? Wrong:

$ echo 'function recur() {return recur()} BEGIN {recur()}' > /tmp/recur.awk
$ gawk --debug -O -f /tmp/recur.awk
gawk> dump

	# BEGIN

[     1:0x7fc1d0408ef0] Op_rule             : [in_rule = BEGIN] [source_file = /tmp/recur.awk]
[     1:0x7fc1d0408f80] Op_func_call        : [func_name = recur] [arg_count = 0]
[      :0x7fc1d0802120] Op_pop              :
[      :0x7fc1d0802020] Op_no_op            :
[      :0x7fc1d08020a0] Op_atexit           :
[      :0x7fc1d0802140] Op_stop             :
[      :0x7fc1d0802060] Op_no_op            :
[      :0x7fc1d0408bc0] Op_after_beginfile  :
[      :0x7fc1d0802040] Op_no_op            :
[      :0x7fc1d0802080] Op_after_endfile    :

	# Function: recur ()

[     1:0x7fc1d0408ce0] Op_func             : [param_cnt = 0] [source_file = /tmp/recur.awk]
[     1:0x7fc1d0408e60] Op_func_call        : [func_name = recur] [arg_count = 0]
[     1:0x7fc1d0408d70] Op_K_return         :
[      :0x7fc1d08020e0] Op_push_i           : Nnull_string [MALLOC|STRING|STRCUR|NUMCUR|NUMBER]
[      :0x7fc1d0802100] Op_K_return         :

¯\_(ツ)_/¯

Time to dig around gawk’s grammar definition. Here’s return, defined in awkgram.y:

| LEX_RETURN
  {
    if (! in_function)
        yyerror(_("`return' used outside function context"));
  } opt_exp statement_term {
    if ($3 == NULL) {
        $$ = list_create($1);
        (void) list_prepend($$, instruction(Op_push_i));
        $$->nexti->memory = dupnode(Nnull_string);
    } else {
        if (do_optimize
            && $3->lasti->opcode == Op_func_call
            && strcmp($3->lasti->func_name, in_function) == 0
        ) {
            /* Do tail recursion optimization. Tail
             * call without a return value is recognized
             * in mk_function().
             */
            ($3->lasti + 1)->tail_call = true;
        }

        $$ = list_append($3, $1);
    }
    $$ = add_pending_comment($$);
  }

Take a closer look at the code following that comment:

if (do_optimize
  && $3->lasti->opcode == Op_func_call
  && strcmp($3->lasti->func_name, in_function) == 0
) { /* ... */
  ($3->lasti + 1)->tail_call = true; /* <--- */
}

In other words, during a return gawk:

  1. Checks whether the do_optimize flag (-O) is specified.
  2. If so, it checks whether the previous instruction is an Op_func_call.
  3. If that call is to a function with the same name as the current one,
  4. …the tail_call flag is set.

So it goes.

At last, a conclusion

Here’re a few takeaways from the above3:

  • Don’t rely on TCO if you’re writing awk.
  • Just don’t.
  • If you do need TCO, make sure you’re using gawk
    • Be sure to specify the -O flag otherwise you’ll need to buy a new fan,
    • and make sure you’re not trampolining as the optimiser won’t be of any help.

Personally, I’ll be sticking with nawk.


  1. Probably the most common.

  2. Output drawn from ps

  3. YMMV


<script type="text/javascript"> var disqus_shortname = '0branch'; // required: replace example with your forum shortname /* * * DON'T EDIT BELOW THIS LINE * * */ (function() { var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true; dsq.src = 'http://' + disqus_shortname + '.disqus.com/embed.js'; (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq); })(); </script> <noscript>Please enable JavaScript to view the comments powered by Disqus.</noscript> comments powered by Disqus

by Marc Simpson at May 13, 2016 11:20 AM

OCamlCore Forge News

Batteries 2.5.0 released, compatible with OCaml 4.03

See the detailed release announcement at https://sympa.inria.fr/sympa/arc/caml-list/2016-05/msg00082.html

by Gabriel Scherer at May 13, 2016 05:35 AM

April 28, 2016

Heidi Howard

Do you want a shed or a castle?

I have seen the error of my (programming) ways. Let me explain…

To me, programming in OCaml is like trying to build a house from just breeze blocks. It takes a long time to build even a simple shed. However. when its done, its really quite solid.

To me, programming in Go is like building a house from an array of complex pre-build components. In the blink of an eye, you have an amazing castle, complete with turrets and ornate window frames.

You open the door to your beautiful new castle and it all fails down. Each time you rebuild one part, another falls down.

You are full of regrets as you sleep in the wreckage of your fallen castle and wish for a solid shed.

Another fallen castle – rod collier [CC BY-SA 2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons

Yours truly,

Someone fighting to hold up a fallen castle

EDIT: here’s  some more example of what a falling castle looks like

Screen Shot 2016-04-28 at 14.46.46

Screen Shot 2016-04-28 at 14.52.34

by Heidi-ann at April 28, 2016 01:27 PM

April 21, 2016

Shayne Fletcher

Oh! Pascal!

I can't help but want to share my joy at coming across this pearl of a program from the "Pascal User Manual and Report" - Jensen and Wirth (circa 1974). In my edition, it's program 4.7 (graph1.pas).

This is it, rephrased in OCaml.


(* Graph of f x = exp (-x) * sin (2 * pi * x)

Program 4.7, Pascal User Manual and Report, Jensen & Wirth
*)

let round (x : float) : int =
let f, i =
let t = modf x in
(fst t, int_of_float@@ snd t) in
if f = 0.0 then i
else if i >= 0 then
if f >= 0.5 then i + 1 else i
else if -.f >= 0.5 then i - 1 else i

let graph (oc : out_channel) : unit =
(*The x-axis runs vertically...*)
let s = 32. in (*32 char widths for [y, y + 1]*)
let h = 34 in (*char position of x-axis*)
let d = 0.0625 in (*1/16, 16 lines for [x, x + 1]*)
let c = 6.28318 in (* 2pi *)
let lim = 32 in
for i = 0 to lim do
let x = d *. (float_of_int i) in
let y = exp (-.x) *. sin (c *. x) in
let n = round (s *. y) + h in
for _ = n downto 0 do output_char oc ' '; done;
output_string oc "*\n"
done

let () = print_newline (); graph stdout; print_newline ()

The output from the above is wonderful :)


*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*

by Shayne Fletcher (noreply@blogger.com) at April 21, 2016 03:31 PM