OCamlcore Planet

July 02, 2009

Mauricio Fernandez

Hash tables: separate chaining vs. double hashing

This entry uses jsMath to display mathematical expressions. Please go to the main site if some fail to render.

In my earlier finite map benchmarks which compared several hash tables and functional maps (AVL trees and ternary trees), separate chaining (with $$\alpha = 0.47$$) beat double hashing ($$\alpha = 0.42$$) at unsuccessful searches (1% hit rate), but was slower at successful ones.

Theory predicts the following average number of probes for unsuccessful and successful lookups (expressions below): Unsuccessful searches, same load factor Successful searches, same load factor

Separate chaining looks much better here, but the graphs are misleading, because we don't actually care as much about the load factor as about memory usage, so we want to compare the collision resolution schemes when the latter is the same.

If we use a linked list for the collision chain, this represents one word of overhead per item compared to double hashing. For instance, if the load factor with separate chaining is 1, we can afford to have a table twice as big with double hashing for the same amount of memory and a load factor of 50%. In other words, $$N + n = N'$$ where $$N$$ and $$N'$$ are the sizes of the tables for separate chaining and double hashing, and $$n$$ the number of elements. The respective load factors are

\[\begin{eqnarray*} \alpha & = & \frac{n}{N}\\ \alpha' & = & \frac{n}{N'}\\ & = & \frac{n}{N+n}\\ \alpha' & = & \frac{\alpha}{1+\alpha}\end{eqnarray*} \]

The expressions for the average number of probes for unsuccessful and successful searches are (Knuth, TAOCP Vol 3, Section 6.4), for separate chaining

Read more...

2009-07-02T13:22:19Z

Mauricio Fernandez

Math typesetting with jsMath

This entry uses jsMath to display mathematical expressions. Please go to the main site if some fail to render.

I've added math typesetting support to Ocsiblog, which runs this site, using jsMath: (double-click on the equations to see the sources)

\[\begin{eqnarray*} \nabla.D & = & \rho\\ \nabla.B & = & 0\\ \nabla\times E & = & -\frac{\partial B}{\partial t}\\ \nabla\times H & = & j+\frac{\partial D}{\partial t}\end{eqnarray*} \]

When jsMath is enabled, you'll see in the bottom right-hand corner a button that triggers a control panel allowing you to set several options such as the scale or the rendering mechanism (TeX fonts, image fonts or native Unicode fonts; you can pick the one that works best for you and save the configuration --- the best one is TeX fonts, if you have them, but Unicode fonts is quite decent). For best viewing, you can get the TeX fonts here (the page includes detailed install instructions for Windows, OSX and Un*x).

2009-07-02T10:21:46Z

July 01, 2009

Erik de Castro Lopo

Three More for the Debian New Queue.

Over the last couple of weeks I've managed to get three new packages into the Debian NEW queue :

  • haskell-polyparse - A variety of alternative parser combinator libraries for Haskell (this is a dependency for later versions of HaXml).
  • haskell-dataenc - A Haskell library of data encoders and decoders like Base64, uuencoding etc.
  • haskell-json - Haskell library for serialising data to and from JSON.

Thanks to Simon Horman for sponsoring/uploading the first two of and Matt Palmer for sponsoring/uploading haskell-json.

2009-07-01T09:00:25Z

OCamlCore Forge News

Concurrent Cell now supports Dvar.

Concurrent Cell version 2.0.0 now supports Dvar. Module [Dvar] provides a kind of mutable variable that can make dependency between variables. These dependencies can be resolved automatically when the value of a variable is updated. This system is similar to Functional Reactive Programming.

by Satoshi Ogasawara at 2009-07-01T01:13:17Z

June 30, 2009

Caml Weekly News

Caml Weekly News, 30 Jun 2009

New HLVM examples! / Other Caml News

2009-06-30T12:00:00Z

June 29, 2009

The Caml Humps

Camomile 0.7.2

Camomile is a comprehensive Unicode library for OCaml. Camomile provides Unicode character type, UTF-8, UTF-16, UTF-32 strings, conversion to/from about 200 encodings, collation and locale-sensitive case mappings, and more. The library is currently designed for Unicode Standard 3.2.

2009-06-29T06:01:59Z

OCamlCore Forge Projects

RegSTAB

RegSTAB is a SAT-Solver extended with the possibility to handle formulas such that /\i=1..n P_i where n is a variable i.e. it can deal with propositional formula schemas.

2009-06-29T05:00:34Z

June 25, 2009

David Mentré

Transition to OCaml 3.11.1 has started in Debian

Debian The 24th of June, the Debian package for new upstream OCaml 3.11.1 has been uploaded. Thus upload marks the start of the transition to OCaml 3.11.1 in Debian Sid (aka unstable). Right now, the new package has been successfully built on most of Debian supported architectures.

Following this first round, other packages are being uploaded in successive rounds. You can follow this transition on Stéphane Glondu's dedicated OCaml transition monitor.[1]

The count-down has started. We will see how much time it will take to do this transition for the 138 source packages.

by d. at 2009-06-25T11:00:00Z

June 23, 2009

Caml Weekly News

Caml Weekly News, 23 Jun 2009

OCaml Meeting 2009 in Tokyo / All OCaml packages synchronised for OCaml 3.11.0 in Ubuntu Karmic / Inflection lib for ocaml ? / Obj.magic and existential types / Camomile 0.7.2 / Other Caml News

2009-06-23T12:00:00Z

June 21, 2009

FP-Sydney

FP-Syd #16.

On Thursday June 18th we held the 16th meeting of FP-Syd, the Sydney Functional Programming group. The meeting was held at Google's Sydney offices and we had about 25 people attending to hear our two presentations.

This month we tried something a little different, 5 minute lightning talks. We had four lightning presenters :

  • Jeremy Apthorp, "Closures in C" - Jeremy explained how the Clang frontend for LLVM had added lexical closures to the C programming language.
  • Jeeva Suresh, "Functional Programming Warps your mind - Quantified"- Jeeva quantified exactly how programming in functional languages changes a programmer's approach when programming in other languages.
  • Mark Bradley, "Cycling Josephus for Speed" - Mark show how a naive approach to the Josephus Problem in Haskell would, for some example parameters, exhaust all memory. Mark showed a better solution that satisfied the problem but still consumed a large amount of memory. Interestingly, Benjamin Johnston posted a solution to this problem in SWI Prolog that performed very well in comparison to the Haskell version.
  • Ben Lippmeier, "The Poisoning Problem" - Ben showed us an example of the poisoning problem which had cropped up in his DDC compiler where a value might need to be mutable or immutable depending on the value of an if statement.

Interestingly, none of the presenters managed to stay within their allotted 5 minutes but the time limits were not enforced.

Our main speaker of the evening was Eric Willigers who gave us an excellent introduction to the Clojure language, a LISP like language for the Java Virtual Machine. Eric started with the main data structures; lists, vectors, maps and sets, moved on to functions and macros before explaining Clojure's concurrency primitives.

All in all this was another most inspiring and enjoyable meeting. Thanks to all our speakers as well as Shane Stephens and Google for providing the venue.

2009-06-21T03:03:00Z