commit ea240d0a39cf27bae358718b7611859a9059c4cf · anil.recoil.org/thicket-eeg

+5 -5

index.json

···

       242
       242
        
           "lucasma": {

     

       243
       243
        
             "username": "lucasma",

     

       244
       244
        
             "feeds": [

     

       245
       245
       -
               "https://lucasma8795.github.io/blog/feed/ocaml-effects-scheduling.xml%E2%81%A6%E2%80%84"

     

       245
       245
       +
               "https://lucasma8795.github.io/blog/feed/ocaml-effects-scheduling.xml"

     

       246
       246
        
             ],

     

       247
       247
        
             "directory": "lucasma",

     

       248
       248
        
             "created": "2025-08-10T19:13:04.454217",

     

       249
       249
       -
             "last_updated": "2025-08-10T19:13:04.454223",

     

       250
       250
       -
             "entry_count": 0

     

       249
       249
       +
             "last_updated": "2025-08-10T19:13:45.290261",

     

       250
       250
       +
             "entry_count": 5

     

       251
       251
        
           }

     

       252
       252
        
         },

     

       253
       253
        
         "created": "2025-07-15T16:04:07.657530",

     

       254
       254
       -
         "last_updated": "2025-08-10T19:13:04.454235",

     

       255
       255
       -
         "total_entries": 271

     

       254
       254
       +
         "last_updated": "2025-08-10T19:13:45.290262",

     

       255
       255
       +
         "total_entries": 276

     

       256
       256
        
       }

+19

lucasma/blog_2025_07_04_effects-scheduling-w01.json

···

       1
       1
       +
       {

     

       2
       2
       +
         "id": "https://lucasma8795.github.io/blog/2025/07/04/effects-scheduling-w01",

     

       3
       3
       +
         "title": "Effects-based scheduling for the OCaml compiler - w01",

     

       4
       4
       +
         "link": "https://lucasma8795.github.io/blog/2025/07/04/effects-scheduling-w01.html",

     

       5
       5
       +
         "updated": "2025-07-04T08:00:00",

     

       6
       6
       +
         "published": "2025-07-04T08:00:00",

     

       7
       7
       +
         "summary": "This is a series of blog posts documenting my progress for an internship at the University of Cambridge. This project explores the potential of using OCaml\u2019s effect handlers and domains in place of the current separate build system (dune, make) to self-schedule compilation of missing dependencies on-the-fly.",

     

       8
       8
       +
         "content": "<p>This is a series of blog posts documenting my progress for an internship at the University of Cambridge. This project explores the potential of using OCaml\u2019s <a href=\"https://ocaml.org/manual/5.3/effects.html\">effect handlers</a> and <a href=\"https://ocaml.org/manual/5.3/parallelism.html\">domains</a> in place of the current separate build system (dune, make) to self-schedule compilation of missing dependencies on-the-fly.</p>\n\n<p>My knowledge with functional programming, at this point, basically only came from the <a href=\"https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/\">CST 1A Foundations course</a>. To catch up, much of the first few days were spent studying the <a href=\"https://ocaml.org/manual/5.3/effects.html\">OCaml effect handler</a>, and the rest were spent poking around in the OCaml compiler. Here is what I\u2019ve picked up so far:</p>\n\n<h3>Continuations</h3>\n\n<p>A <a href=\"https://en.wikipedia.org/wiki/First-class_citizen\">first-class</a> continuation <code>k</code>, informally, is a callable that represents \u201cthe rest of the computation\u201d, held at a given point in execution. In other words, it is a snapshot of the control flow at a given moment. 
This is made explicit in the <a href=\"https://en.wikipedia.org/wiki/Continuation-passing_style\">continuation-passing style (CPS)</a> of a program, where control is passed explicitly in the form of continuations <code>k : 'a -&gt; unit</code>, where <code>'a</code> is the type of an intermediate result:</p>\n\n<div><div><pre><code><span>let</span> <span>eq</span> <span>x</span> <span>y</span> <span>k</span> <span>=</span> <span>k</span> <span>(</span><span>x</span> <span>=</span> <span>y</span><span>)</span>\n<span>let</span> <span>sub</span> <span>x</span> <span>y</span> <span>k</span> <span>=</span> <span>k</span> <span>(</span><span>x</span> <span>-</span> <span>y</span><span>)</span>\n<span>let</span> <span>mul</span> <span>x</span> <span>y</span> <span>k</span> <span>=</span> <span>k</span> <span>(</span><span>x</span> <span>*</span> <span>y</span><span>)</span>\n\n<span>let</span> <span>rec</span> <span>factorial</span> <span>n</span> <span>k</span> <span>=</span>\n  <span>eq</span> <span>n</span> <span>0</span> <span>(</span><span>fun</span> <span>b</span> <span>-&gt;</span>\n    <span>if</span> <span>b</span> <span>then</span>\n      <span>k</span> <span>1</span>\n    <span>else</span>\n      <span>sub</span> <span>n</span> <span>1</span> <span>(</span><span>fun</span> <span>m</span> <span>-&gt;</span>\n        <span>factorial</span> <span>m</span> <span>(</span><span>fun</span> <span>x</span> <span>-&gt;</span>\n          <span>mul</span> <span>n</span> <span>x</span> <span>k</span><span>)))</span>\n\n<span>(* 120 should appear in stdout *)</span>\n<span>factorial</span> <span>5</span> <span>(</span><span>fun</span> <span>ret</span> <span>-&gt;</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"%d</span><span>\\n</span><span>\"</span> <span>ret</span><span>)</span>\n</code></pre></div></div>\n\n<p>This is somewhat analogous to <code>setjmp</code>/<code>longjmp</code> in C.</p>\n\n<p>(side note: notice that in CPS, all calls must be 
tail-calls!)</p>\n\n<h3>OCaml algebraic effect handlers</h3>\n\n<p><em>Delimited continuations</em> generalize continuations, in the sense that we now capture the context only up to a delimiter (read: slice of a stack frame). Naturally, unlike continuations, <em>delimited</em> continuations can meaningfully return values, and not just <code>unit</code>.</p>\n\n<p>OCaml (algebraic) effect handlers generalize <a href=\"https://ocaml.org/docs/error-handling\">exception handlers</a>, in the sense that the handler is provided with the delimited continuation of the call site, whereas exceptions do not have access to a \u201ccontinuation mechanism\u201d. Here is a nice example, courtesy of <a href=\"https://github.com/ocaml-multicore/ocaml-effects-tutorial\">this tutorial</a>:</p>\n\n<div><div><pre><code><span>type</span> <span>_</span> <span>Effect</span><span>.</span><span>t</span> <span>+=</span> <span>Conversion_failure</span> <span>:</span> <span>string</span> <span>-&gt;</span> <span>int</span> <span>Effect</span><span>.</span><span>t</span>\n\n<span>let</span> <span>int_of_string</span> <span>l</span> <span>=</span>\n  <span>try</span> <span>int_of_string</span> <span>l</span> <span>with</span>\n  <span>|</span> <span>Failure</span> <span>_</span> <span>-&gt;</span> <span>perform</span> <span>(</span><span>Conversion_failure</span> <span>l</span><span>)</span>\n\n<span>let</span> <span>rec</span> <span>sum_up</span> <span>acc</span> <span>=</span>\n    <span>let</span> <span>l</span> <span>=</span> <span>input_line</span> <span>stdin</span> <span>in</span>\n    <span>acc</span> <span>:=</span> <span>!</span><span>acc</span> <span>+</span> <span>int_of_string</span> <span>l</span><span>;</span>\n    <span>sum_up</span> <span>acc</span>\n\n<span>let</span> <span>()</span> <span>=</span>\n  <span>let</span> <span>acc</span> <span>=</span> <span>ref</span> <span>0</span> <span>in</span>\n  <span>match_with</span> <span>sum_up</span> <span>acc</span>\n  <span>{</span>\n 
   <span>effc</span> <span>=</span> <span>(</span><span>fun</span> <span>(</span><span>type</span> <span>c</span><span>)</span> <span>(</span><span>eff</span><span>:</span> <span>c</span> <span>Effect</span><span>.</span><span>t</span><span>)</span> <span>-&gt;</span>\n      <span>match</span> <span>eff</span> <span>with</span>\n      <span>|</span> <span>Conversion_failure</span> <span>s</span> <span>-&gt;</span>\n        <span>Some</span> <span>(</span>\n          <span>fun</span> <span>(</span><span>k</span><span>:</span> <span>(</span><span>c</span><span>,_</span><span>)</span> <span>continuation</span><span>)</span> <span>-&gt;</span> <span>continue</span> <span>k</span> <span>0</span>\n        <span>)</span>\n      <span>|</span> <span>_</span> <span>-&gt;</span> <span>None</span>\n    <span>);</span>\n    <span>exnc</span> <span>=</span> <span>(</span><span>function</span>\n      <span>|</span> <span>End_of_file</span> <span>-&gt;</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"Sum is %d</span><span>\\n</span><span>\"</span> <span>!</span><span>acc</span>\n      <span>|</span> <span>e</span> <span>-&gt;</span> <span>raise</span> <span>e</span>\n    <span>);</span>\n    <span>retc</span> <span>=</span> <span>fun</span> <span>_</span> <span>-&gt;</span> <span>failwith</span> <span>\"impossible\"</span>\n  <span>}</span>\n</code></pre></div></div>\n\n<p>Here, <code>match_with f v h</code> runs the computation <code>f v</code> in the given handler <code>h</code>, and handles the effect <code>Conversion_failure</code> when it is invoked (c.f. <code>try</code>/<code>with</code>).</p>\n\n<p>Effects are performed (invoked) with the <code>perform : 'a Effect.t -&gt; 'a</code> primitive (c.f. <code>raise : exn -&gt; 'a</code>), which hands over control flow to the corresponding delimiting effect handler, and the continuation <code>k</code> is resumed with the <code>continue : ('a, 'b) continuation -&gt; 'a -&gt; 'b</code> primitive. 
(The type <code>('a, 'b) continuation</code> can be read as <code>'a -&gt; 'b</code>, but it is used exclusively for effects, as far as I can tell.)</p>\n\n<h4>\u2026how is this type-checked?</h4>\n\n<p>Effects are declared by adding constructors to an <a href=\"https://ocaml.org/manual/5.3/extensiblevariants.html\">extensible variant type</a> defined in the <code>Effect</code> module. In short, extensible variant types are <a href=\"https://dev.realworldocaml.org/variants.html\">variant types</a> which can be extended with new variant constructors at runtime, with the <code>+=</code> operator. As an aside, this is also how one could extend the built-in exception type <code>exn</code>:</p>\n\n<div><div><pre><code><span>type</span> <span>exn</span> <span>+=</span> <span>Invalid_argument</span> <span>of</span> <span>string</span>\n<span>type</span> <span>exn</span> <span>+=</span> <span>Out_of_memory</span>\n</code></pre></div></div>\n\n<p>(there is of course the <code>exception</code> keyword that one should probably use instead!)</p>\n\n<p>Effects are strongly typed, but the effect handler needs to be able to match against multiple effects at once, and since constructors can be added at runtime, the handler must be generic over every possible effect type (and so we must match against the wildcard <code>_</code>). Returning <code>None</code> makes the handler transparent to that effect (it ignores the effect), allowing it to be captured by an effect handler lower down the call stack. (OCaml effects are unchecked, i.e.: it is a runtime error if an effect is ultimately not handled.)</p>\n\n<p>The syntax <code>fun (type c) (eff: c Effect.t) -&gt; ...</code> makes use of <a href=\"https://ocaml.org/manual/5.3/locallyabstract.html\">locally abstract types</a>. 
This is required for type inference here, when different branches of the pattern-matching have possibly different <code>c</code> (the type of <code>c</code> is \u201clocally collapsed\u201d inside a branch when we have a match). It follows that the scope of <code>c</code> cannot escape a branch.</p>\n\n<p>While reading on this, I <a href=\"https://stackoverflow.com/questions/69144536/what-is-the-difference-between-a-and-type-a-and-when-to-use-each\">came</a> <a href=\"https://discuss.ocaml.org/t/locally-abstract-type-polymorphism-and-function-signature/4523\">across</a> another interesting construct: explicit polymorphism. Turns out, if we write the following in a module interface:</p>\n\n<div><div><pre><code><span>(* foo.mli *)</span>\n<span>val</span> <span>foo</span> <span>:</span> <span>'</span><span>a</span> <span>*</span> <span>'</span><span>b</span> <span>-&gt;</span> <span>'</span><span>a</span>\n</code></pre></div></div>\n\n<p>This would mean what one would think it means: for all types <code>'a</code> and <code>'b</code>, <code>foo</code> must be able to take in a 2-tuple of type <code>'a * 'b</code> and return a result of type <code>'a</code>. However, if we instead write the following in a module implementation:</p>\n\n<div><div><pre><code><span>(* bar.ml *)</span>\n<span>let</span> <span>bar</span> <span>:</span> <span>'</span><span>a</span> <span>*</span> <span>'</span><span>b</span> <span>-&gt;</span> <span>'</span><span>a</span> <span>=</span> <span>fun</span> <span>(</span><span>x</span><span>,</span><span>y</span><span>)</span> <span>-&gt;</span> <span>x</span> <span>+</span> <span>y</span>\n</code></pre></div></div>\n\n<p><code>bar</code> would have the type signature <code>int * int -&gt; int</code>, i.e.: <code>'a</code> and <code>'b</code> are both refined into <code>int</code>. 
This is because in a module implementation, instead of having implicit universal quantifiers in the type signature as we would normally expect, the type checker interprets this as \u201cthere exist types <code>'a</code> and <code>'b</code> that satisfy the definition\u201d.</p>\n\n<p>To force it to take a polymorphic type signature, we declare the polymorphism explicitly, with:</p>\n\n<div><div><pre><code><span>(* bar.ml *)</span>\n<span>let</span> <span>bar</span> <span>:</span> <span>'</span><span>a</span> <span>'</span><span>b</span><span>.</span> <span>'</span><span>a</span> <span>*</span> <span>'</span><span>b</span> <span>-&gt;</span> <span>'</span><span>a</span> <span>=</span> <span>fun</span> <span>(</span><span>x</span><span>,</span><span>y</span><span>)</span> <span>-&gt;</span> <span>x</span> <span>+</span> <span>y</span>\n<span>(* read: forall types 'a and 'b, ... *)</span>\n</code></pre></div></div>\n\n<p>which now fails to compile, as expected.</p>\n\n<h4>\u2026surely this has (significant) overhead?</h4>\n\n<p>No. (Or so I hope!)</p>\n\n<p>OCaml delimited continuations are implemented on top of <em>fibers</em>: small runtime-managed, heap-allocated, dynamically resized call stacks. If we install two effect handlers (corresponding to the two arrows), just before doing a <code>perform</code> in <code>foo</code>, we have the following execution stack:</p>\n\n<div><div><pre><code>+-----+   +-----+   +-----+\n|     |   |     |   |     |\n| baz |&lt;--| bar |&lt;--| foo |\n|     |   |     |   |     |\n+-----+   +-----+   +-----+ &lt;- stack_pointer\n</code></pre></div></div>\n\n<p>Suppose the effect is then performed and is being handled in <code>baz</code>. 
We then have the following stack:</p>\n\n<div><div><pre><code>+-----+                   +-----+   +-----+\n|     |                   |     |   |     |   +-+\n| baz |                   | bar |&lt;--| foo |&lt;--|k|\n|     |                   |     |   |     |   +-+\n+-----+ &lt;- stack_pointer  +-----+   +-----+\n</code></pre></div></div>\n\n<p>The delimited continuation <code>k</code> here is an object on the heap that corresponds to the suspended computation. When the continuation is resumed, the stack is restored to the previous state. (we can safely do this since continuations are <em>one-shot</em> \u2013 they can only be resumed at most once). Notice that it was not necessary to copy any stack frames in the capture and resumption of a continuation; my guess is that they probably have around the same cost as a normal function call?</p>\n\n<h3>So what is it that I\u2019m doing?</h3>\n\n<p>The original project proposal <a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">can be found here</a>.</p>\n\n<p>Currently, the compiler is built with an external build system <a href=\"https://en.wikipedia.org/wiki/Make_(software)\">Make</a>. Compilation units naturally form a directed acyclic graph of (immediate) dependencies, and this is generated and saved in a text file <code>.depend</code>. 
In the Makefile, one can add dependencies to build rules, and thus the build system knows to launch a compiler instance for every compilation unit in dependency order.</p>\n\n<p>The project aims to explore the potential of taking that ability away from the build system, and instead getting the OCaml compiler to effectively \u201cdiscover\u201d the dependency order itself, by launching a copy of itself when it discovers that a dependency is missing.</p>\n\n<h3>Progress so far</h3>\n\n<p>I have <a href=\"https://github.com/lucasma8795/ocaml/commit/708d64a9b5b650b9208c8da85e5ffdd95e8b7bab\">hoisted</a> all the logic in <code>driver/Load_path.ml</code> up to <code>main.ml</code> via effects (performing effects in <code>Load_path.ml</code> and installing an effect handler in <code>main.ml</code>). The point of this is to move the relevant path resolution logic from being buried deep inside the compiler to just below surface level.</p>\n\n<p>I have also successfully performed a <a href=\"https://en.wikipedia.org/wiki/Bootstrapping\">bootstrap cycle</a>, where one builds a compiler with a previously stable version of itself.</p>\n\n<p>The logical next step would be to experiment with code that launches a copy of the compiler whenever a dependency has not been compiled, and eventually merge that with my existing code\u2026</p>",

     

       9
       9
       +
         "content_type": "html",

     

       10
       10
       +
         "author": {

     

       11
       11
       +
           "name": "",

     

       12
       12
       +
           "email": null,

     

       13
       13
       +
           "uri": null

     

       14
       14
       +
         },

     

       15
       15
       +
         "categories": [

     

       16
       16
       +
           "ocaml-effects-scheduling"

     

       17
       17
       +
         ],

     

       18
       18
       +
         "source": "https://lucasma8795.github.io/blog/feed/ocaml-effects-scheduling.xml"

     

       19
       19
       +
       }

+19

lucasma/blog_2025_07_11_effects-scheduling-w02.json

···

       1
       1
       +
       {

     

       2
       2
       +
         "id": "https://lucasma8795.github.io/blog/2025/07/11/effects-scheduling-w02",

     

       3
       3
       +
         "title": "Effects-based scheduling for the OCaml compiler - w02",

     

       4
       4
       +
         "link": "https://lucasma8795.github.io/blog/2025/07/11/effects-scheduling-w02.html",

     

       5
       5
       +
         "updated": "2025-07-11T08:00:00",

     

       6
       6
       +
         "published": "2025-07-11T08:00:00",

     

       7
       7
       +
         "summary": "Hours of refactoring and bug-fixing later, I was able to get the OCaml compiler to invoke itself in another process to compile a missing dependency, then resume the compilation process as usual.",

     

       8
       8
       +
         "content": "<p>Hours of refactoring and bug-fixing later, I was able to get the OCaml compiler to invoke itself in another process to compile a missing dependency, then resume the compilation process as usual.</p>\n\n<p>More specifically, consider the two <code>.ml</code> files below (and their corresponding <code>.mli</code> interface files, omitted):</p>\n\n<div><div><pre><code><span>(* foo.ml *)</span>\n<span>let</span> <span>bar</span> <span>=</span> <span>42</span>\n\n<span>(* program.ml *)</span>\n<span>let</span> <span>()</span> <span>=</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"%d\"</span> <span>Foo</span><span>.</span><span>bar</span>\n</code></pre></div></div>\n\n<p>If we invoke the compiler on <code>program.ml</code> without first compiling <code>foo.ml</code>, clearly it doesn\u2019t work: we are missing a dependency <code>foo.cmi</code>. However, if we catch the exception that would\u2019ve normally been raised by the compiler, in our effect handler:</p>\n\n<div><div><pre><code><span>effc</span> <span>=</span> <span>fun</span> <span>(</span><span>type</span> <span>c</span><span>)</span> <span>(</span><span>eff</span><span>:</span> <span>c</span> <span>Effect</span><span>.</span><span>t</span><span>)</span> <span>-&gt;</span>\n  <span>match</span> <span>eff</span> <span>with</span>\n  <span>(* filename -&gt; filename *)</span>\n  <span>|</span> <span>Load_path</span><span>.</span><span>Find_path</span> <span>fn</span> <span>-&gt;</span>\n    <span>Some</span> <span>(</span><span>fun</span> <span>(</span><span>k</span><span>:</span> <span>(</span><span>c</span><span>,</span> <span>_</span><span>)</span> <span>continuation</span><span>)</span> <span>-&gt;</span>\n      <span>try</span>\n        <span>Effect</span><span>.</span><span>Deep</span><span>.</span><span>continue</span> <span>k</span> <span>(</span><span>find_path</span> <span>fn</span><span>)</span>\n      <span>with</span> <span>Not_found</span> 
<span>-&gt;</span> <span>begin</span>\n        <span>(* missing dependency, we need to compile it\n           imitate what find_path would normally return *)</span>\n        <span>try</span>\n          <span>Effect</span><span>.</span><span>Deep</span><span>.</span><span>continue</span> <span>k</span> <span>(</span><span>compile_dependency</span> <span>fn</span><span>)</span>\n        <span>(* source file not found, give up *)</span>\n        <span>with</span> <span>Not_found</span> <span>-&gt;</span>\n          <span>Effect</span><span>.</span><span>Deep</span><span>.</span><span>discontinue</span> <span>k</span> <span>Not_found</span>\n      <span>end</span>\n    <span>)</span>\n    \n  <span>|</span> <span>...</span>\n</code></pre></div></div>\n\n<p>Invoking <code>./ocamlrun ./ocamlc -c program.ml -I ./stdlib</code>, we find a missing dependency, and <code>compile_dependency: filename -&gt; filename</code> generates the following (hopefully, portable?) command to compile our dependency <code>foo.ml</code> (we inherit the load path from the calling parent):</p>\n\n<div><div><pre><code>'runtime/ocamlrun' './ocamlc' '-c' 'foo.ml' '-I' './stdlib' '-I' ''\n</code></pre></div></div>\n\n<p>\u2026and we then resume compilation for <code>program.ml</code> with the <code>continue</code> primitive.</p>\n\n<p>Linking the object files together, we then get</p>\n\n<div><div><pre><code>\u279c ocamlrun ocamlc foo.cmo program.cmo -I stdlib -o program\n\u279c ocamlrun ./program\n42\n</code></pre></div></div>\n\n<p>as expected!</p>\n\n<p>Using the above, I was then able to trace through the Makefile and build <code>ocamlcommon.cma</code> and <code>ocamlbytecomp.cma</code>, first by building the required <code>.cmo</code> files (in no particular order, and missing <code>.cmi</code> dependencies are auto-discovered and compiled), then linking the objects in dependency order (which is something I\u2019d hope to be able to relax in the future? 
<a href=\"https://lucasma8795.github.io/blog/2025/07/11/effects-scheduling-w02.html#fn:1\">1</a>). With this done, we are only two commands away from producing <code>ocamlc</code>, the OCaml <a href=\"https://ocaml.org/manual/5.3/comp.html\">bytecode compiler</a>:</p>\n\n<div><div><pre><code>ocamlrun ocamlc -c driver/main.ml &lt;compiler flags&gt; &lt;load path&gt;\nocamlrun ocamlc ocamlcommon.cma ocamlbytecomp.cma driver/main.cmo -o ocamlc &lt;compiler flags&gt; &lt;load path&gt;\n</code></pre></div></div>\n\n<p>An issue that I can see coming: the <a href=\"https://ocaml.org/manual/5.2/api/compilerlibref/Load_path.html\">original</a> <code>Load_path</code> module makes the assumption that the contents of the load path don\u2019t change throughout the lifetime of the compiler process, and for a good reason: file system calls are much slower than simply reading from memory, and so the compiler reads in the filenames and directories and caches them in memory. However, we want newly compiled dependencies to be present in the load path state to avoid compiling dependencies twice, and so it now needs to be mutable and synchronized across compiler instances.</p>\n\n<p>For now I\u2019ve added file system calls to avoid overwriting existing <code>.cmi</code> and <code>.cmo</code> files (having to synchronize load path state across independent compiler <em>processes</em> sounds like a lot of pain), but this should be quite straightforward when I eventually transition over to using <a href=\"https://ocaml.org/manual/5.1/parallelism.html\">domains</a>.</p>\n\n<p>The next step would be to work on building the rest of the targets that <code>make install</code> requires; more to come on this\u2026</p>\n\n<div>\n  <ol>\n    <li>\n      <p>Week 5 Lucas here, turns out this was not possible! The initialization order of modules is the order in which they are linked. 
This is a <a href=\"https://en.wikipedia.org/wiki/Total_order\">total order</a> of the modules that respects the dependency graph, but notice that this is not unique, so in general the link order is not a function of the program text. Arbitrarily picking a valid total order also doesn\u2019t work: suppose we had some global state in <code>A</code>, with <code>B</code> and <code>C</code> both trying to read and modify that global state; then the program behaviour would depend on the link order.\u00a0<a href=\"https://lucasma8795.github.io/blog/2025/07/11/effects-scheduling-w02.html#fnref:1\">&#8617;</a></p>\n    </li>\n  </ol>\n</div>",

     

       9
       9
       +
         "content_type": "html",

     

       10
       10
       +
         "author": {

     

       11
       11
       +
           "name": "",

     

       12
       12
       +
           "email": null,

     

       13
       13
       +
           "uri": null

     

       14
       14
       +
         },

     

       15
       15
       +
         "categories": [

     

       16
       16
       +
           "ocaml-effects-scheduling"

     

       17
       17
       +
         ],

     

       18
       18
       +
         "source": "https://lucasma8795.github.io/blog/feed/ocaml-effects-scheduling.xml"

     

       19
       19
       +
       }

+19

lucasma/blog_2025_07_18_effects-scheduling-w03.json

···

       1
       1
       +
       {

     

       2
       2
       +
         "id": "https://lucasma8795.github.io/blog/2025/07/18/effects-scheduling-w03",

     

       3
       3
       +
         "title": "Effects-based scheduling for the OCaml compiler - w03",

     

       4
       4
       +
         "link": "https://lucasma8795.github.io/blog/2025/07/18/effects-scheduling-w03.html",

     

       5
       5
       +
         "updated": "2025-07-18T08:00:00",

     

       6
       6
       +
         "published": "2025-07-18T08:00:00",

     

       7
       7
       +
         "summary": "This week was an extension of last week\u2019s work, where I took my success in building ocamlc with the modified compiler and started to build the rest of the executables that make up the OCaml installation. Ideally, I want to replicate the behaviour of make world &amp;&amp; make install, which builds everything necessary for a complete OCaml installation, including the compiler, the standard library, and the tools that come with it (e.g.: ocamlc, ocamlopt, ocamldep, etc.), and installs it in some directory. To make the entire build process reproducible, I made a shell script that does all of the above. Since I have the compiler find all the dependencies of the .ml files, I can drop all the .mli files in the recipe and have it find them on-the-fly. Having to pull out all the relevant parts from the Makefile was quite the tedious process, but by the end of the week I had it all up and working, and a quick diff between a clean OCaml installation and an installation from my script verifies this:",

     

       8
       8
       +
         "content": "<p>This week was an extension of last week\u2019s work, where I took my success in building <code>ocamlc</code> with the modified compiler and started to build the rest of the executables that make up the OCaml installation. Ideally, I want to replicate the behaviour of <code>make world &amp;&amp; make install</code>, which builds everything necessary for a complete OCaml installation, including the compiler, the standard library, and the tools that come with it (e.g.: <code>ocamlc</code>, <code>ocamlopt</code>, <code>ocamldep</code>, etc.), and installs it in some directory. To make the entire build process reproducible, I made a shell script that does all of the above. Since I have the compiler find all the dependencies of the <code>.ml</code> files, I can drop all the <code>.mli</code> files in the recipe and have it find them on-the-fly. Having to pull out all the relevant parts from the <code>Makefile</code> was quite the tedious process, but by the end of the week I had it all up and working, and a quick <code>diff</code> between a clean OCaml installation and an installation from my script verifies this:</p>\n\n<div><div><pre><code>\u279c diff ./Documents/cambridge/urop/ocaml/install ./Github/ocaml/install <span>-qr</span> | <span>grep</span> <span>\"Only in\"</span>\nOnly <span>in</span> ./Documents/cambridge/urop/ocaml/install/lib/ocaml/compiler-libs: handler_common.cmi\nOnly <span>in</span> ./Documents/cambridge/urop/ocaml/install/lib/ocaml/compiler-libs: handler_common.cmt\nOnly <span>in</span> ./Documents/cambridge/urop/ocaml/install/lib/ocaml/compiler-libs: handler_common.cmti\nOnly <span>in</span> ./Documents/cambridge/urop/ocaml/install/lib/ocaml/compiler-libs: handler_common.mli\n</code></pre></div></div>\n\n<p><code>handler_common.ml</code> is the only new file that I have added to the compiler so far, which installs the effect handler at the entry point of the compiler, so it makes sense that it appears in the diff.</p>",

     

       9
       9
       +
         "content_type": "html",

     

       10
       10
       +
         "author": {

     

       11
       11
       +
           "name": "",

     

       12
       12
       +
           "email": null,

     

       13
       13
       +
           "uri": null

     

       14
       14
       +
         },

     

       15
       15
       +
         "categories": [

     

       16
       16
       +
           "ocaml-effects-scheduling"

     

       17
       17
       +
         ],

     

       18
       18
       +
         "source": "https://lucasma8795.github.io/blog/feed/ocaml-effects-scheduling.xml"

     

       19
       19
       +
       }

+19

lucasma/blog_2025_07_25_effects-scheduling-w04.json

···

       1
       1
       +
       {

     

       2
       2
       +
         "id": "https://lucasma8795.github.io/blog/2025/07/25/effects-scheduling-w04",

     

       3
       3
       +
         "title": "Effects-based scheduling for the OCaml compiler - w04",

     

       4
       4
       +
         "link": "https://lucasma8795.github.io/blog/2025/07/25/effects-scheduling-w04.html",

     

       5
       5
       +
         "updated": "2025-07-25T08:00:00",

     

       6
       6
       +
         "published": "2025-07-25T08:00:00",

     

       7
       7
       +
         "summary": "Now that I have a working prototype of a linear self-scheduling OCaml compiler, the next step was to dispatch compilation tasks in parallel. My idea was to have some sort of process (domain?) pool to submit compilation tasks to, so I got that done fairly quickly:",

     

       8
       8
       +
         "content": "<p>Now that I have a working prototype of a linear self-scheduling OCaml compiler, the next step was to dispatch compilation tasks in parallel. My idea was to have some sort of process (domain?) pool to submit compilation tasks to, so I got that done fairly quickly:</p>\n\n<div><div><pre><code><span>type</span> <span>!</span><span>'</span><span>a</span> <span>promise</span>\n<span>(** Type of a promise, representing an asynchronous return value that will\n    eventually be available. *)</span>\n\n<span>val</span> <span>await</span> <span>:</span> <span>'</span><span>a</span> <span>promise</span> <span>-&gt;</span> <span>'</span><span>a</span>\n<span>(** [await p] blocks the calling domain until the promise [p] is resolved,\n    returning the value if it was resolved, or re-raising the wrapped exception\n    if it was rejected. *)</span>\n\n<span>module</span> <span>Pool</span> <span>:</span> <span>sig</span>\n  <span>type</span> <span>t</span>\n  <span>(** Type of a thread pool. *)</span>\n\n  <span>val</span> <span>create</span> <span>:</span> <span>int</span> <span>-&gt;</span> <span>t</span>\n  <span>(** [create n] creates a thread pool with [n] new domains. *)</span>\n\n  <span>val</span> <span>submit</span> <span>:</span> <span>t</span> <span>-&gt;</span> <span>(</span><span>unit</span> <span>-&gt;</span> <span>'</span><span>a</span><span>)</span> <span>-&gt;</span> <span>'</span><span>a</span> <span>promise</span>\n  <span>(** [submit pool task] submits a task to be executed by the thread pool. *)</span>\n\n  <span>val</span> <span>join_and_shutdown</span> <span>:</span> <span>t</span> <span>-&gt;</span> <span>unit</span>\n  <span>(** [join_and_shutdown pool] blocks the calling thread until all tasks are\n      finished, then closes the thread pool. 
*)</span>\n<span>end</span>\n</code></pre></div></div>\n\n<p>Internally, this is done with an array of <code>Domain.t</code>, and a thread-safe task queue <code>(unit -&gt; unit) TSQueue.t</code>, which is nothing more than a wrapper around <code>'a Queue.t</code> from the standard library. I have identical worker loops that sit on each domain, each checking the queue for a new task whenever its current one completes.</p>\n\n<p>Slight caveat: when a compilation task in the pool is waiting on another dependency to finish compiling, we certainly don\u2019t want to block the entire domain that the task sits on. I needed some way to yield control back to the pool, allow other tasks to run on our domain, then <em>continue</em> the task at the point where it was <em>suspended</em>. (Sound familiar?) This was done with a list of continuations, each paired with a <code>promise</code> that signals the dependency\u2019s completion. To suspend a task, I simply have it raise an effect.</p>\n\n<p>Back to actual compiler work: <a href=\"https://github.com/dra27\">David Allsopp</a> (my supervisor!) suggested that for a first prototype of my parallel scheduler, I should start with <code>Unix.create_process</code> instead of jumping straight into domains, to cut down on the mutable global compiler state that I would initially have to deal with. The idea was to only have the main process compile <code>.ml</code> files, and have it spawn child processes in parallel to compile missing <code>.cmi</code> interfaces; if those missing <code>.cmi</code> interfaces have missing dependencies of their own, they are compiled linearly<a href=\"https://lucasma8795.github.io/blog/2025/07/25/effects-scheduling-w04.html#fn:1\">1</a>, i.e. a child process blocks until its own children are ready. 
The best way to explain this is with an example:</p>\n\n<div><div><pre><code><span>(* A.ml *)</span>\n<span>let</span> <span>foo</span> <span>=</span> <span>42</span>\n<span>let</span> <span>()</span> <span>=</span>\n  <span>Printf</span><span>.</span><span>printf</span> <span>\"foo: %d, bar: %s, sum(baz): %d</span><span>\\n</span><span>\"</span> \n    <span>foo</span> <span>(</span><span>B</span><span>.</span><span>bar</span><span>)</span> <span>(</span><span>List</span><span>.</span><span>fold_left</span> <span>(</span><span>+</span><span>)</span> <span>0</span> <span>C</span><span>.</span><span>baz</span><span>)</span>\n\n<span>(* B.ml *)</span>\n<span>let</span> <span>bar</span> <span>=</span> <span>\"Hello, world!\"</span>\n\n<span>(* C.ml *)</span>\n<span>let</span> <span>baz</span> <span>=</span> <span>[</span><span>1</span><span>;</span> <span>2</span><span>;</span> <span>3</span><span>;</span> <span>4</span><span>;</span> <span>5</span><span>]</span>\n</code></pre></div></div>\n\n<p>(insert <code>{A,B,C}.mli</code> files as appropriate!)</p>\n\n<p>When we invoke our custom <code>ocamlc</code> to compile <code>{A,B,C}.ml</code> (in this order), what then should happen chronologically is:</p>\n\n\n\n<img alt=\"Image 1\" src=\"https://lucasma8795.github.io/blog/public/images/effects_scheduling_1.jpeg\">\n  \nImage 1: Effects-based parallel scheduling between compilation of three modules\n\n\n<ol>\n  <li><code>A.ml</code> starts compiling. One of its dependencies <code>C.mli</code> is missing, which is discovered by our effect handler after an effect is raised somewhere to locate <code>C.mli</code> in the load path. We launch a child process to compile <code>C.mli</code>, then move on immediately.</li>\n  <li><code>B.ml</code> starts compiling. Its only dependency <code>B.mli</code> is missing, so that gets compiled in parallel.</li>\n  <li><code>C.ml</code> starts compiling. 
Its only dependency <code>C.mli</code> is missing, but we already launched a child process to compile it (represented as a dotted line), so we attach the suspended compilation to <code>C.cmi</code> and resume it only when it is ready.</li>\n  <li>Suppose <code>C.cmi</code> is now ready. We can now resume the compilation of <code>C.ml</code> and it should complete successfully, since that was our only dependency.</li>\n  <li><code>A.ml</code> was also waiting on <code>C.cmi</code>, so it can also be resumed. It now hits a second missing dependency <code>B.mli</code>, which we again compile in parallel.</li>\n</ol>\n\n<p>Steps 6 to 10 follow the same logic, as shown in the diagram above. We fold on the list of implementations <code>{A,B,C}.ml</code> until all of them compile successfully. I had most of the code for this written by the end of the week.</p>\n\n<p>Finally, I took a couple of hours out of my weekend to make this website! I used <a href=\"https://jekyllrb.com/\">Jekyll</a>, a static site generator, which was surprisingly pleasant to set up and easy to work with. The source code is publicly available on <a href=\"https://github.com/lucasma8795/lucasma8795.github.io\">GitHub</a>.</p>\n\n<div>\n  <ol>\n    <li>\n      <p>This is actually non-trivial, since the main process wants to launch child processes in parallel, but the child processes want to be linear. I did this by temporarily maintaining two branches of the compiler, one for the main process itself (with all this new fancy parallelism) and one that the main process launches (with our linear compiler from the start of the week). I take my existing compiler and install it to some directory, but instead of using its executables directly, I create a new entry point to replace <code>driver/main.ml</code> and link against the <code>.cma</code> files in the installation to create the parallel compiler. 
This also doubles as a hack for me to use the <code>Unix</code> module in the compiler, since <code>Unix</code> originally needed <code>ocamlc</code> in order to be built, which in turn depends on <code>ocamlcommon.cma</code>, which likely contains whatever I need to modify, and I can\u2019t have those depend on <code>Unix</code>.\u00a0<a href=\"https://lucasma8795.github.io/blog/2025/07/25/effects-scheduling-w04.html#fnref:1\">&#8617;</a></p>\n    </li>\n  </ol>\n</div>",

     

       9
       9
       +
         "content_type": "html",

     

       10
       10
       +
         "author": {

     

       11
       11
       +
           "name": "",

     

       12
       12
       +
           "email": null,

     

       13
       13
       +
           "uri": null

     

       14
       14
       +
         },

     

       15
       15
       +
         "categories": [

     

       16
       16
       +
           "ocaml-effects-scheduling"

     

       17
       17
       +
         ],

     

       18
       18
       +
         "source": "https://lucasma8795.github.io/blog/feed/ocaml-effects-scheduling.xml"

     

       19
       19
       +
       }

+19

lucasma/blog_2025_08_01_effects-scheduling-w05.json

···

       1
       1
       +
       {

     

       2
       2
       +
         "id": "https://lucasma8795.github.io/blog/2025/08/01/effects-scheduling-w05",

     

       3
       3
       +
         "title": "Effects-based scheduling for the OCaml compiler - w05",

     

       4
       4
       +
         "link": "https://lucasma8795.github.io/blog/2025/08/01/effects-scheduling-w05.html",

     

       5
       5
       +
         "updated": "2025-08-01T08:00:00",

     

       6
       6
       +
         "published": "2025-08-01T08:00:00",

     

       7
       7
       +
         "summary": "I started the week off by fixing my parallel scheduler that I\u2019ve started writing end of last week. There was this one bug that simply refused to budge, no matter how many things I\u2019ve thrown at it (you can find the setup from last week\u2019s notes here):",

     

       8
       8
       +
         "content": "<p>I started the week off by fixing my parallel scheduler that I\u2019ve started writing end of last week. There was this one bug that simply refused to budge, no matter how many things I\u2019ve thrown at it (you can find the setup from <a href=\"https://lucasma8795.github.io/blog/2025/07/25/effects-scheduling-w04.html\">last week\u2019s notes here</a>):</p>\n\n<div><div><pre><code>&gt;&gt; Fatal error: Cannot find address for: C.baz\nFatal error: exception Misc.Fatal_error\nRaised at Custom_ocamlc.handle.(fun) in file \"custom_ocamlc.ml\", line 365, characters 45-55\nCalled from Custom_ocamlc in file \"custom_ocamlc.ml\", line 685, characters 2-12\n</code></pre></div></div>\n\n<p>This happened after step 10 of the diagram from last week, during compilation of <code>A.ml</code>.</p>\n\n<p>Continuations capture everything on the call stack, but what they don\u2019t capture is the <em>global state</em> of the compiler. Thankfully, some <a href=\"https://github.com/ocaml/ocaml/pull/9963\">people</a> over at <a href=\"https://github.com/ocaml/merlin\">Merlin</a> have already added a module (<a href=\"https://ocaml.org/manual/5.2/api/compilerlibref/Local_store.html\">Local_store</a>) to the compiler, for them to \u201csnapshot\u201d the global state of the type-checker to move back and forth to type different files. They do this by explicitly registering all global state with <code>s_ref: 'a -&gt; 'a ref</code> in place of <code>ref</code>, which then registers the reference in a list of global bindings. Before we start any compilation, we call <code>fresh: unit -&gt; store</code> once, which <em>snapshots</em> the current global state as the \u201cinitial state\u201d and returns an opaque <code>store</code> type capable of storing a set of global states, initialized to the fresh state. 
This is then used in <code>with_store : store -&gt; (unit -&gt; 'a) -&gt; 'a</code>, which restores the global state to that of the <code>store</code> for the duration of the function and saves any changes back to the <code>store</code>. Subsequent calls to <code>fresh</code> will return a fresh <code>store</code> with values obtained from the snapshot taken at the first call to <code>fresh ()</code>.</p>\n\n<p>This is huge news, because all the missing dependencies would have already been discovered by the time the file has finished type-checking, so most if not all of the global state has already been registered for us. This is what my scheduler looked like, stripped of all unnecessary details:</p>\n\n<div><div><pre><code><span>let</span> <span>suspended_tasks</span> <span>=</span> <span>Queue</span><span>.</span><span>create</span> <span>()</span>\n<span>type</span> <span>_</span> <span>Effect</span><span>.</span><span>t</span> <span>+=</span> <span>Load_path</span> <span>:</span> <span>string</span> <span>-&gt;</span> <span>string</span> <span>Effect</span><span>.</span><span>t</span>\n\n<span>(* start compilation of all .ml files *)</span>\n<span>List</span><span>.</span><span>iter</span> <span>(</span><span>fun</span> <span>ml_file</span> <span>-&gt;</span>\n  <span>let</span> <span>store</span> <span>=</span> <span>fresh</span> <span>()</span> <span>in</span>\n  <span>match</span> <span>with_store</span> <span>store</span> <span>(</span><span>fun</span> <span>()</span> <span>-&gt;</span> <span>compile</span> <span>ml_file</span><span>)</span> <span>with</span>\n  <span>|</span> <span>()</span> <span>-&gt;</span> <span>()</span> <span>(* file compiled successfully *)</span>\n  <span>|</span> <span>effect</span> <span>(</span><span>Load_path</span> <span>dep</span><span>)</span><span>,</span> <span>cont</span> <span>-&gt;</span> <span>(* dep will be a .cmi file *)</span>\n      <span>begin</span> <span>try</span>\n        <span>continue</span> 
<span>cont</span> <span>(</span><span>resolve_full_filename</span> <span>dep</span><span>)</span>\n      <span>with</span> <span>Not_found</span> <span>-&gt;</span>\n        <span>(* we hit a missing dependency, suspend the task *)</span>\n        <span>let</span> <span>full_mli_file</span> <span>=</span> <span>find_interface_source</span> <span>dep</span> <span>in</span>\n        <span>let</span> <span>dep</span> <span>=</span> <span>(</span><span>remove_suffix</span> <span>full_mli_file</span> <span>\".mli\"</span><span>)</span> <span>^</span> <span>\".cmi\"</span> <span>in</span>\n        <span>let</span> <span>pid</span> <span>=</span> <span>compile_process_parallel</span> <span>full_mli_file</span> <span>in</span>\n        <span>Queue</span><span>.</span><span>add</span> <span>(</span><span>pid</span><span>,</span> <span>cont</span><span>,</span> <span>dep</span><span>,</span> <span>store</span><span>)</span> <span>suspended_tasks</span>\n      <span>end</span>\n<span>)</span> <span>files_to_compile</span>\n\n<span>(* fold on suspended tasks until we are done *)</span>\n<span>while</span> <span>not</span> <span>(</span><span>Queue</span><span>.</span><span>is_empty</span> <span>suspended_tasks</span><span>)</span> <span>do</span>\n  <span>let</span> <span>(</span><span>pid</span><span>,</span> <span>cont</span><span>,</span> <span>dep</span><span>,</span> <span>store</span><span>)</span> <span>=</span> <span>Queue</span><span>.</span><span>take</span> <span>suspended_tasks</span> <span>in</span>\n  <span>if</span> <span>process_finished</span> <span>pid</span> <span>then</span> <span>begin</span>\n    <span>(* dependency has finished compiling, we can resume the task *)</span>\n    <span>add_to_load_path</span> <span>dep</span><span>;</span>\n    <span>with_store</span> <span>store</span> <span>(</span><span>fun</span> <span>()</span> <span>-&gt;</span> <span>continue</span> <span>cont</span> <span>dep</span><span>)</span>\n  <span>end</span> <span>else</span>\n    <span>(* re-add the task to the 
queue *)</span>\n    <span>Queue</span><span>.</span><span>add</span> <span>(</span><span>pid</span><span>,</span> <span>cont</span><span>,</span> <span>dep</span><span>,</span> <span>store</span><span>)</span> <span>suspended_tasks</span>\n<span>done</span>\n</code></pre></div></div>\n\n<p>I\u2019m sure this was necessary anyway, but this somehow did not fix the issue! I then spent the better part of two whole days adding print statements all over the type-checker and staring at ridiculously long call stacks, until I came across a fairly innocuous piece of code in <code>typing/env.ml</code>:</p>\n\n<div><div><pre><code><span>let</span> <span>find_same_module</span> <span>id</span> <span>tbl</span> <span>=</span>\n  <span>match</span> <span>IdTbl</span><span>.</span><span>find_same</span> <span>id</span> <span>tbl</span> <span>with</span>\n  <span>|</span> <span>x</span> <span>-&gt;</span> <span>x</span>\n  <span>|</span> <span>exception</span> <span>Not_found</span>\n    <span>when</span> <span>Ident</span><span>.</span><span>persistent</span> <span>id</span> <span>&amp;&amp;</span> <span>not</span> <span>(</span><span>Current_unit</span><span>.</span><span>Name</span><span>.</span><span>is_ident</span> <span>id</span><span>)</span> <span>-&gt;</span>\n      <span>Mod_persistent</span>\n</code></pre></div></div>\n\n<p>At this point I realized that <code>B</code> was being opened successfully in <code>A</code>, going through the <code>Mod_persistent</code> code path above, but somehow <code>C</code> kept on raising <code>Not_found</code> here no matter what I did. This was quite suspicious, as their behaviour should be virtually identical. The first predicate in line 5 couldn\u2019t have been the issue, so it must have been the second that was failing. <code>Current_unit.Name</code> sounds like some mutable global state, and surely something as simple as that must have been captured by <code>Local_store</code>.</p>\n\n<p>It wasn\u2019t! 
So when we resumed compilation of <code>A</code> (in step 10), the compiler thought it was in <code>C</code>, and it makes sense that it couldn\u2019t find <code>C</code>, because it thought we were already in the module <code>C</code>. The fix was:</p>\n\n<div><div><pre><code><span>- let current_unit : Unit_info.t option ref = ref None\n</span><span>+ let current_unit : Unit_info.t option ref = s_ref None\n</span></code></pre></div></div>\n\n<p>It took me two days to add two characters to the compiler! (<a href=\"https://github.com/dra27\">David</a> told me that he once took 5 days to fix a GC bug that changed only a couple of characters, so I guess this was bound to happen at some point\u2026)</p>\n\n<p>At this point, the entry point of the compiler was turning into an 800-line monster, so I decided to spend the rest of the week doing refactoring and logging improvements, in preparation for using domains as the next step.</p>",

     

       9
       9
       +
         "content_type": "html",

     

       10
       10
       +
         "author": {

     

       11
       11
       +
           "name": "",

     

       12
       12
       +
           "email": null,

     

       13
       13
       +
           "uri": null

     

       14
       14
       +
         },

     

       15
       15
       +
         "categories": [

     

       16
       16
       +
           "ocaml-effects-scheduling"

     

       17
       17
       +
         ],

     

       18
       18
       +
         "source": "https://lucasma8795.github.io/blog/feed/ocaml-effects-scheduling.xml"

     

       19
       19
       +
       }