# Typst-Unlit

*[tangled.org/@oppi.li/typst-unlit](https://tangled.org/@oppi.li/typst-unlit)*

*Serves: 1 Prep Time: 10min Compile Time: 10ms*

A literate program is one where comments are first-class citizens, and
code is explicitly demarcated, as opposed to a regular program, where
comments are explicitly marked, and code is a first-class entity.

GHC supports literate programming out of the box, by using a
preprocessor to extract code from documents. This preprocessor is known
as *unlit*[^1]. GHC also supports *custom* preprocessors, which can be
passed in via the `-pgmL` flag. This very document you are reading is
one such preprocessor: it allows embedding Haskell code inside Typst
files[^2].

This recipe not only gives you a fish (the typst-unlit preprocessor),
but also teaches you how to fish (write your own preprocessors).

## Ingredients

<table>
<colgroup>
<col style="width: 50%" />
<col style="width: 50%" />
</colgroup>
<tbody>
<tr>
<td><p>To write your own preprocessor:</p>
<ul>
<li>GHC: the Glorious Haskell Compiler</li>
<li>Typst: to generate PDFs</li>
<li>And that’s it! No stacking, shaking or caballing here.</li>
</ul></td>
<td><p>To compile this very document:</p>
<ul>
<li>The bootstrap program</li>
<li>GHC: to produce an executable program</li>
<li>Typst: to produce a readable PDF</li>
</ul></td>
</tr>
</tbody>
</table>

**Pro Tip:** If you’re missing any ingredients, your local nixpkgs
should stock them!

## Instructions

The idea behind the unlit program is super simple: iterate over the
lines in the supplied input file and replace lines that aren’t Haskell
with an empty line! To detect lines that are Haskell, we look for the
```` ```haskell ```` directive and stop at the end of the code fence.
Simple enough! Annoyingly, Haskell requires that imports be declared at
the top of the file. This results in literate Haskell programs always
starting with a giant block of imports:

> -- So first we need to get some boilerplate and imports out of the way.

-- Every literate programmer

Oh gee, if only we had a tool to put the important stuff first. Our
preprocessor will remedy this wart, with the `haskell-top` directive to
move blocks to the top. With that out of the way, let’s move on to the
program itself!

### Step 1: The main course

I prefer starting with `main` but you do you. Any program that is passed
to `ghc -pgmL` has to accept exactly 4 arguments:

- `-h`: ignore this for now
- `<label>`: ignore this for now
- `<infile>`: the input literate Haskell source code
- `<outfile>`: the output Haskell source code

Invoke the runes to handle CLI arguments:

```
main = do
  args <- getArgs
  case args of
    ["-h", _label, infile, outfile] -> process infile outfile
    _ -> die "Usage: typst-unlit -h <label> <source> <destination>"
```

You will need these imports accordingly (notice how I am writing my
imports *after* the main function!):

```
import System.Environment (getArgs)
import System.Exit (die)
```

Now, we move on to defining `process`:

### Step 2: The processor

`process` does a bit of IO to read from the input file, remove comments,
and write to the output file. `removeComments`, however, is a pure
function:

```
process :: FilePath -> FilePath -> IO ()
process infile outfile = do
  ls <- lines <$> readFile infile
  writeFile outfile $ unlines $ removeComments ls
```

### Step 3: Removing comments

We will be iterating over lines in the file, and wiping clean those
lines that are not Haskell. To do so, we must track some state as we
will be jumping in and out of code fences:

```
data State
  = OutsideCode
  | InHaskell
  | InHaskellTop
  deriving (Eq, Show)
```

To detect the code fences themselves, we can define a few matcher
functions. Here is one for the ```` ```haskell ```` pattern:

```
withTag :: (String -> Bool) -> String -> Bool
withTag pred line = length ticks > 2 && pred tag
  where (ticks, tag) = span (== '`') line

isHaskell :: String -> Bool
isHaskell = withTag (== "haskell")
```

You will notice that this will also match ````` ````haskell `````, and
this is intentional. If your text already contains 3 backticks inside
it, you will need 4 backticks in the code fence and so on.

We do the same exercise for `haskell-top`:

```
isHaskellTop = withTag (== "haskell-top")
```

And for the closing code fences:

```
isCodeEnd = withTag null
```

`removeComments` itself is just a filter that takes a list of lines
and removes comments from those lines:

```
removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []
```

Finally, `go` is a recursive function that starts with some `State`, a
list of input lines, and two more empty lists that are used to store the
lines of code that go at the top (using the `haskell-top` directive),
and the ones that go below (using the `haskell` directive):

```
go :: State -> [String] -> [String] -> [String] -> [String]
```

When there are no input lines left, we just combine the `top` and
`bot` stacks of lines to form the file. Both stacks are built in
reverse, so we flip them first:

```
go _ [] top bot = reverse top ++ reverse bot
```

Next, whenever we are `OutsideCode` and the current line contains a
directive, we must update the state to enter a code block:

```
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x = go InHaskell rest top ("" : bot)
  | otherwise = go OutsideCode rest top ("" : bot)
```

When we are already inside a Haskell code block, encountering a
triple-tick should exit the code block, and any other line encountered
in the block is to be included in the final file, but below the imports:

```
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskell rest top (x : bot)
```

And similarly, for blocks that start with the `haskell-top` directive,
lines encountered here go into the `top` stack:

```
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise = go InHaskellTop rest (x : top) bot
```

And that’s it! Gently tap the baking pan against the table and let your
code settle. Once it is set, you can compile the preprocessor like so:

```
ghc -o typst-unlit typst-unlit.hs
```

And now, we can execute our preprocessor on literate Haskell files!

## Serving

To test our preprocessor, first, write a literate Haskell file
containing your Typst code:

````
= Quicksort in Haskell
The first thing to know about Haskell's syntax is that parentheses
are used for grouping, and not for function application.

```haskell
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs
```

The parentheses indicate the grouping of operands on the
right-hand side of equations.
````

Remember to save that as a `.lhs` file, say `quicksort.lhs`. Now you can
compile it with both `ghc` …

```
ghci -pgmL ./typst-unlit quicksort.lhs
GHCi, version 9.10.3: https://www.haskell.org/ghc/ :? for help
[1 of 2] Compiling Main ( quicksort.lhs, interpreted )
Ok, one module loaded.
ghci> quicksort [3,2,4,1,5,4]
[1,2,3,4,4,5]
```

… and `typst`:

```
typst compile quicksort.lhs
```

And there you have it! One file that can be interpreted by `ghc` and
rendered beautifully with `typst` simultaneously.

#### Notes

This entire document is just a bit of ceremony around writing
preprocessors. The Haskell code in this file can be summarized in this
shell script:

```
#!/usr/bin/env bash

# this does the same thing as typst-unlit.lhs, but depends on `typst` and `jq`

typst query "$3" 'raw.where(lang: "haskell-top")' | jq -r '.[].text' > "$4"
typst query "$3" 'raw.where(lang: "haskell")' | jq -r '.[].text' >> "$4"
```

[^1]: <https://gitlab.haskell.org/ghc/ghc/-/tree/master/utils/unlit>

[^2]: This document needs itself to compile itself! This is why a
    bootstrap program is included.
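
If you would like a quick taste test of the pure part of the recipe before wiring it up to GHC, here is a minimal sketch. It restates the matcher functions and `go` from the steps above in condensed form and runs them on a tiny made-up input. The names `fence` and `sample` are assumptions introduced only for this illustration, and `main` here prints the result instead of handling CLI arguments, so treat it as a sanity check rather than the preprocessor itself.

```haskell
module Main where

-- Same three states as in Step 3.
data State = OutsideCode | InHaskell | InHaskellTop
  deriving (Eq, Show)

-- A line is a fence when it starts with at least 3 backticks;
-- whatever follows the backticks is the tag.
withTag :: (String -> Bool) -> String -> Bool
withTag p line = length ticks > 2 && p tag
  where (ticks, tag) = span (== '`') line

isHaskell, isHaskellTop, isCodeEnd :: String -> Bool
isHaskell    = withTag (== "haskell")
isHaskellTop = withTag (== "haskell-top")
isCodeEnd    = withTag null

removeComments :: [String] -> [String]
removeComments ls = go OutsideCode ls [] []

-- Every input line produces exactly one output line:
-- either a line of code, or an empty placeholder.
go :: State -> [String] -> [String] -> [String] -> [String]
go _ [] top bot = reverse top ++ reverse bot
go OutsideCode (x : rest) top bot
  | isHaskellTop x = go InHaskellTop rest top ("" : bot)
  | isHaskell x    = go InHaskell rest top ("" : bot)
  | otherwise      = go OutsideCode rest top ("" : bot)
go InHaskell (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise   = go InHaskell rest top (x : bot)
go InHaskellTop (x : rest) top bot
  | isCodeEnd x = go OutsideCode rest top ("" : bot)
  | otherwise   = go InHaskellTop rest (x : top) bot

-- Hypothetical helper: three backticks, built with replicate so that
-- this example does not contain a literal fence of its own.
fence :: String
fence = replicate 3 '`'

-- Hypothetical input: the import block is deliberately written last.
sample :: [String]
sample =
  [ "= Title"
  , fence ++ "haskell"
  , "main = print (sort [3, 1, 2])"
  , fence
  , fence ++ "haskell-top"
  , "import Data.List (sort)"
  , fence
  ]

main :: IO ()
main = mapM_ print (removeComments sample)
```

Running this prints seven lines, one per input line: the `import` line comes first, followed by two empty strings, then the `main = ...` line, then three more empty strings. The import has been hoisted above the code even though it was written after it, and every non-Haskell line (including the fences themselves) has been blanked out.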