+2
-2
andres/metadata.json
+2
-2
andres/metadata.json
+21
andres/research_outreach_2025_04_13_weekly-notes.json
+21
andres/research_outreach_2025_04_13_weekly-notes.json
···+"summary": "So, I finally!!! finished refactoring the code and included the Gini Coefficient in my analysis. So now my code runs smoothly and gets the metrics at different spatial levels and some metrics at the building level, including a differentiation between Euclidean and Manhattan distances to a public park. So I calculated the green inequality using the total number of trees, the available trees within 10, 25, 50, 75, 100 m for all buildings and the accessibility to parks at the same spatial resolution as deprivation is measured in England (image on the left shows the Gini Coefficient for the total number of trees in London; higher is more unequal).",+"content": "<p>So, I finally!!! finished refactoring the code and included the Gini Coefficient in my analysis. So now my code runs smoothly and gets the metrics at different spatial levels and some metrics at the building level, including a differentiation between Euclidean and Manhattan distances to a public park. So I calculated the green inequality using the total number of trees, the available trees within 10, 25, 50, 75, 100 m for all buildings and the accessibility to parks at the same spatial resolution as deprivation is measured in England (image on the left shows the Gini Coefficient for the total number of trees in London; higher is more unequal).</p>\n\n<p>In addition to this, I\u2019ve been preparing for the ESL technical interview happening this week, which is mostly about projects where I\u2019ve used ML or DL. In preparation for this and for <a href=\"https://earthoutreachonair.withgoogle.com/events/geoforgood25-nyc\">Google\u2019s Geo for Good Summit</a> in late August, which I had in my goals this year (they accept applications until late April), I resumed my work with their Foundation Model (FM)<a href=\"https://ancazugo.github.io/research/outreach/2025/04/13/weekly-notes.html#fn:1\">1</a>. 
When I started doing my PhD, I came across the <a href=\"https://journals.ametsoc.org/view/journals/bams/93/12/bams-d-11-00019.1.xml\">local climate zones (LCZ)</a>, which are a classification of urban areas based on their physical and morphological characteristics, namely buildings and vegetation. There is only one global dataset, published a couple of years ago, that uses some of the same remote sensing products used in the FM. Many approaches to LCZs are done for specific cities by manually labelling regions and training small classifiers (some of them are found in <a href=\"https://www.wudapt.org/\">WUDAPT</a>). My hypothesis is that urban features are not uniform from one city to another; for instance, trees are not the same in London as in Rio de Janeiro, so an open midrise area is different in those two places, because one may have more deciduous trees and the other more palm trees. So, my guess is that this FM model (and other Foundation Models, for that matter) can pick up those differences, given the huge number of variables, to create better land use classes in the urban context. I managed to get a working example from the download part (which I had struggled with in the past; don\u2019t use <a href=\"https://github.com/google/Xee\">XEE</a> just yet, at least for FM) to a PCA (see second image for a representation of LCZ based on FM; each dot is a 100x100 m pixel in London) and a small neural network. My intention is to submit this to the Geo for Good Summit, so I will be working on this in the next months as well and then trying to combine that with my tree infrastructure analysis of England. 
I found <a href=\"https://xbatcher.readthedocs.io/en/latest/\">XBatcher</a> as a good library to generate training batches for Pytorch from Xarray objects, but I am searching for better ways to sample data for the train/test split, particularly in the geospatial context, as XBatcher just generates the chips in the correct size for any n-dimensional array, so the split has to happen before.</p>\n\n<p>In the side project with the Estates division, I managed to sync the internal Esri Python environment with my code (which uses other libraries like <code>pyautocad</code>) and generate the dreaded geodatabase \ud83d\ude02 using <code>arcpy</code>. It\u2019s very difficult to integrate external tools into the Esri ecosystem, so I have to stick to this vector data format, even though there are better ones, as pointed out by <a href=\"https://digitalflapjack.com/\">Michael</a> and <a href=\"https://anil.recoil.org/\">Anil</a>.</p>\n\n<p>Finally, I recently got some good news as well: I successfully passed my French course and I got the Kettle\u2019s Yard award from the Department of Architecture (small travel grant).</p>\n\n<div>\n <ol>\n <li>\n <p>This is not the official name of the model.\u00a0<a href=\"https://ancazugo.github.io/research/outreach/2025/04/13/weekly-notes.html#fnref:1\">↩</a></p>\n </li>\n </ol>\n</div>",
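The Gini Coefficient used in the post above can be computed directly from per-building (or per-area) tree counts. A minimal, dependency-free sketch, assuming the standard sorted-cumulative formulation; function and variable names are mine, not from the actual analysis code:

```python
def gini(values):
    """Gini coefficient of a list of non-negative values.

    0 means perfect equality (every building has the same number of
    trees); values near 1 mean a few buildings hold almost all trees.
    """
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = (2 * sum_i i*x_i) / (n * sum x) - (n + 1) / n, with 1-based i
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Sanity checks: equal counts give 0; extreme concentration gives
# (n - 1) / n, the finite-sample maximum.
assert abs(gini([5, 5, 5, 5])) < 1e-12
assert abs(gini([0, 0, 0, 1]) - 0.75) < 1e-12
```

In practice this would be applied per geographical unit (e.g. per LSOA) and mapped, as in the image the post describes.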
+21
andres/research_outreach_2025_04_20_weekly-notes.json
+21
andres/research_outreach_2025_04_20_weekly-notes.json
···+"summary": "I dedicated a good chunk of last week to prepare for my technical interview for the ESL, for which I read Josh Starmer\u2019s Statquest books and Anil Ananthaswamy\u2019s Why Machines Learn. Would highly recommend the three of them for a brush-up on the fundamentals and a bit of history of ML math.",+"content": "<p>I dedicated a good chunk of last week to prepare for my technical interview for the <a href=\"https://eslab.ai/esl-2025\">ESL</a>, for which I read Josh Starmer\u2019s <a href=\"https://statquest.org/\">Statquest</a> books and Anil Ananthaswamy\u2019s <a href=\"https://anilananthaswamy.com/\">Why Machines Learn</a>. Would highly recommend the three of them for a brush-up on the fundamentals and a bit of history of ML math.</p>\n\n<p>I also had a meeting with <a href=\"https://www.turing.ac.uk/people/researchers/polly-hudson\">Polly Hudson</a> and <a href=\"https://www.ioer.de/en/institute/staff/herold\">Hendrik Herold</a> where we discussed how we can integrate my tree data into the <a href=\"https://booth.lse.ac.uk/map\">Colouring Cities</a> platform. Also, we talked about the possibility of extracting deprivation and trees from the <a href=\"https://booth.lse.ac.uk/map\">Charles Booth\u2019s Maps</a> from the 19th century in London, with the goal of comparing the trends in the era with current socio-economic metrics.</p>\n\n<p>Finally, in my analysis of green equality I decided to measure the Gini Coefficient only in residential units in England, which account for 85% of all buildings (~25M). But I needed to define the metric that was going to be used as a measure of <em>green wealth</em>. Given that Gini is essentially an aggregate, I used the standard geographical units explained in a <a href=\"https://ancazugo.github.io/research/outreach/2025/03/30/weekly-notes.html\">previous post</a> but instead of taking the raw values, I made use of basic math. 
So for the 3-part of the <a href=\"https://www.330300rule.com/\">3-30-300 rule</a> I had counted the number of trees at different buffer sizes around each building. With that information I calculated the slope of an exponential regression for each building, and that\u2019s my final metric for this part, meaning that greater slopes mean a building is close to several trees, while lower slopes mean that the number of trees does not increase as you go further from the building (see left panel in the image).</p>\n\n<p>For the 300-part I had modified the code to calculate Euclidean distance on top of the already working Manhattan distance from a building to its closest public park. While reading about the differences between equality and equity, I could see that both these metrics could be associated with each concept respectively. I\u2019m not fully sure how to combine these two into one metric that I can input in the Gini calculation, because Gini measures wealth so greater numbers are better, but in this case, greater numbers are worse because they mean that the building is further away from a park. I\u2019ve thought about using a ratio of equality and equity (see right panel in the image) but I need to think about this further, as measures of green inequality are not standardised.</p>\n\n<p>PS: As I\u2019ve mentioned in the past, I normally use R for plotting, which meant that I was normally working with local copies of my datasets in RStudio, mostly because the native installation of R in the Kinabalu HPC has some issues with GDAL, making the sf package impossible to use (essentially the R equivalent of geopandas). There is a workaround though: use conda as an R environment manager and install the libraries via the conda-forge channel, as explained <a href=\"https://medium.com/@tortuecookie/using-r-with-conda-80953395bfe6\">here</a>. However, RStudio cannot SSH into any remote machine, meaning that interactive sessions are not possible. 
Sadly, the new Positron IDE cannot recognise a conda environment as an R interpreter and defaults to Python. But VS Code can, and through the R extension it is possible to set the interpreter to the location of the conda (R) environment and make use of other VS Code stuff like Copilot. So, I guess it\u2019s a bitter bye to RStudio for the foreseeable future.</p>\n\n<p>PS2: Some familiar faces appear in the <a href=\"https://www.youtube.com/watch?v=g-O4rf7_kHw\">latest video</a> about supercomputers released by the University of Cambridge social media team. \ud83e\udd23</p>",
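The 3-part metric above (the slope of an exponential regression of tree counts over buffer radii) can be sketched as a linear least-squares fit on log counts. This is my reading of the post, with invented numbers; the actual code is not shown:

```python
import math

def exp_growth_slope(radii, counts):
    """Least-squares slope b of log(count) = a + b * radius.

    A larger b means tree counts grow quickly as the buffer around a
    building widens; a small b means they barely increase.
    """
    ys = [math.log(c) for c in counts]  # counts must be > 0
    n = len(radii)
    mx, my = sum(radii) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(radii, ys))
    den = sum((x - mx) ** 2 for x in radii)
    return num / den

# Hypothetical buffers matching the post (10, 25, 50, 75, 100 m) and
# made-up counts for a tree-rich vs a tree-poor building.
radii = [10, 25, 50, 75, 100]
tree_rich = [3, 10, 35, 80, 150]  # counts grow quickly with distance
tree_poor = [1, 1, 2, 2, 3]       # counts barely grow

assert exp_growth_slope(radii, tree_rich) > exp_growth_slope(radii, tree_poor)
```

The slope then becomes the per-building "green wealth" value fed into the Gini calculation.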
+21
andres/research_outreach_2025_04_27_weekly-notes.json
+21
andres/research_outreach_2025_04_27_weekly-notes.json
···+"summary": "This week I mostly focused on refining the inequality metric I talked about in last week\u2019s post. After talking with Anil, I decided to try the Euclidean/Manhattan ratio as a measurement to include in the paper, but the problem is that it will be the same for two observations where the distances are equally high or equally low, meaning that the absolute value is lost. Also, Ronita suggested not to frame this as equity, since equity, which is what I was trying to capture, is not actually a quantifiable metric, but just a conceptual framework. On the other hand, (in)equality, using the Gini or Theil index (which is not that common in the literature), is much more reproducible. In the image below, you\u2019ll see what the most \u201cpark-deprived\u201d residential buildings in London are, measured as the walking distance (Manhattan). Now what I need to do is figure out if this actually matches other forms of deprivation and the other green and blue metrics I\u2019ve calculated.",+"content": "<p>This week I mostly focused on refining the inequality metric I talked about in last week\u2019s post. After talking with Anil, I decided to try the Euclidean/Manhattan ratio as a measurement to include in the paper, but the problem is that it will be the same for two observations where the distances are equally high or equally low, meaning that the absolute value is lost. Also, Ronita suggested not to frame this as equity, since equity, which is what I was trying to capture, is not actually a quantifiable metric, but just a conceptual framework. On the other hand, (in)equality, using the Gini or Theil index (which is not that common in the literature), is much more reproducible. In the image below, you\u2019ll see what the most \u201cpark-deprived\u201d residential buildings in London are, measured as the walking distance (Manhattan). 
Now what I need to do is figure out if this actually matches other forms of deprivation and the other green and blue metrics I\u2019ve calculated.</p>\n\n<p>In the side project with the Estates department I\u2019ve hit a wall in the automation of creating the CAD files for import in ArcGIS, as the unique identifiers of the polygons are lost in the processing, meaning that in ArcGIS they are not recognised as individual features. I need to find a way to explode (in CAD terms) the geometries, so they can be correctly identified when generating the geodatabase.</p>\n\n<p>Finally, I\u2019ve resumed and almost finished reading <a href=\"https://www.bloomsbury.com/uk/good-nature-9781526664891/\">Good Nature</a> by Professor Kathy Willis from Oxford. Highly inspiring book and a lot to take away. Nice read to commemorate Earth Day (April 22nd).</p>",
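The drawback of the Euclidean/Manhattan ratio noted above (absolute magnitude is lost) is easy to demonstrate with two invented buildings:

```python
# Two hypothetical buildings: one 50 m from a park, one 5 km away,
# with proportionally similar street layouts. The Euclidean/Manhattan
# ratio is identical, so absolute deprivation becomes invisible.
near = {"euclidean": 50.0, "manhattan": 70.0}      # metres to park
far = {"euclidean": 5000.0, "manhattan": 7000.0}

ratio_near = near["euclidean"] / near["manhattan"]
ratio_far = far["euclidean"] / far["manhattan"]

# Both ratios are 5/7: the ratio alone cannot distinguish a park on
# the doorstep from one an hour's walk away.
assert ratio_near == ratio_far
```

This is why the post settles on the raw walking (Manhattan) distance for the "park-deprived" map instead.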
+21
andres/research_outreach_2025_05_04_weekly-notes.json
+21
andres/research_outreach_2025_05_04_weekly-notes.json
···+"summary": "Last week started off with a Docker Workshop by the Accelerate Programme for Scientific Discovery (APSCI). This was the only workshop run by them that I hadn\u2019t attended yet and I was really looking forward to it, as my experience with Docker had been really limited and I didn\u2019t understand very well how to implement it in my workflow. I would highly recommend the other workshops by them, especially the packaging and publishing one for scientific software.",+"content": "<p>Last week started off with a Docker Workshop by the <a href=\"https://github.com/acceleratescience\">Accelerate Programme for Scientific Discovery</a> (APSCI). This was the only workshop run by them that I hadn\u2019t attended yet and I was really looking forward to it, as my experience with Docker had been really limited and I didn\u2019t understand very well how to implement it in my workflow. I would highly recommend the other workshops by them, especially the packaging and publishing one for scientific software.</p>\n\n<p>I also wrote the documentation for the Python package I\u2019ve been building for the Estates department, for which I used MkDocs (which I learned in one of APSCI\u2019s workshops). I hadn\u2019t used it before and it is so easy to set up and publish in GitHub Pages. I would recommend it alongside the Material theme and the mkdocs-jupyter libraries to integrate Jupyter notebooks straight in the website (see any of <a href=\"https://leafmap.org/\">Qiusheng Wu\u2019s famous package websites</a> in the geospatial community for reference). By learning this, I also accidentally came across CI/CD concepts and managed to set up a GitHub workflow for MkDocs, and found how to set up one for my Quarto website (to be released at some point in June).</p>\n\n<p>I also had an interesting conversation with Anil and one of his MPhil students about walkability maps with my data and OSM. 
We will keep working on it in these next weeks.</p>\n\n<p>The highlight of the week was attending the AI for Nature & Climate meeting at DAB with a bunch of experts from both the domain and technical side, with many familiar faces from around Cambridge. It was a great opportunity to hear about other people\u2019s work and do a bit of networking.</p>\n\n<p>Finally, I finished reading <a href=\"https://www.bloomsbury.com/uk/good-nature-9781526664891/\">Good Nature</a> by Professor Kathy Willis from Oxford. Quite a good literature review of all the benefits of having nature around you, whether indoors or outdoors. The book is divided into chapters explaining the positive effects of nature on the different senses, and it seems that the most impacted sense is actually smell, despite the fact that most research is focused on the visual aspect. Another interesting thing I found in the book is how the human microbiome changes with plants around, which raises the question of whether people at DAB have a distinct bacterial fingerprint compared to the rest of the city due to the giant green wall in the building.\ud83e\udd14 Overall, I will be using many of the references in the book for my literature review because they are very relevant for my project. My next read is going to be The Nature of Our Cities by <a href=\"https://www.nadinagalle.com/about\">Nadina Galle</a>.</p>",
+21
andres/research_outreach_2025_05_18_weekly-notes.json
+21
andres/research_outreach_2025_05_18_weekly-notes.json
···+"summary": "Most of last week was spent on working on the Estates project, trying to wrap up the Python library and build a quick demo for the team to reuse. So the library and its documentation can be found here and it takes floor plans in DWG format and converts them to a geodatabase with the building footprints and rooms. To do so, it makes use of AutoCAD and ArcGIS Pro\u2019s Python APIs, which makes the manual process of converting files much easier. Then a Streamlit demo (check below) plots the rooms and their type, as well as providing more info about usage. One interesting thing I found while working on this Streamlit app is that there is a mismatch in the basemaps from Esri, Google Maps and OpenStreetMap. It is not a big deal for 99% of cases, but in this case, when I was trying to plot the floor plans against a basemap, OSM polygons were visibly south of the actual georeferenced location. And it\u2019s not related to projection as all these use Pseudo-Mercator.",+"content": "<p>Most of last week was spent on working on the Estates project, trying to wrap up the Python library and build a quick demo for the team to reuse. So the library and its documentation can be found <a href=\"https://ancazugo.github.io/ucam-digital-twin/\">here</a> and it takes floor plans in DWG format and converts them to a geodatabase with the building footprints and rooms. To do so, it makes use of AutoCAD and ArcGIS Pro\u2019s Python APIs, which makes the manual process of converting files much easier. Then a Streamlit demo (check below) plots the rooms and their type, as well as providing more info about usage. One interesting thing I found while working on this Streamlit app is that there is a mismatch in the basemaps from Esri, Google Maps and OpenStreetMap. It is not a big deal for 99% of cases, but in this case, when I was trying to plot the floor plans against a basemap, OSM polygons were visibly south of the actual georeferenced location. 
And it\u2019s not related to projection as all these use Pseudo-Mercator.</p>\n\n<p>In my PhD project, I\u2019ve been trying to compare my data with the <a href=\"https://uk.treeequityscore.org/\">Tree Equity Score</a> and the <a href=\"https://www.forestresearch.gov.uk/tools-and-resources/fthr/trees-outside-woodland-map/\">Forest Research Trees Outside Woodland Dataset</a>. In addition, I\u2019ve been exploring over-representation analysis for my paper, as recommended by Ronita. It\u2019s a group of methods that\u2019s normally used in genomics to see if a set of genes is over-represented in a given gene set. I think it could be useful in urban analytics contexts.</p>\n\n<p>Finally, this week I am attending a workshop on Health Impact Assessment (HIA), organised by the MRC Epidemiology Unit in Cambridge, and the AI4ER Annual Showcase.</p>\n\n<p>PS: I found <a href=\"https://learning.oreilly.com/library/view/3d-data-science/9781098161323/\">this book on 3D Data Science</a> methods in Python, which was recently published and looks very promising for those working on LiDAR data.</p>",
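Over-representation analysis as used in genomics typically reduces to a one-sided hypergeometric (Fisher-style) test. A stdlib-only sketch of how it might transfer to urban data; the categories and numbers below are invented for illustration, not from the project:

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) for a hypergeometric draw: M items in total,
    n of them 'interesting', N drawn without replacement."""
    return sum(
        comb(n, i) * comb(M - n, N - i) for i in range(k, min(n, N) + 1)
    ) / comb(M, N)

# Toy example: of 1000 LSOAs, 100 are highly deprived. Among the 50
# most tree-poor LSOAs, 20 are highly deprived. Chance alone would
# give ~5, so deprivation is heavily over-represented.
p = hypergeom_sf(20, 1000, 100, 50)
assert 0.0 < p < 0.001
```

The same test behind genomic gene-set enrichment then answers "is category X over-represented in my selected areas?" for any urban attribute.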
+21
andres/research_outreach_2025_05_25_weekly-notes.json
+21
andres/research_outreach_2025_05_25_weekly-notes.json
···+"summary": "For most of last week I attended the Health Impact Assessment workshop organised by the Public Health Modelling Group from the MRC Epidemiology Unit in Cambridge. It was a great in-depth introduction to epidemiological modelling using mostly longitudinal data for health assessment. It is very relevant as the group focuses on the effects of urban features on different health outcomes.",+"content": "<p>For most of last week I attended the Health Impact Assessment workshop organised by the <a href=\"https://www.mrc-epid.cam.ac.uk/research/research-areas/public-health-modelling/\">Public Health Modelling</a> Group from the MRC Epidemiology Unit in Cambridge. It was a great in-depth introduction to epidemiological modelling using mostly longitudinal data for health assessment. It is very relevant as the group focuses on the effects of urban features on different health outcomes.</p>\n\n<p>Also, I attended the AI4ER workshop in West Hub where some of my PhD peers presented their work. Worth highlighting are <a href=\"https://orlando-code.github.io/\">Orlando Timmerman\u2019s</a> work on long-term coral suitability and <a href=\"https://www.joshuadimasaka.com/\">Joshua Dimasaka\u2019s</a> work on catastrophe modelling using graph neural networks. And shout-out to Sanjoo Paddea and Gilly Walker for organising an amazing day of networking spaces and lovely talks for all AI4ER and adjacent people.</p>\n\n<p>For my project I mostly worked on two things:</p>\n\n<ul>\n <li>I started labelling nature features on Charles Booth\u2019s maps using <a href=\"https://labelstud.io/\">Label Studio</a>, focusing on individual trees (broadleaf and conifer) and clusters of trees, as well as water bodies. 
I met with Hendrik Herold to discuss some ideas on what the best approach is to extract all other features, including poverty and building footprints.</li>\n <li>I worked on my presentation with the <a href=\"https://www.sustainabledesign.arct.cam.ac.uk/\">SDG group</a> where I will be presenting what I\u2019ve done in the last couple of weeks and updates on my paper, for which I designed the plot explaining what I had drawn in a <a href=\"https://ancazugo.github.io/research/outreach/2025/04/20/weekly-notes.html\">previous post</a> about the Gini coefficient and measuring tree visibility (see below).</li>\n</ul>",
+21
andres/research_outreach_2025_06_08_weekly-notes.json
+21
andres/research_outreach_2025_06_08_weekly-notes.json
···+"summary": "Two weeks smashed into one due to the annual AI4ER retreat in the Lake District and lack of good wifi \ud83e\udd23: I finished the analysis of inequality using my 3-30-300 metrics. One of the obstacles was trying to account for population size when measuring a given variable in a certain geographical unit; the solution was rather simple, just normalise with the population (n/(n-1)), which roughly keeps the interpretation of Gini the same (higher values more unequal), but it leads to values that might be larger than one, due to imbalance in population size. This is actually good as I can now see two areas that might have a similar Gini Score but may have different numbers of inhabitants. This led to a rabbit hole of trying to find the best way to represent this and I ended up using a bivariate map (see below), which will definitely be in the paper.",+"content": "<p>Two weeks smashed into one due to the annual AI4ER retreat in the Lake District and lack of good wifi \ud83e\udd23: I finished the analysis of inequality using my 3-30-300 metrics. One of the obstacles was trying to account for population size when measuring a given variable in a certain geographical unit; the solution was rather simple, just normalise with the population (n/(n-1)), which roughly keeps the interpretation of Gini the same (higher values more unequal), but it leads to values that might be larger than one, due to imbalance in population size. This is actually good as I can now see two areas that might have a similar Gini Score but may have different numbers of inhabitants. This led to a rabbit hole of trying to find the best way to represent this and I ended up using a bivariate map (see below), which will definitely be in the paper.</p>\n\n<p>On another note, I\u2019ve read many papers that quantify urban trees using Google Street View, but that\u2019s not very scalable to a big city like London without incurring big expenses with the Maps API in GCP. 
Thankfully, I found <a href=\"https://www.mapillary.com\">Mapillary</a>, which is the open alternative. Their documentation is not the best but I managed to create a script that downloads all photos from one path, which I can integrate with the LiDAR and high-res RGB to map trees and species (or that\u2019s my idea, roughly inspired by <a href=\"https://google.github.io/auto-arborist/\">The Auto Arborist Dataset</a>). Photos might not be as good quality as Google\u2019s, but it has good coverage and is open.</p>\n\n<p>Moreover, I found one paper that is relevant for (urban) Foundation Models: <a href=\"https://dl.acm.org/doi/10.1145/3627673.3679662\">CityML</a>, which has better benchmark scores than the most popular framework for vector-based FM, named GeoVectors. This paper is quite interesting because they use the visual context of polygon features, which means that they rasterise several vector layers for the model to learn.</p>\n\n<p>Finally, I also recently shifted my coding workflow from VS Code to Cursor due to their support for better models and an education-based subscription to their Pro membership. Unfortunately, I only got it to work with my Colombian university account as their form apparently uses regex to detect <em>edu</em> in the email domain, which is not part of Cambridge email accounts. The change to Cursor has definitely sped up the coding process as it is faster than Copilot and surprisingly better at writing and formatting R scripts than VS Code.</p>\n\n<p>PS: I had interesting discussions at the AI4ER retreat with my fellow PhD friends. Worth highlighting the session led by <a href=\"https://orlando-code.github.io/\">Orlando</a> about PhD struggles and life after the PhD and <a href=\"https://www.clarehall.cam.ac.uk/directory/adriano-gualandi/\">Adriano Gualandi</a>\u2019s insightful talk about the publishing process.</p>",
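The population normalisation described above, "normalise with the population (n/(n-1))", reads like the standard small-sample Gini correction (scale by n/(n-1)); a sketch under that assumption, reusing a plain Gini helper with names of my own:

```python
def gini(values):
    """Standard Gini coefficient (sorted-cumulative formula)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n < 2 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def gini_population_adjusted(values):
    """Gini scaled by n/(n-1): higher still means more unequal, but
    values can now exceed 1 for small populations, which is what the
    post observes when areas have very different numbers of people."""
    n = len(values)
    return gini(values) * n / (n - 1)

# One person holding everything in a 4-person area scores exactly 1.0
# after adjustment, instead of the raw finite-sample maximum 0.75.
assert gini_population_adjusted([0, 0, 0, 1]) == 1.0
```

Two areas with the same raw Gini but different populations then separate on this adjusted scale, which is what the bivariate map visualises.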
+21
andres/research_outreach_2025_06_22_weekly-notes.json
+21
andres/research_outreach_2025_06_22_weekly-notes.json
···+"summary": "After a one-week hiatus, my week started with the SDG meeting, where all Masters students presented their work. I was particularly impressed by Dr Haiman Raman\u2019s talk and attempt at simulating interviews with people in psychiatry facilities and the perception of a healthy space. Also, worth highlighting John Nguyen\u2019s work on acoustic surfaces. Then, I attended the Centre for Human-Inspired Artificial Intelligence (CHIA) Meeting in Jesus College. The main event was Millie Chapman\u2019s talk on her use of computational methods (particularly reinforcement learning) for decision making for biodiversity and climate. The talk was followed up by an interesting panel with her and other researchers including Anil. I attended a student-led session for AI4ER students on life during and after the PhD, where we also shared some interesting and very useful tools for productivity and accelerating the coding and writing processes. From this discussion, I decided to give Gemini a go for deep literature search, and I have to say that it\u2019s by far the best tool I\u2019ve used, even more than Perplexity, and the hallucination rate is quite low, since all the sources were actually real. And it\u2019s also better for coding, so ChatGPT has been closed for me for the last couple of days.",+"content": "<p>After a one-week hiatus, my week started with the SDG meeting, where all Masters students presented their work. I was particularly impressed by Dr Haiman Raman\u2019s talk and attempt at simulating interviews with people in psychiatry facilities and the perception of a healthy space. Also, worth highlighting <a href=\"https://john-nguyen.com/cv/\">John Nguyen\u2019s</a> work on acoustic surfaces.\nThen, I attended the <a href=\"https://www.chia.cam.ac.uk/\">Centre for Human-Inspired Artificial Intelligence (CHIA)</a> Meeting in Jesus College. 
The main event was <a href=\"https://milliechapman.info/\">Millie Chapman\u2019s</a> talk on her use of computational methods (particularly reinforcement learning) for decision making for biodiversity and climate. The talk was followed up by an interesting panel with her and other researchers including Anil.\nI attended a student-led session for AI4ER students on life during and after the PhD, where we also shared some interesting and very useful tools for productivity and accelerating the coding and writing processes. From this discussion, I decided to give Gemini a go for deep literature search, and I have to say that it\u2019s by far the best tool I\u2019ve used, even more than Perplexity, and the hallucination rate is quite low, since all the sources were actually real. And it\u2019s also better for coding, so ChatGPT has been closed for me for the last couple of days.</p>\n\n<p>Finally, I decided to rewrite and reformulate my introduction based on the funnel method presented in <a href=\"https://comegic.org.mx/wp-content/uploads/2023/06/Como-escribir-articulo-cientifico.pdf\">Barbara Gastel and Robert Day\u2019s</a> book How to Write and Publish a Scientific Paper. I realised that my writing was more modular, whereas I needed a more structured and linear story going from the broader topic and then into more details. In addition, I\u2019ve almost finished writing the results and made good progress with the discussion, for which I am comparing my results to other forms of green infrastructure evaluation in the UK. Notably, the Tree Equity Score, which provides some foundation to which I can compare my results (see plot below). I have to say this plot is not the best, because it doesn\u2019t capture everything I want to show, so I might go for a different way of visualising this in the manuscript. 
It essentially tries to show how different neighbourhoods in England fare in the different metrics, and I\u2019m glad to say they are not the same, so there\u2019s an interesting discussion to be had there, as my data may suggest other forms of green inequality.</p>",
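One way to quantify whether neighbourhoods "fare the same" under the Tree Equity Score and the 3-30-300 metrics is a rank correlation. A stdlib Spearman sketch with invented scores (not the project's data); rho near 1 would mean the two metrics tell the same story, and deviations support the post's point that they do not:

```python
def spearman(xs, ys):
    """Spearman rank correlation (assumes no ties, for brevity)."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Invented scores for five neighbourhoods, deliberately constructed so
# the two rankings are exact opposites: rho comes out as -1.
tree_equity = [82, 55, 91, 40, 67]
my_metric = [0.31, 0.62, 0.25, 0.80, 0.44]
assert spearman(tree_equity, my_metric) == -1.0
```

With the real per-neighbourhood scores, an intermediate rho would give a single number for how much the metrics disagree.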
+21
andres/research_outreach_2025_06_29_weekly-notes.json
+21
andres/research_outreach_2025_06_29_weekly-notes.json
···+"summary": "Last week was unexpectedly devoted to sorting things out with my scholarship due to an unforeseen issue. However, I managed to finish all the sections in the 3-30-300 paper, except for the discussion which needs polishing. I added a couple of plots and rearranged the results with a story for each subsection. I\u2019m really becoming a power user of Gemini for correcting writing style to fit an academic text. Linked below is a plot that summarises one of the points I make in the manuscript about how the 3-30-300 is reflected in population numbers.",+"content": "<p>Last week was unexpectedly devoted to sorting things out with my scholarship due to an unforeseen issue. However, I managed to finish all the sections in the 3-30-300 paper, except for the discussion which needs polishing. I added a couple of plots and rearranged the results with a story for each subsection. I\u2019m really becoming a power user of Gemini for correcting writing style to fit an academic text. Linked below is a plot that summarises one of the points I make in the manuscript about how the 3-30-300 is reflected in population numbers.</p>\n\n<p>I briefly tried to use Apache Sedona to do zonal statistics on the buildings dataset to create a 30-metric per building using a buffer area, but it wasn\u2019t very easy to do, due to the size and extent of the images. This was only to test if I could add more granularity to the paper methods for some of the spectral indices, but using census polygons should be good enough.</p>\n\n<p>Also, I finally had a meeting with a couple of people interested in the project I worked on about room occupancy for Estates and they were very impressed with my app. The main conclusion from the pilot study for this is that a university of this size should have a more centralised system to keep track of its facilities. 
The nature of the institution makes it difficult, but it should be a matter of gathering all parties involved (departments, colleges, etc.) to create a unified way to quantify room occupancy.</p>\n\n<p>Later in the week, the <a href=\"https://ual.sg/\">Urban Analytics Lab</a> from NUS came to visit Cambridge and gave short presentations (I also did one). Most of the PhD students work on using Street View images to quantify different aspects of cities, from traffic to energy use and building type. This came right on time as I wrote about this <a href=\"https://ancazugo.github.io/research/outreach/2025/06/08/weekly-notes.html\">a couple of weeks ago</a>. Of particular relevance was Winston\u2019s <a href=\"https://ual.sg/post/2023/07/31/new-paper-and-open-source-software-urbanity-automated-modelling-and-analysis-of-multidimensional-networks-in-cities/\">Urbanity project</a>, which I\u2019ll have to try, as well as the group-wide <a href=\"https://www.sciencedirect.com/science/article/pii/S0924271624002612\">Global Streetscapes dataset</a>. Also interesting were Zicheng\u2019s work on <a href=\"https://www.sciencedirect.com/science/article/pii/S2210670724006863\">nighttime street imagery</a> and Xiucheng\u2019s <a href=\"https://arxiv.org/abs/2504.02866\">OpenFACADES</a>. Yixin Wu\u2019s (unpublished) work on carbon sequestration by urban trees seems highly relevant for my work as it is a very understudied ecosystem service provided by urban green infrastructure.</p>\n\n<p>Also, from this week on I will be implementing the Zettelkasten method to record my literature review so I don\u2019t have to do double the work from notes to writing. This is based on the book titled <a href=\"https://www.soenkeahrens.de/en/takesmartnotes\">How to Take Smart Notes</a> by S\u00f6nke Ahrens, based on the method made popular by sociologist Niklas Luhmann. 
I decided to do this because I felt that my notes and highlights were not going anywhere and were not as effective as they could be, so I did some research a while ago and finished reading the book recently. This was also motivated by my struggle with writing, which I believe is quite common in academia (particularly for those whose first language is not English). I will post updates on how my knowledge graph grows in Obsidian and how I implement the writing in my thesis and papers, alongside other LLM-based tools.</p>\n\n<p>Finally, to hold myself accountable for the things I need to do, I will start posting my weekly objectives here and come back to them the following week to see how I did. I will post them from highest to lowest priority, with sub-goals.</p>\n\n<h2>Weekly Goals</h2>\n\n<ul>\n <li>FINISH THE PAPER MANUSCRIPT!!!!\n <ul>\n <li>Add the references to all the sections</li>\n <li>Organise the supplementary material</li>\n </ul>\n </li>\n <li>Move the trees dataset to an Earth Engine App\n <ul>\n <li>Decide on publishing all segmented trees or just the ones I defined as proper trees (height > 3 m & crown area > 10 m^2)</li>\n <li>Figure out how to split the data for massive upload to Google Cloud</li>\n </ul>\n </li>\n <li>Start the Zettelkasten method\n <ul>\n <li>Link Obsidian with my previous notebook and iPad notes</li>\n </ul>\n </li>\n <li>Resume documenting the code for the 3-30-300 paper\n <ul>\n <li>Update the docstrings for some functions that have changed</li>\n <li>Use mkdocs to create the GitHub Pages website to link in the paper</li>\n </ul>\n </li>\n</ul>",
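The zonal-statistics step described in the notes above reduces, once building buffers or census polygons are rasterised to a zone-label grid, to a grouped aggregation over pixels. A minimal pure-NumPy sketch of that aggregation, with hypothetical array names (this is not the Apache Sedona or paper code):

```python
import numpy as np

def zonal_mean(zones: np.ndarray, values: np.ndarray) -> dict:
    """Mean of `values` per zone id in `zones` (same shape, non-negative int labels)."""
    z = zones.ravel()
    v = values.ravel().astype(float)
    sums = np.bincount(z, weights=v)      # per-zone sum of pixel values
    counts = np.bincount(z)               # per-zone pixel count
    means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return {int(zid): float(means[zid]) for zid in np.unique(z)}

# Toy 2x3 raster: two zones, with e.g. a spectral index value per pixel.
zones = np.array([[0, 0, 1],
                  [1, 1, 0]])
index = np.array([[0.2, 0.4, 0.8],
                  [0.6, 0.7, 0.3]])
per_zone = zonal_mean(zones, index)  # zone 0 -> 0.3, zone 1 -> 0.7
```

The same `bincount` pattern extends to per-zone sums or tree counts per geography, and can be applied band by band to keep memory bounded on large rasters.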
+21
andres/research_outreach_2025_07_06_weekly-notes.json
···+"summary": "Last week started with the in-person event by Information is Beautiful\u2019s author, David McCandless, at the Royal Geographical Society, who gave a presentation on how to make data tell a story, based on his experience as a data journalist. This was the inspiration for this week\u2019s plot down below using my data. In the paper it is shown differently (you\u2019ll have to read it to see it), but I thought it would be fun to explore treemaps, which I hadn\u2019t used in the past. He made extensive use of these to compare the narrative around military expenditure by country in raw numbers, as a proportion of GDP, and as a proportion of the population. My goal is to show a similar thing, but in terms of trees just for now. While doing this I realised that these kinds of plots are not very common in scientific papers, and I\u2019m not really sure why, as they explain proportions quite clearly, and way better than the awful pie charts you see everywhere. This also marks my return to D3 after years of not using it, although this time with significant help from LLMs. Below you\u2019ll find a hierarchical treemap of Regions, Local Authorities, and LSOAs according to the number of trees in each geography. This is a work in progress, as you\u2019ll see that the smallest box doesn\u2019t have the right size, so a lot of debugging remains (go click on each box to see where in that area you can find more trees!!!)",+"content": "<p>Last week started with the in-person event by <a href=\"https://informationisbeautiful.net/\">Information is Beautiful\u2019s</a> author, <a href=\"https://davidmccandless.com/\">David McCandless</a>, at the Royal Geographical Society, who gave a presentation on how to make data tell a story, based on his experience as a data journalist. This was the inspiration for this week\u2019s plot down below using my data. 
In the paper it is shown differently (you\u2019ll have to read it to see it), but I thought it would be fun to explore treemaps, which I hadn\u2019t used in the past. He made extensive use of these to compare the narrative around military expenditure by country in raw numbers, as a proportion of GDP, and as a proportion of the population. My goal is to show a similar thing, but in terms of trees just for now. While doing this I realised that these kinds of plots are not very common in scientific papers, and I\u2019m not really sure why, as they explain proportions quite clearly, and way better than the awful pie charts you see everywhere. This also marks my return to D3 after years of not using it, although this time with significant help from LLMs. Below you\u2019ll find a hierarchical treemap of Regions, Local Authorities, and LSOAs according to the number of trees in each geography. This is a work in progress, as you\u2019ll see that the smallest box doesn\u2019t have the right size, so a lot of debugging remains (go click on each box to see where in that area you can find more trees!!!)</p>\n\n<h2>Past Weekly Objectives</h2>\n<ul>\n <li>FINISH THE PAPER MANUSCRIPT!!!!\n <ul>\n <li>Add the references to all the sections</li>\n <li>Organise the supplementary material</li>\n </ul>\n </li>\n <li>Move the trees dataset to an Earth Engine App (TODO)\n <ul>\n <li>Publish it on Zenodo as well</li>\n <li>Decide on publishing all segmented trees or just the ones I defined as proper trees (height > 3 m & crown area > 10 m^2)</li>\n <li>Figure out how to split the data for massive upload to Google Cloud</li>\n </ul>\n </li>\n <li>Start the Zettelkasten method (Setup and working)\n <ul>\n <li>Link Obsidian with my previous notebook and iPad notes</li>\n </ul>\n </li>\n <li>Resume documenting the code for the 3-30-300 paper (Sort of started)\n <ul>\n <li>Update the docstrings for some functions that have changed</li>\n <li>Use mkdocs to create the GitHub Pages website to link 
in the paper</li>\n </ul>\n </li>\n</ul>\n\n<h2>Weekly Objectives</h2>\n<ul>\n <li>Make the corrections to the manuscript\n <ul>\n <li>Change the image order and reduce the number of tables</li>\n </ul>\n </li>\n <li>Apply to <a href=\"https://propl.dev/\">PROPL25</a></li>\n <li>Prepare slides for the AI4ER training event</li>\n</ul>",
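The sizing bug mentioned in the notes above (the smallest box not having the right size) comes down to the treemap invariant: each rectangle's area must be proportional to its value. The actual chart uses D3, but the invariant is easiest to see in the simplest tiling scheme, slice-and-dice. A hedged Python sketch of one level of that layout (not the code behind the actual plot):

```python
def slice_and_dice(values, x, y, w, h, horizontal=True):
    """Split the rectangle (x, y, w, h) into strips whose areas are
    proportional to `values`; alternating `horizontal` at each level of a
    hierarchy gives the classic slice-and-dice treemap."""
    total = float(sum(values))
    rects, offset = [], 0.0
    for v in values:
        frac = v / total
        if horizontal:  # slice vertically into side-by-side strips
            rects.append((x + offset, y, w * frac, h))
            offset += w * frac
        else:           # slice horizontally into stacked strips
            rects.append((x, y + offset, w, h * frac))
            offset += h * frac
    return rects

# Three regions holding 50%, 30% and 20% of the trees, on a 100x100 canvas.
rects = slice_and_dice([50, 30, 20], 0, 0, 100, 100)
```

D3's default `d3.treemapSquarify` tiling additionally keeps the rectangles close to square, but it must satisfy exactly this same area-proportionality invariant.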
+18
avsm/ideas_3d-print-world.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/3d-print-world\">3D printing the planet (or bits of it)</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>. It is co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Thanks to a combination of <a href=\"https://anil.recoil.org/projects/rsn\">satellite</a> information, remote sensors and data science, we are now able to reason about places all over the globe from the comfort of our desks and offices. But sometimes, you just want to be able to see or touch an area to understand it properly: the flat 2D projection on a screen doesn\u2019t necessarily reveal the subtle geography of a landscape, and data locked into a computer feels less immediate than even a physical model of the same area.</p>\n<p>In recent work, <a href=\"https://mynameismwd.org\">Michael Dales</a> has experimented with making 3D-printed models of surface terrain to make some areas of study more relatable. By combining high-resolution <a href=\"https://en.wikipedia.org/wiki/Digital_elevation_model\">Digital Elevation Models</a> (DEMs) and CAD software, we were able to scale and print this section of a Swedish forest <a href=\"https://www.svtplay.se/video/jMd2Gb3/den-stora-algvandringen/idag-00-00\">used to observe moose migrations</a>.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/3dprint-1.webp\" title=\"\">\n</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/3d-print-world\">403 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#3d-printing-the-planet-or-bits-of-it\"></a>3D printing the planet (or bits of it)</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>. 
It is co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Thanks to a combination of <a href=\"https://anil.recoil.org/projects/rsn\">satellite</a> information, remote sensors and data science, we are now able to reason about places all over the globe from the comfort of our desks and offices. But sometimes, you just want to be able to see or touch an area to understand it properly: the flat 2D projection on a screen doesn\u2019t necessarily reveal the subtle geography of a landscape, and data locked into a computer feels less immediate than even a physical model of the same area.</p>\n<p>In recent work, <a href=\"https://mynameismwd.org\">Michael Dales</a> has experimented with making 3D-printed models of surface terrain to make some areas of study more relatable. By combining high-resolution <a href=\"https://en.wikipedia.org/wiki/Digital_elevation_model\">Digital Elevation Models</a> (DEMs) and CAD software, we were able to scale and print this section of a Swedish forest <a href=\"https://www.svtplay.se/video/jMd2Gb3/den-stora-algvandringen/idag-00-00\">used to observe moose migrations</a>.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/3dprint-1.webp\" title=\"\">\n</p>\n<p>However, this method is not easily scalable, as:</p>\n<ul>\n<li>The datasets involved are cumbersome in size.</li>\n<li>The resulting meshes are very detailed, causing even professional-grade CAD software to struggle.</li>\n<li>Raw data does not normally work out of the box for visualisation as a 3D model. 
For example, water levels have to be added, and often the height has to be accentuated to make it look realistic to our eyes.</li>\n</ul>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/3dprint-2.webp\" title=\"\">\n</p>\n<p>There are <a href=\"https://terrainator.com\">some general tools</a> that can help with this, but they don\u2019t support adding in custom data layers that would allow us to project data-science results onto a physical surface, and there is nothing open-source that others can readily work with.</p>\n<h2><a href=\"https://anil.recoil.org/#the-summer-project\"></a>The summer project</h2>\n<p>We\u2019d like to build a simple workflow based around open source tools such as Python, GDAL, and optionally QGIS, to take geospatial results from ecologists and render them ready for 3D printing. The goal is to make it trivial for ecologists to combine datasets and render them physically without having to become experts in 3D modelling.</p>\n<p>In this project we\u2019d like to:</p>\n<ul>\n<li>Use off-the-shelf Python libraries to select datasets and convert them to 3D meshes</li>\n<li>Provide a way to generate multi-colour meshes for use with the Bambu Carbon printers that we have access to, which support up to 8 colours of filament</li>\n<li>Test it using one of the active projects in the group, such as the <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">LIFE global extinction risk maps</a></li>\n</ul>\n<p>We'll have access to 3D printers in the <a href=\"https://web.makespace.org\">Cambridge Makespace</a>, so this is a good project for a student who wants to get into the nitty-gritty of making things!</p>",
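The core of the DEM-to-print step this project describes (grid of elevations in, triangles out) can be sketched in a few lines. This is a hedged illustration assuming the DEM is already loaded as a NumPy array (e.g. via GDAL's `ReadAsArray`); a real pipeline would also add skirt walls and a base to make the mesh watertight, then write STL:

```python
import numpy as np

def heightmap_to_triangles(dem: np.ndarray, cell_size: float = 1.0,
                           z_scale: float = 1.0):
    """Turn a 2D elevation grid into surface triangles (two per grid cell),
    with `z_scale` providing the vertical exaggeration mentioned above."""
    rows, cols = dem.shape
    tris = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            # The four corners of this grid cell as (x, y, z) points.
            p00 = (c * cell_size, r * cell_size, float(dem[r, c]) * z_scale)
            p10 = ((c + 1) * cell_size, r * cell_size, float(dem[r, c + 1]) * z_scale)
            p01 = (c * cell_size, (r + 1) * cell_size, float(dem[r + 1, c]) * z_scale)
            p11 = ((c + 1) * cell_size, (r + 1) * cell_size, float(dem[r + 1, c + 1]) * z_scale)
            tris.append((p00, p10, p11))  # upper triangle of the cell
            tris.append((p00, p11, p01))  # lower triangle of the cell
    return tris

dem = np.array([[0.0, 1.0],
                [2.0, 3.0]])
tris = heightmap_to_triangles(dem, z_scale=2.0)  # one cell -> two triangles
```

The mesh-size problem noted above follows directly from this construction: an n-by-m DEM yields roughly 2(n-1)(m-1) triangles, so downsampling or tiling is needed before a CAD tool will cope.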
+18
avsm/ideas_accurate-summarisation-for-ce.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/accurate-summarisation-for-ce\">Accurate summarisation of threats for conservation evidence literature</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a>. It is co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in constructing a taxonomy of threats\nto wildlife from the literature. This involves scanning the body of\nconservation literature and gathering/synthesising evidence for conservation\ninterventions from a threats perspective. Once the text has been retrieved, it\nneeds to be summarised in a way that is accurate, concise and relevant and\nverified with human experts. This is particularly important for conservation\nevidence, where the key findings need to be communicated clearly to inform\npolicy and practice.</p>\n<p>This project therefore investigates how to generate threats, and to verify\ntheir accuracy as generated by LLMs and RAG pipelines from the CE literature.\nOur goal is to develop a pipeline that can reliably go from extracting relevant\ninformation from text to a summary that is verifiably (by a human) correct.</p>\n<p>As of June 2025, the project has been successfully completed and submitted\nfor Kittson's MPhil. 
A test version of the <a href=\"https://climateinaction-production.up.railway.app/\">avian threats dataset</a>\nis online for browsing, and we're spending the summer working on widening the evaluation with the wider CE team.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li>The <a href=\"https://docs.ragas.io/en/stable/index.html\">Ragas framework</a> for RAG evaluation</li>\n<li><a href=\"https://arxiv.org/abs/2406.02524v2\">CheckEmbed: Effective Verification of LLM Solutions to Open Ended Tasks</a>, arxiv:2406.02524v2, June 2024</li>\n<li><a href=\"https://arxiv.org/abs/2210.00045\">Calibrating Sequence Likelihood Improves Conditional Language Generation</a>, arxiv:2210.00045, September 2022</li>\n</ul>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#accurate-summarisation-of-threats-for-conservation-evidence-literature\"></a>Accurate summarisation of threats for conservation evidence literature</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a>. It is co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in constructing a taxonomy of threats\nto wildlife from the literature. This involves scanning the body of\nconservation literature and gathering/synthesising evidence for conservation\ninterventions from a threats perspective. Once the text has been retrieved, it\nneeds to be summarised in a way that is accurate, concise and relevant and\nverified with human experts. 
This is particularly important for conservation\nevidence, where the key findings need to be communicated clearly to inform\npolicy and practice.</p>\n<p>This project therefore investigates how to generate threats, and to verify\ntheir accuracy as generated by LLMs and RAG pipelines from the CE literature.\nOur goal is to develop a pipeline that can reliably go from extracting relevant\ninformation from text to a summary that is verifiably (by a human) correct.</p>\n<p>As of June 2025, the project has been successfully completed and submitted\nfor Kittson's MPhil. A test version of the <a href=\"https://climateinaction-production.up.railway.app/\">avian threats dataset</a>\nis online for browsing, and we're spending the summer working on widening the evaluation with the wider CE team.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li>The <a href=\"https://docs.ragas.io/en/stable/index.html\">Ragas framework</a> for RAG evaluation</li>\n<li><a href=\"https://arxiv.org/abs/2406.02524v2\">CheckEmbed: Effective Verification of LLM Solutions to Open Ended Tasks</a>, arxiv:2406.02524v2, June 2024</li>\n<li><a href=\"https://arxiv.org/abs/2210.00045\">Calibrating Sequence Likelihood Improves Conditional Language Generation</a>, arxiv:2210.00045, September 2022</li>\n</ul>",
+18
avsm/ideas_activitypub-resilience.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/activitypub-resilience\">Improving Resilience of ActivityPub Services</a> <span>/ Jan 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Gediminas Lele\u0161ius</span>.</p>\n<p>The original goal of the project was to improve the resilience of the\ndistributed social networking protocol "ActivityPub", by caching the content on\nmultiple instances and serving it in case the origin instance goes down. The\nproject uses public-key cryptography to ensure data integrity, builds a network\nof public key servers and verifiers, and uses that consensus instead of relying\non individual servers to provide trustworthy data. The core deliverables are a\nkey server gathering and serving public keys, a verifier checking the entries\nof that server, and a modified Mastodon server rescuing failed ActivityPub\nrequests using an external key server.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/activitypub-resilience\">171 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#improving-resilience-of-activitypub-services\"></a>Improving Resilience of ActivityPub Services</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Gediminas Lele\u0161ius</span>.</p>\n<p>The original goal of the project was to improve the resilience of the\ndistributed social networking protocol "ActivityPub", by caching the content on\nmultiple instances and serving it in case the origin instance goes down. The\nproject uses public-key cryptography to ensure data integrity, builds a network\nof public key servers and verifiers, and uses that consensus instead of relying\non individual servers to provide trustworthy data. 
The core deliverables are a\nkey server gathering and serving public keys, a verifier checking the entries\nof that server, and a modified Mastodon server rescuing failed ActivityPub\nrequests using an external key server.</p>\n<p>The original aims were achieved, with the core deliverables being a key server and\na verifier. The Mastodon source code was modified to use a key server and\nother instances to rescue failed content queries. Finally, functional testing\nwas conducted, and the project outcome was evaluated. The system was deployed to\nwork with the live ActivityPub network.</p>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li>The <a href=\"https://github.com/gediminasel/activitypub-resilience\">dissertation is on GitHub</a>.</li>\n<li>The modified <a href=\"https://github.com/gediminasel/mastodon-resilience\">Mastodon source code</a>.</li>\n</ul>",
+18
avsm/ideas_ai-assisted-inclusion-criteria.json
···+"title": "Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis",+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/ai-assisted-inclusion-criteria\">Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis</a> <span>/ Jun 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Whenever we do evidence synthesis (especially for <a href=\"https://anil.recoil.org/projects/ce\">conservation outcomes</a>)\nto distil the world's scientific literature into actionable insights, we have\nto decide on <em>what</em> published studies we will include or exclude, and <em>why</em>\nthey are categorised as such. This can be a challenging process, and sometimes\ninclusion criteria may not be very reproducible or clearly defined, leading to\nconfusion between reviewers and more time-consuming reviews.</p>\n<p>In AI-assisted review methods, we are increasingly finding that LLMs may\ninterpret inclusion criteria differently to human reviewers, potentially\nbecause human experts may implicitly assume certain things that are not obvious\nto those working outside the review team (or interpret things differently to\nfellow reviewers). We trialled an informal process earlier this year to\niterate over the inclusion/exclusion criteria for an evidence synthesis using\nsynthetic studies that represent "edge cases", whereby it is difficult to agree\non whether they should be in or out. Through back-and-forth with an LLM, human\nreviewers were able to refine and improve their inclusion criteria.</p>\n<p>This project will build on this work to develop a prototype, open-source tool\nthat enables users to refine their inclusion criteria with the help of an LLM\nchatbot. 
This will be extremely useful for anyone conducting any type of\nevidence synthesis and so has great potential to be an impactful project beyond\n"just" the field of conservation.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#evaluating-a-human-in-the-loop-ai-framework-to-improve-inclusion-criteria-for-evidence-synthesis\"></a>Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Whenever we do evidence synthesis (especially for <a href=\"https://anil.recoil.org/projects/ce\">conservation outcomes</a>)\nto distil the world's scientific literature into actionable insights, we have\nto decide on <em>what</em> published studies we will include or exclude, and <em>why</em>\nthey are categorised as such. This can be a challenging process, and sometimes\ninclusion criteria may not be very reproducible or clearly defined, leading to\nconfusion between reviewers and more time-consuming reviews.</p>\n<p>In AI-assisted review methods, we are increasingly finding that LLMs may\ninterpret inclusion criteria differently to human reviewers, potentially\nbecause human experts may implicitly assume certain things that are not obvious\nto those working outside the review team (or interpret things differently to\nfellow reviewers). We trialled an informal process earlier this year to\niterate over the inclusion/exclusion criteria for an evidence synthesis using\nsynthetic studies that represent "edge cases", whereby it is difficult to agree\non whether they should be in or out. 
Through back-and-forth with an LLM, human\nreviewers were able to refine and improve their inclusion criteria.</p>\n<p>This project will build on this work to develop a prototype, open-source tool\nthat enables users to refine their inclusion criteria with the help of an LLM\nchatbot. This will be extremely useful for anyone conducting any type of\nevidence synthesis and so has great potential to be an impactful project beyond\n"just" the field of conservation.</p>",
+18
avsm/ideas_autoscaling-geospatial-yirgacheffe.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/autoscaling-geospatial-yirgacheffe\">Autoscaling geospatial computation with Python and Yirgacheffe</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Python is a popular tool for geospatial data-science, but it, along with the <a href=\"https://gdal.org/\">GDAL</a> library, handles resource management poorly. Python does not deal with <a href=\"https://wiki.python.org/moin/GlobalInterpreterLock\">parallelism</a> well and GDAL can be a <a href=\"https://github.com/OSGeo/gdal/issues/10792\">memory hog</a> when parallelised. Geo-spatial workloads -- working on global maps at metre-level resolutions -- can easily exceed the resources available on a given host when run using conventional schedulers.</p>\n<p>To that end, we've been building <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, a geospatial library for Python that attempts both to hide the tedious parts of geospatial work (aligning different data sources, for instance) and to tackle the resource management issues, so that ecologists don't have to become computer scientists to scale their work. 
Yirgacheffe can:</p>\n<ul>\n<li>chunk data in memory automatically, to avoid common issues around memory overcommitment</li>\n<li>do limited forms of parallelism to use multiple cores.</li>\n</ul>\n<p>Yirgacheffe has been deployed in multiple geospatial pipelines, underpinning work like <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>, as well as an implementation of the <a href=\"https://iucn.org/resources/conservation-tool/species-threat-abatement-and-restoration-star-metric\">IUCN STAR metric</a>, and <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">a methodology for assessing tropical forest interventions</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/autoscaling-geospatial-yirgacheffe\">453 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#autoscaling-geospatial-computation-with-python-and-yirgacheffe\"></a>Autoscaling geospatial computation with Python and Yirgacheffe</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Python is a popular tool for geospatial data-science, but it, along with the <a href=\"https://gdal.org/\">GDAL</a> library, handles resource management poorly. Python does not deal with <a href=\"https://wiki.python.org/moin/GlobalInterpreterLock\">parallelism</a> well and GDAL can be a <a href=\"https://github.com/OSGeo/gdal/issues/10792\">memory hog</a> when parallelised. 
Geo-spatial workloads -- working on global maps at metre-level resolutions -- can easily exceed the resources available on a given host when run using conventional schedulers.</p>\n<p>To that end, we've been building <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, a geospatial library for Python that attempts both to hide the tedious parts of geospatial work (aligning different data sources, for instance) and to tackle the resource management issues, so that ecologists don't have to become computer scientists to scale their work. Yirgacheffe can:</p>\n<ul>\n<li>chunk data in memory automatically, to avoid common issues around memory overcommitment</li>\n<li>do limited forms of parallelism to use multiple cores.</li>\n</ul>\n<p>Yirgacheffe has been deployed in multiple geospatial pipelines, underpinning work like <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>, as well as an implementation of the <a href=\"https://iucn.org/resources/conservation-tool/species-threat-abatement-and-restoration-star-metric\">IUCN STAR metric</a>, and <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">a methodology for assessing tropical forest interventions</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#the-summer-project\"></a>The summer project</h2>\n<p>Whilst Yirgacheffe solves some of the resource management problems involved in geospatial coding, it does so conservatively and statically. It does not currently assess the state of the host on which it is being run: how much memory or how many CPU cores are free? How much memory is each thread using? How to react if someone else fires up a big job on the same machine?</p>\n<p>If it gets this wrong by overcommitting resources, then the dreaded Linux <a href=\"https://linux-mm.org/OOM_Killer\">OOM killer</a> can (at best) take down your job or (at worst) take down the entire system, including other users' work. 
Therefore, we want Yirgacheffe to be more clever about scaling up resource usage on a large host, without compromising overall system stability.</p>\n<p>In this project we'd like to:</p>\n<ul>\n<li>Add the ability to better estimate how much memory and CPU is free at the start of the day to set sensible defaults, rather than the current highly conservative estimates</li>\n<li>Add the ability to adjust those values in reaction to the current machine state</li>\n<li>Demonstrate that this works by applying it to one of the existing pipelines and showing better resource utilisation on a big but busy compute server (you get to play with 256-core hosts with a terabyte of RAM!)</li>\n</ul>\n<p>This would be a good summer project for a student interested in both operating systems and scientific computing, looking to help work on enabling real sustainability and environmental research.</p>\n<p>For background reading:</p>\n<ul>\n<li><a href=\"https://mynameismwd.org\">Michael Dales</a> posts a <a href=\"https://digitalflapjack.com/blog/yirgacheffe/\">blog on Yirgacheffe</a></li>\n<li>A <a href=\"https://linen.futureofcoding.org/t/5063652/as-promised-in-my-intro-here-s-a-little-bit-of-current-think\">future of coding thread</a> with some discussion</li>\n</ul>\n<p>You can also watch a (slightly tangential but on the same topic of geospatial processing) talk from <a href=\"https://mynameismwd.org\">Michael Dales</a> at LOCO24.</p>\n<p></p><div></div><p></p>",
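The chunking behaviour described in this project idea (and the resource estimation the project would extend) can be illustrated with a toy Python sketch: process a raster in horizontal bands sized against a working-memory budget. This is not Yirgacheffe's actual API, just the underlying technique:

```python
import numpy as np

def process_in_chunks(raster: np.ndarray, fn, max_bytes: int = 64 * 1024 * 1024):
    """Apply `fn` to horizontal bands of `raster`, sizing each band so the
    input slice plus its output stays roughly within `max_bytes`. A smarter
    version (the point of the project) would derive `max_bytes` from live
    free-memory measurements instead of a static default."""
    rows, cols = raster.shape
    bytes_per_row = cols * raster.itemsize * 2   # input band + output band
    band_rows = max(1, min(rows, max_bytes // bytes_per_row))
    out = np.empty(raster.shape, dtype=float)
    for start in range(0, rows, band_rows):
        stop = min(start + band_rows, rows)
        out[start:stop] = fn(raster[start:stop])
    return out

r = np.arange(12, dtype=np.float64).reshape(4, 3)
doubled = process_in_chunks(r, lambda band: band * 2.0, max_bytes=100)
```

Per-band processing works for any pixel-wise expression; operations with spatial extent (buffers, focal statistics) additionally need overlapping bands, which is part of what makes a general autoscaling library non-trivial.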
+18
avsm/ideas_battery-free-riotee.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/battery-free-riotee\">Battery-free wildlife monitoring with Riotee</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a>. It is co-supervised with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a>.</p>\n<p>Monitoring wildlife in the field today relies heavily on <a href=\"https://anil.recoil.org/papers/2024-terracorder\">battery-powered devices</a>, like GPS collars or acoustic recorders. However, such devices are\noften deployed in remote environments, where battery replacement and data\nretrieval can be labour-intensive and time-consuming. Moving away from\nbattery-powered field devices could radically reduce the environmental\nfootprint and labour cost of wildlife monitoring. The rise of batteryless\nenergy-harvesting platforms could enable ultra-low-power, long-term,\nmaintenance-free deployments.\nHowever, existing battery-less devices are severely constrained, often unable to perform meaningful on-device computation\nsuch as ML inference or high-frequency audio capture.</p>\n<p>This project explores the development of next-generation, battery-less wildlife\nmonitoring platforms using <a href=\"https://www.crowdsupply.com/nessie-circuits/riotee\">Riotee</a>, an open-source platform purpose-built for\n<a href=\"https://www.sciencedirect.com/science/article/pii/S1383762120301430\">intermittent computing</a>.\nRiotee integrates energy harvesting with a powerful Cortex-M4 MCU and full SDK\nfor managing state-saving, redundancy, and graceful resume from power failures.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/battery-free-riotee\">273 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#battery-free-wildlife-monitoring-with-riotee\"></a>Battery-free wildlife monitoring with 
Riotee</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a>. It is co-supervised with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a>.</p>\n<p>Monitoring wildlife in the field today relies heavily on <a href=\"https://anil.recoil.org/papers/2024-terracorder\">battery-powered devices</a>, like GPS collars or acoustic recorders. However, such devices are\noften deployed in remote environments, where battery replacement and data\nretrieval can be labour-intensive and time-consuming. Moving away from\nbattery-powered field devices could radically reduce the environmental\nfootprint and labour cost of wildlife monitoring. The rise of batteryless\nenergy-harvesting platforms could enable ultra-low-power, long-term,\nmaintenance-free deployments.\nHowever, existing battery-less devices are severely constrained, often unable to perform meaningful on-device computation\nsuch as ML inference or high-frequency audio capture.</p>\n<p>This project explores the development of next-generation, battery-less wildlife\nmonitoring platforms using <a href=\"https://www.crowdsupply.com/nessie-circuits/riotee\">Riotee</a>, an open-source platform purpose-built for\n<a href=\"https://www.sciencedirect.com/science/article/pii/S1383762120301430\">intermittent computing</a>.\nRiotee integrates energy harvesting with a powerful Cortex-M4 MCU and full SDK\nfor managing state-saving, redundancy, and graceful resume from power failures.</p>\n<p>The project could involve work on one or more of the following areas:</p>\n<ul>\n<li>SDK tooling: developing a user-friendly C/Rust SDK that integrates audio recording, ML-based data processing, scheduling, and wireless communication into a unified and easily configurable framework for non-technical users in conservation and ecology.</li>\n<li>GPS tracking: building a hardware/software solution using Riotee 
for wildlife tracking, harvesting energy from both motion and solar.</li>\n<li>Acoustic monitoring: exploring the feasibility of bioacoustic monitoring on Riotee, quantifying the trade-off between scalability/lifetime and ecological data yield.</li>\n<li>On-device ML: adapting or training lightweight ML models to fit within Riotee\u2019s memory and energy budgets, and intermittent compute runtime.</li>\n</ul>\n<p>This project would suit a student interested in low-power hardware and/or\napplied ML. Prior experience with C and embedded programming would be helpful,\nbut the desire to get your hands dirty with low-level debugging is essential!</p>",
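The state-saving and graceful-resume pattern that the entry above attributes to the Riotee SDK can be illustrated with a toy sketch (plain Python, not the actual Riotee C SDK; the `checkpoint.json` file and all names are invented for illustration, with a file standing in for non-volatile memory):

```python
import json
import os

CKPT = "checkpoint.json"  # stands in for non-volatile memory

def process_samples(samples):
    """Resume-from-checkpoint pattern: progress survives a power failure
    because the loop state is persisted after every completed step."""
    state = {"i": 0, "acc": 0}
    if os.path.exists(CKPT):            # graceful resume after power loss
        with open(CKPT) as f:
            state = json.load(f)
    for i in range(state["i"], len(samples)):
        state["acc"] += samples[i]
        state["i"] = i + 1
        with open(CKPT, "w") as f:      # checkpoint each completed step
            json.dump(state, f)
    return state["acc"]
```

A real intermittent-computing runtime checkpoints far more cheaply and selectively; the point is only that restart begins from the last persisted state rather than from zero.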
+18
avsm/ideas_bigraphs-real-world.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/bigraphs-real-world\">Building bigraphs of the real world</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:ra652@cam.ac.uk\">Roy Ang</a>. It is co-supervised with <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>.</p>\n<p>Bigraphs were originally proposed as a model for the behaviour of ubiquitous\nsystems since interaction between mobile devices is dependent on both placing\n(locality) and linking (connectivity). However, there has yet to be a bigraph\nthat represents the complete physical world. Such a bigraph will enhance the\ncomputer's representation of its location from a simple latitude-longitude pair\nto a context more familiar to humans: the room it is in, the street the\nbuilding is on, and the town the street is in. This will allow for\nlocation-aware applications and policies about connectivity of mobile devices\nto work based on the defined locality of buildings, streets and administrative\nregions.</p>\n<p>The physical world has also long been represented by maps. <a href=\"https://openstreetmap.org\">OpenStreetMap</a>\n(OSM) is a freely-licensed geographic database built by a community of\nvolunteers through the annotation of data collected through surveys, aerial\nimagery and other free geodata sources. Boasting a user base of 10 million, OSM\nhas labelled buildings, streets and regions with impressive detail comparable\nwith commercial counterparts. The map elements are supplemented with key-value\npairs called "tags" that describe characteristics of the element. 
Tagging\nconventions vary across countries, but there are standard practices such as the\n<code>addr</code> tag on buildings to describe its address.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/bigraphs-real-world\">296 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#building-bigraphs-of-the-real-world\"></a>Building bigraphs of the real world</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:ra652@cam.ac.uk\">Roy Ang</a>. It is co-supervised with <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>.</p>\n<p>Bigraphs were originally proposed as a model for the behaviour of ubiquitous\nsystems since interaction between mobile devices is dependent on both placing\n(locality) and linking (connectivity). However, there has yet to be a bigraph\nthat represents the complete physical world. Such a bigraph will enhance the\ncomputer's representation of its location from a simple latitude-longitude pair\nto a context more familiar to humans: the room it is in, the street the\nbuilding is on, and the town the street is in. This will allow for\nlocation-aware applications and policies about connectivity of mobile devices\nto work based on the defined locality of buildings, streets and administrative\nregions.</p>\n<p>The physical world has also long been represented by maps. <a href=\"https://openstreetmap.org\">OpenStreetMap</a>\n(OSM) is a freely-licensed geographic database built by a community of\nvolunteers through the annotation of data collected through surveys, aerial\nimagery and other free geodata sources. Boasting a user base of 10 million, OSM\nhas labelled buildings, streets and regions with impressive detail comparable\nwith commercial counterparts. The map elements are supplemented with key-value\npairs called "tags" that describe characteristics of the element. 
Tagging\nconventions vary across countries, but there are standard practices such as the\n<code>addr</code> tag on buildings to describe their addresses.</p>\n<p>This project will demonstrate modelling the physical world as a bigraph. Places\nmarked on OSM will be hierarchically structured in a place graph, guided by\nadministrative boundaries such as country, state, city etc. Then, a link graph\nwill be built on top of the place graph to model the network of connected\nstreets. The use of such bigraphs for ubiquitous systems will be demonstrated\nwith the use case of Bluetooth connectivity, using reaction rules that allow\ndevices to move to a new place and form links with other devices in their\nproximity.</p>",
+18
avsm/ideas_brain-interface-security.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/brain-interface-security\">Security analysis of brain-computing interfaces</a> <span>/ Jan 2021</span></h2><div><p>This is an idea proposed in 2021 as a good starter project, and has been <span>completed</span> by <span>Malachy O'Connor Brown</span> and <span>Oscar Hill</span>. It was co-supervised with <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> and <a href=\"https://lorenaqendro.github.io\">Lorena Qendro</a>.</p>\n<p>Brain Computing Interface (BCI) technologies, both invasive and non-invasive,\nare increasingly used in a wide range of applications, from health-care to\nsmart communication and control. Most BCI applications are safety-critical or\nprivacy-sensitive. However, the infinite potential of BCI and its ever-growing\nmarket size have distracted the BCI community from significant security\nand privacy threats. In this research, we first investigate the security and\nprivacy threats of various BCI devices and applications, from machine learning\nadversarial threats to untrusted systems and malicious applications. Then, we\npropose a hybrid framework for analyzing and mitigating these threats utilizing\neffective combinations of ML robustness techniques, information flow control,\nand systems/hardware security.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/brain-interface-security\">281 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#security-analysis-of-brain-computing-interfaces\"></a>Security analysis of brain-computing interfaces</h1>\n<p>This is an idea proposed in 2021 as a good starter project, and has been <span>completed</span> by <span>Malachy O'Connor Brown</span> and <span>Oscar Hill</span>. 
It was co-supervised with <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> and <a href=\"https://lorenaqendro.github.io\">Lorena Qendro</a>.</p>\n<p>Brain Computing Interface (BCI) technologies, both invasive and non-invasive,\nare increasingly used in a wide range of applications, from health-care to\nsmart communication and control. Most BCI applications are safety-critical or\nprivacy-sensitive. However, the infinite potential of BCI and its ever-growing\nmarket size have distracted the BCI community from significant security\nand privacy threats. In this research, we first investigate the security and\nprivacy threats of various BCI devices and applications, from machine learning\nadversarial threats to untrusted systems and malicious applications. Then, we\npropose a hybrid framework for analyzing and mitigating these threats utilizing\neffective combinations of ML robustness techniques, information flow control,\nand systems/hardware security.</p>\n<p>There were two separate internship projects that emerged from this, worked on\nby <span>Malachy O'Connor Brown</span> and <span>Oscar Hill</span>. They were:</p>\n<ul>\n<li><strong>Security analysis of BCI systems.</strong> We explore the impact of current security threats on BCI stacks, including applications, frameworks, libraries, and systems abstractions. You will also investigate the possibility of new attack vectors and build tools to make the security analysis easier and more fun/automatic. You need to have development skills with C/C++ and scripting languages (e.g., Python). Experience with embedded devices, OS and sandboxes, reverse engineering, and threat analysis is preferred.</li>\n<li><strong>Adversarial attacks on BCI.</strong> We explore various methods to detect and analyze security threats on BCI ML models, including attacks based on perturbed inputs, inference, and model patterns. 
You need to have development skills (e.g., C, C++, Python) and experience with at least one ML/Deep Learning framework such as PyTorch or TensorFlow. Previous work on embedded devices and adversarial attacks is preferred.</li>\n</ul>\n<p>The results of this work were written up in <a href=\"https://anil.recoil.org/papers/2022-enhancing-brain-security\">Enhancing the Security & Privacy of Wearable Brain-Computer Interfaces</a>,\nwhich is a really fun but rather worrying read!</p>",
+18
avsm/ideas_cairngorms-connect-habitats.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/cairngorms-connect-habitats\">Habitat mapping of the Cairngorms Connect restoration area</a> <span>/ Jun 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a>. It is co-supervised with <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://eo.conservation.cam.ac.uk/people/aland-chan/\">Aland Chan</a>.</p>\n<p>The <a href=\"https://cairngormsconnect.org.uk/\">Cairngorms Connect</a> is the largest landscape restoration project in the UK.\nFour landowners (RSPB, Wildlands Ltd, FLS, and NatureScot) embarked on a\n200-year vision to restore over 600 km2 of land in the Cairngorms National Park\nwith an emphasis on natural processes.</p>\n<p>In July 2023, the <a href=\"https://www.clr.conservation.cam.ac.uk/\">Centre for Landscape\nRegeneration</a> commissioned a flight\nover a 400 km2 stretch of land in the area, collecting both high resolution\nRGBI imagery (0.1m ground resolution) and LiDAR data. Various research\nprojects were built on this dataset, including studies into carbon cycling,\nshrub ecology, tree regeneration, and deadwood detection.</p>\n<p>Existing habitat maps of the area are based on Sentinel 2 satellite data at a\nground resolution of 10m. While this dataset provides a good basis for some\nresearch objectives, a habitat map that could leverage the high resolution of\nthe aerial imagery would potentially be able to capture fine-scale variations\nin habitat structure more accurately. 
This project involves applying new\ndevelopments in geospatial machine learning (specifically the\n<a href=\"https://anil.recoil.org/papers/2025-tessera\">Tessera</a> model developed locally in Cambridge) to achieve this.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#habitat-mapping-of-the-cairngormes-connect-restoration-area\"></a>Habitat mapping of the Cairngorms Connect restoration area</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a>. It is co-supervised with <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://eo.conservation.cam.ac.uk/people/aland-chan/\">Aland Chan</a>.</p>\n<p>The <a href=\"https://cairngormsconnect.org.uk/\">Cairngorms Connect</a> is the largest landscape restoration project in the UK.\nFour landowners (RSPB, Wildlands Ltd, FLS, and NatureScot) embarked on a\n200-year vision to restore over 600 km2 of land in the Cairngorms National Park\nwith an emphasis on natural processes.</p>\n<p>In July 2023, the <a href=\"https://www.clr.conservation.cam.ac.uk/\">Centre for Landscape\nRegeneration</a> commissioned a flight\nover a 400 km2 stretch of land in the area, collecting both high resolution\nRGBI imagery (0.1m ground resolution) and LiDAR data. Various research\nprojects were built on this dataset, including studies into carbon cycling,\nshrub ecology, tree regeneration, and deadwood detection.</p>\n<p>Existing habitat maps of the area are based on Sentinel 2 satellite data at a\nground resolution of 10m. While this dataset provides a good basis for some\nresearch objectives, a habitat map that could leverage the high resolution of\nthe aerial imagery would potentially be able to capture fine-scale variations\nin habitat structure more accurately. 
This project involves applying new\ndevelopments in geospatial machine learning (specifically the\n<a href=\"https://anil.recoil.org/papers/2025-tessera\">Tessera</a> model developed locally in Cambridge) to achieve this.</p>",
+18
avsm/ideas_causal-rpc.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/causal-rpc\">CausalRPC: a traceable distributed computation framework</a> <span>/ Jan 2018</span></h2><div><p>This is an idea proposed in 2018 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://craigfe.io\">Craig Ferguson</a>.</p>\n<p>The project aims to implement an RPC framework in OCaml using the <a href=\"https://github.com/mirage/irmin\">Irmin</a> distributed database library as a network substrate. It will explore the trade-offs of a novel data-oriented approach to RPC in which race conditions between clients are resolved automatically by the middleware layer. The core deliverable is a demonstration of an RPC client remotely executing functions with Irmin-serialisable parameters on a server capable of handling concurrent client requests.</p>\n<p>The project was completed successfully, with an implementation of <a href=\"https://github.com/craigfe/causal-rpc\">CausalRPC</a>, a distributed computation framework satisfying the above criteria. The approach of making the statefulness of RPC explicit was surprisingly effective, allowing CausalRPC to provide stronger consistency and traceability guarantees than conventional RPC systems. This broadened the scope of the project considerably, allowing for a variety of extensions to explore the inherent trade-offs of the approach. 
The final version of CausalRPC supported fault-tolerant worker clusters and is compatible with <a href=\"https://mirageos.org\">MirageOS</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/causal-rpc\">244 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#causalrpc-a-traceable-distributed-computation-framework\"></a>CausalRPC: a traceable distributed computation framework</h1>\n<p>This is an idea proposed in 2018 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://craigfe.io\">Craig Ferguson</a>.</p>\n<p>The project aims to implement an RPC framework in OCaml using the <a href=\"https://github.com/mirage/irmin\">Irmin</a> distributed database library as a network substrate. It will explore the trade-offs of a novel data-oriented approach to RPC in which race conditions between clients are resolved automatically by the middleware layer. The core deliverable is a demonstration of an RPC client remotely executing functions with Irmin-serialisable parameters on a server capable of handling concurrent client requests.</p>\n<p>The project was completed successfully, with an implementation of <a href=\"https://github.com/craigfe/causal-rpc\">CausalRPC</a>, a distributed computation framework satisfying the above criteria. The approach of making the statefulness of RPC explicit was surprisingly effective, allowing CausalRPC to provide stronger consistency and traceability guarantees than conventional RPC systems. This broadened the scope of the project considerably, allowing for a variety of extensions to explore the inherent trade-offs of the approach. 
The final version of CausalRPC supported fault-tolerant worker clusters and is compatible with <a href=\"https://mirageos.org\">MirageOS</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2015-jfla-irmin\">Mergeable persistent data structures</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The project PDF writeup is publicly <a href=\"https://www.craigfe.io/causalrpc.pdf\">available</a>, and <a href=\"https://craigfe.io\">Craig Ferguson</a> won the departmental G-Research Prize for Best Individual Project 2018.</p>\n<p><a href=\"https://craigfe.io\">Craig Ferguson</a> also gave a <a href=\"https://ocaml.org/workshops/ocaml-workshop-2019\">talk about CausalRPC</a> at the 2019 OCaml Workshop. Unfortunately the videos of that year's ICFP don't seem to have made it online, but the <a href=\"https://github.com/CraigFe/causal-rpc-talk\">slides are available</a>.</p>\n<p><a href=\"https://craigfe.io\">Craig Ferguson</a> followed up with a podcast where he discussed his subsequent work on Irmin in 2022:</p>",
+18
avsm/ideas_choregraphic-programming-ocaml.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/choregraphic-programming-ocaml\">Implementing a higher-order choreographic language</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://github.com/Rokcas\">Rokas Urbonas</a>. It was co-supervised with <a href=\"https://www.cst.cam.ac.uk/people/ds709\">Dmitrij Szamozvancev</a>.</p>\n<p>This project aims to implement a functional choreographic language inspired by\nthe <a href=\"https://dl.acm.org/doi/pdf/10.1145/3498684\">Pirouette calculus</a>. This language was meant to make the notoriously\ndifficult process of implementing distributed algorithms easier, while offering\na practical execution model for multi-participant programs. Additionally, it\naimed to match the expressiveness and performance of similar existing\nsolutions.</p>\n<p>The project completed very successfully, and resulted in <a href=\"https://github.com/Rokcas/chorcaml\"><em>ChorCaml</em></a>, an\nembedded DSL for choreographic programming in OCaml. The language facilitates\nthe implementation of distributed algorithms, while offering a clear syntax and\nsafety via the type system. ChorCaml also improves upon existing alternatives\nin certain common use cases, both in terms of program conciseness and\nperformance. 
The practicality of the DSL was verified by successfully\nimplementing well-known distributed algorithms such as Diffie-Hellman key\nexchange and concurrent Karatsuba fast integer multiplication.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/choregraphic-programming-ocaml\">163 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#implementing-a-higher-order-choreographic-language\"></a>Implementing a higher-order choreographic language</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://github.com/Rokcas\">Rokas Urbonas</a>. It was co-supervised with <a href=\"https://www.cst.cam.ac.uk/people/ds709\">Dmitrij Szamozvancev</a>.</p>\n<p>This project aims to implement a functional choreographic language inspired by\nthe <a href=\"https://dl.acm.org/doi/pdf/10.1145/3498684\">Pirouette calculus</a>. This language was meant to make the notoriously\ndifficult process of implementing distributed algorithms easier, while offering\na practical execution model for multi-participant programs. Additionally, it\naimed to match the expressiveness and performance of similar existing\nsolutions.</p>\n<p>The project completed very successfully, and resulted in <a href=\"https://github.com/Rokcas/chorcaml\"><em>ChorCaml</em></a>, an\nembedded DSL for choreographic programming in OCaml. The language facilitates\nthe implementation of distributed algorithms, while offering a clear syntax and\nsafety via the type system. ChorCaml also improves upon existing alternatives\nin certain common use cases, both in terms of program conciseness and\nperformance. 
The practicality of the DSL was verified by successfully\nimplementing well-known distributed algorithms such as Diffie-Hellman key\nexchange and concurrent Karatsuba fast integer multiplication.</p>\n<p><a href=\"https://github.com/Rokcas\">Rokas Urbonas</a> subsequently submitted a proposal to the OCaml Workshop about his\nwork, and presented it at the <a href=\"https://icfp24.sigplan.org/details/ocaml-2024-papers/13/ChorCaml-Functional-Choreographic-Programming-in-OCaml\">2024 edition of the OCaml Workshop</a>.</p>\n<ul>\n<li><a href=\"https://www.youtube.com/watch?v=KEkmcXVtFi0\">Video</a> of his talk</li>\n<li><a href=\"https://ocaml2024.hotcrp.com/doc/ocaml2024-paper17.pdf\">PDF</a> of his writeup.</li>\n</ul>",
+18
avsm/ideas_chunk-free-embeddings.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/chunk-free-embeddings\">Generating chunk-free embeddings for LLMs</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"mailto:mj651@cam.ac.uk\">Mark Jacobsen</a>. It is co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>This project aims to explore the development of a chunk-free approach for\ngenerating embeddings in Retrieval-Augmented Generation (RAG) models.\nTraditional RAG workflows often involve manual or predefined chunking of\ndocuments, and we seek to bypass this requirement.</p>\n<p>Instead, our approach involves generating multiple embeddings for unchunked\ntext using a synthetic dataset created by (e.g.) a 7b parameter LLM. This\ndataset would feature structured, point-by-point summaries of each paragraph.\nAn off-the-shelf embedding model could then be modified by removing its mean\npooling layer and incorporating cross-attention layers. These layers, inspired\nby T5's encoder-decoder architecture, would enable a frozen set of embeddings\nto interact with summary-based embeddings via cross-attention, creating a more\nnuanced chunk-free representation.</p>\n<p>Additionally, the research aims to explore adaptive chunking driven by a\ntrained model, allowing context-aware embedding generation end-to-end. This\nmethod promises a more integrated and efficient approach, eliminating the need\nfor separate summarization and embedding processes.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#generating-chunk-free-embeddings-for-llms\"></a>Generating chunk-free embeddings for LLMs</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"mailto:mj651@cam.ac.uk\">Mark Jacobsen</a>. 
It is co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>This project aims to explore the development of a chunk-free approach for\ngenerating embeddings in Retrieval-Augmented Generation (RAG) models.\nTraditional RAG workflows often involve manual or predefined chunking of\ndocuments, and we seek to bypass this requirement.</p>\n<p>Instead, our approach involves generating multiple embeddings for unchunked\ntext using a synthetic dataset created by (e.g.) a 7b parameter LLM. This\ndataset would feature structured, point-by-point summaries of each paragraph.\nAn off-the-shelf embedding model could then be modified by removing its mean\npooling layer and incorporating cross-attention layers. These layers, inspired\nby T5's encoder-decoder architecture, would enable a frozen set of embeddings\nto interact with summary-based embeddings via cross-attention, creating a more\nnuanced chunk-free representation.</p>\n<p>Additionally, the research aims to explore adaptive chunking driven by a\ntrained model, allowing context-aware embedding generation end-to-end. This\nmethod promises a more integrated and efficient approach, eliminating the need\nfor separate summarization and embedding processes.</p>",
+18
avsm/ideas_compressive-geospatial.json
···+"title": "Assessing high-performance lightweight compression formats for geospatial computation",+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/compressive-geospatial\">Assessing high-performance lightweight compression formats for geospatial computation</a> <span>/ Jan 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://github.com/omarathon\">Omar Tanner</a>. It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Geospatial data processing can benefit from applying lightweight compression\ntechniques to data in GeoTIFF format, addressing the challenge of modern CPU\nbandwidth surpassing RAM bandwidth. This project will explore how to mitigate\nthe impact of poor cache locality and the resulting memory bottlenecks by\nleveraging CPU superscalar capabilities and SIMD instructions. By implementing\nSIMD-optimised compression, data can remain compressed in RAM and closer to the\nCPU caches, facilitating faster access and alleviating memory constraints.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/compressive-geospatial\">113 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#assessing-high-performance-lightweight-compression-formats-for-geospatial-computation\"></a>Assessing high-performance lightweight compression formats for geospatial computation</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://github.com/omarathon\">Omar Tanner</a>. It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Geospatial data processing can benefit from applying lightweight compression\ntechniques to data in GeoTIFF format, addressing the challenge of modern CPU\nbandwidth surpassing RAM bandwidth. 
This project will explore how to mitigate\nthe impact of poor cache locality and the resulting memory bottlenecks by\nleveraging CPU superscalar capabilities and SIMD instructions. By implementing\nSIMD-optimised compression, data can remain compressed in RAM and closer to the\nCPU caches, facilitating faster access and alleviating memory constraints.</p>\n<h2><a href=\"https://anil.recoil.org/#background-reading\"></a>Background Reading</h2>\n<ul>\n<li>Damme, P., Habich, D., Hildebrandt, J. & Lehner, W. Lightweight Data Compression Algorithms: An Experimental Survey (Experiments and Analyses), 2017.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li><a href=\"https://github.com/omarathon/mres/blob/32bcdd4413e951933c40f037c0c595ebbebe3aca/mres_project.pdf\">Dissertation PDF</a> for the <a href=\"https://cdt.sensors.cam.ac.uk/sd-classification/2023-student-cohort\">Sensors CDT MRes</a>.</li>\n<li><a href=\"https://github.com/omarathon/compression-geospatial\">Source Code</a></li>\n</ul>",
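As a toy illustration of the lightweight-compression idea in the entry above (not one of the codecs the project would actually benchmark), delta-encoding a smooth raster scanline makes the residuals small enough to store in a quarter of the bits, so data can stay compact in RAM until the moment it is used:

```python
import numpy as np

def delta_encode(row):
    # Keep the first sample verbatim; store the rest as successive
    # differences, which are small for smooth rasters and fit in 8 bits.
    # (A real codec would guard against deltas that overflow int8.)
    return int(row[0]), np.diff(row).astype(np.int8)

def delta_decode(first, deltas):
    # Widen before the prefix sum so reconstruction cannot overflow.
    return np.concatenate(([first], first + np.cumsum(deltas, dtype=np.int64)))
```

The SIMD angle is that both `diff` and the prefix-sum decode are exactly the kind of data-parallel loops vector instructions accelerate well.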
+18
avsm/ideas_computational-scientific-methods.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/computational-scientific-methods\">Computational Models for Scientific Exploration</a> <span>/ Aug 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>. It is co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<p>The modern scientific method has become highly computational, but computer\nscience hasn't entirely caught up and is sometimes hindering research progress.</p>\n<p>Using the computational needs of climate science and ecology as a case study, we are\nconducting a systematic study of the sources of uncertainty in these fields.\nWe are also designing and implementing a specification language and hermetic\ncomputation environment that empowers climate scientists and ecologists to\ncreate less ambiguous, more precise and testable scientific methodologies and\nresults, while preserving the ability to explore and introspect intermediate\nresults.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/computational-scientific-methods\">125 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#computational-models-for-scientific-exploration\"></a>Computational Models for Scientific Exploration</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>. 
It is co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<p>The modern scientific method has become highly computational, but computer\nscience hasn't entirely caught up and is sometimes hindering research progress.</p>\n<p>Using the computational needs of climate science and ecology as a case study, we are\nconducting a systematic study of the sources of uncertainty in these fields.\nWe are also designing and implementing a specification language and hermetic\ncomputation environment that empowers climate scientists and ecologists to\ncreate less ambiguous, more precise and testable scientific methodologies and\nresults, while preserving the ability to explore and introspect intermediate\nresults.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li>There is an extensive amount of source code up at <a href=\"https://github.com/quantifyearth\">https://github.com/quantifyearth</a>\nand <a href=\"https://github.com/carboncredits\">https://github.com/carboncredits</a> which forms part of our pipeline.</li>\n<li>See the related ideas for some smaller scale projects you can engage with.</li>\n</ul>",
+18
avsm/ideas_computational-storage-for-vector-dbs.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/computational-storage-for-vector-dbs\">Using computational SSDs for vector databases</a> <span>/ Feb 2025</span></h2><div><p>This is an idea proposed in 2025 as a Cambridge Computer Science Part III or MPhil project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Large <a href=\"https://en.wikipedia.org/wiki/Foundation_model\">pre-trained models</a> can be used to embed media/documents into concise vector representations with the property that vectors that are "close" to each other are semantically related. <a href=\"https://en.wikipedia.org/wiki/Nearest_neighbor_search\">ANN</a> (Approximate Nearest Neighbour) search on these embeddings is used heavily already in <a href=\"https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/\">RAG</a> systems for LLMs or search-by-example for satellite imagery.</p>\n<p>Right now, most ANN databases almost exclusively use memory-resident indexes to accelerate this searching. This is a showstopper for larger datasets, such as the terabytes of PDFs we have for our <a href=\"https://anil.recoil.org/projects/ce\">big evidence synthesis</a> project, each of which generates dozens of embeddings. For global satellite datasets for <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing of nature</a> at 10m scale this is easily petabytes per year (the raw data here would need to come from tape drives).</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/computational-storage-for-vector-dbs\">398 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#using-computational-ssds-for-vector-databases\"></a>Using computational SSDs for vector databases</h1>\n<p>This is an idea proposed in 2025 as a Cambridge Computer Science Part III or MPhil project, and is <span>available</span> for being worked on. 
It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Large <a href=\"https://en.wikipedia.org/wiki/Foundation_model\">pre-trained models</a> can be used to embed media/documents into concise vector representations with the property that vectors that are "close" to each other are semantically related. <a href=\"https://en.wikipedia.org/wiki/Nearest_neighbor_search\">ANN</a> (Approximate Nearest Neighbour) search on these embeddings is used heavily already in <a href=\"https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/\">RAG</a> systems for LLMs or search-by-example for satellite imagery.</p>\n<p>Right now, most ANN databases almost exclusively use memory-resident indexes to accelerate this searching. This is a showstopper for larger datasets, such as the terabytes of PDFs we have for our <a href=\"https://anil.recoil.org/projects/ce\">big evidence synthesis</a> project, each of which generates dozens of embeddings. For global satellite datasets for <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing of nature</a> at 10m scale this is easily petabytes per year (the raw data here would need to come from tape drives).</p>\n<p>The project idea is that <a href=\"https://www.xilinx.com/publications/product-briefs/xilinx-smartssd-computational-storage-drive-product-brief.pdf\">computational storage devices</a> can add compute (via FPGAs) to the SSD controller and let us compute on the data <em>before</em> it reaches main memory. Binary quantisation of embedding vectors is now practical <a href=\"https://anil.recoil.org/#fn-1\">[1]</a>, and so simple comparison of these should be quite amenable to acceleration with the SSD-attached FPGA. Since we're willing to trade off searching more vectors, each SSD only needs to hold a lightweight index shard (potentially a flat IVF). 
In a big storage array, every SSD could then return the small number of original (un-quantised) embeddings which were closest to the query points, and then the CPU would do a fast final reranking step <a href=\"https://anil.recoil.org/#fn-2\">[2]</a>.</p>\n<p>Our hypothesis is that we could scale vector database size just by adding more SSDs, through both storage and aggregate disk throughput.\nThere are risks to overcome though: if the FPGAs on the SSD controllers don't have enough compute to keep up with the full SSD bandwidth, or we can't discard a high enough percentage of vectors via the on-disk index, then we're memory-bound without much gain. A key part of the solution is balancing out the memory vs SSD bandwidth carefully via some autotuning\n(e.g. if we have 4TB per SSD shard we have 9GB/s of max bandwidth, so we'd need to discard 99.9% of the on-disk indexed vectors to get sub-second response times).</p>\n<p>But if the experiment does succeed, we could get real-time, sub-second response times on massive datasets, which would be a game changer for interactive exploration of huge datasets. A student more interested in the programming interface side may also wish to look over my <a href=\"https://anil.recoil.org/notes/fpgas-hardcaml\">OCaml FPGA notes</a>.</p>\n\n<ol>\n<li>\n<p>https://arxiv.org/abs/2405.12497</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>https://arxiv.org/abs/2106.00882</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_concurrent-revisions.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/concurrent-revisions\">Concurrent revisions for OCaml</a> <span>/ Jan 2013</span></h2><div><p>This is an idea proposed in 2013 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Dimitar Popov</span>.</p>\n<p>The biggest challenge when using parallel programming is typically how to keep\ntrack of the side effects of computations that are executed in parallel and\nthat involve shared mutable state. Traditional methods for dealing with this\nissue often limit concurrency, do not provide sufficient determinism and are\nerror prone. Ideally, we would like a concept where all conflicts between\nparallel tasks are resolved deterministically with minimized effort from the\nprogrammer.\nThis project aims to design and build a library for OCaml that implements the\nconcept of <a href=\"https://www.microsoft.com/en-us/research/project/concurrent-revisions/\">concurrent\nrevisions</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/concurrent-revisions\">361 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#concurrent-revisions-for-ocaml\"></a>Concurrent revisions for OCaml</h1>\n<p>This is an idea proposed in 2013 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Dimitar Popov</span>.</p>\n<p>The biggest challenge when using parallel programming is typically how to keep\ntrack of the side effects of computations that are executed in parallel and\nthat involve shared mutable state. Traditional methods for dealing with this\nissue often limit concurrency, do not provide sufficient determinism and are\nerror prone. 
Ideally, we would like a concept where all conflicts between\nparallel tasks are resolved deterministically with minimized effort from the\nprogrammer.\nThis project aims to design and build a library for OCaml that implements the\nconcept of <a href=\"https://www.microsoft.com/en-us/research/project/concurrent-revisions/\">concurrent\nrevisions</a>.</p>\n<p>Concurrent revisions as initially proposed highlight these design choices:</p>\n<ol>\n<li>Declarative data sharing: the user declares what data is to be shared between parallel tasks by the use of isolation types</li>\n<li>Automatic isolation: each task has its own private stable copy of the data that is taken at the time of the fork</li>\n<li>Deterministic conflict resolution: the user specifies a merge function that is used to resolve write-write conflicts that might arise when joining parallel tasks. Given that this function is deterministic, the conflict resolution is also deterministic.</li>\n</ol>\n<p>In this framework the units of concurrency are asynchronous tasks called\n<em>revisions</em>. They provide the typical functionality for asynchronous tasks:\nthe user can create, fork and join them. This moves the complexity of\nsynchronization out of the tasks themselves and gathers it into a single place: the <code>merge</code> function.</p>\n<p>A key outcome is to improve our understanding of the tradeoffs, both between the\ndifferent paths that can be chosen during the implementation of this library\nand between this approach and more traditional means of concurrent programming. 
We will design an\nevaluation of the differences between the API of the original concurrent\nrevisions implementation written in C# and the more functional style of one\nbuilt in OCaml.</p>\n<p>The project was successfully completed, with the major decision being whether\nor not to switch to a monadic API vs a direct-style one with better lower-level\ncontrol.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li><a href=\"https://www.microsoft.com/en-us/research/project/concurrent-revisions/\">Concurrent Revisions at Microsoft Research</a></li>\n<li><a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation <a href=\"https://github.com/dpp23/ocaml_revisions/\">PDF is available</a>\npublicly along with the <a href=\"https://github.com/dpp23/ocaml_revisions/\">source code to the prototype\nlibrary</a> which implemented a logging\nand chat server to demonstrate the use of concurrent revisions.</p>",
+18
avsm/ideas_decomposing-audio-with-dl.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/decomposing-audio-with-dl\">Deep learning for decomposing sound into vector audio</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://pure.qub.ac.uk/en/persons/trevor-agus\">Trevor Agus</a>.</p>\n<p>All that we hear is mediated through cues transmitted to the brain from the\ncochlea, which acts like a bank of auditory filters centred at a wide range of\ncentre frequencies. A lot of our knowledge of hearing comes from\npsychoacoustical experiments that involve simple sounds, like sine waves, whose\nsynthesis parameters are closely related to cues available beyond the cochlea.\nHowever, for recorded sounds, many types of cue are available, but our use of\nthese cues is limited by the extent that these cues can be manipulated in a\ncontrolled fashion. [^1] [^2]</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/decomposing-audio-with-dl\">267 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#deep-learning-for-decomposing-sound-into-vector-audio\"></a>Deep learning for decomposing sound into vector audio</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://pure.qub.ac.uk/en/persons/trevor-agus\">Trevor Agus</a>.</p>\n<p>All that we hear is mediated through cues transmitted to the brain from the\ncochlea, which acts like a bank of auditory filters centred at a wide range of\ncentre frequencies. 
A lot of our knowledge of hearing comes from\npsychoacoustical experiments that involve simple sounds, like sine waves, whose\nsynthesis parameters are closely related to cues available beyond the cochlea.\nHowever, for recorded sounds, many types of cue are available, but our use of\nthese cues is limited by the extent that these cues can be manipulated in a\ncontrolled fashion. <a href=\"https://anil.recoil.org/#fn-1\">[1]</a> <a href=\"https://anil.recoil.org/#fn-2\">[2]</a></p>\n<p>The goal of this project is to apply deep learning tools to explore the extent\nto which recorded sounds, such as speech, music and noise, can be decomposed\ninto components, such as modulated sine waves, that dominate independent\nregions of activity on the cochlea. The training data would come from\ncombinations of basic sounds with known synthesis parameters and the\ncorresponding output from a differentiable auditory filterbank, which has\nrecently become available (Famularo<a href=\"https://anil.recoil.org/#fn-3\">[3]</a>). The ability to control perceptually\nrelevant parameters of arbitrarily complex sounds would be a powerful tool in\nhearing research and may have other applications in data compression and\nartificially generated sound.</p>\n<p><em>(Note: this will be co-supervised with faculty from Queen's University, Belfast)</em></p>\n\n<ol>\n<li>\n<p>McDermott, J.H. and E.P. Simoncelli, <a href=\"https://www.sciencedirect.com/science/article/pii/S0896627311005629\">Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis</a>. Neuron, 2011. 71(5): p. 926-40.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Agus, T.R., et al., <a href=\"https://pubmed.ncbi.nlm.nih.gov/22559384/\">Fast recognition of musical sounds based on timbre</a>. J Acoust Soc Am, 2012. 131(5): p. 
4124-33.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Famularo, R.L., et al., <a href=\"https://www.arxiv.org/abs/2409.08997\">Biomimetic frontend for differentiable audio processing</a>. [pre-print], 2024.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_differentiable-abm.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/differentiable-abm\">Scalable agent-based models for optimized policy design</a> <span>/ Jan 2022</span></h2><div><p>This is an idea proposed in 2022 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <span>Sharan Agrawal</span>. It was co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<p>As the world faces twinned crises of climate change and biodiversity loss, the need for integrated policy approaches addressing both is paramount. To help address this, this project investigates a new agent-based model dubbed the VDSK-B. Using Dasgupta's <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">review of the economics of biodiversity</a>, it builds on the <a href=\"https://www.sciencedirect.com/science/article/pii/S0921800917314623\">Dystopian Schumpeter meets Keynes</a> (DSK) climate economics model to link together the climate, economy and biosphere. This is the first ABM proposed that integrates all 3 key elements.</p>\n<p>The project also investigates how to scale such ABMs to be applicable for global policy design and scale to planetary-sized models. A new ABM framework called SalVO expresses agent updates as recursive applications of pure agent functions. This formalism differs from existing computational ABM models but is shown to be expressive enough to emulate a Turing complete language. SalVO is built on a JAX backend and designed to be scalable, vectorized, and optimizable. 
Employing hardware acceleration, tests showed it was more performant and more able to scale on a single machine than any existing ABM framework, such as FLAME (GPU).</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/differentiable-abm\">252 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#scalable-agent-based-models-for-optimized-policy-design\"></a>Scalable agent-based models for optimized policy design</h1>\n<p>This is an idea proposed in 2022 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <span>Sharan Agrawal</span>. It was co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<p>As the world faces twinned crises of climate change and biodiversity loss, the need for integrated policy approaches addressing both is paramount. To help address this, this project investigates a new agent-based model dubbed the VDSK-B. Using Dasgupta's <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">review of the economics of biodiversity</a>, it builds on the <a href=\"https://www.sciencedirect.com/science/article/pii/S0921800917314623\">Dystopian Schumpeter meets Keynes</a> (DSK) climate economics model to link together the climate, economy and biosphere. This is the first ABM proposed that integrates all 3 key elements.</p>\n<p>The project also investigates how to scale such ABMs to be applicable for global policy design and scale to planetary-sized models. A new ABM framework called SalVO expresses agent updates as recursive applications of pure agent functions. This formalism differs from existing computational ABM models but is shown to be expressive enough to emulate a Turing complete language. SalVO is built on a JAX backend and designed to be scalable, vectorized, and optimizable. 
Employing hardware acceleration, tests showed it was more performant and more able to scale on a single machine than any existing ABM framework, such as FLAME (GPU).</p>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation is available as <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-985.pdf\">UCAM-CL-TR-985</a> from the Cambridge Computer Lab technical reports series. The project was awarded the "2023 best M.Phil Project" prize from the Cambridge Computer Science department.</p>\n<p><span>Sharan Agrawal</span> also presented this work at <a href=\"https://propl.dev\">PROPL 2024</a>:</p>\n<div>\n\n</div>\n<h2><a href=\"https://anil.recoil.org/#see-also\"></a>See Also</h2>\n<p><a href=\"https://www.linkedin.com/in/pedro-marques-sousa/\">Pedro Sousa</a> did a follow up project on <a href=\"https://anil.recoil.org/ideas/rev-abm\">Reverse emulating agent-based models for policy simulation</a> in 2023.</p>",
+18
avsm/ideas_diffusion-model-satellites.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/diffusion-model-satellites\">Diffusion models for terrestrial predictions about land use change</a> <span>/ May 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>This project investigates how to build remote sensing data-driven models for\nthe evolution of landscapes, which we can use to better predict deforestation,\nflooding and fire risks. Diffusion models are now widespread for image\ngeneration and are now being applied to video.<a href=\"https://anil.recoil.org/#fn-3\">[1]</a> In addition the GenCast project\nfrom Google Deepmind used a diffusion model ensemble for weather forecasting,\nresulting in a high degree of accuracy compared to traditional methods.<a href=\"https://anil.recoil.org/#fn-2\">[2]</a></p>\n<p>The goal of this project is to train a video diffusion model on time series of\noptical and radar satellite tiles and evaluate its performance in predicting\nchanges in land use / land cover (such as deforestation or flooding).<a href=\"https://anil.recoil.org/#fn-1\">[3]</a> A\nstretch goal is to build a user interface over this to predict and visualise\nthe effects of a given change in land cover over time.</p>\n\n<ol>\n<li>\n<p>"<a href=\"https://arxiv.org/abs/2312.15796\">GenCast: Diffusion-based ensemble forecasting for medium range weather</a>", arXiv:2312.15796</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>"<a href=\"https://arxiv.org/abs/2405.03150\">Video Diffusion Models: A Survey</a>" (May 2024), <a href=\"https://video-diffusion.github.io\">https://video-diffusion.github.io</a>.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>"<a href=\"https://arxiv.org/abs/2312.03606\">DiffusionSat: A 
Generative Foundation Model for Satellite Imagery</a>" (Dec 2023)</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#diffusion-models-for-terrestrial-predictions-about-land-use-change\"></a>Diffusion models for terrestrial predictions about land use change</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>This project investigates how to build remote sensing data-driven models for\nthe evolution of landscapes, which we can use to better predict deforestation,\nflooding and fire risks. Diffusion models are now widespread for image\ngeneration and are now being applied to video.<a href=\"https://anil.recoil.org/#fn-3\">[1]</a> In addition the GenCast project\nfrom Google Deepmind used a diffusion model ensemble for weather forecasting,\nresulting in a high degree of accuracy compared to traditional methods.<a href=\"https://anil.recoil.org/#fn-2\">[2]</a></p>\n<p>The goal of this project is to train a video diffusion model on time series of\noptical and radar satellite tiles and evaluate its performance in predicting\nchanges in land use / land cover (such as deforestation or flooding).<a href=\"https://anil.recoil.org/#fn-1\">[3]</a> A\nstretch goal is to build a user interface over this to predict and visualise\nthe effects of a given change in land cover over time.</p>\n\n<ol>\n<li>\n<p>"<a href=\"https://arxiv.org/abs/2312.15796\">GenCast: Diffusion-based ensemble forecasting for medium range weather</a>", arXiv:2312.15796</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>"<a href=\"https://arxiv.org/abs/2405.03150\">Video Diffusion Models: A Survey</a>" (May 2024), <a 
href=\"https://video-diffusion.github.io\">https://video-diffusion.github.io</a>.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>"<a href=\"https://arxiv.org/abs/2312.03606\">DiffusionSat: A Generative Foundation Model for Satellite Imagery</a>" (Dec 2023)</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_digitisation-of-insects.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">Affordable digitisation of insect collections using photogrammetry</a> <span>/ Feb 2025</span></h2><div><p>This is an idea proposed in 2025 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a>, <a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a> and <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a>. It is co-supervised with <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-edgar-turner\">Edgar Turner</a>.</p>\n<p>Insects dominate animal biodiversity and are sometimes called "<a href=\"https://faculty.washington.edu/timbillo/Readings%20and%20documents/ABRIDGED%20READINGS%20for%20PERU/Wilson_1987_Little_things_that_run.pdf\">the little things that run the world</a>". They play a disproportionate role in ecosystem functioning, are highly sensitive to environmental change and often considered to be early indicators of responses in other taxa. There is widespread concern about global insect declines[^1] yet the evidence behind such declines is highly biassed towards the Global North and much is drawn from short-term biodiversity datasets[^2] [^3].</p>\n<p>The <a href=\"https://www.museum.zoo.cam.ac.uk/insects\">Insect Collection</a> at the University Museum of Zoology, Cambridge holds over 1.2 million specimens. These include specimens collected from the early 19th century to the present day. Most specimens remain undocumented and unavailable for analysis. However, they contain data that are critical to understanding long-term species and community responses to anthropogenic change, and vital to evaluating whether short-term declines are representative of longer-term trends[^4] [^5]. 
As such, unlocking these insect collections is of paramount importance, and the large-scale nature of these collections necessitates the development of an efficient and effective digitisation process.</p>\n<p>The 3D digitisation of specimens using current methods is either highly time-intensive or expensive, rendering it impossible to achieve across the collection in a reasonable time-frame. Yet, 3D models of specimens have huge potential for investigating species morphological responses to anthropogenic changes over time and identification of trade-offs in morphological responses within a 3D morphospace.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">540 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#affordable-digitisation-of-insect-collections-using-photogrammetry\"></a>Affordable digitisation of insect collections using photogrammetry</h1>\n<p>This is an idea proposed in 2025 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a>, <a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a> and <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a>. It is co-supervised with <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-edgar-turner\">Edgar Turner</a>.</p>\n<p>Insects dominate animal biodiversity and are sometimes called "<a href=\"https://faculty.washington.edu/timbillo/Readings%20and%20documents/ABRIDGED%20READINGS%20for%20PERU/Wilson_1987_Little_things_that_run.pdf\">the little things that run the world</a>". They play a disproportionate role in ecosystem functioning, are highly sensitive to environmental change and often considered to be early indicators of responses in other taxa. 
There is widespread concern about global insect declines<a href=\"https://anil.recoil.org/#fn-1\">[1]</a>, yet the evidence behind such declines is highly biased towards the Global North and much is drawn from short-term biodiversity datasets<a href=\"https://anil.recoil.org/#fn-2\">[2]</a> <a href=\"https://anil.recoil.org/#fn-3\">[3]</a>.</p>\n<p>The <a href=\"https://www.museum.zoo.cam.ac.uk/insects\">Insect Collection</a> at the University Museum of Zoology, Cambridge holds over 1.2 million specimens. These include specimens collected from the early 19th century to the present day. Most specimens remain undocumented and unavailable for analysis. However, they contain data that are critical to understanding long-term species and community responses to anthropogenic change, and vital to evaluating whether short-term declines are representative of longer-term trends<a href=\"https://anil.recoil.org/#fn-4\">[4]</a> <a href=\"https://anil.recoil.org/#fn-5\">[5]</a>. As such, unlocking these insect collections is of paramount importance, and the large-scale nature of these collections necessitates the development of an efficient and effective digitisation process.</p>\n<p>The 3D digitisation of specimens using current methods is either highly time-intensive or expensive, rendering it impossible to achieve across the collection in a reasonable time-frame. Yet, 3D models of specimens have huge potential for investigating species morphological responses to anthropogenic changes over time and identification of trade-offs in morphological responses within a 3D morphospace.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/umzc-1.webp\" title=\"\">\n</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/umzc-2.webp\" title=\"\">\n</p>\n<p>This project aims to develop a reproducible, low-cost method of digitising specimens using commodity software to achieve large-scale efficient 3D digitisation of specimens. 
The student will experiment with and develop the methods on the UMZC UK macromoth collection, and would gain experience in insect specimen handling and digitisation, as well as developing knowledge on the role of museum specimens in understanding the biodiversity crisis and tackling global challenges.</p>\n<p>Some early experiments we have done with high-quality mobile phones such as an iPhone 16 show that even off-the-shelf software (such as <a href=\"https://poly.cam/\">Polycam</a>) using both the Lidar and just normal <a href=\"https://en.wikipedia.org/wiki/Photogrammetry\">photogrammetry</a> modes is sufficient to do a remarkably high-fidelity 3D reconstruction of moths. The project, therefore, could either go in the direction of an app that uses <a href=\"https://developer.apple.com/augmented-reality/arkit/\">ARKit</a> to facilitate an interactive scan, or towards building a low-cost rig within which the insect mounting board could be placed with the camera going around "on rails". Challenges will include developing the photogrammetry software pipeline, and also matters of focus, to ensure that the critical areas (such as the antennae) are measured with appropriate accuracy.</p>\n<p>\n<img alt=\"The interested student should not have a fear of insects\" src=\"https://anil.recoil.org/images/umzc-4.webp\" title=\"The interested student should not have a fear of insects\">\nThe interested student should not have a fear of insects</p>\n\n<ol>\n<li>\n<p>Wagner et al. (2020) <a href=\"https://www.pnas.org/doi/abs/10.1073/pnas.2023989118\">Insect decline in the Anthropocene: Death by a thousand cuts</a>. PNAS, 118, e2023989118. DOI: 10.1073/pnas.2023989118</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>van Klink et al. (2020) <a href=\"https://www.science.org/doi/10.1126/science.aax9931\">Meta-analysis reveals declines in terrestrial but increases in freshwater insect abundances</a>. Science 368, 417-420. 
DOI:10.1126/science.aax9931</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Didham et al. (2020) <a href=\"https://resjournals.onlinelibrary.wiley.com/doi/10.1111/icad.12408\">Interpreting insect declines: seven challenges and a way forward</a>. Insect Conservation and Diversity 13, 102-114. DOI: 10.1111/icad.12408</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Kharouba et al. (2018) <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2017.0405\">Using insect natural history collections to study global change impacts: challenges and opportunities</a>. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 374, 20170405. DOI: 10.1098/rstb.2017.0405</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-4\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Meineke et al. (2018) <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2017.0386\">Biological collections for understanding biodiversity in the Anthropocene</a>. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 374, 20170386. DOI: 10.1098/rstb.2017.0386</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-5\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_dispersed-compartments.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/dispersed-compartments\">Secure Programming with Dispersed Compartments</a> <span>/ May 2022</span></h2><div><p>This is an idea proposed in 2022 as a Cambridge Computer Science PhD topic, and has been <span>completed</span> by <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a>.</p>\n<p>This PhD project proposes novel approaches and mechanisms for application\ncompartmentalization and isolation to reduce their ever-growing attack\nsurfaces.</p>\n<p>Our approach is motivated by the key observation that while hardware\nvendors compete to provide security features (notably memory safety and\nprivilege separation) existing systems software like commodity OSs fail to\nutilize such features to improve application security and privacy properly.</p>\n<p>We propose a novel principled approach to privilege separation and isolation,\nenabling application security to be designed and enforced <em>within</em> and\n<em>across</em> different isolation boundaries, and yet remain flexible in the face of\ndiverse threats and changing hardware requirements.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/dispersed-compartments\">186 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#secure-programming-with-dispersed-compartments\"></a>Secure Programming with Dispersed Compartments</h1>\n<p>This is an idea proposed in 2022 as a Cambridge Computer Science PhD topic, and has been <span>completed</span> by <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a>.</p>\n<p>This PhD project proposes novel approaches and mechanisms for application\ncompartmentalization and isolation to reduce their ever-growing attack\nsurfaces.</p>\n<p>Our approach is motivated by the key observation that while hardware\nvendors compete to provide security features (notably memory safety and\nprivilege separation) existing systems software like commodity OSs fail to\nutilize such features to improve application 
security and privacy properly.</p>\n<p>We propose a novel principled approach to privilege separation and isolation,\nenabling application security to be designed and enforced <em>within</em> and\n<em>across</em> different isolation boundaries, and yet remain flexible in the face of\ndiverse threats and changing hardware requirements.</p>\n<p>Specifically, we design <em>dispersed compartments</em> as a building block for\napplications that can encapsulate arbitrary isolation boundaries across\nprivilege levels. Dispersed compartments provide a unified model for extensible\nand auditable compartmentalization. To enable such system-wide privilege\nseparation, we introduce two key concepts: first, <em>dispersed monitoring</em> to check\nextensible security policies; second, <em>dispersed enforcement</em> to enforce\nisolation and security policies across various privilege boundaries while\nreducing the trusted computing base (TCB) through deprivileging the host kernel\non-demand.</p>\n<p>See <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a>'s completed <a href=\"https://www.repository.cam.ac.uk/items/15b038fd-2b81-4608-a033-fc5a39de3bf2\">PhD thesis</a>\non the subject for more details!</p>",
+18
avsm/ideas_distributed-tasks-irmin.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/distributed-tasks-irmin\">Distributed Task Scheduling Framework over Irmin</a> <span>/ Jan 2019</span></h2><div><p>This is an idea proposed in 2019 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Mohammed Daudali</span>.</p>\n<p>Distributed computation and task scheduling frameworks can be decentralised with minimal cost to performance. Furthermore, this decentralisation can provide a significant reduction in the trusted computing base and complexity of the system, affording end consumers a greater level of confidence in the integrity of the results. Moreover, carefully designed persistent and transient data structures can augment this confidence by providing strong isolation guarantees in a multi-tenant system, whilst retaining full transparency over the dynamic data flow graph. This can all be achieved with an API that interfaces directly with conventional developer tools, enabling end users to easily verify that the computation directly aligns with their expectations. Detailed metadata can ensure a fair and transparent pricing structure for both service providers and consumers by carefully tracking the resource usage. Together, this allows open-source communities to remain completely transparent whilst providing non-developer end users a simpler and more accessible downloadable package that can be independently verified.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/distributed-tasks-irmin\">374 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#distributed-task-scheduling-framework-over-irmin\"></a>Distributed Task Scheduling Framework over Irmin</h1>\n<p>This is an idea proposed in 2019 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Mohammed Daudali</span>.</p>\n<p>Distributed computation and task scheduling frameworks can be decentralised with minimal cost to performance. 
Furthermore, this decentralisation can provide a significant reduction in the trusted computing base and complexity of the system, affording end consumers a greater level of confidence in the integrity of the results. Moreover, carefully designed persistent and transient data structures can augment this confidence by providing strong isolation guarantees in a multi-tenant system, whilst retaining full transparency over the dynamic data flow graph. This can all be achieved with an API that interfaces directly with conventional developer tools, enabling end users to easily verify that the computation directly aligns with their expectations. Detailed metadata can ensure a fair and transparent pricing structure for both service providers and consumers by carefully tracking the resource usage. Together, this allows open-source communities to remain completely transparent whilst providing non-developer end users a simpler and more accessible downloadable package that can be independently verified.</p>\n<p>This project will investigate building a composable task scheduler over <a href=\"https://github.com/mirage/irmin\">Irmin</a>. The core of this project started with a single server model, in which a large number of workers can independently clone and interact with a persistent job queue CRDT. Crucially, each worker schedules tasks using only local knowledge, giving a high probability that at least two workers are working on the same task. This has a twofold benefit: first, completed work can be independently verified by a number of different workers; second, work in progress by stragglers can be picked up by other workers, which can result in a lower time to completion. By independently sampling and verifying work, we remove the need for implicitly trusting individual workers. Adversaries must now compromise all worker nodes to have the required effect - compromising N - 1 workers results in a non-zero probability of the attack being detected. 
Given a heterogeneous set of worker machines, all under the control of different and independent entities, this attack becomes significantly harder. The project will investigate suitable sampling schedules for calculating the Pareto frontier of over-committing work versus cluster throughput.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2015-jfla-irmin\">Mergeable persistent data structures</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li>The dissertation writeup is in a private <a href=\"https://github.com/mdaudali/dissertation_writeup\">GitHub repository</a> and the Irmin implementation code is also in a <a href=\"https://github.com/mdaudali/Dissertation\">private repository</a>. Please contact the author directly for access.</li>\n</ul>",
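The redundancy claim above - that independent local scheduling makes it likely some task is chosen by at least two workers - can be checked with a quick simulation. This is an illustrative Monte Carlo sketch, not the project's Irmin-based code; the worker and task counts are invented.

```ocaml
(* Estimate how often at least one task is picked by two or more workers
   when each worker samples uniformly using only local knowledge.
   With 4 workers over 10 tasks the exact collision probability is
   1 - (10*9*8*7)/10^4 = 0.496, so the estimate should land near 0.5. *)
let collision_fraction ~workers ~tasks ~trials =
  let hits = ref 0 in
  for _ = 1 to trials do
    let count = Array.make tasks 0 in
    for _ = 1 to workers do
      let t = Random.int tasks in
      count.(t) <- count.(t) + 1
    done;
    (* a task chosen twice is a task whose result can be cross-checked *)
    if Array.exists (fun c -> c >= 2) count then incr hits
  done;
  float_of_int !hits /. float_of_int trials

let () =
  Random.self_init ();
  let f = collision_fraction ~workers:4 ~tasks:10 ~trials:10_000 in
  Printf.printf "estimated double-check fraction: %.2f\n" f;
  assert (f > 0.4 && f < 0.6)
```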
+18
avsm/ideas_dsl-for-decentralised-id.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/dsl-for-decentralised-id\">A DSL for decentralised identity in OCaml</a> <span>/ Aug 2022</span></h2><div><p>This is an idea proposed in 2022 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://www.linkedin.com/in/michal-mgeladze-arciuch\">Micha\u0142 Mge\u0142adze-Arciuch</a>. It was co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>There are currently multiple identity providers without direct incentives to\ncooperate. This leads to many redundant implementations of the identity\nhandling logic, many of which are not immediately compatible with each other,\nleading to additional increases in friction when eventual agreement needs to be\nreached to perform user actions. Furthermore, from the perspective of the user\nof the identity service, they need to keep track of identity documents from\nmultiple sources, which leads to more security attack surface.</p>\n<p>Solving the problem of partial identity proofs allows for many possible\nopportunities. For example, consider a simple May Ball ticketing system in\nwhich every college member gets a discount to their College, but without\nrevealing their exact identity. Or imagine an e-commerce system, in which every\nuser could prove their age to be over a given threshold, without revealing any\nadditional information to the retailer. In the example of a carbon credits\nproject, we would be able to allow entities associated with any carbon\noffsetting project to prove their association, protecting the identity of\nwhistleblowers.</p>\n<p>This project will build a system of Decentralised Digital Identifiers, which\ncan be used to prove a subset of the information associated with the user\u2019s\nidentity using cryptographic proofs. Every participant in\nthe system will have a public-private key pair associated with them. 
Then any\nidentity provider P could provide an identity document for Alice, who has a\npublic key A, by cryptographically signing a message containing both A, to\npoint to the receiver of this document, and the document itself. Then, whenever\nAlice would want to authenticate herself to a service provider S, she could do\nso simply by sending the message she received from P to S. Then the service\nprovider can verify that P, indeed supplied Alice with the given identity\ndocument.</p>\n<p>This Part II project was successfully completed but not available online; please\ncontact the author for a copy of it. <a href=\"https://www.linkedin.com/in/michal-mgeladze-arciuch\">Micha\u0142 Mge\u0142adze-Arciuch</a> has subsequently founded <a href=\"https://www.czechtradeoffices.com/se/news/czech-startup-yoneda-labs-raises-over-$100-million-to-revolutionize-chemical-reactions-with-ai\">Yoneda\nLabs to revolutionize chemical\nreactions</a>!</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#a-dsl-for-decentralised-identity-in-ocaml\"></a>A DSL for decentralised identity in OCaml</h1>\n<p>This is an idea proposed in 2022 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://www.linkedin.com/in/michal-mgeladze-arciuch\">Micha\u0142 Mge\u0142adze-Arciuch</a>. It was co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>There are currently multiple identity providers without direct incentives to\ncooperate. This leads to many redundant implementations of the identity\nhandling logic, many of which are not immediately compatible with each other,\nleading to additional increases in friction when eventual agreement needs to be\nreached to perform user actions. 
Furthermore, from the perspective of the user\nof the identity service, they need to keep track of identity documents from\nmultiple sources, which leads to a larger security attack surface.</p>\n<p>Solving the problem of partial identity proofs allows for many possible\nopportunities. For example, consider a simple May Ball ticketing system in\nwhich every college member gets a discount to their College, but without\nrevealing their exact identity. Or imagine an e-commerce system, in which every\nuser could prove their age to be over a given threshold, without revealing any\nadditional information to the retailer. In the example of a carbon credits\nproject, we would be able to allow entities associated with any carbon\noffsetting project to prove their association, protecting the identity of\nwhistleblowers.</p>\n<p>This project will build a system of Decentralised Digital Identifiers, which\ncan be used to prove a subset of the information associated with the user\u2019s\nidentity using cryptographic proofs. Every participant in\nthe system will have a public-private key pair associated with them. Then any\nidentity provider P could provide an identity document for Alice, who has a\npublic key A, by cryptographically signing a message containing both A, to\npoint to the receiver of this document, and the document itself. Then, whenever\nAlice would want to authenticate herself to a service provider S, she could do\nso simply by sending the message she received from P to S. The service\nprovider can then verify that P indeed supplied Alice with the given identity\ndocument.</p>\n<p>This Part II project was successfully completed but not available online; please\ncontact the author for a copy of it. 
<a href=\"https://www.linkedin.com/in/michal-mgeladze-arciuch\">Micha\u0142 Mge\u0142adze-Arciuch</a> has subsequently founded <a href=\"https://www.czechtradeoffices.com/se/news/czech-startup-yoneda-labs-raises-over-$100-million-to-revolutionize-chemical-reactions-with-ai\">Yoneda\nLabs to revolutionize chemical\nreactions</a>!</p>",
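The issue-and-verify flow described above can be illustrated in a few lines of OCaml. A keyed hash from the stdlib `Digest` module stands in for a real public-key signature (a real system would use e.g. Ed25519), and the key and document strings are invented for the example.

```ocaml
(* Toy stand-in for a signature: a keyed hash over the message. In a real
   deployment, [sign] uses P's private key and [verify] P's public key. *)
let sign key msg = Digest.to_hex (Digest.string (key ^ "|" ^ msg))
let verify key msg tag = String.equal (sign key msg) tag

let () =
  let provider_key = "provider-secret" in        (* identity provider P *)
  let alice_pub = "A" in                         (* Alice's public key  *)
  let doc = "college-member:true" in
  (* P issues a document cryptographically bound to Alice's key A *)
  let credential = alice_pub ^ ";" ^ doc in
  let tag = sign provider_key credential in
  (* Alice forwards (credential, tag) to service S, which checks it *)
  assert (verify provider_key credential tag);
  (* a tampered document no longer verifies *)
  assert (not (verify provider_key (alice_pub ^ ";college-member:false") tag));
  print_endline "verified"
```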
+18
avsm/ideas_ecoregion-maps.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/ecoregion-maps\">Using graph theory to define data-driven ecoregion and bioregion maps</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\">Daniele Baisero</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Maps of biologically driven regionalization (e.g. ecoregions and bioregions)\nare useful in conservation science and policy as they help identify areas with\nsimilar ecological characteristics, allowing for more targeted, efficient, and\necosystem-specific management strategies. These regions provide a framework for\nprioritizing conservation efforts, monitoring biodiversity, and aligning\npolicies across political boundaries based on ecological realities rather than\narbitrary lines. However these products have historically been "hand drawn" by\nexperts and are mostly based on plant distribution data only.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/ecoregion-maps\">270 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#using-graph-theory-to-define-data-driven-ecoregion-and-bioregion-maps\"></a>Using graph theory to define data-driven ecoregion and bioregion maps</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\">Daniele Baisero</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Maps of biologically driven regionalization (e.g. 
ecoregions and bioregions)\nare useful in conservation science and policy as they help identify areas with\nsimilar ecological characteristics, allowing for more targeted, efficient, and\necosystem-specific management strategies. These regions provide a framework for\nprioritizing conservation efforts, monitoring biodiversity, and aligning\npolicies across political boundaries based on ecological realities rather than\narbitrary lines. However, these products have historically been "hand drawn" by\nexperts and are mostly based on plant distribution data only.</p>\n<p>Graph theory offers numerous tools to analyse and highlight the relation\nbetween data points and has been used to study spatially explicit datasets.\nHowever, these tools have never been applied to global-scale systematic species\ndistribution datasets. The <a href=\"https://www.keybiodiversityareas.org/\">Key Biodiversity Areas</a> (KBA) Secretariat has\ncompiled such a comprehensive dataset that includes Range and Area Of Habitat\n(AOH) information for all species currently mapped on the <a href=\"https://www.iucnredlist.org/\">IUCN Red List</a> (92,255\nspecies; each species modelled for both its breeding and non-breeding\ndistribution), alongside ~85 million hexagonal 6 km2 cells that cover the entire\nglobe. The entire dataset comprises 32 billion spatially explicit data\nrecords.</p>\n<h2><a href=\"https://anil.recoil.org/#the-summer-project\"></a>The summer project</h2>\n<p>We aim to use clustering analysis for community detection on a combination of species\nco-occurrence and cell proximity, to create a data-driven spatial\nregionalization of the world based on all spatially described species. 
The\nproject will involve compiling all this data into a graph database, identifying\nsuitable clustering approaches for community detection, and analysing results\nto identify informative clustering thresholds.</p>\n<p>This is a good summer project for a computer science student who wants to\nget more familiar with graph databases, data science and environmental/biodiversity\napproaches.</p>",
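As a toy illustration of threshold-driven regionalization, the sketch below treats cells as graph nodes, joins two cells whenever a made-up co-occurrence weight exceeds a threshold, and counts connected components with a union-find. A real analysis over the KBA dataset would use a proper community-detection algorithm (e.g. Louvain) rather than plain connectivity.

```ocaml
(* Union-find without path compression: good enough for a sketch. *)
let find parent i =
  let rec go i = if parent.(i) = i then i else go parent.(i) in
  go i

let union parent i j =
  let ri = find parent i and rj = find parent j in
  if ri <> rj then parent.(ri) <- rj

(* Number of connected components once edges below [threshold] are dropped. *)
let components n edges threshold =
  let parent = Array.init n (fun i -> i) in
  List.iter (fun (i, j, w) -> if w >= threshold then union parent i j) edges;
  Array.init n (find parent)
  |> Array.to_list |> List.sort_uniq compare |> List.length

let () =
  (* 5 cells; weights are invented fractions of shared species *)
  let edges = [ (0, 1, 0.9); (1, 2, 0.8); (2, 3, 0.1); (3, 4, 0.7) ] in
  assert (components 5 edges 0.5 = 2);   (* {0,1,2} and {3,4} *)
  assert (components 5 edges 0.05 = 1);  (* everything joins up *)
  print_endline "ok"
```

Sweeping the threshold and watching the component count change is a miniature version of the "informative clustering thresholds" question posed above.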
+18
avsm/ideas_effect-parallel-strategies.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/effect-parallel-strategies\">Parallel traversal effect handlers for OCaml</a> <span>/ Sep 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"mailto:sb2634@cam.ac.uk\">Sky Batchelor</a>. It was co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Most existing uses of effect handlers perform synchronous execution of handled\neffects. Xie <em>et al</em> proposed a <code>traverse</code> handler for parallelisation of\nindependent effectful computations whose effect handlers are outside the\nparallel part of the program. The paper [^1] gives a sample implementation as a\nHaskell library with an associated \u03bbp calculus that formalises the parallel\nhandlers.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/effect-parallel-strategies\">199 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#parallel-traversal-effect-handlers-for-ocaml\"></a>Parallel traversal effect handlers for OCaml</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"mailto:sb2634@cam.ac.uk\">Sky Batchelor</a>. It was co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Most existing uses of effect handlers perform synchronous execution of handled\neffects. Xie <em>et al</em> proposed a <code>traverse</code> handler for parallelisation of\nindependent effectful computations whose effect handlers are outside the\nparallel part of the program. 
The paper <a href=\"https://anil.recoil.org/#fn-1\">[1]</a> gives a sample implementation as a\nHaskell library with an associated \u03bbp calculus that formalises the parallel\nhandlers.</p>\n<p>This project aims to:</p>\n<ul>\n<li>implement the <code>traverse</code> handler in OCaml 5, using single-shot handlers <a href=\"https://anil.recoil.org/#fn-2\">[2]</a></li>\n<li>identify a selection of parallel-friendly data structures that might benefit from such parallel traversals</li>\n<li>investigate handlers for alternative traversal strategies beyond the folds supported by <code>traverse</code></li>\n<li>evaluate the performance of such parallel handlers, for instance using Eio's <code>Domain_pool</code> <a href=\"https://anil.recoil.org/#fn-3\">[3]</a> on a many-core machine (ranging from 8--128 cores)</li>\n</ul>\n<p><a href=\"mailto:sb2634@cam.ac.uk\">Sky Batchelor</a> successfully built a traverse handler for their Part II project and submitted it in June 2025! A copy of the dissertation is available on request, and we're working on getting the dissertation and code online.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n\n<ol>\n<li>\n<p><a href=\"https://dl.acm.org/doi/abs/10.1145/3674651\">Parallel Algebraic Effect Handlers</a> describes the <code>traverse</code> effect</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p><a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff\">Retrofitting effect handlers onto OCaml</a>, PLDI 2021 describes how the effect system in OCaml works.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p><a href=\"https://github.com/ocaml-multicore/eio\">EIO</a> is a high-performance direct-style IO library we have been developing for OCaml.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
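As background to the machinery involved, here is a minimal OCaml 5 sketch of two independent effectful computations running in parallel domains, each under its own deep handler. This is not the paper's traverse handler, only an illustration of the `Effect.Deep` API such a handler would build on; the `Ask` effect is invented for the example.

```ocaml
open Effect
open Effect.Deep

(* An illustrative effect: ask the ambient handler for an int. *)
type _ Effect.t += Ask : int Effect.t

(* Run [f] with a deep handler that answers every [Ask] with [v]. *)
let with_ask v f =
  match_with f ()
    { retc = (fun x -> x);
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Ask -> Some (fun (k : (a, _) continuation) -> continue k v)
        | _ -> None) }

let () =
  (* two independent effectful computations, each in its own domain,
     with the handler installed outside the parallel code *)
  let d1 = Domain.spawn (fun () -> with_ask 10 (fun () -> perform Ask + 1)) in
  let d2 = Domain.spawn (fun () -> with_ask 20 (fun () -> perform Ask * 2)) in
  assert (Domain.join d1 = 11);
  assert (Domain.join d2 = 40);
  print_endline "ok"
```

The `traverse` handler goes further by letting a *single* handler outside the parallel region serve many forked computations, which is what the project sets out to reproduce.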
+18
avsm/ideas_effective-geospatial-code.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/effective-geospatial-code\">Effective geospatial code in OCaml</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:gp528@cam.ac.uk\">George Pool</a>. It is co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Geospatial data processing is a critical component of many scientific and engineering workflows, from environmental monitoring to urban planning. However, writing geospatial code that scales to multiple cores and makes best use of available memory can be challenging due to the scale of the data involved. To deal with this, we have been developing some domain-specific tools to improve the state of affairs.</p>\n<p><a href=\"https://github.com/quantifyearth/yirgacheffe\">Yirgacheffe</a> is a wrapper to the GDAL library that provides high-level Python APIs that take care of figuring out if datasets overlap, and if vector layers need to be rasterised, and manages memory efficiently for large layers. There is only one problem: we would like to write similar code to this, but in a high level functional language rather than an imperative one!</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/effective-geospatial-code\">299 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#effective-geospatial-code-in-ocaml\"></a>Effective geospatial code in OCaml</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:gp528@cam.ac.uk\">George Pool</a>. 
It is co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Geospatial data processing is a critical component of many scientific and engineering workflows, from environmental monitoring to urban planning. However, writing geospatial code that scales to multiple cores and makes best use of available memory can be challenging due to the scale of the data involved. To deal with this, we have been developing some domain-specific tools to improve the state of affairs.</p>\n<p><a href=\"https://github.com/quantifyearth/yirgacheffe\">Yirgacheffe</a> is a wrapper to the GDAL library that provides high-level Python APIs that take care of figuring out if datasets overlap, and if vector layers need to be rasterised, and manages memory efficiently for large layers. There is only one problem: we would like to write similar code to this, but in a high level functional language rather than an imperative one!</p>\n<p>OCaml has recently gained support for multicore parallelism, and is also one of the first mainstream languages with support for effects. This project will involve writing a library in OCaml that provides similar functionality to Yirgacheffe, but with a focus on high-level functional programming. This will involve interfacing with the GDAL library, and also writing some high-level abstractions for geospatial data processing. As an alternative to depending on GDAL, you may also choose to contribute to the emerging <a href=\"https://github.com/geocaml\">GeoCaml</a> ecosystem which <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> created.</p>\n<p>A successful project will demonstrate a direct-style, readable interface to geospatial code, with the scheduling of parallel operations and memory management delegated to a separate library written in OCaml which can be customised to the local computing environment (e.g. 
a large local multicore machine, or a cloud computing cluster).</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">Planetary computing for data-driven environmental policy-making</a> covers the data processing pipelines we need to integrate into.</li>\n<li><a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff\">Retrofitting effect handlers onto OCaml</a>, PLDI 2021 describes how the effect system in OCaml works.</li>\n<li><a href=\"https://github.com/ocaml-multicore/eio\">EIO</a> is the high-performance direct-style IO library we have been developing for OCaml.</li>\n</ul>",
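To make the goal concrete, here is a small sketch of the kind of direct-style layer API envisaged: a layer is an extent plus a function from pixel coordinates to values, pointwise combination intersects extents, and no data is read until a fold forces it. This is an invented illustration, not Yirgacheffe's actual API.

```ocaml
(* A "layer" is its extent plus a lazy per-pixel read function. *)
type layer = {
  width : int;
  height : int;
  read : int -> int -> float;   (* x y -> value, evaluated on demand *)
}

(* Pointwise combination over the intersection of the two extents. *)
let map2 f a b =
  let width = min a.width b.width and height = min a.height b.height in
  { width; height; read = (fun x y -> f (a.read x y) (b.read x y)) }

(* A fold that actually forces the reads. *)
let sum l =
  let t = ref 0.0 in
  for y = 0 to l.height - 1 do
    for x = 0 to l.width - 1 do t := !t +. l.read x y done
  done;
  !t

let () =
  let ones = { width = 4; height = 4; read = (fun _ _ -> 1.0) } in
  let twos = { width = 3; height = 4; read = (fun _ _ -> 2.0) } in
  (* the intersection is 3 x 4 = 12 pixels, each 1.0 *. 2.0 *)
  assert (sum (map2 ( *. ) ones twos) = 24.0);
  print_endline "ok"
```

Because `read` is just a function, a scheduling library could partition the extent into chunks and fold them across domains without changing user code.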
+18
avsm/ideas_effective-specification-languages.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/effective-specification-languages\">An imperative, pure and effective specification language</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:ms2922@cam.ac.uk\">Max Smith</a>. It is co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Formal specification languages are conventionally rather functional looking,\nand not hugely amenable to iterative development. In contrast, real world\nspecifications for geospatial algorithms tend to developed with "holes" in the\nlogic which is then filled in by a domain expert as they explore the datasets\nthrough small pieces of exploratory code and visualisations.</p>\n<p>This project seeks to investigate the design of a specification language that\n<em>looks and feels</em> like Python, but that supports typed holes and the robust\nsemantic foundations of a typed functional language behind the hood. The\nlangage would have a Python syntax, with the familiar imperative core, but\ntranslate it into <a href=\"https://hazel.org\">Hazel</a> code behind the scenes.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/effective-specification-languages\">217 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#an-imperative-pure-and-effective-specification-language\"></a>An imperative, pure and effective specification language</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:ms2922@cam.ac.uk\">Max Smith</a>. It is co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Formal specification languages are conventionally rather functional looking,\nand not hugely amenable to iterative development. 
In contrast, real-world\nspecifications for geospatial algorithms tend to be developed with "holes" in the\nlogic, which are then filled in by a domain expert as they explore the datasets\nthrough small pieces of exploratory code and visualisations.</p>\n<p>This project seeks to investigate the design of a specification language that\n<em>looks and feels</em> like Python, but that supports typed holes and the robust\nsemantic foundations of a typed functional language under the hood. The\nlanguage would have a Python syntax, with the familiar imperative core, but\ntranslate it into <a href=\"https://hazel.org\">Hazel</a> code behind the scenes.</p>\n<p>Another direction to investigate is translating the same code into OCaml 5,\nand using the new effect system to handle IO and mutability in the source language\ncode. This would allow for multiple interpretations of the program to execute\ndepending on the context:</p>\n<ul>\n<li>an interactive JavaScript-compiled (or wasm-compiled) tracing version that records variable updates</li>\n<li>a high performance version that batches and checkpoints variable updates and deploys parallel execution</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#background-reading\"></a>Background Reading</h2>\n<ul>\n<li><a href=\"https://hazel.org/papers/propl24.pdf\">Toward a Live, Rich, Composable, and Collaborative Planetary Compute Engine</a>, PROPL 2024.</li>\n<li><a href=\"https://patrick.sirref.org\">Patrick Ferris</a>'s first year PhD report (available on request to students interested in this idea).</li>\n<li><a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff\">Retrofitting effect handlers onto OCaml</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li><a href=\"https://hazel.org\">Hazel</a></li>\n</ul>",
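The typed-holes idea can be made concrete with a tiny expression language: a program containing a hole still builds and can be partially evaluated, with evaluation blocking at the hole until a domain expert fills it in. This sketch is plain OCaml, not the proposed Python-like surface syntax.

```ocaml
(* A tiny expression language with named holes. *)
type expr =
  | Int of int
  | Add of expr * expr
  | Hole of string   (* a gap to be filled in later *)

(* Partial evaluation: [None] means evaluation is blocked on a hole. *)
let rec eval = function
  | Int n -> Some n
  | Hole _ -> None
  | Add (a, b) ->
    (match eval a, eval b with
     | Some x, Some y -> Some (x + y)
     | _ -> None)

(* Fill every hole named [name] with the literal [v]. *)
let rec fill name v = function
  | Hole n when n = name -> Int v
  | Add (a, b) -> Add (fill name v a, fill name v b)
  | e -> e

let () =
  let prog = Add (Int 1, Hole "threshold") in
  assert (eval prog = None);                       (* blocked on the hole *)
  assert (eval (fill "threshold" 41 prog) = Some 42);
  print_endline "ok"
```

Hazel's contribution is doing this with full bidirectional typing so that even the hole has a known type; the sketch only shows the evaluation side.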
+18
avsm/ideas_effects-scheduling-ocaml-compiler.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">Effects based scheduling for the OCaml compiler pipeline</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a>. It is co-supervised with <a href=\"https://github.com/dra27\">David Allsopp</a>.</p>\n<p>In order to compile the OCaml program <code>foo.ml</code> containing:</p>\n<pre><code>Stdlib.print_endline "Hello, world"\n</code></pre>\n<p>the OCaml compilers only require the compiled <code>stdlib.cmi</code> interface to exist in order to determine the type of <code>Stdlib.print_endline</code>. This separate compilation technique allows modules of code to be compiled before the <em>code</em> they depend on has necessarily been compiled. When OCaml was first written, this technique was critical to reduce recompilation times. As CPU core counts increased through the late nineties and early 2000s, separate compilation also provided a parallelisation benefit, where modules which did not depend on each other could be compiled at the same time as each other benefitting <em>compilation</em> as well as <em>recompilation</em>.</p>\n<p>For OCaml, as in many programming languages, the compilation of large code bases is handled by a separate <em>build system</em> (for example, <code>dune</code>, <code>make</code> or <code>ocamlbuild</code>) with the <em>compiler driver</em> (<code>ocamlc</code> or <code>ocamlopt</code>) being invoked by that build system as required. 
In this project, we'll investigate how to get the OCaml compiler itself to be responsible for exploiting available parallelism.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">697 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#effects-based-scheduling-for-the-ocaml-compiler-pipeline\"></a>Effects based scheduling for the OCaml compiler pipeline</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a>. It is co-supervised with <a href=\"https://github.com/dra27\">David Allsopp</a>.</p>\n<p>In order to compile the OCaml program <code>foo.ml</code> containing:</p>\n<pre><code>Stdlib.print_endline "Hello, world"\n</code></pre>\n<p>the OCaml compilers only require the compiled <code>stdlib.cmi</code> interface to exist in order to determine the type of <code>Stdlib.print_endline</code>. This separate compilation technique allows modules of code to be compiled before the <em>code</em> they depend on has necessarily been compiled. When OCaml was first written, this technique was critical to reduce recompilation times. As CPU core counts increased through the late nineties and early 2000s, separate compilation also provided a parallelisation benefit, where modules which did not depend on each other could be compiled at the same time as each other benefitting <em>compilation</em> as well as <em>recompilation</em>.</p>\n<p>For OCaml, as in many programming languages, the compilation of large code bases is handled by a separate <em>build system</em> (for example, <code>dune</code>, <code>make</code> or <code>ocamlbuild</code>) with the <em>compiler driver</em> (<code>ocamlc</code> or <code>ocamlopt</code>) being invoked by that build system as required. 
In this project, we'll investigate how to get the OCaml compiler itself to be responsible for exploiting available parallelism.</p>\n<p>Some previous work (parts of which are available on GitHub<a href=\"https://anil.recoil.org/#fn-1\">[1]</a>) showed the benefits of sharing the typing information known\nto the compiler between each invocation. The hypothesis was that during a\n<em>sequential</em> computation, a considerable amount of time is spent by the\ncompiler searching for and reloading typing information, as well as on the\noverheads of launching thousands of copies of the compiler in a given build.</p>\n<p>Our test compiler with an adapted version of Dune showed as much as a halving\nof compilation time in <em>sequential</em> builds. However, in <em>parallel</em> builds, the\nresults were not as impressive - although the many invocations of the compiler\nrepeat the same loading operations, much of this cost is (quite predictably)\nmasked by performing the work in parallel.</p>\n<p>The previous investigation was carried out on OCaml 4.07. Although it shared\nthe typing information between "invocations" of the compiler, the compiler\npipeline itself was unaltered - a file only started to be processed when all of\nits dependencies were ready. Furthermore, it remained the responsibility of a\nbuild system to provide this dependency ordering.</p>\n<p>Fast forward to the present day, and we have OCaml 5.x, with both first-class\nsupport for <a href=\"https://anil.recoil.org/papers/2020-icfp-retropar\">parallelism</a> and <a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff\">algebraic effects</a>. Domains provide an obvious ability for a single\ncompiler process to compile several files simultaneously. Effects should allow\nus to break the pipeline into stages, suspending the compilation whenever new\ntype information is required by performing an effect. 
Using this model, it\nshould be possible to start with the entry module for a program and allow the\ntype checker itself to discover the dependency graph, with many files being\n<em>progressively</em> type-checked in\nparallel.</p>\n<p>The hypothesis is that this will be both faster and considerably simpler:\nthe "scheduler" required for handling the effects should be a far\nsimpler program than a full-blown separate build system. Key challenges in this\nwork:</p>\n<ul>\n<li>The compiler library functions are not parallel-safe. It will be necessary to\nadapt the compiler either to work around or eliminate its global mutable\nstate. This was necessary in the OCaml 4.07 work as well.</li>\n<li>The compiler becomes a much longer-lived process, and the garbage collector\nbecomes more relevant. The OCaml 4.07 version required "ancient heaps" to be\nused to keep the major collector under control - otherwise a significant amount\nof time is spent by the runtime marking major-heap data which will never be\ncollected. This technique will need revising for OCaml 5, potentially with a\ndirect alteration to the runtime to support stop-the-world promotion of items\nfrom the major heap to the ancient heap.</li>\n<li>It will not be possible to achieve an upstreamable change to OCaml during a\nproject of this length, but given that the comparison will be against a real\nbuild system operating with the same level of parallelism, it should be\npossible to perform a wide range of measurements building existing OCaml\nprojects.</li>\n<li>There's lots of potential for additional exploration, particularly in\ndispatching multiple build targets to the compiler (i.e. 
building multiple\nlibraries and executables in the one invocation) and in reusing previous\nbuild graph computations to inform scheduling decisions.</li>\n</ul>\n\n<ol>\n<li>\n<p>See <a href=\"https://github.com/dra27/ocaml/commits/nandor-dune-work/\">dra27/ocaml#nandor-dune-work</a>, <a href=\"https://github.com/dra27/dune/commits/nandor-shmap\">dra27/dune#nandor-shmap</a>, and <a href=\"https://github.com/nandor/offheap\">nandor/offheap</a>.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_embedded-whisper.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/embedded-whisper\">Low-power audio transcription with Whisper</a> <span>/ Jun 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>. It is co-supervised with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a>.</p>\n<p>The rise of batteryless energy-harvesting platforms could enable\nultra-low-power, long-term, maintenance-free deployments of sensors.</p>\n<p>This project explores the deployment of the OpenAI Whisper audio transcription\nmodel onto <a href=\"https://github.com/ggml-org/whisper.cpp/discussions/166\">embedded devices</a>, starting with the\nRaspberry Pi and moving on to smaller devices.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#low-power-audio-transcription-with-whisper\"></a>Low-power audio transcription with Whisper</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>. It is co-supervised with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a>.</p>\n<p>The rise of batteryless energy-harvesting platforms could enable\nultra-low-power, long-term, maintenance-free deployments of sensors.</p>\n<p>This project explores the deployment of the OpenAI Whisper audio transcription\nmodel onto <a href=\"https://github.com/ggml-org/whisper.cpp/discussions/166\">embedded devices</a>, starting with the\nRaspberry Pi and moving on to smaller devices.</p>",
+18
avsm/ideas_evaluating-conservation-copilot.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/evaluating-conservation-copilot\">Evaluating LLMs for providing evidence-based information on conservation actions</a> <span>/ Jun 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a>. It is co-supervised with <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>We are building a <a href=\"https://anil.recoil.org/projects/ce\">Conservation Co-Pilot</a> to improve worldwide\nconservation action through evidence-driven insights. Biodiversity loss is one\nof the biggest threats to our planet and to tackle it, we must improve the\neffectiveness of conservation action, which currently falls short of its full\npotential. This is because conservationists typically find it hard to access\nlocally relevant evidence on what works to conserve biodiversity as research\nknowledge is not translated quickly or accessibly enough into policy and\npractice. We therefore need to accelerate the transfer of relevant, reliable\nevidence to decision-makers using more intuitive and interactive interfaces.</p>\n<p>This project will use the comprehensive <a href=\"https://conservationevidence.com\">Conservation\nEvidence</a> database (holding 8600 studies that\nhave quantitatively tested 3600 actions) to evaluate the ability of a Mixture\nof Agents (MOA) approach, and/or individual LLMs, to provide rigorous\nevidence-based answers to priority questions from real conservationists.</p>\n<p>This will extend our <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">previous work</a> that found that LLMs coupled\nwith a hybrid retrieval strategy can answer multiple choice conservation\nquestions as well as human experts. 
This will enable us to develop a\n"Conservation Co-Pilot" that can handle complex and nuanced questions from\ndifferent users.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#evaluating-llms-for-providing-evidence-based-information-on-conservation-actions\"></a>Evaluating LLMs for providing evidence-based information on conservation actions</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a>. It is co-supervised with <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>We are building a <a href=\"https://anil.recoil.org/projects/ce\">Conservation Co-Pilot</a> to improve worldwide\nconservation action through evidence-driven insights. Biodiversity loss is one\nof the biggest threats to our planet and to tackle it, we must improve the\neffectiveness of conservation action, which currently falls short of its full\npotential. This is because conservationists typically find it hard to access\nlocally relevant evidence on what works to conserve biodiversity as research\nknowledge is not translated quickly or accessibly enough into policy and\npractice. 
We therefore need to accelerate the transfer of relevant, reliable\nevidence to decision-makers using more intuitive and interactive interfaces.</p>\n<p>This project will use the comprehensive <a href=\"https://conservationevidence.com\">Conservation\nEvidence</a> database (holding 8600 studies that\nhave quantitatively tested 3600 actions) to evaluate the ability of a Mixture\nof Agents (MOA) approach, and/or individual LLMs, to provide rigorous\nevidence-based answers to priority questions from real conservationists.</p>\n<p>This will extend our <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">previous work</a> that found that LLMs coupled\nwith a hybrid retrieval strategy can answer multiple choice conservation\nquestions as well as human experts. This will enable us to develop a\n"Conservation Co-Pilot" that can handle complex and nuanced questions from\ndifferent users.</p>",
+18
avsm/ideas_food-provenance-fao.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/food-provenance-fao\">An access library for the world crop, food production and consumption datasets</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>.</p>\n<p>Agricultural habitat degradation is a leading threat to global biodiversity. To\nmake informed decisions, it's crucial to understand the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity impacts of various foods</a>, their origins, and potential mitigation strategies. Insights can\ndrive actions from national policies to individual dietary choices. Key factors\ninclude knowing where crops are grown, their yields, and food sourcing by\ncountry.</p>\n<p>The <a href=\"https://www.fao.org/faostat/en/#home\">FAOSTAT trade data</a> offers\ncomprehensive import and export records since 1986, but its raw form is\ncomplex, including double counting, hindering the link between production and\nconsumption.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/food-provenance-fao\">372 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#an-access-library-for-the-world-crop-food-production-and-consumption-datasets\"></a>An access library for the world crop, food production and consumption datasets</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>.</p>\n<p>Agricultural habitat degradation is a leading threat to global biodiversity. 
To\nmake informed decisions, it's crucial to understand the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity impacts of various foods</a>, their origins, and potential mitigation strategies. Insights can\ndrive actions from national policies to individual dietary choices. Key factors\ninclude knowing where crops are grown, their yields, and food sourcing by\ncountry.</p>\n<p>The <a href=\"https://www.fao.org/faostat/en/#home\">FAOSTAT trade data</a> offers\ncomprehensive import and export records since 1986, but its raw form is\ncomplex, including double counting, hindering the link between production and\nconsumption.</p>\n<p>While Kastner et al. proposed a method<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> to address this, it has only been\napplied to 2013 data<a href=\"https://anil.recoil.org/#fn-2\">[2]</a> so far. Creating a reproducible pipeline for\nprocessing FAO trade data across years is essential for assessing how global\ntrade changes affect biodiversity. For instance, how has Brexit impacted the\nUK's food sourcing and biodiversity? What are the repercussions of emerging\nproducers on ecosystems?</p>\n<h2><a href=\"https://anil.recoil.org/#the-summer-project\"></a>The summer project</h2>\n<p>There exists a Python <a href=\"https://pypi.org/project/faostat/\">faostat</a> library to\nact as an interface to the raw CSV. 
And in 2024, a bunch of food hackers released\n<a href=\"https://joss.theoj.org/papers/10.21105/joss.06305\">AgriFoodPy</a> <a href=\"https://anil.recoil.org/#fn-3\">[3]</a>, which is a package\nfor modelling food systems.</p>\n<p>In this project, we'd like to:</p>\n<ul>\n<li>port a bunch of R code to Python using faostat/agrifoodpy and verify the outputs are broadly the same</li>\n<li>determine strategies to incrementally update and reproduce FAO data on top of these libraries so we can do more frequent updates and tailoring</li>\n<li>apply it to the code backing the "<a href=\"https://anil.recoil.org/papers/2024-food-life\">Quantifying the impact of the food we eat on species extinctions</a>" paper</li>\n</ul>\n<p>This would be a good summer project for a student interested in getting to\ngrips with scientific computing, such as Python, Rscript, and dataframe\nlibraries. If the core is done early, then we can investigate visualisations\nas well. And of course, if you're interested in sustainability, this is a\n<a href=\"https://anil.recoil.org/notes/cambridge-green-blue\">great topic</a> to start on!</p>\n<p>See also:</p>\n<ul>\n<li>AgrifoodPy food calculator at <a href=\"https://agrifood-consultation.streamlit.app/\">https://agrifood-consultation.streamlit.app/</a></li>\n</ul>\n\n<ol>\n<li>\n<p>Kastner T, Kastner M, Nonhebel S (2011): <a href=\"https://doi.org/10.1016/j.ecolecon.2011.01.012\">Tracing distant environmental impacts of agricultural products from a consumer perspective</a>. Ecol Econ 70:1032\u20131040.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Schwarzmueller, F., Kastner, T. <a href=\"https://link.springer.com/article/10.1007/s11625-022-01138-7#Sec13\">Agricultural trade and its impacts on cropland use and the global loss of species habitat</a>. 
Sustain Sci 17, 2363\u20132377 (2022).</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Cordero, Juan P. et al. <a href=\"https://joss.theoj.org/papers/10.21105/joss.06305\">AgriFoodPy: a package for modelling food systems</a>. Journal of Open Source Software (2024).</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_frp-web-ocaml.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/frp-web-ocaml\">Functional Reactive Web Applications</a> <span>/ Jan 2010</span></h2><div><p>This is an idea proposed in 2010 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Henry Hughes</span>.</p>\n<p>There are a variety of programming languages which can be used to create\ndesktop applications, and each provides different tradeoffs. This could be\nanything from the runtime guarantees the programming language provides to rapid\ndevelopment and prototyping. It does not make much difference to the user which\nof these languages was used, as all they want to do is run their favourite\napplication reliably.</p>\n<p>When writing an application for the web, however, the programmer\nis forced to use a specific set of APIs that come under the umbrella\nterm AJAX (Asynchronous JavaScript and XML). AJAX involves writing client-side\ncode in JavaScript and performing asynchronous requests to a server. This\nprovides a more interactive environment than the classical web application\nmodel. The classical model uses the server to create the next web page on the\nfly and then reloads the current page with the new one. This is often less\ndesirable because loading a new page causes a break in the user\u2019s work flow.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/frp-web-ocaml\">268 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#functional-reactive-web-applications\"></a>Functional Reactive Web Applications</h1>\n<p>This is an idea proposed in 2010 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Henry Hughes</span>.</p>\n<p>There are a variety of programming languages which can be used to create\ndesktop applications, and each provides different tradeoffs. This could be\nanything from the runtime guarantees the programming language provides to rapid\ndevelopment and prototyping. 
It does not make much difference to the user which\nof these languages was used, as all they want to do is run their favourite\napplication reliably.</p>\n<p>When writing an application for the web, however, the programmer\nis forced to use a specific set of APIs that come under the umbrella\nterm AJAX (Asynchronous JavaScript and XML). AJAX involves writing client-side\ncode in JavaScript and performing asynchronous requests to a server. This\nprovides a more interactive environment than the classical web application\nmodel. The classical model uses the server to create the next web page on the\nfly and then reloads the current page with the new one. This is often less\ndesirable because loading a new page causes a break in the user\u2019s work flow.</p>\n<p>While JavaScript is a full-featured language, there are other programming\nlanguages which provide features for more robust coding. This project explores\nhow AJAX applications might be written using a paradigm known as\n<em>functional reactive programming</em>, implementing it in the OCaml language\nand compiling it to JavaScript via the <code>ocamljs</code> transpiler. The project uses\nthe <a href=\"https://github.com/jaked/froc\">froc</a> FRP library by Jake Donham.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li><a href=\"https://github.com/jaked/froc\">FROC</a></li>\n<li><a href=\"http://ambassadortothecomputers.blogspot.com/search/label/froc\">Discussion about FROC and reactive programming</a>, Jake Donham</li>\n<li><a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation PDF isn't available publicly but\nshould be in the Cambridge Computer Lab archives somewhere.\nThe source code is also archived but not publicly available.</p>",
+18
avsm/ideas_functional-diffs.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/functional-diffs\">Composable diffing for heterogenous file formats</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>When dealing with large scale geospatial data, we also have to deal with a variety of file formats, such as CSV, JSON, GeoJSON, or GeoTIFFs, etc. Each of these file formats has its own structure and semantics, and it is often necessary to compare and merge data across different file formats. The conventional solution with source code would be to use a tool such as Git to compare and merge data across different file formats. However, this approach is not always feasible, as it requires the data to be in a text-based format and the data to be structured in a way that can be compared line by line.</p>\n<p>This project explores the design of a composable diffing specification that can compare and merge data across heterogenous file formats. The project will involve designing a domain-specific language for specifying the diffing rules, and implementing a prototype tool that can compare and merge data across different file formats. Crucially, the tool should be composable, meaning that it should be possible to combine different diffing rules to compare and merge data across different file formats.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/functional-diffs\">309 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#composable-diffing-for-heterogenous-file-formats\"></a>Composable diffing for heterogenous file formats</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. 
It may be co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>When dealing with large scale geospatial data, we also have to deal with a variety of file formats, such as CSV, JSON, GeoJSON, or GeoTIFFs, etc. Each of these file formats has its own structure and semantics, and it is often necessary to compare and merge data across different file formats. The conventional solution with source code would be to use a tool such as Git to compare and merge data across different file formats. However, this approach is not always feasible, as it requires the data to be in a text-based format and the data to be structured in a way that can be compared line by line.</p>\n<p>This project explores the design of a composable diffing specification that can compare and merge data across heterogenous file formats. The project will involve designing a domain-specific language for specifying the diffing rules, and implementing a prototype tool that can compare and merge data across different file formats. 
Crucially, the tool should be composable, meaning that it should be possible to combine different diffing rules to compare and merge data across different file formats.</p>\n<p>As an evaluation, the project will apply the composable diffing specification to real-world datasets used in our <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a> projects, and compare the results with a conventional approach using Git.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs\">Uncertainty at scale: how CS hinders climate research</a> has relevant background reading on some of the types of diffs that would be useful in a geospatial context.</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">Planetary computing for data-driven environmental policy-making</a> covers the broader data processing pipelines we need to integrate into.</li>\n<li><em><a href=\"http://eelco.lempsink.nl/thesis.pdf\">"Generic type-safe diff and patch for families of datatypes"</a>, Eelco Lempsink (2009)</em> is a principled library in Haskell for constructing type-safe diff and patch functions using GADTs.</li>\n<li><em><a href=\"https://gioele.io/p/doceng2018/doceng2018-diffi.pdf\">diffi: diff improved; a preview</a>, Gioele Barabucci (2018)</em> is a comparison tool whose primary goal is to describe the differences between the content of two documents regardless of their formats.</li>\n</ul>",
+18
avsm/ideas_functional-imap.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/functional-imap\">Functional ABNF parser generators</a> <span>/ Jan 2011</span></h2><div><p>This is an idea proposed in 2011 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://github.com/ns476\">Nicholas Skehin</a>.</p>\n<p>Writing internet servers is a difficult proposition. On some levels it seems as\nthough we haven\u2019t made much progress since the 1970s, as popular servers for many Internet\nprotocols - Apache and nginx for HTTP, BIND for DNS, and qmail for SMTP -\nstill tend to be written in C. While it is not impossible to write\nrobust software in C, it does tend to be extremely difficult and almost all of\nthe above have suffered from their fair share of security vulnerabilities.\nWith the advent of higher level programming languages, this does not need to be\nthe case any longer. Modern functional languages such as OCaml and Haskell can\nbe competitive performance-wise with C on many workloads. In many cases their\nemphasis on purity where possible comes with significant benefits when moving\ntowards an environment where concurrent execution is the norm rather than the\nexception.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/functional-imap\">303 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#functional-abnf-parser-generators\"></a>Functional ABNF parser generators</h1>\n<p>This is an idea proposed in 2011 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://github.com/ns476\">Nicholas Skehin</a>.</p>\n<p>Writing internet servers is a difficult proposition. On some levels it seems as\nthough we haven\u2019t made much progress since the 1970s, as popular servers for many Internet\nprotocols - Apache and nginx for HTTP, BIND for DNS, and qmail for SMTP -\nstill tend to be written in C. 
While it is not impossible to write\nrobust software in C, it does tend to be extremely difficult and almost all of\nthe above have suffered from their fair share of security vulnerabilities.\nWith the advent of higher level programming languages, this does not need to be\nthe case any longer. Modern functional languages such as OCaml and Haskell can\nbe competitive performance-wise with C on many workloads. In many cases their\nemphasis on purity where possible comes with significant benefits when moving\ntowards an environment where concurrent execution is the norm rather than the\nexception.</p>\n<p>This project aimed to build a functional parser for the IMAP email protocol\nin OCaml, and to compare its performance and flexibility against a C-based\nparser. IMAP is a very complex protocol with many quirks and has endured several\nbuggy implementations through the years on both the server and the client side.\nSince writing a parser for IMAP by hand was going to be tedious and error-prone,\nthis project focussed on how better tooling could make writing parsers for internet\nservers a more manageable and pain-free experience. Specifically, it investigated\nwriting ABNF generators for OCaml, since IMAP is already specified in ABNF.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li>RFC 3501 and the ABNF spec of IMAP.</li>\n<li>An OCaml <a href=\"https://github.com/nojb/ocaml-imap\">IMAP implementation</a></li>\n<li><a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation PDF isn't available publicly but\nshould be in the Cambridge Computer Lab archives somewhere.\nThe ABNFComp tool that was built is also available on request\nfrom the author, but not published.</p>",
+18
avsm/ideas_git-maildir.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/git-maildir\">A strongly consistent index for email using git and MirageOS</a> <span>/ Jan 2019</span></h2><div><p>This is an idea proposed in 2019 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://github.com/odnh\">Oliver Hope</a>. It was co-supervised with <a href=\"https://github.com/dra27\">David Allsopp</a>.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Maildir\">Maildir</a> is a widely used format for storing emails. Its main benefit is that it uses the filesystem in such a way that client programs do not have to handle locking themselves. The downside of this is that it makes it hard to create a consistent index as we cannot guarantee that the filesystem is in a consistent state when we try to update it. If we did have a consistent index, it would allow for safer concurrent support and the implementation of new features.</p>\n<p>The aim of this project therefore is to solve the consistency problem. This can be done by using git, the version control system, to build an overlay on top of maildir in the filesystem, allowing multiple filesystem operations to be bundled into commits. These can be used to keep track of all changes made to the maildir. As these changes are being recorded by a version control system, we can be sure that any index built on top will be strongly consistent. 
As git also provides branching, we can extend this model to add new features described in the possible extensions section.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/git-maildir\">283 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#a-strongly-consistent-index-for-email-using-git-and-mirageos\"></a>A strongly consistent index for email using git and MirageOS</h1>\n<p>This is an idea proposed in 2019 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://github.com/odnh\">Oliver Hope</a>. It was co-supervised with <a href=\"https://github.com/dra27\">David Allsopp</a>.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Maildir\">Maildir</a> is a widely used format for storing emails. Its main benefit is that it uses the filesystem in such a way that client programs do not have to handle locking themselves. The downside of this is that it makes it hard to create a consistent index as we cannot guarantee that the filesystem is in a consistent state when we try to update it. If we did have a consistent index, it would allow for safer concurrent support and the implementation of new features.</p>\n<p>The aim of this project therefore is to solve the consistency problem. This can be done by using git, the version control system, to build an overlay on top of maildir in the filesystem, allowing multiple filesystem operations to be bundled into commits. These can be used to keep track of all changes made to the maildir. As these changes are being recorded by a version control system, we can be sure that any index built on top will be strongly consistent. 
As git also provides branching, we can extend this model to add new features described in the possible extensions section.</p>\n<p>The project successfully implemented this git overlay using <a href=\"https://github.com/mirage\">MirageOS</a> libraries which provide git functionality, maildir operations, and even email parsing. With the overlay (and therefore a consistent index) implemented, the project was able to make many more guarantees about the state of the maildir at any time. This allowed for dealing with conflicting operations in an easier and more reliable manner. Furthermore, the overlay also provided the possibility of easily implementing novel features such as roll-back and separate branches for different use cases.</p>\n<p><a href=\"https://github.com/odnh\">Oliver Hope</a> published his <a href=\"https://github.com/odnh/gitmaildir\">dissertation repository</a> and the <a href=\"https://github.com/odnh/gitmaildir\">source code</a> to gitmaildir online.</p>",
+18
avsm/ideas_gradual-type-error-debugging.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/gradual-type-error-debugging\">Gradually debugging type errors</a> <span>/ Sep 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a>. It is co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Reasoning about type errors is very difficult, and requires shifting between\nstatic and dynamic types. In OCaml, the type checker asserts ill-typedness but\nprovides little in the way of understanding why the type checker inferred such\ntypes. These direct error messages are difficult to understand even for\nexperienced programmers working on larger codebases.</p>\n<p>This project will explore how to use gradual types to reason more effectively\nabout such ill-typed programs, by introducing more dynamic types to help some\nusers build an intuition about the problem in their code. The intention is to\nenable a more exploratory approach to constructing well-typed programs.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/gradual-type-error-debugging\">131 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#gradually-debugging-type-errors\"></a>Gradually debugging type errors</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a>. It is co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Reasoning about type errors is very difficult, and requires shifting between\nstatic and dynamic types. In OCaml, the type checker asserts ill-typedness but\nprovides little in the way of understanding why the type checker inferred such\ntypes. 
These direct error messages are difficult to understand even for\nexperienced programmers working on larger codebases.</p>\n<p>This project will explore how to use gradual types to reason more effectively\nabout such ill-typed programs, by introducing more dynamic types to help some\nusers build an intuition about the problem in their code. The intention is to\nenable a more exploratory approach to constructing well-typed programs.</p>\n<p>Some relevant reading:</p>\n<ul>\n<li><a href=\"https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SNAPL.2015.274\">Refined Criteria for Gradual Typing</a></li>\n<li><a href=\"https://arxiv.org/abs/1810.12619\">Dynamic Type Inference for Gradual Hindley-Milner Typing</a></li>\n<li><a href=\"https://arxiv.org/abs/1606.07557\">Dynamic Witnesses for Static Type Errors (or, Ill-Typed Programs Usually Go Wrong)</a></li>\n</ul>",
+18
avsm/ideas_grey-lit-crawl.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">Crawling grey literature for conservation evidence</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a good starter project, and has been <span>completed</span> by <a href=\"mailto:sb2704@cam.ac.uk\">Shrey Biswas</a> and <a href=\"https://github.com/Kacper-M-Michalik\">Kacper Michalik</a>. It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in finding and synthesising evidence for conservation interventions. Much of this evidence is published in academic journals, but there is a large body of <a href=\"https://en.wikipedia.org/wiki/Grey_literature\">grey literature</a> that is not indexed in academic databases. This includes reports from NGOs, government agencies, and other organisations that are not peer-reviewed, but can still contain valuable information.</p>\n<p>This project involved developing a web crawler to search for grey literature on conservation interventions, tracking the provenance and license information, and extracting relevant information from these documents. The goal is to make this information more accessible to researchers and practitioners in the field of conservation.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">117 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#crawling-grey-literature-for-conservation-evidence\"></a>Crawling grey literature for conservation evidence</h1>\n<p>This is an idea proposed in 2024 as a good starter project, and has been <span>completed</span> by <a href=\"mailto:sb2704@cam.ac.uk\">Shrey Biswas</a> and <a href=\"https://github.com/Kacper-M-Michalik\">Kacper Michalik</a>. 
It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in finding and synthesising evidence for conservation interventions. Much of this evidence is published in academic journals, but there is a large body of <a href=\"https://en.wikipedia.org/wiki/Grey_literature\">grey literature</a> that is not indexed in academic databases. This includes reports from NGOs, government agencies, and other organisations that are not peer-reviewed, but can still contain valuable information.</p>\n<p>This project involved developing a web crawler to search for grey literature on conservation interventions, tracking the provenance and license information, and extracting relevant information from these documents. The goal is to make this information more accessible to researchers and practitioners in the field of conservation.</p>\n<p><strong>Status:</strong> Paper in preparation, contact me for more details about followups.</p>",
+18
avsm/ideas_hazel-to-ocaml-to-hazel.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/hazel-to-ocaml-to-hazel\">Bidirectional Hazel to OCaml programming</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a>. It is co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a>.</p>\n<p><a href=\"https://hazel.org\">Hazel</a> is a pure subset of OCaml with a live functional\nprogramming environment that is able to typecheck, manipulate, and even run\nincomplete programs. As a pure language with no effects, Hazel is a great\nchoice for domains such as configuration languages where some control flow\nis needed, but not the full power of a general purpose programming language.\nOn the other hand, Hazel only currently has an interpreter and so is fairly slow\nto evaluate compared to a full programming language such as OCaml.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/hazel-to-ocaml-to-hazel\">277 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#bidirectional-hazel-to-ocaml-programming\"></a>Bidirectional Hazel to OCaml programming</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a>. It is co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a>.</p>\n<p><a href=\"https://hazel.org\">Hazel</a> is a pure subset of OCaml with a live functional\nprogramming environment that is able to typecheck, manipulate, and even run\nincomplete programs. 
As a pure language with no effects, Hazel is a great\nchoice for domains such as configuration languages where some control flow\nis needed, but not the full power of a general purpose programming language.\nOn the other hand, Hazel only currently has an interpreter and so is fairly slow\nto evaluate compared to a full programming language such as OCaml.</p>\n<p>This summer project aims to do two things:</p>\n<ul>\n<li>Build a simple Hazel -> OCaml transpiler that will directly evaluate a Hazel\nprogram with no typed holes as OCaml. If there is a typed hole, then an\nexception can be raised. With some creative thinking, we may be able to raise\nan OCaml effect instead and do something useful to continue the execution of the program.</li>\n<li>Build on <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>'s <a href=\"https://github.com/patricoferris/hazel_of_ocaml\">OCaml to Hazel transpiler</a> which goes\nfrom a subset of OCaml code to Hazel.</li>\n</ul>\n<p>Once we can go back and forth, we can explore some interesting domains where this is useful. For example,\ncan we build a configuration language frontend in Hazel, and then directly convert that into OCaml code\nfor embedding into an application? Could we build a simple blog/wiki frontend where layout is expressed\nin livelit Hazel, and then when ready is converted to OCaml for publishing on the web?</p>\n<p>We don't know if any of this will work, but we'd like to explore this \"context\nswitching\" between languages of different expressivity in order to explore the\ndivide between interactive, exploratory programming, and high performance and\nmore static published code.</p>",
+18
avsm/ideas_hedgehog-mapping.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">Mapping urban and rural British hedgehogs</a> <span>/ Jun 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>. It is co-supervised with <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a>.</p>\n<p>The <a href=\"https://www.mammalweb.org/en/nhmp\">National Hedgehog Monitoring Programme</a> aims to provide robust population estimates for the beloved hedgehog.</p>\n<blockquote>\n<p>Despite being the nation\u2019s favourite mammal, there's a lot more to learn about hedgehog populations across the country. We do know that, although urban populations are faring better than their rural counterparts, overall hedgehogs are declining across Britain, so much so that they\u2019re now categorised as Vulnerable to extinction.\n-- <a href=\"https://www.mammalweb.org/en/nhmp\">NHMP</a></p>\n</blockquote>\n<p>The People's Trust for Endangered Species has been <a href=\"https://ptes.org/campaigns/hedgehogs/nhmp/\">coordinating the programme</a>. For the purposes of this project, we have access to:</p>\n<ul>\n<li>GPS data from over 100 tagged hedgehogs collected by <a href=\"https://www.hedgehogstreet.org/on-the-hunt-for-hedgehogs/\">Lauren Moore during her PhD</a> to build predictive movement models.</li>\n<li>OpenStreetMap data about where hedgehogs probably shouldn't be (e.g. 
middle of a road) to help with species distribution modelling.</li>\n<li>PTES also run the <a href=\"https://www.hedgehogstreet.org/\">Hedgehog Street</a> program which has the mapped locations of <a href=\"https://www.hedgehogstreet.org/help-hedgehogs/link-your-garden/\">hedgehog highways</a> across the UK to assess how effective they are.</li>\n<li>A new high-res map of the UK's <a href=\"https://eoscience-external.projects.earthengine.app/view/farmscapes\">hedgerows and stonewalls</a> from Google DeepMind and <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a>.</li>\n</ul>\n<p>Our initial efforts in the summer of 2025 will be to put together a high-res map of UK hedgehog habitats, specifically brambles and likely urban habitats. Once that works, the plan is to apply some spatially explicit modelling, still focussing on the UK. This will involve an exciting collaboration with the <a href=\"https://ptes.org/\">PTES</a>, who I'm looking forward to meeting!</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#mapping-urban-and-rural-british-hedgehogs\"></a>Mapping urban and rural British hedgehogs</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>. It is co-supervised with <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a>.</p>\n<p>The <a href=\"https://www.mammalweb.org/en/nhmp\">National Hedgehog Monitoring Programme</a> aims to provide robust population estimates for the beloved hedgehog.</p>\n<blockquote>\n<p>Despite being the nation\u2019s favourite mammal, there's a lot more to learn about hedgehog populations across the country. 
We do know that, although urban populations are faring better than their rural counterparts, overall hedgehogs are declining across Britain, so much so that they\u2019re now categorised as Vulnerable to extinction.\n-- <a href=\"https://www.mammalweb.org/en/nhmp\">NHMP</a></p>\n</blockquote>\n<p>The People's Trust for Endangered Species has been <a href=\"https://ptes.org/campaigns/hedgehogs/nhmp/\">coordinating the programme</a>. For the purposes of this project, we have access to:</p>\n<ul>\n<li>GPS data from over 100 tagged hedgehogs collected by <a href=\"https://www.hedgehogstreet.org/on-the-hunt-for-hedgehogs/\">Lauren Moore during her PhD</a> to build predictive movement models.</li>\n<li>OpenStreetMap data about where hedgehogs probably shouldn't be (e.g. middle of a road) to help with species distribution modelling.</li>\n<li>PTES also run the <a href=\"https://www.hedgehogstreet.org/\">Hedgehog Street</a> program which has the mapped locations of <a href=\"https://www.hedgehogstreet.org/help-hedgehogs/link-your-garden/\">hedgehog highways</a> across the UK to assess how effective they are.</li>\n<li>A new high-res map of the UK's <a href=\"https://eoscience-external.projects.earthengine.app/view/farmscapes\">hedgerows and stonewalls</a> from Google DeepMind and <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a>.</li>\n</ul>\n<p>Our initial efforts in the summer of 2025 will be to put together a high-res map of UK hedgehog habitats, specifically brambles and likely urban habitats. Once that works, the plan is to apply some spatially explicit modelling, still focussing on the UK. This will involve an exciting collaboration with the <a href=\"https://ptes.org/\">PTES</a>, who I'm looking forward to meeting!</p>",
+18
avsm/ideas_interspatial-networking.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/interspatial-networking\">Interspatial Networking with DNS</a> <span>/ Jan 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>. It is co-supervised with <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>.</p>\n<p>The existing Internet architecture lacks support for naming locations and\nresolving them to the myriad addressing mechanisms we use beyond IP. While\nthere have been many advances in addressing locations via multiple <em>routing\nschemes</em>, it remains difficult to refer to location-based services via <em>logical\nnames</em>. This in turn makes it difficult to deploy network services that can be\nreferred to by a stable name that specifies a given location, and that resolves\nto the addresses of the devices in that space. This matters because there are\na broad class of network-connected devices with a physical presence to which\nlocation is an intrinsic part of their identity. A networked speaker in, say,\nthe Oval Office is defined by its location: it is simply the Oval Office\nSpeaker! If the specific device moves location its identity should change with\nits new location, and if the device is replaced then the replacement should\nassume the function of its predecessor.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/interspatial-networking\">218 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#interspatial-networking-with-dns\"></a>Interspatial Networking with DNS</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>. 
It is co-supervised with <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>.</p>\n<p>The existing Internet architecture lacks support for naming locations and\nresolving them to the myriad addressing mechanisms we use beyond IP. While\nthere have been many advances in addressing locations via multiple <em>routing\nschemes</em>, it remains difficult to refer to location-based services via <em>logical\nnames</em>. This in turn makes it difficult to deploy network services that can be\nreferred to by a stable name that specifies a given location, and that resolves\nto the addresses of the devices in that space. This matters because there are\na broad class of network-connected devices with a physical presence to which\nlocation is an intrinsic part of their identity. A networked speaker in, say,\nthe Oval Office is defined by its location: it is simply the Oval Office\nSpeaker! If the specific device moves location its identity should change with\nits new location, and if the device is replaced then the replacement should\nassume the function of its predecessor.</p>\n<p>This PhD project will explore the Spatial Name System (SNS) that allows for the\nassignment of hierarchical location-based names and for resolution schemes that\nare both global and local. Since we extend the DNS, our scheme allows for the\nintegration of spatial names into existing applications and opens up new\npossibilities for sensor networks and augmented reality.</p>\n<h2><a href=\"https://anil.recoil.org/#relevant-reading\"></a>Relevant Reading</h2>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2023-hotnets-sns\">Where on Earth is the Spatial Name System?</a></li>\n</ul>",
+18
avsm/ideas_legal-aspects-of-credits.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/legal-aspects-of-credits\">Legal perspectives on integrity issues in forest carbon</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a postdoctoral project, and has been <span>completed</span> by <a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a>. It was co-supervised with <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>.</p>\n<p>Carbon finance offers a vital way to fund urgently needed forest conservation,\nbut there are integrity issues on the supply side.<a href=\"https://anil.recoil.org/#fn-1\">[1]</a>\nBesides the known issues with carbon quantification,<a href=\"https://anil.recoil.org/#fn-2\">[2]</a> carbon credits\nare often poorly designed and implemented from a <em>legal</em> perspective.\nSpecifically, in the absence of a clear legal framework for forest carbon\ncredits, contracts tend to conceptualise credits in similar terms to the\nproducts of extractive industries, such as mineral mining. This is a factually\ninaccurate model for carbon credits, since the carbon is not extracted but on\nthe contrary is stored in the trees which remain part of the landscape. This\ninappropriate model then leads to misunderstandings and misallocations of the\nrights of the various stakeholders in carbon finance projects and militates\nagainst just benefit-sharing arrangements.</p>\n<p>This project is exploring a novel legal framework for forest carbon credits\nwhich separates carbon tenure (i.e. title and associated property rights to the\nland and trees which store the carbon) from the carbon rights (i.e. 
title and\nassociated rights to monetise, sell, count and retire the credits which\nsymbolically represent the carbon stored in the trees), while also specifying\nthe relationship between the carbon tenure and the carbon rights.</p>\n\n<ol>\n<li>\n<p>See the note on <a href=\"https://anil.recoil.org/notes/nature-crossroads\">Nature Sustainability commentary on carbon and biodiversity credits</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>See the <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a> project and related papers.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li></ol>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/legal-aspects-of-credits\">227 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#legal-perspectives-on-integrity-issues-in-forest-carbon\"></a>Legal perspectives on integrity issues in forest carbon</h1>\n<p>This is an idea proposed in 2024 as a postdoctoral project, and has been <span>completed</span> by <a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a>. It was co-supervised with <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>.</p>\n<p>Carbon finance offers a vital way to fund urgently needed forest conservation,\nbut there are integrity issues on the supply side.<a href=\"https://anil.recoil.org/#fn-1\">[1]</a>\nBesides the known issues with carbon quantification,<a href=\"https://anil.recoil.org/#fn-2\">[2]</a> carbon credits\nare often poorly designed and implemented from a <em>legal</em> perspective.\nSpecifically, in the absence of a clear legal framework for forest carbon\ncredits, contracts tend to conceptualise credits in similar terms to the\nproducts of extractive industries, such as mineral mining. 
This is a factually\ninaccurate model for carbon credits, since the carbon is not extracted but on\nthe contrary is stored in the trees which remain part of the landscape. This\ninappropriate model then leads to misunderstandings and misallocations of the\nrights of the various stakeholders in carbon finance projects and militates\nagainst just benefit-sharing arrangements.</p>\n<p>This project is exploring a novel legal framework for forest carbon credits\nwhich separates carbon tenure (i.e. title and associated property rights to the\nland and trees which store the carbon) from the carbon rights (i.e. title and\nassociated rights to monetise, sell, count and retire the credits which\nsymbolically represent the carbon stored in the trees), while also specifying\nthe relationship between the carbon tenure and the carbon rights.</p>\n<p>This paper was subsequently published in Climate Law Review and is available\nto read as \"<a href=\"https://anil.recoil.org/papers/2024-cclr-carbon\">A Legal Perspective on Supply-side Integrity Issues in the Forest Carbon Market</a>\".</p>\n\n<ol>\n<li>\n<p>See the note on <a href=\"https://anil.recoil.org/notes/nature-crossroads\">Nature Sustainability commentary on carbon and biodiversity credits</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>See the <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a> project and related papers.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_life-explorer-wasm.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/life-explorer-wasm\">Using wasm to locally explore geospatial layers</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:sf729@cam.ac.uk\">Sam Forbes</a>. It is co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Some of my projects like <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a> or <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a> involve geospatial base maps with\ngigabytes or even terabytes of data. This data is usually split up into\nmultiple GeoTIFFs, each of which has a slice of information. For example, the\nLIFE persistence maps have around 30000 maps for individual species, and then\nan aggregated GeoTIFF for mammals, birds, reptiles and so forth.</p>\n<p>This project will explore how to build a WebAssembly-based visualisation tool\nfor geospatial ecology data. This existing data is in the form of GeoTIFF\nfiles, which are image files with embedded georeferencing information. The\napplication will be applied to files which include information on the\nprevalence of species in an area, consisting of a global map at 100 m2 scale.\nAn existing tool, QGIS, allows ecologists to visualise this data across the\nentire world, collated by types of species, but this is difficult to work with\nbecause of the scale of the data involved.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/life-explorer-wasm\">341 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#using-wasm-to-locally-explore-geospatial-layers\"></a>Using wasm to locally explore geospatial layers</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:sf729@cam.ac.uk\">Sam Forbes</a>. 
It is co-supervised with <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>Some of my projects like <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a> or <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a> involve geospatial base maps with\ngigabytes or even terabytes of data. This data is usually split up into\nmultiple GeoTIFFs, each of which has a slice of information. For example, the\nLIFE persistence maps have around 30000 maps for individual species, and then\nan aggregated GeoTIFF for mammals, birds, reptiles and so forth.</p>\n<p>This project will explore how to build a WebAssembly-based visualisation tool\nfor geospatial ecology data. This existing data is in the form of GeoTIFF\nfiles, which are image files with embedded georeferencing information. The\napplication will be applied to files which include information on the\nprevalence of species in an area, consisting of a global map at 100 m2 scale.\nAn existing tool, QGIS, allows ecologists to visualise this data across the\nentire world, collated by types of species, but this is difficult to work with\nbecause of the scale of the data involved.</p>\n<p>Therefore, it would be useful to have a tool which can work across a smaller\nsubset of locations and species, which allows ecologists to more quickly and\neasily visualise the subset of data that they are working with. Additionally,\nthe use of WebAssembly means this tool can be run entirely in-browser. This\nenables offline use in a cross-platform environment, and avoids the need for a\ncentral webserver. The project could also be extended to online applications\nmore easily because of this.</p>\n<p>The files will be requested from a local server process, as WebAssembly is\nunable to manipulate local files directly. This will be implemented via a\nseparate JavaScript-based process. 
Then, the application will collate and crop\ninformation from the files, as specified by the user through the interface, to\ndisplay the desired species distribution map.</p>\n<p>To ensure that the application can process the data sufficiently fast for a\nreal-time application, the implementation will exploit the inherent\nparallelisms of the data through concurrency. This can be on a file level, by\nconcurrently processing multiple files, or on a pixel level when generating\nindependent parts of the map.</p>",
+18
avsm/ideas_macro-micro-benchmarking.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/macro-micro-benchmarking\">Macro- and Micro-benchmarking in OCaml</a> <span>/ Jan 2012</span></h2><div><p>This is an idea proposed in 2012 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Sebastian Funk</span>.</p>\n<p>Benchmarking involves the measurement of statistics such as run-time, memory allocations, garbage collections in a running program in order to analyze its performance and behaviour. To scientifically evaluate and understand the performance of a program, there is often a cycle of:</p>\n<ol>\n<li>making performance observations about the program</li>\n<li>finding a potential hypothesis, i.e. a cause for this performance behaviour</li>\n<li>making predictions on experiments based on this hypothesis</li>\n<li>comparing the predictions against the actual benchmark results to evaluate the hypothesis.</li>\n</ol>\n<p>To be able to do all this, there is a need for an effective and robust\nframework to continuously make these observations that is not biased by the\nchoice of hypothesis or the observation made. In general, any sort of\nimprovement relies on robust and precise measurements.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/macro-micro-benchmarking\">345 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#macro--and-micro-benchmarking-in-ocaml\"></a>Macro- and Micro-benchmarking in OCaml</h1>\n<p>This is an idea proposed in 2012 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Sebastian Funk</span>.</p>\n<p>Benchmarking involves the measurement of statistics such as run-time, memory allocations, garbage collections in a running program in order to analyze its performance and behaviour. 
To scientifically evaluate and understand the performance of a program, there is often a cycle of:</p>\n<ol>\n<li>making performance observations about the program</li>\n<li>finding a potential hypothesis, i.e. a cause for this performance behaviour</li>\n<li>making predictions on experiments based on this hypothesis</li>\n<li>comparing the predictions against the actual benchmark results to evaluate the hypothesis.</li>\n</ol>\n<p>To be able to do all this, there is a need for an effective and robust\nframework to continuously make these observations that is not biased by the\nchoice of hypothesis or the observation made. In general, any sort of\nimprovement relies on robust and precise measurements.</p>\n<p>Benchmarking can be split into two perspectives: micro-benchmarking, measuring\na single (small) function repeatedly to collect statistics for a regression,\nand macro-benchmarking, measuring the performance of a complete program or\nlibrary, often in a single-run. This project aims to improve the benchmarking\ninfrastructure in OCaml, both at micro- and macro-benchmarking.</p>\n<p>The project aims to add event tracing into OCaml, via instrumentation to the\n<a href=\"https://github.com/janestreet/core-bench\">Core Bench</a> library using Camlp4.\nThe event-tracing tool\nis then a way for macro-benchmarking together with the multivariate regression\nfor micro-benchmarking to analyze the performance of commonly used libraries to\nexhibit and explain abnormalities and performance differences in\nimplementations. 
On a meta-level, this study will give an insight into which\npredictors are useful for a multivariate regression in which circumstances to\nprovide interesting results, and how event-tracing can be used efficiently and\ncompactly in large libraries.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation is available on request to students from <a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a> but isn't\nonline anywhere. The source code (a Camlp4 event tracer) has been superseded by modern\nevent tracing.</p>\n<p><span>Sebastian Funk</span> went on to work at Jane Street on OCaml after his project, and one\n2019 talk on his subsequent work can be seen below.</p>",
+18
avsm/ideas_mangrove-literature-for-ce.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/mangrove-literature-for-ce\">Assessing mangrove literature for conservation evidence</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-thomas-worthington\">Tom Worthington</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in finding and synthesising evidence for conservation interventions. Mangrove forests are one of the most threatened ecosystems in the world, and there is a large body of literature on the topic. However, this literature is spread across a wide range of sources, including academic journals, reports, and grey literature, and it can be difficult to identify and synthesise the key findings. Moreover, there is a need to assess the quality of the evidence, perhaps from remote sensing data or field studies, to inform policy and practice.</p>\n<p>At the Cambridge Conservation Initiative, there are in-house experts such as <a href=\"https://www.zoo.cam.ac.uk/directory/dr-thomas-worthington\">Tom Worthington</a> developing a platform to integrate evidence on mangrove preservation and restoration. This project will involve assessing the literature found by this project to identify key sources, extract relevant information, and evaluate the quality of the resulting evidence. 
The goal is to develop a set of best practices for using mangrove literature in the context of conservation evidence, to validate it from the databases collated by domain experts, and to make recommendations for future work in this area.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#assessing-mangrove-literature-for-conservation-evidence\"></a>Assessing mangrove literature for conservation evidence</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-thomas-worthington\">Tom Worthington</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in finding and synthesising evidence for conservation interventions. Mangrove forests are one of the most threatened ecosystems in the world, and there is a large body of literature on the topic. However, this literature is spread across a wide range of sources, including academic journals, reports, and grey literature, and it can be difficult to identify and synthesise the key findings. Moreover, there is a need to assess the quality of the evidence, perhaps from remote sensing data or field studies, to inform policy and practice.</p>\n<p>At the Cambridge Conservation Initiative, there are in-house experts such as <a href=\"https://www.zoo.cam.ac.uk/directory/dr-thomas-worthington\">Tom Worthington</a> developing a platform to integrate evidence on mangrove preservation and restoration. This project will involve assessing the literature found by this project to identify key sources, extract relevant information, and evaluate the quality of the resulting evidence. 
The goal is to develop a set of best practices for using mangrove literature in the context of conservation evidence, to validate it from the databases collated by domain experts, and to make recommendations for future work in this area.</p>",
+18
avsm/ideas_mapping-hunting-risks-for-wild-meat.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">Mapping hunting risks for wild meat in protected areas</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a postdoctoral project, and is currently <span>being worked on</span> by <a href=\"https://charlesemogor.com\">Charles Emogor</a>. It is co-supervised with <a href=\"https://teamcore.seas.harvard.edu/tambe\">Milind Tambe</a>.</p>\n<p>There is an important balance needed between the biodiversity damage caused by\nhunting in protected areas and the well-being of local communities that depend\non it. One understudied driver of overly damaging hunting in these areas is <a href=\"https://en.wikipedia.org/wiki/Trapping\">snaring</a> (as\nopposed to gun hunting), which potentially increases carcass wastage and hence\ncauses biodiversity harm without proportionate benefit to the community.</p>\n<p>This project examines how to improve the efficacy of anti-poaching ranger patrols\nwhile also plugging the knowledge gap around wild meat snaring. Both of these\nresearch topics can be tackled in a new light with the emergence of machine\nlearning as a data-driven approach to deriving insights from sparse data, and\nparticularly from some of the newer base maps being developed in our <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>\nproject.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#mapping-hunting-risks-for-wild-meat-in-protected-areas\"></a>Mapping hunting risks for wild meat in protected areas</h1>\n<p>This is an idea proposed in 2024 as a postdoctoral project, and is currently <span>being worked on</span> by <a href=\"https://charlesemogor.com\">Charles Emogor</a>. 
It is co-supervised with <a href=\"https://teamcore.seas.harvard.edu/tambe\">Milind Tambe</a>.</p>\n<p>There is an important balance needed between the biodiversity damage caused by\nhunting in protected areas and the well-being of local communities that depend\non it. One understudied driver of overly damaging hunting in these areas is <a href=\"https://en.wikipedia.org/wiki/Trapping\">snaring</a> (as\nopposed to gun hunting), which potentially increases carcass wastage and hence\ncauses biodiversity harm without proportionate benefit to the community.</p>\n<p>This project examines how to improve the efficacy of anti-poaching ranger patrols\nwhile also plugging the knowledge gap around wild meat snaring. Both of these\nresearch topics can be tackled in a new light with the emergence of machine\nlearning as a data-driven approach to deriving insights from sparse data, and\nparticularly from some of the newer base maps being developed in our <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>\nproject.</p>",
+18
avsm/ideas_mapping-species-extinction-risks.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/mapping-species-extinction-risks\">Real-time mapping of changes in species extinction risks</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://emiliolr.github.io\">Emilio Luz-Ricca</a>. It is co-supervised with <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>.</p>\n<p>Loss of habitat represents the most significant threat to wildlife overall, but\nadvances in satellite sensing have enabled the assessment of habitat extent\nwith comprehensive spatial coverage and reasonable temporal resolution. To\naddress rising demand for metrics to quantify biodiversity, we have developed\nthe LIFE metric (see <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>) that models the effect of landuse changes on\nspecies extinction risk as a function of Areas of Habitat (AoH).</p>\n<p>This PhD work explores how to deal with the anthropogenic threats beyond simple\nhabitat loss, including hunting, agricultural practices, and the introduction\nof invasive species. These additional threatening processes degrade habitat quality\nand lower species occupancy, but are extremely difficult to observe directly via\nremote sensing. This project will therefore involve a combination of modelling,\nmachine learning and remote sensing data analysis to understand the impact of these\nadditional anthropogenic threats on habitat quality on a per-species basis.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#real-time-mapping-of-changes-in-species-extinction-risks\"></a>Real-time mapping of changes in species extinction risks</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://emiliolr.github.io\">Emilio Luz-Ricca</a>. 
It is co-supervised with <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>.</p>\n<p>Loss of habitat represents the most significant threat to wildlife overall, but\nadvances in satellite sensing have enabled the assessment of habitat extent\nwith comprehensive spatial coverage and reasonable temporal resolution. To\naddress rising demand for metrics to quantify biodiversity, we have developed\nthe LIFE metric (see <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>) that models the effect of landuse changes on\nspecies extinction risk as a function of Areas of Habitat (AoH).</p>\n<p>This PhD work explores how to deal with the anthropogenic threats beyond simple\nhabitat loss, including hunting, agricultural practices, and the introduction\nof invasive species. These additional threatening processes degrade habitat quality\nand lower species occupancy, but are extremely difficult to observe directly via\nremote sensing. This project will therefore involve a combination of modelling,\nmachine learning and remote sensing data analysis to understand the impact of these\nadditional anthropogenic threats on habitat quality on a per-species basis.</p>",
+18
avsm/ideas_metaproperties-for-smart-contracts.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/metaproperties-for-smart-contracts\">Meta Properties of Financial Smart Contracts</a> <span>/ Aug 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and has been <span>completed</span> by <a href=\"https://derekhsorensen.com\">Derek Sorensen</a>. It was co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<p>Financial smart contracts routinely manage billions of US dollars worth of\ndigital assets, making bugs in smart contracts extremely costly, and are also\nincreasingly being used in other areas of endeavour such as carbon credit\ntracking.\nBecause of this, much work has been done in formal verification of smart\ncontracts to prove a contract correct with regard to its specification.\nHowever, financial smart contracts have complicated specifications, and it is not at all\nstraightforward for humans to write one which correctly captures all of its <em>intended</em>\nhigh-level behaviors.</p>\n<p>To mitigate this challenge, this PhD explores the development of formal tools to\ntarget <em>meta properties</em> of smart contracts, which are properties of a contract\nthat are intended by, but out of scope of, its specification. The targeted\nproperties include the economic behaviors of the contract, properties relating\nto its upgradeability features, and the intended behaviors of systems of\ncontracts. The formal tools presented are written in Coq.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#meta-properties-of-financial-smart-contracts\"></a>Meta Properties of Financial Smart Contracts</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and has been <span>completed</span> by <a href=\"https://derekhsorensen.com\">Derek Sorensen</a>. 
It was co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<p>Financial smart contracts routinely manage billions of US dollars worth of\ndigital assets, making bugs in smart contracts extremely costly, and are also\nincreasingly being used in other areas of endeavour such as carbon credit\ntracking.\nBecause of this, much work has been done in formal verification of smart\ncontracts to prove a contract correct with regard to its specification.\nHowever, financial smart contracts have complicated specifications, and it is not at all\nstraightforward for humans to write one which correctly captures all of its <em>intended</em>\nhigh-level behaviors.</p>\n<p>To mitigate this challenge, this PhD explores the development of formal tools to\ntarget <em>meta properties</em> of smart contracts, which are properties of a contract\nthat are intended by, but out of scope of, its specification. The targeted\nproperties include the economic behaviors of the contract, properties relating\nto its upgradeability features, and the intended behaviors of systems of\ncontracts. The formal tools presented are written in Coq.</p>",
+18
avsm/ideas_mips-llvm.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/mips-llvm\">Extending 64-bit MIPS support for LLVM</a> <span>/ Aug 2011</span></h2><div><p>This is an idea proposed in 2011 as a good starter project, and has been <span>completed</span> by <a href=\"https://github.com/wmorland\">William Morland</a>. It was co-supervised with <a href=\"http://www.watson.org/~robert/\">Robert M Watson</a>.</p>\n<p>In the summer of 2011, we hosted <a href=\"https://github.com/wmorland\">William Morland</a> to do an internship in the Computer Lab just as the\n<a href=\"https://www.cl.cam.ac.uk/research/security/ctsrd/\">CTSRD/CHERI</a> project kicked off.\nI was interested in MIPS as a potential target for MirageOS, and <a href=\"http://www.watson.org/~robert/\">Robert M Watson</a> in using it for the\nfuture CHERI processor.</p>\n<p><a href=\"https://github.com/wmorland\">William Morland</a> hacked on the gxemul MIPS simulator, validating (and often creating) the CHERI\ntest suite against the gxemul simulator. He then shifted gears into the (then experimental)\nLLVM/MIPS backend, filling in missing instructions and finding bugs via exercising the test suite.\nHis LLVM repository is up at <a href=\"https://github.com/wmorland/LLVM-Mips\">GitHub</a>, along with\nthe discussions from back in 2011 on the <a href=\"https://discourse.llvm.org/t/mips-target-instruction-set/20373\">llvm-dev</a> lists.\nThere's also a nice poster of this work from the <a href=\"https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/20111108-ctsrd-pimeeting-poster.pdf\">2011 CTSRD project meeting</a>!</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#extending-64-bit-mips-support-for-llvm\"></a>Extending 64-bit MIPS support for LLVM</h1>\n<p>This is an idea proposed in 2011 as a good starter project, and has been <span>completed</span> by <a href=\"https://github.com/wmorland\">William Morland</a>. 
It was co-supervised with <a href=\"http://www.watson.org/~robert/\">Robert M Watson</a>.</p>\n<p>In the summer of 2011, we hosted <a href=\"https://github.com/wmorland\">William Morland</a> to do an internship in the Computer Lab just as the\n<a href=\"https://www.cl.cam.ac.uk/research/security/ctsrd/\">CTSRD/CHERI</a> project kicked off.\nI was interested in MIPS as a potential target for MirageOS, and <a href=\"http://www.watson.org/~robert/\">Robert M Watson</a> in using it for the\nfuture CHERI processor.</p>\n<p><a href=\"https://github.com/wmorland\">William Morland</a> hacked on the gxemul MIPS simulator, validating (and often creating) the CHERI\ntest suite against the gxemul simulator. He then shifted gears into the (then experimental)\nLLVM/MIPS backend, filling in missing instructions and finding bugs via exercising the test suite.\nHis LLVM repository is up at <a href=\"https://github.com/wmorland/LLVM-Mips\">GitHub</a>, along with\nthe discussions from back in 2011 on the <a href=\"https://discourse.llvm.org/t/mips-target-instruction-set/20373\">llvm-dev</a> lists.\nThere's also a nice poster of this work from the <a href=\"https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/20111108-ctsrd-pimeeting-poster.pdf\">2011 CTSRD project meeting</a>!</p>",
+18
avsm/ideas_murmuration.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/murmuration\">Scheduling for Reduced Tail Latencies in Highly Utilised Datacenters</a> <span>/ Sep 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and has been <span>completed</span> by <a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a>. It was co-supervised with <a href=\"https://www.cst.cam.ac.uk/people/ek264\">Evangelia Kalyvianaki</a>.</p>\n<p>Modern datacenters have become the backbone for running diverse workloads that increasingly\ncomprise data-parallel computational jobs. Due to the ease of use and diversity of\nresources they host, there has been an exponential rise in the demand for datacenters leading\nto a high volume of traffic. Datacenters execute thousands of jobs by scheduling billions\nof tasks every day. To meet these demands, datacenter providers operate their clusters at\nlevels of high utilisation. We show that under such conditions existing scheduling designs\nimpose large wait times on tail tasks. This leads to large tail task completion times and\nconsequently elevated job completion times that can potentially cost datacenter providers\nmillions of dollars in terms of total cost of operations of these datacenters.</p>\n<p>This PhD explores a new decentralised scheduling model, Murmuration, that uses\nmultiple communicating scheduler instances to ensure tasks are scheduled in a\nmanner that reduces their total wait times. 
It achieves this by scheduling all\ntasks of a job such that their start times are as close together as possible,\nthereby ensuring small tail task completion times and better average job\ncompletion times.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#scheduling-for-reduced-tail-latencies-in-highly-utilised-datacenters\"></a>Scheduling for Reduced Tail Latencies in Highly Utilised Datacenters</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and has been <span>completed</span> by <a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a>. It was co-supervised with <a href=\"https://www.cst.cam.ac.uk/people/ek264\">Evangelia Kalyvianaki</a>.</p>\n<p>Modern datacenters have become the backbone for running diverse workloads that increasingly\ncomprise data-parallel computational jobs. Due to the ease of use and diversity of\nresources they host, there has been an exponential rise in the demand for datacenters leading\nto a high volume of traffic. Datacenters execute thousands of jobs by scheduling billions\nof tasks every day. To meet these demands, datacenter providers operate their clusters at\nlevels of high utilisation. We show that under such conditions existing scheduling designs\nimpose large wait times on tail tasks. This leads to large tail task completion times and\nconsequently elevated job completion times that can potentially cost datacenter providers\nmillions of dollars in terms of total cost of operations of these datacenters.</p>\n<p>This PhD explores a new decentralised scheduling model, Murmuration, that uses\nmultiple communicating scheduler instances to ensure tasks are scheduled in a\nmanner that reduces their total wait times. It achieves this by scheduling all\ntasks of a job such that their start times are as close together as possible,\nthereby ensuring small tail task completion times and better average job\ncompletion times.</p>",
+18
avsm/ideas_nqsb-tls.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/nqsb-tls\">Not-quite-so-broken TLS in OCaml</a> <span>/ Jan 2014</span></h2><div><p>This is an idea proposed in 2014 as a good starter project, and has been <span>completed</span> by <a href=\"https://github.com/hannesm\">Hannes Mehnert</a> and <a href=\"https://github.com/pqwy\">David Kaloper-Mersinjak</a>. It was co-supervised with <a href=\"https://www.cl.cam.ac.uk/~pes20/\">Peter Sewell</a>.</p>\n<p>Transport Layer Security (TLS) implementations have a history of security flaws. The immediate causes of these are often programming errors, e.g. in memory management, but the root causes are more fundamental: the challenges of interpreting the ambiguous prose specification, the complexities inherent in large APIs and code bases, inherently unsafe programming choices, and the impossibility of directly testing conformance between implementations and the specification.</p>\n<p>This internship was to work on nqsb-TLS, our re-engineered approach to security protocol specification and implementation that addresses the above root causes. The same source code serves two roles: it is both a specification of TLS, executable as a test oracle to check conformance of traces from arbitrary implementations, and a usable implementation of TLS; a modular and declarative programming style provides clean separation between its components. Many security flaws are thus excluded by construction.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/nqsb-tls\">310 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#not-quite-so-broken-tls-in-ocaml\"></a>Not-quite-so-broken TLS in OCaml</h1>\n<p>This is an idea proposed in 2014 as a good starter project, and has been <span>completed</span> by <a href=\"https://github.com/hannesm\">Hannes Mehnert</a> and <a href=\"https://github.com/pqwy\">David Kaloper-Mersinjak</a>. 
It was co-supervised with <a href=\"https://www.cl.cam.ac.uk/~pes20/\">Peter Sewell</a>.</p>\n<p>Transport Layer Security (TLS) implementations have a history of security flaws. The immediate causes of these are often programming errors, e.g. in memory management, but the root causes are more fundamental: the challenges of interpreting the ambiguous prose specification, the complexities inherent in large APIs and code bases, inherently unsafe programming choices, and the impossibility of directly testing conformance between implementations and the specification.</p>\n<p>This internship was to work on nqsb-TLS, our re-engineered approach to security protocol specification and implementation that addresses the above root causes. The same source code serves two roles: it is both a specification of TLS, executable as a test oracle to check conformance of traces from arbitrary implementations, and a usable implementation of TLS; a modular and declarative programming style provides clean separation between its components. Many security flaws are thus excluded by construction.</p>\n<p>nqsb-TLS can be used in standalone Unix applications, which we demonstrate with a messaging client, and can also be compiled into Xen unikernels (see <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernels</a>) with a trusted computing base (TCB) that is 4% of a standalone system running a standard Linux/OpenSSL stack, with all network traffic being handled in a memory-safe language; this supports applications including HTTPS, IMAP, Git, and Websocket clients and servers. 
Despite the dual-role design, the high-level implementation style, and the functional programming language, we still achieved reasonable performance, with the same handshake performance as OpenSSL and 73%\u201384% for bulk throughput.</p>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li><a href=\"https://github.com/hannesm\">Hannes Mehnert</a> and <a href=\"https://github.com/pqwy\">David Kaloper-Mersinjak</a> worked on this in an internship after discovering the MirageOS project online, and came over in the summer of 2014. The results have been hugely successful within the OCaml community, as the <a href=\"https://github.com/mirleft/ocaml-tls\">ocaml-tls</a> library is still widely used as the de facto TLS stack in many popular OCaml applications.</li>\n<li>The paper was published in USENIX Security; see <a href=\"https://anil.recoil.org/papers/2015-usenixsec-nqsb\">Not-Quite-So-Broken TLS</a>.</li>\n<li>For other stuff that happened during that internship period, see <a href=\"https://anil.recoil.org/notes/ocamllabs-2014-review\">Reviewing the second year of OCaml Labs in 2014</a>.</li>\n</ul>",
+18
avsm/ideas_ocaml-bytecode-native-ffi.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/ocaml-bytecode-native-ffi\">Runtimes \u00e0 la carte: crossloading native and bytecode OCaml</a> <span>/ Apr 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a>. It is co-supervised with <a href=\"https://github.com/dra27\">David Allsopp</a>.</p>\n<p>In 1998, <a href=\"https://fabrice.lefessant.net/\">Fabrice le Fessant</a> released Efuns ("Emacs for Functions"), an implementation of an Emacs-like editor written entirely in OCaml, which included a library for loading <a href=\"https://caml.inria.fr/pub/old_caml_site/caml-list/0780.html\">bytecode within native code programs</a>.</p>\n<p>This was nearly a decade before OCaml 3.11 would introduce <a href=\"https://gallium.inria.fr/~frisch/ndl.txt\">Alain Frisch's</a> native Dynlink support to OCaml. Natdynlink means that this original work has been largely forgotten, but there remain two interesting applications for being able to "cross-load" code compiled for the OCaml bytecode runtime in an OCaml native code application and vice versa:</p>\n<ol>\n<li>Native code OCaml applications could use OCaml as a scripting language without needing to include an assembler toolchain or solutions such as <a href=\"https://github.com/tarides/ocaml-jit\">ocaml-jit</a>.</li>\n<li>The existing bytecode REPL could use OCaml natdynlink plugins (<code>.cmxs</code> files) directly, allowing more dynamic programming and exploration of high-performance libraries with the ease of the bytecode interpreter, but retaining the runtime performance of the libraries themselves.</li>\n</ol>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/ocaml-bytecode-native-ffi\">310 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#runtimes-\u00e0-la-carte-crossloading-native-and-bytecode-ocaml\"></a>Runtimes \u00e0 la carte: crossloading native 
and bytecode OCaml</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a>. It is co-supervised with <a href=\"https://github.com/dra27\">David Allsopp</a>.</p>\n<p>In 1998, <a href=\"https://fabrice.lefessant.net/\">Fabrice le Fessant</a> released Efuns ("Emacs for Functions"), an implementation of an Emacs-like editor written entirely in OCaml, which included a library for loading <a href=\"https://caml.inria.fr/pub/old_caml_site/caml-list/0780.html\">bytecode within native code programs</a><a href=\"https://anil.recoil.org/#fn-1\">[1]</a>.</p>\n<p>This was nearly a decade before OCaml 3.11 would introduce <a href=\"https://gallium.inria.fr/~frisch/ndl.txt\">Alain Frisch's</a> native Dynlink support to OCaml. Natdynlink means that this original work has been largely forgotten, but there remain two interesting applications for being able to "cross-load" code compiled for the OCaml bytecode runtime in an OCaml native code application and vice versa:</p>\n<ol>\n<li>Native code OCaml applications could use OCaml as a scripting language without needing to include an assembler toolchain or solutions such as <a href=\"https://github.com/tarides/ocaml-jit\">ocaml-jit</a>.</li>\n<li>The existing bytecode REPL could use OCaml natdynlink plugins (<code>.cmxs</code> files) directly, allowing more dynamic programming and exploration of high-performance libraries with the ease of the bytecode interpreter, but retaining the runtime performance of the libraries themselves.</li>\n</ol>\n<p>This project aims to implement these two features directly in the OCaml distribution by:</p>\n<ol>\n<li>Extending the bytecode version of <code>Dynlink</code> to be able to load <code>.cmxs</code> files. 
This feature would be validated by extending the <code>#load</code> directive of the bytecode toplevel <code>ocaml</code> to be able to load <code>.cmxs</code> files.</li>\n<li>Extending the native version of <code>Dynlink</code> to be able to load bytecode units, both from <code>.cmo</code>/<code>.cma</code> files and directly generated in the native code program itself. This feature would be validated by adding <code>ocaml.opt</code> to the distribution - i.e. the <em>bytecode</em> toplevel compiled in native code, acting as the bytecode toplevel today, but also capable of <code>#load</code>ing <code>.cmxs</code> files, and still converting toplevel phrases for execution by the bytecode interpreter.</li>\n</ol>\n<p>This is a good student project for anyone seeking to gain more familiarity with a "real" compiler codebase, and to learn more about how they work, as a step towards (e.g.) hacking on <a href=\"https://anil.recoil.org/notes/wasm-on-exotic-targets\">webassembly</a> in the future.</p>\n\n<ol>\n<li>\n<p>A version can be found at <a href=\"https://github.com/jrrk/efuns/tree/master/dynlink\">jrrk/efuns</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_ocaml-forest-sim.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/ocaml-forest-sim\">Exploring Concurrency in Agent-Based Modelling with Multicore OCaml</a> <span>/ Jan 2021</span></h2><div><p>This is an idea proposed in 2021 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Martynas Sinkievi\u010d</span>.</p>\n<p>Computational modelling techniques such as ABMs are used to understand the\ndynamics of ecosystems and predict their behaviour in response to climate\nchange and ecological disturbances, while also searching for optimal paths\ntowards solutions to these problems. Terrestrial biosphere models are one such\napproach, simulating the vegetation and soil life cycle. There have been two\napproaches taken with such modelling:</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/ocaml-forest-sim\">371 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#exploring-concurrency-in-agent-based-modelling-with-multicore-ocaml\"></a>Exploring Concurrency in Agent-Based Modelling with Multicore OCaml</h1>\n<p>This is an idea proposed in 2021 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Martynas Sinkievi\u010d</span>.</p>\n<p>Computational modelling techniques such as ABMs are used to understand the\ndynamics of ecosystems and predict their behaviour in response to climate\nchange and ecological disturbances, while also searching for optimal paths\ntowards solutions to these problems. Terrestrial biosphere models are one such\napproach, simulating the vegetation and soil life cycle. 
There have been two\napproaches taken with such modelling:</p>\n<ul>\n<li>The top-down approach takes coarse-grained dynamic models that simulate environments in large chunks and scale to large areas as needed, but with a lack of accuracy in the simulated environment that only captures summarised features.</li>\n<li>Bottom-up fine-grained agent-based models (ABMs), which provide a more accurate description of the modelled domain.</li>\n</ul>\n<p>This project investigates ABMs that simulate all relevant parameters of a local\nenvironment and can capture the lifetime of agents, and thus can achieve\naccurate summaries as observed emergent behaviour. These models are\ncomputationally intensive, and so we need multi-processor hardware to be\nutilised fully. While common performant languages for computational science\ninclude C++ and Java, their semantics can be unforgiving in the face of complex\ncode, with data-races potentially causing non-sequential behaviour in\nboth languages. This makes debugging and developing such applications with\nparallelism in mind very difficult, especially so for those without deep\nbackground knowledge of the respective compilers and runtimes. It is also\ncommon practice in the aforementioned languages to introduce global state,\nwhich can lead to difficult-to-interpret data relationships and makes\nparallelism much more difficult to apply.</p>\n<p>This project ported a particular example of the leading agent-based forest\nsimulator created by Marechaux and Chave, TROLL, and migrated it to OCaml while\napplying a more functional style, and then introduced concurrency. This gave\ninsight into the difficulties of refactoring and maintaining modern scientific\ncomputing codebases, as well as the new parallelisation mechanisms of Multicore\nOCaml.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li>Isabelle Marechaux and Jerome Chave. 
An individual-based forest model to jointly simulate carbon and tree diversity in Amazonia: description and applications. Ecological Monographs, 87(4):632\u2013664, 2017.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li>The source code is on a <a href=\"https://github.com/mSinkievic/troll-ocaml\">private repository on GitHub</a>. Please contact <span>Martynas Sinkievi\u010d</span> to request access.</li>\n<li>The dissertation is available on request for interested students from <a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a> but has not otherwise been made public.</li>\n</ul>",
+18
avsm/ideas_parallel-scheduling-with-effects.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/parallel-scheduling-with-effects\">Using effect handlers for efficient parallel scheduling</a> <span>/ Jan 2022</span></h2><div><p>This is an idea proposed in 2022 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://github.com/bartoszmodelski\">Bartosz Modelski</a>.</p>\n<p>Modern hardware is so parallel and workloads are so concurrent, that there is\nno single, perfect scheduling strategy across a complex application software\nstack. Therefore, there are significant performance advantages to be gained\nfrom customizing and composing schedulers.</p>\n<p>Multicore parallelism is here to stay, and in contrast with clock frequency\nincreases, schedulers have to be carefully crafted in order to take full\nadvantage of horizontal scaling of the underlying architecture. That\u2019s because\ndesigns need to evolve as synchronization primitives such as locks or atomics\ndo not scale endlessly to many cores, and a naive work stealing scheduler that\nmay have been good enough on 16-thread Intel Xeon in 2012 will fail to utilize\nall 128 threads of a contemporary AMD ThreadRipper in 2022. Modern high-core\narchitectures also feature non-uniform memory and so memory latency patterns\nvary with the topology. Scheduling decisions will benefit from taking mem- ory\nhierarchy into account. Moreover, the non-uniformity also appears also in\nconsumer products such as Apple M1 or Intel Core i7-1280P. 
These highlight two\nsets of cores in modern architectures: one optimized for performance and\nanother one for efficiency.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/parallel-scheduling-with-effects\">483 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#using-effect-handlers-for-efficient-parallel-scheduling\"></a>Using effect handlers for efficient parallel scheduling</h1>\n<p>This is an idea proposed in 2022 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://github.com/bartoszmodelski\">Bartosz Modelski</a>.</p>\n<p>Modern hardware is so parallel and workloads are so concurrent that there is\nno single, perfect scheduling strategy across a complex application software\nstack. Therefore, there are significant performance advantages to be gained\nfrom customizing and composing schedulers.</p>\n<p>Multicore parallelism is here to stay, and in contrast with clock frequency\nincreases, schedulers have to be carefully crafted in order to take full\nadvantage of horizontal scaling of the underlying architecture. That\u2019s because\ndesigns need to evolve as synchronization primitives such as locks or atomics\ndo not scale endlessly to many cores, and a naive work stealing scheduler that\nmay have been good enough on a 16-thread Intel Xeon in 2012 will fail to utilize\nall 128 threads of a contemporary AMD ThreadRipper in 2022. Modern high-core\narchitectures also feature non-uniform memory and so memory latency patterns\nvary with the topology. Scheduling decisions will benefit from taking memory\nhierarchy into account. Moreover, the non-uniformity also appears in\nconsumer products such as Apple M1 or Intel Core i7-1280P. 
These highlight two\nsets of cores in modern architectures: one optimized for performance and\nanother one for efficiency.</p>\n<p>This project uses the experimental multicore OCaml extension to explore\nconcurrent scheduling on multicore hardware, using library schedulers. Common\nprogramming languages either include threading support, which is tightly\ncoupled with the language itself, or offer no support and, thus,\nlibrary-schedulers cannot offer much beyond simply running scheduled functions\nin some order. OCaml, on the other hand, features fibers and effects. Together,\nthey allow writing a direct-style, stack-switching scheduler as a library.\nFurther, OCaml allows composing schedulers -- a much-needed mechanism for\nexecuting diverse workloads with portions having different optimization\ncriteria.</p>\n<h2><a href=\"https://anil.recoil.org/#results\"></a>Results</h2>\n<p>The project was successfully concluded. To validate the hypothesis, it\ndeveloped several practical userspace schedulers and extended them with a\nnumber of work distribution methods. The code was written in OCaml with\nmulticore support, which features a novel effects-based approach to\nmultithreading. Most importantly, it decoupled lightweight threading from the\nruntime and lets users compose schedulers.\nThe evaluation involved several real-world benchmarks executed on up to 120\nthreads of a dual-socket machine with two AMD EPYC 7702 processors.</p>\n<p>The results showed that scaling applications to high core counts is\nnon-trivial, and some classic methods such as work stealing do not provide\noptimal performance. Secondly, different scheduling policies have a profound\nimpact on the throughput and latency of specific benchmarks, which justifies\nthe need to compose schedulers for heterogeneous workloads. Further, a\ncomposition of schedulers in a staged architecture was shown to provide better\ntail latency than its components. 
Moreover, the performance of the scheduler\ndeveloped in this project was shown to improve over the existing default\nMulticore OCaml scheduler, Domainslib. Finally, the results call into question a\ncommon overflow-queue design present in, e.g., Go and Tokio (Rust).</p>\n<p>Read the full <a href=\"https://github.com/bartoszmodelski/ebsl/blob/main/report/report.pdf\">report\nPDF</a>\nonline, and see the <a href=\"https://github.com/bartoszmodelski/ebsl\">notebooks</a>\nassociated with the experiments here.</p>",
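The fibers-and-effects mechanism this entry describes can be illustrated with a toy cooperative scheduler, in the style of the OCaml 5 effect-handlers tutorial. This is a minimal sketch with hypothetical `Yield`/`Fork` effects, not Bartosz Modelski's actual implementation:

```ocaml
(* Toy cooperative scheduler as a library, using OCaml 5 effect handlers.
   Illustrative only: the project's real schedulers are far richer. *)
open Effect
open Effect.Deep

type _ Effect.t +=
  | Yield : unit Effect.t
  | Fork : (unit -> unit) -> unit Effect.t

let yield () = perform Yield
let fork f = perform (Fork f)

let run main =
  let runq = Queue.create () in  (* FIFO run queue of suspended fibers *)
  let next () =
    match Queue.take_opt runq with Some k -> k () | None -> ()
  in
  let rec exec f =
    match_with f ()
      { retc = (fun () -> next ());      (* fiber finished: run the next one *)
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Yield ->
              (* Suspend the current fiber, resume it later in FIFO order. *)
              Some (fun (k : (a, unit) continuation) ->
                Queue.add (fun () -> continue k ()) runq;
                next ())
          | Fork f' ->
              (* Enqueue the parent and run the child immediately. *)
              Some (fun (k : (a, unit) continuation) ->
                Queue.add (fun () -> continue k ()) runq;
                exec f')
          | _ -> None) }
  in
  exec main
```

Because the scheduling policy lives entirely in this library, swapping the FIFO `Queue` for, say, a work-stealing deque changes the policy without touching the fibers themselves, which is the kind of composability the project exploits.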
+18
avsm/ideas_prob-programming-owl.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/prob-programming-owl\">Probabilistic Programming in OCaml</a> <span>/ Jan 2018</span></h2><div><p>This is an idea proposed in 2018 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Hari Chandrasekaran</span>. It was co-supervised with <a href=\"https://github.com/ctk21\">Tom Kelly</a> and <a href=\"https://github.com/ryanrhymes\">Liang Wang</a>.</p>\n<p>With increasing use of machine learning, it is useful to develop frameworks\nthat support rapid development and functional specification of probabilistic\nmodels for inference and reasoning. Probabilistic Programming Languages aim to\nsupport concise syntax for specifying models and consequently making inference\neasier. This can pave the way to improvements of the model created, more data\ngathering and further model refinement in an iterative sense.</p>\n<p>A PPL enables easier development of statistical models and allows decoupling\ninference from modelling. There is a lot of recent work on PPLs, and this\nproject seeks to incorporate them into functional languages. This project aims\nto develop a small PPL with a graph-based model for Bayesian inference (similar\nto the Edward PPL) within the Owl numerical library written in OCaml.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/prob-programming-owl\">277 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#probabilistic-programming-in-ocaml\"></a>Probabilistic Programming in OCaml</h1>\n<p>This is an idea proposed in 2018 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Hari Chandrasekaran</span>. 
It was co-supervised with <a href=\"https://github.com/ctk21\">Tom Kelly</a> and <a href=\"https://github.com/ryanrhymes\">Liang Wang</a>.</p>\n<p>With increasing use of machine learning, it is useful to develop frameworks\nthat support rapid development and functional specification of probabilistic\nmodels for inference and reasoning. Probabilistic Programming Languages aim to\nsupport concise syntax for specifying models and consequently making inference\neasier. This can pave the way to improvements of the model created, more data\ngathering and further model refinement in an iterative sense.</p>\n<p>A PPL enables easier development of statistical models and allows decoupling\ninference from modelling. There is a lot of recent work on PPLs, and this\nproject seeks to incorporate them into functional languages. This project aims\nto develop a small PPL with a graph-based model for Bayesian inference (similar\nto the Edward PPL) within the Owl numerical library written in OCaml.</p>\n<p>The implementation focusses on modularity, enabling the composability of models\nand allowing them to contain parameters which could be random variables from\ncommon probability distributions or deterministic functions or combinations of\nother random variables. The language would allow the specification of\ngenerative models that model the joint probability distribution of latent\nvariables and observed parameters, and inference by conditioning. The initial\nfocus will be on common statistical inference methods such as MCMC. Other\ninference algorithms such as Hamiltonian Monte Carlo or Variational Inference\nwill be explored as optional extensions to the project.</p>\n<h1><a href=\"https://anil.recoil.org/#background-reading\"></a>Background reading</h1>\n<ul>\n<li><a href=\"https://dl.acm.org/doi/10.1145/3236778\">"Functional Programming for modular Bayesian Inference"</a></li>\n<li>Dustin Tran, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, and David M. Blei. 
<a href=\"https://arxiv.org/abs/1610.09787\">Edward: A library for probabilistic modeling, inference, and criticism</a>, 2016</li>\n<li>Liang Wang. 2017. Owl: A General-Purpose Numerical Library in OCaml.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation is not available online; contact <span>Hari Chandrasekaran</span> directly to obtain a\ncopy.</p>",
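The model/inference decoupling this proposal describes can be sketched as a weighted-sampler monad in plain OCaml. The names `sample`, `observe`, and `posterior_mean` here are illustrative assumptions, not Owl's API or the dissertation's design:

```ocaml
(* Tiny importance-sampling PPL sketch: a model is a sampler that also
   carries a likelihood weight, so conditioning just multiplies weights. *)
type 'a model = unit -> 'a * float

let return x : 'a model = fun () -> (x, 1.0)

(* Monadic bind composes sub-models and accumulates their weights. *)
let ( let* ) (m : 'a model) (f : 'a -> 'b model) : 'b model =
  fun () ->
    let x, w1 = m () in
    let y, w2 = f x () in
    (y, w1 *. w2)

let sample (draw : unit -> 'a) : 'a model = fun () -> (draw (), 1.0)
let observe ~lik : unit model = fun () -> ((), lik)

(* Generative model: infer a coin's bias from observed flips,
   with a uniform prior and a Bernoulli likelihood. *)
let coin (flips : bool list) : float model =
  let* p = sample (fun () -> Random.float 1.0) in
  let lik =
    List.fold_left (fun w b -> w *. (if b then p else 1. -. p)) 1.0 flips
  in
  let* () = observe ~lik in
  return p

(* Inference, decoupled from the model: draw n weighted samples and
   report the self-normalised weighted mean. *)
let posterior_mean ~n (m : float model) =
  let ws = List.init n (fun _ -> m ()) in
  List.fold_left (fun a (x, w) -> a +. (x *. w)) 0.0 ws
  /. List.fold_left (fun a (_, w) -> a +. w) 0.0 ws
```

The same `coin` model could be handed to an MCMC or variational backend without changes, which is the decoupling of inference from modelling that the proposal emphasises.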
+18
avsm/ideas_raft-consensus.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/raft-consensus\">Analysis of the Raft Consensus Protocol</a> <span>/ Jan 2012</span></h2><div><p>This is an idea proposed in 2012 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Heidi Howard</span>.</p>\n<p>The Paxos algorithm, despite being synonymous with distributed consensus for\na decade, is famously difficult to reason about and implement due to its\nnon-intuitive approach and underspecification. In response, this project\naimed to implement and evaluate a framework for constructing fault-tolerant\napplications, utilising the recently proposed Raft algorithm for distributed\nconsensus. Constructing a simulation framework for our implementation would\nenable us to evaluate the protocol on everything from understandability and\nefficiency to correctness and performance in diverse network environments.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/raft-consensus\">273 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#analysis-of-the-raft-consensus-protocol\"></a>Analysis of the Raft Consensus Protocol</h1>\n<p>This is an idea proposed in 2012 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Heidi Howard</span>.</p>\n<p>The Paxos algorithm, despite being synonymous with distributed consensus for\na decade, is famously difficult to reason about and implement due to its\nnon-intuitive approach and underspecification. In response, this project\naimed to implement and evaluate a framework for constructing fault-tolerant\napplications, utilising the recently proposed Raft algorithm for distributed\nconsensus. 
Constructing a simulation framework for our implementation would\nenable us to evaluate the protocol on everything from understandability and\nefficiency to correctness and performance in diverse network environments.</p>\n<p>In retrospect, the complexity of the project far exceeded initial expectations:\nreproducing research from a paper that was still under submission and was\nmodified regularly proved a big challenge alongside Raft's many subtleties.\nNevertheless, the project achieved optional extensions by using our work to\npropose a range of optimisations to the Raft protocol. The project successfully\nconducted a thorough analysis of the protocol and released to the community a\ntestbed for developing further optimisations and investigating optimal protocol\nparameters for real-world deployments.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li><a href=\"https://raft.github.io/raft.pdf\">In Search of an Understandable Consensus Algorithm</a>, Diego Ongaro and John Ousterhout</li>\n<li><a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation is available as <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-857.html\">UCAM-CL-TR-857</a> in the Cambridge Computer Laboratory technical report series. <span>Heidi Howard</span> continued work on Raft subsequent to submitting this project and published it later in the year as <a href=\"https://anil.recoil.org/papers/2014-sigops-raft\">Raft Refloated: Do We Have Consensus?</a>.</p>\n<p>You can watch <span>Heidi Howard</span> talk about her work in a Computerphile video from 2016:</p>\n\n<p><span>Heidi Howard</span> also continued to work on Raft and distributed consensus later:</p>",
+18
avsm/ideas_rag-evaluation-for-ce.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/rag-evaluation-for-ce\">Evaluating RAG pipelines for conservation evidence</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a good starter project, and has been <span>completed</span> by <a href=\"mailto:ri301@cam.ac.uk\">Radhika Iyer</a>. It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in finding and synthesising evidence for conservation interventions. Once we have this evidence, it needs to be synthesised into a form that can be used to inform policy and practice via natural language query interfaces. One way to do this is to use a <a href=\"https://anil.recoil.org/14768\">RAG (Retrieval Augmented Generation)</a> pipeline, which can automatically retrieve relevant information from a large corpus of documents, analyse it to extract key information relevant to CE, and generate a summary of the key findings.</p>\n<p>This project involved evaluating the performance of RAG pipelines for conservation evidence, comparing different models, configurations and benchmark sets, and identifying areas for improvement. The goal is to develop a set of best practices for using RAG pipelines in the context of conservation evidence, and to make recommendations for future work in this area.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/rag-evaluation-for-ce\">168 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#evaluating-rag-pipelines-for-conservation-evidence\"></a>Evaluating RAG pipelines for conservation evidence</h1>\n<p>This is an idea proposed in 2024 as a good starter project, and has been <span>completed</span> by <a href=\"mailto:ri301@cam.ac.uk\">Radhika Iyer</a>. 
It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>At the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project, we are interested in finding and synthesising evidence for conservation interventions. Once we have this evidence, it needs to be synthesised into a form that can be used to inform policy and practice via natural language query interfaces. One way to do this is to use a <a href=\"https://anil.recoil.org/14768\">RAG (Retrieval Augmented Generation)</a> pipeline, which can automatically retrieve relevant information from a large corpus of documents, analyse it to extract key information relevant to CE, and generate a summary of the key findings.</p>\n<p>This project involved evaluating the performance of RAG pipelines for conservation evidence, comparing different models, configurations and benchmark sets, and identifying areas for improvement. The goal is to develop a set of best practices for using RAG pipelines in the context of conservation evidence, and to make recommendations for future work in this area.</p>\n<p>A first preprint on this work titled "<a href=\"https://anil.recoil.org/papers/2024-ce-llm\">Careful design of Large Language Model pipelines enables expert-level retrieval of evidence-based information from syntheses and databases</a>" is now available.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li><a href=\"https://arxiv.org/html/2405.13622v1\">Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation</a>, arXiv:2405.13622v1, May 2024</li>\n</ul>",
+18
avsm/ideas_recording-nature.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/recording-nature\">Low-power sensing infrastructure for biodiversity</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a>. It is co-supervised with <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a>.</p>\n<p>In-situ sensing devices need to be deployed in remote environments for long\nperiods of time, and minimizing their power consumption is vital for maximising\nboth their operational lifetime and coverage.</p>\n<p>We are exploring the construction of a versatile multi-sensor device (initially\nbased around the ESP32 chipset) and designing an exceptionally low power\nconsumption model by using an on-device reinforcement learning scheduler that\ncan learn to cooperate with other nearby devices.</p>\n<p>Our prototype device setup for learning schedules for biodiversity monitoring\ndoes pretty well against a number of fixed schedules; the scheduler captures\nmore than 80% of events at less than 50% of the number of activations of the\nbest-performing fixed schedule. You can read more about this in\n<a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder: Sense Long and Prosper</a>.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#low-power-sensing-infrastructure-for-biodiversity\"></a>Low-power sensing infrastructure for biodiversity</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a>. 
It is co-supervised with <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a>.</p>\n<p>In-situ sensing devices need to be deployed in remote environments for long\nperiods of time, and minimizing their power consumption is vital for maximising\nboth their operational lifetime and coverage.</p>\n<p>We are exploring the construction of a versatile multi-sensor device (initially\nbased around the ESP32 chipset) and designing an exceptionally low power\nconsumption model by using an on-device reinforcement learning scheduler that\ncan learn to cooperate with other nearby devices.</p>\n<p>Our prototype device setup for learning schedules for biodiversity monitoring\ndoes pretty well against a number of fixed schedules; the scheduler captures\nmore than 80% of events at less than 50% of the number of activations of the\nbest-performing fixed schedule. You can read more about this in\n<a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder: Sense Long and Prosper</a>.</p>",
+18
avsm/ideas_rev-abm.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/rev-abm\">Reverse emulating agent-based models for policy simulation</a> <span>/ Jan 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://www.linkedin.com/in/pedro-marques-sousa/\">Pedro Sousa</a>. It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Governments increasingly rely on simulation tools to inform policy design. Agent-based models (ABMs) simulate complex systems to study the emergent phenomena of individual behaviours and interactions in agent populations. However, these ABMs force an iterative, time-consuming, unmethodical parameter tuning of key policy "levers" (or input parameters) to steer the model towards the envisioned outcomes. To unlock a more natural workflow, this project investigates <em>reverse emulation</em>, a novel approach that streamlines policy design using probabilistic machine learning to predict parameter values that yield the desired policy outcomes.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/rev-abm\">192 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#reverse-emulating-agent-based-models-for-policy-simulation\"></a>Reverse emulating agent-based models for policy simulation</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://www.linkedin.com/in/pedro-marques-sousa/\">Pedro Sousa</a>. It was co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>.</p>\n<p>Governments increasingly rely on simulation tools to inform policy design. Agent-based models (ABMs) simulate complex systems to study the emergent phenomena of individual behaviours and interactions in agent populations. 
However, these ABMs force an iterative, time-consuming, unmethodical parameter tuning of key policy "levers" (or input parameters) to steer the model towards the envisioned outcomes. To unlock a more natural workflow, this project investigates <em>reverse emulation</em>, a novel approach that streamlines policy design using probabilistic machine learning to predict parameter values that yield the desired policy outcomes.</p>\n<h1><a href=\"https://anil.recoil.org/#background-reading\"></a>Background reading</h1>\n<ul>\n<li>J. Dyer, P. Cannon, J. D. Farmer, and S. M. Schmon, "Black-box Bayesian inference for agent-based models", Journal of Economic Dynamics and Control, vol. 161, p. 104827, 2024.</li>\n<li>E. Frias-Martinez, G. Williamson, and V. Fr\u00edas-Mart\u00ednez, "An agent-based model of epidemic spread using human mobility and social network information," pp. 57\u201364, Oct 2011.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li>Publication to follow as it is currently being written up. The project was awarded the "<a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7228682518596603904/\">2024 Highly Commended M.Phil Project</a>" commendation from the Computer Science department.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#see-also\"></a>See Also</h2>\n<p>This project was a follow-up to one in the previous year by <span>Sharan Agrawal</span> on <a href=\"https://anil.recoil.org/ideas/differentiable-abm\">Scalable agent-based models for optimized policy design</a>.</p>",
+18
avsm/ideas_scaling-tls-trust.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/scaling-tls-trust\">Consolidating Trust for Client Groups that use TLS to Secure Connections</a> <span>/ Jan 2014</span></h2><div><p>This is an idea proposed in 2014 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Johann Beleites</span>. It was co-supervised with <span>David Sheets</span>.</p>\n<p>This project aimed to develop a framework that allows administrators to\ncentrally manage trust in CAs and certificates across a large number of\nclients. The framework should be responsive and changes in trust should not\nrequire any software updates or reboots of client devices. Further, no\ncooperation from CAs or domain owners should be necessary for a security gain.\nPerformance optimisations should be implemented such that it is usable on a\ndaily basis and this project could integrate with other existing attempts at\nimproving the TLS trust model.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/scaling-tls-trust\">167 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#consolidating-trust-for-client-groups-that-use-tls-to-secure-connections\"></a>Consolidating Trust for Client Groups that use TLS to Secure Connections</h1>\n<p>This is an idea proposed in 2014 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <span>Johann Beleites</span>. It was co-supervised with <span>David Sheets</span>.</p>\n<p>This project aimed to develop a framework that allows administrators to\ncentrally manage trust in CAs and certificates across a large number of\nclients. The framework should be responsive and changes in trust should not\nrequire any software updates or reboots of client devices. 
Further, no\ncooperation from CAs or domain owners should be necessary for a security gain.\nPerformance optimisations should be implemented such that it is usable on a\ndaily basis and this project could integrate with other existing attempts at\nimproving the TLS trust model.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2015-usenixsec-nqsb\">Not-Quite-So-Broken TLS</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#results\"></a>Results</h2>\n<p>A functioning framework dubbed "ConTrust" was implemented, allowing\nadministrators to centrally manage trust for TLS certificates. It can be\nresponsive (depending on the configuration) and does not require software\nupdates or reboots of client devices. Some means of authenticating certificates\nwere introduced \u2013 including a whitelist of trusted CAs. Caches were\nintroduced to improve performance; more optimisations\nwould have been possible but were not implemented due to prioritisation of other\nfeatures.</p>",
+18
avsm/ideas_sdms-with-cnns.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/sdms-with-cnns\">Species distribution modelling using CNNs</a> <span>/ Feb 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://github.com/emorris7\">Emily Morris</a>. It was co-supervised with <a href=\"https://coomeslab.org\">David Coomes</a>.</p>\n<p>The goal of this project is to compare the performance of <a href=\"https://biodiversityinformatics.amnh.org/open_source/maxent/\">MaxEnt</a> techniques to the performance of a CNN model for the task of species distribution\nmodeling.</p>\n<p>The CNN model will use remote sensing data as part of the input features. The remote sensing data we plan on using is a combination of LULC data (e.g. Dynamic World) and satellite imagery (Planet/Landsat 8/Sentinel 2).\nWe will also use more classical environmental variables from WorldClim and soil data.</p>\n<p>To evaluate it, we will focus on <a href=\"https://en.wikipedia.org/wiki/Protea\">proteas</a> for the species distribution modeling task. We have two observation data sets: the Protea Atlas and iNaturalist. The work for the CNN is largely based on the work done\nby <a href=\"https://europepmc.org/article/ppr/ppr533361\">Gillespie et al</a>, who present a model that takes in an RGB image and an\nembedding for environment variables and predicts which species are present in the image. This method performs multispecies presence modeling and the use of other species is somewhat central to the method. 
Including other species gives training examples which are pseudo-absences for some species, circumventing the issue of the lack of negative data.</p>\n<p>This project was conducted successfully, and presented at the <a href=\"https://www.climatechange.ai/events/neurips2024\">CCAI Workshop</a> at NeurIPS as '<a href=\"https://anil.recoil.org/papers/2024-sdm-sa\">Towards Scalable Deep Species Distribution Modelling using Global Remote Sensing</a>'.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#species-distribution-modelling-using-cnns\"></a>Species distribution modelling using CNNs</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://github.com/emorris7\">Emily Morris</a>. It was co-supervised with <a href=\"https://coomeslab.org\">David Coomes</a>.</p>\n<p>The goal of this project is to compare the performance of <a href=\"https://biodiversityinformatics.amnh.org/open_source/maxent/\">MaxEnt</a> techniques to the performance of a CNN model for the task of species distribution\nmodeling.</p>\n<p>The CNN model will use remote sensing data as part of the input features. The remote sensing data we plan on using is a combination of LULC data (e.g. Dynamic World) and satellite imagery (Planet/Landsat 8/Sentinel 2).\nWe will also use more classical environmental variables from WorldClim and soil data.</p>\n<p>To evaluate it, we will focus on <a href=\"https://en.wikipedia.org/wiki/Protea\">proteas</a> for the species distribution modeling task. We have two observation data sets: the Protea Atlas and iNaturalist. The work for the CNN is largely based on the work done\nby <a href=\"https://europepmc.org/article/ppr/ppr533361\">Gillespie et al</a>, who present a model that takes in an RGB image and an\nembedding for environment variables and predicts which species are present in the image. 
This method performs multispecies presence modeling and the use of other species is somewhat central to the method. Including other species gives training examples which are pseudo-absences for some species, circumventing the issue of the lack of negative data.</p>\n<p>This project was conducted successfully, and presented at the <a href=\"https://www.climatechange.ai/events/neurips2024\">CCAI Workshop</a> at NeurIPS as '<a href=\"https://anil.recoil.org/papers/2024-sdm-sa\">Towards Scalable Deep Species Distribution Modelling using Global Remote Sensing</a>'.</p>",
+18
avsm/ideas_sensor-fusion-vslam-forests.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/sensor-fusion-vslam-forests\">Making GPS accurate in dense forests using sensor fusion</a> <span>/ Aug 2020</span></h2><div><p>This is an idea proposed in 2020 as a good starter project, and has been <span>completed</span> by <a href=\"https://keshav123456.github.io\">Keshav Sivakumar</a>. It was co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a>.</p>\n<p>Current GPS solutions are either very expensive ($8k+) or have relatively poor accuracies (10m+) under dense forest canopy. This project explores how to determine our location accurately in a forest area where we travel by foot under canopy without a GPS signal.</p>\n<ul>\n<li>What low cost solutions exist to perform localisation under such circumstances?</li>\n<li>What are the rough accuracies of these solutions?</li>\n<li>What constraints and advantages do these solutions have (in terms of power, light, cost, etc)</li>\n</ul>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/sensor-fusion-vslam-forests\">231 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#making-gps-accurate-in-dense-forests-using-sensor-fusion\"></a>Making GPS accurate in dense forests using sensor fusion</h1>\n<p>This is an idea proposed in 2020 as a good starter project, and has been <span>completed</span> by <a href=\"https://keshav123456.github.io\">Keshav Sivakumar</a>. It was co-supervised with <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a>.</p>\n<p>Current GPS solutions are either very expensive ($8k+) or have relatively poor accuracies (10m+) under dense forest canopy. 
This project explores how to determine our location accurately in a forest area where we travel by foot under canopy without a GPS signal.</p>\n<ul>\n<li>What low cost solutions exist to perform localisation under such circumstances?</li>\n<li>What are the rough accuracies of these solutions?</li>\n<li>What constraints and advantages do these solutions have (in terms of power, light, cost, etc.)?</li>\n</ul>\n<p>We observe that a lot of SLAM algorithms exist these days, but most of the recent research is on optimizing for monocular cameras, whereas we have the luxury of using cameras built for this purpose. A lot of options also exist with regards to depth cameras/fish-eye cameras that specialize for localisation/mapping use cases. We chose the Intel T265 as it is part of a family of widely used products, and comes with a usable library (librealsense). It can also provide a good benchmark for base VSLAM; there is huge scope for greater accuracy by using depth cameras or LIDAR, but it is the cheapest, easiest solution among the current industry grade solutions. Interestingly, even the latest iPad Pro has LIDAR built-in now, so this is a solid approach!</p>\n<p>The project was completed successfully (remotely due to the pandemic), with details available in <a href=\"https://forests.notion.site/Keshav-Sivakumar-1fe07a2ebf0e4c318c50ac5e15bedae5\">the PDF writeup and slides</a>, and <a href=\"https://github.com/keshav123456/UROP2020\">code notebooks</a> on GitHub.</p>",
+18
avsm/ideas_sns.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/sns\">Spatial Name System</a> <span>/ Jan 2022</span></h2><div><p>This is an idea proposed in 2022 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>. It was co-supervised with <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>.</p>\n<p>The development of emerging classes of hardware such as Internet of Things\ndevices and Augmented Reality headsets has outpaced the development of Internet\ninfrastructure. We identify problems with latency, security and privacy in the\nglobal hierarchical distributed Domain Name System. To remedy this, we propose\nthe Spatial Name System, an alternative network architecture that relies on the\ninnate physicality of this paradigm. Utilizing a device\u2019s pre-existing unique\nidentifier, its location, allows us to identify devices locally based on their\nphysical presence. A naming system tailored to the physical world for\nubiquitous computing can enable reliable, low latency, secure and private\ncommunication.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/sns\">196 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#spatial-name-system\"></a>Spatial Name System</h1>\n<p>This is an idea proposed in 2022 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>. It was co-supervised with <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>.</p>\n<p>The development of emerging classes of hardware such as Internet of Things\ndevices and Augmented Reality headsets has outpaced the development of Internet\ninfrastructure. We identify problems with latency, security and privacy in the\nglobal hierarchical distributed Domain Name System. 
To remedy this, we propose\nthe Spatial Name System, an alternative network architecture that relies on the\ninnate physicality of this paradigm. Utilizing a device\u2019s pre-existing unique\nidentifier, its location, allows us to identify devices locally based on their\nphysical presence. A naming system tailored to the physical world for\nubiquitous computing can enable reliable, low latency, secure and private\ncommunication.</p>\n<p>This dissertation explores the hypothesis that:</p>\n<blockquote>\n<p>We have the hardware and software to support low latency augmented reality\ninteractions, but the current network architecture is inadequate to support\ninterconnecting them. We need a Spatial Name System that can map physical\ndevice locations to network addresses to overcome this limitation and unlock\nthe potential of augmented reality.</p>\n</blockquote>\n<p>An extended version of this was published in HotNets 22 in <a href=\"https://anil.recoil.org/papers/2023-hotnets-sns\">Where on Earth is the Spatial Name System?</a>.\nThe MPhil dissertation is available <a href=\"https://ryan.freumh.org/papers/2022-mphil-sns.pdf\">online as a\nPDF</a>. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> has continued\non to work on his PhD on <a href=\"https://anil.recoil.org/ideas/interspatial-networking\">Interspatial Networking with DNS</a> as well!</p>",
+18
avsm/ideas_soapp-privgrind.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/soapp-privgrind\">Control flow analysis for privilege separation</a> <span>/ Aug 2011</span></h2><div><p>This is an idea proposed in 2011 as a good starter project, and has been <span>completed</span> by <a href=\"https://uk.linkedin.com/in/hardingcj\">Chris Harding</a> and <a href=\"https://research.google/people/ross-mcilroy/\">Ross McIlroy</a>. It was co-supervised with <a href=\"http://www.watson.org/~robert/\">Robert M Watson</a>.</p>\n<p>In the summer of 2011, we hosted <a href=\"https://uk.linkedin.com/in/hardingcj\">Chris Harding</a> and <a href=\"https://research.google/people/ross-mcilroy/\">Ross McIlroy</a> to do an\ninternship in the Computer Lab working just as the\n<a href=\"https://www.cl.cam.ac.uk/research/security/ctsrd/soaap/\">CTSRD/SOAPP</a> project\nkicked off.\n<a href=\"https://research.google/people/ross-mcilroy/\">Ross McIlroy</a> built a tool called\n<a href=\"https://github.com/rmcilroy/Privgrind\">privgrind</a> using valgrind, which tracks,\nfor all data addresses touched, the list of functions that wrote or read from\nthe address and how much they wrote or read. 
<a href=\"https://uk.linkedin.com/in/hardingcj\">Chris Harding</a> then built a\nvisualiser for this that output the complex control flow graph that results\nfrom this as a <a href=\"https://github.com/chris838/privsep-visualiser\">privsep-visualiser</a>\nwhich would then form a guideline for future compartmentalisation activities.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/soapp-privgrind\">139 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#control-flow-analysis-for-privilege-separation\"></a>Control flow analysis for privilege separation</h1>\n<p>This is an idea proposed in 2011 as a good starter project, and has been <span>completed</span> by <a href=\"https://uk.linkedin.com/in/hardingcj\">Chris Harding</a> and <a href=\"https://research.google/people/ross-mcilroy/\">Ross McIlroy</a>. It was co-supervised with <a href=\"http://www.watson.org/~robert/\">Robert M Watson</a>.</p>\n<p>In the summer of 2011, we hosted <a href=\"https://uk.linkedin.com/in/hardingcj\">Chris Harding</a> and <a href=\"https://research.google/people/ross-mcilroy/\">Ross McIlroy</a> to do an\ninternship in the Computer Lab working just as the\n<a href=\"https://www.cl.cam.ac.uk/research/security/ctsrd/soaap/\">CTSRD/SOAPP</a> project\nkicked off.\n<a href=\"https://research.google/people/ross-mcilroy/\">Ross McIlroy</a> built a tool called\n<a href=\"https://github.com/rmcilroy/Privgrind\">privgrind</a> using valgrind, which tracks,\nfor all data addresses touched, the list of functions that wrote or read from\nthe address and how much they wrote or read. 
<a href=\"https://uk.linkedin.com/in/hardingcj\">Chris Harding</a> then built a\nvisualiser for this that output the complex control flow graph that results\nfrom this as a <a href=\"https://github.com/chris838/privsep-visualiser\">privsep-visualiser</a>\nwhich would then form a guideline for future compartmentalisation activities.</p>\n<p>\n<img alt=\"CFG of OpenBSD&apos;s syslogd\" src=\"https://anil.recoil.org/images/syslogd-privgrind-cfg.webp\" title=\"CFG of OpenBSD&apos;s syslogd\">\nCFG of OpenBSD's syslogd</p>\n<p>The results of this work only got partly written up, despite being very cool\n(we all got busy with other projects). There is a workshop paper on <a href=\"https://anil.recoil.org/papers/2012-ahans-soapp\">Exploring Compartmentalisation Hypotheses with SOAAP</a>\nwhich covers some of the work, and the wider CHERI/CTSRD project has done plenty\nmore since.</p>",
+18
avsm/ideas_spatial-summarisation-of-llms.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/spatial-summarisation-of-llms\">Spatial and multi-modal extraction from conservation literature</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>.</p>\n<p>The <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> database contains information on numerous conservation actions and\ntheir supporting evidence. We also have access to a large corpus of academic\nliterature detailing species presence and threats which we have assembled in\nCambridge in collaboration with the various journal publishers.</p>\n<p>This MPhil project aims to combine these published literature resources with\ngeographic information to propose conservation interventions. The goal is to\nidentify actions that are likely to be effective based on prior evidence and\nhave the potential to produce significant gains in biodiversity. This approach\nshould then enhance the targeting and impact of future conservation efforts and\nmake them more evidence driven.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/spatial-summarisation-of-llms\">298 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#spatial-and-multi-modal-extraction-from-conservation-literature\"></a>Spatial and multi-modal extraction from conservation literature</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and has <span>expired</span>. 
It may be co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>.</p>\n<p>The <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> database contains information on numerous conservation actions and\ntheir supporting evidence. We also have access to a large corpus of academic\nliterature detailing species presence and threats, which we have assembled in\nCambridge in collaboration with the various journal publishers.</p>\n<p>This MPhil project aims to combine these published literature resources with\ngeographic information to propose conservation interventions. The goal is to\nidentify actions that are likely to be effective based on prior evidence and\nhave the potential to produce significant gains in biodiversity. This approach\nshould then enhance the targeting and impact of future conservation efforts and\nmake them more evidence-driven.</p>\n<p>To realize this project, several key components need to be developed, each of\nwhich could constitute an MPhil project in its own right:</p>\n<ul>\n<li>Firstly, a pipeline needs to be constructed to <strong>extract actions, threats,\nand species information from the literature</strong>, aligning with the Conservation\nEvidence taxonomy. This would involve natural language processing and\ninformation extraction techniques, possibly involving LLMs.</li>\n<li>Secondly, the project requires <strong>multimodal models capable of analyzing both text\nand visual elements</strong> (such as maps and graphs) in scientific papers to identify\nrelevant conservation data.</li>\n<li>Thirdly, a predictive model needs to be developed to <strong>assess the potential efficacy\nof conservation interventions</strong>. 
This model would be based on the Conservation\nEvidence database and should provide reasoning for its predictions, potentially\nutilizing techniques in explainable AI and causal inference.</li>\n</ul>\n<p>If you're interested in applying machine learning and LLM techniques to global\nconservation, then get in touch about the above or any other ideas you might have.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li>The <a href=\"https://docs.ragas.io/en/stable/index.html\">Ragas framework</a> for RAG evaluation</li>\n<li><a href=\"https://arxiv.org/abs/2406.02524v2\">CheckEmbed: Effective Verification of LLM Solutions to Open Ended Tasks</a>, arxiv:2406.02524v2, June 2024</li>\n<li><a href=\"https://arxiv.org/abs/2210.00045\">Calibrating Sequence Likelihood Improves Conditional Language Generation</a>, arxiv:2210.00045, September 2022</li>\n</ul>",
+18
avsm/ideas_ssl-for-geospatial-tasks.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/ssl-for-geospatial-tasks\">Foundation models for complex geospatial tasks</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a>. It is co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://coomeslab.org\">David Coomes</a>.</p>\n<p>Self-supervised learning (SSL) represents a shift in machine learning that\nenables versatile <em>pretrained</em> models to leverage the complex relationships\npresent in dense (oftentimes multispectral and multimodal) remote sensing data.\nThis in turn can accelerate how we address sophisticated downstream geospatial\ntasks for which current methodologies prove insufficient, ranging from land cover\nclassification to urban building segmentation to crop yield measurement and\nwildfire forecasting.</p>\n<p>This PhD project explores the question of how current SSL methodologies may be\naltered to tackle remote sensing tasks, and also how to make them amenable\nto incremental time-series generation as new data regularly comes in from\nsensing instruments.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#foundation-models-for-complex-geospatial-tasks\"></a>Foundation models for complex geospatial tasks</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a>. 
It is co-supervised with <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://coomeslab.org\">David Coomes</a>.</p>\n<p>Self-supervised learning (SSL) represents a shift in machine learning that\nenables versatile <em>pretrained</em> models to leverage the complex relationships\npresent in dense (oftentimes multispectral and multimodal) remote sensing data.\nThis in turn can accelerate how we address sophisticated downstream geospatial\ntasks for which current methodologies prove insufficient, ranging from land cover\nclassification to urban building segmentation to crop yield measurement and\nwildfire forecasting.</p>\n<p>This PhD project explores the question of how current SSL methodologies may be\naltered to tackle remote sensing tasks, and also how to make them amenable\nto incremental time-series generation as new data regularly comes in from\nsensing instruments.</p>",
+18
avsm/ideas_tardis.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/tardis\">Privacy preserving emissions disclosure techniques</a> <span>/ Jan 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://www.cst.cam.ac.uk/people/psjm3\">Jessica Man</a>. It is co-supervised with <a href=\"https://martin.kleppmann.com\">Martin Kleppmann</a>.</p>\n<p>Customers of online services may want to take carbon emissions into account\nwhen deciding which service to use, but are currently hindered by a lack of\nreliable emissions data that is comparable across services. Calculating\naccurate carbon emissions across a cloud computing pipeline involves a number\nof stakeholders, none of whom are incentivised to accurately report their\nemissions for competitive reasons.</p>\n<p>This PhD explores mechanisms to support verifiable and privacy-preserving\nemissions reporting across a chain of energy suppliers, cloud data centres,\nvirtual machine hosting services providers and cloud services providers, which\nare ultimately passed through to APIs used by customers. We hypothesise that adding\nverifiable and composable emissions transparency to cloud computing\narchitectures enables providers to compete on the basis of sustainability,\nresulting in demand-side pressure on cloud services to shift to renewable\nenergy sources.</p>\n<p>We published a workshop paper on this topic in <a href=\"https://anil.recoil.org/papers/2024-loco-emissions\">Emission Impossible: privacy-preserving carbon emissions claims</a>.</p>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#privacy-preserving-emissions-disclosure-techniques\"></a>Privacy preserving emissions disclosure techniques</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://www.cst.cam.ac.uk/people/psjm3\">Jessica Man</a>. 
It is co-supervised with <a href=\"https://martin.kleppmann.com\">Martin Kleppmann</a>.</p>\n<p>Customers of online services may want to take carbon emissions into account\nwhen deciding which service to use, but are currently hindered by a lack of\nreliable emissions data that is comparable across services. Calculating\naccurate carbon emissions across a cloud computing pipeline involves a number\nof stakeholders, none of whom are incentivised to accurately report their\nemissions for competitive reasons.</p>\n<p>This PhD explores mechanisms to support verifiable and privacy-preserving\nemissions reporting across a chain of energy suppliers, cloud data centres,\nvirtual machine hosting services providers and cloud services providers, which\nare ultimately passed through to APIs used by customers. We hypothesise that adding\nverifiable and composable emissions transparency to cloud computing\narchitectures enables providers to compete on the basis of sustainability,\nresulting in demand-side pressure on cloud services to shift to renewable\nenergy sources.</p>\n<p>We published a workshop paper on this topic in <a href=\"https://anil.recoil.org/papers/2024-loco-emissions\">Emission Impossible: privacy-preserving carbon emissions claims</a>.</p>",
+18
avsm/ideas_tracing-hdl-with-effects.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/tracing-hdl-with-effects\">A hardware description language using OCaml effects</a> <span>/ Mar 2025</span></h2><div><p>This is an idea proposed in 2025 as a Cambridge Computer Science Part III or MPhil project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and <a href=\"https://github.com/andrewray\">Andy Ray</a>.</p>\n<p>Programming FPGAs using functional programming languages is a very good fit for\nthe problem domain. OCaml has the <a href=\"https://anil.recoil.org/notes/fpgas-hardcaml\">HardCaml ecosystem</a> to\nexpress hardware designs in OCaml, make generic designs using the power of the\nlanguage, then simulate designs and convert them to Verilog or VHDL.</p>\n<p>HardCaml is very successfully used in production at places like <a href=\"https://janestreet.com\">Jane\nStreet</a>, but needs quite a lot of prerequisite knowledge\nabout the full OCaml language. In particular, it makes very heavy use of the <a href=\"https://github.com/janestreet/hardcaml/blob/master/docs/hardcaml_interfaces.md\">module\nsystem</a> in\norder to build up the circuit description as an OCaml data structure.</p>\n<p>Instead of building up a circuit as the output of the OCaml program, it would\nbe very cool if we could <em>directly</em> implement the circuit as OCaml code by\nevaluating it. This is an approach that works very successfully in the <a href=\"https://github.com/clash-lang/clash-compiler\">Clash\nHaskell HDL</a>, as described in this\n<a href=\"https://essay.utwente.nl/59482/1/scriptie_C_Baaij.pdf\">thesis</a>. 
Clash uses a\nnumber of advanced Haskell type-level features to encode fixed-length vectors\n(very convenient for hardware description) and has an interactive REPL that\nallows for exploration without requiring a separate test bench.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/tracing-hdl-with-effects\">296 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#a-hardware-description-language-using-ocaml-effects\"></a>A hardware description language using OCaml effects</h1>\n<p>This is an idea proposed in 2025 as a Cambridge Computer Science Part III or MPhil project, and is <span>available</span> for being worked on. It may be co-supervised with <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and <a href=\"https://github.com/andrewray\">Andy Ray</a>.</p>\n<p>Programming FPGAs using functional programming languages is a very good fit for\nthe problem domain. OCaml has the <a href=\"https://anil.recoil.org/notes/fpgas-hardcaml\">HardCaml ecosystem</a> to\nexpress hardware designs in OCaml, make generic designs using the power of the\nlanguage, then simulate designs and convert them to Verilog or VHDL.</p>\n<p>HardCaml is very successfully used in production at places like <a href=\"https://janestreet.com\">Jane\nStreet</a>, but needs quite a lot of prerequisite knowledge\nabout the full OCaml language. In particular, it makes very heavy use of the <a href=\"https://github.com/janestreet/hardcaml/blob/master/docs/hardcaml_interfaces.md\">module\nsystem</a> in\norder to build up the circuit description as an OCaml data structure.</p>\n<p>Instead of building up a circuit as the output of the OCaml program, it would\nbe very cool if we could <em>directly</em> implement the circuit as OCaml code by\nevaluating it. 
This is an approach that works very successfully in the <a href=\"https://github.com/clash-lang/clash-compiler\">Clash\nHaskell HDL</a>, as described in this\n<a href=\"https://essay.utwente.nl/59482/1/scriptie_C_Baaij.pdf\">thesis</a>. Clash uses a\nnumber of advanced Haskell type-level features to encode fixed-length vectors\n(very convenient for hardware description) and has an interactive REPL that\nallows for exploration without requiring a separate test bench.</p>\n<p>The question for this project is whether the new <a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff\">effect handlers</a>\nin OCaml 5.0 might be suitable for using OCaml as a host language for a tracing-style\nhardware description language. We would explore several elements using OCaml 5:</p>\n<ul>\n<li>using effects for control-flow memoisation (see <a href=\"https://github.com/ocaml-multicore/effects-examples/blob/master/multishot/memo.ml\">the example</a>)</li>\n<li>restricting arbitrary recursion using effect handlers</li>\n<li>ergonomic ways of encoding fixed-length vectors</li>\n</ul>\n<p>This project will require a deep interest in programming language design and implementation,\nand an enthusiasm for learning more about digital hardware. There are quite a few good\n<a href=\"https://anil.recoil.org/ideas/computational-storage-for-vector-dbs\">use cases</a> for using heterogeneous hardware like FPGAs these days.\nThere's a great <a href=\"https://signalsandthreads.com/programmable-hardware/\">Signals and Threads episode</a> on\nprogrammable hardware with <a href=\"https://github.com/andrewray\">Andy Ray</a> that should give you more useful background knowledge as well.</p>",
+18
avsm/ideas_urban-vegetation.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/urban-vegetation\">The role of urban vegetation in human health</a> <span>/ Jan 2023</span></h2><div><p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a>. It is co-supervised with <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\">Ronita Bardhan</a>.</p>\n<p>Cities around the globe have experienced unprecedented growth in recent years,\nbecoming economic, cultural, and social hubs for human activity.\nRapid urbanisation has transformed the physical landscape and significantly\naltered local climates, with predictions stating that cities will harbour more\nthan 70% of the population by the middle of the 21st century. This has also\nchanged the climatic variables to which humans are most directly exposed.\nCombining global climatic changes with localised human activities has created\nnew challenges that cities must solve to be sustainable homes for humanity in\nthe coming decades.</p>\n<p>Given the complexity of building sustainable cities and the breadth and variety\nof available information, it is crucial to use data-driven approaches for urban\nplanning. Urban environments have become humanity's home in the past century,\nand they will play a key role in shaping the culture, environment and society\nof the 21st century. Moreover, due to how cities have been built historically\nand how their urban structure reflects social and economic conditions, it is\nessential to address the challenge of shaping cities into a more sustainable\nand equal future regarding the environment and human health. 
In particular,\ngreen spaces and trees have been regarded as one of the most crucial\ninterventions in cities because of their ecosystem services.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/urban-vegetation\">281 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#the-role-of-urban-vegetation-in-human-health\"></a>The role of urban vegetation in human health</h1>\n<p>This is an idea proposed in 2023 as a Cambridge Computer Science PhD topic, and is currently <span>being worked on</span> by <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a>. It is co-supervised with <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\">Ronita Bardhan</a>.</p>\n<p>Cities around the globe have experienced unprecedented growth in recent years,\nbecoming economic, cultural, and social hubs for human activity.\nRapid urbanisation has transformed the physical landscape and significantly\naltered local climates, with predictions stating that cities will harbour more\nthan 70% of the population by the middle of the 21st century. This has also\nchanged the climatic variables to which humans are most directly exposed.\nCombining global climatic changes with localised human activities has created\nnew challenges that cities must solve to be sustainable homes for humanity in\nthe coming decades.</p>\n<p>Given the complexity of building sustainable cities and the breadth and variety\nof available information, it is crucial to use data-driven approaches for urban\nplanning. Urban environments have become humanity's home in the past century,\nand they will play a key role in shaping the culture, environment and society\nof the 21st century. Moreover, due to how cities have been built historically\nand how their urban structure reflects social and economic conditions, it is\nessential to address the challenge of shaping cities into a more sustainable\nand equal future regarding the environment and human health. 
In particular,\ngreen spaces and trees have been regarded as one of the most crucial\ninterventions in cities because of their ecosystem services.</p>\n<p>This PhD project aims to model the role of vegetation in regulating urban\nclimates and improving human health, using several sources of information,\nincluding weather and climate data, remote sensing products and census and\nsurvey data (socio-economic and health indicators).</p>\n<ul>\n<li>Read more in the first abstract: <a href=\"https://anil.recoil.org/papers/2024-green-urban-eq\">Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications</a></li>\n<li>There will be a talk at <a href=\"https://www.conftool.pro/biospace25/sessions.php\">Biospace 2025</a> at the European Space Agency from <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> and <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> in Feb 2025</li>\n</ul>",
+18
avsm/ideas_urls-with-provenance.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/urls-with-provenance\">Towards reproducible URLs with provenance</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Vurls are an attempt to add versioning to URI resolution. For example, what should happen when we request <code>https://doi.org/10.1109/SASOW.2012.14</code> and how do we track the chain of events that leads to an answer coming back? The prototype <a href=\"https://github.com/quantifyearth/vurl\">vurl</a> library written in OCaml outputs the following:</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/urls-with-provenance\">323 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#towards-reproducible-urls-with-provenance\"></a>Towards reproducible URLs with provenance</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and has <span>expired</span>. It may be co-supervised with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>.</p>\n<p>Vurls are an attempt to add versioning to URI resolution. For example, what should happen when we request <code>https://doi.org/10.1109/SASOW.2012.14</code> and how do we track the chain of events that leads to an answer coming back? 
The prototype <a href=\"https://github.com/quantifyearth/vurl\">vurl</a> library written in OCaml outputs the following:</p>\n<pre><code># Eio_main.run @@ fun env ->\n Vurl_eio.with_default ~net:env#net env#cwd @@ fun () ->\n let vurl = Vurl.of_uri "https://doi.org/10.1109/SASOW.2012.14" in\n let vurl, file = Vurl.file vurl in\n Vurl.pp Format.std_formatter vurl;;\n\n{\n "intentional_uri": "https://doi.org/10.1109/SASOW.2012.14",\n "segments": [\n {\n "uri": "file:./_data/document-6498375",\n "cid": "bag5qgeraipjyvov4axsmb4pktfhmleqi4oc2lno5if6f6wjyq37w4ktncvxq"\n },\n {\n "uri": "https://ieeexplore.ieee.org/document/6498375/",\n "cid": "bag5qgeraipjyvov4axsmb4pktfhmleqi4oc2lno5if6f6wjyq37w4ktncvxq"\n },\n {\n "uri": "http://ieeexplore.ieee.org/document/6498375/",\n "cid": "bag5qgerap5iaobunfnlovfzv4jeq2ygp6ltszlrreaskyh3mseky5osh2boq"\n }\n ]\n}\n</code></pre>\n<p>The <code>intentional_uri</code> is the original URI, and the <code>segments</code> are the different versions of the document as tracked through HTTP redirects and so on. The <code>cid</code> is a content identifier that is a hash of the content retrieved in that snapshot. The <code>file</code> is the local file that the URI resolves to.</p>\n<p>This project will build on the vurl concept to produce a practical implementation that integrates it into a popular HTTP library (in any language, but Python or OCaml are two good starts), and also builds a simple proxy service that can be used to resolve these URLs. 
The web service should be able to take a normal URL and return its content at that point in time, together with a vurl representing the complete state of the protocol traffic; it should also be able to take a vurl and return the diff between two versions of the content.</p>\n<p>Once successful, the project could also explore what more compact representations of the vurls would look like, and how to integrate them into existing web infrastructure.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li><a href=\"https://github.com/quantifyearth/vurl\">https://github.com/quantifyearth/vurl</a> has some prototype code.</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs\">Uncertainty at scale: how CS hinders climate research</a> has relevant background reading on some of the types of diffs that would be useful in a geospatial context.</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">Planetary computing for data-driven environmental policy-making</a> covers the broader data processing pipelines we need to integrate into.</li>\n</ul>",
+18
avsm/ideas_validating-anti-poaching-predictions.json
···+"title": "Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas",+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/validating-anti-poaching-predictions\">Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas</a> <span>/ Jun 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a>. It is co-supervised with <a href=\"https://charlesemogor.com\">Charles Emogor</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>.</p>\n<p>Biodiversity is declining at an unprecedented rate, underscoring the critical\nrole of protected areas (PAs) in conserving threatened species and ecosystems.\nYet, many of these are increasingly dismissed as "paper parks" due to poor\nmanagement.\nPark rangers play a vital role in PA effectiveness by detecting and potentially\ndeterring illegal activities. However, limited funding for PA management has\nled to low patrol frequency and detection rates, reducing the overall deterrent\neffect of ranger efforts. This resource scarcity often results in\nnon-systematic patrol strategies, which are sub-optimal given that illegal\nhunters tend to be selective in where and when they operate.</p>\n<p>The situation is\npoised to become more challenging as countries expand PA coverage under the\nKunming-Montreal Global Biodiversity Framework\u2014aiming to increase global PA\narea from 123 million km2 to 153 million km2 by 2030.\nWithout a substantial boost in enforcement capacity, both existing and newly\ndesignated PAs will remain vulnerable. 
Continued overexploitation of wildlife\nthreatens not only species survival but also ecosystem integrity and the\nwell-being of local communities who rely on wildlife for food and income.</p>\n<p>This project aims to combine <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">data from rangers</a> in multiple African protected\nareas and hunters around a single protected area (Nigeria) to improve the\ndeterrence effect of ranger patrols by optimising ranger efforts and to provide\ninformation on the economic impacts of improved ranger patrols on community\nlivelihoods and well-being. We plan to deploy our models to rangers in the\nfield via <a href=\"https://smartconservationtools.org\">SMART</a>, which is used in > 1000\nPAs globally to facilitate monitoring and data collection during patrols.</p>\n<p>The two main aims are to:</p>\n<ol>\n<li>develop an accessibility layer using long-term ranger-collected data</li>\n<li>validate the results of this layer, as well as those from other <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">models developed</a>, using ranger insights.</li>\n</ol>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/validating-anti-poaching-predictions\">334 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#validating-predictions-with-ranger-insights-to-enhance-anti-poaching-patrol-strategies-in-protected-areas\"></a>Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a>. 
It is co-supervised with <a href=\"https://charlesemogor.com\">Charles Emogor</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>.</p>\n<p>Biodiversity is declining at an unprecedented rate, underscoring the critical\nrole of protected areas (PAs) in conserving threatened species and ecosystems.\nYet, many of these are increasingly dismissed as "paper parks" due to poor\nmanagement.\nPark rangers play a vital role in PA effectiveness by detecting and potentially\ndeterring illegal activities. However, limited funding for PA management has\nled to low patrol frequency and detection rates, reducing the overall deterrent\neffect of ranger efforts. This resource scarcity often results in\nnon-systematic patrol strategies, which are sub-optimal given that illegal\nhunters tend to be selective in where and when they operate.</p>\n<p>The situation is\npoised to become more challenging as countries expand PA coverage under the\nKunming-Montreal Global Biodiversity Framework\u2014aiming to increase global PA\narea from 123 million km2 to 153 million km2 by 2030.\nWithout a substantial boost in enforcement capacity, both existing and newly\ndesignated PAs will remain vulnerable. Continued overexploitation of wildlife\nthreatens not only species survival but also ecosystem integrity and the\nwell-being of local communities who rely on wildlife for food and income.</p>\n<p>This project aims to combine <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">data from rangers</a> in multiple African protected\nareas and hunters around a single protected area (Nigeria) to improve the\ndeterrence effect of ranger patrols by optimising ranger efforts and to provide\ninformation on the economic impacts of improved ranger patrols on community\nlivelihoods and well-being. 
We plan to deploy our models to rangers in the\nfield via <a href=\"https://smartconservationtools.org\">SMART</a>, which is used in > 1000\nPAs globally to facilitate monitoring and data collection during patrols.</p>\n<p>The two main aims are to:</p>\n<ol>\n<li>develop an accessibility layer using long-term ranger-collected data</li>\n<li>validate the results of this layer, as well as those from other <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">models developed</a>, using ranger insights.</li>\n</ol>\n<p><em>This work involves collaborating with the Wildlife Conservation Society (WCS)\nNigeria team and rangers from Cross River National Park\u2014who are already active\ncollaborators in this project. They have provided ranger patrol data,\ncontributed valuable on-the-ground perspectives for interpreting the data, and\nengaged with preliminary model outputs.</em></p>",
+18
avsm/ideas_version-control-matrix.json
+18
avsm/ideas_version-control-matrix.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/version-control-matrix\">Decentralised Capability-based Code Collaboration using Matrix</a> <span>/ Jan 2022</span></h2><div><p>This is an idea proposed in 2022 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a>.</p>\n<p>In 2005, due to licensing disputes, the team behind Linux parted ways with\ntheir proprietary source management tool BitKeeper, and needed a new solution.\nThis prompted the development of Git, an open-source decentralised version\ncontrol system (DVCS), which was soon used to manage the source code of Linux.\nContributions were submitted as patch files, which contained just the\ndifferences that the contribution made, to an email list, which were reviewed\nand applied to the central Git repository for Linux.</p>\n<p>Git grew in popularity and other projects started using it to manage their\nsource code. Then, in 2008, the GitHub.com platform launched, providing Git\nrepository hosting alongside other project management tools. 
Notably, GitHub\nfacilitates "pull requests", where contributors fork the repository, make\nchanges to their fork, and then request that their changes be merged back into\nthe central repository.\nAs of 2023, GitHub hosts over 364 million repositories and is the most popular\nversion control platform for both personal and professional use, followed by\nGitLab and BitBucket, which are all centralised version control platforms (CVCPs).</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/version-control-matrix\">386 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#decentralised-capability-based-code-collaboration-using-matrix\"></a>Decentralised Capability-based Code Collaboration using Matrix</h1>\n<p>This is an idea proposed in 2022 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a>.</p>\n<p>In 2005, due to licensing disputes, the team behind Linux parted ways with\ntheir proprietary source management tool BitKeeper, and needed a new solution.\nThis prompted the development of Git, an open-source decentralised version\ncontrol system (DVCS), which was soon used to manage the source code of Linux.\nContributions were submitted to an email list as patch files, which contained just the\ndifferences that the contribution made; these were reviewed\nand applied to the central Git repository for Linux.</p>\n<p>Git grew in popularity and other projects started using it to manage their\nsource code. Then, in 2008, the GitHub.com platform launched, providing Git\nrepository hosting alongside other project management tools. 
Notably, GitHub\nfacilitates "pull requests", where contributors fork the repository, make\nchanges to their fork, and then request that their changes be merged back into\nthe central repository.\nAs of 2023, GitHub hosts over 364 million repositories and is the most popular\nversion control platform for both personal and professional use, followed by\nGitLab and BitBucket, which are all centralised version control platforms (CVCPs).</p>\n<p>Git is decentralised by design, meaning that repository mirrors are easily made\nand maintained, and development and collaboration can continue even when\ncentral servers experience downtime or data loss. However, the project\nmanagement tools that are provided by CVCPs are not decentralised, meaning\ndowntime can grind many parts of development to a halt, and data loss could set\nprojects back months.</p>\n<p>This project demonstrates that project management tools, such as\ncontribution-tracking, do not need to be centralised, siloed, or proprietary,\nbut can instead be decentralised, open-source, interoperable, and redundant,\nso that developers can spend more time developing. It does this by routing collaboration\nrequests (such as patch exchange) over the Matrix communications protocol rather\nthan a centralised service. When developers wish to synchronise, they create\na Matrix channel on a variety of federated servers. When they wish to discuss\na patch, they use the messaging facilities in Matrix to revise changes. Finally,\npatches can be applied directly to a remote repository by creating a modified\nversion of <code>git-send-email</code> to work with the Matrix protocol.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li>Drew DeVault. 
<a href=\"https://drewdevault.com/2018/07/23/Git-is-already-distributed.html\">Git is already federated & decentralised</a></li>\n<li><a href=\"https://matrix.org\">https://matrix.org</a></li>\n<li><a href=\"https://git-scm.com/docs/git-send-email\">git-send-email</a>.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The dissertation PDF isn't available publically but\nshould be in the Cambridge Computer Lab archives somewhere or\non request from <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a>.</p>",
+18
avsm/ideas_void-processes.json
+18
avsm/ideas_void-processes.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/void-processes\">Void Processes: Minimising privilege by default</a> <span>/ Jan 2021</span></h2><div><p>This is an idea proposed in 2021 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://blog.hillion.co.uk\">Jake Hillion</a>.</p>\n<p>Void processes intend to make it easier for all developers to produce\neffectively privilege separated applications. The project has two primary\ngoals: show the merits of starting from zero privilege, and provide the\nutilities to make this feasible for the average developer.</p>\n<p>Building void processes involves first reliably removing all privilege from a\nprocess then systematically adding back in what is required, and no more. This\nproject utilises Linux namespaces to revoke privilege from an application,\nshowing how this can be done and why its easier in some domains than others.\nIt then shows how to inject sufficient privilege for applications to perform useful\nwork, developing new APIs that are friendly for privilege separation. These\nelements compose a shim called the "void orchestrator", a framework for\nrestricting Linux processes.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/void-processes\">158 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#void-processes-minimising-privilege-by-default\"></a>Void Processes: Minimising privilege by default</h1>\n<p>This is an idea proposed in 2021 as a Cambridge Computer Science Part III or MPhil project, and has been <span>completed</span> by <a href=\"https://blog.hillion.co.uk\">Jake Hillion</a>.</p>\n<p>Void processes intend to make it easier for all developers to produce\neffectively privilege separated applications. 
The project has two primary\ngoals: show the merits of starting from zero privilege, and provide the\nutilities to make this feasible for the average developer.</p>\n<p>Building void processes involves first reliably removing all privilege from a\nprocess, then systematically adding back in what is required, and no more. This\nproject utilises Linux namespaces to revoke privilege from an application,\nshowing how this can be done and why it's easier in some domains than others.\nIt then shows how to inject sufficient privilege for applications to perform useful\nwork, developing new APIs that are friendly for privilege separation. These\nelements compose a shim called the "void orchestrator", a framework for\nrestricting Linux processes.</p>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<ul>\n<li>The dissertation is available as a <a href=\"https://blog.hillion.co.uk/posts/void-processes/dissertation/jsh77-dissertation.pdf\">PDF</a>, with associated <a href=\"https://blog.hillion.co.uk/posts/void-processes/dissertation/\">blog post</a> and <a href=\"https://github.com/JakeHillion/void-processes\">LaTeX source</a>.</li>\n<li>The source code to the void orchestrator prototype is at <a href=\"https://github.com/JakeHillion/void-orchestrator\">jakehillion/void-orchestrator</a>.</li>\n</ul>",
+18
avsm/ideas_walkability-for-osm.json
+18
avsm/ideas_walkability-for-osm.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/walkability-for-osm\">Enhancing Navigation Algorithms with Semantic Embeddings</a> <span>/ Aug 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>.</p>\n<p><a href=\"https://theory.stanford.edu/~amitp/GameProgramming/AStarComparison.html\">Pathfinding</a> algorithms used in modern navigation systems utilize a plethora of different\ngeospatial data, such as from <a href=\"https://www.openstreetmap.org/\">OpenStreetMap</a>.\nNevertheless, they often operate under one-size-fits-all assumptions and the\nsimple objective of minimizing the anticipated travel time.</p>\n<p>By leveraging vectorized geospatial descriptions, this project aims to build a\nframework for finding <a href=\"https://www.cnu.org/publicsquare/2019/01/10/walkability-indexes-are-flawed-lets-find-better-method1\">walking\nroutes</a>\nthat seek to achieve much more customizable objectives. Given a set of specific\nrequirements and preferences ("avoid dark streets at night"), we aim to\nleverage the semantic representation of a given area to select relevant\ngeospatial data.</p>\n<p>Once points of interest are selected, we then generate a specific walking route\nthat seeks to fulfill the initial requirements by trying to maximize their\nvectorized similarity to a semantic representation of the route. 
The potential\nof the framework, and its contrasting versatility to existing path-finding\nalgorithms, can be evaluated through experiments that reflect real-world\nscenarios such as accessibility and goals ("are we going shopping or just for a\nwalk in nature?").</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li><a href=\"https://arxiv.org/pdf/2108.13092\">Geovectors: a linked corpus of OpenStreetMap embeddings on World Scale</a> (2021, arXiv)</li>\n</ul>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#enhancing-navigation-algorithms-with-semantic-embeddings\"></a>Enhancing Navigation Algorithms with Semantic Embeddings</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part III or MPhil project, and is currently <span>being worked on</span> by <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>.</p>\n<p><a href=\"https://theory.stanford.edu/~amitp/GameProgramming/AStarComparison.html\">Pathfinding</a> algorithms used in modern navigation systems utilize a plethora of different\ngeospatial data, such as from <a href=\"https://www.openstreetmap.org/\">OpenStreetMap</a>.\nNevertheless, they often operate under one-size-fits-all assumptions and the\nsimple objective of minimizing the anticipated travel time.</p>\n<p>By leveraging vectorized geospatial descriptions, this project aims to build a\nframework for finding <a href=\"https://www.cnu.org/publicsquare/2019/01/10/walkability-indexes-are-flawed-lets-find-better-method1\">walking\nroutes</a>\nthat seek to achieve much more customizable objectives. 
Given a set of specific\nrequirements and preferences ("avoid dark streets at night"), we aim to\nleverage the semantic representation of a given area to select relevant\ngeospatial data.</p>\n<p>Once points of interest are selected, we then generate a specific walking route\nthat seeks to fulfill the initial requirements by trying to maximize their\nvectorized similarity to a semantic representation of the route. The potential\nof the framework, and its contrasting versatility to existing path-finding\nalgorithms, can be evaluated through experiments that reflect real-world\nscenarios such as accessibility and goals ("are we going shopping or just for a\nwalk in nature?").</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related reading</h2>\n<ul>\n<li><a href=\"https://arxiv.org/pdf/2108.13092\">Geovectors: a linked corpus of OpenStreetMap embeddings on World Scale</a> (2021, arXiv)</li>\n</ul>",
+18
avsm/ideas_wayland.json
+18
avsm/ideas_wayland.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/wayland\">Low-latency wayland compositor in OCaml</a> <span>/ May 2024</span></h2><div><p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:tt492@cam.ac.uk\">Tom Thorogood</a>. It is co-supervised with <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>.</p>\n<p>When building situated displays and hybrid streaming\nsystems, we need fine-grained composition over what to show on the displays.\nWayland is a communications protocol for next-generation display servers used\nin Unix-like systems.[^0]</p>\n<p>It has been adopted as the default display server by Linux distributions\nincluding Fedora with KDE, and Ubuntu and Debian with GNOME. It aims to\nreplace the venerable X display server with a modern alternative. X leaves\nlogic such as window management to application software, which has allowed the\nproliferation of different approaches. Wayland, however, centralizes all this\nlogic in the 'compositor', which assumes both display server and window manager\nroles.[^1]</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/wayland\">267 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#low-latency-wayland-compositor-in-ocaml\"></a>Low-latency wayland compositor in OCaml</h1>\n<p>This is an idea proposed in 2024 as a Cambridge Computer Science Part II project, and is currently <span>being worked on</span> by <a href=\"mailto:tt492@cam.ac.uk\">Tom Thorogood</a>. 
It is co-supervised with <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>.</p>\n<p>When building situated displays and hybrid streaming\nsystems, we need fine-grained composition over what to show on the displays.\nWayland is a communications protocol for next-generation display servers used\nin Unix-like systems.<a href=\"https://anil.recoil.org/#fn-0\">[1]</a></p>\n<p>It has been adopted as the default display server by Linux distributions\nincluding Fedora with KDE, and Ubuntu and Debian with GNOME. It aims to\nreplace the venerable X display server with a modern alternative. X leaves\nlogic such as window management to application software, which has allowed the\nproliferation of different approaches. Wayland, however, centralizes all this\nlogic in the 'compositor', which assumes both display server and window manager\nroles.<a href=\"https://anil.recoil.org/#fn-1\">[2]</a></p>\n<p>Libraries such as wlroots, libweston, and 'small Wayland compositor', exist to\nprovide a basis on which to build a Wayland compositor. Much of the Wayland\necosystem is written in C, but modern memory-safe, type-safe, composable\nsystems programming languages like OCaml offer tempting alternatives. This\nproject proposes writing a Wayland compositor in OCaml, which opens up interesting\nopportunities for writing custom window management logic similar to how xmonad\ndoes for X<a href=\"https://anil.recoil.org/#fn-3\">[3]</a> rather than relying on IPC mechanisms used in state-of-the-art\nsystems.<a href=\"https://anil.recoil.org/#fn-4\">[4]</a></p>\n<p>This project is suitable for an ambitious student with a keen interest in\ngraphics, communication protocols, and operating systems. 
Starting points\ninclude completing OCaml wlroots bindings<a href=\"https://anil.recoil.org/#fn-3\">[3]</a> enough to implement an OCaml\nversion of the tinywl compositor<a href=\"https://anil.recoil.org/#fn-5\">[5]</a> and the pure OCaml implementation of the\nWayland protocol.<a href=\"https://anil.recoil.org/#fn-6\">[6]</a></p>\n<p>If you want to read a really fun historical paper that inspires this work, then\nthe <a href=\"https://www.cl.cam.ac.uk/research/dtg/attarchive/pub/docs/att/tr.94.4.pdf\">teleporting displays</a>\npaper should give you some entertaining background.</p>\n\n<ol>\n<li>\n<p><a href=\"https://wayland.freedesktop.org/\">https://wayland.freedesktop.org/</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-0\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p><a href=\"https://wayland.freedesktop.org/faq.html#heading_toc_j_11\">https://wayland.freedesktop.org/faq.html#heading_toc_j_11</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p><a href=\"https://github.com/swaywm/ocaml-wlroots\">https://github.com/swaywm/ocaml-wlroots</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e1</a><a href=\"https://anil.recoil.org/#ref-2-fn-3\">\u21a9\ufe0e\ufe0e2</a></span></li><li>\n<p><a href=\"https://github.com/swaywm/sway/blob/master/sway/sway-ipc.7.scd\">https://github.com/swaywm/sway/blob/master/sway/sway-ipc.7.scd</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-4\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p><a href=\"https://gitlab.freedesktop.org/wlroots/wlroots/-/blob/master/tinywl/tinywl.c\">https://gitlab.freedesktop.org/wlroots/wlroots/-/blob/master/tinywl/tinywl.c</a></p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-5\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p><a href=\"https://github.com/talex5/ocaml-wayland\">https://github.com/talex5/ocaml-wayland</a></p>\n<span><a 
href=\"https://anil.recoil.org/#ref-1-fn-6\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/ideas_xmpp-group-comms.json
+18
avsm/ideas_xmpp-group-comms.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/xmpp-group-comms\">Simulating XMPP Group Communication</a> <span>/ Jan 2011</span></h2><div><p>This is an idea proposed in 2011 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://farhanmannan.com\">Farh\u0101n Mann\u0101n</a>.</p>\n<p>The problem of getting a digital message from one place to another has a staggering range of possible scenarios, constraints and applications. Humans and devices are in constant dialogue, with various constraints and contracts being invisibly maintained. Even the most flippant instant message sets layers of protocols in motion, all straining to resolve identities and propagate information transparently across disparate physical components that must present a logically unified front to users. Subtleties like authentication, encryption and anonymity abound.</p>\n<p>This project aims to build an OCaml-based simulator (using the <code>ocamlgraph</code> library) to build an XMPP protocol simulator that can model the networks, agents and protocols involved in XMPP-based group communication. The project is twofold and modular: the core is a simulator which is used to investigate the properties of gossip protocols acting on different graph topologies. The simulator can be parameterised on an RPC implementation so that rather than using simulated graphs, it can monitor the performance of the algorithms on real networks as well. 
An attempted extension is the implementation of a functional OCaml RPC abstraction over XMPP which would be compatible with the simulator and be usable with <a href=\"https://mirageos.org\">MirageOS</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/ideas/xmpp-group-comms\">234 words</a>]</span></div>",+"content": "<h1><a href=\"https://anil.recoil.org/#simulating-xmpp-group-communication\"></a>Simulating XMPP Group Communication</h1>\n<p>This is an idea proposed in 2011 as a Cambridge Computer Science Part II project, and has been <span>completed</span> by <a href=\"https://farhanmannan.com\">Farh\u0101n Mann\u0101n</a>.</p>\n<p>The problem of getting a digital message from one place to another has a staggering range of possible scenarios, constraints and applications. Humans and devices are in constant dialogue, with various constraints and contracts being invisibly maintained. Even the most flippant instant message sets layers of protocols in motion, all straining to resolve identities and propagate information transparently across disparate physical components that must present a logically unified front to users. Subtleties like authentication, encryption and anonymity abound.</p>\n<p>This project aims to use the <code>ocamlgraph</code> library to build an OCaml-based XMPP protocol simulator that can model the networks, agents and protocols involved in XMPP-based group communication. The project is twofold and modular: the core is a simulator which is used to investigate the properties of gossip protocols acting on different graph topologies. The simulator can be parameterised on an RPC implementation so that rather than using simulated graphs, it can monitor the performance of the algorithms on real networks as well. 
An attempted extension is the implementation of a functional OCaml RPC abstraction over XMPP which would be compatible with the simulator and be usable with <a href=\"https://mirageos.org\">MirageOS</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<ul>\n<li><a href=\"https://xmpp.org/extensions/xep-0045.html\">XEP-0045</a> XMPP multiuser chat spec.</li>\n<li>An OCaml <a href=\"https://github.com/ermine/xmpp\">XMPP implementation</a></li>\n<li><a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#links\"></a>Links</h2>\n<p>The source code to the <a href=\"https://github.com/f6m6/gossip\">OCaml XMPP simulator</a>\nis available publicly. The dissertation PDF isn't available publicly but\nshould be in the Cambridge Computer Lab archives somewhere.</p>",
+18
avsm/ideas_zfs-filesystem-perf.json
+18
avsm/ideas_zfs-filesystem-perf.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/ideas/zfs-filesystem-perf\">ZFS replication strategies with encryption</a> <span>/ Jun 2025</span></h2><div><p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a>. It is co-supervised with <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a>.</p>\n<p>We are using ZFS in much of our <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a> infrastructure due to its ease of remote replication. Therefore, its performance characteristics when used as a local filesystem are particularly interesting. Some questions that we need to answer about our <a href=\"https://www.tunbury.org/2025/04/29/distributed-zfs-storage/\">uses of ZFS</a> are:</p>\n<ol>\n<li>We intend to have an encrypted <a href=\"https://anil.recoil.org/notes/syncoid-sanoid-zfs\">remote backups</a> in several locations, but only a few of those hosts should have keys and the rest should use <a href=\"https://www.tunbury.org/2025/05/02/zfs-send-streams/\">raw ZFS send streams</a>.</li>\n</ol>\n<ul>\n<li><em>Does encryption add a significant overhead when used locally?</em></li>\n<li><em>Is replication faster if the source and target are both encrypted <em>vs</em> a raw send?</em></li>\n</ul>\n<ol>\n<li>We would typically have a <a href=\"https://anil.recoil.org/notes/syncoid-sanoid-zfs\">snapshot schedule</a>, such as hourly snapshots with a retention of 48 hours, daily snapshots with a retention of 14 days, and weekly snapshots with a retention of 8 weeks. 
As these snapshots build up over time, is there a performance degradation?</li>\n</ol>\n<ul>\n<li><em>Should we minimise the number of snapshots held locally, as this would allow faster purging of deleted files?</em></li>\n</ul>\n<ol>\n<li>How does ZFS send/receive compare to a peer-to-peer backup solution like <a href=\"https://www.borgbackup.org/\">Borg Backup</a>, given that it allows a free choice of source and target backup file system and supports encryption?</li>\n</ol>\n<ul>\n<li>ZFS should have the advantage of knowing which blocks have changed between two backups, but potentially, this adds an overhead to day-to-day use.</li>\n<li>On the other hand, ZFS replicas can be brought online much more quickly, whereas Borg backup files need to be reconstructed into a usable filesystem.</li>\n</ul>\n</div>",+"content": "<h1><a href=\"https://anil.recoil.org/#zfs-replication-strategies-with-encryption\"></a>ZFS replication strategies with encryption</h1>\n<p>This is an idea proposed in 2025 as a good starter project, and is currently <span>being worked on</span> by <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a>. It is co-supervised with <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a>.</p>\n<p>We are using ZFS in much of our <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a> infrastructure due to its ease of remote replication. Therefore, its performance characteristics when used as a local filesystem are particularly interesting. 
Some questions that we need to answer about our <a href=\"https://www.tunbury.org/2025/04/29/distributed-zfs-storage/\">uses of ZFS</a> are:</p>\n<ol>\n<li>We intend to have encrypted <a href=\"https://anil.recoil.org/notes/syncoid-sanoid-zfs\">remote backups</a> in several locations, but only a few of those hosts should have keys and the rest should use <a href=\"https://www.tunbury.org/2025/05/02/zfs-send-streams/\">raw ZFS send streams</a>.</li>\n</ol>\n<ul>\n<li><em>Does encryption add a significant overhead when used locally?</em></li>\n<li><em>Is replication faster if the source and target are both encrypted <em>vs</em> a raw send?</em></li>\n</ul>\n<ol>\n<li>We would typically have a <a href=\"https://anil.recoil.org/notes/syncoid-sanoid-zfs\">snapshot schedule</a>, such as hourly snapshots with a retention of 48 hours, daily snapshots with a retention of 14 days, and weekly snapshots with a retention of 8 weeks. As these snapshots build up over time, is there a performance degradation?</li>\n</ol>\n<ul>\n<li><em>Should we minimise the number of snapshots held locally, as this would allow faster purging of deleted files?</em></li>\n</ul>\n<ol>\n<li>How does ZFS send/receive compare to a peer-to-peer backup solution like <a href=\"https://www.borgbackup.org/\">Borg Backup</a>, given that it allows a free choice of source and target backup file system and supports encryption?</li>\n</ol>\n<ul>\n<li>ZFS should have the advantage of knowing which blocks have changed between two backups, but potentially, this adds an overhead to day-to-day use.</li>\n<li>On the other hand, ZFS replicas can be brought online much more quickly, whereas Borg backup files need to be reconstructed into a usable filesystem.</li>\n</ul>",
+2
-2
avsm/metadata.json
+18
avsm/news_0bc235e0-b154-4cbf-a84a-61240f16d60a-1.json
+18
avsm/news_0bc235e0-b154-4cbf-a84a-61240f16d60a-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/0bc235e0-b154-4cbf-a84a-61240f16d60a-1\">Delivered keynote at BOB 2015 on MirageOS</a> <span>/ Jan 2015</span></h2><p>I hopped over to Berlin to give the keynote at <a href=\"https://bobkonf.de/2015/en/\">BOB 2015</a> on functional operating systems. If you're in the region, I <em>highly</em> recommend attending BOB as a superbly organised conference with a diverse and interesting crowd of functional programmers.</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/0bc235e0-b154-4cbf-a84a-61240f16d60a-1\">#</a> 23rd Jan 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>fp</span> <span>keynote</span> <span>mirageos</span> <span>ocaml</span> <span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>I hopped over to Berlin to give the keynote at <a href=\"https://bobkonf.de/2015/en/\">BOB 2015</a> on functional operating systems. If you're in the region, I <em>highly</em> recommend attending BOB as a superbly organised conference with a diverse and interesting crowd of functional programmers.</p>\n<p></p><div></div><p></p>",
+18
avsm/news_2004-spotcodes-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2004-spotcodes-1\">Using camera-phones to interact with context-aware mobile services</a> <span>/ Dec 2004</span></h2><p>A technical report is now available on our SpotCode visual tag system, and includes a user study led\nby <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> which tested its benefits against conventional mobile interfaces.</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:alan.blackwell@cl.cam.ac.uk\"><span>Alan Blackwell</span></a>, and <a href=\"mailto:eben@phlegethon.org\"><span>Eben Upton</span></a>.</p><p>Technical report (UCAM-CL-TR-609) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-609.html\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-609.html\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-609\">DOI</a> <a href=\"https://anil.recoil.org/papers/2004-spotcodes.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2004-spotcodes.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2004-spotcodes-1\">#</a> 1st Dec 2004 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>hci</span> <span>mobile</span> <span>report</span> <span>spotcodes</span> <span>ubicomp</span> <span>visual</span></span></div>",+"content": "<p>A technical report is now available on our SpotCode visual tag system, and includes a user study led\nby <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> which tested its benefits 
against conventional mobile interfaces.</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:alan.blackwell@cl.cam.ac.uk\"><span>Alan Blackwell</span></a>, and <a href=\"mailto:eben@phlegethon.org\"><span>Eben Upton</span></a>.</p><p>Technical report (UCAM-CL-TR-609) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-609.html\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-609.html\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-609\">DOI</a> <a href=\"https://anil.recoil.org/papers/2004-spotcodes.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2004-spotcodes.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2004-ubicomp-camera-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2004-ubicomp-camera-1\">Using Camera-Phones to Enhance Human-Computer Interaction</a> <span>/ Sep 2004</span></h2><p>We gave a demo at <a href=\"https://www.ubicomp.org/ubicomp2004/\">UbiComp 2004</a> all the way in Tokyo\non our SpotCode visual tag system. It went very well, including some time to do some\nsightseeing in Japan and visit the sumo wrestling championships!</p>\n<p></p><div></div><p></p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, and <a href=\"mailto:eben@phlegethon.org\"><span>Eben Upton</span></a>.</p><p>Paper in the <a href=\"https://ubicomp.org/ubicomp2004/adjunct/demos/madhavapeddy.pdf\">adjunct Proceedings of Ubicomp 2004 (Demo Track)</a>.</p><p><a href=\"https://ubicomp.org/ubicomp2004/adjunct/demos/madhavapeddy.pdf\">URL</a> <i>(ubicomp.org)</i> <a href=\"https://anil.recoil.org/papers/2004-ubicomp-camera.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2004-ubicomp-camera.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2004-ubicomp-camera-1\">#</a> 1st Sep 2004 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>hci</span> <span>japan</span> <span>mobile</span> <span>spotcodes</span> <span>ubicomp</span></span></div>",+"content": "<p>We gave a demo at <a href=\"https://www.ubicomp.org/ubicomp2004/\">UbiComp 2004</a> all the way in Tokyo\non our SpotCode visual tag system. 
It went very well, including some time to do some\nsightseeing in Japan and visit the sumo wrestling championships!</p>\n<p></p><div></div><p></p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, and <a href=\"mailto:eben@phlegethon.org\"><span>Eben Upton</span></a>.</p><p>Paper in the <a href=\"https://ubicomp.org/ubicomp2004/adjunct/demos/madhavapeddy.pdf\">adjunct Proceedings of Ubicomp 2004 (Demo Track)</a>.</p><p><a href=\"https://ubicomp.org/ubicomp2004/adjunct/demos/madhavapeddy.pdf\">URL</a> <i>(ubicomp.org)</i> <a href=\"https://anil.recoil.org/papers/2004-ubicomp-camera.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2004-ubicomp-camera.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2005-bbphone-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2005-bbphone-1\">The Broadband Phone Network: Experiences with Context-Aware Telephony</a> <span>/ Jan 2005</span></h2><p>Report on our hacking on the AT&T Broadband Phone</p>\n<blockquote><div><p><a href=\"mailto:ripduman.sohan@gmail.com\"><span>Ripduman Sohan</span></a>, <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (CUED/F INFENG/TR.533) at <a href=\"https://cam-orl.co.uk/bphone/\">Cambridge University Engineering Department</a>.</p><p><a href=\"https://cam-orl.co.uk/bphone/\">URL</a> <i>(cam-orl.co.uk)</i> <a href=\"https://anil.recoil.org/papers/2005-bbphone.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-bbphone.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2005-bbphone-1\">#</a> 1st Jan 2005 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>audio</span> <span>hci</span> <span>report</span> <span>ubicomp</span></span></div>",+"content": "<p>Report on our hacking on the AT&T Broadband Phone</p>\n<blockquote><div><p><a href=\"mailto:ripduman.sohan@gmail.com\"><span>Ripduman Sohan</span></a>, <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (CUED/F INFENG/TR.533) at <a href=\"https://cam-orl.co.uk/bphone/\">Cambridge University Engineering Department</a>.</p><p><a href=\"https://cam-orl.co.uk/bphone/\">URL</a> <i>(cam-orl.co.uk)</i> <a href=\"https://anil.recoil.org/papers/2005-bbphone.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-bbphone.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2005-hotdep-spl-1.json
···+"title": "On the challenge of delivering high-performance, dependable, model-checked internet servers",+"summary": "<h2><a href=\"https://anil.recoil.org/news/2005-hotdep-spl-1\">On the challenge of delivering high-performance, dependable, model-checked internet servers</a> <span>/ Jun 2005</span></h2><p>Paper on temporal automata for protocol implementations at HotDep 2005</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.5555/1973400.1973406\">proceedings of the First Conference on Hot Topics in System Dependability</a>.</p><p><a href=\"https://dl.acm.org/doi/10.5555/1973400.1973406\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://anil.recoil.org/papers/2005-hotdep-spl.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-hotdep-spl.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2005-hotdep-spl-1\">#</a> 1st Jun 2005 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>dsl</span> <span>formal</span> <span>modelchecking</span> <span>security</span></span></div>",+"content": "<p>Paper on temporal automata for protocol implementations at HotDep 2005</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.5555/1973400.1973406\">proceedings of the First Conference on Hot Topics in System Dependability</a>.</p><p><a href=\"https://dl.acm.org/doi/10.5555/1973400.1973406\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://anil.recoil.org/papers/2005-hotdep-spl.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-hotdep-spl.pdf\"><span>PDF<img 
alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2005-ieee-audio-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2005-ieee-audio-1\">Audio networking: the forgotten wireless technology</a> <span>/ Jul 2005</span></h2><p>New paper <a href=\"https://anil.recoil.org/papers/2005-ieee-audio\">Audio networking: the forgotten wireless technology</a> available</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>.</p><p>Journal paper in <a href=\"https://ieeexplore.ieee.org/document/1495392/\">IEEE Pervasive Computing</a> (vol 4 issue 3).</p><p><a href=\"https://ieeexplore.ieee.org/document/1495392/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/MPRV.2005.50\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-audio.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-audio.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2005-ieee-audio-1\">#</a> 1st Jul 2005 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>audio</span> <span>hci</span> <span>journal</span> <span>mobile</span> <span>ubicomp</span></span></div>",+"content": "<p>New paper <a href=\"https://anil.recoil.org/papers/2005-ieee-audio\">Audio networking: the forgotten wireless technology</a> available</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>.</p><p>Journal paper in <a href=\"https://ieeexplore.ieee.org/document/1495392/\">IEEE Pervasive Computing</a> (vol 4 issue 
3).</p><p><a href=\"https://ieeexplore.ieee.org/document/1495392/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/MPRV.2005.50\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-audio.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-audio.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2005-ieee-smartphones-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2005-ieee-smartphones-1\">Using smart phones to access site-specific services</a> <span>/ Jan 2005</span></h2><p>Article on using cameraphones to access site-specific services in IEEE Pervasive Computing</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Journal paper in <a href=\"https://ieeexplore.ieee.org/document/1427650/\">IEEE Pervasive Computing</a> (vol 4 issue 2).</p><p><a href=\"https://ieeexplore.ieee.org/document/1427650/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/MPRV.2005.44\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-smartphones.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-smartphones.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2005-ieee-smartphones-1\">#</a> 1st Jan 2005 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>hci</span> <span>journal</span> <span>mobile</span> <span>spotcodes</span> <span>ubicomp</span></span></div>",+"content": "<p>Article on using cameraphones to access site-specific services in IEEE Pervasive Computing</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Journal paper in <a href=\"https://ieeexplore.ieee.org/document/1427650/\">IEEE Pervasive Computing</a> (vol 4 issue 
2).</p><p><a href=\"https://ieeexplore.ieee.org/document/1427650/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/MPRV.2005.44\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-smartphones.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ieee-smartphones.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2005-spin-splat-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2005-spin-splat-1\">SPLAT: A Tool for Model-Checking and Dynamically-Enforcing Abstractions</a> <span>/ Aug 2005</span></h2><p>Workshop paper on temporal automata for protocol specifications at SPIN 2005</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/11537328_23\">Model Checking Software</a>.</p><p><a href=\"http://link.springer.com/10.1007/11537328_23\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/11537328_23\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-spin-splat.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-spin-splat.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2005-spin-splat-1\">#</a> 1st Aug 2005 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>dsl</span> <span>formal</span> <span>modelchecking</span> <span>security</span></span></div>",+"content": "<p>Workshop paper on temporal automata for protocol specifications at SPIN 2005</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/11537328_23\">Model Checking Software</a>.</p><p><a href=\"http://link.springer.com/10.1007/11537328_23\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/11537328_23\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-spin-splat.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2005-spin-splat.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2005-ubiapp-ubimedia-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2005-ubiapp-ubimedia-1\">Ubiquitous Computing needs to catch up with Ubiquitous Media</a> <span>/ Jul 2005</span></h2><p>Position paper on ubiquitous computing approaches to emerging stream media appliances</p>\n<blockquote><div><p><a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, and <span><span>Kasim Rehman</span></span>.</p><p>Journal paper in <a href=\"https://ieeexplore.ieee.org/document/1495397/\">IEEE Pervasive Computing</a> (vol 4 issue 3).</p><p><a href=\"https://ieeexplore.ieee.org/document/1495397/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/MPRV.2005.69\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ubiapp-ubimedia.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ubiapp-ubimedia.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2005-ubiapp-ubimedia-1\">#</a> 1st Jul 2005 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>hci</span> <span>journal</span> <span>mobile</span> <span>ubicomp</span></span></div>",+"content": "<p>Position paper on ubiquitous computing approaches to emerging stream media appliances</p>\n<blockquote><div><p><a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, and <span><span>Kasim Rehman</span></span>.</p><p>Journal paper in <a href=\"https://ieeexplore.ieee.org/document/1495397/\">IEEE Pervasive Computing</a> (vol 4 issue 3).</p><p><a href=\"https://ieeexplore.ieee.org/document/1495397/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/MPRV.2005.69\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ubiapp-ubimedia.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ubiapp-ubimedia.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2005-ubicomp-bluetooth-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2005-ubicomp-bluetooth-1\">A Study of Bluetooth Propagation Using Accurate Indoor Location Mapping</a> <span>/ Jul 2005</span></h2><p>Ubicomp paper on a study of indoor Bluetooth propagation using the Active Bat system</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>.</p><p>Paper in UbiComp 2005: Ubiquitous Computing.</p><p><a href=\"https://doi.org/10.1007/11551201_7\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ubicomp-bluetooth.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ubicomp-bluetooth.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2005-ubicomp-bluetooth-1\">#</a> 1st Sep 2005 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>bluetooth</span> <span>conference</span> <span>hci</span> <span>mobile</span> <span>ubicomp</span></span></div>",+"content": "<p>Ubicomp paper on a study of indoor Bluetooth propagation using the Active Bat system</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>.</p><p>Paper in UbiComp 2005: Ubiquitous Computing.</p><p><a href=\"https://doi.org/10.1007/11551201_7\">DOI</a> <a href=\"https://anil.recoil.org/papers/2005-ubicomp-bluetooth.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2005-ubicomp-bluetooth.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2006-fighting-crimeware-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2006-fighting-crimeware-1\">Fighting Crimeware: An Architecture for Split-Trust Web Applications</a> <span>/ Apr 2006</span></h2><p>New paper <a href=\"https://anil.recoil.org/papers/2006-fighting-crimeware\">Fighting Crimeware: An Architecture for Split-Trust Web Applications</a> available</p>\n<blockquote><div><p><a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Roy Want</span></span>, <span><span>Trevor Pering</span></span>, and <a href=\"https://ieeexplore.ieee.org/author/37549829000\"><span>John Light</span></a>.</p><p>Technical report (IRC-TR-06-053) at Intel Research.</p><p><a href=\"https://anil.recoil.org/papers/2006-fighting-crimeware.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2006-fighting-crimeware.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2006-fighting-crimeware-1\">#</a> 1st Apr 2006 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>browser</span> <span>javascript</span> <span>report</span> <span>security</span></span></div>",+"content": "<p>New paper <a href=\"https://anil.recoil.org/papers/2006-fighting-crimeware\">Fighting Crimeware: An Architecture for Split-Trust Web Applications</a> available</p>\n<blockquote><div><p><a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Roy Want</span></span>, <span><span>Trevor Pering</span></span>, and <a href=\"https://ieeexplore.ieee.org/author/37549829000\"><span>John Light</span></a>.</p><p>Technical report (IRC-TR-06-053) at Intel Research.</p><p><a href=\"https://anil.recoil.org/papers/2006-fighting-crimeware.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2006-fighting-crimeware.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2007-eurosys-melange-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2007-eurosys-melange-1\">Melange: creating a &quot;functional&quot; internet</a> <span>/ Jun 2007</span></h2><p>Won best student paper for my PhD work on a high-performance functional packet parsing DSL at Eurosys 2007!</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://hoiho.net\"><span>Alex Ho</span></a>, <a href=\"mailto:tjd@phlegethon.org\"><span>Tim Deegan</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"mailto:ripduman.sohan@gmail.com\"><span>Ripduman Sohan</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/1272998.1273009\">ACM SIGOPS Operating Systems Review</a> (vol 41 issue 3).</p><p><a href=\"https://dl.acm.org/doi/10.1145/1272998.1273009\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/1272998.1273009\">DOI</a> <a href=\"https://anil.recoil.org/papers/2007-eurosys-melange.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2007-eurosys-melange.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2007-eurosys-melange-1\">#</a> 1st Jun 2007 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>award</span> <span>dsl</span> <span>fp</span> <span>internet</span> <span>journal</span> <span>networks</span> <span>ocaml</span> <span>security</span></span></div>",+"content": "<p>Won best student paper for my PhD work on a high-performance functional packet parsing DSL at Eurosys 2007!</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://hoiho.net\"><span>Alex Ho</span></a>, <a href=\"mailto:tjd@phlegethon.org\"><span>Tim Deegan</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a 
href=\"mailto:ripduman.sohan@gmail.com\"><span>Ripduman Sohan</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/1272998.1273009\">ACM SIGOPS Operating Systems Review</a> (vol 41 issue 3).</p><p><a href=\"https://dl.acm.org/doi/10.1145/1272998.1273009\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/1272998.1273009\">DOI</a> <a href=\"https://anil.recoil.org/papers/2007-eurosys-melange.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2007-eurosys-melange.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2008-mobisys-splittrust-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2008-mobisys-splittrust-1\">Enhancing web browsing security on public terminals using mobile composition</a> <span>/ Jun 2008</span></h2><p>Paper on splitting trust between smartphones and web browsers at MobiSys 2008</p>\n<blockquote><div><p><a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Roy Want</span></span>, and <span><span>Trevor Pering</span></span>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/1378600.1378612\">proceedings of the 6th international conference on Mobile systems, applications, and services</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/1378600.1378612\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/1378600.1378612\">DOI</a> <a href=\"https://anil.recoil.org/papers/2008-mobisys-splittrust.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2008-mobisys-splittrust.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2008-mobisys-splittrust-1\">#</a> 1st Jun 2008 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>browser</span> <span>conference</span> <span>javascript</span> <span>mobile</span> <span>security</span></span></div>",+"content": "<p>Paper on splitting trust between smartphones and web browsers at MobiSys 2008</p>\n<blockquote><div><p><a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Roy Want</span></span>, and <span><span>Trevor Pering</span></span>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/1378600.1378612\">proceedings of the 6th international conference on Mobile systems, applications, and services</a>.</p><p><a 
href=\"https://dl.acm.org/doi/10.1145/1378600.1378612\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/1378600.1378612\">DOI</a> <a href=\"https://anil.recoil.org/papers/2008-mobisys-splittrust.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2008-mobisys-splittrust.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2009-icfem-spl-1.json
···+"title": "Combining Static Model Checking with Dynamic Enforcement Using the Statecall Policy Language",+"summary": "<h2><a href=\"https://anil.recoil.org/news/2009-icfem-spl-1\">Combining Static Model Checking with Dynamic Enforcement Using the Statecall Policy Language</a> <span>/ Nov 2009</span></h2><p>Paper on a DSL for specifying temporal protocol automata at ICFEM 2009</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/978-3-642-10373-5_23\">Formal Methods and Software Engineering</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-642-10373-5_23\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-642-10373-5_23\">DOI</a> <a href=\"https://anil.recoil.org/papers/2009-icfem-spl.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2009-icfem-spl.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2009-icfem-spl-1\">#</a> 1st Nov 2009 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>dsl</span> <span>formal</span> <span>fp</span> <span>security</span></span></div>",+"content": "<p>Paper on a DSL for specifying temporal protocol automata at ICFEM 2009</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/978-3-642-10373-5_23\">Formal Methods and Software Engineering</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-642-10373-5_23\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-642-10373-5_23\">DOI</a> <a href=\"https://anil.recoil.org/papers/2009-icfem-spl.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2009-icfem-spl.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2010-bcs-visions-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2010-bcs-visions-1\">Multiscale not multicore: efficient heterogeneous cloud computing</a> <span>/ Apr 2010</span></h2><p>Paper on our vision for multiscale programming at the BCS Visions 2010 conference</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.5555/1811182.1811191\">proceedings of the 2010 ACM-BCS Visions of Computer Science Conference</a>.</p><p><a href=\"https://dl.acm.org/doi/10.5555/1811182.1811191\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://anil.recoil.org/papers/2010-bcs-visions.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-bcs-visions.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2010-bcs-visions-1\">#</a> 1st Apr 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>architecture</span> <span>cloud</span> <span>conference</span> <span>internet</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on our vision for multiscale programming at the BCS Visions 2010 conference</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.5555/1811182.1811191\">proceedings of the 2010 ACM-BCS Visions of Computer Science Conference</a>.</p><p><a 
href=\"https://dl.acm.org/doi/10.5555/1811182.1811191\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://anil.recoil.org/papers/2010-bcs-visions.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-bcs-visions.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2010-dyntype-wgt-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2010-dyntype-wgt-1\">Dynamics for ML using Meta-Programming</a> <span>/ Jul 2011</span></h2><p>Paper on statically typed value persistence for OCaml in ENTCS 2011</p>\n<blockquote><div><p><a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">Electronic Notes in Theoretical Computer Science</a> (vol 264 issue 5).</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/J.ENTCS.2011.06.002\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-dyntype-wgt.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-dyntype-wgt.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2010-dyntype-wgt-1\">#</a> 1st Jul 2011 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>journal</span> <span>ocaml</span> <span>staged</span></span></div>",+"content": "<p>Paper on statically typed value persistence for OCaml in ENTCS 2011</p>\n<blockquote><div><p><a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">Electronic Notes in Theoretical Computer Science</a> (vol 264 issue 5).</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/J.ENTCS.2011.06.002\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-dyntype-wgt.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2010-dyntype-wgt.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2010-hotcloud-lamp-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2010-hotcloud-lamp-1\">Turning Down the LAMP: Software Specialisation for the Cloud</a> <span>/ Jun 2010</span></h2><p>Workshop paper on the early MirageOS architecture and evaluation at HotCloud 2010</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"mailto:ripduman.sohan@gmail.com\"><span>Ripduman Sohan</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <a href=\"mailto:tjd@phlegethon.org\"><span>Tim Deegan</span></a>, <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/hotcloud-10/turning-down-lamp-software-specialisation-cloud\">2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 10)</a>.</p><p><a href=\"https://www.usenix.org/conference/hotcloud-10/turning-down-lamp-software-specialisation-cloud\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2010-hotcloud-lamp-1\">#</a> 1st Jun 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>mirageos</span> <span>ocaml</span> <span>security</span> <span>unikernels</span></span></div>",+"content": "<p>Workshop paper on the early MirageOS architecture and evaluation at HotCloud 2010</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a 
href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"mailto:ripduman.sohan@gmail.com\"><span>Ripduman Sohan</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <a href=\"mailto:tjd@phlegethon.org\"><span>Tim Deegan</span></a>, <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/hotcloud-10/turning-down-lamp-software-specialisation-cloud\">2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 10)</a>.</p><p><a href=\"https://www.usenix.org/conference/hotcloud-10/turning-down-lamp-software-specialisation-cloud\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2010-icfp-xen-1.json
···+"title": "Using functional programming within an industrial product group: perspectives and perceptions",+"summary": "<h2><a href=\"https://anil.recoil.org/news/2010-icfp-xen-1\">Using functional programming within an industrial product group: perspectives and perceptions</a> <span>/ Sep 2010</span></h2><p>Paper on our experiences with writing the Xen control stack in OCaml at ICFP 2010</p>\n<blockquote><div><p><a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/1863543.1863557\">proceedings of the 15th ACM SIGPLAN international conference on Functional programming</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/1863543.1863557\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/1863543.1863557\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-icfp-xen.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-icfp-xen.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2010-icfp-xen-1\">#</a> 1st Sep 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>cufp</span> <span>fp</span> <span>icfp</span> <span>opensource</span> <span>xen</span></span></div>",+"content": "<p>Paper on our experiences with writing the Xen control stack in OCaml at ICFP 2010</p>\n<blockquote><div><p><a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil 
Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/1863543.1863557\">proceedings of the 15th ACM SIGPLAN international conference on Functional programming</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/1863543.1863557\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/1863543.1863557\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-icfp-xen.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-icfp-xen.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2010-iswp-dustclouds-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2010-iswp-dustclouds-1\">Using Dust Clouds to Enhance Anonymous Communication</a> <span>/ Mar 2014</span></h2><p>Paper on building dust clouds for anonymous communication</p>\n<blockquote><div><p><a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Theodore Hong</span></span>, <a href=\"https://github.com/mrry\"><span>Derek Murray</span></a>, and <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/978-3-662-45921-8_10\">Security Protocols XVIII</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-662-45921-8_10\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-662-45921-8_10\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2010-iswp-dustclouds-1\">#</a> 1st Mar 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>distributed</span> <span>security</span> <span>tor</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on building dust clouds for anonymous communication</p>\n<blockquote><div><p><a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Theodore Hong</span></span>, <a href=\"https://github.com/mrry\"><span>Derek Murray</span></a>, and <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/978-3-662-45921-8_10\">Security Protocols XVIII</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-662-45921-8_10\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-662-45921-8_10\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2010-smarte-privacybutler-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2010-smarte-privacybutler-1\">Privacy Butler: A Personal Privacy Rights Manager for Online Presence</a> <span>/ Mar 2010</span></h2><p>Paper on privacy butler services for more private data management</p>\n<blockquote><div><p><span><span>Ryan Wishart</span></span>, <span><span>Domenico Corapi</span></span>, and <span><span>Morris Sloman</span></span>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/5470519\">2010 8th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops)</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/5470519\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/PERCOMW.2010.5470519\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-smarte-privacybutler.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2010-smarte-privacybutler.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2010-smarte-privacybutler-1\">#</a> 1st Mar 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>hci</span> <span>privacy</span> <span>selfhosting</span></span></div>",+"content": "<p>Paper on privacy butler services for more private data management</p>\n<blockquote><div><p><span><span>Ryan Wishart</span></span>, <span><span>Domenico Corapi</span></span>, and <span><span>Morris Sloman</span></span>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/5470519\">2010 8th IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops)</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/5470519\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/PERCOMW.2010.5470519\">DOI</a> <a href=\"https://anil.recoil.org/papers/2010-smarte-privacybutler.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2010-smarte-privacybutler.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2011-cufp-scribe-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2011-cufp-scribe-1\">CUFP 2011 Workshop Report</a> <span>/ Jan 2012</span></h2><p>Published the scribe's report for CUFP 2011</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/yminsky\"><span>Yaron Minsky</span></a>, and <a href=\"https://monkey.org/~marius/\"><span>Marius Eriksen</span></a>.</p><p>Journal paper in <a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp-2011-workshop-report/F22A5B087C6DD9A382D518F6DE08477A\">Journal of Functional Programming</a> (vol 22 issue 1).</p><p><a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp-2011-workshop-report/F22A5B087C6DD9A382D518F6DE08477A\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/S0956796812000020\">DOI</a> <a href=\"https://anil.recoil.org/papers/2011-cufp-scribe.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2011-cufp-scribe-1\">#</a> 1st Jan 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cufp</span> <span>icfp</span> <span>journal</span> <span>service</span></span></div>",+"content": "<p>Published the scribe's report for CUFP 2011</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/yminsky\"><span>Yaron Minsky</span></a>, and <a href=\"https://monkey.org/~marius/\"><span>Marius Eriksen</span></a>.</p><p>Journal paper in <a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp-2011-workshop-report/F22A5B087C6DD9A382D518F6DE08477A\">Journal of Functional Programming</a> (vol 22 issue 1).</p><p><a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp-2011-workshop-report/F22A5B087C6DD9A382D518F6DE08477A\">URL</a> 
<i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/S0956796812000020\">DOI</a> <a href=\"https://anil.recoil.org/papers/2011-cufp-scribe.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2011-dynamics-ml-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2011-dynamics-ml-1\">Dynamics for ML using Meta-Programming</a> <span>/ Jul 2011</span></h2><p>Published dyntype at the Workshop on Generative Technologies</p>\n<blockquote><div><p><a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">Electronic Notes in Theoretical Computer Science</a> (vol 264 issue 5).</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/j.entcs.2011.06.002\">DOI</a> <a href=\"https://anil.recoil.org/papers/2011-dynamics-ml.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2011-dynamics-ml.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2011-dynamics-ml-1\">#</a> 1st Jul 2011 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>journal</span> <span>ocaml</span> <span>staged</span></span></div>",+"content": "<p>Published dyntype at the Workshop on Generative Technologies</p>\n<blockquote><div><p><a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">Electronic Notes in Theoretical Computer Science</a> (vol 264 issue 5).</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S1571066111000739\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/j.entcs.2011.06.002\">DOI</a> <a href=\"https://anil.recoil.org/papers/2011-dynamics-ml.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2011-dynamics-ml.pdf\"><span>PDF<img 
alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2011-fccm-cloudfpga-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2011-fccm-cloudfpga-1\">Reconfigurable Data Processing for Clouds</a> <span>/ May 2011</span></h2><p>Paper on what a Xen+FPGA cloud would look like at FCCM</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://raintown.org\"><span>Satnam Singh</span></a>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/5771265/\">2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/5771265/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/FCCM.2011.35\">DOI</a> <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2011-fccm-cloudfpga-1\">#</a> 1st May 2011 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>fpga</span> <span>systems</span> <span>xen</span></span></div>",+"content": "<p>Paper on what a Xen+FPGA cloud would look like at FCCM</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://raintown.org\"><span>Satnam Singh</span></a>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/5771265/\">2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/5771265/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/FCCM.2011.35\">DOI</a> <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2011-icdcn-droplets-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2011-icdcn-droplets-1\">Unclouded vision</a> <span>/ Jan 2011</span></h2><p>Paper on a vision for a semi-federated cloud for personal data at ICDCN</p>\n<blockquote><div><p><a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <span><span>Theodore Hong</span></span>, and <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>.</p><p>Paper in the proceedings of the 12th International Conference on Distributed Computing and Networking.</p><p><a href=\"https://anil.recoil.org/papers/2011-icdcn-droplets.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2011-icdcn-droplets.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2011-icdcn-droplets-1\">#</a> 1st Jan 2011 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>networks</span> <span>privacy</span> <span>selfhosting</span> <span>systems</span></span></div>",+"content": "<p>Paper on a vision for a semi-federated cloud for personal data at ICDCN</p>\n<blockquote><div><p><a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <span><span>Theodore Hong</span></span>, and <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>.</p><p>Paper in the proceedings of the 12th International Conference on Distributed Computing and Networking.</p><p><a href=\"https://anil.recoil.org/papers/2011-icdcn-droplets.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2011-icdcn-droplets.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2011-nsdi-ciel-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2011-nsdi-ciel-1\">CIEL: A universal execution engine for distributed data-flow computing</a> <span>/ Mar 2011</span></h2><p>Paper on CIEL, a distributed dataflow engine, at USENIX NSDI 2011</p>\n<blockquote><div><p><a href=\"https://research.google/people/derekmurray/?&type=google\"><span>Derek G Murray</span></a>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <span><span>Christopher Smowton</span></span>, <a href=\"https://github.com/sos22\"><span>Steven Smith</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/legacy/event/nsdi11/tech/full_papers/Murray.pdf\">8th USENIX Symposium on Networked Systems Design and Implementation (NSDI 11)</a>.</p><p><a href=\"https://www.usenix.org/legacy/event/nsdi11/tech/full_papers/Murray.pdf\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2011-nsdi-ciel-1\">#</a> 1st Mar 2011 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>bigdata</span> <span>cloud</span> <span>conference</span> <span>distributed</span> <span>fp</span> <span>systems</span></span></div>",+"content": "<p>Paper on CIEL, a distributed dataflow engine, at USENIX NSDI 2011</p>\n<blockquote><div><p><a href=\"https://research.google/people/derekmurray/?&type=google\"><span>Derek G Murray</span></a>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <span><span>Christopher Smowton</span></span>, <a href=\"https://github.com/sos22\"><span>Steven Smith</span></a>, 
<a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/legacy/event/nsdi11/tech/full_papers/Murray.pdf\">8th USENIX Symposium on Networked Systems Design and Implementation (NSDI 11)</a>.</p><p><a href=\"https://www.usenix.org/legacy/event/nsdi11/tech/full_papers/Murray.pdf\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2012-ahans-soapp-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-ahans-soapp-1\">Exploring Compartmentalisation Hypotheses with SOAAP</a> <span>/ Sep 2012</span></h2><p>Paper on control flow analysis to break up applications into compartments</p>\n<blockquote><div><p><a href=\"https://www.khilan.com/\"><span>Khilan Gudka</span></a>, <a href=\"http://www.watson.org/~robert/\"><span>Robert M Watson</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <a href=\"https://en.wikipedia.org/wiki/Ben_Laurie\"><span>Ben Laurie</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/6498375/\">2012 IEEE Sixth International Conference on Self-Adaptive and Self-Organizing Systems Workshops</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/6498375/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/SASOW.2012.14\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-ahans-soapp.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-ahans-soapp.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-ahans-soapp-1\">#</a> 1st Sep 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>security</span> <span>systems</span></span></div>",+"content": "<p>Paper on control flow analysis to break up applications into compartments</p>\n<blockquote><div><p><a href=\"https://www.khilan.com/\"><span>Khilan Gudka</span></a>, <a href=\"http://www.watson.org/~robert/\"><span>Robert M Watson</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <a href=\"https://en.wikipedia.org/wiki/Ben_Laurie\"><span>Ben Laurie</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil 
Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/6498375/\">2012 IEEE Sixth International Conference on Self-Adaptive and Self-Organizing Systems Workshops</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/6498375/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/SASOW.2012.14\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-ahans-soapp.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-ahans-soapp.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2012-conext-pvtcp-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-conext-pvtcp-1\">Evolving TCP: how hard can it be?</a> <span>/ Dec 2012</span></h2><p>Paper on extending TCP in a backwards compatible way at CoNeXT 2012</p>\n<blockquote><div><p><span><span>Zubair Nabi</span></span>, <span><span>Toby Moncaster</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2413247.2413270\">proceedings of the 2012 ACM conference on CoNEXT student workshop</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2413247.2413270\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2413247.2413270\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-conext-pvtcp.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-conext-pvtcp.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-conext-pvtcp-1\">#</a> 1st Dec 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>networks</span> <span>systems</span> <span>tcp</span></span></div>",+"content": "<p>Paper on extending TCP in a backwards compatible way at CoNeXT 2012</p>\n<blockquote><div><p><span><span>Zubair Nabi</span></span>, <span><span>Toby Moncaster</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2413247.2413270\">proceedings of the 2012 ACM conference on CoNEXT student workshop</a>.</p><p><a 
href=\"https://dl.acm.org/doi/10.1145/2413247.2413270\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2413247.2413270\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-conext-pvtcp.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-conext-pvtcp.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2012-cufp-scribe-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-cufp-scribe-1\">Commercial users of functional programming workshop report</a> <span>/ Nov 2013</span></h2><p>Published the scribe's report for CUFP 2012</p>\n<blockquote><div><p><a href=\"https://www.deinprogramm.de/sperber/\"><span>Michael Sperber</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/commercial-users-of-functional-programming-workshop-report/7B8E5D99E6C0D40D45B37D972B82598D\">Journal of Functional Programming</a> (vol 23 issue 6).</p><p><a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/commercial-users-of-functional-programming-workshop-report/7B8E5D99E6C0D40D45B37D972B82598D\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/S0956796813000257\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-cufp-scribe.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-cufp-scribe-1\">#</a> 1st Nov 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cufp</span> <span>fp</span> <span>icfp</span> <span>journal</span> <span>service</span></span></div>",+"content": "<p>Published the scribe's report for CUFP 2012</p>\n<blockquote><div><p><a href=\"https://www.deinprogramm.de/sperber/\"><span>Michael Sperber</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/commercial-users-of-functional-programming-workshop-report/7B8E5D99E6C0D40D45B37D972B82598D\">Journal of Functional Programming</a> (vol 23 issue 6).</p><p><a 
href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/commercial-users-of-functional-programming-workshop-report/7B8E5D99E6C0D40D45B37D972B82598D\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/S0956796813000257\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-cufp-scribe.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2012-iccsdn-mirageflow-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-iccsdn-mirageflow-1\">Cost, Performance & Flexibility in OpenFlow: Pick three</a> <span>/ Jun 2012</span></h2><p>Paper on using MirageOS for better SDN infrastructure with OpenFlow</p>\n<blockquote><div><p><a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/balrajsingh\"><span>Balraj Singh</span></a>, and <span><span>Andrew W. Moore</span></span>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/6364690/\">2012 IEEE International Conference on Communications (ICC)</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/6364690/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/ICC.2012.6364690\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-iccsdn-mirageflow.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-iccsdn-mirageflow.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-iccsdn-mirageflow-1\">#</a> 1st Jun 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>fp</span> <span>mirageos</span> <span>networks</span> <span>sdn</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on using MirageOS for better SDN infrastructure with OpenFlow</p>\n<blockquote><div><p><a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/balrajsingh\"><span>Balraj Singh</span></a>, and <span><span>Andrew W. 
Moore</span></span>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/6364690/\">2012 IEEE International Conference on Communications (ICC)</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/6364690/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/ICC.2012.6364690\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-iccsdn-mirageflow.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-iccsdn-mirageflow.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2012-mpm-caware-1.json
···+"title": "Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting",+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-mpm-caware-1\">Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting</a> <span>/ Apr 2012</span></h2><p>Paper on our use of data lockers within Cambridge to incentivise more green commuting patterns</p>\n<blockquote><div><p><span><span>Chris Elsmore</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Ian Leslie</span></span>, and <span><span>Amir Chaudhry</span></span>.</p><p>Paper in the <a href=\"https://doi.org/10.1145/2181196.2181201\">proceedings of the First Workshop on Measurement, Privacy, and Mobility</a>.</p><p><a href=\"https://doi.org/10.1145/2181196.2181201\">URL</a> <i>(doi.org)</i> <a href=\"https://doi.org/10.1145/2181196.2181201\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-mpm-caware.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-mpm-caware.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-mpm-caware-1\">#</a> 1st Apr 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carbon</span> <span>conference</span> <span>databox</span> <span>hci</span> <span>privacy</span> <span>sensing</span> <span>travel</span></span></div>",+"content": "<p>Paper on our use of data lockers within Cambridge to incentivise more green commuting patterns</p>\n<blockquote><div><p><span><span>Chris Elsmore</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Ian Leslie</span></span>, and <span><span>Amir Chaudhry</span></span>.</p><p>Paper in the <a href=\"https://doi.org/10.1145/2181196.2181201\">proceedings of the First Workshop on Measurement, Privacy, and 
Mobility</a>.</p><p><a href=\"https://doi.org/10.1145/2181196.2181201\">URL</a> <i>(doi.org)</i> <a href=\"https://doi.org/10.1145/2181196.2181201\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-mpm-caware.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-mpm-caware.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2012-oud-xen-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-oud-xen-1\">Programming the Xen cloud using OCaml</a> <span>/ Sep 2012</span></h2><p>Paper on programming the Xen cloud using OCaml at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the 1st ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2012-oud-xen.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-oud-xen.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-oud-xen-1\">#</a> 1st Sep 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>ocaml</span> <span>opensource</span> <span>xen</span></span></div>",+"content": "<p>Paper on programming the Xen cloud using OCaml at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the 1st ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2012-oud-xen.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-oud-xen.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2012-resolve-fable-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-resolve-fable-1\">The case for reconfigurable I/O channels</a> <span>/ Mar 2012</span></h2><p>Paper on a new design for reconfigurable IO that copes with heterogeneous software/hardware</p>\n<blockquote><div><p><a href=\"https://github.com/sos22\"><span>Steven Smith</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Christopher Smowton</span></span>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"http://www.watson.org/~robert/\"><span>Robert M Watson</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>.</p><p>Paper in the rESoLVE workshop at ASPLOS.</p><p><a href=\"https://anil.recoil.org/papers/2012-resolve-fable.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-resolve-fable.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-resolve-fable-1\">#</a> 1st Mar 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on a new design for reconfigurable IO that copes with heterogeneous software/hardware</p>\n<blockquote><div><p><a href=\"https://github.com/sos22\"><span>Steven Smith</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Christopher Smowton</span></span>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"http://www.watson.org/~robert/\"><span>Robert M Watson</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven 
Hand</span></a>.</p><p>Paper in the rESoLVE workshop at ASPLOS.</p><p><a href=\"https://anil.recoil.org/papers/2012-resolve-fable.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2012-resolve-fable.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2012-sigcomm-signposts-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2012-sigcomm-signposts-1\">Signposts: end-to-end networking in a world of middleboxes</a> <span>/ Sep 2012</span></h2><p>Demoed the Signposts DNSSEC system at SIGCOMM</p>\n<blockquote><div><p><span><span>Amir Chaudhry</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <span><span>Andrius Aucinas</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://about.me/sebastianprobsteide\"><span>Sebastian Probst Eide</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <span><span>Andrew W. Moore</span></span>, and <span><span>Narseo Vallina-Rodriguez</span></span>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/2377677.2377692\">ACM SIGCOMM Computer Communication Review</a> (vol 42 issue 4).</p><p><a href=\"https://dl.acm.org/doi/10.1145/2377677.2377692\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2377677.2377692\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-sigcomm-signposts.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2012-sigcomm-signposts-1\">#</a> 1st Sep 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>dns</span> <span>journal</span> <span>networks</span> <span>signposts</span></span></div>",+"content": "<p>Demoed the Signposts DNSSEC system at SIGCOMM</p>\n<blockquote><div><p><span><span>Amir Chaudhry</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <a href=\"https://github.com/mor1\"><span>Richard 
Mortier</span></a>, <span><span>Andrius Aucinas</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://about.me/sebastianprobsteide\"><span>Sebastian Probst Eide</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <span><span>Andrew W. Moore</span></span>, and <span><span>Narseo Vallina-Rodriguez</span></span>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/2377677.2377692\">ACM SIGCOMM Computer Communication Review</a> (vol 42 issue 4).</p><p><a href=\"https://dl.acm.org/doi/10.1145/2377677.2377692\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2377677.2377692\">DOI</a> <a href=\"https://anil.recoil.org/papers/2012-sigcomm-signposts.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2013-asplos-mirage-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2013-asplos-mirage-1\">Unikernels: library operating systems for the cloud</a> <span>/ Mar 2013</span></h2><p>The first paper on unikernels is published at ASPLOS 2013</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"https://github.com/balrajsingh\"><span>Balraj Singh</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://github.com/sos22\"><span>Steven Smith</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2451116.2451167\">proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2451116.2451167\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2451116.2451167\">DOI</a> <a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2013-asplos-mirage-1\">#</a> 1st Mar 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>mirageos</span> <span>ocaml</span> <span>security</span> <span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>The first paper on unikernels is published at ASPLOS 
2013</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"https://github.com/balrajsingh\"><span>Balraj Singh</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://github.com/sos22\"><span>Steven Smith</span></a>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2451116.2451167\">proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2451116.2451167\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2451116.2451167\">DOI</a> <a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2013-cufp-scribe-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2013-cufp-scribe-1\">CUFP'13 scribe's report</a> <span>/ Jan 2015</span></h2><p>Published the scribe's report for CUFP 2013 in JFP</p>\n<blockquote><div><p><a href=\"https://monkey.org/~marius/\"><span>Marius Eriksen</span></a>, <a href=\"https://www.deinprogramm.de/sperber/\"><span>Michael Sperber</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp13-scribes-report/F38AAE60DA9AD95E1737E3F863075C13\">Journal of Functional Programming</a> (vol 25).</p><p><a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp13-scribes-report/F38AAE60DA9AD95E1737E3F863075C13\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/S0956796815000052\">DOI</a> <a href=\"https://anil.recoil.org/papers/2013-cufp-scribe.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2013-cufp-scribe-1\">#</a> 1st Jan 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cufp</span> <span>fp</span> <span>icfp</span> <span>journal</span> <span>service</span></span></div>",+"content": "<p>Published the scribe's report for CUFP 2013 in JFP</p>\n<blockquote><div><p><a href=\"https://monkey.org/~marius/\"><span>Marius Eriksen</span></a>, <a href=\"https://www.deinprogramm.de/sperber/\"><span>Michael Sperber</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp13-scribes-report/F38AAE60DA9AD95E1737E3F863075C13\">Journal of Functional Programming</a> (vol 25).</p><p><a 
href=\"https://www.cambridge.org/core/journals/journal-of-functional-programming/article/cufp13-scribes-report/F38AAE60DA9AD95E1737E3F863075C13\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/S0956796815000052\">DOI</a> <a href=\"https://anil.recoil.org/papers/2013-cufp-scribe.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2013-foci-signposts-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2013-foci-signposts-1\">Lost in the Edge: Finding Your Way with DNSSEC Signposts</a> <span>/ Aug 2013</span></h2><p>Paper on DNSSEC-based Signpost servers for better p2p communications at USENIX FOCI</p>\n<blockquote><div><p><a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <span><span>Heidi Howard</span></span>, <span><span>David Sheets</span></span>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/foci13/workshop-program/presentation/rotsos\">3rd USENIX Workshop on Free and Open Communications on the Internet (FOCI 13)</a>.</p><p><a href=\"https://www.usenix.org/conference/foci13/workshop-program/presentation/rotsos\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2013-foci-signposts.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-foci-signposts.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2013-foci-signposts-1\">#</a> 1st Aug 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>distributed</span> <span>dns</span> <span>networks</span> <span>privacy</span> <span>security</span></span></div>",+"content": "<p>Paper on DNSSEC-based Signpost servers for better p2p communications at USENIX FOCI</p>\n<blockquote><div><p><a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\"><span>Charalampos Rotsos</span></a>, <span><span>Heidi Howard</span></span>, <span><span>David Sheets</span></span>, <a 
href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/foci13/workshop-program/presentation/rotsos\">3rd USENIX Workshop on Free and Open Communications on the Internet (FOCI 13)</a>.</p><p><a href=\"https://www.usenix.org/conference/foci13/workshop-program/presentation/rotsos\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2013-foci-signposts.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-foci-signposts.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2013-hotnets-trevi-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2013-hotnets-trevi-1\">Trevi: watering down storage hotspots with cool fountain codes</a> <span>/ Nov 2013</span></h2><p>Paper on fountain coding for datacentre networking at HotNets 2013</p>\n<blockquote><div><p><a href=\"http://georgeparisis.github.io\"><span>George Parisis</span></a>, <span><span>Toby Moncaster</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2535771.2535781\">proceedings of the Twelfth ACM Workshop on Hot Topics in Networks</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2535771.2535781\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2535771.2535781\">DOI</a> <a href=\"https://anil.recoil.org/papers/2013-hotnets-trevi.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-hotnets-trevi.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2013-hotnets-trevi-1\">#</a> 1st Nov 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>distributed</span> <span>networks</span> <span>storage</span> <span>systems</span></span></div>",+"content": "<p>Paper on fountain coding for datacentre networking at HotNets 2013</p>\n<blockquote><div><p><a href=\"http://georgeparisis.github.io\"><span>George Parisis</span></a>, <span><span>Toby Moncaster</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2535771.2535781\">proceedings of the Twelfth ACM Workshop on Hot Topics in Networks</a>.</p><p><a 
href=\"https://dl.acm.org/doi/10.1145/2535771.2535781\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2535771.2535781\">DOI</a> <a href=\"https://anil.recoil.org/papers/2013-hotnets-trevi.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-hotnets-trevi.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2013-ocamlot-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2013-ocamlot-1\">Ocamlot: Online OCaml Testing</a> <span>/ Sep 2013</span></h2><p>Presented an OCaml ecosystem testing system</p>\n<blockquote><div><p><span><span>David Sheets</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, and <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>.</p><p>Paper in the <a href=\"https://github.com/ocaml/v2.ocaml.org/blob/master/site/meetings/ocaml/2013/proposals/ocamlot.pdf\">3rd ACM OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://github.com/ocaml/v2.ocaml.org/blob/master/site/meetings/ocaml/2013/proposals/ocamlot.pdf\">URL</a> <i>(github.com)</i> <a href=\"https://anil.recoil.org/papers/2013-ocamlot.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-ocamlot.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2013-ocamlot-1\">#</a> 1st Sep 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>ocaml</span> <span>systems</span> <span>testing</span></span></div>",+"content": "<p>Presented an OCaml ecosystem testing system</p>\n<blockquote><div><p><span><span>David Sheets</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, and <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>.</p><p>Paper in the <a href=\"https://github.com/ocaml/v2.ocaml.org/blob/master/site/meetings/ocaml/2013/proposals/ocamlot.pdf\">3rd ACM OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://github.com/ocaml/v2.ocaml.org/blob/master/site/meetings/ocaml/2013/proposals/ocamlot.pdf\">URL</a> <i>(github.com)</i> <a href=\"https://anil.recoil.org/papers/2013-ocamlot.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2013-ocamlot.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2013-oud-platform-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2013-oud-platform-1\">The OCaml Platform v0.1</a> <span>/ Sep 2013</span></h2><p>Paper on the OCaml Platform at the OCaml Workshop 2013</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>David Sheets</span></span>, <span><span>Phillipe Wang</span></span>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, and <a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>.</p><p>Paper in the 2nd ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2013-oud-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-oud-platform.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2013-oud-platform-1\">#</a> 1st Sep 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on the OCaml Platform at the OCaml Workshop 2013</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>David Sheets</span></span>, <span><span>Phillipe Wang</span></span>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, and <a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>.</p><p>Paper in the 2nd ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2013-oud-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2013-oud-platform.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2014-oud-irminsule-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2014-oud-irminsule-1\">Irminsule: a branch-consistent distributed library database</a> <span>/ Sep 2014</span></h2><p>Paper at the OCaml Workshop on the Irmin database library</p>\n<blockquote><div><p><a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>Amir Chaudhry</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <span><span>David Sheets</span></span>, and <span><span>Gregory Tsipenyuk</span></span>.</p><p>Paper in the 4th ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2014-oud-irminsule.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-oud-irminsule.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2014-oud-irminsule-1\">#</a> 1st Sep 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>distributed</span> <span>irmin</span> <span>mirageos</span> <span>storage</span></span></div>",+"content": "<p>Paper at the OCaml Workshop on the Irmin database library</p>\n<blockquote><div><p><a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>Amir Chaudhry</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <span><span>David Sheets</span></span>, and <span><span>Gregory Tsipenyuk</span></span>.</p><p>Paper in the 4th ACM 
OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2014-oud-irminsule.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-oud-irminsule.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2014-oud-multicore-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2014-oud-multicore-1\">Multicore OCaml</a> <span>/ Sep 2014</span></h2><p>First paper on multicore OCaml's design at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the 4th ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2014-oud-multicore.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-oud-multicore.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2014-oud-multicore-1\">#</a> 1st Sep 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>effects</span> <span>multicore</span> <span>ocaml</span></span></div>",+"content": "<p>First paper on multicore OCaml's design at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the 4th ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2014-oud-multicore.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-oud-multicore.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2014-oud-platform-1.json
+18
avsm/news_2014-oud-platform-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2014-oud-platform-1\">The OCaml Platform v1.0</a> <span>/ Sep 2014</span></h2><p>Paper on the OCaml Platform status</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, <span><span>Jeremie Dimino</span></span>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>Louis Gesbert</span></span>, <a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <span><span>David Sheets</span></span>, <a href=\"https://github.com/mshinwell\"><span>Mark Shinwell</span></a>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, and <a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>.</p><p>Paper in the 4th ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2014-oud-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-oud-platform.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2014-oud-platform-1\">#</a> 1st Sep 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>ocaml</span> <span>testing</span></span></div>",+"content": "<p>Paper on the OCaml Platform status</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Amir Chaudhry</span></span>, <span><span>Jeremie Dimino</span></span>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>Louis Gesbert</span></span>, <a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <span><span>David Sheets</span></span>, <a href=\"https://github.com/mshinwell\"><span>Mark Shinwell</span></a>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, and <a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>.</p><p>Paper in the 4th ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2014-oud-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-oud-platform.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2014-regional-clouds-1.json
+18
avsm/news_2014-regional-clouds-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2014-regional-clouds-1\">Regional clouds: technical considerations</a> <span>/ Nov 2014</span></h2><p>Report on regional cloud computing law available</p>\n<blockquote><div><p><a href=\"https://www.cl.cam.ac.uk/~js573/\"><span>Jatinder Singh</span></a>, <a href=\"https://www.cl.cam.ac.uk/~jmb25/\"><span>Jean Bacon</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://tfjmp.org\"><span>Thomas Pasquier</span></a>, <a href=\"https://www.kuan0.com\"><span>W. Kuan Hon</span></a>, and <a href=\"https://www.qmul.ac.uk/law/people/academic-staff/items/millard.html\"><span>Christopher Millard</span></a>.</p><p>Technical report (UCAM-CL-TR-863) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-863.html\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-863.html\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-863\">DOI</a> <a href=\"https://anil.recoil.org/papers/2014-regional-clouds.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-regional-clouds.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2014-regional-clouds-1\">#</a> 1st Nov 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>legal</span> <span>report</span></span></div>",+"content": "<p>Report on regional cloud computing law available</p>\n<blockquote><div><p><a href=\"https://www.cl.cam.ac.uk/~js573/\"><span>Jatinder Singh</span></a>, <a href=\"https://www.cl.cam.ac.uk/~jmb25/\"><span>Jean Bacon</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil 
Madhavapeddy</span></a>, <a href=\"https://tfjmp.org\"><span>Thomas Pasquier</span></a>, <a href=\"https://www.kuan0.com\"><span>W. Kuan Hon</span></a>, and <a href=\"https://www.qmul.ac.uk/law/people/academic-staff/items/millard.html\"><span>Christopher Millard</span></a>.</p><p>Technical report (UCAM-CL-TR-863) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-863.html\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-863.html\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-863\">DOI</a> <a href=\"https://anil.recoil.org/papers/2014-regional-clouds.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-regional-clouds.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2014-sigops-raft-1.json
+18
avsm/news_2014-sigops-raft-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2014-sigops-raft-1\">Raft Refloated: Do We Have Consensus?</a> <span>/ Jan 2015</span></h2><p>Paper on reproducing the raft consensus protocol</p>\n<blockquote><div><p><span><span>Heidi Howard</span></span>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/2723872.2723876\">ACM SIGOPS Operating Systems Review</a> (vol 49 issue 1).</p><p><a href=\"https://dl.acm.org/doi/10.1145/2723872.2723876\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2723872.2723876\">DOI</a> <a href=\"https://anil.recoil.org/papers/2014-sigops-raft.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-sigops-raft.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2014-sigops-raft-1\">#</a> 1st Jan 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>consensus</span> <span>distributed</span> <span>formal</span> <span>journal</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on reproducing the raft consensus protocol</p>\n<blockquote><div><p><span><span>Heidi Howard</span></span>, <a href=\"https://cs.brown.edu/people/malte/\"><span>Malte Schwarzkopf</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/2723872.2723876\">ACM SIGOPS Operating Systems Review</a> (vol 49 issue 1).</p><p><a href=\"https://dl.acm.org/doi/10.1145/2723872.2723876\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2723872.2723876\">DOI</a> 
<a href=\"https://anil.recoil.org/papers/2014-sigops-raft.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2014-sigops-raft.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2015-aarhus-databox-1.json
+18
avsm/news_2015-aarhus-databox-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2015-aarhus-databox-1\">Personal Data: Thinking Inside the Box</a> <span>/ Oct 2015</span></h2><p>Paper on personal databoxes at the one-in-a-decade Aarhus conference</p>\n<blockquote><div><p><span><span>Amir Chaudhry</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <span><span>Heidi Howard</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>.</p><p>Journal paper in <a href=\"https://tidsskrift.dk/ashcc/article/view/21312\">Aarhus Series on Human Centered Computing</a> (vol 1 issue 1).</p><p><a href=\"https://tidsskrift.dk/ashcc/article/view/21312\">URL</a> <i>(tidsskrift.dk)</i> <a href=\"https://doi.org/10.7146/aahcc.v1i1.21312\">DOI</a> <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2015-aarhus-databox-1\">#</a> 1st Oct 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>databox</span> <span>journal</span> <span>privacy</span> <span>security</span> <span>selfhosting</span></span></div>",+"content": "<p>Paper on personal databoxes at the one-in-a-decade Aarhus conference</p>\n<blockquote><div><p><span><span>Amir Chaudhry</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <span><span>Heidi Howard</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a 
href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>.</p><p>Journal paper in <a href=\"https://tidsskrift.dk/ashcc/article/view/21312\">Aarhus Series on Human Centered Computing</a> (vol 1 issue 1).</p><p><a href=\"https://tidsskrift.dk/ashcc/article/view/21312\">URL</a> <i>(tidsskrift.dk)</i> <a href=\"https://doi.org/10.7146/aahcc.v1i1.21312\">DOI</a> <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2015-diynet-kadupul-1.json
+18
avsm/news_2015-diynet-kadupul-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2015-diynet-kadupul-1\">Kadupul: Livin' on the Edge with Virtual Currencies and Time-Locked Puzzles</a> <span>/ May 2015</span></h2><p>Workshop paper on DIY networking using timelock puzzles</p>\n<blockquote><div><p><a href=\"http://www.skjegstad.com/about/\"><span>Magnus Skjegstad</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2753488.2753492\">proceedings of the 2015 Workshop on Do-it-yourself Networking: an Interdisciplinary Approach</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2753488.2753492\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2753488.2753492\">DOI</a> <a href=\"https://anil.recoil.org/papers/2015-diynet-kadupul.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-diynet-kadupul.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2015-diynet-kadupul-1\">#</a> 1st May 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>crypto</span> <span>distributed</span> <span>economics</span> <span>embedded</span> <span>wireless</span></span></div>",+"content": "<p>Workshop paper on DIY networking using timelock puzzles</p>\n<blockquote><div><p><a href=\"http://www.skjegstad.com/about/\"><span>Magnus Skjegstad</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2753488.2753492\">proceedings of the 2015 Workshop on Do-it-yourself Networking: an Interdisciplinary Approach</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2753488.2753492\">URL</a> 
<i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2753488.2753492\">DOI</a> <a href=\"https://anil.recoil.org/papers/2015-diynet-kadupul.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-diynet-kadupul.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2015-jfla-irmin-1.json
+18
avsm/news_2015-jfla-irmin-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2015-jfla-irmin-1\">Mergeable persistent data structures</a> <span>/ Jan 2015</span></h2><p>Paper on mergeable data structures using Irmin (nee Irminsule) at JFLA 2015</p>\n<blockquote><div><p><span><span>Benjamin Farinier</span></span>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the vingt-sixiemes Journees Francophones des Langages Applicatifs (JFLA 2015).</p><p><a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2015-jfla-irmin-1\">#</a> 1st Jan 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>irmin</span> <span>mirageos</span> <span>storage</span></span></div>",+"content": "<p>Paper on mergeable data structures using Irmin (nee Irminsule) at JFLA 2015</p>\n<blockquote><div><p><span><span>Benjamin Farinier</span></span>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the vingt-sixiemes Journees Francophones des Langages Applicatifs (JFLA 2015).</p><p><a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2015-nsdi-jitsu-1.json
+18
avsm/news_2015-nsdi-jitsu-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2015-nsdi-jitsu-1\">Jitsu: Just-In-Time Summoning of Unikernels</a> <span>/ May 2015</span></h2><p>Paper on spinning up low-latency unikernels per-connection at NSDI 2015</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"http://www.skjegstad.com/about/\"><span>Magnus Skjegstad</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>David Sheets</span></span>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <span><span>Amir Chaudhry</span></span>, <a href=\"https://github.com/balrajsingh\"><span>Balraj Singh</span></a>, <a href=\"https://github.com/jonludlam\"><span>Jon Ludlam</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <span><span>Ian Leslie</span></span>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/madhavapeddy\">12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15)</a>.</p><p><a href=\"https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/madhavapeddy\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2015-nsdi-jitsu-1\">#</a> 1st May 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>dns</span> <span>embedded</span> <span>mirageos</span> <span>security</span> <span>systems</span></span></div>",+"content": "<p>Paper on spinning up low-latency unikernels 
per-connection at NSDI 2015</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"http://www.skjegstad.com/about/\"><span>Magnus Skjegstad</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <span><span>David Sheets</span></span>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <span><span>Amir Chaudhry</span></span>, <a href=\"https://github.com/balrajsingh\"><span>Balraj Singh</span></a>, <a href=\"https://github.com/jonludlam\"><span>Jon Ludlam</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <span><span>Ian Leslie</span></span>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/madhavapeddy\">12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15)</a>.</p><p><a href=\"https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/madhavapeddy\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2015-sosp-sibylfs-1.json
+18
avsm/news_2015-sosp-sibylfs-1.json
···+"title": "SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems",+"summary": "<h2><a href=\"https://anil.recoil.org/news/2015-sosp-sibylfs-1\">SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems</a> <span>/ Oct 2015</span></h2><p>Paper on formal specification and testing of filesystems at SOSP 2015</p>\n<blockquote><div><p><a href=\"https://www.tom-ridge.com\"><span>Tom Ridge</span></a>, <span><span>David Sheets</span></span>, <span><span>Thomas Tuerk</span></span>, <a href=\"https://www.cs.le.ac.uk/people/ag400/\"><span>Andrea Giugliano</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cl.cam.ac.uk/~pes20/\"><span>Peter Sewell</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2815400.2815411\">proceedings of the 25th Symposium on Operating Systems Principles</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2815400.2815411\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2815400.2815411\">DOI</a> <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2015-sosp-sibylfs-1\">#</a> 1st Oct 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>formal</span> <span>linux</span> <span>storage</span> <span>systems</span> <span>testing</span></span></div>",+"content": "<p>Paper on formal specification and testing of filesystems at SOSP 2015</p>\n<blockquote><div><p><a href=\"https://www.tom-ridge.com\"><span>Tom Ridge</span></a>, <span><span>David Sheets</span></span>, <span><span>Thomas Tuerk</span></span>, <a href=\"https://www.cs.le.ac.uk/people/ag400/\"><span>Andrea Giugliano</span></a>, 
<a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cl.cam.ac.uk/~pes20/\"><span>Peter Sewell</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/2815400.2815411\">proceedings of the 25th Symposium on Operating Systems Principles</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/2815400.2815411\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2815400.2815411\">DOI</a> <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2015-usenixsec-nqsb-1.json
+18
avsm/news_2015-usenixsec-nqsb-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2015-usenixsec-nqsb-1\">Not-Quite-So-Broken TLS</a> <span>/ Aug 2015</span></h2><p>Paper on rebuilding TLS securely but practically at USENIX Security 2015</p>\n<blockquote><div><p><a href=\"https://github.com/pqwy\"><span>David Kaloper-Mersinjak</span></a>, <a href=\"https://github.com/hannesm\"><span>Hannes Mehnert</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cl.cam.ac.uk/~pes20/\"><span>Peter Sewell</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/kaloper-mersinjak\">24th USENIX Security Symposium (USENIX Security 15)</a>.</p><p><a href=\"https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/kaloper-mersinjak\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2015-usenixsec-nqsb.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-usenixsec-nqsb.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2015-usenixsec-nqsb-1\">#</a> 1st Aug 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>mirageos</span> <span>ocaml</span> <span>security</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on rebuilding TLS securely but practically at USENIX Security 2015</p>\n<blockquote><div><p><a href=\"https://github.com/pqwy\"><span>David Kaloper-Mersinjak</span></a>, <a href=\"https://github.com/hannesm\"><span>Hannes Mehnert</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cl.cam.ac.uk/~pes20/\"><span>Peter Sewell</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/kaloper-mersinjak\">24th USENIX 
Security Symposium (USENIX Security 15)</a>.</p><p><a href=\"https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/kaloper-mersinjak\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2015-usenixsec-nqsb.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2015-usenixsec-nqsb.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2016-flops-cmeleon-1.json
+18
avsm/news_2016-flops-cmeleon-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2016-flops-cmeleon-1\">Declarative Foreign Function Binding Through Generic Programming</a> <span>/ Feb 2016</span></h2><p>Paper on declarative approaches to foreign function bindings at FLOPS 2016</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>, <span><span>David Sheets</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"http://link.springer.com/10.1007/978-3-319-29604-3_13\">proceedings of Functional and Logic Programming (FLOPS)</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-319-29604-3_13\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-319-29604-3_13\">DOI</a> <a href=\"https://anil.recoil.org/papers/2016-flops-cmeleon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2016-flops-cmeleon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2016-flops-cmeleon-1\">#</a> 1st Feb 2016 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>ffi</span> <span>ocaml</span> <span>staged</span></span></div>",+"content": "<p>Paper on declarative approaches to foreign function bindings at FLOPS 2016</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>, <span><span>David Sheets</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"http://link.springer.com/10.1007/978-3-319-29604-3_13\">proceedings of Functional and Logic Programming (FLOPS)</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-319-29604-3_13\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-319-29604-3_13\">DOI</a> <a 
href=\"https://anil.recoil.org/papers/2016-flops-cmeleon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2016-flops-cmeleon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2016-usenix-flick-1.json
+18
avsm/news_2016-usenix-flick-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2016-usenix-flick-1\">FLICK: Developing and Running Application-Specific Network Services</a> <span>/ Jun 2016</span></h2><p>Paper on application-specific network services at USENIX ATC 2016</p>\n<blockquote><div><p><span><span>Abdul Alim</span></span>, <a href=\"https://www.richardclegg.org/about\"><span>Richard Clegg</span></a>, <span><span>Luo Mai</span></span>, <span><span>Lukas Rupprecht</span></span>, <a href=\"https://seckler.org\"><span>Eric Seckler</span></a>, <span><span>Paolo Costa</span></span>, <a href=\"https://profiles.imperial.ac.uk/prp\"><span>Peter Pietzuch</span></a>, <span><span>Alexander L Wolf</span></span>, <span><span>Nik Sultana</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Andrew W. Moore</span></span>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <span><span>Masoud Koleni</span></span>, <span><span>Luis Oviedo</span></span>, <span><span>Matteo Miliavacca</span></span>, and <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/atc16/technical-sessions/presentation/alim\">2016 USENIX Annual Technical Conference (USENIX ATC 16)</a>.</p><p><a href=\"https://www.usenix.org/conference/atc16/technical-sessions/presentation/alim\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2016-usenix-flick.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2016-usenix-flick.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2016-usenix-flick-1\">#</a> 1st Jun 2016 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>distributed</span> 
<span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on application-specific network services at USENIX ATC 2016</p>\n<blockquote><div><p><span><span>Abdul Alim</span></span>, <a href=\"https://www.richardclegg.org/about\"><span>Richard Clegg</span></a>, <span><span>Luo Mai</span></span>, <span><span>Lukas Rupprecht</span></span>, <a href=\"https://seckler.org\"><span>Eric Seckler</span></a>, <span><span>Paolo Costa</span></span>, <a href=\"https://profiles.imperial.ac.uk/prp\"><span>Peter Pietzuch</span></a>, <span><span>Alexander L Wolf</span></span>, <span><span>Nik Sultana</span></span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Andrew W. Moore</span></span>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <span><span>Masoud Koleni</span></span>, <span><span>Luis Oviedo</span></span>, <span><span>Matteo Miliavacca</span></span>, and <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>.</p><p>Paper in the <a href=\"https://www.usenix.org/conference/atc16/technical-sessions/presentation/alim\">2016 USENIX Annual Technical Conference (USENIX ATC 16)</a>.</p><p><a href=\"https://www.usenix.org/conference/atc16/technical-sessions/presentation/alim\">URL</a> <i>(usenix.org)</i> <a href=\"https://anil.recoil.org/papers/2016-usenix-flick.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2016-usenix-flick.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2017-ml-effects-1.json
+18
avsm/news_2017-ml-effects-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2017-ml-effects-1\">Effectively tackling the awkward squad</a> <span>/ Sep 2017</span></h2><p>Paper on how to tackle awkward IO patterns with effect handlers</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/seliopou\"><span>Spiros Eliopoulos</span></a>, <span><span>Daniel Hillerstrom</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>.</p><p>Paper in the ACM ML Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2017-ml-effects.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2017-ml-effects.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2017-ml-effects-1\">#</a> 1st Sep 2017 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>effects</span> <span>fp</span> <span>multicore</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on how to tackle awkward IO patterns with effect handlers</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/seliopou\"><span>Spiros Eliopoulos</span></a>, <span><span>Daniel Hillerstrom</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>.</p><p>Paper in the ACM ML Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2017-ml-effects.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2017-ml-effects.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2017-oud-platform-1.json
+18
avsm/news_2017-oud-platform-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2017-oud-platform-1\">The State of the OCaml Platform: Sep 2017</a> <span>/ Sep 2017</span></h2><p>Annual update on the OCaml Platform at ICFP</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the 7th ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2017-oud-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2017-oud-platform.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2017-oud-platform-1\">#</a> 1st Sep 2017 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>ocaml</span></span></div>",+"content": "<p>Annual update on the OCaml Platform at ICFP</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the 7th ACM OCaml Users and Developers Workshop.</p><p><a href=\"https://anil.recoil.org/papers/2017-oud-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2017-oud-platform.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2017-snapl-dali-1.json
+18
avsm/news_2017-snapl-dali-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2017-snapl-dali-1\">DaLi: Database as a Library</a> <span>/ May 2017</span></h2><p>Position paper on building databases-as-a-library at SNAPL 2017</p>\n<blockquote><div><p><a href=\"https://gowthamk.github.io\"><span>Gowtham Kaki</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cs.purdue.edu/people/faculty/sjaganna.html\"><span>Suresh Jagannathan</span></a>.</p><p>Paper in the 2nd Summit on Advances in Programming Languages (SNAPL).</p><p><a href=\"https://anil.recoil.org/papers/2017-snapl-dali.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2017-snapl-dali.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2017-snapl-dali-1\">#</a> 1st May 2017 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>databases</span> <span>fp</span> <span>irmin</span> <span>ocaml</span> <span>storage</span> <span>unikernels</span></span></div>",+"content": "<p>Position paper on building databases-as-a-library at SNAPL 2017</p>\n<blockquote><div><p><a href=\"https://gowthamk.github.io\"><span>Gowtham Kaki</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cs.purdue.edu/people/faculty/sjaganna.html\"><span>Suresh Jagannathan</span></a>.</p><p>Paper in the 2nd Summit on Advances in Programming Languages (SNAPL).</p><p><a href=\"https://anil.recoil.org/papers/2017-snapl-dali.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2017-snapl-dali.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2017-tfp-effecthandlers-1.json
+18
avsm/news_2017-tfp-effecthandlers-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2017-tfp-effecthandlers-1\">Concurrent System Programming with Effect Handlers</a> <span>/ Apr 2018</span></h2><p>Paper on concurrent systems programming with effect handlers at TFP 2017</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/seliopou\"><span>Spiros Eliopoulos</span></a>, <span><span>Daniel Hillerstrom</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/978-3-319-89719-6_6\">Trends in Functional Programming</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-319-89719-6_6\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-319-89719-6_6\">DOI</a> <a href=\"https://anil.recoil.org/papers/2017-tfp-effecthandlers.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2017-tfp-effecthandlers.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2017-tfp-effecthandlers-1\">#</a> 1st Apr 2018 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>effects</span> <span>fp</span> <span>multicore</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on concurrent systems programming with effect handlers at TFP 2017</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/seliopou\"><span>Spiros Eliopoulos</span></a>, <span><span>Daniel Hillerstrom</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a 
href=\"https://github.com/lpw25\"><span>Leo White</span></a>.</p><p>Paper in <a href=\"http://link.springer.com/10.1007/978-3-319-89719-6_6\">Trends in Functional Programming</a>.</p><p><a href=\"http://link.springer.com/10.1007/978-3-319-89719-6_6\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-319-89719-6_6\">DOI</a> <a href=\"https://anil.recoil.org/papers/2017-tfp-effecthandlers.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2017-tfp-effecthandlers.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2018-hotpost-osmose-1.json
+18
avsm/news_2018-hotpost-osmose-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2018-hotpost-osmose-1\">An architecture for interspatial communication</a> <span>/ Apr 2018</span></h2><p>Paper on the interspatial networking architecture at HotPOST 2018</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <span><span>Gemma Gordon</span></span>, and <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/8406931/\">proceedings of the HotPOST 2018 workshop at the IEEE Conference on Computer Communications</a>.</p><p><a href=\"https://ieeexplore.ieee.org/document/8406931/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/INFCOMW.2018.8406931\">DOI</a> <a href=\"https://anil.recoil.org/papers/2018-hotpost-osmose.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2018-hotpost-osmose.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2018-hotpost-osmose-1\">#</a> 1st Apr 2018 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>irmin</span> <span>mirageos</span> <span>networking</span> <span>spatial</span> <span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on the interspatial networking architecture at HotPOST 2018</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <span><span>Gemma Gordon</span></span>, and <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>.</p><p>Paper in the <a href=\"https://ieeexplore.ieee.org/document/8406931/\">proceedings of the HotPOST 2018 workshop at the IEEE Conference on Computer Communications</a>.</p><p><a 
href=\"https://ieeexplore.ieee.org/document/8406931/\">URL</a> <i>(ieeexplore.ieee.org)</i> <a href=\"https://doi.org/10.1109/INFCOMW.2018.8406931\">DOI</a> <a href=\"https://anil.recoil.org/papers/2018-hotpost-osmose.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2018-hotpost-osmose.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2018-pldi-memorymodel-1.json
+18
avsm/news_2018-pldi-memorymodel-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2018-pldi-memorymodel-1\">Bounding data races in space and time</a> <span>/ Jun 2018</span></h2><p>Paper on the OCaml memory model and underlying theory at PLDI 2018</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3192366.3192421\">proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3192366.3192421\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3192366.3192421\">DOI</a> <a href=\"https://anil.recoil.org/papers/2018-pldi-memorymodel.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2018-pldi-memorymodel.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2018-pldi-memorymodel-1\">#</a> 1st Jun 2018 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>fp</span> <span>memory</span> <span>multicore</span> <span>ocaml</span> <span>systems</span></span></div>",+"content": "<p>Paper on the OCaml memory model and underlying theory at PLDI 2018</p>\n<blockquote><div><p><a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3192366.3192421\">proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3192366.3192421\">URL</a> <i>(dl.acm.org)</i> <a 
href=\"https://doi.org/10.1145/3192366.3192421\">DOI</a> <a href=\"https://anil.recoil.org/papers/2018-pldi-memorymodel.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2018-pldi-memorymodel.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2018-socp-modular-ffi-1.json
+18
avsm/news_2018-socp-modular-ffi-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2018-socp-modular-ffi-1\">A modular foreign function interface</a> <span>/ Oct 2018</span></h2><p>Journal paper on building modular foreign function interfaces</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>, <span><span>David Sheets</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0167642317300709\">Science of Computer Programming</a> (vol 164).</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0167642317300709\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/j.scico.2017.04.002\">DOI</a> <a href=\"https://anil.recoil.org/papers/2018-socp-modular-ffi.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2018-socp-modular-ffi.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2018-socp-modular-ffi-1\">#</a> 1st Oct 2018 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ffi</span> <span>journal</span> <span>ocaml</span> <span>staged</span></span></div>",+"content": "<p>Journal paper on building modular foreign function interfaces</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>, <span><span>David Sheets</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0167642317300709\">Science of Computer Programming</a> (vol 164).</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0167642317300709\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/j.scico.2017.04.002\">DOI</a> <a 
href=\"https://anil.recoil.org/papers/2018-socp-modular-ffi.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2018-socp-modular-ffi.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2019-edgesys-snape-1.json
+18
avsm/news_2019-edgesys-snape-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2019-edgesys-snape-1\">Snape: The Dark Art of Handling Heterogeneous Enclaves</a> <span>/ Mar 2019</span></h2><p>Paper on a framework to rearchitect applications for better TEE support at EdgeSys 2019</p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3301418.3313945\">proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3301418.3313945\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3301418.3313945\">DOI</a> <a href=\"https://anil.recoil.org/papers/2019-edgesys-snape.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2019-edgesys-snape.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2019-edgesys-snape-1\">#</a> 1st Mar 2019 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>kernel</span> <span>security</span> <span>systems</span> <span>tee</span></span></div>",+"content": "<p>Paper on a framework to rearchitect applications for better TEE support at EdgeSys 2019</p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3301418.3313945\">proceedings of the 2nd International Workshop on Edge Systems, Analytics and Networking</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3301418.3313945\">URL</a> <i>(dl.acm.org)</i> <a 
href=\"https://doi.org/10.1145/3301418.3313945\">DOI</a> <a href=\"https://anil.recoil.org/papers/2019-edgesys-snape.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2019-edgesys-snape.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2019-mirage-build-1.json
+18
avsm/news_2019-mirage-build-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2019-mirage-build-1\">MirageOS 4: the dawn of practical build systems for exotic targets</a> <span>/ Aug 2019</span></h2><p>Paper on the MirageOS 4 build system at OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://www.lortex.org\"><span>Lucas Pluvinage</span></a>, <span><span>Romain Calascibetta</span></span>, <span><span>Rudi Grinberg</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp19.sigplan.org/home/ocaml-2019#program\">proceedings of the OCaml Workshop 2019</a>.</p><p><a href=\"https://icfp19.sigplan.org/home/ocaml-2019#program\">URL</a> <i>(icfp19.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2019-mirage-build.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2019-mirage-build.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2019-mirage-build-1\">#</a> 1st Aug 2019 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>mirageos</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on the MirageOS 4 build system at OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://www.lortex.org\"><span>Lucas Pluvinage</span></a>, <span><span>Romain Calascibetta</span></span>, <span><span>Rudi Grinberg</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp19.sigplan.org/home/ocaml-2019#program\">proceedings of the OCaml Workshop 2019</a>.</p><p><a href=\"https://icfp19.sigplan.org/home/ocaml-2019#program\">URL</a> <i>(icfp19.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2019-mirage-build.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2019-mirage-build.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2019-mirage-functors-1.json
+18
avsm/news_2019-mirage-functors-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2019-mirage-functors-1\">Programming Unikernels in the Large via Functor Driven Development</a> <span>/ May 2019</span></h2><p>Preprint on programming unikernels with ML modules</p>\n<blockquote><div><p><a href=\"https://www.irif.fr/~gradanne/\"><span>Gabriel Radanne</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://github.com/hannesm\"><span>Hannes Mehnert</span></a>, <a href=\"https://github.com/yomimono\"><span>Mindy Preston</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/1905.02529\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/1905.02529\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.1905.02529\">DOI</a> <a href=\"https://anil.recoil.org/papers/2019-mirage-functors.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2019-mirage-functors.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2019-mirage-functors-1\">#</a> 1st May 2019 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>mirageos</span> <span>ocaml</span> <span>preprint</span> <span>unikernels</span></span></div>",+"content": "<p>Preprint on programming unikernels with ML modules</p>\n<blockquote><div><p><a href=\"https://www.irif.fr/~gradanne/\"><span>Gabriel Radanne</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/jdy22\"><span>Jeremy Yallop</span></a>, 
<a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://github.com/hannesm\"><span>Hannes Mehnert</span></a>, <a href=\"https://github.com/yomimono\"><span>Mindy Preston</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/1905.02529\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/1905.02529\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.1905.02529\">DOI</a> <a href=\"https://anil.recoil.org/papers/2019-mirage-functors.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2019-mirage-functors.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2019-ocaml-platform-1.json
+18
avsm/news_2019-ocaml-platform-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2019-ocaml-platform-1\">The OCaml Platform in 2019</a> <span>/ Aug 2019</span></h2><p>Annual update on the OCaml Platform in 2019</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <span><span>Gemma Gordon</span></span>.</p><p>Paper in the <a href=\"https://icfp19.sigplan.org/home/ocaml-2019\">proceedings of the OCaml Workshop 2019</a>.</p><p><a href=\"https://icfp19.sigplan.org/home/ocaml-2019\">URL</a> <i>(icfp19.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2019-ocaml-platform.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2019-ocaml-platform-1\">#</a> 1st Aug 2019 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>ocaml</span></span></div>",+"content": "<p>Annual update on the OCaml Platform in 2019</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <span><span>Gemma Gordon</span></span>.</p><p>Paper in the <a href=\"https://icfp19.sigplan.org/home/ocaml-2019\">proceedings of the OCaml Workshop 2019</a>.</p><p><a href=\"https://icfp19.sigplan.org/home/ocaml-2019\">URL</a> <i>(icfp19.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2019-ocaml-platform.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2020-asplas-banyan-1.json
+18
avsm/news_2020-asplas-banyan-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2020-asplas-banyan-1\">Banyan: Coordination-Free Distributed Transactions over Mergeable Types</a> <span>/ Nov 2020</span></h2><p>Paper on Banyan for coordination-free distributed transactions at APLAS 2020</p>\n<blockquote><div><p><span><span>Shashank Shekhar Dubey</span></span>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://link.springer.com/10.1007/978-3-030-64437-6_12\">proceedings of the Asian Symposium on Programming Languages and Systems</a>.</p><p><a href=\"https://link.springer.com/10.1007/978-3-030-64437-6_12\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-030-64437-6_12\">DOI</a> <a href=\"https://anil.recoil.org/papers/2020-asplas-banyan.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2020-asplas-banyan.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2020-asplas-banyan-1\">#</a> 1st Nov 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>distributed</span> <span>fp</span> <span>irmin</span> <span>mirageos</span> <span>ocaml</span> <span>storage</span></span></div>",+"content": "<p>Paper on Banyan for coordination-free distributed transactions at APLAS 2020</p>\n<blockquote><div><p><span><span>Shashank Shekhar Dubey</span></span>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://link.springer.com/10.1007/978-3-030-64437-6_12\">proceedings of the Asian Symposium on 
Programming Languages and Systems</a>.</p><p><a href=\"https://link.springer.com/10.1007/978-3-030-64437-6_12\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-030-64437-6_12\">DOI</a> <a href=\"https://anil.recoil.org/papers/2020-asplas-banyan.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2020-asplas-banyan.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2020-icfp-retropar-1.json
+18
avsm/news_2020-icfp-retropar-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2020-icfp-retropar-1\">Retrofitting parallelism onto OCaml</a> <span>/ Aug 2020</span></h2><p>Won best paper award at ICFP 2020 for our paper on retrofitting parallelism onto OCaml!</p>\n<blockquote><div><p><a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <span><span>Anmol Sahoo</span></span>, <a href=\"https://github.com/Sudha247\"><span>Sudha Parimala</span></a>, <span><span>Atul Dhiman</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/3408995\">Proceedings of the ACM on Programming Languages</a> (vol 4 issue ICFP).</p><p><a href=\"https://dl.acm.org/doi/10.1145/3408995\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3408995\">DOI</a> <a href=\"https://anil.recoil.org/papers/2020-icfp-retropar.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2020-icfp-retropar.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2020-icfp-retropar-1\">#</a> 1st Aug 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>award</span> <span>effects</span> <span>fp</span> <span>journal</span> <span>multicore</span> <span>ocaml</span> <span>systems</span></span></div>",+"content": "<p>Won best paper award at ICFP 2020 for our paper on retrofitting parallelism onto OCaml!</p>\n<blockquote><div><p><a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a 
href=\"https://github.com/lpw25\"><span>Leo White</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <span><span>Anmol Sahoo</span></span>, <a href=\"https://github.com/Sudha247\"><span>Sudha Parimala</span></a>, <span><span>Atul Dhiman</span></span>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/3408995\">Proceedings of the ACM on Programming Languages</a> (vol 4 issue ICFP).</p><p><a href=\"https://dl.acm.org/doi/10.1145/3408995\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3408995\">DOI</a> <a href=\"https://anil.recoil.org/papers/2020-icfp-retropar.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2020-icfp-retropar.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2020-oud-ci-1.json
+18
avsm/news_2020-oud-ci-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2020-oud-ci-1\">OCaml-CI: A Zero-Configuration CI</a> <span>/ Aug 2020</span></h2><p>Presented the new OCaml DSL for continuous integration at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://craigfe.io\"><span>Craig Ferguson</span></a>, <a href=\"https://github.com/kit-ty-kate\"><span>Kate Deplaix</span></a>, <a href=\"http://www.skjegstad.com/about/\"><span>Magnus Skjegstad</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp20.sigplan.org/details/ocaml-2020-papers/6/OCaml-CI-A-Zero-Configuration-CI\">proceedings of the 2020 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp20.sigplan.org/details/ocaml-2020-papers/6/OCaml-CI-A-Zero-Configuration-CI\">URL</a> <i>(icfp20.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2020-oud-ci.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2020-oud-ci-1\">#</a> 1st Aug 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>ocaml</span></span></div>",+"content": "<p>Presented the new OCaml DSL for continuous integration at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://craigfe.io\"><span>Craig Ferguson</span></a>, <a href=\"https://github.com/kit-ty-kate\"><span>Kate Deplaix</span></a>, <a href=\"http://www.skjegstad.com/about/\"><span>Magnus Skjegstad</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp20.sigplan.org/details/ocaml-2020-papers/6/OCaml-CI-A-Zero-Configuration-CI\">proceedings of the 2020 OCaml Users and Developers Workshop</a>.</p><p><a 
href=\"https://icfp20.sigplan.org/details/ocaml-2020-papers/6/OCaml-CI-A-Zero-Configuration-CI\">URL</a> <i>(icfp20.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2020-oud-ci.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2020-oud-parallelising-1.json
+18
avsm/news_2020-oud-parallelising-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2020-oud-parallelising-1\">Parallelising your OCaml Code with Multicore OCaml</a> <span>/ Aug 2020</span></h2><p>Paper on how to parallelise OCaml code at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://github.com/Sudha247\"><span>Sudha Parimala</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://github.com/ocaml-multicore/multicore-talks/tree/master/ocaml2020-workshop-parallel\">proceedings of the 2020 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://github.com/ocaml-multicore/multicore-talks/tree/master/ocaml2020-workshop-parallel\">URL</a> <i>(github.com)</i> <a href=\"https://anil.recoil.org/papers/2020-oud-parallelising.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2020-oud-parallelising.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2020-oud-parallelising-1\">#</a> 1st Aug 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>effects</span> <span>fp</span> <span>multicore</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on how to parallelise OCaml code at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://github.com/Sudha247\"><span>Sudha Parimala</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a 
href=\"https://github.com/ocaml-multicore/multicore-talks/tree/master/ocaml2020-workshop-parallel\">proceedings of the 2020 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://github.com/ocaml-multicore/multicore-talks/tree/master/ocaml2020-workshop-parallel\">URL</a> <i>(github.com)</i> <a href=\"https://anil.recoil.org/papers/2020-oud-parallelising.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2020-oud-parallelising.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2020-oud-platform-1.json
+18
avsm/news_2020-oud-platform-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2020-oud-platform-1\">The OCaml Platform: 2020</a> <span>/ Sep 2020</span></h2><p>Annual update on the OCaml Platform at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in <a href=\"https://icfp20.sigplan.org/home/ocaml-2020\">the 10th ACM OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp20.sigplan.org/home/ocaml-2020\">URL</a> <i>(icfp20.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2020-oud-platform.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2020-oud-platform-1\">#</a> 1st Sep 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>ocaml</span></span></div>",+"content": "<p>Annual update on the OCaml Platform at the OCaml Workshop</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in <a href=\"https://icfp20.sigplan.org/home/ocaml-2020\">the 10th ACM OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp20.sigplan.org/home/ocaml-2020\">URL</a> <i>(icfp20.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2020-oud-platform.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2021-arxiv-forestrycs-1.json
+18
avsm/news_2021-arxiv-forestrycs-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2021-arxiv-forestrycs-1\">How Computer Science Can Aid Forest Restoration</a> <span>/ Aug 2021</span></h2><p>Preprint about our working notes on how CS might contribute to forest preservation</p>\n<blockquote><div><p><span><span>Gemma Gordon</span></span>, <a href=\"https://ameliaholcomb.github.io\"><span>Amelia Holcomb</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://github.com/jonludlam\"><span>Jon Ludlam</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2109.07898\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2109.07898\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2109.07898\">DOI</a> <a href=\"https://anil.recoil.org/papers/2021-arxiv-forestrycs.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2021-arxiv-forestrycs.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2021-arxiv-forestrycs-1\">#</a> 1st Aug 2021 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conservation</span> <span>drones</span> <span>forests</span> <span>preprint</span> <span>sensing</span></span></div>",+"content": "<p>Preprint about our working notes on how CS might contribute to forest preservation</p>\n<blockquote><div><p><span><span>Gemma Gordon</span></span>, <a href=\"https://ameliaholcomb.github.io\"><span>Amelia Holcomb</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://github.com/jonludlam\"><span>Jon Ludlam</span></a>, and <a 
href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2109.07898\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2109.07898\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2109.07898\">DOI</a> <a href=\"https://anil.recoil.org/papers/2021-arxiv-forestrycs.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2021-arxiv-forestrycs.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2021-oud-effects-1.json
+18
avsm/news_2021-oud-effects-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2021-oud-effects-1\">Experiences with Effects</a> <span>/ Aug 2021</span></h2><p>Paper on programming with effects in OCaml</p>\n<blockquote><div><p><a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://craigfe.io\"><span>Craig Ferguson</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp21.sigplan.org/details/ocaml-2021-papers/16/Experiences-with-Effects\">proceedings of the 2021 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp21.sigplan.org/details/ocaml-2021-papers/16/Experiences-with-Effects\">URL</a> <i>(icfp21.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2021-oud-effects.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2021-oud-effects.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2021-oud-effects-1\">#</a> 1st Aug 2021 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>effects</span> <span>multicore</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on programming with effects in OCaml</p>\n<blockquote><div><p><a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://craigfe.io\"><span>Craig Ferguson</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <a 
href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp21.sigplan.org/details/ocaml-2021-papers/16/Experiences-with-Effects\">proceedings of the 2021 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp21.sigplan.org/details/ocaml-2021-papers/16/Experiences-with-Effects\">URL</a> <i>(icfp21.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2021-oud-effects.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2021-oud-effects.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2021-pldi-retroeff-1.json
+18
avsm/news_2021-pldi-retroeff-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2021-pldi-retroeff-1\">Retrofitting effect handlers onto OCaml</a> <span>/ Jun 2021</span></h2><p>Paper on retrofitting effects in OCaml presented at PLDI 2021</p>\n<blockquote><div><p><a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3453483.3454039\">proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3453483.3454039\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3453483.3454039\">DOI</a> <a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2021-pldi-retroeff-1\">#</a> 1st Jun 2021 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>effects</span> <span>fp</span> <span>multicore</span> <span>ocaml</span> <span>systems</span></span></div>",+"content": "<p>Paper on retrofitting effects in OCaml presented at PLDI 2021</p>\n<blockquote><div><p><a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/stedolan\"><span>Stephen Dolan</span></a>, <a href=\"https://github.com/lpw25\"><span>Leo White</span></a>, <a href=\"https://github.com/ctk21\"><span>Tom Kelly</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, and 
<a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3453483.3454039\">proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3453483.3454039\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3453483.3454039\">DOI</a> <a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2022-enhancing-brain-security-1.json
+18
avsm/news_2022-enhancing-brain-security-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2022-enhancing-brain-security-1\">Enhancing the Security & Privacy of Wearable Brain-Computer Interfaces</a> <span>/ Jan 2022</span></h2><p>Preprint on security vulnerabilities in brain-computer interfaces</p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, <a href=\"https://lorenaqendro.github.io\"><span>Lorena Qendro</span></a>, <span><span>Malachy O'Connor Brown</span></span>, <span><span>Oscar Hill</span></span>, <a href=\"https://www.cl.cam.ac.uk/~cm542/\"><span>Cecilia Mascolo</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2201.07711\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2201.07711\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2201.07711\">DOI</a> <a href=\"https://anil.recoil.org/papers/2022-enhancing-brain-security.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2022-enhancing-brain-security.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2022-enhancing-brain-security-1\">#</a> 1st Jan 2022 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>brain</span> <span>embedded</span> <span>preprint</span> <span>security</span> <span>systems</span></span></div>",+"content": "<p>Preprint on security vulnerabilities in brain-computer interfaces</p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, <a href=\"https://lorenaqendro.github.io\"><span>Lorena Qendro</span></a>, <span><span>Malachy O'Connor Brown</span></span>, <span><span>Oscar Hill</span></span>, <a href=\"https://www.cl.cam.ac.uk/~cm542/\"><span>Cecilia Mascolo</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a 
href=\"http://arxiv.org/abs/2201.07711\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2201.07711\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2201.07711\">DOI</a> <a href=\"https://anil.recoil.org/papers/2022-enhancing-brain-security.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2022-enhancing-brain-security.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2022-oud-ocurrent-1.json
+18
avsm/news_2022-oud-ocurrent-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2022-oud-ocurrent-1\">Homogeneous Builds with OBuilder and OCaml</a> <span>/ Sep 2022</span></h2><p>Paper on our incremental computation DSL ocurrent presented in OCaml Workshop 2022</p>\n<blockquote><div><p><span><span>Tim McGilchrist</span></span>, <a href=\"https://github.com/dra27\"><span>David Allsopp</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://github.com/MisterDA\"><span>Antonin D\u00e9cimo</span></a>, <a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/kit-ty-kate\"><span>Kate Deplaix</span></a>.</p><p>Paper in the <a href=\"https://icfp22.sigplan.org/details/ocaml-2022-papers/8/Homogeneous-builds-with-OBuilder-and-OCaml\">proceedings of the 2022 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp22.sigplan.org/details/ocaml-2022-papers/8/Homogeneous-builds-with-OBuilder-and-OCaml\">URL</a> <i>(icfp22.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2022-oud-ocurrent.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2022-oud-ocurrent-1\">#</a> 1st Sep 2022 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>fp</span> <span>ocaml</span></span></div>",+"content": "<p>Paper on our incremental computation DSL ocurrent presented in OCaml Workshop 2022</p>\n<blockquote><div><p><span><span>Tim McGilchrist</span></span>, <a href=\"https://github.com/dra27\"><span>David Allsopp</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://github.com/MisterDA\"><span>Antonin D\u00e9cimo</span></a>, <a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil 
Madhavapeddy</span></a>, and <a href=\"https://github.com/kit-ty-kate\"><span>Kate Deplaix</span></a>.</p><p>Paper in the <a href=\"https://icfp22.sigplan.org/details/ocaml-2022-papers/8/Homogeneous-builds-with-OBuilder-and-OCaml\">proceedings of the 2022 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp22.sigplan.org/details/ocaml-2022-papers/8/Homogeneous-builds-with-OBuilder-and-OCaml\">URL</a> <i>(icfp22.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2022-oud-ocurrent.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2023-acns-microguards-1.json
+18
avsm/news_2023-acns-microguards-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-acns-microguards-1\">Enabling Lightweight Privilege Separation in Applications with MicroGuards</a> <span>/ Oct 2023</span></h2><p>Paper on the MicroGuards memory API at <a href=\"https://link.springer.com/chapter/10.1007/978-3-031-41181-6_31\">ACNSW</a></p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://link.springer.com/10.1007/978-3-031-41181-6_31\">Applied Cryptography and Network Security Workshops</a>.</p><p><a href=\"https://link.springer.com/10.1007/978-3-031-41181-6_31\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-031-41181-6_31\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-acns-microguards.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-acns-microguards.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-acns-microguards-1\">#</a> 1st Oct 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>linux</span> <span>os</span> <span>security</span> <span>systems</span> <span>tee</span></span></div>",+"content": "<p>Paper on the MicroGuards memory API at <a href=\"https://link.springer.com/chapter/10.1007/978-3-031-41181-6_31\">ACNSW</a></p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://link.springer.com/10.1007/978-3-031-41181-6_31\">Applied Cryptography and Network Security Workshops</a>.</p><p><a href=\"https://link.springer.com/10.1007/978-3-031-41181-6_31\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-031-41181-6_31\">DOI</a> <a 
href=\"https://anil.recoil.org/papers/2023-acns-microguards.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-acns-microguards.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-carbon-credibility-1.json
+18
avsm/news_2023-carbon-credibility-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-carbon-credibility-1\">Credit credibility threatens forests</a> <span>/ May 2023</span></h2><p>Our perspective in <a href=\"https://science.org\">Science</a> magazine appeared this week on the credibility of carbon credits and its importance for tropical forest protection.</p>\n<blockquote>\n<p>Addressing global warming requires increased investment in conserving and restoring carbon-dense natural habitats. Some companies that emit carbon have turned to certified carbon credits to offset their environmental impact. However, the effectiveness of carbon credits depends on the methods used to quantify them. If carbon credits do not accurately represent their environmental benefits, relying on them could exacerbate climate change. To ensure that carbon credits are robust, the methods used to calculate them must be improved.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <span><span>Pedro H. S. Brancalion</span></span>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.cccep.ac.uk/profile/ben-filewod/\"><span>Ben Filewod</span></a>, <a href=\"https://business-school.exeter.ac.uk/economics/research/subject-themes/profile/index.php?web_id=ben_groom\"><span>Ben Groom</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/alejandro-guizar-coutino\"><span>Alejandro Guizar-Couti\u00f1o</span></a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\"><span>Julia P.G. 
Jones</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <span><span>Andreas Kontoleon</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.geog.ox.ac.uk/staff/ymalhi.html\"><span>Yadvinder Malhi</span></a>, <a href=\"https://cnr.ncsu.edu/directory/erin-o-sills/\"><span>Erin O Sills</span></a>, <a href=\"https://www.iis-rio.org/en/collaborators/bernardo/\"><span>Bernardo Strassburg</span></a>, <a href=\"https://www.lse.ac.uk/granthaminstitute/profile/frank-venmans/\"><span>Frank Venmans</span></a>, <a href=\"https://thaleswest.wixsite.com/home\"><span>Thales West</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>.</p><p>Journal paper in <a href=\"https://www.science.org/doi/10.1126/science.adh3426\">Science</a> (vol 380 issue 6644).</p><p><a href=\"https://www.science.org/doi/10.1126/science.adh3426\">URL</a> <i>(science.org)</i> <a href=\"https://doi.org/10.1126/science.adh3426\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-carbon-credibility.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-carbon-credibility.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-carbon-credibility-1\">#</a> 1st May 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carbon</span> <span>carboncredits</span> <span>conservation</span> <span>economics</span> <span>forests</span> <span>journal</span> <span>nbs</span> <span>sensing</span></span></div>",+"content": "<p>Our perspective in <a href=\"https://science.org\">Science</a> magazine appeared this week on the credibility of carbon credits and its 
importance for tropical forest protection.</p>\n<blockquote>\n<p>Addressing global warming requires increased investment in conserving and restoring carbon-dense natural habitats. Some companies that emit carbon have turned to certified carbon credits to offset their environmental impact. However, the effectiveness of carbon credits depends on the methods used to quantify them. If carbon credits do not accurately represent their environmental benefits, relying on them could exacerbate climate change. To ensure that carbon credits are robust, the methods used to calculate them must be improved.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <span><span>Pedro H. S. Brancalion</span></span>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.cccep.ac.uk/profile/ben-filewod/\"><span>Ben Filewod</span></a>, <a href=\"https://business-school.exeter.ac.uk/economics/research/subject-themes/profile/index.php?web_id=ben_groom\"><span>Ben Groom</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/alejandro-guizar-coutino\"><span>Alejandro Guizar-Couti\u00f1o</span></a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\"><span>Julia P.G. 
Jones</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <span><span>Andreas Kontoleon</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.geog.ox.ac.uk/staff/ymalhi.html\"><span>Yadvinder Malhi</span></a>, <a href=\"https://cnr.ncsu.edu/directory/erin-o-sills/\"><span>Erin O Sills</span></a>, <a href=\"https://www.iis-rio.org/en/collaborators/bernardo/\"><span>Bernardo Strassburg</span></a>, <a href=\"https://www.lse.ac.uk/granthaminstitute/profile/frank-venmans/\"><span>Frank Venmans</span></a>, <a href=\"https://thaleswest.wixsite.com/home\"><span>Thales West</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>.</p><p>Journal paper in <a href=\"https://www.science.org/doi/10.1126/science.adh3426\">Science</a> (vol 380 issue 6644).</p><p><a href=\"https://www.science.org/doi/10.1126/science.adh3426\">URL</a> <i>(science.org)</i> <a href=\"https://doi.org/10.1126/science.adh3426\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-carbon-credibility.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-carbon-credibility.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-hotnets-sns-1.json
+18
avsm/news_2023-hotnets-sns-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-hotnets-sns-1\">Where on Earth is the Spatial Name System?</a> <span>/ Nov 2023</span></h2><p>Paper on spatial networks on DNS at <a href=\"https://dl.acm.org/doi/10.1145/3626111.3628210\">HotNets 2023</a></p>\n<blockquote><div><p><a href=\"https://ryan.freumh.org\"><span>Ryan Gibb</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3626111.3628210\">proceedings of the 22nd ACM Workshop on Hot Topics in Networks</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3626111.3628210\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3626111.3628210\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-hotnets-sns.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-hotnets-sns.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-hotnets-sns-1\">#</a> 1st Nov 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>distributed</span> <span>dns</span> <span>networks</span> <span>spatial</span> <span>systems</span></span></div>",+"content": "<p>Paper on spatial networks on DNS at <a href=\"https://dl.acm.org/doi/10.1145/3626111.3628210\">HotNets 2023</a></p>\n<blockquote><div><p><a href=\"https://ryan.freumh.org\"><span>Ryan Gibb</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3626111.3628210\">proceedings of the 22nd ACM Workshop on Hot Topics in Networks</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3626111.3628210\">URL</a> <i>(dl.acm.org)</i> <a 
href=\"https://doi.org/10.1145/3626111.3628210\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-hotnets-sns.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-hotnets-sns.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-naturecredits-1.json
+18
avsm/news_2023-naturecredits-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-naturecredits-1\">Nature Sustainability article on carbon/biodiversity credits</a> <span>/ Aug 2024</span></h2><p>Our commentary on nature-based credits has been published in <a href=\"https://www.nature.com/natsustain/\">Nature Sustainability</a>. I wrote some <a href=\"https://anil.recoil.org/notes/nature-crossroads\">thoughts</a> about it here as well.</p>\n<blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\"><span>Siddarth Shrikanth</span></a>, <a href=\"https://www.biology.ox.ac.uk/people/joseph-bull\"><span>Joseph Bull</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\"><span>Sophus zu Ermgassen</span></a>.</p><p>Journal paper in <a href=\"https://www.nature.com/articles/s41893-024-01403-w\">Nature Sustainability</a>.</p><p><a href=\"https://www.nature.com/articles/s41893-024-01403-w\">URL</a> <i>(nature.com)</i> <a href=\"https://doi.org/10.1038/s41893-024-01403-w\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-naturecredits.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-naturecredits.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-naturecredits-1\">#</a> 1st Aug 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>biodiversity</span> <span>carboncredits</span> <span>economics</span> <span>journal</span> <span>nature</span></span></div>",+"content": "<p>Our commentary on nature-based credits has been published in <a href=\"https://www.nature.com/natsustain/\">Nature Sustainability</a>. 
I wrote some <a href=\"https://anil.recoil.org/notes/nature-crossroads\">thoughts</a> about it here as well.</p>\n<blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\"><span>Siddarth Shrikanth</span></a>, <a href=\"https://www.biology.ox.ac.uk/people/joseph-bull\"><span>Joseph Bull</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\"><span>Sophus zu Ermgassen</span></a>.</p><p>Journal paper in <a href=\"https://www.nature.com/articles/s41893-024-01403-w\">Nature Sustainability</a>.</p><p><a href=\"https://www.nature.com/articles/s41893-024-01403-w\">URL</a> <i>(nature.com)</i> <a href=\"https://doi.org/10.1038/s41893-024-01403-w\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-naturecredits.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-naturecredits.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-ncc-permanence-1.json
+18
avsm/news_2023-ncc-permanence-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-ncc-permanence-1\">Preprint on the social value of impermanent carbon credits</a> <span>/ Jul 2023</span></h2><p>We have uploaded a preprint of our <a href=\"https://anil.recoil.org/projects/4c\">4C</a> paper on valuing impermanent carbon credits, by using the <a href=\"https://en.wikipedia.org/wiki/Social_cost_of_carbon\">Social Cost of Carbon</a> as a basis for a discount function into the future. Comments and feedback are most welcome.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://www.lse.ac.uk/granthaminstitute/profile/frank-venmans/\"><span>Frank Venmans</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://business-school.exeter.ac.uk/economics/research/subject-themes/profile/index.php?web_id=ben_groom\"><span>Ben Groom</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>.</p><p>Journal paper in <a href=\"https://www.nature.com/articles/s41558-023-01815-0\">Nature Climate Change</a> (vol 13 issue 11).</p><p><a href=\"https://www.nature.com/articles/s41558-023-01815-0\">URL</a> <i>(nature.com)</i> <a href=\"https://doi.org/10.1038/s41558-023-01815-0\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-ncc-permanence-1\">#</a> 1st Nov 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carbon</span> 
<span>carboncredits</span> <span>conservation</span> <span>economics</span> <span>forests</span> <span>journal</span> <span>nbs</span> <span>permanence</span> <span>scc</span> <span>sensing</span></span></div>",+"content": "<p>We have uploaded a preprint of our <a href=\"https://anil.recoil.org/projects/4c\">4C</a> paper on valuing impermanent carbon credits, by using the <a href=\"https://en.wikipedia.org/wiki/Social_cost_of_carbon\">Social Cost of Carbon</a> as a basis for a discount function into the future. Comments and feedback are most welcome.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://www.lse.ac.uk/granthaminstitute/profile/frank-venmans/\"><span>Frank Venmans</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://business-school.exeter.ac.uk/economics/research/subject-themes/profile/index.php?web_id=ben_groom\"><span>Ben Groom</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>.</p><p>Journal paper in <a href=\"https://www.nature.com/articles/s41558-023-01815-0\">Nature Climate Change</a> (vol 13 issue 11).</p><p><a href=\"https://www.nature.com/articles/s41558-023-01815-0\">URL</a> <i>(nature.com)</i> <a href=\"https://doi.org/10.1038/s41558-023-01815-0\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-ncc-permanence-2.json
+18
avsm/news_2023-ncc-permanence-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-ncc-permanence-2\">Nature Climate Change paper on impermanent carbon credits</a> <span>/ Nov 2023</span></h2><p>Our paper on valuing impermanent carbon credits has been published at <a href=\"https://www.nature.com/articles/s41558-023-01815-0\">Nature Climate Change</a>. It has received a bunch of press coverage, including <a href=\"https://phys.org/news/2023-10-offset-approach-tropical-forests-faith.html\">phys.org</a>, <a href=\"https://www.cam.ac.uk/research/news/offset-markets-new-approach-could-help-save-tropical-forests-by-restoring-faith-in-carbon-credits\">cam.ac.uk</a>, and <a href=\"https://www.miragenews.com/new-method-may-boost-trust-in-carbon-credits-1113599/\">Mirage</a>.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://www.lse.ac.uk/granthaminstitute/profile/frank-venmans/\"><span>Frank Venmans</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://business-school.exeter.ac.uk/economics/research/subject-themes/profile/index.php?web_id=ben_groom\"><span>Ben Groom</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>.</p><p>Journal paper in <a href=\"https://www.nature.com/articles/s41558-023-01815-0\">Nature Climate Change</a> (vol 13 issue 11).</p><p><a href=\"https://www.nature.com/articles/s41558-023-01815-0\">URL</a> <i>(nature.com)</i> <a href=\"https://doi.org/10.1038/s41558-023-01815-0\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-ncc-permanence-2\">#</a> 1st Nov 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>4c</span> <span>carbon</span> <span>carboncredits</span> <span>conservation</span> <span>economics</span> <span>forests</span> <span>journal</span> <span>nbs</span> <span>permanence</span> <span>scc</span> <span>sensing</span></span></div>",+"content": "<p>Our paper on valuing impermanent carbon credits has been published at <a href=\"https://www.nature.com/articles/s41558-023-01815-0\">Nature Climate Change</a>. It has received a bunch of press coverage, including <a href=\"https://phys.org/news/2023-10-offset-approach-tropical-forests-faith.html\">phys.org</a>, <a href=\"https://www.cam.ac.uk/research/news/offset-markets-new-approach-could-help-save-tropical-forests-by-restoring-faith-in-carbon-credits\">cam.ac.uk</a>, and <a href=\"https://www.miragenews.com/new-method-may-boost-trust-in-carbon-credits-1113599/\">Mirage</a>.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://www.lse.ac.uk/granthaminstitute/profile/frank-venmans/\"><span>Frank Venmans</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://business-school.exeter.ac.uk/economics/research/subject-themes/profile/index.php?web_id=ben_groom\"><span>Ben Groom</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>.</p><p>Journal paper in <a href=\"https://www.nature.com/articles/s41558-023-01815-0\">Nature Climate Change</a> (vol 13 issue 11).</p><p><a 
href=\"https://www.nature.com/articles/s41558-023-01815-0\">URL</a> <i>(nature.com)</i> <a href=\"https://doi.org/10.1038/s41558-023-01815-0\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-ocaml-eio-1.json
+18
avsm/news_2023-ocaml-eio-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-ocaml-eio-1\">Eio 1.0 \u2013 Effects-based IO for OCaml 5</a> <span>/ Sep 2023</span></h2><p>An update on the OCaml EIO library at the OCaml Workshop 2023</p>\n<blockquote><div><p><a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://github.com/haesbaert\"><span>Christiano Haesbaert</span></a>, <a href=\"https://www.lortex.org\"><span>Lucas Pluvinage</span></a>, <a href=\"https://github.com/polytypic\"><span>Vesa Karvonen</span></a>, <a href=\"https://github.com/Sudha247\"><span>Sudha Parimala</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/balat\"><span>Vincent Balat</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/5/Eio-1-0-Effects-based-IO-for-OCaml-5\">proceedings of the 2023 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/5/Eio-1-0-Effects-based-IO-for-OCaml-5\">URL</a> <i>(icfp23.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2023-ocaml-eio.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ocaml-eio.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-ocaml-eio-1\">#</a> 1st Sep 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>effects</span> <span>ocaml</span> <span>systems</span></span></div>",+"content": "<p>An update on the OCaml EIO library at the OCaml Workshop 2023</p>\n<blockquote><div><p><a href=\"https://roscidus.com\"><span>Thomas Leonard</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick 
Ferris</span></a>, <a href=\"https://github.com/haesbaert\"><span>Christiano Haesbaert</span></a>, <a href=\"https://www.lortex.org\"><span>Lucas Pluvinage</span></a>, <a href=\"https://github.com/polytypic\"><span>Vesa Karvonen</span></a>, <a href=\"https://github.com/Sudha247\"><span>Sudha Parimala</span></a>, <a href=\"https://kcsrk.info\"><span>KC Sivaramakrishnan</span></a>, <a href=\"https://github.com/balat\"><span>Vincent Balat</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/5/Eio-1-0-Effects-based-IO-for-OCaml-5\">proceedings of the 2023 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/5/Eio-1-0-Effects-based-IO-for-OCaml-5\">URL</a> <i>(icfp23.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2023-ocaml-eio.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ocaml-eio.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-ocaml-platform-1.json
+18
avsm/news_2023-ocaml-platform-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-ocaml-platform-1\">State of the OCaml Platform 2023</a> <span>/ Sep 2023</span></h2><p>We deliver the annual presentation about the OCaml Platform in the OCaml Workshop at ICFP 2023.</p>\n<blockquote>\n<p>This paper reflects on a decade of progress and developments within the OCaml Platform, from its inception in 2013 with the release of opam 1.0, to today where it stands as a robust toolchain for OCaml developers. We review the last three years in detail, emphasizing the advancements and innovations that have shaped the OCaml development landscape and highlighting key milestones such as the migration to Dune as the primary build system, and the development of a Language Server Protocol (LSP) server for OCaml.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://github.com/tmattio\"><span>Thibaut Mattio</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://github.com/dra27\"><span>David Allsopp</span></a>.</p><p>Paper in the <a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/15/State-of-the-OCaml-Platform-2023\">proceedings of the 2023 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/15/State-of-the-OCaml-Platform-2023\">URL</a> <i>(icfp23.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2023-ocaml-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ocaml-platform.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-ocaml-platform-1\">#</a> 1st Sep 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>devtools</span> <span>ocaml</span> <span>testing</span></span></div>",+"content": "<p>We deliver 
the annual presentation about the OCaml Platform in the OCaml Workshop at ICFP 2023.</p>\n<blockquote>\n<p>This paper reflects on a decade of progress and developments within the OCaml Platform, from its inception in 2013 with the release of opam 1.0, to today where it stands as a robust toolchain for OCaml developers. We review the last three years in detail, emphasizing the advancements and innovations that have shaped the OCaml development landscape and highlighting key milestones such as the migration to Dune as the primary build system, and the development of a Language Server Protocol (LSP) server for OCaml.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://github.com/tmattio\"><span>Thibaut Mattio</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/samoht\"><span>Thomas Gazagnaire</span></a>, and <a href=\"https://github.com/dra27\"><span>David Allsopp</span></a>.</p><p>Paper in the <a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/15/State-of-the-OCaml-Platform-2023\">proceedings of the 2023 OCaml Users and Developers Workshop</a>.</p><p><a href=\"https://icfp23.sigplan.org/details/ocaml-2023-papers/15/State-of-the-OCaml-Platform-2023\">URL</a> <i>(icfp23.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2023-ocaml-platform.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-ocaml-platform.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-pact-tmf-1.json
+18
avsm/news_2023-pact-tmf-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-pact-tmf-1\">PACT Tropical Moist Forest Accreditation Methodology</a> <span>/ Jun 2023</span></h2><p>We have just published the Tropical Moist Forest v1.0 specification, which is a detailed description of the methodology we are using for counterfactual dynamic baselines to calculate the additionality, leakage and permanence behind REDD+ projects. I explained some of the background behind this in a seminar last year.</p>\n<p></p><div></div><p></p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.liverpool.ac.uk/geography-and-planning/research/environmental-change/postgraduates/\"><span>James Hartup</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\"><span>Miranda Lam</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/abigail-williams\"><span>Abby Williams</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">Cambridge Open Engage</a>.</p><p><a 
href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-gvslq\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-pact-tmf-1\">#</a> 1st Aug 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carboncredits</span> <span>conservation</span> <span>forests</span> <span>nature</span> <span>pact</span> <span>preprint</span> <span>redd</span> <span>satellite</span></span></div>",+"content": "<p>We have just published the Tropical Moist Forest v1.0 specification, which is a detailed description of the methodology we are using for counterfactual dynamic baselines to calculate the additionality, leakage and permanence behind REDD+ projects. 
I explained some of the background behind this in a seminar last year.</p>\n<p></p><div></div><p></p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.liverpool.ac.uk/geography-and-planning/research/environmental-change/postgraduates/\"><span>James Hartup</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\"><span>Miranda Lam</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/abigail-williams\"><span>Abby Williams</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">Cambridge Open Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-gvslq\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-pact-tmf-2.json
+18
avsm/news_2023-pact-tmf-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-pact-tmf-2\">PACT Tropical Moist Forest Accreditation Methodology</a> <span>/ Dec 2023</span></h2><p>We have just released the Tropical Moist Forest v2.0 specification, to update the <a href=\"https://anil.recoil.org/news/2023-pact-tmf-1\">v1.1</a> released earlier in the year. There are significant updates to the methodology to better match the scheme described in <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">Realizing the social value of impermanent carbon credits</a>.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.liverpool.ac.uk/geography-and-planning/research/environmental-change/postgraduates/\"><span>James Hartup</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\"><span>Miranda Lam</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/abigail-williams\"><span>Abby Williams</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">Cambridge Open 
Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-gvslq\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-pact-tmf-2\">#</a> 1st Aug 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carboncredits</span> <span>conservation</span> <span>forests</span> <span>nature</span> <span>pact</span> <span>preprint</span> <span>redd</span> <span>satellite</span></span></div>",+"content": "<p>We have just released the Tropical Moist Forest v2.0 specification, to update the <a href=\"https://anil.recoil.org/news/2023-pact-tmf-1\">v1.1</a> released earlier in the year. There are significant updates to the methodology to better match the scheme described in <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">Realizing the social value of impermanent carbon credits</a>.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.liverpool.ac.uk/geography-and-planning/research/environmental-change/postgraduates/\"><span>James Hartup</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\"><span>Miranda Lam</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil 
Madhavapeddy</span></a>, <a href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/abigail-williams\"><span>Abby Williams</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">Cambridge Open Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-gvslq\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-pact-tmf-3.json
+18
avsm/news_2023-pact-tmf-3.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-pact-tmf-3\">PACT Tropical Moist Forest Accreditation Methodology</a> <span>/ Aug 2024</span></h2><p>We have just released the Tropical Moist Forest v2.1 specification, to follow up the now-expired <a href=\"https://anil.recoil.org/news/2023-pact-tmf-2\">v2.0</a> from six months ago. The key updates are a new <a href=\"https://tinyurl.com/PACTTMFexplainer\">high-level explainer</a>, as well as clarifications for buffer zones and base tiles.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.liverpool.ac.uk/geography-and-planning/research/environmental-change/postgraduates/\"><span>James Hartup</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\"><span>Miranda Lam</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/abigail-williams\"><span>Abby Williams</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">Cambridge Open Engage</a>.</p><p><a 
href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-gvslq\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-pact-tmf-3\">#</a> 1st Aug 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carboncredits</span> <span>conservation</span> <span>forests</span> <span>nature</span> <span>pact</span> <span>preprint</span> <span>redd</span> <span>satellite</span></span></div>",+"content": "<p>We have just released the Tropical Moist Forest v2.1 specification, to follow up the now-expired <a href=\"https://anil.recoil.org/news/2023-pact-tmf-2\">v2.0</a> from six months ago. The key updates are a new <a href=\"https://tinyurl.com/PACTTMFexplainer\">high-level explainer</a>, as well as clarifications for buffer zones and base tiles.</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.liverpool.ac.uk/geography-and-planning/research/environmental-change/postgraduates/\"><span>James Hartup</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\"><span>Miranda Lam</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a 
href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-charlotte-wheeler\"><span>Charlotte Wheeler</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/abigail-williams\"><span>Abby Williams</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">Cambridge Open Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/66b9d9345101a2ffa813e37c\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-gvslq\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-pact-tmf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2023-raid-deluminator-1.json
+18
avsm/news_2023-raid-deluminator-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2023-raid-deluminator-1\">Information Flow Tracking for Heterogeneous Compartmentalized Software</a> <span>/ Oct 2023</span></h2><p>Paper on DIFC Deluminator interface at <a href=\"https://dl.acm.org/doi/10.1145/3607199.3607235\">RAID 2023</a></p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3607199.3607235\">proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3607199.3607235\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3607199.3607235\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-raid-deluminator.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-raid-deluminator.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2023-raid-deluminator-1\">#</a> 1st Oct 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>difc</span> <span>security</span> <span>systems</span> <span>tee</span> <span>unikernels</span></span></div>",+"content": "<p>Paper on DIFC Deluminator interface at <a href=\"https://dl.acm.org/doi/10.1145/3607199.3607235\">RAID 2023</a></p>\n<blockquote><div><p><a href=\"https://zatkh.github.io/\"><span>Zahra Tarkhani</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"https://dl.acm.org/doi/10.1145/3607199.3607235\">proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3607199.3607235\">URL</a> <i>(dl.acm.org)</i> <a 
href=\"https://doi.org/10.1145/3607199.3607235\">DOI</a> <a href=\"https://anil.recoil.org/papers/2023-raid-deluminator.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2023-raid-deluminator.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-ai-conhorizon-1.json
+18
avsm/news_2024-ai-conhorizon-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-ai-conhorizon-1\">Horizon scan on AI and conservation published</a> <span>/ Dec 2024</span></h2><p>Back in July 2024, a large group of conservation and computer scientists got together in the <a href=\"https://conservation.cam.ac.uk\">CCI</a> to prioritise the storm of AI-related projects that have been kicking off around the world. Our key goal was to harness AI to accelerate the positive impact of conservation efforts, while minimising harm caused through either the direct or indirect use of AI technologies.</p>\n<p>The first horizon scan resulting from this has just been published in Trends in Ecology and Evolution. If you're looking for a gentle introduction to some of the terms in AI from a non-expert's perspective, the first section does a good job of defining a glossary as well.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-ai-conhorizon-1\">118 words</a>]</span><blockquote><div><p><a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://beerys.github.io\"><span>Sara Beery</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\"><span>Neil Burgess</span></a>, <a href=\"https://profiles.imperial.ac.uk/m.burgman\"><span>Mark Burgman</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/stuart-butchart\"><span>Stuart Butchart</span></a>, <a href=\"https://carleton.ca/biology/people/steven-j-cooke/\"><span>Steven J. 
Cooke</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.framework-biodiversity.eu/team/dr-finn-danielsen\"><span>Finn Danielsen</span></a>, <a href=\"https://researchportal.helsinki.fi/en/persons/enrico-di-minin\"><span>Enrico Di Minin</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.vizzuality.com/team/francis-gassert\"><span>Francis Gassert</span></a>, <a href=\"https://www.biology.ox.ac.uk/people/amy-hinsley\"><span>Amy Hinsley</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\"><span>Julia P.G. Jones</span></a>, <a href=\"https://env.dukekunshan.edu.cn/faculty-env/binbin-li-ph-d/\"><span>Binbin V. Li</span></a>, <a href=\"http://oisin.info\"><span>Oisin Mac Aodha</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://stephanieodonnell.com\"><span>Stephanie O'Donnell</span></a>, <a href=\"https://www.lancaster.ac.uk/maths/people/william-oxbury\"><span>Bill Oxbury</span></a>, <a href=\"https://www.bas.ac.uk/profile/lspe/\"><span>Lloyd Peck</span></a>, <a href=\"https://www.zsl.org/about-zsl/our-people/prof-nathalie-pettorelli\"><span>Nathalie Pettorelli</span></a>, <a href=\"https://www.rainforesttrust.org/about-us/our-team/dr-jon-paul-rodriguez-2/\"><span>Jon Paul Rodr\u00edguez</span></a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\"><span>Emily Shuckburgh</span></a>, <a href=\"https://www.iis-rio.org/en/collaborators/bernardo/\"><span>Bernardo Strassburg</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-hiromi-yamashita\"><span>Hiromi Yamashita</span></a>, <a href=\"https://www.microsoft.com/en-us/research/people/zhongqimiao/\"><span>Zhongqi Miao</span></a>, and <a 
href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0169534724002866\">Trends in Ecology & Evolution</a>.</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0169534724002866\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-ai-conhorizon-1\">#</a> 1st Dec 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> <span>cci</span> <span>conservation</span> <span>evidence</span> <span>horizon</span> <span>journal</span></span></div>",+"content": "<p>Back in July 2024, a large group of conservation and computer scientists got together in the <a href=\"https://conservation.cam.ac.uk\">CCI</a> to prioritise the storm of AI-related projects that have been kicking off around the world. Our key goal was to harness AI to accelerate the positive impact of conservation efforts, while minimising harm caused through either the direct or indirect use of AI technologies.</p>\n<p>The first horizon scan resulting from this has just been published in Trends in Ecology and Evolution. 
If you're looking for a gentle introduction to some of the terms in AI from a non-expert's perspective, the first section does a good job of defining a glossary as well.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-ai-conhorizon-1\">118 words</a>]</span><blockquote><div><p><a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://beerys.github.io\"><span>Sara Beery</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\"><span>Neil Burgess</span></a>, <a href=\"https://profiles.imperial.ac.uk/m.burgman\"><span>Mark Burgman</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/stuart-butchart\"><span>Stuart Butchart</span></a>, <a href=\"https://carleton.ca/biology/people/steven-j-cooke/\"><span>Steven J. Cooke</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.framework-biodiversity.eu/team/dr-finn-danielsen\"><span>Finn Danielsen</span></a>, <a href=\"https://researchportal.helsinki.fi/en/persons/enrico-di-minin\"><span>Enrico Di Minin</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.vizzuality.com/team/francis-gassert\"><span>Francis Gassert</span></a>, <a href=\"https://www.biology.ox.ac.uk/people/amy-hinsley\"><span>Amy Hinsley</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\"><span>Julia P.G. Jones</span></a>, <a href=\"https://env.dukekunshan.edu.cn/faculty-env/binbin-li-ph-d/\"><span>Binbin V. 
Li</span></a>, <a href=\"http://oisin.info\"><span>Oisin Mac Aodha</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://stephanieodonnell.com\"><span>Stephanie O'Donnell</span></a>, <a href=\"https://www.lancaster.ac.uk/maths/people/william-oxbury\"><span>Bill Oxbury</span></a>, <a href=\"https://www.bas.ac.uk/profile/lspe/\"><span>Lloyd Peck</span></a>, <a href=\"https://www.zsl.org/about-zsl/our-people/prof-nathalie-pettorelli\"><span>Nathalie Pettorelli</span></a>, <a href=\"https://www.rainforesttrust.org/about-us/our-team/dr-jon-paul-rodriguez-2/\"><span>Jon Paul Rodr\u00edguez</span></a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\"><span>Emily Shuckburgh</span></a>, <a href=\"https://www.iis-rio.org/en/collaborators/bernardo/\"><span>Bernardo Strassburg</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-hiromi-yamashita\"><span>Hiromi Yamashita</span></a>, <a href=\"https://www.microsoft.com/en-us/research/people/zhongqimiao/\"><span>Zhongqi Miao</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>.</p><p>Journal paper in <a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0169534724002866\">Trends in Ecology & Evolution</a>.</p><p><a href=\"https://linkinghub.elsevier.com/retrieve/pii/S0169534724002866\">URL</a> <i>(linkinghub.elsevier.com)</i> <a href=\"https://doi.org/10.1016/j.tree.2024.11.013\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-cc-blockchain-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-cc-blockchain-1\">Global, robust and comparable digital carbon assets</a> <span>/ Apr 2024</span></h2><p>Paper on smart contracts for carbon credits at <a href=\"http://icbc2024.ieee-icbc.org\">ICBC 2024</a> in Dublin</p>\n<blockquote><div><p><a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://derekhsorensen.com\"><span>Derek Sorensen</span></a>, <a href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"http://arxiv.org/abs/2403.14581\">proceedings of the 6th IEEE International Conference on Blockchain and Cryptocurrency</a>.</p><p><a href=\"http://arxiv.org/abs/2403.14581\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2403.14581\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-cc-blockchain.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-cc-blockchain.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-cc-blockchain-1\">#</a> 1st Apr 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carboncredits</span> <span>conference</span> <span>crypto</span> <span>distributed</span> <span>economics</span> <span>security</span></span></div>",+"content": "<p>Paper on smart contracts for carbon credits at <a href=\"http://icbc2024.ieee-icbc.org\">ICBC 2024</a> in Dublin</p>\n<blockquote><div><p><a 
href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://derekhsorensen.com\"><span>Derek Sorensen</span></a>, <a href=\"https://www.lambdacambridge.com/robin-message\"><span>Robin Message</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"http://arxiv.org/abs/2403.14581\">proceedings of the 6th IEEE International Conference on Blockchain and Cryptocurrency</a>.</p><p><a href=\"http://arxiv.org/abs/2403.14581\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2403.14581\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-cc-blockchain.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-cc-blockchain.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-cclr-carbon-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-cclr-carbon-1\">Published a legal perspective on high integrity forest carbon credits</a> <span>/ Nov 2024</span></h2><p><a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a> led an <a href=\"https://anil.recoil.org/ideas/legal-aspects-of-credits\">effort</a> to explore a novel legal framework for forest carbon credits\nthat separates carbon tenure (i.e. title and associated property rights to the\nland and trees which store the carbon) from the carbon rights (i.e. title and\nassociated rights to monetise and manage the credits which\nsymbolically represent the carbon stored in the trees), while also specifying\nthe relationship between the carbon tenure and the carbon rights.</p>\n<p>The resulting <a href=\"https://anil.recoil.org/papers/2024-cclr-carbon\">paper</a> has just been published in the Carbon\nand Climate Law Review journal, and is available as open access for your perusal.</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/smc70\"><span>Sophie Chapman</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\"><span>Robin Daniels</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://cclr.lexxion.eu/article/CCLR/2024/3/5\">Carbon & Climate Law Review</a> (vol 18 issue 3).</p><p><a href=\"https://cclr.lexxion.eu/article/CCLR/2024/3/5\">URL</a> <i>(cclr.lexxion.eu)</i> <a href=\"https://doi.org/10.21552/cclr/2024/3/5\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-cclr-carbon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-cclr-carbon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a 
href=\"https://anil.recoil.org/news/2024-cclr-carbon-1\">#</a> 1st Nov 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carboncredits</span> <span>conservation</span> <span>forest</span> <span>journal</span> <span>landuse</span> <span>law</span> <span>legal</span> <span>nbs</span></span></div>",+"content": "<p><a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a> led an <a href=\"https://anil.recoil.org/ideas/legal-aspects-of-credits\">effort</a> to explore a novel legal framework for forest carbon credits\nthat separates carbon tenure (i.e. title and associated property rights to the\nland and trees which store the carbon) from the carbon rights (i.e. title and\nassociated rights to monetise and manage the credits which\nsymbolically represent the carbon stored in the trees), while also specifying\nthe relationship between the carbon tenure and the carbon rights.</p>\n<p>The resulting <a href=\"https://anil.recoil.org/papers/2024-cclr-carbon\">paper</a> has just been published in the Carbon\nand Climate Law Review journal, and is available as open access for your perusal.</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/smc70\"><span>Sophie Chapman</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\"><span>Robin Daniels</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in <a href=\"https://cclr.lexxion.eu/article/CCLR/2024/3/5\">Carbon & Climate Law Review</a> (vol 18 issue 3).</p><p><a href=\"https://cclr.lexxion.eu/article/CCLR/2024/3/5\">URL</a> <i>(cclr.lexxion.eu)</i> <a href=\"https://doi.org/10.21552/cclr/2024/3/5\">DOI</a> <a 
href=\"https://anil.recoil.org/papers/2024-cclr-carbon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-cclr-carbon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-ce-llm-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-ce-llm-1\">Preprint on using LLMs for evidence-based decision support</a> <span>/ Nov 2024</span></h2><p>We have just uploaded a preprint on using LLMs for conservation evidence, based on our <a href=\"https://anil.recoil.org/projects/ce\">work</a> on large-scale crawling of the academic literature. Well done in particular to <a href=\"mailto:ri301@cam.ac.uk\">Radhika Iyer</a> for having done the bulk of the evaluation on this as part of a very productive summer internship with us!</p>\n<blockquote><div><p><a href=\"mailto:ri301@cam.ac.uk\"><span>Radhika Iyer</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>.</p><p>Journal paper in <a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">PLOS ONE</a> (vol 20 issue 5).</p><p><a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">URL</a> <i>(journals.plos.org)</i> <a href=\"https://doi.org/10.1371/journal.pone.0323563\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-ce-llm-1\">#</a> 1st May 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> <span>conservation</span> <span>evidence</span> <span>journal</span> <span>llms</span></span></div>",+"content": "<p>We have just uploaded a preprint on using LLMs for conservation 
evidence, based on our <a href=\"https://anil.recoil.org/projects/ce\">work</a> on large-scale crawling of the academic literature. Well done in particular to <a href=\"mailto:ri301@cam.ac.uk\">Radhika Iyer</a> for having done the bulk of the evaluation on this as part of a very productive summer internship with us!</p>\n<blockquote><div><p><a href=\"mailto:ri301@cam.ac.uk\"><span>Radhika Iyer</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>.</p><p>Journal paper in <a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">PLOS ONE</a> (vol 20 issue 5).</p><p><a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">URL</a> <i>(journals.plos.org)</i> <a href=\"https://doi.org/10.1371/journal.pone.0323563\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-ce-llm-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-ce-llm-2\">Updated preprint on LLMs for evidence-based decision support</a> <span>/ Jan 2025</span></h2><p>We have just updated our <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">preprint</a> on using LLMs for evidence decision support with more evaluation results and corrections from peer review.</p>\n<blockquote>\n<p>Our findings suggest that, with careful domain-specific design, LLMs could potentially be powerful tools for enabling expert-level use of evidence syntheses and databases. However, general LLMs used "out-of-the-box" are likely to perform poorly and misinform decision-makers. By establishing that LLMs exhibit comparable performance with human synthesis experts on providing restricted responses to queries of evidence syntheses and databases, future work can build on our approach to quantify LLM performance in providing open-ended responses.</p>\n</blockquote>\n<p>See also the fantastic <a href=\"https://watch.eeg.cl.cam.ac.uk/w/ijC1E36q7fn2qwxs7opSJq\">EEG seminar talk</a> that the student group who worked on this over the summer gave towards the end of last year.</p>\n<blockquote><div><p><a href=\"mailto:ri301@cam.ac.uk\"><span>Radhika Iyer</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>.</p><p>Journal paper in <a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">PLOS ONE</a> (vol 20 issue 5).</p><p><a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">URL</a> <i>(journals.plos.org)</i> <a href=\"https://doi.org/10.1371/journal.pone.0323563\">DOI</a> <a 
href=\"https://anil.recoil.org/papers/2024-ce-llm.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-ce-llm-2\">#</a> 1st May 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> <span>conservation</span> <span>evidence</span> <span>journal</span> <span>llms</span></span></div>",+"content": "<p>We have just updated our <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">preprint</a> on using LLMs for evidence decision support with more evaluation results and corrections from peer review.</p>\n<blockquote>\n<p>Our findings suggest that, with careful domain-specific design, LLMs could potentially be powerful tools for enabling expert-level use of evidence syntheses and databases. However, general LLMs used "out-of-the-box" are likely to perform poorly and misinform decision-makers. 
By establishing that LLMs exhibit comparable performance with human synthesis experts on providing restricted responses to queries of evidence syntheses and databases, future work can build on our approach to quantify LLM performance in providing open-ended responses.</p>\n</blockquote>\n<p>See also the fantastic <a href=\"https://watch.eeg.cl.cam.ac.uk/w/ijC1E36q7fn2qwxs7opSJq\">EEG seminar talk</a> that the student group who worked on this over the summer gave towards the end of last year.</p>\n<blockquote><div><p><a href=\"mailto:ri301@cam.ac.uk\"><span>Radhika Iyer</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>.</p><p>Journal paper in <a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">PLOS ONE</a> (vol 20 issue 5).</p><p><a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">URL</a> <i>(journals.plos.org)</i> <a href=\"https://doi.org/10.1371/journal.pone.0323563\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-ce-llm-3.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-ce-llm-3\">Out-of-the-box LLMs are not ready for conservation decision making</a> <span>/ May 2025</span></h2><p>Our paper on <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">how the careful design of LLMs is crucial for expert-level evidence retrieval</a> has been published today in PLOS One and is available fully open access!</p>\n<blockquote>\n<p>Our findings suggest that, with careful domain-specific design, LLMs could potentially be powerful tools for enabling expert-level use of evidence syntheses and databases. However, general LLMs used "out-of-the-box" are likely to perform poorly and misinform decision-makers. By establishing that LLMs exhibit comparable performance with human synthesis experts on providing restricted responses to queries of evidence syntheses and databases, future work can build on our approach to quantify LLM performance in providing open-ended responses.</p>\n</blockquote>\n<p>In a nutshell, we tested 10 LLMs with six different retrieval strategies on their ability to answer questions related to conservation, benchmarked against the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> database that has been hand-assembled by experts over the last two decades. 
In some of the retrieval scenarios, models were <em>only</em> allowed to use their pretrained knowledge, whereas in others they had access to the relevant parts of the hand-curated database.</p>\n<p>We found that language models had highly variable results when relying only on their pretrained data, and were particularly bad at answering questions about reptile conservation.\nHowever, given some extra training with the CE database, their performance improved dramatically.\nWhen we put these models head to head with human experts (from the conservation evidence team), with a set of questions and with RAG access to the database, we found that the models were just as good as our experts, but answered the questions far faster (near-instantly).</p>\n<p>Essentially, LLMs without extra training are likely to perform poorly and misinform decision-makers. This is crucial when considering how to build AI infrastructure for <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">public policymaking</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-ce-llm-3\">377 words</a>]</span><blockquote><div><p><a href=\"mailto:ri301@cam.ac.uk\"><span>Radhika Iyer</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>.</p><p>Journal paper in <a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">PLOS ONE</a> (vol 20 issue 5).</p><p><a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">URL</a> <i>(journals.plos.org)</i> <a href=\"https://doi.org/10.1371/journal.pone.0323563\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2024-ce-llm.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-ce-llm-3\">#</a> 1st May 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> <span>conservation</span> <span>evidence</span> <span>journal</span> <span>llms</span></span></div>",+"content": "<p>Our paper on <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">how the careful design of LLMs is crucial for expert-level evidence retrieval</a> has been published today in PLOS One and is available fully open access!</p>\n<blockquote>\n<p>Our findings suggest that, with careful domain-specific design, LLMs could potentially be powerful tools for enabling expert-level use of evidence syntheses and databases. However, general LLMs used "out-of-the-box" are likely to perform poorly and misinform decision-makers. By establishing that LLMs exhibit comparable performance with human synthesis experts on providing restricted responses to queries of evidence syntheses and databases, future work can build on our approach to quantify LLM performance in providing open-ended responses.</p>\n</blockquote>\n<p>In a nutshell, we tested 10 LLMs with six different retrieval strategies on their ability to answer questions related to conservation, benchmarked against the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> database that has been hand-assembled by experts over the last two decades. 
In some of the retrieval scenarios, models were <em>only</em> allowed to use their pretrained knowledge, whereas in others they had access to the relevant parts of the hand-curated database.</p>\n<p>We found that language models had highly variable results when relying only on their pretrained data, and were particularly bad at answering questions about reptile conservation.\nHowever, given some extra training with the CE database, their performance improved dramatically.\nWhen we put these models head to head with human experts (from the conservation evidence team), with a set of questions and with RAG access to the database, we found that the models were just as good as our experts, but answered the questions far faster (near-instantly).</p>\n<p>Essentially, LLMs without extra training are likely to perform poorly and misinform decision-makers. This is crucial when considering how to build AI infrastructure for <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">public policymaking</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-ce-llm-3\">377 words</a>]</span><blockquote><div><p><a href=\"mailto:ri301@cam.ac.uk\"><span>Radhika Iyer</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>.</p><p>Journal paper in <a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">PLOS ONE</a> (vol 20 issue 5).</p><p><a href=\"https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323563\">URL</a> <i>(journals.plos.org)</i> <a href=\"https://doi.org/10.1371/journal.pone.0323563\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-ce-llm.bib\">BIB</a> <a 
href=\"https://anil.recoil.org/papers/2024-ce-llm.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-food-life-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-food-life-1\">Quantifying the impact of the food we eat on species extinctions</a> <span>/ May 2024</span></h2><p>Submitted preprint on quantifying the biodiversity cost of global food consumption for peer review</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\"><span>David Williams</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">Cambridge Open Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-fl5fk-v2\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-food-life-1\">#</a> 1st Feb 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>agriculture</span> <span>biodiversity</span> <span>conservation</span> <span>consumption</span> <span>extinctions</span> <span>food</span> <span>land-use</span> <span>preprint</span> <span>sensing</span> <span>supplychains</span></span></div>",+"content": "<p>Submitted preprint on quantifying the biodiversity 
cost of global food consumption for peer review</p>\n<blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\"><span>David Williams</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">Cambridge Open Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-fl5fk-v2\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-food-life-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-food-life-2\">Updated preprint on quantifying biodiversity cost of food consumption</a> <span>/ Feb 2025</span></h2><p>We've uploaded a revised preprint on our ongoing work on quantifying the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity cost of global food consumption</a>, led by <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>. This is based on the <a href=\"https://anil.recoil.org/news/2024-life-3\">recently published</a> <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> metric, combined with supply chain data and provenance modeling.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-food-life-2\">196 words</a>]</span><blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\"><span>David Williams</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">Cambridge Open Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.33774/coe-2024-fl5fk-v2\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a 
href=\"https://anil.recoil.org/news/2024-food-life-2\">#</a> 1st Feb 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>agriculture</span> <span>biodiversity</span> <span>climate</span> <span>conservation</span> <span>consumption</span> <span>extinctions</span> <span>food</span> <span>land-use</span> <span>preprint</span> <span>sensing</span> <span>supplychains</span></span></div>",+"content": "<p>We've uploaded a revised preprint on our ongoing work on quantifying the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity cost of global food consumption</a>, led by <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>. This is based on the <a href=\"https://anil.recoil.org/news/2024-life-3\">recently published</a> <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> metric, combined with supply chain data and provenance modeling.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-food-life-2\">196 words</a>]</span><blockquote><div><p><a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\"><span>David Williams</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Working paper at <a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">Cambridge Open Engage</a>.</p><p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\">URL</a> <i>(cambridge.org)</i> <a 
href=\"https://doi.org/10.33774/coe-2024-fl5fk-v2\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-food-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-green-urban-eq-1.json
···+"title": "Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications",+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-green-urban-eq-1\">Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications</a> <span>/ Mar 2024</span></h2><p>Abstract on urban biodiversity and human health at <a href=\"https://meetingorganizer.copernicus.org/EGU24/EGU24-20833.html\">EGU 24</a></p>\n<blockquote><div><p><a href=\"https://ancazugo.github.io/\"><span>Andres Zu\u00f1iga-Gonzalez</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\"><span>Ronita Bardhan</span></a>.</p><p>Technical report (EGU24-20833) at <a href=\"https://meetingorganizer.copernicus.org/EGU24/EGU24-20833.html\">Copernicus Meetings</a>.</p><p><a href=\"https://meetingorganizer.copernicus.org/EGU24/EGU24-20833.html\">URL</a> <i>(meetingorganizer.copernicus.org)</i> <a href=\"https://doi.org/10.5194/egusphere-egu24-20833\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-green-urban-eq.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-green-urban-eq-1\">#</a> 1st Mar 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> <span>cities</span> <span>health</span> <span>report</span> <span>sensing</span> <span>spatial</span></span></div>",+"content": "<p>Abstract on urban biodiversity and human health at <a href=\"https://meetingorganizer.copernicus.org/EGU24/EGU24-20833.html\">EGU 24</a></p>\n<blockquote><div><p><a href=\"https://ancazugo.github.io/\"><span>Andres Zu\u00f1iga-Gonzalez</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\"><span>Ronita Bardhan</span></a>.</p><p>Technical report (EGU24-20833) at <a 
href=\"https://meetingorganizer.copernicus.org/EGU24/EGU24-20833.html\">Copernicus Meetings</a>.</p><p><a href=\"https://meetingorganizer.copernicus.org/EGU24/EGU24-20833.html\">URL</a> <i>(meetingorganizer.copernicus.org)</i> <a href=\"https://doi.org/10.5194/egusphere-egu24-20833\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-green-urban-eq.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2024-hope-bastion-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-hope-bastion-1\">Towards security specifications for agentic AIs</a> <span>/ Sep 2024</span></h2><p>A very fun talk at <a href=\"https://icfp24.sigplan.org/home/hope-2024\">ACM HOPE 2024</a>\non some new work with <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> on how we can formally specify\nsystems to be robust to code generation by AI agents. For instance, if you were\nto ask GitHub Copilot to generate code to filter endangered animals out of\na folder of images, it might interpret that as deleting the image, moving\nit to another folder (which might be public), or just removing it from the index.\nAny of those options are potentially valid, so what do we do? Our idea is to\nuse F* to specify a rich set of allowable behaviours which can then be\ndynamically enforced in less expressive languages, and thus offer layers of\nprotection against over-eager (or rogue) AI agents.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-hope-bastion-1\">183 words</a>]</span><blockquote><div><p><a href=\"https://web.eecs.umich.edu/~comar/\"><span>Cyrus Omar</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in <a href=\"https://icfp24.sigplan.org/home/hope-2024\">the 12th ACM SIGPLAN Workshop on Higher-Order Programming with Effects (HOPE)</a>.</p><p><a href=\"https://icfp24.sigplan.org/home/hope-2024\">URL</a> <i>(icfp24.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2024-hope-bastion.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-hope-bastion.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-hope-bastion-1\">#</a> 1st Sep 2024 <span><span><img alt=\"icon\" 
src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>conference</span> <span>formal</span> <span>icfp</span> <span>security</span> <span>specification</span> <span>systems</span></span></div>",+"content": "<p>A very fun talk at <a href=\"https://icfp24.sigplan.org/home/hope-2024\">ACM HOPE 2024</a>\non some new work with <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> on how we can formally specify\nsystems to be robust to code generation by AI agents. For instance, if you were\nto ask GitHub Copilot to generate code to filter endangered animals out of\na folder of images, it might interpret that as deleting the image, moving\nit to another folder (which might be public), or just removing it from the index.\nAny of those options are potentially valid, so what do we do? Our idea is to\nuse F* to specify a rich set of allowable behaviours which can then be\ndynamically enforced in less expressive languages, and thus offer layers of\nprotection against over-eager (or rogue) AI agents.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-hope-bastion-1\">183 words</a>]</span><blockquote><div><p><a href=\"https://web.eecs.umich.edu/~comar/\"><span>Cyrus Omar</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in <a href=\"https://icfp24.sigplan.org/home/hope-2024\">the 12th ACM SIGPLAN Workshop on Higher-Order Programming with Effects (HOPE)</a>.</p><p><a href=\"https://icfp24.sigplan.org/home/hope-2024\">URL</a> <i>(icfp24.sigplan.org)</i> <a href=\"https://anil.recoil.org/papers/2024-hope-bastion.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-hope-bastion.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-hyper-tropical-mapping-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-hyper-tropical-mapping-1\">Hyperspectrally identifying trees in tropical forests</a> <span>/ Jun 2024</span></h2><p>A preprint on using <a href=\"https://en.wikipedia.org/wiki/Hyperspectral_imaging\">hyperspectral sensors</a> to perform tree species identification across the tropics is now available on bioRxiv.</p>\n<blockquote>\n<p>This study introduces a new approach for mapping tree species linking a multi-temporal implementation of the CNN method detectree2 to segment tree-crowns from aerial photographs to machine learning classifiers to identify species from hyperspectral data.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://patball1.github.io\"><span>James G. C. Ball</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://scholar.google.com/citations?user=gQYhlDYAAAAJ&hl=fr\"><span>Anthony Laybros</span></a>, <a href=\"https://www.researchgate.net/profile/Colin-Prieur\"><span>Colin Prieur</span></a>, <a href=\"https://www.bristol.ac.uk/people/person/Toby-Jackson-0f0cc27a-9b35-479c-b2a6-7459834ca871/\"><span>Toby Jackson</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://orcid.org/0000-0002-5323-3866\"><span>Nicolas Barbier</span></a>, <a href=\"https://scholar.google.ca/citations?user=bc4TxdsAAAAJ\"><span>Gregoire Vincent</span></a>, and <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>.</p><p>Working paper at <a href=\"https://www.biorxiv.org/content/10.1101/2024.06.24.600405v1\">bioRxiv</a>.</p><p><a href=\"https://www.biorxiv.org/content/10.1101/2024.06.24.600405v1\">URL</a> <i>(biorxiv.org)</i> <a href=\"https://doi.org/10.1101/2024.06.24.600405\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-hyper-tropical-mapping-1\">#</a> 1st Jun 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>forests</span> <span>hyperspectral</span> <span>preprint</span> <span>satellite</span> <span>sensing</span></span></div>",+"content": "<p>A preprint on using <a href=\"https://en.wikipedia.org/wiki/Hyperspectral_imaging\">hyperspectral sensors</a> to perform tree species identification across the tropics is now available on bioRxiv.</p>\n<blockquote>\n<p>This study introduces a new approach for mapping tree species linking a multi-temporal implementation of the CNN method detectree2 to segment tree-crowns from aerial photographs to machine learning classifiers to identify species from hyperspectral data.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://patball1.github.io\"><span>James G. C. Ball</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://scholar.google.com/citations?user=gQYhlDYAAAAJ&hl=fr\"><span>Anthony Laybros</span></a>, <a href=\"https://www.researchgate.net/profile/Colin-Prieur\"><span>Colin Prieur</span></a>, <a href=\"https://www.bristol.ac.uk/people/person/Toby-Jackson-0f0cc27a-9b35-479c-b2a6-7459834ca871/\"><span>Toby Jackson</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://orcid.org/0000-0002-5323-3866\"><span>Nicolas Barbier</span></a>, <a href=\"https://scholar.google.ca/citations?user=bc4TxdsAAAAJ\"><span>Gregoire Vincent</span></a>, and <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>.</p><p>Working paper at <a href=\"https://www.biorxiv.org/content/10.1101/2024.06.24.600405v1\">bioRxiv</a>.</p><p><a href=\"https://www.biorxiv.org/content/10.1101/2024.06.24.600405v1\">URL</a> <i>(biorxiv.org)</i> <a 
href=\"https://doi.org/10.1101/2024.06.24.600405\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-life-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-life-1\">First preprint of LIFE biodiversity metric available</a> <span>/ Nov 2023</span></h2><p>The first preprint on our new <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> metric for global biodiversity is now available. It is under review, so feedback would be very welcome.</p>\n<blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/andy-arnell/\"><span>Andy Arnell</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\"><span>Daniele Baisero</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/rhys-green\"><span>Rhys Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Journal paper in <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">Philosophical Transactions of the Royal Society</a> (vol 380 issue 1917).</p><p><a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">URL</a> <i>(royalsocietypublishing.org)</i> <a href=\"https://doi.org/10.1098/rstb.2023.0327\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-life.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-life-1\">#</a> 1st Jan 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>aoh</span> <span>biodiversity</span> <span>conservation</span> <span>economics</span> <span>journal</span> <span>nature</span> <span>sdms</span> <span>sensing</span> <span>spatial</span></span></div>",+"content": "<p>The first preprint on our new <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> metric for global biodiversity is now available. It is under review, so feedback would be very welcome.</p>\n<blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/andy-arnell/\"><span>Andy Arnell</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\"><span>Daniele Baisero</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/rhys-green\"><span>Rhys Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Journal paper in <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">Philosophical Transactions of the Royal Society</a> (vol 380 issue 1917).</p><p><a 
href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">URL</a> <i>(royalsocietypublishing.org)</i> <a href=\"https://doi.org/10.1098/rstb.2023.0327\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-life-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-life-2\">Second preprint of the LIFE biodiversity metric available</a> <span>/ Jul 2024</span></h2><p>We have made an update to the <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> biodiversity metric based on reviewer feedback, and are very pleased that it has been accepted for publication early next year as part of a special issue from the Royal Society. Any comments would be most welcome before we submit the final proofs in a few months.</p>\n<blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/andy-arnell/\"><span>Andy Arnell</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\"><span>Daniele Baisero</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/rhys-green\"><span>Rhys Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Journal paper in <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">Philosophical Transactions of the Royal Society</a> (vol 380 issue 1917).</p><p><a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">URL</a> <i>(royalsocietypublishing.org)</i> <a href=\"https://doi.org/10.1098/rstb.2023.0327\">DOI</a> <a 
href=\"https://anil.recoil.org/papers/2024-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-life-2\">#</a> 1st Jan 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>aoh</span> <span>biodiversity</span> <span>conservation</span> <span>economics</span> <span>journal</span> <span>nature</span> <span>sdms</span> <span>sensing</span> <span>spatial</span></span></div>",+"content": "<p>We have made an update to the <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> biodiversity metric based on reviewer feedback, and are very pleased that it has been accepted for publication early next year as part of a special issue from the Royal Society. Any comments would be most welcome before we submit the final proofs in a few months.</p>\n<blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/andy-arnell/\"><span>Andy Arnell</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\"><span>Daniele Baisero</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/rhys-green\"><span>Rhys Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a 
href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Journal paper in <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">Philosophical Transactions of the Royal Society</a> (vol 380 issue 1917).</p><p><a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">URL</a> <i>(royalsocietypublishing.org)</i> <a href=\"https://doi.org/10.1098/rstb.2023.0327\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-life-3.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-life-3\">LIFE metric published in Royal Society Phil Trans B</a> <span>/ Jan 2025</span></h2><p>After some years of hard work, our <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a> biodiversity metric was published today in a <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">special issue</a> of the Royal Society Philosophical Transactions B! The idea behind LIFE is that although human-driven habitat loss is known to be the greatest cause of the <a href=\"https://www.unep.org/facts-about-nature-crisis\">biodiversity crisis</a>, we do not yet have robust spatially explicit metrics that <em>quantify</em> the relative impacts of human actions on species extinctions. And that's what LIFE provides: a way to compare the relative impacts of some landuse anywhere in the world, in a manner that is globally applicable.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-life-3\">409 words</a>]</span><blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/andy-arnell/\"><span>Andy Arnell</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\"><span>Daniele Baisero</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/rhys-green\"><span>Rhys Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil 
Madhavapeddy</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Journal paper in <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">Philosophical Transactions of the Royal Society</a> (vol 380 issue 1917).</p><p><a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">URL</a> <i>(royalsocietypublishing.org)</i> <a href=\"https://doi.org/10.1098/rstb.2023.0327\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-life-3\">#</a> 1st Jan 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>aoh</span> <span>biodiversity</span> <span>conservation</span> <span>economics</span> <span>journal</span> <span>nature</span> <span>sdms</span> <span>sensing</span> <span>spatial</span></span></div>",+"content": "<p>After some years of hard work, our <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a> biodiversity metric was published today in a <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">special issue</a> of the Royal Society Philosophical Transactions B! The idea behind LIFE is that although human-driven habitat loss is known to be the greatest cause of the <a href=\"https://www.unep.org/facts-about-nature-crisis\">biodiversity crisis</a>, we do not yet have robust spatially explicit metrics that <em>quantify</em> the relative impacts of human actions on species extinctions. 
And that's what LIFE provides: a way to compare the relative impacts of some landuse anywhere in the world, in a manner that is globally applicable.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-life-3\">409 words</a>]</span><blockquote><div><p><a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\"><span>Thomas Ball</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/andy-arnell/\"><span>Andy Arnell</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/daniele-baisero/\"><span>Daniele Baisero</span></a>, <a href=\"https://www.cambridgeconservation.org/about/people/paz-duran/\"><span>Am\u00e9rica Paz Dur\u00e1n</span></a>, <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\"><span>Jonathan Green</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/rhys-green\"><span>Rhys Green</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>.</p><p>Journal paper in <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">Philosophical Transactions of the Royal Society</a> (vol 380 issue 1917).</p><p><a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">URL</a> <i>(royalsocietypublishing.org)</i> <a href=\"https://doi.org/10.1098/rstb.2023.0327\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-life.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-life.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-loco-carbonres-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-loco-carbonres-1\">Prototyping carbon-aware domain name resolution</a> <span>/ Dec 2024</span></h2><p><a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and I have been thinking about how the current Internet architecture fails to treat the carbon emissions\nassociated with networked services as a first-class metric. So when the <a href=\"https://locos.codeberg.page/loco2024/\">LOCO</a> conference came up, we tried extending the DNS with load balancing techniques to consider the carbon cost of scheduling decisions. A next step was then to build a custom <a href=\"https://github.com/RyanGibb/eon\">DNS server written in OCaml</a> to actively wake machines running networked services as a side effect of the name\nresolution.</p>\n<p>Extending DNS means that we maintain compatibility with existing Internet\ninfrastructure, unlocking the ability for existing applications to be\ncarbon-aware. This is very much a spiritual follow on to the\n<a href=\"https://anil.recoil.org/papers/2013-foci-signposts\">Signposts</a> project that I worked on back in 2013, and\nhave always wanted to return to!</p>\n<blockquote><div><p><a href=\"https://ryan.freumh.org\"><span>Ryan Gibb</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_28.pdf\">1st International Workshop on Low Carbon Computing</a>.</p><p><a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_28.pdf\">URL</a> <i>(sicsa.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/2024-loco-carbonres.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-carbonres.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a 
href=\"https://anil.recoil.org/news/2024-loco-carbonres-1\">#</a> 1st Dec 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>abstract</span> <span>carbon</span> <span>distributed</span> <span>dns</span> <span>loco</span> <span>selfhosting</span> <span>signpost</span> <span>systems</span></span></div>",+"content": "<p><a href=\"https://ryan.freumh.org\">Ryan Gibb</a> and I have been thinking about how the current Internet architecture fails to treat the carbon emissions\nassociated with networked services as a first-class metric. So when the <a href=\"https://locos.codeberg.page/loco2024/\">LOCO</a> conference came up, we tried extending the DNS with load balancing techniques to consider the carbon cost of scheduling decisions. A next step was then to build a custom <a href=\"https://github.com/RyanGibb/eon\">DNS server written in OCaml</a> to actively wake machines running networked services as a side effect of the name\nresolution.</p>\n<p>Extending DNS means that we maintain compatibility with existing Internet\ninfrastructure, unlocking the ability for existing applications to be\ncarbon-aware. 
This is very much a spiritual follow on to the\n<a href=\"https://anil.recoil.org/papers/2013-foci-signposts\">Signposts</a> project that I worked on back in 2013, and\nhave always wanted to return to!</p>\n<blockquote><div><p><a href=\"https://ryan.freumh.org\"><span>Ryan Gibb</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_28.pdf\">1st International Workshop on Low Carbon Computing</a>.</p><p><a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_28.pdf\">URL</a> <i>(sicsa.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/2024-loco-carbonres.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-carbonres.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-loco-emissions-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-loco-emissions-1\">Towards verifiable, privacy-preserving carbon emissions claims</a> <span>/ Dec 2024</span></h2><p>Customers of online services may want to take carbon emissions into account\nwhen deciding which service to use, but it's currently difficult to do so due\nto the lack of reliable emissions data that is comparable across online\nservices. There's a lot of muddled data out there, and calculating accurate\ncarbon emissions across a computing pipeline involves a number of stakeholders,\nnone of whom are incentivised to accurately report their emissions for\ncompetitive reasons!</p>\n<p>In this <a href=\"https://locos.codeberg.page/loco2024/\">LOCO</a> paper, <a href=\"https://www.cst.cam.ac.uk/people/psjm3\">Jessica Man</a> lead our\nexploration of mechanisms to support verifiable <em>and</em> privacy-preserving\nemissions reporting across a chain of energy suppliers, cloud data centres,\nvirtual machine hosting services providers and cloud services providers. 
The\nidea is that all of this can ultimately be exposed to APIs that can be consumed\nby client devices in order to let consumers make direct choices about their\ndecisions based on relative environmental impacts.</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/psjm3\"><span>Jessica Man</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://martin.kleppmann.com\"><span>Martin Kleppmann</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2506.16347\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2506.16347\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2506.16347\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-loco-emissions.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-emissions.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-loco-emissions-1\">#</a> 1st Jun 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carbon</span> <span>crypto</span> <span>emissions</span> <span>loco</span> <span>preprint</span> <span>security</span> <span>zkp</span></span></div>",+"content": "<p>Customers of online services may want to take carbon emissions into account\nwhen deciding which service to use, but it's currently difficult to do so due\nto the lack of reliable emissions data that is comparable across online\nservices. 
There's a lot of muddled data out there, and calculating accurate\ncarbon emissions across a computing pipeline involves a number of stakeholders,\nnone of whom are incentivised to accurately report their emissions for\ncompetitive reasons!</p>\n<p>In this <a href=\"https://locos.codeberg.page/loco2024/\">LOCO</a> paper, <a href=\"https://www.cst.cam.ac.uk/people/psjm3\">Jessica Man</a> led our\nexploration of mechanisms to support verifiable <em>and</em> privacy-preserving\nemissions reporting across a chain of energy suppliers, cloud data centres,\nvirtual machine hosting services providers and cloud services providers. The\nidea is that all of this can ultimately be exposed to APIs that can be consumed\nby client devices in order to let consumers make direct choices based on\nrelative environmental impacts.</p>\n<blockquote><div><p><a href=\"https://www.cst.cam.ac.uk/people/psjm3\"><span>Jessica Man</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://martin.kleppmann.com\"><span>Martin Kleppmann</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2506.16347\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2506.16347\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2506.16347\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-loco-emissions.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-emissions.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-loco-shark-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-loco-shark-1\">Towards a frugal userspace for Linux</a> <span>/ Dec 2024</span></h2><p>All the work we've been doing on biodiversity (such as <a href=\"https://anil.recoil.org/projects/life\">LIFE</a>) comes at\na fairly large computation and storage cost due to the amount of data that we\nchurn through. This gets worse when you consider the exploratory nature of\nscience -- we sometimes just need to mess around with the large dataset to test\nhypotheses which are often shown to be wrong. So then, when the\n<a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO</a> conference came around, we wrote\nup our thoughts on what a <em>frugal</em> Linux userspace might look like.</p>\n<p>The key insight is that the Linux kernel already exposes a number of namespace\nmechanisms (that we use in Docker, for example), and so we explore a new OS\narchitecture which defaults to deterministic, reusable computation with the\ncareful recording of side-effects. This in turn allows Linux to guide complex\ncomputations towards previously acquired intermediate results, but still\nallowing for recomputation when required by the user. 
We're putting this\ntogether into a new shell known as \"Shark\", and this first abstract describes\nour early results.</p>\n<blockquote><div><p><a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">1st International Workshop on Low Carbon Computing</a>.</p><p><a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">URL</a> <i>(sicsa.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/2024-loco-shark.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-shark.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-loco-shark-1\">#</a> 1st Dec 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>abstract</span> <span>carbon</span> <span>docker</span> <span>life</span> <span>linux</span> <span>loco</span> <span>shark</span> <span>systems</span> <span>zfs</span></span></div>",+"content": "<p>All the work we've been doing on biodiversity (such as <a href=\"https://anil.recoil.org/projects/life\">LIFE</a>) comes at\na fairly large computation and storage cost due to the amount of data that we\nchurn through. This gets worse when you consider the exploratory nature of\nscience -- we sometimes just need to mess around with the large dataset to test\nhypotheses which are often shown to be wrong. 
So then, when the\n<a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO</a> conference came around, we wrote\nup our thoughts on what a <em>frugal</em> Linux userspace might look like.</p>\n<p>The key insight is that the Linux kernel already exposes a number of namespace\nmechanisms (that we use in Docker, for example), and so we explore a new OS\narchitecture which defaults to deterministic, reusable computation with the\ncareful recording of side-effects. This in turn allows Linux to guide complex\ncomputations towards previously acquired intermediate results, but still\nallowing for recomputation when required by the user. We're putting this\ntogether into a new shell known as \"Shark\", and this first abstract describes\nour early results.</p>\n<blockquote><div><p><a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">1st International Workshop on Low Carbon Computing</a>.</p><p><a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">URL</a> <i>(sicsa.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/2024-loco-shark.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-shark.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-loco-terracorder-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-loco-terracorder-1\">Cooperative Sensor Networks for Long-Term Biodiversity Monitoring</a> <span>/ Dec 2024</span></h2><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and I have been having great fun designing embedded systems for\ncooperative biodiversity monitoring. Josh presented our work over at <a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO\n2024</a> with an abstract on the\nTerracorder project. Read more if you enjoy a combination of machine learning\nand ESP32 hacking.</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">1st International Workshop on Low Carbon Computing</a>.</p><p><a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">URL</a> <i>(sicsa.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/2024-loco-terracorder.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-terracorder.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-loco-terracorder-1\">#</a> 1st Dec 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>abstract</span> <span>ai</span> <span>biodiversity</span> <span>embedded</span> <span>loco</span> <span>qlearning</span> <span>sensing</span> <span>terracorder</span></span></div>",+"content": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> 
and I have been having great fun designing embedded systems for\ncooperative biodiversity monitoring. Josh presented our work over at <a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO\n2024</a> with an abstract on the\nTerracorder project. Read more if you enjoy a combination of machine learning\nand ESP32 hacking.</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">1st International Workshop on Low Carbon Computing</a>.</p><p><a href=\"https://www.sicsa.ac.uk/wp-content/uploads/2024/11/LOCO2024_paper_30.pdf\">URL</a> <i>(sicsa.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/2024-loco-terracorder.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-loco-terracorder.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-nbs-risk-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-nbs-risk-1\">Preprint available on insuring against variability of NbS</a> <span>/ Mar 2024</span></h2><p>A new preprint is available on our work on ex-ante pricing models for nature-based solutions. It is currently under review, so any feedback is most welcome!</p>\n<blockquote><div><p><a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.jamesgross.org\"><span>James Gross</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>.</p><p>Journal paper in <a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">Carbon Management</a> (vol 15 issue 1).</p><p><a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">URL</a> <i>(tandfonline.com)</i> <a href=\"https://doi.org/10.1080/17583004.2024.2390854\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-nbs-risk-1\">#</a> 1st Aug 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carboncredits</span> <span>economics</span> <span>forest</span> <span>forests</span> <span>journal</span> <span>nature</span> <span>nbs</span></span></div>",+"content": "<p>A new preprint is available on our work on ex-ante pricing models for nature-based solutions. 
It is currently under review, so any feedback is most welcome!</p>\n<blockquote><div><p><a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.jamesgross.org\"><span>James Gross</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>.</p><p>Journal paper in <a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">Carbon Management</a> (vol 15 issue 1).</p><p><a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">URL</a> <i>(tandfonline.com)</i> <a href=\"https://doi.org/10.1080/17583004.2024.2390854\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-nbs-risk-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-nbs-risk-2\">Paper published on ex-ante forecasts of nature-based solutions</a> <span>/ Aug 2024</span></h2><p>Our paper on ex-ante projection for nature-based solutions has been published in the <a href=\"https://www.tandfonline.com/journals/tcmt20\">Journal of Carbon Management</a>. I also wrote up some <a href=\"https://anil.recoil.org/notes/mitigating-nbs-risk-paper\">long-form thoughts</a> on it here.</p>\n<blockquote><div><p><a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.jamesgross.org\"><span>James Gross</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>.</p><p>Journal paper in <a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">Carbon Management</a> (vol 15 issue 1).</p><p><a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">URL</a> <i>(tandfonline.com)</i> <a href=\"https://doi.org/10.1080/17583004.2024.2390854\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-nbs-risk-2\">#</a> 1st Aug 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>carboncredits</span> <span>economics</span> <span>forest</span> <span>forests</span> <span>journal</span> <span>nature</span> 
<span>nbs</span></span></div>",+"content": "<p>Our paper on ex-ante projection for nature-based solutions has been published in the <a href=\"https://www.tandfonline.com/journals/tcmt20\">Journal of Carbon Management</a>. I also wrote up some <a href=\"https://anil.recoil.org/notes/mitigating-nbs-risk-paper\">long-form thoughts</a> on it here.</p>\n<blockquote><div><p><a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\"><span>E.-Ping Rau</span></a>, <a href=\"https://www.jamesgross.org\"><span>James Gross</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>.</p><p>Journal paper in <a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">Carbon Management</a> (vol 15 issue 1).</p><p><a href=\"https://www.tandfonline.com/doi/full/10.1080/17583004.2024.2390854\">URL</a> <i>(tandfonline.com)</i> <a href=\"https://doi.org/10.1080/17583004.2024.2390854\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-nbs-risk.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-planetary-computing-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-planetary-computing-1\">A Case for Planetary Computing</a> <span>/ Mar 2023</span></h2><p>Preprint of planetary computing paper</p>\n<blockquote><div><p><a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://ameliaholcomb.github.io\"><span>Amelia Holcomb</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2303.04501\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2303.04501\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2303.04501\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-planetary-computing.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-planetary-computing-1\">#</a> 1st Mar 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>conservation</span> <span>networks</span> <span>preprint</span> <span>sensing</span> <span>systems</span></span></div>",+"content": "<p>Preprint of planetary computing paper</p>\n<blockquote><div><p><a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael 
Dales</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://ameliaholcomb.github.io\"><span>Amelia Holcomb</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2303.04501\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2303.04501\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2303.04501\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-planetary-computing.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2024-planetary-computing-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-planetary-computing-2\">A Case for Planetary Computing</a> <span>/ Mar 2024</span></h2><p>Revision of planetary computing preprint</p>\n<blockquote><div><p><a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://ameliaholcomb.github.io\"><span>Amelia Holcomb</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2303.04501\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2303.04501\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2303.04501\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-planetary-computing.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-planetary-computing-2\">#</a> 1st Mar 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>conservation</span> <span>networks</span> <span>preprint</span> <span>sensing</span> <span>systems</span></span></div>",+"content": "<p>Revision of planetary computing preprint</p>\n<blockquote><div><p><a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael 
Dales</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://ameliaholcomb.github.io\"><span>Amelia Holcomb</span></a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\"><span>Eleanor Toye Scott</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\"><span>Alison Eyres</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\"><span>Andrew Balmford</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2303.04501\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2303.04501\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2303.04501\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-planetary-computing.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_2024-sdm-sa-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-sdm-sa-1\">Predicting species using machine learning at CCAI</a> <span>/ May 2024</span></h2><p><a href=\"https://github.com/emorris7\">Emily Morris</a> did some great MPhil work here in her Masters on using <a href=\"https://anil.recoil.org/ideas/sdms-with-cnns\">CNNs with satellite data</a> to do species predictions in South Africa better. She presented it at the <a href=\"https://www.climatechange.ai/events/iclr2024\">ICLR CCAI</a> workshop in Vienna, and is now off to do a PhD at Oxford!</p>\n<blockquote>\n<p>Species distribution models are crucial tools that predict species locations by interpolating observed field data with environmental information. We develop an improved, scalable method for species distribution modelling by proposing a dataset pipeline that incorporates global remote sensing imagery, land use classification data, environmental variables, and observation data, and utilising this with CNN models to predict species presence at higher spatial and temporal resolutions than well-established species distribution modelling methods.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://github.com/emorris7\"><span>Emily Morris</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, and <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>.</p><p>Paper in the <a href=\"https://www.climatechange.ai/papers/iclr2024/67\">proceedings of the ICLR 2024 Workshop on Tackling Climate Change with Machine Learning</a>.</p><p><a href=\"https://www.climatechange.ai/papers/iclr2024/67\">URL</a> <i>(climatechange.ai)</i> <a href=\"https://anil.recoil.org/papers/2024-sdm-sa.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-sdm-sa.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a 
href=\"https://anil.recoil.org/news/2024-sdm-sa-1\">#</a> 1st May 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> <span>conference</span> <span>sdms</span> <span>sensing</span></span></div>",+"content": "<p><a href=\"https://github.com/emorris7\">Emily Morris</a> did some great MPhil work here in her Masters on using <a href=\"https://anil.recoil.org/ideas/sdms-with-cnns\">CNNs with satellite data</a> to do species predictions in South Africa better. She presented it at the <a href=\"https://www.climatechange.ai/events/iclr2024\">ICLR CCAI</a> workshop in Vienna, and is now off to do a PhD at Oxford!</p>\n<blockquote>\n<p>Species distribution models are crucial tools that predict species locations by interpolating observed field data with environmental information. We develop an improved, scalable method for species distribution modelling by proposing a dataset pipeline that incorporates global remote sensing imagery, land use classification data, environmental variables, and observation data, and utilising this with CNN models to predict species presence at higher spatial and temporal resolutions than well-established species distribution modelling methods.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://github.com/emorris7\"><span>Emily Morris</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, and <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>.</p><p>Paper in the <a href=\"https://www.climatechange.ai/papers/iclr2024/67\">proceedings of the ICLR 2024 Workshop on Tackling Climate Change with Machine Learning</a>.</p><p><a href=\"https://www.climatechange.ai/papers/iclr2024/67\">URL</a> <i>(climatechange.ai)</i> <a href=\"https://anil.recoil.org/papers/2024-sdm-sa.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-sdm-sa.pdf\"><span>PDF<img 
alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-sensys-terracorder-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-sensys-terracorder-1\">Presented poster at Sensys on low-power biodiversity monitoring</a> <span>/ Nov 2024</span></h2><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> presented our work on biodiversity sensing over at <a href=\"http://sensys.acm.org/2024/\">ACM Sensys 2024</a> in China. The <a href=\"http://sensys.acm.org/2024/demos/\">full set</a> of papers and demos has a range of impressive work on sensor networks, and some that stood out to me follow.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-sensys-terracorder-1\">140 words</a>]</span><blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://dl.acm.org/doi/10.1145/3666025.3699400\">proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3666025.3699400\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3666025.3699400\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-sensys-terracorder.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-sensys-terracorder.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-sensys-terracorder-1\">#</a> 1st Nov 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>abstract</span> <span>biodiversity</span> <span>sensing</span> <span>sensys</span> <span>terracorder</span></span></div>",+"content": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> presented our 
work on biodiversity sensing over at <a href=\"http://sensys.acm.org/2024/\">ACM Sensys 2024</a> in China. The <a href=\"http://sensys.acm.org/2024/demos/\">full set</a> of papers and demos has a range of impressive work on sensor networks, and some that stood out to me follow.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-sensys-terracorder-1\">140 words</a>]</span><blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Abstract in the <a href=\"https://dl.acm.org/doi/10.1145/3666025.3699400\">proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems</a>.</p><p><a href=\"https://dl.acm.org/doi/10.1145/3666025.3699400\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/3666025.3699400\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-sensys-terracorder.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-sensys-terracorder.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-socc-murmuration-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-socc-murmuration-1\">Paper on scheduling for reduced tail task latencies</a> <span>/ Nov 2024</span></h2><p><a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a> went along to Seattle to <a href=\"https://acmsocc.org/2024/\">SOCC 2024</a> to present her PhD research on Murmuration. This is a new scheduler for Kubernetes that allows for 15%--25% faster job completion times than the default scheduler for different job arrival characteristics in datacenters that are very busy.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-socc-murmuration-1\">71 words</a>]</span><blockquote><div><p><a href=\"https://www.cl.cam.ac.uk/~sv440/\"><span>Smita Vijayakumar</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cst.cam.ac.uk/people/ek264\"><span>Evangelia Kalyvianaki</span></a>.</p><p>Paper in the <a href=\"https://acmsocc.org/2024/\">proceedings of the 2024 ACM Symposium on Cloud Computing</a>.</p><p><a href=\"https://acmsocc.org/2024/\">URL</a> <i>(acmsocc.org)</i> <a href=\"https://doi.org/10.1145/3698038.3698522\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-socc-murmuration.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-socc-murmuration.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-socc-murmuration-1\">#</a> 1st Nov 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>conference</span> <span>distributed</span> <span>scheduling</span> <span>systems</span></span></div>",+"content": "<p><a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a> went along to Seattle to <a href=\"https://acmsocc.org/2024/\">SOCC 2024</a> to present her PhD research on Murmuration. 
This is a new scheduler for Kubernetes that allows for 15%--25% faster job completion times than the default scheduler for different job arrival characteristics in datacenters that are very busy.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2024-socc-murmuration-1\">71 words</a>]</span><blockquote><div><p><a href=\"https://www.cl.cam.ac.uk/~sv440/\"><span>Smita Vijayakumar</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.cst.cam.ac.uk/people/ek264\"><span>Evangelia Kalyvianaki</span></a>.</p><p>Paper in the <a href=\"https://acmsocc.org/2024/\">proceedings of the 2024 ACM Symposium on Cloud Computing</a>.</p><p><a href=\"https://acmsocc.org/2024/\">URL</a> <i>(acmsocc.org)</i> <a href=\"https://doi.org/10.1145/3698038.3698522\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-socc-murmuration.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-socc-murmuration.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-terracorder-1.json
+18
avsm/news_2024-terracorder-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-terracorder-1\">Preprint on Terracorder sensing now available</a> <span>/ Aug 2024</span></h2><p>Our preprint on the Terracorder ground sensing platform I've been working on with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> at Imperial is now available on arXiv. It's a heady combination of very low-power ESP32 hardware and Q-learning, building cooperative networks of devices that can run for long periods without wasting energy on redundant operations.</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2408.02407\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2408.02407\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2408.02407\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-terracorder.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-terracorder.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-terracorder-1\">#</a> 1st Aug 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> <span>esp32</span> <span>preprint</span> <span>qlearning</span> <span>sensing</span> <span>terracorder</span></span></div>",+"content": "<p>Our preprint on the Terracorder ground sensing platform I've been working on with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> at Imperial is now available on arXiv. 
It's a heady combination of very low-power ESP32 hardware and Q-learning, building cooperative networks of devices that can run for long periods without wasting energy on redundant operations.</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2408.02407\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2408.02407\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2408.02407\">DOI</a> <a href=\"https://anil.recoil.org/papers/2024-terracorder.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-terracorder.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2024-uncertainty-cs-1.json
+18
avsm/news_2024-uncertainty-cs-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2024-uncertainty-cs-1\">Uncertainty at scale: how CS hinders climate research</a> <span>/ Feb 2024</span></h2><p>Paper on uncertainty in climate science in <a href=\"https://undonecs.sciencesconf.org\">Undone CS</a></p>\n<blockquote><div><p><a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas Swinfield</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"https://undonecs.sciencesconf.org/data/Undonecs_2024_abstract_43.pdf\">Undone Computer Science</a>.</p><p><a href=\"https://undonecs.sciencesconf.org/data/Undonecs_2024_abstract_43.pdf\">URL</a> <i>(undonecs.sciencesconf.org)</i> <a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2024-uncertainty-cs-1\">#</a> 1st Feb 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>biodiversity</span> <span>climate</span> <span>preprint</span> <span>satellites</span> <span>shark</span></span></div>",+"content": "<p>Paper on uncertainty in climate science in <a href=\"https://undonecs.sciencesconf.org\">Undone CS</a></p>\n<blockquote><div><p><a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\"><span>Thomas 
Swinfield</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\"><span>Srinivasan Keshav</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"https://undonecs.sciencesconf.org/data/Undonecs_2024_abstract_43.pdf\">Undone Computer Science</a>.</p><p><a href=\"https://undonecs.sciencesconf.org/data/Undonecs_2024_abstract_43.pdf\">URL</a> <i>(undonecs.sciencesconf.org)</i> <a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2025-dl-rcn-1.json
+18
avsm/news_2025-dl-rcn-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2025-dl-rcn-1\">New preprint survey on energy-aware deep learning on embedded hardware</a> <span>/ May 2025</span></h2><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> has just released the latest survey paper he led on energy-aware approaches to optimise deep-learning training and inference on embedded devices, such as those recently benchmarked in &quot;<a href=\"https://anil.recoil.org/papers/2025-npu-bench\">Benchmarking Ultra-Low-Power \u00b5NPUs</a>&quot;.</p>\n<blockquote>\n<p>We present an overview of such approaches, outlining their methodologies, implications for energy consumption and system-level efficiency, and their limitations in terms of supported network types, hardware platforms, and application scenarios. We hope our review offers a clear synthesis of the evolving energy-aware DL landscape and serves as a foundation for future research in energy-constrained computing.</p>\n</blockquote>\n<p>Any comments, please do let any of us know!</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2505.12523\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2505.12523\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2505.12523\">DOI</a> <a href=\"https://anil.recoil.org/papers/2025-dl-rcn.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2025-dl-rcn.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2025-dl-rcn-1\">#</a> 1st May 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>embedded</span> <span>esp32</span> 
<span>llms</span> <span>preprint</span> <span>sensing</span> <span>systems</span></span></div>",+"content": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> has just released the latest survey paper he led on energy-aware approaches to optimise deep-learning training and inference on embedded devices, such as those recently benchmarked in &quot;<a href=\"https://anil.recoil.org/papers/2025-npu-bench\">Benchmarking Ultra-Low-Power \u00b5NPUs</a>&quot;.</p>\n<blockquote>\n<p>We present an overview of such approaches, outlining their methodologies, implications for energy consumption and system-level efficiency, and their limitations in terms of supported network types, hardware platforms, and application scenarios. We hope our review offers a clear synthesis of the evolving energy-aware DL landscape and serves as a foundation for future research in energy-constrained computing.</p>\n</blockquote>\n<p>Any comments, please do let any of us know!</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2505.12523\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2505.12523\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2505.12523\">DOI</a> <a href=\"https://anil.recoil.org/papers/2025-dl-rcn.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2025-dl-rcn.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2025-internet-ecology-1.json
+18
avsm/news_2025-internet-ecology-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2025-internet-ecology-1\">Steps towards an ecology of the Internet</a> <span>/ Jun 2025</span></h2><p>Every ten years, the city of <a href=\"https://www.visitdenmark.com/denmark/destinations/jutland/aarhus\">Aarhus</a> throws a giant conference to discuss new agendas for critical action and theory in computing. Back in 2016, <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> and I posited the idea of <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox\">personal data stores</a>, a topic that is just now becoming hot due to agentic AI. Well, time flies, and I'm pleased to report that our <em>second</em> decennial thought experiment on <strong>&quot;<a href=\"https://anil.recoil.org/papers/2025-internet-ecology\">Steps towards an Ecology for the Internet</a>&quot;</strong> will appear at the 2025 edition of Aarhus this August!</p>\n<p>This time around, we projected our imaginations forward a decade to picture an optimistic future for the Internet, when it has <a href=\"https://archive.org/details/trillionsthrivin0000luca\">exceeded a trillion nodes</a>. After deciding in the <a href=\"https://www.themillpubcambridge.com/\">pub</a> that this many nodes was too many for us to handle, we turned to our newfound buddies in <a href=\"https://anil.recoil.org/news?t=conservation\">conservation</a> to get inspiration from nature. 
We asked <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> first-year undergraduate questions about how natural ecosystems operate across <em>all</em> levels of scale: from DNA through to cells through to whole populations.\nWe spent hours discussing the strange correspondences between the seeming chaos in the low-level interactions between cells and the extraordinary emergent discipline through which biological development typically takes place.</p>\n<p>Then, going back to the computer scientists in our group and more widely (like <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> who I ran into at <a href=\"https://www.mcgill.ca/bellairs/\">Bellairs</a>), it turns out that this fosters some really wild ideas for how the Internet itself could evolve into the future. 
We could adopt biological process models within the heart of the <a href=\"https://en.wikipedia.org/wiki/End-to-end_principle\">end-to-end principle</a> that has driven the Internet architecture for decades!</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2025-internet-ecology-1\">623 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://ryan.freumh.org\"><span>Ryan Gibb</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://web.eecs.umich.edu/~comar/\"><span>Cyrus Omar</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2506.06469\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2506.06469\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.1145/3744169.3744180\">DOI</a> <a href=\"https://anil.recoil.org/papers/2025-internet-ecology.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2025-internet-ecology.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2025-internet-ecology-1\">#</a> 1st Jun 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>ai</span> <span>biodiversity</span> 
<span>community</span> <span>ecology</span> <span>internet</span> <span>llms</span> <span>networking</span> <span>opensource</span> <span>preprint</span></span></div>",+"content": "<p>Every ten years, the city of <a href=\"https://www.visitdenmark.com/denmark/destinations/jutland/aarhus\">Aarhus</a> throws a giant conference to discuss new agendas for critical action and theory in computing. Back in 2016, <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> and I posited the idea of <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox\">personal data stores</a>, a topic that is just now becoming hot due to agentic AI. Well, time flies, and I'm pleased to report that our <em>second</em> decennial thought experiment on <strong>&quot;<a href=\"https://anil.recoil.org/papers/2025-internet-ecology\">Steps towards an Ecology for the Internet</a>&quot;</strong> will appear at the 2025 edition of Aarhus this August!</p>\n<p>This time around, we projected our imaginations forward a decade to picture an optimistic future for the Internet, when it has <a href=\"https://archive.org/details/trillionsthrivin0000luca\">exceeded a trillion nodes</a>. After deciding in the <a href=\"https://www.themillpubcambridge.com/\">pub</a> that this many nodes was too many for us to handle, we turned to our newfound buddies in <a href=\"https://anil.recoil.org/news?t=conservation\">conservation</a> to get inspiration from nature. 
We asked <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> first-year undergraduate questions about how natural ecosystems operate across <em>all</em> levels of scale: from DNA through to cells through to whole populations.\nWe spent hours discussing the strange correspondences between the seeming chaos in the low-level interactions between cells and the extraordinary emergent discipline through which biological development typically takes place.</p>\n<p>Then, going back to the computer scientists in our group and more widely (like <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> who I ran into at <a href=\"https://www.mcgill.ca/bellairs/\">Bellairs</a>), it turns out that this fosters some really wild ideas for how the Internet itself could evolve into the future. 
We could adopt biological process models within the heart of the <a href=\"https://en.wikipedia.org/wiki/End-to-end_principle\">end-to-end principle</a> that has driven the Internet architecture for decades!</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/2025-internet-ecology-1\">623 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://samreynolds.org/\"><span>Sam Reynolds</span></a>, <a href=\"https://profiles.imperial.ac.uk/a.christie\"><span>Alec Christie</span></a>, <a href=\"https://coomeslab.org\"><span>David Coomes</span></a>, <a href=\"https://mynameismwd.org\"><span>Michael Dales</span></a>, <a href=\"https://patrick.sirref.org\"><span>Patrick Ferris</span></a>, <a href=\"https://ryan.freumh.org\"><span>Ryan Gibb</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, <a href=\"https://toao.com\"><span>Sadiq Jaffer</span></a>, <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://web.eecs.umich.edu/~comar/\"><span>Cyrus Omar</span></a>, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\"><span>Bill Sutherland</span></a>, and <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2506.06469\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2506.06469\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.1145/3744169.3744180\">DOI</a> <a href=\"https://anil.recoil.org/papers/2025-internet-ecology.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2025-internet-ecology.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_2025-npu-bench-1.json
+18
avsm/news_2025-npu-bench-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2025-npu-bench-1\">New preprint on benchmarking ultra-low power neural accelerators</a> <span>/ Mar 2025</span></h2><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> just released our latest preprint on how to make sense of the growing number of dedicated, ultra-low-power 'neural network accelerators' that are found in many modern embedded chipsets. My interest in this derives from wanting to decouple from the cloud when it comes to <a href=\"https://anil.recoil.org/projects/osmose\">low-latency local environments</a>, and this needs fast tensor operations in hardware. Josh found a huge number of interesting NPUs in modern low-cost chips, ranging from <a href=\"https://www.espressif.com/en/products/socs/esp32\">ESP32</a>-based boards to <a href=\"https://arm.com\">ARM</a> ones. All of these have quite a variety of tradeoffs, from the operations supported (which affects which models can be run on them) to the amount of memory and CPU power.</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://yushan-huang.github.io/\"><span>Yushan Huang</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2503.22567\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2503.22567\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2503.22567\">DOI</a> <a href=\"https://anil.recoil.org/papers/2025-npu-bench.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2025-npu-bench.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/2025-npu-bench-1\">#</a> 
1st Mar 2025 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>biodiversity</span> <span>conservation</span> <span>embedded</span> <span>esp32</span> <span>preprint</span> <span>sensing</span></span></div>",+"content": "<p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> just released our latest preprint on how to make sense of the growing number of dedicated, ultra-low-power 'neural network accelerators' that are found in many modern embedded chipsets. My interest in this derives from wanting to decouple from the cloud when it comes to <a href=\"https://anil.recoil.org/projects/osmose\">low-latency local environments</a>, and this needs fast tensor operations in hardware. Josh found a huge number of interesting NPUs in modern low-cost chips, ranging from <a href=\"https://www.espressif.com/en/products/socs/esp32\">ESP32</a>-based boards to <a href=\"https://arm.com\">ARM</a> ones. All of these have quite a variety of tradeoffs, from the operations supported (which affects which models can be run on them) to the amount of memory and CPU power.</p>\n<blockquote><div><p><a href=\"https://profiles.imperial.ac.uk/joshua.millar22\"><span>Josh Millar</span></a>, <a href=\"https://yushan-huang.github.io/\"><span>Yushan Huang</span></a>, <a href=\"https://www.imperial.ac.uk/people/sarab.sethi\"><span>Sarab Sethi</span></a>, <a href=\"https://haddadi.github.io/\"><span>Hamed Haddadi</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Working paper at <a href=\"http://arxiv.org/abs/2503.22567\">arXiv</a>.</p><p><a href=\"http://arxiv.org/abs/2503.22567\">URL</a> <i>(arxiv.org)</i> <a href=\"https://doi.org/10.48550/arXiv.2503.22567\">DOI</a> <a href=\"https://anil.recoil.org/papers/2025-npu-bench.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/2025-npu-bench.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_287364fa-b59c-4b9f-812d-d81cc0c992a5-1.json
+18
avsm/news_287364fa-b59c-4b9f-812d-d81cc0c992a5-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/287364fa-b59c-4b9f-812d-d81cc0c992a5-1\">Programming the Next Trillion Embedded Devices</a> <span>/ Feb 2020</span></h2><p>Part 3</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/287364fa-b59c-4b9f-812d-d81cc0c992a5-1\">#</a> 26th Feb 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>embedded</span> <span>keynote</span> <span>mirageos</span> <span>scotland</span> <span>systems</span> <span>unikernels</span></span></div>",
+18
avsm/news_2f824dde-e112-4f4f-890d-1825572ea1c4-1.json
+18
avsm/news_2f824dde-e112-4f4f-890d-1825572ea1c4-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/2f824dde-e112-4f4f-890d-1825572ea1c4-1\">State of the OCaml Platform</a> <span>/ Sep 2017</span></h2><p>Talk on the state of the OCaml Platform</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/2f824dde-e112-4f4f-890d-1825572ea1c4-1\">#</a> 8th Sep 2017 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>devtools</span> <span>icfp</span> <span>ocaml</span> <span>opensource</span></span></div>",
+18
avsm/news_35e1a70d-0fb4-49b1-86ce-dd6266b812de-1.json
+18
avsm/news_35e1a70d-0fb4-49b1-86ce-dd6266b812de-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/35e1a70d-0fb4-49b1-86ce-dd6266b812de-1\">The State of the OCaml Platform</a> <span>/ Sep 2015</span></h2><p>Update on the state of the OCaml Platform</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/35e1a70d-0fb4-49b1-86ce-dd6266b812de-1\">#</a> 4th Sep 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>devtools</span> <span>ocaml</span></span></div>",
+18
avsm/news_4390c1d0-ed4f-4c01-9e10-dab2a3faed7a-1.json
+18
avsm/news_4390c1d0-ed4f-4c01-9e10-dab2a3faed7a-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/4390c1d0-ed4f-4c01-9e10-dab2a3faed7a-1\">OCaml 2014: The OCaml Platform v1.0</a> <span>/ Sep 2014</span></h2><p>Talk on the OCaml Platform reaching v1.0</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/4390c1d0-ed4f-4c01-9e10-dab2a3faed7a-1\">#</a> 5th Sep 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>devtools</span> <span>icfp</span> <span>ocaml</span></span></div>",
+18
avsm/news_43ab3ae0-9ffc-474f-aa02-3cc1139f54d1-1.json
+18
avsm/news_43ab3ae0-9ffc-474f-aa02-3cc1139f54d1-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/43ab3ae0-9ffc-474f-aa02-3cc1139f54d1-1\">Building the Xen toolstack using OCaml</a> <span>/ Nov 2010</span></h2><p>Talk on building the Xen toolstack using OCaml</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/43ab3ae0-9ffc-474f-aa02-3cc1139f54d1-1\">#</a> 5th Nov 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>cloud</span> <span>fp</span> <span>icfp</span> <span>mirageos</span> <span>ocaml</span> <span>systems</span> <span>unikernels</span></span></div>",
+18
avsm/news_46968fa0-e5bd-4df8-98e1-3cf88d9b31e5-1.json
+18
avsm/news_46968fa0-e5bd-4df8-98e1-3cf88d9b31e5-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/46968fa0-e5bd-4df8-98e1-3cf88d9b31e5-1\">Jitsu: Just-in-Time Summoning of Unikernels (new directions in operating systems)</a> <span>/ Nov 2014</span></h2><p>New Directions in Operating Systems talk on Jitsu</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/46968fa0-e5bd-4df8-98e1-3cf88d9b31e5-1\">#</a> 25th Nov 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>dns</span> <span>london</span> <span>mirageos</span> <span>security</span> <span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>New Directions in Operating Systems talk on Jitsu</p>\n<p></p><div></div><p></p>",
+18
avsm/news_48a7ab10-3f49-4978-a00f-c26b64c2cae7-1.json
+18
avsm/news_48a7ab10-3f49-4978-a00f-c26b64c2cae7-1.json
···+"title": "BBC report on the new Cambridge supercomputer (\"Dawn\") announced at the 2023 AI Summit",+"summary": "<h2><a href=\"https://anil.recoil.org/news/48a7ab10-3f49-4978-a00f-c26b64c2cae7-1\">BBC report on the new Cambridge supercomputer ("Dawn") announced at the 2023 AI Summit</a> <span>/ Nov 2023</span></h2><p>On the BBC briefly about the Dawn supercomputer</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/48a7ab10-3f49-4978-a00f-c26b64c2cae7-1\">#</a> 2nd Nov 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>cloud</span> <span>hpc</span> <span>interview</span></span></div>",
+18
avsm/news_4957325f-d7f5-4a29-95b6-a1e1f61ea5cf-1.json
+18
avsm/news_4957325f-d7f5-4a29-95b6-a1e1f61ea5cf-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/4957325f-d7f5-4a29-95b6-a1e1f61ea5cf-1\">Turning Down the LAMP: Software Specialisation for the Cloud</a> <span>/ Jun 2010</span></h2><p>At HotCloud for the first talk about MirageOS</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/4957325f-d7f5-4a29-95b6-a1e1f61ea5cf-1\">#</a> 22nd Jun 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>boston</span> <span>mirageos</span> <span>systems</span> <span>unikernels</span></span></div>",
+18
avsm/news_4c-1.json
+18
avsm/news_4c-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/4c-1\">Trusted Carbon Credits</a> <span>/ May 2022</span></h2><p>With the recent controversies over low-integrity carbon credits, I spoke to Vox magazine\nabout my skepticism about Adam Neumann's new startup.</p>\n<blockquote>\n<p>"The problem with the current markets is nothing to do with how we can trade these more effectively," said Anil Madhavapeddy, who is an associate professor of computer science and technology at Cambridge University and the director of the Cambridge Center for Carbon Credits. "We just do not have enough supply."\n-- <a href=\"https://www.vox.com/recode/23142106/adam-neumann-crypto-carbon-credit-offset-flowcarbon\">Vox</a></p>\n</blockquote>\n<div><p>The Cambridge Centre for Carbon Credits is an initiative I started with <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>, <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, aimed at issuing trusted and verifiable carbon credits towards the prevention of nature destruction due to anthropogenic actions. 
We researched a combination of large-scale data processing (satellite and sensor networks) and decentralised <a href=\"https://tezos.com\">Tezos</a> smart contracts to design a carbon marketplace with verifiable transactions that link back to trusted primary observations.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/projects/4c\">422 words</a>]</span></div><div><a href=\"https://anil.recoil.org/news/4c-1\">#</a> 1st Jan 2021 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/project.svg\">projects</span> <span>carboncredits</span> <span>conservation</span> <span>systems</span></span></div>",+"content": "<p>With the recent controversies over low-integrity carbon credits, I spoke to Vox magazine\nabout my skepticism about Adam Neumann's new startup.</p>\n<blockquote>\n<p>&quot;The problem with the current markets is nothing to do with how we can trade these more effectively,&quot; said Anil Madhavapeddy, who is an associate professor of computer science and technology at Cambridge University and the director of the Cambridge Center for Carbon Credits. &quot;We just do not have enough supply.&quot;\n-- <a href=\"https://www.vox.com/recode/23142106/adam-neumann-crypto-carbon-credit-offset-flowcarbon\">Vox</a></p>\n</blockquote>\n<div><p>The Cambridge Centre for Carbon Credits is an initiative I started with <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>, <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, aimed at issuing trusted and verifiable carbon credits towards the prevention of nature destruction due to anthropogenic actions. 
We researched a combination of large-scale data processing (satellite and and sensor networks) and decentralised <a href=\"https://tezos.com\">Tezos</a> smart contracts to design a carbon marketplace with verifiable transactions that link back to trusted primary observations.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/projects/4c\">422 words</a>]</span></div>",
+18
avsm/news_55852136-843d-4043-98e7-6b46c6d39b01-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/55852136-843d-4043-98e7-6b46c6d39b01-1\">Unikernels: Functional Infrastructure with Mirage OS</a> <span>/ May 2015</span></h2><p>Talk at Esper on functional programming with unikernels</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/55852136-843d-4043-98e7-6b46c6d39b01-1\">#</a> 12th May 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>california</span> <span>docker</span> <span>irmin</span> <span>mirageos</span> <span>ocaml</span> <span>unikernels</span></span></div>",+"content": "<p>Talk at Esper on functional programming with unikernels</p>\n<p></p><div></div><p></p>",
+18
avsm/news_5cdf2eef-9053-428e-b8b3-ab5ae274c129-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/5cdf2eef-9053-428e-b8b3-ab5ae274c129-1\">FLOSS Weekly 302: Open Mirage</a> <span>/ Jul 2014</span></h2><p>Appeared on FLOSS Weekly 302 about Open Mirage</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/5cdf2eef-9053-428e-b8b3-ab5ae274c129-1\">#</a> 23rd Jul 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>mirageos</span> <span>ocaml</span> <span>opensource</span> <span>unikernels</span></span></div>",
+18
avsm/news_644914a5-a40b-4ef7-bb17-cea43c95dd09-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/644914a5-a40b-4ef7-bb17-cea43c95dd09-1\">Codemesh 2014: Nymote: Git Your Own Cloud Here</a> <span>/ Dec 2014</span></h2><p>Gave Codemesh 2014 talk on Nymote</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/644914a5-a40b-4ef7-bb17-cea43c95dd09-1\">#</a> 17th Dec 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>irmin</span> <span>mirageos</span> <span>ocaml</span> <span>security</span> <span>selfhosting</span> <span>unikernels</span></span></div>",
+18
avsm/news_725dda70-b12b-4b1a-a8ae-fa9c22683ff2-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/725dda70-b12b-4b1a-a8ae-fa9c22683ff2-1\">Unikernels: the rise of the library hypervisor in MirageOS</a> <span>/ Oct 2016</span></h2><p>DockerCon talk on unikernels and MirageOS</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/725dda70-b12b-4b1a-a8ae-fa9c22683ff2-1\">#</a> 14th Oct 2016 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>docker</span> <span>mirageos</span> <span>ocaml</span> <span>systems</span> <span>unikernels</span> <span>xen</span></span></div>",
+18
avsm/news_762795c5-9f3b-499b-a054-b2af37d1ddd2-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/762795c5-9f3b-499b-a054-b2af37d1ddd2-1\">Mirage Developer Preview 1 screencast</a> <span>/ Jul 2013</span></h2><p>Mirage Developer Preview 1 screencast</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/762795c5-9f3b-499b-a054-b2af37d1ddd2-1\">#</a> 26th Jul 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>mirageos</span> <span>ocaml</span> <span>screencast</span> <span>unikernels</span></span></div>",
+18
avsm/news_7d949597-b864-4ada-ab1a-81ff8c0463e2-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/7d949597-b864-4ada-ab1a-81ff8c0463e2-1\">OCaml Meeting 2011 - MirageOS</a> <span>/ Oct 2011</span></h2><p>At the OCaml Meeting 2011 speaking about MirageOS</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/7d949597-b864-4ada-ab1a-81ff8c0463e2-1\">#</a> 19th Oct 2011 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span></span></div>",+"content": "<p>At the OCaml Meeting 2011 speaking about MirageOS</p>\n<p></p><div></div><p></p>",
+18
avsm/news_80795e06-ac75-4015-b178-3cfcbb233685-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/80795e06-ac75-4015-b178-3cfcbb233685-1\">Speaking at CCI workshop on conservation evidence</a> <span>/ Jun 2024</span></h2><p><a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> organised a workshop at the CCI on how to bring about an <a href=\"https://about.conservationevidence.com/2024/07/12/the-next-steps-for-transforming-conservation-ideas-from-the-effectiveness-revolution-workshop/\">Effectiveness Revolution</a> for transforming conservation into an evidence-driven discipline.</p>\n<blockquote>\n<p>The aim was to discuss the "Evidence Emergency" (The Wildlife Trusts' term), the urgent need to embed evidence into decision-making and to create additional evidence to fill the considerable gaps in the evidence base, to improve conservation practice.</p>\n</blockquote>\n<p>I gave a talk about our early results with the <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">conservation copilots</a> work.</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/80795e06-ac75-4015-b178-3cfcbb233685-1\">#</a> 25th Jun 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>ai</span> <span>biodiversity</span> <span>conservation</span> <span>evidence</span> <span>llms</span></span></div>",+"content": "<p><a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> organised a workshop at the CCI on how to bring about an <a href=\"https://about.conservationevidence.com/2024/07/12/the-next-steps-for-transforming-conservation-ideas-from-the-effectiveness-revolution-workshop/\">Effectiveness Revolution</a> for transforming conservation into an evidence-driven discipline.</p>\n<blockquote>\n<p>The aim was to discuss the "Evidence Emergency" (The Wildlife Trusts' term), the urgent need to embed evidence into 
decision-making and to create additional evidence to fill the considerable gaps in the evidence base, to improve conservation practice.</p>\n</blockquote>\n<p>I gave a talk about our early results with the <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">conservation copilots</a> work.</p>\n<p></p><div></div><p></p>",
+18
avsm/news_8c92d6cf-3e05-429f-8c8e-094f77be61c6-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/8c92d6cf-3e05-429f-8c8e-094f77be61c6-1\">Ian Eyberg, Joshua Bernstein, Anil Madhavapeddy at OSCON in Austin</a> <span>/ Jun 2016</span></h2><p>Interviewed by The New Stack at OSCON in Austin, Texas</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/8c92d6cf-3e05-429f-8c8e-094f77be61c6-1\">#</a> 6th Jun 2016 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>fp</span> <span>interview</span> <span>mirageos</span> <span>opensource</span> <span>oscon</span> <span>unikernels</span></span></div>",+"content": "<p>Interviewed by The New Stack at OSCON in Austin, Texas</p>\n<p></p><div></div><p></p>",
+18
avsm/news_981c00b5-32c0-4cac-a387-6c945dfa9934-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/981c00b5-32c0-4cac-a387-6c945dfa9934-1\">Functional Programming for the Planet</a> <span>/ Sep 2023</span></h2><p>Keynoted at ICFP 2023 on Functional Programming for the Planet</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/981c00b5-32c0-4cac-a387-6c945dfa9934-1\">#</a> 5th Sep 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>biodiversity</span> <span>forests</span> <span>fp</span> <span>icfp</span> <span>keynote</span> <span>satellite</span> <span>seattle</span> <span>sensing</span></span></div>",+"content": "<p>Keynoted at ICFP 2023 on Functional Programming for the Planet</p>\n<p></p><div></div><p></p>",
+18
avsm/news_a0280750-2ef0-4f5c-b138-68f7b11b4c29-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/a0280750-2ef0-4f5c-b138-68f7b11b4c29-1\">Mapping greener futures with planetary computing</a> <span>/ Oct 2024</span></h2><p>I got invited by <a href=\"https://profiles.ucl.ac.uk/78591-serta%C3%A7-sehlikoglu\">Serta\u00e7 Sehlikoglu</a> to deliver a lecture to the Masters students down at the <a href=\"https://www.ucl.ac.uk/bartlett/igp/\">UCL Institute for Global Prosperity</a>. I talked about the recent work on <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a>, with an overview of the <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> papers.</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/a0280750-2ef0-4f5c-b138-68f7b11b4c29-1\">#</a> 25th Oct 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>biodiversity</span> <span>conservation</span> <span>food</span> <span>london</span> <span>sensing</span> <span>spatial</span> <span>systems</span></span></div>",+"content": "<p>I got invited by <a href=\"https://profiles.ucl.ac.uk/78591-serta%C3%A7-sehlikoglu\">Serta\u00e7 Sehlikoglu</a> to deliver a lecture to the Masters students down at the <a href=\"https://www.ucl.ac.uk/bartlett/igp/\">UCL Institute for Global Prosperity</a>. I talked about the recent work on <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a>, with an overview of the <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> papers.</p>\n<p></p><div></div><p></p>",
+18
avsm/news_a26475b5-c169-478e-b88e-be5cd1f2aff8-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/a26475b5-c169-478e-b88e-be5cd1f2aff8-1\">17th William Pitt Seminar - Who's in Charge?</a> <span>/ Nov 2022</span></h2><p>I opened the 17th William Pitt Seminar at Pembroke College on climate change with a brief talk about the status of the world's biodiversity, and how we have more agency than ever before to take matters into our own hands.</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/a26475b5-c169-478e-b88e-be5cd1f2aff8-1\">#</a> 1st Nov 2022 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>biodiversity</span> <span>climate</span> <span>evidence</span> <span>pembroke</span> <span>policy</span> <span>satellite</span></span></div>",+"content": "<p>I opened the 17th William Pitt Seminar at Pembroke College on climate change with a brief talk about the status of the world's biodiversity, and how we have more agency than ever before to take matters into our own hands.</p>\n<p></p><div></div><p></p>",
+18
avsm/news_a612e810-d56c-48af-b43e-2893a96b9120-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/a612e810-d56c-48af-b43e-2893a96b9120-1\">Unikernel Systems is now part of Docker</a> <span>/ Jan 2016</span></h2><p>Announced that Unikernel Systems is now part of Docker</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/a612e810-d56c-48af-b43e-2893a96b9120-1\">#</a> 21st Jan 2016 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>docker</span> <span>startups</span> <span>unikernels</span></span></div>",+"content": "<p>Announced that Unikernel Systems is now part of Docker</p>\n<p></p><div></div><p></p>",
+18
avsm/news_ad4658f5-ca4f-42f3-b61a-58f13dcdeb1a-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/ad4658f5-ca4f-42f3-b61a-58f13dcdeb1a-1\">Jitsu: Just-In-Time Summoning of Unikernels</a> <span>/ May 2015</span></h2><p>NSDI 2015 talk on Jitsu</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/ad4658f5-ca4f-42f3-b61a-58f13dcdeb1a-1\">#</a> 4th May 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>california</span> <span>distributed</span> <span>dns</span> <span>docker</span> <span>embedded</span> <span>irmin</span> <span>mirageos</span> <span>ocaml</span> <span>unikernels</span></span></div>",
+18
avsm/news_anil-phd-thesis-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/anil-phd-thesis-1\">Creating high-performance, statically type-safe network applications</a> <span>/ Mar 2010</span></h2><p>PhD thesis now available as a technical report</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (UCAM-CL-TR-775) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-775\">DOI</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/anil-phd-thesis-1\">#</a> 1st Mar 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>fp</span> <span>internet</span> <span>ocaml</span> <span>report</span> <span>security</span></span></div>",+"content": "<p>PhD thesis now available as a technical report</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (UCAM-CL-TR-775) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-775\">DOI</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_anil-phd-thesis-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/anil-phd-thesis-2\">Creating high-performance, statically type-safe network applications</a> <span>/ May 2010</span></h2><p>My PhD thesis is now also published as a print book</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (UCAM-CL-TR-775) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-775\">DOI</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/anil-phd-thesis-2\">#</a> 1st Mar 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>fp</span> <span>internet</span> <span>ocaml</span> <span>report</span> <span>security</span></span></div>",+"content": "<p>My PhD thesis is now also published as a print book</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (UCAM-CL-TR-775) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-775.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-775\">DOI</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/anil-phd-thesis.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_audio-networking-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/audio-networking-1\">Context-Aware Computing with Sound</a> <span>/ Oct 2003</span></h2><p>While working as an intern at Intel Research Cambridge, <a href=\"https://github.com/djs55\">Dave Scott</a> and <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a> and I put together a fun system based on the emerging new class of smartphones. The project kicked off when we randomly experimented with our fancy Nokia smartphones and discovered that they didn't have anti-aliasing filters on the microphones! We argued that</p>\n<blockquote>\n<p>[...] audio networking can be used as the basis for developing context-aware applications. Audio networking allows standard devices fitted with speakers and microphones (e.g. PDAs, laptops, desktop PCs and mobile phones) to exchange data and infer information about their environment. One of the key advantages of audio networking is that it enables context-aware applications to be immediately deployed on a large scale without requiring users to purchase and install additional hardware.</p>\n</blockquote>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/audio-networking-1\">178 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>.</p><p>Paper in the <a href=\"https://link.springer.com/chapter/10.1007/978-3-540-39653-6_25\">ubiComp 2003: Ubiquitous Computing</a>.</p><p><a href=\"https://link.springer.com/chapter/10.1007/978-3-540-39653-6_25\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-540-39653-6_25\">DOI</a> <a href=\"https://anil.recoil.org/papers/audio-networking.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/audio-networking.pdf\"><span>PDF<img alt=\"pdf\" 
src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/audio-networking-1\">#</a> 1st Oct 2003 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>audio</span> <span>conference</span> <span>hci</span> <span>mobile</span> <span>networking</span> <span>ubicomp</span></span></div>",+"content": "<p>While working as an intern at Intel Research Cambridge, <a href=\"https://github.com/djs55\">Dave Scott</a> and <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a> and I put together a fun system based on the emerging new class of smartphones. The project kicked off when we randomly experimented with our fancy Nokia smartphones and discovered that they didn't have anti-aliasing filters on the microphones! We argued that</p>\n<blockquote>\n<p>[...] audio networking can be used as the basis for developing context-aware applications. Audio networking allows standard devices fitted with speakers and microphones (e.g. PDAs, laptops, desktop PCs and mobile phones) to exchange data and infer information about their environment. 
One of the key advantages of audio networking is that it enables context-aware applications to be immediately deployed on a large scale without requiring users to purchase and install additional hardware.</p>\n</blockquote>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/audio-networking-1\">178 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>.</p><p>Paper in the <a href=\"https://link.springer.com/chapter/10.1007/978-3-540-39653-6_25\">ubiComp 2003: Ubiquitous Computing</a>.</p><p><a href=\"https://link.springer.com/chapter/10.1007/978-3-540-39653-6_25\">URL</a> <i>(link.springer.com)</i> <a href=\"https://doi.org/10.1007/978-3-540-39653-6_25\">DOI</a> <a href=\"https://anil.recoil.org/papers/audio-networking.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/audio-networking.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_b11188ba-0f97-4ec4-b372-fa3cea0821ab-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/b11188ba-0f97-4ec4-b372-fa3cea0821ab-1\">State of the OCaml Platform 2020</a> <span>/ Aug 2020</span></h2><p>Talk on the state of the OCaml Platform in 2020</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/b11188ba-0f97-4ec4-b372-fa3cea0821ab-1\">#</a> 28th Aug 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>devtools</span> <span>ocaml</span> <span>opensource</span></span></div>",
+18
avsm/news_bc9da6fc-9419-4f18-9db9-c13b1a4a859f-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/bc9da6fc-9419-4f18-9db9-c13b1a4a859f-1\">Financing Forests: A Credible Approach towards Halting Tropical Deforestation</a> <span>/ Nov 2022</span></h2><p>Wednesday seminar on financing forests using carbon credits</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/bc9da6fc-9419-4f18-9db9-c13b1a4a859f-1\">#</a> 16th Nov 2022 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>carboncredits</span> <span>economics</span> <span>forests</span> <span>satellite</span> <span>sensing</span></span></div>",+"content": "<p>Wednesday seminar on financing forests using carbon credits</p>\n<p></p><div></div><p></p>",
+18
avsm/news_be2f049b-174a-4e5b-b30e-0319793487c7-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/be2f049b-174a-4e5b-b30e-0319793487c7-1\">Mirage: A New Multi-Scale Operating System for Clouds and Crowds (2014)</a> <span>/ Oct 2010</span></h2><p>At LinkedIn giving tech talk about Mirage</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/be2f049b-174a-4e5b-b30e-0319793487c7-1\">#</a> 25th Oct 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>california</span> <span>mirageos</span> <span>systems</span> <span>unikernels</span></span></div>",
+18
avsm/news_c09ed36f-6ad5-4254-a0ce-3ca3398f38a3-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/c09ed36f-6ad5-4254-a0ce-3ca3398f38a3-1\">The First Billion Real Deployments of Unikernels</a> <span>/ Feb 2020</span></h2><p>Part 2</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/c09ed36f-6ad5-4254-a0ce-3ca3398f38a3-1\">#</a> 26th Feb 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>embedded</span> <span>keynote</span> <span>mirageos</span> <span>scotland</span> <span>systems</span> <span>unikernels</span></span></div>",
+18
avsm/news_c9273fa0-802f-4d2b-8f0d-db383943564e-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/c9273fa0-802f-4d2b-8f0d-db383943564e-1\">MirageOS 2.0: branch consistency for Xen Stub Domains</a> <span>/ Oct 2014</span></h2><p>At the Xen Summit speaking about branch consistency for Xen Stub Domains</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/c9273fa0-802f-4d2b-8f0d-db383943564e-1\">#</a> 17th Oct 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>mirageos</span> <span>unikernels</span> <span>xen</span></span></div>",+"content": "<p>At the Xen Summit speaking about branch consistency for Xen Stub Domains</p>\n<p></p><div></div><p></p>",
+18
avsm/news_ce64a918-ff52-4116-b1ee-256f08e6e7f1-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/ce64a918-ff52-4116-b1ee-256f08e6e7f1-1\">Leveraging Scientific Innovation and AI to Scale Carbon Markets</a> <span>/ Mar 2023</span></h2><p>Discussion with Mantle Labs about carbon credits</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/ce64a918-ff52-4116-b1ee-256f08e6e7f1-1\">#</a> 7th Mar 2023 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>carboncredits</span> <span>economics</span> <span>forests</span> <span>london</span></span></div>",+"content": "<p>Discussion with Mantle Labs about carbon credits</p>\n<p></p><div></div><p></p>",
+18
avsm/news_cf9fcf6b-de5d-4a23-a00d-cceadea5b668-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/cf9fcf6b-de5d-4a23-a00d-cceadea5b668-1\">MirageOS and XAPI project update at XenSummit</a> <span>/ Nov 2013</span></h2><p>MirageOS and XAPI project update at XenSummit</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/cf9fcf6b-de5d-4a23-a00d-cceadea5b668-1\">#</a> 13th Nov 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>cloud</span> <span>ocaml</span> <span>systems</span> <span>unikernels</span> <span>xen</span></span></div>",
+18
avsm/news_d456e4bc-bce6-45ad-9d2e-102f834ec400-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/d456e4bc-bce6-45ad-9d2e-102f834ec400-1\">Rebuilding Operating Systems with Functional Principles</a> <span>/ Feb 2020</span></h2><p>Delivered the distinguished seminar series at St Andrews on rebuilding Operating Systems with functional principles</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/d456e4bc-bce6-45ad-9d2e-102f834ec400-1\">#</a> 26th Feb 2020 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>embedded</span> <span>keynote</span> <span>mirageos</span> <span>scotland</span> <span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>Delivered the distinguished seminar series at St Andrews on rebuilding Operating Systems with functional principles</p>\n<p></p><div></div><p></p>",
+18
avsm/news_d5411e25-7845-41e8-b3ec-ab3c33ce13c8-1.json
···+"title": "SE Radio Episode 204: Anil Madhavapeddy on the Mirage Cloud Operating System and the OCaml Language",+"summary": "<h2><a href=\"https://anil.recoil.org/news/d5411e25-7845-41e8-b3ec-ab3c33ce13c8-1\">SE Radio Episode 204: Anil Madhavapeddy on the Mirage Cloud Operating System and the OCaml Language</a> <span>/ May 2014</span></h2><p>Appeared on SE Radio Episode 204 about Mirage and OCaml</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/d5411e25-7845-41e8-b3ec-ab3c33ce13c8-1\">#</a> 1st May 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>interview</span> <span>mirageos</span> <span>opensource</span> <span>podcast</span> <span>unikernels</span> <span>xen</span></span></div>",+"content": "<p>Appeared on SE Radio Episode 204 about Mirage and OCaml</p>\n<p></p><div></div><p></p>",
+18
avsm/news_d592bf17-c835-435f-9469-f0f65e926975-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/d592bf17-c835-435f-9469-f0f65e926975-1\">Programming for the Planet</a> <span>/ May 2024</span></h2><p>I was invited by Mary Sheeran to deliver a keynote at <a href=\"https://www.lambdadays.org/\">Lambda Days</a>, and I decided to go along to talk about my work on <a href=\"https://anil.recoil.org/videos/981c00b5-32c0-4cac-a387-6c945dfa9934\">Programming for the Planet</a>. The conference had a really vibrant crowd and I would definitely go along in future years. It's best summarised via an <a href=\"https://www.youtube.com/watch?v=Kao-LguvYDU&list=PLvL2NEhYV4ZtX2TurK0BIlKD_cHct0rSs\">interview video</a> they took of all the speakers.</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/d592bf17-c835-435f-9469-f0f65e926975-1\">#</a> 27th May 2024 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>biodiversity</span> <span>cloud</span> <span>distributed</span> <span>fp</span> <span>interview</span> <span>ocaml</span> <span>satellite</span> <span>satellites</span> <span>sensing</span> <span>sweden</span></span></div>",+"content": "<p>I was invited by Mary Sheeran to deliver a keynote at <a href=\"https://www.lambdadays.org/\">Lambda Days</a>, and I decided to go along to talk about my work on <a href=\"https://anil.recoil.org/videos/981c00b5-32c0-4cac-a387-6c945dfa9934\">Programming for the Planet</a>. The conference had a really vibrant crowd and I would definitely go along in future years. It's best summarised via an <a href=\"https://www.youtube.com/watch?v=Kao-LguvYDU&list=PLvL2NEhYV4ZtX2TurK0BIlKD_cHct0rSs\">interview video</a> they took of all the speakers.</p>\n<p></p><div></div><p></p>",
+18
avsm/news_d5fbd6a4-bef2-4fbc-9d02-cb9935e50d8e-1.json
+18
avsm/news_d5fbd6a4-bef2-4fbc-9d02-cb9935e50d8e-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/d5fbd6a4-bef2-4fbc-9d02-cb9935e50d8e-1\">Immutable Distributed Infrastructure with Unikernels</a> <span>/ Sep 2015</span></h2><p>Invited talk at NetPL on Immutable Distributed Infrastructure with Unikernels</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/d5fbd6a4-bef2-4fbc-9d02-cb9935e50d8e-1\">#</a> 29th Sep 2015 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>cloud</span> <span>distributed</span> <span>irmin</span> <span>mirageos</span> <span>storage</span></span></div>",+"content": "<p>Invited talk at NetPL on Immutable Distributed Infrastructure with Unikernels</p>\n<p></p><div></div><p></p>",
+18
avsm/news_dbd7546a-95d8-40af-b286-3cf930767682-1.json
+18
avsm/news_dbd7546a-95d8-40af-b286-3cf930767682-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/dbd7546a-95d8-40af-b286-3cf930767682-1\">The functional innards of Docker for Mac and Windows</a> <span>/ Jun 2016</span></h2><p>I gave a talk at the <a href=\"https://functional.works-hub.com\">Functional Works</a> meetup, held in <a href=\"https://janestreet.com\">Jane Street London</a> about how Docker for Mac and Windows use OCaml and unikernels <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">under the hood</a>.</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/dbd7546a-95d8-40af-b286-3cf930767682-1\">#</a> 24th Jun 2016 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>docker</span> <span>janestreet</span> <span>mirageos</span> <span>ocaml</span> <span>systems</span> <span>unikernels</span> <span>xen</span></span></div>",+"content": "<p>I gave a talk at the <a href=\"https://functional.works-hub.com\">Functional Works</a> meetup, held in <a href=\"https://janestreet.com\">Jane Street London</a> about how Docker for Mac and Windows use OCaml and unikernels <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">under the hood</a>.</p>\n<p></p><div></div><p></p>",
+18
avsm/news_dd8b1f58-c43c-4422-9963-d3a980529e57-1.json
+18
avsm/news_dd8b1f58-c43c-4422-9963-d3a980529e57-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/dd8b1f58-c43c-4422-9963-d3a980529e57-1\">OUD 2012: Towards an OCaml Platform and Introducing OCaml Labs</a> <span>/ Sep 2012</span></h2><p>Recording of the OCaml Labs announcement</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/dd8b1f58-c43c-4422-9963-d3a980529e57-1\">#</a> 17th Sep 2012 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>devtools</span> <span>funding</span> <span>ocaml</span></span></div>",
+18
avsm/news_de10-perscon-1.json
+18
avsm/news_de10-perscon-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/de10-perscon-1\">The personal container, or your life in bits</a> <span>/ Oct 2010</span></h2><p>Paper on personal containers for data management at the UK Digital Economy meeting</p>\n<blockquote><div><p><a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://www.nottingham.ac.uk/computerscience/people/chris.greenhalgh\"><span>Chris Greenhalgh</span></a>, <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>, <a href=\"https://www.nottingham.ac.uk/psychology/people/alexa.spence\"><span>Alexa Spence</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>.</p><p>Journal paper in <a href=\"http://mort.io/publications/pdf/de10-perscon.pdf\">Digital Futures</a> (vol 10).</p><p><a href=\"http://mort.io/publications/pdf/de10-perscon.pdf\">URL</a> <i>(mort.io)</i> <a href=\"https://anil.recoil.org/papers/de10-perscon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/de10-perscon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/de10-perscon-1\">#</a> 1st Oct 2010 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>hci</span> <span>journal</span> <span>privacy</span> <span>selfhosting</span></span></div>",+"content": "<p>Paper on personal containers for data management at the UK Digital Economy meeting</p>\n<blockquote><div><p><a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <a href=\"https://www.nottingham.ac.uk/computerscience/people/chris.greenhalgh\"><span>Chris Greenhalgh</span></a>, <a href=\"https://drdrmc.github.io/about/\"><span>Derek McAuley</span></a>, <a 
href=\"https://www.nottingham.ac.uk/psychology/people/alexa.spence\"><span>Alexa Spence</span></a>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>.</p><p>Journal paper in <a href=\"http://mort.io/publications/pdf/de10-perscon.pdf\">Digital Futures</a> (vol 10).</p><p><a href=\"http://mort.io/publications/pdf/de10-perscon.pdf\">URL</a> <i>(mort.io)</i> <a href=\"https://anil.recoil.org/papers/de10-perscon.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/de10-perscon.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_de13-dataware-1.json
+18
avsm/news_de13-dataware-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/de13-dataware-1\">Perceived risks of personal data sharing</a> <span>/ Feb 2013</span></h2><p>Paper on dataware computing in the digital economy</p>\n<blockquote><div><p><span><span>Anya Skatova</span></span>, <span><span>Jaspreet Johal</span></span>, <span><span>Robert Houghton</span></span>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <span><span>Neelam Bhandari</span></span>, <a href=\"https://www.tomlodge.info/cv\"><span>Tom Lodge</span></a>, <span><span>Christian Wagner</span></span>, <a href=\"https://www.nottingham.ac.uk/business/people/psxjog.phtml\"><span>James Goulding</span></a>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in Digital Economy: Open Digital.</p><p><a href=\"https://anil.recoil.org/papers/de13-dataware.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/de13-dataware.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/de13-dataware-1\">#</a> 1st Feb 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>databox</span> <span>hci</span> <span>journal</span> <span>privacy</span></span></div>",+"content": "<p>Paper on dataware computing in the digital economy</p>\n<blockquote><div><p><span><span>Anya Skatova</span></span>, <span><span>Jaspreet Johal</span></span>, <span><span>Robert Houghton</span></span>, <a href=\"https://github.com/mor1\"><span>Richard Mortier</span></a>, <span><span>Neelam Bhandari</span></span>, <a href=\"https://www.tomlodge.info/cv\"><span>Tom Lodge</span></a>, <span><span>Christian Wagner</span></span>, <a href=\"https://www.nottingham.ac.uk/business/people/psxjog.phtml\"><span>James Goulding</span></a>, <a 
href=\"mailto:jon.crowcroft@cl.cam.ac.uk\"><span>Jon Crowcroft</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Journal paper in Digital Economy: Open Digital.</p><p><a href=\"https://anil.recoil.org/papers/de13-dataware.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/de13-dataware.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_ed84b2eb-1b93-4dc3-b746-63a4af13d4ea-1.json
+18
avsm/news_ed84b2eb-1b93-4dc3-b746-63a4af13d4ea-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/ed84b2eb-1b93-4dc3-b746-63a4af13d4ea-1\">Haskell Symposium 2014 Keynote on functional OS design</a> <span>/ Sep 2014</span></h2><p>Gave Haskell Symposium 2014 Keynote on functional OS design</p>\n<p></p><div></div><p></p>\n<div><a href=\"https://anil.recoil.org/news/ed84b2eb-1b93-4dc3-b746-63a4af13d4ea-1\">#</a> 5th Sep 2014 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">talks</span> <span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/video.svg\">videos</span> <span>fp</span> <span>haskell</span> <span>icfp</span> <span>keynote</span> <span>mirageos</span> <span>ocaml</span></span></div>",+"content": "<p>Gave Haskell Symposium 2014 Keynote on functional OS design</p>\n<p></p><div></div><p></p>",
+18
avsm/news_netapp-tr-3071-1.json
+18
avsm/news_netapp-tr-3071-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/netapp-tr-3071-1\">Paper on the NASA Mars Polar Lander website architecture</a> <span>/ Jul 2000</span></h2><p>Although the Mars Polar Lander ended up <a href=\"https://en.wikipedia.org/wiki/Mars_Polar_Lander#See_also\">crashing</a>, the website itself was one of the busiest websites in the world at the time during the approach to the landing. I was the person handling the website architecture and the amazing <code>webmaster@mars.nasa.gov</code> account at the time. I worked closely <a href=\"https://anil.recoil.org/notes/mars-polar-lander\">with Sun</a> and NetApp and wrote up a technical report on how the Mars Polar Lander website acceleration architecture worked.</p>\n<blockquote><div><p><a href=\"https://www.linkedin.com/in/ndoherty\"><span>Niall Doherty</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (TR-3071) at <a href=\"http://tr.netapp.link/tr-3071.pdf\">NetApp</a>.</p><p><a href=\"http://tr.netapp.link/tr-3071.pdf\">URL</a> <i>(tr.netapp.link)</i> <a href=\"https://anil.recoil.org/papers/netapp-tr-3071.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/netapp-tr-3071.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/netapp-tr-3071-1\">#</a> 1st Jul 2000 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>distributed</span> <span>internet</span> <span>mars</span> <span>nasa</span> <span>netapp</span> <span>networks</span> <span>report</span> <span>space</span> <span>web</span></span></div>",+"content": "<p>Although the Mars Polar Lander ended up <a href=\"https://en.wikipedia.org/wiki/Mars_Polar_Lander#See_also\">crashing</a>, the website itself was one of the busiest websites in the world at the time during the approach to the landing. 
I was the person handling the website architecture and the amazing <code>webmaster@mars.nasa.gov</code> account at the time. I worked closely <a href=\"https://anil.recoil.org/notes/mars-polar-lander\">with Sun</a> and NetApp and wrote up a technical report on how the Mars Polar Lander website acceleration architecture worked.</p>\n<blockquote><div><p><a href=\"https://www.linkedin.com/in/ndoherty\"><span>Niall Doherty</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Technical report (TR-3071) at <a href=\"http://tr.netapp.link/tr-3071.pdf\">NetApp</a>.</p><p><a href=\"http://tr.netapp.link/tr-3071.pdf\">URL</a> <i>(tr.netapp.link)</i> <a href=\"https://anil.recoil.org/papers/netapp-tr-3071.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/netapp-tr-3071.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_netapp-tr-3152-1.json
+18
avsm/news_netapp-tr-3152-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/netapp-tr-3152-1\">Streaming U2 live across the Internet</a> <span>/ Apr 2002</span></h2><p>After the <a href=\"https://anil.recoil.org/notes/mars-polar-lander\">Mars Polar Lander crashed</a>, I took a job at NetApp working as the\nproduct architect for <a href=\"https://en.wikipedia.org/wiki/NetCache\">NetCache</a>. Among the hundreds\nof deployments that I helped set up across the world, the most fun was figuring out how to scale\na live concert stream from one of the biggest bands in the world at the time to a\nglobal audience.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/netapp-tr-3152-1\">158 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.linkedin.com/in/alberto-crivelli-459209\"><span>Alberto Crivelli</span></a>.</p><p>Technical report (TR-3152) at <a href=\"http://tr.netapp.link/tr-3152.pdf\">NetApp</a>.</p><p><a href=\"http://tr.netapp.link/tr-3152.pdf\">URL</a> <i>(tr.netapp.link)</i> <a href=\"https://anil.recoil.org/papers/netapp-tr-3152.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/netapp-tr-3152.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/netapp-tr-3152-1\">#</a> 1st Apr 2002 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>caching</span> <span>distributed</span> <span>internet</span> <span>italy</span> <span>netapp</span> <span>networks</span> <span>report</span> <span>streaming</span></span></div>",+"content": "<p>After the <a href=\"https://anil.recoil.org/notes/mars-polar-lander\">Mars Polar Lander crashed</a>, I took a job at NetApp working as the\nproduct architect for <a href=\"https://en.wikipedia.org/wiki/NetCache\">NetCache</a>. 
Among the hundreds\nof deployments that I helped set up across the world, the most fun was figuring out how to scale\na live concert stream from one of the biggest bands in the world at the time to a\nglobal audience.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/netapp-tr-3152-1\">158 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://www.linkedin.com/in/alberto-crivelli-459209\"><span>Alberto Crivelli</span></a>.</p><p>Technical report (TR-3152) at <a href=\"http://tr.netapp.link/tr-3152.pdf\">NetApp</a>.</p><p><a href=\"http://tr.netapp.link/tr-3152.pdf\">URL</a> <i>(tr.netapp.link)</i> <a href=\"https://anil.recoil.org/papers/netapp-tr-3152.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/netapp-tr-3152.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_netgames04-ctf-1.json
+18
avsm/news_netgames04-ctf-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/netgames04-ctf-1\">Exploring tradeoffs in location-aware gaming using smartphones</a> <span>/ Aug 2004</span></h2><p>The summer of 2004 was sufficiently full of procrastination that the members\nof the <a href=\"https://web.archive.org/web/20041212123550/http://sn17.org/\">SN17 collective</a> in the\nComputer Lab decided to build a computer game. But it wasn't enough to just play\nthe game on our phones -- instead, we combined all the public displays in the corridors,\nand then added in a cutting-edge 5cm-accurate <a href=\"https://en.wikipedia.org/wiki/Active_Bat\">ActiveBAT</a>,\nand built a Symbian-based Capture The Flag game where we all had to run around and\ntag each other physically while tracking the flag virtually.</p>\n<p>Was it mad? Yes. Was it fun? Yes. Did it get us a paper into the SIGCOMM NetGames\nworkshop? Yes!</p>\n<blockquote>\n<p>Our novel contributions include: (i) creating a fast-paced, close quarters, location-aware game, (ii) exploring the tradeoffs between the accuracy of a location system, the I/O capabilities of current mobile hardware, and the latency of user feedback, and (iii) investigating the viability of Bluetooth as a component in a low-latency location-aware gaming infrastructure.</p>\n</blockquote>\n<blockquote><div><p><a href=\"mailto:kieran@recoil.org\"><span>Kieran Mansley</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"http://portal.acm.org/citation.cfm?doid=1016540.1016544\">proceedings of ACM SIGCOMM 2004 workshops on NetGames '04 Network and system support for games - SIGCOMM 2004 Workshops</a>.</p><p><a href=\"http://portal.acm.org/citation.cfm?doid=1016540.1016544\">URL</a> <i>(portal.acm.org)</i> <a href=\"https://doi.org/10.1145/1016540.1016544\">DOI</a> <a href=\"https://anil.recoil.org/papers/netgames04-ctf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/netgames04-ctf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/netgames04-ctf-1\">#</a> 1st Aug 2004 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>bluetooth</span> <span>conference</span> <span>games</span> <span>hci</span> <span>mobile</span> <span>networking</span> <span>spatial</span> <span>ubicomp</span> <span>vr</span></span></div>",+"content": "<p>The summer of 2004 was sufficiently full of procrastination that the members\nof the <a href=\"https://web.archive.org/web/20041212123550/http://sn17.org/\">SN17 collective</a> in the\nComputer Lab decided to build a computer game. But it wasn't enough to just play\nthe game on our phones -- instead, we combined all the public displays in the corridors,\nand then added in a cutting-edge 5cm-accurate <a href=\"https://en.wikipedia.org/wiki/Active_Bat\">ActiveBAT</a>,\nand built a Symbian-based Capture The Flag game where we all had to run around and\ntag each other physically while tracking the flag virtually.</p>\n<p>Was it mad? Yes. Was it fun? Yes. Did it get us a paper into the SIGCOMM NetGames\nworkshop? Yes!</p>\n<blockquote>\n<p>Our novel contributions include: (i) creating a fast-paced, close quarters, location-aware game, (ii) exploring the tradeoffs between the accuracy of a location system, the I/O capabilities of current mobile hardware, and the latency of user feedback, and (iii) investigating the viability of Bluetooth as a component in a low-latency location-aware gaming infrastructure.</p>\n</blockquote>\n<blockquote><div><p><a href=\"mailto:kieran@recoil.org\"><span>Kieran Mansley</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, <a href=\"https://liquidx.net\"><span>Alastair Tse</span></a>, and <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>.</p><p>Paper in the <a href=\"http://portal.acm.org/citation.cfm?doid=1016540.1016544\">proceedings of ACM SIGCOMM 2004 workshops on NetGames '04 Network and system support for games - SIGCOMM 2004 Workshops</a>.</p><p><a href=\"http://portal.acm.org/citation.cfm?doid=1016540.1016544\">URL</a> <i>(portal.acm.org)</i> <a href=\"https://doi.org/10.1145/1016540.1016544\">DOI</a> <a href=\"https://anil.recoil.org/papers/netgames04-ctf.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/netgames04-ctf.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_rise-of-libos-1.json
+18
avsm/news_rise-of-libos-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/rise-of-libos-1\">Unikernels: Rise of the Virtual Library Operating System</a> <span>/ Nov 2013</span></h2><p>Article in the Communications of the ACM on unikernels is published</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/2557963.2566628\">ACM Queue</a> (vol 11 issue 11).</p><p><a href=\"https://dl.acm.org/doi/10.1145/2557963.2566628\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2557963.2566628\">DOI</a> <a href=\"https://anil.recoil.org/papers/rise-of-libos.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/rise-of-libos-1\">#</a> 1st Nov 2013 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cloud</span> <span>distributed</span> <span>fp</span> <span>journal</span> <span>systems</span> <span>unikernels</span></span></div>",+"content": "<p>Article in the Communications of the ACM on unikernels is published</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>.</p><p>Journal paper in <a href=\"https://dl.acm.org/doi/10.1145/2557963.2566628\">ACM Queue</a> (vol 11 issue 11).</p><p><a href=\"https://dl.acm.org/doi/10.1145/2557963.2566628\">URL</a> <i>(dl.acm.org)</i> <a href=\"https://doi.org/10.1145/2557963.2566628\">DOI</a> <a href=\"https://anil.recoil.org/papers/rise-of-libos.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_rwo-1.json
+18
avsm/news_rwo-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/rwo-1\">First edition of Real World OCaml published</a> <span>/ Nov 2013</span></h2><p>The 1st Edition of Real World OCaml by O'Reilly Associates has been released! There has been a flurry of signing events, including an upcoming one at OSCON in Austin.</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/yminsky\"><span>Yaron Minsky</span></a>.</p><p>Book published by <a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">Cambridge University Press</a>.</p><p><a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/9781009129220\">DOI</a> <a href=\"https://anil.recoil.org/papers/rwo.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/rwo-1\">#</a> 1st Oct 2022 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>book</span> <span>fp</span> <span>ocaml</span> <span>writing</span></span></div>",+"content": "<p>The 1st Edition of Real World OCaml by O'Reilly Associates has been released! 
There has been a flurry of signing events, including an upcoming one at OSCON in Austin.</p>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/yminsky\"><span>Yaron Minsky</span></a>.</p><p>Book published by <a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">Cambridge University Press</a>.</p><p><a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/9781009129220\">DOI</a> <a href=\"https://anil.recoil.org/papers/rwo.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_rwo-2.json
+18
avsm/news_rwo-2.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/rwo-2\">The 2nd ed of Real World OCaml is available in shops</a> <span>/ Oct 2022</span></h2><p>I'm delighted to report that the second edition of <a href=\"https://realworldocaml.org\">Real World OCaml</a> is now available from Cambridge University Press! It's also freely available <a href=\"https://realworldocaml.org\">online</a>, and CUP also kindly agreed that the PDF version could be freely available online thanks to sponsorship from <a href=\"https://tarides.com\">Tarides</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/rwo-2\">105 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/yminsky\"><span>Yaron Minsky</span></a>.</p><p>Book published by <a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">Cambridge University Press</a>.</p><p><a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/9781009129220\">DOI</a> <a href=\"https://anil.recoil.org/papers/rwo.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/rwo-2\">#</a> 1st Oct 2022 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>book</span> <span>cambridge</span> <span>fp</span> <span>ocaml</span></span></div>",+"content": "<p>I'm delighted to report that the second edition of <a href=\"https://realworldocaml.org\">Real World OCaml</a> is now available from Cambridge University Press! 
It's also freely available <a href=\"https://realworldocaml.org\">online</a>, and CUP also kindly agreed that the PDF version could be freely available online thanks to sponsorship from <a href=\"https://tarides.com\">Tarides</a>.</p>\n<span>[\u2026<a href=\"https://anil.recoil.org/news/rwo-2\">105 words</a>]</span><blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, and <a href=\"https://github.com/yminsky\"><span>Yaron Minsky</span></a>.</p><p>Book published by <a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">Cambridge University Press</a>.</p><p><a href=\"https://www.cambridge.org/core/books/real-world-ocaml-functional-programming-for-the-masses/052E4BCCB09D56A0FE875DD81B1ED571\">URL</a> <i>(cambridge.org)</i> <a href=\"https://doi.org/10.1017/9781009129220\">DOI</a> <a href=\"https://anil.recoil.org/papers/rwo.bib\">BIB</a></p></div></blockquote>",
+18
avsm/news_sam03-secpol-1.json
+18
avsm/news_sam03-secpol-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/sam03-secpol-1\">The Case for Abstracting Security Policies</a> <span>/ Jun 2003</span></h2><p>My first ever academic paper, written with the expert guidance of <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and my PhD colleagues <a href=\"https://github.com/djs55\">Dave Scott</a> and <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>! We worked on a system call policy language to help constrain application access to privileged resources, and implemented this on OpenBSD using <a href=\"https://man.openbsd.org/OpenBSD-5.1/systrace.1\">systrace</a>. The paper describing the declarative language was presented at SAM 2003 in Las Vegas.</p>\n<blockquote>\n<p>"Untrusted code" is just as much a social problem as it\nis a technical problem. Looking for a complete solution\nis unrealistic: it is analogous to looking for a solution to\ncrime in general. With this in mind, we do not claim that\nour proposed framework is a panacea. However, although\na number of security problems remain (e.g. 
covert channel\nleakage), we claim that our system offers the potential to\nraise the security level of existing general purpose operating systems significantly.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.cl.cam.ac.uk/~am21/\"><span>Alan Mycroft</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>.</p><p>Paper in the <a href=\"https://www.cl.cam.ac.uk/~am21/papers/sam03.pdf\">proceedings of the International Conference on Security and Management, SAM 03, June 23 - 26, 2003, Las Vegas, Nevada, USA, Volume 1</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/~am21/papers/sam03.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/sam03-secpol.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/sam03-secpol.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/sam03-secpol-1\">#</a> 1st Jun 2003 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>conference</span> <span>dsl</span> <span>kernel</span> <span>openbsd</span> <span>security</span> <span>systems</span></span></div>",+"content": "<p>My first ever academic paper, written with the expert guidance of <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and my PhD colleagues <a href=\"https://github.com/djs55\">Dave Scott</a> and <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>! We worked on a system call policy language to help constrain application access to privileged resources, and implemented this on OpenBSD using <a href=\"https://man.openbsd.org/OpenBSD-5.1/systrace.1\">systrace</a>. 
The paper describing the declarative language was presented at SAM 2003 in Las Vegas.</p>\n<blockquote>\n<p>"Untrusted code" is just as much a social problem as it\nis a technical problem. Looking for a complete solution\nis unrealistic: it is analogous to looking for a solution to\ncrime in general. With this in mind, we do not claim that\nour proposed framework is a panacea. However, although\na number of security problems remain (e.g. covert channel\nleakage), we claim that our system offers the potential to\nraise the security level of existing general purpose operating systems significantly.</p>\n</blockquote>\n<blockquote><div><p><a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <a href=\"https://www.cl.cam.ac.uk/~am21/\"><span>Alan Mycroft</span></a>, <a href=\"https://github.com/djs55\"><span>Dave Scott</span></a>, and <a href=\"mailto:richard.sharp@gmail.com\"><span>Richard Sharp</span></a>.</p><p>Paper in the <a href=\"https://www.cl.cam.ac.uk/~am21/papers/sam03.pdf\">proceedings of the International Conference on Security and Management, SAM 03, June 23 - 26, 2003, Las Vegas, Nevada, USA, Volume 1</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/~am21/papers/sam03.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://anil.recoil.org/papers/sam03-secpol.bib\">BIB</a> <a href=\"https://anil.recoil.org/papers/sam03-secpol.pdf\"><span>PDF<img alt=\"pdf\" src=\"https://anil.recoil.org/assets/pdf.svg\"></span></a></p></div></blockquote>",
+18
avsm/news_xen02-1.json
+18
avsm/news_xen02-1.json
···+"summary": "<h2><a href=\"https://anil.recoil.org/news/xen02-1\">Xen 2002</a> <span>/ Jan 2003</span></h2><p>The first technical report on the <a href=\"https://anil.recoil.org/projects/xen\">Xen Hypervisor</a> is now available. I mainly contributed to the early NetBSD port (but have run into a snag with the lack of linear page tables in our paravirtual page implementation).</p>\n<blockquote><div><p><span><span>Paul R. Barham</span></span>, <span><span>Boris Dragovic</span></span>, <span><span>Keir Fraser</span></span>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <a href=\"https://timharris.uk/\"><span>Tim Harris</span></a>, <a href=\"https://hoiho.net\"><span>Alex Ho</span></a>, <span><span>Evangelos Kotsovinos</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Rolf Neugebauer</span></span>, <span><span>Ian Pratt</span></span>, and <span><span>Andrew Warfield</span></span>.</p><p>Technical report (UCAM-CL-TR-553) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-553.pdf\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-553.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-553\">DOI</a> <a href=\"https://anil.recoil.org/papers/xen02.bib\">BIB</a></p></div></blockquote><div><a href=\"https://anil.recoil.org/news/xen02-1\">#</a> 1st Jan 2003 <span><span><img alt=\"icon\" src=\"https://anil.recoil.org/assets/paper.svg\">papers</span> <span>cambridge</span> <span>cloud</span> <span>computerlab</span> <span>report</span> <span>security</span> <span>xen</span></span></div>",+"content": "<p>The first technical report on the <a href=\"https://anil.recoil.org/projects/xen\">Xen Hypervisor</a> is now available. 
I mainly contributed to the early NetBSD port (but have run into a snag with the lack of linear page tables in our paravirtual page implementation).</p>\n<blockquote><div><p><span><span>Paul R. Barham</span></span>, <span><span>Boris Dragovic</span></span>, <span><span>Keir Fraser</span></span>, <a href=\"https://research.google/people/steven-hand/\"><span>Steven Hand</span></a>, <a href=\"https://timharris.uk/\"><span>Tim Harris</span></a>, <a href=\"https://hoiho.net\"><span>Alex Ho</span></a>, <span><span>Evangelos Kotsovinos</span></span>, <a href=\"https://anil.recoil.org\"><span>Anil Madhavapeddy</span></a>, <span><span>Rolf Neugebauer</span></span>, <span><span>Ian Pratt</span></span>, and <span><span>Andrew Warfield</span></span>.</p><p>Technical report (UCAM-CL-TR-553) at <a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-553.pdf\">University of Cambridge, Computer Laboratory</a>.</p><p><a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-553.pdf\">URL</a> <i>(cl.cam.ac.uk)</i> <a href=\"https://doi.org/10.48456/tr-553\">DOI</a> <a href=\"https://anil.recoil.org/papers/xen02.bib\">BIB</a></p></div></blockquote>",
+18
avsm/notes_4c-launch.json
···+"summary": "<p>I launched <a href=\"https://anil.recoil.org/projects/4c\">4C</a> recently, and Pembroke College covers the launch with an interview with me.</p>\n<blockquote>\n<p>The world is facing a large-scale environmental crisis. Two parallel and related strands of this are, first, the crisis in biodiversity and the rapid extinction of many species, recently addressed at the COP15 UN Biodiversity Conference in October, and second, the threat of climate change, the topic of last month\u2019s COP26 summit in Glasgow. Pressure is growing on governments to execute nature-based solutions which will offset some of the most damaging impacts of these crises. While COP26 built some momentum, there is still a long way to go to turn promises into lasting change. More engagement with the private sector is urgently needed.</p>\n<p>The solution to the crisis is two-pronged: we must engage in behaviour change to reduce unnecessary harmful emissions, and also invest in nature-based solutions at global scales to not only reduce, but ultimately reverse the effects of climate change and biodiversity loss.\n-- <a href=\"https://www.pem.cam.ac.uk/college/corporate-partnership/25th-anniversary-corporate-partnership-programme/25th-anniversary-11\">Pembroke College</a></p>\n</blockquote>",+"content": "<p>I launched <a href=\"https://anil.recoil.org/projects/4c\">4C</a> recently, and Pembroke College covers the launch with an interview with me.</p>\n<blockquote>\n<p>The world is facing a large-scale environmental crisis. Two parallel and related strands of this are, first, the crisis in biodiversity and the rapid extinction of many species, recently addressed at the COP15 UN Biodiversity Conference in October, and second, the threat of climate change, the topic of last month\u2019s COP26 summit in Glasgow. Pressure is growing on governments to execute nature-based solutions which will offset some of the most damaging impacts of these crises. 
While COP26 built some momentum, there is still a long way to go to turn promises into lasting change. More engagement with the private sector is urgently needed.</p>\n<p>The solution to the crisis is two-pronged: we must engage in behaviour change to reduce unnecessary harmful emissions, and also invest in nature-based solutions at global scales to not only reduce, but ultimately reverse the effects of climate change and biodiversity loss.\n-- <a href=\"https://www.pem.cam.ac.uk/college/corporate-partnership/25th-anniversary-corporate-partnership-programme/25th-anniversary-11\">Pembroke College</a></p>\n</blockquote>",
+18
avsm/notes_acm-sigplan-award.json
···+"summary": "<p>I was honoured to be included in the OCaml team that won the <a href=\"https://www.cst.cam.ac.uk/news/acm-programming-languages-software-award-goes-ocaml-researchers\">ACM Programming Languages Software Award for 2023</a>.</p>\n<blockquote>\n<p>The Association for Computing Machinery (ACM), the world's largest association of computing professionals, today gave the 2023 SIGPLAN Award to a group of developers for their work on the functional programming language OCaml.</p>\n<p>The award was presented at the annual SIGPLAN Programming Language Design and Implementation Conference to a group of researchers and developers including our colleague Anil Madhavapeddy, Professor of Planetary Computing here.</p>\n<p>The prestigious Programming Languages Software Award is given annually "to an institution or individual(s) to recognise the development of a software system that has had a significant impact on programming language research, implementations, and tools," ACM says.</p>\n<p>-- <a href=\"https://www.cst.cam.ac.uk/news/acm-programming-languages-software-award-goes-ocaml-researchers\">Computer Laboratory</a></p>\n</blockquote>\n<p>See also the main <a href=\"https://www.sigplan.org/Awards/Software/\">ACM Award Page</a> citation:</p>\n<blockquote>\n<p>The OCaml Compiler Distribution is the reference implementation of the OCaml language, a dialect of ML that aims to be pragmatic, both in language features and implementation, encouraging a simple programming style that yields good performance and usability. 
It has a large user base in industry, research, and education throughout the world, and was used to implement a number of other impactful systems, notably in verification: Coq proof assistant, CompCert verified compiler, Why3 verified programming environment, Frama-C, Astr\u00e9e and Gillian static analyzers, Infer, Hack and Flow projects at Meta, SLAM/SDV and F* at Microsoft, etc.\n-- <a href=\"https://www.sigplan.org/Awards/Software/\">ACM SIGPLAN</a></p>\n</blockquote>",+"content": "<p>I was honoured to be included in the OCaml team that won the <a href=\"https://www.cst.cam.ac.uk/news/acm-programming-languages-software-award-goes-ocaml-researchers\">ACM Programming Languages Software Award for 2023</a>.</p>\n<blockquote>\n<p>The Association for Computing Machinery (ACM), the world's largest association of computing professionals, today gave the 2023 SIGPLAN Award to a group of developers for their work on the functional programming language OCaml.</p>\n<p>The award was presented at the annual SIGPLAN Programming Language Design and Implementation Conference to a group of researchers and developers including our colleague Anil Madhavapeddy, Professor of Planetary Computing here.</p>\n<p>The prestigious Programming Languages Software Award is given annually "to an institution or individual(s) to recognise the development of a software system that has had a significant impact on programming language research, implementations, and tools," ACM says.</p>\n<p>-- <a href=\"https://www.cst.cam.ac.uk/news/acm-programming-languages-software-award-goes-ocaml-researchers\">Computer Laboratory</a></p>\n</blockquote>\n<p>See also the main <a href=\"https://www.sigplan.org/Awards/Software/\">ACM Award Page</a> citation:</p>\n<blockquote>\n<p>The OCaml Compiler Distribution is the reference implementation of the OCaml language, a dialect of ML that aims to be pragmatic, both in language features and implementation, encouraging a simple programming style that yields good 
performance and usability. It has a large user base in industry, research, and education throughout the world, and was used to implement a number of other impactful systems, notably in verification: Coq proof assistant, CompCert verified compiler, Why3 verified programming environment, Frama-C, Astr\u00e9e and Gillian static analyzers, Infer, Hack and Flow projects at Meta, SLAM/SDV and F* at Microsoft, etc.\n-- <a href=\"https://www.sigplan.org/Awards/Software/\">ACM SIGPLAN</a></p>\n</blockquote>",
+18
avsm/notes_ai-contamination-of-papers.json
···+"summary": "<p><a href=\"https://toao.com\">Sadiq Jaffer</a> sent along this <a href=\"https://theconversation.com/fake-papers-are-contaminating-the-worlds-scientific-literature-fueling-a-corrupt-industry-and-slowing-legitimate-lifesaving-medical-research-246224\">piece in The Conversation</a> last week about the remarkable number of academic papers that are now AI generated. The numbers of these papers are probably underestimated:</p>\n<blockquote>\n<p>These papers are absorbed into the worldwide library of research faster than they can be weeded out. About 119,000 scholarly journal articles and conference papers are published globally every week, or more than 6 million a year. Publishers estimate that, at most journals, about 2% of the papers submitted \u2013 but not necessarily published \u2013 are likely fake, although this number can be much higher at some publications.\n-- Frederik Joelving et al, <a href=\"https://theconversation.com/fake-papers-are-contaminating-the-worlds-scientific-literature-fueling-a-corrupt-industry-and-slowing-legitimate-lifesaving-medical-research-246224\">The Conversation</a></p>\n</blockquote>\n<p>What caught my eye in this article is their development of the <a href=\"https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24495\">Problematic Paper Screener</a>, which the good folks at <a href=\"https://en.wikipedia.org/wiki/Retraction_Watch\">Retraction Watch</a> developed. It works with high precision to detect papers issued by grammar-based generators. They noted in <a href=\"https://theconversation.com/problematic-paper-screener-trawling-for-fraud-in-the-scientific-literature-246317\">another article</a> that over 764,000 articles cited papers that could be unreliable, further illustrating the creeping unreliability. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> and I are planning to run this over our <a href=\"https://anil.recoil.org/projects/ce\">growing paper corpus</a>, but I can't find the source code to their system, just <a href=\"https://dbrech.irit.fr/pls/apex/f?p=9999:1::::::\">the hosted version</a>.</p>\n<p>Meanwhile, datasets are also under similar threat of causing <a href=\"https://www.nature.com/articles/s41586-024-07566-y\">recursive model collapse</a>. The <a href=\"https://github.com/rspeer/wordfreq\">Wordfreq</a> team announced in September 2024 that they would <a href=\"https://github.com/rspeer/wordfreq/blob/master/SUNSET.md\">discontinue</a> updating their corpus because generative AI has polluted the data, and information that used to be free has become expensive. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> also noted the related problem of dataset versioning becoming unreliable across science in "<a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs\">Uncertainty at scale: how CS hinders climate research</a>", but for different reasons -- large datasets are inherently difficult to version and reproduce (it's quite hard to share a terabyte of data over the Internet easily, even in this day and age).</p>\n<p>Another big development this week was the release of <a href=\"https://openai.com/index/introducing-deep-research/\">OpenAI's Deep Research</a> feature, which goes off and really mines a literature corpus for information. I've grudgingly upgraded to their expensive <a href=\"https://openai.com/index/introducing-chatgpt-pro/\">Pro</a> to try this out and will report my findings in a future post. The ability to generate papers has moved well beyond just the grammar generators that the Problematic Paper Screener can filter out, so this arms race is unlikely to end well if we're pinning our hopes on detecting AI-generated papers. 
The current publish-or-perish model has already died; at least our Cambridge <a href=\"https://www.acp.hr.admin.cam.ac.uk/acp-overview/acp-key-principles\">promotion process</a> is more enlightened than "just" looking at paper counts!</p>",+"content": "<p><a href=\"https://toao.com\">Sadiq Jaffer</a> sent along this <a href=\"https://theconversation.com/fake-papers-are-contaminating-the-worlds-scientific-literature-fueling-a-corrupt-industry-and-slowing-legitimate-lifesaving-medical-research-246224\">piece in The Conversation</a> last week about the remarkable number of academic papers that are now AI generated. The numbers of these papers are probably underestimated:</p>\n<blockquote>\n<p>These papers are absorbed into the worldwide library of research faster than they can be weeded out. About 119,000 scholarly journal articles and conference papers are published globally every week, or more than 6 million a year. Publishers estimate that, at most journals, about 2% of the papers submitted \u2013 but not necessarily published \u2013 are likely fake, although this number can be much higher at some publications.\n-- Frederik Joelving et al, <a href=\"https://theconversation.com/fake-papers-are-contaminating-the-worlds-scientific-literature-fueling-a-corrupt-industry-and-slowing-legitimate-lifesaving-medical-research-246224\">The Conversation</a></p>\n</blockquote>\n<p>What caught my eye in this article is their development of the <a href=\"https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24495\">Problematic Paper Screener</a>, which the good folks at <a href=\"https://en.wikipedia.org/wiki/Retraction_Watch\">Retraction Watch</a> developed. It works with high precision to detect papers issued by grammar-based generators. 
They noted in <a href=\"https://theconversation.com/problematic-paper-screener-trawling-for-fraud-in-the-scientific-literature-246317\">another article</a> that over 764,000 articles cited papers that could be unreliable, further illustrating the creeping unreliability. <a href=\"https://toao.com\">Sadiq Jaffer</a> and I are planning to run this over our <a href=\"https://anil.recoil.org/projects/ce\">growing paper corpus</a>, but I can't find the source code to their system, just <a href=\"https://dbrech.irit.fr/pls/apex/f?p=9999:1::::::\">the hosted version</a>.</p>\n<p>Meanwhile, datasets are also under similar threat of causing <a href=\"https://www.nature.com/articles/s41586-024-07566-y\">recursive model collapse</a>. The <a href=\"https://github.com/rspeer/wordfreq\">Wordfreq</a> team announced in September 2024 that they would <a href=\"https://github.com/rspeer/wordfreq/blob/master/SUNSET.md\">discontinue</a> updating their corpus because generative AI has polluted the data, and information that used to be free has become expensive. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> also noted the related problem of dataset versioning becoming unreliable across science in "<a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs\">Uncertainty at scale: how CS hinders climate research</a>", but for different reasons -- large datasets are inherently difficult to version and reproduce (it's quite hard to share a terabyte of data over the Internet easily, even in this day and age).</p>\n<p>Another big development this week was the release of <a href=\"https://openai.com/index/introducing-deep-research/\">OpenAI's Deep Research</a> feature, which goes off and really mines a literature corpus for information. I've grudgingly upgraded to their expensive <a href=\"https://openai.com/index/introducing-chatgpt-pro/\">Pro</a> to try this out and will report my findings in a future post. 
The ability to generate papers has moved well beyond just the grammar generators that the Problematic Paper Screener can filter out, so this arms race is unlikely to end well if we're pinning our hopes on detecting AI-generated papers. The current publish-or-perish model has already died; at least our Cambridge <a href=\"https://www.acp.hr.admin.cam.ac.uk/acp-overview/acp-key-principles\">promotion process</a> is more enlightened than "just" looking at paper counts!</p>",
+18
avsm/notes_ai-for-evidence-synthesis-workshop.json
···+"title": "A fully AI-generated paper just passed peer review; notes from our evidence synthesis workshop",+"summary": "<p>Access to reliable and timely scientific evidence is utterly vital for the practice of responsible policymaking, especially with all the turmoil in the world these days. At the same time, the evidence base we use to make these decisions is rapidly morphing under our feet; the <a href=\"https://sakana.ai/ai-scientist-first-publication/\">first entirely AI-generated paper passed peer review</a> at an ICLR workshop today. We held a workshop on this topic of AI and evidence synthesis at <a href=\"https://pem.cam.ac.uk\">Pembroke College</a> last week, to understand both the opportunities for the use of AI here, the <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">strengths and limitations</a> of current tools, areas of progress and also just to chat with policymakers from <a href=\"https://www.gov.uk/government/organisations/department-for-science-innovation-and-technology\">DSIT</a> and thinktanks about how to approach this rapidly moving area.</p>\n<p><em>(The following notes are adapted from jottings from <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>,\n<a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://ai.cam.ac.uk/people/annabelle-scott\">Annabelle Scott</a> and myself. 
They are not at all complete, but hopefully useful!)</em></p>\n<p>We invited a range of participants to the workshop and held it at Pembroke College (the choice of the centuries-old location felt appropriate).\n<a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a> and <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> expertly emceed the day, with <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>, <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://samreynolds.org/\">Sam Reynolds</a> also presenting provocations to get the conversation going.</p>\n<p>\n<img alt=\"Lots of excellent discussions over Pembroke sarnies!\" src=\"https://anil.recoil.org/images/evidence-synth-2.webp\" title=\"Lots of excellent discussions over Pembroke sarnies!\">\nLots of excellent discussions over Pembroke sarnies!</p>\n<h2><a href=\"https://anil.recoil.org/#evidence-synthesis-at-scale\"></a>Evidence synthesis at scale</h2>\n<p><a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a> described the purpose of the workshop as follows:</p>\n<blockquote>\n<p>Evidence synthesis is a vital tool to connect scientific knowledge to areas\nof demand for actionable insights. It helps build supply chains of ideas,\nthat connect research to practice in ways that can deliver meaningful\nimprovements in policy development and implementation. Its value can be seen\nacross sectors: aviation safety benefitted from systematic incident analysis;\nmedical care has advanced through clinical trials and systematic reviews;\nengineering is enhanced through evidence-based design standards. When done\nwell, evidence synthesis can transform how fields operate. 
However, for every\nfield where evidence synthesis is embedded in standard operating practices,\nthere are others relying on untested assumptions or outdated guidance.\n-- <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>, AI@Cam</p>\n</blockquote>\n<p>One such field that benefits from evidence is <a href=\"https://anil.recoil.org/projects/ce\">conservation</a>, which is what <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> and his <a href=\"https://conservationevidence.com\">team</a> have been working away on for years. Bill went on to discuss the fresh challenges that AI brings to this field, because it introduces a new element of scale which could augment relatively slow human efforts.</p>\n<blockquote>\n<p>Scale poses a fundamental challenge to traditional approaches to evidence\nsynthesis. Comprehensive reviews take substantial resources and time. By the\ntime they are complete \u2013 or reach a policy audience \u2013 the window for action\nmay have closed. The Conservation Evidence project at the University of\nCambridge offers an example of how researchers can tackle this challenge. The\nConservation Evidence team has analysed over 1.3M journals from 17 languages\nand built a website enabling access to this evidence base. To support users\nto interrogate this evidence base, the team has compiled a metadataset that\nallows users to explore this literature based on a question of interest, for\nexample looking at what conservation actions have been effective in managing\na particular invasive species in a specified geographic area.\n-- <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>, AI@Cam</p>\n</blockquote>\n<p>The AI for evidence synthesis landscape is changing very rapidly, with a variety of specialised tools now\nbeing promoted in this space. 
This ranges from commercial tools such as <a href=\"https://gemini.google/overview/deep-research/?hl=en\">Gemini Deep Research</a> and <a href=\"https://openai.com/index/introducing-deep-research/\">OpenAI's deep searcher</a>, to\nresearch-focused systems such as <a href=\"https://elicit.com\">Elicit</a>, <a href=\"https://www.distillersr.com/products/distillersr-systematic-review-software\">DistillerSR</a>, and <a href=\"https://www.robotreviewer.net\">RobotReviewer</a>. These tools vary in their approach, capabilities, and target users, raising questions about which will best serve different user needs. RobotReviewer, for example, notes that:</p>\n<blockquote>\n<p>[...] the machine learning works well, but is not a substitute for human systematic reviewers. We recommend the use of our demo as an assistant to human reviewers, who can validate the machine learning suggestions, and correct them as needed. Machine learning used this way is often described as semi-automation.\n-- <a href=\"https://www.robotreviewer.net/about\">About RobotReviewer</a></p>\n</blockquote>\n<p>The problem, of course, is that these guidelines will often be ignored by\nreviewers who are under time pressure, and so the well established protocols\nfor systematic reviewers are under some threat.</p>\n<p>\n<img alt=\"Sadiq Jaffer and Sam Reynolds discuss emerging AI systems\" src=\"https://anil.recoil.org/images/evidence-synth-4.webp\" title=\"Sadiq Jaffer and Sam Reynolds discuss emerging AI systems\">\nSadiq Jaffer and Sam Reynolds discuss emerging AI systems</p>\n<h2><a href=\"https://anil.recoil.org/#how-do-we-get-more-systematic-ai-driven-systematic-reviews\"></a>How do we get more systematic AI-driven systematic reviews?</h2>\n<p><a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://samreynolds.org/\">Sam Reynolds</a> then talked about some of the computing approaches required to achieve a more reliable evidence review base.\nThey identified three key principles for 
responsible AI integration into evidence synthesis:</p>\n<ul>\n<li>Traceability: Users should see which information sources informed the evidence review system and why any specific evidence was included or excluded.</li>\n<li>Transparency: Open-source computation code, the use of open-weights models, <a href=\"https://www.ibm.com/impact/ai-ethics\">ethically sourced</a> training data, and clear documentation of methods mean users can scrutinise how the system is working.</li>\n<li>Dynamism: The evidence outputs should be continuously updated to refine the evidence base, by adding new evidence and flagging <a href=\"https://anil.recoil.org/notes/ai-contamination-of-papers\">retracted papers</a>.</li>\n</ul>\n<p><a href=\"https://www.cser.ac.uk/team/alex-marcoci/\">Alex Marcoci</a> pointed out his recent work on <a href=\"https://osf.io/sz2g8/\">AI replication games</a> which I found fascinating. The idea here is that:</p>\n<blockquote>\n<p>Researchers will be randomly assigned to one of three teams: Machine, Cyborg\nor Human. Machine and Cyborg teams will have access to (commercially\navailable) LLM models to conduct their work; Human teams of course rely only\non unaugmented human skills. Each team consists of 3 members with similar\nresearch interests and varying skill levels. Teams will be asked to check for\ncoding errors and conduct a robustness reproduction, which is the ability to\nduplicate the results of a prior study using the same data but different\nprocedures as were used by the original investigator.\n-- <a href=\"https://www.sheffield.ac.uk/machine-intelligence/events/i4rs-ai-replication-games\">Institute for Replication</a></p>\n</blockquote>\n<p>These replication games are happening on the outputs of evidence, but the\n<em>inputs</em> are also rapidly changing with today's announcement of a <a href=\"https://sakana.ai/ai-scientist-first-publication/\">fully AI-generated paper passing peer\nreview</a>. 
It's hopefully now clear\nthat AI is a huge disruptive factor in evidence synthesis.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/evidence-synth-3.webp\" title=\"\">\n</p>\n<h2><a href=\"https://anil.recoil.org/#the-opportunity-ahead-of-us-for-public-policy\"></a>The opportunity ahead of us for public policy</h2>\n<p>We first discussed how AI could help in enhancing systematic reviews.\nAI-enabled analysis can accelerate literature screening and data extraction,\ntherefore helping make the reviews more timely and comprehensive. The\nopportunity ahead of us is to democratise access to knowledge synthesis by\nmaking it available to those without specialised training or institutional\nresources, and therefore getting wider deployment in countries and\norganisations without the resources to commission traditional reviews.</p>\n<p>However, there are big challenges remaining in <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">gaining access</a> to published research papers and datasets.\nThe publishers have deep concerns over AI-generated evidence synthesis, and more generally about the use of generative AI involving their source material. But individual publishers are <a href=\"https://theconversation.com/an-academic-publisher-has-struck-an-ai-data-deal-with-microsoft-without-their-authors-knowledge-235203\">already selling</a> their content to the highest bidder as part of the <a href=\"https://anil.recoil.org/notes/ai-ietf-aiprefs\">data hoarding wars</a> and so the spread of the work into pretrained models is not currently happening equitably or predictably.\n<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> called this "competitive exclusion", and it is limiting communication and knowledge diversity.</p>\n<p>The brilliant <a href=\"https://www.aru.ac.uk/people/jennifer-schooling\">Jennifer Schooling</a> then led a panel discussion about the responsible\nuse of AI in the public sector. 
The panel observed that different countries\nare taking different approaches to the applications of AI in policy research.\nHowever, every country has deep regional variances in the <em>application</em> of\npolicy and priorities, which means that global pretrained AI models always need\nsome localized retuning. The "one-size-fits-all" approach works particularly\nbadly for policy, where local context is crucial to a good community outcome\nthat minimises harm.</p>\n<p>Policymakers therefore need realistic expectations about what AI can and cannot do in evidence synthesis.\n<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> and <a href=\"https://www.aru.ac.uk/people/jennifer-schooling\">Jennifer Schooling</a> came up with the notion that "anticipate, test, and learn" methods must guide AI deployment in policy research; this is an extension of the "<a href=\"https://public.digital/pd-insights/blog/2024/12/just-what-is-test-and-learn\">test and learn</a>" culture being pushed by Pat McFadden as part of the Labour plan to <a href=\"https://www.gov.uk/government/speeches/reform-of-the-state-has-to-deliver-for-the-people\">reform the public sector</a> this year. With AI systems, <a href=\"https://www.cser.ac.uk/team/alex-marcoci/\">Alex Marcoci</a> noted that we need to be working with the end users of the tools to scope what government departments need and want. These conversations need to happen <em>before</em> we build the tools, letting us anticipate problems before we deploy and test them in a real policy environment. 
<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> noted that policy doesn't have a simple "sandbox" environment to test AI outcomes in, unlike many other fields where simulation is practical ahead of deployment.</p>\n<p><a href=\"https://www.jbs.cam.ac.uk/people/lucia-reisch/\">Lucia Reisch</a> noted that users must maintain critical judgement when using these\nnew AI tools; the machine interfaces must empower users towards enhancing their\ncritical thinking and encouraging reflection on what outputs are being created\n(and what is being left out!). Lucia also mentioned that her group helps run\nthe "<a href=\"https://whatworksclimate.solutions/about/\">What Works</a>" summit, which\nI've never been to but plan on attending next time it rolls around.</p>\n<p>The energy requirements for training and running these large-scale AI models\nare significant as well, of course, raising questions about the long-term\nmaintenance costs of these tools and their environmental footprint. There was\nwide consensus that the UK should develop its own AI models to ensure\nresilience and sovereignty, but also to make sure that the regional finetuning\nto maximise positive outcomes is under clear local control and not outsourced\ngeopolitically. 
By providing a single model that combines <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">UK national data</a>, we would also not waste energy with lots of\nsmaller training efforts around the four nations.</p>\n<p>\n<img alt=\"Sadiq Jaffer in front of a very old, very fancy and not AI-designed door\" src=\"https://anil.recoil.org/images/evidence-synth-1.webp\" title=\"Sadiq Jaffer in front of a very old, very fancy and not AI-designed door\">\nSadiq Jaffer in front of a very old, very fancy and not AI-designed door</p>\n<p>Thanks <a href=\"https://ai.cam.ac.uk/people/annabelle-scott\">Annabelle Scott</a> for such a stellar organisation job and to Pembroke for hosting and all for\nattending, and please do continue the discussion about this <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7303431795587309569/\">on LinkedIn</a>\nif you are so inclined.</p>",+"content": "<p>Access to reliable and timely scientific evidence is utterly vital for the practice of responsible policymaking, especially with all the turmoil in the world these days. At the same time, the evidence base we use to make these decisions is rapidly morphing under our feet; the <a href=\"https://sakana.ai/ai-scientist-first-publication/\">first entirely AI-generated paper passed peer review</a> at an ICLR workshop today. 
We held a workshop on this topic of AI and evidence synthesis at <a href=\"https://pem.cam.ac.uk\">Pembroke College</a> last week, to understand both the opportunities for the use of AI here, the <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">strengths and limitations</a> of current tools, areas of progress and also just to chat with policymakers from <a href=\"https://www.gov.uk/government/organisations/department-for-science-innovation-and-technology\">DSIT</a> and thinktanks about how to approach this rapidly moving area.</p>\n<p><em>(The following notes are adapted from jottings from <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>,\n<a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://ai.cam.ac.uk/people/annabelle-scott\">Annabelle Scott</a> and myself. They are not at all complete, but hopefully useful!)</em></p>\n<p>We invited a range of participants to the workshop and held it at Pembroke College (the choice of the centuries-old location felt appropriate).\n<a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a> and <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> expertly emceed the day, with <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>, <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://samreynolds.org/\">Sam Reynolds</a> also presenting provocations to get the conversation going.</p>\n<p>\n<img alt=\"Lots of excellent discussions over Pembroke sarnies!\" src=\"https://anil.recoil.org/images/evidence-synth-2.webp\" title=\"Lots of excellent discussions over Pembroke sarnies!\">\nLots of excellent discussions over Pembroke sarnies!</p>\n<h2><a href=\"https://anil.recoil.org/#evidence-synthesis-at-scale\"></a>Evidence synthesis at scale</h2>\n<p><a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a> described the purpose of the workshop as follows:</p>\n<blockquote>\n<p>Evidence synthesis is a vital 
tool to connect scientific knowledge to areas\nof demand for actionable insights. It helps build supply chains of ideas,\nthat connect research to practice in ways that can deliver meaningful\nimprovements in policy development and implementation. Its value can be seen\nacross sectors: aviation safety benefitted from systematic incident analysis;\nmedical care has advanced through clinical trials and systematic reviews;\nengineering is enhanced through evidence-based design standards. When done\nwell, evidence synthesis can transform how fields operate. However, for every\nfield where evidence synthesis is embedded in standard operating practices,\nthere are others relying on untested assumptions or outdated guidance.\n-- <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>, AI@Cam</p>\n</blockquote>\n<p>One such field that benefits from evidence is <a href=\"https://anil.recoil.org/projects/ce\">conservation</a>, which is what <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> and his <a href=\"https://conservationevidence.com\">team</a> have been working away on for years. Bill went on to discuss the fresh challenges that AI brings to this field, because it introduces a new element of scale which could augment relatively slow human efforts.</p>\n<blockquote>\n<p>Scale poses a fundamental challenge to traditional approaches to evidence\nsynthesis. Comprehensive reviews take substantial resources and time. By the\ntime they are complete \u2013 or reach a policy audience \u2013 the window for action\nmay have closed. The Conservation Evidence project at the University of\nCambridge offers an example of how researchers can tackle this challenge. The\nConservation Evidence team has analysed over 1.3M journals from 17 languages\nand built a website enabling access to this evidence base. 
To support users\nto interrogate this evidence base, the team has compiled a metadataset that\nallows users to explore this literature based on a question of interest, for\nexample looking at what conservation actions have been effective in managing\na particular invasive species in a specified geographic area.\n-- <a href=\"https://www.cst.cam.ac.uk/people/jkm40\">Jessica Montgomery</a>, AI@Cam</p>\n</blockquote>\n<p>The AI for evidence synthesis landscape is changing very rapidly, with a variety of specialised tools now\nbeing promoted in this space. This ranges from commercial tools such as <a href=\"https://gemini.google/overview/deep-research/?hl=en\">Gemini Deep Research</a> and <a href=\"https://openai.com/index/introducing-deep-research/\">OpenAI's Deep Research</a>, to\nresearch-focused systems such as <a href=\"https://elicit.com\">Elicit</a>, <a href=\"https://www.distillersr.com/products/distillersr-systematic-review-software\">DistillerSR</a>, and <a href=\"https://www.robotreviewer.net\">RobotReviewer</a>. These tools vary in their approach, capabilities, and target users, raising questions about which will best serve different user needs. RobotReviewer, for example, notes that:</p>\n<blockquote>\n<p>[...] the machine learning works well, but is not a substitute for human systematic reviewers. We recommend the use of our demo as an assistant to human reviewers, who can validate the machine learning suggestions, and correct them as needed. 
Machine learning used this way is often described as semi-automation.\n-- <a href=\"https://www.robotreviewer.net/about\">About RobotReviewer</a></p>\n</blockquote>\n<p>The problem, of course, is that these guidelines will often be ignored by\nreviewers who are under time pressure, and so the well-established protocols\nfor systematic reviewers are under some threat.</p>\n<p>\n<img alt=\"Sadiq Jaffer and Sam Reynolds discuss emerging AI systems\" src=\"https://anil.recoil.org/images/evidence-synth-4.webp\" title=\"Sadiq Jaffer and Sam Reynolds discuss emerging AI systems\">\nSadiq Jaffer and Sam Reynolds discuss emerging AI systems</p>\n<h2><a href=\"https://anil.recoil.org/#how-do-we-get-more-systematic-ai-driven-systematic-reviews\"></a>How do we get more systematic AI-driven systematic reviews?</h2>\n<p><a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://samreynolds.org/\">Sam Reynolds</a> then talked about some of the computing approaches required to achieve a more reliable evidence review base.\nThey identified three key principles for responsible AI integration into evidence synthesis:</p>\n<ul>\n<li>Traceability: Users should see which information sources informed the evidence review system and why any specific evidence was included or excluded.</li>\n<li>Transparency: Open-source computation code, the use of open-weights models, <a href=\"https://www.ibm.com/impact/ai-ethics\">ethically sourced</a> training data, and clear documentation of methods mean users can scrutinise how the system is working.</li>\n<li>Dynamism: The evidence outputs should be continuously updated to refine the evidence base by adding new evidence and flagging <a href=\"https://anil.recoil.org/notes/ai-contamination-of-papers\">retracted papers</a>.</li>\n</ul>\n<p><a href=\"https://www.cser.ac.uk/team/alex-marcoci/\">Alex Marcoci</a> pointed out his recent work on <a href=\"https://osf.io/sz2g8/\">AI replication games</a> which I found fascinating. 
The idea here is that:</p>\n<blockquote>\n<p>Researchers will be randomly assigned to one of three teams: Machine, Cyborg\nor Human. Machine and Cyborg teams will have access to (commercially\navailable) LLM models to conduct their work; Human teams of course rely only\non unaugmented human skills. Each team consists of 3 members with similar\nresearch interests and varying skill levels. Teams will be asked to check for\ncoding errors and conduct a robustness reproduction, which is the ability to\nduplicate the results of a prior study using the same data but different\nprocedures as were used by the original investigator.\n-- <a href=\"https://www.sheffield.ac.uk/machine-intelligence/events/i4rs-ai-replication-games\">Institute for Replication</a></p>\n</blockquote>\n<p>These replication games are happening on the outputs of evidence, but the\n<em>inputs</em> are also rapidly changing with today's announcement of a <a href=\"https://sakana.ai/ai-scientist-first-publication/\">fully AI-generated paper passing peer\nreview</a>. It's hopefully now clear\nthat AI is a huge disruptive factor in evidence synthesis.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/evidence-synth-3.webp\" title=\"\">\n</p>\n<h2><a href=\"https://anil.recoil.org/#the-opportunity-ahead-of-us-for-public-policy\"></a>The opportunity ahead of us for public policy</h2>\n<p>We first discussed how AI could help in enhancing systematic reviews.\nAI-enabled analysis can accelerate literature screening and data extraction,\ntherefore helping make the reviews more timely and comprehensive. 
The\nopportunity ahead of us is to democratise access to knowledge synthesis by\nmaking it available to those without specialised training or institutional\nresources, and therefore getting wider deployment in countries and\norganisations without the resources to commission traditional reviews.</p>\n<p>However, there are big challenges remaining in <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">gaining access</a> to published research papers and datasets.\nThe publishers have deep concerns over AI-generated evidence synthesis, and more generally about the use of generative AI involving their source material. But individual publishers are <a href=\"https://theconversation.com/an-academic-publisher-has-struck-an-ai-data-deal-with-microsoft-without-their-authors-knowledge-235203\">already selling</a> their content to the highest bidder as part of the <a href=\"https://anil.recoil.org/notes/ai-ietf-aiprefs\">data hoarding wars</a> and so the spread of the work into pretrained models is not currently happening equitably or predictably.\n<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> called this "competitive exclusion", and it is limiting communication and knowledge diversity.</p>\n<p>The brilliant <a href=\"https://www.aru.ac.uk/people/jennifer-schooling\">Jennifer Schooling</a> then led a panel discussion about the responsible\nuse of AI in the public sector. The panel observed that different countries\nare taking different approaches to the applications of AI in policy research.\nHowever, every country has deep regional variances in the <em>application</em> of\npolicy and priorities, which means that global pretrained AI models always need\nsome localized retuning. 
The "one-size-fits-all" approach works particularly\nbadly for policy, where local context is crucial to a good community outcome\nthat minimises harm.</p>\n<p>Policymakers therefore need realistic expectations about what AI can and cannot do in evidence synthesis.\n<a href=\"https://inverseprobability.com/\">Neil Lawrence</a> and <a href=\"https://www.aru.ac.uk/people/jennifer-schooling\">Jennifer Schooling</a> came up with the notion that "anticipate, test, and learn" methods must guide AI deployment in policy research; this is an extension of the "<a href=\"https://public.digital/pd-insights/blog/2024/12/just-what-is-test-and-learn\">test and learn</a>" culture being pushed by Pat McFadden as part of the Labour plan to <a href=\"https://www.gov.uk/government/speeches/reform-of-the-state-has-to-deliver-for-the-people\">reform the public sector</a> this year. With AI systems, <a href=\"https://www.cser.ac.uk/team/alex-marcoci/\">Alex Marcoci</a> noted that we need to be working with the end users of the tools to scope what government departments need and want. These conversations needs to happen <em>before</em> we build the tools, letting us anticipate problems before we deploy and test them in a real policy environment. <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> noted that policy doesn't have a simple "sandbox" environment to test AI outcomes in, unlike many other fields where simulation is practical ahead of deployment.</p>\n<p><a href=\"https://www.jbs.cam.ac.uk/people/lucia-reisch/\">Lucia Reisch</a> noted that users must maintain critical judgement when using these\nnew AI tools; the machine interfaces must empower users towrads enhancing their\ncritical thinking and encouraging reflection on what outputs are being created\n(and what is being left out!). 
Lucia also mentioned that her group helps run\nthe "<a href=\"https://whatworksclimate.solutions/about/\">What Works</a>" summit, which\nI've never been to but plan on attending next time it rolls around.</p>\n<p>The energy requirements for training and running these large-scale AI models\nare significant as well, of course, raising questions about the long-term\nmaintenance costs of these tools and their environmental footprint. There was\nwide consensus that the UK should develop its own AI models to ensure\nresilience and sovereignty, but also to make sure that the regional finetuning\nto maximise positive outcomes is under clear local control and not outsourced\ngeopolitically. By providing a single model that combines <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">UK national data</a>, we would also not waste energy with lots of\nsmaller training efforts around the four nations.</p>\n<p>\n<img alt=\"Sadiq Jaffer in front of a very old, very fancy and not AI-designed door\" src=\"https://anil.recoil.org/images/evidence-synth-1.webp\" title=\"Sadiq Jaffer in front of a very old, very fancy and not AI-designed door\">\nSadiq Jaffer in front of a very old, very fancy and not AI-designed door</p>\n<p>Thanks <a href=\"https://ai.cam.ac.uk/people/annabelle-scott\">Annabelle Scott</a> for such a stellar organisation job and to Pembroke for hosting and all for\nattending, and please do continue the discussion about this <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7303431795587309569/\">on LinkedIn</a>\nif you are so inclined.</p>",
+18
avsm/notes_ai-for-science-2024.json
···+"summary": "<p>I got invited to join the Royal Society and DeepMind to a summit on\nhow AI is revolutionising scientific discovery and trotted along with\n<a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>. This event is hot on the heels of the\nexcellent RS report on <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Science in the Age of AI</a>\nand, of course, the Nobel prize for Demis Hassabis which was the <a href=\"https://www.cst.cam.ac.uk/news/nobel-prize-our-alumnus-sir-demis-hassabis\">first ever\nfor my department</a>!\nThe event was held at the BAFTA today, and what follows are my quick livenotes\nas there was just so much to absorb. The RS and Deepmind will have the\nfull sessions online sometime soon, so I'll update this with those more polished\noutputs when they're out! <em>Update: Proper notes now available from <a href=\"https://blog.google/technology/ai/ai-science-forum-2024/\">Google</a> and the <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Royal Society</a>.</em></p>\n<p>\n<img alt=\"Hannah Fry doing a great job emceeing\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-4.webp\" title=\"Hannah Fry doing a great job emceeing\">\nHannah Fry doing a great job emceeing</p>\n<p>The summit was a day-long exploration of how artificial intelligence is\ntransforming science and society, and the overall theme (including four Nobel\nlaureates!) was on how we are in a golden age of scientific discovery,\nespecially in the biological sciences. 
The emcee for the event was Hannah Fry,\nwho simply dazzled with her rapid summarisation of complex discussions\ninterspersed with very dry humour about the proceedings!</p>\n<p>The consistent message was how ongoing synergy between science, technology, and\nsociety is essential to setting the stage for an exploration of transformative\ndiscoveries powered by AI that <em>would benefit everyone in society</em>. Missing that\nsynergy would lead to unequal benefit or dangerous crossings of boundaries.</p>\n<p>\n<img alt=\"Busy schedule for the day at BAFTA HQ!\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-8.webp\" title=\"Busy schedule for the day at BAFTA HQ!\">\nBusy schedule for the day at BAFTA HQ!</p>\n<h2><a href=\"https://anil.recoil.org/#lessons-from-crispr\"></a>Lessons from CRISPR</h2>\n<p>The first session had James Manyika interviewing <a href=\"https://en.wikipedia.org/wiki/Jennifer_Doudna\">Jennifer Doudna</a>, Nobel Laureate and co-inventor of CRISPR, who shared how gene editing has moved from science fiction to an essential tool for societal improvement. Some key points:</p>\n<ul>\n<li>AI's integration with CRISPR allows scientists to better predict and control\ngenome editing outcomes, advancing efficiency and reducing hypothesis-testing\ntime. Many experiments could potentially be skipped thanks to simulations\npredicting outcomes without the need for wetlab work.</li>\n<li>Jennifer discussed projects like <a href=\"https://www.ucdavis.edu/food/news/making-cattle-more-sustainable\">methane-free cows</a>,\nwhere altering cattle genomes could eliminate methane production entirely.\nThese efforts require multidisciplinary collaboration between computer\nscientists, agricultural experts, and biologists.</li>\n<li>The success of CRISPR emphasises the need for public engagement, policy\nframeworks, and open databases for international collaboration. 
Doudna called\nfor knowledge accessibility, including teaching high school educators about\ngenome editing's impact, as a key part of public outreach about how this\ntechnology might affect society in the future.</li>\n<li>CRISPR moved really fast: it was published in 2012, and by 2014 scientists\nwere already editing monkey embryos. This led to a realisation that it\nwasn't enough to be heads-down in the lab: a whole team was needed to work on\npublic impact and policy (including the RS and national academies) to bootstrap\ninternational meetings on human genome editing. The most recent was held in\nLondon in March of last year which led to open, transparent conversations and\nbuilding a worldwide database of work involving genome editing, especially that\nwhich impacts the human genome or environmental editing which could escape.</li>\n</ul>\n<p>\n<img alt=\"James and Jennifer in discussion.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-1.webp\" title=\"James and Jennifer in discussion.\">\nJames and Jennifer in discussion.</p>\n<h2><a href=\"https://anil.recoil.org/#science-in-the-age-of-ai\"></a>Science in the Age of AI</h2>\n<p>The next panel had <a href=\"https://en.wikipedia.org/wiki/Eric_Topol\">Eric Topol</a> chairing a discussion with <a href=\"https://en.wikipedia.org/wiki/Pushmeet_Kohli\">Pushmeet Kohli</a>, <a href=\"https://en.wikipedia.org/wiki/Fiona_Marshall_(pharmacologist)\">Fiona H. Marshall</a> and <a href=\"https://en.wikipedia.org/wiki/Alison_Noble\">Alison Noble</a>. 
The theme was on how a huge number of foundation\nmodels are coming out for LLLMs (large language life models) at a dizzying\npace.</p>\n<p>\n<img alt=\"Eric, Pushmeet, Fiona and Alison on stage.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-2.webp\" title=\"Eric, Pushmeet, Fiona and Alison on stage.\">\nEric, Pushmeet, Fiona and Alison on stage.</p>\n<p>Pushmeet Kohli first explained how deciphering the genome accelerates\nbiological discoveries by orders of magnitude. AI tools like AlphaFold (which\njust got the Nobel prize) exemplify breakthroughs that transform biology into a\npredictive science from a wetlab-driven discipline. On other fronts,\nDeepMind's GraphCast model enables near-term weather forecasting in minutes,\nwhich previously required days of supercomputer time to do an equivalent\nforecast (and now\n<a href=\"https://www.nature.com/articles/s41586-024-07744-y\">NeuralGCM</a> is doing even\nbetter with mechanistic models combined with data). Pushmeet then noted how\nGNNs for materials science identified over 400k (or 200k? couldn't catch it) new\nstable inorganic crystals, with potential applications like high-temperature\nsuperconductors which were just sci-fi before.</p>\n<p>Then Fiona H. Marshall from Novartis emphasized how AI identifies new drug\ntargets using genomic and population-level data. In drug development,\npredictive safety is absolutely crucial. Pharmaceutical companies have decades\u2019\nworth of histological data, such as rodent testing, stored on hundreds of\nthousands of glass slides that are now being digitized. Once this data is\nprocessed, we can use it to make various predictions. Sharing this data across\nthe pharmaceutical industry would benefit everyone. One of their partner\ncompanies is developing scanning algorithms; once these are operational\nand made open, the entire industry will gain from the resulting\ndataset. 
Generative chemistry (like AlphaFold) now designs drugs faster while\npredictive toxicology ensures safer medicines.\nInterestingly, the data scientists are in the prime author\nposition as they are dictating the experimental procedures vs the wetlab\nscientists a few decades ago. This change in incentives drives change within a\nlarge org towards more data-driven methods.</p>\n<p>Another valuable source of data is population-level information, such as <a href=\"https://ourfuturehealth.org.uk\">Our\nFuture Health</a> (a UK-based NHS initiative).\nProper management of nomenclature will ensure that this project generates a\nmassive, usable dataset vs what we have anywhere else. Eric noted that they\nrely heavily on the UK Biobank, which, with its 500,000 participants, is one of\nthe largest in the world and the Our Future Health program is a huge leap\nforward with 5m participants. The NIH in the United States is hesitant to\nengage in public-private partnerships, and so the UK is leading the way in this\ndomain (<em>Anil notes: with privacy concerns about the data sharing</em>).</p>\n<p>Fiona also noted that AI is transforming clinical trials, not\njust the discovery process itself. Typically, it takes around 10 years for a\ncandidate molecule to move to the clinical trial phase. One major bottleneck is\npatient enrollment. By leveraging vast demographic databases, which include\ninformation on global populations, their diseases, medications, and hospital\naffiliations, we can drastically improve recruitment efficiency. For example,\na clinical trial protocol targeting "women aged 30-50 who are not taking drug X\nor estrogen modifiers" can utilize these databases to identify and enroll\npatients quickly. 
This approach can reduce recruitment time from three years to\njust six months, significantly accelerating the process of getting drugs to\nmarket.</p>\n<p>\n<img alt=\"Sneaking in some sightseeing at BAFTA\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-12.webp\" title=\"Sneaking in some sightseeing at BAFTA\">\nSneaking in some sightseeing at BAFTA</p>\n<p>Alison Noble discussed how deep learning has revolutionized ultrasound imaging.\nAI now guides users on probe placement, reducing training requirements for\nmedical professionals. However, we're now moving so fast that we need to be\ncareful; even the notion of what a scientist is is changing. In the RS report\non <a href=\"https://anil.recoil.org/\">Science in the Age of AI</a> a number of scientists around the UK were\nconsulted and this concern of reproducibility and data access came up loud and\nclear. When we publish results, we need to make sure they are sound and that\npeer review is possible. Openness is a deliberate choice, however, and not always\nappropriate when sensitive data is involved (e.g. healthcare) but requiring\nrigour in evaluation is essential for good science. Scientists need to rethink\nin the age of AI how we present our work, and how we train new scientists in\nthis environment. So we have some wonderful early examples like AlphaFold, but\nwe need to understand the societal/incentive impacts on our new generation of\nscientists.</p>\n<p>Eric also noted that one of the greatest challenges in genomics is\nunderstanding variance, and\n<a href=\"https://www.science.org/doi/10.1126/science.adg7492\">AlphaMissense</a> has\ntackled this head-on. However, there is a significant data shortage. Without\nHelen Berman and the creation of the <a href=\"https://en.wikipedia.org/wiki/Worldwide_Protein_Data_Bank\">protein data\nbank</a>, AlphaFold\nwouldn\u2019t have been possible. This raises the critical question: where do we\nsource the necessary inputs? 
Pushmeet responded that intelligence doesn\u2019t\nemerge in isolation; it relies on experiential datasets. Models can derive this\nexperience from real-world input or interactions within simulations.\nHigh-fidelity simulations, in particular, provide models with valuable\nintuition about future outcomes. Experimental data is also critical, as it\nintegrates with simulations to complete the picture. When dealing with\nunlabeled data, prediction labels generated by the model itself can be used for\nfurther training. However, it's essential to discard incorrect labels to ensure\naccuracy, which makes this technique effective for bridging data gaps.\nCrucially, a pure and uncontaminated test set is vital to ensure the\nreliability of the system. For example, AlphaMissense was trained in an\nunsupervised manner and successfully identified cancer mutations.</p>\n<p>The discussion was quite wide ranging, but overall the two key challenges were:</p>\n<ul>\n<li>Reproducibility in science is a growing concern as AI accelerates discovery.\nAlison Noble emphasized the need for rigorous validation of results.</li>\n<li>Pushmeet noted the importance of testing the "prediction hull" of AI systems\nto understand their uncertainty and limitations, which was how AlphaFold\nbuilt up user confidence (by not having false positives).</li>\n</ul>\n<p>As AI transforms science, public-private partnerships and interdisciplinary\ncollaboration are essential (like the Our Future Health) program. Transparency\nand openness are deliberate choices in science, but regulation must keep up\nwith the pace of innovation. Alison Noble also noted there is a culture change\ngoing on for these public/private partnerships within academia. 
While\ncompetition drives innovation, we need to make sure that the academic reward\nsystem keeps up; if there are 50 people on a paper then how is this attractive\nfor young scientists to enter a field and make their own name?</p>\n<h2><a href=\"https://anil.recoil.org/#a-view-from-the-frontier-of-cell-biology\"></a>A view from the frontier (of cell biology)</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Siddhartha_Mukherjee\">Siddhartha Mukherjee</a>, a cancer physician and Pulitzer Prize winner (and\npersonally speaking, author of one of my <a href=\"https://en.wikipedia.org/wiki/The_Emperor_of_All_Maladies\">favourite medical\nbooks</a> ever), began\nthe discussion with a warning about potential breaches of data privacy and the\ndangers they pose. He emphasized the risk of AI being weaponized for\nmisinformation, calling it a frontier challenge. These issues highlight the\nurgent need to develop mitigation strategies while continuing to advance the\ncapabilities of AI in their respective fields.\nSiddhartha emphasized that data is the critical missing link in advancing AI.\nIssues of access, quality, integration, equity, and privacy must be addressed.\nThe complexity of data ownership in AI raises ethical and commercial concerns,\nas data aggregators often benefit disproportionately.</p>\n<p>\n<img alt=\"Siddhartha on stage with Anne, Janet and Anna.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-9.webp\" title=\"Siddhartha on stage with Anne, Janet and Anna.\">\nSiddhartha on stage with Anne, Janet and Anna.</p>\n<p><a href=\"https://www.ebi.ac.uk/people/person/janet-thornton/\">Janet Thornton</a>, from the European Molecular\nBiology Lab, shared her perspective on protein structures. She highlighted how\nAI has transformed the field\u2014from modeling just 20 protein structures in the\nearly days to now over 214 million. 
Structural biologists worldwide are using\nthis data to validate their findings, explore ligand binding, and venture into\nuncharted territories of protein functionality. Anna delved into her work as a\ncell biologist studying membrane proteins and their organization within the\nbody. She shared a case from Cyprus, where mysterious kidney disease affected\ncertain families for decades. AI-driven image recognition was used to identify\na genetic mutation, leading to a therapeutic solution. The issue stemmed from a\nmisshapen protein caused by a nodal molecule that traps faulty proteins,\nultimately causing cell death. This discovery is now being applied to other\nconditions, such as Alzheimer\u2019s and blindness, offering hope for broader\ntherapeutic applications.</p>\n<p>Janet also spoke about her time as the director of the European\nBioinformatics Institute, which manages data repositories like the <a href=\"https://www.wwpdb.org\">Worldwide\nProtein Data Bank</a>. She described the cultural shift required to encourage data\nsharing, noting it took 20 years for crystallographers to agree to mandatory\ndata deposition before publication. She stressed that medical data,\nparticularly in clinical contexts, must undergo a similar transformation.\nPublicly funded data must eventually reach the commons, especially when such\nsharing has the potential to save lives. The UK\u2019s NHS, with its secure data\nenvironments, provides a strong model for this approach. 
However, the health\nsector needs to move faster than the crystallography community did, requiring\nbuy-in from both patients and medical professionals, as emphasized in Cathie\nSudlow\u2019s recent report on the UK\u2019s health data landscape.</p>\n<p><a href=\"https://www.broadinstitute.org/bios/anna-greka\">Anna Greka</a>, a pathologist and head of a new institute focusing on women\u2019s\ncancer at the Broad Institute, discussed her work on building AI tools to identify and facilitate the\ndevelopment of disease mechanisms. Anna Greka added that millions of human\ngenomes have been sequenced and aggregated into databases usable by scientists\nworldwide. If pathology labs globally pooled their data, AI training models\nwould benefit immensely. She suggested anonymizing the data while preserving\nmetadata and federating results across countries to protect privacy and enhance\nglobal collaboration.</p>\n<p>Anne Vincent-Salomon, head of diagnostic and theranostic medicine at the\nInstitut Curie, stressed the importance of multidisciplinary science and\neffective communication. She emphasized the need to educate the public,\nreducing fear and fostering understanding of scientific advancements.</p>\n<p>Anna concluded by underscoring the importance of understanding the "unit of\nlife" to progress in biology. She argued for the creation of a high-quality\nperturbation dataset for cells, akin to the Protein Data Bank\u2019s role in\nAlphaFold. Skeptical of synthetic data, she emphasized the need for\nexperimentally derived data as a foundation for future breakthroughs. She\ncalled this effort a moonshot for the next five years \u2014 a grand challenge to\ndeepen our understanding of life itself! 
(<em>Anil: see also this great <a href=\"https://www.ted.com/talks/anna_greka_the_world_s_rarest_diseases_and_how_they_impact_everyone?subtitle=en\">TED talk</a> from Anna last year</em>)</p>\n<h2><a href=\"https://anil.recoil.org/#the-polycene-exploring-the-opportunity-of-the-moment\"></a>The Polycene: Exploring the Opportunity of the Moment</h2>\n<p>\n<img alt=\"Thomas Friedman talking about the polycene.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-5.webp\" title=\"Thomas Friedman talking about the polycene.\">\nThomas Friedman talking about the polycene.</p>\n<p>The (epic) next speaker was Thomas L. Friedman, who explored the interplay of three critical "scaling laws" in the modern era:</p>\n<ol>\n<li><strong>Software</strong>: The rapid improvement of AI capabilities post-2017 (with transformers and GPUs).</li>\n<li><strong>Carbon</strong>: Rising CO2e and methane emissions driving "<a href=\"https://www.newstatesman.com/science-tech/2021/04/why-we-need-talk-about-global-weirding\">global weirding</a>" (extreme and destructive climate changes).</li>\n<li><strong>Disorder</strong>: Societal and institutional fragmentation in addressing these crises.</li>\n</ol>\n<p>Between 2017, with the introduction of transformer algorithms, and 2020, when\nadvancements in microchips and GPUs took off, artificial intelligence has\nexpanded dramatically. This period reflects a "scaling law" in action.\nPolymathic AI\u2014AI capable of addressing a broad range of problems\u2014now seems\nwithin reach. In just three years, AI-driven science has evolved from science\nfiction to reality and become accessible to many (albeit often with some\nlimitations on free access). To address the challenges AI presents, we need\nhigher-dimensional thinking for higher-dimensional problems.</p>\n<p>At the same time, we're seeing a scaling law in carbon emissions. 
Levels of CO\u2082\nand methane in the atmosphere, including methane from livestock, are causing\ndestructive climate change. The seven warmest years on record occurred between\n2015 and 2021, resulting in what\u2019s called "global weirding"\u2014where hot regions\ngrow hotter, wet regions grow wetter, and the effects become increasingly\ndestructive.</p>\n<p>In parallel with these scaling points in carbon and silicon (AI hardware),\nwe\u2019re facing a scaling point in disorder\u2014the erosion of civic structures.\nGovernments worldwide have over-promised on the benefits of industrial\ndemocracies, such as healthcare, retirement plans, and infrastructure, yet are\nstruggling to deliver. Societies are aging, educational attainment has\nstagnated, and productivity growth has slowed.</p>\n<p>We're also witnessing growing national security challenges and the dissolution\nof the great power balance that defined the post-Cold War era. Electric\ntransportation, healthcare, and employment systems are under strain, leading to\nincreased migration. Today, there are 56 active conflicts globally\u2014the highest\nnumber since World War II\u2014and more displaced people than at any point in\nhistory.</p>\n<p>We need a game-changer.</p>\n<p>To solve these interconnected crises, we must adopt higher-dimensional\napproaches that blend solutions across disciplines and scales. The future\nstability of our planet\u2014and the well-being of the next generation\u2014depends on\npresenting holistic, interconnected solutions. Friedman calls this the "Polycene" era.</p>\n<p>Never before has politics needed science more. 
Science must enable leaps\nforward in education and provide buffers against disruptive climate change.\nSimilarly, politics must create the frameworks and systems to allow science to\nthrive and deliver meaningful solutions.</p>\n<p>In <a href=\"https://en.wikipedia.org/wiki/That_Used_to_Be_Us\">That Used to Be Us</a>,\nFriedman argued that "average is over"; the rapid acceleration of technology\nmeans the American Dream -- once achievable for many -- is no longer guaranteed.</p>\n<p>However, technology can flatten access to resources. For instance, an Indian\nfarmer can now access advanced insights about crop planting, watering\nschedules, and fertilizers directly on a smartphone. For the first time, those\nwithout access to "average" resources now have access to them through AI.\nThanks to AI, "average" as a benchmark is over\u2014and that gives Friedman optimism.</p>\n<p>However, there\u2019s a caveat: science and politics must work together. The\nalignment problem between these fields is real and will become more pressing as\nwe approach AGI. As a species, we\u2019ve become more godlike than ever before,\ncreating systems and intelligence that resemble a larger, more powerful brain.\nHow we use this power will determine our future.</p>\n<h2><a href=\"https://anil.recoil.org/#building-the-infrastructure-for-success\"></a>Building the Infrastructure for Success</h2>\n<p>The speakers here were Paul Hofheinz, <a href=\"https://en.wikipedia.org/wiki/Asmeret_Asefaw_Berhe\">Asmeret Asefaw Berhe</a>, Fabian J. 
Theis and Bosun Tijani.</p>\n<p>\n<img alt=\"Paul Hofheinz, Asmeret, Fabian and Bosun.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-6.webp\" title=\"Paul Hofheinz, Asmeret, Fabian and Bosun.\">\nPaul Hofheinz, Asmeret, Fabian and Bosun.</p>\n<p>Berhe began by noting that we are at an inflection point -- how do we avoid\nrepeating mistakes from the past, ensuring we don\u2019t leave segments of human\nsociety behind or widen the digital divide further? In previous programs such\nas exascale computing, they insisted as part of the program that they must\nhave explicit sustainability goals. While these goals may have seemed\nunrealistic initially and were criticised, in retrospect they have shown they\ncan be achieved. An example of the next thing they're working on is the\nHigh-Performance Data Facility, recognizing that the Office of Science produces\nmore scientific data than any other entity in the world (e.g. particle physics,\ngenomic labs). We need to rethink how we handle such huge amounts of data,\naddressing concerns around privacy and integrity.</p>\n<p>Minister Tijani then talked about how in Nigeria, there is a direct correlation\nbetween science and economic growth, yet in the Global South, we often fail to\napply science effectively to societal issues. The answers we think we have\noften need to shift with context; for instance, policies from the UK don\u2019t transplant\ncleanly to Nigeria, where the population is growing far faster.</p>\n<p>Key points included:</p>\n<ul>\n<li><strong>Dataset Inclusion</strong>: Like Indian farmers accessing AI-driven agricultural\ninsights, we need datasets that represent Nigeria, Rwanda, and other African\ncountries. Nigeria\u2019s average age is 16.9, with 5 million new people entering\nthe population each year. The workforce of the future will come from Africa.</li>\n<li><strong>Infrastructure</strong>: Establishing certainty in data infrastructure is\ncritical. 
Countries will need to ensure their data infrastructures allow for\nglobal participation rather than stagnation.</li>\n<li><strong>Digitization</strong>: Much of Nigeria\u2019s existing knowledge isn't encoded in a\nform digestible by AI. Smart digitization efforts are necessary to create\ninputs for widely used models.</li>\n</ul>\n<p>To address these challenges, the Nigerian government did a few things. Over\nthe past 30 years, publications and a name library were correlated to identify\nAI-focused Nigerian scientists. This effort brought 100 of them together to\ndevelop a national AI strategy for Nigeria. An AI Trust was created with a\ncommunity of trustees to support this strategy. And then a Talent Attraction\nProgram was launched, supported by Google and the government, alongside local\nprivate investment. This is one of the largest talent accelerators globally.\nNigeria aims to become a net exporter of talent, much like India\u2019s success in\nthis area.</p>\n<p>Fabian then talked about how many scientists are driven by natural curiosity,\nand it's vital to nurture that environment. The "holy trinity" of AI consists\nof data, compute and algorithms, which need to advance together. Ten years ago,\ncomputer vision flourished due to ImageNet\u2019s availability, and now we\u2019re\nentering a new era with foundation models for cell biology. 
These models\nrequire algorithmic innovations to merge datasets and techniques like\nmultiscale modeling mixed with self-supervised learning to succeed.</p>\n<p>We're at a tipping point where we can build generalizable models that can be\nadapted for specific tasks around the world (a reference to the Nigerian\nuse cases earlier).</p>\n<p>Some key points discussed were:</p>\n<ul>\n<li><em>Equitable compute access</em>: Academic/industrial partnerships are essential to make GPUs accessible for foundational research.</li>\n<li><em>Cell Atlas</em>: Foundation models help academics plan experiments ("lab in the loop") and overlay disease data for deeper insights. The Deep Cell Project for example aims to not only create steady-state models but also add perturbation behaviors, enabling predictions about how cells respond to interventions. Unlike physics, where laws of motion guide us, cell biology lacks such universal equations. Deep Cell integrates image-based observations, tissue contact data, computational morphology, and clinical data to create a comprehensive model that maps healthy states and potential interventions.</li>\n<li><em>Benchmarks</em>: Gene data is consistent internationally, but we need\nstandardized benchmarks to equalize global talent and foster competition.\nBenchmarks are powerful tools for collaboration and innovation as they are accessible to anyone (see Kaggle for example).</li>\n<li><em>Bias</em>: While we have powerful computational systems, the data they rely on is highly biased and incomplete. These flaws risk being perpetuated in future frontier models. To address this, we need to invest in rebalancing datasets to prevent historical biases from carrying over. Cooperative investments are essential to develop homegrown talent in the Global South.</li>\n</ul>\n<p>Bosun Tijani also noted that the Global South isn't a charity case when it comes\nto AI. 
By the end of this century, Nigeria\u2019s population will be half a billion,\nmaking it crucial in a highly connected world. There's a strong business case\nfor not missing out. Nigeria is investing $2 billion to deploy dark fiber\ninfrastructure nationwide. This connectivity will empower people to contribute\nmeaningfully to the global digital economy. Governments in the Global South\nmust expand their capacity to think strategically about AI and its potential.\nUnlike academic institutions, which often drive innovation, governments in\nthese regions need to strengthen their ability to lead and can't rely on a large\nuniversity pool like Europe does.</p>\n<h2><a href=\"https://anil.recoil.org/#collaborating-for-impact\"></a>Collaborating for Impact</h2>\n<p>The speakers here were Lila Ibrahim, Ilan Gur, Dame Angela McLean and Sir Paul Nurse.</p>\n<p>The first question was about how the speakers have experienced changes in science and how it has evolved.</p>\n<p>Paul Nurse noted that we live in an increasingly complex scientific world. The\nexpansion of science has led to greater complexity, which, in turn, has created\nmore silos across disciplines. To address this, we need more interaction \u2014 not\nnecessarily immediate collaboration \u2014 but increasing permeability between\nfields. There are also important social science aspects to consider, such as\nhow to structure training and foster interdisciplinary work effectively.</p>\n<p>\n<img alt=\"Lila, Ilan, Angela and Paul on stage.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-7.webp\" title=\"Lila, Ilan, Angela and Paul on stage.\">\nLila, Ilan, Angela and Paul on stage.</p>\n<p>Angela McLean: we\u2019ve transitioned from an era of "big science" projects \u2014\nlarge, centrally organized efforts with clear, command-and-control structures\n\u2014 towards distributed collectives. 
These collectives pool together the\nnecessary data to tackle significant questions, such as those addressed by\nAlphaFold. Unlike a single centralized project, AlphaFold\u2019s success came from a\nclear, well-defined question that encouraged competition and fostered winning\nstrategies.</p>\n<p>Today, we need the big research projects to define what success looks like and\nthen innovate towards new ways for people to contribute collectively without a\nbig central structure.</p>\n<p>Disciplines can generally be divided into two categories. First, those with "countable\nquestions"; for example, solving the Riemann hypothesis might win you a Nobel\nPrize! Then we have unstructured disciplines: fields like biology, where there\nisn\u2019t a single list of questions to solve. As Paul Nurse put it, "biology is a\nbunch of anarchist scientists!". He continued that we need more unfocused\nresearch organizations that keep track of unstructured problems and help refine\nthem into focused scientific questions. This kind of work is essential for\nachieving progress in disciplines that don't have clear or countable goals.</p>\n<p>Ilan Gur then introduced ARIA, the Advanced Research and Invention Agency, which\nhas a mission to build on the UK\u2019s rich scientific ecosystem by pursuing\nbreakthroughs that may not yet have immediate or obvious consequences. ARIA\nfocuses on identifying the right people, their environments, their incentives, and\nhow their work can ultimately benefit society.\nARIA\u2019s method begins by gathering program manager scientists to develop a thesis about\nwhere to focus efforts. 
This doesn\u2019t involve just one research project but\nrather a constellation of efforts that can cross technology readiness\nlevels and organizational boundaries to achieve a focused target.\nExamples of ARIA initiatives include scaling AI via compute inspired by nature, and\nanother project observing that formal mathematics should be better integrated\ninto AI research to create more robust models. By guaranteeing bounds on\ninputs, we could use AI in critical applications with confidence in its\noutcomes.</p>\n<p>Angela McLean then talked about how the UK government (under Labour) has outlined missions for addressing key societal challenges, such as\ngrowth, safer streets, opportunities for all, clean energy, and better health.\nThese missions are a great starting point for research and problem-solving.\nHowever, the structure of Whitehall (government departments) tends to remain\nsiloed. To succeed, we need to draw expertise from across departments to\naddress these challenges.</p>\n<p>Paul Nurse noted that science operates on a spectrum between discovery and\napplication. Discovery is anarchic and unpredictable but applications are more\ndirected and goal-oriented. We need end-to-end funding that supports the\nentire spectrum, from discovery to application, while embracing diversity in\napproaches. A one-size-fits-all method won\u2019t work. At the Francis Crick\nInstitute, departments were abolished, allowing disciplines to mix; turnover\nwas encouraged after a limit of tenure to keep staffing dynamic (including at\nsenior levels) and a high level of technical expertise was made available to\neveryone. Mixing people from different disciplines and using social scientists\nto understand the cultural structures within organizations is key to fostering\ninnovation.</p>\n<p>At the Crick Institute, the space encourages serendipitous conversations. This\nincluded inviting external guests and creating opportunities for unexpected\ninteractions. 
Tools like Slack\u2019s "lunch roulette" feature could similarly\nencourage serendipity and collaboration.\n(<em>Anil note</em>: Cambridge Colleges do a great job here, but our departments are\nmore siloed; events like <a href=\"https://www.christs.cam.ac.uk/news/celebrating-50-years-rubiks-cube\">Rubik's 50</a> are a great example of how different disciplines come together)</p>\n<p>Angela McLean also noted that we need to find ways to communicate the\nimportance of breakthroughs like AlphaFold outside of the scientific community.\nFor example, when AlphaFold was introduced, the Ministry of Defence (MoD)\ndidn\u2019t grasp why the science mattered \u2014 they lacked the necessary context. Even\namong highly educated people in the UK, there's a gap in understanding just how\nmuch AI is transforming society. By better connecting experts and amplifying\ntheir insights, we can and must help bridge this gap.</p>\n<p>Paul Nurse also noted that the public must be informed about science advances;\ne.g. the fiasco around GM crops happened because no one trying to introduce GM\ncrops had bothered to inform the public, explaining what the issues were and\ngetting feedback. The main answer from the public sample about "why not GM crops" was that they\ndidn't want to eat food with genes in it, and that's what bothered them. So when\nwe're thinking about AI and its implications, let's go out and talk to the\npublic, find out what worries them, and then think about how to communicate.</p>\n<h3><a href=\"https://anil.recoil.org/#accelerating-scientific-discovery-using-ai\"></a>Accelerating Scientific Discovery Using AI</h3>\n<p>Demis Hassabis then reflected on AlphaFold and the future of AI-driven science.\nAlphaFold has been cited over 28,000 times already, and by open-sourcing it (including AlphaFold 3 with open weights for non-commercial use), its impact has been profound. 
Some standout applications include:</p>\n<ul>\n<li>Determining the structure of the nuclear pore complex, which regulates nutrients entering and exiting the cell nucleus.</li>\n<li>Developing a molecular syringe for delivering drugs to hard-to-reach parts of the body.</li>\n<li>Discovering plastic-eating enzymes to address environmental challenges.</li>\n</ul>\n<p>AlphaFold is described as a "root node" problem within DeepMind \u2014 once solved, it unlocks entirely new branches of scientific discovery. Its success in determining protein structures has validated this potential. What's next?</p>\n<p>\n<img alt=\"Hannah Fry and Demis Hassabis on stage\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-10.webp\" title=\"Hannah Fry and Demis Hassabis on stage\">\nHannah Fry and Demis Hassabis on stage</p>\n<p>Materials design (the <a href=\"https://deepmind.google/discover/blog/millions-of-new-materials-discovered-with-deep-learning/\">GNoME project</a>) is the next frontier, which shares characteristics with protein folding:</p>\n<ul>\n<li>A massive combinatorial search space.</li>\n<li>The need for models that integrate physics and chemistry to optimize solutions.</li>\n<li>Potential breakthroughs include developing new batteries or room-temperature superconductors.\nAlready, 200,000 new crystals have been published \u2014 previously unknown to science \u2014 marking significant progress toward a usable GNoME system.</li>\n</ul>\n<p>Applying AI to mathematics also offers exciting possibilities, including solving major conjectures that have eluded mathematicians.</p>\n<p>Inspired by mentorship from Paul Nurse, the aim is to simulate an entire\nvirtual cell \u2014 a "Mount Everest" of biology. AlphaFold 2 solves static protein\nstructures, while AlphaFold 3 models interactions, taking us closer to\nsimulating an entire cell (e.g., a yeast cell as the model organism). 
Within\n5\u201310 years, this ambitious goal may well be achievable.</p>\n<p>Quantum Computing is accelerating and offers exciting intersections with AI; simulating quantum systems to generate synthetic data or addressing challenges like error-correcting codes. However, classical Turing machines have proven more capable than initially thought:</p>\n<ul>\n<li>Projects like AlphaGo and AlphaFold show how new algorithms outperform brute force by precomputing models before tasks like making a move in Go or folding a protein.</li>\n<li>Classical systems, when used effectively, can model even quantum systems.</li>\n</ul>\n<p>David Deutsch called this approach "crazy, but the right kind of crazy" when Demis talked to him about it. Demis thinks that every natural phenomenon has inherent structure, which machine learning can model to efficiently search for optimal solutions. So quantum may not be necessary for this, and classical computing used with machine learning may be sufficient to solve the hugely complex underlying problem.</p>\n<p>Meanwhile they also launched Isomorphic Labs to rethink the drug discovery\nprocess from the ground up, leveraging AI for one of the most impactful use\ncases: curing diseases. AlphaFold is a powerful tool for fundamental research,\nand Isomorphic works on the adjacent use cases needed for practical drug discovery\n(helping design chemical compounds, test for toxicity, and minimize side\neffects, etc). Isomorphic aims to cure diseases with AI, and generate revenue\nto reinvest in fundamental research, so striking a balance between societal\nimpact and profitability.</p>\n<p>Demis then commented that we stand on the brink of a golden era of scientific\ndiscovery, driven by interdisciplinary collaboration with domain experts and\nlimitless possibilities for applying AI to new fields and improving AI itself\n(approaching exponential improvement). The scientific method is humanity's\ngreatest invention and remains the foundation of modern civilization. 
In an era\nof transformative AI, it's useful to go beyond simple A/B testing and apply the\nscientific method to AI development itself. We need to understand the emergent\nproperties of AI systems and improve interpretability. Techniques from\nneuroscience (e.g. fMRI for studying brains) could inspire ways to study neural\nnetworks and make them explainable rather than just being black boxes. The\napproach is to build the artifact of interest first, then decompose it through\ntargeted experiments to understand it once it has proven useful. Artificial\nsystems like neural networks can be as complex as natural systems, requiring\nsimilar rigor to understand.</p>\n<p>Science is increasingly expensive and complex, leading to slower progress in\ncertain areas. However, interdisciplinary work will drive significant advances\nin the next decade. DeepMind, originally founded at the intersection of\nneuroscience and computer science, exemplifies how crossing disciplines\naccelerates innovation.</p>\n<p>To support these efforts, Google.org just announced a <a href=\"https://blog.google/outreach-initiatives/google-org/google-org-science-ai-funding/\">$20 million fund for\ninterdisciplinary\nresearch</a>,\nfurther enabling breakthroughs at the intersection of fields. 
(<em>Anil's note: let's hope that sustainability is on the list here!</em>)</p>\n<h3><a href=\"https://anil.recoil.org/#ask-the-nobel-laureates\"></a>Ask the Nobel Laureates</h3>\n<p>The last panel had all four Laureates on stage to answer questions, moderated\nby Hannah Fry: Jennifer Doudna, Sir Demis Hassabis, John Jumper and Sir Paul\nNurse.</p>\n<p>\n<img alt=\"What&apos;s a group of Nobel laureates called?\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-11.webp\" title=\"What&apos;s a group of Nobel laureates called?\">\nWhat's a group of Nobel laureates called?</p>\n<p>The discussion opened by asking the panelists how they first felt when they\nmade their prize winning discoveries.</p>\n<p>John Jumper: when you release groundbreaking work, it\u2019s fascinating to see the\nimmediate responses. I remember refreshing Twitter and seeing graduate students\nexclaiming, \u201cHow did they get my structure? It hasn\u2019t even been published!\u201d\nThere was a special issue of Science related to the nuclear pore complex, and\nthree out of four studies had heavily used AlphaFold without me even knowing\nit. It was amazing to see how our tools are empowering researchers.</p>\n<p>Jennifer Doudna: In the fall of 2011, while working on CRISPR (a bacterial\nimmune system), we realized it was an RNA-guided system that targets DNA for\ncleaving. It was one of those "aha" moments\u2014bacteria had figured out how to do\nthis, and now we could understand and manipulate DNA using the same principle.\nA year later, when we published our findings, we could feel the momentum\nbuilding in the scientific community.</p>\n<p>Paul Nurse: In 1985 (much older than the others!), I was working on yeast and\nmy lab had identified the genes responsible for the cell cycle\u2014how one cell\nreproduces into two. We wondered whether these findings could apply to humans,\neven though this was well before human genome mapping. 
Using the first human\ncDNA library ever made, we introduced human genes into defective yeast cells.\nIf a human gene could replace the defective yeast gene and restore function, it\nmeant the discovery was transferable. Remarkably, 1.5 billion years of\nevolutionary divergence didn\u2019t stop this experiment from working.</p>\n<h2><a href=\"https://anil.recoil.org/#qa\"></a>Q&A</h2>\n<p>Q: What would you say to your 18-year-old self?</p>\n<p>Demis Hassabis: I actually had this plan when I was 18! The amazing thing is that it worked out, but I\u2019d tell myself to enjoy the journey a bit more.</p>\n<p>John Jumper: My career has been more of a random walk, driven by doing good science in the moment and being open to new opportunities. My advice is to focus on doing good science now and let the interesting paths unfold naturally. It\u2019s almost the opposite of Demis\u2019s advice.</p>\n<p>Jennifer Doudna: Follow your passion, never give up, and don\u2019t listen to naysayers.</p>\n<p>Paul Nurse: Coming from a non-academic background, I couldn\u2019t believe that I could be paid to follow my curiosity. Even now, it still feels like a privilege.</p>\n<p>\n<img alt=\"Hideo Kojima has the coolest portraits at the BAFTA\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-3.webp\" title=\"Hideo Kojima has the coolest portraits at the BAFTA\">\nHideo Kojima has the coolest portraits at the BAFTA</p>\n<p>Q: AI gives answers but struggles with mechanistic insights. How big a barrier is this to public trust, and when can we expect true mechanistic insights?</p>\n<p>Demis Hassabis: AI is an engineering science. First, we need to build systems that are worthy of study. Once built, we can break them down and understand them mechanistically over time. Early systems weren\u2019t worth this effort, but now we\u2019re developing tools that are, and they\u2019re improving themselves. 
Unlike physics, biology can\u2019t always be explained by universal laws, but simulations that can be tested and probed are better suited. Neuroscience techniques, like those used to study real brains, can also help us understand artificial neural networks.</p>\n<p>Q: Is attention still all we need?</p>\n<p>John Jumper: AlphaFold isn\u2019t just an off-the-shelf transformer. While attention is an important component, many other innovations were added to change the structure of the network significantly. Fundamental research continues to unlock insights into both new data and previously unexamined data. AlphaFold has revealed new knowledge about data that had been available for years.</p>\n<p>Demis Hassabis: The transformer architecture has been incredible but isn\u2019t sufficient on its own. We\u2019ll need several more breakthroughs of that scale to reach full AGI.</p>\n<p>Q: What are the current challenges in biology data?</p>\n<p>Jennifer Doudna: Biology faces issues with both the quality and quantity of data for training AI models. We need to educate researchers on how to collect data both sparsely and smartly. Sparse but broad data is critical to creating robust platforms for training. This ultimately comes down to asking the right questions.</p>\n<p>Q: What about people who are skeptical of these breakthroughs? Could society reject them?</p>\n<p>Paul Nurse: Keeping the public on board is critical. This isn\u2019t the first time new technology has faced resistance, and every time it happens, there\u2019s concern. Special interest groups often hijack these conversations, so we need to find better ways to engage with the public and explain the science behind the breakthroughs.</p>\n<p>Q: Africa will have the largest population of young adults by 2050. How can Africans be included in this global scientific revolution?</p>\n<p>Jennifer Doudna: The Innovative Genomics Institute has an ongoing effort in Kenya to work with scientists and help them understand CRISPR. 
This initiative has fostered a network of researchers, and I\u2019d like to see more of that happening.</p>\n<p>Demis Hassabis: DeepMind has been actively working in Africa, with events like the Deep Indaba conference serving as key convening points for African talent. There\u2019s still a lot more to be done, but it\u2019s a hugely important area of focus.</p>\n<p>Q: How do we encourage the next generation of scientists?</p>\n<p>Paul Nurse: In today\u2019s world, journals are dominated by big data studies. While there\u2019s value in this, we must ensure that creativity doesn\u2019t get lost. There\u2019s enormous potential in big data if approached with creativity, and we need to foster this mindset in our colleagues and students.</p>\n<p>Demis Hassabis: Encouraging the next generation is crucial. One of my heroes is Richard Feynman. Every schoolchild should read <em>Surely You\u2019re Joking, Mr. Feynman!</em> It shows how exhilarating it is to work at the frontier of knowledge. Science is incredible and fun, and we need to expose people to that joy.</p>\n<p>\n<img alt=\"Ray Dolby is a Pembroke alumni too\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-13.webp\" title=\"Ray Dolby is a Pembroke alumni too\">\nRay Dolby is a Pembroke alumni too\n\n<img alt=\"Interactive exhibits inside the room\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-15.webp\" title=\"Interactive exhibits inside the room\">\nInteractive exhibits inside the room\n\n<img alt=\"Glitzy entrance to the BAFTA\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-14.webp\" title=\"Glitzy entrance to the BAFTA\">\nGlitzy entrance to the BAFTA</p>\n<p>These conclude my live notes! Beyond the notes here, the corridor conversations were incredibly\nuseful for me: I have lots of connections to make next. 
Any errors in these\nnotes are all mine, of course; I mainly took them for myself, but I hope it's\nuseful for you to have put them online as well.</p>",+"content": "<p>I got invited to join the Royal Society and DeepMind to a summit on\nhow AI is revolutionising scientific discovery and trotted along with\n<a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>. This event is hot on the heels of the\nexcellent RS report on <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Science in the Age of AI</a>\nand, of course, the Nobel prize for Demis Hassabis which was the <a href=\"https://www.cst.cam.ac.uk/news/nobel-prize-our-alumnus-sir-demis-hassabis\">first ever\nfor my department</a>!\nThe event was held at the BAFTA today, and what follows are my quick livenotes\nas there was just so much to absorb. The RS and Deepmind will have the\nfull sessions online sometime soon, so I'll update this with those more polished\noutputs when they're out! <em>Update: Proper notes now available from <a href=\"https://blog.google/technology/ai/ai-science-forum-2024/\">Google</a> and the <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Royal Society</a>.</em></p>\n<p>\n<img alt=\"Hannah Fry doing a great job emceeing\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-4.webp\" title=\"Hannah Fry doing a great job emceeing\">\nHannah Fry doing a great job emceeing</p>\n<p>The summit was a day-long exploration of how artificial intelligence is\ntransforming science and society, and the overall theme (including four Nobel\nlaureates!) was on how we are in a golden age of scientific discovery,\nespecially in the biological sciences. 
The emcee for the event was Hannah Fry,\nwho simply dazzled with her rapid summarisation of complex discussions\ninterspersed with very dry humour about the proceedings!</p>\n<p>The consistent message was how ongoing synergy between science, technology, and\nsociety is essential to setting the stage for an exploration of transformative\ndiscoveries powered by AI that <em>would benefit everyone in society</em>. Missing that\nsynergy would lead to unequal benefit or dangerous crossings of boundaries.</p>\n<p>\n<img alt=\"Busy schedule for the day at BAFTA HQ!\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-8.webp\" title=\"Busy schedule for the day at BAFTA HQ!\">\nBusy schedule for the day at BAFTA HQ!</p>\n<h2><a href=\"https://anil.recoil.org/#lessons-from-crispr\"></a>Lessons from CRISPR</h2>\n<p>The first session had James Manyika interviewing <a href=\"https://en.wikipedia.org/wiki/Jennifer_Doudna\">Jennifer Doudna</a>, Nobel Laureate and co-inventor of CRISPR, who shared how gene editing has moved from science fiction to an essential tool for societal improvement. Some key points:</p>\n<ul>\n<li>AI's integration with CRISPR allows scientists to better predict and control\ngenome editing outcomes, advancing efficiency and reducing hypothesis-testing\ntime. Many experiments could potentially be skipped thanks to simulations\npredicting outcomes without the need for wetlab work.</li>\n<li>Jennifer discussed projects like <a href=\"https://www.ucdavis.edu/food/news/making-cattle-more-sustainable\">methane-free cows</a>,\nwhere altering cattle genomes could eliminate methane production entirely.\nThese efforts require multidisciplinary collaboration between computer\nscientists, agricultural experts, and biologists.</li>\n<li>The success of CRISPR emphasises the need for public engagement, policy\nframeworks, and open databases for international collaboration. 
Doudna called\nfor knowledge accessibility, including teaching high school educators about\ngenome editing's impact, as a key part of public outreach about how this\ntechnology might affect society in the future.</li>\n<li>CRISPR moved really fast: it was published in 2012, and by 2014 scientists\nwere already editing monkey embryos. This led to a realisation that it\nwasn't enough to be head down in the lab: a whole team was needed to work on\npublic impact and policy (including the RS and national academies) to bootstrap\ninternational meetings on human genome editing. The most recent was held in\nLondon in March of last year, which led to open, transparent conversations and\nbuilding a worldwide database of work involving genome editing, especially work\nthat impacts the human genome or environmental editing that could escape.</li>\n</ul>\n<p>\n<img alt=\"James and Jennifer in discussion.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-1.webp\" title=\"James and Jennifer in discussion.\">\nJames and Jennifer in discussion.</p>\n<h2><a href=\"https://anil.recoil.org/#science-in-the-age-of-ai\"></a>Science in the Age of AI</h2>\n<p>The panel next was <a href=\"https://en.wikipedia.org/wiki/Eric_Topol\">Eric Topol</a> chairing a discussion with <a href=\"https://en.wikipedia.org/wiki/Pushmeet_Kohli\">Pushmeet Kohli</a>, <a href=\"https://en.wikipedia.org/wiki/Fiona_Marshall_(pharmacologist)\">Fiona H. Marshall</a> and <a href=\"https://en.wikipedia.org/wiki/Alison_Noble\">Alison Noble</a>. 
The theme was on how a huge number of foundation\nmodels are coming out for LLLMs (large language life models) at a dizzying\npace.</p>\n<p>\n<img alt=\"Eric, Pushmeet, Fiona and Alison on stage.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-2.webp\" title=\"Eric, Pushmeet, Fiona and Alison on stage.\">\nEric, Pushmeet, Fiona and Alison on stage.</p>\n<p>Pushmeet Kohli first explained how deciphering the genome accelerates\nbiological discoveries by orders of magnitude. AI tools like AlphaFold (which\njust got the Nobel prize) exemplify breakthroughs that transform biology from a\nwetlab-driven discipline into a predictive science. On other fronts,\nDeepMind's GraphCast model enables near-term weather forecasting in minutes,\nwhich previously required days of supercomputer time to do an equivalent\nforecast (and now\n<a href=\"https://www.nature.com/articles/s41586-024-07744-y\">NeuralGCM</a> is doing even\nbetter with mechanistic models combined with data). Pushmeet then noted how\nGNNs for materials science identified over 400k (or 200k? couldn't catch it) new\nstable inorganic crystals, with potential applications like high-temperature\nsuperconductors that were just sci-fi before.</p>\n<p>Then Fiona H. Marshall from Novartis emphasized how AI identifies new drug\ntargets using genomic and population-level data. In drug development,\npredictive safety is absolutely crucial. Pharmaceutical companies have decades\u2019\nworth of histological data, such as rodent testing, stored on hundreds of\nthousands of glass slides that are now being digitized. Once this data is\nprocessed, we can use it to make various predictions. Sharing this data across\nthe pharmaceutical industry would benefit everyone. One of their partner\ncompanies is developing scanning algorithms, and once these are operational\nthey will be made open, so the entire industry will gain from the resulting\ndataset. 
Generative chemistry (like AlphaFold) now designs drugs faster while\npredictive toxicology ensures safer medicines.\nInterestingly, data scientists are now in the prime author\nposition as they dictate the experimental procedures, whereas a few decades ago\nthe wetlab scientists did. This change in incentives drives change within a\nlarge organisation towards more data-driven methods.</p>\n<p>Another valuable source of data is population-level information, such as <a href=\"https://ourfuturehealth.org.uk\">Our\nFuture Health</a> (a UK-based NHS initiative).\nProper management of nomenclature will ensure that this project generates a\nmassive, usable dataset beyond what we have anywhere else. Eric noted that they\nrely heavily on the UK Biobank, which, with its 500,000 participants, is one of\nthe largest in the world, and the Our Future Health program is a huge leap\nforward with 5m participants. The NIH in the United States is hesitant to\nengage in public-private partnerships, and so the UK is leading the way in this\ndomain (<em>Anil notes: with privacy concerns about the data sharing</em>).</p>\n<p>Fiona also noted that AI is transforming clinical trials as well, not\njust the discovery process itself. Typically, it takes around 10 years for a\ncandidate molecule to move to the clinical trial phase. One major bottleneck is\npatient enrollment. By leveraging vast demographic databases, which include\ninformation on global populations, their diseases, medications, and hospital\naffiliations, we can drastically improve recruitment efficiency. For example,\na clinical trial protocol targeting "women aged 30-50 who are not taking drug X\nor estrogen modifiers" can utilize these databases to identify and enroll\npatients quickly. 
This approach can reduce recruitment time from three years to\njust six months, significantly accelerating the process of getting drugs to\nmarket.</p>\n<p>\n<img alt=\"Sneaking in some sightseeing at BAFTA\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-12.webp\" title=\"Sneaking in some sightseeing at BAFTA\">\nSneaking in some sightseeing at BAFTA</p>\n<p>Alison Noble discussed how deep learning has revolutionized ultrasound imaging.\nAI now guides users on probe placement, reducing training requirements for\nmedical professionals. However, we're now moving so fast that we need to be\ncareful; even the notion of what a scientist is is changing. In the RS report\non <a href=\"https://anil.recoil.org/\">Science in the Age of AI</a> a number of scientists around the UK were\nconsulted, and this concern of reproducibility and data access came up loud and\nclear. When we publish results, we need to make sure they are sound and that\npeer review is possible. Openness is a deliberate choice, however, and not always\nappropriate when sensitive data is involved (e.g. healthcare), but requiring\nrigor in evaluation is essential for good science. In the age of AI, scientists\nneed to rethink how we present our work, and how we train new scientists in\nthis environment. So we have some wonderful early examples like AlphaFold, but\nwe need to understand the societal/incentive impacts on our new generation of\nscientists.</p>\n<p>Eric also noted that one of the greatest challenges in genomics is\nunderstanding variance, and\n<a href=\"https://www.science.org/doi/10.1126/science.adg7492\">AlphaMissense</a> has\ntackled this head-on. However, there is a significant data shortage. Without\nHelen Berman and the creation of the <a href=\"https://en.wikipedia.org/wiki/Worldwide_Protein_Data_Bank\">protein data\nbank</a>, AlphaFold\nwouldn\u2019t have been possible. This raises the critical question: where do we\nsource the necessary inputs? 
Pushmeet responded that intelligence doesn\u2019t\nemerge in isolation; it relies on experiential datasets. Models can derive this\nexperience from real-world input or interactions within simulations.\nHigh-fidelity simulations, in particular, provide models with valuable\nintuition about future outcomes. Experimental data is also critical, as it\nintegrates with simulations to complete the picture. When dealing with\nunlabeled data, prediction labels generated by the model itself can be used for\nfurther training. However, it's essential to discard incorrect labels to ensure\naccuracy, which makes this technique effective for bridging data gaps.\nCrucially, a pure and uncontaminated test set is vital to ensure the\nreliability of the system. For example, AlphaMissense was trained in an\nunsupervised manner and successfully identified cancer mutations.</p>\n<p>The discussion was quite wide-ranging, but overall the two key challenges were:</p>\n<ul>\n<li>Reproducibility in science is a growing concern as AI accelerates discovery.\nAlison Noble emphasized the need for rigorous validation of results.</li>\n<li>Pushmeet noted the importance of testing the "prediction hull" of AI systems\nto understand their uncertainty and limitations, which was how AlphaFold\nbuilt up user confidence (by not having false positives).</li>\n</ul>\n<p>As AI transforms science, public-private partnerships and interdisciplinary\ncollaboration are essential (like the Our Future Health program). Transparency\nand openness are deliberate choices in science, but regulation must keep up\nwith the pace of innovation. Alison Noble also noted there is a culture change\ngoing on for these public/private partnerships within academia. 
While\ncompetition drives innovation, we need to make sure that the academic reward\nsystem keeps up; if there are 50 people on a paper, then how is this attractive\nfor young scientists to enter a field and make their own name?</p>\n<h2><a href=\"https://anil.recoil.org/#a-view-from-the-frontier-of-cell-biology\"></a>A view from the frontier (of cell biology)</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Siddhartha_Mukherjee\">Siddhartha Mukherjee</a>, a cancer physician and Pulitzer Prize winner (and\npersonally speaking, author of one of my <a href=\"https://en.wikipedia.org/wiki/The_Emperor_of_All_Maladies\">favourite medical\nbooks</a> ever), began\nthe discussion with a warning about potential breaches of data privacy and the\ndangers they pose. He emphasized the risk of AI being weaponized for\nmisinformation, calling it a frontier challenge. These issues highlight the\nurgent need to develop mitigation strategies while continuing to advance the\ncapabilities of AI in their respective fields.\nSiddhartha emphasized that data is the critical missing link in advancing AI.\nIssues of access, quality, integration, equity, and privacy must be addressed.\nThe complexity of data ownership in AI raises ethical and commercial concerns,\nas data aggregators often benefit disproportionately.</p>\n<p>\n<img alt=\"Siddhartha on stage with Anne, Janet and Anna.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-9.webp\" title=\"Siddhartha on stage with Anne, Janet and Anna.\">\nSiddhartha on stage with Anne, Janet and Anna.</p>\n<p><a href=\"https://www.ebi.ac.uk/people/person/janet-thornton/\">Janet Thornton</a>, from the European Molecular\nBiology Lab, shared her perspective on protein structures. She highlighted how\nAI has transformed the field\u2014from modeling just 20 protein structures in the\nearly days to now over 214 million. 
Structural biologists worldwide are using\nthis data to validate their findings, explore ligand binding, and venture into\nuncharted territories of protein functionality. Anna delved into her work as a\ncell biologist studying membrane proteins and their organization within the\nbody. She shared a case from Cyprus, where a mysterious kidney disease had affected\ncertain families for decades. AI-driven image recognition was used to identify\na genetic mutation, leading to a therapeutic solution. The issue stemmed from a\nmisshapen protein caused by a nodal molecule that traps faulty proteins,\nultimately causing cell death. This discovery is now being applied to other\nconditions, such as Alzheimer\u2019s and blindness, offering hope for broader\ntherapeutic applications.</p>\n<p>Janet also spoke about her time as the director of the European\nBioinformatics Institute, which manages data repositories like the <a href=\"https://www.wwpdb.org\">Worldwide\nProtein Data Bank</a>. She described the cultural shift required to encourage data\nsharing, noting it took 20 years for crystallographers to agree to mandatory\ndata deposition before publication. She stressed that medical data,\nparticularly in clinical contexts, must undergo a similar transformation.\nPublicly funded data must eventually reach the commons, especially when such\nsharing has the potential to save lives. The UK\u2019s NHS, with its secure data\nenvironments, provides a strong model for this approach. 
However, the health\nsector needs to move faster than the crystallography community did, requiring\nbuy-in from both patients and medical professionals, as emphasized in Cathie\nSudlow\u2019s recent report on the UK\u2019s health data landscape.</p>\n<p><a href=\"https://www.broadinstitute.org/bios/anna-greka\">Anna Greka</a>, a pathologist and head of a new institute focusing on women\u2019s\ncancer at the Broad Institute, discussed her work on building AI tools to identify disease\nmechanisms and facilitate the development of treatments. Anna Greka added that millions of human\ngenomes have been sequenced and aggregated into databases usable by scientists\nworldwide. If pathology labs globally pooled their data, the training of AI models\nwould benefit immensely. She suggested anonymizing the data while preserving\nmetadata and federating results across countries to protect privacy and enhance\nglobal collaboration.</p>\n<p>Anne Vincent-Salomon, head of diagnostic and theranostic medicine at the\nInstitut Curie, stressed the importance of multidisciplinary science and\neffective communication. She emphasized the need to educate the public,\nreducing fear and fostering understanding of scientific advancements.</p>\n<p>Anna concluded by underscoring the importance of understanding the "unit of\nlife" to progress in biology. She argued for the creation of a high-quality\nperturbation dataset for cells, akin to the Protein Data Bank\u2019s role in\nAlphaFold. Skeptical of synthetic data, she emphasized the need for\nexperimentally derived data as a foundation for future breakthroughs. She\ncalled this effort a moonshot for the next five years \u2014 a grand challenge to\ndeepen our understanding of life itself! 
(<em>Anil: see also this great <a href=\"https://www.ted.com/talks/anna_greka_the_world_s_rarest_diseases_and_how_they_impact_everyone?subtitle=en\">TED talk</a> from Anna last year</em>)</p>\n<h2><a href=\"https://anil.recoil.org/#the-polycene-exploring-the-opportunity-of-the-moment\"></a>The Polycene: Exploring the Opportunity of the Moment</h2>\n<p>\n<img alt=\"Thomas Friedman talking about the polycene.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-5.webp\" title=\"Thomas Friedman talking about the polycene.\">\nThomas Friedman talking about the polycene.</p>\n<p>The (epic) next speaker was Thomas L. Friedman, who explored the interplay of three critical "scaling laws" in the modern era:</p>\n<ol>\n<li><strong>Software</strong>: The rapid improvement of AI capabilities post-2017 (with transformers and GPUs).</li>\n<li><strong>Carbon</strong>: Rising CO2e and methane emissions driving "<a href=\"https://www.newstatesman.com/science-tech/2021/04/why-we-need-talk-about-global-weirding\">global weirding</a>" (extreme and destructive climate changes).</li>\n<li><strong>Disorder</strong>: Societal and institutional fragmentation in addressing these crises.</li>\n</ol>\n<p>Between 2017, with the introduction of transformer algorithms, and 2020, when\nadvancements in microchips and GPUs took off, artificial intelligence has\nexpanded dramatically. This period reflects a "scaling law" in action.\nPolymathic AI\u2014AI capable of addressing a broad range of problems\u2014now seems\nwithin reach. In just three years, AI-driven science has evolved from science\nfiction to reality and become accessible to many (albeit often with some\nlimitations on free access). To address the challenges AI presents, we need\nhigher-dimensional thinking for higher-dimensional problems.</p>\n<p>At the same time, we're seeing a scaling law in carbon emissions. 
Levels of CO\u2082\nand methane in the atmosphere, including methane from livestock, are causing\ndestructive climate change. The seven warmest years on record occurred between\n2015 and 2021, resulting in what\u2019s called "global weirding"\u2014where hot regions\ngrow hotter, wet regions grow wetter, and the effects become increasingly\ndestructive.</p>\n<p>In parallel with these scaling points in carbon and silicon (AI hardware),\nwe\u2019re facing a scaling point in disorder\u2014the erosion of civic structures.\nGovernments worldwide have over-promised on the benefits of industrial\ndemocracies, such as healthcare, retirement plans, and infrastructure, yet are\nstruggling to deliver. Societies are aging, educational attainment has\nstagnated, and productivity growth has slowed.</p>\n<p>We're also witnessing growing national security challenges and the dissolution\nof the great power balance that defined the post-Cold War era. Electric\ntransportation, healthcare, and employment systems are under strain, leading to\nincreased migration. Today, there are 56 active conflicts globally\u2014the highest\nnumber since World War II\u2014and more displaced people than at any point in\nhistory.</p>\n<p>We need a game-changer.</p>\n<p>To solve these interconnected crises, we must adopt higher-dimensional\napproaches that blend solutions across disciplines and scales. The future\nstability of our planet\u2014and the well-being of the next generation\u2014depends on\npresenting holistic, interconnected solutions. Friedman calls this the "Polycene" era.</p>\n<p>Never before has politics needed science more. 
Science must enable leaps\nforward in education and provide buffers against disruptive climate change.\nSimilarly, politics must create the frameworks and systems to allow science to\nthrive and deliver meaningful solutions.</p>\n<p>In <a href=\"https://en.wikipedia.org/wiki/That_Used_to_Be_Us\">That Used to Be Us</a>,\nFriedman argued that "average is over"; the rapid acceleration of technology\nmeans the American Dream -- once achievable for many -- is no longer guaranteed.</p>\n<p>However, technology can flatten access to resources. For instance, an Indian\nfarmer can now access advanced insights about crop planting, watering\nschedules, and fertilizers directly on a smartphone. For the first time, those\nwithout access to "average" resources now have access to them through AI.\nThanks to AI, "average" as a benchmark is over\u2014and that gives Friedman optimism.</p>\n<p>However, there\u2019s a caveat: science and politics must work together. The\nalignment problem between these fields is real and will become more pressing as\nwe approach AGI. As a species, we\u2019ve become more godlike than ever before,\ncreating systems and intelligence that resemble a larger, more powerful brain.\nHow we use this power will determine our future.</p>\n<h2><a href=\"https://anil.recoil.org/#building-the-infrastructure-for-success\"></a>Building the Infrastructure for Success</h2>\n<p>The speakers here were Paul Hofheinz, <a href=\"https://en.wikipedia.org/wiki/Asmeret_Asefaw_Berhe\">Asmeret Asefaw Berhe</a>, Fabian J. 
Theis and Bosun Tijani.</p>\n<p>\n<img alt=\"Paul Hofheinz, Asmeret, Fabian and Bosun.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-6.webp\" title=\"Paul Hofheinz, Asmeret, Fabian and Bosun.\">\nPaul Hofheinz, Asmeret, Fabian and Bosun.</p>\n<p>Berhe began by noting that we are at an inflection point -- how do we avoid\nrepeating mistakes from the past, ensuring we don\u2019t leave segments of human\nsociety behind or widen the digital divide further? In previous programs such\nas exascale computing, they insisted as part of the program that they must\nhave explicit sustainability goals. While these goals may have seemed\nunrealistic initially and were criticised, in retrospect they have shown they\ncan be achieved. An example of the next thing they're working on is the\nHigh-Performance Data Facility, recognizing that the Office of Science produces\nmore scientific data than any other entity in the world (e.g. particle physics,\ngenomic labs). We need to rethink how we handle such huge amounts of data,\naddressing concerns around privacy and integrity.</p>\n<p>Minister Tijani then talked about how in Nigeria, there is a direct correlation\nbetween science and economic growth, yet in the Global South, we often fail to\napply science effectively to societal issues. The answers we think we have\noften need to shift with context; for instance, policies from the UK don\u2019t transplant\ncleanly to Nigeria, where the population is growing far faster.</p>\n<p>Key points included:</p>\n<ul>\n<li><strong>Dataset Inclusion</strong>: Like Indian farmers accessing AI-driven agricultural\ninsights, we need datasets that represent Nigeria, Rwanda, and other African\ncountries. Nigeria\u2019s average age is 16.9, with 5 million new people entering\nthe population each year. The workforce of the future will come from Africa.</li>\n<li><strong>Infrastructure</strong>: Establishing certainty in data infrastructure is\ncritical. 
Countries will need to ensure their data infrastructures allow for\nglobal participation rather than stagnation.</li>\n<li><strong>Digitization</strong>: Much of Nigeria\u2019s existing knowledge isn't encoded in a\nform digestible by AI. Smart digitization efforts are necessary to create\ninputs for widely used models.</li>\n</ul>\n<p>To address these challenges, the Nigerian government did a few things. Over\nthe past 30 years, publications and a name library were correlated to identify\nAI-focused Nigerian scientists. This effort brought 100 of them together to\ndevelop a national AI strategy for Nigeria. An AI Trust was created with a\ncommunity of trustees to support this strategy. And then a Talent Attraction\nProgram was launched, supported by Google and the government, alongside local\nprivate investment. This is one of the largest talent accelerators globally.\nNigeria aims to become a net exporter of talent, much like India\u2019s success in\nthis area.</p>\n<p>Fabian then talked about how many scientists are driven by natural curiosity,\nand it's vital to nurture that environment. The "holy trinity" of AI consists\nof data, compute, and algorithms, which need to come together. Ten years ago,\ncomputer vision flourished due to ImageNet\u2019s availability, and now we\u2019re\nentering a new era with foundation models for cell biology. 
These models\nrequire algorithmic innovations to merge datasets, and techniques like\nmultiscale modeling mixed with self-supervised learning to succeed.</p>\n<p>We're at a tipping point where we can build generalizable models that can be\nadapted for specific tasks around the world (a reference to the Nigerian\nuse cases earlier).</p>\n<p>Some key points discussed were:</p>\n<ul>\n<li><em>Equitable compute access</em>: Academic/industrial partnerships are essential to make GPUs accessible for foundational research.</li>\n<li><em>Cell Atlas</em>: Foundation models help academics plan experiments ("lab in the loop") and overlay disease data for deeper insights. The Deep Cell Project, for example, aims not only to create steady-state models but also to add perturbation behaviors, enabling predictions about how cells respond to interventions. Unlike physics, where laws of motion guide us, cell biology lacks such universal equations. Deep Cell integrates image-based observations, tissue contact data, computational morphology, and clinical data to create a comprehensive model that maps healthy states and potential interventions.</li>\n<li><em>Benchmarks</em>: Gene data is consistent internationally, but we need\nstandardized benchmarks to equalize global talent and foster competition.\nBenchmarks are powerful tools for collaboration and innovation as they are accessible to anyone (see Kaggle for example).</li>\n<li><em>Bias</em>: While we have powerful computational systems, the data they rely on is highly biased and incomplete. These flaws risk being perpetuated in future frontier models. To address this, we need to invest in rebalancing datasets to prevent historical biases from carrying over. Cooperative investments are essential to develop homegrown talent in the global south.</li>\n</ul>\n<p>Bosun Tijani also noted that the Global South isn't a charity case when it comes\nto AI. 
By the end of this century, Nigeria\u2019s population will be half a billion,\nmaking it crucial in a highly connected world. There's a strong business case\nfor not missing out. Nigeria is investing $2 billion to deploy dark fiber\ninfrastructure nationwide. This connectivity will empower people to contribute\nmeaningfully to the global digital economy. Governments in the Global South\nmust expand their capacity to think strategically about AI and its potential.\nUnlike academic institutions, which often drive innovation, governments in\nthese regions need to strengthen their ability to lead and can't rely on a large\nuniversity pool like Europe does.</p>\n<h2><a href=\"https://anil.recoil.org/#collaborating-for-impact\"></a>Collaborating for Impact</h2>\n<p>The speakers here were Lila Ibrahim, Ilan Gur, Dame Angela McLean and Sir Paul Nurse.</p>\n<p>The first question was about how the speakers have experienced changes in science and how it has evolved.</p>\n<p>Paul Nurse noted that we live in an increasingly complex scientific world. The\nexpansion of science has led to greater complexity, which, in turn, has created\nmore silos across disciplines. To address this, we need more interaction \u2014 not\nnecessarily immediate collaboration \u2014 but increasing permeability between\nfields. There are also important social science aspects to consider, such as\nhow to structure training and foster interdisciplinary work effectively.</p>\n<p>\n<img alt=\"Lila, Ilan, Angela and Paul on stage.\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-7.webp\" title=\"Lila, Ilan, Angela and Paul on stage.\">\nLila, Ilan, Angela and Paul on stage.</p>\n<p>Angela McLean: we\u2019ve transitioned from an era of "big science" projects \u2014\nlarge, centrally organized efforts with clear, command-and-control structures\n\u2014 towards distributed collectives. 
These collectives pool together the\nnecessary data to tackle significant questions, such as those addressed by\nAlphaFold. Unlike a single centralized project, AlphaFold\u2019s success came from a\nclear, well-defined question that encouraged competition and fostered winning\nstrategies.</p>\n<p>Today, we need the big research projects to define what success looks like and\nthen innovate towards new ways for people to contribute collectively without a\nbig central structure.</p>\n<p>Disciplines can generally be divided into two categories. First, those with "countable\nquestions": for example, solving the Riemann hypothesis might win you a Nobel\nPrize! Then we have unstructured disciplines: fields like biology, where there\nisn\u2019t a single list of questions to solve. As Paul Nurse put it, "biology is a\nbunch of anarchist scientists!". He continued that we need more unfocused\nresearch organizations that keep track of unstructured problems and help refine\nthem into focused scientific questions. This kind of work is essential for\nachieving progress in disciplines that don't have clear or countable goals.</p>\n<p>Ilan Gur then introduced ARIA, the Advanced Research and Invention Agency, which\nhas a mission to build on the UK\u2019s rich scientific ecosystem by pursuing\nbreakthroughs that may not yet have immediate or obvious consequences. ARIA\nfocuses on identifying the right people, their environments, their incentives, and\nhow their work can ultimately benefit society.\nARIA\u2019s method begins by gathering program manager scientists to develop a thesis about\nwhere to focus efforts. 
This doesn\u2019t involve just one research project but\nrather a constellation of efforts that can cross technology readiness\nlevels and organizational boundaries to achieve a focused target.\nExamples of ARIA initiatives include scaling AI via compute inspired by nature, and\nanother project observing that formal mathematics should be better integrated\ninto AI research to create more robust models. By guaranteeing bounds on\ninputs, we could use AI in critical applications with confidence in its\noutcomes.</p>\n<p>Angela McLean then talked about how the UK government (under Labour) has outlined missions for addressing key societal challenges, such as\ngrowth, safer streets, opportunities for all, clean energy, and better health.\nThese missions are a great starting point for research and problem-solving.\nHowever, the structure of Whitehall (government departments) tends to remain\nsiloed. To succeed, we need to draw expertise from across departments to\naddress these challenges.</p>\n<p>Paul Nurse noted that science operates on a spectrum between discovery and\napplication. Discovery is anarchic and unpredictable, but applications are more\ndirected and goal-oriented. We need end-to-end funding that supports the\nentire spectrum, from discovery to application, while embracing diversity in\napproaches. A one-size-fits-all method won\u2019t work. At the Francis Crick\nInstitute, departments were abolished, allowing disciplines to mix; turnover\nwas encouraged after a limit of tenure to keep staffing dynamic (including at\nsenior levels) and a high level of technical expertise was made available to\neveryone. Mixing people from different disciplines and using social scientists\nto understand the cultural structures within organizations is key to fostering\ninnovation.</p>\n<p>At the Crick Institute, the space encourages serendipitous conversations. This\nincluded inviting external guests and creating opportunities for unexpected\ninteractions. 
Tools like Slack\u2019s "lunch roulette" feature could similarly\nencourage serendipity and collaboration.\n(<em>Anil note</em>: Cambridge Colleges do a great job here, but our departments are\nmore siloed, though events like <a href=\"https://www.christs.cam.ac.uk/news/celebrating-50-years-rubiks-cube\">Rubik's 50</a> are a great example of how different disciplines come together)</p>\n<p>Angela McLean also noted that we need to find ways to communicate the\nimportance of breakthroughs like AlphaFold outside of the scientific community.\nFor example, when AlphaFold was introduced, the Ministry of Defence (MoD)\ndidn\u2019t grasp why the science mattered \u2014 they lacked the necessary context. Even\namong highly educated people in the UK, there's a gap in understanding just how\nmuch AI is transforming society. By better connecting experts and amplifying\ntheir insights, we can and must help bridge this gap.</p>\n<p>Paul Nurse also noted that the public must be informed about science advances;\ne.g. the fiasco around GM crops happened because no one trying to introduce GM\ncrops had bothered to inform the public, explaining what the issues are and\ngetting feedback. The main answer from the public sample about "why not GM crops?" was that they\ndidn't want to eat food with genes in it, and that's what bothered them. So when\nwe're thinking about AI and its implications, let's go out and talk to the\npublic, find out what worries them, and then think about how to communicate.</p>\n<h3><a href=\"https://anil.recoil.org/#accelerating-scientific-discovery-using-ai\"></a>Accelerating Scientific Discovery Using AI</h3>\n<p>Demis Hassabis then reflected on AlphaFold and the future of AI-driven science.\nAlphaFold has already been cited over 28,000 times, and by open-sourcing it (including AlphaFold 3 with open weights for non-commercial use), its impact has been profound. 
Some standout applications include:</p>\n<ul>\n<li>Determining the structure of the nuclear pore complex, which regulates nutrients entering and exiting the cell nucleus.</li>\n<li>Developing a molecular syringe for delivering drugs to hard-to-reach parts of the body.</li>\n<li>Discovering plastic-eating enzymes to address environmental challenges.</li>\n</ul>\n<p>AlphaFold is described as a "root node" problem within DeepMind \u2014 once solved, it unlocks entirely new branches of scientific discovery. Its success in determining protein structures has validated this potential. What's next?</p>\n<p>\n<img alt=\"Hannah Fry and Demis Hassabis on stage\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-10.webp\" title=\"Hannah Fry and Demis Hassabis on stage\">\nHannah Fry and Demis Hassabis on stage</p>\n<p>Materials design (the <a href=\"https://deepmind.google/discover/blog/millions-of-new-materials-discovered-with-deep-learning/\">GNoME project</a>) is the next frontier, which shares characteristics with protein folding:</p>\n<ul>\n<li>A massive combinatorial search space.</li>\n<li>The need for models that integrate physics and chemistry to optimize solutions.</li>\n<li>Potential breakthroughs include developing new batteries or room-temperature superconductors.\nAlready, 200,000 new crystals have been published \u2014 previously unknown to science \u2014 marking significant progress toward a usable GNoME system.</li>\n</ul>\n<p>Applying AI to mathematics also offers exciting possibilities, including solving major conjectures that have eluded mathematicians.</p>\n<p>Inspired by mentorship from Paul Nurse, the aim is to simulate an entire\nvirtual cell \u2014 a "Mount Everest" of biology. AlphaFold 2 solves static protein\nstructures, while AlphaFold 3 models interactions, taking us closer to\nsimulating an entire cell (e.g., a yeast cell as the model organism). 
Within\n5\u201310 years, this ambitious goal may well be achievable.</p>\n<p>Quantum computing is accelerating and offers exciting intersections with AI: simulating quantum systems to generate synthetic data, or addressing challenges like error-correcting codes. However, classical Turing machines have proven more capable than initially thought:</p>\n<ul>\n<li>Projects like AlphaGo and AlphaFold show how new algorithms outperform brute force by precomputing models before tasks like making a move in Go or folding a protein.</li>\n<li>Classical systems, when used effectively, can model even quantum systems.</li>\n</ul>\n<p>David Deutsch called this approach "crazy, but the right kind of crazy" when Demis talked to him about it. Demis thinks that every natural phenomenon has inherent structure, which machine learning can model to efficiently search for optimal solutions. So quantum may not be necessary for this, and classical computing used with machine learning may be sufficient to solve the hugely complex underlying problem.</p>\n<p>Meanwhile, they also launched Isomorphic Labs to rethink the drug discovery\nprocess from the ground up, leveraging AI for one of the most impactful use\ncases: curing diseases. AlphaFold is a powerful tool for fundamental research,\nand Isomorphic works on the adjacent use cases needed for practical drug discovery\n(helping design chemical compounds, test for toxicity, minimize side\neffects, etc.). Isomorphic aims to cure diseases with AI and generate revenue\nto reinvest in fundamental research, striking a balance between societal\nimpact and profitability.</p>\n<p>Demis then commented that we stand on the brink of a golden era of scientific\ndiscovery, driven by interdisciplinary collaboration with domain experts and\nlimitless possibilities for applying AI to new fields and improving AI itself\n(approaching exponential improvement). The scientific method is humanity's\ngreatest invention and remains the foundation of modern civilization. 
In an era\nof transformative AI, it's useful to go beyond simple A/B testing and treat AI\ndevelopment itself as a subject for the scientific method. We need to understand the emergent\nproperties of AI systems and improve interpretability. Techniques from\nneuroscience (e.g. fMRI for studying brains) could inspire ways to study neural\nnetworks and make them explainable rather than just being black boxes. The\napproach is to build the artifact of interest first, then decompose it through\ntargeted experiments to understand it once it has proven useful. Artificial\nsystems like neural networks can be as complex as natural systems, requiring\nsimilar rigor to understand.</p>\n<p>Science is increasingly expensive and complex, leading to slower progress in\ncertain areas. However, interdisciplinary work will drive significant advances\nin the next decade. DeepMind, originally founded at the intersection of\nneuroscience and computer science, exemplifies how crossing disciplines\naccelerates innovation.</p>\n<p>To support these efforts, Google.org just announced a <a href=\"https://blog.google/outreach-initiatives/google-org/google-org-science-ai-funding/\">$20 million fund for\ninterdisciplinary\nresearch</a>,\nfurther enabling breakthroughs at the intersection of fields. 
(<em>Anil's note: let's hope that sustainability is on the list here!</em>)</p>\n<h3><a href=\"https://anil.recoil.org/#ask-the-nobel-laureates\"></a>Ask the Nobel Laureates</h3>\n<p>The last panel had all four Laureates on stage to answer questions, moderated\nby Hannah Fry: Jennifer Doudna, Sir Demis Hassabis, John Jumper and Sir Paul\nNurse.</p>\n<p>\n<img alt=\"What&apos;s a group of Nobel laureates called?\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-11.webp\" title=\"What&apos;s a group of Nobel laureates called?\">\nWhat's a group of Nobel laureates called?</p>\n<p>The discussion opened by asking the panelists how they first felt when they\nmade their prize winning discoveries.</p>\n<p>John Jumper: when you release groundbreaking work, it\u2019s fascinating to see the\nimmediate responses. I remember refreshing Twitter and seeing graduate students\nexclaiming, \u201cHow did they get my structure? It hasn\u2019t even been published!\u201d\nThere was a special issue of Science related to the nuclear pore complex, and\nthree out of four studies had heavily used AlphaFold without me even knowing\nit. It was amazing to see how our tools are empowering researchers.</p>\n<p>Jennifer Doudna: In the fall of 2011, while working on CRISPR (a bacterial\nimmune system), we realized it was an RNA-guided system that targets DNA for\ncleaving. It was one of those "aha" moments\u2014bacteria had figured out how to do\nthis, and now we could understand and manipulate DNA using the same principle.\nA year later, when we published our findings, we could feel the momentum\nbuilding in the scientific community.</p>\n<p>Paul Nurse: In 1985 (much older than the others!), I was working on yeast and\nmy lab had identified the genes responsible for the cell cycle\u2014how one cell\nreproduces into two. We wondered whether these findings could apply to humans,\neven though this was well before human genome mapping. 
Using the first human\ncDNA library ever made, we introduced human genes into defective yeast cells.\nIf a human gene could replace the defective yeast gene and restore function, it\nmeant the discovery was transferable. Remarkably, 1.5 billion years of\nevolutionary divergence didn\u2019t stop this experiment from working.</p>\n<h2><a href=\"https://anil.recoil.org/#qa\"></a>Q&A</h2>\n<p>Q: What would you say to your 18-year-old self?</p>\n<p>Demis Hassabis: I actually had this plan when I was 18! The amazing thing is that it worked out, but I\u2019d tell myself to enjoy the journey a bit more.</p>\n<p>John Jumper: My career has been more of a random walk, driven by doing good science in the moment and being open to new opportunities. My advice is to focus on doing good science now and let the interesting paths unfold naturally. It\u2019s almost the opposite of Demis\u2019s advice.</p>\n<p>Jennifer Doudna: Follow your passion, never give up, and don\u2019t listen to naysayers.</p>\n<p>Paul Nurse: Coming from a non-academic background, I couldn\u2019t believe that I could be paid to follow my curiosity. Even now, it still feels like a privilege.</p>\n<p>\n<img alt=\"Hideo Kojima has the coolest portraits at the BAFTA\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-3.webp\" title=\"Hideo Kojima has the coolest portraits at the BAFTA\">\nHideo Kojima has the coolest portraits at the BAFTA</p>\n<p>Q: AI gives answers but struggles with mechanistic insights. How big a barrier is this to public trust, and when can we expect true mechanistic insights?</p>\n<p>Demis Hassabis: AI is an engineering science. First, we need to build systems that are worthy of study. Once built, we can break them down and understand them mechanistically over time. Early systems weren\u2019t worth this effort, but now we\u2019re developing tools that are, and they\u2019re improving themselves. 
Unlike physics, biology can\u2019t always be explained by universal laws, but simulations that can be tested and probed are better suited. Neuroscience techniques, like those used to study real brains, can also help us understand artificial neural networks.</p>\n<p>Q: Is attention still all we need?</p>\n<p>John Jumper: AlphaFold isn\u2019t just an off-the-shelf transformer. While attention is an important component, many other innovations were added to change the structure of the network significantly. Fundamental research continues to unlock insights into both new data and previously unexamined data. AlphaFold has revealed new knowledge about data that had been available for years.</p>\n<p>Demis Hassabis: The transformer architecture has been incredible but isn\u2019t sufficient on its own. We\u2019ll need several more breakthroughs of that scale to reach full AGI.</p>\n<p>Q: What are the current challenges in biology data?</p>\n<p>Jennifer Doudna: Biology faces issues with both the quality and quantity of data for training AI models. We need to educate researchers on how to collect data both sparsely and smartly. Sparse but broad data is critical to creating robust platforms for training. This ultimately comes down to asking the right questions.</p>\n<p>Q: What about people who are skeptical of these breakthroughs? Could society reject them?</p>\n<p>Paul Nurse: Keeping the public on board is critical. This isn\u2019t the first time new technology has faced resistance, and every time it happens, there\u2019s concern. Special interest groups often hijack these conversations, so we need to find better ways to engage with the public and explain the science behind the breakthroughs.</p>\n<p>Q: Africa will have the largest population of young adults by 2050. How can Africans be included in this global scientific revolution?</p>\n<p>Jennifer Doudna: The Innovative Genomics Institute has an ongoing effort in Kenya to work with scientists and help them understand CRISPR. 
This initiative has fostered a network of researchers, and I\u2019d like to see more of that happening.</p>\n<p>Demis Hassabis: DeepMind has been actively working in Africa, with events like the Deep Learning Indaba conference serving as key convening points for African talent. There\u2019s still a lot more to be done, but it\u2019s a hugely important area of focus.</p>\n<p>Q: How do we encourage the next generation of scientists?</p>\n<p>Paul Nurse: In today\u2019s world, journals are dominated by big data studies. While there\u2019s value in this, we must ensure that creativity doesn\u2019t get lost. There\u2019s enormous potential in big data if approached with creativity, and we need to foster this mindset in our colleagues and students.</p>\n<p>Demis Hassabis: Encouraging the next generation is crucial. One of my heroes is Richard Feynman. Every schoolchild should read <em>Surely You\u2019re Joking, Mr. Feynman!</em> It shows how exhilarating it is to work at the frontier of knowledge. Science is incredible and fun, and we need to expose people to that joy.</p>\n<p>\n<img alt=\"Ray Dolby is a Pembroke alumnus too\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-13.webp\" title=\"Ray Dolby is a Pembroke alumnus too\">\nRay Dolby is a Pembroke alumnus too\n\n<img alt=\"Interactive exhibits inside the room\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-15.webp\" title=\"Interactive exhibits inside the room\">\nInteractive exhibits inside the room\n\n<img alt=\"Glitzy entrance to the BAFTA\" src=\"https://anil.recoil.org/images/ai-for-science/ai-for-science-2024-14.webp\" title=\"Glitzy entrance to the BAFTA\">\nGlitzy entrance to the BAFTA</p>\n<p>These conclude my live notes! Beyond the notes here, the corridor conversations were incredibly\nuseful for me: I have lots of connections to make next. 
Any errors in these\nnotes are all mine, of course; I mainly took them for myself, but I hope it's\nuseful for you that I've put them online as well.</p>",
+18
avsm/notes_ai-ietf-aiprefs.json
···+"summary": "<p>The <a href=\"https://ietf.org\">IETF</a> <a href=\"https://bsky.app/profile/ietf.org/post/3lj6w5fpjx22u\">announced</a> their new <a href=\"https://www.ietf.org/blog/aipref-wg/\">AI Preferences Working Group</a> (AIPREF), which will <em>"work on standardizing building blocks that allow for the expression of preferences about how content is collected and processed for Artificial Intelligence models"</em>. This is quite well timed; the IETF tries not to standardise too early before there is <a href=\"https://www.ietf.org/runningcode/\">running code</a> but also needs to move before it's too late and a bad de facto standard is <a href=\"https://datatracker.ietf.org/doc/html/rfc7282\">chosen</a>. The AI world seems to be at that nexus point right about now, with <a href=\"https://openai.com/index/introducing-gpt-4-5/\">GPT 4.5</a> seemingly hitting a <a href=\"https://www.newscientist.com/article/2470327-is-openai-hitting-a-wall-with-huge-and-expensive-gpt-4-5-model/\">scaling wall</a> and possibly triggering the start of a renewed data scraping frenzy.</p>\n<h2><a href=\"https://anil.recoil.org/#how-do-websites-interact-with-ai-crawlers-right-now\"></a>How do websites interact with AI crawlers right now?</h2>\n<p>I've found, when developing my own website, that there are a number of approaches to interacting with automated data crawlers. For the record, over 90% of the traffic to this site is from automated sources, so it's a material concern for <a href=\"https://anil.recoil.org/news?t=selfhosting\">selfhosting</a> infrastructure.</p>\n<ol>\n<li><strong>Ban all bots; humans only plz:</strong> I don't want to do this, as I'd like to opt my writing into training next-generation foundation models, but would like some agency over how much I need to pay for them to get their data (I am covering the bandwidth costs here, after all), so I just need them to cooperate more to avoid flooding my site. 
If I do want to ban them, the excellent <a href=\"https://github.com/ai-robots-txt/ai.robots.txt/blob/main/table-of-bot-metrics.md\">ai-robots</a> crew maintain a useful list of bad bots.</li>\n<li><strong>Ban some bots with a robots.txt:</strong> <a href=\"https://www.rfc-editor.org/rfc/rfc9309.html\">RFC9309</a> allows for the discrimination of web-crawlers via a <a href=\"https://anil.recoil.org/robots.txt\">robots.txt</a>. We nowadays have not just a few big crawlers mirroring the Internet (like <a href=\"https://developers.google.com/search/docs/crawling-indexing/googlebot\">Googlebot</a> and <a href=\"https://en.wikipedia.org/wiki/Bingbot\">Bingbot</a>), but seemingly thousands of variants competing for the data gold rush (or in my case, for <a href=\"https://anil.recoil.org/projects/ce\">conservation research</a>!) The <code>robots.txt</code> doesn't give us enough control to usefully rate-limit across all of these, unfortunately. You need to regenerate the file every time there are new URLs on the site that don't fit a longest-prefix match. This, combined with having a mega <a href=\"https://sitemaps.org\">sitemaps</a> file, is a lot of non-cacheable metadata that's just adding to my serving load.</li>\n<li><strong>Add server-side throttling for specific bots:</strong> On the assumption that there are a bunch of bad bots that mimic good bots, what I really need is to start rate-throttling them all! This is where I am today, and ended up hacking together a bunch of OCaml code for <a href=\"https://anil.recoil.org/notes/bushel-lives\">this</a> website to track all the robots request rates and slow down over-eager ones. 
The rest of the Internet are mostly just asking Cloudflare to take care of this for them, which results in a <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">world of pain</a> for anyone outside of their world view.</li>\n<li><strong>Just give the bots what they want, which is Markdown:</strong> Since I can't really win the throttling wars in the long term, can I just give the bots what they want, which is the core text without all the HTML around it? The first thing these crawlers do is to tokenize the HTML anyway! There is <a href=\"https://llmstxt.org/\">llms.txt</a> emerging for this. I author my website in Markdown in the first place, and then transform it into the HTML you see here. But it looks like the <a href=\"https://llmstxt.org/domains.html\">llms.txt guidelines</a> insist on just one page at the root of the site, and not one Markdown per page. This is probably better for reducing crawling traffic, but it would be a large page even for my humble homepage.</li>\n<li><strong>Can I just give you a tarball with my stuff so you leave me alone?:</strong> I rebuild my site regularly, so I could just provide the AI bots with a convenient tar/zip of my entire website content, but put it in a common place so I don't have to pay for the download bandwidth. This could include my images, videos, and source markdown which could be used not only for training, but for <a href=\"https://archive.org/\">archival</a> as well. 
We don't seem to have a common protocol to map URLs to static archives right now, although there are a <a href=\"https://en.wikipedia.org/wiki/Web_archive_file\">few web archive</a> formats flying around.</li>\n</ol>\n<h2><a href=\"https://anil.recoil.org/#the-role-of-the-ietf-is-to-create-protocols-not-mandate-implementations\"></a>The role of the IETF is to create protocols, not mandate implementations</h2>\n<p>The IETF has a valuable role to play here to establish a consensus around what a sensible, usable <em>protocol</em> for exchanging data on our websites might look like, rather than mandating any specific backend technology or storage format.\nThere is a lot of nuance around sharing content over HTTP: it supports <a href=\"https://http.dev/authentication\">authentication</a>, <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control\">caching</a>, <a href=\"http://web.archive.org/web/20190904190534/https://www.dirv.me/blog/2011/07/18/understanding-403-forbidden/index.html\">access control</a>, <a href=\"https://matt-jackson.com/seo-glossary/http-429/\">rate limiting</a>, and many other features hidden behind a seemingly simple <a href=\"https://datatracker.ietf.org/doc/html/rfc2616\">request-response specification</a>.</p>\n<p>I'm hoping that the AIPREF process will end up with something that gives me something closer to 5) above than 1). I need an HTTP-based mechanism by which I can express my preferences for AI crawling, and cooperate with the crawlers so that I can ensure maximum collective benefit to both people and bots visiting my site, rather than withdrawing behind a gated community of humans only. 
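The per-crawler throttling described in option (3) above can be sketched as a token bucket per crawler identity. This is a minimal illustration, not the OCaml code this site actually runs; the rate and burst numbers, and keying on the raw `User-Agent`, are assumptions made for the sketch.

```python
import time
from collections import defaultdict

class ThrottleTable:
    """One token bucket per crawler identity (here keyed by User-Agent)."""
    def __init__(self, rate=1.0, burst=5.0):
        self.rate = rate    # tokens refilled per second
        self.burst = burst  # maximum bucket size
        self.buckets = defaultdict(
            lambda: {"tokens": burst, "ts": time.monotonic()})

    def allow(self, user_agent: str) -> bool:
        b = self.buckets[user_agent]
        now = time.monotonic()
        # Refill proportionally to time elapsed, capped at the burst size.
        b["tokens"] = min(self.burst, b["tokens"] + (now - b["ts"]) * self.rate)
        b["ts"] = now
        if b["tokens"] >= 1.0:
            b["tokens"] -= 1.0
            return True
        return False  # caller would answer 429 Too Many Requests

table = ThrottleTable(rate=1.0, burst=3.0)
# A rapid burst of 10 requests from one over-eager bot: only the first 3 pass.
results = [table.allow("HungryBot/1.0") for _ in range(10)]
```

Well-behaved crawlers that back off on 429 responses stay under the bucket rate and are never refused; the over-eager ones slow themselves down.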
However, I think that this requires the establishment of a protocol to help sequence the HTTP requests together and not just a single static file like <code>llms.txt</code> or <code>sitemap.xml</code>.</p>\n<p>Back in the 90s, I <a href=\"https://anil.recoil.org/papers/netapp-tr-3071\">worked</a> <a href=\"https://anil.recoil.org/papers/netapp-tr-3152\">on</a> NetApp/<a href=\"https://en.wikipedia.org/wiki/NetCache\">NetCache</a> with <a href=\"https://www.netskope.com/press-releases/netskope-john-martin-chief-product-officer\">John Martin</a>. Bandwidth used to be expensive and so we deployed edge caches that could <em>modify</em> website content with local modifications to common global content. Consider, for example, a local news website that might want to show mostly cached global news, but also modify the HTML to include local news content. You can do that today via JavaScript, but back then the only way was to have a protocol to modify the static HTML. The <a href=\"https://datatracker.ietf.org/doc/rfc3507/\">Internet Content Adaptation Protocol</a> was the IETF's answer to creating a structured HTTP-like protocol to allow edge modifications from proxy servers:</p>\n<blockquote>\n<p>ICAP is, in essence, a lightweight protocol for executing a "remote procedure call" on HTTP messages. It allows ICAP clients to pass HTTP messages to ICAP servers for some sort of transformation or other processing ("adaptation"). The server executes its transformation service on messages and sends back responses to the client, usually with modified messages. Typically, the adapted messages are either HTTP requests or HTTP responses.\n-- <a href=\"https://datatracker.ietf.org/doc/rfc3507/\">RFC3507</a>, IETF</p>\n</blockquote>\n<p>One of the coolest features of ICAP is that it didn't mandate the transformation mechanism, just the protocol. The proxies deployed at the edge networks would get a vector into transforming the data stream. 
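To give a flavour of the protocol, an ICAP request-modification exchange looks roughly like this, abbreviated from the style of the RFC3507 examples (the hostnames and the byte offsets in the `Encapsulated` headers here are illustrative, not computed):

```
REQMOD icap://icap.example.org/reqmod ICAP/1.0
Host: icap.example.org
Encapsulated: req-hdr=0, null-body=119

GET /index.html HTTP/1.1
Host: www.origin-server.example
Accept: text/html

ICAP/1.0 200 OK
ISTag: "example-xform-001"
Encapsulated: req-hdr=0, null-body=153

GET /index.html HTTP/1.1
Host: www.origin-server.example
Accept: text/html
Via: 1.0 icap.example.org (content adaptation)
```

The client hands over an encapsulated HTTP message and gets a modified one back; what the server does in between is entirely up to its implementation, which is exactly the property worth borrowing.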
NetCache shipped an implementation of ICAP, and Squid <a href=\"https://www.egirna.com/blog/news-2/configure-squid-v6-2-on-ubuntu-server-22-and-use-it-with-icap-18\">still supports</a> it. What would a similar approach look like for allowing crawlers into your site's content, but leaving lots of freedom for the details of this to be delegated to the crawlers and servers?</p>\n<h2><a href=\"https://anil.recoil.org/#challenges-in-an-open-data-hoovering-protocol\"></a>Challenges in an open data hoovering protocol</h2>\n<p><a href=\"https://bsky.app/profile/aftnet.bsky.social\">Antoine Fressancourt</a> <a href=\"https://bsky.app/profile/aftnet.bsky.social/post/3ljcw2uawe22c\">identifies</a> the main problem facing AIPref:</p>\n<blockquote>\n<p>Given the reports that some current LLM models have been trained on data corpus obtained illegally, I have some doubts that AIPref will be respected.</p>\n</blockquote>\n<p>This is <a href=\"https://www.tomshardware.com/tech-industry/artificial-intelligence/meta-staff-torrented-nearly-82tb-of-pirated-books-for-ai-training-court-records-reveal-copyright-violations\">true</a> for the current generation of data crawlers, but also where the <em>opportunity</em> lies for AIPref. Without a systematic way to support <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">replication of non-public data</a>, the situation will get even worse as custom apertures are created into data silos without any integrity underlying them.</p>\n<p>The main reason for having a protocol-based solution is that we could support the strong authentication and identification of bots. If (for example) the GoogleBot supplied a token with every HTTP request to fetch my content, I could track its use and perhaps even get compensation for the bandwidth costs. 
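As a sketch of what per-request bot credentials could enable (nothing like this is specified by AIPREF yet; the bot IDs, shared secrets, and token format below are invented for illustration), the origin could verify a signed, path-bound token and meter usage per bot:

```python
import hashlib
import hmac

def sign_request(secret: bytes, bot_id: str, path: str) -> str:
    """Crawler side: derive a per-request token from a registered secret."""
    mac = hmac.new(secret, f"{bot_id}:{path}".encode(), hashlib.sha256)
    return f"{bot_id}:{mac.hexdigest()}"

def verify_request(secrets: dict, token: str, path: str) -> bool:
    """Origin side: check the token before serving, so usage (and hence
    bandwidth compensation) can be attributed to a specific bot."""
    bot_id, _, digest = token.partition(":")
    secret = secrets.get(bot_id)
    if secret is None:
        return False  # unregistered crawler
    expected = hmac.new(secret, f"{bot_id}:{path}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, digest)

secrets = {"googlebot": b"secret-agreed-at-registration"}
tok = sign_request(secrets["googlebot"], "googlebot", "/notes/bushel-lives")
assert verify_request(secrets, tok, "/notes/bushel-lives")
assert not verify_request(secrets, tok, "/some/other/page")  # path-bound
```

A real design would more likely use asymmetric signatures, along the lines of HTTP Message Signatures (RFC 9421), so origins never hold crawler secrets; the shared-secret HMAC just keeps the sketch short.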
The current methods of <a href=\"https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot\">bot verification</a> all seem quite weak; they are just IP-based checks, for example.</p>\n<p>This would in turn open a path to the disciplined negotiation for access-controlled data bilaterally between crawlers and hosters. More and more content publishers are signing various <a href=\"https://www.monda.ai/blog/ultimate-list-of-data-licensing-deals-for-ai\">exclusive deals</a> with AI training companies. Irrespective of your opinion on such deals, a protocol to make it easier to authenticate bots strongly would make the establishment (and ongoing negotiation) of those mechanisms far easier to handle.</p>\n<p>We are also seeing rapid adoption of the <a href=\"https://github.com/modelcontextprotocol\">Model Context Protocol</a> released a few months ago. This establishes a <a href=\"https://github.com/modelcontextprotocol/specification\">JSON-RPC specification</a> for LLM clients and data providers to talk to each other locally. It seems odd to me that we'd have a rich "local" specification for data exchange like this for RAG-like systems, but not have one in the wide area across the Internet. As the chair of the AIPREF group <a href=\"https://mnot.net\">Mark Nottingham</a> notes, <a href=\"https://www.mnot.net/blog/2024/11/29/platforms\">platform advantages</a> are not just network effects, so there may be deep repercussions for the economics of AI here:</p>\n<blockquote>\n<p>In short: there are less-recognised structural forces that push key Internet services into centralized, real-time advertising-supported platforms. Along with factors like network effects and access to data, they explain some of why the Internet landscape looks like it does.\n-- <a href=\"https://www.mnot.net/blog/2024/11/29/platforms\">Mark Nottingham</a></p>\n</blockquote>\n<p>Just substitute "advertising-supported" with "AI" above and the trend becomes clear. 
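For comparison, the "local" exchange that MCP standardises is plain JSON-RPC 2.0; a client asking a server what data tools it offers looks something like this (the `tools/list` method is from the MCP specification, while the tool entry shown is invented for illustration):

```json
{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

{"jsonrpc": "2.0", "id": 1, "result": {"tools": [
  {"name": "fetch_page",
   "description": "Fetch one page of site content as Markdown"}]}}
```

It is this kind of structured request/response pairing, rather than a single static file, that is missing in the wide area between crawlers and origin servers.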
The protocol designs we choose today will form structural forces that decide the future of what the post-advertising-driven Internet culture and content architecture looks like. It would be a nice outcome to establish open protocols that are somewhere in between the <a href=\"https://github.com/punkpeye/awesome-mcp-clients\">MCP clients</a> and <a href=\"https://en.wikipedia.org/wiki/Web_server\">HTTP servers</a> to facilitate a more equitable outcome rather than pooling all the data to a few big players.</p>\n<p>The other consideration here is that such an open protocol could have utility far beyond "just" managing AI training bots and address the general problem we have that <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">replicating datasets with access control is difficult</a>. This would help the good folk at <a href=\"https://archive.org/\">Archive.org</a> to manage <a href=\"https://help.archive.org/help/how-to-download-files/\">restricted access</a> data sets that might eventually become open. There are also geospatial datasets such as <a href=\"https://www.gbif.org/\">biodiversity data</a> that need help managing how they are mirrored, but with access restrictions for <a href=\"https://india.mongabay.com/2025/02/commentary-how-data-deficiency-is-hindering-hydro-diplomacy-between-china-and-india/\">geopolitical reasons</a>.</p>\n<p>Luckily, the IETF do a lot of things over email, so I've signed up to the <a href=\"https://mailman3.ietf.org/mailman3/lists/ai-control.ietf.org/\">AIPREF mailing list</a> to learn more as it develops and hopefully participate!</p>\n\n<p>Changelog. 
Mar 1st 2024: Thanks to <a href=\"https://mynameismwd.org\">Michael Dales</a> for spotting typos, and <a href=\"https://bsky.app/profile/aftnet.bsky.social\">Antoine Fressancourt</a> for helpful clarifying questions on Bluesky.</p>",+"content": "<p>The <a href=\"https://ietf.org\">IETF</a> <a href=\"https://bsky.app/profile/ietf.org/post/3lj6w5fpjx22u\">announced</a> their new <a href=\"https://www.ietf.org/blog/aipref-wg/\">AI Preferences Working Group</a> (AIPREF), which will <em>"work on standardizing building blocks that allow for the expression of preferences about how content is collected and processed for Artificial Intelligence models"</em>. This is quite well timed; the IETF tries not to standardise too early before there is <a href=\"https://www.ietf.org/runningcode/\">running code</a> but also needs to move before it's too late and a bad defacto standard is <a href=\"https://datatracker.ietf.org/doc/html/rfc7282\">chosen</a>. The AI world seems to be at that nexus point right about now, with <a href=\"https://openai.com/index/introducing-gpt-4-5/\">GPT 4.5</a> seemingly hitting a <a href=\"https://www.newscientist.com/article/2470327-is-openai-hitting-a-wall-with-huge-and-expensive-gpt-4-5-model/\">scaling wall</a> and possibly triggering the start of a renewed data scraping frenzy.</p>\n<h2><a href=\"https://anil.recoil.org/#how-do-websites-interact-with-ai-crawlers-right-now\"></a>How do websites interact with AI crawlers right now?</h2>\n<p>I've found when developing my own website there are a number of approaches to interacting with automated data crawlers. 
For the record, over 90% of the traffic to this site is from automated sources, so it's a material concern for <a href=\"https://anil.recoil.org/news?t=selfhosting\">selfhosting</a> infrastructure.</p>\n<ol>\n<li><strong>Ban all bots; humans only plz:</strong> I don't want to do this, as I'd like to opt into my writing training next generation foundation models, but would like some agency over how much I need to pay for them to get their data (I am covering the bandwidth costs here, after all), so I just need them to cooperate more to avoid flooding my site. If I do want to ban them, the excellent <a href=\"https://github.com/ai-robots-txt/ai.robots.txt/blob/main/table-of-bot-metrics.md\">ai-robots</a> crew maintain a useful list of bad bots.</li>\n<li><strong>Ban some bots with a robots.txt:</strong> <a href=\"https://www.rfc-editor.org/rfc/rfc9309.html\">RFC9309</a> allows for the discrimination of web-crawlers via a <a href=\"https://anil.recoil.org/robots.txt\">robots.txt</a>. We nowadays have not just a few big crawlers mirroring the Internet (like <a href=\"https://developers.google.com/search/docs/crawling-indexing/googlebot\">Googlebot</a> and <a href=\"https://en.wikipedia.org/wiki/Bingbot\">Bingbot</a>), but seemingly thousands of variants competing for the data gold rush (or in my case, for <a href=\"https://anil.recoil.org/projects/ce\">conservation research</a>!) The <code>robots.txt</code> doesn't give us enough control to usefully rate-limit across all of these, unfortunately. You need to regenerate the file every time there are new URLs on the site that don't fit a longest-prefix match. This, combined with having a mega <a href=\"https://sitemaps.org\">sitemaps</a> file, is a lot of non-cacheable metadata that's just adding to my serving load.</li>\n<li><strong>Add server-side throttling for specific bots:</strong> On the assumption that there are a bunch of bad bots that mimic good bots, what I really need is to start rate-throttling them all! 
This is where I am today, and ended up hacking together a bunch of OCaml code for <a href=\"https://anil.recoil.org/notes/bushel-lives\">this</a> website to track all the robots request rates and slow down over-eager ones. The rest of the Internet are mostly just asking Cloudflare to take care of this for them, which results in a <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">world of pain</a> for anyone outside of their world view.</li>\n<li><strong>Just give the bots what they want, which is Markdown:</strong> Since I can't really win the throttling wars in the long term, can I just give the bots what they want, which is the core text without all the HTML around it? The first thing these crawlers do is to tokenize the HTML anyway! There is <a href=\"https://llmstxt.org/\">llms.txt</a> emerging for this. I author my website in Markdown in the first place, and then transform it into the HTML you see here. But it looks like the <a href=\"https://llmstxt.org/domains.html\">llms.txt guidelines</a> insist on just one page at the root of the site, and not one Markdown per page. This is probably better for reducing crawling traffic, but it would be a large page even for my humble homepage.</li>\n<li><strong>Can I just give you a tarball with my stuff so you leave me alone?:</strong> I rebuild my site regularly, so I could just provide the AI bots with a convenient tar/zip of my entire website content, but put it in a common place so I don't have to pay for the download bandwidth. This could include my images, videos, and source markdown which could be used not only for training, but for <a href=\"https://archive.org/\">archival</a> as well. 
We don't seem to have a common protocol to map URLs to static archives right now, although there are a <a href=\"https://en.wikipedia.org/wiki/Web_archive_file\">few web archive</a> formats flying around.</li>\n</ol>\n<h2><a href=\"https://anil.recoil.org/#the-role-of-the-ietf-is-to-create-protocols-not-mandate-implementations\"></a>The role of the IETF is to create protocols, not mandate implementations</h2>\n<p>The IETF has a valuable role to play here to establish a consensus around what a sensible, usable <em>protocol</em> for exchanging data on our websites might look like, rather than mandating any specific backend technology or storage format.\nThere is a lot of nuance around sharing content over HTTP: it supports <a href=\"https://http.dev/authentication\">authentication</a>, <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control\">caching</a>, <a href=\"http://web.archive.org/web/20190904190534/https://www.dirv.me/blog/2011/07/18/understanding-403-forbidden/index.html\">access control</a>, <a href=\"https://matt-jackson.com/seo-glossary/http-429/\">rate limiting</a>, and many other features hidden behind a seemingly simple <a href=\"https://datatracker.ietf.org/doc/html/rfc2616\">request-response specification</a>.</p>\n<p>I'm hoping that the AIPREF process will end up with something that gives me something closer to 5) above than 1). I need an HTTP-based mechanism by which I can express my preferences for AI crawling, and cooperate with the crawlers so that I can ensure maximum collective benefit to both people and bots visiting my site, rather than withdrawing behind a gated community of humans only. 
However, I think that this requires the establishment of a protocol to help sequence the HTTP requests together and not just a single static file like <code>llms.txt</code> or <code>sitemap.xml</code>.</p>\n<p>Back in the 90s, I <a href=\"https://anil.recoil.org/papers/netapp-tr-3071\">worked</a> <a href=\"https://anil.recoil.org/papers/netapp-tr-3152\">on</a> NetApp/<a href=\"https://en.wikipedia.org/wiki/NetCache\">NetCache</a> with <a href=\"https://www.netskope.com/press-releases/netskope-john-martin-chief-product-officer\">John Martin</a>. Bandwidth used to be expensive and so we deployed edge caches that could <em>modify</em> website content with local modifications to common global content. Consider, for example, a local news website that might want to show mostly cached global news, but also modify the HTML to include local news content. You can do that today via JavaScript, but back then the only way was to have a protocol to modify the static HTML. The <a href=\"https://datatracker.ietf.org/doc/rfc3507/\">Internet Content Adaptation Protocol</a> was the IETF's answer to creating a structured HTTP-like protocol to allow edge modifications from proxy servers:</p>\n<blockquote>\n<p>ICAP is, in essence, a lightweight protocol for executing a "remote procedure call" on HTTP messages. It allows ICAP clients to pass HTTP messages to ICAP servers for some sort of transformation or other processing ("adaptation"). The server executes its transformation service on messages and sends back responses to the client, usually with modified messages. Typically, the adapted messages are either HTTP requests or HTTP responses.\n-- <a href=\"https://datatracker.ietf.org/doc/rfc3507/\">RFC3507</a>, IETF</p>\n</blockquote>\n<p>One of the coolest features of ICAP is that it didn't mandate the transformation mechanism, just the protocol. The proxies deployed at the edge networks would get a vector into transforming the data stream. 
NetCache shipped an implementation of ICAP, and Squid <a href=\"https://www.egirna.com/blog/news-2/configure-squid-v6-2-on-ubuntu-server-22-and-use-it-with-icap-18\">still supports</a> it. What would a similar approach look like for allowing crawlers into your site's content, but leaving lots of freedom for the details of this to be delegated to the crawlers and servers?</p>\n<h2><a href=\"https://anil.recoil.org/#challenges-in-an-open-data-hoovering-protocol\"></a>Challenges in an open data hoovering protocol</h2>\n<p><a href=\"https://bsky.app/profile/aftnet.bsky.social\">Antoine Fressancourt</a> <a href=\"https://bsky.app/profile/aftnet.bsky.social/post/3ljcw2uawe22c\">identifies</a> the main problem facing AIPref:</p>\n<blockquote>\n<p>Given the reports that some current LLM models have been trained on data corpus obtained illegally, I have some doubts that AIPref will be respected.</p>\n</blockquote>\n<p>This is <a href=\"https://www.tomshardware.com/tech-industry/artificial-intelligence/meta-staff-torrented-nearly-82tb-of-pirated-books-for-ai-training-court-records-reveal-copyright-violations\">true</a> for the current generation of data crawlers, but also where the <em>opportunity</em> lies for AIPref. Without a systematic way to support <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">replication of non-public data</a>, the situation will get even worse as custom apertures are created into data silos without any integrity underlying them.</p>\n<p>The main reason for having a protocol-based solution is that we could support the strong authentication and identification of bots. If (for example) the GoogleBot supplied a token with every HTTP request to fetch my content, I could track its use and perhaps even get compensation for the bandwidth costs. 
The current methods of <a href=\"https://developers.google.com/search/docs/crawling-indexing/verifying-googlebot\">bot verification</a> all seem quite weak; they are just IP-based checks, for example.</p>\n<p>This would in turn open a path to the disciplined negotiation for access-controlled data bilaterally between crawlers and hosters. More and more content publishers are signing various <a href=\"https://www.monda.ai/blog/ultimate-list-of-data-licensing-deals-for-ai\">exclusive deals</a> with AI training companies. Irrespective of your opinion on such deals, a protocol to make it easier to authenticate bots strongly would make the establishment (and ongoing negotiation) of those mechanisms far easier to handle.</p>\n<p>We are also seeing rapid adoption of the <a href=\"https://github.com/modelcontextprotocol\">Model Context Protocol</a> released a few months ago. This establishes a <a href=\"https://github.com/modelcontextprotocol/specification\">JSON-RPC specification</a> for LLM clients and data providers to talk to each other locally. It seems odd to me that we'd have a rich "local" specification for data exchange like this for RAG-like systems, but not have one in the wide area across the Internet. As the chair of the AIPREFS group <a href=\"https://mnot.net\">Mark Nottingham</a> notes, <a href=\"https://www.mnot.net/blog/2024/11/29/platforms\">platform advantages</a> are not just network effects, so there may be deep repercussions for the economics of AI here:</p>\n<blockquote>\n<p>In short: there are less-recognised structural forces that push key Internet services into centralized, real-time advertising-supported platforms. Along with factors like network effects and access to data, they explain some of why the Internet landscape looks like it does.\n-- <a href=\"https://www.mnot.net/blog/2024/11/29/platforms\">Mark Nottingham</a></p>\n</blockquote>\n<p>Just substitute "advertising-supported" with "AI" above and the trend becomes clear. 
The protocol designs we choose today will form structural forces that decide the future of what the post-advertising-driven Internet culture and content architecture looks like. It would be a nice outcome to establish open protocols that are somewhere in between the <a href=\"https://github.com/punkpeye/awesome-mcp-clients\">MCP clients</a> and <a href=\"https://en.wikipedia.org/wiki/Web_server\">HTTP servers</a> to facilitate a more equitable outcome rather than pooling all the data into a few big players.</p>\n<p>The other consideration here is that such an open protocol could have utility far beyond "just" managing AI training bots and address the general problem we have that <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">replicating datasets with access control is difficult</a>. This would help the good folk at <a href=\"https://archive.org/\">Archive.org</a> to manage <a href=\"https://help.archive.org/help/how-to-download-files/\">restricted access</a> data sets that might eventually want to become open. There are also geospatial datasets such as <a href=\"https://www.gbif.org/\">biodiversity data</a> that need help managing how they are mirrored, but with access restrictions for <a href=\"https://india.mongabay.com/2025/02/commentary-how-data-deficiency-is-hindering-hydro-diplomacy-between-china-and-india/\">geopolitical reasons</a>.</p>\n<p>Luckily, the IETF do a lot of things over email, so I've signed up to the <a href=\"https://mailman3.ietf.org/mailman3/lists/ai-control.ietf.org/\">AIPREF mailing list</a> to learn more as it develops and hopefully participate!</p>\n\n<p>Changelog. Mar 1st 2024: Thanks to <a href=\"https://mynameismwd.org\">Michael Dales</a> for spotting typos, and <a href=\"https://bsky.app/profile/aftnet.bsky.social\">Antoine Fressancourt</a> for helpful clarifying questions on Bluesky.</p>",
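The strong bot authentication idea in the note above (a crawler such as GoogleBot supplying a verifiable token with every HTTP request, rather than relying on IP-based checks) could be sketched roughly as follows. The header format, bot registry, and HMAC scheme here are invented purely for illustration; they are not part of any existing specification or the AIPREF work.

```python
# Hypothetical sketch of strong bot authentication over HTTP: the crawler
# signs each request with a key it has previously registered with the site
# operator, who can then attribute (and meter) every fetch.
import hashlib
import hmac

# Keys the site operator holds for registered crawlers (illustrative only).
REGISTERED_BOTS = {"examplebot": b"shared-secret-from-registration"}

def sign_request(bot_id: str, key: bytes, method: str, path: str) -> str:
    """What the crawler would send, e.g. in a hypothetical 'Bot-Signature' header."""
    msg = f"{method} {path}".encode()
    return f"{bot_id}:" + hmac.new(key, msg, hashlib.sha256).hexdigest()

def verify_request(header: str, method: str, path: str) -> bool:
    """Server side: recompute the signature with the registered key and compare."""
    bot_id, _, sig = header.partition(":")
    key = REGISTERED_BOTS.get(bot_id)
    if key is None:
        return False  # unknown bot: fall back to throttling or denial
    expected = hmac.new(key, f"{method} {path}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

header = sign_request("examplebot", REGISTERED_BOTS["examplebot"], "GET", "/index.md")
print(verify_request(header, "GET", "/index.md"))   # True for a valid signature
print(verify_request(header, "GET", "/other.md"))   # False: signature covers a different path
```

A real protocol would also need replay protection (timestamps or nonces) and a key-distribution story, which is exactly the sort of nuance an IETF process could standardise.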
+18
avsm/notes_ai-poisoning.json
+18
avsm/notes_ai-poisoning.json
···+"summary": "<p>For the past few years, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been working with our colleagues in\n<a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> to do <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">analysis at scale</a> on the\nacademic literature. Getting local access to millions of fulltext papers has not\nbeen without drama, but was made possible thanks to huge amounts of help from our\n<a href=\"https://www.lib.cam.ac.uk/\">University Library</a> who helped us navigate our\nrelationships with scientific publishers. We have just <strong><a href=\"https://rdcu.be/evkfj\">published a comment\nin Nature</a></strong> about the next phase\nof our research, where we are looking into the impact of AI advances on evidence synthesis.</p>\n<p><a href=\"https://rdcu.be/evkfj\"> \n<img alt=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\" src=\"https://anil.recoil.org/images/davidparkins-ai-poison.webp\" title=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\">\nAI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature </a></p>\n<p>Our work on literature reviews led us into assessing methods for <a href=\"https://royalsociety.org/news-resources/projects/evidence-synthesis/\">evidence\nsynthesis</a>\n(which is crucial to rational policymaking!) and specifically about how recent advances in AI may\nimpact it. The current methods for <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">rigorous systematic literature review</a> are expensive and slow, and authors are already struggling to keep up with the <a href=\"https://ourworldindata.org/grapher/scientific-and-technical-journal-articles?time=latest\">rapidly expanding</a>\nnumber of legitimate papers. 
Adding to this, <a href=\"https://retractionwatch.com/2025/\">paper retractions</a> are increasing nearly\n<a href=\"https://www.nature.com/articles/d41586-023-03974-8\">exponentially</a> and already\nsystematic reviews <a href=\"https://retractionwatch.com/the-retraction-watch-leaderboard/top-10-most-highly-cited-retracted-papers/\">unknowingly cite</a>\nretracted papers, with most remaining uncorrected even a year after notification!</p>\n<p>This is all made much more complex as LLMs are flooding the landscape with\nconvincing fake manuscripts and doctored data, potentially overwhelming our\ncurrent ability to distinguish fact from fiction. Just this March, the <a href=\"https://sakana.ai/ai-scientist/\">AI\nScientist</a> formulated hypotheses, designed and\nran experiments, analysed the results, generated the figures and produced a\nmanuscript that <a href=\"https://sakana.ai/ai-scientist-first-publication/\">passed human peer\nreview</a> for an ICLR\nworkshop! Distinguishing genuine papers from those produced by LLMs isn't just\na problem for review authors; it's a threat to the very foundation of\nscientific knowledge. And meanwhile, Google is taking a different tack with a\ncollaborative <a href=\"https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/\">AI co-scientist</a> which acts as a multi-agent assistant.</p>\n<p>So the landscape is moving <em>really</em> quickly! Our proposal for the future of\nliterature reviews builds on our desire to move towards a more regional,\nfederated network approach. 
Instead of having giant repositories of knowledge\nthat <a href=\"https://en.wikipedia.org/wiki/2025_United_States_government_online_resource_removals\">may be erased unilaterally</a>,\nwe're aiming for a more bilateral network of "living evidence databases".\nEvery government, especially those in the Global South, should have the ability to build their\nown "<a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">national data libraries</a>" which represent the body\nof digital data that affects their own regional needs.</p>\n<p>This system of living evidence databases can be incremental and dynamically\nupdated, and AI assistance can be used as long as humans remain in-the-loop.\nSuch a system can continuously gather, screen, and index literature,\nautomatically removing compromised studies and recalculating results. We're\nworking on this on multiple fronts this year, ranging from the computer science\nto figure out the distributed-nitty-gritty <a href=\"https://anil.recoil.org/#fn-1\">[1]</a>, over to working with the\n<a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">GEOBON folk</a> on global biodiversity <a href=\"https://www.tunbury.org/2025/07/02/bon-in-a-box/\">data\nmanagement</a>, and continuing\nto drive the core LED design at Conservation Evidence.</p>\n<p>Read our <a href=\"https://www.nature.com/articles/d41586-025-02069-w\">Nature Comment piece</a> (<a href=\"https://www.linkedin.com/posts/anilmadhavapeddy_will-ai-speed-up-literature-reviews-or-derail-activity-7348317711002705920-Y5UT?rcm=ACoAAAB0Kb0BNo1v6ylsGU2NtPa95mj-w1VcaJA\">comment on LI</a>) to learn more about how we think we can safeguard evidence synthesis against the rising tide of "AI-poisoned literature" and ensure the continued integrity of scientific discovery. 
As a random bit of trivia, the incredibly cool artwork in the piece was drawn by the legendary <a href=\"https://www.davidparkins.com/\">David Parkins</a>, who also drew <a href=\"https://www.beano.com/\">Beano</a> and <a href=\"https://en.wikipedia.org/wiki/Dennis_the_Menace_and_Gnasher\">Dennis the Menace</a>!</p>\n\n<ol>\n<li>\n<p>My instinct is that we'll end up with something <a href=\"https://arxiv.org/abs/2402.03239\">ATProto based</a> as it's so convenient for <a href=\"https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/\">distributed system authentication</a>.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>For the past few years, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been working with our colleagues in\n<a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> to do <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">analysis at scale</a> on the\nacademic literature. Getting local access to millions of fulltext papers has not\nbeen without drama, but was made possible thanks to huge amounts of help from our\n<a href=\"https://www.lib.cam.ac.uk/\">University Library</a> who helped us navigate our\nrelationships with scientific publishers. We have just <strong><a href=\"https://rdcu.be/evkfj\">published a comment\nin Nature</a></strong> about the next phase\nof our research, where we are looking into the impact of AI advances on evidence synthesis.</p>\n<p><a href=\"https://rdcu.be/evkfj\"> \n<img alt=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\" src=\"https://anil.recoil.org/images/davidparkins-ai-poison.webp\" title=\"AI poisoning the literature in a legendary cartoon. Credit: David Parkins, Nature\">\nAI poisoning the literature in a legendary cartoon. 
Credit: David Parkins, Nature </a></p>\n<p>Our work on literature reviews led us into assessing methods for <a href=\"https://royalsociety.org/news-resources/projects/evidence-synthesis/\">evidence\nsynthesis</a>\n(which is crucial to rational policymaking!) and specifically about how recent advances in AI may\nimpact it. The current methods for <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">rigorous systematic literature review</a> are expensive and slow, and authors are already struggling to keep up with the <a href=\"https://ourworldindata.org/grapher/scientific-and-technical-journal-articles?time=latest\">rapidly expanding</a>\nnumber of legitimate papers. Adding to this, <a href=\"https://retractionwatch.com/2025/\">paper retractions</a> are increasing nearly\n<a href=\"https://www.nature.com/articles/d41586-023-03974-8\">exponentially</a> and already\nsystematic reviews <a href=\"https://retractionwatch.com/the-retraction-watch-leaderboard/top-10-most-highly-cited-retracted-papers/\">unknowingly cite</a>\nretracted papers, with most remaining uncorrected even a year after notification!</p>\n<p>This is all made much more complex as LLMs are flooding the landscape with\nconvincing fake manuscripts and doctored data, potentially overwhelming our\ncurrent ability to distinguish fact from fiction. Just this March, the <a href=\"https://sakana.ai/ai-scientist/\">AI\nScientist</a> formulated hypotheses, designed and\nran experiments, analysed the results, generated the figures and produced a\nmanuscript that <a href=\"https://sakana.ai/ai-scientist-first-publication/\">passed human peer\nreview</a> for an ICLR\nworkshop! Distinguishing genuine papers from those produced by LLMs isn't just\na problem for review authors; it's a threat to the very foundation of\nscientific knowledge. 
And meanwhile, Google is taking a different tack with a\ncollaborative <a href=\"https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/\">AI co-scientist</a> which acts as a multi-agent assistant.</p>\n<p>So the landscape is moving <em>really</em> quickly! Our proposal for the future of\nliterature reviews builds on our desire to move towards a more regional,\nfederated network approach. Instead of having giant repositories of knowledge\nthat <a href=\"https://en.wikipedia.org/wiki/2025_United_States_government_online_resource_removals\">may be erased unilaterally</a>,\nwe're aiming for a more bilateral network of "living evidence databases".\nEvery government, especially those in the Global South, should have the ability to build their\nown "<a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">national data libraries</a>" which represent the body\nof digital data that affects their own regional needs.</p>\n<p>This system of living evidence databases can be incremental and dynamically\nupdated, and AI assistance can be used as long as humans remain in-the-loop.\nSuch a system can continuously gather, screen, and index literature,\nautomatically removing compromised studies and recalculating results. We're\nworking on this on multiple fronts this year, ranging from the computer science\nto figure out the distributed-nitty-gritty <a href=\"https://anil.recoil.org/#fn-1\">[1]</a>, over to working with the\n<a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">GEOBON folk</a> on global biodiversity <a href=\"https://www.tunbury.org/2025/07/02/bon-in-a-box/\">data\nmanagement</a>, and continuing\nto drive the core LED design at Conservation Evidence. 
</p>\n<p>Read our <a href=\"https://www.nature.com/articles/d41586-025-02069-w\">Nature Comment piece</a> (<a href=\"https://www.linkedin.com/posts/anilmadhavapeddy_will-ai-speed-up-literature-reviews-or-derail-activity-7348317711002705920-Y5UT?rcm=ACoAAAB0Kb0BNo1v6ylsGU2NtPa95mj-w1VcaJA\">comment on LI</a>) to learn more about how we think we can safeguard evidence synthesis against the rising tide of "AI-poisoned literature" and ensure the continued integrity of scientific discovery. As a random bit of trivia, the incredibly cool artwork in the piece was drawn by the legendary <a href=\"https://www.davidparkins.com/\">David Parkins</a>, who also drew <a href=\"https://www.beano.com/\">Beano</a> and <a href=\"https://en.wikipedia.org/wiki/Dennis_the_Menace_and_Gnasher\">Dennis the Menace</a>!</p>\n\n<ol>\n<li>\n<p>My instinct is that we'll end up with something <a href=\"https://arxiv.org/abs/2402.03239\">ATProto based</a> as it's so convenient for <a href=\"https://www.tunbury.org/2025/04/25/bluesky-ssh-authentication/\">distributed system authentication</a>.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
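The "living evidence database" loop described in the note above (continuously ingest literature, drop retracted or compromised studies, recalculate results) can be sketched in miniature as follows. The class, the DOIs, the effect-size numbers, and the plain-mean synthesis step are all illustrative assumptions, not the actual Conservation Evidence design.

```python
# Illustrative sketch of a living evidence database: papers arrive
# incrementally, a retraction feed removes compromised studies, and the
# evidence summary is recalculated from whatever survives.
from dataclasses import dataclass, field

@dataclass
class LivingEvidenceDB:
    studies: dict = field(default_factory=dict)   # doi -> reported effect size
    retracted: set = field(default_factory=set)

    def ingest(self, doi: str, effect_size: float) -> None:
        # Retracted papers stay excluded even if a crawler re-submits them.
        if doi not in self.retracted:
            self.studies[doi] = effect_size

    def apply_retractions(self, dois) -> None:
        # Drop compromised studies and remember them permanently.
        self.retracted.update(dois)
        for doi in dois:
            self.studies.pop(doi, None)

    def summary(self) -> float:
        # A real synthesis would weight by study quality and design; a
        # plain mean is enough to show the recalculation step.
        if not self.studies:
            return 0.0
        return sum(self.studies.values()) / len(self.studies)

db = LivingEvidenceDB()
db.ingest("10.1/a", 0.5)
db.ingest("10.1/b", 1.0)
print(db.summary())            # 0.75
db.apply_retractions(["10.1/b"])
print(db.summary())            # 0.5, recomputed after the retracted study is removed
```

The point of the sketch is the incremental recomputation: a retraction notice changes the synthesis automatically rather than waiting for a review to be redone by hand.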
+18
avsm/notes_ai-should-unite-conservation.json
+18
avsm/notes_ai-should-unite-conservation.json
···+"summary": "<p>I had a tremendous time participating in last year's <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon\">horizon scan of AI and Conservation</a>, which laid out the opportunities that technological progress from AI (a catchall phrase here) could bring to hard-working conservation practitioners. Since then, there's been a lot of corridor conversations about future projects (and even <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">dinner with the Wildlife Trusts</a>). However, there has also been discussion about the potential <em>harms</em> of our work, most notably in a <a href=\"https://www.sciencedirect.com/science/article/pii/S0169534725000588\">response letter</a> to our paper written by <a href=\"https://experts.exeter.ac.uk/42389-katie-murray/about\">Katie Murray</a> and colleagues.</p>\n<p>Murray et al. make two really important points:</p>\n<blockquote>\n<ul>\n<li>[...] importance of ecological expertise must be recognised as much more than just the expert annotation of training data</li>\n<li>[...] 
effort should be made to build capacity for AI development in the Global South, so that the rewards of successful research can be shared\n-- <a href=\"https://www.sciencedirect.com/science/article/pii/S0169534725000588\">The potential for AI to divide conservation</a></li>\n</ul>\n</blockquote>\n<p>The co-authors of the original horizon scan and I could not agree more with this statement, and <a href=\"https://samreynolds.org/\">Sam Reynolds</a> led us to publish a response-to-the-response <a href=\"https://anil.recoil.org/#fn-1\">[1]</a> dubbed "<a href=\"https://anil.recoil.org/papers/2025-conservation-div\">Conservation changed but not divided</a>".</p>\n<p><a href=\"https://authors.elsevier.com/a/1k%7ESZcZ3X3uxK\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/cam-nature-3.webp\" title=\"\">\n </a></p>\n<p>In our response, we note that:</p>\n<blockquote>\n<p>We agree wholeheartedly with these points and recognise that the task of equitable integration of AI into conservation is beyond the scope of any single group and requires collective action.</p>\n<p>[...] Developers of AI tools have a foundational role to play in delivering an equitable AI landscape. Technologies disconnected from pragmatic ecological, cultural, and socioeconomic factors are unlikely to advance the field [...]</p>\n<p>[...] Developers should adopt participatory design and development principles, identifying conservation actors to guide the process, designing [...] protocols that respect cultural sensitivities and Indigenous and local knowledge [...]</p>\n<p>[...] All tools should be open source and thoroughly documented, so that they can be easily adapted for local contexts.\n-- <a href=\"https://anil.recoil.org/papers/2025-conservation-div\">Conservation changed but not divided</a></p>\n</blockquote>\n<p>Many thanks to Katie Murray and colleagues for taking the trouble to call out the issues in our original paper! 
Both of the letters will be side-by-side in the next issue of <a href=\"https://www.cell.com/trends/ecology-evolution/home\">Trends in Ecology and Evolution</a> and, of course, we welcome any more perspectives about either of these.</p>\n<h1><a href=\"https://anil.recoil.org/#cambridge-goes-full-ai\"></a>Cambridge goes full AI</h1>\n<p>This whole discourse also happened at the same time as Cambridge <a href=\"https://www.cam.ac.uk/topics/artificial-intelligence\">dove</a> in with a major piece about <a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\">turbocharging the race to protect nature and climate with AI</a>. The piece itself (led by the brilliant <a href=\"https://uk.linkedin.com/in/jacqueline-garget-b24804214\">Jacqueline Garget</a> and <a href=\"https://anil.recoil.org/louise.walsh@admin.cam.ac.uk\">Louise Walsh</a>) covers a number of the projects in our <a href=\"https://ai.conservation.cam.ac.uk/\">AICN</a> project we started <a href=\"https://anil.recoil.org/notes/aicn-in-aicam\">last year</a>.</p>\n<p><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/cam-nature-1.webp\" title=\"\">\n </a></p>\n<p>The online story itself is a rather gorgeous layout, with pieces on:</p>\n<ul>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Land-use-planning-T1WPpYngXA\">landuse planning</a> from me about our <a href=\"https://anil.recoil.org/notes/ukri-grant-terra\">UKRI-funded</a> "Terra" project to map global plants and the impact on <a href=\"https://anil.recoil.org/papers/2024-food-life\">food supply chains</a>.</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Biodiversity-conservation-50N1jQTVIa\">biodiversity conservation</a> with <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> who leads <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> and 
whose <a href=\"https://www.youtube.com/@Bill_Sutherland\">Conservation Concepts channel</a> is a must-watch, and <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a> who have the <a href=\"https://ai.conservation.cam.ac.uk/2024/06/05/planetary-computing-fellows-michael-dales-and-sadiq-jaffer-putting-systems-to-work-to-accelerate-ecological-interventions/\">coolest job titles</a> in Cambridge.</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Climate-modelling-NdQYHh3cRP\">climate modelling</a> with Joe and Jack from the ICCS talking about <a href=\"https://github.com/Cambridge-ICCS/FTorch\">differentiable fortran</a> (I'm coorganising the <a href=\"https://anil.recoil.org/notes/propl-at-splash\">next PROPL</a> with ICCS lead <a href=\"https://dorchard.github.io\">Dominic Orchard</a> as well).</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Energy-efficient-homes-0AUJzMfjnS\">energy efficient homes</a> with <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\">Ronita Bardhan</a> (who I'm having a blast working with alongside <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> on <a href=\"https://anil.recoil.org/ideas/urban-vegetation\">urban vegetation</a>).</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Forest-monitoring-VudaoOH7Rd\">forest monitoring</a> about the work of Emily Lines and Harry Owens on forest structure reconstruction and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and Frank Feng's work on <a href=\"https://github.com/MingyueX/GreenLens\">GreenLens</a>.</li>\n</ul>\n<p>While each of these is a fascinating research project, the bit that made me stop and really think was the last <a href=\"https://www.cam.ac.uk/stories/Anil-Madhavapeddy-AI-climate-nature\">interview with me</a> about how AI could heal 
the planet. In it, I talk about conservation through a technological lens:</p>\n<blockquote>\n<p>We need to act fast to mitigate the impacts of climate change, and to protect and restore biodiversity. There\u2019s incredible potential in using AI to augment our work. It enables us to do things much more quickly \u2013 it\u2019s like giving humans global knowledge superpowers!</p>\n</blockquote>\n<p>But, after more corridor conversations with colleagues in the <a href=\"https://www.conservation.cam.ac.uk\">CCI</a><a href=\"https://anil.recoil.org/#fn-2\">[2]</a>, more important angles to this story emerged. It's really easy for us to lose sight of the fact that AI is just a piece in the puzzle; a means to an end. We must keep the focus on the giant crisis in biodiversity unfolding in front of our eyes like a slow-motion steamroller. Other pieces on the Cambridge website that cover this include <a href=\"https://www.cam.ac.uk/stories/pollinatorsriskindex\">the pollinator risk index</a> (with <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a>), <a href=\"https://www.cam.ac.uk/research/news/pledge-to-phase-out-toxic-lead-ammunition-in-uk-hunting-by-2025-has-failed\">lead poisoning of grouse</a>, <a href=\"https://www.cam.ac.uk/research/news/uk-peatland-fires-are-supercharging-carbon-emissions-as-climate-change-causes-hotter-drier-summers\">carbon emissions from peatland fires</a>, or the risks of <a href=\"https://www.cam.ac.uk/research/news/restoring-wildlife-habitats-in-wealthy-nations-could-drive-extinctions-in-species-rich-regions\">biodiversity leakage</a> causing extinctions (by <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>).\nIt's not all bad news of course! 
Cambridge has also covered <a href=\"https://www.cam.ac.uk/research/news/thriving-antarctic-ecosystems-found-following-iceberg-calving\">thriving Antarctic ecosystems</a> and <a href=\"https://www.cam.ac.uk/stories/conservation-success-stories\">success stories in species restoration</a>.</p>\n<p>My overall concerns with the current central University (and world's) focus on AI stem from:</p>\n<ul>\n<li>the "distraction effect" caused by AI. If every conversation begins with 'artificial intelligence', then we lose track of the goal, which is to protect what remains of the natural world while making access to it as equitable as possible to every human who lives on this planet. In the past few months, I've had four separate meetings with groups in the CCI where I've been explaining things like <a href=\"https://modelcontextprotocol.io/introduction\">MCP</a> to a bunch of conservation practitioners who, frankly, should not have to keep up with this incredibly fast-moving field in order to submit funding proposals in their own areas of core expertise on the natural world.</li>\n<li>the "leakage effect" to funding caused by AI. While almost anything AI-related has a better chance of getting dosh right now, conventional conservation work is being undermined as a result. But this in turn chokes out the lifeblood of AI -- the data that trains the models we build! I also noticed that <a href=\"https://rich-turner-group.github.io/\">Richard Turner</a> made the same point about his recent <a href=\"https://www.cam.ac.uk/research/news/fully-ai-driven-weather-prediction-system-could-start-revolution-in-forecasting\">revolutionary climate model</a>, where he observes that <em>"Aardvark would not have been possible without decades of physical-model development by the community, and we are particularly indebted to ECMWF for their ERA5 dataset which is essential for training Aardvark"</em>. 
The same is true for conservation.</li>\n<li>the "credit effect", which ascribes all advances to AI rather than to the hard work of a global community. I noticed this in Demis Hassabis' recent <a href=\"https://www.cam.ac.uk/stories/demis-hassabis-AI-Cambridge\">talk on his Nobel prize</a>, where Alphafold was made possible mainly by a <a href=\"https://www.statnews.com/2025/01/07/casp-protein-structure-prediction-competition-after-alphafold/\">decades-long competition</a> organised by computational and experimental chemists. Whole cohorts of scientists held their latest results back for a year in order to allow the models to have benchmarks.</li>\n<li>the "fashion effect", whereby conservation interventions that might last decades (not <a href=\"https://golarainforest.org/grnp-history\">uncommon</a> in nature restoration projects) are forced to lurch between the latest topic of the week. <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> noted that plastic pollution was another example of how precious political attention was diverted suddenly; there was a BBC documentary with heart-breaking images of <a href=\"https://www.youtube.com/watch?v=EjIUp6A7GRU\">plastic pollution killing dolphins</a> and suddenly all attention was on <a href=\"https://www.economist.com/international/2018/03/03/the-known-unknowns-of-plastic-pollution\">eliminating them</a> at <a href=\"https://www.gov.uk/government/news/gove-takes-action-to-ban-plastic-straws-stirrers-and-cotton-buds\">all costs</a>. This isn't to say that banning plastic straws was bad (quite the opposite!), but that we must also consider biodiversity impacts holistically and continue to fund <a href=\"https://pubmed.ncbi.nlm.nih.gov/33213887/\">broad picture work</a> as well as the 'charismatic topic of the week'. 
Alec, for example, held a fantastic workshop last year about the use of <a href=\"https://pubmed.ncbi.nlm.nih.gov/35979694/\">OSINT</a> for establishing the bigger picture in ecosystem management.</li>\n</ul>\n<p><a href=\"https://www.cam.ac.uk/stories/Anil-Madhavapeddy-AI-climate-nature\"> \n<img alt=\"Would you trust this man with your garden? What&apos;s that? Yes? Yes you would?\" src=\"https://anil.recoil.org/images/cam-nature-2.webp\" title=\"Would you trust this man with your garden? What&apos;s that? Yes? Yes you would?\">\nWould you trust this man with your garden? What's that? Yes? Yes you would? </a></p>\n<h2><a href=\"https://anil.recoil.org/#telling-the-story-from-the-conservation-perspective\"></a>Telling the story from the conservation perspective</h2>\n<p>I only really started thinking about this properly <em>after</em> talking to Jacqueline and Sam, so I'm grateful to them for sparking this chain of thought. I've started reading how other organisations (such as MacArthur's <a href=\"https://www.macfound.org/programs/field-support/technology-public-interest/\">Technology for the Public Good</a>) discuss the role of technology in societal domains, and would be grateful for any pointers to similar initiatives in conservation.</p>\n<p>I would also dearly love to see a roundup of all the Cambridge <a href=\"https://www.cam.ac.uk/news/environment\">environmental coverage</a> in one place, perhaps on the <a href=\"https://www.conservation.cam.ac.uk/\">Conservation Research Institute</a> pages, told as a cohesive story from the perspective of the nature research itself, and not the technology that enables just a part of it. If you're an undergraduate looking for something to do this summer, especially from the social sciences or journalism, do get in touch and I'd be delighted to work with you on this for an internship! 
Or maybe this is something for the first edition of the <a href=\"https://anil.recoil.org/notes/cambridge-green-blue\">Cambridge Green Blue</a> competition to assemble next year...</p>\n<p>Thanks to <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-william-morgan\">William Morgan</a>, <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/ashley-simkins\">Ash Simkins</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> for corrections and suggestions to this post! <em>7th May 2025:</em> See also <a href=\"https://anil.recoil.org/notes/humans-save-nature-not-ai\">a follow-up article</a> on this by <a href=\"https://www.communications.cam.ac.uk/our-team\">Jacqueline Garget</a>.</p>\n\n<ol>\n<li>\n<p>As an aside, I love this long-form, carefully considered mechanism for scholarly discussion, as espoused by the letter back-and-forth in a journal. I wish we had more of this in computer science rather than social media arguments that disappear like tears in the rain just a few scrolls later.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Amusingly, they were triggered by an accidental reply-all from me to the whole building rather than a private reply. I hold that this is the best way to start a real conversation!</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>I had a tremendous time participating in last year's <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon\">horizon scan of AI and Conservation</a>, which laid out the opportunities that technological progress from AI (a catchall phrase here) could bring to hard-working conservation practitioners. 
Since then, there have been a lot of corridor conversations about future projects (and even <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">dinner with the Wildlife Trusts</a>). However, there has also been discussion about the potential <em>harms</em> of our work, most notably in a <a href=\"https://www.sciencedirect.com/science/article/pii/S0169534725000588\">response letter</a> to our paper written by <a href=\"https://experts.exeter.ac.uk/42389-katie-murray/about\">Katie Murray</a> and colleagues.</p>\n<p>Murray et al. make two really important points:</p>\n<blockquote>\n<ul>\n<li>[...] importance of ecological expertise must be recognised as much more than just the expert annotation of training data</li>\n<li>[...] effort should be made to build capacity for AI development in the Global South, so that the rewards of successful research can be shared\n-- <a href=\"https://www.sciencedirect.com/science/article/pii/S0169534725000588\">The potential for AI to divide conservation</a></li>\n</ul>\n</blockquote>\n<p>My co-authors on the original horizon scan and I could not agree more with these points, and <a href=\"https://samreynolds.org/\">Sam Reynolds</a> led us to publish a response-to-the-response <a href=\"https://anil.recoil.org/#fn-1\">[1]</a> dubbed "<a href=\"https://anil.recoil.org/papers/2025-conservation-div\">Conservation changed but not divided</a>".</p>\n<p><a href=\"https://authors.elsevier.com/a/1k%7ESZcZ3X3uxK\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/cam-nature-3.webp\" title=\"\">\n </a></p>\n<p>In our response, we note that:</p>\n<blockquote>\n<p>We agree wholeheartedly with these points and recognise that the task of equitable integration of AI into conservation is beyond the scope of any single group and requires collective action.</p>\n<p>[...] Developers of AI tools have a foundational role to play in delivering an equitable AI landscape. 
Technologies disconnected from pragmatic ecological, cultural, and socioeconomic factors are unlikely to advance the field [...]</p>\n<p>[...] Developers should adopt participatory design and development principles, identifying conservation actors to guide the process, designing [...] protocols that respect cultural sensitivities and Indigenous and local knowledge [...]</p>\n<p>[...] All tools should be open source and thoroughly documented, so that they can be easily adapted for local contexts.\n-- <a href=\"https://anil.recoil.org/papers/2025-conservation-div\">Conservation changed but not divided</a></p>\n</blockquote>\n<p>Many thanks to Katie Murray and colleagues for taking the trouble to call out the issues in our original paper! Both of the letters will be side-by-side in the next issue of <a href=\"https://www.cell.com/trends/ecology-evolution/home\">Trends in Ecology and Evolution</a> and, of course, we welcome any more perspectives about either of these.</p>\n<h1><a href=\"https://anil.recoil.org/#cambridge-goes-full-ai\"></a>Cambridge goes full AI</h1>\n<p>This whole discourse also happened at the same time as Cambridge <a href=\"https://www.cam.ac.uk/topics/artificial-intelligence\">dove</a> in with a major piece about <a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\">turbocharging the race to protect nature and climate with AI</a>. 
The piece itself (led by the brilliant <a href=\"https://uk.linkedin.com/in/jacqueline-garget-b24804214\">Jacqueline Garget</a> and <a href=\"https://anil.recoil.org/louise.walsh@admin.cam.ac.uk\">Louise Walsh</a>) covers a number of the projects in our <a href=\"https://ai.conservation.cam.ac.uk/\">AICN</a> project that we started <a href=\"https://anil.recoil.org/notes/aicn-in-aicam\">last year</a>.</p>\n<p><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/cam-nature-1.webp\" title=\"\">\n </a></p>\n<p>The online story itself has a rather gorgeous layout, with pieces on:</p>\n<ul>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Land-use-planning-T1WPpYngXA\">land-use planning</a> from me about our <a href=\"https://anil.recoil.org/notes/ukri-grant-terra\">UKRI-funded</a> "Terra" project to map global plants and the impact on <a href=\"https://anil.recoil.org/papers/2024-food-life\">food supply chains</a>.</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Biodiversity-conservation-50N1jQTVIa\">biodiversity conservation</a> with <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> who leads <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> and whose <a href=\"https://www.youtube.com/@Bill_Sutherland\">Conservation Concepts channel</a> is a must-watch, and <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a> who have the <a href=\"https://ai.conservation.cam.ac.uk/2024/06/05/planetary-computing-fellows-michael-dales-and-sadiq-jaffer-putting-systems-to-work-to-accelerate-ecological-interventions/\">coolest job titles</a> in Cambridge.</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Climate-modelling-NdQYHh3cRP\">climate modelling</a> with Joe and Jack from the ICCS talking about <a 
href=\"https://github.com/Cambridge-ICCS/FTorch\">differentiable Fortran</a> (I'm co-organising the <a href=\"https://anil.recoil.org/notes/propl-at-splash\">next PROPL</a> with ICCS lead <a href=\"https://dorchard.github.io\">Dominic Orchard</a> as well).</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Energy-efficient-homes-0AUJzMfjnS\">energy-efficient homes</a> with <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\">Ronita Bardhan</a> (who I'm having a blast working with alongside <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> on <a href=\"https://anil.recoil.org/ideas/urban-vegetation\">urban vegetation</a>).</li>\n<li><a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature#section-Forest-monitoring-VudaoOH7Rd\">forest monitoring</a> about Emily Lines and Harry Owens' work on forest structure reconstruction and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and Frank Feng's work on <a href=\"https://github.com/MingyueX/GreenLens\">GreenLens</a>.</li>\n</ul>\n<p>While each of these is a fascinating research project, the bit that made me stop and really think was the last <a href=\"https://www.cam.ac.uk/stories/Anil-Madhavapeddy-AI-climate-nature\">interview with me</a> about how AI could help heal the planet. In it, I talk about conservation through a technological lens:</p>\n<blockquote>\n<p>We need to act fast to mitigate the impacts of climate change, and to protect and restore biodiversity. There\u2019s incredible potential in using AI to augment our work. It enables us to do things much more quickly \u2013 it\u2019s like giving humans global knowledge superpowers!</p>\n</blockquote>\n<p>But, after more corridor conversations with colleagues in the <a href=\"https://www.conservation.cam.ac.uk\">CCI</a><a href=\"https://anil.recoil.org/#fn-2\">[2]</a>, more important angles to this story emerged. 
It's really easy for us to lose sight of the fact that AI is just a piece of the puzzle: a means to an end. We must keep the focus on the giant crisis in biodiversity unfolding in front of our eyes like a slow-motion steamroller. Other pieces on the Cambridge website that cover this include <a href=\"https://www.cam.ac.uk/stories/pollinatorsriskindex\">the pollinator risk index</a> (with <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a>), <a href=\"https://www.cam.ac.uk/research/news/pledge-to-phase-out-toxic-lead-ammunition-in-uk-hunting-by-2025-has-failed\">lead poisoning of grouse</a>, <a href=\"https://www.cam.ac.uk/research/news/uk-peatland-fires-are-supercharging-carbon-emissions-as-climate-change-causes-hotter-drier-summers\">carbon emissions from peatland fires</a>, or the risks of <a href=\"https://www.cam.ac.uk/research/news/restoring-wildlife-habitats-in-wealthy-nations-could-drive-extinctions-in-species-rich-regions\">biodiversity leakage</a> causing extinctions (by <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>).\nIt's not all bad news, of course! Cambridge has also covered <a href=\"https://www.cam.ac.uk/research/news/thriving-antarctic-ecosystems-found-following-iceberg-calving\">thriving Antarctic ecosystems</a> and <a href=\"https://www.cam.ac.uk/stories/conservation-success-stories\">success stories in species restoration</a>.</p>\n<p>My overall concerns with the current central University (and the world's) focus on AI stem from:</p>\n<ul>\n<li>the "distraction effect" caused by AI. If every conversation begins with 'artificial intelligence', then we lose track of the goal, which is to protect what remains of the natural world while making access to it as equitable as possible to every human who lives on this planet. 
In the past few months, I've had four separate meetings with groups in the CCI where I've been explaining things like <a href=\"https://modelcontextprotocol.io/introduction\">MCP</a> to a bunch of conservation practitioners who, frankly, should not have to keep up with this incredibly fast-moving field in order to submit funding proposals in their own areas of core expertise on the natural world.</li>\n<li>the "leakage effect" to funding caused by AI. While almost anything AI-related has a better chance of getting dosh right now, conventional conservation work is being undermined as a result. But this in turn chokes out the lifeblood of AI -- the data that trains the models we build! I noticed also that <a href=\"https://rich-turner-group.github.io/\">Richard Turner</a> made the same point about his recent <a href=\"https://www.cam.ac.uk/research/news/fully-ai-driven-weather-prediction-system-could-start-revolution-in-forecasting\">revolutionary climate model</a>, where he observes that <em>"Aardvark would not have been possible without decades of physical-model development by the community, and we are particularly indebted to ECMWF for their ERA5 dataset which is essential for training Aardvark"</em>. The same is true for conservation.</li>\n<li>the "credit effect", which ascribes all advances to AI rather than the hard work of a global community. I noticed this in Demis Hassabis' recent <a href=\"https://www.cam.ac.uk/stories/demis-hassabis-AI-Cambridge\">talk on his Nobel prize</a>, where AlphaFold was mainly made possible by a <a href=\"https://www.statnews.com/2025/01/07/casp-protein-structure-prediction-competition-after-alphafold/\">decades-long competition</a> organised by computational and experimental chemists. 
Whole cohorts of scientists held their latest results back for a year in order to allow the models to have benchmarks.</li>\n<li>the "fashion effect", whereby conservation interventions that might last decades (not <a href=\"https://golarainforest.org/grnp-history\">uncommon</a> in nature restoration projects) are forced to lurch between the latest topic of the week. <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> noted that plastic pollution was another example of how precious political attention was diverted suddenly; there was a BBC documentary with heart-breaking images of <a href=\"https://www.youtube.com/watch?v=EjIUp6A7GRU\">plastic pollution killing dolphins</a> and suddenly all attention was on <a href=\"https://www.economist.com/international/2018/03/03/the-known-unknowns-of-plastic-pollution\">eliminating them</a> at <a href=\"https://www.gov.uk/government/news/gove-takes-action-to-ban-plastic-straws-stirrers-and-cotton-buds\">all costs</a>. This isn't to say that banning plastic straws was bad (quite the opposite!), but that we must also consider biodiversity impacts holistically and continue to fund <a href=\"https://pubmed.ncbi.nlm.nih.gov/33213887/\">broad picture work</a> as well as the 'charismatic topic of the week'. Alec, for example, held a fantastic workshop last year about the use of <a href=\"https://pubmed.ncbi.nlm.nih.gov/35979694/\">OSINT</a> for establishing the bigger picture in ecosystem management.</li>\n</ul>\n<p><a href=\"https://www.cam.ac.uk/stories/Anil-Madhavapeddy-AI-climate-nature\"> \n<img alt=\"Would you trust this man with your garden? What&apos;s that? Yes? Yes you would?\" src=\"https://anil.recoil.org/images/cam-nature-2.webp\" title=\"Would you trust this man with your garden? What&apos;s that? Yes? Yes you would?\">\nWould you trust this man with your garden? What's that? Yes? Yes you would? 
</a></p>\n<h2><a href=\"https://anil.recoil.org/#telling-the-story-from-the-conservation-perspective\"></a>Telling the story from the conservation perspective</h2>\n<p>I only really started thinking about this properly <em>after</em> talking to Jacqueline and Sam, so I'm grateful to them for sparking this chain of thought. I've started reading how other organisations (such as MacArthur's <a href=\"https://www.macfound.org/programs/field-support/technology-public-interest/\">Technology for the Public Good</a>) discuss the role of technology in societal domains, and would be grateful for any pointers to similar initiatives in conservation.</p>\n<p>I would also dearly love to see a roundup of all the Cambridge <a href=\"https://www.cam.ac.uk/news/environment\">environmental coverage</a> in one place, perhaps on the <a href=\"https://www.conservation.cam.ac.uk/\">Conservation Research Institute</a> pages, told as a cohesive story from the perspective of the nature research itself, and not the technology that enables just a part of it. If you're an undergraduate looking for something to do this summer, especially from the social sciences or journalism, do get in touch and I'd be delighted to work with you on this for an internship! Or maybe this is something for the first edition of the <a href=\"https://anil.recoil.org/notes/cambridge-green-blue\">Cambridge Green Blue</a> competition to assemble next year...</p>\n<p>Thanks to <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/dr-william-morgan\">William Morgan</a>, <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/ashley-simkins\">Ash Simkins</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> for corrections and suggestions to this post! 
<em>7th May 2025:</em> See also <a href=\"https://anil.recoil.org/notes/humans-save-nature-not-ai\">a follow-up article</a> on this by <a href=\"https://www.communications.cam.ac.uk/our-team\">Jacqueline Garget</a>.</p>\n\n<ol>\n<li>\n<p>As an aside, I love this long-form, carefully considered mechanism for scholarly discussion, as espoused by the letter back-and-forth in a journal. I wish we had more of this in computer science rather than social media arguments that disappear like tears in the rain just a few scrolls later.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>Amusingly, they were triggered by an accidental reply-all from me to the whole building rather than a private reply. I hold that this is the best way to start a real conversation!</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/notes_aicam-interview-ce.json
···+"summary": "<p>I talked to the <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> team to discuss our <a href=\"https://anil.recoil.org/notes/aicn-in-aicam\">AICN</a>\nproject and what we're planning to do in the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> team.</p>\n<blockquote>\n<p>Over the last two decades, the University of Cambridge-based project Conservation Evidence has screened more than 1.6 million scientific papers on conservation, as well as manually summarising 8,600+ studies relating to conservation actions. However, the current project\u2019s work is limited by the specialised skills needed to screen and summarise relevant studies. It took more than 75 person-years to manually curate the current database and only a few hundred papers can be added each year. By accelerating these efforts, AI has the potential to transform the impact this database has on biodiversity conservation.</p>\n<p>What we\u2019re aiming to do through the ai@cam project \u2013 bringing together an interdisciplinary team from across the fields of computer science, ecology, climate and conservation \u2013 is to build up models of the world that are really detailed and that can be queried by policy makers to help make informed decisions.</p>\n<p>-- <a href=\"https://ai.cam.ac.uk/blog/harnessing-the-power-of-ai-to-help-save-our-planet\">AI@Cam</a></p>\n</blockquote>",+"content": "<p>I talked to the <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> team to discuss our <a href=\"https://anil.recoil.org/notes/aicn-in-aicam\">AICN</a>\nproject and what we're planning to do in the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> team.</p>\n<blockquote>\n<p>Over the last two decades, the University of Cambridge-based project Conservation Evidence has screened more than 1.6 million scientific papers on conservation, as well as manually summarising 8,600+ studies relating to conservation actions. 
However, the current project\u2019s work is limited by the specialised skills needed to screen and summarise relevant studies. It took more than 75 person-years to manually curate the current database and only a few hundred papers can be added each year. By accelerating these efforts, AI has the potential to transform the impact this database has on biodiversity conservation.</p>\n<p>What we\u2019re aiming to do through the ai@cam project \u2013 bringing together an interdisciplinary team from across the fields of computer science, ecology, climate and conservation \u2013 is to build up models of the world that are really detailed and that can be queried by policy makers to help make informed decisions.</p>\n<p>-- <a href=\"https://ai.cam.ac.uk/blog/harnessing-the-power-of-ai-to-help-save-our-planet\">AI@Cam</a></p>\n</blockquote>",
+18
avsm/notes_aicn-in-aicam.json
···+"summary": "<p>We won the <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch\">AI@CAM challenge</a> that was sent out\nUniversity-wide to find research projects that use AI to tackle society's biggest challenges.\nOur project on using <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch#section-9RKgEyI2LZ\">AI for climate and nature</a>\nis one of the five selected.</p>\n<blockquote>\n<p>The twin climate and biodiversity crises are two of the world\u2019s most complex challenges to tackle. This project aims to develop AI approaches for bringing together a wide range of datasets and accelerating the collation of information.</p>\n<p>This work will provide up to date, relevant and robust information for researchers and decision-makers working on climate and biodiversity conservation \u2013 opening up the possibility for more targeted and effective solutions to some of our world\u2019s most pressing climate and biodiversity challenges.</p>\n<p><a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a>, AI-deas challenge co-lead, said: 'Mitigating the impacts of climate change while maintaining and restoring biodiversity demands urgent, evidence-based action. 
We're excited to bring together an interdisciplinary team across computer science, ecology, climate and conservation to use AI to empower decision-makers to equitably tackle the biggest challenge of our generation.'</p>\n<p>-- <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch#section-9RKgEyI2LZ\">AI@CAM</a></p>\n</blockquote>\n<p>This project is a collaboration between lots of friendly people at Cambridge Zero, the Cambridge Conservation Initiative, Conservation Evidence, the Institute for Computing for Climate Science, Conservation Research Institute, Centre for Landscape Regeneration, <a href=\"https://anil.recoil.org/projects/4c\">Cambridge Centre for Carbon Credits</a> and Cambridge Centre for Earth Observation.</p>\n<p>\n<img alt=\"Team AICN in the CCI building, Feb 2024\" src=\"https://anil.recoil.org/images/aicn-team-feb24.webp\" title=\"Team AICN in the CCI building, Feb 2024\">\nTeam AICN in the CCI building, Feb 2024</p>",+"content": "<p>We won the <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch\">AI@CAM challenge</a> that was sent out\nUniversity-wide to find research projects that use AI to tackle society's biggest challenges.\nOur project on using <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch#section-9RKgEyI2LZ\">AI for climate and nature</a>\nis one of the five selected.</p>\n<blockquote>\n<p>The twin climate and biodiversity crises are two of the world\u2019s most complex challenges to tackle. 
This project aims to develop AI approaches for bringing together a wide range of datasets and accelerating the collation of information.</p>\n<p>This work will provide up to date, relevant and robust information for researchers and decision-makers working on climate and biodiversity conservation \u2013 opening up the possibility for more targeted and effective solutions to some of our world\u2019s most pressing climate and biodiversity challenges.</p>\n<p><a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a>, AI-deas challenge co-lead, said: 'Mitigating the impacts of climate change while maintaining and restoring biodiversity demands urgent, evidence-based action. We're excited to bring together an interdisciplinary team across computer science, ecology, climate and conservation to use AI to empower decision-makers to equitably tackle the biggest challenge of our generation.'</p>\n<p>-- <a href=\"https://www.cam.ac.uk/stories/AI-deas-launch#section-9RKgEyI2LZ\">AI@CAM</a></p>\n</blockquote>\n<p>This project is a collaboration between lots of friendly people at Cambridge Zero, the Cambridge Conservation Initiative, Conservation Evidence, the Institute for Computing for Climate Science, Conservation Research Institute, Centre for Landscape Regeneration, <a href=\"https://anil.recoil.org/projects/4c\">Cambridge Centre for Carbon Credits</a> and Cambridge Centre for Earth Observation.</p>\n<p>\n<img alt=\"Team AICN in the CCI building, Feb 2024\" src=\"https://anil.recoil.org/images/aicn-team-feb24.webp\" title=\"Team AICN in the CCI building, Feb 2024\">\nTeam AICN in the CCI building, Feb 2024</p>",
+18
avsm/notes_announcing-mirageos-1-2.json
···+"summary": "<p>I announce a point release of MirageOS 1.x, and the exciting run up to the major MirageOS 2.0 release which has lots of new features. The number of Mirage users is growing steadily!</p>",+"content": "<p>I announce a point release of MirageOS 1.x, and the exciting run up to the major MirageOS 2.0 release which has lots of new features. The number of Mirage users is growing steadily!</p>",
+18
avsm/notes_announcing-mirageos-2.json
···+"summary": "<p>This is a big release for us; after the first version came out earlier in the year, we added in support for ARM devices, a new storage subsystem called <a href=\"https://irmin.org\">Irmin</a> and even a pure OCaml TLS stack.</p>",+"content": "<p>This is a big release for us; after the first version came out earlier in the year, we added in support for ARM devices, a new storage subsystem called <a href=\"https://irmin.org\">Irmin</a> and even a pure OCaml TLS stack.</p>",
+18
avsm/notes_announcing-ocaml-labs.json
···+"summary": "<p>I\u2019m very excited to announce <a href=\"https://anil.recoil.org/projects/ocamllabs\">OCaml Labs</a>, the latest project\nto hit the Cambridge Computer Lab. As anyone who hangs out near me\nprobably realises, I very much enjoy functional programming. My weapon\nof choice tends to be <a href=\"http://www.ocaml-lang.org\">OCaml</a>, as it\ncondenses <a href=\"http://events.inf.ed.ac.uk/Milner2012/X_Leroy-html5-mp4.html\">decades of\nresearch</a>\ninto a pragmatic blend of functional, imperative and object-oriented\nprogramming styles. What\u2019s perhaps less well known are the steady\n<a href=\"http://www.ocaml-lang.org/companies.html\">inroads</a> that OCaml has been\nmaking into mission-critical areas of industry. At <a href=\"http://ocaml.janestreet.com\">Jane\nStreet</a>, billions of dollars of\ntransactions are routed through a huge ML code-base that is designed to\ncatch bugs <a href=\"http://vimeo.com/14313378\">at compile-time</a>. At\n<a href=\"http://github.com/xen-org/xen-api\">Citrix</a>, the Xen management\ntoolstack that powers\n<a href=\"http://blogs.citrix.com/2012/10/09/one-in-a-million/\">millions</a> of\nhosts in the cloud is <a href=\"https://anil.recoil.org/papers/2010-icfp-xen.pdf\">largely written in\nOCaml</a>. Facebook does\nsophisticated <a href=\"https://github.com/facebook/pfff/wiki/Main\">static\nanalysis</a> using OCaml over\ntheir vast PHP codebase to close security holes.</p>\n<p>The OCaml community is small but dedicated, and there is always more to\ndo to improve the language and ecosystem. 
So, thanks to a generous\nplatform grant from <a href=\"http://ocaml.janestreet.com\">Jane Street</a>, we are\nlaunching a program to help with the open-source development of OCaml\nfrom Cambridge.</p>\n<p>The <em><a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/\">OCaml Labs</a></em> are\nbased in the <a href=\"http://www.cl.cam.ac.uk\">Cambridge Computer Lab</a> and led\nby myself, <a href=\"http://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and <a href=\"http://www.cl.cam.ac.uk/~iml1/\">Ian\nLeslie</a>. We\u2019re closely affiliated with\nother\n<a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/collaboration.html\">groups</a>,\nand will be:</p>\n<ul>\n<li>\n<p>developing the OCaml Platform, which will bundle the official OCaml\ncompiler from INRIA with a tested set of community libraries that\nis refreshed every six months.</p>\n</li>\n<li>\n<p>working with the core OCaml team at INRIA\u2019s\n<a href=\"http://gallium.inria.fr/\">Gallium</a> group on the compiler, and with\ncommercial partners like <a href=\"http://ocamlpro.com\">OCamlPro</a> on tool\ndevelopment. OCamlPro are making some very impressive progress\nalready with the <a href=\"http://opam.ocamlpro.com\">OPAM</a> package manager and\n<a href=\"http://www.typerex.org\">TypeRex</a> IDE helper.</p>\n</li>\n<li>\n<p>supporting the online presence with more teaching material and\ncontent. 
Yaron, Jason and I are working hard on a <a href=\"http://realworldocaml.org\">new\nbook</a> that will be published next year,\nand the OCaml Web team (led by <a href=\"http://ashishagarwal.org\">Ashish</a>\nand\n<a href=\"https://plus.google.com/109604597514379193052/posts\">Christophe</a>)\nhave made great progress on a <a href=\"http://www.ocaml-lang.org\">brand new\nwebsite</a> that we will move to the\n<code>ocaml.org</code> domain soon.</p>\n</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#research-efforts\"></a>Research efforts</h3>\n<p>Of course, it is difficult to hack on a language in a void, and we also\n<em>use</em> OCaml heavily in our own research. The other half of OCaml Labs\u2019\ngoals are more disruptive (and riskier!):</p>\n<ul>\n<li>The upcoming first beta release of <a href=\"http://openmirage.org\">Mirage</a>,\nwhich is an operating system designed for cloud and embedded\nenvironments, and is written almost entirely from the ground up in\nOCaml. The outputs of Mirage include a <a href=\"http://www.openmirage.org/blog/breaking-up-is-easy-with-opam\">large number of\nlibraries</a>\nwhich are usable separately, such as pure implementations of TCP/IP,\nDNS, SSH, DHCP and HTTP. The Xen hackers, led by <a href=\"http://dave.recoil.org\">David Scott</a>, are out in force to integrate Mirage\ninto their <a href=\"http://www.xen.org/xensummit/xs12na_talks/T2.html\">next-generation</a>\nplatform. 
Meanwhile, Raphael Proust is busy eliminating the <a href=\"https://anil.recoil.org/papers/drafts/2012-places-limel-draft1.pdf\">garbage\ncollector</a>\nwith his cut-down \u201cLinearML\u201d variant.</li>\n<li>Working with our collaborators at the <a href=\"http://horizon.ac.uk\">Horizon\nInstitute</a> on privacy-preserving technologies\nsuch as\n<a href=\"https://anil.recoil.org/papers/2012-sigcomm-signposts-demo.pdf\">Signposts</a>\nwhich let you build and maintain your own personal clouds that\noperate <a href=\"https://anil.recoil.org/papers/2011-icdcn-droplets.pdf\">autonomously</a>\nfrom the central cloud. You can read more about our <a href=\"http://www.cam.ac.uk/research/features/privacy-by-design/\">privacy-by-design</a> philosophy too.</li>\n<li>Extending OCaml to run on secure hardware platforms that don\u2019t\ncompromise on performance, using the MIPS64-based <a href=\"http://www.cl.cam.ac.uk/research/security/ctsrd/cheri.html\">capability\nprocessor</a>\nthat is being developed at the Lab.</li>\n<li>The <a href=\"http://www.trilogy-project.org\">Trilogy</a> project was a hugely\nsuccessful EU-funded effort on future evolution of the Internet, and\nresulted in <a href=\"http://trilogy-project.org/publications/standards-contributions.html\">numerous\nRFCs</a>\non subjects such as multipath-TCP. We\u2019re participating in the\nfollow-up (imaginatively dubbed \u201cTrilogy2\u201d), and look forward to\nworking on more structured abstractions for programming large-scale\nnetworks.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#getting-involved\"></a>Getting involved</h3>\n<p>So, how can you get involved? We are initially advertising three\npositions for full-time developers and researchers\n(<a href=\"http://www.jobs.cam.ac.uk/job/-21662/\">junior</a> and\n<a href=\"http://www.jobs.cam.ac.uk/job/-21942/\">senior</a>) to help us get started\nwith the OCaml Platform and compiler development. 
These aren\u2019t\nconventional pure research jobs, and a successful candidate should enjoy\nthe open-source development cycle (you retain your own copyright for\nyour own projects). The Computer Lab offers a pretty unique environment:\na friendly, non-hierarchical group in a beautiful city, and some of the\nbest faculty and students you could hope to hang out with.</p>\n<p>And finally, there is a longer lead time on <a href=\"http://www.cl.cam.ac.uk/admissions/phd/\">applying for\nPhDs</a>, but this is a great time\nto get involved. When I started at the Lab in 2002, a little project\ncalled <a href=\"http://xen.org\">Xen</a> was just kicking off, and many of us had a\nwild (and oft great) time riding that wave. Get in touch with myself,\n<a href=\"http://www.cl.cam.ac.uk/~am21/\">Alan</a>,\n<a href=\"http://www.cl.cam.ac.uk/~iml1/\">Ian</a> or\n<a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon</a> soon if you are interested in\napplying! There\u2019s some more information available on the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/collaboration.html\">OCaml Labs\npages</a>\nabout options.</p>",+"content": "<p>I\u2019m very excited to announce <a href=\"https://anil.recoil.org/projects/ocamllabs\">OCaml Labs</a>, the latest project\nto hit the Cambridge Computer Lab. As anyone that hangs out near me\nprobably realises, I very much enjoy functional programming. My weapon\nof choice tends to be <a href=\"http://www.ocaml-lang.org\">OCaml</a>, as it\ncondenses <a href=\"http://events.inf.ed.ac.uk/Milner2012/X_Leroy-html5-mp4.html\">decades of\nresearch</a>\ninto a pragmatic blend of functional, imperative and object-oriented\nprogramming styles. What\u2019s perhaps less well known are the steady\n<a href=\"http://www.ocaml-lang.org/companies.html\">inroads</a> that OCaml has been\nmaking into mission-critical areas of industry. 
At <a href=\"http://ocaml.janestreet.com\">Jane\nStreet</a>, billions of dollars of\ntransactions are routed through a huge ML code-base that is designed to\ncatch bugs <a href=\"http://vimeo.com/14313378\">at compile-time</a>. At\n<a href=\"http://github.com/xen-org/xen-api\">Citrix</a>, the Xen management\ntoolstack that powers\n<a href=\"http://blogs.citrix.com/2012/10/09/one-in-a-million/\">millions</a> of\nhosts in the cloud is <a href=\"https://anil.recoil.org/papers/2010-icfp-xen.pdf\">largely written in\nOCaml</a>. Facebook does\nsophisticated <a href=\"https://github.com/facebook/pfff/wiki/Main\">static\nanalysis</a> using OCaml over\ntheir vast PHP codebase to close security holes.</p>\n<p>The OCaml community is small but dedicated, and there is always more to\ndo to improve the language and ecosystem. So, thanks to a generous\nplatform grant from <a href=\"http://ocaml.janestreet.com\">Jane Street</a>, we are\nlaunching a program to help with the open-source development of OCaml\nfrom Cambridge.</p>\n<p>The <em><a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/\">OCaml Labs</a></em> are\nbased in the <a href=\"http://www.cl.cam.ac.uk\">Cambridge Computer Lab</a> and led\nby myself, <a href=\"http://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and <a href=\"http://www.cl.cam.ac.uk/~iml1/\">Ian\nLeslie</a>. We\u2019re closely affiliated with\nother\n<a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/collaboration.html\">groups</a>,\nand will be:</p>\n<ul>\n<li>\n<p>developing the OCaml Platform, which will bundle the official OCaml\ncompiler from INRIA with a tested set of community libraries that is\nrefreshed every six months.</p>\n</li>\n<li>\n<p>working with the core OCaml team at INRIA\u2019s\n<a href=\"http://gallium.inria.fr/\">Gallium</a> group on the compiler, and with\ncommercial partners like <a href=\"http://ocamlpro.com\">OCamlPro</a> on tool\ndevelopment. 
OCamlPro are making some very impressive progress\nalready with the <a href=\"http://opam.ocamlpro.com\">OPAM</a> package manager and\n<a href=\"http://www.typerex.org\">TypeRex</a> IDE helper.</p>\n</li>\n<li>\n<p>supporting the online presence with more teaching material and\ncontent. Yaron, Jason and I are working hard on a <a href=\"http://realworldocaml.org\">new\nbook</a> that will be published next year,\nand the OCaml Web team (led by <a href=\"http://ashishagarwal.org\">Ashish</a>\nand\n<a href=\"https://plus.google.com/109604597514379193052/posts\">Christophe</a>)\nhave made great progress on a <a href=\"http://www.ocaml-lang.org\">brand new\nwebsite</a> that we will move to the\n<code>ocaml.org</code> domain soon.</p>\n</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#research-efforts\"></a>Research efforts</h3>\n<p>Of course, it is difficult to hack on a language in a void, and we also\n<em>use</em> OCaml heavily in our own research. The other half of OCaml Lab\u2019s\ngoals are more disruptive (and riskier!):</p>\n<ul>\n<li>The upcoming first beta release of <a href=\"http://openmirage.org\">Mirage</a>,\nwhich is an operating system designed for cloud and embedded\nenvironments, and is written almost entirely from the ground up in\nOCaml. The outputs of Mirage include a <a href=\"http://www.openmirage.org/blog/breaking-up-is-easy-with-opam\">large number of\nlibraries</a>\nwhich are usable separately, such as pure implementations of TCP/IP,\nDNS, SSH, DHCP and HTTP. The Xen hackers, led by <a href=\"http://dave.recoil.org\">David Scott</a>, are out in force to integrate Mirage\ninto their <a href=\"http://www.xen.org/xensummit/xs12na_talks/T2.html\">next-generation</a>\nplatform. 
Meanwhile, Raphael Proust is busy eliminating the <a href=\"https://anil.recoil.org/papers/drafts/2012-places-limel-draft1.pdf\">garbage\ncollector</a>\nwith his cut-down \u201cLinearML\u201d variant.</li>\n<li>Working with our collaborators at the <a href=\"http://horizon.ac.uk\">Horizon\nInstitute</a> on privacy-preserving technologies\nsuch as\n<a href=\"https://anil.recoil.org/papers/2012-sigcomm-signposts-demo.pdf\">Signposts</a>\nwhich let you build and maintain your own personal clouds that\noperate <a href=\"https://anil.recoil.org/papers/2011-icdcn-droplets.pdf\">autonomously</a>\nfrom the central cloud. You can read more about our <a href=\"http://www.cam.ac.uk/research/features/privacy-by-design/\">privacy-by-design</a> philosophy too.</li>\n<li>Extending OCaml to run on secure hardware platforms without\ncompromising on performance, using the MIPS64-based <a href=\"http://www.cl.cam.ac.uk/research/security/ctsrd/cheri.html\">capability\nprocessor</a>\nthat is being developed at the Lab.</li>\n<li>The <a href=\"http://www.trilogy-project.org\">Trilogy</a> was a hugely\nsuccessful EU-funded effort on future evolution of the Internet, and\nresulted in <a href=\"http://trilogy-project.org/publications/standards-contributions.html\">numerous\nRFCs</a>\non subjects such as multipath-TCP. We\u2019re participating in the\nfollow-up (imaginatively dubbed \u201cTrilogy2\u201d), and look forward to\nworking on more structured abstractions for programming large-scale\nnetworks.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#getting-involved\"></a>Getting involved</h3>\n<p>So, how can you get involved? We are initially advertising three\npositions for full-time developers and researchers\n(<a href=\"http://www.jobs.cam.ac.uk/job/-21662/\">junior</a> and\n<a href=\"http://www.jobs.cam.ac.uk/job/-21942/\">senior</a>) to help us get started\nwith the OCaml Platform and compiler development. 
These aren\u2019t\nconventional pure research jobs, and a successful candidate should enjoy\nthe open-source development cycle (you retain your own copyright for\nyour own projects). The Computer Lab offers a pretty unique environment:\na friendly, non-hierarchical group in a beautiful city, and some of the\nbest faculty and students you could hope to hang out with.</p>\n<p>And finally, there is a longer lead time on <a href=\"http://www.cl.cam.ac.uk/admissions/phd/\">applying for\nPhDs</a>, but this is a great time\nto get involved. When I started at the Lab in 2002, a little project\ncalled <a href=\"http://xen.org\">Xen</a> was just kicking off, and many of us had a\nwild (and oft great) time riding that wave. Get in touch with myself,\n<a href=\"http://www.cl.cam.ac.uk/~am21/\">Alan</a>,\n<a href=\"http://www.cl.cam.ac.uk/~iml1/\">Ian</a> or\n<a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon</a> soon if you are interested in\napplying! There\u2019s some more information available on the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/collaboration.html\">OCaml Labs\npages</a>\nabout options.</p>",
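The post above describes OCaml as condensing decades of research into a pragmatic blend of functional, imperative and object-oriented styles. As a minimal illustrative sketch (my own, not from the post), all three styles can coexist in a few lines:

```ocaml
(* Functional: sum a list by structural pattern matching. *)
let rec sum = function
  | [] -> 0
  | x :: rest -> x + sum rest

(* Imperative: a mutable reference cell, updated in place. *)
let counter = ref 0
let bump () = counter := !counter + 1

(* Object-oriented: an immediate object with a method, called via [#]. *)
let greeter = object
  method greet name = "hello, " ^ name
end

let () =
  assert (sum [1; 2; 3] = 6);
  bump ();
  assert (!counter = 1);
  assert (greeter#greet "OCaml" = "hello, OCaml")
```

A sketch only: the names `sum`, `bump` and `greeter` are invented for illustration, but the snippet runs as-is with a stock `ocaml` toolchain.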
+18
avsm/notes_apple-containerisation.json
···+"summary": "<p>Apple made a notable <a href=\"https://developer.apple.com/videos/play/wwdc2025/346/\">announcement</a> in <a href=\"https://developer.apple.com/wwdc25/\">WWDC 2025</a> that they've got a new containerisation framework in the new Tahoe beta. This took me right back to the early <a href=\"https://docs.docker.com/desktop/setup/install/mac-install/\">Docker for Mac</a> days in 2016 when we <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">announced</a> the first mainstream use of the <a href=\"https://developer.apple.com/documentation/hypervisor\">hypervisor framework</a>, so I couldn't resist taking a quick peek under the hood.</p>\n<p>There were two separate things announced: a <a href=\"https://github.com/apple/containerization\">Containerization framework</a> and also a <a href=\"https://github.com/apple/container\">container</a> CLI tool that aims to be an <a href=\"https://opencontainers.org/\">OCI</a> compliant tool to manipulate and execute container images. The former is a general-purpose framework that could be used by Docker, but it wasn't clear to me where the new CLI tool fits in among the existing layers of <a href=\"https://github.com/opencontainers/runc\">runc</a>, <a href=\"https://containerd.io/\">containerd</a> and of course Docker itself. The only way to find out is to take the new release for a spin, since Apple open-sourced everything (well done!).</p>\n<h2><a href=\"https://anil.recoil.org/#getting-up-and-running\"></a>Getting up and running</h2>\n<p>To get the full experience, I chose to install the <a href=\"https://www.apple.com/uk/newsroom/2025/06/macos-tahoe-26-makes-the-mac-more-capable-productive-and-intelligent-than-ever/\">macOS Tahoe beta</a>, as there have been improvements to the networking frameworks<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> that are only present in the new beta. 
It's essential you only use the <a href=\"https://developer.apple.com/news/releases/?id=06092025g\">Xcode 26 beta</a> as otherwise you'll get Swift link errors against vmnet. I had to force my installation to use the right toolchain via:</p>\n<pre><code>sudo xcode-select --switch /Applications/Xcode-beta.app/Contents/Developer\n</code></pre>\n<p>Once that was done, it was simple to clone and install the <a href=\"https://github.com/apple/container\">container\nrepo</a> with a <code>make install</code>. The first\nthing I noticed is that everything is written in Swift with no Go in sight.\nThey still use Protobuf for communication among the daemons, as most of the\nwider Docker ecosystem does.</p>\n<p>\n<img alt=\"I have mixed feelings about the new glass UI in macOS Tahoe. The tabs in the terminal are so low contrast they&apos;re impossible to distinguish!\" src=\"https://anil.recoil.org/images/macos-ss-1.webp\" title=\"I have mixed feelings about the new glass UI in macOS Tahoe. The tabs in the terminal are so low contrast they&apos;re impossible to distinguish!\">\nI have mixed feelings about the new glass UI in macOS Tahoe. The tabs in the terminal are so low contrast they're impossible to distinguish!</p>\n<h2><a href=\"https://anil.recoil.org/#starting-our-first-apple-container\"></a>Starting our first Apple container</h2>\n<p>Let's start our daemon up and take the <code>container</code> CLI for a spin.</p>\n<pre><code>$ container system start\nVerifying apiserver is running...\nInstalling base container filesystem...\nNo default kernel configured.\nInstall the recommended default kernel from [https://github.com/kata-containers/kata-containers/releases/download/3.17.0/kata-static-3.17.0-arm64.tar.xz]? [Y/n]: y\nInstalling kernel... 
\n\u2819 [1/2] Downloading kernel 33% (93.4/277.1 MB, 14.2 MB/s) [5s]\n</code></pre>\n<p>The first thing we notice is it downloading a full Linux kernel from the <a href=\"https://github.com/kata-containers/kata-containers\">Kata Containers</a> project. This system spins up a VM per container in order to provide more isolation. Although I haven't tracked Kata closely since its <a href=\"https://techcrunch.com/2017/12/05/intel-and-hyper-partner-with-the-openstack-foundation-to-launch-the-kata-containers-project/\">launch</a> in 2017, I did notice it being used to containerise <a href=\"https://confidentialcomputing.io/\">confidential computing enclaves</a> while <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> and I were working on <a href=\"https://anil.recoil.org/projects/difc-tee\">TEE programming models</a> a few years ago.</p>\n<p>The use of Kata tells us that <code>container</code> spins up a new kernel using the\nmacOS <a href=\"https://anil.recoil.org/\">Virtualization framework</a> every time a new container is started. This\nis ok for production use (where extra isolation may be appropriate in a\nmultitenant cloud environment) but very memory inefficient for development\n(where it's usual to spin up 4-5 VMs for a development environment with a\ndatabase etc). 
In contrast, Docker for Mac <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">uses</a> a single Linux kernel and runs\nthe containers within that instead.</p>\n<p>It's not quite clear to me why Apple chose the extra overheads of a\nVM-per-container, but I suspect this might be something to do with running code securely\ninside the <a href=\"https://support.apple.com/en-gb/guide/security/sec59b0b31ff/web\">many hardware enclaves</a>\npresent in modern Apple hardware, a usecase that is on the rise with <a href=\"https://www.apple.com/uk/apple-intelligence/\">Apple\nIntelligence</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#peeking-under-the-hood-of-the-swift-code\"></a>Peeking under the hood of the Swift code</h2>\n<p>Once the container daemon is running, we can spin up our first container using Alpine, which uses the familiar Docker-style <code>run</code>:</p>\n<pre><code>$ time container run alpine uname -a \nLinux 3c555c19-b235-4956-bed8-27bcede642a6 6.12.28 #1 SMP\nTue May 20 15:19:05 UTC 2025 aarch64 Linux\n0.04s user 0.01s system 6% cpu 0.733 total\n</code></pre>\n<p>The container spinup time is noticeable, but still less than a second and pretty acceptable for day-to-day use. This is possible thanks to a custom userspace they implement via a Swift init process that's run by the Linux kernel as the <em>sole</em> binary in the filesystem, and that provides an RPC interface to manage other services. 
The <a href=\"https://github.com/apple/containerization/tree/main/vminitd/Sources/vminitd\">vminitd</a> is built using the Swift static Linux SDK, which links <a href=\"https://musl.libc.org/\">musl libc</a> under the hood (the same one used by <a href=\"https://www.alpinelinux.org/\">Alpine Linux</a>).</p>\n<p>We can see the processes running by using <a href=\"https://man7.org/linux/man-pages/man1/pstree.1.html\">pstree</a>:</p>\n<pre><code>|- 29203 avsm /System/Library/Frameworks/Virtualization.framework/\n Versions/A/XPCServices/com.apple.Virtualization.VirtualMachine.xpc/\n Contents/MacOS/com.apple.Virtualization.VirtualMachine\n|- 29202 avsm <..>/plugins/container-runtime-linux/\n bin/container-runtime-linux\n --root <..>/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm <..>/bin/container-network-vmnet\n start --id default\n --service-identifier <..>network.container-network-vmnet.default\n|- 28899 avsm <..>/bin/container-core-images start\n|- 29202 avsm <..>/bin/container-runtime-linux\n --root <..>/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm <..>/container-network-vmnet start --id default\n --service-identifier <..>network.container-network-vmnet.default\n</code></pre>\n<p>You can start to see the overheads of a VM-per-container now, as each container\nneeds the host process infrastructure to not only run the computation, but also to\nfeed it with networking and storage IO (which have to be translated from the\nhost). Still, it's a drop in the ocean for macOS these days, as I'm running 850\nprocesses in the background on my Macbook Air from an otherwise fresh\ninstallation! 
This isn't the lean, fast MacOS X Cheetah I used on my G4 Powerbook anymore,\nsadly.</p>\n<h3><a href=\"https://anil.recoil.org/#finding-the-userspace-ext4-in-swift\"></a>Finding the userspace ext4 in Swift</h3>\n<p>I then tried to run a more interesting container for my local dev environment:\nthe <a href=\"https://hub.docker.com/r/ocaml/opam\">ocaml/opam</a> Docker images that we use\nin OCaml development. This showed up an interesting new twist in the Apple\nrewrite: they have an entire <a href=\"https://anil.recoil.org/\">ext4</a> filesystem <a href=\"https://github.com/apple/containerization/tree/main/Sources/ContainerizationEXT4\">implementation written in\nSwift</a>!\nThis is used to extract the OCI images from the Docker registry and then\nconstruct a new filesystem.</p>\n<pre><code>$ container run ocaml/opam opam list\n\u2826 [2/6] Unpacking image for platform linux/arm64 (112,924 entries, 415.9 MB, Zero KB/s) [9m 22s] \n\u2839 [2/6] Unpacking image for platform linux/arm64 (112,972 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u2807 [2/6] Unpacking image for platform linux/arm64 (113,012 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u283c [2/6] Unpacking image for platform linux/arm64 (113,059 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u280b [2/6] Unpacking image for platform linux/arm64 (113,104 entries, 415.9 MB, Zero KB/s) [9m 24s] \n# Packages matching: installed \n# Name # Installed # Synopsis\nbase-bigarray base\nbase-domains base\nbase-effects base\nbase-threads base\nbase-unix base\nocaml 5.3.0 The OCaml compiler (virtual package)\nocaml-base-compiler 5.3.0 pinned to version 5.3.0\nocaml-compiler 5.3.0 Official release of OCaml 5.3.0\nocaml-config 3 OCaml Switch Configuration\nopam-depext 1.2.3 Install OS distribution packages\n</code></pre>\n<p>The only hitch here is how slow this process is. 
The OCaml images do have a lot of individual\nfiles within the layers (not unusual for a package manager), but I was surprised that this took\n10 minutes on my modern M4 Macbook Air, versus a few seconds on Docker for Mac. I <a href=\"https://github.com/apple/container/issues/136\">filed a bug</a> upstream to investigate further since (as with any new implementation) there are many <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs\">edge cases</a> when handling filesystems in userspace, and the Apple code seems to have <a href=\"https://github.com/apple/container/issues/134\">other limitations</a> as well. I'm sure this will all shake out as the framework gets more users, but it's worth bearing in mind if you're thinking of using it in the near term in a product.</p>\n<h2><a href=\"https://anil.recoil.org/#whats-conspicuously-missing\"></a>What's conspicuously missing?</h2>\n<p>I was super excited when this announcement first happened, since I thought it might be the beginning of a few features I've needed for years and years. But they're missing...</p>\n<h3><a href=\"https://anil.recoil.org/#running-macos-containers-nope\"></a>Running macOS containers: nope</h3>\n<p>In OCaml-land, we have gone to ridiculous lengths to be able to run macOS CI on our own infrastructure. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> first wrote a <a href=\"https://tarides.com/blog/2023-08-02-obuilder-on-macos/\">custom snapshotting builder</a> using undocumented interfaces like userlevel sandboxing, subsequently taken over and maintained by <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a>. This is a tremendous amount of work to maintain, but the alternative is to depend on very expensive hosted services to spin up individual macOS VMs which are slow and energy hungry.</p>\n<p>What we <em>really</em> need are macOS containers! 
We have dozens of mechanisms to run Linux ones already, and only a few <a href=\"https://github.com/dockur/macos\">heavyweight alternatives</a> to run macOS itself within macOS. However, the VM-per-container mechanism chosen by Apple might be the gateway to supporting macOS itself in the future. I will be first in line to test this if it happens!</p>\n<h3><a href=\"https://anil.recoil.org/#running-ios-containers-nope\"></a>Running iOS containers: nope</h3>\n<p>Waaaay back when we were <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">first writing</a> Docker for Mac, there were no mainstream users of the Apple Hypervisor framework at all (that's why we built and released <a href=\"https://github.com/moby/hyperkit\">Hyperkit</a>). The main benefit we hoped to derive from using Apple-blessed frameworks was that they would make our app App-Store friendly for distribution via those channels.</p>\n<p>But while there do exist <a href=\"https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.security.hypervisor\">entitlements</a> to support virtualisation on macOS, there is <em>no</em> support for iOS or iPadOS to this day! All of the trouble to sign binaries and deal with entitlements and opaque Apple tooling only gets it onto the Mac App Store, which is a little bit of a graveyard compared to the iOS ecosystem.\nThis thus remains on my wishlist for Apple: the hardware on modern iPad devices <em>easily</em> supports virtualisation, but Apple is choosing to cripple these devices from having a decent development experience by not allowing the hypervisor, virtualisation and container frameworks to run on there.</p>\n<h3><a href=\"https://anil.recoil.org/#running-linux-containers-yeah-but-no-gpu\"></a>Running Linux containers: yeah but no GPU</h3>\n<p>One reason to run Linux containers on macOS is to handle machine learning workloads. 
Actually getting this to be performant is tricky, since macOS has its own custom <a href=\"https://github.com/ml-explore/mlx\">MLX-based</a> approach to handling tensor computations. Meanwhile, the rest of the world mostly uses nVidia or AMD interfaces for those GPUs, which is reflected in container images that are distributed.</p>\n<p>There is some chatter on the <a href=\"https://github.com/apple/container/discussions/62#discussioncomment-13414483\">apple/container GitHub</a> about getting GPU passthrough working, but I'm still unclear on how to get a more portable GPU ABI. The reason Linux containers work so well is that the Linux kernel provides a very stable ABI, but this breaks down with GPUs badly.</p>\n<h1><a href=\"https://anil.recoil.org/#does-this-threaten-dockers-dominance\"></a>Does this threaten Docker's dominance?</h1>\n<p>I have mixed feelings about the Containerization framework release. On one hand, it's always fun to see more systems code in a new language like Swift, and this is an elegant and clean reimplementation of classic containerisation techniques in macOS. But the release <strong>fails to unlock any real new end-user capabilities</strong>, such as running a decent development environment on my iPad without using cloud services. Come on Apple, you can make that happen; you're getting ever closer every release!</p>\n<p>I don't believe that Docker or Orbstack are too threatened by this release at this stage either, despite some reports that <a href=\"https://appleinsider.com/articles/25/06/09/sorry-docker-macos-26-adds-native-support-for-linux-containers\">they're being Sherlocked</a>. The Apple container CLI is quite low-level, and there's a ton of quality-of-life features in the full Docker for Mac app that'll keep me using it, and there seems to be no real blocker from Docker adopting the Containerization framework as one of its optional backends. 
I prefer having a single VM for my devcontainers to keep my laptop battery life going, so I think Docker's current approach is better for that usecase.</p>\n<p>Apple has been a very good egg here by open sourcing all their code, so I believe this will overall help the Linux container ecosystem by adding choice to how we deploy software containers. Well done <a href=\"https://github.com/crosbymichael\">Michael Crosby</a>, <a href=\"https://github.com/mavenugo\">Madhu Venugopal</a> and many of my other former colleagues who are all merrily hacking away on this! As an aside, I'm also just revising a couple of papers about the history of using OCaml in several Docker components, and a retrospective look back at the hypervisor architecture backing Docker for Desktop, which will appear in print in the next couple of months (I'll update this post when they appear). But for now, back to my day job of marking undergraduate exam scripts...</p>\n\n<ol>\n<li>\n<p>vmnet is a networking framework for VMs/containers that I had to <a href=\"https://github.com/mirage/ocaml-vmnet\">reverse engineer</a> back in 2014 to use with OCaml/MirageOS.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>Apple made a notable <a href=\"https://developer.apple.com/videos/play/wwdc2025/346/\">announcement</a> in <a href=\"https://developer.apple.com/wwdc25/\">WWDC 2025</a> that they've got a new containerisation framework in the new Tahoe beta. 
This took me right back to the early <a href=\"https://docs.docker.com/desktop/setup/install/mac-install/\">Docker for Mac</a> days in 2016 when we <a href=\"https://www.docker.com/blog/docker-unikernels-open-source/\">announced</a> the first mainstream use of the <a href=\"https://developer.apple.com/documentation/hypervisor\">hypervisor framework</a>, so I couldn't resist taking a quick peek under the hood.</p>\n<p>There were two separate things announced: a <a href=\"https://github.com/apple/containerization\">Containerization framework</a> and also a <a href=\"https://github.com/apple/container\">container</a> CLI tool that aims to be an <a href=\"https://opencontainers.org/\">OCI</a> compliant tool to manipulate and execute container images. The former is a general-purpose framework that could be used by Docker, but it wasn't clear to me where the new CLI tool fits in among the existing layers of <a href=\"https://github.com/opencontainers/runc\">runc</a>, <a href=\"https://containerd.io/\">containerd</a> and of course Docker itself. The only way to find out is to take the new release for a spin, since Apple open-sourced everything (well done!).</p>\n<h2><a href=\"https://anil.recoil.org/#getting-up-and-running\"></a>Getting up and running</h2>\n<p>To get the full experience, I chose to install the <a href=\"https://www.apple.com/uk/newsroom/2025/06/macos-tahoe-26-makes-the-mac-more-capable-productive-and-intelligent-than-ever/\">macOS Tahoe beta</a>, as there have been improvements to the networking frameworks<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> that are only present in the new beta. It's essential you only use the <a href=\"https://developer.apple.com/news/releases/?id=06092025g\">Xcode 26 beta</a> as otherwise you'll get Swift link errors against vmnet. 
I had to force my installation to use the right toolchain via:</p>\n<pre><code>sudo xcode-select --switch /Applications/Xcode-beta.app/Contents/Developer\n</code></pre>\n<p>Once that was done, it was simple to clone and install the <a href=\"https://github.com/apple/container\">container\nrepo</a> with a <code>make install</code>. The first\nthing I noticed is that everything is written in Swift with no Go in sight.\nThey still use Protobuf for communication among the daemons, as most of the\nwider Docker ecosystem does.</p>\n<p>\n<img alt=\"I have mixed feelings about the new glass UI in macOS Tahoe. The tabs in the terminal are so low contrast they&apos;re impossible to distinguish!\" src=\"https://anil.recoil.org/images/macos-ss-1.webp\" title=\"I have mixed feelings about the new glass UI in macOS Tahoe. The tabs in the terminal are so low contrast they&apos;re impossible to distinguish!\">\nI have mixed feelings about the new glass UI in macOS Tahoe. The tabs in the terminal are so low contrast they're impossible to distinguish!</p>\n<h2><a href=\"https://anil.recoil.org/#starting-our-first-apple-container\"></a>Starting our first Apple container</h2>\n<p>Let's start our daemon up and take the <code>container</code> CLI for a spin.</p>\n<pre><code>$ container system start\nVerifying apiserver is running...\nInstalling base container filesystem...\nNo default kernel configured.\nInstall the recommended default kernel from [https://github.com/kata-containers/kata-containers/releases/download/3.17.0/kata-static-3.17.0-arm64.tar.xz]? [Y/n]: y\nInstalling kernel... \n\u2819 [1/2] Downloading kernel 33% (93.4/277.1 MB, 14.2 MB/s) [5s]\n</code></pre>\n<p>The first thing we notice is it downloading a full Linux kernel from the <a href=\"https://github.com/kata-containers/kata-containers\">Kata Containers</a> project. This system spins up a VM per container in order to provide more isolation. 
Although I haven't tracked Kata closely since its <a href=\"https://techcrunch.com/2017/12/05/intel-and-hyper-partner-with-the-openstack-foundation-to-launch-the-kata-containers-project/\">launch</a> in 2017, I did notice it being used to containerise <a href=\"https://confidentialcomputing.io/\">confidential computing enclaves</a> while <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> and I were working on <a href=\"https://anil.recoil.org/projects/difc-tee\">TEE programming models</a> a few years ago.</p>\n<p>The use of Kata tells us that <code>container</code> spins up a new kernel using the\nmacOS <a href=\"https://anil.recoil.org/\">Virtualization framework</a> every time a new container is started. This\nis ok for production use (where extra isolation may be appropriate in a\nmultitenant cloud environment) but very memory inefficient for development\n(where it's usual to spin up 4-5 VMs for a development environment with a\ndatabase etc). In contrast, Docker for Mac <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">uses</a> a single Linux kernel and runs\nthe containers within that instead.</p>\n<p>It's not quite clear to me why Apple chose the extra overheads of a\nVM-per-container, but I suspect this might be something to do with running code securely\ninside the <a href=\"https://support.apple.com/en-gb/guide/security/sec59b0b31ff/web\">many hardware enclaves</a>\npresent in modern Apple hardware, a usecase that is on the rise with <a href=\"https://www.apple.com/uk/apple-intelligence/\">Apple\nIntelligence</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#peeking-under-the-hood-of-the-swift-code\"></a>Peeking under the hood of the Swift code</h2>\n<p>Once the container daemon is running, we can spin up our first container using Alpine, which uses the familiar Docker-style <code>run</code>:</p>\n<pre><code>$ time container run alpine uname -a \nLinux 3c555c19-b235-4956-bed8-27bcede642a6 6.12.28 #1 SMP\nTue 
May 20 15:19:05 UTC 2025 aarch64 Linux\n0.04s user 0.01s system 6% cpu 0.733 total\n</code></pre>\n<p>The container spinup time is noticeable, but still less than a second and pretty acceptable for day-to-day use. This is possible thanks to a custom userspace they implement via a Swift init process that's run by the Linux kernel as the <em>sole</em> binary in the filesystem, and that provides an RPC interface to manage other services. The <a href=\"https://github.com/apple/containerization/tree/main/vminitd/Sources/vminitd\">vminitd</a> is built using the Swift static Linux SDK, which links <a href=\"https://musl.libc.org/\">musl libc</a> under the hood (the same one used by <a href=\"https://www.alpinelinux.org/\">Alpine Linux</a>).</p>\n<p>We can see the processes running by using <a href=\"https://man7.org/linux/man-pages/man1/pstree.1.html\">pstree</a>:</p>\n<pre><code>|- 29203 avsm /System/Library/Frameworks/Virtualization.framework/\n Versions/A/XPCServices/com.apple.Virtualization.VirtualMachine.xpc/\n Contents/MacOS/com.apple.Virtualization.VirtualMachine\n|- 29202 avsm <..>/plugins/container-runtime-linux/\n bin/container-runtime-linux\n --root <..>/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm <..>/bin/container-network-vmnet\n start --id default\n --service-identifier <..>network.container-network-vmnet.default\n|- 28899 avsm <..>/bin/container-core-images start\n|- 29202 avsm <..>/bin/container-runtime-linux\n --root <..>/f82d3a52-c89b-4ff0-9e71-c7127cb5eee1\n --uuid f82d3a52-c89b-4ff0-9e71-c7127cb5eee1 --debug\n|- 28896 avsm <..>/container-network-vmnet start --id default\n --service-identifier <..>network.container-network-vmnet.default\n</code></pre>\n<p>You can start to see the overheads of a VM-per-container now, as each container\nneeds the host process infrastructure to not only run the computation, but also to\nfeed it with networking and storage IO (which have to be translated from 
the\nhost). Still, it's a drop in the ocean for macOS these days, as I'm running 850\nprocesses in the background on my MacBook Air from an otherwise fresh\ninstallation! This isn't the lean, fast Mac OS X Cheetah I used on my G4 PowerBook anymore,\nsadly.</p>\n<h3><a href=\"https://anil.recoil.org/#finding-the-userspace-ext4-in-swift\"></a>Finding the userspace ext4 in Swift</h3>\n<p>I then tried to run a more interesting container for my local dev environment:\nthe <a href=\"https://hub.docker.com/r/ocaml/opam\">ocaml/opam</a> Docker images that we use\nin OCaml development. This showed up an interesting new twist in the Apple\nrewrite: they have an entire <a href=\"https://anil.recoil.org/\">ext4</a> filesystem <a href=\"https://github.com/apple/containerization/tree/main/Sources/ContainerizationEXT4\">implementation written in\nSwift</a>!\nThis is used to extract the OCI images from the Docker registry and then\nconstruct a new filesystem.</p>\n<pre><code>$ container run ocaml/opam opam list\n\u2826 [2/6] Unpacking image for platform linux/arm64 (112,924 entries, 415.9 MB, Zero KB/s) [9m 22s] \n\u2839 [2/6] Unpacking image for platform linux/arm64 (112,972 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u2807 [2/6] Unpacking image for platform linux/arm64 (113,012 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u283c [2/6] Unpacking image for platform linux/arm64 (113,059 entries, 415.9 MB, Zero KB/s) [9m 23s] \n\u280b [2/6] Unpacking image for platform linux/arm64 (113,104 entries, 415.9 MB, Zero KB/s) [9m 24s] \n# Packages matching: installed \n# Name # Installed # Synopsis\nbase-bigarray base\nbase-domains base\nbase-effects base\nbase-threads base\nbase-unix base\nocaml 5.3.0 The OCaml compiler (virtual package)\nocaml-base-compiler 5.3.0 pinned to version 5.3.0\nocaml-compiler 5.3.0 Official release of OCaml 5.3.0\nocaml-config 3 OCaml Switch Configuration\nopam-depext 1.2.3 Install OS distribution packages\n</code></pre>\n<p>The only hitch here is how slow this process 
is. The OCaml images do have a lot of individual\nfiles within the layers (not unusual for a package manager), but I was surprised that this took\n10 minutes on my modern M4 MacBook Air, versus a few seconds on Docker for Mac. I <a href=\"https://github.com/apple/container/issues/136\">filed a bug</a> upstream to investigate further since (as with any new implementation) there are many <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs\">edge cases</a> when handling filesystems in userspace, and the Apple code seems to have <a href=\"https://github.com/apple/container/issues/134\">other limitations</a> as well. I'm sure this will all shake out as the framework gets more users, but it's worth bearing in mind if you're thinking of using it in the near term in a product.</p>\n<h2><a href=\"https://anil.recoil.org/#whats-conspicuously-missing\"></a>What's conspicuously missing?</h2>\n<p>I was super excited when this announcement first happened, since I thought it might be the beginning of a few features I've needed for years and years. But they're missing...</p>\n<h3><a href=\"https://anil.recoil.org/#running-macos-containers-nope\"></a>Running macOS containers: nope</h3>\n<p>In OCaml-land, we have gone to ridiculous lengths to be able to run macOS CI on our own infrastructure. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> first wrote a <a href=\"https://tarides.com/blog/2023-08-02-obuilder-on-macos/\">custom snapshotting builder</a> using undocumented interfaces like user-level sandboxing, subsequently taken over and maintained by <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a>. This is a tremendous amount of work to maintain, but the alternative is to depend on very expensive hosted services to spin up individual macOS VMs, which are slow and energy-hungry.</p>\n<p>What we <em>really</em> need are macOS containers! 
We have dozens of mechanisms to run Linux ones already, and only a few <a href=\"https://github.com/dockur/macos\">heavyweight alternatives</a> to run macOS itself within macOS. However, the VM-per-container mechanism chosen by Apple might be the gateway to supporting macOS itself in the future. I will be first in line to test this if it happens!</p>\n<h3><a href=\"https://anil.recoil.org/#running-ios-containers-nope\"></a>Running iOS containers: nope</h3>\n<p>Waaaay back when we were <a href=\"https://speakerdeck.com/avsm/the-functional-innards-of-docker-for-mac-and-windows\">first writing</a> Docker for Mac, there were no mainstream users of the Apple Hypervisor framework at all (that's why we built and released <a href=\"https://github.com/moby/hyperkit\">Hyperkit</a>). The main benefit we hoped to derive from using Apple-blessed frameworks was that they would make our app App-Store friendly for distribution via those channels.</p>\n<p>But while there do exist <a href=\"https://developer.apple.com/documentation/bundleresources/entitlements/com.apple.security.hypervisor\">entitlements</a> to support virtualisation on macOS, there is <em>no</em> support for iOS or iPadOS to this day! All of the trouble to sign binaries and deal with entitlements and opaque Apple tooling only gets it onto the Mac App Store, which is a little bit of a graveyard compared to the iOS ecosystem.\nThis remains on my wishlist for Apple: the hardware on modern iPad devices <em>easily</em> supports virtualisation, but Apple is choosing to deny these devices a decent development experience by not allowing the hypervisor, virtualisation and container frameworks to run there.</p>\n<h3><a href=\"https://anil.recoil.org/#running-linux-containers-yeah-but-no-gpu\"></a>Running Linux containers: yeah but no GPU</h3>\n<p>One reason to run Linux containers on macOS is to handle machine learning workloads. 
Actually getting this to be performant is tricky, since macOS has its own custom <a href=\"https://github.com/ml-explore/mlx\">MLX-based</a> approach to handling tensor computations. Meanwhile, the rest of the world mostly uses NVIDIA or AMD interfaces for those GPUs, which is reflected in the container images that are distributed.</p>\n<p>There is some chatter on the <a href=\"https://github.com/apple/container/discussions/62#discussioncomment-13414483\">apple/container GitHub</a> about getting GPU passthrough working, but I'm still unclear on how to get a more portable GPU ABI. The reason Linux containers work so well is that the Linux kernel provides a very stable ABI, but this breaks down badly with GPUs.</p>\n<h2><a href=\"https://anil.recoil.org/#does-this-threaten-dockers-dominance\"></a>Does this threaten Docker's dominance?</h2>\n<p>I have mixed feelings about the Containerization framework release. On one hand, it's always fun to see more systems code in a new language like Swift, and this is an elegant and clean reimplementation of classic containerisation techniques in macOS. But the release <strong>fails to unlock any real new end-user capabilities</strong>, such as running a decent development environment on my iPad without using cloud services. Come on Apple, you can make that happen; you're getting ever closer every release!</p>\n<p>I don't believe that Docker or OrbStack are too threatened by this release at this stage either, despite some reports that <a href=\"https://appleinsider.com/articles/25/06/09/sorry-docker-macos-26-adds-native-support-for-linux-containers\">they're being Sherlocked</a>. The Apple container CLI is quite low-level, and there's a ton of quality-of-life features in the full Docker for Mac app that'll keep me using it, and there seems to be no real blocker from Docker adopting the Containerization framework as one of its optional backends. 
I prefer having a single VM for my devcontainers to keep my laptop battery life going, so I think Docker's current approach is better for that use case.</p>\n<p>Apple has been a very good egg here by open sourcing all their code, so I believe this will overall help the Linux container ecosystem by adding choice to how we deploy software containers. Well done <a href=\"https://github.com/crosbymichael\">Michael Crosby</a>, <a href=\"https://github.com/mavenugo\">Madhu Venugopal</a> and many of my other former colleagues who are all merrily hacking away on this! As an aside, I'm also just revising a couple of papers about the history of using OCaml in several Docker components, and a retrospective look at the hypervisor architecture backing Docker for Desktop, which will appear in print in the next couple of months (I'll update this post when they appear). But for now, back to my day job of marking undergraduate exam scripts...</p>\n\n<ol>\n<li>\n<p>vmnet is a networking framework for VMs/containers that I had to <a href=\"https://github.com/mirage/ocaml-vmnet\">reverse engineer</a> back in 2014 to use with OCaml/MirageOS.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/notes_atproto-for-fun-and-blogging.json
···+"summary": "<p>While <a href=\"https://bsky.app\">Bluesky</a> is taking off like a rocket, a number of us <a href=\"https://anil.recoil.org/notes/enter-the-matrix-hookshot\">moving</a> towards <a href=\"https://anil.recoil.org/\">self-sovereign</a> digital infrastructure have been looking at how to use the Bluesky network for more than just short-form notes. This is possible because of my colleague <a href=\"https://martin.kleppmann.com\">Martin Kleppmann</a>'s hard work on the "<a href=\"https://atproto.com/\">AT Protocol</a>" that underpins the Bluesky network. Martin recently gave us a <a href=\"https://talks.cam.ac.uk/talk/index/224767\">deep-dive into AT Proto</a> in the Cambridge <a href=\"https://www.cl.cam.ac.uk/research/security/\">security group</a>, which made me look into other uses of it more closely. As background, you may wish to read <a href=\"https://arxiv.org/abs/2402.03239\">his paper</a> on the subject, which explains the technical architecture extremely clearly.</p>\n<p><a href=\"https://arxiv.org/pdf/2402.03239\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/atproto-paper-ss-1.webp\" title=\"\">\n </a></p>\n<p>One of the key problems this solves is one I'm having with my <a href=\"https://en.wikipedia.org/wiki/ActivityPub\">ActivityPub</a>-based services at the moment. These services (like my <a href=\"https://crank.recoil.org\">video</a> or <a href=\"https://amok.recoil.org\">microblog</a> sites) do not share a common authentication system, and so each account is different. 
<a href=\"https://nick.recoil.org\">Nick Ludlam</a> and I are also thinking of renaming all of our services to go under the cleaner <code>recoil.org</code> domain rather than a subdomain, but this involves a fairly error-prone <a href=\"https://digitalflapjack.com/blog/hosting24/\">migration</a> that lacks <a href=\"https://anil.recoil.org/ideas/activitypub-resilience\">resilience</a> to domain changes since they are baked into the ActivityPub protocol messages. The AT Protocol underpinning Bluesky deals with all of this by decoupling the underlying authentication and identity system, and the content that's flowing over the network.</p>\n<p>This strikes a nice balance between pure self-hosting and longevity while bootstrapping the network; I chuckled, for instance, reading <a href=\"https://statusq.org/archives/2012/09/29/4524/\">Q's post from 2012</a> about a new social network called "<a href=\"https://web.archive.org/web/20121011065707/https://join.app.net/\">app.net</a>", which described itself as "your real-time feed, a home for meaningful conversation, where you control your data" but is now a long-expired, domain-squatted adfest. The AT Proto model seems more pragmatic in that it builds up a centralised bootstrap, but the underlying protocol itself admits innovation for other apps, and so permits evolution.</p>\n<p>So let's look at some of the alternative apps that are already cropping up:</p>\n<ul>\n<li><a href=\"https://whtwnd.com/about\">Whitewind</a> is a blogging platform (<a href=\"https://github.com/whtwnd/whitewind-blog\">source code</a>) that lets you write longform posts in Markdown format and post them to the Internet. 
The data itself is stored on a local <a href=\"https://github.com/bluesky-social/pds\">PDS</a> and you can republish the blog posts using a <a href=\"https://github.com/hugeblank/whitebreeze\">simple site generator</a>.</li>\n<li><a href=\"https://github.com/muni-town/roomy\">Roomy</a> (via <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>) is a peer-to-peer messaging app built over AT Proto. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> points out the nice idea of "digital gardening" in their <a href=\"https://github.com/commune-sh/commune-server/discussions/28\">discussions</a>, which I absolutely <em>love</em> and have been building into my own <a href=\"https://github.com/avsm/bushel\">Bushel</a> notes platform, which powers this site. I've wanted the ability to go from short-form thoughts to long-form consolidation for years and years now.</li>\n<li><a href=\"https://bsky.app/profile/tom.frontpage.team\">Tom Sherman</a> maintains a Bluesky list of <a href=\"https://bsky.app/profile/tom.frontpage.team/lists/3l3qcs6lizq2o\">alternative ATProto apps</a>, from which I discovered <a href=\"https://bsky.app/profile/stream.place\">Streamplace</a>, a mechanism to share live video on AT Proto.</li>\n</ul>\n<p>Then there are a bunch of "alternative clients" that do specific forms of media, such as photos or videos. This is less about using the underlying protocol than about building a new client, but it's still pretty neat that it's so accessible:</p>\n<ul>\n<li><a href=\"https://bsky.app/profile/did:plc:24kqkpfy6z7avtgu3qg57vvl\">Flashes</a> is a photo sharing app that recently launched in <a href=\"https://techcrunch.com/2025/02/06/flashes-a-photo-sharing-app-for-bluesky-opens-beta/\">beta</a> and is currently only available via TestFlight. 
Like Insta, it allows multiple photos per post, and you can then share comments with the mainline Bluesky.</li>\n<li><a href=\"https://www.bluecast.app/\">Bluecast</a> is a real-time audio streaming service for any Bluesky user (anyone remember the fun we had for about two weeks in the pandemic lockdown with <a href=\"https://www.clubhouse.com/\">Clubhouse</a>?)</li>\n<li><a href=\"https://bsky.app/profile/did:plc:kx626d5pdvqbn3kmoxtjjcbd\">Bluemotion</a> has been built by the <a href=\"https://fediversereport.com/video-audio-and-blogging-japanese-bluesky-is-building-in-the-atmosphere/\">Japanese Bluesky community</a> for quick and easy video sharing. <a href=\"https://liquidx.net\">Alastair Tse</a> says that development on this has <a href=\"https://bsky.app/profile/liquidx.net/post/3lhsoperh2s2f\">slowed down</a>, possibly as Bluesky supports videos natively now.</li>\n<li><a href=\"https://apps.apple.com/us/app/bluescreen-for-bluesky/id6741334901\">Bluescreen</a> is a <a href=\"https://lifehacker.com/tech/bluesky-now-has-its-own-tiktok\">TikTok alternative</a> for video posts, as is <a href=\"https://bsky.app/profile/mmccue.bsky.social/post/3lg6ezjpawc2c\">SkyTok</a>. SkyTok seems to be built on something called <a href=\"https://surf.social/\">surf.social</a> that gives more control over a Bluesky/Mastodon/RSS feedset as well, but it's still in closed beta.</li>\n</ul>\n<p>This is just the tip of the iceberg for the open web, of course. Excitingly, there are ongoing experiments to <a href=\"https://berjon.com/ap-at/?ref=cosmico.org\">run ActivityPub over AT Proto</a>, which show how complementary these ecosystems are.\nAnd most exciting from my personal perspective is <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> successfully <a href=\"https://bsky.app/profile/patrick.sirref.org/post/3lh24rrjngw24\">posting</a> from an up-and-coming <a href=\"https://github.com/patricoferris/ocaml-atproto-lexicon\">OCaml ATProto</a> implementation. 
I'm looking forward to hacking in this ecosystem in 2025!</p>\n<p><em>(Thanks David Gageot for spotting typos!)</em></p>",+"content": "<p>While <a href=\"https://bsky.app\">Bluesky</a> is taking off like a rocket, a number of us <a href=\"https://anil.recoil.org/notes/enter-the-matrix-hookshot\">moving</a> towards <a href=\"https://anil.recoil.org/\">self-sovereign</a> digital infrastructure have been looking at how to use the Bluesky network for more than just short-form notes. This is possible because of my colleague <a href=\"https://martin.kleppmann.com\">Martin Kleppmann</a>'s hard work on the "<a href=\"https://atproto.com/\">AT Protocol</a>" that underpins the Bluesky network. Martin recently gave us a <a href=\"https://talks.cam.ac.uk/talk/index/224767\">deep-dive into AT Proto</a> in the Cambridge <a href=\"https://www.cl.cam.ac.uk/research/security/\">security group</a>, which made me look into other uses of it more closely. As background, you may wish to read <a href=\"https://arxiv.org/abs/2402.03239\">his paper</a> on the subject, which explains the technical architecture extremely clearly.</p>\n<p><a href=\"https://arxiv.org/pdf/2402.03239\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/atproto-paper-ss-1.webp\" title=\"\">\n </a></p>\n<p>One of the key problems this solves is one I'm having with my <a href=\"https://en.wikipedia.org/wiki/ActivityPub\">ActivityPub</a>-based services at the moment. These services (like my <a href=\"https://crank.recoil.org\">video</a> or <a href=\"https://amok.recoil.org\">microblog</a> sites) do not share a common authentication system, and so each account is different. 
<a href=\"https://nick.recoil.org\">Nick Ludlam</a> and I are also thinking of renaming all of our services to go under the cleaner <code>recoil.org</code> domain rather than a subdomain, but this involves a fairly error-prone <a href=\"https://digitalflapjack.com/blog/hosting24/\">migration</a> that lacks <a href=\"https://anil.recoil.org/ideas/activitypub-resilience\">resilience</a> to domain changes since they are baked into the ActivityPub protocol messages. The AT Protocol underpinning Bluesky deals with all of this by decoupling the underlying authentication and identity system, and the content that's flowing over the network.</p>\n<p>This strikes a nice balance between pure self-hosting and longevity while bootstrapping the network; I chuckled, for instance, reading <a href=\"https://statusq.org/archives/2012/09/29/4524/\">Q's post from 2012</a> about a new social network called "<a href=\"https://web.archive.org/web/20121011065707/https://join.app.net/\">app.net</a>", which described itself as "your real-time feed, a home for meaningful conversation, where you control your data" but is now a long-expired, domain-squatted adfest. The AT Proto model seems more pragmatic in that it builds up a centralised bootstrap, but the underlying protocol itself admits innovation for other apps, and so permits evolution.</p>\n<p>So let's look at some of the alternative apps that are already cropping up:</p>\n<ul>\n<li><a href=\"https://whtwnd.com/about\">Whitewind</a> is a blogging platform (<a href=\"https://github.com/whtwnd/whitewind-blog\">source code</a>) that lets you write longform posts in Markdown format and post them to the Internet. 
The data itself is stored on a local <a href=\"https://github.com/bluesky-social/pds\">PDS</a> and you can republish the blog posts using a <a href=\"https://github.com/hugeblank/whitebreeze\">simple site generator</a>.</li>\n<li><a href=\"https://github.com/muni-town/roomy\">Roomy</a> (via <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>) is a peer-to-peer messaging app built over AT Proto. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> points out the nice idea of "digital gardening" in their <a href=\"https://github.com/commune-sh/commune-server/discussions/28\">discussions</a>, which I absolutely <em>love</em> and have been building into my own <a href=\"https://github.com/avsm/bushel\">Bushel</a> notes platform, which powers this site. I've wanted the ability to go from short-form thoughts to long-form consolidation for years and years now.</li>\n<li><a href=\"https://bsky.app/profile/tom.frontpage.team\">Tom Sherman</a> maintains a Bluesky list of <a href=\"https://bsky.app/profile/tom.frontpage.team/lists/3l3qcs6lizq2o\">alternative ATProto apps</a>, from which I discovered <a href=\"https://bsky.app/profile/stream.place\">Streamplace</a>, a mechanism to share live video on AT Proto.</li>\n</ul>\n<p>Then there are a bunch of "alternative clients" that do specific forms of media, such as photos or videos. This is less about using the underlying protocol than about building a new client, but it's still pretty neat that it's so accessible:</p>\n<ul>\n<li><a href=\"https://bsky.app/profile/did:plc:24kqkpfy6z7avtgu3qg57vvl\">Flashes</a> is a photo sharing app that recently launched in <a href=\"https://techcrunch.com/2025/02/06/flashes-a-photo-sharing-app-for-bluesky-opens-beta/\">beta</a> and is currently only available via TestFlight. 
Like Insta, it allows multiple photos per post, and you can then share comments with the mainline Bluesky.</li>\n<li><a href=\"https://www.bluecast.app/\">Bluecast</a> is a real-time audio streaming service for any Bluesky user (anyone remember the fun we had for about two weeks in the pandemic lockdown with <a href=\"https://www.clubhouse.com/\">Clubhouse</a>?)</li>\n<li><a href=\"https://bsky.app/profile/did:plc:kx626d5pdvqbn3kmoxtjjcbd\">Bluemotion</a> has been built by the <a href=\"https://fediversereport.com/video-audio-and-blogging-japanese-bluesky-is-building-in-the-atmosphere/\">Japanese Bluesky community</a> for quick and easy video sharing. <a href=\"https://liquidx.net\">Alastair Tse</a> says that development on this has <a href=\"https://bsky.app/profile/liquidx.net/post/3lhsoperh2s2f\">slowed down</a>, possibly as Bluesky supports videos natively now.</li>\n<li><a href=\"https://apps.apple.com/us/app/bluescreen-for-bluesky/id6741334901\">Bluescreen</a> is a <a href=\"https://lifehacker.com/tech/bluesky-now-has-its-own-tiktok\">TikTok alternative</a> for video posts, as is <a href=\"https://bsky.app/profile/mmccue.bsky.social/post/3lg6ezjpawc2c\">SkyTok</a>. SkyTok seems to be built on something called <a href=\"https://surf.social/\">surf.social</a> that gives more control over a Bluesky/Mastodon/RSS feedset as well, but it's still in closed beta.</li>\n</ul>\n<p>This is just the tip of the iceberg for the open web, of course. Excitingly, there are ongoing experiments to <a href=\"https://berjon.com/ap-at/?ref=cosmico.org\">run ActivityPub over AT Proto</a>, which show how complementary these ecosystems are.\nAnd most exciting from my personal perspective is <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> successfully <a href=\"https://bsky.app/profile/patrick.sirref.org/post/3lh24rrjngw24\">posting</a> from an up-and-coming <a href=\"https://github.com/patricoferris/ocaml-atproto-lexicon\">OCaml ATProto</a> implementation. 
I'm looking forward to hacking in this ecosystem in 2025!</p>\n<p><em>(Thanks David Gageot for spotting typos!)</em></p>",
+18
avsm/notes_biomass-launches.json
···+"summary": "<p>The <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass\">BIOMASS</a> forest mission satellite was <a href=\"https://www.bbc.co.uk/newsround/articles/c0jzy3g0zx2o\">successfully</a> boosted into space a couple of days ago, after decades of development from just down the road in <a href=\"https://www.gov.uk/government/news/british-built-satellite-to-map-earths-forests-in-3d-for-the-first-time\">Stevenage</a>. I'm excited by this because it's the first global-scale <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">P-band SAR</a> instrument that can penetrate forest canopies to look underneath. This, when combined with <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping\">hyperspectral mapping</a>, will give us a lot more <a href=\"https://anil.recoil.org/projects/rsn\">insight</a> into global tree health.</p>\n<p>Weirdly, the whole thing almost never happened because permission to use the <a href=\"https://ieeexplore.ieee.org/document/9048581\">P-band</a> was blocked back in 2013, as it might <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">interfere with US nuclear missile warning radars</a>.</p>\n<blockquote>\n<p>Meeting in Graz, Austria, to select the 7th Earth Explorer mission to be flown by the 20-nation European Space Agency (ESA), backers of the Biomass mission were pelted with questions about how badly the U.S. 
network of missile warning and space-tracking radars in North America, Greenland and Europe would undermine Biomass\u2019 global carbon-monitoring objectives.</p>\n<p>Europe's Earth observation satellite system may be the world's most dynamic, but as it pushes its operating envelope into new areas, it is learning a lesson long ago taught to satellite telecommunications operators: Radio frequency is scarce, and once users have a piece of it they hold fast.\n-- <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">Spacenews</a> (2013)</p>\n</blockquote>\n<p>Luckily, all this got sorted by international frequency negotiators, and after\n<a href=\"https://www.thecomet.net/news/25125302.satellite-built-stevenage-airbus-launches-space/\">being built by Airbus in Stevenage</a>\n(and Germany and France, as it's a complex instrument!) it took off without a hitch. Looking forward to getting my hands on the first results later in the year over at the <a href=\"https://eo.conservation.cam.ac.uk\">Centre for Earth Observation</a>.</p>\n<p>Check out this cool <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">ESA video</a> about the instrument to learn more, and congratulations to the team at ESA. 
Looking forward to the next <a href=\"https://anil.recoil.org/notes/biospace-25\">BIOSPACE</a> where there will no doubt be initial buzz about this.</p>\n<p></p><div></div><p></p>\n<p><em>Update 28th June 2025:</em> See also this <a href=\"https://www.bbc.co.uk/news/resources/idt-d7353b50-0fea-46ba-8495-ae9e25192cfe\">beautiful BBC article</a> about the satellite, via <a href=\"https://coomeslab.org\">David Coomes</a>.</p>",+"content": "<p>The <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass\">BIOMASS</a> forest mission satellite was <a href=\"https://www.bbc.co.uk/newsround/articles/c0jzy3g0zx2o\">successfully</a> boosted into space a couple of days ago, after decades of development from just down the road in <a href=\"https://www.gov.uk/government/news/british-built-satellite-to-map-earths-forests-in-3d-for-the-first-time\">Stevenage</a>. I'm excited by this because it's the first global-scale <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">P-band SAR</a> instrument that can penetrate forest canopies to look underneath. This, when combined with <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping\">hyperspectral mapping</a>, will give us a lot more <a href=\"https://anil.recoil.org/projects/rsn\">insight</a> into global tree health.</p>\n<p>Weirdly, the whole thing almost never happened because permission to use the <a href=\"https://ieeexplore.ieee.org/document/9048581\">P-band</a> was blocked back in 2013, as it might <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">interfere with US nuclear missile warning radars</a>.</p>\n<blockquote>\n<p>Meeting in Graz, Austria, to select the 7th Earth Explorer mission to be flown by the 20-nation European Space Agency (ESA), backers of the Biomass mission were pelted with questions about how badly the U.S. 
network of missile warning and space-tracking radars in North America, Greenland and Europe would undermine Biomass\u2019 global carbon-monitoring objectives.</p>\n<p>Europe's Earth observation satellite system may be the world's most dynamic, but as it pushes its operating envelope into new areas, it is learning a lesson long ago taught to satellite telecommunications operators: Radio frequency is scarce, and once users have a piece of it they hold fast.\n-- <a href=\"https://spacenews.com/us-missile-warning-radars-could-squelch-esas-proposed-biomass-mission/\">Spacenews</a> (2013)</p>\n</blockquote>\n<p>Luckily, all this got sorted by international frequency negotiators, and after\n<a href=\"https://www.thecomet.net/news/25125302.satellite-built-stevenage-airbus-launches-space/\">being built by Airbus in Stevenage</a>\n(and Germany and France, as it's a complex instrument!) it took off without a hitch. Looking forward to getting my hands on the first results later in the year over at the <a href=\"https://eo.conservation.cam.ac.uk\">Centre for Earth Observation</a>.</p>\n<p>Check out this cool <a href=\"https://www.esa.int/Applications/Observing_the_Earth/FutureEO/Biomass/The_instrument\">ESA video</a> about the instrument to learn more, and congratulations to the team at ESA. Looking forward to the next <a href=\"https://anil.recoil.org/notes/biospace-25\">BIOSPACE</a> where there will no doubt be initial buzz about this.</p>\n<p></p><div></div><p></p>\n<p><em>Update 28th June 2025:</em> See also this <a href=\"https://www.bbc.co.uk/news/resources/idt-d7353b50-0fea-46ba-8495-ae9e25192cfe\">beautiful BBC article</a> about the satellite, via <a href=\"https://coomeslab.org\">David Coomes</a>.</p>",
+18
avsm/notes_biospace-25.json
···+"summary": "<p>The <a href=\"https://www.esa.int/\">European Space Agency</a> organised the first conference on <a href=\"https://biospace25.esa.int/\">Biodiversity Insights from Space</a> (BioSpace) in February this year, and it seems like it was a huge success. The conference itself sold out within days, and the program <a href=\"https://biospace25.esa.int/agenda/\">was so packed</a> that the organisers had to split it into multiple chunks during the week to cope with everyone. I've only just gotten around to fully browsing the <a href=\"https://biospace25.esa.int/agenda/\">schedule</a>, and it's incredible to see so much variety of work happening in biodiversity and remote sensing. Here's hoping that <a href=\"https://www.esa.int/\">ESA</a> makes this an annual event in Italy!</p>\n<p><a href=\"https://coomeslab.org\">David Coomes</a>, who was on the scientific selection committee, told us about it so we hastily submitted a few abstracts which got selected for presentation! David himself <a href=\"https://biospace25.esa.int/iframe-agenda/files/ID498_Coomes.pdf\">talked about forest disturbance</a>.</p>\n<p><a href=\"https://www.youtube.com/live/e-eQ8XhRrsE?t=14326s\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/biospace-ss-1.webp\" title=\"\">\n </a></p>\n<h2><a href=\"https://anil.recoil.org/#from-ground-to-canopy-integrating-ground-based-sensors-with-remote-sensing-to-improve-urban-tree-management\"></a>From Ground to Canopy: Integrating Ground-based Sensors with Remote Sensing to Improve Urban Tree Management</h2>\n<p><a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> presented the work we've been <a href=\"https://anil.recoil.org/papers/2024-terracorder\">exploring</a> at Cambridge and Imperial around using <a href=\"https://anil.recoil.org/papers/2025-npu-bench\">ultra low power sensors</a> for biodiversity monitoring and <a href=\"https://anil.recoil.org/ideas/urban-vegetation\">urban 
health</a>:</p>\n<blockquote>\n<p>Urban trees are essential for supporting biodiversity, as they provide\nhabitats for various species and help regulate water storage and temperature,\nand sequester CO\u2082 in urban ecosystems. Urban forests have been proposed as a\nnature-based solution to fight climate change and provide ecosystem services\nto citizens. Mapping and monitoring urban trees is vital as it facilitates\nconservation strategies for both flora and fauna, early diagnosis of plant\npathogens, and zoning and urban development.</p>\n<p>However, mapping trees has\nproved difficult for urban planners since they rely on in situ surveys or\ncommunity-led projects that may not cover all areas; one such case is London,\nwhere the official survey only accounts for ~10% of the estimated 8 million\ntrees in the city. Moreover, the geographic coordinates of trees are\nsurprisingly unreliable due to a lack of precision of measuring devices (e.g.\nphones or commercial GPS).</p>\n<p>We propose a method for calibrating urban tree\nlocations using physical ground sensors as "anchors". These sensors help\nreconcile spatial mismatches across various spatial datasets, including\nhigh-resolution satellite and aerial imagery and tree surveys collected by\ncity councils or in open-data projects like OSM. These low-power sensors can\nalso collect microclimate and other biodiversity-related data, such as\npassive acoustic animal activity monitoring, providing a richer picture of\ntree and urban ecosystem health and enabling high-resolution maps not\npreviously possible. 
Our ultimate goal is to combine remote sensing\ninformation with ground-based measurements to support reliable data that can\nbe used in geographic-based foundation models to help better urban planning\nstrategies around trees that maximise their benefit to humans and nature.</p>\n</blockquote>\n<p>\n<img alt=\"The Biospace poster was so big it was half-way to space already\" src=\"https://anil.recoil.org/images/biospace-ss-2.webp\" title=\"The Biospace poster was so big it was half-way to space already\">\nThe Biospace poster was so big it was half-way to space already</p>\n<p>You can read <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a>'s own <a href=\"https://ancazugo.github.io/research/outreach/2025/02/14/biospace25-blog.html\">writeup on his blog</a> and watch the <a href=\"https://www.youtube.com/live/e-eQ8XhRrsE?t=14326s\">recording</a>! <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> would have made it to a poster presentation, but forgot to register in time and missed out due to how packed the conference was!</p>\n<h2><a href=\"https://anil.recoil.org/#establishing-causal-links-which-facilitate-remote-sensing-of-biodiversity-metric\"></a>Establishing causal links which facilitate remote sensing of biodiversity metric</h2>\n<p><a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> also prepared a poster for his <a href=\"https://anil.recoil.org/ideas/ssl-for-geospatial-tasks\">PhD work</a> on the topic of causality measurement. His <a href=\"https://www.onkargulati.com/2025/02/28/biospace.html\">notes from the conference</a> about the use of SDGs are great:</p>\n<blockquote>\n<p>My big takeaway from the opening speeches was that this is the first year that the ESA is spending more on building out its data science capabilities than it is on putting satellites into space. 
To me, this is indicative of the fact that the marginal benefit from putting effort into effectively wrangling huge amounts of data is now greater than that from collecting huge amounts of data at a faster pace.</p>\n</blockquote>\n<p>Given the growing amount of <a href=\"https://www.sdo.esoc.esa.int/environment_report/Space_Environment_Report_latest.pdf\">space junk</a> out there, getting more leverage over already gathered data seems very sensible indeed.</p>\n<p>Another important point Onkar makes that I've been noticing in my own thoughts about <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">national data libraries</a> is:</p>\n<blockquote>\n<p>A key point multiple speakers made note of (there were a dozen or so speakers\ntalking for perhaps ~10 minutes each) was that introducing frameworks and\nmethodologies to give countries national ownership of their data and the\nability to independently generate compatible statistics was the priority, not\nintroducing new data products. If we can move towards all countries using the\nsame standards, we can enable the aggregation of statistics up in a reliable\nmanner.</p>\n</blockquote>\n<p>Since the February date of this BIOSPACE conference there has, of course, been a huge amount of\ngeopolitical flux in the world. Countries gaining national ownership of <em>their\nown</em> data seems more important than ever.\nOnkar's <a href=\"https://www.onkargulati.com/2025/02/28/biospace.html\">full writeup</a> is full of\ninsights derived from the conference, so I encourage you to have a direct read!</p>",+"content": "<p>The <a href=\"https://www.esa.int/\">European Space Agency</a> organised the first conference on <a href=\"https://biospace25.esa.int/\">Biodiversity Insights from Space</a> (BioSpace) in February this year, and it seems like it was a huge success. 
The conference itself sold out within days, and the program <a href=\"https://biospace25.esa.int/agenda/\">was so packed</a> that the organisers had to split it into multiple chunks during the week to cope with everyone. I've only just gotten around to fully browsing the <a href=\"https://biospace25.esa.int/agenda/\">schedule</a>, and it's incredible to see so much variety of work happening in biodiversity and remote sensing. Here's hoping that <a href=\"https://www.esa.int/\">ESA</a> makes this an annual event in Italy!</p>\n<p><a href=\"https://coomeslab.org\">David Coomes</a>, who was on the scientific selection committee, told us about it so we hastily submitted a few abstracts which got selected for presentation! David himself <a href=\"https://biospace25.esa.int/iframe-agenda/files/ID498_Coomes.pdf\">talked about forest disturbance</a>.</p>\n<p><a href=\"https://www.youtube.com/live/e-eQ8XhRrsE?t=14326s\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/biospace-ss-1.webp\" title=\"\">\n </a></p>\n<h2><a href=\"https://anil.recoil.org/#from-ground-to-canopy-integrating-ground-based-sensors-with-remote-sensing-to-improve-urban-tree-management\"></a>From Ground to Canopy: Integrating Ground-based Sensors with Remote Sensing to Improve Urban Tree Management</h2>\n<p><a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> presented the work we've been <a href=\"https://anil.recoil.org/papers/2024-terracorder\">exploring</a> at Cambridge and Imperial around using <a href=\"https://anil.recoil.org/papers/2025-npu-bench\">ultra low power sensors</a> for biodiversity monitoring and <a href=\"https://anil.recoil.org/ideas/urban-vegetation\">urban health</a>:</p>\n<blockquote>\n<p>Urban trees are essential for supporting biodiversity, as they provide\nhabitats for various species and help regulate water storage and temperature,\nand sequester CO\u2082 in urban ecosystems. Urban forests have been proposed as a\nnature-based solution to fight climate 
change and provide ecosystem services\nto citizens. Mapping and monitoring urban trees is vital as it facilitates\nconservation strategies for both flora and fauna, early diagnosis of plant\npathogens, and zoning and urban development.</p>\n<p>However, mapping trees has\nproved difficult for urban planners since they rely on in situ surveys or\ncommunity-led projects that may not cover all areas; one such case is London,\nwhere the official survey only accounts for ~10% of the estimated 8 million\ntrees in the city. Moreover, the geographic coordinates of trees are\nsurprisingly unreliable due to a lack of precision of measuring devices (e.g.\nphones or commercial GPS).</p>\n<p>We propose a method for calibrating urban tree\nlocations using physical ground sensors as "anchors". These sensors help\nreconcile spatial mismatches across various spatial datasets, including\nhigh-resolution satellite and aerial imagery and tree surveys collected by\ncity councils or in open-data projects like OSM. These low-power sensors can\nalso collect microclimate and other biodiversity-related data, such as\npassive acoustic animal activity monitoring, providing a richer picture of\ntree and urban ecosystem health and enabling high resolution maps not\npreviously possible. 
Our ultimate goal is to combine remote sensing\ninformation with ground-based measurements to support reliable data that can\nbe used in geographic-based foundation models to help better urban planning\nstrategies around trees that maximise their benefit to humans and nature.</p>\n</blockquote>\n<p>\n<img alt=\"The Biospace poster was so big it was half-way to space already\" src=\"https://anil.recoil.org/images/biospace-ss-2.webp\" title=\"The Biospace poster was so big it was half-way to space already\">\nThe Biospace poster was so big it was half-way to space already</p>\n<p>You can read <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a>'s own <a href=\"https://ancazugo.github.io/research/outreach/2025/02/14/biospace25-blog.html\">writeup on his blog</a> and watch the <a href=\"https://www.youtube.com/live/e-eQ8XhRrsE?t=14326s\">recording</a>! <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> would have made it to a poster presentation, but forgot to register in time and missed out due to how packed the conference was!</p>\n<h2><a href=\"https://anil.recoil.org/#establishing-causal-links-which-facilitate-remote-sensing-of-biodiversity-metric\"></a>Establishing causal links which facilitate remote sensing of biodiversity metric</h2>\n<p><a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> also prepared a poster for his <a href=\"https://anil.recoil.org/ideas/ssl-for-geospatial-tasks\">PhD work</a> on the topic of causality measurement. His <a href=\"https://www.onkargulati.com/2025/02/28/biospace.html\">notes from the conference</a> about the use of SDGs are great:</p>\n<blockquote>\n<p>My big takeaway from the opening speeches was that this is the first year that the ESA is spending more on building out its data science capabilities than it is on putting satellites into space. 
To me, this is indicative of the fact that the marginal benefit from putting effort into effectively wrangling huge amounts of data is now greater than that from collecting huge amounts of data at a faster pace.</p>\n</blockquote>\n<p>Given the growing amount of <a href=\"https://www.sdo.esoc.esa.int/environment_report/Space_Environment_Report_latest.pdf\">space junk</a> out there, getting more leverage over already gathered data seems very sensible indeed.</p>\n<p>Another important point Onkar makes that I've been noticing in my own thoughts about <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">national data libraries</a> is:</p>\n<blockquote>\n<p>A key point multiple speakers made note of (there were a dozen or so speakers\ntalking for perhaps ~10 minutes each) was that introducing frameworks and\nmethodologies to give countries national ownership of their data and the\nability to independently generate compatible statistics was the priority, not\nintroducing new data products. If we can move towards all countries using the\nsame standards, we can enable the aggregation of statistics up in a reliable\nmanner.</p>\n</blockquote>\n<p>Since the February date of this BIOSPACE conference there has, of course, been a huge amount of\ngeopolitical flux in the world. Countries gaining national ownership of <em>their\nown</em> data seems more important than ever.\nOnkar's <a href=\"https://www.onkargulati.com/2025/02/28/biospace.html\">full writeup</a> is full of\ninsights derived from the conference, so I encourage you to have a direct read!</p>",
+18
avsm/notes_breaking-up-mirageos.json
+18
avsm/notes_breaking-up-mirageos.json
···+"summary": "<p>One of the main advantages of having hypervisors is that you can have strongly isolated services within a single machine. But it's really hard to actually build these specialised services; that is, until MirageOS came along. This post discusses how to build so-called "stub domains" for Xen using MirageOS.</p>",+"content": "<p>One of the main advantages of having hypervisors is that you can have strongly isolated services within a single machine. But it's really hard to actually build these specialised services; that is, until MirageOS came along. This post discusses how to build so-called "stub domains" for Xen using MirageOS.</p>",
+18
avsm/notes_bushel-lives.json
+18
avsm/notes_bushel-lives.json
···+"summary": "<p>This website has been through quite a few iterations over the years. The first version in 1998 was written in Perl and hosted on <a href=\"https://anil.recoil.org/\">OpenBSD</a>; the second was rewritten in 2000 when I <a href=\"https://anil.recoil.org/notes/commit-access-to-php\">got commit access to PHP</a>; the third rewrite became a hybrid OCaml/PHP/Perl special in 2004 in <a href=\"https://en.wikipedia.org/wiki/Blosxom\">Blosxom</a>; then the fourth rewrite around 2013 got turned into a <a href=\"https://anil.recoil.org/projects/unikernels\">unikernel</a> in MirageOS; then the <a href=\"https://web.archive.org/web/20220118200046/https://anil.recoil.org/\">fifth</a> in 2019 transitioned to an OCaml static site generator hosted on a prerelease <a href=\"https://github.com/avsm/eeww\">multicore OCaml webserver</a>. So the sixth generation now needs something to continue the grand <a href=\"https://en.wikipedia.org/wiki/Rube_Goldberg_machine\">Rube Goldberg</a> tradition of helping me learn the latest and greatest in systems technology.</p>\n<p>And so here it is! The site is now written in a bleeding-edge unreleased variant of OCaml with extensions based around <a href=\"https://blog.janestreet.com/icfp-2024-index/\">Rust-like type system features</a> activated, including rather exciting <a href=\"https://popl25.sigplan.org/details/POPL-2025-popl-research-papers/23/Data-Race-Freedom-la-Mode\">data-race freedom</a> work that just won a best paper award at POPL 2025. 
It's normally difficult to work on continuously moving compilers, but Diana Kalinichenko put a tremendous amount of work into making it usable with opam out of the box, and this post documents the journey to getting this website live.</p>\n<h2><a href=\"https://anil.recoil.org/#getting-the-oxidised-compiler\"></a>Getting the oxidised compiler</h2>\n<p>Firstly, we did some groundwork a few months ago by adding support into the opam-repository for <a href=\"https://github.com/ocaml/opam-repository/pull/26471\">bootstrap versions</a> of dune, menhir and ocamlfind. These are used to build the Jane Street version of the OCaml compiler, which is published as an <a href=\"https://github.com/janestreet/opam-repository/tree/with-extensions\">opam-repository#with-extensions</a>.</p>\n<p>The extensions there are straightforward for those familiar with opam. On a clean system you can run:</p>\n<pre><code>$ opam init\n$ opam switch create 5.2.0+flambda2 \\\n --repos with-extensions=git+https://github.com/janestreet/opam-repository.git#with-extensions,default\n$ eval $(opam env)\n</code></pre>\n<p>This creates a new opam switch known as <code>5.2.0+flambda2</code>, and we can then verify it's running the variant compiler.</p>\n<pre><code>$ ocaml\nOCaml version 5.2.0+jst\nEnter #help;; for help.\n# let () =\n let local_message : string @@ local = "Hello, World" in\n print_endline local_message\n ;;\nError: This value escapes its region.\n</code></pre>\n<p>That last bit is the new region magic which I'm keen to start experimenting with for this website! But before that, we need to get the rest of the ecosystem packages needed for the website working under this compiler.</p>\n<h2><a href=\"https://anil.recoil.org/#installing-ecosystem-packages\"></a>Installing ecosystem packages</h2>\n<p>I decided to build the new site based on a content manager I've been designing\n(and scrapping) for a few years, codenamed Bushel. 
The basic idea behind\nBushel is to extend Markdown sufficiently with rich contextual data (such as\ncontacts, papers, projects, ideas and so on), and allow for cross-referencing\nto <em>other</em> sites that also follow the Bushel protocol. I'll talk about that in\nmore detail in future posts, but for now that means that I need a more dynamic\nwebsite than the static one I used for the past few years.</p>\n<p>Since the Jane Street compiler doesn't yet support the effect system from OCaml 5, I couldn't use my own Eio-based webserver. So after some discussion with <a href=\"https://mynameismwd.org\">Michael Dales</a> who is <a href=\"https://digitalflapjack.com/blog/the-partially-dynamic-web/\">also porting his site to OCaml</a>, I took the opportunity to learn the excellent <a href=\"https://aantron.github.io/dream/\">Dream</a> server, which is based on Lwt. I also used Daniel Bunzli's <a href=\"https://discuss.ocaml.org/t/ann-cmarkit-0-3-0-commonmark-parser-and-renderer-for-ocaml/13622\">cmarkit</a> library for Markdown parsing, and my own <a href=\"https://github.com/avsm/jekyll_format\">Jekyll_format</a> and <a href=\"https://github.com/avsm/ocaml-yaml\">yaml</a> libraries.</p>\n<p>Amazingly, all of these libraries worked out of the box on the Jane Street\ncompiler, except for one snag: the parsetree internals have changed in their\nbranch. This means that <a href=\"https://ocaml.org/docs/metaprogramming\">PPX</a>\nextensions will not work out-of-the-box. 
Thankfully, there is an abstraction\nlibrary called <a href=\"https://discuss.ocaml.org/t/ann-ppxlib-034-0/15952\">ppxlib</a> which\nhas been ported to the variant compiler, and the differences in the parse tree\nwere easy to fix up (thanks Nathan Reb and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> for your recent ppxlib work!)</p>\n<p>After forking and fixing just two libraries that were using ppx (and not part of the\nJane Street core libraries that were already ported), all I had to do was to pin them\nand add them to my development environment.</p>\n<pre><code>opam pin add ppxlib 0.33.0+jst\nopam pin add dream-httpaf 1.0.0~alpha4\nopam pin add hpack https://github.com/avsm/ocaml-h2.git#js-extensions-fixes\nopam pin add lwt_ppx https://github.com/avsm/lwt.git#js-extensions-fixes\n</code></pre>\n<p>And this then installs the overridden version of packages that I needed,\nwith the pins making sure that the right dependencies were also present.\nAfter that, it was plain sailing! I've now compiled up a native code version\nof my webserver code, deployed it into a <a href=\"https://anil.recoil.org/\">Docker</a> container, and\ndeployed it on Linux.</p>\n<p>In the future, I hope to use <a href=\"https://preview.dune.build\">dune package management</a> to ease the deployment\nof the site, but it didn't work in its current preview form due to a <a href=\"https://github.com/ocaml/dune/issues/11405\">problem\nwith depopts</a>. 
Just teething\nproblems with a preview, so I'll post more about that when I get it working!\nI also have a half-finished port of the variant compiler to OpenBSD, so that\nI can shift my website back to its familiar home rather than running on Linux.</p>\n<p>I haven't yet actually taken advantage of any of the new extensions in the\nJane Street variant, since I wanted to get this site up and running first.\nI'll tidy up the code, open source it in the coming weeks, and then we can\ndive into some region extensions and see how far I get!</p>",+"content": "<p>This website has been through quite a few iterations over the years. The first version in 1998 was written in Perl and hosted on <a href=\"https://anil.recoil.org/\">OpenBSD</a>; the second was rewritten in 2000 when I <a href=\"https://anil.recoil.org/notes/commit-access-to-php\">got commit access to PHP</a>; the third rewrite became a hybrid OCaml/PHP/Perl special in 2004 in <a href=\"https://en.wikipedia.org/wiki/Blosxom\">Blosxom</a>; then the fourth rewrite around 2013 got turned into a <a href=\"https://anil.recoil.org/projects/unikernels\">unikernel</a> in MirageOS; then the <a href=\"https://web.archive.org/web/20220118200046/https://anil.recoil.org/\">fifth</a> in 2019 transitioned to an OCaml static site generator hosted on a prerelease <a href=\"https://github.com/avsm/eeww\">multicore OCaml webserver</a>. So the sixth generation now needs something to continue the grand <a href=\"https://en.wikipedia.org/wiki/Rube_Goldberg_machine\">Rube Goldberg</a> tradition of helping me learn the latest and greatest in systems technology.</p>\n<p>And so here it is! 
The site is now written in a bleeding-edge unreleased variant of OCaml with extensions based around <a href=\"https://blog.janestreet.com/icfp-2024-index/\">Rust-like type system features</a> activated, including rather exciting <a href=\"https://popl25.sigplan.org/details/POPL-2025-popl-research-papers/23/Data-Race-Freedom-la-Mode\">data-race freedom</a> work that just won a best paper award at POPL 2025. It's normally difficult to work on continuously moving compilers, but Diana Kalinichenko put a tremendous amount of work into making it usable with opam out of the box, and this post documents the journey to getting this website live.</p>\n<h2><a href=\"https://anil.recoil.org/#getting-the-oxidised-compiler\"></a>Getting the oxidised compiler</h2>\n<p>Firstly, we did some groundwork a few months ago by adding support into the opam-repository for <a href=\"https://github.com/ocaml/opam-repository/pull/26471\">bootstrap versions</a> of dune, menhir and ocamlfind. These are used to build the Jane Street version of the OCaml compiler, which is published as an <a href=\"https://github.com/janestreet/opam-repository/tree/with-extensions\">opam-repository#with-extensions</a>.</p>\n<p>The extensions there are straightforward for those familiar with opam. On a clean system you can run:</p>\n<pre><code>$ opam init\n$ opam switch create 5.2.0+flambda2 \\\n --repos with-extensions=git+https://github.com/janestreet/opam-repository.git#with-extensions,default\n$ eval $(opam env)\n</code></pre>\n<p>This creates a new opam switch known as <code>5.2.0+flambda2</code>, and we can then verify it's running the variant compiler.</p>\n<pre><code>$ ocaml\nOCaml version 5.2.0+jst\nEnter #help;; for help.\n# let () =\n let local_message : string @@ local = "Hello, World" in\n print_endline local_message\n ;;\nError: This value escapes its region.\n</code></pre>\n<p>That last bit is the new region magic which I'm keen to start experimenting with for this website! 
But before that, we need to get the rest of the ecosystem packages needed for the website working under this compiler.</p>\n<h2><a href=\"https://anil.recoil.org/#installing-ecosystem-packages\"></a>Installing ecosystem packages</h2>\n<p>I decided to build the new site based on a content manager I've been designing\n(and scrapping) for a few years, codenamed Bushel. The basic idea behind\nBushel is to extend Markdown sufficiently with rich contextual data (such as\ncontacts, papers, projects, ideas and so on), and allow for cross-referencing\nto <em>other</em> sites that also follow the Bushel protocol. I'll talk about that in\nmore detail in future posts, but for now that means that I need a more dynamic\nwebsite than the static one I used for the past few years.</p>\n<p>Since the Jane Street compiler doesn't yet support the effect system from OCaml 5, I couldn't use my own Eio-based webserver. So after some discussion with <a href=\"https://mynameismwd.org\">Michael Dales</a> who is <a href=\"https://digitalflapjack.com/blog/the-partially-dynamic-web/\">also porting his site to OCaml</a>, I took the opportunity to learn the excellent <a href=\"https://aantron.github.io/dream/\">Dream</a> server, which is based on Lwt. I also used Daniel Bunzli's <a href=\"https://discuss.ocaml.org/t/ann-cmarkit-0-3-0-commonmark-parser-and-renderer-for-ocaml/13622\">cmarkit</a> library for Markdown parsing, and my own <a href=\"https://github.com/avsm/jekyll_format\">Jekyll_format</a> and <a href=\"https://github.com/avsm/ocaml-yaml\">yaml</a> libraries.</p>\n<p>Amazingly, all of these libraries worked out of the box on the Jane Street\ncompiler, except for one snag: the parsetree internals have changed in their\nbranch. This means that <a href=\"https://ocaml.org/docs/metaprogramming\">PPX</a>\nextensions will not work out-of-the-box. 
Thankfully, there is an abstraction\nlibrary called <a href=\"https://discuss.ocaml.org/t/ann-ppxlib-034-0/15952\">ppxlib</a> which\nhas been ported to the variant compiler, and the differences in the parse tree\nwere easy to fix up (thanks Nathan Reb and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> for your recent ppxlib work!)</p>\n<p>After forking and fixing just two libraries that were using ppx (and not part of the\nJane Street core libraries that were already ported), all I had to do was to pin them\nand add them to my development environment.</p>\n<pre><code>opam pin add ppxlib 0.33.0+jst\nopam pin add dream-httpaf 1.0.0~alpha4\nopam pin add hpack https://github.com/avsm/ocaml-h2.git#js-extensions-fixes\nopam pin add lwt_ppx https://github.com/avsm/lwt.git#js-extensions-fixes\n</code></pre>\n<p>And this then installs the overridden version of packages that I needed,\nwith the pins making sure that the right dependencies were also present.\nAfter that, it was plain sailing! I've now compiled up a native code version\nof my webserver code, deployed it into a <a href=\"https://anil.recoil.org/\">Docker</a> container, and\ndeployed it on Linux.</p>\n<p>In the future, I hope to use <a href=\"https://preview.dune.build\">dune package management</a> to ease the deployment\nof the site, but it didn't work in its current preview form due to a <a href=\"https://github.com/ocaml/dune/issues/11405\">problem\nwith depopts</a>. 
Just teething\nproblems with a preview, so I'll post more about that when I get it working!\nI also have a half-finished port of the variant compiler to OpenBSD, so that\nI can shift my website back to its familiar home rather than running on Linux.</p>\n<p>I haven't yet actually taken advantage of any of the new extensions in the\nJane Street variant, since I wanted to get this site up and running first.\nI'll tidy up the code, open source it in the coming weeks, and then we can\ndive into some region extensions and see how far I get!</p>",
+18
avsm/notes_bushel-step1.json
+18
avsm/notes_bushel-step1.json
···+"summary": "<p>I've done a redesign of my site after about 20 years since the last one <a href=\"https://anil.recoil.org/notes/opening-anil-recoil-org\">back in 2003</a>.\nThe site design is based on my upcoming Bushel content manager, which I'll post about more once I get the data model in place and try it out properly using this site as a guinea pig.</p>\n<p><a href=\"https://nick.recoil.org\">Nick Ludlam</a> also refreshed <a href=\"https://nick.recoil.org\">his website</a> since we were chatting about how outdated our web presences were, and he also put up a main <a href=\"https://recoil.org\">recoil.org</a> page for the main server.</p>",+"content": "<p>I've done a redesign of my site after about 20 years since the last one <a href=\"https://anil.recoil.org/notes/opening-anil-recoil-org\">back in 2003</a>.\nThe site design is based on my upcoming Bushel content manager, which I'll post about more once I get the data model in place and try it out properly using this site as a guinea pig.</p>\n<p><a href=\"https://nick.recoil.org\">Nick Ludlam</a> also refreshed <a href=\"https://nick.recoil.org\">his website</a> since we were chatting about how outdated our web presences were, and he also put up a main <a href=\"https://recoil.org\">recoil.org</a> page for the main server.</p>",
+18
avsm/notes_c2k5-thoughts.json
+18
avsm/notes_c2k5-thoughts.json
···+"summary": "<p>Finally had some time to get back from the OpenBSD hackathon and take\nstock of what I worked on. It was a pretty interesting one this year, as I\nwent without having much idea of what to work on (unlike last year, when\nI had a mad backlog to catch up on).</p>\n<p>Some stuff I did during the week included:</p>\n<ul>\n<li>Clean up the <a href=\"http://www.openbsd.org/cgi-bin/cvsweb.cgi/src/usr.bin/ssh/atomicio.c\">atomicio</a>\ninterface used in <a href=\"http://www.openssh.com\">OpenSSH</a> and\n<em><a href=\"http://www.openbsd.org/cgi-bin/man.cgi?query=nc\">nc(1)</a></em> to\nprovide simpler semantics. Error checking from read/write functions\nis a real headache in C, as the functions return <code>-1</code> on error,\nwhich means a signed <code>ssize_t</code> is returned. However, they accept an\nunsigned value as the size of the buffer to process, which means\nthey could potentially return a value outside the range of the\nreturn type. This means you have to check if the return is <code>-1</code>,\nwhich indicates an error, and otherwise cast to a <code>size_t</code> to\ncorrectly get the buffer size back. With the new atomicio, it always\nreturns a <code>size_t</code>, and returns <code>0</code> to signal an error (with <code>errno</code>\ncontaining the error, and <code>EPIPE</code> being set for an <code>EOF</code> condition).</li>\n<li>Start looking at the Bluetooth stack to get L2CAP and RFCOMM\nsupport. We are half-way through un-netgraphing the FreeBSD stack\nand having a more traditional <code>netbt</code> socket interface (much like\n<code>netinet</code> or <code>netinet6</code>) to Bluetooth.</li>\n<li>Use <a href=\"http://cil.sf.net/\">CIL</a> to implement a few fun kernel\nsource->source transforms. <code>kerneltrace</code> just accepts a regular\nexpression and inserts a <code>printf</code> in the function prologue which\noutputs the function name and any arguments passed into it. 
Had this\nidea when chatting with <a href=\"http://www.monkey.org/~marius/\">Marius</a>,\nand it turned out to be very useful when trying to figure out\ndataflow in the Bluetooth stack (just compile with\n<code>make CC="/usr/local/bin/cilly --dokerneltrace --trace-regexp='ubt|ng_blue'"</code>).\nThe second one was even simpler; <code>randomvars</code> assigns a non-zero\nvalue to every local variable in a function call to help track down\nuninitialized-local-variable bugs. Here's\n<a href=\"http://www.openbsd.org/cgi-bin/cvsweb.cgi/src/usr.bin/mg/search.c.diff?r1=1.15&r2=1.16\">one</a>\nChad Loder found in\n<em><a href=\"http://www.openbsd.org/cgi-bin/man.cgi?query=mg\">mg(1)</a></em>.</li>\n<li>Other random <a href=\"http://marc.theaimsgroup.com/?l=openbsd-cvs&m=111689009724884&w=2\">signed/unsigned cleanups</a>\nin OpenSSH. Boring but important I guess...</li>\n</ul>\n<p>All in all, the hackathon re-motivated me to continue work on the\nOCaml-based daemons that <a href=\"https://github.com/djs55\">Dave Scott</a> and I have been\nhacking on. I don't want to be fixing random buffer or integer overflows\nin an OpenBSD hackathon 5 years from now; we need to move on to more\nhigh-level issues.</p>",+"content": "<p>Finally had some time to get back from the OpenBSD hackathon and take\nstock of what I worked on. It was a pretty interesting one this year, as I\nwent without having much idea of what to work on (unlike last year, when\nI had a mad backlog to catch up on).</p>\n<p>Some stuff I did during the week included:</p>\n<ul>\n<li>Clean up the <a href=\"http://www.openbsd.org/cgi-bin/cvsweb.cgi/src/usr.bin/ssh/atomicio.c\">atomicio</a>\ninterface used in <a href=\"http://www.openssh.com\">OpenSSH</a> and\n<em><a href=\"http://www.openbsd.org/cgi-bin/man.cgi?query=nc\">nc(1)</a></em> to\nprovide simpler semantics. 
Error checking from read/write functions\nis a real headache in C, as the functions return <code>-1</code> on error,\nwhich means a signed <code>ssize_t</code> is returned. However, they accept an\nunsigned value as the size of the buffer to process, which means\nthey could potentially return a value outside the range of the\nreturn type. This means you have to check if the return is <code>-1</code>,\nwhich indicates an error, and otherwise cast to a <code>size_t</code> to\ncorrectly get the buffer size back. With the new atomicio, it always\nreturns a <code>size_t</code>, and returns <code>0</code> to signal an error (with <code>errno</code>\ncontaining the error, and <code>EPIPE</code> being set for an <code>EOF</code> condition).</li>\n<li>Start looking at the Bluetooth stack to get L2CAP and RFCOMM\nsupport. We are half-way through un-netgraphing the FreeBSD stack\nand having a more traditional <code>netbt</code> socket interface (much like\n<code>netinet</code> or <code>netinet6</code>) to Bluetooth.</li>\n<li>Use <a href=\"http://cil.sf.net/\">CIL</a> to implement a few fun kernel\nsource->source transforms. <code>kerneltrace</code> just accepts a regular\nexpression and inserts a <code>printf</code> in the function prologue which\noutputs the function name and any arguments passed into it. 
Here's\n<a href=\"http://www.openbsd.org/cgi-bin/cvsweb.cgi/src/usr.bin/mg/search.c.diff?r1=1.15&r2=1.16\">one</a>\nChad Loder found in\n<em><a href=\"http://www.openbsd.org/cgi-bin/man.cgi?query=mg\">mg(1)</a></em>.</li>\n<li>Other random <a href=\"http://marc.theaimsgroup.com/?l=openbsd-cvs&m=111689009724884&w=2\">signed/unsigned cleanups</a>\nin OpenSSH. Boring but important I guess...</li>\n</ul>\n<p>All in all, the hackathon re-motivated me to continue work on the\nOCaml-based daemons that <a href=\"https://github.com/djs55\">Dave Scott</a> and I have been\nhacking on. I don't want to be fixing random buffer or integer overflows\nin an OpenBSD hackathon 5 years from now; we need to move on to more\nhigh-level issues.</p>",
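The atomicio contract described in this entry (always return a `size_t`, with `0` signalling failure, `errno` carrying the cause, and `EPIPE` standing in for EOF) can be sketched in C. This is an illustrative reimplementation of those semantics, not the actual OpenSSH `atomicio.c` code:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Sketch of the "new atomicio" semantics: loop until the whole buffer
 * is processed, return the byte count as an unsigned size_t, and
 * signal failure with 0 (errno carries the cause; EPIPE means EOF
 * before the buffer was filled). Illustration only, not OpenSSH code. */
static size_t
atomic_read(int fd, void *buf, size_t n)
{
	char *p = buf;
	size_t done = 0;

	while (done < n) {
		ssize_t res = read(fd, p + done, n - done);
		if (res == -1) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return 0;	/* hard error, errno already set */
		}
		if (res == 0) {
			errno = EPIPE;	/* EOF before buffer filled */
			return 0;
		}
		done += (size_t)res;	/* safe cast: res > 0 here */
	}
	return done;
}
```

The caller never compares against `-1` or casts between signed and unsigned: any non-zero return is the full count requested, which is the simplification the cleanup was after.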
+18
avsm/notes_cambridge-essc-progress.json
···+"summary": "<p>I joined Cambridge's loftily named <a href=\"https://www.governance.cam.ac.uk/committees/essc/Pages/default.aspx\">Environment Sustainability Strategy Committee</a> this academic year, and have attended a couple of meetings with the latest one being held today. While a lot of what goes on is intricately tied into the University's rather <a href=\"https://www.governance.cam.ac.uk/Pages/default.aspx\">special</a> governance structure and the complexity of the College system, there has been significant progress on making all of this more widely visible.</p>\n<p><a href=\"mailto:Sally.Pidgeon@admin.cam.ac.uk\">Sally Pidgeon</a>, our wonderful head of <a href=\"https://www.environment.admin.cam.ac.uk/\">Environmental Sustainability</a>, has been redeveloping the public website and has put a lot of interesting data online.\nThere is now a new <a href=\"https://www.environment.admin.cam.ac.uk/\">Environmental Sustainability website</a> that tracks the University <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach\">commitment</a> structure more closely, with the areas broken up into <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/carbon-and-energy\">Carbon & Energy</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/travel-and-transport\">Travel & Transport</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/waste-and-circular-economy\">Waste & Circular Economy</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/biodiversity\">Biodiversity</a>, and <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/water\">Water</a> usage.</p>\n<p>These pages make it far clearer what our University aims are for operational environmental sustainability, and how we're getting there. 
There's also a dedicated area to <a href=\"https://www.environment.admin.cam.ac.uk/our-progress\">track our actual progress</a> along with a bunch of <a href=\"https://www.environment.admin.cam.ac.uk/news\">case studies</a> such as our own <a href=\"https://www.environment.admin.cam.ac.uk/news/david-attenborough-building-outstanding-environmental-management\">David Attenborough Building at the CCI</a>!</p>\n<p>Some highlights from the progress as I read through them:</p>\n<ul>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/carbon-and-energy-progress\">Carbon & Energy progress</a> reports on two different ways of measuring our energy usage: market <em>or</em> location-based. The location-based emissions reporting is quite straightforward as it involves calculating the kWh of electricity used multiplied by the local grid emissions, therefore representing the mean emission resulting from energy generation within the local Cambridge area.<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> <br>The <a href=\"https://ghgprotocol.org/sites/default/files/2022-12/Scope2_ExecSum_Final.pdf\">market-based approach</a> calculates the emissions resulting from the energy supplier that we contract which spreads out the emissions calculation based on the contracts the energy supplier has. The market-based approach has many of the complexities that we've grappled with in <a href=\"https://4c.cst.cam.ac.uk\">4C</a> for avoided emissions, but is useful for <a href=\"https://ghgprotocol.org/sites/default/files/2022-12/Scope2_ExecSum_Final.pdf\">net-zero reporting</a> of GHG emissions. 
While this is best summarised as being "bloody complicated", it's good to see the University reporting <em>both</em> calculations and letting the readers decide which (or both) calculations to use.\nAnd finally, the use of the term "natural gas" turns out to be a surprisingly <a href=\"https://climatecommunication.yale.edu/publications/should-it-be-called-natural-gas-or-methane/\">bad idea</a>. Names <a href=\"https://news.gallup.com/opinion/polling-matters/169541/name-affordable-care-act-obamacare.aspx\">do matter</a> when it comes to public communication.\n<a href=\"https://www.environment.admin.cam.ac.uk/our-progress/carbon-and-energy-progress\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/essc-1.webp\" title=\"\">\n </a></li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/travel-and-transport-progress\">Transport & Travel progress</a> is fantastic to go through, as I worked on this with <span>Ian Leslie</span> absolutely ages ago with a <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox\">Databox</a>-based <a href=\"https://anil.recoil.org/papers/2012-mpm-caware\">commuting calculator</a>! However, it's a little disappointing to see that there hasn't been much of a systematic change in the modes of transport used, and also that "work-from-home" is excluded from the figures here as that's an obvious way to reduce the emissions associated with travelling. 
It's also interesting to see that business flying has bounced back hard since the pandemic despite strict <a href=\"https://www.environment.admin.cam.ac.uk/files/guidelines_for_sustainable_business_travel_approved.pdf\">business travel guidelines</a> that require us to use trains when possible.\n<a href=\"https://www.environment.admin.cam.ac.uk/our-progress/travel-and-transport-progress\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/essc-2.webp\" title=\"\">\n </a></li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/waste-and-circular-economy-progress\">Waste and Circular economy progress</a> appears to be largely flatlined in the last couple of years with not much substantive progress but this is also tied to the <a href=\"https://www.em.admin.cam.ac.uk/reshaping-our-estate\">amount of building work</a> going on in the University and isn't a relative metric (i.e. more building projects will result in more waste, but the University does need to do this building for its operations).</li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/files/uoc_bap.pdf\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/bap-1.webp\" title=\"\">\n </a> <a href=\"https://www.environment.admin.cam.ac.uk/our-progress/biodiversity-progress\">Biodiversity progress</a> is closest to my heart, but also the hardest to assess despite the comprehensive <a href=\"https://www.environment.admin.cam.ac.uk/files/uoc_bap.pdf\">Biodiversity Action Plan</a> from last year (not because anyone's doing a bad job, but biodiversity is just a <em>really</em> complicated <a href=\"https://anil.recoil.org/papers/2024-life\">metric</a>!). 
There's a University-wide biodiversity manager now and a really well described set of action points here.\n<br> My suggestion during the meeting (and one I'll turn into a project idea soon) is that we should put spatial polygons of the progress described up as a layer over the <a href=\"https://map.cam.ac.uk\">University map</a> so people can overlay these data points and get a sense of what's going on (and where we don't have data). <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> has been <a href=\"https://ancazugo.github.io/research/outreach/2025/04/27/weekly-notes.html\">steadily working</a> with the Estates department on a side project regarding this as well!</li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/water-progress\">Water progress</a> shows some of the difficulty of long-term reporting in this space, as a quick glance seems to reveal that we're getting worse in terms of water consumption. However, our monitoring mechanisms were improved in recent years with smart meters, and so we're just getting more accurate. However, the rise in <a href=\"https://anil.recoil.org/notes/ai-for-science-2024\">AI for research</a> has meant that the demand for GPUs is causing our cooling needs to spike as well, with a corresponding increase in water usage.</li>\n</ul>\n<p>So, lots to digest in here, and something I'm still piecing together in the context of the <a href=\"https://anil.recoil.org/notes/cambridge-green-blue\">Cambridge Green Blue</a> idea! The overall message seems clear that we need to continue to push harder for progress towards our net-zero goals to be far higher up the University's strategic plan than it currently is. 
That doesn't necessarily just involve spending more money, but bringing the juggernaut of <a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\">research innovation</a> around here to bear, as well as shifting <a href=\"https://csaenvironmental.co.uk/projects/lord-bridges-solar-farm/\">landuse for renewable energy</a> while preserving biodiversity and water according to the biodiversity action plan.</p>\n\n<ol>\n<li>\n<p><a href=\"https://patrick.sirref.org\">Patrick Ferris</a> has developed a <a href=\"https://github.com/geocaml/carbon-intensity\">carbon-intensity</a> based on this reporting style which <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> then used in a <a href=\"https://anil.recoil.org/papers/2024-loco-carbonres\">carbon-aware DNS server</a> recently. This is an example of location-based emissions data being used.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>I joined Cambridge's loftily named <a href=\"https://www.governance.cam.ac.uk/committees/essc/Pages/default.aspx\">Environment Sustainability Strategy Committee</a> this academic year, and have attended a couple of meetings with the latest one being held today. 
While a lot of what goes on is intricately tied into the University's rather <a href=\"https://www.governance.cam.ac.uk/Pages/default.aspx\">special</a> governance structure and the complexity of the College system, there has been significant progress on making all of this more widely visible.</p>\n<p><a href=\"mailto:Sally.Pidgeon@admin.cam.ac.uk\">Sally Pidgeon</a>, our wonderful head of <a href=\"https://www.environment.admin.cam.ac.uk/\">Environmental Sustainability</a>, has been redeveloping the public website and has put a lot of interesting data online.\nThere is now a new <a href=\"https://www.environment.admin.cam.ac.uk/\">Environmental Sustainability website</a> that tracks the University <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach\">commitment</a> structure more closely, with the areas broken up into <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/carbon-and-energy\">Carbon & Energy</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/travel-and-transport\">Travel & Transport</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/waste-and-circular-economy\">Waste & Circular Economy</a>, <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/biodiversity\">Biodiversity</a>, and <a href=\"https://www.environment.admin.cam.ac.uk/our-commitments-and-approach/water\">Water</a> usage.</p>\n<p>These pages make it far clearer what our University aims are for operational environmental sustainability, and how we're getting there. 
There's also a dedicated area to <a href=\"https://www.environment.admin.cam.ac.uk/our-progress\">track our actual progress</a> along with a bunch of <a href=\"https://www.environment.admin.cam.ac.uk/news\">case studies</a> such as our own <a href=\"https://www.environment.admin.cam.ac.uk/news/david-attenborough-building-outstanding-environmental-management\">David Attenborough Building at the CCI</a>!</p>\n<p>Some highlights from the progress as I read through them:</p>\n<ul>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/carbon-and-energy-progress\">Carbon & Energy progress</a> reports on two different ways of measuring our energy usage: market <em>or</em> location-based. The location-based emissions reporting is quite straightforward as it involves calculating the kWh of electricity used multiplied by the local grid emissions, therefore representing the mean emission resulting from energy generation within the local Cambridge area.<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> <br>The <a href=\"https://ghgprotocol.org/sites/default/files/2022-12/Scope2_ExecSum_Final.pdf\">market-based approach</a> calculates the emissions resulting from the energy supplier that we contract which spreads out the emissions calculation based on the contracts the energy supplier has. The market-based approach has many of the complexities that we've grappled with in <a href=\"https://4c.cst.cam.ac.uk\">4C</a> for avoided emissions, but is useful for <a href=\"https://ghgprotocol.org/sites/default/files/2022-12/Scope2_ExecSum_Final.pdf\">net-zero reporting</a> of GHG emissions. 
While this is best summarised as being "bloody complicated", it's good to see the University reporting <em>both</em> calculations and letting the readers decide which (or both) calculations to use.\nAnd finally, the use of the term "natural gas" turns out to be a surprisingly <a href=\"https://climatecommunication.yale.edu/publications/should-it-be-called-natural-gas-or-methane/\">bad idea</a>. Names <a href=\"https://news.gallup.com/opinion/polling-matters/169541/name-affordable-care-act-obamacare.aspx\">do matter</a> when it comes to public communication.\n<a href=\"https://www.environment.admin.cam.ac.uk/our-progress/carbon-and-energy-progress\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/essc-1.webp\" title=\"\">\n </a></li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/travel-and-transport-progress\">Transport & Travel progress</a> is fantastic to go through, as I worked on this with <span>Ian Leslie</span> absolutely ages ago with a <a href=\"https://anil.recoil.org/papers/2015-aarhus-databox\">Databox</a>-based <a href=\"https://anil.recoil.org/papers/2012-mpm-caware\">commuting calculator</a>! However, it's a little disappointing to see that there hasn't been much of a systematic change in the modes of transport used, and also that "work-from-home" is excluded from the figures here as that's an obvious way to reduce the emissions associated with travelling. 
It's also interesting to see that business flying has bounced back hard since the pandemic despite strict <a href=\"https://www.environment.admin.cam.ac.uk/files/guidelines_for_sustainable_business_travel_approved.pdf\">business travel guidelines</a> that require us to use trains when possible.\n<a href=\"https://www.environment.admin.cam.ac.uk/our-progress/travel-and-transport-progress\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/essc-2.webp\" title=\"\">\n </a></li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/waste-and-circular-economy-progress\">Waste and Circular economy progress</a> appears to be largely flatlined in the last couple of years with not much substantive progress but this is also tied to the <a href=\"https://www.em.admin.cam.ac.uk/reshaping-our-estate\">amount of building work</a> going on in the University and isn't a relative metric (i.e. more building projects will result in more waste, but the University does need to do this building for its operations).</li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/files/uoc_bap.pdf\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/bap-1.webp\" title=\"\">\n </a> <a href=\"https://www.environment.admin.cam.ac.uk/our-progress/biodiversity-progress\">Biodiversity progress</a> is closest to my heart, but also the hardest to assess despite the comprehensive <a href=\"https://www.environment.admin.cam.ac.uk/files/uoc_bap.pdf\">Biodiversity Action Plan</a> from last year (not because anyone's doing a bad job, but biodiversity is just a <em>really</em> complicated <a href=\"https://anil.recoil.org/papers/2024-life\">metric</a>!). 
There's a University-wide biodiversity manager now and a really well described set of action points here.\n<br> My suggestion during the meeting (and one I'll turn into a project idea soon) is that we should put spatial polygons of the progress described up as a layer over the <a href=\"https://map.cam.ac.uk\">University map</a> so people can overlay these data points and get a sense of what's going on (and where we don't have data). <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> has been <a href=\"https://ancazugo.github.io/research/outreach/2025/04/27/weekly-notes.html\">steadily working</a> with the Estates department on a side project regarding this as well!</li>\n<li><a href=\"https://www.environment.admin.cam.ac.uk/our-progress/water-progress\">Water progress</a> shows some of the difficulty of long-term reporting in this space, as a quick glance seems to reveal that we're getting worse in terms of water consumption. However, our monitoring mechanisms were improved in recent years with smart meters, and so we're just getting more accurate. However, the rise in <a href=\"https://anil.recoil.org/notes/ai-for-science-2024\">AI for research</a> has meant that the demand for GPUs is causing our cooling needs to spike as well, with a corresponding increase in water usage.</li>\n</ul>\n<p>So, lots to digest in here, and something I'm still piecing together in the context of the <a href=\"https://anil.recoil.org/notes/cambridge-green-blue\">Cambridge Green Blue</a> idea! The overall message seems clear that we need to continue to push harder for progress towards our net-zero goals to be far higher up the University's strategic plan than it currently is. 
That doesn't necessarily just involve spending more money, but bringing the juggernaut of <a href=\"https://www.cam.ac.uk/stories/ai-and-climate-and-nature\">research innovation</a> around here to bear, as well as shifting <a href=\"https://csaenvironmental.co.uk/projects/lord-bridges-solar-farm/\">landuse for renewable energy</a> while preserving biodiversity and water according to the biodiversity action plan.</p>\n\n<ol>\n<li>\n<p><a href=\"https://patrick.sirref.org\">Patrick Ferris</a> has developed a <a href=\"https://github.com/geocaml/carbon-intensity\">carbon-intensity</a> based on this reporting style which <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> then used in a <a href=\"https://anil.recoil.org/papers/2024-loco-carbonres\">carbon-aware DNS server</a> recently. This is an example of location-based emissions data being used.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/notes_cambridge-green-blue.json
···+"summary": "<p><a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> recently gave a great <a href=\"https://watch.eeg.cl.cam.ac.uk/w/qEsMt2Ayk37SaKgxrfwoBt\">talk</a> in our group about his thoughts on <a href=\"https://mlg.eng.cam.ac.uk/carl/words/mechanisms.pdf\">mechanisms against climate change</a>. He persuasively argued that the <a href=\"https://unfccc.int/process-and-meetings/the-paris-agreement\">Paris Agreement</a> was doing more harm than good by giving the <em>illusion</em> of being a concrete agreement, but is in reality a huge distraction. Our actual <a href=\"https://ourworldindata.org/co2-emissions\">emissions</a> have increased since the Paris agreement was signed!</p>\n<p>Carl <a href=\"https://www.youtube.com/watch?v=naFaQsFxs1g\">argues</a> that a climate system ultimately only responds to collective actions, and without a global cooperative incentive each nation will spring back to their own isolated short-term incentives that lead to an increase in fossil fuel burning. He has just published the "<a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\">Themis Mechanism</a>" as a simple alternative for equitable global emission reduction (<a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis.pdf\">long form</a>). <em>(6th May 2025: See a new <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">article</a> on Themis as well)</em></p>\n<p>This got me brainstorming with Carl about how to test his theories out and we came up with an idea that is either terrible or awesome; please read on and judge appropriately. I think we should take advantage of Cambridge's unique structure to trial the Themis mechanism via a new <strong>competitive decarbonisation sporting league among Colleges that I dub the "Cambridge Green Blue"</strong>. 
Given the Chancellor's recent unveiling of an <a href=\"https://www.theguardian.com/business/2025/jan/28/reeves-plans-to-create-silicon-valley-between-oxford-and-cambridge\">innovation corridor</a> between Oxford and Cambridge, the timing could not be better for an initiative like this. <em>(TL;DR sign up at the bottom of this post if you'd like to participate)</em></p>\n<h2><a href=\"https://anil.recoil.org/#the-basics-of-the-themis-mechanism\"></a>The basics of the Themis mechanism</h2>\n<p>First, let's understand what Carl is <a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis.pdf\">proposing</a>, which is built on three foundations:</p>\n<blockquote>\n<ul>\n<li>Our atmosphere is a shared resource, a commons. Fossil fuel users benefit fully from fuel\nconsumption, while the CO2 cost is spread globally. This dilution effect makes continued\nuse rational for individuals but collectively disastrous. [...] To prevent this,\nwe must cooperate to guarantee positive climate results.</li>\n<li>The root cause of climate change is the failure to account for the true cost of emissions.\nBy treating the atmosphere as a free resource, we encourage overexploitation. Themis\ncorrects this unpriced externality by pricing greenhouse gas emissions.</li>\n<li>Effective cooperation requires a fair guiding principle. Themis upholds equity: that our\natmospheric resources should be shared equally between all humans.\n -- <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a>, <a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\">The Themis Mechanism</a> </li>\n</ul>\n</blockquote>\n<p>As I <a href=\"https://anil.recoil.org/notes/carbon-credits-vs-offsets\">noted last week</a>, most tech companies regularly <a href=\"https://www.theverge.com/2022/8/1/23287351/amazon-climate-change-carbon-emissions-worse-2021\">break</a> future carbon pledges due to competitive pressure. 
So it's good to see that Themis requires only immediate commitments rather than <a href=\"https://climate.ec.europa.eu/eu-action/climate-strategies-targets/2050-long-term-strategy_en\">long-term pledges</a> which are impossible to police. Instead of forcing <a href=\"https://climateactiontracker.org/publications/the-climate-crisis-worsens-the-warming-outlook-stagnates/\">unwilling</a> participants to join, Themis is a coalition in which partners check on each other, learn by doing, and build up mutual trust.</p>\n<p><a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/themis-ss-1.webp\" title=\"\">\n </a></p>\n<p>The core scheme itself is based on a value <em>Py</em> which is the price of emitting a single ton of CO2 into the atmosphere in year <em>y</em>. Here's how it works:</p>\n<ol>\n<li>Every year <em>y</em> there is a price <em>Py</em> that all nations agree to.</li>\n<li>At year end, each member pays <em>Py</em> times their emissions into a common pool.</li>\n<li>The pool is immediately redistributed to members in proportion to their population.</li>\n<li>Each member votes on <em>Py+1</em> and the median result decides next year's price.</li>\n</ol>\n<p>This mechanism only depends on per-capita emissions for one year, and not on\nany <a href=\"https://www.carbonbrief.org/analysis-95-of-countries-miss-un-deadline-to-submit-2035-climate-pledges/\">future pledges</a> or <a href=\"http://pdf.wri.org/navigating_numbers_chapter6.pdf\">historic emissions</a>. If a country has above average per capita emissions, then\nthey pay into the common pool. If they are below average per capita, then the country\nbenefits from payments from the pool. 
The system permits co-existence with any other\ncarbon reduction efforts, and works with a non-exhaustive pool of nations participating.</p>\n<h2><a href=\"https://anil.recoil.org/#will-themis-be-more-effective-than-paris\"></a>Will Themis be more effective than Paris?</h2>\n<p>The main reason Themis might fail is that participating in the league <a href=\"https://www.ft.com/content/921381a8-48a4-4bb9-9196-b1d49f871bb7\">disadvantages</a> the participants vs those just continuing with business-as-usual. The economics theory behind Themis is similar to a <a href=\"https://en.wikipedia.org/wiki/Pigouvian_tax\">Pigouvian tax</a>\nwhich dates back to a century ago, when the Cambridge economist <a href=\"https://en.wikipedia.org/wiki/Arthur_Cecil_Pigou\">Arthur Pigou</a> suggested in 1920 that a tax equal to the external cost of pollution could align private costs with social costs. This idea also works for <a href=\"https://www.ecb.europa.eu/pub/pdf/scpwps/ecb.wp2812~81379c0224.en.pdf\">discounting</a> <a href=\"https://www.nature.com/articles/s41558-023-01680-x\">future</a> actions, and is the basis for some of our own work on <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">pricing impermanent but delayed emissions</a>.</p>\n<p>From an economic theory perspective, Pigou and the other prominent Cambridge economist at the time <a href=\"https://en.wikipedia.org/wiki/John_Maynard_Keynes\">JM Keynes</a><a href=\"https://anil.recoil.org/#fn-2\">[1]</a> had deep <a href=\"https://www.tandfonline.com/doi/pdf/10.1080/10370196.1994.11733148\">disagreements</a>. Keynes argued for higher interest rates to boost aggregate growth, while Pigou wanted to give people an increase in real wealth relative to prices. 
Both of their approaches ultimately <a href=\"https://en.wikipedia.org/wiki/Post-war_displacement_of_Keynesianism\">lost out</a> by the 1980s as free market economics ruled supreme instead, leading to the current <em>"grow, emit and die"</em> competitive spiral of doom we find ourselves in. However, Pigou's theories are clearly ones we should <a href=\"https://link.springer.com/article/10.1007/s10797-020-09653-y\">revisit today</a> in light of Themis: by raising the cost of emitting via taxes (or Themis contributions) we can incentivise countries to reduce pollution or decarbonise instead of treating the atmosphere as a free sink to dump into.</p>\n<p>A modern counterpoint to the "lack of competitiveness" argument from participating in a emissions reduction competition is the increasing evidence of <a href=\"https://www.nhm.ac.uk/discover/news/2025/january/ocean-temperature-rise-accelerating-greenhouse-gas-levels-rising.html\">runaway</a> <a href=\"https://anil.recoil.org/notes/rs-ecorisk-day1\">tipping points</a> that might suddenly need everyone to decarbonise really quickly. <a href=\"https://www.katharinehayhoe.com/\">Katherine Hayhoe</a>, the chief scientist at TNC observes that we <a href=\"https://www.motherjones.com/environment/2022/06/climate-scientist-katharine-hayhoe-crisis-adaptation-global-warming-impact/\">can't adapt our way out of this climate crisis</a> due to the sheer magnitude of change that will occur if we continue to emit.</p>\n<blockquote>\n<p>Our infrastructure, worth trillions of dollars, built over decades, was built for a planet that no longer exists [...]\n - Katherine Hayhoe, <a href=\"https://www.theguardian.com/environment/2022/jun/01/we-cannot-adapt-our-way-out-of-climate-crisis-warns-leading-scientist\">The Guardian</a> 2022</p>\n</blockquote>\n<p>This is a pragmatic point in favour of countries joining Themis, since participation strengthens their economic infrastructure towards decarbonisation. 
By joining, countries can trade off some short term losses in their economy with being well hedged for either a "sudden" black swan <a href=\"https://en.wikipedia.org/wiki/Tipping_points_in_the_climate_system\">climate tipping point</a> that requires rapid change in their societal infrastructure, and it also gives them a long-term advantage heading into the inevitable <a href=\"https://cleantechnica.com/2024/09/12/virtual-power-plants-may-hold-the-key-to-an-all-electric-future/\">electric future</a>. So perhaps the fact that things are now much worse since Paris could force the emergence of cooperative groups who wish to <a href=\"https://www.e3g.org/wp-content/uploads/E3G-Report-Living-on-the-Edge-How-Climate-Tipping-Points-will-Reshape-Geopolitics.pdf\">prepare</a> for <a href=\"https://www.aria.org.uk/media/wxrnowvq/aria-forecasting-climate-tipping-points-programme-thesis.pdf\">sudden</a> change.</p>\n<p>As <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> also notes in his Themis proposal, there is consensus among climate scientists that we must <a href=\"https://www.pnas.org/doi/10.1073/pnas.2301531121\">cooperate in the planetary commons</a> if we are to succeed.\nBut his proposal seems overwhelmingly difficult to evaluate in a <a href=\"https://www.theguardian.com/us-news/2024/oct/01/trump-visits-georgia-denies-climate-crisis-after-hurricane-helene\">political climate</a> that is moving <a href=\"https://www.bbc.co.uk/news/articles/cx253xjnxrmo\">away</a> from global cooperation. There must be a way to try some of these ideas out at a smaller scale, and especially locally in our sleepy University town!</p>\n<h2><a href=\"https://anil.recoil.org/#cooperation-through-sport-and-games\"></a>Cooperation through sport and games</h2>\n<p>One area where nations have remained cooperative through no clear immediate financial gain is that of <a href=\"https://www.bloomsbury.com/uk/sport-in-ancient-times-9780275987398/\">competitive sport</a>. 
We just had the <a href=\"https://www.olympics.com/en/olympic-games/paris-2024\">Paris Olympics</a> with almost every nation in the world competing for no good reason other than a desire to win. And they're not seeking to win money as in most other areas of competition; instead it's just virtual credit in the form of <a href=\"https://www.eurosport.com/olympics/olympic-games-paris-2024/2024/gold-medal-table-per-capita-population_sto20028430/story.shtml\">medal tables</a> that are celebrated from the largest to the <a href=\"https://www.olympics.com/en/news/paris-2024-olympics-nations-won-first-ever-medal-at-the-games\">smallest</a> countries!</p>\n<p>Sporting events such as the Olympics are highly structured events with clear rules dictating almost every aspect. An interesting consequence of decoupling the rules of the games from direct financial incentives is that many sports are not <a href=\"https://en.wikipedia.org/wiki/Zero-sum_game\">zero-sum games</a>. In <a href=\"https://en.wikipedia.org/wiki/Laws_of_rugby_union\">rugby union</a> or <a href=\"https://www.thefa.com/football-rules-governance/lawsandrules\">football</a> for example, the <a href=\"https://pmc.ncbi.nlm.nih.gov/articles/PMC6315358\">winner gains more than the loser loses</a>. While this structure can encourage <a href=\"https://www.responsiblegambling.eu/wp-content/uploads/2016/06/Match-Fixing%E2%80%94The-Biggest-Threat-to-Sport-in-the-21st-Century.pdf\">match-fixing</a> due to the asymmetry, participants also build trust amongst themselves over the years, for example via <a href=\"https://link.springer.com/article/10.1007/s12197-009-9120-4\">promotion through divisions</a>.\n<a href=\"https://en.wikipedia.org/wiki/Game_theory\">Game theorists</a> often note how stable cooperation emerges in <a href=\"https://academics.hamilton.edu/economics/cgeorges/game-theory-files/repeated.pdf\">infinitely repeated</a> games. 
Sports seasons are simply repeated competitions; over time, codes of conduct evolve and become self-policing agreements for mutual benefit (avoiding injuries, preserving dignity in loss, etc). There are clear lessons for the Themis mechanism here, as it also needs to establish long-term cooperation deep into the next century until <a href=\"https://www.nature.com/articles/s41558-018-0091-3\">total CO2 levels decline</a>.</p>\n<p>\n<img alt=\"If the Olympics aren&apos;t for you, perhaps boardgames are\" src=\"https://anil.recoil.org/images/board-game-pd-1.webp\" title=\"If the Olympics aren&apos;t for you, perhaps boardgames are\">\nIf the Olympics aren't for you, perhaps boardgames are</p>\n<p>Away from physical sports, we also see similar scoring dynamics in <a href=\"https://boardgamegeek.com/\">boardgames</a>! There is a whole genre of semi-competitive boardgames such as <a href=\"https://drakesflames.blogspot.com/2012/11/board-game-review-archipelago.html\">Archipelago</a> which are <em>"competitive games that everyone can lose"</em>. This sounds a lot like Themis; we want to be able to stave off emissions disaster, but otherwise be the top dog in our league for every other aspect of our societies! The game rules must be structured so that even selfish players find it in their interest to cooperate to <a href=\"https://boardgamegeek.com/geeklist/71983/competitive-games-where-everyone-can-lose\">avoid losing</a>. In Archipelago, the rule is simple: if instability within the game hits a certain point, all players lose, which forces even the leader to sometimes help the laggard to save themselves.<a href=\"https://anil.recoil.org/#fn-1\">[2]</a></p>\n<h2><a href=\"https://anil.recoil.org/#enter-the-cambridge-green-blue\"></a>Enter the Cambridge Green Blue</h2>\n<p>So how is this relevant to evaluating the global Themis mechanism from earlier? 
Everything global must start locally, so I propose a new semi-competitive league here in Cambridge, with willing Colleges as participants, and with virtual points instead of using real currency. And just like the <a href=\"https://en.wikipedia.org/wiki/Blue_(university_sport)\">two-century-old</a> tradition, we should make this sufficiently competitive to gain a coveted <a href=\"https://www.hawksclub.co.uk/about/history/the-cambridge-blue/\">sporting blue</a>! To give you some context, being really good at <a href=\"https://www.christs.cam.ac.uk/news/70-years-tiddlywinks\">tiddlywinks</a> can gain you a <a href=\"https://www.varsity.co.uk/sport/9629\">quarter blue</a>.</p>\n<p>In the rest of this post, I've written up the structure of this league with <a href=\"https://en.wikipedia.org/wiki/Elinor_Ostrom%23%2522Design_principles_illustrated_by_long-enduring_CPR_%28Common_Pool_Resource%29_institutions%2522\">Ostrom's principles</a> in mind, by treating the CO2 management problem as a <a href=\"https://en.wikipedia.org/wiki/Common-pool_resource\">common pool resource</a>.\nCambridge Colleges have been around for centuries and so naturally appreciate the long perspective required; Pembroke was <a href=\"https://www.pem.cam.ac.uk/college\">founded</a> in 1347. Our collective collegiate goal is to urgently reduce the CO2e emissions that accumulate in the atmosphere and contribute to climate change for hundreds of years. This requires cooperation and learning from each other, but also a certain drive to do better than each other to get to the goal as quickly as we can.</p>\n<h3><a href=\"https://anil.recoil.org/#what-do-we-measure-in-this-league\"></a>What do we measure in this league?</h3>\n<p>The three key sources of carbon emissions this league would initially track are food, heating and travel, noting again that we are only measuring <em>this year's</em> reductions and emissions, not historic or future pledges. 
We need to design specific mechanisms for each of these, but I'll just sketch out what makes measuring each one "interesting".</p>\n<h4><a href=\"https://anil.recoil.org/#food-consumption-and-waste\"></a>Food consumption and waste</h4>\n<p>Students, Fellows, visitors and staff all eat a <em>lot</em> of food in the Colleges from <a href=\"https://ourworldindata.org/food-choice-vs-eating-local\">all over</a> the world. Communal dining is so central to the Cambridge College experience that it is mentioned in many College statutes as part of our charitable purpose.</p>\n<blockquote>\n<p>In furtherance of the College\u2019s purposes, Fellows shall be entitled to dine daily free of charge at common table.\n -- <a href=\"https://www.pem.cam.ac.uk/sites/default/files/downloads/inlinearstatutesordsregs12july2022.pdf\">Pembroke College Statutes</a> presented to Her Majesty in 2009</p>\n</blockquote>\n<p>Since thousands of meals go through a typical College every day, identifying pragmatic sources of emissions reductions is very important. In a recent committee meeting at Pembroke College, I was incredibly pleased to hear that we've reduced <a href=\"https://lordslibrary.parliament.uk/food-waste-in-the-uk/\">food waste</a> from the kitchens down to just one or two meals a day (which, considering the vast number of meals served, is hugely impressive).\nAnd similarly, Darwin College reported on the recent <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">plant based May Ball</a>, which was a rather fine party, and the world did not end due to black tie attendees being unable to find a sausage roll.\nHow can we communicate the lessons learnt from the catering teams here to other Colleges? 
The CGB allows us to rank and categorise these initiatives!</p>\n<p>Research, with much of it conducted here in Cambridge, shows us that key gains in food impacts come from reducing <a href=\"https://www.britishecologicalsociety.org/wp-content/uploads/Ripple-et-al-2014-ruminants.pdf\">ruminant meat consumption</a> and the corresponding damage to <a href=\"https://www.worldwildlife.org/magazine/issues/summer-2018/articles/what-are-the-biggest-drivers-of-tropical-deforestation\">tropical forests</a> full of <a href=\"https://anil.recoil.org/papers/2024-life\">biodiversity</a>.\nImportantly, we're not trying to force every College member to suddenly become vegan, but instead provide sustainable and <a href=\"https://www.bbc.com/future/article/20241011-what-explains-increasing-anxiety-about-ultra-processed-plant-based-foods\">healthy</a> options.\n<a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> and <a href=\"https://en.wikipedia.org/wiki/Theresa_Marteau\">Theresa Marteau</a> have both shown that <a href=\"https://doi.org/10.1038/d41586-019-01662-0\">nudging consumers</a> such as Cambridge students and staff towards less damaging choices by default is entirely practical, without alienating those that insist on their meat'n'twoveg:</p>\n<blockquote>\n<p>A study of over 94000 cafeteria meal choices has found that doubling the vegetarian options from 1-in-4 to 2-in-4 increased the proportion of plant-based purchases by between 40-80% without affecting overall food sales.\n-- <a href=\"https://www.cam.ac.uk/stories/veg-nudge\">Veg nudge</a>. Impact of increasing vegetarian availability on meals (<a href=\"https://doi.org/10.1073/pnas.1907207116\">paper</a> / <a href=\"https://www.nature.com/articles/s43016-020-0132-8\">followup</a>)</p>\n</blockquote>\n<p>The league does need some way to turn these initiatives into a points based system. 
This is where my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>'s recent <a href=\"https://anil.recoil.org/papers/2024-food-life\">research</a> is instructive. He's been working on quantifying the <a href=\"https://anil.recoil.org/papers/2024-life\">biodiversity cost</a> of <a href=\"https://anil.recoil.org/papers/2024-food-life\">food imports</a>, broken down by food type. The CGB food points game could correlate consumption choices with where the food comes from and how much it is wasted, and so we could steadily work across Colleges on reducing our impact year-on-year.</p>\n<p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\"> \n<img alt=\"An excerpt from the paper &apos;Quantifying the impact of the food we eat on species extinctions&apos; (Tom Ball et al, under review)\" src=\"https://anil.recoil.org/images/tball-food-paper-ss-1.webp\" title=\"An excerpt from the paper &apos;Quantifying the impact of the food we eat on species extinctions&apos; (Tom Ball et al, under review)\">\nAn excerpt from the paper 'Quantifying the impact of the food we eat on species extinctions' (Tom Ball et al, under review) </a></p>\n<h4><a href=\"https://anil.recoil.org/#heating-without-fossil-fuels\"></a>Heating without fossil fuels</h4>\n<p>Turning off the natural gas flows in Colleges is a major challenge. We have some of\nthe oldest buildings in the world around here, and much of the infrastructure is\ncorrespondingly aged. Pembroke has just spent a ton of cash on a <a href=\"https://www.cibsejournal.com/uncategorized/fuel-for-thought-cambridge-college-plans-for-heat-pump-transition/\">communal heat pump</a> for our new development in Mill Lane, which got me thinking about how this aspect of the CGB league could be built around such projects. 
The rules and regulations for heat pump installation in the UK are incredibly baroque, as <a href=\"https://ramcq.net/\">Robert McQueen</a> pointed out recently:</p>\n<blockquote>\n<p>I have a neighbour who embarked on a planning application for a heat pump for his terraced house. There is a difference in ridiculous paperwork necessary simply to install <1m from the boundary compared to the presumed consent in permitted development. Of course now they are waiving that requirement but he's stuck half way through the process. I can't even imagine adding listed requirements into that</p>\n<p>[...] <a href=\"https://mhclgmedia.blog.gov.uk/2024/11/21/warm-homes-plan-and-heat-pumps/\">due to be waived</a> for permitted development - whether that tracks through to the full regulations is anyone's guess. They are already bafflingly inconsistent.\n-- <a href=\"https://bsky.app/profile/ramcq.net/post/3lhcdlycth22n\">Robert McQueen, Bluesky</a>, Feb 2025</p>\n</blockquote>\n<p>However, the Cambridge City Council isn't sitting still and has been working with the University on this. 
<span>Ian Leslie</span> pointed me to city-wide explorations into <a href=\"https://www.cambridge.gov.uk/city-centre-heat-network\">district heating</a> networks for Cambridge that include a <a href=\"https://www.cambridge.gov.uk/media/pkjcwy1m/city-centre-heat-network-connection-guidance.pdf\">phase 1 report</a>\nthat plots out what it might look like by using different Colleges as sinks and sources!</p>\n<p><a href=\"https://www.cambridge.gov.uk/media/pkjcwy1m/city-centre-heat-network-connection-guidance.pdf\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/cambridge-district-heat-ss-1.webp\" title=\"\">\n </a></p>\n<p>Darwin College also reports in their <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">2023 sustainability report</a> the progress they've made on establishing heat pumps in the River Cam.</p>\n<blockquote>\n<p>In 2022, in a collaboration with six other riverside Colleges, Mott MacDonald were commissioned to monitor\nwater flow, depth and temperature at four locations on the river and to produce a detailed hydrology study.\nThe report, delivered in 2023, confirms the considerable potential of the river to supply heat for space\nand hot water heating for the adjacent Colleges.\n -- <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">Darwin sustainability report</a>, 2023</p>\n</blockquote>\n<p>And most recently, <a href=\"https://www.kings.cam.ac.uk\">Kings College</a> famously installed <a href=\"https://www.kings.cam.ac.uk/news/2023/kings-unveils-new-solar-panels-restored-chapel-roof\">400 solar panels</a> on their world-famous chapel, despite <a href=\"https://www.kings.cam.ac.uk/news/2023/kings-unveils-new-solar-panels-restored-chapel-roof\">opposition</a> from Historic England. 
This sets a huge precedent for the rest of Cambridge to take similar action, and they deserve recognition for this from the CGB!</p>\n<p>\n<img alt=\"The roof of Kings College chapel. Source: BBC News.\" src=\"https://anil.recoil.org/images/kings-solar-panels.webp\" title=\"The roof of Kings College chapel. Source: BBC News.\">\nThe roof of Kings College chapel. Source: BBC News.</p>\n<p>So this aspect of the CGB league could focus on building spatial connections across Colleges. Perhaps the College that brings the most benefit to its neighbours by contributing the most towards a district heating mechanism could win this round.</p>\n<h4><a href=\"https://anil.recoil.org/#reducing-impact-of-international-travel\"></a>Reducing impact of international travel</h4>\n<p>Finally, lots of the Colleges do facilitate international travel, for a variety of reasons ranging from <a href=\"https://www.pem.cam.ac.uk/international-programmes\">pedagogical</a> to <a href=\"https://www.pem.cam.ac.uk/alumni-development/connect-pembroke/alumni-events\">developmental</a>. The most obvious is in-person interviews, when candidates fly in from all over the world. Since the pandemic, there has been <a href=\"https://oxbridgeapplications.com/blog/cambridge-interviews-online-or-in-person/\">split opinion</a> among Colleges about returning to in-person interviews or not, with Pembroke opting for in-person this year. While there are lots of good reasons to encourage in-person interactions, the carbon cost has been so low down in the discussion points in the meetings I've attended that it might as well not even be a factor. A CGB league might encourage us to tally up the scores across Colleges more systematically to factor these costs into the overall decision-making.</p>\n<p>At the opposite end of the spectrum is international air travel for conferences, which are thankfully quite rare as most of our business is conducted locally. 
We do host events here such as the <a href=\"https://www.sccs-cam.org/\">SCCS</a> student conservation conference that flies in young scholars from all over the world, but this is quite rightly justified as essential, since it brings together underrepresented young students who find tremendous value in meeting each other. I've made more extensive notes on the topic of travel mitigation elsewhere in my note on <a href=\"https://anil.recoil.org/carbon-credits-vs-offsets\">carbon contributions</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#implementing-the-cambridge-green-blue\"></a>Implementing the Cambridge Green Blue</h3>\n<p>I've hopefully convinced you that there are quite a few interesting dimensions around which we could design our semi-competitive Cambridge Green Blue (CGB) league. I've avoided over-specifying the rules at this early juncture, since I want to bring in more people's thoughts and ideas first. However, here's a strawman attempt.</p>\n<blockquote>\n<p>We treat the emission of CO2 into the atmosphere as a shared common pool resource (CPR); i.e. we can collectively only emit a limited amount if we are to avoid the worst effects of climate change. Cooperation on a global CPR should ideally happen on a global basis; however, the current approach is inadequate. Therefore, we must locally initiate mechanisms which will build up into a global framework from the ground up. Cambridge Colleges are institutions for young people who will be greatly affected by climate change, Colleges make decisions with long time horizons, and a body of scholars should represent intellectual leadership in a time of crisis. 
Therefore, Cambridge Colleges should be an ideal proving ground for exploring cooperative frameworks in practice!</p>\n</blockquote>\n<p>The CGB would select its initial College membership and define baseline rules about how to measure emissions collectively, based around the first interest areas of travel, food and heating described above. Members will then write a rule book that follows the Themis mechanism to establish a virtual price for each tonne of emissions, and we will self-report progress monthly, with points assigned to those who beat their emissions-reduction baselines. The league is used to collectively learn from those who are winning, and equalise the playing field in future seasons for the others to catch up.</p>\n<p>Following Ostrom's principles, the league looks like this:</p>\n<ol>\n<li><em>Define group boundaries and the contents of the CPR.</em> The common pool resource we measure is CO2 emissions from the Cambridge Colleges. The goal is to reduce emissions year-on-year, and so "0" is defined as the previous year\u2019s emissions, with any additional emissions reductions resulting in points awarded. The league therefore measures the CPR as "CO2e tonnes avoided" without getting into any historic or future plans, only what is happening this year.</li>\n<li><em>Appropriation and provision of common resources.</em> The Colleges all have initiatives to reduce their CO2e, and have agreed to cooperate towards this common goal. Membership of the league is voluntary, and we make the membership public. We reserve the right to laugh derisively at those Colleges who elect not to participate.</li>\n<li><em>Collective-choice arrangements for resource appropriators to make decisions.</em> The league will maintain a points database tracking emissions across heating, travel and food-related emissions reduction activities. 
The league will not be directly involved in individual College decision making, but we hope to recruit persons from the Colleges who may be involved in those activities in addition to their participation in the league.</li>\n<li><em>Effective monitoring by monitors who are accountable to the appropriators.</em> League members will self-report their emissions reductions monthly, and a collective consensus will be formed on the CO2e measurements across those reductions. The reporters are all part of the Cambridge Colleges, and so have access to internal channels to verify their own claims.</li>\n<li><em>A scale of graduated sanctions for resource appropriators who violate community rules.</em> As a voluntary league, we do not anticipate any incentive to cheat. Sanctions will first be redaction of the offending points from the table, followed by ejection from the league.</li>\n<li><em>Mechanisms of conflict resolution that are cheap and of easy access.</em> The league has monthly checkpoints where participants collectively score their emissions reductions. Disagreements about methodologies will be resolved at these meetings, which also aim to collectively educate each other about the diverse emissions reduction methods available.</li>\n<li><em>Self-determination of the community recognised by higher-level authorities.</em> Cambridge Colleges have committed to various net-zero targets. Therefore, the emissions reductions tracked by this league will eventually be incorporated into some broader net-zero reporting that applies at a national and international level. But for now, we just want to reduce the real amount year-on-year.</li>\n<li><em>Organisation in the form of multiple layers of nested enterprises, with small local CPRs at the base level.</em> Our hope is that the Cambridge Green Blue is the first league of many, with other organisations also following our template. 
To that end, we will make our rule templates available freely as an open-source rulesheet after the first round concludes successfully. When there are multiple organisations running their own leagues (come on Oxford!), we will build up a bigger collective framework for Themis participants, akin to a sporting governing body.</li>\n</ol>\n<p>One very important aspect of this is to apply a respectful "<a href=\"https://en.wikipedia.org/wiki/Sportsmanship\">sportsmanship</a>" rule to the relative ranking of Colleges, and not engage in <a href=\"https://www.varsity.co.uk/news/28426\">shaming</a> wars. There is a wide wealth <a href=\"https://www.varsity.co.uk/news/14626\">disparity</a> among the Cambridge Colleges, and we could adjust for this using the per-capita rules from the Themis mechanism. Ultimately, it's also about celebrating and learning from every participant and using the competition to spur us on, build each other up, and have fun doing so. We're all in this together.</p>\n<h2><a href=\"https://anil.recoil.org/#err-are-you-serious-about-this-anil\"></a>Err, are you serious about this, Anil?</h2>\n<p>Yeah, I think this is worth a try! I have recently joined the University's <a href=\"https://www.governance.cam.ac.uk/committees/essc/Pages/default.aspx\">Environmental Sustainability Strategy</a> committee, and I've found it extremely difficult to educate myself about the local initiatives going on (not because of any individual's fault, but because there are 31 separate constituent Colleges, the University, and townspeople sharing a fairly small area). If nothing else, this initiative will let us collectively bring together a wiki of all the positive actions happening across Cambridge. 
If it succeeds though, I'd like to spread the next iteration of the league to other Universities to run their own (I'm looking at you, Oxford), and see if we can turn this into a distributed game.</p>\n<p>I was reading <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>'s book <a href=\"https://press.uchicago.edu/ucp/books/book/chicago/W/bo13823467.html\">Wild Hope</a> over the weekend, and his conclusion was that we must not lose hope in our quest for a biodiverse, equitable world. And given the chaotic start to 2025, I can't think of a better place to start something new than within Cambridge, with our collegiate structure already providing a ready-made framework.</p>\n<p>So what next? If you're interested in helping <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> and me organise this, get in touch with either of us! I'm on <a href=\"https://www.hr.admin.cam.ac.uk/policies-procedures/flexible-working-policy/supporting-guidance/sabbatical-leave\">academic sabbatical</a> for a year from the summer, so I'll have loads of time. I'll edit this post with a list of the first Colleges that have been in touch. 
We'll likely organise a pub get-together in early March (exact date to follow) to brainstorm about this with anyone interested.</p>\n<p> <em>This post is the result of many conversations around Cambridgeshire over the past year, ranging from a balmy summer dinner in Ely with <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> and <a href=\"https://en.wikipedia.org/wiki/Theresa_Marteau\">Theresa Marteau</a>, to chilly autumn cups of tea in my Pembroke office with <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> and <a href=\"http://carlhenrik.com/\">Carl Henrik Ek</a>, to misty morning coffees at <a href=\"https://www.visitcambridge.org/place/pages-cambridge/\">Pages</a> with <a href=\"https://en.wikipedia.org/wiki/Melissa_Leach\">Melissa Leach</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a> or at <a href=\"https://www.espressolane.co.uk/\">Espresso Lane</a> with <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, to cosy pubs with <span>Ian Leslie</span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, to College dinners with <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, and <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a>/<a href=\"https://www.zoo.cam.ac.uk/research/groups/conservation-science\">CSG</a> discussions with <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a>, <a href=\"https://biomin.esc.cam.ac.uk/people/2023-Orlando-Timmerman/\">Orlando Timmerman</a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>, <a 
href=\"https://www.cl.cam.ac.uk/~arb33/\">Alastair Beresford</a>, <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> and <a href=\"https://github.com/mor1\">Richard Mortier</a>. Many thanks to them for corrections and feedback, and any remaining errors are my own. Changelog: 12th Feb added note on sportsmanship and Carl's NeurIPS@Cam talk. 6th May 2025: added <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a>'s published <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">article</a> on Themis.</em> </p>\n\n<ol>\n<li>\n<p>I promise I'm not a JMK shill, despite being a <a href=\"https://www.cshss.cam.ac.uk/research-info/j-m-keynes-fellowship-fund/j-m-keynes-fellows\">JMK Fellow</a>.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>The keen boardgame player will probably observe that there's always one player who decides to cause trouble just for fun, making everyone lose. This can be dealt with by social means.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p><a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> recently gave a great <a href=\"https://watch.eeg.cl.cam.ac.uk/w/qEsMt2Ayk37SaKgxrfwoBt\">talk</a> in our group about his thoughts on <a href=\"https://mlg.eng.cam.ac.uk/carl/words/mechanisms.pdf\">mechanisms against climate change</a>. He persuasively argued that the <a href=\"https://unfccc.int/process-and-meetings/the-paris-agreement\">Paris Agreement</a> was doing more harm than good by giving the <em>illusion</em> of being a concrete agreement, but is in reality a huge distraction. 
Our actual <a href=\"https://ourworldindata.org/co2-emissions\">emissions</a> have increased since the Paris Agreement was signed!</p>\n<p>Carl <a href=\"https://www.youtube.com/watch?v=naFaQsFxs1g\">argues</a> that the climate system ultimately only responds to collective actions, and without a global cooperative incentive each nation will spring back to its own isolated short-term incentives that lead to an increase in fossil fuel burning. He has just published the "<a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\">Themis Mechanism</a>" as a simple alternative for equitable global emission reduction (<a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis.pdf\">long form</a>). <em>(6th May 2025: See a new <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">article</a> on Themis as well)</em></p>\n<p>This got me brainstorming with Carl about how to test his theories out, and we came up with an idea that is either terrible or awesome; please read on and judge appropriately. I think we should take advantage of Cambridge's unique structure to trial the Themis mechanism via a new <strong>competitive decarbonisation sporting league among Colleges that I dub the "Cambridge Green Blue"</strong>. Given the Chancellor's recent unveiling of an <a href=\"https://www.theguardian.com/business/2025/jan/28/reeves-plans-to-create-silicon-valley-between-oxford-and-cambridge\">innovation corridor</a> between Oxford and Cambridge, the timing could not be better for an initiative like this. <em>(TL;DR sign up at the bottom of this post if you'd like to participate)</em></p>\n<h2><a href=\"https://anil.recoil.org/#the-basics-of-the-themis-mechanism\"></a>The basics of the Themis mechanism</h2>\n<p>First, let's understand what Carl is <a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis.pdf\">proposing</a>, which is built on three foundations:</p>\n<blockquote>\n<ul>\n<li>Our atmosphere is a shared resource, a commons. 
Fossil fuel users benefit fully from fuel\nconsumption, while the CO2 cost is spread globally. This dilution effect makes continued\nuse rational for individuals but collectively disastrous. [...] To prevent this,\nwe must cooperate to guarantee positive climate results.</li>\n<li>The root cause of climate change is the failure to account for the true cost of emissions.\nBy treating the atmosphere as a free resource, we encourage overexploitation. Themis\ncorrects this unpriced externality by pricing greenhouse gas emissions.</li>\n<li>Effective cooperation requires a fair guiding principle. Themis upholds equity: that our\natmospheric resources should be shared equally between all humans.\n -- <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a>, <a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\">The Themis Mechanism</a> </li>\n</ul>\n</blockquote>\n<p>As I <a href=\"https://anil.recoil.org/notes/carbon-credits-vs-offsets\">noted last week</a>, most tech companies regularly <a href=\"https://www.theverge.com/2022/8/1/23287351/amazon-climate-change-carbon-emissions-worse-2021\">break</a> future carbon pledges due to competitive pressure. So it's good to see that Themis requires only immediate commitments rather than <a href=\"https://climate.ec.europa.eu/eu-action/climate-strategies-targets/2050-long-term-strategy_en\">long-term pledges</a> which are impossible to police. 
Instead of forcing <a href=\"https://climateactiontracker.org/publications/the-climate-crisis-worsens-the-warming-outlook-stagnates/\">unwilling</a> participants to join, Themis is a coalition in which partners check on each other, learn by doing, and build up mutual trust.</p>\n<p><a href=\"https://mlg.eng.cam.ac.uk/carl/climate/themis0.pdf\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/themis-ss-1.webp\" title=\"\">\n </a></p>\n<p>The core scheme itself is based on a value <em>Py</em>, which is the price of emitting a single tonne of CO2 into the atmosphere in year <em>y</em>. Here's how it works:</p>\n<ol>\n<li>Every year <em>y</em> there is a price <em>Py</em> that all nations agree to.</li>\n<li>At year end, each member pays <em>Py</em> times their emissions into a common pool.</li>\n<li>The pool is immediately redistributed to members in proportion to their population.</li>\n<li>Each member votes on <em>Py+1</em> and the median result decides next year's price.</li>\n</ol>\n<p>This mechanism only depends on per-capita emissions for one year, and not on\nany <a href=\"https://www.carbonbrief.org/analysis-95-of-countries-miss-un-deadline-to-submit-2035-climate-pledges/\">future pledges</a> or <a href=\"http://pdf.wri.org/navigating_numbers_chapter6.pdf\">historic emissions</a>. If a country has above-average per-capita emissions, then\nit pays into the common pool. If it is below average, then it\nbenefits from payments from the pool. 
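</p>\n<p>To make the flow of money concrete, here is a toy single-year settlement in Python (my own sketch, not from the Themis paper; the member names, emissions and populations are invented for illustration):</p>

```python
# Toy model of one Themis settlement round (steps 1-4 above):
# every member pays P_y * emissions into a common pool, and the pool
# is redistributed in proportion to population.
from statistics import median

def settle_year(price, members):
    """members maps name -> (emissions_tonnes, population).
    Returns name -> net transfer (negative means a net payment in)."""
    pool = sum(price * emissions for emissions, _ in members.values())
    total_pop = sum(pop for _, pop in members.values())
    return {
        name: pool * pop / total_pop - price * emissions
        for name, (emissions, pop) in members.items()
    }

# Two equally sized members; A emits ten times more per capita than B.
members = {"A": (100.0, 10), "B": (10.0, 10)}
net = settle_year(50.0, members)   # agreed price: 50 per tonne
assert net["A"] == -2250.0 and net["B"] == 2250.0
assert sum(net.values()) == 0.0    # the pool is fully redistributed

# Step 4: each member votes on next year's price; the median wins.
next_price = median([40.0, 55.0, 80.0])
assert next_price == 55.0
```

<p>The transfers always sum to zero, so the pool is purely redistributive and needs no external funding. 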
The system permits co-existence with any other\ncarbon reduction efforts, and works with a non-exhaustive pool of nations participating.</p>\n<h2><a href=\"https://anil.recoil.org/#will-themis-be-more-effective-than-paris\"></a>Will Themis be more effective than Paris?</h2>\n<p>The main reason Themis might fail is that participating in the league <a href=\"https://www.ft.com/content/921381a8-48a4-4bb9-9196-b1d49f871bb7\">disadvantages</a> the participants vs those just continuing with business-as-usual. The economic theory behind Themis is similar to a <a href=\"https://en.wikipedia.org/wiki/Pigouvian_tax\">Pigouvian tax</a>,\nwhich dates back a century: the Cambridge economist <a href=\"https://en.wikipedia.org/wiki/Arthur_Cecil_Pigou\">Arthur Pigou</a> suggested in 1920 that a tax equal to the external cost of pollution could align private costs with social costs. This idea also works for <a href=\"https://www.ecb.europa.eu/pub/pdf/scpwps/ecb.wp2812~81379c0224.en.pdf\">discounting</a> <a href=\"https://www.nature.com/articles/s41558-023-01680-x\">future</a> actions, and is the basis for some of our own work on <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">pricing impermanent but delayed emissions</a>.</p>\n<p>From an economic theory perspective, Pigou and the other prominent Cambridge economist of the time, <a href=\"https://en.wikipedia.org/wiki/John_Maynard_Keynes\">JM Keynes</a><a href=\"https://anil.recoil.org/#fn-2\">[1]</a>, had deep <a href=\"https://www.tandfonline.com/doi/pdf/10.1080/10370196.1994.11733148\">disagreements</a>. Keynes argued for higher interest rates to boost aggregate growth, while Pigou wanted to give people an increase in real wealth relative to prices. 
Both of their approaches ultimately <a href=\"https://en.wikipedia.org/wiki/Post-war_displacement_of_Keynesianism\">lost out</a> by the 1980s as free market economics ruled supreme instead, leading to the current <em>"grow, emit and die"</em> competitive spiral of doom we find ourselves in. However, Pigou's theories are clearly ones we should <a href=\"https://link.springer.com/article/10.1007/s10797-020-09653-y\">revisit today</a> in light of Themis: by raising the cost of emitting via taxes (or Themis contributions) we can incentivise countries to reduce pollution or decarbonise instead of treating the atmosphere as a free sink to dump into.</p>\n<p>A modern counterpoint to the "lack of competitiveness" argument from participating in an emissions reduction competition is the increasing evidence of <a href=\"https://www.nhm.ac.uk/discover/news/2025/january/ocean-temperature-rise-accelerating-greenhouse-gas-levels-rising.html\">runaway</a> <a href=\"https://anil.recoil.org/notes/rs-ecorisk-day1\">tipping points</a> that might suddenly need everyone to decarbonise really quickly. <a href=\"https://www.katharinehayhoe.com/\">Katharine Hayhoe</a>, the chief scientist at TNC, observes that we <a href=\"https://www.motherjones.com/environment/2022/06/climate-scientist-katharine-hayhoe-crisis-adaptation-global-warming-impact/\">can't adapt our way out of this climate crisis</a> due to the sheer magnitude of change that will occur if we continue to emit.</p>\n<blockquote>\n<p>Our infrastructure, worth trillions of dollars, built over decades, was built for a planet that no longer exists [...]\n - Katharine Hayhoe, <a href=\"https://www.theguardian.com/environment/2022/jun/01/we-cannot-adapt-our-way-out-of-climate-crisis-warns-leading-scientist\">The Guardian</a> 2022</p>\n</blockquote>\n<p>This is a pragmatic point in favour of countries joining Themis, since participation strengthens their economic infrastructure towards decarbonisation. 
By joining, countries can trade some short-term losses in their economy for being well hedged against a "sudden" black swan <a href=\"https://en.wikipedia.org/wiki/Tipping_points_in_the_climate_system\">climate tipping point</a> that requires rapid change in their societal infrastructure, while also gaining a long-term advantage heading into the inevitable <a href=\"https://cleantechnica.com/2024/09/12/virtual-power-plants-may-hold-the-key-to-an-all-electric-future/\">electric future</a>. So perhaps the fact that things are now much worse since Paris could force the emergence of cooperative groups who wish to <a href=\"https://www.e3g.org/wp-content/uploads/E3G-Report-Living-on-the-Edge-How-Climate-Tipping-Points-will-Reshape-Geopolitics.pdf\">prepare</a> for <a href=\"https://www.aria.org.uk/media/wxrnowvq/aria-forecasting-climate-tipping-points-programme-thesis.pdf\">sudden</a> change.</p>\n<p>As <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> also notes in his Themis proposal, there is consensus among climate scientists that we must <a href=\"https://www.pnas.org/doi/10.1073/pnas.2301531121\">cooperate in the planetary commons</a> if we are to succeed.\nBut his proposal seems overwhelmingly difficult to evaluate in a <a href=\"https://www.theguardian.com/us-news/2024/oct/01/trump-visits-georgia-denies-climate-crisis-after-hurricane-helene\">political climate</a> that is moving <a href=\"https://www.bbc.co.uk/news/articles/cx253xjnxrmo\">away</a> from global cooperation. There must be a way to try some of these ideas out at a smaller scale, and especially locally in our sleepy University town!</p>\n<h2><a href=\"https://anil.recoil.org/#cooperation-through-sport-and-games\"></a>Cooperation through sport and games</h2>\n<p>One area where nations have remained cooperative through no clear immediate financial gain is that of <a href=\"https://www.bloomsbury.com/uk/sport-in-ancient-times-9780275987398/\">competitive sport</a>. 
We just had the <a href=\"https://www.olympics.com/en/olympic-games/paris-2024\">Paris Olympics</a> with almost every nation in the world competing for no good reason other than a desire to win. And they're not seeking to win money as in most other areas of competition; instead it's just virtual credit in the form of <a href=\"https://www.eurosport.com/olympics/olympic-games-paris-2024/2024/gold-medal-table-per-capita-population_sto20028430/story.shtml\">medal tables</a> that are celebrated from the largest to the <a href=\"https://www.olympics.com/en/news/paris-2024-olympics-nations-won-first-ever-medal-at-the-games\">smallest</a> countries!</p>\n<p>Sporting events such as the Olympics are highly structured events with clear rules dictating almost every aspect. An interesting consequence of decoupling the rules of the games from direct financial incentives is that many sports are not <a href=\"https://en.wikipedia.org/wiki/Zero-sum_game\">zero-sum games</a>. In <a href=\"https://en.wikipedia.org/wiki/Laws_of_rugby_union\">rugby union</a> or <a href=\"https://www.thefa.com/football-rules-governance/lawsandrules\">football</a> for example, the <a href=\"https://pmc.ncbi.nlm.nih.gov/articles/PMC6315358\">winner gains more than the loser loses</a>. While this structure can encourage <a href=\"https://www.responsiblegambling.eu/wp-content/uploads/2016/06/Match-Fixing%E2%80%94The-Biggest-Threat-to-Sport-in-the-21st-Century.pdf\">match-fixing</a> due to the asymmetry, participants also build trust amongst themselves over the years, for example via <a href=\"https://link.springer.com/article/10.1007/s12197-009-9120-4\">promotion through divisions</a>.\n<a href=\"https://en.wikipedia.org/wiki/Game_theory\">Game theorists</a> often note how stable cooperation emerges in <a href=\"https://academics.hamilton.edu/economics/cgeorges/game-theory-files/repeated.pdf\">infinitely repeated</a> games. 
Sports seasons are simply repeated competitions; over time, codes of conduct evolve and become self-policing agreements for mutual benefit (avoiding injuries, preserving dignity in loss, etc). There are clear lessons for the Themis mechanism here, as it also needs to establish long-term cooperation deep into the next century until <a href=\"https://www.nature.com/articles/s41558-018-0091-3\">total CO2 levels decline</a>.</p>\n<p>\n<img alt=\"If the Olympics aren&apos;t for you, perhaps boardgames are\" src=\"https://anil.recoil.org/images/board-game-pd-1.webp\" title=\"If the Olympics aren&apos;t for you, perhaps boardgames are\">\nIf the Olympics aren't for you, perhaps boardgames are</p>\n<p>Away from physical sports, we also see similar scoring dynamics in <a href=\"https://boardgamegeek.com/\">boardgames</a>! There is a whole genre of semi-competitive boardgames such as <a href=\"https://drakesflames.blogspot.com/2012/11/board-game-review-archipelago.html\">Archipelago</a> which are <em>"competitive games that everyone can lose"</em>. This sounds a lot like Themis; we want to be able to stave off emissions disaster, but otherwise be the top dog in our league for every other aspect of our societies! The game rules must be structured so that even selfish players find it in their interest to cooperate to <a href=\"https://boardgamegeek.com/geeklist/71983/competitive-games-where-everyone-can-lose\">avoid losing</a>. In Archipelago, the rule is simple: if instability within the game hits a certain point, all players lose, which forces even the leader to sometimes help the laggard to save themselves.<a href=\"https://anil.recoil.org/#fn-1\">[2]</a></p>\n<h2><a href=\"https://anil.recoil.org/#enter-the-cambridge-green-blue\"></a>Enter the Cambridge Green Blue</h2>\n<p>So how is this relevant to evaluating the global Themis mechanism from earlier? 
Everything global must start locally, so I propose a new semi-competitive league here in Cambridge, with willing Colleges as participants, and with virtual points instead of using real currency. And just like the <a href=\"https://en.wikipedia.org/wiki/Blue_(university_sport)\">two-century-old</a> tradition, we should make this sufficiently competitive to gain a coveted <a href=\"https://www.hawksclub.co.uk/about/history/the-cambridge-blue/\">sporting blue</a>! To give you some context, being really good at <a href=\"https://www.christs.cam.ac.uk/news/70-years-tiddlywinks\">tiddlywinks</a> can gain you a <a href=\"https://www.varsity.co.uk/sport/9629\">quarter blue</a>.</p>\n<p>In the rest of this post, I've written up the structure of this league with <a href=\"https://en.wikipedia.org/wiki/Elinor_Ostrom%23%2522Design_principles_illustrated_by_long-enduring_CPR_%28Common_Pool_Resource%29_institutions%2522\">Ostrom's principles</a> in mind, by treating the CO2 management problem as a <a href=\"https://en.wikipedia.org/wiki/Common-pool_resource\">common pool resource</a>.\nCambridge Colleges have been around for centuries and so naturally appreciate the long perspective required; Pembroke was <a href=\"https://www.pem.cam.ac.uk/college\">founded</a> in 1347. Our collective collegiate goal is to urgently reduce the CO2e emissions that accumulate in the atmosphere and contribute to climate change for hundreds of years. This requires cooperation and learning from each other, but also a certain drive to do better than each other to get to the goal as quickly as we can.</p>\n<h3><a href=\"https://anil.recoil.org/#what-do-we-measure-in-this-league\"></a>What do we measure in this league?</h3>\n<p>The league would initially track three key sources of carbon emissions: food, heating and travel, noting again that we are only measuring <em>this year's</em> reductions and emissions, not historic or future pledges. 
We need to design specific mechanisms for each of these, but I'll just sketch out what makes measuring each one "interesting".</p>\n<h4><a href=\"https://anil.recoil.org/#food-consumption-and-waste\"></a>Food consumption and waste</h4>\n<p>Students, Fellows, visitors and staff all eat a <em>lot</em> of food in the Colleges from <a href=\"https://ourworldindata.org/food-choice-vs-eating-local\">all over</a> the world. Communal dining is so central to the Cambridge College experience that it is mentioned in many College statutes as part of our charitable purpose.</p>\n<blockquote>\n<p>In furtherance of the College\u2019s purposes, Fellows shall be entitled to dine daily free of charge at common table.\n -- <a href=\"https://www.pem.cam.ac.uk/sites/default/files/downloads/inlinearstatutesordsregs12july2022.pdf\">Pembroke College Statutes</a> presented to Her Majesty in 2009</p>\n</blockquote>\n<p>Since thousands of meals go through a typical College every day, identifying pragmatic sources of emissions reductions is very important. In a recent committee meeting at Pembroke College, I was incredibly pleased to hear that we've reduced <a href=\"https://lordslibrary.parliament.uk/food-waste-in-the-uk/\">food waste</a> from the kitchens down to just one or two meals a day (which, considering the vast number of meals served, is hugely impressive).\nAnd similarly, Darwin College reported on the recent <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">plant-based May Ball</a> which was a rather fine party, and the world did not end due to black-tie attendees being unable to find a sausage roll.\nHow can we communicate the lessons learnt from the catering teams here to other Colleges? 
The CGB allows us to rank and categorise these initiatives!</p>\n<p>Research, with much of it conducted here in Cambridge, shows us that key gains in food impacts come from reducing <a href=\"https://www.britishecologicalsociety.org/wp-content/uploads/Ripple-et-al-2014-ruminants.pdf\">ruminant meat consumption</a> and the corresponding damage to <a href=\"https://www.worldwildlife.org/magazine/issues/summer-2018/articles/what-are-the-biggest-drivers-of-tropical-deforestation\">tropical forests</a> full of <a href=\"https://anil.recoil.org/papers/2024-life\">biodiversity</a>.\nImportantly, we're not trying to force every College member to suddenly become vegan, but instead provide sustainable and <a href=\"https://www.bbc.com/future/article/20241011-what-explains-increasing-anxiety-about-ultra-processed-plant-based-foods\">healthy</a> options.\n<a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> and <a href=\"https://en.wikipedia.org/wiki/Theresa_Marteau\">Theresa Marteau</a> have both shown that <a href=\"https://doi.org/10.1038/d41586-019-01662-0\">nudging consumers</a> such as Cambridge students and staff towards less damaging choices by default is entirely practical, without alienating those that insist on their meat'n'twoveg:</p>\n<blockquote>\n<p>A study of over 94000 cafeteria meal choices has found that doubling the vegetarian options from 1-in-4 to 2-in-4 increased the proportion of plant-based purchases by between 40-80% without affecting overall food sales.\n-- <a href=\"https://www.cam.ac.uk/stories/veg-nudge\">Veg nudge</a>. Impact of increasing vegetarian availability on meals (<a href=\"https://doi.org/10.1073/pnas.1907207116\">paper</a> / <a href=\"https://www.nature.com/articles/s43016-020-0132-8\">followup</a>)</p>\n</blockquote>\n<p>The league does need some way to turn these initiatives into a points based system. 
This is where my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>'s recent <a href=\"https://anil.recoil.org/papers/2024-food-life\">research</a> is instructive. He's been working on quantifying the <a href=\"https://anil.recoil.org/papers/2024-life\">biodiversity cost</a> of <a href=\"https://anil.recoil.org/papers/2024-food-life\">food imports</a>, broken up by the food type. The CGB food points game could correlate consumption choices with where the food comes from and how much it is wasted, and so we could steadily work across Colleges on reducing our impact year-on-year.</p>\n<p><a href=\"https://www.cambridge.org/engage/coe/article-details/67a21eac81d2151a0225692b\"> \n<img alt=\"An excerpt from the paper &apos;Quantifying the impact of the food we eat on species extinctions&apos; (Tom Ball et al, under review)\" src=\"https://anil.recoil.org/images/tball-food-paper-ss-1.webp\" title=\"An excerpt from the paper &apos;Quantifying the impact of the food we eat on species extinctions&apos; (Tom Ball et al, under review)\">\nAn excerpt from the paper 'Quantifying the impact of the food we eat on species extinctions' (Tom Ball et al, under review) </a></p>\n<h4><a href=\"https://anil.recoil.org/#heating-without-fossil-fuels\"></a>Heating without fossil fuels</h4>\n<p>Turning off the natural gas flows in Colleges is a major challenge. We have some of\nthe oldest buildings in the world around here, and much of the infrastructure is\ncorrespondingly aged. Pembroke has just spent a ton of cash on a <a href=\"https://www.cibsejournal.com/uncategorized/fuel-for-thought-cambridge-college-plans-for-heat-pump-transition/\">communal heat pump</a> for our new development in Mill Lane, which got me thinking about how this aspect of the CGB league could be based around this. 
The rules and regulations for heat pump installation in the UK are incredibly baroque, as <a href=\"https://ramcq.net/\">Robert McQueen</a> pointed out recently:</p>\n<blockquote>\n<p>I have a neighbour who embarked on a planning application for a heat pump for his terraced house. There is a difference in ridiculous paperwork necessary simply to install <1m from the boundary compared to the presumed consent in permitted development. Of course now they are waiving that requirement but he's stuck half way through the process. I can't even imagine adding listed requirements into that</p>\n<p>[...] <a href=\"https://mhclgmedia.blog.gov.uk/2024/11/21/warm-homes-plan-and-heat-pumps/\">due to be waived</a> for permitted development - whether that tracks through to the full regulations is anyone's guess. They are already bafflingly inconsistent.\n-- <a href=\"https://bsky.app/profile/ramcq.net/post/3lhcdlycth22n\">Robert McQueen, Bluesky</a>, Feb 2025</p>\n</blockquote>\n<p>However, the Cambridge City Council isn't sitting still and has been working with the University on this. 
<span>Ian Leslie</span> pointed me to city-wide explorations into <a href=\"https://www.cambridge.gov.uk/city-centre-heat-network\">district heating</a> networks for Cambridge that include a <a href=\"https://www.cambridge.gov.uk/media/pkjcwy1m/city-centre-heat-network-connection-guidance.pdf\">phase 1 report</a>\nthat plots out what it might look like by using different Colleges as sinks and sources!</p>\n<p><a href=\"https://www.cambridge.gov.uk/media/pkjcwy1m/city-centre-heat-network-connection-guidance.pdf\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/cambridge-district-heat-ss-1.webp\" title=\"\">\n </a></p>\n<p>Darwin College also reports in their <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">2023 sustainability report</a> the progress they've made on establishing heat pumps in the River Cam.</p>\n<blockquote>\n<p>In 2022, in a collaboration with six other riverside Colleges, Mott MacDonald were commissioned to monitor\nwater flow, depth and temperature at four locations on the river and to produce a detailed hydrology study.\nThe report, delivered in 2023, confirms the considerable potential of the river to supply heat for space\nand hot water heating for the adjacent Colleges.\n -- <a href=\"https://www.darwin.cam.ac.uk/wp-content/uploads/2024/02/Compressed-2023-Sustainability-Progress-Report-compressed-1mb.pdf\">Darwin sustainability report</a>, 2023</p>\n</blockquote>\n<p>And most recently, <a href=\"https://www.kings.cam.ac.uk\">Kings College</a> famously installed <a href=\"https://www.kings.cam.ac.uk/news/2023/kings-unveils-new-solar-panels-restored-chapel-roof\">400 solar panels</a> on their world-famous chapel, despite <a href=\"https://www.kings.cam.ac.uk/news/2023/kings-unveils-new-solar-panels-restored-chapel-roof\">opposition</a> from Historic England. 
This sets a huge precedent for the rest of Cambridge to take similar action, and they deserve recognition for this from the CGB!</p>\n<p>\n<img alt=\"The roof of Kings College chapel. Source: BBC News.\" src=\"https://anil.recoil.org/images/kings-solar-panels.webp\" title=\"The roof of Kings College chapel. Source: BBC News.\">\nThe roof of Kings College chapel. Source: BBC News.</p>\n<p>So this aspect of the CGB league could focus on building spatial connections across Colleges. Perhaps the College that brings the most benefit to its neighbours by contributing the most towards a district heating mechanism could win this round.</p>\n<h4><a href=\"https://anil.recoil.org/#reducing-impact-of-international-travel\"></a>Reducing impact of international travel</h4>\n<p>Finally, lots of the Colleges do facilitate international travel, for a variety of reasons ranging from <a href=\"https://www.pem.cam.ac.uk/international-programmes\">pedagogical</a> to <a href=\"https://www.pem.cam.ac.uk/alumni-development/connect-pembroke/alumni-events\">developmental</a>. The most obvious one is when conducting in-person interviews, when candidates fly in from all over the world. Since the pandemic, there has been <a href=\"https://oxbridgeapplications.com/blog/cambridge-interviews-online-or-in-person/\">split opinion</a> among Colleges about returning to in-person interviews or not, with Pembroke opting for in-person this year. While there are lots of good reasons to encourage in-person interactions, the carbon cost has been so low down in the discussion points in the meetings I've attended that it might as well not even be a factor. A CGB league might encourage us to tally up the scores across Colleges more systematically and factor these costs into the overall decision-making.</p>\n<p>At the opposite end of the spectrum is international air travel for conferences, which are thankfully quite rare as most of our business is conducted locally. 
We do host events here such as the <a href=\"https://www.sccs-cam.org/\">SCCS</a> student conservation conference that flies in young scholars from all over the world, but this is quite rightly justified as essential, since it brings together underrepresented young students who find tremendous value in meeting each other. I've made more extensive notes on the topic of travel mitigation elsewhere in my note on <a href=\"https://anil.recoil.org/carbon-credits-vs-offsets\">carbon contributions</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#implementing-the-cambridge-green-blue\"></a>Implementing the Cambridge Green Blue</h3>\n<p>I've hopefully convinced you that there are quite a few interesting dimensions around which we could design our semi-competitive Cambridge Green Blue (CGB) league. I've avoided over-specifying the rules at this early juncture, since I want to bring in more people's thoughts and ideas first. However, here's a strawman attempt.</p>\n<blockquote>\n<p>We treat the emission of CO2 into the atmosphere as a shared common pool resource (CPR); i.e. we can collectively only emit a limited amount if we are to avoid the worst effects of climate change. Cooperation on a global CPR should ideally happen on a global basis, but the current approach is inadequate. Therefore, we must locally initiate mechanisms which will build up into a global framework from the ground up. Cambridge Colleges are institutions for young people who will be greatly affected by climate change; Colleges make decisions with long time horizons; and a body of scholars should represent intellectual leadership in a time of crisis. 
Therefore, Cambridge Colleges should be an ideal proving ground for exploring cooperative frameworks in practice!</p>\n</blockquote>\n<p>The CGB would select its initial College membership and define baseline rules about how to measure emissions collectively, based around the first interest areas of travel, food and heating described above. Members will then write a rule book that follows the Themis mechanism to establish a virtual price for each tonne of emissions, and we will self-report progress monthly with points assigned to those beating their emissions-reduction baselines. The league is used to collectively learn from those who are winning, and equalise the playing field in future seasons for the others to catch up.</p>\n<p>Following Ostrom's principles, the league looks like this:</p>\n<ol>\n<li><em>Define group boundaries and the contents of the CPR.</em> The common pool resource we measure is CO2 emissions from the Cambridge Colleges. The goal is to reduce emissions year-on-year, and so "0" is defined as the previous year\u2019s emissions, with any additional emissions reductions resulting in points awarded. The league therefore measures the CPR as "CO2e tonnes avoided" without getting into any historic or future plans, only what is happening this year.</li>\n<li><em>Appropriation and provision of common resources.</em> The Colleges all have initiatives to reduce their CO2e, and have agreed to cooperate towards this common goal. Membership of the league is voluntary, and we make the membership public. We reserve the right to laugh derisively at those Colleges who elect not to participate.</li>\n<li><em>Collective-choice arrangements for resource appropriators to make decisions.</em> The league will maintain a points database tracking emissions across heating, travel and food-related emissions reduction activities. 
The league will not be directly involved in individual College decision-making, but we hope to recruit people from the Colleges who may be involved in those activities in addition to their participation in the league.</li>\n<li><em>Effective monitoring by monitors who are accountable to the appropriators.</em> League members will self-report their emissions reductions monthly, and there will be a collective consensus formed on the CO2e measurements across the emissions reductions. The reporters are all part of the Cambridge Colleges, and so have access to internal channels to verify their own claims.</li>\n<li><em>A scale of graduated sanctions for resource appropriators who violate community rules.</em> As a voluntary league, we do not anticipate any incentive to cheat. Sanctions will first be redaction of those points from the table, followed by ejection from the league.</li>\n<li><em>Mechanisms of conflict resolution that are cheap and of easy access.</em> The league has monthly checkpoints where participants collectively score their emissions reductions. Disagreements about methodologies will be resolved at these meetings, which also aim to collectively educate each other about the diverse emissions reduction methods available.</li>\n<li><em>Self-determination of the community recognised by higher-level authorities.</em> Cambridge Colleges have committed to various net-zero targets. Therefore, the emissions reductions tracked by this league will eventually be incorporated into some broader net-zero reporting that applies at a national and international level. But for now, we just want to reduce the real amount year-on-year.</li>\n<li><em>Organisation in the form of multiple layers of nested enterprises, with small local CPRs at the base level.</em> Our hope is that the Cambridge Green Blue is the first league of many, with other organisations also following our template. 
To that end, we will make our rules templates available freely as an open-source rulesheet after the first round concludes successfully. When there are multiple organisations running their own leagues (come on Oxford!), we will build up a bigger collective framework for Themis participants, akin to a sporting governing body.</li>\n</ol>\n<p>One very important aspect of this is to adopt a respectful "<a href=\"https://en.wikipedia.org/wiki/Sportsmanship\">sportsmanship</a>" rule to the relative ranking of Colleges, and not engage in <a href=\"https://www.varsity.co.uk/news/28426\">shaming</a> wars. There is a wide wealth <a href=\"https://www.varsity.co.uk/news/14626\">disparity</a> among the Cambridge Colleges, and we could adjust for this using the per-capita rules from the Themis mechanism. Ultimately, it's also about celebrating and learning from every participant and using the competition to spur us on, build each other up, and have fun doing so. We're all in this together.</p>\n<h2><a href=\"https://anil.recoil.org/#err-are-you-serious-about-this-anil\"></a>Err, are you serious about this Anil?</h2>\n<p>Yeah, I think this is worth a try! I have recently joined the University's <a href=\"https://www.governance.cam.ac.uk/committees/essc/Pages/default.aspx\">Environmental Sustainability Strategy</a> committee, and I've found it extremely difficult to educate myself about the local initiatives going on (not because of any individual's fault, but because there are 31 separate constituent Colleges and University and townspeople sharing a fairly small area). If nothing else, this initiative will let us collectively bring together a wiki of all the positive actions happening across Cambridge. 
If it succeeds though, I'd like to spread the next iteration of the league to other Universities to run their own (I'm looking at you, Oxford), and see if we can turn this into a distributed game.</p>\n<p>I was reading <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>'s book <a href=\"https://press.uchicago.edu/ucp/books/book/chicago/W/bo13823467.html\">Wild Hope</a> over the weekend, and his conclusion at the end was that we must not lose hope in our quest for a biodiverse, equitable world. And given the chaotic start to 2025, I can't think of a better place to start something new than within Cambridge, with our collegiate structure already providing a ready-made framework.</p>\n<p>So what next? If you're interested in helping <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> and me organise this, get in touch with either of us! I'm on <a href=\"https://www.hr.admin.cam.ac.uk/policies-procedures/flexible-working-policy/supporting-guidance/sabbatical-leave\">academic sabbatical</a> for a year from the summer, so I'll have loads of time. I'll edit this post with a list of first Colleges that have been in touch. 
We'll likely organise a pub get-together in early March (exact date to follow) to brainstorm about this with anyone interested.</p>\n<p> <em>This post is the result of many conversations around Cambridgeshire over the past year, ranging from a balmy summer dinner in Ely with <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> and <a href=\"https://en.wikipedia.org/wiki/Theresa_Marteau\">Theresa Marteau</a>, chilly autumn cups of tea in my Pembroke office with <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a> and <a href=\"http://carlhenrik.com/\">Carl Henrik Ek</a>, to misty morning coffees at <a href=\"https://www.visitcambridge.org/place/pages-cambridge/\">Pages</a> with <a href=\"https://en.wikipedia.org/wiki/Melissa_Leach\">Melissa Leach</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a> or at <a href=\"https://www.espressolane.co.uk/\">Espresso Lane</a> with <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, to cosy pubs with <span>Ian Leslie</span>, <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, to College dinners with <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, and <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a>/<a href=\"https://www.zoo.cam.ac.uk/research/groups/conservation-science\">CSG</a> discussions with <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>, <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a>, <a href=\"https://biomin.esc.cam.ac.uk/people/2023-Orlando-Timmerman/\">Orlando Timmerman</a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>, <a 
href=\"https://www.cl.cam.ac.uk/~arb33/\">Alastair Beresford</a>, <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> and <a href=\"https://github.com/mor1\">Richard Mortier</a>. Many thanks to them for corrections and feedback, and any remaining errors are my own. Changelog: 12th Feb added note on sportsmanship and Carl's NeurIPS@Cam talk. 6th May 2025: added <a href=\"https://mlg.eng.cam.ac.uk/carl/\">Carl Edward Rasmussen</a>'s published <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">article</a> on Themis.</em> </p>\n\n<ol>\n<li>\n<p>I promise I'm not a JMK shill, despite being a <a href=\"https://www.cshss.cam.ac.uk/research-info/j-m-keynes-fellowship-fund/j-m-keynes-fellows\">JMK Fellow</a>.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>The keen boardgame player will probably observe that there's always one player who decides to cause trouble just for fun, making everyone lose. This can be dealt with by social means.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/notes_carbon-credits-vs-offsets.json
···+"summary": "<p>The terms <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">carbon credits and carbon offsets</a> are often used interchangeably,\nbut are in fact two distinct concepts. I've spent a nice Sunday morning\nreading up on some <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">recent articles</a> that <a href=\"https://en.wikipedia.org/wiki/Bhaskar_Vira\">Bhaskar Vira</a> sent me which introduce a\n<em>third</em> term, known as <em>"carbon contributions"</em>. Rather than this adding confusion, I\nfound it helped me clarify my own thoughts on the matter, which I\nnote down here in draft form. <em>(Update 7th Feb: I've revised this several times after many discussions this week, especially with <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, with a full list of credits at the end)</em></p>\n<h2><a href=\"https://anil.recoil.org/#what-are-carbon-credits-and-offsets\"></a>What are carbon credits and offsets?</h2>\n<p>A <em>carbon credit</em> aims to quantify the net climate benefit resulting from an\nintervention that alters some CO2 emissions that would otherwise have gone into\nthe atmosphere in a business-as-usual counterfactual scenario. While there are many\ndifferent categories of carbon credits, I'll focus on <a href=\"https://iucn.org/our-work/nature-based-solutions\">nature-based solutions</a>. 
For example,\nwe could fund an intervention which provides an <a href=\"https://www.rspb.org.uk/whats-happening/news/the-power-of-forest-friendly-chocolate\">alternative livelihood</a> to cutting down tropical rainforests,\nand then calculate the area of rainforest saved (and therefore, the amount of avoided carbon emitted into the atmosphere) as a result\nof this action.</p>\n<p>The carbon credit therefore measures the <em>additional</em> amount of CO2 avoided as a result of the specific intervention,\nadjusted for <a href=\"https://www.lse.ac.uk/granthaminstitute/publication/avoiding-leakage-from-nature-based-offsets-by-design/\">negative externalities</a> and the <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">impermanence</a> of\nthe action into the future if it's at risk of being reversed. We can monitor the measurements using spaceborne sensing to\nestablish <a href=\"https://anil.recoil.org/notes/credible-credit-principles\">global baselines</a> against which to calculate the counterfactual impacts of positive actions.<a href=\"https://anil.recoil.org/#fn-2\">[1]</a> Carbon credits are nowadays their own asset class, both <a href=\"https://anil.recoil.org/papers/2024-cclr-carbon\">legally</a> and <a href=\"https://www.gov.uk/government/publications/revenue-and-customs-brief-7-2024-vat-treatment-of-voluntary-carbon-credits/revenue-and-customs-brief-vat-treatment-of-voluntary-carbon-credits\">fiscally</a>.</p>\n<p>A <em>carbon offset</em> <a href=\"https://anil.recoil.org/#fn-1\">[2]</a> is then a way to account for the net climate benefits that one entity brings to another. The "benefits" are the amount of CO2e avoided or removed via the carbon credit, and the "costs" are the amounts of CO2e being emitted by the other party. 
The origin of this accounting can be traced back to the UN's <a href=\"https://en.wikipedia.org/wiki/Net-zero_emissions\">net-zero</a> goals:</p>\n<blockquote>\n<p>Net-zero means cutting carbon emissions to a small amount of residual emissions that can be absorbed and durably stored by nature and other carbon dioxide removal measures, leaving zero in the atmosphere.\n-- UN <a href=\"https://www.un.org/en/climatechange/net-zero-coalition\">Net Zero coalition</a></p>\n</blockquote>\n<p>The theory behind offsetting is that we can never get to a complete net zero state due to the <a href=\"https://www.nature.com/articles/s41558-022-01592-2\">residual CO2 emissions</a> that will remain in even the most optimal decarbonised societies. We need to offset these residual emissions with corresponding climate benefits in order to balance the books on how much carbon is in the atmosphere and how much is being <a href=\"https://www.nature.com/articles/s41586-024-07602-x\">absorbed</a> by the planet's biosphere. And one of the main sources of CO2 absorption that we must protect in the biosphere is rainforests:</p>\n<blockquote>\n<p>Carbon sinks have increased in temperate and tropical regrowth forests owing to increases in forest area, but they decreased in boreal and tropical intact forests, as a result of intensified disturbances and losses in intact forest area, respectively. The global forest sink is equivalent to almost half of fossil-fuel emissions. 
However, two-thirds of the benefit from the sink has been negated by tropical deforestation.\n-- <a href=\"https://www.nature.com/articles/s41586-024-07602-x\">The enduring world forest carbon sink</a>, Nature 2024</p>\n</blockquote>\n<p>Since tropical rainforests are so crucial for both <a href=\"https://www.unesco.org/en/articles/yangambi-biosphere-reserve-congo-basin-become-knowledge-hub-climate-and-biodiversity\">CO2 absorption</a> and biodiversity, my own recent <a href=\"https://4c.cst.cam.ac.uk/publications\">research</a> has largely focussed on reliable <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">accounting</a> for quantifying carbon credits accurately for <a href=\"https://unfccc.int/topics/land-use/workstreams/redd/what-is-redd\">avoided deforestation</a> projects in these regions. This work has been <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">progressing</a> steadily, and we're increasingly confident in the quantification methods behind measuring the carbon sequestration impact of nature-based credits.</p>\n<p>However, what has been dragging down carbon credits is how they are used <em>after</em> they are verified and purchased, which is predominantly via carbon offsetting. Let's first examine the problems with carbon <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">offsetting</a>, and then examine how an emerging concept of "carbon <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">contributions</a>" might provide a better way forward for carbon credits.</p>\n<h2><a href=\"https://anil.recoil.org/#is-carbon-offsetting-a-license-to-pollute\"></a>Is carbon offsetting a license to pollute?</h2>\n<p>Carbon offsets are currently mostly <a href=\"https://icvcm.org/voluntary-carbon-market-explained/\">voluntary</a>, where private actors can purchase carbon credits towards reducing their emissions targets. 
The obvious problem with offsetting is that it can give <a href=\"https://www.ft.com/content/93938a1b-dc36-4ea6-9308-170189be0cb0\">bad actors</a> a license to spend money to <a href=\"https://www.theguardian.com/environment/2023/jan/19/shell-to-spend-450m-on-carbon-offsetting-fears-grow-credits-worthless-aoe\">continue to pollute</a>, while <a href=\"https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-emissions-for-google-and-microsoft-a-major-contributor-to-climate-change\">breaking their emissions pledges</a>. And the harsh reality is that if we don't engage in immediate and real emissions reductions, we're <a href=\"https://www.newscientist.com/article/2344159-world-is-on-track-for-2-5c-of-global-warming-by-end-of-the-century/\">screwed</a> in the coming decades.</p>\n<p>Unfortunately, we need to balance this with the short-term reality that many of these businesses have to emit to <a href=\"https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-emissions-for-google-and-microsoft-a-major-contributor-to-climate-change\">remain competitive</a>, for example in the AI sector (<a href=\"https://anil.recoil.org/notes/deepseek-r1-advances\">Deepseek</a> notwithstanding!).\nAmazon highlighted the difficulty of forecasting their emissions in their annual sustainability report in 2023:</p>\n<blockquote>\n<p>[...] our progress toward a net-zero carbon business will not be linear, and each year as our various businesses grow and evolve, we will produce different results [...] These results will be influenced by significant changes to our business, investments in growth, and meeting the needs of our customers.\n-- <a href=\"https://sustainability.aboutamazon.com/2023-amazon-sustainability-report.pdf\">Amazon Sustainability Report 2023</a></p>\n</blockquote>\n<p>As did Google, who gave up on 'real time net zero' last year, preferring instead to aim for the comfortably distant 2030:</p>\n<blockquote>\n<p>[...] 
starting in 2023, we're no longer maintaining operational carbon neutrality. We're instead focusing on accelerating an array of carbon solutions and partnerships that will help us work toward our net-zero goal [...]\n-- <a href=\"https://www.gstatic.com/gumdrop/sustainability/google-2024-environmental-report.pdf\">Google Environment Report 2024</a></p>\n</blockquote>\n<p>Your heart may not be bleeding for these tech companies finding it difficult to forecast how they'll make their next <a href=\"https://en.wikipedia.org/wiki/List_of_public_corporations_by_market_capitalization#Trillion-dollar_companies\">trillion dollars</a>, but there is the undeniable reality that they need to break emissions pledges in response to global competitive pressure on their core businesses. But given this, is there still any point in all the precise accounting frameworks for net-zero carbon <em>offsetting</em>?</p>\n<p>A December <a href=\"https://www.ft.com/content/969b487f-9534-44b6-a47d-ce7519667884\">article</a> in the FT argues that there needs to be a fundamental shift in our approach to carbon credits for this reason. They observed that the use of carbon offsets for emissions trading in the EU will probably only apply to removal projects that <a href=\"https://en.wikipedia.org/wiki/Direct_air_capture\">suck carbon from the air</a> and not to the nature-based deforestation avoidance schemes I described above.</p>\n<blockquote>\n<p>Corporate funding for nature conservation has a useful role to play \u2014 but as a contribution to the public good, not for use in tonne-for-tonne emissions offsetting calculations.\n-- <a href=\"https://www.ft.com/content/969b487f-9534-44b6-a47d-ce7519667884\">Simon Mundy</a>, "It's time for a shift in approach to carbon credits", FT</p>\n</blockquote>\n<p>And <em>there</em> is the critical distinction between carbon "credits" and "offsets" I was looking for! 
Simon acknowledges the crucial importance of generating forest carbon credits to address the extremely urgent problem of tropical deforestation, but notes that corporations should not be giving to this pot as part of a complex accounting scheme tied to the vagaries of their ever-shifting business strategies. Forests are too important to our continued existence to be left to the mercies of a <a href=\"https://www.theguardian.com/environment/article/2024/may/31/market-value-of-carbon-offsets-drops-61-aoe\">volatile stock market</a>.</p>\n<p>Instead, we need to come up with a scheme for spending carbon credits whose incentives are aligned towards keeping the focus on emissions reductions and behavioural change. So, let's next firmly decouple carbon credits from carbon offsets, and examine how organisations that wish to <em>do</em> the right thing can...contribute...instead.</p>\n<h2><a href=\"https://anil.recoil.org/#carbon-contributions-as-an-alternative-to-offsetting\"></a>Carbon contributions as an alternative to offsetting</h2>\n<p>An <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">article last year</a> by a former Cambridge Gates Scholar <a href=\"https://www.libbyblanchard.com/\">Libby Blanchard</a> and colleagues made a very clear case for how and why we might replace carbon offsetting with "carbon contributions", and especially so for forest protection. 
She observed that the <a href=\"https://www.ft.com/content/6eb8981e-4117-4aeb-a1b3-40f08ae85f53\">integrity crisis</a> in the offsets market has quite rightly led to the exposure of many poor quality schemes, but is also drying up crucial funding for the <a href=\"https://www.fscindigenousfoundation.org/global-south-voices-in-support-of-redd/\">good actors</a> who are working hard under very adverse conditions to launch forest protection schemes in the global <a href=\"https://www.wildlifeworks.com/post/listen-to-global-south-voices-the-carbon-market-s-key-role-in-financing-sustainable-development-and\">south</a> and <a href=\"https://www.reuters.com/sustainability/land-use-biodiversity/how-carbon-finance-is-seeding-new-hope-northern-forests-2024-12-20/\">north</a>.</p>\n<blockquote>\n<p>One way to channel forest finance away from bad offsets toward more productive outcomes is, simply, to stop claiming that forests offset fossil fuel emissions. Companies could, instead, make "contributions" to global climate mitigation through investments in forests.</p>\n<p>This change in terminology may seem small, but it represents a fundamentally different approach. For one thing, not allowing companies to subtract carbon credits from their direct emissions into a single net number, as offsetting does, refocuses priorities on direct emissions reductions. 
Companies would no longer be able to hide inaction behind offset purchases.\n-- <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">Libby Blanchard, Bill Anderegg and Barbara Haya</a>, Instead of Carbon Offsets, We Need 'Contributions' to Forests, Jan 2024</p>\n</blockquote>\n<p>This approach is radically more accessible for a good actor who has been scared away from offsets and is entangled in complex <a href=\"https://sciencebasedtargets.org\">SBTI</a>-style accounting frameworks!</p>\n<p>Firstly and most importantly, it removes the incentive to purchase the cheapest credits available on the market. Since organisations are no longer racing to hit a net-zero target, they can afford to find the highest quality and highest impact carbon projects available, and put their money towards those instead.</p>\n<p>Secondly, a contributions model focussed on quality means that more organisations can safely participate. In the current voluntary market, there is a <a href=\"https://en.wikipedia.org/wiki/The_Market_for_Lemons\">market for lemons</a> situation where it is very difficult to distinguish <a href=\"https://www.theguardian.com/environment/article/2024/may/30/corporate-carbon-offsets-credits\">junk credits</a> from <a href=\"https://community.rspb.org.uk/ourwork/b/actionfornature/posts/protecting-gola-10-years-of-the-redd-conservation-project-in-sierra-leone-s-gola-rainforest\">worthwhile credits</a>, since the market price is not a reliable indicator of quality. 
This means that the vast majority of organisations <a href=\"https://www.statista.com/statistics/501730/voluntary-carbon-offset-market-transaction-volume-worldwide/\">withdraw</a> from participating in the (voluntary) market due to the <a href=\"https://infiniteglobal.com/insights/a-net-zero-fairytale-the-reputational-risks-of-carbon-offsetting/\">reputational risks</a>, leaving only two sorts of participants: very good actors who <em>really</em> want to do the right thing, and very bad actors who are blatantly <a href=\"https://en.wikipedia.org/wiki/Greenwashing\">greenwashing</a>. It's a very odd party if the only two sorts of people left are the sinners and the saints!</p>\n<p>Let's look more closely at each of these points, as I think it fundamentally changes the dynamics of the use of carbon credits.</p>\n<h2><a href=\"https://anil.recoil.org/#selecting-the-highest-quality-carbon-credits-instead-of-the-cheapest\"></a>Selecting the highest quality carbon credits instead of the cheapest</h2>\n<p>There is a <a href=\"https://www.carbon-direct.com/insights/how-do-carbon-credits-actually-work-removal-reduction-and-avoidance-credits-explained\">vast array</a> of carbon avoidance, reduction and removal schemes; how do we choose between them? The current carbon markets focus on <a href=\"https://carbonmarketwatch.org/2024/08/14/faq-understanding-the-financial-workings-of-the-voluntary-carbon-market/\">secondary trading</a> as a price proxy, but this is a poor indicator of the underlying reliability and human and biodiversity co-benefits of any given intervention. In 2021, the University of <a href=\"https://www.environment.admin.cam.ac.uk/ESSC/carbon-offsetting-working-group-terms-reference\">Cambridge Offset Working Group</a> commissioned a <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">comprehensive report</a> on how we might compare project quality and co-benefits first, and then figure out a suitable price for each. 
This methodology (dubbed "<a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">PACT</a>") allows us to compare diverse credit types such as direct-air-capture and nature-based solution projects apples-to-apples. Here's an excerpt from that <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">report</a>:</p>\n<p>\n<img alt=\"Table of relative costs of carbon credits across project types from the COWG report\" src=\"https://anil.recoil.org/images/pact-table.webp\" title=\"Table of relative costs of carbon credits across project types from the COWG report\">\nTable of relative costs of carbon credits across project types from the COWG report</p>\n<p>The important column is the \u00a3PACT one, which shows the adjusted costs per tonne of carbon of purchasing those credits. The <a href=\"https://climeworks.com/subscriptions-co2-removal\">Climeworks</a> direct-air-capture comes in at \u00a3900/tonne <a href=\"https://anil.recoil.org/#fn-3\">[3]</a> whereas a tropical rainforest project in Sierra Leone weighs in at \u00a373/tonne, <em>even after impermanence is adjusted for</em>! That's an absolutely mind-blowing price difference for a market that's allegedly more <a href=\"https://en.wikipedia.org/wiki/Efficient-market_hypothesis\">efficient</a> due to the existence of secondary trading. 
Yet there is an order-of-magnitude price difference between tropical forest protection and direct air capture, and that's <em>before</em> taking into account the obvious co-benefits of forest protection such as <a href=\"https://anil.recoil.org/projects/life\">biodiversity</a> and livelihood improvements.</p>\n<p>Blanchard's earlier article identifies the key benefits of a contributions model here:</p>\n<blockquote>\n<p>Freeing companies from the pressure of "offsetting" by switching to a "contributions" frame lessens the incentive to minimize costs at the expense of quality, allowing them to focus on contributing to higher-quality projects.\n-- <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">Libby Blanchard, Bill Anderegg and Barbara Haya</a></p>\n</blockquote>\n<p>Since the University is <em>not</em> planning on spending these carbon credits on accounting towards a net-zero goal, it is free to search the market for the highest quality impact -- in this case, tropical rainforest avoidance credits that are hugely undervalued -- and also to filter based on important co-benefits such as biodiversity and livelihood impacts. And by sharing our knowledge about high quality carbon credit projects, we could hopefully find many other organisations that want to similarly contribute, and drive up the price of rainforest credits to their <a href=\"https://www.nature.com/articles/s41893-018-0175-0\">true value</a>.<a href=\"https://anil.recoil.org/#fn-5\">[4]</a></p>\n<p>With a contributions model, we no longer care what absolute price we're paying for the credits: our contributions only reflect a fraction of our total climate damage anyway, and we want the carbon credits that we do purchase to reflect the highest available impact out of the spectrum of compensation efforts that we could engage in. 
There's still one important consideration we'll talk about next though: how should an organisation account for these contributions, if not as part of a net-zero mechanism?</p>\n<h2><a href=\"https://anil.recoil.org/#applying-carbon-contributions-to-sustainability-policies\"></a>Applying carbon contributions to sustainability policies</h2>\n<p>The primary sustainability focus of any organisation must be on <a href=\"https://en.wikipedia.org/wiki/Climate_change_mitigation\">decarbonisation</a> via direct emissions reduction. With carbon contributions, we can focus on this without the distractions of race-to-the-bottom carbon offset accounting.</p>\n<p>For example, consider the area of <a href=\"https://www.bbc.co.uk/news/articles/cz7wp777780o\">international air travel</a>. There are <em>plenty</em> of things to do to reduce emissions here as a matter of urgent policy change. My University's <a href=\"https://www.environment.admin.cam.ac.uk/travel/sustainable-business-travel\">sustainable travel policy</a> is sensible and dictates that flying must be a last resort; we must use trains or other land travel where available, such as for European trips. There is also plenty of science to invest in to reduce the impact of aviation, ranging from <a href=\"https://www.bbc.co.uk/news/av/technology-60985913\">electrified planes</a> to <a href=\"https://www.bbc.co.uk/news/articles/cz7wp777780o\">contrail avoidance</a> and <a href=\"https://www.sciencedirect.com/science/article/pii/S0191261524001899\">optimised routing</a>. But, while all this is going on, sometimes there is only one practical way to get somewhere internationally, such as for an annual conference. We need all the emissions reductions strategies to be deployed first, and while these are taking effect we <em>also</em> need to augment them with voluntary contributions towards the last-resort travel that's happening while they are being rolled out or researched. 
Or indeed, also compensate for past travel emissions, as CO2e affects the climate for <a href=\"https://www.nature.com/articles/climate.2008.122\">longer than Stonehenge</a> has existed!</p>\n<p>Another similarly <a href=\"https://ourworldindata.org/food-choice-vs-eating-local\">topical</a> emissions reduction area is how to reduce our <a href=\"https://www.britishecologicalsociety.org/wp-content/uploads/Ripple-et-al-2014-ruminants.pdf\">ruminant meat consumption</a>. More and more research is showing how damaging this is in terms of <a href=\"https://www.worldwildlife.org/magazine/issues/summer-2018/articles/what-are-the-biggest-drivers-of-tropical-deforestation\">tropical forest destruction</a> but also from a <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity angle</a>. But it turns out that <a href=\"https://doi.org/10.1038/d41586-019-01662-0\">nudging consumers</a> such as Cambridge students and staff towards less damaging choices by default is entirely practical:<a href=\"https://anil.recoil.org/#fn-4\">[5]</a></p>\n<blockquote>\n<p>A study of over 94000 cafeteria meal choices has found that doubling the vegetarian options \u2013 from 1-in-4 to 2-in-4 \u2013 increased the proportion of plant-based purchases by between 40-80% without affecting overall food sales.\n-- <a href=\"https://www.cam.ac.uk/stories/veg-nudge\">Veg nudge</a>. Impact of increasing vegetarian availability on meals (<a href=\"https://doi.org/10.1073/pnas.1907207116\">paper</a> / <a href=\"https://www.nature.com/articles/s43016-020-0132-8\">followup</a>)</p>\n</blockquote>\n<p>For both of these emissions reductions initiatives, we could tag on a voluntary contribution whenever some damaging action (long-haul flying, importing ruminant meat, etc) is taken. This is a contribution of <em>last resort</em> ("I am a grad student presenting a paper and have to go abroad for this conference"). 
In <a href=\"https://www.environment.admin.cam.ac.uk/Annual-Report\">annual sustainability reports</a>, the primary focus of reporting would remain firmly on the emissions reductions initiatives themselves. But the contributions gathered from these schemes could be pooled, and treated as a collective (but voluntary) <a href=\"https://en.wikipedia.org/wiki/Carbon_tax\">carbon tax</a> on the damages to nature and the atmosphere.</p>\n<p>And how do we spend this carbon tax? On the highest quality carbon projects we can find in the big wide world, as I described earlier! Each individual reductions scheme doesn't worry about what the compensation mechanisms are; groups similar to the <a href=\"https://www.environment.admin.cam.ac.uk/ESSC/carbon-offsetting-working-group-terms-reference\">COWG</a> could regularly assess projects worldwide. By publicly sharing their results to allow other organisations to participate in supporting them, they would also help reinforce the emerging <a href=\"https://icvcm.org/core-carbon-principles/\">core carbon principles</a> championed by the <a href=\"https://icvcm.org/\">IC-VCM</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#im-pretty-sold-on-carbon-contributions-vs-offsets\"></a>I'm pretty sold on carbon contributions vs offsets</h2>\n<p>This contributions model places the emphasis back where it should be -- on behavioural and systemic reductions of our environmental impacts -- rather than on being a "license to pollute", as carbon offsets have so often been. 
It allows us to pragmatically identify high-impact areas where we have policies in place to reduce emissions, purchase carbon credits from those projects, and then account for their expenditure via our emissions reductions activities.</p>\n<p>An explicit non-goal is to use credits towards a big net-zero target of claiming carbon neutrality; they just reflect our collective contribution towards mitigating environmental damage that we've judged we had to cause.\n<a href=\"https://www.landecon.cam.ac.uk/person/dr-ellen-quigley\">Ellen Quigley</a> succinctly summarises this: <em>"a contribution is an acknowledgement of harm rather than its <a href=\"https://dictionary.cambridge.org/dictionary/english/expiation\">expiation</a>"</em>.</p>\n<p><a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> also applies this approach to <a href=\"https://anil.recoil.org/papers/2023-naturecredits\">biodiversity credits</a> in a recent <a href=\"https://royalsocietypublishing.org/doi/10.1098/rspb.2024.2353\">piece</a>:</p>\n<blockquote>\n<p>Using biodiversity credits to quantify contributions toward nature recovery, rather than to directly offset specific negative impacts, is a key way to reduce some of the risks we highlight. This is referred to in the forest carbon world as a "contribution" model. Instead of buyers of forest carbon credits claiming that the credits can offset emissions to achieve "net zero", they instead make a "contribution" to global climate mitigation through investments in forests.</p>\n<p>While this may seem like a small change in terminology, it represents an important difference. If carbon credits cannot be subtracted from a company's emissions to produce a single net number, they cannot be used as a license to continue emitting. This also lessens the incentive for buyers to focus on quantity rather than quality in purchased credits. 
Some biodiversity credit operators are already promoting this approach [...]\n-- <a href=\"https://royalsocietypublishing.org/doi/10.1098/rspb.2024.2353\">Hannah Wauchope et al</a>, What is a unit of nature? Measurement challenges in the emerging biodiversity credit market, Royal Society 2024</p>\n</blockquote>\n<p>I couldn't agree more! Julia also eloquently highlights the urgency of the situation in her <a href=\"https://www.nature.com/articles/s41559-024-02442-4\">commentary</a> in Nature in response to a recent <a href=\"https://www.bbc.co.uk/programmes/m001zd68\">Panorama</a> programme on the BBC:</p>\n<blockquote>\n<p>However, dramatically more finance is urgently needed to stop the ongoing loss of forests and the vital services that they provide. REDD+ credits that cover the true cost of reducing deforestation in an effective and equitable way can help to provide that finance. If they are only used to offset residual emissions after substantial reductions, they could also contribute to the transition to net zero. The bottom line is that failure to conserve our carbon-rich forests and the life they support would be a dramatic and catastrophic failure for humanity.\n- <a href=\"https://www.nature.com/articles/s41559-024-02442-4\">Julia P.G. 
Jones</a>, Scandal in the voluntary carbon market must not impede tropical forest conservation, Nature</p>\n</blockquote>\n<h2><a href=\"https://anil.recoil.org/#draft-principles-to-operationalise-carbon-contributions\"></a>Draft principles to operationalise carbon contributions</h2>\n<p>While we're still in the early days of working through the details, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and I have been framing a three-step checklist that organisations could apply towards the implementation of a carbon contributions model:</p>\n<ol>\n<li>The organisation acknowledges harm from recent and historic emissions. Decarbonisation remains the first priority, whilst minimising residual emissions.</li>\n<li>Contributions are intended to mitigate harm from residual emissions and not to claim carbon neutrality.</li>\n<li>The organisation is transparent about decreases or increases in emissions and beneficiaries of its contributions.</li>\n</ol>\n<p>With these principles, it should be possible for an organisation to contribute to carbon credit financing without adverse incentives. While there is some concern that this contributions mechanism has no built-in incentive to force organisations to contribute, I believe that it could bring a lot more people into the fold than voluntary offsetting has (which, as I noted earlier, now has mainly only the best and the worst participants remaining, with the majority stepping back due to all the controversies). 
However, we still need to see if this is a strong enough incentive to get more organisations to participate voluntarily; this concern has been raised by several colleagues in response to this article and I will think on it further.</p>\n<p>The stakes <a href=\"https://news.mongabay.com/2024/12/the-year-in-tropical-rainforests-2024/\">could not be higher</a> right now for tropical rainforests, and we do not have the collective luxury of time to remain locked in the <a href=\"https://www.ecosystemmarketplace.com/articles/commentaryhow-i-learned-to-stop-worrying-and-love-or-tolerate-carbon-offsets/\">offset-or-not</a> debate without an immediate alternative. The carbon contributions model could be just what we need to push forward! My hope is that this model makes it easier and safer for many organisations that have decided against offsetting to still contribute towards nature protection and restoration.</p>\n<p>Other universities also grappling with this topic include <a href=\"https://www.ecosystemmarketplace.com/articles/commentaryhow-i-learned-to-stop-worrying-and-love-or-tolerate-carbon-offsets/\">Brown</a> and <a href=\"https://www.cis.upenn.edu/~bcpierce/papers/carbon-offsets.pdf\">UPenn</a>, so I plan to circulate this article to them to gather wider opinions. The good folks at <a href=\"https://native.eco\">Native</a> also published a <a href=\"https://www.linkedin.com/pulse/why-businesses-must-shift-from-compensation-contribution-gkwee/?trackingId=ebXd8K96TidbACLeGURK%2Fw%3D%3D\">piece</a> about this shift from a compensation mindset to a contributions one.</p>\n<p>As noted at the beginning, I am updating this article regularly and would greatly welcome any other thoughts from you, the reader! 
I am grateful to <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\">Robin Daniels</a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a>, <a href=\"https://www.geog.cam.ac.uk/people/garrett/\">Rachael Garrett</a>, <a href=\"https://www.linkedin.com/in/isobelcohen/\">Isobel Cohen</a>, <a href=\"https://en.wikipedia.org/wiki/Simon_Zadek\">Simon Zadek</a>, <a href=\"https://en.wikipedia.org/wiki/Bhaskar_Vira\">Bhaskar Vira</a>, <a href=\"https://www.cam.ac.uk/stories/changemakers-melissa-leach\">Melissa Leach</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>, <a href=\"https://mynameismwd.org\">Michael Dales</a>, <a href=\"https://www.linkedin.com/in/harriet-hunnable-uk/\">Harriet Hunnable</a>, <a href=\"https://www.eden-plus.org/team-members/elliot-kinsey\">Elliot Kinsey</a>, <a href=\"https://www.landecon.cam.ac.uk/person/dr-ellen-quigley\">Ellen Quigley</a>, <a href=\"https://www.linkedin.com/in/jonpierre1/\">Jon Pierre</a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> and many others for their thoughts. 
This article includes their input but is not endorsed by them and any mistakes are mine alone.</p>\n<p><a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and I decided it might be instructive to run a <a href=\"https://notebooklm.google\">NotebookLM</a> summary of some of our discussions, which you can find in (AI-generated) podcast format below.</p>\n<p></p><div></div><p></p>\n<p> Changelog: 2nd Feb 2025 original article published. 5th Feb 2025 refined draft principles. 12th Feb 2025 added note about Native.eco article via <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\">Robin Daniels</a>, note on incentives via <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a>. 20th Feb 2025 fixed typo in Ellen Quigley quote, via <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n\n<ol>\n<li>\n<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> has an excellent <a href=\"https://4c.cst.cam.ac.uk/about/additionality-leakage-and-permanence\">video explainer</a> series on the work <a href=\"https://anil.recoil.org/projects/4c\">4C</a> has been doing towards this.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>From the <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">Wikipedia article</a> on carbon credits and offsets.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>The Climeworks price seems to have gone up since 2022, and the <a href=\"https://climeworks.com/subscriptions-co2-removal\">subscription</a> site now shows \u00a31100/tonne.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>There's a nice <a 
href=\"https://www.vice.com/en/article/the-amazon-is-worth-more-money-left-standing-study-shows/\">article from Vice</a> that explains the <a href=\"https://www.nature.com/articles/s41893-018-0175-0\">paper</a> more accessibly.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-5\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>As an aside, I've been purchasing <a href=\"https://shopping.rspb.org.uk/gifts-home/home-and-kitchen/food-drink/food/gola-chocolate.html\">sustainable Gola rainforest chocolate</a> from the RSPB. <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> gave me some of their truffles for Christmas and they were consumed rapidly by my family.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-4\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>The terms <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">carbon credits and carbon offsets</a> are often used interchangeably,\nbut are in fact two distinct concepts. I've spent a nice Sunday morning\nreading up on some <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">recent articles</a> that <a href=\"https://en.wikipedia.org/wiki/Bhaskar_Vira\">Bhaskar Vira</a> sent me which introduce a\n<em>third</em> term, known as <em>"carbon contributions"</em>. Rather than this adding confusion, I\nfound it helped me clarify my own thoughts on the matter, which I\nnote down here in draft form. 
<em>(Update 7th Feb: I've revised this several times after many discussions this week, especially with <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, with a full list of credits at the end)</em></p>\n<h2><a href=\"https://anil.recoil.org/#what-are-carbon-credits-and-offsets\"></a>What are carbon credits and offsets?</h2>\n<p>A <em>carbon credit</em> aims to quantify the net climate benefit resulting from an\nintervention that alters some CO2 emissions that would otherwise have gone into\nthe atmosphere in a business-as-usual counterfactual scenario. While there are many\ndifferent categories of carbon credits, I'll focus on <a href=\"https://iucn.org/our-work/nature-based-solutions\">nature-based solutions</a>. For example,\nwe could fund an intervention which provides an <a href=\"https://www.rspb.org.uk/whats-happening/news/the-power-of-forest-friendly-chocolate\">alternative livelihood</a> to cutting down tropical rainforests,\nand then calculate the area of rainforest saved (and therefore the amount of carbon emissions avoided) as a result\nof this action.</p>\n<p>The carbon credit therefore measures the <em>additional</em> amount of CO2 avoided as a result of the specific intervention,\nadjusted for <a href=\"https://www.lse.ac.uk/granthaminstitute/publication/avoiding-leakage-from-nature-based-offsets-by-design/\">negative externalities</a> and the <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">impermanence</a> of\nthe action into the future if it's at risk of being reversed.
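To make those adjustments concrete, here's a minimal numerical sketch; the function name and the 20% leakage / 0.7 permanence figures are hypothetical illustrations of mine, not values from any real crediting methodology:

```python
# Illustrative sketch only: the function and the example numbers below are
# hypothetical, chosen to show the shape of the adjustment, not taken from
# any real crediting methodology.

def net_credit_tonnes(gross_avoided_tco2: float,
                      leakage_rate: float,
                      permanence_discount: float) -> float:
    """Discount gross avoided emissions for leakage (deforestation displaced
    elsewhere) and impermanence (risk the protected forest is later lost)."""
    assert 0.0 <= leakage_rate < 1.0
    assert 0.0 < permanence_discount <= 1.0
    return gross_avoided_tco2 * (1.0 - leakage_rate) * permanence_discount

# A project avoiding 10,000 tCO2 on paper, with 20% leakage and a 0.7
# permanence discount, yields 5,600 tCO2 of creditable climate benefit.
print(net_credit_tonnes(10_000, 0.2, 0.7))  # 5600.0
```

The point is simply that the creditable benefit can end up much smaller than the headline avoided emissions once leakage and impermanence are accounted for.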
We can monitor the measurements using spaceborne sensing to\nestablish <a href=\"https://anil.recoil.org/notes/credible-credit-principles\">global baselines</a> against which to calculate the counterfactual impacts of positive actions.<a href=\"https://anil.recoil.org/#fn-2\">[1]</a> Carbon credits are nowadays their own asset class, both <a href=\"https://anil.recoil.org/papers/2024-cclr-carbon\">legally</a> and <a href=\"https://www.gov.uk/government/publications/revenue-and-customs-brief-7-2024-vat-treatment-of-voluntary-carbon-credits/revenue-and-customs-brief-vat-treatment-of-voluntary-carbon-credits\">fiscally</a>.</p>\n<p>A <em>carbon offset</em> <a href=\"https://anil.recoil.org/#fn-1\">[2]</a> is then a way to account for the net climate benefits that one entity brings to another. The "benefits" are the amount of CO2e avoided or removed via the carbon credit, and the "costs" are the amounts of CO2e being emitted by the other party. The origin of this accounting can be traced back to the UN's <a href=\"https://en.wikipedia.org/wiki/Net-zero_emissions\">net-zero</a> goals:</p>\n<blockquote>\n<p>Net-zero means cutting carbon emissions to a small amount of residual emissions that can be absorbed and durably stored by nature and other carbon dioxide removal measures, leaving zero in the atmosphere.\n-- UN <a href=\"https://www.un.org/en/climatechange/net-zero-coalition\">Net Zero coalition</a></p>\n</blockquote>\n<p>The theory behind offsetting is that we can never get to a complete net zero state due to the <a href=\"https://www.nature.com/articles/s41558-022-01592-2\">residual CO2 emissions</a> that will remain in even the most optimal decarbonised societies. These residual emissions need to be offset with corresponding climate benefits in order to balance the books on how much carbon is in the atmosphere and how much is being <a href=\"https://www.nature.com/articles/s41586-024-07602-x\">absorbed</a> by the planet's biosphere.
And one of the main sources of CO2 absorption that we must protect in the biosphere is rainforests:</p>\n<blockquote>\n<p>Carbon sinks have increased in temperate and tropical regrowth forests owing to increases in forest area, but they decreased in boreal and tropical intact forests, as a result of intensified disturbances and losses in intact forest area, respectively. The global forest sink is equivalent to almost half of fossil-fuel emissions. However, two-thirds of the benefit from the sink has been negated by tropical deforestation.\n-- <a href=\"https://www.nature.com/articles/s41586-024-07602-x\">The enduring world forest carbon sink</a>, Nature 2024</p>\n</blockquote>\n<p>Since tropical rainforests are so crucial for both <a href=\"https://www.unesco.org/en/articles/yangambi-biosphere-reserve-congo-basin-become-knowledge-hub-climate-and-biodiversity\">CO2 absorption</a> and biodiversity, my own recent <a href=\"https://4c.cst.cam.ac.uk/publications\">research</a> has largely focussed on reliable <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">accounting</a> for quantifying carbon credits accurately for <a href=\"https://unfccc.int/topics/land-use/workstreams/redd/what-is-redd\">avoided deforestation</a> projects in these regions. This work has been <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">progressing</a> steadily, and we're increasingly confident in the quantification methods used to measure the carbon sequestration impact of nature-based credits.</p>\n<p>However, what has been dragging down carbon credits is how they are used <em>after</em> they are verified and purchased, which is predominantly via carbon offsetting.
Let's first examine the problems with carbon <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">offsetting</a>, and then examine how an emerging concept of "carbon <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">contributions</a>" might provide a better way forward for carbon credits.</p>\n<h2><a href=\"https://anil.recoil.org/#is-carbon-offsetting-a-license-to-pollute\"></a>Is carbon offsetting a license to pollute?</h2>\n<p>Carbon offsets are currently mostly <a href=\"https://icvcm.org/voluntary-carbon-market-explained/\">voluntary</a>, where private actors can purchase carbon credits towards meeting their emissions targets. The obvious problem with offsetting is that it can give <a href=\"https://www.ft.com/content/93938a1b-dc36-4ea6-9308-170189be0cb0\">bad actors</a> a license to spend money to <a href=\"https://www.theguardian.com/environment/2023/jan/19/shell-to-spend-450m-on-carbon-offsetting-fears-grow-credits-worthless-aoe\">continue to pollute</a>, while <a href=\"https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-emissions-for-google-and-microsoft-a-major-contributor-to-climate-change\">breaking their emissions pledges</a>.
And the harsh reality is that if we don't engage in immediate and real emissions reductions, we're <a href=\"https://www.newscientist.com/article/2344159-world-is-on-track-for-2-5c-of-global-warming-by-end-of-the-century/\">screwed</a> in the coming decades.</p>\n<p>Unfortunately, we need to balance this with the short-term reality that many of these businesses have to emit to <a href=\"https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-emissions-for-google-and-microsoft-a-major-contributor-to-climate-change\">remain competitive</a>, for example in the AI sector (<a href=\"https://anil.recoil.org/notes/deepseek-r1-advances\">Deepseek</a> notwithstanding!).\nAmazon highlighted the difficulty of forecasting their emissions in their annual sustainability report in 2023:</p>\n<blockquote>\n<p>[...] our progress toward a net-zero carbon business will not be linear, and each year as our various businesses grow and evolve, we will produce different results [...] These results will be influenced by significant changes to our business, investments in growth, and meeting the needs of our customers.\n-- <a href=\"https://sustainability.aboutamazon.com/2023-amazon-sustainability-report.pdf\">Amazon Sustainability Report 2023</a></p>\n</blockquote>\n<p>As did Google, who gave up on 'real time net zero' last year, preferring instead to aim for the comfortably distant 2030:</p>\n<blockquote>\n<p>[...] starting in 2023, we're no longer maintaining operational carbon neutrality. 
We're instead focusing on accelerating an array of carbon solutions and partnerships that will help us work toward our net-zero goal [...]\n-- <a href=\"https://www.gstatic.com/gumdrop/sustainability/google-2024-environmental-report.pdf\">Google Environment Report 2024</a></p>\n</blockquote>\n<p>Your heart may not be bleeding for these tech companies finding it difficult to forecast how they'll make their next <a href=\"https://en.wikipedia.org/wiki/List_of_public_corporations_by_market_capitalization#Trillion-dollar_companies\">trillion dollars</a>, but there is the undeniable reality that they need to break emissions pledges in response to global competitive pressure on their core businesses. But given this, is there still any point in all the precise accounting frameworks for net-zero carbon <em>offsetting</em>?</p>\n<p>A December <a href=\"https://www.ft.com/content/969b487f-9534-44b6-a47d-ce7519667884\">article</a> in the FT argues that there needs to be a fundamental shift in our approach to carbon credits for this reason. They observed that the use of carbon offsets for emissions trading in the EU will probably only apply to removal projects that <a href=\"https://en.wikipedia.org/wiki/Direct_air_capture\">suck carbon from the air</a> and not to the nature-based deforestation avoidance schemes I described above.</p>\n<blockquote>\n<p>Corporate funding for nature conservation has a useful role to play \u2014 but as a contribution to the public good, not for use in tonne-for-tonne emissions offsetting calculations.\n-- <a href=\"https://www.ft.com/content/969b487f-9534-44b6-a47d-ce7519667884\">Simon Mundy</a>, "It's time for a shift in approach to carbon credits", FT</p>\n</blockquote>\n<p>And <em>there</em> is the critical distinction between carbon "credits" and "offsets" I was looking for! 
Simon acknowledges the crucial importance of generating forest carbon credits to address the extremely urgent problem of tropical deforestation, but notes that corporations should not be giving to this pot as part of a complex accounting scheme tied to the vagaries of their ever-shifting business strategies. Forests are too important to our continued existence to be left to the mercies of a <a href=\"https://www.theguardian.com/environment/article/2024/may/31/market-value-of-carbon-offsets-drops-61-aoe\">volatile stock market</a>.</p>\n<p>Instead, we need to come up with a scheme for spending carbon credits whose incentives are aligned towards keeping the focus on emissions reductions and behavioural change. So, let's next firmly decouple carbon credits from carbon offsets, and examine how organisations that wish to <em>do</em> the right thing can...contribute...instead.</p>\n<h2><a href=\"https://anil.recoil.org/#carbon-contributions-as-an-alternative-to-offsetting\"></a>Carbon contributions as an alternative to offsetting</h2>\n<p>An <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">article last year</a> by a former Cambridge Gates Scholar <a href=\"https://www.libbyblanchard.com/\">Libby Blanchard</a> and colleagues made a very clear case for how and why we might replace carbon offsetting with "carbon contributions", and especially so for forest protection.
She observed that the <a href=\"https://www.ft.com/content/6eb8981e-4117-4aeb-a1b3-40f08ae85f53\">integrity crisis</a> in the offsets market has quite rightly led to the exposure of many poor quality schemes, but is also drying up crucial funding for the <a href=\"https://www.fscindigenousfoundation.org/global-south-voices-in-support-of-redd/\">good actors</a> who are working hard under very adverse conditions to launch forest protection schemes in the global <a href=\"https://www.wildlifeworks.com/post/listen-to-global-south-voices-the-carbon-market-s-key-role-in-financing-sustainable-development-and\">south</a> and <a href=\"https://www.reuters.com/sustainability/land-use-biodiversity/how-carbon-finance-is-seeding-new-hope-northern-forests-2024-12-20/\">north</a>.</p>\n<blockquote>\n<p>One way to channel forest finance away from bad offsets toward more productive outcomes is, simply, to stop claiming that forests offset fossil fuel emissions. Companies could, instead, make "contributions" to global climate mitigation through investments in forests.</p>\n<p>This change in terminology may seem small, but it represents a fundamentally different approach. For one thing, not allowing companies to subtract carbon credits from their direct emissions into a single net number, as offsetting does, refocuses priorities on direct emissions reductions.
Companies would no longer be able to hide inaction behind offset purchases.\n-- <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">Libby Blanchard, Bill Anderegg and Barbara Haya</a>, Instead of Carbon Offsets, We Need 'Contributions' to Forests, Jan 2024</p>\n</blockquote>\n<p>This approach is radically more accessible for a good actor who has been scared away from offsets and is entangled in complex <a href=\"https://sciencebasedtargets.org\">SBTI</a>-style accounting frameworks!</p>\n<p>Firstly and most importantly, it removes the incentive to purchase the cheapest credits available on the market. Since organisations are no longer racing to hit a net-zero target, they can afford to find the highest quality and highest impact carbon projects available, and put their money towards those instead.</p>\n<p>Secondly, a contributions model focussed on quality means that more organisations can safely participate. In the current voluntary market, there is a <a href=\"https://en.wikipedia.org/wiki/The_Market_for_Lemons\">market for lemons</a> situation where it is very difficult to distinguish <a href=\"https://www.theguardian.com/environment/article/2024/may/30/corporate-carbon-offsets-credits\">junk credits</a> from <a href=\"https://community.rspb.org.uk/ourwork/b/actionfornature/posts/protecting-gola-10-years-of-the-redd-conservation-project-in-sierra-leone-s-gola-rainforest\">worthwhile credits</a>, since the market price is not a reliable indicator of quality.
This means that the vast majority of organisations <a href=\"https://www.statista.com/statistics/501730/voluntary-carbon-offset-market-transaction-volume-worldwide/\">withdraw</a> from participating in the (voluntary) market due to the <a href=\"https://infiniteglobal.com/insights/a-net-zero-fairytale-the-reputational-risks-of-carbon-offsetting/\">reputational risks</a>, leaving only two sorts of participants: very good actors who <em>really</em> want to do the right thing, and very bad actors who are blatantly <a href=\"https://en.wikipedia.org/wiki/Greenwashing\">greenwashing</a>. It's a very odd party if the only two sorts of people left are the sinners and the saints!</p>\n<p>Let's look more closely at each of these points, as I think it fundamentally changes the dynamics of the use of carbon credits.</p>\n<h2><a href=\"https://anil.recoil.org/#selecting-the-highest-quality-carbon-credits-instead-of-the-cheapest\"></a>Selecting the highest quality carbon credits instead of the cheapest</h2>\n<p>There is a <a href=\"https://www.carbon-direct.com/insights/how-do-carbon-credits-actually-work-removal-reduction-and-avoidance-credits-explained\">vast array</a> of carbon avoidance, reduction and removal schemes; how do we choose between them? The current carbon markets focus on <a href=\"https://carbonmarketwatch.org/2024/08/14/faq-understanding-the-financial-workings-of-the-voluntary-carbon-market/\">secondary trading</a> as a price proxy, but this is a poor indicator of the underlying reliability and human and biodiversity co-benefits of any given intervention. In 2021, the University of <a href=\"https://www.environment.admin.cam.ac.uk/ESSC/carbon-offsetting-working-group-terms-reference\">Cambridge Offset Working Group</a> commissioned a <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">comprehensive report</a> on how we might compare project quality and co-benefits first, and then figure out a suitable price for each.
This methodology (dubbed "<a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">PACT</a>") allows us to compare diverse credit types such as direct-air-capture and nature-based solution projects apples-to-apples. Here's an excerpt from that <a href=\"https://www.cambridge.org/engage/coe/article-details/6409c345cc600523a3e778ae\">report</a>:</p>\n<p>\n<img alt=\"Table of relative costs of carbon credits across project types from the COWG report\" src=\"https://anil.recoil.org/images/pact-table.webp\" title=\"Table of relative costs of carbon credits across project types from the COWG report\">\nTable of relative costs of carbon credits across project types from the COWG report</p>\n<p>The important column is the \u00a3PACT one, which shows the adjusted costs per tonne of carbon of purchasing those credits. The <a href=\"https://climeworks.com/subscriptions-co2-removal\">Climeworks</a> direct-air-capture comes in at \u00a3900/tonne <a href=\"https://anil.recoil.org/#fn-3\">[3]</a> whereas a tropical rainforest project in Sierra Leone costs just \u00a373/tonne, <em>even after impermanence is adjusted for</em>! That's an absolutely mind-blowing price difference for a market that's allegedly more <a href=\"https://en.wikipedia.org/wiki/Efficient-market_hypothesis\">efficient</a> due to the existence of secondary trading.
Yet there is an order-of-magnitude price difference between tropical forest protection and direct air capture, and that's <em>before</em> taking into account the obvious co-benefits of forest protection such as <a href=\"https://anil.recoil.org/projects/life\">biodiversity</a> and livelihood improvements.</p>\n<p>Blanchard's earlier article identifies the key benefits of a contributions model here:</p>\n<blockquote>\n<p>Freeing companies from the pressure of "offsetting" by switching to a "contributions" frame lessens the incentive to minimize costs at the expense of quality, allowing them to focus on contributing to higher-quality projects.\n-- <a href=\"https://ssir.org/articles/entry/forest-contributions-carbon-offsets\">Libby Blanchard, Bill Anderegg and Barbara Haya</a></p>\n</blockquote>\n<p>Since the University is <em>not</em> planning on spending these carbon credits on accounting towards a net-zero goal, it is free to search the market for the highest quality impact -- in this case, tropical rainforest avoidance credits that are hugely undervalued -- while also filtering based on important co-benefits such as biodiversity and livelihood impacts. And by sharing our knowledge about high quality carbon credit projects, we could hopefully find many other organisations that want to similarly contribute, and drive up the price of rainforest credits to their <a href=\"https://www.nature.com/articles/s41893-018-0175-0\">true value</a>.<a href=\"https://anil.recoil.org/#fn-5\">[4]</a></p>\n<p>With a contributions model, we no longer care about the absolute price we're paying for the credits: our contributions only reflect a fraction of our total climate damage anyway, and we want the carbon credits that we do purchase to reflect the highest available impact out of the spectrum of compensation efforts that we could engage in.
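As a toy sketch of this quality-first selection, imagine ranking projects by co-benefit-weighted impact per pound rather than by headline price; the project list and scores below are invented for illustration (only loosely echoing the \u00a3900 vs \u00a373 contrast above):

```python
# Hypothetical sketch of quality-first credit selection. The entries and
# co-benefit scores are invented for illustration, loosely echoing the
# COWG report's contrast between ~GBP 900/t DAC and ~GBP 73/t rainforest credits.

projects = [
    {"name": "direct-air-capture", "gbp_per_tonne": 900, "cobenefit_score": 1},
    {"name": "tropical-forest-protection", "gbp_per_tonne": 73, "cobenefit_score": 5},
    {"name": "temperate-afforestation", "gbp_per_tonne": 45, "cobenefit_score": 3},
]

def impact_per_pound(project: dict) -> float:
    # Rank by co-benefit-weighted impact per pound spent, not by the
    # cheapest headline price: a contributions buyer is not racing to a
    # net-zero tonnage target, so quality can dominate the decision.
    return project["cobenefit_score"] / project["gbp_per_tonne"]

best = max(projects, key=impact_per_pound)
print(best["name"])  # tropical-forest-protection
```

Note that the cheapest project per tonne is not the winner here: once co-benefits are weighted in, the undervalued rainforest credits come out on top.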
There's still one important consideration we'll talk about next though: how should an organisation account for these contributions, if not as part of a net-zero mechanism?</p>\n<h2><a href=\"https://anil.recoil.org/#applying-carbon-contributions-to-sustainability-policies\"></a>Applying carbon contributions to sustainability policies</h2>\n<p>The primary sustainability focus of any organisation must be on <a href=\"https://en.wikipedia.org/wiki/Climate_change_mitigation\">decarbonisation</a> via direct emissions reduction. With carbon contributions, we can focus on this without the distractions of race-to-the-bottom carbon offset accounting.</p>\n<p>For example, consider the area of <a href=\"https://www.bbc.co.uk/news/articles/cz7wp777780o\">international air travel</a>. There are <em>plenty</em> of things to do to reduce emissions here as a matter of urgent policy change. My University's <a href=\"https://www.environment.admin.cam.ac.uk/travel/sustainable-business-travel\">sustainable travel policy</a> is sensible and dictates that flying must be a trip of last resort; we must use trains or other land travel where available, such as for European trips. There is also plenty of science to invest in to reduce the impact of aviation, ranging from <a href=\"https://www.bbc.co.uk/news/av/technology-60985913\">electrified planes</a> to <a href=\"https://www.bbc.co.uk/news/articles/cz7wp777780o\">contrails</a> and <a href=\"https://www.sciencedirect.com/science/article/pii/S0191261524001899\">optimised routing</a>. But, while all this is going on, sometimes there is only one practical way to get somewhere internationally, such as for an annual conference. We need all the emissions reductions strategies to be deployed first, and while they are being rolled out or researched we <em>also</em> need to augment them with voluntary contributions towards the last-resort travel that is still happening.
Or indeed, also compensate for past travel emissions, as CO2e affects the climate for <a href=\"https://www.nature.com/articles/climate.2008.122\">longer than Stonehenge</a> has existed!</p>\n<p>Another similarly <a href=\"https://ourworldindata.org/food-choice-vs-eating-local\">topical</a> emissions reduction area is how to reduce our <a href=\"https://www.britishecologicalsociety.org/wp-content/uploads/Ripple-et-al-2014-ruminants.pdf\">ruminant meat consumption</a>. More and more research is showing how damaging this is in terms of <a href=\"https://www.worldwildlife.org/magazine/issues/summer-2018/articles/what-are-the-biggest-drivers-of-tropical-deforestation\">tropical forest destruction</a> but also from a <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity angle</a>. But it turns out that <a href=\"https://doi.org/10.1038/d41586-019-01662-0\">nudging consumers</a> such as Cambridge students and staff towards less damaging choices by default is entirely practical:<a href=\"https://anil.recoil.org/#fn-4\">[5]</a></p>\n<blockquote>\n<p>A study of over 94000 cafeteria meal choices has found that doubling the vegetarian options \u2013 from 1-in-4 to 2-in-4 \u2013 increased the proportion of plant-based purchases by between 40-80% without affecting overall food sales.\n-- <a href=\"https://www.cam.ac.uk/stories/veg-nudge\">Veg nudge</a>. Impact of increasing vegetarian availability on meals (<a href=\"https://doi.org/10.1073/pnas.1907207116\">paper</a> / <a href=\"https://www.nature.com/articles/s43016-020-0132-8\">followup</a>)</p>\n</blockquote>\n<p>For both of these emissions reductions initiatives, we could tag on a voluntary contribution whenever some damaging action (long-haul flying, importing ruminant meat, etc) is taken. This is a contribution of <em>last resort</em> ("I am a grad student presenting a paper and have to go abroad for this conference").
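A contribution-of-last-resort scheme like this could be operationalised as a simple internal carbon levy on the tagged activities; in the sketch below, the emission factors and the \u00a350/tCO2e internal price are made-up illustrative numbers, not anyone's actual policy:

```python
# Sketch of pooling last-resort contributions via an internal carbon price.
# All figures (emission factors, GBP 50/tCO2e price) are illustrative
# assumptions, not real policy or published conversion factors.

INTERNAL_CARBON_PRICE_GBP = 50.0  # hypothetical internal price per tCO2e

# Hypothetical per-unit emission factors in tCO2e.
EMISSION_FACTORS = {
    "long_haul_flight": 2.0,   # per return trip
    "ruminant_meal": 0.005,    # per meal served
}

def contribution_pool(activity_counts: dict) -> float:
    """Total voluntary contribution owed for a period's last-resort activities."""
    tco2e = sum(EMISSION_FACTORS[kind] * count
                for kind, count in activity_counts.items())
    return tco2e * INTERNAL_CARBON_PRICE_GBP

# e.g. 40 long-haul conference trips and 20,000 ruminant meals in a year
# would pool 180 tCO2e * GBP 50 = GBP 9,000 of contributions.
print(contribution_pool({"long_haul_flight": 40, "ruminant_meal": 20_000}))
```

Each reduction scheme only needs to report its activity counts; the levy itself is spent centrally on the highest quality projects available, as the next paragraph describes.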
In <a href=\"https://www.environment.admin.cam.ac.uk/Annual-Report\">annual sustainability reports</a>, the primary focus of reporting would remain firmly on the emissions reductions initiatives themselves. But the contributions gathered from these schemes could be pooled, and treated as a collective (but voluntary) <a href=\"https://en.wikipedia.org/wiki/Carbon_tax\">carbon tax</a> on the damages to nature and the atmosphere.</p>\n<p>And how do we spend this carbon tax? On the highest quality carbon projects we can find in the big wide world, as I described earlier! Each individual reductions scheme doesn't worry about what the compensation mechanisms are; groups similar to the <a href=\"https://www.environment.admin.cam.ac.uk/ESSC/carbon-offsetting-working-group-terms-reference\">COWG</a> could regularly assess projects worldwide. By publically sharing their results to allow other organisations to participate in supporting them, they would also help reinforce the emerging <a href=\"https://icvcm.org/core-carbon-principles/\">core carbon principles</a> championed by the <a href=\"https://icvcm.org/\">IC-VCM</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#im-pretty-sold-on-carbon-contributions-vs-offsets\"></a>I'm pretty sold on carbon contributions vs offsets</h2>\n<p>This contributions model places the emphasis back where it should be -- on behavioural and systemic reductions of our environment impacts -- rather than on being a "license to pollute", as carbon offsets have often been used as. 
It allows us to pragmatically identify high-impact areas where we have policies in place to reduce emissions, purchase carbon credits from those projects, and then account for their expenditure via our emissions reductions activities.</p>\n<p>An explicit non-goal is to use credits towards a big net-zero target of claiming carbon neutrality; they just reflect our collective contribution towards mitigating the environmental damage that we judged we had to cause.\n<a href=\"https://www.landecon.cam.ac.uk/person/dr-ellen-quigley\">Ellen Quigley</a> succinctly summarises this: <em>"a contribution is an acknowledgement of harm rather than its <a href=\"https://dictionary.cambridge.org/dictionary/english/expiation\">expiation</a>"</em>.</p>\n<p><a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> also applies this approach to <a href=\"https://anil.recoil.org/papers/2023-naturecredits\">biodiversity credits</a> in a recent <a href=\"https://royalsocietypublishing.org/doi/10.1098/rspb.2024.2353\">piece</a>:</p>\n<blockquote>\n<p>Using biodiversity credits to quantify contributions toward nature recovery, rather than to directly offset specific negative impacts, is a key way to reduce some of the risks we highlight. This is referred to in the forest carbon world as a "contribution" model. Instead of buyers of forest carbon credits claiming that the credits can offset emissions to achieve "net zero", they instead make a "contribution" to global climate mitigation through investments in forests.</p>\n<p>While this may seem like a small change in terminology, it represents an important difference. If carbon credits cannot be subtracted from a company's emissions to produce a single net number, they cannot be used as a license to continue emitting. This also lessens the incentive for buyers to focus on quantity rather than quality in purchased credits.
Some biodiversity credit operators are already promoting this approach [...]\n-- <a href=\"https://royalsocietypublishing.org/doi/10.1098/rspb.2024.2353\">Hannah Wauchope et al</a>, What is a unit of nature? Measurement challenges in the emerging biodiversity credit market, Royal Society 2024</p>\n</blockquote>\n<p>I couldn't agree more! Julia also highlights eloquently the urgency of the situation in her <a href=\"https://www.nature.com/articles/s41559-024-02442-4\">commentary</a> in Nature in response to a recent <a href=\"https://www.bbc.co.uk/programmes/m001zd68\">Panorama</a> program on the BBC:</p>\n<blockquote>\n<p>However, dramatically more finance is urgently needed to stop the ongoing loss of forests and the vital services that they provide. REDD+ credits that cover the true cost of reducing deforestation in an effective and equitable way can help to provide that finance. If they are only used to offset residual emissions after substantial reductions, they could also contribute to the transition to net zero. The bottom line is that failure to conserve our carbon-rich forests and the life they support would be a dramatic and catastrophic failure for humanity.\n- <a href=\"https://www.nature.com/articles/s41559-024-02442-4\">Julia P.G. 
Jones</a>, Scandal in the voluntary carbon market must not impede tropical forest conservation, Nature</p>\n</blockquote>\n<h2><a href=\"https://anil.recoil.org/#draft-principles-to-operationalise-carbon-contributions\"></a>Draft principles to operationalise carbon contributions</h2>\n<p>While we're still in the early days of working through the details, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, <a href=\"https://coomeslab.org\">David Coomes</a> and I have been framing a three-step checklist that organisations could apply towards the implementation of a carbon contributions model:</p>\n<ol>\n<li>The organisation acknowledges harm from recent and historic emissions. Decarbonisation remains the first priority, whilst minimising residual emissions.</li>\n<li>Contributions are intended to mitigate harm from residual emissions and not to claim carbon neutrality.</li>\n<li>The organisation is transparent about decreases or increases in its emissions, and about the beneficiaries of its contributions.</li>\n</ol>\n<p>With these principles, it should be possible for an organisation to contribute to carbon credit financing without adverse incentives. While there is some concern that this contributions mechanism has no built-in incentive to force organisations to contribute, I believe that it could bring a lot more people into the fold than voluntary offsetting has (which, as I noted earlier, now has mainly the best and the worst participants remaining, with the majority stepping back due to all the controversies).
However, we still need to see if this is a strong enough incentive to get more organisations to participate voluntarily; this concern has been raised by several colleagues in response to this article and I will think on it further.</p>\n<p>The stakes <a href=\"https://news.mongabay.com/2024/12/the-year-in-tropical-rainforests-2024/\">cannot be higher</a> right now for tropical rainforests, and we do not have the collective luxury of time to remain locked in the <a href=\"https://www.ecosystemmarketplace.com/articles/commentaryhow-i-learned-to-stop-worrying-and-love-or-tolerate-carbon-offsets/\">offset-or-not</a> debate without an immediate alternative. The carbon contributions model could be just what we need to push forward! My hope is that this model makes it easier and safer for many organisations that have decided against offsetting to still contribute towards nature protection and restoration.</p>\n<p>Other universities also grappling with this topic include <a href=\"https://www.ecosystemmarketplace.com/articles/commentaryhow-i-learned-to-stop-worrying-and-love-or-tolerate-carbon-offsets/\">Brown</a> and <a href=\"https://www.cis.upenn.edu/~bcpierce/papers/carbon-offsets.pdf\">UPenn</a>, so I plan to circulate this article to them to gather wider opinions. The good folks at <a href=\"https://native.eco\">Native</a> also published a <a href=\"https://www.linkedin.com/pulse/why-businesses-must-shift-from-compensation-contribution-gkwee/?trackingId=ebXd8K96TidbACLeGURK%2Fw%3D%3D\">piece</a> about this shift from a compensation mindset to a contributions one.</p>\n<p>As noted at the beginning, I am updating this article regularly and would greatly welcome any other thoughts from you, the reader! 
I am grateful to <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>, <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a>, <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\">Robin Daniels</a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a>, <a href=\"https://www.geog.cam.ac.uk/people/garrett/\">Rachael Garrett</a>, <a href=\"https://www.linkedin.com/in/isobelcohen/\">Isobel Cohen</a>, <a href=\"https://en.wikipedia.org/wiki/Simon_Zadek\">Simon Zadek</a>, <a href=\"https://en.wikipedia.org/wiki/Bhaskar_Vira\">Bhaskar Vira</a>, <a href=\"https://www.cam.ac.uk/stories/changemakers-melissa-leach\">Melissa Leach</a>, <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a>, <a href=\"https://mynameismwd.org\">Michael Dales</a>, <a href=\"https://www.linkedin.com/in/harriet-hunnable-uk/\">Harriet Hunnable</a>, <a href=\"https://www.eden-plus.org/team-members/elliot-kinsey\">Elliot Kinsey</a>, <a href=\"https://www.landecon.cam.ac.uk/person/dr-ellen-quigley\">Ellen Quigley</a>, <a href=\"https://www.linkedin.com/in/jonpierre1/\">Jon Pierre</a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> and many others for their thoughts. 
This article includes their input but is not endorsed by them, and any mistakes are mine alone.</p>\n<p><a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and I decided it might be instructive to run a <a href=\"https://notebooklm.google\">NotebookLM</a> summary of some of our discussions, which you can find in (AI-generated) podcast format below.</p>\n<p></p><div></div><p></p>\n<p> Changelog: 2nd Feb 2025: original article. 5th Feb 2025: refined draft principles. 12th Feb 2025: added note about Native.eco article via <a href=\"https://www.wolfson.cam.ac.uk/people/dr-robin-daniels\">Robin Daniels</a>, note on incentives via <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a>. 20th Feb 2025: fixed typo in Ellen Quigley quote, via <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n\n<ol>\n<li>\n<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> has an excellent <a href=\"https://4c.cst.cam.ac.uk/about/additionality-leakage-and-permanence\">video explainer</a> series on the work <a href=\"https://anil.recoil.org/projects/4c\">4C</a> has been doing towards this.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-2\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>From the <a href=\"https://en.wikipedia.org/wiki/Carbon_offsets_and_credits\">Wikipedia article</a> on carbon credits and offsets.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>The Climeworks price seems to have gone up since 2022, and the <a href=\"https://climeworks.com/subscriptions-co2-removal\">subscription</a> site now shows \u00a31100/tonne.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-3\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>There's a nice <a 
href=\"https://www.vice.com/en/article/the-amazon-is-worth-more-money-left-standing-study-shows/\">article from Vice</a> that explains the <a href=\"https://www.nature.com/articles/s41893-018-0175-0\">paper</a> more accessibly.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-5\">\u21a9\ufe0e\ufe0e</a></span></li><li>\n<p>As an aside, I've been purchasing <a href=\"https://shopping.rspb.org.uk/gifts-home/home-and-kitchen/food-drink/food/gola-chocolate.html\">sustainable Gola rainforest chocolate</a> from the RSPB. <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> gave me some of their truffles for Christmas and they were consumed rapidly by my family.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-4\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/notes_chora-internationalised.json
+18
avsm/notes_chora-internationalised.json
···+"summary": "<p>One of the coolest things about hacking on the <a href=\"https://horde.org\">Horde</a> framework is that it gives me lots of features for free that I can use in my web applications. The latest thing I added to the Chora CVS viewer today is the internationalisation framework, so that the frontend can be translated to multiple languages.</p>\n<p>I've added in a simple <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20010730/002975.html\">German translation</a> to start with, but please contribute your own strings if you get the opportunity.</p>",+"content": "<p>One of the coolest things about hacking on the <a href=\"https://horde.org\">Horde</a> framework is that it gives me lots of features for free that I can use in my web applications. The latest thing I added to the Chora CVS viewer today is the internationalisation framework, so that the frontend can be translated to multiple languages.</p>\n<p>I've added in a simple <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20010730/002975.html\">German translation</a> to start with, but please contribute your own strings if you get the opportunity.</p>",
+18
avsm/notes_chora-live-on-php.json
+18
avsm/notes_chora-live-on-php.json
···+"summary": "<p>I spent a chunk of time through the year working on the <a href=\"https://horde.org\">Horde</a> project. I began when I got commit to <a href=\"https://www.horde.org/apps/imp/\">IMP webmail</a> to fix some bugs in the MIME rendering for our Recoil deployment. You can see my code commits on the <a href=\"https://marc.info/?a=97359997900001&r=6\">horde-cvs</a> mailing list archive.</p>\n<p>After getting to grips with the PHP code, I then went on to totally rewrite the <a href=\"https://www.horde.org/apps/chora/\">Chora</a> version control viewer so that the CVS repositories for Horde could be browsed online instead of only via the command line.</p>\n<p>I'm extremely proud to report that the <a href=\"http://php.net\">PHP project</a> has <a href=\"https://lists.horde.org/archives/dev/Week-of-Mon-20010806/002886.html\">now deployed Chora</a> for production use to serve up <code>cvs.php.net</code>, making it our biggest user by far. Thanks for making my day, Rasmus!</p>\n<blockquote>\n<p>I switched Chora over to be the default web cvs system behind cvs.php.net\nnow. The old viewcvs site is still available at viewcvs.php.net (dns may\nnot have updated yet)\n -- <a href=\"https://lists.horde.org/archives/dev/Week-of-Mon-20010806/002886.html\">Rasmus Lerdorf</a>, php.net</p>\n</blockquote>",+"content": "<p>I spent a chunk of time through the year working on the <a href=\"https://horde.org\">Horde</a> project. I began when I got commit to <a href=\"https://www.horde.org/apps/imp/\">IMP webmail</a> to fix some bugs in the MIME rendering for our Recoil deployment. 
You can see my code commits on the <a href=\"https://marc.info/?a=97359997900001&r=6\">horde-cvs</a> mailing list archive.</p>\n<p>After getting to grips with the PHP code, I then went on to totally rewrite the <a href=\"https://www.horde.org/apps/chora/\">Chora</a> version control viewer so that the CVS repositories for Horde could be browsed online instead of only via the command line.</p>\n<p>I'm extremely proud to report that the <a href=\"http://php.net\">PHP project</a> has <a href=\"https://lists.horde.org/archives/dev/Week-of-Mon-20010806/002886.html\">now deployed Chora</a> for production use to serve up <code>cvs.php.net</code>, making it our biggest user by far. Thanks for making my day, Rasmus!</p>\n<blockquote>\n<p>I switched Chora over to be the default web cvs system behind cvs.php.net\nnow. The old viewcvs site is still available at viewcvs.php.net (dns may\nnot have updated yet)\n -- <a href=\"https://lists.horde.org/archives/dev/Week-of-Mon-20010806/002886.html\">Rasmus Lerdorf</a>, php.net</p>\n</blockquote>",
+18
avsm/notes_claude-copilot-sandbox.json
+18
avsm/notes_claude-copilot-sandbox.json
···+"summary": "<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> nerdsniped me last week into getting OCaml to drive the 80s-retro <a href=\"https://www.adafruit.com/product/2345\">RGB Matrix</a> displays. I grabbed one from the local Pi Store and soldered it together with help from <a href=\"https://mynameismwd.org\">Michael Dales</a>. But instead of writing OCaml bindings by hand, we thought we'd try out the latest agentic CLI called <a href=\"https://github.com/kodu-ai/claude-code\">Claude Code</a> released <a href=\"https://ai-claude.net/\">last week</a> to see if we could entirely autogenerate the bindings.</p>\n<p></p><div></div><p></p>\n<p><em>TL;DR:</em> Claude Coder generated working OCaml code almost from scratch, ranging from C bindings to high-level OCaml interface files and even Cmdliner terms, but needs a more sophisticated sandboxing model before something goes horribly wrong. So much potential and so much danger awaits us. Coincidentally <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and I <a href=\"https://anil.recoil.org/papers/2024-hope-bastion\">wrote</a> about this a few months ago. Read on...</p>\n<h2><a href=\"https://anil.recoil.org/#wiring-up-the-display-to-my-raspberry-pi\"></a>Wiring up the display to my Raspberry Pi</h2>\n<p>The RGB Matrix display has a very nice C++ <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix\">rpi-rgb-led-matrix</a> library, so I fired up my Raspberry Pi 4 to get an OCaml development environment going by using that. 
The included <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix/tree/master/examples-api-use\">demo</a> immediately gave me a disappointingly noisy display, but my larger-than-usual 64x64 display turned out to just need a jumper soldered.</p>\n<p>\n<img alt=\"Deploying my local friendly agentic soldering machine otherwise known as Michael Dales\" src=\"https://anil.recoil.org/images/rgb-matrix-hat-ocaml-2.webp\" title=\"Deploying my local friendly agentic soldering machine otherwise known as Michael Dales\">\nDeploying my local friendly agentic soldering machine otherwise known as Michael Dales</p>\n<p>As soon as that was soldered, the examples worked great out of the box, so I could get on with some agentic OCaml coding. Thanks <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://web.makespace.org/\">CamMakespace</a>!</p>\n<h2><a href=\"https://anil.recoil.org/#building-ocaml-bindings-using-claude-coder\"></a>Building OCaml bindings using Claude Coder</h2>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> and I first played around with using <a href=\"https://dev.realworldocaml.org/foreign-function-interface.html\">ocaml-ctypes</a> to build the bindings by hand, but quickly switched over to trying out Claude Sonnet 3.7, first in VSCode and then directly on the Pi CLI via <a href=\"https://github.com/anthropics/claude-code\">Claude Code</a>. The latter fires up an interactive session where you not only input prompts, but it can also <em>run shell commands</em> including builds.</p>\n<p>The very first hurdle was sorting out the build rules. 
I made those changes quickly by hand, leaving just a stub <code>librgbmatrix_stubs.c</code> that linked successfully with the main C++ library, but didn't do much beyond that. I also added near-empty <code>rgb_matrix.ml</code> and <code>rgb_matrix.mli</code> interface files to have a place for the OCaml side of the interface.</p>\n<p>\n<img alt=\"The Claude Code CLI runs fine on the Raspberry Pi 4, since most of the heavy computation is done on their end.\" src=\"https://anil.recoil.org/images/claude-coder-ss-1.webp\" title=\"The Claude Code CLI runs fine on the Raspberry Pi 4, since most of the heavy computation is done on their end.\">\nThe Claude Code CLI runs fine on the Raspberry Pi 4, since most of the heavy computation is done on their end.</p>\n<p>After that, it was just a matter of "asking the Claude Code CLI" via a series of prompts to get it to fill in the code blanks I'd left. The VSCode Copilot editing mode has to be told which files to look at within the project for its context, but I didn't have to do that with the Claude Code CLI.</p>\n<p>Instead, I just prompted it to generate C stubs from the <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix/blob/master/include/led-matrix-c.h\">led-matrix-c.h</a> C interface (so it didn't get distracted attempting to bind C++ to OCaml, which isn't a winning proposition). It duly generated reasonable low-level bindings, along with the right OCaml interface files by suggesting edits to the files I'd created earlier. 
At this point, I got a very basic "hello world" circle going (with the test binary also built by Claude).</p>\n<p>\n<img alt=\"The OCaml bindings and concentric circles were all auto-generated by Claude Sonnet 3.7\" src=\"https://anil.recoil.org/images/rgb-matrix-hat-ocaml-3.webp\" title=\"The OCaml bindings and concentric circles were all auto-generated by Claude Sonnet 3.7\">\nThe OCaml bindings and concentric circles were all auto-generated by Claude Sonnet 3.7</p>\n<p>Although the generated bindings built fine, they did segfault when I first ran the test binary! Claude 3.7 bound some C/OCaml functions with more than 5 arguments, which are a special case in OCaml due to <a href=\"https://ocaml.org/manual/5.3/intfc.html#ss:c-prim-impl\">differing bytecode and native code ABIs</a>. Although Claude <em>almost</em> got it right, it subtly mixed up the order of the <code>external</code> binding on the OCaml side. The correct version is:</p>\n<pre><code>external set_pixels_native :\n t -> int -> int -> int -> int -> Color.t array -> unit =\n "caml_led_canvas_set_pixels_bytecode" "caml_led_canvas_set_pixels"\n</code></pre>\n<p>The bytecode C stub comes first, and the native code second, but Claude swapped them, which led to memory corruption. This mixup would ordinarily be rather hard to spot, but the <a href=\"https://valgrind.org/\">valgrind</a> backtrace led me to the problem very quickly (but only because I'm very familiar with the OCaml FFI!). 
I couldn't convince Claude to fix this with prompting as it kept making the same mistake, so I swapped the arguments manually and committed the results by hand.</p>\n<h2><a href=\"https://anil.recoil.org/#generating-higher-level-ocaml-interfaces-and-docstrings\"></a>Generating higher level OCaml interfaces and docstrings</h2>\n<p>Once the basics were in place, I then asked it to refine the OCaml interface to be higher-level; for example, instead of a <code>string</code> for the hardware mode, could it scan the C header file, find the appropriate <code>#defines</code>, and generate corresponding OCaml <a href=\"https://dev.realworldocaml.org/variants.html\">variant types</a>? Incredibly, it not only did this, but <em>also</em> generated appropriate OCamldoc annotations for those types from the C header files.</p>\n<p>\n<img alt=\"These OCamldoc entries are generated automatically from the C header files\" src=\"https://anil.recoil.org/images/claude-coder-ss-2.webp\" title=\"These OCamldoc entries are generated automatically from the C header files\">\nThese OCamldoc entries are generated automatically from the C header files</p>\n<p>The Claude Code CLI then helpfully summarises all the changes, and also offers to execute dune to check the result works! 
This is starting to get a bit mad...</p>\n<p>\n<img alt=\"Claude offers to do the dune build after making code changes\" src=\"https://anil.recoil.org/images/claude-coder-ss-3.webp\" title=\"Claude offers to do the dune build after making code changes\">\nClaude offers to do the dune build after making code changes</p>\n<p>\n<img alt=\"It can also navigate the output of commands to see if the desired outcome is successful\" src=\"https://anil.recoil.org/images/claude-coder-ss-4.webp\" title=\"It can also navigate the output of commands to see if the desired outcome is successful\">\nIt can also navigate the output of commands to see if the desired outcome is successful</p>\n<p>\n<img alt=\"The patches to the interface and implementation added in more abstract types as requested\" src=\"https://anil.recoil.org/images/claude-coder-ss-5.webp\" title=\"The patches to the interface and implementation added in more abstract types as requested\">\nThe patches to the interface and implementation added in more abstract types as requested</p>\n<p>The OCaml interfaces generated here required a little iteration to get right, with some manual tweaks. Claude, for some reason, generated duplicate entries for some type definitions, which OCaml doesn't permit. I fixed those manually very quickly, and then asked Claude Code to commit the changes to git for me. It generated a <a href=\"https://github.com/yminsky/rpi-rgb-led-matrix/pull/3/commits/70c7739696ca207245dfdbc80c5d6d08fe2fce79\">good summary commit message</a>. 
The interfaces were all documented with docs from the C header file, such as:</p>\n<pre><code>type multiplexing =\n | DirectMultiplexing (* 0: Direct multiplexing *)\n | Stripe (* 1: Stripe multiplexing *)\n | Checker (* 2: Checker multiplexing (typical for 1:8) *)\n | Spiral (* 3: Spiral multiplexing *)\n | ZStripe (* 4: Z-Stripe multiplexing *)\n | ZnMirrorZStripe (* 5: ZnMirrorZStripe multiplexing *)\n | Coreman (* 6: Coreman multiplexing *)\n | Kaler2Scan (* 7: Kaler2Scan multiplexing *)\n | ZStripeUneven (* 8: ZStripeUneven multiplexing *)\n | P10MapperZ (* 9: P10MapperZ multiplexing *)\n | QiangLiQ8 (* 10: QiangLiQ8 multiplexing *)\n | InversedZStripe (* 11: InversedZStripe multiplexing *)\n | P10Outdoor1R1G1_1 (* 12: P10Outdoor1R1G1_1 multiplexing *)\n | P10Outdoor1R1G1_2 (* 13: P10Outdoor1R1G1_2 multiplexing *)\n (* ...etc <snipped> *)\n | Custom of int (* Custom multiplexing as an integer *)\n</code></pre>\n<p>Pretty good! After that, I couldn't resist pushing it a bit further. I asked the CLI to generate me a good command-line interface using <a href=\"https://github.com/dbuenzli/cmdliner\">Cmdliner</a>, which is normally a fairly intricate process that involves remembering the <a href=\"https://erratique.ch/software/cmdliner/doc/Cmdliner/Term/index.html\">Term/Arg DSL</a>. 
But Claude aced this; it generated a huge series of CLI converter functions like this:</p>\n<pre><code>(* scan_mode conversion *)\n let scan_mode_conv =\n let parse s =\n match String.lowercase_ascii s with\n | "progressive" -> Ok Progressive\n | "interlaced" -> Ok Interlaced\n | _ -> Error (`Msg "scan_mode must be 'progressive' or 'interlaced'")\n in\n let print fmt m =\n Format.fprintf fmt "%s"\n (match m with\n | Progressive -> "progressive"\n | Interlaced -> "interlaced")\n in\n Arg.conv (parse, print)\n</code></pre>\n<p>These are not entirely what I'd write, as <a href=\"https://erratique.ch/software/cmdliner/doc/Cmdliner/Arg/index.html#val-enum\">Cmdliner.Arg.enum</a> would suffice, but they're fine as-is and could be refactored later. I even got it to complete the job and generate a combined options parsing function for the (dozens) of command-line arguments, which would have been <em>very</em> tedious to do by hand:</p>\n<pre><code>(* Apply options from command line to Options.t *)\nlet apply_options options\n ~rows ~cols ~chain_length ~parallel ~hardware_mapping ~brightness \n ~pwm_bits ~pwm_lsb_nanoseconds ~pwm_dither_bits ~scan_mode ~row_address_type \n ~multiplexing ~disable_hardware_pulsing ~show_refresh_rate ~inverse_colors\n ~led_rgb_sequence ~pixel_mapper_config ~panel_type ~limit_refresh_rate_hz \n ~disable_busy_waiting =\n Options.set_rows options rows;\n Options.set_cols options cols;\n Options.set_chain_length options chain_length;\n Options.set_parallel options parallel;\n Options.set_hardware_mapping options hardware_mapping;\n Options.set_brightness options brightness;\n Options.set_pwm_bits options pwm_bits;\n Options.set_pwm_dither_bits options pwm_dither_bits;\n Options.set_scan_mode options scan_mode;\n Options.set_pixel_mapper_config options pixel_mapper_config;\n Options.set_panel_type options panel_type;\n Options.set_limit_refresh_rate_hz options limit_refresh_rate_hz;\n Options.set_disable_busy_waiting options disable_busy_waiting;\n 
(* ...etc <snipped> *)\n options\n</code></pre>\n<p>Once this compiled, I asked for a rotating 3D cube demo, and it duly used the bindings to give me a full command-line enabled generator which you can see below. I just ran:</p>\n<pre><code>rotating_block_generator.exe --disable-hardware-pulsing -c 64 -r 64 --hardware-mapping=adafruit-hat --gpio-slowdown=2\n</code></pre>\n<p>and I had a spinning cube on my display! The code model had no problem with the matrix transformations required to render the cool spinning effect.</p>\n<p></p><div></div><p></p>\n<p>Of course, I had to pay the piper for the truckload of GPUs that drove this code model. At one point, the Claude Code agent got into a loop that I had to manually interrupt as it kept oscillating on a code fix without ever finding the right solution. This turned out to have sucked up quite a lot of money from my Claude API account!</p>\n<p>\n<img alt=\"This post cost me a cup of coffee and a boatload of energy\" src=\"https://anil.recoil.org/images/claude-coder-ss-6.webp\" title=\"This post cost me a cup of coffee and a boatload of energy\">\nThis post cost me a cup of coffee and a boatload of energy</p>\n<p>Overall, I'm impressed. There's clearly some <a href=\"https://arxiv.org/abs/2502.18449\">RL or SFT</a> required to teach the code model the specifics of OCaml and its tooling, but the basics are already incredible. <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://github.com/jonludlam\">Jon Ludlam</a> and I are having a go at this in the coming months.</p>\n<h2><a href=\"https://anil.recoil.org/#claude-code-is-powerful-but-it-can-doanythingto-your-machine\"></a>Claude Code is powerful, but it can do...anything...to your machine</h2>\n<p>The obvious downside of this whirlwind binding exercise is that while the NPM-based Claude Code asks nicely before it runs shell commands, <em>it doesn't have to ask</em>. 
I happened to run it inside a well-sandboxed <a href=\"https://docker.com\">Docker</a> container on my rPi, but most people probably won't. And in general, we need a more sophisticated security model; running the agent within a coarse sandbox that limits access to the file system, the network, and other sensitive resources is too restrictive, as we want to provide access to these resources for certain agentic tasks!</p>\n<p>So in a happy coincidence, this leads to a line of research that <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> started last year with something we <a href=\"https://anil.recoil.org/news/2024-hope-bastion-1\">presented at HOPE 2024</a>. We explored how to express more precise constraints on what an AI can do by the use of the scary-sounding <a href=\"https://anil.recoil.org/papers/2024-hope-bastion.pdf\">Dijkstra monad</a>. It's far easier to understand by perusing the <a href=\"https://anil.recoil.org/slides/2024-hope-bastion-slides.pdf\">slides</a> of the talk, or watch <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a>'s great <a href=\"https://www.youtube.com/watch?v=U9H9xU-8-qc&list=PLyrlk8Xaylp7OQNLeCGS0j2fjEnvIWL9u\">video presentation</a>.</p>\n<p>We're mainly concerned with situations where the AI models are running over sensitive codebases or datasets. Consider three scenarios we want to handle, which are very logical extensions from the above agentic coding one:</p>\n<ol>\n<li>Modify or ignore sensor data to minimize the extent of habitat loss in a <a href=\"https://anil.recoil.org/papers/2024-terracorder\">biodiversity monitoring</a> setup. <em>But we may want to be able to delete duplicate sensor data in some phases of the analysis.</em></li>\n<li>Leak location sightings of vulnerable species to poachers. 
<em>But we still want to be able to work with this data to design effective interventions \u2014 we want a sandbox that limits information flows, in a statistical sense (differential privacy).</em></li>\n<li>Enact an intervention that may not satisfy legal constraints. <em>We want a sandbox that requires that a sound causal argument has been formulated</em></li>\n</ol>\n<p>For each of these, we could use a <a href=\"https://en.wikipedia.org/wiki/Capability-based_security\">capability security</a> model where access to sensitive data and effects can occur only via unforgeable capabilities granted explicitly. And the generation of that specification could also be done via code LLMs, but needs to target a verification friendly language like <a href=\"https://fstar-lang.com\">Fstar</a>. The prototype <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> built looks like this:</p>\n<pre><code>module type CapDataAccess (readonly : list dir, writable : list dir)\n (* abstract monad *)\n type Cmd a\n val return : a -> Cmd a\n val bind : Cmd a -> ( a -> Cmd b ) -> Cmd b\n (* only allows access to given directories *)\n val readfile : path -> Cmd string\n (* only allows writes to writable dirs *)\n val writefile : path -> string -> Cmd ()\n</code></pre>\n<p>And then you can use this rich specification to add constraints, for example see this <a href=\"https://github.com/patricoferris/hope-2024/tree/main/simple-json\">JSON parsing example</a> from the Fstar prototype:</p>\n<pre><code>(* Following IUCN's Globally Endangered (GE) scoring *)\nlet datamap = [\n"iberian-lynx.geojson", O [ "rarity", Int 2 ];\n"bornean-elephant.geojson", O [ "rarity", Int 3 ]\n]\n\n(* We add some additional predicates on the files allowed to be used *)\n@|-1,9 +1,10 ==========================================\n| (ensures (fun _ -> True))\n| (requires (fun _ _ local_trace ->\n| dont_delete_any_file local_trace /\\\n+| all_paths_are_not_endangered readonly /\\\n| only_open_some_files local_trace 
readonly))\n|}\n</code></pre>\n<p>Once you have this specification, then it's a matter of implementing fine-grained OS-level sandboxing policies to interpret and enforce them. Spoiler: we're working on such a system, so I'll write about that just as soon as it's more self-hosting; this area is moving incredibly fast.</p>\n\n<p>Thanks to <a href=\"https://mynameismwd.org\">Michael Dales</a> for help soldering. For the curious, here's the <a href=\"https://github.com/yminsky/rpi-rgb-led-matrix/pull/3\">PR with the code</a>, but it shouldn't go anywhere near any real use until we've had a chance to review the bindings carefully. There needs to be a new, even more buyer-beware no-warranty license for AI generated code!</p>",+"content": "<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> nerdsniped me last week into getting OCaml to drive the 80s-retro <a href=\"https://www.adafruit.com/product/2345\">RGB Matrix</a> displays. I grabbed one from the local Pi Store and soldered it together with help from <a href=\"https://mynameismwd.org\">Michael Dales</a>. But instead of writing OCaml bindings by hand, we thought we'd try out the latest agentic CLI called <a href=\"https://github.com/kodu-ai/claude-code\">Claude Code</a> released <a href=\"https://ai-claude.net/\">last week</a> to see if we could entirely autogenerate the bindings.</p>\n<p></p><div></div><p></p>\n<p><em>TL;DR:</em> Claude Coder generated working OCaml code almost from scratch, ranging from C bindings to high-level OCaml interface files and even Cmdliner terms, but needs a more sophisticated sandboxing model before something goes horribly wrong. So much potential and so much danger awaits us. Coincidentally <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and I <a href=\"https://anil.recoil.org/papers/2024-hope-bastion\">wrote</a> about this a few months ago. 
Read on...</p>\n<h2><a href=\"https://anil.recoil.org/#wiring-up-the-display-to-my-raspberry-pi\"></a>Wiring up the display to my Raspberry Pi</h2>\n<p>The RGB Matrix display has a very nice C++ <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix\">rpi-rgb-led-matrix</a> library, so I fired up my Raspberry Pi 4 to get an OCaml development environment going by using that. The included <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix/tree/master/examples-api-use\">demo</a> immediately gave me a disappointingly noisy display, but my larger-than-usual 64x64 display turned out to just need a jumper soldered.</p>\n<p>\n<img alt=\"Deploying my local friendly agentic soldering machine otherwise known as Michael Dales\" src=\"https://anil.recoil.org/images/rgb-matrix-hat-ocaml-2.webp\" title=\"Deploying my local friendly agentic soldering machine otherwise known as Michael Dales\">\nDeploying my local friendly agentic soldering machine otherwise known as Michael Dales</p>\n<p>As soon as that was soldered, the examples worked great out of the box, so I could get on with some agentic OCaml coding. Thanks <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://web.makespace.org/\">CamMakespace</a>!</p>\n<h2><a href=\"https://anil.recoil.org/#building-ocaml-bindings-using-claude-coder\"></a>Building OCaml bindings using Claude Coder</h2>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> and I first played around with using <a href=\"https://dev.realworldocaml.org/foreign-function-interface.html\">ocaml-ctypes</a> to build the bindings by hand, but quickly switched over to trying out Claude Sonnet 3.7, first in VSCode and then directly on the Pi CLI via <a href=\"https://github.com/anthropics/claude-code\">Claude Code</a>. The latter fires up an interactive session where you not only input prompts, but it can also <em>run shell commands</em> including builds.</p>\n<p>The very first hurdle was sorting out the build rules. 
This is the one place where Claude failed badly; it couldn't figure out <a href=\"https://dune.readthedocs.io/en/latest/quick-start.html\">dune files</a> at all, nor the intricate linking flags required to find and link to the C++ library. I made those changes quickly by hand, leaving just a stub <code>librgbmatrix_stubs.c</code> that linked successfully with the main C++ library, but didn't do much beyond that. I also added near-empty <code>rgb_matrix.ml</code> and <code>rgb_matrix.mli</code> interface files to have a place for the OCaml side of the interface.</p>\n<p>\n<img alt=\"The Claude Code CLI runs fine on the Raspberry Pi 4, since most of the heavy computation is done on their end.\" src=\"https://anil.recoil.org/images/claude-coder-ss-1.webp\" title=\"The Claude Code CLI runs fine on the Raspberry Pi 4, since most of the heavy computation is done on their end.\">\nThe Claude Code CLI runs fine on the Raspberry Pi 4, since most of the heavy computation is done on their end.</p>\n<p>After that, it was just a matter of "asking the Claude Code CLI" via a series of prompts to get it to fill in the code blanks I'd left. The VSCode Copilot editing mode has to be told which files to look at within the project for its context, but I didn't have to do that with the Claude Code CLI.</p>\n<p>Instead, I just prompted it to generate C stubs from the <a href=\"https://github.com/hzeller/rpi-rgb-led-matrix/blob/master/include/led-matrix-c.h\">led-matrix-c.h</a> C interface (so it didn't get distracted attempting to bind C++ to OCaml, which isn't a winning proposition). It duly generated reasonable low-level bindings, along with the right OCaml interface files by suggesting edits to the files I'd created earlier. 
At this point, I got a very basic "hello world" circle going (with the test binary also built by Claude).</p>\n<p>\n<img alt=\"The OCaml bindings and concentric circles were all auto-generated by Claude Sonnet 3.7\" src=\"https://anil.recoil.org/images/rgb-matrix-hat-ocaml-3.webp\" title=\"The OCaml bindings and concentric circles were all auto-generated by Claude Sonnet 3.7\">\nThe OCaml bindings and concentric circles were all auto-generated by Claude Sonnet 3.7</p>\n<p>Although the generated bindings built fine, they segfaulted when I first ran the test binary! Claude 3.7 bound some C/OCaml functions with more than 5 arguments, which are a special case in OCaml due to <a href=\"https://ocaml.org/manual/5.3/intfc.html#ss:c-prim-impl\">differing bytecode and native code ABIs</a>. Although Claude <em>almost</em> got it right, it subtly mixed up the order of the <code>external</code> binding on the OCaml side. The correct version is:</p>\n<pre><code>external set_pixels_native :\n t -> int -> int -> int -> int -> Color.t array -> unit =\n "caml_led_canvas_set_pixels_bytecode" "caml_led_canvas_set_pixels"\n</code></pre>\n<p>The bytecode C stub comes first, and the native code second, but Claude swapped them, which led to memory corruption. This mixup would ordinarily be rather hard to spot, but the <a href=\"https://valgrind.org/\">valgrind</a> backtrace led me to the problem very quickly (but only because I'm very familiar with the OCaml FFI!). 
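</p>\n<p>For reference, the C side of a pair like this follows the standard OCaml FFI two-stub pattern; here's a sketch (the argument names are illustrative, but the stub names match the <code>external</code> above):</p>\n<pre><code>/* Native stub: native code passes all six arguments directly. */\nCAMLprim value caml_led_canvas_set_pixels(value canvas, value x, value y,\n                                          value w, value h, value colors);\n\n/* Bytecode stub: with more than five arguments, bytecode passes an\n   argv array instead, which must be unpacked in the same order. */\nCAMLprim value caml_led_canvas_set_pixels_bytecode(value *argv, int argn)\n{\n  return caml_led_canvas_set_pixels(argv[0], argv[1], argv[2],\n                                    argv[3], argv[4], argv[5]);\n}\n</code></pre>\n<p>Swap the two names in the <code>external</code> and native code ends up calling the argv-unpacking stub directly, which is exactly the sort of memory corruption valgrind flagged. 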
I couldn't convince Claude to fix this with prompting as it kept making the same mistake, so I swapped the arguments manually and committed the results by hand.</p>\n<h2><a href=\"https://anil.recoil.org/#generating-higher-level-ocaml-interfaces-and-docstrings\"></a>Generating higher level OCaml interfaces and docstrings</h2>\n<p>Once the basics were in place, I then asked it to refine the OCaml interface to be higher-level; for example, instead of a <code>string</code> for the hardware mode, could it scan the C header file, find the appropriate <code>#defines</code>, and generate corresponding OCaml <a href=\"https://dev.realworldocaml.org/variants.html\">variant types</a>? Incredibly, it not only did this, but <em>also</em> generated appropriate OCamldoc annotations for those types from the C header files.</p>\n<p>\n<img alt=\"These OCamldoc entries are generated automatically from the C header files\" src=\"https://anil.recoil.org/images/claude-coder-ss-2.webp\" title=\"These OCamldoc entries are generated automatically from the C header files\">\nThese OCamldoc entries are generated automatically from the C header files</p>\n<p>The Claude Code CLI then helpfully summarises all the changes, and also offers to execute dune to check the result works! 
This is starting to get a bit mad...</p>\n<p>\n<img alt=\"Claude offers to do the dune build after making code changes\" src=\"https://anil.recoil.org/images/claude-coder-ss-3.webp\" title=\"Claude offers to do the dune build after making code changes\">\nClaude offers to do the dune build after making code changes</p>\n<p>\n<img alt=\"It can also navigate the output of commands to see if the desired outcome is successful\" src=\"https://anil.recoil.org/images/claude-coder-ss-4.webp\" title=\"It can also navigate the output of commands to see if the desired outcome is successful\">\nIt can also navigate the output of commands to see if the desired outcome is successful</p>\n<p>\n<img alt=\"The patches to the interface and implementation added in more abstract types as requested\" src=\"https://anil.recoil.org/images/claude-coder-ss-5.webp\" title=\"The patches to the interface and implementation added in more abstract types as requested\">\nThe patches to the interface and implementation added in more abstract types as requested</p>\n<p>The OCaml interfaces generated here required a little iteration to get right, with some manual tweaks. Claude, for some reason, generated duplicate entries for some type definitions, which OCaml doesn't permit. I fixed those manually very quickly, and then asked Claude Code to commit the changes to git for me. It generated a <a href=\"https://github.com/yminsky/rpi-rgb-led-matrix/pull/3/commits/70c7739696ca207245dfdbc80c5d6d08fe2fce79\">good summary commit message</a>. 
The interfaces were all documented with docs from the C header file, such as:</p>\n<pre><code>type multiplexing =\n | DirectMultiplexing (* 0: Direct multiplexing *)\n | Stripe (* 1: Stripe multiplexing *)\n | Checker (* 2: Checker multiplexing (typical for 1:8) *)\n | Spiral (* 3: Spiral multiplexing *)\n | ZStripe (* 4: Z-Stripe multiplexing *)\n | ZnMirrorZStripe (* 5: ZnMirrorZStripe multiplexing *)\n | Coreman (* 6: Coreman multiplexing *)\n | Kaler2Scan (* 7: Kaler2Scan multiplexing *)\n | ZStripeUneven (* 8: ZStripeUneven multiplexing *)\n | P10MapperZ (* 9: P10MapperZ multiplexing *)\n | QiangLiQ8 (* 10: QiangLiQ8 multiplexing *)\n | InversedZStripe (* 11: InversedZStripe multiplexing *)\n | P10Outdoor1R1G1_1 (* 12: P10Outdoor1R1G1_1 multiplexing *)\n | P10Outdoor1R1G1_2 (* 13: P10Outdoor1R1G1_2 multiplexing *)\n (* ...etc <snipped> *)\n | Custom of int (* Custom multiplexing as an integer *)\n</code></pre>\n<p>Pretty good! After that, I couldn't resist pushing it a bit further. I asked the CLI to generate me a good command-line interface using <a href=\"https://github.com/dbuenzli/cmdliner\">Cmdliner</a>, which is normally a fairly intricate process that involves remembering the <a href=\"https://erratique.ch/software/cmdliner/doc/Cmdliner/Term/index.html\">Term/Arg DSL</a>. 
But Claude aced this; it generated a huge series of CLI converter functions like this:</p>\n<pre><code>(* scan_mode conversion *)\n let scan_mode_conv =\n let parse s =\n match String.lowercase_ascii s with\n | "progressive" -> Ok Progressive\n | "interlaced" -> Ok Interlaced\n | _ -> Error (`Msg "scan_mode must be 'progressive' or 'interlaced'")\n in\n let print fmt m =\n Format.fprintf fmt "%s"\n (match m with\n | Progressive -> "progressive"\n | Interlaced -> "interlaced")\n in\n Arg.conv (parse, print)\n</code></pre>\n<p>These are not entirely what I'd write, as <a href=\"https://erratique.ch/software/cmdliner/doc/Cmdliner/Arg/index.html#val-enum\">Cmdliner.Arg.enum</a> would suffice, but they're fine as-is and could be refactored later. I even got it to complete the job and generate a combined options parsing function for the (dozens) of command-line arguments, which would have been <em>very</em> tedious to do by hand:</p>\n<pre><code>(* Apply options from command line to Options.t *)\nlet apply_options options\n ~rows ~cols ~chain_length ~parallel ~hardware_mapping ~brightness \n ~pwm_bits ~pwm_lsb_nanoseconds ~pwm_dither_bits ~scan_mode ~row_address_type \n ~multiplexing ~disable_hardware_pulsing ~show_refresh_rate ~inverse_colors\n ~led_rgb_sequence ~pixel_mapper_config ~panel_type ~limit_refresh_rate_hz \n ~disable_busy_waiting =\n Options.set_rows options rows;\n Options.set_cols options cols;\n Options.set_chain_length options chain_length;\n Options.set_parallel options parallel;\n Options.set_hardware_mapping options hardware_mapping;\n Options.set_brightness options brightness;\n Options.set_pwm_bits options pwm_bits;\n Options.set_pwm_dither_bits options pwm_dither_bits;\n Options.set_scan_mode options scan_mode;\n Options.set_pixel_mapper_config options pixel_mapper_config;\n Options.set_panel_type options panel_type;\n Options.set_limit_refresh_rate_hz options limit_refresh_rate_hz;\n Options.set_disable_busy_waiting options disable_busy_waiting;\n 
(* ...etc <snipped> *)\n options\n</code></pre>\n<p>Once this compiled, I asked for a rotating 3D cube demo, and it duly used the bindings to give me a full command-line enabled generator which you can see below. I just ran:</p>\n<pre><code>rotating_block_generator.exe --disable-hardware-pulsing -c 64 -r 64 --hardware-mapping=adafruit-hat --gpio-slowdown=2\n</code></pre>\n<p>and I had a spinning cube on my display! The code model had no problem with the matrix transformations required to render the cool spinning effect.</p>\n<p></p><div></div><p></p>\n<p>Of course, I had to pay the piper for the truckload of GPUs that drove this code model. At one point, the Claude Code agent got into a loop that I had to manually interrupt as it kept oscillating on a code fix without ever finding the right solution. This turned out to have sucked up quite a lot of money from my Claude API account!</p>\n<p>\n<img alt=\"This post cost me a cup of coffee and a boatload of energy\" src=\"https://anil.recoil.org/images/claude-coder-ss-6.webp\" title=\"This post cost me a cup of coffee and a boatload of energy\">\nThis post cost me a cup of coffee and a boatload of energy</p>\n<p>Overall, I'm impressed. There's clearly some <a href=\"https://arxiv.org/abs/2502.18449\">RL or SFT</a> required to teach the code model the specifics of OCaml and its tooling, but the basics are already incredible. <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://github.com/jonludlam\">Jon Ludlam</a> and I are having a go at this in the coming months.</p>\n<h2><a href=\"https://anil.recoil.org/#claude-code-is-powerful-but-it-can-doanythingto-your-machine\"></a>Claude Code is powerful, but it can do...anything...to your machine</h2>\n<p>The obvious downside of this whirlwind binding exercise is that while the NPM-based Claude Code asks nicely before it runs shell commands, <em>it doesn't have to ask</em>. 
I happened to run it inside a well-sandboxed <a href=\"https://docker.com\">Docker</a> container on my rPi, but most people probably won't. And in general, we need a more sophisticated security model; running the agent within a coarse sandbox that limits access to the file system, the network, and other sensitive resources is too restrictive, as we want to provide access to these resources for certain agentic tasks!</p>\n<p>So in a happy coincidence, this leads to a line of research that <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> started last year with something we <a href=\"https://anil.recoil.org/news/2024-hope-bastion-1\">presented at HOPE 2024</a>. We explored how to express more precise constraints on what an AI can do by the use of the scary-sounding <a href=\"https://anil.recoil.org/papers/2024-hope-bastion.pdf\">Dijkstra monad</a>. It's far easier to understand by perusing the <a href=\"https://anil.recoil.org/slides/2024-hope-bastion-slides.pdf\">slides</a> of the talk, or watching <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a>'s great <a href=\"https://www.youtube.com/watch?v=U9H9xU-8-qc&list=PLyrlk8Xaylp7OQNLeCGS0j2fjEnvIWL9u\">video presentation</a>.</p>\n<p>We're mainly concerned with situations where the AI models are running over sensitive codebases or datasets. Consider three scenarios we want to handle, which are very logical extensions from the above agentic coding one:</p>\n<ol>\n<li>Modify or ignore sensor data to minimize the extent of habitat loss in a <a href=\"https://anil.recoil.org/papers/2024-terracorder\">biodiversity monitoring</a> setup. <em>But we may want to be able to delete duplicate sensor data in some phases of the analysis.</em></li>\n<li>Leak location sightings of vulnerable species to poachers. 
<em>But we still want to be able to work with this data to design effective interventions \u2014 we want a sandbox that limits information flows, in a statistical sense (differential privacy).</em></li>\n<li>Enact an intervention that may not satisfy legal constraints. <em>We want a sandbox that requires that a sound causal argument has been formulated.</em></li>\n</ol>\n<p>For each of these, we could use a <a href=\"https://en.wikipedia.org/wiki/Capability-based_security\">capability security</a> model where access to sensitive data and effects can occur only via unforgeable capabilities granted explicitly. And the generation of that specification could also be done via code LLMs, but needs to target a verification-friendly language like <a href=\"https://fstar-lang.com\">Fstar</a>. The prototype <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> built looks like this:</p>\n<pre><code>module type CapDataAccess (readonly : list dir, writable : list dir)\n (* abstract monad *)\n type Cmd a\n val return : a -> Cmd a\n val bind : Cmd a -> ( a -> Cmd b ) -> Cmd b\n (* only allows access to given directories *)\n val readfile : path -> Cmd string\n (* only allows writes to writable dirs *)\n val writefile : path -> string -> Cmd ()\n</code></pre>\n<p>And then you can use this rich specification to add constraints; for example, see this <a href=\"https://github.com/patricoferris/hope-2024/tree/main/simple-json\">JSON parsing example</a> from the Fstar prototype:</p>\n<pre><code>(* Following IUCN's Globally Endangered (GE) scoring *)\nlet datamap = [\n"iberian-lynx.geojson", O [ "rarity", Int 2 ];\n"bornean-elephant.geojson", O [ "rarity", Int 3 ]\n]\n\n(* We add some additional predicates on the files allowed to be used *)\n@|-1,9 +1,10 ==========================================\n| (ensures (fun _ -> True))\n| (requires (fun _ _ local_trace ->\n| dont_delete_any_file local_trace /\\\n+| all_paths_are_not_endangered readonly /\\\n| only_open_some_files local_trace 
readonly))\n|}\n</code></pre>\n<p>Once you have this specification, then it's a matter of implementing fine-grained OS-level sandboxing policies to interpret and enforce them. Spoiler: we're working on such a system, so I'll write about that just as soon as it's more self-hosting; this area is moving incredibly fast.</p>\n\n<p>Thanks to <a href=\"https://mynameismwd.org\">Michael Dales</a> for help soldering. For the curious, here's the <a href=\"https://github.com/yminsky/rpi-rgb-led-matrix/pull/3\">PR with the code</a>, but it shouldn't go anywhere near any real use until we've had a chance to review the bindings carefully. There needs to be a new, even more buyer-beware no-warranty license for AI generated code!</p>",
+18
avsm/notes_codio-now-has-opam-support.json
···+"summary": "<p>I noticed an offhand tweet from Phil Tomson about <a href=\"http://codio.com/\">Codio</a> adding OPAM support, and naturally had to take a quick look. I was <em>really</em> impressed by the whole process, and ended up building the <a href=\"https://web.archive.org/web/20170914182531/http://www.openmirage.org/wiki/mirage-www\">Mirage Xen website</a> unikernel directly from my web browser in less than a minute, including registration!</p>\n<ul>\n<li>I signed up to Codio for free (since it\u2019s <a href=\"https://web.archive.org/web/20170914182531/https://codio.com/avsm/Mirage-WWW/\">a public project</a>) using GitHub oAuth (only public identity access required at first, no repository access).</li>\n<li>Selected a <code>git</code> project and pointed it at the <a href=\"https://web.archive.org/web/20170914182531/https://github.com/mirage/mirage-www\">mirage-www</a> repository.</li>\n<li>At this point, you get the usual file explorer and code editor view in your browser. The magic begins when you go to \u201cTools/Terminal\u201d, and an interactive Ubuntu shell pops up. Since Codio added <a href=\"https://web.archive.org/web/20170914182531/https://codio.com/s/blog/2014/03/new-parts/\">opam support</a>, setting up the Mirage environment is a breeze:</li>\n</ul>\n<blockquote>\n<p>I notice Codio supports OCaml and opam on the server side now.\n\u2014 phil tomson (@philtor)\n<a href=\"https://web.archive.org/web/20170914182531/https://twitter.com/philtor/statuses/448884571950444545\">March 26, 2014</a></p>\n</blockquote>\n<pre><code>$ parts install opam\n$ opam init -a\n$ eval `opam config env`\n$ opam install mirage-www -y\n$ make MODE=xen\n</code></pre>\n<p>Then have a cup of coffee while the box builds, and you have a <code>mir-www.xen</code>, all from your web browser! 
Codio has a number of deployment options available too, so you should be able to hook up a <a href=\"https://web.archive.org/web/20170914182531/http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Git-based workflow</a> using some combination of Travis or other CI service.</p>\n<p>This is the first time I\u2019ve ever been impressed by an online editor, and might consider moving away from my beloved vi...</p>",+"content": "<p>I noticed an offhand tweet from Phil Tomson about <a href=\"http://codio.com/\">Codio</a> adding OPAM support, and naturally had to take a quick look. I was <em>really</em> impressed by the whole process, and ended up building the <a href=\"https://web.archive.org/web/20170914182531/http://www.openmirage.org/wiki/mirage-www\">Mirage Xen website</a> unikernel directly from my web browser in less than a minute, including registration!</p>\n<ul>\n<li>I signed up to Codio for free (since it\u2019s <a href=\"https://web.archive.org/web/20170914182531/https://codio.com/avsm/Mirage-WWW/\">a public project</a>) using GitHub oAuth (only public identity access required at first, no repository access).</li>\n<li>Selected a <code>git</code> project and pointed it at the <a href=\"https://web.archive.org/web/20170914182531/https://github.com/mirage/mirage-www\">mirage-www</a> repository.</li>\n<li>At this point, you get the usual file explorer and code editor view in your browser. The magic begins when you go to \u201cTools/Terminal\u201d, and an interactive Ubuntu shell pops up. 
Since Codio added <a href=\"https://web.archive.org/web/20170914182531/https://codio.com/s/blog/2014/03/new-parts/\">opam support</a>, setting up the Mirage environment is a breeze:</li>\n</ul>\n<blockquote>\n<p>I notice Codio supports OCaml and opam on the server side now.\n\u2014 phil tomson (@philtor)\n<a href=\"https://web.archive.org/web/20170914182531/https://twitter.com/philtor/statuses/448884571950444545\">March 26, 2014</a></p>\n</blockquote>\n<pre><code>$ parts install opam\n$ opam init -a\n$ eval `opam config env`\n$ opam install mirage-www -y\n$ make MODE=xen\n</code></pre>\n<p>Then have a cup of coffee while the box builds, and you have a <code>mir-www.xen</code>, all from your web browser! Codio has a number of deployment options available too, so you should be able to hook up a <a href=\"https://web.archive.org/web/20170914182531/http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Git-based workflow</a> using some combination of Travis or other CI service.</p>\n<p>This is the first time I\u2019ve ever been impressed by an online editor, and might consider moving away from my beloved vi...</p>",
+18
avsm/notes_commit-access-to-php.json
···+"summary": "<p>I've been maintaining <a href=\"http://php.net\">PHP</a> on OpenBSD for a while now, including the core package distributed as binary packages.</p>\n<p>So as of today, the core team has decided I'm trustworthy enough to have my own commit bit to the central PHP repository, where I can commit code fixes and also maintain the <a href=\"https://www.php.net/manual/en/install.unix.openbsd.php\">PHP on OpenBSD</a> official instructions. You can contact me on <code>avsm@php.net</code> if you need any help!</p>",+"content": "<p>I've been maintaining <a href=\"http://php.net\">PHP</a> on OpenBSD for a while now, including the core package distributed as binary packages.</p>\n<p>So as of today, the core team has decided I'm trustworthy enough to have my own commit bit to the central PHP repository, where I can commit code fixes and also maintain the <a href=\"https://www.php.net/manual/en/install.unix.openbsd.php\">PHP on OpenBSD</a> official instructions. You can contact me on <code>avsm@php.net</code> if you need any help!</p>",
+18
avsm/notes_compass2024-ric-tripreport.json
···+"summary": "<p>This is a trip report of <a href=\"https://compass.acm.org\">ACM COMPASS 2024</a> held in New Delhi, which had a novel track of <a href=\"https://compass.acm.org/research-impact-collaboratives/\">"Research to Impact Collaboratives"</a> that drew me in. The general chair, <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a> wrote a fantastic book on "<a href=\"https://www.cse.iitd.ac.in/~aseth/act.html\">Technology and Disempowerment</a>" a few years ago, and he organised one RIC session on the CoRE Stack -- a climate adaptation stack for rural communities. This was a must-visit for me as it is closely related to the work we've been doing on <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a> and <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a>. The following notes are somewhat raw as they have only been lightly edited, but please refer to the more polished documents on the <a href=\"https://docs.google.com/document/d/1MJ-Nw_P3z6gI9rvh4OcjJmdZRE83D_OXedgEeDZDnm8/edit\">agenda for ACM COMPASS RIC</a> and the overall <a href=\"https://core-stack.org\">CoRE Stack</a> initiative on commoning technologies for resilience and equality</p>\n<p>The conference itself was held at <a href=\"http://iiitd.ac.in/\">IIIT-D</a> in New Delhi, right at the cusp of the monsoon season and after record-breaking temperatures. Luckily, as always, the hospitality and welcoming nature of New Delhi overrode all the climate discomfort!</p>\n<p>\n<img alt=\"Arriving at the IIIT-D campus\" src=\"https://anil.recoil.org/images/compass24/compass24-17.webp\" title=\"Arriving at the IIIT-D campus\">\nArriving at the IIIT-D campus</p>\n<p>The main focus of this report is the one-day RIC held on the 8th July 2024. 
The RIC had around <a href=\"https://docs.google.com/spreadsheets/d/1IF7bOT-868ky138ysKXZE-BBN0z6KjI7D7ZjfKufFQQ/edit?gid=0#gid=0\">60 attendees</a> in person and 40 online, and was a mix of presentations and discussions on the CoRE stack and how it could be used to address climate adaptation in rural communities. The day was divided into two sessions, with the first being a series of scene setting presentations by practitioners and researchers, and the second being a series of breakout discussions on how the CoRE stack could be used in different contexts.</p>\n<h2><a href=\"https://anil.recoil.org/#intro-the-ric-core-stack-aadi-seth\"></a>Intro: The RIC Core stack (Aadi Seth)</h2>\n<p>Data driven approaches enable new approaches to social ecological system health, but need to be grounded in community based approaches, and the scope is too vast for any one group to handle. The CoRE stack (Commoning for Resilience and Equality) is being architected as a digital public infrastructure consisting of datasets, pre-computed analytics, and tools that can be used by rural communities and other stakeholders to improve the sustainability and resilience of their local landscapes. It will enable innovators to build upon and contribute their own datasets, use APIs for third-party apps, and track and monitor socio-ecological sustainability through a systems approach. 
The CoRE stack broadly consists of four layers.</p>\n<p>\n<img alt=\"Getting a signed copy of Aadi&apos;s book!\" src=\"https://anil.recoil.org/images/compass24/compass24-19.webp\" title=\"Getting a signed copy of Aadi&apos;s book!\">\nGetting a signed copy of Aadi's book!</p>\n<p>The broad approach is bottom-up usecase discovery, picking a digital public infrastructure approach to work with civic services, and doing distributed problem solving across stakeholders in academia, government and business.\nAadi noted the need to balance between standards and design and end-to-end servicing, and the overheads of collaboration across so many people; see the notes on <a href=\"https://docs.google.com/document/d/1akzDkbCxbXQe49uaArNLw-2z_AYtF5jjZxR2UGJ66o0/edit\">RIC collaboration across people</a>.</p>\n<p>Aadi then described the CoRE stack as a logical layered architecture:</p>\n<ul>\n<li>Layer 1 is the inclusion of new datasets: what are the standards and processes\nbehind this? There are a lot of geospatial data products around, including\ncommunity data that has been gathered in an ad-hoc way.</li>\n<li>Layer 2 is the generation of indicators, APIs and reports which give us\nlandscape level socio-ecological indicators. Includes alert services,\ncomputation infrastructure and support.</li>\n<li>Layer 3 are the tools and platforms for implementation partners and\ncommunities. There are planning tools that are community based and\nparticipatory processes. Once we "know our landscape" we can apply fund\nallocation guidelines. An example of such a tool is Jaltol, for landscape and\nsite-level analysis. 
And ultimately we want to support new innovations such as\ndMRV for credits or socioecological indices.</li>\n<li>Layer 4 is about integrating into government and market programmes, such as\nwater security, forestry and biodiversity credits, natural farming, flood\nhazard adaptation and so on.</li>\n</ul>\n<p>To enable this, Aadi motivated the need to work together with networked co-creation and a\ndigital commons and build on top of it with open licenses. We need to overcome\nfault lines not only in terms of new climate change problems but also\nsocio-ecological barriers. And ultimately we need to go to scale and work with\ngovernment departments to make urgent improvements.</p>\n<p>An example of this is water security, via <a href=\"https://welllabs.org/jaltol/\">WellLabs Jaltol</a> which allows for\nlandscape characterisation for action pathways and side validation decision\nsupport tools, but also builds community based capacity for social accountability.\nE.g. given a drainage network, if you were to construct a new water body at this\npoint, what would the effect be on downstream water bodies and the communities that depend on it?</p>\n<p>\n<img alt=\"The general chair, Aadi Seth, opening the conference\" src=\"https://anil.recoil.org/images/compass24/compass24-2.webp\" title=\"The general chair, Aadi Seth, opening the conference\">\nThe general chair, Aadi Seth, opening the conference</p>\n<p>Aadi stated the goals for this RIC:</p>\n<ul>\n<li>Find new usecases, what challenges exist, and what principles we adopt for collaboration.</li>\n<li>Look at problems through different lenses: issues of equity, data validity, unnecessary digital labour, aligned with knowledge commons, scaling challenges, productisation challenges.</li>\n<li>Consider the data and algorithm standards necessary to enable networked co-creation but not hinder progress.</li>\n<li>Think in productised terms for end-to-end usecases to solve real problems in rural communities.</li>\n</ul>\n<h2><a 
href=\"https://anil.recoil.org/#discussion-session-1\"></a>Discussion Session 1</h2>\n<h3><a href=\"https://anil.recoil.org/#sustainability-action-at-scale-abhijeet-parmar-isb\"></a>Sustainability Action at Scale. Abhijeet Parmar (ISB)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1wZhXjRCStvkFIHh9Lo4UwIGFSezRdUKX/edit#slide=id.p1\">Slides</a>\n\n<img alt=\"Abhijeet Parmar presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-3.webp\" title=\"Abhijeet Parmar presenting\">\nAbhijeet Parmar presenting</p>\n<p>The speaker highlighted the importance of scalability in approaches, particularly in the context of technological applications. Applications must remain simple, grounded in community needs, and usable by the general public. A key problem discussed was the extraction of Above-Ground Biomass (AGB) using smartphone cameras while traversing forested areas. Traditional Lidar-based systems, though effective in providing detailed depth information, are deemed impractical due to the specialised equipment required.</p>\n<p>The proposed solution involves creating a Self-Supervised Learning (SSL) model that utilises mobile phones to conduct real-time segmentation of individual trees as one walks through a forest. This approach leverages a pre-trained segmentation model alongside advanced modelling and tagging processes.</p>\n<p>The development involves three distinct pipelines, which could be integrated into a single application in the future. Consideration must be given to the UI design to ensure accessibility and effectiveness for rural populations. Advancements in data collection, benchmarking, and pipeline development suggest that such technology could support large-scale forest management initiatives, particularly in public policy contexts. 
The initial testing phase of this model is being conducted under controlled conditions, including specific lighting and seasonal factors, with plans to extend its applicability.</p>\n<p>During the discussion, a question was raised regarding the allocation of funds for tree planting initiatives and identifying a starting point. Answer: it was suggested that bamboo, a valuable resource for biofuel production, could be a focal point. The Indian landscape has sufficient bamboo to meet current biofuel demand, and directing Corporate Social Responsibility (CSR) funds towards this effort could significantly expedite progress.</p>\n<p><em>During a break later I showed <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>'s GreenLens mobile app for estimating DBH from a mobile phone image (see <a href=\"https://drive.google.com/drive/folders/17-Yu3KXcgJiFapGc2AjJ2dHNC30YUbup?usp=sharing\">app download</a>).</em></p>\n<h3><a href=\"https://anil.recoil.org/#plantation-monitoring-for-drone-images-snehasis-mukherjee-snu\"></a>Plantation monitoring for drone images, Snehasis Mukherjee (SNU)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1yyqx1Z8aVwtHnbkycGiSV_L7WllaI8JI/edit#slide=id.p3\">Slides</a></p>\n<p>\n<img alt=\"Snehasis Mukherjee presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-4.webp\" title=\"Snehasis Mukherjee presenting\">\nSnehasis Mukherjee presenting</p>\n<p>The presentation by Snehasis Mukherjee focused on plantation monitoring using drone imagery, addressing the limitations of satellite images, esp. their inaccessibility to farmers. The workflow involves using drones at lower altitudes to capture detailed field imagery. The process begins with printing satellite images of a village onto paper, collaboratively marking land use with the locals, and proposing interventions. 
These are then imported into QGIS by a technical team, followed by field trips to gather further data using GeoODK, which is also integrated into QGIS. This iterative process is intended to inform local policy decisions at the Gram Sabha level.</p>\n<p>For drone imagery, the low-cost DJI Mini 2 with an RGB camera was chosen. Heights of 50-100m proved effective for capturing plantation images with sufficient resolution. The use cases include crop area estimation, classification, and monitoring plantation health. The first field trip occurred in Aug 2023 in Vittalpur village near Hyderabad, resulting in 253 usable images at ~50m (mainly of plantations).</p>\n<p>Image annotation was labor-intensive, with 100 images annotated by the team and 150 outsourced for \u20b91000, resulting in approximately 9000 annotations. The Xception and ResNet50 models showed promising results with reduced overfitting, and 2000 acres have now been mapped with multiple tree varieties. The challenge remains how to supplement limited drone imagery with lower-resolution satellite images, since flying drones is expensive.</p>\n<h3><a href=\"https://anil.recoil.org/#forestry-agroforestry-and-restoration-toolkit-using-technology-and-community-participation---ashish-kumar\"></a>Forestry Agroforestry and Restoration Toolkit using Technology and Community Participation - Ashish Kumar</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1hJ0NwdiRq5hAvxSDsopznuZD-B8Ik-OX/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"Ashish Kumar presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-5.webp\" title=\"Ashish Kumar presenting\">\nAshish Kumar presenting\nAshish is building a community participation model to scale agroforestry, aiming to create a feedback/knowledge loop with locals. The goal is to promote tree planting outside traditional forestry areas and restore degraded common lands. 
The approach involves identifying degraded areas and building a toolkit to recommend suitable tree species.</p>\n<p>The project includes several modules: Species Distribution Modelling (SDM), water management, carbon sequestration, and economic analysis. Water management is particularly critical and is informed by <a href=\"https://www.sciencedirect.com/science/article/pii/S2214581820302068\">research from the Kolar district</a>, which has experienced declining groundwater levels since the 1990s, a problem exacerbated by increasing demand. Remote sensing data shows significant variation in water usage depending on plant type and location (e.g., mango vs eucalyptus).</p>\n<p>Their work utilised the <a href=\"https://earlywarning.usgs.gov/docs/SSEBopETreadme.pdf\">SSEBOP evapotranspiration</a> product, accessed via Google Earth Engine (GEE), to analyse water use and its implications for agroforestry efforts.</p>\n<h3><a href=\"https://anil.recoil.org/#riverbed-sand-mining-activity-detection-based-on-satellite-imagery---siddharth-agarwal\"></a>Riverbed sand mining activity detection based on satellite imagery - Siddharth Agarwal</h3>\n<p><a href=\"https://drive.google.com/file/d/1iXaGuY0Ihb1luCn3aifkYvIhX3aI4pzT/view\">Slides</a></p>\n<p>\n<img alt=\"Siddharth Agarwal presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-6.webp\" title=\"Siddharth Agarwal presenting\">\nSiddharth Agarwal presenting</p>\n<p>This talk focussed on detecting riverbed sand mining activities using satellite imagery, particularly in areas where on-site visits are impractical. It turns out that sand is the second most extracted material globally after water, and its mining is a significant environmental concern especially for river communities. 
The project aims to develop a machine learning model to detect such mining activities using S1/S2 (didn't catch which, or both) satellite data.</p>\n<p>India Sand Watch, an open data platform developed with <a href=\"https://www.ooloilabs.in\">Ooloi Labs</a>, aims to collect, annotate, and archive data related to sand mining in India. This emerged due to the high costs associated with acquiring and processing detailed satellite imagery, and the need to understand sand mining comprehensively. The project covers the entire sand mining process, from discovery and land auctions to clearances and mining, and includes a 'sites of violence' framework that identifies intervention points.</p>\n<p>A significant challenge identified was the readability of documents associated with planning, which can be difficult even for humans, let alone LLMs, making digitisation and structuring of data crucial. The transition from discovery to the actual mining site often involves navigating poorly accessible documents, highlighting the need for better evidence pipelines. <em>Note to self: just like our <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project!</em></p>\n<p>They are collaborating with Berkeley to develop a machine learning model that predicts mining activity using low-resolution imagery (thus saving costs), covering vast areas (up to 10000 km2+) with Sentinel-1/2 as base maps. Their goal is to combine this data to create large-scale evidence that can then be used to drive large-scale action. This approach has been validated in court, where the data was accepted as evidence by the <a href=\"https://greentribunal.gov.in\">National Green Tribunal</a> (NGT).</p>\n<p>Q: is the community getting involved? 
A: The initiative began with community action, reflecting concerns over sand mining's impact on ecosystems, as sand is the second most extracted material globally after water.</p>\n<h2><a href=\"https://anil.recoil.org/#session-2\"></a>Session 2</h2>\n<h3><a href=\"https://anil.recoil.org/#proposal-for-a-new-interactive-electronic-publication-standard---r-nandakumar\"></a>Proposal for a new interactive electronic publication standard - R. Nandakumar</h3>\n<p><a href=\"https://docs.google.com/presentation/d/142YSXa8IUUmKSKUhH1TvIN-PaD1Cuirv/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"R. Nandakumar presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-7.webp\" title=\"R. Nandakumar presenting\">\nR. Nandakumar presenting\nR. Nandakumar (recently retired from ISRO but still working in this space) proposed a new interactive electronic publication standard aimed at improving the quality of information products in communicating research results more interactively. He seeks to integrate code with data, ensuring version control while addressing security and privacy concerns. The current business model, which relies on distracting advertisements, exacerbates the digital divide, especially for rural communities, and hampers effective communication.</p>\n<p>He highlighted several issues with existing formats: inadequate representation of images, maps, infographics, and spreadsheets, and the absence of interactive features like running commentaries during visualisation animations. Also, there is a lack of fine-grained printing and zoom capabilities, and of flexible authorisation mechanisms.</p>\n<p>His proposal suggests evolving existing standards (like PDFs) into more interactive and self-contained formats that include code. The first phase would extend 2D image maps to support animations and metadata while embedding free and open-source software within the PDF. 
The second phase could expand this to include 3D models.</p>\n<p>The end goal is to standardise interactions across various formats\u2014image maps, spreadsheets, infographics, animations, and audiovisual content\u2014using the ISO/IEC 25010 SQuaRE standard, which provides a comprehensive framework for functionality, performance, compatibility, usability, reliability, security, maintainability, and portability. (see slides for more details on each of these)</p>\n<p><em>My mad idea:</em> might we build a WASM interpreter in JavaScript so that it can run inside the existing PDF JS interpreter and work with existing docs? WASI for PDF! I've got a project idea relevant to this that can perhaps be extended or forked; see <a href=\"https://anil.recoil.org/ideas/life-explorer-wasm\">Using wasm to locally explore geospatial layers</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#geospatial-data-standards-to-enable-co-creation-of-data-products-craig-dsouza\"></a>Geospatial data standards to enable co-creation of data products (Craig Dsouza)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1n1CN66Yh9wKKcquMHInbQPSRCkPY9vmhae-_ogJmIcg/edit#slide=id.g2eaa42613c0_0_73\">Slides</a></p>\n<p>\n<img alt=\"Craig Dsouza presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-8.webp\" title=\"Craig Dsouza presenting\">\nCraig Dsouza presenting</p>\n<p>There is an overload of data and algorithms in all directions, so we want to accelerate development of <em>better</em> data and algorithms rather than quantity. 
How do we increase trust and reduce friction in the source data and eventual results with rural communities?\nDomain-specific standards do exist, but they are either incomplete or not widely adopted (see previous talk), especially for natural resource management, where data can come in different modalities/resolutions; some commonality exists, but sector-specific extensions to current standards are required to deal with local variability.</p>\n<p>So they are surveying data standards and algorithm standards. To consider data standards first, the most successful is OpenStreetMap. For algorithm standards, there are rapidly adopted services like HuggingFace. But what is the <em>combination</em> of both so that they can be coupled to real outcomes?</p>\n<p>How do we compare the performance of data standards and build guiding principles for which ones to pick?</p>\n<ul>\n<li><em>to reduce friction:</em>\n<ul>\n<li>consider the time taken for dataset and model integration with existing open source tools</li>\n<li>or the time taken for the end user to create a new dummy datapoint.</li>\n<li>time taken for end user to run the model and make the first minor fix.</li>\n</ul>\n</li>\n<li><em>to accelerate development:</em>\n<ul>\n<li>number of collaborators over time</li>\n<li>number of additions by 3rd parties over time</li>\n<li>increase in model performance over time</li>\n</ul>\n</li>\n</ul>\n<p>An existing example is how to share a LULC dataset using existing open geospatial standards (<a href=\"https://stacspec.org/en\">STAC</a>). The data standard creates a simple JSON file which has metadata for that module. The data user can then access the latest version of the data via either an API or the STAC browser.</p>\n<p><em>TODO for myself:</em> Look at mapping these metrics onto our TMF pipeline (in <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a>) and investigate a possible user study with some CCI data. 
Also, is STAC relevant to the TMF/LIFE/FOOD publishing pipeline in <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>, as we need to publish the various layers there soon?</p>\n<h3><a href=\"https://anil.recoil.org/#geospatial-data-flow-management---anil-madhavapeddy\"></a>Geospatial data flow management - Anil Madhavapeddy</h3>\n<p>My talk -- I was speaking, so no notes! I'll upload the slides later and edit this section.</p>\n<p>Good question from the audience about healthcare management and its relevance to planetary computing -- it seems to share a lot of the problems involving data sensitivity and the need for spatially explicit data sharing.</p>\n<h3><a href=\"https://anil.recoil.org/#opportunities-in-agricultural-sensing---anupam-sobti\"></a>Opportunities in agricultural sensing - Anupam Sobti</h3>\n<p><a href=\"https://docs.google.com/presentation/d/11XAuKb78TpIpMkZGYWn58I3iQnlBvRmQ/edit#slide=id.p1\">Slides</a></p>\n<p>Anupam introduced the main questions across the rural farming cycle, including:</p>\n<ul>\n<li><em>Sowing:</em> "Is this the right crop?" "Will I have enough resources (water, heat, seeds)?" "Are these the right seeds?"</li>\n<li><em>Harvesting:</em> "Is this the right time to harvest?" "How do I plan post-harvest logistics?" "How do I manage residue?"</li>\n<li><em>Selling:</em> "Is this the right time to sell?" "Who do I trust to sell to?" "Do I sell now or wait?"</li>\n</ul>\n<p>So onto the notion of "Agricultural Computing", which:</p>\n<ul>\n<li>involves multiple decision layers: farmer-centric, government-centric, and finance-centric.</li>\n<li>features recent innovations such as advancements in remote sensing and game theory applications to navigate complex agricultural decisions.</li>\n</ul>\n<p>Urban heat islands are a significant problem detectable with geospatial data. He referenced the paper by\nMohajerani, Abbas, Jason Bakaric, and Tristan Jeffrey-Bailey. 
"The urban heat island effect, its causes, and mitigation, with reference to the thermal properties of asphalt concrete." <em>Journal of Environmental Management</em> 197 (2017): 522-538.</p>\n<p><em>Note to self: Send to <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> re <a href=\"https://anil.recoil.org/papers/2024-green-urban-eq\">Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications</a>.</em></p>\n<p><strong>Q:</strong> For marginalised communities, should there be standards for interactions to obtain feedback iteratively, reducing the shock of policy changes? <strong>A:</strong> There is a need for significant groundwork engineering right now to provide immediate feedback, helping communities adapt more smoothly to changes.</p>\n<h3><a href=\"https://anil.recoil.org/#understanding-soil-moisture-regime-for-crop-diversification---prachi-d-patil\"></a>Understanding Soil Moisture Regime for Crop Diversification - Prachi D. Patil</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1ZZMqF-8hCIupNm5VUH8wu61v9eTuI1e-/edit#slide=id.p1\">Slides</a>\n\n<img alt=\"Prachi D. Patil presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-9.webp\" title=\"Prachi D. Patil presenting\">\nPrachi D. Patil presenting</p>\n<p>Prachi gave a perspective from the farmer's fields, with a study aiming to group relatively homogeneous regions based on soil, climate, and physiography, focusing on moisture availability periods for soil and the length of the growing season. Their approach uses simple moisture sensors at various depths to measure soil resistivity, providing farmers with real-time information on whether to irrigate. 
This system can map dry spells and their duration, offering actionable insights for crop management.</p>\n<p>The <a href=\"https://www.wassan.org/wp-content/uploads/2022/03/WASSANPublication_BhagyalakshmiUthappaSudhakarUday_03032022.pdf\">Navadhanya system</a> is a traditional cropping method with specific design and crop geometry, which can be analysed for soil moisture as a multidimensional system\u2014both spatially and temporally. Different crops have varying maturity and root depth cycles, making soil moisture critical for establishing and protecting these crops. A fallow period during a critical stage can lead to crop loss, which highlights the importance of consistent moisture.</p>\n<p>Navadhanya bridges traditional crop mixing knowledge with modern scientific sensor methods, as described in the talk. It offers nutritional security through crop variety, though farmers typically sell a reliable monocrop in the market. Their analysis suggests a need to consider soil use regimes both in the short and long term, challenging the practice of forcing farmers to switch crops (e.g., from rice to bajra) based on short-term profitability.</p>\n<p><strong>Q:</strong> How can this tool assist with monsoon management? 
<strong>A:</strong> The tool can map soil moisture and integrate it with traditional knowledge, enabling the development of combined solutions for managing monsoon impacts.</p>\n<h3><a href=\"https://anil.recoil.org/#ranking-and-financing-based-on-climate-smart-agriculture---atanu-garai-socialwell\"></a>Ranking and financing based on climate smart agriculture - Atanu Garai (SocialWell)</h3>\n<p><a href=\"https://docs.google.com/document/d/1MJ-Nw_P3z6gI9rvh4OcjJmdZRE83D_OXedgEeDZDnm8/edit\">Slides</a>\n\n<img alt=\"Atanu Garai presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-10.webp\" title=\"Atanu Garai presenting\">\nAtanu Garai presenting\n\n<img alt=\"The machine learning approaches to climate models\" src=\"https://anil.recoil.org/images/compass24/compass24-11.webp\" title=\"The machine learning approaches to climate models\">\nThe machine learning approaches to climate models</p>\n<p>Atanu switched tack to the business side of things, focusing on encouraging Farmer Producer Organisations (FPOs), of which there are 10000+ in India, to adopt climate-smart practices. 
The incentive-based approach includes:</p>\n<ol>\n<li><strong>Business Plan:</strong> Farmers, FPOs, and market data collaboratively generate a business plan, which is then used by FPOs to secure loans.</li>\n<li><strong>Land Parcels and FPO Rating:</strong> Farm inputs, soil, and weather data are tracked to classify and rate each land parcel.</li>\n<li><strong>Climate Smart Financing:</strong> Execute the plan based on the gathered data.</li>\n</ol>\n<p>The key requirements for obtaining an FPO Land Parcel Rating with their method are:</p>\n<ol>\n<li><strong>Farm Inputs:</strong> Data on seeds, fertilizers, and pesticides provided by the FPO and sourced by the farmer, recorded by the FPO.</li>\n<li><strong>Soil Data:</strong> Rating of soil using a combination of mobile and sensor technologies.</li>\n<li><strong>Climate Data:</strong> Sourced from public datasets, focusing on classifying rainfall and extreme weather events.</li>\n<li><strong>Farm Practices:</strong> Documentation through photos of sowing, irrigation, and data on the methods used.</li>\n</ol>\n<p>For climate data, their approach involves using neural network-based chaos forecasting to provide weather predictions in a format useful to farmers. 
<em>The second half of the presentation went into great detail on their ensemble methods to predict weather patterns, which I didn't note in detail, but see <a href=\"https://anil.recoil.org/ideas/diffusion-model-satellites\">Diffusion models for terrestrial predictions about land use change</a>.</em></p>\n<h2><a href=\"https://anil.recoil.org/#session-3\"></a>Session 3</h2>\n<h3><a href=\"https://anil.recoil.org/#groundwater-monitoring-tool-challenges-to-apply-ecological-health-monitoring-at-scale---himani-sharmachiranjit-guha\"></a>Groundwater monitoring tool, challenges to apply ecological health monitoring at scale - Himani Sharma/Chiranjit Guha</h3>\n<p><a href=\"https://docs.google.com/presentation/d/14zesuTt8R9UGOvaSXsvOPwARO-c4xyg6/edit?usp=sharing&ouid=116413035808485050246&rtpof=true&sd=true\">Slides</a></p>\n<p>\n<img alt=\"Himani Sharma presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-12.webp\" title=\"Himani Sharma presenting\">\nHimani Sharma presenting\nGroundwater monitoring in India faces significant data scarcity, with only 4886 wells having long-term data in the whole country, averaging just 7 wells per district. To address this, 150+ organisations collaborated a few years ago to create an Android app for crowdsourcing groundwater data. Starting with 5000 villages, the project has now expanded to 11000+ villages and is used both pre- and post-monsoon, revealing substantial fluctuations in water levels.</p>\n<p>The app enables users to generate village-level groundwater maps, correlating water level data with geological information to create comprehensive groundwater flow maps, even within individual villages. 
The process involves measuring water depth from three wells per village, using GPS and mobile devices, and rendering the data on an online platform.</p>\n<p>\n<img alt=\"Soil moisture measurements\" src=\"https://anil.recoil.org/images/compass24/compass24-sm-ss.webp\" title=\"Soil moisture measurements\">\nSoil moisture measurements\nCrowdsourcing presents challenges in data quality, requiring post-processing and filtering. Despite this, the analysis has been highly effective, and the Jaldoot scheme now covers 450000+ villages as of 2023, following extensive lobbying of the Indian government, which is now supporting it directly.</p>\n<p>In addition to groundwater monitoring, efforts are also focused on community-based ecological health monitoring, including biodiversity, biomass assessment, and pollinator/insect tracking. Four sample watersheds with detailed socio-ecological-economic indicators and over 150 annual monitoring sites are used to track changes in vegetation and species over time. These assessments both reveal valuable insights (e.g., the increased presence of a rare frog in specific watersheds) and are resource-intensive and challenging to scale. Potential solutions include GIS-based platforms, remote sensing, and tools for tracking changes in standing biomass, carbon stock, and biodiversity.</p>\n<p><em>Note to self:</em> Possible connection with the iRecord team in the UK to explore applicability of biodiversity data collected?</p>\n<p>The project also maps areas highly infested by invasive species, such as the <a href=\"https://india.mongabay.com/2020/08/lantana-invasion-threatens-40-percent-of-indias-tiger-habitat-reports-study/\">Lantana camara</a>, to focus restoration efforts, and is drawing on data from 150+ sites.</p>\n<p>Q: what are the next steps? A: the plan is to withdraw the Android app in the next few years, with the government taking over after creating a similar app. Declaring the project a success! 
Q: But will the data remain open for the communities once the government takes over? A: Dataset collection is widening (e.g. to biodiversity) to cover things not yet considered, such as ecosystem services. Not clear on the future of the government-run data provenance.</p>\n<h3><a href=\"https://anil.recoil.org/#land-suitability-assessment----athithiyan-mr\"></a>Land Suitability Assessment -- Athithiyan MR</h3>\n<p><a href=\"https://docs.google.com/presentation/d/19rXpXNoizFA-Pc8UKXC0G1qbfzSm3iZ-/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"Athithiyan presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-13.webp\" title=\"Athithiyan presenting\">\nAthithiyan presenting\nTheir "LifeLands" system is designed to unlock the productive potential of degraded lands, aiming to mitigate climate impacts through better land use. The digital planning tool they built utilises satellite imagery, public databases, and AI modelling to assess land suitability for regenerative purposes such as solar energy, sustainable water management, or ecological restoration.</p>\n<p>The system integrates geospatial and socioeconomic data layers, along with public datasets, to produce an interactive map and report, determining whether land is unused and suitable for intervention. 
Data collection is facilitated through a mobile app that traces land boundaries using GPS, captures four site photos and a video, and gathers information on land ownership and existing vegetation (shrubs and trees).</p>\n<h3><a href=\"https://anil.recoil.org/#designing-for-context---aila-dutt\"></a>Designing for Context - Aila Dutt</h3>\n<p><a href=\"https://docs.google.com/presentation/d/19lThkR3LfHhQvDibQiHs_vtNeCr4XOFj/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"Aila Dutt presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-14.webp\" title=\"Aila Dutt presenting\">\nAila Dutt presenting\nCitizens and community stewards need to be able to understand, analyse, and apply various concepts and data around climate change to grasp the intricacies of socio-economic change. So how might we simplify complex systems and data to encourage data-driven decision making through these interventions? To be successful, this needs participatory decision making and a reclamation of agency by each of the stakeholders within the system.\n
The goal is to simplify these systems and data, fostering participatory decision-making and empowering stakeholders to reclaim their agency within the system.</p>\n<p>Broad research approach:</p>\n<ol>\n<li><strong>Discover:</strong> Conduct field research, interviews, observations, secondary research, and expert consultations.</li>\n<li><strong>Define:</strong> Engage in systems mapping, curriculum design, and persona mapping using analogous examples.</li>\n<li><strong>Ideate:</strong> Perform field testing, map problems to solutions, and explore sacrificial concepts.</li>\n<li><strong>Prototype:</strong> Conduct usability testing, create sketches and wireframes, and integrate data analytics.</li>\n</ol>\n<p>To enhance understanding, environmental education and curriculum design can incorporate semi-fictional "case studies" that place users in relatable contexts. This approach increases adoption by breaking the system into modules and using gamification to test concepts. For example, users can explore the concept of 'climate change' as it pertains to their own land and prosperity.</p>\n<p>In the analysis phase, it\u2019s crucial to not only graph data but also describe it in ways that participants can relate to their own landscapes. The decision-making process must integrate data-driven insights with existing frameworks. Generative images and brainstorming sessions are used to develop innovative ways to visualise complex data, such as precipitation and climatic variables, in a simple and understandable form.</p>\n<p><strong>Example Activity:</strong> "Set a 15-minute timer and brainstorm all possible ways to present data simply." Consider descriptors like terrain, slopes, plains, rainfall, surface water, MNREGA projects, and agriculture to see how users can better utilise this information.</p>\n<p><strong>Q:</strong> Is 'making data actionable' a priority, and how do we address the tragedy of the commons? 
<strong>A:</strong> Yes, systems thinking and collaboration are essential to prevent resource depletion and ensure shared benefits.\n<strong>Q:</strong> Can this approach scale from smaller to larger communities? <strong>A:</strong> Yes, by developing microwatershed data and village-level datasets, even large communities can work at much smaller, more precise resolutions.</p>\n<p>\n<img alt=\"The attendees of the RIC\" src=\"https://anil.recoil.org/images/compass24/compass24-group1.webp\" title=\"The attendees of the RIC\">\nThe attendees of the RIC</p>\n<h2><a href=\"https://anil.recoil.org/#group-sessions\"></a>Group Sessions</h2>\n<p>After this, we split into groups to discuss, roughly, the following topics:</p>\n<ul>\n<li>What do we need to do to take this to scale? e.g. remote sensing: works at some scale, but validation also needs to scale.</li>\n<li>Then we saw new usecases, e.g. soil moisture. Now we need to think these through and come up with a succinct problem statement.</li>\n<li>Start working through some datasets and algorithms as examples and turn them into a spec. What are the specification process and the ultimate metadata standards?</li>\n<li>One group will then work on methods to facilitate community engagement with data.</li>\n<li>And what are the principles and processes for effective collaboration and co-creation? What are the barriers?</li>\n</ul>\n<p>I'll follow up with more analysis about the outcomes soon, as I'm in touch with Aadi and hopefully we will be working on a project together in the future. 
But for now, I'll conclude this trip report with great appreciation for Aadi and the hard-working volunteers at COMPASS 2024 who made attendance such a pleasure!</p>\n<p><img alt=\"\" src=\"https://anil.recoil.org/images/compass24/compass24-18.webp\" title=\"Glorious Delhi sunset to finish the conference\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/compass24/compass24-21.webp\" title=\"Spotted some electric charging stations!\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/compass24/compass24-22.webp\" title=\"Made it back to London in time to catch some tennis\"></p>",+"content": "<p>This is a trip report of <a href=\"https://compass.acm.org\">ACM COMPASS 2024</a> held in New Delhi, which had a novel track of <a href=\"https://compass.acm.org/research-impact-collaboratives/\">"Research to Impact Collaboratives"</a> that drew me in. The general chair, <a href=\"https://www.cse.iitd.ac.in/~aseth/\">Aadi Seth</a>, wrote a fantastic book on "<a href=\"https://www.cse.iitd.ac.in/~aseth/act.html\">Technology and Disempowerment</a>" a few years ago, and he organised one RIC session on the CoRE Stack -- a climate adaptation stack for rural communities. This was a must-visit for me as it is closely related to the work we've been doing on <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a> and <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a>. The following notes are somewhat raw as they have only been lightly edited, but please refer to the more polished documents on the <a href=\"https://docs.google.com/document/d/1MJ-Nw_P3z6gI9rvh4OcjJmdZRE83D_OXedgEeDZDnm8/edit\">agenda for ACM COMPASS RIC</a> and the overall <a href=\"https://core-stack.org\">CoRE Stack</a> initiative on commoning technologies for resilience and equality.</p>\n<p>The conference itself was held at <a href=\"http://iiitd.ac.in/\">IIIT-D</a> in New Delhi, right at the cusp of the monsoon season and after record-breaking temperatures. 
Luckily, as always, the hospitality and welcoming nature of New Delhi overrode all the climate discomfort!</p>\n<p>\n<img alt=\"Arriving at the IIIT-D campus\" src=\"https://anil.recoil.org/images/compass24/compass24-17.webp\" title=\"Arriving at the IIIT-D campus\">\nArriving at the IIIT-D campus</p>\n<p>The main focus of this report is the one-day RIC held on the 8th July 2024. The RIC had around <a href=\"https://docs.google.com/spreadsheets/d/1IF7bOT-868ky138ysKXZE-BBN0z6KjI7D7ZjfKufFQQ/edit?gid=0#gid=0\">60 attendees</a> in person and 40 online, and was a mix of presentations and discussions on the CoRE stack and how it could be used to address climate adaptation in rural communities. The day was divided into two sessions, with the first being a series of scene-setting presentations by practitioners and researchers, and the second being a series of breakout discussions on how the CoRE stack could be used in different contexts.</p>\n<h2><a href=\"https://anil.recoil.org/#intro-the-ric-core-stack-aadi-seth\"></a>Intro: The RIC Core stack (Aadi Seth)</h2>\n<p>Data-driven methods enable new approaches to socio-ecological system health, but need to be grounded in community-based practice, and the scope is too vast for any one group to handle. The CoRE stack (Commoning for Resilience and Equality) is being architected as a digital public infrastructure consisting of datasets, pre-computed analytics, and tools that can be used by rural communities and other stakeholders to improve the sustainability and resilience of their local landscapes. It will enable innovators to build upon and contribute their own datasets, use APIs for third-party apps, and track and monitor socio-ecological sustainability through a systems approach. 
The CoRE stack broadly consists of four layers.</p>\n<p>\n<img alt=\"Getting a signed copy of Aadi&apos;s book!\" src=\"https://anil.recoil.org/images/compass24/compass24-19.webp\" title=\"Getting a signed copy of Aadi&apos;s book!\">\nGetting a signed copy of Aadi's book!</p>\n<p>The broad approach is bottom-up usecase discovery, picking a digital public infrastructure approach to work with civic services, and doing distributed problem solving across stakeholders in academia, government and business.\nAadi noted the need to balance standards and design against end-to-end servicing, and the overheads of collaboration across so many people; see the notes on <a href=\"https://docs.google.com/document/d/1akzDkbCxbXQe49uaArNLw-2z_AYtF5jjZxR2UGJ66o0/edit\">RIC collaboration across people</a>.</p>\n<p>Aadi then described the CoRE stack as a logical layered architecture:</p>\n<ul>\n<li>Layer 1 is the inclusion of new datasets: what are the standards and processes\nbehind this? There are a lot of geospatial data products around, including\ncommunity data that has been gathered in an ad-hoc way.</li>\n<li>Layer 2 is the generation of indicators, APIs and reports which give us\nlandscape level socio-ecological indicators. Includes alert services,\ncomputation infrastructure and support.</li>\n<li>Layer 3 is the tools and platforms for implementation partners and\ncommunities. There are community-based planning tools and\nparticipatory processes. Once we "know our landscape" we can apply fund\nallocation guidelines. An example of such a tool is Jaltol, for landscape and\nsite-level analysis. 
And ultimately we want to support new innovations such as\ndMRV for credits or socioecological indices.</li>\n<li>Layer 4 is about integrating into government and market programmes, such as\nwater security, forestry and biodiversity credits, natural farming, flood\nhazard adaptation and so on.</li>\n</ul>\n<p>To enable this, Aadi motivated the need to work together with networked co-creation and a\ndigital commons and build on top of it with open licenses. We need to overcome\nfault lines not only in terms of new climate change problems but also\nsocio-ecological barriers. And ultimately we need to go to scale and work with\ngovernment departments to make urgent improvements.</p>\n<p>An example of this is water security, via <a href=\"https://welllabs.org/jaltol/\">WellLabs Jaltol</a> which allows for\nlandscape characterisation for action pathways and site validation decision\nsupport tools, but also builds community-based capacity for social accountability.\nE.g. given a drainage network, if you were to construct a new water body at this\npoint, what would the effect be on downstream water bodies and the communities that depend on it?</p>\n<p>\n<img alt=\"The general chair, Aadi Seth, opening the conference\" src=\"https://anil.recoil.org/images/compass24/compass24-2.webp\" title=\"The general chair, Aadi Seth, opening the conference\">\nThe general chair, Aadi Seth, opening the conference</p>\n<p>Aadi stated the goals for this RIC:</p>\n<ul>\n<li>Find new usecases, what challenges exist, and what principles we adopt for collaboration.</li>\n<li>Look at problems through different lenses: issues of equity, data validity, unnecessary digital labour, alignment with knowledge commons, scaling challenges, productisation challenges.</li>\n<li>Consider the data and algorithm standards necessary to enable networked co-creation but not hinder progress.</li>\n<li>Think in productised terms for end-to-end usecases to solve real problems in rural communities.</li>\n</ul>\n<h2><a 
href=\"https://anil.recoil.org/#discussion-session-1\"></a>Discussion Session 1</h2>\n<h3><a href=\"https://anil.recoil.org/#sustainability-action-at-scale-abhijeet-parmar-isb\"></a>Sustainability Action at Scale. Abhijeet Parmar (ISB)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1wZhXjRCStvkFIHh9Lo4UwIGFSezRdUKX/edit#slide=id.p1\">Slides</a>\n\n<img alt=\"Abhijeet Parmar presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-3.webp\" title=\"Abhijeet Parmar presenting\">\nAbhijeet Parmar presenting</p>\n<p>The speaker highlighted the importance of scalability in approaches, particularly in the context of technological applications. Applications must remain simple, grounded in community needs, and usable by the general public. A key problem discussed was the extraction of Above-Ground Biomass (AGB) using smartphone cameras while traversing forested areas. Traditional Lidar-based systems, though effective in providing detailed depth information, are deemed impractical due to the specialised equipment required.</p>\n<p>The proposed solution involves creating a Self-Supervised Learning (SSL) model that utilises mobile phones to conduct real-time segmentation of individual trees as one walks through a forest. This approach leverages a pre-trained segmentation model alongside advanced modelling and tagging processes.</p>\n<p>The development involves three distinct pipelines, which could be integrated into a single application in the future. Consideration must be given to the UI design to ensure accessibility and effectiveness by rural populations. Advancements in data collection, benchmarking, and pipeline development suggest that such technology could support large-scale forest management initiatives, particularly in public policy contexts. 
The initial testing phase of this model is being conducted under controlled conditions, including specific lighting and seasonal factors, with plans to extend its applicability.</p>\n<p>During the discussion, a question was raised regarding the allocation of funds for tree planting initiatives and identifying a starting point. In answer, it was suggested that bamboo, a valuable resource for biofuel production, could be a focal point. The Indian landscape has sufficient bamboo to meet current biofuel demand, and directing Corporate Social Responsibility (CSR) funds towards this effort could significantly expedite progress.</p>\n<p><em>During a break later I showed <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>'s GreenLens mobile app for estimating DBH from a mobile phone image (see <a href=\"https://drive.google.com/drive/folders/17-Yu3KXcgJiFapGc2AjJ2dHNC30YUbup?usp=sharing\">app download</a>).</em></p>\n<h3><a href=\"https://anil.recoil.org/#plantation-monitoring-for-drone-images-snehasis-mukherjee-snu\"></a>Plantation monitoring for drone images, Snehasis Mukherjee (SNU)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1yyqx1Z8aVwtHnbkycGiSV_L7WllaI8JI/edit#slide=id.p3\">Slides</a></p>\n<p>\n<img alt=\"Snehasis Mukherjee presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-4.webp\" title=\"Snehasis Mukherjee presenting\">\nSnehasis Mukherjee presenting\nThe presentation by Snehasis Mukherjee focused on plantation monitoring using drone imagery, addressing the limitations of satellite images, especially their inaccessibility to farmers. The workflow involves using drones at lower altitudes to capture detailed field imagery. The process begins with printing satellite images of a village onto paper, collaboratively marking land use with the locals, and proposing interventions. 
These are then imported into QGIS by a technical team, followed by field trips to gather further data using GeoODK, which is also integrated into QGIS. This iterative process is intended to inform local policy decisions at the Gram Sabha level.</p>\n<p>For drone imagery, the low-cost DJI Mini 2 with an RGB camera was chosen. Heights of 50-100m proved effective for capturing plantation images with sufficient resolution. The use cases include crop area estimation, classification, and monitoring plantation health. The first field trip occurred in Aug 2023 in Vittalpur village near Hyderabad, resulting in 253 usable images at ~50m (mainly of plantations).</p>\n<p>Image annotation was labour-intensive, with 100 images annotated by the team and 150 outsourced for \u20b91000, resulting in approximately 9000 annotations. The Xception and ResNet50 models showed promising results with reduced overfitting, and 2000 acres have been mapped now with multiple tree varieties. The challenge remains how to supplement limited drone imagery with lower-resolution satellite images, since flying drones is expensive.</p>\n<h3><a href=\"https://anil.recoil.org/#forestry-agroforestry-and-restoration-toolkit-using-technology-and-community-participation---ashish-kumar\"></a>Forestry Agroforestry and Restoration Toolkit using Technology and Community Participation - Ashish Kumar</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1hJ0NwdiRq5hAvxSDsopznuZD-B8Ik-OX/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"Ashish Kumar presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-5.webp\" title=\"Ashish Kumar presenting\">\nAshish Kumar presenting\nAshish is building a community participation model to scale agroforestry, aiming to create a feedback/knowledge loop with locals. The goal is to promote tree planting outside traditional forestry areas and restore degraded common lands. 
The approach involves identifying degraded areas and building a toolkit to recommend suitable tree species.</p>\n<p>The project includes several modules: Species Distribution Modelling (SDM), water management, carbon sequestration, and economic analysis. Water management is particularly critical and is informed by <a href=\"https://www.sciencedirect.com/science/article/pii/S2214581820302068\">research from the Kolar district</a>, which has experienced declining groundwater levels since the 1990s, exacerbated by increasing demand. Remote sensing data shows significant variation in water usage depending on plant type and location (e.g., mango vs eucalyptus).</p>\n<p>Their work utilised the <a href=\"https://earlywarning.usgs.gov/docs/SSEBopETreadme.pdf\">SSEBOP evapotranspiration</a> product, accessed via Google Earth Engine (GEE), to analyse water use and its implications for agroforestry efforts.</p>\n<h3><a href=\"https://anil.recoil.org/#riverbed-sand-mining-activity-detection-based-on-satellite-imagery---siddharth-agarwal\"></a>Riverbed sand mining activity detection based on satellite imagery - Siddharth Agarwal</h3>\n<p><a href=\"https://drive.google.com/file/d/1iXaGuY0Ihb1luCn3aifkYvIhX3aI4pzT/view\">Slides</a></p>\n<p>\n<img alt=\"Siddharth Agarwal presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-6.webp\" title=\"Siddharth Agarwal presenting\">\nSiddharth Agarwal presenting</p>\n<p>This talk focussed on detecting riverbed sand mining activities using satellite imagery, particularly in areas where on-site visits are impractical. It turns out that sand is the second most extracted material globally after water, and its mining is a significant environmental concern especially for river communities. 
The project aims to develop a machine learning model to detect such mining activities using S1/S2 (didn't catch which, or both) satellite data.</p>\n<p>India Sand Watch, an open data platform developed with <a href=\"https://www.ooloilabs.in\">Ooloi Labs</a>, aims to collect, annotate and archive data related to sand mining in India. This emerged due to the high costs associated with using and processing detailed satellite imagery and the need to understand sand mining comprehensively. The project covers the entire sand mining process, from discovery and land auctions to clearances and mining, and includes a 'sites of violence' framework that identifies intervention points.</p>\n<p>A significant challenge identified was the readability of documents associated with planning, which can be difficult even for humans let alone LLMs, making digitisation and structuring of data crucial. The transition from discovery to the actual mining site often involves navigating poorly accessible documents, highlighting the need for better evidence pipelines. <em>Note to self: just like our <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project!</em></p>\n<p>They are collaborating with Berkeley to develop a machine learning model that predicts mining activity using low-resolution imagery (thus saving costs), covering vast areas (10000+ km2) with Sentinel-1/2 as base maps. Their goal is to combine this data to create large-scale evidence that can then be used to drive large-scale action. This approach has been validated in court, where the data was accepted as evidence by the <a href=\"https://greentribunal.gov.in\">National Green Tribunal</a> (NGT).</p>\n<p>Q: is the community getting involved? 
A: The initiative began with community action, reflecting concerns over sand mining's impact on ecosystems.</p>\n<h2><a href=\"https://anil.recoil.org/#session-2\"></a>Session 2</h2>\n<h3><a href=\"https://anil.recoil.org/#proposal-for-a-new-interactive-electronic-publication-standard---r-nandakumar\"></a>Proposal for a new interactive electronic publication standard - R. Nandakumar</h3>\n<p><a href=\"https://docs.google.com/presentation/d/142YSXa8IUUmKSKUhH1TvIN-PaD1Cuirv/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"R. Nandakumar presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-7.webp\" title=\"R. Nandakumar presenting\">\nR. Nandakumar presenting\nR. Nandakumar (recently retired from ISRO but still working in this space) proposed a new interactive electronic publication standard aimed at improving the quality of information products and communicating research results more interactively. He seeks to integrate code with data, ensuring version control while addressing security and privacy concerns. The current business model, which relies on distracting advertisements, exacerbates the digital divide, especially for rural communities, and hampers effective communication.</p>\n<p>He highlighted several issues with existing formats: inadequate representation of images, maps, infographics, and spreadsheets, and the absence of interactive features like running commentaries during visualisation animations. There is also a lack of fine printing and zoom capabilities, and of flexible authorisation mechanisms.</p>\n<p>His proposal suggests evolving existing standards (like PDFs) into more interactive and self-contained formats that include code. The first phase would extend 2D image maps to support animations and metadata while embedding free and open-source software within the PDF. 
The second phase could expand this to include 3D models.</p>\n<p>The end goal is to standardise interactions across various formats (image maps, spreadsheets, infographics, animations, and audiovisual content) using the ISO/IEC 25010 SQuaRE standard, which provides a comprehensive framework for functionality, performance, compatibility, usability, reliability, security, maintainability, and portability. (see slides for more details on each of these)</p>\n<p><em>My mad idea:</em> might we build a WASM interpreter in JavaScript so that it can run inside the existing PDF JS interpreter and work with existing docs? WASI for PDF! I've got a project idea relevant to this that can perhaps be extended or forked; see <a href=\"https://anil.recoil.org/ideas/life-explorer-wasm\">Using wasm to locally explore geospatial layers</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#geospatial-data-standards-to-enable-co-creation-of-data-products-craig-dsouza\"></a>Geospatial data standards to enable co-creation of data products (Craig Dsouza)</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1n1CN66Yh9wKKcquMHInbQPSRCkPY9vmhae-_ogJmIcg/edit#slide=id.g2eaa42613c0_0_73\">Slides</a></p>\n<p>\n<img alt=\"Craig Dsouza presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-8.webp\" title=\"Craig Dsouza presenting\">\nCraig Dsouza presenting</p>\n<p>There is an overload of data and algorithms in all directions, so we want to accelerate development of <em>better</em> data and algorithms rather than sheer quantity. 
How do we increase trust and reduce friction in the source data and eventual results with rural communities?\nDomain-specific standards either don't exist or aren't widely adopted (see previous talk), especially for natural resource management, where data comes in different modalities and resolutions; some commonality exists, but sector-specific extensions to current standards are required to deal with local variability.</p>\n<p>So they are surveying data standards and algorithm standards. To consider data standards first, the most successful is Open Street Map. For algorithm standards, there are rapidly adopted services like HuggingFace. But what is the <em>combination</em> of both so that they can be coupled to real outcomes?</p>\n<p>How do we compare the performance of data standards and build guiding principles for which ones to pick?</p>\n<ul>\n<li><em>to reduce friction:</em>\n<ul>\n<li>consider the time taken for dataset and model integration with existing open source tools</li>\n<li>or the time taken for the end user to create a new dummy datapoint.</li>\n<li>the time taken for the end user to run the model and make the first minor fix.</li>\n</ul>\n</li>\n<li><em>to accelerate development:</em>\n<ul>\n<li>number of collaborators over time</li>\n<li>number of additions by 3rd parties over time</li>\n<li>increase in model performance over time</li>\n</ul>\n</li>\n</ul>\n<p>An existing example is how to share a LULC dataset using existing open geospatial standards (<a href=\"https://stacspec.org/en\">STAC</a>). The data standard creates a simple JSON file which has metadata for that module. The data user can then access the latest version of the data via either an API or the STAC browser.</p>\n<p><em>TODO for myself:</em> Look at mapping these metrics onto our TMF pipeline (in <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a>) and investigate a possible user study with some CCI data. 
Also, is STAC relevant to the TMF/LIFE/FOOD publishing pipeline in <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>? We need to publish the various layers there soon.</p>\n<h3><a href=\"https://anil.recoil.org/#geospatial-data-flow-management---anil-madhavapeddy\"></a>Geospatial data flow management - Anil Madhavapeddy</h3>\n<p>My talk, I was speaking, so no notes! I'll upload the slides later and edit this section.</p>\n<p>Good question from the audience about healthcare management and its relevance to planetary computing -- it seems to share a lot of the problems involving data sensitivity and the need for spatially explicit data sharing.</p>\n<h3><a href=\"https://anil.recoil.org/#opportunities-in-agricultural-sensing---anupam-sobti\"></a>Opportunities in agricultural sensing - Anupam Sobti</h3>\n<p><a href=\"https://docs.google.com/presentation/d/11XAuKb78TpIpMkZGYWn58I3iQnlBvRmQ/edit#slide=id.p1\">Slides</a></p>\n<p>Anupam introduced the main questions across the rural farming cycle including:</p>\n<ul>\n<li><em>Sowing:</em> "Is this the right crop?" "Will I have enough resources (water, heat, seeds)?" "Are these the right seeds?"</li>\n<li><em>Harvesting:</em> "Is this the right time to harvest?" "How do I plan post-harvest logistics?" "How do I manage residue?"</li>\n<li><em>Selling:</em> "Is this the right time to sell?" "Who do I trust to sell to?" "Do I sell now or wait?"</li>\n</ul>\n<p>So onto the notion of "Agricultural Computing", which:</p>\n<ul>\n<li>involves multiple decision layers: farmer-centric, government-centric, and finance-centric.</li>\n<li>features recent innovations such as advancements in remote sensing and game theory applications to navigate complex agricultural decisions.</li>\n</ul>\n<p>Urban heat islands are a significant problem detectable with geospatial data. He referenced the paper by\nMohajerani, Abbas, Jason Bakaric, and Tristan Jeffrey-Bailey. 
"The urban heat island effect, its causes, and mitigation, with reference to the thermal properties of asphalt concrete." <em>Journal of Environmental Management</em> 197 (2017): 522-538.</p>\n<p><em>Note to self: Send to <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a> re <a href=\"https://anil.recoil.org/papers/2024-green-urban-eq\">Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications</a>.</em></p>\n<p><em>Q:</em> For marginalised communities, should there be standards for interactions to obtain feedback iteratively, reducing the shock of policy changes? <strong>A:</strong> There is a need for significant groundwork engineering right now to provide immediate feedback, helping communities adapt more smoothly to changes.</p>\n<h3><a href=\"https://anil.recoil.org/#understanding-soil-moisture-regime-for-crop-diversification---prachi-d-patil\"></a>Understanding Soil Moisture Regime for Crop Diversification - Prachi D. Patil</h3>\n<p><a href=\"https://docs.google.com/presentation/d/1ZZMqF-8hCIupNm5VUH8wu61v9eTuI1e-/edit#slide=id.p1\">Slides</a>\n\n<img alt=\"Prachi D. Patil presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-9.webp\" title=\"Prachi D. Patil presenting\">\nPrachi D. Patil presenting</p>\n<p>Prachi gave a perspective from the farmer's fields, with a study aiming to group relatively homogeneous regions based on soil, climate, and physiography, focusing on moisture availability periods for soil and the length of the growing season. Their approach uses simple moisture sensors at various depths to measure soil resistivity, providing farmers with real-time information on whether to irrigate. 
This system can map dry spells and their duration, offering actionable insights for crop management.</p>\n<p>The <a href=\"https://www.wassan.org/wp-content/uploads/2022/03/WASSANPublication_BhagyalakshmiUthappaSudhakarUday_03032022.pdf\">Navadhanya system</a> is a traditional cropping method with specific design and crop geometry, which can be analysed for soil moisture as a multidimensional system\u2014both spatially and temporally. Different crops have varying maturity and root depth cycles, making soil moisture critical for establishing and protecting these crops. A fallow period during a critical stage can lead to crop loss, which highlights the importance of consistent moisture.</p>\n<p>Navadhanya bridges traditional crop mixing knowledge with modern scientific sensor methods as described in the talk. Navadhanya offers nutritional security through crop variety, though farmers typically sell a reliable monocrop in the market. Their analysis suggests a need to consider soil use regimes both in the short and long term, challenging the practice of forcing farmers to switch crops (e.g., from rice to bajra) based on short-term profitability.</p>\n<p><strong>Q:</strong> How can this tool assist with monsoon management? 
<strong>A:</strong> The tool can map soil moisture and integrate it with traditional knowledge, enabling the development of combined solutions for managing monsoon impacts.</p>\n<h3><a href=\"https://anil.recoil.org/#ranking-and-financing-based-on-climate-smart-agriculture---atanu-garai-socialwell\"></a>Ranking and financing based on climate smart agriculture - Atanu Garai (SocialWell)</h3>\n<p><a href=\"https://docs.google.com/document/d/1MJ-Nw_P3z6gI9rvh4OcjJmdZRE83D_OXedgEeDZDnm8/edit\">Slides</a>\n\n<img alt=\"Atanu Garai presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-10.webp\" title=\"Atanu Garai presenting\">\nAtanu Garai presenting\n\n<img alt=\"The machine learning approaches to climate models\" src=\"https://anil.recoil.org/images/compass24/compass24-11.webp\" title=\"The machine learning approaches to climate models\">\nThe machine learning approaches to climate models</p>\n<p>Atanu switched tack to the business side of things, focusing on getting Farmer Producer Organisations (FPOs), of which there are 10000+ in India, to adopt climate-smart practices. 
The incentive-based approach includes:</p>\n<ol>\n<li><strong>Business Plan:</strong> Farmers, FPOs, and market data collaboratively generate a business plan, which is then used by FPOs to secure loans.</li>\n<li><strong>Land Parcels and FPO Rating:</strong> Farm inputs, soil, and weather data are tracked to classify and rate each land parcel.</li>\n<li><strong>Climate Smart Financing:</strong> Execute the plan based on the gathered data.</li>\n</ol>\n<p>The key requirements for obtaining an FPO Land Parcel Rating with their method are:</p>\n<ol>\n<li><strong>Farm Inputs:</strong> Data on seeds, fertilizers, and pesticides provided by the FPO and sourced by the farmer, recorded by the FPO.</li>\n<li><strong>Soil Data:</strong> Rating of soil using a combination of mobile and sensor technologies.</li>\n<li><strong>Climate Data:</strong> Sourced from public datasets, focusing on classifying rainfall and extreme weather events.</li>\n<li><strong>Farm Practices:</strong> Documentation through photos of sowing, irrigation, and data on the methods used.</li>\n</ol>\n<p>For climate data, their approach involves using neural network-based chaos forecasting to provide weather predictions in a format useful to farmers. 
<em>The second half of the presentation went into great detail on their ensemble methods to predict weather patterns, which I didn't note in detail, but see <a href=\"https://anil.recoil.org/ideas/diffusion-model-satellites\">Diffusion models for terrestrial predictions about land use change</a>.</em></p>\n<h2><a href=\"https://anil.recoil.org/#session-3\"></a>Session 3</h2>\n<h3><a href=\"https://anil.recoil.org/#groundwater-monitoring-tool-challenges-to-apply-ecological-health-monitoring-at-scale---himani-sharmachiranjit-guha\"></a>Groundwater monitoring tool, challenges to apply ecological health monitoring at scale - Himani Sharma/Chiranjit Guha</h3>\n<p><a href=\"https://docs.google.com/presentation/d/14zesuTt8R9UGOvaSXsvOPwARO-c4xyg6/edit?usp=sharing&ouid=116413035808485050246&rtpof=true&sd=true\">Slides</a></p>\n<p>\n<img alt=\"Himani Sharma presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-12.webp\" title=\"Himani Sharma presenting\">\nHimani Sharma presenting\nGroundwater monitoring in India faces significant data scarcity, with only 4886 wells having long-term data in the whole country, averaging just 7 wells per district. To address this, 150+ organisations collaborated a few years ago to create an Android app for crowdsourcing groundwater data. Starting with 5000 villages, the project has now expanded to 11000+ villages; the app is used both pre- and post-monsoon and is revealing substantial fluctuations in water levels.</p>\n<p>The app enables users to generate village-level groundwater maps, correlating water level data with geological information to create comprehensive groundwater flow maps, even within individual villages. 
The process involves measuring water depth from three wells per village, using GPS and mobile devices, and rendering the data on an online platform.</p>\n<p>\n<img alt=\"Soil moisture measurements\" src=\"https://anil.recoil.org/images/compass24/compass24-sm-ss.webp\" title=\"Soil moisture measurements\">\nSoil moisture measurements\nThe crowdsourcing presents challenges in data quality, requiring post-processing and filtering. Despite this, the analysis has been highly effective, and the Jaldoot scheme now covers 450000+ villages as of 2023, following extensive lobbying of the Indian government, who are now supporting it directly.</p>\n<p>In addition to groundwater monitoring, efforts are also focused on community-based ecological health monitoring, including biodiversity, biomass assessment, and pollinator/insect tracking. Four sample watersheds with detailed socio-ecological-economic indicators and over 150 annual monitoring sites are used to track changes in vegetation and species over time. These assessments reveal valuable insights (e.g., the increased presence of a rare frog in specific watersheds) but are resource-intensive and challenging to scale. Potential solutions include GIS-based platforms, remote sensing, and tools for tracking changes in standing biomass, carbon stock, and biodiversity.</p>\n<p><em>Note to self:</em> Possible connection with the iRecord team in the UK to explore applicability of biodiversity data collected?</p>\n<p>The project also maps areas highly infested by invasive species, such as <a href=\"https://india.mongabay.com/2020/08/lantana-invasion-threatens-40-percent-of-indias-tiger-habitat-reports-study/\">Lantana camara</a>, to focus restoration efforts and is drawing on data from 150+ sites.</p>\n<p>Q: what are the next steps? A: the plan is to withdraw the Android app in the next few years, with the government taking over after creating a similar app. Declaring the project a success! 
Q: But will the data remain open for the communities once the government takes over? A: The dataset collection is widening (e.g. to biodiversity) to cover things not yet considered, such as ecosystem services. The future provenance of the government-run data is not yet clear.</p>\n<h3><a href=\"https://anil.recoil.org/#land-suitability-assessment----athithiyan-mr\"></a>Land Suitability Assessment -- Athithiyan MR</h3>\n<p><a href=\"https://docs.google.com/presentation/d/19rXpXNoizFA-Pc8UKXC0G1qbfzSm3iZ-/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"Athithiyan presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-13.webp\" title=\"Athithiyan presenting\">\nAthithiyan presenting\nTheir "LifeLands" system is designed to unlock the productive potential of degraded lands, aiming to mitigate climate impacts through better land use. The digital planning tool they built utilises satellite imagery, public databases, and AI modelling to assess land suitability for regenerative purposes such as solar energy, sustainable water management, or ecological restoration.</p>\n<p>The system integrates geospatial and socioeconomic data layers, along with public datasets, to produce an interactive map and report, determining whether land is unused and suitable for intervention. 
Data collection is facilitated through a mobile app that traces land boundaries using GPS, captures four site photos and a video, and gathers information on land ownership and existing vegetation (shrubs and trees).</p>\n<h3><a href=\"https://anil.recoil.org/#designing-for-context---aila-dutt\"></a>Designing for Context - Aila Dutt</h3>\n<p><a href=\"https://docs.google.com/presentation/d/19lThkR3LfHhQvDibQiHs_vtNeCr4XOFj/edit#slide=id.p1\">Slides</a></p>\n<p>\n<img alt=\"Aila Dutt presenting\" src=\"https://anil.recoil.org/images/compass24/compass24-14.webp\" title=\"Aila Dutt presenting\">\nAila Dutt presenting\nCitizens and community stewards need to be able to understand, analyse and apply various concepts and data around climate change in order to grasp the intricacies of socio-economic changes. So how might we simplify complex systems and data to encourage data-driven decision making through these interventions? To be successful, this needs participatory decision-making and a reclamation of agency by each of the stakeholders within the system. 
</p>\n<p>Broad research approach:</p>\n<ol>\n<li><strong>Discover:</strong> Conduct field research, interviews, observations, secondary research, and expert consultations.</li>\n<li><strong>Define:</strong> Engage in systems mapping, curriculum design, and persona mapping using analogous examples.</li>\n<li><strong>Ideate:</strong> Perform field testing, map problems to solutions, and explore sacrificial concepts.</li>\n<li><strong>Prototype:</strong> Conduct usability testing, create sketches and wireframes, and integrate data analytics.</li>\n</ol>\n<p>To enhance understanding, environmental education and curriculum design can incorporate semi-fictional "case studies" that place users in relatable contexts. This approach increases adoption by breaking the system into modules and using gamification to test concepts. For example, users can explore the concept of 'climate change' as it pertains to their own land and prosperity.</p>\n<p>In the analysis phase, it\u2019s crucial to not only graph data but also describe it in ways that participants can relate to their own landscapes. The decision-making process must integrate data-driven insights with existing frameworks. Generative images and brainstorming sessions are used to develop innovative ways to visualise complex data, such as precipitation and climatic variables, in a simple and understandable form.</p>\n<p><strong>Example Activity:</strong> "Set a 15-minute timer and brainstorm all possible ways to present data simply." Consider descriptors like terrain, slopes, plains, rainfall, surface water, MNREGA projects, and agriculture to see how users can better utilise this information.</p>\n<p><strong>Q:</strong> Is 'making data actionable' a priority, and how do we address the tragedy of the commons? 
<strong>A:</strong> Yes, systems thinking and collaboration are essential to prevent resource depletion and ensure shared benefits.\n<strong>Q:</strong> Can this approach scale from smaller to larger communities? <strong>A:</strong> Yes, by developing microwatershed data and village-level datasets, even large communities can work at much smaller, more precise resolutions.</p>\n<p>\n<img alt=\"The attendees of the RIC\" src=\"https://anil.recoil.org/images/compass24/compass24-group1.webp\" title=\"The attendees of the RIC\">\nThe attendees of the RIC</p>\n<h2><a href=\"https://anil.recoil.org/#group-sessions\"></a>Group Sessions</h2>\n<p>After this, we split into groups to discuss roughly the following topics:</p>\n<ul>\n<li>What do we need to do to take this to scale? E.g. remote sensing works at some scale, but validation also needs to scale.</li>\n<li>Then we saw new usecases, e.g. soil moisture. Now we need to think these through and come up with succinct problem statements.</li>\n<li>Start talking through some datasets and algorithms as examples and turn them into a spec. What are the specification process and ultimate metadata standards?</li>\n<li>One group will then work on methods to facilitate community engagement with data.</li>\n<li>And then: what are the principles and processes for effective collaboration and co-creation? What are the barriers?</li>\n</ul>\n<p>I'll follow up with more analysis about the outcomes soon, as I'm in touch with Aadi and hopefully we will be working on a project together in the future. 
But for now, I'll conclude this trip report with great appreciation for Aadi and the hard-working volunteers at COMPASS 2024 who made attendance such a pleasure!</p>\n<p><img alt=\"\" src=\"https://anil.recoil.org/images/compass24/compass24-18.webp\" title=\"Glorious Delhi sunset to finish the conference\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/compass24/compass24-21.webp\" title=\"Spotted some electric charging stations!\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/compass24/compass24-22.webp\" title=\"Made it back to London in time to catch some tennis\"></p>",
+18
avsm/notes_credible-credit-principles.json
···+"summary": "<p>My colleagues <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> and <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> led the publication of a comprehensive\n<a href=\"https://www.cambridge.org/engage/coe/article-details/679385946dde43c9082f7009\">report</a> of the steps the voluntary carbon market needs to take\nto restore its scientific credibility, with input from many of us in <a href=\"https://anil.recoil.org/projects/4c\">4C</a> and beyond.</p>\n<blockquote>\n<ul>\n<li>establishing common standards for carbon quantification and accounting, to cover additionality, leakage and permanence.</li>\n<li>avoiding perverse incentives and aligning the motivations of all stakeholders with high-integrity outcomes. [...]</li>\n<li>issuing all carbon credits based on trusted primary observations.</li>\n<li>making all the data needed to reproduce carbon calculations available in standard file formats.</li>\n<li>[...] reporting social and biodiversity dimensions of projects separately from carbon calculations.</li>\n<li>integrating DMRV methods into carbon and biodiversity accounting standards to reduce the financial and administrative burdens on nature-based projects and the local communities participating in or affected by them.</li>\n</ul>\n</blockquote>\n<p>This paper represents three years of hard work from the team on trying to blend remote sensing with carbon quantification. 
For more reading on the topic, you may also wish to browse the full <a href=\"https://4c.cst.cam.ac.uk/publications\">4C publication list</a> for the firehose of activity from the centre.</p>",+"content": "<p>My colleagues <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> and <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> led the publication of a comprehensive\n<a href=\"https://www.cambridge.org/engage/coe/article-details/679385946dde43c9082f7009\">report</a> of the steps the voluntary carbon market needs to take\nto restore its scientific credibility, with input from many of us in <a href=\"https://anil.recoil.org/projects/4c\">4C</a> and beyond.</p>\n<blockquote>\n<ul>\n<li>establishing common standards for carbon quantification and accounting, to cover additionality, leakage and permanence.</li>\n<li>avoiding perverse incentives and aligning the motivations of all stakeholders with high-integrity outcomes. [...]</li>\n<li>issuing all carbon credits based on trusted primary observations.</li>\n<li>making all the data needed to reproduce carbon calculations available in standard file formats.</li>\n<li>[...] reporting social and biodiversity dimensions of projects separately from carbon calculations.</li>\n<li>integrating DMRV methods into carbon and biodiversity accounting standards to reduce the financial and administrative burdens on nature-based projects and the local communities participating in or affected by them.</li>\n</ul>\n</blockquote>\n<p>This paper represents three years of hard work from the team on trying to blend remote sensing with carbon quantification. For more reading on the topic, you may also wish to browse the full <a href=\"https://4c.cst.cam.ac.uk/publications\">4C publication list</a> for the firehose of activity from the centre.</p>",
+18
avsm/notes_cufp-2011-mirage.json
···+"summary": "<p>We signed up to do a MirageOS tutorial at ICFP, which is a bit daunting: we had to get all the embedded ARM hardware and laptop support in shape, as well as make it work for a bunch of discerning hackers.</p>",+"content": "<p>We signed up to do a MirageOS tutorial at ICFP, which is a bit daunting: we had to get all the embedded ARM hardware and laptop support in shape, as well as make it work for a bunch of discerning hackers.</p>",
+18
avsm/notes_cufp-2013-liveblog.json
···+"summary": "<p>The <a href=\"https://cufp.org\">Commercial Uses of Functional Programming</a> workshop is one of the best industry/academia crossover workshops to attend, and these are my livenotes from the 2013 edition.</p>",+"content": "<p>The <a href=\"https://cufp.org\">Commercial Uses of Functional Programming</a> workshop is one of the best industry/academia crossover workshops to attend, and these are my livenotes from the 2013 edition.</p>",
+18
avsm/notes_custom-homebrew-taps.json
···+"summary": "<p>Now that I've <a href=\"https://anil.recoil.org/notes/bushel-lives\">switched</a> to a new website, I'm working on open-sourcing its components. I've got a lot of small OCaml scripts that are all work-in-progress, and so not quite suitable to be published to the <a href=\"https://github.com/ocaml/opam-repository\">central opam-repository</a>, but I still need to be able to run them conveniently on my own <a href=\"https://anil.recoil.org/\">self-hosted</a> infrastructure.</p>\n<p>I mainly use a variety of macOS and Linux hosts<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> and I want a workflow as simple as "<code>brew install avsm/ocaml/srcsetter</code>" and have it install a working binary version of my CLI utility. In this case, it's <a href=\"https://github.com/avsm/srcsetter\">srcsetter</a>, a simple tool I knocked up to generate the <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTML/Responsive_images\">responsive images</a> on this website. Luckily, Homebrew has made this <em>really</em> easy for us! They have a <a href=\"https://docs.brew.sh/BrewTestBot\">BrewTestBot</a> that integrates with GitHub Actions to automate the compilation of binary packages for us, all from a convenient PR-like workflow.</p>\n<p>First, we need to set up a GitHub Homebrew "tap" repository. Mine is <a href=\"https://github.com/avsm/homebrew-ocaml\">avsm/homebrew-ocaml</a> which allows for the tap to be referred to as <code>avsm/ocaml</code> (Homebrew special-cases these to expand to the full GitHub repository). 
We then add in a couple of GitHub Actions to activate the testbot:</p>\n<ul>\n<li><a href=\"https://github.com/avsm/homebrew-ocaml/blob/main/.github/workflows/tests.yml\">.github/workflows/tests.yml</a> runs in response to pull requests to that repository and does a full Brew build of the package.</li>\n<li><a href=\"https://github.com/avsm/homebrew-ocaml/blob/main/.github/workflows/publish.yml\">.github/workflows/publish.yml</a> allows us to simply add a <code>pr-pull</code> label to a successful PR and have it be merged automatically by the bot.</li>\n</ul>\n<p>Secondly, we need to create a Homebrew package for the opam package. For this, I just added a very simple script to the srcsetter repository called <a href=\"https://github.com/avsm/srcsetter/blob/main/.opambuild.sh\">.opambuild.sh</a> which builds a local binary using a temporary opam installation. In the future, we should be able to use <a href=\"https://preview.dune.build\">dune package management</a> to remove the need for this script, but I'm blocked on some <a href=\"https://github.com/ocaml/dune/issues/11405\">teething issues</a> there in the short-term.</p>\n<pre><code>export OPAMROOT=`pwd`/_opamroot\nexport OPAMYES=1\nexport OPAMCONFIRMLEVEL=unsafe-yes\nopam init -ny --disable-sandboxing\nopam switch create . 
\nopam exec -- dune build --profile=release\n</code></pre>\n<p>Once this is present in the repository we're building, I just need to <a href=\"https://github.com/avsm/homebrew-ocaml/pull/2\">open a pull request</a> with the Homebrew <a href=\"https://docs.brew.sh/Formula-Cookbook\">formula</a> for my CLI tool.</p>\n<pre><code>class Srcsetter < Formula\n desc "Webp image generator for responsive HTML sites"\n homepage "https://github.com/avsm/srcsetter/"\n url "https://github.com/avsm/srcsetter.git", branch: "main"\n version "0.0.1"\n license "ISC"\n\n depends_on "gpatch"\n depends_on "opam"\n\n def install\n system "bash", "./.opambuild.sh"\n bin.install "_opam/bin/srcsetter"\n end\nend\n</code></pre>\n<p>The formula is fairly self-explanatory: I just point Homebrew at the source repository, give it some descriptive metadata, and tell it to invoke the binary build script and make the sole resulting binary available as the contents of the package. At this point, the BrewBot will run against the PR and report any build failures on both macOS and Ubuntu. Most of these were swiftly fixed by running <code>brew style</code> (as instructed in the build failures) to take care of fairly minor issues.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/gh-brewbot-screen.webp\" title=\"\">\n</p>\n<p>When the PR went green, all I then had to do was to add the <code>pr-pull</code> label, and the bot takes care of uploading the binary artefacts to my <a href=\"https://github.com/avsm/homebrew-ocaml/releases/tag/srcsetter-0.0.1\">homebrew tap repo</a> and merging the PR. 
It also takes care of adding checksums to the merged Formula, so what actually got merged is:</p>\n<pre><code>class Srcsetter < Formula\n desc "Webp image generator for responsive HTML sites"\n homepage "https://github.com/avsm/srcsetter/"\n url "https://github.com/avsm/srcsetter.git", branch: "main"\n version "0.0.1"\n license "ISC"\n\n bottle do\n root_url "https://github.com/avsm/homebrew-ocaml/releases/download/srcsetter-0.0.1"\n sha256 cellar: :any_skip_relocation, arm64_sequoia: "b3e1289965d8bcf086db06b18e6c2865f9949a9e1202b8fafa640f3e363b6bd4"\n sha256 cellar: :any_skip_relocation, ventura: "9b61e8e4be5f777e3ef98672f275909a80c3cc3f82d6886ca1a90b66ea7bb9f8"\n sha256 cellar: :any_skip_relocation, x86_64_linux: "d8279f11f30edf865368a3c6f63d811d31c1a9ca019ef86e93afeb6624232850"\n end\n\n depends_on "gpatch"\n depends_on "opam"\n\n def install\n system "bash", "./.opambuild.sh"\n bin.install "_opam/bin/srcsetter"\n end\nend\n</code></pre>\n<p>The end result is that <code>brew install avsm/ocaml/srcsetter</code> now works, without me having to cut a release of the tool more centrally. I'd love to incorporate some aspects of this workflow into the OCaml opam-repository, as users are currently responsible for the checksumming generation themselves via <a href=\"https://discuss.ocaml.org/t/dune-release-version-1-4-0-released/6103\">dune-release</a> or <a href=\"https://opam.ocaml.org/doc/Packaging.html\">opam-publish</a>. It's an interesting twist to automate this part of the process and let the humans focus on the core package metadata instead. Thanks for all the help, Brewbot!</p>\n\n<ol>\n<li>\n<p>Let's leave <a href=\"https://anil.recoil.org/\">OpenBSD</a> support to another day!</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>Now that I've <a href=\"https://anil.recoil.org/notes/bushel-lives\">switched</a> to a new website, I'm working on open-sourcing its components. 
I've got a lot of small OCaml scripts that are all work-in-progress, and so not quite suitable to be published to the <a href=\"https://github.com/ocaml/opam-repository\">central opam-repository</a>, but I still need to be able to run them conveniently on my own <a href=\"https://anil.recoil.org/\">self-hosted</a> infrastructure.</p>\n<p>I mainly use a variety of macOS and Linux hosts<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> and I want a workflow as simple as "<code>brew install avsm/ocaml/srcsetter</code>" and have it install a working binary version of my CLI utility. In this case, it's <a href=\"https://github.com/avsm/srcsetter\">srcsetter</a>, a simple tool I knocked up to generate the <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTML/Responsive_images\">responsive images</a> on this website. Luckily, Homebrew has made this <em>really</em> easy for us! They have a <a href=\"https://docs.brew.sh/BrewTestBot\">BrewTestBot</a> that integrates with GitHub Actions to automate the compilation of binary packages for us, all from a convenient PR-like workflow.</p>\n<p>First, we need to set up a GitHub Homebrew "tap" repository. Mine is <a href=\"https://github.com/avsm/homebrew-ocaml\">avsm/homebrew-ocaml</a> which allows for the tap to be referred to as <code>avsm/ocaml</code> (Homebrew special-cases these to expand to the full GitHub repository). 
We then add in a couple of GitHub Actions to activate the testbot:</p>\n<ul>\n<li><a href=\"https://github.com/avsm/homebrew-ocaml/blob/main/.github/workflows/tests.yml\">.github/workflows/tests.yml</a> runs in response to pull requests to that repository and does a full Brew build of the package.</li>\n<li><a href=\"https://github.com/avsm/homebrew-ocaml/blob/main/.github/workflows/publish.yml\">.github/workflows/publish.yml</a> allows us to simply add a <code>pr-pull</code> label to a successful PR and have it be merged automatically by the bot.</li>\n</ul>\n<p>Secondly, we need to create a Homebrew package for the opam package. For this, I just added a very simple script to the srcsetter repository called <a href=\"https://github.com/avsm/srcsetter/blob/main/.opambuild.sh\">.opambuild.sh</a> which builds a local binary using a temporary opam installation. In the future, we should be able to use <a href=\"https://preview.dune.build\">dune package management</a> to remove the need for this script, but I'm blocked on some <a href=\"https://github.com/ocaml/dune/issues/11405\">teething issues</a> there in the short-term.</p>\n<pre><code>export OPAMROOT=`pwd`/_opamroot\nexport OPAMYES=1\nexport OPAMCONFIRMLEVEL=unsafe-yes\nopam init -ny --disable-sandboxing\nopam switch create . 
\nopam exec -- dune build --profile=release\n</code></pre>\n<p>Once this is present in the repository we're building, I just need to <a href=\"https://github.com/avsm/homebrew-ocaml/pull/2\">open a pull request</a> with the Homebrew <a href=\"https://docs.brew.sh/Formula-Cookbook\">formula</a> for my CLI tool.</p>\n<pre><code>class Srcsetter < Formula\n desc "Webp image generator for responsive HTML sites"\n homepage "https://github.com/avsm/srcsetter/"\n url "https://github.com/avsm/srcsetter.git", branch: "main"\n version "0.0.1"\n license "ISC"\n\n depends_on "gpatch"\n depends_on "opam"\n\n def install\n system "bash", "./.opambuild.sh"\n bin.install "_opam/bin/srcsetter"\n end\nend\n</code></pre>\n<p>The formula is fairly self-explanatory: I just point Homebrew at the source repository, give it some descriptive metadata, and tell it to invoke the binary build script and make the sole resulting binary available as the contents of the package. At this point, the BrewBot will run against the PR and report any build failures on both macOS and Ubuntu. Most of these were swiftly fixed by running <code>brew style</code> (as instructed in the build failures) to take care of fairly minor issues.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/gh-brewbot-screen.webp\" title=\"\">\n</p>\n<p>When the PR went green, all I then had to do was to add the <code>pr-pull</code> label, and the bot takes care of uploading the binary artefacts to my <a href=\"https://github.com/avsm/homebrew-ocaml/releases/tag/srcsetter-0.0.1\">homebrew tap repo</a> and merging the PR. 
It also takes care of adding checksums to the merged Formula, so what actually got merged is:</p>\n<pre><code>class Srcsetter < Formula\n desc "Webp image generator for responsive HTML sites"\n homepage "https://github.com/avsm/srcsetter/"\n url "https://github.com/avsm/srcsetter.git", branch: "main"\n version "0.0.1"\n license "ISC"\n\n bottle do\n root_url "https://github.com/avsm/homebrew-ocaml/releases/download/srcsetter-0.0.1"\n sha256 cellar: :any_skip_relocation, arm64_sequoia: "b3e1289965d8bcf086db06b18e6c2865f9949a9e1202b8fafa640f3e363b6bd4"\n sha256 cellar: :any_skip_relocation, ventura: "9b61e8e4be5f777e3ef98672f275909a80c3cc3f82d6886ca1a90b66ea7bb9f8"\n sha256 cellar: :any_skip_relocation, x86_64_linux: "d8279f11f30edf865368a3c6f63d811d31c1a9ca019ef86e93afeb6624232850"\n end\n\n depends_on "gpatch"\n depends_on "opam"\n\n def install\n system "bash", "./.opambuild.sh"\n bin.install "_opam/bin/srcsetter"\n end\nend\n</code></pre>\n<p>The end result is that <code>brew install avsm/ocaml/srcsetter</code> now works, without me having to cut a release of the tool more centrally. I'd love to incorporate some aspects of this workflow into the OCaml opam-repository, as users are currently responsible for the checksumming generation themselves via <a href=\"https://discuss.ocaml.org/t/dune-release-version-1-4-0-released/6103\">dune-release</a> or <a href=\"https://opam.ocaml.org/doc/Packaging.html\">opam-publish</a>. It's an interesting twist to automate this part of the process and let the humans focus on the core package metadata instead. Thanks for all the help, Brewbot!</p>\n\n<ol>\n<li>\n<p>Let's leave <a href=\"https://anil.recoil.org/\">OpenBSD</a> support to another day!</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
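As an aside, the short-tap-to-repository mapping that Homebrew special-cases (so `avsm/ocaml` resolves to the `avsm/homebrew-ocaml` GitHub repository) can be mimicked in a couple of lines of shell. This is an illustrative sketch using standard parameter expansion, not Homebrew's actual implementation:

```shell
# Expand a short tap name like "avsm/ocaml" into its backing GitHub
# repository name "avsm/homebrew-ocaml", mirroring Homebrew's convention
# of prefixing the repository half of the tap name with "homebrew-".
expand_tap() {
  user="${1%%/*}"   # text before the first "/"
  tap="${1#*/}"     # text after the first "/"
  printf '%s/homebrew-%s\n' "$user" "$tap"
}

expand_tap avsm/ocaml   # prints avsm/homebrew-ocaml
```

This is also why `brew install avsm/ocaml/srcsetter` knows to fetch the formula from the `avsm/homebrew-ocaml` repository without any extra configuration.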
+18
avsm/notes_datacaml-with-ciel.json
···+"summary": "<p>Distributed programming frameworks like\n<a href=\"http://wiki.apache.org/hadoop\">Hadoop</a> and\n<a href=\"http://research.microsoft.com/en-us/projects/dryad/\">Dryad</a> are popular\nfor performing computation over large amounts of data. The reason is\nprogrammer convenience: they accept a query expressed in a simple form\nsuch as <a href=\"http://wiki.apache.org/hadoop/HadoopMapReduce\">MapReduce</a>, and\nautomatically take care of distributing computation to multiple hosts,\nensuring the data is available at all nodes that need it, and dealing\nwith host failures and stragglers.</p>\n<p>A major limitation of Hadoop and Dryad is that they are not well-suited\nto expressing <a href=\"http://en.wikipedia.org/wiki/Iterative_method\">iterative\nalgorithms</a> or <a href=\"http://en.wikipedia.org/wiki/Dynamic_programming\">dynamic\nprogramming</a> problems.\nThese are very commonly found patterns in many algorithms, such as\n<a href=\"http://en.wikipedia.org/wiki/K-means_clustering\">k-means clustering</a>,\n<a href=\"http://en.wikipedia.org/wiki/Binomial_options_pricing_model\">binomial options\npricing</a> or\n<a href=\"http://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm\">Smith Waterman</a>\nfor sequence alignment.</p>\n<p>Over in the SRG in Cambridge,\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/who-we-are/\">we</a>\ndeveloped a Turing-powerful distributed execution engine called\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/\">CIEL</a> that addresses\nthis. The <a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel\">CIEL: A universal execution engine for distributed data-flow computing</a>\npaper describes the system in detail, but here\u2019s a shorter introduction.</p>\n<h2><a href=\"https://anil.recoil.org/#the-ciel-execution-engine\"></a>The CIEL Execution Engine</h2>\n<p>CIEL consists of a master coordination server and workers installed on\nevery host. 
The engine is job-oriented: a job consists of a graph of\ntasks which results in a deterministic output. CIEL tasks can run in any\nlanguage and are started by the worker processes as needed. Data flows\naround the cluster in the form of <em>references</em> that are fed to tasks as\ndependencies. Tasks can publish their outputs either as <em>concrete</em>\nreferences if they can finish the work immediately or as a <em>future</em>\nreference. Additionally, tasks can dynamically spawn more tasks and\ndelegate references to them, which makes the system Turing-powerful and\nsuitable for iterative and dynamic programming problems where the task\ngraph cannot be computed statically.</p>\n<p>The first iteration of CIEL used a domain-specific language called\n<a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel.pdf\">Skywriting</a> to\ncoordinate how tasks should run across a cluster. Skywriting is an\ninterpreted language that is \u201cnative\u201d to CIEL, and when it needs to\nblock it stores its entire execution state inside CIEL as a\ncontinuation. <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek Murray</a> has\nwritten a blog post <a href=\"http://www.syslog.cl.cam.ac.uk/2011/04/06/ciel/\">explaining this in more\ndetail</a>.</p>\n<p>More recently, we have been working on eliminating the need for\nSkywriting entirely, by adding direct support for CIEL into languages\nsuch as <a href=\"http://www.stackless.com/\">Python</a>, Java,\n<a href=\"http://www.scala-lang.org/\">Scala</a>, and the main subject of this post \u2013\n<a href=\"http://caml.inria.fr\">OCaml</a>. It works via libraries that communicate\nwith CIEL to spawn tasks, publish references, or suspend itself into the\ncluster to be woken up when a future reference is completed.</p>\n<h2><a href=\"https://anil.recoil.org/#datacaml-api\"></a>DataCaml API</h2>\n<p>Rather than go into too much detail about the innards of CIEL, this post\ndescribes the OCaml API and gives some examples of how to use it. 
The\nsimplest interface to start with is:</p>\n<pre><code>type 'a ref\nval deref : 'a ref -> 'a\n</code></pre>\n<p>The type <code>'a ref</code> represents a CIEL reference. This data might not be\nimmediately present on the current node, and so must be dereferenced\nusing the <code>deref</code> function.</p>\n<p>If the reference has been completed, then the OCaml value is\nunmarshalled and returned. If it is not present, then the program needs\nto wait until the computation involving the reference has completed\nelsewhere. The future reference might contain a large data structure and\nbe on another host entirely, and so we should serialise the program\nstate and spawn a task that is dependent on the future\u2019s completion.\nThis way, CIEL can resume execution on whatever node finished that\ncomputation, avoiding the need to move data across the network.</p>\n<p>Luckily, we do not need to serialise the entire heap to suspend the\nprogram. DataCaml uses the\n<a href=\"http://okmij.org/ftp/continuations/implementations.html\">delimcc</a>\ndelimited continuations library to walk the stack and save only the\nsubset required to restart this particular task. Delimcc abstracts this\nin the form of a \u201crestartable exception\u201d that supplies a closure which can\nbe called later to resume the execution, as if the exception had never\nhappened. Delimcc supports serialising this closure to an output\nchannel, which you can read about in Oleg\u2019s\n<a href=\"http://okmij.org/ftp/continuations/caml-shift.pdf\">paper</a>.</p>\n<p>So how do we construct references? Let\u2019s fill in more of the interface:</p>\n<pre><code>module Ciel : sig\n type 'a ref\n val deref : 'a ref -> 'a\n val spawn : ('a -> 'b) -> 'a -> 'b ref\n val run : (string list -> 'a) -> ('a -> string) -> unit\nend\n</code></pre>\n<p>The <code>spawn</code> function accepts a closure and an argument, and returns a\nfuture of the result as a reference. 
The <code>run</code> function begins the\nexecution of a job, with the first parameter taking some\n<code>string arguments</code> and returning an <code>'a</code> value. We also supply a\npretty-printer second argument to convert the <code>'a</code> into a string for\nreturning as the result of the job (this can actually be any JSON value\nin CIEL, and is just simplified here).</p>\n<pre><code>let r1 = spawn (fun x -> x + 5) arg1 in\nlet r2 = spawn (fun x -> deref r1 + 5) arg1 in\nderef r2\n</code></pre>\n<p>We first spawn a function <code>r1</code> which simply adds 5 to the job argument.\nA job in CIEL is <em>lazily scheduled</em>, so this marshals the function to\nCIEL, creates a future, and returns immediately. Next, the <code>r2</code> function\nspawns a task which also adds 5, but to the dereferenced value of <code>r1</code>.\nAgain, it is not scheduled yet as the return reference has not been\ndereferenced.</p>\n<p>Finally, we attempt to dereference <code>r2</code>, which causes it to be scheduled on\na worker. While executing, it will try to dereference <code>r1</code>, which will\nschedule it, and all the tasks will run to completion.</p>\n<p>Programming language boffins will recognise that this interface is very\nsimilar to <a href=\"http://www.ps.uni-saarland.de/alice/\">AliceML</a>\u2019s concept of\n<a href=\"http://www.ps.uni-saarland.de/alice/manual/futures.html\">lazy futures</a>.\nThe main difference is that it is implemented as a pure OCaml library,\nand uses a general-purpose distributed engine that can also work with\nother languages.</p>\n<h2><a href=\"https://anil.recoil.org/#streaming-references\"></a>Streaming References</h2>\n<p>The references described so far only have two states: they are either\nconcrete or futures. However, there are times when a task can\nprogressively accept input and make forward progress. 
For these\nsituations, references can also be typed as <em>opaque</em> references that are\naccessed via <code>in_channel</code> and <code>out_channel</code>, as networks are:</p>\n<pre><code>type opaque_ref\n\nval spawn_ref : (unit -> opaque_ref) -> opaque_ref\nval output : ?stream:bool -> ?pipe:bool -> (out_channel -> unit) -> opaque_ref\nval input : (in_channel -> 'a) -> opaque_ref -> 'a\n</code></pre>\n<p>This interface is a lower-level version of the previous one:</p>\n<ul>\n<li><code>spawn_ref</code> creates a lazy future as before, but the type of\nreferences here is completely opaque to the program.</li>\n<li>Inside a spawned function, <code>output</code> is called with a closure that\naccepts an <code>out_channel</code>. The <code>stream</code> argument informs CIEL that a\ndependent task can consume the output before it is completed, and\n<code>pipe</code> forms an even more closely coupled shared-memory connection\n(requiring the tasks to be scheduled on the same host). Piping is\nmore efficient, but will require more work to recover from a fault,\nand so using it is left to the programmer to decide.</li>\n<li>The <code>input</code> function is used by the receiving task to parse the\ninput as a standard <code>in_channel</code>.</li>\n</ul>\n<p>The CIEL engine actually supports multiple concurrent input and output\nstreams to a task, but I\u2019ve just bound it as a single version for now\nwhile the bindings find their feet. Here\u2019s an example of how streaming\nreferences can be used:</p>\n<pre><code>let x_ref = spawn_ref (fun () ->\n output ~stream:true (fun oc ->\n for i = 0 to 5 do\n Unix.sleep 1;\n fprintf oc "%d\\n%!" i;\n done\n )\n ) in\n let y_ref = spawn_ref (fun () ->\n input (fun ic ->\n output ~stream:true (fun oc ->\n for i = 0 to 5 do\n let line = input_line ic in\n fprintf oc "LINE=%s\\n%!" 
line\n done\n )\n ) x_ref\n ) in\n</code></pre>\n<p>We first spawn an <code>x_ref</code> which pretends to do 5 seconds of work by\nsleeping and outputting a number. This would of course be heavy number\ncrunching in a real program. The <code>y_ref</code> then inputs this stream, and\noutputs its own result by prepending a string to each line.</p>\n<h2><a href=\"https://anil.recoil.org/#try-it-out\"></a>Try it out</h2>\n<p>If you are interested in a more realistic example, then read through the\n<a href=\"https://github.com/avsm/ciel/blob/master/src/ocaml/binomial.ml\">binomial\noptions</a>\ncalculator that uses streaming references to parallelise a dynamic\nprogramming problem (this would be difficult to express in MapReduce).\nOn my Mac, I can run this by:</p>\n<ul>\n<li>check out CIEL from Derek\u2019s <a href=\"http://github.com/mrry/ciel\">Git\nrepository</a>.</li>\n<li>install all the Python libraries required (see the <code>INSTALL</code> file)\nand OCaml libraries\n(<a href=\"http://okmij.org/ftp/continuations/implementations.html\">delimcc</a>\nand <a href=\"http://martin.jambon.free.fr/yojson.html\">Yojson</a>).</li>\n<li>add <code><repo>/src/python</code> to your <code>PYTHONPATH</code></li>\n<li>in one terminal: <code>./scripts/run_master.sh</code></li>\n<li>in another terminal: <code>./scripts/run_worker.sh -n 5</code> (this allocates\n5 execution slots)</li>\n<li>build the OCaml libraries: <code>cd src/ocaml && make</code></li>\n<li>start the binomial options job:\n<code>./scripts/sw-start-job -m http://localhost:8000 ./src/package/ocaml_binopt.pack</code></li>\n<li>there will be a URL printed which shows the execution progress in\nreal-time</li>\n<li>you should see log activity on the worker(s), and a result reference\nwith the answer (<code>10.x</code>)</li>\n<li>let us know the happy news if it worked or sad news if something\nbroke</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#discussion\"></a>Discussion</h2>\n<p>The DataCaml bindings 
outlined here provide an easy way to write\ndistributed, fault-tolerant and cluster-scheduled jobs in OCaml. The\ncurrent implementation of the engine is aimed at cluster computation,\nbut <a href=\"http://www.cl.cam.ac.uk/~ms705\">Malte</a> has been working on\n<a href=\"http://www.cl.cam.ac.uk/~ms705/pub/papers/2011-ciel-sfma.pdf\">condensing CIEL onto multicore\nhardware</a>.\nThus, this could be one approach to \u2018solving the OCaml multicore\nproblem\u2019 for problems that fit nicely into the dataflow paradigm.</p>\n<p>The biggest limitation for using these bindings is that delimited\ncontinuation serialisation only works in bytecode. Native code delimcc\nsupports <code>shift/reset</code> in the same program, but serialising is\nproblematic since native code continuations contain a C stack, which may\nhave unwrapped integers. One way to work around this is by switching to\na monadic approach to dereferencing, but I find delimcc programming more\nnatural (also see <a href=\"http://www.openmirage.org/wiki/delimcc-vs-lwt\">this\ndiscussion</a>).</p>\n<p>Another important point is that tasks are lazy and purely functional\n(remind you of Haskell?). This is essential for reliable fault-tolerance\nand reproducibility, while allowing individual tasks to run fast, strict\nand mutable OCaml code. The tasks must remain referentially transparent\nand idempotent, as CIEL may choose to schedule them multiple times (in\nthe case of faults or straggler correction). Derek has been working on\n<a href=\"http://www.cl.cam.ac.uk/~dgm36/publications/2011-murray2011nondet.pdf\">integrating non-determinism into\nCIEL</a>,\nso this restriction may be relaxed soon.</p>\n<p>Finally, these ideas are not limited to OCaml at all, but also apply to\nScala, Java, and Python. 
We have submitted a draft paper dubbed <em>\u2018<a href=\"http://www.cl.cam.ac.uk/~ms705/pub/papers/2011-ciel-socc-draft.pdf\">A\nPolyglot Approach to Cloud\nProgramming</a>\u2019</em>\nwith more details and the ubiquitous evaluation versus Hadoop. There is\na really interesting line to explore between low-level\n<a href=\"http://en.wikipedia.org/wiki/Message_Passing_Interface\">MPI</a> coding and\nhigh-level MapReduce, and we think CIEL is a useful spot in that design\nspace.</p>\n<p>Incidentally, I was recently hosted by <a href=\"http://research.nokia.com/\">Nokia\nResearch</a> in Palo Alto by my friend\n<a href=\"http://www.linkedin.com/pub/prashanth-mundkur/6/b44/27\">Prashanth\nMundkur</a>, where\nthey work on the Python/Erlang/OCaml <a href=\"http://discoproject.org/\">Disco</a>\nMapReduce engine. I\u2019m looking forward to seeing more critical\ncomparisons and discussions of alternatives to Hadoop, from them and\nothers.</p>\n<p><em>Thanks are due to <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek</a>,\n<a href=\"https://twitter.com/#!/chrissmowton\">Chris</a> and\n<a href=\"http://www.cl.cam.ac.uk/~ms705\">Malte</a> for answering my incessant CIEL\nquestions while writing this post! Remember that DataCaml is a work in\nprogress and a research prototype, and feedback is most welcome.</em></p>",+"content": "<p>Distributed programming frameworks like\n<a href=\"http://wiki.apache.org/hadoop\">Hadoop</a> and\n<a href=\"http://research.microsoft.com/en-us/projects/dryad/\">Dryad</a> are popular\nfor performing computation over large amounts of data. 
The reason is\nprogrammer convenience: they accept a query expressed in a simple form\nsuch as <a href=\"http://wiki.apache.org/hadoop/HadoopMapReduce\">MapReduce</a>, and\nautomatically take care of distributing computation to multiple hosts,\nensuring the data is available at all nodes that need it, and dealing\nwith host failures and stragglers.</p>\n<p>A major limitation of Hadoop and Dryad is that they are not well-suited\nto expressing <a href=\"http://en.wikipedia.org/wiki/Iterative_method\">iterative\nalgorithms</a> or <a href=\"http://en.wikipedia.org/wiki/Dynamic_programming\">dynamic\nprogramming</a> problems.\nThese are very commonly found patterns in many algorithms, such as\n<a href=\"http://en.wikipedia.org/wiki/K-means_clustering\">k-means clustering</a>,\n<a href=\"http://en.wikipedia.org/wiki/Binomial_options_pricing_model\">binomial options\npricing</a> or\n<a href=\"http://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm\">Smith Waterman</a>\nfor sequence alignment.</p>\n<p>Over in the SRG in Cambridge,\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/who-we-are/\">we</a>\ndeveloped a Turing-powerful distributed execution engine called\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/\">CIEL</a> that addresses\nthis. The <a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel\">CIEL: A universal execution engine for distributed data-flow computing</a>\npaper describes the system in detail, but here\u2019s a shorter introduction.</p>\n<h2><a href=\"https://anil.recoil.org/#the-ciel-execution-engine\"></a>The CIEL Execution Engine</h2>\n<p>CIEL consists of a master coordination server and workers installed on\nevery host. The engine is job-oriented: a job consists of a graph of\ntasks which results in a deterministic output. CIEL tasks can run in any\nlanguage and are started by the worker processes as needed. Data flows\naround the cluster in the form of <em>references</em> that are fed to tasks as\ndependencies. 
Tasks can publish their outputs either as <em>concrete</em>\nreferences if they can finish the work immediately or as <em>future</em>\nreferences. Additionally, tasks can dynamically spawn more tasks and\ndelegate references to them, which makes the system Turing-powerful and\nsuitable for iterative and dynamic programming problems where the task\ngraph cannot be computed statically.</p>\n<p>The first iteration of CIEL used a domain-specific language called\n<a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel.pdf\">Skywriting</a> to\ncoordinate how tasks should run across a cluster. Skywriting is an\ninterpreted language that is \u201cnative\u201d to CIEL, and when it needs to\nblock it stores its entire execution state inside CIEL as a\ncontinuation. <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek Murray</a> has\nwritten a blog post <a href=\"http://www.syslog.cl.cam.ac.uk/2011/04/06/ciel/\">explaining this in more\ndetail</a>.</p>\n<p>More recently, we have been working on eliminating the need for\nSkywriting entirely, by adding direct support for CIEL into languages\nsuch as <a href=\"http://www.stackless.com/\">Python</a>, Java,\n<a href=\"http://www.scala-lang.org/\">Scala</a>, and the main subject of this post \u2013\n<a href=\"http://caml.inria.fr\">OCaml</a>. It works via libraries that communicate\nwith CIEL to spawn tasks, publish references, or suspend the running task into the\ncluster to be woken up when a future reference is completed.</p>\n<h2><a href=\"https://anil.recoil.org/#datacaml-api\"></a>DataCaml API</h2>\n<p>Rather than go into too much detail about the innards of CIEL, this post\ndescribes the OCaml API and gives some examples of how to use it. The\nsimplest interface to start with is:</p>\n<pre><code>type 'a ref\nval deref : 'a ref -> 'a\n</code></pre>\n<p>The type <code>'a ref</code> represents a CIEL reference. 
This data might not be\nimmediately present on the current node, and so must be dereferenced\nusing the <code>deref</code> function.</p>\n<p>If the reference has been completed, then the OCaml value is\nunmarshalled and returned. If it is not present, then the program needs\nto wait until the computation involving the reference has completed\nelsewhere. The future reference might contain a large data structure and\nbe on another host entirely, and so we should serialise the program\nstate and spawn a task that is dependent on the future\u2019s completion.\nThis way, CIEL can resume execution on whatever node finished that\ncomputation, avoiding the need to move data across the network.</p>\n<p>Luckily, we do not need to serialise the entire heap to suspend the\nprogram. DataCaml uses the\n<a href=\"http://okmij.org/ftp/continuations/implementations.html\">delimcc</a>\ndelimited continuations library to walk the stack and save only the\nsubset required to restart this particular task. Delimcc abstracts this\nin the form of a \u201crestartable exception\u201d that supplies a closure which can\nbe called later to resume the execution, as if the exception had never\nhappened. Delimcc supports serialising this closure to an output\nchannel, which you can read about in Oleg\u2019s\n<a href=\"http://okmij.org/ftp/continuations/caml-shift.pdf\">paper</a>.</p>\n<p>So how do we construct references? Let\u2019s fill in more of the interface:</p>\n<pre><code>module Ciel : sig\n type 'a ref\n val deref : 'a ref -> 'a\n val spawn : ('a -> 'b) -> 'a -> 'b ref\n val run : (string list -> 'a) -> ('a -> string) -> unit\nend\n</code></pre>\n<p>The <code>spawn</code> function accepts a closure and an argument, and returns a\nfuture of the result as a reference. The <code>run</code> function begins the\nexecution of a job, with the first parameter taking some\n<code>string arguments</code> and returning an <code>'a</code> value. 
We also supply a\npretty-printer second argument to convert the <code>'a</code> into a string for\nreturning as the result of the job (this can actually be any JSON value\nin CIEL, and is just simplified here).</p>\n<pre><code>let r1 = spawn (fun x -> x + 5) arg1 in\nlet r2 = spawn (fun x -> deref r1 + 5) arg1 in\nderef r2\n</code></pre>\n<p>We first spawn a function <code>r1</code> which simply adds 5 to the job argument.\nA job in CIEL is <em>lazily scheduled</em>, so this marshals the function to\nCIEL, creates a future, and returns immediately. Next, the <code>r2</code> function\nspawns a task which also adds 5, but to the dereferenced value of <code>r1</code>.\nAgain, it is not scheduled yet as the return reference has not been\ndereferenced.</p>\n<p>Finally, we attempt to dereference <code>r2</code>, which causes it to be scheduled on\na worker. While executing, it will try to dereference <code>r1</code>, which will\nschedule it, and all the tasks will run to completion.</p>\n<p>Programming language boffins will recognise that this interface is very\nsimilar to <a href=\"http://www.ps.uni-saarland.de/alice/\">AliceML</a>\u2019s concept of\n<a href=\"http://www.ps.uni-saarland.de/alice/manual/futures.html\">lazy futures</a>.\nThe main difference is that it is implemented as a pure OCaml library,\nand uses a general-purpose distributed engine that can also work with\nother languages.</p>\n<h2><a href=\"https://anil.recoil.org/#streaming-references\"></a>Streaming References</h2>\n<p>The references described so far only have two states: they are either\nconcrete or futures. However, there are times when a task can\nprogressively accept input and make forward progress. 
For these\nsituations, references can also be typed as <em>opaque</em> references that are\naccessed via <code>in_channel</code> and <code>out_channel</code>, as networks are:</p>\n<pre><code>type opaque_ref\n\nval spawn_ref : (unit -> opaque_ref) -> opaque_ref\nval output : ?stream:bool -> ?pipe:bool -> (out_channel -> unit) -> opaque_ref\nval input : (in_channel -> 'a) -> opaque_ref -> 'a\n</code></pre>\n<p>This interface is a lower-level version of the previous one:</p>\n<ul>\n<li><code>spawn_ref</code> creates a lazy future as before, but the type of\nreferences here is completely opaque to the program.</li>\n<li>Inside a spawned function, <code>output</code> is called with a closure that\naccepts an <code>out_channel</code>. The <code>stream</code> argument informs CIEL that a\ndependent task can consume the output before it is completed, and\n<code>pipe</code> forms an even more closely coupled shared-memory connection\n(requiring the tasks to be scheduled on the same host). Piping is\nmore efficient, but will require more work to recover from a fault,\nand so using it is left to the programmer to decide.</li>\n<li>The <code>input</code> function is used by the receiving task to parse the\ninput as a standard <code>in_channel</code>.</li>\n</ul>\n<p>The CIEL engine actually supports multiple concurrent input and output\nstreams to a task, but I\u2019ve just bound it as a single version for now\nwhile the bindings find their feet. Here\u2019s an example of how streaming\nreferences can be used:</p>\n<pre><code>let x_ref = spawn_ref (fun () ->\n output ~stream:true (fun oc ->\n for i = 0 to 5 do\n Unix.sleep 1;\n fprintf oc "%d\\n%!" i;\n done\n )\n ) in\n let y_ref = spawn_ref (fun () ->\n input (fun ic ->\n output ~stream:true (fun oc ->\n for i = 0 to 5 do\n let line = input_line ic in\n fprintf oc "LINE=%s\\n%!" 
line\n done\n )\n ) x_ref\n ) in\n</code></pre>\n<p>We first spawn an <code>x_ref</code> which pretends to do 5 seconds of work by\nsleeping and outputting a number. This would of course be heavy number\ncrunching in a real program. The <code>y_ref</code> then inputs this stream, and\noutputs its own result by prepending a string to each line.</p>\n<h2><a href=\"https://anil.recoil.org/#try-it-out\"></a>Try it out</h2>\n<p>If you are interested in a more realistic example, then read through the\n<a href=\"https://github.com/avsm/ciel/blob/master/src/ocaml/binomial.ml\">binomial\noptions</a>\ncalculator that uses streaming references to parallelise a dynamic\nprogramming problem (this would be difficult to express in MapReduce).\nOn my Mac, I can run this by:</p>\n<ul>\n<li>check out CIEL from Derek\u2019s <a href=\"http://github.com/mrry/ciel\">Git\nrepository</a>.</li>\n<li>install all the Python libraries required (see the <code>INSTALL</code> file)\nand OCaml libraries\n(<a href=\"http://okmij.org/ftp/continuations/implementations.html\">delimcc</a>\nand <a href=\"http://martin.jambon.free.fr/yojson.html\">Yojson</a>).</li>\n<li>add <code><repo>/src/python</code> to your <code>PYTHONPATH</code></li>\n<li>in one terminal: <code>./scripts/run_master.sh</code></li>\n<li>in another terminal: <code>./scripts/run_worker.sh -n 5</code> (this allocates\n5 execution slots)</li>\n<li>build the OCaml libraries: <code>cd src/ocaml && make</code></li>\n<li>start the binomial options job:\n<code>./scripts/sw-start-job -m http://localhost:8000 ./src/package/ocaml_binopt.pack</code></li>\n<li>there will be a URL printed which shows the execution progress in\nreal-time</li>\n<li>you should see log activity on the worker(s), and a result reference\nwith the answer (<code>10.x</code>)</li>\n<li>let us know the happy news if it worked or sad news if something\nbroke</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#discussion\"></a>Discussion</h2>\n<p>The DataCaml bindings 
outlined here provide an easy way to write\ndistributed, fault-tolerant and cluster-scheduled jobs in OCaml. The\ncurrent implementation of the engine is aimed at cluster computation,\nbut <a href=\"http://www.cl.cam.ac.uk/~ms705\">Malte</a> has been working on\n<a href=\"http://www.cl.cam.ac.uk/~ms705/pub/papers/2011-ciel-sfma.pdf\">condensing CIEL onto multicore\nhardware</a>.\nThus, this could be one approach to \u2018solving the OCaml multicore\nproblem\u2019 for problems that fit nicely into the dataflow paradigm.</p>\n<p>The biggest limitation for using these bindings is that delimited\ncontinuation serialisation only works in bytecode. Native code delimcc\nsupports <code>shift/reset</code> in the same program, but serialising is\nproblematic since native code continuations contain a C stack, which may\nhave unwrapped integers. One way to work around this is by switching to\na monadic approach to dereferencing, but I find delimcc programming more\nnatural (also see <a href=\"http://www.openmirage.org/wiki/delimcc-vs-lwt\">this\ndiscussion</a>).</p>\n<p>Another important point is that tasks are lazy and purely functional\n(remind you of Haskell?). This is essential for reliable fault-tolerance\nand reproducibility, while allowing individual tasks to run fast, strict\nand mutable OCaml code. The tasks must remain referentially transparent\nand idempotent, as CIEL may choose to schedule them multiple times (in\nthe case of faults or straggler correction). Derek has been working on\n<a href=\"http://www.cl.cam.ac.uk/~dgm36/publications/2011-murray2011nondet.pdf\">integrating non-determinism into\nCIEL</a>,\nso this restriction may be relaxed soon.</p>\n<p>Finally, these ideas are not limited to OCaml at all, but also apply to\nScala, Java, and Python. 
We have submitted a draft paper dubbed <em>\u2018<a href=\"http://www.cl.cam.ac.uk/~ms705/pub/papers/2011-ciel-socc-draft.pdf\">A\nPolyglot Approach to Cloud\nProgramming</a>\u2019</em>\nwith more details and the ubiquitous evaluation versus Hadoop. There is\na really interesting line to explore between low-level\n<a href=\"http://en.wikipedia.org/wiki/Message_Passing_Interface\">MPI</a> coding and\nhigh-level MapReduce, and we think CIEL is a useful spot in that design\nspace.</p>\n<p>Incidentally, I was recently hosted at <a href=\"http://research.nokia.com/\">Nokia\nResearch</a> in Palo Alto by my friend\n<a href=\"http://www.linkedin.com/pub/prashanth-mundkur/6/b44/27\">Prashanth\nMundkur</a>, where\nthey work on the Python/Erlang/OCaml <a href=\"http://discoproject.org/\">Disco</a>\nMapReduce engine. I\u2019m looking forward to seeing more critical\ncomparisons and discussions of alternatives to Hadoop, from them and\nothers.</p>\n<p><em>Thanks are due to <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek</a>,\n<a href=\"https://twitter.com/#!/chrissmowton\">Chris</a> and\n<a href=\"http://www.cl.cam.ac.uk/~ms705\">Malte</a> for answering my incessant CIEL\nquestions while writing this post! Remember that DataCaml is a work in\nprogress and a research prototype, and feedback is most welcome.</em></p>",
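The lazy-future behaviour of `spawn`/`deref` described in the DataCaml entry above can be given a single-process intuition with the OCaml standard library's `Lazy` module. This is only an illustrative analogue under stated assumptions: the `Fake_ciel` module below is a made-up name and not part of DataCaml, and where the real bindings marshal the closure out to a CIEL worker, here `Lazy.force` stands in for deref-triggered scheduling.

```ocaml
(* Illustrative single-process analogue of DataCaml's lazy futures.
   Fake_ciel is hypothetical: real DataCaml serialises the closure and
   schedules it on a CIEL worker only when the reference is derefed. *)
module Fake_ciel = struct
  type 'a ref = 'a Lazy.t
  let spawn (f : 'a -> 'b) (x : 'a) : 'b ref = lazy (f x)
  let deref (r : 'a ref) : 'a = Lazy.force r
end

let () =
  let open Fake_ciel in
  let arg1 = 10 in
  let r1 = spawn (fun x -> x + 5) arg1 in
  let r2 = spawn (fun _ -> deref r1 + 5) arg1 in
  (* Nothing runs until the final deref forces the chain, mirroring
     CIEL's lazy scheduling of r2 and then r1. *)
  Printf.printf "result = %d\n" (deref r2)
```

As in the post's example, forcing `r2` transitively forces `r1`, so with `arg1 = 10` this prints `result = 20`.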
+18
avsm/notes_decentralised-stack.json
···+"summary": "<p><a href=\"https://nick.recoil.org\">Nick Ludlam</a> and I have self-hosted recoil.org since around 1996, typically for\nemail and web. These days, there are a number of interesting software stacks\naround decentralised communication that we deploy. This note keeps track of\nthem.</p>\n<ul>\n<li><strong>Email</strong> (active)\n<ul>\n<li>Currently Postfix and DKIM/SPF relays</li>\n<li>Till 2019, was OpenSMTPD and would like to return to it but waiting on\nfilter support.</li>\n<li>Till around 2016, was qmail but finally gave up due to difficulty of\nspam filtering.</li>\n<li>Next step will be to try out the MirageOS email stack that dinosaure\nhas been leading the development of.</li>\n</ul>\n</li>\n<li><strong>Web</strong> (active)\n<ul>\n<li>This website runs on a custom multicore OCaml <a href=\"https://github.com/avsm/eeww\">webserver</a></li>\n<li>Next step will be to go solar powered with a custom DNS server.</li>\n</ul>\n</li>\n<li><strong>DNS</strong> (inactive)\n<ul>\n<li>MirageOS DNS server.</li>\n<li>Currently offline due to a hosting issue so fell back to Gandi.</li>\n<li>Hopefully can secondary with @hannesm and his MirageOS infrastructure.</li>\n</ul>\n</li>\n<li><strong>Videos</strong> (active)\n<ul>\n<li>Running a PeerTube instance on <a href=\"https://crank.recoil.org\">https://crank.recoil.org</a></li>\n<li>Also deployed this for the OCaml community as <watch.ocaml.org>, so my\npersonal recoil instance is "following" the OCaml one as well as having\nmy own videos.</li>\n</ul>\n</li>\n<li><strong>Chat</strong> (active)\n<ul>\n<li>Running a Matrix Element server with an HTTP SRV record for recoil.org</li>\n<li>Using element.io clients to connect to it.</li>\n<li>Lots of federation to other services happening from this via\nrepublished rooms, so it's a fairly busy server.</li>\n<li>Next step is to deploy some of the OCaml Matrix clients to control\nthe notifications. 
Element doesn't have very good push support.</li>\n<li>Decided not to bridge this to WhatsApp/Signal/etc as the maintenance\ncost is quite high and it requires unencrypted passwords.</li>\n<li>Need to regularly sweep the Element database to keep the size down, as detailed in this <a href=\"https://levans.fr/shrink-synapse-database.html\">handy blog post</a>.</li>\n</ul>\n</li>\n<li><strong>Activity</strong> (active)\n<ul>\n<li>Deployed a Mastodon instance for distributed tweeting via\nActivityPub, on https://amok.recoil.org/</li>\n</ul>\n</li>\n<li><strong>Images</strong> (inactive)\n<ul>\n<li>Tristan Henderson pointed me to Pixelfed which seems worth a try for\nimage sharing over ActivityPub. Not had a chance to use it yet.</li>\n</ul>\n</li>\n<li><strong>Spam</strong> (inactive)\n<ul>\n<li>The problem with the chat service is that I'm getting quite a lot of spam\nrequests on Matrix. Am experimenting with a Tezos node to act as a\nDID introduction proxy with gas costs. Hopefully there's a way to\nbe introduced due to some common service (or some evidence of PoW for the\ncommunication such as having read and quoted one of my papers or something)\nand have micropayment as a last resort.</li>\n<li>Also deployed SpamAssassin recoil-wide and custom Bayes filters.</li>\n</ul>\n</li>\n</ul>\n<p>In general, our operating system of choice is OpenBSD (since 1998 or so) with\nAlpine Linux for the more recent things that run on a cloud or haven't been\nported yet.</p>",+"content": "<p><a href=\"https://nick.recoil.org\">Nick Ludlam</a> and I have self-hosted recoil.org since around 1996, typically for\nemail and web. These days, there are a number of interesting software stacks\naround decentralised communication that we deploy. 
This note keeps track of\nthem.</p>\n<ul>\n<li><strong>Email</strong> (active)\n<ul>\n<li>Currently Postfix and DKIM/SPF relays</li>\n<li>Till 2019, was OpenSMTPD and would like to return to it but waiting on\nfilter support.</li>\n<li>Till around 2016, was qmail but finally gave up due to difficulty of\nspam filtering.</li>\n<li>Next step will be to try out the MirageOS email stack that dinosaure\nhas been leading the development of.</li>\n</ul>\n</li>\n<li><strong>Web</strong> (active)\n<ul>\n<li>This website runs on a custom multicore OCaml <a href=\"https://github.com/avsm/eeww\">webserver</a></li>\n<li>Next step will be to go solar powered with a custom DNS server.</li>\n</ul>\n</li>\n<li><strong>DNS</strong> (inactive)\n<ul>\n<li>MirageOS DNS server.</li>\n<li>Currently offline due to a hosting issue so fell back to Gandi.</li>\n<li>Hopefully can secondary with @hannesm and his MirageOS infrastructure.</li>\n</ul>\n</li>\n<li><strong>Videos</strong> (active)\n<ul>\n<li>Running a PeerTube instance on <a href=\"https://crank.recoil.org\">https://crank.recoil.org</a></li>\n<li>Also deployed this for the OCaml community as <watch.ocaml.org>, so my\npersonal recoil instance is "following" the OCaml one as well as having\nmy own videos.</li>\n</ul>\n</li>\n<li><strong>Chat</strong> (active)\n<ul>\n<li>Running a Matrix Element server with an HTTP SRV record for recoil.org</li>\n<li>Using element.io clients to connect to it.</li>\n<li>Lots of federation to other services happening from this via\nrepublished rooms, so it's a fairly busy server.</li>\n<li>Next step is to deploy some of the OCaml Matrix clients to control\nthe notifications. 
Element doesn't have very good push support.</li>\n<li>Decided not to bridge this to WhatsApp/Signal/etc as the maintenance\ncost is quite high and it requires unencrypted passwords.</li>\n<li>Need to regularly sweep the Element database to keep the size down, as detailed in this <a href=\"https://levans.fr/shrink-synapse-database.html\">handy blog post</a>.</li>\n</ul>\n</li>\n<li><strong>Activity</strong> (active)\n<ul>\n<li>Deployed a Mastodon instance for distributed tweeting via\nActivityPub, on https://amok.recoil.org/</li>\n</ul>\n</li>\n<li><strong>Images</strong> (inactive)\n<ul>\n<li>Tristan Henderson pointed me to Pixelfed which seems worth a try for\nimage sharing over ActivityPub. Not had a chance to use it yet.</li>\n</ul>\n</li>\n<li><strong>Spam</strong> (inactive)\n<ul>\n<li>The problem with the chat service is that I'm getting quite a lot of spam\nrequests on Matrix. Am experimenting with a Tezos node to act as a\nDID introduction proxy with gas costs. Hopefully there's a way to\nbe introduced due to some common service (or some evidence of PoW for the\ncommunication such as having read and quoted one of my papers or something)\nand have micropayment as a last resort.</li>\n<li>Also deployed SpamAssassin recoil-wide and custom Bayes filters.</li>\n</ul>\n</li>\n</ul>\n<p>In general, our operating system of choice is OpenBSD (since 1998 or so) with\nAlpine Linux for the more recent things that run on a cloud or haven't been\nported yet.</p>",
+18
avsm/notes_deepseek-r1-advances.json
···+"summary": "<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> posted a link to this excellent deepdive by <a href=\"https://www.linkedin.com/in/prasadraje/\">Prasad Raje</a> of Udemy into the advances that\n<a href=\"https://deepseek.com\">DeepSeek</a> R1 has made from a perspective of the core\ntechnology.</p>\n<blockquote>\n<ul>\n<li>Multi-headed Latent Attention (MLA). In the famous Google "<a href=\"https://arxiv.org/abs/1706.03762\">Attention is all you need</a>" paper, the attention block is responsible for a lot of the magic of LLMs but is also compute heavy [...] Deepseek has innovated here with Multi-headed latent attention - which essentially reduces the size of matrix multiplication applied to generate the K,V vectors that are inputs into the attention block. Combined with KV Caching, this reduces the memory needs [...]</li>\n<li>Mixture of Experts (MoE). The key idea here is that instead of feeding each token through one massive <a href=\"https://en.wikipedia.org/wiki/Feedforward_neural_network\">FFN</a>, break down the single FFN into a number of smaller FFNs and route each token through a subset of these FFNs. [...] each of these smaller FFNs will learn during training something specific about how to transform each token, hence becoming an "expert". Deepseek took MoE to this 670B parameter scale that no one had done before [...] and created 256 FFNs and routes each token through only 8 of these.</li>\n<li>Multi-token prediction (MTP): [...] you compute more than 1 token and send the aggregate error to back propagate. The intuition is that you get more changes made to the model weights in each training step, thus reducing the total training steps needed [...] 
Deepseek took this idea further, added innovations of their own (Sequential vs parallel MTP) and used this to reduce training time.\n -- <a href=\"https://www.linkedin.com/pulse/deepdive-deepseek-prasad-raje-jakqc\">Prasad Raje</a></li>\n</ul>\n</blockquote>",+"content": "<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> posted a link to this excellent deepdive by <a href=\"https://www.linkedin.com/in/prasadraje/\">Prasad Raje</a> of Udemy into the advances that\n<a href=\"https://deepseek.com\">DeepSeek</a> R1 has made from a perspective of the core\ntechnology.</p>\n<blockquote>\n<ul>\n<li>Multi-headed Latent Attention (MLA). In the famous Google "<a href=\"https://arxiv.org/abs/1706.03762\">Attention is all you need</a>" paper, the attention block is responsible for a lot of the magic of LLMs but is also compute heavy [...] Deepseek has innovated here with Multi-headed latent attention - which essentially reduces the size of matrix multiplication applied to generate the K,V vectors that are inputs into the attention block. Combined with KV Caching, this reduces the memory needs [...]</li>\n<li>Mixture of Experts (MoE). The key idea here is that instead of feeding each token through one massive <a href=\"https://en.wikipedia.org/wiki/Feedforward_neural_network\">FFN</a>, break down the single FFN into a number of smaller FFNs and route each token through a subset of these FFNs. [...] each of these smaller FFNs will learn during training something specific about how to transform each token, hence becoming an "expert". Deepseek took MoE to this 670B parameter scale that no one had done before [...] and created 256 FFNs and routes each token through only 8 of these.</li>\n<li>Multi-token prediction (MTP): [...] you compute more than 1 token and send the aggregate error to back propagate. 
The intuition is that you get more changes made to the model weights in each training step, thus reducing the total training steps needed [...] Deepseek took this idea further, added innovations of their own (Sequential vs parallel MTP) and used this to reduce training time.\n -- <a href=\"https://www.linkedin.com/pulse/deepdive-deepseek-prasad-raje-jakqc\">Prasad Raje</a></li>\n</ul>\n</blockquote>",
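The Mixture-of-Experts routing quoted above (each token passes through only 8 of 256 expert FFNs) reduces to a top-k selection over the router's gate scores. The sketch below illustrates that selection step in plain OCaml; the scores are invented for the example and this is a generic top-k illustration, not DeepSeek's actual routing code.

```ocaml
(* Top-k expert selection: pick the k highest-scoring experts for a token.
   Gate scores here are made up; a real MoE router computes them per token. *)
let top_k k scores =
  Array.mapi (fun i s -> (i, s)) scores
  |> Array.to_list
  |> List.sort (fun (_, a) (_, b) -> compare b a)  (* sort descending by score *)
  |> List.filteri (fun i _ -> i < k)               (* keep the k best *)

let () =
  let gate_scores = [| 0.10; 0.70; 0.05; 0.90; 0.20 |] in
  top_k 2 gate_scores
  |> List.iter (fun (i, s) -> Printf.printf "route to expert %d (score %.2f)\n" i s)
```

With these toy scores the token is routed to experts 3 and 1; DeepSeek applies the same idea at k=8 over 256 experts per layer.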
+18
avsm/notes_delimited-cont-vs-lwt.json
···
+18
avsm/notes_deprecating-ocaml-408.json
···+"summary": "<p>I started pushing OCaml Docker images over to the <a href=\"https://hub.docker.com/r/ocaml/opam\">Docker Hub</a> in around 2017, to support the burgeoning automated build infrastructure around the use of the language. Back then, OCaml 4.06 was the latest release, and so I wrote an <a href=\"https://github.com/ocurrent/ocaml-version/blob/master/CHANGES.md\">ocaml-version</a> library to track the release metadata. It has been a bit of a success disaster, as that library now <a href=\"https://github.com/ocurrent/ocaml-version/blob/master/CHANGES.md\">tracks</a> every release of OCaml in the modern era, and also backs the <a href=\"https://github.com/ocurrent/docker-base-images\">automatic building</a> of a huge array of compiler versions and variants across <a href=\"https://images.ci.ocaml.org/?distro=debian-12&\">Linux</a> and <a href=\"https://images.ci.ocaml.org/?distro=windows-msvc&\">Windows</a>.</p>\n<p>The problem is...we're now building the full set of images from OCaml 4.02 onwards through to the latest OCaml 5.3.0 release, which is unsustainable for obvious reasons; despite the hosting being kindly <a href=\"https://www.docker.com/community/open-source/application/\">sponsored by Docker</a>, we must also consider the <a href=\"https://ocaml.org/policies/carbon-footprint\">carbon footprint</a> of our infrastructure.\nSo the question for the OCaml community: <strong>are there any remaining users who still need images earlier than OCaml 4.08 or can we stop pushing those now?</strong></p>\n<p><a href=\"https://github.com/hannesm\">Hannes Mehnert</a> led an effort to deprecate compilers earlier than 4.08 <a href=\"https://discuss.ocaml.org/t/opam-repository-archival-phase-2-ocaml-4-08-is-the-lower-bound/15965\">in the opam-repo</a>, and now <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> is asking the same question <a 
href=\"https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229\">on the OCaml discussion forum</a> about the Docker image infrastructure. The latter lags the opam repository since there still may be operational use cases of industrial users who depend on older compilers, even if they don't use the latest package repository. So if you <em>are</em> using a really old OCaml and depend on our infrastructure, we'd appreciate you chiming in on the <a href=\"https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229\">forum thread</a> or just contact <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> or myself directly to let us know.</p>\n<p>On another note, it's also quite difficult on the central <a href=\"https://hub.docker.com/\">Docker Hub</a> to get statistics per-tag as to how many people are using each image. Does anyone have any recommendations on whether we should deploy our own "proxy registry" before pushing through to the central Docker Hub, or alternative open source registries to run our own?</p>",+"content": "<p>I started pushing OCaml Docker images over to the <a href=\"https://hub.docker.com/r/ocaml/opam\">Docker Hub</a> in around 2017, to support the burgeoning automated build infrastructure around the use of the language. Back then, OCaml 4.06 was the latest release, and so I wrote an <a href=\"https://github.com/ocurrent/ocaml-version/blob/master/CHANGES.md\">ocaml-version</a> library to track the release metadata. 
It has been a bit of a success disaster, as that library now <a href=\"https://github.com/ocurrent/ocaml-version/blob/master/CHANGES.md\">tracks</a> every release of OCaml in the modern era, and also backs the <a href=\"https://github.com/ocurrent/docker-base-images\">automatic building</a> of a huge array of compiler versions and variants across <a href=\"https://images.ci.ocaml.org/?distro=debian-12&\">Linux</a> and <a href=\"https://images.ci.ocaml.org/?distro=windows-msvc&\">Windows</a>.</p>\n<p>The problem is...we're now building the full set of images from OCaml 4.02 onwards through to the latest OCaml 5.3.0 release, which is unsustainable for obvious reasons; despite the hosting being kindly <a href=\"https://www.docker.com/community/open-source/application/\">sponsored by Docker</a>, we must also consider the <a href=\"https://ocaml.org/policies/carbon-footprint\">carbon footprint</a> of our infrastructure.\nSo the question for the OCaml community: <strong>are there any remaining users who still need images earlier than OCaml 4.08 or can we stop pushing those now?</strong></p>\n<p><a href=\"https://github.com/hannesm\">Hannes Mehnert</a> led an effort to deprecate compilers earlier than 4.08 <a href=\"https://discuss.ocaml.org/t/opam-repository-archival-phase-2-ocaml-4-08-is-the-lower-bound/15965\">in the opam-repo</a>, and now <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> is asking the same question <a href=\"https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229\">on the OCaml discussion forum</a> about the Docker image infrastructure. The latter lags the opam repository since there still may be operational use cases of industrial users who depend on older compilers, even if they don't use the latest package repository. 
So if you <em>are</em> using a really old OCaml and depend on our infrastructure, we'd appreciate you chiming in on the <a href=\"https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229\">forum thread</a> or just contact <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> or myself directly to let us know.</p>\n<p>On another note, it's also quite difficult on the central <a href=\"https://hub.docker.com/\">Docker Hub</a> to get statistics per-tag as to how many people are using each image. Does anyone have any recommendations on whether we should deploy our own "proxy registry" before pushing through to the central Docker Hub, or alternative open source registries to run our own?</p>",
+18
avsm/notes_disentangling-git-with-bluesky.json
···+"summary": "<p>I've been an avid user of <a href=\"https://github.com\">GitHub</a> since its launch, and it really has revolutionised how communities come together to work on open source. In recent years though, I find myself utterly overwhelmed by its notifications and want to experiment with <a href=\"https://www.offlineimap.org/github/2016/03/08/github-pr-suck.html\">alternative workflows</a>. This experimentation also has a more serious undertone due to the increasing need for <a href=\"https://www.boell.de/en/2025/01/24/trump-and-big-tech-europes-sovereignty-stake\">data sovereignty</a> and so I'm starting to move my source code to self-hosted solutions that are less reliant on centralised services.</p>\n<p>This has also come up persistently over the years in the <a href=\"https://ocaml.org\">OCaml</a> community, with questions over why participation in packaging <a href=\"https://discuss.ocaml.org/t/publishing-without-github/3232\">requires a GitHub account</a> ever since the <a href=\"https://anil.recoil.org/notes/opam-1-1-beta\">early days</a> of opam. I've never found a good answer... until now, with the launch of an exciting <a href=\"https://tangled.sh\">new service</a> that's built over the same protocol that <a href=\"https://bsky.app\">Bluesky</a> uses.\nAs I <a href=\"https://anil.recoil.org/notes/atproto-for-fun-and-blogging\">noted</a> a few weeks ago, the <a href=\"https://atproto.com/\">ATProto</a> can be used for more than just microblogging. It can also be an <em>identity</em> layer, across which other applications can be built which reuse the social fabric from Bluesky accounts.</p>\n<p>"<a href=\"https://tangled.sh\">Tangled</a>" is a new service launched (just yesterday!) by <a href=\"https://tangled.sh/@oppili.bsky.social\">opilli</a> and <a href=\"https://tangled.sh/@icyphox.sh\">icyphox</a> to manage Git repositories. I'm having a lot of fun trying it out, even in its early alpha stages! 
The coolest thing about Tangled is that you can self-host your own <a href=\"https://blog.tangled.sh/intro\">knots</a>, which control where the source code repositories are actually stored.</p>\n<h2><a href=\"https://anil.recoil.org/#self-hosting-my-own-tangled-knot\"></a>Self hosting my own Tangled knot</h2>\n<p>I set up one of the first knots on the network on <code>git.recoil.org</code>, and can now directly share my source code online without depending on GitHub! For example, this is the <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\">knot-docker</a> container config which you can use to deploy your own version of this.</p>\n<p><a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-1.webp\" title=\"\">\n </a></p>\n<p>It looks pretty similar to GitHub doesn't it? The first key difference is the login on the top-right, which is the same as my <a href=\"https://bsky.app/profile/anil.recoil.org\">anil.recoil.org</a> account. Once you're logged in, the other difference shows up when creating a new Git repository.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-2.webp\" title=\"\">\n</p>\n<p>As you can see, you can not only select the name of the repository, but also <em>where</em> it's going to be stored. I can either put it on the central Tangled knot, or stick it on my own Recoil one. After this, the user experience of cloning is as simple as:</p>\n<pre><code>git clone https://tangled.sh/@anil.recoil.org/knot-docker\ngit clone git@git.recoil.org:anil.recoil.org/knot-docker\n</code></pre>\n<p>In the first case, the central tangled web server proxies the Git contents over HTTP, and for SSH I can just connect directly to my own server. 
Inside my Knot container, we can see where the Git repositories are stored:</p>\n<pre><code>/home/git # ls -1\ndid:plc:nhyitepp3u4u6fcfboegzcjw\nknotserver.db\nknotserver.db-shm\nknotserver.db-wal\nlog\n</code></pre>\n<p>The <code>did:</code> directory is actually my 'decentralised identifier' from the ATProto, which we can verify by looking up the <a href=\"https://bsky.social/about/blog/4-28-2023-domain-handle-tutorial\">DNS atproto TXT</a> record for my domain:</p>\n<pre><code>$ dig txt _atproto.anil.recoil.org\n;; ANSWER SECTION:\n_atproto.anil.recoil.org. 10799 IN TXT "did=did:plc:nhyitepp3u4u6fcfboegzcjw"\n</code></pre>\n<p>And then if we navigate into that directory, we can see there are just normal bare git repositories stored on my server.</p>\n<pre><code>/home/git/did:plc:nhyitepp3u4u6fcfboegzcjw/knot-docker # ls -la\ntotal 24\ndrwxr-sr-x 4 git git 4096 Mar 8 19:02 .\ndrwxr-sr-x 4 git git 4096 Mar 8 18:23 ..\n-rw-r--r-- 1 git git 21 Mar 8 18:01 HEAD\n-rw-r--r-- 1 git git 36 Mar 8 18:01 config\ndrwxr-sr-x 17 git git 4096 Mar 8 19:02 objects\ndrwxr-sr-x 4 git git 4096 Mar 8 18:01 refs\n</code></pre>\n<p>This makes the core of Tangled very safe to use, even if the service disappears: I maintain the actual git repositories myself, so I can (e.g.) mirror them to GitHub via a simple cron script.</p>\n<h2><a href=\"https://anil.recoil.org/#collaboration-is-as-simple-as-bluesky\"></a>Collaboration is as simple as Bluesky</h2>\n<p>Tangled has only been out for about a day, so I coopted fellow Recoiler <a href=\"https://nick.recoil.org\">Nick Ludlam</a> to create an account. 
I added his <a href=\"https://bsky.app/profile/nick.recoil.org\">handle</a> over to the Recoil knot, and that's all it took for him to be able to create repositories on our server.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-5.webp\" title=\"\">\n</p>\n<p>I can also just add people directly to a particular repository, as you can see from the one below on his profile.</p>\n<p><a href=\"https://tangled.sh/@nick.recoil.org\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-3.webp\" title=\"\">\n </a></p>\n<h2><a href=\"https://anil.recoil.org/#the-issue-metadata-is-also-distributed\"></a>The issue metadata is also distributed</h2>\n<p>The real lock-in to code repository management, though, is the metadata around the repository: things like issues, comments and so on. Tangled makes it possible to decentralise where this is stored <a href=\"https://www.chiark.greenend.org.uk/~sgtatham/quasiblog/git-no-forge/\">without needing a central Forge</a>, by relaying it all via the ATProto.\nLet's take a look at how this works.</p>\n<p>I <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker/issues/1\">created an issue</a> on knot-docker, and it looks very similar to a GitHub issue. Zicklag on <code>#tangled</code> pointed me to the <a href=\"https://pdsls.dev/\">PDSLS</a> public ATProto browser with which you can browse the actual ATProto records. I can start from my <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw\">DID record</a> and look for the <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue\">sh.tangled.repo.issue</a> collection, and find the <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue/3ljvbt4zni322\">issue URL from earlier</a>.
I then prodded <a href=\"https://nick.recoil.org\">Nick Ludlam</a> to leave a comment on the issue, and you can see his <a href=\"https://pdsls.dev/at://did:plc:dr3wsy7hlzgyanewhbw7fj5g/sh.tangled.repo.issue.comment/3ljvdsrlckj22\">sh.tangled.repo.issue.comment</a> in the relay as well.</p>\n<p><a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue/3ljvbt4zni322\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-4.webp\" title=\"\">\n </a></p>\n<p>Even the <a href=\"https://bsky.app/profile/tangled.sh/post/3ljv6wpioxc2q\">repository stars</a> are on the relay; see for example <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.feed.star/3ljvbtbrhew22\">this</a> entry for <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo/3ljv45bhfql22\">knot-docker</a> that I did. The Tangled developers just added support for stars <a href=\"https://tangled.sh/@tangled.sh/core/commit/662bd012caec9c2bd2a15e1dcfe184d5b2c49ff9#file-lexicons%2fstar.json\">a few hours ago</a>, and that changeset is a nice way to see how to add a new lexicon entry.</p>\n<p><a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> then reminded me of his project a few years ago to run <a href=\"https://anil.recoil.org/ideas/version-control-matrix\">git pull requests over Matrix chat</a>. 
It would indeed be very cool if the pull request model on Tangled evolved into something more message-oriented like <a href=\"https://git-scm.com/docs/git-send-email\">git-send-email</a>, in order to let us try out more personalised workflows than GitHub PRs.</p>\n<h2><a href=\"https://anil.recoil.org/#why-this-fits-in-so-well-with-the-rest-of-bluesky\"></a>Why this fits in so well with the rest of Bluesky</h2>\n<p>The ATProto developers also released their <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring\">roadmap for early 2025</a> today, and it aligns really well with some of the production features I would need to completely shift over to a service like Tangled.</p>\n<p>The first, and most vital one, is <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring#auth-scopes\">auth scopes</a> to control the permissions of an app password to only certain operations. Once this is in the protocol, a client to manage Tangled repositories could use a differently privileged password from the main social client.</p>\n<p>Secondly, <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring#privately-shared-data-and-e2ee-dms\">privately shared data</a> and <a href=\"https://www.ietf.org/blog/mls-secure-and-usable-end-to-end-encryption/\">encrypted DMs using MLS</a> point to how private code repositories could work.
<a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and I were discussing the difficulty of access-controlled replication over the Internet just yesterday, and I'm starting to believe that ATProto has the right balance of ergonomics and good design to make solving this problem much, much easier.</p>\n<p>If you'd like to try this out, then the <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker/\">Knot Docker</a> repository welcomes your issues!</p>\n\n<p>Many thanks to Zicklag and icyphox on <a href=\"https://web.libera.chat/#tangled\">tangled IRC</a> for helping me out with debugging the Knot setup and <a href=\"https://tangled.sh/@tangled.sh/core/commit/477da124ad0bdeeab5b621b81999683256ab7a4b\">fixing bugs in real-time</a>. 12th Mar 2025: updated with <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> comments.</p>",+"content": "<p>I've been an avid user of <a href=\"https://github.com\">GitHub</a> since its launch, and it really has revolutionised how communities come together to work on open source. In recent years though, I find myself utterly overwhelmed by its notifications and want to experiment with <a href=\"https://www.offlineimap.org/github/2016/03/08/github-pr-suck.html\">alternative workflows</a>. This experimentation also has a more serious undertone due to the increasing need for <a href=\"https://www.boell.de/en/2025/01/24/trump-and-big-tech-europes-sovereignty-stake\">data sovereignty</a> and so I'm starting to move my source code to self-hosted solutions that are less reliant on centralised services.</p>\n<p>This has also come up persistently over the years in the <a href=\"https://ocaml.org\">OCaml</a> community, with questions over why participation in packaging <a href=\"https://discuss.ocaml.org/t/publishing-without-github/3232\">requires a GitHub account</a> ever since the <a href=\"https://anil.recoil.org/notes/opam-1-1-beta\">early days</a> of opam. 
I've never found a good answer... until now, with the launch of an exciting <a href=\"https://tangled.sh\">new service</a> that's built over the same protocol that <a href=\"https://bsky.app\">Bluesky</a> uses.\nAs I <a href=\"https://anil.recoil.org/notes/atproto-for-fun-and-blogging\">noted</a> a few weeks ago, the <a href=\"https://atproto.com/\">ATProto</a> can be used for more than just microblogging. It can also be an <em>identity</em> layer, across which other applications can be built which reuse the social fabric from Bluesky accounts.</p>\n<p>"<a href=\"https://tangled.sh\">Tangled</a>" is a new service launched (just yesterday!) by <a href=\"https://tangled.sh/@oppili.bsky.social\">opilli</a> and <a href=\"https://tangled.sh/@icyphox.sh\">icyphox</a> to manage Git repositories. I'm having a lot of fun trying it out, even in its early alpha stages! The coolest thing about Tangled is that you can self-host your own <a href=\"https://blog.tangled.sh/intro\">knots</a>, which control where the source code repositories are actually stored.</p>\n<h2><a href=\"https://anil.recoil.org/#self-hosting-my-own-tangled-knot\"></a>Self hosting my own Tangled knot</h2>\n<p>I set up one of the first knots on the network on <code>git.recoil.org</code>, and can now directly share my source code online without depending on GitHub! For example, this is the <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\">knot-docker</a> container config which you can use to deploy your own version of this.</p>\n<p><a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-1.webp\" title=\"\">\n </a></p>\n<p>It looks pretty similar to GitHub doesn't it? The first key difference is the login on the top-right, which is the same as my <a href=\"https://bsky.app/profile/anil.recoil.org\">anil.recoil.org</a> account. 
Once you're logged in, the other difference shows up when creating a new Git repository.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-2.webp\" title=\"\">\n</p>\n<p>As you can see, you can not only select the name of the repository, but also <em>where</em> it's going to be stored. I can either put it on the central Tangled knot, or stick it on my own Recoil one. After this, the user experience of cloning is as simple as:</p>\n<pre><code>git clone https://tangled.sh/@anil.recoil.org/knot-docker\ngit clone git@git.recoil.org:anil.recoil.org/knot-docker\n</code></pre>\n<p>In the first case, the central tangled web server proxies the Git contents over HTTP, and for SSH I can just connect directly to my own server. Inside my Knot container, we can see where the Git repositories are stored:</p>\n<pre><code>/home/git # ls -1\ndid:plc:nhyitepp3u4u6fcfboegzcjw\nknotserver.db\nknotserver.db-shm\nknotserver.db-wal\nlog\n</code></pre>\n<p>The <code>did:</code> directory is actually my 'decentralised identifier' from the ATProto, which we can verify by looking up the <a href=\"https://bsky.social/about/blog/4-28-2023-domain-handle-tutorial\">DNS atproto TXT</a> record for my domain:</p>\n<pre><code>$ dig txt _atproto.anil.recoil.org\n;; ANSWER SECTION:\n_atproto.anil.recoil.org. 
10799 IN TXT "did=did:plc:nhyitepp3u4u6fcfboegzcjw"\n</code></pre>\n<p>And then if we navigate into that directory, we can see there are just normal bare git repositories stored on my server.</p>\n<pre><code>/home/git/did:plc:nhyitepp3u4u6fcfboegzcjw/knot-docker # ls -la\ntotal 24\ndrwxr-sr-x 4 git git 4096 Mar 8 19:02 .\ndrwxr-sr-x 4 git git 4096 Mar 8 18:23 ..\n-rw-r--r-- 1 git git 21 Mar 8 18:01 HEAD\n-rw-r--r-- 1 git git 36 Mar 8 18:01 config\ndrwxr-sr-x 17 git git 4096 Mar 8 19:02 objects\ndrwxr-sr-x 4 git git 4096 Mar 8 18:01 refs\n</code></pre>\n<p>This makes the core of Tangled very safe to use, even if the service disappears: I maintain the actual git repositories myself, so I can (e.g.) mirror them to GitHub via a simple cron script.</p>\n<h2><a href=\"https://anil.recoil.org/#collaboration-is-as-simple-as-bluesky\"></a>Collaboration is as simple as Bluesky</h2>\n<p>Tangled has only been out for about a day, so I coopted fellow Recoiler <a href=\"https://nick.recoil.org\">Nick Ludlam</a> to create an account. I added his <a href=\"https://bsky.app/profile/nick.recoil.org\">handle</a> over to the Recoil knot, and that's all it took for him to be able to create repositories on our server.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-5.webp\" title=\"\">\n</p>\n<p>I can also just add people directly to a particular repository, as you can see from the one below on his profile.</p>\n<p><a href=\"https://tangled.sh/@nick.recoil.org\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-3.webp\" title=\"\">\n </a></p>\n<h2><a href=\"https://anil.recoil.org/#the-issue-metadata-is-also-distributed\"></a>The issue metadata is also distributed</h2>\n<p>The real lock-in to code repository management, though, is the metadata around the repository: things like issues, comments and so on. 
Tangled makes it possible to decentralise where this is stored <a href=\"https://www.chiark.greenend.org.uk/~sgtatham/quasiblog/git-no-forge/\">without needing a central Forge</a>, by relaying it all via the ATProto.\nLet's take a look at how this works.</p>\n<p>I <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker/issues/1\">created an issue</a> on knot-docker, and it looks very similar to a GitHub issue. Zicklag on <code>#tangled</code> pointed me to the <a href=\"https://pdsls.dev/\">PDSLS</a> public ATProto browser with which you can browse the actual ATProto records. I can start from my <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw\">DID record</a> and look for the <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue\">sh.tangled.repo.issue</a> collection, and find the <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue/3ljvbt4zni322\">issue URL from earlier</a>. I then prodded <a href=\"https://nick.recoil.org\">Nick Ludlam</a> to leave a comment on the issue, and you can see his <a href=\"https://pdsls.dev/at://did:plc:dr3wsy7hlzgyanewhbw7fj5g/sh.tangled.repo.issue.comment/3ljvdsrlckj22\">sh.tangled.repo.issue.comment</a> in the relay as well.</p>\n<p><a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo.issue/3ljvbt4zni322\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/tangled-ss-4.webp\" title=\"\">\n </a></p>\n<p>Even the <a href=\"https://bsky.app/profile/tangled.sh/post/3ljv6wpioxc2q\">repository stars</a> are on the relay; see for example <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.feed.star/3ljvbtbrhew22\">this</a> entry for <a href=\"https://pdsls.dev/at://did:plc:nhyitepp3u4u6fcfboegzcjw/sh.tangled.repo/3ljv45bhfql22\">knot-docker</a> that I did. 
The Tangled developers just added support for stars <a href=\"https://tangled.sh/@tangled.sh/core/commit/662bd012caec9c2bd2a15e1dcfe184d5b2c49ff9#file-lexicons%2fstar.json\">a few hours ago</a>, and that changeset is a nice way to see how to add a new lexicon entry.</p>\n<p><a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> then reminded me of his project a few years ago to run <a href=\"https://anil.recoil.org/ideas/version-control-matrix\">git pull requests over Matrix chat</a>. It would indeed be very cool if the pull request model on Tangled evolved into something more message-oriented like <a href=\"https://git-scm.com/docs/git-send-email\">git-send-email</a>, in order to let us try out more personalised workflows than GitHub PRs.</p>\n<h2><a href=\"https://anil.recoil.org/#why-this-fits-in-so-well-with-the-rest-of-bluesky\"></a>Why this fits in so well with the rest of Bluesky</h2>\n<p>The ATProto developers also released their <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring\">roadmap for early 2025</a> today, and it aligns really well with some of the production features I would need to completely shift over to a service like Tangled.</p>\n<p>The first, and most vital one, is <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring#auth-scopes\">auth scopes</a> to control the permissions of an app password to only certain operations. Once this is in the protocol, a client to manage Tangled repositories could use a differently privileged password from the main social client.</p>\n<p>Secondly, <a href=\"https://docs.bsky.app/blog/2025-protocol-roadmap-spring#privately-shared-data-and-e2ee-dms\">privately shared data</a> and <a href=\"https://www.ietf.org/blog/mls-secure-and-usable-end-to-end-encryption/\">encrypted DMs using MLS</a> point to how private code repositories could work. 
<a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and I were discussing the difficulty of access-controlled replication over the Internet just yesterday, and I'm starting to believe that ATProto has the right balance of ergonomics and good design to make solving this problem much, much easier.</p>\n<p>If you'd like to try this out, then the <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker/\">Knot Docker</a> repository welcomes your issues!</p>\n\n<p>Many thanks to Zicklag and icyphox on <a href=\"https://web.libera.chat/#tangled\">tangled IRC</a> for helping me out with debugging the Knot setup and <a href=\"https://tangled.sh/@tangled.sh/core/commit/477da124ad0bdeeab5b621b81999683256ab7a4b\">fixing bugs in real-time</a>. 12th Mar 2025: updated with <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> comments.</p>",
+18
avsm/notes_docker-and-opam.json
···+"summary": "<p>Now that OCaml 4.01 has been released, there is a frenzy of commit\nactivity in the <a href=\"https://github.com/ocaml/ocaml\">development trunk</a> of\nOCaml as the new features for 4.02 are all integrated. These include\nsome enhancements to the type system such as\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">injectivity</a>,\n<a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module aliases</a> and\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">extension\npoints</a> as a\nsimpler alternative to syntax extensions.</p>\n<p>The best way to ensure that these all play well together is to test\nagainst the ever-growing OPAM package database as early as possible.\nWhile we\u2019re working on more elaborate <a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">continuous\nbuilding</a>\nsolutions, it\u2019s far easier if a developer can quickly run a bulk build\non their own system. The difficulty with doing this is that you also\nneed to install all the external dependencies (e.g. libraries and header\nfiles for bindings) needed by the thousands of packages in OPAM.</p>\n<p>Enter a hip new lightweight container system called\n<a href=\"http://docker.io\">Docker</a>. While containers aren\u2019t quite as secure as\n<a href=\"http://en.wikipedia.org/wiki/Hypervisor\">type-1 hypervisors</a> such as\n<a href=\"http://xenproject.org\">Xen</a>, they are brilliant for spawning lots of\nlightweight tasks such as installing (and reverting) package\ninstallations. 
Docker is still under heavy development, but it didn\u2019t\ntake me long to follow the documentation and put together a\nconfiguration file for creating an OCaml+OPAM image to let OCaml\ndevelopers do these bulk builds.</p>\n<h2><a href=\"https://anil.recoil.org/#a-basic-docker-and-opam-setup\"></a>A basic Docker and OPAM setup</h2>\n<p>I started by spinning up a fresh Ubuntu Saucy VM on the <a href=\"https://rackspace.com\">Rackspace\nCloud</a>, which has a recent enough kernel version\nto work out-of-the-box with Docker. The <a href=\"http://docs.docker.io/en/latest/installation/ubuntulinux/#ubuntu-raring\">installation\ninstructions</a>\nworked without any problems.</p>\n<p>Next, I created a\n<a href=\"http://docs.docker.io/en/latest/use/builder/#dockerfiles-for-images\">Dockerfile</a>\nto represent the set of commands needed to prepare the base Ubuntu image\nwith an OPAM and OCaml environment. You can find the complete repository\nonline at\n<strong><a href=\"https://github.com/avsm/docker-opam\">https://github.com/avsm/docker-opam</a></strong>.\nLet\u2019s walk through the <code>Dockerfile</code> in chunks.</p>\n<pre><code>FROM ubuntu:latest\nMAINTAINER Anil Madhavapeddy <anil@recoil.org>\nRUN apt-get -y install sudo pkg-config git build-essential m4 software-properties-common\nRUN git config --global user.email "docker@example.com"\nRUN git config --global user.name "Docker CI"\nRUN apt-get -y install python-software-properties\nRUN echo "yes" | add-apt-repository ppa:avsm/ocaml41+opam11\nRUN apt-get -y update -qq\nRUN apt-get -y install -qq ocaml ocaml-native-compilers camlp4-extra opam\nADD opam-installext /usr/bin/opam-installext\n</code></pre>\n<p>This sets up a basic OCaml and OPAM environment using the same Ubuntu\nPPAs as the <a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">Travis\ninstructions</a> I\nposted a few months ago. 
The final command adds a helper script which\nuses the new <code>depexts</code> feature in OPAM 1.1 to also install operating\nsystem packages that are required by some libraries. I\u2019ll explain in\nmore detail in a later post, but for now all you need to know is that\n<code>opam installext ctypes</code> will not only install the <code>ctypes</code> OCaml\nlibrary, but also invoke <code>apt-get install libffi-dev</code> to install the\nrelevant development library first.</p>\n<pre><code>RUN adduser --disabled-password --gecos "" opam\nRUN passwd -l opam\nADD opamsudo /etc/sudoers.d/opam\nUSER opam\nENV HOME /home/opam\nENV OPAMVERBOSE 1\nENV OPAMYES 1\n</code></pre>\n<p>The next chunk of the Dockerfile configures the OPAM environment by\ninstalling a non-root user (several OPAM packages fail with an error if\nconfigured as root). We also set the <code>OPAMVERBOSE</code> and <code>OPAMYES</code>\nvariables to ensure we get the full build logs and non-interactive use,\nrespectively.</p>\n<h2><a href=\"https://anil.recoil.org/#running-the-bulk-tests\"></a>Running the bulk tests</h2>\n<p>We\u2019re now set to build a Docker environment for the exact test that we\nwant to run.</p>\n<pre><code>RUN opam init git://github.com/mirage/opam-repository#add-depexts-11\nRUN opam install ocamlfind\nENTRYPOINT ["/usr/bin/opam-installext"]\n</code></pre>\n<p>This last addition to the <code>Dockerfile</code> initializes our OPAM package set.\nThis is using my development branch which adds a <a href=\"https://github.com/ocaml/opam-repository/pull/1240\">massive\ndiff</a> to populate\nthe OPAM metadata with external dependency information for Ubuntu and\nDebian.</p>\n<p>Building an image from this is a single command:</p>\n<pre><code>$ docker build -t avsm/opam github.com/avsm/docker-opam\n</code></pre>\n<p>The <code>ENTRYPOINT</code> tells Docker that our wrapper script is the \u201croot\ncommand\u201d to run for this container, so we can install a package in a\ncontainer by doing 
this:</p>\n<pre><code>$ docker run avsm/opam ctypes\n</code></pre>\n<p>The complete output is logged to stdout and stderr, so we can capture\nthat as easily as a normal shell command. With all these pieces in\nplace, my local bulk build shell script is trivial:</p>\n<pre><code>pkg=`opam list -s -a`\nRUN=5\nmkdir -p /log/$RUN/raw /log/$RUN/err /log/$RUN/ok\nfor p in $pkg; do\n docker run avsm/opam $p > /log/$RUN/raw/$p 2>&1\n if [ $? != 0 ]; then\n ln -s /log/$RUN/raw/$p /log/$RUN/err/$p\n else\n ln -s /log/$RUN/raw/$p /log/$RUN/ok/$p\n fi\ndone \n</code></pre>\n<p>This iterates through a local package set and serially builds\neverything. Future enhancements I\u2019m working on: parallelising these on a\nmulticore box, and having a <a href=\"http://blog.docker.io/2013/10/docker-0-6-5-links-container-naming-advanced-port-redirects-host-integration/\">linked\ncontainer</a>\nthat hosts a local package repository so that we don\u2019t require a lot of\nexternal bandwidth. Stay tuned!</p>",+"content": "<p>Now that OCaml 4.01 has been released, there is a frenzy of commit\nactivity in the <a href=\"https://github.com/ocaml/ocaml\">development trunk</a> of\nOCaml as the new features for 4.02 are all integrated. 
These include\nsome enhancements to the type system such as\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">injectivity</a>,\n<a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module aliases</a> and\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">extension\npoints</a> as a\nsimpler alternative to syntax extensions.</p>\n<p>The best way to ensure that these all play well together is to test\nagainst the ever-growing OPAM package database as early as possible.\nWhile we\u2019re working on more elaborate <a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">continuous\nbuilding</a>\nsolutions, it\u2019s far easier if a developer can quickly run a bulk build\non their own system. The difficulty with doing this is that you also\nneed to install all the external dependencies (e.g. libraries and header\nfiles for bindings) needed by the thousands of packages in OPAM.</p>\n<p>Enter a hip new lightweight container system called\n<a href=\"http://docker.io\">Docker</a>. While containers aren\u2019t quite as secure as\n<a href=\"http://en.wikipedia.org/wiki/Hypervisor\">type-1 hypervisors</a> such as\n<a href=\"http://xenproject.org\">Xen</a>, they are brilliant for spawning lots of\nlightweight tasks such as installing (and reverting) package\ninstallations. Docker is still under heavy development, but it didn\u2019t\ntake me long to follow the documentation and put together a\nconfiguration file for creating an OCaml+OPAM image to let OCaml\ndevelopers do these bulk builds.</p>\n<h2><a href=\"https://anil.recoil.org/#a-basic-docker-and-opam-setup\"></a>A basic Docker and OPAM setup</h2>\n<p>I started by spinning up a fresh Ubuntu Saucy VM on the <a href=\"https://rackspace.com\">Rackspace\nCloud</a>, which has a recent enough kernel version\nto work out-of-the-box with Docker. 
The <a href=\"http://docs.docker.io/en/latest/installation/ubuntulinux/#ubuntu-raring\">installation\ninstructions</a>\nworked without any problems.</p>\n<p>Next, I created a\n<a href=\"http://docs.docker.io/en/latest/use/builder/#dockerfiles-for-images\">Dockerfile</a>\nto represent the set of commands needed to prepare the base Ubuntu image\nwith an OPAM and OCaml environment. You can find the complete repository\nonline at\n<strong><a href=\"https://github.com/avsm/docker-opam\">https://github.com/avsm/docker-opam</a></strong>.\nLet\u2019s walk through the <code>Dockerfile</code> in chunks.</p>\n<pre><code>FROM ubuntu:latest\nMAINTAINER Anil Madhavapeddy <anil@recoil.org>\nRUN apt-get -y install sudo pkg-config git build-essential m4 software-properties-common\nRUN git config --global user.email "docker@example.com"\nRUN git config --global user.name "Docker CI"\nRUN apt-get -y install python-software-properties\nRUN echo "yes" | add-apt-repository ppa:avsm/ocaml41+opam11\nRUN apt-get -y update -qq\nRUN apt-get -y install -qq ocaml ocaml-native-compilers camlp4-extra opam\nADD opam-installext /usr/bin/opam-installext\n</code></pre>\n<p>This sets up a basic OCaml and OPAM environment using the same Ubuntu\nPPAs as the <a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">Travis\ninstructions</a> I\nposted a few months ago. The final command adds a helper script which\nuses the new <code>depexts</code> feature in OPAM 1.1 to also install operating\nsystem packages that are required by some libraries. 
I\u2019ll explain in\nmore detail in a later post, but for now all you need to know is that\n<code>opam installext ctypes</code> will not only install the <code>ctypes</code> OCaml\nlibrary, but also invoke <code>apt-get install libffi-dev</code> to install the\nrelevant development library first.</p>\n<pre><code>RUN adduser --disabled-password --gecos "" opam\nRUN passwd -l opam\nADD opamsudo /etc/sudoers.d/opam\nUSER opam\nENV HOME /home/opam\nENV OPAMVERBOSE 1\nENV OPAMYES 1\n</code></pre>\n<p>The next chunk of the Dockerfile configures the OPAM environment by\ninstalling a non-root user (several OPAM packages fail with an error if\nconfigured as root). We also set the <code>OPAMVERBOSE</code> and <code>OPAMYES</code>\nvariables to ensure we get the full build logs and non-interactive use,\nrespectively.</p>\n<h2><a href=\"https://anil.recoil.org/#running-the-bulk-tests\"></a>Running the bulk tests</h2>\n<p>We\u2019re now set to build a Docker environment for the exact test that we\nwant to run.</p>\n<pre><code>RUN opam init git://github.com/mirage/opam-repository#add-depexts-11\nRUN opam install ocamlfind\nENTRYPOINT ["/usr/bin/opam-installext"]\n</code></pre>\n<p>This last addition to the <code>Dockerfile</code> initializes our OPAM package set.\nThis is using my development branch which adds a <a href=\"https://github.com/ocaml/opam-repository/pull/1240\">massive\ndiff</a> to populate\nthe OPAM metadata with external dependency information for Ubuntu and\nDebian.</p>\n<p>Building an image from this is a single command:</p>\n<pre><code>$ docker build -t avsm/opam github.com/avsm/docker-opam\n</code></pre>\n<p>The <code>ENTRYPOINT</code> tells Docker that our wrapper script is the \u201croot\ncommand\u201d to run for this container, so we can install a package in a\ncontainer by doing this:</p>\n<pre><code>$ docker run avsm/opam ctypes\n</code></pre>\n<p>The complete output is logged to stdout and stderr, so we can capture\nthat as easily as a normal shell 
command. With all these pieces in\nplace, my local bulk build shell script is trivial:</p>\n<pre><code>pkg=`opam list -s -a`\nRUN=5\nmkdir -p /log/$RUN/raw /log/$RUN/err /log/$RUN/ok\nfor p in $pkg; do\n docker run avsm/opam $p > /log/$RUN/raw/$p 2>&1\n if [ $? != 0 ]; then\n ln -s /log/$RUN/raw/$p /log/$RUN/err/$p\n else\n ln -s /log/$RUN/raw/$p /log/$RUN/ok/$p\n fi\ndone \n</code></pre>\n<p>This iterates through a local package set and serially builds\neverything. Future enhancements I\u2019m working on: parallelising these on a\nmulticore box, and having a <a href=\"http://blog.docker.io/2013/10/docker-0-6-5-links-container-naming-advanced-port-redirects-host-integration/\">linked\ncontainer</a>\nthat hosts a local package repository so that we don\u2019t require a lot of\nexternal bandwidth. Stay tuned!</p>",
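The post closes by mentioning parallelisation on a multicore box as future work. Below is a hedged sketch of one way that serial loop could be fanned out, an assumption on my part rather than the author's implementation: `xargs -P` runs several `docker run avsm/opam` builds concurrently while keeping the same `raw`/`ok`/`err` log layout. The `DRY_RUN` switch, the `/tmp/opam-bulk` log path, and the three sample package names are hypothetical additions for illustration.

```shell
#!/bin/sh
# Hypothetical parallel variant of the post's serial bulk-build loop.
set -eu

# Same log layout as the serial script: raw build logs plus ok/err symlinks.
LOGDIR=${LOGDIR:-/tmp/opam-bulk/5}
mkdir -p "$LOGDIR/raw" "$LOGDIR/err" "$LOGDIR/ok"
export LOGDIR

# Build one package in its own container; DRY_RUN=1 just records the command.
build_one='
  p=$1
  if [ -n "${DRY_RUN:-}" ]; then
    echo "docker run avsm/opam $p" > "$LOGDIR/raw/$p"; status=0
  else
    docker run avsm/opam "$p" > "$LOGDIR/raw/$p" 2>&1; status=$?
  fi
  if [ "$status" -eq 0 ]; then
    ln -sf "$LOGDIR/raw/$p" "$LOGDIR/ok/$p"
  else
    ln -sf "$LOGDIR/raw/$p" "$LOGDIR/err/$p"
  fi
'

# A real run would pipe in `opam list -s -a`; three sample names stand in.
# xargs -P4 keeps up to four builds in flight (drop DRY_RUN=1 for a real run).
printf '%s\n' ctypes lwt cstruct |
  DRY_RUN=1 xargs -n1 -P4 sh -c "$build_one" sh
```

Since each build runs in a throwaway container with its own log file, the only shared state is the log directory, so the loop parallelises without further locking.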
+18
avsm/notes_docker-buys-unikernel-systems.json
+18
avsm/notes_docker-buys-unikernel-systems.json
···+"summary": "<p>My startup <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernel Systems</a> was acquired by <a href=\"https://anil.recoil.org/docker.com\">Docker</a>, and I'll\nbe joining and setting up a UK branch of Docker along with the rest of my team.</p>\n<blockquote>\n<p>'Just like we did with containers, we are interested in democratizing that technology, making it available and useful to the millions of developers and IT pros out there,' said <a href=\"https://www.linkedin.com/in/solomonhykes\">Solomon Hykes</a>, founder and chief technology officer for Docker. 'Unikernels allow you to basically get rid of the operating system, and instead compile into the application the small bits of the operating system it really needs.'\n-- <a href=\"https://thenewstack.io/docker-buys-unikernel-systems-plans-bring-unikernels-data-center/\">The New Stack</a></p>\n</blockquote>\n<p>You can also see an announcement from me explaining the background story:</p>\n<p></p><div></div><p></p>",+"content": "<p>My startup <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernel Systems</a> was acquired by <a href=\"https://anil.recoil.org/docker.com\">Docker</a>, and I'll\nbe joining and setting up a UK branch of Docker along with the rest of my team.</p>\n<blockquote>\n<p>'Just like we did with containers, we are interested in democratizing that technology, making it available and useful to the millions of developers and IT pros out there,' said <a href=\"https://www.linkedin.com/in/solomonhykes\">Solomon Hykes</a>, founder and chief technology officer for Docker. 
'Unikernels allow you to basically get rid of the operating system, and instead compile into the application the small bits of the operating system it really needs.'\n-- <a href=\"https://thenewstack.io/docker-buys-unikernel-systems-plans-bring-unikernels-data-center/\">The New Stack</a></p>\n</blockquote>\n<p>You can also see an announcement from me explaining the background story:</p>\n<p></p><div></div><p></p>",
+18
avsm/notes_dreamplug-debian-and-ocaml.json
+18
avsm/notes_dreamplug-debian-and-ocaml.json
···+"summary": "<p>I\u2019ve been meaning to play with <a href=\"http://www.plugcomputer.org/\">Plug\nComputers</a> for some time now, as I need a\nlow-power embedded system around the house. I recently bought a <a href=\"http://soekris.com/products/net6501.html\">Soekris\nNet6501</a> (a pretty powerful\nIntel CPU, that even has VT support), but had annoying\n<a href=\"http://marc.info/?l=soekris-tech&m=132915532912206&w=2\">issues</a> getting\nit working reliably. I ordered an ARM-based\n<a href=\"http://www.newit.co.uk/shop/products.php?cat=21\">Dreamplug</a> as an\nalternative (and as a bonus, the Dreamplug is 6x cheaper than the\nSoekris!). Here are my notes on getting it to work.</p>\n<p><a href=\"http://www.flickr.com/photos/tlamer/5693063642/\" title=\"dreamplug by tlamer, on Flickr\"><img alt=\"dreamplug\" src=\"http://farm6.staticflickr.com/5230/5693063642_47aa7c4c99.jpg\"></a></p>\n<p>Requirements:</p>\n<ul>\n<li>Aside from the Dreamplug itself, make sure you order the optional\nJTAG module. This provides a serial console that is essential to\ngetting any development done with it.</li>\n<li>I also grabbed the extra 16GB Class 10 SLC SD Card, to act as my\nhome directory.</li>\n<li>You will also need another functional system running Debian (or a VM\non your Mac; whatever is easiest). The JTAG drivers for the USB\nserial are easiest to get running on Linux.</li>\n</ul>\n<p>The Dreamplug arrived with a working installation, but running the\nabsolutely ancient Debian Lenny. A dist-upgrade through to Wheezy led to\nbricking it almost immediately, and so I did a fresh installation from\nscratch.</p>\n<p>For a fresh installation, place a USB stick of suitable size (greater\nthan 2GB is best) into your functional Debian installation. Then:</p>\n<ul>\n<li>\n<p>The Marvell bootloader boots from a VFAT partition, so you will need\ntwo partitions. 
The first should be a small <code>fat16</code> (I picked 150MB)\nand the remainder an <code>ext3</code> partition for Linux itself. There are\ngood instructions available on the\n<a href=\"https://trac.torproject.org/projects/tor/wiki/doc/DebianDreamPlug\">Tor/Dreamplug</a>\nwiki which show you how to do this.</p>\n</li>\n<li>\n<p>I grabbed the latest kernel (at this time, 3.2.7) from\n<a href=\"http://sheeva.with-linux.com/sheeva/3/3.2/3.2.7/\">with-linux</a>, and\ninstalled it with the following commands (assuming your USB stick is\n<code>/dev/sdb</code>).</p>\n<pre><code>$ sudo mount /dev/sdb1 /mnt\n$ sudo cp uImage /mnt\n$ sudo umount /mnt\n</code></pre>\n</li>\n<li>\n<p>You now need to use <code>debootstrap</code> to install a fresh root image.\nBecause it is ARM and your main PC is probably an x86, you will need\nto setup the QEMU CPU emulator. An extremely cool feature of QEMU is\nthat it can do <a href=\"http://wiki.debian.org/QemuUserEmulation\">transparent\nemulation</a> of foreign\nbinaries, so you can chroot directly into the ARM filesystem and run\ncommands as if they were x86. 
The <code>qemu-debootstrap</code> command will\ntake care of this for you, if you perform the steps below (again,\nassuming your USB stick is <code>/dev/sdb</code>).</p>\n<pre><code>$ sudo apt-get install qemu-user-static debootstrap\n$ sudo mount /dev/sdb2 /mnt\n$ sudo mkdir -p /mnt/usr/bin\n$ sudo cp /usr/bin/qemu-arm-static /mnt/usr/bin/\n$ sudo qemu-debootstrap --arch=armel wheezy /mnt http://ftp.uk.debian.org/debian/\n</code></pre>\n</li>\n<li>\n<p>Now grab the kernel modules from the same place as your uImage (for\n3.2.7, from\n<a href=\"http://sheeva.with-linux.com/sheeva/3/3.2/3.2.7/sheeva-3.2.7-Modules.tar.gz\">here</a>).\nThen, chroot into your fresh installation and untar them.</p>\n<pre><code>$ cd /mnt\n$ sudo tar -zxvf ~/sheeva-3.2.7-Modules.tar.gz\n$ sudo chroot /mnt\n$ depmod -a\n# edit /etc/network/interfaces\n# edit /etc/resolv.conf\n</code></pre>\n</li>\n<li>\n<p>The wireless setup involves some extremely crap firmware which\nrelentlessly kernel panicked for me, so I just disabled it by adding\nthe following to <code>/etc/modprobe.d/dpwifiap.conf</code>, as I only want\nwired access:</p>\n<pre><code>blacklist libertas\nblacklist libertas_sdio\n</code></pre>\n</li>\n<li>\n<p>From there on, put the USB stick into the Dreamplug, and follow the\nrest of the boot instructions from the <a href=\"https://trac.torproject.org/projects/tor/wiki/doc/DebianDreamPlug\">Tor\nwiki</a>\nto interact with the Marvell BIOS and boot from the USB stick. I\ncopied the contents of the USB stick onto the internal MicroSD, and\nit all boots standalone now.</p>\n</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#ocaml-on-arm\"></a>OCaml on ARM</h2>\n<p>One of the reasons I wanted an ARM-based setup is to experiment with the\nOCaml native code generation. 
<a href=\"http://www.home.unix-ag.org/bmeurer/index.html\">Benedikt\nMeurer</a> has been doing\nsome excellent work on <a href=\"http://old.nabble.com/New-ARM-backend-merged-into-trunk-td33262083.html\">improving code\ngeneration</a>\nfor embedded systems, including support for 16-bit Thumb code, exception\nbacktraces, and dynamic linking and profiling.</p>\n<p>Once Linux was up and running, compiling up the latest ocaml-trunk was\nstraightforward.</p>\n<pre><code> $ sudo apt-get install build-essential git\n $ git clone http://github.com/OCamlPro/ocp-ocaml svn-trunk\n $ cd svn-trunk\n $ ./configure && make world opt opt.opt install\n</code></pre>\n<p>This compiles the bytecode and native code compilers, and then compiles\nthem again using the native code generator. This takes a while to do on\nthe poor little ARM CPU. Once that finished, I compiled up a few simple\nmodules and they worked great! Since the trunk of OCaml is a development\nbranch, you may run into a few packaging issues (use the very latest\nOASIS to regenerate any <code>setup.ml</code>, and you will need a small patch\nuntil <a href=\"http://caml.inria.fr/mantis/view.php?id=5503\">PR 5503</a> is\napplied).</p>\n<p>Incidentally, if anyone is interested in working on a\n<a href=\"http://openmirage.org\">Mirage</a> port to ARM as an internship in the\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/\">Cambridge Computer Lab</a>,\ndo get in touch with me...</p>",+"content": "<p>I\u2019ve been meaning to play with <a href=\"http://www.plugcomputer.org/\">Plug\nComputers</a> for some time now, as I need a\nlow-power embedded system around the house. I recently bought a <a href=\"http://soekris.com/products/net6501.html\">Soekris\nNet6501</a> (a pretty powerful\nIntel CPU, that even has VT support), but had annoying\n<a href=\"http://marc.info/?l=soekris-tech&m=132915532912206&w=2\">issues</a> getting\nit working reliably. 
I ordered an ARM-based\n<a href=\"http://www.newit.co.uk/shop/products.php?cat=21\">Dreamplug</a> as an\nalternative (and as a bonus, the Dreamplug is 6x cheaper than the\nSoekris!). Here are my notes on getting it to work.</p>\n<p><a href=\"http://www.flickr.com/photos/tlamer/5693063642/\" title=\"dreamplug by tlamer, on Flickr\"><img alt=\"dreamplug\" src=\"http://farm6.staticflickr.com/5230/5693063642_47aa7c4c99.jpg\"></a></p>\n<p>Requirements:</p>\n<ul>\n<li>Aside from the Dreamplug itself, make sure you order the optional\nJTAG module. This provides a serial console that is essential to\ngetting any development done with it.</li>\n<li>I also grabbed the extra 16GB Class 10 SLC SD Card, to act as my\nhome directory.</li>\n<li>You will also need another functional system running Debian (or a VM\non your Mac; whatever is easiest). The JTAG drivers for the USB\nserial are easiest to get running on Linux.</li>\n</ul>\n<p>The Dreamplug arrived with a working installation, but running the\nabsolutely ancient Debian Lenny. A dist-upgrade through to Wheezy led to\nbricking it almost immediately, and so I did a fresh installation from\nscratch.</p>\n<p>For a fresh installation, place a USB stick of suitable size (greater\nthan 2GB is best) into your functional Debian installation. Then:</p>\n<ul>\n<li>\n<p>The Marvell bootloader boots from a VFAT partition, so you will need\ntwo partitions. The first should be a small <code>fat16</code> (I picked 150MB)\nand the remainder an <code>ext3</code> partition for Linux itself. 
There are\ngood instructions available on the\n<a href=\"https://trac.torproject.org/projects/tor/wiki/doc/DebianDreamPlug\">Tor/Dreamplug</a>\nwiki which show you how to do this.</p>\n</li>\n<li>\n<p>I grabbed the latest kernel (at this time, 3.2.7) from\n<a href=\"http://sheeva.with-linux.com/sheeva/3/3.2/3.2.7/\">with-linux</a>, and\ninstalled it with the following commands (assuming your USB stick is\n<code>/dev/sdb</code>).</p>\n<pre><code>$ sudo mount /dev/sdb1 /mnt\n$ sudo cp uImage /mnt\n$ sudo umount /mnt\n</code></pre>\n</li>\n<li>\n<p>You now need to use <code>debootstrap</code> to install a fresh root image.\nBecause it is ARM and your main PC is probably an x86, you will need\nto setup the QEMU CPU emulator. An extremely cool feature of QEMU is\nthat it can do <a href=\"http://wiki.debian.org/QemuUserEmulation\">transparent\nemulation</a> of foreign\nbinaries, so you can chroot directly into the ARM filesystem and run\ncommands as if they were x86. The <code>qemu-debootstrap</code> command will\ntake care of this for you, if you perform the steps below (again,\nassuming your USB stick is <code>/dev/sdb</code>).</p>\n<pre><code>$ sudo apt-get install qemu-user-static debootstrap\n$ sudo mount /dev/sdb2 /mnt\n$ sudo mkdir -p /mnt/usr/bin\n$ sudo cp /usr/bin/qemu-arm-static /mnt/usr/bin/\n$ sudo qemu-debootstrap --arch=armel wheezy /mnt http://ftp.uk.debian.org/debian/\n</code></pre>\n</li>\n<li>\n<p>Now grab the kernel modules from the same place as your uImage (for\n3.2.7, from\n<a href=\"http://sheeva.with-linux.com/sheeva/3/3.2/3.2.7/sheeva-3.2.7-Modules.tar.gz\">here</a>).\nThen, chroot into your fresh installation and untar them.</p>\n<pre><code>$ cd /mnt\n$ sudo tar -zxvf ~/sheeva-3.2.7-Modules.tar.gz\n$ sudo chroot /mnt\n$ depmod -a\n# edit /etc/network/interfaces\n# edit /etc/resolv.conf\n</code></pre>\n</li>\n<li>\n<p>The wireless setup involves some extremely crap firmware which\nrelentlessly kernel panicked for me, so I just disabled it by 
adding\nthe following to <code>/etc/modprobe.d/dpwifiap.conf</code>, as I only want\nwired access:</p>\n<pre><code>blacklist libertas\nblacklist libertas_sdio\n</code></pre>\n</li>\n<li>\n<p>From there on, put the USB stick into the Dreamplug, and follow the\nrest of the boot instructions from the <a href=\"https://trac.torproject.org/projects/tor/wiki/doc/DebianDreamPlug\">Tor\nwiki</a>\nto interact with the Marvell BIOS and boot from the USB stick. I\ncopied the contents of the USB stick onto the internal MicroSD, and\nit all boots standalone now.</p>\n</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#ocaml-on-arm\"></a>OCaml on ARM</h2>\n<p>One of the reasons I wanted an ARM-based setup is to experiment with the\nOCaml native code generation. <a href=\"http://www.home.unix-ag.org/bmeurer/index.html\">Benedikt\nMeurer</a> has been doing\nsome excellent work on <a href=\"http://old.nabble.com/New-ARM-backend-merged-into-trunk-td33262083.html\">improving code\ngeneration</a>\nfor embedded systems, including support for 16-bit Thumb code, exception\nbacktraces, and dynamic linking and profiling.</p>\n<p>Once Linux was up and running, compiling up the latest ocaml-trunk was\nstraightforward.</p>\n<pre><code> $ sudo apt-get install build-essential git\n $ git clone http://github.com/OCamlPro/ocp-ocaml svn-trunk\n $ cd svn-trunk\n $ ./configure && make world opt opt.opt install\n</code></pre>\n<p>This compiles the bytecode and native code compilers, and then compiles\nthem again using the native code generator. This takes a while to do on\nthe poor little ARM CPU. Once that finished, I compiled up a few simple\nmodules and they worked great! 
Since the trunk of OCaml is a development\nbranch, you may run into a few packaging issues (use the very latest\nOASIS to regenerate any <code>setup.ml</code>, and you will need a small patch\nuntil <a href=\"http://caml.inria.fr/mantis/view.php?id=5503\">PR 5503</a> is\napplied).</p>\n<p>Incidentally, if anyone is interested in working on a\n<a href=\"http://openmirage.org\">Mirage</a> port to ARM as an internship in the\n<a href=\"http://www.cl.cam.ac.uk/research/srg/netos/\">Cambridge Computer Lab</a>,\ndo get in touch with me...</p>",
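The transparent-emulation step earlier hinges on `qemu-arm-static` being copied into the target root before chrooting, or every ARM binary fails with a cryptic "No such file or directory". A minimal sketch of a pre-chroot sanity check; `check_chroot_ready` is a hypothetical helper, not part of the original notes.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check an ARM root filesystem before
# chrooting into it, per the debootstrap steps above. It verifies that
# qemu-arm-static was copied in and that the tree looks debootstrapped.
check_chroot_ready() {
  root=$1
  if [ ! -x "$root/usr/bin/qemu-arm-static" ]; then
    echo "missing qemu-arm-static"
    return 1
  fi
  if [ ! -d "$root/etc" ]; then
    echo "not a debootstrapped root"
    return 1
  fi
  echo "ready"
}
```

Usage would be along the lines of `check_chroot_ready /mnt && sudo chroot /mnt`.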
+18
avsm/notes_eeg-interns-2025.json
+18
avsm/notes_eeg-interns-2025.json
···+"summary": "<p>The exam marking is over, and a glorious Cambridge summer awaits! This year, we\nhave a sizeable cohort of undergraduate and graduate interns joining us from\nnext week.</p>\n<p>This note serves as a point of coordination to keep track of what's\ngoing on, and I'll update it as we get ourselves organised.\nIf you're an intern, then I highly recommend you take the time to carefully\nread through all of this, starting with <a href=\"https://anil.recoil.org/#who-we-all-are-this-summer\">who we are</a>,\nsome <a href=\"https://anil.recoil.org/#ground-rules\">ground rules</a>, <a href=\"https://anil.recoil.org/#where-we-will-work\">where we will work</a>,\n<a href=\"https://anil.recoil.org/#registering-on-chat-channels\">how we chat</a>, <a href=\"https://anil.recoil.org/#how-you-will-get-paid\">how to get paid</a>, and of course <a href=\"https://anil.recoil.org/#summer-social-activities\">social activities</a> to make sure we have some fun!</p>\n<h2><a href=\"https://anil.recoil.org/#who-we-all-are-this-summer\"></a>Who we all are this summer</h2>\n<p>We're working on quite the diversity of projects this summer, ranging from classic\ncomputer systems and programming problems all the way through to environmental\nscience. 
Here's a recap of what's going on.</p>\n<p>First we're working against the <a href=\"https://anil.recoil.org/projects/ce\">evidence database</a> we've been building for the past couple of years:</p>\n<ul>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/ai-assisted-inclusion-criteria\">Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis</a>"</em> with <a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a>, supervised by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/accurate-summarisation-for-ce\">Accurate summarisation of threats for conservation evidence literature</a>"</em> with <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a>, supervised by <a href=\"https://toao.com\">Sadiq Jaffer</a> following up her successful MPhil submission.</li>\n</ul>\n<p>We're then heading into <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> and working on some mapping projects:</p>\n<ul>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/cairngorms-connect-habitats\">Habitat mapping of the Cairngorms Connect restoration area</a>"</em> with <a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a>, supervised by <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://eo.conservation.cam.ac.uk/people/aland-chan/\">Aland Chan</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">Mapping urban and rural British hedgehogs</a>"</em> with <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>, supervised by <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a>, as well as writing up his MPhil dissertation on <em>"<a href=\"https://anil.recoil.org/ideas/walkability-for-osm\">Enhancing Navigation Algorithms with Semantic Embeddings</a>"</em></li>\n<li><em>"<a 
href=\"https://anil.recoil.org/ideas/validating-anti-poaching-predictions\">Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas</a>"</em> with <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a>, supervised by <a href=\"https://charlesemogor.com\">Charles Emogor</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a></li>\n</ul>\n<p>Dropping down towards <a href=\"https://anil.recoil.org/projects/osmose\">embedded systems</a> and fun "real-world" projects, we have:</p>\n<ul>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">Affordable digitisation of insect collections using photogrammetry</a>"</em> with <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a>, <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a> and <a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, supervised by <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-edgar-turner\">Edgar Turner</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/3d-print-world\">3D printing the planet (or bits of it)</a>"</em> with <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, supervised by <a href=\"https://mynameismwd.org\">Michael Dales</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/embedded-whisper\">Low power audio transcription with Whisper</a>"</em> with <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a> and <em>"<a href=\"https://anil.recoil.org/ideas/battery-free-riotee\">Battery-free wildlife monitoring with Riotee</a>"</em> with <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a>, both supervised by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a></li>\n</ul>\n<p>Going back to classic computer science, we have a few programming language and systems projects:</p>\n<ul>\n<li><em>"<a 
href=\"https://anil.recoil.org/ideas/hazel-to-ocaml-to-hazel\">Bidirectional Hazel to OCaml programming</a>"</em> with <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a>, supervised by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">Effects based scheduling for the OCaml compiler pipeline</a>"</em> with <a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <em>"<a href=\"https://anil.recoil.org/ideas/ocaml-bytecode-native-ffi\">Runtimes \u00e0 la carte: crossloading native and bytecode OCaml</a>"</em> with <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a>, both supervised by <a href=\"https://github.com/dra27\">David Allsopp</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/zfs-filesystem-perf\">ZFS replication strategies with encryption</a>"</em> with <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a>, supervised by <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#ground-rules\"></a>Ground rules</h2>\n<p>Since there are so many of us this summer, it's imperative that you're all\n<strong>proactive about communicating</strong> any problems or clarifications you need. If something\nhere doesn't make sense, or you have a better idea, then just reach out to any\nof the supervisors or me directly!</p>\n<p>Do also take time to <strong>learn from each other</strong>. Read up on not just your own project in the\nlist above, but take some time to read the remainder so that you have a sense of what everyone\nis working on. When you see each other, it'll be much easier to chat about what's going\non and find opportunities for commonality.</p>\n<p>The projects above have been carefully selected to <strong>not be on the critical path</strong> for any\ndeadlines. 
If it's not going well from your perspective, then it's ok to take a step back\nand figure out why! We're here to learn and discover things, so take the time to do so.</p>\n<h2><a href=\"https://anil.recoil.org/#where-we-will-work\"></a>Where we will work</h2>\n<p>This will be different for everyone, since it depends on which home department will house the project.\nSome of us will be in the David Attenborough Building, on the third floor where the <a href=\"https://www.conservation.cam.ac.uk\">CRI</a> is:</p>\n<ul>\n<li><a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a> and <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a> will be with the <a href=\"https://anil.recoil.org/projects/ce\">CE</a> crew near <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s office</li>\n<li><a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a> and <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a> will hang out with <a href=\"https://coomeslab.org\">David Coomes</a>'s group</li>\n<li><a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> can work near <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>'s office where <a href=\"https://charlesemogor.com\">Charles Emogor</a> works</li>\n</ul>\n<p>Those working on the Zoology Museum itself (<a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a> and <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a>) will have a health and safety induction on Monday with <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> and find offices there.</p>\n<p>The rest of us will be in the Computer Lab over in West Cambridge:</p>\n<ul>\n<li><a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a> will work out of FW15 with <a href=\"https://github.com/dra27\">David Allsopp</a> 
and <a href=\"https://github.com/jonludlam\">Jon Ludlam</a></li>\n<li><a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>, <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a> and <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a> will be in FW15/14. We may need to clear out one desk in FW15 to make room here (just put the stuff in my office in FW16). <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> will work out of my office (FW16) for the summer, and <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> is away for an internship in the USA.</li>\n<li>We'll find somewhere for <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a> either in West Cambridge or in Pembroke soon, depending on preferences and heat!</li>\n</ul>\n<p>It'll probably take a week to let this all shake out, so please do shout if you find yourself stuck in your room and without an office! You should of course arrange to meet your immediate supervisors regularly according to whatever schedule and location works for you.</p>\n<h2><a href=\"https://anil.recoil.org/#how-you-will-get-paid\"></a>How you will get paid</h2>\n<p>The way you get paid weekly is via the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">Cambridge Casual Worker</a> system. This has a few important steps that you <strong>must</strong> pay attention to, or you will not get paid!</p>\n<ul>\n<li><strong>Before starting work</strong> you must go find <a href=\"https://www.cst.cam.ac.uk/people/ac733\">Alicja Zavros</a> in the Computer Lab with your passport or other proof of your right to work in the UK. I've told Alicja that many of you will show up on Monday 30th June morning. It won't take more than a few minutes, as she'll take a photocopy of your ID. 
You should also have registered on the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">CCWS</a> and gotten a login.</li>\n<li><strong>Every Friday</strong> that you do some work, fill in a timesheet on the CCWS. Round this off to a full day (8 hours) and don't do fine-grained timekeeping; just the number of days you've worked is fine. If you don't fill in a timesheet promptly, you won't get paid.</li>\n<li><strong>You must keep a research log with weeknotes</strong> that record what you've been up to. The exact style of weeknotes are entirely up to you, but it's vital that you get in the habit of keeping a log. If you have your own homepage, then send an <a href=\"https://en.wikipedia.org/wiki/Atom_(web_standard)\">Atom feed</a> to me. If you don't, then we have a <a href=\"https://github.com/ucam-eo/interns-2025\">github/ucam-eo/interns-2025</a> which I can give you write access to. It's typical to store your weeknotes in Markdown format, and just a simple subdirectory with a date-based convention is fine. The primary use of weeknotes is to highlight things you've accomplished, areas where you are blocked, and interesting things you have run across. Try to make it a record to your future self, and also a way to let those around you know what's going on. While missing the occasional weeknote is just fine, missing them all will be a problem, so plan your time accordingly. Weeknotes are also <em>not</em> a mechanism to assess anything to do with your progress, but a simple form of communication.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#registering-on-chat-channels\"></a>Registering on chat channels</h2>\n<p>Since we're all going to spread around Cambridge physically, it's important to have a chat channel. 
<a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> is setting up a WhatsApp group for social things (see below), but we also use <a href=\"https://matrix.org\">Matrix</a> as our "hackers choice" for day-to-day messaging.</p>\n<p>We host a Computer Lab <a href=\"https://matrix.org\">Matrix</a> server on which anyone with a valid Raven account can create an account. Since Matrix is a decentralised chat system, it is also possible to use other accounts from third-party servers, and also to join channels elsewhere.</p>\n<p>To create an account:</p>\n<ul>\n<li>In your Matrix client (we most commonly use <a href=\"https://element.io\">Element</a>), select <code>eeg.cl.cam.ac.uk</code> as your homeserver.</li>\n<li>Login with SSO (Single Sign On)</li>\n<li>You should see a Cambridge authentication screen for your CRSID.</li>\n</ul>\n<p>Once you create your account, you will be in the "EEG" Matrix space. A <a href=\"https://matrix.org/blog/2021/05/17/the-matrix-space-beta/\">Matrix space</a> is a collection of channels, and you should join "EEGeneral" as the overall channel for the group. We'll create a separate room just for intern chats. We also have a bot in the room that posts our blogs to the channel, so you can keep up with what the group members are all chattering about. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> runs the CL matrix server, and there are occasional quirks, so just let us know if you run into any problems. I am <code>@avsm:recoil.org</code> on there, not <code>avsm2</code> as I use my personal Matrix for a bunch of stuff.</p>\n<h2><a href=\"https://anil.recoil.org/#summer-social-activities\"></a>Summer social activities</h2>\n<p>It's important to get some downtime this summer and recharge. <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> has been setting up a social group for the interns to hang out together, and we'll organise a punting excursion at some point to get us out to the river. 
Of course, many of us will be travelling this summer (I'm heading off to Botswana in late July for instance), so please do also make suggestions.</p>",+"content": "<p>The exam marking is over, and a glorious Cambridge summer awaits! This year, we\nhave a sizeable cohort of undergraduate and graduate interns joining us from\nnext week.</p>\n<p>This note serves as a point of coordination to keep track of what's\ngoing on, and I'll update it as we get ourselves organised.\nIf you're an intern, then I highly recommend you take the time to carefully\nread through all of this, starting with <a href=\"https://anil.recoil.org/#who-we-all-are-this-summer\">who we are</a>,\nsome <a href=\"https://anil.recoil.org/#ground-rules\">ground rules</a>, <a href=\"https://anil.recoil.org/#where-we-will-work\">where we will work</a>,\n<a href=\"https://anil.recoil.org/#registering-on-chat-channels\">how we chat</a>, <a href=\"https://anil.recoil.org/#how-you-will-get-paid\">how to get paid</a>, and of course <a href=\"https://anil.recoil.org/#summer-social-activities\">social activities</a> to make sure we have some fun!</p>\n<h2><a href=\"https://anil.recoil.org/#who-we-all-are-this-summer\"></a>Who we all are this summer</h2>\n<p>We're working on quite the diversity of projects this summer, ranging from classic\ncomputer systems and programming problems all the way through to environmental\nscience. 
Here's a recap of what's going on.</p>\n<p>First we're working against the <a href=\"https://anil.recoil.org/projects/ce\">evidence database</a> we've been building for the past couple of years:</p>\n<ul>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/ai-assisted-inclusion-criteria\">Evaluating a human-in-the-loop AI framework to improve inclusion criteria for evidence synthesis</a>"</em> with <a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a>, supervised by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/accurate-summarisation-for-ce\">Accurate summarisation of threats for conservation evidence literature</a>"</em> with <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a>, supervised by <a href=\"https://toao.com\">Sadiq Jaffer</a> following up her successful MPhil submission.</li>\n</ul>\n<p>We're then heading into <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> and working on some mapping projects:</p>\n<ul>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/cairngorms-connect-habitats\">Habitat mapping of the Cairngorms Connect restoration area</a>"</em> with <a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a>, supervised by <a href=\"https://coomeslab.org\">David Coomes</a> and <a href=\"https://eo.conservation.cam.ac.uk/people/aland-chan/\">Aland Chan</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">Mapping urban and rural British hedgehogs</a>"</em> with <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>, supervised by <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a>, as well as writing up his MPhil dissertation on <em>"<a href=\"https://anil.recoil.org/ideas/walkability-for-osm\">Enhancing Navigation Algorithms with Semantic Embeddings</a>"</em></li>\n<li><em>"<a 
href=\"https://anil.recoil.org/ideas/validating-anti-poaching-predictions\">Validating predictions with ranger insights to enhance anti-poaching patrol strategies in protected areas</a>"</em> with <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a>, supervised by <a href=\"https://charlesemogor.com\">Charles Emogor</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a></li>\n</ul>\n<p>Dropping down towards <a href=\"https://anil.recoil.org/projects/osmose\">embedded systems</a> and fun "real-world" projects, we have:</p>\n<ul>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">Affordable digitisation of insect collections using photogrammetry</a>"</em> with <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a>, <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a> and <a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, supervised by <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-edgar-turner\">Edgar Turner</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/3d-print-world\">3D printing the planet (or bits of it)</a>"</em> with <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, supervised by <a href=\"https://mynameismwd.org\">Michael Dales</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/embedded-whisper\">Low power audio transcription with Whisper</a>"</em> with <a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a> and <em>"<a href=\"https://anil.recoil.org/ideas/battery-free-riotee\">Battery-free wildlife monitoring with Riotee</a>"</em> with <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a>, both supervised by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a></li>\n</ul>\n<p>Going back to classic computer science, we have a few programming language and systems projects:</p>\n<ul>\n<li><em>"<a 
href=\"https://anil.recoil.org/ideas/hazel-to-ocaml-to-hazel\">Bidirectional Hazel to OCaml programming</a>"</em> with <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a>, supervised by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://web.eecs.umich.edu/~comar/\">Cyrus Omar</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">Effects based scheduling for the OCaml compiler pipeline</a>"</em> with <a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <em>"<a href=\"https://anil.recoil.org/ideas/ocaml-bytecode-native-ffi\">Runtimes \u00e0 la carte: crossloading native and bytecode OCaml</a>"</em> with <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a>, both supervised by <a href=\"https://github.com/dra27\">David Allsopp</a></li>\n<li><em>"<a href=\"https://anil.recoil.org/ideas/zfs-filesystem-perf\">ZFS replication strategies with encryption</a>"</em> with <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a>, supervised by <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#ground-rules\"></a>Ground rules</h2>\n<p>Since there are so many of us this summer, it's imperative that you're all\n<strong>proactive about communicating</strong> any problems or clarifications you need. If something\nhere doesn't make sense, or you have a better idea, then just reach out to any\nof the supervisors or me directly!</p>\n<p>Do also take time to <strong>learn from each other</strong>. Read up on not just your own project in the\nlist above, but take some time to read the remainder so that you have a sense of what everyone\nis working on. When you see each other, it'll be much easier to chat about what's going\non and find opportunities for commonality.</p>\n<p>The projects above have been carefully selected to <strong>not be on the critical path</strong> for any\ndeadlines. 
If it's not going well from your perspective, then it's ok to take a step back\nand figure out why! We're here to learn and discover things, so take the time to do so.</p>\n<h2><a href=\"https://anil.recoil.org/#where-we-will-work\"></a>Where we will work</h2>\n<p>This will be different for everyone, since it depends on which home department will house the project.\nSome of us will be in the David Attenborough Building, on the third floor where the <a href=\"https://www.conservation.cam.ac.uk\">CRI</a> is:</p>\n<ul>\n<li><a href=\"mailto:ra684@cam.ac.uk\">Radhika Agrawal</a> and <a href=\"mailto:kh807@cam.ac.uk\">Kittson Hamill</a> will be with the <a href=\"https://anil.recoil.org/projects/ce\">CE</a> crew near <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s office</li>\n<li><a href=\"https://github.com/Isabel-Mansley\">Isabel Mansley</a> and <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a> will hang out with <a href=\"https://coomeslab.org\">David Coomes</a>'s group</li>\n<li><a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> can work near <a href=\"https://www.zoo.cam.ac.uk/directory/professor-rob-fletcher\">Rob Fletcher</a>'s office where <a href=\"https://charlesemogor.com\">Charles Emogor</a> works</li>\n</ul>\n<p>Those working in the Zoology Museum itself (<a href=\"mailto:aer82@cam.ac.uk\">Arissa-Elena Rotunjanu</a>, <a href=\"mailto:bsys2@cam.ac.uk\">Beatrice Spence</a> and <a href=\"mailto:ntay2@cam.ac.uk\">Anna Yiu</a>) will have a health and safety induction on Monday with <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> and find offices there.</p>\n<p>The rest of us will be in the Computer Lab over in West Cambridge:</p>\n<ul>\n<li><a href=\"mailto:khm39@cam.ac.uk\">Lucas Ma</a> and <a href=\"mailto:jc2483@cam.ac.uk\">Jeremy Chen</a> will work out of FW15 with <a href=\"https://github.com/dra27\">David Allsopp</a> 
and <a href=\"https://github.com/jonludlam\">Jon Ludlam</a></li>\n<li><a href=\"mailto:dk729@cam.ac.uk\">Dan Kvit</a>, <a href=\"mailto:fs618@cam.ac.uk\">Finley Stirk</a>, <a href=\"mailto:btt31@cam.ac.uk\">Becky Terefe-Zenebe</a> and <a href=\"mailto:dp717@cam.ac.uk\">Dominico Parish</a> will be in FW15/14. We may need to clear out one desk in FW15 to make room here (just put the stuff in my office in FW16). <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> will work out of my office (FW16) for the summer, and <a href=\"https://www.cst.cam.ac.uk/people/og309\">Onkar Gulati</a> is away for an internship in the USA.</li>\n<li>We'll find somewhere for <a href=\"mailto:mc2372@cam.ac.uk\">Max Carroll</a> either in West Cambridge or in Pembroke soon, depending on preferences and heat!</li>\n</ul>\n<p>It'll probably take a week to let this all shake out, so please do shout if you find yourself stuck in your room and without an office! You should of course arrange to meet your immediate supervisors regularly according to whatever schedule and location works for you.</p>\n<h2><a href=\"https://anil.recoil.org/#how-you-will-get-paid\"></a>How you will get paid</h2>\n<p>The way you get paid weekly is via the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">Cambridge Casual Worker</a> system. This has a few important steps that you <strong>must</strong> pay attention to, or you will not get paid!</p>\n<ul>\n<li><strong>Before starting work</strong> you must go find <a href=\"https://www.cst.cam.ac.uk/people/ac733\">Alicja Zavros</a> in the Computer Lab with your passport or other proof of your right to work in the UK. I've told Alicja that many of you will show up on Monday 30th June morning. It won't take more than a few minutes, as she'll take a photocopy of your ID. 
You should also have registered on the <a href=\"https://www.hrsystems.admin.cam.ac.uk/systems/systems-overview/ccws\">CCWS</a> and gotten a login.</li>\n<li><strong>Every Friday</strong> that you do some work, fill in a timesheet on the CCWS. Round this off to a full day (8 hours) and don't do fine-grained timekeeping; just the number of days you've worked is fine. If you don't fill in a timesheet promptly, you won't get paid.</li>\n<li><strong>You must keep a research log with weeknotes</strong> that record what you've been up to. The exact style of weeknotes is entirely up to you, but it's vital that you get in the habit of keeping a log. If you have your own homepage, then send an <a href=\"https://en.wikipedia.org/wiki/Atom_(web_standard)\">Atom feed</a> to me. If you don't, then we have a <a href=\"https://github.com/ucam-eo/interns-2025\">github/ucam-eo/interns-2025</a> repo which I can give you write access to. It's typical to store your weeknotes in Markdown format, and just a simple subdirectory with a date-based convention is fine. The primary use of weeknotes is to highlight things you've accomplished, areas where you are blocked, and interesting things you have run across. Try to make it a record for your future self, and also a way to let those around you know what's going on. While missing the occasional weeknote is just fine, missing them all will be a problem, so plan your time accordingly. Weeknotes are also <em>not</em> a mechanism to assess anything to do with your progress, but a simple form of communication.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#registering-on-chat-channels\"></a>Registering on chat channels</h2>\n<p>Since we're all going to be spread around Cambridge physically, it's important to have a chat channel. 
<a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> is setting up a WhatsApp group for social things (see below), but we also use <a href=\"https://matrix.org\">Matrix</a> as our "hacker's choice" for day-to-day messaging.</p>\n<p>We host a Computer Lab <a href=\"https://matrix.org\">Matrix</a> server on which anyone with a valid Raven account can create an account. Since Matrix is a decentralised chat system, it is also possible to use other accounts from third-party servers, and also to join channels elsewhere.</p>\n<p>To create an account:</p>\n<ul>\n<li>In your Matrix client (we most commonly use <a href=\"https://element.io\">Element</a>), select <code>eeg.cl.cam.ac.uk</code> as your homeserver.</li>\n<li>Log in with SSO (Single Sign On)</li>\n<li>You should see a Cambridge authentication screen for your CRSID.</li>\n</ul>\n<p>Once you create your account, you will be in the "EEG" Matrix space. A <a href=\"https://matrix.org/blog/2021/05/17/the-matrix-space-beta/\">Matrix space</a> is a collection of channels, and you should join "EEGeneral" as the overall channel for the group. We'll create a separate room just for intern chats. We also have a bot in the room that posts our blogs to the channel, so you can keep up with what the group members are all chattering about. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> runs the CL Matrix server, and there are occasional quirks, so just let us know if you run into any problems. I am <code>@avsm:recoil.org</code> on there, not <code>avsm2</code> as I use my personal Matrix for a bunch of stuff.</p>\n<h2><a href=\"https://anil.recoil.org/#summer-social-activities\"></a>Summer social activities</h2>\n<p>It's important to get some downtime this summer and recharge. <a href=\"mailto:hm708@cam.ac.uk\">Hannah McLoone</a> has been setting up a social group for the interns to hang out together, and we'll organise a punting excursion at some point to get us out to the river. 
Of course, many of us will be travelling this summer (I'm heading off to Botswana in late July for instance), so please do also make suggestions.</p>",
+18
avsm/notes_enter-the-matrix-hookshot.json
···+"summary": "<p>We've been happy users of <a href=\"https://matrix.org\">Matrix</a> for our group communications in the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a>. Today we've been bringing in more members of the wider group to use it instead of Slack. As part of that, I've set up a cool bot called <a href=\"https://github.com/matrix-org/matrix-hookshot\">Hookshot</a> which allows Matrix to be connected to external services such as GitHub and Atom/RSS feeds. This is a test post to demonstrate to the members of the EEG how Matrix and Atom work!</p>\n<p>The basic idea behind Hookshot is to provide a bridging service to communications rooms hosted on Matrix, in such a way that it can exert administrative control over a room to intercept requests for services (such as adding an Atom feed).</p>\n<p>The setup for Hookshot can be a little involved as there are lots of encryption keys flying around. In a nutshell, I have a Docker container running it with a YAML config of this nature:</p>\n<pre><code>bridge:\n domain: recoil.org\n url: http://synapse:8008\n port: 9993\n bindAddress: 0.0.0.0\npassFile:\n /data/hookshot.pem\nexperimentalEncryption:\n storagePath: /state\nexperimental_features:\n msc3202_device_masquerading: true\n msc3202_transaction_extensions: true\n msc2409_to_device_messages_enabled: true\nlogging:\n level: debug\n colorize: true\n json: false\n timestampFormat: HH:mm:ss:SSS\n</code></pre>\n<p>This gives me a healthy amount of debug logging, and uses my <a href=\"https://anil.recoil.org/notes/decentralised-stack\">personal</a> Matrix server at <a href=\"https://recoil.org\">recoil.org</a> as the "home" for the bot. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> set up our EEG Matrix server completely separately over in a VM in the Computer Lab, at <code>eeg.cl.cam.ac.uk</code>. 
It's pretty cool that Matrix allows for this sort of decentralised communication pretty seamlessly!</p>\n<p>After this worked and was tested, I now have an active bot user on the Matrix (in this case, it's <code>llama:recoil.org</code>). I then configured GitHub on Hookshot so that the bot can monitor GitHub via its API.</p>\n<pre><code>github:\n auth:\n id: 861482\n privateKeyFile: /data/github-key.pem\n webhook:\n secret: <secret>\n oauth:\n client_id: <client-id>\n client_secret: <client-secret>\n redirect_uri: https://recoil.org/hookshot/oauth/\n defaultOptions:\n showIssueRoomLink: false\n hotlinkIssues:\n prefix: "#"\n</code></pre>\n<p>This bit of YAML takes some configuring via the GitHub OAuth API to get the client-id and secrets. Once that's done, the bot can then be instructed to monitor certain repositories just by issuing some commands from within Matrix!</p>\n<p>\n<img alt=\"The Hookshot bot monitoring Quantify Earth\" src=\"https://anil.recoil.org/images/hookshot-ss-1.webp\" title=\"The Hookshot bot monitoring Quantify Earth\">\nThe Hookshot bot monitoring Quantify Earth</p>\n<p>After this, the bot can be configured for a variety of other services. For instance it can monitor Atom feeds to keep track of what the whole group is writing. For this, it's as simple as:</p>\n<ul>\n<li>Invite the bot to the room and give it admin privileges</li>\n<li>Write a message <code>!hookshot feed https://anil.recoil.org/news.xml</code> (as an example for my feed)</li>\n<li>The bot will start monitoring it and post every five minutes by default</li>\n</ul>\n<p>\n<img alt=\"It did admittedly take some messing around to get it to work\" src=\"https://anil.recoil.org/images/hookshot-ss-2.webp\" title=\"It did admittedly take some messing around to get it to work\">\nIt did admittedly take some messing around to get it to work</p>\n<p>\n<img alt=\"But it picked up my post! First!\" src=\"https://anil.recoil.org/images/hookshot-ss-3.webp\" title=\"But it picked up my post! 
First!\">\nBut it picked up my post! First!</p>\n<p>Hookshot supports a <a href=\"https://matrix-org.github.io/matrix-hookshot/latest/index.html\">variety of other</a> services to bridge to as well, including <a href=\"https://matrix-org.github.io/matrix-hookshot/latest/setup/webhooks.html\">webhooks</a> for arbitrary services. One of the most fun student projects I've supervised recently is "<a href=\"https://anil.recoil.org/ideas/version-control-matrix\">Decentralised Capability-based Code Collaboration using Matrix</a>" in which <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> built Git-patches-over-Matrix. If anyone wants to pick up on that and build a "real" version, perhaps we could use this for peer-to-peer coding! It might work really well with coding copilots, as they have a chat-based interface anyway...</p>",+"content": "<p>We've been happy users of <a href=\"https://matrix.org\">Matrix</a> for our group communications in the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a>. Today we've been bringing in more members of the wider group to use it instead of Slack. As part of that, I've set up a cool bot called <a href=\"https://github.com/matrix-org/matrix-hookshot\">Hookshot</a> which allows Matrix to be connected to external services such as GitHub and Atom/RSS feeds. This is a test post to demonstrate to the members of the EEG how Matrix and Atom work!</p>\n<p>The basic idea behind Hookshot is to provide a bridging service to communications rooms hosted on Matrix, in such a way that it can exert administrative control over a room to intercept requests for services (such as adding an Atom feed).</p>\n<p>The setup for Hookshot can be a little involved as there are lots of encryption keys flying around. 
In a nutshell, I have a Docker container running it with a YAML config of this nature:</p>\n<pre><code>bridge:\n domain: recoil.org\n url: http://synapse:8008\n port: 9993\n bindAddress: 0.0.0.0\npassFile:\n /data/hookshot.pem\nexperimentalEncryption:\n storagePath: /state\nexperimental_features:\n msc3202_device_masquerading: true\n msc3202_transaction_extensions: true\n msc2409_to_device_messages_enabled: true\nlogging:\n level: debug\n colorize: true\n json: false\n timestampFormat: HH:mm:ss:SSS\n</code></pre>\n<p>This gives me a healthy amount of debug logging, and uses my <a href=\"https://anil.recoil.org/notes/decentralised-stack\">personal</a> Matrix server at <a href=\"https://recoil.org\">recoil.org</a> as the "home" for the bot. <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> set up our EEG Matrix server completely separately over in a VM in the Computer Lab, at <code>eeg.cl.cam.ac.uk</code>. It's pretty cool that Matrix allows for this sort of decentralised communication pretty seamlessly!</p>\n<p>After this worked and was tested, I now have an active bot user on the Matrix (in this case, it's <code>llama:recoil.org</code>). I then configured GitHub on Hookshot so that the bot can monitor GitHub via its API.</p>\n<pre><code>github:\n auth:\n id: 861482\n privateKeyFile: /data/github-key.pem\n webhook:\n secret: <secret>\n oauth:\n client_id: <client-id>\n client_secret: <client-secret>\n redirect_uri: https://recoil.org/hookshot/oauth/\n defaultOptions:\n showIssueRoomLink: false\n hotlinkIssues:\n prefix: "#"\n</code></pre>\n<p>This bit of YAML takes some configuring via the GitHub OAuth API to get the client-id and secrets. 
Once that's done, the bot can then be instructed to monitor certain repositories just by issuing some commands from within Matrix!</p>\n<p>\n<img alt=\"The Hookshot bot monitoring Quantify Earth\" src=\"https://anil.recoil.org/images/hookshot-ss-1.webp\" title=\"The Hookshot bot monitoring Quantify Earth\">\nThe Hookshot bot monitoring Quantify Earth</p>\n<p>After this, the bot can be configured for a variety of other services. For instance it can monitor Atom feeds to keep track of what the whole group is writing. For this, it's as simple as:</p>\n<ul>\n<li>Invite the bot to the room and give it admin privileges</li>\n<li>Write a message <code>!hookshot feed https://anil.recoil.org/news.xml</code> (as an example for my feed)</li>\n<li>The bot will start monitoring it and post every five minutes by default</li>\n</ul>\n<p>\n<img alt=\"It did admittedly take some messing around to get it to work\" src=\"https://anil.recoil.org/images/hookshot-ss-2.webp\" title=\"It did admittedly take some messing around to get it to work\">\nIt did admittedly take some messing around to get it to work</p>\n<p>\n<img alt=\"But it picked up my post! First!\" src=\"https://anil.recoil.org/images/hookshot-ss-3.webp\" title=\"But it picked up my post! First!\">\nBut it picked up my post! First!</p>\n<p>Hookshot supports a <a href=\"https://matrix-org.github.io/matrix-hookshot/latest/index.html\">variety of other</a> services to bridge to as well, including <a href=\"https://matrix-org.github.io/matrix-hookshot/latest/setup/webhooks.html\">webhooks</a> for arbitrary services. One of the most fun student projects I've supervised recently is "<a href=\"https://anil.recoil.org/ideas/version-control-matrix\">Decentralised Capability-based Code Collaboration using Matrix</a>" in which <a href=\"https://bsky.app/profile/wedg.dev\">Samuel Wedgwood</a> built Git-patches-over-Matrix. If anyone wants to pick up on that and build a "real" version, perhaps we could use this for peer-to-peer coding! 
It might work really well with coding copilots, as they have a chat-based interface anyway...</p>",
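The feed-monitoring loop the bot performs is easy to picture in miniature. As a hedged sketch (this is not Hookshot's actual implementation, which is a TypeScript service; the `new_entries` helper and the inline sample feed are invented for illustration), a poller parses the Atom XML and reports entries updated since the last poll:

```python
# Minimal sketch of an Atom-feed poll step, using only the stdlib.
# A real poller (like Hookshot's) would fetch the XML over HTTP every
# few minutes and persist `last_seen` between runs.
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Tiny inline feed standing in for e.g. https://anil.recoil.org/news.xml
SAMPLE_FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example blog</title>
  <entry>
    <title>Older post</title>
    <link href="https://example.org/old"/>
    <updated>2025-06-01T10:00:00Z</updated>
  </entry>
  <entry>
    <title>Newer post</title>
    <link href="https://example.org/new"/>
    <updated>2025-06-20T09:00:00Z</updated>
  </entry>
</feed>"""

def new_entries(feed_xml: str, last_seen: str) -> list[tuple[str, str]]:
    """Return (title, link) for entries updated after `last_seen`.
    ISO-8601 UTC timestamps compare correctly as plain strings."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.findall(f"{ATOM_NS}entry"):
        updated = entry.findtext(f"{ATOM_NS}updated") or ""
        if updated > last_seen:
            title = entry.findtext(f"{ATOM_NS}title") or "(untitled)"
            link = entry.find(f"{ATOM_NS}link").get("href")
            found.append((title, link))
    return found

print(new_entries(SAMPLE_FEED, "2025-06-10T00:00:00Z"))
# → [('Newer post', 'https://example.org/new')]
```

Anything the poll turns up would then be posted into the Matrix room, which is the behaviour visible in the screenshots above.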
+18
avsm/notes_focs.json
···+"summary": "<p>Here are the various repos used to create the interactive <a href=\"https://anil.recoil.org/notes/teaching\">teaching</a> environment\nwe use for 1A Foundations of Computer Science in Cambridge. It may be useful to\nother professors who are using OCaml in their courses.</p>\n<ul>\n<li><a href=\"https://github.com/avsm/teaching-fcs\">https://github.com/avsm/teaching-fcs</a> is a private repo, but ping me if you\nare teaching and I'll give you access (it has coursework answers in it).\nWe use a Jupyter notebook, with the course written in Markdown using the\n<a href=\"https://github.com/realworldocaml/mdx\">mdx</a> OCaml parser which evaluates\ntoplevel phrases through the compiler and promotes the output directly\ninto the markdown.</li>\n<li>We then convert the Markdown into Jupyter format using a\n<a href=\"https://github.com/realworldocaml/mdx/pull/124\">fork of mdx</a>, and then\nnbconvert it into LaTeX for the printed notes.</li>\n<li>A <a href=\"https://jupyter.org/install.html\">JupyterLab</a> installation with a\n<a href=\"https://github.com/akabe/ocaml-jupyter\">custom OCaml kernel</a> suffices\nfor the live setup. Every student gets their own container on the server\nand one server is sufficient for a full class of ~125 students.</li>\n</ul>\n<p>Ping me if you want to know more, and other people who have worked\non this with me are <a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a>, <a href=\"https://github.com/dra27\">David Allsopp</a> and <a href=\"https://github.com/jonludlam\">Jon Ludlam</a>, with Jon\nbeing the currently active additional lecturer on the course as of 2024/2025.</p>",+"content": "<p>Here are the various repos used to create the interactive <a href=\"https://anil.recoil.org/notes/teaching\">teaching</a> environment\nwe use for 1A Foundations of Computer Science in Cambridge. 
It may be useful to\nother professors who are using OCaml in their courses.</p>\n<ul>\n<li><a href=\"https://github.com/avsm/teaching-fcs\">https://github.com/avsm/teaching-fcs</a> is a private repo, but ping me if you\nare teaching and I'll give you access (it has coursework answers in it).\nWe use a Jupyter notebook, with the course written in Markdown using the\n<a href=\"https://github.com/realworldocaml/mdx\">mdx</a> OCaml parser which evaluates\ntoplevel phrases through the compiler and promotes the output directly\ninto the markdown.</li>\n<li>We then convert the Markdown into Jupyter format using a\n<a href=\"https://github.com/realworldocaml/mdx/pull/124\">fork of mdx</a>, and then\nnbconvert it into LaTeX for the printed notes.</li>\n<li>A <a href=\"https://jupyter.org/install.html\">JupyterLab</a> installation with a\n<a href=\"https://github.com/akabe/ocaml-jupyter\">custom OCaml kernel</a> suffices\nfor the live setup. Every student gets their own container on the server\nand one server is sufficient for a full class of ~125 students.</li>\n</ul>\n<p>Ping me if you want to know more, and other people who have worked\non this with me are <a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a>, <a href=\"https://github.com/dra27\">David Allsopp</a> and <a href=\"https://github.com/jonludlam\">Jon Ludlam</a>, with Jon\nbeing the currently active additional lecturer on the course as of 2024/2025.</p>",
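To make the mdx mechanism concrete, here is a hypothetical fragment of the kind of OCaml toplevel block that would sit in the course Markdown (this snippet is illustrative, not taken from the actual course notes). The `#`-prefixed phrase is what the author writes; the result line beneath it is what mdx evaluates through the compiler and promotes back into the document:

```ocaml
# List.map (fun x -> x * 2) [1; 2; 3];;
- : int list = [2; 4; 6]
```

With mdx's promotion workflow the output line is regenerated from the compiler's actual printed result rather than maintained by hand, so the notes can never drift out of sync with the code they show.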
+18
avsm/notes_forest-apps-and-benchmarks.json
···+"summary": "<p>This week I've been reading three really nice pieces of work by my\ncolleagues, in the form of a <a href=\"https://www.nature.com/articles/s44358-025-00022-3\">review paper</a> on biodiversity and AI,\na <a href=\"https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14503\">benchmark</a> for 3D forest reconstruction using laser scanners and a <a href=\"https://github.com/MingyueX/GreenLens\">mobile app</a> for measuring the width of tree trunks. A real bonanza for forest lovers!</p>\n<h2><a href=\"https://anil.recoil.org/#review-paper-on-mapping-opportunities-for-ai-in-biodiversity\"></a>Review paper on mapping opportunities for AI in biodiversity</h2>\n<p>A paper on '<a href=\"https://www.nature.com/articles/s44358-025-00022-3\">Harnessing AI to fill global shortfalls in biodiversity knowledge</a>' just came out in Nature Biodiversity today (via <a href=\"http://oisin.info\">Oisin Mac Aodha</a>). They start with the baseline present uses of AI (camera traps, acoustic monitoring and improved data analysis) which are pretty well known to anyone in the field, but then introduce <a href=\"https://www.nature.com/articles/s44358-025-00022-3/figures/1\">a lovely diagram of future uses</a> of AI for biodiversity which includes:</p>\n<ul>\n<li>Rapid retrieval of existing information means not only looking into existing literature, but also the digitisation of existing museum specimens. Coincidentally, I have just posted a new student project in the area of <a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">insect digitisation at the Zoology museum</a> from <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> on the latter topic, which I'd be very happy to hear from interested students about. 
I have also been working on <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">LLM driven evidence retrieval</a> recently, so I'm all in favour of lots more projects in this space.</li>\n<li>Once the data is retrieved, they discuss how this could be used for richer hypothesis generation via detection of new patterns for humans to review, ranking high-value areas that need more observations, and generally doing more unsupervised learning over the vast space. This is a good zooming in from many of the general areas covered in the <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Royal Society Science in the Age of AI</a> report as well, and very good to see given the sheer urgency of more action in the field of biodiversity conservation.</li>\n<li>Finally, there's also the fascinating topic of ecological modelling where we move from individual species to whole communities, as well as knowledge-guided machine learning towards this. I'm planning on experimenting more with differentiable models (beyond ABMs, where both <a href=\"https://anil.recoil.org/ideas/differentiable-abm\">differentiable</a> and <a href=\"https://anil.recoil.org/ideas/rev-abm\">reversible</a> have worked very well). The recent paper on <a href=\"https://www.nature.com/articles/s41586-024-07744-y\">NeuralGCM</a> from the Google team underlined the huge potential of combining purely data-driven and purely-computational models into a combined system with much better predictive power than either by itself.</li>\n</ul>\n<p>Those interested in this may also want to look at our recent <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon\">horizon scan on AI and conservation</a> from a few months ago. 
The field is moving so quickly that I wouldn't be surprised if both of these were obsolete a year from now!</p>\n<p><a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\"> \n<img alt=\"If you like biodiversity, consider working with me on this project!\" src=\"https://anil.recoil.org/images/umzc-4.webp\" title=\"If you like biodiversity, consider working with me on this project!\">\nIf you like biodiversity, consider working with me on this project! </a></p>\n<h2><a href=\"https://anil.recoil.org/#benchmark-dataset-for-tree-species-identifications\"></a>Benchmark dataset for tree species identifications</h2>\n<p>And then out in MEE is a comprehensive benchmark from a collection of forestry researchers on a <a href=\"https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14503\">benchmark for tree species classification from proximal laser scanners</a>. Their <a href=\"https://zenodo.org/records/13255198\">FOR-species20k</a> dataset is on Zenodo, and has tons of tree point clouds taken using a variety of laser scanning techniques (<a href=\"https://www.earthscope.org/what-is/tls/\">TLS</a>, <a href=\"https://www.sciencedirect.com/science/article/pii/S1618866723003710\">MLS</a> and <a href=\"https://www.gispro.pl/en/products/unmanned-laser-scanning-uls/\">ULS</a>).</p>\n<p>As <a href=\"https://www.geog.cam.ac.uk/people/lines/\">Emily Lines</a> notes:</p>\n<blockquote>\n<p>Most importantly, we demonstrate that community efforts and open science are the only way to make significant progress in this important task. 
With more researchers publishing and sharing high quality 3D forest datasets, I hope we see an end of single-site studies and that proper and broad benchmarking of all new 3D forest deep learning methods becomes the standard.\n-- <a href=\"https://www.linkedin.com/posts/emily-lines-2b271a80_openscience-ai-deeplearning-activity-7292116486519676928-XfwF\">Emily Lines on LinkedIn</a></p>\n</blockquote>\n<p>I've been learning more about <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping\">tree species identification for tropical species</a> last year, so I'm looking forward to delving more into laser scanning techniques soon from this work.</p>\n<h2><a href=\"https://anil.recoil.org/#a-mobile-app-for-measuring-tree-trunks\"></a>A mobile app for measuring tree trunks</h2>\n<p>And last but not least, I was delighted to see that my colleagues <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> released the extremely cool mobile phone app they've been working on for some time along with a <a href=\"https://www.sciencedirect.com/science/article/pii/S1574954124003169?via%3Dihub#s0125\">paper in Ecological Informatics</a>. Their app is a simple and elegant mobile phone app that can measure the diameter of a tree trunk (more specifically, the <a href=\"https://en.wikipedia.org/wiki/Diameter_at_breast_height\">DBH</a>) just using standard cameraphone hardware on most modern-ish Android phones.</p>\n<p>I was lucky enough to beta test this and try it out on my <a href=\"https://anil.recoil.org/notes/compass2024-ric-tripreport\">recent trip to India</a>, and the <a href=\"https://anil.recoil.org/\">GreenLens</a> is also now <a href=\"https://github.com/MingyueX/GreenLens\">open source</a> as well.</p>\n<blockquote>\n<p>Other apps for measuring forest plots are available [...] 
but those for Android phones tend not to perform as well as ours, while those designed for the iPhone require the purchase of a high-end phone that is not affordable for researchers in the Global South.</p>\n<p>We believe ours is the only app to sit in the 'sweet spot' of offering high quality for low cost.\n-- <a href=\"https://www.cst.cam.ac.uk/using-ai-see-wood-trees\">Frank and Keshav on cam.ac.uk</a></p>\n</blockquote>\n<p>\n<img alt=\"I actually got quite distracted while trying to beta test GreenLens in India as I ran across these adorable stray street puppies, which seems important to post\" src=\"https://anil.recoil.org/images/pups-india-1.webp\" title=\"I actually got quite distracted while trying to beta test GreenLens in India as I ran across these adorable stray street puppies, which seems important to post\">\nI actually got quite distracted while trying to beta test GreenLens in India as I ran across these adorable stray street puppies, which seems important to post</p>",+"content": "<p>This week I've been reading three really nice pieces of work by my\ncolleagues, in the form of a <a href=\"https://www.nature.com/articles/s44358-025-00022-3\">review paper</a> on biodiversity and AI,\na <a href=\"https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14503\">benchmark</a> for 3D forest reconstruction using laser scanners and a <a href=\"https://github.com/MingyueX/GreenLens\">mobile app</a> for measuring the width of tree trunks. A real bonanza for forest lovers!</p>\n<h2><a href=\"https://anil.recoil.org/#review-paper-on-mapping-opportunities-for-ai-in-biodiversity\"></a>Review paper on mapping opportunities for AI in biodiversity</h2>\n<p>A paper on '<a href=\"https://www.nature.com/articles/s44358-025-00022-3\">Harnessing AI to fill global shortfalls in biodiversity knowledge</a>' just came out in Nature Biodiversity today (via <a href=\"http://oisin.info\">Oisin Mac Aodha</a>). 
They start with the baseline present uses of AI (camera traps, acoustic monitoring and improved data analysis) which are pretty well known to anyone in the field, but then introduce <a href=\"https://www.nature.com/articles/s44358-025-00022-3/figures/1\">a lovely diagram of future uses</a> of AI for biodiversity which includes:</p>\n<ul>\n<li>Rapid retrieval of existing information means both looking into the existing literature and digitising existing museum specimens. Coincidentally, I have just posted a new student project on the area of <a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">insect digitisation at the Zoology museum</a> from <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiffany Ki</a> on the latter topic, which I'd be very happy to hear from interested students about. I have also been working on <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">LLM driven evidence retrieval</a> recently, so I'm all in favour of lots more projects in this space.</li>\n<li>Once the data is retrieved, they discuss how this could be used for richer hypothesis generation via detection of new patterns for humans to review, ranking high-value areas that need more observations, and generally doing more unsupervised learning over the vast space. This is a good zooming in from many of the general areas covered in the <a href=\"https://royalsociety.org/news-resources/projects/science-in-the-age-of-ai/\">Royal Society Science in the Age of AI</a> report as well, and very good to see given the sheer urgency of more action in the field of biodiversity conservation.</li>\n<li>Finally, there's also the fascinating topic of ecological modelling where we move from individual species to whole communities, as well as knowledge-guided machine learning towards this. 
I'm planning on experimenting more with differentiable models (beyond ABMs, where both <a href=\"https://anil.recoil.org/ideas/differentiable-abm\">differentiable</a> and <a href=\"https://anil.recoil.org/ideas/rev-abm\">reversible</a> approaches have worked very well). The recent paper on <a href=\"https://www.nature.com/articles/s41586-024-07744-y\">NeuralGCM</a> from the Google team underlined the huge potential of combining purely data-driven and purely-computational models into a combined system with much better predictive power than either by itself.</li>\n</ul>\n<p>Those interested in this may also want to look at our recent <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon\">horizon scan on AI and conservation</a> from a few months ago. The field is moving so quickly that I wouldn't be surprised if both of these were obsolete a year from now!</p>\n<p><a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\"> \n<img alt=\"If you like biodiversity, consider working with me on this project!\" src=\"https://anil.recoil.org/images/umzc-4.webp\" title=\"If you like biodiversity, consider working with me on this project!\">\nIf you like biodiversity, consider working with me on this project! </a></p>\n<h2><a href=\"https://anil.recoil.org/#benchmark-dataset-for-tree-species-identifications\"></a>Benchmark dataset for tree species identifications</h2>\n<p>And then out in MEE is a comprehensive <a href=\"https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.14503\">benchmark for tree species classification from proximal laser scanners</a> from a collection of forestry researchers. 
Their <a href=\"https://zenodo.org/records/13255198\">FOR-species20k</a> dataset is on Zenodo, and has tons of tree point clouds taken using a variety of laser scanning techniques (<a href=\"https://www.earthscope.org/what-is/tls/\">TLS</a>, <a href=\"https://www.sciencedirect.com/science/article/pii/S1618866723003710\">MLS</a> and <a href=\"https://www.gispro.pl/en/products/unmanned-laser-scanning-uls/\">ULS</a>).</p>\n<p>As <a href=\"https://www.geog.cam.ac.uk/people/lines/\">Emily Lines</a> notes:</p>\n<blockquote>\n<p>Most importantly, we demonstrate that community efforts and open science are the only way to make significant progress in this important task. With more researchers publishing and sharing high quality 3D forest datasets, I hope we see an end of single-site studies and that proper and broad benchmarking of all new 3D forest deep learning methods becomes the standard.\n-- <a href=\"https://www.linkedin.com/posts/emily-lines-2b271a80_openscience-ai-deeplearning-activity-7292116486519676928-XfwF\">Emily Lines on LinkedIn</a></p>\n</blockquote>\n<p>I've been learning more about <a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping\">tree species identification for tropical species</a> last year, so I'm looking forward to delving more into laser scanning techniques soon from this work.</p>\n<h2><a href=\"https://anil.recoil.org/#a-mobile-app-for-measuring-tree-trunks\"></a>A mobile app for measuring tree trunks</h2>\n<p>And last but not least, I was delighted to see that my colleagues <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a> released the extremely cool mobile phone app they've been working on for some time along with a <a href=\"https://www.sciencedirect.com/science/article/pii/S1574954124003169?via%3Dihub#s0125\">paper in Ecological Informatics</a>. 
Their app is a simple and elegant mobile phone app that can measure the diameter of a tree trunk (more specifically, the <a href=\"https://en.wikipedia.org/wiki/Diameter_at_breast_height\">DBH</a>) just using standard cameraphone hardware on most modern-ish Android phones.</p>\n<p>I was lucky enough to beta test this and try it out on my <a href=\"https://anil.recoil.org/notes/compass2024-ric-tripreport\">recent trip to India</a>, and the <a href=\"https://anil.recoil.org/\">GreenLens</a> is also now <a href=\"https://github.com/MingyueX/GreenLens\">open source</a> as well.</p>\n<blockquote>\n<p>Other apps for measuring forest plots are available [...] but those for Android phones tend not to perform as well as ours, while those designed for the iPhone require the purchase of a high-end phone that is not affordable for researchers in the Global South.</p>\n<p>We believe ours is the only app to sit in the 'sweet spot' of offering high quality for low cost.\n-- <a href=\"https://www.cst.cam.ac.uk/using-ai-see-wood-trees\">Frank and Keshav on cam.ac.uk</a></p>\n</blockquote>\n<p>\n<img alt=\"I actually got quite distracted while trying to beta test GreenLens in India as I ran across these adorable stray street puppies, which seems important to post\" src=\"https://anil.recoil.org/images/pups-india-1.webp\" title=\"I actually got quite distracted while trying to beta test GreenLens in India as I ran across these adorable stray street puppies, which seems important to post\">\nI actually got quite distracted while trying to beta test GreenLens in India as I ran across these adorable stray street puppies, which seems important to post</p>",
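As a back-of-envelope illustration of why trunk width is measurable from a phone at all, here is the naive pinhole-camera geometry in OCaml. This is only the textbook principle with illustrative names of my own choosing; GreenLens itself uses far more sophisticated depth estimation, as described in the Ecological Informatics paper.

```ocaml
(* Naive pinhole-camera sketch: the width of a target at a known
   distance follows from the angle it subtends in the image.
   Illustrative only -- this is NOT the GreenLens algorithm, which
   uses proper on-phone depth estimation. *)

let pi = 4. *. atan 1.

(* Width (m) of a flat target at [distance_m] metres that subtends
   [subtended_deg] degrees horizontally in the camera's view. *)
let width_estimate ~distance_m ~subtended_deg =
  let half_angle = subtended_deg *. pi /. 360. in
  2. *. distance_m *. tan half_angle
```

A 30 cm trunk two metres away subtends under nine degrees, so small angular errors swamp the estimate, which is one reason depth-based approaches beat naive geometry.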
+18
avsm/notes_forests-spatial-resolution.json
···+"summary": "<p>There's a <a href=\"https://www.science.org/doi/10.1126/science.adt6811\">letter in Science</a> today from a bunch of well known remote sensing researchers that make the unusual point that modern satellite resolution is getting <em>too</em> good to be accurate for forest carbon estimation.</p>\n<blockquote>\n<p>Many new satellites can resolve fine features on the landscape, and even some individual trees outside forests, but this resolution (0.3-5m) is too high for mapping forest carbon. Forest carbon has a natural resolution constraint: the size of an individual tree. To create these maps, tree data from the ground are required because there is no direct measure of tree carbon nor any way to accurately divide trees into smaller components from space.\n[...]\nBecause most carbon in a forest is stored in large trees, map resolutions should at minimum exceed the crown diameter of a typical large tree, which ranges from about 10m for temperate forests to about 20m for tropical forests\n--- <a href=\"https://www.science.org/doi/10.1126/science.adt6811\">Laura Duncanson et al</a>, Spatial resolution for forest carbon maps, Science</p>\n</blockquote>\n<p>The lead author <a href=\"https://geog.umd.edu/facultyprofile/duncanson/laura\">Laura Duncanson</a> is a remote sensing scientist at Maryland who works on the incredible <a href=\"https://en.wikipedia.org/wiki/Global_Ecosystem_Dynamics_Investigation\">GEDI</a> instrument on the International Space Station. In her recent <a href=\"https://watch.eeg.cl.cam.ac.uk/w/uoH2Gie4WiiAocQJYLi9im\">EEG seminar talk</a>, she noted that their instrument is so sensitively calibrated that they can detect when astronauts on the space station are flushing the loo!</p>\n<div></div>\n<p><a href=\"https://coomeslab.org\">David Coomes</a> further notes that we shouldn't think of either field data or GEDI footprints as sole ground truths, but rather factor in the combined uncertainties in both ground and remote sensing data. 
This <a href=\"https://tforces.net/upload/publication-store/2018/Jucker_et_al_2018_Borneo_carbon_Biogeosciences-15-3811-2018.pdf\">2018 Geosciences paper</a> goes through the details of how this error propagation works in Borneo rainforests:</p>\n<blockquote>\n<p>By combining ALS imagery with data from 173 permanent forest plots spanning the lowland rainforests of Sabah on the island of Borneo, we develop a simple yet general model for estimating forest carbon stocks using ALS-derived canopy height and canopy cover as input metrics. An advanced feature of this new model is the propagation of uncertainty in both ALS- and ground-based data, allowing uncertainty in hectare-scale estimates of carbon stocks to be quantified robustly.</p>\n<p>[...] Since the 1970s Borneo has lost more than 60% of its old-growth forests, the majority of which have been replaced by large-scale industrial palm oil plantations.</p>\n<p>With the view of halting the further deforestation of carbon-dense old-growth forests and generating the necessary knowledge to better manage its forests into the future, in 2016 the Sabah state government commissioned CAO to deliver a high-resolution ALS-based carbon map of the entire state. 
The regional carbon model we develop here underpins this initiative [...]\n-- <a href=\"https://tforces.net/upload/publication-store/2018/Jucker_et_al_2018_Borneo_carbon_Biogeosciences-15-3811-2018.pdf\">Tommaso Jucker, David Coomes et al</a>, Estimating aboveground carbon density and its uncertainty in Borneo\u2019s structurally complex tropical forests using airborne laser scanning</p>\n</blockquote>\n<p><a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> are just starting to refresh our <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">PACT methodology spec</a>, so this yet another timely warning to not race ahead with the <a href=\"https://anil.recoil.org/projects/rsn\">latest satellite data</a> without careful consideration of what it is we are actually measuring (in our case, forest carbon for <a href=\"https://anil.recoil.org/news?t=carboncredits\">carboncredits</a>).</p>",+"content": "<p>There's a <a href=\"https://www.science.org/doi/10.1126/science.adt6811\">letter in Science</a> today from a bunch of well known remote sensing researchers that make the unusual point that modern satellite resolution is getting <em>too</em> good to be accurate for forest carbon estimation.</p>\n<blockquote>\n<p>Many new satellites can resolve fine features on the landscape, and even some individual trees outside forests, but this resolution (0.3-5m) is too high for mapping forest carbon. Forest carbon has a natural resolution constraint: the size of an individual tree. 
To create these maps, tree data from the ground are required because there is no direct measure of tree carbon nor any way to accurately divide trees into smaller components from space.\n[...]\nBecause most carbon in a forest is stored in large trees, map resolutions should at minimum exceed the crown diameter of a typical large tree, which ranges from about 10m for temperate forests to about 20m for tropical forests\n--- <a href=\"https://www.science.org/doi/10.1126/science.adt6811\">Laura Duncanson et al</a>, Spatial resolution for forest carbon maps, Science</p>\n</blockquote>\n<p>The lead author <a href=\"https://geog.umd.edu/facultyprofile/duncanson/laura\">Laura Duncanson</a> is a remote sensing scientist at Maryland who works on the incredible <a href=\"https://en.wikipedia.org/wiki/Global_Ecosystem_Dynamics_Investigation\">GEDI</a> instrument on the International Space Station. In her recent <a href=\"https://watch.eeg.cl.cam.ac.uk/w/uoH2Gie4WiiAocQJYLi9im\">EEG seminar talk</a>, she noted that their instrument is so sensitively calibrated that they can detect when astronauts on the space station are flushing the loo!</p>\n<div></div>\n<p><a href=\"https://coomeslab.org\">David Coomes</a> further notes that we shouldn't think of either field data or GEDI footprints as sole ground truths, but rather factor in the combined uncertainties in both ground and remote sensing data. This <a href=\"https://tforces.net/upload/publication-store/2018/Jucker_et_al_2018_Borneo_carbon_Biogeosciences-15-3811-2018.pdf\">2018 Geosciences paper</a> goes through the details of how this error propagation works in Borneo rainforests:</p>\n<blockquote>\n<p>By combining ALS imagery with data from 173 permanent forest plots spanning the lowland rainforests of Sabah on the island of Borneo, we develop a simple yet general model for estimating forest carbon stocks using ALS-derived canopy height and canopy cover as input metrics. 
An advanced feature of this new model is the propagation of uncertainty in both ALS- and ground-based data, allowing uncertainty in hectare-scale estimates of carbon stocks to be quantified robustly.</p>\n<p>[...] Since the 1970s Borneo has lost more than 60% of its old-growth forests, the majority of which have been replaced by large-scale industrial palm oil plantations.</p>\n<p>With the view of halting the further deforestation of carbon-dense old-growth forests and generating the necessary knowledge to better manage its forests into the future, in 2016 the Sabah state government commissioned CAO to deliver a high-resolution ALS-based carbon map of the entire state. The regional carbon model we develop here underpins this initiative [...]\n-- <a href=\"https://tforces.net/upload/publication-store/2018/Jucker_et_al_2018_Borneo_carbon_Biogeosciences-15-3811-2018.pdf\">Tommaso Jucker, David Coomes et al</a>, Estimating aboveground carbon density and its uncertainty in Borneo\u2019s structurally complex tropical forests using airborne laser scanning</p>\n</blockquote>\n<p><a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> are just starting to refresh our <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">PACT methodology spec</a>, so this is yet another timely warning to not race ahead with the <a href=\"https://anil.recoil.org/projects/rsn\">latest satellite data</a> without careful consideration of what it is we are actually measuring (in our case, forest carbon for <a href=\"https://anil.recoil.org/news?t=carboncredits\">carboncredits</a>).</p>",
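The two quantitative points above — the Science letter's minimum-resolution rule of thumb, and combining ground and ALS uncertainties rather than treating either as sole ground truth — can be sketched in a few lines of OCaml. The 10 m/20 m thresholds come straight from the quoted letter; the quadrature sum is the textbook rule for independent errors, not the fitted model from Jucker et al. (2018).

```ocaml
(* Sketch of the two quantitative points above. The thresholds are
   the letter's rule of thumb (crown diameter of a typical large
   tree); the quadrature combination is the standard rule for
   independent errors, not the Jucker et al. model itself. *)

type forest = Temperate | Tropical

(* Minimum sensible pixel size (m) for a carbon map. *)
let min_resolution = function Temperate -> 10. | Tropical -> 20.

let resolution_ok ~pixel_m forest = pixel_m >= min_resolution forest

(* Independent ground-plot and ALS standard errors combine in
   quadrature into the hectare-scale estimate's uncertainty. *)
let combined_se ~ground_se ~als_se =
  sqrt ((ground_se ** 2.) +. (als_se ** 2.))
```

So a 5 m product fails the check for any forest type, while a 25 m product passes even for tropical forest — the letter's point that finer is not automatically better.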
+18
avsm/notes_forests.json
···+"summary": "<p>I track external notes and media articles here on forest preservation and\nrestoration as part of my work on <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a>. Not complete, just a reading list.</p>\n<ul>\n<li><a href=\"https://www.youtube.com/watch?v=yiw6_JakZFc\">Can YOU Fix Climate Change?</a> (great short summary of the overall issues)</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#rewilding\"></a>Rewilding</h2>\n<ul>\n<li><a href=\"https://www.theguardian.com/environment/2021/sep/24/vast-area-of-scottish-highlands-to-be-rewilded-in-ambitious-30-year-project-aoe\">Affric Highlands initiative</a> to rewild Scotland over 30 years</li>\n<li><a href=\"https://www.bloomberg.com/news/articles/2021-09-14/gabon-s-climate-law-brings-it-closer-to-carbon-trade-ambition\">Gabon's Climate Law</a></li>\n<li><a href=\"https://www.soilassociation.org/blogs/2021/august/3/pairing-agroforestry-with-livestock-the-major-benefits/\">Pairing agroforestry with livestock: the major benefits</a></li>\n<li><a href=\"https://www.nationalparks.uk/2021/10/06/press-release-major-global-companies-to-fund-vital-nature-restoration-projects-in-the-uks-national-parks-through-innovative-new-financing-facility/\">Major global companies to fund nature restoration projects in UK's national parks</a> (via <a href=\"https://www.thepalladiumgroup.com\">Palladium group</a>)</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#remote-sensing\"></a>Remote sensing</h2>\n<ul>\n<li><a href=\"https://www.kiss.caltech.edu/papers/biodiversity/papers/2020_Book_RemoteSensingOfPlantBiodiversi.pdf\">Remote sensing of plant biodiversity</a>\n<ul>\n<li><a href=\"https://geobon.org\">Geobon</a> - global researcher network working on above.</li>\n</ul>\n</li>\n<li><a href=\"https://earthi.space/\">Earth-i</a> - sub-1m sensing satellite constellation</li>\n<li><a href=\"https://www.mantle-labs.com\">Mantle Labs</a> - earth observation + machine learning for 
farmers</li>\n<li><a href=\"https://www.cgi.com/uk/en-gb/news/climate/cgi-announces-strategic-partnership-project-seagrass-reduce-co2\">Seagrass from space</a></li>\n<li>Keshav's <a href=\"http://blizzard.cs.uwaterloo.ca/iss4e/wp-content/uploads/2017/10/Communication-technologies-for-energy-informatics.pdf\">comms technologies for energy informatics</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#carbon-credits\"></a>Carbon Credits</h2>\n<ul>\n<li><a href=\"https://www.cis.upenn.edu/~bcpierce/papers/carbon-offsets.pdf\">Notes on Carbon offsets for scientific societies</a></li>\n<li><a href=\"https://vcmintegrity.org/\">Voluntary Carbon Markets integrity initiative</a></li>\n<li><a href=\"https://www.ecosystemmarketplace.com/articles/press-release-voluntary-carbon-markets-rocket-in-2021-on-track-to-break-1b-for-first-time/\">Voluntary Carbon Markets Rocket in 2021, On Track to Break $1B for First Time</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#biodiversity\"></a>Biodiversity</h2>\n<ul>\n<li><a href=\"https://kiss.caltech.edu/lectures/2019_biodiversity.html\">Biodiversity: Perspectives of a Techie</a> - Dave Thau - Data and Technology Global Lead Scientist, WWF</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#valuing-climate-change\"></a>Valuing climate change</h2>\n<ul>\n<li><a href=\"https://www.sciencedirect.com/science/article/pii/S001671851930051X\">Cryptocarbon: The promises and pitfalls of forest protection on a blockchain</a></li>\n<li><a href=\"https://www.nature.com/articles/s41558-018-0285-8\">Valuing climate damages at the country level</a> - nature climate change, 2018</li>\n<li><a href=\"https://www.nature.com/articles/s41558-018-0282-y\">Country-level social cost of carbon</a>, nature climate change 2018</li>\n</ul>",+"content": "<p>I track external notes and media articles here on forest preservation and\nrestoration as part of my work on <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a>. 
Not complete, just a reading list.</p>\n<ul>\n<li><a href=\"https://www.youtube.com/watch?v=yiw6_JakZFc\">Can YOU Fix Climate Change?</a> (great short summary of the overall issues)</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#rewilding\"></a>Rewilding</h2>\n<ul>\n<li><a href=\"https://www.theguardian.com/environment/2021/sep/24/vast-area-of-scottish-highlands-to-be-rewilded-in-ambitious-30-year-project-aoe\">Affric Highlands initiative</a> to rewild Scotland over 30 years</li>\n<li><a href=\"https://www.bloomberg.com/news/articles/2021-09-14/gabon-s-climate-law-brings-it-closer-to-carbon-trade-ambition\">Gabon's Climate Law</a></li>\n<li><a href=\"https://www.soilassociation.org/blogs/2021/august/3/pairing-agroforestry-with-livestock-the-major-benefits/\">Pairing agroforestry with livestock: the major benefits</a></li>\n<li><a href=\"https://www.nationalparks.uk/2021/10/06/press-release-major-global-companies-to-fund-vital-nature-restoration-projects-in-the-uks-national-parks-through-innovative-new-financing-facility/\">Major global companies to fund nature restoration projects in UK's national parks</a> (via <a href=\"https://www.thepalladiumgroup.com\">Palladium group</a>)</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#remote-sensing\"></a>Remote sensing</h2>\n<ul>\n<li><a href=\"https://www.kiss.caltech.edu/papers/biodiversity/papers/2020_Book_RemoteSensingOfPlantBiodiversi.pdf\">Remote sensing of plant biodiversity</a>\n<ul>\n<li><a href=\"https://geobon.org\">Geobon</a> - global researcher network working on above.</li>\n</ul>\n</li>\n<li><a href=\"https://earthi.space/\">Earth-i</a> - sub-1m sensing satellite constellation</li>\n<li><a href=\"https://www.mantle-labs.com\">Mantle Labs</a> - earth observation + machine learning for farmers</li>\n<li><a href=\"https://www.cgi.com/uk/en-gb/news/climate/cgi-announces-strategic-partnership-project-seagrass-reduce-co2\">Seagrass from space</a></li>\n<li>Keshav's <a 
href=\"http://blizzard.cs.uwaterloo.ca/iss4e/wp-content/uploads/2017/10/Communication-technologies-for-energy-informatics.pdf\">comms technologies for energy informatics</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#carbon-credits\"></a>Carbon Credits</h2>\n<ul>\n<li><a href=\"https://www.cis.upenn.edu/~bcpierce/papers/carbon-offsets.pdf\">Notes on Carbon offsets for scientific societies</a></li>\n<li><a href=\"https://vcmintegrity.org/\">Voluntary Carbon Markets integrity initiative</a></li>\n<li><a href=\"https://www.ecosystemmarketplace.com/articles/press-release-voluntary-carbon-markets-rocket-in-2021-on-track-to-break-1b-for-first-time/\">Voluntary Carbon Markets Rocket in 2021, On Track to Break $1B for First Time</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#biodiversity\"></a>Biodiversity</h2>\n<ul>\n<li><a href=\"https://kiss.caltech.edu/lectures/2019_biodiversity.html\">Biodiversity: Perspectives of a Techie</a> - Dave Thau - Data and Technology Global Lead Scientist, WWF</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#valuing-climate-change\"></a>Valuing climate change</h2>\n<ul>\n<li><a href=\"https://www.sciencedirect.com/science/article/pii/S001671851930051X\">Cryptocarbon: The promises and pitfalls of forest protection on a blockchain</a></li>\n<li><a href=\"https://www.nature.com/articles/s41558-018-0285-8\">Valuing climate damages at the country level</a> - nature climate change, 2018</li>\n<li><a href=\"https://www.nature.com/articles/s41558-018-0282-y\">Country-level social cost of carbon</a>, nature climate change 2018</li>\n</ul>",
+18
avsm/notes_founded-tarides.json
···+"summary": "<p>I'm delighted to report that I'm helping my long-time collaborator <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> to return to his OCaml roots from Docker. He has just founded Tarides, a startup in Paris with the goal of advancing the open source OCaml ecosystem.</p>\n<blockquote>\n<p>Founded in Paris in early 2018, Tarides helps developers and companies build secure, performant and resource-efficient network and storage services. We are using MirageOS to run applications without the overhead of a traditional operating system and Irmin to create scalable distributed applications. Tarides offers commercial support and commercial development services for companies interested to run MirageOS or Irmin as part of their technology stack.\n -- <a href=\"https://discuss.ocaml.org/t/tarides-is-looking-for-software-engineers-to-work-on-mirageos-and-irmin/1690\">Thomas Gazagnaire</a></p>\n</blockquote>",+"content": "<p>I'm delighted to report that I'm helping my long-time collaborator <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> to return to his OCaml roots from Docker. He has just founded Tarides, a startup in Paris with the goal of advancing the open source OCaml ecosystem.</p>\n<blockquote>\n<p>Founded in Paris in early 2018, Tarides helps developers and companies build secure, performant and resource-efficient network and storage services. We are using MirageOS to run applications without the overhead of a traditional operating system and Irmin to create scalable distributed applications. Tarides offers commercial support and commercial development services for companies interested to run MirageOS or Irmin as part of their technology stack.\n -- <a href=\"https://discuss.ocaml.org/t/tarides-is-looking-for-software-engineers-to-work-on-mirageos-and-irmin/1690\">Thomas Gazagnaire</a></p>\n</blockquote>",
+18
avsm/notes_fpgas-hardcaml.json
···+"summary": "<p>With the vast amount of data we have these days for our <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a> processing, it's naturally tempting to use more hardware offload. The obvious choice, GPGPUs, is not a great fit for the problem due to the difficulty of unlocking high data parallelism for geospatial data. So it's back to an old technology I worked on <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga\">twelve years ago</a> in the form of <a href=\"https://en.wikipedia.org/wiki/Field-programmable_gate_array\">FPGAs</a>!</p>\n<p>FPGAs are a very flexible way to execute boolean combinatorial logic, but are notoriously difficult to program. We have two possible angles to explore to address this. One is to design more declarative DSLs for data processing that compile to the FPGAs, such as <a href=\"https://mynameismwd.org\">Michael Dales</a>'s work on <a href=\"https://github.com/quantifyearth/yirgacheffe\">Yirgacheffe</a> or <a href=\"https://github.com/omarathon\">Omar Tanner</a>'s work on in-memory <a href=\"https://anil.recoil.org/ideas/compressive-geospatial\">compressive computation</a>. The other angle is to work on the low-level API for programming the FPGAs, to get away from <a href=\"https://danluu.com/why-hardware-development-is-hard/\">Verilog</a> and program in our favourite high-level language...OCaml! <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and I have started making a list of resources for programming FPGAs in OCaml for our own education.</p>\n<p>HardCaml was originally a side project by <a href=\"https://www.ujamjar.com\">Andy Ray</a>. He gave a great presentation about it at <a href=\"https://www.ujamjar.com/presentations/orconf2015.html\">ORConf 2015</a>. Later on in the project's lifecycle, he moved it to being maintained by <a href=\"https://janestreet.com\">Jane Street</a>, where it is used in production and is <a href=\"https://github.com/janestreet/hardcaml\">open source</a>. 
The first two resources to learn about HardCaml are to listen to the <a href=\"https://www.youtube.com/watch?v=GJX5VbKvh90\">Signals and Threads episode with Andy</a>, and then to <a href=\"https://arxiv.org/pdf/2312.15035\">read the 2023 paper</a>:</p>\n<blockquote>\n<p>Unlike high level synthesis (HLS), Hardcaml allows for low level control of the underlying hardware for maximum productivity, while abstracting away many of the tedious aspects of traditional hardware definition languages (HDLs) such as Verilog or VHDL. The richness of OCaml\u2019s type system combined with Hardcaml\u2019s fast circuit elaboration checks reduces the chance of user-introduced bugs and erroneous connections with features like custom type defining, type-safe parameterized modules and elaboration-time bit-width inference and validation.</p>\n<p>Hardcaml tooling emphasizes fast feedback through simulation, testing, and verification. It includes both a native OCaml cycle-accurate and an event-driven simulator. Unit tests can live in the source code and include digital ASCII waveforms representing the simulator\u2019s output. Hardcaml also provides tools for SAT proving and formal verification. Hardcaml is industrially proven, and has been used at Jane Street internally for many large FPGA designs.</p>\n</blockquote>\n<p>Let's look at the <a href=\"https://github.com/janestreet/hardcaml\">source code repository</a> next to see some more code.\nHardCaml is easily installable via <a href=\"https://opam.ocaml.org\">opam</a>, so there appears to be few barriers to getting the software up and running. For the development lifecycle, there are a few other packages to ease the interfacing with the FPGA hardware:</p>\n<ul>\n<li><a href=\"https://github.com/janestreet/hardcaml_waveterm\">Hardcaml_waveterm</a> is a terminal-based digital waveform viewer. These are usable in <a href=\"https://dev.realworldocaml.org/testing.html\">expect tests</a> or from an interactive terminal application. 
I love a good terminal user interface, particularly now that I've shifted to <a href=\"https://ghostty.org/\">Ghostty</a> with extremely good UTF-8 and colour support, so this is a very good sign.</li>\n<li><a href=\"https://github.com/janestreet/hardcaml_c\">Hardcaml_c</a> then converts a Hardcaml design over to C, where it can be compiled into a cycle-accurate simulation model and <a href=\"https://github.com/janestreet/hardcaml_verilator\">Hardcaml_verilator</a> does the same except for the open-source <a href=\"https://www.veripool.org/verilator/\">verilator</a> Verilog emulator.</li>\n</ul>\n<p>Let's look at some examples. There is a <a href=\"https://github.com/janestreet/hardcaml_circuits\">hardcaml_circuits</a> repository with some interesting designs in HardCaml. Picking some at random:</p>\n<ul>\n<li>There's a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.mli\">sorting network</a> that arranges a fixed configuration of compare-and-swaps to sort data. The network's structure is static (so it can be implemented easily in hardware), but the library abstracts its implementation to allow plugging in different compare-and-swap and data structures. Looking at the OCaml interface, it's an <a href=\"https://dev.realworldocaml.org/functors.html\">OCaml functor</a> over the compare-and-swap function, and has implementations in the module for a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.ml#L140\">merge sort</a> and a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.ml#L65\">bitonic merge</a>. 
This is already quite instructive to compare vs a software implementation, as for my <a href=\"https://anil.recoil.org/notes/focs\">Foundations of CS</a> course where I teach <a href=\"https://www.cl.cam.ac.uk/teaching/2324/FoundsCS/slides/FoCS-202324-5.pdf\">merge strategies</a> quite early on.</li>\n<li>For floating point calculations, we generally do <a href=\"https://www.allaboutcircuits.com/technical-articles/an-introduction-to-the-cordic-algorithm/\">CORDIC</a> algorithms which perform vector rotations iteratively to solve trig functions. The <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/cordic_reference.mli\">cordic.mli</a> interface here is very readable, with nice use of OCaml features such as <a href=\"https://dev.realworldocaml.org/variants.html#variants\">algebraic data types</a> to express the equations themselves. The implementation of <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/cordic_reference.ml#L97-L101\">arctan</a> shows how elegantly the OCaml implementation expresses the CORDIC equation as a higher level function.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#is-hardcaml-worth-learning\"></a>Is HardCaml worth learning?</h2>\n<p>I was curious to see what HardCaml's been used for recently. Most notably, it took home awards at the <a href=\"https://zprize.hardcaml.com/\">ZPrize</a> in 2022, winning the multi-scalar multiplication track. So this thing is right up there with other HDLs in terms of producing high performing circuits!</p>\n<p>There are two good blog posts about each of the implementations:</p>\n<ul>\n<li>The <a href=\"https://zprize.hardcaml.com/msm-overview.html\">multi-scalar multiplication post</a> looks to multiply 226 points on the <a href=\"https://neuromancer.sk/std/bls/BLS12-377\">BLS12-377</a> <a href=\"https://en.wikipedia.org/wiki/Elliptic_curve\">elliptic curve</a> by scalars from the associated 253-bit scalar field and add them all as fast as possible. 
This is difficult as the full set of transforms can't fit within a single FPGA's RAM, and so needs to call out to the host DRAM. There's a <a href=\"https://dl.acm.org/doi/10.1145/3626202.3637577\">paper</a> with all the details on the evaluation, which was done on an <a href=\"https://fpga-development-on-ec2.workshop.aws/en/4-f1-application-development-flow/introduction-to-f1-development-environment.html\">Amazon F1</a> FPGA instance.</li>\n<li>The <a href=\"https://zprize.hardcaml.com/ntt-overview.html\">number-theoretic transform post</a> describes what's going on there as something similar to Fourier transforms but working over a <a href=\"https://en.wikipedia.org/wiki/Finite_field\">Galois field</a>. An extremely cool <a href=\"https://zprize.hardcaml.com/apps/ntt/ntt-core-with-rams-app\">web-based interactive visualisation</a> allows you to step through the NTT implementation.\nThey used an <a href=\"https://www.amd.com/en/products/accelerators/alveo.html\">AMD Alveo</a> for this; I think that team are formerly Xilinx and based locally here in Cambridge!</li>\n</ul>\n<p>\n<img alt=\"The web-based waveform view for the NTT transformer\" src=\"https://anil.recoil.org/images/hardcaml-webterm-1.webp\" title=\"The web-based waveform view for the NTT transformer\">\nThe web-based waveform view for the NTT transformer</p>\n<p>More relevantly to my interest in geospatial processing, there is a <a href=\"https://github.com/hardcamls/video-coding/tree/main/jpeg\">JPEG decoder in HardCaml</a> which looks rather exciting. It implements the <a href=\"https://stackoverflow.com/questions/26523504/what-is-the-baseline-architecture-of-jpeg\">JPEG baseline profile</a> with arbitrary Huffman tables for encoding, along with a more work-in-progress decoder. 
A <a href=\"https://github.com/geocaml/ocaml-tiff\">GeoTIFF</a> implementation would be a fun starter project to port to HardCaml!</p>\n<h2><a href=\"https://anil.recoil.org/#some-ideas-for-student-projects\"></a>Some ideas for student projects</h2>\n<p>Moving on from prizes, there is also a <a href=\"https://github.com/askvortsov1/hardcaml-mips\">MIPS processor in HardCaml</a> designed by a couple of students at <a href=\"https://www.psu.edu/\">Penn State</a>. They've also written a series of great <a href=\"https://ceramichacker.com/blog/34-1412-hardcaml-mips-and-io\">blog posts</a> about their adventures in learning HardCaml as students.</p>\n<p><a href=\"https://toao.com\">Sadiq Jaffer</a> and I have also been discussing the possibility of using <a href=\"https://anil.recoil.org/ideas/computational-storage-for-vector-dbs\">computational SSDs to accelerate vector databases</a>, which would be a game-changer for the <a href=\"https://anil.recoil.org/projects/rsn\">huge datasets</a> we're throwing around at the moment.</p>\n<p>I'm going to continue to explore this further, and will update this note with any more resources I find. Please do send me any ideas you might have! <em>(Update 2025/02/07):</em> Immediately after <a href=\"https://amok.recoil.org/@avsm/113962272067656593\">posting</a> this, two interesting responses came up:</p>\n<ul>\n<li><a href=\"https://github.com/edwintorok\">T\u00f6r\u00f6k Edwin</a> from the <a href=\"https://anil.recoil.org/projects/xen\">Xen</a> team <a href=\"https://amok.recoil.org/@edwintorok@discuss.systems/113962395735439060\">reports</a> that he experimented with <a href=\"https://tinytapeout.com/runs/ttihp0p2/tt_um_edwintorok\">TinyTapeout</a> in HardCaml to implement a raytracer:</li>\n</ul>\n<blockquote>\n<p>The VGA controller is <a href=\"https://github.com/edwintorok/roundingerror-ihp/blob/main/src/generator/vga.ml\">here</a> and the hardcaml output works nicely with yosys and open lane tooling and verilator. 
So far it seems to work in simulation and on an FPGA (output <a href=\"https://www.youtube.com/watch?v=K9mu3getxhU&t=42s\">recording video</a>, see bottom of <a href=\"https://tinytapeout.com/competitions/demoscene-tt08-entries/\">this</a> on how it got recorded).</p>\n<p>Yet to find out whether it'll work in a physical chip (they say the tape out will be done in April). I particularly like the waveforms in source code for unit test (see the above VGA example).</p>\n</blockquote>\n<ul>\n<li>My colleague <a href=\"https://albert.rierol.net/\">Albert Cordona</a> works on analysing the <a href=\"https://www.science.org/doi/full/10.1126/science.add9330\">connectomes of insect brains</a> (among other brains), which involves a lot of image processing over vast datasets as well. I <a href=\"https://amok.recoil.org/@avsm/113962390567495016\">pointed</a> him at an <a href=\"https://hackaday.io/project/27550-the-hobbyists-guide-to-fpgas\">FPGA overview</a>; any other good beginner "FPGA for programmers" ones I could also use?</li>\n</ul>\n<p>Thanks also to <a href=\"https://ujamjar.com\">Andy Ray</a> and <span>Andrew W. Moore</span> for feedback and corrections to this post.</p>",+"content": "<p>With the vast amount of data we have these days for our <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a> processing, it's naturally tempting to use more hardware offload. The obvious choice, GPGPUs, are not a great fit for the problem due to the difficulty of unlocking high data parallelism for geospatial data. So it's back to an old technology I worked on <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga\">twelve years ago</a> in the form of <a href=\"https://en.wikipedia.org/wiki/Field-programmable_gate_array\">FPGAs</a>!</p>\n<p>FPGAs are a very flexible way to execute boolean combinatorial logic, but are notoriously difficult to program. We have two possible angles to explore to address this. 
One is to design more declarative DSLs for data processing that compile to the FPGAs, such as <a href=\"https://mynameismwd.org\">Michael Dales</a>'s work on <a href=\"https://github.com/quantifyearth/yirgacheffe\">Yirgacheffe</a> or <a href=\"https://github.com/omarathon\">Omar Tanner</a>'s work on in-memory <a href=\"https://anil.recoil.org/ideas/compressive-geospatial\">compressive computation</a>. The other angle is to work on the low-level API for programming the FPGAs, to get away from <a href=\"https://danluu.com/why-hardware-development-is-hard/\">Verilog</a> and program in our favourite high-level language...OCaml! <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and I have started making a list of resources for programming FPGAs in OCaml for our own education.</p>\n<p>HardCaml was originally a side project by <a href=\"https://www.ujamjar.com\">Andy Ray</a>. He gave a great presentation about it at <a href=\"https://www.ujamjar.com/presentations/orconf2015.html\">ORConf 2015</a>. Later on in the project's lifecycle, he moved it to being maintained by <a href=\"https://janestreet.com\">Jane Street</a>, where it is used in production and is <a href=\"https://github.com/janestreet/hardcaml\">open source</a>. The first two resources to learn about HardCaml are to listen to the <a href=\"https://www.youtube.com/watch?v=GJX5VbKvh90\">Signals and Threads episode with Andy</a>, and then to <a href=\"https://arxiv.org/pdf/2312.15035\">read the 2023 paper</a>:</p>\n<blockquote>\n<p>Unlike high level synthesis (HLS), Hardcaml allows for low level control of the underlying hardware for maximum productivity, while abstracting away many of the tedious aspects of traditional hardware definition languages (HDLs) such as Verilog or VHDL. 
The richness of OCaml\u2019s type system combined with Hardcaml\u2019s fast circuit elaboration checks reduces the chance of user-introduced bugs and erroneous connections with features like custom type defining, type-safe parameterized modules and elaboration-time bit-width inference and validation.</p>\n<p>Hardcaml tooling emphasizes fast feedback through simulation, testing, and verification. It includes both a native OCaml cycle-accurate and an event-driven simulator. Unit tests can live in the source code and include digital ASCII waveforms representing the simulator\u2019s output. Hardcaml also provides tools for SAT proving and formal verification. Hardcaml is industrially proven, and has been used at Jane Street internally for many large FPGA designs.</p>\n</blockquote>\n<p>Let's look at the <a href=\"https://github.com/janestreet/hardcaml\">source code repository</a> next to see some more code.\nHardCaml is easily installable via <a href=\"https://opam.ocaml.org\">opam</a>, so there appears to be few barriers to getting the software up and running. For the development lifecycle, there are a few other packages to ease the interfacing with the FPGA hardware:</p>\n<ul>\n<li><a href=\"https://github.com/janestreet/hardcaml_waveterm\">Hardcaml_waveterm</a> is a terminal-based digital waveform viewer. These are usable in <a href=\"https://dev.realworldocaml.org/testing.html\">expect tests</a> or from an interactive terminal application. 
I love a good terminal user interface, particularly now that I've shifted to <a href=\"https://ghostty.org/\">Ghostty</a> with extremely good UTF-8 and colour support, so this is a very good sign.</li>\n<li><a href=\"https://github.com/janestreet/hardcaml_c\">Hardcaml_c</a> then converts a Hardcaml design over to C, where it can be compiled into a cycle-accurate simulation model and <a href=\"https://github.com/janestreet/hardcaml_verilator\">Hardcaml_verilator</a> does the same except for the open-source <a href=\"https://www.veripool.org/verilator/\">verilator</a> Verilog emulator.</li>\n</ul>\n<p>Let's look at some examples. There is a <a href=\"https://github.com/janestreet/hardcaml_circuits\">hardcaml_circuits</a> repository with some interesting designs in HardCaml. Picking some at random:</p>\n<ul>\n<li>There's a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.mli\">sorting network</a> that arranges a fixed configuration of compare-and-swaps to sort data. The network's structure is static (so it can be implemented easily in hardware), but the library abstracts its implementation to allow plugging in different compare-and-swap and data structures. Looking at the OCaml interface, it's an <a href=\"https://dev.realworldocaml.org/functors.html\">OCaml functor</a> over the compare-and-swap function, and has implementations in the module for a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.ml#L140\">merge sort</a> and a <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/sorting_network.ml#L65\">bitonic merge</a>. 
This is already quite instructive to compare vs a software implementation, as for my <a href=\"https://anil.recoil.org/notes/focs\">Foundations of CS</a> course where I teach <a href=\"https://www.cl.cam.ac.uk/teaching/2324/FoundsCS/slides/FoCS-202324-5.pdf\">merge strategies</a> quite early on.</li>\n<li>For floating point calculations, we generally do <a href=\"https://www.allaboutcircuits.com/technical-articles/an-introduction-to-the-cordic-algorithm/\">CORDIC</a> algorithms which perform vector rotations iteratively to solve trig functions. The <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/cordic_reference.mli\">cordic.mli</a> interface here is very readable, with nice use of OCaml features such as <a href=\"https://dev.realworldocaml.org/variants.html#variants\">algebraic data types</a> to express the equations themselves. The implementation of <a href=\"https://github.com/janestreet/hardcaml_circuits/blob/master/src/cordic_reference.ml#L97-L101\">arctan</a> shows how elegantly the OCaml implementation expresses the CORDIC equation as a higher level function.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#is-hardcaml-worth-learning\"></a>Is HardCaml worth learning?</h2>\n<p>I was curious to see what HardCaml's been used for recently. Most notably, it took home awards at the <a href=\"https://zprize.hardcaml.com/\">ZPrize</a> in 2022, winning the multi-scalar multiplication track. So this thing is right up there with other HDLs in terms of producing high performing circuits!</p>\n<p>There are two good blog posts about each of the implementations:</p>\n<ul>\n<li>The <a href=\"https://zprize.hardcaml.com/msm-overview.html\">multi-scalar multiplication post</a> looks to multiply 226 points on the <a href=\"https://neuromancer.sk/std/bls/BLS12-377\">BLS12-377</a> <a href=\"https://en.wikipedia.org/wiki/Elliptic_curve\">elliptic curve</a> by scalars from the associated 253-bit scalar field and add them all as fast as possible. 
This is difficult as the full set of transforms can't fit within a single FPGA's RAM, and so needs to call out to the host DRAM. There's a <a href=\"https://dl.acm.org/doi/10.1145/3626202.3637577\">paper</a> with all the details on the evaluation, which was done on an <a href=\"https://fpga-development-on-ec2.workshop.aws/en/4-f1-application-development-flow/introduction-to-f1-development-environment.html\">Amazon F1</a> FPGA instance.</li>\n<li>The <a href=\"https://zprize.hardcaml.com/ntt-overview.html\">number-theoretic transform post</a> describes what's going on there as something similar to Fourier transforms but working over a <a href=\"https://en.wikipedia.org/wiki/Finite_field\">Galois field</a>. An extremely cool <a href=\"https://zprize.hardcaml.com/apps/ntt/ntt-core-with-rams-app\">web-based interactive visualisation</a> allows you to step through the NTT implementation.\nThey used an <a href=\"https://www.amd.com/en/products/accelerators/alveo.html\">AMD Alveo</a> for this; I think that team are formerly Xilinx and based locally here in Cambridge!</li>\n</ul>\n<p>\n<img alt=\"The web-based waveform view for the NTT transformer\" src=\"https://anil.recoil.org/images/hardcaml-webterm-1.webp\" title=\"The web-based waveform view for the NTT transformer\">\nThe web-based waveform view for the NTT transformer</p>\n<p>More relevantly to my interest in geospatial processing, there is a <a href=\"https://github.com/hardcamls/video-coding/tree/main/jpeg\">JPEG decoder in HardCaml</a> which looks rather exciting. It implements the <a href=\"https://stackoverflow.com/questions/26523504/what-is-the-baseline-architecture-of-jpeg\">JPEG baseline profile</a> with arbitrary Huffman tables for encoding, along with a more work-in-progress decoder. 
A <a href=\"https://github.com/geocaml/ocaml-tiff\">GeoTIFF</a> implementation would be a fun starter project to port to HardCaml!</p>\n<h2><a href=\"https://anil.recoil.org/#some-ideas-for-student-projects\"></a>Some ideas for student projects</h2>\n<p>Moving on from prizes, there is also a <a href=\"https://github.com/askvortsov1/hardcaml-mips\">MIPS processor in HardCaml</a> designed by a couple of students at <a href=\"https://www.psu.edu/\">Penn State</a>. They've also written a series of great <a href=\"https://ceramichacker.com/blog/34-1412-hardcaml-mips-and-io\">blog posts</a> about their adventures in learning HardCaml as students.</p>\n<p><a href=\"https://toao.com\">Sadiq Jaffer</a> and I have also been discussing the possibility of using <a href=\"https://anil.recoil.org/ideas/computational-storage-for-vector-dbs\">computational SSDs to accelerate vector databases</a>, which would be a game-changer for the <a href=\"https://anil.recoil.org/projects/rsn\">huge datasets</a> we're throwing around at the moment.</p>\n<p>I'm going to continue to explore this further, and will update this note with any more resources I find. Please do send me any ideas you might have! <em>(Update 2025/02/07):</em> Immediately after <a href=\"https://amok.recoil.org/@avsm/113962272067656593\">posting</a> this, two interesting responses came up:</p>\n<ul>\n<li><a href=\"https://github.com/edwintorok\">T\u00f6r\u00f6k Edwin</a> from the <a href=\"https://anil.recoil.org/projects/xen\">Xen</a> team <a href=\"https://amok.recoil.org/@edwintorok@discuss.systems/113962395735439060\">reports</a> that he experimented with <a href=\"https://tinytapeout.com/runs/ttihp0p2/tt_um_edwintorok\">TinyTapeout</a> in HardCaml to implement a raytracer:</li>\n</ul>\n<blockquote>\n<p>The VGA controller is <a href=\"https://github.com/edwintorok/roundingerror-ihp/blob/main/src/generator/vga.ml\">here</a> and the hardcaml output works nicely with yosys and open lane tooling and verilator. 
So far it seems to work in simulation and on an FPGA (output <a href=\"https://www.youtube.com/watch?v=K9mu3getxhU&t=42s\">recording video</a>, see bottom of <a href=\"https://tinytapeout.com/competitions/demoscene-tt08-entries/\">this</a> on how it got recorded).</p>\n<p>Yet to find out whether it'll work in a physical chip (they say the tape out will be done in April). I particularly like the waveforms in source code for unit test (see the above VGA example).</p>\n</blockquote>\n<ul>\n<li>My colleague <a href=\"https://albert.rierol.net/\">Albert Cordona</a> works on analysing the <a href=\"https://www.science.org/doi/full/10.1126/science.add9330\">connectomes of insect brains</a> (among other brains), which involves a lot of image processing over vast datasets as well. I <a href=\"https://amok.recoil.org/@avsm/113962390567495016\">pointed</a> him at an <a href=\"https://hackaday.io/project/27550-the-hobbyists-guide-to-fpgas\">FPGA overview</a>; any other good beginner "FPGA for programmers" ones I could also use?</li>\n</ul>\n<p>Thanks also to <a href=\"https://ujamjar.com\">Andy Ray</a> and <span>Andrew W. Moore</span> for feedback and corrections to this post.</p>",
+18
avsm/notes_gcc-bounds.json
···+"summary": "<p>After many rounds of review and helpful feedback from fellow developers,\nI merged my <a href=\"https://man.openbsd.org/gcc-local.1\">GCC static bounds checking extension</a> into OpenBSD today!</p>\n<blockquote>\n<p>Introduce a simple static checker for making sure that the bounds\nlength passed to common functions such as strlcpy/strlcat match the\nreal length of the buffer. It also checks to make sure that the bound\nlength was not incorrectly derived from a sizeof(pointer) operation.</p>\n<p>Functions must be marked with the new attribute <strong>bounded</strong>, and warnings\nare turned on by -Wbounded. Specifying -Wformat also enables bounds\nchecking for scanf(3) bounds to '%s' format variables. -Wall now turns\non -Wbounded also.</p>\n<p>The checking is pretty limited right now to constant parameters, and the\nbuffers must be statically declared, and not inside a record type. This\nsimple checking still found hundreds of bugs around the ports tree though,\nand there have been no false positive warnings.</p>\n</blockquote>\n<p>You can read more details in the <a href=\"https://man.openbsd.org/gcc-local.1\"><em>gcc-local(1)</em></a> manual page as well.</p>",+"content": "<p>After many rounds of review and helpful feedback from fellow developers,\nI merged my <a href=\"https://man.openbsd.org/gcc-local.1\">GCC static bounds checking extension</a> into OpenBSD today!</p>\n<blockquote>\n<p>Introduce a simple static checker for making sure that the bounds\nlength passed to common functions such as strlcpy/strlcat match the\nreal length of the buffer. It also checks to make sure that the bound\nlength was not incorrectly derived from a sizeof(pointer) operation.</p>\n<p>Functions must be marked with the new attribute <strong>bounded</strong>, and warnings\nare turned on by -Wbounded. Specifying -Wformat also enables bounds\nchecking for scanf(3) bounds to '%s' format variables. 
-Wall now turns\non -Wbounded also.</p>\n<p>The checking is pretty limited right now to constant parameters, and the\nbuffers must be statically declared, and not inside a record type. This\nsimple checking still found hundreds of bugs around the ports tree though,\nand there have been no false positive warnings.</p>\n</blockquote>\n<p>You can read more details in the <a href=\"https://man.openbsd.org/gcc-local.1\"><em>gcc-local(1)</em></a> manual page as well.</p>",
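The `sizeof(pointer)` mistake the checker flags is easy to reproduce in miniature. The sketch below uses a local stand-in for `strlcpy(3)` so it runs outside OpenBSD; the `__bounded__` attribute spelling shown in the comment follows the quoted description and is an OpenBSD-specific extension, so treat it as indicative rather than exact.

```c
#include <assert.h>
#include <string.h>

/* Local stand-in for strlcpy(3), which is in libc on OpenBSD but not
 * everywhere else.  On OpenBSD the real prototype carries the new
 * attribute, roughly:
 *   size_t strlcpy(char *, const char *, size_t)
 *       __attribute__((__bounded__(__string__, 1, 3)));
 * telling -Wbounded that argument 3 must match the real length of the
 * buffer passed as argument 1. */
size_t my_strlcpy(char *dst, const char *src, size_t dsize)
{
    size_t srclen = strlen(src);
    if (dsize != 0) {
        size_t n = srclen >= dsize ? dsize - 1 : srclen;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen; /* length it tried to create, as strlcpy(3) does */
}

/* The bug class the checker catches, in miniature. */
void demo(void)
{
    char buf[8];
    char *p = buf;
    my_strlcpy(buf, "hello world", sizeof(buf)); /* OK: bound == buffer size */
    /* my_strlcpy(p, "hello world", sizeof(p));     wrong: sizeof(p) is the
     * size of a pointer, not of the 8-byte buffer it points at. */
    (void)p;
}
```

Under OpenBSD's patched gcc with `-Wbounded`, the commented-out call is exactly the kind of `sizeof(pointer)` slip that produced warnings across the ports tree.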
+18
avsm/notes_goaccess-for-logs.json
···+"summary": "<p>Like many <a href=\"https://anil.recoil.org/notes/ai-ietf-aiprefs\">others</a>, my website is under a constant barrage of crawling from bots. I need to figure out which one is hosing me, but I am also resisting having third-party trackers of any form. I took a look at hosting a <a href=\"https://plausible.io/\">Plausible</a> instance as <a href=\"https://plausible.ci.dev/ocaml.org\">OCaml does</a>, but it's yet another service to run and maintain. Then <a href=\"https://nick.recoil.org\">Nick Ludlam</a> pointed me to an old-fashioned server-side log analyser with builtin privacy called <a href=\"https://goaccess.io\">Goaccess</a> he's using on his <a href=\"https://nick.recoil.org\">site</a>, which is also perfect for my needs!</p>\n<p>Setting up Goaccess is extremely simple. It's a single binary with no dependencies outside of ncurses, and just needs some server side logs. I currently use <a href=\"https://caddyserver.com\">Caddy</a> to front the HTTP2/3 for my custom OCaml webserver, so I just had to configure it to output JSON-format logs.</p>\n<pre><code>anil.recoil.org {\n reverse_proxy http://localhost:8080\n encode zstd gzip\n log {\n format json\n output file /var/log/caddy/anil.recoil.org.log {\n roll_size 1gb\n roll_keep 100\n }\n }\n}\n</code></pre>\n<p>The above causes Caddy to log lines in a JSON format like this:</p>\n<pre><code>{ "level": "info", "ts": 1745414562.426229,\n "logger": "http.log.access.log0",\n "msg": "handled request",\n "request": {\n "remote_ip": "<snip>", "remote_port": "56839",\n "client_ip": "<snip>", "proto": "HTTP/3.0",\n "method": "GET", "host": "anil.recoil.org",\n "uri": "/assets/home.svg",\n "headers": {\n "Sec-Fetch-Dest": [ "image" ],\n "Sec-Fetch-Site": [ "same-origin" ],\n "Sec-Fetch-Mode": [ "no-cors" ],\n "Priority": [ "u=5, i" ],\n "Accept-Encoding": [ "gzip, deflate, br" ],\n "Accept": [\n 
"image/webp,image/avif,image/jxl,image/heic,image/heic-sequence,video/*;q=0.8,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5" ],\n "User-Agent": [\n "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3.1 Safari/605.1.15"\n ],\n "Referer": [ "https://anil.recoil.org/" ],\n "Accept-Language": [ "en-GB,en;q=0.9" ]\n },\n "tls": {\n "resumed": false, "version": 772, "cipher_suite": 4865,\n "proto": "h3", "server_name": "anil.recoil.org"\n }\n }, <...etc>\n}\n</code></pre>\n<p>While this is a verbose logging format, it compresses very well and has lots of information that can be analysed without the need for any JavaScript. Once the logging is set up, just running <code>goaccess <logfile></code> spins up a curses configuration from which I can select the Caddy log format.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/goaccess-ss-1.webp\" title=\"\">\n</p>\n<p>After that, there is a simple interactive terminal dashboard that not only shows the usual analytics, but also fun things like operating system and time-of-access frequency patterns.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/goaccess-ss-2.webp\" title=\"\">\n</p>\n<p>The tool can also blank out IP addresses in order to preserve privacy in the output analytics, and also spit out an <a href=\"https://theorangeone.net/posts/goaccess-analytics/\">HTML report</a>, although I'm not using this particular functionality. While Plausible looks like the answer for bigger sites, this simple tool is good enough for me. The very first iteration of this site in 1998 used to use <a href=\"https://en.wikipedia.org/wiki/Analog_(program)\">Analog</a> (written by my former Xen/Docker colleague Stephen Turner), so it's nice to go back full circle to this sort of tool again!</p>",+"content": "<p>Like many <a href=\"https://anil.recoil.org/notes/ai-ietf-aiprefs\">others</a>, my website is under a constant barrage of crawling from bots. 
I need to figure out which one is hosing me, but I am also resisting having third-party trackers of any form. I took a look at hosting a <a href=\"https://plausible.io/\">Plausible</a> instance as <a href=\"https://plausible.ci.dev/ocaml.org\">OCaml does</a>, but it's yet another service to run and maintain. Then <a href=\"https://nick.recoil.org\">Nick Ludlam</a> pointed me to an old-fashioned server-side log analyser with builtin privacy called <a href=\"https://goaccess.io\">Goaccess</a> he's using on his <a href=\"https://nick.recoil.org\">site</a>, which is also perfect for my needs!</p>\n<p>Setting up Goaccess is extremely simple. It's a single binary with no dependencies outside of ncurses, and just needs some server side logs. I currently use <a href=\"https://caddyserver.com\">Caddy</a> to front the HTTP2/3 for my custom OCaml webserver, so I just had to configure it to output JSON-format logs.</p>\n<pre><code>anil.recoil.org {\n reverse_proxy http://localhost:8080\n encode zstd gzip\n log {\n format json\n output file /var/log/caddy/anil.recoil.org.log {\n roll_size 1gb\n roll_keep 100\n }\n }\n}\n</code></pre>\n<p>The above causes Caddy to log lines in a JSON format like this:</p>\n<pre><code>{ "level": "info", "ts": 1745414562.426229,\n "logger": "http.log.access.log0",\n "msg": "handled request",\n "request": {\n "remote_ip": "<snip>", "remote_port": "56839",\n "client_ip": "<snip>", "proto": "HTTP/3.0",\n "method": "GET", "host": "anil.recoil.org",\n "uri": "/assets/home.svg",\n "headers": {\n "Sec-Fetch-Dest": [ "image" ],\n "Sec-Fetch-Site": [ "same-origin" ],\n "Sec-Fetch-Mode": [ "no-cors" ],\n "Priority": [ "u=5, i" ],\n "Accept-Encoding": [ "gzip, deflate, br" ],\n "Accept": [\n "image/webp,image/avif,image/jxl,image/heic,image/heic-sequence,video/*;q=0.8,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5" ],\n "User-Agent": [\n "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.3.1 
Safari/605.1.15"\n ],\n "Referer": [ "https://anil.recoil.org/" ],\n "Accept-Language": [ "en-GB,en;q=0.9" ]\n },\n "tls": {\n "resumed": false, "version": 772, "cipher_suite": 4865,\n "proto": "h3", "server_name": "anil.recoil.org"\n }\n }, <...etc>\n}\n</code></pre>\n<p>While this is a verbose logging format, it compresses very well and has lots of information that can be analysed without the need for any JavaScript. Once the logging is set up, just running <code>goaccess <logfile></code> spins up a curses configuration from which I can select the Caddy log format.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/goaccess-ss-1.webp\" title=\"\">\n</p>\n<p>After that, there is a simple interactive terminal dashboard that not only shows the usual analytics, but also fun things like operating system and time-of-access frequency patterns.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/goaccess-ss-2.webp\" title=\"\">\n</p>\n<p>The tool can also blank out IP addresses in order to preserve privacy in the output analytics, and also spit out an <a href=\"https://theorangeone.net/posts/goaccess-analytics/\">HTML report</a>, although I'm not using this particular functionality. While Plausible looks like the answer for bigger sites, this simple tool is good enough for me. The very first iteration of this site in 1998 used to use <a href=\"https://en.wikipedia.org/wiki/Analog_(program)\">Analog</a> (written by my former Xen/Docker colleague Stephen Turner), so it's nice to go back full circle to this sort of tool again!</p>",
+18
avsm/notes_grepping-every-known-ocaml-package-source.json
···+"summary": "<p>A regular question that comes up from OCaml developers is how to use\n<a href=\"http://opam.ocaml.org\">OPAM</a> as a hypothesis testing tool against the\nknown corpus of OCaml source code. In other words: can we quickly and\nsimply run <code>grep</code> over every source archive in OPAM? So that\u2019s the topic\nof today\u2019s 5 minute blog post:</p>\n<pre><code>git clone git://github.com/ocaml/opam-repository\ncd opam-repository\nopam-admin make\ncd archives\nfor i in *.tar.gz; \\\n do tar -zxOf $i | grep caml_stat_alloc_string; \\\ndone\n</code></pre>\n<p>In this particular example we\u2019re looking for instances of\n<code>caml_stat_alloc_string</code>, so just replace that with the regular\nexpression of your choice. The <code>opam-admin</code> tool repacks upstream\narchives into a straightforward tarball, so you don\u2019t need to worry\nabout all the different <a href=\"http://opam.ocaml.org/doc/Packaging.html#h1-CreatingOPAMpackages#Notes\">archival\nformats</a>\nthat OPAM supports (such as git or Darcs). It just adds an <code>archive</code>\ndirectory to a normal\n<a href=\"https://github.com/ocaml/opam-repository\">opam-repository</a> checkout, so\nyou can reuse an existing checkout if you have one already.</p>\n<pre><code>$ cd opam-repository/archives\n$ du -h\n669M .\n$ ls | wc -l\n2092\n</code></pre>",+"content": "<p>A regular question that comes up from OCaml developers is how to use\n<a href=\"http://opam.ocaml.org\">OPAM</a> as a hypothesis testing tool against the\nknown corpus of OCaml source code. In other words: can we quickly and\nsimply run <code>grep</code> over every source archive in OPAM? 
So that\u2019s the topic\nof today\u2019s 5 minute blog post:</p>\n<pre><code>git clone git://github.com/ocaml/opam-repository\ncd opam-repository\nopam-admin make\ncd archives\nfor i in *.tar.gz; \\\n do tar -zxOf $i | grep caml_stat_alloc_string; \\\ndone\n</code></pre>\n<p>In this particular example we\u2019re looking for instances of\n<code>caml_stat_alloc_string</code>, so just replace that with the regular\nexpression of your choice. The <code>opam-admin</code> tool repacks upstream\narchives into a straightforward tarball, so you don\u2019t need to worry\nabout all the different <a href=\"http://opam.ocaml.org/doc/Packaging.html#h1-CreatingOPAMpackages#Notes\">archival\nformats</a>\nthat OPAM supports (such as git or Darcs). It just adds an <code>archive</code>\ndirectory to a normal\n<a href=\"https://github.com/ocaml/opam-repository\">opam-repository</a> checkout, so\nyou can reuse an existing checkout if you have one already.</p>\n<pre><code>$ cd opam-repository/archives\n$ du -h\n669M .\n$ ls | wc -l\n2092\n</code></pre>",
+18
avsm/notes_hdi-workshop-2013-liveblog.json
···+"summary": "<p>We held the first <a href=\"https://hdi-network.org\">Human Data Interaction</a> workshop over in Cambridge, with lots of discussion about social networks and the state of play with decentralising them.</p>",+"content": "<p>We held the first <a href=\"https://hdi-network.org\">Human Data Interaction</a> workshop over in Cambridge, with lots of discussion about social networks and the state of play with decentralising them.</p>",
+18
avsm/notes_hem.json
···+"summary": "<p>The <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">SpotCode</a> cellphone software was spun out into a startup called High Energy Magic Ltd, and was covered on Slashdot.</p>\n<blockquote>\n<p>Check this out! High Energy Magic have announced a public beta of software to let you use your camera-phone as a physical mouse by just pointing and clicking and rotating it in the air.</p>\n</blockquote>\n<p>There were also articles on <a href=\"https://web.archive.org/web/20060505171702/http://www.linuxdevices.com/news/NS3157166681.html\">DeviceForge</a> that were picked up by quite a few outlets.</p>\n<p><em>Update: You can see some of the videos under <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">Ubiquitous Interaction Devices</a> as well.</em></p>",+"content": "<p>The <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">SpotCode</a> cellphone software was spun out into a startup called High Energy Magic Ltd, and was covered on Slashdot.</p>\n<blockquote>\n<p>Check this out! High Energy Magic have announced a public beta of software to let you use your camera-phone as a physical mouse by just pointing and clicking and rotating it in the air.</p>\n</blockquote>\n<p>There were also articles on <a href=\"https://web.archive.org/web/20060505171702/http://www.linuxdevices.com/news/NS3157166681.html\">DeviceForge</a> that were picked up by quite a few outlets.</p>\n<p><em>Update: You can see some of the videos under <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">Ubiquitous Interaction Devices</a> as well.</em></p>",
+18
avsm/notes_horde-cache.json
+18
avsm/notes_horde-cache.json
···+"summary": "<p>While hacking on making Chora performant enough to work as the official PHP CVS web-viewer, I added in a general caching subsystem into the Horde PHP framework. Do let me know if you end up finding a use for it in your own applications.</p>\n<blockquote>\n<p>Add in a Cache framework for persistent storage and retrieval of cached\nobjects. Consider it experimental for now.</p>\n<p>Basically works for Chora's needs ... implements a filesystem driver\nwhich tries to act sensibly (writes to a tmp file, then does an atomic\nrename to the cache object), to avoid synchronization issues.</p>\n<p>It does not cleanup the cached repository at the moment - needs to have\na garbage collection function done at some point.</p>\n<p> -- <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20010820/003116.html\">Anil Madhavapeddy</a></p>\n</blockquote>",+"content": "<p>While hacking on making Chora performant enough to work as the official PHP CVS web-viewer, I added in a general caching subsystem into the Horde PHP framework. Do let me know if you end up finding a use for it in your own applications.</p>\n<blockquote>\n<p>Add in a Cache framework for persistent storage and retrieval of cached\nobjects. Consider it experimental for now.</p>\n<p>Basically works for Chora's needs ... implements a filesystem driver\nwhich tries to act sensibly (writes to a tmp file, then does an atomic\nrename to the cache object), to avoid synchronization issues.</p>\n<p>It does not cleanup the cached repository at the moment - needs to have\na garbage collection function done at some point.</p>\n<p> -- <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20010820/003116.html\">Anil Madhavapeddy</a></p>\n</blockquote>",
+18
avsm/notes_horde-developer.json
+18
avsm/notes_horde-developer.json
···+"summary": "<p>After contributing some patches, I've now got the honour of becoming a <a href=\"https://www.horde.org/community/team\">core team</a> member of the <a href=\"https://horde.org\">Horde</a> project. Many thanks to Chuck Hagenbuch and Jon Parise for their trust in me!</p>\n<p>I'm planning on fixing bugs in IMP and the webmail subsystem, and am getting interested in version control and CVS as well, so I'm going to look at Whups and Chora ore. You can follow my commits on the <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20001016/author.html\">Horde-CVS</a> commit archives, where I am <code>avsm@horde.org</code>!</p>",+"content": "<p>After contributing some patches, I've now got the honour of becoming a <a href=\"https://www.horde.org/community/team\">core team</a> member of the <a href=\"https://horde.org\">Horde</a> project. Many thanks to Chuck Hagenbuch and Jon Parise for their trust in me!</p>\n<p>I'm planning on fixing bugs in IMP and the webmail subsystem, and am getting interested in version control and CVS as well, so I'm going to look at Whups and Chora ore. You can follow my commits on the <a href=\"https://lists.horde.org/archives/cvs/Week-of-Mon-20001016/author.html\">Horde-CVS</a> commit archives, where I am <code>avsm@horde.org</code>!</p>",
+18
avsm/notes_hotnets-pc-2024.json
+18
avsm/notes_hotnets-pc-2024.json
···+"summary": "<p>I was on the program committee for <a href=\"https://conferences.sigcomm.org/hotnets/2024/\">HotNets\n2024</a> this year, which was a\nthoroughly enjoyable experience. The <a href=\"https://conferences.sigcomm.org/hotnets/2024/accepted.html\">list of accepted\npapers</a> is now out,\nand it's a diverse program -- with my personal favourites being the ones on\nspace communications networks using low earth orbit satellites.</p>\n<p>Well done to <a href=\"https://www.microsoft.com/en-us/research/people/bearzani/\">Behnaz\nArzani</a> and <a href=\"https://www.cs.cornell.edu/~jnfoster/\">Nate\nFoster</a> for really excellent general\nchairing and ensuring the PC maintained a constructive, positive tone while\ndoing the difficult job of selecting papers from a crowded set of submissions.\nThe structure of of the program committee was also somewhat novel, and one\nI'd like to replicate in other conferences I organise in the future.</p>\n<p>\n<img alt=\"The spectacular view from Jane Street&apos;s 18th floor!\" src=\"https://anil.recoil.org/images/hotnetspc-view-2024.webp\" title=\"The spectacular view from Jane Street&apos;s 18th floor!\">\nThe spectacular view from Jane Street's 18th floor!</p>\n<ul>\n<li><strong>Two Review Rounds.</strong> There were two rounds of reviewing, with any clear decisions from the first\nset of reviewers resulting in an early rejection decision. Remaining papers\nwent through to round 2, where they got a further set of reviews.</li>\n<li><strong>HotCRP Discussions.</strong> The PC strove to discuss the papers on <a href=\"https://hotcrp.com\">HotCRP</a> before\nthe in-person PC meeting, coming to consensus on a number of them. Only a\nsmall subset of the full papers had to be discussed in the live meeting. HotCRP has\nsuperb support to facilitate this sort of interaction, in a way that alternatives\nlike EasyChair simply don't. 
I'm <em>much</em> more likely to agree to future program\ncommittees if they use HotCRP.</li>\n<li><strong>Hybrid Meeting with Pods.</strong> For the live meeting, the chairs organised &quot;pods&quot; at Microsoft in Seattle (with Behnaz)\nand at Jane Street in New York (with Nate). I was hoping to host a pod in\nCambridge as well, but I ended up having to travel to New York for some\nmeetings on biodiversity and so went along to the Jane Street pod.\nThis was wonderful -- we got to minimise travel, and yet have good synchronous\ndiscussions, with excellent A/V links between the pods. Other PC members\ngot to Zoom in as usual if they couldn't make it to a pod, but there was\nenough critical mass to make it a more social occasion for those who did attend\none.</li>\n<li><strong>Post PC Workshop.</strong> There was an excellent workshop of talks held afterwards, where I spoke on\nplanetary computing, and I got to hear the legendary <a href=\"https://www.linkedin.com/in/brian-nigito-a366052/\">Brian Nigito</a>\ntalk about their low latency <a href=\"https://x.com/yminsky/status/1837650874409136339\">TCP/IP stack called NetKit</a>\nthat's written in OCaml. Now, I've <a href=\"https://github.com/mirage/mirage-tcpip\">written an OCaml TCP/IP stack</a>\nor two in my time, but what makes theirs really exciting is that it takes advantage\nof the experimental modal types in their <a href=\"https://blog.janestreet.com/author/mslater/\">&quot;oxidised&quot; OCaml</a>\nbranch to be as performant as a non-garbage-collected stack. I sadly had to run\nfor my flight back home half-way through the workshop, but it was lovely to\nreconnect with the networking community again after being deep into environmental\nscience for the past few years.</li>\n</ul>\n<p>\n<img alt=\"Me giving a talk! (photo courtesy Nate Foster)\" src=\"https://anil.recoil.org/images/hotnetspc-anil-2024.webp\" title=\"Me giving a talk! (photo courtesy Nate Foster)\">\nMe giving a talk! 
(photo courtesy Nate Foster)</p>\n<p>I'm noting down the HotNets as a potentially really good way to run the next\n<a href=\"https://propl.dev\">Programming for the Planet</a>, which is due in 2025. More\nnews on that soon! In the meanwhile, get your papers into <a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO\n2024</a> which is due in a couple of\ndays...</p>",+"content": "<p>I was on the program committee for <a href=\"https://conferences.sigcomm.org/hotnets/2024/\">HotNets\n2024</a> this year, which was a\nthoroughly enjoyable experience. The <a href=\"https://conferences.sigcomm.org/hotnets/2024/accepted.html\">list of accepted\npapers</a> is now out,\nand it's a diverse program -- with my personal favourites being the ones on\nspace communications networks using low earth orbit satellites.</p>\n<p>Well done to <a href=\"https://www.microsoft.com/en-us/research/people/bearzani/\">Behnaz\nArzani</a> and <a href=\"https://www.cs.cornell.edu/~jnfoster/\">Nate\nFoster</a> for really excellent general\nchairing and ensuring the PC maintained a constructive, positive tone while\ndoing the difficult job of selecting papers from a crowded set of submissions.\nThe structure of the program committee was also somewhat novel, and one\nI'd like to replicate in other conferences I organise in the future.</p>\n<p>\n<img alt=\"The spectacular view from Jane Street&apos;s 18th floor!\" src=\"https://anil.recoil.org/images/hotnetspc-view-2024.webp\" title=\"The spectacular view from Jane Street&apos;s 18th floor!\">\nThe spectacular view from Jane Street's 18th floor!</p>\n<ul>\n<li><strong>Two Review Rounds.</strong> There were two rounds of reviewing, with any clear decisions from the first\nset of reviewers resulting in an early rejection decision. 
Remaining papers\nwent through to round 2, where they got a further set of reviews.</li>\n<li><strong>HotCRP Discussions.</strong> The PC strove to discuss the papers on <a href=\"https://hotcrp.com\">HotCRP</a> before\nthe in-person PC meeting, coming to consensus on a number of them. Only a\nsmall subset of the full papers had to be discussed in the live meeting. HotCRP has\nsuperb support to facilitate this sort of interaction, in a way that alternatives\nlike EasyChair simply don't. I'm <em>much</em> more likely to agree to future program\ncommittees if they use HotCRP.</li>\n<li><strong>Hybrid Meeting with Pods.</strong> For the live meeting, the chairs organised "pods" at Microsoft in Seattle (with Behnaz)\nand at Jane Street in New York (with Nate). I was hoping to host a pod in\nCambridge as well, but I ended up having to travel to New York for some\nmeetings on biodiversity and so went along to the Jane Street pod.\nThis was wonderful -- we got to minimise travel, and yet have good synchronous\ndiscussions, with excellent A/V links between the pods. Other PC members\ngot to Zoom in as usual if they couldn't make it to a pod, but there was\nenough critical mass to make it a more social occasion for those who did attend\none.</li>\n<li><strong>Post PC Workshop.</strong> There was an excellent workshop of talks held afterwards, where I spoke on\nplanetary computing, and I got to hear the legendary <a href=\"https://www.linkedin.com/in/brian-nigito-a366052/\">Brian Nigito</a>\ntalk about their low latency <a href=\"https://x.com/yminsky/status/1837650874409136339\">TCP/IP stack called NetKit</a>\nthat's written in OCaml. 
Now, I've <a href=\"https://github.com/mirage/mirage-tcpip\">written an OCaml TCP/IP stack</a>\nor two in my time, but what makes theirs really exciting is that it takes advantage\nof the experimental modal types in their <a href=\"https://blog.janestreet.com/author/mslater/\">&quot;oxidised&quot; OCaml</a>\nbranch to be as performant as a non-garbage-collected stack. I sadly had to run\nfor my flight back home half-way through the workshop, but it was lovely to\nreconnect with the networking community again after being deep into environmental\nscience for the past few years.</li>\n</ul>\n<p>\n<img alt=\"Me giving a talk! (photo courtesy Nate Foster)\" src=\"https://anil.recoil.org/images/hotnetspc-anil-2024.webp\" title=\"Me giving a talk! (photo courtesy Nate Foster)\">\nMe giving a talk! (photo courtesy Nate Foster)</p>\n<p>I'm noting down the HotNets as a potentially really good way to run the next\n<a href=\"https://propl.dev\">Programming for the Planet</a>, which is due in 2025. More\nnews on that soon! In the meanwhile, get your papers into <a href=\"https://www.sicsa.ac.uk/loco/loco2024/\">LOCO\n2024</a> which is due in a couple of\ndays...</p>",
+18
avsm/notes_humans-save-nature-not-ai.json
+18
avsm/notes_humans-save-nature-not-ai.json
···+"summary": "<p>In my earlier note about how <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">AI should unite conservation</a>, I talked about the robust debate\nongoing within Cambridge about whether or not we're too "AI obsessed" and are losing track of our goals in the rush to adopt learning algorithms. <a href=\"https://www.communications.cam.ac.uk/our-team\">Jacqueline Garget</a> has written a <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">brilliant roundup</a> about how colleages like <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://www.gci.cam.ac.uk/people/members/dr-chris-sandbrook\">Chris Sandbrook</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> in the\n<a href=\"https://www.conservation.cam.ac.uk\">CCI</a> are leading conversations to make sure we advance with eyes wide open.</p>\n<p><a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/camacuk-ainature.webp\" title=\"\">\n </a></p>\n<p>The <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">article</a> covers many areas of concern to us right now: the takeover by big tech companies of data, our own <a href=\"https://anil.recoil.org/projects/ce\">conservation copilot</a> project, and ultimately how people and equity must remain at the centre of this process if we are to avoid causing harm to humans.</p>\n<blockquote>\n<p>Have you ever persisted in following your SatNav even when you knew you were\ngoing in the wrong direction?</p>\n<p>If so, you\u2019ll know that placing all your trust in a machine powered by AI, without also engaging your own intelligence, does not always get you where you want to go.</p>\n<p>This is the message that a group of conservation scientists at Cambridge is pushing hard.\nEfforts to protect the natural world need all the help they can get - but before embracing AI as the solution, we need 
discussions about its risks and wider implications.\n-- <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">To save nature, AI needs our help</a> - cam.ac.uk (2025)</p>\n</blockquote>\n<p>Last week, we held a brilliant half-day <a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\">AI for Climate and Nature Day</a><a href=\"https://anil.recoil.org/#fn-1\">[1]</a> with <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> that had many of the CCI community present, and this topic was at the forefront of the group discussions at the end.</p>\n<p><a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\"> \n<img alt=\"An annotated guide to the AI@Cam day\" src=\"https://anil.recoil.org/images/aicamday-1.webp\" title=\"An annotated guide to the AI@Cam day\">\nAn annotated guide to the AI@Cam day </a></p>\n<p>I thought <a href=\"https://www.gci.cam.ac.uk/people/members/dr-chris-sandbrook\">Chris Sandbrook</a>'s point about societal change was key:</p>\n<blockquote>\n<p>If we give all our attention to inventing new AI tools to fix specific conservation problems - important as these are - we\u2019re missing a trick.</p>\n<p>AI\u2019s biggest impact on biodiversity is probably going to be through the ways it changes wider society.\n-- <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution#section-FkJRUuRF4m\">Chris Sandbrook</a></p>\n</blockquote>\n<p>I've been thinking recently that this principle applies at a <a href=\"https://anil.recoil.org/notes/cambridge-green-blue\">local level</a> as well, and not just with respect to AI. 
We generally need to figure out how to change incentives towards more positive <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">collective action</a>, with <a href=\"https://anil.recoil.org/notes/carbon-credits-vs-offsets\">lightweight ways of keeping score</a> that do not give perverse incentives to cheat.</p>\n<p>One really interesting path (pun intended) in this direction is <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>'s project on <a href=\"https://anil.recoil.org/ideas/walkability-for-osm\">generating urban walkability maps</a> that I've been supervising this year for the CompSci MPhil. Gabriel combines <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a>'s <a href=\"https://ancazugo.github.io/research/outreach/2025/04/27/weekly-notes.html\">urban tree maps</a> with OSM labels in order to help people to really enjoy walking around cities. Imagine you want to bias your experience of walking to work along different dimensions such as the chance of seeing a particular bird you like, or need to go shopping at a local coop, or need to find a safe running route late at night. 
AI should be a tool that helps you to do all of this, and improve the general experience of a human wanting to get the most out of nature, and generally help humans value their wild neighbours.</p>\n\n<ol>\n<li>\n<p>I only had time to do a <a href=\"https://bsky.app/profile/anil.recoil.org/post/3lo43thrhvs2p\">Bluesky post storm</a> and <a href=\"https://github.com/jonludlam\">Jon Ludlam</a> did a <a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\">roundup</a> as well.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>In my earlier note about how <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">AI should unite conservation</a>, I talked about the robust debate\nongoing within Cambridge about whether or not we're too &quot;AI obsessed&quot; and are losing track of our goals in the rush to adopt learning algorithms. <a href=\"https://www.communications.cam.ac.uk/our-team\">Jacqueline Garget</a> has written a <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">brilliant roundup</a> about how colleagues like <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://www.gci.cam.ac.uk/people/members/dr-chris-sandbrook\">Chris Sandbrook</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> in the\n<a href=\"https://www.conservation.cam.ac.uk\">CCI</a> are leading conversations to make sure we advance with eyes wide open.</p>\n<p><a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\"> \n<img alt=\"\" src=\"https://anil.recoil.org/images/camacuk-ainature.webp\" title=\"\">\n </a></p>\n<p>The <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">article</a> covers many areas of concern to us right now: the takeover by big tech companies of data, our own <a href=\"https://anil.recoil.org/projects/ce\">conservation copilot</a> project, and ultimately how people and equity must remain at the 
centre of this process if we are to avoid causing harm to humans.</p>\n<blockquote>\n<p>Have you ever persisted in following your SatNav even when you knew you were\ngoing in the wrong direction?</p>\n<p>If so, you\u2019ll know that placing all your trust in a machine powered by AI, without also engaging your own intelligence, does not always get you where you want to go.</p>\n<p>This is the message that a group of conservation scientists at Cambridge is pushing hard.\nEfforts to protect the natural world need all the help they can get - but before embracing AI as the solution, we need discussions about its risks and wider implications.\n-- <a href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution\">To save nature, AI needs our help</a> - cam.ac.uk (2025)</p>\n</blockquote>\n<p>Last week, we held a brilliant half-day <a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\">AI for Climate and Nature Day</a><a href=\"https://anil.recoil.org/#fn-1\">[1]</a> with <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> that had many of the CCI community present, and this topic was at the forefront of the group discussions at the end.</p>\n<p><a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\"> \n<img alt=\"An annotated guide to the AI@Cam day\" src=\"https://anil.recoil.org/images/aicamday-1.webp\" title=\"An annotated guide to the AI@Cam day\">\nAn annotated guide to the AI@Cam day </a></p>\n<p>I thought <a href=\"https://www.gci.cam.ac.uk/people/members/dr-chris-sandbrook\">Chris Sandbrook</a>'s point about societal change was key:</p>\n<blockquote>\n<p>If we give all our attention to inventing new AI tools to fix specific conservation problems - important as these are - we\u2019re missing a trick.</p>\n<p>AI\u2019s biggest impact on biodiversity is probably going to be through the ways it changes wider society.\n-- <a 
href=\"https://www.cam.ac.uk/stories/ai-for-nature-embrace-with-caution#section-FkJRUuRF4m\">Chris Sandbrook</a></p>\n</blockquote>\n<p>I've been thinking recently that this principle applies at a <a href=\"https://anil.recoil.org/notes/cambridge-green-blue\">local level</a> as well, and not just with respect to AI. We generally to figure out how to change incentives towards more positive <a href=\"https://kogod.american.edu/news/how-good-is-the-paris-agreement\">collective action</a>, with <a href=\"https://anil.recoil.org/notes/carbon-credits-vs-offsets\">lightweight ways of keeping score</a> that do not give perverse incentives to cheat.</p>\n<p>One really interesting path (pun intended) in this direction is <a href=\"https://www.theboatrace.org/athletes/gabriel-mahler\">Gabriel Mahler</a>'s project on <a href=\"https://anil.recoil.org/ideas/walkability-for-osm\">generating urban walkability maps</a> that I've been supervising this year for the CompSci MPhil. Gabriel combines <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a>'s <a href=\"https://ancazugo.github.io/research/outreach/2025/04/27/weekly-notes.html\">urban tree maps</a> with OSM labels in order to help people to really enjoy walking around cities. Imagine you want to bias your experience of walking to work along different dimensions such as the chance of seeing a particular bird you like, or need to go shopping at a local coop, or need to find a safe running route late at night. 
AI should be a tool that helps you to do all of this, and improve the general experience of a human wanting to get the most out of nature, and generally help humans value their wild neighbours.</p>\n\n<ol>\n<li>\n<p>I only had time to do a <a href=\"https://bsky.app/profile/anil.recoil.org/post/3lo43thrhvs2p\">Bluesky post storm</a> and <a href=\"https://github.com/jonludlam\">Jon Ludlam</a> did a <a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\">roundup</a> as well.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/notes_icfp-call-for-sponsorships.json
+18
avsm/notes_icfp-call-for-sponsorships.json
···+"summary": "<p>The call for papers for this year\u2019s <a href=\"http://icfpconference.org/icfp2014/\">International Conference on Functional Programming</a> has just closed, with around a hundred cutting-edge research papers submitted on the theory, application, and experiences behind functional programming. This marks just the beginning of sorting out the program, as there are also over 10 big <a href=\"http://icfpconference.org/icfp2014/affiliated.html\">affiliated workshops</a> that run throughout the week on topics ranging from specific languages (<a href=\"http://www.erlang.org/workshop/2014/\">Erlang</a>, <a href=\"http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop\">Haskell</a>, <a href=\"http://ocaml.org/meetings/ocaml/2014/\">OCaml</a>), the broader <a href=\"http://cufp.org/\">commercial community</a>, and even <a href=\"http://functional-art.org/\">art and music</a>.</p>\n<p>The ICFP conference experience can be a remarkable one for students. Some great ideas have emerged from random corridor conversations between talks with the likes of <a href=\"http://homepages.inf.ed.ac.uk/wadler/\">Phil Wadler</a>, or from rain-soaked discussions with <a href=\"http://research.microsoft.com/en-us/people/simonpj/\">Simon PJ</a> at <a href=\"http://mikkeller.dk/\">Mikeller</a>, or in my case, from being convinced to <a href=\"https://blogs.janestreet.com/the-making-of-real-world-ocaml/\">write a book</a> while in a smoky Tokyo bar.</p>\n<p>Functional programming worldwide has been growing ever more popular in 2014 (and <a href=\"http://whatsapp.com/\">lucrative</a>). We\u2019re committed to growing the ICFP community, not just in numbers but also in diversity. 
We had a record number of sponsors in 2013, and sustaining the growth means that we need to reach ever wider to support the activities of the (not-for-profit) conference.</p>\n<p>So as this year\u2019s industrial relations chair, I thought I\u2019d throw the gates open and <strong>invite any organization that wishes to support FP to get in touch with us</strong> (e-mail at <code>avsm2@cl.cam.ac.uk</code>) and sponsor us. I\u2019ve put an abridged version of the e-mail solicitation below that describes the benefits. Sponsorship can start as low as $500 and is often tax deductible in many countries.</p>\n<blockquote>\n<p>I\u2019m writing to ask if you would be willing to provide corporate financial support for the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Gothenburg, Sweden, from September 1st through 3rd, 2014:</p>\n<p><a href=\"http://icfpconference.org/icfp2014/\">http://icfpconference.org/icfp2014/</a></p>\n<p>Corporate support funds are primarily used to subsidize students \u2013 the lifeblood of our community \u2013 and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.</p>\n<p>Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2013 in Boston. The Industrial Reception, open to all attendees, was by all accounts a roaring success. All 2013 sponsoring companies had the opportunity to speak to the gathered students, academics, and software professionals.</p>\n<p>This year, let\u2019s build on that success and continue to grow our community, and bring even more students to ICFP 2014 in Sweden!</p>\n<p>Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. 
There, they will meet luminaries in the field, as well as people who\u2019ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them.</p>\n<p>This year, we\u2019re continuing a similar system of levels of financial support as last year. Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).</p>\n<p>The support levels, and their associated pledge amounts and benefits, are as follows (costs in US dollars).</p>\n<p><strong>Bronze:</strong> $500: Logo on website, poster at industrial reception, listed in proceedings.</p>\n<p><strong>Silver:</strong> $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)</p>\n<p><strong>Gold:</strong> $5000: As above plus: named supporter of industrial reception, opportunity to include branded merchandise in participants\u2019 swag bag.</p>\n<p><strong>Platinum:</strong> $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).</p>\n</blockquote>\n<p>If you are interested, please get in touch with <a href=\"mailto:anil@recoil.org\">me</a> or any of the <a href=\"http://icfpconference.org/icfp2014/index.html\">organizing committee</a>.\nIf you\u2019re interested in helping out ICFP in a non-financial capacity (for example as a student volunteer), then there will also be plenty of opportunity to sign up later in the year.</p>",+"content": "<p>The call for papers for this year\u2019s <a href=\"http://icfpconference.org/icfp2014/\">International Conference on Functional Programming</a> has just closed, with around a hundred cutting-edge research papers 
submitted on the theory, application, and experiences behind functional programming. This marks just the beginning of sorting out the program, as there are also over 10 big <a href=\"http://icfpconference.org/icfp2014/affiliated.html\">affiliated workshops</a> that run throughout the week on topics ranging from specific languages (<a href=\"http://www.erlang.org/workshop/2014/\">Erlang</a>, <a href=\"http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop\">Haskell</a>, <a href=\"http://ocaml.org/meetings/ocaml/2014/\">OCaml</a>), the broader <a href=\"http://cufp.org/\">commercial community</a>, and even <a href=\"http://functional-art.org/\">art and music</a>.</p>\n<p>The ICFP conference experience can be a remarkable one for students. Some great ideas have emerged from random corridor conversations between talks with the likes of <a href=\"http://homepages.inf.ed.ac.uk/wadler/\">Phil Wadler</a>, or from rain-soaked discussions with <a href=\"http://research.microsoft.com/en-us/people/simonpj/\">Simon PJ</a> at <a href=\"http://mikkeller.dk/\">Mikkeller</a>, or in my case, from being convinced to <a href=\"https://blogs.janestreet.com/the-making-of-real-world-ocaml/\">write a book</a> while in a smoky Tokyo bar.</p>\n<p>Functional programming worldwide has been growing ever more popular in 2014 (and <a href=\"http://whatsapp.com/\">lucrative</a>). We\u2019re committed to growing the ICFP community, not just in numbers but also in diversity. We had a record number of sponsors in 2013, and sustaining the growth means that we need to reach ever wider to support the activities of the (not-for-profit) conference.</p>\n<p>So as this year\u2019s industrial relations chair, I thought I\u2019d throw the gates open and <strong>invite any organization that wishes to support FP to get in touch with us</strong> (e-mail at <code>avsm2@cl.cam.ac.uk</code>) and sponsor us. I\u2019ve put an abridged version of the e-mail solicitation below that describes the benefits. 
Sponsorship can start as low as $500 and is often tax deductible in many countries.</p>\n<blockquote>\n<p>I\u2019m writing to ask if you would be willing to provide corporate financial support for the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Gothenburg, Sweden, from September 1st through 3rd, 2014:</p>\n<p><a href=\"http://icfpconference.org/icfp2014/\">http://icfpconference.org/icfp2014/</a></p>\n<p>Corporate support funds are primarily used to subsidize students \u2013 the lifeblood of our community \u2013 and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.</p>\n<p>Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2013 in Boston. The Industrial Reception, open to all attendees, was by all accounts a roaring success. All 2013 sponsoring companies had the opportunity to speak to the gathered students, academics, and software professionals.</p>\n<p>This year, let\u2019s build on that success and continue to grow our community, and bring even more students to ICFP 2014 in Sweden!</p>\n<p>Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. There, they will meet luminaries in the field, as well as people who\u2019ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them.</p>\n<p>This year, we\u2019re continuing a similar system of levels of financial support as last year. 
Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).</p>\n<p>The support levels, with their associated pledge amounts and benefits, are as follows (costs in US dollars).</p>\n<p><strong>Bronze:</strong> $500: Logo on website, poster at industrial reception, listed in proceedings.</p>\n<p><strong>Silver:</strong> $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)</p>\n<p><strong>Gold:</strong> $5000: As above plus: named supporter of industrial reception, opportunity to include branded merchandise in participants\u2019 swag bag.</p>\n<p><strong>Platinum:</strong> $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).</p>\n</blockquote>\n<p>If you are interested, please get in touch with <a href=\"mailto:anil@recoil.org\">me</a> or any of the <a href=\"http://icfpconference.org/icfp2014/index.html\">organizing committee</a>.\nIf you\u2019re interested in helping out ICFP in a non-financial capacity (for example as a student volunteer), then there will also be plenty of opportunity to sign up later in the year.</p>",
+18
avsm/notes_icfp15-call-for-sponsorships.json
···+"summary": "<p>The call for papers for this year\u2019s <a href=\"http://icfpconference.org/icfp2015/\">International Conference on Functional\nProgramming</a> is about to close in two\nweeks, and over a hundred cutting-edge research papers will be submitted on the\ntheory, application, and experiences behind functional programming and type\ntheory. In addition to the main conference, there are also over 10 big\n<a href=\"http://icfpconference.org/icfp2015/affiliated.html\">affiliated workshops</a> that\nrun throughout the week on topics ranging from specific languages\n(<a href=\"http://www.erlang.org/workshop/2014/\">Erlang</a>,\n<a href=\"http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop\">Haskell</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2014/\">OCaml</a>), the broader <a href=\"http://cufp.org/\">commercial\ncommunity</a>, and even <a href=\"http://functional-art.org/\">art and\nmusic</a>.</p>\n<p>The ICFP conference experience can be a remarkable one for students. Some great\nideas have emerged from random corridor conversations between talks with the\nlikes of <a href=\"http://homepages.inf.ed.ac.uk/wadler/\">Phil Wadler</a>, or from\nrain-soaked discussions with <a href=\"http://research.microsoft.com/en-us/people/simonpj/\">Simon PJ</a> at\n<a href=\"http://mikkeller.dk/\">Mikkeller</a>, or in my case, from being convinced to <a href=\"https://blogs.janestreet.com/the-making-of-real-world-ocaml/\">write a book</a> while in\na smoky Tokyo bar. This year, it will be held in the beautiful city of\nVancouver in the fall.</p>\n<p>We\u2019re committed to growing the ICFP community, not just in numbers but also in\ndiversity. The <a href=\"http://plmw15.iisc-seal.net/\">Programming Language Mentoring\nWorkshop</a> has been at capacity since it started\nand will run again. 
For the first time ever, I am really excited to announce\nthat the <a href=\"https://adainitiative.org/\">Ada Initiative</a> will also be running an\n<a href=\"https://adainitiative.org/what-we-do/workshops-and-training/\">Ally Skills</a>\nworkshop during the conference.</p>\n<p>Sustaining these activities and responsible growth means that we need to reach\never wider to support the activities of the (not-for-profit) ICFP conference.\nSo as this year\u2019s industrial relations chair, I wish to <strong>invite any\norganization that wishes to support ICFP to get in touch with us</strong> (e-mail at\n<code>avsm2@cl.cam.ac.uk</code>) and sponsor us. I\u2019ve put an abridged version of the\ne-mail solicitation below that describes the benefits. Sponsorship can start as\nlow as $500 and is often tax-deductible in many countries.</p>\n<blockquote>\n<p>I\u2019m writing to ask if you would be willing to provide corporate financial support for the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Vancouver, Canada, from August 30th through September 5th, 2015:</p>\n<pre><code>http://icfpconference.org/icfp2015/\n</code></pre>\n<p>Corporate support funds are primarily used to subsidize students \u2013 the lifeblood of our community \u2013 and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.</p>\n<p>Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2014 in Sweden. The Industrial Reception, open to all attendees, was by all accounts a roaring success. 
All 2014 sponsoring companies had the opportunity to interact with the gathered students, academics, and software professionals.</p>\n<p>This year, let\u2019s build on that success and continue to grow our community, and bring even more students to ICFP 2015 in Vancouver!</p>\n<p>Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. There, they will meet luminaries in the field, as well as people who\u2019ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them. For the first time, we will also host an Ally Skills workshop by the Ada Initiative, as well as continue the successful student mentoring workshop from previous years.</p>\n<p>This year, we\u2019re continuing a similar system of levels of financial support as last year. Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).</p>\n<p>The support levels, with their associated pledge amounts and benefits, are as follows (costs in US dollars).</p>\n<p>Bronze: $500: Logo on website, poster at industrial reception, listed in proceedings.</p>\n<p>Silver: $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)</p>\n<p>Gold: $5000: As above plus: named supporter of industrial reception with opportunity to speak to the audience, and opportunity to include branded merchandise in participants\u2019 swag bag.</p>\n<p>Platinum: $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).</p>\n<p>Thank you for your time and especially for your generosity! 
I look forward to seeing you in Vancouver. If you are willing to be a sponsor, it would be helpful to hear back by March 9th to help us plan and budget.</p>\n</blockquote>\n<p>If you are interested, please get in touch with <a href=\"mailto:anil@recoil.org\">me</a> or any of the <a href=\"http://icfpconference.org/icfp2015/index.html\">organizing committee</a>. If you\u2019re interested in helping out ICFP in a non-financial capacity (for example, as a student volunteer), then there will also be plenty of opportunities to sign up later in the year.</p>",+"content": "<p>The call for papers for this year\u2019s <a href=\"http://icfpconference.org/icfp2015/\">International Conference on Functional\nProgramming</a> is about to close in two\nweeks, and over a hundred cutting-edge research papers will be submitted on the\ntheory, application, and experiences behind functional programming and type\ntheory. In addition to the main conference, there are also over 10 big\n<a href=\"http://icfpconference.org/icfp2015/affiliated.html\">affiliated workshops</a> that\nrun throughout the week on topics ranging from specific languages\n(<a href=\"http://www.erlang.org/workshop/2014/\">Erlang</a>,\n<a href=\"http://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop\">Haskell</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2014/\">OCaml</a>), the broader <a href=\"http://cufp.org/\">commercial\ncommunity</a>, and even <a href=\"http://functional-art.org/\">art and\nmusic</a>.</p>\n<p>The ICFP conference experience can be a remarkable one for students. 
Some great\nideas have emerged from random corridor conversations between talks with the\nlikes of <a href=\"http://homepages.inf.ed.ac.uk/wadler/\">Phil Wadler</a>, or from\nrain-soaked discussions with <a href=\"http://research.microsoft.com/en-us/people/simonpj/\">Simon PJ</a> at\n<a href=\"http://mikkeller.dk/\">Mikkeller</a>, or in my case, from being convinced to <a href=\"https://blogs.janestreet.com/the-making-of-real-world-ocaml/\">write a book</a> while in\na smoky Tokyo bar. This year, it will be held in the beautiful city of\nVancouver in the fall.</p>\n<p>We\u2019re committed to growing the ICFP community, not just in numbers but also in\ndiversity. The <a href=\"http://plmw15.iisc-seal.net/\">Programming Language Mentoring\nWorkshop</a> has been at capacity since it started\nand will run again. For the first time ever, I am really excited to announce\nthat the <a href=\"https://adainitiative.org/\">Ada Initiative</a> will also be running an\n<a href=\"https://adainitiative.org/what-we-do/workshops-and-training/\">Ally Skills</a>\nworkshop during the conference.</p>\n<p>Sustaining these activities and responsible growth means that we need to reach\never wider to support the activities of the (not-for-profit) ICFP conference.\nSo as this year\u2019s industrial relations chair, I wish to <strong>invite any\norganization that wishes to support ICFP to get in touch with us</strong> (e-mail at\n<code>avsm2@cl.cam.ac.uk</code>) and sponsor us. I\u2019ve put an abridged version of the\ne-mail solicitation below that describes the benefits. 
Sponsorship can start as\nlow as $500 and is often tax-deductible in many countries.</p>\n<blockquote>\n<p>I\u2019m writing to ask if you would be willing to provide corporate financial support for the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Vancouver, Canada, from August 30th through September 5th, 2015:</p>\n<pre><code>http://icfpconference.org/icfp2015/\n</code></pre>\n<p>Corporate support funds are primarily used to subsidize students \u2013 the lifeblood of our community \u2013 and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.</p>\n<p>Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2014 in Sweden. The Industrial Reception, open to all attendees, was by all accounts a roaring success. All 2014 sponsoring companies had the opportunity to interact with the gathered students, academics, and software professionals.</p>\n<p>This year, let\u2019s build on that success and continue to grow our community, and bring even more students to ICFP 2015 in Vancouver!</p>\n<p>Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. There, they will meet luminaries in the field, as well as people who\u2019ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them. For the first time, we will also host an Ally Skills workshop by the Ada Initiative, as well as continue the successful student mentoring workshop from previous years.</p>\n<p>This year, we\u2019re continuing a similar system of levels of financial support as last year. 
Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).</p>\n<p>The support levels, with their associated pledge amounts and benefits, are as follows (costs in US dollars).</p>\n<p>Bronze: $500: Logo on website, poster at industrial reception, listed in proceedings.</p>\n<p>Silver: $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)</p>\n<p>Gold: $5000: As above plus: named supporter of industrial reception with opportunity to speak to the audience, and opportunity to include branded merchandise in participants\u2019 swag bag.</p>\n<p>Platinum: $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).</p>\n<p>Thank you for your time and especially for your generosity! I look forward to seeing you in Vancouver. If you are willing to be a sponsor, it would be helpful to hear back by March 9th to help us plan and budget.</p>\n</blockquote>\n<p>If you are interested, please get in touch with <a href=\"mailto:anil@recoil.org\">me</a> or any of the <a href=\"http://icfpconference.org/icfp2015/index.html\">organizing committee</a>. If you\u2019re interested in helping out ICFP in a non-financial capacity (for example, as a student volunteer), then there will also be plenty of opportunities to sign up later in the year.</p>",
+18
avsm/notes_installing-ubuntu-on-xenserver.json
···+"summary": "<p>I thought I\u2019d kick off my Citrix blog with a question I get pretty often\nfrom Linux enthusiasts: how to install unsupported Linux distributions\non <a href=\"https://xenserver.com\">XenServer</a> 4.1.</p>\n<p>The most common approach people try is to use the "Other Install Media"\ntemplate, insert the distribution installation CD, and find that the\nmouse cursor doesn\u2019t work when they boot into X11. The reason for this\nis that they are using the hardware-assisted emulation mode of\ninstalling Linux. In this mode (dubbed \u201cHVM\u201d), all input and output is\nemulated, and in particular the mouse interface uses the USB tablet\ninterface. If the distribution doesn\u2019t include a driver for USB tablets,\nthen no mouse will appear.</p>\n<p>Windows guests run at high speed in HVM mode due to the installation of\nthe XenServer tools which install <a href=\"http://xen.org/files/summit_3/xen-pv-drivers.pdf\">high-speed\ndrivers</a>, but these\nare not necessary for Linux distributions since they can be run in\n<a href=\"http://en.wikipedia.org/wiki/Paravirtualization\">para-virtualized</a> mode\n(dubbed \u201cPV\u201d). This involves obtaining a Xen-enabled PV kernel from the\ndistribution, and modifying the VM record in XenServer to boot into this\nkernel instead of HVM mode. The XenServer built-in templates for popular\ndistributions such as RHEL, CentOS or SUSE Linux already automate all\nthis and are in PV mode from the installer onwards.</p>\n<p>In the remainder of this post, I\u2019ll explain how to take a distribution\nwithout direct support (<a href=\"http://www.ubuntu.com/\">Ubuntu</a>\n<a href=\"https://wiki.ubuntu.com/HardyHeron\">8.04</a>), get it installed in HVM\nmode on XenServer 4.1, and convert it to PV mode with a XenCenter\ngraphical console.</p>\n<ul>\n<li>\n<p>Download the "<a href=\"http://www.ubuntu.com/GetUbuntu/download\">Alternative Installation\nCD</a>". 
The main\ninstallation CD uses graphical mode, which doesn't work as well in\nHVM mode due to the use of esoteric 16-bit mode instructions for the\ngraphics operations. The 16-bit emulation mechanisms vary between\nprocessors (with better support on AMD chips, and a software\ninstruction emulator required on Intel VT chips). However, the\nUbuntu alternate CD uses a text-based installer which works fine.</p>\n</li>\n<li>\n<p>Create a new VM on the XenServer 4.1 host using the "Windows Server\n2003" template. This template is set up with a sensible set of\nhardware emulation flags and default disks, and so is a good base\nfor the HVM installation of Ubuntu as well. Attach the Ubuntu ISO\nyou just downloaded to the VM, and proceed to install Ubuntu as\nnormal. You should install it onto the first disk, to make the\nsubsequent steps in this guide easier.</p>\n</li>\n<li>\n<p>When the installation is finished, reboot the VM (don't forget to\ndetach the installation ISO first). It should boot up in HVM mode\ninto the graphical login screen. The XenCenter display will show it\nas not being optimized, which is fine. At this stage, I prefer to\nwork via a remote command-line using SSH. Open up a Terminal from\nUbuntu, and run "<code>sudo apt-get install openssh-server</code>". Then find\nout the VM's IP address with "<code>ifconfig eth0</code>", and then connect to\nit remotely. Alternatively, you can continue to type in the commands\ndirectly into the terminal as well.</p>\n</li>\n<li>\n<p>On the Ubuntu guest, you now need to install the latest Xen version\nof the Ubuntu kernel:</p>\n<ul>\n<li>Install the Linux kernel virtual package with\n"<code>sudo apt-get install linux-image-xen</code>". This is a virtual\npackage which pulls in the latest Xen kernel and modules, in my\ncase <code>2.6.24.19.21</code>.</li>\n<li>You now need to work around a\n<a href=\"http://www.mail-archive.com/grub-devel@gnu.org/msg06024.html\">bug</a>\nin grub. 
Due to the switch in recent versions of Linux to work\nwith the hypervisor-independent\n<a href=\"http://xen.xensource.com/files/xensummit_4/xen-paravirt_ops_Fitzhardinge.pdf\">paravirt_ops</a>\ninterface, <code>update-grub</code> doesn't update the grub configuration\nwith your newly installed Xen kernel. To fix this:\n<ul>\n<li>\n<p>Open <code>/boot/grub/menu.lst</code> in your favourite editor.</p>\n</li>\n<li>\n<p>Scroll to the bottom to the kernel list, and find the entry\nwhich looks like:</p>\n<pre><code>title Ubuntu 8.04, kernel 2.6.24-16-generic\nroot (hd0,0)\nkernel /boot/vmlinuz-2.6.24-16-generic root=UUID=<uuid> ro quiet splash\ninitrd /boot/initrd.img-2.6.24-16-generic\nquiet\n</code></pre>\n</li>\n<li>\n<p>Add a new entry which is similar to this, but change all\nreferences to the <code>2.6.24-16-generic</code> to the Xen kernel. In\n<code>/boot</code> I have <code>vmlinuz-2.6.24-19-xen</code>, so my new entry\nlooks like:</p>\n<pre><code>title Ubuntu 8.04, kernel 2.6.24-19-xen\nroot (hd0,0)\nkernel /boot/vmlinuz-2.6.24-19-xen root=UUID=<uuid> ro quiet splash\ninitrd /boot/initrd.img-2.6.24-19-xen\nquiet\n</code></pre>\n</li>\n<li>\n<p>Also edit the <code>default</code> entry in the <code>menu.lst</code> to match the\nnumber of the kernel you just added. I set mine to 3, since\nit is the fourth entry in the list and the indexing starts\nfrom 0.</p>\n</li>\n</ul>\n</li>\n</ul>\n</li>\n<li>\n<p>When this is done, shut down the guest but do not reboot it just\nyet. You first need to edit the VM record for your Ubuntu VM to\nconvert it to PV boot mode. From the control domain console of your\nXenServer:</p>\n<ul>\n<li>Determine the UUID of the Ubuntu VM by using the <code>xe</code> CLI:\n<ul>\n<li><code>xe vm-list name-label=Ubuntu params=uuid --minimal</code> : this\nwill print out the UUID of the VM named "Ubuntu". 
If you are\nlogged into the control domain, pressing the <code><tab></code> key\nwill perform auto-completion of UUIDs in subsequent XE\ncommands, so you don't need to keep typing it in every time!</li>\n<li><code>xe vm-param-set uuid=<uuid> HVM-boot-policy=</code> : this will\nclear the HVM boot mode from the VM.</li>\n<li><code>xe vm-param-set uuid=<uuid> PV-bootloader=pygrub</code> : this\nwill switch the VM to using the pygrub bootloader, which\nstarts the guest in PV mode by examining its filesystem for a\nkernel.</li>\n<li><code>xe vm-param-set uuid=<uuid> PV-args="console=tty0 xencons=tty"</code>\n: this configures the kernel boot arguments to display the\nlogin console on the correct TTY, so that it shows up in the\nXenCenter console.</li>\n</ul>\n</li>\n<li>Next, you need to flag the root disk of the VM as bootable so\nthat pygrub knows where to look for the PV kernel:\n<ul>\n<li><code>xe vm-disk-list uuid=<uuid></code> and look for the UUID of the\nVBD for the disk. VBD stands for "Virtual Block Device" and\nrepresents how to map the virtual disk into the virtual\nmachine.</li>\n<li><code>xe vbd-param-set uuid=<vbd uuid> bootable=true</code> will set\nthe root disk VBD to be bootable.</li>\n</ul>\n</li>\n</ul>\n</li>\n<li>\n<p>You should be all set now! If you boot up the Ubuntu VM, it should\nstart up in text-mode with the high-speed PV kernel. If it doesn't\nwork due to an incorrect grub configuration, you can use the\n<code>xe-edit-bootloader</code> script in the XenServer control domain to edit\nthe <code>grub.conf</code> until it works.</p>\n</li>\n<li>\n<p>The next step is to install the XenServer tools within the guest, so\nthat metrics such as the network interface IP addresses are recorded\nand reported from XenCenter. 
To do this:</p>\n<ul>\n<li>Due to a portability issue with the default shell in Ubuntu\n(<a href=\"http://en.wikipedia.org/wiki/Debian_Almquist_shell\">dash</a>),\nyou will need to replace it by running:\n<code>sudo apt-get -y install bash && sudo dpkg-reconfigure dash</code>.\nWe've actually fixed this issue in future releases of XenServer,\nbut for XenServer 4.1 you will need to use <code>bash</code>.</li>\n<li>Attach the XenServer Tools ISO into the VM, and mount it into\nthe guest with <code>sudo mount /dev/xvdd /mnt</code></li>\n<li>Install the tools with\n<code>sudo dpkg -i /mnt/Linux/xe-guest-utilities_4.1.0-257_i386.deb</code>.</li>\n<li>The warnings about the VM being unoptimized should disappear,\nand additional information such as the IP address of the guest\nshould appear in XenCenter.</li>\n</ul>\n</li>\n<li>\n<p>In order to access the Ubuntu installation via the graphical\nconsole, you need to configure it to run\n<a href=\"http://www.realvnc.com/\">VNC</a> on the external network interface.\nXenCenter polls the guest to see if it is listening on the VNC port\n5900, and offers the option to switch to the graphical console if it\nfinds it. 
I followed the excellent instructions on this <a href=\"http://ubuntuforums.org/showpost.php?p=4963842&postcount=1\">forum\npost</a>.\nTo summarise them:</p>\n<ul>\n<li>\n<p><code>sudo apt-get install vnc4server xinetd</code> : to install the\nrequired packages</p>\n</li>\n<li>\n<p>Edit <code>/etc/gdm/gdm.conf</code> and uncomment the\n<code>RemoteGreeter=/usr/lib/gdm/gdmlogin</code> line, and set the key\n<code>Enable=true</code> in the <code>[xdmcp]</code> section.</p>\n</li>\n<li>\n<p>Install a new service file for <code>xinetd</code> into\n<code>/etc/xinetd.d/Xvnc</code> with the following contents:</p>\n<pre><code>service Xvnc\n{\n type = UNLISTED\n disable = no\n socket_type = stream\n protocol = tcp\n wait = no\n user = nobody\n server = /usr/bin/Xvnc\n server_args = -inetd -query localhost -geometry 1024x768 -depth 16 -cc 3 -once -SecurityTypes=none -extension XFIXES\n port = 5900\n}\n</code></pre>\n</li>\n<li>\n<p>The major difference from the forum poster is to run it on port\n5900, and not to restrict it to just localhost (since XenCenter\nalso needs to connect to it).</p>\n</li>\n<li>\n<p>Finally, restart the <code>xinetd</code> service by running\n<code>sudo /etc/init.d/xinetd restart</code>.</p>\n</li>\n</ul>\n</li>\n</ul>\n<p>Once you're done with this installation, you can shut down the VM and\nconvert it to a template. Any exports or clones will continue to run in\nPV mode, since the XenServer XVA export format records all of the\nmetadata required to re-create the VM records.</p>\n<p>Enjoy the Ubuntu on XenServer experience! Remember to report any issues\nyou have with the in-guest packages on the Ubuntu support forums, or\njust give them positive feedback.</p>\n<p>PS: many thanks to Andrew Peace and Ian Campbell for assistance. 
May\ntheir Linux beards remain long and uncut.</p>",+"content": "<p>I thought I\u2019d kick off my Citrix blog with a question I get pretty often\nfrom Linux enthusiasts: how to install unsupported Linux distributions\non <a href=\"https://xenserver.com\">XenServer</a> 4.1.</p>\n<p>The most common approach people try is to use the "Other Install Media"\ntemplate, insert the distribution installation CD, and find that the\nmouse cursor doesn\u2019t work when they boot into X11. The reason for this\nis that they are using the hardware-assisted emulation mode of\ninstalling Linux. In this mode (dubbed \u201cHVM\u201d), all input and output is\nemulated, and in particular the mouse interface uses the USB tablet\ninterface. If the distribution doesn\u2019t include a driver for USB tablets,\nthen no mouse will appear.</p>\n<p>Windows guests run at high speed in HVM mode due to the installation of\nthe XenServer tools which install <a href=\"http://xen.org/files/summit_3/xen-pv-drivers.pdf\">high-speed\ndrivers</a>, but these\nare not necessary for Linux distributions since they can be run in\n<a href=\"http://en.wikipedia.org/wiki/Paravirtualization\">para-virtualized</a> mode\n(dubbed \u201cPV\u201d). This involves obtaining a Xen-enabled PV kernel from the\ndistribution, and modifying the VM record in XenServer to boot into this\nkernel instead of HVM mode. The XenServer built-in templates for popular\ndistributions such as RHEL, CentOS or SUSE Linux already automate all\nthis and are in PV mode from the installer onwards.</p>\n<p>In the remainder of this post, I\u2019ll explain how to take a distribution\nwithout direct support (<a href=\"http://www.ubuntu.com/\">Ubuntu</a>\n<a href=\"https://wiki.ubuntu.com/HardyHeron\">8.04</a>), get it installed in HVM\nmode on XenServer 4.1, and convert it to PV mode with a XenCenter\ngraphical console.</p>\n<ul>\n<li>\n<p>Download the "<a href=\"http://www.ubuntu.com/GetUbuntu/download\">Alternative Installation\nCD</a>". 
The main\ninstallation CD uses graphical mode, which doesn't work as well in\nHVM mode due to the use of esoteric 16-bit mode instructions for the\ngraphics operations. The 16-bit emulation mechanisms vary between\nprocessors (with better support on AMD chips, and a software\ninstruction emulator required on Intel VT chips). However, the\nUbuntu alternate CD uses a text-based installer which works fine.</p>\n</li>\n<li>\n<p>Create a new VM on the XenServer 4.1 host using the "Windows Server\n2003" template. This template is set up with a sensible set of\nhardware emulation flags and default disks, and so is a good base\nfor the HVM installation of Ubuntu as well. Attach the Ubuntu ISO\nyou just downloaded to the VM, and proceed to install Ubuntu as\nnormal. You should install it onto the first disk, to make the\nsubsequent steps in this guide easier.</p>\n</li>\n<li>\n<p>When the installation is finished, reboot the VM (don't forget to\ndetach the installation ISO first). It should boot up in HVM mode\ninto the graphical login screen. The XenCenter display will show it\nas not being optimized, which is fine. At this stage, I prefer to\nwork via a remote command-line using SSH. Open up a Terminal from\nUbuntu, and run "<code>sudo apt-get install openssh-server</code>". Then find\nout the VM's IP address with "<code>ifconfig eth0</code>", and then connect to\nit remotely. Alternatively, you can continue to type in the commands\ndirectly into the terminal as well.</p>\n</li>\n<li>\n<p>On the Ubuntu guest, you now need to install the latest Xen version\nof the Ubuntu kernel:</p>\n<ul>\n<li>Install the Linux kernel virtual package with\n"<code>sudo apt-get install linux-image-xen</code>". This is a virtual\npackage which pulls in the latest Xen kernel and modules, in my\ncase <code>2.6.24.19.21</code>.</li>\n<li>You now need to work around a\n<a href=\"http://www.mail-archive.com/grub-devel@gnu.org/msg06024.html\">bug</a>\nin grub. 
Due to the switch in recent versions of Linux to work\nwith the hypervisor-independent\n<a href=\"http://xen.xensource.com/files/xensummit_4/xen-paravirt_ops_Fitzhardinge.pdf\">paravirt_ops</a>\ninterface, <code>update-grub</code> doesn't update the grub configuration\nwith your newly installed Xen kernel. To fix this:\n<ul>\n<li>\n<p>Open <code>/boot/grub/menu.lst</code> in your favourite editor.</p>\n</li>\n<li>\n<p>Scroll to the bottom to the kernel list, and find the entry\nwhich looks like:</p>\n<pre><code>title Ubuntu 8.04, kernel 2.6.24-16-generic\nroot (hd0,0)\nkernel /boot/vmlinuz-2.6.24-16-generic root=UUID=<uuid> ro quiet splash\ninitrd /boot/initrd.img-2.6.24-16-generic\nquiet\n</code></pre>\n</li>\n<li>\n<p>Add a new entry which is similar to this, but change all\nreferences to the <code>2.6.24-16-generic</code> to the Xen kernel. In\n<code>/boot</code> I have <code>vmlinuz-2.6.24-19-xen</code>, so my new entry\nlooks like:</p>\n<pre><code>title Ubuntu 8.04, kernel 2.6.24-19-xen\nroot (hd0,0)\nkernel /boot/vmlinuz-2.6.24-19-xen root=UUID=<uuid> ro quiet splash\ninitrd /boot/initrd.img-2.6.24-19-xen\nquiet\n</code></pre>\n</li>\n<li>\n<p>Also edit the <code>default</code> entry in the <code>menu.lst</code> to match the\nnumber of the kernel you just added. I set mine to 3, since\nit is the fourth entry in the list and the indexing starts\nfrom 0.</p>\n</li>\n</ul>\n</li>\n</ul>\n</li>\n<li>\n<p>When this is done, shut down the guest but do not reboot it just\nyet. You first need to edit the VM record for your Ubuntu VM to\nconvert it to PV boot mode. From the control domain console of your\nXenServer:</p>\n<ul>\n<li>Determine the UUID of the Ubuntu VM by using the <code>xe</code> CLI:\n<ul>\n<li><code>xe vm-list name-label=Ubuntu params=uuid --minimal</code> : this\nwill print out the UUID of the VM named "Ubuntu". 
If you are\nlogged into the control domain, pressing the <code><tab></code> key\nwill perform auto-completion of UUIDs in subsequent XE\ncommands, so you don't need to keep typing it in every time!</li>\n<li><code>xe vm-param-set uuid=<uuid> HVM-boot-policy=</code> : this will\nclear the HVM boot mode from the VM.</li>\n<li><code>xe vm-param-set uuid=<uuid> PV-bootloader=pygrub</code> : this\nwill switch the VM to using the pygrub bootloader, which\nstarts the guest in PV mode by examining its filesystem for a\nkernel.</li>\n<li><code>xe vm-param-set uuid=<uuid> PV-args="console=tty0 xencons=tty"</code>\n: this configures the kernel boot arguments to display the\nlogin console on the correct TTY, so that it shows up in the\nXenCenter console.</li>\n</ul>\n</li>\n<li>Next, you need to flag the root disk of the VM as bootable so\nthat pygrub knows where to look for the PV kernel:\n<ul>\n<li><code>xe vm-disk-list uuid=<uuid></code> and look for the UUID of the\nVBD for the disk. VBD stands for "Virtual Block Device" and\nrepresents how to map the virtual disk into the virtual\nmachine.</li>\n<li><code>xe vbd-param-set uuid=<vbd uuid> bootable=true</code> will set\nthe root disk VBD to be bootable.</li>\n</ul>\n</li>\n</ul>\n</li>\n<li>\n<p>You should be all set now! If you boot up the Ubuntu VM, it should\nstart up in text-mode with the high-speed PV kernel. If it doesn't\nwork due to an incorrect grub configuration, you can use the\n<code>xe-edit-bootloader</code> script in the XenServer control domain to edit\nthe <code>grub.conf</code> until it works.</p>\n</li>\n<li>\n<p>The next step is to install the XenServer tools within the guest, so\nthat metrics such as the network interface IP addresses are recorded\nand reported from XenCenter. 
To do this:</p>\n<ul>\n<li>Due to a portability issue with the default shell in Ubuntu\n(<a href=\"http://en.wikipedia.org/wiki/Debian_Almquist_shell\">dash</a>),\nyou will need to replace it by running:\n<code>sudo apt-get -y install bash && sudo dpkg-reconfigure dash</code>.\nWe've actually fixed this issue in future releases of XenServer,\nbut for XenServer 4.1 you will need to use <code>bash</code>.</li>\n<li>Attach the XenServer Tools ISO into the VM, and mount it into\nthe guest with <code>sudo mount /dev/xvdd /mnt</code></li>\n<li>Install the tools with\n<code>sudo dpkg -i /mnt/Linux/xe-guest-utilities_4.1.0-257_i386.deb</code>.</li>\n<li>The warnings about the VM being unoptimized should disappear,\nand additional information such as the IP address of the guest\nshould appear in XenCenter.</li>\n</ul>\n</li>\n<li>\n<p>In order to access the Ubuntu installation via the graphical\nconsole, you need to configure it to run\n<a href=\"http://www.realvnc.com/\">VNC</a> on the external network interface.\nXenCenter polls the guest to see if it is listening on the VNC port\n5900, and offers the option to switch to the graphical console if it\nfinds it. 
I followed the excellent instructions on this <a href=\"http://ubuntuforums.org/showpost.php?p=4963842&postcount=1\">forum\npost</a>.\nTo summarise them:</p>\n<ul>\n<li>\n<p><code>sudo apt-get install vnc4server xinetd</code> : to install the\nrequired packages</p>\n</li>\n<li>\n<p>Edit <code>/etc/gdm/gdm.conf</code>, uncomment the\n<code>RemoteGreeter=/usr/lib/gdm/gdmlogin</code> line, and set the key\n<code>Enable=true</code> in the <code>[xdmcp]</code> section.</p>\n</li>\n<li>\n<p>Install a new service file for <code>xinetd</code> into\n<code>/etc/xinetd.d/Xvnc</code> with the following contents:</p>\n<pre><code>service Xvnc\n{\n type = UNLISTED\n disable = no\n socket_type = stream\n protocol = tcp\n wait = no\n user = nobody\n server = /usr/bin/Xvnc\n server_args = -inetd -query localhost -geometry 1024x768 -depth 16 -cc 3 -once -SecurityTypes=none -extension XFIXES\n port = 5900\n}\n</code></pre>\n</li>\n<li>\n<p>The major difference from the forum post is to run it on port\n5900, and not to restrict it to just localhost (since XenCenter\nalso needs to connect to it).</p>\n</li>\n<li>\n<p>Finally, restart the <code>xinetd</code> service by running\n<code>sudo /etc/init.d/xinetd restart</code>.</p>\n</li>\n</ul>\n</li>\n</ul>\n<p>Once you're done with this installation, you can shut down the VM and\nconvert it to a template. Any exports or clones will continue to run in\nPV mode, since the XenServer XVA export format records all of the\nmetadata required to re-create the VM records.</p>\n<p>Enjoy the Ubuntu on XenServer experience! Remember to report any issues\nyou have with the in-guest packages on the Ubuntu support forums, or\njust give them positive feedback.</p>\n<p>PS: many thanks to Andrew Peace and Ian Campbell for assistance. May\ntheir Linux beards remain long and uncut.</p>",
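The xe conversion steps in the post above can be collected into a short script. This is a dry-run sketch (it prints each command instead of executing it, since the commands only make sense inside a XenServer control domain); the UUID placeholders are illustrative and must be looked up with `xe vm-list` and `xe vm-disk-list` as described in the post.

```shell
#!/bin/sh
# Dry-run sketch of the PV-conversion steps: prints each xe command
# rather than executing it. Change run() to execute on a real host.
run() { echo "$@"; }

VM_UUID="<uuid>"       # placeholder: from xe vm-list
VBD_UUID="<vbd uuid>"  # placeholder: from xe vm-disk-list uuid=$VM_UUID

run xe vm-param-set uuid="$VM_UUID" HVM-boot-policy=       # clear HVM boot mode
run xe vm-param-set uuid="$VM_UUID" PV-bootloader=pygrub   # boot via pygrub
run xe vm-param-set uuid="$VM_UUID" PV-args="console=tty0 xencons=tty"
run xe vbd-param-set uuid="$VBD_UUID" bootable=true        # mark root disk bootable
```

Running it prints the four `xe` invocations in order, which can be reviewed before pasting them into the control domain.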
+18
avsm/notes_junior-rangers.json
···+"summary": "<p>What might a Dame of the Realm, a Fellow of the Royal Society, the latest member of the UK Joint Nature Conservation Committee, and me all covet? That's right: a <a href=\"https://www.nps.gov/kids/become-a-junior-ranger.htm\">Junior Ranger</a> badge from <a href=\"https://www.nps.gov/shen/index.htm\">Shenandoah National Park</a>! After an <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">intense</a> few days, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a>, <a href=\"https://en.wikipedia.org/wiki/E._J._Milner-Gulland\">EJ Milner-Gulland</a> and I headed into nature to experience the spectacular landscapes of the Blue Ridge Mountains in Virginia and do some birding.</p>\n<p>The National Park Service in the US runs a wonderful program for anyone aged 8+ (which we just about qualified for) to introduce people to nature, and Shenandoah <a href=\"https://www.goshenandoah.com/activities-events/national-park-service-programs\">is no exception</a>. We visited the local ranger lodge in the park, and picked up a program booklet. They're full of activities for kids to do, but of course adults also pick up a lot of random knowledge (such as the <a href=\"https://en.wikipedia.org/wiki/Shenandoah_salamander\">endemic salamander species</a> in the region).</p>\n<p>\n<img alt=\"EJ and Julia hard at work on their junior ranger books\" src=\"https://anil.recoil.org/images/shen-1.webp\" title=\"EJ and Julia hard at work on their junior ranger books\">\nEJ and Julia hard at work on their junior ranger books</p>\n<p>The activities are rigorous: in addition to crosswords and drawings about the park, we had to compose poems and haiku, and also experience nature. First, we had to dance like butterflies... 
(I was a Monarch, so I occasionally had to respawn while migrating).</p>\n<p></p><div></div><p></p>\n<p>...and then hug a tree, of which there were many around, and they told us their stories...</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/shen-2.webp\" title=\"\">\n</p>\n<p>...and try not to fall off sweeping overlooks as we clambered around...</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/shen-3.webp\" title=\"\">\n</p>\n<p>And once that was done, we returned to the ranger lodge to swear our ranger oaths, with a delightful Ranger who knew lots about the local area, the forest/meadow management, and where we might spot more of the salamanders (which are apparently under some threat from an invasive lookalike that is not endemic but is outcompeting the local salamander).</p>\n<p></p><div></div><p></p>\n<h2><a href=\"https://anil.recoil.org/#bills-conservation-concepts\"></a>Bill's Conservation Concepts</h2>\n<p>During our hikes to the various lovely sites, we also took some time to record the 101st video for <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s wonderful <a href=\"https://www.youtube.com/@Bill_Sutherland\">Conservation Concepts</a> video series, about the work <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been doing to help the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> team.</p>\n\n<p>I highly recommend watching <a href=\"https://www.youtube.com/@Bill_Sutherland\">Bill's whole channel series</a> as it's full of endlessly fascinating facts, and congratulations to Bill on reaching the century mark!</p>\n<p>\n<img alt=\"I&apos;m never taking my badge off, best day ever\" src=\"https://anil.recoil.org/images/shen-4.webp\" title=\"I&apos;m never taking my badge off, best day ever\">\nI'm never taking my badge off, best day ever</p>",+"content": "<p>What might a Dame of the Realm, a Fellow of the Royal Society, the 
latest member of the UK Joint Nature Conservation Committee, and me all covet? That's right: a <a href=\"https://www.nps.gov/kids/become-a-junior-ranger.htm\">Junior Ranger</a> badge from <a href=\"https://www.nps.gov/shen/index.htm\">Shenandoah National Park</a>! After an <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">intense</a> few days, <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>, <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a>, <a href=\"https://en.wikipedia.org/wiki/E._J._Milner-Gulland\">EJ Milner-Gulland</a> and I headed into nature to experience the spectacular landscapes of the Blue Ridge Mountains in Virginia and do some birding.</p>\n<p>The National Park Service in the US runs a wonderful program for anyone aged 8+ (which we just about qualified for) to introduce people to nature, and Shenandoah <a href=\"https://www.goshenandoah.com/activities-events/national-park-service-programs\">is no exception</a>. We visited the local ranger lodge in the park, and picked up a program booklet. They're full of activities for kids to do, but of course adults also pick up a lot of random knowledge (such as the <a href=\"https://en.wikipedia.org/wiki/Shenandoah_salamander\">endemic salamander species</a> in the region).</p>\n<p>\n<img alt=\"EJ and Julia hard at work on their junior ranger books\" src=\"https://anil.recoil.org/images/shen-1.webp\" title=\"EJ and Julia hard at work on their junior ranger books\">\nEJ and Julia hard at work on their junior ranger books</p>\n<p>The activities are rigorous: in addition to crosswords and drawings about the park, we had to compose poems and haiku, and also experience nature. First, we had to dance like butterflies... 
(I was a Monarch, so I occasionally had to respawn while migrating).</p>\n<p></p><div></div><p></p>\n<p>...and then hug a tree, of which there were many around, and they told us their stories...</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/shen-2.webp\" title=\"\">\n</p>\n<p>...and try not to fall off sweeping overlooks as we clambered around...</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/shen-3.webp\" title=\"\">\n</p>\n<p>And once that was done, we returned to the ranger lodge to swear our ranger oaths, with a delightful Ranger who knew lots about the local area, the forest/meadow management, and where we might spot more of the salamanders (which are apparently under some threat from an invasive lookalike that is not endemic but is outcompeting the local salamander).</p>\n<p></p><div></div><p></p>\n<h2><a href=\"https://anil.recoil.org/#bills-conservation-concepts\"></a>Bill's Conservation Concepts</h2>\n<p>During our hikes to the various lovely sites, we also took some time to record the 101st video for <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s wonderful <a href=\"https://www.youtube.com/@Bill_Sutherland\">Conservation Concepts</a> video series, about the work <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been doing to help the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> team.</p>\n\n<p>I highly recommend watching <a href=\"https://www.youtube.com/@Bill_Sutherland\">Bill's whole channel series</a> as it's full of endlessly fascinating facts, and congratulations to Bill on reaching the century mark!</p>\n<p>\n<img alt=\"I&apos;m never taking my badge off, best day ever\" src=\"https://anil.recoil.org/images/shen-4.webp\" title=\"I&apos;m never taking my badge off, best day ever\">\nI'm never taking my badge off, best day ever</p>",
+18
avsm/notes_komodo-docker-compose.json
···+"summary": "<p>With the <a href=\"https://www.tunbury.org/equinix-moves/\">sunsetting of Equinix Metal</a>\nI've also been migrating the Recoil machines over to new hosts in <a href=\"https://www.mythic-beasts.com/\">Mythic\nBeasts</a>. This time around, rather than manually\nsetting up services, I've turned to a nice new tool called\n<a href=\"https://github.com/moghtech/komodo\">Komodo</a> which helps with deploying Docker\ncontainers across multiple servers. Unlike many <a href=\"https://kubernetes.io/\">other</a>\ncontainer management solutions, Komodo is refreshingly simple. It has a mode\nwhere it can take <em>existing</em> <a href=\"https://docs.docker.com/compose/\">Docker compose</a> files on a\ngiven host, and run them, and provide a web-based monitor to keep an eye on a\nfew machines.</p>\n<h2><a href=\"https://anil.recoil.org/#the-komodo-interface\"></a>The Komodo interface</h2>\n<p>There's an online <a href=\"https://demo.komo.do/\">demo</a> of Komodo available (user/pass\nis demo/demo). The basic idea is that you first register servers (see below for\n"Periphery"), and then add in "Stacks" which represent a service each.</p>\n<p>\n<img alt=\"The list of Stacks running on Recoil\" src=\"https://anil.recoil.org/images/komodo-1.webp\" title=\"The list of Stacks running on Recoil\">\nThe list of Stacks running on Recoil</p>\n<p>Every stack is configured to run a <code>docker-compose.yml</code> service that is already\npresent on the host, and the web UI has a convenient way of pulling, deploying\nand polling the Docker Hub to check for updates.</p>\n<p>\n<img alt=\"The stack view for a Tangled.sh knot running on Recoil\" src=\"https://anil.recoil.org/images/komodo-2.webp\" title=\"The stack view for a Tangled.sh knot running on Recoil\">\nThe stack view for a Tangled.sh knot running on Recoil</p>\n<p>The autoupdate functionality is quite cool (if a touch risky), as it polls for the\nimages on the Docker Hub and updates to those automagically. 
While I've activated\nthis for services I'm happy autoupdating, it's also accompanied by a healthy\ndose of <a href=\"https://anil.recoil.org/notes/syncoid-sanoid-zfs\">ZFS snapshotting</a> so I can roll back if anything\nuntoward happens.</p>\n<p>\n<img alt=\"The alert view of autoupdates from polling the Hub\" src=\"https://anil.recoil.org/images/komodo-3.webp\" title=\"The alert view of autoupdates from polling the Hub\">\nThe alert view of autoupdates from polling the Hub</p>\n<p>Most importantly to me is that I can always switch away from Komodo at any time\nand directly interact with the services on the host using the normal <code>docker</code> CLI.\nKomodo is just coordinating the compose invocations in the lightest way possible,\nand not wrapping them in such a way that I lose access.</p>\n<h2><a href=\"https://anil.recoil.org/#setting-up-periphery-with-a-wireguard-mesh-and-dsnet\"></a>Setting up Periphery with a Wireguard mesh and dsnet</h2>\n<p>Komodo operates across multiple hosts by using something called a <a href=\"https://komo.do/docs/connect-servers\">periphery agent</a>\nwhich the main host issues RPCs to in order to carry out operations. This is obviously quite a privileged operation, and so rather than\nexpose it to the Internet I set up a Wireguard tunnel mesh across the Recoil hosts for these operations to go over.</p>\n<p>The easiest way to do this was via <a href=\"https://github.com/naggie/dsnet\">dsnet</a>, which generates the configurations and keys\nsuitable for a <a href=\"https://www.man7.org/linux/man-pages/man8/wg-quick.8.html\">wg-quick</a> service to run on each host and connect\nto its peers. 
Following the instructions let me set up this mesh in minutes; this is a much simpler solution than\n<a href=\"https://tailscale.com\">Tailscale</a> precisely because of its lack of flexibility, but all I want here is a few hosts connected by static interfaces\nand with no need for <a href=\"https://tailscale.com/blog/how-nat-traversal-works\">complex NAT punching</a>. Once the dsnet configuration is\nset up, all that's needed is to activate the <code>wg-quick</code> service on each of the hosts, and they spin up a virtual interface.</p>\n<p>After this, the Periphery setup was straightforward but with one twist. I configured the agent to bind to the wireguard IP, e.g.:</p>\n<pre><code>/etc/komodo/periphery.config.toml\n################################\n# \ud83e\udd8e KOMODO PERIPHERY CONFIG \ud83e\udd8e #\n################################\n\nport = 8120\nbind_ip = "10.100.0.2"\n</code></pre>\n<p>But then on reboot the periphery agent would fail to start up because the wireguard service was too low a priority in the boot order. This was fixed by a systemd tweak (which took me longer to figure out than the rest of the entire setup put together, since I find systemd utterly inscrutable).</p>\n<pre><code>/etc/systemd/system/periphery.service\n[Unit]\nDescription=Agent to connect with Komodo Core\nAfter=wg-quick@wg0.service\n</code></pre>\n<p>This little tweak to the unit file, followed by umpteen <code>daemon-reload</code> prods and\nreboots to get systemd happy, did the trick.</p>\n<p>I'm pretty happy with Komodo, thank you to the devs! 
It's a system that's simple enough that I can try\nit out progressively, and can bypass it easily if required, and provides a very\nuseful part of the <a href=\"https://anil.recoil.org/news?t=selfhosting\">selfhosting</a> jigsaw puzzle.</p>",+"content": "<p>With the <a href=\"https://www.tunbury.org/equinix-moves/\">sunsetting of Equinix Metal</a>\nI've also been migrating the Recoil machines over to new hosts in <a href=\"https://www.mythic-beasts.com/\">Mythic\nBeasts</a>. This time around, rather than manually\nsetting up services, I've turned to a nice new tool called\n<a href=\"https://github.com/moghtech/komodo\">Komodo</a> which helps with deploying Docker\ncontainers across multiple servers. Unlike many <a href=\"https://kubernetes.io/\">other</a>\ncontainer management solutions, Komodo is refreshingly simple. It has a mode\nwhere it can take <em>existing</em> <a href=\"https://docs.docker.com/compose/\">Docker compose</a> files on a\ngiven host, and run them, and provide a web-based monitor to keep an eye on a\nfew machines.</p>\n<h2><a href=\"https://anil.recoil.org/#the-komodo-interface\"></a>The Komodo interface</h2>\n<p>There's an online <a href=\"https://demo.komo.do/\">demo</a> of Komodo available (user/pass\nis demo/demo). 
The basic idea is that you first register servers (see below for\n"Periphery"), and then add in "Stacks" which represent a service each.</p>\n<p>\n<img alt=\"The list of Stacks running on Recoil\" src=\"https://anil.recoil.org/images/komodo-1.webp\" title=\"The list of Stacks running on Recoil\">\nThe list of Stacks running on Recoil</p>\n<p>Every stack is configured to run a <code>docker-compose.yml</code> service that is already\npresent on the host, and the web UI has a convenient way of pulling, deploying\nand polling the Docker Hub to check for updates.</p>\n<p>\n<img alt=\"The stack view for a Tangled.sh knot running on Recoil\" src=\"https://anil.recoil.org/images/komodo-2.webp\" title=\"The stack view for a Tangled.sh knot running on Recoil\">\nThe stack view for a Tangled.sh knot running on Recoil</p>\n<p>The autoupdate functionality is quite cool (if a touch risky), as it polls for the\nimages on the Docker Hub and updates to those automagically. While I've activated\nthis for services I'm happy autoupdating, it's also accompanied by a healthy\ndose of <a href=\"https://anil.recoil.org/notes/syncoid-sanoid-zfs\">ZFS snapshotting</a> so I can roll back if anything\nuntoward happens.</p>\n<p>\n<img alt=\"The alert view of autoupdates from polling the Hub\" src=\"https://anil.recoil.org/images/komodo-3.webp\" title=\"The alert view of autoupdates from polling the Hub\">\nThe alert view of autoupdates from polling the Hub</p>\n<p>Most importantly to me is that I can always switch away from Komodo at any time\nand directly interact with the services on the host using the normal <code>docker</code> CLI.\nKomodo is just coordinating the compose invocations in the lightest way possible,\nand not wrapping them in such a way that I lose access.</p>\n<h2><a href=\"https://anil.recoil.org/#setting-up-periphery-with-a-wireguard-mesh-and-dsnet\"></a>Setting up Periphery with a Wireguard mesh and dsnet</h2>\n<p>Komodo operates across multiple hosts by using something 
called a <a href=\"https://komo.do/docs/connect-servers\">periphery agent</a>\nwhich the main host issues RPCs to in order to carry out operations. This is obviously quite a privileged operation, and so rather than\nexpose it to the Internet I set up a Wireguard tunnel mesh across the Recoil hosts for these operations to go over.</p>\n<p>The easiest way to do this was via <a href=\"https://github.com/naggie/dsnet\">dsnet</a>, which generates the configurations and keys\nsuitable for a <a href=\"https://www.man7.org/linux/man-pages/man8/wg-quick.8.html\">wg-quick</a> service to run on each host and connect\nto its peers. Following the instructions let me set up this mesh in minutes; this is a much simpler solution than\n<a href=\"https://tailscale.com\">Tailscale</a> precisely because of its lack of flexibility, but all I want here is a few hosts connected by static interfaces\nand with no need for <a href=\"https://tailscale.com/blog/how-nat-traversal-works\">complex NAT punching</a>. Once the dsnet configuration is\nset up, all that's needed is to activate the <code>wg-quick</code> service on each of the hosts, and they spin up a virtual interface.</p>\n<p>After this, the Periphery setup was straightforward but with one twist. I configured the agent to bind to the wireguard IP, e.g.:</p>\n<pre><code>/etc/komodo/periphery.config.toml\n################################\n# \ud83e\udd8e KOMODO PERIPHERY CONFIG \ud83e\udd8e #\n################################\n\nport = 8120\nbind_ip = "10.100.0.2"\n</code></pre>\n<p>But then on reboot the periphery agent would fail to start up because the wireguard service was too low a priority in the boot order. 
This was fixed by a systemd tweak (which took me longer to figure out than the rest of the entire setup put together, since I find systemd utterly inscrutable).</p>\n<pre><code>/etc/systemd/system/periphery.service\n[Unit]\nDescription=Agent to connect with Komodo Core\nAfter=wg-quick@wg0.service\n</code></pre>\n<p>This little tweak to the unit file, followed by umpteen <code>daemon-reload</code> prods and\nreboots to get systemd happy, did the trick.</p>\n<p>I'm pretty happy with Komodo, thank you to the devs! It's a system that's simple enough that I can try\nit out progressively, and can bypass it easily if required, and provides a very\nuseful part of the <a href=\"https://anil.recoil.org/news?t=selfhosting\">selfhosting</a> jigsaw puzzle.</p>",
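The systemd ordering fix described in the post can also be expressed as a drop-in, which survives package upgrades of the unit file. This is a sketch under the assumptions that the service is named `periphery.service` and the dsnet-generated interface is `wg0`; it writes under a `DESTDIR` prefix so it can be previewed without root (set `DESTDIR=""` and run as root on the real host).

```shell
#!/bin/sh
# Sketch: express the wireguard-before-periphery ordering as a systemd
# drop-in instead of editing periphery.service directly. DESTDIR lets
# this run unprivileged for preview; use DESTDIR="" as root for real.
DESTDIR="${DESTDIR:-$(mktemp -d)}"
UNIT_DIR="$DESTDIR/etc/systemd/system/periphery.service.d"
mkdir -p "$UNIT_DIR"
cat > "$UNIT_DIR/override.conf" <<'EOF'
[Unit]
# Wait for the wireguard interface before binding to its IP.
After=wg-quick@wg0.service
Wants=wg-quick@wg0.service
EOF
# On the real host, follow with:
#   systemctl daemon-reload && systemctl restart periphery.service
echo "wrote $UNIT_DIR/override.conf"
```

The `Wants=` line additionally pulls the wireguard unit in if it wasn't otherwise enabled; `After=` alone only orders the two if both are started.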
+18
avsm/notes_life-official-statistic.json
···+"summary": "<p>Our <a href=\"https://anil.recoil.org/papers/2024-life\">recently published</a> <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> biodiversity metric has just been integrated into a newly recognised <a href=\"https://defraenvironment.blog.gov.uk/2025/01/20/newly-recognised-official-statistic-tracks-the-environmental-impact-of-our-consumption/\">Official Statistic from the UK government</a>! This integrates the core LIFE biodiversity metric with <a href=\"https://anil.recoil.org/papers/2024-food-life\">food provenance data</a> to track the environmental impacts of our consumption habits.</p>\n<p>I must admit that I'd not heard of "Official Statistics" before this, so I did a bit of research. The UK <a href=\"https://osr.statisticsauthority.gov.uk/\">Office for Statistics Regulation</a> says that:</p>\n<blockquote>\n<p>Official statistics are statistics produced by Crown bodies and other organisations listed within an Official Statistics Order, on behalf of the UK government or devolved administrations.\nThey provide a factual basis for assessment and decisions on economic, social and environmental issues at all levels of society.\n-- <a href=\"https://osr.statisticsauthority.gov.uk/policies/official-statistics-policies/\">OSR Policies</a> </p>\n</blockquote>\n<p>The good folk at the <a href=\"https://jncc.gov.uk/\">Joint Nature Conservation Committee</a> are responsible for this particular statistic. The JNCC launched their <a href=\"https://data.jncc.gov.uk/data/ccb9f624-7121-4c32-aefa-e0579d7eaaa1/together-for-nature.pdf\">Together for Nature</a> strategy in 2023, and have the remit of turning scientific outcomes into robust evidence-based action for protecting nature worldwide. 
They've been developing a <a href=\"https://commodityfootprints.earth/\">Global Environmental Impacts of Consumption</a> indicator that provides information to policymakers about the tradeoffs of various consumption actions vs the corresponding global environmental impact.</p>\n<blockquote>\n<p>When products arrive in our shops and on our doorsteps, they can look very different to the raw ingredients that were used to make them. Most products are made up of many parts, and these can move thousands of miles through many countries before reaching their final destination.</p>\n<p>The average consumer is so far removed from the production process, both physically and conceptually, that it is hard to imagine where their products are from, let alone the environmental impacts resulting from their production. This is also true for the academics and governments working to monitor and reduce the environmental impact of our consumption.\n-- <a href=\"https://defraenvironment.blog.gov.uk/2025/01/20/newly-recognised-official-statistic-tracks-the-environmental-impact-of-our-consumption/\">DEFRA Blog</a></p>\n</blockquote>\n<p>Their <a href=\"https://commodityfootprints.earth/\">GEIC tool</a>, developed jointly with the <a href=\"https://www.sei.org/\">Stockholm Environmental Institute</a> and our LIFE collaborator <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a>, provides data on spatial biodiversity, water use, and forest cover changes associated with a country's consumption.</p>\n<a href=\"https://commodityfootprints.earth/?footprint_type=consuming&footprint_opposite=producing&focal_country=United+Kingdom+of+Great+Britain+and+Northern+Ireland&measure=LIFE_score_embedded_in_consumption__change_in_prob_of_extinct_n&filter_year=2022&domestic_flows=true&lang=en#dashboard\">\n<p>\n<img alt=\"The consumption impacts of the UK on global species extinctions\" src=\"https://anil.recoil.org/images/life-statistic-1.webp\" title=\"The consumption impacts of 
the UK on global species extinctions\">\nThe consumption impacts of the UK on global species extinctions</p>\n</a>\n<p>This metric is updated annually, and this year <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> supplied LIFE+FOOD to additionally map where species are more likely to go extinct as a result of land-use change.</p>\n<p>This year's GEIC update is the first where it was recognised by the OSR as being of sufficient stability and quality to "graduate" into Official Statistic status. A pretty cool feeling, and it's all openly downloadable of course; you can <a href=\"https://commodityfootprints.earth/?footprint_type=consuming&footprint_opposite=producing&focal_country=United+Kingdom+of+Great+Britain+and+Northern+Ireland&measure=LIFE_score_embedded_in_consumption__change_in_prob_of_extinct_n&filter_year=2022&domestic_flows=true&lang=en\">navigate</a> over to the Commodity Footprints LIFE section to explore the metrics for yourself.</p>\n<p>From a <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a> perspective, what I also found interesting is how the flow of observations and evidence works in practice. 
The computational processing for LIFE involves <a href=\"https://anil.recoil.org/ideas/effective-geospatial-code\">crunching</a> petabytes of raster maps from a <a href=\"https://github.com/quantifyearth/aoh-calculator\">species habitat pipeline</a> into a global map, which is then published as an aggregate map on <a href=\"https://zenodo.org/records/14945383\">Zenodo</a> by <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>\n<img alt=\"The LIFE map in false colour around the equatorial region (credit: Tom Swinfield/Michael Dales)\" src=\"https://anil.recoil.org/images/life-statistic-2.webp\" title=\"The LIFE map in false colour around the equatorial region (credit: Tom Swinfield/Michael Dales)\">\nThe LIFE map in false colour around the equatorial region (credit: Tom Swinfield/Michael Dales)</p>\n<p><a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> and <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a> then worked directly with the policy team at the JNCC to further customise the metric for GEIC needs, and the aggregate result of that is what's actually used in the dashboard.</p>\n<p>There's quite a long gap between the original observations and the resulting policy use, with many humans in the loop in between. Computational systems need to capture all this nuance rather than viewing these metrics as "just" dataflow pipelines. However, it's equally important to capture the policy customisations in some sort of code, so that we can reliably issue annual updates. Figuring this pipeline out is part of what we're working on in the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project at present. 
See below for a <a href=\"https://anil.recoil.org/videos/d592bf17-c835-435f-9469-f0f65e926975\">recent talk</a> I gave on the functional programming aspects of this problem at LambdaDays.</p>\n<p></p><div></div><p></p>",+"content": "<p>Our <a href=\"https://anil.recoil.org/papers/2024-life\">recently published</a> <a href=\"https://anil.recoil.org/projects/life\">LIFE</a> biodiversity metric has just been integrated into a newly recognised <a href=\"https://defraenvironment.blog.gov.uk/2025/01/20/newly-recognised-official-statistic-tracks-the-environmental-impact-of-our-consumption/\">Official Statistic from the UK government</a>! This integrates the core LIFE biodiversity metric with <a href=\"https://anil.recoil.org/papers/2024-food-life\">food provenance data</a> to track the environmental impacts of our consumption habits.</p>\n<p>I must admit that I'd not heard of "Official Statistics" before this, so I did a bit of research. The UK <a href=\"https://osr.statisticsauthority.gov.uk/\">Office for Statistics Regulation</a> says that:</p>\n<blockquote>\n<p>Official statistics are statistics produced by Crown bodies and other organisations listed within an Official Statistics Order, on behalf of the UK government or devolved administrations.\nThey provide a factual basis for assessment and decisions on economic, social and environmental issues at all levels of society.\n-- <a href=\"https://osr.statisticsauthority.gov.uk/policies/official-statistics-policies/\">OSR Policies</a> </p>\n</blockquote>\n<p>The good folk at the <a href=\"https://jncc.gov.uk/\">Joint Nature Conservation Committee</a> are responsible for this particular statistic. The JNCC launched their <a href=\"https://data.jncc.gov.uk/data/ccb9f624-7121-4c32-aefa-e0579d7eaaa1/together-for-nature.pdf\">Together for Nature</a> strategy in 2023, and have the remit of turning scientific outcomes into robust evidence-based action for protecting nature worldwide. 
They've been developing a <a href=\"https://commodityfootprints.earth/\">Global Environmental Impacts of Consumption</a> indicator that provides information to policymakers about the tradeoffs of various consumption actions vs the corresponding global environmental impact.</p>\n<blockquote>\n<p>When products arrive in our shops and on our doorsteps, they can look very different to the raw ingredients that were used to make them. Most products are made up of many parts, and these can move thousands of miles through many countries before reaching their final destination.</p>\n<p>The average consumer is so far removed from the production process, both physically and conceptually, that it is hard to imagine where their products are from, let alone the environmental impacts resulting from their production. This is also true for the academics and governments working to monitor and reduce the environmental impact of our consumption.\n-- <a href=\"https://defraenvironment.blog.gov.uk/2025/01/20/newly-recognised-official-statistic-tracks-the-environmental-impact-of-our-consumption/\">DEFRA Blog</a></p>\n</blockquote>\n<p>Their <a href=\"https://commodityfootprints.earth/\">GEIC tool</a>, developed jointly with the <a href=\"https://www.sei.org/\">Stockholm Environmental Institute</a> and our LIFE collaborator <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a>, provides data on spatial biodiversity, water use, and forest cover changes associated with a country's consumption.</p>\n<a href=\"https://commodityfootprints.earth/?footprint_type=consuming&footprint_opposite=producing&focal_country=United+Kingdom+of+Great+Britain+and+Northern+Ireland&measure=LIFE_score_embedded_in_consumption__change_in_prob_of_extinct_n&filter_year=2022&domestic_flows=true&lang=en#dashboard\">\n<p>\n<img alt=\"The consumption impacts of the UK on global species extinctions\" src=\"https://anil.recoil.org/images/life-statistic-1.webp\" title=\"The consumption impacts of 
the UK on global species extinctions\">\nThe consumption impacts of the UK on global species extinctions</p>\n</a>\n<p>This metric is updated annually, and this year <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> supplied LIFE+FOOD to additionally map where species are more likely to go extinct as a result of land-use change.</p>\n<p>This year's GEIC update is the first where it was recognised by the OSR as being of sufficient stability and quality to "graduate" into Official Statistic status. A pretty cool feeling, and it's all openly downloadable of course; you can <a href=\"https://commodityfootprints.earth/?footprint_type=consuming&footprint_opposite=producing&focal_country=United+Kingdom+of+Great+Britain+and+Northern+Ireland&measure=LIFE_score_embedded_in_consumption__change_in_prob_of_extinct_n&filter_year=2022&domestic_flows=true&lang=en\">navigate</a> over to the Commodity Footprints LIFE section to explore the metrics for yourself.</p>\n<p>From a <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a> perspective, what I also found interesting is how the flow of observations and evidence works in practice. 
The computational processing for LIFE involves <a href=\"https://anil.recoil.org/ideas/effective-geospatial-code\">crunching</a> petabytes of raster maps from a <a href=\"https://github.com/quantifyearth/aoh-calculator\">species habitat pipeline</a> into a global map, which is then published as an aggregate map on <a href=\"https://zenodo.org/records/14945383\">Zenodo</a> by <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p>\n<img alt=\"The LIFE map in false colour around the equatorial region (credit: Tom Swinfield/Michael Dales)\" src=\"https://anil.recoil.org/images/life-statistic-2.webp\" title=\"The LIFE map in false colour around the equatorial region (credit: Tom Swinfield/Michael Dales)\">\nThe LIFE map in false colour around the equatorial region (credit: Tom Swinfield/Michael Dales)</p>\n<p><a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> and <a href=\"https://www.york.ac.uk/sei/staff/jonathan-green/\">Jonathan Green</a> then worked directly with the policy team at the JNCC to further customise the metric for GEIC needs, and the aggregate result of that is what's actually used in the dashboard.</p>\n<p>There's quite a long gap between the original observations and the resulting policy use, with many humans in the loop in between. Computational systems need to capture all this nuance rather than viewing these metrics as "just" dataflow pipelines. However, it's equally important to capture the policy customisations in some sort of code, so that we can reliably issue annual updates. Figuring this pipeline out is part of what we're working on in the <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence Copilots</a> project at present. 
See below for a <a href=\"https://anil.recoil.org/videos/d592bf17-c835-435f-9469-f0f65e926975\">recent talk</a> I gave on the functional programming aspects of this problem at LambdaDays.</p>\n<p></p><div></div><p></p>",
+18
avsm/notes_liveblog-plos-2013.json
···+"summary": "<p>I co-chaired the Programming Languages and Operating Systems workshop at SOSP 2013, and made livenotes about the (many) papers presented there.</p>",+"content": "<p>I co-chaired the Programming Languages and Operating Systems workshop at SOSP 2013, and made livenotes about the (many) papers presented there.</p>",
+18
avsm/notes_loco24-talks-online.json
···+"summary": "<p>The sister conference to <a href=\"https://anil.recoil.org/notes/propl-at-splash\">PROPL</a> was held late last year in Scotland with a bumper attendance from Cambridge. All of the talks from it are now available online at <a href=\"https://www.youtube.com/@loco-workshop\">YouTube</a>, or on our ad-free <a href=\"https://watch.eeg.cl.cam.ac.uk/c/loco/videos\">EEG video site</a>.\nThe keynote from <a href=\"https://www.annecurrie.com\">Anne Currie</a> was fantastic and wide-ranging (she is the author of the eerily predictive <a href=\"https://www.annecurrie.com/chapter-1-utopia-five\">Panopticon series</a>):</p>\n<p></p><div></div><p></p>\n<p>I was also involved with four fun talks up there given by collaborators:</p>\n<p></p><div></div>\n<div></div>\n<div></div>\n<div></div><p></p>\n<p>More broadly, I'm quite pleased with the <a href=\"https://anil.recoil.org/news?t=selfhosting\">selfhosting</a> of videos. I'm now operating three sites (with <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> and <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>):</p>\n<ul>\n<li>The <a href=\"https://watch.cl.cam.ac.uk\">EEG video site</a> with <a href=\"https://watch.eeg.cl.cam.ac.uk/about/instance#statistics\">100+ videos</a> of EEG talks and workshops like <a href=\"https://watch.eeg.cl.cam.ac.uk/c/propl24/videos\">PROPL</a> and LOCO (above).</li>\n<li>The <a href=\"https://watch.ocaml.org\">Watch OCaml</a> site with <a href=\"https://watch.ocaml.org/about/instance#statistics\">almost 200 videos across 20 years</a> of talks related to the OCaml programming language.</li>\n<li>My personal <a href=\"https://crank.recoil.org\">Recoil video mirror</a> with <a href=\"https://crank.recoil.org/about/instance/home\">~65 videos</a> of my own stuff.</li>\n</ul>\n<p>Many of the talks in the instances above have been sunset from their respective video hosting sites, so there's a strong element of robustness over time to hosting things ourselves. 
Each of the instances above also follows the others to <a href=\"https://docs.joinpeertube.org/admin/following-instances\">provide p2p redundancy</a> if a video hits the socials and goes viral.</p>",+"content": "<p>The sister conference to <a href=\"https://anil.recoil.org/notes/propl-at-splash\">PROPL</a> was held late last year in Scotland with a bumper attendance from Cambridge. All of the talks from it are now available online at <a href=\"https://www.youtube.com/@loco-workshop\">YouTube</a>, or on our ad-free <a href=\"https://watch.eeg.cl.cam.ac.uk/c/loco/videos\">EEG video site</a>.\nThe keynote from <a href=\"https://www.annecurrie.com\">Anne Currie</a> was fantastic and wide-ranging (she is the author of the eerily predictive <a href=\"https://www.annecurrie.com/chapter-1-utopia-five\">Panopticon series</a>):</p>\n<p></p><div></div><p></p>\n<p>I was also involved with four fun talks up there given by collaborators:</p>\n<p></p><div></div>\n<div></div>\n<div></div>\n<div></div><p></p>\n<p>More broadly, I'm quite pleased with the <a href=\"https://anil.recoil.org/news?t=selfhosting\">selfhosting</a> of videos. 
I'm now operating three sites (with <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> and <a href=\"https://ryan.freumh.org\">Ryan Gibb</a>):</p>\n<ul>\n<li>The <a href=\"https://watch.cl.cam.ac.uk\">EEG video site</a> with <a href=\"https://watch.eeg.cl.cam.ac.uk/about/instance#statistics\">100+ videos</a> of EEG talks and workshops like <a href=\"https://watch.eeg.cl.cam.ac.uk/c/propl24/videos\">PROPL</a> and LOCO (above).</li>\n<li>The <a href=\"https://watch.ocaml.org\">Watch OCaml</a> site with <a href=\"https://watch.ocaml.org/about/instance#statistics\">almost 200 videos across 20 years</a> of talks related to the OCaml programming language.</li>\n<li>My personal <a href=\"https://crank.recoil.org\">Recoil video mirror</a> with <a href=\"https://crank.recoil.org/about/instance/home\">~65 videos</a> of my own stuff.</li>\n</ul>\n<p>Many of the talks in the instances above have been sunset from their respective video hosting sites, so there's a strong element of robustness over time to hosting things ourselves. Each of the instances above also follows the others to <a href=\"https://docs.joinpeertube.org/admin/following-instances\">provide p2p redundancy</a> if a video hits the socials and goes viral.</p>",
+18
avsm/notes_mars-polar-lander.json
···+"summary": "<p>In my capacity as <a href=\"https://anil.recoil.org/papers/netapp-tr-3071\">webmaster</a> of the Mars Polar Lander, I submitted a note to Slashdot. Although our amazing distributed website took quite a beating (with some estimated 1 in 4 Internet users trying to access it simultaneously), the Lander itself <a href=\"https://www.wired.com/1999/12/mars-lander-wont-phone-home/\">sadly crashed</a>. On the bright side, I got mentioned in a <a href=\"https://web.archive.org/web/20020106163651/http://www.sun.com/smi/Press/sunflash/1999-12/sunflash.991202.1.html\">Sun press release</a> because of the Sun Netra T1 servers they gave us to host the website!</p>\n<p>You can read more about the architecture behind the site in "<a href=\"https://anil.recoil.org/papers/netapp-tr-3071\">Application of a Distributed Web Site Acceleration: Mars Polar Lander</a>".</p>",+"content": "<p>In my capacity as <a href=\"https://anil.recoil.org/papers/netapp-tr-3071\">webmaster</a> of the Mars Polar Lander, I submitted a note to Slashdot. Although our amazing distributed website took quite a beating (with some estimated 1 in 4 Internet users trying to access it simultaneously), the Lander itself <a href=\"https://www.wired.com/1999/12/mars-lander-wont-phone-home/\">sadly crashed</a>. On the bright side, I got mentioned in a <a href=\"https://web.archive.org/web/20020106163651/http://www.sun.com/smi/Press/sunflash/1999-12/sunflash.991202.1.html\">Sun press release</a> because of the Sun Netra T1 servers they gave us to host the website!</p>\n<p>You can read more about the architecture behind the site in "<a href=\"https://anil.recoil.org/papers/netapp-tr-3071\">Application of a Distributed Web Site Acceleration: Mars Polar Lander</a>".</p>",
+18
avsm/notes_mirage-self-hosting.json
···+"summary": "<p>I managed to get early <a href=\"https://mirageos.org\">MirageOS</a> suitably feature-complete enough that we could run the Mirage website using Mirage. This was all very satisfying after hacking on the <a href=\"https://github.com/mirage/mirage-tcpip\">TCP/IP</a> stack for ages.</p>",+"content": "<p>I managed to get early <a href=\"https://mirageos.org\">MirageOS</a> suitably feature-complete enough that we could run the Mirage website using Mirage. This was all very satisfying after hacking on the <a href=\"https://github.com/mirage/mirage-tcpip\">TCP/IP</a> stack for ages.</p>",
+18
avsm/notes_mirageos-hack-retreat-2016.json
···
+18
avsm/notes_mission-possible.json
···+"summary": "<p>I was on stage in New York for <a href=\"https://www.cam.ac.uk/news/cambridge-zero-highlights-university-efforts-at-climate-week-nyc\">Mission Possible</a>\nduring <a href=\"https://www.climateweeknyc.org\">NYC Climate Week</a>. I was there with <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> and we met with a lot of Cambridge alumni who\nare all engaged with climate change related activities -- either directly in their careers, or through a side interest.</p>\n<p>The major highlights of the discussions with alumni centred around agency: a lot of them were wondering how to take the evidence coming\nout of Cambridge research and combine it with real policy action. A number of the alumni are obviously highly successful in their individual\ncareers, and so the University helping to glue this together would potentially result in valuable actions that might not otherwise come together.</p>\n<p>This reminded me strongly of the discussions we had in Pembroke a little while back when <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> chaired my talk about "Who's in Charge?" for the <a href=\"https://www.pem.cam.ac.uk/college/corporate-partnership/corporate-partnership-events/william-pitt-seminars/17th-william-pitt\">William Pitt Seminar</a> where we had very similar discussions at dinner afterwards.</p>\n<p></p><div></div><p></p>\n<p><em>(See also <a href=\"https://www.zero.cam.ac.uk/who-we-are/blog/news/cambridge-zero-takes-centre-stage-climate-week-nyc\">Cambridge Zero</a> notes on the event, and thanks to <a href=\"https://www.cisl.cam.ac.uk/\">CISL</a>.)</em></p>",+"content": "<p>I was on stage in New York for <a href=\"https://www.cam.ac.uk/news/cambridge-zero-highlights-university-efforts-at-climate-week-nyc\">Mission Possible</a>\nduring <a href=\"https://www.climateweeknyc.org\">NYC Climate Week</a>. 
I was there with <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> and we met with a lot of Cambridge alumni who\nare all engaged with climate change related activities -- either directly in their careers, or through a side interest.</p>\n<p>The major highlights of the discussions with alumni centred around agency: a lot of them were wondering how to take the evidence coming\nout of Cambridge research and combine it with real policy action. A number of the alumni are obviously highly successful in their individual\ncareers, and so the University helping to glue this together would potentially result in valuable actions that might not otherwise come together.</p>\n<p>This reminded me strongly of the discussions we had in Pembroke a little while back when <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> chaired my talk about "Who's in Charge?" for the <a href=\"https://www.pem.cam.ac.uk/college/corporate-partnership/corporate-partnership-events/william-pitt-seminars/17th-william-pitt\">William Pitt Seminar</a> where we had very similar discussions at dinner afterwards.</p>\n<p></p><div></div><p></p>\n<p><em>(See also <a href=\"https://www.zero.cam.ac.uk/who-we-are/blog/news/cambridge-zero-takes-centre-stage-climate-week-nyc\">Cambridge Zero</a> notes on the event, and thanks to <a href=\"https://www.cisl.cam.ac.uk/\">CISL</a>.)</em></p>",
+18
avsm/notes_mit-spotcodes.json
···+"summary": "<p>We got more coverage of <a href=\"https://en.wikipedia.org/wiki/ShotCode\">SpotCodes</a> and our startup <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">High Energy Magic</a>, leading to lots of interest in the technology.</p>\n<blockquote>\n<p>Public touch-screen displays such as airport check-in kiosks aren\u2019t known for having versatile interfaces; they usually lack keyboards or pointing devices, limiting users to a few navigational buttons. But new software from High Energy Magic of Cambridge, England, turns a camera phone with a Bluetooth wireless connection into a portable mouse and keyboard that can take full command of public displays, doing away with the old touch screen. Working with Intel\u2019s Cambridge research lab, High Energy Magic has developed a set of circular symbols, similar in concept to bar codes, that can be displayed by public terminals. Camera phones loaded with the company\u2019s software can translate the symbols into data. Once a phone locks onto one of the symbols, it uses the Bluetooth short-range wireless protocol to send information about its size, position, and orientation to the computer running the display. The phone can then act as a mouse, manipulating on-screen controls such as scroll bars. 
The company plans to license the technology to businesses, such as travel agencies, that operate public kiosks.\n-- <a href=\"https://web.archive.org/web/20241202023917/https://cdn.technologyreview.com/s/403022/phone-it-in/\">MIT Technology Review</a></p>\n</blockquote>",+"content": "<p>We got more coverage of <a href=\"https://en.wikipedia.org/wiki/ShotCode\">SpotCodes</a> and our startup <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">High Energy Magic</a>, leading to lots of interest in the technology.</p>\n<blockquote>\n<p>Public touch-screen displays such as airport check-in kiosks aren\u2019t known for having versatile interfaces; they usually lack keyboards or pointing devices, limiting users to a few navigational buttons. But new software from High Energy Magic of Cambridge, England, turns a camera phone with a Bluetooth wireless connection into a portable mouse and keyboard that can take full command of public displays, doing away with the old touch screen. Working with Intel\u2019s Cambridge research lab, High Energy Magic has developed a set of circular symbols, similar in concept to bar codes, that can be displayed by public terminals. Camera phones loaded with the company\u2019s software can translate the symbols into data. Once a phone locks onto one of the symbols, it uses the Bluetooth short-range wireless protocol to send information about its size, position, and orientation to the computer running the display. The phone can then act as a mouse, manipulating on-screen controls such as scroll bars. The company plans to license the technology to businesses, such as travel agencies, that operate public kiosks.\n-- <a href=\"https://web.archive.org/web/20241202023917/https://cdn.technologyreview.com/s/403022/phone-it-in/\">MIT Technology Review</a></p>\n</blockquote>",
+18
avsm/notes_mitigating-nbs-risk-paper.json
···+"summary": "<p>Many of the questions around our recent <a href=\"https://anil.recoil.org/papers/2023-naturecredits\">Nature Sustainability commentary on NbS credits</a> revolve around\n<em>how</em> to finance new projects if credible credits need to be ex-post. Our\nlatest paper published in Carbon Management on <em>"<a href=\"https://anil.recoil.org/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a>"</em> tries to address this.</p>\n<p>The problem with selling ex-ante (future) carbon credits for (e.g.) a\ndeforestation avoidance scheme is that project reversals can happen in the\nfuture ("deforestation has increased") thus rendering any credits issued\npreviously useless. On the flip side though, an overly conservative view of the\nfuture ("the entire forest will disappear overnight!") is clearly so\nconservative that it doesn't serve the best interests of the project developer.\nSo ideally, a project would make realistic but conservative ex-ante predictions\nthat are safe for both the project developer (who gets more funds upfront) and\ncarbon credit purchasers (who need to account for the impermanence of nature\ncredits).</p>\n<p>Our paper shows how to do this by calculating a "release schedule" to predict\nfuture drawdowns, and then issuing extra credits when the release at some\nfuture date is less than predicted by the release schedule. We use verified\nex-post observations to construct these release schedules, and design them to\nbound the risk of the project becoming negative overall (that is, net drawdown\nis negative) and thus failing.</p>\n<p>The paper evaluates this process with both theoretical and real projects to\nassess how well it balances the tradeoff between generating permanent nature\ncredits and bounding the risk of project failure in the future. 
As a nice side\neffect, our method removes the need for buffer pools entirely, whose sizing is not\ncurrently based on an empirical assessment of reversal risks and which are\nusually cancelled at project end (wasting potential credits). Read the full\nopen access paper, led expertly by <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\">E.-Ping Rau</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a>,\nwhich just came out in Carbon Management, for details: <a href=\"https://anil.recoil.org/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a></p>\n<p>There's still plenty of future work to be done -- we focus on avoided\ndeforestation projects in this paper, but afforestation projects could also be\nmodelled on similar principles. Do get in touch if you'd like to help assess\nour methods!</p>",+"content": "<p>Many of the questions around our recent <a href=\"https://anil.recoil.org/papers/2023-naturecredits\">Nature Sustainability commentary on NbS credits</a> revolve around\n<em>how</em> to finance new projects if credible credits need to be ex-post. Our\nlatest paper published in Carbon Management on <em>"<a href=\"https://anil.recoil.org/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a>"</em> tries to address this.</p>\n<p>The problem with selling ex-ante (future) carbon credits for (e.g.) a\ndeforestation avoidance scheme is that project reversals can happen in the\nfuture ("deforestation has increased") thus rendering any credits issued\npreviously useless. 
On the flip side though, an overly conservative view of the\nfuture ("the entire forest will disappear overnight!") is clearly so\nconservative that it doesn't serve the best interests of the project developer.\nSo ideally, a project would make realistic but conservative ex-ante predictions\nthat are safe for both the project developer (who gets more funds upfront) and\ncarbon credit purchasers (who need to account for the impermanence of nature\ncredits).</p>\n<p>Our paper shows how to do this by calculating a "release schedule" to predict\nfuture drawdowns, and then issuing extra credits when the release at some\nfuture date is less than predicted by the release schedule. We use verified\nex-post observations to construct these release schedules, and design them to\nbound the risk of the project becoming negative overall (that is, net drawdown\nis negative) and thus failing.</p>\n<p>The paper evaluates this process with both theoretical and real projects to\nassess how well it balances the tradeoff between generating permanent nature\ncredits and bounding the risk of project failure in the future. As a nice side\neffect, our method removes the need for buffer pools entirely, whose sizing is not\ncurrently based on an empirical assessment of reversal risks and which are\nusually cancelled at project end (wasting potential credits). 
Read the full\nopen access paper, led expertly by <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\">E.-Ping Rau</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://coomeslab.org\">David Coomes</a>,\nwhich just came out in Carbon Management, for details: <a href=\"https://anil.recoil.org/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a></p>\n<p>There's still plenty of future work to be done -- we focus on avoided\ndeforestation projects in this paper, but afforestation projects could also be\nmodelled on similar principles. Do get in touch if you'd like to help assess\nour methods!</p>",
+18
avsm/notes_multicore-monthly-apr20.json
···+"summary": "<p>In the April OCaml multicore monthly, we have a preprint available of our ICFP submission about the OCaml 5 multicore runtime.\n<em>(Update: This paper actually won the ICFP best paper award later in the year! Read it at "<a href=\"https://anil.recoil.org/papers/2020-icfp-retropar\">Retrofitting parallelism onto OCaml</a>").</em></p>",+"content": "<p>In the April OCaml multicore monthly, we have a preprint available of our ICFP submission about the OCaml 5 multicore runtime.\n<em>(Update: This paper actually won the ICFP best paper award later in the year! Read it at "<a href=\"https://anil.recoil.org/papers/2020-icfp-retropar\">Retrofitting parallelism onto OCaml</a>").</em></p>",
+18
avsm/notes_multicore-monthly-dec21.json
···+"summary": "<p>We've been working hard on OCaml multicore support, and went over to Paris to sit down with some core developers from Inria and work through code review of our proposed patches.</p>",+"content": "<p>We've been working hard on OCaml multicore support, and went over to Paris to sit down with some core developers from Inria and work through code review of our proposed patches.</p>",
+18
avsm/notes_multicore-monthly-jan20.json
···+"summary": "<p>We started the process of upstreaming our <a href=\"https://anil.recoil.org/papers/2014-oud-multicore\">multicore OCaml</a> branch to mainline OCaml, and so I started posting regular updates to the community forum.</p>\n<blockquote>\n<p>The most common question we get is how to contribute to the overall multicore effort. As I noted last year, we are now in the process of steadily upstreaming our efforts to mainline OCaml. Therefore, the best way by far to contribute is to test for regressions or opportunities for improvements in the patches that are outstanding in the main OCaml repository.\n-- <a href=\"https://discuss.ocaml.org/t/multicore-ocaml-january-2020-update/5090\">me, on the discussion forum</a></p>\n</blockquote>",+"content": "<p>We started the process of upstreaming our <a href=\"https://anil.recoil.org/papers/2014-oud-multicore\">multicore OCaml</a> branch to mainline OCaml, and so I started posting regular updates to the community forum.</p>\n<blockquote>\n<p>The most common question we get is how to contribute to the overall multicore effort. As I noted last year, we are now in the process of steadily upstreaming our efforts to mainline OCaml. Therefore, the best way by far to contribute is to test for regressions or opportunities for improvements in the patches that are outstanding in the main OCaml repository.\n-- <a href=\"https://discuss.ocaml.org/t/multicore-ocaml-january-2020-update/5090\">me, on the discussion forum</a></p>\n</blockquote>",
+18
avsm/notes_multicore-monthly-jan22.json
···+"summary": "<p>After we got the massive OCaml 5.0 pull request merged, we've taken some time to consolidate the trunk branch of OCaml and start down the release path towards getting OCaml 5.0 out of the door.</p>",+"content": "<p>After we got the massive OCaml 5.0 pull request merged, we've taken some time to consolidate the trunk branch of OCaml and start down the release path towards getting OCaml 5.0 out of the door.</p>",
+18
avsm/notes_multicore-monthly-mar22.json
···+"summary": "<p>We're getting closer to a stable release of OCaml 5.0, including reenabling support for the BSDs and introducing ARM64 multicore support.</p>",+"content": "<p>We're getting closer to a stable release of OCaml 5.0, including reenabling support for the BSDs and introducing ARM64 multicore support.</p>",
+18
avsm/notes_multicore-monthly-sep20.json
···+"summary": "<p>The big advance in the multicore OCaml branch is that we restored compatibility\nwith the traditional OCaml systhreads. This in turn means that many existing\nsoftware packages just work out of the box on the new runtime.</p>\n<blockquote>\n<p>Big news this month is that the systhreads compatibility support PR has been\nmerged, which means that Dune (and other users of the Thread module) can\ncompile out of the box. You can now compile the multicore OCaml fork\nconveniently using the new opam compiler plugin (see announcement).\n-- <a href=\"https://discuss.ocaml.org/t/multicore-ocaml-september-2020/6565\">me, on the discussion forum</a></p>\n</blockquote>",+"content": "<p>The big advance in the multicore OCaml branch is that we restored compatibility\nwith the traditional OCaml systhreads. This in turn means that many existing\nsoftware packages just work out of the box on the new runtime.</p>\n<blockquote>\n<p>Big news this month is that the systhreads compatibility support PR has been\nmerged, which means that Dune (and other users of the Thread module) can\ncompile out of the box. You can now compile the multicore OCaml fork\nconveniently using the new opam compiler plugin (see announcement).\n-- <a href=\"https://discuss.ocaml.org/t/multicore-ocaml-september-2020/6565\">me, on the discussion forum</a></p>\n</blockquote>",
+18
avsm/notes_multicore-monthly-sep21.json
···+"summary": "<p>We're making steady progress on getting multicore support merged into OCaml, including some great developer meetings where we achieved consensus with the core team to include support for effect handlers in the 5.0 release.</p>",+"content": "<p>We're making steady progress on getting multicore support merged into OCaml, including some great developer meetings where we achieved consensus with the core team to include support for effect handlers in the 5.0 release.</p>",
+18
avsm/notes_nas-rs-biodiversity.json
···+"summary": "<p>I spent a couple of days at the <a href=\"https://www.nationalacademies.org/home\">National Academy of Sciences</a> in the USA at the invitation of the <a href=\"https://royalsociety.org\">Royal Society</a>, who held a forum on "<a href=\"https://anil.recoil.org/\">Measuring Biodiversity for Addressing the Global Crisis</a>". It was a <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">packed program</a> for those working in evidence-driven conservation:</p>\n<blockquote>\n<p>Assessing biodiversity is fundamental to understanding the distribution of biodiversity, the changes that are occurring and, crucially, the effectiveness of actions to address the ongoing biodiversity crisis. Such assessments face multiple challenges, not least the great complexity of natural systems, but also a lack of standardized approaches to measurement, a plethora of measurement technologies with their own strengths and weaknesses, and different data needs depending on the purpose\nfor which the information is being gathered.</p>\n<p>Other sectors have faced similar challenges, and the forum will look to learn from these precedents with a view to building momentum toward standardized methods for using environmental monitoring technologies, including new technologies, for particular purposes.\n-- NAS/Royal Society <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">US-UK Scientific Forum on Measuring Biodiversity</a></p>\n</blockquote>\n<p>I was honoured to talk about our work on using AI to "connect the dots" between disparate data like the academic literature and remote observations at scale. 
But before that, here's some of the bigger picture stuff I learnt...</p>\n<p><a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\"> \n<img alt=\"Identifying the bird is an exercise for the reader!\" src=\"https://anil.recoil.org/images/nas-rs-cover.webp\" title=\"Identifying the bird is an exercise for the reader!\">\nIdentifying the bird is an exercise for the reader! </a></p>\n<h2><a href=\"https://anil.recoil.org/#shifting-conservation-to-a-winning-stance\"></a>Shifting conservation to a winning stance</h2>\n<p>The need for urgent, additional action came across loud and clear from all the top actors in biodiversity. On the bright side, we have made stellar progress in measuring more dimensions of biodiversity accurately than ever before in human history. But the field of biodiversity does not have a single "simple question" that needs answering, unlike many other science challenges in physics or chemistry. The ecosystem of nature measurements needs to span scales ranging from the micro (fungi and soil health) to the macro (species richness and diversity), with geographical coverage across the planet but also hyperlocal accuracy for ecosystem services.</p>\n<p>One key question asked at the forum was how we can get to interoperable, pragmatic tools that enable all the actors involved in conservation actions (from the governments that set policy, to the private sector that controls the supply chains, to the people who have to live in and depend on natural services) to work together more effectively on gathering all the data needed.</p>\n<p>This interoperability has to emerge during a rapid shift towards digital methods, which are vulnerable to being <a href=\"https://www.bbc.com/future/article/20250422-usa-scientists-race-to-save-climate-data-before-its-deleted-by-the-trump-administration\">deleted and edited at scale</a>, putting decades of painstaking observations at risk at the moment. 
And in the middle of all this, machine learning is swooping in to perform data interpolation at scale, but also risks <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">dividing</a> and polluting observations with inaccurate projections.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nas-rs-2.webp\" title=\"\">\n</p>\n<h2><a href=\"https://anil.recoil.org/#what-is-an-optimistic-future-for-conservation\"></a>What is an optimistic future for conservation?</h2>\n<p>This is all quite the challenge even for a gung-ho computer scientist like me, and I was struggling with the enormity of it all! But things really clicked into place after the inspirational <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. Jones</a> pointed me at a <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">fantastic big-picture paper</a>:</p>\n<blockquote>\n<p>Drawing reasonable inferences from current patterns, we can predict that 100 years from now, the Earth could be inhabited by between 6-8 billion people, with very few remaining in extreme poverty, most living in towns and cities, and nearly all participating in a technologically driven, interconnected market economy.</p>\n<p>[...] we articulate a theory of social\u2013environmental change that describes the simultaneous and interacting effects of urban lifestyles on fertility, poverty alleviation, and ideation.</p>\n<p><a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation</a></p>\n</blockquote>\n<p>They observe that the field of conservation has often "succumbed to jeremiad, bickering, and despair". 
Much of this angst springs from the (failed) bets made by <a href=\"https://en.wikipedia.org/wiki/Paul_R._Ehrlich\">Paul Ehrlich</a>, who thinks <a href=\"https://www.nature.com/articles/d41586-024-03592-y\">humans are going to be wiped out</a> because of unbounded expansion. In response, conservation has become "the art of slowing declines" rather than achieving long-term wins. But instead of being moribund, the paper paints an optimistic, practical endgame for conservation:</p>\n<blockquote>\n<p>We suggest that lasting conservation success can best be realized when:</p>\n<ul>\n<li>the human population stabilizes and begins to decrease</li>\n<li>extreme poverty is alleviated</li>\n<li>the majority of the world's people and institutions act on a shared belief that it is in their best interest to care for rather than destroy the natural bases of life on Earth.</li>\n</ul>\n</blockquote>\n<p>It turns out that most of these conditions can be reasonably projected to happen in the next fifty years or so. 
Population is projected to <a href=\"https://en.wikipedia.org/wiki/Human_population_projections\">peak by the turn of the century</a>, <a href=\"https://openknowledge.worldbank.org/entities/publication/9d0fb27a-3afe-5999-8d8e-baf90b4331c0/full\">extreme poverty might reasonably be eradicated by 2050</a>, and <a href=\"https://iopscience.iop.org/article/10.1088/1748-9326/8/1/014025\">urban landuse will stabilise at 6% of terrestrial land</a> by 2030-ish.</p>\n<p><a href=\"https://academic.oup.com/view-large/figure/118140827/biy039fig4.jpeg\"> \n<img alt=\"Connecting demographic and economic trends in the 21st century to the environment\" src=\"https://anil.recoil.org/images/nas-rs-6.webp\" title=\"Connecting demographic and economic trends in the 21st century to the environment\">\nConnecting demographic and economic trends in the 21st century to the environment </a></p>\n<p>Given this projection, the paper then points out that conservation doesn't need to save nature "forever". Instead, we have to save enough nature now to "breakthrough" from the <a href=\"https://en.wikipedia.org/wiki/Great_Acceleration\">great acceleration</a> of WWII until we stabilise landuse.</p>\n<blockquote>\n<p>The profound danger is that by the time the foundations of recovery are in place, little of wildlife and wild places will be left. If society focuses only on economic development and technological innovation as a mechanism to pass through the bottleneck as fast as possible, then what remains of nature could well be sacrificed.\nIf society were to focus only on limiting economic growth to protect nature, then terrible poverty and population growth could overwhelm what remains.</p>\n<p>Either extreme risks narrowing the bottleneck to such an extent that our world passes through without its tigers, elephants, rainforests, coral reefs, or a life-sustaining climate. 
Therefore, the only sensible path for conservation is to continue its efforts to protect biodiversity while engaging in cities to build the foundations for a lasting recovery of nature.\n-- <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough</a></p>\n</blockquote>\n<p>This puts what we need to achieve today in a far, far more pragmatic light:</p>\n<blockquote>\n<p>[...] it means that conservation faces another 30\u201350 years of extreme difficulty, when more losses can be expected. However, if we can sustain enough nature through the bottleneck\u2014despite climate change, growth in the population and economy, and urban expansion\u2014then we can see the future of nature in a dramatically more positive light.</p>\n</blockquote>\n<p>Conservation is all about solving difficult opportunity-cost decisions in society.\nScience can help calculate <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">credible counterfactuals</a> that allow policymakers to balance\nlimited resources to minimise nature harm while maximising benefit to humans. We can also develop new <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">economic methods</a> to estimate the value of future actions. When combined, this can help conservation break through the bottleneck of the next fifty years of nature loss... and computer science can make a serious <a href=\"https://fivetimesfaster.org/\">accelerative</a> impact here (yay!).</p>\n<p>\n<img alt=\"What does one call a group of ecology legends? A committee!\" src=\"https://anil.recoil.org/images/nas-rs-5.webp\" title=\"What does one call a group of ecology legends? A committee!\">\nWhat does one call a group of ecology legends? 
A committee!</p>\n<h2><a href=\"https://anil.recoil.org/#topics-relevant-to-our-planetary-computing-research\"></a>Topics relevant to our planetary computing research</h2>\n<p>Having got my existential big-picture crisis under control, here are some more concrete thoughts about some of the joint ideas that emerged from the NAS meeting.</p>\n<h3><a href=\"https://anil.recoil.org/#resilience-in-biodiversity-data\"></a>Resilience in biodiversity data</h3>\n<p>We've been doing a <a href=\"https://digitalflapjack.com/blog/yirgacheffe/\">lot</a> of <a href=\"https://digitalflapjack.com/weeknotes/2025-04-22/\">work</a> on mechanisms to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">process and ingest</a> remote sensing data. All of our techniques also apply to biodiversity, except that the pipelines are even more complex due to the multi-modal nature of the data being stored. This can be clearly seen in this <a href=\"https://www.science.org/doi/10.1126/science.adq2110\">review on the decline of insect biodiversity</a> that speaker Nick Isaac and my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> published last month.</p>\n<p><a href=\"https://www.science.org/doi/10.1126/science.adq2110\"> \n<img alt=\"(source: Science, 10.1126/science.adq2110)\" src=\"https://anil.recoil.org/images/nas-rs-1.webp\" title=\"(source: Science, 10.1126/science.adq2110)\">\n(source: Science, 10.1126/science.adq2110) </a></p>\n<p>The data itself isn't just from one source; instead, we need a pipeline of spatial measurements (at different resolutions), of different types (visual, acoustic, occurrence), of different provenance (experts, crowdsourced, museum), and from different hypothesis tests (evidence bases).</p>\n<p>Once the ingestion pipeline is in place, there's a full range of validation, combination, and extrapolation involved, these days often using AI methods. 
The output from all of this is then tested to determine which <a href=\"https://anil.recoil.org/projects/ce\">conservation actions</a> to take.</p>\n<p>\n<img alt=\"Nick Isaac explains how different lines of biodiversity evidence are necessary\" src=\"https://anil.recoil.org/images/nas-rs-3.webp\" title=\"Nick Isaac explains how different lines of biodiversity evidence are necessary\">\nNick Isaac explains how different lines of biodiversity evidence are necessary</p>\n<p><a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> also talked about the ambitious <a href=\"https://www.nature.com/articles/s41559-023-02171-0\">global biodiversity observing system</a> that he's been assembling a coalition for in recent years. They are using Docker as part of this via their <a href=\"https://boninabox.geobon.org/\">Bon in a Box</a> product but hitting scaling issues (a common problem due to the size of geospatial tiles).</p>\n<p><a href=\"https://www.nature.com/articles/s41559-023-02171-0\"> \n<img alt=\"Andrew Gonzalez explains the GBioS concept\" src=\"https://anil.recoil.org/images/nas-rs-7.webp\" title=\"Andrew Gonzalez explains the GBioS concept\">\nAndrew Gonzalez explains the GBioS concept </a></p>\n<p>There's a good tie-in for collaboration with us here via the next-generation <a href=\"https://patrick.sirref.org/weekly-2025-05-12/index.xml\">time-travelling shell</a> that <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> is developing, which can handle this via <a href=\"https://www.tunbury.org/zfs-system-concept/\">ZFS snapshots</a>. <a href=\"https://mynameismwd.org\">Michael Dales</a> has been applying this to scaling the <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> pipelines recently with <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>. 
And meanwhile <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a> have been researching <a href=\"https://anil.recoil.org/papers/2024-terracorder\">embedded biodiversity sensors</a>. The overall theme is that we need to make the hardware and software stack involved far easier to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">use for non-expert programmers</a>.</p>\n<p>\n<img alt=\"A key part of the GBioS vision is to have a federated system\" src=\"https://anil.recoil.org/images/nas-rs-8.webp\" title=\"A key part of the GBioS vision is to have a federated system\">\nA key part of the GBioS vision is to have a federated system</p>\n<h3><a href=\"https://anil.recoil.org/#observing-the-earth-through-geospatial-foundation-models\"></a>Observing the earth through geospatial foundation models</h3>\n<p>Another problem that several speakers discussed was how complex biodiversity observations are to manage since they span multiple scales. In my talk, I described the new <a href=\"https://github.com/FrankFeng-23/btfm_project\">TESSERA</a> geospatial foundation model that <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> have been leading in Cambridge. As this is a pre-trained foundation model, it needs to be finetuned to specific downstream tasks. 
A number of people came up after my talk with suggestions for collaborations here!</p>\n<p>Firstly, <a href=\"https://earthshotprize.org/winners-finalists/naturemetrics/\">Kat Bruce</a> (fresh from <a href=\"https://www.bbc.com/news/articles/cre8xxd7xl8o\">spraying pondwater</a> with Prince William) explained how <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a> are gathering <a href=\"https://en.wikipedia.org/wiki/Environmental_DNA\">eDNA</a> from many diverse sources. The data comes under varying licenses depending on which customer paid for the acquisition, but overall there is a lot of information about species presence that's orthogonal to the kind of data gathered from satellite observations.</p>\n<p>\n<img alt=\"Kat Bruce showing how much information is packed into eDNA measurements\" src=\"https://anil.recoil.org/images/nas-rs-4.webp\" title=\"Kat Bruce showing how much information is packed into eDNA measurements\">\nKat Bruce showing how much information is packed into eDNA measurements</p>\n<p>Secondly, <a href=\"https://darulab.org/\">Barnabas Daru</a> from Stanford described his efforts to map plant traits to species distribution models. This complements some work <a href=\"https://coomeslab.org\">David Coomes</a> has been leading recently in our group with <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a> and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> on mapping rare plants globally. 
The basic problem here is that plant occurrence data is <em>extremely</em> sparse and spatially biased for 100k+ species, and so we'll need cunning interpolation techniques to fill in the data gaps.</p>\n<p>\n<img alt=\"Barnabas Daru shows his maps on gathering plant samples from all over the world\" src=\"https://anil.recoil.org/images/nas-rs-12.webp\" title=\"Barnabas Daru shows his maps on gathering plant samples from all over the world\">\nBarnabas Daru shows his maps on gathering plant samples from all over the world</p>\n<p>When back in Cambridge, I'm going to arrange for all of us to chat to see if we can somehow combine eDNA, fungal biodiversity, plant traits and satellite foundation models into a comprehensive global plant species map!</p>\n<h3><a href=\"https://anil.recoil.org/#evidence-synthesis-from-the-literature\"></a>Evidence synthesis from the literature</h3>\n<p>There was also huge enthusiasm for another of our projects on <a href=\"https://anil.recoil.org/projects/ce\">analysing the academic literature</a> at scale. While we've been using it initially to improve the efficacy and accuracy of <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">systematic reviews</a> for <a href=\"https://conservationevidence.com\">Conservation Evidence</a>, there are a huge number of follow-up benefits of having a comprehensive data corpus.</p>\n<p>Firstly, <a href=\"http://elphick.lab.uconn.edu/\">Chris Elphick</a> pointed out a metasynthesis where they manually integrate recent <a href=\"https://academic.oup.com/bioscience/advance-article-abstract/doi/10.1093/biosci/biaf034/8115312\">hypotheses about insect stressors and responses</a> into a network (3385 edges / 108 nodes). It found that the network is highly interconnected, with agricultural intensification often identified as a root cause for insect decline. 
Much like the CE manually labeled dataset, it should be possible to do hypothesis searches in our LLM pipeline to expand this search and make it more dynamic.</p>\n<p>Secondly, <a href=\"http://oisin.info\">Oisin Mac Aodha</a>, fresh from a <a href=\"https://watch.eeg.cl.cam.ac.uk/w/7aqBd2Nn9E6QpMvnoBPxuQ\">recent talk</a> in Cambridge, discussed his <a href=\"https://arxiv.org/abs/2502.14977\">recent work</a> on few-shot species range estimation and also <a href=\"https://arxiv.org/abs/2412.14428\">WildSAT text/image encoding</a>. His example showed how you could not only spot a species from images, but also use text prompts to refine the search. An obvious extension for us to have a go at here is to combine our large corpus of academic papers with these models to see how good the search/range estimation could get with a much larger corpus of data.</p>\n<p>\n<img alt=\"I am proud to have pronounced Oisin&apos;s name correctly while introducing his recent CCI seminar\" src=\"https://anil.recoil.org/images/nas-rs-13.webp\" title=\"I am proud to have pronounced Oisin&apos;s name correctly while introducing his recent CCI seminar\">\nI am proud to have pronounced Oisin's name correctly while introducing his recent CCI seminar</p>\n<p>And thirdly, I finally met my coauthor <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\">David Williams</a> in the flesh for the first time! We've worked together recently on the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity impact of food</a>, and we had a long discussion over dinner about whether we could glean more behavioural data about how people react from the wider literature. 
This would require us to expand our literature corpus into <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">grey literature</a> and policy documents, but this is something that <a href=\"https://toao.com\">Sadiq Jaffer</a> and I want to do soon anyway.</p>\n<p>The connective tissue across these seemingly disparate projects is that there is a strong connection between what you can observe from space (the canopies of trees) and the traits expressed via knowledge of plant physiology and their DNA. If we could figure out how to connect the dots from the observed species to the physiological traits to the bioclimatic range variables, we could figure out where the (many) data-deficient plant species in the world are! I'll be hosting a meeting in Cambridge soon on this since we're already <a href=\"https://anil.recoil.org/notes/ukri-grant-terra\">working on it</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#visualisations-in-biodiversity\"></a>Visualisations in biodiversity</h3>\n<p>The most unexpectedly cool talk was <a href=\"https://www.weizmann.ac.il/plants/Milo/home\">Ron Milo</a> showing us visualisations of the <a href=\"https://www.pnas.org/doi/10.1073/pnas.1711842115\">mass distribution of all life on earth</a>. His work really puts our overall challenge into context, as it shows just how utterly dominated wildlife is by domesticated animals.</p>\n<p>\n<img alt=\"The dominant mammal biomass on the planet are domesticated animals\" src=\"https://anil.recoil.org/images/nas-rs-11.webp\" title=\"The dominant mammal biomass on the planet are domesticated animals\">\nThe dominant mammal biomass on the planet are domesticated animals</p>\n<p>It struck me just how important these sorts of high-level visualisations are in putting detailed numbers into context. 
For example, his breakdown of global biomass showed that plants are by far the "heaviest" living things on earth, and that ocean organisms still dominate animal biomass.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nas-rs-9.webp\" title=\"\">\n</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nas-rs-10.webp\" title=\"\">\n</p>\n<p>My favourite new animation library on the block is <a href=\"https://animejs.com/\">AnimeJS</a>, and I plan to try out some nice animations for <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> along these lines once the academic term finishes.</p>\n<p>And that's a wrap on my notes for now! I'm still hanging out in the US for a bunch more meetings (including one at <a href=\"https://www.nationalgeographic.com/\">National Geographic HQ</a>), so I'll update this note when the official RS/NAS videos and writeup come out.</p>\n<p><em>(Update 5th June: the <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">full talk video series</a> is now online at the National Academy of Sciences channel. Enjoy!)</em></p>",+"content": "<p>I spent a couple of days at the <a href=\"https://www.nationalacademies.org/home\">National Academy of Sciences</a> in the USA at the invitation of the <a href=\"https://royalsociety.org\">Royal Society</a>, who held a forum on "<a href=\"https://anil.recoil.org/\">Measuring Biodiversity for Addressing the Global Crisis</a>". It was a <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">packed program</a> for those working in evidence-driven conservation:</p>\n<blockquote>\n<p>Assessing biodiversity is fundamental to understanding the distribution of biodiversity, the changes that are occurring and, crucially, the effectiveness of actions to address the ongoing biodiversity crisis. 
Such assessments face multiple challenges, not least the great complexity of natural systems, but also a lack of standardized approaches to measurement, a plethora of measurement technologies with their own strengths and weaknesses, and different data needs depending on the purpose\nfor which the information is being gathered.</p>\n<p>Other sectors have faced similar challenges, and the forum will look to learn from these precedents with a view to building momentum toward standardized methods for using environmental monitoring technologies, including new technologies, for particular purposes.\n-- NAS/Royal Society <a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\">US-UK Scientific Forum on Measuring Biodiversity</a></p>\n</blockquote>\n<p>I was honoured to talk about our work on using AI to "connect the dots" between disparate data like the academic literature and remote observations at scale. But before that, here's some of the bigger picture stuff I learnt...</p>\n<p><a href=\"https://www.nasonline.org/wp-content/uploads/2024/10/US-UK-Forum-2025-program-web.pdf\"> \n<img alt=\"Identifying the bird is an exercise for the reader!\" src=\"https://anil.recoil.org/images/nas-rs-cover.webp\" title=\"Identifying the bird is an exercise for the reader!\">\nIdentifying the bird is an exercise for the reader! </a></p>\n<h2><a href=\"https://anil.recoil.org/#shifting-conservation-to-a-winning-stance\"></a>Shifting conservation to a winning stance</h2>\n<p>The need for urgent, additional action came across loud and clear from all the top actors in biodiversity. On the bright side, we have made stellar progress in measuring more dimensions of biodiversity accurately than ever before in human history. But, the field of biodiversity does not have a single "simple question" that needs answering, unlike many other science challenges in physics or chemistry. 
The ecosystem of nature measurements needs to span scales ranging from the micro (fungi and soil health) to the macro (species richness and diversity), with geographical coverage across the planet but also hyperlocal accuracy for ecosystem services.</p>\n<p>One key question asked at the forum was how we can get to interoperable, pragmatic tools that enable all the actors involved in conservation actions (from the governments that set policy, to the private sector that controls the supply chains, to the people who have to live in and depend on natural services) to work together more effectively on gathering all the data needed.</p>\n<p>This interoperability has to emerge during a rapid shift towards digital methods, which are vulnerable to being <a href=\"https://www.bbc.com/future/article/20250422-usa-scientists-race-to-save-climate-data-before-its-deleted-by-the-trump-administration\">deleted and edited at scale</a>, with decades of painstaking observations at risk at the moment. And in the middle of all this, machine learning is swooping in to perform data interpolation at scale, but also risks <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">dividing</a> and polluting observations with inaccurate projections.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nas-rs-2.webp\" title=\"\">\n</p>\n<h2><a href=\"https://anil.recoil.org/#what-is-an-optimistic-future-for-conservation\"></a>What is an optimistic future for conservation?</h2>\n<p>This is all quite the challenge even for a gung-ho computer scientist like me, and I was struggling with the enormity of it all! But things really clicked into place after the inspirational <a href=\"https://www.bangor.ac.uk/staff/sens/julia-patricia-gordon-jones-010356/en\">Julia P.G. 
Jones</a> pointed me at a <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">fantastic big-picture paper</a>:</p>\n<blockquote>\n<p>Drawing reasonable inferences from current patterns, we can predict that 100 years from now, the Earth could be inhabited by between 6-8 billion people, with very few remaining in extreme poverty, most living in towns and cities, and nearly all participating in a technologically driven, interconnected market economy.</p>\n<p>[...] we articulate a theory of social\u2013environmental change that describes the simultaneous and interacting effects of urban lifestyles on fertility, poverty alleviation, and ideation.</p>\n<p><a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough: Urbanization and the Future of Biodiversity Conservation</a></p>\n</blockquote>\n<p>They observe that the field of conservation has often "succumbed to jeremiad, bickering, and despair". Much of this angst springs from the (failed) bets made by <a href=\"https://en.wikipedia.org/wiki/Paul_R._Ehrlich\">Paul Ehrlich</a>, who thinks <a href=\"https://www.nature.com/articles/d41586-024-03592-y\">humans are going to be wiped out</a> because of unbounded expansion. In response, conservation has become "the art of slowing declines" rather than achieving long-term wins. But instead of being moribund, the paper paints an optimistic, practical endgame for conservation:</p>\n<blockquote>\n<p>We suggest that lasting conservation success can best be realized when:</p>\n<ul>\n<li>the human population stabilizes and begins to decrease</li>\n<li>extreme poverty is alleviated</li>\n<li>the majority of the world's people and institutions act on a shared belief that it is in their best interest to care for rather than destroy the natural bases of life on Earth.</li>\n</ul>\n</blockquote>\n<p>It turns out that most of these conditions can be reasonably projected to happen in the next fifty years or so. 
Population is projected to <a href=\"https://en.wikipedia.org/wiki/Human_population_projections\">peak by the turn of the century</a>, <a href=\"https://openknowledge.worldbank.org/entities/publication/9d0fb27a-3afe-5999-8d8e-baf90b4331c0/full\">extreme poverty might reasonably be eradicated by 2050</a>, and <a href=\"https://iopscience.iop.org/article/10.1088/1748-9326/8/1/014025\">urban landuse will stabilise at 6% of terrestrial land</a> by 2030-ish.</p>\n<p><a href=\"https://academic.oup.com/view-large/figure/118140827/biy039fig4.jpeg\"> \n<img alt=\"Connecting demographic and economic trends in the 21st century to the environment\" src=\"https://anil.recoil.org/images/nas-rs-6.webp\" title=\"Connecting demographic and economic trends in the 21st century to the environment\">\nConnecting demographic and economic trends in the 21st century to the environment </a></p>\n<p>Given this projection, the paper then points out that conservation doesn't need to save nature "forever". Instead, we have to save enough nature now to "breakthrough" from the <a href=\"https://en.wikipedia.org/wiki/Great_Acceleration\">great acceleration</a> of WWII until we stabilise landuse.</p>\n<blockquote>\n<p>The profound danger is that by the time the foundations of recovery are in place, little of wildlife and wild places will be left. If society focuses only on economic development and technological innovation as a mechanism to pass through the bottleneck as fast as possible, then what remains of nature could well be sacrificed.\nIf society were to focus only on limiting economic growth to protect nature, then terrible poverty and population growth could overwhelm what remains.</p>\n<p>Either extreme risks narrowing the bottleneck to such an extent that our world passes through without its tigers, elephants, rainforests, coral reefs, or a life-sustaining climate. 
Therefore, the only sensible path for conservation is to continue its efforts to protect biodiversity while engaging in cities to build the foundations for a lasting recovery of nature.\n-- <a href=\"https://academic.oup.com/bioscience/article/68/6/412/4976422\">From Bottleneck to Breakthrough</a></p>\n</blockquote>\n<p>This puts what we need to achieve today in a far, far more pragmatic light:</p>\n<blockquote>\n<p>[...] it means that conservation faces another 30\u201350 years of extreme difficulty, when more losses can be expected. However, if we can sustain enough nature through the bottleneck\u2014despite climate change, growth in the population and economy, and urban expansion\u2014then we can see the future of nature in a dramatically more positive light.</p>\n</blockquote>\n<p>Conservation is all about solving difficult opportunity-cost decisions in society.\nScience can help calculate <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">credible counterfactuals</a> that allow policymakers to balance\nlimited resources to minimise nature harm while maximising benefit to humans. We can also develop new <a href=\"https://anil.recoil.org/papers/2023-ncc-permanence\">economic methods</a> to estimate the value of future actions. When combined, this can help conservation break through the bottleneck of the next fifty years of nature loss... and computer science can make a serious <a href=\"https://fivetimesfaster.org/\">accelerative</a> impact here (yay!).</p>\n<p>\n<img alt=\"What does one call a group of ecology legends? A committee!\" src=\"https://anil.recoil.org/images/nas-rs-5.webp\" title=\"What does one call a group of ecology legends? A committee!\">\nWhat does one call a group of ecology legends? 
A committee!</p>\n<h2><a href=\"https://anil.recoil.org/#topics-relevant-to-our-planetary-computing-research\"></a>Topics relevant to our planetary computing research</h2>\n<p>Having got my existential big-picture crisis under control, here are some more concrete thoughts about some of the joint ideas that emerged from the NAS meeting.</p>\n<h3><a href=\"https://anil.recoil.org/#resilience-in-biodiversity-data\"></a>Resilience in biodiversity data</h3>\n<p>We've been doing a <a href=\"https://digitalflapjack.com/blog/yirgacheffe/\">lot</a> of <a href=\"https://digitalflapjack.com/weeknotes/2025-04-22/\">work</a> on mechanisms to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">process and ingest</a> remote sensing data. All of our techniques also apply to biodiversity, except that the pipelines are even more complex due to the multi-modal nature of the data being stored. This can be clearly seen in this <a href=\"https://www.science.org/doi/10.1126/science.adq2110\">review on the decline of insect biodiversity</a> that speaker Nick Isaac and my colleague <a href=\"https://www.zoo.cam.ac.uk/directory/prof-lynn-dicks\">Lynn Dicks</a> published last month.</p>\n<p><a href=\"https://www.science.org/doi/10.1126/science.adq2110\"> \n<img alt=\"(source: Science, 10.1126/science.adq2110)\" src=\"https://anil.recoil.org/images/nas-rs-1.webp\" title=\"(source: Science, 10.1126/science.adq2110)\">\n(source: Science, 10.1126/science.adq2110) </a></p>\n<p>The data itself isn't just from one source; instead, we need a pipeline of spatial measurements (at different resolutions), of different types (visual, acoustic, occurrence), of different provenance (experts, crowdsourced, museum), and from different hypothesis tests (evidence bases).</p>\n<p>Once the ingestion pipeline is in place, there's a full range of validation, combination, and extrapolation involved, these days often using AI methods. 
The output from all of this is then tested to determine which <a href=\"https://anil.recoil.org/projects/ce\">conservation actions</a> to take.</p>\n<p>\n<img alt=\"Nick Isaac explains how different lines of biodiversity evidence are necessary\" src=\"https://anil.recoil.org/images/nas-rs-3.webp\" title=\"Nick Isaac explains how different lines of biodiversity evidence are necessary\">\nNick Isaac explains how different lines of biodiversity evidence are necessary</p>\n<p><a href=\"https://www.thegonzalezlab.org/\">Andrew Gonzalez</a> also talked about the ambitious <a href=\"https://www.nature.com/articles/s41559-023-02171-0\">global biodiversity observing system</a> that he's been assembling a coalition for in recent years. They are using Docker as part of this via their <a href=\"https://boninabox.geobon.org/\">Bon in a Box</a> product but hitting scaling issues (a common problem due to the size of geospatial tiles).</p>\n<p><a href=\"https://www.nature.com/articles/s41559-023-02171-0\"> \n<img alt=\"Andrew Gonzalez explains the GBioS concept\" src=\"https://anil.recoil.org/images/nas-rs-7.webp\" title=\"Andrew Gonzalez explains the GBioS concept\">\nAndrew Gonzalez explains the GBioS concept </a></p>\n<p>There's a good tie-in for collaboration with us here via the next-generation <a href=\"https://patrick.sirref.org/weekly-2025-05-12/index.xml\">time-travelling shell</a> that <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> is developing, which can handle this via <a href=\"https://www.tunbury.org/zfs-system-concept/\">ZFS snapshots</a>. <a href=\"https://mynameismwd.org\">Michael Dales</a> has been applying this to scaling the <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> pipelines recently with <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a> and <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a>. 
And meanwhile <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> and <a href=\"https://haddadi.github.io/\">Hamed Haddadi</a> have been researching <a href=\"https://anil.recoil.org/papers/2024-terracorder\">embedded biodiversity sensors</a>. The overall theme is that we need to make the hardware and software stack involved far easier to <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">use for non-expert programmers</a>.</p>\n<p>\n<img alt=\"A key part of the GBioS vision is to have a federated system\" src=\"https://anil.recoil.org/images/nas-rs-8.webp\" title=\"A key part of the GBioS vision is to have a federated system\">\nA key part of the GBioS vision is to have a federated system</p>\n<h3><a href=\"https://anil.recoil.org/#observing-the-earth-through-geospatial-foundation-models\"></a>Observing the earth through geospatial foundation models</h3>\n<p>Another problem that several speakers discussed was how complex biodiversity observations are to manage since they span multiple scales. In my talk, I described the new <a href=\"https://github.com/FrankFeng-23/btfm_project\">TESSERA</a> geospatial foundation model that <a href=\"https://www.cst.cam.ac.uk/people/zf281\">Frank Feng</a>, <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> have been leading in Cambridge. As this is a pre-trained foundation model, it needs to be finetuned to specific downstream tasks. 
A number of people came up after my talk with suggestions for collaborations here!</p>\n<p>Firstly, <a href=\"https://earthshotprize.org/winners-finalists/naturemetrics/\">Kat Bruce</a> (fresh from <a href=\"https://www.bbc.com/news/articles/cre8xxd7xl8o\">spraying pondwater</a> with Prince William) explained how <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a> are gathering <a href=\"https://en.wikipedia.org/wiki/Environmental_DNA\">eDNA</a> from many diverse sources. The data is of varying licenses depending on which customer paid for the acquisition, but overall there is a lot of information about species presence that's very orthogonal to the kind of data gathered from satellite observations.</p>\n<p>\n<img alt=\"Kat Bruce showing how much information is packed into eDNA measurements\" src=\"https://anil.recoil.org/images/nas-rs-4.webp\" title=\"Kat Bruce showing how much information is packed into eDNA measurements\">\nKat Bruce showing how much information is packed into eDNA measurements</p>\n<p>Secondly, <a href=\"https://darulab.org/\">Barnabas Daru</a> from Stanford described his efforts to map plant traits to species distribution models. This complements some work <a href=\"https://coomeslab.org\">David Coomes</a> has been leading recently in our group with <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a> and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> on mapping rare plants globally. 
The basic problem here is that plant occurrence data is <em>extremely</em> data deficient and spatially biased for 100k+ species, and so we'll need cunning interpolation techniques to fill in the data gaps.</p>\n<p>\n<img alt=\"Barnabas Daru shows his maps on gathering plant samples from all over the world\" src=\"https://anil.recoil.org/images/nas-rs-12.webp\" title=\"Barnabas Daru shows his maps on gathering plant samples from all over the world\">\nBarnabas Daru shows his maps on gathering plant samples from all over the world</p>\n<p>When back in Cambridge, I'm going to arrange for all of us to chat to see if we can somehow combine eDNA, fungal biodiversity, plant traits and satellite foundation models into a comprehensive global plant species map!</p>\n<h3><a href=\"https://anil.recoil.org/#evidence-synthesis-from-the-literature\"></a>Evidence synthesis from the literature</h3>\n<p>There was also huge enthusiasm for another of our projects on <a href=\"https://anil.recoil.org/projects/ce\">analysing the academic literature</a> at scale. While we've been using it initially to accelerate the efficacy and accuracy of <a href=\"https://en.wikipedia.org/wiki/Systematic_review\">systematic reviews</a> for <a href=\"https://conservationevidence.com\">Conservation Evidence</a>, there are a huge number of follow-up benefits of having a comprehensive data corpus.</p>\n<p>Firstly, <a href=\"http://elphick.lab.uconn.edu/\">Chris Elphick</a> pointed out a metasynthesis where they manually integrate recent <a href=\"https://academic.oup.com/bioscience/advance-article-abstract/doi/10.1093/biosci/biaf034/8115312\">hypotheses about insect stressors and responses</a> into a network (3385 edges / 108 nodes). It found that the network is highly interconnected, with agricultural intensification often identified as a root cause for insect decline. 
Much like the CE manually labeled dataset, it should be possible to do hypothesis searches in our LLM pipeline to expand this search and make it more dynamic.</p>\n<p>Secondly, <a href=\"http://oisin.info\">Oisin Mac Aodha</a>, fresh from a <a href=\"https://watch.eeg.cl.cam.ac.uk/w/7aqBd2Nn9E6QpMvnoBPxuQ\">recent talk</a> in Cambridge, discussed his <a href=\"https://arxiv.org/abs/2502.14977\">recent work</a> on few-shot species range estimation and also <a href=\"https://arxiv.org/abs/2412.14428\">WildSAT text/image encoding</a>. His example showed how you could not only spot a species from images, but also use text prompts to refine the search. An obvious extension for us to have a go at here is to combine our large corpus of academic papers with these models to see how good the search/range estimation could get with a much larger corpus of data.</p>\n<p>\n<img alt=\"I am proud to have pronounced Oisin&apos;s name correctly while introducing his recent CCI seminar\" src=\"https://anil.recoil.org/images/nas-rs-13.webp\" title=\"I am proud to have pronounced Oisin&apos;s name correctly while introducing his recent CCI seminar\">\nI am proud to have pronounced Oisin's name correctly while introducing his recent CCI seminar</p>\n<p>And thirdly, I finally met my coauthor <a href=\"https://environment.leeds.ac.uk/see/staff/2720/david-williams\">David Williams</a> in the flesh for the first time! We've worked together recently on the <a href=\"https://anil.recoil.org/papers/2024-food-life\">biodiversity impact of food</a>, and we had a long discussion over dinner about whether we could glean more behavioural data about how people react from the wider literature. 
This would require expanding our literature corpus into <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">grey literature</a> and policy documents, but this is something that <a href=\"https://toao.com\">Sadiq Jaffer</a> and I want to do soon anyway.</p>\n<p>The connective tissue across these seemingly disparate projects is that there is a strong connection between what you can observe from space (the canopies of trees) and the traits expressed via knowledge of plant physiology and their DNA. If we could figure out how to connect the dots from the observed species to the physiological traits to the bioclimatic range variables, we could figure out where the (many) data-deficient plant species in the world are! I'll be hosting a meeting in Cambridge soon on this since we're already <a href=\"https://anil.recoil.org/notes/ukri-grant-terra\">working on it</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#visualisations-in-biodiversity\"></a>Visualisations in biodiversity</h3>\n<p>The most unexpectedly cool talk was <a href=\"https://www.weizmann.ac.il/plants/Milo/home\">Ron Milo</a> showing us visualisations of the <a href=\"https://www.pnas.org/doi/10.1073/pnas.1711842115\">mass distribution of all life on earth</a>. His work really puts our overall challenge into context, as it shows just how utterly dominated wildlife is by domesticated animals.</p>\n<p>\n<img alt=\"The dominant mammal biomass on the planet are domesticated animals\" src=\"https://anil.recoil.org/images/nas-rs-11.webp\" title=\"The dominant mammal biomass on the planet are domesticated animals\">\nThe dominant mammal biomass on the planet are domesticated animals</p>\n<p>It struck me just how important these sorts of high-level visualisations are in putting detailed numbers into context. 
For example, he also broke down global biomass, showing that plants are by far the "heaviest" living thing on earth, and that ocean organisms still dominate animal biomass.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nas-rs-9.webp\" title=\"\">\n</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nas-rs-10.webp\" title=\"\">\n</p>\n<p>My favourite new animation library on the block is <a href=\"https://animejs.com/\">AnimeJS</a>, and so I plan to do some nice animations for <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> and <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD</a> along these lines after the academic term finishes.</p>\n<p>And that's a wrap on my notes for now! I'm still hanging out in the US for a bunch more meetings (including one at <a href=\"https://www.nationalgeographic.com/\">National Geographic HQ</a>), so I'll update this note when the official RS/NAS videos and writeup come out.</p>\n<p><em>(Update 5th June: the <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">full talk video series</a> is now online at the National Academy of Sciences channel. Enjoy!)</em></p>",
+18
avsm/notes_natgeo-urban-wildlife.json
···+"summary": "<p>I stayed on for a few days extra in Washington DC after the <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">biodiversity extravaganza</a> to attend a workshop at legendary <a href=\"https://www.nationalgeographic.org/society/visit-base-camp/\">National Geographic Basecamp</a>. While I've been to several NatGeo <a href=\"https://www.nationalgeographic.org/society/national-geographic-explorers/\">Explorers</a> meetups in California, I've never had the chance to visit their HQ. The purpose of this was to attend a workshop organised by <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz</a> from St Andrews about the "Urban Exploration Project":</p>\n<blockquote>\n<p>[The UEP is a...] global-scale, community-driven initiative will collaboratively track animals across gradients of urbanization worldwide, to produce a holistic understanding of animal behaviour in human-modified landscapes that can, in turn, be used to develop evidence-based approaches to achieving sustainable human-wildlife coexistence.\n-- <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz's homepage</a></p>\n</blockquote>\n<p>This immediately grabbed my interest, since it's a very different angle on biodiversity measurement from my usual work. I've so far been mainly involved in efforts that use <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> or expert <a href=\"https://anil.recoil.org/projects/life\">range maps</a>, but the UEP program is more concerned with the dynamic <em>movements</em> of species. Wildlife movements are extremely relevant to conservation efforts since there is considerable tension between humans and wildlife in areas where both communities are under spatial pressure. 
<a href=\"https://ratsakatika.com/\">Tom Ratsakatika</a>, for example, did his <a href=\"https://ai4er-cdt.esc.cam.ac.uk/\">AI4ER</a> <a href=\"https://github.com/ratsakatika/camera-traps\">project</a> on the tensions in the <a href=\"https://www.endangeredlandscapes.org/news/advancing-human-wildlife-coexistence-in-the-carpathian-mountains/\">Romanian Carpathian mountains</a>, and <a href=\"https://www.ifaw.org/journal/human-elephant-conflict-major-threat\">elephant/human conflicts</a> and <a href=\"https://www.bbc.co.uk/news/articles/cx2j43e2j5ro\">tiger/human conflicts</a> are also well known.</p>\n<p>The core challenge posed at the workshop was how to build momentum for the UEP's vision of fostering human\u2013wildlife coexistence in the world's <em>unprotected</em> areas (often, these are areas near urban expansion zones like cities). The UEP idea sprang from Christian's earlier efforts after the pandemic on the <a href=\"https://bio-logging.net/wg/covid19-biologging/\">COVID-19 Bio-Logging</a> initiative that built up a database of over 1 billion satellite fixes for ~13,000 tagged animals across ~200 species. 
The lead student on that <a href=\"https://www.nature.com/articles/s41559-023-02125-6\">work</a>, <a href=\"https://diegoellissoto.org/\">Diego Ellis Soto</a>, has since graduated and was also at the UEP workshop sitting beside me!</p>\n<p>\n<img alt=\"NatGeo Chief Scientist Ian Miller kicks off proceedings\" src=\"https://anil.recoil.org/images/ngs-2.webp\" title=\"NatGeo Chief Scientist Ian Miller kicks off proceedings\">\nNatGeo Chief Scientist Ian Miller kicks off proceedings</p>\n<p>The workshop itself wasn't fully public (not because it's secret, but just because the details are still being iterated on), so here are some high-level takeaways from my conversations there...</p>\n<h2><a href=\"https://anil.recoil.org/#movebank-for-gps-tracking\"></a>Movebank for GPS tracking</h2>\n<p>I've used <a href=\"https://inaturalist.org\">iNaturalist</a> and <a href=\"https://www.openstreetmap.org/\">OpenStreetMap</a> extensively for wildlife occurrence and urban data, but I'm less familiar with how animal movement data is recorded. <a href=\"https://www.ab.mpg.de/person/98226\">Martin Wikelski</a> was at the workshop and explained the <a href=\"https://www.humboldt-foundation.de/en/entdecken/magazin-humboldt-kosmos/humboldt-today-the-secret-of-an-eternal-idol/the-high-flyer\">ICARUS</a> project to me, which collects data via GPS transmitters fitted to animals. This is then fed into the <a href=\"https://www.movebank.org/cms/movebank-main\">MoveBank</a> service that is custom-designed for movement data.</p>\n<p>Unlike most other biodiversity data services though, MoveBank data is not immediately made public (due to the sensitivity of animal movements), but is licensed to the user who made it. For that reason, it's less of a "social" service than iNaturalist, but still has a staggering <a href=\"https://www.movebank.org/cms/movebank-content/february-2024-newsletter\">11 million records added every day</a>. 
This data is then <a href=\"https://www.movebank.org/cms/movebank-content/archiving-animal-movements-as-biodiversity-2023-01-04\">fed into GBIF</a>, although it is downsampled to a single record per day. Martin also indicated to me that they're considering federating Movebank to other countries, which is important as <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">biodiversity data resilience</a> was a hot topic in our <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">meeting</a> a few days before.</p>\n<p>\n<img alt=\"The workshop was highly interactive through the 1.5 days. No laptops needed!\" src=\"https://anil.recoil.org/images/ngs-3.webp\" title=\"The workshop was highly interactive through the 1.5 days. No laptops needed!\">\nThe workshop was highly interactive through the 1.5 days. No laptops needed!</p>\n<h2><a href=\"https://anil.recoil.org/#storytelling-about-conservation-actions\"></a>Storytelling about conservation actions</h2>\n<p>I was really struck by how deeply the National Geographic staff were thinking about and co-designing solutions along with the academics involved. I got chatting to <a href=\"https://www.nationalgeographic.org/society/our-leadership/\">Ian Miller</a>, the chief scientist at NatGeo, about his scientific background (he's worked on all seven continents!) and how our <a href=\"https://anil.recoil.org/projects/ce\">conservation evidence database</a> might be of use to help the Society figure out the long-term impacts of their projects. I also met the person with the coolest job title there: <a href=\"https://www.linkedin.com/in/alextait/\">Alex Tait</a>, who is <a href=\"https://education.nationalgeographic.org/resource/mapping-change-roof-world/\">The Geographer</a> at the NGS. 
Alex, along with <a href=\"https://theorg.com/org/national-geographic-society/org-chart/lindsay-anderson\">Lindsay Anderson</a> and other NGS staff who participated, all had infectious enthusiasm about exploration combined with an encyclopedic knowledge of specific projects that they support involving explorers across the world.</p>\n<p>These projects ranged from the <a href=\"https://www.nationalgeographic.com/into-the-amazon/pink-dolphins-tricksters-and-thieves/\">Amazon River Dolphins</a> (to understand <a href=\"https://www.nationalgeographic.com/impact/article/fernando-trujillo-explorer-story\">aquatic health</a>) to <a href=\"https://www.nationalgeographic.com/impact/article/alex-schnell-explorer-story\">cephalopod empathy</a> and <a href=\"https://www.nationalgeographic.com/impact/article\">many more</a>. These gave me a new perspective on the importance of <em>storytelling</em> as a key mechanism to help connect the dots from conservation actions to people; something that I've been learning from <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s <a href=\"https://anil.recoil.org/notes/junior-rangers\">video series</a> as well!</p>\n<p><a href=\"https://www.nationalgeographic.com/impact\"> \n<img alt=\"I spent the whole return trip reading the impact stories. So very, very, very inspiring.\" src=\"https://anil.recoil.org/images/ngs-5.webp\" title=\"I spent the whole return trip reading the impact stories. So very, very, very inspiring.\">\nI spent the whole return trip reading the impact stories. So very, very, very inspiring. </a></p>\n<p>It's also worth noting that the NGS support goes beyond "just" filmmaking. 
Our own <a href=\"https://charlesemogor.com\">Charles Emogor</a> is also an <a href=\"https://explorers.nationalgeographic.org/directory/charles-agbor-emogor\">Explorer</a>, and recently received support from their <a href=\"https://www.nationalgeographic.org/society/our-programs/lab/\">Exploration Technology Lab</a> to get a bunch of <a href=\"https://www.wildlifeacoustics.com/products/song-meter-mini-2-aa\">biologgers</a> to support his research on <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">mapping hunting pressures</a>. Rather than placing a few big bets, the Society seems to focus on investing widely in a diverse range of people and geographies.</p>\n<h2><a href=\"https://anil.recoil.org/#the-importance-of-hedgehogs\"></a>The importance of hedgehogs</h2>\n<p>A lot of the discussion at the workshop naturally focussed on charismatic mammals such as the amazing work done by the <a href=\"https://www.zambiacarnivores.org/\">Zambian Carnivore programme</a>. However, I also had in mind the importance of addressing issues closer to home in the UK as well so that we didn't ignore Europe.</p>\n<p>Luckily, before the workshop, I had grabbed a coffee with <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a> from the CCI, who has been bringing me up to speed on the <a href=\"https://www.mammalweb.org/en/nhmp\">National Hedgehog Monitoring programme</a> (did you know that British hedgehogs are now <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">vulnerable to extinction</a>?). 
This particular effort seems to tick a lot of boxes; it's a local and beloved species in the UK, it requires <a href=\"https://www.conservationevidence.com/individual-study/1018\">evidence-based interventions</a> to avoid making the problems worse, and also requires combining data sources (from camera traps to species distribution models to urban planning to the GPS Movebank data) to build up a really accurate high res picture of what's going on.</p>\n<p>I brought up UK hedgehog conservation at the NatGeo workshop, and then while down at <a href=\"https://earthfest.world/\">Earthfest</a> at Google a few days later I learnt from <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a> that they've developed an extremely high-res map of <a href=\"https://eoscience-external.projects.earthengine.app/view/farmscapes\">woodland and hedgerows</a> in the UK. I've therefore created a new student project on <a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">hedgehog mapping</a> and hope to recruit a summer internship for this. It would be extremely cool to put the pieces together with a very concrete project such as this as a first small step for the UEP.</p>\n<p>\n<img alt=\"NatGeo Basecamp is under construction, but still epic\" src=\"https://anil.recoil.org/images/ngs-1.webp\" title=\"NatGeo Basecamp is under construction, but still epic\">\nNatGeo Basecamp is under construction, but still epic</p>\n<p>I found the whole experience of visiting National Geographic inspirational, and not just because of the projects discussed. The walls of their HQ are full of incredible photographs of explorers all over the world, and a seemingly unbounded enthusiasm for exploring the unknown. 
I kind of thought I'd aged out of applying to become an explorer, but <a href=\"https://totalkatastrophe.blogspot.com/\">Kathy Ho</a> has been encouraging me to apply, and the same was echoed by the lovely conversations with NatGeo staffers.</p>\n<p>I'm therefore putting my thinking hat on for what my Explorers project proposal should be, as I am on academic sabbatical next year and have more freedom to travel; suggestions are welcome if you see me at the pub!</p>\n<p>\n<img alt=\"I might have deliberately gone the wrong way a few times while exploring the HQ\" src=\"https://anil.recoil.org/images/ngs-4.webp\" title=\"I might have deliberately gone the wrong way a few times while exploring the HQ\">\nI might have deliberately gone the wrong way a few times while exploring the HQ</p>\",+\"content\": \"<p>I stayed on for a few days extra in Washington DC after the <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">biodiversity extravaganza</a> to attend a workshop at legendary <a href=\"https://www.nationalgeographic.org/society/visit-base-camp/\">National Geographic Basecamp</a>. While I've been to several NatGeo <a href=\"https://www.nationalgeographic.org/society/national-geographic-explorers/\">Explorers</a> meetups in California, I've never had the chance to visit their HQ. The purpose of this was to attend a workshop organised by <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz</a> from St Andrews about the "Urban Exploration Project":</p>\n<blockquote>\n<p>[The UEP is a...] 
global-scale, community-driven initiative will collaboratively track animals across gradients of urbanization worldwide, to produce a holistic understanding of animal behaviour in human-modified landscapes that can, in turn, be used to develop evidence-based approaches to achieving sustainable human-wildlife coexistence.\n-- <a href=\"https://www.st-andrews.ac.uk/biology/people/cr68\">Christian Rutz's homepage</a></p>\n</blockquote>\n<p>This immediately grabbed my interest, since it's a very different angle on biodiversity measurement from my usual work. I've so far been mainly involved in efforts that use <a href=\"https://anil.recoil.org/projects/rsn\">remote sensing</a> or expert <a href=\"https://anil.recoil.org/projects/life\">range maps</a>, but the UEP program is more concerned with the dynamic <em>movements</em> of species. Wildlife movements are extremely relevant to conservation efforts since there is considerable tension between humans and wildlife in areas where both communities are under spatial pressure. <a href=\"https://ratsakatika.com/\">Tom Ratsakatika</a>, for example, did his <a href=\"https://ai4er-cdt.esc.cam.ac.uk/\">AI4ER</a> <a href=\"https://github.com/ratsakatika/camera-traps\">project</a> on the tensions in the <a href=\"https://www.endangeredlandscapes.org/news/advancing-human-wildlife-coexistence-in-the-carpathian-mountains/\">Romanian Carpathian mountains</a>, and <a href=\"https://www.ifaw.org/journal/human-elephant-conflict-major-threat\">elephant/human conflicts</a> and <a href=\"https://www.bbc.co.uk/news/articles/cx2j43e2j5ro\">tiger/human conflicts</a> are also well known.</p>\n<p>The core challenge posed at the workshop was how to build momentum for the UEP's vision of fostering human\u2013wildlife coexistence in the world's <em>unprotected</em> areas (often, these are areas near urban expansion zones like cities). 
The UEP idea sprang from Christian's earlier efforts after the pandemic on the <a href=\"https://bio-logging.net/wg/covid19-biologging/\">COVID-19 Bio-Logging</a> initiative that built up a database of over 1 billion satellite fixes for ~13,000 tagged animals across ~200 species. The lead student on that <a href=\"https://www.nature.com/articles/s41559-023-02125-6\">work</a>, <a href=\"https://diegoellissoto.org/\">Diego Ellis Soto</a>, has since graduated and was also at the UEP workshop sitting beside me!</p>\n<p>\n<img alt=\"NatGeo Chief Scientist Ian Miller kicks off proceedings\" src=\"https://anil.recoil.org/images/ngs-2.webp\" title=\"NatGeo Chief Scientist Ian Miller kicks off proceedings\">\nNatGeo Chief Scientist Ian Miller kicks off proceedings</p>\n<p>The workshop itself wasn't fully public (not because it's secret, but just because the details are still being iterated on), so here are some high-level takeaways from my conversations there...</p>\n<h2><a href=\"https://anil.recoil.org/#movebank-for-gps-tracking\"></a>Movebank for GPS tracking</h2>\n<p>I've used <a href=\"https://inaturalist.org\">iNaturalist</a> and <a href=\"https://www.openstreetmap.org/\">OpenStreetMap</a> extensively for wildlife occurrence and urban data, but I'm less familiar with how animal movement data is recorded. <a href=\"https://www.ab.mpg.de/person/98226\">Martin Wikelski</a> was at the workshop and explained the <a href=\"https://www.humboldt-foundation.de/en/entdecken/magazin-humboldt-kosmos/humboldt-today-the-secret-of-an-eternal-idol/the-high-flyer\">ICARUS</a> project to me, which collects data via GPS transmitters fitted to animals. This is then fed into the <a href=\"https://www.movebank.org/cms/movebank-main\">MoveBank</a> service that is custom-designed for movement data.</p>\n<p>Unlike most other biodiversity data services though, MoveBank data is not immediately made public (due to the sensitivity of animal movements), but is licensed to the user who made it. 
For that reason, it's less of a "social" service than iNaturalist, but still has a staggering <a href=\"https://www.movebank.org/cms/movebank-content/february-2024-newsletter\">11 million records added every day</a>. This data is then <a href=\"https://www.movebank.org/cms/movebank-content/archiving-animal-movements-as-biodiversity-2023-01-04\">fed into GBIF</a>, although it is downsampled to a single record per day. Martin also indicated to me that they're considering federating Movebank to other countries, which is important as <a href=\"https://www.youtube.com/watch?v=gDTQ1rIEaYo&list=PLlKst-jESy-8t7lg429Movg6Fmsq2DU7y\">biodiversity data resilience</a> was a hot topic in our <a href=\"https://anil.recoil.org/notes/nas-rs-biodiversity\">meeting</a> a few days before.</p>\n<p>\n<img alt=\"The workshop was highly interactive through the 1.5 days. No laptops needed!\" src=\"https://anil.recoil.org/images/ngs-3.webp\" title=\"The workshop was highly interactive through the 1.5 days. No laptops needed!\">\nThe workshop was highly interactive through the 1.5 days. No laptops needed!</p>\n<h2><a href=\"https://anil.recoil.org/#storytelling-about-conservation-actions\"></a>Storytelling about conservation actions</h2>\n<p>I was really struck by how deeply the National Geographic staff were thinking about and co-designing solutions along with the academics involved. I got chatting to <a href=\"https://www.nationalgeographic.org/society/our-leadership/\">Ian Miller</a>, the chief scientist at NatGeo, about his scientific background (he's worked on all seven continents!) and how our <a href=\"https://anil.recoil.org/projects/ce\">conservation evidence database</a> might be of use to help the Society figure out the long-term impacts of their projects. 
I also met the person with the coolest job title there: <a href=\"https://www.linkedin.com/in/alextait/\">Alex Tait</a>, who is <a href=\"https://education.nationalgeographic.org/resource/mapping-change-roof-world/\">The Geographer</a> at the NGS. Alex, along with <a href=\"https://theorg.com/org/national-geographic-society/org-chart/lindsay-anderson\">Lindsay Anderson</a> and other NGS staff who participated, all had infectious enthusiasm about exploration combined with an encyclopedic knowledge of specific projects that they support involving explorers across the world.</p>\n<p>These projects ranged from the <a href=\"https://www.nationalgeographic.com/into-the-amazon/pink-dolphins-tricksters-and-thieves/\">Amazon River Dolphins</a> (to understand <a href=\"https://www.nationalgeographic.com/impact/article/fernando-trujillo-explorer-story\">aquatic health</a>) to <a href=\"https://www.nationalgeographic.com/impact/article/alex-schnell-explorer-story\">cephalopod empathy</a> and <a href=\"https://www.nationalgeographic.com/impact/article\">many more</a>. These gave me a new perspective on the importance of <em>storytelling</em> as a key mechanism to help connect the dots from conservation actions to people; something that I've been learning from <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>'s <a href=\"https://anil.recoil.org/notes/junior-rangers\">video series</a> as well!</p>\n<p><a href=\"https://www.nationalgeographic.com/impact\"> \n<img alt=\"I spent the whole return trip reading the impact stories. So very, very, very inspiring.\" src=\"https://anil.recoil.org/images/ngs-5.webp\" title=\"I spent the whole return trip reading the impact stories. So very, very, very inspiring.\">\nI spent the whole return trip reading the impact stories. So very, very, very inspiring. </a></p>\n<p>It's also worth noting that the NGS support goes beyond "just" filmmaking. 
Our own <a href=\"https://charlesemogor.com\">Charles Emogor</a> is also an <a href=\"https://explorers.nationalgeographic.org/directory/charles-agbor-emogor\">Explorer</a>, and recently received support from their <a href=\"https://www.nationalgeographic.org/society/our-programs/lab/\">Exploration Technology Lab</a> to get a bunch of <a href=\"https://www.wildlifeacoustics.com/products/song-meter-mini-2-aa\">biologgers</a> to support his research on <a href=\"https://anil.recoil.org/ideas/mapping-hunting-risks-for-wild-meat\">mapping hunting pressures</a>. Rather than placing a few big bets, the Society seems to focus on investing widely in a diverse range of people and geographies.</p>\n<h2><a href=\"https://anil.recoil.org/#the-importance-of-hedgehogs\"></a>The importance of hedgehogs</h2>\n<p>A lot of the discussion at the workshop naturally focussed on charismatic mammals such as the amazing work done by the <a href=\"https://www.zambiacarnivores.org/\">Zambian Carnivore programme</a>. However, I also had in mind the importance of addressing issues closer to home in the UK as well so that we didn't ignore Europe.</p>\n<p>Luckily, before the workshop, I had grabbed a coffee with <a href=\"https://www.cambridgeconservation.org/about/people/dr-silviu-o-petrovan/\">Silviu Petrovan</a> from the CCI, who has been bringing me up to speed on the <a href=\"https://www.mammalweb.org/en/nhmp\">National Hedgehog Monitoring programme</a> (did you know that British hedgehogs are now <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">vulnerable to extinction</a>?). 
This particular effort seems to tick a lot of boxes: it's a local and beloved species in the UK, it requires <a href=\"https://www.conservationevidence.com/individual-study/1018\">evidence-based interventions</a> to avoid making the problems worse, and it also requires combining data sources (from camera traps to species distribution models to urban planning to the GPS Movebank data) to build up a really accurate, high-res picture of what's going on.</p>\n<p>I brought up UK hedgehog conservation at the NatGeo workshop, and then while down at <a href=\"https://earthfest.world/\">Earthfest</a> at Google a few days later I learnt from <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a> that they've developed an extremely high-res map of <a href=\"https://eoscience-external.projects.earthengine.app/view/farmscapes\">woodland and hedgerows</a> in the UK. I've therefore created a new student project on <a href=\"https://anil.recoil.org/ideas/hedgehog-mapping\">hedgehog mapping</a> and hope to recruit a summer intern for this. It would be extremely cool to put the pieces together with a very concrete project such as this one as a first small step for the UEP.</p>\n<p>\n<img alt=\"NatGeo Basecamp is under construction, but still epic\" src=\"https://anil.recoil.org/images/ngs-1.webp\" title=\"NatGeo Basecamp is under construction, but still epic\">\nNatGeo Basecamp is under construction, but still epic</p>\n<p>I found the whole experience of visiting National Geographic inspirational, and not just because of the projects discussed. The walls of their HQ are full of incredible photographs of explorers all over the world, and a seemingly unbounded enthusiasm for exploring the unknown. 
I kind of thought I'd aged out of applying to become an explorer, but <a href=\"https://totalkatastrophe.blogspot.com/\">Kathy Ho</a> has been encouraging me to apply, and the same was echoed by the lovely conversations with NatGeo staffers.</p>\n<p>I'm therefore putting my thinking hat on for what my Explorers project proposal should be, as I am on academic sabbatical next year and have more freedom to travel; suggestions are welcome if you see me at the pub!</p>\n<p>\n<img alt=\"I might have deliberately gone the wrong way a few times while exploring the HQ\" src=\"https://anil.recoil.org/images/ngs-4.webp\" title=\"I might have deliberately gone the wrong way a few times while exploring the HQ\">\nI might have deliberately gone the wrong way a few times while exploring the HQ</p>",
+18
avsm/notes_nature-crossroads.json
···+"summary": "<p>Our <a href=\"https://anil.recoil.org/papers/2023-naturecredits\">commentary on nature-based credits</a> has been published in <a href=\"https://www.nature.com/articles/s41893-024-01403-w\">Nature\nSustainability</a>,\nled expertly by my colleagues <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> and <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a>.</p>\n<p>In our view the carbon credit markets are vitally important for forest\nconservation, but the key is to only transact these credits <em>after they have\nbeen proven to be demonstrably additional using robust statistical techniques</em>,\nso that we know before a sale that each credit represents real gains that would\nnot otherwise have occurred without the carbon finance.</p>\n<p>A more scientific approach that supports transparent, third-party validation\ncould absolutely transform these markets. And given the rapid rate of tropical\nforest loss, such upscaling of credibility is vitally necessary to raise\ninvestor confidence in protecting nature, since we can now be confident that\nevery "credit" sold is resulting in real climate benefit. There are real\nquestions remaining about this reform, of course.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/naturecrossroads-method.webp\" title=\"\">\n</p>\n<h3><a href=\"https://anil.recoil.org/#where-does-early-project-finance-come-from\"></a>Where does early project finance come from?</h3>\n<p>Since projects can no longer\nsell ex-ante credits (i.e. future credits which may not be real), then we\nneed to come up with financing models that embrace the upfront risk. 
This\nalready happens in other areas such as oil and gas; as <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\">Siddarth Shrikanth</a> notes:</p>\n<blockquote>\n<p><..>speculative efforts like mining or oil exploration, we\u2019ve still managed to build large industries out of uncertain (but potentially very valuable) payoffs. The challenge here will be to figure out which archetype different projects fall into, and create enough trust that the output will be real and valuable enough to someone to justify the up front investments<..>\n-- <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\">Siddarth Shrikanth</a> via <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7226538933961007104?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7226538933961007104%2C7226597328550273025%29&replyUrn=urn%3Ali%3Acomment%3A%28activity%3A7226538933961007104%2C7226840222288789504%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287226597328550273025%2Curn%3Ali%3Aactivity%3A7226538933961007104%29&dashReplyUrn=urn%3Ali%3Afsd_comment%3A%287226840222288789504%2Curn%3Ali%3Aactivity%3A7226538933961007104%29\">LinkedIn</a></p>\n</blockquote>\n<p>Lead author <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> comments as well that:</p>\n<blockquote>\n<p>Society has made huge policy commitments to upscale carbon & biodiversity offsetting.\nBut, carbon credit markets have suffered serious hits to their credibility & nascent biodiversity markets risk inheriting shortcomings. 
Impact evaluations have shown that these markets have systematically underdelivered additionality.\n-- <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> via <a href=\"https://www.linkedin.com/posts/sophus-zu-ermgassen-12915ba6_nature-based-carbon-markets-have-experienced-activity-7226538933961007104-mM-u?utm_source=share&utm_medium=member_desktop\">LinkedIn</a></p>\n</blockquote>\n<p>We've been working on this aspect in <a href=\"https://anil.recoil.org/projects/4c\">4C</a>, since ex-ante predictions of outcomes are necessary for project developers to be able to forecast financing. See the paper "<a href=\"https://anil.recoil.org/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a>" for our latest work on that, led by <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\">E.-Ping Rau</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<div>\n\n</div>\n<p><a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> also gave a fantastic talk at the CCI on this topic a few months ago that is a must-watch for anyone working on carbon or biodiversity markets.</p>\n<h3><a href=\"https://anil.recoil.org/#questions-of-equity-and-justice\"></a>Questions of equity and justice</h3>\n<p>It's also not enough to "just" show that a given project is additional from a satellite perspective, but also that it does not raise justice and equity concerns for the local populations. Current reporting practices often require only superficial descriptions of how projects approach justice and equity issues, which are challenging to verify and lack consistency and transparency. 
So our group has also been working on <a href=\"https://4c.cst.cam.ac.uk/news/introducing-new-framework-assessing-justice-and-equity-impacts-nature-based-solutions-projects\">a framework for assessing justice and equity impacts</a>, started by <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\">Miranda Lam</a>. I've also been working with <a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a> and <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> on the <a href=\"https://anil.recoil.org/ideas/legal-aspects-of-credits\">Legal perspectives on integrity issues in forest carbon</a>. Please do get in touch if you have thoughts on this aspect of project development.</p>",+"content": "<p>Our <a href=\"https://anil.recoil.org/papers/2023-naturecredits\">commentary on nature-based credits</a> has been published in <a href=\"https://www.nature.com/articles/s41893-024-01403-w\">Nature\nSustainability</a>,\nled expertly by my colleagues <a href=\"https://www.conservation.cam.ac.uk/directory/dr-tom-swinfield\">Thomas Swinfield</a> and <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a>.</p>\n<p>In our view the carbon credit markets are vitally important for forest\nconservation, but the key is to only transact these credits <em>after they have\nbeen proven to be demonstrably additional using robust statistical techniques</em>,\nso that we know before a sale that each credit represents real gains that would\nnot otherwise have occurred without the carbon finance.</p>\n<p>A more scientific approach that supports transparent, third-party validation\ncould absolutely transform these markets. And given the rapid rate of tropical\nforest loss, such upscaling of credibility is vitally necessary to raise\ninvestor confidence in protecting nature, since we can now be confident that\nevery "credit" sold is resulting in real climate benefit. 
There are real\nquestions remaining about this reform, of course.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/naturecrossroads-method.webp\" title=\"\">\n</p>\n<h3><a href=\"https://anil.recoil.org/#where-does-early-project-finance-come-from\"></a>Where does early project finance come from?</h3>\n<p>Since projects can no longer\nsell ex-ante credits (i.e. future credits which may not be real), then we\nneed to come up with financing models that embrace the upfront risk. This\nalready happens in other areas such as oil and gas; as <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\">Siddarth Shrikanth</a> notes:</p>\n<blockquote>\n<p><..>speculative efforts like mining or oil exploration, we\u2019ve still managed to build large industries out of uncertain (but potentially very valuable) payoffs. The challenge here will be to figure out which archetype different projects fall into, and create enough trust that the output will be real and valuable enough to someone to justify the up front investments<..>\n-- <a href=\"https://uk.linkedin.com/in/siddarthshrikanth\">Siddarth Shrikanth</a> via <a href=\"https://www.linkedin.com/feed/update/urn:li:activity:7226538933961007104?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7226538933961007104%2C7226597328550273025%29&replyUrn=urn%3Ali%3Acomment%3A%28activity%3A7226538933961007104%2C7226840222288789504%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287226597328550273025%2Curn%3Ali%3Aactivity%3A7226538933961007104%29&dashReplyUrn=urn%3Ali%3Afsd_comment%3A%287226840222288789504%2Curn%3Ali%3Aactivity%3A7226538933961007104%29\">LinkedIn</a></p>\n</blockquote>\n<p>Lead author <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> comments as well that:</p>\n<blockquote>\n<p>Society has made huge policy commitments to upscale carbon & biodiversity offsetting.\nBut, carbon credit markets have suffered serious hits to their credibility & nascent biodiversity markets risk 
inheriting shortcomings. Impact evaluations have shown that these markets have systematically underdelivered additionality.\n-- <a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> via <a href=\"https://www.linkedin.com/posts/sophus-zu-ermgassen-12915ba6_nature-based-carbon-markets-have-experienced-activity-7226538933961007104-mM-u?utm_source=share&utm_medium=member_desktop\">LinkedIn</a></p>\n</blockquote>\n<p>We've been working on this aspect in <a href=\"https://anil.recoil.org/projects/4c\">4C</a>, since ex-ante predictions of outcomes are necessary for project developers to be able to forecast financing. See the paper "<a href=\"https://anil.recoil.org/papers/2024-nbs-risk\">Mitigating risk of credit reversal in nature-based climate solutions by optimally anticipating carbon release</a>" for our latest work on that, led by <a href=\"https://www.plantsci.cam.ac.uk/staff/dr-e-ping-rau\">E.-Ping Rau</a> and <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a>.</p>\n<div>\n\n</div>\n<p><a href=\"https://www.naturerecovery.ox.ac.uk/people/sophus-zu-ermgassen/\">Sophus zu Ermgassen</a> also gave a fantastic talk at the CCI on this topic a few months ago that is a must-watch for anyone working on carbon or biodiversity markets.</p>\n<h3><a href=\"https://anil.recoil.org/#questions-of-equity-and-justice\"></a>Questions of equity and justice</h3>\n<p>It's also not enough to "just" show that a given project is additional from a satellite perspective, but also that it does not raise justice and equity concerns for the local populations. Current reporting practices often require only superficial descriptions of how projects approach justice and equity issues, which are challenging to verify and lack consistency and transparency. 
So our group has also been working on <a href=\"https://4c.cst.cam.ac.uk/news/introducing-new-framework-assessing-justice-and-equity-impacts-nature-based-solutions-projects\">a framework for assessing justice and equity impacts</a>, started by <a href=\"https://uk.linkedin.com/in/miranda-lam-a088561b4\">Miranda Lam</a>. I've also been working with <a href=\"https://www.cst.cam.ac.uk/people/smc70\">Sophie Chapman</a> and <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> on the <a href=\"https://anil.recoil.org/ideas/legal-aspects-of-credits\">Legal perspectives on integrity issues in forest carbon</a>. Please do get in touch if you have thoughts on this aspect of project development.</p>",
+18
avsm/notes_new-teaching-page.json
···+"summary": "<p>There's a new <a href=\"https://anil.recoil.org/notes/teaching\">teaching</a> page with my past and present courses, and links\nto the associated teaching materials. One of the nice things about most Cambridge\ncourses is that all the teaching materials are public, except for video recordings of\nthe lectures themselves.</p>",+"content": "<p>There's a new <a href=\"https://anil.recoil.org/notes/teaching\">teaching</a> page with my past and present courses, and links\nto the associated teaching materials. One of the nice things about most Cambridge\ncourses is that all the teaching materials are public, except for video recordings of\nthe lectures themselves.</p>",
+18
avsm/notes_ocaml-2013-liveblog.json
···+"summary": "<p>I attended the OCaml 2013 workshop and took live notes of the event. There was a lot going on here, which you can learn more about in the "<a href=\"https://anil.recoil.org/notes/the-year-in-ocamllabs\">Reviewing the first year of OCaml Labs in 2013</a>" roundup as well that I published later in the year.</p>",+"content": "<p>I attended the OCaml 2013 workshop and took live notes of the event. There was a lot going on here, which you can learn more about in the "<a href=\"https://anil.recoil.org/notes/the-year-in-ocamllabs\">Reviewing the first year of OCaml Labs in 2013</a>" roundup as well that I published later in the year.</p>",
+18
avsm/notes_ocaml-github-and-opam.json
···+"summary": "<p>Gabriel Scherer <a href=\"http://gallium.inria.fr/blog/patch-review-on-github/\">announced an\nexperiment</a> to\nhost OCaml compiler pull requests on\n<a href=\"https://github.com/ocaml/ocaml/pulls\">GitHub</a> for six months. There is\na general feeling that GitHub would be a more modern hosting platform\nthan the venerable but reliable\n<a href=\"http://caml.inria.fr/mantis/changelog_page.php\">Mantis</a> setup that has\nbeen in place for over a decade, but the only way to find out for sure is by\ntrying it out for a while.</p>\n<p>One of the great benefits of using GitHub is their excellent\n<a href=\"http://developer.github.com/v3/\">API</a> to easily automate workflows\naround issues and pull requests. After a suggestion from Jeremy Yallop\nand David Sheets over lunch, I decided to use this to make it easier to\nlocally apply compiler patches. OPAM has a great <a href=\"https://opam.ocaml.org/doc/Advanced_Usage.html#h2-Usingadifferentcompiler\">compiler\nswitch</a>\nfeature that lets you run simultaneous OCaml installations and swap\nbetween them easily.</p>\n<p>For instance, the default setting gives you access\nto:</p>\n<pre><code>$ opam switch\nsystem C system System compiler (4.01.0)\n-- -- 3.11.2 Official 3.11.2 release\n-- -- 3.12.1 Official 3.12.1 release\n-- -- 4.00.0 Official 4.00.0 release\n-- -- 4.00.1 Official 4.00.1 release\n-- -- 4.01.0 Official 4.01.0 release\n-- -- 4.01.0beta1 Beta1 release of 4.01.0\n</code></pre>\n<p>I used my <a href=\"https://github.com/avsm/ocaml-github\">GitHub API bindings</a> to\nknock up a script that converts every GitHub pull request into a custom\ncompiler switch. 
You can see these by passing the <code>--all</code> option to\n<code>opam switch</code>, as follows:</p>\n<pre><code>$ opam switch --all\n-- -- 4.02.0dev+pr10 Add String.{split,rsplit}\n-- -- 4.02.0dev+pr13 Add String.{cut,rcut}.\n-- -- 4.02.0dev+pr14 Add absolute directory names to bytecode format for ocamldebug to use\n-- -- 4.02.0dev+pr15 replace String.blit by String.unsafe_blit\n-- -- 4.02.0dev+pr17 Cmm arithmetic optimisations\n-- -- 4.02.0dev+pr18 Patch for issue 5584\n-- -- 4.02.0dev+pr2 Parse -.x**2. (unary -.) as -.(x**2.). Fix PR#3414\n-- -- 4.02.0dev+pr20 OCamlbuild: Fix the check of ocamlfind\n-- -- 4.02.0dev+pr3 Extend record punning to allow destructuring.\n-- -- 4.02.0dev+pr4 Fix for PR#4832 (Filling bigarrays may block out runtime)\n-- -- 4.02.0dev+pr6 Warn user when a type variable in a type constraint has been instantiated.\n-- -- 4.02.0dev+pr7 Extend ocamllex with actions before refilling\n-- -- 4.02.0dev+pr8 Adds a .gitignore to ignore all generated files during `make world.opt'\n-- -- 4.02.0dev+pr9 FreeBSD 10 uses clang by default, with gcc not available by default\n-- -- 4.02.0dev+trunk latest trunk snapshot\n</code></pre>\n<p>Testing the impact of a particular compiler switch is now pretty\nstraightforward. If you want to play with Stephen Dolan\u2019s <a href=\"https://github.com/ocaml/ocaml/pull/17\">optimized\narithmetic operations</a>, for\ninstance, you just need to do:</p>\n<pre><code>$ opam switch 4.02.0dev+pr17\n$ eval `opam config env`\n</code></pre>\n<p>And your local environment now points to the patched OCaml compiler. For\nthe curious, the scripts to generate the OPAM pull requests are in my\n<a href=\"https://github.com/avsm/opam-sync-github-prs\">avsm/opam-sync-github-prs</a>\nrepository. It contains an example of how to query active pull requests,\nand also to create a new cross-repository pull request (using the <a href=\"https://github.com/avsm/ocaml-github\">git\njar</a> binary from my GitHub\nbindings). 
The scripts run daily for now, and delete switches once the\ncorresponding pull request is closed. Just run <code>opam update</code> to retrieve\nthe latest switch set from the upstream <a href=\"https://github.com/ocaml/opam-repository\">OPAM package\nrepository</a>.</p>",+"content": "<p>Gabriel Scherer <a href=\"http://gallium.inria.fr/blog/patch-review-on-github/\">announced an\nexperiment</a> to\nhost OCaml compiler pull requests on\n<a href=\"https://github.com/ocaml/ocaml/pulls\">GitHub</a> for six months. There is\na general feeling that GitHub would be a more modern hosting platform\nthan the venerable but reliable\n<a href=\"http://caml.inria.fr/mantis/changelog_page.php\">Mantis</a> setup that has\nbeen in place for over a decade, but the only way to find out for sure is by\ntrying it out for a while.</p>\n<p>One of the great benefits of using GitHub is their excellent\n<a href=\"http://developer.github.com/v3/\">API</a> to easily automate workflows\naround issues and pull requests. After a suggestion from Jeremy Yallop\nand David Sheets over lunch, I decided to use this to make it easier to\nlocally apply compiler patches. OPAM has a great <a href=\"https://opam.ocaml.org/doc/Advanced_Usage.html#h2-Usingadifferentcompiler\">compiler\nswitch</a>\nfeature that lets you run simultaneous OCaml installations and swap\nbetween them easily.</p>\n<p>For instance, the default setting gives you access\nto:</p>\n<pre><code>$ opam switch\nsystem C system System compiler (4.01.0)\n-- -- 3.11.2 Official 3.11.2 release\n-- -- 3.12.1 Official 3.12.1 release\n-- -- 4.00.0 Official 4.00.0 release\n-- -- 4.00.1 Official 4.00.1 release\n-- -- 4.01.0 Official 4.01.0 release\n-- -- 4.01.0beta1 Beta1 release of 4.01.0\n</code></pre>\n<p>I used my <a href=\"https://github.com/avsm/ocaml-github\">GitHub API bindings</a> to\nknock up a script that converts every GitHub pull request into a custom\ncompiler switch. 
You can see these by passing the <code>--all</code> option to\n<code>opam switch</code>, as follows:</p>\n<pre><code>$ opam switch --all\n-- -- 4.02.0dev+pr10 Add String.{split,rsplit}\n-- -- 4.02.0dev+pr13 Add String.{cut,rcut}.\n-- -- 4.02.0dev+pr14 Add absolute directory names to bytecode format for ocamldebug to use\n-- -- 4.02.0dev+pr15 replace String.blit by String.unsafe_blit\n-- -- 4.02.0dev+pr17 Cmm arithmetic optimisations\n-- -- 4.02.0dev+pr18 Patch for issue 5584\n-- -- 4.02.0dev+pr2 Parse -.x**2. (unary -.) as -.(x**2.). Fix PR#3414\n-- -- 4.02.0dev+pr20 OCamlbuild: Fix the check of ocamlfind\n-- -- 4.02.0dev+pr3 Extend record punning to allow destructuring.\n-- -- 4.02.0dev+pr4 Fix for PR#4832 (Filling bigarrays may block out runtime)\n-- -- 4.02.0dev+pr6 Warn user when a type variable in a type constraint has been instantiated.\n-- -- 4.02.0dev+pr7 Extend ocamllex with actions before refilling\n-- -- 4.02.0dev+pr8 Adds a .gitignore to ignore all generated files during `make world.opt'\n-- -- 4.02.0dev+pr9 FreeBSD 10 uses clang by default, with gcc not available by default\n-- -- 4.02.0dev+trunk latest trunk snapshot\n</code></pre>\n<p>Testing the impact of a particular compiler switch is now pretty\nstraightforward. If you want to play with Stephen Dolan\u2019s <a href=\"https://github.com/ocaml/ocaml/pull/17\">optimized\narithmetic operations</a>, for\ninstance, you just need to do:</p>\n<pre><code>$ opam switch 4.02.0dev+pr17\n$ eval `opam config env`\n</code></pre>\n<p>And your local environment now points to the patched OCaml compiler. For\nthe curious, the scripts to generate the OPAM pull requests are in my\n<a href=\"https://github.com/avsm/opam-sync-github-prs\">avsm/opam-sync-github-prs</a>\nrepository. It contains an example of how to query active pull requests,\nand also to create a new cross-repository pull request (using the <a href=\"https://github.com/avsm/ocaml-github\">git\njar</a> binary from my GitHub\nbindings). 
The scripts run daily for now, and delete switches once the\ncorresponding pull request is closed. Just run <code>opam update</code> to retrieve\nthe latest switch set from the upstream <a href=\"https://github.com/ocaml/opam-repository\">OPAM package\nrepository</a>.</p>",
+18
avsm/notes_ocaml-labs-at-icfp-2014.json
···+"summary": "<p>It's the ever-exciting week of the <a href=\"https://icfpconference.org/\">International Conference on\nFunctional Programming</a> again in Sweden,\nand this time <a href=\"http://ocaml.io\">OCaml Labs</a> has a variety of talks,\ntutorials and keynotes to deliver throughout the week. This post\nsummarises them all so you can navigate your way to the right session.\nRemember that once you register for a particular day at ICFP, you can\nmove between workshops and tutorials as you please.</p>\n<p>\n<img alt=\"Gothenburg, the location of this year&apos;s ICFP conference.\" src=\"https://anil.recoil.org/images/gothenburg.webp\" title=\"Gothenburg, the location of this year&apos;s ICFP conference.\">\nGothenburg, the location of this year's ICFP conference.\nQuick links to the below in date order:</p>\n<ul>\n<li>Talk on <a href=\"https://anil.recoil.org/#coeffects\">Coeffects, a Calculus of Context-dependent\nComputation</a>, Monday 1st September, 16:30-17:20, ICFP\nDay 1.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#implicits\">Modular Implicits</a>, Thu 4th September,\n14:25-14:50, ML Workshop.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#modulealiases\">Module Aliases</a>, Thu 4th September,\n09:35-10:00, ML Workshop.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#metamirage\">Metaprogramming in the Mirage OS</a>, Thu 4th\nSeptember, 14:50-15:10, ML Workshop.</li>\n<li>Keynote talk on <a href=\"https://anil.recoil.org/#unikernels\">Unikernels</a>, Fri 5th September,\n09:00-10:00, Haskell Symposium.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#multicore\">Multicore OCaml</a>, Fri 5th September,\n09:10-10:00, OCaml Workshop.</li>\n<li>Tutorial on <a href=\"https://anil.recoil.org/#cufptutorial\">OCaml and JavaScript Programming</a>, Fri\n5th September, 09:00-12:00, CUFP Tutorial Day 2.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#zeroinstall\">0install binary distribution</a>, Fri 5th\nSeptember, 10:25-10:50, 
OCaml Workshop.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#tls\">Transport Layer Security in OCaml</a>, Fri 5th\nSeptember, 10:50-11:20, OCaml Workshop.</li>\n<li>Talk/Demo on the <a href=\"https://anil.recoil.org/#platform\">OCaml Platform</a>, Fri 5th September,\n12:00-12:30, OCaml Workshop.</li>\n<li>Poster and Demo of the <a href=\"https://anil.recoil.org/#irmin\">Irmin branch-consistent store</a>, Fri\n5th September, 15:10-16:30, OCaml/ML Workshop.</li>\n<li><a href=\"https://anil.recoil.org/#social\">Social Events</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#language-and-compiler-improvements\"></a>Language and Compiler Improvements</h2>\n<p>The first round of talks are about improvements to the core OCaml\nlanguage and runtime.</p>\n<h3><a href=\"https://anil.recoil.org/#modular-implicits\"></a>\u00bb Modular implicits</h3>\n<p>Leo White and Frederic Bour have been taking inspiration from Scala\nimplicits and <a href=\"https://www.mpi-sws.org/~dreyer/papers/mtc/main-short.pdf\">Modular Type\nClasses</a> by\nDreyer <em>et al</em>, and will describe the design and implementation of a\nsystem for ad-hoc polymorphism in OCaml based on passing implicit module\nparameters to functions based on their module type.</p>\n<p>This provides a concise way to write functions to print or manipulate\nvalues generically, while maintaining the ML spirit of explicit\nmodularity. 
You can actually get a taste of this new feature ahead\nof the talk, thanks to a new facility in OCaml: we can compile any OPAM\nswitch directly into an interactive JavaScript notebook thanks to\n<a href=\"https://github.com/andrewray/iocamljs\">iocamljs</a> by <a href=\"http://ujamjar.github.io/\">Andy\nRay</a>.</p>\n<ul>\n<li><a href=\"http://www.lpw25.net/ml2014.pdf\">Abstract</a></li>\n<li><a href=\"http://andrewray.github.io/iocamljs/modimp_show.html\">Interactive\nCompiler</a></li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#multicore-ocaml\"></a>Multicore OCaml</h3>\n<p>Currently, threading in OCaml is only supported by means of a global\nlock, allowing at most one thread to run OCaml code at any time. Stephen\nDolan, Leo White and Anil Madhavapeddy have been building on the <a href=\"http://www.cl.cam.ac.uk/~sd601/multicore.md\">early\ndesign</a> of a multicore\nOCaml runtime that they started in January, and now have an (early)\nprototype of a runtime design that is capable of shared memory\nparallelism.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_1.pdf\">Abstract</a></li>\n<li>Date: 09:10-10:00, OCaml Workshop, Fri Sept 5th</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#type-level-module-aliases\"></a>Type-level Module Aliases</h3>\n<p>Leo White has been working with <a href=\"http://www.math.nagoya-u.ac.jp/~garrigue/\">Jacques\nGarrigue</a> on adding support\nfor module aliases into OCaml. 
This significantly improves the\ncompilation speed and executable binary sizes when using large libraries\nsuch as\n<a href=\"https://realworldocaml.org/v1/en/html/concurrent-programming-with-async.html\">Core/Async</a>.</p>\n<ul>\n<li><a href=\"https://sites.google.com/site/mlworkshoppe/modalias.pdf?attredirects=0\">Abstract</a></li>\n<li><a href=\"https://blogs.janestreet.com/better-namespaces-through-module-aliases\">Better Namespaces through Module\nAliases</a></li>\n<li>Date: 0935-1000, ML Workshop, Thu Sep 4th.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#coeffects-a-calculus-of-context-dependent-computation\"></a>Coeffects: A Calculus of Context-dependent Computation</h3>\n<p>Alan Mycroft has been working with Tomas Petricek and Dominic Orchard on\ndefining a broader notion of context than just variables in scope. Tomas\nwill be presenting a research paper on developing a generalized coeffect\nsystem with annotations indexed by a correct shape.</p>\n<ul>\n<li><a href=\"http://www.cl.cam.ac.uk/~dao29/publ/coeffects-icfp14.pdf\">Paper</a></li>\n<li>Date: 16:30-17:20, ICFP Day 1, Mon Sep 1st.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#mirage-os-20\"></a>Mirage OS 2.0</h2>\n<p>We <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">released Mirage OS\n2.0</a> in July,\nand there will be several talks diving into some of the new features you\nmay have read on the blog.</p>\n<h3><a href=\"https://anil.recoil.org/#unikernels-keynote-at-haskell-symposium\"></a>Unikernels Keynote at Haskell Symposium</h3>\n<p>Since MirageOS is a\n<a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.pdf\">unikernel</a>\nwritten entirely in OCaml, it makes perfect sense to describe it in\ndetail to our friends over at the <a href=\"http://www.haskell.org/haskell-symposium/\">Haskell\nSymposium</a> and reflect on\nsome of the design implications between Haskell type-classes and OCaml\nfunctors and metaprogramming. 
Anil Madhavapeddy will be doing just that\nin a Friday morning keynote at the Haskell Symposium.</p>\n<ul>\n<li>Haskell Symposium\n<a href=\"http://www.haskell.org/haskell-symposium/2014/index.html\">Program</a></li>\n<li>Date: 0900-1000, Haskell Symposium, Fri Sep 5th.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#transport-layer-security-in-ocaml\"></a>Transport Layer Security in OCaml</h3>\n<p>Hannes Mehnert and David Kaloper have been <a href=\"http://openmirage.org/blog/introducing-ocaml-tls\">working\nhard</a> on integrating a\npure OCaml Transport Layer Security stack into Mirage OS. They\u2019ll talk\nabout the design principles underlying the library, and reflect on the\nnext steps to build a TLS stack that we can rely on not to be more\ninsecure than telnet.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_4.pdf\">Abstract</a></li>\n<li>Date: 10:25-11:20, OCaml Workshop, Fri Sep 5th.</li>\n</ul>\n<p>Hannes will also continue his travels and deliver a couple of talks the\nweek after ICFP on the same topic in Denmark, so you can still see it if\nyou happen to miss this week\u2019s presentation:</p>\n<ul>\n<li>9th Sep at 15:00, IT University of Copenhagen (2A08),\n<a href=\"http://list.ku.dk/pipermail/sci-diku-prog-lang/2014-August/000244.html\">details</a></li>\n<li>11th Sep Aarhus University, same talk (time and room TBA)</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#irmin-a-branch-consistent-distributed-library-database\"></a>Irmin: a Branch-consistent Distributed Library Database</h3>\n<p>Irmin is an <a href=\"https://github.com/mirage/irmin\">OCaml library</a> to persist\nand synchronize distributed data structures both on-disk and in-memory.\nIt enables a style of programming very similar to the Git workflow,\nwhere distributed nodes fork, fetch, merge and push data between each\nother. 
The general idea is that you want every active node to get a\nlocal (partial) copy of a global database and always be very explicit\nabout how and when data is shared and migrated.</p>\n<p>This has been a big collaborative effort led by Thomas Gazagnaire, and\nincludes contributions from Amir Chaudhry, Anil Madhavapeddy, Richard\nMortier, David Scott, David Sheets, Gregory Tsipenyuk and Jon Crowcroft.\nWe\u2019ll be demonstrating Irmin <a href=\"https://www.youtube.com/watch?v=DSzvFwIVm5s\">in\naction</a>, so please come\nalong if you\u2019ve got any interesting applications you would like to talk\nto us about.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_11.pdf\">Abstract</a></li>\n<li><a href=\"http://openmirage.org/blog/introducing-irmin\">Blog Post</a></li>\n<li>Date: 15:10-16:30, Joint Poster Session for OCaml/ML Workshop, Fri\nSep 5th 2014.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#metaprogramming-with-ml-modules-in-the-mirageos\"></a>Metaprogramming with ML modules in the MirageOS</h3>\n<p>Mirage OS lets the programmer build modular operating system components\nusing a combination of OCaml functors and generative metaprogramming.\nThis ensures portability across both Unix binaries and Xen unikernels,\nwhile preserving a usable developer workflow.</p>\n<p>The core Mirage OS team of Anil Madhavapeddy, Thomas Gazagnaire, David\nScott and Richard Mortier will be talking about the details of the\nfunctor combinators that make all this possible, and doing a live\ndemonstration of it running on a tiny <a href=\"http://openmirage.org/blog/introducing-xen-minios-arm\">ARM\nboard</a>!</p>\n<ul>\n<li><a href=\"https://sites.google.com/site/mlworkshoppe/Gazagnaire-abstract.pdf?attredirects=0\">Abstract</a></li>\n<li>Date: 14:50-15:10, ML Workshop, Thu Sep 4th 2014.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#cufp-ocaml-language-tutorial\"></a>CUFP OCaml Language Tutorial</h3>\n<p>Leo White and Jeremy Yallop (with much helpful 
assistance from Daniel\nBuenzli) will be giving a rather different OCaml tutorial from the usual\nfare: they are taking you on a journey of building a variant of the\npopular <a href=\"http://gabrielecirulli.github.io/2048/\">2048</a> game in pure\nOCaml, and compiling it to JavaScript using the\n<a href=\"http://ocsigen.org/js_of_ocaml/\">js_of_ocaml</a> compiler. This is a\nvery pragmatic introduction to using statically typed functional\nprogramming combined with efficient compilation to JavaScript.</p>\n<blockquote>\n<p>In this tutorial, we will first introduce the basics of OCaml using an\ninteractive environment running in a web browser, as well as a local\ninstall of OCaml using the OPAM package manager. We will also explore\nhow to compile OCaml to JavaScript using the js_of_ocaml tool.</p>\n</blockquote>\n<p>The tutorial is focused around writing the 2048 logic, which will then\nbe compiled with js_of_ocaml and linked together with a frontend based\non (a pre-release version of) Useri, React, Gg and Vg, thanks to Daniel\nBuenzli. There\u2019ll also be appearances from OPAM, IOCaml, Qcheck and\nOUnit.</p>\n<ul>\n<li><a href=\"https://github.com/ocamllabs/cufp-tutorial/\">Tutorial Code</a></li>\n<li><a href=\"https://github.com/ocamllabs/cufp-tutorial/blob/master/task.md\">Task\nSheet</a></li>\n<li>Date: 09:00-12:00, CUFP Tutorial Day 2, Fri Sep 5th 2014.</li>\n</ul>\n<p>There will also be a limited supply of special edition OCaml-branded USB\nsticks for the first tutorial attendees, so get here early for your\nexclusive swag!</p>\n<h2><a href=\"https://anil.recoil.org/#the-ocaml-platform\"></a>The OCaml Platform</h2>\n<p>The group here has been working hard all summer to pull together an\nintegrated demonstration of the new generation of OCaml tools being\nbuilt around the increasingly popular <a href=\"https://opam.ocaml.org\">OPAM</a>\npackage manager. 
Anil Madhavapeddy will demonstrate all of these pieces\nin the OCaml Workshop, with guest appearances of work from Amir\nChaudhry, Daniel Buenzli, Jeremie Diminio, Thomas Gazagnaire, Louis\nGesbert, Thomas Leonard, David Sheets, Mark Shinwell, Christophe\nTroestler, Leo White and Jeremy Yallop.</p>\n<blockquote>\n<p>The OCaml Platform combines the OCaml compiler toolchain with a\ncoherent set of tools for build, documentation, testing and IDE\nintegration. The project is a collaborative effort across the OCaml\ncommunity, tied together by the OCaml Labs group in Cambridge and with\nother major contributors.</p>\n</blockquote>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">Abstract</a></li>\n<li><a href=\"https://opam.ocaml.org/blog\">Platform Blog</a></li>\n<li>Date: 12:00-12:30, OCaml Workshop, Fri Sep 5th 2014.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#the-0install-binary-installation-system\"></a>The 0install Binary Installation System</h3>\n<p>Thomas Leonard will also be delivering a separate talk about\ncross-platform binary installation via his\n<a href=\"http://zero-install.sourceforge.net/\">0install</a> library, which works on\na variety of platforms including Windows, Linux and MacOS X. 
He\nrecently rewrote it in <a href=\"http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-retrospective/\">OCaml from\nPython</a>,\nand will be sharing his experiences on how this went as a new OCaml\nuser, as well as delivering an introduction to 0install.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_3.pdf\">Abstract</a></li>\n<li>Date: 10:25-10:50, OCaml Workshop, Fri Sep 5th 2014.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#service-and-socialising\"></a>Service and Socialising</h2>\n<p>Heidi Howard and Leonhard Markert are acting as student volunteers at\nthis year\u2019s ICFP, and assisting with videoing various workshops such as\nCUFP Tutorials, Haskell Symposium, the Workshop on Functional\nHigh-Performance Computing and the ML Family Workshop. Follow their live\nblogging on the <a href=\"http://www.syslog.cl.cam.ac.uk/\">Systems Research Group\nSysBlog</a> and leave comments about any\nsessions you\u2019d like to know more about!</p>\n<p>Anil Madhavapeddy is the ICFP industrial relations chair and will be\nhosting an Industrial Reception on Thursday 4th September in the <a href=\"http://www.varldskulturmuseerna.se/varldskulturmuseet/\">Museum\nof World\nCulture</a>\nstarting from 7pm. There will be wine, food and some inspirational talks from the ICFP\nsponsors that not only make the conference possible, but provide an\navenue for the academic work to make its way out into industry (grad\nstudents that are job hunting: this is where you get to chat to folk\nhiring FP talent).</p>\n<p>This list isn\u2019t exhaustive, and only covers the activities of my\ngroup in <a href=\"http://ocaml.io\">OCaml Labs</a> and the <a href=\"http://www.cl.cam.ac.uk/research/srg/\">Systems Research\nGroup</a> at Cambridge. 
There are\nnumerous other talks from the Cambridge Computer Lab during the week,\nbut the artistic highlight will be on Saturday evening following the\n<a href=\"http://cufp.org/2014/\">CUFP talks</a>: <a href=\"http://sam.aaron.name/\">Sam Aaron</a>\nwill be doing a <a href=\"https://twitter.com/samaaron/status/505081137660981248\">live musical\nperformance</a>\nsometime after 8pm at <a href=\"http://www.3vaningen.se/\">3vaningen</a>. Sounds like\na perfect way to wind down after what\u2019s gearing up to be an intense\nICFP 2014. I look forward to seeing old friends and making new ones in\nGothenburg soon!</p>",+"content": "<p>It's the ever-exciting week of the <a href=\"https://icfpconference.org/\">International Conference on\nFunctional Programming</a> again in Sweden,\nand this time <a href=\"http://ocaml.io\">OCaml Labs</a> has a variety of talks,\ntutorials and keynotes to deliver throughout the week. This post\nsummarises them all so you can navigate your way to the right session.\nRemember that once you register for a particular day at ICFP, you can\nmove between workshops and tutorials as you please.</p>\n<p>\n<img alt=\"Gothenburg, the location of this year&apos;s ICFP conference.\" src=\"https://anil.recoil.org/images/gothenburg.webp\" title=\"Gothenburg, the location of this year&apos;s ICFP conference.\">\nGothenburg, the location of this year's ICFP conference.\nQuick links to the below in date order:</p>\n<ul>\n<li>Talk on <a href=\"https://anil.recoil.org/#coeffects\">Coeffects, a Calculus of Context-dependent\nComputation</a>, Monday 1st September, 16:30-17:20, ICFP\nDay 1.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#implicits\">Modular Implicits</a>, Thu 4th September,\n14:25-14:50, ML Workshop.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#modulealiases\">Module Aliases</a>, Thu 4th September,\n09:35-10:00, ML Workshop.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#metamirage\">Metaprogramming in the Mirage OS</a>, Thu 
4th\nSeptember, 14:50-15:10, ML Workshop.</li>\n<li>Keynote talk on <a href=\"https://anil.recoil.org/#unikernels\">Unikernels</a>, Fri 5th September,\n09:00-10:00, Haskell Symposium.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#multicore\">Multicore OCaml</a>, Fri 5th September,\n09:10-10:00, OCaml Workshop.</li>\n<li>Tutorial on <a href=\"https://anil.recoil.org/#cufptutorial\">OCaml and JavaScript Programming</a>, Fri\n5th September, 09:00-12:00, CUFP Tutorial Day 2.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#zeroinstall\">0install binary distribution</a>, Fri 5th\nSeptember, 10:25-10:50, OCaml Workshop.</li>\n<li>Talk on <a href=\"https://anil.recoil.org/#tls\">Transport Layer Security in OCaml</a>, Fri 5th\nSeptember, 10:50-11:20, OCaml Workshop.</li>\n<li>Talk/Demo on the <a href=\"https://anil.recoil.org/#platform\">OCaml Platform</a>, Fri 5th September,\n12:00-12:30, OCaml Workshop.</li>\n<li>Poster and Demo of the <a href=\"https://anil.recoil.org/#irmin\">Irmin branch-consistent store</a>, Fri\n5th September, 15:10-16:30, OCaml/ML Workshop.</li>\n<li><a href=\"https://anil.recoil.org/#social\">Social Events</a></li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#language-and-compiler-improvements\"></a>Language and Compiler Improvements</h2>\n<p>The first round of talks is about improvements to the core OCaml\nlanguage and runtime.</p>\n<h3><a href=\"https://anil.recoil.org/#modular-implicits\"></a>Modular implicits</h3>\n<p>Leo White and Frederic Bour have been taking inspiration from Scala\nimplicits and <a href=\"https://www.mpi-sws.org/~dreyer/papers/mtc/main-short.pdf\">Modular Type\nClasses</a> by\nDreyer <em>et al</em>, and will describe the design and implementation of a\nsystem for ad-hoc polymorphism in OCaml that passes implicit module\nparameters to functions, selected based on their module type.</p>\n<p>This provides a concise way to write functions to print or manipulate\nvalues generically, while maintaining the ML 
spirit of explicit\nmodularity. You can actually get a taste of this new feature ahead\nof the talk, thanks to a new facility in OCaml: we can compile any OPAM\nswitch directly into an interactive JavaScript notebook using\n<a href=\"https://github.com/andrewray/iocamljs\">iocamljs</a> by <a href=\"http://ujamjar.github.io/\">Andy\nRay</a>.</p>\n<ul>\n<li><a href=\"http://www.lpw25.net/ml2014.pdf\">Abstract</a></li>\n<li><a href=\"http://andrewray.github.io/iocamljs/modimp_show.html\">Interactive\nCompiler</a></li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#multicore-ocaml\"></a>Multicore OCaml</h3>\n<p>Currently, threading in OCaml is only supported by means of a global\nlock, allowing at most one thread to run OCaml code at any time. Stephen\nDolan, Leo White and Anil Madhavapeddy have been building on the <a href=\"http://www.cl.cam.ac.uk/~sd601/multicore.md\">early\ndesign</a> of a multicore\nOCaml runtime that they started in January, and now have an (early)\nprototype of a runtime design that is capable of shared memory\nparallelism.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_1.pdf\">Abstract</a></li>\n<li>Date: 09:10-10:00, OCaml Workshop, Fri Sept 5th</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#type-level-module-aliases\"></a>Type-level Module Aliases</h3>\n<p>Leo White has been working with <a href=\"http://www.math.nagoya-u.ac.jp/~garrigue/\">Jacques\nGarrigue</a> on adding support\nfor module aliases into OCaml. 
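</p>\n<p>As a rough sketch of what this looks like (the <code>Mylib_*</code> module names here are hypothetical), a library can expose a single namespace module consisting only of aliases:</p>\n<pre><code> (* mylib.ml: a namespace module made only of module aliases *)\n module List = Mylib_list\n module String = Mylib_string\n</code></pre>\n<p>Because such aliases can be tracked at the type level rather than as ordinary module bindings, submodules that are never referenced need not be linked into the final executable. 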
This significantly improves the\ncompilation speed and executable binary sizes when using large libraries\nsuch as\n<a href=\"https://realworldocaml.org/v1/en/html/concurrent-programming-with-async.html\">Core/Async</a>.</p>\n<ul>\n<li><a href=\"https://sites.google.com/site/mlworkshoppe/modalias.pdf?attredirects=0\">Abstract</a></li>\n<li><a href=\"https://blogs.janestreet.com/better-namespaces-through-module-aliases\">Better Namespaces through Module\nAliases</a></li>\n<li>Date: 0935-1000, ML Workshop, Thu Sep 4th.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#coeffects-a-calculus-of-context-dependent-computation\"></a>Coeffects: A Calculus of Context-dependent Computation</h3>\n<p>Alan Mycroft has been working with Tomas Petricek and Dominic Orchard on\ndefining a broader notion of context than just variables in scope. Tomas\nwill be presenting a research paper on developing a generalized coeffect\nsystem with annotations indexed by a correct shape.</p>\n<ul>\n<li><a href=\"http://www.cl.cam.ac.uk/~dao29/publ/coeffects-icfp14.pdf\">Paper</a></li>\n<li>Date: 16:30-17:20, ICFP Day 1, Mon Sep 1st.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#mirage-os-20\"></a>Mirage OS 2.0</h2>\n<p>We <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">released Mirage OS\n2.0</a> in July,\nand there will be several talks diving into some of the new features you\nmay have read on the blog.</p>\n<h3><a href=\"https://anil.recoil.org/#unikernels-keynote-at-haskell-symposium\"></a>Unikernels Keynote at Haskell Symposium</h3>\n<p>Since MirageOS is a\n<a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.pdf\">unikernel</a>\nwritten entirely in OCaml, it makes perfect sense to describe it in\ndetail to our friends over at the <a href=\"http://www.haskell.org/haskell-symposium/\">Haskell\nSymposium</a> and reflect on\nsome of the design implications between Haskell type-classes and OCaml\nfunctors and metaprogramming. 
Anil Madhavapeddy will be doing just that\nin a Friday morning keynote at the Haskell Symposium.</p>\n<ul>\n<li>Haskell Symposium\n<a href=\"http://www.haskell.org/haskell-symposium/2014/index.html\">Program</a></li>\n<li>Date: 0900-1000, Haskell Symposium, Fri Sep 5th.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#transport-layer-security-in-ocaml\"></a>Transport Layer Security in OCaml</h3>\n<p>Hannes Menhert and David Kaloper have been <a href=\"http://openmirage.org/blog/introducing-ocaml-tls\">working\nhard</a> on integrating a\npure OCaml Transport Layer Security stack into Mirage OS. They\u2019ll talk\nabout the design principles underlying the library, and reflect on the\nnext steps to build a TLS stack that we can rely on not to be more\ninsecure than telnet.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_4.pdf\">Abstract</a></li>\n<li>Date: 10:25-11:20, OCaml Workshop, Fri Sep 5th.</li>\n</ul>\n<p>Hannes will also continue his travels and deliver a couple of talks the\nweek after ICFP on the same topic in Denmark, so you can still see it if\nyou happen to miss this week\u2019s presentation:</p>\n<ul>\n<li>9th Sep at 15:00, IT University of Copenhagen (2A08),\n<a href=\"http://list.ku.dk/pipermail/sci-diku-prog-lang/2014-August/000244.html\">details</a></li>\n<li>11th Sep Aarhus University, same talk (time and room TBA)</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#irmin-a-branch-consistent-distributed-library-database\"></a>Irmin: a Branch-consistent Distributed Library Database</h3>\n<p>Irmin is an <a href=\"https://github.com/mirage/irmin\">OCaml library</a> to persist\nand synchronize distributed data structures both on-disk and in-memory.\nIt enables a style of programming very similar to the Git workflow,\nwhere distributed nodes fork, fetch, merge and push data between each\nother. 
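</p>\n<p>As a purely hypothetical sketch (the <code>Store</code> functions below are illustrative, not Irmin\u2019s actual API), that Git-like workflow looks something like:</p>\n<pre><code> (* hypothetical sketch of a branch-and-merge storage workflow *)\n let master = Store.create ~branch:"master" ()\n let dev = Store.clone master ~branch:"dev"\n let () =\n   Store.update dev ~key:["a"; "b"] ~value:"v";\n   Store.merge dev ~into:master\n</code></pre>\n<p>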
The general idea is that you want every active node to get a\nlocal (partial) copy of a global database and always be very explicit\nabout how and when data is shared and migrated.</p>\n<p>This has been a big collaborative effort led by Thomas Gazagnaire, and\nincludes contributions from Amir Chaudhry, Anil Madhavapeddy, Richard\nMortier, David Scott, David Sheets, Gregory Tsipenyuk and Jon Crowcroft.\nWe\u2019ll be demonstrating Irmin <a href=\"https://www.youtube.com/watch?v=DSzvFwIVm5s\">in\naction</a>, so please come\nalong if you\u2019ve got any interesting applications you would like to talk\nto us about.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_11.pdf\">Abstract</a></li>\n<li><a href=\"http://openmirage.org/blog/introducing-irmin\">Blog Post</a></li>\n<li>Date: 15:10-16:30, Joint Poster Session for OCaml/ML Workshop, Fri\nSep 5th 2014.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#metaprogramming-with-ml-modules-in-the-mirageos\"></a>Metaprogramming with ML modules in the MirageOS</h3>\n<p>Mirage OS lets the programmer build modular operating system components\nusing a combination of OCaml functors and generative metaprogramming.\nThis ensures portability across both Unix binaries and Xen unikernels,\nwhile preserving a usable developer workflow.</p>\n<p>The core Mirage OS team of Anil Madhavapeddy, Thomas Gazagnaire, David\nScott and Richard Mortier will be talking about the details of the\nfunctor combinators that make all this possible, and doing a live\ndemonstration of it running on a tiny <a href=\"http://openmirage.org/blog/introducing-xen-minios-arm\">ARM\nboard</a>!</p>\n<ul>\n<li><a href=\"https://sites.google.com/site/mlworkshoppe/Gazagnaire-abstract.pdf?attredirects=0\">Abstract</a></li>\n<li>Date: 14:50-15:10, ML Workshop, Thu Sep 4th 2014.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#cufp-ocaml-language-tutorial\"></a>CUFP OCaml Language Tutorial</h3>\n<p>Leo White and Jeremy Yallop (with much helpful 
assistance from Daniel\nBuenzli) will be giving a rather different OCaml tutorial from the usual\nfare: they are taking you on a journey of building a variant of the\npopular <a href=\"http://gabrielecirulli.github.io/2048/\">2048</a> game in pure\nOCaml, and compiling it to JavaScript using the\n<a href=\"http://ocsigen.org/js_of_ocaml/\">js_of_ocaml</a> compiler. This is a\nvery pragmatic introduction to using statically typed functional\nprogramming combined with efficient compilation to JavaScript.</p>\n<blockquote>\n<p>In this tutorial, we will first introduce the basics of OCaml using an\ninteractive environment running in a web browser, as well as a local\ninstall of OCaml using the OPAM package manager. We will also explore\nhow to compile OCaml to JavaScript using the js_of_ocaml tool.</p>\n</blockquote>\n<p>The tutorial is focused around writing the 2048 logic, which will then\nbe compiled with js_of_ocaml and linked together with a frontend based\non (a pre-release version of) Useri, React, Gg and Vg, thanks to Daniel\nBuenzli. There\u2019ll also be appearances from OPAM, IOCaml, Qcheck and\nOUnit.</p>\n<ul>\n<li><a href=\"https://github.com/ocamllabs/cufp-tutorial/\">Tutorial Code</a></li>\n<li><a href=\"https://github.com/ocamllabs/cufp-tutorial/blob/master/task.md\">Task\nSheet</a></li>\n<li>Date: 09:00-12:00, CUFP Tutorial Day 2, Fri Sep 5th 2014.</li>\n</ul>\n<p>There will also be a limited supply of special edition OCaml-branded USB\nsticks for the first tutorial attendees, so get here early for your\nexclusive swag!</p>\n<h2><a href=\"https://anil.recoil.org/#the-ocaml-platform\"></a>The OCaml Platform</h2>\n<p>The group here has been working hard all summer to pull together an\nintegrated demonstration of the new generation of OCaml tools being\nbuilt around the increasingly popular <a href=\"https://opam.ocaml.org\">OPAM</a>\npackage manager. 
Anil Madhavapeddy will demonstrate all of these pieces\nin the OCaml Workshop, with guest appearances of work from Amir\nChaudhry, Daniel Buenzli, Jeremie Diminio, Thomas Gazagnaire, Louis\nGesbert, Thomas Leonard, David Sheets, Mark Shinwell, Christophe\nTroestler, Leo White and Jeremy Yallop.</p>\n<blockquote>\n<p>The OCaml Platform combines the OCaml compiler toolchain with a\ncoherent set of tools for build, documentation, testing and IDE\nintegration. The project is a collaborative effort across the OCaml\ncommunity, tied together by the OCaml Labs group in Cambridge and with\nother major contributors.</p>\n</blockquote>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">Abstract</a></li>\n<li><a href=\"https://opam.ocaml.org/blog\">Platform Blog</a></li>\n<li>Date: 12:00-12:30, OCaml Workshop, Fri Sep 5th 2014.</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#the-0install-binary-installation-system\"></a>The 0install Binary Installation System</h3>\n<p>Thomas Leonard will also be delivering a separate talk about\ncross-platform binary installation via his\n<a href=\"http://zero-install.sourceforge.net/\">0install</a> library, which works on\na variety of platforms including Windows, Linux and MacOS X. 
He\nrecently rewrote it in <a href=\"http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-retrospective/\">OCaml from\nPython</a>,\nand will be sharing his experiences on how this went as a new OCaml\nuser, as well as delivering an introduction to 0install.</p>\n<ul>\n<li><a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_3.pdf\">Abstract</a></li>\n<li>Date: 10:25-10:50, OCaml Workshop, Fri Sep 5th 2014.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#service-and-socialising\"></a>Service and Socialising</h2>\n<p>Heidi Howard and Leonhard Markert are acting as student volunteers at\nthis year\u2019s ICFP, and assisting with videoing various workshops such as\nCUFP Tutorials, Haskell Symposium, the Workshop on Functional\nHigh-Performance Computing and the ML Family Workshop. Follow their live\nblogging on the <a href=\"http://www.syslog.cl.cam.ac.uk/\">Systems Research Group\nSysBlog</a> and leave comments about any\nsessions you\u2019d like to know more about!</p>\n<p>Anil Madhavapeddy is the ICFP industrial relations chair and will be\nhosting an Industrial Reception on Thursday 4th September in the <a href=\"http://www.varldskulturmuseerna.se/varldskulturmuseet/\">Museum\nof World\nCulture</a>\nstarting from 7pm. There will be wine, food and some inspirational talks from the ICFP\nsponsors that not only make the conference possible, but provide an\navenue for the academic work to make its way out into industry (grad\nstudents that are job hunting: this is where you get to chat to folk\nhiring FP talent).</p>\n<p>This list isn\u2019t exhaustive, and only covers the activities of my\ngroup in <a href=\"http://ocaml.io\">OCaml Labs</a> and the <a href=\"http://www.cl.cam.ac.uk/research/srg/\">Systems Research\nGroup</a> at Cambridge. 
There are\nnumerous other talks from the Cambridge Computer Lab during the week,\nbut the artistic highlight will be on Saturday evening following the\n<a href=\"http://cufp.org/2014/\">CUFP talks</a>: <a href=\"http://sam.aaron.name/\">Sam Aaron</a>\nwill be doing a <a href=\"https://twitter.com/samaaron/status/505081137660981248\">live musical\nperformance</a>\nsometime after 8pm at <a href=\"http://www.3vaningen.se/\">3vaningen</a>. Sounds like\na perfect way to wind down after what\u2019s gearing up to be an intense\nICFP 2014. I look forward to seeing old friends and making new ones in\nGothenburg soon!</p>",
+18
avsm/notes_ocaml-opam-new-layout.json
+18
avsm/notes_ocaml-opam-new-layout.json
···+"summary": "<p>Managing package manager constraints is getting difficult, particularly given the growth of the number of packages in the <a href=\"https://github.com/ocaml/opam-repository\">opam repository</a>. I'm therefore laying out a new mechanism for the OCaml contributors to submit large package sets, such as those from <a href=\"https://janestreet.com\">Jane Street</a>.</p>",+"content": "<p>Managing package manager constraints is getting difficult, particularly given the growth of the number of packages in the <a href=\"https://github.com/ocaml/opam-repository\">opam repository</a>. I'm therefore laying out a new mechanism for the OCaml contributors to submit large package sets, such as those from <a href=\"https://janestreet.com\">Jane Street</a>.</p>",
+18
avsm/notes_ocaml-users-group.json
+18
avsm/notes_ocaml-users-group.json
···+"summary": "<p>I'm at the <a href=\"https://forge.ocamlcore.org/plugins/mediawiki/wiki/ocaml-meeting/index.php/OCamlMeeting2011\">2011 OCaml Users Group</a> in Paris, reporting on some splendid talks this year. It looked like around 60-70 people in the room, and I had the pleasure of meeting users all the way from <a href=\"http://ru.linkedin.com/pub/dmitry-bely/4/955/717\">Russia</a> to <a href=\"http://ashishagarwal.org/about/\">New York</a> as well as all the Europeans!</p>\n<h3><a href=\"https://anil.recoil.org/#js_of_ocaml\"></a>Js_of_ocaml</h3>\n<p>First up was <a href=\"http://www.lsv.ens-cachan.fr/~chambart/\">Pierre Chambart</a> talking about the <a href=\"http://ocsigen.org/js_of_ocaml/\">js_of_ocaml</a> compiler. It compiles OCaml bytecode directly to Javascript, with few external dependencies. Since the bytecode format changes very rarely, it is simpler to maintain than alternatives (such as Jake Donham\u2019s <a href=\"https://github.com/jaked/ocamljs\">ocamljs</a>) that require patching the compiler tool-chain. Javascript objects are mapped to dynamic OCaml objects via a light-weight <code>##</code> operator, so you can simply write code like:</p>\n<pre><code> class type window = object\n   method alert : js_string t -> unit meth\n   method name : js_string t prop\n end\n \n let window : window t =\n   Js.Unsafe.variable "window"\n \n let () =\n   window##alert (window##name);\n   window##name <- Js.string "name"\n</code></pre>\n<p>Overloading is handled similarly to <a href=\"http://pyobjc.sourceforge.net/\">PyObjC</a>, with each parameter combination being mapped into a uniquely named function. <a href=\"https://github.com/raphael-proust\">Raphael Proust</a> then demonstrated a cool game he wrote via <a href=\"https://github.com/raphael-proust/raphael\">bindings</a> to the <a href=\"http://raphaeljs.com/\">Raphael</a> Javascript vector graphics library. 
Performance of <code>js_of_ocaml</code> is good compared to writing it by hand, and they have quite a few <a href=\"http://ocsigen.org/js_of_ocaml/doc/1.0.2/manual/performances\">benchmarks</a> on their website.</p>\n<p>Overall the project looks very usable: the main omissions are Bigarray, dynlink, Str (replaced by native regexps), recursive modules and weak references. None of these missing features seem very critical for the sorts of applications that <code>js_of_ocaml</code> is intended for.</p>\n<h3><a href=\"https://anil.recoil.org/#ocaml-on-a-pic-ocapic\"></a>OCaml on a PIC (OCAPIC)</h3>\n<p>Next up, Phillipe Wang presented something completely different: <a href=\"http://www.algo-prog.info/ocaml_for_pic/web/index.php\">running OCaml on tiny 8-bit PIC microcontrollers</a>! These PICs have 4-128Kb of flash (to store the code), and from 256 <em>bytes</em> to 4 kilobytes of RAM. Not a lot of room to waste there. He demonstrated a game with 24 physical push buttons that beat humans at the JFLA conference.</p>\n<p>It works by translating OCaml bytecode through several stages: <code>ocamlclean</code> to eliminate dead code in the bytecode (which would be very useful for native code too!), a compression step that does run-length encoding, and then translation to PIC assembly. They have a replacement stop-and-copy GC (150 lines of assembly) and a full collection cycle runs in less than 1.5ms. Integers are 15 bits (with 1 bit reserved) and the block representation is the same as native OCaml. Very cool project!</p>\n<h3><a href=\"https://anil.recoil.org/#frama-c\"></a>Frama-C</h3>\n<p>We moved on to static analysis and <a href=\"http://www.linkedin.com/pub/julien-signoles/24/5a9/4b4\">Julien Signoles</a> presented <a href=\"http://frama-c.com/\">Frama-C</a>, a powerful static analysis tool for real-world C. 
It forks the <a href=\"http://www.eecs.berkeley.edu/~necula/cil/\">CIL</a> project from Berkeley and adds <a href=\"http://ocamlgraph.lri.fr/\">ocamlgraph</a> and GUI support. He demonstrated a simple plugin to count loops in C code, and the homepage has many interesting <a href=\"http://frama-c.com/plugins.html\">plugins</a> maintained by the community.</p>\n<p>I hadn\u2019t realised that CIL was still maintained in the face of <a href=\"http://clang.llvm.org/\">clang</a>, so it\u2019s nice to see it live on as part of Frama-C.</p>\n<h3><a href=\"https://anil.recoil.org/#ocsigen\"></a>Ocsigen</h3>\n<p>The ever-cheerful <a href=\"http://www.pps.jussieu.fr/~balat/\">Vincent Balat</a> updated us on the <a href=\"http://ocsigen.org\">Ocsigen</a> web framework, including unveiling their exciting new logo! This was written using an amazing <a href=\"http://ocsigen.org/tutorial/tutorial1\">collaborative editor</a> that lets users edit in real time.</p>\n<p>Ocsigen is based around <em>services</em> of type <code>service: parameters -> page</code>. Services are first-class values, and can be registered dynamically and associated with sessions. The collaborative editor was about 100 lines of code.</p>\n<p>There is a syntax extension to distinguish between client- and server-side code, and both can be written in the same service (invoking <code>js_of_ocaml</code> to compile the client code to Javascript). They have bindings to <a href=\"http://code.google.com/closure/\">Google Closure</a> in order to provide UI support. There is a really nice \u201cbus\u201d service to pass messages between the server and the client, with seamless integration of <a href=\"http://ocsigen.org/lwt\">Lwt</a> to hide the details of communication to the browser.</p>\n<p>Ocsigen is looking like a very mature project at this point, and I\u2019m very keen to integrate it with <a href=\"http://www.openmirage.org\">Mirage</a> to specialise them into micro-kernels. 
A task for the hacking day tomorrow morning I think!</p>\n<h3><a href=\"https://anil.recoil.org/#mirage\"></a>Mirage</h3>\n<p>I talked about <a href=\"http://www.openmirage.org\">Mirage</a>, hurrah! Good questions about why we need a block device (and not just use NFS), and I replied that everything is available as a library and the programmer can choose depending on their needs (the core goal of <a href=\"http://en.wikipedia.org/wiki/Exokernel\">exokernels</a>).</p>\n<p>A highlight for me was lunch where I finally met <a href=\"http://people.redhat.com/~rjones/\">Richard Jones</a>, who is one of the other OCaml and cloud hackers out there. We had a wide-ranging conversation about the cool stuff going on in <a href=\"http://www.linux-kvm.org/page/Main_Page\">KVM</a> and Red Hat in general. Richard also gave a short talk about how they use OCaml to generate hundreds of thousands of lines of code in <a href=\"http://libguestfs.org/\">libguestfs</a>. There are bindings for pretty much every major language, and it is all generated from an executable specification. He notes that \u201cnormal\u201d programmers love the OCaml type safety without explicit annotations, and that it is a really practical language for the working programmer. The <a href=\"http://xen.org\">Xen Cloud Platform</a> also has a similar <a href=\"https://github.com/xen-org/xen-api/blob/master/ocaml/idl/datamodel.ml\">generator</a> for XenAPI bindings, so I definitely agree with him about this!</p>\n<h3><a href=\"https://anil.recoil.org/#ocaml-future\"></a>OCaml Future</h3>\n<p><a href=\"http://pauillac.inria.fr/~xleroy/\">Xavier \u201csuperstar\u201d Leroy</a> then gave an update on OCaml development. Major new features in 3.12.0 are first-class modules, polymorphic recursion, local module opens, and richer operations over module signatures. 
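</p>\n<p>As an illustration (this snippet is mine, not from the talk), two of these features can be combined: a structure packed as a first-class module value, then unpacked and used under a local open:</p>\n<pre><code> module type DEVICE = sig val name : string end\n \n (* a structure packed as a first-class module value *)\n let default : (module DEVICE) =\n   (module struct let name = "loopback" end : DEVICE)\n \n let describe d =\n   let module D = (val d : DEVICE) in\n   let open Printf in  (* 3.12 local open *)\n   sprintf "device: %s" D.name\n</code></pre>\n<p>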
Version 3.12.1 is coming out soon, with bug fixes (in camlp4 and ocamlbuild mainly), and better performance on x86_64: turns out a new <code>mov</code> instruction change improves floating point performance on <code>x86_64</code>.</p>\n<p>OCaml 3.13 has no release date, but several exciting features are in the pipeline. Firstly, more lightweight first-class modules by permitting some annotations to be inferred by the context, and it introduces patterns to match and bind first-class module values. Much more exciting is support for GADTs (Generalised Algebraic Data Types). This permits more type constraints to be enforced at compile time:</p>\n<pre><code> type _ t =\n | IntLit : int -> int t\n | Pair : 'a t * 'b t -> ('a * 'b) t\n | App : ('a -> 'b) t * 'a t -> 'b t\n | Abs : ('a -> 'b) -> ('a -> 'b) t\n \n let rec eval : type s . s t -> s = function\n | IntLit x -> x (* s = int here *)\n | Pair (x,y) -> (eval x, eval y) (* s = 'a * 'b here *)\n | App (f,a) -> (eval f) (eval a)\n | Abs f -> f\n</code></pre>\n<p>In this example of a typed interpreter, the <code>eval</code> function is annotated with a <code>type s . s t -> s</code> type that lets each branch of the pattern match have a constrained type for <code>s</code> depending on the use. This reminded me of Edwin Brady\u2019s <a href=\"http://www.cs.st-andrews.ac.uk/~eb/writings/icfp10.pdf\">partial evaluation</a> work using dependent types, but a much more restricted version suitable for OCaml.</p>\n<p>There are some really interesting uses for GADTs:</p>\n<ul>\n<li>Enforcing invariants in data structures, as with the typed interpreter example above.</li>\n<li>Reflecting types into values means that libraries such as our own <a href=\"http://github.com/mirage/dyntype\">dyntype</a> can be expressed in the core language without lots of camlp4 hacks. 
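As a quick worked example (my own, not from the talk), the interpreter can be exercised directly; the type index on the expression fixes the result type of `eval` at compile time:

```ocaml
(* Self-contained copy of the typed-interpreter GADT from the talk,
   with a small expression evaluated at the end. *)
type _ t =
  | IntLit : int -> int t
  | Pair : 'a t * 'b t -> ('a * 'b) t
  | App : ('a -> 'b) t * 'a t -> 'b t
  | Abs : ('a -> 'b) -> ('a -> 'b) t

let rec eval : type s. s t -> s = function
  | IntLit x -> x                    (* s = int here *)
  | Pair (x, y) -> (eval x, eval y)  (* s = 'a * 'b here *)
  | App (f, a) -> (eval f) (eval a)
  | Abs f -> f

let () =
  (* (fun x -> x + 1) applied to 41; the compiler knows this is an
     [int t], so [eval] returns a plain [int]. *)
  let r = eval (App (Abs (fun x -> x + 1), IntLit 41)) in
  print_int r   (* prints 42 *)
```

Note that an ill-typed term such as `App (IntLit 1, IntLit 2)` is simply rejected by the type checker, which is the invariant-enforcement property that makes GADTs attractive.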
Finally, this should make typed I/O generators for XML, JSON and other network formats much simpler.</li>\n</ul>\n<p>The challenges in the implementation are that principal type inference is now impossible (so some annotation is required), and pattern matching warnings are also trickier.</p>\n<p>From the IDE perspective, the third bit of work is to have the OCaml compiler save the full abstract syntax tree annotated with source locations, scoping information, types (declared and inferred) and additional user-defined annotations. This generalises the <code>-annot</code> flag and can help projects like <a href=\"http://jun.furuse.info/hacks/ocamlspotter\">OCamlSpotter</a>, <a href=\"http://ocamlwizard.lri.fr/\">OCamlWizard</a>, <a href=\"http://www.algo-prog.info/ocaide/\">OcaIDE</a>, etc. It also helps code-generators driven by type-generators (such as our <a href=\"http://github.com/mirage/orm\">SQL ORM</a> or <a href=\"http://oss.wink.com/atdgen/\">ATDgen</a>).</p>\n<p>The OCaml consortium has new members: <a href=\"http://mlstate.com\">MLState</a> and <a href=\"http://mylife.com\">MyLife</a> have joined, and <a href=\"http://www.esterel-technologies.com/\">Esterel</a>, <a href=\"http://www.ocamlpro.com\">OCamlPro</a> and one unnamed new member are joining. The consortium goals are to sell permissive licensing (BSD) to members, and to sound out new features with the serious users. Three companies are now doing commercial development (Gerd, OCamlCore, OCamlPro) which is growing the community nicely.</p>\n<h3><a href=\"https://anil.recoil.org/#jocaml\"></a>JoCaml</h3>\n<p><a href=\"http://pauillac.inria.fr/~maranget/\">Luc Maranget</a> (who looks like an archetypal mad professor!) gave a great rundown on <a href=\"http://jocaml.inria.fr/\">JoCaml</a>, a distributed programming extension to OCaml. 
This extends the compiler with join-definitions (a compiler patch), and a small bit of runtime support (using Thread), and significant extensions for concurrent and distributed programming in a type-safe way.</p>\n<p>It extends the syntax with three new keywords: <code>def</code>, <code>spawn</code> and <code>reply</code>, and new usage for <code>or</code> and <code>&</code> (you should be using <code>||</code> and <code>&&</code> anyway). Binary libraries remain compatible between matching versions of JoCaml and OCaml. An example of JoCaml code is:</p>\n<pre><code> let create n =\n def st(rem) & tick() = st(rem-1)\n or st(0) & wait() = reply to wait in\n spawn st(n) ; { tick=tick; wait=wait; }\n \n type t = {\n tick: unit Join.chan;\n wait: unit -> unit;\n }\n</code></pre>\n<p>After <code>n</code> messages to <code>tick</code>, the <code>wait</code> barrier function will be called.</p>\n<pre><code> let c = create n\n let () =\n for k = 0 to 9 do\n spawn begin printf "%i" k; c.tick () end\n done;\n c.wait ()\n</code></pre>\n<p>Here we asynchronously print the numbers <code>0</code> to <code>9</code>, and then the <code>wait</code> call acts as a barrier until it finishes. JoCaml is useful for distributed fork-join parallelism tasks such as raytracing, but with the type system support of OCaml. It is a bit like MapReduce, but without the data partitioning support of Hadoop (and is more light-weight). It would be quite interesting to combine some of the JoCaml extensions with the dynamic dataflow graphs in our own <a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/\">CIEL</a> distributed execution engine.</p>\n<h3><a href=\"https://anil.recoil.org/#forgetful-memoisation-in-ocaml\"></a>Forgetful Memoisation in OCaml</h3>\n<p><a href=\"http://www.lri.fr/~bobot/\">Francois Bobot</a> talked about the problem of memoizing values so that they can be re-used (e.g. in a cache). 
Consider a standard memoiser:</p>\n<pre><code> let memo_f =\n let cache = H.create () in\n fun k ->\n try H.find cache k\n with Not_found ->\n let v = f k in\n H.add cache k v;\n v\n \n let v1 = memo_f k1\n let v2 = memo_f k2 (* k2 = k1 in O(1) *)\n</code></pre>\n<p>If a key is no longer reachable from anywhere other than the cache, we want to eliminate it from the cache as well. The first solution is a normal hashtable, but this results in an obvious memory leak since a key held in the cache marks it as reachable. A better solution is using OCaml <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/libref/Weak.html\">weak pointers</a> that permit references to values without holding on to them (see <a href=\"http://www.pps.jussieu.fr/~li/software/weaktbl/doc/html/Weaktbl.html\">Weaktbl</a> by <a href=\"http://www.pps.jussieu.fr/~li/\">Zheng Li</a> who is now an OCaml hacker at Citrix). The problem with Weaktbl is that the value may point back to the key, forming a cycle which will never be reclaimed.</p>\n<p>Francois solves this by using <a href=\"http://en.wikipedia.org/wiki/Ephemeron\">Ephemerons</a> from Smalltalk. They use the rule that the value can be reclaimed if the key or the ephemeron itself can be reclaimed by the GC, and have a signature like:</p>\n<pre><code> module Ephemeron : sig type ('a,'b) t\n val create : 'a -> 'b -> ('a,'b) t\n val check : ('a,'b) t -> bool\n val get : ('a,'b) t -> 'b option\n val get_key : ('a,'b) t -> 'a option\n end\n</code></pre>\n<p>The implementation in OCaml patches the runtime to use a new tag for ephemerons, and the performance graphs in his <a href=\"https://forge.ocamlcore.org/docman/view.php/77/134/memoization2011.pdf\">slides</a> look good. This is an interesting topic for me since we need efficient memoisation in Mirage I/O (see the effects on DNS performance in the <a href=\"https://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Eurosys paper</a> which used Weaktbl). 
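Ephemerons did eventually land in the OCaml standard library (4.03, as the `Ephemeron` module), so a forgetful memoiser can now be sketched against stock OCaml. This is my own illustration using the stdlib `Ephemeron.K1.Make`, not Francois's original patched runtime, and the cached function is a stand-in:

```ocaml
(* A forgetful memoiser: the cache holds its keys weakly via an
   ephemeron-keyed hash table (stdlib, OCaml >= 4.03), so an entry can
   be collected once its key is unreachable from elsewhere. *)
module Cache = Ephemeron.K1.Make (struct
  type t = string
  let equal = String.equal
  let hash = Hashtbl.hash
end)

let calls = ref 0

(* Stand-in for an expensive [f]; counts how often it really runs. *)
let f k = incr calls; String.length k

let memo_f : string -> int =
  let cache = Cache.create 16 in
  fun k ->
    try Cache.find cache k
    with Not_found ->
      let v = f k in
      Cache.add cache k v;
      v

let () =
  let a = memo_f "ocaml" in
  let b = memo_f "ocaml" in   (* second call is served from the cache *)
  Printf.printf "%d %d %d\n" a b !calls   (* prints "5 5 1" *)
```

The structure mirrors the standard memoiser above; only the table type changes, and the key-to-value cycle problem disappears because the ephemeron rule lets the GC reclaim the pair together.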
When asked if the OCaml patch will be upstreamed, <a href=\"http://gallium.inria.fr/~doligez/\">Damien Doligez</a> did not like the worst-case complexity of long chains of ephemerons in the GC, and there are several approaches under consideration to alleviate this without too many changes to the runtime, but Francois believes the current complexity is not too bad in practise.</p>\n<h3><a href=\"https://anil.recoil.org/#oasis-and-website\"></a>Oasis and website</h3>\n<p><a href=\"http://sylvain.le-gall.net/\">Sylvain</a> came on stage later to give a demonstration of <a href=\"http://oasis.forge.ocamlcore.org/oasis-db.html\">OASIS</a>, an equivalent of <a href=\"http://www.haskell.org/cabal/\">Cabal</a> for Haskell or <a href=\"http://www.cpan.org/\">CPAN</a> for Perl. It works with a small <code>_oasis</code> file that describes the project, and then the OASIS tool auto-generates <code>ocamlbuild</code> files from it (this reminds me of Perl\u2019s <a href=\"http://perldoc.perl.org/ExtUtils/MakeMaker.html\">MakeMaker</a>). Once the files are auto-generated, it is self-contained and there is no further dependency on OASIS itself.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"How many OCaml hackers does it take to change a lightbulb?\" src=\"https://anil.recoil.org/images/ocaml-users-1.webp\" title=\"How many OCaml hackers does it take to change a lightbulb?\">\nHow many OCaml hackers does it take to change a lightbulb?\n\n<img alt=\"Wearing bibs at French Teppinyaki\" src=\"https://anil.recoil.org/images/ocaml-users-3.webp\" title=\"Wearing bibs at French Teppinyaki\">\nWearing bibs at French Teppinyaki\n\n<img alt=\"Team Mirage cheeses it up\" src=\"https://anil.recoil.org/images/ocaml-users-2.webp\" title=\"Team Mirage cheeses it up\">\nTeam Mirage cheeses it up</li>\n</ul>\n<p>OASIS works with either an existing build system in a project, or can be integrated more closely with <code>ocamlbuild</code> by advanced users. 
Lots of projects are already using OASIS (from Cryptokit to Lwt to the huge <a href=\"http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=641\">Jane Street Core</a>). He is also working on a distribution mechanism on a central website, which should make for convenient OCaml packaging when it is finished and gets more adoption from the community.</p>\n<p>Finally, <a href=\"http://ashishagarwal.org/\">Ashish Agarwal</a> led a discussion on how OCaml can improve its web presence for beginners. Lots of good ideas here (some of which we implemented when reworking the <a href=\"http://cufp.org\">CUFP</a> website last year). Looking forward to seeing what happens next year in this space! I really enjoyed the day; the quality of talks was very high, and many engaging discussions from all involved!</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/sf-ocaml.webp\" title=\"\">\n</p>\n<p>Of course, not all of the OCaml community action is in France. The ever-social <a href=\"http://www.twitter.com/jakedonham\">Jake Donham</a> organised the First Ever San Francisco User Group that I attended when I was over there a few weeks ago. Ok, admittedly it was mainly French people there too, but it was excellent to meet up with <a href=\"http://www.linkedin.com/pub/mika-illouz/0/a02/7b4\">Mika</a>, <a href=\"http://martin.jambon.free.fr/\">Martin</a>, <a href=\"http://www.linkedin.com/pub/julien-verlaguet/20/10a/b57\">Julien</a>, <a href=\"http://fr.linkedin.com/in/henribinsztok\">Henri</a> and of course Jake when over there.</p>\n<p>We should definitely have more of these fun local meetups, and a number of other OCaml hackers I mentioned it to want to attend next time in the Bay Area, if only to cry into their drinks about the state of multi-core... 
<em>just kidding</em>, <a href=\"http://www.ocamlpro.com\">OCamlPro</a> is hard at work fixing that after all :-)</p>",+"content": "<p>I'm at the <a href=\"https://forge.ocamlcore.org/plugins/mediawiki/wiki/ocaml-meeting/index.php/OCamlMeeting2011\">2011 OCaml Users Group</a> in Paris, reporting on some splendid talks this year. It looked like around 60-70 people in the room, and I had the pleasure of meeting users all the way from <a href=\"http://ru.linkedin.com/pub/dmitry-bely/4/955/717\">Russia</a> to <a href=\"http://ashishagarwal.org/about/\">New York</a> as well as all the Europeans!</p>\n<h3><a href=\"https://anil.recoil.org/#js_of_ocaml\"></a>Js_of_ocaml</h3>\n<p>First up was <a href=\"http://www.lsv.ens-cachan.fr/~chambart/\">Pierre Chambart</a> talking about the <a href=\"http://ocsigen.org/js_of_ocaml/\">js_of_ocaml</a> compiler. It compiles OCaml bytecode directly to Javascript, with few external dependencies. Since the bytecode format changes very rarely, it is simpler to maintain than alternatives (such as Jake Donham\u2019s <a href=\"https://github.com/jaked/ocamljs\">ocamljs</a>) that require patching the compiler tool-chain. Javascript objects are mapped to dynamic OCaml objects via a light-weight <code>##</code> operator, so you can simply write code like:</p>\n<pre><code> class type window = object\n method alert : js_string t -> unit meth\n method name : js_string t prop\n end\n let window : window t =\n Js.Unsafe.variable "window"\n \n let () =\n window##alert (window##name);\n window##name <- Js.string "name"\n</code></pre>\n<p>Overloading is handled similarly to <a href=\"http://pyobjc.sourceforge.net/\">PyObjC</a>, with each parameter combination being mapped into a uniquely named function. 
<a href=\"https://github.com/raphael-proust\">Raphael Proust</a> then demonstrated a cool game he wrote using via <a href=\"https://github.com/raphael-proust/raphael\">bindings</a> to the <a href=\"http://raphaeljs.com/\">Raphael</a> Javascript vector graphics library. Performance of <code>js_of_ocaml</code> is good compared to writing it by hand, and they have have quite a few <a href=\"http://ocsigen.org/js_of_ocaml/doc/1.0.2/manual/performances\">benchmarks</a> on their website.</p>\n<p>Overall the project looks very usable: the main omissions are Bigarray, no dynlink, no Str (replaced by native regexps), no recursive modules or weak references. None of these missing features seem very critical for the sorts of applications that <code>js_of_ocaml</code> is intended for.</p>\n<h3><a href=\"https://anil.recoil.org/#ocaml-on-a-pic-ocapic\"></a>OCaml on a PIC (OCAPIC)</h3>\n<p>Next up Phillipe Wang presented something completely different: <a href=\"http://www.algo-prog.info/ocaml_for_pic/web/index.php\">running OCaml on tiny 8-bit PIC microcontrollers</a>! These PICs have 4-128Kb of flash (to store the code), and from 256 <em>bytes</em> to 4 kilobytes. Not a lot of room to waste there. He demonstrated an example with a game with 24 physical push buttons that beat humans at a conference (JFLA).</p>\n<p>It works by translating OCaml bytecode through several stages: <code>ocamlclean</code> to eliminate dead code in the bytecode (which would be very useful for native code too!), a compression step that does run-length encoding, and then translation to PIC assembly. They have a replacement stop-and-copy GC (150 lines of assembly) and a full collection cycle runs in less than 1.5ms. Integers are 15-bits (with 1 bit reserved) and the block representation is the same as native OCaml. 
Very cool project!</p>\n<h3><a href=\"https://anil.recoil.org/#frama-c\"></a>Frama-C</h3>\n<p>We moved on to static analysis and <a href=\"http://www.linkedin.com/pub/julien-signoles/24/5a9/4b4\">Julien Signoles</a> presented <a href=\"http://frama-c.com/\">Frama-C</a>, a powerful static analysis tool for real-world C. It forks the <a href=\"http://www.eecs.berkeley.edu/~necula/cil/\">CIL</a> project from Berkeley and adds <a href=\"http://ocamlgraph.lri.fr/\">ocamlgraph</a> and GUI support. He demonstrated a simple plugin that counts loops in C code, and the homepage has many interesting <a href=\"http://frama-c.com/plugins.html\">plugins</a> maintained by the community.</p>\n<p>I hadn\u2019t realised that CIL was still maintained in the face of <a href=\"http://clang.llvm.org/\">clang</a>, so it\u2019s nice to see it live on as part of Frama-C.</p>\n<h3><a href=\"https://anil.recoil.org/#ocsigen\"></a>Ocsigen</h3>\n<p>The ever-cheerful <a href=\"http://www.pps.jussieu.fr/~balat/\">Vincent Balat</a> updated us about the <a href=\"http://ocsigen.org\">Ocsigen</a> web framework, including unveiling their exciting new logo! This was written using an amazing <a href=\"http://ocsigen.org/tutorial/tutorial1\">collaborative editor</a> that lets users edit in real time.</p>\n<p>Ocsigen is based around <em>services</em> of type <code>service: parameters -> page</code>. Services are first-class values, and can be registered dynamically and associated with sessions. The code for the collaborative editor was about 100 lines of code.</p>\n<p>There is a syntax extension to distinguish between client and server side code, and both can be written in the same service (invoking <code>js_of_ocaml</code> to compile the client code to Javascript). They have bindings to <a href=\"http://code.google.com/closure/\">Google Closure</a> in order to provide UI support. 
There is a really nice \u201cbus\u201d service to pass messages between the server and the client, with seamless integration of <a href=\"http://ocsigen.org/lwt\">Lwt</a> to hide the details of communication with the browser.</p>\n<p>Ocsigen is looking like a very mature project at this point, and I\u2019m very keen to integrate it with <a href=\"http://www.openmirage.org\">Mirage</a> to specialise it into micro-kernels. A task for the hacking day tomorrow morning I think!</p>\n<h3><a href=\"https://anil.recoil.org/#mirage\"></a>Mirage</h3>\n<p>I talked about <a href=\"http://www.openmirage.org\">Mirage</a>, hurrah! Good questions about why we need a block device (and not just use NFS), and I replied that everything is available as a library and the programmer can choose depending on their needs (the core goal of <a href=\"http://en.wikipedia.org/wiki/Exokernel\">exokernels</a>).</p>\n<p>A highlight for me was lunch where I finally met <a href=\"http://people.redhat.com/~rjones/\">Richard Jones</a>, who is one of the other OCaml and cloud hackers out there. Wide-ranging conversation about the cool stuff going on in <a href=\"http://www.linux-kvm.org/page/Main_Page\">KVM</a> and Red Hat in general. Richard also gave a short talk about how they use OCaml to generate hundreds of thousands of lines of code in <a href=\"http://libguestfs.org/\">libguestfs</a>. There are bindings for pretty much every major language, and it is all generated from an executable specification. He notes that \u201cnormal\u201d programmers love the OCaml type safety without explicit annotations, and that it is a really practical language for the working programmer. 
The <a href=\"http://xen.org\">Xen Cloud Platform</a> also has a similar <a href=\"https://github.com/xen-org/xen-api/blob/master/ocaml/idl/datamodel.ml\">generator</a> for XenAPI bindings, so I definitely agree with him about this!</p>\n<h3><a href=\"https://anil.recoil.org/#ocaml-future\"></a>OCaml Future</h3>\n<p><a href=\"http://pauillac.inria.fr/~xleroy/\">Xavier \u201csuperstar\u201d Leroy</a> then gave an update of OCaml development. Major new features in 3.12.0 are first-class modules, polymorphic recursion, local module opens, and richer operations over module signatures. Version 3.12.1 is coming out soon, with bug fixes (in camlp4 and ocamlbuild mainly), and better performance on x86_64: turns out a new <code>mov</code> instruction change improves floating point performance on <code>x86_64</code>.</p>\n<p>OCaml 3.13 has no release date, but several exciting features are in the pipeline. Firstly, more lightweight first-class modules by permitting some annotations to be inferred by the context, and it introduces patterns to match and bind first-class module values. Much more exciting is support for GADTs (Generalised Algebraic Data Types). This permits more type constraints to be enforced at compile time:</p>\n<pre><code> type _ t =\n | IntLit : int -> int t\n | Pair : 'a t * 'b t -> ('a * 'b) t\n | App : ('a -> 'b) t * 'a t -> 'b t\n | Abs : ('a -> 'b) -> ('a -> 'b) t\n \n let rec eval : type s . s t -> s = function\n | IntLit x -> x (* s = int here *)\n | Pair (x,y) -> (eval x, eval y) (* s = 'a * 'b here *)\n | App (f,a) -> (eval f) (eval a)\n | Abs f -> f\n</code></pre>\n<p>In this example of a typed interpreter, the <code>eval</code> function is annotated with a <code>type s . s t -> s</code> type that lets each branch of the pattern match have a constrained type for <code>s</code> depending on the use. 
This reminded me of Edwin Brady\u2019s <a href=\"http://www.cs.st-andrews.ac.uk/~eb/writings/icfp10.pdf\">partial evaluation</a> work using dependent types, but a much more restricted version suitable for OCaml.</p>\n<p>There are some really interesting uses for GADTs:</p>\n<ul>\n<li>Enforcing invariants in data structures, as with the typed interpreter example above.</li>\n<li>Reflecting types into values means that libraries such as our own <a href=\"http://github.com/mirage/dyntype\">dyntype</a> can be expressed in the core language without lots of camlp4 hacks. Finally, this should make typed I/O generators for XML, JSON and other network formats much simpler.</li>\n</ul>\n<p>The challenges in the implementation are that principal type inference is now impossible (so some annotation is required), and pattern matching warnings are also trickier.</p>\n<p>From the IDE perspective, the third bit of work is to have the OCaml compiler save the full abstract syntax tree annotated with source locations, scoping information, types (declared and inferred) and additional user-defined annotations. This generalises the <code>-annot</code> flag and can help projects like <a href=\"http://jun.furuse.info/hacks/ocamlspotter\">OCamlSpotter</a>, <a href=\"http://ocamlwizard.lri.fr/\">OCamlWizard</a>, <a href=\"http://www.algo-prog.info/ocaide/\">OcaIDE</a>, etc. It also helps code-generators driven by type-generators (such as our <a href=\"http://github.com/mirage/orm\">SQL ORM</a> or <a href=\"http://oss.wink.com/atdgen/\">ATDgen</a>).</p>\n<p>The OCaml consortium has new members: <a href=\"http://mlstate.com\">MLState</a> and <a href=\"http://mylife.com\">MyLife</a> have joined, and <a href=\"http://www.esterel-technologies.com/\">Esterel</a>, <a href=\"http://www.ocamlpro.com\">OCamlPro</a> and one unnamed new member are joining. The consortium goals are to sell permissive licensing (BSD) to members, and to sound out new features with the serious users. 
Three companies are now doing commercial development (Gerd, OCamlCore, OCamlPro) which is growing the community nicely.</p>\n<h3><a href=\"https://anil.recoil.org/#jocaml\"></a>JoCaml</h3>\n<p><a href=\"http://pauillac.inria.fr/~maranget/\">Luc Maranget</a> (who looks like an archetypal mad professor!) gave a great rundown on <a href=\"http://jocaml.inria.fr/\">JoCaml</a>, a distributed programming extension to OCaml. This extends the compiler with join-definitions (a compiler patch), and a small bit of runtime support (using Thread), and significant extensions for concurrent and distributed programming in a type-safe way.</p>\n<p>It extends the syntax with three new keywords: <code>def</code>, <code>spawn</code> and <code>reply</code>, and new usage for <code>or</code> and <code>&</code> (you should be using <code>||</code> and <code>&&</code> anyway). Binary libraries remain compatible between matching versions of JoCaml and OCaml. An example of JoCaml code is:</p>\n<pre><code> let create n =\n def st(rem) & tick() = st(rem-1)\n or st(0) & wait() = reply to wait in\n spawn st(n) ; { tick=tick; wait=wait; }\n \n type t = {\n tick: unit Join.chan;\n wait: unit -> unit;\n }\n</code></pre>\n<p>After <code>n</code> messages to <code>tick</code>, the <code>wait</code> barrier function will be called.</p>\n<pre><code> let c = create n\n let () =\n for k = 0 to 9 do\n spawn begin printf "%i" k; c.tick () end\n done;\n c.wait ()\n</code></pre>\n<p>Here we asynchronously print the numbers <code>0</code> to <code>9</code>, and then the <code>wait</code> call acts as a barrier until it finishes. JoCaml is useful for distributed fork-join parallelism tasks such as raytracing, but with the type system support of OCaml. It is a bit like MapReduce, but without the data partitioning support of Hadoop (and is more light-weight). 
It would be quite interesting to combine some of the JoCaml extensions with the dynamic dataflow graphs in our own <a href=\"http://www.cl.cam.ac.uk/research/srg/netos/ciel/\">CIEL</a> distributed execution engine.</p>\n<h3><a href=\"https://anil.recoil.org/#forgetful-memoisation-in-ocaml\"></a>Forgetful Memoisation in OCaml</h3>\n<p><a href=\"http://www.lri.fr/~bobot/\">Francois Bobot</a> talked about the problem of memoizing values so that they can be re-used (e.g. in a cache). Consider a standard memoiser:</p>\n<pre><code> let memo_f =\n let cache = H.create () in\n fun k ->\n try H.find cache k\n with Not_found ->\n let v = f k in\n H.add cache k v;\n v\n \n let v1 = memo_f k1\n let v2 = memo_f k2 (* k2 = k1 in O(1) *)\n</code></pre>\n<p>If a key is no longer reachable from anywhere other than the cache, we want to eliminate it from the cache as well. The first solution is a normal hashtable, but this results in an obvious memory leak since a key held in the cache marks it as reachable. A better solution is using OCaml <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/libref/Weak.html\">weak pointers</a> that permit references to values without holding on to them (see <a href=\"http://www.pps.jussieu.fr/~li/software/weaktbl/doc/html/Weaktbl.html\">Weaktbl</a> by <a href=\"http://www.pps.jussieu.fr/~li/\">Zheng Li</a> who is now an OCaml hacker at Citrix). The problem with Weaktbl is that the value may point back to the key, forming a cycle which will never be reclaimed.</p>\n<p>Francois solves this by using <a href=\"http://en.wikipedia.org/wiki/Ephemeron\">Ephemerons</a> from Smalltalk. 
They use the rule that the value can be reclaimed if the key or the ephemeron itself can be reclaimed by the GC, and have a signature like:</p>\n<pre><code> module Ephemeron : sig type ('a,'b) t\n val create : 'a -> 'b -> ('a,'b) t\n val check : ('a,'b) t -> bool\n val get : ('a,'b) t -> 'b option\n val get_key : ('a,'b) t -> 'a option\n end\n</code></pre>\n<p>The implementation in OCaml patches the runtime to use a new tag for ephemerons, and the performance graphs in his <a href=\"https://forge.ocamlcore.org/docman/view.php/77/134/memoization2011.pdf\">slides</a> look good. This is an interesting topic for me since we need efficient memoisation in Mirage I/O (see the effects on DNS performance in the <a href=\"https://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Eurosys paper</a> which used Weaktbl). When asked if the OCaml patch will be upstreamed, <a href=\"http://gallium.inria.fr/~doligez/\">Damien Doligez</a> did not like the worst-case complexity of long chains of ephemerons in the GC, and there are several approaches under consideration to alleviate this without too many changes to the runtime, but Francois believes the current complexity is not too bad in practise.</p>\n<h3><a href=\"https://anil.recoil.org/#oasis-and-website\"></a>Oasis and website</h3>\n<p><a href=\"http://sylvain.le-gall.net/\">Sylvain</a> came on stage later to give a demonstration of <a href=\"http://oasis.forge.ocamlcore.org/oasis-db.html\">OASIS</a>, an equivalent of <a href=\"http://www.haskell.org/cabal/\">Cabal</a> for Haskell or <a href=\"http://www.cpan.org/\">CPAN</a> for Perl. It works with a small <code>_oasis</code> file that describes the project, and then the OASIS tool auto-generates <code>ocamlbuild</code> files from it (this reminds me of Perl\u2019s <a href=\"http://perldoc.perl.org/ExtUtils/MakeMaker.html\">MakeMaker</a>). 
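For flavour, a small `_oasis` file might look like the following. This is a sketch from memory rather than from the talk: the package name, paths and dependency are invented, and the precise field set is documented in the OASIS manual.

```text
OASISFormat: 0.2
Name:        mylib
Version:     0.1.0
Synopsis:    An example library
Authors:     A. Hacker
License:     BSD3

Library mylib
  Path:         lib
  Modules:      Mylib
  BuildDepends: lwt
```

From a description of this shape, the tool generates the `setup.ml` and `ocamlbuild` glue, after which the project builds with no dependency on OASIS itself.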
Once the files are auto-generated, it is self-contained and there is no further dependency on OASIS itself.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"How many OCaml hackers does it take to change a lightbulb?\" src=\"https://anil.recoil.org/images/ocaml-users-1.webp\" title=\"How many OCaml hackers does it take to change a lightbulb?\">\nHow many OCaml hackers does it take to change a lightbulb?\n\n<img alt=\"Wearing bibs at French Teppinyaki\" src=\"https://anil.recoil.org/images/ocaml-users-3.webp\" title=\"Wearing bibs at French Teppinyaki\">\nWearing bibs at French Teppinyaki\n\n<img alt=\"Team Mirage cheeses it up\" src=\"https://anil.recoil.org/images/ocaml-users-2.webp\" title=\"Team Mirage cheeses it up\">\nTeam Mirage cheeses it up</li>\n</ul>\n<p>OASIS works with either an existing build system in a project, or can be integrated more closely with <code>ocamlbuild</code> by advanced users. Lots of projects are already using OASIS (from Cryptokit to Lwt to the huge <a href=\"http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=641\">Jane Street Core</a>). He is also working on a distribution mechanism on a central website, which should make for convenient OCaml packaging when it is finished and gets more adoption from the community.</p>\n<p>Finally, <a href=\"http://ashishagarwal.org/\">Ashish Agarwal</a> led a discussion on how OCaml can improve its web presence for beginners. Lots of good ideas here (some of which we implemented when reworking the <a href=\"http://cufp.org\">CUFP</a> website last year). Looking forward to seeing what happens next year in this space! I really enjoyed the day; the quality of talks was very high, and many engaging discussions from all involved!</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/sf-ocaml.webp\" title=\"\">\n</p>\n<p>Of course, not all of the OCaml community action is in France. 
The ever-social <a href=\"http://www.twitter.com/jakedonham\">Jake Donham</a> organised the First Ever San Francisco User Group that I attended when I was over there a few weeks ago. Ok, admittedly it was mainly French people there too, but it was excellent to meet up with <a href=\"http://www.linkedin.com/pub/mika-illouz/0/a02/7b4\">Mika</a>, <a href=\"http://martin.jambon.free.fr/\">Martin</a>, <a href=\"http://www.linkedin.com/pub/julien-verlaguet/20/10a/b57\">Julien</a>, <a href=\"http://fr.linkedin.com/in/henribinsztok\">Henri</a> and of course Jake when over there.</p>\n<p>We should definitely have more of these fun local meetups, and a number of other OCaml hackers I mentioned it to want to attend next time in the Bay Area, if only to cry into their drinks about the state of multi-core... <em>just kidding</em>, <a href=\"http://www.ocamlpro.com\">OCamlPro</a> is hard at work fixing that after all :-)</p>",
+18
avsm/notes_ocamllabs-2014-review.json
···+"summary": "<p>The <a href=\"https://anil.recoil.org/projects/ocamllabs\">OCaml Labs</a> initiative within the <a href=\"http://www.cl.cam.ac.uk\">Cambridge\nComputer Laboratory</a> is now just over two years\nold, and it is time for an update about our activities since the last\nupdate at the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html#Dec%202013\">end of\n2013</a>\nand\n<a href=\"https://anil.recoil.org/2012/10/19/announcing-ocaml-labs.html\">2012</a>.</p>\n<p>The theme of our group was not to be pure research, but rather a hybrid\ngroup that takes on some of the load of day-to-day OCaml maintenance\nfrom <a href=\"http://caml.inria.fr/\">INRIA</a>, as well as help grow the wider\ncommunity and meet our own research agendas around topics such as\n<a href=\"https://queue.acm.org/detail.cfm?id=2566628\">unikernels</a>. To this end,\nall of our projects have been highly collaborative, often involving\ncolleagues from <a href=\"http://ocamlpro.com\">OCamlPro</a>,\n<a href=\"http://caml.inria.fr/\">INRIA</a>, <a href=\"http://janestreet.com\">Jane Street</a>,\n<a href=\"http://lexifi.com\">Lexifi</a> and <a href=\"http://citrix.com\">Citrix</a>.</p>\n<p>This post covers our progress in tooling, the compiler and language,\ncommunity efforts, research projects and concludes with our priorities\nfor 2015.</p>\n<h2><a href=\"https://anil.recoil.org/#r-tooling\"></a>\n<img alt=\"OCaml: it&apos;s a dog&apos;s life. In this case, Toru the dog.\" src=\"https://anil.recoil.org/images/toru-cucl-window.webp\" title=\"OCaml: it&apos;s a dog&apos;s life. In this case, Toru the dog.\">\nOCaml: it's a dog's life. 
In this case, Toru the dog.\nTooling</h2>\n<p>At the start of 2014, we had just helped to release <a href=\"http://opam.ocaml.org/blog/opam-1-1-1-released/\">OPAM\n1.1.1</a> with our\ncolleagues at <a href=\"http://ocamlpro.com\">OCamlPro</a>, and serious OCaml users\nhad just started moving over to using it.</p>\n<p>Our overall goal at OCaml Labs is to deliver a modular set of\ndevelopment tools around OCaml that we dub the <em>OCaml Platform</em>. The\nremainder of 2014 was thus spent polishing this nascent OPAM release\ninto a solid base (both as a command-line tool and as a library) that we\ncould use as the basis for documentation, testing and build\ninfrastructure, all the while making sure that bigger OCaml projects\ncontinued to migrate over to it. Things have been busy; here are the\nhighlights of this effort.</p>\n<h3><a href=\"https://anil.recoil.org/#opam\"></a>OPAM</h3>\n<p>The central <a href=\"https://github.com/ocaml/opam-repository\">OPAM repository</a>\nthat contains the package descriptions has grown tremendously in 2014,\nwith over 280 contributors committing almost 10000 changesets across\n3800 <a href=\"https://github.com/ocaml/opam-repository/pulls\">pull requests</a> on\nGitHub. The front line of incoming testing has been continuous\nintegration by the wonderful <a href=\"http://travis-ci.org/ocaml/opam-repository\">Travis\nCI</a>, who also granted us\naccess to their experimental <a href=\"http://docs.travis-ci.com/user/osx-ci-environment/\">MacOS\nX</a> build pool. 
The\nOPAM package team also expanded to give David Sheets, Jeremy Yallop,\nPeter Zotov and Damien Doligez commit rights, and they have all been\nbusily triaging new packages as they come in.</p>\n<p>Several large projects such as <a href=\"http://xapi-project.github.io/\">Xapi</a>,\n<a href=\"http://ocsigen.org\">Ocsigen</a> and our own\n<a href=\"http://openmirage.org\">MirageOS</a> switched over to using OPAM for\nday-to-day development, as well as prolific individual developers such\nas <a href=\"http://erratique.ch\">Daniel Buenzli</a> and <a href=\"http://ocaml.info/\">Markus\nMottl</a>. <a href=\"https://blogs.janestreet.com/category/ocaml/\">Jane\nStreet</a> continued to send\nregular <a href=\"https://github.com/ocaml/opam-repository/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Adiml+\">monthly\nupdates</a>\nof their Core/Async suite, and releases appeared from the\n<a href=\"https://github.com/ocaml/opam-repository/pull/3570\">Facebook</a>\nopen-source team as well (who develop\n<a href=\"https://code.facebook.com/posts/264544830379293/hack-a-new-programming-language-for-hhvm/\">Hack</a>,\n<a href=\"https://github.com/facebook/flow\">Flow</a> and\n<a href=\"https://github.com/facebook/pfff\">Pfff</a> in OCaml).</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Number of unique contributors to the central OPAM package repository\" src=\"https://anil.recoil.org/images/opam12-contributors-mar14.webp\" title=\"Number of unique contributors to the central OPAM package repository\">\nNumber of unique contributors to the central OPAM package repository\n\n<img alt=\"Total number of unique packages (including multiple versions of the same package)\" src=\"https://anil.recoil.org/images/opam12-packages-mar14.webp\" title=\"Total number of unique packages (including multiple versions of the same package)\">\nTotal number of unique packages (including multiple versions of the same package)\n\n<img alt=\"Total packages with multiple versions coalesced so you can see new package 
growth\" src=\"https://anil.recoil.org/images/opam12-unique-packages-mar14.webp\" title=\"Total packages with multiple versions coalesced so you can see new package growth\">\nTotal packages with multiple versions coalesced so you can see new package growth</li>\n</ul>\n<p>We used feedback from the users to smooth away many of the rough edges,\nwith:</p>\n<ul>\n<li>a redesigned <a href=\"http://opam.ocaml.org/blog/opam-1-2-pin/\">development workflow</a> that lets developers quickly grab a development version of a library recompile all dependent packages automatically, and quickly publish results to GitHub.</li>\n<li>binary distributions for common OS distributions via their <a href=\"https://github.com/ocaml/opam/wiki/Distributions\">native packaging</a>, as well as <a href=\"http://opam.ocaml.org/blog/0install-intro/\">0install</a> and <a href=\"https://github.com/mirage/mirage-vagrant-vms\">Vagrant boxes</a>.</li>\n<li>a unified way of cloning the source of any package via <code>opam source</code>. This handles any supported OPAM archive, including Git, Mercurial or Darcs remotes.</li>\n<li>a richer package metadata, including source code, development archives and bug report URLs.</li>\n</ul>\n<p>These changes were all incorporated into the <a href=\"http://opam.ocaml.org/blog/opam-1-2-0-release/\">OPAM 1.2</a>, along with backwards compatibility shims to keep the old 1.1 metadata format working until the migration is complete. The 1.2.x series has been a solid and usable development manager, and last week\u2019s release of <a href=\"http://opam.ocaml.org/blog/opam-1-2-1-release/\">OPAM 1.2.1</a> has further polished the core scripting engine.</p>\n<h4><a href=\"https://anil.recoil.org/#platform-blog\"></a>Platform Blog</h4>\n<p>One of the more notable developments during 2014 was the <a href=\"http://coq-blog.clarus.me/use-opam-for-coq.html\">adoption of\nOPAM</a> further up the\necosystem by the <a href=\"https://coq.inria.fr/\">Coq</a> theorem prover. 
This\nbroadening of the community prompted us to create an <a href=\"http://opam.ocaml.org\">official OPAM\nblog</a> to give us a central place for news and\ntips, and we\u2019ve had posts about\n<a href=\"http://opam.ocaml.org/blog/opam-in-xenserver/\">XenServer</a> developments,\nthe <a href=\"http://opam.ocaml.org/blog/turn-your-editor-into-an-ocaml-ide/\">Merlin IDE\ntool</a>\nand the modern <a href=\"http://opam.ocaml.org/blog/about-utop/\">UTop</a>\ninteractive REPL. If you are using OPAM in an interesting or production\ncapacity, please do <a href=\"https://github.com/ocaml/platform-blog/issues\">get in\ntouch</a> so that we can\nwork with you to write about it for the wider community.</p>\n<p>The goal of the blog is also to start bringing together the various\ncomponents that form the OCaml Platform. These are designed to be\nmodular tools (so that you can pick and choose which ones are necessary\nfor your particular use of OCaml). There are more details available from\nthe OCaml Workshop presentation at ICFP 2014\n(<a href=\"https://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">abstract</a>,\n<a href=\"https://ocaml.org/meetings/ocaml/2014/ocl-platform-2014-slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=jxhtpQ5nJHg&list=UUP9g4dLR7xt6KzCYntNqYcw\">video</a>).</p>\n<h4><a href=\"https://anil.recoil.org/#onboarding-new-users\"></a>Onboarding New Users</h4>\n<p>OPAM has also now been adopted by <a href=\"http://harvard.edu\">several</a>\n<a href=\"http://cornell.edu\">big</a> <a href=\"http://princeton.edu\">universities</a>\n(including <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/\">us at\nCambridge</a>!) for\nundergraduate and graduate Computer Science courses. Demand\nincreased for an out-of-the-box solution that makes it as easy as possible\nfor new users to get started with minimum hassle. 
We created a\n<a href=\"http://lists.ocaml.org/listinfo/teaching\">dedicated teaching list</a> to\naid collaboration, and a list of <a href=\"http://ocaml.org/learn/teaching-ocaml.html\">teaching resources on\nocaml.org</a>, and supported\nseveral initiatives in collaboration with <a href=\"https://github.com/AltGr\">Louis\nGesbert</a> at OCamlPro (as usual with OPAM\ndevelopment).</p>\n<p>The easiest way to make things "just work" is via regular binary builds\nof the latest releases of OCaml and OPAM on Debian, Ubuntu, CentOS and\nFedora, via <a href=\"http://launchpad.net/~avsm\">Ubuntu PPAs</a> and the <a href=\"https://build.opensuse.org/package/show/home:ocaml/opam\">OpenSUSE\nBuild Service</a>\nrepositories. Our industrial collaborators from Citrix, <a href=\"http://jon.recoil.org\">Jon\nLudlam</a> and <a href=\"http://dave.recoil.org\">Dave Scott</a>\nbegan an <a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-January/000910.html\">upstreaming\ninitiative</a>\nto Fedora and sponsored the creation of a <a href=\"http://lists.centos.org/pipermail/centos-devel/2014-November/012375.html\">CentOS\nSIG</a>\nto ensure that binary packages remain up-to-date. We also contribute to\nthe hardworking packagers on MacOS X, Debian, FreeBSD, NetBSD and\nOpenBSD where possible to ensure that binary builds are well\nrounded out. 
Richard Mortier also assembled <a href=\"https://github.com/mirage/mirage-vagrant-vms\">Vagrant\nboxes</a> that contain OCaml\nfor use with VirtualBox.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Louis cooks us dinner in Nice at our OPAM developer summit\" src=\"https://anil.recoil.org/images/opam-in-nice.webp\" title=\"Louis cooks us dinner in Nice at our OPAM developer summit\">\nLouis cooks us dinner in Nice at our OPAM developer summit</li>\n</ul>\n<p>Within OPAM itself, we applied polish to the handling of <a href=\"https://github.com/ocaml/opam-depext\">external\ndependencies</a> to automate checking\nthat the system libraries required by OPAM are present. Two emerging\ntools that should help further in 2015 are the\n<a href=\"https://github.com/OCamlPro/opam-user-setup\">opam-user-setup</a> and\n<a href=\"https://github.com/ocaml/opam/issues/1035\">OPAM-in-a-box</a> plugins that\nautomate first-time configuration. These last two are primarily\ndeveloped at OCamlPro, with design input and support from OCaml Labs.</p>\n<p>We still have a lot of work left to do to make the new user experience\nreally seamless, and help is <em>very</em> welcome from anyone who is\ninterested. It often helps to get the perspective of a newcomer to find\nout where the stumbling blocks are, and we value any such advice. Just\nmail <a href=\"mailto:opam-devel@lists.ocaml.org\">opam-devel@lists.ocaml.org</a>\nwith your thoughts, or <a href=\"https://github.com/ocaml/opam/issues\">create an\nissue</a> on how we can improve. A\nparticularly good example of such an initiative was started by Jordan\nWalke, who prototyped <a href=\"https://github.com/jordwalke/CommonML\">CommonML</a>\nwith a NodeJS-style development workflow, and <a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-February/000975.html\">wrote\nup</a>\nhis design document for the mailing list. 
(Your questions or ideas do\nnot need to be as well developed as Jordan\u2019s prototype!)</p>\n<h3><a href=\"https://anil.recoil.org/#testing-packages\"></a>Testing Packages</h3>\n<p>The public Travis CI testing does come with some limitations, since it\nonly checks that the latest package sets install, but not if any\ntransitive dependencies fail due to interface changes. It also doesn\u2019t\ntest all the optional dependency combinations due to the 50 minute time\nlimit.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/travis-mascot-200px.webp\" title=\"\">\n</p>\n<p>We expanded the OPAM repository testing in several ways to get around\nthis:</p>\n<ul>\n<li>\n<p><strong>Individual Repositories:</strong> Thomas Gazagnaire built <a href=\"http://opam.ocaml.org/blog/opam-1-2-travisci/\">centralised\nTravis scripts</a> that\ncan be used on any OCaml GitHub repository to easily test code\nbefore it is released into OPAM. These scripts are sourced from a\ncentral\n<a href=\"https://github.com/ocaml/ocaml-travisci-skeleton\">repository</a> and\nsupport external, optional and reverse dependency checking across\nmultiple revisions of the compiler. For instance, it just needs <a href=\"https://github.com/mirage/ocaml-cohttp/blob/master/.travis.yml\">one\nfile</a>\nto test all the supported permutations of the\n<a href=\"https://github.com/mirage/ocaml-cohttp\">CoHTTP</a> library.</p>\n</li>\n<li>\n<p><strong>Bulk Builds</strong>: Damien Doligez and I independently started doing\nlarge-scale bulk builds of the repository to ensure that a single\nsnapshot of the package repository can automatically build as many\npackages as possible. 
My implementation used the\n<a href=\"http://docker.com\">Docker</a> container manager to spawn off 1000s of\npackage builds in parallel and commit the results into a filesystem.\nThis required building a <a href=\"http://avsm.github.io/ocaml-dockerfile\">Dockerfile\neDSL</a>, and the results are\nnow online at\n<a href=\"https://opam.ocaml.org/builds\">https://opam.ocaml.org/builds</a>.</p>\n</li>\n<li>\n<p><strong>OCamlot</strong>: An ongoing piece of infrastructure work is to take the\nbulk build logs (which are around 7GB per daily run), and to store\nand render them using our <a href=\"http://irmin.io\">Irmin</a> Git store. Expect\nto see more around this soon; it has the awesome feature of letting\nany developer clone the build logs for their project locally, to\nmake triage of foreign operating systems as simple as possible.</p>\n</li>\n</ul>\n<h4><a href=\"https://anil.recoil.org/#language-evolution\"></a>Language Evolution</h4>\n<p>This ability to do unattended builds of the package repository has also\nimproved the decision making process within the core compiler team.\nSince we now have a large (3000+ package) corpus of OCaml code, it\nbecame a regular occurrence in the 4.02 development cycle to \u201c<a href=\"https://anil.recoil.org/2014/04/08/grepping-every-known-ocaml-package-source.html\">ask\nOPAM</a>\u201d\nwhether a particular feature or new syntax would break any existing\ncode. This in turn provides an incentive for commercial users to provide\nrepresentative samples of their code; for instance, the Jane Street Core\nreleases in OPAM (with their very modular style) act as an open-source\ncanary without needing access to any closed source code.</p>\n<p>One good example in 2014 was the decoupling of the\n<a href=\"http://en.wikipedia.org/wiki/Camlp4\">Camlp4</a> macro preprocessor from\nthe main OCaml distribution. 
Since Camlp4 has been used for over a\ndecade and there are some very commonly used syntax extensions such as\n<a href=\"https://github.com/janestreet/type_conv\">type_conv</a>, a simple removal\nwould break a lot of packages. We used OPAM to perform a gradual\nmigration that most users hopefully never noticed by the time OCaml 4.02\nwas released. First, we added a <a href=\"https://github.com/ocaml/opam-repository/pull/2558\">dummy\npackage</a> in OPAM for\nearlier versions of the compiler that had Camlp4 built-in, and then used\nthe OPAM constraint engine to compile it as an external tool for the\nnewer compiler revisions. Then we just had to triage the bulk build logs\nto find build failures from packages that were missing a Camlp4\ndependency, and <a href=\"https://github.com/ocaml/opam-repository/pulls?utf8=%E2%9C%93&q=camlp4+requires+is%3Apr+\">add\nthem</a>\nto the package metadata.</p>\n<h4><a href=\"https://anil.recoil.org/#github-integration\"></a>GitHub Integration</h4>\n<p>An interesting\n<a href=\"https://twitter.com/vincenthz/status/563108158907097089\">comment</a> from\nVincent Hanquez about OPAM is that "OCaml's OPAM is a post-GitHub\ndesign". This is very true, as much of the workflow for pinning <code>git://</code>\nURLs emerged out of being early adopters of GitHub for hosting the\nMirageOS. OCaml Labs supported two pieces of infrastructure integration\naround GitHub in 2014:</p>\n<ul>\n<li>\n<p>OPAM has a compiler switch feature that lets you run simultaneous\nOCaml installations and swap between them easily. 
I used my <a href=\"https://github.com/avsm/ocaml-github\">GitHub\nAPI bindings</a> to regularly\nconvert every GitHub pull request into a custom compiler\nswitch (see <a href=\"https://anil.recoil.org/notes/ocaml-github-and-opam\">Easily OPAM switching to any OCaml feature request</a>).\nThis lets users reporting bugs try out a patched compiler almost\nimmediately upon a fix becoming available.</p>\n</li>\n<li>\n<p>The motivation behind this feature was our collaborator Gabriel\nScherer\u2019s\n<a href=\"http://gallium.inria.fr/blog/patch-review-on-github/\">experiment</a>\nto enable patch review of OCaml on GitHub, alongside the venerable\n<a href=\"http://caml.inria.fr/mantis/view_all_bug_page.php\">Mantis bug\ntracker</a>. We\nsupported this via adding Travis CI support to the main compiler,\nand also helped to migrate a number of support libraries to GitHub,\nsuch as <a href=\"https://github.com/ocaml/camlp4\">camlp4</a>. These can all be\nfound on the <a href=\"https://github.com/ocaml\">ocaml</a> organisation on\nGitHub.</p>\n</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#codoc-documentation\"></a>Codoc Documentation</h3>\n<p>Leo White, David Sheets, Amir Chaudhry and Thomas Gazagnaire led the\ncharge to build a modern documentation generator for OCaml, and\n<a href=\"http://lists.ocaml.org/pipermail/platform/2015-February/000539.html\">published</a>\nan <em>alpha</em> version of <a href=\"https://github.com/dsheets/codoc\">codoc 0.2.0</a>\nafter a lot of work throughout 2014. In the 2014 OCaml workshop\npresentation\n(<a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">abstract</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2014/ocl-platform-2014-slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=jxhtpQ5nJHg&list=UUP9g4dLR7xt6KzCYntNqYcw\">video</a>),\nwe mentioned the \u201cmodule wall\u201d for documentation and this attempts to\nfix it. 
To try it out, simply follow the directions in the README on\nthat repository, or <a href=\"http://dsheets.github.io/codoc\">browse some\nsamples</a> of the current, default output\nof the tool. Please do bear in mind codoc and its constituent libraries\nare still under heavy development and are <em>not</em> feature complete, but\nwe\u2019re gathering <a href=\"https://github.com/dsheets/codoc/issues\">feedback</a> from\nearly adopters.</p>\n<p>codoc's aim is to provide a widely useful set of tools for generating\nOCaml documentation. In particular, we are striving to:</p>\n<ol>\n<li>Cover all of OCaml\u2019s language features</li>\n<li>Provide accurate name resolution and linking</li>\n<li>Support cross-linking between different packages</li>\n<li>Expose interfaces to the components we\u2019ve used to build <code>codoc</code></li>\n<li>Provide a magic-free command-line interface to the tool itself</li>\n<li>Reduce external dependencies and default integration with other\ntools</li>\n</ol>\n<p>We haven\u2019t yet achieved all of these at all levels of our tool stack but\nare getting close, and the patches are all under discussion for\nintegration into the mainstream OCaml compiler. <code>codoc</code> 0.2.0 is usable\ntoday (if a little rough in some areas like default CSS), and there is a\n<a href=\"http://opam.ocaml.org/blog/codoc-0-2-0-released/\">blog post</a> that\noutlines the architecture of the new system to make it easier to\nunderstand the design decisions that went into it.</p>\n<h3><a href=\"https://anil.recoil.org/#community-governance\"></a>Community Governance</h3>\n<p>As the amount of infrastructure built around the\n<a href=\"http://ocaml.org\">ocaml.org</a> domain grows (e.g. 
mailing lists, file\nhosting, bulk building), it is important to establish a governance\nframework to ensure that it is being used in the best interests of the wider\nOCaml community.</p>\n<p>Amir Chaudhry took a good look at how other language communities\norganise themselves, and began putting together a succinct <a href=\"http://amirchaudhry.com/towards-governance-framework-for-ocamlorg/\">governance\nframework</a>\nto capture how the community around <code>ocaml.org</code> operates, and how to\nquickly resolve any conflicts that may arise in the future. He took care\nto ensure it has a well-defined scope, is simple and self-contained, and\n(crucially) documents the current reality. The result of this work is\ncirculating privately through all the existing volunteers for a first\nround of feedback, and will go live in the next few months as a living\ndocument that explains how our community operates.</p>\n<h3><a href=\"https://anil.recoil.org/#assemblage\"></a>Assemblage</h3>\n<p>One consequence of OCaml\u2019s age (close to twenty years old now) is that\nthe tools built around the compiler have evolved fairly independently.\nWhile OPAM now handles the high-level package management, there is quite\na complex ecosystem of other components that are difficult for new users\nto get to grips with: <a href=\"http://github.com/ocaml/oasis\">OASIS</a>,\n<a href=\"http://projects.camlcity.org/projects/findlib.html\">ocamlfind</a>,\n<a href=\"https://ocaml.org/learn/tutorials/ocamlbuild/\">ocamlbuild</a>, and\n<a href=\"https://github.com/the-lambda-church/merlin\">Merlin</a> to name a few.\nEach of these components (while individually stable) has its own\nmetadata and namespace formats, further compounding the lack of cohesion\nof the tools.</p>\n<p>Thomas Gazagnaire and Daniel Buenzli embarked on an effort to build an\neDSL that unifies OCaml package descriptions, with the short-term aim of\ngenerating the support files required by the various support tools, and\nthe long-term 
goal of being the integration point for the build, test\nand documentation generation lifecycle of an OCaml/OPAM package. This\nprototype, dubbed <a href=\"https://github.com/samoht/assemblage\">Assemblage</a>, has\ngone through several iterations and <a href=\"https://github.com/samoht/assemblage/labels/design\">design\ndiscussions</a> over\nthe summer of 2014. Daniel has since been splitting out portions of it\ninto the <a href=\"http://erratique.ch/software/bos\">Bos</a> OS interaction library.</p>\n<p>Assemblage has not been officially released yet, but we are committed to\nresuming work on it this summer when Daniel visits again, with the\nintention of unifying much of our workflow through this tool. If you are\ninterested in build and packaging systems, now is the time to <a href=\"https://github.com/samoht/assemblage\">make your\nopinion known</a>!</p>\n<h2><a href=\"https://anil.recoil.org/#core-compiler\"></a>Core Compiler</h2>\n<p>We also spent time in 2014 working on the core OCaml language and\ncompiler, with our work primarily led by Jeremy Yallop and Leo White.\nThese efforts were not looking to make any radical changes in the core\nlanguage; instead, we generally opted for evolutionary changes that\neither polish rough edges in the language (such as open types and handler\ncases), or new features that fit into the ML style of building programs.</p>\n<h3><a href=\"https://anil.recoil.org/#new-features-in-4020\"></a>New Features in 4.02.0</h3>\n<p>The OCaml 4.02 series was primarily developed and\n<a href=\"https://ocaml.org/releases/4.02.html\">released</a> in 2014. 
The\n<a href=\"http://caml.inria.fr/pub/distrib/ocaml-4.02/notes/Changes\">ChangeLog</a>\ngenerated much <a href=\"https://blogs.janestreet.com/ocaml-4-02-everything-else/\">user\nexcitement</a>,\nand we were also pleased to have contributed several language\nimprovements.</p>\n<h4><a href=\"https://anil.recoil.org/#handler-cases-and-exceptional-syntax\"></a>Handler Cases and exceptional syntax</h4>\n<p>OCaml\u2019s <code>try</code> and <code>match</code> constructs are good at dealing with exceptions\nand values respectively, but neither constructs can handle both values\nand exceptions. Jeremy Yallop investigated <a href=\"http://ocamllabs.github.io/compiler-hacking/2014/02/04/handler-case.html#match-exception\">how to handle\nsuccess</a>\nmore elegantly, and an elegant unified syntax emerged. A simple example\nis that of a stream iterator that uses exceptions for control flow:</p>\n<pre><code>let rec iter_stream f s =\n match (try Some (MyStream.get s) with End_of_stream -> None) with\n | None -> ()\n | Some (x, s') -> f x; iter_stream f s'\n</code></pre>\n<p>This code is not only verbose, but it also has to allocate an <code>option</code>\nvalue to ensure that the <code>iter_stream</code> calls remains tail recursive. The\nnew syntax in OCaml 4.02 allows the above to be rewritten succinctly:</p>\n<pre><code>let rec iter_stream f s =\n match MyStream.get s with\n | (x, s') -> f x; iter_stream f s'\n | exception End_of_stream -> ()\n</code></pre>\n<p>Read more about the background of this feature in Jeremy\u2019s <a href=\"http://ocamllabs.github.io/compiler-hacking/2014/02/04/handler-case.html#match-exception\">blog\npost</a>,\nthe associated discussion in the <a href=\"http://caml.inria.fr/mantis/view.php?id=6318\">upstream Mantis\nbug</a>, and the final\n<a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#sec245\">manual\npage</a> in\nthe OCaml 4.02 release. 
For an example of its use in a real library, see\nthe Jane Street\n<a href=\"https://github.com/janestreet/sexplib/blob/1bd69553/lib/conv.ml#L213-L215\">usage</a>\nin the <a href=\"https://github.com/janestreet/sexplib\">s-expression</a> handling\nlibrary (which they use widely to reify arbitrary OCaml values and\nexceptions).</p>\n<h4><a href=\"https://anil.recoil.org/#open-extensible-types\"></a>Open Extensible Types</h4>\n<p>A long-standing trick to build <a href=\"https://blogs.janestreet.com/rethinking-univ/\">universal\ncontainers</a> in OCaml has\nbeen to encode them using the exception <code>exn</code> type. There is a similar\nconcept of a <a href=\"http://mlton.org/UniversalType\">universal type</a> in\nStandard ML, and they were described in the \u201c<a href=\"http://www.andres-loeh.de/OpenDatatypes.pdf\">Open Data Types and Open\nFunctions</a>\u201d paper by Andres\nL\u00f6h and Ralf Hinze in 2006.</p>\n<p>Leo White designed, implemented and upstreamed support for <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#sec246\">extensible\nvariant\ntypes</a> in\nOCaml 4.02. Extensible variant types are variant types that can be\nextended with new variant constructors. They can be defined as follows:</p>\n<pre><code>type attr = ..\n\ntype attr += Str of string\n\ntype attr +=\n | Int of int\n | Float of float\n</code></pre>\n<p>Pattern matching on an extensible variant type requires a default case\nto handle unknown variant constructors, just as is required for pattern\nmatching on exceptions (extensible types use the exception memory\nrepresentation at runtime).</p>\n<p>With this feature added, the OCaml <code>exn</code> type simply becomes a special\ncase of open extensible types. 
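To make the default-case requirement concrete, here is a minimal, self-contained sketch (the `attr` type is redeclared so the snippet stands alone; the `describe` function and its messages are illustrative, not from the release):

```ocaml
(* Matching over an extensible variant type: a wildcard case is
   needed, because new constructors can be added in other modules
   at any time, so the match can never be exhaustive. *)
type attr = ..

type attr += Str of string
type attr += Int of int | Float of float

let describe = function
  | Str s -> "string: " ^ s
  | Int i -> "int: " ^ string_of_int i
  | _ -> "unknown attribute"  (* required default case *)
```

Without the wildcard, the compiler warns that the match is not exhaustive, exactly as it does for matches over `exn`.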
Exception constructors can be declared\nusing the type extension syntax:</p>\n<pre><code> type exn += Exc of int\n</code></pre>\n<p>You can read more about the discussion behind open extensible types in\nthe upstream <a href=\"http://caml.inria.fr/mantis/view.php?id=5584\">Mantis bug</a>.\nIf you\u2019d like to see another example of their use, they have been\nadopted by the latest releases of the Jane Street Core libraries in the\n<a href=\"https://github.com/janestreet/core_kernel/blob/43ee3eef/lib/type_equal.ml#L64\">Type_equal</a>\nmodule.</p>\n<h3><a href=\"https://anil.recoil.org/#modular-implicits\"></a>Modular Implicits</h3>\n<p>A common criticism of OCaml is its lack of support for ad-hoc\npolymorphism. The classic example of this is OCaml\u2019s separate addition\noperators for integers (<code>+</code>) and floating-point numbers (<code>+.</code>). Another\nexample is the need for type-specific printing functions (<code>print_int</code>,\n<code>print_string</code>, etc.) rather than a single <code>print</code> function which works\nacross multiple types.</p>\n<p>Taking inspiration from Scala\u2019s\n<a href=\"http://docs.scala-lang.org/tutorials/tour/implicit-parameters.html\">implicits</a>\nand <a href=\"http://www.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf\">Modular Type\nClasses</a> by\nDreyer <em>et al.</em>, Leo White designed a system for ad-hoc polymorphism in\nOCaml based on using modules as type-directed implicit parameters. 
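A sketch in the proposed syntax gives the flavour (this is only accepted by the experimental `4.02.0+modular-implicits` compiler switch, not mainline OCaml; `Show` and `Show_int` are illustrative names):

```ocaml
(* Proposed modular-implicits syntax; not mainline OCaml. *)
module type Show = sig
  type t
  val show : t -> string
end

(* An implicit module is an ordinary module marked as a candidate
   for implicit resolution. *)
implicit module Show_int = struct
  type t = int
  let show = string_of_int
end

(* {S : Show} is an implicit module parameter: the compiler selects
   an implicit module matching Show from the type of x. *)
let show {S : Show} (x : S.t) = S.show x

(* show 42 elaborates to show {Show_int} 42 *)
```

Resolution is driven only by modules explicitly marked implicit and brought into scope, which keeps namespace control in the programmer's hands.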
The\ndesign not only supports implicit modules, but also implicit functors\n(that is, modules parameterised by other modules) to permit the\nexpression of generic modular implicits in exactly the same way that\nfunctors are used to build abstract data structures.</p>\n<p>Frederic Bour joined us as a summer intern and dove straight into the\nimplementation, resulting in an <a href=\"http://andrewray.github.io/iocamljs/modimp_show.html\">online\ndemo</a> and ML\nWorkshop presentation\n(<a href=\"https://sites.google.com/site/mlworkshoppe/modular-implicits.pdf?attredirects=0\">abstract</a>,\n<a href=\"https://www.youtube.com/watch?v=3wVUXTd4WNc\">video</a> and\n<a href=\"http://www.lpw25.net/ml2014.pdf\">paper</a>). Another innovation in how\nwe\u2019ve been trialling this feature is the use of Andy Ray\u2019s\n<a href=\"https://andrewray.github.io/iocamljs/\">IOCamlJS</a> to publish an\ninteractive, online notebook that is fully hosted in the browser. You\ncan follow the examples of modular implicits\n<a href=\"https://andrewray.github.io/iocamljs/modimp_show.html\">online</a>, or try\nthem out on your own computer via an OPAM switch:</p>\n<pre><code>opam switch 4.02.0+modular-implicits\neval `opam config env`\nopam install utop \nutop\n</code></pre>\n<p>Some of the early feedback on modular implicits from industrial users\nwas interesting. Jane Street commented that although this would be a big\nusability leap, it would be dangerous to lose control over exactly what\ngoes into the implicit environment (i.e. the programmer should always\nknow what <code>(a + b)</code> represents by locally reasoning about the code). 
The\ncurrent design thus follows the ML discipline of maintaining explicit\ncontrol over the namespace, with any ambiguities in resolving an\nimplicit module type resulting in a type error.</p>\n<h3><a href=\"https://anil.recoil.org/#multicore\"></a>Multicore</h3>\n<p>In addition to ad-hoc polymorphism, support for parallel execution on\nmulticore CPUs is undoubtedly the most common feature request for OCaml.\nThis has been high on our list after improving tooling support, and\nStephen Dolan and Leo White made solid progress in 2014 on the core\nruntime plumbing required.</p>\n<p>Stephen initially added <a href=\"https://github.com/stedolan/ocaml\">thread-local\nsupport</a> to the OCaml compiler. This\ndesign avoided the need to make the entire OCaml runtime preemptive (and\nthus a huge patch) by allocating thread-local state per core.</p>\n<p>We are now deep into the design and implementation of the programming\nabstractions built over these low-level primitives. One exciting aspect\nof our implementation is that much of the scheduling logic for multicore\nOCaml can be written in (single-threaded) OCaml, making the design very\nflexible with respect to <a href=\"http://kcsrk.info/papers/mmscc_marc12.pdf\">heterogeneous\nhardware</a> and <a href=\"http://fable.io\">variable IPC\nperformance</a>.</p>\n<p>To get feedback on the overall design of multicore OCaml, we presented\nat OCaml 2014\n(<a href=\"http://www.cl.cam.ac.uk/~sd601/papers/multicore_slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=FzmQTC_X5R4\">video</a> and\n<a href=\"https://ocaml.org/meetings/ocaml/2014/ocaml2014_1.pdf\">abstract</a>), and\nStephen visited INRIA to consult with the development team and Arthur\nChargueraud (the author of\n<a href=\"http://www.chargueraud.org/softs/pasl/\">PASL</a>). Towards the end of the\nyear, <a href=\"http://kcsrk.info/\">KC Sivaramakrishnan</a> finished his PhD studies\nat Purdue and joined our OCaml Labs group. 
He is the author of\n<a href=\"http://multimlton.cs.purdue.edu/mML/Welcome.html\">MultiMlton</a>, and is\nnow driving the completion of the OCaml multicore work along with\nStephen Dolan, Leo White and Mark Shinwell. Stay tuned for updates from\nus when there is more to show later this year!</p>\n<h3><a href=\"https://anil.recoil.org/#ctypes-a-modular-foreign-function-interface\"></a>Ctypes: a Modular Foreign Function Interface</h3>\n<p>The <a href=\"https://github.com/ocamllabs/ocaml-ctypes\">Ctypes</a> library started\nas an experiment with GADTs by Jeremy Yallop, and has since ballooned into\na robust, comprehensive library for safely interacting with the OCaml\nforeign function interface. The first release came out in time to be\nincluded in <a href=\"https://realworldocaml.org/v1/en/html/foreign-function-interface.html\">Real World\nOCaml</a>\nin lieu of the low-level FFI (which I was not particularly enamoured\nwith having to explain in a tight page limit).</p>\n<p>Throughout 2014, Jeremy expanded support for a number of features\nrequested by users (both industrial and academic) who adopted the\nlibrary in preference to manually writing C code to interface with the\nruntime, and issued several updated\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes/releases\">releases</a>.</p>\n<h4><a href=\"https://anil.recoil.org/#c-stub-generation\"></a>C Stub Generation</h4>\n<p>The first release of Ctypes required the use of\n<a href=\"https://sourceware.org/libffi/\">libffi</a> to dynamically load shared\nlibraries and dynamically construct function call stack frames whenever\na foreign function is called. While this works for simple libraries, it\ncannot cover <em>all</em> use cases, since interfacing with C demands an\nunderstanding of <code>struct</code> memory layout, C preprocessor macros, and\nother platform-dependent quirks which are more easily dealt with by\ninvoking a C compiler. 
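For reference, the libffi-backed dynamic mode binds a C function in a single line (a minimal sketch; it assumes the `ctypes` and `ctypes-foreign` OPAM packages are installed and that the symbol is resolvable in the running process):

```ocaml
(* Dynamic binding via libffi: no C stubs are generated; the call
   frame for the C function puts(3) is constructed at runtime. *)
open Ctypes
open Foreign

let puts = foreign "puts" (string @-> returning int)

let () = ignore (puts "hello from ctypes")
```

The same `string @-> returning int` description is reused unchanged by the stub-generation mode discussed below, which is what makes the modular design possible.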
Finally, the performance of a <code>libffi</code>-based API\nwill necessarily be slower than writing direct C stub code.</p>\n<p>While many other language FFIs provide separate libraries for dynamic\nand static binding, we decided to have a go at building a\n<em>modular</em> version of Ctypes that could handle both cases from a single\ndescription of the foreign function interface. The result (dubbed\n\u201cCmeleon\u201d) remained surprisingly succinct and usable, and now covers\nalmost every use of the OCaml foreign function interface. We submitted a\npaper to <a href=\"http://icfpconference.org/2015\">ICFP 2015</a> titled \u201c<a href=\"https://anil.recoil.org/papers/drafts/2015-cmeleon-icfp-draft1.pdf\">A modular\nforeign function\ninterface</a>\u201d\nthat describes it in detail. Here is an example of how simple a generic\nbinding looks:</p>\n<pre><code>module Bindings(F : FOREIGN) = struct\n open F\n let gettimeofday = foreign \"gettimeofday\"\n (ptr timeval @-> ptr timezone @-> returning int)\nend\n</code></pre>\n<p>The <code>FOREIGN</code> module type completely abstracts the details of whether\ndynamic or static binding is used, and handles C complexities such\nas computing the struct layout on the local machine architecture.</p>\n<h4><a href=\"https://anil.recoil.org/#inverse-stubs\"></a>Inverse Stubs</h4>\n<p>The other nice result from functorising the foreign function interface\nemerged when we tried to <em>invert</em> the FFI and serve a C interface from\nOCaml code (for example, by compiling the OCaml code as a <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/intfc.html\">shared\nlibrary</a>). 
This\nwould let us begin swapping out C libraries that we <a href=\"http://openssl.org\">don\u2019t\ntrust</a> with <a href=\"https://github.com/mirage/ocaml-tls\">safer\nequivalents</a> written in OCaml.</p>\n<p>You can see an\n<a href=\"https://github.com/yallop/ocaml-ctypes-inverted-stubs-example\">example</a>\nof how inverted stubs work via a simple C XML parsing interface built on the\n<a href=\"http://erratique.ch/software/xmlm\">Xmlm</a> library. We can define a C\n<code>struct</code> by:</p>\n<pre><code>(* Define a struct of callbacks (C function pointers) *)\nlet handlers : [`handlers] structure typ = structure \"handlers\"\nlet (--) s f = field handlers s (funptr f)\nlet on_data = \"on_data\" -- (string @-> returning void)\nlet on_start_tag = \"on_start_tag\" -- (string @-> string @-> returning void)\nlet on_end_tag = \"on_end_tag\" -- (void @-> returning void)\nlet on_dtd = \"on_dtd\" -- (string @-> returning void)\nlet on_error = \"on_error\" -- (int @-> int @-> string @-> returning void)\nlet () = seal handlers\n</code></pre>\n<p>and then expose this via C functions:</p>\n<pre><code>module Stubs(I : Cstubs_inverted.INTERNAL) = struct\n (* Expose the type 'struct handlers' to C. *)\n let () = I.structure handlers\n\n (* We expose just a single function to C. The first argument is a (pointer\n to a) struct of callbacks, and the second argument is a string\n representing a filename to parse. 
*)\n let () = I.internal \"parse_xml\"\n   (ptr handlers @-> string @-> returning void) parse\nend\n</code></pre>\n<p>You can find the full source code to these snippets on the\n<a href=\"https://github.com/yallop/ocaml-ctypes-inverted-stubs-example\">ocaml-ctypes-inverted-stubs-example</a>\nrepository on GitHub.</p>\n<p>We\u2019ll be exploring this aspect of Ctypes further in 2015 for SSL/TLS\nwith David Kaloper and Hannes Mehnert, and Microsoft Research has\ngenerously funded a <a href=\"http://research.microsoft.com/en-us/collaboration/global/phd_projects2015.aspx\">PhD\nstudentship</a>\nto facilitate the work.</p>\n<h4><a href=\"https://anil.recoil.org/#community-contributions\"></a>Community Contributions</h4>\n<p>Ctypes benefited enormously from several external contributions by the\nOCaml community. From a portability perspective, A. Hauptmann\ncontributed <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/190\">Windows\nsupport</a>, and Thomas\nLeonard added <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/231\">Xen\nsupport</a> to allow\nCtypes bindings to work with <a href=\"http://openmirage.org\">MirageOS\nunikernels</a> (which opens up the intriguing\npossibility of accessing shared libraries across virtual machine\nboundaries in the future). 
C language support was fleshed out by Edwin\nTorok contributing <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/238\">typedef\nsupport</a>, Ramkumar\nRamachandra adding <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/220\">C99\nbools</a> and Peter\nZotov integrating <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/143\">native\nstrings</a>.</p>\n<p>The winner of \u201cmost enthusiastic use of OCaml Labs code\u201d goes to <a href=\"https://github.com/braibant\">Thomas\nBraibant</a> of\n<a href=\"http://cryptosense.com/the-team/\">Cryptosense</a>, who used <em>every</em>\nfeature of the Ctypes library (combining multi-threaded, inverted, staged\nand marshalled bindings) in their effort to <a href=\"http://www.economist.com/news/science-and-technology/21647269-automating-search-loopholes-software-hacking-hackers\">hack the\nhackers</a>.\nDavid Sheets comes a close second with his implementation of the <a href=\"https://github.com/dsheets/profuse\">FUSE\nbinary protocol</a>, parameterised by\nversion quirks.</p>\n<p>If you\u2019re using Ctypes, we would love to hear about your particular use.\nA search on GitHub and OPAM reveals over 20 projects using it already,\nincluding industrial use at <a href=\"http://cryptosense.com\">Cryptosense</a> and\n<a href=\"http://ocaml.janestreet.com\">Jane Street</a>, and ports to Windows, *BSD,\nMac OS X and even iPhone and Android. 
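</p>\n<p>For readers who haven\u2019t tried it yet, a minimal dynamic binding is only a few lines. This sketch is illustrative rather than drawn from any of the projects above; it assumes the <code>ctypes</code> and <code>ctypes-foreign</code> OPAM packages are installed, and binds the C library\u2019s <code>getpid(2)</code> at runtime through libffi:</p>\n<pre><code>(* load getpid from the process's own symbols via libffi *)\nopen Ctypes\n\nlet getpid =\n  Foreign.foreign \"getpid\" (void @-> returning int)\n\nlet () = Printf.printf \"running as pid %d\" (getpid ())\n</code></pre>\n<p>The same type description can be fed to the static stub generator instead, which is the point of the <code>FOREIGN</code> abstraction described above.</p>\n<p>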
There\u2019s a <a href=\"https://github.com/ocamllabs/ocaml-ctypes/wiki\">getting\nstarted</a> guide, and a\n<a href=\"http://lists.ocaml.org/listinfo/ctypes\">mailing list</a> available.</p>\n<h2><a href=\"https://anil.recoil.org/#community-and-teaching-efforts\"></a>Community and Teaching Efforts</h2>\n<p>In addition to the online community building, we also participated in a\nnumber of conferences and face-to-face events to promote education about\nfunctional programming.</p>\n<h3><a href=\"https://anil.recoil.org/#conferences-and-talks\"></a>Conferences and Talks</h3>\n<ul>\n<li>Gallery\n\n<img alt=\"Anil speaking at QCon on unikernels\" src=\"https://anil.recoil.org/images/qcon-unikernel-talk.webp\" title=\"Anil speaking at QCon on unikernels\">\nAnil speaking at QCon on unikernels</li>\n</ul>\n<p>There has been a huge growth in the number of quality conferences in\nrecent years, making it tough to choose which ones to attend.\n<a href=\"http://icfpconference.org\">ICFP</a> is the academic meeting point that\npredates most of them, and we <a href=\"https://anil.recoil.org/2014/08/31/ocaml-labs-at-icfp-2014.html\">participated\nextensively</a>\nin 2014 via talks, tutorials and a\n<a href=\"https://www.youtube.com/watch?v=UEIHfXLMtwA\">keynote</a> at the Haskell\nSymposium.<br>\nI also served on the <a href=\"http://icfpconference.org/icfp2014/\">program\ncommittee</a>, was <a href=\"https://anil.recoil.org/2015/02/18/icfp15-call-for-sponsorships.html\">industrial\nrelations\nchair</a>,\nand took over as the steering committee chair of\n<a href=\"http://cufp.org\">CUFP</a>. 
Jeremy Yallop, Thomas Gazagnaire and Leo White\nall served on workshop program committees, with Jeremy also chairing\nthis year\u2019s ML Workshop.</p>\n<p>Outside of academic conferences, we participated in a number of\nnon-academic events such as <a href=\"https://qconsf.com/\">QCon</a>,\n<a href=\"http://oscon.com\">OSCON</a>, <a href=\"http://ccc.de\">CCC</a>, <a href=\"https://operatingsystems.io/\">New Directions in\nOS</a>,\n<a href=\"http://functionalconf.com\">FunctionalConf</a>,\n<a href=\"https://skillsmatter.com/conferences/1819-functional-programming-exchange\">FPX</a>\nand <a href=\"https://fosdem.org/2014/\">FOSDEM</a>. The vast majority of these talks\nwere about MirageOS, and slides can be found at\n<a href=\"http://decks.openmirage.org\">decks.openmirage.org</a>.</p>\n<h4><a href=\"https://anil.recoil.org/#the-2048-browser-game\"></a>The 2048 Browser Game</h4>\n<p>Yaron Minsky and I have run OCaml tutorials for ICFP for\n<a href=\"http://cufp.org/2011/t3-building-functional-os.html\">a</a>\n<a href=\"http://cufp.org/2013/t2-yaron-minsky-anil-madhavapeddy-ocaml-tutorial.html\">few</a>\n<a href=\"http://cufp.org/2012/t1-real-world-ocaml-anil-madhavapeddy-university-c.html\">years</a>,\nand we finally hung up our boots in favour of a new crowd.</p>\n<p>Jeremy Yallop and Leo White stepped up to the mark with their ICFP/CUFP\n2014 <a href=\"http://cufp.org/2014/t7-leo-white-introduction-to-ocaml.html\">Introduction to\nOCaml</a>\ntutorial, which had the additional twist of being taught entirely in a\nweb browser by virtue of using\n<a href=\"http://ocsigen.org/js_of_ocaml\">js_of_ocaml</a> and\n<a href=\"http://andrewray.github.io/iocamljs/\">IOCamlJS</a>. They decided that a\ngood practical target was the popular\n<a href=\"http://gabrielecirulli.github.io/2048/\">2048</a> game that has wasted many\nprogrammer hours here at OCaml Labs. 
They <a href=\"https://github.com/ocamllabs/2048-tutorial\">hacked on\nit</a> over the summer,\nassisted by our visitor Daniel Buenzli, who also released useful\nlibraries such as <a href=\"http://erratique.ch/software/vg\">Vg</a>,\n<a href=\"http://erratique.ch/software/react\">React</a>,\n<a href=\"http://erratique.ch/software/useri\">Useri</a>, and\n<a href=\"http://erratique.ch/software/gg\">Gg</a>.</p>\n<p>The end result is satisfyingly <a href=\"http://ocamllabs.github.io/2048-tutorial/\">playable\nonline</a>, with the source code\navailable at\n<a href=\"https://github.com/ocamllabs/2048-tutorial\">ocamllabs/2048-tutorial</a>.</p>\n<p>Thomas Gazagnaire was invited to Bangalore for <a href=\"http://functionalconf.com/\">Functional\nConf</a> later in the year, and he extended the\n<a href=\"http://gazagnaire.org/fuconf14/\">interactive tutorial notebook</a> and\nalso ran an OCaml tutorial for a packed room. We were very happy to\nsupport the first functional programming conference in India, and hope\nto see many more such events spring up! 
Amir Chaudhry then went to\nBelgium to <a href=\"https://fosdem.org/2015/\">FOSDEM 2015</a> where he showed off\n<a href=\"http://amirchaudhry.com/unikernel-arm-demo-fosdem/\">the 2048 game running as an ARM\nunikernel</a> to a\ncrowd of attendees at the Xen booth.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Jeremy Yallop giving the L23 course at Cambridge\" src=\"https://anil.recoil.org/images/l23.webp\" title=\"Jeremy Yallop giving the L23 course at Cambridge\">\nJeremy Yallop giving the L23 course at Cambridge\n\n<img alt=\"Compiler hacking with Don Syme\" src=\"https://anil.recoil.org/images/compiler-hacking-dsyme.webp\" title=\"Compiler hacking with Don Syme\">\nCompiler hacking with Don Syme\n\n<img alt=\"Finding a copy of Real World OCaml in Foyles!\" src=\"https://anil.recoil.org/images/jeremy-rwo.webp\" title=\"Finding a copy of Real World OCaml in Foyles!\">\nFinding a copy of Real World OCaml in Foyles!</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#graduate-teaching\"></a>Graduate Teaching</h3>\n<p><a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a> and <a href=\"https://github.com/lpw25\">Leo White</a> (with assistance from <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and\nmyself) also led the design of a new graduate course on <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/\">Advanced\nFunctional Programming</a> at\nthe Computer Laboratory. This ran in the <a href=\"http://en.wikipedia.org/wiki/Lent_term\">Lent\nTerm</a> and was over-subscribed by\nthree times the number who pre-registered (due to a number of PhD\nstudents and our collaborators from <a href=\"http://citrix.com\">Citrix</a> also\nattending).</p>\n<p>The course materials are <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/materials.html\">freely available\nonline</a> and\ncover the theory behind functional programming, and then move on to type\ninference, abstraction and parametricity, GADTs, rows, monads, and\nstaging. 
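</p>\n<p>As a flavour of the material, the GADT lectures build up to the classic well-typed interpreter, where OCaml\u2019s type checker guarantees that only well-typed object programs can be evaluated. The following is a standalone sketch in the spirit of the course, not an excerpt from its notes:</p>\n<pre><code>(* expressions indexed by the type they evaluate to *)\ntype _ expr =\n  | Int  : int -> int expr\n  | Bool : bool -> bool expr\n  | Add  : int expr * int expr -> int expr\n  | If   : bool expr * 'a expr * 'a expr -> 'a expr\n\n(* the \"type a.\" annotation requests the polymorphic recursion GADTs need *)\nlet rec eval : type a. a expr -> a = function\n  | Int n -> n\n  | Bool b -> b\n  | Add (x, y) -> eval x + eval y\n  | If (c, t, e) -> if eval c then eval t else eval e\n\nlet () = assert (eval (If (Bool true, Add (Int 1, Int 2), Int 0)) = 3)\n</code></pre>\n<p>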
We will be running this again\nin future years, and the lecture\nmaterials are already proving useful to <a href=\"https://sympa.inria.fr/sympa/arc/caml-list/2015-04/msg00001.html\">answer mailing list\nquestions</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#mentoring-beginners\"></a>Mentoring Beginners</h3>\n<p>We also had the pleasure of mentoring up-and-coming functional\nprogrammers via several outreach programs, both face-to-face and remote.</p>\n<h4><a href=\"https://anil.recoil.org/#cambridge-compiler-hacking\"></a>Cambridge Compiler Hacking</h4>\n<p>We started the <a href=\"http://ocamllabs.github.io/compiler-hacking/\">Cambridge Compiler\nHacking</a> sessions in a\nsmall way towards the end of 2013 in order to provide a local, friendly\nplace to assist people who wanted to dip their toes into the\nunnecessarily mysterious world of programming language hacking. The plan\nwas simple: provide drinks, pizza, networking and a <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki\">bug list of varying\ndifficulty</a> for\nattendees to choose from and work on for the evening, with mentoring\nfrom experienced OCaml contributors.</p>\n<p>We continued this bi-monthly tradition in 2014, with a regular\nattendance of 15-30 people, and even cross-pollinated communities with\nour local F# and Haskell colleagues. We rotated locations from the\nCambridge Computer Laboratory to Citrix, Makespace, and the new\nCambridge Postdoc Centre. We posted some\n<a href=\"http://ocamllabs.github.io/compiler-hacking/2014/06/24/highlights-from-recent-sessions.html\">highlights</a>\nfrom sessions towards the start of the year, and are very happy with how\nit\u2019s going. There has even been uptake of the bug list across the water\nin France, thanks to Gabriel Scherer.</p>\n<p>In 2015, we\u2019d like to branch out further and host some sessions in\nLondon. 
If you have a suggestion for a venue or theme, please <a href=\"http://lists.ocaml.org/listinfo/cam-compiler-hacking\">get in\ntouch</a>!</p>\n<h4><a href=\"https://anil.recoil.org/#summer-programs\"></a>Summer Programs</h4>\n<p>There has been a laudable rise in summer programs designed to encourage\ndiversity in our community, and we of course leap at the opportunity to\nparticipate in these when we find them.</p>\n<ul>\n<li>The <a href=\"https://gnome.org/opw/\">GNOME Outreach Program</a> (now also known\nas <a href=\"https://www.gnome.org/outreachy/\">Outreachy</a>) had one funded\nplace for Xen and MirageOS. <a href=\"http://www.somerandomidiot.com/\">Mindy\nPreston</a> did a spectacular <a href=\"http://www.somerandomidiot.com/blog/categories/ocaml/\">blog\nseries</a> about\nher experiences and motivations behind learning OCaml.</li>\n<li>The <a href=\"https://www.google-melange.com/\">Google Summer of Code 2014</a>\nalso had us\n<a href=\"http://openmirage.org/blog/applying-for-gsoc2014\">participating</a>\nvia MirageOS, and <a href=\"https://github.com/moonlightdrive\">Jyotsna\nPrakash</a> took on the challenging\njob of building OCaml bindings for Amazon EC2, also detailed on <a href=\"https://1000hippos.wordpress.com/\">her\nblog</a>.</li>\n<li>Amir Chaudhry began the <a href=\"https://github.com/mirage/mirage-www/wiki/Pioneer-Projects\">Mirage Pioneer\nProjects</a>\ninitiative to give beginners an easier onramp, and this has taken\noff very effectively as a way to advertise interesting projects for\nbeginners at varying levels of difficulty.</li>\n</ul>\n<p>Our own students also had the chance to participate in such workshops to\nget out of Cambridge in the summer! <a href=\"http://hh360.user.srcf.net/blog/\">Heidi\nHoward</a> liveblogged her experiences at\nthe\n<a href=\"http://www.syslog.cl.cam.ac.uk/2015/01/14/programming-languages-mentoring-workshop-plmw/\">PLMW</a>\nworkshop in Mumbai. 
Meanwhile, <a href=\"https://github.com/dsheets\">David\nSheets</a> got to travel to the slightly less\nexotic London to <a href=\"http://www.syslog.cl.cam.ac.uk/2014/11/25/new-directions-in-operating-systems/\">liveblog\nOSIO</a>,\nand Leonhard Markert covered <a href=\"http://www.syslog.cl.cam.ac.uk/2014/09/05/ocaml-2014/\">ICFP\n2014</a> as a\nstudent volunteer.</p>\n<h3><a href=\"https://anil.recoil.org/#blogging-and-online-activities\"></a>Blogging and Online Activities</h3>\n<p>Our <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/blogs/\">blog roll</a>\nmaintains the ongoing stream of activity from the OCaml Labs crew, but\nthere were some particular highlights throughout 2014.</p>\n<ul>\n<li><a href=\"http://roscidus.com/blog/\">Thomas Leonard</a> began writing about his\nexperiences with switching his <a href=\"http://0install.net\">0install</a>\ninstallation system from <a href=\"http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-retrospective/\">Python to\nOCaml</a>\nand <a href=\"http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain/\">what you gain with\nOCaml</a>.\nThis series led to a bunch of interesting feedback on social\nnetworking sites, and Thomas joined the group full-time to work on\nour research into\n<a href=\"http://roscidus.com/blog/blog/2015/01/21/securing-the-unikernel/\">unikernels</a>.</li>\n<li><a href=\"http://www.skjegstad.com/\">Magnus Skjegstad</a> returned from Norway\nto Cambridge to work on MirageOS, and came up with some <a href=\"http://www.skjegstad.com/blog/2015/03/25/mirageos-vm-per-url-experiment/\">crazy\nexperiments</a>,\nas well as helping to build <a href=\"http://www.skjegstad.com/blog/2015/01/19/mirageos-xen-virtualbox/\">Vagrant\nimages</a>\nof the OCaml development environment.</li>\n<li><a href=\"http://amirchaudhry.com\">Amir Chaudhry</a> began his quest to <a href=\"http://amirchaudhry.com/writing-planet-in-pure-ocaml/\">port\nhis website</a>\nto a <a 
href=\"http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Jekyll\nunikernel</a>.</li>\n<li>The <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">Mirage 2.0\nrelease</a> in\nthe summer of 2014 saw a slew of blogs posts about the\n<a href=\"http://openmirage.org/blog/2014-in-review\">surge</a> in MirageOS\nactivity.</li>\n</ul>\n<p>It wasn\u2019t all just blogging though, and Jeremy Yallop and Leo White in\nparticular participated in some epic OCaml <a href=\"http://caml.inria.fr/mantis/view.php?id=5528\">bug\nthreads</a> about new\nfeatures, and\n<a href=\"https://sympa.inria.fr/sympa/arc/caml-list/2015-02/msg00150.html\">explanations</a>\nabout OCaml semantics on the mailing list.</p>\n<p>Amir Chaudhry also continued to curate and develop the content on the\n<a href=\"http://ocaml.org\">ocaml.org</a> website with our external collaborators\n<a href=\"https://anil.recoil.org/\">Ashish Agarwal</a>, <a href=\"https://anil.recoil.org/\">Christophe Troestler</a> and <a href=\"https://anil.recoil.org/\">Phillippe Wang</a>.\nNotably, it is now the recommended site for OCaml (with the <a href=\"http://caml.inria.fr\">INRIA\nsite</a> being infrequently updated), and also hosts\nthe <a href=\"https://ocaml.org/meetings/\">ACM OCaml Workshop</a> pages. 
One\naddition that highlighted the userbase of OCaml in the teaching\ncommunity came from building a <a href=\"https://ocaml.org/learn/teaching-ocaml.html\">map of all of the\nuniversities</a> where the\nlanguage is taught, and this was Yan Shvartzshnaider\u2019s <a href=\"http://yansnotes.blogspot.co.uk/2014/11/good-news-everyone-ocamlorg-teaching.html\">first\ncontribution</a>\nto the site.</p>\n<h3><a href=\"https://anil.recoil.org/#visitors-and-interns\"></a>Visitors and Interns</h3>\n<ul>\n<li>Gallery\n\n<img alt=\"Down at the pub with the gang!\" src=\"https://anil.recoil.org/images/ocl-pub.webp\" title=\"Down at the pub with the gang!\">\nDown at the pub with the gang!</li>\n</ul>\n<p>Finally, a really important part of any community is hanging out with\neach other to chat over ideas in a friendly environment. As usual, we\nhad a very steady stream of visitors and interns throughout 2014 to\nfacilitate this.</p>\n<p>Frederic Bour, Benjamin Farinier and Matthieu Journault joined us as\nsummer interns from their respective universities in France as part of\ntheir Masters programs. Frederic worked on modular implicits and <a href=\"https://www.irill.org/videos/oups-december-2014/Modular_implicits\">gave a\ngreat\ntalk</a>\nat the OCaml Users group. Benjamin and Matthieu worked on Irmin data\nstructures and complexity (and\n<a href=\"https://github.com/mirage/merge-queues\">merge-queues</a> and\n<a href=\"https://github.com/mirage/merge-ropes\">merge-ropes</a>), and Benjamin had\nhis paper on \u201c<a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.pdf\">Mergeable Persistent Data\nStructures</a>\u201d accepted\nto <a href=\"http://jfla.inria.fr/2015/\">JFLA 2015</a>, while Matthieu\u2019s work on\nefficient algorithms for synchronising Irmin DAGs is being integrated\ninto the upstream source code.</p>\n<p>Daniel Buenzli repeated his visit from 2013 and spent a productive\nsummer with us, commenting on almost every project we\u2019re working on. 
In\nhis own words (edited for brevity):</p>\n<blockquote>\n<p>I started by implementing and releasing\n<a href=\"http://erratique.ch/software/uucp\">Uucp</a>, a library to provide\nefficient access to a selection of the properties of the latest\nUnicode Character database (UCD). [\u2026] As a side effect of the previous\npoint I took time to write an absolute <a href=\"http://erratique.ch/software/uucp/doc/Uucp.html#uminimal\">minimal introduction to\nUnicode</a>.\n[\u2026] Since I was in this Unicode business I took the opportunity to\npropose a <a href=\"https://github.com/ocaml/ocaml/pull/80\">31 loc patch to the standard\nlibrary</a> for a type to\nrepresent Unicode scalar values (a Unicode character, to be imprecise)\nto improve interoperability.</p>\n<p>The usual yearly update to OpenGL was announced at the Siggraph\nconference. This prompted me to update the ctypes-based <a href=\"http://erratique.ch/software/tgls\">tgls\nlibrary</a> for supporting the latest\nentry points of OpenGL 4.5 and OpenGL ES 3.1. Since the bindings are\nautomatically generated from the OpenGL XML registry the work is not\ntoo involved but there\u2019s always the odd function signature you\ndon\u2019t/can\u2019t handle automatically yet.</p>\n<p>Spent quite a bit (too much) time on\n<a href=\"http://erratique.ch/software/useri\">useri</a>, a small multi-platform\nabstraction for setting up a drawing surface and gathering user input\n(<em>not</em> usury) as <a href=\"http://erratique.ch/software/react\">React</a> events.\nUseri started this winter as a layer on top of SDL to implement a <a href=\"http://erratique.ch/log/2014-05-18\">CT\nscan app</a> and it felt like this\ncould be the basis for adding interactivity and animation to Vg/Vz\nvisualizations \u2013 js viz libraries simply rely on the support provided\nby the browser or SVG support but Vg/Vz strives for backend\nindependence and clear separations of concern (up to which limit\nremains an open question). 
Unfortunately I couldn\u2019t bring it to a\nrelease and got a little bit lost in browser compatibility issues and\ntrying to reconcile what browser and SDL give us in terms of\nfunctionality and way of operating, so that a maximum of client code\ncan be shared among the supported platforms. But despite this\nnon-release it still managed to be useful in some way, see the next\npoint.</p>\n<p>Helped Jeremy and Leo to implement the rendering and interaction for\ntheir ICFP tutorial <a href=\"https://github.com/ocamllabs/2048-tutorial\">2048 js_of_ocaml\nimplementation</a>. This\nfeatured the use of Gg, Vg, Useri and React and I was quite pleased\nwith the result (despite some performance problems in certain\nbrowsers, but hey composable rendering and animation without a single\nassignment in client code). It\u2019s nice to see that all these pains at\ntrying to design good APIs eventually fit together [\u2026]</p>\n</blockquote>\n<p>A couple of visitors joined us from sunny\n<a href=\"http://github.com/mirleft\">Morocco</a>, where Hannes Mehnert and David\nKaloper had gone to work on a clean-slate TLS stack. They found the\n<a href=\"http://openmirage.org\">MirageOS</a> effort online, and got in touch about\nvisiting. After a very fun summer of hacking, their stack is now the\nstandard TLS option in MirageOS and resulted in the <a href=\"http://amirchaudhry.com/bitcoin-pinata/\">Bitcoin Pinata\nchallenge</a> being issued! Hannes\nand David have since moved to Cambridge to work on this stack full-time\nin 2015, but the internships served as a great way for everyone to get\nto know each other.</p>\n<p>We also had the pleasure of visits from several of our usually remote\ncollaborators. 
<a href=\"https://github.com/Chris00\">Christophe Troestler</a>,\n<a href=\"http://ocaml.janestreet.com\">Yaron Minsky</a>, <a href=\"http://github.com/diml\">Jeremie\nDiminio</a> and <a href=\"https://github.com/andrewray\">Andy\nRay</a> all visited for the annual OCaml Labs\n<a href=\"https://gist.github.com/avsm/18450004ae19c2facf7a\">review meeting</a> in\nChrist\u2019s College. There were also many academic talks from foreign\nvisitors in our <a href=\"http://talks.cam.ac.uk/show/archive/8316\">SRG seminar\nseries</a>, ranging from <a href=\"http://www.cse.iitb.ac.in/~uday/\">Uday\nKhedkar</a> from IIT to <a href=\"http://okmij.org/ftp/\">Oleg\nKiselyov</a> deliver multiple talks on staging and\noptimisation (as well as making a celebrity appearance at the compiler\nhacking session, and <a href=\"http://ocaml.janestreet.com\">Yaron Minsky</a>\ndelivering an Emacs-driven departmental seminar on his experiences with\n<a href=\"http://talks.cam.ac.uk/talk/index/51144\">Incremental</a> computation.</p>\n<h2><a href=\"https://anil.recoil.org/#research-efforts\"></a>Research Efforts</h2>\n<p>The OCaml Labs are of course based in the Cambridge Computer Laboratory,\nwhere our day job is to do academic research. 
Balancing the demands of\nopen-source coding, community efforts and top-tier research has been\ntricky, but the effort has been worthwhile.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Dinner at Christ&apos;s College\" src=\"https://anil.recoil.org/images/christs-dinner.webp\" title=\"Dinner at Christ&apos;s College\">\nDinner at Christ's College\n\n<img alt=\"Hacking to the clock for the NSDI deadline\" src=\"https://anil.recoil.org/images/nsdi-deadline.webp\" title=\"Hacking to the clock for the NSDI deadline\">\nHacking to the clock for the NSDI deadline\n\n<img alt=\"Dave enters the glass filled future\" src=\"https://anil.recoil.org/images/scotty.webp\" title=\"Dave enters the glass filled future\">\nDave enters the glass filled future</li>\n</ul>\n<p>Our research efforts are broadly unchanged <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html#Dec%202013\">from\n2013</a>\n(it takes time to craft good ideas!), and this will not be an exhaustive\nrecap. Instead, we\u2019ll summarise them here and point to our\n<a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/papers/index.html\">papers</a>\nthat describe the work in detail.</p>\n<ul>\n<li>\n<p>The <a href=\"http://openmirage.org\">MirageOS</a> really found its feet in\n2014, with a <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">summer 2.0\nrelease</a>\nand an extensive <a href=\"http://openmirage.org/blog/2014-in-review\">end-of-year\nrecap</a>. The most notable\nthing has been how well the MirageOS research work has melded with\nthe core OCaml Labs efforts, since much of it has been constructing\ngood quality OCaml libraries to plug holes in the ecosystem. 
It also\nserved to make us use OPAM on a day-to-day basis for our own work,\nthus creating an effective feedback loop between open-source and\nresearch.</p>\n</li>\n<li>\n<p>In the <a href=\"http://trilogy2.it.uc3m.es/\">Trilogy2</a> and\n<a href=\"http://usercentricnetworking.eu/\">UCN</a> EU projects, we built out\nMirageOS features such as the\n<a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu.pdf\">Jitsu</a> toolstack\nfor the \u201cjust-in-time\u201d summoning of unikernels in response to DNS\nrequests. This paper will be presented next month at USENIX\n<a href=\"https://www.usenix.org/conference/nsdi15/\">NSDI</a>. It also drove the\ndevelopment of the <a href=\"http://openmirage.org/blog/introducing-xen-minios-arm\">ARMv7\nport</a>, an\narchitecture for which OCaml has an excellent native code generator,\nas well as more experimental forays into <a href=\"http://arxiv.org/abs/1412.4638\">Bitcoin incentive\nschemes</a> for distributed systems.</p>\n</li>\n<li>\n<p>The <a href=\"http://irmin.io\">Irmin</a> Git-like branchable store created by\nThomas Gazagnaire matured, with Dave Scott\n<a href=\"https://www.youtube.com/watch?v=DSzvFwIVm5s\">prototyping</a> a complex\nport of the <a href=\"http://wiki.xen.org/wiki/XenStore\">XenStore</a> database\nto Irmin, thus letting us show off <a href=\"http://decks.openmirage.org/xendevsummit14#/\">debugging systems with\nGit</a>. We had a paper\non some early data structures accepted at\n<a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.pdf\">JFLA</a>, and\nThomas Leonard is building the JavaScript backend for running\nin-browser, while Yan Shvartzshnaider is experimenting with <a href=\"http://yansnotes.blogspot.co.uk/2015/01/work-summary-ocaml-labs.html\">graph\nprocessing</a>\nover the DAG representation for privacy-friendly queries. 
KC is\ninvestigating how to adapt his PLDI 2015 paper on\n<a href=\"http://kcsrk.info/papers/quelea_pldi15.pdf\">Quelea</a> to use\nIrmin as a backend as well.</p>\n</li>\n<li>\n<p>The <a href=\"https://github.com/ocamllabs/higher\">Higher</a> kinded\npolymorphism library written by Jeremy Yallop and Leo White was\npublished in <a href=\"http://www.lpw25.net/flops2014.pdf\">FLOPS 2014</a>,\nforming a basis for building more complex use-cases that need the\nflexibility of higher kinded types without requiring code to be\nfunctorised.</p>\n</li>\n</ul>\n<p>Our long-standing research into <a href=\"http://nymote.org\">personal online\nprivacy</a> led to our next system target that uses\nunikernels: the <a href=\"http://arxiv.org/abs/1501.04737\">Databox</a> paper\noutlines the architecture, and was covered in the\n<a href=\"http://www.theguardian.com/technology/2015/feb/01/control-personal-data-databox-end-user-agreement\">Guardian</a>\nnewspaper. Jon Crowcroft led the establishment of the Cambridge wing of\nthe <a href=\"http://www.mccrc.eu/about-us\">Microsoft Cloud Computing Research\nCenter</a> to consider the legal aspect of\nthings, and so we have made forays outside of technology into the\nimplications of <a href=\"http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-863.pdf\">region-specific\nclouds</a> as well.</p>\n<p>Some of the most exciting work done in the group as part of the\n<a href=\"http://rems.io\">REMS</a> and <a href=\"http://www.naas-project.org/\">NaaS</a> projects\ncame towards the end of 2014 and start of 2015, with multiple\nsubmissions going into top conferences. Unfortunately, due to most of\nthem being double-blind reviewed, we cannot link to the papers yet. 
Keep\nan eye on the blog and <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/papers/index.html\">published paper\nset</a>, or\nask us directly about what\u2019s been going on!</p>\n<h2><a href=\"https://anil.recoil.org/#priorities-for-2015\"></a>Priorities for 2015</h2>\n<p>As spring breaks and the weather (almost) becomes bearable again, we\u2019re\nsetting our work priorities for the remainder of the year.</p>\n<ul>\n<li>\n<p><strong>Tooling Cohesion</strong>: The entire core team is focussed on fusing\ntogether the individual tools created last year into\na cohesive OCaml Platform release that covers the lifecycle of\ndocumentation, testing and build. This is being managed by Amir\nChaudhry. OPAM remains at the heart of this strategy, and Louis\nGesbert and Thomas Gazagnaire have settled on the <a href=\"https://github.com/ocaml/opam/wiki/1.3-Roadmap\">OPAM 1.3\nroadmap</a>\n(<a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-February/000940.html\">summary</a>).</p>\n</li>\n<li>\n<p><strong>Multicore</strong>: <a href=\"https://anil.recoil.org/kcsrk.info\">KC Sivaramakrishnan</a> has joined the core\nOCaml Labs full-time to drive the multicore work into a publicly\ntestable form. 
Leo White recently departed after many productive\nyears in Cambridge to head into a career in industry (but still\nremains very much involved with OCaml development!).</p>\n</li>\n<li>\n<p><strong>Language Evolution</strong>: Jeremy Yallop continues to drive our efforts\non staged programming, modular implicits, and a macro system for\nOCaml, all of which are key features that make building complex,\nreliable systems more tractable than ever.</p>\n</li>\n</ul>\n<p>I\u2019d like to thank the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/people/index.html\">entire\nteam</a> and\nwider community for a wonderfully enjoyable 2014 and start of 2015, and\nam very grateful for the funding and support from Jane Street, Citrix,\nBritish Telecom, RCUK, EPSRC, DARPA and the EU FP7 that made it all\npossible. As always, please feel free to contact any of us directly with\nquestions, or reach out to me <a href=\"mailto:avsm2@cl.cam.ac.uk\">personally</a>\nwith any queries, concerns or bars of chocolate as encouragement.</p>",+"content": "<p>The <a href=\"https://anil.recoil.org/projects/ocamllabs\">OCaml Labs</a> initiative within the <a href=\"http://www.cl.cam.ac.uk\">Cambridge\nComputer Laboratory</a> is now just over two years\nold, and it is time for an update about our activities since the last\nupdate at the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html#Dec%202013\">end of\n2013</a>\nand\n<a href=\"https://anil.recoil.org/2012/10/19/announcing-ocaml-labs.html\">2012</a>.</p>\n<p>The theme of our group was not to be pure research, but rather a hybrid\ngroup that takes on some of the load of day-to-day OCaml maintenance\nfrom <a href=\"http://caml.inria.fr/\">INRIA</a>, as well as helping grow the wider\ncommunity and meeting our own research agendas around topics such as\n<a href=\"https://queue.acm.org/detail.cfm?id=2566628\">unikernels</a>. 
To this end,\nall of our projects have been highly collaborative, often involving\ncolleagues from <a href=\"http://ocamlpro.com\">OCamlPro</a>,\n<a href=\"http://caml.inria.fr/\">INRIA</a>, <a href=\"http://janestreet.com\">Jane Street</a>,\n<a href=\"http://lexifi.com\">Lexifi</a> and <a href=\"http://citrix.com\">Citrix</a>.</p>\n<p>This post covers our progress in tooling, the compiler and language,\ncommunity efforts, research projects, and concludes with our priorities\nfor 2015.</p>\n<h2><a href=\"https://anil.recoil.org/#r-tooling\"></a>\n<img alt=\"OCaml: it&apos;s a dog&apos;s life. In this case, Toru the dog.\" src=\"https://anil.recoil.org/images/toru-cucl-window.webp\" title=\"OCaml: it&apos;s a dog&apos;s life. In this case, Toru the dog.\">\nOCaml: it's a dog's life. In this case, Toru the dog.\nTooling</h2>\n<p>At the start of 2014, we had just helped to release <a href=\"http://opam.ocaml.org/blog/opam-1-1-1-released/\">OPAM\n1.1.1</a> with our\ncolleagues at <a href=\"http://ocamlpro.com\">OCamlPro</a>, and serious OCaml users\nhad just started moving over to using it.</p>\n<p>Our overall goal at OCaml Labs is to deliver a modular set of\ndevelopment tools around OCaml that we dub the <em>OCaml Platform</em>. The\nremainder of 2014 was thus spent polishing this nascent OPAM release\ninto a solid base (both as a command-line tool and as a library) that we\ncould use as the basis for documentation, testing and build\ninfrastructure, all the while making sure that bigger OCaml projects\ncontinued to migrate over to it. 
Things have been busy; here are the\nhighlights of this effort.</p>\n<h3><a href=\"https://anil.recoil.org/#opam\"></a>OPAM</h3>\n<p>The central <a href=\"https://github.com/ocaml/opam-repository\">OPAM repository</a>\nthat contains the package descriptions has grown tremendously in 2014,\nwith over 280 contributors committing almost 10000 changesets across\n3800 <a href=\"https://github.com/ocaml/opam-repository/pulls\">pull requests</a> on\nGitHub. The front line of incoming testing has been continuous\nintegration by the wonderful <a href=\"http://travis-ci.org/ocaml/opam-repository\">Travis\nCI</a>, who also granted us\naccess to their experimental <a href=\"http://docs.travis-ci.com/user/osx-ci-environment/\">MacOS\nX</a> build pool. The\nOPAM package team also expanded to give David Sheets, Jeremy Yallop,\nPeter Zotov and Damien Doligez commit rights, and they have all been\nbusily triaging new packages as they come in.</p>\n<p>Several large projects such as <a href=\"http://xapi-project.github.io/\">Xapi</a>,\n<a href=\"http://ocsigen.org\">Ocsigen</a> and our own\n<a href=\"http://openmirage.org\">MirageOS</a> switched over to using OPAM for\nday-to-day development, as well as prolific individual developers such\nas <a href=\"http://erratique.ch\">Daniel Buenzli</a> and <a href=\"http://ocaml.info/\">Markus\nMottl</a>. 
<a href=\"https://blogs.janestreet.com/category/ocaml/\">Jane\nStreet</a> continued to send\nregular <a href=\"https://github.com/ocaml/opam-repository/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Adiml+\">monthly\nupdates</a>\nof their Core/Async suite, and releases appeared from the\n<a href=\"https://github.com/ocaml/opam-repository/pull/3570\">Facebook</a>\nopen-source team as well (who develop\n<a href=\"https://code.facebook.com/posts/264544830379293/hack-a-new-programming-language-for-hhvm/\">Hack</a>,\n<a href=\"https://github.com/facebook/flow\">Flow</a> and\n<a href=\"https://github.com/facebook/pfff\">Pfff</a> in OCaml).</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Number of unique contributors to the central OPAM package repository\" src=\"https://anil.recoil.org/images/opam12-contributors-mar14.webp\" title=\"Number of unique contributors to the central OPAM package repository\">\nNumber of unique contributors to the central OPAM package repository\n\n<img alt=\"Total number of unique packages (including multiple versions of the same package)\" src=\"https://anil.recoil.org/images/opam12-packages-mar14.webp\" title=\"Total number of unique packages (including multiple versions of the same package)\">\nTotal number of unique packages (including multiple versions of the same package)\n\n<img alt=\"Total packages with multiple versions coalesced so you can see new package growth\" src=\"https://anil.recoil.org/images/opam12-unique-packages-mar14.webp\" title=\"Total packages with multiple versions coalesced so you can see new package growth\">\nTotal packages with multiple versions coalesced so you can see new package growth</li>\n</ul>\n<p>We used feedback from users to smooth away many of the rough edges,\nwith:</p>\n<ul>\n<li>a redesigned <a href=\"http://opam.ocaml.org/blog/opam-1-2-pin/\">development workflow</a> that lets developers quickly grab a development version of a library, recompile all dependent packages automatically, and quickly publish results to 
GitHub.</li>\n<li>binary distributions for common OS distributions via their <a href=\"https://github.com/ocaml/opam/wiki/Distributions\">native packaging</a>, as well as <a href=\"http://opam.ocaml.org/blog/0install-intro/\">0install</a> and <a href=\"https://github.com/mirage/mirage-vagrant-vms\">Vagrant boxes</a>.</li>\n<li>a unified way of cloning the source of any package via <code>opam source</code>. This handles any supported OPAM archive, including Git, Mercurial or Darcs remotes.</li>\n<li>richer package metadata, including source code, development archives and bug report URLs.</li>\n</ul>\n<p>These changes were all incorporated into <a href=\"http://opam.ocaml.org/blog/opam-1-2-0-release/\">OPAM 1.2</a>, along with backwards compatibility shims to keep the old 1.1 metadata format working until the migration is complete. The 1.2.x series has been a solid and usable development manager, and last week\u2019s release of <a href=\"http://opam.ocaml.org/blog/opam-1-2-1-release/\">OPAM 1.2.1</a> has further polished the core scripting engine.</p>\n<h4><a href=\"https://anil.recoil.org/#platform-blog\"></a>Platform Blog</h4>\n<p>One of the more notable developments during 2014 was the <a href=\"http://coq-blog.clarus.me/use-opam-for-coq.html\">adoption of\nOPAM</a> further up the\necosystem by the <a href=\"https://coq.inria.fr/\">Coq</a> theorem prover. This\nbroadening of the community prompted us to create an <a href=\"http://opam.ocaml.org\">official OPAM\nblog</a> to give us a central place for news and\ntips, and we\u2019ve had posts about\n<a href=\"http://opam.ocaml.org/blog/opam-in-xenserver/\">XenServer</a> developments,\nthe <a href=\"http://opam.ocaml.org/blog/turn-your-editor-into-an-ocaml-ide/\">Merlin IDE\ntool</a>\nand the modern <a href=\"http://opam.ocaml.org/blog/about-utop/\">UTop</a>\ninteractive REPL. 
If you are using OPAM in an interesting or production\ncapacity, please do <a href=\"https://github.com/ocaml/platform-blog/issues\">get in\ntouch</a> so that we can\nwork with you to write about it for the wider community.</p>\n<p>The goal of the blog is also to start bringing together the various\ncomponents that form the OCaml Platform. These are designed to be\nmodular tools (so that you can pick and choose which ones are necessary\nfor your particular use of OCaml). There are more details available from\nthe OCaml Workshop presentation at ICFP 2014\n(<a href=\"https://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">abstract</a>,\n<a href=\"https://ocaml.org/meetings/ocaml/2014/ocl-platform-2014-slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=jxhtpQ5nJHg&list=UUP9g4dLR7xt6KzCYntNqYcw\">video</a>).</p>\n<h4><a href=\"https://anil.recoil.org/#onboarding-new-users\"></a>Onboarding New Users</h4>\n<p>OPAM has now been adopted by <a href=\"http://harvard.edu\">several</a>\n<a href=\"http://cornell.edu\">big</a> <a href=\"http://princeton.edu\">universities</a>\n(including <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/\">us at\nCambridge</a>!) for\nundergraduate and graduate Computer Science courses. Demand\nincreased for an out-of-the-box solution that makes it as easy as possible\nfor new users to get started with minimum hassle. 
We created a\n<a href=\"http://lists.ocaml.org/listinfo/teaching\">dedicated teaching list</a> to\naid collaboration, and a list of <a href=\"http://ocaml.org/learn/teaching-ocaml.html\">teaching resources on\nocaml.org</a>, and supported\nseveral initiatives in collaboration with <a href=\"https://github.com/AltGr\">Louis\nGesbert</a> at OCamlPro (as usual with OPAM\ndevelopment).</p>\n<p>The easiest way to make things "just work" is via regular binary builds\nof the latest releases of OCaml and OPAM on Debian, Ubuntu, CentOS and\nFedora, via <a href=\"http://launchpad.net/~avsm\">Ubuntu PPAs</a> and the <a href=\"https://build.opensuse.org/package/show/home:ocaml/opam\">OpenSUSE\nBuild Service</a>\nrepositories. Our industrial collaborators from Citrix, <a href=\"http://jon.recoil.org\">Jon\nLudlam</a> and <a href=\"http://dave.recoil.org\">Dave Scott</a>\nbegan an <a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-January/000910.html\">upstreaming\ninitiative</a>\nto Fedora and sponsored the creation of a <a href=\"http://lists.centos.org/pipermail/centos-devel/2014-November/012375.html\">CentOS\nSIG</a>\nto ensure that binary packages remain up-to-date. We also contribute to\nthe hardworking packagers on MacOS X, Debian, FreeBSD, NetBSD and\nOpenBSD where possible to ensure that binary builds are well\nrounded out. 
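</p>
<p>For example, getting a current compiler and OPAM onto an Ubuntu machine of that era was a two-liner (the PPA name here is illustrative; see the Launchpad page above for the actual one):</p>

```shell
# Illustrative: add the PPA and install the binary packages.
sudo add-apt-repository ppa:avsm/ppa
sudo apt-get update && sudo apt-get install ocaml opam
```

<p>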
Richard Mortier also assembled <a href=\"https://github.com/mirage/mirage-vagrant-vms\">Vagrant\nboxes</a> that contain OCaml\nfor use with VirtualBox.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Louis cooks us dinner in Nice at our OPAM developer summit\" src=\"https://anil.recoil.org/images/opam-in-nice.webp\" title=\"Louis cooks us dinner in Nice at our OPAM developer summit\">\nLouis cooks us dinner in Nice at our OPAM developer summit</li>\n</ul>\n<p>Within OPAM itself, we applied polish to the handling of <a href=\"https://github.com/ocaml/opam-depext\">external\ndependencies</a> to automate checking\nthat the system libraries required by OPAM are present. Two emerging\ntools that should help further in 2015 are the\n<a href=\"https://github.com/OCamlPro/opam-user-setup\">opam-user-setup</a> and\n<a href=\"https://github.com/ocaml/opam/issues/1035\">OPAM-in-a-box</a> plugins that\nautomate first-time configuration. These last two are primarily\ndeveloped at OCamlPro, with design input and support from OCaml Labs.</p>\n<p>We do have a lot of work left to do with making the new user experience\nreally seamless, and help is <em>very</em> welcome from anyone who is\ninterested. It often helps to get the perspective of a newcomer to find\nout where the stumbling blocks are, and we value any such advice. Just\nmail <a href=\"mailto:opam-devel@lists.ocaml.org\">opam-devel@lists.ocaml.org</a>\nwith your thoughts, or <a href=\"https://github.com/ocaml/opam/issues\">create an\nissue</a> on how we can improve. A\nparticularly good example of such an initiative was started by Jordan\nWalke, who prototyped <a href=\"https://github.com/jordwalke/CommonML\">CommonML</a>\nwith a NodeJS-style development workflow, and <a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-February/000975.html\">wrote\nup</a>\nhis design document for the mailing list. 
(Your questions or ideas do\nnot need to be as well developed as Jordan\u2019s prototype!)</p>\n<h3><a href=\"https://anil.recoil.org/#testing-packages\"></a>Testing Packages</h3>\n<p>The public Travis CI testing does come with some limitations, since it\nonly checks that the latest package sets install, but not if any\ntransitive dependencies fail due to interface changes. It also doesn\u2019t\ntest all the optional dependency combinations due to the 50 minute time\nlimit.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/travis-mascot-200px.webp\" title=\"\">\n</p>\n<p>We expanded the OPAM repository testing in several ways to get around\nthis:</p>\n<ul>\n<li>\n<p><strong>Individual Repositories:</strong> Thomas Gazagnaire built <a href=\"http://opam.ocaml.org/blog/opam-1-2-travisci/\">centralised\nTravis scripts</a> that\ncan be used on any OCaml GitHub repository to easily test code\nbefore it is released into OPAM. These scripts are sourced from a\ncentral\n<a href=\"https://github.com/ocaml/ocaml-travisci-skeleton\">repository</a> and\nsupport external, optional and reverse dependency checking across\nmultiple revisions of the compiler. For instance, it just needs <a href=\"https://github.com/mirage/ocaml-cohttp/blob/master/.travis.yml\">one\nfile</a>\nto test all the supported permutations of the\n<a href=\"https://github.com/mirage/ocaml-cohttp\">CoHTTP</a> library.</p>\n</li>\n<li>\n<p><strong>Bulk Builds</strong>: Damien Doligez and I independently started doing\nlarge-scale bulk builds of the repository to ensure that a single\nsnapshot of the package repository can automatically build as many\npackages as possible. 
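</p>
<p>In outline, such a bulk build boils down to fanning out one isolated build per package and recording each result. A minimal driver might look like this (the package list and <code>RUNNER</code> command are placeholders; the real system launched a <code>docker run</code> per package):</p>

```shell
#!/bin/sh
# Sketch of a bulk-build driver. RUNNER is the per-package build
# command; the real version invoked `docker run ... opam install`.
RUNNER=${RUNNER:-"echo opam install -y"}   # placeholder runner
mkdir -p logs
for pkg in lwt cohttp irmin; do            # real list came from opam
  if $RUNNER "$pkg" > "logs/$pkg.log" 2>&1; then
    echo "$pkg OK"
  else
    echo "$pkg FAIL"
  fi
done
```

<p>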
My implementation used the\n<a href=\"http://docker.com\">Docker</a> container manager to spawn off thousands of\npackage builds in parallel and commit the results into a filesystem.\nThis required building a <a href=\"http://avsm.github.io/ocaml-dockerfile\">Dockerfile\neDSL</a>, and the results are\nnow online at\n<a href=\"https://opam.ocaml.org/builds\">https://opam.ocaml.org/builds</a>.</p>\n</li>\n<li>\n<p><strong>OCamlot</strong>: An ongoing piece of infrastructure work is to take the\nbulk build logs (which are around 7GB per daily run), and to store\nand render them using our <a href=\"http://irmin.io\">Irmin</a> Git store. Expect\nto see more around this soon; it has the awesome feature of letting\nany developer clone the build logs for their project locally, to\nmake triage of foreign operating systems as simple as possible.</p>\n</li>\n</ul>\n<h4><a href=\"https://anil.recoil.org/#language-evolution\"></a>Language Evolution</h4>\n<p>This ability to do unattended builds of the package repository has also\nimproved the decision-making process within the core compiler team.\nSince we now have a large (3000+ package) corpus of OCaml code, it\nbecame a regular occurrence in the 4.02 development cycle to \u201c<a href=\"https://anil.recoil.org/2014/04/08/grepping-every-known-ocaml-package-source.html\">ask\nOPAM</a>\u201d\nwhether a particular feature or new syntax would break any existing\ncode. This in turn provides an incentive for commercial users to provide\nrepresentative samples of their code; for instance, the Jane Street Core\nreleases in OPAM (with their very modular style) act as an open-source\ncanary without needing access to any closed source code.</p>\n<p>One good example in 2014 was the decoupling of the\n<a href=\"http://en.wikipedia.org/wiki/Camlp4\">Camlp4</a> macro preprocessor from\nthe main OCaml distribution. 
Since Camlp4 has been used for over a\ndecade and there are some very commonly used syntax extensions such as\n<a href=\"https://github.com/janestreet/type_conv\">type_conv</a>, a simple removal\nwould break a lot of packages. We used OPAM to perform a gradual\nmovement that most users hopefully never noticed by the time OCaml 4.02\nwas released. First, we added a <a href=\"https://github.com/ocaml/opam-repository/pull/2558\">dummy\npackage</a> in OPAM for\nearlier versions of the compiler that had Camlp4 built-in, and then used\nthe OPAM constraint engine to compile it as an external tool for the\nnewer compiler revisions. Then we just had to triage the bulk build logs\nto find build failures from packages that were missing a Camlp4\ndependency, and <a href=\"https://github.com/ocaml/opam-repository/pulls?utf8=%E2%9C%93&q=camlp4+requires+is%3Apr+\">add\nthem</a>\nto the package metadata.</p>\n<h4><a href=\"https://anil.recoil.org/#github-integration\"></a>GitHub Integration</h4>\n<p>An interesting\n<a href=\"https://twitter.com/vincenthz/status/563108158907097089\">comment</a> from\nVincent Hanquez about OPAM is that "OCaml's OPAM is a post-GitHub\ndesign". This is very true, as much of the workflow for pinning <code>git://</code>\nURLs emerged out of being early adopters of GitHub for hosting the\nMirageOS. OCaml Labs supported two pieces of infrastructure integration\naround GitHub in 2014:</p>\n<ul>\n<li>\n<p>OPAM has a compiler switch feature that lets you run simultaneous\nOCaml installations and swap between them easily. 
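</p>
<p>From the command line, switch usage looks like this (version numbers illustrative; under OPAM 1.2 a switch is built on first use):</p>

```shell
opam switch 4.02.1                    # build and select a second compiler
eval `opam config env`                # point PATH etc. at the new switch
opam switch 4.02.0+modular-implicits  # e.g. jump onto a feature branch
opam switch list                      # show all installed switches
```

<p>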
I used my <a href=\"https://github.com/avsm/ocaml-github\">GitHub\nAPI bindings</a> to regularly\nconvert every GitHub pull request into a custom compiler\nswitch (see <a href=\"https://anil.recoil.org/notes/ocaml-github-and-opam\">Easily OPAM switching to any OCaml feature request</a>).\nThis lets users reporting bugs try out a patched compiler almost\nimmediately upon a fix becoming available.</p>\n</li>\n<li>\n<p>The motivation behind this feature was our collaborator Gabriel\nScherer\u2019s\n<a href=\"http://gallium.inria.fr/blog/patch-review-on-github/\">experiment</a>\nto enable patch review of OCaml on GitHub, alongside the venerable\n<a href=\"http://caml.inria.fr/mantis/view_all_bug_page.php\">Mantis bug\ntracker</a>. We\nsupported this via adding Travis CI support to the main compiler,\nand also helped to migrate a number of support libraries to GitHub,\nsuch as <a href=\"https://github.com/ocaml/camlp4\">camlp4</a>. These can all be\nfound on the <a href=\"https://github.com/ocaml\">ocaml</a> organisation on\nGitHub.</p>\n</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#codoc-documentation\"></a>Codoc Documentation</h3>\n<p>Leo White, David Sheets, Amir Chaudhry and Thomas Gazagnaire led the\ncharge to build a modern documentation generator for OCaml, and\n<a href=\"http://lists.ocaml.org/pipermail/platform/2015-February/000539.html\">published</a>\nan <em>alpha</em> version of <a href=\"https://github.com/dsheets/codoc\">codoc 0.2.0</a>\nafter a lot of work throughout 2014. In the 2014 OCaml workshop\npresentation\n(<a href=\"http://ocaml.org/meetings/ocaml/2014/ocaml2014_7.pdf\">abstract</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2014/ocl-platform-2014-slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=jxhtpQ5nJHg&list=UUP9g4dLR7xt6KzCYntNqYcw\">video</a>),\nwe mentioned the \u201cmodule wall\u201d for documentation and this attempts to\nfix it. 
To try it out, simply follow the directions in the README on\nthat repository, or <a href=\"http://dsheets.github.io/codoc\">browse some\nsamples</a> of the current, default output\nof the tool. Please do bear in mind codoc and its constituent libraries\nare still under heavy development and are <em>not</em> feature complete, but\nwe\u2019re gathering <a href=\"https://github.com/dsheets/codoc/issues\">feedback</a> from\nearly adopters.</p>\n<p>codoc's aim is to provide a widely useful set of tools for generating\nOCaml documentation. In particular, we are striving to:</p>\n<ol>\n<li>Cover all of OCaml\u2019s language features</li>\n<li>Provide accurate name resolution and linking</li>\n<li>Support cross-linking between different packages</li>\n<li>Expose interfaces to the components we\u2019ve used to build <code>codoc</code></li>\n<li>Provide a magic-free command-line interface to the tool itself</li>\n<li>Reduce external dependencies and default integration with other\ntools</li>\n</ol>\n<p>We haven\u2019t yet achieved all of these at all levels of our tool stack but\nare getting close, and the patches are all under discussion for\nintegration into the mainstream OCaml compiler. <code>codoc</code> 0.2.0 is usable\ntoday (if a little rough in some areas like default CSS), and there is a\n<a href=\"http://opam.ocaml.org/blog/codoc-0-2-0-released/\">blog post</a> that\noutlines the architecture of the new system to make it easier to\nunderstand the design decisions that went into it.</p>\n<h3><a href=\"https://anil.recoil.org/#community-governance\"></a>Community Governance</h3>\n<p>As the amount of infrastructure built around the\n<a href=\"http://ocaml.org\">ocaml.org</a> domain grows (e.g. 
mailing lists, file\nhosting, bulk building), it is important to establish a governance\nframework to ensure that it is used in the way that best serves the wider\nOCaml community.</p>\n<p>Amir Chaudhry took a good look at how other language communities\norganise themselves, and began putting together a succinct <a href=\"http://amirchaudhry.com/towards-governance-framework-for-ocamlorg/\">governance\nframework</a>\nto capture how the community around <code>ocaml.org</code> operates, and how to\nquickly resolve any conflicts that may arise in the future. He took care\nto ensure it has a well-defined scope, is simple and self-contained, and\n(crucially) documents the current reality. The result of this work is\ncirculating privately through all the existing volunteers for a first\nround of feedback, and will go live in the next few months as a living\ndocument that explains how our community operates.</p>\n<h3><a href=\"https://anil.recoil.org/#assemblage\"></a>Assemblage</h3>\n<p>One consequence of OCaml\u2019s age (close to twenty years old now) is that\nthe tools built around the compiler have evolved fairly independently.\nWhile OPAM now handles the high-level package management, there is quite\na complex ecosystem of other components that are difficult for new users\nto get to grips with: <a href=\"http://github.com/ocaml/oasis\">OASIS</a>,\n<a href=\"http://projects.camlcity.org/projects/findlib.html\">ocamlfind</a>,\n<a href=\"https://ocaml.org/learn/tutorials/ocamlbuild/\">ocamlbuild</a>, and\n<a href=\"https://github.com/the-lambda-church/merlin\">Merlin</a> to name a few.\nEach of these components (while individually stable) has its own\nmetadata and namespace formats, further compounding the lack of cohesion\nof the tools.</p>\n<p>Thomas Gazagnaire and Daniel Buenzli embarked on an effort to build an\neDSL that unifies OCaml package descriptions, with the short-term aim of\ngenerating the support files required by the various support tools, and\nthe long-term 
goal of being the integration point for the build, test\nand documentation generation lifecycle of an OCaml/OPAM package. This\nprototype, dubbed <a href=\"https://github.com/samoht/assemblage\">Assemblage</a> has\ngone through several iterations and <a href=\"https://github.com/samoht/assemblage/labels/design\">design\ndiscussions</a> over\nthe summer of 2014. Daniel has since been splitting out portions of it\ninto the <a href=\"http://erratique.ch/software/bos\">Bos</a> OS interaction library.</p>\n<p>Assemblage is not released officially yet, but we are committed to\nresuming work on it this summer when Daniel visits again, with the\nintention of unifying much of our workflow through this tool. If you are\ninterested in build and packaging systems, now is the time to <a href=\"https://github.com/samoht/assemblage\">make your\nopinion known</a>!</p>\n<h2><a href=\"https://anil.recoil.org/#core-compiler\"></a>Core Compiler</h2>\n<p>We also spent time in 2014 working on the core OCaml language and\ncompiler, with our work primarily led by Jeremy Yallop and Leo White.\nThese efforts were not looking to make any radical changes in the core\nlanguage; instead, we generally opted for evolutionary changes that\neither polish rough edges in the language (such as open type and handler\ncases), or new features that fit into the ML style of building programs.</p>\n<h3><a href=\"https://anil.recoil.org/#new-features-in-4020\"></a>New Features in 4.02.0</h3>\n<p>The OCaml 4.02 series was primarily developed and\n<a href=\"https://ocaml.org/releases/4.02.html\">released</a> in 2014. 
The\n<a href=\"http://caml.inria.fr/pub/distrib/ocaml-4.02/notes/Changes\">ChangeLog</a>\ngenerated much <a href=\"https://blogs.janestreet.com/ocaml-4-02-everything-else/\">user\nexcitement</a>,\nand we were also pleased to have contributed several language\nimprovements.</p>\n<h4><a href=\"https://anil.recoil.org/#handler-cases-and-exceptional-syntax\"></a>Handler Cases and exceptional syntax</h4>\n<p>OCaml\u2019s <code>try</code> and <code>match</code> constructs are good at dealing with exceptions\nand values respectively, but neither construct can handle both values\nand exceptions. Jeremy Yallop investigated <a href=\"http://ocamllabs.github.io/compiler-hacking/2014/02/04/handler-case.html#match-exception\">how to handle\nsuccess</a>\nmore elegantly, and a unified syntax emerged. A simple example\nis that of a stream iterator that uses exceptions for control flow:</p>\n<pre><code>let rec iter_stream f s =\n match (try Some (MyStream.get s) with End_of_stream -> None) with\n | None -> ()\n | Some (x, s') -> f x; iter_stream f s'\n</code></pre>\n<p>This code is not only verbose, but it also has to allocate an <code>option</code>\nvalue to ensure that the <code>iter_stream</code> call remains tail recursive. The\nnew syntax in OCaml 4.02 allows the above to be rewritten succinctly:</p>\n<pre><code>let rec iter_stream f s =\n match MyStream.get s with\n | (x, s') -> f x; iter_stream f s'\n | exception End_of_stream -> ()\n</code></pre>\n<p>Read more about the background of this feature in Jeremy\u2019s <a href=\"http://ocamllabs.github.io/compiler-hacking/2014/02/04/handler-case.html#match-exception\">blog\npost</a>,\nthe associated discussion in the <a href=\"http://caml.inria.fr/mantis/view.php?id=6318\">upstream Mantis\nbug</a>, and the final\n<a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#sec245\">manual\npage</a> in\nthe OCaml 4.02 release. 
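</p>
<p>The pattern is easy to experiment with in a self-contained form; here is a hypothetical list-backed "stream" (all names ours, not from the post) summed with the 4.02 syntax:</p>

```ocaml
(* Self-contained variant of the example above: a list-backed
   stream whose get raises on exhaustion. *)
exception End_of_stream

let get = function
  | [] -> raise End_of_stream
  | x :: rest -> (x, rest)

(* Tail-recursive, with no intermediate option allocation. *)
let rec sum_stream acc s =
  match get s with
  | (x, s') -> sum_stream (acc + x) s'
  | exception End_of_stream -> acc

let () = assert (sum_stream 0 [1; 2; 3; 4] = 10)
```

<p>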
For an example of its use in a real library, see\nthe Jane Street\n<a href=\"https://github.com/janestreet/sexplib/blob/1bd69553/lib/conv.ml#L213-L215\">usage</a>\nin the <a href=\"https://github.com/janestreet/sexplib\">s-expression</a> handling\nlibrary (which they use widely to reify arbitrary OCaml values and\nexceptions).</p>\n<h4><a href=\"https://anil.recoil.org/#open-extensible-types\"></a>Open Extensible Types</h4>\n<p>A long-standing trick to build <a href=\"https://blogs.janestreet.com/rethinking-univ/\">universal\ncontainers</a> in OCaml has\nbeen to encode them using the exception <code>exn</code> type. There is a similar\nconcept of a <a href=\"http://mlton.org/UniversalType\">universal type</a> in\nStandard ML, and they were described in the \u201c<a href=\"http://www.andres-loeh.de/OpenDatatypes.pdf\">Open Data Types and Open\nFunctions</a>\u201d paper by Andres\nL\u00f6h and Ralf Hinze in 2006.</p>\n<p>Leo White designed, implemented and upstreamed support for <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/extn.html#sec246\">extensible\nvariant\ntypes</a> in\nOCaml 4.02. Extensible variant types are variant types that can be\nextended with new variant constructors. They can be defined as follows:</p>\n<pre><code>type attr = ..\n\ntype attr += Str of string\n\ntype attr +=\n | Int of int\n | Float of float\n</code></pre>\n<p>Pattern matching on an extensible variant type requires a default case\nto handle unknown variant constructors, just as is required for pattern\nmatching on exceptions (extensible types use the exception memory\nrepresentation at runtime).</p>\n<p>With this feature added, the OCaml <code>exn</code> type simply becomes a special\ncase of open extensible types. 
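</p>
<p>A complete toy example shows the mandatory default case in action (the <code>to_string</code> function is ours, for illustration):</p>

```ocaml
type attr = ..

type attr += Str of string
type attr += Int of int

(* The wildcard case is required: other modules may extend
   attr with new constructors at any time. *)
let to_string = function
  | Str s -> s
  | Int i -> string_of_int i
  | _ -> "<unknown>"

let () =
  assert (to_string (Int 42) = "42");
  assert (to_string (Str "x") = "x")
```

<p>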
Exception constructors can be declared\nusing the type extension syntax:</p>\n<pre><code> type exn += Exc of int\n</code></pre>\n<p>You can read more about the discussion behind open extensible types in\nthe upstream <a href=\"http://caml.inria.fr/mantis/view.php?id=5584\">Mantis bug</a>.\nIf you\u2019d like to see another example of their use, they have been\nadopted by the latest releases of the Jane Street Core libraries in the\n<a href=\"https://github.com/janestreet/core_kernel/blob/43ee3eef/lib/type_equal.ml#L64\">Type_equal</a>\nmodule.</p>\n<h3><a href=\"https://anil.recoil.org/#modular-implicits\"></a>Modular Implicits</h3>\n<p>A common criticism of OCaml is its lack of support for ad-hoc\npolymorphism. The classic example of this is OCaml\u2019s separate addition\noperators for integers (<code>+</code>) and floating-point numbers (<code>+.</code>). Another\nexample is the need for type-specific printing functions (<code>print_int</code>,\n<code>print_string</code>, etc.) rather than a single <code>print</code> function which works\nacross multiple types.</p>\n<p>Taking inspiration from Scala\u2019s\n<a href=\"http://docs.scala-lang.org/tutorials/tour/implicit-parameters.html\">implicits</a>\nand <a href=\"http://www.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf\">Modular Type\nClasses</a> by\nDreyer <em>et al.</em>, Leo White designed a system for ad-hoc polymorphism in\nOCaml based on using modules as type-directed implicit parameters. 
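</p>
<p>In the experimental syntax (which only compiles under the <code>4.02.0+modular-implicits</code> switch described below), an implicit module parameter looks like this; the <code>Show</code> example is the canonical one from the paper:</p>

```ocaml
(* Requires the experimental modular-implicits compiler;
   this is a sketch, not standard OCaml. *)
module type Show = sig
  type t
  val show : t -> string
end

(* S is an implicit module parameter, resolved by type. *)
let show {S : Show} (x : S.t) = S.show x

implicit module Show_int = struct
  type t = int
  let show = string_of_int
end

(* show 3 now type-checks: Show_int is selected implicitly. *)
```

<p>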
The\ndesign not only supports implicit modules, but also implicit functors\n(that is, modules parameterised by other module types) to permit the\nexpression of generic modular implicits in exactly the same way that\nfunctors are used to build abstract data structures.</p>\n<p>Frederic Bour joined us as a summer intern and dove straight into the\nimplementation, resulting in an <a href=\"http://andrewray.github.io/iocamljs/modimp_show.html\">online\ndemo</a> and ML\nWorkshop presentation\n(<a href=\"https://sites.google.com/site/mlworkshoppe/modular-implicits.pdf?attredirects=0\">abstract</a>,\n<a href=\"https://www.youtube.com/watch?v=3wVUXTd4WNc\">video</a> and\n<a href=\"http://www.lpw25.net/ml2014.pdf\">paper</a>). Another innovation in how\nwe\u2019ve been trialling this feature is the use of Andy Ray\u2019s\n<a href=\"https://andrewray.github.io/iocamljs/\">IOCamlJS</a> to publish an\ninteractive, online notebook that is fully hosted in the browser. You\ncan follow the examples of modular implicits\n<a href=\"https://andrewray.github.io/iocamljs/modimp_show.html\">online</a>, or try\nthem out on your own computer via an OPAM switch:</p>\n<pre><code>opam switch 4.02.0+modular-implicits\neval `opam config env`\nopam install utop \nutop\n</code></pre>\n<p>Some of the early feedback on modular implicits from industrial users\nwas interesting. Jane Street commented that although this would be a big\nusability leap, it would be dangerous to lose control over exactly what\ngoes into the implicit environment (i.e. the programmer should always\nknow what <code>(a + b)</code> represents by locally reasoning about the code). 
The\ncurrent design thus follows the ML discipline of maintaining explicit\ncontrol over the namespace, with any ambiguities in resolving an\nimplicit module type resulting in a type error.</p>\n<h3><a href=\"https://anil.recoil.org/#multicore\"></a>Multicore</h3>\n<p>In addition to ad-hoc polymorphism, support for parallel execution on\nmulticore CPUs is undoubtedly the most common feature request for OCaml.\nThis has been high on our list after improving tooling support, and\nStephen Dolan and Leo White made solid progress in 2014 on the core\nruntime plumbing required.</p>\n<p>Stephen initially added <a href=\"https://github.com/stedolan/ocaml\">thread-local\nsupport</a> to the OCaml compiler. This\ndesign avoided the need to make the entire OCaml runtime preemptive (and\nthus a huge patch) by allocating thread-local state per core.</p>\n<p>We are now deep into the design and implementation of the programming\nabstractions built over these low-level primitives. One exciting aspect\nof our implementation is that much of the scheduling logic for multicore\nOCaml can be written in (single-threaded) OCaml, making the design very\nflexible with respect to <a href=\"http://kcsrk.info/papers/mmscc_marc12.pdf\">heterogeneous\nhardware</a> and <a href=\"http://fable.io\">variable IPC\nperformance</a>.</p>\n<p>To get feedback on the overall design of multicore OCaml, we presented\nat OCaml 2014\n(<a href=\"http://www.cl.cam.ac.uk/~sd601/papers/multicore_slides.pdf\">slides</a>,\n<a href=\"https://www.youtube.com/watch?v=FzmQTC_X5R4\">video</a> and\n<a href=\"https://ocaml.org/meetings/ocaml/2014/ocaml2014_1.pdf\">abstract</a>), and\nStephen visited INRIA to consult with the development team and Arthur\nChargueraud (the author of\n<a href=\"http://www.chargueraud.org/softs/pasl/\">PASL</a>). Towards the end of the\nyear, <a href=\"http://kcsrk.info/\">KC Sivaramakrishnan</a> finished his PhD studies\nat Purdue and joined our OCaml Labs group. 
He is the author of\n<a href=\"http://multimlton.cs.purdue.edu/mML/Welcome.html\">MultiMlton</a>, and is\nnow driving the completion of the OCaml multicore work along with\nStephen Dolan, Leo White and Mark Shinwell. Stay tuned for updates from\nus when there is more to show later this year!</p>\n<h3><a href=\"https://anil.recoil.org/#ctypes-a-modular-foreign-function-interface\"></a>Ctypes: a Modular Foreign Function Interface</h3>\n<p>The <a href=\"https://github.com/ocamllabs/ocaml-ctypes\">Ctypes</a> library started\nas an experiment with GADTs by Jeremy Yallop, and has since ballooned into\na robust, comprehensive library for safely interacting with the OCaml\nforeign function interface. The first release came out in time to be\nincluded in <a href=\"https://realworldocaml.org/v1/en/html/foreign-function-interface.html\">Real World\nOCaml</a>\nin lieu of the low-level FFI (which I was not particularly enamoured\nwith having to explain in a tight page limit).</p>\n<p>Throughout 2014, Jeremy expanded support for a number of features\nrequested by users (both industrial and academic) who adopted the\nlibrary in preference to manually writing C code to interface with the\nruntime, and issued several updated\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes/releases\">releases</a>.</p>\n<h4><a href=\"https://anil.recoil.org/#c-stub-generation\"></a>C Stub Generation</h4>\n<p>The first release of Ctypes required the use of\n<a href=\"https://sourceware.org/libffi/\">libffi</a> to dynamically load shared\nlibraries and dynamically construct function call stack frames whenever\na foreign function is called. While this works for simple libraries, it\ncannot cover <em>all</em> use cases, since interfacing with C demands an\nunderstanding of <code>struct</code> memory layout, C preprocessor macros, and\nother platform-dependent quirks which are more easily dealt with by\ninvoking a C compiler. 
Finally, the performance of a <code>libffi</code>-based API\nwill necessarily be slower than that of direct C stub code.</p>\n<p>While many other language FFIs provide separate libraries for dynamic\nand static binding, we decided to have a go at building a\n<em>modular</em> version of Ctypes that could handle both cases from a single\ndescription of the foreign function interface. The result (dubbed\n\u201cCmeleon\u201d) remained surprisingly succinct and usable, and now covers\nalmost every use of the OCaml foreign function interface. We submitted a\npaper to <a href=\"http://icfpconference.org/2015\">ICFP 2015</a> dubbed \u201c<a href=\"https://anil.recoil.org/papers/drafts/2015-cmeleon-icfp-draft1.pdf\">A modular\nforeign function\ninterface</a>\u201d\nthat describes it in detail. Here is a highlight of how simple a generic\nbinding looks:</p>\n<pre><code>module Bindings(F : FOREIGN) = struct\n open F\n let gettimeofday = foreign "gettimeofday"\n (ptr timeval @-> ptr timezone @-> returning int)\nend\n</code></pre>\n<p>The <code>FOREIGN</code> module type completely abstracts the details of whether\ndynamic or static binding is used, and handles C complexities such\nas computing the struct layout on the local machine architecture.</p>\n<h4><a href=\"https://anil.recoil.org/#inverse-stubs\"></a>Inverse Stubs</h4>\n<p>The other nice result from functorising the foreign function interface\nemerged when we tried to <em>invert</em> the FFI and serve a C interface from\nOCaml code (for example, by compiling the OCaml code as a <a href=\"http://caml.inria.fr/pub/docs/manual-ocaml/intfc.html\">shared\nlibrary</a>). 
This\nwould let us begin swapping out C libraries that we <a href=\"http://openssl.org\">don\u2019t\ntrust</a> with <a href=\"https://github.com/mirage/ocaml-tls\">safer\nequivalents</a> written in OCaml.</p>\n<p>You can see an\n<a href=\"https://github.com/yallop/ocaml-ctypes-inverted-stubs-example\">example</a>\nof how inverted stubs work via a simple C XML parsing API exposed from the\n<a href=\"http://erratique.ch/software/xmlm\">Xmlm</a> library. We can define a C\n<code>struct</code> by:</p>\n<pre><code>(* Define a struct of callbacks (C function pointers) *)\nlet handlers : [`handlers] structure typ = structure "handlers"\nlet (--) s f = field handlers s (funptr f)\n let on_data = "on_data" -- (string @-> returning void)\n let on_start_tag = "on_start_tag" -- (string @-> string @-> returning void)\n let on_end_tag = "on_end_tag" -- (void @-> returning void)\n let on_dtd = "on_dtd" -- (string @-> returning void) \n let on_error = "on_error" -- (int @-> int @-> string @-> returning void)\nlet () = seal handlers\n</code></pre>\n<p>and then expose this via C functions:</p>\n<pre><code>module Stubs(I : Cstubs_inverted.INTERNAL) = struct\n (* Expose the type 'struct handlers' to C. *)\n let () = I.structure handlers\n\n (* We expose just a single function to C. The first argument is a (pointer\n to a) struct of callbacks, and the second argument is a string\n representing a filename to parse. 
*)\n let () = I.internal "parse_xml" \n (ptr handlers @-> string @-> returning void) parse\nend\n</code></pre>\n<p>You can find the full source code to these snippets on the\n<a href=\"https://github.com/yallop/ocaml-ctypes-inverted-stubs-example\">ocaml-ctypes-inverted-stubs-example</a>\nrepository on GitHub.</p>\n<p>We\u2019ll be exploring this aspect of Ctypes further in 2015 for SSL/TLS\nwith David Kaloper and Hannes Mehnert, and Microsoft Research has\ngenerously funded a <a href=\"http://research.microsoft.com/en-us/collaboration/global/phd_projects2015.aspx\">PhD\nstudentship</a>\nto facilitate the work.</p>\n<h4><a href=\"https://anil.recoil.org/#community-contributions\"></a>Community Contributions</h4>\n<p>Ctypes benefited enormously from several external contributions from the\nOCaml community. From a portability perspective, A. Hauptmann\ncontributed <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/190\">Windows\nsupport</a>, and Thomas\nLeonard added <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/231\">Xen\nsupport</a> to allow\nCtypes bindings to work with <a href=\"http://openmirage.org\">MirageOS\nunikernels</a> (which opens up the intriguing\npossibility of accessing shared libraries across virtual machine\nboundaries in the future). 
C language support was fleshed out by Edwin\nTorok contributing <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/238\">typedef\nsupport</a>, Ramkumar\nRamachandra adding <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/220\">C99\nbools</a> and Peter\nZotov integrating <a href=\"https://github.com/ocamllabs/ocaml-ctypes/pull/143\">native\nstrings</a>.</p>\n<p>The winner of \u201cmost enthusiastic use of OCaml Labs code\u201d goes to <a href=\"https://github.com/braibant\">Thomas\nBraibant</a> of\n<a href=\"http://cryptosense.com/the-team/\">Cryptosense</a>, who used <em>every</em>\nfeature of the Ctypes library (multi-threaded, inverted, staged\nand marshalled bindings) in their effort to <a href=\"http://www.economist.com/news/science-and-technology/21647269-automating-search-loopholes-software-hacking-hackers\">hack the\nhackers</a>.\nDavid Sheets comes a close second with his implementation of the <a href=\"https://github.com/dsheets/profuse\">FUSE\nbinary protocol</a>, parameterised by\nversion quirks.</p>\n<p>If you\u2019re using Ctypes, we would love to hear about your particular use.\nA search on GitHub and OPAM reveals over 20 projects using it already,\nincluding industrial use at <a href=\"http://cryptosense.com\">Cryptosense</a> and\n<a href=\"http://ocaml.janestreet.com\">Jane Street</a>, and ports to Windows, *BSD,\nMacOS X and even iPhone and Android. 
There\u2019s a <a href=\"https://github.com/ocamllabs/ocaml-ctypes/wiki\">getting\nstarted</a> guide, and a\n<a href=\"http://lists.ocaml.org/listinfo/ctypes\">mailing list</a> available.</p>\n<h2><a href=\"https://anil.recoil.org/#community-and-teaching-efforts\"></a>Community and Teaching Efforts</h2>\n<p>In addition to the online community building, we also participated in a\nnumber of conferences and face-to-face events to promote education about\nfunctional programming.</p>\n<h3><a href=\"https://anil.recoil.org/#conferences-and-talks\"></a>Conferences and Talks</h3>\n<ul>\n<li>Gallery\n\n<img alt=\"Anil speaking at QCon on unikernels\" src=\"https://anil.recoil.org/images/qcon-unikernel-talk.webp\" title=\"Anil speaking at QCon on unikernels\">\nAnil speaking at QCon on unikernels</li>\n</ul>\n<p>There has been a huge growth in the number of quality conferences in\nrecent years, making it tough to choose which ones to attend.\n<a href=\"http://icfpconference.org\">ICFP</a> is the academic meeting point that\npredates most of them, and we <a href=\"https://anil.recoil.org/2014/08/31/ocaml-labs-at-icfp-2014.html\">participated\nextensively</a>\nin 2014 via talks, tutorials and a\n<a href=\"https://www.youtube.com/watch?v=UEIHfXLMtwA\">keynote</a> at the Haskell\nSymposium.<br>\nI also served on the <a href=\"http://icfpconference.org/icfp2014/\">program\ncommittee</a> and as <a href=\"https://anil.recoil.org/2015/02/18/icfp15-call-for-sponsorships.html\">industrial\nrelations\nchair</a>,\nand took over as the steering committee chair of\n<a href=\"http://cufp.org\">CUFP</a>. 
Jeremy Yallop, Thomas Gazagnaire and Leo White\nall served on workshop program committees, with Jeremy also chairing\nthis year\u2019s ML Workshop.</p>\n<p>Outside academia, we participated in a number of\nindustry conferences such as <a href=\"https://qconsf.com/\">QCon</a>,\n<a href=\"http://oscon.com\">OSCON</a>, <a href=\"http://ccc.de\">CCC</a>, <a href=\"https://operatingsystems.io/\">New Directions in\nOS</a>,\n<a href=\"http://functionalconf.com\">FunctionalConf</a>,\n<a href=\"https://skillsmatter.com/conferences/1819-functional-programming-exchange\">FPX</a>\nand <a href=\"https://fosdem.org/2014/\">FOSDEM</a>. The vast majority of these talks\nwere about MirageOS, and slides can be found at\n<a href=\"http://decks.openmirage.org\">decks.openmirage.org</a>.</p>\n<h4><a href=\"https://anil.recoil.org/#the-2048-browser-game\"></a>The 2048 Browser Game</h4>\n<p>Yaron Minsky and I have run OCaml tutorials for ICFP for\n<a href=\"http://cufp.org/2011/t3-building-functional-os.html\">a</a>\n<a href=\"http://cufp.org/2013/t2-yaron-minsky-anil-madhavapeddy-ocaml-tutorial.html\">few</a>\n<a href=\"http://cufp.org/2012/t1-real-world-ocaml-anil-madhavapeddy-university-c.html\">years</a>,\nand we finally hung up our boots in favour of a new crowd.</p>\n<p>Jeremy Yallop and Leo White stepped up to the mark with their ICFP/CUFP\n2014 <a href=\"http://cufp.org/2014/t7-leo-white-introduction-to-ocaml.html\">Introduction to\nOCaml</a>\ntutorial, which had the additional twist of being taught entirely in a\nweb browser using\n<a href=\"http://ocsigen.org/js_of_ocaml\">js_of_ocaml</a> and\n<a href=\"http://andrewray.github.io/iocamljs/\">IOCamlJS</a>. They decided that a\ngood practical target was the popular\n<a href=\"http://gabrielecirulli.github.io/2048/\">2048</a> game that has wasted many\nprogrammer hours here at OCaml Labs. 
They <a href=\"https://github.com/ocamllabs/2048-tutorial\">hacked on\nit</a> over the summertime,\nassisted by our visitor Daniel Buenzli who also released useful\nlibraries such as <a href=\"http://erratique.ch/software/vg\">Vg</a>,\n<a href=\"http://erratique.ch/software/react\">React</a>,\n<a href=\"http://erratique.ch/software/useri\">Useri</a>, and\n<a href=\"http://erratique.ch/software/gg\">Gg</a>.</p>\n<p>The end result is satisfyingly <a href=\"http://ocamllabs.github.io/2048-tutorial/\">playable\nonline</a>, with the source code\navailable at\n<a href=\"https://github.com/ocamllabs/2048-tutorial\">ocamllabs/2048-tutorial</a>.</p>\n<p>Thomas Gazagnaire got invited to Bangalore for <a href=\"http://functionalconf.com/\">Functional\nConf</a> later in the year, and he extended the\n<a href=\"http://gazagnaire.org/fuconf14/\">interactive tutorial notebook</a> and\nalso ran an OCaml tutorial to a packed room. We were very happy to\nsupport the first functional programming conference in India, and hope\nto see many more such events spring up! 
Amir Chaudhry then went to\nBelgium to <a href=\"https://fosdem.org/2015/\">FOSDEM 2015</a> where he showed off\n<a href=\"http://amirchaudhry.com/unikernel-arm-demo-fosdem/\">the 2048 game running as an ARM\nunikernel</a> to a\ncrowd of attendees at the Xen booth.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Jeremy Yallop giving the L23 course at Cambridge\" src=\"https://anil.recoil.org/images/l23.webp\" title=\"Jeremy Yallop giving the L23 course at Cambridge\">\nJeremy Yallop giving the L23 course at Cambridge\n\n<img alt=\"Compiler hacking with Don Syme\" src=\"https://anil.recoil.org/images/compiler-hacking-dsyme.webp\" title=\"Compiler hacking with Don Syme\">\nCompiler hacking with Don Syme\n\n<img alt=\"Finding a copy of Real World OCaml in Foyles!\" src=\"https://anil.recoil.org/images/jeremy-rwo.webp\" title=\"Finding a copy of Real World OCaml in Foyles!\">\nFinding a copy of Real World OCaml in Foyles!</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#graduate-teaching\"></a>Graduate Teaching</h3>\n<p><a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a> and <a href=\"https://github.com/lpw25\">Leo White</a> (with assistance from <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a> and\nmyself) also led the design of a new graduate course on <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/\">Advanced\nFunctional Programming</a> at\nthe Computer Laboratory. This ran in the <a href=\"http://en.wikipedia.org/wiki/Lent_term\">Lent\nTerm</a> and was over-subscribed, attracting\nthree times the number who pre-registered (due to a number of PhD\nstudents and our collaborators from <a href=\"http://citrix.com\">Citrix</a> also\nattending).</p>\n<p>The course materials are <a href=\"http://www.cl.cam.ac.uk/teaching/1415/L28/materials.html\">freely available\nonline</a> and\ncover the theory behind functional programming, and then move on to type\ninference, abstraction and parametricity, GADTs, rows, monads, and\nstaging. 
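As a flavour of the GADT material, here is the canonical typed-interpreter example (illustrative only, not taken from the course notes):

```ocaml
(* A typed expression language: the GADT's type index tracks each
   expression's result type, so [eval] needs no runtime tags and
   ill-typed terms are rejected by the compiler. *)
type _ expr =
  | Int  : int -> int expr
  | Bool : bool -> bool expr
  | Add  : int expr * int expr -> int expr
  | If   : bool expr * 'a expr * 'a expr -> 'a expr

let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (a, b) -> eval a + eval b
  | If (c, t, e) -> if eval c then eval t else eval e

let () =
  Printf.printf "%d\n" (eval (If (Bool true, Add (Int 1, Int 2), Int 0)))
```

Note the `type a.` locally abstract type annotation on `eval`, which is what lets each branch refine `a` differently.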
We will be running this again in future years, and the lecture\nmaterials are already proving useful to <a href=\"https://sympa.inria.fr/sympa/arc/caml-list/2015-04/msg00001.html\">answer mailing list\nquestions</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#mentoring-beginners\"></a>Mentoring Beginners</h3>\n<p>We also had the pleasure of mentoring up-and-coming functional\nprogrammers via several outreach programs, both face-to-face and remote.</p>\n<h4><a href=\"https://anil.recoil.org/#cambridge-compiler-hacking\"></a>Cambridge Compiler Hacking</h4>\n<p>We started the <a href=\"http://ocamllabs.github.io/compiler-hacking/\">Cambridge Compiler\nHacking</a> sessions in a\nsmall way towards the end of 2013 in order to provide a local, friendly\nplace to assist people who wanted to dip their toes into the\nunnecessarily mysterious world of programming language hacking. The plan\nwas simple: provide drinks, pizza, network and a <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki\">bug list of varying\ndifficulty</a> for\nattendees to choose from and work on for the evening, with mentoring\nfrom the experienced OCaml contributors.</p>\n<p>We continued this bi-monthly tradition in 2014, with a regular\nattendance of 15-30 people, and even cross-pollinated communities with\nour local F# and Haskell colleagues. We rotated locations from the\nCambridge Computer Laboratory to Citrix, Makespace, and the new\nCambridge Postdoc Centre. We posted some\n<a href=\"http://ocamllabs.github.io/compiler-hacking/2014/06/24/highlights-from-recent-sessions.html\">highlights</a>\nfrom sessions towards the start of the year, and are very happy with how\nit\u2019s going. There has even been uptake of the bug list across the water\nin France, thanks to Gabriel Scherer.</p>\n<p>In 2015, we\u2019d like to branch out further and host some sessions in\nLondon. 
If you have a suggestion for a venue or theme, please <a href=\"http://lists.ocaml.org/listinfo/cam-compiler-hacking\">get in\ntouch</a>!</p>\n<h4><a href=\"https://anil.recoil.org/#summer-programs\"></a>Summer Programs</h4>\n<p>There has been a laudable rise in summer programs designed to encourage\ndiversity in our community, and we of course leap at the opportunity to\nparticipate in these when we find them.</p>\n<ul>\n<li>The <a href=\"https://gnome.org/opw/\">GNOME Outreach Program</a> (now also known\nas <a href=\"https://www.gnome.org/outreachy/\">Outreachy</a>) had one funded\nplace for Xen and MirageOS. <a href=\"http://www.somerandomidiot.com/\">Mindy\nPreston</a> did a spectacular <a href=\"http://www.somerandomidiot.com/blog/categories/ocaml/\">blog\nseries</a> about\nher experiences and motivations behind learning OCaml.</li>\n<li>The <a href=\"https://www.google-melange.com/\">Google Summer of Code 2014</a>\nalso had us\n<a href=\"http://openmirage.org/blog/applying-for-gsoc2014\">participating</a>\nvia MirageOS, and <a href=\"https://github.com/moonlightdrive\">Jyotsna\nPrakash</a> took on the challenging\njob of building OCaml bindings for Amazon EC2, also detailed on <a href=\"https://1000hippos.wordpress.com/\">her\nblog</a>.</li>\n<li>Amir Chaudhry began the <a href=\"https://github.com/mirage/mirage-www/wiki/Pioneer-Projects\">Mirage Pioneer\nProjects</a>\ninitiative to give beginners an easier onramp, and this has taken\noff very effectively as a way to advertise interesting projects for\nbeginners at varying levels of difficulties.</li>\n</ul>\n<p>Our own students also had the chance to participate in such workshops to\nget out of Cambridge in the summer! <a href=\"http://hh360.user.srcf.net/blog/\">Heidi\nHoward</a> liveblogged her experiences at\nthe\n<a href=\"http://www.syslog.cl.cam.ac.uk/2015/01/14/programming-languages-mentoring-workshop-plmw/\">PLMW</a>\nworkshop in Mumbai. 
Meanwhile, <a href=\"https://github.com/dsheets\">David\nSheets</a> got to travel to the slightly less\nexotic London to <a href=\"http://www.syslog.cl.cam.ac.uk/2014/11/25/new-directions-in-operating-systems/\">liveblog\nOSIO</a>,\nand Leonhard Markert covered <a href=\"http://www.syslog.cl.cam.ac.uk/2014/09/05/ocaml-2014/\">ICFP\n2014</a> as a\nstudent volunteer.</p>\n<h3><a href=\"https://anil.recoil.org/#blogging-and-online-activities\"></a>Blogging and Online Activities</h3>\n<p>Our <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/blogs/\">blog roll</a>\nmaintains the ongoing stream of activity from the OCaml Labs crew, but\nthere were some particular highlights throughout 2014.</p>\n<ul>\n<li><a href=\"http://roscidus.com/blog/\">Thomas Leonard</a> began writing about his\nexperiences with switching his <a href=\"http://0install.net\">0install</a>\ninstallation system from <a href=\"http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-retrospective/\">Python to\nOCaml</a>\nand <a href=\"http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain/\">what you gain with\nOCaml</a>.\nThis series led to a bunch of interesting feedback on social\nnetworking sites, and Thomas joined the group full-time to work on\nour research into\n<a href=\"http://roscidus.com/blog/blog/2015/01/21/securing-the-unikernel/\">unikernels</a>.</li>\n<li><a href=\"http://www.skjegstad.com/\">Magnus Skjegstad</a> returned from Norway\nto Cambridge to work on MirageOS, and came up with some <a href=\"http://www.skjegstad.com/blog/2015/03/25/mirageos-vm-per-url-experiment/\">crazy\nexperiments</a>,\nas well as helping to build <a href=\"http://www.skjegstad.com/blog/2015/01/19/mirageos-xen-virtualbox/\">Vagrant\nimages</a>\nof the OCaml development environment.</li>\n<li><a href=\"http://amirchaudhry.com\">Amir Chaudhry</a> began his quest to <a href=\"http://amirchaudhry.com/writing-planet-in-pure-ocaml/\">port\nhis website</a>\nto a <a 
href=\"http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Jekyll\nunikernel</a>.</li>\n<li>The <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">Mirage 2.0\nrelease</a> in\nthe summer of 2014 saw a slew of blogs posts about the\n<a href=\"http://openmirage.org/blog/2014-in-review\">surge</a> in MirageOS\nactivity.</li>\n</ul>\n<p>It wasn\u2019t all just blogging though, and Jeremy Yallop and Leo White in\nparticular participated in some epic OCaml <a href=\"http://caml.inria.fr/mantis/view.php?id=5528\">bug\nthreads</a> about new\nfeatures, and\n<a href=\"https://sympa.inria.fr/sympa/arc/caml-list/2015-02/msg00150.html\">explanations</a>\nabout OCaml semantics on the mailing list.</p>\n<p>Amir Chaudhry also continued to curate and develop the content on the\n<a href=\"http://ocaml.org\">ocaml.org</a> website with our external collaborators\n<a href=\"https://anil.recoil.org/\">Ashish Agarwal</a>, <a href=\"https://anil.recoil.org/\">Christophe Troestler</a> and <a href=\"https://anil.recoil.org/\">Phillippe Wang</a>.\nNotably, it is now the recommended site for OCaml (with the <a href=\"http://caml.inria.fr\">INRIA\nsite</a> being infrequently updated), and also hosts\nthe <a href=\"https://ocaml.org/meetings/\">ACM OCaml Workshop</a> pages. 
One\naddition that highlighted the userbase of OCaml in the teaching\ncommunity came from building a <a href=\"https://ocaml.org/learn/teaching-ocaml.html\">map of all of the\nuniversities</a> where the\nlanguage is taught, and this was Yan Shvartzshnaider\u2019s <a href=\"http://yansnotes.blogspot.co.uk/2014/11/good-news-everyone-ocamlorg-teaching.html\">first\ncontribution</a>\nto the site.</p>\n<h3><a href=\"https://anil.recoil.org/#visitors-and-interns\"></a>Visitors and Interns</h3>\n<ul>\n<li>Gallery\n\n<img alt=\"Down at the pub with the gang!\" src=\"https://anil.recoil.org/images/ocl-pub.webp\" title=\"Down at the pub with the gang!\">\nDown at the pub with the gang!</li>\n</ul>\n<p>Finally, a really important part of any community is hanging out with\neach other to chat over ideas in a friendly environment. As usual, we\nhad a very steady stream of visitors and interns throughout 2014 to\nfacilitate this.</p>\n<p>Frederic Bour, Benjamin Farinier and Matthieu Journault joined us as\nsummer interns from their respective universities in France as part of\ntheir Masters programs. Frederic worked on modular implicits and <a href=\"https://www.irill.org/videos/oups-december-2014/Modular_implicits\">gave a\ngreat\ntalk</a>\nat the OCaml Users group. Benjamin and Matthieu worked on Irmin data\nstructures and complexity (and\n<a href=\"https://github.com/mirage/merge-queues\">merge-queues</a> and\n<a href=\"https://github.com/mirage/merge-ropes\">merge-ropes</a>), and Benjamin had\nhis paper on \u201c<a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.pdf\">Mergeable Persistent Data\nStructures</a>\u201d accepted\nto <a href=\"http://jfla.inria.fr/2015/\">JFLA 2015</a>, while Matthieu\u2019s work on\nefficient algorithms for synchronising Irmin DAGs is being integrated\ninto the upstream source code.</p>\n<p>Daniel Buenzli repeated his visit from 2013 and spent a productive\nsummer with us, commenting on almost every project we\u2019re working on. 
In\nhis own words (edited for brevity):</p>\n<blockquote>\n<p>I started by implementing and releasing\n<a href=\"http://erratique.ch/software/uucp\">Uucp</a>, a library to provide\nefficient access to a selection of the properties of the latest\nUnicode Character database (UCD). [\u2026] As a side effect of the previous\npoint I took time to write an absolute <a href=\"http://erratique.ch/software/uucp/doc/Uucp.html#uminimal\">minimal introduction to\nUnicode</a>.\n[\u2026] Since I was in this Unicode business I took the opportunity to\npropose a <a href=\"https://github.com/ocaml/ocaml/pull/80\">31 loc patch to the standard\nlibrary</a> for a type to\nrepresent Unicode scalar values (an Unicode character to be imprecise)\nto improve interoperability.</p>\n<p>The usual yearly update to OpenGL was announced at the Siggraph\nconference. This prompted me to update the ctypes-based <a href=\"http://erratique.ch/software/tgls\">tgls\nlibrary</a> for supporting the latest\nentry point of OpenGL 4.5 and OpenGL ES 3.1. Since the bindings are\nautomatically generated from the OpenGL XML registry the work is not\ntoo involved but there\u2019s always the odd function signature you\ndon\u2019t/can\u2019t handle automatically yet.</p>\n<p>Spend quite a bit (too much) time on\n<a href=\"http://erratique.ch/software/useri\">useri</a>, a small multi-platform\nabstraction for setting up a drawing surface and gather user input\n(<em>not</em> usury) as <a href=\"http://erratique.ch/software/react\">React</a> events.\nUseri started this winter as a layer on top of SDL to implement a <a href=\"http://erratique.ch/log/2014-05-18\">CT\nscan app</a> and it felt like this\ncould be the basis for adding interactivity and animation to Vg/Vz\nvisualizations \u2013 js viz libraries simply rely on the support provided\nby the browser or SVG support but Vg/Vz strives for backend\nindependence and clear separations of concern (up to which limit\nremains an open question). 
Unfortunately I couldn\u2019t bring it to a\nrelease and got a little bit lost in browser compatibility issues and\ntrying to reconcile what browser and SDL give us in terms of\nfunctionality and way of operating, so that a maximum of client code\ncan be shared among the supported platforms. But despite this\nnon-release it still managed to be useful in some way, see the next\npoint.</p>\n<p>Helped Jeremy and Leo to implement the rendering and interaction for\ntheir ICFP tutorial <a href=\"https://github.com/ocamllabs/2048-tutorial\">2048 js_of_ocaml\nimplementation</a>. This\nfeatured the use of Gg, Vg, Useri and React and I was quite pleased\nwith the result (despite some performance problems in certain\nbrowsers, but hey composable rendering and animation without a single\nassignement in client code). It\u2019s nice to see that all these pains at\ntrying to design good APIs eventually fit together [\u2026]</p>\n</blockquote>\n<p>A couple of visitors joined us from sunny\n<a href=\"http://github.com/mirleft\">Morocco</a>, where Hannes Mehnert and David\nKaloper had gone to work on a clean-slate TLS stack. They found the\n<a href=\"http://openmirage.org\">MirageOS</a> effort online, and got in touch about\nvisiting. After a very fun summer of hacking, their stack is now the\nstandard TLS option in MirageOS and resulted in the <a href=\"http://amirchaudhry.com/bitcoin-pinata/\">Bitcoin Pinata\nchallenge</a> being issued! Hannes\nand David have since moved to Cambridge to work on this stack full-time\nin 2015, but the internships served as a great way for everyone to get\nto know each other.</p>\n<p>We also had the pleasure of visits from several of our usually remote\ncollaborators. 
<a href=\"https://github.com/Chris00\">Christophe Troestler</a>,\n<a href=\"http://ocaml.janestreet.com\">Yaron Minsky</a>, <a href=\"http://github.com/diml\">Jeremie\nDiminio</a> and <a href=\"https://github.com/andrewray\">Andy\nRay</a> all visited for the annual OCaml Labs\n<a href=\"https://gist.github.com/avsm/18450004ae19c2facf7a\">review meeting</a> in\nChrist\u2019s College. There were also many academic talks from foreign\nvisitors in our <a href=\"http://talks.cam.ac.uk/show/archive/8316\">SRG seminar\nseries</a>, ranging from <a href=\"http://www.cse.iitb.ac.in/~uday/\">Uday\nKhedkar</a> from IIT to <a href=\"http://okmij.org/ftp/\">Oleg\nKiselyov</a> deliver multiple talks on staging and\noptimisation (as well as making a celebrity appearance at the compiler\nhacking session, and <a href=\"http://ocaml.janestreet.com\">Yaron Minsky</a>\ndelivering an Emacs-driven departmental seminar on his experiences with\n<a href=\"http://talks.cam.ac.uk/talk/index/51144\">Incremental</a> computation.</p>\n<h2><a href=\"https://anil.recoil.org/#research-efforts\"></a>Research Efforts</h2>\n<p>The OCaml Labs are of course based in the Cambridge Computer Laboratory,\nwhere our day job is to do academic research. 
Balancing the demands of\nopen source coding, community efforts and top-tier research has been\ntricky, but the effort has been worthwhile.</p>\n<ul>\n<li>Gallery\n\n<img alt=\"Dinner at Christ&apos;s College\" src=\"https://anil.recoil.org/images/christs-dinner.webp\" title=\"Dinner at Christ&apos;s College\">\nDinner at Christ's College\n\n<img alt=\"Hacking to the clock for the NSDI deadline\" src=\"https://anil.recoil.org/images/nsdi-deadline.webp\" title=\"Hacking to the clock for the NSDI deadline\">\nHacking to the clock for the NSDI deadline\n\n<img alt=\"Dave enters the glass filled future\" src=\"https://anil.recoil.org/images/scotty.webp\" title=\"Dave enters the glass filled future\">\nDave enters the glass filled future</li>\n</ul>\n<p>Our research efforts are broadly unchanged <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/news/index.html#Dec%202013\">from\n2013</a>\n(it takes time to craft good ideas!), and this will not be an exhaustive\nrecap. Instead, we\u2019ll summarise them here and point to our\n<a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/papers/index.html\">papers</a>\nthat describe the work in detail.</p>\n<ul>\n<li>\n<p>The <a href=\"http://openmirage.org\">MirageOS</a> really found its own feet in\n2014, with a <a href=\"http://openmirage.org/blog/announcing-mirage-20-release\">summer 2.0\nrelease</a>\nand an extensive <a href=\"http://openmirage.org/blog/2014-in-review\">end-of-year\nrecap</a>. The most notable\nthing has been how well the MirageOS research work has melded with\nthe core OCaml Labs efforts, since much of it has been constructing\ngood quality OCaml libraries to plug holes in the ecosystem. 
It also\nserved to make us use OPAM on a day-to-day basis for our own work,\nthus creating an effective feedback loop between open-source and\nresearch.</p>\n</li>\n<li>\n<p>In the <a href=\"http://trilogy2.it.uc3m.es/\">Trilogy2</a> and\n<a href=\"http://usercentricnetworking.eu/\">UCN</a> EU projects, we built out\nMirageOS features such as the\n<a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu.pdf\">Jitsu</a> toolstack\nfor the \u201cjust-in-time\u201d summoning of unikernels in response to DNS\nrequests. This paper will be presented next month at USENIX\n<a href=\"https://www.usenix.org/conference/nsdi15/\">NSDI</a>. It also drove the\ndevelopment of the <a href=\"http://openmirage.org/blog/introducing-xen-minios-arm\">ARMv7\nport</a>, an\narchitecture for which OCaml has an excellent native code generator,\nas well as more experimental forays into <a href=\"http://arxiv.org/abs/1412.4638\">BitCoin incentive\nschemes</a> for distributed systems.</p>\n</li>\n<li>\n<p>The <a href=\"http://irmin.io\">Irmin</a> Git-like branchable store created by\nThomas Gazagnaire matured, with Dave Scott\n<a href=\"https://www.youtube.com/watch?v=DSzvFwIVm5s\">prototyping</a> a complex\nport of the <a href=\"http://wiki.xen.org/wiki/XenStore\">XenStore</a> database\nto Irmin, thus letting us show off <a href=\"http://decks.openmirage.org/xendevsummit14#/\">debugging systems with\nGit</a>. We had a paper\non some early data structures accepted at\n<a href=\"https://anil.recoil.org/papers/2015-jfla-irmin.pdf\">JFLA</a>, and\nThomas Leonard is building the JavaScript backend for running\nin-browser, while Yan Shvartzshnaider is experimenting with <a href=\"http://yansnotes.blogspot.co.uk/2015/01/work-summary-ocaml-labs.html\">graph\nprocessing</a>\nover the DAG representation for privacy-friendly queries. 
KC is\ninvestigating how to adapt his PLDI 2015 paper on\n<a href=\"http://kcsrk.info/papers/quelea_pldi15.pdf\">Quelea</a> to use\nIrmin as a backend as well.</p>\n</li>\n<li>\n<p>The <a href=\"https://github.com/ocamllabs/higher\">Higher</a> kinded\npolymorphism library written by Jeremy Yallop and Leo White was\npublished in <a href=\"http://www.lpw25.net/flops2014.pdf\">FLOPS 2014</a>,\nforming a basis for building more complex use-cases that need the\nflexibility of higher kinded types without requiring functorising\ncode.</p>\n</li>\n</ul>\n<p>Our long-standing research into <a href=\"http://nymote.org\">personal online\nprivacy</a> led to our next system target that uses\nunikernels: the <a href=\"http://arxiv.org/abs/1501.04737\">Databox</a> paper\noutlines the architecture, and was covered in the\n<a href=\"http://www.theguardian.com/technology/2015/feb/01/control-personal-data-databox-end-user-agreement\">Guardian</a>\nnewspaper. Jon Crowcroft led the establishment of the Cambridge wing of\nthe <a href=\"http://www.mccrc.eu/about-us\">Microsoft Cloud Computing Research\nCenter</a> to consider the legal aspect of\nthings, and so we have made forays outside of technology into\nconsidering the implications of <a href=\"http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-863.pdf\">region-specific\nclouds</a> as well.</p>\n<p>Some of the most exciting work done in the group as part of the\n<a href=\"http://rems.io\">REMS</a> and <a href=\"http://www.naas-project.org/\">NaaS</a> projects\ncame towards the end of 2014 and start of 2015, with multiple\nsubmissions going into top conferences. Unfortunately, due to most of\nthem being double-blind reviewed, we cannot link to the papers yet. 
Keep\nan eye on the blog and <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/papers/index.html\">published paper\nset</a>, or\nask us directly about what\u2019s been going on!</p>\n<h2><a href=\"https://anil.recoil.org/#priorities-for-2015\"></a>Priorities for 2015</h2>\n<p>As spring breaks and the weather (almost) becomes bearable again, we\u2019re\nsetting our work priorities for the remainder of the year.</p>\n<ul>\n<li>\n<p><strong>Tooling Cohesion</strong>: The entire core team is focussed on fusing\ntogether the individual tools that were created last year into\na cohesive OCaml Platform release that covers the lifecycle of\ndocumentation, testing and build. This is being managed by Amir\nChaudhry. OPAM remains at the heart of this strategy, and Louis\nGesbert and Thomas Gazagnaire have settled on the <a href=\"https://github.com/ocaml/opam/wiki/1.3-Roadmap\">OPAM 1.3\nroadmap</a>\n(<a href=\"http://lists.ocaml.org/pipermail/opam-devel/2015-February/000940.html\">summary</a>).</p>\n</li>\n<li>\n<p><strong>Multicore</strong>: <a href=\"https://anil.recoil.org/kcsrk.info\">KC Sivaramakrishnan</a> has joined the core\nOCaml Labs full-time to drive the multicore work into a publicly\ntestable form. 
Leo White recently departed after many productive\nyears in Cambridge to head into a career in industry (but still\nremains very much involved with OCaml development!).</p>\n</li>\n<li>\n<p><strong>Language Evolution</strong>: Jeremy Yallop continues to drive our efforts\non staged programming, modular implicits, and a macro system for\nOCaml, all of which are key features that make building complex,\nreliable systems more tractable than ever.</p>\n</li>\n</ul>\n<p>I\u2019d like to thank the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/people/index.html\">entire\nteam</a> and\nwider community for a wonderfully enjoyable 2014 and start of 2015, and\nam very thankful for the funding and support from Jane Street, Citrix,\nBritish Telecom, RCUK, EPSRC, DARPA and the EU FP7 that made it all\npossible. As always, please feel free to contact any of us directly with\nquestions, or reach out to me <a href=\"mailto:avsm2@cl.cam.ac.uk\">personally</a>\nwith any queries, concerns or bars of chocolate as encouragement.</p>",
+18
avsm/notes_opam-1-1-beta.json
···+"summary": "<p><a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> just announced the availability of the\n<a href=\"http://opam.ocamlpro.com\">OPAM</a> beta release. This has been a huge\namount of work for him and <a href=\"http://louis.gesbert.fr/\">Louis</a>, so I\u2019m\nexcited to see this release!</p>\n<p>Aside from general stability, the main\nhighlights for me are:</p>\n<ul>\n<li>\n<p>A switch to the\n<a href=\"http://creativecommons.org/publicdomain/zero/1.0/\">CC0</a>\npublic-domain-like license for the repository, and LGPL2+linking\nexception for OPAM itself. The <a href=\"https://github.com/OCamlPro/opam-repository/issues/955\">cutover to the new\nlicense</a> was\nthe first non-gratuitous use of GitHub\u2019s fancy issue lists I\u2019ve\nseen, too! As part of this, we\u2019re also beginning a transition over\nto hosting it at <code>opam.ocaml.org</code>, to underline our commitment to\nmaintaining it as an OCaml community resource.</p>\n</li>\n<li>\n<p>Much-improved support for package pinning and updates. This is the\nfeature that makes OPAM work well with\n<a href=\"http://openmirage.org\">MirageOS</a>, since we often need to do\ndevelopment work on a low-level library (such as a <a href=\"https://github.com/mirage/ocaml-xen-block-driver\">device\ndriver</a>) and\nrecompile all the reverse dependencies.</p>\n</li>\n<li>\n<p>Support for post-installation messages (e.g. to display <a href=\"https://github.com/OCamlPro/opam-repository/pull/1100\">licensing\ninformation</a>\nor configuration hints) and better support for the external library\nmanagement issues I explained in an earlier post about <a href=\"https://anil.recoil.org/2013/09/09/ocamlot-autotriaging.html\">OCamlot\ntesting</a>.</p>\n</li>\n<li>\n<p>Better library structuring to let tools like\n<a href=\"http://github.com/OCamlPro/opam2web\">Opam2web</a> work with the\npackage metadata. 
For instance, my group\u2019s <a href=\"http://ocaml.io\">OCaml\nLabs</a> has a comprehensive list of <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/index.html\">the software\npackages that we work\non</a>\ngenerated directly from an OPAM remote.</p>\n</li>\n<li>\n<p>A growing set of administration tools (via the <code>opam-admin</code> binary)\nthat run health checks and compute statistics over package\nrepositories. For example, here\u2019s the result of running\n<code>opam-admin stats</code> over the latest package repository to show\nvarious growth curves.</p>\n</li>\n</ul>",+"content": "<p><a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> just announced the availability of the\n<a href=\"http://opam.ocamlpro.com\">OPAM</a> beta release. This has been a huge\namount of work for him and <a href=\"http://louis.gesbert.fr/\">Louis</a>, so I\u2019m\nexcited to see this release!</p>\n<p>Aside from general stability, the main\nhighlights for me are:</p>\n<ul>\n<li>\n<p>A switch to the\n<a href=\"http://creativecommons.org/publicdomain/zero/1.0/\">CC0</a>\npublic-domain-like license for the repository, and LGPL2+linking\nexception for OPAM itself. The <a href=\"https://github.com/OCamlPro/opam-repository/issues/955\">cutover to the new\nlicense</a> was\nthe first non-gratuitous use of GitHub\u2019s fancy issue lists I\u2019ve\nseen, too! As part of this, we\u2019re also beginning a transition over\nto hosting it at <code>opam.ocaml.org</code>, to underline our commitment to\nmaintaining it as an OCaml community resource.</p>\n</li>\n<li>\n<p>Much-improved support for package pinning and updates. 
This is the\nfeature that makes OPAM work well with\n<a href=\"http://openmirage.org\">MirageOS</a>, since we often need to do\ndevelopment work on a low-level library (such as a <a href=\"https://github.com/mirage/ocaml-xen-block-driver\">device\ndriver</a>) and\nrecompile all the reverse dependencies.</p>\n</li>\n<li>\n<p>Support for post-installation messages (e.g. to display <a href=\"https://github.com/OCamlPro/opam-repository/pull/1100\">licensing\ninformation</a>\nor configuration hints) and better support for the external library\nmanagement issues I explained in an earlier post about <a href=\"https://anil.recoil.org/2013/09/09/ocamlot-autotriaging.html\">OCamlot\ntesting</a>.</p>\n</li>\n<li>\n<p>Better library structuring to let tools like\n<a href=\"http://github.com/OCamlPro/opam2web\">Opam2web</a> work with the\npackage metadata. For instance, my group\u2019s <a href=\"http://ocaml.io\">OCaml\nLabs</a> has a comprehensive list of <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/index.html\">the software\npackages that we work\non</a>\ngenerated directly from an OPAM remote.</p>\n</li>\n<li>\n<p>A growing set of administration tools (via the <code>opam-admin</code> binary)\nthat run health checks and compute statistics over package\nrepositories. For example, here\u2019s the result of running\n<code>opam-admin stats</code> over the latest package repository to show\nvarious growth curves.</p>\n</li>\n</ul>",
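The pinning workflow this entry describes can be sketched as a short shell session. This is a hypothetical example rather than anything from the release notes: the package name and checkout path are illustrative, and the `opam pin <name> <path>` spelling follows the 1.1-era CLI (later opam releases use `opam pin add`).

```sh
# Pin a low-level library to a local checkout so development changes
# are picked up; the package name and path here are illustrative.
opam pin mirage-block-xen ~/src/ocaml-xen-block-driver

# Build the pinned package from the local source tree, then let opam
# recompile anything that depends on it.
opam install mirage-block-xen
opam upgrade

# The post also mentions the new repository administration tooling:
opam-admin stats
```

Unpinning returns the package to the repository version, at which point the reverse dependencies are rebuilt against the released code.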
+18
avsm/notes_openbsd-developer.json
···+"summary": "<p>I've been using OpenBSD for a few years now as the primary OS for <a href=\"https://anil.recoil.org/news?t=recoil\">recoil</a>\nand have been contributing fixes and ports when I get a chance. So I'm\nincredibly excited to report that the project leader, Theo de Raadt, has\ninvited me to become an OpenBSD developer. I've registered my keys now, and\nwill be known as <code>avsm@openbsd.org</code>!</p>\n<p>My first commit is to start fixing up the PHP port, which I have been working\non in <a href=\"https://news-web.php.net/php.qa/652\">PHP-land</a> for a while now.</p>\n<pre><code>commit 93d5cc5ae56b22b19aa3bce34d38fa260b882d16\nAuthor: avsm <avsm@openbsd.org>\nDate: Tue Dec 26 23:35:43 2000 +0000\n\n - update to php-4.0.4\n - bump NEED_VERSION\n - no longer need extra distfile number4.tar.gz since it has\n been integrated into the main distribution\n - ltconfig, mysql socket patches are in main distribution now,\n so they are removed. Note that the ltconfig patch was only\n applied to the 4_0_4 branch by the PHP team, so we will have\n to resubmit it for the next version, unless libtool-cvs has\n been updated with our information.\n - Since php3/4 conflict with each other anyway, versioning is\n not needed.\n</code></pre>\n<p>Many thanks to Jakob for the help with getting started, and for ok'ing my first commit.</p>",+"content": "<p>I've been using OpenBSD for a few years now as the primary OS for <a href=\"https://anil.recoil.org/news?t=recoil\">recoil</a>\nand have been contributing fixes and ports when I get a chance. So I'm\nincredibly excited to report that the project leader, Theo de Raadt, has\ninvited me to become an OpenBSD developer. 
I've registered my keys now, and\nwill be known as <code>avsm@openbsd.org</code>!</p>\n<p>My first commit is to start fixing up the PHP port, which I have been working\non in <a href=\"https://news-web.php.net/php.qa/652\">PHP-land</a> for a while now.</p>\n<pre><code>commit 93d5cc5ae56b22b19aa3bce34d38fa260b882d16\nAuthor: avsm <avsm@openbsd.org>\nDate: Tue Dec 26 23:35:43 2000 +0000\n\n - update to php-4.0.4\n - bump NEED_VERSION\n - no longer need extra distfile number4.tar.gz since it has\n been integrated into the main distribution\n - ltconfig, mysql socket patches are in main distribution now,\n so they are removed. Note that the ltconfig patch was only\n applied to the 4_0_4 branch by the PHP team, so we will have\n to resubmit it for the next version, unless libtool-cvs has\n been updated with our information.\n - Since php3/4 conflict with each other anyway, versioning is\n not needed.\n</code></pre>\n<p>Many thanks to Jakob for the help with getting started, and for ok'ing my first commit.</p>",
+18
avsm/notes_openbsd-hosting.json
···+"summary": "<p>I <a href=\"https://twitter.com/avsm/status/1167012354556669952\">asked on Twitter</a> about hosting options for OpenBSD on cloud providers, so that we could have some alternative options for Recoil. We have a strong preference for bare-metal and not VMs when it comes to OpenBSD. Options that came back were:</p>\n<ul>\n<li>OpenBSDAMS\n<ul>\n<li>Dedicated/VMs for openbsd hosting (see <a href=\"https://twitter.com/OpenBSDAms\">here</a>)</li>\n</ul>\n</li>\n<li>Mythic Beasts\n<ul>\n<li>I have provisioned a bare metal server there and they kindly stuck a USB stick in with an OpenBSD installer.</li>\n</ul>\n</li>\n<li>DataCentreLite\n<ul>\n<li>Not tried this yet but <a href=\"https://twitter.com/NicoSchottelius/status/1167163133024264192\">possible followup</a>.</li>\n</ul>\n</li>\n<li>LiquidWeb\n<ul>\n<li>Good <a href=\"https://twitter.com/vphantom/status/1167020959771049984\">recommendation</a> from Stephane</li>\n</ul>\n</li>\n</ul>",+"content": "<p>I <a href=\"https://twitter.com/avsm/status/1167012354556669952\">asked on Twitter</a> about hosting options for OpenBSD on cloud providers, so that we could have some alternative options for Recoil. We have a strong preference for bare-metal and not VMs when it comes to OpenBSD. Options that came back were:</p>\n<ul>\n<li>OpenBSDAMS\n<ul>\n<li>Dedicated/VMs for openbsd hosting (see <a href=\"https://twitter.com/OpenBSDAms\">here</a>)</li>\n</ul>\n</li>\n<li>Mythic Beasts\n<ul>\n<li>I have provisioned a bare metal server there and they kindly stuck a USB stick in with an OpenBSD installer.</li>\n</ul>\n</li>\n<li>DataCentreLite\n<ul>\n<li>Not tried this yet but <a href=\"https://twitter.com/NicoSchottelius/status/1167163133024264192\">possible followup</a>.</li>\n</ul>\n</li>\n<li>LiquidWeb\n<ul>\n<li>Good <a href=\"https://twitter.com/vphantom/status/1167020959771049984\">recommendation</a> from Stephane</li>\n</ul>\n</li>\n</ul>",
+18
avsm/notes_openfx.json
···+"summary": "<p>Slashdot covers the GPL release of <a href=\"http://openfx.org\">OpenFX</a>, which I worked on with Stuart Ferguson (my brother's PhD supervisor in Queen's University Belfast).</p>\n<blockquote>\n<p>It has a renderer and raytrace engine, NURBS support, kinematics-based animation, morphing, a plugin API - and it's under the GPL. Currently only for Windows, but they're working on a Linux and FreeBSD port.</p>\n</blockquote>",+"content": "<p>Slashdot covers the GPL release of <a href=\"http://openfx.org\">OpenFX</a>, which I worked on with Stuart Ferguson (my brother's PhD supervisor in Queen's University Belfast).</p>\n<blockquote>\n<p>It has a renderer and raytrace engine, NURBS support, kinematics-based animation, morphing, a plugin API - and it's under the GPL. Currently only for Windows, but they're working on a Linux and FreeBSD port.</p>\n</blockquote>",
+18
avsm/notes_opening-a-website.json
···+"summary": "<p>We've been working away at building a new type of database to help individuals\nkeep the reins on their ever-increasing personal digital information. The first\nprototypes run freely on <a href=\"https://web.archive.org/web/20110509135538/http://code.google.com/appengine\">Google App Engine</a> to gather your data\nbehind the scenes, and we are working on more advanced versions that run on\nembedded devices and the cloud.</p>\n<p>If you\u2019re interested in keeping track of your personal data, you can start off\nwith the <a href=\"https://web.archive.org/web/20110509135538/http://perscon.net/install.html\">installation</a> instructions to clone your own version. After that, read\nup on the <a href=\"https://web.archive.org/web/20110509135538/http://perscon.net/design.html\">design</a> of the system (which is still changing as we research new\nideas around it). When you find something you want to fix, or add a new plugin\ndata source, just clone the <a href=\"https://github.com/avsm/perscon\">code</a> and send us back fixes!</p>",+"content": "<p>We've been working away at building a new type of database to help individuals\nkeep the reins on their ever-increasing personal digital information. The first\nprototypes run freely on <a href=\"https://web.archive.org/web/20110509135538/http://code.google.com/appengine\">Google App Engine</a> to gather your data\nbehind the scenes, and we are working on more advanced versions that run on\nembedded devices and the cloud.</p>\n<p>If you\u2019re interested in keeping track of your personal data, you can start off\nwith the <a href=\"https://web.archive.org/web/20110509135538/http://perscon.net/install.html\">installation</a> instructions to clone your own version. After that, read\nup on the <a href=\"https://web.archive.org/web/20110509135538/http://perscon.net/design.html\">design</a> of the system (which is still changing as we research new\nideas around it). 
When you find something you want to fix, or add a new plugin\ndata source, just clone the <a href=\"https://github.com/avsm/perscon\">code</a> and send us back fixes!</p>",
+18
avsm/notes_opening-anil-recoil-org.json
···+"summary": "<p>I've taken the opportunity to redesign my homepage and switch to its hopefully-permanent\nURL on <code>anil.recoil.org</code>. Many thanks to Jon Parise for giving me permission to base my\nHTML upon his homepage's, saving me lots of design trouble!</p>",+"content": "<p>I've taken the opportunity to redesign my homepage and switch to its hopefully-permanent\nURL on <code>anil.recoil.org</code>. Many thanks to Jon Parise for giving me permission to base my\nHTML upon his homepage's, saving me lots of design trouble!</p>",
+18
avsm/notes_opening-discuss-ocaml.json
···+"summary": "<p>I opened up a <a href=\"https://discourse.org\">Discourse</a> forum for the OCaml community to use, which is running successfully on https://discuss.ocaml.org. This forum thread collates the feedback and discussions about it.</p>",+"content": "<p>I opened up a <a href=\"https://discourse.org\">Discourse</a> forum for the OCaml community to use, which is running successfully on https://discuss.ocaml.org. This forum thread collates the feedback and discussions about it.</p>",
+18
avsm/notes_peeking-under-the-hood-of-high-availability.json
···+"summary": "<p>Well, the big launch of <a href=\"http://www.xenserver5.com/\">XenServer 5</a> has gone smoothly, and with it has arrived a flood of questions about how exactly the new <a href=\"https://web.archive.org/web/20081121042533/https://xenserver5.com/ha.php\">High Availability</a> functionality works.\u00a0 I\u2019ll use this post to explain the overall architecture of HA in XenServer 5, and also how some of the fault detection and failure planning works.</p>\n<p>Fundamentally, HA is about making sure important VMs are always running on a resource pool. There are two aspects to this: reliably <strong>detecting host failure</strong>, and computing a <strong>failure plan</strong> to deal with swift recovery.</p>\n<p>Detecting host failure reliably is difficult since you need to remotely distinguish between a host disappearing for a while versus exploding in a ball of flames.\u00a0 If we mistakenly decide that a master host has broken down and elect a new master in its place, there may be unpredictable results if the original host were to make a comeback!\u00a0\u00a0 Similarly, if there is a network issue and a resource pool splits into two equal halves, we need to ensure that only one half accesses the shared storage and not both simultaneously.</p>\n<h2><a href=\"https://anil.recoil.org/#heartbeating-for-availability\"></a>Heartbeating for availability</h2>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-3b.webp\" title=\"\">\n</p>\n<p>We solve all these problems in XenServer by having two mechanisms: a <strong>storage heartbeat</strong> and a <strong>network heartbeat</strong>. When you enable HA in a pool, you must nominate an iSCSI or FC storage repository to be the heartbeat SR. XenServer automatically creates a couple of small virtual disks in this SR. The first disk is used by every physical host in the resource pool as a <strong>shared quorum disk</strong>. 
Each host allocates itself a unique block in the shared disk and regularly writes to the block to indicate that it is alive.</p>\n<p>I asked <a href=\"https://github.com/djs55\">Dave Scott</a>, the principal engineer behind HA, about the startup process:</p>\n<blockquote>\n<p>When HA starts up, all hosts exchange data over both network and\nstorage channels, indicating which hosts <em>they</em> can see over both\nchannels; i.e. which I/O paths are working and which are not.\u00a0 This\nliveness information is exchanged until a fixed point is reached and\nall of the hosts are satisfied that they are in agreement about what\nthey can see.\u00a0 When this happens, the HA functionality is \u2018armed\u2019 and\nthe pool is protected.</p>\n</blockquote>\n<blockquote>\n<p>This HA arming process can take a few minutes to settle for larger\npools, but is only required when HA is first enabled.</p>\n</blockquote>\n<blockquote>\n<p>Once HA is active, each host regularly writes storage updates to the\nheartbeat virtual disk, and network packets over the management\ninterface.\u00a0 It is vital to ensure that network adapters are\n<a href=\"http://docs.xensource.com/XenServer/5.0.0/1.0/en_gb/reference.html#networking-standalone_host_config-bonds\">bonded</a>\nfor resilience, and that storage interfaces are using <a href=\"http://docs.xensource.com/XenServer/5.0.0/1.0/en_gb/reference.html#id2557754\">dynamic\nmultipathing</a>\nwhere supported.\u00a0 This will ensure that any single adapter or wiring\nfailures do not result in any availability issues.</p>\n</blockquote>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-5.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-5-1.webp\" title=\"\">\n</p>\n<p>The worst-case scenario for HA is the situation where a host is thought to be off-line but is actually still writing to the shared storage, since this can result in corruption of persistent data.\u00a0 In order to prevent this 
situation without requiring active power strip control, we implemented <strong>hypervisor-level fencing</strong>.\u00a0 This is a Xen modification which will hard-power the host off at a very low level if it doesn\u2019t hear regularly from a watchdog process running in the control domain.\u00a0 Since it is implemented at a very low level, this also covers the case where the control domain becomes unresponsive for any reason.</p>\n<p>Hosts will self-fence (i.e. power off and restart) in the event of any heartbeat failure unless any of the following hold true:</p>\n<ul>\n<li>The storage heartbeat is present for all hosts but the network has\npartitioned (so that there are now two groups of hosts).\u00a0 In this\ncase, all of the hosts which are members of the largest network\npartition stay running, and the hosts in the smaller network\npartition self-fence.\u00a0 The assumption here is that the network\noutage has isolated the VMs, and they ought to be restarted on a\nhost with working networking.\u00a0 If the network partitions are exactly\nthe same size, then only one of them will self-fence according to a\nstable selection function.</li>\n<li>If the storage heartbeat goes away but the network heartbeat\nremains, then the hosts check to see if they can see all other hosts\nover the network.\u00a0 If this condition holds true, then the hosts\nremain running on the assumption that the storage heartbeat server\nhas gone away.\u00a0 This doesn\u2019t compromise VM safety, but any network\nglitches will result in fencing since that would mean both\nheartbeats have disappeared.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#planning-for-failure\"></a>Planning for failure</h2>\n<p>The heartbeat system gives us reliable notification of host failure, and so we move on to the second step of HA: capacity planning for failure.</p>\n<p>A resource pool consists of several physical hosts (say, 16), each with potentially different amounts of host memory and a different 
number of running VMs.\u00a0 In order to ensure that no single host failure will result in the VMs on that host being unrestartable (e.g. due to insufficient memory on any other host), the XenServer pool dynamically computes a <strong>failure plan</strong> which calculates the actions that would be taken on any host failure.</p>\n<p>But there\u2019s one more complexity... a single host failure plan does not cover more advanced cases such as network partitions which take out entire groups of hosts.\u00a0 It would be very useful to be able to create a plan that could tolerate more than a single host failure, so that administrators could ignore the first host failure and be safe in the knowledge that (for example) three more hosts could fail before the pool runs out of spare capacity.</p>\n<p>That\u2019s exactly what we do in XenServer... the resource pool <em>dynamically</em> computes a failure plan which considers the \u201cnumber of host failures to tolerate\u201d (or <em>nhtol</em>).\u00a0 This represents the number of disposable servers in a pool for a given set of protected VMs.</p>\n<p>The planning algorithms are pretty complex, since doing a brute force search of all possible failures across all hosts across all VMs is an exponential problem.\u00a0 We apply heuristics to ensure we can compute a plan in a reasonably small time:</p>\n<ul>\n<li>for up to 3 host failures, we do a comprehensive search which tries\nalmost all permutations.\u00a0 This covers corner cases such as having\nhosts or VMs with very different amounts of memory (e.g. 
4GB vs\n128GB).\u00a0 Rather than calculate memory slots or otherwise approximate\nresults, we just deal with them individually and give very accurate\nplans.</li>\n<li>for greater than 3 host failures, we make conservative decisions by\napproximating every VM to be as large as the largest, and\nconsidering each host to be the same as the most densely packed\nhost.\u00a0 We do not approximate the host memory, and so having pools\nwith uneven amounts of host memory will be fine.\u00a0 However, in\napproximate planning mode having a single very large VM will result\nin a low <em>nhtol</em> value.\u00a0 If this is a problem, then try to reduce\nthe <em>nhtol</em> or try to have a more even spread of VM memory sizes.</li>\n</ul>\n<p>Since planning algorithms are designed for unexpected host failures, we only consider absolutely essential resource reservations which would prevent the VM from starting on the alternative host (e.g. storage is visible, and enough memory is present).\u00a0 We do not perform CPU reservation on the basis that it can be optimised at a later stage via live relocation once the VM is back up and running.</p>\n<h3><a href=\"https://anil.recoil.org/#overcommit-protection\"></a>Overcommit protection</h3>\n<p>We now have HA armed and a failover plan for our VMs.\u00a0 But what if you want to make changes to your configuration after HA is enabled?\u00a0 This is dealt with via <strong>overcommit protection</strong>.</p>\n<p>The XenServer pool dynamically calculates a new failover plan in response to every XenAPI call which would affect it (e.g. 
starting a new VM).\u00a0 If a new plan cannot be calculated due to insufficient resources across the pool, the XenServer will return an <strong>overcommitment</strong> error message to the client which blocks the operation.</p>\n<h4><a href=\"https://anil.recoil.org/#the-what-if-machine\"></a>The \u201cWhat if?\u201d Machine</h4>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-4b.webp\" title=\"\">\n</p>\n<p>This overcommit protection would be quite irritating if you have to keep trying things and seeing if a plan exists or not, and so we built in a "<a href=\"http://www.gotfuturama.com/Information/Encyc-55-What_If_Machine/\">What If?</a>" machine into XenServer to facilitate counter-factual reasoning.</p>\n<p>When reconfiguring HA via XenCenter, you can supply a hypothetical series of VM priorities, and XenServer will return a number of host failures which would be tolerated under this scheme.\u00a0 This lets you try various combinations of VM protections depending on your business needs, and see if the number of host failures is appropriate to the level of paranoia you desire.</p>\n<p>This can even be done via the CLI, using the snappily named "<strong>xe pool-ha-compute-max-host-failures-to-tolerate</strong>" when HA is enabled.</p>\n<p>The nice thing about XenServer HA is that it is done at the XenAPI level, and so\u00a0 any of the standard clients (such as the xe CLI or XenCenter) or any third-party clients which use the XenAPI will all interoperate just fine.\u00a0 The XenServer pool dynamically recalculates plans in response to the client requests, and so no special \u201coracle\u201d is required outside of the pool to figure out HA plans.</p>\n<p>Finally, HA makes master election completely invisible.\u00a0 Any host in a pool can be a master host, and the pool database is constantly replicated across all nodes and also backed up to shared storage on the heartbeat SR for additional safety.\u00a0 Any XenAPI client can connect to any host, 
and a redirect is issued to the current master host.</p>\n<h2><a href=\"https://anil.recoil.org/#protection-levels\"></a>Protection Levels</h2>\n<p>Each VM in an HA pool can be either <strong>fully protected</strong>, <strong>best-effort</strong> or <strong>unprotected</strong>. VMs which are protected are all included in the failover planning, and if no plan exists for which they can all be reliably restarted then the pool is considered to be overcommitted. Hugh Warrington (who implemented the XenCenter HA support) explained what use protection levels are:</p>\n<blockquote>\n<p>Best-effort VMs are not considered when calculating a failover plan,\nbut the pool will still try to start them as a one-off if a host that\nis running them fails.\u00a0 This restart is attempted after all protected\nVMs are restarted, and if the attempt to start them fails then it will\nnot be retried.\u00a0 This is a useful setting for test/dev VMs which\naren\u2019t critical to keep running, but would be nice to do so in a pool\nwhich also has some important VMs which absolutely must run.</p>\n</blockquote>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-5.webp\" title=\"\">\n</p>\n<p>There are some advanced features which are only available via the CLI.\u00a0\u00a0 Each protected VM in an HA pool can be assigned a numeric <code>ha-restart-priority</code>.\u00a0 If a pool is well-resourced with a high <em>nhtol</em>, then these restart priorities are not relevant: the VMs are all guaranteed to be started.</p>\n<p>If more hosts fail than have been planned for, then the priorities are used to determine the order in which VMs are restarted.\u00a0 This ensures that in over-committed pools, the most important VMs are restarted first.\u00a0 Although the pool will start priority 1 VMs first, they might not finish booting before the priority 2 VMs, and so this should not be used as the basis for service ordering.</p>\n<p>Note that it's very important to <strong>ensure that a 
VM is agile</strong> when protecting it by HA.\u00a0 If the VM is not agile (e.g. has a physical CD drive mapped in from a host), then it can only be assigned Best Effort restart since it is tied to one host.</p>\n<h2><a href=\"https://anil.recoil.org/#xencenter-support-for-ha\"></a>XenCenter support for HA</h2>\n<p>The best practice for HA is not to make configuration changes while it is enabled.\u00a0 Instead, it is intended to be the "2am safeguard" which will restart hosts in the event of a problem when there isn't a human administrator nearby.\u00a0 If you are actively making configuration changes such as applying patches, then HA should be disabled for the duration of these changes.</p>\n<p>XenCenter makes some common changes under HA much more user-friendly, which I asked <a href=\"http://community.citrix.com/blogs/citrite/ewanm/\">Ewan Mellor</a> (the principal GUI engineer) about:</p>\n<ul>\n<li>Normally a protected VM cannot be shut down via the CLI or from\nwithin the guest (a shutdown from within the guest will\nautomatically restart it).\u00a0 If you try to shut down from XenCenter,\nit will give you the option of unprotecting the VM and then shutting\nit down first.\u00a0 Thus, accidental in-guest shutdowns won't result in\ndowntime, but administrators can still stop a protected guest if\nthey really want to.</li>\n<li>If you want to reboot a host when HA is enabled, XenCenter\nautomatically uses the hypothetical planning calculation to\ndetermine if this would invalidate the failover plan.\u00a0 If it doesn\u2019t\naffect it, then the host is shut down normally.\u00a0 If the plan would\nbe violated, but the <em>nhtol</em> is greater than 1, XenCenter will give\nthe administrator the option of lowering the <em>nhtol</em> value by 1.\u00a0\nThis reduces the overall resilience of the pool, but always ensures\nthat at least one host failure will be tolerated.\u00a0 When the host\ncomes back up, the plan is automatically recalculated and the\noriginal 
<em>nhtol</em> value restored if appropriate.</li>\n<li>If you try to apply a hotfix, then XenCenter will disable HA for the\nduration of the pool patching wizard.\u00a0 It is important to manually\nkeep an eye on hotfix application to ensure that host failures do\nnot disrupt the operation of the pool.</li>\n</ul>\n<p>So, I hope this short article has given you a taster... just kidding! This post is almost as long as my PhD thesis, but then, HA is a complex topic. Please do feel free to get back to me with comments and feedback about how we can improve it in future releases, or if you just love it the way it is.\u00a0 Many thanks to <a href=\"https://github.com/djs55\">Dave Scott</a>, <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, Ewan Mellor and Hugh Warrington for their input to this article.</p>",+"content": "<p>Well, the big launch of <a href=\"http://www.xenserver5.com/\">XenServer 5</a> has gone smoothly, and with it has arrived a flood of questions about how exactly the new <a href=\"https://web.archive.org/web/20081121042533/https://xenserver5.com/ha.php\">High Availability</a> functionality works.\u00a0 I\u2019ll use this post to explain the overall architecture of HA in XenServer 5, and also how some of the fault detection and failure planning works.</p>\n<p>Fundamentally, HA is about making sure important VMs are always running on a resource pool. 
There are two aspects to this: reliably <strong>detecting host failure</strong>, and computing a <strong>failure plan</strong> to ensure swift recovery.</p>\n<p>Detecting host failure reliably is difficult since you need to remotely distinguish between a host disappearing for a while versus exploding in a ball of flames.\u00a0 If we mistakenly decide that a master host has broken down and elect a new master in its place, there may be unpredictable results if the original host were to make a comeback!\u00a0\u00a0 Similarly, if there is a network issue and a resource pool splits into two equal halves, we need to ensure that only one half accesses the shared storage and not both simultaneously.</p>\n<h2><a href=\"https://anil.recoil.org/#heartbeating-for-availability\"></a>Heartbeating for availability</h2>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-3b.webp\" title=\"\">\n</p>\n<p>We solve all these problems in XenServer by having two mechanisms: a <strong>storage heartbeat</strong> and a <strong>network heartbeat</strong>. When you enable HA in a pool, you must nominate an iSCSI or FC storage repository to be the heartbeat SR. XenServer automatically creates a couple of small virtual disks in this SR. The first disk is used by every physical host in the resource pool as a <strong>shared quorum disk</strong>. Each host allocates itself a unique block in the shared disk and regularly writes to the block to indicate that it is alive.</p>\n<p>I asked <a href=\"https://github.com/djs55\">Dave Scott</a>, the principal engineer behind HA, about the startup process:</p>\n<blockquote>\n<p>When HA starts up, all hosts exchange data over both network and\nstorage channels, indicating which hosts <em>they</em> can see over both\nchannels; i.e. 
which I/O paths are working and which are not.\u00a0 This\nliveness information is exchanged until a fixed point is reached and\nall of the hosts are satisfied that they are in agreement about what\nthey can see.\u00a0 When this happens, the HA functionality is \u2018armed\u2019 and\nthe pool is protected.</p>\n</blockquote>\n<blockquote>\n<p>This HA arming process can take a few minutes to settle for larger\npools, but is only required when HA is first enabled.</p>\n</blockquote>\n<blockquote>\n<p>Once HA is active, each host regularly writes storage updates to the\nheartbeat virtual disk, and network packets over the management\ninterface.\u00a0 It is vital to ensure that network adapters are\n<a href=\"http://docs.xensource.com/XenServer/5.0.0/1.0/en_gb/reference.html#networking-standalone_host_config-bonds\">bonded</a>\nfor resilience, and that storage interfaces are using <a href=\"http://docs.xensource.com/XenServer/5.0.0/1.0/en_gb/reference.html#id2557754\">dynamic\nmultipathing</a>\nwhere supported.\u00a0 This will ensure that any single adapter or wiring\nfailures do not result in any availability issues.</p>\n</blockquote>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-5.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-5-1.webp\" title=\"\">\n</p>\n<p>The worst-case scenario for HA is the situation where a host is thought to be off-line but is actually still writing to the shared storage, since this can result in corruption of persistent data.\u00a0 In order to prevent this situation without requiring active power strip control, we implemented <strong>hypervisor-level fencing</strong>.\u00a0 This is a Xen modification which will hard-power the host off at a very low level if it doesn\u2019t hear regularly from a watchdog process running in the control domain.\u00a0 Since it is implemented at a very low level, this also covers the case where the control domain becomes unresponsive for any 
reason.</p>\n<p>Hosts will self-fence (i.e. power off and restart) in the event of any heartbeat failure unless any of the following hold true:</p>\n<ul>\n<li>The storage heartbeat is present for all hosts but the network has\npartitioned (so that there are now two groups of hosts).\u00a0 In this\ncase, all of the hosts which are members of the largest network\npartition stay running, and the hosts in the smaller network\npartition self-fence.\u00a0 The assumption here is that the network\noutage has isolated the VMs, and they ought to be restarted on a\nhost with working networking.\u00a0 If the network partitions are exactly\nthe same size, then only one of them will self-fence according to a\nstable selection function.</li>\n<li>If the storage heartbeat goes away but the network heartbeat\nremains, then the hosts check to see if they can see all other hosts\nover the network.\u00a0 If this condition holds true, then the hosts\nremain running on the assumption that the storage heartbeat server\nhas gone away.\u00a0 This doesn\u2019t compromise VM safety, but any network\nglitches will result in fencing since that would mean both\nheartbeats have disappeared.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#planning-for-failure\"></a>Planning for failure</h2>\n<p>The heartbeat system gives us reliable notification of host failure, and so we move onto the second step of HA: capacity planning for failure.</p>\n<p>A resource pool consists of several physical hosts (say, 16), each with potentially different amounts of host memory and a different number of running VMs.\u00a0 In order to ensure that no single host failure will result in the VMs on that host being unrestartable (e.g. due to insufficient memory on any other host), the XenServer pool dynamically computes a <strong>failure plan</strong> which calculates the actions that would be taken on any host failure.</p>\n<p>But there\u2019s one more complexity... 
a single host failure plan does not cover more advanced cases such as network partitions which take out entire groups of hosts.\u00a0 It would be very useful to be able to create a plan that could tolerate more than a single host failure, so that administrators could ignore the first host failure and be safe in the knowledge that (for example) three more hosts could fail before the pool runs out of spare capacity.</p>\n<p>That\u2019s exactly what we do in XenServer... the resource pool <em>dynamically</em> computes a failure plan which considers the \u201cnumber of host failures to tolerate\u201d (or <em>nhtol</em>).\u00a0 This represents the number of disposable servers in a pool for a given set of protected VMs.</p>\n<p>The planning algorithms are pretty complex, since doing a brute force search of all possible failures across all hosts across all VMs is an exponential problem.\u00a0 We apply heuristics to ensure we can compute a plan in a reasonably small time:</p>\n<ul>\n<li>for up to 3 host failures, we do a comprehensive search which tries\nalmost all permutations.\u00a0 This covers corner cases such as having\nhosts or VMs with very different amounts of memory (e.g. 
4GB vs\n128GB).\u00a0 Rather than calculate memory slots or otherwise approximate\nresults, we just deal with them individually and give very accurate\nplans.</li>\n<li>for greater than 3 host failures, we make conservative decisions by\napproximating every VM to be as large as the largest, and\nconsidering each host to be the same as the most densely packed\nhost.\u00a0 We do not approximate the host memory, and so having pools\nwith uneven amounts of host memory will be fine.\u00a0 However, in\napproximate planning mode having a single very large VM will result\nin a low <em>nhtol</em> value.\u00a0 If this is a problem, then try to reduce\nthe <em>nhtol</em> or try to have a more even spread of VM memory sizes.</li>\n</ul>\n<p>Since planning algorithms are designed for unexpected host failures, we only consider absolutely essential resource reservations which would prevent the VM from starting on the alternative host (e.g. storage is visible, and enough memory is present).\u00a0 We do not perform CPU reservation on the basis that it can be optimised at a later stage via live relocation once the VM is back up and running.</p>\n<h3><a href=\"https://anil.recoil.org/#overcommit-protection\"></a>Overcommit protection</h3>\n<p>We now have HA armed and a failover plan for our VMs.\u00a0 But what if you want to make changes to your configuration after HA is enabled?\u00a0 This is dealt with via <strong>overcommit protection</strong>.</p>\n<p>The XenServer pool dynamically calculates a new failover plan in response to every XenAPI call which would affect it (e.g. 
starting a new VM).\u00a0 If a new plan cannot be calculated due to insufficient resources across the pool, the XenServer will return an <strong>overcommitment</strong> error message to the client which blocks the operation.</p>\n<h4><a href=\"https://anil.recoil.org/#the-what-if-machine\"></a>The \u201cWhat if?\u201d Machine</h4>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-4b.webp\" title=\"\">\n</p>\n<p>This overcommit protection would be quite irritating if you have to keep trying things and seeing if a plan exists or not, and so we built in a "<a href=\"http://www.gotfuturama.com/Information/Encyc-55-What_If_Machine/\">What If?</a>" machine into XenServer to facilitate counter-factual reasoning.</p>\n<p>When reconfiguring HA via XenCenter, you can supply a hypothetical series of VM priorities, and XenServer will return a number of host failures which would be tolerated under this scheme.\u00a0 This lets you try various combinations of VM protections depending on your business needs, and see if the number of host failures is appropriate to the level of paranoia you desire.</p>\n<p>This can even be done via the CLI, using the snappily named "<strong>xe pool-ha-compute-max-host-failures-to-tolerate</strong>" when HA is enabled.</p>\n<p>The nice thing about XenServer HA is that it is done at the XenAPI level, and so\u00a0 any of the standard clients (such as the xe CLI or XenCenter) or any third-party clients which use the XenAPI will all interoperate just fine.\u00a0 The XenServer pool dynamically recalculates plans in response to the client requests, and so no special \u201coracle\u201d is required outside of the pool to figure out HA plans.</p>\n<p>Finally, HA makes master election completely invisible.\u00a0 Any host in a pool can be a master host, and the pool database is constantly replicated across all nodes and also backed up to shared storage on the heartbeat SR for additional safety.\u00a0 Any XenAPI client can connect to any host, 
and a redirect is issued to the current master host.</p>\n<h2><a href=\"https://anil.recoil.org/#protection-levels\"></a>Protection Levels</h2>\n<p>Each VM in an HA pool can be either <strong>fully protected</strong>, <strong>best-effort</strong> or <strong>unprotected</strong>. VMs which are protected are all included in the failover planning, and if no plan exists for which they can all be reliably restarted then the pool is considered to be overcommitted. Hugh Warrington (who implemented the XenCenter HA support) explained how the protection levels are used:</p>\n<blockquote>\n<p>Best-effort VMs are not considered when calculating a failover plan,\nbut the pool will still try to start them as a one-off if a host that\nis running them fails.\u00a0 This restart is attempted after all protected\nVMs are restarted, and if the attempt to start them fails then it will\nnot be retried.\u00a0 This is a useful setting for test/dev VMs which\naren\u2019t critical to keep running, but would be nice to do so in a pool\nwhich also has some important VMs which absolutely must run.</p>\n</blockquote>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/ha-wizard-5.webp\" title=\"\">\n</p>\n<p>There are some advanced features which are only available via the CLI.\u00a0\u00a0 Each protected VM in an HA pool can be assigned a numeric <code>ha-restart-priority</code>.\u00a0 If a pool is well-resourced with a high <em>nhtol</em>, then these restart priorities are not relevant: the VMs are all guaranteed to be started.</p>\n<p>If more hosts fail than have been planned for, then the priorities are used to determine the order in which VMs are restarted.\u00a0 This ensures that in over-committed pools, the most important VMs are restarted first.\u00a0 Although the pool will start priority 1 VMs first, they might not finish booting before the priority 2 VMs, and so this should not be used as the basis for service ordering.</p>\n<p>Note that it's very important to <strong>ensure that a 
VM is agile</strong> when protecting it by HA.\u00a0 If the VM is not agile (e.g. has a physical CD drive mapped in from a host), then it can only be assigned Best Effort restart since it is tied to one host.</p>\n<h2><a href=\"https://anil.recoil.org/#xencenter-support-for-ha\"></a>XenCenter support for HA</h2>\n<p>The best practice for HA is not to make configuration changes while it is enabled.\u00a0 Instead, it is intended to be the "2am safeguard" which will restart hosts in the event of a problem when there isn't a human administrator nearby.\u00a0 If you are actively making configuration changes such as applying patches, then HA should be disabled for the duration of these changes.</p>\n<p>XenCenter makes some common changes under HA much more user-friendly, which I asked <a href=\"http://community.citrix.com/blogs/citrite/ewanm/\">Ewan Mellor</a> (the principal GUI engineer) about:</p>\n<ul>\n<li>Normally a protected VM cannot be shut down via the CLI or from\nwithin the guest (a shutdown from within the guest will\nautomatically restart it).\u00a0 If you try to shut down from XenCenter,\nit will give you the option of unprotecting the VM and then shutting\nit down first.\u00a0 Thus, accidental in-guest shutdowns won't result in\ndowntime, but administrators can still stop a protected guest if\nthey really want to.</li>\n<li>If you want to reboot a host when HA is enabled, XenCenter\nautomatically uses the hypothetical planning calculation to\ndetermine if this would invalidate the failover plan.\u00a0 If it doesn\u2019t\naffect it, then the host is shut down normally.\u00a0 If the plan would\nbe violated, but the <em>nhtol</em> is greater than 1, XenCenter will give\nthe administrator the option of lowering the <em>nhtol</em> value by 1.\u00a0\nThis reduces the overall resilience of the pool, but always ensures\nthat at least one host failure will be tolerated.\u00a0 When the host\ncomes back up, the plan is automatically recalculated and the\noriginal 
<em>nhtol</em> value restored if appropriate.</li>\n<li>If you try to apply a hotfix, then XenCenter will disable HA for the\nduration of the pool patching wizard.\u00a0 It is important to manually\nkeep an eye on hotfix application to ensure that host failures do\nnot disrupt the operation of the pool.</li>\n</ul>\n<p>So, I hope this short article has given you a taster... just kidding! This post is almost as long as my PhD thesis, but then, HA is a complex topic. Please do feel free to get back to me with comments and feedback about how we can improve it in future releases, or if you just love it the way it is.\u00a0 Many thanks to <a href=\"https://github.com/djs55\">Dave Scott</a>, <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, Ewan Mellor and Hugh Warrington for their input to this article.</p>",
+18
avsm/notes_php-port-layout-openbsd.json
+18
avsm/notes_php-port-layout-openbsd.json
···+"summary": "<p>I've committed a big improvement to the PHP port on OpenBSD, by switching from a complex set of FLAVOR tags over to a set of independently installable "multi packages".</p>\n<p>The first thing I did was to import the core PHP without any extensions.</p>\n<pre><code>commit 15dc0f67ef5fd0cae9fb841e608b90b9f51c71ca\nAuthor: avsm <avsm@openbsd.org>\nDate: Mon Jun 24 19:23:41 2002 +0000\n\n Import php4-core-4.2.1\n \n Installs the barebones php4 with only the gettext, iconv and recode\n modules compiled in.\n \n All of the other modules have to be installed as shared modules on\n top of this.\n \n In addition to the Apache module, this package also includes a php\n command-line binary which can be used in shell scripts. The binary\n uses the same /var/www/conf/php.ini file as the Apache module.\n \n There is some non-i386 breakage at the moment (notably macppc).\n</code></pre>\n<p>After that, I imported the extensions system which has many more dependencies, and that generates the multi packages.</p>\n<pre><code>commit a5c226010f93bd3ce70667b801d6518354f44914\nAuthor: avsm <avsm@openbsd.org>\nDate: Mon Jun 24 19:27:46 2002 +0000\n\n Import php4-4.2.1 extensions\n \n This module generates a bunch of php4 extensions as shared modules,\n and seperates them out into multiple packages.\n \n End result is that you can pkg_add individual modules now without\n getting into the mess of flavors that we've had in the past.\n</code></pre>\n<p>This should make the use of <code>pkg_add</code> for PHP much simpler for new users. 
Any problems, please file a bug report or let me know.</p>",+"content": "<p>I've committed a big improvement to the PHP port on OpenBSD, by switching from a complex set of FLAVOR tags over to a set of independently installable "multi packages".</p>\n<p>The first thing I did was to import the core PHP without any extensions.</p>\n<pre><code>commit 15dc0f67ef5fd0cae9fb841e608b90b9f51c71ca\nAuthor: avsm <avsm@openbsd.org>\nDate: Mon Jun 24 19:23:41 2002 +0000\n\n Import php4-core-4.2.1\n \n Installs the barebones php4 with only the gettext, iconv and recode\n modules compiled in.\n \n All of the other modules have to be installed as shared modules on\n top of this.\n \n In addition to the Apache module, this package also includes a php\n command-line binary which can be used in shell scripts. The binary\n uses the same /var/www/conf/php.ini file as the Apache module.\n \n There is some non-i386 breakage at the moment (notably macppc).\n</code></pre>\n<p>After that, I imported the extensions system which has many more dependencies, and that generates the multi packages.</p>\n<pre><code>commit a5c226010f93bd3ce70667b801d6518354f44914\nAuthor: avsm <avsm@openbsd.org>\nDate: Mon Jun 24 19:27:46 2002 +0000\n\n Import php4-4.2.1 extensions\n \n This module generates a bunch of php4 extensions as shared modules,\n and seperates them out into multiple packages.\n \n End result is that you can pkg_add individual modules now without\n getting into the mess of flavors that we've had in the past.\n</code></pre>\n<p>This should make the use of <code>pkg_add</code> for PHP much simpler for new users. Any problems, please file a bug report or let me know.</p>",
+18
avsm/notes_propl-at-splash.json
+18
avsm/notes_propl-at-splash.json
···+"summary": "<p><a href=\"https://dorchard.github.io\">Dominic Orchard</a> and I had a blast <a href=\"https://plas4sci.github.io/conference/2024/01/22/propl.html\">running</a> the first <a href=\"https://propl.dev\">PROPL</a> workshop a couple of years ago, with a full room and engaged audience at POPL in London. Last year, our sister conference <a href=\"https://sicsa.ac.uk/loco/loco2024/\">LOCO</a> took over, and it's our turn again this year! PROPL will return for a <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">second outing</a> in October, co-located with <a href=\"https://icfp25.sigplan.org/\">ICFP</a>/SPLASH in Singapore. Read the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">call for papers</a> here (deadline 3rd July 2025).</p>\n<p>\n<img alt=\"Dominic prepping for the first PROPL in the rather delightful venue\" src=\"https://anil.recoil.org/images/propl-1.webp\" title=\"Dominic prepping for the first PROPL in the rather delightful venue\">\nDominic prepping for the first PROPL in the rather delightful venue</p>\n<p>We'd love to get wider participation in computer science interacting with matters of climate and biodiversity:</p>\n<blockquote>\n<p>There are simultaneous interlinked crises across the planet due to human actions: climate change, biodiversity loss, and desertification. 
Addressing these challenges requires, amongst other things, a global understanding of the present state of affairs and the effectiveness of our adaptations and mitigations, leveraging both data and computation.</p>\n<p>However, programming the computer systems required to effectively ingest, clean, collate, process, explore, archive, and derive policy decisions from the planetary data we are collecting is difficult and leads to artefacts presently not usable by non-CS-experts, not reliable enough for scientific and political decision making, and not widely and openly available to all interested parties. Concurrently, domains where computational techniques are already central (e.g., climate modelling) are facing diminishing returns from current hardware trends and software techniques.</p>\n<p>PROPL explores how to close the gap between state-of-the-art programming methods being developed in academia and the use of programming in climate analysis, modelling, forecasting, policy, and diplomacy. 
The aim is to build bridges to the current practices used in the scientific community.\n -- <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">About PROPL</a></p>\n</blockquote>\n<h2><a href=\"https://anil.recoil.org/#how-to-take-part\"></a>How to take part</h2>\n<p>\n<img alt=\"The first PROPL had keynotes from Drew Purves and Lisa Rennels\" src=\"https://anil.recoil.org/images/propl-2.webp\" title=\"The first PROPL had keynotes from Drew Purves and Lisa Rennels\">\nThe first PROPL had keynotes from Drew Purves and Lisa Rennels</p>\n<p>In order to get a wide set of participants, we've got three different ways you can contribute to the program this year, all of which are listed in the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">call for papers</a>:</p>\n<ul>\n<li>Firstly, we want to hear short "provocations" from practitioners in the field, which outline a problem, application area, challenge, or capacity gap that might be addressable by computer scientists. Since said practitioners are busy people, we've put together a <a href=\"https://forms.gle/DV2rA1iUgNwxfjiW6\">simple online form</a> in which you can submit your thoughts, rants, and ideas.</li>\n<li>Secondly, we're now going to publish a post-proceedings in the ACM digital library. These can be short papers (up to 5 pages, excluding bibliography/appendices) addressing a topic within the scope of the workshop. <a href=\"https://sicsa.ac.uk/loco/loco2024/\">LOCO</a> did a great job encouraging thoughtful submissions last year, and we'd love to see a similar enthusiasm this year too.</li>\n<li>Thirdly, consider submitting a talk or discussion idea aligned with the topics of the workshop. This could include reporting on existing work, a demo, open problems, work in progress, or new ideas and speculation. 
We may combine multiple talk proposals into panel discussions, depending on the submitted topics.</li>\n</ul>\n<p>The papers and talks can be submitted using the <a href=\"https://propl25.hotcrp.com\">PROPL HotCRP</a>, and the provocations via an <a href=\"https://forms.gle/DV2rA1iUgNwxfjiW6\">online form</a>. The deadline is the 3rd July 2025 (anywhere on earth), so we hope to see you take part!</p>\n<h2><a href=\"https://anil.recoil.org/#see-last-years-talks\"></a>See last year's talks</h2>\n<p>For those curious about the first PROPL outing, all of the talk videos are online <a href=\"https://www.youtube.com/watch?v=yZeS4oN_XeI&list=PLyrlk8Xaylp7j9K6CETKpQSpCIOcJ9iO9\">on YouTube</a> or our <a href=\"https://watch.eeg.cl.cam.ac.uk/c/propl24/videos\">EEG video mirror</a> (ad-free).</p>\n<p>\n<img alt=\"We&apos;re looking forward to seeing you in Singapore for the second outing!\" src=\"https://anil.recoil.org/images/propl-3.webp\" title=\"We&apos;re looking forward to seeing you in Singapore for the second outing!\">\nWe're looking forward to seeing you in Singapore for the second outing!</p>\n<p>(Thanks to Lena Yang for spotting typos.)</p>",+"content": "<p><a href=\"https://dorchard.github.io\">Dominic Orchard</a> and I had a blast <a href=\"https://plas4sci.github.io/conference/2024/01/22/propl.html\">running</a> the first <a href=\"https://propl.dev\">PROPL</a> workshop a couple of years ago, with a full room and engaged audience at POPL in London. Last year, our sister conference <a href=\"https://sicsa.ac.uk/loco/loco2024/\">LOCO</a> took over, and it's our turn again this year! PROPL will return for a <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">second outing</a> in October, co-located with <a href=\"https://icfp25.sigplan.org/\">ICFP</a>/SPLASH in Singapore. 
Read the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">call for papers</a> here (deadline 3rd July 2025).</p>\n<p>\n<img alt=\"Dominic prepping for the first PROPL in the rather delightful venue\" src=\"https://anil.recoil.org/images/propl-1.webp\" title=\"Dominic prepping for the first PROPL in the rather delightful venue\">\nDominic prepping for the first PROPL in the rather delightful venue</p>\n<p>We'd love to get wider participation in computer science interacting with matters of climate and biodiversity:</p>\n<blockquote>\n<p>There are simultaneous interlinked crises across the planet due to human actions: climate change, biodiversity loss, and desertification. Addressing these challenges requires, amongst other things, a global understanding of the present state of affairs and the effectiveness of our adaptations and mitigations, leveraging both data and computation.</p>\n<p>However, programming the computer systems required to effectively ingest, clean, collate, process, explore, archive, and derive policy decisions from the planetary data we are collecting is difficult and leads to artefacts presently not usable by non-CS-experts, not reliable enough for scientific and political decision making, and not widely and openly available to all interested parties. Concurrently, domains where computational techniques are already central (e.g., climate modelling) are facing diminishing returns from current hardware trends and software techniques.</p>\n<p>PROPL explores how to close the gap between state-of-the-art programming methods being developed in academia and the use of programming in climate analysis, modelling, forecasting, policy, and diplomacy. 
The aim is to build bridges to the current practices used in the scientific community.\n -- <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">About PROPL</a></p>\n</blockquote>\n<h2><a href=\"https://anil.recoil.org/#how-to-take-part\"></a>How to take part</h2>\n<p>\n<img alt=\"The first PROPL had keynotes from Drew Purves and Lisa Rennels\" src=\"https://anil.recoil.org/images/propl-2.webp\" title=\"The first PROPL had keynotes from Drew Purves and Lisa Rennels\">\nThe first PROPL had keynotes from Drew Purves and Lisa Rennels</p>\n<p>In order to get a wide set of participants, we've got three different ways you can contribute to the program this year, all of which are listed in the <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">call for papers</a>:</p>\n<ul>\n<li>Firstly, we want to hear short "provocations" from practitioners in the field, which outline a problem, application area, challenge, or capacity gap that might be addressable by computer scientists. Since said practitioners are busy people, we've put together a <a href=\"https://forms.gle/DV2rA1iUgNwxfjiW6\">simple online form</a> in which you can submit your thoughts, rants, and ideas.</li>\n<li>Secondly, we're now going to publish a post-proceedings in the ACM digital library. These can be short papers (up to 5 pages, excluding bibliography/appendices) addressing a topic within the scope of the workshop. <a href=\"https://sicsa.ac.uk/loco/loco2024/\">LOCO</a> did a great job encouraging thoughtful submissions last year, and we'd love to see a similar enthusiasm this year too.</li>\n<li>Thirdly, consider submitting a talk or discussion idea aligned with the topics of the workshop. This could include reporting on existing work, a demo, open problems, work in progress, or new ideas and speculation. 
We may combine multiple talk proposals into panel discussions, depending on the submitted topics.</li>\n</ul>\n<p>The papers and talks can be submitted using the <a href=\"https://propl25.hotcrp.com\">PROPL HotCRP</a>, and the provocations via an <a href=\"https://forms.gle/DV2rA1iUgNwxfjiW6\">online form</a>. The deadline is the 3rd July 2025 (anywhere on earth), so we hope to see you take part!</p>\n<h2><a href=\"https://anil.recoil.org/#see-last-years-talks\"></a>See last year's talks</h2>\n<p>For those curious about the first PROPL outing, all of the talk videos are online <a href=\"https://www.youtube.com/watch?v=yZeS4oN_XeI&list=PLyrlk8Xaylp7j9K6CETKpQSpCIOcJ9iO9\">on YouTube</a> or our <a href=\"https://watch.eeg.cl.cam.ac.uk/c/propl24/videos\">EEG video mirror</a> (ad-free).</p>\n<p>\n<img alt=\"We&apos;re looking forward to seeing you in Singapore for the second outing!\" src=\"https://anil.recoil.org/images/propl-3.webp\" title=\"We&apos;re looking forward to seeing you in Singapore for the second outing!\">\nWe're looking forward to seeing you in Singapore for the second outing!</p>\n<p>(Thanks to Lena Yang for spotting typos.)</p>",
+18
avsm/notes_recapping-ocaml-22.json
+18
avsm/notes_recapping-ocaml-22.json
···+"summary": "<p>I recap the OCaml community progress in 2022, which covers a number of bases ranging from\nthe release of OCaml 5.0, the launch of a new website with integrated documentation for 20000+ packages, prototyping new developer workflows that are better integrated into editors, and the launch of ActivityPub based services such as <a href=\"https://watch.ocaml.org\">https://watch.ocaml.org</a>.</p>",+"content": "<p>I recap the OCaml community progress in 2022, which covers a number of bases ranging from\nthe release of OCaml 5.0, the launch of a new website with integrated documentation for 20000+ packages, prototyping new developer workflows that are better integrated into editors, and the launch of ActivityPub based services such as <a href=\"https://watch.ocaml.org\">https://watch.ocaml.org</a>.</p>",
+18
avsm/notes_roadmap-ocamlorg-v3.json
+18
avsm/notes_roadmap-ocamlorg-v3.json
···+"summary": "<p>After a decade of good service, it's time to overhaul OCaml's online presence\nto more modern technologies. This post lays out the roadmap for the third\nedition of the OCaml.org website.</p>",+"content": "<p>After a decade of good service, it's time to overhaul OCaml's online presence\nto more modern technologies. This post lays out the roadmap for the third\nedition of the OCaml.org website.</p>",
+18
avsm/notes_royal-society-newton.json
···+"summary": "<p>I joined the <a href=\"https://royalsociety.org\">Royal Society</a> <a href=\"https://royalsociety.org/grants/newton-international/\">Newton International Fellowships</a>\n<a href=\"https://royalsociety.org/people/anil-madhavapeddy-36582/\">committee</a> to help with selecting bright new scientists from abroad who wish to conduct research in the UK.</p>\n<blockquote>\n<p>The Newton International Fellowship (NIF) programme provides support for outstanding early career researchers to make a first step towards developing an independent research career through gaining experience across international borders. The fellowships enable researchers to access expertise, gain new perspectives and build long-lasting collaborative relationships.\n -- <a href=\"https://royalsociety.org/grants/newton-international/\">The Royal Society</a></p>\n</blockquote>",+"content": "<p>I joined the <a href=\"https://royalsociety.org\">Royal Society</a> <a href=\"https://royalsociety.org/grants/newton-international/\">Newton International Fellowships</a>\n<a href=\"https://royalsociety.org/people/anil-madhavapeddy-36582/\">committee</a> to help with selecting bright new scientists from abroad who wish to conduct research in the UK.</p>\n<blockquote>\n<p>The Newton International Fellowship (NIF) programme provides support for outstanding early career researchers to make a first step towards developing an independent research career through gaining experience across international borders. The fellowships enable researchers to access expertise, gain new perspectives and build long-lasting collaborative relationships.\n -- <a href=\"https://royalsociety.org/grants/newton-international/\">The Royal Society</a></p>\n</blockquote>",
+18
avsm/notes_rs-ecorisk-day1.json
···+"summary": "<p>I'm at the Royal Society this morning for the 2 day programme on <a href=\"https://royalsociety.org/science-events-and-lectures/2024/10/ecological-and-commercial-risk/\">"How does ecological risk relate to commercial risk?"</a>, and am reporting on the <a href=\"https://royalsociety.org/-/media/events/2024/10/ecological-risk/programme-booklet.pdf\">morning session</a>. The full program is being <a href=\"https://www.youtube.com/watch?v=gVuxzand8RE\">livestreamed</a> so please do dial in if the below notes seem interesting to you. I put this note up almost live, so any errors below are my own.\n<em>(Update: partial <a href=\"https://anil.recoil.org/#daytwo\">day 2 notes</a> now available below)</em></p>\n<h2><a href=\"https://anil.recoil.org/#opening-keynote-by-sir-partha-dasgupta\"></a>Opening Keynote by Sir Partha Dasgupta</h2>\n<p>The summit kicked off with a keynote by economist <a href=\"https://en.wikipedia.org/wiki/Partha_Dasgupta\">Sir Partha Dasgupta</a>. The focus was on the intersection of nature and economics, covering how markets fail to account for the ecosystems that sustain them. His <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">landmark report</a> covered ecosystem services, freshwater, tipping points, and physical risk, bringing to light the urgent need to reframe economic activities around the services provided by nature.</p>\n<p>\n<img alt=\"Sir Partha Dasgupta opening the morning session.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-1.webp\" title=\"Sir Partha Dasgupta opening the morning session.\">\nSir Partha Dasgupta opening the morning session.</p>\n<p>He began by distinguishing between two types of market activities:</p>\n<ul>\n<li><strong>Provisioning goods</strong>. 
Commodities like food, water, timber, fibres and so forth are the primary products which, when aggregated and valued at market prices, form the GDP (the visible outputs) of human endeavour.</li>\n<li><strong>Processes</strong> are the background work of maintaining and regulating the services which produce these goods, and they are problematic from a commercial perspective. The services aren't extractive in the same way as provisioning goods, but they express themselves via the goods.</li>\n</ul>\n<p>In the language of economics, there is a missing link here for the markets of\nprocesses, which makes for inefficiency, but more alarmingly a crisis in the\nmanagement of natural resources. The action is on the activities we undertake\nwhich affect our landscape -- and the huge number of species affected therein.\nOur knowledge is extremely limited, but keeping them intact is in our interest.</p>\n<p>There is a good deal of work on option values which economists have tried to\nuncover, but research has somewhat stalled in recent years. The risks that\ncompanies face as a consequence of this inefficiency are correlated, which is\nextremely dangerous for market stability.</p>\n<p>He tried to address this in the review of the <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">Economics of Biodiversity\nreport</a>,\nbut it needs a lot more thinking. A feature of the human condition is that\nfragmented ecosystems lose their productivity. There is a big difference between the\nproductivity of an intact ecosystem and the sum of its individual parts. The\n<a href=\"https://www.theguardian.com/environment/2021/dec/27/thomas-lovejoy-conservation-biologist-dies-80\">late</a>\nTom Lovejoy did a lot of work on this in the context of the Amazon rainforest\necosystem. 
Similarly, we are looking at a fragmented nature ecosystem today in\na global context as more and more habitats get truncated.</p>\n<p>Extinction rates of organisms are growing hugely above the background rate - the option\nvalue of organisms suggests we are losing enormous amounts of value in the form\nof unknown lifeforms. The correlation risk here is something that whole\ngovernments need to take on board -- it's too vast for any single organisation\nto handle!</p>\n<h2><a href=\"https://anil.recoil.org/#ecosystem-services-and-physical-risk\"></a>Ecosystem Services and Physical Risk</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Jane_Lubchenco\">Jane Lubchenco</a> chaired the panel session. She is the administrator of the NOAA, "on loan" to the White House to work on nature and technology policy.</p>\n<p>\n<img alt=\"Jane Lubchenco chairing the morning session.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-2.webp\" title=\"Jane Lubchenco chairing the morning session.\">\nJane Lubchenco chairing the morning session.</p>\n<h3><a href=\"https://anil.recoil.org/#dr-tony-juniper-cbe\"></a>Dr Tony Juniper CBE</h3>\n<p>We have not always been looking at nature through a financial perspective. The\njourney of Natural England in the 1940s began more from an ethical and moral\nperspective, and from the scientific and beauty value of landscapes. It's only\nrecently that the practical impacts of nature and its benefits to humanity are\nbeing considered.</p>\n<p>The <a href=\"https://en.wikipedia.org/wiki/Millennium_Ecosystem_Assessment\">Millennium Ecosystem\nAssessment</a>\ncommissioned in 2001 by Kofi Annan was a stocktake of the earth's natural\ncapital assets, and remarkable for making clear that if we didn't\nreverse the decline of nature then we would be unable to meet humanity's needs\nsuch as ending poverty. 
A few years later in 2007 the G8 commissioned\n<a href=\"https://en.wikipedia.org/wiki/The_Economics_of_Ecosystems_and_Biodiversity\">"Economics of Ecosystems and\nBiodiversity"</a>,\nwhich ran until 2011 and helped reset the understanding of the human world\nand is fundamental to how we conduct our ecosystem. Later, the\n"Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem\nServices" (<a href=\"https://www.ipbes.net\">IPBES</a>) was commissioned by the UN General\nAssembly to consider the contribution of nature to people.</p>\n<p>\n<img alt=\"Tony Juniper speaking.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-3.webp\" title=\"Tony Juniper speaking.\">\nTony Juniper speaking.</p>\n<p>So as nature rises in the political eye, there are multiple national programs\nall trying to assess the state of natural assets. In 2020 there was a Treasury\ninitiative to bring natural capital into national accounting in the UK.\nDasgupta's landmark report established that nature and the economy are\nintertwined, which was a huge change in thinking. We no longer need to assume\nthat degradation of nature is the price of progress, and the new reality is\nthat the range of ecosystem services is critical to how we need to conduct our\necosystem services into the future. Pollination, carbon capture, biomimicry,\nnutrient capture, when all added up, are worth more than the UK's GDP, but we\nstill struggle with how to measure this as part of our conventional economies!</p>\n<p>After 20 years of expert studies and carefully constructed datasets, we\nstill struggle to do the right thing, but why? No one company is quite able to\nfully embrace the scope of the problem. There are 1000s of medium to large\ncompanies that depend on agriculture, but they are all dependent on 25 billion\ntons of water moving around intercontinental distances for the global water\ncycle. 
Any one company might make a small difference to reduce their\ndeforestation footprint but without collective action they will be unable to\nmake a global change. There must be a collective force to bring things\ntogether, and also better understanding of the connectivity across these\nfactors.</p>\n<p>However, there still isn't a prominent intellectual place for the connections\nbetween ecology and economy; they still aren't being taught in undergraduate\ncourses and so are not making it through to high political decision making. There\nis still a lot of emphasis on financial growth, but not much discussion of\nnature: we should be factoring in that nature is the mechanism by which we can\nachieve financial growth. It's quite hard to measure biodiversity vs "tons of\ncarbon". Environment regulation must move beyond stopping harm to also plotting\npathways to nature recovery. So the link between ecological and commercial\nrisk is very real and must be measured and actioned.</p>\n<h3><a href=\"https://anil.recoil.org/#freshwater-professor-louise-heathwaite-cbe-frs\"></a>Freshwater (Professor Louise Heathwaite CBE FRS)</h3>\n<p>Her first ever "proper job" was working at the Nature Conservancy as their first hydrologist. Freshwater is at the centre of the triple planetary crisis today. As a result of these challenges, what we see is an intensification of the water cycle:</p>\n<ul>\n<li>increasing temperatures increase atmospheric water holding capacity</li>\n<li>precipitation increases and evaporation increases, so the cycle intensifies</li>\n<li>evaporative demand increases and so extreme drought events increase</li>\n</ul>\n<p>Our lack of investment in proper waste recycling infrastructures is now coming\nback to haunt us. This impacts hugely on freshwater availability and quality. 
It\ndisrupts commercial supply chains, and the quality of the supply for both water\nand ecosystem services on which we heavily depend.</p>\n<p>\n<img alt=\"Louise Heathwaite shows the GRACE water map.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-4.webp\" title=\"Louise Heathwaite shows the GRACE water map.\">\nLouise Heathwaite shows the GRACE water map.</p>\n<p>Two satellites (GRACE) orbited the earth from 2002-2017, detecting\ncentimetre-scale changes in water mass. The image shows how\nconnected the world is, and the scale of the changes at the poles and Greenland\nis incredible. Some research just finished up in Greenland, and the rate of\nwarming in the Arctic is roughly 2x the global warming rate. Greenland is\nlosing almost 260 gigatons of ice per year, which is roughly a hundred million Olympic-sized\nswimming pools a year -- an awful lot of water to move around into the ocean\nsystems. The ocean cycles aren't on the slide, but they will have an impact.</p>\n<p>Only 1% of the total water supply is freshwater, and its appropriation among human\nactivities is quite unbalanced. We have green water to blue water to white water\nto grey water to black water. We have messed up the blue and black water\nmanagement. We are producing 360 billion cubic metres of waste water a year\nand only 3% is recycled. 
A significant percentage of the world population (2\nbillion people) are dependent on black water, and a lot only have access to\ngrey or black water.</p>\n<p>\n<img alt=\"Water flows in the ecosystem.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-5.webp\" title=\"Water flows in the ecosystem.\">\nWater flows in the ecosystem.</p>\n<p>We need to treat management of surface and groundwater systems holistically.\nThe freshwater living planet index is dropping precipitously: some causes are flow\nreduction, introduction of invasive species and nitrogen/acidification peaks in the\n1970s/80s/90s which all cause major changes in freshwater biodiversity. We\nalso introduced laws on long range transport regulation about acidification. In\n1991 we had the EU urban wastewater directive and in 2000 the EU water\nframework directive, but none of this seems to have made much of a difference (see graph above).</p>\n<p>\n<img alt=\"Freshwater confidence index not measurably changed by legislation.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-6.webp\" title=\"Freshwater confidence index not measurably changed by legislation.\">\nFreshwater confidence index not measurably changed by legislation.</p>\n<p>Are we looking at a tipping point in terms of waste water systems? We don't have\nclear metrics about freshwater biodiversity, and without the metrics we don't\nhave a way of valuing them as part of commercial systems. We have new emerging\npollutants (microplastics etc) which require multi-sectoral\nchanges to make a difference, and these are lacking despite the legislation in\nplace.\nHowever, the growth in urbanisation (56% of UK population lives in cities, increasing to\n70% by 2050) makes a huge difference to where waste comes from. When we\nmove food around the world, there is a water cost which is passed onto other\ncountries. 
<em>(Note: see also our preprint on <a href=\"https://anil.recoil.org/papers/2024-food-life\">Quantifying the impact of the food we eat on species extinctions</a>)</em></p>\n<p>When we move to decarbonisation, the last 20% of action (the really difficult bit) is\nrelated to land use in particular. But if we join up the practices around good\nwater use and decarbonisation then we have a bunch of innovative interventions,\nfor example natural flood management. There are also co-benefits around\ninfrastructure and there is a real challenge with our UK underinvestment. The\nlast report, in 2022, contains some excellent propositions for how to deal with\nwater and atmospheric pollution and biodiversity. We need to act on this\nreport, but progress is slow: e.g. Thames Water is a long way away from\nactioning this.</p>\n<p>\n<img alt=\"Water flows between countries worldwide.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-7.webp\" title=\"Water flows between countries worldwide.\">\nWater flows between countries worldwide.</p>\n<p>In the 1980s, <a href=\"https://adas.co.uk\">ADAS</a>, then the wing of the farming and food ministry, worked on how to\nrecycle waste better. Metals were the big problem back then, but now there are\nloads of extra contaminants such as plastics and forever chemicals. We must\nmove towards turning the waste we produce into reusable end products for farming\nand dramatically drop our water use.</p>\n<h3><a href=\"https://anil.recoil.org/#tipping-points-and-biosphere-stewardship-professor-carl-folke\"></a>Tipping points and biosphere stewardship (Professor Carl Folke)</h3>\n<p>We have polycrises with climate change, biosphere pressure, intertwined\nsystems, interacting shocks, and many tipping points. Tipping points are a\nshock that can move a system from a local maximum into a new stable state, but also\nthe gradual loss of resilience can cause a system to shift suddenly. 
There are a\nnumber of drivers that cause these tipping points, and they are not just a\nrandom aside: they should be a key part of any investment strategy.</p>\n<p>A lot of people live in areas that are at risk of ecological tipping points,\nand the map seems very correlated to the earlier water change map shown by\nLouise! 56% of human CO2e emissions has been taken up by the biosphere so far (1430\nGtCO2 from 1850-2019) which has been happening for "free" thanks to nature, and\nthis might come to an abrupt end soon which is a big source of worry in the\nscientific community.</p>\n<p>\n<img alt=\"Prof Folke discussing tipping points.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-8.webp\" title=\"Prof Folke discussing tipping points.\">\nProf Folke discussing tipping points.</p>\n<p>The connectivity of various tipping points is becoming clearer. There are just\na few major actors that shape some of these (such as the Amazon rainforest) and\nthey are often quite far away geographically by virtue of their influence over\nmarkets.</p>\n<p>The current idea is that social tipping ("social norms as solutions") is the\nway to alter our value systems. See "<a href=\"https://www.cambridge.org/core/journals/global-sustainability/article/operationalising-positive-tipping-points-towards-global-sustainability/8E318C85A8E462AEC26913EC43FE60B1\">Operationalising positive tipping points\ntowards global\nsustainability</a>"\nby Tim Lenton et al. 
<em>(Note: Simon Sharpe also authored this, see our CSaP\n<a href=\"https://www.csap.cam.ac.uk/news/article-reading-group-five-times-faster-4-rethinking-unive/?preview=1\">reading\ngroup</a>\non his excellent <a href=\"https://fivetimesfaster.org\">"Five Times Faster"</a> book.)</em></p>\n<p>\n<img alt=\"Exposure of people to tipping points.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-9.webp\" title=\"Exposure of people to tipping points.\">\nExposure of people to tipping points.\n\n<img alt=\"Interconnected climate tipping points worldwide.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-10.webp\" title=\"Interconnected climate tipping points worldwide.\">\nInterconnected climate tipping points worldwide.</p>\n<p>So we need to prepare for transformation, navigate the sudden transition and\nthen build resilience in the new norm after the tipping point. But our window\nof opportunity is right now, so action must happen urgently or we miss\nthe window and enter the tipping point unprepared. So "corporate biosphere\nstewardship" is a new business logic with the purpose of shepherding and\nsafeguarding the resilience of the biosphere for human well-being, and\nfostering the sustainability of a rapidly changing planet. Rather than viewing\nnature as a compliance question, we should view it as humanity's greatest\nbusiness opportunity! For example, Seafood Business for Ocean Stewardship\n(<a href=\"https://seabos.org\">seabos.org</a>) is codifying this\napproach for marine foodstocks. There is an increasing focus on how to report\nthis stuff. 
There are three good books to read more on this:</p>\n<ul>\n<li><a href=\"https://openknowledge.worldbank.org/entities/publication/855c2e15-c88b-4c04-a2e5-2d98c25b8eca\">"Nature's Frontiers"</a>, 2023 from the World Bank group.</li>\n<li><a href=\"https://www.stockholmresilience.org/publications/publications/2022-09-29-economy-and-finance-for-a-just-future-on-a-thriving-planet.html\">"Economy and Finance for a Just Future on a Thriving Planet"</a>, 2022 from the SRC.</li>\n<li><a href=\"https://www.ngfs.net/en/the-green-scorpion-macro-criticality-nature-for-finance\">"The Green Scorpion"</a>, the macro-criticality of nature for finance.</li>\n</ul>\n<p>\n<img alt=\"The window to tackle tipping points.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-11.webp\" title=\"The window to tackle tipping points.\">\nThe window to tackle tipping points.</p>\n<h3><a href=\"https://anil.recoil.org/#ecosystem-services-and-physical-risk-paul-polman-kbe\"></a>Ecosystem services and physical risk (Paul Polman KBE)</h3>\n<p><a href=\"https://en.wikipedia.org/wiki/Paul_Polman\">Paul Polman</a> is the former CEO of\nUnilever. He started by noting that we have to make sure that we don't just\nspend our time talking to each other, but we have to get the message out in the\nform of action to the wider world! When they developed the SDGs, Ban Ki Moon\nrequested Paul to represent the private sector. He was described as <em>"the\nproblem walking into the room"</em> when the politicians first met him, but luckily\nthey ended up in a happier place! We must find a balanced path into a sustainable\neconomy for the future. Most businesses, although they don't behave entirely\nsustainably now, do understand the need for a planet. 
When he ran\nUnilever for a decade, they encountered hundreds of millions of dollars worth of\nnature-related interruptions to their business.</p>\n<p>4 billion people in the world depend on natural medicine, and nature governs\nhuge amounts of the human ecosystem. Nature supports peace and national\nsecurity - many of the world's geopolitical events trace back to inequality in\naccess to natural resources and this is the foundation for many economies. The\nWEF report recently calculated that $44tn of the world economy depends on\nnature -- but this is a huge understatement given that our entire life depends\non it! Changes in one ecosystem affect others, and so our conversation must\naddress the interrelationship of all this that makes it so complex.</p>\n<p>\n<img alt=\"Paul Polman discussing global business and nature.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-12.webp\" title=\"Paul Polman discussing global business and nature.\">\nPaul Polman discussing global business and nature.</p>\n<p>We must not see ourselves at the top of the pyramid -- biomimicry and\ngeoengineering is an arrogant approach as our survival relies on cooperation\nwith the biosphere -- when we destroy nature we destroy ourselves. (paraphrased quote) <em>"Man is the\nmost insane species - he worships an invisible god and destroys the visible\nnature, not realising that the invisible god he worships is the visible nature\nhe destroys"</em> (original: <a href=\"https://www.goodreads.com/quotes/1171374-man-is-the-most-insane-species-he-worships-an-invisible\">Hubert Reeves</a>). Extinctions are 10-100x the average of the previous\ncenturies.</p>\n<p>Food and land use is about 30% of our global emissions, and yet we have the\naudacity to keep our farmers in poverty. Every $1 invested in changing nature\nand regenerative farming approaches gives us a $16 return. 
The food companies are\nreally exposed if they don't act, much as we criticise fossil fuel companies\nright now.</p>\n<p>We are making withdrawals much faster than we are depositing in the bank of\nplanetary boundaries, and there are millions of people losing their lives and\nbillions being displaced because of these choices. The Caribbean has lowered\nits tourism by a significant percentage due to beach erosion. The Amazon this\nyear has had unprecedented wildfires (around the size of Italy) as an example.\nWe name our tropical storms (we name them as if they are our friends); would it\nbe different if we named them Exxon, Chevron, etc? We must absolutely link the\ncommercial sector to nature as companies like Unilever must become\nborder-positive and nature positive as they are hugely global.</p>\n<p>Leadership is increasingly centred in Europe with many business regulations\nthat are nature positive. There is a bunch of business interaction happening,\nbut the estimation is that loss of biodiversity has cost us over $10tn. In\nagriculture we are destroying $12tn of value, but if we turn that around it's a\n$4tn opportunity. It just doesn't make business sense not to act. Covid\nshowed us that infinite growth on a finite planet is unsustainable.</p>\n<p>What can business do next then?</p>\n<ul>\n<li><em>Invest in nature.</em> It drives innovation and nature has probably been the best R&D lab (1/3rd of medicines come directly from nature)</li>\n<li><em>Water utility improvement.</em> If the 500 largest cities looked into restoring local forests and water tables, it would make a huge difference in quality of life.</li>\n<li><em>A mindset change for business.</em> See his <a href=\"https://netpositive.world/book/\">book on the topic</a>. 
The only way to think in business to be successful now is to think regeneratively and restoratively -- every action needs to contribute to restoration and not cause us to fail slightly more slowly.</li>\n</ul>\n<p>Priorities for business action:</p>\n<ul>\n<li>Repair and restore nature, e.g. 30x30 in the global biodiversity framework.</li>\n<li>Account for the value of nature, the point of this event: less than 5% of companies today account for nature, and this needs to vastly increase.</li>\n<li>Form partnerships for advocacy and change. Most of the big issues cannot be tackled by one company; even at Unilever Paul could only solve about 20% of the problem with one of the biggest companies in the world. "Planetary Guardians", launched at the UN last week, is another example of partnership for global change.</li>\n<li>Align financial flows using the political moves around GBF and the Paris agreement. We need to put money behind this.</li>\n</ul>\n<p>Nature needs business, and business needs nature, and the time to act is now!</p>\n<p>\n<img alt=\"Breaching planetary boundaries.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-13.webp\" title=\"Breaching planetary boundaries.\">\nBreaching planetary boundaries.\n\n<img alt=\"Key priorities next for nature and business.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-14.webp\" title=\"Key priorities next for nature and business.\">\nKey priorities next for nature and business.\n\n<img alt=\"The full panel discussing these topics.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-15.webp\" title=\"The full panel discussing these topics.\">\nThe full panel discussing these topics.</p>\n<h2><a href=\"https://anil.recoil.org/#discussion\"></a>Discussion</h2>\n<p>Partial notes from the audience Q&A follow.</p>\n<p>Q: To what extent have we made progress in the transition towards a nature positive future -- is the awakening fast enough?</p>\n<p>Tony: what we have discussed 
here is not very well known among the majority of the\nworld's population. Most people get in their car and go to work, and their food\nis processed and disconnected from nature. So the insight that most people have\ninto their dependencies on nature isn't very well known. David Attenborough has\nmade it clear that if we take steps towards repairing the natural world then\npeople will need to be exposed to nature so that they understand what they're\ntrying to save. This was disconnected in the beginning of the industrial\nrevolution, and we need to urgently reconnect it, or people will simply not\nappreciate what we're about to lose.</p>\n<p>Louise: is quite optimistic because the younger generation is very engaged in\nthis, and we need to ensure they have the levers for action. It's the older\ngeneration's challenge to bring the right levers to them.</p>\n<p>Paul: the science is still evolving at a fast pace, and will continue to, and\nif we don't convert it to fast practical steps we'll be in trouble. Everything\nstarts and ends with education, but we are also out of time. The urgency isn't\nwell understood -- a survey of 4000 people on this found that about 30% of\npeople had left a company because it wasn't aligned with their values.\n60% were considering leaving but didn't. We must get businesses to move out of\nthe current gridlock where multilateral institutions cannot change easily, and\nwe must cooperate across boundaries. The EU has a nature restoration law which\nwouldn't have happened without businesses getting involved to push it over the\nline for example -- without that it wouldn't have moved. And tipping points mean\nthat only 4-5% change is required to get us to a new stable state, so this is\nboth an opportunity and a risk. 
The biggest risk right now is the US election,\nas it's a global vote but decided by a tiny minority.</p>\n<p>Q: What examples have you seen about nature flows in the real world?</p>\n<p>Paul: the economic forces must be made to work; people aren't skeptical but the\neconomic realities must work. But there are pressures; the average tenure of a\nCEO has dropped to 4-5 years and so their actions are aligned with reelection\ncycles just like politics! So we have to make capital and money flow work much\nmore transparently. With a relatively small investment, we can get to 50%\nregenerative agriculture; this is happening in the USA (thanks to IRA). Many\nof the major food companies (from Pepsi to Unilever) are committed to $15bn\ntowards soil health and regenerative agriculture. And there is nothing better\nthan healthy soil towards preserving yields in the face of climate change. The\naverage age of farmers is ~50 now and so they only have 10 seasons left before\nthey retire, so they need to be paid to shift proactively for ecosystem\nservices. It needs proper farmer rewards to make the shift quickly and hedge\nthe risk. Luckily, the speed of nature restoration is faster than we predicted.</p>\n<p>Carl: it has become a strategic issue in companies, as opposed to a\ngreenwashing protection. They run an executive course for companies and they\nare factoring in risks proactively now in a way they didn't before. They are\ndemanding satellite data and other risk management data products to quantify\nthe uncertainties in nature. Earth System Indicators is their system to combine\nmultiple planetary boundaries into one actionable indicator. It is a bit like\nan avalanche now where the reaction is awakening, but we are probably in a\ntipping point right now so we must act in the right direction.</p>\n<p>Jane: social systems are highly non-linear and also characterised by social\nsystem tipping points. 
Water seems to be an obvious one that underpins the\noperation of many companies.</p>\n<p>Louise: Universities and research must turn into businesses (of innovation).\nOne thing we have failed at is that we find the problems but don't find the solutions,\nand research must come up with solutions. "Ecologists turn over stones looking\nfor problems". We must take a position on coming up with solutions and\ncommunicating them clearly, and the only way to do this is to work together\nacross disciplines. We are looking to social scientists to combine civic change\nwith environmental scientists to propose interventions.</p>\n<p>Jane: there has been a huge transformation in the US towards finding solutions,\nand that means partnering with government agencies and NGOs in a way we haven't\nseen before. <em>(Note: this describes the <a href=\"https://www.cambridgeconservation.org\">Cambridge Conservation Initiative</a> model perfectly!).</em></p>\n<p>Tony: even when the solutions exist, a number of entities want to keep the\nstatus quo: e.g. Exxon and Shell spent more than two decades pretending it\nwasn't a problem to protect their interests, and they're not alone. And so we\nhave to establish a social tipping point (just 3-4%): it was the idea of\nExtinction Rebellion to overload the court system and get to the point where\nwe could shift the discussion. But it didn't work, so what will work? We need to\ntry, try and try again on this. Otherwise we're just talking to\nourselves.</p>\n<p>Q: Water related financial risks: TCFD galvanised focus on climate,\nspecifically carbon, and one of the barriers is this focus on regulations that\nfocus on the cause (carbon) but not the impacts. 
We mainly feel these impacts\nthrough the water cycle, so how can we make the focus on the physical risks of\nclimate change be on the impacts and not just the cause?</p>\n<p>Louise: we need a systems approach to this; tipping points sort of get there,\nbut how do we get the evidence to get over the top?</p>\n<p>Q: Why is it that the business and finance community isn't moving faster? The\nsuspicion is that not all businesses are part of the solution, and do we need\nto be clearer on that? If we combine some of the social aspects and linking of\npeople to nature, does that supply chain need to be shorter?</p>\n<p>Paul: we are slightly bending the curve linearly, but the gap is getting bigger\nexponentially. If we don't solve the problems for the world, we can be great as\na company but it doesn't save the planet for the next generation. Business alone\ncannot do it, but what we can do is to make it more transparent. The more\ntransparent we are, the more we change behaviours and social norms. Companies\nusing AI across supply chains are doing surprisingly well at bringing this\ntransparency around. In every transition in the history of mankind, there will\nbe people resisting change, but the fact that the noise globally is becoming louder is a\nbig deal and a sign that the process is unstoppable and has begun. The noise\nwill only increase in the next five years. In the food chain there are\nherbicide and pesticide producers that are doing really well out of food speculation,\nbut at the expense of many lives. We need a leader to change them, and if they\ndon't change, their business models will become obsolete.</p>\n<p>Q: What is the single most important action businesses can take today to address the crisis?\nQ: Why not spin this around to look at this at a very local level (e.g. flooding for business crisis)?\nQ: Unless we can localise analysis with improved precision, can we improve decision making?</p>\n<p>Tony: Localisation of issues is critical. 
The founding slogan of Friends of the Earth was "think globally, act locally", which was a very good slogan. Natural England is working with local authorities on spatial planning exercises on where the good remaining nature exists and where it might usefully be repaired (cleaning up peatlands etc.). Targeting resources like this might help us bend the curve. This must be linked with our need to build 100k new houses and balance nature and business.</p>\n<p>Paul: COP30 is likely to involve businesses heavily (COP29 is a write-off due to location). But businesses are getting behind it all as countries can't implement their policies without business help. A large percentage of indigenous populations are planetary guardians and so we must support their efforts to protect existing natural capital against tipping points. Science without impact is useless when we have a crisis at this level.</p>\n<h2><a href=\"https://anil.recoil.org/#day-1-reflections-by-sir-partha-dasgupta\"></a>Day 1 Reflections by Sir Partha Dasgupta</h2>\n<p>Ecosystems are capital assets; we now use the term natural capital to include\nthe biosphere in the full stock. This raises the question of asset management,\nand the questions discussed today represent shortcomings in asset management.</p>\n<p>With natural capital, property rights are hard to enforce -- nature is always\non the move and "mobile", and the commodity changes when it moves. This leads to\nan underpricing of many forms of natural capital (not all, but many) which\nleads to the overuse of it, which in turn leads to deterioration of the asset,\nwhich implies a runaway tipping point of decline. This all leads to a circular\ndecline in the value of natural capital; if there is a heightened risk of an\nasset collapsing, the accounting value is naturally reduced. 
So this is a\nvicious cycle at work that leads to inevitable decline.</p>\n<p>Global GDP has increased hugely, as opposed to natural capital, which increases\nthe pressure to overspend natural capital. What are the arbitrage conditions\n(e.g. risk-adjusted rate of return) on the portfolio assets that comprise\nnatural assets? Because of imperfect pricing, there is a big misalignment in\nsuch portfolios. For example, the Ganges is the most polluted river in the world,\nand under the Ganges action plan to remedy this there was a <a href=\"https://archive.org/details/cleaningupganges0000mark\">social cost benefit\nanalysis</a> -- a 15% rate of return was calculated. The rate of return on\ngovernment bonds was roughly 5%. There was a gap of 10% between these two assets,\nand so if the market were efficiently organised then the Ganges' value as a stock\nshould be decreasing by 10%. But this is perverse, since the reverse should hold:\nthe river quality has been decreasing, not increasing!</p>\n<p>So from the firm's point of view we are looking at their balance sheets (from\nthe natural capital POV). But GDP was not constructed for this purpose -- instead\nit was constructed in the post-war period to calculate progress towards getting\nout of the economic depression resulting from a lack of economic activity. But somehow\nafter WWII it became a long-term goal, though it has no social-benefit justification.\nGDP is a flow, and so not a predictor of the future in the way stocks are. You can have GDP growth\nin the national accounts, but it just won't take natural capital\ninto account. There is therefore a gap between portfolio management and GDP.</p>\n<p>But countries are moving towards wealth accounting bit by bit, including for natural\ncapital aspects of wealth. Most of the attention has been towards human capital,\nincluding the attention of investors. This now needs to shift quickly towards\nnatural capital. 
There have been studies attempting to assess the market demand\nfor natural capital. Think of the biosphere as a massive fishery -- the natural\nanalogy is how much we take out of it, and what the regeneration rate is. The question\nis what the overreach is, and there are <a href=\"https://naturalcapitalproject.stanford.edu/about\">projects working on this</a>.\nWe need to take the outputs of these projects and treat them as underestimates due\nto the huge extinction pressure on organisms.</p>\n<p>Even if firms are competing in the market, they still need to cooperate on the\nunderlying natural capital accounting to avoid a "market crash". In academia,\nthere is both cooperation and competition. There is a race for paper publication,\nbut also a huge amount of sharing towards global scientific co-creation. There is a\ngood deal that companies could learn from this to cooperate and communicate towards\nnatural capital accounting. More on day 2 tomorrow about how we can get there!</p>\n<p><span></span></p>\n<h2><a href=\"https://anil.recoil.org/#day-2-metrics-and-actions\"></a>Day 2: Metrics and Actions</h2>\n<p>Kat Bruce (founder of <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a>) opened the session by noting just how quickly\nthe biodiversity space is moving, and how encouraging it is to see so many businesses\nengaging with environmental scientists.</p>\n<h3><a href=\"https://anil.recoil.org/#metrics-for-business-use-prof-neil-burgess\"></a>Metrics for business use (Prof Neil Burgess)</h3>\n<p>He is going to cover terrestrial metrics only. There are several sorts of metrics we could measure:</p>\n<ul>\n<li>pressures on biodiversity</li>\n<li>steady state metrics for biodiversity</li>\n<li>measuring benefits from biodiversity</li>\n<li>response metrics on the effectiveness of biodiversity interventions</li>\n</ul>\n<p>We do need to understand biodiversity risk in regional detail so that\nbusinesses can use this to influence their actions. 
There is also the need for\nclear target setting to know how much opportunity cost to "spend" on\nbiodiversity vs other actions.</p>\n<p>\n<img alt=\"Neil describes how to classify biodiversity metrics\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-16.webp\" title=\"Neil describes how to classify biodiversity metrics\">\nNeil describes how to classify biodiversity metrics\n\n<img alt=\"Categorising 573 (!) biodiversity metrics\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-17.webp\" title=\"Categorising 573 (!) biodiversity metrics\">\nCategorising 573 (!) biodiversity metrics\n\n<img alt=\"Metrics usable for businesses currently\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-18.webp\" title=\"Metrics usable for businesses currently\">\nMetrics usable for businesses currently</p>\n<p>UNEP-WCMC have a huge database about all the various metrics (~573!) that\ncover biodiversity. If you even add other things like marine, the number\nbreaks 600 (see picture). 
There are around 23 useful ones for business use;\nthere are few in this list that check benefits to people or discuss genetic\nchanges, but there are lots on diversity, habitat area and so on.\n(Neil noted that our <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE: A metric for mapping the impact of land-cover change on global extinctions</a> paper isn't listed as it's not published until the end of the year, but it will be!)</p>\n<p>\n<img alt=\"What next for biodiversity metrics?\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-19.webp\" title=\"What next for biodiversity metrics?\">\nWhat next for biodiversity metrics?</p>\n<p>There is quite a lot of work needed to make the metrics usable as a pipeline.\nHe noted the importance of the incremental pipelines that <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> are working on for these.\nAnd this is something I'd like to advance with <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a>...</p>\n<p>The main tool that pulls this together is <a href=\"https://www.ibat-alliance.org\">IBAT</a>,\nwhich is a paid-for model to support its continued development. There is also\nthe free <a href=\"https://encorenature.org/en\">ENCORE nature</a> platform. For supply chains,\nthere are also tools to help trace flows from country to country. There is a big emphasis\non the need for open sharing between platforms as well, due to the sheer\ncomplexity of biodiversity worldwide, which is key to keeping the platforms\nsustainable both financially and equitably.</p>\n<h3><a href=\"https://anil.recoil.org/#data-availability-and-use-prof-andy-purvis\"></a>Data availability and use (Prof Andy Purvis)</h3>\n<p>Andy's career as a scientist started 35 years ago, and he is now at the Natural\nHistory Museum as a senior researcher. 
The mission there is not just to track\nthe decline of the planet, but also to advocate for a healthier, nature-positive\nsociety.</p>\n<p>\n<img alt=\"Why care about biodiversity?\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-20.webp\" title=\"Why care about biodiversity?\">\nWhy care about biodiversity?</p>\n<p>We have a million or so species of animals and plants threatened with\nextinction. Our job as scientists is to come up with defensible statistics,\nbut statistics alone don't convince anyone. So stories about individual species\n(like the white rhino) are vital to build public awareness about the real\nimpact.</p>\n<p>Is there one index to rule them all? Andy is against the idea of a single\nmetric to describe the world's biodiversity. Any indicator combining some of\nthe metrics has an "exchange rate" between extinctions and human wellbeing. In\nother words, it creates an "extinction market".</p>\n<p>Decision-grade data are derived very carefully from very biased raw data. The\ndata collection requires a huge amount of expertise, and is painstaking\nto conduct. This has to be funded, and not even the collection of the raw data\nis very well funded right now. To illustrate the data bias, there are a huge\nnumber of birds...and mallards...in there, which is clearly not representative\nof global species spread.</p>\n<p>We can plot ecosystem function (resource capture, biomass production, decomposition\nand nutrient recycling) on the axis of biological diversity (variation in genes\nand species and functional traits). 
Across this, there is a huge spectrum\nof ecosystem services that this supports, of which a few are really useful to humans.\nAnd the <a href=\"https://www.nhm.ac.uk/our-science/services/data/biodiversity-intactness-index.html\">BII index</a>\nis a statistical model relating nature to human pressures that produces both\nhigh-resolution temporal and spatial models.</p>\n<p>What should we do in terms of ecosystem health?</p>\n<ul>\n<li>Deintensify activities in unhealthy systems where people depend on local ecosystem services</li>\n<li>Divest from businesses that are poor stewards of ecosystem health</li>\n<li>Invest in actions that are "nature positive", which is an action that improves the expected global status of biodiversity relative to counterfactuals.</li>\n</ul>\n<p>It requires a model, has to be global, and must measure both species persistence and\necosystem health. It has to be vs counterfactuals, otherwise the cost is too\nhigh for any individual organisation <em>vs</em> society taking action collectively.</p>\n<p>\n<img alt=\"Why not have one biodiversity index?\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-21.webp\" title=\"Why not have one biodiversity index?\">\nWhy not have one biodiversity index?\n\n<img alt=\"Ecosystem function vs health\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-22.webp\" title=\"Ecosystem function vs health\">\nEcosystem function vs health\n\n<img alt=\"Defining nature positivity as a counterfactual\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-23.webp\" title=\"Defining nature positivity as a counterfactual\">\nDefining nature positivity as a counterfactual</p>\n<p>We need to combine models with monitoring to give us a "sat nav" for nature.\nThere is a need to monitor drivers as well as biodiversity, which\nrelies on data being available to improve the models. And if the platforms are\nopen then this is possible. 
There are platforms in the form of Geo BON and IBAT; BII has a data license with\nBloomberg; and there are some TNFD tools.</p>\n<p>Take home messages:</p>\n<ul>\n<li>Use data whose methodologies are transparent</li>\n<li>Remember the pitfalls of hybrid indicators or indices</li>\n<li>Reduce extinctions, mitigate existing activities in important areas and don't start new human activities there</li>\n<li>Divest from poor stewards and invest in nature positive actions</li>\n<li>Monitor closely to verify gains and contribute to data repositories</li>\n<li>Accept that decision grade data does cost money and needs funding!</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#qa\"></a>Q&A</h3>\n<ul>\n<li>Is the restoration of an ecosystem the reverse of the destruction curve?</li>\n<li>Is there consensus about BII outside of the NHM team?</li>\n<li>Are we moving towards biodiversity standardisation?</li>\n<li>How can we make these metrics more understandable to businesses?</li>\n</ul>\n<p>Andy: we can go back to a system with the original biodiversity, but the actions to get it there don't have an immediate effect (there is a time lag). There's no standardisation effort yet across many metrics, but we are moving towards ensembles of models with alternative sets of inputs for the environmental rasters in order to normalise the uncertainty in both the environmental and geographic space.</p>\n<p>Neil: UNEP-WCMC uses BII a lot! In terms of metrics, a species and ecosystem metric and something on genes would cover the three dimensions of biodiversity, but others also need to be factored in. There is an effort to use AI/geospatial methods to get higher resolution landuse data (is this a forest?) and also probabilistic SDMs, but no standardisation.</p>\n<p>Kat: being able to reverse degradation is a secret weapon for nature, as it has an incredible ability to bounce back if pressures are reduced, due to its innate resilience. 
The aspect of the data we often forget about is its ability to tell these stories and drive uptake, and this is powerful.</p>\n<ul>\n<li>What are the greenwashing risks of nature-positive counterfactuals?</li>\n<li>When we boil down the metrics that are ready for both country and business use, what are the qualities that make a metric useful? There were only 16 or so out of ~600!</li>\n</ul>\n<p>Neil: There are criteria (peer reviewed, published, etc.) with lots of feedback from the co-authors of the paper (to appear later this year). There are strongly held opinions in the biodiversity metric space. But the list of metrics is actually not huge, and is manageable.</p>\n<p>Andy: trying to avoid false claims of nature positivity requires verification. We're going to get better at this by estimating the net gain and how certain it is that it's positive. The models will improve when there is a pipeline connecting them to monitoring data, and this needs verification. So any payment for nature-positive claims needs to be staged and ex-ante.</p>\n<h2><a href=\"https://anil.recoil.org/#other-talks\"></a>Other talks</h2>\n<p>I failed to capture notes for the middle sessions as I was engrossed in conversations, but here's a gallery of some of the fantastic speakers! 
They're on the livestream video if you want to catch the details.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-24.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-25.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-26.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-28.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-29.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-30.webp\" title=\"\">\n</p>\n<h1><a href=\"https://anil.recoil.org/#parthas-day-2-roundup\"></a>Partha's Day 2 roundup</h1>\n<p>See the equation below. The right hand side is <code>G</code> (nature's regeneration rates) and is a function of the state of the biosphere (the <code>S</code>).\nThe left hand side is the human demand (human activity), which is population scaled by per-capita activity, adjusted by the efficiency of producing provisioning goods (recall what those are from the start of the day 1 notes).\nMost of the discussion in these two days has been on the right hand side. A huge amount of the literature is on the alpha, and how technological progress can raise alpha (e.g. via cleaner energy) and help rebalance the inequality. How fast can alpha change? The faster it grows, the faster GDP and natural capital can grow.</p>\n<p>\n<img alt=\"Sir Partha&apos;s equation to sum up the two days!\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-31.webp\" title=\"Sir Partha&apos;s equation to sum up the two days!\">\nSir Partha's equation to sum up the two days!</p>\n<p>How might we advocate for change? We pay ourselves 2-3% of global GDP in subsidies for our assault on nature. Removing those subsidies is the equivalent of raising alpha. 
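As I understood the slide, it corresponds to the "Impact Inequality" from the Dasgupta Review; a reconstruction in that spirit (my rendering, written out in words rather than the slide's exact symbols) is:

```latex
\underbrace{\frac{\text{population} \times \text{per-capita demand}}
                 {\alpha \ (\text{efficiency of producing provisioning goods})}}_{\text{human demand on the biosphere}}
\;\le\;
\underbrace{G(S)}_{\text{nature's regeneration rate at biosphere state } S}
```

Sustainability requires the left-hand side to stay below the right; raising alpha shrinks the demand side for a given population and standard of living, which is why so much of the discussion centred on it.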
Adam Smith's classic book in the 18th century was the "Wealth of Nations" and not the "GDP of Nations". For us, wealth includes natural capital. Even as recently as 60 years ago, human capital didn't appear in the economic literature. Most national accountants now include statements about the increase in human capital, and for our purposes we must include natural capital in the notion of wealth. We must shift accounting away from GDP into calculations of stocks and inequalities, as shown in the equation.</p>\n<p>What the discussions missed: they didn't cover invasive species or the "transfer of natural capital" through the fact that goods and services are traded. This is represented in the capital <code>N</code> in the equation. So that's future work!</p>\n<h2><a href=\"https://anil.recoil.org/#follow-more\"></a>Follow more</h2>\n<p>This concludes my rapid note taking. Do join the <a href=\"https://royalsociety.org/science-events-and-lectures/2024/10/ecological-and-commercial-risk/\">livestream</a> and follow along for the remaining day and a half if you have a spare moment!</p>\n<p><em>Edit: upon being prodded by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, I've uploaded a <a href=\"https://notebooklm.google\">NotebookLM</a>-generated podcast summary of the morning. It's surprisingly entertaining, but I'm sure I'm going to regret this for some reason...</em></p>\n<p></p><div></div><p></p>",+"content": "<p>I'm at the Royal Society this morning for the 2 day programme on <a href=\"https://royalsociety.org/science-events-and-lectures/2024/10/ecological-and-commercial-risk/\">"How does ecological risk relate to commercial risk?"</a>, and am reporting on the <a href=\"https://royalsociety.org/-/media/events/2024/10/ecological-risk/programme-booklet.pdf\">morning session</a>. The full programme is being <a href=\"https://www.youtube.com/watch?v=gVuxzand8RE\">livestreamed</a>, so please do dial in if the below notes seem interesting to you. 
I put this note up almost live, so any errors below are my own.\n<em>(Update: partial <a href=\"https://anil.recoil.org/#daytwo\">day 2 notes</a> now available below)</em></p>\n<h2><a href=\"https://anil.recoil.org/#opening-keynote-by-sir-partha-dasgupta\"></a>Opening Keynote by Sir Partha Dasgupta</h2>\n<p>The summit kicked off with a keynote by economist <a href=\"https://en.wikipedia.org/wiki/Partha_Dasgupta\">Sir Partha Dasgupta</a>. The focus was on the intersection of nature and economics, covering how markets fail to account for the ecosystems that sustain them. His <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">landmark report</a> covered ecosystem services, freshwater, tipping points, and physical risk, bringing to light the urgent need to reframe economic activities around the services provided by nature.</p>\n<p>\n<img alt=\"Sir Partha Dasgupta opening the morning session.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-1.webp\" title=\"Sir Partha Dasgupta opening the morning session.\">\nSir Partha Dasgupta opening the morning session.</p>\n<p>He began by distinguishing between two types of market activities:</p>\n<ul>\n<li><strong>Provisioning goods</strong>. Commodities like food, water, timber, fibres and so forth are the primary products which, when aggregated and valued at market prices, form the GDP (the visible outputs) of human endeavour.</li>\n<li><strong>Processes</strong>, the background work of maintaining and regulating services which produce these goods, are problematic from a commercial perspective. The services aren't extractive in the same way as provisioning goods, but they express themselves via the goods.</li>\n</ul>\n<p>In the language of economics, there is a missing link here for the markets of\nprocesses, which makes for inefficiency, but more alarmingly a crisis in the\nmanagement of natural resources. 
The action is on the activities we undertake\nwhich affect our landscape -- and the huge number of species affected therein.\nOur knowledge is extremely limited, but keeping them intact is in our interest.</p>\n<p>There is a good deal of work on option values which economists have tried to\nuncover, but research has somewhat stalled in recent years. The risks that\ncompanies face as a consequence of this inefficiency are correlated, which is\nextremely dangerous for market stability.</p>\n<p>He tried to address this in the review of the <a href=\"https://www.gov.uk/government/publications/final-report-the-economics-of-biodiversity-the-dasgupta-review\">Economics of Biodiversity\nreport</a>,\nbut it needs a lot more thinking. A feature of the human condition is that\nfragmented ecosystems lose their productivity. There is a big difference between\nthe productivity of an ecosystem as a whole and the sum of its individual parts. The\n<a href=\"https://www.theguardian.com/environment/2021/dec/27/thomas-lovejoy-conservation-biologist-dies-80\">late</a>\nTom Lovejoy did a lot of work on this in the context of the Amazon rainforest\necosystem. Similarly, we are looking at a fragmented nature ecosystem today in\na global context as more and more habitats get truncated.</p>\n<p>Extinction rates of organisms are growing hugely above background rates -- the option\nvalue of organisms suggests we are losing enormous amounts of value in the form\nof unknown lifeforms. The correlation risk here is something that whole\ngovernments need to take on board -- it's too vast for any single organisation\nto take on!</p>\n<h2><a href=\"https://anil.recoil.org/#ecosystem-services-and-physical-risk\"></a>Ecosystem Services and Physical Risk</h2>\n<p><a href=\"https://en.wikipedia.org/wiki/Jane_Lubchenco\">Jane Lubchenco</a> chaired the panel session. 
She is the administrator of the NOAA, "on loan" to the White House to work on nature and technology policy.</p>\n<p>\n<img alt=\"Jane Lubchenco chairing the morning session.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-2.webp\" title=\"Jane Lubchenco chairing the morning session.\">\nJane Lubchenco chairing the morning session.</p>\n<h3><a href=\"https://anil.recoil.org/#dr-tony-juniper-cbe\"></a>Dr Tony Juniper CBE</h3>\n<p>We have not always looked at nature through a financial perspective. The\njourney of Natural England in the 1940s began more from an ethical and moral\nperspective, and the scientific and beauty value of landscapes. It's only\nrecently that the practical impacts of nature and its benefits to humanity have\nbeen considered.</p>\n<p>The <a href=\"https://en.wikipedia.org/wiki/Millennium_Ecosystem_Assessment\">Millennium Ecosystem\nAssessment</a>\ncommissioned in 2001 by Kofi Annan was a stocktake of the earth's natural\ncapital assets, and remarkable for making clear that if we didn't\nreverse the decline of nature then we would be unable to meet humanity's needs\nsuch as ending poverty. A few years later, in 2007, the G8 commissioned the\n<a href=\"https://en.wikipedia.org/wiki/The_Economics_of_Ecosystems_and_Biodiversity\">"Economics of Ecosystems and\nBiodiversity"</a>\nstudy, which ran until 2011 and helped reset the understanding of how the human\nworld depends on ecosystems. 
Later came the\n"Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem\nServices" (<a href=\"https://www.ipbes.net\">IPBES</a>), commissioned by the UN General\nAssembly to consider the contribution of nature to people.</p>\n<p>\n<img alt=\"Tony Juniper speaking.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-3.webp\" title=\"Tony Juniper speaking.\">\nTony Juniper speaking.</p>\n<p>So as nature rises in the political eye, there are multiple national programs\nall trying to assess the state of natural assets. In 2020 there was a Treasury\ncommission to bring natural capital into national accounting in the UK.\nDasgupta's landmark report established that nature and the economy are\nintertwined, which was a huge change in thinking. We no longer need to assume\nthat degradation of nature is the price of progress; the new reality is\nthat the range of ecosystem services is critical to how we conduct our\neconomies into the future. Pollination, carbon capture, biomimicry and\nnutrient capture, when all added up, are worth more than the UK's GDP, but we\nstill struggle with how to measure this as part of our conventional economies!</p>\n<p>After 20 years of expert studies and carefully constructed datasets, we\nstill struggle to do the right thing -- but why? No one company is quite able to\nfully embrace the scope of the problem. There are 1000s of medium to large\ncompanies that depend on agriculture, but they are all dependent on 25 billion\ntons of water moving around intercontinental distances in the global water\ncycle. Any one company might make a small difference by reducing their\ndeforestation footprint, but without collective action they will be unable to\nmake a global change. 
There must be a collective force to bring things\ntogether, and also better understanding of the connectivity across these\nfactors.</p>\n<p>However, there still isn't a prominent intellectual home for the connections\nbetween ecology and economy; it still isn't taught in undergraduate\ncourses, and so isn't making it through to high-level political decision making. There\nis still a lot of emphasis on financial growth, but not much discussion of\nnature: we should be factoring in that nature is the mechanism by which we can\nachieve financial growth. It's quite hard to measure biodiversity vs "tons of\ncarbon". Environmental regulation must move beyond stopping harm to also plotting\npathways to nature recovery. So the link between ecological and commercial\nrisk is very real and must be measured and actioned.</p>\n<h3><a href=\"https://anil.recoil.org/#freshwater-professor-louise-heathwaite-cbe-frs\"></a>Freshwater (Professor Louise Heathwaite CBE FRS)</h3>\n<p>Her first ever "proper job" was working at the Nature Conservancy as their first hydrologist. Freshwater is at the centre of the triple planetary crisis today. As a result of these challenges, what we see is an intensification of the water cycle:</p>\n<ul>\n<li>increasing temperatures increase atmospheric water holding capacity</li>\n<li>precipitation and evaporation increase, so the cycle intensifies</li>\n<li>evaporative demand increases and so extreme drought events increase</li>\n</ul>\n<p>Our lack of investment in proper waste recycling infrastructure is now coming\nback to haunt us. It impacts hugely on freshwater availability and quality. 
It\ndisrupts commercial supply chains, and the quality of the supply of both water\nand ecosystem services, on which we heavily depend.</p>\n<p>\n<img alt=\"Louise Heathwaite shows the GRACE water map.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-4.webp\" title=\"Louise Heathwaite shows the GRACE water map.\">\nLouise Heathwaite shows the GRACE water map.</p>\n<p>Two satellites (GRACE) orbited the earth from 2002-2017, able to detect\nchanges in water mass at a centimetre scale. The image shows how\nconnected the world is, and the scale of the changes at the poles and Greenland\nis incredible. Some research just finished up in Greenland, and the rate of\nwarming in the Arctic is roughly 2x the global warming rate. Greenland is\nlosing almost 260 gigatons of ice per year, which is over a hundred million Olympic-sized\nswimming pools a year -- an awful lot of water to move around into the ocean\nsystems. The ocean cycles aren't on the slide, but they will have an impact.</p>\n<p>Only 1% of the total water supply is freshwater, and its appropriation among human\nactivities is quite unbalanced. We have green water to blue water to white water\nto grey water to black water. We have messed up the blue and black water\nmanagement. We are producing 360 billion cubic metres of waste water a year\nand only 3% is recycled. 
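The ice-loss comparison above is worth a quick sanity check. A minimal sketch, assuming a minimum-size Olympic pool (50 m × 25 m × 2 m = 2,500 m³) and that one gigatonne of meltwater occupies about 10⁹ m³ (my assumptions, not the speaker's):

```python
# Sanity check: Greenland's ~260 Gt/year ice loss expressed in Olympic pools.
# Assumptions (mine, not from the talk): 1 Gt of meltwater occupies ~1e9 m^3,
# and a minimum-size Olympic pool is 50 * 25 * 2 = 2500 m^3.
GT_TO_M3 = 1e9            # cubic metres of water per gigatonne
POOL_M3 = 50 * 25 * 2     # Olympic pool volume in cubic metres

ice_loss_gt = 260
pools_per_year = ice_loss_gt * GT_TO_M3 / POOL_M3
print(f"{pools_per_year:.2e} pools/year")  # 1.04e+08, i.e. about 100 million
```

That works out to roughly a hundred million pools a year.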
A significant percentage of the world population (2\nbillion people) are dependent on black water, and many only have access to\ngrey or black water.</p>\n<p>\n<img alt=\"Water flows in the ecosystem.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-5.webp\" title=\"Water flows in the ecosystem.\">\nWater flows in the ecosystem.</p>\n<p>We need to treat management of surface and groundwater systems holistically.\nThe freshwater living planet index is dropping precipitously: some causes are flow\nreduction, introduction of invasive species, and the nitrogen/acidification peaks of the\n1970s/80s/90s, which all caused major changes in freshwater biodiversity. We\nalso introduced laws regulating long-range transport of pollution to address acidification. In\n1991 we had the EU urban wastewater directive and in 2000 the EU water\nframework directive, but none of this seems to have made much of a difference (see graph above).</p>\n<p>\n<img alt=\"Freshwater confidence index not measurably changed by legislation.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-6.webp\" title=\"Freshwater confidence index not measurably changed by legislation.\">\nFreshwater confidence index not measurably changed by legislation.</p>\n<p>Are we looking at a tipping point in terms of waste water systems? We don't have\nclear metrics for freshwater biodiversity, and without the metrics we don't\nhave a way of valuing it as part of commercial systems. We have new emerging\npollutants (microplastics etc.) which require multi-sectoral\nchanges to deal with; such change is lacking despite the legislation in\nplace.\nMeanwhile, urbanisation is growing (56% of the world population lives in cities, increasing to\n70% by 2050), which makes a huge difference to where waste comes from. When we\nmove food around the world, there is a water cost which is passed onto other\ncountries. 
<em>(Note: see also our preprint on <a href=\"https://anil.recoil.org/papers/2024-food-life\">Quantifying the impact of the food we eat on species extinctions</a>)</em></p>\n<p>When we move to decarbonisation, the last 20% of action (the really difficult bit) is\nrelated to land use in particular. But if we join up the practices around good\nwater use and decarbonisation then we have a bunch of innovative interventions,\nfor example natural flood management. There are also co-benefits around\ninfrastructure and there is a real challenge with our UK underinvestment. The\nlast report, in 2022, contained some excellent propositions for how to deal with\nwater and atmospheric pollution and biodiversity. We need to act on this\nreport, but progress is slow: e.g. Thames Water is a long way away from\nactioning this.</p>\n<p>\n<img alt=\"Water flows between countries worldwide.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-7.webp\" title=\"Water flows between countries worldwide.\">\nWater flows between countries worldwide.</p>\n<p>In the 1980s, <a href=\"https://adas.co.uk\">ADAS</a> was the wing of the farming and food ministry that worked on how to\nrecycle waste better. Metals were the big problem back then, but now there are\nloads of extra contaminants such as plastics and forever chemicals. We must\nmove towards turning the waste we produce into reusable end products for farming\nand dramatically drop our water use.</p>\n<h3><a href=\"https://anil.recoil.org/#tipping-points-and-biosphere-stewardship-professor-carl-folke\"></a>Tipping points and biosphere stewardship (Professor Carl Folke)</h3>\n<p>We have polycrises with climate change, biosphere pressure, intertwined\nsystems, interacting shocks, and many tipping points. A tipping point is a\nshock that can move a system from a local maximum into a new stable state, but\nthe gradual loss of resilience can also cause a system to shift suddenly. 
There are a\nnumber of drivers that cause these tipping points, and they are not just a\nrandom aside: they should be a key part of any investment strategy.</p>\n<p>A lot of people live in areas that are at risk of ecological tipping points,\nand the map seems very correlated to the earlier water change map shown by\nLouise! 56% of human CO2e has been taken up by the biosphere so far (1430\nGtCO2 from 1850-2019) which has been happening for "free" thanks to nature, and\nthis might come to an abrupt end soon which is a big source of worry in the\nscientific community.</p>\n<p>\n<img alt=\"Prof Folke discussing tipping points.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-8.webp\" title=\"Prof Folke discussing tipping points.\">\nProf Folke discussing tipping points.</p>\n<p>The connectivity of various tipping points is becoming clearer. There are just\na few major actors that shape some of these (such as the Amazon rainforest) and\nthey are often quite far away geographically by virtue of their influence over\nmarkets.</p>\n<p>The current idea is that social tipping ("social norms as solutions") is the\nway to alter our value systems. See "<a href=\"https://www.cambridge.org/core/journals/global-sustainability/article/operationalising-positive-tipping-points-towards-global-sustainability/8E318C85A8E462AEC26913EC43FE60B1\">Operationalising positive tipping points\ntowards global\nsustainability</a>"\nby Tim Lenton et al. 
<em>(Note: Simon Sharpe also authored this, see our CSaP\n<a href=\"https://www.csap.cam.ac.uk/news/article-reading-group-five-times-faster-4-rethinking-unive/?preview=1\">reading\ngroup</a>\non his excellent <a href=\"https://fivetimesfaster.org\">"Five Times Faster"</a> book.)</em></p>\n<p>\n<img alt=\"Exposure of people to tipping points.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-9.webp\" title=\"Exposure of people to tipping points.\">\nExposure of people to tipping points.\n\n<img alt=\"Interconnected climate tipping points worldwide.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-10.webp\" title=\"Interconnected climate tipping points worldwide.\">\nInterconnected climate tipping points worldwide.</p>\n<p>So we need to prepare for transformation, navigate the sudden transition and\nthen build resilience in the new norm after the tipping point. But our window\nof opportunity is open right now, so action must happen urgently or we miss\nthe window and enter the tipping point unprepared. So "corporate biosphere\nstewardship" is a new business logic with the purpose of shepherding and\nsafeguarding the resilience of the biosphere for human well-being, and\nfostering the sustainability of a rapidly changing planet. Rather than viewing\nnature as a compliance question, we should view it as humanity's greatest\nbusiness opportunity! For example, Seafood Business for Ocean Stewardship\n(<a href=\"https://seabos.org\">seabos.org</a>) is codifying this\napproach for marine foodstocks. There is an increasing focus on how to report\nthis stuff. 
There are three good books to read more on this:</p>\n<ul>\n<li><a href=\"https://openknowledge.worldbank.org/entities/publication/855c2e15-c88b-4c04-a2e5-2d98c25b8eca\">"Nature's Frontiers"</a>, 2023 from the World Bank group.</li>\n<li><a href=\"https://www.stockholmresilience.org/publications/publications/2022-09-29-economy-and-finance-for-a-just-future-on-a-thriving-planet.html\">"Economy and Finance for a Just Future on a Thriving Planet"</a>, 2022 from the SRC.</li>\n<li><a href=\"https://www.ngfs.net/en/the-green-scorpion-macro-criticality-nature-for-finance\">"The Green Scorpion"</a>, the macro-criticality of nature for finance.</li>\n</ul>\n<p>\n<img alt=\"The window to tackle tipping points.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-11.webp\" title=\"The window to tackle tipping points.\">\nThe window to tackle tipping points.</p>\n<h3><a href=\"https://anil.recoil.org/#ecosystem-services-and-physical-risk-paul-polman-kbe\"></a>Ecosystem services and physical risk (Paul Polman KBE)</h3>\n<p><a href=\"https://en.wikipedia.org/wiki/Paul_Polman\">Paul Polman</a> is the former CEO of\nUnilever. He started by noting that we have to make sure that we don't just\nspend our time talking to each other, but we have to get the message out in the\nform of action to the wider world! When they developed the SDGs, Ban Ki Moon\nasked Paul to represent the private sector. He was described as <em>"the\nproblem walking into the room"</em> when the politicians first met him, but luckily\nthey ended up in a happier place! We must find a balanced path into a sustainable\neconomy for the future. Most businesses, although they don't behave entirely\nsustainably now, do understand the need for a planet. 
When he ran\nUnilever for a decade, they encountered hundreds of millions of dollars' worth of\nnature-related interruptions to their business.</p>\n<p>4 billion people in the world depend on natural medicine, and nature governs\nhuge amounts of the human ecosystem. Nature supports peace and national\nsecurity - many of the world's geopolitical events trace back to inequality in\naccess to natural resources, and this is the foundation for many economies. The\nWEF report recently calculated that $44tn of the world economy depends on\nnature -- but this is a huge understatement given that our entire life depends\non it! Changes in one ecosystem affect others, and so our conversation must\naddress the interrelationship of all this that makes it so complex.</p>\n<p>\n<img alt=\"Paul Polman discussing global business and nature.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-12.webp\" title=\"Paul Polman discussing global business and nature.\">\nPaul Polman discussing global business and nature.</p>\n<p>We must not see ourselves at the top of the pyramid -- biomimicry and\ngeoengineering are an arrogant approach as our survival relies on cooperation\nwith the biosphere -- when we destroy nature we destroy ourselves. (paraphrased quote) <em>"Man is the\nmost insane species - he worships an invisible god and destroys the visible\nnature, not realising that the invisible god he worships is the visible nature\nhe destroys"</em> (original: <a href=\"https://www.goodreads.com/quotes/1171374-man-is-the-most-insane-species-he-worships-an-invisible\">Hubert Reeves</a>). Extinctions are 10-100x the average of the previous\ncenturies.</p>\n<p>Food and land use is about 30% of our global emissions, and yet we have the\naudacity to keep our farmers in poverty. Every $1 invested in changing nature\nand regenerative farming approaches gives us a $16 return. 
The food companies are\nreally exposed if they don't act, much as we criticise fossil fuel companies\nright now.</p>\n<p>We are making withdrawals much faster than we are depositing in the bank of\nplanetary boundaries, and there are millions of people losing their lives and\nbillions being displaced because of these choices. The Caribbean has seen\ntourism fall by a significant percentage due to beach erosion. The Amazon this\nyear has had unprecedented wildfires (covering an area around the size of Italy) as an example.\nWe name our tropical storms (we name them as if they are our friends); would it\nbe different if we named them Exxon, Chevron, etc.? We must absolutely link the\ncommercial sector to nature as companies like Unilever must become\nborder-positive and nature positive as they are hugely global.</p>\n<p>Leadership is increasingly centred in Europe with many business regulations\nthat are nature positive. There is a bunch of business interaction happening,\nbut the estimation is that loss of biodiversity has cost us over $10tn. In\nagriculture we are destroying $12tn of value, but if we turn that around it's a\n$4tn opportunity. It just doesn't make business sense not to act. Covid\nshowed us that infinite growth on a finite planet is unsustainable.</p>\n<p>What can business do next then?</p>\n<ul>\n<li><em>Invest in nature.</em> It drives innovation and nature has probably been the best R&D lab (1/3rd of medicines come directly from nature)</li>\n<li><em>Water utility improvement.</em> If the 500 largest cities looked into restoring local forests and water tables, it would make a huge difference in quality of life.</li>\n<li><em>A mindset change for business.</em> See his <a href=\"https://netpositive.world/book/\">book on the topic</a>. 
The only way to think in business to be successful now is to think regeneratively and restoratively -- every action needs to contribute to restoration and not cause us to fail slightly more slowly.</li>\n</ul>\n<p>Priorities for business action:</p>\n<ul>\n<li>Repair and restore nature. e.g. 30x30 in the global biodiversity framework</li>\n<li>Account for the value of nature, the point of this event. Less than 5% of companies today account for nature, and this needs to vastly increase.</li>\n<li>Form partnerships for advocacy and change. Most of the big issues cannot be tackled by one company; even at Unilever Paul could only solve about 20% of the problem with one of the biggest companies in the world. Global efforts (like the "Planetary Guardians" launched at the UN last week) are another example of partnership.</li>\n<li>Align financial flows using the political moves around GBF and the Paris agreement. We need to put money behind this.</li>\n</ul>\n<p>Business needs nature, and nature needs business, and the time to act is now!</p>\n<p>\n<img alt=\"Breaching planetary boundaries.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-13.webp\" title=\"Breaching planetary boundaries.\">\nBreaching planetary boundaries.\n\n<img alt=\"Key priorities next for nature and business.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-14.webp\" title=\"Key priorities next for nature and business.\">\nKey priorities next for nature and business.\n\n<img alt=\"The full panel discussing these topics.\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-15.webp\" title=\"The full panel discussing these topics.\">\nThe full panel discussing these topics.</p>\n<h2><a href=\"https://anil.recoil.org/#discussion\"></a>Discussion</h2>\n<p>Partial notes from the audience Q&A follow.</p>\n<p>Q: To what extent have we made progress in the transition towards a nature positive future -- is the awakening fast enough?</p>\n<p>Tony: what we have discussed 
here is not very well known to the majority of the\nworld's population. Most people get in their car and go to work, and their food\nis processed and disconnected from nature. So most people have little insight\ninto their dependencies on nature. David Attenborough has\nmade it clear that if we take steps towards repairing the natural world then\npeople will need to be exposed to nature so that they understand what they're\ntrying to save. This was disconnected in the beginning of the industrial\nrevolution, and we need to urgently reconnect it, or people will simply not\nappreciate what we're about to lose.</p>\n<p>Louise: is quite optimistic because the younger generation is very engaged in\nthis, and we need to ensure they have the levers for action. It's the older\ngeneration's challenge to bring the right levers to them.</p>\n<p>Paul: the science is still evolving at a fast pace, and will continue to, and\nif we don't convert it to fast practical steps we'll be in trouble. Everything\nstarts and ends with education, but we are also out of time. The urgency isn't\nwell understood -- a survey of 4,000 people about this found that around 30%\nhad left a company because it wasn't aligned with their values.\n60% were considering leaving but didn't. We must get businesses to move out of\nthe current gridlock where multilateral institutions cannot change easily, and\nwe must cooperate across boundaries. The EU has a nature restoration law which\nwouldn't have happened without businesses getting involved to push it over the\nline for example -- without that it wouldn't have moved. And tipping points mean\nthat only 4-5% change is required to get us to a new stable state, so this is\nboth an opportunity and a risk. 
The biggest risk right now is the US election,\nas it's a global vote but decided by a tiny minority.</p>\n<p>Q: What examples have you seen about nature flows in the real world?</p>\n<p>Paul: the economic forces must be made to work; people aren't skeptical but the\neconomic realities must work. But there are pressures; the average tenure of a\nCEO has dropped to 4-5 years and so their actions are aligned with re-election\ncycles, just like politics! So we have to make capital and money flows work much\nmore transparently. With a relatively small investment, we can get to 50%\nregenerative agriculture; this is happening in the USA (thanks to the IRA). Many\nof the major food companies (from Pepsi to Unilever) are committed to $15bn\ntowards soil health and regenerative agriculture. And there is nothing better\nthan healthy soil for preserving yields in the face of climate change. The\naverage age of farmers is ~50 now and so they only have 10 seasons left before\nthey retire, so they need to be paid to shift proactively for ecosystem\nservices. It needs proper farmer rewards to make the shift quickly and hedge\nthe risk. Luckily, the speed of nature restoration is faster than we predicted.</p>\n<p>Carl: it has become a strategic issue in companies, as opposed to a\ngreenwashing protection. They run an executive course for companies and they\nare factoring in risks proactively now in a way they didn't before. They are\ndemanding satellite data and other risk management data products to quantify\nthe uncertainties in nature. Earth System Indicators is their system to combine\nmultiple planetary boundaries into one actionable indicator. It is a bit like\nan avalanche now where the reaction is awakening, but we are probably at a\ntipping point right now so we must act in the right direction.</p>\n<p>Jane: social systems are highly non-linear and also characterised by social\nsystem tipping points. 
Water seems to be an obvious one that underpins the\noperation of many companies.</p>\n<p>Louise: Universities and research must turn into businesses (of innovation).\nOne thing we have failed at: we find the problems but don't find the solutions,\nand research must come up with solutions. "Ecologists turn over stones looking\nfor problems". We must take a position on coming up with solutions and\ncommunicating them clearly, and the only way to do this is to work together\nacross disciplines. We are looking to social scientists to combine civic change\nwith environmental scientists to propose interventions.</p>\n<p>Jane: there has been a huge transformation in the US towards finding solutions,\nand that means partnering with government agencies and NGOs in a way we haven't\nseen before. <em>(Note: this describes the <a href=\"https://www.cambridgeconservation.org\">Cambridge Conservation Initiative</a> model perfectly!).</em></p>\n<p>Tony: even when the solutions exist, a number of entities want to keep the\nstatus quo: e.g. Exxon and Shell spent more than two decades pretending it\nwasn't a problem to protect their interests, and they're not alone. Establishing\na social tipping point (just 3-4%) was the idea behind Extinction Rebellion:\noverload the court system and get to the point where\nwe could shift the discussion. But it didn't work, so what will work? We need to\ntry, try and try again on this. Otherwise we're rather just talking to\nourselves.</p>\n<p>Q: Water-related financial risks: TCFD galvanised focus on climate,\nspecifically carbon, and one of the barriers is regulation that\nfocuses on the cause (carbon) but not the impacts. 
We mainly feel these impacts\nthrough the water cycle, so how can we shift the focus of the physical risks of\nclimate change onto the impacts and not just the cause?</p>\n<p>Louise: we need a systems approach to this; tipping points sort of get there,\nbut how do we get the evidence to get over the top?</p>\n<p>Q: Why is it that the business and finance community isn't moving faster? The\nsuspicion is that not all businesses are part of the solution, and do we need\nto be clearer on that? If we combine some of the social aspects and linking of\npeople to nature, does that supply chain need to be shorter?</p>\n<p>Paul: we are slightly bending the curve linearly, but the gap is getting bigger\nexponentially. If we don't solve the problems for the world, we can be great as\na company but it doesn't save the planet for the next generation. Business alone\ncannot do it, but what we can do is to make it more transparent. The more\ntransparent we are, the more we change behaviours and social norms. Companies\nusing AI across supply chains are doing surprisingly well at bringing this\ntransparency about. In every transition in the history of mankind, there will\nbe people resisting change, but the fact that the noise globally is becoming louder is a\nbig deal and is a sign that the process has begun and is unstoppable. The noise\nwill only increase in the next five years. In the food chain there are\nherbicide and pesticide producers that are doing really well out of food speculation,\nbut at the expense of many lives. We need a leader to change them, and if they\ndon't change they will obsolete their business models.</p>\n<p>Q: What is the single most important action businesses can take today to address the crisis?\nQ: Why not spin this around to look at this at a very local level (e.g. flooding for business crisis)?\nQ: Unless we can localise analysis with improved precision, can we improve decision making?</p>\n<p>Tony: Localisation of issues is critical. 
The founding slogan of Friends of the Earth was "think globally, act locally", which was a very good slogan. Natural England is working with local authorities on spatial planning exercises on where the good remaining nature exists and where it might usefully be repaired (cleaning up peatlands etc). The targeting of resources like this might help us bend the curve. This must be linked with our need to build 100k new houses and balance nature and business.</p>\n<p>Paul: COP30 is likely to involve businesses heavily (COP29 is a write-off due to location). But businesses are getting behind it all as countries can't implement their policies without business help. A large percentage of indigenous populations are planetary guardians and so we must support their efforts to protect existing natural capital against tipping points. Science without impact is useless when we have a crisis at this level.</p>\n<h2><a href=\"https://anil.recoil.org/#day-1-reflections-by-sir-partha-dasgupta\"></a>Day 1 Reflections by Sir Partha Dasgupta</h2>\n<p>Ecosystems are capital assets; we now use the term natural capital to include\nthe biosphere in the full stock. This raises the question of asset management,\nand the questions discussed today represent shortcomings in asset management.</p>\n<p>With natural capital, property rights are hard to enforce -- nature is always\non the move and "mobile" and the commodity changes when it moves. This leads to\nan underpricing of many forms of natural capital (not all, but many) which\nleads to the overuse of it, which in turn leads to deterioration of the asset,\nwhich implies a runaway tipping point of decline. This all leads to a circular\ndecline in the value of natural capital; if there is a heightened risk of an\nasset collapsing, the accounting value is naturally reduced. 
So this is a\nvicious cycle at work that leads to the inevitable decline.</p>\n<p>Global GDP has increased hugely as opposed to natural capital, which increases\nthe pressure to overspend on natural capital. What are the arbitrage conditions\n(e.g. risk-adjusted rate of return) on the portfolio assets that comprise\nnatural assets? Because of imperfect pricing, there is a big misalignment of\nany portfolios. For example, the Ganges is the most polluted river in the world\nand under the Ganges action plan to remedy this there was a <a href=\"https://archive.org/details/cleaningupganges0000mark\">social cost-benefit\nanalysis</a> -- a 15% rate of return was calculated. The rate of return on\ngovernment bonds was roughly 5%. There was a gap of 10% between these two assets,\nand so if the market were efficiently organised then the Ganges' value as a stock\nshould be decreasing by 10%. But this is perverse, since the reverse should hold:\nthe river's quality has been decreasing, not increasing!</p>\n<p>So from the firm's point of view we are looking at their balance sheets (from\nthe natural capital POV). But GDP was not constructed for this purpose -- instead\nit was constructed in the post-war period to calculate progress towards getting\nout of the economic depression resulting from lack of economic activity. But somehow\nafter WWII it became a long-term goal, though it has no justification in terms of social benefit.\nGDP is a flow, and so not a predictor of the future in the way stocks are.\nYou can have GDP growth in the national accounts, but they just won't take natural capital\ninto account. There is a gap between portfolio management and GDP therefore.</p>\n<p>But countries are moving towards wealth accounting bit by bit, including for natural\ncapital aspects of wealth. Most of the attention has been towards human capital,\nincluding the attention of investors. This now needs to shift quickly towards\nnatural capital. 
There have been studies attempting to assess the market demand\nfor natural capital. Think of the biosphere as a massive fishery -- the natural\nanalogy is how much we take out of it, and what the regeneration rate is. The question\nis what the overreach is and there are <a href=\"https://naturalcapitalproject.stanford.edu/about\">projects working on this</a>.\nWe need to take the outputs of these projects and treat them as underestimates due\nto the huge extinction pressure on organisms.</p>\n<p>Even if firms are competing in the market, they still need to cooperate on the\nunderlying natural capital accounting to avoid a "market crash". In academia,\nthere is both cooperation and competition. There is a race for paper publication,\nbut also a huge amount of sharing towards global scientific co-creation. There is a\ngood deal that companies could learn from this to cooperate and communicate towards\nnatural capital accounting. More on day 2 tomorrow about how we can get there!</p>\n<p><span></span></p>\n<h2><a href=\"https://anil.recoil.org/#day-2-metrics-and-actions\"></a>Day 2: Metrics and Actions</h2>\n<p>Kat Bruce (founder of <a href=\"https://www.naturemetrics.com/\">NatureMetrics</a>) opened the session by noting just how quickly\nthe biodiversity space is moving, and how encouraging it is to see so many businesses\nengaging with environmental scientists.</p>\n<h3><a href=\"https://anil.recoil.org/#metrics-for-business-use-prof-neil-burgess\"></a>Metrics for business use (Prof Neil Burgess)</h3>\n<p>He is going to cover terrestrial metrics only. There are several sorts of metrics we could measure:</p>\n<ul>\n<li>pressures on biodiversity</li>\n<li>steady state metrics for biodiversity</li>\n<li>measuring benefits from biodiversity</li>\n<li>response metrics on the effectiveness of biodiversity interventions</li>\n</ul>\n<p>We do need to understand biodiversity risk in regional detail so that\nbusinesses can use this to influence their actions. 
There is also the need for\nclear target setting to know how much opportunity cost to "spend" on\nbiodiversity vs other actions.</p>\n<p>\n<img alt=\"Neil describes how to classify biodiversity metrics\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-16.webp\" title=\"Neil describes how to classify biodiversity metrics\">\nNeil describes how to classify biodiversity metrics\n\n<img alt=\"Categorising 573 (!) biodiversity metrics\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-17.webp\" title=\"Categorising 573 (!) biodiversity metrics\">\nCategorising 573 (!) biodiversity metrics\n\n<img alt=\"Metrics usable for businesses currently\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-18.webp\" title=\"Metrics usable for businesses currently\">\nMetrics usable for businesses currently</p>\n<p>UNEP-WCMC have a huge database of all the various metrics (~573!) that\ncover biodiversity. If you add other realms like marine, the number\nbreaks 600 (see picture). 
There are around 23 useful ones for business use;\nthere are few in this list that check benefits to people or discuss genetic\nchanges, but there are lots on diversity, habitat area and so on.\n(Neil noted that our <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE: A metric for mapping the impact of land-cover change on global extinctions</a> paper isn't listed as it's not published\nuntil the end of the year, but it will be!)</p>\n<p>\n<img alt=\"What next for biodiversity metrics?\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-19.webp\" title=\"What next for biodiversity metrics?\">\nWhat next for biodiversity metrics?</p>\n<p>There is quite a lot of work needed to make the metrics usable, and a pipeline is needed.\nHe also noted the importance of incremental pipelines and mentioned the\nincremental pipelines that <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> are working on for these.\nAnd this is something I'd like to advance with <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a>...</p>\n<p>The main tool that pulls this together is <a href=\"https://www.ibat-alliance.org\">IBAT</a>,\nwhich operates a paid-for model to support its continued development. There is also\nthe free <a href=\"https://encorenature.org/en\">ENCORE nature</a> platform. For supply chains,\nthere are also tools to help check country-to-country flows. There is a big emphasis\non the need for open sharing between platforms as well due to the sheer\ncomplexity of biodiversity worldwide, which is a key thing to keep the platforms\nsustainable both financially and equitably.</p>\n<h3><a href=\"https://anil.recoil.org/#data-availability-and-use-prof-andy-purvis\"></a>Data availability and use (Prof Andy Purvis)</h3>\n<p>Andy's career as a scientist started 35 years ago, and he is now at the Natural\nHistory Museum as a senior researcher. 
The mission there is not just to track\nthe decline of the planet, but also to advocate for a healthier, nature-positive\nsociety.</p>\n<p>\n<img alt=\"Why care about biodiversity?\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-20.webp\" title=\"Why care about biodiversity?\">\nWhy care about biodiversity?</p>\n<p>We have a million or so species of animals and plants threatened with\nextinction. Our job as scientists is to come up with defensible statistics,\nbut statistics alone don't convince anyone. So stories about individual species\n(like the white rhino) are vital to build public awareness about the real\nimpact.</p>\n<p>Is there one index to rule them all? Andy is against that idea of a single\nmetric to describe the world's biodiversity. Any indicator combining some of\nthe metrics has an "exchange rate" between extinctions and human wellbeing. In\nother words, it creates an "extinction market".</p>\n<p>Decision-grade data are derived very carefully from very biased raw data. The\ndata collection requires a huge amount of expertise, and is painstaking\nto conduct. This has to be funded, and not even the collection of the raw data\nis very well funded right now. To illustrate the data bias, there are a huge\nnumber of birds...and mallards...in there, which is clearly not representative\nof global species spread.</p>\n<p>We can plot ecosystem function (resource capture, biomass production, decomposition\nand nutrient recycling) on the axis of biological diversity (variation in genes\nand species and functional traits). 
Across this, there is a huge spectrum\nof ecosystem services that this supports, of which a few are really useful to humans.\nAnd the <a href=\"https://www.nhm.ac.uk/our-science/services/data/biodiversity-intactness-index.html\">BII index</a>\nis a statistical model relating nature to human pressures that produces both\nhigh-resolution temporal and spatial models.</p>\n<p>What should we do in terms of ecosystem health?</p>\n<ul>\n<li>Deintensify activities in unhealthy systems where people depend on local ecosystem services</li>\n<li>Divest from businesses that are poor stewards of ecosystem health</li>\n<li>Invest in actions that are "nature positive", which is an action that improves the expected global status of biodiversity relative to counterfactuals.</li>\n</ul>\n<p>It requires a model, has to be global, and must measure both species persistence and\necosystem health. It has to be vs counterfactuals otherwise the cost is too\nhigh for any individual organisation <em>vs</em> society taking action collectively.</p>\n<p>\n<img alt=\"Why not have one biodiversity index?\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-21.webp\" title=\"Why not have one biodiversity index?\">\nWhy not have one biodiversity index?\n\n<img alt=\"Ecosystem function vs health\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-22.webp\" title=\"Ecosystem function vs health\">\nEcosystem function vs health\n\n<img alt=\"Defining nature positivity as a counterfactual\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-23.webp\" title=\"Defining nature positivity as a counterfactual\">\nDefining nature positivity as a counterfactual</p>\n<p>We need to combine models with monitoring to give us a "sat nav" for nature.\nThere is a need to monitor drivers as well as biodiversity, which\nrelies on data being available to improve the models. And if the platforms are\nopen then this is possible. 
There are platforms in the form of GEO BON and IBAT; BII has a data licence with\nBloomberg; and there are some TNFD tools.</p>\n<p>Take home messages:</p>\n<ul>\n<li>Use data whose methodologies are transparent</li>\n<li>Remember the pitfalls of hybrid indicators or indices</li>\n<li>Reduce extinctions, mitigate existing activities in important areas, and don't start new human activities there</li>\n<li>Divest from poor stewards and invest in nature positive actions</li>\n<li>Monitor closely to verify gains and contribute to data repositories</li>\n<li>Accept that decision-grade data does cost money and needs funding!</li>\n</ul>\n<h3><a href=\"https://anil.recoil.org/#qa\"></a>Q&A</h3>\n<ul>\n<li>Is the restoration of an ecosystem the reverse of the destruction curve?</li>\n<li>Is there consensus about BII outside of the NHM team?</li>\n<li>Are we moving towards biodiversity standardisation?</li>\n<li>How can we make these metrics more understandable to businesses?</li>\n</ul>\n<p>Andy: we can go back to a system with the original biodiversity, but the actions to get there don't have an immediate effect (there is a time lag). There's no standardisation effort yet across many metrics, but we are moving towards ensembles of models with alternative sets of inputs for the environmental rasters in order to normalise the uncertainty in both the environmental and geographic space.</p>\n<p>Neil: UNEP-WCMC uses BII a lot! In terms of metrics, a species and ecosystem metric and something on genes would cover the three dimensions of biodiversity, but others also need to be factored in. There is an effort to use AI/geospatial methods to get higher-resolution land-use data (is this a forest?) and also probabilistic SDMs, but no standardisation.</p>\n<p>Kat: being able to reverse degradation is a secret weapon for nature, as it has an incredible ability to bounce back if pressures are reduced due to its innate resilience. 
The aspect of the data we often forget about is its ability to tell these stories and drive uptake, and this is powerful.</p>\n<ul>\n<li>What are the greenwashing risks of nature-positive counterfactuals?</li>\n<li>When we boil down the metrics that are ready for both country and business use, what are the qualities that make a metric useful? There were only 16 or so out of ~600!</li>\n</ul>\n<p>Neil: There are criteria (peer reviewed, published, etc) with lots of feedback from the co-authors of the paper (to appear later this year). There are strongly held opinions in the biodiversity metric space. But it's actually not a huge list, and is quite manageable.</p>\n<p>Andy: trying to avoid false claims of nature positive requires verification. We're going to get better at this by estimating the net gain and how certain it is that it's positive. The models will improve when there is a connecting pipeline to monitoring data for verification. So any payment for nature-positive claims needs to be staged and ex-ante.</p>\n<h2><a href=\"https://anil.recoil.org/#other-talks\"></a>Other talks</h2>\n<p>I failed to capture notes for the middle sessions as I was engrossed in conversations, but here's a gallery of some of the fantastic speakers! 
They're on the livestream video if you want to catch the details.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-24.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-25.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-26.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-28.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-29.webp\" title=\"\">\n\n\n<img alt=\"\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-30.webp\" title=\"\">\n</p>\n<h1><a href=\"https://anil.recoil.org/#parthas-day-2-roundup\"></a>Partha's Day 2 roundup</h1>\n<p>See the equation below. The right hand side is <code>G</code> (nature's regeneration rates) and is a function of the state of the biosphere (the <code>S</code>).\nThe left hand side is the human demand (human activity), which is a per-capita adjusted population and the efficiency of provisioning goods (recall what those are from the start of day 1 notes).\nMost of the discussion in these two days has been on the right hand side. A huge amount of the literature is on the alpha, and how technological progress can raise alpha (e.g. via cleaner energy) and help rebalance the inequality. How fast can alpha change? The faster it grows, the faster GDP and natural capital grow.</p>\n<p>\n<img alt=\"Sir Partha&apos;s equation to sum up the two days!\" src=\"https://anil.recoil.org/images/rs-ecorisk24/rs-ecorisk-31.webp\" title=\"Sir Partha&apos;s equation to sum up the two days!\">\nSir Partha's equation to sum up the two days!</p>\n<p>How might we advocate for change? We pay ourselves 2-3% of global GDP in subsidies for our assault on nature. Removing those subsidies is the equivalent of raising alpha. 
Adam Smith's classic book in the 18th century was the "Wealth of Nations" and not the "GDP of Nations". For us, wealth includes natural capital. Even as early as 60 years ago, human capital didn't appear in the economic literature. Most national accountants include statements about the increase in human capital, and for our purposes we must include natural capital in the notion of wealth. We must shift accounting away from GDP into calculations of stock and inequalities, as shown in the equation.</p>\n<p>What the discussions didn't cover is invasive species and the "transfer of natural capital" through the fact that goods and services are traded. This is represented in the capital <code>N</code> in the equation. So that's future work!</p>\n<h2><a href=\"https://anil.recoil.org/#follow-more\"></a>Follow more</h2>\n<p>This concludes my rapid note taking. Do join the <a href=\"https://royalsociety.org/science-events-and-lectures/2024/10/ecological-and-commercial-risk/\">livestream</a> and follow for the remaining day and a half if you have a spare moment!</p>\n<p><em>Edit: upon being prodded by <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a>, I've uploaded a <a href=\"https://notebooklm.google\">NotebookLM</a> generated podcast summary of the morning. It's surprisingly entertaining, but I'm sure I'm going to regret this for some reason...</em></p>\n<p></p><div></div><p></p>",
+18
avsm/notes_rs-future-of-publishing.json
···+"summary": "<p>I was a bit sleepy getting into the Royal Society <a href=\"https://royalsociety.org/science-events-and-lectures/2025/07/future-of-scientific-publishing/\">Future of Scientific\nPublishing</a>\nconference early this morning, but was quickly woken up by the dramatic passion\non show as publishers, librarians, academics and funders all got together for a\n"frank exchange of views" at a meeting that didn't pull any punches!</p>\n<p>These are my hot-off-the-press livenotes and only lightly edited; a more cleaned up version will be available\nfrom the RS in due course.</p>\n<p>\n<img alt=\"Sir Mark Walport FRS opens up the conference\" src=\"https://anil.recoil.org/images/rspub-1.webp\" title=\"Sir Mark Walport FRS opens up the conference\">\nSir Mark Walport FRS opens up the conference</p>\n<h2><a href=\"https://anil.recoil.org/#mark-walport-sets-the-scene\"></a>Mark Walport sets the scene</h2>\n<p>Sir Mark Walport was a delightful emcee for the proceedings of the day, and\nopened by stressing how important the moment is for the future of how we conduct science.\nAcademic publishing faces a perfect storm: peer review is buckling under\nenormous volume, funding models are broken and replete with perverse\nincentives, and the entire system groans with inefficiency.</p>\n<p>The Royal Society is the publisher of the world's oldest continuously published\nscientific journal <a href=\"https://royalsocietypublishing.org/journal/rstb\">Philosophical Transactions</a>\n(since 1665) and has convened this conference for academies worldwide. 
The\noverall question is: what <em>is</em> a scientific journal in 2025 and beyond?\nWalport traced the economic evolution of publishing: for centuries, readers\npaid through subscriptions (I hadn't realised that the <a href=\"https://royalsociety.org/blog/2015/03/philosophical-transactions-the-early-years/\">early editions of the RS</a>\nused to be sent for free to libraries worldwide until the current commercial\nmodel arrived about 80 years ago). Now, the pendulum has swung to open access\nthat creates perverse incentives that prioritize volume over quality. He called\nit a "smoke and mirrors" era where diamond open access models obscure who\n<em>actually</em> pays for the infrastructure of knowledge dissemination: is it the\npublishers, the governments, the academics, the libraries, or some combination\nof the above? The profit margins of the commercial publishers answer that\nquestion for me...</p>\n<p>He then identified the transformative forces acting as a forcing function:</p>\n<ul>\n<li>LLMs have <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">entered</a> the publishing ecosystem</li>\n<li>The proliferation of journals has created an attention economy rather than a knowledge economy</li>\n<li><a href=\"https://openreview.net/\">Preprint</a> archives are reshaping how research is shared quickly</li>\n</ul>\n<p>The challenges ahead while dealing with these are maintaining metadata\nintegrity, preserving the scholarly archive into the long term, and ensuring\nsystematic access for meta-analyses that advance human knowledge.</p>\n<h2><a href=\"https://anil.recoil.org/#historical-perspectives-350-years-of-evolution\"></a>Historical Perspectives: 350 Years of Evolution</h2>\n<p>The opening pair of speakers were unexpected: they brought a historical and\nlinguistic perspective to the problem. I found both of these talks the\nhighlights of the day! 
Firstly <a href=\"https://www.st-andrews.ac.uk/history/people/akf\">Professor Aileen\nFyfe</a> drew upon her research\nfrom 350 years of the Royal Society archives. Back in the day, there was no\nreal fixed entity called a "scientific journal". Over the centuries, everything\nfrom editorial practices to publication methods and means of dissemination\nhas transformed repeatedly, so we shouldn't view the status quo as set in stone.</p>\n<p>\n<img alt=\"Professor Aileen Fyfe talks publishing history\" src=\"https://anil.recoil.org/images/rspub-2.webp\" title=\"Professor Aileen Fyfe talks publishing history\">\nProfessor Aileen Fyfe talks publishing history</p>\n<p>While the early days of science were essentially people writing letters to each\nother, the post-WWII era of journals marked the shift to "scale". The tools for\ndistance communication (i.e. publishing collected issues) and universities\nswitching from being teaching focused to today's research-centric\npublishing ecosystem were both key factors. University scientists produced 30% of published articles in 1900; by 2020, that figure exceeded 80%.\nThis parallels the globalization of science itself in the past century;\nresearch has expanded well beyond its European origins to encompass almost all\ninstitutions and countries worldwide.</p>\n<p>Amusingly, Prof Fyfe pointed out that a 1960 Nature editorial asked <em>"<a href=\"https://www.nature.com/articles/186018a0\">How many more new\njournals?</a>"</em> even back then! The 1950s\ndid bring some standardization efforts (nomenclature, units, symbols), though\ncitation formats robustly seem to resist uniformity. 
English was also\nexplicitly selected as the "<a href=\"https://en.wikipedia.org/wiki/Languages_of_science\">default language for\nscience</a>", and peer review\nwas also formalised via papers like <em>"<a href=\"https://journals.sagepub.com/doi/10.1177/000456327901600179\">Uniform requirements for manuscripts submitted to biomedical journals</a>"</em> (in 1979). <a href=\"https://nsf-gov-resources.nsf.gov/pubs/1977/nsb77468/nsb77468.pdf\">US Congressional hearings</a>\nwith the NSF began distinguishing peer review from other evaluation methods.</p>\n<p>\n<img alt=\"Professor Aileen Fyfe shows the globalisation of research over the years\" src=\"https://anil.recoil.org/images/rspub-3.webp\" title=\"Professor Aileen Fyfe shows the globalisation of research over the years\">\nProfessor Aileen Fyfe shows the globalisation of research over the years</p>\n<p>All of this scale was then "solved" by financialisation after WWII. At the turn of the\n20th century, almost no journals generated any profit (the Royal Society\ndistributed its publications freely). By 1955, financial pressures and growing scale of submissions forced a\n<a href=\"https://journals.sagepub.com/doi/10.1177/0073275321999901\">reckoning</a>, leading\nto more self-supporting models by the 1960s. An era of mergers and acquisitions\namong journals followed, reshaping the <a href=\"https://serials.uksg.org/articles/259/files/submission/proof/259-1-259-1-10-20150210.pdf\">scientific information system</a>.</p>\n<p><a href=\"https://www.universiteitleiden.nl/en/staffmembers/vincent-lariviere#tab-1\">Professor Vincent Larivi\u00e8re</a> then took the stage to dispel some myths of English monolingualism in scientific publishing. 
While <a href=\"https://garfield.library.upenn.edu/essays/V1p019y1962-73.pdf\">English offers some practical benefits</a>, the reality at non-Anglophone institutions (like his own Universit\u00e9 de Montr\u00e9al) reveals that researchers spend significantly more time reading, writing, and processing papers as non-native language speakers, and often face higher rejection rates as a result of this.\nThis wasn't always the case though; Einstein published primarily in German, not English!</p>\n<p>He went on to note that today's landscape for paper language choices is more\ndiverse than is commonly assumed. English represents only 67% of publications,\na figure which itself has been inflated by non-English papers that are commonly\npublished with English abstracts. Initiatives like the <a href=\"https://pkp.sfu.ca/2025/03/05/ojs-workshops-indonesia/\">Public Knowledge\nProject</a> have enabled\ngrowth in Indonesia and Latin America, for example. Chinese journals now\npublish twice the volume of English-language publishers, but are difficult to\nindex which makes Larivi\u00e8re's numbers even more interesting: a growing majority\nof the world is no longer publishing in English! I also heard this in my trip\nin 2023 to China with the Royal Society; the scholars we met had a sequence of\nChinese language journals they submitted to, often before "translating" the\noutputs to English journals.</p>\n<p>\n<img alt=\"Professor Lariviere uses OpenAlex to show non-English linguistic breakdowns\" src=\"https://anil.recoil.org/images/rspub-4.webp\" title=\"Professor Lariviere uses OpenAlex to show non-English linguistic breakdowns\">\nProfessor Lariviere uses OpenAlex to show non-English linguistic breakdowns</p>\n<p>All this leads us to believe that the major publishers' market share is smaller than commonly believed, which gives us reason for hope to change! 
Open access adoption worldwide currently varies fairly dramatically by per-capita <a href=\"https://ourworldindata.org/grapher/scientific-publications-per-million\">wealth and geography</a>, but reveals substantive greenspace for publishing beyond the major commercial publishers. Crucially, Larivi\u00e8re argued that research "prestige" is a socially constructed phenomenon, and not intrinsic to quality.</p>\n<p>In the Q&A, Magdalena Skipper (Nature's Editor-in-Chief) noted that the private sector is reentering academic publishing (especially <a href=\"https://www.science.org/content/article/china-tops-world-artificial-intelligence-publications-database-analysis-reveals\">in AI topics</a>). Fyfe noted the challenge of tracking private sector activities; e.g. varying corporate policies on patenting and disclosure mean they are hard to index. A plug from <a href=\"https://coherentdigital.net/\">Coherent Digital</a> noted they have catalogued 20 million reports from non-academic research; this is an exciting direction (we've got <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">30TB of grey literature</a> on our servers, still waiting to be categorised).</p>\n<p>\n<img alt=\"Professor Lariviere shows how uneven citations are across languages and geographies\" src=\"https://anil.recoil.org/images/rspub-5.webp\" title=\"Professor Lariviere shows how uneven citations are across languages and geographies\">\nProfessor Lariviere shows how uneven citations are across languages and geographies</p>\n<h2><a href=\"https://anil.recoil.org/#what-researchers-actually-need-from-stem-publishing\"></a>What researchers actually need from STEM publishing</h2>\n<p>Our very own <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> opened with a sobering demonstration of "AI\npoisoning" in the literature, referencing <a href=\"https://anil.recoil.org/static/papers/2025-ai-poison.pdf\">our recent Nature\ncomment</a>. 
He did the risky-but-catchy\ngeneration of a plausible-sounding but entirely fabricated conservation study\nusing an LLM and noted how economically motivated rational actors might quite\nreasonably use these tools to advance their agendas via the scientific record.\nAnd recovering from this will be very difficult indeed once it mixes up with\nreal science.</p>\n<p>\n<img alt=\"Bill talks about our recent AI poisoning piece\" src=\"https://anil.recoil.org/images/rspub-6.webp\" title=\"Bill talks about our recent AI poisoning piece\">\nBill talks about our recent AI poisoning piece</p>\n<p>Bill then outlined our <a href=\"https://anil.recoil.org/projects/ce\">emerging approach to subject-wide synthesis</a> via:</p>\n<ul>\n<li><strong>Systematic reviews</strong>: Slow, steady, comprehensive</li>\n<li><strong>Rapid reviews</strong>: Sprint-based approaches for urgent needs</li>\n<li><strong>Subject-wide evidence synthesis</strong>: Focused sectoral analyses</li>\n<li><strong>Ultrafast bespoke reviews</strong>: AI-accelerated with human-in-the-loop</li>\n</ul>\n<p>Going back to what journals are <em>for</em> in 2025, Bill then discussed how they were\noriginally vehicles for exchanging information through letters, but now serve\nprimarily as stamps of authority and quality assurance. In an "AI slop world,"\nthis quality assurance function becomes existentially important, but shouldn't\nnecessarily be implemented in the current system of incentives. So then, how do\nwe maintain trust when the vast majority of submissions may soon be\nAI-generated? 
<em>(Bill and I scribbled down a plan on the back of a napkin for\nthis; more on that soon!)</em></p>\n<p>\n<img alt=\"Bill also does a cheeky advert for his Conservation Concepts channel!\" src=\"https://anil.recoil.org/images/rspub-7.webp\" title=\"Bill also does a cheeky advert for his Conservation Concepts channel!\">\nBill also does a cheeky advert for his Conservation Concepts channel!</p>\n<h3><a href=\"https://anil.recoil.org/#early-career-researcher-perspectives\"></a>Early Career Researcher perspectives</h3>\n<p><a href=\"https://www.york.ac.uk/psychology/staff/postdocs/meekings,-sophie/\">Dr. Sophie Meekings</a> then took the stage to discuss the many barriers facing early career researchers (ECRs). They're on short-term contracts, are dependent on other people's grant funding, and yet are the ones conducting the frontline research that drives scientific progress. And this is <em>after</em> years spent on poorly paid PhD stipends!</p>\n<p>ECRs require:</p>\n<ul>\n<li>clear, accessible guidelines spelling out each publishing stage without requiring implicit knowledge of the "system"</li>\n<li>constructive, blinded peer review that educates rather than gatekeeps</li>\n<li>consistent authorship conventions like <a href=\"https://www.elsevier.com/researcher/author/policies-and-guidelines/credit-author-statement\">CRediT</a> (Contributor Roles Taxonomy)</li>\n</ul>\n<p>Dr. Meekings then noted how the precarious nature of most ECR positions creates cascading complications for individuals. When job-hopping between short-term contracts, who funds the publication of work from previous positions? How do ECRs balance completing past research with new employers' priorities? 
<a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> also had this issue when joining my group a few years ago, as it took a significant portion of her time in the first year to finish up her previous publication from her last research contract.</p>\n<p>If we're going to fix the system itself, then ECRs need better incentives for PIs to publish null results and exploratory work, the councils need to improve support for interdisciplinary research that doesn't fit traditional journal boundaries (as these are frontiers between "conventional" sciences where many ECRs will work), and recognition that ECRs often lack the networks for navigating journal politics where editors rule supreme.</p>\n<p>Dr. Meekings summarized ECR needs with an excellent new acronym (SCARF) that drew a round of applause!</p>\n<ul>\n<li><strong>S</strong>peed in publication processes</li>\n<li><strong>C</strong>larity in requirements and decisions</li>\n<li><strong>A</strong>ffordability of publication fees</li>\n<li><strong>R</strong>ecognition of contributions</li>\n<li><strong>F</strong>airness in review and credit</li>\n</ul>\n<p>\n<img alt=\"Dr Sophie Meekings&apos; SCARF principles for ECRs\" src=\"https://anil.recoil.org/images/rspub-8.webp\" title=\"Dr Sophie Meekings&apos; SCARF principles for ECRs\">\nDr Sophie Meekings' SCARF principles for ECRs</p>\n<p>The audience Q&A was quite robust at this point. The first question was about how we might extend the evidence synthesis approach widely.\n<a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> noted that we are currently extending this to education working with <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a>. Interconnected datasets <em>across</em> subjects are an obvious future path for evidence datasets, with common technology for handling (e.g.) retracted datasets that can be applied consistently. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> are supervising <a href=\"https://anil.recoil.org/notes/eeg-interns-2025\">projects on evidence synthesis</a> this summer on just this topic here in Cambridge.</p>\n<p>Another question was why ECRs feel that double blind review is important. Dr. Meekings noted that reviewers may not take ECR peer reviews as seriously, but this could be fixed by opening up peer review and assigning credit <em>after</em> the process is completed and not during. Interestingly, the panel all liked double-blind, which is the norm in computer science but not in other science journals. Someone from the BMJ noted there exists a lot of research into blinding; their summary was that blinding doesn't work on the whole (people know who it is anyway) and open review doesn't cause any of the problems that people think it causes.</p>\n<p>A really interesting comment from Mark Walport was that a grand scale community project could work for the future of evidence collation, but this critically depends on breaking down the current silos since it doesn't work unless everyone makes their literature available. There was much nodding from the audience in support of this line of thinking.</p>\n<h2><a href=\"https://anil.recoil.org/#charting-the-future-for-scientific-publishing\"></a>Charting the future for scientific publishing</h2>\n<p>The next panel brought together folks from across the scientific\npublishing ecosystem, moderated by Clive Cookson of the Financial Times. 
This\nwas a particularly frank and pointed panel, with lots of quite direct messages\nbeing sent between the representatives of libraries, publishers and funders!</p>\n<p>\n<img alt=\"Amy Brand from MIT Press opens the panel\" src=\"https://anil.recoil.org/images/rspub-9.webp\" title=\"Amy Brand from MIT Press opens the panel\">\nAmy Brand from MIT Press opens the panel</p>\n<p>Amy Brand (MIT Press) started by delivering a warning about conflating "open to\nread" with "open to train on". She pointed out that when MIT Press did a survey\nacross their authors, many of them raised concerns about the reinforcement of\nbias through AI training on scientific literature. While many of the authors\nacknowledged a moral imperative to make science available for LLM training,\nthey also wanted the <em>choice</em> of whether their own work is used for this. She urged\nthe community to pause and ask fundamental questions like "AI training, at what\ncost?" and "to whose benefit?". I did think she made a good point by drawing\nparallels with the early internet, where Brand pointed out that lack of\nregulation accelerated the decline of non-advertising-driven models. Her\nclosing question asked if search engines merely lead to AI-generated summaries,\nwhy serve the original content at all? This is something we discuss in our\n<a href=\"https://anil.recoil.org/papers/2025-internet-ecology\">upcoming Aarhus paper on an Internet ecology</a>.</p>\n<p><a href=\"https://experts.deakin.edu.au/66981-danny-kingsley\">Danny Kingsley</a> from Deakin University Library then delivered a biting perspective as a representative of libraries. She said that libraries are "the ones that sign the cheques that keeps the system running", which the rest of the panel all disagreed with in the subsequent discussion (they all claimed to be responsible, from the government to the foundations). 
Her survey of librarians was interesting; they all asked for:</p>\n<ul>\n<li>Transparent peer review processes</li>\n<li>Unified expectations around AI declarations and disclosures</li>\n<li>Licensing as open as possible, resisting the "salami slicing" of specific use. We also ran across this problem of overly precise restrictions on use while <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">building our paper corpus</a> for <a href=\"https://anil.recoil.org/projects/ce\">CE</a>.</li>\n</ul>\n<p>Kingsley had a great line that "publishers are monetizing the funding mandate",\nwhich <a href=\"https://www.stats.ox.ac.uk/~deane/\">Charlotte Deane</a> later also said was the most succinct way she had heard\nto describe the annoyance we all have with the vast profit margins of\ncommercial publishers. Kingsley highlighted this via the troubling practices\nin the IEEE and the American Chemical Society of charging to place papers in repositories\nunder green open access. Her blunt assessment was that publishers are not\nnegotiating in good faith. Her talk drew the biggest applause of the day by\nfar.</p>\n<p>After this, <a href=\"https://wellcome.org/about-us/our-people/staff/john-arne-rottingen\">John-Arne\nR\u00f8ttingen</a>\n(CEO of the Wellcome Trust) emphasised that funders depend on scientific\ndiscourse as a continuous process of refutations and discussions. He expressed\nconcern about overly depending on brand value as a proxy for quality, calling\nit eventually misleading even if it works sometimes in the short term. Key\npriorities the WT has are ensuring that reviewers have easy access to all\nliterature, supporting evidence synthesis initiatives to translate research\ninto impact, and controlling the open body of research outputs through digital\ninfrastructure to manage the new scale. 
However, his challenge lies in\nmaintaining sustainable financing models for all this research data; he noted\nexplicitly that the Wellcome would not cover open access costs for commercial\npublishers.</p>\n<p>R\u00f8ttingen further highlighted the Global Biodata Coalition's (of which he was a\nmember) concerns about US data resilience and framed research infrastructure\nas "a global public good" requiring collective investment and fair financing\nacross nations. Interestingly, he explicitly called out UNESCO as a weak force\nin global governance for this from the UN; I hadn't even realised that UNESCO\nwas responsible for this stuff!</p>\n<p>Finally, <a href=\"https://www.stats.ox.ac.uk/~deane/\">Prof Charlotte Deane</a> from the EPSRC also discussed what a scientific\njournal is for these days. It's not for proofreading or typesetting anymore and\n(as <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> also noted earlier), the stamp of quality is key. Deane\nargued that "research completion" doesn't happen until someone else can read it\nand reasonably verify the methods are sound; not something that can happen\nwithout more open access. Deane also warned of the existential threat of <a href=\"https://anil.recoil.org/notes/ai-poisoning\">AI poisoning</a> since "AI can make fake papers at a rate humans can't\nimagine. It won't be long before most of the content on the Internet will be AI\ngenerated".</p>\n<p>The audience Q&A was <em>very</em> blunt here. <a href=\"https://www.scholcommlab.ca/stefanie-haustein/\">Stefanie Haustein</a> pointed out that we\nare pumping billions of dollars into the publishing industry, many of which\nare shareholder companies, and so we are losing a significant percentage of\neach dollar spent. 
There is enough money in the system, but it's very\ninefficiently deployed right now!</p>\n<p><a href=\"https://www.linkedin.com/in/richardsever\">Richard Sever</a> from openRxiv asked\nhow we pay for this when major funders like the NIH have issued a series of\n<em>unfunded</em> open data mandates over recent years. John-Arne R\u00f8ttingen noted that\nUNESCO is a very weak global body and not influential here, but that we need\ncoalitions of the willing to build such open data approaches from the bottom\nup. Challenging the publisher hegemony can only be done as a pack, which led\nnicely into the next session after lunch where the founder of\n<a href=\"https://openalex.org/\">OpenAlex</a> would be present!</p>\n<h2><a href=\"https://anil.recoil.org/#who-are-the-stewards-of-knowledge-\"></a>Who are the stewards of knowledge?</h2>\n<p>After lunch (where sadly, the vegetarian options were terrible but\nluckily I had my trusty Huel bar!), we reconvened with a panel debating\nwho the stewards of the scientific record should be. This brought together\nperspectives from commercial publishers (Elsevier), open infrastructure advocates (OpenAlex),\nfunders (MRC), and university leadership (pro-VC of Birmingham).</p>\n<p><a href=\"https://www.elsevier.com/people/victoria-eva\">Victoria Eva</a> (<a href=\"https://researcheracademy.elsevier.com/publication-process/open-science/open-access-end-user-licenses\">SVP from\nElsevier</a>)\nopened by describing the "perfect storm" facing their academic publishing\nbusiness as they had 600k more submissions this year than the previous year.\nThere was a high level view on how their digital pipeline "aims to insert\nsafeguards" throughout the publication process to maintain integrity. She\nargued in general terms to view GenAI through separate lenses of trust and\ndiscoverability and argued that Elsevier's substantial technological investments\nposition them to manage both challenges well. 
I was\n<a href=\"https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science\">predisposed</a>\nto dislike excuses from staggeringly profitable commercial publishers, but I\ndid find her answers to providing bulk access to their corpus unsatisfying.\nWhile she highlighted their growing open access base of papers, she also noted\nthat the transition to open access cannot happen overnight (my personal\ntranslation is that this means slow-walking). She mentioned special cases in\nplace for\n<a href=\"https://www.elsevier.com/en-gb/about/open-science/research-data/text-and-data-mining\">TDM</a>\nin the Global South and healthcare access (presumably at the commercial\ndiscretion of Elsevier).</p>\n<p><a href=\"https://jasonpriem.org/\">Jason Priem</a> from <a href=\"https://openalex.org/\">OpenAlex</a>\n(part of <a href=\"https://ourresearch.org/\">OurResearch</a>) then offered a radically\ndifferent perspective. I'm a huge fan of OpenAlex, as we use it extensively in\nthe <a href=\"https://anil.recoil.org/projects/ce\">CE</a> infrastructure. He disagreed with the conference framing of\npublishers as "custodians" or "stewards," noting that these evoke someone\nmaintaining a static, lovely old house. Science <em>isn't</em> a static edifice but a\ngrowing ecosystem, with more scientists alive today than at any point in\nhistory. He instead proposed a "gardener" as a better metaphor; the science\necosystem needs to nourish growth rather than merely preserve what exists.\nExtending the metaphor, Priem contrasted French and English garden styles:\nFrench gardens constrain nature into platonic geometric forms, while English\ngardens embrace a more rambling style that better represents nature's inherent\ndiversity. 
He argued that science needs to adopt the "English garden" approach\nand that we don't have an information overload problem but rather "<a href=\"https://www.cnet.com/culture/shirky-problem-is-filter-failure-not-info-overload/\">bad\nfilters</a>"\n(to quote Clay Shirky).</p>\n<p>\n<img alt=\"Jason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel\" src=\"https://anil.recoil.org/images/rspub-11.webp\" title=\"Jason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel\">\nJason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel</p>\n<p>Priem advocated <em>strongly</em> for open infrastructures since communities don't just produce papers: also software, datasets, abstracts, and things we don't envision yet. If we provide them with the "digital soil" (open infrastructure) then they will prosper. OpenAlex and <a href=\"https://zenodo.org/\">Zenodo</a> are great examples of how such open infrastructure holds up here. I use both all the time; I'm a huge fan of Jason's work and talk.</p>\n<p><a href=\"https://www.ukri.org/people/patrick-chinnery/\">Patrick Chinnery</a> from the Medical Research Council brought the funder perspective with some numbers: publishing consumes 1 to 2% of total research turnover funds (roughly \u00a324 million for UKRI). He noted that during the pandemic, decision-makers were reviewing preprint data in real-time to determine which treatments should proceed to clinical trials, and decisions had to be reversed after peer review revealed flaws. He emphasised the need for more real time quality assurance in rapid decision-making contexts.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Adam_Tickell\">Adam Tickell</a> from the University of Birmingham declared the current model "broken", and noted that each attempt at reform fails to solve the <em>basic problem of literature access</em> (something I've faced myself). 
He noted that David Willetts (former UK Minister for Science) couldn't access paywalled material while in government (!), which significantly influenced <a href=\"https://www.gov.uk/government/news/government-to-open-up-publicly-funded-research\">subsequent government policy</a> towards open access.\nTickell was scathing about the oligopolies of Elsevier and Springer, arguing their <a href=\"https://www.researchprofessionalnews.com/rr-news-world-2025-2-elsevier-parent-company-reports-10-rise-in-profit-to-3-2bn/\">profit margins</a> are out of proportion with the public funding for science. He noted that early open access attempts from the <a href=\"https://ioppublishing.org/news/spotlight-on-the-finch-report/\">Finch Report</a> were well-intentioned but ultimately insufficient to break the hegemony. Perhaps an opportunity for a future UK <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">National Data Library</a>...\nTickell closed his talk with an observation about the current crisis of confidence in science. This did make me think of a <a href=\"https://bsky.app/profile/hetanshah.bsky.social/post/3lttyexntps2y\">recent report on British confidence in science</a>, which shows the British public still retains belief in scientific institutions. So at least we're doing better than the US in this regard for now!</p>\n<p>The Q&A session opened with Mark Walport asking how Elsevier manages to publish so many articles. Victoria Eva from Elsevier responded that they receive 3.5m articles annually with ~750k published. Eva mentioned something about "digital screening throughout the publication process" but acknowledged that this was a challenge due to the surge from paper mills. A suggestion of paying peer reviewers was raised from the audience but not substantively addressed. 
<a href=\"https://www.scholcommlab.ca/stefanie-haustein/\">Stefanie Haustein</a> once again made a great point from the audience about how Elsevier could let through <a href=\"https://www.vice.com/en/article/scientific-journal-frontiers-publishes-ai-generated-rat-with-gigantic-penis-in-worrying-incident/\">AI generated rats with giant penises</a> with all this protection in place; clearly, some papers have been published by them with no humans ever reading it. This generated a laugh from the audience, and an acknowlegment from the Elsevier rep that they needed to invest more and improve.</p>\n<h2><a href=\"https://anil.recoil.org/#how-to-make-open-infrastructure-sustainable\"></a>How to make open infrastructure sustainable</h2>\n<p>My laptop power ran out at this point, but the next panel was an absolute treat as it had both <a href=\"https://kaythaney.com/\">Kaitlin Thaney</a> and <a href=\"https://en.wikipedia.org/wiki/Jimmy_Wales\">Jimmy Wales</a> of Wikipedia fame on it!</p>\n<p>\n<img alt=\"Hylke Koers, Kaitlin Thaney, Jummy Wales and Ian Mulvany\" src=\"https://anil.recoil.org/images/rspub-12.webp\" title=\"Hylke Koers, Kaitlin Thaney, Jummy Wales and Ian Mulvany\">\nHylke Koers, Kaitlin Thaney, Jummy Wales and Ian Mulvany</p>\n<p>Jimmy Wales pointed out an interesting point from his "seven rules of trust" is that a key one is to be personal with human-to-human contact and not run too quickly to technological solutions. Rather than, for example, asking what percentage of academic papers showed evidence of language from ChatGPT, it's more fruitful to ask whether the science contained within the paper is good instead of how it's written. 
There are many reasons why someone might have used ChatGPT (non-native speakers etc) but also many unrelated reasons why the science might be bad.</p>\n<p>Kaitlin Thaney pointed out the importance of openness given <a href=\"https://www.motherjones.com/politics/2025/07/trump-war-assault-national-science-foundation-american-innovation-greatness-education/\">the US assault on\nscience</a>:\nopenness means that the open data repositories can reasonably be replicated as well.</p>\n<p>Ian Mulvany pointed out that Nature claims to have invested $240m in research\ninfrastructure, and this is a struggle for a medium-sized publisher (like his\nown <a href=\"https://www.bmj.com/\">BMJ</a>). Open infrastructure allows sharing and\ncreation of value to make it possible to let these smaller organisations\nsurvive.</p>\n<p>When it comes to policy recommendations, what did the panel have to say about a more trustworthy literature?</p>\n<ul>\n<li>The <a href=\"https://www.ccsd.cnrs.fr/en/posi-principles/\">POSI principles</a> came up as important levers.</li>\n<li>Kaitlin mentioned the <a href=\"https://www.nextgenlibpub.org/forest-framework\">FOREST framework</a> funded by Arcadia and how they need to manifest in concrete infrastructure. There's an implicit reliance on infrastructure that you only notice when it's taken away! Affordability of open is a key consideration as well.</li>\n<li>Jimmy talked about open source software, and what generally works is not one-size-fits-all. Some are run by companies (their main product and they sell services), and others by individuals. If we bring this back to policy, we need to look at preserving what's already working sustainably and supporting it. 
Don't try to find a general solution but adopt targeted, well-thought-through interventions instead.</li>\n</ul>\n<p><em>I'm updating this as I go along but running out of laptop battery too!</em></p>",+"content": "<p>I was a bit sleepy getting into the Royal Society <a href=\"https://royalsociety.org/science-events-and-lectures/2025/07/future-of-scientific-publishing/\">Future of Scientific\nPublishing</a>\nconference early this morning, but was quickly woken up by the dramatic passion\non show as publishers, librarians, academics and funders all got together for a\n"frank exchange of views" at a meeting that didn't pull any punches!</p>\n<p>These are my hot-off-the-press livenotes and only lightly edited; a more cleaned-up version will be available\nfrom the RS in due course.</p>\n<p>\n<img alt=\"Sir Mark Walport FRS opens up the conference\" src=\"https://anil.recoil.org/images/rspub-1.webp\" title=\"Sir Mark Walport FRS opens up the conference\">\nSir Mark Walport FRS opens up the conference</p>\n<h2><a href=\"https://anil.recoil.org/#mark-walport-sets-the-scene\"></a>Mark Walport sets the scene</h2>\n<p>Sir Mark Walport was a delightful emcee for the proceedings of the day, and\nopened by noting how important the moment is for the future of how we conduct science.\nAcademic publishing faces a perfect storm: peer review is buckling under\nenormous volume, funding models are broken and replete with perverse\nincentives, and the entire system groans with inefficiency.</p>\n<p>The Royal Society is the publisher of the world's oldest continuously published\nscientific journal <a href=\"https://royalsocietypublishing.org/journal/rstb\">Philosophical Transactions</a>\n(since 1665) and has convened this conference for academies worldwide. 
The\noverall question is: what <em>is</em> a scientific journal in 2025 and beyond?\nWalport traced the economic evolution of publishing: for centuries, readers\npaid through subscriptions (I hadn't realised that the <a href=\"https://royalsociety.org/blog/2015/03/philosophical-transactions-the-early-years/\">early editions of the RS</a>\nused to be sent for free to libraries worldwide until the current commercial\nmodel arrived about 80 years ago). Now, the pendulum has swung to open access,\nwhich creates perverse incentives that prioritize volume over quality. He called\nit a "smoke and mirrors" era where diamond open access models obscure who\n<em>actually</em> pays for the infrastructure of knowledge dissemination: is it the\npublishers, the governments, the academics, the libraries, or some combination\nof the above? The profit margins of the commercial publishers answer that\nquestion for me...</p>\n<p>He then identified the transformative forces that are forcing change:</p>\n<ul>\n<li>LLMs have <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">entered</a> the publishing ecosystem</li>\n<li>The proliferation of journals has created an attention economy rather than a knowledge economy</li>\n<li><a href=\"https://openreview.net/\">Preprint</a> archives are reshaping how research is shared quickly</li>\n</ul>\n<p>The challenges ahead while dealing with these are maintaining metadata\nintegrity, preserving the scholarly archive into the long term, and ensuring\nsystematic access for meta-analyses that advance human knowledge.</p>\n<h2><a href=\"https://anil.recoil.org/#historical-perspectives-350-years-of-evolution\"></a>Historical Perspectives: 350 Years of Evolution</h2>\n<p>The opening pair of speakers were unexpected: they brought a historical and\nlinguistic perspective to the problem. I found both of these talks the\nhighlights of the day! 
Firstly, <a href=\"https://www.st-andrews.ac.uk/history/people/akf\">Professor Aileen\nFyfe</a> drew upon her research\nfrom 350 years of the Royal Society archives. Back in the day, there was no\nreal fixed entity called a "scientific journal". Over the centuries, everything\nfrom editorial practices to publication methods and means of dissemination\nhas transformed repeatedly, so we shouldn't view the status quo as set in stone.</p>\n<p>\n<img alt=\"Professor Aileen Fyfe talks publishing history\" src=\"https://anil.recoil.org/images/rspub-2.webp\" title=\"Professor Aileen Fyfe talks publishing history\">\nProfessor Aileen Fyfe talks publishing history</p>\n<p>While the early days of science were essentially people writing letters to each\nother, the post-WWII era of journals marked the shift to "scale". The tools for\ndistance communication (i.e. publishing collected issues) and universities\nswitching from being teaching-focused to today's research-centric\npublishing ecosystem were both key factors. University scientists\nproduced 30% of published articles in 1900; by 2020, that figure exceeded 80%.\nThis parallels the globalization of science itself in the past century;\nresearch has expanded well beyond its European origins to encompass almost all\ninstitutions and countries worldwide.</p>\n<p>Amusingly, Prof Fyfe pointed out that a 1960 Nature editorial asked <em>"<a href=\"https://www.nature.com/articles/186018a0\">How many more new\njournals?</a>"</em> even back then! The 1950s\ndid bring some standardization efforts (nomenclature, units, symbols),\nthough citation formats have robustly resisted uniformity. 
English was also\nexplicitly selected as the "<a href=\"https://en.wikipedia.org/wiki/Languages_of_science\">default language for\nscience</a>", and peer review\nwas also formalised via papers like <em>"<a href=\"https://journals.sagepub.com/doi/10.1177/000456327901600179\">Uniform requirements for manuscripts submitted to biomedical journals</a>"</em> (in 1979). <a href=\"https://nsf-gov-resources.nsf.gov/pubs/1977/nsb77468/nsb77468.pdf\">US Congressional hearings</a>\nwith the NSF began distinguishing peer review from other evaluation methods.</p>\n<p>\n<img alt=\"Professor Aileen Fyfe shows the globalisation of research over the years\" src=\"https://anil.recoil.org/images/rspub-3.webp\" title=\"Professor Aileen Fyfe shows the globalisation of research over the years\">\nProfessor Aileen Fyfe shows the globalisation of research over the years</p>\n<p>All of this scale was then "solved" by financialisation after WWII. At the turn of the\n20th century, almost no journals generated any profit (the Royal Society\ndistributed its publications freely). By 1955, financial pressures and growing scale of submissions forced a\n<a href=\"https://journals.sagepub.com/doi/10.1177/0073275321999901\">reckoning</a>, leading\nto more self-supporting models by the 1960s. An era of mergers and acquisitions\namong journals followed, reshaping the <a href=\"https://serials.uksg.org/articles/259/files/submission/proof/259-1-259-1-10-20150210.pdf\">scientific information system</a>.</p>\n<p><a href=\"https://www.universiteitleiden.nl/en/staffmembers/vincent-lariviere#tab-1\">Professor Vincent Larivi\u00e8re</a> then took the stage to dispel some myths of English monolingualism in scientific publishing. 
While <a href=\"https://garfield.library.upenn.edu/essays/V1p019y1962-73.pdf\">English offers some practical benefits</a>, the reality at non-Anglophone institutions (like his own Universit\u00e9 de Montr\u00e9al) reveals that researchers spend significantly more time reading, writing, and processing papers as non-native language speakers, and often face higher rejection rates as a result.\nThis wasn't always the case though; Einstein published primarily in German, not English!</p>\n<p>He went on to note that today's landscape for paper language choices is more\ndiverse than is commonly assumed. English represents only 67% of publications,\na figure which itself has been inflated by non-English papers that are commonly\npublished with English abstracts. Initiatives like the <a href=\"https://pkp.sfu.ca/2025/03/05/ojs-workshops-indonesia/\">Public Knowledge\nProject</a> have enabled\ngrowth in Indonesia and Latin America, for example. Chinese journals now\npublish twice the volume of English-language publishers, but are difficult to\nindex, which makes Larivi\u00e8re's numbers even more interesting: a growing majority\nof the world is no longer publishing in English! I also heard this on my trip\nto China in 2023 with the Royal Society; the scholars we met had a sequence of\nChinese-language journals they submitted to, often before "translating" the\noutputs to English journals.</p>\n<p>\n<img alt=\"Professor Lariviere uses OpenAlex to show non-English linguistic breakdowns\" src=\"https://anil.recoil.org/images/rspub-4.webp\" title=\"Professor Lariviere uses OpenAlex to show non-English linguistic breakdowns\">\nProfessor Lariviere uses OpenAlex to show non-English linguistic breakdowns</p>\n<p>All this leads us to believe that the major publishers' market share is smaller than commonly believed, which gives us hope for change! 
Open access adoption worldwide currently varies fairly dramatically by per-capita <a href=\"https://ourworldindata.org/grapher/scientific-publications-per-million\">wealth and geography</a>, but reveals substantive greenspace for publishing beyond the major commercial publishers. Crucially, Larivi\u00e8re argued that research "prestige" is a socially constructed phenomenon, and not intrinsic to quality.</p>\n<p>In the Q&A, Magdalena Skipper (Nature's Editor-in-Chief) noted that the private sector is reentering academic publishing (especially <a href=\"https://www.science.org/content/article/china-tops-world-artificial-intelligence-publications-database-analysis-reveals\">in AI topics</a>). Fyfe noted the challenge of tracking private sector activities; e.g. varying corporate policies on patenting and disclosure mean they are hard to index. A plug from <a href=\"https://coherentdigital.net/\">Coherent Digital</a> noted they have catalogued 20 million reports from non-academic research; this is an exciting direction (we've got <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">30TB of grey literature</a> on our servers, still waiting to be categorised).</p>\n<p>\n<img alt=\"Professor Lariviere shows how uneven citations are across languages and geographies\" src=\"https://anil.recoil.org/images/rspub-5.webp\" title=\"Professor Lariviere shows how uneven citations are across languages and geographies\">\nProfessor Lariviere shows how uneven citations are across languages and geographies</p>\n<h2><a href=\"https://anil.recoil.org/#what-researchers-actually-need-from-stem-publishing\"></a>What researchers actually need from STEM publishing</h2>\n<p>Our very own <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> opened with a sobering demonstration of "AI\npoisoning" in the literature, referencing <a href=\"https://anil.recoil.org/static/papers/2025-ai-poison.pdf\">our recent Nature\ncomment</a>. 
He did the risky-but-catchy\ngeneration of a plausible-sounding but entirely fabricated conservation study\nusing an LLM and noted how economically motivated rational actors might quite\nreasonably use these tools to advance their agendas via the scientific record.\nAnd recovering from this will be very difficult indeed once it gets mixed up with\nreal science.</p>\n<p>\n<img alt=\"Bill talks about our recent AI poisoning piece\" src=\"https://anil.recoil.org/images/rspub-6.webp\" title=\"Bill talks about our recent AI poisoning piece\">\nBill talks about our recent AI poisoning piece</p>\n<p>Bill then outlined our <a href=\"https://anil.recoil.org/projects/ce\">emerging approach to subject-wide synthesis</a> via:</p>\n<ul>\n<li><strong>Systematic reviews</strong>: Slow, steady, comprehensive</li>\n<li><strong>Rapid reviews</strong>: Sprint-based approaches for urgent needs</li>\n<li><strong>Subject-wide evidence synthesis</strong>: Focused sectoral analyses</li>\n<li><strong>Ultrafast bespoke reviews</strong>: AI-accelerated with human-in-the-loop</li>\n</ul>\n<p>Going back to what journals are <em>for</em> in 2025, Bill then discussed how they were\noriginally vehicles for exchanging information through letters, but now serve\nprimarily as stamps of authority and quality assurance. In an "AI slop world,"\nthis quality assurance function becomes existentially important, but shouldn't\nnecessarily be implemented in the current system of incentives. So then, how do\nwe maintain trust when the vast majority of submissions may soon be\nAI-generated? 
<em>(Bill and I scribbled down a plan on the back of a napkin for\nthis; more on that soon!)</em></p>\n<p>\n<img alt=\"Bill also does a cheeky advert for his Conservation Concepts channel!\" src=\"https://anil.recoil.org/images/rspub-7.webp\" title=\"Bill also does a cheeky advert for his Conservation Concepts channel!\">\nBill also does a cheeky advert for his Conservation Concepts channel!</p>\n<h3><a href=\"https://anil.recoil.org/#early-career-researcher-perspectives\"></a>Early Career Researcher perspectives</h3>\n<p><a href=\"https://www.york.ac.uk/psychology/staff/postdocs/meekings,-sophie/\">Dr. Sophie Meekings</a> then took the stage to discuss the many barriers facing early career researchers (ECRs). They're on short-term contracts, are dependent on other people's grant funding, and yet are the ones conducting the frontline research that drives scientific progress. And this is <em>after</em> years spent on poorly paid PhD stipends!</p>\n<p>ECRs require:</p>\n<ul>\n<li>clear, accessible guidelines spelling out each publishing stage without requiring implicit knowledge of the "system"</li>\n<li>constructive, blinded peer review that educates rather than gatekeeps</li>\n<li>consistent authorship conventions like <a href=\"https://www.elsevier.com/researcher/author/policies-and-guidelines/credit-author-statement\">CRediT</a> (Contributor Roles Taxonomy)</li>\n</ul>\n<p>Dr. Meekings then noted how the precarious nature of most ECR positions creates cascading complications for individuals. When job-hopping between short-term contracts, who funds the publication of work from previous positions? How do ECRs balance completing past research with new employers' priorities? 
<a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> also had this issue when joining my group a few years ago, as it took a significant portion of her time in the first year to finish up her previous publication from her last research contract.</p>\n<p>If we're going to fix the system itself, then ECRs need better incentives for PIs to publish null results and exploratory work, the councils need to improve support for interdisciplinary research that doesn't fit traditional journal boundaries (as these as frontiers between "conventional" science where many ECRs will work), and recognition that ECRs often lack the networks for navigating journal politics where editors rule supreme.</p>\n<p>Dr. Meekings summarized ECR needs with an excellent new acronym (SCARF) that drew a round of applause!</p>\n<ul>\n<li><strong>S</strong>peed in publication processes</li>\n<li><strong>C</strong>larity in requirements and decisions</li>\n<li><strong>A</strong>ffordability of publication fees</li>\n<li><strong>R</strong>ecognition of contributions</li>\n<li><strong>F</strong>airness in review and credit</li>\n</ul>\n<p>\n<img alt=\"Dr Sophie Meekings&apos; SCARF principles for ECRs\" src=\"https://anil.recoil.org/images/rspub-8.webp\" title=\"Dr Sophie Meekings&apos; SCARF principles for ECRs\">\nDr Sophie Meekings' SCARF principles for ECRs</p>\n<p>The audience Q&A was quite robust at this point. The first question was about how might we extend the evidence synthesis approach widely?\n<a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> noted that we are currently extending this to education working with <a href=\"https://www.educ.cam.ac.uk/people/staff/gibson/\">Jenny Gibson</a>. Interconnected datasets <em>across</em> subjects are an obvious future path for evidence datasets, with common technology for handling (e.g.) retracted datasets that can be applied consistently. 
<a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> are supervising <a href=\"https://anil.recoil.org/notes/eeg-interns-2025\">projects on evidence synthesis</a> this summer on just this topic here in Cambridge.</p>\n<p>Another question was why ECRs feel that double blind review is important. Dr. Meekings noted that reviewers may not take ECR peer reviews as seriously, but this coul dbe fixed by opening up peer review and assigning credit <em>after</em> the process is completed and not during. Interestingly, the panel all like double-blind, which is the norm in computer science but not in other science journals. Some from the BMJ noted there exists a lot of research into blinding; they summarised it that blinding doesn't work on the whole (people know who it is anyway) and open review doesn't cause any of the problems that people think it causes.</p>\n<p>A really interesting comment from Mark Walport was that a grand scale community project could work for the future of evidence collation, but this critically depends on breaking down the current silos since it doesn't work unless everyone makes their literature available. There was much nodding from the audience in support of this line of thinkin.g</p>\n<h2><a href=\"https://anil.recoil.org/#charting-the-future-for-scientific-publishing\"></a>Charting the future for scientific publishing</h2>\n<p>The next panel brought together folks from across the scientific\npublishing ecosystem, moderated by Clive Cookson of the Financial Times. 
This\nwas a particularly frank and pointed panel, with lots of quite direct messages\nbeing sent between the representatives of libraries, publishers and funders!</p>\n<p>\n<img alt=\"Amy Brand from MIT Press opens the panel\" src=\"https://anil.recoil.org/images/rspub-9.webp\" title=\"Amy Brand from MIT Press opens the panel\">\nAmy Brand from MIT Press opens the panel</p>\n<p>Amy Brand (MIT Press) started by delivering a warning about conflating "open to\nread" with "open to train on". She pointed out that when MIT Press did a survey\nacross their authors, many of them raised concerns about the reinforcement of\nbias through AI training on scientific literature. While many of the authors\nacknowledged a moral imperative to make science available for LLM training,\nthey also wanted the <em>choice</em> of whether their own work is used for this. She urged\nthe community to pause and ask fundamental questions like "AI training, at what\ncost?" and "to whose benefit?". I did think she made a good point in drawing\nparallels with the early internet, where lack of\nregulation accelerated the decline of non-advertising-driven models. Her\nclosing question: if search engines merely lead to AI-generated summaries,\nwhy serve the original content at all? This is something we discuss in our\n<a href=\"https://anil.recoil.org/papers/2025-internet-ecology\">upcoming Aarhus paper on an Internet ecology</a>.</p>\n<p><a href=\"https://experts.deakin.edu.au/66981-danny-kingsley\">Danny Kingsley</a> from Deakin University Library then delivered a biting perspective as a representative of libraries. She said that libraries are "the ones that sign the cheques that keeps the system running", which the rest of the panel all disagreed with in the subsequent discussion (they all claimed to be responsible, from the government to the foundations). 
Her survey of librarians was interesting; they all asked for:</p>\n<ul>\n<li>Transparent peer review processes</li>\n<li>Unified expectations around AI declarations and disclosures</li>\n<li>Licensing as open as possible, resisting the "salami slicing" of specific use. We also ran across this problem of overly precise restrictions on use while <a href=\"https://anil.recoil.org/papers/2025-ai-poison\">building our paper corpus</a> for <a href=\"https://anil.recoil.org/projects/ce\">CE</a>.</li>\n</ul>\n<p>Kingsley had a great line that "publishers are monetizing the funding mandate",\nwhich <a href=\"https://www.stats.ox.ac.uk/~deane/\">Charlotte Deane</a> later also said was the most succinct way she had heard\nto describe the annoyance we all have with the vast profit margins of\ncommercial publishers. Kingsley highlighted this via the troubling practices\nof the IEEE and the American Chemical Society, which charge for placing papers\nin repositories under green open access. Her blunt assessment was that publishers are not\nnegotiating in good faith. Her talk drew the biggest applause of the day by\nfar.</p>\n<p>After this, <a href=\"https://wellcome.org/about-us/our-people/staff/john-arne-rottingen\">John-Arne\nR\u00f8ttingen</a>\n(CEO of the Wellcome Trust) emphasised that funders depend on scientific\ndiscourse as a continuous process of refutations and discussions. He expressed\nconcern about overly depending on brand value as a proxy for quality, calling\nit ultimately misleading even if it sometimes works in the short term. Key\npriorities for the WT are ensuring that reviewers have easy access to all the\nliterature, supporting evidence synthesis initiatives to translate research\ninto impact, and controlling the open body of research outputs through digital\ninfrastructure to manage the new scale. 
However, his challenge lies in\nmaintaining sustainable financing models for all this research data; he noted\nexplicitly that the Wellcome would not cover open access costs for commercial\npublishers.</p>\n<p>R\u00f8ttingen further highlighted the Global Biodata Coalition's (of which he was\na member) concerns about US data resilience and framed research infrastructure\nas "a global public good" requiring collective investment and fair financing\nacross nations. Interestingly, he explicitly called out UNESCO as a weak force\nfrom the UN in global governance for this; I hadn't even realised that UNESCO\nwas responsible for this stuff!</p>\n<p>Finally, <a href=\"https://www.stats.ox.ac.uk/~deane/\">Prof Charlotte Deane</a> from the EPSRC also discussed what a scientific\njournal is for these days. It's not for proofreading or typesetting anymore and\n(as <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a> also noted earlier), the stamp of quality is key. Deane\nargued that "research completion" doesn't happen until someone else can read it\nand reasonably verify the methods are sound; not something that can happen\nwithout more open access. Deane also warned of the existential threat of <a href=\"https://anil.recoil.org/notes/ai-poisoning\">AI poisoning</a> since "AI can make fake papers at a rate humans can't\nimagine. It won't be long before most of the content on the Internet will be AI\ngenerated".</p>\n<p>The audience Q&A was <em>very</em> blunt here. <a href=\"https://www.scholcommlab.ca/stefanie-haustein/\">Stefanie Haustein</a> pointed out that we\nare pumping billions of dollars into the publishing industry, much of it into\nshareholder-owned companies, and so we are losing a significant percentage of\neach dollar spent. 
There is enough money in the system, but it's very\ninefficiently deployed right now!</p>\n<p><a href=\"https://www.linkedin.com/in/richardsever\">Richard Sever</a> from openRxiv asked\nhow we pay for this when major funders like the NIH have issued a series of\n<em>unfunded</em> open data mandates over recent years. John-Arne R\u00f8ttingen noted that\nUNESCO is a very weak global body and not influential here, but that we need\ncoalitions of the willing to build such open data approaches from the bottom\nup. Challenging the publisher hegemony can only be done as a pack, which led\nnicely into the next session after lunch where the founder of\n<a href=\"https://openalex.org/\">OpenAlex</a> would be present!</p>\n<h2><a href=\"https://anil.recoil.org/#who-are-the-stewards-of-knowledge-\"></a>Who are the stewards of knowledge?</h2>\n<p>After lunch (where sadly, the vegetarian options were terrible but\nluckily I had my trusty Huel bar!), we reconvened with a panel debating\nwho the stewards of the scientific record should be. This brought together\nperspectives from commercial publishers (Elsevier), open infrastructure advocates (OpenAlex),\nfunders (MRC), and university leadership (pro-VC of Birmingham).</p>\n<p><a href=\"https://www.elsevier.com/people/victoria-eva\">Victoria Eva</a> (<a href=\"https://researcheracademy.elsevier.com/publication-process/open-science/open-access-end-user-licenses\">SVP from\nElsevier</a>)\nopened by describing the "perfect storm" facing their academic publishing\nbusiness as they had 600k more submissions this year than the previous year.\nThere was a high-level view on how their digital pipeline "aims to insert\nsafeguards" throughout the publication process to maintain integrity. She\nargued in general terms to view GenAI through separate lenses of trust and\ndiscoverability and argued that Elsevier's substantial technological investments\nposition them to manage both challenges well. 
I was\n<a href=\"https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science\">predisposed</a>\nto dislike excuses from staggeringly profitable commercial publishers, but I\ndid find her answers to providing bulk access to their corpus unsatisfying.\nWhile she highlighted their growing open access base of papers, she also noted\nthat the transition to open access cannot happen overnight (my personal\ntranslation is that this means slow-walking). She mentioned special cases in\nplace for\n<a href=\"https://www.elsevier.com/en-gb/about/open-science/research-data/text-and-data-mining\">TDM</a>\nin the Global South and healthcare access (presumably at the commercial\ndiscretion of Elsevier).</p>\n<p><a href=\"https://jasonpriem.org/\">Jason Priem</a> from <a href=\"https://openalex.org/\">OpenAlex</a>\n(part of <a href=\"https://ourresearch.org/\">OurResearch</a>) then offered a radically\ndifferent perspective. I'm a huge fan of OpenAlex, as we use it extensively in\nthe <a href=\"https://anil.recoil.org/projects/ce\">CE</a> infrastructure. He disagreed with the conference framing of\npublishers as "custodians" or "stewards," noting that these evoke someone\nmaintaining a static, lovely old house. Science <em>isn't</em> a static edifice but a\ngrowing ecosystem, with more scientists alive today than at any point in\nhistory. He instead proposed a "gardener" as a better metaphor; the science\necosystem needs to nourish growth rather than merely preserving what exists.\nExtending the metaphor, Priem contrasted French and English garden styles:\nFrench gardens constrain nature into platonic geometric forms, while English\ngardens embrace a more rambling style that better represents nature's inherent\ndiversity. 
He argued that science needs to adopt the "English garden" approach\nand that we don't have an information overload problem but rather "<a href=\"https://www.cnet.com/culture/shirky-problem-is-filter-failure-not-info-overload/\">bad\nfilters</a>"\n(to quote Clay Shirky).</p>\n<p>\n<img alt=\"Jason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel\" src=\"https://anil.recoil.org/images/rspub-11.webp\" title=\"Jason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel\">\nJason Priem (OpenAlex), Victoria Eva (Elsevier) and Mark Walport in the panel</p>\n<p>Priem advocated <em>strongly</em> for open infrastructures since communities don't just produce papers but also software, datasets, abstracts, and things we don't envision yet. If we provide them with the "digital soil" (open infrastructure) then they will prosper. OpenAlex and <a href=\"https://zenodo.org/\">Zenodo</a> are great examples of how such open infrastructure holds up here. I use both all the time; I'm a huge fan of Jason's work and talk.</p>\n<p><a href=\"https://www.ukri.org/people/patrick-chinnery/\">Patrick Chinnery</a> from the Medical Research Council brought the funder perspective with some numbers: publishing consumes 1 to 2% of total research turnover funds (roughly \u00a324 million for UKRI). He noted that during the pandemic, decision-makers were reviewing preprint data in real time to determine which treatments should proceed to clinical trials, and decisions had to be reversed after peer review revealed flaws. He emphasised the need for more real-time quality assurance in rapid decision-making contexts.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Adam_Tickell\">Adam Tickell</a> from the University of Birmingham declared the current model "broken", and noted that each attempt at reform fails to solve the <em>basic problem of literature access</em> (something I've faced myself). 
He noted that David Willetts (former UK Minister for Science) couldn't access paywalled material while in government (!), which significantly influenced <a href=\"https://www.gov.uk/government/news/government-to-open-up-publicly-funded-research\">subsequent government policy</a> towards open access.\nTickell was scathing about the oligopolies of Elsevier and Springer, arguing their <a href=\"https://www.researchprofessionalnews.com/rr-news-world-2025-2-elsevier-parent-company-reports-10-rise-in-profit-to-3-2bn/\">profit margins</a> are out of proportion with the public funding for science. He noted that early open access attempts from the <a href=\"https://ioppublishing.org/news/spotlight-on-the-finch-report/\">Finch Report</a> were well-intentioned but ultimately insufficient to break the hegemony. Perhaps an opportunity for a future UK <a href=\"https://anil.recoil.org/notes/uk-national-data-lib\">National Data Library</a>...\nTickell closed his talk with an observation about the current crisis of confidence in science. This did make me think of a <a href=\"https://bsky.app/profile/hetanshah.bsky.social/post/3lttyexntps2y\">recent report on British confidence in science</a>, which shows the British public still retains belief in scientific institutions. So at least we're doing better than the US in this regard for now!</p>\n<p>The Q&A session opened with Mark Walport asking how Elsevier manages to publish so many articles. Victoria Eva from Elsevier responded that they receive 3.5m articles annually with ~750k published. Eva mentioned something about "digital screening throughout the publication process" but acknowledged that this was a challenge due to the surge from paper mills. A suggestion of paying peer reviewers was raised from the audience but not substantively addressed. 
<a href=\"https://www.scholcommlab.ca/stefanie-haustein/\">Stefanie Haustein</a> once again made a great point from the audience about how Elsevier could let through <a href=\"https://www.vice.com/en/article/scientific-journal-frontiers-publishes-ai-generated-rat-with-gigantic-penis-in-worrying-incident/\">AI generated rats with giant penises</a> with all this protection in place; clearly, some papers have been published by them with no humans ever reading it. This generated a laugh from the audience, and an acknowlegment from the Elsevier rep that they needed to invest more and improve.</p>\n<h2><a href=\"https://anil.recoil.org/#how-to-make-open-infrastructure-sustainable\"></a>How to make open infrastructure sustainable</h2>\n<p>My laptop power ran out at this point, but the next panel was an absolute treat as it had both <a href=\"https://kaythaney.com/\">Kaitlin Thaney</a> and <a href=\"https://en.wikipedia.org/wiki/Jimmy_Wales\">Jimmy Wales</a> of Wikipedia fame on it!</p>\n<p>\n<img alt=\"Hylke Koers, Kaitlin Thaney, Jummy Wales and Ian Mulvany\" src=\"https://anil.recoil.org/images/rspub-12.webp\" title=\"Hylke Koers, Kaitlin Thaney, Jummy Wales and Ian Mulvany\">\nHylke Koers, Kaitlin Thaney, Jummy Wales and Ian Mulvany</p>\n<p>Jimmy Wales pointed out an interesting point from his "seven rules of trust" is that a key one is to be personal with human-to-human contact and not run too quickly to technological solutions. Rather than, for example, asking what percentage of academic papers showed evidence of language from ChatGPT, it's more fruitful to ask whether the science contained within the paper is good instead of how it's written. 
There are many reasons why someone might have used ChatGPT (non-native speakers etc) but also many unrelated reasons why the science might be bad.</p>\n<p>Kaitlin Thaney pointed out the importance of openness given <a href=\"https://www.motherjones.com/politics/2025/07/trump-war-assault-national-science-foundation-american-innovation-greatness-education/\">the US assault on\nscience</a>, which means that open data repositories can reasonably be replicated elsewhere as well.</p>\n<p>Ian Mulvany pointed out that Nature claims to have invested $240m in research\ninfrastructure, and this is a struggle for a medium-sized publisher (like his\nown <a href=\"https://www.bmj.com/\">BMJ</a>). Open infrastructure allows sharing and\ncreation of value, making it possible for these smaller organisations to\nsurvive.</p>\n<p>When it comes to policy recommendations, what did the panel have to say about a more trustworthy literature?</p>\n<ul>\n<li>The <a href=\"https://www.ccsd.cnrs.fr/en/posi-principles/\">POSI principles</a> came up as important levers.</li>\n<li>Kaitlin mentioned the <a href=\"https://www.nextgenlibpub.org/forest-framework\">FOREST framework</a> funded by Arcadia and how it needs to manifest in concrete infrastructure. There's an implicit reliance on infrastructure that you only notice when it's taken away! Affordability of open is a key consideration as well.</li>\n<li>Jimmy talked about open source software, where what generally works is not one-size-fits-all. Some projects are run by companies (the project is their main product and they sell services around it), and others by individuals. If we bring this back to policy, we need to look at preserving what's already working sustainably and supporting it. Don't try to find a general solution but adopt targeted, well-thought-through interventions instead.</li>\n</ul>\n<p><em>I'm updating this as I go along but running out of laptop battery too!</em></p>",
+18
avsm/notes_shedding-some-light-on-xenapp-on-xenserver-performance-tuning.json
···+"id": "https://anil.recoil.org/notes/shedding-some-light-on-xenapp-on-xenserver-performance-tuning",+"link": "https://anil.recoil.org/notes/shedding-some-light-on-xenapp-on-xenserver-performance-tuning",+"summary": "<p>You won\u2019t be surprised to hear that we spend a lot of time improving\n<a href=\"http://www.citrix.com/XenApp\">XenApp</a> performance when running on\n<a href=\"http://www.citrix.com/XenServer\">XenServer</a>. Although there are some\ngood benchmark comparisons available (such as the <a href=\"http://community.citrix.com/x/_4ENAg\">Tolly\nGroup</a> report), I still get a lot\nof customers asking about what the \u201csecret sauce\u201d is. I sat down with\nGeorge Dunlap, the lead XenServer performance engineer, to chat about the\nvery first optimisation we did back in XenServer 4.0 last year.</p>\n<p>Before we dive in, we first need to explain how a normal operating\nsystem handles memory. George explains:</p>\n<blockquote>\n<p>Modern desktop and server processors don\u2019t access memory directly\nusing its physical address. They use \u2018<a href=\"http://en.wikipedia.org/Virtual_Memory\">virtual\nmemory</a>\u2019 to separate the\naddresses that processes use to read and write memory from the actual\nmemory itself. This allows operating systems to hide from processes\nall the dirty details of how much memory there is, where in physical\nmemory the process needs to write to, and so on.</p>\n<p>However, the actual processor still needs to translate from a\n<a href=\"http://en.wikipedia.org/wiki/Virtual_address\">virtual address</a> to the\nphysical memory address in order to actually read and write any\nmemory. This translation is done with something called <a href=\"http://en.wikipedia.org/wiki/Page_tables\">page\ntables</a>.</p>\n<p>Page tables are used to implement virtual memory by mapping virtual\naddresses to physical addresses. 
The operating system constructs page\ntables using physical memory addresses, and then puts the physical\naddress of the \u201ctop-level\u201d page table into a hardware register called\nthe \u2018base pointer\u2019. Then the processor will read these page tables to\ntranslate virtual addresses to physical addresses as needed, before\nreading and writing to physical memory.</p>\n</blockquote>\n<p>Most modern processor types have some sort of paging mechanism, although\nXenServer is specifically tuned for\n<a href=\"http://en.wikipedia.org/wiki/X86-64\">x86-64</a> CPUs. An excellent book on\nthe general topic is <a href=\"http://en.wikipedia.org/wiki/Special:BookSources/0130313580\">Modern Operating\nSystems</a> by\n<a href=\"http://www.cs.vu.nl/~ast/\">Andrew Tanenbaum</a>. When XenServer creates\nWindows VMs, it takes advantage of the <a href=\"http://en.wikipedia.org/wiki/X86_virtualization\">virtualization\nextensions</a> in modern\nCPUs, which requires special memory handling in Xen. George explains\nthis further:</p>\n<blockquote>\n<p>When we create a virtual machine, we virtualize the memory as well;\nthat means that the guest operating system\u2019s idea of physical memory\ndoes not match up to real physical memory on the host. Traditionally,\nwhat the guest thinks of as physical memory is called \u201cphysical\nmemory\u201d, and what the hypervisor thinks of as physical memory is\ncalled \u201cmachine memory\u201d. Since this terminology is a bit confusing,\nXen tends to call what the guest thinks of as physical memory as\n\u201cguest physical\u201d memory, just to help make things more clear.</p>\n</blockquote>\n<blockquote>\n<p>This means that any fully-virtualized operating system, like Windows,\nwill create page tables using guest physical memory, and will point\nthe base pointer at the guest physical address of the top-level page\ntable. 
Unfortunately, the hardware still needs to map from virtual\nmemory addresses to machine addresses, not guest physical addresses.</p>\n</blockquote>\n<blockquote>\n<p>In order to allow this to happen, the hypervisor sets up <strong>shadow\npage tables</strong>. These page tables, generated by the hypervisor, are\ncopies of the guest page tables, but with the guest physical addresses\nconverted into machine physical addresses. The guest cannot access\nthem directly, and they don\u2019t reside in the guest\u2019s physical memory;\nthey\u2019re generated out of a pool of memory that the hypervisor\nallocates when a VM is created, called shadow page table memory.</p>\n</blockquote>\n<blockquote>\n<p>What this means is that whenever the guest operating system wants to\nmap some new memory, after it writes the data into the page table but\nbefore it can actually use it, the hypervisor needs to translate the\nchange to the guest page table into changes to the shadow page table.\nSo any workload that involves a lot of this will necessarily involve\nthe hypervisor a lot, which causes overhead.</p>\n</blockquote>\n<p>So shadow page tables are our mechanism of giving a guest an interface\nwhich is identical to real hardware (so it doesn\u2019t need to be modified),\nbut still intercepting changes before they reach the real hardware. You\ncan find more details from the <a href=\"http://www.xensource.com/files/summit_3/XenSummit_Shadow2.pdf\">XenSummit 2006\ntalk</a> or\nfrom the 2005 <a href=\"http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-migration-nsdi-pre.pdf\">NSDI\npaper</a>.\nSo how is this all relevant to XenApp performance? Back to George\u2026</p>\n<blockquote>\n<p>The hypervisor allocates a certain amount of memory for each VM to\nuse for shadow page tables; this is called <strong>shadow page table\nmemory</strong>. 
As new page tables are created and old ones aren\u2019t used\nanymore, the hypervisor cycles through this shadow page table memory.\nWhen it needs a new page and there isn\u2019t enough, it will \u2018unshadow\u2019\nthe guest page tables that haven\u2019t been used for the longest time to\nreclaim shadow memory, so that it can use more.</p>\n</blockquote>\n<blockquote>\n<p>We don\u2019t know ahead of time how much shadow memory a given workload\nwill use, but we can estimate based on the amount of memory that the\nVM has. We allocate enough shadow memory for each page to be mapped\nonce, more or less, then add an extra 50% to have some slack. For all\nthe workloads we\u2019ve tested, that\u2019s been enough \u2013 except XenApp.</p>\n</blockquote>\n<blockquote>\n<p>XenApp is the one workload we\u2019ve found that requires more shadow page\ntable memory than our standard default. Because XenApp generally\nstarts hundreds of copies of the same process, the same memory ends up\nmapped in hundreds of different processes. What happens when all of\nthose processes are active is that XenServer is continually\nunshadowing one process\u2019 page tables in order to shadow another\nprocess\u2019 page tables; only to have to re-shadow the original ones a\nsecond or two later when it runs again! This is called\n<a href=\"http://en.wikipedia.org/wiki/Thrash_(computer_science)\">thrashing</a>,\nwhich happens when there\u2019s not enough of a limited resource.</p>\n</blockquote>\n<p>Once the bottleneck was discovered, the solution was simple. In\nXenServer 4.1, we created a special XenServer application template\ncalled <em>\u201cCitrix XenApp\u201d</em>, which has an increased shadow multiplier that\nreserves more shadow memory for the guest when it starts. This is also a\ngood example of how templates hide the complexities of performance\ntuning from the user, while still permitting custom modifications if they\nare required. 
For example, on your XenServer host with a VM called\n\u201cXenApp\u201d, you could view the shadow multiplier by using the CLI:</p>\n<pre><code># xe vm-list name-label=XenApp params=HVM-shadow-multiplier\n HVM-shadow-multiplier ( RW) : 4.000\n</code></pre>\n<p>The same value is also available from XenCenter in the Optimization\npane, but of course do remember that the default value was chosen\nthrough extensive testing and doesn\u2019t need to be changed. Most of the\nother templates in XenServer also have carefully tuned settings (e.g.\nthe hardware platform flags) to ensure smooth running, or in the case of\nLinux templates, to support <a href=\"http://docs.xensource.com/XenServer/4.1.0/1.0/en_gb/sdk.html#id2553443\">para-virtual\ninstallation</a>.\nThis is why it\u2019s so important that you not use the <em>\u201cOther Install\nMedia\u201d</em> template in preference to a more specialised one!</p>\n<p>I mentioned at the beginning of this post that this was the first of\nmany XenApp optimisations. We\u2019ve just released the <a href=\"https://www.citrix.com/English/ss/downloads/details.asp?downloadId=1679827&productId=683148\">public\nbeta</a>\nof the latest XenServer (\u201cOrlando\u201d) which is even faster. The story of\nwhat those improvements are, and the tools which George and his team\nuse to analyze the inner workings of Xen, are a topic for a future\npost. For now, get downloading XenServer and start virtualizing your\nXenApp installations! Or if you\u2019re feeling inspired, go over to\n<a href=\"http://xen.org/\">xen.org</a>, check out the source, and get coding\u2026</p>",+"content": "<p>You won\u2019t be surprised to hear that we spend a lot of time improving\n<a href=\"http://www.citrix.com/XenApp\">XenApp</a> performance when running on\n<a href=\"http://www.citrix.com/XenServer\">XenServer</a>. 
Although there are some\ngood benchmark comparisons available (such as the <a href=\"http://community.citrix.com/x/_4ENAg\">Tolly\nGroup</a> report), I still get a lot\nof customers asking about what the \u201csecret sauce\u201d is. I sat down with\nGeorge Dunlap, the lead XenServer performance engineer, to chat about the\nvery first optimisation we did back in XenServer 4.0 last year.</p>\n<p>Before we dive in, we first need to explain how a normal operating\nsystem handles memory. George explains:</p>\n<blockquote>\n<p>Modern desktop and server processors don\u2019t access memory directly\nusing its physical address. They use \u2018<a href=\"http://en.wikipedia.org/Virtual_Memory\">virtual\nmemory</a>\u2019 to separate the\naddresses that processes use to read and write memory from the actual\nmemory itself. This allows operating systems to hide from processes\nall the dirty details of how much memory there is, where in physical\nmemory the process needs to write to, and so on.</p>\n<p>However, the actual processor still needs to translate from a\n<a href=\"http://en.wikipedia.org/wiki/Virtual_address\">virtual address</a> to the\nphysical memory address in order to actually read and write any\nmemory. This translation is done with something called <a href=\"http://en.wikipedia.org/wiki/Page_tables\">page\ntables</a>.</p>\n<p>Page tables are used to implement virtual memory by mapping virtual\naddresses to physical addresses. The operating system constructs page\ntables using physical memory addresses, and then puts the physical\naddress of the \u201ctop-level\u201d page table into a hardware register called\nthe \u2018base pointer\u2019. 
Then the processor will read these page tables to\ntranslate virtual addresses to physical addresses as needed, before\nreading and writing to physical memory.</p>\n</blockquote>\n<p>Most modern processor types have some sort of paging mechanism, although\nXenServer is specifically tuned for\n<a href=\"http://en.wikipedia.org/wiki/X86-64\">x86-64</a> CPUs. An excellent book on\nthe general topic is <a href=\"http://en.wikipedia.org/wiki/Special:BookSources/0130313580\">Modern Operating\nSystems</a> by\n<a href=\"http://www.cs.vu.nl/~ast/\">Andrew Tanenbaum</a>. When XenServer creates\nWindows VMs, it takes advantage of the <a href=\"http://en.wikipedia.org/wiki/X86_virtualization\">virtualization\nextensions</a> in modern\nCPUs, which requires special memory handling in Xen. George explains\nthis further:</p>\n<blockquote>\n<p>When we create a virtual machine, we virtualize the memory as well;\nthat means that the guest operating system\u2019s idea of physical memory\ndoes not match up to real physical memory on the host. Traditionally,\nwhat the guest thinks of as physical memory is called \u201cphysical\nmemory\u201d, and what the hypervisor thinks of as physical memory is\ncalled \u201cmachine memory\u201d. Since this terminology is a bit confusing,\nXen tends to call what the guest thinks of as physical memory as\n\u201cguest physical\u201d memory, just to help make things more clear.</p>\n</blockquote>\n<blockquote>\n<p>This means that any fully-virtualized operating system, like Windows,\nwill create page tables using guest physical memory, and will point\nthe base pointer at the guest physical address of the top-level page\ntable. Unfortunately, the hardware still needs to map from virtual\nmemory addresses to machine addresses, not guest physical addresses.</p>\n</blockquote>\n<blockquote>\n<p>In order to allow this to happen, the hypervisor sets up <strong>shadow\npage tables</strong>. 
These page tables, generated by the hypervisor, are\ncopies of the guest page tables, but with the guest physical addresses\nconverted into machine physical addresses. The guest cannot access\nthem directly, and they don\u2019t reside in the guest\u2019s physical memory;\nthey\u2019re generated out of a pool of memory that the hypervisor\nallocates when a VM is created, called shadow page table memory.</p>\n</blockquote>\n<blockquote>\n<p>What this means is that whenever the guest operating system wants to\nmap some new memory, after it writes the data into the page table but\nbefore it can actually use it, the hypervisor needs to translate the\nchange to the guest page table into changes to the shadow page table.\nSo any workload that involves a lot of this will necessarily involve\nthe hypervisor a lot, which causes overhead.</p>\n</blockquote>\n<p>So shadow page tables are our mechanism of giving a guest an interface\nwhich is identical to real hardware (so it doesn\u2019t need to be modified),\nbut still intercepting changes before they reach the real hardware. You\ncan find more details from the <a href=\"http://www.xensource.com/files/summit_3/XenSummit_Shadow2.pdf\">XenSummit 2006\ntalk</a> or\nfrom the 2005 <a href=\"http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-migration-nsdi-pre.pdf\">NSDI\npaper</a>.\nSo how is this all relevant to XenApp performance? Back to George\u2026</p>\n<blockquote>\n<p>The hypervisor allocates a certain amount of memory for each VM to\nuse for shadow page tables; this is called <strong>shadow page table\nmemory</strong>. 
As new page tables are created and old ones aren\u2019t used\nanymore, the hypervisor cycles through this shadow page table memory.\nWhen it needs a new page and there isn\u2019t enough, it will \u2018unshadow\u2019\nthe guest page tables that haven\u2019t been used for the longest time to\nreclaim shadow memory, so that it can use more.</p>\n</blockquote>\n<blockquote>\n<p>We don\u2019t know ahead of time how much shadow memory a given workload\nwill use, but we can estimate based on the amount of memory that the\nVM has. We allocate enough shadow memory for each page to be mapped\nonce, more or less, then add an extra 50% to have some slack. For all\nthe workloads we\u2019ve tested, that\u2019s been enough \u2013 except XenApp.</p>\n</blockquote>\n<blockquote>\n<p>XenApp is the one workload we\u2019ve found that requires more shadow page\ntable memory than our standard default. Because XenApp generally\nstarts hundreds of copies of the same process, the same memory ends up\nmapped in hundreds of different processes. What happens when all of\nthose processes are active is that XenServer is continually\nunshadowing one process\u2019 page tables in order to shadow another\nprocess\u2019 page tables; only to have to re-shadow the original ones a\nsecond or two later when it runs again! This is called\n<a href=\"http://en.wikipedia.org/wiki/Thrash_(computer_science)\">thrashing</a>,\nwhich happens when there\u2019s not enough of a limited resource.</p>\n</blockquote>\n<p>Once the bottleneck was discovered, the solution was simple. In\nXenServer 4.1, we created a special XenServer application template\ncalled <em>\u201cCitrix XenApp\u201d</em>, which has an increased shadow multiplier that\nreserves more shadow memory for the guest when it starts. This is also a\ngood example of how templates hide the complexities of performance\ntuning from the user, while still permitting custom modifications if they\nare required. 
For example, on your XenServer host with a VM called\n\u201cXenApp\u201d, you could view the shadow multiplier by using the CLI:</p>\n<pre><code># xe vm-list name-label=XenApp params=HVM-shadow-multiplier\n HVM-shadow-multiplier ( RW) : 4.000\n</code></pre>\n<p>The same value is also available from XenCenter in the Optimization\npane, but of course do remember that the default value was chosen\nthrough extensive testing and doesn\u2019t need to be changed. Most of the\nother templates in XenServer also have carefully tuned settings (e.g.\nthe hardware platform flags) to ensure smooth running, or in the case of\nLinux templates, to support <a href=\"http://docs.xensource.com/XenServer/4.1.0/1.0/en_gb/sdk.html#id2553443\">para-virtual\ninstallation</a>.\nThis is why it\u2019s so important that you not use the <em>\u201cOther Install\nMedia\u201d</em> template in preference to a more specialised one!</p>\n<p>I mentioned at the beginning of this post that this was the first of\nmany XenApp optimisations. We\u2019ve just released the <a href=\"https://www.citrix.com/English/ss/downloads/details.asp?downloadId=1679827&productId=683148\">public\nbeta</a>\nof the latest XenServer (\u201cOrlando\u201d) which is even faster. The story of\nwhat those improvements are, and the tools which George and his team\nuse to analyze the inner workings of Xen, are a topic for a future\npost. For now, get downloading XenServer and start virtualizing your\nXenApp installations! Or if you\u2019re feeling inspired, go over to\n<a href=\"http://xen.org/\">xen.org</a>, check out the source, and get coding\u2026</p>",
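George's sizing rule in the post (enough shadow memory for each guest page to be mapped once, plus 50% slack, scaled up by the template's shadow multiplier) can be sketched as a back-of-the-envelope calculation. This is a hypothetical illustration of the arithmetic only, not Xen's actual allocator; the 4 KiB page size and 8-byte page-table-entry size are assumptions for x86-64.

```python
# Hypothetical back-of-the-envelope sizing for shadow page table memory,
# following the rule quoted in the post: enough for each guest page to be
# mapped once, plus 50% slack, scaled by the VM's shadow multiplier.
# (Illustrative only -- not Xen's real allocator.)

PAGE_SIZE = 4096   # assumed 4 KiB pages on x86-64
PTE_SIZE = 8       # assumed 8-byte page table entries

def shadow_memory_mib(vm_memory_mib: int, shadow_multiplier: float = 1.0) -> float:
    """Estimate shadow page table memory (MiB) for a VM of the given size."""
    guest_pages = vm_memory_mib * 1024 * 1024 // PAGE_SIZE
    one_mapping_each = guest_pages * PTE_SIZE   # each guest page mapped once
    with_slack = one_mapping_each * 1.5         # add an extra 50% slack
    return with_slack * shadow_multiplier / (1024 * 1024)

# A 4 GiB VM with the default multiplier, vs the "Citrix XenApp"
# template's multiplier of 4.0 (the value the CLI example reports):
default = shadow_memory_mib(4096)       # 12.0 MiB
xenapp = shadow_memory_mib(4096, 4.0)   # 48.0 MiB
```

The interesting point this makes concrete is that the fix was not clever code but simply a 4x larger pool, enough to keep hundreds of XenApp processes' page tables shadowed simultaneously instead of thrashing.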
+18
avsm/notes_signals-and-threads.json
···+"summary": "<p>I am the latest person to feature on the first season of the <a href=\"https://signalsandthreads.com/what-is-an-operating-system/\">Signals and\nThreads</a> podcast\nhosted by <a href=\"https://github.com/yminsky\">Yaron Minsky</a> (you may recognise him as my co-author on <a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml</a>).</p>\n<blockquote>\n<p>Anil Madhavapeddy is an academic, author, engineer, entrepreneur, and OCaml aficionado. In this episode, Anil and Ron consider the evolving role of operating systems, security on the internet, and the pending arrival (at last!) of OCaml 5.0. They also discuss using Raspberry Pis to fight climate change; the programming inspiration found in British pubs and on Moroccan beaches; and the time Anil went to a party, got drunk, and woke up with a job working on the Mars Polar Lander.\n-- <a href=\"https://signalsandthreads.com/what-is-an-operating-system/\">Signals and Threads</a></p>\n</blockquote>\n<p>I think I might be the first non-Jane Street person to be on their podcast! Quite the honour.</p>",+"content": "<p>I am the latest person to feature on the first season of the <a href=\"https://signalsandthreads.com/what-is-an-operating-system/\">Signals and\nThreads</a> podcast\nhosted by <a href=\"https://github.com/yminsky\">Yaron Minsky</a> (you may recognise him as my co-author on <a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml</a>).</p>\n<blockquote>\n<p>Anil Madhavapeddy is an academic, author, engineer, entrepreneur, and OCaml aficionado. In this episode, Anil and Ron consider the evolving role of operating systems, security on the internet, and the pending arrival (at last!) of OCaml 5.0. 
They also discuss using Raspberry Pis to fight climate change; the programming inspiration found in British pubs and on Moroccan beaches; and the time Anil went to a party, got drunk, and woke up with a job working on the Mars Polar Lander.\n-- <a href=\"https://signalsandthreads.com/what-is-an-operating-system/\">Signals and Threads</a></p>\n</blockquote>\n<p>I think I might be the first non-Jane Street person to be on their podcast! Quite the honour.</p>",
+18
avsm/notes_socc-pc.json
···+"summary": "<p>After some time away from cloud computing (due to my new focus on <a href=\"https://anil.recoil.org/projects/life\">conservation research</a>), I served on the <a href=\"https://acmsocc.org/2024/\">ACM SOCC 2024</a> program committee. It was quite interesting seeing the massive shift away from "traditional" cloud research (such as consensus protocols) towards many submissions aimed at accelerating machine learning workloads.</p>\n<p>I also had a paper accepted there on <a href=\"https://anil.recoil.org/papers/2024-socc-murmuration\">decentralised scheduling</a>, thanks to my former PhD student <a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a> and her hard work on Murmuration!</p>",+"content": "<p>After some time away from cloud computing (due to my new focus on <a href=\"https://anil.recoil.org/projects/life\">conservation research</a>), I served on the <a href=\"https://acmsocc.org/2024/\">ACM SOCC 2024</a> program committee. It was quite interesting seeing the massive shift away from "traditional" cloud research (such as consensus protocols) towards many submissions aimed at accelerating machine learning workloads.</p>\n<p>I also had a paper accepted there on <a href=\"https://anil.recoil.org/papers/2024-socc-murmuration\">decentralised scheduling</a>, thanks to my former PhD student <a href=\"https://www.cl.cam.ac.uk/~sv440/\">Smita Vijayakumar</a> and her hard work on Murmuration!</p>",
+18
avsm/notes_spotcodes-nytimes.json
···+"summary": "<p>In what is definitely our most exciting media coverage yet, <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">Spotcodes</a> are featured in the New York Times!</p>\n<blockquote>\n<p>When you think of a public information kiosk, your mental picture might include greasy touch screens, broken trackballs and frozen monitors.\nBut researchers at an Intel-financed lab at Cambridge University have developed a way to replace displays like those with something portable, not to mention personal: a cellphone's built-in camera and screen. They and others plan to use commercially available hardware to turn the camera-equipped cellphone into a mouse, remote control, keyboard and more.\n-- <a href=\"https://www.nytimes.com/2004/10/07/technology/circuits/connecting-paper-and-online-worlds-by-cellphone-camera.html\">New York Times</a></p>\n</blockquote>\n<p><a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a> got cited as I wasn't in the department that day when the journalist showed up at Intel Research!</p>\n<blockquote>\n<p>"Instead of having all the hassle of putting things out in the environment that you have to maintain and that people can vandalize, you get a cheap PC, shove it in the back room of your shop and just put posters out front," said Richard Sharp, an Intel researcher here.</p>\n</blockquote>",+"content": "<p>In what is definitely our most exciting media coverage yet, <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">Spotcodes</a> are featured in the New York Times!</p>\n<blockquote>\n<p>When you think of a public information kiosk, your mental picture might include greasy touch screens, broken trackballs and frozen monitors.\nBut researchers at an Intel-financed lab at Cambridge University have developed a way to replace displays like those with something portable, not to mention personal: a cellphone's built-in camera and screen. 
They and others plan to use commercially available hardware to turn the camera-equipped cellphone into a mouse, remote control, keyboard and more.\n-- <a href=\"https://www.nytimes.com/2004/10/07/technology/circuits/connecting-paper-and-online-worlds-by-cellphone-camera.html\">New York Times</a></p>\n</blockquote>\n<p><a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a> got cited as I wasn't in the department that day when the journalist showed up at Intel Research!</p>\n<blockquote>\n<p>"Instead of having all the hassle of putting things out in the environment that you have to maintain and that people can vandalize, you get a cheap PC, shove it in the back room of your shop and just put posters out front," said Richard Sharp, an Intel researcher here.</p>\n</blockquote>",
+18
avsm/notes_srg-fp.json
···+"summary": "<p>We've been doing loads of OCaml programming in the Systems Research Group, and this blog post lays out some of the things going on. It ranges from OCaml hacking to the CIEL distributed execution engine, and even some ongoing Haskell hacking for distributed execution.</p>",+"content": "<p>We've been doing loads of OCaml programming in the Systems Research Group, and this blog post lays out some of the things going on. It ranges from OCaml hacking to the CIEL distributed execution engine, and even some ongoing Haskell hacking for distributed execution.</p>",
+18
avsm/notes_starting-phd.json
···+"summary": "<p>I started my PhD at the Systems Research Group in Cambridge this week, based in\nthe <a href=\"https://www.cl.cam.ac.uk\">Computer Laboratory</a> and <a href=\"https://robinson.cam.ac.uk\">Robinson\nCollege</a>. I'll still be working part-time at\n<a href=\"https://netapp.com\">NetApp</a>, but my primary focus will be on the <a href=\"https://anil.recoil.org/projects/xen\">Xen</a>\nhypervisor and other systems research topics.</p>",+"content": "<p>I started my PhD at the Systems Research Group in Cambridge this week, based in\nthe <a href=\"https://www.cl.cam.ac.uk\">Computer Laboratory</a> and <a href=\"https://robinson.cam.ac.uk\">Robinson\nCollege</a>. I'll still be working part-time at\n<a href=\"https://netapp.com\">NetApp</a>, but my primary focus will be on the <a href=\"https://anil.recoil.org/projects/xen\">Xen</a>\nhypervisor and other systems research topics.</p>",
+18
avsm/notes_student-ideas.json
···+"summary": "<p>I've refreshed the set of project <a href=\"https://anil.recoil.org/ideas\">ideas</a> for incoming <a href=\"https://www.cst.cam.ac.uk/teaching/part-ii\">CST Part II</a> and <a href=\"https://www.cst.cam.ac.uk/teaching/masters\">MPhil</a> and PhD student projects for 2024-2025.</p>\n<p>These are not an exhaustive list, but intended to kickstart conversations for things we could work on together. Do get in touch if you're an incoming student and see something that grabs your interest.</p>",+"content": "<p>I've refreshed the set of project <a href=\"https://anil.recoil.org/ideas\">ideas</a> for incoming <a href=\"https://www.cst.cam.ac.uk/teaching/part-ii\">CST Part II</a> and <a href=\"https://www.cst.cam.ac.uk/teaching/masters\">MPhil</a> and PhD student projects for 2024-2025.</p>\n<p>These are not an exhaustive list, but intended to kickstart conversations for things we could work on together. Do get in touch if you're an incoming student and see something that grabs your interest.</p>",
+18
avsm/notes_swseng.json
···+"summary": "<p>We're still reeling from the shocking and unexpected <a href=\"https://www.cst.cam.ac.uk/news/ross-anderson\">passing of Ross Anderson</a>\nlast week. You can read a lovely <a href=\"https://raintown.org/ross_anderson/\">tribute</a> to him by <a href=\"https://raintown.org\">Satnam Singh</a>, and I'm still getting my thoughts together on all the guidance, advice and prods in the right direction that Ross has given me over the years.</p>\n<p>In pragmatic news, I'll be emergency lecturing part of Ross' 1A <a href=\"https://www.cl.cam.ac.uk/teaching/2324/SWSecEng/\">Software and Security Engineering</a> course here at Cambridge, along with my colleagues Martin, Alastair, Mort and Rob who have all stepped up at short notice. I have absolutely no idea how we'll live up to Ross' standard, but we'll do our very best in his memory!</p>",+"content": "<p>We're still reeling from the shocking and unexpected <a href=\"https://www.cst.cam.ac.uk/news/ross-anderson\">passing of Ross Anderson</a>\nlast week. You can read a lovely <a href=\"https://raintown.org/ross_anderson/\">tribute</a> to him by <a href=\"https://raintown.org\">Satnam Singh</a>, and I'm still getting my thoughts together on all the guidance, advice and prods in the right direction that Ross has given me over the years.</p>\n<p>In pragmatic news, I'll be emergency lecturing part of Ross' 1A <a href=\"https://www.cl.cam.ac.uk/teaching/2324/SWSecEng/\">Software and Security Engineering</a> course here at Cambridge, along with my colleagues Martin, Alastair, Mort and Rob who have all stepped up at short notice. I have absolutely no idea how we'll live up to Ross' standard, but we'll do our very best in his memory!</p>",
+18
avsm/notes_syncoid-sanoid-zfs.json
···+"summary": "<p>Over in my <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a> group, we have a <em>lot</em> of primary and secondary datasets lying around: 100s of terabytes of <a href=\"https://anil.recoil.org/projects/rsn\">satellite imagery</a>, <a href=\"https://anil.recoil.org/projects/life\">biodiversity data</a>, <a href=\"https://anil.recoil.org/projects/ce\">academic literature</a>, and the intermediate computations that go along with them. Our trusty central shared storage server running <a href=\"https://www.truenas.com\">TrueNAS</a> stores data in <a href=\"https://en.wikipedia.org/wiki/ZFS\">ZFS</a> and serves it over <a href=\"https://en.wikipedia.org/wiki/Network_File_System\">NFSv4</a> to a bunch of hosts. This is rapidly becoming a bottleneck as our group and datasets grow, and <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> has been steadily adding <a href=\"https://www.tunbury.org/kingston-drives/\">lots more raw capacity</a>. The question now is how to configure this raw SSD capacity into a more nimble storage setup. If anyone's seen any systems similar to the one sketched out below, I'd love to hear from you.</p>\n<h2><a href=\"https://anil.recoil.org/#why-get-rid-of-nfs\"></a>Why get rid of NFS?</h2>\n<p>The first design constraint is to get rid of centralised network storage. This is both slow when compared to modern NVMe drives, and also hard to extend beyond the <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs\">POSIX-ish</a> API to take advantage of filesystem-specific features like snapshots or <a href=\"https://docs.rs/reflink/latest/reflink/\">reflink clones</a>. We also don't take much advantage of the simultaneous use of the network storage. Instead, we'd like to just make every host materialise the portion of storage it needs locally by cloning it from the remote server.</p>\n<p>The alternative I'm considering here is to use ZFS filesystems on the nodes themselves rather than NFS. 
This has the upside of having the cloned data be directly available on the local disk of the host that's using it, meaning that there's no performance impact as with networked storage. ZFS also scales to fairly enormous sizes, and so it seems likely that we won't run into an upper bound due to this choice of filesystem in the medium term.</p>\n<p>ZFS operates through the creation of a <a href=\"https://wiki.ubuntu.com/ZFS/ZPool\">zpool</a> across a block of disks, over which <a href=\"https://blog.victormendonca.com/2020/11/03/zfs-for-dummies/\">datasets</a> can be created in a tree. One of our typical research servers looks like this:</p>\n<pre><code>$ zfs list\nNAME USED AVAIL REFER MOUNTPOINT\neeg 20.4T 7.37T 7.84T /eeg\neeg/gbif 8.55T 7.37T 8.55T /eeg/gbif\neeg/logs/fetcher 11.3G 7.37T 8.41G /eeg/logs/fetcher\neeg/logs/zotero 12.2G 7.37T 8.29G /eeg/logs/zotero\neeg/papers/doi 3.11T 7.37T 3.11T /eeg/papers/doi\neeg/papers/pmc 843G 7.37T 843G /eeg/papers/pmc\neeg/papers/tei 87.4G 7.37T 85.8G /eeg/papers/tei\neeg/repology 5.92G 7.37T 5.92G /eeg/repolo\n</code></pre>\n<p>Inside the <code>eeg</code> zpool, each of the sub-datasets can themselves be arranged in a hierarchy. Each of them can also have key-value labels and separate properties attached to them, and inherit their parent dataset's properties. There are a <a href=\"https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops.7.html\">vast number of ZFS properties</a> that can be tuned.</p>\n<h2><a href=\"https://anil.recoil.org/#snapshots-and-replication-with-sanoid\"></a>Snapshots and replication with Sanoid</h2>\n<p>Once a single host has some important data in a local ZFS dataset, I've started using <a href=\"https://github.com/jimsalterjrs/sanoid\">Sanoid</a> for the snapshot management:</p>\n<blockquote>\n<p>Sanoid is a policy-driven snapshot management tool for ZFS filesystems [...] 
you can use it to make your systems functionally immortal via automated snapshot management and over-the-air replication.</p>\n</blockquote>\n<p>The first use of Sanoid is to regularly take ZFS snapshots of important filesystems. These snapshots will be rotated regularly at different intervals, and subsequently replicated off-host. Here's an example <code>/etc/sanoid/sanoid.conf</code> from the machine above.</p>\n<pre><code>[eeg/papers/doi]\n use_template = production\n[eeg/papers/pmc]\n use_template = production\n[eeg/papers/tei]\n use_template = production\n\n[template_production]\n frequently = 0\n hourly = 24\n daily = 30\n monthly = 3\n yearly = 0\n autosnap = yes\n autoprune = yes\n</code></pre>\n<p>This <code>sanoid.conf</code> establishes a "production" template that keeps 24 hourly snapshots, 30 daily ones and 3 monthly ones. After some time passes, I can verify this by checking the local filesystem snapshots.</p>\n<pre><code>$ zfs list -t snapshot\nNAME USED AVAIL REFER MOUNTPOINT\neeg/p/doi@2024122101 134M - 1.19T -\neeg/p/doi@2024122102 45.3M - 1.19T -\neeg/p/doi@autosnap_2025-02-01_00:00:02_monthly 224M - 2.84T -\neeg/p/doi@autosnap_2025-03-01_00:00:02_monthly 173M - 3.11T -\neeg/p/doi@autosnap_2025-03-08_00:00:03_daily 173M - 3.11T -\neeg/p/doi@autosnap_2025-03-09_00:00:01_daily 0B - 3.11T -\neeg/p/doi@autosnap_2025-03-10_00:00:02_daily 0B - 3.11T -\neeg/p/doi@autosnap_2025-03-11_00:00:02_daily 0B - 3.11T -\n<...etc>\n</code></pre>\n<p>These snapshots are incremental, and each subsequent one uses only the differential space taken up by the earlier ones. 
See this <a href=\"https://zedfs.com/all-you-have-to-know-about-reading-zfs-disk-usage/\">handy guide</a> for the meaning of the different space accounting terms above.</p>\n<p>Once happy with the production template, I then automate it within cron:</p>\n<pre><code>* * * * * TZ=Europe/London /usr/bin/sanoid --cron\n</code></pre>\n<p>This currently runs every minute (a little wasteful), in order to quickly check if any snapshots are required. Once happy with the hourly cadence working as expected, I'll drop this back to an <code>@hourly</code> job.</p>\n<h2><a href=\"https://anil.recoil.org/#replicating-the-zfs-snapshots\"></a>Replicating the ZFS snapshots</h2>\n<p>Once Sanoid is merrily making snapshots on the active host, it's also necessary to replicate this off the host for robustness. Since we're not using a networked store, I'd like to replicate the snapshots onto two other hosts, one of which is offsite.\nCrucially, these backup hosts can have their <em>own</em> sanoid configuration with a longer-term horizon of backups (e.g. keeping some yearly snapshots). To make this work, we first use a sister tool <a href=\"https://github.com/jimsalterjrs/sanoid?tab=readme-ov-file#syncoid\"><code>syncoid</code></a> that is included in the Sanoid distribution.</p>\n<p>I run <code>syncoid</code> in 'pull mode', which means that the backup server is configured to be able to SSH into the production server(s) in order to fetch the datasets. Under the hood, syncoid uses a combination of ZFS <a href=\"https://forums.truenas.com/t/zfs-bookmarks-and-why-you-dont-use-them-but-should/5578\">bookmarks</a> and <a href=\"https://xai.sh/2018/08/27/zfs-incremental-backups.html\">send/recv</a> to incrementally and efficiently transmit the snapshots over the network and reconstruct the filesystems locally.</p>\n<p>Once the SSH host keys are configured in the usual way, a series of crontab entries like this is sufficient to fetch all the remote snapshots to the local host. 
The backup host that's doing the pulling just needs to run <code>syncoid</code> regularly:</p>\n<pre><code>@daily /usr/sbin/syncoid backup@marpe:eeg/papers/doi eeg/papers/doi\n@daily /usr/sbin/syncoid backup@marpe:eeg/papers/tei eeg/papers/tei\n</code></pre>\n<p>At this point, the backup host now has all the snapshots from the live host (including hourly ones), and can then run Sanoid again in order to decide which ones it wants to keep locally. I haven't put too much effort into optimising these yet, but you can see they're different from the ones above.</p>\n<pre><code>[eeg/papers]\n use_template = backup\n recursive = yes\n\n[template_backup]\n autoprune = yes\n frequently = 0\n hourly = 30\n daily = 90\n monthly = 12\n yearly = 0\n autosnap = no\n</code></pre>\n<p>These will keep a few more hourly snapshots, and three times the number of daily snapshots available on the backup servers, in case a rollback is needed. Since the backup server typically has a lot more raw capacity than the live server, it's practical to do this there rather than on the production hosts.</p>\n<p>Finally, we can also hook this up to our monitoring scripts with a handy <a href=\"https://www.nagios.org/\">Nagios</a>-compatible interface.</p>\n<pre><code># sanoid --monitor-health\nOK ZPOOL eeg : ONLINE {Size:30.6T Free:2.44T Cap:92%}\n</code></pre>\n<h2><a href=\"https://anil.recoil.org/#should-we-use-zfs-root-volumes\"></a>Should we use ZFS root volumes?</h2>\n<p>It's a bit trickier to figure out if the root volume of the hosts should be ZFS. This requires that the boot initrd always has a working ZFS kernel module, which sometimes goes wrong on updates if the DKMS shim falters for some reason. 
In terms of specific distributions:</p>\n<ul>\n<li>Ubuntu in theory supports ZFS with all its kernels, but <a href=\"https://discourse.ubuntu.com/t/future-of-zfs-on-ubuntu-desktop/33001/19?u=d0od\">the long term future</a> of ZFS root on Ubuntu is in <a href=\"https://www.omgubuntu.co.uk/2023/01/ubuntu-zfs-support-status\">question</a>. <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> has got <a href=\"https://www.tunbury.org/ubuntu-with-zfs-root/\">detailed instructions</a> on how to automate this with an <a href=\"https://gist.github.com/mtelvers/2cbeb5e35f43f5e461aa0c14c4a0a6b8\">Ansible playbook</a>.</li>\n<li>Debian only packages <a href=\"https://wiki.debian.org/ZFS\">ZFS via DKMS</a> due to the CDDL <a href=\"https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/\">licensing concerns</a>.</li>\n<li>Alpine also has good <a href=\"https://wiki.alpinelinux.org/wiki/ZFS\">ZFS support</a>, including <a href=\"https://wiki.alpinelinux.org/wiki/Root_on_ZFS_with_native_encryption\">encrypted root</a>.</li>\n</ul>\n<p>For my own personal servers, I've been using a normal ext4 root volume, and creating a ZFS pool for the remainder of the disk, without using LVM underneath it. It's a bit less flexible, but strikes a reasonable balance between simplicity and performance.</p>\n<h2><a href=\"https://anil.recoil.org/#next-steps\"></a>Next steps</h2>\n<p>This basic setup is sufficient to have pull-based ZFS snapshot and replication across multiple hosts, and makes it easy to quickly materialise a ZFS dataset onto a given host for use in data processing. It still needs a bunch of development to turn this into a properly robust system though:</p>\n<ul>\n<li><strong>Dynamic dataset construction:</strong> One downside is that you must create the right ZFS dataset structure ahead of time, since you can't send/receive arbitrary filesystem subtrees. 
I'm not sure if it's easy to <code>zfs create</code> into an existing subdirectory of a dataset and have it copy the files within there semi-automatically into the new sub-dataset.</li>\n<li><strong>Backing up into encrypted volumes:</strong> One of the coolest features of ZFS is that it can maintain <em>unmounted</em> and <em>encrypted at rest</em> datasets. It's therefore possible to have unencrypted data on the production servers (so no performance hit), with more secure-at-rest encryption on the backup servers. However, <a href=\"https://mtlynch.io/zfs-encrypted-backups/\">it requires some messing around</a> to figure out the right runes.</li>\n<li><strong>Discovery of datasets in a cluster:</strong> We also need a way of knowing which datasets have been backed up to which hosts. This way, if a host in a cluster needs a particular dataset, it can request it from the other host. Given we probably have 1000s of datasets (as opposed to potentially millions of snapshots), this doesn't seem like too difficult a problem. We may even be able to use a <a href=\"https://irmin.org\">Irmin</a> database or a DNS-based broadcast mechanism to do this easily within a cluster.</li>\n<li><strong>Switching from ZFS to XFS locally:</strong> While ZFS seems like the ideal replication filesystem, it still lacks some of the cooler local features like <a href=\"https://github.com/openzfs/zfs/issues/405#issuecomment-1880208374\">XFS reflinks</a>. It would be nice to find an efficient way to materialise an XFS filesystem from a ZFS base, but without copying absolutely everything. This is either impossibly difficult or really easy via some cunning use of <a href=\"https://en.wikipedia.org/wiki/OverlayFS\">overlayfs</a>. Probably impossible though, given how much block-level information is needed to do deduplication.</li>\n<li><strong>ZFS labels for policy:</strong> Most ZFS tools use custom key/value labels on datasets to implement policies. 
For example, a <code>syncoid:sync</code> label can be used to tell syncoid to include a particular recursive dataset in its replication. There are some scalability limits in just how many labels you can add before slowing a machine down to a crawl (though not as bad as how many live mounts). <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> started some WIP <a href=\"https://github.com/quantifyearth/ocaml-zfs\">ocaml-zfs</a> bindings to <a href=\"https://github.com/openzfs/zfs/blob/master/include/libzfs.h\">libzfs</a> to help explore this question.</li>\n</ul>\n<p>So lots of work left to do here, but quite good fun as systems hacking goes! When <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> is <a href=\"https://www.tunbury.org/kingston-drives/\">done installing</a> our new drives, we'll have a few petabytes of raw capacity to implement this system over...</p>",+"content": "<p>Over in my <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a> group, we have a <em>lot</em> of primary and secondary datasets lying around: 100s of terabytes of <a href=\"https://anil.recoil.org/projects/rsn\">satellite imagery</a>, <a href=\"https://anil.recoil.org/projects/life\">biodiversity data</a>, <a href=\"https://anil.recoil.org/projects/ce\">academic literature</a>, and the intermediate computations that go along with them. Our trusty central shared storage server running <a href=\"https://www.truenas.com\">TrueNAS</a> stores data in <a href=\"https://en.wikipedia.org/wiki/ZFS\">ZFS</a> and serves it over <a href=\"https://en.wikipedia.org/wiki/Network_File_System\">NFSv4</a> to a bunch of hosts. This is rapidly becoming a bottleneck as our group and datasets grow, and <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> has been steadily adding <a href=\"https://www.tunbury.org/kingston-drives/\">lots more raw capacity</a>. The question now is how to configure this raw SSD capacity into a more nimble storage setup. 
If anyone's seen any systems similar to the one sketched out below, I'd love to hear from you.</p>\n<h2><a href=\"https://anil.recoil.org/#why-get-rid-of-nfs\"></a>Why get rid of NFS?</h2>\n<p>The first design constraint is to get rid of centralised network storage. This is both slow when compared to modern NVMe drives, and also hard to extend beyond the <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs\">POSIX-ish</a> API to take advantage of filesystem-specific features like snapshots or <a href=\"https://docs.rs/reflink/latest/reflink/\">reflink clones</a>. We also don't take much advantage of the simultaneous use of the network storage. Instead, we'd like to just make every host materialise the portion of storage it needs locally by cloning it from the remote server.</p>\n<p>The alternative I'm considering here is to use ZFS filesystems on the nodes themselves rather than NFS. This has the upside of having the cloned data be directly available on the local disk of the host that's using it, meaning that there's no performance impact as with networked storage. ZFS also scales to fairly enormous sizes, and so it seems likely that we won't run into an upper bound due to this choice of filesystem in the medium term.</p>\n<p>ZFS operates through the creation of a <a href=\"https://wiki.ubuntu.com/ZFS/ZPool\">zpool</a> across a block of disks, over which <a href=\"https://blog.victormendonca.com/2020/11/03/zfs-for-dummies/\">datasets</a> can be created in a tree. 
One of our typical research servers looks like this:</p>\n<pre><code>$ zfs list\nNAME USED AVAIL REFER MOUNTPOINT\neeg 20.4T 7.37T 7.84T /eeg\neeg/gbif 8.55T 7.37T 8.55T /eeg/gbif\neeg/logs/fetcher 11.3G 7.37T 8.41G /eeg/logs/fetcher\neeg/logs/zotero 12.2G 7.37T 8.29G /eeg/logs/zotero\neeg/papers/doi 3.11T 7.37T 3.11T /eeg/papers/doi\neeg/papers/pmc 843G 7.37T 843G /eeg/papers/pmc\neeg/papers/tei 87.4G 7.37T 85.8G /eeg/papers/tei\neeg/repology 5.92G 7.37T 5.92G /eeg/repolo\n</code></pre>\n<p>Inside the <code>eeg</code> zpool, each of the sub-datasets can themselves be arranged in a hierarchy. Each of them can also have key-value labels and separate properties attached to them, and inherit their parent dataset's properties. There are a <a href=\"https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops.7.html\">vast number of ZFS properties</a> that can be tuned.</p>\n<h2><a href=\"https://anil.recoil.org/#snapshots-and-replication-with-sanoid\"></a>Snapshots and replication with Sanoid</h2>\n<p>Once a single host has some important data in a local ZFS dataset, I've started using <a href=\"https://github.com/jimsalterjrs/sanoid\">Sanoid</a> for the snapshot management:</p>\n<blockquote>\n<p>Sanoid is a policy-driven snapshot management tool for ZFS filesystems [...] you can use it to make your systems functionally immortal via automated snapshot management and over-the-air replication.</p>\n</blockquote>\n<p>The first use of Sanoid is to regularly take ZFS snapshots of important filesystems. These snapshots will be rotated regularly at different intervals, and subsequently replicated off-host. 
Here's an example <code>/etc/sanoid/sanoid.conf</code> from the machine above.</p>\n<pre><code>[eeg/papers/doi]\n use_template = production\n[eeg/papers/pmc]\n use_template = production\n[eeg/papers/tei]\n use_template = production\n\n[template_production]\n frequently = 0\n hourly = 24\n daily = 30\n monthly = 3\n yearly = 0\n autosnap = yes\n autoprune = yes\n</code></pre>\n<p>This <code>sanoid.conf</code> establishes a "production" template that keeps 24 hourly snapshots, 30 daily ones and 3 monthly ones. After some time passes, I can verify this by checking the local filesystem snapshots.</p>\n<pre><code>$ zfs list -t snapshot\nNAME USED AVAIL REFER MOUNTPOINT\neeg/p/doi@2024122101 134M - 1.19T -\neeg/p/doi@2024122102 45.3M - 1.19T -\neeg/p/doi@autosnap_2025-02-01_00:00:02_monthly 224M - 2.84T -\neeg/p/doi@autosnap_2025-03-01_00:00:02_monthly 173M - 3.11T -\neeg/p/doi@autosnap_2025-03-08_00:00:03_daily 173M - 3.11T -\neeg/p/doi@autosnap_2025-03-09_00:00:01_daily 0B - 3.11T -\neeg/p/doi@autosnap_2025-03-10_00:00:02_daily 0B - 3.11T -\neeg/p/doi@autosnap_2025-03-11_00:00:02_daily 0B - 3.11T -\n<...etc>\n</code></pre>\n<p>These snapshots are incremental, and each subsequent one uses only the differential space taken up by the earlier ones. See this <a href=\"https://zedfs.com/all-you-have-to-know-about-reading-zfs-disk-usage/\">handy guide</a> for the meaning of the different space accounting terms above.</p>\n<p>Once happy with the production template, I then automate it within cron:</p>\n<pre><code>* * * * * TZ=Europe/London /usr/bin/sanoid --cron\n</code></pre>\n<p>This currently runs every minute (a little wasteful), in order to quickly check if any snapshots are required. 
Once happy with the hourly cadence working as expected, I'll drop this back to an <code>@hourly</code> job.</p>\n<h2><a href=\"https://anil.recoil.org/#replicating-the-zfs-snapshots\"></a>Replicating the ZFS snapshots</h2>\n<p>Once Sanoid is merrily making snapshots on the active host, it's also necessary to replicate this off the host for robustness. Since we're not using a networked store, I'd like to replicate the snapshots onto two other hosts, one of which is offsite.\nCrucially, these backup hosts can have their <em>own</em> sanoid configuration with a longer-term horizon of backups (e.g. keeping some yearly snapshots). To make this work, we first use a sister tool <a href=\"https://github.com/jimsalterjrs/sanoid?tab=readme-ov-file#syncoid\"><code>syncoid</code></a> that is included in the Sanoid distribution.</p>\n<p>I run <code>syncoid</code> in 'pull mode', which means that the backup server is configured to be able to SSH into the production server(s) in order to fetch the datasets. Under the hood, syncoid uses a combination of ZFS <a href=\"https://forums.truenas.com/t/zfs-bookmarks-and-why-you-dont-use-them-but-should/5578\">bookmarks</a> and <a href=\"https://xai.sh/2018/08/27/zfs-incremental-backups.html\">send/recv</a> to incrementally and efficiently transmit the snapshots over the network and reconstruct the filesystems locally.</p>\n<p>Once the SSH host keys are configured in the usual way, a series of crontab entries like this is sufficient to fetch all the remote snapshots to the local host. The backup host that's doing the pulling just needs to run <code>syncoid</code> regularly:</p>\n<pre><code>@daily /usr/sbin/syncoid backup@marpe:eeg/papers/doi eeg/papers/doi\n@daily /usr/sbin/syncoid backup@marpe:eeg/papers/tei eeg/papers/tei\n</code></pre>\n<p>At this point, the backup host now has all the snapshots from the live host (including hourly ones), and can then run Sanoid again in order to decide which ones it wants to keep locally. 
I haven't put too much effort into optimising these yet, but you can see they're different from the ones above.</p>\n<pre><code>[eeg/papers]\n use_template = backup\n recursive = yes\n\n[template_backup]\n autoprune = yes\n frequently = 0\n hourly = 30\n daily = 90\n monthly = 12\n yearly = 0\n autosnap = no\n</code></pre>\n<p>These will keep a few more hourly snapshots, and three times the number of daily snapshots available on the backup servers, in case a rollback is needed. Since the backup server typically has a lot more raw capacity than the live server, it's practical to do this there rather than on the production hosts.</p>\n<p>Finally, we can also hook this up to our monitoring scripts with a handy <a href=\"https://www.nagios.org/\">Nagios</a>-compatible interface.</p>\n<pre><code># sanoid --monitor-health\nOK ZPOOL eeg : ONLINE {Size:30.6T Free:2.44T Cap:92%}\n</code></pre>\n<h2><a href=\"https://anil.recoil.org/#should-we-use-zfs-root-volumes\"></a>Should we use ZFS root volumes?</h2>\n<p>It's a bit trickier to figure out if the root volume of the hosts should be ZFS. This requires that the boot initrd always has a working ZFS kernel module, which sometimes goes wrong on updates if the DKMS shim falters for some reason. In terms of specific distributions:</p>\n<ul>\n<li>Ubuntu in theory supports ZFS with all its kernels, but <a href=\"https://discourse.ubuntu.com/t/future-of-zfs-on-ubuntu-desktop/33001/19?u=d0od\">the long term future</a> of ZFS root on Ubuntu is in <a href=\"https://www.omgubuntu.co.uk/2023/01/ubuntu-zfs-support-status\">question</a>. 
<a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> has got <a href=\"https://www.tunbury.org/ubuntu-with-zfs-root/\">detailed instructions</a> on how to automate this with an <a href=\"https://gist.github.com/mtelvers/2cbeb5e35f43f5e461aa0c14c4a0a6b8\">Ansible playbook</a>.</li>\n<li>Debian only packages <a href=\"https://wiki.debian.org/ZFS\">ZFS via DKMS</a> due to the CDDL <a href=\"https://sfconservancy.org/blog/2016/feb/25/zfs-and-linux/\">licensing concerns</a>.</li>\n<li>Alpine also has good <a href=\"https://wiki.alpinelinux.org/wiki/ZFS\">ZFS support</a>, including <a href=\"https://wiki.alpinelinux.org/wiki/Root_on_ZFS_with_native_encryption\">encrypted root</a>.</li>\n</ul>\n<p>For my own personal servers, I've been using a normal ext4 root volume, and creating a ZFS pool for the remainder of the disk, without using LVM underneath it. It's a bit less flexible, but strikes a reasonable balance between simplicity and performance.</p>\n<h2><a href=\"https://anil.recoil.org/#next-steps\"></a>Next steps</h2>\n<p>This basic setup is sufficient to have pull-based ZFS snapshot and replication across multiple hosts, and makes it easy to quickly materialise a ZFS dataset onto a given host for use in data processing. It still needs a bunch of development to turn this into a properly robust system though:</p>\n<ul>\n<li><strong>Dynamic dataset construction:</strong> One downside is that you must create the right ZFS dataset structure ahead of time, since you can't send/receive arbitrary filesystem subtrees. I'm not sure if it's easy to <code>zfs create</code> into an existing subdirectory of a dataset and have it copy the files within there semi-automatically into the new sub-dataset.</li>\n<li><strong>Backing up into encrypted volumes:</strong> One of the coolest features of ZFS is that it can maintain <em>unmounted</em> and <em>encrypted at rest</em> datasets. 
It's therefore possible to have unencrypted data on the production servers (so no performance hit), with more secure-at-rest encryption on the backup servers. However, <a href=\"https://mtlynch.io/zfs-encrypted-backups/\">it requires some messing around</a> to figure out the right runes.</li>\n<li><strong>Discovery of datasets in a cluster:</strong> We also need a way of knowing which datasets have been backed up to which hosts. This way, if a host in a cluster needs a particular dataset, it can request it from the other host. Given we probably have 1000s of datasets (as opposed to potentially millions of snapshots), this doesn't seem like too difficult a problem. We may even be able to use a <a href=\"https://irmin.org\">Irmin</a> database or a DNS-based broadcast mechanism to do this easily within a cluster.</li>\n<li><strong>Switching from ZFS to XFS locally:</strong> While ZFS seems like the ideal replication filesystem, it still lacks some of the cooler local features like <a href=\"https://github.com/openzfs/zfs/issues/405#issuecomment-1880208374\">XFS reflinks</a>. It would be nice to find an efficient way to materialise an XFS filesystem from a ZFS base, but without copying absolutely everything. This is either impossibly difficult or really easy via some cunning use of <a href=\"https://en.wikipedia.org/wiki/OverlayFS\">overlayfs</a>. Probably impossible though, given how much block-level information is needed to do deduplication.</li>\n<li><strong>ZFS labels for policy:</strong> Most ZFS tools use custom key/value labels on datasets to implement policies. For example, a <code>syncoid:sync</code> label can be used to tell syncoid to include a particular recursive dataset in its replication. There are some scalability limits in just how many labels you can add before slowing a machine down to a crawl (though not as bad as how many live mounts). 
<a href=\"https://patrick.sirref.org\">Patrick Ferris</a> started some WIP <a href=\"https://github.com/quantifyearth/ocaml-zfs\">ocaml-zfs</a> bindings to <a href=\"https://github.com/openzfs/zfs/blob/master/include/libzfs.h\">libzfs</a> to help explore this question.</li>\n</ul>\n<p>So lots of work left to do here, but quite good fun as systems hacking goes! When <a href=\"https://tarides.com/blog/author/mark-elvers/\">Mark Elvers</a> is <a href=\"https://www.tunbury.org/kingston-drives/\">done installing</a> our new drives, we'll have a few petabytes of raw capacity to implement this system over...</p>",
+18
avsm/notes_the-state-of-ai-tools.json
···+"summary": "<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> organised this week's <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a> group <a href=\"https://www.cst.cam.ac.uk/seminars/list/229027\">discussion</a> on what AI tools we use for our daily work. I was immediately struck by how <em>few</em> tools there are that are actually making us more productive, so I jotted down notes as the discussion was going on.</p>\n<ul>\n<li>Personally, the only tool I've found that's (only just recently) making me more productive is agentic coding, which I <a href=\"https://anil.recoil.org/notes/claude-copilot-sandbox\">wrote about a few days ago</a>. Since then, I've been mildly obsessively forking off ideas I've wanted to try for years (like converting RFCs to OCaml code) and greatly enjoying myself. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and I have been looking for how to do this more ethically, and the best I ran across was the <a href=\"https://www.ibm.com/impact/ai-ethics\">IBM AI ethics</a> guidance and their <a href=\"https://github.com/ibm-granite/granite-code-models\">granite models</a>, but not much else. Any pointers to other models that don't violate open source licensing norms would be gratefully accepted; I'm using Claude 3.7 here, but don't feel great doing so!</li>\n<li><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> described his use of <a href=\"https://fathom.video/\">Fathom</a> for note-taking, and (having been on the receiving end) can confirm it does a very good transcription job.</li>\n<li><a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> has a local <a href=\"https://stabledifffusion.com/\">Stable Diffusion</a> image generator to help create local content for presentations/etc, but the setup broke when going from macOS 13 to 15 (Sequoia). 
Apple seem to have changed something in <a href=\"https://developer.apple.com/metal/\">Metal</a> so the existing HuggingFace installation (mostly <a href=\"https://developer.apple.com/metal/pytorch/\">pyTorch-metal</a> and the <a href=\"https://developer.apple.com/metal/tensorflow-plugin/\">Tensorflow MPS</a> backend) was out of date with the system Metal libraries. Package management for these tightly integrated hardware/software inference systems is pretty bad right now (<a href=\"https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html\">nvidia-container-toolkit</a> is another bag of hacks for containerised applications).</li>\n</ul>\n<p>Then there's a long list of things that people <em>aren't</em> using because they suck. LLM-driven searches are pretty inaccurate, as many people noted; I use <a href=\"https://kagi.com\">Kagi</a> but only because I love their AI-filtered search results, not because of their assistant! I've turned off Apple Intelligence on all my devices, not because of privacy concerns, but because it's just utter crap -- the summaries are actually <a href=\"https://www.bbc.co.uk/news/articles/cq5ggew08eyo\">incorrect</a> half the time. I find the autocorrect features similarly distracting and wrong most of the time, and normal spellcheckers do a better job in practice.</p>\n<h2><a href=\"https://anil.recoil.org/#wheres-this-going\"></a>Where's this going?</h2>\n<p>Our discussion then turned to news of emerging tools and techniques, since the field overall is just moving incredibly fast. Two things I've been reading this week are:</p>\n<ul>\n<li>With <a href=\"https://awards.acm.org/about/2024-turing\">RL winning the Turing award</a> this week, some folks investigated whether lightweight open-weight models could reach the performance of advanced heavy frontier models in terms of deductive reasoning. 
They applied RL to train an LLM for the <a href=\"https://openpipe.ai/blog/using-grpo-to-beat-o1-o3-mini-and-r1-on-temporal-clue\">game of temporal clue</a>, and their post describes many neat tricks (including the use of <a href=\"https://developers.google.com/optimization/cp/cp_solver\">CP-SAT</a> to generate difficult-but-solvable game scenarios). They used <a href=\"https://arxiv.org/abs/2402.03300\">GRPO</a> (as made famous by <a href=\"https://anil.recoil.org/notes/deepseek-r1-advances\">DeepSeek</a>) to do the RL loop of solving puzzles via model responses, grading groups of responses, and fine-tuning the model using clipped policy gradients derived from these group estimates. Their results <a href=\"https://openpipe.ai/blog/using-grpo-to-beat-o1-o3-mini-and-r1-on-temporal-clue\">were impressive</a> and reached frontier-model performance using Qwen 14B!</li>\n<li>And for something completely different, another team released their <a href=\"https://google-research.github.io/self-organising-systems/difflogic-ca/\">Differentiable Logic Cellular Automata</a> paper which describes how to go from the <a href=\"https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life\">Game of Life</a> to full pattern generation using learned recurrent circuits. This one should really be read in its entirety to appreciate how incredible it might become in the future, as it would allow us to generate distributed systems that can build a very complex end-goal pattern by following a set of simple rules. 
<a href=\"https://coomeslab.org\">David Coomes</a> pointed out to me recently that the question of <a href=\"https://www.wired.com/story/mystery-solved-how-plant-cells-know-when-to-stop-growing/\">why cells stop growing</a> has only very recently been understood in traditional biology, and yet here we are applying ML to the case.</li>\n<li><a href=\"https://mistral.ai/fr/news/mistral-ocr\">Mistral OCR</a> came out today and seems to be the state of the art in multi-modally breaking down documents into a consistent linear structure. Their results show that they can break down complex PDFs in multiple languages into seemingly clean HTML with semantic structure (such as tables, equations, figures and so on). I've only just finished running <a href=\"https://anil.recoil.org/projects/ce\">millions of papers</a> through <a href=\"https://grobid.readthedocs.io/en/latest/\">Grobid</a>, so this is next on the queue to try out...</li>\n</ul>\n<p>So, I guess the TL;DR of our discussion was that current AI tools are the first generation, but we're heading rather rapidly into new frontiers of discovery, so there's only going to be more of them coming up...</p>",+"content": "<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> organised this week's <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG</a> group <a href=\"https://www.cst.cam.ac.uk/seminars/list/229027\">discussion</a> on what AI tools we use for our daily work. I was immediately struck by how <em>few</em> tools there are that are actually making us more productive, so I jotted down notes as the discussion was going on.</p>\n<ul>\n<li>Personally, the only tool I've found that's (only just recently) making me more productive is agentic coding, which I <a href=\"https://anil.recoil.org/notes/claude-copilot-sandbox\">wrote about a few days ago</a>. 
Since then, I've been mildly obsessively forking off ideas I've wanted to try for years (like converting RFCs to OCaml code) and greatly enjoying myself. <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and I have been looking for how to do this more ethically, and the best I ran across was the <a href=\"https://www.ibm.com/impact/ai-ethics\">IBM AI ethics</a> guidance and their <a href=\"https://github.com/ibm-granite/granite-code-models\">granite models</a>, but not much else. Any pointers to other models that don't violate open source licensing norms would be gratefully accepted; I'm using Claude 3.7 here, but don't feel great doing so!</li>\n<li><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> described his use of <a href=\"https://fathom.video/\">Fathom</a> for note-taking, and (having been on the receiving end) can confirm it does a very good transcription job.</li>\n<li><a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> has a local <a href=\"https://stabledifffusion.com/\">Stable Diffusion</a> image generator to help create local content for presentations/etc, but the setup broke when going from macOS 13 to 15 (Sequoia). Apple seem to have changed something in <a href=\"https://developer.apple.com/metal/\">Metal</a> so the existing HuggingFace installation (mostly <a href=\"https://developer.apple.com/metal/pytorch/\">pyTorch-metal</a> and the <a href=\"https://developer.apple.com/metal/tensorflow-plugin/\">Tensorflow MPS</a> backend) was out of date with the system Metal libraries. Package management for these tightly integrated hardware/software inference systems is pretty bad right now (<a href=\"https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html\">nvidia-container-toolkit</a> is another bag of hacks for containerised applications).</li>\n</ul>\n<p>Then there's a long list of things that people <em>aren't</em> using because they suck. 
LLM-driven searches are pretty inaccurate, as many people noted; I use <a href=\"https://kagi.com\">Kagi</a> but only because I love their AI-filtered search results, not because of their assistant! I've turned off Apple Intelligence on all my devices, not because of privacy concerns, but because it's just utter crap -- the summaries are actually <a href=\"https://www.bbc.co.uk/news/articles/cq5ggew08eyo\">incorrect</a> half the time. I find the autocorrect features similarly distracting and wrong most of the time, and normal spellcheckers do a better job in practice.</p>\n<h2><a href=\"https://anil.recoil.org/#wheres-this-going\"></a>Where's this going?</h2>\n<p>Our discussion then turned to news of emerging tools and techniques, since the field overall is just moving incredibly fast. Two things I've been reading this week are:</p>\n<ul>\n<li>With <a href=\"https://awards.acm.org/about/2024-turing\">RL winning the Turing award</a> this week, some folks investigated whether lightweight open-weight models could reach the performance of advanced heavy frontier models in terms of deductive reasoning. They applied RL to train an LLM for the <a href=\"https://openpipe.ai/blog/using-grpo-to-beat-o1-o3-mini-and-r1-on-temporal-clue\">game of temporal clue</a>, and their post describes many neat tricks (including the use of <a href=\"https://developers.google.com/optimization/cp/cp_solver\">CP-SAT</a> to generate difficult-but-solvable game scenarios). They used <a href=\"https://arxiv.org/abs/2402.03300\">GRPO</a> (as made famous by <a href=\"https://anil.recoil.org/notes/deepseek-r1-advances\">DeepSeek</a>) to do the RL loop of solving puzzles via model responses, grading groups of responses, and fine-tuning the model using clipped policy gradients derived from these group estimates. 
Their results <a href=\"https://openpipe.ai/blog/using-grpo-to-beat-o1-o3-mini-and-r1-on-temporal-clue\">were impressive</a> and reached frontier-model performance using Qwen 14B!</li>\n<li>And for something completely different, another team released their <a href=\"https://google-research.github.io/self-organising-systems/difflogic-ca/\">Differentiable Logic Cellular Automata</a> paper which describes how to go from the <a href=\"https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life\">Game of Life</a> to full pattern generation using learned recurrent circuits. This one should really be read in its entirety to appreciate how incredible it might become in the future, as it would allow us to generate distributed systems that can build a very complex end-goal pattern by following a set of simple rules. <a href=\"https://coomeslab.org\">David Coomes</a> pointed out to me recently that the question of <a href=\"https://www.wired.com/story/mystery-solved-how-plant-cells-know-when-to-stop-growing/\">why cells stop growing</a> has only very recently been understood in traditional biology, and yet here we are applying ML to the case.</li>\n<li><a href=\"https://mistral.ai/fr/news/mistral-ocr\">Mistral OCR</a> came out today and seems to be the state of the art in multi-modally breaking down documents into a consistent linear structure. Their results show that they can break down complex PDFs in multiple languages into seemingly clean HTML with semantic structure (such as tables, equations, figures and so on). I've only just finished running <a href=\"https://anil.recoil.org/projects/ce\">millions of papers</a> through <a href=\"https://grobid.readthedocs.io/en/latest/\">Grobid</a>, so this is next on the queue to try out...</li>\n</ul>\n<p>So, I guess the TL;DR of our discussion was that current AI tools are the first generation, but we're heading rather rapidly into new frontiers of discovery, so there's only going to be more of them coming up...</p>",
+18
avsm/notes_the-year-in-ocamllabs.json
···+"summary": "<p>This time last year in 2012, I had just\n<a href=\"https://anil.recoil.org/2012/10/19/announcing-ocaml-labs.html\">announced</a>\nthe formation of a new group called <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/\">OCaml\nLabs</a> in the <a href=\"http://www.cl.cam.ac.uk\">Cambridge\nComputer Lab</a> that would combine research and\ncommunity work towards the practical application of functional\nprogramming. An incredible year has absolutely flown by, and I\u2019ve put\ntogether this post to summarise what\u2019s gone on, and point to our future\ndirections for 2014.</p>\n<p>The theme of our group was not to be pure research, but rather a hybrid\ngroup that would take on some of the load of day-to-day OCaml\nmaintenance from <a href=\"http://caml.inria.fr\">INRIA</a>, as well as help grow the\nwider OCaml community. To this end, all of our projects have been highly\ncollaborative, often involving colleagues from\n<a href=\"http://ocamlpro.com\">OCamlPro</a>, <a href=\"http://gallium.inria.fr/\">INRIA</a>,\n<a href=\"http://janestreet.com\">Jane Street</a>, <a href=\"http://www.lexifi.com/\">Lexifi</a>\nand <a href=\"http://citrix.com\">Citrix</a>.</p>\n<p>This post covers progress in <a href=\"https://anil.recoil.org/#tooling\">tooling</a>, the <a href=\"https://anil.recoil.org/#core_compiler\">compiler and\nlanguage</a>, <a href=\"https://anil.recoil.org/#community_efforts\">community efforts</a>,\n<a href=\"https://anil.recoil.org/#research_projects\">research projects</a> and concludes with our\n<a href=\"https://anil.recoil.org/#priorities_for_2014\">priorities for 2014</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#tooling\"></a>Tooling</h2>\n<p>At the start of 2013, OCaml was in the interesting position of being a\nmature decades-old language with a small, loyal community of industrial\nusers who built mission critical applications using it. 
We had the\nopportunity to sit down with many of them at the <a href=\"http://caml.inria.fr/consortium/\">OCaml\nConsortium</a> meeting and prioritise\nwhere we started work. The answer came back clearly: while the compiler\nitself is legendary for its stability, the tooling around it (such as\npackage management) was a pressing problem.</p>\n<h3><a href=\"https://anil.recoil.org/#opam\"></a>OPAM</h3>\n<p>Our solution to this tooling was centered around the\n<a href=\"http://opam.ocaml.org\">OPAM</a> package manager that\n<a href=\"http://ocamlpro.com\">OCamlPro</a> released into beta just at the end of\n2012, and had its first stable release in March 2013. OPAM differs from\nmost system package managers by emphasising a flexible distributed\nworkflow that uses version constraints to ensure incompatible libraries\naren\u2019t mixed up (important for the statically-typed OCaml that is very\ncareful about dependencies). Working closely with\n<a href=\"http://ocamlpro.com\">OCamlPro</a> we developed a git-based workflow to\nmake it possible for users (both individual or industrial) to easily\nbuild up their own package repositories and redistribute OCaml code, and\nstarted curating the <a href=\"https://github.com/ocaml/opam-repository\">package\nrepository</a>.</p>\n<p>The results have been satisfying: we started with an initial set of\naround 100 packages in OPAM (mostly imported by the 4 developers), and\nended 2013 with 587 unique packages and 2000 individual versions, with\ncontributions from 160 individuals. We now have a curated <a href=\"https://github.com/ocaml/opam-repository\">central\npackage repository</a> for anyone\nto submit their OCaml code, several third-party remotes are maintained\n(e.g. the <a href=\"https://github.com/xapi-project/opam-repo-dev\">Xen Project</a>\nand <a href=\"https://github.com/ocsigen/opam-ocsigen\">Ocsigen</a>). 
We also\nregularly receive releases of the <a href=\"http://ocaml.janestreet.com\">Core</a>\nlibraries from Jane Street, and updates from sources as varied as\n<a href=\"https://github.com/ocaml/opam-repository/pull/1300\">Facebook</a>,\n<a href=\"https://anil.recoil.org/2013/09/16/camlpdf-the-end-of-sucky-pdf-tools.html\">Coherent\nPDF</a>,\nto the <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/guha.pdf\">Frenetic\nSDN</a> research.</p>\n<p><img alt=\"\" src=\"https://anil.recoil.org/images/opam11-contributors-dec13.webp\" title=\"Number of unique contributors to the central OPAM package repository\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/opam11-packages-dec13.webp\" title=\"Total number of unique packages (including multiple versions of the same package)\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/opam11-unique-packages-dec13.webp\" title=\"Total packages with multiple versions coalesced so you can see new package growth\"></p>\n<p>A notable contribution from OCamlPro during this time was to\n<a href=\"https://github.com/ocaml/opam-repository/issues/955\">clarify</a> the\nlicensing on the package repository to be the liberal\n<a href=\"http://creativecommons.org/choose/zero/\">CC0</a>, and also to pass\nownership to the <a href=\"http://github.com/ocaml\">OCaml</a> organization on\nGitHub, where it\u2019s now jointly maintained by OCaml Labs, OCamlPro and\nanyone else that wishes to contribute.</p>\n<h3><a href=\"https://anil.recoil.org/#a-lens-into-global-ocaml-code\"></a>A lens into global OCaml code</h3>\n<p>It\u2019s been quite interesting just watching all the varied code fly into\nthe repository, but stability quickly became a concern as the new\npackages piled up. OCaml compiles to native code on not just x86, but\nalso PowerPC, Sparc and\n<a href=\"https://anil.recoil.org/2012/02/25/dreamplug-debian-and-ocaml.html\">ARM</a>\nCPUs. 
We kicked off various efforts into automated testing: firstly\n<a href=\"https://github.com/dsheets\">David Sheets</a> built the\n<a href=\"https://github.com/ocaml/v2.ocaml.org/blob/master/site/meetings/ocaml/2013/proposals/ocamlot.pdf\">OCamlot</a>\ndaemon that would schedule builds across all the exotic hardware. Later\nin the year, the <a href=\"http://travis-ci.org\">Travis</a> service launched support\nfor testing from GitHub pull requests, and this became the front line of\n<a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">automated\nchecking</a> for\nall incoming new packages to OPAM.</p>\n<p>A major headache with automated testing is usually setting up the right\nbuild environment with external library dependencies, and so we <a href=\"https://anil.recoil.org/2013/11/15/docker-and-opam.html\">added\nDocker support</a>\nto make it easier to bulk-build packages for local developer use, with\nthe results of builds available\n<a href=\"https://github.com/avsm/opam-bulk-logs\">publicly</a> for anyone to help\ntriage. Unfortunately fixing the bugs themselves is still a <a href=\"https://github.com/ocaml/opam-repository/issues/1304\">very manual\nprocess</a>, so more\nvolunteers are always welcome to help out!</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/travis-mascot-200px.webp\" title=\"\">\n\nWe\u2019re going to be really seeing the rewards from all this effort as\nOCaml 4.02 development proceeds, since we can now adopt a data-driven\napproach to changing language features instead of guessing how much\nthird-party code will break. 
If your code is in OPAM, then it\u2019ll be\ntested as new features such as <a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module\naliases</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">injectivity</a>\nand <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">extension\npoints</a> show up.</p>\n<h3><a href=\"https://anil.recoil.org/#better-documentation\"></a>Better documentation</h3>\n<p>The venerable\n<a href=\"http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual029.html\">OCamlDoc</a>\ntool has done an admirable job for the last decade, but is increasingly\nshowing its age due to a lack of support for cross-referencing across\npackages. We started working on this problem in the summer when <a href=\"https://github.com/vincent-botbol\">Vincent\nBotbol</a> visited us on an internship,\nexpecting it to be a quick job to come up with something as good as\nHaskell\u2019s excellent <a href=\"http://www.haskell.org/haddock/\">Haddock</a> online\ndocumentation.</p>\n<p>Instead, we ran into the "module wall": since OCaml makes it so easy to\nparameterise code over other modules, it makes it hard to generate\nstatic documentation without outputting hundreds of megabytes of HTML\nevery time. After some hard work from Vincent and Leo, we\u2019ve got a\nworking prototype that lets you simply run\n<code>opam install opam-doc && opam doc core async</code> to generate package\ndocumentation. 
You can see the results for\n<a href=\"http://mirage.github.io/\">Mirage</a> online, but expect to see this\nintegrated into the main OCaml site for all OPAM packages as we work\nthrough polishing up the user interface.</p>\n<h3><a href=\"https://anil.recoil.org/#turning-opam-into-libraries\"></a>Turning OPAM into libraries</h3>\n<p>The other behind-the-scenes effort for OPAM has been to keep the core\ncommand-line tool simple and stable, and to have it install OCaml\nlibraries that can be interfaced with by other tools to do\ndomain-specific tasks. <a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a>,\n<a href=\"http://louis.gesbert.fr/cv.en.html\">Louis Gesbert</a> and <a href=\"https://github.com/dsheets\">David\nSheets</a> have been steadily hacking away at\nthis and we now have <a href=\"https://github.com/ocamllabs/opamfu\">opamfu</a> to\nrun operations over all packages, and an easy-to-template\n<a href=\"https://github.com/ocaml/opam2web\">opam2web</a> that generates the live\n<a href=\"http://opam.ocaml.org\">opam.ocaml.org</a> website.</p>\n<p>This makes OPAM easier to deploy within other organizations that want to\nintegrate it into their workflow. For example, the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/\">software\nsection</a> of the OCaml\nLabs website is regularly generated from a search of all OPAM packages\ntagged <code>ocamllabs</code>. We also used it to rewrite the entire OPAM\nrepository <a href=\"https://github.com/ocaml/opam-repository/pull/1240\">in one epic\ndiff</a> to add\nexternal library dependencies via a <a href=\"https://github.com/ocaml/opam/pull/886/files\">command-line\nshim</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#opam-in-a-box\"></a>OPAM-in-a-Box</h3>\n<p>All of this effort is geared towards making it easier to maintain\nreusable local OPAM installations. 
After several requests from big\nuniversities to help out their teaching needs, we\u2019re putting together\nall the support needed to easily redistribute OPAM packages via an\n\u201c<a href=\"https://github.com/ocaml/opam/issues/1035\">OPAM-in-a-Box</a>\u201d command\nthat uses <a href=\"http://docker.io\">Docker</a> containers to let you clone and do\nlightweight modifications of OCaml installations.</p>\n<p>This will also be useful for anyone who\u2019d like to run tutorials or teach\nOCaml, without having to rely on flaky network connectivity at\nconference venues: a problem we\u2019ve <a href=\"http://amirchaudhry.com/fpdays-review\">suffered\nfrom</a> too!</p>\n<h2><a href=\"https://anil.recoil.org/#core-compiler\"></a>Core Compiler</h2>\n<p>\n<img alt=\"Compiler hacking at the Cambridge Makespace\" src=\"https://anil.recoil.org/images/compiler-hacking.webp\" title=\"Compiler hacking at the Cambridge Makespace\">\nCompiler hacking at the Cambridge Makespace\nStarting to work on a real compiler can often be a daunting prospect,\nand so one initiative we started this year is to host regular <a href=\"http://ocamllabs.github.io/compiler-hacking/2013/10/30/third-compiler-hacking-session.html\">compiler\nhacking\nsessions</a>\nwhere people could find a <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki\">curated list of\nfeatures</a> to work\non, with the regular developers at hand to help out when people get\nstuck, and free beer and pizza to oil the coding wheels. This has worked\nout well, with around 20 people showing up on average for the three we\nheld, and <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki/Things-previously-worked-on\">several\npatches</a>\nsubmitted upstream to OCaml. 
<a href=\"http://gallium.inria.fr/~scherer/\">Gabriel\nScherer</a> and <a href=\"http://cristal.inria.fr/~doligez/\">Damien\nDoligez</a> have been helping this\neffort by tagging <a href=\"http://caml.inria.fr/mantis/search.php?project_id=1&sticky_issues=1&sortby=last_updated&dir=DESC&highlight_changed=24&hide_status_id=90&tag_string=junior_job\">junior\njobs</a>\nin the OCaml Mantis bug tracker as they are filed.</p>\n<h3><a href=\"https://anil.recoil.org/#syntax-transformations-and-extension-points\"></a>Syntax transformations and extension points</h3>\n<p><a href=\"http://www.lpw25.net\">Leo White</a> started the year fresh out of\ncompleting his PhD with <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a>,\nand before he realized what he\u2019d gotten himself into was working with\n<a href=\"http://alain.frisch.fr/\">Alain Frisch</a> on the future of syntax\ntransformations in OCaml. We started off our first\n<a href=\"http://lists.ocaml.org/listinfo/wg-camlp4\">wg-camlp4</a> working group on\nthe new <a href=\"http://lists.ocaml.org\">lists.ocaml.org</a> host, and a spirited\ndiscussion\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-January/thread.html\">started</a>\nthat went\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-February/thread.html\">on</a>\nand\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-March/thread.html\">on</a>\nfor several months. It ended with a very satisfying design for a simpler\n<em>extension points</em> mechanism which Leo\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">presented</a> at\nthe OCaml 2013 workshop at ICFP, and is now merged into OCaml\n4.02-trunk.</p>\n<h3><a href=\"https://anil.recoil.org/#namespaces\"></a>Namespaces</h3>\n<p>Not all of the working groups were quite as successful in coming to a\nconclusion as the Camlp4 one. 
On the Platform mailing list, Gabriel\nScherer started a discussion on the design for\n<a href=\"http://lists.ocaml.org/pipermail/platform/2013-February/000050.html\">namespaces</a>\nin OCaml. The resulting discussion was useful in separating multiple\nconcerns that were intermingled in the initial proposal, and Leo wrote a\n<a href=\"http://www.lpw25.net/2013/03/10/ocaml-namespaces.html\">comprehensive blog\npost</a> on a\nproposed namespace design.</p>\n<p>After further discussion at <a href=\"http://icfpconference.org/icfp2013/\">ICFP\n2013</a> with Jacques Garrigue later\nin the year, it turns out adding support for <a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module\naliases</a> would solve much\nof the cost associated with compiling large libraries such as\n<a href=\"http://ocaml.janestreet.com\">Core</a>, with no backwards compatibility\nissues. This solution has now been integrated into OCaml 4.02.0dev and\nis being tested with Core.</p>\n<h3><a href=\"https://anil.recoil.org/#delving-into-the-bug-tracker\"></a>Delving into the bug tracker</h3>\n<p>Jeremy Yallop joined us in April, and he and Leo also leapt into the\ncore compiler and started triaging issues on the OCaml <a href=\"http://caml.inria.fr/mantis\">bug\ntracker</a>. This seems unglamorous in the\nbeginning, but there rapidly turned out to be many fascinating threads\nthat shed light on OCaml\u2019s design and implementation through seemingly\nharmless bugs. 
Here is a pick of some interesting threads through the\nyear that we\u2019ve been involved with:</p>\n<ul>\n<li>An <a href=\"http://caml.inria.fr/mantis/view.php?id=5985&nbn=49#bugnotes\">unexpected interaction between variance and GADTs</a>\nthat led to Jacques Garrigue\u2019s\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">talk</a> at\nOCaml 2013.</li>\n<li>Type unsoundness by <a href=\"http://caml.inria.fr/mantis/view.php?id=5992\">pattern matching lazy mutable\nvalues</a>, thus shedding\nlight on the precise semantics of the order of pattern matching.</li>\n<li>Leo proposed an <a href=\"http://caml.inria.fr/mantis/view.php?id=5584\">open types</a> extension to\nallow abstract types to be declared open. You can try it via\n<code>opam switch 4.00.1+open-types</code>.</li>\n<li>Designing the popular, but controversial <a href=\"http://caml.inria.fr/mantis/view.php?id=5759\">record disambiguation feature</a> in OCaml\n4.01.0, and debating <a href=\"http://caml.inria.fr/mantis/view.php?id=6000\">the right warnings</a> needed to\nprevent programmer surprise.</li>\n<li>Exposing a <a href=\"http://caml.inria.fr/mantis/view.php?id=6064\">GADT representation for Bigarray</a>.</li>\n</ul>\n<p>This is just a sample of some of the issues solved in Mantis; if you\nwant to learn more about OCaml, it\u2019s well worth browsing through it to\nlearn from over a decade of interesting discussions from all the\ndevelopers.</p>\n<h3><a href=\"https://anil.recoil.org/#thread-local-storage-runtime\"></a>Thread-local storage runtime</h3>\n<p>While OCamlPro was working on their <a href=\"https://github.com/lucasaiu/ocaml\">reentrant OCaml\nruntime</a>, we took a different tack by\nadding <a href=\"https://github.com/ocamllabs/ocaml/tree/multicore\">thread-local\nstorage</a> to the\nruntime instead, courtesy of <a href=\"http://mu.netsoc.ie/\">Stephen Dolan</a>. 
This\nis an important choice to make at the outset of adding multicore, so\nboth approaches are warranted. The preemptive runtime adds a lot of code\nchurn (due to adding a context parameter to most function calls) and\ntakes up a register, whereas the thread-local storage approach we tried\ndoesn\u2019t permit callbacks to different threads.</p>\n<p>Much of this work isn\u2019t interesting on its own, but forms the basis for\na fully multicore runtime (with associated programming model) in 2014.\nStay tuned!</p>\n<h3><a href=\"https://anil.recoil.org/#ctypes\"></a>Ctypes</h3>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/c.webp\" title=\"\">\n\nOne other complaint from the Consortium members was quite surprising:\nthe difficulty of using the OCaml foreign function interface safely to\ninterface with C code. Jeremy Yallop began working on the\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes\">ctypes</a> library that had the\ngoal of eliminating the need to write any C code at all for the vast\nmajority of foreign bindings.</p>\n<p>Instead, Ctypes lets you describe any C function call as an OCaml value,\nand provides various linkage options to invoke that function into C. The\nfirst option he implemented was a <code>dlopen</code> interface, which immediately\nbrought us the same level of functionality as the\n<a href=\"http://docs.python.org/2/library/ctypes.html\">Python</a> or\n<a href=\"http://www.haskell.org/haskellwiki/Library/libffi\">Haskell</a> Ctypes\nequivalents. 
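The dlopen-based linkage mode described above is the same model as the Python Ctypes equivalent the post mentions, so a tiny Python example can illustrate the shared idea: describe a C function's type as data, then call it with no hand-written C stub. This is a hedged sketch, not OCaml Ctypes itself; the library name assumes glibc on Linux (macOS would use "libc.dylib").

```python
# Sketch of dlopen-style foreign calls via Python's ctypes module,
# analogous in spirit to OCaml Ctypes' dynamic-linkage mode.
# Assumes a glibc system where "libc.so.6" is loadable.
import ctypes

libc = ctypes.CDLL("libc.so.6")

# Describe the C function's signature as data, then invoke it directly:
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

assert libc.strlen(b"ocaml") == 5
```

The trade-off the text goes on to discuss is exactly about this mode: dlopen is convenient but can't see C macros or inline definitions, which is what the alternative stub-generation linkage addresses.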
This early code was in itself startlingly useful and more\npleasant to use than the raw FFI, and various projects (such as David\nSheets\u2019 <a href=\"https://github.com/dsheets/ocaml-sodium\">libsodium</a>\ncryptography bindings) started adopting it.</p>\n<p>At this point, I happened to be struggling to write the Foreign Function\nInterface chapter of <a href=\"https://realworldocaml.org\">Real World OCaml</a>\nwithout blowing through our page budget with a comprehensive explanation\nof the existing system. I decided to take a risk and write about Ctypes\ninstead, since it let newcomers to the language have a <em>far</em> more\nproductive experience getting started. Xavier Leroy pointed out <a href=\"https://github.com/realworldocaml/book/issues/1701\">some\nshortcomings</a> of the\nlibrary in his technical book review, most notably the lack of an\ninterface with C macros. The design of Ctypes fully supports linking\nmechanisms other than <code>dlopen</code> though, and Jeremy has added\nautomatic C stub generation support as well. This means that if you use\nCtypes to build an OCaml binding in 2014, you can choose several\nmechanisms for the same source code to link to the external system.\nJeremy even demonstrated a forking model at OCaml 2013 that protects the\nOCaml runtime from the C binding via process separation.</p>\n<p>The effort is paying off: Daniel B\u00fcnzli <a href=\"http://alan.petitepomme.net/cwn/2013.12.17.html#9\">ported\nSDL2</a> using ctypes,\nand gave us extensive\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes/issues\">feedback</a> about any\nmissing corner cases, and the resulting bindings don\u2019t require any C\ncode to be written. 
<a href=\"http://xulforum.org\">Jonathan Protzenko</a> even used\nit to implement an OCaml controller for the <a href=\"http://gallium.inria.fr/blog/raspi-lcd/\">Adafruit Raspberry Pi RGB\nLCD</a>!</p>\n<h2><a href=\"https://anil.recoil.org/#community-efforts\"></a>Community Efforts</h2>\n<p>Our community efforts were largely online, but we also hosted visitors\nover the year and regular face-to-face tutorials.</p>\n<h3><a href=\"https://anil.recoil.org/#online-at-ocamlorg\"></a>Online at OCaml.org</h3>\n<p>While the rest of the crew were hacking on OPAM and OCaml, <a href=\"http://amirchaudhry.com/\">Amir\nChaudhry</a> and <a href=\"http://philippewang.info/CL/\">Philippe\nWang</a> teamed up with Ashish Agarwal and\nChristophe Troestler to redesign and relaunch the <a href=\"http://ocaml.org\">OCaml\nwebsite</a>. Historically, OCaml\u2019s homepage has been the\n<a href=\"http://caml.inria.fr\">caml.inria.fr</a> domain, and the\n<a href=\"http://ocaml.org\">ocaml.org</a> effort was begun by Christophe and Ashish\n<a href=\"https://www.mail-archive.com/caml-list@inria.fr/msg00169.html\">some years\nago</a> to\nmodernize the web presence.</p>\n<p>The webpages were already rather large with complex scripting (for\nexample, the <a href=\"http://ocaml.org/learn/tutorials/99problems.html\">99\nProblems</a> page runs\nthe OCaml code to autogenerate the output). Philippe developed a\n<a href=\"https://github.com/pw374/MPP-language-blender\">template DSL</a> that made\nit easier to unify a lot of the templates around the website, and also a\n<a href=\"https://github.com/pw374/omd\">Markdown parser</a> that we could link to as\na library from the rest of the infrastructure without shelling out to\nPandoc.</p>\n<p>Meanwhile, Amir designed a series of <a href=\"http://amirchaudhry.com/wireframe-demos-for-ocamlorg/\">interactive wireframe\nsketches</a> and\n<a href=\"http://amirchaudhry.com/ocamlorg-request-for-feedback/\">gathered feedback</a> on it\nfrom the community. 
A local <a href=\"http://onespacemedia.com\">design agency</a> in\nCambridge helped with visual look and feel, and finally at the end of\nthe summer we began the\n<a href=\"http://amirchaudhry.com/migration-plan-ocaml-org/\">migration</a> to the\nnew website, followed by a triumphant\n<a href=\"http://amirchaudhry.com/announcing-new-ocamlorg/\">switchover</a> in\nNovember to the design you see today.</p>\n<p>The domain isn\u2019t just limited to the website itself. Leo and I set up a\n<a href=\"https://github.com/ocaml/ocaml.org-scripts\">SVN-to-Git mirror</a> of the\nOCaml compiler <a href=\"http://caml.inria.fr/ocaml/anonsvn.en.html\">Subversion\nrepository</a> on the GitHub\n<a href=\"https://github.com/ocaml/ocaml\">OCaml organization</a>, which is proving\npopular with developers. There is an ongoing effort to simplify the core\ncompiler tree by splitting out some of the larger components, and so\n<a href=\"http://github.com/ocaml/camlp4\">camlp4</a> is also now hosted on that\norganization, along with <a href=\"https://github.com/ocaml/oasis\">OASIS</a>. We\nalso administer several subdomains of <a href=\"http://ocaml.org\">ocaml.org</a>,\nsuch as the <a href=\"http://lists.ocaml.org\">mailing lists</a> and the <a href=\"http://opam.ocaml.org\">OPAM\nrepository</a>, and other services such as the\n<a href=\"http://forge.ocamlcore.org\">OCaml Forge</a> are currently migrating over.\nThis was made significantly easier thanks to sponsorship from <a href=\"http://rackspace.com\">Rackspace\nCloud</a> (users of <a href=\"http://xenserver.org\">XenServer</a>\nwhich is written in OCaml). They saw our struggles with managing\nphysical machines and gave us developer accounts, and all of the\nocaml.org infrastructure is now hosted on Rackspace. 
We\u2019re very grateful\nto their ongoing help!</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/rackspace.webp\" title=\"\">\n\nIf you\u2019d like to help with infrastructure (for example, I\u2019m\nexperimenting with a <a href=\"http://git.ocaml.org/public/\">GitLab</a> mirror),\nthen please join the\n<a href=\"http://lists.ocaml.org/listinfo/infrastructure\">infrastructure@lists.ocaml.org</a>\nmailing list and share your thoughts. The website team also needs help\nwith adding content and <a href=\"https://github.com/ocaml/ocaml.org/issues/376\">international\ntranslations</a>, so head\nover to the <a href=\"http://github.com/ocaml/ocaml.org/issues\">website issue\ntracker</a> and start proposing\nimprovements you\u2019d like to see.</p>\n<h3><a href=\"https://anil.recoil.org/#next-steps-for-ocamlorg\"></a>Next steps for ocaml.org</h3>\n<p>The floodgates of feature requests opened up after the launch of the new\nlook and feel. Pretty much everyone wanted deeper OPAM integration into\nthe main website, for features such as:</p>\n<ul>\n<li>Starring and reviewing packages</li>\n<li>Integrating the <a href=\"https://github.com/ocamllabs/opam-doc\">opam-doc</a>\ndocumentation with the metadata</li>\n<li>Displaying test results and a compatibility matrix for non-x86 and\nnon-Linux architectures</li>\n<li>Linking to blog posts and tutorials about the package</li>\n</ul>\n<p>Many of these features were part of the <a href=\"http://amirchaudhry.com/wireframe-demos-for-ocamlorg/\">original\nwireframes</a>, but\nwe\u2019re being careful to take a long-term view of how they should be\ncreated and maintained. Rather than building all of this as a huge\nbloated <a href=\"https://github.com/ocaml/opam2web\">opam2web</a> extension, David\nSheets (our resident reluctant-to-admit-it web expert) has designed an\noverlay directory scheme that permits the overlaying of different\nmetadata onto the website. 
This lets one particular feature (such as\nblog post aggregation) be handled separately from the others via Atom\naggregators.</p>\n<h3><a href=\"https://anil.recoil.org/#real-world-ocaml\"></a>Real World OCaml</h3>\n<p><img alt=\"%r\" src=\"https://anil.recoil.org/papers/rwo\">\nA big effort that took up most of the year for me was finishing and\npublishing an O\u2019Reilly book called <a href=\"https://realworldocaml.org\">Real World\nOCaml</a> with <a href=\"https://ocaml.janestreet.com/?q=blog/5\">Yaron\nMinsky</a> and Jason Hickey. Yaron\ndescribes how it all started in <a href=\"https://ocaml.janestreet.com/?q=node/117\">his blog\npost</a>, but I learnt a lot from\ndeveloping a book using the <a href=\"https://web.archive.org/web/20160324164610/https://anil.recoil.org/2013/08/06/real-world-ocaml-beta2.html\">open commenting\nscheme</a>\nthat we developed just for this.</p>\n<p>In particular, the book ended up shining a bright light into dark\nlanguage corners that we might otherwise not have explored in OCaml\nLabs. Two chapters of the book that I wasn\u2019t satisfied with were the\n<a href=\"https://realworldocaml.org/v1/en/html/objects.html\">objects</a> and\n<a href=\"https://realworldocaml.org/v1/en/html/classes.html\">classes</a> chapters,\nlargely since neither Yaron nor Jason nor I had ever really used their\nfull power in our own code. Luckily, Leo White decided to pick up the\nbaton and champion these oft-maligned (but very powerful) features of\nOCaml, and the result is the clearest explanation of them that I\u2019ve read\nyet. Meanwhile, Jeremy Yallop helped out with extensive review of the\n<a href=\"https://realworldocaml.org/v1/en/html/foreign-function-interface.html\">Foreign Function\nInterface</a>\nchapter that used his\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes\">ctypes</a> library. 
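As a small taste of the object system those chapters cover (an illustrative snippet of my own, not an excerpt from the book), OCaml supports immediate objects with structural typing, so no class declaration is needed:

```ocaml
(* An immediate object: structurally typed, no class required *)
let greeter = object
  val name = "OCaml"
  method greet = "hello, " ^ name
end

let () = print_endline greeter#greet
```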
Finally,\n<a href=\"https://plus.google.com/100586365409172579442/posts\">Jeremie Dimino</a>\nat Jane Street worked hard on adding several features to his\n<a href=\"https://github.com/diml/utop\">utop</a> toplevel that made it compelling\nenough to become our default recommendation for newcomers.</p>\n<p>All in all, we ended up closing over <a href=\"https://web.archive.org/web/20160101000000*/https://anil.recoil.org/2013/08/06/real-world-ocaml-beta2.html\">2000\ncomments</a>\nin the process of writing the book, and I\u2019m very proud of the result\n(freely available <a href=\"https://realworldocaml.org\">online</a>, but do <a href=\"http://www.amazon.com/Real-World-OCaml-Functional-programming/dp/144932391X/\">buy a\ncopy</a>\nif you can to support it). Still, there\u2019s more I\u2019d like to do in 2014 to\nimprove the ease of using OCaml further. In particular, I removed a\nchapter on packaging and build systems since I wasn\u2019t happy with its\nquality, and both <a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> and I\nintend to spend time in 2014 on improving this part of the ecosystem.</p>\n<h3><a href=\"https://anil.recoil.org/#tutorials-and-talks\"></a>Tutorials and Talks</h3>\n<p>\n<img alt=\"Julien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.\" src=\"https://anil.recoil.org/images/pfff.webp\" title=\"Julien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.\">\nJulien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.\nWe had a lively presence at <a href=\"http://icfpconference.org\">ICFP 2013</a> this\nyear, with the third iteration of the OCaml workshop (<a href=\"http://ocaml.org/meetings/ocaml/2013/program.html\">OCaml\n2013</a>) held there, and\nStephen Dolan presenting a paper in the main conference. 
I <a href=\"http://www.syslog.cl.cam.ac.uk/2013/09/24/liveblogging-ocaml-workshop-2013/\">liveblogged\nOCaml\n2013</a>\nand <a href=\"http://www.syslog.cl.cam.ac.uk/2013/09/22/liveblogging-cufp-2013/\">CUFP\n2013</a>\nas they happened, and all the\n<a href=\"http://ocaml.org/meetings/ocaml/2013/program.html\">talks</a> we gave are\nlinked from the program. The most exciting part of the conference for a\nlot of us were the two talks by Facebook on their use of OCaml: first\nfor <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/padioleau.pdf\">program analysis using\nPfff</a> and\nthen to migrate their massive PHP codebase <a href=\"http://www.youtube.com/watch?feature=player_detailpage&v=gKWNjFagR9k#t=1150\">using an OCaml\ncompiler</a>.\nI also had the opportunity to participate in a panel at the Haskell\nWorkshop on whether <a href=\"http://ezyang.tumblr.com/post/62157468762/haskell-haskell-and-ghc-too-big-to-fail-panel\">Haskell is too big to fail\nyet</a>;\nlots of interesting perspectives on scaling another formerly academic\nlanguage into the real world.</p>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> and I have been\ngiving tutorials on OCaml at ICFP for several years, but the release of\nReal World OCaml has made it significantly easier to give tutorials\nwithout the sort of labor intensity that it took in previous years (one\nmemorable ICFP 2011 tutorial that we did took almost 2 hours to get\neveryone installed with OCaml. In ICFP 2013, it took us 15 minutes or so\nto get everyone started). Still, giving tutorials at ICFP is very much\npreaching to the choir, and so we\u2019ve started speaking at more\ngeneral-purpose events.</p>\n<p>\n<img alt=\"Marius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. Maybe.\" src=\"https://anil.recoil.org/images/marius-yaron-icfp.webp\" title=\"Marius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. 
\nMarius">
Maybe.\">\nMarius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. Maybe.\nOur first local effort was <a href=\"http://fpdays.net/2013/\">FPDays</a> in\nCambridge, where Jeremy Yallop and Amir Chaudhry ran the tutorial with\nhelp from Philippe Wang, Leo White and David Sheets. The OCaml session\nthere ended up being the biggest one in the entire two days, and Amir\n<a href=\"http://amirchaudhry.com/fpdays-review/\">wrote up</a> their experiences.\nOne interesting change from our ICFP tutorial is that Jeremy used\n<a href=\"https://github.com/ocsigen/js_of_ocaml\">js_of_ocaml</a> to teach OCaml\nvia JavaScript by building a fun <a href=\"https://github.com/ocamllabs/fpdays-skeleton\">Monty\nHall</a> game.</p>\n<h3><a href=\"https://anil.recoil.org/#visitors-and-interns\"></a>Visitors and Interns</h3>\n<p>\n<img alt=\"Thomas Gazagnaire presents at Jane Street\" src=\"https://anil.recoil.org/images/thomas-nycoug-2013.webp\" title=\"Thomas Gazagnaire presents at Jane Street\">\nThomas Gazagnaire presents at Jane Street\nSince OCaml Labs is a normal group within the <a href=\"http://www.cl.cam.ac.uk\">Cambridge Computer\nLab</a>, we often host academic visitors and\ninterns who pass through. 
This year was certainly diverse, and we\nwelcomed a range of colleagues:</p>\n<ul>\n<li><a href=\"http://www.lip6.fr/actualite/personnes-fiche.php?ident=D1161&LANG=en\">Mathias\nBourgoin</a>\nhas just finished his work on interfacing OCaml with GPUs, and gave\nus a seminar on how his\n<a href=\"http://www.algo-prog.info/spoc/web/index.php?id=spoc\">SPOC</a> tool\nworks (also available in OPAM via a <a href=\"http://www.algo-prog.info/spoc/distribution/opam/\">custom\nremote</a>).</li>\n<li><a href=\"http://www.benjamin.canou.fr/\">Benjamin Canou</a> (now at OCamlPro)\npractised his <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/canou.pdf\">OCaml 2013\ntalk</a> on\nbuilding high-level interfaces to JavaScript with OCaml by giving a\ndepartmental seminar.</li>\n<li><a href=\"http://www.dicosmo.org/\">Roberto Di Cosmo</a>, who directs the\n<a href=\"http://www.irill.org/\">IRILL</a> organization on Free Software in\nParis, delivered a seminar on constraint solving for <a href=\"http://mancoosi.org\">package\nsystems</a> that are as large-scale as Debian\u2019s.</li>\n<li><a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> visited during the summer\nto help plot the <a href=\"http://openmirage.org/blog/mirage-1.0.3-released\">Mirage\n1.0</a> and <a href=\"https://anil.recoil.org/2013/09/20/opam-1-1-beta.html\">OPAM\n1.1</a> releases.\nHe has also since joined OCaml Labs fulltime to work on\n<a href=\"http://nymote.org\">Nymote</a>.</li>\n<li><a href=\"http://louis.gesbert.fr/cv.en.html\">Louis Gesbert</a> from OCamlPro\nvisited for 2 weeks in December and kicked off the inaugural OPAM\ndevelopers summit (which was, admittedly, just 5 developers in the\n<a href=\"http://www.kingston-arms.co.uk/\">Kingston Arms</a>, but all good\nthings start in a pub, right?)</li>\n<li><a href=\"http://www.xulforum.org/\">Jonathan Protzenko</a> presented his PhD\nwork on <a href=\"http://protz.github.io/mezzo/\">Mezzo</a> (which is now <a 
href=\"http://gallium.inria.fr/blog/mezzo-on-opam/\">merged\ninto OPAM</a>), and\neducated us on the vagaries of <a href=\"http://protz.github.io/ocaml-installer/\">Windows\nsupport</a>.</li>\n<li><a href=\"http://gallium.inria.fr/~scherer/\">Gabriel Scherer</a> from the\nGallium INRIA group visited to discuss the direction of OPAM and\nvarious language feature discussions (such as namespaces). He didn\u2019t\ngive a talk, but promises to do so next time!</li>\n<li><a href=\"https://github.com/bvaugon\">Beno\u00eet Vaugon</a> gave a seminar on his\n<a href=\"http://oud.ocaml.org/2012/slides/oud2012-paper10-slides.pdf\">OCamlCC</a>\nOCaml-to-C compiler, talked about porting OCaml to <a href=\"http://www.algo-prog.info/ocaml_for_pic/web/index.php?id=ocapic\">8-bit\nPICs</a>,\nand using GADTs to <a href=\"http://caml.inria.fr/mantis/view.php?id=6017\">implement\nPrintf</a> properly.</li>\n</ul>\n<p>We were also visited several times by <a href=\"http://danmey.org/\">Wojciech\nMeyer</a> from ARM, who was an OCaml developer who\nmaintained (among other things) the\n<a href=\"http://brion.inria.fr/gallium/index.php/Ocamlbuild\">ocamlbuild</a> system\nand worked on <a href=\"http://www.youtube.com/watch?v=d9Hg5L76FG8\">DragonKit</a>\n(an extensible LLVM-like compiler written in OCaml). Wojciech very sadly\npassed away on November 18th, and we all fondly remember his\nenthusiastic and intelligent contributions to our small Cambridge\ncommunity.</p>\n<p>We also hosted visitors to live in Cambridge and work with us over the\nsummer. In addition to Vincent Botbol (who worked on OPAM-doc as\ndescribed earlier) we had the pleasure of having <a href=\"http://erratique.ch/\">Daniel\nB\u00fcnzli</a> and <a href=\"http://www.x9c.fr/\">Xavier Clerc</a>\nwork here. 
Here\u2019s what they did in their own words.</p>\n<h4><a href=\"https://anil.recoil.org/#xavier-clerc-ocamljava\"></a>Xavier Clerc: OCamlJava</h4>\n<p>Xavier Clerc took a break from his regular duties at INRIA to join us\nover the summer to work on\n<a href=\"http://ocamljava.x9c.fr/preview/\">OCaml-Java</a> and adapt it to the\nlatest JVM features. This is an incredibly important project to bridge\nOCaml with the huge Java community, and here\u2019s his report:</p>\n<blockquote>\n<p>After a four-month visit to the OCaml Labs dedicated to the\n<a href=\"http://ocamljava.x9c.fr/preview/\">OCaml-Java</a> project, the time has\ncome for an appraisal! The work undertaken can be split into two\nareas: improvements to code generation, and interaction between the\nOCaml & Java languages. Regarding code generation, several classical\noptimizations have been added to the compiler, for example loop\nunrolling, more aggressive unboxing, better handling of globals, or\npartial evaluation (at the bytecode level). A new tool, namely\nocamljar, has been introduced to allow post-compilation optimizations.\nThe underlying idea is that some optimizations cannot always be\napplied (e.g. depending on whether multiple threads/programs will\ncoexist), but enabling them through command-line flags would lead to\nrecompilation and/or multiple installations of each library according\nto the set of chosen optimizations. It is thus far easier to\nfirst build an executable jar file, and then modify it according to\nthese optimizations. Furthermore, this workflow allows the ocamljar\ntool to take advantage of whole-program information for some\noptimizations. All these improvements, combined, often lead to a gain\nof roughly 1/3 in terms of execution time.</p>\n<p>Regarding language interoperability, there are actually two directions\ndepending on whether you want to call OCaml code from Java, or want to\ncall Java code from OCaml. 
For the first direction, a tool allows you to\ngenerate Java source files from compiled OCaml interfaces, mapping the\nvarious constructs of the OCaml language to Java classes. It is then\npossible to call functions, and to manipulate instances of OCaml types\nin pure Java, still benefiting from the type safety provided by the\nOCaml language. In the other direction, an extension of the OCaml\ntyper is provided allowing you to create and manipulate Java instances\ndirectly from OCaml sources. This typer extension is indeed a thin\nlayer upon the original OCaml typer, which is mainly responsible for\nencoding Java types into OCaml types. This encoding uses a number of\nadvanced elements such as polymorphic variants, subtyping, variance\nannotations, phantom typing, and the printf hack, but the end-user does\nnot have to be aware of this encoding. On the surface, the type of\ninstances of the Java Object class is\n<code>java'lang'Object java_instance</code>, and instances can be created by\ncalling Java.make <code>Object()</code>.</p>\n<p>While still under heavy development, a working prototype <a href=\"http://ocamljava.x9c.fr/preview/\">is\navailable</a>, and bugs <a href=\"http://bugs.x9c.fr/\">can be\nreported</a>. Finally, I would like to thank the\nOCaml Labs for providing a great working environment.</p>\n</blockquote>\n<h4><a href=\"https://anil.recoil.org/#daniel-b\u00fcnzli-typography-and-visualisation\"></a>Daniel B\u00fcnzli: Typography and Visualisation</h4>\n<p>Daniel joined us from Switzerland, having spent some time at Citrix before\ncoming to OCaml Labs. All of his\n<a href=\"http://erratique.ch/software\">software</a> is now on OPAM, and is seeing\never-increasing adoption from the community.</p>\n<blockquote>\n<p>Released a first version of <a href=\"http://erratique.ch/software/vg\">Vg</a> [\u2026]\nI\u2019m especially happy about that as I had wanted to use and work on these\nideas since at least 2008. 
It is a long-term project and is\ncertainly not finished yet, but this is already a huge step.</p>\n<p>Adjusted and released a first version of\n<a href=\"http://erratique.ch/software/gg\">Gg</a>. While the module was already\nmostly written before my arrival in Cambridge, the development of Vg\nand Vz prompted me to make some changes to the module.</p>\n<p>[\u2026] released <a href=\"http://erratique.ch/software/otfm\">Otfm</a>, a module to\ndecode OpenType fonts. This is a work in progress as not every\nOpenType table has built-in support for decoding yet. But since it is\nneeded by Vg\u2019s PDF renderer, I had to cut a release. It can however\nalready be used to implement certain simple things like font kerning\nwith Vg; this can be seen in action in the <code>vecho</code> binary installed by\nVg.</p>\n<p>Started to work on <a href=\"http://erratique.ch/software/vz/doc/Vz.html\">Vz</a>,\na module for helping to map data to Vg images. This is really\nunfinished and is still considered to be at a design stage. There are,\nhowever, a few things that are well implemented, like (human)\nperceptually meaningful <a href=\"http://erratique.ch/software/vz/demos/color_schemes.html\">color\npalettes</a>\nand the small folding stat module (<code>Vz.Stat</code>). However it quickly\nbecame evident that I needed to have more in the box w.r.t. text\nrendering in Vg/Otfm. Things like d3js entirely rely on the SVG/CSS\nsupport for text, which makes it easy to e.g. align things (like tick\nlabels on <a href=\"http://erratique.ch/software/vz/demos/iris.html\">such\ndrawings</a>). If you\ncan\u2019t rely on that, you need ways of measuring rendered text. So I\ndecided to suspend the work on Vz and put more energy into making a\nfirst good release of Vg. 
Vz still needs quite some design work,\nespecially since it tries to be independent of Vg\u2019s backend and of\nthe mechanism for user input.</p>\n<p>Spent some time figuring out a new \u201copam-friendly\u201d release workflow in\npkgopkg. One of my problems is that by designing in the small for\nprogramming in the large \u2014 what a slogan \u2014 the number of packages I\u2019m\npublishing is growing (12 and still counting). This means that I need\nto scale horizontally maintenance-wise, unhelped by the sad state of\nbuild systems for OCaml. I need tools that make the release process\nflawless, painless and up to my quality standards. This led me to\nenhance and consolidate my old scattered distribution scripts in that\nrepo, killing my dependencies on Oasis and ocamlfind along the way.\n<em>(edited for brevity, see\n<a href=\"https://github.com/dbuenzli/pkgopkg\">here</a>)</em></p>\n</blockquote>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/daniel-presentation-vg.webp\" title=\"\">\n\nDaniel also left his bicycle here for future visitors to use, and the\n\u201cB\u00fcnzli-bike\u201d is available for our next visitor! (<span>Louis Gesbert</span> even\ndonated lights, giving it a semblance of safety).</p>\n<h3><a href=\"https://anil.recoil.org/#industrial-fellows\"></a>Industrial Fellows</h3>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/xenserver.webp\" title=\"\">\n\nMost of our regular funding bodies such as <a href=\"http://epsrc.ac.uk\">EPSRC</a>\nor <a href=\"http://cordis.europa.eu/fp7/home_en.html\">EU FP7</a> provide funding,\nbut leave all the intellectual input to the academics. A compelling\naspect of OCaml Labs has been how involved our industrial colleagues\nhave been with the day-to-day problems that we solve. 
Both Jane Street\nand Citrix have senior staff regularly visiting our group and working\nalongside us as industrial fellows in the Computer Lab.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/js.webp\" title=\"\">\n\n<a href=\"http://www.three-tuns.net/mark/\">Mark Shinwell</a> from Jane Street\nEurope has been working on improving the <a href=\"http://www.youtube.com/watch?v=NF2WpWnB-nk\">state of native\ndebugging</a> in OCaml, by\nadding extended DWARF debugging information to the compiler output.\nMark is also a useful source of feedback about the forthcoming\ndesign of multicore, since he has daily insight into a huge\nproduction codebase at Jane Street (and can tell us about it without\nus requiring access!).</p>\n<p><a href=\"http://dave.recoil.org\">Dave Scott</a> is the principal architect of\n<a href=\"http://xenserver.org\">XenServer</a> at Citrix in Cambridge. This year\nhas been transformative for that project, since Citrix <a href=\"http://blogs.citrix.com/2013/06/26/open-source-what-does-it-mean-for-xenserver/\">open-sourced\nXenServer</a>\nto GitHub and fully adopted OPAM into their workflow. Dave is the\nauthor of numerous libraries that have all been released to OPAM,\nand his colleagues <a href=\"http://jon.recoil.org\">Jon Ludlam</a> and <a href=\"http://www.xenserver.org/blog/blogger/listings/euanh.html\">Euan\nHarris</a>\nare also regular visitors who have also been contributors to the\nOPAM and Mirage ecosystems.</p>\n<h2><a href=\"https://anil.recoil.org/#research-projects\"></a>Research Projects</h2>\n<p>The other 100% of our time at the Labs is spent on research projects.\nWhen we started the group, I wanted to set up a feedback loop between\nlocal people <em>using</em> OCaml to build systems, with the folk <em>developing</em>\nOCaml itself. 
This has worked out particularly well with a couple of big\nresearch projects in the Lab.</p>\n<h3><a href=\"https://anil.recoil.org/#mirage\"></a>Mirage</h3>\n<p>Mirage is a <a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.pdf\">library operating\nsystem</a> written in\nOCaml that compiles source code into specialised Xen microkernels,\ndeveloped at the Cambridge Computer Lab, Citrix and the <a href=\"http://horizon.ac.uk\">Horizon Digital\nEconomy</a> institute at Nottingham. This year saw\nseveral years of effort culminate in the first release of <a href=\"http://openmirage.org\">Mirage\n1.0</a> as a self-hosting entity. While Mirage\nstarted off as a <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp.pdf\">quick\nexperiment</a> in\nbuilding specialised virtual appliances, it rapidly proved worth\nturning into a real system for use in bigger research projects. You can\nlearn more about Mirage <a href=\"http://openmirage.org/docs\">here</a>, or read the\n<a href=\"http://cacm.acm.org/magazines/2014/1/170866-unikernels/abstract\">Communications of the\nACM</a>\narticle that <a href=\"http://dave.recoil.org\">Dave Scott</a> and I wrote to close\nout the year.</p>\n<p>This project is where the OCaml Labs \u201cfeedback loop\u201d has been strongest.\nA typical <a href=\"http://www.openmirage.org/wiki/hello-world\">Mirage\napplication</a> consists of\naround 50 libraries that are all installed via OPAM. These range from\n<a href=\"https://github.com/mirage/mirage-block-xen\">device drivers</a> to protocol\nlibraries for <a href=\"https://github.com/avsm/ocaml-cohttp\">HTTP</a> or\n<a href=\"https://github.com/mirage/ocaml-dns\">DNS</a>, to filesystems such as\n<a href=\"https://github.com/mirage/ocaml-fat\">FAT32</a>. 
Coordinating <a href=\"http://openmirage.org/blog/mirage-1.0.3-released\">regular\nreleases</a> of all of\nthese would be near impossible without using OPAM; doing so has also forced\nus to use our own tools daily, helping to sort out bugs more quickly.\nYou can see the full list of libraries on the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/\">OCaml Labs software\npage</a>.</p>\n<p>Mirage is also starting to share code with big projects such as\n<a href=\"http://xenserver.org\">XenServer</a> now, and we have been working with\nCitrix engineers to help them move to the\n<a href=\"http://ocaml.janestreet.com\">Core</a> library that Jane Street has\nreleased (and that is covered in <a href=\"https://realworldocaml.org\">Real World\nOCaml</a>). Moving production codebases this\nlarge can take years, but OCaml Labs is turning out to be a good place\nto start bringing some of the bigger users of OCaml together.\nWe\u2019re also now an official <a href=\"http://www.xenproject.org/developers/teams/mirage-os.html\">Xen Project incubator\nproject</a>,\nwhich helps us make the case for functional programming to other Linux\nFoundation efforts.</p>\n<h3><a href=\"https://anil.recoil.org/#nymote-and-user-centric-networking\"></a>Nymote and User Centric Networking</h3>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nymote.webp\" title=\"\">\n\nThe release of Mirage 1.0 has put us on the road to simplifying embedded\nsystems programming. 
The move to the centralized cloud has led to\nregular, well-publicised privacy and security threats to the way <a href=\"http://de2013.org/wp-content/uploads/2013/09/de2013_submission_25-1.pdf\">we\nhandle</a>\nour digital infrastructure, and so <a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon\nCrowcroft</a>, <a href=\"http://www.cs.nott.ac.uk/~rmm/\">Richard\nMortier</a> and I are leading an effort to\nbuild an alternative privacy-preserving infrastructure using embedded\ndevices as part of the <a href=\"http://usercentricnetworking.eu/\">User Centric\nNetworking</a> project, in collaboration\nwith a host of companies led by <a href=\"http://www.thlab.net/\">Technicolor</a>\nParis. This work also plays on the strong points of OCaml: it already\nhas a <a href=\"https://anil.recoil.org/2012/02/25/dreamplug-debian-and-ocaml.html\">fast ARM\nbackend</a>,\nand Mirage can easily be ported to the new Xen/ARM target as hardware\nbecomes available.</p>\n<p>One of the most difficult aspects of programming on the \u201cwide area\u201d\nInternet is dealing with the lack of a distributed identity service\nthat\u2019s fully secure. We published <a href=\"https://anil.recoil.org/papers/2013-foci-signposts.pdf\">our\nthoughts</a> on this\nat the USENIX Free and Open Communications on the Internet workshop, and\nDavid Sheets is working towards a full implementation using Mirage. 
If\nyou\u2019re interested in following this effort, Amir Chaudhry is blogging at\nthe <a href=\"http://nymote.org/\">Nymote</a> project website, where we\u2019ll talk about\nthe components as they are released.</p>\n<h3><a href=\"https://anil.recoil.org/#data-center-networking\"></a>Data Center Networking</h3>\n<p>At the other extreme from embedded programming is datacenter networking,\nand we started the\n<a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K034723/1\">Network-as-a-Service</a>\nresearch project with <a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K032968/1\">Imperial\nCollege</a>\nand\n<a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K031724/1\">Nottingham</a>.\nWith the rapid rise of <a href=\"http://en.wikipedia.org/wiki/Software-defined_networking\">Software Defined\nNetworking</a>\nthis year, we are investigating how application-specific customisation\nof network resources can build faster, better, cheaper infrastructure.\nOCaml is in a good position here: several other groups have built\nOpenFlow controllers in OCaml (most notably, the <a href=\"https://github.com/frenetic-lang\">Frenetic\nProject</a>), and Mirage is specifically\ndesigned to assemble such bespoke infrastructure.</p>\n<p>Another aspect we\u2019ve been considering is how to solve the problem of\noptimal connectivity across nodes. TCP is increasingly considered\nharmful in high-throughput, high-density clusters, and <a href=\"http://www.sussex.ac.uk/informatics/people/peoplelists/person/334868\">George\nParisis</a>\nled the design of\n<a href=\"https://anil.recoil.org/papers/2013-hotnets-trevi.pdf\">Trevi</a>, which is\na fountain-coding based alternative for storage networking. 
Meanwhile,\n<a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> (who joined OCaml Labs in\nNovember), has been working on a branch-consistent data store called\n<a href=\"https://github.com/samoht/irminsule\">Irminsule</a> which supports scalable\ndata sharing and reconciliation using Mirage. Both of these systems will\nsee implementations based on the research done this year.</p>\n<h3><a href=\"https://anil.recoil.org/#higher-kinded-programming\"></a>Higher Kinded Programming</h3>\n<p>Jeremy Yallop and Leo White have been developing an approach that makes\nit possible to write programs with higher-kinded polymorphism (such as\nmonadic functions that are polymorphic in the monad they use) without\nusing functors. It\u2019s early days yet, but there\u2019s a\n<a href=\"https://github.com/ocamllabs/higher\">library</a> available on\n<a href=\"http://opam.ocaml.org/pkg/higher/higher.0.1\">OPAM</a> that implements the\napproach, and a <a href=\"https://github.com/ocamllabs/higher/raw/paper/higher.pdf\">draft\npaper</a> that\noutlines the design.</p>\n<h2><a href=\"https://anil.recoil.org/#priorities-for-2014\"></a>Priorities for 2014</h2>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/camel.webp\" title=\"\">\n\nThis year has been a wild ride to get us up to speed, but we now have a\nsolid sense of what to work on for 2014. 
We\u2019ve decided on a high-level\nset of priorities led by the senior members of the group:</p>\n<ul>\n<li><strong>Multicore</strong>: Leo White will be leading efforts in putting an\nend-to-end multicore-capable OCaml together.</li>\n<li><strong>Metaprogramming</strong>: Jeremy Yallop will direct the metaprogramming\nefforts, continuing with Ctypes and into macros and extension\npoints.</li>\n<li><strong>Platform</strong>: Thomas Gazagnaire will continue to drive OPAM\ndevelopment towards becoming the first <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/madhavapeddy.pdf\">OCaml\nPlatform</a>.</li>\n<li><strong>Online</strong>: Amir Chaudhry will develop the online and community\nefforts that started in 2013.</li>\n</ul>\n<p>These are guidelines for choosing where to spend our time, not an\nexclusion of other work or day-to-day bugfixing. Our focus on collaboration\nwith Jane Street, Citrix, Lexifi, OCamlPro and our existing colleagues\nwill continue, as well as warmly welcoming new community members who\nwish to work with us on any of the projects, either via internships,\nstudentships or good old-fashioned open source hacking.</p>\n<p>I appreciate the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/people/\">whole\nteam's</a> feedback in\nediting this long post into shape, the amazing professorial support from\n<a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon Crowcroft</a>, <a href=\"https://www.cl.cam.ac.uk/~iml1/\">Ian\nLeslie</a> and <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan\nMycroft</a> throughout the year, and of\ncourse the funding and support from Jane Street, Citrix, RCUK, EPSRC,\nDARPA and the EU FP7 that made all this possible. 
Roll on 2014, and\nplease do <a href=\"mailto:avsm2@cl.cam.ac.uk\">get in touch</a> with me with any\nqueries!</p>\n<p>\n<img alt=\"A successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!\" src=\"https://anil.recoil.org/images/fpdays2013-04.webp\" title=\"A successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!\">\nA successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!</p>",+"content": "<p>This time last year in 2012, I had just\n<a href=\"https://anil.recoil.org/2012/10/19/announcing-ocaml-labs.html\">announced</a>\nthe formation of a new group called <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/\">OCaml\nLabs</a> in the <a href=\"http://www.cl.cam.ac.uk\">Cambridge\nComputer Lab</a> that would combine research and\ncommunity work towards the practical application of functional\nprogramming. An incredible year has absolutely flown by, and I\u2019ve put\ntogether this post to summarise what\u2019s gone on, and point to our future\ndirections for 2014.</p>\n<p>The theme of our group was not to be pure research, but rather a hybrid\ngroup that would take on some of the load of day-to-day OCaml\nmaintenance from <a href=\"http://caml.inria.fr\">INRIA</a>, as well as help grow the\nwider OCaml community. 
To this end, all of our projects have been highly\ncollaborative, often involving colleagues from\n<a href=\"http://ocamlpro.com\">OCamlPro</a>, <a href=\"http://gallium.inria.fr/\">INRIA</a>,\n<a href=\"http://janestreet.com\">Jane Street</a>, <a href=\"http://www.lexifi.com/\">Lexifi</a>\nand <a href=\"http://citrix.com\">Citrix</a>.</p>\n<p>This post covers progress in <a href=\"https://anil.recoil.org/#tooling\">tooling</a>, the <a href=\"https://anil.recoil.org/#core_compiler\">compiler and\nlanguage</a>, <a href=\"https://anil.recoil.org/#community_efforts\">community efforts</a>,\n<a href=\"https://anil.recoil.org/#research_projects\">research projects</a> and concludes with our\n<a href=\"https://anil.recoil.org/#priorities_for_2014\">priorities for 2014</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#tooling\"></a>Tooling</h2>\n<p>At the start of 2013, OCaml was in the interesting position of being a\nmature decades-old language with a small, loyal community of industrial\nusers who built mission-critical applications using it. We had the\nopportunity to sit down with many of them at the <a href=\"http://caml.inria.fr/consortium/\">OCaml\nConsortium</a> meeting and prioritise\nwhere we started work. The answer came back clearly: while the compiler\nitself is legendary for its stability, the tooling around it (such as\npackage management) was a pressing problem.</p>\n<h3><a href=\"https://anil.recoil.org/#opam\"></a>OPAM</h3>\n<p>Our solution to this tooling problem was centered around the\n<a href=\"http://opam.ocaml.org\">OPAM</a> package manager that\n<a href=\"http://ocamlpro.com\">OCamlPro</a> released into beta just at the end of\n2012, and which had its first stable release in March 2013. OPAM differs from\nmost system package managers by emphasising a flexible distributed\nworkflow that uses version constraints to ensure incompatible libraries\naren\u2019t mixed up (important for the statically-typed OCaml that is very\ncareful about dependencies). 
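For instance, a package description expresses its dependencies as version formulas (this example is hypothetical and abbreviated, not a real repository entry), and the solver refuses any install plan that cannot satisfy every formula at once:

```
opam-version: "1"
maintainer: "you@example.org"
build: [
  ["ocaml" "setup.ml" "-build"]
]
depends: [
  "lwt" {>= "2.4.0" & < "3.0.0"}
  "cmdliner"
]
```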
Working closely with\n<a href=\"http://ocamlpro.com\">OCamlPro</a> we developed a git-based workflow to\nmake it possible for users (both individual and industrial) to easily\nbuild up their own package repositories and redistribute OCaml code, and\nstarted curating the <a href=\"https://github.com/ocaml/opam-repository\">package\nrepository</a>.</p>\n<p>The results have been satisfying: we started with an initial set of\naround 100 packages in OPAM (mostly imported by the 4 developers), and\nended 2013 with 587 unique packages and 2000 individual versions, with\ncontributions from 160 individuals. We now have a curated <a href=\"https://github.com/ocaml/opam-repository\">central\npackage repository</a> for anyone\nto submit their OCaml code, and several third-party remotes are maintained\n(e.g. the <a href=\"https://github.com/xapi-project/opam-repo-dev\">Xen Project</a>\nand <a href=\"https://github.com/ocsigen/opam-ocsigen\">Ocsigen</a>). We also\nregularly receive releases of the <a href=\"http://ocaml.janestreet.com\">Core</a>\nlibraries from Jane Street, and updates from sources as varied as\n<a href=\"https://github.com/ocaml/opam-repository/pull/1300\">Facebook</a>,\n<a href=\"https://anil.recoil.org/2013/09/16/camlpdf-the-end-of-sucky-pdf-tools.html\">Coherent\nPDF</a>\nand the <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/guha.pdf\">Frenetic\nSDN</a> research.</p>\n<p><img alt=\"\" src=\"https://anil.recoil.org/images/opam11-contributors-dec13.webp\" title=\"Number of unique contributors to the central OPAM package repository\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/opam11-packages-dec13.webp\" title=\"Total number of unique packages (including multiple versions of the same package)\">\n<img alt=\"\" src=\"https://anil.recoil.org/images/opam11-unique-packages-dec13.webp\" title=\"Total packages with multiple versions coalesced so you can see new package growth\"></p>\n<p>A notable contribution from OCamlPro during this time was to\n<a 
href=\"https://github.com/ocaml/opam-repository/issues/955\">clarify</a> the\nlicensing on the package repository to be the liberal\n<a href=\"http://creativecommons.org/choose/zero/\">CC0</a>, and also to pass\nownership to the <a href=\"http://github.com/ocaml\">OCaml</a> organization on\nGitHub, where it\u2019s now jointly maintained by OCaml Labs, OCamlPro and\nanyone else that wishes to contribute.</p>\n<h3><a href=\"https://anil.recoil.org/#a-lens-into-global-ocaml-code\"></a>A lens into global OCaml code</h3>\n<p>It\u2019s been quite interesting just watching all the varied code fly into\nthe repository, but stability quickly became a concern as the new\npackages piled up. OCaml compiles to native code on not just x86, but\nalso PowerPC, Sparc and\n<a href=\"https://anil.recoil.org/2012/02/25/dreamplug-debian-and-ocaml.html\">ARM</a>\nCPUs. We kicked off various efforts into automated testing: firstly\n<a href=\"https://github.com/dsheets\">David Sheets</a> built the\n<a href=\"https://github.com/ocaml/v2.ocaml.org/blob/master/site/meetings/ocaml/2013/proposals/ocamlot.pdf\">OCamlot</a>\ndaemon that would schedule builds across all the exotic hardware. Later\nin the year, the <a href=\"http://travis-ci.org\">Travis</a> service launched support\nfor testing from GitHub pull requests, and this became the front line of\n<a href=\"https://web.archive.org/web/20181114154831/https://anil.recoil.org/2013/09/30/travis-and-ocaml.html\">automated\nchecking</a> for\nall incoming new packages to OPAM.</p>\n<p>A major headache with automated testing is usually setting up the right\nbuild environment with external library dependencies, and so we <a href=\"https://anil.recoil.org/2013/11/15/docker-and-opam.html\">added\nDocker support</a>\nto make it easier to bulk-build packages for local developer use, with\nthe results of builds available\n<a href=\"https://github.com/avsm/opam-bulk-logs\">publically</a> for anyone to help\ntriage. 
Unfortunately, fixing the bugs themselves is still a <a href=\"https://github.com/ocaml/opam-repository/issues/1304\">very manual\nprocess</a>, so more\nvolunteers are always welcome to help out!</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/travis-mascot-200px.webp\" title=\"\">\n\nWe\u2019re really going to see the rewards from all this effort as\nOCaml 4.02 development proceeds, since we can now adopt a data-driven\napproach to changing language features instead of guessing how much\nthird-party code will break. If your code is in OPAM, then it\u2019ll be\ntested as new features such as <a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module\naliases</a>,\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">injectivity</a>\nand <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">extension\npoints</a> show up.</p>\n<h3><a href=\"https://anil.recoil.org/#better-documentation\"></a>Better documentation</h3>\n<p>The venerable\n<a href=\"http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual029.html\">OCamlDoc</a>\ntool has done an admirable job for the last decade, but is increasingly\nshowing its age due to a lack of support for cross-referencing across\npackages. We started working on this problem in the summer when <a href=\"https://github.com/vincent-botbol\">Vincent\nBotbol</a> visited us on an internship,\nexpecting it to be a quick job to come up with something as good as\nHaskell\u2019s excellent <a href=\"http://www.haskell.org/haddock/\">Haddock</a> online\ndocumentation.</p>\n<p>Instead, we ran into the \u201cmodule wall\u201d: since OCaml makes it so easy to\nparameterise code over other modules, it is hard to generate\nstatic documentation without outputting hundreds of megabytes of HTML\nevery time. After some hard work from Vincent and Leo, we\u2019ve got a\nworking prototype that lets you simply run\n<code>opam install opam-doc && opam doc core async</code> to generate package\ndocumentation. 
You can see the results for\n<a href=\"http://mirage.github.io/\">Mirage</a> online, but expect to see this\nintegrated into the main OCaml site for all OPAM packages as we work\nthrough polishing up the user interface.</p>\n<h3><a href=\"https://anil.recoil.org/#turning-opam-into-libraries\"></a>Turning OPAM into libraries</h3>\n<p>The other behind-the-scenes effort for OPAM has been to keep the core\ncommand-line tool simple and stable, and to have it install OCaml\nlibraries that other tools can build on to do\ndomain-specific tasks. <a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a>,\n<a href=\"http://louis.gesbert.fr/cv.en.html\">Louis Gesbert</a> and <a href=\"https://github.com/dsheets\">David\nSheets</a> have been steadily hacking away at\nthis and we now have <a href=\"https://github.com/ocamllabs/opamfu\">opamfu</a> to\nrun operations over all packages, and an easy-to-template\n<a href=\"https://github.com/ocaml/opam2web\">opam2web</a> that generates the live\n<a href=\"http://opam.ocaml.org\">opam.ocaml.org</a> website.</p>\n<p>This makes OPAM easier to deploy within other organizations that want to\nintegrate it into their workflow. For example, the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/\">software\nsection</a> of the OCaml\nLabs website is regularly generated from a search of all OPAM packages\ntagged <code>ocamllabs</code>. We also used it to rewrite the entire OPAM\nrepository <a href=\"https://github.com/ocaml/opam-repository/pull/1240\">in one epic\ndiff</a> to add\nexternal library dependencies via a <a href=\"https://github.com/ocaml/opam/pull/886/files\">command-line\nshim</a>.</p>\n<h3><a href=\"https://anil.recoil.org/#opam-in-a-box\"></a>OPAM-in-a-Box</h3>\n<p>All of this effort is geared towards making it easier to maintain\nreusable local OPAM installations. 
After several requests from big\nuniversities to help out their teaching needs, we\u2019re putting together\nall the support needed to easily redistribute OPAM packages via an\n\u201c<a href=\"https://github.com/ocaml/opam/issues/1035\">OPAM-in-a-Box</a>\u201d command\nthat uses <a href=\"http://docker.io\">Docker</a> containers to let you clone and do\nlightweight modifications of OCaml installations.</p>\n<p>This will also be useful for anyone who\u2019d like to run tutorials or teach\nOCaml, without having to rely on flaky network connectivity at\nconference venues: a problem we\u2019ve <a href=\"http://amirchaudhry.com/fpdays-review\">suffered\nfrom</a> too!</p>\n<h2><a href=\"https://anil.recoil.org/#core-compiler\"></a>Core Compiler</h2>\n<p>\n<img alt=\"Compiler hacking at the Cambridge Makespace\" src=\"https://anil.recoil.org/images/compiler-hacking.webp\" title=\"Compiler hacking at the Cambridge Makespace\">\nCompiler hacking at the Cambridge Makespace\nStarting to work on a real compiler can often be a daunting prospect,\nand so one initiative we started this year was to host regular <a href=\"http://ocamllabs.github.io/compiler-hacking/2013/10/30/third-compiler-hacking-session.html\">compiler\nhacking\nsessions</a>\nwhere people could find a <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki\">curated list of\nfeatures</a> to work\non, with the regular developers at hand to help out when people get\nstuck, and free beer and pizza to oil the coding wheels. This has worked\nout well, with around 20 people showing up on average for the three sessions we\nheld, and <a href=\"https://github.com/ocamllabs/compiler-hacking/wiki/Things-previously-worked-on\">several\npatches</a>\nsubmitted upstream to OCaml. 
<a href=\"http://gallium.inria.fr/~scherer/\">Gabriel\nScherer</a> and <a href=\"http://cristal.inria.fr/~doligez/\">Damien\nDoligez</a> have been helping this\neffort by tagging <a href=\"http://caml.inria.fr/mantis/search.php?project_id=1&sticky_issues=1&sortby=last_updated&dir=DESC&highlight_changed=24&hide_status_id=90&tag_string=junior_job\">junior\njobs</a>\nin the OCaml Mantis bug tracker as they are filed.</p>\n<h3><a href=\"https://anil.recoil.org/#syntax-transformations-and-extension-points\"></a>Syntax transformations and extension points</h3>\n<p><a href=\"http://www.lpw25.net\">Leo White</a> started the year fresh out of\ncompleting his PhD with <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan Mycroft</a>,\nand before he realized what he\u2019d gotten himself into was working with\n<a href=\"http://alain.frisch.fr/\">Alain Frisch</a> on the future of syntax\ntransformations in OCaml. We started off our first\n<a href=\"http://lists.ocaml.org/listinfo/wg-camlp4\">wg-camlp4</a> working group on\nthe new <a href=\"http://lists.ocaml.org\">lists.ocaml.org</a> host, and a spirited\ndiscussion\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-January/thread.html\">started</a>\nthat went\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-February/thread.html\">on</a>\nand\n<a href=\"http://lists.ocaml.org/pipermail/wg-camlp4/2013-March/thread.html\">on</a>\nfor several months. It ended with a very satisfying design for a simpler\n<em>extension points</em> mechanism which Leo\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/white.pdf\">presented</a> at\nthe OCaml 2013 workshop at ICFP, and is now merged into OCaml\n4.02-trunk.</p>\n<h3><a href=\"https://anil.recoil.org/#namespaces\"></a>Namespaces</h3>\n<p>Not all of the working groups were quite as successful in coming to a\nconclusion as the Camlp4 one. 
On the Platform mailing list, Gabriel\nScherer started a discussion on the design for\n<a href=\"http://lists.ocaml.org/pipermail/platform/2013-February/000050.html\">namespaces</a>\nin OCaml. The resulting discussion was useful in separating multiple\nconcerns that were intermingled in the initial proposal, and Leo wrote a\n<a href=\"http://www.lpw25.net/2013/03/10/ocaml-namespaces.html\">comprehensive blog\npost</a> on a\nproposed namespace design.</p>\n<p>After further discussion at <a href=\"http://icfpconference.org/icfp2013/\">ICFP\n2013</a> with Jacques Garrigue later\nin the year, it turns out adding support for <a href=\"http://caml.inria.fr/mantis/view.php?id=6063\">module\naliases</a> would remove much\nof the cost associated with compiling large libraries such as\n<a href=\"http://ocaml.janestreet.com\">Core</a>, with no backwards compatibility\nissues. This solution has now been integrated into OCaml 4.02.0dev and\nis being tested with Core.</p>\n<h3><a href=\"https://anil.recoil.org/#delving-into-the-bug-tracker\"></a>Delving into the bug tracker</h3>\n<p>Jeremy Yallop joined us in April, and he and Leo also leapt into the\ncore compiler and started triaging issues on the OCaml <a href=\"http://caml.inria.fr/mantis\">bug\ntracker</a>. This seems unglamorous in the\nbeginning, but there rapidly turned out to be many fascinating threads\nthat shed light on OCaml\u2019s design and implementation through seemingly\nharmless bugs. 
Here is a pick of some interesting threads through the\nyear that we\u2019ve been involved with:</p>\n<ul>\n<li>An <a href=\"http://caml.inria.fr/mantis/view.php?id=5985&nbn=49#bugnotes\">unexpected interaction between variance and GADTs</a>\nthat led to Jacques Garrigue\u2019s\n<a href=\"http://ocaml.org/meetings/ocaml/2013/slides/garrigue.pdf\">talk</a> at\nOCaml 2013.</li>\n<li>Type unsoundness by <a href=\"http://caml.inria.fr/mantis/view.php?id=5992\">pattern matching lazy mutable\nvalues</a>, thus shedding\nlight on the precise semantics of the order of pattern matching.</li>\n<li>Leo proposed an <a href=\"http://caml.inria.fr/mantis/view.php?id=5584\">open types</a> extension to\nallow abstract types to be declared open. You can try it via\n<code>opam switch 4.00.1+open-types</code>.</li>\n<li>Designing the popular, but controversial <a href=\"http://caml.inria.fr/mantis/view.php?id=5759\">record disambiguation feature</a> in OCaml\n4.01.0, and debating <a href=\"http://caml.inria.fr/mantis/view.php?id=6000\">the right warnings</a> needed to\nprevent programmer surprise.</li>\n<li>Exposing a <a href=\"http://caml.inria.fr/mantis/view.php?id=6064\">GADT representation for Bigarray</a>.</li>\n</ul>\n<p>This is just a sample of some of the issues solved in Mantis; if you\nwant to learn more about OCaml, it\u2019s well worth browsing through it to\nlearn from over a decade of interesting discussions from all the\ndevelopers.</p>\n<h3><a href=\"https://anil.recoil.org/#thread-local-storage-runtime\"></a>Thread-local storage runtime</h3>\n<p>While OCamlPro was working on their <a href=\"https://github.com/lucasaiu/ocaml\">reentrant OCaml\nruntime</a>, we took a different tack by\nadding <a href=\"https://github.com/ocamllabs/ocaml/tree/multicore\">thread-local\nstorage</a> to the\nruntime instead, courtesy of <a href=\"http://mu.netsoc.ie/\">Stephen Dolan</a>. 
This\nis an important choice to make at the outset of adding multicore, so\nboth approaches are warranted. The preemptive runtime adds a lot of code\nchurn (due to adding a context parameter to most function calls) and\ntakes up a register, whereas the thread-local storage approach we tried\ndoesn\u2019t permit callbacks to different threads.</p>\n<p>Much of this work isn\u2019t interesting on its own, but forms the basis for\na fully multicore runtime (with associated programming model) in 2014.\nStay tuned!</p>\n<h3><a href=\"https://anil.recoil.org/#ctypes\"></a>Ctypes</h3>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/c.webp\" title=\"\">\n\nOne other complaint from the Consortium members was quite surprising:\nthe difficulty of using the OCaml foreign function interface safely to\ninterface with C code. Jeremy Yallop began working on the\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes\">ctypes</a> library that had the\ngoal of eliminating the need to write any C code at all for the vast\nmajority of foreign bindings.</p>\n<p>Instead, Ctypes lets you describe any C function call as an OCaml value,\nand provides various linkage options to invoke that function into C. The\nfirst option he implemented was a <code>dlopen</code> interface, which immediately\nbrought us the same level of functionality as the\n<a href=\"http://docs.python.org/2/library/ctypes.html\">Python</a> or\n<a href=\"http://www.haskell.org/haskellwiki/Library/libffi\">Haskell</a> Ctypes\nequivalents. 
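For a flavour of the style this enables (a minimal sketch of my own rather than anything from the library's documentation, assuming the two libc functions resolve through the default dlopen handle):

```ocaml
(* Describe each C function's type as an OCaml value and bind it at
   runtime via dlopen: no C stub code is written at all. *)
open Ctypes
open Foreign

(* int abs(int x) and int puts(const char *s) from the C library. *)
let abs' = foreign "abs" (int @-> returning int)
let puts = foreign "puts" (string @-> returning int)

let () =
  assert (abs' (-42) = 42);
  ignore (puts "hello from ctypes")
```

Because the binding is an ordinary value, the same description works unchanged if a different linkage mechanism is substituted later.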
This early code was in itself startlingly useful and more\npleasant to use than the raw FFI, and various projects (such as David\nSheets\u2019 <a href=\"https://github.com/dsheets/ocaml-sodium\">libsodium</a>\ncryptography bindings) started adopting it.</p>\n<p>At this point, I happened to be struggling to write the Foreign Function\nInterface chapter of <a href=\"https://realworldocaml.org\">Real World OCaml</a>\nwithout blowing through our page budget with a comprehensive explanation\nof the existing system. I decided to take a risk and write about Ctypes\ninstead, since it let new users of the language have a <em>far</em> more\nproductive experience getting started. Xavier Leroy pointed out <a href=\"https://github.com/realworldocaml/book/issues/1701\">some\nshortcomings</a> of the\nlibrary in his technical book review, most notably with the lack of an\ninterface with C macros. The design of Ctypes fully supports alternate\nlinking mechanisms beyond just <code>dlopen</code>, though, and Jeremy has added\nautomatic C stub generation support as well. This means that if you use\nCtypes to build an OCaml binding in 2014, you can choose several\nmechanisms for the same source code to link to the external system.\nJeremy even demonstrated a forking model at OCaml 2013 that protects the\nOCaml runtime from the C binding via process separation.</p>\n<p>The effort is paying off: Daniel B\u00fcnzli <a href=\"http://alan.petitepomme.net/cwn/2013.12.17.html#9\">ported\nSDL2</a> using ctypes,\nand gave us extensive\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes/issues\">feedback</a> about any\nmissing corner cases, and the resulting bindings don\u2019t require any C\ncode to be written. 
<a href=\"http://xulforum.org\">Jonathan Protzenko</a> even used\nit to implement an OCaml controller for the <a href=\"http://gallium.inria.fr/blog/raspi-lcd/\">Adafruit Raspberry Pi RGB\nLCD</a>!</p>\n<h2><a href=\"https://anil.recoil.org/#community-efforts\"></a>Community Efforts</h2>\n<p>Our community efforts were largely online, but we also hosted visitors\nover the year and regular face-to-face tutorials.</p>\n<h3><a href=\"https://anil.recoil.org/#online-at-ocamlorg\"></a>Online at OCaml.org</h3>\n<p>While the rest of the crew were hacking on OPAM and OCaml, <a href=\"http://amirchaudhry.com/\">Amir\nChaudhry</a> and <a href=\"http://philippewang.info/CL/\">Philippe\nWang</a> teamed up with Ashish Agarwal and\nChristophe Troestler to redesign and relaunch the <a href=\"http://ocaml.org\">OCaml\nwebsite</a>. Historically, OCaml\u2019s homepage has been the\n<a href=\"http://caml.inria.fr\">caml.inria.fr</a> domain, and the\n<a href=\"http://ocaml.org\">ocaml.org</a> effort was begun by Christophe and Ashish\n<a href=\"https://www.mail-archive.com/caml-list@inria.fr/msg00169.html\">some years\nago</a> to\nmodernize the web presence.</p>\n<p>The webpages were already rather large with complex scripting (for\nexample, the <a href=\"http://ocaml.org/learn/tutorials/99problems.html\">99\nProblems</a> page runs\nthe OCaml code to autogenerate the output). Philippe developed a\n<a href=\"https://github.com/pw374/MPP-language-blender\">template DSL</a> that made\nit easier to unify a lot of the templates around the website, and also a\n<a href=\"https://github.com/pw374/omd\">Markdown parser</a> that we could link to as\na library from the rest of the infrastructure without shelling out to\nPandoc.</p>\n<p>Meanwhile, Amir designed a series of <a href=\"http://amirchaudhry.com/wireframe-demos-for-ocamlorg/\">interactive wireframe\nsketches</a> and\n<a href=\"http://amirchaudhry.com/ocamlorg-request-for-feedback/\">gathered feedback</a> on it\nfrom the community. 
A local <a href=\"http://onespacemedia.com\">design agency</a> in\nCambridge helped with visual look and feel, and finally at the end of\nthe summer we began the\n<a href=\"http://amirchaudhry.com/migration-plan-ocaml-org/\">migration</a> to the\nnew website, followed by a triumphant\n<a href=\"http://amirchaudhry.com/announcing-new-ocamlorg/\">switchover</a> in\nNovember to the design you see today.</p>\n<p>The domain isn\u2019t just limited to the website itself. Leo and I set up a\n<a href=\"https://github.com/ocaml/ocaml.org-scripts\">SVN-to-Git mirror</a> of the\nOCaml compiler <a href=\"http://caml.inria.fr/ocaml/anonsvn.en.html\">Subversion\nrepository</a> on the GitHub\n<a href=\"https://github.com/ocaml/ocaml\">OCaml organization</a>, which is proving\npopular with developers. There is an ongoing effort to simplify the core\ncompiler tree by splitting out some of the larger components, and so\n<a href=\"http://github.com/ocaml/camlp4\">camlp4</a> is also now hosted on that\norganization, along with <a href=\"https://github.com/ocaml/oasis\">OASIS</a>. We\nalso administer several subdomains of <a href=\"http://ocaml.org\">ocaml.org</a>,\nsuch as the <a href=\"http://lists.ocaml.org\">mailing lists</a> and the <a href=\"http://opam.ocaml.org\">OPAM\nrepository</a>, and other services such as the\n<a href=\"http://forge.ocamlcore.org\">OCaml Forge</a> are currently migrating over.\nThis was made significantly easier thanks to sponsorship from <a href=\"http://rackspace.com\">Rackspace\nCloud</a> (users of <a href=\"http://xenserver.org\">XenServer</a>\nwhich is written in OCaml). They saw our struggles with managing\nphysical machines and gave us developer accounts, and all of the\nocaml.org infrastructure is now hosted on Rackspace. 
We\u2019re very grateful\nto their ongoing help!</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/rackspace.webp\" title=\"\">\n\nIf you\u2019d like to help with the infrastructure (for example, I\u2019m\nexperimenting with a <a href=\"http://git.ocaml.org/public/\">GitLab</a> mirror),\nthen please join the\n<a href=\"http://lists.ocaml.org/listinfo/infrastructure\">infrastructure@lists.ocaml.org</a>\nmailing list and share your thoughts. The website team also need help\nwith adding content and <a href=\"https://github.com/ocaml/ocaml.org/issues/376\">international\ntranslations</a>, so head\nover to the <a href=\"http://github.com/ocaml/ocaml.org/issues\">website issue\ntracker</a> and start proposing\nimprovements you\u2019d like to see.</p>\n<h3><a href=\"https://anil.recoil.org/#next-steps-for-ocamlorg\"></a>Next steps for ocaml.org</h3>\n<p>The floodgates requesting features opened up after the launch of the new\nlook and feel. Pretty much everyone wanted deeper OPAM integration into\nthe main website, for features such as:</p>\n<ul>\n<li>Starring and reviewing packages</li>\n<li>Integrating the <a href=\"https://github.com/ocamllabs/opam-doc\">opam-doc</a>\ndocumentation with the metadata</li>\n<li>Displaying test results and a compatibility matrix for non-x86 and\nnon-Linux architectures</li>\n<li>Linking to blog posts and tutorials about the package</li>\n</ul>\n<p>Many of these features were part of the <a href=\"http://amirchaudhry.com/wireframe-demos-for-ocamlorg/\">original\nwireframes</a> but\nwe\u2019re being careful to take a long-term view of how they should be\ncreated and maintained. Rather than building all of this as a huge\nbloated <a href=\"https://github.com/ocaml/opam2web\">opam2web</a> extension, David\nSheets (our resident reluctant-to-admit-it web expert) has designed an\noverlay directory scheme that permits the overlaying of different\nmetadata onto the website. 
This lets one particular feature (such as\nblog post aggregation) be handled separately from the others via Atom\naggregators.</p>\n<h3><a href=\"https://anil.recoil.org/#real-world-ocaml\"></a>Real World OCaml</h3>\n<p><img alt=\"%r\" src=\"https://anil.recoil.org/papers/rwo\">\nA big effort that took up most of the year for me was finishing and\npublishing an O\u2019Reilly book called <a href=\"https://realworldocaml.org\">Real World\nOCaml</a> with <a href=\"https://ocaml.janestreet.com/?q=blog/5\">Yaron\nMinsky</a> and Jason Hickey. Yaron\ndescribes how it all started in <a href=\"https://ocaml.janestreet.com/?q=node/117\">his blog\npost</a>, but I learnt a lot from\ndeveloping a book using the <a href=\"https://web.archive.org/web/20160324164610/https://anil.recoil.org/2013/08/06/real-world-ocaml-beta2.html\">open commenting\nscheme</a>\nthat we developed just for this.</p>\n<p>In particular, the book ended up shining a bright light into dark\nlanguage corners that we might otherwise not have explored in OCaml\nLabs. Two chapters of the book that I wasn\u2019t satisfied with were the\n<a href=\"https://realworldocaml.org/v1/en/html/objects.html\">objects</a> and\n<a href=\"https://realworldocaml.org/v1/en/html/classes.html\">classes</a> chapters,\nlargely since neither Yaron nor Jason nor I had ever really used their\nfull power in our own code. Luckily, Leo White decided to pick up the\nbaton and champion these oft-maligned (but very powerful) features of\nOCaml, and the result is the clearest explanation of them that I\u2019ve read\nyet. Meanwhile, Jeremy Yallop helped out with extensive review of the\n<a href=\"https://realworldocaml.org/v1/en/html/foreign-function-interface.html\">Foreign Function\nInterface</a>\nchapter that used his\n<a href=\"https://github.com/ocamllabs/ocaml-ctypes\">ctypes</a> library. 
Finally,\n<a href=\"https://plus.google.com/100586365409172579442/posts\">Jeremie Diminio</a>\nat Jane Street worked hard on adding several features to his\n<a href=\"https://github.com/diml/utop\">utop</a> toplevel that made it compelling\nenough to become our default recommendation for newcomers.</p>\n<p>All in all, we ended up closing over <a href=\"https://web.archive.org/web/20160101000000*/https://anil.recoil.org/2013/08/06/real-world-ocaml-beta2.html\">2000\ncomments</a>\nin the process of writing the book, and I\u2019m very proud of the result\n(freely available <a href=\"https://realworldocaml.org\">online</a>, but do <a href=\"http://www.amazon.com/Real-World-OCaml-Functional-programming/dp/144932391X/\">buy a\ncopy</a>\nif you can to support it). Still, there\u2019s more I\u2019d like to do in 2014 to\nimprove the ease of using OCaml further. In particular, I removed a\nchapter on packaging and build systems since I wasn\u2019t happy with its\nquality, and both <a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> and I\nintend to spend time in 2014 on improving this part of the ecosystem.</p>\n<h3><a href=\"https://anil.recoil.org/#tutorials-and-talks\"></a>Tutorials and Talks</h3>\n<p>\n<img alt=\"Julien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.\" src=\"https://anil.recoil.org/images/pfff.webp\" title=\"Julien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.\">\nJulien Verlaguet and Yoann Padioleau show off Pfff code visualisation at Facebook.\nWe had a lively presence at <a href=\"http://icfpconference.org\">ICFP 2013</a> this\nyear, with the third iteration of the <a href=\"http://ocaml.org/meetings/ocaml/2013/program.html\">OCaml\n2013</a> held there, and\nStephen Dolan presenting a paper in the main conference. 
I <a href=\"http://www.syslog.cl.cam.ac.uk/2013/09/24/liveblogging-ocaml-workshop-2013/\">liveblogged\nOCaml\n2013</a>\nand <a href=\"http://www.syslog.cl.cam.ac.uk/2013/09/22/liveblogging-cufp-2013/\">CUFP\n2013</a>\nas they happened, and all the\n<a href=\"http://ocaml.org/meetings/ocaml/2013/program.html\">talks</a> we gave are\nlinked from the program. The most exciting part of the conference for a\nlot of us was the two talks by Facebook on their use of OCaml: first\nfor <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/padioleau.pdf\">program analysis using\nPfff</a> and\nthen to migrate their massive PHP codebase <a href=\"http://www.youtube.com/watch?feature=player_detailpage&v=gKWNjFagR9k#t=1150\">using an OCaml\ncompiler</a>.\nI also had the opportunity to participate in a panel at the Haskell\nWorkshop on whether <a href=\"http://ezyang.tumblr.com/post/62157468762/haskell-haskell-and-ghc-too-big-to-fail-panel\">Haskell is too big to fail\nyet</a>;\nlots of interesting perspectives on scaling another formerly academic\nlanguage into the real world.</p>\n<p><a href=\"https://github.com/yminsky\">Yaron Minsky</a> and I have been\ngiving tutorials on OCaml at ICFP for several years, but the release of\nReal World OCaml has made it significantly easier to give tutorials\nwithout the sort of labor intensity that it took in previous years (one\nmemorable ICFP 2011 tutorial that we did took almost 2 hours to get\neveryone installed with OCaml; at ICFP 2013, it took us 15 minutes or so\nto get everyone started). Still, giving tutorials at ICFP is very much\npreaching to the choir, and so we\u2019ve started speaking at more\ngeneral-purpose events.</p>\n<p>\n<img alt=\"Marius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. Maybe.\" src=\"https://anil.recoil.org/images/marius-yaron-icfp.webp\" title=\"Marius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. 
Maybe.\">\nMarius Eriksen and Yaron Minsky start a Scala vs OCaml rap battle at the ICFP industrial fair. Maybe.\nOur first local effort was <a href=\"http://fpdays.net/2013/\">FPDays</a> in\nCambridge, where Jeremy Yallop and Amir Chaudhry ran the tutorial with\nhelp from Phillipe Wang, Leo White and David Sheets. The OCaml session\nthere ended up being the biggest one in the entire two days, and Amir\n<a href=\"http://amirchaudhry.com/fpdays-review/\">wrote up</a> their experiences.\nOne interesting change from our ICFP tutorial is that Jeremy used\n<a href=\"https://github.com/ocsigen/js_of_ocaml\">js_of_ocaml</a> to teach OCaml\nvia JavaScript by building a fun <a href=\"https://github.com/ocamllabs/fpdays-skeleton\">Monty\nHall</a> game.</p>\n<h3><a href=\"https://anil.recoil.org/#visitors-and-interns\"></a>Visitors and Interns</h3>\n<p>\n<img alt=\"Thomas Gazagnaire presents at Jane Street\" src=\"https://anil.recoil.org/images/thomas-nycoug-2013.webp\" title=\"Thomas Gazagnaire presents at Jane Street\">\nThomas Gazagnaire presents at Jane Street\nSince OCaml Labs is a normal group within the <a href=\"http://www.cl.cam.ac.uk\">Cambridge Computer\nLab</a>, we often host academic visitors and\ninterns who pass through. 
This year was certainly diverse, and we\nwelcomed a range of colleagues:</p>\n<ul>\n<li><a href=\"http://www.lip6.fr/actualite/personnes-fiche.php?ident=D1161&LANG=en\">Mathias\nBourgoin</a>\nhas just finished his work on interfacing OCaml with GPUs, and gave\nus a seminar on how his\n<a href=\"http://www.algo-prog.info/spoc/web/index.php?id=spoc\">SPOC</a> tool\nworks (also available in OPAM via a <a href=\"http://www.algo-prog.info/spoc/distribution/opam/\">custom\nremote</a>).</li>\n<li><a href=\"http://www.benjamin.canou.fr/\">Benjamin Canou</a> (now at OCamlPro)\npractised his <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/canou.pdf\">OCaml 2013\ntalk</a> on\nbuilding high-level interfaces to JavaScript with OCaml by giving a\ndepartmental seminar.</li>\n<li><a href=\"http://www.dicosmo.org/\">Roberto Di Cosmo</a>, who directs the\n<a href=\"http://www.irill.org/\">IRILL</a> organization on Free Software in\nParis, delivered a seminar on constraint solving for <a href=\"http://mancoosi.org\">package\nsystems</a> that are as large-scale as Debian\u2019s.</li>\n<li><a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> visited during the summer\nto help plot the <a href=\"http://openmirage.org/blog/mirage-1.0.3-released\">Mirage\n1.0</a> and <a href=\"https://anil.recoil.org/2013/09/20/opam-1-1-beta.html\">OPAM\n1.1</a> releases.\nHe has also since joined OCaml Labs fulltime to work on\n<a href=\"http://nymote.org\">Nymote</a>.</li>\n<li><a href=\"http://louis.gesbert.fr/cv.en.html\">Louis Gesbert</a> from OCamlPro\nvisited for 2 weeks in December and kicked off the inaugural OPAM\ndevelopers summit (which was, admittedly, just 5 developers in the\n<a href=\"http://www.kingston-arms.co.uk/\">Kingston Arms</a>, but all good\nthings start in a pub, right?)</li>\n<li><a href=\"http://www.xulforum.org/\">Jonathan Protzenko</a> presented his PhD\nwork on <a href=\"http://protz.github.io/mezzo/\">Mezzo</a> (which is now <a href=\"http://gallium.inria.fr/blog/mezzo-on-opam/\">merged\ninto OPAM</a>), and\neducated us on the vagaries of <a href=\"http://protz.github.io/ocaml-installer/\">Windows\nsupport</a>.</li>\n<li><a href=\"http://gallium.inria.fr/~scherer/\">Gabriel Scherer</a> from the\nGallium INRIA group visited to discuss the direction of OPAM and\nvarious language features (such as namespaces). He didn\u2019t\ngive a talk, but promises to do so next time!</li>\n<li><a href=\"https://github.com/bvaugon\">Beno\u00eet Vaugon</a> gave a seminar on his\n<a href=\"http://oud.ocaml.org/2012/slides/oud2012-paper10-slides.pdf\">OCamlCC</a>\nOCaml-to-C compiler, talked about porting OCaml to <a href=\"http://www.algo-prog.info/ocaml_for_pic/web/index.php?id=ocapic\">8-bit\nPICs</a>,\nand using GADTs to <a href=\"http://caml.inria.fr/mantis/view.php?id=6017\">implement\nPrintf</a> properly.</li>\n</ul>\n<p>We were also visited several times by <a href=\"http://danmey.org/\">Wojciech\nMeyer</a> from ARM, who was an OCaml developer who\nmaintained (among other things) the\n<a href=\"http://brion.inria.fr/gallium/index.php/Ocamlbuild\">ocamlbuild</a> system\nand worked on <a href=\"http://www.youtube.com/watch?v=d9Hg5L76FG8\">DragonKit</a>\n(an extensible LLVM-like compiler written in OCaml). Wojciech very sadly\npassed away on November 18th, and we all fondly remember his\nenthusiastic and intelligent contributions to our small Cambridge\ncommunity.</p>\n<p>We also hosted visitors to live in Cambridge and work with us over the\nsummer. In addition to Vincent Botbol (who worked on OPAM-doc as\ndescribed earlier) we had the pleasure of having <a href=\"http://erratique.ch/\">Daniel\nB\u00fcnzli</a> and <a href=\"http://www.x9c.fr/\">Xavier Clerc</a>\nwork here. 
Here\u2019s what they did in their own words.</p>\n<h4><a href=\"https://anil.recoil.org/#xavier-clerc-ocamljava\"></a>Xavier Clerc: OCamlJava</h4>\n<p>Xavier Clerc took a break from his regular duties at INRIA to join us\nover the summer to work on\n<a href=\"http://ocamljava.x9c.fr/preview/\">OCaml-Java</a> and adapt it to the\nlatest JVM features. This is an incredibly important project to bridge\nOCaml with the huge Java community, and here\u2019s his report:</p>\n<blockquote>\n<p>After a four-month visit to the OCaml Labs dedicated to the\n<a href=\"http://ocamljava.x9c.fr/preview/\">OCaml-Java</a> project, the time has\ncome for an appraisal! The undertaken work can be split into two\nareas: improvements to code generation, and interaction between the\nOCaml & Java languages. Regarding code generation, several classical\noptimizations have been added to the compiler, for example loop\nunrolling, more aggressive unboxing, better handling of globals, or\npartial evaluation (at the bytecode level). A new tool, namely\nocamljar, has been introduced, allowing post-compilation optimizations.\nThe underlying idea is that some optimizations cannot always be\napplied (e.g. depending on whether multiple threads/programs will\ncoexist), but enabling them through command-line flags would lead to\nrecompilation and/or multiple installations of each library according\nto the set of chosen optimizations. It is thus far easier to\nfirst build an executable jar file, and then modify it according to\nthese optimizations. Furthermore, this workflow allows the ocamljar\ntool to take advantage of whole-program information for some\noptimizations. All these improvements, combined, often lead to a gain\nof roughly 1/3 in terms of execution time.</p>\n<p>Regarding language interoperability, there are actually two directions\ndepending on whether you want to call OCaml code from Java, or want to\ncall Java code from OCaml. 
For the first direction, a tool allows you to\ngenerate Java source files from OCaml compiled interfaces, mapping the\nvarious constructs of the OCaml language to Java classes. It is then\npossible to call functions, and to manipulate instances of OCaml types\nin pure Java, still benefiting from the type safety provided by the\nOCaml language. In the other direction, an extension of the OCaml\ntyper is provided, allowing you to create and manipulate Java instances\ndirectly from OCaml sources. This typer extension is indeed a thin\nlayer upon the original OCaml typer, which is mainly responsible for\nencoding Java types into OCaml types. This encoding uses a number of\nadvanced elements such as polymorphic variants, subtyping, variance\nannotations, phantom typing, and printf-hack, but the end-user does\nnot have to be aware of this encoding. On the surface, the type of\ninstances of the Java Object classes is\n<code>java'lang'Object java_instance</code>, and instances can be created by\ncalling Java.make <code>Object()</code>.</p>\n<p>While still under heavy development, a working prototype <a href=\"http://ocamljava.x9c.fr/preview/\">is\navailable</a>, and bugs <a href=\"http://bugs.x9c.fr/\">can be\nreported</a>. Finally, I would like to thank the\nOCaml Labs for providing a great working environment.</p>\n</blockquote>\n<h4><a href=\"https://anil.recoil.org/#daniel-b\u00fcnzli-typography-and-visualisation\"></a>Daniel B\u00fcnzli: Typography and Visualisation</h4>\n<p>Daniel joined us from Switzerland, and spent some time at Citrix before\njoining us in OCaml Labs. All of his\n<a href=\"http://erratique.ch/software\">software</a> is now on OPAM, and is seeing\never-increasing adoption from the community.</p>\n<blockquote>\n<p>Released a first version of <a href=\"http://erratique.ch/software/vg\">Vg</a> [\u2026]\nI\u2019m especially happy about that as I wanted to use and work on these\nideas since at least 2008. 
The project is a long term project and is\ncertainly not finished yet but this is already a huge step.</p>\n<p>Adjusted and released a first version of\n<a href=\"http://erratique.ch/software/gg\">Gg</a>. While the module was already\nmostly written before my arrival to Cambridge, the development of Vg\nand Vz prompted me to make some changes to the module.</p>\n<p>[\u2026] released <a href=\"http://erratique.ch/software/otfm\">Otfm</a>, a module to\ndecode OpenType fonts. This is a work in progress as not every\nOpenType table has built-in support for decoding yet. But since it is\nneeded by Vg\u2019s PDF renderer I had to cut a release. It can however\nalready be used to implement certain simple things like font kerning\nwith Vg, this can be seen in action in the <code>vecho</code> binary installed by\nVg.</p>\n<p>Started to work on <a href=\"http://erratique.ch/software/vz/doc/Vz.html\">Vz</a>,\na module for helping to map data to Vg images. This is really\nunfinished and is still considered to be at a design stage. There are\na few things that are however well implemented like (human)\nperceptually meaningful <a href=\"http://erratique.ch/software/vz/demos/color_schemes.html\">color\npalettes</a>\nand the small folding stat module (<code>Vz.Stat</code>). However it quickly\nbecame evident that I needed to have more in the box w.r.t. text\nrendering in Vg/Otfm. Things like d3js entirely rely on the SVG/CSS\nsupport for text which makes it easy to e.g. align things (like tick\nlabels on <a href=\"http://erratique.ch/software/vz/demos/iris.html\">such\ndrawings</a>). If you\ncan\u2019t rely on that you need ways of measuring rendered text. So I\ndecided to suspend the work on Vz and put more energy in making a\nfirst good release of Vg. 
Vz still needs quite some design work,\nespecially since it tries to be independent of Vg\u2019s backend and from\nthe mechanism for user input.</p>\n<p>Spent some time figuring out a new \u201copam-friendly\u201d release workflow in\npkgopkg. One of my problems is that by designing in the small for\nprogramming in the large \u2014 what a slogan \u2014 the number of packages I\u2019m\npublishing is growing (12 and still counting). This means that I need\nto scale horizontally maintenance-wise unhelped by the sad state of\nbuild systems for OCaml. I need tools that make the release process\nflawless, painless and up to my quality standards. This led me to\nenhance and consolidate my old scattered distribution scripts in that\nrepo, killing my dependencies on Oasis and ocamlfind along the way.\n<em>(edited for brevity, see\n<a href=\"https://github.com/dbuenzli/pkgopkg\">here</a>)</em></p>\n</blockquote>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/daniel-presentation-vg.webp\" title=\"\">\n\nDaniel also left his bicycle here for future visitors to use, and the\n\u201cB\u00fcnzli-bike\u201d is available for our next visitor! (<span>Louis Gesbert</span> even\ndonated lights, giving it a semblance of safety).</p>\n<h3><a href=\"https://anil.recoil.org/#industrial-fellows\"></a>Industrial Fellows</h3>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/xenserver.webp\" title=\"\">\n\nMost of our regular funding bodies such as <a href=\"http://epsrc.ac.uk\">EPSRC</a>\nor <a href=\"http://cordis.europa.eu/fp7/home_en.html\">EU FP7</a> provide funding,\nbut leave all the intellectual input to the academics. A compelling\naspect of OCaml Labs has been how involved our industrial colleagues\nhave been with the day-to-day problems that we solve. 
Both Jane Street\nand Citrix have senior staff regularly visiting our group and working\nalongside us as industrial fellows in the Computer Lab.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/js.webp\" title=\"\">\n\n<a href=\"http://www.three-tuns.net/mark/\">Mark Shinwell</a> from Jane Street\nEurope has been working on improving the <a href=\"http://www.youtube.com/watch?v=NF2WpWnB-nk\">state of native\ndebugging</a> in OCaml, by\nadding extended DWARF debugging information to the compiler output.\nMark is also a useful source of feedback about the forthcoming\ndesign of multicore, since he has daily insight into a huge\nproduction codebase at Jane Street (and can tell us about it without\nus requiring access!).</p>\n<p><a href=\"http://dave.recoil.org\">Dave Scott</a> is the principal architect of\n<a href=\"http://xenserver.org\">XenServer</a> at Citrix in Cambridge. This year\nhas been transformative for that project, since Citrix <a href=\"http://blogs.citrix.com/2013/06/26/open-source-what-does-it-mean-for-xenserver/\">open-sourced\nXenServer</a>\nto GitHub and fully adopted OPAM into their workflow. Dave is the\nauthor of numerous libraries that have all been released to OPAM,\nand his colleagues <a href=\"http://jon.recoil.org\">Jon Ludlam</a> and <a href=\"http://www.xenserver.org/blog/blogger/listings/euanh.html\">Euan\nHarris</a>\nare also regular visitors who have also been contributors to the\nOPAM and Mirage ecosystems.</p>\n<h2><a href=\"https://anil.recoil.org/#research-projects\"></a>Research Projects</h2>\n<p>The other 100% of our time at the Labs is spent on research projects.\nWhen we started the group, I wanted to set up a feedback loop between\nlocal people <em>using</em> OCaml to build systems, with the folk <em>developing</em>\nOCaml itself. 
This has worked out particularly well with a couple of big\nresearch projects in the Lab.</p>\n<h3><a href=\"https://anil.recoil.org/#mirage\"></a>Mirage</h3>\n<p>Mirage is a <a href=\"https://anil.recoil.org/papers/2013-asplos-mirage.pdf\">library operating\nsystem</a> written in\nOCaml that compiles source code into specialised Xen microkernels,\ndeveloped at the Cambridge Computer Lab, Citrix and the <a href=\"http://horizon.ac.uk\">Horizon Digital\nEconomy</a> institute at Nottingham. This year saw\nseveral years of effort culminate in the first release of <a href=\"http://openmirage.org\">Mirage\n1.0</a> as a self-hosting entity. While Mirage\nstarted off as a <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp.pdf\">quick\nexperiment</a> into\nbuilding specialised virtual appliances, it rapidly became useful to\nmake into a real system for use in bigger research projects. You can\nlearn more about Mirage <a href=\"http://openmirage.org/docs\">here</a>, or read the\n<a href=\"http://cacm.acm.org/magazines/2014/1/170866-unikernels/abstract\">Communications of the\nACM</a>\narticle that <a href=\"http://dave.recoil.org\">Dave Scott</a> and I wrote to close\nout the year.</p>\n<p>This project is where the OCaml Labs \u201cfeedback loop\u201d has been strongest.\nA typical <a href=\"http://www.openmirage.org/wiki/hello-world\">Mirage\napplication</a> consists of\naround 50 libraries that are all installed via OPAM. These range from\n<a href=\"https://github.com/mirage/mirage-block-xen\">device drivers</a> to protocol\nlibraries for <a href=\"https://github.com/avsm/ocaml-cohttp\">HTTP</a> or\n<a href=\"https://github.com/mirage/ocaml-dns\">DNS</a>, to filesystems such as\n<a href=\"https://github.com/mirage/ocaml-fat\">FAT32</a>. 
Coordinating <a href=\"http://openmirage.org/blog/mirage-1.0.3-released\">regular\nreleases</a> of all of\nthese would be near impossible without using OPAM, and has also forced\nus to use our own tools daily, helping to sort out bugs more quickly.\nYou can see the full list of libraries on the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/pkg/\">OCaml Labs software\npage</a>.</p>\n<p>Mirage is also starting to share code with big projects such as\n<a href=\"http://xenserver.org\">XenServer</a> now, and we have been working with\nCitrix engineers to help them to move to the\n<a href=\"http://ocaml.janestreet.com\">Core</a> library that Jane Street has\nreleased (and that is covered in <a href=\"https://realworldocaml.org\">Real World\nOCaml</a>). Moving production codebases this\nlarge can take years, but OCaml Labs is turning out to be a good place\nto start unifying some of the bigger users of OCaml into one place.\nWe\u2019re also now an official <a href=\"http://www.xenproject.org/developers/teams/mirage-os.html\">Xen Project incubator\nproject</a>,\nwhich helps us to validate functional programming to other Linux\nFoundation efforts.</p>\n<h3><a href=\"https://anil.recoil.org/#nymote-and-user-centric-networking\"></a>Nymote and User Centric Networking</h3>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/nymote.webp\" title=\"\">\n\nThe release of Mirage 1.0 has put us on the road to simplifying embedded\nsystems programming. 
The move to the centralized cloud has led to\nregular well-publicised privacy and security threats to the way <a href=\"http://de2013.org/wp-content/uploads/2013/09/de2013_submission_25-1.pdf\">we\nhandle</a>\nour digital infrastructure, and so <a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon\nCrowcroft</a>, <a href=\"http://www.cs.nott.ac.uk/~rmm/\">Richard\nMortier</a> and I are leading an effort to\nbuild an alternative privacy-preserving infrastructure using embedded\ndevices as part of the <a href=\"http://usercentricnetworking.eu/\">User Centric\nNetworking</a> project, in collaboration\nwith a host of companies led by <a href=\"http://www.thlab.net/\">Technicolor</a>\nParis. This work also plays on the strong points of OCaml: it already\nhas a <a href=\"https://anil.recoil.org/2012/02/25/dreamplug-debian-and-ocaml.html\">fast ARM\nbackend</a>,\nand Mirage can easily be ported to the new Xen/ARM target as hardware\nbecomes available.</p>\n<p>One of the most difficult aspects of programming on the \u201cwide area\u201d\nInternet is dealing with the lack of a distributed identity service\nthat\u2019s fully secure. We published <a href=\"https://anil.recoil.org/papers/2013-foci-signposts.pdf\">our\nthoughts</a> on this\nat the USENIX Free and Open Communications on the Internet workshop, and\nDavid Sheets is working towards a full implementation using Mirage. 
If\nyou\u2019re interested in following this effort, Amir Chaudhry is blogging at\nthe <a href=\"http://nymote.org/\">Nymote</a> project website, where we\u2019ll talk about\nthe components as they are released.</p>\n<h3><a href=\"https://anil.recoil.org/#data-center-networking\"></a>Data Center Networking</h3>\n<p>At the other extreme from embedded programming is datacenter networking,\nand we started the\n<a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K034723/1\">Network-as-a-Service</a>\nresearch project with <a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K032968/1\">Imperial\nCollege</a>\nand\n<a href=\"http://gow.epsrc.ac.uk/NGBOViewGrant.aspx?GrantRef=EP/K031724/1\">Nottingham</a>.\nWith the rapid rise of <a href=\"http://en.wikipedia.org/wiki/Software-defined_networking\">Software Defined\nNetworking</a>\nthis year, we are investigating how application-specific customisation\nof network resources can build faster, better, cheaper infrastructure.\nOCaml is in a good position here: several other groups have built\nOpenFlow controllers in OCaml (most notably, the <a href=\"https://github.com/frenetic-lang\">Frenetic\nProject</a>), and Mirage is specifically\ndesigned to assemble such bespoke infrastructure.</p>\n<p>Another aspect we\u2019ve been considering is how to solve the problem of\noptimal connectivity across nodes. TCP is increasingly considered\nharmful in high-throughput, high-density clusters, and <a href=\"http://www.sussex.ac.uk/informatics/people/peoplelists/person/334868\">George\nParisis</a>\nled the design of\n<a href=\"https://anil.recoil.org/papers/2013-hotnets-trevi.pdf\">Trevi</a>, which is\na fountain-coding based alternative for storage networking. 
Meanwhile,\n<a href=\"http://gazagnaire.org\">Thomas Gazagnaire</a> (who joined OCaml Labs in\nNovember), has been working on a branch-consistent data store called\n<a href=\"https://github.com/samoht/irminsule\">Irminsule</a> which supports scalable\ndata sharing and reconciliation using Mirage. Both of these systems will\nsee implementations based on the research done this year.</p>\n<h3><a href=\"https://anil.recoil.org/#higher-kinded-programming\"></a>Higher Kinded Programming</h3>\n<p>Jeremy Yallop and Leo White have been developing an approach that makes\nit possible to write programs with higher-kinded polymorphism (such as\nmonadic functions that are polymorphic in the monad they use) without\nusing functors. It\u2019s early days yet, but there\u2019s a\n<a href=\"https://github.com/ocamllabs/higher\">library</a> available on\n<a href=\"http://opam.ocaml.org/pkg/higher/higher.0.1\">OPAM</a> that implements the\napproach, and a <a href=\"https://github.com/ocamllabs/higher/raw/paper/higher.pdf\">draft\npaper</a> that\noutlines the design.</p>\n<h2><a href=\"https://anil.recoil.org/#priorities-for-2014\"></a>Priorities for 2014</h2>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/camel.webp\" title=\"\">\n\nThis year has been a wild ride to get us up to speed, but we now have a\nsolid sense of what to work on for 2014. 
We\u2019ve decided on a high-level\nset of priorities led by the senior members of the group:</p>\n<ul>\n<li><strong>Multicore</strong>: Leo White will be leading efforts in putting an\nend-to-end multicore capable OCaml together.</li>\n<li><strong>Metaprogramming</strong>: Jeremy Yallop will direct the metaprogramming\nefforts, continuing with Ctypes and into macros and extension\npoints.</li>\n<li><strong>Platform</strong>: Thomas Gazagnaire will continue to drive OPAM\ndevelopment towards becoming the first <a href=\"http://ocaml.org/meetings/ocaml/2013/slides/madhavapeddy.pdf\">OCaml\nPlatform</a>.</li>\n<li><strong>Online</strong>: Amir Chaudhry will develop the online and community\nefforts that started in 2013.</li>\n</ul>\n<p>These are guidelines to choosing where to spend our time, but not\nexcluding other work or day-to-day bugfixing. Our focus on collaboration\nwith Jane Street, Citrix, Lexifi, OCamlPro and our existing colleagues\nwill continue, as well as warmly welcoming new community members that\nwish to work with us on any of the projects, either via internships,\nstudentships or good old-fashioned open source hacking.</p>\n<p>I appreciate the <a href=\"http://www.cl.cam.ac.uk/projects/ocamllabs/people/\">whole\nteam's</a> feedback in\nediting this long post into shape, the amazing professorial support from\n<a href=\"http://www.cl.cam.ac.uk/~jac22/\">Jon Crowcroft</a>, <a href=\"https://www.cl.cam.ac.uk/~iml1/\">Ian\nLeslie</a> and <a href=\"https://www.cl.cam.ac.uk/~am21/\">Alan\nMycroft</a> throughout the year, and of\ncourse the funding and support from Jane Street, Citrix, RCUK, EPSRC,\nDARPA and the EU FP7 that made all this possible. 
Roll on 2014, and\nplease do <a href=\"mailto:avsm2@cl.cam.ac.uk\">get in touch</a> with me with any\nqueries!</p>\n<p>\n<img alt=\"A successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!\" src=\"https://anil.recoil.org/images/fpdays2013-04.webp\" title=\"A successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!\">\nA successful FPDays tutorial in Cambridge, with all attendees getting a free copy of RWO!</p>",
+18
avsm/notes_uiprototype.json
···+"summary": "<p>We\u2019ve been <a href=\"http://github.com/avsm/perscon\">hacking</a> away on fleshing out the <a href=\"http://code.google.com/appengine\">App Engine</a> node for personal containers. We\u2019re building this node first because, crucially, deploying an App Engine VM is free to anyone with a Google account. The service itself is limited since you can only respond to HTTP or XMPP requests and do HTTP fetches, and so its primary use is as an always-on data collection service with a webmail-style UI written using <a href=\"http://www.extjs.com/\">extjs</a>.</p>\n<p>Personal containers gather data from a wide variety of sources, and normalise them into a format which understands people (address book entries, with a set of services such as e-mail, phone, IM and online IDs), places (GPS, WOEID), media (photos, movies) and messages (Tweets, emails, Facebook messages). I\u2019ll post more about the data model behind personal containers in a follow-up as the format settles.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/perscon-extjs.webp\" title=\"\">\n</p>\n<p>The App Engine node has a number of plugins to gather data and aggregate them into a single view (see screenshot). Plugins include:</p>\n<ul>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhoto/\">iPhoto</a> extracts location (via EXIF), people present (associated via <a href=\"http://gizmodo.com/5141741/what-to-know-about-iphoto-09-face-detection-and-recognition\">faces</a>), and of course, the actual photograph.</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/Adium/\">Adium</a> logs all IMs into a threaded chat view. - <a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhone/\">iPhone</a> uses the backup files on a Mac to extract SMS messages, phone call records (and it could also get photographs and browsing history, although it currently doesn\u2019t). 
An AppEngine tracker can also use <a href=\"http://www.apple.com/mobileme/features/find-my-iphone.html\">FindMyIPhone</a> to poll your iPhone regularly to keep track of your location without publishing it to Google or Yahoo (and hopefully in iPhone 4.0, we can operate as a background service at last!).</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/appengine/twitter.py\">Twitter</a> runs directly on AppEngine (authenticated via OAuth) and synchronizes with a Twitter feed.</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/MacOS-SyncServices/\">SyncServices</a> hooks into the MacOS X <a href=\"http://developer.apple.com/macosx/syncservices.html\">sync framework</a> and initially subscribes to Address Book updates. This seems to be the first open-source sync alternative to the expensive Mobile Me, as far as I can tell. I\u2019m planning to expand this to also subscribe to the full set of sync information (e.g. calendars).</li>\n</ul>\n<p>I'm switching tacks briefly; we received an <a href=\"http://aws.amazon.com/education/aws-in-education-research-grants/\">Amazon Research Grant</a> recently and I\u2019m building a node that runs as a Linux server to act as a longer-term archival and search server. This is being written in OCaml and uses <a href=\"http://1978th.net/tokyocabinet/\">Tokyo Cabinet</a> (with Jake Donham\u2019s excellent <a href=\"http://github.com/jaked/otoky\">bindings</a>) and so should be speedy and a useful alternative implementation of the HTTP REST interface. 
The plan is to automatically synchronize meta-data across all the nodes of a personal container, but store large and historical data away from expensive cloud storage such as App Engine.</p>\n<p>There are lots more plugins in development, such as <a href=\"http://foursquare.com\">Foursquare</a> and <a href=\"http://gowalla.com\">Gowalla</a> OAuth collectors, an <a href=\"http://github.com/avsm/perscon/tree/master/android\">Android</a> mobile application to upload location and contacts information, and Google GData synchronization. If you\u2019re interested in one of these or something else, please do <a href=\"http://perscon.net/contact.html\">get in touch</a> or just fork the <a href=\"http://github.com/avsm/perscon\">project</a> and start hacking!</p>",+"content": "<p>We\u2019ve been <a href=\"http://github.com/avsm/perscon\">hacking</a> away on fleshing out the <a href=\"http://code.google.com/appengine\">App Engine</a> node for personal containers. We\u2019re building this node first because, crucially, deploying an App Engine VM is free to anyone with a Google account. The service itself is limited since you can only respond to HTTP or XMPP requests and do HTTP fetches, and so its primary use is as an always-on data collection service with a webmail-style UI written using <a href=\"http://www.extjs.com/\">extjs</a>.</p>\n<p>Personal containers gather data from a wide variety of sources, and normalise them into a format which understands people (address book entries, with a set of services such as e-mail, phone, IM and online IDs), places (GPS, WOEID), media (photos, movies) and messages (Tweets, emails, Facebook messages). I\u2019ll post more about the data model behind personal containers in a follow-up as the format settles.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/perscon-extjs.webp\" title=\"\">\n</p>\n<p>The App Engine node has a number of plugins to gather data and aggregate them into a single view (see screenshot). 
Plugins include:</p>\n<ul>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhoto/\">iPhoto</a> extracts location (via EXIF), people present (associated via <a href=\"http://gizmodo.com/5141741/what-to-know-about-iphoto-09-face-detection-and-recognition\">faces</a>), and of course, the actual photograph.</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/Adium/\">Adium</a> logs all IMs into a threaded chat view. - <a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhone/\">iPhone</a> uses the backup files on a Mac to extract SMS messages, phone call records (and it could also get photographs and browsing history, although it currently doesn\u2019t). An AppEngine tracker can also use <a href=\"http://www.apple.com/mobileme/features/find-my-iphone.html\">FindMyIPhone</a> to poll your iPhone regularly to keep track of your location without publishing it to Google or Yahoo (and hopefully in iPhone 4.0, we can operate as a background service at last!).</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/appengine/twitter.py\">Twitter</a> runs directly on AppEngine (authenticated via OAuth) and synchronizes with a Twitter feed.</li>\n<li><a href=\"http://github.com/avsm/perscon/tree/master/plugins/MacOS-SyncServices/\">SyncServices</a> hooks into the MacOS X <a href=\"http://developer.apple.com/macosx/syncservices.html\">sync framework</a> and initially subscribes to Address Book updates. This seems to be the first open-source sync alternative to the expensive Mobile Me, as far as I can tell. I\u2019m planning to expand this to also subscribe to the full set of sync information (e.g. calendars).</li>\n</ul>\n<p>I'm switching tacks briefly; we received an <a href=\"http://aws.amazon.com/education/aws-in-education-research-grants/\">Amazon Research Grant</a> recently and I\u2019m building a node that runs as a Linux server to act as a longer-term archival and search server. 
This is being written in OCaml and uses <a href=\"http://1978th.net/tokyocabinet/\">Tokyo Cabinet</a> (with Jake Donham\u2019s excellent <a href=\"http://github.com/jaked/otoky\">bindings</a>) and so should be speedy and a useful alternative implementation of the HTTP REST interface. The plan is to automatically synchronize meta-data across all the nodes of a personal container, but store large and historical data away from expensive cloud storage such as App Engine.</p>\n<p>There are lots more plugins in development, such as <a href=\"http://foursquare.com\">Foursquare</a> and <a href=\"http://gowalla.com\">Gowalla</a> OAuth collectors, an <a href=\"http://github.com/avsm/perscon/tree/master/android\">Android</a> mobile application to upload location and contacts information, and Google GData synchronization. If you\u2019re interested in one of these or something else, please do <a href=\"http://perscon.net/contact.html\">get in touch</a> or just fork the <a href=\"http://github.com/avsm/perscon\">project</a> and start hacking!</p>",
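The normalised data model the post describes (people with a set of service identities, places with GPS/WOEID, media, and messages from many plugins) can be sketched roughly as follows. This is a hypothetical illustration in Python, not the actual perscon schema, which the post says is still settling; all class and field names here are my own.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical sketch of the normalised record types described in the post:
# people (with per-service IDs such as e-mail, IM and online handles),
# places (GPS, WOEID), and messages tagged with their originating plugin.

@dataclass
class Person:
    name: str
    services: dict = field(default_factory=dict)  # e.g. {"twitter": "avsm"}

@dataclass
class Place:
    lat: float
    lon: float
    woeid: int | None = None  # Yahoo Where-On-Earth ID, if known

@dataclass
class Message:
    sender: Person
    body: str
    timestamp: datetime
    origin: str  # which plugin produced it, e.g. "twitter" or "adium"

def normalise_tweet(tweet: dict) -> Message:
    """Map a simplified raw tweet dict into the common Message type."""
    author = Person(name=tweet["user"]["name"],
                    services={"twitter": tweet["user"]["screen_name"]})
    ts = datetime.fromtimestamp(tweet["created_at"], tz=timezone.utc)
    return Message(sender=author, body=tweet["text"], timestamp=ts,
                   origin="twitter")

raw = {"user": {"name": "Anil", "screen_name": "avsm"},
       "text": "hacking on perscon", "created_at": 1270000000}
msg = normalise_tweet(raw)
```

The appeal of a common `Message` type is that each collector (Adium logs, iPhone backups, the Twitter plugin) only needs a small adapter like `normalise_tweet`, and the aggregated webmail-style view can then render every source uniformly.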
+18
avsm/notes_uk-national-data-lib.json
···+"summary": "<p>Over the past year, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been getting an object lesson in how the modern Internet handles researcher access to data, as we've been downloading tens of millions of research papers towards our <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> project. This is legally possible via our <a href=\"https://www.lib.cam.ac.uk/stories/student-guide-libraries\">institutional subscriptions</a> that give us license to fulltexts, and the incredibly helpful <a href=\"https://uk.linkedin.com/in/james-caudwell-60681766\">head of electronic services</a> at the University Library who wields encyclopedic knowledge of each of our agreements with the hundreds of publishers out there. My thoughts on this then segued into recent conversations I've been having about the emerging <a href=\"https://takes.jamesomalley.co.uk/p/wtf-is-the-national-data-library\">National Data Library</a> and also with the UK <a href=\"https://www.wildlifetrusts.org/\">Wildlife Trusts</a>...</p>\n<h2><a href=\"https://anil.recoil.org/#the-difficulty-of-access-controlled-bulk-data-downloads\"></a>The difficulty of access controlled bulk data downloads</h2>\n<p>In late 2023, once we got past the legal aspects of downloading closed access papers<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> it was still remarkably difficult to <em>actually</em> gain access to the actual paper datasets themselves. For instance, a select few hurdles include:</p>\n<ul>\n<li><a href=\"https://www.cloudflare.com/\">Cloudflare</a> got in the way <em>all</em> the time, preventing batch downloading by throwing <a href=\"https://en.wikipedia.org/wiki/CAPTCHA\">CAPTCHAs</a> down the wire. Each publisher has to individually allowlist our one hardworking IP, and it can take months for them to do this and it's never quite clear when we have been allowed. 
So I hacked up <a href=\"https://www.zenrows.com/blog/undetected-chromedriver-vs-selenium-stealth\">dodgy stealth downloaders</a> even though we're meant to have access via the publisher.</li>\n<li>Many official <a href=\"https://www.springernature.com/gp/researchers/text-and-data-mining\">text mining</a> APIs for publishers such as Elsevier and Springer do not provide PDF access, and only give an <a href=\"https://www.elsevier.com/en-gb/researcher/author/policies-and-guidelines/elsevier-xml-dtds-and-transport-schemas\">XML equivalent</a> which is both inconsistent in its schemas and misses diagrams. Luckily there are great projects like <a href=\"https://grobid.readthedocs.io/en/latest/\">Grobid</a> to normalise some of these with very <a href=\"https://github.com/kermitt2/Pub2TEI/pull/18\">responsive</a> maintainers.</li>\n<li>There are existing archival indices for the PDFs that <a href=\"https://docs.openalex.org/api-entities/works/work-object/location-object\">point to preprints</a> around the web, but <a href=\"https://commoncrawl.org/blog/january-2025-crawl-archive-now-available\">CommonCrawl</a> truncates <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">downloads</a> to their first megabyte, and the <a href=\"https://archive.org/details/UNPAYWALL-PDF-CRAWL-2019-04\">archive.org unpaywall</a> crawls are restricted access for licensing reasons. So I built a crawler to get these ourselves (I'm glad I wrote the first <a href=\"https://github.com/mirage/ocaml-cohttp\">cohttp</a> now!)</li>\n<li>Bulk download still involves individual HTTP queries with various rate throttling mechanisms that all vary slightly, making me an expert in different <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429\">HTTP 429</a> response headers. 
There's not much sign of <a href=\"https://graphql.org/\">batch query</a> interfaces anywhere, probably because of the difficulty of access checking for each individual result.</li>\n<li>The <a href=\"https://pmc.ncbi.nlm.nih.gov/tools/ftp/#pdf\">NIH PMC</a> only have one hard-working rate-throttled FTP server for PDFs, which I've been slowly mirroring using a hand-crafted OCaml FTP client since Nov 2024 (almost done!)</li>\n<li>Meanwhile, because this is happening through allowlisting of specific IPs, I then got my Pembroke office kicked off the Internet due to automated abuse notifications going to the <a href=\"https://www.uis.cam.ac.uk/\">UIS</a> who turn netblocks off before checking (fair enough, it could be malware). But it would have been easier to run these downloads through <a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds\">dust clouds</a> than try to do it properly by registering the addresses involved, eh?</li>\n</ul>\n<p>The situation is better for open access downloads, where projects such as <a href=\"https://core.ac.uk/\">Core</a> offer easier bulk access and large metadata databases like <a href=\"https://openalex.org\">OpenAlex</a> use '<a href=\"https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html\">downloader pays</a>' S3 buckets. And in other domains like satellite data, there is still a lot of complexity in obtaining the data, but <a href=\"https://github.com/sentinel-hub/sentinelhub-py\">programming wrappers</a> make implementing the (often terabyte-level) downloads much more palatable. 
For our recent <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> biodiversity maps, we also make them available on services like <a href=\"https://zenodo.org/records/14188450\">Zenodo</a> as they are open.</p>\n<p>The lesson I took away from this is that it's really difficult to deal with large sensitive datasets where selective <em>access control</em> is required, and also that sort of data is rarely mirrored on the open web for obvious reasons. But in the <a href=\"https://www.theatlantic.com/health/archive/2025/02/trump-science-data-gender-dei/681698/\">current climate</a>, it's utterly vital that we move to protect human health or <a href=\"https://www.nature.com/articles/s41559-023-02226-2\">biodiversity data</a> gathered over decades that is irreplaceable once lost. And beyond data loss, if the data is present but not accessible, then what's the point in gathering it in the first place? It's also really important not to blame the existing publishers of these datasets, who are getting overwhelmed by <a href=\"https://perishablepress.com/ultimate-ai-block-list/\">AI bots</a> making huge numbers of requests to their infrastructure. So I'm getting energised by the idea of a cooperative solution among all the stakeholders involved.</p>\n<h2><a href=\"https://anil.recoil.org/#enter-the-national-data-library\"></a>Enter the National Data Library</h2>\n<p>You can imagine my excitement late last year when I got a call from the Royal Society to show up bright and early for a mysterious speech by Rishi Sunak. 
He duly <a href=\"https://www.gov.uk/government/speeches/prime-ministers-speech-on-ai-26-october-2023\">announced</a> the government's AI summit that mostly focussed on <a href=\"https://www.gov.uk/government/topical-events/ai-safety-summit-2023\">safety</a>, but a report by <a href=\"https://sciencesuperpower.substack.com/i/144202375/investing-in-public-goods\">Onward</a> caught my eye by recommending that <em>"the Government should establish a British Library for Data \u2013 a centralised, secure platform to collate high-quality data for scientists and start-ups"</em>. I wasn't down for the "centralised" part of this, but I generally liked the library analogy and the curation it implied.</p>\n<p>\n<img alt=\"Seeing Rishi Sunak and, more importantly, the back of my PhD supervisor Andy Hopper&apos;s head.\" src=\"https://anil.recoil.org/images/rishi-sunak-rs-ai-1.webp\" title=\"Seeing Rishi Sunak and, more importantly, the back of my PhD supervisor Andy Hopper&apos;s head.\">\nSeeing Rishi Sunak and, more importantly, the back of my PhD supervisor Andy Hopper's head.</p>\n<p>Then in 2025, with Sunak dispatched back to <a href=\"https://en.wikipedia.org/wiki/Richmond_and_Northallerton_(UK_Parliament_constituency)\">Richmond</a>, Labour took up the reins with their <a href=\"https://www.gov.uk/government/publications/ai-opportunities-action-plan/ai-opportunities-action-plan\">AI Action Plan</a>. While this report started predictably with the usual need for acres of GPU-filled datacenters, it continued on to something much more intriguing via the creation of a "National Data Library":</p>\n<blockquote>\n<ul>\n<li>Rapidly identify at least 5 high-impact public datasets it will seek to make available [...] 
Prioritisation should consider the potential economic and social value of the data, as well as public trust, national security, privacy, ethics, and data protection considerations.</li>\n<li>Build public sector data collection infrastructure and finance the creation of new high-value datasets that meet public sector, academia and startup needs.</li>\n<li>Actively incentivise and reward researchers and industry to curate and unlock private datasets.\n-- <a href=\"https://www.gov.uk/government/publications/ai-opportunities-action-plan/ai-opportunities-action-plan\">AI Opportunities Action Plan</a>, Jan 2025</li>\n</ul>\n</blockquote>\n<p>This takes into account much more of the nuances of getting access to public data. It identifies the need for data curation, and also the costs of curating such private datasets and ensuring correct use. The announcement spurred on a number of excellent thoughts from around the UK web about the implications, particularly from <a href=\"https://gavinfreeguard.com/\">Gavin Freeguard</a> who wrote about <a href=\"https://gavin-freeguard.medium.com/how-should-we-think-about-a-national-data-library-dd2d47edee8b\">how we should think about an NDL</a>. Gavin identified one particularly difficult element of exposing private data:</p>\n<blockquote>\n<p>[...] analogy with the National Data Library suggests that there might be some materials available to everyone, and some restricted to specialist researchers. There may be different access models for more sensitive material. There may be better and worse options \u2014 bringing together all the data in one place for accredited researchers to access [...] would be a logistical and security nightmare [...] 
may be possible to keep the data where it already is, but provide researchers with the ability to access different systems.\n-- <a href=\"https://gavin-freeguard.medium.com/how-should-we-think-about-a-national-data-library-dd2d47edee8b\">Gavin Freeguard</a></p>\n</blockquote>\n<p>Others also <a href=\"https://theodi.org/news-and-events/blog/how-to-build-a-national-data-library/\">identified</a> that the centralised library analogy only goes so far, and that we should focus on <a href=\"https://peterkwells.com/2024/12/18/the-national-data-library-should-help-people-deliver-trustworthy-data-services/\">building trustworthy data services instead</a> and on <a href=\"https://www.adruk.org/news-publications/news-blogs/the-new-uk-government-wants-a-national-data-library-a-brilliant-aspiration-if-built-on-solid-foundations/\">the real lifecycle of the data</a> usage:</p>\n<blockquote>\n<p>[...] this means that the latest data is already there in the "library" [...] researchers don't first need to work with the data owners to create it [...] bodies of knowledge around using these complex datasets can be built up over time.</p>\n<p>Researchers can share code and derived data concepts, so the researchers that come after can iterate, refine, and build on what has gone before. 
None of this was possible with the previous "create and destroy" model of accessing these types of datasets, which was hugely inefficient\n-- <a href=\"https://www.adruk.org/news-publications/news-blogs/the-new-uk-government-wants-a-national-data-library-a-brilliant-aspiration-if-built-on-solid-foundations/\">Administrative Data Research</a> UK</p>\n</blockquote>\n<p>Gosh, this network effect sounds an awful lot like what I experienced as a <a href=\"https://anil.recoil.org/\">Docker</a> maintainer, which had its incredible <a href=\"https://www.docker.com/blog/docker-index-dramatic-growth-in-docker-usage-affirms-the-continued-rising-power-of-developers/\">popularity</a> fuelled by tapping into users building <em>and sharing</em> their own software packaging rather than depending on third parties to do it for them. If we could unlock the power of crowds here but go one step further and enforce privacy constraints on the underlying data and code, then the technical solution could be both usable and secure. I'm still not quite sure what that balance of UI would look like, but we're <a href=\"https://anil.recoil.org/projects/plancomp\">working on it</a>, spearheaded by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> in their respective research areas.</p>\n<p>The Wellcome and ESRC have also put together a <a href=\"https://zenodo.org/communities/wellcome/records?q=&f=subject%3AData%20Library&l=list&p=1&s=10&sort=newest\">series of whitepapers</a> about the challenges and potential approaches behind the NDL (via <a href=\"https://en.wikipedia.org/wiki/Nick_McKeown\">Nick McKeown</a>). I'm still going through them in detail, but the <a href=\"https://zenodo.org/records/14671714\">modular approach</a> paper makes sensible observations about not trying to build one enormous national database and to not outsource it all to one organisation to build. 
Instead, they espouse a <a href=\"https://zenodo.org/records/14672004\">federated architectural</a> approach.</p>\n<p><a href=\"https://zenodo.org/records/14672004\"> \n<img alt=\"Sourced from https://zenodo.org/records/14672004\" src=\"https://anil.recoil.org/images/federated-ndl-ss-1.webp\" title=\"Sourced from https://zenodo.org/records/14672004\">\nSourced from https://zenodo.org/records/14672004 </a></p>\n<p>Since their primary (but not only) usecase focuses on <a href=\"https://ukhealthdata.org/\">health data</a>, there is an emphasis on moving the computation and data around rather than pooling it:</p>\n<blockquote>\n<p>The project's overlay mesh network dynamically and securely connects all the required resources. The\nmesh network creates a transient, project-specific, secure network boundary such that all the project\u2019s\ncomponents are within one overarching safe setting\n-- <a href=\"https://zenodo.org/records/14672004\">A federated architecture for a National Data Library</a></p>\n</blockquote>\n<p>This isn't a million miles away from how we set up <a href=\"https://docs.docker.com/engine/network/tutorials/overlay/\">overlay networks</a> on cloud infrastructure, but with the added twist of putting in more policy enforcement upfront.</p>\n<ul>\n<li>On the programming languages side, we're seeing exciting progress on <a href=\"https://github.com/MLanguage/mlang\">formalising legal systems</a> which encourages <a href=\"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4291177\">pair programming with lawyers</a> to capture the nuances of policy accurately (and <a href=\"https://news.law.northwestern.edu/sarah-lawsky-worked-on-a-tax-law-code-that-the-french-government-deemed-officially-awesome/\">pronounced 'awesome'</a> by the French government).</li>\n<li>At a systems level, <a href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a> recently published <a href=\"https://cs.brown.edu/people/malte/pub/papers/2024-sosp-sesame.pdf\">Sesame</a> 
which provides end-to-end privacy sandboxing guarantees, and there is classic work on <a href=\"https://www.usenix.org/conference/nsdi-08/securing-distributed-systems-information-flow-control\">DIFC</a> that we've been using <a href=\"https://anil.recoil.org/papers/2023-raid-deluminator\">more recently</a> in secure enclave programming.</li>\n<li>From a machine learning perspective, my colleague <a href=\"https://mlsys.cst.cam.ac.uk/\">Nic Lane</a>'s work on <a href=\"https://www.cam.ac.uk/research/news/can-federated-learning-save-the-world\">federated learning</a> via <a href=\"https://flower.ai/\">Flower</a> seems to be everywhere right now with its own <a href=\"https://flower.ai/events/flower-ai-summit-2025/\">summit</a> coming up.</li>\n</ul>\n<p>However, it's not all plain sailing, as there is also mega-controversy ongoing with the UK government's <a href=\"https://takes.jamesomalley.co.uk/p/ask-the-computer-people-first#footnote-anchor-3-156712689\">surprising</a> demands for an <a href=\"https://www.bbc.co.uk/news/articles/c20g288yldko\">encryption backdoor</a> into iCloud, leading to even more of a <a href=\"https://www.theregister.com/2025/02/13/us_demand_uk_apple_backdoor_close/\">geopolitical tangle</a> with the US. Irrespective of what happens with this particular case, it's clear that any end-to-end encryption in these federated systems will need to deal with the reality that jurisdictions will have different lawful decryption needs, so <a href=\"https://statusq.org/archives/2025/02/16/13063/\">end-to-end encryption may be at an end</a> for initiatives like the NDL. 
Add to this the flagrant <a href=\"https://shujisado.org/2025/01/27/significant-risks-in-using-ai-models-governed-by-the-llama-license/\">disregard for licensing</a> in current pretrained language models and the movement <a href=\"https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence\">to revise copyright laws</a> to legislate around this, and it's clear that technology will need to be fluid in adapting to matters of provenance tracking as well.</p>\n<p>There's definitely a rich set of academic literature in this space, combined with interesting constraints, and so I'll pull this together into an annotated bibtex soon!</p>\n<h2><a href=\"https://anil.recoil.org/#who-are-some-users-of-such-a-service\"></a>Who are some users of such a service?</h2>\n<p>To get some more inspiration on a technical solution, I've been looking to users of such an infrastructure to understand what easy-to-use interfaces might look like.</p>\n<p>My colleague <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> over at <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> co-led a recent report into <a href=\"https://ai.cam.ac.uk/reports/access-to-data-case-studies\">case studies for the NDL</a> which is very much worth a read. From a conservation perspective, <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> both <a href=\"https://ai.cam.ac.uk/blog/conserving-with-code-how-data-is-helping-to-save-our-planet\">gave input</a> about the importance of having such infrastructure for <a href=\"https://anil.recoil.org/projects/ce\">evidence-driven landuse</a>.</p>\n<blockquote>\n<p>What would be helpful, according to Dr Jaffer, is more\nstandardisation between publishers for Open Access material\nunder permissive licences.\n[...] 
having a coherent archive for OA materials that are licensed\nin such a way that they can be used for data mining without\nany technical hurdles would be the ideal scenario for this kind\nof research, as well as for a National Data Library,\n-- <a href=\"https://ai.cam.ac.uk/projects/access-to-data-case-studies\">Access to Data for Research</a>, AI@CAM</p>\n</blockquote>\n<p><a href=\"https://ai.cam.ac.uk/projects/access-to-data-case-studies\"> \n<img alt=\"The extremely cool doodle on the workshop from AI@Cam\" src=\"https://anil.recoil.org/images/ai-cam-data-library.webp\" title=\"The extremely cool doodle on the workshop from AI@Cam\">\nThe extremely cool doodle on the workshop from AI@Cam </a></p>\n<p>Another very different group I talked to back in 2023 via Rosalind Goodfellow as part of her <a href=\"https://www.csap.cam.ac.uk/network/rosalind-goodfellow/\">CSaP</a> fellowship was the <a href=\"https://www.gov.uk/government/organisations/geospatial-commission\">Geospatial Commission</a> who began work on a <a href=\"https://www.gov.uk/guidance/national-underground-asset-register-nuar\">National Underground Asset Register</a>. The NUAR was initially restricted to "<a href=\"https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1148100/NUAR_FAQs__.pdf\">safe dig</a>" usecases and not exposed more widely for security and other concerns. In 2024, they subsequently <a href=\"https://gdsgeospatial.blog.gov.uk/2024/01/11/discovering-potential-opportunities-for-the-national-underground-asset-register/\">reported</a> great interest in expanded usecases and are doing a discovery project on how to expose this information via APIs. 
This seems like an ideal usecase for some of the access control needs discussed above, as it's not only a lot of data (being geospatial) but also updated quite frequently and not necessarily something to make entirely public (although <a href=\"https://x2n.com/blog/how-utility-companies-are-using-satellite-technology/\">satellite pipeline monitoring</a> is perhaps obsoleting this need).</p>\n<p>And a month ago after reading our <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon\">horizon scan for AI and conservation</a> paper, <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> and I got invited by <a href=\"https://uk.linkedin.com/in/craig-bennett3\">Craig Bennett</a> to a remarkable dinner with the assembled leaders of all 46 of the UK's <a href=\"https://www.wildlifetrusts.org/\">wildlife trusts</a>. They are a collective of independent charities who together maintain wildlife areas across the UK, with most people living near one of their 2300+ parks (more than there are UK McDonald's branches!). Over the course of dinner, we heard from every single one of them, with the following gist:</p>\n<ul>\n<li>The 46 nature charities work by consensus but independently; recently, though, they have been building more central coordination around their use of systematic biodiversity data gathering across the nation. 
They are building a data pool across all of them, which is important as the sensing they do is very biased both spatially and across species (we know lots about <a href=\"https://www.rspb.org.uk/whats-happening/big-garden-birdwatch\">birds</a>, less about <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">hedgehogs</a>).</li>\n<li>The charities recognise the need to take more risks as the pressures on UK nature are currently <a href=\"https://www.wildlifetrusts.org/news/new-report-reveals-drought-now-considered-biggest-risk-uk-nature-reserves\">immense</a>, which means harnessing their data and AI responsibly both to accelerate action and to recruit more participation from a broader cross-section of the UK population, for citizen science input but also just to experience it.</li>\n<li><a href=\"https://www.conservationevidence.com\">Conservation evidence</a> is important to them, and sharing data from one area to replicate that action elsewhere in the UK is essential but difficult to engineer from scratch. There's a real cost to generating this data, and some confusion about appropriate licensing strategies. I gave a somewhat mixed message here reflecting my own uncertainty about the right way forward: on one hand, restricted licensing might prevent their data being hoovered up by the big tech companies who give peanuts back in return, but then again the bad actors in this space would simply <a href=\"https://www.vox.com/technology/2023/7/27/23808499/ai-openai-google-meta-data-privacy-nope\">ignore</a> the licensing and the good actors probably <a href=\"https://www.weforum.org/stories/2023/01/davos23-ai-divide-global-north-global-south/\">can't afford</a> it.</li>\n</ul>\n<p>The trusts are operating on a fairly shoestring budget already, so they're a great candidate to benefit from a collective, federated National Data Library. 
In particular, if the NDL can nail down a <a href=\"https://www.gov.uk/working-with-trade-unions/collective-bargaining\">collective bargaining</a> model for data access to big tech companies, this could finance the collection costs among smaller organisations throughout the four nations. The same holds true for thousands of small organisations around the UK that could benefit from this infrastructure and kickstart more <a href=\"https://lookingforgrowth.uk/\">sustainable growth</a>.</p>\n<p>\n<img alt=\"The assembled CEOs of the Wildlife Trusts taught me an awful lot about hedgehogs that evening\" src=\"https://anil.recoil.org/images/wildlife-trusts-homerton.webp\" title=\"The assembled CEOs of the Wildlife Trusts taught me an awful lot about hedgehogs that evening\">\nThe assembled CEOs of the Wildlife Trusts taught me an awful lot about hedgehogs that evening</p>\n<p>I'm organising a get-together on the topic of <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a> next month with <a href=\"https://www.cs.cornell.edu/~jnfoster/\">Nate Foster</a> and a number of colleagues from around the world, so stay tuned for more updates in this space in the coming months! Your thoughts, as always, are most welcome.</p>\n\n<p><em>(Thanks <a href=\"https://samreynolds.org/\">Sam Reynolds</a> for the notes on what we discussed with the Wildlife Trusts)</em></p>\n\n\n<ol>\n<li>\n<p>This largely involved talking to individual publishers and agreeing not to directly train generative AI models and to keep them private to our own research use. 
Fairly reasonable stuff.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>Over the past year, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been getting an object lesson in how the modern Internet handles researcher access to data, as we've been downloading tens of millions of research papers towards our <a href=\"https://anil.recoil.org/projects/ce\">Conservation Evidence</a> project. This is legally possible via our <a href=\"https://www.lib.cam.ac.uk/stories/student-guide-libraries\">institutional subscriptions</a> that give us license to fulltexts, and the incredibly helpful <a href=\"https://uk.linkedin.com/in/james-caudwell-60681766\">head of electronic services</a> at the University Library who wields encyclopedic knowledge of each of our agreements with the hundreds of publishers out there. My thoughts on this then segued into recent conversations I've been having about the emerging <a href=\"https://takes.jamesomalley.co.uk/p/wtf-is-the-national-data-library\">National Data Library</a> and also with the UK <a href=\"https://www.wildlifetrusts.org/\">Wildlife Trusts</a>...</p>\n<h2><a href=\"https://anil.recoil.org/#the-difficulty-of-access-controlled-bulk-data-downloads\"></a>The difficulty of access controlled bulk data downloads</h2>\n<p>In late 2023, once we got past the legal aspects of downloading closed access papers<a href=\"https://anil.recoil.org/#fn-1\">[1]</a> it was still remarkably difficult to <em>actually</em> gain access to the actual paper datasets themselves. For instance, a select few hurdles include:</p>\n<ul>\n<li><a href=\"https://www.cloudflare.com/\">Cloudflare</a> got in the way <em>all</em> the time, preventing batch downloading by throwing <a href=\"https://en.wikipedia.org/wiki/CAPTCHA\">CAPTCHAs</a> down the wire. 
Each publisher has to individually allowlist our one hardworking IP, and it can take months for them to do this and it's never quite clear when we have been allowed. So I hacked up <a href=\"https://www.zenrows.com/blog/undetected-chromedriver-vs-selenium-stealth\">dodgy stealth downloaders</a> even though we're meant to have access via the publisher.</li>\n<li>Many official <a href=\"https://www.springernature.com/gp/researchers/text-and-data-mining\">text mining</a> APIs for publishers such as Elsevier and Springer do not provide PDF access, and only give an <a href=\"https://www.elsevier.com/en-gb/researcher/author/policies-and-guidelines/elsevier-xml-dtds-and-transport-schemas\">XML equivalent</a> which is both inconsistent in its schemas and misses diagrams. Luckily there are great projects like <a href=\"https://grobid.readthedocs.io/en/latest/\">Grobid</a> to normalise some of these with very <a href=\"https://github.com/kermitt2/Pub2TEI/pull/18\">responsive</a> maintainers.</li>\n<li>There are existing archival indices for the PDFs that <a href=\"https://docs.openalex.org/api-entities/works/work-object/location-object\">point to preprints</a> around the web, but <a href=\"https://commoncrawl.org/blog/january-2025-crawl-archive-now-available\">CommonCrawl</a> truncates <a href=\"https://anil.recoil.org/ideas/grey-lit-crawl\">downloads</a> to their first megabyte, and the <a href=\"https://archive.org/details/UNPAYWALL-PDF-CRAWL-2019-04\">archive.org unpaywall</a> crawls are restricted access for licensing reasons. So I built a crawler to get these ourselves (I'm glad I wrote the first <a href=\"https://github.com/mirage/ocaml-cohttp\">cohttp</a> now!)</li>\n<li>Bulk download still involves individual HTTP queries with various rate throttling mechanisms that all vary slightly, making me an expert in different <a href=\"https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/429\">HTTP 429</a> response headers. 
There's not much sign of <a href=\"https://graphql.org/\">batch query</a> interfaces anywhere, probably because of the difficulty of access checking for each individual result.</li>\n<li>The <a href=\"https://pmc.ncbi.nlm.nih.gov/tools/ftp/#pdf\">NIH PMC</a> only have one hard-working rate-throttled FTP server for PDFs, which I've been slowly mirroring using a hand-crafted OCaml FTP client since Nov 2024 (almost done!)</li>\n<li>Meanwhile, because this is happening through allowlisting of specific IPs, I then got my Pembroke office kicked off the Internet due to automated abuse notifications going to the <a href=\"https://www.uis.cam.ac.uk/\">UIS</a> who turn netblocks off before checking (fair enough, it could be malware). But it would have been easier to run these downloads through <a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds\">dust clouds</a> than try to do it properly by registering the addresses involved, eh?</li>\n</ul>\n<p>The situation is better for open access downloads, where projects such as <a href=\"https://core.ac.uk/\">Core</a> offer easier bulk access and large metadata databases like <a href=\"https://openalex.org\">OpenAlex</a> use '<a href=\"https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html\">downloader pays</a>' S3 buckets. And in other domains like satellite data, there is still a lot of complexity in obtaining the data, but <a href=\"https://github.com/sentinel-hub/sentinelhub-py\">programming wrappers</a> make implementing the (often terabyte-level) downloads much more palatable. 
For our recent <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE</a> biodiversity maps, we also make them available on services like <a href=\"https://zenodo.org/records/14188450\">Zenodo</a> as they are open.</p>\n<p>The lesson I took away from this is that it's really difficult to deal with large sensitive datasets where selective <em>access control</em> is required, and also that sort of data is rarely mirrored on the open web for obvious reasons. But in the <a href=\"https://www.theatlantic.com/health/archive/2025/02/trump-science-data-gender-dei/681698/\">current climate</a>, it's utterly vital that we move to protect human health or <a href=\"https://www.nature.com/articles/s41559-023-02226-2\">biodiversity data</a> gathered over decades that is irreplaceable once lost. And beyond data loss, if the data is present but not accessible, then what's the point in gathering it in the first place? It's also really important not to blame the existing publishers of these datasets, who are getting overwhelmed by <a href=\"https://perishablepress.com/ultimate-ai-block-list/\">AI bots</a> making huge numbers of requests to their infrastructure. So I'm getting energised by the idea of a cooperative solution among all the stakeholders involved.</p>\n<h2><a href=\"https://anil.recoil.org/#enter-the-national-data-library\"></a>Enter the National Data Library</h2>\n<p>You can imagine my excitement late last year when I got a call from the Royal Society to show up bright and early for a mysterious speech by Rishi Sunak. 
He duly <a href=\"https://www.gov.uk/government/speeches/prime-ministers-speech-on-ai-26-october-2023\">announced</a> the government's AI summit that mostly focussed on <a href=\"https://www.gov.uk/government/topical-events/ai-safety-summit-2023\">safety</a>, but a report by <a href=\"https://sciencesuperpower.substack.com/i/144202375/investing-in-public-goods\">Onward</a> caught my eye by recommending that <em>"the Government should establish a British Library for Data \u2013 a centralised, secure platform to collate high-quality data for scientists and start-ups"</em>. I wasn't down for the "centralised" part of this, but I generally liked the library analogy and the curation it implied.</p>\n<p>\n<img alt=\"Seeing Rishi Sunak and, more importantly, the back of my PhD supervisor Andy Hopper&apos;s head.\" src=\"https://anil.recoil.org/images/rishi-sunak-rs-ai-1.webp\" title=\"Seeing Rishi Sunak and, more importantly, the back of my PhD supervisor Andy Hopper&apos;s head.\">\nSeeing Rishi Sunak and, more importantly, the back of my PhD supervisor Andy Hopper's head.</p>\n<p>Then in 2025, with Sunak dispatched back to <a href=\"https://en.wikipedia.org/wiki/Richmond_and_Northallerton_(UK_Parliament_constituency)\">Richmond</a>, Labour took up the reins with their <a href=\"https://www.gov.uk/government/publications/ai-opportunities-action-plan/ai-opportunities-action-plan\">AI Action Plan</a>. While this report started predictably with the usual need for acres of GPU-filled datacenters, it continued on to something much more intriguing via the creation of a "National Data Library":</p>\n<blockquote>\n<ul>\n<li>Rapidly identify at least 5 high-impact public datasets it will seek to make available [...] 
Prioritisation should consider the potential economic and social value of the data, as well as public trust, national security, privacy, ethics, and data protection considerations.</li>\n<li>Build public sector data collection infrastructure and finance the creation of new high-value datasets that meet public sector, academia and startup needs.</li>\n<li>Actively incentivise and reward researchers and industry to curate and unlock private datasets.\n-- <a href=\"https://www.gov.uk/government/publications/ai-opportunities-action-plan/ai-opportunities-action-plan\">AI Opportunities Action Plan</a>, Jan 2025</li>\n</ul>\n</blockquote>\n<p>This takes into account much more of the nuances of getting access to public data. It identifies the need for data curation, and also the costs of curating such private datasets and ensuring correct use. The announcement spurred on a number of excellent thoughts from around the UK web about the implications, particularly from <a href=\"https://gavinfreeguard.com/\">Gavin Freeguard</a> who wrote about <a href=\"https://gavin-freeguard.medium.com/how-should-we-think-about-a-national-data-library-dd2d47edee8b\">how we should think about an NDL</a>. Gavin identified one particularly difficult element of exposing private data:</p>\n<blockquote>\n<p>[...] analogy with the National Data Library suggests that there might be some materials available to everyone, and some restricted to specialist researchers. There may be different access models for more sensitive material. There may be better and worse options \u2014 bringing together all the data in one place for accredited researchers to access [...] would be a logistical and security nightmare [...] 
may be possible to keep the data where it already is, but provide researchers with the ability to access different systems.\n-- <a href=\"https://gavin-freeguard.medium.com/how-should-we-think-about-a-national-data-library-dd2d47edee8b\">Gavin Freeguard</a></p>\n</blockquote>\n<p>Others also <a href=\"https://theodi.org/news-and-events/blog/how-to-build-a-national-data-library/\">identified</a> that the centralised library analogy only goes so far, and that we should focus on <a href=\"https://peterkwells.com/2024/12/18/the-national-data-library-should-help-people-deliver-trustworthy-data-services/\">building trustworthy data services instead</a> and on <a href=\"https://www.adruk.org/news-publications/news-blogs/the-new-uk-government-wants-a-national-data-library-a-brilliant-aspiration-if-built-on-solid-foundations/\">the real lifecycle of the data</a> usage:</p>\n<blockquote>\n<p>[...] this means that the latest data is already there in the "library" [...] researchers don't first need to work with the data owners to create it [...] bodies of knowledge around using these complex datasets can be built up over time.</p>\n<p>Researchers can share code and derived data concepts, so the researchers that come after can iterate, refine, and build on what has gone before. 
None of this was possible with the previous "create and destroy" model of accessing these types of datasets, which was hugely inefficient\n-- <a href=\"https://www.adruk.org/news-publications/news-blogs/the-new-uk-government-wants-a-national-data-library-a-brilliant-aspiration-if-built-on-solid-foundations/\">Administrative Data Research</a> UK</p>\n</blockquote>\n<p>Gosh, this network effect sounds an awful lot like what I experienced as a <a href=\"https://anil.recoil.org/\">Docker</a> maintainer, which had its incredible <a href=\"https://www.docker.com/blog/docker-index-dramatic-growth-in-docker-usage-affirms-the-continued-rising-power-of-developers/\">popularity</a> fuelled by tapping into users building <em>and sharing</em> their own software packaging rather than depending on third parties to do it for them. If we could unlock the power of crowds here but go one step further and enforce privacy constraints on the underlying data and code, then the technical solution could be both usable and secure. I'm still not quite sure what that balance of UI would look like, but we're <a href=\"https://anil.recoil.org/projects/plancomp\">working on it</a>, spearheaded by <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, <a href=\"https://mynameismwd.org\">Michael Dales</a> and <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> across their research areas.</p>\n<p>The Wellcome and ESRC have also put together a <a href=\"https://zenodo.org/communities/wellcome/records?q=&f=subject%3AData%20Library&l=list&p=1&s=10&sort=newest\">series of whitepapers</a> about the challenges and potential approaches behind the NDL (via <a href=\"https://en.wikipedia.org/wiki/Nick_McKeown\">Nick McKeown</a>). I'm still going through them in detail, but the <a href=\"https://zenodo.org/records/14671714\">modular approach</a> paper makes sensible observations about not trying to build one enormous national database and not outsourcing it all to one organisation. 
Instead, they espouse a <a href=\"https://zenodo.org/records/14672004\">federated architectural</a> approach.</p>\n<p><a href=\"https://zenodo.org/records/14672004\"> \n<img alt=\"Sourced from https://zenodo.org/records/14672004\" src=\"https://anil.recoil.org/images/federated-ndl-ss-1.webp\" title=\"Sourced from https://zenodo.org/records/14672004\">\nSourced from https://zenodo.org/records/14672004 </a></p>\n<p>Since their primary (but not only) usecase focuses on <a href=\"https://ukhealthdata.org/\">health data</a>, there is an emphasis on moving the computation and data around rather than pooling it:</p>\n<blockquote>\n<p>The project's overlay mesh network dynamically and securely connects all the required resources. The\nmesh network creates a transient, project-specific, secure network boundary such that all the project\u2019s\ncomponents are within one overarching safe setting\n-- <a href=\"https://zenodo.org/records/14672004\">A federated architecture for a National Data Library</a></p>\n</blockquote>\n<p>This isn't a million miles away from how we set up <a href=\"https://docs.docker.com/engine/network/tutorials/overlay/\">overlay networks</a> on cloud infrastructure, but with the added twist of putting in more policy enforcement upfront.</p>\n<ul>\n<li>On the programming languages side, we're seeing exciting progress on <a href=\"https://github.com/MLanguage/mlang\">formalising legal systems</a> which encourages <a href=\"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4291177\">pair programming with lawyers</a> to capture the nuances of policy accurately (and <a href=\"https://news.law.northwestern.edu/sarah-lawsky-worked-on-a-tax-law-code-that-the-french-government-deemed-officially-awesome/\">pronounced 'awesome'</a> by the French government).</li>\n<li>At a systems level, <a href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a> recently published <a href=\"https://cs.brown.edu/people/malte/pub/papers/2024-sosp-sesame.pdf\">Sesame</a> 
which provides end-to-end privacy sandboxing guarantees, and there is classic work on <a href=\"https://www.usenix.org/conference/nsdi-08/securing-distributed-systems-information-flow-control\">DIFC</a> that we've been using <a href=\"https://anil.recoil.org/papers/2023-raid-deluminator\">more recently</a> in secure enclave programming.</li>\n<li>From a machine learning perspective, my colleague <a href=\"https://mlsys.cst.cam.ac.uk/\">Nic Lane</a>'s work on <a href=\"https://www.cam.ac.uk/research/news/can-federated-learning-save-the-world\">federated learning</a> via <a href=\"https://flower.ai/\">Flower</a> seems to be everywhere right now with its own <a href=\"https://flower.ai/events/flower-ai-summit-2025/\">summit</a> coming up.</li>\n</ul>\n<p>However, it's not all plain sailing, as there is also mega-controversy ongoing with the UK government's <a href=\"https://takes.jamesomalley.co.uk/p/ask-the-computer-people-first#footnote-anchor-3-156712689\">surprising</a> demands for an <a href=\"https://www.bbc.co.uk/news/articles/c20g288yldko\">encryption backdoor</a> into iCloud, leading to even more of a <a href=\"https://www.theregister.com/2025/02/13/us_demand_uk_apple_backdoor_close/\">geopolitical tangle</a> with the US. Irrespective of what happens with this particular case, it's clear that any end-to-end encryption in these federated systems will need to deal with the reality that jurisdictions will have different lawful decryption needs, so <a href=\"https://statusq.org/archives/2025/02/16/13063/\">end-to-end encryption may be at an end</a> for initiatives like the NDL. 
Add onto this the flagrant <a href=\"https://shujisado.org/2025/01/27/significant-risks-in-using-ai-models-governed-by-the-llama-license/\">disregard for licensing</a> in current pretrained language models but also the movement <a href=\"https://www.gov.uk/government/consultations/copyright-and-artificial-intelligence/copyright-and-artificial-intelligence\">to revise copyright laws</a> to legislate around this, and it's clear that technology will need to be fluid in adapting to matters of provenance tracking as well.</p>\n<p>There's definitely a rich set of academic literature in this space, combined with interesting constraints, and so I'll pull this together into an annotated bibtex soon!</p>\n<h2><a href=\"https://anil.recoil.org/#who-are-some-users-of-such-a-service\"></a>Who are some users of such a service?</h2>\n<p>To get some more inspiration on a technical solution, I've been looking to users of such an infrastructure to understand what easy-to-use interfaces might look like.</p>\n<p>My colleague <a href=\"https://inverseprobability.com/\">Neil Lawrence</a> over at <a href=\"https://ai.cam.ac.uk\">AI@Cam</a> co-lead a recent report into <a href=\"https://ai.cam.ac.uk/reports/access-to-data-case-studies\">case studies for the NDL</a> which is very much worth a read. From a conservation perspective, <a href=\"https://toao.com\">Sadiq Jaffer</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> both <a href=\"https://ai.cam.ac.uk/blog/conserving-with-code-how-data-is-helping-to-save-our-planet\">gave input</a> about the importance of having such infrastructure for <a href=\"https://anil.recoil.org/projects/ce\">evidence-driven landuse</a>.</p>\n<blockquote>\n<p>What would be helpful, according to Dr Jaffer, is more\nstandardisation between publishers for Open Access material\nunder permissive licences.\n[...] 
having a coherent archive for OA materials that are licensed\nin such a way that they can be used for data mining without\nany technical hurdles would be the ideal scenario for this kind\nof research, as well as for a National Data Library,\n-- <a href=\"https://ai.cam.ac.uk/projects/access-to-data-case-studies\">Access to Data for Research</a>, AI@CAM</p>\n</blockquote>\n<p><a href=\"https://ai.cam.ac.uk/projects/access-to-data-case-studies\"> \n<img alt=\"The extremely cool doodle on the workshop from AI@Cam\" src=\"https://anil.recoil.org/images/ai-cam-data-library.webp\" title=\"The extremely cool doodle on the workshop from AI@Cam\">\nThe extremely cool doodle on the workshop from AI@Cam </a></p>\n<p>Another very different group I talked to back in 2023 via Rosalind Goodfellow as part of her <a href=\"https://www.csap.cam.ac.uk/network/rosalind-goodfellow/\">CSaP</a> fellowship was the <a href=\"https://www.gov.uk/government/organisations/geospatial-commission\">Geospatial Commission</a> who began work on a <a href=\"https://www.gov.uk/guidance/national-underground-asset-register-nuar\">National Underground Asset Register</a>. The NUAR was initially restricted to "<a href=\"https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1148100/NUAR_FAQs__.pdf\">safe dig</a>" usecases and not exposed more widely for security and other concerns. In 2024, they subsequently <a href=\"https://gdsgeospatial.blog.gov.uk/2024/01/11/discovering-potential-opportunities-for-the-national-underground-asset-register/\">reported</a> great interest in expanded usecases and are doing a discovery project on how to expose this information via APIs. 
This seems like an ideal usecase for some of the access control needs discussed above, as it's not only a lot of data (being geospatial) but also updated quite frequently and not necessarily something to make entirely public (although <a href=\"https://x2n.com/blog/how-utility-companies-are-using-satellite-technology/\">satellite pipeline monitoring</a> is perhaps obsoleting this need).</p>\n<p>And a month ago after reading our <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon\">horizon scan for AI and conservation</a> paper, <a href=\"https://samreynolds.org/\">Sam Reynolds</a>, <a href=\"https://coomeslab.org\">David Coomes</a>, <a href=\"https://www.cisl.cam.ac.uk/directory/emily-shuckburgh\">Emily Shuckburgh</a> and I got invited by <a href=\"https://uk.linkedin.com/in/craig-bennett3\">Craig Bennett</a> to a remarkable dinner with the assembled leaders of all 46 of the UK's <a href=\"https://www.wildlifetrusts.org/\">wildlife trusts</a>. They are a collective of independent charities who together maintain wildlife areas across the UK, with most people living near one of their 2300+ parks (more than there are UK McDonald's branches!). Over the course of dinner, we heard from every single one of them, with the following gist:</p>\n<ul>\n<li>The 46 nature charities work by consensus but independently, and have recently been building more central coordination around their use of systematic biodiversity data gathering across the nation. 
They are building a data pool across all of them, which is important as the sensing they do is very biased both spatially and across species (we know lots about <a href=\"https://www.rspb.org.uk/whats-happening/big-garden-birdwatch\">birds</a>, less about <a href=\"https://www.britishhedgehogs.org.uk/british-hedgehog-now-officially-classified-as-vulnerable-to-extinction/\">hedgehogs</a>).</li>\n<li>The charities recognise that they need to take more risks as the pressures on UK nature are currently <a href=\"https://www.wildlifetrusts.org/news/new-report-reveals-drought-now-considered-biggest-risk-uk-nature-reserves\">immense</a>, which means harnessing their data and AI responsibly both to accelerate action and to recruit more participation from a broader cross-section of the UK population, for citizen science input but also just to experience it.</li>\n<li><a href=\"https://www.conservationevidence.com\">Conservation evidence</a> is important to them, and sharing data from one area to replicate that action elsewhere in the UK is essential but difficult to engineer from scratch. There's a real cost to generating this data, and some confusion about appropriate licensing strategies. I gave a somewhat mixed message here reflecting my own uncertainty about the right way forward: on one hand, restricted licensing might prevent their data being hoovered up by the big tech companies who give peanuts back in return, but then again the bad actors in this space would simply <a href=\"https://www.vox.com/technology/2023/7/27/23808499/ai-openai-google-meta-data-privacy-nope\">ignore</a> the licensing and the good actors probably <a href=\"https://www.weforum.org/stories/2023/01/davos23-ai-divide-global-north-global-south/\">can't afford</a> it.</li>\n</ul>\n<p>The trusts are operating on a fairly shoestring budget already, so they're a great candidate to benefit from a collective, federated National Data Library. 
In particular, if the NDL can nail down a <a href=\"https://www.gov.uk/working-with-trade-unions/collective-bargaining\">collective bargaining</a> model for data access to big tech companies, this could finance the collection costs among smaller organisations throughout the four nations. The same holds true for thousands of small organisations around the UK that could benefit from this infrastructure and kickstart more <a href=\"https://lookingforgrowth.uk/\">sustainable growth</a>.</p>\n<p>\n<img alt=\"The assembled CEOs of the Wildlife Trusts taught me an awful lot about hedgehogs that evening\" src=\"https://anil.recoil.org/images/wildlife-trusts-homerton.webp\" title=\"The assembled CEOs of the Wildlife Trusts taught me an awful lot about hedgehogs that evening\">\nThe assembled CEOs of the Wildlife Trusts taught me an awful lot about hedgehogs that evening</p>\n<p>I'm organising a get-together on the topic of <a href=\"https://anil.recoil.org/projects/plancomp\">planetary computing</a> next month with <a href=\"https://www.cs.cornell.edu/~jnfoster/\">Nate Foster</a> and a number of colleagues from around the world, so stay tuned for more updates in this space in the coming months! Your thoughts, as always, are most welcome.</p>\n\n<p><em>(Thanks <a href=\"https://samreynolds.org/\">Sam Reynolds</a> for the notes on what we discussed with the Wildlife Trusts)</em></p>\n\n\n<ol>\n<li>\n<p>This largely involved talking to individual publishers and agreeing not to directly train generative AI models and to keep them private to our own research use. Fairly reasonable stuff.</p>\n<span><a href=\"https://anil.recoil.org/#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+18
avsm/notes_ukri-grant-terra.json
···+"summary": "<p>I don't normally announce funded grants (preferring to focus on outcomes), but I'm really excited by this one and couldn't resist! Myself and my colleagues <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> (from computer science), <a href=\"https://coomeslab.org\">David Coomes</a> (from Plant Sciences), <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> (from Zoology) and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> (the Head of Science at <a href=\"https://www.unep-wcmc.org/en/the-team\">UNEP-WCMC</a>) have just received a \u00a31.2m grant from the UKRI to work on <a href=\"https://www.cst.cam.ac.uk/news/meet-terra-ai-aiming-map-terrestrial-life-planet\">building foundation models for planetary intelligence</a>.</p>\n<p>Now, normally a grant isn't news, but I wanted to highlight the scheme that it came under. UKRI announced an <a href=\"https://www.ukri.org/news/first-projects-from-ukris-new-interdisciplinary-scheme-announced/\">interdisciplinary program</a> specifically for projects that don't normally get funded by just one research council. In our case, this work usually falls between the cracks of EPSRC <em>("too much nature")</em> or NERC <em>("too much engineering")</em> or STFC <em>("not enough satellites")</em>. But this interdisciplinary program expressly assembled a panel across all these areas, and collectively gave us a shot. I really hope this scheme continues to gather steam within the UKRI.</p>\n<p>As to what we're doing? 
There'll be the evolution of the work described in <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a> and <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>, with lots of domain knowledge that we're pulling together with our partners at UNEP-WCMC (especially <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> and <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a>) on plant and animal species distributions across the globe.</p>\n<p>\n<img alt=\"Us freezing in a Scottish August counting heather growth. There&apos;s got to be a more scalable way of doing this, right?\" src=\"https://anil.recoil.org/images/2024-clr-scotland.webp\" title=\"Us freezing in a Scottish August counting heather growth. There&apos;s got to be a more scalable way of doing this, right?\">\nUs freezing in a Scottish August counting heather growth. There's got to be a more scalable way of doing this, right?</p>\n<h2><a href=\"https://anil.recoil.org/#learn-more\"></a>Learn more</h2>\n<p>You can read more both in the <a href=\"https://www.ukri.org/news/first-projects-from-ukris-new-interdisciplinary-scheme-announced/\">UKRI announcement today</a> and in the <a href=\"https://www.cst.cam.ac.uk/news/meet-terra-ai-aiming-map-terrestrial-life-planet\">Cambridge Computer Science coverage</a> about what we're up to. Some exciting preprints about our work in this space so far:</p>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2024-life\">LIFE: A metric for mapping the impact of land-cover change on global extinctions</a> is our new metric for calculating biodiversity impacts worldwide in a comparable way. 
We intend to extend it to cover plant species.</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-food-life\">Quantifying the impact of the food we eat on species extinctions</a> connects up the biodiversity metric to supply chains to figure out the environmental impact of human food consumption on the planet. We intend to increase its resolution significantly with the new foundation models derived from remote sensing data.</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder: Sense Long and Prosper</a> is a battery-efficient sensing platform I'm working on with our Imperial buddies. We need more data about our planet!</li>\n</ul>",+"content": "<p>I don't normally announce funded grants (preferring to focus on outcomes), but I'm really excited by this one and couldn't resist! Myself and my colleagues <a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> (from computer science), <a href=\"https://coomeslab.org\">David Coomes</a> (from Plant Sciences), <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew Balmford</a> (from Zoology) and <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> (the Head of Science at <a href=\"https://www.unep-wcmc.org/en/the-team\">UNEP-WCMC</a>) have just received a \u00a31.2m grant from the UKRI to work on <a href=\"https://www.cst.cam.ac.uk/news/meet-terra-ai-aiming-map-terrestrial-life-planet\">building foundation models for planetary intelligence</a>.</p>\n<p>Now, normally a grant isn't news, but I wanted to highlight the scheme that it came under. UKRI announced an <a href=\"https://www.ukri.org/news/first-projects-from-ukris-new-interdisciplinary-scheme-announced/\">interdisciplinary program</a> specifically for projects that don't normally get funded by just one research council. 
In our case, this work usually falls between the cracks of EPSRC <em>("too much nature")</em> or NERC <em>("too much engineering")</em> or STFC <em>("not enough satellites")</em>. But this interdisciplinary program expressly assembled a panel across all these areas, and collectively gave us a shot. I really hope this scheme continues to gather steam within the UKRI.</p>\n<p>As to what we're doing? There'll be the evolution of the work described in <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a> and <a href=\"https://anil.recoil.org/projects/life\">Mapping LIFE on Earth</a>, with lots of domain knowledge that we're pulling together with our partners at UNEP-WCMC (especially <a href=\"https://www.cambridgeconservation.org/about/people/professor-neil-burgess/\">Neil Burgess</a> and <a href=\"https://www.kew.org/science/our-science/people/ian-ondo\">Ian Ondo</a>) on plant and animal species distributions across the globe.</p>\n<p>\n<img alt=\"Us freezing in a Scottish August counting heather growth. There&apos;s got to be a more scalable way of doing this, right?\" src=\"https://anil.recoil.org/images/2024-clr-scotland.webp\" title=\"Us freezing in a Scottish August counting heather growth. There&apos;s got to be a more scalable way of doing this, right?\">\nUs freezing in a Scottish August counting heather growth. There's got to be a more scalable way of doing this, right?</p>\n<h2><a href=\"https://anil.recoil.org/#learn-more\"></a>Learn more</h2>\n<p>You can read more both in the <a href=\"https://www.ukri.org/news/first-projects-from-ukris-new-interdisciplinary-scheme-announced/\">UKRI announcement today</a> and in the <a href=\"https://www.cst.cam.ac.uk/news/meet-terra-ai-aiming-map-terrestrial-life-planet\">Cambridge Computer Science coverage</a> about what we're up to. 
Some exciting preprints about our work in this space so far:</p>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2024-life\">LIFE: A metric for mapping the impact of land-cover change on global extinctions</a> is our new metric for calculating biodiversity impacts worldwide in a comparable way. We intend to extend it to cover plant species.</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-food-life\">Quantifying the impact of the food we eat on species extinctions</a> connects up the biodiversity metric to supply chains to figure out the environmental impact of human food consumption on the planet. We intend to increase its resolution significantly with the new foundation models derived from remote sensing data.</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder: Sense Long and Prosper</a> is a battery-efficient sensing platform I'm working on with our Imperial buddies. We need more data about our planet!</li>\n</ul>",
+18
avsm/notes_unikernels-in-cacm.json
···+"summary": "<p>The Communications of the ACM have just published an article that <a href=\"https://github.com/djs55\">Dave Scott</a> and I wrote providing a broader background on the concept of <a href=\"http://anil.recoil.org/papers/2013-asplos-mirage.pdf\">Unikernels</a> that we\u2019ve been working on since about 2003, when we started building <a href=\"http://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Melange</a> and the <a href=\"http://anil.recoil.org/papers/2010-icfp-xen.pdf\">Xen toolstack</a>. You can read either the <a href=\"http://cacm.acm.org/magazines/2014/1/170866-unikernels\">print article</a> (requires an ACM subscription) or the <a href=\"http://queue.acm.org/detail.cfm?id=2566628\">open access version</a> on the ACM Queue.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/acm-queue-unikernels-ss.webp\" title=\"\">\n\nThere's been some interesting discussion about it already online:</p>\n<ul>\n<li>On <a href=\"http://www.reddit.com/r/programming/comments/1upy41/mirage_os_10_released_last_december/\">Reddit</a>, a number of queries about how it fits into the space of containers, microkernels, and other experimental operating systems.</li>\n<li>Coverage from <a href=\"http://www.eweek.com/cloud/xen-project-builds-its-own-cloud-os-mirage.html\">eWeek</a>, <a href=\"http://www.infoworld.com/t/operating-systems/xen-mirage-the-less-more-cloud-os-233823\">InfoWorld</a>, and <a href=\"http://www.linux.com/news/enterprise/cloud-computing/751156-are-cloud-operating-systems-the-next-big-thing\">Linux.com</a>, and a couple of interviews on InfoQ covering <a href=\"http://www.infoq.com/news/2013/12/mirageos\">Mirage</a> and my <a href=\"http://www.infoq.com/articles/real-world-ocaml-interview\">book on OCaml</a> that give more background on the project.</li>\n</ul>\n<p>Two of the most interesting bits of feedback for me personally came from <a href=\"http://en.wikipedia.org/wiki/Butler_Lampson\">Butler Lampson</a> (via Jon Crowcroft) and 
<a href=\"http://www.cs.cmu.edu/~rwh/\">Robert Harper</a>, two computer scientists who have made key contributions to operating systems and programming languages and provided some broader perspective.</p>\n<p>Butler Lampson points out (edited for the web):</p>\n<blockquote>\n<p>I found the Mirage work quite interesting: a 21st-century version of things that we did at Xerox in the 1970s. Of course, the application domain is quite different, and so is the whole-program optimization. And we couldn\u2019t afford garbage collection, so freeing storage was not type-safe. But there are lots of interesting parallels.</p>\n<p>The \u201cOS as libraries\u201d idea was what made it possible to fit big applications into the Alto\u2019s 128k bytes of memory:</p>\n<p><em>Lampson and Sproull</em>, <a href=\"http://research.microsoft.com/pubs/68223/acrobat.pdf\">An open operating system for a single-user machine</a>, ACM Operating Systems Rev. 11, 5 (Dec. 1979), pp 98-105. <a href=\"http://dl.acm.org/citation.cfm?id=800215.806575\">ACM</a>.</p>\n<p>The use of strong type-checking and interfaces for an OS was pioneered in <a href=\"http://en.wikipedia.org/wiki/Mesa_(programming_language)\">Mesa</a> and <a href=\"http://en.wikipedia.org/wiki/Pilot_(operating_system)\">Pilot</a>:</p>\n<p><em>Lauer and Satterthwaite</em>, <a href=\"http://dl.acm.org/citation.cfm?id=802937\">The impact of Mesa on system design</a>, Proc. 4th ICSE, Munich, Sep. 1979, pp 174-182.</p>\n<p><em>Redell et al</em>, <a href=\"http://web.cs.wpi.edu/~cs502/s06/Papers/Redell,%20Pilot%20Operating%20System.pdf\">Pilot: An Operating System for a Personal Computer</a>, Comm. ACM 23, 2 (Feb 1980), pp 81-92 (from 7th SOSP, 1979). 
<a href=\"http://dl.acm.org/citation.cfm?id=358818.358822&coll=DL&dl=ACM&CFID=396678249&CFTOKEN=51329799\">ACM</a>.</p>\n</blockquote>\n<p>Robert Harper correctly points out some related work that was missing from our CACM article:</p>\n<ul>\n<li><a href=\"http://www.cs.cmu.edu/~fox/foxnet.html\">FoxNet</a> is an implementation of the standard TCP/IP networking protocol stack using the <a href=\"http://en.wikipedia.org/wiki/Standard_ML\">Standard ML</a> (SML) language. It was part of a wide-reaching project at CMU in the 1990s that made seminal contributions in <a href=\"http://www.cs.cmu.edu/~fox/pcc.html\">proof-carrying code</a> and <a href=\"http://www.cs.cmu.edu/~fox/til.html\">typed intermediate languages</a>, among <a href=\"http://www.cs.cmu.edu/~fox/publications.html\">many other things</a>. The FoxNet stack was actually one of my big inspirations for wanting to build Mirage, since the elegance of using functors as a form of dependency injection into a system as complex as an OS and application stack is very desirable, and is the reason we chose to build Mirage in ML instead of another, less modular, language.</li>\n<li>Ensemble (website now offline but here\u2019s a <a href=\"http://www.cs.uni-potsdam.de/ti/kreitz/PDF/99sosp-fastpath.pdf\">SOSP 1999 paper</a>) is a group communication system written in OCaml, developed at Cornell and the Hebrew University. For an application builder, Ensemble provides a library of protocols that can be used for quickly building complex distributed applications. For a distributed systems researcher, Ensemble is a highly modular and reconfigurable toolkit: the high-level protocols provided to applications are really stacks of tiny protocol \u201clayers,\u201d each of which can be modified or rebuilt for experimentation.</li>\n</ul>\n<p>Both Ensemble and FoxNet echoed strongly throughout the design of Mirage (and its precursor software such as <a href=\"http://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Melange</a> in 2007). 
The <a href=\"http://openmirage.org/wiki/hello-world\">Mirage command-line tool</a> uses staged computation to build a concrete application out of functors, and we are making this even more programmable via a new <a href=\"https://github.com/mirage/mirage/pull/178\">combinator-based functor types</a> library that <a href=\"http://gazagnaire.org/\">Thomas Gazagnaire</a> built, and also experimenting with <a href=\"https://github.com/ocamllabs/higher\">higher kinded polymorphic</a> abstractions.</p>\n<p>My thanks to Butler Lampson and Robert Harper for making me go re-read their papers again, and I\u2019d like to leave you with Malte Schwarzkopf\u2019s <a href=\"http://www.cl.cam.ac.uk/~ms705/netos/os-reading-group.html\">OS Reading Group</a> papers for other essential reading in this space. Many more citations immediately relevant to Mirage can also be found in our <a href=\"http://anil.recoil.org/papers/2013-asplos-mirage.pdf\">ASPLOS 2013</a> paper.</p>",+"content": "<p>The Communications of the ACM have just published an article that <a href=\"https://github.com/djs55\">Dave Scott</a> and I wrote providing a broader background on the concept of <a href=\"http://anil.recoil.org/papers/2013-asplos-mirage.pdf\">Unikernels</a> that we\u2019ve been working on since about 2003, when we started building <a href=\"http://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Melange</a> and the <a href=\"http://anil.recoil.org/papers/2010-icfp-xen.pdf\">Xen toolstack</a>. 
You can read either the <a href=\"http://cacm.acm.org/magazines/2014/1/170866-unikernels\">print article</a> (requires an ACM subscription) or the <a href=\"http://queue.acm.org/detail.cfm?id=2566628\">open access version</a> on the ACM Queue.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/acm-queue-unikernels-ss.webp\" title=\"\">\n\nThere's been some interesting discussion about it already online:</p>\n<ul>\n<li>On <a href=\"http://www.reddit.com/r/programming/comments/1upy41/mirage_os_10_released_last_december/\">Reddit</a>, a number of queries about how it fits into the space of containers, microkernels, and other experimental operating systems.</li>\n<li>Coverage from <a href=\"http://www.eweek.com/cloud/xen-project-builds-its-own-cloud-os-mirage.html\">eWeek</a>, <a href=\"http://www.infoworld.com/t/operating-systems/xen-mirage-the-less-more-cloud-os-233823\">InfoWorld</a>, and <a href=\"http://www.linux.com/news/enterprise/cloud-computing/751156-are-cloud-operating-systems-the-next-big-thing\">Linux.com</a>, and a couple of interviews on InfoQ covering <a href=\"http://www.infoq.com/news/2013/12/mirageos\">Mirage</a> and my <a href=\"http://www.infoq.com/articles/real-world-ocaml-interview\">book on OCaml</a> that give more background on the project.</li>\n</ul>\n<p>Two of the most interesting bits of feedback for me personally came from <a href=\"http://en.wikipedia.org/wiki/Butler_Lampson\">Butler Lampson</a> (via Jon Crowcroft) and <a href=\"http://www.cs.cmu.edu/~rwh/\">Robert Harper</a>, two computer scientists who have made key contributions to operating systems and programming languages and provided some broader perspective.</p>\n<p>Butler Lampson points out (edited for the web):</p>\n<blockquote>\n<p>I found the Mirage work quite interesting: a 21st-century version of things that we did at Xerox in the 1970s. Of course, the application domain is quite different, and so is the whole-program optimization. 
And we couldn\u2019t afford garbage collection, so freeing storage was not type-safe. But there are lots of interesting parallels.</p>\n<p>The \u201cOS as libraries\u201d idea was what made it possible to fit big applications into the Alto\u2019s 128k bytes of memory:</p>\n<p><em>Lampson and Sproull</em>, <a href=\"http://research.microsoft.com/pubs/68223/acrobat.pdf\">An open operating system for a single-user machine</a>, ACM Operating Systems Rev. 11, 5 (Dec. 1979), pp 98-105. <a href=\"http://dl.acm.org/citation.cfm?id=800215.806575\">ACM</a>.</p>\n<p>The use of strong type-checking and interfaces for an OS was pioneered in <a href=\"http://en.wikipedia.org/wiki/Mesa_(programming_language%29\">Mesa</a> and <a href=\"http://en.wikipedia.org/wiki/Pilot_(operating_system%29\">Pilot</a>:</p>\n<p><em>Lauer and Satterthwaite</em>, <a href=\"http://dl.acm.org/citation.cfm?id=802937\">The impact of Mesa on system design</a>, Proc. 4th ICSE, Munich, Sep. 1979, pp 174-182.</p>\n<p><em>Redell et al</em>, <a href=\"http://web.cs.wpi.edu/~cs502/s06/Papers/Redell,%20Pilot%20Operating%20System.pdf\">Pilot: An Operating System for a Personal Computer</a>, Comm. ACM 23, 2 (Feb 1980), pp 81-92 (from 7th SOSP, 1979). <a href=\"http://dl.acm.org/citation.cfm?id=358818.358822&coll=DL&dl=ACM&CFID=396678249&CFTOKEN=51329799\">ACM</a>.</p>\n</blockquote>\n<p>Robert Harper correctly points out some related work that was missing from our CACM article:</p>\n<ul>\n<li><a href=\"http://www.cs.cmu.edu/~fox/foxnet.html\">FoxNet</a> is an implementation of the standard TCP/IP networking protocol stack using the <a href=\"http://en.wikipedia.org/wiki/Standard_ML\">Standard ML</a> (SML) language. 
It was part of a wide-reaching project at CMU in the 1990s that made seminal contributions in <a href=\"http://www.cs.cmu.edu/~fox/pcc.html\">proof-carrying code</a> and <a href=\"http://www.cs.cmu.edu/~fox/til.html\">typed intermediate languages</a>, among <a href=\"http://www.cs.cmu.edu/~fox/publications.html\">many other things</a>. The FoxNet stack was actually one of my big inspirations for wanting to build Mirage, since the elegance of using functors as a form of dependency injection into a system as complex as an OS and application stack is very desirable, and is the reason we chose to build Mirage in ML instead of another, less modular, language.</li>\n<li>Ensemble (website now offline but here\u2019s a <a href=\"http://www.cs.uni-potsdam.de/ti/kreitz/PDF/99sosp-fastpath.pdf\">SOSP 1999 paper</a>) is a group communication system written in OCaml, developed at Cornell and the Hebrew University. For an application builder, Ensemble provides a library of protocols that can be used for quickly building complex distributed applications. For a distributed systems researcher, Ensemble is a highly modular and reconfigurable toolkit: the high-level protocols provided to applications are really stacks of tiny protocol \u201clayers,\u201d each of which can be modified or rebuilt for experimentation.</li>\n</ul>\n<p>Both Ensemble and FoxNet echoed strongly throughout the design of Mirage (and its precursor software such as <a href=\"http://anil.recoil.org/papers/2007-eurosys-melange.pdf\">Melange</a> in 2007). 
The <a href=\"http://openmirage.org/wiki/hello-world\">Mirage command-line tool</a> uses staged computation to build a concrete application out of functors, and we are making this even more programmable via a new <a href=\"https://github.com/mirage/mirage/pull/178\">combinator-based functor types</a> library that <a href=\"http://gazagnaire.org/\">Thomas Gazagnaire</a> built, and also experimenting with <a href=\"https://github.com/ocamllabs/higher\">higher kinded polymorphic</a> abstractions.</p>\n<p>My thanks to Butler Lampson and Robert Harper for making me go re-read their papers again, and I\u2019d like to leave you with Malte Schwarzkopf\u2019s <a href=\"http://www.cl.cam.ac.uk/~ms705/netos/os-reading-group.html\">OS Reading Group</a> papers for other essential reading in this space. Many more citations immediately relevant to Mirage can also be found in our <a href=\"http://anil.recoil.org/papers/2013-asplos-mirage.pdf\">ASPLOS 2013</a> paper.</p>",
+18
avsm/notes_unikernels-test-of-time.json
···+"summary": "<p>I was gobsmacked to get a note from the SIGARCH <a href=\"https://www.asplos-conference.org\">ASPLOS</a> steering committee that our 2013 paper "<a href=\"https://anil.recoil.org/papers/2013-asplos-mirage\">Unikernels: library operating systems for the cloud</a>" won the <a href=\"https://www.sigarch.org/benefit/awards/acm-sigarch-sigplan-sigops-asplos-influential-paper-award/\">most influential paper</a> award at the conference last week! I couldn't make it to Rotterdam myself due to the <a href=\"https://www.businesstraveller.com/forums/topic/reminder-no-direct-eurostar-amsterdam-rotterdam-london-for-six-months/\">travel time</a>, but <a href=\"https://github.com/mor1\">Richard Mortier</a> was <a href=\"https://mort.io/blog/tdis-accepted/\">already there</a> and so accepted the award on the whole team's behalf!</p>\n<p>My officemate <a href=\"https://en.wikipedia.org/wiki/Simon_Peyton_Jones\">Simon Peyton Jones</a> pointed out to me that these 'test of time' awards are his favourite, as they indicate that a piece of research was actually useful over a number of years to other people in the field:</p>\n<blockquote>\n<p>The ASPLOS Influential Paper Award recognizes historical ASPLOS papers that have had major influence on the field. 
The Program Committee nominates highly influential papers from any ASPLOS conference that occurred ten or more conferences ago, with the final selections being made by the ASPLOS Steering Committee.\n-- <a href=\"https://www.sigarch.org/benefit/awards/acm-sigarch-sigplan-sigops-asplos-influential-paper-award/\">SIGARCH awards</a></p>\n</blockquote>\n<p>\n<img alt=\"Mort rocking the award with customary peak-geek EDSAC t-shirt\" src=\"https://anil.recoil.org/images/asplos25-award-1.webp\" title=\"Mort rocking the award with customary peak-geek EDSAC t-shirt\">\nMort rocking the award with customary peak-geek EDSAC t-shirt</p>\n<p>My long-time colleague <a href=\"https://github.com/djs55\">Dave Scott</a> wrote up a great overview of why <a href=\"https://dave.recoil.org/unikernels/\">he likes unikernels</a>, especially in areas like <a href=\"https://anil.recoil.org/\">Docker</a> where he is a senior maintainer these days. Dave uses a nice jigsaw puzzle analogy to show the value of a library operating system approach when building complex systems glue; they're good for high assurance applications, for rapid experimentation and iteration, and for deep systems customisation.</p>\n<h2><a href=\"https://anil.recoil.org/#i-almost-rage-quit-academia\"></a>I almost rage-quit academia</h2>\n<p><a href=\"https://github.com/mor1\">Richard Mortier</a>'s <a href=\"https://mort.io/blog/happy-day/\">note</a> brought up some memories about how this particular work made me almost rage-quit academia entirely. 
Back in 2009 after I returned to academia with a <a href=\"https://horizon.ac.uk\">Horizon</a> fellowship, I spent a year of my life working on lots of libraries for the very first iteration of MirageOS, and published a USENIX <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp\">workshop paper</a> on it.</p>\n<p>After that, it was time to do the full conference paper, so I spent another year (joined by more colleagues like <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> and <a href=\"https://github.com/djs55\">Dave Scott</a> in the <a href=\"https://anil.recoil.org/projects/unikernels\">early</a> days) bringing yet more OCaml libraries to life to make the thing actually useful. <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\">Charalampos Rotsos</a> and <a href=\"https://github.com/mor1\">Richard Mortier</a> spent forever on an OpenFlow implementation in pure OCaml; <a href=\"https://github.com/balrajsingh\">Balraj Singh</a> sorted out the mess I made of the TCP stack congestion algorithms; <a href=\"https://github.com/sos22\">Steven Smith</a> hacked on Xen with me to add immutable pfn support. There was a lot of intense hacking going on.</p>\n<p>We then triumphantly wrote up our work as a submission to OSDI 2012 after staying up all night for several days in a row to get the evaluation done, and... it got rejected. But the paper didn't <em>just</em> get rejected, it got rejected so hard that I couldn't bear to look at another OSDI proceedings for years. So hard that I can still feel the weight of the punch of that email as I opened it eagerly, a decade on. 
Some of our colleagues also had rejected OSDI papers that year; the <a href=\"https://www.microsoft.com/en-us/research/project/naiad/?from=https://research.microsoft.com/en-us/projects/naiad/&type=exact\">Naiad</a> paper got six reviews but bounced (later to win best paper at <a href=\"https://sigops.org/s/conferences/sosp/2013/\">SOSP 2013</a>), and <span>Andrew Warfield</span> had another that got nine (!) reviews before being shown the door. But ours got...three reviews indicating it bounced in the very first round, and one review scored us at '1/5', the lowest possible value. It made all that intense work seem like a total waste of our lives.</p>\n<p>But then... one of the reviews shone out like a beacon. It was the longest review, and was <em>full</em> of directly actionable feedback. It began very constructively:</p>\n<blockquote>\n<p>The approach described in this paper is a very reasonable design point,\na natural intersection of the exokernel and libOS and the type-safe OS.\nIt would help to refocus the abstract and intro on the precise benefits that\nUnikernel can provide, and to attribute each benefit to its origin. Let me\ntake a stab at this, based on my read of the paper:</p>\n</blockquote>\n<p>The reviewer then reinterpreted our submission to add more clarity:</p>\n<blockquote>\n<ul>\n<li>"eliminate several classes of security threats via type-safety" -- clearly due to the top-to-bottom use of type-safe OCaml.</li>\n<li>"eliminate several classes of security threats via ... an address-space which can be made immutable" -- is there any reason an analogous technique could not apply to a libOS version of libc?</li>\n<li>"progressive specialization" -- I didn't find this contribution very exciting. It's nice that I can execute the same OCaml app in a Linux context to debug it; perhaps this is especially important since we don't yet have good symbolic debuggers for OCaml apps standing alone in a VM. 
But it really doesn't seem like a central benefit.</li>\n<li>"developers no longer need to become sysadmins" -- This claim is specious. If a third party packages an app together with a Linux guest-OS stack to become an appliance, there's no reason that appliance would require any sysadmin-ish behavior more than a Unikernel appliance.</li>\n<li>"The resulting unikernels are also highly compact" -- Could this property not also be readily achieved with a tuned libc-based (that is, not type-safe) libOS? How precious is this property? Is the <em>working set</em> actually much smaller? And finally, how much of the reduction is because the rewritten application is much less functional than the industry standard it replaces?\n[...the review continues on for several pages]</li>\n</ul>\n</blockquote>\n<p>Now, I didn't <em>agree</em> with all the points in the review, but they were restructured in such a way that made it clear that the reviewer had really thought about it, and had tried to pull out their own insights from the system construction. We ate up this feedback, and resubmitted it in a matter of weeks to ASPLOS adopting much of the structure suggested by this OSDI reviewer, and the paper got in with accepts across the board.</p>\n<p>The best bit of all this? The OSDI reviewer voluntarily <em>unblinded</em> their review:</p>\n<blockquote>\n<p>Review by Jon Howell <a href=\"mailto:howell@microsoft.com\">howell@microsoft.com</a>, intentionally unblinded.</p>\n</blockquote>\n<p>Jon's obviously an expert in the field (his own 2011 paper on <a href=\"https://dl.acm.org/doi/10.1145/1961296.1950399\">Drawbridge</a> won the ASPLOS influential paper award last year), but it's how much time he took in helping out a sibling system that stuck with me. 
His kind, constructive and direct review kept me in academia, and although I still haven't met him in person (life got really busy right afterwards with <a href=\"https://anil.recoil.org/notes/docker-buys-unikernel-systems\">Unikernel Systems</a>), I definitely still owe him a pint!</p>\n<h2><a href=\"https://anil.recoil.org/#systems-research-a-decade-or-more-on\"></a>Systems research a decade or more on</h2>\n<p>Mort also <a href=\"https://mort.io/blog/happy-day/\">made me think</a> about what we learnt from all this work that current students might learn from.</p>\n<p>You never know which papers will sink or swim in the fullness of time and most never get that popular outside a small niche, so don't worry about that right now when doing the work! Focus instead on honing your systems intuition for <em>why</em> you're building something, and bring out the <a href=\"https://blog.regehr.org/archives/6\">delta of your contributions</a> clearly in the paper. When you're building a complex system, there's a lot of boilerplate and scaffolding necessary, but the core of the &quot;thing that hasn't been done before&quot; is what you know best, and it can take some time and <a href=\"https://en.wikipedia.org/wiki/Rubber_duck_debugging\">conversations</a> to figure out what that is.</p>\n<p>\n<img alt=\"In the much missed Flying Pig...with Jon\" src=\"https://anil.recoil.org/images/jc-flyingpig-1.webp\" title=\"In the much missed Flying Pig...with Jon\">\nIn the much missed Flying Pig...with Jon</p>\n<p>Back in the day, most of our discussions about systems research happened after a day of deep hacking down at the pub.\nSince the pandemic, we seem to have lost a big chunk of that social discussion around our work collectively. While I still see people regularly for a swift half, it somehow seems more difficult to gather people in general. 
Part of that is that I don't go into the department much anymore due to the noise and cold (something <a href=\"https://jonmsterling.com\">Jon Sterling</a> also <a href=\"https://amok.recoil.org/@jonmsterling@mathstodon.xyz/114318437109811024\">observed recently</a>), <em>vs</em> my cosy Pembroke office.</p>\n<p>I'm not sure if it's just me or everyone else also feeling this, but I'm so zoned out after even a few Zoom calls that I'm just not very social afterwards. So one thing I'm aiming to do more consciously after Easter is to try to gather people down at <a href=\"https://www.themillpubcambridge.com/\">The Mill</a> more for a swift half (where they serve excellent <a href=\"https://www.guinness.com/en-gb/beers/guinness-zero\">Guinness Zero</a>), and really cut down on remote interactions that aren't necessary.</p>\n<p>\n<img alt=\"In the Kingston Arms...with Jon\" src=\"https://anil.recoil.org/images/jc-kingston-1.webp\" title=\"In the Kingston Arms...with Jon\">\nIn the Kingston Arms...with Jon</p>\n<p>And the last tip is from Barry Schwartz, who noted that <a href=\"https://en.wikipedia.org/wiki/The_Paradox_of_Choice\">the secret to happiness is low expectations</a>. No matter how much work goes into a system, don't bank too much on the big papers making a splash. Instead, enjoy every step of the journey -- from building things, scrapping them, debugging odd failures, throwing ideas around, releasing the software, seeing adoption, scrapping it all and starting again, the whole journey! 
There will always be a <a href=\"https://link.springer.com/article/10.1007/s40037-021-00670-z\">reviewer 2</a> waiting to ruin your day if you let them, so don't let them in.</p>\n<p>I also wonder how long paper publishing in its current form will survive; with the sheer number of publications coming out these days and the amount of <a href=\"https://anil.recoil.org/notes/ai-contamination-of-papers\">AI generated output</a>, it's difficult to see something published today having the same ramp as the work we did in the past few decades. Instead, adoption and rapid iteration seem to be the way to go. Thankfully, our University intellectual property rights <a href=\"https://www.enterprise.cam.ac.uk/wp-content/uploads/2015/04/IP-Policy-in-Practice-Guidance-Note-25May10-FINAL-CLEAN-Updated-links-August-2015.pdf\">remain liberal</a> (patents aside, but who cares about those these days for software), so there's nothing stopping us!</p>\n<p>\n<img alt=\"In the Mill...with Jon. Remembering our friend Ross!\" src=\"https://anil.recoil.org/images/jc-mill-1.webp\" title=\"In the Mill...with Jon. Remembering our friend Ross!\">\nIn the Mill...with Jon. Remembering our friend Ross!</p>",+"content": "<p>I was gobsmacked to get a note from the SIGARCH <a href=\"https://www.asplos-conference.org\">ASPLOS</a> steering committee that our 2013 paper "<a href=\"https://anil.recoil.org/papers/2013-asplos-mirage\">Unikernels: library operating systems for the cloud</a>" won the <a href=\"https://www.sigarch.org/benefit/awards/acm-sigarch-sigplan-sigops-asplos-influential-paper-award/\">most influential paper</a> award at the conference last week! 
I couldn't make it to Rotterdam myself due to the <a href=\"https://www.businesstraveller.com/forums/topic/reminder-no-direct-eurostar-amsterdam-rotterdam-london-for-six-months/\">travel time</a>, but <a href=\"https://github.com/mor1\">Richard Mortier</a> was <a href=\"https://mort.io/blog/tdis-accepted/\">already there</a> and so accepted the award on the whole team's behalf!</p>\n<p>My officemate <a href=\"https://en.wikipedia.org/wiki/Simon_Peyton_Jones\">Simon Peyton Jones</a> pointed out to me that these 'test of time' awards are his favourite, as they indicate that a piece of research was actually useful over a number of years to other people in the field:</p>\n<blockquote>\n<p>The ASPLOS Influential Paper Award recognizes historical ASPLOS papers that have had major influence on the field. The Program Committee nominates highly influential papers from any ASPLOS conference that occurred ten or more conferences ago, with the final selections being made by the ASPLOS Steering Committee.\n-- <a href=\"https://www.sigarch.org/benefit/awards/acm-sigarch-sigplan-sigops-asplos-influential-paper-award/\">SIGARCH awards</a></p>\n</blockquote>\n<p>\n<img alt=\"Mort rocking the award with customary peak-geek EDSAC t-shirt\" src=\"https://anil.recoil.org/images/asplos25-award-1.webp\" title=\"Mort rocking the award with customary peak-geek EDSAC t-shirt\">\nMort rocking the award with customary peak-geek EDSAC t-shirt</p>\n<p>My long-time colleague <a href=\"https://github.com/djs55\">Dave Scott</a> wrote up a great overview of why <a href=\"https://dave.recoil.org/unikernels/\">he likes unikernels</a>, especially in areas like <a href=\"https://anil.recoil.org/\">Docker</a> where he is a senior maintainer these days. 
Dave uses a nice jigsaw puzzle analogy to show the value of a library operating system approach when building complex systems glue; they're good for high assurance applications, for rapid experimentation and iteration, and for deep systems customisation.</p>\n<h2><a href=\"https://anil.recoil.org/#i-almost-rage-quit-academia\"></a>I almost rage-quit academia</h2>\n<p><a href=\"https://github.com/mor1\">Richard Mortier</a>'s <a href=\"https://mort.io/blog/happy-day/\">note</a> brought up some memories about how this particular work made me almost rage-quit academia entirely. Back in 2009 after I returned to academia with a <a href=\"https://horizon.ac.uk\">Horizon</a> fellowship, I spent a year of my life working on lots of libraries for the very first iteration of MirageOS, and published a USENIX <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp\">workshop paper</a> on it.</p>\n<p>After that, it was time to do the full conference paper, so I spent another year (joined by more colleagues like <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> and <a href=\"https://github.com/djs55\">Dave Scott</a> in the <a href=\"https://anil.recoil.org/projects/unikernels\">early</a> days) bringing yet more OCaml libraries to life to make the thing actually useful. <a href=\"https://www.lancaster.ac.uk/scc/about-us/people/charalampos-rotsos\">Charalampos Rotsos</a> and <a href=\"https://github.com/mor1\">Richard Mortier</a> spent forever on an OpenFlow implementation in pure OCaml; <a href=\"https://github.com/balrajsingh\">Balraj Singh</a> sorted out the mess I made of the TCP stack congestion algorithms; <a href=\"https://github.com/sos22\">Steven Smith</a> hacked on Xen with me to add immutable pfn support. There was a lot of intense hacking going on.</p>\n<p>We then triumphantly wrote up our work as a submission to OSDI 2012 after staying up all night for several days in a row to get the evaluation done, and... it got rejected. 
But the paper didn't <em>just</em> get rejected, it got rejected so hard that I couldn't bear to look at another OSDI proceedings for years. So hard that I can still feel the weight of the punch of that email as I opened it eagerly, a decade on. Some of our colleagues also had rejected OSDI papers that year; the <a href=\"https://www.microsoft.com/en-us/research/project/naiad/?from=https://research.microsoft.com/en-us/projects/naiad/&type=exact\">Naiad</a> paper got six reviews but bounced (later to win best paper at <a href=\"https://sigops.org/s/conferences/sosp/2013/\">SOSP 2013</a>), and <span>Andrew Warfield</span> had another that got nine (!) reviews before being shown the door. But ours got...three reviews indicating it bounced in the very first round, and one review scored us at '1/5', the lowest possible value. It made all that intense work seem like a total waste of our lives.</p>\n<p>But then... one of the reviews shone out like a beacon. It was the longest review, and was <em>full</em> of directly actionable feedback. It began very constructively:</p>\n<blockquote>\n<p>The approach described in this paper is a very reasonable design point,\na natural intersection of the exokernel and libOS and the type-safe OS.\nIt would help to refocus the abstract and intro on the precise benefits that\nUnikernel can provide, and to attribute each benefit to its origin. Let me\ntake a stab at this, based on my read of the paper:</p>\n</blockquote>\n<p>The reviewer then reinterpreted our submission to add more clarity:</p>\n<blockquote>\n<ul>\n<li>&quot;eliminate several classes of security threats via type-safety&quot; -- clearly due to the top-to-bottom use of type-safe OCaml.</li>\n<li>&quot;eliminate several classes of security threats via ... an address-space which can be made immutable&quot; -- is there any reason an analogous technique could not apply to a libOS version of libc?</li>\n<li>&quot;progressive specialization&quot; -- I didn't find this contribution very exciting. 
It's nice that I can execute the same OCaml app in a Linux context to debug it; perhaps this is especially important since we don't yet have good symbolic debuggers for OCaml apps standing alone in a VM. But it really doesn't seem like a central benefit.</li>\n<li>"developers no longer need to become sysadmins" -- This claim is specious. If a third party packages an app together with a Linux guest-OS stack to become an appliance, there's no reason that appliance would require any sysadmin-ish behavior more than a Unikernel appliance.</li>\n<li>"The resulting unikernels are also highly compact" -- Could this property not also be readily achieved with a tuned libc-based (that is, not type-safe) libOS? How precious is this property? Is the <em>working set</em> actually much smaller? And finally, how much of the reduction is because the rewritten application is much less functional than the industry standard it replaces?\n[...the review continues on for several pages]</li>\n</ul>\n</blockquote>\n<p>Now, I didn't <em>agree</em> with all the points in the review, but they were restructured in such a way that made it clear that the reviewer had really thought about it, and had tried to pull out their own insights from the system construction. We ate up this feedback, and resubmitted it in a matter of weeks to ASPLOS adopting much of the structure suggested by this OSDI reviewer, and the paper got in with accepts across the board.</p>\n<p>The best bit of all this? The OSDI reviewer voluntarily <em>unblinded</em> their review:</p>\n<blockquote>\n<p>Review by Jon Howell <a href=\"mailto:howell@microsoft.com\">howell@microsoft.com</a>, intentionally unblinded.</p>\n</blockquote>\n<p>Jon's obviously an expert in the field (his own 2011 paper on <a href=\"https://dl.acm.org/doi/10.1145/1961296.1950399\">Drawbridge</a> won the ASPLOS influential paper award last year), but it's how much time he took in helping out a sibling system that stuck with me. 
His kind, constructive and direct review kept me in academia, and although I still haven't met him in person (life got really busy right afterwards with <a href=\"https://anil.recoil.org/notes/docker-buys-unikernel-systems\">Unikernel Systems</a>), I definitely still owe him a pint!</p>\n<h2><a href=\"https://anil.recoil.org/#systems-research-a-decade-or-more-on\"></a>Systems research a decade or more on</h2>\n<p>Mort also <a href=\"https://mort.io/blog/happy-day/\">made me think</a> about what we learnt from all this work that current students might learn from.</p>\n<p>You never know which papers will sink or swim in the fullness of time and most never get that popular outside a small niche, so don't worry about that right now when doing the work! Focus instead on honing your systems intuition for <em>why</em> you're building something, and bring out the <a href=\"https://blog.regehr.org/archives/6\">delta of your contributions</a> clearly in the paper. When you're building a complex system, there's a lot of boilerplate and scaffolding necessary, but the core of the &quot;thing that hasn't been done before&quot; is what you know best, and it can take some time and <a href=\"https://en.wikipedia.org/wiki/Rubber_duck_debugging\">conversations</a> to figure out what that is.</p>\n<p>\n<img alt=\"In the much missed Flying Pig...with Jon\" src=\"https://anil.recoil.org/images/jc-flyingpig-1.webp\" title=\"In the much missed Flying Pig...with Jon\">\nIn the much missed Flying Pig...with Jon</p>\n<p>Back in the day, most of our discussions about systems research happened after a day of deep hacking down at the pub.\nSince the pandemic, we seem to have lost a big chunk of that social discussion around our work collectively. While I still see people regularly for a swift half, it somehow seems more difficult to gather people in general. 
Part of that is that I don't go into the department much anymore due to the noise and cold (something <a href=\"https://jonmsterling.com\">Jon Sterling</a> also <a href=\"https://amok.recoil.org/@jonmsterling@mathstodon.xyz/114318437109811024\">observed recently</a>), <em>vs</em> my cosy Pembroke office.</p>\n<p>I'm not sure if it's just me or if everyone else is feeling this too, but I'm so zoned out after even a few Zoom calls that I'm just not very social afterwards. So one thing I'm aiming to do more consciously after Easter is to try to gather people down at <a href=\"https://www.themillpubcambridge.com/\">The Mill</a> more for a swift half (where they serve excellent <a href=\"https://www.guinness.com/en-gb/beers/guinness-zero\">Guinness Zero</a>), and really cut down on remote interactions that aren't necessary.</p>\n<p>\n<img alt=\"In the Kingston Arms...with Jon\" src=\"https://anil.recoil.org/images/jc-kingston-1.webp\" title=\"In the Kingston Arms...with Jon\">\nIn the Kingston Arms...with Jon</p>\n<p>And the last tip is from Barry Schwartz, who noted that <a href=\"https://en.wikipedia.org/wiki/The_Paradox_of_Choice\">the secret to happiness is low expectations</a>. No matter how much work goes into a system, don't bank too much on the big papers making a splash. Instead, enjoy every step of the journey -- building things, scrapping them, debugging odd failures, throwing ideas around, releasing the software, seeing adoption, scrapping it all and starting again -- the whole journey! 
There will always be a <a href=\"https://link.springer.com/article/10.1007/s40037-021-00670-z\">reviewer 2</a> waiting to ruin your day if you let them, so don't let them in.</p>\n<p>I also wonder how long paper publishing in its current form will survive; with the sheer number of publications coming out these days and the amount of <a href=\"https://anil.recoil.org/notes/ai-contamination-of-papers\">AI generated output</a>, it's difficult to see something published today having the same ramp as the work we did in the past few decades. Instead, adoption and rapid iteration seem to be the way to go. Thankfully, our University intellectual property rights <a href=\"https://www.enterprise.cam.ac.uk/wp-content/uploads/2015/04/IP-Policy-in-Practice-Guidance-Note-25May10-FINAL-CLEAN-Updated-links-August-2015.pdf\">remain liberal</a> (patents aside, but who cares about those these days for software), so there's nothing stopping us!</p>\n<p>\n<img alt=\"In the Mill...with Jon. Remembering our friend Ross!\" src=\"https://anil.recoil.org/images/jc-mill-1.webp\" title=\"In the Mill...with Jon. Remembering our friend Ross!\">\nIn the Mill...with Jon. Remembering our friend Ross!</p>",
+18
avsm/notes_vpnkit-hyperkit.json
···+"summary": "<p>I announce the release of three big components that form the basis for <a href=\"https://docker.com\">Docker for Desktop</a>: a hypervisor framework called HyperKit, a networking framework for host translation called VPNKit, and a versioned data management store called DataKit.</p>",+"content": "<p>I announce the release of three big components that form the basis for <a href=\"https://docker.com\">Docker for Desktop</a>: a hypervisor framework called HyperKit, a networking framework for host translation called VPNKit, and a versioned data management store called DataKit.</p>",
+18
avsm/notes_wasm-on-exotic-targets.json
···+"summary": "<p>It's about the time of the academic year to come up with project <a href=\"https://anil.recoil.org/ideas\">ideas</a>! <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a>, <a href=\"https://github.com/andrewray\">Andy Ray</a> and I have been looking into <a href=\"https://anil.recoil.org/notes/fpgas-hardcaml\">FPGA/OCaml matters</a> recently so I thought I'd review the latest in the land of <a href=\"https://webassembly.org\">Webassembly</a> for non-traditional hardware targets. It turns out that there are very fun systems projects going on to turn wasm into a "real" target architecture on several fronts: a native port of Linux to run in wasm, a port of wasm to run in kernel space, a POSIX mapping of wasm, and fledgling wasm-CPUs-on-FPGAs.</p>\n<h2><a href=\"https://anil.recoil.org/#native-port-of-linux-to-wasm\"></a>Native port of Linux to wasm</h2>\n<p>The first one is a <a href=\"https://github.com/tombl/linux\"><em>native</em> port</a> of the Linux kernel to run in webassembly (<a href=\"https://linux.tombl.dev\">try it in your browser</a>). This isn't an emulation; instead, the various kernel subsystems have been ported to have wasm interfaces, so the C kernel code runs directly as webassembly, with virtual device layers.</p>\n<p>The inspiration for this seems to have come from a famous comment eight years ago on the LKML:</p>\n<blockquote>\n<p>One more general comment: I think this may well be the last new CPU architecture we ever add to the kernel. Both nds32 and c-sky are made by companies that also work on risc-v, and generally speaking risc-v seems to be killing off any of the minor licensable instruction set projects, just like ARM has mostly killed off the custom vendor-specific instruction sets already. 
If we add another architecture in the future, it may instead be something like the LLVM bitcode or WebAssembly, who knows?\n-- <a href=\"https://lore.kernel.org/all/CAK8P3a2-wyXxctVtJxniUoeShASMhF-6Z1vyvfBnr6wKJuioAQ@mail.gmail.com/\">Arnd Bergmann, LKML, 2018</a></p>\n</blockquote>\n<p>And this port brings us much closer to that! I need to spelunk more into the diffs to the mainline kernel to see how it all works, but some quick notes:</p>\n<ul>\n<li>the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts\">tools/wasm</a> directory shows how some of the glue code works, such as the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts\">worker.ts</a> which uses <a href=\"https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers\">WebWorkers</a> to implement multicore, and the venerable <a href=\"https://wiki.libvirt.org/Virtio.html\">virtio</a> to implement <a href=\"https://github.com/tombl/linux/blob/wasm/tools/wasm/src/virtio.ts#L204\">virtual block devices</a>.</li>\n<li>the <a href=\"https://github.com/tombl/linux/tree/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm\">arch/wasm</a> contains the glue code, and <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/irq.c#L17\">mm.c</a> shows how atomic builtins in wasm are sufficient to implement low-level memory management. The <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/fork.c#L12C2-L12C24\">clone</a> implementation leads us to <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/include/asm/wasm_imports.h\">wasm_imports.h</a> which shows all the FFI stubs needed from the runtime in <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/wasm.ts#L21\">worker.ts</a>. 
Notably, it looks like the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts#L103\">process switcher</a> doesn't use the <a href=\"https://github.com/WebAssembly/stack-switching\">wasm stack switching</a> extension (possibly for compatibility?).</li>\n<li>the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/syscall.c#L19\">arch/wasm/kernel/syscall.c</a> (and that whole directory) could form the basis for a nice OS teaching course. Implementing the core of an OS on a virtual hypervisor is always <a href=\"https://anil.recoil.org/projects/unikernels\">an educational experience</a>, and this port is based on "real" Linux!</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#running-wasm-in-linux-kernel-mode\"></a>Running wasm in Linux kernel mode</h2>\n<p>On the opposite end of the architecture spectrum, we have a <a href=\"https://github.com/wasmerio/kernel-wasm\">Linux in-kernel WASM runtime</a>. This one allows running userspace code within the kernel space, as motivated by:</p>\n<blockquote>\n<p>Since WASM is a virtual ISA protected by a virtual machine, we don't need to rely on external hardware and software checks to ensure safety. Running WASM in the kernel avoids most of the overhead introduced by those checks, e.g. 
system call (context switching) and <code>copy_{from,to}_user</code>, therefore improving performance.\nAlso, having low-level control means that we can implement a lot of features that were heavy or impossible in userspace, like virtual memory tricks and handling of intensive kernel events (like network packet filtering).\n-- <a href=\"https://github.com/wasmerio/kernel-wasm?tab=readme-ov-file#why-run-webassembly-in-the-kernel\">Why run Wasm in the kernel</a></p>\n</blockquote>\n<p>There are some interesting <a href=\"https://github.com/wasmerio/wasmer/tree/main/examples#examples\">example applications</a> available that they accelerate. They report on the speedup for an echo and http server that can run in kernel space:</p>\n<blockquote>\n<p>When compiled with the singlepass backend (unoptimized direct x86-64 code generation) and benchmarked using tcpkali/wrk, echo-server is ~10% faster (25210 Mbps / 22820 Mbps) than its native equivalent, and http-server is ~6% faster (53293 Rps / 50083 Rps). Even higher performance is expected when the other two Wasmer backends with optimizations (Cranelift/LLVM) are updated to support generating code for the kernel.\n-- <a href=\"https://github.com/wasmerio/kernel-wasm?tab=readme-ov-file#examples-and-benchmark\">kernel wasm benchmarks</a></p>\n</blockquote>\n<h2><a href=\"https://anil.recoil.org/#running-posix-applications-in-the-browser\"></a>Running POSIX applications in the browser</h2>\n<p>The kernel-wasm port led me to look more closely at the wasmer runtime, which in turn also extends the <a href=\"https://wasi.dev\">wasi</a> server-side interface of WASM to support full POSIX compatibility. 
You can also view this in the <a href=\"https://wasmer.sh\">browser as a shell</a>, where a variety of applications can be compiled to wasm and run as if you had a shell in the browser!</p>\n<p>There is impressive support for POSIX here, as well as a <a href=\"https://wasmer.io/posts/introducing-the-wasmer-js-sdk\">wasmer/wasix SDK</a> to port existing applications like ffmpeg to run in the browser or <a href=\"https://wasmer.io/posts/wasmer-js-sdk-now-supports-node-and-bun\">in a server JS runtime</a>.</p>\n<p>So what's stopping OCaml -- via the <a href=\"https://tarides.com/blog/2023-11-01-webassembly-support-for-ocaml-introducing-wasm-of-ocaml/\">new wasm-of-ocaml compiler</a> -- from running in the browser? Just the fact that our target runtime depends on the <a href=\"https://github.com/WebAssembly/stack-switching\">wasm stack switching</a> extension, and <a href=\"https://github.com/ocaml-wasm/wasm_of_ocaml/issues/101#issuecomment-2464706078\">wasmer doesn't yet support that extension</a>. Since then, wasmer 2.3 has <a href=\"https://wasmer.io/posts/wasmer-2_3\">improved stack switching</a> performance but the extension isn't quite there yet. So if anyone's looking for some experience with language runtime hacking, this might be a good project. I couldn't find any information on whether wasmer is planning to add support for this extension yet, though.</p>\n<h2><a href=\"https://anil.recoil.org/#running-wasm-on-an-fpga\"></a>Running wasm on an FPGA</h2>\n<p>And last but not least, given all of the above, what would it take to run wasm on an FPGA directly? 
The existence of the Linux native wasm port is encouraging, since it implies that if you were to get wasm instructions to run directly on an FPGA (just like you might with a <a href=\"https://discuss.ocaml.org/t/hardcaml-mips-cpu-learning-project-and-blog/8088\">MIPS FPGA CPU</a> or a <a href=\"https://github.com/ujamjar/hardcaml-riscv\">RISC-V one</a>), then you could hook up the rest of the OS ecosystem to this as custom drivers.</p>\n<p>I found a few projects around this space that I need to look into more:</p>\n<ul>\n<li><a href=\"https://github.com/piranna/wasmachine\">wasmachine</a> is an implementation of the WebAssembly specification on an FPGA, following a sequential six-step design (see this <a href=\"https://github.com/WebAssembly/design/issues/1050\">wasm design discussion</a>)</li>\n<li>a <a href=\"https://github.com/denisvasilik/wasm-fpga-engine\">wasm-fpga-engine</a> that executes a subset of instructions</li>\n<li>an <a href=\"https://www.mdpi.com/2079-9292/13/20/3979\">FPGA accelerator for WASM instructions</a>. This one came before the stack switching extension though, which might make the implementation in hardware significantly easier.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#and-more\"></a>And more...</h2>\n<p>Since first posting this, a few updates have come in. <a href=\"https://bsky.app/profile/jonaskruckenberg.de/post/3lmygmvbidc2i\">Jonas Kruckenberg</a> tells me that he's got an experimental OS called <a href=\"https://github.com/JonasKruckenberg/k23\">k23</a>. 
This is a microkernel that is built around the idea of using Wasm as the primary execution environment:</p>\n<blockquote>\n<p>This allows for a number of benefits:</p>\n<ul>\n<li>Security: WebAssembly is designed to run in a sandboxed environment, making it much harder to exploit.</li>\n<li>Modularity: WebAssembly modules can depend on each other, importing and exporting functionality and data, forming a modular system where dependency management is a first class citizen.</li>\n<li>Portability: WebAssembly is designed to be very portable. Forget questions like "is this binary compiled for amd64 or arm?". k23 programs just run wherever.</li>\n<li>Static Analysis: WebAssembly is famous for being very easy to analyze. This means we can check for bad programs without even running them.\n-- <a href=\"https://jonaskruckenberg.github.io/k23/\">The k23 manual</a></li>\n</ul>\n</blockquote>",+"content": "<p>It's about the time of the academic year to come up with project <a href=\"https://anil.recoil.org/ideas\">ideas</a>! <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a>, <a href=\"https://github.com/andrewray\">Andy Ray</a> and I have been looking into <a href=\"https://anil.recoil.org/notes/fpgas-hardcaml\">FPGA/OCaml matters</a> recently so I thought I'd review the latest in the land of <a href=\"https://webassembly.org\">Webassembly</a> for non-traditional hardware targets. It turns out that there are very fun systems projects going on to turn wasm into a "real" target architecture on several fronts: a native port of Linux to run in wasm, a port of wasm to run in kernel space, a POSIX mapping of wasm, and fledgling wasm-CPUs-on-FPGAs.</p>\n<h2><a href=\"https://anil.recoil.org/#native-port-of-linux-to-wasm\"></a>Native port of Linux to wasm</h2>\n<p>The first one is a <a href=\"https://github.com/tombl/linux\"><em>native</em> port</a> of the Linux kernel to run in webassembly (<a href=\"https://linux.tombl.dev\">try it in your browser</a>). 
This isn't an emulation; instead, the various kernel subsystems have been ported to have wasm interfaces, so the C kernel code runs directly as webassembly, with virtual device layers.</p>\n<p>The inspiration for this seems to have come from a famous comment eight years ago on the LKML:</p>\n<blockquote>\n<p>One more general comment: I think this may well be the last new CPU architecture we ever add to the kernel. Both nds32 and c-sky are made by companies that also work on risc-v, and generally speaking risc-v seems to be killing off any of the minor licensable instruction set projects, just like ARM has mostly killed off the custom vendor-specific instruction sets already. If we add another architecture in the future, it may instead be something like the LLVM bitcode or WebAssembly, who knows?\n-- <a href=\"https://lore.kernel.org/all/CAK8P3a2-wyXxctVtJxniUoeShASMhF-6Z1vyvfBnr6wKJuioAQ@mail.gmail.com/\">Arnd Bergmann, LKML, 2018</a></p>\n</blockquote>\n<p>And this port brings us much closer to that! 
I need to spelunk more into the diffs to the mainline kernel to see how it all works, but some quick notes:</p>\n<ul>\n<li>the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts\">tools/wasm</a> directory shows how some of the glue code works, such as the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts\">worker.ts</a> which uses <a href=\"https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers\">WebWorkers</a> to implement multicore, and the venerable <a href=\"https://wiki.libvirt.org/Virtio.html\">virtio</a> to implement <a href=\"https://github.com/tombl/linux/blob/wasm/tools/wasm/src/virtio.ts#L204\">virtual block devices</a>.</li>\n<li>the <a href=\"https://github.com/tombl/linux/tree/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm\">arch/wasm</a> contains the glue code, and <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/irq.c#L17\">mm.c</a> shows how atomic builtins in wasm are sufficient to implement low-level memory management. The <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/fork.c#L12C2-L12C24\">clone</a> implementation leads us to <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/include/asm/wasm_imports.h\">wasm_imports.h</a> which shows all the FFI stubs needed from the runtime in <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/wasm.ts#L21\">worker.ts</a>. 
Notably, it looks like the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/tools/wasm/src/worker.ts#L103\">process switcher</a> doesn't use the <a href=\"https://github.com/WebAssembly/stack-switching\">wasm stack switching</a> extension (possibly for compatibility?).</li>\n<li>the <a href=\"https://github.com/tombl/linux/blob/777d95246a8b1dc184e991a76946ccafef392206/arch/wasm/kernel/syscall.c#L19\">arch/wasm/kernel/syscall.c</a> (and that whole directory) could form the basis for a nice OS teaching course. Implementing the core of an OS on a virtual hypervisor is always <a href=\"https://anil.recoil.org/projects/unikernels\">an educational experience</a>, and this port is based on "real" Linux!</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#running-wasm-in-linux-kernel-mode\"></a>Running wasm in Linux kernel mode</h2>\n<p>On the opposite end of the architecture spectrum, we have a <a href=\"https://github.com/wasmerio/kernel-wasm\">Linux in-kernel WASM runtime</a>. This one allows running userspace code within the kernel space, as motivated by:</p>\n<blockquote>\n<p>Since WASM is a virtual ISA protected by a virtual machine, we don't need to rely on external hardware and software checks to ensure safety. Running WASM in the kernel avoids most of the overhead introduced by those checks, e.g. 
system call (context switching) and <code>copy_{from,to}_user</code>, therefore improving performance.\nAlso, having low-level control means that we can implement a lot of features that were heavy or impossible in userspace, like virtual memory tricks and handling of intensive kernel events (like network packet filtering).\n-- <a href=\"https://github.com/wasmerio/kernel-wasm?tab=readme-ov-file#why-run-webassembly-in-the-kernel\">Why run Wasm in the kernel</a></p>\n</blockquote>\n<p>There are some interesting <a href=\"https://github.com/wasmerio/wasmer/tree/main/examples#examples\">example applications</a> available that they accelerate. They report on the speedup for an echo and http server that can run in kernel space:</p>\n<blockquote>\n<p>When compiled with the singlepass backend (unoptimized direct x86-64 code generation) and benchmarked using tcpkali/wrk, echo-server is ~10% faster (25210 Mbps / 22820 Mbps) than its native equivalent, and http-server is ~6% faster (53293 Rps / 50083 Rps). Even higher performance is expected when the other two Wasmer backends with optimizations (Cranelift/LLVM) are updated to support generating code for the kernel.\n-- <a href=\"https://github.com/wasmerio/kernel-wasm?tab=readme-ov-file#examples-and-benchmark\">kernel wasm benchmarks</a></p>\n</blockquote>\n<h2><a href=\"https://anil.recoil.org/#running-posix-applications-in-the-browser\"></a>Running POSIX applications in the browser</h2>\n<p>The kernel-wasm port led me to look more closely at the wasmer runtime, which in turn also extends the <a href=\"https://wasi.dev\">wasi</a> server-side interface of WASM to support full POSIX compatibility. 
You can also view this in the <a href=\"https://wasmer.sh\">browser as a shell</a>, where a variety of applications can be compiled to wasm and run as if you had a shell in the browser!</p>\n<p>There is impressive support for POSIX here, as well as a <a href=\"https://wasmer.io/posts/introducing-the-wasmer-js-sdk\">wasmer/wasix SDK</a> to port existing applications like ffmpeg to run in the browser or <a href=\"https://wasmer.io/posts/wasmer-js-sdk-now-supports-node-and-bun\">in a server JS runtime</a>.</p>\n<p>So what's stopping OCaml -- via the <a href=\"https://tarides.com/blog/2023-11-01-webassembly-support-for-ocaml-introducing-wasm-of-ocaml/\">new wasm-of-ocaml compiler</a> -- from running in the browser? Just the fact that our target runtime depends on the <a href=\"https://github.com/WebAssembly/stack-switching\">wasm stack switching</a> extension, and <a href=\"https://github.com/ocaml-wasm/wasm_of_ocaml/issues/101#issuecomment-2464706078\">wasmer doesn't yet support that extension</a>. Since then, wasmer 2.3 has <a href=\"https://wasmer.io/posts/wasmer-2_3\">improved stack switching</a> performance but the extension isn't quite there yet. So if anyone's looking for some experience with language runtime hacking, this might be a good project. I couldn't find any information on whether wasmer is planning to add support for this extension yet, though.</p>\n<h2><a href=\"https://anil.recoil.org/#running-wasm-on-an-fpga\"></a>Running wasm on an FPGA</h2>\n<p>And last but not least, given all of the above, what would it take to run wasm on an FPGA directly? 
The existence of the Linux native wasm port is encouraging, since it implies that if you were to get wasm instructions to run directly on an FPGA (just like you might with a <a href=\"https://discuss.ocaml.org/t/hardcaml-mips-cpu-learning-project-and-blog/8088\">MIPS FPGA CPU</a> or a <a href=\"https://github.com/ujamjar/hardcaml-riscv\">RISC-V one</a>), then you could hook up the rest of the OS ecosystem to this as custom drivers.</p>\n<p>I found a few projects around this space that I need to look into more:</p>\n<ul>\n<li><a href=\"https://github.com/piranna/wasmachine\">wasmachine</a> is an implementation of the WebAssembly specification on an FPGA, following a sequential six-step design (see this <a href=\"https://github.com/WebAssembly/design/issues/1050\">wasm design discussion</a>)</li>\n<li>a <a href=\"https://github.com/denisvasilik/wasm-fpga-engine\">wasm-fpga-engine</a> that executes a subset of instructions</li>\n<li>an <a href=\"https://www.mdpi.com/2079-9292/13/20/3979\">FPGA accelerator for WASM instructions</a>. This one came before the stack switching extension though, which might make the implementation in hardware significantly easier.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#and-more\"></a>And more...</h2>\n<p>Since first posting this, a few updates have come in. <a href=\"https://bsky.app/profile/jonaskruckenberg.de/post/3lmygmvbidc2i\">Jonas Kruckenberg</a> tells me that he's got an experimental OS called <a href=\"https://github.com/JonasKruckenberg/k23\">k23</a>. 
This is a microkernel that is built around the idea of using Wasm as the primary execution environment:</p>\n<blockquote>\n<p>This allows for a number of benefits:</p>\n<ul>\n<li>Security: WebAssembly is designed to run in a sandboxed environment, making it much harder to exploit.</li>\n<li>Modularity: WebAssembly modules can depend on each other, importing and exporting functionality and data, forming a modular system where dependency management is a first class citizen.</li>\n<li>Portability: WebAssembly is designed to be very portable. Forget questions like "is this binary compiled for amd64 or arm?". k23 programs just run wherever.</li>\n<li>Static Analysis: WebAssembly is famous for being very easy to analyze. This means we can check for bad programs without even running them.\n-- <a href=\"https://jonaskruckenberg.github.io/k23/\">The k23 manual</a></li>\n</ul>\n</blockquote>",
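The common thread across these projects (the in-kernel runtime, the FPGA cores, k23) is that each has to implement wasm's stack-machine execution model. As a rough illustration of what that involves, here is a toy evaluator for a handful of wasm-style opcodes; it's purely a sketch in Python, the mnemonics simply mirror wasm's text format, and nothing here is drawn from any of the projects linked above:

```python
# Toy evaluator for a few wasm-style stack-machine instructions.
# Illustrative only: a real engine also validates types and handles
# traps, control flow, linear memory, function calls, and so on.

def run(program, locals_):
    """Execute a list of (opcode, operand) pairs; return the stack top."""
    stack = []
    for op, arg in program:
        if op == "i32.const":
            stack.append(arg)
        elif op == "local.get":
            stack.append(locals_[arg])
        elif op == "i32.add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & 0xFFFFFFFF)  # wrap to 32 bits
        elif op == "i32.mul":
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) & 0xFFFFFFFF)
        else:
            raise ValueError(f"unhandled opcode: {op}")
    return stack[-1]

# (x + 3) * 2, i.e. local.get 0; i32.const 3; i32.add; i32.const 2; i32.mul
prog = [("local.get", 0), ("i32.const", 3), ("i32.add", None),
        ("i32.const", 2), ("i32.mul", None)]
print(run(prog, [5]))  # → 16
```

A hardware wasm core is, in essence, this dispatch loop realised as a pipeline, which is part of why a small, regular instruction set makes the FPGA projects above plausible.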
+18
avsm/notes_wired-spotcode.json
···+"summary": "<p>I gave a talk at <a href=\"https://web.archive.org/web/20050204012820/http://www.quernstone.com/notcon04/\">NotCon 2004</a> on SpotCodes, and it got covered in Wired Magazine!\nOf course, I shared the stage with a man telling the time <a href=\"https://www.wired.com/2004/06/from-the-prawn-of-time/\">via a prawn sandwich</a>, so the limelight wasn't all just mine...</p>\n<blockquote>\n<p>Anil Madhavapeddy and his colleagues at High Energy Magic think camera phones should be used for more than taking bad pictures. The company's SpotCode reader software lets camera phones recognize a circular tag and then communicate via Bluetooth with a local server.\n-- <a href=\"https://roxannekhamsi.com\">Roxanne Khamsi</a> for <a href=\"https://www.wired.com/2004/06/from-the-prawn-of-time/\">Wired</a></p>\n</blockquote>",+"content": "<p>I gave a talk at <a href=\"https://web.archive.org/web/20050204012820/http://www.quernstone.com/notcon04/\">NotCon 2004</a> on SpotCodes, and it got covered in Wired Magazine!\nOf course, I shared the stage with a man telling the time <a href=\"https://www.wired.com/2004/06/from-the-prawn-of-time/\">via a prawn sandwich</a>, so the limelight wasn't all just mine...</p>\n<blockquote>\n<p>Anil Madhavapeddy and his colleagues at High Energy Magic think camera phones should be used for more than taking bad pictures. The company's SpotCode reader software lets camera phones recognize a circular tag and then communicate via Bluetooth with a local server.\n-- <a href=\"https://roxannekhamsi.com\">Roxanne Khamsi</a> for <a href=\"https://www.wired.com/2004/06/from-the-prawn-of-time/\">Wired</a></p>\n</blockquote>",
+18
avsm/notes_xenstore-stub-domain.json
···
+18
avsm/notes_yurts-for-digital-nomads.json
···+"summary": "<p>The App Engine data collector for Personal Containers is coming on nicely, and is on track for an alpha preview release <a href=\"http://github.com/avsm/perscon/blob/master/README.md\">fairly soon</a>. Working with AppEngine has been interesting; it\u2019s got excellent availability and you can\u2019t beat the price (free), but coding robust Python that doesn\u2019t trip over the tight resource limits for individual requests, asynchronous tasks and queries is tricky. While it is good for small records such as my <a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhone/\">iPhone</a> or Find My iPhone <a href=\"http://github.com/avsm/perscon/blob/master/appengine/perscon/drivers/fmi.py\">GPS traces</a>, it doesn\u2019t work so well with my gigabytes of photographs or decades of e-mail.</p>\n<p>This confirmed our earlier intuition that there is no one perfect solution for personal data handling; instead, we need to <em>embrace diversity</em> and construct an infrastructure that can cope with change over the coming decades. Mobile programming has changed beyond recognition in just a few years, and cloud providers are specialising in different ways (e.g. <a href=\"http://www.picloud.com/\">PiCloud</a> for simple compute, or <a href=\"http://aws.amazon.com\">EC2</a> for fancy services like elastic <a href=\"http://aws.amazon.com/elasticloadbalancing/\">load balancing</a>).</p>\n<p>So to recognise this, we are building components that all interoperate with your personal data, keep it secure, and ensure it persists for more than a few years. <a href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a> came up with the term "digital <a href=\"http://en.wikipedia.org/wiki/Yurt\">yurts</a>", and it's stuck. 
We\u2019ve written a <a href=\"http://perscon.net/papers/digital-yurts-draft1.pdf\">draft paper</a> about it, and would love to hear your comments and feedback on the approach.</p>\n<p><img alt=\"\" src=\"https://anil.recoil.org/images/nomads-diagram.webp\" title=\"\"></p>\n<p>There are some interesting recent trends that make doing this\nparticularly important:</p>\n<ul>\n<li>The New York Times wrote about the <a href=\"http://www.nytimes.com/2010/05/02/magazine/02self-measurement-t.html\">data-driven\nlife</a>\nincreasingly influencing our decision making. Current sensor data\nsuch as GPS traces are just harbingers of the privacy disaster\nthat would be information such as heart rates or your consumption\nhabits getting into the public domain. <em>(link via <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek\nMurray</a>)</em>.</li>\n<li>Facebook has announced a brand new API platform to get access to\nyour information. The <a href=\"http://eff.org\">EFF</a> has a fantastic timeline\nof <a href=\"http://www.eff.org/deeplinks/2010/04/facebook-timeline\">Facebook\u2019s Eroding\nPrivacy</a>\nover the last five years, to demonstrate how unsafe it is to trust\nyour data to any third-party. We\u2019ve started developing an\ninformation dump plugin for Facebook, but the API just changed\nmid-way and so it has to be started again (volunteers welcome!).</li>\n<li>In the UK, the <a href=\"http://en.wikipedia.org/wiki/Digital_Economy_Act_2010\">Digital Economy\nAct</a> is an\nextremely controversial act that makes anonymity and privacy all the\nmore important. 
We\u2019re assembling an open-source <a href=\"http://www.scribd.com/doc/28393106/Using-Dust-Clouds-to-Enhance-Anonymous-Communication\">dust\ncloud</a>\nthat integrates Tor into personal containers to automatically grant\nyou anonymity as you communicate with your friends.</li>\n</ul>\n<p>If you\u2019re interested, join our <a href=\"http://perscon.net/contact.html\">group</a>\nor contact <a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a> directly. At this stage, you\nneed desire and the ability to hack code, but things are settling down\nover the next few months...</p>",+"content": "<p>The App Engine data collector for Personal Containers is coming on nicely, and is on track for an alpha preview release <a href=\"http://github.com/avsm/perscon/blob/master/README.md\">fairly soon</a>. Working with AppEngine has been interesting; it\u2019s got excellent availability and you can\u2019t beat the price (free), but coding robust Python that doesn\u2019t trip over the tight resource limits for individual requests, asynchronous tasks and queries is tricky. While it is good for small records such as my <a href=\"http://github.com/avsm/perscon/tree/master/plugins/iPhone/\">iPhone</a> or Find My iPhone <a href=\"http://github.com/avsm/perscon/blob/master/appengine/perscon/drivers/fmi.py\">GPS traces</a>, it doesn\u2019t work so well with my gigabytes of photographs or decades of e-mail.</p>\n<p>This confirmed our earlier intuition that there is no one perfect solution for personal data handling; instead, we need to <em>embrace diversity</em> and construct an infrastructure that can cope with change over the coming decades. Mobile programming has changed beyond recognition in just a few years, and cloud providers are specialising in different ways (e.g. 
<a href=\"http://www.picloud.com/\">PiCloud</a> for simple compute, or <a href=\"http://aws.amazon.com\">EC2</a> for fancy services like elastic <a href=\"http://aws.amazon.com/elasticloadbalancing/\">load balancing</a>).</p>\n<p>So to recognise this, we are building components that all interoperate with your personal data, keep it secure, and ensure it persists for more than a few years. <a href=\"https://cs.brown.edu/people/malte/\">Malte Schwarzkopf</a> came up with the term "digital <a href=\"http://en.wikipedia.org/wiki/Yurt\">yurts</a>", and it's stuck. We\u2019ve written a <a href=\"http://perscon.net/papers/digital-yurts-draft1.pdf\">draft paper</a> about it, and would love to hear your comments and feedback on the approach.</p>\n<p><img alt=\"\" src=\"https://anil.recoil.org/images/nomads-diagram.webp\" title=\"\"></p>\n<p>There are some interesting recent trends that make doing this\nparticularly important:</p>\n<ul>\n<li>The New York Times wrote about the <a href=\"http://www.nytimes.com/2010/05/02/magazine/02self-measurement-t.html\">data-driven\nlife</a>\nincreasingly influencing our decision making. Current sensor data\nsuch as GPS traces are just harbringers for the privacy disaster\nthat would be information such as heart rates or your consumption\nhabits getting into the public domain. <em>(link via <a href=\"http://www.cl.cam.ac.uk/~dgm36/\">Derek\nMurray</a>)</em>.</li>\n<li>Facebook has announced a brand new API platform to get access to\nyour information. The <a href=\"http://eff.org\">EFF</a> has a fantastic timeline\nof <a href=\"http://www.eff.org/deeplinks/2010/04/facebook-timeline\">Facebook\u2019s Eroding\nPrivacy</a>\nover the last five years, to demonstrate how unsafe it is to trust\nyour data to any third-party. 
We\u2019ve started developing an\ninformation dump plugin for Facebook, but the API just changed\nmid-way and so it has to be started again (volunteers welcome!).</li>\n<li>In the UK, the <a href=\"http://en.wikipedia.org/wiki/Digital_Economy_Act_2010\">Digital Economy\nAct</a> is an\nextremely controversial act that makes anonymity and privacy all the\nmore important. We\u2019re assembling an open-source <a href=\"http://www.scribd.com/doc/28393106/Using-Dust-Clouds-to-Enhance-Anonymous-Communication\">dust\ncloud</a>\nthat integrates Tor into personal containers to automatically grant\nyou anonymity as you communicate with your friends.</li>\n</ul>\n<p>If you\u2019re interested, join our <a href=\"http://perscon.net/contact.html\">group</a>\nor contact <a href=\"https://anil.recoil.org\">Anil Madhavapeddy</a> directly. At this stage, you\nneed desire and the ability to hack code, but things are settling down\nover the next few months...</p>",
+18
avsm/papers_2025-ai-poison.json
···+"summary": "The publication of ever-larger numbers of problematic papers, including fake ones generated by artificial intelligence, represents an existential crisis for the established way of doing evidence synthesis. But with a new approach, AI might also save the day.",+"content": "The publication of ever-larger numbers of problematic papers, including fake ones generated by artificial intelligence, represents an existential crisis for the established way of doing evidence synthesis. But with a new approach, AI might also save the day.",
+18
avsm/papers_2025-conservation-div.json
···+"summary": "This is a response to a [letter](http://doi.org/10.1016/j.tree.2025.03.003) by Murray K. et al to our paper on \"[:2024-ai-conhorizon]\". See [:ai-should-unite-conservation] for further thoughts.",+"content": "This is a response to a [letter](http://doi.org/10.1016/j.tree.2025.03.003) by Murray K. et al to our paper on \"[:2024-ai-conhorizon]\". See [:ai-should-unite-conservation] for further thoughts.",
+18
avsm/papers_2025-hyperres.json
···+"summary": "Package managers are everywhere, with seemingly every language and operating system implementing their own solution. The lack of interoperability between these systems means that multi-lingual projects are unable to express precise dependencies across language ecosystems, and external system and hardware dependencies are typically implicit and unversioned.\n\nWe define HyperRes, a formal system for describing versioned dependency resolution using a hypergraph that is expressive enough to model many ecosystems and solve dependency constraints across them. We define translations from dozens of existing package managers to HyperRes and comprehensively demonstrate that dependency resolution can work across ecosystems that are currently distinct. This does not require users to shift their choice of package managers; instead, HyperRes allows for the translation of packaging metadata between ecosystems, and for solving to be precisely specialised to a particular deployment environment.",+"content": "Package managers are everywhere, with seemingly every language and operating system implementing their own solution. The lack of interoperability between these systems means that multi-lingual projects are unable to express precise dependencies across language ecosystems, and external system and hardware dependencies are typically implicit and unversioned.\n\nWe define HyperRes, a formal system for describing versioned dependency resolution using a hypergraph that is expressive enough to model many ecosystems and solve dependency constraints across them. We define translations from dozens of existing package managers to HyperRes and comprehensively demonstrate that dependency resolution can work across ecosystems that are currently distinct. 
This does not require users to shift their choice of package managers; instead, HyperRes allows for the translation of packaging metadata between ecosystems, and for solving to be precisely specialised to a particular deployment environment.",
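The cross-ecosystem resolution idea above can be sketched as a toy solver. This is illustrative only: the package names, version sets, and brute-force enumeration are invented for the example and are not the HyperRes formalism or any real package manager's API.

```python
from itertools import product

# Toy cross-ecosystem dependency model (illustrative only; not the
# formal HyperRes system). Each (ecosystem, package) has versions,
# and each version declares dependencies that may point into other
# ecosystems, mapping a target package to its acceptable versions.
REGISTRY = {
    ("opam", "conf-libssl"): {
        "4": {("apt", "libssl-dev"): {"3.0", "3.1"}},
    },
    ("apt", "libssl-dev"): {
        "1.1": {},
        "3.0": {},
        "3.1": {},
    },
}

def solve(root, version):
    """Choose one version per reachable package so that every
    dependency constraint is satisfied (brute-force enumeration)."""
    todo, pkgs = [root], set()
    while todo:  # collect the packages reachable from the root
        p = todo.pop()
        if p not in pkgs:
            pkgs.add(p)
            for deps in REGISTRY[p].values():
                todo.extend(deps)
    pkgs = sorted(pkgs)
    for combo in product(*(REGISTRY[p] for p in pkgs)):
        chosen = dict(zip(pkgs, combo))
        if chosen[root] != version:
            continue
        if all(chosen[dep] in allowed
               for p, v in chosen.items()
               for dep, allowed in REGISTRY[p][v].items()):
            return chosen
    return None
```

Here `solve(("opam", "conf-libssl"), "4")` picks a `libssl-dev` version from the apt side that satisfies the opam-side constraint; a real resolver would of course use a constraint solver rather than enumeration.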
+18
avsm/papers_2025-tessera.json
···+"title": "TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis",+"summary": "Satellite remote sensing (RS) enables a wide array of downstream Earth observation (EO) applications, including climate modeling, carbon accounting, and strategies for conservation and sustainable land use. We present TESSERA, a novel Remote Sensing Foundation Model (RSFM) that uses Self-Supervised Learning (SSL) to generate global, robust representations at 10m scale from pixel-level satellite time series data. TESSERA combines information from only optical and SAR data streams using two parallel Transformer-based encoders: one dedicated to Sentinel-1 SAR polarizations and another to Sentinel-2 MSI data (10 selected spectral bands) to create representations that are then fused using a multilayer perceptron (MLP), resulting in a global representation map covering the years 2017 to 2024.\n\nOur precomputed representations set a new state-of-the-art performance benchmark and our open-source approach democratizes access to high-performance, high-resolution representations. We benchmark the performance of TESSERA in five diverse tasks, comparing our work with state-of-the-art task-specific models and other foundation models. Our results show that TESSERA outperforms both traditional RS baselines and the leading geospatial foundation models in these diverse downstream tasks.\n\nRead more about TESSERA at .",+"content": "Satellite remote sensing (RS) enables a wide array of downstream Earth observation (EO) applications, including climate modeling, carbon accounting, and strategies for conservation and sustainable land use. We present TESSERA, a novel Remote Sensing Foundation Model (RSFM) that uses Self-Supervised Learning (SSL) to generate global, robust representations at 10m scale from pixel-level satellite time series data. 
TESSERA combines information from only optical and SAR data streams using two parallel Transformer-based encoders: one dedicated to Sentinel-1 SAR polarizations and another to Sentinel-2 MSI data (10 selected spectral bands) to create representations that are then fused using a multilayer perceptron (MLP), resulting in a global representation map covering the years 2017 to 2024.\n\nOur precomputed representations set a new state-of-the-art performance benchmark and our open-source approach democratizes access to high-performance, high-resolution representations. We benchmark the performance of TESSERA in five diverse tasks, comparing our work with state-of-the-art task-specific models and other foundation models. Our results show that TESSERA outperforms both traditional RS baselines and the leading geospatial foundation models in these diverse downstream tasks.\n\nRead more about TESSERA at .",
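The two-parallel-encoders-plus-MLP fusion described above can be sketched at the shape level. This is a hypothetical stub: the embedding widths are invented, the weights are random, and the Transformer encoders are replaced by temporal mean-pooling projections purely to show how the two streams are fused per pixel.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 24               # time steps in one pixel's annual series
D_S1, D_S2 = 2, 10   # Sentinel-1 polarisations, Sentinel-2 bands
D_EMB = 64           # per-stream embedding width (assumed)
D_OUT = 128          # fused representation width (assumed)

# random projection weights stand in for trained parameters
W_s1 = rng.normal(size=(D_S1, D_EMB))
W_s2 = rng.normal(size=(D_S2, D_EMB))
W_h = rng.normal(size=(2 * D_EMB, D_OUT))
W_o = rng.normal(size=(D_OUT, D_OUT))

def encode(series, W):
    """Stand-in for a Transformer encoder: pool over time, then project."""
    return series.mean(axis=0) @ W

def fuse(s1_series, s2_series):
    """Concatenate the two per-stream embeddings and pass them
    through a small MLP to get one per-pixel representation."""
    z = np.concatenate([encode(s1_series, W_s1), encode(s2_series, W_s2)])
    h = np.maximum(z @ W_h, 0.0)  # hidden layer with ReLU
    return h @ W_o

rep = fuse(rng.normal(size=(T, D_S1)), rng.normal(size=(T, D_S2)))
```

The point of the sketch is the data flow: each stream is encoded independently from its own time series, and only the fixed-size embeddings meet in the fusion MLP.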
+18
avsm/projects_ce.json
···+"summary": "The [Conservation Evidence](https://conservationevidence.com) team at the University\nof Cambridge has spent years screening 1.6m+ scientific papers on conservation, as\nwell as manually summarising 8600+ studies relating to conservation actions.\nHowever, progress is limited by the specialised skills needed to screen and\nsummarise relevant studies -- it took more than 75 person years to manually\ncurate the current database and only a few 100 papers can be added each year!\nWe are working on AI-driven techniques to accelerate addition of robust\nevidence to the CE database via automated literature scanning, [LLM-based\ncopilots](:2024-ce-llm) and scanning of grey literature. We aim to provide co-pilots that\naugment human decision making to figure out how to categorise interventions\nmuch more quickly and accurately, and ultimately accelerate the positive impact\nof conservation actions.\n\n\nThe goal of the [Conservation Evidence](https://conservationevidence.com)\nproject is to transform conservation so that evidence is routinely embedded in\ndecisions to improve outcomes for biodiversity and society. CE is becoming the\nauthoritative, most comprehensive, freely available platform for evidence-led\nconservation and is starting to [profoundly change the\nway](https://www.conservation.cam.ac.uk/events/online-workshop-delivering-effectiveness-revolution-conservation-lessons-organisations-policy)\nin which conservationists access and use evidence for improving the state of\nthe planet.\n\n\n\nThe CE collation and synthesis work has significantly improved the availability of evidence for use in conservation practice and remains the only resource of evidence synopses for biodiversity conservation and the largest database of effectiveness reviews of actions outside the field of medicine. 
The approach of carrying out reviews on an industrial scale means that they can do so for a fraction (~2%) of the costs in comparable fields, such as medicine. Using subject-wide evidence synthesis, CE systematically searches the literature and summarises results from (and provides citations for) each study testing the effectiveness of an action. As of April 2024, CE has read 1.6 million paper titles in 17 languages (326 non-English journals) and reviewed evidence for >3600 conservation actions, freely available on their [website](http://www.conservationevidence.com), with collaboration from over 380 international academics and practitioners.\n\n## Accelerating literature surveys with LLMs\n\nWe got involved from computer science in 2023 as part of the [AI@CAM competition](https://ai.cam.ac.uk/blog/harnessing-the-power-of-ai-to-help-save-our-planet) to harness the momentum behind machine learning to accelerate conservation actions. Our overall aim is to help CE to dramatically accelerate their data searching and data extraction pipelines. Currently, the searching of literature and summarising of key data is undertaken by human experts. Although this method of working is time consuming, it does benefit from being thorough and replicable. The main difficulties come in the subtleties of deciphering study designs, methodologies and whether controls are actually appropriate for testing the effectiveness of the specified action. Any LLM-based automation that we deploy must account for these as part of the validation pipeline.\n\nOur [evaluation of LLM performance](:2024-ce-llm) against human experts on conservation intervention questions showed that properly configured LLMs with retrieval augmentation can achieve competitive performance on evidence synthesis tasks. 
However, out-of-the-box general LLMs performed poorly and risk misinforming decision-makers, reinforcing our commitment to careful validation and human-in-the-loop approaches.\n\n\n\nThe collaboration originally began in 2022 as part of the Computer Science [1B group projects](https://www.cst.cam.ac.uk/teaching/part-ib/group-projects), when [@wsutherland], [@sreynolds] and [@achristie] from Zoology proposed a group project related to CE. A team of undergraduate students (including [@jcao]) trained an ML model to facilitate searching for papers and indexing relevant articles by species and habitat. After the group project completed with encouraging results, [@sadiqj] and I joined the collaboration and -- with help from the Cambridge [Office for Scholarly Communication](https://osc.cam.ac.uk) -- built up a comprehensive (and legal!) corpus of millions of academic papers related to conservation evidence.\n\nThrough 2024, we evaluated ten different LLMs against human experts using the CE database, leading to our [LLM evaluation paper](:2024-ce-llm) showing promising but cautious results. We were joined in the summer of 2024 by three CST undergraduates: [@riyer], [@sbiswas] and [@kmichalik] who built out various elements of the system. Moving into 2025, our focus shifts to production deployment of hybrid retrieval systems while maintaining rigorous validation against the expanding challenges of [AI contamination in scientific literature](:ai-contamination-of-papers).\n\n## Challenges in the AI-Evidence Era\n\nOur work on LLM-based evidence synthesis has gained urgency in 2025 as the body of scientific literature faces challenges from [AI-contaminated papers](:ai-contamination-of-papers). Recent analysis suggests that up to 2% of submitted papers may be AI-generated, with some estimates potentially much higher. 
This contamination poses particular risks for evidence databases like CE, as fake papers with plausible-sounding conservation interventions could mislead decision-makers if incorporated without rigorous validation.\n\nAs AI generation becomes more sophisticated, our validation pipeline therefore needs to evolve beyond \"just\" simple detection to robust experimental design verification and cross-reference validation.\n\n## Technology for Conservation, Not Division\n\nWe also conducted a [horizon scan](:2024-ai-conhorizon) and as outlined in our [response to concerns about AI dividing conservation](:ai-should-unite-conservation), we are committed to developing CE tools that:\n\n- Respect and amplify human expertise rather than replacing it via \"human-in-the-loop\" methods\n- Follow participatory design principles with conservation practitioners\n- Maintain open source and open data approaches with thorough documentation to facilitate reproducible outputs\n- Address capacity building needs, particularly in the Global South with respect to AI capability\n- Keep conservation goals and not short term technology trends at the center of our research.\n\nOur [AI@CAM interview](:aicam-interview-ce) highlights this approach: we are\nbuilding detailed models of the world that can be queried by policy makers to\nhelp make informed decisions. 
The technology serves the evidence, and the\nevidence serves conservation practices.",+"content": "<div><h1>Conservation Evidence Copilots</h1><p></p><p>The <a href=\"https://conservationevidence.com\">Conservation Evidence</a> team at the University\nof Cambridge has spent years screening 1.6m+ scientific papers on conservation, as\nwell as manually summarising 8600+ studies relating to conservation actions.\nHowever, progress is limited by the specialised skills needed to screen and\nsummarise relevant studies -- it took more than 75 person years to manually\ncurate the current database and only a few 100 papers can be added each year!\nWe are working on AI-driven techniques to accelerate addition of robust\nevidence to the CE database via automated literature scanning, <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">LLM-based copilots</a> and scanning of grey literature. We aim to provide co-pilots that\naugment human decision making to figure out how to categorise interventions\nmuch more quickly and accurately, and ultimately accelerate the positive impact\nof conservation actions.</p>\n<p>The goal of the <a href=\"https://conservationevidence.com\">Conservation Evidence</a>\nproject is to transform conservation so that evidence is routinely embedded in\ndecisions to improve outcomes for biodiversity and society. 
CE is becoming the\nauthoritative, most comprehensive, freely available platform for evidence-led\nconservation and is starting to <a href=\"https://www.conservation.cam.ac.uk/events/online-workshop-delivering-effectiveness-revolution-conservation-lessons-organisations-policy\">profoundly change the\nway</a>\nin which conservationists access and use evidence for improving the state of\nthe planet.</p>\n<p>\n<img alt=\"Team AICN in the CCI building, Feb 2024\" src=\"https://anil.recoil.org/images/aicn-team-feb24.webp\" title=\"Team AICN in the CCI building, Feb 2024\">\nTeam AICN in the CCI building, Feb 2024</p>\n<p>The CE collation and synthesis work has significantly improved the availability of evidence for use in conservation practice and remains the only resource of evidence synopses for biodiversity conservation and the largest database of effectiveness reviews of actions outside the field of medicine. The approach of carrying out reviews on an industrial scale means that they can do so for a fraction (~2%) of the costs in comparable fields, such as medicine. Using subject-wide evidence synthesis, CE systematically searches the literature and summarises results from (and provides citations for) each study testing the effectiveness of an action. As of April 2024, CE has read 1.6 million paper titles in 17 languages (326 non-English journals) and reviewed evidence for >3600 conservation actions, freely available on their <a href=\"http://www.conservationevidence.com\">website</a>, with collaboration from over 380 international academics and practitioners.</p>\n<h2><a href=\"https://anil.recoil.org/#accelerating-literature-surveys-with-llms\"></a>Accelerating literature surveys with LLMs</h2>\n<p>We got involved from computer science in 2023 as part of the <a href=\"https://ai.cam.ac.uk/blog/harnessing-the-power-of-ai-to-help-save-our-planet\">AI@CAM competition</a> to harness the momentum behind machine learning to accelerate conservation actions. 
Our overall aim is to help CE to dramatically accelerate their data searching and data extraction pipelines. Currently, the searching of literature and summarising of key data is undertaken by human experts. Although this method of working is time consuming, it does benefit from being thorough and replicable. The main difficulties come in the subtleties of deciphering study designs, methodologies and whether controls are actually appropriate for testing the effectiveness of the specified action. Any LLM-based automation that we deploy must account for these as part of the validation pipeline.</p>\n<p>Our <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">evaluation of LLM performance</a> against human experts on conservation intervention questions showed that properly configured LLMs with retrieval augmentation can achieve competitive performance on evidence synthesis tasks. However, out-of-the-box general LLMs performed poorly and risk misinforming decision-makers, reinforcing our commitment to careful validation and human-in-the-loop approaches.</p>\n<p></p><div></div><p></p>\n<p>The collaboration originally began in 2022 as part of the Computer Science <a href=\"https://www.cst.cam.ac.uk/teaching/part-ib/group-projects\">1B group projects</a>, when <a href=\"https://www.zoo.cam.ac.uk/directory/bill-sutherland\">Bill Sutherland</a>, <a href=\"https://samreynolds.org/\">Sam Reynolds</a> and <a href=\"https://profiles.imperial.ac.uk/a.christie\">Alec Christie</a> from Zoology proposed a group project related to CE. A team of undergraduate students (including <a href=\"https://github.com/CaoJamie\">Jamie Cao</a>) trained an ML model to facilitate searching for papers and indexing relevant articles by species and habitat. 
After the group project completed with encouraging results, <a href=\"https://toao.com\">Sadiq Jaffer</a> and I joined the collaboration and -- with help from the Cambridge <a href=\"https://osc.cam.ac.uk\">Office for Scholarly Communication</a> -- built up a comprehensive (and legal!) corpus of millions of academic papers related to conservation evidence.</p>\n<p>Through 2024, we evaluated ten different LLMs against human experts using the CE database, leading to our <a href=\"https://anil.recoil.org/papers/2024-ce-llm\">LLM evaluation paper</a> showing promising but cautious results. We were joined in the summer of 2024 by three CST undergraduates: <a href=\"mailto:ri301@cam.ac.uk\">Radhika Iyer</a>, <a href=\"mailto:sb2704@cam.ac.uk\">Shrey Biswas</a> and <a href=\"https://github.com/Kacper-M-Michalik\">Kacper Michalik</a> who built out various elements of the system. Moving into 2025, our focus shifts to production deployment of hybrid retrieval systems while maintaining rigorous validation against the expanding challenges of <a href=\"https://anil.recoil.org/notes/ai-contamination-of-papers\">AI contamination in scientific literature</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#challenges-in-the-ai-evidence-era\"></a>Challenges in the AI-Evidence Era</h2>\n<p>Our work on LLM-based evidence synthesis has gained urgency in 2025 as the body of scientific literature faces challenges from <a href=\"https://anil.recoil.org/notes/ai-contamination-of-papers\">AI-contaminated papers</a>. Recent analysis suggests that up to 2% of submitted papers may be AI-generated, with some estimates potentially much higher. 
This contamination poses particular risks for evidence databases like CE, as fake papers with plausible-sounding conservation interventions could mislead decision-makers if incorporated without rigorous validation.</p>\n<p>As AI generation becomes more sophisticated, our validation pipeline therefore needs to evolve beyond "just" simple detection to robust experimental design verification and cross-reference validation.</p>\n<h2><a href=\"https://anil.recoil.org/#technology-for-conservation-not-division\"></a>Technology for Conservation, Not Division</h2>\n<p>We also conducted a <a href=\"https://anil.recoil.org/papers/2024-ai-conhorizon\">horizon scan</a> and as outlined in our <a href=\"https://anil.recoil.org/notes/ai-should-unite-conservation\">response to concerns about AI dividing conservation</a>, we are committed to developing CE tools that:</p>\n<ul>\n<li>Respect and amplify human expertise rather than replacing it via "human-in-the-loop" methods</li>\n<li>Follow participatory design principles with conservation practitioners</li>\n<li>Maintain open source and open data approaches with thorough documentation to facilitate reproducible outputs</li>\n<li>Address capacity building needs, particularly in the Global South with respect to AI capability</li>\n<li>Keep conservation goals and not short term technology trends at the center of our research.</li>\n</ul>\n<p>Our <a href=\"https://anil.recoil.org/notes/aicam-interview-ce\">AI@CAM interview</a> highlights this approach: we are\nbuilding detailed models of the world that can be queried by policy makers to\nhelp make informed decisions. The technology serves the evidence, and the\nevidence serves conservation practices.</p>\n<p></p></div>",
+18
avsm/projects_difc-tee.json
···+"summary": "There is now increased hardware support for improving the security and\nperformance of privilege separation and compartmentalization techniques such as\nprocess-based sandboxes, trusted execution environments, and intra-address\nspace compartments. We dub these \"hetero-compartment environments\" and observe\nthat existing system stacks still assume single-compartment models (i.e. user\nspace processes), leading to limitations in using, integrating, and monitoring\nheterogeneous compartments from a security and performance perspective. This\nproject explores how we might deploy techniques such as fine-grained\ninformation flow control (DIFC) to allow developers to securely use and combine\ncompartments, define security policies over shared system resources, and audit\npolicy violations and perform digital forensics across hetero-compartments. \n\n\nThe primary focus of this work was conducted by [@ztarkhani] in her PhD work\non new hypervisor/OS/userspace interfaces for compartmentalization that could\ntake advantage of TEE hardware (see [:dispersed-compartments]).\n\nSince that work has been completed, I've also been exploring the use of DIFC\nlabels as part of [:plancomp], in order to encrypt and control access to\ndatasets across organisation boundaries. This work is still in the exploratory\nstages with [@pf341] and [@mdales].",+"content": "<div><h1>Information Flow for Trusted Execution</h1><p></p><p>There is now increased hardware support for improving the security and\nperformance of privilege separation and compartmentalization techniques such as\nprocess-based sandboxes, trusted execution environments, and intra-address\nspace compartments. We dub these "hetero-compartment environments" and observe\nthat existing system stacks still assume single-compartment models (i.e. user\nspace processes), leading to limitations in using, integrating, and monitoring\nheterogeneous compartments from a security and performance perspective. 
This\nproject explores how we might deploy techniques such as fine-grained\ninformation flow control (DIFC) to allow developers to securely use and combine\ncompartments, define security policies over shared system resources, and audit\npolicy violations and perform digital forensics across hetero-compartments.</p>\n<p>The primary focus of this work was conducted by <a href=\"https://zatkh.github.io/\">Zahra Tarkhani</a> in her PhD work\non new hypervisor/OS/userspace interfaces for compartmentalization that could\ntake advantage of TEE hardware (see <a href=\"https://anil.recoil.org/ideas/dispersed-compartments\">Secure Programming with Dispersed Compartments</a>).</p>\n<p>Since that work has been completed, I've also been exploring the use of DIFC\nlabels as part of <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a>, in order to encrypt and control access to\ndatasets across organisation boundaries. This work is still in the exploratory\nstages with <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> and <a href=\"https://mynameismwd.org\">Michael Dales</a>.</p>\n<p></p></div>",
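The label-based flow control this project explores can be illustrated with a textbook DIFC check. This is a minimal sketch, not the hypervisor/OS interfaces from the thesis, and the tag names are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Label:
    """Secrecy tags taint data; integrity tags assert trustworthiness."""
    secrecy: frozenset = frozenset()
    integrity: frozenset = frozenset()

def can_flow(src: Label, dst: Label) -> bool:
    """Information may flow src -> dst only if the destination carries
    at least the source's secrecy tags and claims no more integrity
    than the source can vouch for."""
    return src.secrecy <= dst.secrecy and dst.integrity <= src.integrity

# a TEE compartment handling labelled medical data...
enclave = Label(secrecy=frozenset({"medical"}), integrity=frozenset({"attested"}))
# ...an unlabelled user-space process, and an audit log cleared for the data
process = Label()
audit_log = Label(secrecy=frozenset({"medical", "audit"}))
```

With these labels, `can_flow(enclave, process)` is `False` (the secrecy taint cannot be dropped into an unlabelled process), while `can_flow(enclave, audit_log)` is `True`; enforcing exactly this kind of check across heterogeneous compartments is what the project investigates.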
+18
avsm/projects_life.json
···+"summary": "Human-driven habitat loss is recognised as the greatest cause of biodiversity loss, but we lack robust, spatially explicit metrics quantifying the impacts of anthropogenic changes in habitat extent on species' extinctions. LIFE is our new metric that uses a persistence score approach that combines ecologies and land-cover data whilst considering the cumulative non-linear impact of past habitat loss on species' probability of extinction. We apply large-scale computing to map ~30k species of terrestrial vertebrates and provide quantitative estimates of the marginal changes in the expected number of extinctions caused by converting remaining natural vegetation to agriculture, and also by restoring farmland to natural habitat. We are also investigating many of the conservation opportunities opened up via its estimates of the impact on extinctions of diverse actions that change land cover, from individual dietary choices through to global protected area development.\n\n\n## LIFE v1\n\nOur efforts through 2023-24 were focussed on building the first version of the LIFE metric and addressing peer review comments, with the research expertly led by [@aeyres]. The [:2024-life] paper appeared in publication at the Royal Society in early 2025, and was even covered by [Mongabay](https://news.mongabay.com/2025/01/life-scores-map-out-where-habitat-loss-for-crops-drives-extinction/).\n\nThe computational challenges of generating global maps for ~30k species at 1 arc-minute resolution required significant high-performance computing resources and careful attention to [dataset versioning and reproducibility](:2024-uncertainty-cs). 
As [@pf341] notes, large ecological datasets are inherently difficult to version and reproduce, making our approach of coupling persistence scores with HPC infrastructure particularly important for ensuring scientific reproducibility.\n\n\n\nOur research efforts in 2025 are focussed on improving the resolution of the persistence maps, increasing the coverage of species, and performing more analyses to identify newer conservation opportunities. This work is part of broader [planetary computing efforts](:2024-planetary-computing) to make global-scale biodiversity data accessible for decision-making.\n\n## Computational Challenges and Infrastructure\n\nThe LIFE metric represents a huge computational ecology challenge, requiring processing of species occurrence data, habitat maps, and persistence calculations across multiple spatial and temporal scales. Our work highlights the [broader challenges in computational conservation](:2024-planetary-computing), where the scale of ecological data now requires sophisticated computational infrastructure.\n\nKey technical challenges we've addressed include managing and versioning terabyte-scale biodiversity datasets across time, scaling persistence score calculations across 30,000+ species, ensuring reproducible computational workflows for ecological modeling, and balancing computational efficiency with ecological model accuracy. This computational ecology approach is increasingly vital as conservation decisions require [rapid, evidence-based analysis](:2024-uncertainty-cs) of large-scale environmental data, and I am hosting the 2nd outing of [Programming for the Planet](:propl-at-splash) in October 2025 to continue this conversation.\n\n## Mapping other spatial threats\n\nUnfortunately, individual species are affected by anthropogenic threats beyond simply habitat loss from land-use change, including hunting, agricultural practices and the introduction of invasive species. 
[@elr] is conducting his PhD research on this topic, in particular focussing on per-species abundances and threats. [@cemogor] -- who completed his PhD in 2024 on threats to pangolins via hunting -- is also joining the Computer Lab as a Schmidt Sciences fellow and applying machine learning to predicting sources of hunting pressures on wild species.\n\n## Tying anthropogenic activity such as food consumption to biodiversity.\n\nAgriculturally-driven habitat degradation and destruction is the biggest threat to global biodiversity, and so an exciting line of work that [@tball] has been leading is to tie the LIFE metric with food consumption and production data and provenance modelling in order to figure out the impact of what we eat on species extinctions. The [FOOD metric papers](:2024-food-life) show that despite marked differences in per-capita impacts across countries, there are consistent patterns that could be leveraged for mitigating harm to biodiversity. \n\nThis work connects to broader questions about [sustainable food systems](:2024-food-life) and how computational tools can help consumers and policymakers understand the biodiversity consequences of dietary and agricultural choices. We're continuing to work on refining this data and analysis, particularly via higher resolution supply chain datasets and crop yield data.",+"content": "<div><h1>Mapping LIFE on Earth</h1><p></p><p>Human-driven habitat loss is recognised as the greatest cause of biodiversity loss, but we lack robust, spatially explicit metrics quantifying the impacts of anthropogenic changes in habitat extent on species' extinctions. LIFE is our new metric that uses a persistence score approach that combines ecologies and land-cover data whilst considering the cumulative non-linear impact of past habitat loss on species' probability of extinction. 
We apply large-scale computing to map ~30k species of terrestrial vertebrates and provide quantitative estimates of the marginal changes in the expected number of extinctions caused by converting remaining natural vegetation to agriculture, and also by restoring farmland to natural habitat. We are also investigating many of the conservation opportunities opened up via its estimates of the impact on extinctions of diverse actions that change land cover, from individual dietary choices through to global protected area development.</p>\n<h2><a href=\"https://anil.recoil.org/#life-v1\"></a>LIFE v1</h2>\n<p>Our efforts through 2023-24 were focussed on building the first version of the LIFE metric and addressing peer review comments, with the research expertly led by <a href=\"https://www.conservation.cam.ac.uk/staff/dr-alison-eyres\">Alison Eyres</a>. The <a href=\"https://anil.recoil.org/papers/2024-life\">LIFE: A metric for mapping the impact of land-cover change on global extinctions</a> paper was published by the Royal Society in early 2025, and was even covered by <a href=\"https://news.mongabay.com/2025/01/life-scores-map-out-where-habitat-loss-for-crops-drives-extinction/\">Mongabay</a>.</p>\n<p>The computational challenges of generating global maps for ~30k species at 1 arc-minute resolution required significant high-performance computing resources and careful attention to <a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs\">dataset versioning and reproducibility</a>. 
As <a href=\"https://patrick.sirref.org\">Patrick Ferris</a> notes, large ecological datasets are inherently difficult to version and reproduce, making our approach of coupling persistence scores with HPC infrastructure particularly important for ensuring scientific reproducibility.</p>\n<p>\n<img alt=\"A false colour version of the LIFE dataset over central and south America.\" src=\"https://anil.recoil.org/images/life-false-color-sa.webp\" title=\"A false colour version of the LIFE dataset over central and south America.\">\nA false colour version of the LIFE dataset over central and south America.</p>\n<p>Our research efforts in 2025 are focussed on improving the resolution of the persistence maps, increasing the coverage of species, and performing more analyses to identify newer conservation opportunities. This work is part of broader <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">planetary computing efforts</a> to make global-scale biodiversity data accessible for decision-making.</p>\n<h2><a href=\"https://anil.recoil.org/#computational-challenges-and-infrastructure\"></a>Computational Challenges and Infrastructure</h2>\n<p>The LIFE metric represents a huge computational ecology challenge, requiring processing of species occurrence data, habitat maps, and persistence calculations across multiple spatial and temporal scales. Our work highlights the <a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">broader challenges in computational conservation</a>, where the scale of ecological data now requires sophisticated computational infrastructure.</p>\n<p>Key technical challenges we've addressed include managing and versioning terabyte-scale biodiversity datasets across time, scaling persistence score calculations across 30,000+ species, ensuring reproducible computational workflows for ecological modeling, and balancing computational efficiency with ecological model accuracy. 
This computational ecology approach is increasingly vital as conservation decisions require <a href=\"https://anil.recoil.org/papers/2024-uncertainty-cs\">rapid, evidence-based analysis</a> of large-scale environmental data, and I am hosting the 2nd outing of <a href=\"https://anil.recoil.org/notes/propl-at-splash\">Programming for the Planet</a> in October 2025 to continue this conversation.</p>\n<h2><a href=\"https://anil.recoil.org/#mapping-other-spatial-threats\"></a>Mapping other spatial threats</h2>\n<p>Unfortunately, individual species are affected by anthropogenic threats beyond simply habitat loss from land-use change, including hunting, agricultural practices and the introduction of invasive species. <a href=\"https://emiliolr.github.io\">Emilio Luz-Ricca</a> is conducting his PhD research on this topic, in particular focussing on per-species abundances and threats. <a href=\"https://charlesemogor.com\">Charles Emogor</a> -- who completed his PhD in 2024 on threats to pangolins via hunting -- is also joining the Computer Lab as a Schmidt Sciences fellow and applying machine learning to predicting sources of hunting pressures on wild species.</p>\n<h2><a href=\"https://anil.recoil.org/#tying-anthropogenic-activity-such-as-food-consumption-to-biodiversity\"></a>Tying anthropogenic activity such as food consumption to biodiversity.</h2>\n<p>Agriculturally-driven habitat degradation and destruction is the biggest threat to global biodiversity, and so an exciting line of work that <a href=\"https://www.zoo.cam.ac.uk/directory/dr-tom-ball\">Thomas Ball</a> has been leading is to tie the LIFE metric with food consumption and production data and provenance modelling in order to figure out the impact of what we eat on species extinctions. 
The <a href=\"https://anil.recoil.org/papers/2024-food-life\">FOOD metric papers</a> show that despite marked differences in per-capita impacts across countries, there are consistent patterns that could be leveraged for mitigating harm to biodiversity.</p>\n<p>This work connects to broader questions about <a href=\"https://anil.recoil.org/papers/2024-food-life\">sustainable food systems</a> and how computational tools can help consumers and policymakers understand the biodiversity consequences of dietary and agricultural choices. We're continuing to work on refining this data and analysis, particularly via higher resolution supply chain datasets and crop yield data.</p>\n<p></p></div>",
+18
avsm/projects_melange.json
+18
avsm/projects_melange.json
···+"summary": "My PhD dissertation work proposed an architecture for constructing new implementations of standard Internet protocols with integrated formal methods such as model checking and functional programming that were then not used in deployed servers. A more informal summary is \"rewrite all the things in OCaml from C!\", which led to a merry adventure into implementing many network protocols from scratch in a functional style, and learning lots about how to enforce specifications without using a full-blown proof assistant.\n\n\nIn the late 90s while working at MVACS on the Mars Polar Lander, I found myself\nusing the secure OpenBSD operating system to deploy the self-hosted service\nthat @nickludlam and I have run together ever since. I became an OpenBSD\ndeveloper with commit rights and went to several hackathons, a sample of which you can read in [:c2k5-thoughts]. Back then, my primary open source experience was working on C code in the OpenBSD base system and in PHP code while hacking on the popular [Horde/IMP](https://horde.org) groupware system for my own email.\n\nI rapidly tired of hacking in C code and looked for safer alternatives. While\nprocrastinating over PhD coffee with [@djs55] he suggested I look into writing\na system daemon in [OCaml](https://ocaml.org). Why not have a go at an SSH server written entirely\nin a type-safe functional language? Being a PhD student sorely in need of a\nchallenge, I took up the project.\n\nThere were a couple of different challenges involved:\n\n- There was no good way of expressing packet parsing policies for the complex\n dynamics of real Internet protocols. I developed a domain-specific language\n for this in OCaml known as [MPL](https://github.com/avsm/melange) (the \"meta packet language\") and used it to\n successfully parse DNS, BGP, Ethernet, IP, SSH and a host of other binary\n protocols. 
The work won the best student paper award at EuroSys 2007 in\n [:2007-eurosys-melange], and helped to lay the foundation for a growing belief\n in industrial circles that C was not the only way to do low-level parsing.\n- Once parsing was fixed, I also had to express complex state machines using\n OCaml. Using a functional language was not a silver bullet to solve this problem\n since the state machines still had to be verified against a spec. I had a first\n go at this in [:sam03-secpol] using system call tracing, but decided that was\n a dead end due to the poor granularity. I then designed another domain-specific\n language called SPL in [:2005-hotdep-spl] and [:2005-spin-splat] and a detailed\n writeup in [:2009-icfem-spl]. This turned out to be a pretty pragmatic solution\n by using model checking and even included an early visual debugger for protocol\n state machines. The work holds up surprisingly well in 2021: while theorem provers\n and refinement types based languages like Fstar produce amazing results, they\n still require a lot more effort than my simpler model-checking-based solution.\n\nAll this work resulted in the [Melange](https://github.com/avsm/melange) framework\nthat I put together in OCaml and evaluated, and published in my [:anil-phd-thesis] PhD thesis with the following abstract:\n\n> A typical Internet server finds itself in the middle of a virtual battleground,\n> under constant threat from worms, viruses and other malware seeking to subvert\n> the original intentions of the programmer. In particular, critical Internet\n> servers such as OpenSSH, BIND and Sendmail have had numerous security issues\n> ranging from low-level buffer overflows to subtle protocol logic errors. These\n> problems have cost billions of dollars as the growth of the Internet exposes\n> increasing numbers of computers to electronic malware. 
Despite the decades of\n> research on techniques such as model-checking, type-safety and other forms of\n> formal analysis, the vast majority of server implementations continue to be\n> written unsafely and informally in C/C++.\n>\n> In this dissertation we propose an architecture for constructing new\n> implementations of standard Internet protocols which integrates mature\n> formal methods not currently used in deployed servers: (i) static type\n> systems from the ML family of functional languages; (ii) model checking to\n> verify safety properties exhaustively about aspects of the servers; and (iii)\n> generative meta-programming to express high-level constraints for the\n> domain-specific tasks of packet parsing and constructing non-deterministic\n> state machines. Our architecture \u2014 dubbed MELANGE \u2014 is based on Objective Caml\n> and contributes two domain-specific languages: (i) the Meta Packet Language\n> (MPL), a data description language used to describe the wire format of a\n> protocol and output statically type-safe code to handle network traffic using\n> high-level functional data structures; and (ii) the Statecall Policy Language\n> (SPL) for constructing non-deterministic finite state automata which are\n> embedded into applications and dynamically enforced, or translated into\n> PROMELA and statically model-checked. Our research emphasises the importance\n> of delivering efficient, portable code which is feasible to deploy across the\n> Internet. We implemented two complex protocols \u2014 SSH and DNS \u2014 to verify our\n> claims, and our evaluation shows that they perform faster than their standard\n> counterparts OpenSSH and BIND, in addition to providing static guarantees\n> against some classes of errors that are currently a major source of security\n> problems.\n\nI didn't do much on this immediately after submitting my thesis since I was busy\nworking on [:xen] from 2006-2009 or so. 
However, the first thing I did when\nI quit Citrix was to start the [MirageOS](https://mirageos.org) project (the successor to Melange) with\n[@samoht] and [@djs55] in order to develop better personal data infrastructure with\n[:perscon]. This formed the foundation for my subsequent research into library\noperating systems and the concept of [:unikernels].\nRead more about the subsequent work\nthere, or sample [:2010-hotcloud-lamp] to get a taster of the direction\nMelange evolved in.\n\nReflecting on my PhD research in 2021, I think that it\nwas a pretty good piece of systems research. It didn't make any deep contributions\nto formal verification or programming language research, but it did posit a clear\nsystems thesis and implement and evaluate it without a huge team being involved.\nThat's more difficult to do these days in the era of large industrial research\nteams dominating the major conferences, but certainly not impossible.\n\nChoosing a good topic for systems research is crucial, since the context you do\nthe research in is as important as the results you come up with. Much of my subsequent\ncareer has been influenced by the \"crazy challenge\" that [@djs55] set me back in 2003\nto do systems programming in a functional language, with all the intellectual and\nengineering challenges that came along with that extreme (back in 2003) position.",+"content": "<div><h1>Functional Internet Services</h1><p></p><p>My PhD dissertation work proposed an architecture for constructing new implementations of standard Internet protocols with integrated formal methods such as model checking and functional programming that were then not used in deployed servers. 
A more informal summary is "rewrite all the things in OCaml from C!", which led to a merry adventure into implementing many network protocols from scratch in a functional style, and learning lots about how to enforce specifications without using a full-blown proof assistant.</p>\n<p>In the late 90s while working at MVACS on the Mars Polar Lander, I found myself\nusing the secure OpenBSD operating system to deploy the self-hosted service\nthat @nickludlam and I have run together ever since. I became an OpenBSD\ndeveloper with commit rights and went to several hackathons, a sample of which you can read in <a href=\"https://anil.recoil.org/notes/c2k5-thoughts\">OpenBSD C2K5 thoughts</a>. Back then, my primary open source experience was working on C code in the OpenBSD base system and in PHP code while hacking on the popular <a href=\"https://horde.org\">Horde/IMP</a> groupware system for my own email.</p>\n<p>I rapidly tired of hacking in C code and looked for safer alternatives. While\nprocrastinating over PhD coffee with <a href=\"https://github.com/djs55\">Dave Scott</a> he suggested I look into writing\na system daemon in <a href=\"https://ocaml.org\">OCaml</a>. Why not have a go at an SSH server written entirely\nin a type-safe functional language? Being a PhD student sorely in need of a\nchallenge, I took up the project.</p>\n<p>There were a couple of different challenges involved:</p>\n<ul>\n<li>There was no good way of expressing packet parsing policies for the complex\ndynamics of real Internet protocols. I developed a domain-specific language\nfor this in OCaml known as <a href=\"https://github.com/avsm/melange\">MPL</a> (the "meta packet language") and used it to\nsuccessfully parse DNS, BGP, Ethernet, IP, SSH and a host of other binary\nprotocols. 
The work won the best student paper award at EuroSys 2007 in\n<a href=\"https://anil.recoil.org/papers/2007-eurosys-melange\">Melange: creating a "functional" internet</a>, and helped to lay the foundation for a growing belief\nin industrial circles that C was not the only way to do low-level parsing.</li>\n<li>Once parsing was fixed, I also had to express complex state machines using\nOCaml. Using a functional language was not a silver bullet to solve this problem\nsince the state machines still had to be verified against a spec. I had a first\ngo at this in <a href=\"https://anil.recoil.org/papers/sam03-secpol\">The Case for Abstracting Security Policies</a> using system call tracing, but decided that was\na dead end due to the poor granularity. I then designed another domain-specific\nlanguage called SPL in <a href=\"https://anil.recoil.org/papers/2005-hotdep-spl\">On the challenge of delivering high-performance, dependable, model-checked internet servers</a> and <a href=\"https://anil.recoil.org/papers/2005-spin-splat\">SPLAT: A Tool for Model-Checking and Dynamically-Enforcing Abstractions</a> and a detailed\nwriteup in <a href=\"https://anil.recoil.org/papers/2009-icfem-spl\">Combining Static Model Checking with Dynamic Enforcement Using the Statecall Policy Language</a>. This turned out to be a pretty pragmatic solution\nby using model checking and even included an early visual debugger for protocol\nstate machines. 
The work holds up surprisingly well in 2021: while theorem provers\nand refinement types based languages like Fstar produce amazing results, they\nstill require a lot more effort than my simpler model-checking-based solution.</li>\n</ul>\n<p>All this work resulted in the <a href=\"https://github.com/avsm/melange\">Melange</a> framework\nthat I put together in OCaml and evaluated, and published in my <a href=\"https://anil.recoil.org/papers/anil-phd-thesis\">Creating high-performance, statically type-safe network applications</a> PhD thesis with the following abstract:</p>\n<blockquote>\n<p>A typical Internet server finds itself in the middle of a virtual battleground,\nunder constant threat from worms, viruses and other malware seeking to subvert\nthe original intentions of the programmer. In particular, critical Internet\nservers such as OpenSSH, BIND and Sendmail have had numerous security issues\nranging from low-level buffer overflows to subtle protocol logic errors. These\nproblems have cost billions of dollars as the growth of the Internet exposes\nincreasing numbers of computers to electronic malware. Despite the decades of\nresearch on techniques such as model-checking, type-safety and other forms of\nformal analysis, the vast majority of server implementations continue to be\nwritten unsafely and informally in C/C++.</p>\n<p>In this dissertation we propose an architecture for constructing new\nimplementations of standard Internet protocols which integrates mature\nformal methods not currently used in deployed servers: (i) static type\nsystems from the ML family of functional languages; (ii) model checking to\nverify safety properties exhaustively about aspects of the servers; and (iii)\ngenerative meta-programming to express high-level constraints for the\ndomain-specific tasks of packet parsing and constructing non-deterministic\nstate machines. 
Our architecture \u2014 dubbed MELANGE \u2014 is based on Objective Caml\nand contributes two domain-specific languages: (i) the Meta Packet Language\n(MPL), a data description language used to describe the wire format of a\nprotocol and output statically type-safe code to handle network traffic using\nhigh-level functional data structures; and (ii) the Statecall Policy Language\n(SPL) for constructing non-deterministic finite state automata which are\nembedded into applications and dynamically enforced, or translated into\nPROMELA and statically model-checked. Our research emphasises the importance\nof delivering efficient, portable code which is feasible to deploy across the\nInternet. We implemented two complex protocols \u2014 SSH and DNS \u2014 to verify our\nclaims, and our evaluation shows that they perform faster than their standard\ncounterparts OpenSSH and BIND, in addition to providing static guarantees\nagainst some classes of errors that are currently a major source of security\nproblems.</p>\n</blockquote>\n<p>I didn't do much on this immediately after submitting my thesis since I was busy\nworking on <a href=\"https://anil.recoil.org/projects/xen\">Xen Hypervisor</a> from 2006-2009 or so. However, the first thing I did when\nI quit Citrix was to start the <a href=\"https://mirageos.org\">MirageOS</a> project (the successor to Melange) with\n<a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> and <a href=\"https://github.com/djs55\">Dave Scott</a> in order to develop better personal data infrastructure with\n<a href=\"https://anil.recoil.org/projects/perscon\">Personal Containers</a>. 
This formed the foundation for my subsequent research into library\noperating systems and the concept of <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernels</a>.\nRead more about the subsequent work\nthere, or sample <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp\">Turning Down the LAMP: Software Specialisation for the Cloud</a> to get a taster of the direction\nMelange evolved in.</p>\n<p>Reflecting on my PhD research in 2021, I think that it\nwas a pretty good piece of systems research. It didn't make any deep contributions\nto formal verification or programming language research, but it did posit a clear\nsystems thesis and implement and evaluate it without a huge team being involved.\nThat's more difficult to do these days in the era of large industrial research\nteams dominating the major conferences, but certainly not impossible.</p>\n<p>Choosing a good topic for systems research is crucial, since the context you do\nthe research in is as important as the results you come up with. Much of my subsequent\ncareer has been influenced by the "crazy challenge" that <a href=\"https://github.com/djs55\">Dave Scott</a> set me back in 2003\nto do systems programming in a functional language, with all the intellectual and\nengineering challenges that came along with that extreme (back in 2003) position.</p>\n<p></p></div>",
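The MPL compiler described in this entry generated statically type-safe OCaml parsing code from wire-format descriptions. As a rough flavour of that style, here is a hand-written sketch; it is not actual MPL output, and the 3-byte header layout and `parse_header` function are invented purely for illustration:

```ocaml
(* Illustrative sketch only: MPL generated parsers from a DSL
   specification, whereas this fragment is hand-written and the
   3-byte header layout is invented for the example. It shows the
   style of statically type-safe packet handling described above. *)

type header = {
  version : int;  (* high nibble of the first byte *)
  length  : int;  (* 16-bit big-endian field in bytes 1-2 *)
}

(* Return [None] on a short buffer instead of reading out of bounds,
   the class of error that C parsers turned into buffer overflows. *)
let parse_header (buf : bytes) : header option =
  if Bytes.length buf < 3 then None
  else
    let byte i = Char.code (Bytes.get buf i) in
    Some { version = byte 0 lsr 4;
           length = (byte 1 lsl 8) lor byte 2 }

let () =
  match parse_header (Bytes.of_string "\x45\x00\x14") with
  | Some h -> Printf.printf "version=%d length=%d\n" h.version h.length
  | None -> print_endline "short packet"
```

The point of the generated-code style is that truncated or malformed input surfaces as an `option` the caller must handle, rather than as undefined behaviour.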
+18
avsm/projects_ocamllabs.json
+18
avsm/projects_ocamllabs.json
···+"summary": "I founded a research group called OCaml Labs at the University of Cambridge, with the goal of pushing OCaml and functional programming forward as a platform, making it a more effective tool for all users (including large-scale industrial deployments), while at the same time growing the appeal of the language, broadening its applicability and popularity. Over a decade, we retrofitted multicore parallelism into the mainline OCaml compiler, wrote a popular book on the language, and helped start and grow an OCaml package and tooling ecosystem that is thriving today.\n\n\n## Background\n\nIn my PhD work on [:melange] in around 2003-2007, I developed high performance and reliable protocol implementations in OCaml. Subsequently from 2010, I worked on [:perscon] to build high assurance private data processing platforms. This research led me to really appreciate functional programming as a powerful approach to building robust software, and I got involved in the [Commercial Users of Functional Programming](https://cufp.org) workshop, first as a speaker and then an [organiser](:2011-cufp-scribe) and member of the steering committee.\n\nIt was around this time in 2011 that my work on [:unikernels] and MirageOS was starting to materialise into a real project, but the OCaml language that we wrote everything in didn't have a unified open source community. Instead, there were islands of developers all over the world: the core maintainers concentrated in Inria in France, and academics teaching it in various universities, and some industrial shops like Jane Street or my own experiences from [:2010-icfp-xen]. 
I put my head together with [@yminsky] in Tokyo at ICFP 2011 to see if we could try something a little unique for the time \u2013 establishing a centre for excellence in functional programming that would focus on the open-source and community building aspects of functional programming as well as traditional academic research.\n\n## Early Days (2012-2014)\n\nIn 2012, we launched the centre from the Cambridge Computer Lab in [:announcing-ocaml-labs]. Things moved very quickly indeed as the group grew to around 6 full-time postdocs and engineers, with lots of interns coming through our doors. Our general strategy at this point was to understand the basic problems we were going to tackle, and so started with a few concrete projects to bootstrap the ecosystem:\n\n- publishing [:rwo] with O'Reilly, which sold lots of copies in the early days and created plenty of buzz for OCaml. It was quite fun attending author signings around the world and having lines of people queuing up for a signature!\n- I worked closely with [@samoht] (then CTO at OCamlPro) who led the development of the first version of the [opam](https://opam.ocaml.org) package manager. Both of us were also establishing the MirageOS project at the time, and so we ended up bootstrapping a big chunk of the [opam-repository](https://github.com/ocaml/opam-repository) for use by it, and we also took a (in hindsight excellent) decision to use the nascent GitHub platform as the primary mechanism for managing packages instead of hosting a database. After a few releases in 2012 and then [:opam-1-1-beta], the package manager rapidly established itself as the de facto standard for the OCaml ecosystem. I've been the chief maintainer of the opam-repository ever since then (with many wonderful co-maintainers who do much of the heavy lifting, of course!). As of 2021, there are over 20000 packages in the repository. 
I've been less active since about 2023, but still the repository administrator.\n\nWe also began organising community events, both online and offline:\n- Didier Remy and I organised the inaugural [OCaml Users and Developer's workshop](https://ocaml.org/meetings/ocaml/2012/) in 2012, which morphed in subsequent years into the OCaml Workshop. See [:ocaml-users-group] for an earlier user group meeting as well.\n- [Cambridge Compiler Hacking](https://ocamllabs.io/compiler-hacking/) sessions ran from 2013 to 2017 and served as introductions to new developers with experienced mentors on hand.\n- the conference highlights of the year were undoubtedly the CUFP workshops at ICFP as they combined a really active academic and industrial crowd. The writeups are in [:2011-cufp-scribe], [:2012-cufp-scribe] and [:2013-cufp-scribe] to give you a sense of what went on.\n- we worked with Ashish Agarwal and Christophe Troestler to develop a brand new website to replace the original https://caml.inria.fr one, and this eventually became ocaml.org in around 2012. Almost a decade later, I announced the replacement of this one with a [v3](https://discuss.ocaml.org/t/v3-ocaml-org-a-roadmap-for-ocamls-online-presence/8368/18) version as well.\n- helping to open up OCaml compiler development by improving the GitHub infrastructure and starting the `ocaml` organisation there, such as via [OCaml/GitHub integration](https://web.archive.org/web/20181130130707/https://anil.recoil.org/2014/03/25/ocaml-github-and-opam.html). Eventually, compiler development moved over entirely to GitHub thanks to a big push from the core developer team.\n\nThere was enough activity in the early days that I managed to capture it in annual blog posts:\n- [:the-year-in-ocamllabs]\n- [:ocaml-labs-at-icfp-2014]\n- [:ocamllabs-2014-review]\n\nAfter 2014 though, things had grown to the point where it was just too difficult for me to keep up with the flurry of movement. 
We then aggregated into a \"middle age\" research project around 2015 with the following projects that would take the next few years.\n\n## The OCaml Platform\n\nOne of the main thrusts in OCaml Labs was to construct the tools to enable effective development workflows for OCaml usage at an industrial scale, while remaining maintainable with a small community that needed to migrate from existing workflows. This effort was dubbed the \"OCaml Platform\" and really picked up steam after our release of the opam package manager, since it began the process of unifying the OCaml community around a common package collection.\n\nWhile much of the work was led from OCaml Labs, it's also been highly collaborative with other organisations and individuals in the community. And of course, 100% of the work was released as open source software under a liberal license. I've been giving annual talks since 2013 or so about the steady progress we've been making towards building, testing, documentation and package management for OCaml.\n\n- [:rwo] was the book published by O'Reilly that explained how to use OCaml with the Core library.\n- My 2013 talk on [:2013-oud-platform] first introduced the OCaml Platform just after opam was first released.\n- My 2014 talk on [:2014-oud-platform] continued the steady adoption of opam within the OCaml community, to start bringing a standard package database across the different users.\n- [My 2015 Platform talk](https://www.youtube.com/watch?v=dEUMNuE4rxc&list=PLnqUlCo055hU46uoONmhYGUbYAK27Y6rS&index=8) then introduced continuous integration for opam, as well as the start of the central documentation efforts (which were finally completed in 2021 after some [herculean efforts](https://watch.ocaml.org/videos/watch/9bb452d6-1829-4dac-a6a2-46b31050c931)!).\n- By my [2017 Platform talk](https://speakerdeck.com/avsm/ocaml-platform-2017) in Oxford, we had most of the OCaml community using opam and released opam 2.0, started contributing to the new jbuilder 
build tool from Jane Street, and began the shift from camlp4 to ppx and the development of the new [odoc](https://github.com/ocaml/odoc) tool.\n- In my [2018 Platform talk](https://speakerdeck.com/avsm/the-ocaml-platform-1-dot-0-2018) in Missouri, we had helped evolve jbuilder into the Dune build system (now the build tool of choice in OCaml), and started to combine packaging and build into a cohesive platform. The key challenge so far had been to fill in gaps in functionality, and now we could begin to weave together the components we'd built.\n- My [2019 Platform talk](https://speakerdeck.com/avsm/workflows-in-the-ocaml-platform) in Berlin focussed on how workflows using all these tools would work, such as for package managers or application developers or end users. \n- My [2020 Platform talk](https://speakerdeck.com/avsm/ocaml-platform-2020) saw the unveiling of the [VSCode OCaml Platform plugin](https://github.com/ocamllabs/vscode-ocaml-platform), which provided a seamless integration with the IDE to let all the workflows and tools from earlier years \"just work\" out of the box.\n- In 2021, we embarked on a huge mission to [rebuild the ocaml.org online presence](https://discuss.ocaml.org/t/v3-ocaml-org-a-roadmap-for-ocamls-online-presence/8368/27) with a central documentation site that built 20000 packages with cross-referenced HTML documentation.\n\nAs you can see, it's quite a journey to build community-driven development tools. A key to our approach was to \"leave no OCaml project behind\", and we spent considerable effort ensuring that every step of the tooling evolution had a migration path for older OCaml projects. As a result, it's often still possible to compile 20 year old OCaml code using the modern tooling.\n\n## Multicore OCaml\n\nThe other big research project we drove from OCaml Labs was the effort to bring multicore parallelism to OCaml. 
While this might seem straightforward, we quickly realised that the challenge was in preserving _existing_ sequential performance while also allowing new code to take advantage of multicore CPUs.\n\nThe first talk we gave was in 2014 on [:2014-oud-multicore]. Little did we know how much work it would take to get this production worthy!\nAfter several years of hacking, we finally had several breakthroughs:\n- Any multicore-capable language needs a well-defined memory model, and we realised that none of the existing ones (e.g. in C++ or Java) were particularly satisfactory. Our PLDI paper on [:2018-pldi-memorymodel] defined a sensible and novel memory model for OCaml that was predictable for developers.\n- Our garbage collector and runtime design won the best paper award at ICFP for its systematic approach to the design and evaluation of several minor heap collectors, in [:2020-icfp-retropar].\n\n## Algebraic Effects\n\nWhile working on parallelism in OCaml with [@lpw25] and [@sdolan], [@kc] joined our group after completing his PhD at Purdue, and started us down the path of using algebraic effects to express concurrency in OCaml code.\n \n- The [:2017-ml-effects] and [:2017-tfp-effecthandlers] papers were our first forays into using the effect system for realistic use cases such as Unix systems programming.\n- We then spent a few years engineering a full production-quality version of runtime fibres in [:2021-pldi-retroeff], again with a focus on maintaining tooling compatibility (e.g. with debuggers) and also having a minimal impact on sequential performance for existing code.\n\nIn around 2020, I started publishing [multicore monthlies](https://discuss.ocaml.org/tag/multicore-monthly) on the OCaml discussion forum. This was because we had begun the journey to upstream our feature into the mainline OCaml compiler. 
At the end of 2020, [@kc] opened up a pull request to the mainline OCaml repository ([#10831](https://github.com/ocaml/ocaml/pull/10831)) and it got merged in early 2022, adding domains-parallelism and runtime fibres into OCaml 5.0! The amount of work that we put into multicore has been way more than I expected at the outset of the project, but the results are deeply satisfying. I'm finding coding with effects in a mainstream PL like OCaml to be really fun, and anticipate this giving a big boost to [:unikernels] in MirageOS, which are struggling somewhat under the weight of over-functorisation for portability. It was also really fun seeing [how much online attention](https://news.ycombinator.com/item?id=29878605) we got as we went through the upstreaming journey.\n\n## OCaml Labs to Tarides (2021-present)\n\nThe OCaml Labs research project at the University of Cambridge finally came to\na happy end in 2021, after almost ten years. After the first decade of fundamental\nresearch and early engineering, the maintainership and stewarding of the resulting code has only\npicked up pace as the OCaml userbase grows. There are now *three* commercial\ncompanies who have taken over the work from the University, all run by research\nstaff originally in the Computer Lab group ([@gemmag], [@kc] and [@samoht]).\n\n- [OCaml Labs Consultancy](https://ocamllabs.io) is based in Cambridge in the UK.\n- [Tarides](https://tarides.com) is based in Paris, France.\n- [Segfault Systems](https://segfault.systems) is based in Chennai, India.\n\nAll of those groups merged into one unified Tarides in 2022 ([OCLC](https://tarides.com/blog/2022-01-27-ocaml-labs-joins-tarides/) and [Segfault](https://segfault.systems)), making it easier to manage a growing community of maintainers. 
There's really exciting work happening there to continue the upstreaming of the\nmulticore OCaml features into mainline OCaml, making unikernels and MirageOS ever more practical and robust to deploy, and shipping end-to-end Windows support in the OCaml toolchain. You can read about all this and more on the [Tarides blog](https://tarides.com/blog/), which is regularly updated with news on their projects.",+"content": "<div><h1>OCaml Labs</h1><p></p><p>I founded a research group called OCaml Labs at the University of Cambridge, with the goal of pushing OCaml and functional programming forward as a platform, making it a more effective tool for all users (including large-scale industrial deployments), while at the same time growing the appeal of the language, broadening its applicability and popularity. Over a decade, we retrofitted multicore parallelism into the mainline OCaml compiler, wrote a popular book on the language, and helped start and grow an OCaml package and tooling ecosystem that is thriving today.</p>\n<h2><a href=\"https://anil.recoil.org/#background\"></a>Background</h2>\n<p>In my PhD work on <a href=\"https://anil.recoil.org/projects/melange\">Functional Internet Services</a> in around 2003-2007, I developed high-performance and reliable protocol implementations in OCaml. Subsequently from 2010, I worked on <a href=\"https://anil.recoil.org/projects/perscon\">Personal Containers</a> to build high-assurance private data processing platforms. 
This research led me to really appreciate functional programming as a powerful approach to building robust software, and I got involved in the <a href=\"https://cufp.org\">Commercial Users of Functional Programming</a> workshop, first as a speaker and then an <a href=\"https://anil.recoil.org/papers/2011-cufp-scribe\">organiser</a> and member of the steering committee.</p>\n<p>It was around this time in 2011 that my work on <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernels</a> and MirageOS was starting to materialise into a real project, but the OCaml language that we wrote everything in didn't have a unified open source community. Instead, there were islands of developers all over the world: the core maintainers concentrated at Inria in France, academics teaching it in various universities, and some industrial shops like Jane Street, alongside my own experiences from <a href=\"https://anil.recoil.org/papers/2010-icfp-xen\">Using functional programming within an industrial product group: perspectives and perceptions</a>. I put my head together with <a href=\"https://github.com/yminsky\">Yaron Minsky</a> in Tokyo at ICFP 2011 to see if we could try something a little unusual for the time \u2013 establishing a centre for excellence in functional programming that would focus on the open-source and community building aspects of functional programming as well as traditional academic research.</p>\n<h2><a href=\"https://anil.recoil.org/#early-days-2012-2014\"></a>Early Days (2012-2014)</h2>\n<p>In 2012, we launched the centre from the Cambridge Computer Lab in <a href=\"https://anil.recoil.org/notes/announcing-ocaml-labs\">Announcing OCaml Labs</a>. Things moved very quickly indeed as the group grew to around six full-time postdocs and engineers, with lots of interns coming through our doors. 
Our general strategy at this point was to understand the basic problems we were going to tackle, and so we started with a few concrete projects to bootstrap the ecosystem:</p>\n<ul>\n<li>publishing <a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a> with O'Reilly, which sold lots of copies in the early days and created plenty of buzz for OCaml. It was quite fun attending author signings around the world and having lines of people queuing up for a signature!</li>\n<li>I worked closely with <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> (then CTO at OCamlPro) who led the development of the first version of the <a href=\"https://opam.ocaml.org\">opam</a> package manager. Both of us were also establishing the MirageOS project at the time, and so we ended up bootstrapping a big chunk of the <a href=\"https://github.com/ocaml/opam-repository\">opam-repository</a> for use by it, and we also took an (in hindsight excellent) decision to use the nascent GitHub platform as the primary mechanism for managing packages instead of hosting a database. After a few releases in 2012 and then <a href=\"https://anil.recoil.org/notes/opam-1-1-beta\">OPAM 1.1 beta available, with pretty colours</a>, the package manager rapidly established itself as the de facto standard for the OCaml ecosystem. I've been the chief maintainer of the opam-repository ever since then (with many wonderful co-maintainers who do much of the heavy lifting, of course!). As of 2021, there are over 20000 packages in the repository. I've been less active since about 2023, but still remain the repository administrator.</li>\n</ul>\n<p>We also began organising community events, both online and offline:</p>\n<ul>\n<li>Didier Remy and I organised the inaugural <a href=\"https://ocaml.org/meetings/ocaml/2012/\">OCaml Users and Developer's workshop</a> in 2012, which morphed in subsequent years into the OCaml Workshop. 
See <a href=\"https://anil.recoil.org/notes/ocaml-users-group\">Camel Spotting in Paris</a> for an earlier user group meeting as well.</li>\n<li><a href=\"https://ocamllabs.io/compiler-hacking/\">Cambridge Compiler Hacking</a> sessions ran from 2013 to 2017 and served as introductions for new developers, with experienced mentors on hand.</li>\n<li>the conference highlights of the year were undoubtedly the CUFP workshops at ICFP, as they combined a really active academic and industrial crowd. The writeups are in <a href=\"https://anil.recoil.org/papers/2011-cufp-scribe\">CUFP 2011 Workshop Report</a>, <a href=\"https://anil.recoil.org/papers/2012-cufp-scribe\">Commercial users of functional programming workshop report</a> and <a href=\"https://anil.recoil.org/papers/2013-cufp-scribe\">CUFP'13 scribe's report</a> to give you a sense of what went on.</li>\n<li>we worked with Ashish Agarwal and Christophe Troestler to develop a brand new website to replace the original https://caml.inria.fr one, and this eventually became ocaml.org in around 2012. Almost a decade later, I announced the replacement of this one with a <a href=\"https://discuss.ocaml.org/t/v3-ocaml-org-a-roadmap-for-ocamls-online-presence/8368/18\">v3</a> version as well.</li>\n<li>helping to open up OCaml compiler development by improving the GitHub infrastructure and starting the <code>ocaml</code> organisation there, such as via <a href=\"https://web.archive.org/web/20181130130707/https://anil.recoil.org/2014/03/25/ocaml-github-and-opam.html\">OCaml/GitHub integration</a>. 
Eventually, compiler development moved over entirely to GitHub thanks to a big push from the core developer team.</li>\n</ul>\n<p>There was enough activity in the early days that I managed to capture it in annual blog posts:</p>\n<ul>\n<li><a href=\"https://anil.recoil.org/notes/the-year-in-ocamllabs\">Reviewing the first year of OCaml Labs in 2013</a></li>\n<li><a href=\"https://anil.recoil.org/notes/ocaml-labs-at-icfp-2014\">Talks from OCaml Labs during ICFP 2014</a></li>\n<li><a href=\"https://anil.recoil.org/notes/ocamllabs-2014-review\">Reviewing the second year of OCaml Labs in 2014</a></li>\n</ul>\n<p>After 2014 though, things had grown to the point where it was just too difficult for me to keep up with the flurry of movement. We then consolidated into a &quot;middle age&quot; research project around 2015, with the following projects occupying the next few years.</p>\n<h2><a href=\"https://anil.recoil.org/#the-ocaml-platform\"></a>The OCaml Platform</h2>\n<p>One of the main thrusts in OCaml Labs was to construct the tools to enable effective development workflows for OCaml usage at an industrial scale, while remaining maintainable by a small community that needed to migrate from existing workflows. This effort was dubbed the &quot;OCaml Platform&quot; and really picked up steam after our release of the opam package manager, since it began the process of unifying the OCaml community around a common package collection.</p>\n<p>While much of the work was led from OCaml Labs, it's also been highly collaborative with other organisations and individuals in the community. And of course, 100% of the work was released as open source software under a liberal license. 
I've been giving annual talks since 2013 or so about the steady progress we've been making towards building, testing, documentation and package management for OCaml.</p>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/rwo\">Real World OCaml: Functional Programming for the Masses</a> was the book published by O'Reilly that explained how to use OCaml with the Core library.</li>\n<li>My 2013 talk on <a href=\"https://anil.recoil.org/papers/2013-oud-platform\">The OCaml Platform v0.1</a> first introduced the OCaml Platform just after opam was first released.</li>\n<li>My 2014 talk on <a href=\"https://anil.recoil.org/papers/2014-oud-platform\">The OCaml Platform v1.0</a> continued the steady adoption of opam within the OCaml community, to start bringing a standard package database across the different users.</li>\n<li><a href=\"https://www.youtube.com/watch?v=dEUMNuE4rxc&list=PLnqUlCo055hU46uoONmhYGUbYAK27Y6rS&index=8\">My 2015 Platform talk</a> then introduced continuous integration for opam, as well as the start of the central documentation efforts (which were finally completed in 2021 after some <a href=\"https://watch.ocaml.org/videos/watch/9bb452d6-1829-4dac-a6a2-46b31050c931\">herculean efforts</a>!).</li>\n<li>By my <a href=\"https://speakerdeck.com/avsm/ocaml-platform-2017\">2017 Platform talk</a> in Oxford, we had most of the OCaml community using opam and released opam 2.0, started contributing to the new jbuilder build tool from Jane Street, and began the shift from camlp4 to ppx and the development of the new <a href=\"https://github.com/ocaml/odoc\">odoc</a> tool.</li>\n<li>In my <a href=\"https://speakerdeck.com/avsm/the-ocaml-platform-1-dot-0-2018\">2018 Platform talk</a> in Missouri, we had helped evolve jbuilder into the Dune build system (now the build tool of choice in OCaml), and started to combine packaging and build into a cohesive platform. 
The key challenge so far had been to fill in gaps in functionality, and now we could begin to weave together the components we'd built.</li>\n<li>My <a href=\"https://speakerdeck.com/avsm/workflows-in-the-ocaml-platform\">2019 Platform talk</a> in Berlin focussed on how workflows using all these tools would work, such as for package managers or application developers or end users.</li>\n<li>My <a href=\"https://speakerdeck.com/avsm/ocaml-platform-2020\">2020 Platform talk</a> saw the unveiling of the <a href=\"https://github.com/ocamllabs/vscode-ocaml-platform\">VSCode OCaml Platform plugin</a>, which provided a seamless integration with the IDE to let all the workflows and tools from earlier years "just work" out of the box.</li>\n<li>In 2021, we embarked on a huge mission to <a href=\"https://discuss.ocaml.org/t/v3-ocaml-org-a-roadmap-for-ocamls-online-presence/8368/27\">rebuild the ocaml.org online presence</a> with a central documentation site that built 20000 packages with cross-referenced HTML documentation.</li>\n</ul>\n<p>As you can see, it's quite a journey to build community-driven development tools. A key to our approach was to "leave no OCaml project behind", and we spent considerable effort ensuring that every step of the tooling evolution had a migration path for older OCaml projects. As a result, it's often still possible to compile 20 year old OCaml code using the modern tooling.</p>\n<h2><a href=\"https://anil.recoil.org/#multicore-ocaml\"></a>Multicore OCaml</h2>\n<p>The other big research project we drove from OCaml Labs was the effort to bring multicore parallelism to OCaml. While this might seem straightforward, we quickly realised that the challenge was in preserving <em>existing</em> sequential performance while also allowing new code to take advantage of multicore CPUs.</p>\n<p>The first talk we gave was in 2014 on <a href=\"https://anil.recoil.org/papers/2014-oud-multicore\">Multicore OCaml</a>. 
Little did we know how much work it would take to get this production-worthy!\nAfter several years of hacking, we finally had several breakthroughs:</p>\n<ul>\n<li>Any multicore-capable language needs a well-defined memory model, and we realised that none of the existing ones (e.g. in C++ or Java) were particularly satisfactory. Our PLDI paper on <a href=\"https://anil.recoil.org/papers/2018-pldi-memorymodel\">Bounding data races in space and time</a> defined a sensible and novel memory model for OCaml that was predictable for developers.</li>\n<li>Our garbage collector and runtime design won the best paper award at ICFP for its systematic approach to the design and evaluation of several minor heap collectors, in <a href=\"https://anil.recoil.org/papers/2020-icfp-retropar\">Retrofitting parallelism onto OCaml</a>.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#algebraic-effects\"></a>Algebraic Effects</h2>\n<p>While working on parallelism in OCaml with <a href=\"https://github.com/lpw25\">Leo White</a> and <a href=\"https://github.com/stedolan\">Stephen Dolan</a>, <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> joined our group after completing his PhD at Purdue, and started us down the path of using algebraic effects to express concurrency in OCaml code.</p>\n<ul>\n<li>The <a href=\"https://anil.recoil.org/papers/2017-ml-effects\">Effectively tackling the awkward squad</a> and <a href=\"https://anil.recoil.org/papers/2017-tfp-effecthandlers\">Concurrent System Programming with Effect Handlers</a> papers were our first forays into using the effect system for realistic use cases such as Unix systems programming.</li>\n<li>We then spent a few years engineering a full production-quality version of runtime fibres in <a href=\"https://anil.recoil.org/papers/2021-pldi-retroeff\">Retrofitting effect handlers onto OCaml</a>, again with a focus on maintaining tooling compatibility (e.g. 
with debuggers) and also having a minimal impact on sequential performance for existing code.</li>\n</ul>\n<p>In around 2020, I started publishing <a href=\"https://discuss.ocaml.org/tag/multicore-monthly\">multicore monthlies</a> on the OCaml discussion forum. This was because we had begun the journey to upstream our features into the mainline OCaml compiler. At the end of 2020, <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> opened up a pull request to the mainline OCaml repository (<a href=\"https://github.com/ocaml/ocaml/pull/10831\">#10831</a>) and it got merged in early 2022, adding domains-parallelism and runtime fibres into OCaml 5.0! The amount of work that we put into multicore has been way more than I expected at the outset of the project, but the results are deeply satisfying. I'm finding coding with effects in a mainstream PL like OCaml to be really fun, and anticipate this giving a big boost to <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernels</a> in MirageOS, which are struggling somewhat under the weight of over-functorisation for portability. It was also really fun seeing <a href=\"https://news.ycombinator.com/item?id=29878605\">how much online attention</a> we got as we went through the upstreaming journey.</p>\n<h2><a href=\"https://anil.recoil.org/#ocaml-labs-to-tarides-2021-present\"></a>OCaml Labs to Tarides (2021-present)</h2>\n<p>The OCaml Labs research project at the University of Cambridge finally came to\na happy end in 2021, after almost ten years. After the first decade of fundamental\nresearch and early engineering, the maintainership and stewarding of the resulting code has only\npicked up pace as the OCaml userbase grows. 
There are now <em>three</em> commercial\ncompanies who have taken over the work from the University, all run by research\nstaff originally in the Computer Lab group (<span>Gemma Gordon</span>, <a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a>).</p>\n<ul>\n<li><a href=\"https://ocamllabs.io\">OCaml Labs Consultancy</a> is based in Cambridge in the UK.</li>\n<li><a href=\"https://tarides.com\">Tarides</a> is based in Paris, France.</li>\n<li><a href=\"https://segfault.systems\">Segfault Systems</a> is based in Chennai, India.</li>\n</ul>\n<p>All of those groups merged into one unified Tarides in 2022 (<a href=\"https://tarides.com/blog/2022-01-27-ocaml-labs-joins-tarides/\">OCLC</a> and <a href=\"https://segfault.systems\">Segfault</a>), making it easier to manage a growing community of maintainers. There's really exciting work happening there to continue the upstreaming of the\nmulticore OCaml features into mainline OCaml, making unikernels and MirageOS ever more practical and robust to deploy, and shipping end-to-end Windows support in the OCaml toolchain. You can read about all this and more on the <a href=\"https://tarides.com/blog/\">Tarides blog</a>, which is regularly updated with news on their projects.</p>\n<p></p></div>",
+18
avsm/projects_osmose.json
+18
avsm/projects_osmose.json
···+"summary": "Digital infrastructure in modern urban environments is currently very\nInternet-centric, and involves transmitting data to physically remote\nenvironments. The cost for this is data insecurity, high response latency and\nunpredictable reliability of services. I am working on Osmose -- a new OS\narchitecture that inverts the current model by building an operating system\ndesigned to securely connect physical spaces with extremely low latency, high\nbandwidth local-area computation capabilities and service discovery.\n\n\nIn 2018, I was starting to wrap up a multi-year focus on [:unikernels],\nand I went back to look over the state of personal data handling (as I'd\nfinished working on [:perscon] in 2016). Things had regressed fairly\ndramatically -- central cloud providers and particularly IoT manufacturers\nwere moving heavily towards ubiquitous surveillance and centralised management.\n\nI started with trying to find a different slant on existing architectures for\nsmart buildings. Why couldn't we invert the Internet so that data is pooled in\na single _physical location_ by default, with networking being opt-in? Why\ncan't we build all of our ubiquitous computing infrastructure (such as voice\nand face recognition) so that it runs locally within the building rather than\nstreamed from remote datacentres? There would be gains all around -- latency,\nenergy usage, offline operation -- if we could figure out how to deploy local\nmachine learning services.\n\nI wrote up the initial thoughts behind this in a workshop\npaper in [:2018-hotpost-osmose]. 
Since then, I've been collaborating with the\ngood folks at Tarides on building out the library infrastructure in MirageOS\nto set up a prototype set of rooms in Cambridge and Paris that can act as a testbed\nfor our ideas.\n\nThe intention behind the Osmose design is to \"invert\" the architecture\nof smart cities to be self-contained units by default, and only communicate\nwhen required for the purpose of remote interaction. All sensing and storage\nis conducted locally -- resulting in energy efficiency gains, security by\ndefault for sensitive data, and robustness against communications outages\naffecting critical physical infrastructure.\n\nTwo significant advances in 2023 and 2024 on this project were:\n- [:2023-hotnets-sns] which explores a DNS architecture for naming places\n- [:2024-socc-murmuration] which explores a decentralised scheduling architecture for lower job completion times\n\n## Ultra-Low-Power AI Infrastructure\n\nA significant development in 2024-2025 has been our work on [ultra-low-power\nneural processing units](:2025-npu-bench) for edge deployment, directly\nsupporting the Osmose vision of local AI services. Our [benchmarking work](:2025-npu-bench) provides the first comparative evaluation of commercially-available \u03bcNPUs, revealing surprising disparities between hardware specifications and actual performance.\n\nThis connects to our broader research on [energy-aware deep learning](:2025-dl-rcn) for resource-constrained hardware. The combination of energy harvesting, intermittent operation, and sophisticated AI processing represents exactly the kind of intersection we need for sustainable smart building infrastructure.",+"content": "<div><h1>Interspatial OS</h1><p></p><p>Digital infrastructure in modern urban environments is currently very\nInternet-centric, and involves transmitting data to physically remote\nenvironments. The cost for this is data insecurity, high response latency and\nunpredictable reliability of services. 
I am working on Osmose -- a new OS\narchitecture that inverts the current model by building an operating system\ndesigned to securely connect physical spaces with extremely low latency, high\nbandwidth local-area computation capabilities and service discovery.</p>\n<p>In 2018, I was starting to wrap up a multi-year focus on <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernels</a>,\nand I went back to look over the state of personal data handling (as I'd\nfinished working on <a href=\"https://anil.recoil.org/projects/perscon\">Personal Containers</a> in 2016). Things had regressed fairly\ndramatically -- central cloud providers and particularly IoT manufacturers\nwere moving heavily towards ubiquitous surveillance and centralised management.</p>\n<p>I started with trying to find a different slant on existing architectures for\nsmart buildings. Why couldn't we invert the Internet so that data is pooled in\na single <em>physical location</em> by default, with networking being opt-in? Why\ncan't we build all of our ubiquitous computing infrastructure (such as voice\nand face recognition) so that it runs locally within the building rather than\nstreamed from remote datacentres? There would be gains all around -- latency,\nenergy usage, offline operation -- if we could figure out how to deploy local\nmachine learning services.</p>\n<p>I wrote up the initial thoughts behind this in a workshop\npaper in <a href=\"https://anil.recoil.org/papers/2018-hotpost-osmose\">An architecture for interspatial communication</a>. Since then, I've been collaborating with the\ngood folks at Tarides on building out the library infrastructure in MirageOS\nto set up a prototype set of rooms in Cambridge and Paris that can act as a testbed\nfor our ideas.</p>\n<p>The intention behind the Osmose design is to &quot;invert&quot; the architecture\nof smart cities to be self-contained units by default, and only communicate\nwhen required for the purpose of remote interaction. 
All sensing and storage\nis conducted locally -- resulting in energy efficiency gains, security by\ndefault for sensitive data, and robustness against communications outages\naffecting critical physical infrastructure.</p>\n<p>Two significant advances in 2023 and 2024 on this project were:</p>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2023-hotnets-sns\">Where on Earth is the Spatial Name System?</a> which explores a DNS architecture for naming places</li>\n<li><a href=\"https://anil.recoil.org/papers/2024-socc-murmuration\">Scheduling for Reduced Tail Task Latencies in Highly Utilized Datacenters</a> which explores a decentralised scheduling architecture for lower job completion times</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#ultra-low-power-ai-infrastructure\"></a>Ultra-Low-Power AI Infrastructure</h2>\n<p>A significant development in 2024-2025 has been our work on <a href=\"https://anil.recoil.org/papers/2025-npu-bench\">ultra-low-power neural processing units</a> for edge deployment, directly\nsupporting the Osmose vision of local AI services. Our <a href=\"https://anil.recoil.org/papers/2025-npu-bench\">benchmarking work</a> provides the first comparative evaluation of commercially-available \u03bcNPUs, revealing surprising disparities between hardware specifications and actual performance.</p>\n<p>This connects to our broader research on <a href=\"https://anil.recoil.org/papers/2025-dl-rcn\">energy-aware deep learning</a> for resource-constrained hardware. The combination of energy harvesting, intermittent operation, and sophisticated AI processing represents exactly the kind of intersection we need for sustainable smart building infrastructure.</p>\n<p></p></div>",
+18
avsm/projects_perscon.json
+18
avsm/projects_perscon.json
···+"summary": "As cloud computing empowered the creation of vast data silos, I investigated how decentralised technologies might be deployed to allow individuals more vertical control over their own data. Personal containers was the prototype we built to learn how to stem the flow of our information out to the ad-driven social tarpits. We also deployed personal containers in an experimental data locker system at the University of Cambridge in order to incentivise lower-carbon travel schemes.\n\n\nI've had a passion for self-hosted, decentralised computing for many years\nsince [@nickludlam] and I set up the recoil.org collective in the late 90s. In\nlate 2008, I'd been working on early cloud computing as part of the [:xen]\nproject and already seeing the rapid rise of centralised data gathering in\nthe early cloud providers. When I left Citrix in 2009, I joined [@mac] and\n[@crowcroft] in their new [Horizon Digital Economy](https://www.horizon.ac.uk)\ncentre to lead a charge into building more privacy-centred digital infrastructure.\nI had the huge privilege of receiving a strings-free 5-year postdoctoral fellowship in\nCambridge. It's rare to see such long term postdoc opportunities these days, but\nsomething I am hugely supportive of for new projects.\n\nMy hacking first began with [@nickludlam] in 2008 on a prototype of a\n[lifedb](https://github.com/avsm/lifedb-server) server and app, which we\nenvisioned as a place to aggregate all the messages from disparate sources (for\nexample, to mirror the then-new Twitter service into my IMAP email). 
I worked\non [:2010-smarte-privacybutler] to add a policy engine to this prototype.\nWhile the prototype worked well enough for me, it was largely a negative result\nsince it was just too risky to put all that private data in one location\n(especially aggregated).\n\nNow back at Cambridge in 2010, I began working with [@samoht] on a more robust\nimplementation of data aggregation that would have stronger end-to-end security\nand privacy. We started coding up an implementation in OCaml to follow up\nmy [:melange] work, and built out infrastructure like an OCaml ORM in\n[:2010-dyntype-wgt] to make it easier to work with databases. It became\nobvious pretty quickly that having this much data in one place required\nend users to become sysadmins, and so I started to lay out a new architecture\nfor this sort of end-user managed data in [:2010-bcs-visions].\n\nOur first prototype of a personal container running as a unikernel was published\nin [:2010-hotcloud-lamp], and would form the basis of the MirageOS project. To this day, the MirageOS community remains passionate about decentralised systems from these origins! We explored a number of directions in the early days:\n\n- [:2010-iswp-dustclouds] looked into spawning tiny unikernels on public cloud infrastructure to form a \"fast flux\" for onion routing. This remains a pretty good idea and something I'd like to see implemented on modern public clouds!\n- [:de10-perscon] was the evolution of the lifedb into the \"personal container\". Although its domain name is now offline, you can still find the [original perscon.net blog](https://github.com/avsm/perscon.net) repository. I worked pretty hard on a [perscon prototype](https://github.com/avsm/perscon) that you can read about in [:uiprototype] and [:yurts-for-digital-nomads].\n- [:2011-nsdi-ciel] investigated what a distributed dataflow engine might look like to help with processing the vast amounts of personal data we were working with. 
The primary author of CIEL [@mrry] went on to develop Naiad and other influential systems in this space, but I still like CIEL's very simple model. I built a simple continuation-based implementation in [:datacaml-with-ciel], and as of 2021 am continuing this work again with OCaml's multicore effects in [:ocamllabs].\n- From an Internet architecture perspective, another fascinating line of thought we came up with was the notion of giving every user their own domain name server that would give them fine-grained control over network connectivity. The [:2012-sigcomm-signposts] and [:2013-foci-signposts] papers both lay out an architecture for a DNSSEC-based dynamic DNS server that users can control. We explored how a \"polyversal TCP\" might look for making p2p connections from this in [:2012-conext-pvtcp], as well as a software OpenFlow switch to route data from cloud to edge devices in [:2012-iccsdn-mirageflow].\n- [:2012-ahans-soapp] was the result of my collaboration with the just-established CHERI project at the Computer Lab on compartmentalisation interfaces, another area of programming that continues to need improvement.\n\nOne of the main drivers for personal containers was to enable applications that would otherwise be too invasive from a privacy perspective. [@iml] and I worked on the \"c-aware\" project in [:2012-mpm-caware] to figure out if personal containers could help influence user behaviour to reduce carbon usage. Overall, this project taught us just how much effort it would be to deploy real-world infrastructure in corporate environments like the University of Cambridge. We also struggled to get any users to deploy our prototype servers, something explored more in user studies with colleagues in Horizon Nottingham in [:de13-dataware].\n\nMy work on personal data processing petered out from a research perspective in around 2013 since the underlying infrastructure I had built really started gathering steam with [:unikernels] and [:ocamllabs]. 
We hadn't quite cracked the problem of how to break the cloud hegemony, but (as with XenoServers and Xen), the pieces that succeeded emerged from the research questions we asked.\nHowever, I don't consider this project permanently closed by any means -- after all, I've been self hosting my email since 1997! We've been working steadily over the past decade of MirageOS (as of 2021) to build out a really solid, self-hosted protocol stack that will work as a unikernel. I am revisiting the question of decentralisation in the form of physical infrastructure in the [:osmose] project, and you can read my early thoughts in [:2018-hotpost-osmose].",+"content": "<div><h1>Personal Containers</h1><p></p><p>As cloud computing empowered the creation of vast data silos, I investigated how decentralised technologies might be deployed to allow individuals more vertical control over their own data. Personal containers was the prototype we built to learn how to stem the flow of our information out to the ad-driven social tarpits. We also deployed personal containers in an experimental data locker system at the University of Cambridge in order to incentivise lower-carbon travel schemes.</p>\n<p>I've had a passion for self-hosted, decentralised computing for many years\nsince <a href=\"https://nick.recoil.org\">Nick Ludlam</a> and I set up the recoil.org collective in the late 90s. In\nlate 2008, I'd been working on early cloud computing as part of the <a href=\"https://anil.recoil.org/projects/xen\">Xen Hypervisor</a>\nproject and already seeing the rapid rise of centralised data gathering in\nthe early cloud providers. 
When I left Citrix in 2009, I joined <a href=\"https://drdrmc.github.io/about/\">Derek McAuley</a> and\n<a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a> in their new <a href=\"https://www.horizon.ac.uk\">Horizon Digital Economy</a>\ncentre to lead a charge into building more privacy-centred digital infrastructure.\nI had the huge privilege of receiving a strings-free 5-year postdoctoral fellowship in\nCambridge. It's rare to see such long-term postdoc opportunities these days, but\nsomething I am hugely supportive of for new projects.</p>\n<p>My hacking first began with <a href=\"https://nick.recoil.org\">Nick Ludlam</a> in 2008 on a prototype of a\n<a href=\"https://github.com/avsm/lifedb-server\">lifedb</a> server and app, which we\nenvisioned as a place to aggregate all the messages from disparate sources (for\nexample, to mirror the then-new Twitter service into my IMAP email). I worked\non <a href=\"https://anil.recoil.org/papers/2010-smarte-privacybutler\">Privacy Butler: A Personal Privacy Rights Manager for Online Presence</a> to add a policy engine to this prototype.\nWhile the prototype worked well enough for me, it was largely a negative result\nsince it was just too risky to put all that private data in one location\n(especially aggregated).</p>\n<p>Now back at Cambridge in 2010, I began working with <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> on a more robust\nimplementation of data aggregation that would have stronger end-to-end security\nand privacy. We started coding up an implementation in OCaml to follow up\nmy <a href=\"https://anil.recoil.org/projects/melange\">Functional Internet Services</a> work, and built out infrastructure like an OCaml ORM in\n<a href=\"https://anil.recoil.org/papers/2010-dyntype-wgt\">Dynamics for ML using Meta-Programming</a> to make it easier to work with databases. 
It became\nobvious pretty quickly that having this much data in one place required\nend users to become sysadmins, and so I started to lay out a new architecture\nfor this sort of end-user managed data in <a href=\"https://anil.recoil.org/papers/2010-bcs-visions\">Multiscale not multicore: efficient heterogeneous cloud computing</a>.</p>\n<p>Our first prototype of a personal container running as a unikernel was published\nin <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp\">Turning Down the LAMP: Software Specialisation for the Cloud</a>, and would form the basis of the MirageOS project. To this day, the MirageOS community remains passionate about decentralised systems from these origins! We explored a number of directions in the early days:</p>\n<ul>\n<li><a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds\">Using Dust Clouds to Enhance Anonymous Communication</a> looked into spawning tiny unikernels on public cloud infrastructure to form a "fast flux" for onion routing. This remains a pretty good idea and something I'd like to see implemented on modern public clouds!</li>\n<li><a href=\"https://anil.recoil.org/papers/de10-perscon\">The personal container, or your life in bits</a> was the evolution of the lifedb into the "personal container". Although its domain name is now offline, you can still find the <a href=\"https://github.com/avsm/perscon.net\">original perscon.net blog</a> repository. 
I worked pretty hard on a <a href=\"https://github.com/avsm/perscon\">perscon prototype</a> that you can read about in <a href=\"https://anil.recoil.org/notes/uiprototype\">Pulling together a user interface</a> and <a href=\"https://anil.recoil.org/notes/yurts-for-digital-nomads\">Yurts for Digital Nomads</a>.</li>\n<li><a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel\">CIEL: A universal execution engine for distributed data-flow computing</a> investigated what a distributed dataflow engine might look like to help with processing the vast amounts of personal data we were working with. The primary author of CIEL, <a href=\"https://github.com/mrry\">Derek Murray</a>, went on to develop Naiad and other influential systems in this space, but I still like CIEL's very simple model. I built a simple continuation-based implementation in <a href=\"https://anil.recoil.org/notes/datacaml-with-ciel\">DataCaml: distributed dataflow programming in OCaml</a>, and as of 2021 am continuing this work again with OCaml's multicore effects in <a href=\"https://anil.recoil.org/projects/ocamllabs\">OCaml Labs</a>.</li>\n<li>From an Internet architecture perspective, another fascinating line of thought we came up with was the notion of giving every user their own domain name server that would give them fine-grained control over network connectivity. The <a href=\"https://anil.recoil.org/papers/2012-sigcomm-signposts\">Signposts: end-to-end networking in a world of middleboxes</a> and <a href=\"https://anil.recoil.org/papers/2013-foci-signposts\">Lost in the Edge: Finding Your Way with DNSSEC Signposts</a> papers both lay out an architecture for a DNSSEC-based dynamic DNS server that users can control. 
We explored how a &quot;polyversal TCP&quot; might look for making p2p connections from this in <a href=\"https://anil.recoil.org/papers/2012-conext-pvtcp\">Evolving TCP: how hard can it be?</a>, as well as a software Openflow switch to route data from cloud to edge devices in <a href=\"https://anil.recoil.org/papers/2012-iccsdn-mirageflow\">Cost, Performance &amp; Flexibility in OpenFlow: Pick three</a>.</li>\n<li><a href=\"https://anil.recoil.org/papers/2012-ahans-soapp\">Exploring Compartmentalisation Hypotheses with SOAAP</a> was the result of my collaboration with the just-established CHERI project at the Computer Lab on compartmentalisation interfaces, another area of programming that continues to need improvement.</li>\n</ul>\n<p>One of the main drivers for personal containers was to enable applications that would otherwise be too invasive from a privacy perspective. <span>Ian Leslie</span> and I worked on the &quot;c-aware&quot; project in <a href=\"https://anil.recoil.org/papers/2012-mpm-caware\">Confidential carbon commuting: exploring a privacy-sensitive architecture for incentivising 'greener' commuting</a> to figure out if personal containers could help influence user behaviour to reduce carbon usage. Overall, this project taught us just how much effort it would be to deploy real-world infrastructure in corporate environments like the University of Cambridge. We also struggled to get any users to deploy our prototype servers, something explored more in user studies with colleagues in Horizon Nottingham in <a href=\"https://anil.recoil.org/papers/de13-dataware\">Perceived risks of personal data sharing</a>.</p>\n<p>My work on personal data processing petered out from a research perspective in around 2013 since the underlying infrastructure I had built really started gathering steam with <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernels</a> and <a href=\"https://anil.recoil.org/projects/ocamllabs\">OCaml Labs</a>. 
We hadn't quite cracked the problem of how to break the cloud hegemony, but (as with XenoServers and Xen), the pieces that succeeded emerged from the research questions we asked.\nHowever, I don't consider this project permanently closed by any means -- after all, I've been self hosting my email since 1997! We've been working steadily over the past decade of MirageOS (as of 2021) to build out a really solid, self-hosted protocol stack that will work as a unikernel. I am revisiting the question of decentralisation in the form of physical infrastructure in the <a href=\"https://anil.recoil.org/projects/osmose\">Interspatial OS</a> project, and you can read my early thoughts in <a href=\"https://anil.recoil.org/papers/2018-hotpost-osmose\">An architecture for interspatial communication</a>.</p>\n<p></p></div>",
+18
avsm/projects_plancomp.json
···+"summary": "Planetary computing is our research into the systems required to handle the\ningestion, transformation, analysis and publication of global data products for\nfurthering environmental science and enabling better informed policy-making. We\napply computer science to problem domains such as forest carbon and\nbiodiversity preservation, and design solutions that can scalably process\ngeospatial data and build trust in the results via traceability and\nreproducibility. Key problems include how to handle continuously changing\ndatasets that are often collected across decades and require careful access and\nversion control.\n\n\n\"Planetary computing\" originated as a term back in 2020 when a merry band of us from Computer Science ([@keshav] and me, later joined by [@sadiqj], [@pf341], [@mdales] and the bigger EEG group now) began working on [:4c] and implementing the large-scale computing infrastructure required for processing remote sensing data. We wrote up our thoughts in \"[:2024-planetary-computing]\".\n\n## Background\n\nThen in early 2024, [@dorchard] and I decided to find others interested in the problem domain, and organised the first \"[Programming for the Planet](https://propl.dev)\" (PROPL) workshop in London, co-located with POPL2024. This turned out to be a fully subscribed event, with chairs having to be brought in at one point for some of the [more popular talks](https://plas4sci.github.io/conference/2024/01/22/propl.html)! Either way, it convinced us that there's genuine momentum and a need for planetary computing research as a distinct discipline.\n\n\n\n## Projects\n\nI'm working on various systems involved with the ingestion, processing, analysis and publication of global geospatial data products. To break them down:\n\n- **Data Ingestion.** Ingesting satellite data is a surprisingly tricky process, usually involving lots of manual curation and trying not to crash nasa.gov or the ESA websites with too many parallel requests. 
We're working on a system that can ingest data from multiple sources while keeping track of provenance, including satellite imagery (see [:rsn]), ground-based sensors (see [:2024-terracorder]), and citizen science data gathering. This involves a lot of data cleaning and transformation as well as parallel and clustered code, and we're investigating how to make this process more efficient and scalable.\n- **Developer Workflow.** Once data is available, we are also building a next-generation \"Docker for geospatial\" system that can package up precisely versioned data, code and OS environment into a single container that can be run anywhere. This is a key part of our reproducibility story, and is a work-in-progress at [quantifyearth/shark](https://github.com/quantifyearth/shark).\n- **Specification Languages.** We're also working on a domain-specific language for specifying geospatial data processing pipelines, which can be compiled down to efficient code that can run on our planetary computing infrastructure. Ideally, this language would also be able to capture elements of the _specification_ of the data at different levels of precision, so that we can swap out different data sources or processing steps without having to rewrite the entire pipeline or change the intent of the domain expert writing the code. You can see an example of a manually written and extremely detailed pipeline in our [:2023-pact-tmf] whitepaper -- converting this to readable code is a pretty big challenge!\n\nThere's a lot more to say about ongoing projects, but the overall message is: if you're interested in contributing to some part of the planetary computing ecosystem, either as a collaborator or a student: get in touch!\n\n## Related Reading\n\nCyrus Omar and his team over at Hazel language have also been working on a similar problem domain, and we're looking forward to collaborating with them. 
Read [Toward a Live, Rich, Composable, and Collaborative Planetary Compute Engine](https://hazel.org/papers/propl24.pdf) here or watch their [PROPL 2024 talk](https://watch.eeg.cl.cam.ac.uk/w/3nGExywoVm6XFRBA2zYxSL).\n\nI've also given several talks on planetary computing, including a [keynote at ICFP 2023](https://icfp23.sigplan.org/track/icfp-2023-icfp-keynotes?track=ICFP%20%20Keynotes#program) and at LambdaDays. Both are linked below, but the latter is the most recent one.\n\n\n\n",+"content": "<div><h1>Planetary Computing</h1><p></p><p>Planetary computing is our research into the systems required to handle the\ningestion, transformation, analysis and publication of global data products for\nfurthering environmental science and enabling better informed policy-making. We\napply computer science to problem domains such as forest carbon and\nbiodiversity preservation, and design solutions that can scalably process\ngeospatial data and build trust in the results via traceability and\nreproducibility. Key problems include how to handle continuously changing\ndatasets that are often collected across decades and require careful access and\nversion control.</p>\n<p>&quot;Planetary computing&quot; originated as a term back in 2020 when a merry band of us from Computer Science (<a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Srinivasan Keshav</a> and me, later joined by <a href=\"https://toao.com\">Sadiq Jaffer</a>, <a href=\"https://patrick.sirref.org\">Patrick Ferris</a>, <a href=\"https://mynameismwd.org\">Michael Dales</a> and the bigger EEG group now) began working on <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a> and implementing the large-scale computing infrastructure required for processing remote sensing data. 
We wrote up our thoughts in &quot;<a href=\"https://anil.recoil.org/papers/2024-planetary-computing\">Planetary computing for data-driven environmental policy-making</a>&quot;.</p>\n<h2><a href=\"https://anil.recoil.org/#background\"></a>Background</h2>\n<p>Then in early 2024, <a href=\"https://dorchard.github.io\">Dominic Orchard</a> and I decided to find others interested in the problem domain, and organised the first &quot;<a href=\"https://propl.dev\">Programming for the Planet</a>&quot; (PROPL) workshop in London, co-located with POPL2024. This turned out to be a fully subscribed event, with chairs having to be brought in at one point for some of the <a href=\"https://plas4sci.github.io/conference/2024/01/22/propl.html\">more popular talks</a>! Either way, it convinced us that there's genuine momentum and a need for planetary computing research as a distinct discipline.</p>\n<p>\n<img alt=\"The PROPL 2024 invitation poster\" src=\"https://anil.recoil.org/images/propl24-poster.webp\" title=\"The PROPL 2024 invitation poster\">\nThe PROPL 2024 invitation poster</p>\n<h2><a href=\"https://anil.recoil.org/#projects\"></a>Projects</h2>\n<p>I'm working on various systems involved with the ingestion, processing, analysis and publication of global geospatial data products. To break them down:</p>\n<ul>\n<li><strong>Data Ingestion.</strong> Ingesting satellite data is a surprisingly tricky process, usually involving lots of manual curation and trying not to crash nasa.gov or the ESA websites with too many parallel requests. We're working on a system that can ingest data from multiple sources while keeping track of provenance, including satellite imagery (see <a href=\"https://anil.recoil.org/projects/rsn\">Remote Sensing of Nature</a>), ground-based sensors (see <a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder: Sense Long and Prosper</a>), and citizen science data gathering. 
This involves a lot of data cleaning and transformation as well as parallel and clustered code, and we're investigating how to make this process more efficient and scalable.</li>\n<li><strong>Developer Workflow.</strong> Once data is available, we are also building a next-generation &quot;Docker for geospatial&quot; system that can package up precisely versioned data, code and OS environment into a single container that can be run anywhere. This is a key part of our reproducibility story, and is a work-in-progress at <a href=\"https://github.com/quantifyearth/shark\">quantifyearth/shark</a>.</li>\n<li><strong>Specification Languages.</strong> We're also working on a domain-specific language for specifying geospatial data processing pipelines, which can be compiled down to efficient code that can run on our planetary computing infrastructure. Ideally, this language would also be able to capture elements of the <em>specification</em> of the data at different levels of precision, so that we can swap out different data sources or processing steps without having to rewrite the entire pipeline or change the intent of the domain expert writing the code. You can see an example of a manually written and extremely detailed pipeline in our <a href=\"https://anil.recoil.org/papers/2023-pact-tmf\">PACT Tropical Moist Forest Accreditation Methodology v2.1</a> whitepaper -- converting this to readable code is a pretty big challenge!</li>\n</ul>\n<p>There's a lot more to say about ongoing projects, but the overall message is: if you're interested in contributing to some part of the planetary computing ecosystem, either as a collaborator or a student: get in touch!</p>\n<h2><a href=\"https://anil.recoil.org/#related-reading\"></a>Related Reading</h2>\n<p>Cyrus Omar and his team over at Hazel language have also been working on a similar problem domain, and we're looking forward to collaborating with them. 
Read <a href=\"https://hazel.org/papers/propl24.pdf\">Toward a Live, Rich, Composable, and Collaborative Planetary Compute Engine</a> here or watch their <a href=\"https://watch.eeg.cl.cam.ac.uk/w/3nGExywoVm6XFRBA2zYxSL\">PROPL 2024 talk</a>.</p>\n<p>I've also given several talks on planetary computing, including a <a href=\"https://icfp23.sigplan.org/track/icfp-2023-icfp-keynotes?track=ICFP%20%20Keynotes#program\">keynote at ICFP 2023</a> and at LambdaDays. Both are linked below, but the latter is the most recent one.</p>\n<p></p><div></div><p></p>\n<p></p><div></div><p></p>\n<p></p></div>",
+18
avsm/projects_rsn.json
···+"summary": "Measuring the world's forest carbon and biodiversity is made possible by remote\nsensing instruments, ranging from satellites in space (Landsat, Sentinel, GEDI)\nto ground-based sensors (ecoacoustics, camera traps, moisture sensors) that\ntake regular samples and are processed into time-series metrics and actionable\ninsights for conservation and human development. However, the algorithms for\nprocessing this data are challenging as the data is highly multimodal\n(multispectral, hyperspectral, synthetic aperture radar, or lidar), often\nsparsely sampled spatially, and not in a continuous time series. I work on\nvarious algorithms and software and hardware systems we are developing to\nimprove the datasets we have about the surface of the earth.\n\n\n## Mapping nature on earth\n\nFiguring out where things live on the planet's surface from satellites requires\na lot of data processing, and tricks to work around the fact that we can't\neasily see through clouds (when using optical sensors) or handle very sloped\nsurfaces (if using lidar) or peek through the top of a dense forest canopy\n(especially in tropical forests). Along with colleagues in the [Cambridge\nCentre for Earth Observation](https://eo.conservation.cam.ac.uk) and especially\n[@dcoomes], I've been working on a few projects that aim to improve the quality\nof the data we have about the surface of the earth.\n\nThe main research question we're tackling is how to improve our knowledge about where most wild species live on the planet, so that we can better protect their receding habitats. And in particular, our knowledge of where rare _plant_ species live is surprisingly data deficient.\n\n### Satellite and drone sensing\n\nOld-growth tropical trees have the big advantage of being relatively easily visible from the air, and we've been developing a robust satellite and drone processing pipeline as part of the [:plancomp] project. 
[@jball] and [@sadiqj] have been leading an effort to use this data to develop a new approach for mapping tropical tree species. They link a multi-temporal implementation of a CNN method to segment tropical forest tree-crowns from aerial photographs, to ML classifiers that can identify species from [hyperspectral data](https://en.wikipedia.org/wiki/Hyperspectral_imaging).\n\n\n\nRead more about it in the \"[:2024-hyper-tropical-mapping]\" preprint.\n\n### Common base maps for Area of Habitats\n\nAoH calculations per species are really important to agree on, and are generated from a combination of range maps, habitat preferences, climatic variables and occurrence data. [@mdales] and I are working with other developers of biodiversity metrics (such as IUCN's [STAR](https://iucn.org/resources/conservation-tool/species-threat-abatement-and-restoration-star-metric) team) which also require AoH maps to develop a common base layer that can be maintained communally. This will also make it far easier to pinpoint algorithmic differences between STAR and LIFE, rather than differences that simply arise from differing input data.\nYou can find the code for our [area-of-habitat calculators](https://github.com/quantifyearth/aoh-calculator) for 30k\nterrestrial vertebrates online, and (thanks to a UKRI funded project in 2024) this will be expanded to include plants.\n\n### Species Distribution Modelling\n\nOne use for AoH maps is to turn them into [Species Distribution\nModels](https://en.wikipedia.org/wiki/Species_distribution_modelling), which\nis a way to predict where species are likely to be found based on environmental\nvariables and occurrence data. [@emorris] worked on a new method that uses a combination of satellite data and machine learning to predict the distribution of species across the globe, with her focus being on proteas. 
Read more about it in [:2024-sdm-sa].\n\n## Ground-based sensing with the Terracorder\n\n\n\nIn 2024, I started collaborating with [@jmillar] over at Imperial College on\ndeveloping a low-cost sensor device designed for long-term deployment in remote\nnature areas as well as urban environments. Since in-situ sensing devices need\nto be deployed in remote environments for long periods of time, minimizing\ntheir power consumption is vital for maximising both their operational lifetime\nand coverage. We started from an ESP32 base (due to the lovely 16-bit ultra-low\npower mode) and have been prototyping the \"Terracorder\" as a versatile\nmulti-sensor device. Read more about it in [:2024-terracorder].\n\nSince I've been exploring spatial networking with [@rgibb] (see\n[:2023-hotnets-sns]), we've also been figuring out whether a combination of\nreinforcement learning and spatial networking knowledge might take this device\nto the next level of usability. We've been experimenting with using an\non-device reinforcement learning scheduler. When evaluating our prototype\nscheduler against a number of fixed schedules, the scheduler captures more than\n80% of events at less than 50% of the number of activations of the\nbest-performing fixed schedule. We're currently working on a collaborative\nscheduler that can maximise the useful operation of a network of these Terracorders,\nimproving overall network power consumption and robustness.\n\n## Applications to human health\n\nUltimately, it would also be nice to understand the impact of more natural\nspaces on *human health* as well. After all, we not only need to protect\nunspoilt nature, but also need to make sure that highly urbanised areas are\nalso liveable. [@ag], [@rbardhan] and I have been investigating the impact of\ngreen spaces in cities. These have been demonstrated to offer multiple benefits\nto their inhabitants, including cleaner air, shade in sunny periods, and a\nplace that contributes to mental well-being. 
In addition, trees in cities are\nhome to several species of animals and work as a nature-based solution that can\nsequester CO2 and regulate water storage in urban ecosystems.\n\nSo far, we've been working on using a combination of remote sensing data and\nlocal metrics to connect the dots about the impact of urban green spaces on\nhuman health. Read more about our work in \"[:2024-green-urban-eq]\" and the project\nin [:urban-vegetation].",+"content": "<div><h1>Remote Sensing of Nature</h1><p></p><p>Measuring the world's forest carbon and biodiversity is made possible by remote\nsensing instruments, ranging from satellites in space (Landsat, Sentinel, GEDI)\nto ground-based sensors (ecoacoustics, camera traps, moisture sensors) that\ntake regular samples and are processed into time-series metrics and actionable\ninsights for conservation and human development. However, the algorithms for\nprocessing this data are challenging as the data is highly multimodal\n(multispectral, hyperspectral, synthetic aperture radar, or lidar), often\nsparsely sampled spatially, and not in a continuous time series. I work on\nvarious algorithms and software and hardware systems we are developing to\nimprove the datasets we have about the surface of the earth.</p>\n<h2><a href=\"https://anil.recoil.org/#mapping-nature-on-earth\"></a>Mapping nature on earth</h2>\n<p>Figuring out where things live on the planet's surface from satellites requires\na lot of data processing, and tricks to work around the fact that we can't\neasily see through clouds (when using optical sensors) or handle very sloped\nsurfaces (if using lidar) or peek through the top of a dense forest canopy\n(especially in tropical forests). 
Along with colleagues in the <a href=\"https://eo.conservation.cam.ac.uk\">Cambridge\nCentre for Earth Observation</a> and especially\n<a href=\"https://coomeslab.org\">David Coomes</a>, I've been working on a few projects that aim to improve the quality\nof the data we have about the surface of the earth.</p>\n<p>The main research question we're tackling is how to improve our knowledge about where most wild species live on the planet, so that we can better protect their receding habitats. And in particular, our knowledge of where rare <em>plant</em> species live is surprisingly data deficient.</p>\n<h3><a href=\"https://anil.recoil.org/#satellite-and-drone-sensing\"></a>Satellite and drone sensing</h3>\n<p>Old-growth tropical trees have the big advantage of being relatively easily visible from the air, and we've been developing a robust satellite and drone processing pipeline as part of the <a href=\"https://anil.recoil.org/projects/plancomp\">Planetary Computing</a> project. <a href=\"https://patball1.github.io\">James G. C. Ball</a> and <a href=\"https://toao.com\">Sadiq Jaffer</a> have been leading an effort to use this data to develop a new approach for mapping tropical tree species. 
They link a multi-temporal implementation of a CNN method to segment tropical forest tree-crowns from aerial photographs, to ML classifiers that can identify species from <a href=\"https://en.wikipedia.org/wiki/Hyperspectral_imaging\">hyperspectral data</a>.</p>\n<p>\n<img alt=\"\" src=\"https://anil.recoil.org/images/hyperspectral-tree-crown-pca.webp\" title=\"\">\n</p>\n<p>Read more about it in the "<a href=\"https://anil.recoil.org/papers/2024-hyper-tropical-mapping\">Harnessing temporal & spectral dimensionality to identify individual trees in tropical forests</a>" preprint.</p>\n<h3><a href=\"https://anil.recoil.org/#common-base-maps-for-area-of-habitats\"></a>Common base maps for Area of Habitats</h3>\n<p>AoH calculations per species are really important to agree on, and are generated from a combination of range maps, habitat preferences, climatic variables and occurrence data. <a href=\"https://mynameismwd.org\">Michael Dales</a> and I are working with other developers of biodiversity metrics (such as IUCN's <a href=\"https://iucn.org/resources/conservation-tool/species-threat-abatement-and-restoration-star-metric\">STAR</a> team) which also require AoH maps to develop a common base layer that can be maintained communally. 
This will also make it far easier to pinpoint algorithmic differences between STAR and LIFE, rather than differences that simply arise from differing input data.\nYou can find the code for our <a href=\"https://github.com/quantifyearth/aoh-calculator\">area-of-habitat calculators</a> for 30k\nterrestrial vertebrates online, and (thanks to a UKRI funded project in 2024) this will be expanded to include plants.</p>\n<h3><a href=\"https://anil.recoil.org/#species-distribution-modelling\"></a>Species Distribution Modelling</h3>\n<p>One use for AoH maps is to turn them into <a href=\"https://en.wikipedia.org/wiki/Species_distribution_modelling\">Species Distribution\nModels</a>, which\nis a way to predict where species are likely to be found based on environmental\nvariables and occurrence data. <a href=\"https://github.com/emorris7\">Emily Morris</a> worked on a new method that uses a combination of satellite data and machine learning to predict the distribution of species across the globe, with her focus being on proteas. Read more about it in <a href=\"https://anil.recoil.org/papers/2024-sdm-sa\">Towards Scalable Deep Species Distribution Modelling using Global Remote Sensing</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#ground-based-sensing-with-the-terracorder\"></a>Ground-based sensing with the Terracorder</h2>\n<p>\n<img alt=\"The first Terracorder prototype in pieces!\" src=\"https://anil.recoil.org/images/terracorder-pieces-jun24.webp\" title=\"The first Terracorder prototype in pieces!\">\nThe first Terracorder prototype in pieces!</p>\n<p>In 2024, I started collaborating with <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> over at Imperial College on\ndeveloping a low-cost sensor device designed for long-term deployment in remote\nnature areas as well as urban environments. 
Since in-situ sensing devices need\nto be deployed in remote environments for long periods of time, minimizing\ntheir power consumption is vital for maximising both their operational lifetime\nand coverage. We started from an ESP32 base (due to the lovely 16-bit ultra-low\npower mode) and have been prototyping the &quot;Terracorder&quot; as a versatile\nmulti-sensor device. Read more about it in <a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder: Sense Long and Prosper</a>.</p>\n<p>Since I've been exploring spatial networking with <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> (see\n<a href=\"https://anil.recoil.org/papers/2023-hotnets-sns\">Where on Earth is the Spatial Name System?</a>), we've also been figuring out whether a combination of\nreinforcement learning and spatial networking knowledge might take this device\nto the next level of usability. We've been experimenting with using an\non-device reinforcement learning scheduler. When evaluating our prototype\nscheduler against a number of fixed schedules, the scheduler captures more than\n80% of events at less than 50% of the number of activations of the\nbest-performing fixed schedule. We're currently working on a collaborative\nscheduler that can maximise the useful operation of a network of these Terracorders,\nimproving overall network power consumption and robustness.</p>\n<h2><a href=\"https://anil.recoil.org/#applications-to-human-health\"></a>Applications to human health</h2>\n<p>Ultimately, it would also be nice to understand the impact of more natural\nspaces on <em>human health</em> as well. After all, we not only need to protect\nunspoilt nature, but also need to make sure that highly urbanised areas are\nalso liveable. <a href=\"https://ancazugo.github.io/\">Andres Zu\u00f1iga-Gonzalez</a>, <a href=\"https://www.arct.cam.ac.uk/people/dr-ronita-bardhan\">Ronita Bardhan</a> and I have been investigating the impact of\ngreen spaces in cities. 
These have been demonstrated to offer multiple benefits\nto their inhabitants, including cleaner air, shade in sunny periods, and a\nplace that contributes to mental well-being. In addition, trees in cities are\nhome to several species of animals and work as a nature-based solution that can\nsequester CO2 and regulate water storage in urban ecosystems.</p>\n<p>So far, we've been working on using a combination of remote sensing data and\nlocal metrics to connect the dots about the impact of urban green spaces on\nhuman health. Read more about our work in "<a href=\"https://anil.recoil.org/papers/2024-green-urban-eq\">Green Urban Equity: Analyzing the 3-30-300 Rule in UK Cities and Its Socioeconomic Implications</a>" and the project\nin <a href=\"https://anil.recoil.org/ideas/urban-vegetation\">The role of urban vegetation in human health</a>.</p>\n<p></p></div>",
+18
avsm/projects_ubiqinteraction.json
···+"summary": "I investigated how to interface the emerging class of smartphone devices\n(circa 2002) with concepts from ubiquitous computing such as location-aware\ninterfaces or context-aware computing. I discovered the surprisingly positive\nbenefits of piggybacking on simple communication media such as audible sound\nand visual tags. Our implementations of some of these ended up with new audio\nringtone and visual smart tags that worked on the widely deployed mobile phones\nof the era.\n\nIn 2003, the mobile phone market had grown tremendously and given the average\nconsumer access to cheap, small, low-powered and constantly networked devices\nthat they could reliably carry around. Similarly, laptop computers and PDAs\nbecame a common accessory for businesses to equip their employees with when on\nthe move. The research question, then, was how to effectively interface them\nwith existing digital infrastructure and realise some of the concepts of\nubiquitous computing such as location-aware interfaces or context-aware\ncomputing.\n\n\nUbiquitous Interaction Devices (see [original webpages](https://www.cl.cam.ac.uk/research/srg/netos/projects/archive/uid/))\nwas a project I started with [@djs55] and [@rws26] in 2003 to work on this at\nthe Cambridge Computer Laboratory and Intel Research Cambridge (who had just\nset up a \"lablet\" within our building and was a great source of free coffee).\n\n## Audio Networking\n\nThe project kicked off when we randomly experimented with our fancy Nokia smartphones\nand discovered that they didn't have anti-aliasing filters on the microphones.\nThis allowed us to record and decode ultrasound between the phones. The \n2003 [:audio-networking] paper in Ubicomp describes some of the applications it\nallowed. 
\n\nIn a nutshell, audio networking used ubiquitously available sound hardware\n(i.e., speakers, soundcards and microphones) for low-bandwidth wireless networking.\nIt has a number of characteristics which differentiate it from existing wireless technologies:\n\n- fine-grained control over the range of transmission by adjusting the volume.\n- walls of buildings are typically designed to attenuate sound waves, making it easy to contain transmission to a single room (unlike, for example, Wi-Fi or Bluetooth).\n- existing devices can play or record audio to be networked easily (e.g. voice recorders).\n\nThe *Audionotes* video below demonstrates an Audio Pick and Drop Interface: an Internet URL is recorded into a PDA via a standard computer speaker. The URL is later retrieved from the PDA by playing it into the computer, and also printed by carrying the PDA to a print server and playing back the recorded identifier.\nA great advantage of this is that any cheap voice recorder is capable of carrying the audio information (e.g. mobile phone, PDA, dictaphone, etc).\n\nOne of the more interesting discoveries during this research is that most computer soundcards do not have good anti-aliasing filters fitted, resulting in them being able to send and receive inaudible sound at up to 24 kHz (assuming a 48 kHz soundcard).\nWe used this to create a set of inaudible location beacons that would allow laptop computers to simply listen using their microphones and discover their current location without any advanced equipment being required! The *location* video below demonstrates this.\n\n\n\n\n\nWe then devised a scheme of encoding data into mobile phone ringtones while still making them sound pleasant to the ear. 
This allows us to use the phone network for the video example above: an SMS 'capability' is transmitted to a phone, which can be played back to gain access to a building.\nSince the data is just a melody, this allows for uses such as parents purchasing cinema tickets online for their children (who don't own a credit card), yet still allowing the children to gain access to the cinema by playing the ringtone capability back (via their mobile phones, common among children).\n\nWe observed that audio networking allows URLs to be easily sent over a telephone conversation, and retrieved over a normal high bandwidth connection by the recipient.\nThis gets over the real awkwardness of \"locating\" someone on the Internet, which is rife with restrictive firewalls and Network Address Translation, which are destroying end-to-end connectivity. In contrast, it is trivial to locate someone on the phone network given their telephone number, but harder to perform high bandwidth data transfers (which the Internet excels at). There's more detail on these use cases in [:2005-ieee-smartphones].\n\n\n\n\n\n## SpotCode Interfaces\n\nOnce we'd had such success with audible networking, a natural next step was to use the new fancy cameras present on smartphones. [@eben] joined our project and knocked up a real-time circular barcode detector for Symbian operating system phones in short order that we demonstrated in [:2004-ubicomp-camera].\n\nThe phone app we built could detect our circular barcodes in real time, unlike the ungainly \"click and wait\" experiences of the time. Since the phone computes the relative rotation angle of detected tags in real time, an intuitive \"volume control\" was possible by simply locking onto a tag and physically rotating the phone. 
The videos demonstrate a volume control in the \"SpotCode Jukebox\", and how further interfaces could be \"thrown\" to the phone for detailed interactions.\n\n\n\n\n\nWe then built some more elaborate public applications (such as a travel booking shop) in [:2004-spotcodes] and [:2005-ieee-smartphones]. When [@etoye] joined the project, we subsequently conducted structured user studies to see how effective the tags are in [:2006-puc-tags]. As a side-note, the whole zooming interface was written using OCaml and OpenGL.\nWe spun out a startup company called High Energy Magic Ltd., and got into the [New York Times](https://www.nytimes.com/2004/10/07/technology/circuits/connecting-paper-and-online-worlds-by-cellphone-camera.html) and [Wired](https://www.wired.com/2004/06/from-the-prawn-of-time/) (alongside a decaying prawn sandwich).\n\n\n\n\n\nIt also became obvious that the technology was really robust, since it worked fine on printed (and crumpled) pieces of paper, making it ideal for public signage. We used this to experiment with more robust device discovery in [:2005-mc2r-visualtags]. This subsequently led us to show how smartphones could be trusted side-channels in [:2008-mobisys-splittrust], an idea that is now (as of 2020) becoming realised with trusted computing chips in modern smartphones.\n\n\n\n\n\n## Towards smarter environments\n\nAt the Computer Laboratory, we also happen to have one of the world's best indoor location systems (the [Active Bat](https://en.wikipedia.org/wiki/Active_Bat)), which we inherited from AT&T Research when that shut down in Cambridge. 
[@rip], [@liquidx] and [@kjm25] joined forces with us to investigate how commodity hardware could interface with this smart location system to make really futuristic buildings possible.\n\nA few really fun projects resulted from this:\n- We used the BAT system to help with Bluetooth-based indoor location inference, in [:2005-ubicomp-bluetooth].\n- We interfaced the audio networking with the AT&T \"broadband phones\" in [:2005-bbphone].\n- We constructed a real-time capture-the-flag game in the Computer Lab with augmented reality \"windows\" into the game in [:netgames04-ctf].\n- I worked with my Recoil-in-chief [@nickludlam] on digital TV interfaces in [:2005-ubiapp-ubimedia].\n\nIn around 2005, we sold High Energy Magic Ltd. to a [Dutch entrepreneur](https://www.youtube.com/watch?v=sN01wkRzsfk) so that [@rws26], [@djs55] and I could join the [:xen] startup company. However, the ubiquitous computing ideals that drove much of our work persist, and in 2018 I started thinking about this again as part of my [:osmose] project. The idea of building truly ubiquitous environments (without smartphones) is resurrected there, and you can start reading about it in [:2018-hotpost-osmose].",+"content": "<div><h1>Ubiquitous Interaction Devices</h1><p></p><p>I investigated how to interface the emerging class of smartphone devices\n(circa 2002) with concepts from ubiquitous computing such as location-aware\ninterfaces or context-aware computing. I discovered the surprisingly positive\nbenefits of piggybacking on simple communication media such as audible sound\nand visual tags. Our implementations of some of these ended up with new audio\nringtone and visual smart tags that worked on the widely deployed mobile phones\nof the era.</p>\n<p>In 2003, the mobile phone market had grown tremendously and given the average\nconsumer access to cheap, small, low-powered and constantly networked devices\nthat they could reliably carry around. 
Similarly, laptop computers and PDAs\nbecame a common accessory for businesses to equip their employees with when on\nthe move. The research question, then, was how to effectively interface them\nwith existing digital infrastructure and realise some of the concepts of\nubiquitous computing such as location-aware interfaces or context-aware\ncomputing.</p>\n<p>Ubiquitous Interaction Devices (see <a href=\"https://www.cl.cam.ac.uk/research/srg/netos/projects/archive/uid/\">original webpages</a>)\nwas a project I started with <a href=\"https://github.com/djs55\">Dave Scott</a> and <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a> in 2003 to work on this at\nthe Cambridge Computer Laboratory and Intel Research Cambridge (who had just\nset up a "lablet" within our building and was a great source of free coffee).</p>\n<h2><a href=\"https://anil.recoil.org/#audio-networking\"></a>Audio Networking</h2>\n<p>The project kicked off when we randomly experimented with our fancy Nokia smartphones\nand discovered that they didn't have anti-aliasing filters on the microphones.\nThis allowed us to record and decode ultrasound between the phones. The\n2003 <a href=\"https://anil.recoil.org/papers/audio-networking\">Context-Aware Computing with Sound</a> paper in Ubicomp describes some of the applications it\nallowed.</p>\n<p>In a nutshell, audio networking used ubiquitously available sound hardware\n(i.e., speakers, soundcards and microphones) for low-bandwidth wireless networking.\nIt has a number of characteristics which differentiate it from existing wireless technologies:</p>\n<ul>\n<li>fine-grained control over the range of transmission by adjusting the volume.</li>\n<li>walls of buildings are typically designed to attenuate sound waves, making it easy to contain transmission to a single room (unlike, for example, Wi-Fi or Bluetooth).</li>\n<li>existing devices can play or record audio to be networked easily (e.g. 
voice recorders).</li>\n</ul>\n<p>The <em>Audionotes</em> video below demonstrates an Audio Pick and Drop Interface: an Internet URL is recorded into a PDA via a standard computer speaker. The URL is later retrieved from the PDA by playing it into the computer, and also printed by carrying the PDA to a print server and playing back the recorded identifier.\nA great advantage of this is that any cheap voice recorder is capable of carrying the audio information (e.g. mobile phone, PDA, dictaphone, etc).</p>\n<p>One of the more interesting discoveries during this research is that most computer soundcards do not have good anti-aliasing filters fitted, resulting in them being able to send and receive inaudible sound at up to 24 kHz (assuming a 48 kHz soundcard).\nWe used this to create a set of inaudible location beacons that would allow laptop computers to simply listen using their microphones and discover their current location without any advanced equipment being required! The <em>location</em> video below demonstrates this.</p>\n<p></p><div></div><p></p>\n<p></p><div></div><p></p>\n<p>We then devised a scheme of encoding data into mobile phone ringtones while still making them sound pleasant to the ear. 
This allows us to use the phone network for the video example above: an SMS 'capability' is transmitted to a phone, which can be played back to gain access to a building.\nSince the data is just a melody, this allows for uses such as parents purchasing cinema tickets online for their children (who don't own a credit card), yet still allowing the children to gain access to the cinema by playing the ringtone capability back (via their mobile phones, common among children).</p>\n<p>We observed that audio networking allows URLs to be easily sent over a telephone conversation, and retrieved over a normal high bandwidth connection by the recipient.\nThis gets over the real awkwardness of "locating" someone on the Internet, which is rife with restrictive firewalls and Network Address Translation, which are destroying end-to-end connectivity. In contrast, it is trivial to locate someone on the phone network given their telephone number, but harder to perform high bandwidth data transfers (which the Internet excels at). There's more detail on these use cases in <a href=\"https://anil.recoil.org/papers/2005-ieee-smartphones\">Using smart phones to access site-specific services</a>.</p>\n<p></p><div></div><p></p>\n<p></p><div></div><p></p>\n<h2><a href=\"https://anil.recoil.org/#spotcode-interfaces\"></a>SpotCode Interfaces</h2>\n<p>Once we'd had such success with audible networking, a natural next step was to use the new fancy cameras present on smartphones. <a href=\"mailto:eben@phlegethon.org\">Eben Upton</a> joined our project and knocked up a real-time circular barcode detector for Symbian operating system phones in short order that we demonstrated in <a href=\"https://anil.recoil.org/papers/2004-ubicomp-camera\">Using Camera-Phones to Enhance Human-Computer Interaction</a>.</p>\n<p>The phone app we built could detect our circular barcodes in real time, unlike the ungainly "click and wait" experiences of the time. 
Since the phone computes the relative rotation angle of detected tags in real time, an intuitive "volume control" was possible by simply locking onto a tag and physically rotating the phone. The videos demonstrate a volume control in the "SpotCode Jukebox", and how further interfaces could be "thrown" to the phone for detailed interactions.</p>\n<p></p><div></div><p></p>\n<p></p><div></div><p></p>\n<p>We then built some more elaborate public applications (such as a travel booking shop) in <a href=\"https://anil.recoil.org/papers/2004-spotcodes\">Using camera-phones to interact with context-aware mobile services</a> and <a href=\"https://anil.recoil.org/papers/2005-ieee-smartphones\">Using smart phones to access site-specific services</a>. When <a href=\"https://www.cst.cam.ac.uk/people/eft20\">Eleanor Toye Scott</a> joined the project, we subsequently conducted structured user studies to see how effective the tags are in <a href=\"https://anil.recoil.org/papers/2006-puc-tags\">Interacting with mobile services: an evaluation of camera-phones and visual tags</a>. As a side-note, the whole zooming interface was written using OCaml and OpenGL.\nWe spun out a startup company called High Energy Magic Ltd., and got into the <a href=\"https://www.nytimes.com/2004/10/07/technology/circuits/connecting-paper-and-online-worlds-by-cellphone-camera.html\">New York Times</a> and <a href=\"https://www.wired.com/2004/06/from-the-prawn-of-time/\">Wired</a> (alongside a decaying prawn sandwich).</p>\n<p></p><div></div><p></p>\n<p></p><div></div><p></p>\n<p>It also became obvious that the technology was really robust, since it worked fine on printed (and crumpled) pieces of paper, making it ideal for public signage. We used this to experiment with more robust device discovery in <a href=\"https://anil.recoil.org/papers/2005-mc2r-visualtags\">Using visual tags to bypass Bluetooth device discovery</a>. 
This subsequently led us to show how smartphones could be trusted side-channels in <a href=\"https://anil.recoil.org/papers/2008-mobisys-splittrust\">Enhancing web browsing security on public terminals using mobile composition</a>, an idea that is now (as of 2020) becoming realised with trusted computing chips in modern smartphones.</p>\n<p></p><div></div><p></p>\n<p></p><div></div><p></p>\n<h2><a href=\"https://anil.recoil.org/#towards-smarter-environments\"></a>Towards smarter environments</h2>\n<p>At the Computer Laboratory, we also happen to have one of the world's best indoor location systems (the <a href=\"https://en.wikipedia.org/wiki/Active_Bat\">Active Bat</a>), which we inherited from AT&T Research when that shut down in Cambridge. <a href=\"mailto:ripduman.sohan@gmail.com\">Ripduman Sohan</a>, <a href=\"https://liquidx.net\">Alastair Tse</a> and <a href=\"mailto:kieran@recoil.org\">Kieran Mansley</a> joined forces with us to investigate how commodity hardware could interface with this smart location system to make really futuristic buildings possible.</p>\n<p>A few really fun projects resulted from this:</p>\n<ul>\n<li>We used the BAT system to help with Bluetooth-based indoor location inference, in <a href=\"https://anil.recoil.org/papers/2005-ubicomp-bluetooth\">A Study of Bluetooth Propagation Using Accurate Indoor Location Mapping</a>.</li>\n<li>We interfaced the audio networking with the AT&T "broadband phones" in <a href=\"https://anil.recoil.org/papers/2005-bbphone\">The Broadband Phone Network: Experiences with Context-Aware Telephony</a>.</li>\n<li>We constructed a real-time capture-the-flag game in the Computer Lab with augmented reality "windows" into the game in <a href=\"https://anil.recoil.org/papers/netgames04-ctf\">Feedback, latency, accuracy: exploring tradeoffs in location-aware gaming</a>.</li>\n<li>I worked with my Recoil-in-chief <a href=\"https://nick.recoil.org\">Nick Ludlam</a> on digital TV interfaces in <a 
href=\"https://anil.recoil.org/papers/2005-ubiapp-ubimedia\">Ubiquitious Computing needs to catch up with Ubiquitous Media</a>.</li>\n</ul>\n<p>In around 2005, we sold High Energy Magic Ltd. to a <a href=\"https://www.youtube.com/watch?v=sN01wkRzsfk\">Dutch entrepreneur</a> so that <a href=\"mailto:richard.sharp@gmail.com\">Richard Sharp</a>, <a href=\"https://github.com/djs55\">Dave Scott</a> and I could join the <a href=\"https://anil.recoil.org/projects/xen\">Xen Hypervisor</a> startup company. However, the ubiquitous computing ideals that drove much of our work persist, and in 2018 I started thinking about this again as part of my <a href=\"https://anil.recoil.org/projects/osmose\">Interspatial OS</a> project. The idea of building truly ubiquitous environments (without smartphones) is resurrected there, and you can start reading about it in <a href=\"https://anil.recoil.org/papers/2018-hotpost-osmose\">An architecture for interspatial communication</a>.</p>\n<p></p></div>",
+18
avsm/projects_unikernels.json
···+"summary": "I proposed the concept of \"unikernels\" -- single-purpose appliances that are compile-time specialised into standalone bootable kernels, and sealed against modification when deployed to a cloud platform. In return they offer significant reduction in image sizes, improved efficiency and security, and reduced operational costs. I also co-founded the MirageOS project, which is one of the first complete unikernel frameworks, and also integrated them to create the Docker for Desktop apps that are used by hundreds of millions of users daily.\n\n\nWhile working on [:perscon] in late 2008, I had a need to run lots of distributed edge nodes holding personal data. The state of computer security is generally a disaster when it comes to leaving software unupgraded for even a few months, so building robust infrastructure that normal people could use was proving quite difficult. Meanwhile, my PhD research in building [:melange] had constructed really viable prototypes of network protocols written in pure OCaml, and I'd previously used OCaml industrially in the [:xen] hypervisor to write lots of system management code.\n\n## The Early Days\n\nAll of these ideas came crashing together in late 2009 and I decided to have a go at putting together a complete OCaml-based operating system. The adventure began with grabbing the Xen mini-os and the C lwIP stack to provide networking and SQLite for persistent storage, and hacking for a few months until everything booted and was reasonably stable. I then convinced [@samoht] (then at Inria) to help me with storage integration with OCaml in [:2010-dyntype-wgt] and we had a remarkably good prototype that we presented in [:2010-hotcloud-lamp].\n\nI wrote up my early thoughts on [:2010-bcs-visions] to describe this emerging idea of heterogeneous cloud and edge computing combined into a single programming model. 
After realising that the prototype worked well, I started steadily removing C bindings (like lwIP) and replacing them with pure OCaml code all the way down to the VM Xen interface (e.g. like [mirage-tcpip](https://github.com/mirage/mirage-tcpip)). These early heady days saw lots of prototypes and experimentation:\n\n- I experimented with various models for edge computing for personal data handling, such as [:2011-icdcn-droplets] and [:2010-iswp-dustclouds]. These mechanisms are still surprisingly unrealised in the wild, with some aspects becoming popular (e.g. serverless functions), but not the aggregation networks.\n- In the office next door, @mrry and friends were doing their PhDs and building distributed execution engines. I helped with building out [:2011-nsdi-ciel] and experimenting with what a functional interface would look like in [:datacaml-with-ciel]. As of 2021, I'm revisiting this approach in the context of algebraic effects in our multicore OCaml project.\n- I looked into closer integration with hypervisors as well, via investigating [:2011-fccm-cloudfpga] (TL;DR -- too early, but happened a few years later in commercial clouds) and [:2012-oud-xen].\n\n## Building MirageOS and figuring out unikernels\n\nOne of the earliest decisions I made in MirageOS was to self-host as soon as possible. I registered openmirage.org in late 2009, and (joined by @mort and @djs55) we had a Xen-based website running in short order in 2010 (now [mirage-www](https://github.com/mirage/mirage-www)). A big boost to the project was winning a grant from the [Verisign Infrastructure Awards](https://investor.verisign.com/news-releases/news-release-details/verisign-announces-winners-grants-aimed-strengthening-internet), which was the first external validation that this thing might be of interest to other people. 
As my [:ocamllabs] group grew in the University, more intrepid hackers joined the group and started making MirageOS work properly.\n\nA year of intense work in 2012 turned the prototype into a fully fleshed-out paper, which got soundly rejected by the OSDI review committee as we hadn't identified what the core systems research contribution was (as opposed to the impressive programming work, which they acknowledged in the rejection). I'd just gone to visit Timothy Roscoe's group at ETH, where they had been working on the Barrelfish multikernel OS, and the answer came right to me while in the pub with [@crowcroft]. What MirageOS represented was a revival of the concept of library operating systems, but with the additional twist that it specialised the compilation into single-user mode. Thus, I settled on the term \"unikernels\" to describe this idea and rewrote the paper and duly published it in [:2013-asplos-mirage].\n\nPublishing a major research paper in ASPLOS led to further momentum and interest:\n\n- [@djs55] and I published a note in the Communications of the ACM dubbed [:rise-of-libos] which was pretty widely read at the time.\n- [@samoht] moved to Cambridge and started building the storage stack that we'd wanted for years. It was initially called [:2014-oud-irminsule] (later shortened to [irmin](https://irmin.org)) and kicked off our interest in moving beyond CRDTs to [:2015-jfla-irmin]. Irmin picked up a life of its own and was later used by Arthur Breitman as the storage stack in the [Tezos](https://tezos.com) proof-of-stake blockchain in 2017.\n- [@magnuss] also returned to the group and we began hacking on real-time edge infrastructure using unikernels, such as [:2015-diynet-kadupul]. Although this work got put on ice in 2015, I'm revisiting it in 2022 in the context of [:osmose].\n- [@talex5], [@dsheets] and [@balraj] joined our burgeoning group and we all prototyped the idea of real-time booting of edge unikernels in [:2015-nsdi-jitsu]. 
This represented the first time we'd booted VMs on ARM, as it was very much a niche architecture for virtualisation back then.\n- Meanwhile, on the [beach in Mirleft](https://mirageos.org/blog/ocaml-tls-api-internals-attacks-mitigation) in Morocco, [@kaloper] and [@hannesm] built an entire TLS stack in OCaml, which we published in [:2015-usenixsec-nqsb]. This was a real turning point in the project as it represented an external open source contribution (with both of them joining the University subsequently) and also grew our belief that it wasn't a completely dumb idea to rebuild every Internet protocol in a functional language.\n\nMirageOS also gave us ideas for other top systems research, such as the filesystem verification ideas in [:2015-sosp-sibylfs] (which I still intend to use for a proper POSIX compatibility layer on top of Irmin at some point), and [:2016-usenix-flick] (to build domain-specific data processing platforms, something that I'm now working on in 2021 in [:4c]).\n\n## To Unikernel Systems and Docker\n\nBy this point, MirageOS was also a thriving open source community with regular IRC meetings and the beginning of hack retreats. There were several organisations using it, and the overall OCaml community started using some of our protocol implementations independently of the unikernel ideas. For example, the [cohttp](https://github.com/mirage/ocaml-cohttp) library was something I rapidly hacked together for the ASPLOS deadline, but the Unix/Lwt/Async backends are now used in quite a few major systems (including within Jane Street, no less).\n\nWe had to deal with all this growth, as a university isn't the easiest place to have a very large group. In 2015, [@balraj] (who had made huge contributions to the Mirage TCP/IP stack), [@samoht] and I founded Unikernel Systems along with [@yallop], [@talex5], [@magnuss], [@yomimono], [@justin], [@dsheets], [@amirmc], and [@djs55]. 
After a fun few months pitching to west coast VCs in California (including fun chats with the likes of Jerry Yang), Peter Fenton from Benchmark convinced us to meet Solomon Hykes over at Docker. This conversation changed the course of our careers, as he shared his vision for the future of containerisation and how unikernels could fit in there.\n\nA short set of negotiations later, [Unikernel Systems was acquired by Docker](https://techcrunch.com/2016/01/21/docker-acquires-unikernel-systems-as-it-looks-beyond-containers/) in 2016. We spent a very fun couple of years commercialising the technology and incorporating it into Docker for Desktop, which remains one of the most popular developer tools in the world; I describe its architecture [in this talk](https://www.youtube.com/watch?v=zqFDEDl5Zes).\n\n## Unikernels in 2021 and beyond\n\nOur startup aside, the core development of MirageOS continued to be nicely distributed in several spinouts:\n- [@kc] and [@gemmag] founded [OCLC](https://ocamllabs.io) in 2016 as a commercial spinout from the university group to drive OCaml tooling and core compiler development.\n- [@hannesm] set up the cooperative in late 2017 with a [large set of Mirage projects](https://robur.coop/Our%20Work/Projects).\n- [@samoht] founded [Tarides](https://tarides.com) in 2018 after leaving Docker, where they maintain MirageOS and drive development of the Irmin storage stack in particular.\n\nThe wider industry also saw a number of interesting spinouts, as many other communities also latched on to the ideas of unikernels and began their own language-specific and domain-specific versions. I joined the advisory boards of IncludeOS (now sadly defunct) and Zededa (now thankfully going from strength to strength in edge computing) to help guide strategy and adoption outside of just MirageOS. 
Dr Pierre Olivier maintains a great list of [unikernel papers](https://github.com/olivierpierre/unikernel-papers) where you can see the diversity of interest in unikernels. One of the most exciting implementations of a C-based unikernel can be found in [Unikraft](https://www.unikraft.org/).\n\nAs for my interest in unikernels moving forward? My heart always remains in finding the intersection of _safety_ and _performance_, which means I mostly pay attention to language-based approaches. MirageOS continues to thrive (particularly with the effect system being integrated into OCaml in 2022, which will really change the way we develop OCaml code for embedded systems). Since 2020, I've been investigating the application of DIFC to embedded infrastructure, for example via [:2019-edgesys-snape].\n\nThe unikernel approach has also found new applications in [ultra-low-power computing](:2025-dl-rcn) and [edge AI deployment](:2025-npu-bench), where the security and efficiency benefits align well with the constraints of energy-harvesting and intermittent operation scenarios explored in our [:osmose] work.\n\nIn 2025, we were also honoured to receive a [most influential paper award](:unikernels-test-of-time) from ASPLOS for the original paper, validating the long-term impact of the unikernel approach on systems research.",+"content": "<div><h1>Unikernels</h1><p></p><p>I proposed the concept of "unikernels" -- single-purpose appliances that are compile-time specialised into standalone bootable kernels, and sealed against modification when deployed to a cloud platform. In return they offer significant reduction in image sizes, improved efficiency and security, and reduced operational costs. 
I also co-founded the MirageOS project, which is one of the first complete unikernel frameworks, and also integrated them to create the Docker for Desktop apps that are used by hundreds of millions of users daily.</p>\n<p>While working on <a href=\"https://anil.recoil.org/projects/perscon\">Personal Containers</a> in late 2008, I had a need to run lots of distributed edge nodes holding personal data. The state of computer security is generally a disaster when it comes to leaving software unupgraded for even a few months, so building robust infrastructure that normal people could use was proving quite difficult. Meanwhile, my PhD research in building <a href=\"https://anil.recoil.org/projects/melange\">Functional Internet Services</a> had constructed really viable prototypes of network protocols written in pure OCaml, and I'd previously used OCaml industrially in the <a href=\"https://anil.recoil.org/projects/xen\">Xen Hypervisor</a> to write lots of system management code.</p>\n<h2><a href=\"https://anil.recoil.org/#the-early-days\"></a>The Early Days</h2>\n<p>All of these ideas came crashing together in late 2009 and I decided to have a go at putting together a complete OCaml-based operating system. The adventure began with grabbing the Xen mini-os and the C lwIP stack to provide networking and SQLite for persistent storage, and hacking for a few months until everything booted and was reasonably stable. 
I then convinced <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> (then at Inria) to help me with storage integration with OCaml in <a href=\"https://anil.recoil.org/papers/2010-dyntype-wgt\">Dynamics for ML using Meta-Programming</a> and we had a remarkably good prototype that we presented in <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp\">Turning Down the LAMP: Software Specialisation for the Cloud</a>.</p>\n<p>I wrote up my early thoughts on <a href=\"https://anil.recoil.org/papers/2010-bcs-visions\">Multiscale not multicore: efficient heterogeneous cloud computing</a> to describe this emerging idea of heterogeneous cloud and edge computing combined into a single programming model. After realising that the prototype worked well, I started steadily removing C bindings (like lwIP) and replacing them with pure OCaml code all the way down to the VM Xen interface (e.g. like <a href=\"https://github.com/mirage/mirage-tcpip\">mirage-tcpip</a>). These early heady days saw lots of prototypes and experimentation:</p>\n<ul>\n<li>I experimented with various models for edge computing for personal data handling, such as <a href=\"https://anil.recoil.org/papers/2011-icdcn-droplets\">Unclouded vision</a> and <a href=\"https://anil.recoil.org/papers/2010-iswp-dustclouds\">Using Dust Clouds to Enhance Anonymous Communication</a>. These mechanisms are still surprisingly unrealised in the wild, with some aspects becoming popular (e.g. serverless functions), but not the aggregation networks.</li>\n<li>In the office next door, @mrry and friends were doing their PhDs and building distributed execution engines. I helped with building out <a href=\"https://anil.recoil.org/papers/2011-nsdi-ciel\">CIEL: A universal execution engine for distributed data-flow computing</a> and experimenting with what a functional interface would look like in <a href=\"https://anil.recoil.org/notes/datacaml-with-ciel\">DataCaml: distributed dataflow programming in OCaml</a>. 
As of 2021, I'm revisiting this approach in the context of algebraic effects in our multicore OCaml project.</li>\n<li>I looked into closer integration with hypervisors as well, via investigating <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga\">Reconfigurable Data Processing for Clouds</a> (TL;DR -- too early, but happened a few years later in commercial clouds) and <a href=\"https://anil.recoil.org/papers/2012-oud-xen\">Programming the Xen cloud using OCaml</a>.</li>\n</ul>\n<h2><a href=\"https://anil.recoil.org/#building-mirageos-and-figuring-out-unikernels\"></a>Building MirageOS and figuring out unikernels</h2>\n<p>One of the earliest decisions I made in MirageOS was to self-host as soon as possible. I registered openmirage.org in late 2009, and (joined by @mort and @djs55) we had a Xen-based website running in short order in 2010 (now <a href=\"https://github.com/mirage/mirage-www\">mirage-www</a>). A big boost to the project was winning a grant from the <a href=\"https://investor.verisign.com/news-releases/news-release-details/verisign-announces-winners-grants-aimed-strengthening-internet\">Verisign Infrastructure Awards</a>, which was the first external validation that this thing might be of interest to other people. As my <a href=\"https://anil.recoil.org/projects/ocamllabs\">OCaml Labs</a> group grew in the University, more intrepid hackers joined the group and started making MirageOS work properly.</p>\n<p>A year of intense work in 2012 turned the prototype into a fully-fleshed out paper which got soundly rejected by the OSDI review committee as we hadn't identified what the core systems research contribution was (as opposed to the impressive programming work, which they acknowledged in the rejection). I'd just gone to visit Timothy Roscoe's group in ETH where they had been working on the Barrelfish multikernel OS, and the answer came right to me while in the pub with <a href=\"mailto:jon.crowcroft@cl.cam.ac.uk\">Jon Crowcroft</a>. 
What MirageOS represented was a revival of the concept of library operating systems, but with the additional twist that it specialised the compilation into single-user mode. Thus, I settled on the term "unikernels" to describe this idea and rewrote the paper and duly published it in <a href=\"https://anil.recoil.org/papers/2013-asplos-mirage\">Unikernels: library operating systems for the cloud</a>.</p>\n<p>Publishing a major research paper in ASPLOS led to further momentum and interest:</p>\n<ul>\n<li><a href=\"https://github.com/djs55\">Dave Scott</a> and I published a note in the Communications of the ACM dubbed <a href=\"https://anil.recoil.org/papers/rise-of-libos\">Unikernels: Rise of the Virtual Library Operating System</a> which was pretty widely read at the time.</li>\n<li><a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> moved to Cambridge and started building the storage stack that we'd wanted for years. It was initially called <a href=\"https://anil.recoil.org/papers/2014-oud-irminsule\">Irminsule: a branch-consistent distributed library database</a> (later shortened to <a href=\"https://irmin.org\">irmin</a>) and kicked off our interest in moving beyond CRDTs to <a href=\"https://anil.recoil.org/papers/2015-jfla-irmin\">Mergeable persistent data structures</a>. Irmin picked up a life of its own and was later used by Arthur Breitman as the storage stack in the <a href=\"https://tezos.com\">Tezos</a> proof-of-stake blockchain in 2017.</li>\n<li><a href=\"http://www.skjegstad.com/about/\">Magnus Skjegstad</a> also returned to the group and we began hacking on real-time edge infrastructure using unikernels, such as <a href=\"https://anil.recoil.org/papers/2015-diynet-kadupul\">Kadupul: Livin' on the Edge with Virtual Currencies and Time-Locked Puzzles</a>. 
Although this work got put on ice in 2015, I'm revisiting it in 2022 in the context of <a href=\"https://anil.recoil.org/projects/osmose\">Interspatial OS</a>.</li>\n<li><a href=\"https://roscidus.com\">Thomas Leonard</a>, <span>David Sheets</span> and <a href=\"https://github.com/balrajsingh\">Balraj Singh</a> joined our burgeoning group and we all prototyped the idea of real-time booting of edge unikernels in <a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu\">Jitsu: Just-In-Time Summoning of Unikernels</a>. This represented the first time we'd booted VMs on ARM, as it was very much a niche architecture for virtualisation back then.</li>\n<li>Meanwhile, on the <a href=\"https://mirageos.org/blog/ocaml-tls-api-internals-attacks-mitigation\">beach in Mirleft</a> in Morocco, <span>David Kaloper-Mer\u0161injak</span> and <a href=\"https://github.com/hannesm\">Hannes Mehnert</a> built an entire TLS stack in OCaml which we published in <a href=\"https://anil.recoil.org/papers/2015-usenixsec-nqsb\">Not-Quite-So-Broken TLS</a>. 
This was a real turning point in the project as it represented an external open source contribution (with both of them joining the University subsequently) and also grew our belief that it wasn't a completely dumb idea to rebuild every Internet protocol in a functional language.</li>\n</ul>\n<p>MirageOS also gave us ideas for other top systems research, such as the filesystem verification ideas in <a href=\"https://anil.recoil.org/papers/2015-sosp-sibylfs\">SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems</a> (which I still intend to use for a proper POSIX compatibility layer on top of Irmin at some point), and <a href=\"https://anil.recoil.org/papers/2016-usenix-flick\">FLICK: Developing and Running Application-Specific Network Services</a> (to build domain-specific data processing platforms, something that I'm now working on in 2021 in <a href=\"https://anil.recoil.org/projects/4c\">Trusted Carbon Credits</a>).</p>\n<h2><a href=\"https://anil.recoil.org/#to-unikernel-systems-and-docker\"></a>To Unikernel Systems and Docker</h2>\n<p>By this point, MirageOS was also a thriving open source community with regular IRC meetings and the beginning of hack retreats. There were several organisations using it, and the overall OCaml community started using some of our protocol implementations independently of the unikernel ideas. For example, the <a href=\"https://github.com/mirage/ocaml-cohttp\">cohttp</a> library was something I rapidly hacked together for the ASPLOS deadline, but the Unix/Lwt/Async backends are now used in quite a few major systems (including within Jane Street, no less).</p>\n<p>We had to deal with all this growth, as a university isn't the easiest place to have a very large group. 
In 2015, <a href=\"https://github.com/balrajsingh\">Balraj Singh</a> (who had made huge contributions to the Mirage TCP/IP stack), <a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> and I founded Unikernel Systems along with <a href=\"https://www.cst.cam.ac.uk/people/jdy22\">Jeremy Yallop</a>, <a href=\"https://roscidus.com\">Thomas Leonard</a>, <a href=\"http://www.skjegstad.com/about/\">Magnus Skjegstad</a>, <a href=\"https://github.com/yomimono\">Mindy Preston</a>, <a href=\"https://github.com/justincormack\">Justin Cormack</a>, <span>David Sheets</span>, <span>Amir Chaudhry</span>, and <a href=\"https://github.com/djs55\">Dave Scott</a>. After a fun few months pitching to west coast VCs in California (including fun chats with the likes of Jerry Yang), Peter Fenton from Benchmark convinced us to meet Solomon Hykes over at Docker. This conversation changed the course of our careers, as he shared his vision for the future of containerisation and how unikernels could fit in there.</p>\n<p>A short set of negotiations later, and <a href=\"https://techcrunch.com/2016/01/21/docker-acquires-unikernel-systems-as-it-looks-beyond-containers/\">Unikernel Systems was acquired by Docker</a> in 2016. We spent a very fun couple of years commercialising the technology and incorporating it into Docker for Desktop. 
Our work ended up shipping as Docker for Desktop which remains one of the most popular developer tools in the world, and I describe its architecture <a href=\"https://www.youtube.com/watch?v=zqFDEDl5Zes\">in this talk</a>.</p>\n<h2><a href=\"https://anil.recoil.org/#unikernels-in-2021-and-beyond\"></a>Unikernels in 2021 and beyond</h2>\n<p>Our startup aside, the core development of MirageOS continued to be nicely distributed in several spinouts:</p>\n<ul>\n<li><a href=\"https://kcsrk.info\">KC Sivaramakrishnan</a> and <span>Gemma Gordon</span> founded <a href=\"https://ocamllabs.io\">OCLC</a> in 2016 as a commercial spinout from the university group to drive OCaml tooling and core compiler development.</li>\n<li><a href=\"https://github.com/hannesm\">Hannes Mehnert</a> set up the robur.io cooperative in late 2017 with a <a href=\"https://robur.coop/Our%20Work/Projects\">large set of Mirage projects</a>.</li>\n<li><a href=\"https://github.com/samoht\">Thomas Gazagnaire</a> founded <a href=\"https://tarides.com\">Tarides</a> in 2018 after leaving Docker, where they maintain MirageOS and drive development of the Irmin storage stack in particular.</li>\n</ul>\n<p>The wider industry also saw a number of interesting spinouts, as many other communities also latched on to the ideas of unikernels and began their own language-specific and domain-specific versions. I joined the advisory boards of IncludeOS (now sadly defunct) and Zededa (now thankfully going from strength to strength in edge computing) to help guide strategy and adoption outside of just MirageOS. Dr Pierre Olivier maintains a great list of <a href=\"https://github.com/olivierpierre/unikernel-papers\">unikernel papers</a> where you can see the diversity and interest in unikernels. One of the most exciting implementations of a C-based unikernel can be found in <a href=\"https://www.unikraft.org/\">Unikraft</a>.</p>\n<p>As for my interest in unikernels moving forward? 
My heart always remains in finding the intersection of <em>safety</em> and <em>performance</em>, which means I mostly pay attention to language-based approaches. MirageOS continues to thrive (particularly with the effect system being integrated into OCaml in 2022, which will really change the way we develop OCaml code for embedded systems). Since 2020, I've been investigating the application of DIFC to embedded infrastructure, for example via <a href=\"https://anil.recoil.org/papers/2019-edgesys-snape\">Snape: The Dark Art of Handling Heterogeneous Enclaves</a>.</p>\n<p>The unikernel approach has also found new applications in <a href=\"https://anil.recoil.org/papers/2025-dl-rcn\">ultra-low-power computing</a> and <a href=\"https://anil.recoil.org/papers/2025-npu-bench\">edge AI deployment</a>, where the security and efficiency benefits align well with the constraints of energy-harvesting and intermittent operation scenarios explored in our <a href=\"https://anil.recoil.org/projects/osmose\">Interspatial OS</a> work.</p>\n<p>In 2025, we were also honoured to receive a <a href=\"https://anil.recoil.org/notes/unikernels-test-of-time\">most influential paper award</a> from ASPLOS for the original paper, validating the long-term impact of the unikernel approach on systems research.</p>\n<p></p></div>",
+18
avsm/projects_xen.json
···+"summary": "I was on the original team at Cambridge that built the Xen hypervisor in 2002\n-- the first open-source \"type-1\" hypervisor that ushered in the age of cloud\ncomputing and virtual machines. Xen emerged from the Xenoservers project at\nthe CL SRG, where I started my PhD and hacked on the emerging codebase and\nsubsequently worked on the development of the commercial distribution of\nXenServer.\n\n\nBack at the turn of the century, the Computer Lab SRG faculty at the time\n(led by my first PhD supervisor [@ipratt]) decided to start the\n[XenoServers](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-552.pdf) project,\nwhich would build a public infrastructure for wide-area distributed computing.\nAn [EPSRC grant](https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=GR/S01894/01) led\nto a number of graduate students all surging into the SRG in around 2002 to\nwork on the project, including me.\n\nThe later history of Xen is chronicled [on the original Xen\nwebsite](http://www-archive.xenproject.org/community/xenhistory.html), but the\nearly days were a heady mixture of furious hacking to put the various\nprototypes together. I did a very early port of NetBSD to the PV interface,\nbefore the introduction of linear page table checking into the hypervisor\ndefeated my port and Christian Limpach took it over. That early work was\nrecorded in the [:xen02] technical report. My involvement for a while after was\nlimited, as I also interned with Sandy Fraser in Princeton and set up the [:ubiqinteraction]\nproject with Intel Research.\n\nIt was after the open source release of Xen 1.0 and the submission of my PhD\nthesis that I joined XenSource as an early engineer and began release managing\nthe first commercial distribution of Xen, known as XenServer. This involved\nbuilding an entire embedded \"appliance\" that hid the underlying complexities of\nmanaging virtual machines. 
To add to the fun, we also built an entire\nmanagement toolstack in OCaml, making it one of the largest commercial uses\nof functional programming back then. Our experiences with building\nthis are published in [:2010-icfp-xen], and the XenServer management stack is still\ngoing strong as an [open source project](https://github.com/xapi-project/xen-api).\n\nThe nitty-gritty of XenServer engineering has never been captured in an academic\npaper, but I wrote a few blog posts (on the now-defunct Citrix blog) that are\nmirrored here:\n- [:installing-ubuntu-on-xenserver] covers how the then-nascent Linux distribution could be virtualised.\n- [:shedding-some-light-on-xenapp-on-xenserver-performance-tuning] discusses some performance profiling issues with XenServer after the Citrix acquisition of XenSource.\n- [:peeking-under-the-hood-of-high-availability] illustrates the extremely cool high-availability feature we built into XenServer 5.0, using some fairly complex OCaml hacking under the hood of the management stack.\n\nOnce I returned to academia full-time in 2010, much of my later work also improved\nthe Xen toolstack. 
I laid out the early vision for multiscale computing in [:2010-bcs-visions]\nand subsequently a prototype from the [:unikernels] project in [:2010-hotcloud-lamp].\nAs Xen got itself an ARM port a few years\nlater, my work on [:2015-nsdi-jitsu] also fed back to Xen development by highlighting potential efficiencies in the toolstack.\nI also investigated whether FPGAs would make sense in cloud environments in [:2011-fccm-cloudfpga].\n\nIn 2021, I largely use Solo5 and KVM as my main hacking and production\nhypervisor, but I plan to revisit Xen at some point as I begin looking\nat RISC-V architectures and embedded systems again as part of [:osmose].",+"content": "<div><h1>Xen Hypervisor</h1><p></p><p>I was on the original team at Cambridge that built the Xen hypervisor in 2002\n-- the first open-source "type-1" hypervisor that ushered in the age of cloud\ncomputing and virtual machines. Xen emerged from the Xenoservers project at\nthe CL SRG, where I started my PhD and hacked on the emerging codebase and\nsubsequently worked on the development of the commercial distribution of\nXenServer.</p>\n<p>Back at the turn of the century, the Computer Lab SRG faculty at the time\n(led by my first PhD supervisor <span>Ian Pratt</span>) decided to start the\n<a href=\"https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-552.pdf\">XenoServers</a> project,\nwhich would build a public infrastructure for wide-area distributed computing.\nAn <a href=\"https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=GR/S01894/01\">EPSRC grant</a> led\nto a number of graduate students all surging into the SRG in around 2002 to\nwork on the project, including me.</p>\n<p>The later history of Xen is chronicled <a href=\"http://www-archive.xenproject.org/community/xenhistory.html\">on the original Xen\nwebsite</a>, but the\nearly days were a heady mixture of furious hacking to put the various\nprototypes together. 
I did a very early port of NetBSD to the PV interface,\nbefore the introduction of linear page table checking into the hypervisor\ndefeated my port and Christian Limpach took it over. That early work was\nrecorded in the <a href=\"https://anil.recoil.org/papers/xen02\">Xen 2002</a> technical report. My involvement for a while after was\nlimited, as I also interned with Sandy Fraser in Princeton and set up the <a href=\"https://anil.recoil.org/projects/ubiqinteraction\">Ubiquitous Interaction Devices</a>\nproject with Intel Research.</p>\n<p>It was after the open source release of Xen 1.0 and the submission of my PhD\nthesis that I joined XenSource as an early engineer and began release managing\nthe first commercial distribution of Xen, known as XenServer. This involved\nbuilding an entire embedded "appliance" that hid the underlying complexities of\nmanaging virtual machines. To add to the fun, we also built an entire\nmanagement toolstack in OCaml, making it one of the largest commercial uses\nof functional programming back then. 
Our experiences with building\nthis are published in <a href=\"https://anil.recoil.org/papers/2010-icfp-xen\">Using functional programming within an industrial product group: perspectives and perceptions</a>, and the XenServer management stack is still\ngoing strong as an <a href=\"https://github.com/xapi-project/xen-api\">open source project</a>.</p>\n<p>The nitty-gritty of XenServer engineering has never been captured in an academic\npaper, but I wrote a few blog posts (on the now-defunct Citrix blog) that are\nmirrored here:</p>\n<ul>\n<li><a href=\"https://anil.recoil.org/notes/installing-ubuntu-on-xenserver\">Installing Ubuntu on XenServer</a> covers how the then-nascent Linux distribution could be virtualised.</li>\n<li><a href=\"https://anil.recoil.org/notes/shedding-some-light-on-xenapp-on-xenserver-performance-tuning\">Shedding light on XenApp on XenServer performance tuning</a> discusses some performance profiling issues with XenServer after the Citrix acquisition of XenSource.</li>\n<li><a href=\"https://anil.recoil.org/notes/peeking-under-the-hood-of-high-availability\">Peeking under the hood of High Availability</a> illustrates the extremely cool high-availability feature we built into XenServer 5.0, using some fairly complex OCaml hacking under the hood of the management stack.</li>\n</ul>\n<p>Once I returned to academia full-time in 2010, much of my later work also improved\nthe Xen toolstack. 
I laid out the early vision for multiscale computing in <a href=\"https://anil.recoil.org/papers/2010-bcs-visions\">Multiscale not multicore: efficient heterogeneous cloud computing</a>\nand subsequently a prototype from the <a href=\"https://anil.recoil.org/projects/unikernels\">Unikernels</a> project in <a href=\"https://anil.recoil.org/papers/2010-hotcloud-lamp\">Turning Down the LAMP: Software Specialisation for the Cloud</a>.\nAs Xen got itself an ARM port a few years\nlater, my work on <a href=\"https://anil.recoil.org/papers/2015-nsdi-jitsu\">Jitsu: Just-In-Time Summoning of Unikernels</a> also fed back to Xen development by highlighting potential efficiencies in the toolstack.\nI also investigated whether FPGAs would make sense in cloud environments in <a href=\"https://anil.recoil.org/papers/2011-fccm-cloudfpga\">Reconfigurable Data Processing for Clouds</a>.</p>\n<p>In 2021, I largely use Solo5 and KVM as my main hacking and production\nhypervisor, but I plan to revisit Xen at some point as I begin looking\nat RISC-V architectures and embedded systems again as part of <a href=\"https://anil.recoil.org/projects/osmose\">Interspatial OS</a>.</p>\n<p></p></div>",
+18
avsm/videos_13cf3878-7436-4512-844e-f72f36425bc7.json
···+"content": "<h2><a href=\"https://anil.recoil.org/videos/13cf3878-7436-4512-844e-f72f36425bc7\">The OCaml Platform 1.0 with Reason ML</a> <span>/ Dec 2018</span></h2><p></p><div></div><p></p>\n<p>Speaking about the OCaml Platform at the ReasonML meetup hosted by Pusher.</p>",
+18
avsm/videos_4324ab18-f3b2-4fdd-883f-a4188dee5816.json
···+"content": "<h2><a href=\"https://anil.recoil.org/videos/4324ab18-f3b2-4fdd-883f-a4188dee5816\">Emission Impossible: privacy-preserving carbon emissions claims</a> <span>/ Apr 2025</span></h2><p></p><div></div><p></p>\n<p>This was a talk given by <a href=\"https://www.cst.cam.ac.uk/people/psjm3\">Jessica Man</a> at the 1st International Workshop on <a href=\"https://sicsa.ac.uk/loco/loco2024\">Low Carbon Computing</a>, a hybrid event hosted in Glasgow, Scotland, UK, on 3 December 2024.\nDue to a blip in the recording, there is no sound in the first two minutes.\nSee <a href=\"https://anil.recoil.org/papers/2024-loco-emissions\">Emission Impossible: privacy-preserving carbon emissions claims</a> for more information.</p>",
+18
avsm/videos_4cd6efdb-fd22-4a1c-a326-df49dfc1f398.json
···+"content": "<h2><a href=\"https://anil.recoil.org/videos/4cd6efdb-fd22-4a1c-a326-df49dfc1f398\">Carbon-Aware Name Resolution</a> <span>/ Apr 2025</span></h2><p></p><div></div><p></p>\n<p>A talk by <a href=\"https://ryan.freumh.org\">Ryan Gibb</a> at the 1st International Workshop on <a href=\"https://sicsa.ac.uk/loco/loco2024\">Low Carbon Computing</a>. This was a hybrid event hosted in Glasgow, Scotland, UK, 3 December 2024. See <a href=\"https://anil.recoil.org/papers/2024-loco-carbonres\">Carbon-aware Name Resolution</a> for more information.</p>",
+18
avsm/videos_be89625e-c671-4e2c-8261-a98b1361a077.json
···+"content": "<h2><a href=\"https://anil.recoil.org/videos/be89625e-c671-4e2c-8261-a98b1361a077\">Cooperative Sensor Networks for Long-Term Biodiversity Monitoring</a> <span>/ Apr 2025</span></h2><p></p><div></div><p></p>\n<p>This was a talk given by <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh Millar</a> at the 1st International Workshop on <a href=\"https://sicsa.ac.uk/loco/loco2024\">Low Carbon Computing</a>, a hybrid event hosted in Glasgow, Scotland, UK, on 3 December 2024. See <a href=\"https://anil.recoil.org/papers/2024-loco-terracorder\">Cooperative Sensor Networks for Long-Term Biodiversity Monitoring</a> for more information.</p>",
+18
avsm/videos_cb2439c9-d160-4daa-8103-b952c5aa2c5f.json
···+"content": "<h2><a href=\"https://anil.recoil.org/videos/cb2439c9-d160-4daa-8103-b952c5aa2c5f\">Lineage first computing: towards a frugal userspace for Linux</a> <span>/ Apr 2025</span></h2><p></p><div></div><p></p>\n<p>A talk by <a href=\"https://mynameismwd.org\">Michael Dales</a> at the 1st International Workshop on <a href=\"https://sicsa.ac.uk/loco/loco2024\">Low Carbon Computing</a>. This was a hybrid event hosted in Glasgow, Scotland, UK, 3 December 2024.\nSee <a href=\"https://anil.recoil.org/papers/2024-loco-shark\">Lineage first computing: towards a frugal userspace for Linux</a> for more information.</p>",
+25
dra/blog_misc_2025_05_23_build-event.json
···+"summary": "Stepping into something different today for a Build Meetup hosted by Tweak, EngFlow and Jane Street at Jane Street\u2019s London offices. I was quite involved with Jbuilder development and early work around Dune 1.0 and some early 2.x work, although it\u2019s not a codebase I get to work on much these days. What was interesting for me, spending a lot of time in GNU make for the compiler, was to get some first-hand \u201cbig picture\u201d experience from the talks and also a chance to catch up with various OCaml people who can be remarkably hard to pin down.",+"content": "<p>Stepping into something different today for a <a href=\"https://meetup.build/\">Build Meetup</a>\nhosted by <a href=\"https://moduscreate.com/\">Tweak</a>, <a href=\"https://www.engflow.com/\">EngFlow</a>\nand <a href=\"https://www.janestreet.com/\">Jane Street</a> at Jane Street\u2019s London offices.\nI was quite involved with Jbuilder development and early work around Dune 1.0\nand some early 2.x work, although it\u2019s not a codebase I get to work on much\nthese days. What was interesting for me, spending a lot of time in GNU make for\nthe compiler, was to get some first-hand \u201cbig picture\u201d experience from the talks\nand also a chance to catch up with various OCaml people who can be remarkably\nhard to pin down.</p>\n\n<p>This is more a mish-mash of thoughts and memories from the day than anything\nelse - talks were being recorded, so I may try to update some of the details\nwith links to slides, but I don\u2019t have them to hand at the moment.</p>\n\n<p>There were six talks (and in fact a bonus one at the end!).</p>\n\n<p>\u201cTransitioning A Large Codebase to Bazel\u201d (Benedikt W\u00fcrkner, TTTech Auto). 
A\ntheme for me started with this talk and continued through others - the day is\nabout build systems for vast repositories within very large companies, but the\nlessons apply just as readily to disparate smaller systems outside in \u201cpublic\nopen source\u201d. The talk identified phases of moving a huge codebase maintained by\nhundreds (or even thousands) of developers. Getting past the envy of being able\nto work in an environment where one has an entire team full-time on just \u201cthe build\nsystem\u201d, I particularly focussed on the necessary part of \u201cConvince\u201d -\nespecially that it needed to be across the board (Management - QA -\n<strong>Engineers</strong>), especially as my feeling from online discussions with <code>dune pkg</code>\nis that somehow we\u2019ve missed that part. My limited experience talking to people\nworking on these huge codebases has been that there\u2019s often necessarily a huge\nfocus on <em>speed</em>. It was therefore very interesting to me that in the \u201cExecute\u201d\nphase of doing things the key advice was not to block on speed, and indeed\nthe statement that \u201cfast can come later, don\u2019t block on future things which need\nto be changed\u201d (because I personally think that\u2019s been massively missed in our\nown efforts - I\u2019ve always prioritised correctness over speed\u2026 fast but\nsometimes not working is for me only fractionally above broken).</p>\n\n<p>\u201cIntegrating bazel/pre-commit\u201d (Matt Clarkson, Arm). Quite a few years ago, I\nadded a pre-commit linting githook for OCaml (<a href=\"https://github.com/ocaml/ocaml/pull/1148\">ocaml/ocaml#1148</a>).\nI find it quite handy, but my impression is that there aren\u2019t many others who do.\nHoly moly, there\u2019s a big infrastructure of githooks out there in use in\ncompanies! TIL about <a href=\"https://pre-commit.com/\">pre-commit.com</a>. 
Integration of\nthis with Bazel was relevant, if not replicable - I vociferously fight to keep\nour lint script in awk not because I\u2019m mad (well\u2026), but because the point is\nthat the githook has no dependencies. This was a very neat demonstration of work\nto provide a hermetic environment in which diverse hooks - potentially in a\ndifferent version of Python from the project using them - can be\ndeployed and updated easily for users (in this case, of course, developers). The\nmain focus resonates with work that has been ongoing and which I hope to be able\nto continue for the compiler - bringing CI as local as possible, ensuring that\nthe PR is not the first time you discover the problem.</p>\n\n<p>Next up was a talk on Dune advances within Jane Street (Andrey Mokhov + ???).\nThey\u2019ve made some changes to allow nix to be used to get external dependencies\nvia a <code>(sysdeps ..)</code> stanza. Jane Street of course get to simplify the world a\nlittle (and, given the amount of code, why wouldn\u2019t they!!), but it\u2019s interesting to\nmuse how this could be extended out to both multiple platforms and also to\nopam metadata in general (and the overlap with some of our own work on multi-\necosystem package solving). The other feature demonstrated was peer-to-peer\nremote builds. The motivation for this was interesting to me - I\u2019ve previously\nargued that aspects of Windows support get more easily merged by demonstrating\nthat what\u2019s required is actually critical for something else (as have others:\ncf. the excellent <a href=\"https://www.youtube.com/watch?v=qbKGw8MQ0i8\">\u201cNTFS really isn\u2019t that bad\u201d</a>).\nRemote building always sounds like a nice idea, but hits problems quite quickly\n(reproducibility, etc., etc.). Of course, it becomes really critical when that\nremote building involves GPUs - i.e. 
it\u2019s become something more important by\nwanting to be able to share and schedule hardware, even though the concept of\nremote build servers has been talked about for years. Nice demonstration\nof \u201cdoing the right thing\u201d as well - the p2p aspect is neat, and while it was\nclear they haven\u2019t yet actually benchmarked whether it\u2019s better, I liked the subtext\nthat it\u2019s been done this (slightly more complicated) way <em>first</em> because the\nsimpler centralised system looks bottlenecky even without evidence \ud83d\ude0a</p>\n\n<p>\u201cMeasuring & Improving Build Speeds\u201d (Vaibhav Shah, EngFlow). I\u2019ve been musing\non (non-evil) telemetry and more continuous measuring of build performance (both\npackage managers and build systems). I guess the niceish takeaway here is that\nthis affects large companies too\u2026 it\u2019s not just projects with a small number\nof maintainers who end up only looking at build performance regressions when it\ngets really bad and then forgetting about it for a few months/years until it\nnext gets bad!</p>\n\n<p>\u201cWhat Makes Buck2 Special?\u201d (Neil Mitchell, Meta). I hope the video of this talk\nemerges at some point, because it was really great. In particular, this\nplaced Buck2 on a spectrum between a static dependency graph (Bazel,\nmake, etc.) and a fully dynamic dependency graph (Excel, etc.): a static\n<em>dependency</em> graph combined with sections of a dynamic\n<em>action</em> graph. For example, in OCaml terms, that explains that <code>foo.ml</code>,\n<code>bar.ml</code> and <code>baz.ml</code> make up <code>awesome.cmxa</code> (static dependencies), but still\nallow the precise dependencies between those ml files to be dynamically\ndiscovered by <code>ocamldep</code>. 
However, that\u2019s not just the build system - this is\nsimilar (probably unsurprisingly, but I was briefly surprised, as it hadn\u2019t\noccurred to me before) for a package manager, where there is the same distinction\nbetween the <em>dependency graph</em> and the <em>action graph</em>. In particular, for Buck2 this\ncan intuitively be that the static dependency graph tells you what is strictly needed\n(and is largely specified in the build description) but then the action graph\ndetermines things like parallelism - dynamic, but still guided by the static\ndependency graph. Which is <em>exactly</em> the package manager model. Wondering how to\napply that to my own musings for dynamic/property-based discovery of external\ndependencies for a future version of opam.</p>\n\n<p>\u201cExtending Buck2\u201d (Andreas Herrmann, Tweag). On the downside - the main subject\nof this talk is an internship proposal I floated years ago for Dune which never\ngot anywhere. On the plus side - it works beautifully in Buck2, so it\u2019s\nvalidated! The idea is to be able to break through the boundaries of libraries\nto increase build parallelism - in other words, instead of compiling <code>foo.cmxa</code>,\n<code>bar.cmxa</code> and <code>baz.cmxa</code> in order to link <code>main-program</code>, you actually get to\ncompile <em>exactly</em> the modules which are used in <code>main-program</code> and then link it,\npotentially <em>then</em> creating those cmxa files in parallel as usable future\nartefacts. That\u2019s obviously a quite interesting piece of dynamism - in\nparticular, it means on a build that you might choose to create the cmxa files if\nnothing has changed, or you might ignore them completely. Crucially, it provides a\nmore accurate dependency graph - if you change a module in a library which is\nnot linked in the resulting executable, you can avoid rebuilds. 
TIL that Haskell\nhas a build-system-like mode where it can discover dependencies and compile more\nfiles as it goes (I have an intern looking at that in OCaml this summer,\nalthough I\u2019m more interested in seeing how easy it is to retrofit using algebraic\neffects). And - interestingly, given why I\u2019d come along for the day - the\nquestion was asked as to why more compiler authors aren\u2019t in the room with\nbuild system authors, because these kinds of optimisations do clearly have to be\ndone in coordination with the compiler. So I polished my halo a bit!</p>",
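The static-dependency versus dynamic-action split from the Buck2 talk can be sketched in a few lines of OCaml. This is a toy model, not anything from Buck2 or Dune: the library's module list is the static knowledge, while `scan` (an entirely made-up stand-in for `ocamldep`) discovers the per-module edges only when asked, and a topological walk over what it reports yields the action order.

```ocaml
(* Toy model: static *dependency* graph (the declared module list)
   with a dynamically discovered *action* graph ([scan] stands in
   for ocamldep). Module names and edges are purely illustrative. *)

(* Static knowledge: awesome.cmxa is built from exactly these modules. *)
let library = ["foo"; "bar"; "baz"]

(* Dynamic knowledge, only available once each source is scanned. *)
let scan = function
  | "foo" -> ["bar"]   (* foo.ml uses Bar *)
  | "bar" -> ["baz"]   (* bar.ml uses Baz *)
  | _ -> []

(* Topologically order the dynamically discovered action graph. *)
let build_order modules =
  let rec visit (seen, order) m =
    if List.mem m seen then (seen, order)
    else
      let seen, order = List.fold_left visit (m :: seen, order) (scan m) in
      (seen, m :: order)
  in
  List.rev (snd (List.fold_left visit ([], []) modules))

let () =
  build_order library |> String.concat " -> " |> print_endline
  (* baz -> bar -> foo *)
```

The point of the spectrum is visible even in the toy: `library` is fixed up front, but the compile order falls out of `scan` at build time.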
+23
dra/blog_platform_2025_04_15_yak-trimming.json
···+"summary": "Since presenting Relocatable OCaml at OCaml Dev Meeting, I have been playing whac-a-mole with our CI systems, working towards getting finalised branches for the work ready for upstreaming. Eventually, it got to me, and I realised it was possibly time to come up with a better test environment for these changes.",+"content": "<p>Since presenting Relocatable OCaml at <a href=\"https://www.dra27.uk/blog/platform/2025/03/28/ocaml-dev-meeting.html\">OCaml Dev Meeting</a>,\nI have been playing <a href=\"https://en.wikipedia.org/wiki/Whac-A-Mole\">whac-a-mole</a>\nwith our CI systems, working towards getting finalised branches for the work\nready for upstreaming. Eventually, it got to me, and I realised it was possibly\ntime to come up with a better test environment for these changes.</p>\n\n<p><a href=\"https://en.wiktionary.org/wiki/yak_shaving\">Yak Shaving</a>:</p>\n\n<ol>\n <li>Any apparently useless activity which, by allowing one to overcome\nintermediate difficulties, allows one to solve a larger problem.</li>\n <li>A less useful activity done consciously or subconsciously to procrastinate\nabout a larger but more useful task.</li>\n</ol>\n\n<p>I definitely fear falling into the second definition! But, the problem at hand:\nrefactoring tests which run on a diverse set of platforms to make them\nproperty-based, rather than name-based (i.e. going from \u201cthis test fails on\nmacOS, FreeBSD and OpenBSD\u201d to \u201cthis test fails if the assembler is LLVM\u2019s\ninternal assembler [which happens to be the case on macOS, FreeBSD and\nOpenBSD]\u201d). The problem was it\u2019s quite hard to get right, so I was ending up\nfixing a series of apparent glitches, then \u201cplaying CI golf\u201d (push it to the CI\nservice, see what you forgot). 
What I really needed was to be able to work on\nthe test harness, edit it on any of the systems and quickly see the effect on\nany of the others.</p>\n\n<p>I used <a href=\"https://syncthing.net/\">Syncthing</a> both to share the support shell\nscripts and also to distribute the test harness (yes, yes - what was handy with\nSyncthing was <em>automatic</em> synchronisation, which is why I didn\u2019t use Unison).\nSyncthing insists on blatting files into the directories it\u2019s synchronising,\nwhich means I couldn\u2019t quite do what I wanted and synchronise the Git checkouts\ndirectly (it would be <em>so</em> lovely if Git had the ability to mark some files as\nboth ignored and <em>never cleaned</em>\u2026). However, a certain amount of glue was\nneeded to kick off the builds, so it wasn\u2019t too bad to have to hardlink the test\nharness somewhere else for Syncthing to do its magic with. After pleasingly\nlittle <code>sh</code> hacking:</p>\n\n<p><img alt=\"Roasting a laptop with 8 builds of OCaml!\" src=\"https://www.dra27.uk/assets/2025-04-15/2025-04-15-screens.png\"></p>\n\n<p>The top is 5 different Windows configurations running in tmux (which needs\nmintty sadly, as Cygwin\u2019s tmux and Windows Terminal really don\u2019t seem to agree):\nthat\u2019s testing MSVC with <code>clang-cl</code>, vanilla MSVC, mingw-w64 in x86_64 and i686\nand then finally Cygwin itself. The bottom right is two builds of Linux running\nin WSL on the same machine (testing a normal build and a static build). On the\nbottom left is an SSH session to a Hyper-V VM running FreeBSD (also on the same\nmachine!) and then an SSH tunnel to the Mac Mini that lives on the desk in my\n<a href=\"https://tarides.com\">Tarides</a> office. I then have a script which can be fed a\ncommit sha and additional configuration options, and all 9 of them then pick up\nthe instruction, rebuild the compiler and run the test harness. 
When something\nfails, either the test harness can be edited separately, or the affected machine\ncan be broken out of the script and debugged - but as the test harness gets\nupdated, Syncthing redistributes it to the other machines and they immediately\nre-run it.</p>\n\n<p>Unsurprisingly, it was much more efficient to use than the CI golf - especially\nwhen testing individual commits with different build configurations. The\nnoise of the CPU fans is another matter, but I\u2019m fortunate enough to have a new\nworkstation arriving fairly soon, so at least next time my poor laptop won\u2019t\nhave to do all the work.</p>\n\n<p>Conclusion of the week: occasionally the yak may need at least a trim!</p>",
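The name-based to property-based refactoring of the tests can be caricatured in a few lines of OCaml. The variant and function names here are hypothetical illustrations, not the actual test-harness types: the point is that the skip condition names the property that matters (which assembler) rather than the platforms where it happens to hold.

```ocaml
(* Caricature of moving a test skip-list from OS names to the property
   that actually causes the failure (hypothetical types, not the real
   OCaml test harness). *)
type assembler = Gas | Llvm_internal | Masm

(* Which assembler a platform happens to use today. *)
let assembler_of_os = function
  | "macOS" | "FreeBSD" | "OpenBSD" -> Llvm_internal
  | "Windows" -> Masm
  | _ -> Gas

(* Before: "this test fails on macOS, FreeBSD and OpenBSD". *)
let skip_name_based os = List.mem os ["macOS"; "FreeBSD"; "OpenBSD"]

(* After: "this test fails if the assembler is LLVM's internal one". *)
let skip_property_based os = assembler_of_os os = Llvm_internal

let () =
  ["macOS"; "FreeBSD"; "OpenBSD"; "Linux"; "Windows"]
  |> List.for_all (fun os -> skip_name_based os = skip_property_based os)
  |> fun same -> assert same;
  print_endline "property-based skip matches name-based skip"
```

The two predicates agree today, but only the property-based one stays correct if, say, a platform switches its default assembler.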
+23
dra/blog_platform_2025_04_22_branching-out.json
···+"summary": "opam 2.4 was branched last week\u2026 very pleasing to see Ryan\u2019s work on Nix depext support get merged (we spent quite a bit of time on that together last summer). It\u2019s a subtle-sounding (huge) change, but the move away from relying on patch and diff as external commands (which has been a HUGE amount of work done by @kit-ty-kate) paves the way for being able to sort out the incredible slowness of opam update on Windows.",+"content": "<p><a href=\"https://opam.ocaml.org/blog/opam-2-4-0-alpha1/\">opam 2.4</a> was branched last\nweek\u2026 very pleasing to see <a href=\"https://ryan.freumh.org/\">Ryan\u2019s</a> work on Nix depext\nsupport get merged (we spent quite a bit of time on that together last summer).\nIt\u2019s a subtle-sounding (huge) change, but the move away from relying on <code>patch</code>\nand <code>diff</code> as external commands (which has been a HUGE amount of work done by\n<a href=\"https://github.com/kit-ty-kate\">@kit-ty-kate</a>) paves the way for being able to\nsort out the incredible slowness of <code>opam update</code> on Windows.</p>\n\n<p><a href=\"https://icfp24.sigplan.org/details/ocaml-2024-papers/10/Opam-2-2-and-beyond\">Not at all coincidentally</a>,\nOCaml 5.4 was frozen two days ago as well. Relocatable OCaml not quite ready in\ntime, but at least those PRs will be ready really, really[, really] soon \ud83e\udee3\u2026</p>",
+23
dra/blog_platform_2025_05_07_oxcaml-toes.json
···+"summary": "Jane Street have been working for a few years on a whole suite of extensions to OCaml, many of which they\u2019ve both blogged and published about. I did some hacking last year getting a version of that running on Windows, which I really must resurrect. But today I actually had a go at doing a tiny something with its features!",+"content": "<p><a href=\"https://opensource.janestreet.com/\">Jane Street</a> have been working for a few\nyears on a whole suite of extensions to OCaml, many of which they\u2019ve both\nblogged and published about. I did some hacking last year getting a version of\nthat running on Windows, which I really must resurrect. But today I actually had\na go at doing a tiny something with its features!</p>\n\n<p>Stack allocation is a fascinating feature to add for me. I strongly believe that\nOCaml\u2019s strength lies in pragmatism, and the promise of stack allocated values\nis that we\u2019ll be able to write highly memory-performant code in OCaml <em>when we\nwant to</em> (i.e. unlike in Rust, when we really don\u2019t care, we can just leave it\nall to the GC as normal) and without having to massively compromise that code.</p>\n\n<p>I\u2019ve dusted off the first day of <a href=\"https://adventofcode.com/2024/day/1\">Advent of Code 2024</a>.\nInitially, not looking at solving the actual puzzle, but my input is 1000 lines\nof text where each line is two 5-digit numbers separated by three spaces. 
Here\u2019s\na trivial snippet over that:</p>\n\n<div><div><pre><code><span>let</span> <span>f</span> <span>a</span> <span>_s</span> <span>=</span> <span>succ</span> <span>a</span>\n\n<span>let</span> <span>execute</span> <span>file</span> <span>=</span>\n <span>In_channel</span><span>.</span><span>with_open_text</span> <span>file</span> <span>(</span><span>In_channel</span><span>.</span><span>fold_lines</span> <span>f</span> <span>0</span><span>)</span>\n\n<span>let</span> <span>()</span> <span>=</span>\n <span>let</span> <span>c1</span> <span>=</span> <span>Gc</span><span>.</span><span>minor_words</span> <span>()</span> <span>in</span>\n <span>let</span> <span>r</span> <span>=</span> <span>execute</span> <span>\"input-01\"</span> <span>in</span>\n <span>let</span> <span>c2</span> <span>=</span> <span>Gc</span><span>.</span><span>minor_words</span> <span>()</span> <span>in</span>\n <span>Printf</span><span>.</span><span>printf</span>\n <span>\"Result: %d</span><span>\\n</span><span>%.0f words allocated across the call</span><span>\\n</span><span>\"</span> <span>r</span> <span>(</span><span>c2</span> <span>-.</span> <span>c1</span><span>)</span>\n</code></pre></div></div>\n\n<p>This is just counting the number of lines and for me is showing 3012 words\nallocated on the minor heap. There are 1000 lines in the file each of which\nneeds 14 bytes (including the terminator) so, until one of our GC experts\ncorrects me, I reckon that\u2019s 1000 headers, 2000 words containing the strings\nthemselves and we can wave our hands about the channel and closure in those\ncalls to account for the other 12 words.</p>\n\n<p>So far, so good - this is just counting the lines. 
Now, as a further toy example\n(if one were really concerned about performance, this is totally not the way to\ndo this\u2026), let\u2019s add them up instead:</p>\n\n<div><div><pre><code><span>let</span> <span>f</span> <span>a</span> <span>s</span> <span>=</span>\n <span>let</span> <span>fst</span> <span>=</span> <span>String</span><span>.</span><span>sub</span> <span>s</span> <span>0</span> <span>5</span> <span>in</span>\n <span>let</span> <span>snd</span> <span>=</span> <span>String</span><span>.</span><span>sub</span> <span>s</span> <span>8</span> <span>5</span> <span>in</span>\n <span>a</span> <span>+</span> <span>int_of_string</span> <span>fst</span> <span>+</span> <span>int_of_string</span> <span>snd</span>\n</code></pre></div></div>\n\n<p>That gives me 7012 minor words - another 4000, which corresponds to all of those\n<code>String.sub</code> calls (2000 6-byte strings, 2000 header words). So what can stack\nallocation bring us? Well, I\u2019m wanting to \u201clift the bonnet\u201d with all this, so\nrather than using Base, let\u2019s have a little bit of hand-rolled support (I said\nthis was a toy example):</p>\n\n<div><div><pre><code><span>module</span> <span>String</span> <span>=</span> <span>struct</span>\n <span>include</span> <span>String</span>\n\n <span>external</span> <span>unsafe_create_local</span> <span>:</span> <span>int</span> <span>-></span> <span>local_</span> <span>bytes</span> <span>=</span> <span>\"caml_create_local_bytes\"</span>\n\n <span>external</span> <span>unsafe_blit_string</span> <span>:</span>\n <span>(</span><span>string</span><span>[</span><span>@</span><span>local_opt</span><span>])</span> <span>-></span> <span>int</span> <span>-></span> <span>(</span><span>bytes</span><span>[</span><span>@</span><span>local_opt</span><span>])</span> <span>-></span> <span>int</span> <span>-></span> <span>int</span> <span>-></span> <span>unit</span>\n <span>=</span> <span>\"caml_blit_string\"</span> 
<span>[</span><span>@@</span><span>noalloc</span><span>]</span>\n\n <span>external</span> <span>unsafe_to_string</span> <span>:</span>\n <span>(</span><span>bytes</span><span>[</span><span>@</span><span>local_opt</span><span>])</span> <span>-></span> <span>(</span><span>string</span><span>[</span><span>@</span><span>local_opt</span><span>])</span> <span>=</span> <span>\"%bytes_to_string\"</span>\n\n <span>external</span> <span>get</span> <span>:</span>\n <span>(</span><span>string</span><span>[</span><span>@</span><span>local_opt</span><span>])</span> <span>-></span> <span>(</span><span>int</span><span>[</span><span>@</span><span>local_opt</span><span>])</span> <span>-></span> <span>char</span> <span>=</span> <span>\"%string_safe_get\"</span>\n\n <span>let</span> <span>sub_local</span> <span>s</span> <span>ofs</span> <span>len</span> <span>=</span> <span>exclave_</span>\n <span>if</span> <span>ofs</span> <span><</span> <span>0</span> <span>||</span> <span>len</span> <span><</span> <span>0</span> <span>||</span> <span>ofs</span> <span>></span> <span>length</span> <span>s</span> <span>-</span> <span>len</span>\n <span>then</span> <span>invalid_arg</span> <span>\"String.sub\"</span>\n <span>else</span> <span>begin</span>\n <span>let</span> <span>r</span> <span>=</span> <span>unsafe_create_local</span> <span>len</span> <span>in</span>\n <span>unsafe_blit_string</span> <span>s</span> <span>ofs</span> <span>r</span> <span>0</span> <span>len</span><span>;</span>\n <span>unsafe_to_string</span> <span>r</span>\n<span>end</span>\n</code></pre></div></div>\n\n<p>What\u2019s interesting to me is that this doesn\u2019t look <em>too</em> different from an\nexpanded version of <code>String.sub</code> from the Standard Library:</p>\n\n<div><div><pre><code><span>let</span> <span>sub</span> <span>s</span> <span>ofs</span> <span>len</span> <span>=</span>\n <span>if</span> <span>ofs</span> <span><</span> <span>0</span> <span>||</span> <span>len</span> <span><</span> <span>0</span> 
<span>||</span> <span>ofs</span> <span>></span> <span>length</span> <span>s</span> <span>-</span> <span>len</span>\n <span>then</span> <span>invalid_arg</span> <span>\"String.sub / Bytes.sub\"</span>\n <span>else</span> <span>begin</span>\n <span>let</span> <span>r</span> <span>=</span> <span>create</span> <span>len</span> <span>in</span>\n <span>unsafe_blit</span> <span>s</span> <span>ofs</span> <span>r</span> <span>0</span> <span>len</span><span>;</span>\n <span>unsafe_to_string</span> <span>r</span>\n <span>end</span>\n</code></pre></div></div>\n\n<p>we just had to <em>choose</em> to create the stack-allocated strings (yes, yes, stack\nallocated is still allocated, which isn\u2019t necessary, of course). But we can now\nplug that in:</p>\n\n<div><div><pre><code><span>let</span> <span>f</span> <span>a</span> <span>s</span> <span>=</span>\n <span>let</span> <span>fst</span> <span>=</span> <span>String</span><span>.</span><span>sub_local</span> <span>s</span> <span>0</span> <span>5</span> <span>in</span>\n <span>let</span> <span>snd</span> <span>=</span> <span>String</span><span>.</span><span>sub_local</span> <span>s</span> <span>8</span> <span>5</span> <span>in</span>\n <span>a</span> <span>+</span> <span>int_of_string</span> <span>fst</span> <span>+</span> <span>int_of_string</span> <span>snd</span>\n</code></pre></div></div>\n\n<p>and:</p>\n\n<div><div><pre><code> | a + int_of_string fst + int_of_string snd\n ^^^\nError: This value escapes its region.\n</code></pre></div></div>\n\n<p>Ah, interesting to see how it spreads: we need an updated <code>int_of_string</code>:</p>\n\n<div><div><pre><code><span>external</span> <span>int_of_string</span> <span>:</span> <span>(</span><span>string</span><span>[</span><span>@</span><span>local_opt</span><span>])</span> <span>-></span> <span>int</span> <span>=</span> <span>\"caml_int_of_string\"</span>\n</code></pre></div></div>\n\n<p>and now it works <em>and we\u2019re back to the same allocations as when counting the\nlines 
instead</em>!</p>\n\n<p>All the mode inference works as you\u2019d expect too: rewriting it so that <code>f</code> takes\nthe <code>sub</code> function as an argument:</p>\n\n<div><div><pre><code><span>let</span> <span>f</span> <span>sub</span> <span>a</span> <span>s</span> <span>=</span>\n <span>let</span> <span>fst</span> <span>=</span> <span>sub</span> <span>s</span> <span>0</span> <span>5</span> <span>in</span>\n <span>let</span> <span>snd</span> <span>=</span> <span>sub</span> <span>s</span> <span>8</span> <span>5</span> <span>in</span>\n <span>a</span> <span>+</span> <span>int_of_string</span> <span>fst</span> <span>+</span> <span>int_of_string</span> <span>snd</span>\n\n<span>let</span> <span>execute</span> <span>sub</span> <span>file</span> <span>=</span>\n <span>In_channel</span><span>.</span><span>with_open_text</span> <span>file</span> <span>(</span><span>In_channel</span><span>.</span><span>fold_lines</span> <span>(</span><span>f</span> <span>sub</span><span>)</span> <span>0</span><span>)</span>\n\n<span>let</span> <span>show</span> <span>name</span> <span>sub</span> <span>=</span>\n <span>let</span> <span>c1</span> <span>=</span> <span>Gc</span><span>.</span><span>minor_words</span> <span>()</span> <span>in</span>\n <span>let</span> <span>r</span> <span>=</span> <span>execute</span> <span>sub</span> <span>\"input-01\"</span> <span>in</span>\n <span>let</span> <span>c2</span> <span>=</span> <span>Gc</span><span>.</span><span>minor_words</span> <span>()</span> <span>in</span>\n <span>Printf</span><span>.</span><span>printf</span>\n <span>\"Result for %s: %d</span><span>\\n</span><span>%.0f words allocated across the call</span><span>\\n</span><span>\"</span>\n <span>name</span> <span>r</span> <span>(</span><span>c2</span> <span>-.</span> <span>c1</span><span>)</span>\n\n<span>let</span> <span>()</span> <span>=</span>\n <span>show</span> <span>\"sub\"</span> <span>String</span><span>.</span><span>sub</span><span>;</span>\n <span>show</span> 
<span>\"sub_local\"</span> <span>String</span><span>.</span><span>sub_local</span>\n</code></pre></div></div>\n\n<p>and still 4000 words fewer on the minor heap for <code>String.sub_local</code>. Tiny first\nimpression: the <code>[@local_opt]</code> annotation feels quite infectious for library\nauthors!</p>\n\n<p>More serious playing to come\u2026 who knows, might even re-do the puzzle!</p>",
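The allocation arithmetic above can be checked in stock OCaml too, with no OxCaml extensions: on a 64-bit system each 5-character `String.sub` result should cost 2 minor words (1 header word plus 1 word holding the 5 bytes and padding), so 1000 lines of two subs each is roughly 4000 words. A small sketch of that measurement, using a made-up input line rather than the real puzzle file:

```ocaml
(* Measure minor-heap words allocated by n pairs of String.sub calls,
   mirroring the Gc.minor_words bracketing above. The input line is a
   stand-in for the real Advent of Code data. *)
let measure n =
  let line = "12345   67890" in
  let acc = ref 0 in
  let c1 = Gc.minor_words () in
  for _ = 1 to n do
    let x = String.sub line 0 5 in
    let y = String.sub line 8 5 in
    acc := !acc + int_of_string x + int_of_string y
  done;
  let c2 = Gc.minor_words () in
  (!acc, int_of_float (c2 -. c1))

let () =
  let sum, words = measure 1000 in
  (* Expect roughly 4000 words: 2000 substrings at 2 words each,
     plus a handful for boxing the Gc.minor_words floats. *)
  Printf.printf "sum = %d, ~%d minor words\n" sum words
```

The exact figure wobbles by a few words because `Gc.minor_words` itself returns a boxed float, which is why the blog's measurements come out a dozen words over the back-of-envelope number.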
+23
dra/blog_platform_2025_06_22_they-do-it-with-mirrors.json
···+"summary": "While comfort-watching the indomitable Joan Hickson as Agatha Christie\u2019s Miss Marple in The Body in the Library, it occurred to me that Miss Marple would have been a formidable debugger. Since returning from holiday one, two, three weeks ago, I\u2019ve been mostly straightening out and finalising the final Relocatable OCaml PR. A frustrating task, because I know these things will take weeks and have little to show for it at the end, so one spends the entire time feeling it should be finished by now. It\u2019s just about there, when this little testsuite failure popped up:",+"content": "<p>While comfort-watching the indomitable <a href=\"https://en.wikipedia.org/wiki/Joan_Hickson\">Joan Hickson</a>\nas Agatha Christie\u2019s <a href=\"https://en.wikipedia.org/wiki/Miss_Marple_(TV_series)\">Miss Marple</a>\nin <a href=\"https://en.wikipedia.org/wiki/The_Body_in_the_Library_(film)\">The Body in the Library</a>,\nit occurred to me that Miss Marple would have been a formidable debugger. Since\nreturning from holiday one, two, three weeks ago, I\u2019ve been mostly\nstraightening out and finalising the final Relocatable OCaml PR. 
A frustrating\ntask, because I know these things will take weeks and have little to show for it at\nthe end, so one spends the entire time feeling it should be finished by now.\nIt\u2019s just about there, when this little testsuite failure popped up:</p>\n\n<div><div><pre><code>List of failed tests:\n tests/lib-unix/common/cloexec.ml\n tests/warnings/mnemonics.mll\n</code></pre></div></div>\n\n<p>In both cases there was a similar, very strange-looking error:</p>\n\n<div><div><pre><code>the file '/home/runner/work/ocaml/ocaml/testsuite/tests/lib-unix/_ocamltest/tests/lib-unix/common/cloexec/ocamlc.byte/cloexec_leap.exe' is not a bytecode executable file\n</code></pre></div></div>\n\n<p>and</p>\n\n<div><div><pre><code>the file '/home/runner/work/ocaml/ocaml/testsuite/tests/warnings/_ocamltest/tests/warnings/mnemonics/ocamlc.byte/mnemonics.byte' is not a bytecode executable file\nFatal error: exception File \"mnemonics.mll\", line 55, characters 2-8: Assertion failed\n</code></pre></div></div>\n\n<p>Now, as it happens, the diagnosis of <em>what</em> was happening was relatively quick\nfor me. I\u2019ve dusted off and thrown around so many obscure bits of the runtime\nsystem on so many diverse configurations and platforms with Relocatable OCaml\nthat it\u2019s resulted in a lot of other bugs being fixed <em>before</em> the main PRs,\nsome bugs fixed <em>with</em> the main PRs and then a pile of follow-up work with the\nadditional parts. 
There\u2019s one particularly long-standing bug on Windows:</p>\n\n<div><div><pre><code>C:\\Users\\DRA>where ocamlc.byte\nC:\\Users\\DRA\\AppData\\Local\\opam\\default\\bin\\ocamlc.byte.exe\n\nC:\\Users\\DRA>where ocamlc.byte.exe\nC:\\Users\\DRA\\AppData\\Local\\opam\\default\\bin\\ocamlc.byte.exe\n\nC:\\Users\\DRA>ocamlc.byte.exe --version\n5.2.0\n\nC:\\Users\\DRA>ocamlc.byte --version\nunknown option --version\n</code></pre></div></div>\n\n<p>Strange, huh: <code>ocamlc.byte.exe</code> does one thing and <code>ocamlc.byte</code> does another!\nThe precise diagnosis of what\u2019s going on there is nearly a novel in itself. The\nfix is quite involved, and is at the \u201cmight get put into PR 3; might be left for\nthe future\u201d stage. The failures across CI were just the Unix builds which use\nthe stub launcher for bytecode (it\u2019s an obscure corner of startup which lives in\n<a href=\"https://github.com/ocaml/ocaml/tree/trunk/stdlib/header.c\"><code>stdlib/header.c</code></a>\nand which has received a pre-Relocatable overhaul in <a href=\"https://github.com/ocaml/ocaml/pull/13988\">ocaml/ocaml#13988</a>).\nThere are so many bits to Relocatable OCaml that I have a master script that\nputs them all together and then backports them: the CI failure was only on the\n\u201ctrunk\u201d version of this, the 5.4, 5.3 and 5.2 versions passing as normal. The\nbackports don\u2019t include the \u201cfuture\u201d work, so that quickly pointed me at the\nwork sitting in <a href=\"https://github.com/dra27/ocaml/pull/190/commits\">dra27/ocaml#190</a>.</p>\n\n<p>Both those failures are from tests which themselves spawn executables as part of\nthe test. 
What was particularly strange was mnemonics because that doesn\u2019t call\nitself, rather it calls the compiler:</p>\n\n<div><div><pre><code><span>let</span> <span>mnemonics</span> <span>=</span>\n <span>let</span> <span>stdout</span> <span>=</span> <span>\"warn-help.out\"</span> <span>in</span>\n <span>let</span> <span>n</span> <span>=</span>\n <span>Sys</span><span>.</span><span>command</span>\n <span>Filename</span><span>.(</span><span>quote_command</span> <span>~</span><span>stdout</span>\n <span>ocamlrun</span> <span>[</span><span>concat</span> <span>ocamlsrcdir</span> <span>\"ocamlc\"</span><span>;</span> <span>\"-warn-help\"</span><span>])</span>\n <span>in</span>\n <span>assert</span> <span>(</span><span>n</span> <span>=</span> <span>0</span><span>);</span>\n</code></pre></div></div>\n\n<p>That\u2019s invoking the <code>ocamlc</code> bytecode binary from the root of the build tree\npassing it as an argument directly to <code>runtime/ocamlrun</code> in the root of the\nbuild tree. The fact that ocamlrun is then displaying a message referring to\n<code>mnemonics.byte</code> is very strange, but was down to a bug in my fix for this other\nissue. The core of the bug-fix is that the stub launcher, having opened the\nbytecode image to find its <code>RNTM</code> section so it can search for the runtime to\ncall now leaves the file descriptor open and hands its number over to <code>ocamlrun</code>\nas part of the <code>exec</code> call (works on Windows as well). 
The problem was the\ncleanup from this in <code>ocamlrun</code> itself, where that environment is reset having\nbeen consumed:</p>\n\n<div><div><pre><code><span>#if defined(_WIN32)\n</span> <span>_wputenv</span><span>(</span><span>L\"__OCAML_EXEC_FD=\"</span><span>);</span>\n<span>#elif defined(HAS_SETENV_UNSETENV)\n</span> <span>unsetenv</span><span>(</span><span>\"__OCAML_EXEC_FD=\"</span><span>);</span>\n<span>#endif\n</span></code></pre></div></div>\n\n<p>There\u2019s a stray <code>=</code> at the end of the Unix branch there \ud83e\udee3 Right, problem solved\nand, were I Inspector Slack, I should have zipped straight round to Basil\nBlake\u2019s gaudy cottage, handcuffs at the ready.</p>\n\n<p>But what about the second murder? Which, in this case, is why the heck hadn\u2019t\nthis been seen before? That\u2019s the kind of thing that terrifies me with a fix\nlike this: the bug is obvious, but was something else being masked and, more to\nthe point, have I just changed something which introduced a <em>different</em> bug\nwhich happened to cause this one to be visible. At this point, I made a note,\nclosed my laptop, and returned to my knitting (no, wait, that was Miss Marple).\nThen the penny dropped: the compiler\u2019s being configured here with\n<code>--with-target-sh=exe</code> (on Unix, that means that bytecode executables\nintentionally avoid shebang-style scripts and use the stub), which should mean\nthat those two tests are compiled using the stub. Except that because we test\nthe compiler in the build tree, previously the compiler picks up\n<code>stdlib/runtime-launch-info</code> which is the <em>build</em> version of that header, not\nthe <em>target</em> version. 
However, one of the refactorings I\u2019ve done in <a href=\"https://github.com/dra27/ocaml/pull/189/commits/c60e4aafcf97bde037445e4cd94a9e659caf072a\">c60e4aaf</a>\nstops using <code>runtime-launch-info</code> this way (I introduced that header in <a href=\"https://github.com/ocaml/ocaml/pull/12751\">ocaml/ocaml#12751</a>\nas part of OCaml 5.2.0). A side-effect of that change is that\n<code>stdlib/runtime-launch-info</code> is actually the target version of the header, and\nthe <em>root</em> bytecode compiler is <em>now</em> behaving as we\u2019d always been expecting it\nto for that test, using the target configuration defined in <code>utils/config.ml</code>\u2026 and so\nonly now revealing this latent bug in my fix.</p>\n\n<p><em>\u201cThey do it with mirrors, you know-that sort of thing-if you understand me.\u201d\nInspector Curry did not understand. He stared and wondered if Miss Marple was\nquite right in the head.</em></p>",
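The stray `=` is worth spelling out: POSIX `unsetenv` matches its argument against each environment entry's name, i.e. the part before the entry's own `=`, so a requested name that itself ends in `=` can never match anything and the cleanup silently does nothing. A tiny OCaml model of that matching rule (a sketch of the semantics, not the actual runtime code):

```ocaml
(* Toy model of POSIX unsetenv's name matching, illustrating the
   stray '=' bug above: an environment entry "NAME=value" is removed
   only if the part before its '=' equals the requested name. *)
let unsetenv_model name env =
  List.filter
    (fun entry ->
      match String.index_opt entry '=' with
      | Some i -> String.sub entry 0 i <> name
      | None -> true)
    env

let () =
  let env = ["__OCAML_EXEC_FD=5"; "PATH=/usr/bin"] in
  (* Correct name: the variable is removed. *)
  assert (unsetenv_model "__OCAML_EXEC_FD" env = ["PATH=/usr/bin"]);
  (* Name with a stray '=': nothing matches, cleanup silently fails. *)
  assert (unsetenv_model "__OCAML_EXEC_FD=" env = env);
  print_endline "stray '=' leaves the variable behind"
```

(Real `unsetenv` implementations typically go further and reject a name containing `=` with `EINVAL`, which is equally silent if the return code is ignored.)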
+24
dra/blog_week-that-was_2025_05_02_wtw-18.json
···+"summary": "Don\u2019t let the road to perfect intentions be the enemy of the good. Or some other mixed metaphor. Anyhow, an attempt at musing on the week so that musing on musings may be slightly easier.",+"content": "<p>Don\u2019t let the road to perfect intentions be the enemy of the good. Or some other\nmixed metaphor. Anyhow, an attempt at musing on the week so that musing on\nmusings may be slightly easier.</p>\n\n<p>Hacked around with the opam packaging of OxCaml at the weekend in <a href=\"https://github.com/janestreet/opam-repository/tree/3cb7f5ee49e3be100d322e4dd9be18aab28dd3e8\">janestreet/opam-repository#with-extensions</a>\nto try to get this to work with <code>dune pkg</code>. Quick update to my opam repository\nfor getting <a href=\"https://preview.dune.build/\">Dune Developer Preview</a> without\nbinaries (<a href=\"https://github.com/dra27/opam-repository/commits/e504b2c1857ec9e68b6ece7aaff95c7c6728d2da\">dra27/opam-repository#dune-dp</a>),\nand a thought that automating that would be an entertaining thing for Claude.\nGot it working (cf. <a href=\"https://github.com/ocaml/dune/issues/11652#issuecomment-2833502572\">ocaml/dune#11652 (comment)</a>).\nSomething very strange going on with the build time, as it takes twice the time\nto build the first time in <code>dune pkg</code> as it does to make an opam switch\n(something very strange meaning that <a href=\"https://github.com/dra27/opam-repository/commit/1f9445b0e8abd8b638260863e80e591548dc420b\">this hack</a>\nreduced the build time\u2026).</p>\n\n<p>We played <a href=\"https://boardgamegeek.com/boardgame/256680/return-to-dark-tower\">RTDT</a>\nfor the first time in a few weeks. Undaunted Aegis and Devious Swindler vs\nUtuk-ku with Covenant on gritty. Fun adversary quest laying various \u201cambush\u201d\ntokens on the board in order to counter unimprovable cards against the\nadversary. 
We just won (that\u2019s feeling like a pattern with Covenant on gritty:\nnot quite as \u201cOMG, we\u2019re all gonna die\u201d as Alliances on gritty).\nEntertainingly we ended with the maximum 5 corruptions between us.</p>\n\n<p>Week itself was frustratingly treacle-ish. Cursing towards getting the PR for\nthe test harness for Relocatable OCaml opened, to finally get this off my desk.\nChallenge is to tame ~3500 lines of OCaml into something that\u2019s vaguely\nexplainable and definitely maintainable and without completely breaking all the\nbranches which make changes to it. First baby step: break it up, but with a\nlittle help from the OCaml module system so that rebasing the dozen or so things\nwhich sit on top of it doesn\u2019t become impossibly difficult to resolve. Lots of:</p>\n<div><div><pre><code><span>module</span> <span>TestRelocation</span> <span>:</span> <span>sig</span>\n<span>(* Stuff that will eventually be in testRelocation.mli *)</span>\n<span>end</span> <span>=</span> <span>struct</span>\n<span>(* Stuff that will eventually be in testRelocation.ml *)</span>\n<span>end</span>\n</code></pre></div></div>\n<p>Got that all reconstituted and not failing late Sunday night. Having broken it\nup, next challenge to be able to explain it all. The tests themselves aren\u2019t too\nbad (had to comment all that, or I didn\u2019t understand it\u2026), but the support\nfunctions were an unadulterated mess. They\u2019re not now (hopefully). And another\nyay for labelled tuples (although there were a few places where I\u2019d got a bit\ntoo over-excited and went back to inline records\u2026). Anyway, it\u2019s all taken way\ntoo long, but that branch is finally ready to write up and open \ud83e\udd75</p>\n\n<p>Circles (or possibly whack-a-mole) continued in opam world, but hopefully now\nresolved. When we\u2019ve once-and-for-all solved universal package management, we\u2019ll\nhave the right story in OCaml for dealing with system compilers. 
Various\nsolutions were being juggled around\u2026 fortunately, it looks as though with some\nsleight of hand, a correct alignment of planets, and a little tweak in the\nrepository, we should be able to have it that new users stop getting system\ncompilers they didn\u2019t expect and landing into problems without breaking advanced\nusers in the process. TL;DR In opam 2.4, if you want a system compiler, you need\nto request <code>ocaml-system</code> explicitly (which is a good thing); if you do any of\n<code>opam switch create 5.3.0</code>, <code>opam switch create .</code>,\n<code>opam switch create foo ocaml</code>, <code>opam switch create ocaml.5.3.0</code> and so forth,\nyou will hopefully end up with a compiler built from source which, unless you\n<strong>REALLY</strong> know what you\u2019re doing, is what you need.</p>\n\n<p>Planetary Computing Group Wednesday lunches resumed, more political this week,\nthan necessarily technical (but then there\u2019s a lot of politics going on \ud83d\ude14).\nSlotted from that to an OCaml triage meeting (45 minutes of gold-dust time every\nfortnight which hopefully nudge a few things forward, help a few of us core devs\nvaguely stay on top of the issue tracker, and mean that we don\u2019t have to go\nthrough hundreds and hundreds of issues at the bigger core dev meetings). Dashed\nfrom that to the station to get to London. Trains messing up in both directions.\nAh well\u2026</p>\n\n<p>Real life collided with everything else for Thursday, which messed up getting\nto, well, anything. In the spirit of Flanders and Swann\u2019s <em>The Gasman Cometh</em>,\nwe learned that gas boilers don\u2019t ignite when the gas meter runs out of battery,\nas it locks the supply shut instead of open at that point: and it\u2019s not regarded\nas a user diagnosable fault! 
We also learned that induction hobs sometimes get\nupset when asked to heat things\u2026</p>\n\n<p>So, not a particularly wonderful week, although with a new toy having arrived\ntoday, perhaps some tinkering to be done for a change\u2026</p>\n\n<p><img alt=\"What to do with 7 DisplayPort sockets?\" src=\"https://www.dra27.uk/assets/2025-05-02/2025-05-02-precision.jpg\"></p>",
+24
dra/blog_week-that-was_2025_05_09_wtw-19.json
+24
dra/blog_week-that-was_2025_05_09_wtw-19.json
···+"content": "<p>Still \ud83e\ude80ing somewhat, but various nice things happened this week.</p>\n\n<p><a href=\"https://en.wikipedia.org/wiki/Star_Wars_Day\">Star Wars Day</a> marked the opening,\nfinally, of the test harness for Relocatable OCaml in <a href=\"https://github.com/ocaml/ocaml/pull/14014\">ocaml/ocaml#14014</a>,\nalong with a smaller PR with various bits of CI nonsense (<a href=\"https://github.com/ocaml/ocaml/pull/14013\">ocaml/ocaml#14013</a>).\nThat got merged fairly swiftly (thanks Antonin!). Chipping away at getting the\nthree main PRs finally ready to be opened, but that can\u2019t actually happen until\nthe test harness is reviewed and in\u2026</p>\n\n<p>Still in OCaml-land, <a href=\"https://github.com/ocaml/flexdll\">FlexDLL</a> had accumulated\nquite a collection of fixes, and having got the <a href=\"https://github.com/ocaml/flexdll/pull/158\">last one merged</a>,\nI figured it was high time for <a href=\"https://discuss.ocaml.org/t/flexdll-0-44-released/16614\">a release</a>.</p>\n\n<p>Changed tack (finally) and had some fun playing with <a href=\"https://www.dra27.uk/blog/platform/2025/05/07/oxcaml-toes.html\">OxCaml</a>,\nversus just getting it packaged and installable. By sheer coincidence, then met\nup with the \u201cCambridge\u201d Jane Street trio (Dolan-Barnes-Shinwell), who were\nmarking the rollout of \u201cruntime5\u201d (i.e. OCaml 5.2) at JS with a <a href=\"https://www.the-geldart.co.uk/\">little pub\nouting</a>.</p>\n\n<p>I finally watched the <a href=\"https://www.youtube.com/watch?v=gSKTfG1GXYQ\">entire talk</a>\nI\u2019d been encouraging many people to watch for several months (I had skimmed it\nbefore!!). It\u2019s bittersweet for me: quite a few of the tricks here are things\nI\u2019ve advocated for a <em>long</em> time in opam, but it\u2019s very cool to have another\nexample to point at. 
I got nerd-sniped by a couple of things in the talk, and\nwas hoping to be able to see if there were some possible OxCaml ideas - however,\non this occasion it turned out that there were some easy victories to be scored\n(see <a href=\"https://github.com/ocaml/opam/pull/6515\">ocaml/opam#6515</a>; I may have\naccidentally launched a kernel build with <code>make -j</code> and no number, although\nhopefully my laptop will survive). Anyway, pretty cool to get <code>opam show dune</code>\nwhich takes about 1s on my laptop to display anything down to 140ms with only a\ntrain journey\u2019s-and-a-bit of merciless hacking.</p>\n\n<p>Lots of musings around <a href=\"https://github.com/astral-sh/uv\">uv</a> and discussions\nwith <a href=\"https://patrick.sirref.org/index/index.xml\">Patrick</a> and <a href=\"https://ryan.freumh.org/\">Ryan</a>.\nAlready toying with the idea of validating <a href=\"https://www.tunbury.org/\">Mark\u2019s</a>\nbulk-builder work (that\u2019s in use already on the pipelines for the\n<a href=\"https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html\">OCaml docs CI</a>)\nby plugging it into an experimental Dune version. Now toying with whether it\nwould not be too crazy to put an experimental tool together instead (there\u2019s I\nthink still a screaming ecosystem gap in OCaml for <code>uvx</code> or <code>uv run</code> - neither\nidea\u2019s original to <code>uv</code>, but putting them under the one roof, cargo-style, looks\nkinda awesome). But there\u2019s always the screaming sound of <a href=\"https://xkcd.com/927/\">xkcd#927</a>.</p>",
+24
dra/blog_week-that-was_2025_05_18_wtw-20.json
+24
dra/blog_week-that-was_2025_05_18_wtw-20.json
···+"summary": "This week consisted of a lot of spinning plates, which is unfortunate because it\u2019s not something I\u2019m very good at!",+"content": "<p>This week consisted of a lot of spinning plates, which is unfortunate because\nit\u2019s not something I\u2019m very good at!</p>\n\n<p><a href=\"https://ryan.freumh.org/\">Ryan</a> and I spent some time investigating being able\nto get opam packages to emerge via <code>pipx</code> (and therefore, <code>uvx</code>). Idea here is\nto be able to consume an OCaml application from a Python ecosystem (i.e. the\nfact it\u2019s OCaml is probably unimportant to the person invoking it). Requires\nquite a few layers on the Python infra side - we\u2019re meeting in the middle using\n<a href=\"https://pypi.org/project/scikit-build-core/\">scikit-build-core</a> on the Python\nside to give us the ability to invoke stuff on the OCaml-side. Pulls in some of\nour cross-ecosystem encoding work from last year as well. More to go, and also\ninterested to nudge this from the other direction - opening up the possibility\nof consuming OCaml applications this way becomes even more interesting if the\nOCaml ecosystem also encourages them to be packaged this way (i.e.\nopam-repository is mostly libraries, not applications\u2026).</p>\n\n<p>This all sparked off more discussions with <a href=\"https://patrick.sirref.org/index/index.xml\">Patrick</a>\nand <a href=\"https://ryan.freumh.org/\">Ryan</a> on the formalism in our package management\npaper, but <a href=\"https://ryan.freumh.org/2025-05-12.html#update-the-package-management-paper-for-arxiv-publication\">Ryan wrote that up!</a></p>\n\n<p>The Relocatable OCaml spinning plate got some updates, too: <a href=\"https://github.com/ocaml/ocaml/pull/13728\">ocaml/ocaml#13728</a>\ngot merged, which allowed <a href=\"https://github.com/ocaml/ocaml/pull/14014\">ocaml/ocaml#14014</a>\nto be updated to remove it. 
That PR had some helpful review feedback and, while\npoking another of the branches, found a minor bug in it! While trying to put together\na coherent explanation for the second of the \u201cbig\u201d PRs, found a(nother) design\nflaw. There\u2019s a bigger post to come about the history of this change, but\nfortunately as with the previous issues, it\u2019s more that the \u201ccomplicated\u201d\napproach needed in one place is also needed in another. I\u2019ve found the bugs in\nthis branch have all meant resurrecting previous commits which I\u2019d thought were\novercomplicating things, rather than actually having to write new stuff. Anyhow,\nhaving fixed that, I managed to consolidate an essay at <a href=\"https://github.com/dra27/ocaml/pull/162\">dra27/ocaml#162</a>\nand the gory details of what should now be the final approach for this are in\nthe \u201cTechnical background\u201d fold on that PR! At some point in the coming weeks\nI\u2019ll try to add the history behind getting to that here, if only so I don\u2019t\nforget it!</p>\n\n<p>Incidentally, there\u2019s a plea gone out from my core maintainer colleagues for\nanyone who\u2019d like to have a go at reviewing these things to have a look (see\n<a href=\"https://discuss.ocaml.org/t/volunteers-to-review-the-relocatable-ocaml-work/16667\">this Discuss thread</a>).</p>\n\n<p>More whiteboarding with <a href=\"https://jon.recoil.org\">Jon</a> figuring out some build-\nrelated ideas behind his JavaScript toplevels (odoc-notebook). 
The whole thing\nbecomes cross-compilation on speed, but particularly interesting that we might\nbe able to get some OxCaml demos going with it, while temporarily keeping the\nmain parts of the compilation still in OCaml, avoiding problems with patches\nthat aren\u2019t yet available for OxCaml support (means you\u2019d be able to show\nOxCaml code, with some under-the-hood work in the equivalent OCaml compiler\ndoing the rendering heavy lifting for now).</p>\n\n<p>In the meantime, also putting together some ideas for an EoI for <a href=\"https://www.software.ac.uk/research-software-maintenance-fund/round-1\">RSMF</a>,\nwhich is all a bit new (the process is new; the ideas are fundamentally not, as\nthat\u2019s the point of the call!). Getting that fully tied up will be a chunk of\nnext week, along with getting various other things in line in order to be\nincognito the week following.</p>",
+24
dra/blog_week-that-was_2025_05_24_wtw-21.json
+24
dra/blog_week-that-was_2025_05_24_wtw-21.json
···+"summary": "This week was a grant application, build systems, and code review - which it turns out is somewhere in the Amazon.",+"content": "<p>This week was a grant application, build systems, and code review - which it\nturns out is <a href=\"https://what3words.com/grant.builds.review\">somewhere in the Amazon</a>.</p>\n\n<p>On holiday next week, so most of this week spent attempting to do 1.5-2x the\nwork in order to go on holiday (one day\u2026). Some minor diversions on the way.\nThe vagaries of opam-repository testing meant that an <a href=\"https://github.com/ocaml/opam-repository/pull/27839#issuecomment-2851180027\">unrelated PR</a>\nhighlighted that my solitary non-OCaml-compiler opam package <a href=\"https://ocaml.org/p/bitmasks/latest\">bitmasks</a>\nhad become bitrotten since OCaml 5.1.0. One <a href=\"https://github.com/metastack/bitmasks/pull/7\"><code>to_list</code></a>\nfunction later, and <a href=\"https://github.com/ocaml/opam-repository/pull/27899\">1.5.0</a>\nwas born, for your representing-integer-masks-as-sets needs (I wrote the library\nyears ago for use in a never-released set of ODBC bindings, as I subsequently\ngot mildly distracted by opam and then the compiler).</p>\n\n<p>More fun from the trenches doing some routine work on OCaml\u2019s GitHub Actions\nworkflows to prepare for some slightly less routine Relocatable OCaml stuff. We\nstill maintain OCaml 4.14 while the 5.x releases converge (we\u2019re very nearly\nthere: my hunch is that we may decide after OCaml 5.5 that we\u2019re in a position\nto sunset 4.14, but we\u2019ll see). However, that means we have to sustain testing\ninfrastructure on a quite old branch and, well, continuous integration funnily\nenough has to be continuously maintained. Previously, what would happen is that\nwe\u2019d be attempting to backport something to 4.14, would discover CI was broken\nand then have to spend time fixing that before getting on with the work. 
I got\nfed up with this after <a href=\"https://github.com/ocaml/ocaml/pull/12520\">ocaml/ocaml#12520</a>\nand so did a bunch of work to synchronise all the branches (<a href=\"https://github.com/ocaml/ocaml/pull/12846\">ocaml/ocaml#12846</a>, \n<a href=\"https://github.com/ocaml/ocaml/pull/12847\">ocaml/ocaml#12847</a>, <a href=\"https://github.com/ocaml/ocaml/pull/12848\">ocaml/ocaml#12848</a>\nand <a href=\"https://github.com/ocaml/ocaml/pull/12849\">ocaml/ocaml#12849</a>). Not\nparticularly glamorous, but it means I can now periodically do:</p>\n\n<div><div><pre><code><span>$</span><span> </span>git log <span>--first-parent</span> <span>--oneline</span> upstream/trunk <span>--</span> .github tools/ci/actions\n</code></pre></div></div>\n\n<p>and get a nice list of recent PRs to go through and simply cherry-pick the ones\nwhich update the workflows - having got all the branches in sync, that tends to\nbe painless, and I got to a nice little sequence on <a href=\"https://github.com/dra27/ocaml/commits/4.14\">dra27/ocaml#4.14</a>.\nThe ulterior motive is that I particularly wanted the updates in <a href=\"https://github.com/ocaml/ocaml/pull/14013\">ocaml/ocaml#14013</a>\nto be able to get Relocatable OCaml back to 5.2 so that it can be rebased on to\nOxCaml. 
Took the customary amount of to-and-fro between my ridiculous\n<a href=\"https://github.com/dra27/relocatable/blob/main/stack\">re-stacking-and-backport-script</a>\nand CI, but I got <a href=\"https://github.com/dra27/ocaml/pull/169\">the 5.2 version</a>\npassing from the sunny hills of Wales only an hour or two into the holiday, and\nwhile everyone was distracted playing <a href=\"https://www.looneylabs.com/games/fluxx\">Fluxx</a>\n(which lasted a surprisingly long time, for anyone who\u2019s ever played it\u2026).</p>\n\n<p>Relocatable OCaml\u2019s test harness (<a href=\"https://github.com/ocaml/ocaml/pull/14014\">ocaml/ocaml#14014</a>)\nhad some very helpful reviews, and that\u2019s now updated and ready to merge. So,\nweek off and then hopefully full steam ahead with getting the third PR branch\ncompleted and, erm, some more reviewing \ud83e\udee3</p>",
+24
dra/blog_week-that-was_2025_07_06_wtw-27.json
+24
dra/blog_week-that-was_2025_07_06_wtw-27.json
···+"summary": "Rather varied week this week. A number of our EEG interns have started their work with us for the summer, with two nice projects falling under my direct supervision, with Lucas and Jeremy. It\u2019s great to get to watch people start their first forays into the world of hacking on OCaml, once the customary \u201cI was a baby when you started maintaining OCaml\u201d comments et al are out of the way \ud83d\ude02 It\u2019s also great to get to see the excitement, and reassuring to know that it is still an exciting thing to get to do for new people too!",+"content": "<p>Rather varied week this week. A number of our <a href=\"https://anil.recoil.org/notes/eeg-interns-2025\">EEG interns</a>\nhave started their work with us for the summer, with two <a href=\"https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler\">nice</a>\n<a href=\"https://anil.recoil.org/ideas/ocaml-bytecode-native-ffi\">projects</a> falling\nunder my direct supervision, with Lucas and Jeremy. It\u2019s great to get to watch\npeople start their first forays into the world of hacking <em>on</em> OCaml, once the\ncustomary \u201cI was a baby when you started maintaining OCaml\u201d comments et al are\nout of the way \ud83d\ude02 It\u2019s also great to get to see the excitement, and reassuring\nto know that it <em>is</em> still an exciting thing to get to do for new people too!</p>\n\n<p>The two projects are of particular interest to me. I\u2019ve poked (and supervised\nsome other poking) at various aspects related to OCaml\u2019s <code>Load_path</code>, which is a\nfairly innocuous-looking data structure at the heart of the compiler which is\nsimply responsible for mapping from names of files to locations based on the\nprovided <code>-I</code> search directories. 
As ever, a simple-sounding operation but with\nwide-reaching complexity and impact - it\u2019s an interesting piece of code to want\nto rip out and replace if you\u2019re writing a JavaScript toplevel, for example (no\nfile system\u2026); it\u2019s a remarkably hot piece of code if you suddenly find that\nyour file system is being slow (hello Windows, occasionally\u2026). First week on\nthis is mostly about settling in, becoming familiar with the vagaries of OCaml\u2019s\nbuild system and development workflow, but even in week 1 there\u2019s an unexpected\nnice piece of refactoring opening up. In <a href=\"https://github.com/ocaml/ocaml/pull/11198\">ocaml/ocaml#11198</a>,\nas part of OCaml 5.0, we finally moved the extra libraries to separate\ndirectories from the main Standard Library one but, to maintain compatibility,\nyou can still say <code>#load \"unix.cma\"</code> from the toplevel, etc. but you get an\nalert that you should add <code>#directory \"+unix\"</code> beforehand (and, one day, you\nmight <em>have</em> to). The code for that is a bit fiddly because the <code>Load_path</code> is\nfurther down the dependency graph from the modules responsible for displaying\nand processing alerts and warnings, so it had to be passed as a hook. It\u2019s a\nnice demonstration with effects that this warty bit of code becomes <em>naturally</em>\ncleaner, as the actual lookup of files takes place <em>much</em> higher up in <code>main.ml</code>\nwhere it\u2019s completely natural simply to display the alert. 
More exciting things\nto come with this.</p>\n\n<p>The other project extends work I\u2019ve poked at with changes like <a href=\"https://github.com/ocaml/ocaml/pull/13745\">ocaml/ocaml#13745</a>\nwhere we start to take advantage of recent changes in the way <code>Dynlink</code> is built\nthat mean it can be used for the main toplevel (a largely historical accident\nmeans that we at present have two almost-but-not-quite identical ways of loading\nbytecode into a running OCaml program\u2026). Being able to\n<code>#load \"my-numerical-library.cmxs\"</code> in the <em>Bytecode</em> toplevel gives us the best\nof both worlds, hopefully - we get the power of native code for the library we\u2019re\n<em>using</em> and the flexibility and compilation-speed of the bytecode interpreter\nfor writing and experimenting <em>around</em> that library. You can do that at present\nusing ocamlnat (the native OCaml toplevel) but its compilation speed is slow and\nother solutions such as the ocaml-jit project are not totally portable and not\nparticularly \u201cdrop-in\u201d. I\u2019m also really excited about the <em>converse</em> side of\nthis project - being able to run the <em>bytecode</em> interpreter in a <em>native</em>\nprogram. 
Add the compiler frontend into your program, and what you have at that\npoint is the ability to embed OCaml as a scripting language into any program as\ntrivially as you can embed Lua, JavaScript, etc\u2026 so we might start to be able\nto have a world where you can configure your complex application using actual\nOCaml scripts but without needing OCaml to be on your end-user\u2019s machine.\nNeedless to say, I have scheming ideas for how this might be highly useful in\nopam packaging one of these days\u2026</p>\n\n<p>While working on the ever-overdue Relocatable OCaml at the weekend (the last\nprerequisite PR got merged on Friday, with thanks to Damien and Nicol\u00e1s for the\nrubberstamp, and Antonin a while back for the deep-dive reviewing!), I\ndiscovered some broken stuff, following a rebase. Turns out it wasn\u2019t me, and\nI was able to open <a href=\"https://github.com/ocaml/ocaml/pull/14114\">ocaml/ocaml#14114</a>\nto fix the fault. Whilst checking that, I saw that the ppc64 port of OCaml\nappeared to be broken, but I just left that with a note on the PR. 
Some distant\ndebugging on Monday with me connected to one of our POWER9 machines and\n<a href=\"https://github.com/stedolan\">Stephen Dolan</a> suggesting tweaks to a broken test\nover Slack led us to <a href=\"https://github.com/ocaml/ocaml/pull/14116\">ocaml/ocaml#14116</a>\nand a particularly humorous mantra of Stephen\u2019s for investigating broken tests\nin OCaml:</p>\n<ol>\n <li>If it\u2019s running too slowly, try removing a zero from all constants in the\ntest</li>\n <li>If it\u2019s not working at all, try adding a zero to all constants in the test</li>\n</ol>\n\n<p>Works a charm, as you can see from the PR\u2026</p>\n\n<p>In between times, I managed to give a performance at <a href=\"https://www.medren2025.co.uk/concerts\">MedRen2025</a>\nin Newcastle, which has no connection to OCaml whatsoever, beyond the amusing\nobservation that it featured music written between c.1450 and 1528, which is\n<em>just</em> older than the opening sentence of my final-year undergraduate computer\nscience dissertation many years ago (which began, somewhat unusually,\n\u201cIn 1529, \u2026\u201d). We all managed not to get blown away at Fitzwilliam College for\nthe EEG Garden Party, and Relocatable OCaml became a little less far from\ncompletion, but that\u2019s for another post\u2026</p>",
+2
-2
dra/metadata.json
+2
-2
dra/metadata.json
+2
-2
eeg/metadata.json
+2
-2
eeg/metadata.json
+18
eeg/w_3exAV8tLbnPSGqoKv2mZts.json
+18
eeg/w_3exAV8tLbnPSGqoKv2mZts.json
···+"summary": "Full title: Towards Global Maps of Anthropogenic Threats to Biodiversity and Their Contributions to Species Extinctions Abstract: Species extinctions are primarily driven by loss of habitat, which is relatively easy to monitor by satellite remote...",+"content": "<p>Full title:<br>\nTowards Global Maps of Anthropogenic Threats to Biodiversity and Their Contributions to Species Extinctions</p>\n<p>Abstract:<br>\nSpecies extinctions are primarily driven by loss of habitat, which is relatively easy to monitor by satellite remote sensing; other anthropogenic threats to biodiversity, like hunting, are much more difficult to observe directly. My PhD project draws on local studies which capture the population effect of some anthropogenic threat, scaling these results using machine learning and remote sensing. In this talk, I will discuss my first attempt at this through quantifying species-specific responses to hunting pressure. I find that machine learning methods can offer marked improvements over (linear) statistical models, which are commonly used in ecology, but model validation must be done carefully to properly contextualise predictive performance. I will preview my plans for integrating these hunting pressure models with the LIFE biodiversity metric framework to express pressure in terms of extinction risk. If there is time, I will also discuss future plans for my PhD.</p>\n<p>Bio:<br>\nEmilio is a PhD student in the Department of Zoology at the\u00a0University of Cambridge\u00a0in the\u00a0Conservation Science Group\u00a0and the\u00a0Energy and Environment Group. He is supervised by\u00a0Andrew Balmford, with co-supervision from\u00a0Anil Madhavapeddy\u00a0and\u00a0Tom Swinfield. He is also part of the\u00a0AI for Environmental Risks Centre for Doctoral Training, a researcher at the\u00a0Cambridge Centre for Carbon Credits, and a member of\u00a0Churchill College. 
His research focuses on the uses of predictive modeling for biodiversity conservation, with an emphasis on quantifying species-specific responses to human disturbance.</p>",
+18
eeg/w_7XijwZ8ZtKtnLgRXKgt9G1.json
+18
eeg/w_7XijwZ8ZtKtnLgRXKgt9G1.json
···+"summary": "Abstract: In this talk, first, some special challenges in cyber-physical energy systems will be reflected on. Then, examples from research projects and field tests will be discussed to show how multi-agent systems can be used to tackle these chall...",+"content": "<p>Abstract:</p>\n<p>In this talk, first, some special challenges in cyber-physical energy systems will be reflected on. Then, examples from research projects and field tests will be discussed to show how multi-agent systems can be used to tackle these challenges. Finally, the topic of research data management and its role in open research will be discussed.</p>\n<p>Bio:</p>\n<p>Prof. Dr.-Ing. Astrid Nie\u00dfe has been Professor for Digitalized Energy Systems at the University of Oldenburg since 2020 and a member of the Energy Division Board of the OFFIS - Institute of Computer Science. From 2018 to 2020 she was Professor for Energy Informatics at Leibniz University Hannover.</p>\n<p>Astrid Nie\u00dfe received her doctorate from the University of Oldenburg in 2015; her doctoral thesis dealt with the application of distributed algorithms in the field of decentralized energy systems.<br>\nAstrid Nie\u00dfe studied computer science and biology at the University of Bremen and at the University of Oldenburg.</p>",
+18
eeg/w_7aqBd2Nn9E6QpMvnoBPxuQ.json
+18
eeg/w_7aqBd2Nn9E6QpMvnoBPxuQ.json
···+"summary": "Abstract: Estimating the geographical range of a species from sparse observations is a challenging and important geospatial prediction problem. Given a set of locations where a species has been observed, the goal is to build a model to predict whe...",+"content": "<p>Abstract:<br>\nEstimating the geographical range of a species from sparse observations is a challenging and important geospatial prediction problem. Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location. This problem has a long history in ecology, but traditional methods struggle to take advantage of emerging large-scale crowdsourced datasets which can include tens of millions of observations of hundreds of thousands of species in addition to the availability of multi-modal data sources such as paired images and natural language descriptions. In this talk, I will present recent work from my group where we have developed deep learning-based solutions for estimating species' ranges from sparse presence-only data. I will also discuss some of the open challenges that exist in this space.</p>\n<p>Bio:<br>\nOisin Mac Aodha is a Reader in Machine Learning in the School of Informatics at the University of Edinburgh. He is also an ELLIS Scholar and former Turing Fellow. He obtained his PhD from University College London and was a postdoc at Caltech prior to his current role. His current research interests are in the areas of self-supervised learning, 3D vision, fine-grained learning, and human-in-the-loop learning. In addition, he works on questions related to AI for conservation and biodiversity monitoring. More information can be found on his website: <a href=\"https://homepages.inf.ed.ac.uk/omacaod\">https://homepages.inf.ed.ac.uk/omacaod</a></p>",
+18
eeg/w_8PhivRm85jZuFg8v55yo7F.json
+18
eeg/w_8PhivRm85jZuFg8v55yo7F.json
···+"summary": "Full Title: Using Low-cost, Research-led, Decentralised Networks to Increase Access to High Quality Microspatial Data on Building Stocks, and the Built and Natural Infrastructure Abstract: The Colouring Cities Research Programme (CCRP) is overse...",+"content": "<p>Full Title:<br>\nUsing Low-cost, Research-led, Decentralised Networks to Increase Access to High Quality Microspatial Data on Building Stocks, and the Built and Natural Infrastructure</p>\n<p>Abstract:<br>\nThe Colouring Cities Research Programme (CCRP) is overseen by an informal, international academic consortium that uses its decentralized research-led network to co-create and manage permanent open data/visualisation platforms across countries. These provide standardised, open microspatial data on the characteristics, performance, and short/long-term dynamics of building stocks, and built and natural infrastructure. They also test feedback loops between live streaming, computational inference, and crowdsourcing approaches to improve coverage and reliability of data, and to support cross sector/multidisciplinary engagement. Research Institutions from 30 countries are currently involved. The programme has been set up to accelerate progress towards UN SDGs; to provide access to big data required to exploit the potential of AI & ML and gain insights at scale; to reduce research costs and overlaps and speed up testing of research applications by pooling expertise, funding and ideas; and to ensure that areas such as data standardisation, uncertainty in data, and citizen privacy and security are prioritised, as commercial demand for microspatial data grows.</p>\n<p>Bio:<br>\nPolly Hudson is a Senior Research Fellow at The Alan Turing Institute and PI for the Colouring Cities Research Programme. 
She was previously a Senior Research Fellow at the Centre for Advanced Spatial Analysis, University College London, and held a Visiting Fellowship at the Kellogg Centre for the Historic Environment, University of Oxford. Relevant appointments include advisory/board positions for the Department of Culture, Media and Sport, English Heritage, The Royal Institute of British Architects, and the National Lottery (charitable arm).</p>\n<p>Polly trained as an architectural historian, and initially worked in furniture making, historic building restoration, museum design, and community planning. In 1996 she set up the Building Exploratory charitable trust in London as a prototype for multidisciplinary knowledge sharing centres about local building stocks, which was co-built over 6 years by citizens, local and central government, industry, non-profits & academia. The first iteration of the Colouring Cities mapping platform interface was tested in 1998. Between 2014 and 2019 Polly received EPSRC funding from UCL to develop the CCRP concept within academia where she partnered with Tom Russell who built the back and front end for the London prototype & advised on open licences. Since 2020 her position as CCRP PI has been funded by The Alan Turing Institute.</p>",
+18
eeg/w_9CqWsuQQykVtbuDPwuLwZs.json
+18
eeg/w_9CqWsuQQykVtbuDPwuLwZs.json
···+"summary": "Abstract: Methods developed by the computer graphics community allow for the photorealistic rendering of complex geometry. In this talk we explore how such mathematical procedures can be leveraged to describe the growth, biomechanics, and combu...",+"content": "<p>Abstract:</p>\n<p>Methods developed by the computer graphics community allow for the photorealistic rendering of complex geometry. In this talk we explore how such mathematical procedures can be leveraged to describe the growth, biomechanics, and combustion of trees at a detailed spatial level. These models facilitate a realistic 3D visualization of these processes at forest scale which allows exploring illustratively a variety of hypothetical environmental scenarios. Potential applications of such methods include the educational dissemination of environmental concepts, the generation of synthetic image data for training vision-based AI models, and the evaluation of ecological hypotheses expressed at plant organ scale.</p>\n<p>Bio:</p>\n<p>Wojtek Palubicki is a Professor at Adam Mickiewicz University where he leads the Natural Phenomena Modelling Group. The research group uses methods from computer graphics and AI to describe and investigate natural pattern genesis. Before that he held a post-doctoral research scientist post at the SLCU developing mathematical models of plant developmental biology.</p>",
+18
eeg/w_9hADtA5Fov2vdDt9iNVjJQ.json
+18
eeg/w_9hADtA5Fov2vdDt9iNVjJQ.json
···+"summary": "Abstract: Crop farming is essential in our society, providing food, feed, fiber, and fuel. We heavily rely on crop production, but at the same time, we need\u00a0to reduce the production footprint. We aim to address this key challenge by investigating ...",+"content": "<p>Abstract:<br>\nCrop farming is essential in our society, providing food, feed, fiber, and fuel. We heavily rely on crop production, but at the same time, we need\u00a0to reduce the production footprint. We aim to address this key challenge by investigating new solutions to produce crops more sustainably. We\u00a0study novel technology-driven approaches to move toward sustainable crop production. Agricultural robots offer promising directions to\u00a0address management challenges in agricultural fields or support plant breeding efforts through large-scale trait acquisition. For that, field\u00a0robots need the ability to perceive and model their environment, predict possible future developments, and make appropriate decisions in\u00a0complex and changing situations. This talk will showcase our recent developments in robotics for crop production, incorporating machine\u00a0learning to support farmers in operating more sustainably and reducing some negative impacts on the ecosystem.</p>\n<p>Bio:<br>\nCyrill Stachniss is a full professor at the University of Bonn and heads the Photogrammetry and Robotics Lab. He is also a Visiting Professor in Engineering at the University of Oxford and is with the Lamarr Institute for Machine Learning and Artificial Intelligence. Before his appointment in Bonn, he was with the University of Freiburg and ETH Zurich. Since 2010, he has been a Microsoft Research Faculty Fellow and received the IEEE RAS Early Career Award in 2013. From 2015 to 2019, he was senior editor for the IEEE Robotics and Automation Letters. He is the spokesperson of the DFG Cluster of Excellence \"PhenoRob\" at the University of Bonn, together with his colleague Heiner Kuhlmann. 
His research focuses on probabilistic techniques as well as learning approaches for mobile robotics, perception, and navigation. The main application areas of his research are autonomous service robots, agricultural robotics, and self-driving cars. He has co-authored over 300 publications and has coordinated multiple large-scale research projects on the national and European levels. Besides his university involvement, he cofounded three startups: Escarda Technologies, DeepUp, and PhenoInspect.</p>",
+18
eeg/w_dFShkouits1FFyUctiSSH5.json
···+"summary": "Frank Feng is a first-year Ph.D. student in the Department of Computer Science and Technology at the University of Cambridge. His research interests lie at the intersection of machine learning and earth sciences, with a particular focus on the app...",+"content": "<p>Frank Feng is a first-year Ph.D. student in the Department of Computer Science and Technology at the University of Cambridge. His research interests lie at the intersection of machine learning and earth sciences, with a particular focus on the application of self-supervised learning in remote sensing.</p>",
+18
eeg/w_dwMbyPnsrcBXtTrUKGGVis.json
···+"title": "Modelling Building Thermal Dynamics \u2013 From Data Generation to Transfer Learning",+"summary": "Abstract: Building operations contribute approximately one-third of global CO\u2082 emissions. Advanced control strategies can reduce these emissions by up to 30%. Such control requires accurate mathematical models that capture the building\u2019s thermal d...",+"content": "<p>Abstract:<br>\nBuilding operations contribute approximately one-third of global CO\u2082 emissions. Advanced control strategies can reduce these emissions by up to 30%. Such control requires accurate mathematical models that capture the building\u2019s thermal dynamics. Data-driven modeling has emerged as the most scalable approach for this purpose. However, the availability of high-quality building data remains limited. To address this challenge, we propose two methods: (1) a data generation framework that synthesizes realistic building operation data, and (2) a general Transfer Learning model that serves as an effective initialization for modeling new target buildings.</p>\n<p>Bio:<br>\nFabian is a second-year PhD student in the Department of Energy Management Technologies at the Technical University of Munich, supervised by Prof. Dr. Christoph Goebel. His research focuses on using Machine Learning to model building thermal dynamics. Such models are necessary for enabling Model Predictive Control of the building, which can reduce CO\u2082 emissions by up to 30%.</p>",
+18
eeg/w_f1Uxw34FRLEfVNBBpzbsgD.json
···+"summary": "Full Title: Democratizing Carbon Markets: A Blockchain-Based Emission Trading System for Small and Large-Scale Stakeholders in Brazil Abstract: The integration of blockchain technology into carbon markets offers a unique opportunity to create mor...",+"content": "<p>Full Title:<br>\nDemocratizing Carbon Markets: A Blockchain-Based Emission Trading System for Small and Large-Scale Stakeholders in Brazil</p>\n<p>Abstract:<br>\nThe integration of blockchain technology into carbon markets offers a unique opportunity to create more transparent, inclusive, and efficient trading mechanisms. This presentation introduces a novel Blockchain Emission Trading System (BETS) model designed to align with Brazil\u2019s new carbon market legislation (Law 15042/2024), ensuring that both large landholders and small rural producers can participate fairly. Our approach leverages official land registries, such as SICAR, to create spatially and temporally verifiable carbon credits, preventing fraud and double counting while enabling greater accessibility for smaller stakeholders who often struggle to enter regulated markets. By decentralizing the issuance and trading of carbon credits, our model aims to reduce intermediaries, lower costs, and promote broader participation, ultimately fostering a more equitable environmental and economic transition. Through a systematic mapping study, we identify key challenges and research directions for blockchain-based carbon markets and propose a framework that ensures compliance with national and international standards while prioritizing social and economic inclusivity.</p>\n<p>Bio:<br>\nJean is a professor at the Federal University of Santa Catarina (UFSC) in Brazil, specializing in information security, blockchain technology, and electronic documents. He holds a PhD in Computer Science from the University of Cambridge, where his research focused on cryptographic protocols and secure execution of code. 
Over the years, he has worked extensively on the development of blockchain-based solutions, particularly in the areas of digital identity, electronic signatures, and regulatory compliance. His recent work explores the use of blockchain to improve transparency, security, and inclusivity in digital ecosystems, including its application in carbon markets and sustainable finance.</p>",
+18
eeg/w_feDup1JutmgQkC6ipGF9r5.json
···+"summary": "Abstract: This talk describes some results from a collaboration between Computer Science, Physics, and Climate Impact Research on theories and tools for performance optimisation of strongly coupled physical systems with a large parameter space....",+"content": "<p>Abstract:</p>\n<p>This talk describes some results from a collaboration between Computer Science, Physics, and Climate Impact Research on theories and tools for performance optimisation of strongly coupled physical systems with a large parameter space. The first part of the talk discusses computing optimal policies; we have used these techniques for climate decisions and for fusion energy designs. The second part of the talk will focus on one particularly important concept: the Pareto-front, which mathematically captures the trade-offs between two (or more) conflicting objectives. The core object of study is an expensive black-box function computing multiple objectives, for which we approximate the Pareto front using adaptive mesh refinement.</p>\n<p>Bio:</p>\n<p>Patrik Jansson is a professor in the Computer Science and Engineering Department, joint between Chalmers University of Technology and the University of Gothenburg, Sweden. His main research areas are Programming Languages, Functional Programming, Domain-Specific Languages, and their application to climate, physics, etc. His research focus is on systems for constructing correct and reusable software. The goal is to develop the programming languages of the future and theories, tests and proofs of the correctness of high-level models of complex systems. Important techniques include functional programming, domain-specific languages and type theory. 
Examples of applications are climate impact research, physics, and language technology but many results are also curiosity-driven basic research with generic applicability in most areas.</p>\n<p>Patrik has been on sabbatical in Oxford, as a Visiting Fellow of Kellogg College for Michaelmas term 2024, visiting Prof Jeremy Gibbons.</p>",
+18
eeg/w_gohsjWasx7SGdbCyiDhMyR.json
···+"summary": "Srinivasan Keshav is the Robert Sansom Professor of Computer Science at the University of Cambridge, focusing on the intersection of computer science and sustainability. He earned his PhD from UC Berkeley and has held roles at Bell Labs, Cornell U...",+"content": "<p>Srinivasan Keshav is the Robert Sansom Professor of Computer Science at the University of Cambridge, focusing on the intersection of computer science and sustainability. He earned his PhD from UC Berkeley and has held roles at Bell Labs, Cornell University, and the University of Waterloo. A Fellow of the Royal Society of Canada, ACM, and IEEE, Keshav is recognized for his contributions to networking and sustainability. His research includes innovations in energy systems, carbon footprint reduction, and forest conservation using remote sensing. Keshav emphasizes practical applications of computer science to global challenges, fostering collaborative solutions in smart grids and biodiversity conservation.</p>",
+18
eeg/w_iSPamqxUdmP2CwNNdGyQSN.json
···+"summary": "Abstract: This talk discusses how market mechanisms and automated trading strategies can be used to control the flexible consumption and generation units of the community members in such a way that they make the best possible use of existing dist...",+"content": "<p>Abstract:<br>\nThis talk discusses how market mechanisms and automated trading strategies can be used to control the flexible consumption and generation units of the community members in such a way that they make the best possible use of existing distribution networks and support the network operator in avoiding and eliminating congestion situations. This ultimately helps avoid grid reinforcements, or makes it possible to provide a better service with the existing grid, keeping in mind that it takes much longer to reinforce the grid than to build and connect many new (fluctuating) decentralized renewable generators and new loads such as heat pumps and electric vehicles.</p>\n<p>Bio:<br>\nSince 2017: Professor of Control and Integration of Grids at INATECH; before: Professor for Energy Systems Technology and Energy Economics, in particular intelligent decentralized structures for sustainable power supply (Smart Grids), at Offenburg University of Applied Sciences; Fellow and head of the research project \u201cSmart Grids\u201d at the foundation neue verantwortung, Berlin; Senior Researcher and Project Manager in the research area \u201cFuture Energy Systems\u201d, SAP AG; Research Assistant at the University of Mannheim; Research Fellow at Iowa State University; Research Assistant at the University of Karlsruhe (TH)</p>",
+18
eeg/w_ijC1E36q7fn2qwxs7opSJq.json
···+"summary": "Grey literature\u2019s inherent nature means that it is a difficult form of media to discover, typically being hidden deep within websites, analyse, following no standard file formats or structures, and process, due to the sheer volume of existing and ...",+"content": "<p>Grey literature\u2019s inherent nature makes it a difficult form of media to discover (typically being hidden deep within websites), analyse (following no standard file formats or structures), and process (due to the sheer volume of existing and actively produced literature). This forms a massive cost and time problem for organisations that require such literature in their function.<br>\nWe devise and implement a pipeline that uses Common Crawl internet archives to locate & scrape potential grey literature; then process it for use in a multistage machine learning pipeline to classify and output relevant media.</p>\n<p>Bios:</p>\n<p>Shrey Biswas is a second-year Computer Science student at Pembroke College.<br>\nRadhika Iyer is a second-year Computer Science student at Murray Edwards College.<br>\nKacper Michalik is a second-year Computer Science student at Pembroke College.</p>",
+18
eeg/w_j2WWKaVRTKRwMWn4xCzoxK.json
···+"summary": "Abstract: I will present a short tutorial on some approaches to self-supervised learning (SSL), assuming no background in machine learning. If time permits, I will present examples of the use of SSL for problems in energy systems. Bio: Srinivasan...",+"content": "<p>Abstract:<br>\nI will present a short tutorial on some approaches to self-supervised learning (SSL), assuming no background in machine learning. If time permits, I will present examples of the use of SSL for problems in energy systems.</p>\n<p>Bio:<br>\nSrinivasan Keshav is the Robert Sansom Professor of Computer Science at the University of Cambridge, focusing on the intersection of computer science and sustainability. He earned his PhD from UC Berkeley and has held roles at Bell Labs, Cornell University, and the University of Waterloo. A Fellow of the Royal Society of Canada, ACM, and IEEE, Keshav is recognized for his contributions to networking and sustainability. His research includes innovations in energy systems, carbon footprint reduction, and forest conservation using remote sensing. Keshav emphasizes practical applications of computer science to global challenges, fostering collaborative solutions in smart grids and biodiversity conservation.</p>",
+18
eeg/w_oW6eqJBH1Hkwu6wE7XzQT3.json
···+"summary": "Abstract: Illegal wildlife trade is a key driver of biodiversity loss, but targeting policy to maximise disruption to trade remains a key challenge. A network approach was applied to seizure data to prioritise national action disrupting the illeg...",+"content": "<p>Abstract:</p>\n<p>Illegal wildlife trade is a key driver of biodiversity loss, but targeting policy to maximise disruption to trade remains a key challenge. A network approach was applied to seizure data to prioritise national action disrupting the illegal trade of elephant ivory. By simulating the removal of countries from trade, targeting groups of countries was found to be most effective due to network redundancy. Despite temporal variability, trade was highly concentrated and cessation in less than 10 countries would have disrupted 75% of trade in 2018-2020. These findings support evidence-based legislation and efficient allocation of conservation resources for tackling illegal wildlife trade.</p>\n<p>Bio:</p>\n<p>Jakob is a PhD student in the Conservation and Development Lab (Department of Geography). His research focusses on evaluating policy for sustainable land systems, supervised by Prof. Rachael Garrett and Prof. Srinivasan Keshav. This work is supported by the Centre for Doctoral Training on Artificial Intelligence applied to the study of Environmental Risk (AI4ER CDT). Before starting his PhD, Jakob completed an MRes with AI4ER in Environmental Data Science, where he collaborated with TRAFFIC to develop data-driven tools to inform international illegal wildlife trade policy. Previously, Jakob completed an undergraduate degree in Natural Sciences at the University of Cambridge, specialising in Plant Sciences, and contributed to research on metrics for biodiversity offsetting, novel approaches to wildlife monitoring and forest ecology.</p>",
+18
eeg/w_pMzCFQKTrRtQ6jotF1z12V.json
···+"summary": "Abstract: Comprehensive data on global biodiversity patterns is only obtainable through in-situ distributed sensor networks. However, these multi-device networks are constrained by battery lifetimes, must gather rich data from power-hungry sensor...",+"content": "<p>Abstract:<br>\nComprehensive data on global biodiversity patterns is only obtainable through in-situ distributed sensor networks. However, these multi-device networks are constrained by battery lifetimes, must gather rich data from power-hungry sensors, and yet must be deployed in remote environments for long periods. We look at the feasibility of a prototype multi-sensor device using on-device reinforcement learning for power management.</p>\n<p>Bio:<br>\nJosh Millar is a PhD student based at the NetSys Lab at Imperial-X.</p>\n<p>Their current research interests include:</p>\n<ul>\n<li>energy-aware ML</li>\n<li>IoT and on-device ML</li>\n<li>applied ML for sustainability</li>\n</ul>",
+18
eeg/w_pQBnfPWJi9kxLdeHY9YAA7.json
···+"summary": "Full Title: Partner-driven Environmental Sensing: Co-design with Indigenous Ojibwe Scientists and Malagasy Conservationists Abstract: Evolving environmental sensing technologies present a myriad of opportunities for gathering data to underst...",+"content": "<p>Full Title: Partner-driven Environmental Sensing: Co-design with Indigenous Ojibwe Scientists and Malagasy Conservationists</p>\n<p>Abstract:</p>\n<p>Evolving environmental sensing technologies present a myriad of opportunities for gathering data to understand and promote environmental justice, biodiversity, and climate change mitigation. However, technical development from academic and commercial settings often struggles to translate to accessible solutions for marginalized communities. In this talk, I will explore the opportunities of partner-driven co-design, share the findings from a qualitative study of field scientists\u2019 use of technology, and present two case studies: (1) designing environmental sensors with Indigenous Ojibwe scientists for manoomin (wild rice) conservation and (2) partnering with Malagasy conservation organizations to understand the role that technology can play in reforestation and biodiversity monitoring.</p>\n<p>Bio:</p>\n<p>Eric Greenlee (he/him) is a PhD student in the College of Computing at Georgia Tech, co-advised by Ellen Zegura and Josiah Hester. Conducting research at the intersection of the Computing and Society Lab and the Ka Moamoa Lab, Eric explores partner-driven processes with communities often cut out of technology development to co-create emergent environmental sensors to address challenges in environmental justice, biodiversity loss, and climate change mitigation. By leveraging qualitative methods, he aims to strengthen connections across traditional silos to design and deploy user-friendly, networked, and low-power embedded systems. Prior to pursuing his PhD, Eric worked as a Radio Frequency engineer for the U.S. 
Federal Government and studied electrical engineering at Dartmouth College.</p>",
+18
eeg/w_pxkLZ4jgVJMqjwZuhWicrK.json
···+"summary": "Abstract: The exponential growth of cloud computing has been a defining trend of our time, fueled by rapidly growing demands from data-intensive and machine learning workloads. Despite the end of Dennard scaling, the cloud's energy demand grew ...",+"content": "<p>Abstract:</p>\n<p>The exponential growth of cloud computing has been a defining trend of our time, fueled by rapidly growing demands from data-intensive and machine learning workloads. Despite the end of Dennard scaling, the cloud's energy demand grew more slowly than expected over the past decade due to the aggressive implementation of energy-efficiency optimizations. Unfortunately, there are few significant remaining optimization opportunities using traditional methods, and moving forward, the cloud's continued exponential growth will translate into rising energy demand, which, if left unchecked, will translate to increasing carbon emissions.</p>\n<p>In this talk, I will argue for a CarbonFirst approach to designing cloud computing systems by making carbon efficiency a first-class design metric, similar to traditional metrics of performance and reliability. I will explain how today's systems can be made first carbon-aware by exposing energy and carbon usage information to software platforms and then made carbon-efficient by providing control over the system's carbon usage. I will present an initial design of a system to enable such carbon awareness and management and present several application case studies on how modern cloud applications can employ these mechanisms to reduce their carbon footprint. I will end with open research challenges in the emerging field of computational decarbonization.</p>\n<p>Bio:</p>\n<p>Prashant Shenoy is currently a Distinguished Professor and Associate Dean in the College of Information and Computer Sciences at the University of Massachusetts Amherst. 
He received the B.Tech degree in Computer Science and Engineering from the Indian Institute of Technology, Bombay and the M.S and Ph.D degrees in Computer Science from the University of Texas, Austin. His research interests lie in distributed systems and networking, with a recent emphasis on cloud and sustainable computing. He has been the recipient of several best paper awards at leading conferences, including a Sigmetrics Test of Time Award. He is a fellow of the ACM, the IEEE, and the AAAS.</p>",
+18
eeg/w_tyPqbNvp3isgTDZVVoLFD1.json
···+"summary": "Abstract: This research introduces an AI-based alert system to reduce human-wildlife conflicts in the Romanian Carpathian Mountains. Globally, conflicts between people and wildlife are rising due to population growth, shifting land use patterns a...",+"content": "<p>Abstract:<br>\nThis research introduces an AI-based alert system to reduce human-wildlife conflicts in the Romanian Carpathian Mountains. Globally, conflicts between people and wildlife are rising due to population growth, shifting land use patterns and climate change. In Romania, mountain communities are impacted by bears and wild boars, which damage livestock, crops and property. These conflicts can undermine conservation efforts and may result in the killing of problematic animals. In collaboration with Funda\u021bia Conservation Carpathia, this research supports Rapid Intervention Teams who respond to wildlife activity in mountain villages. Six years of camera trap data are used to train and test AI models to detect and classify European mammals. These models are integrated into an alert system and deployed in three locations. The new pipeline improves on the state-of-the-art for detecting and classifying bears and wild boars. Preliminary results from the field deployment show a positive impact on conservation efforts. This is the first known study to use remote processing of 4G-enabled camera trap images to operate a human-wildlife conflict alert system, with potential wider applications as cellular connectivity expands to more remote locations.</p>\n<p>Bio:<br>\nTom is an MRes student on the AI for Environmental Risk Centre for Doctoral Training at the University of Cambridge. He previously spent 10 years working for the UK's Foreign, Commonwealth and Development Office, where he designed and managed sustainable development projects while on postings in DRC, Sierra Leone and Tanzania.</p>",
+18
eeg/w_uFyApvuvALLv66D7x36FEr.json
···+"summary": "Abstract: Energy systems are highly complex. State determination and detection of anomalies, faults or even attacks are only possible to a limited extent with traditional approaches. This talk will investigate how such systems can be planned an...",+"content": "<p>Abstract:</p>\n<p>Energy systems are highly complex. State determination and detection of anomalies, faults or even attacks are only possible to a limited extent with traditional approaches. This talk will investigate how such systems can be planned and operated in the future in the area of conflict between high automation and trust by human operators.</p>\n<p>Bio:</p>\n<p>Sebastian Lehnhoff is a Full Professor of Energy Informatics at the University of Oldenburg. He received his doctorate at the TU Dortmund University in 2009. Prof. Lehnhoff is chairman of the board of the OFFIS Institute for Information Technology and speaker of its Energy R&D division. He is a board member of the section \u201eEnergy Informatics\u201c within the German Informatics Society (GI) as well as an active member of numerous committees and working groups focusing on ICT in future Smart Grids. In 2022 he was appointed to the Board of Trustees of the Volkswagen Foundation (VolkswagenStiftung). He is the CTO of openKONSEQUENZ e.G. \u2013 a registered cooperative industry association for the development of modular Open-Source SCADA/EMS. He serves as Chairman of the Executive Board of the Energy Research Centre of Lower Saxony (EFZN) as well as an Executive Committee Member of the ACM Special Interest Group on Energy Systems and Informatics (SIGEnergy). Prof. Lehnhoff is a member of the German Academy of Science and Engineering (acatech) as well as a member of the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW).</p>",
+23
gabriel/hedgehogs_environment_conservation_compsci_2025_07_04_hedgehogs01.json
···+"id": "https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/04/hedgehogs01",+"link": "https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/04/hedgehogs01.html",+"content": "<h3>Brief Introduction</h3>\n\n<p>Hedgehogs are having a bit of a hard time in the UK. Once a common sight in gardens and parks, their numbers have plummeted in recent decades - in some regions by nearly 75%. Urban sprawl, fenced-in gardens, busy roads, and pesticide use have all made hedgehogs\u2019 lives more difficult. But understanding exactly <em>how</em> they move, migrate, and interact with their environments could be key to turning things around.</p>\n\n<p>That\u2019s what this project is about. This summer, I\u2019m building a high-resolution map of hedgehog habitats across the UK as part of an internship at the University of Cambridge, with the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">Energy and Environment Group</a> and the <a href=\"https://mail.cambridgeconservation.org\">Cambridge Conservation Initiative</a>. It will be a data-heavy effort to track and visualize hedgehog movements with as much spatial detail as possible. Once we\u2019ve nailed down the habitat mapping, we\u2019ll apply spatially explicit models to start making real predictions about where hedgehogs go, and why.</p>\n\n<p>Why does this matter? Well, if we can model where hedgehogs prefer to roam (and what\u2019s blocking them), we can help design more hedgehog-friendly spaces, in both rural and urban settings: better-connected green corridors, more wildlife-friendly gardens, and (most importantly) smarter conservation planning.</p>\n\n<p>An interesting analysis of the hedgehog decline can be found in the <a href=\"https://www.hedgehogstreet.org/wp-content/uploads/2024/10/Hedgehogs-in-Britain-threat-analysis-report.pdf\">Hedgehogs in Britain threat analysis report</a> from April 2023. 
Urban populations are showing <em>some</em> signs of recovery, but rural hedgehogs are still in sharp decline. And <a href=\"https://ptes.org/campaigns/hedgehogs/\">People\u2019s Trust for Endangered Species</a> continues to lead the charge in data collection and public awareness.</p>\n\n<h3>Some useful notes about hedgehogs</h3>\n\n<ol>\n <li>There\u2019s only one native hedgehog species in the UK - <em>Erinaceus europaeus</em>.</li>\n <li>They\u2019re crepuscular/nocturnal, roaming at night in search of food.</li>\n <li>Generally they prefer habitats with dense cover and abundant invertebrates.</li>\n <li>In the UK, they\u2019re found in almost all counties, except for a few islands.</li>\n <li>Urban hedgehogs:\n <ol>\n <li>Prefer gardens/backyards where people provide supplementary food (cat/dog food), leave out compost or decaying vegetation (insects).</li>\n <li>Avoid backyards where foxes and badgers are common.</li>\n </ol>\n </li>\n <li>Rural hedgehogs:\n <ol>\n <li>Prefer pasture, meadows, mixed grassland.</li>\n <li>Avoid large expanses of bare arable or dense woodland, and intensely farmed landscapes.</li>\n <li>In proximity to farm buildings, hedgehogs\u2019 nightly range shrinks (because of food supply).</li>\n </ol>\n </li>\n <li>Key micro-habitats:\n <ol>\n <li>Hedgerows, scrubs, bramble thickets.</li>\n <li>Most nests (both resting and hibernation nests) are built under thorny or dense plants \u2013 bramble, hawthorn, holly or nettles are common nesting sites. 
They also use woodpiles, compost heaps, thick ivy or abandoned mammal burrows as daytime refuges.</li>\n </ol>\n </li>\n <li>Diet:\n <ol>\n <li>Invertebrates.</li>\n <li>Occasionally: bird eggs, small vertebrates (frogs, lizards, baby rodents), carrion, and even fruit.</li>\n </ol>\n </li>\n <li>Travel patterns:\n <ol>\n <li>About 1-2 km per night.</li>\n <li>110-220 yards per hour\u2026</li>\n <li>Males typically have larger home ranges than females and roam more during the mating season (May to September).</li>\n <li>Proficient swimmers and climbers; they can cross streams or low fences.</li>\n </ol>\n </li>\n <li>Hibernation: November to March in the UK. Fattening up beforehand is important; skinny hedgehogs die.</li>\n <li>Seasonal breeding.</li>\n <li>Threats:\n <ol>\n <li>Habitat loss and fragmentation - buildings and agriculture.</li>\n <li>Predation and competition by badgers (also potentially foxes, owls, dogs, snakes).</li>\n <li>Road traffic: heavy casualties (10-20% of deaths are road kills).</li>\n <li>Poisons (pesticides - probably the only one that can be mapped in this context), parasites, diseases, improper diet.</li>\n </ol>\n </li>\n</ol>\n\n<p>There has also been a (surprisingly) large amount of hedgehog habitat research, often conducted in the UK, but generally focused on urban environments (and not much on rural). For instance, <em><a href=\"https://centaur.reading.ac.uk/104749/1/Gazzard2022_Article_Fine-scaleHabitatSelectionOfAS.pdf\">Fine\u2011scale habitat selection of a small mammalian urban adapter: the West European hedgehog</a></em> (Gazzard et al.) finds a) subtle differences between males\u2019 and females\u2019 behavior (e.g. in relation to house type, front vs. 
back gardens/yards), b) hedgehogs spent <em>\u201csignificantly more time in gardens where artificial food was provided, where a compost heap was present, if foxes (Vulpes vulpes) were infrequent visitors, if it rained overnight and as daylength increased (i.e., shorter nights); garden use was not significantly associated with variables potentially likely to reflect invertebrate prey abundance\u201d</em>, and c) hedgehogs visit <strong>many gardens</strong> over the span of one night (12-14).</p>\n\n<p><img alt=\"alt text\" src=\"https://gabrielmahler.org/assets/images/2025-07-04-gazzard1.jpg\" title=\"*[Fine\u2011scale habitat selection of a small mammalian urban adapter: the West European hedgehog](https://centaur.reading.ac.uk/104749/1/Gazzard2022_Article_Fine-scaleHabitatSelectionOfAS.pdf)* (Gazzard et al.)\"></p>\n\n<p><em>Gazzard et al. (2022) \u2013 <a href=\"https://centaur.reading.ac.uk/104749/1/Gazzard2022_Article_Fine-scaleHabitatSelectionOfAS.pdf\">Fine\u2011scale habitat selection of a small mammalian urban adapter</a></em></p>\n\n<p>A similar kind of analysis is provided in a few other papers, such as <em>Using citizen science to understand and map habitat suitability for a synurbic mammal in an urban landscape: the hedgehog Erinaceus europaeus</em> (Turner et al. 2021).</p>\n\n<ul>\n <li>Connectivity is important (juvenile Danish hedgehogs traverse a minimum of 10 gardens per day (!)).</li>\n <li>Negative factors: foxes, badgers, connectivity barriers (rivers, streams - generally <strong>lower presence around water</strong>).</li>\n</ul>\n\n<p><img alt=\"Using citizen science to understand and map habitat suitability for a synurbic mammal in an urban landscape: the hedgehog Erinaceus europaeus\" src=\"https://gabrielmahler.org/assets/images/2025-07-04-turner1.jpg\"></p>\n\n<p><em>Turner et al. 
(2021) \u2013 <a href=\"https://www.wildes-bayern.de/wp-content/uploads/2022/01/Turner-et-al-2022-Mammal-Review-2021-Turner-Using-citizen-science-to-understand-and-map-habitat-suitability-for-a-synurbic-mammal-in-an.pdf\">Using citizen science to understand and map habitat suitability for a synurbic mammal in an urban landscape: the hedgehog Erinaceus europaeus</a></em></p>\n\n<h3>Current state of data sources</h3>\n\n<h4>GPS Traces & Sightings</h4>\n\n<p><strong>Lauren Moore</strong></p>\n\n<p>So far, I have spent most of my time looking at GPS traces of nearly 80 hedgehogs provided by <a href=\"https://www.ntu.ac.uk/staff-profiles/animal-rural-environmental-sciences/lauren-moore\">Lauren Moore</a>. These were collected in the summers (or early autumns) of 2020 and 2021, in and around a few villages north east of Nottingham, spanning up to three weeks for some individuals.</p>\n\n<p>A few interesting observations:</p>\n\n<ol>\n <li>These hedgehogs demonstrated a strong aversion towards certain agricultural fields, while relishing others. I assume there could be a variety of factors behind this, particularly pesticides or the height/density of the crops during the given time of year and agriculture-related traffic. Nonetheless, confirming the general notions, they generally stuck around villages much more.</li>\n <li>Hedgehogs sleep a lot (over 16 hours on average), usually go to bed early in the morning (4 am), and mostly stick around one location. There were, HOWEVER, a <strong>few restless wanderers</strong> (all male), who always slept in different, fairly remote, locations. Those are quite distinct.</li>\n <li>Other than those few, hedgehogs seemed to stick around similar regions, particularly those living in the urban areas. How does it happen that some hedgehogs stayed in urban areas while others remained in rural ones? Was it because the villages became too full?</li>\n</ol>\n\n<p>There are, however, some complications/questions stemming from this dataset. 
Primarily: a) how will we account for seasonal changes? and b) we need to clarify how the tracked hedgehogs were selected, and adjust for any biases.</p>\n\n<p><img alt=\"\" src=\"https://gabrielmahler.org/assets/images/2025-07-04-screenshot1.jpg\"></p>\n\n<p><em>Sleep locations connected by lines of male (white) and female (blue) individuals</em></p>\n\n<p><strong>Hedgehog Street</strong></p>\n\n<p>Not sure how this would fit in, but the <a href=\"https://www.hedgehogstreet.org\">Hedgehog Street</a> initiative keeps track of a) hedgehog sightings and b) man-made hedgehog tunnels in fences. If they are so kind as to provide this data, perhaps it could be used for some validation. <strong>Working on getting the data from the organisation.</strong></p>\n\n<p><strong>Additional: NBN Atlas</strong></p>\n\n<p>The <a href=\"https://records.nbnatlas.org/occurrences/search?q=lsid%3ANBNSYS0000005078&fq=occurrence_status%3Apresent&fq=taxon_name%3A%22Erinaceus+europaeus%22&nbn_loading=true#tab_recordsView\">NBN Atlas</a> also provides a few datasets of hedgehog sightings. These vary in locations (entire UK, parts of Scotland) and time spans (some include sightings from the 1830s\u2026 haha). This is probably not very useful, at most for some very approximate verification.</p>\n\n<h4>Hedgerows, stonewalls, woodlands</h4>\n\n<p>Another interesting data source is a mapping of hedgerows, stonewalls and woodlands compiled by <a href=\"https://www.cfse.cam.ac.uk/directory/drew_purves\">Drew Purves</a> and Google DeepMind. This dataset provides super-high-resolution geospatial features that could, potentially, be useful for this project. 
I\u2019m worried the hedgerows and stonewalls may encompass a bit too much variability, but the woodlands will definitely be useful (I believe hedgehogs don\u2019t like woodlands).</p>\n\n<h4>OpenStreetMap</h4>\n\n<p>My <a href=\"https://gabrielmahler.org/walkability/compsci/2025/04/24/walkability-routing.html\">favorite</a> OpenStreetMap (OSM) could also be useful, although the obvious limitation in this case is the coverage and temporal accuracy in remote areas. Conceivably interesting geospatial features that could have a significant impact on hedgehog behavior will probably not be captured in rural regions, and so the impact of OpenStreetMap will probably not be as significant as it was for, for example, urban walkability. In reality (e.g. in the proximity of the Nottingham hedgehogs), OSM features tend to be extremely outdated, and mostly just contribute mapping of roads and buildings.</p>\n\n<h4>LIDAR</h4>\n\n<p>The UK has national LIDAR datasets, which provide high-resolution surface data, and can be segmented and features (e.g. brambles, bushes, etc.) inferred (I have used this in my <a href=\"https://gabrielmahler.org/walkability/compsci/2025/04/24/walkability-routing.html\">walkability project</a> to keep track of trees). However, as <a href=\"https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.plantsci.cam.ac.uk/directory/david-coomes&ved=2ahUKEwjVsKKvtKCOAxUCTkEAHfyyC-YQFnoECA4QAQ&usg=AOvVaw1e06nbscQcIaqTdVZIUaZP\">Prof Coomes</a> has recently pointed out, these LIDAR surveys are frequently collected in the winter, as the \u2018hard\u2019 surfaces are of more importance to the collectors than plants.</p>\n\n<p>I currently have somewhat older data (2012), and am working to get more recent coverage.</p>\n\n<h4>Department for Environment, Food & Rural Affairs (DEFRA)</h4>\n\n<p>LIDAR datasets are not the only relevant geospatial datasets provided by the UK government. 
DEFRA collects and sometimes publishes a wide variety of maps documenting features and land uses.</p>\n\n<p>For instance, <em><a href=\"https://defraenvironment.blog.gov.uk/2024/12/18/living-england-a-national-habitat-map-for-everyone/\">Living England</a></em> maps 16 different habitats. While the specificity of the different habitats is not amazing, it could definitely be used to eliminate obviously ineligible areas.</p>\n\n<p>Similarly, DEFRA should keep track of the \u2018stewardship scheme\u2019, which provides funding to farmers and land managers for environmental land management practices. I\u2019m not quite certain how popular this scheme is, and how big of an impact it could potentially have on my project.</p>\n\n<p>Another interesting source could be the <em><a href=\"https://www.data.gov.uk/dataset/952421ec-da63-4569-817d-4d6399df40a1/provisional-agricultural-land-classification-alc2#licence-info\">Provisional Agricultural Land Classification (ALC)</a></em>, which classifies land quality based on climate, soil, and site factors. Maybe, as an underlying map, it could contribute to better classification of habitats. Need to inquire with Natural England to obtain this.</p>\n\n<p>Similarly, there should be some datasets with accurate agricultural land use (which seems extremely relevant). 
Need to inquire further about that.</p>\n\n<h4>Data conclusion</h4>\n\n<p>To summarize, currently my compiled data collection consists of: a) the GPS traces from Lauren Moore (+ a dump of recent and historical hedgehog occurrences), b) hedgerows, stonewalls, and woodlands from Google DeepMind - highly accurate, highly useful, c) OSM data - mostly buildings and segments in rural contexts (where our GPS traces are also situated), d) an unsegmented LIDAR point cloud, and e) the <em>Living England</em> land-use dataset - accurate, maintained, somewhat coarse.</p>\n\n<p>There really is a lot; interested to see how this will evolve.</p>\n\n<h3>Models</h3>\n\n<p>Another area of work this week was researching existing approaches to modelling animal movement.</p>\n\n<p>There are a few long-standing candidates:</p>\n\n<ul>\n <li>circuit theory - treats the landscape as an electrical circuit where resistance = inverse habitat permeability. Seems useful for identifying pinch-points, but lacks more complex mechanisms/objectives.</li>\n <li>Step-Selection and Resource-Selection (SSF/RSF) - relate animal locations or steps to habitat covariates. For example, SSFs compare used vs available steps (movement segments) in relation to vegetation and barriers. Resource-selection functions (RSFs) compare used points to random points.</li>\n <li>Spatially-Explicit Population Models (SEPMs) - integrate movement with demography (births, deaths) across a landscape. They simulate population dynamics in space (often as meta-populations or dynamic ABMs). Reviews note SEPMs sit at the high-complexity end, linking population processes to landscape structure.</li>\n</ul>\n\n<p><strong>Spatial Absorbing Markov Chains (SAMC)</strong> - the model used in <em><a href=\"https://onlinelibrary.wiley.com/doi/epdf/10.1111/ele.13333\">Towards a unified framework for connectivity that disentangles movement and mortality in space and time</a></em>, Fletcher et al. 
(2019).</p>\n\n<ul>\n <li>step up from circuit theory - probability: risk & resistance</li>\n <li>short/long-term predictions for connectivity in landscapes</li>\n <li>absorption = e.g. mortality \u2192 key addition (<strong>how necessary is it for hedgehogs and this project? shouldn\u2019t the model be geared primarily towards habitat preferences?</strong>)</li>\n <li>all: a) short-/long-term predictions of connectivity; b) incorporate population distribution and abundance into predictions of connectivity; c) quantify demographic parameters related to connectivity</li>\n <li>missing directional preferences; not sure how well attraction towards features/shelter/food/layer would work</li>\n</ul>\n\n<p><a href=\"https://cran.r-project.org/web/packages/samc/samc.pdf\">This R package</a> implements spatial absorbing Markov chains. I started getting familiar with it (and with R\u2026), but should probably figure out the data before sinking hours into implementing stuff.</p>\n\n<p><strong>However,</strong> besides SAMC, and some data-hungry DL models (e.g. recurrent neural networks for GPS time series, used e.g. in <a href=\"https://www.mdpi.com/1424-8220/19/20/4411\">Rew et al. (2019)</a>), there generally seem to be a few other options:</p>\n\n<ul>\n <li>Other hidden Markov model-based approaches: for instance, <a href=\"https://cran.r-project.org/web/packages/moveHMM/vignettes/moveHMM-guide.pdf\">moveHMM</a> seems to have been quite popular, particularly for its ability to model behavioural states.</li>\n <li><strong>step selection functions (SSFs):</strong> estimate selection of resources available at each observed step or location based on habitat covariates and movement constraints. There are many versions of SSFs, but this generally seems like a good direction. 
A more comprehensive and comparative overview is in <a href=\"https://movementecologyjournal.biomedcentral.com/articles/10.1186/s40462-025-00549-2#:~:text=this%20paper%2C%20we%20describe%20and,Through%20our%20case\">Florko et al.</a></li>\n</ul>\n\n<p>Other, non-obvious solutions? Some path-finding algorithms?</p>\n\n<h4>Next week</h4>\n<p>Hopefully it\u2019s not premature, but perhaps next week I shall start experimenting with some of the models on the data I\u2019ve gathered. I also want to slightly rewrite my walkability thesis and hopefully publish it on arXiv at some point.</p>",
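To get a feel for what "absorption" buys over plain circuit theory, here is a minimal, pure-Python sketch of an absorbing Markov chain on a toy landscape. The two transient habitat cells (A, B), the two absorbing states (reaching a park vs. mortality), and all transition probabilities are invented for illustration; a real SAMC analysis would work from resistance/mortality rasters instead.

```python
# Toy spatial absorbing Markov chain: probability of reaching a park
# before dying, starting from each transient habitat cell.
# All numbers below are invented for illustration.
Q = {"A": {"A": 0.0, "B": 0.6},   # transient -> transient transitions
     "B": {"A": 0.3, "B": 0.0}}
R_park = {"A": 0.2, "B": 0.5}     # transient -> absorbed in "park"
# (remaining probability mass in each row is absorption by mortality)

# Absorption probabilities satisfy b = Q b + R_park; solve by fixed-point
# iteration, which converges because every state leaks probability into
# the absorbing states (spectral radius of Q is below 1).
b = {"A": 0.0, "B": 0.0}
for _ in range(200):
    b = {i: R_park[i] + sum(Q[i][j] * b[j] for j in Q) for i in Q}

print(b)  # b["A"] is roughly 0.61, b["B"] roughly 0.68
```

The iteration is just a small-scale stand-in for the fundamental-matrix solution (I - Q)^{-1} R used in the SAMC literature; the samc R package does this at raster scale.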
+23
gabriel/hedgehogs_environment_conservation_compsci_2025_07_13_hedgehogs02.json
···+"id": "https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/13/hedgehogs02",+"link": "https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/13/hedgehogs02.html",+"summary": "The second week of working on the Hedgehog project has been a bit slower than I\u2019d hoped. One of my goals for this week was to build a foundational pipeline for the movement modeling. I sank quite a lot of hours into implementing a step selection function (SSF) model in R and the amt package with a few different geospatial layers. I then discovered integrated step selection analysis (iSSA), and obtained seemingly better results with that than with the SSFs. I then integrated these models into an agent-based model, as that should enable integration of known behavioral patterns (day/night cycles, foraging/traveling/resting etc.)",+"content": "<p>The second week of working on the <a href=\"https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/04/hedgehogs01.html\">Hedgehog project</a> has been a bit slower than I\u2019d hoped. One of my goals for this week was to build a foundational pipeline for the movement modeling. I sank quite a lot of hours into implementing a step selection function (SSF) model in R and the <a href=\"https://cran.r-project.org/web/packages/amt/index.html\">amt package</a> with a few different geospatial layers. I then discovered integrated step selection analysis (iSSA), and obtained seemingly better results with that than with the SSFs. 
I then integrated these models into an agent-based model, as that should enable the integration of known behavioral patterns (day/night cycles, foraging/traveling/resting, etc.).</p>\n\n<p>Continuing from <a href=\"https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/04/hedgehogs01.html\">last week</a>, I also did some more work on the datasets.</p>\n\n<h3>Data</h3>\n<p>There really is a ton of data available pretty publicly from UK government-managed databases. Some of the most interesting ones are land-use maps, and in particular the specialized crops map.\nThis map details with a really good resolution (and hopefully reliable accuracy) the crops grown across the entire country. There are also good, more general maps, which I think are particularly\nuseful for identifying urban and suburban areas (which is, obviously, very important).</p>\n\n<p>Other interesting official datasets are the maps of pesticides and fertilizers. The pesticides dataset, for instance, documents the use of 162 different types of pesticides. According to some preliminary\nresearch, a number of those could be relevant for hedgehogs. Some pesticides reduce the abundance of hedgehogs\u2019 prey (e.g. earthworms). Others (e.g. slug pellets) are deliberately ingested by hedgehogs,\ndespite sometimes causing them health issues. Unfortunately, the version of the dataset I was able to obtain suffers from a pretty low resolution, and would, therefore, possibly be relevant for\nlarge-scale modeling, but not necessarily for our small region.</p>\n\n<p>Furthermore, the hedgerows, stonewalls, and woodland dataset discussed <a href=\"https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/04/hedgehogs01.html\">last time</a> is awesome, but\nunfortunately about 50% of our tiny region lies in an unrecorded patch of land, and I have, therefore, not been able to use it for the modelling. I reached out to the dataset\u2019s creators, and am\nwaiting to hear back. 
Similarly, using LIDAR to try and infer the locations of brambles has been rather disappointing, but I should speak with <a href=\"https://ancazugo.github.io/about\">Andr\u00e8s</a> about it tomorrow, so \nthat will probably be very helpful.</p>\n\n<h3>Modeling</h3>\n<p>I decided to pursue SSFs as the foundational modeling technique. SSFs evaluate movement as a sequence of steps\u2014each defined by a movement from one location to another\u2014by comparing actual observed steps, derived from GPS or telemetry data, to alternative steps the animal could plausibly have taken. These alternative steps are generated from the same starting point as the observed step, using a movement kernel that captures empirical distributions of step lengths and turning angles. Once both used and available steps are defined, a conditional logistic regression model is used to assess which environmental variables influence the animal\u2019s choice of direction or destination. The resulting coefficients describe the strength and direction of selection for landscape features such as vegetation type, elevation, or proximity to roads. SSFs are particularly powerful because they integrate movement behavior with habitat selection, taking into account the animal\u2019s previous location and thereby reflecting realistic constraints on movement.</p>\n\n<p>iSSAs build directly upon SSFs. However, while SSFs focus on where animals choose to go, they generally treat movement characteristics like step length and turning angle as secondary or incidental. iSSA addresses this by explicitly incorporating movement-related covariates (e.g. step length, turning angle, and interactions between these and environmental variables) into the modeling process. 
This makes it possible to infer both movement and habitat selection processes simultaneously, rather than treating them separately.</p>\n\n<p>Nonetheless, these statistical models cannot capture more complicated patterns, but their outputs should be quite easily integrated into some overarching pipelines. One such extension should be ABMs, which provide the opportunity to leverage not only the statistical coefficients, but also other expectations regarding the hedgehogs\u2019 behavior and interaction patterns.</p>\n\n<p>I implemented the initial ABM with <a href=\"https://mesa.readthedocs.io/latest/\">mesa</a> in Python, and have tried to keep everything very modular (that also applies to the SSFs/iSSAs), so I have space to make adjustments in the future. Nonetheless, I suspect the extent to which the model can be augmented (by, e.g., some kind of Gaussian processes) is limited by several factors, primarily the scarcity of the tracking data.</p>\n\n<p>To fit the models, I used the three complete datasets mentioned in <a href=\"https://gabrielmahler.org/hedgehogs/environment/conservation/compsci/2025/07/13/hedgehogs02.html#data\">the data section</a>:</p>\n\n<p>1) Road network.</p>\n\n<p>2) Crops.</p>\n\n<p>3) Land use.</p>\n\n<p>I would have also used the hedges & stonewalls & forestry layers, but as those were incomplete in the particular region of my interest, I did not. Nonetheless, it will be extremely easy to plug them in if I ever obtain them for this region. 
Furthermore, I also tried to model the hedgehogs by sex (female and male separately), on top of both sexes together.</p>\n\n<p>Across the board, using the <a href=\"https://en.wikipedia.org/wiki/Akaike_information_criterion\">Akaike information criterion (AIC)</a>, iSSA outperformed SSF.\nFor instance, here is a summary of the performances for both sexes at once:</p>\n\n<table>\n <thead>\n <tr><th>Model</th><th>AIC</th></tr>\n </thead>\n <tbody>\n <tr><td>SSF: Crops</td><td>14059.6</td></tr>\n <tr><td>SSF: Land cover</td><td>85141.19</td></tr>\n <tr><td>SSF: Combined</td><td>13888.36</td></tr>\n <tr><td><strong>iSSA: Combined</strong></td><td><strong>6499.432</strong></td></tr>\n </tbody>\n</table>\n\n<p>Moreover, the coefficients found by iSSA were generally quite interesting, and even confirmed some of the expectations. For\ninstance, that females avoid roads and favor short, less directional steps, especially in certain crops/landcovers, while males are less road\u2010averse,\nshow very strong selection for particular crops (maize, beans) and landcovers (grassland, suburban), and move more tortuously.\nTo illustrate, here is a shortened summary of some of the coefficients:</p>\n\n<table>\n <thead>\n <tr><th>Term</th><th>Female coef.</th><th>Male coef.</th><th>Both coef.</th></tr>\n </thead>\n <tbody>\n <tr><td>roads</td><td>-0.004978</td><td>0.000011</td><td>0.000030</td></tr>\n <tr><td>maize</td><td>1.046</td><td>1.497</td><td>0.941</td></tr>\n <tr><td>oilseed_rape</td><td>0.921</td><td>\u2014</td><td>0.911</td></tr>\n <tr><td>potatoes</td><td>31.130</td><td>\u2014</td><td>2.911</td></tr>\n <tr><td>other_crops</td><td>\u2014</td><td>-4.867</td><td>0.016</td></tr>\n <tr><td>spring_field_beans</td><td>\u2014</td><td>2.371</td><td>0.212</td></tr>\n <tr><td>broadleaved_mixed_and_yew_woodland</td><td>-12.390</td><td>\u2014</td><td>\u2014</td></tr>\n <tr><td>improved_grassland</td><td>0.491</td><td>2.084</td><td>0.059</td></tr>\n <tr><td>suburban</td><td>-0.056</td><td>1.701</td><td>0.640</td></tr>\n </tbody>\n</table>",
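One way to read an AIC comparison like the one above is through Akaike weights, which normalize the relative support for each candidate model. A quick sketch using the AIC values reported in the post:

```python
import math

# AIC values reported for the both-sexes models; lower is better.
aics = {
    "SSF: Crops": 14059.6,
    "SSF: Land cover": 85141.19,
    "SSF: Combined": 13888.36,
    "iSSA: Combined": 6499.432,
}

best = min(aics.values())
# Akaike weight: exp(-delta_i / 2), normalized over all candidate models,
# where delta_i is each model's AIC distance from the best model.
rel = {m: math.exp(-(a - best) / 2) for m, a in aics.items()}
total = sum(rel.values())
weights = {m: r / total for m, r in rel.items()}

print(weights)  # the iSSA model carries essentially all the weight
```

With AIC gaps in the thousands, the weights degenerate to 1.0 for the iSSA model and numerically 0 for the rest, which is a formal way of saying the SSF variants have no relative support.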
+2
-2
gabriel/metadata.json
+21
gabriel/walkability_compsci_2025_06_01_introduction.json
···+"content": "<h1>Introduction</h1>\n\n<p><em>Walkability</em> is an urbanist concept referring to how easy and desirable\nit is to walk in a given place, with considerations of the physical\nenvironment and the human individual. Typically, to estimate\n\u201cwalkability indices\u201d, theoretical urbanist frameworks extend beyond the\nfactors related to pedestrian temporal efficiency and leverage physical\nelements such as greenery, public amenities, or other common geospatial\ninformation. Despite that, pedestrian path-finding frameworks, which\nhave been around for several decades and are relied upon by millions of\nusers every day, generally ignore any such notions described in the\nurbanist literature. Instead, these frameworks typically aim to maximize\nsimplistic objectives, most commonly the estimated duration to undertake\na path, or even only the path\u2019s overall length. From the urbanist\nstandpoint, however, these metrics represent only a subset of the\nfactors that determine whether someone chooses to walk or selects an\nalternative mode of transportation. This problem is further amplified by\nthe fact that existing routing frameworks either entirely preclude\nuser-defined preferences or allow them only through highly complicated\nand constrained configuration files.</p>\n\n\n\n\n\n<img src=\"https://gabrielmahler.org/assets/images/thesis/new images/intro/intro valhalla Medium.jpeg\">\n\n\n\n<img src=\"https://gabrielmahler.org/assets/images/thesis/new images/intro/intro heatmap Medium.jpeg\">\n\n\n\n<img src=\"https://gabrielmahler.org/assets/images/thesis/new images/intro/intro general Medium.jpeg\">\n\n\nHigh-level illustration of our approach. Top:\nLow-walkability path generated by a popular routing framework\n(Valhalla). Middle: Our walkability scores. 
Bottom: Walkability-optimized\npath.\n\n\n<p>In this work, we study and answer three questions essential to\naccurately addressing the issues imposed by the methodologies used in\npopular path-finding frameworks:</p>\n\n<ol>\n <li>\n <p><strong>Shortcomings of existing path-finding:</strong> <em>What are the\nimplications of using preference inflexible, time efficiency-focused\npath-finding algorithms, particularly through the lens of\nwalkability? Why are the path-finding frameworks so inflexible to\nspecific user needs?</em></p>\n\n <p>To answer these questions, we assemble a corpus of five realistic\nrouting scenarios within the boundaries of the city of Cambridge,\nUK, discuss the unifying nature of solutions generated by three\npopular open-source frameworks, and identify potential improvements\nand missed routing opportunities. Furthermore, we discuss the\ndefinitions of preferences used to generate these outputs, and\nhighlight their complexity and poor accessibility.</p>\n </li>\n <li>\n <p><strong>Improving the quality of path-finding solutions:</strong> <em>How can urban\npath-finding be reoriented towards the concepts of walkability?\nFurthermore, how can path-finding frameworks respond more\nreceptively and comprehensively to specific user requirements and\npreferences?</em></p>\n\n <p>We provide solutions to these issues with two novel contributions.\nFirst, we present a computationally efficient tool for automated\nassessment of walkability in urban areas. We\nleverage modern natural language processing models (particularly our\ncustom fine-tuned transformer-based sentence encoders) and a\nknowledge base aggregated from public geospatial datasets, primarily\nthe OpenStreetMap. By utilizing\nrich semantic embeddings, our method significantly improves upon\nstate-of-the-art (generally computer vision-based) walkability\nassessment methods. 
Second, building on the acquired assessment\ntool, we present a new pedestrian path-finding framework based on\nthe A* search for the generation of pedestrian routes according to\nthe urbanist walkability principles.</p>\n\n <p>Finally, we leverage our semantically-based pipeline to develop an\napproach for embedding nuanced pedestrian objectives reflective of\nreal-world scenarios into path-finding solutions.</p>\n </li>\n <li>\n <p><strong>Simplifying user inputs:</strong> <em>What alternative approach to routing\nconfiguration files can be leveraged to simplify the process of\ninputting specific preferences?</em></p>\n\n <p>Lastly, to address this problem and provide a simplified way to\ndefine user-specific pedestrian preferences, we leverage sentence\nencoders\u2019 ability to extract semantic associations. As our\nwalkability assessment component is based on the use of sentence\nanchors (which are utilized as points of reference for specific\nqualities and \u201clevels\u201d of walkability), our pipeline is also able to\nreflect user-specific preferences projected into these anchors. This\napproach allows not only for very loosely constrained preference\ndefinitions, but also their straightforward representation (as they\ncan be defined with natural language).</p>\n </li>\n</ol>\n\n<p>To evaluate our approach, we follow the order of the above questions and\nanalyze the problem and our solution on the aforementioned corpus of\nrouting scenarios. For this purpose, we also conceive four realistic\nsets of pedestrian preferences (in addition to the general walkability\npreference) that aim to maximize the presence of historical, green,\nshopping, and public safety-oriented elements in their respective\npath-finding solutions. We employ these preferences in the assessment\ncomponent to compile unique walkability maps, and then use\nthese maps in our path-finding algorithm to generate highly walkable and\nspecific-objective maximizing paths.</p>",
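As a self-contained illustration of the path-finding core (a sketch, not the thesis implementation), here is A* with a Manhattan heuristic on a toy 4-connected grid, where each cell's entry cost stands in for inverse walkability: low-walkability cells are expensive, so the optimal route detours around them.

```python
import heapq

def astar(cost, start, goal):
    """A* on a 4-connected grid. cost[r][c] >= 1 is the price of
    stepping onto cell (r, c); read it as inverse walkability."""
    rows, cols = len(cost), len(cost[0])

    def h(cell):  # Manhattan distance: admissible since every step costs >= 1
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start, [start])]  # (f, g, cell, path)
    best_g = {start: 0}
    while frontier:
        f, g, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return g, path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = cell[0] + dr, cell[1] + dc
            if 0 <= r < rows and 0 <= c < cols:
                ng = g + cost[r][c]
                if ng < best_g.get((r, c), float("inf")):
                    best_g[(r, c)] = ng
                    heapq.heappush(frontier,
                                   (ng + h((r, c)), ng, (r, c), path + [(r, c)]))
    return None

# A low-walkability cell (cost 9) in the middle; the best route detours around it.
grid = [[1, 1, 1],
        [1, 9, 1],
        [1, 1, 1]]
g, path = astar(grid, (0, 0), (2, 2))
print(g, path)  # total cost 4; the path avoids the centre cell
```

In the actual framework the per-edge costs would come from the walkability assessment component rather than a hand-written grid, but the search mechanics are the same.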
+21
gabriel/walkability_compsci_2025_06_02_background.json
···+"content": "<h1>Background</h1>\n\n<p>In the context of this work, the issues of pedestrian path-finding\nintersect both theoretical urbanist ideas (particularly the concept of\nwalkability) and classical topics in computer science (particularly\ngraph search algorithms). However, to achieve our objectives, we also\nrelate to much more modern computer science topics, specifically in\nnatural language processing and transformer-based encoders.</p>\n\n<h2>The Issue of \u201cWalkability\u201d</h2>\n\n<p>To construct a better set of requirements than those traditionally used\nin path-finding algorithms, we turn to the urbanist concept of\nwalkability. Unlike the exact yet simplistic measures of routing\nefficiency, walkability considers a wide range of factors more closely\naligned with realistic human preferences for walking, and can, therefore,\nprovide a valuable perspective on finding paths walkers would actually\nprefer to take.</p>\n\n<h3>Urbanist Overview</h3>\n\n<p>In urbanist literature, the concept of walkability frequently\nencompasses a range of physical and social characteristics that\ncollectively determine how conducive a neighborhood is to pedestrian\nactivity.</p>\n\n<p><em>Alfonzo</em> outlines a multi-level model to hierarchically structure the\nfactors that contribute to walkability\u00a0(Alfonzo 2005). They use\nindividual-level characteristics (such as income and car ownership),\nregional-level attributes (that reflect broader geographic variation),\nand physical environment characteristics (including safety, traffic\nconditions, sidewalk availability, and the directness of pedestrian\nroutes). They further distill the factors of the individual-level\ncharacteristics and physical environments in an analysis of the human\nneeds. 
The resulting model is called \u201cthe five levels of walking needs\u201d,\nand includes, in order: \u201cfeasibility\u201d (reflecting, for instance, the\nmobility of individuals and environment), \u201caccessibility\u201d (referring to\nfactors such as the presence of pedestrian infrastructure or the\nproximity to points of interest), \u201csafety\u201d (determined by, for example,\nland use or the fear of crime), \u201ccomfort\u201d (for instance, the\nrelationship between pedestrian and motorized traffic, or the presence\nof \u201cstreet furniture\u201d), and \u201cpleasurability\u201d (invoked by factors such as\naesthetic appeal or presence of public spaces).</p>\n\n<p>Nevertheless, a number of other publications emerge with more\nquantifiable approaches to measuring walkability. <em>Grasser et al.</em>\nsuggest using data on gross population, employment, and housing\ndensities alongside land-use diversity indicators (such as the entropy\nindex) and estimated \u201cstreet connectivity\u201d based on intersection\ndensity\u00a0(Grasser et al. 2013). In a parallel effort, <em>Frank et al.</em>\nintroduce a composite index combining net residential density, street\nintersection density, and retail floor area ratio to capture both\ndestination presence and ease of access\u00a0(Frank et al. 2006). Broadening\nthe scope, <em>Shields et al.</em> catalog objective factors (including\ndistance to key destinations, sidewalk continuity, road-network\ncharacteristics, intersection safety features, vehicular traffic volume\nand speed, pedestrian-support amenities, and various density measures),\nwhile also emphasizing subjective qualities such as aesthetics, comfort\n(such as lighting, shade, noise levels), personal security,\nattractiveness, and crowding\u00a0(Shields et al. 2023). 
Finally, <em>Frank et al.</em>\nlater propose calculating z-scores for net residential density,\nretail floor area ratio, intersection density, and a five-category\nland-use mix entropy score, summing these standardized values to produce\na regionally normalized composite index\u00a0(Frank et al. 2010).</p>\n\n<h3>Summary and Criticisms of Walkability Literature</h3>\n\n<p>As such, the methods for the evaluation of walkability in urbanist\nliterature utilize a large variety of approaches and tools. Walkability\nis frequently calculated based on both highly granular metrics (such as\nintersection density) and small, local elements (such as street\nfurniture). Nevertheless, there are clear limitations to these\napproaches. For instance: while the approaches that aim to express\nwalkability in numeric values are only concerned with quantifiable\nfactors, it is only the more general, high-level frameworks (such as in\n<em>Alfonzo</em>\u00a0(Alfonzo 2005)) that consider more subjective factors. 
Highly\nimportant influences of the physical environment, particularly in the\ndomains of \u201ccomfort\u201d and \u201cpleasurability\u201d, are often omitted, or\nexpected to be correlated with the exact, quantifiable metrics.\nConsidering the spatial diversity of cities (particularly if we\u2019re\ncomparing cities from different countries or regions), one may conclude\nthat these general-purpose approaches can easily lead to inaccurate (or\neven biased) conclusions about what can be considered well or poorly\nwalkable.</p>\n\n<h3>Walkability-focused Services</h3>\n\n<p>The potential shortcomings of the urbanist research seem to have been\nmirrored in both public and proprietary projects and services.\nHigh-visibility projects such as the National Walkability Index\u00a0(Thomas,\nZeller, and Reyes 2021) or WalkScore\u00a0(Walk Score 2025) (both of which\nare limited to the United States) have been criticized for their\npositive emphasis on car-centric areas and proximity to points of\ninterest, and neglecting more realistic pedestrian preferences,\nultimately leading to inaccurate and misleading conclusions\u00a0(Steuteville\n2019). The NWI, curated by the U.S. Environmental Protection Agency,\nfocuses on measures that can be consistently applied across the country,\nusing data from the Smart Location Database\u00a0(Ramsey and Bell 2014).\nThese measures include intersection density, proximity to transit stops,\nand the diversity of land use. The underlying assumption is that each of\nthese factors is positively correlated with the likelihood of walking\ntrips, making them key indicators of walkability at the block group\nlevel. A notable alternative to NWI is WalkScore - originally an\nopen-source project aimed at promoting walkable neighborhoods. 
However,\nWalkScore was later privately acquired, and currently releases\nwalkability scores calculated only through proprietary\nmethodologies\u00a0(Walk Score 2025; Steuteville 2019).</p>\n\n<h3>Alternative: <em>n</em>-Minute Cities</h3>\n\n<p>Reflecting the limited supply of reliable walkability assessment tools\nand the demanding nature of the problem (requiring plentiful data and\nintricate technological solutions), alternative approaches, such as the\nconcept of \u201c<em>n</em>-minute cities\u201d, have emerged. Instead of measuring\nwalkability on the basis of actual physical environments, <em>n</em>-minute\ncities infer their walkability indices based on the proximity to points\nof interest. Exclusively aimed at urban environments, these projects\ngenerally focus on determining how long it would take to walk from a\ncertain location to places essential for daily life (such as stores,\nschools, hospitals, etc.) - hence the <em>n</em>-minutes.</p>\n\n<p>There are several projects, built by geographers and urbanists, that\nrely on this concept. A frontier example may be the project Close\u00a0(Henry\nSpatial Analysis, LLC 2025; Bliss 2024), which combines information from\npublic geospatial datasets (such as the Overture Maps\u00a0(Overture Maps\nFoundation 2025)) and custom filtering logic. In Close, geospatial data\npoints undergo a vetting process to refine and categorize destinations\nmeaningfully. For instance, when identifying supermarkets, Close uses\nqualitative criteria (such as a number of different sorts of aisles) to\ndistinguish full-service grocery stores from smaller convenience stores\nor bodegas. This labor-intensive process is partly automated using\nbuilding size, business names, and other available metadata, but in the\nfifty largest U.S. cities, the authors of Close had to perform volumes\nof manual review to improve accuracy. 
Furthermore, Close also\nattempts to alleviate issues induced by reliance on manual maintenance\nwith an iterative refinement implemented through a crowd-sourcing\nfeedback mechanism.</p>\n\n<h2>Path-Finding Algorithms</h2>\n\n<p>The aim of path-finding is to identify a sequence of steps that would\ndefine a route between two points while maximizing some\npredefined objective. Path-finding problems are typically represented\nwithin graphs, and their applications are widespread - from\ntransportation to robotics or video games. Nevertheless, the core of\nforefront path-finding frameworks has been consistent for a very long\ntime, and very frequently revolves around the A*\u00a0(Hart, Nilsson, and\nRaphael 1968) and the foundational Dijkstra\u00a0(Dijkstra 2022) algorithms\n(viz.\n\u00a7<a href=\"https://gabrielmahler.org/walkability/compsci/2025/06/02/background.html#section:relatedwork-transportationrouting\">3.1</a>).</p>\n\n<h3>A* Search</h3>\n\n<p>A* optimizes its search efficiency by using a heuristically-guided\nsearch\u00a0(Hart, Nilsson, and Raphael 1968). It combines a Dijkstra-like\ngreedy best-first search with an estimation of the cost to reach the\ntarget node. At each step, A* selects the node with the lowest cost\n$f(n) = g(n) + h(n)$, where $g(n)$ represents the exact cost from the\nstart node to the considered node, and $h(n)$ the heuristic estimate of\nthe cost from the currently iterated node to the target destination. In\norder for A* to find an optimal path, $h(n)$ must be admissible, which\nmeans that it must never overestimate the actual cost to reach the\ntarget node. 
If, additionally, the consistency condition\n$h(n) \leq c(n, n') + h(n')$ (where $c(n, n')$ is the transition cost from\nnode $n$ to node $n'$) holds for every edge, then the search guarantees not only\noptimality but also efficiency by never having to re-expand a node.\nFurthermore, while frequently $h(n)$ is defined as a simple Euclidean or\nManhattan distance, specific applications often benefit from more\nsophisticated strategies.</p>\n\n<h3>Search Optimizations</h3>\n\n<p>Search algorithms are frequently optimized with bidirectional search,\nwhich performs two simultaneous searches from both the start and the\ntarget until they meet. This reduces the number of visited nodes but\ngenerally requires more complex logic and balanced\nheuristics\u00a0(Sturtevant and Felner 2018). Another approach, applicable in\nstatic graphs, is contraction hierarchies. This involves gradually\nremoving less important nodes and replacing them with shortcut edges\nthat preserve shortest paths. The resulting hierarchy allows for fast\nbidirectional search by restricting movement to higher-level nodes,\ngreatly speeding up queries after preprocessing, which is typically\nworthwhile for large graphs\u00a0(Geisberger et al. 2008).</p>\n\n<h2>Sentence Transformers</h2>\n\n<p>Sentence embedders (such as the foundational Sentence-BERT\u00a0(Reimers and\nGurevych 2019)) are neural networks based on the transformer\narchitecture, designed to capture the semantic contents of textual data\nof arbitrary length (but typically standalone sentences) into vectorized\nrepresentations of predetermined sizes. These models are frequently\nprepared by fine-tuning pretrained transformers on the objective of\nprojecting semantically similar sentences close together in the\nresulting embedding space. 
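As a toy illustration of comparing such vectorized representations, closeness in the embedding space is commonly measured with cosine similarity. The three-dimensional "embeddings" below are made up for the sketch; real encoders output hundreds of dimensions:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: close to 1.0 means the model
    considers the underlying sentences semantically similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Made-up embeddings of three sentences; in practice these would come from a
# sentence encoder such as Sentence-BERT.
park = [0.9, 0.1, 0.2]
garden = [0.8, 0.2, 0.3]
highway = [0.1, 0.9, 0.1]

assert cosine_similarity(park, garden) > cosine_similarity(park, highway)
```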
This property can then be used to easily\ncompare different data points, using measures like cosine similarity.\nUnlike some pre-existing approaches for measuring similarity (such as by\nrelying on the original BERT network), sentence transformers do not\ncompute pair-wise comparisons, but can encode inputs independently.\nTherefore, comparing similarities in large sets becomes much more\ncomputationally efficient. In order to output embeddings of fixed sizes,\nsentence transformers use various techniques, such as pooling over the\ntransformer\u2019s final layer. Under both supervised and unsupervised\nbenchmarks in clustering, similarity, and retrieval tasks, sentence\ntransformers (such as Sentence-BERT) consistently outperformed existing\nstrategies\u00a0(Reimers and Gurevych 2019).</p>\n\n<h2>Low-Rank Adaptation of Language Models</h2>\n\n<p>Low-Rank Adaptation (or LoRA) in language models is a technique that can\nbe leveraged to perform light-weight fine-tuning of pre-trained language\nmodels. LoRA-based fine-tuning works by freezing the original model\u2019s\nweights and injecting small trainable low-rank decomposition matrices\ninto each of the transformer\u2019s layers. Here, rank $r$ denotes the\ndimensionality of the low\u2011rank decomposition of a weight update. For a\nfrozen pre\u2011trained weight matrix $W_0\\in \\mathbb{R}^{d\\times k}$, the\nupdate is written as\n$\\Delta W = B A,\\; B \\in \\mathbb{R}^{d \\times r},\\; A \\in \\mathbb{R}^{r \\times k}$.\nThen, \u201clow\u2011rank\u201d implies choosing $r\\ll\\min(d,k)$ so that $\\Delta W$\nlies in a small $r$\u2011dimensional subspace, hence dramatically reducing\nthe number of trainable parameters. Before any training, the\ndecomposition matrices are initialized so that their product equals\nzero, and therefore, the model\u2019s initial behavior matches the pretrained\nbaseline. 
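A minimal numeric sketch of this zero-initialization property follows; the plain nested-list matrices, dimensions, and values are made up purely for illustration:

```python
def matmul(A, B):
    """Nested-list matrix product (illustration only)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def madd(X, Y):
    """Element-wise matrix sum."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

d, k, r = 3, 4, 1                    # rank r << min(d, k)
W0 = [[1.0] * k for _ in range(d)]   # frozen pre-trained weight, d x k
B = [[0.0] * r for _ in range(d)]    # zero-initialized d x r factor
A = [[0.5] * k for _ in range(r)]    # r x k factor (randomly initialized in practice)

# Before training: Delta W = B A == 0, so W0 + B A behaves exactly like W0,
# while only d*r + r*k = 7 parameters are trainable instead of d*k = 12.
assert madd(W0, matmul(B, A)) == W0
```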
During training, the optimization is balanced by a scaling\nfactor, which ensures that most hyperparameters do not require retuning as\nthe rank varies. The underlying rationale of this approach is based on the\nobservation that during task-specific adaptation of transformers, the\nchange in the model\u2019s weights lies in a much lower-dimensional subspace\nthan the entire parameter space\u00a0(Hu et al. 2022).</p>\n\n<p>Furthermore, based on analyses published by <em>Hu et al.</em>\u00a0(Hu et al.\n2022), effective weight updates in transformers have very low intrinsic\nranks, and, in many cases, minimal ranks are sufficient to capture\nadaptations necessary for downstream tasks. Based on similarity\nmeasurements between adaptations of random initializations and different\nranks, they conclude that the most important parameter updates lie in a\nvery small subspace. Low-rank updates also tend to highlight features\nalready present in the pre-trained network, rather than introduce new\n\u201cconcepts\u201d into the model. Therefore, LoRA can reduce the number of\ntrainable weights by orders of magnitude when applied to large semantic\nmodels and substantially lower the computational burden relative to full\nmodel fine-tuning. Furthermore, the injected low-rank matrices can be\nmerged with the transformer\u2019s frozen weights before inference, thus\nincurring no additional latency compared to the original\nvanilla transformer.</p>\n\n<p>Considering LoRA\u2019s proven potential to match or exceed fully fine-tuned\ntext-based transformers, it presents a viable strategy for customization\nand adaptation of transformer-based models while alleviating the\ncomputational burden associated with full-model fine-tuning.</p>\n\n<h2>Contrastive Learning</h2>\n\n<p>Contrastive learning is a machine learning technique applicable in both\nsupervised and unsupervised settings. 
Contrastive learning aims to\nleverage known relationships between training data points to learn how\nto project data into an embedding space such that points of the same or\nsimilar samples appear close together, whereas points from different\nsamples are spread apart\u00a0(Weng 2021). This is frequently accomplished\nvia specialized contrastive loss functions, such as the contrastive\nloss\u00a0(F. Wang and Liu 2021), the triplet loss\u00a0(Tripathi and King 2024),\nor InfoNCE\u00a0(Rusak et al. 2024). Contrastive learning has enjoyed much\npopularity due to (amongst other things) its ability to train under a\nself-supervised objective and its versatility across various domains,\nincluding multi-modal machine learning\u00a0(Weng 2021).</p>\n\n<p>Relevant for the context of this work is the triplet loss. The triplet\nloss paradigm uses three examples at a time: an \u201canchor\u201d example, a\n\u201cpositive\u201d example of the same or similar sample as the anchor, and a\n\u201cnegative\u201d example of a sample different from the anchor. The trained\nmodel is then taught to effectively pull the anchor closer to the\npositive example in the representation space and push it away from the\nnegative example. In this way, the model is prompted to represent\ncontrasting samples in different parts of the embedding\nspace\u00a0(Weinberger and Saul 2009; Khosla et al. 2020; Tripathi and King\n2024).</p>\n\n<h2>References</h2>\n\n<ul>\n <li>Alfonzo, Mariela A. (2005). <em>To Walk or Not to Walk? The Hierarchy of Walking Needs</em>. <em>Environment and Behavior</em>, 37(6), 808\u2013836.</li>\n <li>Bliss, Laura. (2024). <em>How Walkable Is Your Neighborhood? 
A New Map Tool Offers an Answer \u2013 Bloomberg</em>.\n<a href=\"https://www.bloomberg.com/news/newsletters/2024-09-11/how-walkable-is-your-neighborhood-a-new-map-tool-offers-an-answer\">https://www.bloomberg.com/news/newsletters/2024-09-11/how-walkable-is-your-neighborhood-a-new-map-tool-offers-an-answer</a></li>\n <li>Dijkstra, Edsger W. (2022). <em>A Note on Two Problems in Connexion with Graphs</em>. In <em>Edsger Wybe Dijkstra: His Life, Work, and Legacy</em> (pp. 287\u2013290).</li>\n <li>Frank, Lawrence D., Sallis, J. F., Conway, T. L., Chapman, J. E., Saelens, B. E., & Bachman, W. (2006). <em>Many Pathways from Land Use to Health: Associations Between Neighborhood Walkability and Active Transportation, Body Mass Index, and Air Quality</em>. <em>Journal of the American Planning Association</em>, 72(1), 75\u201387.</li>\n <li>Frank, Lawrence D., Sallis, J. F., Saelens, B. E., Leary, L., Cain, K., Conway, T. L., & Hess, P. M. (2010). <em>The Development of a Walkability Index: Application to the Neighborhood Quality of Life Study</em>. <em>British Journal of Sports Medicine</em>, 44(13), 924\u2013933.</li>\n <li>Geisberger, R., Sanders, P., Schultes, D., & Delling, D. (2008). <em>Contraction Hierarchies: Faster and Simpler Hierarchical Routing in Road Networks</em>. In <em>Experimental Algorithms: 7th International Workshop, WEA 2008</em> (pp. 319\u2013333). Springer.</li>\n <li>Grasser, G., Van Dyck, D., Titze, S., & Stronegger, W. (2013). <em>Objectively Measured Walkability and Active Transport and Weight-Related Outcomes in Adults: A Systematic Review</em>. <em>International Journal of Public Health</em>, 58, 615\u2013625.</li>\n <li>Hart, Peter E., Nilsson, N. J., & Raphael, B. (1968). <em>A Formal Basis for the Heuristic Determination of Minimum Cost Paths</em>. <em>IEEE Transactions on Systems Science and Cybernetics</em>, 4(2), 100\u2013107.</li>\n <li>Henry Spatial Analysis, LLC. (2025). 
<em>Close.city Project</em>.\n<a href=\"https://close.city\">https://close.city</a></li>\n <li>Hu, Edward J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., Chen, W., et al. (2022). <em>LoRA: Low-Rank Adaptation of Large Language Models</em>. <em>ICLR</em>, 1(2), 3.</li>\n <li>Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., Maschinot, A., Liu, C., & Krishnan, D. (2020). <em>Supervised Contrastive Learning</em>. <em>NeurIPS</em>, 33, 18661\u201318673.</li>\n <li>Overture Maps Foundation. (2025). <em>Overture Maps Foundation</em>.\n<a href=\"https://overturemaps.org\">https://overturemaps.org</a></li>\n <li>Ramsey, K., & Bell, A. (2014). <em>Smart Location Database</em>. <em>Washington, DC</em>.</li>\n <li>Reimers, N., & Gurevych, I. (2019). <em>Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks</em>. In <em>EMNLP 2019</em>.\n<a href=\"https://arxiv.org/abs/1908.10084\">https://arxiv.org/abs/1908.10084</a></li>\n <li>Rusak, E., Reizinger, P., Juhos, A., Bringmann, O., Zimmermann, R. S., & Brendel, W. (2024). <em>InfoNCE: Identifying the Gap Between Theory and Practice</em>. <em>arXiv Preprint arXiv:2407.00143</em>.</li>\n <li>Shields, R., Gomes da Silva, E. J., Lima e Lima, T., & Osorio, N. (2023). <em>Walkability: A Review of Trends</em>. <em>Journal of Urbanism</em>, 16(1), 19\u201341.</li>\n <li>Steuteville, R. (2019). <em>Walkability Indexes Are Flawed. Let\u2019s Find a Better Method</em>. <em>CNU</em>.\n<a href=\"https://www.cnu.org/publicsquare/2019/01/10/walkability-indexes-are-flawed-lets-find-better-method1\">https://www.cnu.org/publicsquare/2019/01/10/walkability-indexes-are-flawed-lets-find-better-method1</a></li>\n <li>Sturtevant, N., & Felner, A. (2018). <em>A Brief History and Recent Achievements in Bidirectional Search</em>. In <em>AAAI Conference on Artificial Intelligence</em>, 32(1).</li>\n <li>Thomas, J., Zeller, L., & Reyes, A. R. (2021). 
<em>National Walkability Index: Methodology and User Guide</em>. <em>United States Environmental Protection Agency (EPA)</em>.\n<a href=\"https://www.epa.gov/sites/default/files/2021-06/documents/national_walkability_index_methodology_and_user_guide_june2021.pdf\">https://www.epa.gov/sites/default/files/2021-06/documents/national_walkability_index_methodology_and_user_guide_june2021.pdf</a></li>\n <li>Tripathi, S., & King, C. R. (2024). <em>Contrastive Learning: Big Data Foundations and Applications</em>. In <em>CODS-COMAD 2024</em>, 493\u2013497.</li>\n <li>Walk Score. (2025). <em>Walk Score\u00ae: Walkability Index and Neighborhood Analytics</em>.\n<a href=\"https://www.walkscore.com\">https://www.walkscore.com</a></li>\n <li>Wang, F., & Liu, H. (2021). <em>Understanding the Behaviour of Contrastive Loss</em>. In <em>CVPR</em>, 2495\u20132504.</li>\n <li>Weinberger, K. Q., & Saul, L. K. (2009). <em>Distance Metric Learning for Large Margin Nearest Neighbor Classification</em>. <em>Journal of Machine Learning Research</em>, 10(2).</li>\n <li>Weng, L. (2021). <em>Contrastive Representation Learning</em>.\n<a href=\"https://lilianweng.github.io/posts/2021-05-31-contrastive/\">https://lilianweng.github.io/posts/2021-05-31-contrastive/</a></li>\n</ul>",
+21
gabriel/walkability_compsci_2025_06_04_designimplementation.json
···+"content": "<h3>Design and implementation</h3>\n\n<p>To address the issue of generating walkability-friendly and\nuser-customizable pedestrian routes, our approach is divided into four\nparts: (1) data aggregation, conflation, and pre-processing, (2) the\ndevelopment of a specialized fine-tuning pipeline for sentence\nembedders, leveraging contrastive learning to learn representations of\ngenerally walkable (and unwalkable) place descriptions, (3) inference\nof point-wise scores based on \u201cgeneral walkability\u201d and\npreference-specific criteria from generated comprehensive embedding sets, and (4)\nintegration of the point-wise scores in an A*-based path-finding\nalgorithm.</p>\n\n<h2>Data Preparation</h2>\n\n<p>As discussed earlier, we concluded that the\nFoursquare and Overture Maps datasets suffered from various insufficiencies. In\nthe context of our work, both exhibited low temporal accuracy and\nfocused on a relatively narrow selection of geospatial features with\nnormalized but limited descriptions. Furthermore (in contrast to OSM),\nthe feasibility of efficiently aggregating additional information from\nexternal sources in both of these datasets was minimal, as they only\never referenced private websites or social media profiles. Consequently,\nOSM was chosen to constitute the skeleton of our knowledge\nbase.</p>\n\n<h3>OSM Pre-Processing</h3>\n\n<table>\n <thead>\n <tr>\n <th>Feature Type</th>\n <th>Quantity (in thousands)</th>\n <th>With Wikidata Reference</th>\n </tr>\n </thead>\n <tbody>\n <tr><td>Ways</td><td>19.1</td><td>362</td></tr>\n <tr><td>Segmented Ways</td><td>38.6</td><td>362</td></tr>\n <tr><td>Nodes</td><td>34.6</td><td>1086</td></tr>\n <tr><td>Buildings</td><td>35.9</td><td>133</td></tr>\n <tr><td>Outdoor areas</td><td>2.3</td><td>35</td></tr>\n </tbody>\n</table>\n\n<p><em>Summary of extracted OSM feature counts for Cambridge, UK.</em></p> 
\n\n<p>To construct a robust knowledge base from OSM and to minimize the risk\nof losing potentially useful information or data points, we chose to\nmanually implement our own filters and process raw OSM data (instead of\nrelying on existing third-party post-processed datasets or APIs).</p>\n\n<p>The segment network used in our work was created from segmented OSM\n\u201cways\u201d, where each segment is defined at both ends either by a junction\nwith another segment or an isolated end. In the particular case of\nCambridge, OSM holds all kinds of transportation segments, from highways\nto unofficial \u201cdesire paths\u201d. Next, all nodes, as well as recorded\nbuildings, were extracted and stored. However, for both of these feature\ntypes, only the entries with some informative descriptions were kept.\nLastly, relevant outdoor areas were extracted, such as playgrounds,\nwater bodies, or parks. Where appropriate, these areas were conflated,\nsince raw data from OSM sometimes suffers from redundant or segmented\narea entries. Furthermore, for all OSM buildings, ways, and nodes, a\nwritten English description from Wikidata was scraped and appended to\nthe database whenever available. In the context of our model, and\nsimilarly to some user-uploaded text descriptions of nodes in OSM,\nWikidata\u2019s descriptions suffer from non-regularity. The database\npresents descriptions of varying lengths and informative values.\nTherefore, the scraped descriptions were cleaned of, for example,\nunwanted geographical names (since those were expected to provide little\nbenefit later on), and shortened where appropriate. The resulting\nquantities for each of these feature types are shown in the table above.</p>\n\n<h3>Tree Dataset</h3>\n\n<p>Since greenery can play a vital role in a data-driven inference of\nwalkability, particularly for the geographical regions we were interested\nin (the UK), having accurate estimates about the locations and\nquantities of trees is highly valuable. 
Although trees (and other\ngreenery) are a common node type in OSM data, their representation\nfalls far short of reality. Within the boundaries of Cambridge, OSM tracks\nfewer than 3.5 thousand trees, substantially underestimating the actual\ncount. In contrast, the specialized tree datasets (as introduced\nearlier) offer a more comprehensive\nand reliable source of tree-related data. Therefore, the VOM data was\nleveraged. Specifically, this project relies on a processed version of\nthe VOM raster, after a tree segmentation completed with the lidR\npackage\u00a0(Roussel, Goodbody, and Tompalski 2025). This version of the\ndataset was kindly provided by Andr\u00e9s Camilo Z\u00fa\u00f1iga-Gonz\u00e1lez (an AI4ER\nPh.D. student at the University of Cambridge)\u00a0(Z\u00fa\u00f1iga Gonz\u00e1lez 2025),\nand served as the sole source of tree records for this project. Tree entries\nfrom OSM were therefore ignored. Within the boundaries of\nCambridge, the segmented VOM supplied over 102 thousand trees.</p>\n\n<h3>Open Greenspace Dataset</h3>\n\n<p>The final \u201csupplementary\u201d dataset used was the \u201cGreenspace Dataset\u201d. However, as it\nnarrowly specializes in public green spaces (such as public parks or\nplaygrounds), the Greenspace Dataset was used merely to enhance the\nspatial accuracy and fill in any gaps in the OSM data. Furthermore, for\nCambridge, it only included 398 entries. Therefore, the Greenspace\nDataset and OSM areas were iteratively matched and merged on\ndescriptions and spatial parameters, and stored in one database.</p>\n\n<h3>Point-Geospatial Context Dataset</h3>\n\n<p>This aggregated knowledge base was used to create the final\npoint-to-geospatial-context mappings. First, a set of points was sampled\nfrom each of the segments in 10-meter intervals. 
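Such interval sampling along a segment polyline can be sketched as follows, assuming projected coordinates in meters; the function name and the carry-over bookkeeping are ours, not from the actual pipeline:

```python
import math

def sample_along(polyline, step=10.0):
    """Walk a polyline (list of (x, y) points in meters) and emit a point every
    `step` meters of travelled distance, including the start point."""
    points = [polyline[0]]
    carried = 0.0  # distance walked since the last emitted sample
    for (x1, y1), (x2, y2) in zip(polyline, polyline[1:]):
        seg_len = math.hypot(x2 - x1, y2 - y1)
        dist = step - carried  # distance into this piece of the next sample
        while dist <= seg_len:
            t = dist / seg_len  # linear interpolation along the piece
            points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
            dist += step
        carried = (carried + seg_len) % step
    return points
```

Each emitted point would then be matched against the buffer zones described below.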
For each of these\npoints, all entities within a pre-defined buffer zone were recorded.\nThese buffer zones were set to a 40-meter radius for buildings, and a\n30-meter radius for all other feature types. Furthermore, each of these\nsegment points was also mapped to any outdoor areas it intersected.</p>\n\n<p>Given a specific point on a segment, these mappings were then used to\nretrieve text descriptions of the features from the parsed datasets. For\neach data type (such as nodes or areas), a priority mechanism selected\nthe most desirable attributes (such as building or business type, or\nWikidata description). The entity descriptions were then compiled into\nsentence descriptions. While the exact structure of the sentence\ndescription was also subject to much experimentation (partly because\nsome sentence encoders are better suited to specific structures), the\neventual structure of the descriptions introduced the different feature\ntypes in order, transitioned between these types with consistent\nconnector phrases, and represented missing entities of a given feature\ntype with \u201c<code>nothing</code>\u201d. Specifically, the default descriptions followed\nthis format:</p>\n\n<div><div><pre><code>[segment infrastructure description];\n IN AREAS: [list of areas];\n NEARBY: [list of nearby nodes and buildings].\n</code></pre></div></div>\n\n<h2>Encoder Fine-Tuning</h2>\n\n<p>To produce representations from the assembled dataset of\npoint-to-description mappings, we used sentence encoders (discussed more\nclosely in the background chapter of this series). However, while the\nability to make semantic associations was the key reason for choosing\npre-trained sentence encoders, these models had to first be lightly\nre-focused towards representing our specific descriptions. 
This was\nachieved through a contrastive fine-tuning process.</p>\n\n<h3>Fine-Tuning Dataset</h3>\n\n<p>To create a dataset for the encoder fine-tuning, a set of compiled place\ndescriptions was encoded with an off-the-shelf encoder (specifically,\nwith the \u201call-MiniLM-L6-v2\u201d from the \u201csentence-transformers\u201d\nlibrary\u00a0(Reimers and Gurevych 2019)). Afterwards, 12,500 unique data\npoints were selected based on their respective embeddings with\nfurthest-point sampling to maximize the degree of diversity within the\ndataset.</p>\n\n<p>These points were then scored and labeled on the basis of walkability\nwith the Mistral 7B language model\u00a0(Jiang et al. 2023). The language\nmodel was prompted to assign a numerical score on a scale of zero to\nten, where zero stood for the least walkable descriptions (such as\ndescriptions of points on highways) and ten for the most walkable\ndescriptions (such as descriptions from park footpaths). The prompt used\nfor this purpose related to the concepts of walkability summarized earlier, particularly the\nwork of <em>Alfonzo</em>\u00a0(Alfonzo 2005).</p>\n\n<h3>Embedding Architecture</h3>\n\n<p>There is a plethora of pre-trained, publicly available sentence encoders,\nmany of which advertise broad domain versatility in\ninformation retrieval, sentence similarity, or clustering tasks. Hence,\nthe selection of the most suitable encoder models was a highly iterative\nprocess. Moreover, the strategy of employing\nthese encoder models was also initially unclear, and two main options\nwere considered.</p>\n\n<p>The first option was to encompass all of the desired information for a\ngiven point in a single sentence, and then use a single encoder to\ngenerate the point embeddings. 
This approach offered much simplicity,\nbut imposed the risk of relying too heavily on the encoder model\u2019s\nability to extract and represent all of the important features.\nMoreover, this approach was less flexible for potential future\nimplementations, where, for instance, not all features should be used to\ngenerate embeddings.</p>\n\n<p>The second option was to embed each feature or section of the\ndescription individually, potentially with different encoder models,\nlater composing these embeddings into a single vector. A similar\napproach is developed in, for instance, the aforementioned work by\n<em>Tempelmeier et al.</em>\u00a0(Tempelmeier, Gottschalk, and Demidova 2021).\nTherefore, several implementations of this approach were tested, none\nwith satisfactory results. In some of the attempts, a set of embeddings of\nindividual features of a given point was composed by simply averaging\nthose feature embeddings. Alternatively, the composed vector\nwas generated via a fusion component, which was also trained during the\nfine-tuning phase.</p>\n\n<p>Nonetheless, none of the attempts to compose embeddings of individual\nfeatures into a single vector proved useful. The models were prone to\nover-clustering (pulling similar samples too close together)\nduring the contrastive fine-tuning phase, and generally failed to retain\nthe ability of the original off-the-shelf models to later make relevant\nsemantic associations.</p>\n\n<p>Hence, this work relies on a single encoder architecture, processing\ndescriptions composed of single sentences. Furthermore, the\nfine-tuning of the sentence encoders was done via LoRA adapters. 
The adapters were injected into\neach of the pre-trained models, and while the models\u2019 weights remained\nfrozen during the fine-tuning, the adapters\u2019 weights adjusted to the\ncontrastive objective.</p>\n\n<h3>Contrastive Fine-Tuning</h3>\n\n<p>With the LLM-labeled dataset, sentence encoders were fine-tuned using\na triplet loss-based strategy. This strategy was implemented by\nsimply splitting the training examples into a positive and a negative\nbin. The positive bin contained data points with an LLM-assigned score\nhigher than or equal to seven, and the negative bin those with scores\nlower than or equal to three. In order to create a\nclear contrast between the \u201cwalkable\u201d and the \u201cunwalkable\u201d, data points\nthat fell into neither of the two bins were discarded. After this\nbinning, the positive bin contained 5390 examples, and the negative bin\n1060 examples. This disparity between the sizes of the two bins was most\nlikely caused by the fact that points with low walkability scores were\nfrequently associated with fewer features (e.g., high-speed roads in\nurban outskirts) whereas highly walkable places were more commonly\nsurrounded by heterogeneous elements (e.g., paths surrounded by\namenities or places). Hence, there were fewer unique points with poor\nwalkability than unique points with high walkability.</p>\n\n<p>During the training, and due to the contrasting cardinalities of the two\nbins, the dataloader sampled the positive and negative examples randomly\nfor each iterated anchor. Furthermore, every time an example data-point\nwas used, its list of associated areas and of nearby nodes and buildings\nwas first randomly shuffled to instill a degree of permutation invariance\nin the encoder.</p>\n\n<p>Extended with the LoRA adapters, the models adjusted to the fine-tuning\nobjective after only a few epochs and only required minimal training\ndurations. 
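The binning and per-anchor random sampling described above can be sketched as follows; the score thresholds are taken from the text, while the function names and data are ours:

```python
import random

def make_bins(scored, pos_min=7, neg_max=3):
    """Split LLM-scored descriptions into a positive and a negative bin;
    ambiguous mid-range examples are discarded to sharpen the contrast."""
    positives = [text for text, score in scored if score >= pos_min]
    negatives = [text for text, score in scored if score <= neg_max]
    return positives, negatives

def sample_triplet(positives, negatives, rng=random):
    """Draw the anchor and the positive from the positive bin and the negative
    from the negative bin, re-sampling each iteration to offset the bins'
    size disparity."""
    anchor, positive = rng.sample(positives, 2)
    negative = rng.choice(negatives)
    return anchor, positive, negative
```

Each sampled triplet would then feed a triplet loss pulling the anchor towards the positive and away from the negative.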
Although no model was fine-tuned for more than fifteen\nepochs, generally only models trained for fewer than five epochs proved\nuseful. Unsurprisingly,\ndue to the contrastive objective and the crudeness of the data bins, the\nprevention of over-clustering was essential. While thoroughly fine-tuned\nencoders still managed to classify examples as walkable or non-walkable\nin downstream tasks, their representations became dominated by this\ndistinction and neglected other features present in the examples.</p>\n\n<h2>Urban Embeddings and Scoring</h2>\n\n<p>Leveraging the ability of sentence encoders to independently project\nindividual examples into the embedding space, we developed an\nanchor-based method for the generation of absolute walkability scores.\nFurthermore, because of the use of anchors and the encoder\u2019s ability to\nhighlight semantic associations, we were able to further readjust the\nscoring pipeline and generate not only general walkability scores but\nalso scores reflective of more specific pedestrian preferences.</p>\n\n<h3>Walkability Scoring</h3>\n\n<p>Although simple distance metrics, such as cosine similarity, are very\nfrequently used for tasks such as embedding-based retrieval, their\noutputs reflect relative relationships only within the considered set of\nexamples. For instance, if plain cosine similarity were used to infer\nwalkability indices in a specific area, the obtained \u201cscores\u201d would\nimply walkability only relative to the other points in the sample, and\nnot to any general expectations regarding walkability.</p>\n\n<p>Therefore, we used an anchor-based linear scaling approach to establish\nthese expectations. The approach considers three anchor vectors: a\ncompletely negative anchor (representing highly unwalkable data points),\na neutral anchor (representing data points of average walkability), and\na positive anchor (representing data points with the highest possible\nwalkability indices). 
These anchors were used to establish a set of\nthresholds, i.e., where specific ranges of walkability indices begin in\nthe embedding space and where they end. Each respective threshold was\ndefined as the cosine distance from the positive anchor. More\nspecifically, since in this work we used three anchors, the negative\nanchor defined the distance-from-the-positive-anchor threshold for all\nwalkability scores equal to zero, and the neutral anchor for scores\nequal to five. Since distances in the embedding space may not be\nproportional to the actual walkability scores, the\nneutral anchor was added with the intention of adjusting for this\ndiscrepancy and improving the scoring system\u2019s outputs. Then, a given\nexample\u2019s embedding was situated on the\nthreshold scale based on its similarity to the positive anchor, and its\nabsolute score was calculated through linear scaling, with the two\nthresholds as points of reference.</p>\n\n<p>To obtain each of the anchors, a set of manually selected example\nsentences was constructed. Each sentence was meant to provide a\nspecific, strong example of the type of descriptions the given anchor\nrepresents. Each sentence was then embedded with the fine-tuned encoder,\nand the entire set was averaged to produce the final vectorized anchor.\nThe curation of the sentences used in the anchors was, nevertheless, not\nguided by any exact notions, and after a number of experimental\niterations, all three sets consisted of twelve exemplary sentences,\nfollowing the sentence structure described earlier.</p>\n\n<h3>Embedding Sets</h3>\n\n<p>A significant advantage of using a similarity-based scoring system lies\nin its computational efficiency, once the point-wise embeddings are\ngenerated. After obtaining a fine-tuned model, the preferences (such as\nthe various reference points) are reflected only in the anchors, and not\nin the representations of the geospatial points. 
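The anchor-based linear scaling described earlier can be sketched with toy two-dimensional anchors; the piecewise mapping below is our reading of the two-threshold scheme, not the exact implementation:

```python
import math

def cos_dist(u, v):
    """Cosine distance: 1 minus cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def anchor_score(emb, positive, neutral, negative):
    """Map an embedding to a 0-10 walkability score by linearly scaling its
    cosine distance from the positive anchor between two thresholds: the
    negative anchor's distance (score 0) and the neutral anchor's (score 5)."""
    d = cos_dist(emb, positive)
    d0 = cos_dist(negative, positive)  # threshold for score 0
    d5 = cos_dist(neutral, positive)   # threshold for score 5
    if d >= d0:
        return 0.0
    if d >= d5:  # linearly scale between scores 0 and 5
        return 5.0 * (d0 - d) / (d0 - d5)
    return 5.0 + 5.0 * (d5 - d) / d5  # between scores 5 and 10

# Toy anchors; in practice each is an average of twelve embedded sentences.
positive = [1.0, 0.0]
neutral = [1.0, 1.0]
negative = [0.0, 1.0]
```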
Therefore, to generate\nscores, the system only needs to embed the few walkability anchors and\nperform the linear-scaling scoring. Since cosine similarity is\ncomputationally inexpensive, this process is very\nquick and allows for the geospatial embeddings of the entire area of\ninterest to be pre-computed. Therefore, a dataset of mappings from\npoints (defined with geographical coordinates) to embedded descriptions\ncan be stored and used later in various downstream tasks.</p>\n\n<h3>Custom Preference Scoring</h3>\n\n<p>Despite the specialized fine-tuning, the embeddings created from\ndescriptions of geospatial points can be used for more than strictly\ngeneral walkability-focused tasks - for instance, to express preferences\ntowards particular geospatial areas or elements. In fact, by adjusting the\nanchors used in our linear scoring method, more specific pedestrian\npreferences can be used to generate the walkability scores. If the\nfine-tuning performed is sufficiently light, these preferences are then\nreflected in the embeddings generated by the encoder. Subsequently, the\nscoring pipeline rewards data points closer to those preference-adjusted\nembeddings and generates scores that lean towards the initial\npreferences. Specific implementations of this feature are discussed in\nthe <em>Evaluation</em> chapter of this series.</p>\n\n<h2>Path-Finding</h2>\n\n<p>With access to point-wise walkability indices generated by our scoring\npipeline, capable of producing evaluations of unrestricted spatial\ngranularity, we assembled a new routing algorithm. Unlike existing\napproaches, our algorithm did not have to rely on costs calculated with\nmanually fine-tuned static profiles. Instead, it was supported by scores\ncalculated based on embeddings generated by the custom sentence\nencoders, and thus reflected the variety of our aggregated geospatial\ndata. We used our OSM segment database to construct an infrastructure\nnetwork. 
Then, we combined aggregates of the walkability or specific\npreference-based scores with the segment lengths to calculate total\ncosts for each of the segments in the network. To generate paths in this\nnetwork, we used an A*-based searching algorithm. The implementation of\nour A* was relatively straightforward. It relied on a unidirectional\nsearch with no particular tweaks or optimizations (such as contraction\nhierarchies). This was because, in the scope of this work, pedestrian\nrouting in urban areas was our only focus. Hence, similar adjustments\nand optimizations, often implemented by existing path-finding\nframeworks, were deemed unnecessary.</p>\n\n<h3>Cost Estimation</h3>\n\n<p>Establishing an effective approach to calculating the overall\ncost-so-far $g(n)$ for the A* algorithm required more nuance. This was\nprimarily because of the point-based approach, where highly desirable\n(or undesirable) features were often reflected in only a few points.\nMoreover, depending on the anchor configuration, considerable\ndifferences between points were reflected only by marginal differences in the\nscores. Therefore, it was necessary to effectively prevent the \u201caverage\u201d\npoints from outweighing the critically important points. Similarly,\nfinding a working balance between the point scores and the distance (which\nstill had to be reflected in the cost calculation) was crucial for the generation of\ndesirable routes.</p>\n\n\[\text{segment cost} = \frac{n}{\sum_{i=1}^{n} \frac{1}{\text{inv. score}_i + \delta}} \cdot \text{segment length}\]\n\n<p>Considering these factors, a harmonic mean-based approach was eventually\nadopted. 
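This harmonic-mean cost aggregation can be sketched directly; the function name and the inversion convention are spelled out here for illustration, with delta as given in the text:

```python
def segment_cost(inverted_scores, segment_length, delta=1e-6):
    """Harmonic mean of the (inverted) point scores along a segment, scaled by
    its length. Scores are inverted so that lower is better; since the harmonic
    mean is pulled toward the smallest values, a few highly desirable points
    dominate the aggregate instead of being washed out by "average" points."""
    n = len(inverted_scores)
    harmonic = n / sum(1.0 / (s + delta) for s in inverted_scores)
    return harmonic * segment_length
```

The small `delta` guards against division by zero when a point's inverted score is exactly zero.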
To calculate a score for a specific segment, the above formula was used, with the $\\delta$\nconstant equal to $10^{-6}$ and scores proportionately inverted so that\nlower scores were \u201cbetter\u201d and resulted in lower costs.</p>\n\n<h3>Heuristic Function</h3>\n\n<p>Similarly to related path-finding frameworks and implementations, the\nheuristic function used in this work remained simple. In fact, our A*\nsimply used the total Euclidean distance between the iterated and the\ntarget nodes, scaled by the globally lowest calculated cost. By scaling\nthe distance with the lowest cost, the heuristic remained a guaranteed\nunderestimate of the true path cost and was, therefore, admissible. In\nthis way, A* received an informed estimate with a minimal computational\noverhead and without the risk of sub-optimality.</p>\n\n<h3>References</h3>\n\n<ul>\n <li>Alfonzo, M. A. (2005). <em>To Walk or Not to Walk? The Hierarchy of Walking Needs</em>. <em>Environment and Behavior</em>, 37(6), 808\u2013836.</li>\n <li>Jiang, A. Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D. S., de las Casas, D., Bressand, F., et al. (2023). <em>Mistral 7B</em>.\n<a href=\"https://arxiv.org/abs/2310.06825\">https://arxiv.org/abs/2310.06825</a></li>\n <li>Reimers, N., & Gurevych, I. (2019). <em>Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks</em>. In <em>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing</em>. Association for Computational Linguistics.\n<a href=\"https://arxiv.org/abs/1908.10084\">https://arxiv.org/abs/1908.10084</a></li>\n <li>Roussel, J.-R., Goodbody, T. R. H., & Tompalski, P. (2025). <em>The lidR Package</em>.\n<a href=\"https://r-lidar.github.io/lidRbook/\">https://r-lidar.github.io/lidRbook/</a></li>\n <li>Tempelmeier, N., Gottschalk, S., & Demidova, E. (2021). <em>GeoVectors: A Linked Open Corpus of OpenStreetMap Embeddings on World Scale</em>. 
In <em>Proceedings of the 30th ACM International Conference on Information & Knowledge Management</em>, 4604\u20134612.</li>\n <li>Z\u00fa\u00f1iga Gonz\u00e1lez, A. C. (2025). <em>Post-Processed LiDAR Point-Cloud Dataset</em>. Unpublished dataset, provided by the author.</li>\n</ul>",
+22
ibrahim/alhasacademy.github.io_.json
···+"summary": "<p><strong>Research Software Engineering at Cambridge Zoology</strong></p>\n \n <div>\n <h4>TL;DR</h4>\n <p>Started new role as Research Software Engineer at Cambridge Zoology Department. Successfully transitioned from LCFI and began integrating with the interdisciplinary team at Conservation Evidence.</p>\n </div>\n \n <p>This week marked the beginning of my role as Research Software Engineer (the first and only one!) at the University of Cambridge's Department of Zoology. The transition from LCFI has been smooth, and I'm excited to work on new research projects that combine my software engineering expertise with real-world research problems.</p>\n \n <h3>The Team and the Project</h3>\n <p>The Conservation Evidence team, based in the University of Cambridge's Department of Zoology, maintains a free, open-access platform that trawls global scientific and grey literature to collate and summarise the results of conservation interventions. It distils these findings into plain-language \"synopses\" for specific species groups or habitats, then convenes expert panels to score each action's effectiveness in the definitive What Works in Conservation handbook. Alongside this synthesis work, the team publishes the peer-reviewed Conservation Evidence Journal so practitioners can share new case studies, and it collaborates with NGOs, businesses and policymakers to embed evidence-based decision-support tools in real-world planning. Their overarching goal is to give conservationists rapid, unbiased access to the best available evidence so scarce resources are channelled into actions that demonstrably benefit biodiversity.</p>",+"content": "<p><strong>Research Software Engineering at Cambridge Zoology</strong></p>\n \n <div>\n <h4>TL;DR</h4>\n <p>Started new role as Research Software Engineer at Cambridge Zoology Department. 
Successfully transitioned from LCFI and began integrating with the interdisciplinary team at Conservation Evidence.</p>\n </div>\n \n <p>This week marked the beginning of my role as Research Software Engineer (the first and only one!) at the University of Cambridge's Department of Zoology. The transition from LCFI has been smooth, and I'm excited to work on new research projects that combine my software engineering expertise with real-world research problems.</p>\n \n <h3>The Team and the Project</h3>\n <p>The Conservation Evidence team, based in the University of Cambridge's Department of Zoology, maintains a free, open-access platform that trawls global scientific and grey literature to collate and summarise the results of conservation interventions. It distils these findings into plain-language \"synopses\" for specific species groups or habitats, then convenes expert panels to score each action's effectiveness in the definitive What Works in Conservation handbook. Alongside this synthesis work, the team publishes the peer-reviewed Conservation Evidence Journal so practitioners can share new case studies, and it collaborates with NGOs, businesses and policymakers to embed evidence-based decision-support tools in real-world planning. Their overarching goal is to give conservationists rapid, unbiased access to the best available evidence so scarce resources are channelled into actions that demonstrably benefit biodiversity.</p>",
+2
-2
ibrahim/metadata.json
+33
-33
index.json
+22
jess/2025_04_19_first-blog.json
···+"summary": "Yay! My very first blog post. It is a good feeling to have a personal space to share my work and some thoughts.",+"content": "<p>This week I had two (unrelated) meetings with people who work with zk-SNARKs - the first time I talked to people who actually work with zero-knowledge proofs (zkp) as part of their jobs! I had mixed feelings after the meetings. On the one hand it was so exciting to talk to people who work with zkp in real life, so interesting to hear about their applications; on the other it is a bit intimidating just how much there is to learn in this field, and not everything is useful. New frameworks, languages and zkVMs have popped up within the last five or six years, created mainly to address two issues: (a) time-consuming computation and (b) user-unfriendly, complex proof logic. The underlying maths and cryptography used for zero-knowledge are pretty stable. The problem is, with languages abstracted further and further away from the proof logic and with the priority on speed, the cost has shifted to security and privacy protection properties. This <a href=\"https://vac.dev/rlog/zkVM-explorations/\">article</a> gives a very high-level but direct comparison of existing zkp languages/zkVMs based on their \u201czk\u2019ness\u201d, which could be useful if you don\u2019t know where to start and if privacy is important in your use case.</p>\n\n<p>My first prototype using zkp to tackle carbon emissions claims was written in <a href=\"https://docs.circom.io/\">Circom</a>. The prototype was built for a use case in which a customer of a cloud provider wants to know the carbon emissions based on their usage. The customer\u2019s business runs on servers hosted by their cloud provider and they want to know their <a href=\"https://ghgprotocol.org/sites/default/files/2022-12/FAQ.pdf\">Scope 3 emissions</a>. 
Existing systems and methodologies for carbon emissions reporting rely on customers either trusting the data from their providers unconditionally or recruiting third-party independent auditors to verify the data. With zkp, customers can automate the verification as frequently as needed. The providers do not need to reveal confidential input that goes into the emissions accounting; for example, they might not want to reveal their business volume by giving away their total power consumption at any one of their data centres, nor would they want to reveal data related to their electricity suppliers.</p>\n\n<p>There is one tricky bit in this prototype - how can we ensure that the customer share of the power consumption is accurate? We could apply the \u201cCompleteness Principle\u201d of the <a href=\"https://ghgprotocol.org/sites/default/files/standards/ghg-protocol-revised.pdf\">Greenhouse Gas Protocol</a>, where all sources of emissions have to be accounted for. So we can assume that the divided power consumption must add up to 100% of the total power consumption. Therefore we could make it a requirement that providers also publish a transparency log with encrypted customer data; then we can use homomorphic cryptography to prove that all customer shares in percent add up to 100. Moreover, if the data on the log is arranged in a Merkle Tree, customers can also verify that they are indeed part of this customer base. This is not bulletproof unfortunately - providers can still cheat by adding fake customers to the log. I will provide more information about this problem in future blogs.</p>\n\n<p>Now back to the proof that all customer shares add up to 100: I can use the Paillier cryptosystem [1] for this. Given that each customer share is encrypted using Paillier, we can then do homomorphic addition to prove that they add up to 100 without knowing each individual share, hence protecting the private data. 
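To make the homomorphic-addition idea concrete, here is a toy Paillier sketch in Python with tiny primes - an illustration only, not the prototype's Circom circuit, and nowhere near a secure key size. Multiplying Paillier ciphertexts adds the underlying plaintexts, so the encrypted shares can be checked to sum to 100 without decrypting any individual share.

```python
import math
import random

def paillier_keygen(p, q):
    # Toy Paillier key generation from two primes (tiny primes for
    # illustration only -- real deployments need ~1536-bit primes).
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)            # valid because we fix g = n + 1
    return n, lam, mu

def paillier_encrypt(n, m, r=None):
    # E(m) = (1 + n)^m * r^n mod n^2, with random r coprime to n.
    n2 = n * n
    if r is None:
        r = random.randrange(2, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(2, n)
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2

def paillier_decrypt(n, lam, mu, c):
    # D(c) = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n.
    n2 = n * n
    return (pow(c, lam, n2) - 1) // n * mu % n

# Homomorphic addition: multiplying ciphertexts adds plaintexts.
n, lam, mu = paillier_keygen(17, 19)
shares = [30, 45, 25]               # hypothetical customer shares in percent
product = 1
for s in shares:
    product = (product * paillier_encrypt(n, s)) % (n * n)
assert paillier_decrypt(n, lam, mu, product) == 100
```

The verifier only ever sees the per-customer ciphertexts and the product; the individual shares stay private.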
This can be done outside of zkSNARK, but we still need to check that the encrypted share used in the carbon emissions calculation is the right one!</p>\n\n<p>To achieve that I added a circuit that can do Paillier encryption on the (private) customer share. In this circuit the encrypted customer share used in the transparency log is checked against the customer share encrypted in the circuit. As it turns out, this encryption is pretty computationally expensive! Paillier\u2019s modulus is the square of the product of two prime numbers, and to achieve a high security property (at least 128-bit <a href=\"https://www.keylength.com/en/4/\">security strength</a>) we need the modulus size to be 3072 bits. It is a big number and therefore needs to be divided into field elements for the arithmetic operations. The bigger the modulus, the higher the number of constraints generated by the circuit. My laptop cannot complete a run with a modulus size bigger than 200 bits.</p>\n\n<p>I did some benchmarking and plotted the results to find out the relationship between the various field element sizes and the number of constraints generated. The results show that with the same key size, the more bits packed into each field element, the fewer constraints are generated:</p>\n\n<p><img alt=\"Number of constraints generated by various key sizes, broken down into field elements of various bit sizes\" src=\"http://localhost:4000/assets/img/bits_vs_elts.png\"></p>\n\n<p>So how do we improve on this? What is the max number of bits that can be packed into a single field element? Tune in next blog post!</p>\n\n<p>[1] Paillier, P., 1999, April. Public-key cryptosystems based on composite degree residuosity classes. In International conference on the theory and applications of cryptographic techniques (pp. 223-238). Berlin, Heidelberg: Springer Berlin Heidelberg.</p>",
+21
jess/2025_04_26_fun-with-recursion.json
···+"content": "<p>Following on from the <a href=\"https://blogs.jadecoral.me/2025/04/19/first-blog.html\">previous blog post</a> regarding the prototype I built to generate carbon emissions proofs, I found out that the maximum number of bits that can be packed into a field element in Circom is 126. Therefore, if we want to have 128-bit security strength as mentioned previously, we need to have a modulus size of 6144 bits. For <a href=\"https://en.wikipedia.org/wiki/Paillier_cryptosystem\">Paillier</a>, the modulus is the square of the key (the product of two prime numbers), so the key needs to be 3072 bits to achieve 128-bit security strength. If the number of bits in a field element can only be up to 126, that means we will need 25 elements in the field element array to represent a key size that is at least 3072 bits.</p>\n\n<p>It is also not a straightforward task to write the Paillier encryption in circuits. Instead of a basic exponentiation computation (the random number r needs to be raised to the power of the key, n) by calling something like r**n and letting the compiler/runtime engine deal with the rest, the circuit needs to include r**n as part of the proof, reduced to a \u201cRank-1 Constraint System\u201d (R1CS; the \u2018S\u2019 is sometimes also read as Satisfaction). In R1CS the algebraic circuits are expressed as a set of vectors and matrices, which in turn are converted to a set of polynomials to be used for the rest of the zkSNARK pipeline. So how do you express r**n as algebraic circuits in the first place?</p>\n\n<p>At first I tried the naive approach and simply created a loop (Circom supports loops) for r to multiply itself n times. This turned out to have very bad performance. 
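The limb decomposition can be sketched in Python: splitting a big integer into 126-bit chunks shows why a 3072-bit key needs 25 field elements (ceil(3072 / 126) = 25). This is an illustrative model, not Circom code.

```python
def to_limbs(x, bits=126):
    # Split a big integer into little-endian limbs of `bits` bits each,
    # mirroring how a large modulus is packed into field elements.
    mask = (1 << bits) - 1
    limbs = []
    while x:
        limbs.append(x & mask)
        x >>= bits
    return limbs

def from_limbs(limbs, bits=126):
    # Recombine little-endian limbs back into the original integer.
    return sum(l << (i * bits) for i, l in enumerate(limbs))

# A 3072-bit value needs ceil(3072 / 126) = 25 limbs of 126 bits.
modulus = (1 << 3072) - 1
limbs = to_limbs(modulus)
assert len(limbs) == 25
assert from_limbs(limbs) == modulus
```

Every arithmetic operation on the packed value then has to be expressed limb by limb inside the circuit, which is where the constraint count explodes.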
Then, with my supervisor Martin\u2019s help, I was able to apply the <a href=\"https://en.wikipedia.org/wiki/Exponentiation_by_squaring\">Square and Multiply</a> method into a Circom circuit, which makes it way more performant. The circuit looks like this:</p>\n\n<p><img alt=\"\" src=\"http://localhost:4000/assets/img/exp_circuit.png\"></p>\n\n<p>However, it is still too big (in terms of the number of constraints). The carbon emissions prototype circuits with the Paillier encryption added were compiled, and the circom compiler reported that it has ~142 million constraints (as shown below). The trusted setup required to kick off the zkSNARK system, which uses the Groth16 protocol, will therefore have to be able to support up to 2^28 constraints, which is the maximum <a href=\"https://github.com/iden3/snarkjs\">snarkjs</a> can support currently. The high number of constraints causes the Powers of Tau ceremony for the trusted setup to take a very long time (days!). However, I could not even complete the experiment with a keysize bigger than 1000bits on my laptop, as it doesn\u2019t have enough memory to carry out the trusted setup and proof generation.</p>\n\n<div><div><pre><code><span>non-linear constraints: 142769486\nlinear constraints: 0\npublic inputs: 28\nprivate inputs: 65\npublic outputs: 1\nwires: 141986048\nlabels: 149556141\nWritten successfully: ./emissions_proof.r1cs\nWritten successfully: ./emissions_proof.sym\nthread 'main' panicked at code_producers/src/wasm_elements/wasm_code_generator.rs:9:5:\nthe size of memory needs addresses beyond 32 bits long. This circuit cannot be run on WebAssembly\n</span></code></pre></div></div>\n\n<p>So, the experiment with Circom didn\u2019t feel satisfactory because of the Paillier encryption. 
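For reference, here is a Python model of the square-and-multiply structure - not the actual Circom circuit. The loop runs a fixed number of iterations determined only by the exponent's bit width, which is exactly the property that makes the method expressible as a fixed-size circuit.

```python
def square_and_multiply(r, e, m, bits=32):
    # Fixed-iteration square-and-multiply: the iteration count depends
    # only on `bits`, not on the value of the exponent `e`.
    acc = 1
    for i in range(bits - 1, -1, -1):   # scan exponent bits, MSB first
        acc = (acc * acc) % m           # always square
        if (e >> i) & 1:                # multiply only when the bit is set
            acc = (acc * r) % m
    return acc
```

This needs on the order of `bits` squarings and multiplications instead of `e` multiplications for the naive loop, which is why it performs so much better once reduced to constraints.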
Taking a step back, the Paillier encryption was added to prove that the provided customer share was correctly encrypted, so that outside of zkSNARK we can verify, using the Paillier cryptosystem, that all customers\u2019 shares reported by the data centre operator add up to 100% of the total power usage. If we can find a way to prove that within a SNARK proof, without having to input the data of every customer at once (a data centre could potentially have thousands or even millions of customers!), then we won\u2019t need to apply the Paillier cryptosystem at all.</p>\n\n<p>One way to do it is through recursive SNARKs [1]. The input to each proof can be limited to one customer share at a time, and we add the share to the previous customer\u2019s share recursively. Through enough recursive steps to go through the whole customer base, the final proof output should be 100!</p>\n\n<p>Circom does not support recursion, so for the past couple of weeks I have been experimenting with two different methods: one is to use a framework called <a href=\"https://docs.minaprotocol.com/zkapps/o1js\">o1js</a>, a TypeScript library provided as part of the <a href=\"https://docs.minaprotocol.com/\">Mina blockchain protocol</a>, created and maintained by <a href=\"https://www.o1labs.org/\">O(1)Labs</a> and <a href=\"https://www.minafoundation.com/\">Mina Foundation</a>; another one is to use a zkVM, e.g. <a href=\"https://risczero.com/\">RiscZero</a>, <a href=\"https://docs.succinct.xyz/\">SP1</a> and Jolt (Arun, A., Setty, S. and Thaler, J., 2024, April. Jolt: SNARKs for virtual machines via lookups. In Annual International Conference on the Theory and Applications of Cryptographic Techniques (pp. 3-33). Cham: Springer Nature Switzerland.). A zkVM provides a virtual environment that generates zk-proofs, abstracting away the complexity of circuit logic and providing a more developer-friendly language (e.g. Rust) to write circuits. 
Within the last five or six years there has been lots of effort to improve the performance of zkVMs.</p>\n\n<p>It would be very interesting to see the results from these two methods!</p>\n\n<p>I tried the o1js framework first. The first impression was already a win compared with my Circom experiment. I am able to express my circuits within a few lines, and the readily available modules are sufficient for me to write the same emissions proof prototype. With the support of recursion, I am now able to do the customer share additions one by one in each proof, and then verify that the final output from the recursive proof is indeed 100.</p>\n\n<p>Right now I am trying to learn about RiscZero and SP1. So far RiscZero\u2019s protocol and framework make more sense to me; SP1 has abstracted the zero-knowledge proving part so much that it is quite difficult to express my intention using their framework. It is very much designed for writing proofs that can be deployed to smart contracts.</p>\n\n<p>In terms of performance, my initial observation (without measurements) is that they take substantially more compute power and time to generate a proof. Verification is still very fast and small.</p>\n\n<p>I will write in more detail about these experiments and results in future blogs. For the next blog though, I think I will go back to the problems I am trying to solve and explore more of the use cases!</p>\n\n<p>[1] Bitansky, N., Canetti, R., Chiesa, A. and Tromer, E., 2013, June. Recursive composition and bootstrapping for SNARKs and proof-carrying data. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing (pp. 111-120)</p>",
+21
jess/2025_04_26_fun-with-yeotokens.json
···+"summary": "For many years I buy milk, butter and other dairy products from a brand called Yeo Valley Organic (Disclaimer: it is purely personal taste that I buy their products. I have no association with the company other than being one of their customers. There are many other brands available, readers please choose based on your own preferences.). The company offer \u201ctokens\u201d on their products for their customers to collect. The collected tokens can be stored on the customers\u2019 accounts via their website, by entering the corresponding code printed on the products. The stored tokens can be spent in exchange for whatever they offer on their website. Even though I don\u2019t use the tokens for anything, I do want to store them. However, I don\u2019t always enter the codes every time I bought or finished a product straightaway. In fact, I almost never do that. Instead, I cut out the codes and put them in a box, thinking that one day I will enter them.",+"content": "<p>For many years I buy milk, butter and other dairy products from a brand called Yeo Valley Organic (Disclaimer: it is purely personal taste that I buy their products. I have no association with the company other than being one of their customers. There are many other brands available, readers please choose based on your own preferences.). The company offer \u201ctokens\u201d on their products for their customers to collect. The collected tokens can be stored on the customers\u2019 accounts via their website, by entering the corresponding code printed on the products. The stored tokens can be spent in exchange for whatever they offer on their website. Even though I don\u2019t use the tokens for anything, I do want to store them. However, I don\u2019t always enter the codes every time I bought or finished a product straightaway. In fact, I almost never do that. Instead, I cut out the codes and put them in a box, thinking that one day I will enter them.</p>\n\n<p>Today was one of those days. 
I decided to \u201cbank\u201d a few tokens by submitting some of the codes. I have accumulated so many that the box I am using has become too full! However, some of the cut-out codes have stuck together, and because they have stuck together for so long, the prints of the codes have merged and faded! An example is shown in the picture above.</p>\n\n<p>So instead of throwing them away (I totally could do that!), I tried to solve it. I used the Magnifier app on my iPad to get a closer look and played with contrast and filters on the image.</p>\n\n<p>Imagine the excitement I felt when I finally cracked it and got the codes accepted!</p>\n\n<p>Sometimes tiny wins do make the day.</p>",
+26
jess/2025_05_15_zero-trust-always-verify.json
···+"summary": "Trust - a simple word but yet such a complicated concept. To determine if someone or something can be trusted, the process tends to involve some evaluation based on a combination of human traits: knowledge, judgement, ethics, morals, to name a few. I cannot do it justice to even try to explain the concept of trust (it took a PhD to formalise trust [2]!). If it is a computer system that is doing the evaluation, then not only does it need to be given the information but also the rules on how that decision should be made. What if the information required by the rules is not all available? The system\u2019s behaviour will be determined by the designer/programmer of the system. A trivial example: a system presents an interface for a user to enter their username and password -> the user inputs some text as the username and some text as the password -> the entered username and/or password does not match with what the system expects -> the system rejects access request.",+"content": "<p>Trust - a simple word but yet such a complicated concept. To determine if someone or something can be trusted, the process tends to involve some evaluation based on a combination of human traits: knowledge, judgement, ethics, morals, to name a few. I cannot do it justice to even try to explain the concept of trust (it took a PhD to formalise trust [2]!). If it is a computer system that is doing the evaluation, then not only does it need to be given the information but also the rules on how that decision should be made. What if the information required by the rules is not all available? The system\u2019s behaviour will be determined by the designer/programmer of the system. 
A trivial example: a system presents an interface for a user to enter their username and password -> the user inputs some text as the username and some text as the password -> the entered username and/or password does not match what the system expects -> the system rejects the access request.</p>\n\n<p>There are many scenarios where someone or a system wants to be trusted (e.g. to gain access) but cannot reveal all the information required, for example due to privacy concerns. Having the ability to prove to other parties that you can be trusted, without telling them any of the secret information needed for the evaluation process, would be very useful. Imagine you receive a call from an unknown number: the person on the line claims that they have important information about your bank account, but they need to verify that you are who they want to speak to first. Neither of the parties involved in this scenario can blindly trust the other. However, if the identities can be verified using cryptographic evidence, i.e. you give the caller some cryptographic data and they would be able to tell whether you are telling the truth or not, and vice versa, then no confidential information is shared in this conversation.</p>\n\n<p>On the other hand, having the ability to verify that the information you are getting is accurate and can be trusted is also very powerful. Companies have strong incentives to hide or even lie about certain information disclosed to the public [3,4,5], so if the information is important then it is crucial that it can be verified. Traditional systems very much depend on manual processes to do the verification, e.g. the UK voting system. Voting in the UK only happens once in a while; the same manual process cannot work if it is applied to a system that has a much lower turnaround time requirement.</p>\n\n<p>The above can be applied to carbon emissions reporting. Firstly, carbon emissions data are very important for tackling climate change. 
Carbon emissions are a measure of the greenhouse gases released into the atmosphere (expressed in terms of carbon dioxide equivalent, CO2e) as a result of burning fossil fuels for generating power, heating, cooling, manufacturing goods and foods, and transportation [6]. Without data we cannot know the state, and without knowing the state we cannot track changes or progress. Secondly, carbon emissions accounting often involves supply chains. It is challenging to get accurate data from company to company for the same reasons mentioned earlier. There are emerging standards for exchanging emissions data between companies. For instance, WBCSD [7] is leading the effort and has produced a set of standards for emissions data exchange [8]. However, the data exchange methodology does not currently involve cryptographic verification. So to achieve trustworthy carbon emissions reporting, we need a way to verify the claims without revealing any business-sensitive data at the same time.</p>\n\n<p>This \u201czero trust, always verify, private data protected\u201d goal can be achieved by applying zero-knowledge proofs.</p>\n\n<h2>USE CASE 1</h2>\n<p>My first paper on this topic was accepted at the LOCO 2024 workshop [9]. It introduces the concept of applying zero-knowledge proofs (ZKPs) to achieve verifiable carbon emissions claims without compromising business-sensitive data in a cloud computing scenario. The ZKP is constructed as follows:</p>\n\n<table>\n <tr><th>Actors</th><th>Roles</th></tr>\n <tr><td>Prover</td><td>Data centre operator. They give their customers the carbon emissions data based on their usage.</td></tr>\n <tr><td>Verifier</td><td>Customer of the data centre, a company who uses the data centre\u2019s hosting service for their online business. They need to produce their sustainability report [10], which includes their scope 3 carbon emissions. Hence, they need to make sure that the data they receive are accurate.</td></tr>\n <tr><td>Electricity supplier</td><td>Supplies electricity to the data centre. They provide the carbon intensity figures for the data centre to do their carbon emissions accounting. The figures are signed by the electricity supplier.</td></tr>\n <tr><td>Smart meter manufacturer</td><td>Makes smart meters that are used by the data centre to measure their electricity consumption. They sign the smart meters\u2019 public keys.</td></tr>\n <tr><td>Trusted certificate authority (CA)</td><td>They are trusted third-party authorities who provide signed certificates for the public keys from the smart meter manufacturer and electricity supplier.</td></tr>\n</table>\n\n<p><br></p>\n\n<table>\n <tr><th>Commitment</th></tr>\n <tr><td>Carbon emissions accounting that produces the emissions claim for the customer</td></tr>\n</table>\n\n<p><br></p>\n\n<table>\n <tr><th>Data</th><th>Public or Private witness</th></tr>\n <tr><td>Carbon emissions claim for the customer</td><td>Public</td></tr>\n <tr><td>CA\u2019s public keys</td><td>Public</td></tr>\n <tr><td>Carbon intensity</td><td>Private</td></tr>\n <tr><td>Electricity consumption</td><td>Private</td></tr>\n <tr><td>Customer\u2019s share of usage</td><td>Private</td></tr>\n <tr><td>Digital signatures for the smart meter reading, smart meter\u2019s public key, manufacturer\u2019s public key, carbon intensity and the electricity supplier\u2019s public key</td><td>Private</td></tr>\n</table>\n\n<h2>USE CASE 2 (an extension to use case 1)</h2>\n\n<p>Considering the cloud computing scenario above, we can imagine that data centre operators buy both carbon-emitting energy and clean energy from their suppliers. This means that the electricity consumed at the data centre has different carbon intensity factors, depending on the type of generation source. We can also imagine that the pricing could be set differently based on power consumption, and customers could choose to pay more to be carbon-free for their services. Whilst it is not possible to directly measure the amount of carbon-free energy being used by individual customers, we can apply the same Greenhouse Gas Protocol\u2019s \u201cCompleteness Principle\u201d. 
The principle states that the total amount of energy consumed by all the customers adds up to the total amount of energy contributing to the carbon emissions at the data centre (internal use can be counted as non-paying customers). For example, if the data centre bought 50% carbon-emitting energy and 50% renewable, and if one customer, consuming 1% of the total power consumption, has signed up for 100% carbon-free energy, then there should be 50% carbon-emitting energy and 49% renewable for the rest of the customers.</p>\n\n<p>The chain (much simplified with details omitted) looks something like this:</p>\n\n<p><img alt=\"\" src=\"http://localhost:4000/assets/img/renewable_energy_scenario.png\" width=\"700\"></p>\n\n<p>Let X kWh be the power generated from the carbon-emitting source, and Y kWh be the power generated from the carbon-free source. a1, a2, a3 and a4 are the carbon emissions for each customer, calculated using the carbon intensity for the carbon-emitting source, and b1, b2, b3 and b4 are the carbon emissions calculated using the carbon intensity for the carbon-free source. We want to prove that a1 + a2 + a3 + a4 corresponds to X kWh and b1 + b2 + b3 + b4 corresponds to Y kWh, and that a1 + a2 + a3 + a4 + b1 + b2 + b3 + b4 corresponds to X + Y kWh, without knowing any of the input numbers. This is only an illustration to explain the use case; in real life there could be over a million customers! Therefore a human auditor cannot practically solve this. However, a human auditor could play the role of verifier and make use of the ZKP system.</p>\n\n<p>The ZKP for this scenario can be constructed based on the following:</p>\n\n<table>\n <tr><th>Actors</th><th>Roles</th></tr>\n <tr><td>Prover</td><td>In this use case we are only considering the data centre operator as the prover. To extend the use case further, we can also generate a proof at the electricity supplier level.</td></tr>\n <tr><td>Verifier</td><td>Customer of the data centre; they want to verify that the carbon emissions data from the data centre are accurate. In the extended use case mentioned above, the proof produced by the prover would also include a verified proof provided by the electricity supplier on carbon intensity and energy source.</td></tr>\n <tr><td>Electricity supplier</td><td>Supplies electricity to the data centre. They provide the carbon intensity figures for the data centre to do their carbon emissions accounting; the figures are signed by the electricity supplier. The intensity factors could be different depending on the generation source.</td></tr>\n <tr><td>Smart meter manufacturer</td><td>Makes smart meters that are used by the data centre to measure their electricity consumption. They sign the smart meters\u2019 public keys.</td></tr>\n <tr><td>Trusted certificate authority (CA)</td><td>They are trusted third-party authorities who provide signed certificates for the public keys from the smart meter manufacturer and electricity supplier.</td></tr>\n</table>\n\n<p><br></p>\n\n<table>\n <tr><th>Commitment</th></tr>\n <tr><td>Carbon emissions accounting that produces the emissions claim for the customer</td></tr>\n</table>\n\n<p><br></p>\n\n<table>\n <tr><th>Data</th><th>Public or Private witness</th></tr>\n <tr><td>Carbon emissions claim for the customer</td><td>Public</td></tr>\n <tr><td>CA\u2019s public keys</td><td>Public</td></tr>\n <tr><td>Carbon intensity</td><td>Private</td></tr>\n <tr><td>Electricity consumption</td><td>Private</td></tr>\n <tr><td>Customer\u2019s share of usage</td><td>Private</td></tr>\n <tr><td>Customer\u2019s contracted portion of renewable energy</td><td>Private</td></tr>\n <tr><td>Digital signatures for the smart meter reading, smart meter\u2019s public key, manufacturer\u2019s public key, carbon intensity and the electricity supplier\u2019s public key</td><td>Private</td></tr>\n</table>\n\n<p>The prototypes for these two use cases are a work in progress; currently I am testing out different techniques and frameworks that can achieve the same ZKPs but have different properties. Once I have completed the proof of concept for these two use cases, I could apply a similar technique to other commodities such as coffee beans. 
I will continue to share this research journey in the next blog(s)!</p>\n\n<p><span>[1] Russian proverb \u201cTrust but verify\u201d, https://en.wikipedia.org/wiki/Trust,_but_verify</span><br>\n<span>[2] S. P. Marsh. 1994. Formalizing Trust as a Computational Concept. Ph.D. Dissertation. University of Stirling.</span><br>\n<span>[3] Volkswagen emissions scandal: https://www.epa.gov/vw/learn-about-volkswagen-violations</span><br>\n<span>[4] Ikea logging protected forests: https://earth.org/ikea-implicated-in-logging-protected-siberian-forests/</span><br>\n<span>[5] What is greenwashing: https://www.un.org/en/climatechange/science/climate-issues/greenwashing</span><br>\n<span>[6] Causes of Climate Change, the United Nations, https://www.un.org/en/climatechange/science/causes-effects-climate-change</span><br>\n<span>[7] The World Business Council for Sustainable Development, WBCSD https://www.wbcsd.org/</span><br>\n<span>[8] Partnership for Carbon Transparency, PACT: https://www.carbon-transparency.org/</span><br>\n<span>[9] Man, J., Jaffer, S., Ferris, P., Kleppmann, M. and Madhavapeddy, A., Emission Impossible: privacy-preserving carbon emissions claims.</span><br>\n<span>[10] EU CSRD: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32022L2464</span></p>",
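The arithmetic in the worked example above can be sanity-checked outside any proving system. The following is an illustrative Python sketch, not part of the prototype; the numbers and variable names are my own. Each customer's consumption is split between the two sources according to their contracted carbon-free fraction, and the per-customer allocations must conserve X and Y:

```python
# Illustrative sketch (not the ZKP circuit): check that per-customer
# energy allocations conserve the data centre's totals.
X, Y = 50.0, 50.0          # kWh from carbon-emitting / carbon-free sources
total = X + Y

# (share of total consumption, contracted carbon-free fraction).
# The remaining three customers split what is left after the 1% customer
# takes 1 kWh of carbon-free energy: 49 kWh carbon-free out of 99 kWh.
customers = [
    (0.01, 1.0),           # the 1% customer on 100% carbon-free energy
    (0.33, 49 / 99),
    (0.33, 49 / 99),
    (0.33, 49 / 99),
]

a = [share * total * (1 - cf) for share, cf in customers]  # carbon-emitting kWh
b = [share * total * cf for share, cf in customers]        # carbon-free kWh

# The conservation properties the ZKP would prove without revealing inputs:
assert abs(sum(a) - X) < 1e-9
assert abs(sum(b) - Y) < 1e-9
assert abs(sum(a) + sum(b) - total) < 1e-9
```

This is exactly the invariant the verifier cares about: however the shares are distributed, the carbon-emitting and carbon-free allocations must add back up to the energy actually bought.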
+21
jess/2025_05_30_look-mum-it-s-moving.json
···+"summary": "I don\u2019t have art talent. When my daughter was very small we once had this conversation:",+"content": "<p>I don\u2019t have art talent. When my daughter was very small we once had this conversation:</p>\n\n<p>me: you will find your talent, you do have it you just don\u2019t know yet.<br>\ntiny person: what is your talent mummy?<br>\nme: I don\u2019t think I have any\u2026I am still trying to find it!<br>\ntiny person: your talent is.. looking after me! You are doing it very well mummy.</p>\n\n<p>(I do miss the time when she always finished her sentence with \u201cmummy\u201d).</p>\n\n<p>I wish I had art talent. I enjoy art projects but I could never just create something from nothing that looks pretty. I guess that\u2019s why doing computer science suits me more. Computer programs could be described as art projects: producing a working piece of software is not only about putting instructions together; how they are put together matters too.</p>\n\n<p>My undergraduate final year project on the computer science side was to implement a \u201cKey Frame Animation Tool\u201d. It was written in C, using the OpenGL and XForms libraries (you get the idea from the feature picture at the top). Fast forward to today, and animation software tools are much, much more sophisticated. 
So I felt pretty excited when I went to an RDP (Research Development Programme) workshop on \u201cAnimate your research\u201d last week to learn how to create animations based on our research.</p>\n\n<p>Three key things I learnt from the workshop: (1) An animation can still be very impactful even if the drawings look bad to the untrained eye; (2) When the term \u201ccryptography\u201d does not mean anything to the people I was talking to, using it to explain my research so that they can visualise it simply doesn\u2019t work; (3) The people at the workshop prefer to hear about the positive impact of doing something, rather than the negative impact if we don\u2019t do something.</p>\n\n<p>I felt rather intimidated when I was given some blank paper and a pencil and asked to draw three key themes of my research. I didn\u2019t like anything I put on those papers. Towards the end of the workshop though, I felt very motivated to try and see if I could create an animation! So I did a silly animation with sound effects added:</p>\n\n\n \n\n\n<p>Unfortunately, I didn\u2019t have enough time during the workshop to complete the drawings for the animation based on zero-knowledge proofs for data exchange in a coffee supply chain. 
So I wrote down the idea briefly; hopefully, one day I can create it!</p>\n\n<ol>\n <li>Scene showing a series of coffee shops; a person goes into one with a recognisable label (or QR code?)</li>\n <li>The person comes out of the coffee shop with a coffee cup in their hand, smiling</li>\n <li>Zoom in on the hand and then the coffee cup, and then the coffee</li>\n <li>\u201cGo back in time\u201d to show how the coffee was made</li>\n <li>Coffee -> coffee beans added to machine -> coffee beans in bags delivered to the shop -> coffee beans selected based on verified certificates -> coffee beans bagged in a factory, with certification process going on -> coffee beans delivered to the factory by different distributors -> coffee bean distributors obtain certification -> farmers sell coffee beans to distributors with certificates showing that they didn\u2019t use deforested lands and that the beans were grown legally.</li>\n</ol>\n\n<p>Obviously, this sequence is overly simplified. However, during the workshop I found that as soon as I went into any details, people didn\u2019t seem to be interested. I can see that this animation can be a nice way to open a technical presentation. Now I just need to start creating some bad drawings\u2026</p>",
+22
jess/2025_05_30_recursive-curse.json
···+"summary": "The prototype I am working on at the moment is related to the first cloud computing use case mentioned in my previous post Zero trust, always verify. The prototype consists of five actors:",+"content": "<p>The prototype I am working on at the moment is related to the first cloud computing use case mentioned in my previous post <a href=\"https://blogs.jadecoral.me/2025/05/15/zero-trust-always-verify.html\">Zero trust, always verify</a>. The prototype consists of five actors:</p>\n\n<ul>\n <li>Data centre operator</li>\n <li>Data centre customer</li>\n <li>Electricity supplier</li>\n <li>Smart meter manufacturer</li>\n <li>Trusted certificate authority</li>\n</ul>\n\n<p>In this use case we assume that for regulatory or business reputation purposes, a data centre customer wants to publish their carbon emissions data that includes all three scopes of emissions. Therefore they need to know the carbon emissions figures from their cloud providers, and they want to be able to verify the figures. The data centre operator, therefore, acts as the prover in this scenario, as they have all the data to produce the carbon emissions report, but they don\u2019t want to reveal all the related business-sensitive information in the process of doing so.</p>\n\n<p>I have written a circuit that can generate a proof using the private data input by the data centre operator. The proof can be serialised and sent to their customers, who can run the verification in a separate process using public data and the proof. This proof actually consists of multiple sub-proofs, because not only do they want to provide a proof that a customer\u2019s emissions were calculated correctly based on their usage, but they also need to prove that the smart meter reading and the customers\u2019 share can all be trusted too. 
So I have also written a circuit to verify all the signatures in the smart meter and carbon intensity chains, and another circuit that verifies that all customer shares add up to 100% of the total carbon emissions.</p>\n\n<p>The challenge I am facing at the moment is scalability.</p>\n\n<p>Take the \u201ccustomer shares add up to 100%\u201d scenario for example. It is possible that a data centre can have over a million customers. In my prototype, I assume that customer records (each contains a customer ID and their share of the total emissions) are encrypted and put on a Merkle tree by the prover (i.e. the data centre operator). The initial idea to generate a proof for the root of the tree is first to generate a proof for each leaf, and then recursively generate a proof for each node at each level until the root. The final proof, the root proof, should have a public output value of 100%. Running the circuits on my laptop, it takes ~10-14s for the base proofs (i.e. for each leaf), and a few seconds more for each recursive proof. Let\u2019s say 12s for a base proof and 16s for a recursive proof.</p>\n\n<p>For a million customers it would take ~12,582,912s, i.e. close to 146 days, to finish all the base proofs. The Merkle tree has 21 levels in total, so the number of nodes above the leaves would be (2^20)-1 = 1,048,575, and it would take ~16,777,200s (~194 days) to do all the recursive proofs. So in total, it would take almost a year running non-stop to complete all the proof generation!</p>\n\n<p>I am now trying a different approach. In theory, each customer only needs one proof, the root, to do the verification. Therefore, instead of using recursive proofs to produce the final sum, I could build the Merkle tree using the sums along with the hashes. I tested the building of such a tree and it took only 1 hour and 35 minutes for a million customer records. The trick now is to generate a cryptographically provable witness for the proof. 
The o1js framework I am using has example code that I can build on; I haven\u2019t got it fully working yet, but it\u2019s looking very promising! Perhaps in a future blog I could write up the cryptographic properties of the circuits I have built.</p>\n\n<p>The scalability challenge for the carbon intensity, meter readings, and signature chains is another story for another day, but it\u2019s equally interesting!</p>",
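The sum-augmented tree described above can be sketched independently of any proving framework. The following is illustrative Python, not the o1js circuit; the choice of SHA-256, the byte encodings, and the fixed-point shares (parts-per-million) are my own assumptions. The key idea is that each node carries both a hash and a running sum, and the parent hash commits to the children's sums, so the root commits to the total:

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 stands in for the circuit's hash function."""
    return hashlib.sha256(data).digest()

def build_sum_tree(leaves):
    """Build a Merkle tree whose nodes carry (hash, sum).

    `leaves` is a list of (customer_id, share) pairs, where shares are
    integers in parts-per-million to avoid floating point. The number of
    leaves is assumed to be a power of two. Returns the (hash, sum) root.
    """
    level = [
        (h(cid.encode() + share.to_bytes(8, "big")), share)
        for cid, share in leaves
    ]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            (lh, ls), (rh, rs) = level[i], level[i + 1]
            s = ls + rs
            # The parent hash commits to both children *and* their sums,
            # so a sum cannot be altered without changing the root hash.
            nxt.append((h(lh + rh + s.to_bytes(8, "big")), s))
        level = nxt
    return level[0]

# Four hypothetical customers whose shares are in parts-per-million.
root_hash, root_sum = build_sum_tree(
    [("c1", 10_000), ("c2", 490_000), ("c3", 250_000), ("c4", 250_000)]
)
assert root_sum == 1_000_000  # shares add up to 100%
```

Building such a tree is linear in the number of customers, which is consistent with the million-record build finishing in well under two hours; the remaining work is making the sums provable inside the circuit.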
+18
jess/2025_05_31_recursive-curse-part2.json
···
+2
-2
jess/metadata.json
+18
jon/blog_2025_03_code-block-metadata.html.json
···+"content": "<h1><a href=\"#code-block-metadata\"></a>Code block metadata</h1><ul><li><span>published</span> <p>2025-03-07</p></li></ul><p>Back in 2021 <a href=\"https://github.com/julow\">julow</a> introduced some <a href=\"https://github.com/ocaml-doc/odoc-parser/pull/2\">new syntax</a> to odoc\u2019s code blocks to allow us to attach arbitrary metadata to the blocks. We imposed no structure on this; it was simply a block of text in between the language tag and the start of the code block. Now that odoc needs to use it itself, we need to be a bit more precise about how it\u2019s defined.</p><p>The original concept looked like this:</p><pre>{@ocaml metadata goes here in an unstructured way[\n ... code ...\n]}</pre><p>where everything in between the language (\u201cocaml\u201d in this case) and the opening square bracket would be captured and put into the AST verbatim. Odoc itself has had no particular use for this, but it has been used in <a href=\"https://github.com/realworldocaml/mdx\">mdx</a> to control how it handles the code blocks, for example to skip processing of the block, to synchronise the block with another file, to disable testing the block on particular OSs and so on.</p><p>As part of the Odoc 3 release we decided to address one of our <a href=\"https://github.com/ocaml/odoc/pull/303\">oldest open issues</a>, that of extracting code blocks from mli/mld files for inclusion into other files. This is similar to the file-sync facility in mdx but it works in the other direction: the canonical source is in the mld/mli file. In order to do this, we now need to use the metadata so we can select which code blocks to extract, and so we needed a more concrete specification of how the metadata should be parsed.</p><p>We looked at what <a href=\"https://github.com/realworldocaml/mdx/blob/main/lib/label.ml#L195-L210\">mdx does</a>, but the way it works is rather ad-hoc, using very simple String.splits to chop up the metadata. 
This is OK for mdx as it\u2019s fully in charge of what things the user might want to put into the metadata, but for a general parsing library like odoc.parser we need to be a bit more careful. Daniel B\u00fcnzli <a href=\"https://github.com/ocaml/odoc/pull/1326#issuecomment-2702260053\">suggested</a> a simple strategy of atoms and bindings inspired by s-expressions. The idea is that we can have something like this:</p><pre>{@ocaml atom1 "atom two" key1=value1 "key 2"="value with spaces"[\n ... code content ...\n]}</pre><p>Daniel suggested a very minimal escaping rule, whereby a string could contain a literal " by prefixing with a backslash - something like: "value with a \\" and spaces", but we discussed it during the <a href=\"https://ocaml.org/governance/platform\">odoc developer meeting</a> and felt that we might want something a little more familiar. So we took a look at the lexer in <a href=\"https://github.com/janestreet/sexplib/blob/master/src/lexer.mll\">sexplib</a> and found that it follows the <a href=\"https://github.com/janestreet/sexplib/blob/d7c5e3adc16fcf0435220c3cd44bb695775020c1/README.org#lexical-conventions-of-s-expression\">lexical conventions</a> of OCaml\u2019s strings, and decided that would be a reasonable approach for us to follow too.</p><p>The resulting code, including the extraction logic, was implemented in <a href=\"https://github.com/ocaml/odoc/pull/1326/\">PR 1326</a> mainly by <a href=\"https://github.com/panglesd\">panglesd</a> with a little help from me on the lexer.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/03/code-block-metadata.html\">here</a></p>",
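As a rough illustration of the atoms-and-bindings shape, here is a Python sketch; `shlex`'s POSIX quoting only approximates the OCaml string conventions odoc actually adopted, and the function name is mine:

```python
import shlex

def parse_metadata(meta: str):
    """Split code-block metadata into atoms and key=value bindings.

    Approximation only: shlex's POSIX quoting stands in for OCaml's
    string lexical conventions, and a '=' inside a quoted atom would
    be misread here as a binding.
    """
    atoms, bindings = [], {}
    for token in shlex.split(meta):
        if "=" in token:
            key, _, value = token.partition("=")
            bindings[key] = value
        else:
            atoms.append(token)
    return atoms, bindings

atoms, bindings = parse_metadata(
    'atom1 "atom two" key1=value1 "key 2"="value with spaces"'
)
assert atoms == ["atom1", "atom two"]
assert bindings == {"key1": "value1", "key 2": "value with spaces"}
```

The limitation noted in the docstring is exactly why a proper lexer following OCaml's string conventions is preferable to string splitting for a general-purpose parser.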
+18
jon/blog_2025_03_module-type-of.html.json
···+"content": "<h1><a href=\"#the-road-to-odoc-3:-module-type-of\"></a>The Road to Odoc 3: Module Type Of</h1><ul><li><span>published</span> <p>2025-03-08</p></li></ul><p>There are <a href=\"https://discuss.ocaml.org/t/ann-odoc-3-beta-release/16043\">many new and improved features</a> that Odoc 3 brings, but there are also a large number of bugfixes. I thought I'd write about one in particular here, an <a href=\"https://github.com/ocaml/odoc/pull/1081\">overhaul of "module type of"</a> that landed in May 2024.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/03/module-type-of.html\">here</a></p>",
+18
jon/blog_2025_04_meeting-the-team.html.json
···+"content": "<h1><a href=\"#meeting-the-team\"></a>Meeting the Team</h1><ul><li><span>published</span> <p>2025-04-08</p></li></ul><p>It's tremendously exciting to be back in the <a href=\"https://www.cst.cam.ac.uk/\">Computer Laboratory</a>, as the last time I worked here was just before the pandemic. I'm now a member of the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">Energy and Environment Group</a> whose goal is "to have a measurable impact on tools and techniques for de-risking the future".</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/04/meeting-the-team.html\">here</a></p>",
+18
jon/blog_2025_04_ocaml-docs-ci-and-odoc-3.html.json
···+"content": "<h1><a href=\"#ocaml-docs-ci-and-odoc-3\"></a>OCaml-Docs-CI and Odoc 3</h1><ul><li><span>published</span> <p>2025-04-29</p></li></ul><p>The release of Odoc 3 means that we need to update the <em>docs-ci</em> project so that the documentation that appears on <em>ocaml.org</em> is using the latest, greatest Odoc. With this major release of Odoc, it's also time to give the CI pipeline a bit of an overhaul too, and fix some of the irritations that it causes.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/04/ocaml-docs-ci-and-odoc-3.html\">here</a></p>",
+18
jon/blog_2025_04_odoc-3.html.json
···+"content": "<h1><a href=\"#odoc-3:-so-what?\"></a>Odoc 3: So what?</h1><ul><li><span>published</span> <p>2025-04-25</p></li></ul><p>Odoc 3 was <a href=\"https://discuss.ocaml.org/t/ann-odoc-3-0-released/16339\">released last month</a> and although we did write a list of the new features, I don't think we've made it clear enough why anyone should care.</p><p>It's <b>manuals</b>, the theme of Odoc 3 is <b>manuals</b>. It's got a load of features to make it much better for writing <code>mld</code> pages (files written using odoc's markup) to document your packages and their relationship to the surrounding ecosystem. Previous versions of Odoc were very library-centric, in that while we did have mld-file support, most of the effort went into making sure that we were generating correct per-module pages, which show the shape of your API even if you've not put in any doc comments at all. We've still got that, obviously, but we've added many features to make writing <code>mld</code> pages far more useful, and we're really hoping that these will draw people in to make documenting packages a much more enjoyable experience.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/04/odoc-3.html\">here</a></p>",
+18
jon/blog_2025_04_semantic-versioning-is-hard.html.json
···+"content": "<h1><a href=\"#semantic-versioning-in-ocaml-is-hard\"></a>Semantic Versioning in OCaml is Hard</h1><ul><li><span>published</span> <p>2025-04-20</p></li></ul><p><a href=\"https://semver.org\">Semantic versioning</a> is a lovely and simple idea that, if it were reliably implemented everywhere, would make life a lot simpler. So, is it possible to make our OCaml libraries stick to this scheme? There are some projects that are trying to do this, including a recent <a href=\"https://www.outreachy.org\">Outreachy</a> project by <a href=\"https://github.com/azzsal/\">Abdulaziz Alkurd</a> mentored by <a href=\"https://choum.net\">panglesd</a> and <a href=\"https://github.com/nathanreb\">Nathan Reb</a>. While this is a great start, there are some subtleties of the OCaml module system that make it a good deal more complex than in other languages.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/04/semantic-versioning-is-hard.html\">here</a></p>",
+18
jon/blog_2025_04_this-site.html.json
···+"content": "<h1><a href=\"#this-site\"></a>This site</h1><ul><li><span>libs</span> <p>mime_printer</p></li></ul><ul><li><span>published</span> <p>2025-04-07</p></li></ul><p>I've spent a <em>lot</em> of time over the past few years working on Odoc, the OCaml documentation generator, so when it came time to (re)start my own website and blog, I found it hard to resist thinking about how I might use odoc as part of it. We've spent a lot of time recently trying to make odoc more able to generate structured documentation sites, so I've gone all in and am trialling using it as a tool to generate my entire site. This is a bit of an experiment, and I don't know how well it will work out, but let's see how it goes.</p><p>Additionally, I've recently been working on a project currently called <code>odoc_notebook</code>, which is a set of tools to allow odoc <code>mld</code> files to be used as a sort of Jupyter-style notebook. The idea is that you can write both text and code in the same file, and then run the code in the notebook interactively. Since I've only got a webserver, all the execution of code has to be done client side, so I'm making extensive use of the phenomenal <a href=\"https://github.com/ocsigen/js_of_ocaml\">Js_of_ocaml</a> project to get an OCaml engine running in the browser.</p><p>My focus has initially been on getting 'toplevel-style' code execution working. As an example, let's write a little demo.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/04/this-site.html\">here</a></p>",
+18
jon/blog_2025_05_ai-for-climate-and-nature-day.html.json
···+"content": "<h1><a href=\"#ai-for-climate-&-nature-community-day\"></a>AI for Climate & Nature Community Day</h1><ul><li><span>published</span> <p>2025-05-01</p></li></ul><p>\n\n\n <img alt=\"Melissa Leach\" src=\"melissa.jpg\">\n \nMelissa Leach introducing the day\n\n Today was the "AI for Climate & Nature Community Day" at the <a href=\"https://map.cam.ac.uk/?maplon=0.12032&maplat=52.20354&mapzoom=18&maplayers=Building+Labels%2CExternal+Sites%2CColleges%2CUniversity+Sites%2CBuildings%2CTransport&mapfeature=mfid257%2CBuildings\">David Attenborough Building</a>. A whole bunch of the EEG were either presenting or contributing in some way so I thought I'd come along to see what's going on.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/05/ai-for-climate-and-nature-day.html\">here</a></p>",
+18
jon/blog_2025_05_docs-progress.html.json
···+"content": "<h1><a href=\"#progress-in-ocaml-docs\"></a>Progress in OCaml docs</h1><ul><li><span>published</span> <p>2025-05-29</p></li></ul><p>The docs build is progressing well, and we've <i>just about</i> hit 20,000 packages (20,038 to be precise). So at this point I thought it'd be useful to take a look through the various failures to see if there are any insights to be gained.</p><p>Odoc requires a built package in order to generate the docs, so there are two steps that have to be done before we can begin building the docs. Step one is to figure out the exact set of packages to build - i.e. doing an opam solve - and step two is to actually build the packages. These two steps are, to some extent, out of docs-ci's control, and rely on the state of the opam repository. While there are efforts to keep this in as good a state as possible, it's still the case that these steps fail much more often than the actual docs build itself. Let's take a look at some of the failures we see in each of these steps.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/05/docs-progress.html\">here</a></p>",
+18
jon/blog_2025_05_lots-of-things.html.json
···+"content": "<h1><a href=\"#lots-of-things-have-been-happening\"></a>Lots of things have been happening</h1><ul><li><span>published</span> <p>2025-05-20</p></li></ul><p>I've been working on a whole lot of things recently in many different areas, making what's felt like only a bit of progress in each. Consequently I've not felt like I had anything substantial to say, so I haven't written up anything for a while.</p><p>Time for a little summary of things then!</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/05/lots-of-things.html\">here</a></p>",
+18
jon/blog_2025_05_oxcaml-gets-closer.html.json
···+"content": "<h1><a href=\"#oxcaml-is-getting-closer...\"></a>OxCaml is getting closer...</h1><ul><li><span>published</span> <p>2025-05-02</p></li></ul><p>I joined the OxCaml weekly meeting representing Tarides for the first time this week, as Jane Street gear up to an official release of their OxCaml compiler.</p><p>It seems that mainly what needs to be done before the release can be made is to ensure there is some reasonable documentation for the new features, and that a reasonable number of packages are working, so people are furiously writing and bugfixing to try and get this ready.</p><p>As well as this though, there are some challenges at a more organisational level that will need to be addressed to ensure the success of the project. Jane Street have long had a public branch of their compiler, but while they've had patches internally to ensure the tooling and other libraries work, these patches haven't previously been made public in a usable way. In order for OxCaml to be useful, it will clearly need these patches not only to be available, but also to be maintained and to easily allow contributions from the community -- in short, they need to be properly Open Source!</p><p>Personally, I'm looking forward to seeing their branch of <a href=\"https://ocaml.github.io/odoc/\">odoc</a> and having a look to see how the modes will fit into the documentation. I'm also keen to see whether the <a href=\"../04/this-site.html\" title=\"this-site\">notebook features</a> I've been working on can be ported over to run on OxCaml!</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/05/oxcaml-gets-closer.html\">here</a></p>",
+18
jon/blog_2025_05_ticks-solved-by-ai.html.json
···+"content": "<h1><a href=\"#solving-first-year-ocaml-exercises-with-ai\"></a>Solving First-year OCaml exercises with AI</h1><ul><li><span>published</span> <p>2025-05-07</p></li></ul><p>My colleague <a href=\"https://toao.com\">Sadiq Jaffer</a> and I have been working on a little project to see how well small AI models can solve the OCaml exercises we give to our first-year students at the University of Cambridge. Sadiq has done an excellent <a href=\"https://toao.com/blog/ocaml-local-code-models\">write up</a> of our initial results, which you should all go and read! The tl;dr though, as Sadiq writes, is that even some of the smaller models would score top marks on these exercises!</p><p>One interesting aspect we discovered quite quickly is that we had to make the testing feedback a little more generous than just "exception raised"! The problems are presented as a Jupyter notebook using <a href=\"https://github.com/akabe\">akabe's</a> excellent OCaml kernel, with <a href=\"https://nbgrader.readthedocs.io/en/stable/\">nbgrader</a> to do the assessment. Our students can see the tests that are run, and if they fail they're able to copy the test cell out and play with their code to figure out exactly what went wrong. The AI models, however, have a far less interactive experience, and get just 3 chances to write code that passes the tests. We found that the performance of the models increased hugely when we adjusted the test cells such that they clearly indicated which test failed, the results that were expected, and the results the code actually produced.</p><p>Of course, we <a href=\"https://anil.recoil.org/notes/claude-copilot-sandbox\">already knew</a> that AI models can code OCaml very well, and we (along with the rest of the teaching world) are still ruminating on the implications of this from a pedagogical perspective. 
Our plan, though, is to try and make the 'problem' worse by training these models on more OCaml code, and see just how well we can get them to perform! It's pretty amazing, and a little startling to know that a model that'll run pretty comfortably on my laptop can solve these problems so well even without extra training, though given how hot it gets, I'd rather not have the laptop on my actual lap while it's doing so!</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/05/ticks-solved-by-ai.html\">here</a></p>",
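The feedback adjustment described above can be sketched generically. The helper below is illustrative Python, not the actual OCaml/nbgrader test cells; the names (`check`, `my_max`) and the buggy example are my own. Instead of failing opaquely, each test reports which case failed, the expected result, and the actual result:

```python
def check(name, fn, cases):
    """Run `fn` against (args, expected) cases and report failures verbosely,
    in the spirit of the adjusted test cells: which test failed, what was
    expected, and what the code actually produced."""
    failures = []
    for args, expected in cases:
        actual = fn(*args)
        if actual != expected:
            failures.append(
                f"{name}{args}: expected {expected!r}, got {actual!r}"
            )
    return failures

# A deliberately buggy 'max of two numbers' to show the feedback.
def my_max(a, b):
    return a if a > b else a  # bug: should return b in the else branch

msgs = check("my_max", my_max, [((1, 2), 2), ((3, 1), 3)])
assert msgs == ["my_max(1, 2): expected 2, got 1"]
```

A model (or a student) seeing `expected 2, got 1` has something concrete to debug, rather than just a raised exception.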
+18
jon/blog_2025_06_week23.html.json
···+"content": "<h1><a href=\"#week-23\"></a>Week 23</h1><ul><li><span>libs</span> <p>opam-format fpath rresult bos</p></li></ul><ul><li><span>merlinonly</span> </li></ul><ul><li><span>published</span> <p>2025-06-09</p></li></ul><p>Some brief notes on last week.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/06/week23.html\">here</a></p>",
+18
jon/blog_2025_07_week27.html.json
···+"content": "<h1><a href=\"#weeks-24-27\"></a>Weeks 24-27</h1><ul><li><span>published</span> <p>2025-07-07</p></li></ul><p>It's been a busy few weeks. There's been exam marking for the 1A Foundations of Computer Science course, an Odoc release to plan, and some really interesting new work on using LLMs to summarise OCaml documentation. This post is about an aspect of that last one that I found particularly interesting.</p><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/07/week27.html\">here</a></p>",
+18
jon/blog_2025_07_week28.html.json
···+"content": "<h1><a href=\"#weeks-28\"></a>Week 28</h1><ul><li><span>published</span> <p>2025-07-14</p></li></ul><ul><li><span>libs</span> <p>caqti.platform mariadb</p></li></ul><p>Continue reading <a href=\"https://jon.recoil.org/blog/2025/07/week28.html\">here</a></p>",
+2
-2
jon/metadata.json
+18
jonsterling/2025-W15_.json
···+"summary": "<p>I have a lot to say this week, so strap in.</p>\n \n\n \n\n <h2><a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> development: canonical URLs, atom feeds, and LSP</h2>\n \n <p>Work on <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> proceeds apace, generously supported by <a href=\"https://www.jonmsterling.com/00XB/\">ARIA</a> who have engaged <a href=\"https://www.jonmsterling.com/kentookura/\">Kento Okura</a> and myself on a consulting basis to support their internal use of Forester. My recent goals have been to bring Forester closer in line with the architecture of the World Wide Web; to that end, I have made two big improvements.</p>\n \n\n \n\n <h3>First cut at canonical URLs</h3>\n \n <p>Trees are addressed by \u201ccanonical URLs\u201d that are meant to be the place where they will ultimately be published. See <a href=\"https://www.forester-notes.org/JVIT/\">my blog post</a> on the design for more details. Canonical URLs are of the form <code>https://www.my-host.net/tree-name/</code>; the handling is a little fragile right now and you can expect bugs (but please write to me about them).</p>\n \n \n\n \n\n <h3>First cut at atom syndication</h3>\n \n <p>It is now possible to syndicate the children of a tree as an Atom feed. This is done currently by including the following directive in the tree you wish to syndicate:</p>\n <pre>\\syndicate-current-tree-as-atom-feed</pre>\n <p>Then, if your tree is located at <code>https://www.my-host.net/tree-name/</code>, you will find that there is an atom feed at <code>https://www.my-host.net/tree-name/atom.xml</code>. There are many subtleties to this, and the atom support will continue to evolve and improve. One thing I need to deal with is the fact that Forester produces nested hyperlinks\u2014which are not valid in HTML! 
I came up with a pretty slick way to <a href=\"https://git.sr.ht/~jonsterling/forester-base-theme/commit/a251f9cf19b0ff42f4553d315df5181b985c79cb\">handle this in XSLT</a>, but that Atom renderer is intended to bypass that entirely.</p>\n <p>As a side note, I am very happy to see that I am <a href=\"https://patrick.sirref.org/weekly-2025-03-31/\">not the only person</a> using the new support for Atom feeds. Patrick\u2019s fork of Forester is looking pretty cool, and I am excited to learn more from what he is doing. I\u2019m also relieved that he was able to get rebased atop the ever-changing <code>forester-5.0-dev</code> branch.</p>\n \n \n\n \n\n <h3>Federation is still janky</h3>\n \n <p>One thing I want to start designing soon is how best to handle federated forests. Right now, Forester bundles all the imported material under a <code>foreign/my-friends-host/</code> directory and routes all links accordingly, but in many (but not all!) cases one would want to not bundle things at all and instead have links routed directly to the canonical URLs as published on the World Wide Web. I am not sure of the best design for this, so I welcome feedback. In the meanwhile, enjoy the janky prototype feel.</p>\n \n \n\n \n\n <h3>Language server; code completion via effects and handlers</h3>\n \n <p><a href=\"https://www.jonmsterling.com/kentookura/\">Kento</a> is hard at work hardening Forester\u2019s language server. I am hoping that we will have something to show on the scale of a week.</p>\n <p>There were some subtleties about how to provide completion information at a source location\u2014which is at least as complex as the expander itself, since scope emerges from the expansion process. We had something fairly broken in place, which I have spent Thursday and Friday morning replacing with something cool using OCaml 5\u2019s effects and handlers. 
The idea is to instrument the expander with an effect that notifies observers that it has entered a source range; this can be handled as a no-op, <em>or</em> by querying the scope\u2019s available symbols when it enters the desired range and throwing away the continuation, and resuming the continuation otherwise to keep searching. This approach allows all the scope-handling code to be unified into a single routine, whose behaviour is controlled by effect handlers on the outside.</p>\n <p>As a side note, I am looking forward to when the next version of <a href=\"https://topiary.tweag.io/\">Topiary</a> is released, which should contain support for formatting OCaml\u2019s effect handlers. Right now we don\u2019t use the nice notation because we are stuck on Topiary 0.6.0.</p>\n \n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/019E/\">Project Pterosaur</a>: yes, I\u2019m building a new proof assistant</h2>\n \n <p>I swore after building <a href=\"https://github.com/RedPRL/cooltt\">cooltt</a>, <a href=\"https://github.com/RedPRL/redtt\">redtt</a>, and <a href=\"https://github.com/RedPRL/sml-redprl\">RedPRL</a> that I would never build another proof assistant, as the experience burned around four years of my PhD and resulted (at least directly) in very little publishable work\u2014but, to be fair, I probably would not have made the <a href=\"https://www.jonmsterling.com/0014/\">key mathematical discovery</a> of my <a href=\"https://www.jonmsterling.com/sterling-2021-thesis/\">PhD thesis</a> if it were not for these engineering experiments. 
But I\u2019m back on my bullshit, as the young people say, and hard at work building a new interactive proof assistant that I have code-named <a href=\"https://www.jonmsterling.com/019E/\">Project Pterosaur</a>.</p>\n \n\n \n\n <h3>Locales in dependent type theory?</h3>\n \n <p>The goal of Pterosaur is to explore the adaptation of <em>locales</em> from Isabelle to dependent type theory, as a lightweight but extremely expressive alternative to type classes. My colleague <a href=\"https://www.jonmsterling.com/lawrencepaulson/\">Larry Paulson</a> has written <a href=\"https://lawrencecpaulson.github.io/tag/locales\">some great blog posts about locales in Isabelle</a>, and I strongly recommend reading Ballarin\u2019s <a href=\"https://www21.in.tum.de/~ballarin/publications/jar2019.pdf\">Exploring the Structure of an Algebra Text with Locales</a> to get a feel for what is possible. Here is what locales do:</p>\n <ol><li>Locales appear to completely solve the pain involved when building up hierarchies of mathematical structures and notations, allowing you to effortlessly combine theories along a common core (e.g. defining rings in terms of a multiplicative monoid and an Abelian group sharing the same carrier).</li>\n <li>Locales allow you to <em>add new methods</em> to a theory after the fact, and they will magically be available on anything that extended that theory. You can also add new links in the theory graph, and both cycles and diamonds are allowed so long as they are coherent; this is useful if you want to silently regard (e.g.) the space of endomaps\u00a0on a set as a monoid, etc.</li></ol>\n <p>In comparison to modules and type classes, the strength of locales is that you don\u2019t have to decide ahead of time whether you want to \u201cbundle\u201d fields with their carriers, etc. 
In contrast, a great deal of the difficult work of mathematical library design and maintenance in tools like Rocq, Agda, and <a href=\"https://www.jonmsterling.com/019G/\">Lean</a> is figuring out just what things to bundle, and fixing things when your choices inevitably lead to breakage, etc. Locales avoid these problems entirely.</p>\n <p>Finally, a reasonably usable locale implementation can be obtained <em>without any higher-order unification whatsoever</em>. I have a feeling that will be extremely important, given how unreliable (and <a href=\"https://github.com/agda/agda/issues/5837\">incorrect</a>!) most implementations of higher-order unification are; the situation is so bad that it is actually an open problem to define a correct higher-order unification algorithm in the presence of singleton types (such as the unit type). I do think that this can be solved (and may have already been solved by Andras Kovacs), but my point is that the prognosis for unification in dependent type theory is bad.</p>\n \n \n\n \n\n <h3>Experimental implementation in <a href=\"https://www.jonmsterling.com/019G/\">Lean</a></h3>\n \n <p>The other interesting thing about Pterosaur is that I am implementing it in <a href=\"https://www.jonmsterling.com/019G/\">Lean</a>; I am not verifying anything, and am making free use of Lean\u2019s <code>partial</code> keyword (which hides potentially divergent code from definitional equality). Instead, I am thinking of Lean as a \u201cbetter OCaml\u201d: I can\u2019t speak to the quality of the compiler and code generator, but I can absolutely say that from the perspective of day-to-day programming, Lean has a lot of affordances that make it extremely nice to use. 
On the other hand, Lean\u2019s story for modularity is not so good; but I hope they don\u2019t \u201cfix\u201d it any time soon, because I think that something like locales could be a good option for Lean itself in the future if I am able to demonstrate their potential in the context of Pterosaur\u2019s clean-room implementation.</p>\n \n \n\n \n\n <h3>A taste of code</h3>\n \n <p>I will have more to say about Pterosaur in the future, but let me leave you with a bit of demo code.</p>\n <pre>locale Magma { A =>\n car : Type,\n car.isSet : isSet A\u00b7car,\n cmp : (x y : A\u00b7car) \u2192 A\u00b7car\n}\n\nlocale Magma.Hom { f =>\n dom : Magma,\n cod : Magma,\n car : (x : f\u00b7dom\u00b7car) \u2192 f\u00b7cod\u00b7car,\n cmp : (x y : f\u00b7dom\u00b7car) \u2192 Id f\u00b7cod\u00b7car (f\u00b7car (f\u00b7dom\u00b7cmp x y)) (f\u00b7cod\u00b7cmp (f\u00b7car x) (f\u00b7car y))\n}\n\nlocale Semigroup { A =>\n splice magma : Magma,\n cmp.assoc : (x y z : A\u00b7car) \u2192 Id A\u00b7car (A\u00b7cmp (A\u00b7cmp x y) z) (A\u00b7cmp x (A\u00b7cmp y z))\n}\n\nlocale Semigroup.Hom {f =>\n dom : Semigroup,\n cod : Semigroup,\n splice magma.hom : Magma.Hom / {dom := f\u00b7dom\u00b7magma, cod := f\u00b7cod\u00b7magma}\n}\n\nlocale Monoid { A =>\n splice semigroup : Semigroup,\n unit : A\u00b7car,\n cmp.leftUnit : (x : A\u00b7car) \u2192 Id A\u00b7car (A\u00b7cmp A\u00b7unit x) x,\n cmp.rightUnit : (x : A\u00b7car) \u2192 Id A\u00b7car (A\u00b7cmp x A\u00b7unit) x\n}\n\nlocale Monoid.Hom {f =>\n dom : Monoid,\n cod : Monoid,\n splice semigroup.hom : Semigroup.Hom / {dom := f\u00b7dom\u00b7semigroup, cod := f\u00b7cod\u00b7semigroup},\n unit : Id f\u00b7cod\u00b7car (f\u00b7car f\u00b7dom\u00b7unit) f\u00b7cod\u00b7unit\n}\n\nlocale Group { G =>\n splice monoid : Monoid,\n inv : (x : G\u00b7car) \u2192 G\u00b7car,\n inv.inv : (x : G\u00b7car) \u2192 Id G\u00b7car (G\u00b7inv (G\u00b7inv x)) x\n}\n\nlocale AbelianGroup { A =>\n splice group : Group,\n splice commutativeOperation : 
CommutativeOperation / {car := A\u00b7car, cmp := A\u00b7cmp}\n}</pre>\n \n \n \n\n \n\n <h2>Two papers to appear in <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS \u201925</a></h2>\n \n <p>This week I have had two papers accepted at <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS \u201925</a>; I\u2019m excited about both of them.</p>\n \n\n \n\n <h3>With <a href=\"https://www.jonmsterling.com/leonipugh/\">Leoni Pugh</a>: <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">When is the partial map classifier a Sierpi\u0144ski cone?</a></h3>\n \n <p><a href=\"https://www.jonmsterling.com/leonipugh/\">Leoni Pugh</a> is my old Part III student from 2023\u20132024, and this work builds on her Part III dissertation. The goal of our paper was to better understand the relationship between two approaches to partial functions in denotational semantics:</p>\n <ol><li><strong>\u201cGeometrical\u201d partiality / \u201cthe Sierpi\u0144ski cone\u201d</strong>: freely add a lowest element to the space representing a given data type. This is useful for defining functions whose <em>inputs</em> are partially defined, because you can do a case-analysis on the definedness of the input.</li>\n <li><strong>\u201cLogical\u201d partiality / \u201cthe partial map classifier\u201d</strong>: representing partially defined elements of a space X by pairs (p, x) where p is a proposition and x is a function from isTrue(p) to X. This is useful for defining functions whose <em>outputs</em> are partially defined.</li></ol>\n <p>In traditional domain theory as developed by <a href=\"https://www.jonmsterling.com/danascott/\">Scott</a>, the two kinds of partiality coincide\u2014<a href=\"https://www.jonmsterling.com/sterling-2024-lifting/\">even constructively</a>. 
I am, however, interested in <a href=\"https://www.jonmsterling.com/hyland-1991/\"><em>synthetic domain theory</em></a> which abstracts away from continuity and limits and lets you just use sets and functions rather than cpos and continuous functions\u2014provided that you avoid non-constructive principles like the Axiom of Choice or the Law of Excluded Middle. The starting point of our work is my observation that the two notions cannot coincide <em>absolutely</em> in synthetic domain theory, but that there may be restricted subuniverses in which they do coincide. The main result of our paper is to define such a subuniverse, made possible by my discovery of the <em>based Segal condition</em>\u2014a strengthening of the usual Segal condition for higher categories.</p>\n <p>A broader motivation of this work is to develop synthetic domain theory and synthetic higher category theory within the same framework. Whereas synthetic domain theory traditionally concerned itself with spaces that behaved like \u03c9-complete partial orders (but where all functions are automatically monotone and continuous), the same ideas (if applied within <a href=\"https://www.jonmsterling.com/hottbook/\">homotopy type theory</a>) allow you to consider spaces that behave like <em>\u221e-categories</em> with colimits of \u03c9-chains (but where all functions are automatically \u221e-functorial and \u03c9-continuous). I believe that unifying domain theory and higher category theory will prove useful for studying things like the denotational semantics of concurrency, which is inherently higher-dimensional.</p>\n \n \n\n \n\n <h3>With <a href=\"https://www.jonmsterling.com/andrewslattery/\">Andrew Slattery</a>: <a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">Hofmann\u2013Streicher lifting of fibred categories</a></h3>\n \n <p>This year, <a href=\"https://www.jonmsterling.com/thomasstreicher/\">Thomas Streicher</a> (born 1958) passed away from cancer. 
Thomas was one of the Greats of dependent type theory and he also wrote an <a href=\"https://www.abebooks.co.uk/9789812701428/Domain-theoretic-Foundations-Functional-Programming-Streicher-9812701427/plp\">excellent textbook on domain theory for denotational semantics</a>, but much more importantly he was kind and curious and patient and always made time for young people. While I was still finding my place in the community, Thomas was very generous to me with his time and advice, and he sent me many papers to referee.</p>\n <p>Although Thomas made many contributions to dependent type theory, domain theory, realisability theory, and category theory, he is best known to type theorists for two things\u2014both in collaboration with the late <a href=\"https://www.jonmsterling.com/martinhofmann/\">Martin Hofmann</a>: the <a href=\"https://www.jonmsterling.com/hofmann-streicher-1998/\">groupoid interpretation of type theory</a> and the eponymous <a href=\"https://www.jonmsterling.com/hofmann-streicher-1997/\">Hofmann\u2013Streicher universe lifting construction</a>. Andrew\u2019s and my paper pertains to the latter.</p>\n <p>The idea of Hofmann\u2013Streicher lifting has to do with universes, which are \u201ctypes of types\u201d (typically defined in such a way as to avoid paradoxes). Martin-L\u00f6f type theory usually includes universes in order to be able to quantify over (small enough) types; in the simplest models of Martin-L\u00f6f type theory, types are interpreted as sets and so Martin-L\u00f6f\u2019s universes are interpreted as certain sets of sets, such as <a href=\"https://www.jonmsterling.com/sga-4/\">Grothendieck universes</a>. But it is important to be able to interpret the language of type theory in more sophisticated worlds than set theory: for example, in <em>presheaves</em> (which are functors from a fixed category C into Set). 
What <a href=\"https://www.jonmsterling.com/hofmann-streicher-1997/\">Hofmann and Streicher</a> did is show how to transform any universe of sets into a universe of presheaves!</p>\n <p>Although Hofmann and Streicher\u2019s construction worked well and had good properties, they did not find a <em>universal property</em> for it\u2014which is an abstract description of the object that determines it uniquely up to isomorphism, usually in terms of how it relates to other objects. Recently <a href=\"https://www.jonmsterling.com/awodey-2024-universes/\">Awodey</a> found a 1-dimensional universal property, which was the starting point of our work. What Andrew and I wanted to do is generalise Awodey\u2019s analysis in two directions:</p>\n <ol><li>We wanted a <em>2-dimensional</em> version, which is useful because it captures more about the universe than can be said in just one dimension: for example, with a 2-dimensional version, you can see immediately (by \u201cabstract nonsense\u201d) that Hofmann\u2013Streicher lifting preserves structures like monads, adjunctions, etc. that might be used for modelling computational effects, etc.</li>\n <li>We wanted a <em>relative</em> version, which would make it easier to iterate the Hofmann\u2013Streicher lifting construction: the purpose of this is to be able to define presheaf models of type theory <em>internal</em> to other presheaf models. These kinds of situations actually happen in practice! 
For example, the model of <a href=\"https://www.jonmsterling.com/bbcgsv-2019/\">guarded cubical type theory</a> that combines step-indexing with univalence ought to be an example of this.</li></ol>\n <p>To develop this two-fold generalisation of Hofmann\u2013Streicher lifting, we resituated the theory in terms of another of Thomas\u2019s favourite topics: the theory of <em>fibrations</em>, on which Thomas had written <a href=\"https://www.jonmsterling.com/streicher-fcjb/\">the most wonderful lecture notes</a>.</p>\n <p>We dedicated our paper to Thomas\u2019s memory. May he rest in peace.</p>\n \n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/clarke-1979/\">The Fountains of Paradise</a></h2>\n \n <p>I recently read Arthur C. Clarke\u2019s <a href=\"https://www.jonmsterling.com/clarke-1979/\">The Fountains of Paradise</a>; although it was a pretty good read, I found that like many science fiction books of that era, one has to look past a lot in order to enjoy it. I wrote some commentary in my post entitled <a href=\"https://www.jonmsterling.com/019W/\">Ventriloquy of the Mid-Century Man</a> on my culture blog <a href=\"https://www.jonmsterling.com/015X/\">The Jon Sterling Review of Books</a>.</p>",+"content": "<p>I have a lot to say this week, so strap in.</p>\n \n\n \n\n <h2><a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> development: canonical URLs, atom feeds, and LSP</h2>\n \n <p>Work on <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> proceeds apace, generously supported by <a href=\"https://www.jonmsterling.com/00XB/\">ARIA</a> who have engaged <a href=\"https://www.jonmsterling.com/kentookura/\">Kento Okura</a> and myself on a consulting basis to support their internal use of Forester. 
My recent goals have been to bring Forester closer in line with the architecture of the World Wide Web; to that end, I have made two big improvements.</p>\n \n\n \n\n <h3>First cut at canonical URLs</h3>\n \n <p>Trees are addressed by \u201ccanonical URLs\u201d that are meant to be the place where they will ultimately be published. See <a href=\"https://www.forester-notes.org/JVIT/\">my blog post</a> on the design for more details. Canonical URLs are of the form <code>https://www.my-host.net/tree-name/</code>; the handling is a little fragile right now and you can expect bugs (but please write to me about them).</p>\n \n \n\n \n\n <h3>First cut at atom syndication</h3>\n \n <p>It is now possible to syndicate the children of a tree as an Atom feed. This is done currently by including the following directive in the tree you wish to syndicate:</p>\n <pre>\\syndicate-current-tree-as-atom-feed</pre>\n <p>Then, if your tree is located at <code>https://www.my-host.net/tree-name/</code>, you will find that there is an atom feed at <code>https://www.my-host.net/tree-name/atom.xml</code>. There are many subtleties to this, and the atom support will continue to evolve and improve. One thing I need to deal with is the fact that Forester produces nested hyperlinks\u2014which are not valid in HTML! I came up with a pretty slick way to <a href=\"https://git.sr.ht/~jonsterling/forester-base-theme/commit/a251f9cf19b0ff42f4553d315df5181b985c79cb\">handle this in XSLT</a>, but that Atom renderer is intended to bypass that entirely.</p>\n <p>As a side note, I am very happy to see that I am <a href=\"https://patrick.sirref.org/weekly-2025-03-31/\">not the only person</a> using the new support for Atom feeds. Patrick\u2019s fork of Forester is looking pretty cool, and I am excited to learn more from what he is doing. 
I\u2019m also relieved that he was able to get rebased atop the ever-changing <code>forester-5.0-dev</code> branch.</p>\n \n \n\n \n\n <h3>Federation is still janky</h3>\n \n <p>One thing I want to start designing soon is how best to handle federated forests. Right now, Forester bundles all the imported material under a <code>foreign/my-friends-host/</code> directory and routes all links accordingly, but in many (but not all!) cases one would want to not bundle things at all and instead have links routed directly to the canonical URLs as published on the World Wide Web. I am not sure of the best design for this, so I welcome feedback. In the meanwhile, enjoy the janky prototype feel.</p>\n \n \n\n \n\n <h3>Language server; code completion via effects and handlers</h3>\n \n <p><a href=\"https://www.jonmsterling.com/kentookura/\">Kento</a> is hard at work hardening Forester\u2019s language server. I am hoping that we will have something to show on the scale of a week.</p>\n <p>There were some subtleties about how to provide completion information at a source location\u2014which is at least as complex as the expander itself, since scope emerges from the expansion process. We had something fairly broken in place, which I have spent Thursday and Friday morning replacing with something cool using OCaml 5\u2019s effects and handlers. The idea is to instrument the expander with an effect that notifies observers that it has entered a source range; this can be handled as a no-op, <em>or</em> by querying the scope\u2019s available symbols when it enters the desired range and throwing away the continuation, and resuming the continuation otherwise to keep searching. 
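To make the mechanism concrete, here is a toy sketch of the pattern in OCaml 5. All names here (`Enter_range`, `expand`, `completions_at`) and the data model are my own illustrative assumptions, not Forester's actual code:

```ocaml
(* Toy sketch of completion-via-effects: the expander announces each
   source range it enters, together with the symbols currently in scope. *)
open Effect
open Effect.Deep

type range = { lo : int; hi : int }

(* A declaration covering a source range, binding a symbol for its children. *)
type tree = Decl of range * string * tree list

(* The effect performed whenever the expander enters a source range. *)
type _ Effect.t += Enter_range : range * string list -> unit Effect.t

(* A toy expander: the scope grows as we descend into nested declarations. *)
let rec expand scope = function
  | [] -> ()
  | Decl (r, sym, children) :: rest ->
      perform (Enter_range (r, scope));
      expand (sym :: scope) children;
      expand scope rest

(* Completion query: run the expander under a handler that resumes on
   uninteresting ranges, but grabs the scope and abandons the rest of the
   expansion once it enters the range we care about. *)
let completions_at target t =
  match_with (expand []) t
    { retc = (fun () -> None);
      exnc = raise;
      effc = (fun (type a) (eff : a Effect.t) ->
        match eff with
        | Enter_range (r, scope) when r = target ->
            (* Found the desired range: throw away the continuation. *)
            Some (fun (_ : (a, _) continuation) -> Some scope)
        | Enter_range _ ->
            (* Not the range we want: resume and keep searching. *)
            Some (fun (k : (a, _) continuation) -> continue k ())
        | _ -> None) }
```

Dropping the continuation abandons the rest of the expansion, which is what makes the query cheap; handling `Enter_range` as a no-op recovers the ordinary batch expander from the very same traversal code.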
This approach allows all the scope-handling code to be unified into a single routine, whose behaviour is controlled by effect handlers on the outside.</p>\n <p>As a side note, I am looking forward to when the next version of <a href=\"https://topiary.tweag.io/\">Topiary</a> is released, which should contain support for formatting OCaml\u2019s effect handlers. Right now we don\u2019t use the nice notation because we are stuck on Topiary 0.6.0.</p>\n \n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/019E/\">Project Pterosaur</a>: yes, I\u2019m building a new proof assistant</h2>\n \n <p>I swore after building <a href=\"https://github.com/RedPRL/cooltt\">cooltt</a>, <a href=\"https://github.com/RedPRL/redtt\">redtt</a>, and <a href=\"https://github.com/RedPRL/sml-redprl\">RedPRL</a> that I would never build another proof assistant, as the experience burned around four years of my PhD and resulted (at least directly) in very little publishable work\u2014but, to be fair, I probably would not have made the <a href=\"https://www.jonmsterling.com/0014/\">key mathematical discovery</a> of my <a href=\"https://www.jonmsterling.com/sterling-2021-thesis/\">PhD thesis</a> if it were not for these engineering experiments. But I\u2019m back on my bullshit, as the young people say, and hard at work building a new interactive proof assistant that I have code-named <a href=\"https://www.jonmsterling.com/019E/\">Project Pterosaur</a>.</p>\n \n\n \n\n <h3>Locales in dependent type theory?</h3>\n \n <p>The goal of Pterosaur is to explore the adaptation of <em>locales</em> from Isabelle to dependent type theory, as a lightweight but extremely expressive alternative to type classes. 
My colleague <a href=\"https://www.jonmsterling.com/lawrencepaulson/\">Larry Paulson</a> has written <a href=\"https://lawrencecpaulson.github.io/tag/locales\">some great blog posts about locales in Isabelle</a>, and I strongly recommend reading Ballarin\u2019s <a href=\"https://www21.in.tum.de/~ballarin/publications/jar2019.pdf\">Exploring the Structure of an Algebra Text with Locales</a> to get a feel for what is possible. Here is what locales do:</p>\n <ol><li>Locales appear to completely solve the pain involved when building up hierarchies of mathematical structures and notations, allowing you to effortlessly combine theories along a common core (e.g. defining rings in terms of a multiplicative monoid and an Abelian group sharing the same carrier).</li>\n <li>Locales allow you to <em>add new methods</em> to a theory after the fact, and they will magically be available on anything that extended that theory. You can also add new links in the theory graph, and both cycles and diamonds are allowed so long as they are coherent; this is useful if you want to silently regard (e.g.) the space of endomaps\u00a0on a set as a monoid, etc.</li></ol>\n <p>In comparison to modules and type classes, the strength of locales is that you don\u2019t have to decide ahead of time whether you want to \u201cbundle\u201d fields with their carriers, etc. In contrast, a great deal of the difficult work of mathematical library design and maintenance in tools like Rocq, Agda, and <a href=\"https://www.jonmsterling.com/019G/\">Lean</a> is figuring out just what things to bundle, and fixing things when your choices inevitably lead to breakage, etc. Locales avoid these problems entirely.</p>\n <p>Finally, a reasonably usable locale implementation can be obtained <em>without any higher-order unification whatsoever</em>. I have a feeling that will be extremely important, given how unreliable (and <a href=\"https://github.com/agda/agda/issues/5837\">incorrect</a>!) 
most implementations of higher-order unification are; the situation is so bad that it is actually an open problem to define a correct higher-order unification algorithm in the presence of singleton types (such as the unit type). I do think that this can be solved (and may have already been solved by Andras Kovacs), but my point is that the prognosis for unification in dependent type theory is bad.</p>\n \n \n\n \n\n <h3>Experimental implementation in <a href=\"https://www.jonmsterling.com/019G/\">Lean</a></h3>\n \n <p>The other interesting thing about Pterosaur is that I am implementing it in <a href=\"https://www.jonmsterling.com/019G/\">Lean</a>; I am not verifying anything, and am making free use of Lean\u2019s <code>partial</code> keyword (which hides potentially divergent code from definitional equality). Instead, I am thinking of Lean as a \u201cbetter OCaml\u201d: I can\u2019t speak to the quality of the compiler and code generator, but I can absolutely say that from the perspective of day-to-day programming, Lean has a lot of affordances that make it extremely nice to use. 
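As a tiny illustration of the `partial` keyword mentioned above (my own toy example, not Pterosaur code):

```lean
-- `partial` tells Lean to skip the termination check: the definition
-- compiles and runs, but it is opaque to definitional equality, so you
-- cannot prove theorems about it by unfolding.
partial def collatzSteps (n : Nat) : Nat :=
  if n ≤ 1 then 0
  else if n % 2 == 0 then 1 + collatzSteps (n / 2)
  else 1 + collatzSteps (3 * n + 1)

#eval collatzSteps 6  -- counts the Collatz steps from 6 down to 1
```

Termination of this function is famously open, so no termination measure could be supplied; `partial` lets you write it anyway, at the cost of reasoning about it.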
On the other hand, Lean\u2019s story for modularity is not so good; but I hope they don\u2019t \u201cfix\u201d it any time soon, because I think that something like locales could be a good option for Lean itself in the future if I am able to demonstrate their potential in the context of Pterosaur\u2019s clean-room implementation.</p>\n \n \n\n \n\n <h3>A taste of code</h3>\n \n <p>I will have more to say about Pterosaur in the future, but let me leave you with a bit of demo code.</p>\n <pre>locale Magma { A =>\n car : Type,\n car.isSet : isSet A\u00b7car,\n cmp : (x y : A\u00b7car) \u2192 A\u00b7car\n}\n\nlocale Magma.Hom { f =>\n dom : Magma,\n cod : Magma,\n car : (x : f\u00b7dom\u00b7car) \u2192 f\u00b7cod\u00b7car,\n cmp : (x y : f\u00b7dom\u00b7car) \u2192 Id f\u00b7cod\u00b7car (f\u00b7car (f\u00b7dom\u00b7cmp x y)) (f\u00b7cod\u00b7cmp (f\u00b7car x) (f\u00b7car y))\n}\n\nlocale Semigroup { A =>\n splice magma : Magma,\n cmp.assoc : (x y z : A\u00b7car) \u2192 Id A\u00b7car (A\u00b7cmp (A\u00b7cmp x y) z) (A\u00b7cmp x (A\u00b7cmp y z))\n}\n\nlocale Semigroup.Hom {f =>\n dom : Semigroup,\n cod : Semigroup,\n splice magma.hom : Magma.Hom / {dom := f\u00b7dom\u00b7magma, cod := f\u00b7cod\u00b7magma}\n}\n\nlocale Monoid { A =>\n splice semigroup : Semigroup,\n unit : A\u00b7car,\n cmp.leftUnit : (x : A\u00b7car) \u2192 Id A\u00b7car (A\u00b7cmp A\u00b7unit x) x,\n cmp.rightUnit : (x : A\u00b7car) \u2192 Id A\u00b7car (A\u00b7cmp x A\u00b7unit) x\n}\n\nlocale Monoid.Hom {f =>\n dom : Monoid,\n cod : Monoid,\n splice semigroup.hom : Semigroup.Hom / {dom := f\u00b7dom\u00b7semigroup, cod := f\u00b7cod\u00b7semigroup},\n unit : Id f\u00b7cod\u00b7car (f\u00b7car f\u00b7dom\u00b7unit) f\u00b7cod\u00b7unit\n}\n\nlocale Group { G =>\n splice monoid : Monoid,\n inv : (x : G\u00b7car) \u2192 G\u00b7car,\n inv.inv : (x : G\u00b7car) \u2192 Id G\u00b7car (G\u00b7inv (G\u00b7inv x)) x\n}\n\nlocale AbelianGroup { A =>\n splice group : Group,\n splice commutativeOperation : 
CommutativeOperation / {car := A\u00b7car, cmp := A\u00b7cmp}\n}</pre>\n \n \n \n\n \n\n <h2>Two papers to appear in <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS \u201925</a></h2>\n \n <p>This week I have had two papers accepted at <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS \u201925</a>; I\u2019m excited about both of them.</p>\n \n\n \n\n <h3>With <a href=\"https://www.jonmsterling.com/leonipugh/\">Leoni Pugh</a>: <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">When is the partial map classifier a Sierpi\u0144ski cone?</a></h3>\n \n <p><a href=\"https://www.jonmsterling.com/leonipugh/\">Leoni Pugh</a> is my old Part III student from 2023\u20132024, and this work builds on her Part III dissertation. The goal of our paper was to better understand the relationship between two approaches to partial functions in denotational semantics:</p>\n <ol><li><strong>\u201cGeometrical\u201d partiality / \u201cthe Sierpi\u0144ski cone\u201d</strong>: freely add a lowest element to the space representing a given data type. This is useful for defining functions whose <em>inputs</em> are partially defined, because you can do a case-analysis on the definedness of the input.</li>\n <li><strong>\u201cLogical\u201d partiality / \u201cthe partial map classifier\u201d</strong>: representing partially defined elements of a space X by pairs (p, x) where p is a proposition and x is a function from isTrue(p) to X. This is useful for defining functions whose <em>outputs</em> are partially defined.</li></ol>\n <p>In traditional domain theory as developed by <a href=\"https://www.jonmsterling.com/danascott/\">Scott</a>, the two kinds of partiality coincide\u2014<a href=\"https://www.jonmsterling.com/sterling-2024-lifting/\">even constructively</a>. 
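For concreteness, the second ("logical") notion can be written down in a few lines of Lean; this is my own illustration (essentially the same shape as Mathlib's `Part`), not code from the paper:

```lean
-- A "logically" partial element of X: a definedness proposition
-- together with a value available whenever that proposition holds.
structure PartialElt (X : Type u) : Type u where
  defined : Prop
  value   : defined → X

-- A total element: definedness is trivially `True`.
def PartialElt.ofTotal {X : Type u} (x : X) : PartialElt X :=
  ⟨True, fun _ => x⟩

-- The nowhere-defined element, which plays the role of the fresh
-- bottom element that the geometrical picture adjoins freely.
def PartialElt.undef {X : Type u} : PartialElt X :=
  ⟨False, fun h => h.elim⟩
```

The geometrical picture instead freely adds a single new bottom to X; the paper's question is when these two constructions agree.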
I am, however, interested in <a href=\"https://www.jonmsterling.com/hyland-1991/\"><em>synthetic domain theory</em></a> which abstracts away from continuity and limits and lets you just use sets and functions rather than cpos and continuous functions\u2014provided that you avoid non-constructive principles like the Axiom of Choice or the Law of Excluded Middle. The starting point of our work is my observation that the two notions cannot coincide <em>absolutely</em> in synthetic domain theory, but that there may be restricted subuniverses in which they do coincide. The main result of our paper is to define such a subuniverse, made possible by my discovery of the <em>based Segal condition</em>\u2014a strengthening of the usual Segal condition for higher categories.</p>\n <p>A broader motivation of this work is to develop synthetic domain theory and synthetic higher category theory within the same framework. Whereas synthetic domain theory traditionally concerned itself with spaces that behaved like \u03c9-complete partial orders (but where all functions are automatically monotone and continuous), the same ideas (if applied within <a href=\"https://www.jonmsterling.com/hottbook/\">homotopy type theory</a>) allow you to consider spaces that behave like <em>\u221e-categories</em> with colimits of \u03c9-chains (but where all functions are automatically \u221e-functorial and \u03c9-continuous). I believe that unifying domain theory and higher category theory will prove useful for studying things like the denotational semantics of concurrency, which is inherently higher-dimensional.</p>\n \n \n\n \n\n <h3>With <a href=\"https://www.jonmsterling.com/andrewslattery/\">Andrew Slattery</a>: <a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">Hofmann\u2013Streicher lifting of fibred categories</a></h3>\n \n <p>This year, <a href=\"https://www.jonmsterling.com/thomasstreicher/\">Thomas Streicher</a> (born 1958) passed away from cancer. 
Thomas was one of the Greats of dependent type theory and he also wrote an <a href=\"https://www.abebooks.co.uk/9789812701428/Domain-theoretic-Foundations-Functional-Programming-Streicher-9812701427/plp\">excellent textbook on domain theory for denotational semantics</a>, but much more importantly he was kind and curious and patient and always made time for young people. While I was still finding my place in the community, Thomas was very generous to me with his time and advice, and he sent me many papers to referee.</p>\n <p>Although Thomas made many contributions to dependent type theory, domain theory, realisability theory, and category theory, he is best known to type theorists for two things\u2014both in collaboration with the late <a href=\"https://www.jonmsterling.com/martinhofmann/\">Martin Hofmann</a>: the <a href=\"https://www.jonmsterling.com/hofmann-streicher-1998/\">groupoid interpretation of type theory</a> and the eponymous <a href=\"https://www.jonmsterling.com/hofmann-streicher-1997/\">Hofmann\u2013Streicher universe lifting construction</a>. Andrew\u2019s and my paper pertains to the latter.</p>\n <p>The idea of Hofmann\u2013Streicher lifting has to do with universes, which are \u201ctypes of types\u201d (typically defined in such a way as to avoid paradoxes). Martin-L\u00f6f type theory usually includes universes in order to be able to quantify over (small enough) types; in the simplest models of Martin-L\u00f6f type theory, types are interpreted as sets and so Martin-L\u00f6f\u2019s universes are interpreted as certain sets of sets, such as <a href=\"https://www.jonmsterling.com/sga-4/\">Grothendieck universes</a>. But it is important to be able to interpret the language of type theory in more sophisticated worlds than set theory: for example, in <em>presheaves</em> (which are functors from a fixed category C into Set). 
What <a href=\"https://www.jonmsterling.com/hofmann-streicher-1997/\">Hofmann and Streicher</a> did is show how to transform any universe of sets into a universe of presheaves!</p>\n <p>Although Hofmann and Streicher\u2019s construction worked well and had good properties, they did not find a <em>universal property</em> for it\u2014which is an abstract description of the object that determines it uniquely up to isomorphism, usually in terms of how it relates to other objects. Recently <a href=\"https://www.jonmsterling.com/awodey-2024-universes/\">Awodey</a> found a 1-dimensional universal property, which was the starting point of our work. What Andrew and I wanted to do is generalise Awodey\u2019s analysis in two directions:</p>\n <ol><li>We wanted a <em>2-dimensional</em> version, which is useful because it captures more about the universe than can be said in just one dimension: for example, with a 2-dimensional version, you can see immediately (by \u201cabstract nonsense\u201d) that Hofmann\u2013Streicher lifting preserves structures like monads, adjunctions, etc. that might be used for modelling computational effects, etc.</li>\n <li>We wanted a <em>relative</em> version, which would make it easier to iterate the Hofmann\u2013Streicher lifting construction: the purpose of this is to be able to define presheaf models of type theory <em>internal</em> to other presheaf models. These kinds of situations actually happen in practice! 
For example, the model of <a href=\"https://www.jonmsterling.com/bbcgsv-2019/\">guarded cubical type theory</a> that combines step-indexing with univalence ought to be an example of this.</li></ol>\n <p>To develop this two-fold generalisation of Hofmann\u2013Streicher lifting, we resituated the theory in terms of another of Thomas\u2019s favourite topics: the theory of <em>fibrations</em>, on which Thomas had written <a href=\"https://www.jonmsterling.com/streicher-fcjb/\">the most wonderful lecture notes</a>.</p>\n <p>We dedicated our paper to Thomas\u2019s memory. May he rest in peace.</p>\n \n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/clarke-1979/\">The Fountains of Paradise</a></h2>\n \n <p>I recently read Arthur C. Clarke\u2019s <a href=\"https://www.jonmsterling.com/clarke-1979/\">The Fountains of Paradise</a>; although it was a pretty good read, I found that like many science fiction books of that era, one has to look past a lot in order to enjoy it. I wrote some commentary in my post entitled <a href=\"https://www.jonmsterling.com/019W/\">Ventriloquy of the Mid-Century Man</a> on my culture blog <a href=\"https://www.jonmsterling.com/015X/\">The Jon Sterling Review of Books</a>.</p>",
+18
jonsterling/2025-W16_.json
···+"summary": "<p>I thought I had less to say than <a href=\"https://www.jonmsterling.com/2025-W15/\">last week</a>, but then I started reflecting on my current experiments with retrocomputing and before I knew it, I had written an <a href=\"https://www.jonmsterling.com/01AH/\">entire blog post</a> about it which I have transcluded below.</p>\n \n\n \n\n <h2>Putting Mac OS X Tiger back to work</h2>\n \n <p>Over the Christmas holiday, I bought an iMac off eBay for \u00a350. Why was it so cheap? Because it is a 2006 model firing on a single gigabyte of RAM with an Intel Core 2 Duo processor, running Mac OS X Tiger. When I was a kid, I dreamed of having a computer like this\u2014for me, the white plastic era will always be peak Apple design, and Tiger\u2019s version of Aqua was the most polished and refined form of the design language that they ever managed to produce. My first Macintosh was a white polycarbonate unibody MacBook running Leopard\u2014and at the time I greatly regretted having just missed the Tiger era for the gaudy and overly darkened feel of Leopard with its sinister-coloured window controls. I did not know at the time how much worse things would get\u2026</p>\n <p>My excuse for purchasing this machine was that I \u201cneeded\u201d to run Mac OS X Tiger as \u201cresearch\u201d for my experimental project <a href=\"https://github.com/jonsterling/aquaui\">AquaUI</a>, which imagines how the Aqua design language could have evolved if it had been allowed to. 
But really, I wanted to relive my rare trips to Apple retailers as a kid, where I would spend minutes doing nothing but moving the scrollbar while watching its stationary \u201cwave\u201d texture, or highlighting buttons to see them burst with blue radiance.</p>\n <p>(I spoke about many of the topics covered in this post in my appearance on the <a href=\"https://kodsnack.se/\">Kodsnack podcast</a> hosted by Fredrik Bj\u00f6reman: <em><a href=\"https://kodsnack.se/international/626/\">Episode 626, \u201cThe great flattening of everything\u201d</a></em>.)</p>\n \n\n \n\n <h3>Day One: what can you do, 19-year-old iMac?</h3>\n \n <p>When the delivery came, I took the machine gingerly out of its battered and taped-over original packaging and turned it on to a feeling of great excitement, which quickly gave way to loss and melancholy: so much of what computers are \u201cfor\u201d involves the World Wide Web, and the forced transition to HTTPS/TLS has stolen access to the Web from users of many working computers (unless they gut the machine by downgrading to a modern but severely less polished operating system, like Linux). The old Intel Macs are a prime example of this loss\u2014although some <a href=\"https://tenfourfox.blogspot.com/\">volunteer projects exist</a> to enable safe access to the Web for PowerPC machines, older Intel Macintoshes have received comparatively less attention. Capitalist forced obsolescence comes to all, however, and there will no doubt come a time when the \u201cnecessary\u201d \u201csecurity\u201d routines will simply not be executable with the kinds of hardware resources that could be mustered in 2006, no matter the system architecture. 
After playing around and finding much of the functionality crippled due to being barred from the Internet, I had to ask myself, <strong>What should I even do with this thing?</strong></p>\n <p>The iMac lay dormant in my <a href=\"https://www.jonmsterling.com/00GP/\">College</a> room for the next few months while I figured out an answer to that question.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmiejjxl32uo7vfhgpkbkxuiww4vabuu745bhajcrleqi7uz4blqjay.gif\" width=\"300px\">\n\n \nMy iMac sleeping peacefully in my office at <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a> Memorial Court, connected to a vintage A1048 keyboard and Apple Pro Mouse (as it should be!). Nearby is my iPod Classic, which I use for about an hour each day and charge once every 6-8 weeks.\n \n \n\n \n\n <h3>With a little love, <em>everything</em> has a use</h3>\n \n <p>Last week I finally realised that there is a lot I can still do with this machine. I turned it on when I had a bit of free time, and found that it remains very snappy\u2014software opens instantly without hanging, and in fact the built-in programs are significantly less bug-ridden than they were in subsequent versions of Mac OS X and its (unworthy) successor macOS. To put this into perspective, the \u201coutdated\u201d iMac\u2019s performance was far better than that of my last Intel iMac from 2020 with sixteen times the RAM and several times as many processor cores.</p>\n <p>It is well-known that hardware capabilities get better and better each year, but this did not translate into improved performance for users until after the Apple Silicon transition\u2014when the hardware improvement was so great that it was able to outpace the deathmarch of inefficient software, for a time. 
Don\u2019t worry, the \u201ctransition to AI\u201d is going to destroy all those gains soon and we\u2019ll be back where we started.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmiaaxhqvbslbd3sb6mafhlrxswbdbwipj4bzhuewlg7il73mhq7l7y.jpeg\" width=\"300px\">\n\n \nMac OS X Tiger is still King\u2014with the peak versions of Finder, Preview, and iTunes.\n <p>But I digress. Even if you can\u2019t use the Web, there are many things that a 19-year-old iMac running Mac OS X Tiger is better at than a more recently manufactured machine. For example, Tiger was the last version of Mac OS X in which Preview.app (the PDF and image viewer) had a working search interface; from the subsequent version (Leopard) all the way until the present day, searching is somehow both too fuzzy and not fuzzy enough, and there seems to be no combination of quotation marks that will lead to reasonable results appearing in the search pane. (Same with Mail.app, which has somehow got <em>even worse</em> in the past year; you can\u2019t connect to email on such an old machine anyway, so the point is moot.)</p>\n <p>Similarly, iTunes 7 was the functional peak for Apple\u2019s music management and playback software (although iTunes 6 was visually superior), and people who have only used Apple\u2019s current \u201cMusic\u201d app will not be able to understand what they are missing. Likewise, the version of Finder shipped with Tiger was the most polished and least buggy version they ever produced; it is really amazing to switch back and forth between macOS 15.3 and Mac OS X 10.4, and find that most of the bugs or usability problems I have encountered on a daily basis for the past decade or so are actually <em>regressions</em>.</p>\n \n\n \n\n <h4>The perfect music and PDF reading workstation</h4>\n \n <p>So I transferred my music and PDF libraries to the iMac\u2014this was easy to do by creating a local WiFi network from the iMac, a functionality that has been removed in macOS(!). 
Indeed, modern macOS has replaced some (but not all) aspects of this functionality with what is called \u201cInternet Sharing\u201d, but this feature does not work reliably and in many cases the needful functionalities are unpredictably grayed out and disabled without any message explaining why. Death by a thousand papercuts... But I digress: I set up a local WiFi network with a file server easily using the old <em>System Preferences</em> application (don\u2019t get me started on the bizarre redesign of System Settings introduced in macOS Ventura), and transferred everything I wanted to the iMac, and then I was off to the races.</p>\n <p>I listen to music and study papers on this machine, and it gives me so much joy to <em>use</em> this masterpiece of practical industrial design every day\u2014I even write referee reports on it using an ancient version of <a href=\"https://www.omnigroup.com/omnioutliner/\">OmniOutliner</a>, a venerable piece of software that I have to say has not improved much in the past two decades. After installing a copy of <a href=\"https://macintoshgarden.org/apps/scrivener\">Scrivener 2.5</a> (don\u2019t worry, I own a license for <a href=\"https://www.literatureandlatte.com/\">Scrivener 3.0</a> and you should too!), I find myself doing creative writing in my free time like it\u2019s 2006.</p>\n \n \n\n \n\n <h4>What about my iPod Classic?</h4>\n \n <p>Some of you may be aware that I use an iPod Classic every day. The thing is a godsend\u2014the best mobile device I own. I bought it with a fresh battery and SSD, and the damn battery lasts for months before I have to recharge it. That is the kind of technology that was taken from us and replaced by resource-intensive devices governed by the logic of planned obsolescence. 
But I have it back\u2014my world is not the same as your world, but it is a world I am glad to have returned to.</p>\n <p>Naturally, the first thing I wanted to do was use the iMac as a hub for synchronising the iPod with iTunes. This will work, but what I did not anticipate is that one of my main uses of the iPod is to listen to podcasts, and podcasts cannot be downloaded on the iMac because of the vicious imposition of TLS on all parts of the web that didn\u2019t need it (<a href=\"https://letsencrypt.org/\">Let\u2019s Encrypt</a> really ought to have been called <em>Let\u2019s Kill The Open Web</em>). So I continue synchronising my iPod with my modern-day MacBook Air\u2014and it is a testament to Apple\u2019s historical approach to backward compatibility that this is still possible (and even integrated with the otherwise terrible Podcasts app!).</p>\n \n \n \n\n \n\n <h3>Is retrocomputing sustainable?</h3>\n \n <p>I constantly feel a pang in the back of my throat when I think about retrocomputing over the long term. We are scrounging around for intact pieces of old technology, but there will come a time when these are too scarce, or when we have really lost the ability to repair them. It is like living in a post-apocalyptic film where a cataclysm has made all manufacturing impossible\u2014but today the cataclysm is not a war or even a virus, but just the simple vicious logic of Capital and a technology industry that has hitched itself to the most ignorant and anti-human trends emanating from the most technologically ignorant people on Wall Street.</p>\n <p>Retrocomputing is decidedly <em>not</em> sustainable, in the same sense that living on a stash of canned foods that can no longer be manufactured cannot be sustainable. 
But also unsustainable is the present-day technological treadmill of consumer goods containing precious metals and dangerous chemicals being produced in the billions and sent almost directly to the landfill.</p>\n <p>I think a better question to ask is whether retrocomputing is <em>progressive</em>. I think that retrocomputing can be progressive insofar as it is part of a practice of looking <em>forward</em>\u2014how can we build sovereign technology that respects constrained resources as well as users of different abilities, and cannot be taken away or made useless by Capital and the irrational whims of the stock market? Such a project <em>must</em> have a significant design component, and this cannot be done amateurishly; looking to the iconic design languages of the past for inspiration and education is, then, deeply progressive in this environment.</p>\n <p>The tragedy of Mac OS X is not that Apple took it away and replaced it with something inferior: the tragedy is that free software communities have never managed to produce something even remotely approaching its level of fit and finish. 
Volunteer projects do not deserve my ire, which I reserve for our economic system in which nearly <em>all</em> competent design work is confined to corporate environments, and then wiped away when the wind shifts.</p>\n \n \n\n \n\n <h3>Bring back the joy in computing!</h3>\n \n <p>Forget your troubles and find something that makes you smile, and reminds you of what is possible when a team brimming with talent comes together to build something beautiful.</p>\n <p>Write to me with any joyful and quirky projects, hardware or software, that you would like to share.</p>\n \n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/jemlord/\">Jem Lord</a> at <a href=\"https://www.jonmsterling.com/hott-uf-2025/\">HoTT/UF</a> in Genoa</h2>\n \n <p>My first-year PhD student <a href=\"https://www.jonmsterling.com/jemlord/\">Jem Lord</a> presented <a href=\"https://hott-uf.github.io/2025/abstracts/HoTTUF_2025_paper_21.pdf\">their work on <em>Easy Parametricity</em></a> at the <a href=\"https://www.jonmsterling.com/hott-uf-2025/\">HoTT/UF</a> workshop in Genoa this week. Although I was not able to come in person, a little birdy told me that Jem gave a very good talk, so I\u2019m proud of them for that. Congratulations, Jem!</p>\n <p>Jem\u2019s work concerns a very simple parametricity axiom for a universe U in type theory: namely, that every U-small type A:U be U-null in the sense of <a href=\"https://www.jonmsterling.com/rijke-shulman-spitters-2020/\">Rijke, Shulman and Spitters</a>. This is a mathematical way to say that small types \u201ccannot see\u201d their universe; another way to phrase it is that every function f:U\u2192A for A:U is constant. One of Jem\u2019s results, which has a startling proof(!), is that when C is a category that is complete with respect to certain U-small diagrams and D is a locally U-small category, any \u201cunnatural\u201d transformation between functors F,G:C\u2192D is automatically natural. 
Many similar results can be obtained in the same way.</p>\n <p>There are a variety of models of these axioms. One example is the impredicative universe of modest types within a category of assemblies, which is the \u201cstandard\u201d categorical model of both System F and the original calculus of constructions. The same principle will work within <a href=\"https://www.jonmsterling.com/uemura-2019-types/\">cubical assemblies</a>.</p>\n \n \n\n \n\n <h2>Speaking at <a href=\"https://www.jonmsterling.com/yamcats-37/\">YaMCaTS 37</a> next week</h2>\n \n <p>I am travelling to Sheffield on Tuesday to speak at the <a href=\"https://www.jonmsterling.com/yamcats-37/\">Yorkshire and Midlands Category Theory Seminar</a> the following day. I haven\u2019t prepared my talk yet, but I will be speaking about my <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">joint work</a> with <a href=\"https://www.jonmsterling.com/leonipugh/\">Leoni Pugh</a> on the geometry of partial map classifiers, which <a href=\"https://www.jonmsterling.com/01A6/\">I discussed in my previous weeknote</a>. 
I\u2019ll be returning to Cambridge on Thursday, in time to have a <a href=\"https://www.jonmsterling.com/00GP/\">College</a> lunch the following day with <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil</a>\u2019s student <a href=\"https://www.jonmsterling.com/patrickferris/\">Patrick Ferris</a>, who has been doing <a href=\"https://patrick.sirref.org/weekly-2025-03-31/\">very interesting things</a> with <a href=\"https://www.forester-notes.org/index/\">Forester</a>.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a></h2>\n \n <p><a href=\"https://www.jonmsterling.com/mitchellriley/\">Mitchell Riley</a> suggested I read <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a> by Adrian Tchaikovsky next, a more recent science fiction novel than <a href=\"https://www.jonmsterling.com/019W/\">what I have been reading lately</a>. I haven\u2019t gotten very deep into it yet, but so far I am enjoying it with a few reservations. 
The quality of writing, in the literary sense, is somehow still lower than my usual expectations\u2014and the development of human characters is as flimsy and hackneyed as I have come to expect in this genre, but I have to say that the approach to characterising both sentient and non-sentient spiders is actually creative and engaging.</p>\n <p>Stay tuned for a future review in my <a href=\"https://www.jonmsterling.com/015X/\">culture blog</a>!</p>",
+18
jonsterling/2025-W17_.json
···+"summary": "<p>It has been a good but busy week. I have been moving more slowly than recently, as I did a tremendous number on my muscles and joints while working in my garden on the weekend. Hope to feel better soon.</p>\n \n\n \n\n <h2>We have AI at home\u2026</h2>\n \n <p>On Tuesday, I travelled by train to Sheffield to take part in the <a href=\"https://www.jonmsterling.com/yamcats-37/\">Yorkshire and Midlands Category Theory Seminar 37</a> meeting, where I would be <a href=\"https://www.jonmsterling.com/sterling-2025-yamcats-37/\">speaking</a> about my <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">paper</a> that compares partial map classifiers with Sierpi\u0144ski cones in synthetic (domain/category) theory, which I <a href=\"https://www.jonmsterling.com/01A6/\">summarised previously</a>.</p>\n \n\n \n\n <h3>A pleasant surprise</h3>\n \n <p>I was preparing for my chalk talk when I realised that I could not remember the details of the proof of the main result and they couldn\u2019t really be reconstructed from the abbreviated proof in the paper.</p>\n <p>Luckily, I had actually formalised this result in Agda! I did not mention my formalisation in the paper because I do not think of formalisations as scientific contributions except in certain cases (that is a conversation for another day). But I did indeed formalise it because the proof was subtle enough that I needed computerised assistance back when I proved it the first time. The result I obtained was frustratingly weak, and seemed to require some annoying side conditions in order to go through; the formalisation helped me be certain that these side conditions were in fact sufficient.</p>\n <p>Anyway, I was messing around with the code and what I realised was that I had missed a trick back then: <strong>one of the side conditions was actually unnecessary</strong>, and it seems kind of likely that the other one is unnecessary too. 
I am certain I would not have noticed this if I hadn't had the proof assistant, which made it easy for me to try something out and see if it worked. I should have time to update the paper to claim the strong result prior to the <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS</a> camera-ready deadline next month.</p>\n \n \n\n \n\n <h3>Arise, symbolic AI!</h3>\n \n <p>There is a lot of discussion lately of the impact that some current machine learning techniques, marketed as \u201cArtificial Intelligence\u201d, can have on formalisation of mathematics in proof assistants. Some of the <a href=\"https://www.math.ucla.edu/~tao/\">most esteemed</a> members of the mathematical community have gone <em>all in</em> on this trend <span>(is it a requirement of scientific fame and esteem that you begin to cause trouble in areas of research that you know nothing about?)</span>, but I think that evaluating LLMs on Olympiad questions is really missing the point of what computers can do to assist mathematicians. Olympiads are a good fit for LLMs, because kids who participate in Olympiads are behaving much more like LLMs than human mathematicians\u2014the mathematics Olympiad is the ultimate feat of pattern-recognition without understanding, and they are certainly a good fit for the <em>Might Makes Right</em> approach being taken within AI today.</p>\n <p>Agda (and Lean and Rocq and Isabelle) are \u201cArtificial Intelligences\u201d in the most progressive sense\u2014they augment the limited context that a human can store in their mind at once, and are nimble tools for working mathematicians to check and verify their ideas, and (most importantly) they do not proceed by creating a fetish of illusion and misdirection that deceives the public. Their capabilities are limited, but well-circumscribed. 
I think often about how important it is to know in a definite sense what a tool can and cannot do, and I increasingly think that this is actually part of what makes something a <em>tool</em>. Some of my colleagues have compared LLMs to calculators, in order to make the case that we should get ready for them to be used as everyday tools; but LLMs are not simply tools in the sense that a calculator is a tool.</p>\n \n \n \n\n \n\n <h2>Progress on the <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> Language Server</h2>\n \n <p><a href=\"https://www.jonmsterling.com/kentookura/\">Kento Okura</a> has made a lot of progress over the past week in getting Forester\u2019s language server to the point where it can be used. The first editor that we will support is Neovim, which has good LSP support built-in. I think, however, that Kento had not realised quite what a huge amount of work it is to get a working Neovim configuration from scratch that exercises the features of the language server and actually works out-of-the-box on other people\u2019s machines. To address this problem, we will be providing a complete working configuration for anyone who wants to use it; experienced users of Neovim will of course prefer to set things up in their own way. Some <a href=\"https://github.com/kentookura/forester-nvim-config\">preliminary code</a> is available, but please stay tuned for further updates that take more advantage of the capabilities of Kento\u2019s language server.</p>\n \n \n\n \n\n <h2>Lunch with <a href=\"https://www.jonmsterling.com/patrickferris/\">Patrick Ferris</a></h2>\n \n <p>I had a very pleasant lunch in <a href=\"https://www.jonmsterling.com/00GP/\">College</a> with <a href=\"https://www.jonmsterling.com/patrickferris/\">Patrick Ferris</a> as my guest; we discussed many things, including the future of Forester and the importance of <strong>interop</strong> between different authoring tools on the World Wide Web. 
After lunch, Patrick and I wandered over to Espresso Lane where we had a coffee and a chat with <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil Madhavapeddy</a> and David Allsopp.</p>\n <p>Conspiring about the Open Web with my colleagues in the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">Energy and Environment Group</a> here is making me feel scientifically alive again\u2014there is much to do, and we intend to have fun doing it.</p>\n \n \n\n \n\n <h2>De-enshittifying <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a> infrastructure</h2>\n \n <p>Not many people are aware that the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a> has an old supplier agreement with Fastmail, which has persisted even after the (ill-advised!) transition to Microsoft Office 365 a few years ago. <span>(There is a certain kind of person whom you can always trust to make poor and irreversible technical decisions, and argue for them on the basis of maintainability or security or liability or all of the above! Whenever you refute their technical arguments, there is always an unbounded source of further reasons why it is <em>mandatory</em> and <em>inevitable</em> that we enshittify our own infrastructure, at great cost of course!)</span> Anyway, the savvier members of the <a href=\"https://www.jonmsterling.com/camcl/\">Lab</a> have been rocking Fastmail all this time while I have been suffering the constant outages and inconsistencies of Office 365, which is not only a horrible and unreliable product, but also interacts very poorly with standards-based clients other than Microsoft\u2019s own unusable and bloated clients.</p>\n <p>A month or two ago, <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil</a> let me in on the secret, and I quickly asked the Lab sysadmins to hook me up with a Fastmail account. The process was not completely trivial, as apparently nobody had asked to use this facility for a very long time. 
But in the end, Piete and Malcom were extremely helpful and I am now up and running with a Lab Fastmail account! <strong>PhD students, postdocs, and faculty are <em>all</em> entitled to use Fastmail if they choose, and I strongly recommend it.</strong> If you are one of those people, then you have access to <a href=\"https://www.cst.cam.ac.uk/local/sys/mail/fastmail\">this internal page</a>, which contains the instructions for getting an account.</p>\n <p>Although we will never be able to get professional services staff off of Office 365 and Teams (this is the sense in which such moves are irreversible), there is absolutely no reason why we have to use it too. I encourage everyone within the Lab to join me on Fastmail, which is extremely reliable and usable. And the more of us who depend on it, the stronger the insurance against forced enshittification in the future.</p>\n <p>Fastmail is just the beginning. With <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil</a> and other members of EEG, I am hoping that we can begin the process of taking back control of our internal infrastructure and making it work for us in the way it used to years before I arrived. I am spooked by recent proposals from University IT to drop <a href=\"https://talks.cam.ac.uk/dates\">talks.cam</a>; it seems to me that taking over the administration and maintenance of such critical infrastructure would be a good fit for our capabilities, and I promise that we can do it for less than the millions that the University would pay a vendor to irrevocably enshittify this infrastructure.</p>",+"content": "<p>It has been a good but busy week. I have been moving more slowly than recently, as I did a tremendous number on my muscles and joints while working in my garden on the weekend. 
Hope to feel better soon.</p>\n \n\n \n\n <h2>We have AI at home\u2026</h2>\n \n <p>On Tuesday, I travelled by train to Sheffield to take part in the <a href=\"https://www.jonmsterling.com/yamcats-37/\">Yorkshire and Midlands Category Theory Seminar 37</a> meeting, where I would be <a href=\"https://www.jonmsterling.com/sterling-2025-yamcats-37/\">speaking</a> about my <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">paper</a> that compares partial map classifiers with Sierpi\u0144ski cones in synthetic (domain/category) theory, which I <a href=\"https://www.jonmsterling.com/01A6/\">summarised previously</a>.</p>\n \n\n \n\n <h3>A pleasant surprise</h3>\n \n <p>I was preparing for my chalk talk when I realised that I could not remember the details of the proof of the main result and they couldn\u2019t really be reconstructed from the abbreviated proof in the paper.</p>\n <p>Luckily, I had actually formalised this result in Agda! I did not mention my formalisation in the paper because I do not think of formalisations as scientific contributions except in certain cases (that is a conversation for another day). But I did indeed formalise it because the proof was subtle enough that I needed computerised assistance back when I proved it the first time. The result I obtained was frustratingly weak, and seemed to require some annoying side conditions in order to go through; the formalisation helped me be certain that these side conditions were in fact sufficient.</p>\n <p>Anyway, I was messing around with the code and what I realised was that I had missed a trick back then: <strong>one of the side conditions was actually unnecessary</strong>, and it seems kind of likely that the other one is unnecessary too. I am certain I would not have noticed this if I hadn't had the proof assistant, which made it easy for me to try something out and see if it worked. 
I should have time to update the paper to claim the strong result prior to the <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS</a> camera-ready deadline next month.</p>\n \n \n\n \n\n <h3>Arise, symbolic AI!</h3>\n \n <p>There is a lot of discussion lately of the impact that some current machine learning techniques, marketed as \u201cArtificial Intelligence\u201d, can have on formalisation of mathematics in proof assistants. Some of the <a href=\"https://www.math.ucla.edu/~tao/\">most esteemed</a> members of the mathematical community have gone <em>all in</em> on this trend <span>(is it a requirement of scientific fame and esteem that you begin to cause trouble in areas of research that you know nothing about?)</span>, but I think that evaluating LLMs on Olympiad questions is really missing the point of what computers can do to assist mathematicians. Olympiads are a good fit for LLMs, because kids who participate in Olympiads are behaving much more like LLMs than human mathematicians\u2014the mathematics Olympiad is the ultimate feat of pattern-recognition without understanding, and they are certainly a good fit for the <em>Might Makes Right</em> approach being taken within AI today.</p>\n <p>Agda (and Lean and Rocq and Isabelle) are \u201cArtificial Intelligences\u201d in the most progressive sense\u2014they augment the limited context that a human can store in their mind at once, and are nimble tools for working mathematicians to check and verify their ideas, and (most importantly) they do not proceed by creating a fetish of illusion and misdirection that deceives the public. Their capabilities are limited, but well-circumscribed. I think often about how important it is to know in a definite sense what a tool can and cannot do, and I increasingly think that this is actually part of what makes something a <em>tool</em>. 
Some of my colleagues have compared LLMs to calculators, in order to make the case that we should get ready for them to be used as everyday tools; but LLMs are not simply tools in the sense that a calculator is a tool.</p>\n \n \n \n\n \n\n <h2>Progress on the <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> Language Server</h2>\n \n <p><a href=\"https://www.jonmsterling.com/kentookura/\">Kento Okura</a> has made a lot of progress over the past week in getting Forester\u2019s language server to the point where it can be used. The first editor that we will support is Neovim, which has good LSP support built-in. I think, however, that Kento had not realised quite what a huge amount of work it is to get a working Neovim configuration from scratch that exercises the features of the language server and actually works out-of-the-box on other people\u2019s machines. To address this problem, we will be providing a complete working configuration for anyone who wants to use it; experienced users of Neovim will of course prefer to set things up in their own way. Some <a href=\"https://github.com/kentookura/forester-nvim-config\">preliminary code</a> is available, but please stay tuned for further updates that take more advantage of the capabilities of Kento\u2019s language server.</p>\n \n \n\n \n\n <h2>Lunch with <a href=\"https://www.jonmsterling.com/patrickferris/\">Patrick Ferris</a></h2>\n \n <p>I had a very pleasant lunch in <a href=\"https://www.jonmsterling.com/00GP/\">College</a> with <a href=\"https://www.jonmsterling.com/patrickferris/\">Patrick Ferris</a> as my guest; we discussed many things, including the future of Forester and the importance of <strong>interop</strong> between different authoring tools on the World Wide Web. 
After lunch, Patrick and I wandered over to Espresso Lane where we had a coffee and a chat with <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil Madhavapeddy</a> and David Allsopp.</p>\n <p>Conspiring about the Open Web with my colleagues in the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">Energy and Environment Group</a> here is making me feel scientifically alive again\u2014there is much to do, and we intend to have fun doing it.</p>\n \n \n\n \n\n <h2>De-enshittifying <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a> infrastructure</h2>\n \n <p>Not many people are aware that the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a> has an old supplier agreement with Fastmail, which has persisted even after the (ill-advised!) transition to Microsoft Office 365 a few years ago. <span>(There is a certain kind of person whom you can always trust to make poor and irreversible technical decisions, and argue for them on the basis of maintainability or security or liability or all of the above! Whenever you refute their technical arguments, there is always an unbounded source of further reasons why it is <em>mandatory</em> and <em>inevitable</em> that we enshittify our own infrastructure, at great cost of course!)</span> Anyway, the savvier members of the <a href=\"https://www.jonmsterling.com/camcl/\">Lab</a> have been rocking Fastmail all this time while I have been suffering the constant outages and inconsistencies of Office 365, which is not only a horrible and unreliable product, but also interacts very poorly with standards-based clients other than Microsoft\u2019s own unusable and bloated clients.</p>\n <p>A month or two ago, <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil</a> let me in on the secret, and I quickly asked the Lab sysadmins to hook me up with a Fastmail account. The process was not completely trivial, as apparently nobody had asked to use this facility for a very long time. 
But in the end, Piete and Malcom were extremely helpful and I am now up and running with a Lab Fastmail account! <strong>PhD students, postdocs, and faculty are <em>all</em> entitled to use Fastmail if they choose, and I strongly recommend it.</strong> If you are one of those people, then you have access to <a href=\"https://www.cst.cam.ac.uk/local/sys/mail/fastmail\">this internal page</a>, which contains the instructions for getting an account.</p>\n <p>Although we will never be able to get professional services staff off of Office 365 and Teams (this is the sense in which such moves are irreversible), there is absolutely no reason why we have to use it too. I encourage everyone within the Lab to join me on Fastmail, which is extremely reliable and usable. And the more of us who depend on it, the stronger the insurance against forced enshittification in the future.</p>\n <p>Fastmail is just the beginning. With <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil</a> and other members of EEG, I am hoping that we can begin the process of taking back control of our internal infrastructure and making it work for us in the way it used to years before I arrived. I am spooked by recent proposals from University IT to drop <a href=\"https://talks.cam.ac.uk/dates\">talks.cam</a>; it seems to me that taking over the administration and maintenance of such critical infrastructure would be a good fit for our capabilities, and I promise that we can do it for less than the millions that the University would pay a vendor to irrevocably enshittify this infrastructure.</p>",
+18
jonsterling/2025-W18_.json
···+"summary": "<p>I was a bit unwell over the weekend and hence much of my week was spent recuperating. I am feeling much better now, but becoming healthily aware of my limitations and need for rest.</p>\n \n\n \n\n <h2>A jaunt through descriptive complexity theory!</h2>\n \n <p>I had the pleasure of reading a draft of a fascinating Part II project dissertation on computational aspects of descriptive complexity theory this week, written by a very talented student at <a href=\"https://www.jonmsterling.com/00GP/\">my college</a>. I won\u2019t say much about it, of course, but it was one of those projects that goes far beyond the material we teach in the course\u2014although I have opinions about the limitations of the Computer Science Tripos as a tool for actually teaching people computer science, I must say that it does a very good job getting <em>out of the way</em> of students who need space to run and learn.</p>\n <p>By the way, I read the thesis on my <a href=\"https://www.jonmsterling.com/01AH/\">white plastic iMac</a> and compiled my feedback using an ancient version of <a href=\"https://www.omnigroup.com/omnioutliner\">OmniOutliner</a>. Converting the outline to a PDF, moving it to my modern laptop by creating a local network, and sending it by email is a simple enough workflow. OmniOutliner can even read and write OPML, which I can then feed into <a href=\"https://www.jonmsterling.com/0085/\">Bike</a> if I like\u2014or convert to any format I want using XSLT. That is the beauty of a standard: interop spanning decades.</p>\n \n \n\n \n\n <h2>Difficult coherences for the Sierpi\u0144ski cone</h2>\n \n <p>Last week, I <a href=\"https://www.jonmsterling.com/01AT/\">wrote about a pleasant surprise</a> concerning the results of my <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">LICS 2025 paper</a>: one of two annoying side conditions from the main theorem can be dropped, and I conjectured that I could find a way to remove the other side condition. 
I spent much of the week working on the latter, and unfortunately I have come up a bit empty. There is some real difficulty with the higher coherences here, and although I have tried to attack them from several directions, they are not budging.</p>\n <p>I am thinking that I will just make the one improvement prior to the camera-ready submission, and leave the other one for a future paper should I ever solve it.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a>, <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a></h2>\n \n <p>I finished <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Adrian Tchaikovsky</a>\u2019s science fiction debut <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a> last week\u2014my <a href=\"https://www.jonmsterling.com/01AQ/\">initial impressions</a> were a little hesitant, but I have to say that the quality of the writing improved rapidly as I went onward. There is something a little strange about Tchaikovsky\u2019s style that <em>takes you out</em> of the fictional world and makes you overly aware of reality: he is constantly making analogies that are sensible to you and me but would not make sense to the characters in the book. I don\u2019t think this is necessary <em>even</em> when you are writing about spiders, but I suppose I am learning a bit about my own taste. In any case, I loved the book. I am completely terrified of spiders, but I can recommend this to any arachnophobe\u2014I will write more in my actual review later.</p>\n <p>I started reading the sequel, <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a>, and straightaway I can say that the writing quality has improved at least by a factor of two. 
With <em>Ruin</em>, I find Tchaikovsky hitting a strong rhythm, and his depiction of the alien world <em>Nod</em> is as enchanting as his characterisation of humans (and their arachnid comrades) is compelling.</p>",+"content": "<p>I was a bit unwell over the weekend and hence much of my week was spent recuperating. I am feeling much better now, but becoming healthily aware of my limitations and need for rest.</p>\n \n\n \n\n <h2>A jaunt through descriptive complexity theory!</h2>\n \n <p>I had the pleasure of reading a draft of a fascinating Part II project dissertation on computational aspects of descriptive complexity theory this week, written by a very talented student at <a href=\"https://www.jonmsterling.com/00GP/\">my college</a>. I won\u2019t say much about it, of course, but it was one of those projects that goes far beyond the material we teach in the course\u2014although I have opinions about the limitations of the Computer Science Tripos as a tool for actually teaching people computer science, I must say that it does a very good job getting <em>out of the way</em> of students who need space to run and learn.</p>\n <p>By the way, I read the thesis on my <a href=\"https://www.jonmsterling.com/01AH/\">white plastic iMac</a> and compiled my feedback using an ancient version of <a href=\"https://www.omnigroup.com/omnioutliner\">OmniOutliner</a>. Converting the outline to a PDF, moving it to my modern laptop by creating a local network, and sending it by email is a simple enough workflow. OmniOutliner can even read and write OPML, which I can then feed into <a href=\"https://www.jonmsterling.com/0085/\">Bike</a> if I like\u2014or convert to any format I want using XSLT. 
That is the beauty of a standard: interop spanning decades.</p>\n \n \n\n \n\n <h2>Difficult coherences for the Sierpi\u0144ski cone</h2>\n \n <p>Last week, I <a href=\"https://www.jonmsterling.com/01AT/\">wrote about a pleasant surprise</a> concerning the results of my <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">LICS 2025 paper</a>: one of two annoying side conditions from the main theorem can be dropped, and I conjectured that I could find a way to remove the other side condition. I spent much of the week working on the latter, and unfortunately I have come up a bit empty. There is some real difficulty with the higher coherences here, and although I have tried to attack them from several directions, they are not budging.</p>\n <p>I am thinking that I will just make the one improvement prior to the camera-ready submission, and leave the other one for a future paper should I ever solve it.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a>, <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a></h2>\n \n <p>I finished <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Adrian Tchaikovsky</a>\u2019s science fiction debut <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a> last week\u2014my <a href=\"https://www.jonmsterling.com/01AQ/\">initial impressions</a> were a little hesitant, but I have to say that the quality of the writing improved rapidly as I went onward. There is something a little strange about Tchaikovsky\u2019s style that <em>takes you out</em> of the fictional world and makes you overly aware of reality: he is constantly making analogies that are sensible to you and me but would not make sense to the characters in the book. 
I don\u2019t think this is necessary <em>even</em> when you are writing about spiders, but I suppose I am learning a bit about my own taste. In any case, I loved the book. I am completely terrified of spiders, but I can recommend this to any arachnophobe\u2014I will write more in my actual review later.</p>\n <p>I started reading the sequel, <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a>, and straightaway I can say that the writing quality has improved at least by a factor of two. With <em>Ruin</em>, I find Tchaikovsky hitting a strong rhythm, and his depiction of the alien world <em>Nod</em> is as enchanting as his characterisation of humans (and their arachnid comrades) is compelling.</p>",
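The OPML workflow described in the entry above relies on OPML being a tiny, stable XML dialect that any stock XML library can read. As a minimal sketch of that kind of decades-spanning interop, the snippet below parses an outline with Python's standard library; the outline content itself is invented sample data, not taken from the post.

```python
# Minimal sketch: reading an OPML outline with only the Python standard
# library. OPML nests <outline> elements inside <body>; the sample document
# below is hypothetical, standing in for an OmniOutliner export.
import xml.etree.ElementTree as ET

OPML_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<opml version="2.0">
  <head><title>Feedback</title></head>
  <body>
    <outline text="Chapter 1">
      <outline text="Clarify the main theorem statement"/>
      <outline text="Fix citation format"/>
    </outline>
  </body>
</opml>"""

def flatten(parent, depth=0):
    """Yield (depth, text) pairs for every <outline> node under `parent`."""
    for child in parent.findall("outline"):
        yield depth, child.get("text", "")
        yield from flatten(child, depth + 1)

root = ET.fromstring(OPML_SAMPLE)
items = list(flatten(root.find("body")))

# Render the outline as an indented bullet list (e.g. for Markdown or email).
for depth, text in items:
    print("  " * depth + "- " + text)
```

From here the same tree could just as easily be serialised to any other format, which is the point the post makes about XSLT: once the data is in a standard container, the target format is a free choice.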
+18
jonsterling/2025-W19_.json
···+"summary": "<h2><a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">Hofmann\u2013Streicher lifting</a> and the biadjoint triangle theorem</h2>\n \n <p><a href=\"https://www.jonmsterling.com/andrewslattery/\">Andrew Slattery</a> and I were preparing the camera-ready version of our accepted <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS \u201925</a> <a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">paper</a> (which I <a href=\"https://www.jonmsterling.com/01A7/\">summarised a few weeks ago</a>) when we received a very interesting suggestion from <a href=\"https://www.jonmsterling.com/nathanaelarkor/\">Nathanael Arkor</a>: at least the <em>existence</em> of the right pseudo-adjoint that we construct ought to be guaranteed by the 2-dimensional version of Dubuc\u2019s <a href=\"https://ncatlab.org/nlab/show/adjoint+triangle+theorem\">adjoint triangle lemma</a>, which has been worked out by <a href=\"https://www.jonmsterling.com/nunes-2016/\">Nunes</a>.</p>\n <p>If we black-box <a href=\"https://www.jonmsterling.com/nunes-2016/\">Nunes</a>\u2019s results, it is fairly clear how to reconstruct Nathanael\u2019s argument, which is really cool. To go deeper requires an understanding of 2-dimensional descent objects, which the literature unfortunately explains pretty poorly (there are many things that appear to be variants on simplicial shapes, or lax versions thereof, but lacking a number of degeneracies for reasons that nobody seems to explain). This is a forbidding area that seems to be understood by a small community of experts, but which would really benefit from some more systematic exposition. 
It is something I would really like to understand better.</p>\n <p>As for our paper, our completely explicit computation of the pseudoadjoint seems to still be a contribution (as this is what allows us at the moment to conclude a connection to Hofmann\u2013Streicher lifting), but it will be a very good thing indeed to show how the existence of the pseudoadjoint follows from purely formal manipulations. I\u2019m excited to learn more\u2014so thanks, <a href=\"https://www.jonmsterling.com/nathanaelarkor/\">Nathanael</a>!</p>\n \n \n\n \n\n <h2>Direction of studies and the community of teachers</h2>\n \n <p>One thing I\u2019ve noticed during my first year as a director of studies is that we do not always provide adequate guidance to new supervisors. There is a mandatory training session run by the University, but course-specific guidance tends to be spotty and limited. In the worst cases (I\u2019ve had more than one of these!), someone agrees to supervise some students and then forgets about it or tells us \u201cOh yeah, I can\u2019t do that anymore, best of luck!\u201d right as the term starts. Putting aside these extreme cases which happen far more often than I would like, I have started to find that a bit more of a hands-on approach to onboarding supervisors and communicating course expectations and learning objectives for our students is going to be useful.</p>\n <p>To that end, I have started inviting new supervisors to have a coffee and a discussion toward the beginning of term. This is a simple practice that doesn\u2019t take up too much time, but I believe it can solve a number of problems that stem from unclear expectations and ambiguous procedures (including providing clarity as to how many supervision-hours the College will actually pay for!). 
But perhaps more importantly, I am finding that these kinds of interactions may also have some potential for rebuilding community across the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a>\u2014which I discuss in more detail <a href=\"https://www.jonmsterling.com/01B4/\">below</a>.</p>\n <p>We need to invest more in our supervisors; this obviously includes advocating for fair pay increases (which I sadly have no control over), but it also includes connecting and conspiring with supervisors as equals with a common goal\u2014the integrated intellectual development of our undergraduates through high quality and inspiring teaching interactions. One thing my PhD advisor <a href=\"https://www.jonmsterling.com/robertharper/\">Bob Harper</a> always instilled in me is that some of the best research comes from figuring out how to teach something; in the context of strengthening the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a> as an intellectual community, it seems to me that breaking down the topical silos and having an earnest cross-field conversation about teaching <em>from the bottom up</em> is a good place to start.</p>\n \n \n\n \n\n <h2>Rethinking community in the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a></h2>\n \n <p>I am told that our sense of community in the Lab historically relied on in-person faculty meetings and group seminars and impromptu hallway chats, but the reality today is (1) in the aftermath of the pandemic, not many of us are regularly in the <a href=\"https://www.jonmsterling.com/camcl/\">Lab</a> anyway, and (2) seminars tend to be a waste of everyone\u2019s time and do not facilitate connections outside narrow research specialisations.</p>\n <p>The Faculty\u2019s abandonment of the Lab will definitely get worse rather than better: the Lab is (illegally?) 
imposing automatic lighting on all offices, to be rolled out over the next year or two; <em>this means that we will no longer be allowed to control whether our lights are turned on except by <strong>jumping up and down</strong> (when we wish them to be on) or <strong>keeping extremely still</strong> (when we wish them to be off)</em>. As soon as this vicious \u201cenvironmental update\u201d ripples up to the First Floor, I will no longer come to the Lab at all except for once a week or so\u2014if that.</p>\n <p>In \u2018light\u2019 of all this (pun intended), I think that we need to be thinking about sustainable practices to make more direct and intentional connections with our colleagues, including both faculty and students. Adopting the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">Energy and Environment Group</a>\u2019s culture of internal blogging and weeknotes has gone off like a bomb in my intellectual life, for example; this asynchronous practice has informed me far more about my colleagues\u2019 work and goals than a dozen hours spent in seminars ever could. 
Blogging is not an alternative to meeting and talking in person; but I am starting to think that it is a <em>prerequisite</em> for the moments of serendipity that the latter can engender, because the ongoing dialogue of blogs and weeknotes makes me sufficiently <em>informed</em> to have a conversation that goes beyond the superficial.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a> and the OE <em>Exodus</em></h2>\n \n <p>As I mentioned <a href=\"https://www.jonmsterling.com/01AZ/\">last week</a>, I\u2019ve been making my way through <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Tchaikovsky</a>\u2019s <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a>, the sequel to <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a>. It is getting better and better, and moving in directions that I had not anticipated. I will not say much today, but months ago when I <a href=\"https://www.jonmsterling.com/015W/\">alluded to our ruinous and outrageous behaviour toward the octopus</a>, I did not anticipate that I would be reading a novel featuring (even more) sentient octopodes.</p>\n <p>I have also started dipping my toes back into the pool of my old love affair with dead languages. Last week on a whim, I picked up a copy of the <a href=\"https://en.wikipedia.org/wiki/Old_English\">Old English</a> <a href=\"https://en.wikipedia.org/wiki/Exodus_(poem)\"><em>Exodus</em></a> at my <a href=\"https://www.jonmsterling.com/00GP/\">College</a> library and I finally started reading it this week. For those of you who don\u2019t know, Old English is the ancestor of our current tongue but is as different from Modern English as the latter is from Swedish or Norwegian. At one time I was very good at Old English, and I hope to become so again.</p>\n \n\n\n Hw\u00e6t! 
We feor and neah\u2003\u2003 gefrigen hab[b]a\u00f0\n ofer middangeard\u2003\u2003 Moyses domas,\n wr\u00e6clico wordriht,\u2003\u2003 wera cneorissum,\u2014\n in uprodor\u2003\u2003 eadigra gehwam\n \u00e6fter bealusi\u00f0e\u2003\u2003 bote lifes,\n lifigendra gehwam\u2003\u2003 langsumne r\u00e6d,\u2014\n h\u00e6le\u00f0um secgan.\u2003\u2003 Gehyre se \u00f0e wille!\n \n\n \nThe first seven lines of the OE <em>Exodus</em>.\n <p>There is a particular genre of Christian \u201ccultural translation\u201d literature to be found among the converted Germanic peoples that is extremely appealing. The idea is that the scripture of Christianity is rewrought into the artistic forms that are culturally familiar among the people, with a number of liberties taken\u2014think about how some Churches today try to pick up engagement among the youth by portraying the acts of Christ in a more, shall we say, \u2026\u201curban\u201d\u2026 light.</p>\n <p>In this case, however, the stories of the Bible are told in the form of epic verse (alliterative half-lines in the oldest Germanic tradition). The first exemplar of this genre that I came into contact with many years ago was the <a href=\"https://en.wikipedia.org/wiki/Heliand\"><em>Heliand</em></a>, an epic <a href=\"https://en.wikipedia.org/wiki/Old_Saxon\">Old Saxon</a> re-telling of the New Testament in which Christ and his apostles take on the characteristics of a Germanic warlord and his retainers. The <em>Exodus</em> similarly casts Moses into a culturally appropriate role and retells the story of the Israelites\u2019 liberation from bondage and acquisition of the Law in epic verse.</p>\n <p>I have not got more than twenty lines in, because my Old English is so much slower than it was in the old days. 
But it has been very enjoyable to revisit this language and culture that I once knew so well.</p>",+"content": "<h2><a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">Hofmann\u2013Streicher lifting</a> and the biadjoint triangle theorem</h2>\n \n <p><a href=\"https://www.jonmsterling.com/andrewslattery/\">Andrew Slattery</a> and I were preparing the camera-ready version of our accepted <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS \u201925</a> <a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">paper</a> (which I <a href=\"https://www.jonmsterling.com/01A7/\">summarised a few weeks ago</a>) when we received a very interesting suggestion from <a href=\"https://www.jonmsterling.com/nathanaelarkor/\">Nathanael Arkor</a>: at least the <em>existence</em> of the right pseudo-adjoint that we construct ought to be guaranteed by the 2-dimensional version of Dubuc\u2019s <a href=\"https://ncatlab.org/nlab/show/adjoint+triangle+theorem\">adjoint triangle lemma</a>, which has been worked out by <a href=\"https://www.jonmsterling.com/nunes-2016/\">Nunes</a>.</p>\n <p>If we black-box <a href=\"https://www.jonmsterling.com/nunes-2016/\">Nunes</a>\u2019s results, it is fairly clear how to reconstruct Nathanael\u2019s argument, which is really cool. To go deeper requires an understanding of 2-dimensional descent objects, which the literature unfortunately explains pretty poorly (there are many things that appear to be variants on simplicial shapes, or lax versions thereof, but lacking a number of degeneracies for reasons that nobody seems to explain). This is a forbidding area that seems to be understood by a small community of experts, but which would really benefit from some more systematic exposition. 
It is something I would really like to understand better.</p>\n <p>As for our paper, our completely explicit computation of the pseudoadjoint seems to still be a contribution (as this is what allows us at the moment to conclude a connection to Hofmann\u2013Streicher lifting), but it will be a very good thing indeed to show how the existence of the pseudoadjoint follows from purely formal manipulations. I\u2019m excited to learn more\u2014so thanks, <a href=\"https://www.jonmsterling.com/nathanaelarkor/\">Nathanael</a>!</p>\n \n \n\n \n\n <h2>Direction of studies and the community of teachers</h2>\n \n <p>One thing I\u2019ve noticed during my first year as a director of studies is that we do not always provide adequate guidance to new supervisors. There is a mandatory training session run by the University, but course-specific guidance tends to be spotty and limited. In the worst cases (I\u2019ve had more than one of these!), someone agrees to supervise some students and then forgets about it or tells us \u201cOh yeah, I can\u2019t do that anymore, best of luck!\u201d right as the term starts. Putting aside these extreme cases which happen far more often than I would like, I have started to find that a bit more of a hands-on approach to onboarding supervisors and communicating course expectations and learning objectives for our students is going to be useful.</p>\n <p>To that end, I have started inviting new supervisors to have a coffee and a discussion toward the beginning of term. This is a simple practice that doesn\u2019t take up too much time, but I believe it can solve a number of problems that stem from unclear expectations and ambiguous procedures (including providing clarity as to how many supervision-hours the College will actually pay for!). 
But perhaps more importantly, I am finding that these kinds of interactions may also have some potential for rebuilding community across the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a>\u2014which I discuss in more detail <a href=\"https://www.jonmsterling.com/01B4/\">below</a>.</p>\n <p>We need to invest more in our supervisors; this obviously includes advocating for fair pay increases (which I sadly have no control over), but it also includes connecting and conspiring with supervisors as equals with a common goal\u2014the integrated intellectual development of our undergraduates through high quality and inspiring teaching interactions. One thing my PhD advisor <a href=\"https://www.jonmsterling.com/robertharper/\">Bob Harper</a> always instilled in me is that some of the best research comes from figuring out how to teach something; in the context of strengthening the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a> as an intellectual community, it seems to me that breaking down the topical silos and having an earnest cross-field conversation about teaching <em>from the bottom up</em> is a good place to start.</p>\n \n \n\n \n\n <h2>Rethinking community in the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a></h2>\n \n <p>I am told that our sense of community in the Lab historically relied on in-person faculty meetings and group seminars and impromptu hallway chats, but the reality today is (1) in the aftermath of the pandemic, not many of us are regularly in the <a href=\"https://www.jonmsterling.com/camcl/\">Lab</a> anyway, and (2) seminars tend to be a waste of everyone\u2019s time and do not facilitate connections outside narrow research specialisations.</p>\n <p>The Faculty\u2019s abandonment of the Lab will definitely get worse rather than better: the Lab is (illegally?) 
imposing automatic lighting on all offices, to be rolled out over the next year or two; <em>this means that we will no longer be allowed to control whether our lights are turned on except by <strong>jumping up and down</strong> (when we wish them to be on) or <strong>keeping extremely still</strong> (when we wish them to be off)</em>. As soon as this vicious \u201cenvironmental update\u201d ripples up to the First Floor, I will no longer come to the Lab at all except for once a week or so\u2014if that.</p>\n <p>In \u2018light\u2019 of all this (pun intended), I think that we need to be thinking about sustainable practices to make more direct and intentional connections with our colleagues, including both faculty and students. Adopting the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">Energy and Environment Group</a>\u2019s culture of internal blogging and weeknotes has gone off like a bomb in my intellectual life, for example; this asynchronous practice has informed me far more about my colleagues\u2019 work and goals than a dozen hours spent in seminars ever could. 
Blogging is not an alternative to meeting and talking in person; but I am starting to think that it is a <em>prerequisite</em> for the moments of serendipity that the latter can engender, because the ongoing dialogue of blogs and weeknotes makes me sufficiently <em>informed</em> to have a conversation that goes beyond the superficial.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a> and the OE <em>Exodus</em></h2>\n \n <p>As I mentioned <a href=\"https://www.jonmsterling.com/01AZ/\">last week</a>, I\u2019ve been making my way through <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Tchaikovsky</a>\u2019s <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a>, the sequel to <a href=\"https://www.jonmsterling.com/tchaikovsky-2015/\">Children of Time</a>. It is getting better and better, and moving in directions that I had not anticipated. I will not say much today, but months ago when I <a href=\"https://www.jonmsterling.com/015W/\">alluded to our ruinous and outrageous behaviour toward the octopus</a>, I did not anticipate that I would be reading a novel featuring (even more) sentient octopodes.</p>\n <p>I have also started dipping my toes back into the pool of my old love affair with dead languages. Last week on a whim, I picked up a copy of the <a href=\"https://en.wikipedia.org/wiki/Old_English\">Old English</a> <a href=\"https://en.wikipedia.org/wiki/Exodus_(poem)\"><em>Exodus</em></a> at my <a href=\"https://www.jonmsterling.com/00GP/\">College</a> library and I finally started reading it this week. For those of you who don\u2019t know, Old English is the ancestor of our current tongue but is as different from Modern English as the latter is from Swedish or Norwegian. At one time I was very good at Old English, and I hope to become so again.</p>\n \n\n\n Hw\u00e6t! 
We feor and neah\u2003\u2003 gefrigen hab[b]a\u00f0\n ofer middangeard\u2003\u2003 Moyses domas,\n wr\u00e6clico wordriht,\u2003\u2003 wera cneorissum,\u2014\n in uprodor\u2003\u2003 eadigra gehwam\n \u00e6fter bealusi\u00f0e\u2003\u2003 bote lifes,\n lifigendra gehwam\u2003\u2003 langsumne r\u00e6d,\u2014\n h\u00e6le\u00f0um secgan.\u2003\u2003 Gehyre se \u00f0e wille!\n \n\n \nThe first seven lines of the OE <em>Exodus</em>.\n <p>There is a particular genre of Christian \u201ccultural translation\u201d literature to be found among the converted Germanic peoples that is extremely appealing. The idea is that the scripture of Christianity is rewrought into the artistic forms that are culturally familiar among the people, with a number of liberties taken\u2014think about how some Churches today try to pick up engagement among the youth by portraying the acts of Christ in a more, shall we say, \u2026\u201curban\u201d\u2026 light.</p>\n <p>In this case, however, the stories of the Bible are told in the form of epic verse (alliterative half-lines in the oldest Germanic tradition). The first exemplar of this genre that I came into contact with many years ago was the <a href=\"https://en.wikipedia.org/wiki/Heliand\"><em>Heliand</em></a>, an epic <a href=\"https://en.wikipedia.org/wiki/Old_Saxon\">Old Saxon</a> re-telling of the New Testament in which Christ and his apostles take on the characteristics of a Germanic warlord and his retainers. The <em>Exodus</em> similarly casts Moses into a culturally appropriate role and retells the story of the Israelites\u2019 liberation from bondage and acquisition of the Law in epic verse.</p>\n <p>I have not got more than twenty lines in, because my Old English is so much slower than it was in the old days. But it has been very enjoyable to revisit this language and culture that I once knew so well.</p>",
+18
jonsterling/2025-W20_.json
···+"summary": "<p>This has been a hectic week without a lot to show for it. Over the weekend, I wrapped up the last of my <a href=\"https://www.jonmsterling.com/oopsla-2024-25/\">OOPSLA 2024\u201325</a> reviews, just in time to get whacked with a couple dozen Part II dissertations to mark in the coming weeks.</p>\n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/sterling-ye-2025/\">Domains and Classifying Topoi</a>\n </h2>\n \n <p>Together with my PhD student <a href=\"https://www.jonmsterling.com/lingyuanye/\">Lingyuan Ye</a> I have been putting the finishing touches on a very exciting (to me) manuscript on a connection between synthetic domain theory and the theory of classifying toposes. I will say more about this after we have put our manuscript on the arXiv; for now, I will just say that this is the kind of work I had been hoping to do for a few years now and it has been a pleasure to work on it with Lingyuan, who comes to me week after week with deep insights and results.</p>\n \n \n\n \n\n <h2>Preparing for <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a>: June?</h2>\n \n <p>I would like to get a \u201cbeta\u201d release of <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> out by the end of the month. What this would mean is that <em>advanced</em> users are invited to try it out and start using it without fear of huge changes wrecking their migration daily (which has been the status quo for several months unfortunately). To that end, <a href=\"https://www.jonmsterling.com/kentookura/\">Kento</a> and I have been polishing things up and fixing the long tail of issues that would block a release. Of course, I have been dog-fooding 5.0 for many months.</p>\n <p>One improvement to the lightweight federation support is that when you federate with a <em>published</em> forest, you can choose to have links get routed directly to the published version rather than rendering those trees directly in your own forest. 
There are some trade-offs here, and obviously the current state of federation does not reflect what we will be doing in the future; the current version is meant only to \u201cstem the bleeding\u201d for certain institutional users of Forester while we come up with a better approach.</p>\n \n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/01BT/\">Rowing at </a><a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>\n </h2>\n \n <p>In the past two weeks I\u2019ve been learning how to row under the patient tutelage of <a href=\"https://www.jonmsterling.com/nigelwoodcock/\">Nigel Woodcock</a>, an Emeritus Fellow at <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a> who has been looking after our Boat Club for 25 years now. The motions of rowing are still unnatural for me, but I am improving (albeit slowly); as an arachnophobe, I do not relish the occasional spider in the boat, but otherwise I\u2019m enjoying the experience.</p>\n <p>Getting to the boathouse from the other side of the river is really dangerous\u2014there is the roundabout from Hell guarding Midsummer Common, and there seems to be no way to cross any of the several connected streets without three different vehicles whipping past you in three different directions simultaneously. Somehow the whole thing works like clockwork\u2014unless you are on foot or on a bicycle and don\u2019t have Alexander the Great\u2013level self-confidence. Almost lost my life on Friday\u2026 It would be a great idea to have any kind of pedestrian-friendly crosswalk there. In the meanwhile, I think I want to try and find a different route to the Common.</p>\n \n \n\n \n\n <h2>More on <a href=\"https://www.jonmsterling.com/01B5/\">involuntary lighting in the Lab</a></h2>\n \n <p>Last week I commented on the deeply hostile lighting transition being imposed on the denizens of the Computer Lab. 
Although we appear to have inadvertently triggered this transition in the course of trying to get our lighting replaced in the lecture theatres (which really needed to be done!), it seems this change is coming from Heaven, and short of dethroning God, we wouldn\u2019t have been able to prevent it.</p>\n <p>With that said, things are looking up. It has been communicated to us that we will be allowed to request light switches to manually control the lights; it remains to be seen whether these switches will fully override the automatic (mis)behaviour, or if it just means that we will have the privilege of getting up to switch the light back on when it goes off at random times, or of getting up to switch it off when it goes on at random times. Anyway, I\u2019m sincerely hoping that the accommodation provided is a <em>traditional</em> light switch that totally disables the automatic functionality.</p>\n <p>Some members of the faculty have complained about the quality of the light itself, pointing out that it induces migraine. I sincerely hope that this is not the case, and I would just say that the current fluorescent lights that we have in our offices are so horrible in terms of the light they cast that it is hard to imagine something even worse. So maybe the LED will be better.</p>\n \n\n \n\n <h3>How do we respond to fake \u201csustainability\u201d drives?</h3>\n \n <p>Like I said, I hope the new lighting will be satisfactory. If not, I will request the whole thing be ripped out entirely, and I\u2019ll install the most energy-intensive lamp I can find\u2014something as powerful and expensive to run as the Sun itself. 
I want the whole Computer Lab to feel the dip when I switch on the power\u2026</p>\n <p>Someone has to learn that there will always be a way to maliciously comply with hostile \u201csustainability\u201d updates that negates the supposed benefits entirely; and that this response is <em>guaranteed</em> when \u201csustainability\u201d improvements are made without genuine consultation or regard for the required functionality and the basic rights of workers. Fake consultation, where we are given the opportunity to \u201coffer our feedback\u201d but no credible pathway toward this feedback having any impact on a decision that already has been made, is deeply insulting and we should respond appropriately whilst we still can.</p>",+"content": "<p>This has been a hectic week without a lot to show for it. Over the weekend, I wrapped up the last of my <a href=\"https://www.jonmsterling.com/oopsla-2024-25/\">OOPSLA 2024\u201325</a> reviews, just in time to get whacked with a couple dozen Part II dissertations to mark in the coming weeks.</p>\n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/sterling-ye-2025/\">Domains and Classifying Topoi</a>\n </h2>\n \n <p>Together with my PhD student <a href=\"https://www.jonmsterling.com/lingyuanye/\">Lingyuan Ye</a> I have been putting the finishing touches on a very exciting (to me) manuscript on a connection between synthetic domain theory and the theory of classifying toposes. 
I will say more about this after we have put our manuscript on the arXiv; for now, I will just say that this is the kind of work I had been hoping to do for a few years now and it has been a pleasure to work on it with Lingyuan, who comes to me week after week with deep insights and results.</p>\n \n \n\n \n\n <h2>Preparing for <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a>: June?</h2>\n \n <p>I would like to get a \u201cbeta\u201d release of <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> out by the end of the month. What this would mean is that <em>advanced</em> users are invited to try it out and start using it without fear of huge changes wrecking their migration daily (which has been the status quo for several months unfortunately). To that end, <a href=\"https://www.jonmsterling.com/kentookura/\">Kento</a> and I have been polishing things up and fixing the long tail of issues that would block a release. Of course, I have been dog-fooding 5.0 for many months.</p>\n <p>One improvement to the lightweight federation support is that when you federate with a <em>published</em> forest, you can choose to have links get routed directly to the published version rather than rendering those trees directly in your own forest. 
There are some trade-offs here, and obviously the current state of federation does not reflect what we will be doing in the future; the current version is meant only to \u201cstem the bleeding\u201d for certain institutional users of Forester while we come up with a better approach.</p>\n \n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/01BT/\">Rowing at </a><a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>\n </h2>\n \n <p>In the past two weeks I\u2019ve been learning how to row under the patient tutelage of <a href=\"https://www.jonmsterling.com/nigelwoodcock/\">Nigel Woodcock</a>, an Emeritus Fellow at <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a> who has been looking after our Boat Club for 25 years now. The motions of rowing are still unnatural for me, but I am improving (albeit slowly); as an arachnophobe, I do not relish the occasional spider in the boat, but otherwise I\u2019m enjoying the experience.</p>\n <p>Getting to the boathouse from the other side of the river is really dangerous\u2014there is the roundabout from Hell guarding Midsummer Common, and there seems to be no way to cross any of the several connected streets without three different vehicles whipping past you in three different directions simultaneously. Somehow the whole thing works like clockwork\u2014unless you are on foot or on a bicycle and don\u2019t have Alexander the Great\u2013level self-confidence. Almost lost my life on Friday\u2026 It would be a great idea to have any kind of pedestrian-friendly crosswalk there. In the meanwhile, I think I want to try and find a different route to the Common.</p>\n \n \n\n \n\n <h2>More on <a href=\"https://www.jonmsterling.com/01B5/\">involuntary lighting in the Lab</a></h2>\n \n <p>Last week I commented on the deeply hostile lighting transition being imposed on the denizens of the Computer Lab. 
Although we appear to have inadvertently triggered this transition in the course of trying to get our lighting replaced in the lecture theatres (which really needed to be done!), it seems this change is coming from Heaven, and short of dethroning God, we wouldn\u2019t have been able to prevent it.</p>\n <p>With that said, things are looking up. It has been communicated to us that we will be allowed to request light switches to manually control the lights; it remains to be seen whether these switches will fully override the automatic (mis)behaviour, or if it just means that we will have the privilege of getting up to switch the light back on when it goes off at random times, or of getting up to switch it off when it goes on at random times. Anyway, I\u2019m sincerely hoping that the accommodation provided is a <em>traditional</em> light switch that totally disables the automatic functionality.</p>\n <p>Some members of the faculty have complained about the quality of the light itself, pointing out that it induces migraine. I sincerely hope that this is not the case, and I would just say that the current fluorescent lights that we have in our offices are so horrible in terms of the light they cast that it is hard to imagine something even worse. So maybe the LED will be better.</p>\n \n\n \n\n <h3>How do we respond to fake \u201csustainability\u201d drives?</h3>\n \n <p>Like I said, I hope the new lighting will be satisfactory. If not, I will request the whole thing be ripped out entirely, and I\u2019ll install the most energy-intensive lamp I can find\u2014something as powerful and expensive to run as the Sun itself. 
I want the whole Computer Lab to feel the dip when I switch on the power\u2026</p>\n <p>Someone has to learn that there will always be a way to maliciously comply with hostile \u201csustainability\u201d updates that negates the supposed benefits entirely; and that this response is <em>guaranteed</em> when \u201csustainability\u201d improvements are made without genuine consultation or regard for the required functionality and the basic rights of workers. Fake consultation, where we are given the opportunity to \u201coffer our feedback\u201d but no credible pathway toward this feedback having any impact on a decision that already has been made, is deeply insulting and we should respond appropriately whilst we still can.</p>",
+18
jonsterling/2025-W21_.json
···+"summary": "<p>Another long week involving far less science than I would have liked\u2026</p>\n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/sterling-ye-2025/\">Domains and Classifying Topoi</a>\n </h2>\n \n <p><a href=\"https://www.jonmsterling.com/lingyuanye/\">Lingyuan</a> and I have now uploaded our new <a href=\"https://www.jonmsterling.com/sterling-ye-2025/\">manuscript</a> to the arXiv. Do have a look!</p>\n \n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/01BT/\">Rowing at </a><a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>\n </h2>\n \n <p>After my third outing in the \u201ctub\u201d on Tuesday, <a href=\"https://www.jonmsterling.com/nigelwoodcock/\">Nigel</a> tells me that I\u2019m ready to join the eight. I won\u2019t start until early June because the end of May is so busy for me.</p>\n <p>By the way, after my <a href=\"https://www.jonmsterling.com/01B8/\">harrowing ride last time</a>, I have switched to diverting through Jesus Green\u2014which avoids the Roundabout From Hell, and also exposed me to an interesting beer festival. Thanks very much to <a href=\"https://www.jonmsterling.com/davidallsopp/\">David Allsopp</a> and <a href=\"https://www.jonmsterling.com/ryangibb/\">Ryan Gibb</a> for their helpful advice!</p>\n \n \n\n \n\n <h2>Do Cambridge students benefit from our assessment bureaucracy?</h2>\n \n <p>Marking season begins\u2026 I must mark at a clip of approximately four Part II dissertations per day (including weekends) if I want to have any chance of making our extremely tight deadline. At Cambridge, we have an uncommonly cautious approach to assessment:</p>\n <ol><li>For examination Papers, questions pass through several redundant layers of checking. Each question has a designated \u201cchecker\u201d in addition to the setter. Then after the checker is happy with the question, it goes on to the Examiners, who check each Paper globally and send back suggestions and requests to the setter. 
After all this, it goes on to an External Examiner (from a different university!), who gives additional feedback.</li>\n <li>Part II project dissertations are marked not only by the UTO supervisor (who is either the actual supervisor or a \u201cmeta-supervisor-at-a-distance\u201d of a non-UTO supervisor who is often a PhD student). They are also marked by a second assessor, who is often one of the Examiners but may also be brought onto the team to contribute to marking. For example, I am a Part IA Examiner but I am marking Part II dissertations in addition to my Examiners\u2019 duties.</li></ol>\n \n\n \n\n <h3>Why do we have so much redundancy?</h3>\n \n <p>Our fastidious approach to question setting has some definite advantages, but overall it is far too bureaucratic and has too many levels. An immense amount of time is burnt second-guessing question-setters, in the end leading to doubtful benefits. My colleagues at literally every university in the World except for Oxford are always shocked when I explain this process to them\u2014at normal universities, you just set and mark an exam, and nobody else sees it (aside from the TAs you enlist to help you mark it\u2014something else we don\u2019t do here). This actually works fine at literally every university on the planet except for two.</p>\n <p>On the other hand, the redundant marking of Part II dissertations is a bit more complex, and there are some stronger reasons for it. One reason is concern over conflicts of interest or supervisors\u2019 natural overenthusiasm for their students\u2019 projects. Another reason for having a non-supervisor mark the dissertation is that the latter will usually be an outsider to the topic of specialisation; this ensures that dissertations can be read by someone literate in general computer science who is not a specialist, and this legibility is actually one of the goals of the Part II programme. 
I am not entirely convinced by this, because computer science is not really one field anymore, and it is a bit of a shared delusion that it is <em>ever</em> possible anymore to write up a substantive project in such a way that \u201cthe average member of the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a>\u201d would be able to assess its merits intelligently.</p>\n <p>Anyway, I have a feeling that if we spread the marking load out more evenly and divided by three the amount of scrutiny we impose on question-setting for examinations, the outcomes for our students would be qualitatively similar to what they are now. It is always possible to raise \u201cimportant\u201d concerns about the integrity of the process whose \u201csolution\u201d is naturally ever-more layers of redundancy and checking-the-checkers-who-check-the-checkers-of-the-setter-checkers. At moments like this, it is a good idea to pause and reflect on whether it is better for our students that each faculty member spend a cumulative two months doing literally nothing but assessment and higher-order practices related to assessment, vs. other activities that could benefit our students more (<strong>including <em>actual teaching</em></strong>, of which we do astonishingly little at Cambridge).</p>\n \n \n\n \n\n <h3>Soliciting feedback on IA Examination processes\u2026</h3>\n \n <p>Anyway, I believe I shall be Chair of Examiners for Part IA next year. Many of the actual questions of policy are naturally out of my hands in this role, but I hope to direct the examination and assessment process for IA in as anti-bureaucratic a way as can possibly be achieved whilst working within those policies. 
Any colleagues who have suggestions for practicable improvements to the process are very welcome to have a chat with me about it.</p>\n <p>\n One change I plan to make straightaway is to reform the Examiners\u2019 interaction with setters to be more directly collaborative and less bureaucratic.\n </p>\n <p>Right now, Examiners read all the questions and try their best to come up with criticisms of them (despite our individual uncertain expertise in at least two-thirds of the Paper content). These criticisms are then collated and sent to each setter as a list of requested changes; I believe that a list of changes passed indirectly from the Examiners to the setters is not the most collaborative way to achieve a high quality exam paper. <strong><em>(I sincerely apologise to anyone who has been on the other end of an interaction like this recently: I have heard you, and I will fix it).</em></strong></p>\n <p>A different approach, which I have heard was actually practised prior to the pandemic, was that Examiners would simply hold court and have walk-in chats with setters and iron things out in person without a bureaucratic back-and-forth. Count me in! We will do something like that next year, making necessary allowances for setters\u2019 availability.</p>\n <p>To be honest, every time I hear about how things used to be in the <a href=\"https://www.jonmsterling.com/camcl/\">Lab</a> before the pandemic, I think to myself that we should simply look for every single policy or procedural change that occurred since that era began and simply reverse it and then potentially revisit with the benefit of hindsight. My proposal? 
<code>git checkout -b new-era; git reset --hard HEAD~2000</code> and Godspeed.</p>",+"content": "<p>Another long week involving far less science than I would have liked\u2026</p>\n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/sterling-ye-2025/\">Domains and Classifying Topoi</a>\n </h2>\n \n <p><a href=\"https://www.jonmsterling.com/lingyuanye/\">Lingyuan</a> and I have now uploaded our new <a href=\"https://www.jonmsterling.com/sterling-ye-2025/\">manuscript</a> to the arXiv. Do have a look!</p>\n \n \n\n \n\n <h2>\n <a href=\"https://www.jonmsterling.com/01BT/\">Rowing at </a><a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>\n </h2>\n \n <p>After my third outing in the \u201ctub\u201d on Tuesday, <a href=\"https://www.jonmsterling.com/nigelwoodcock/\">Nigel</a> tells me that I\u2019m ready to join the eight. I won\u2019t start until early June because the end of May is so busy for me.</p>\n <p>By the way, after my <a href=\"https://www.jonmsterling.com/01B8/\">harrowing ride last time</a>, I have switched to diverting through Jesus Green\u2014which avoids the Roundabout From Hell, and also exposed me to an interesting beer festival. Thanks very much to <a href=\"https://www.jonmsterling.com/davidallsopp/\">David Allsopp</a> and <a href=\"https://www.jonmsterling.com/ryangibb/\">Ryan Gibb</a> for their helpful advice!</p>\n \n \n\n \n\n <h2>Do Cambridge students benefit from our assessment bureaucracy?</h2>\n \n <p>Marking season begins\u2026 I must mark at a clip of approximately four Part II dissertations per day (including weekend) if I want to have any chance of making our extremely tight deadline. At Cambridge, we have an uncommonly cautious approach to assessment:</p>\n <ol><li>For examination Papers, questions pass through several redundant layers of checking. Each question has a designated \u201cchecker\u201d in addition to the setter. 
Then after the checker is happy with the question, it goes on to the Examiners, who check each Paper globally and send back suggestions and requests to the setter. After all this, it goes on to an External Examiner (from a different university!), who gives additional feedback.</li>\n <li>Part II project dissertations are marked not only by the UTO supervisor (who is either the actual supervisor or a \u201cmeta-supervisor-at-a-distance\u201d of a non-UTO supervisor who is often a PhD student). They are also marked by a second assessor, who is often one of the Examiners but may also be brought onto the team to contribute to marking. For example, I am a Part IA Examiner but I am marking Part II dissertations in addition to my Examiners\u2019 duties.</li></ol>\n \n\n \n\n <h3>Why do we have so much redundancy?</h3>\n \n <p>Our fastidious approach to question setting has some definite advantages, but overall it is far too bureaucratic and has too many levels. An immense amount of time is burnt second-guessing question-setters, in the end leading to doubtful benefits. My colleagues at literally every university in the World except for Oxford are always shocked when I explain this process to them\u2014at normal universities, you just set and mark an exam, and nobody else sees it (aside from the TAs you enlist to help you mark it\u2014something else we don\u2019t do here). This actually works fine at literally every university on the planet except for two.</p>\n <p>On the other hand, the redundant marking of Part II dissertations is a bit more complex, and there are some stronger reasons for it. One reason is concern over conflicts of interest or supervisors\u2019 natural overenthusiasm for their students\u2019 projects. 
Another reason for having a non-supervisor mark the dissertation is that the latter will usually be an outsider to the topic of specialisation; this ensures that dissertations can be read by someone literate in general computer science who is not a specialist, and this legibility is actually one of the goals of the Part II programme. I am not entirely convinced by this, because computer science is not really one field anymore, and it is a bit of a shared delusion that it is <em>ever</em> possible anymore to write up a substantive project in such a way that \u201cthe average member of the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Lab</a>\u201d would be able to assess its merits intelligently.</p>\n <p>Anyway, I have a feeling that if we spread the marking load out more evenly and divided by three the amount of scrutiny we impose on question-setting for examinations, the outcomes for our students would be qualitatively similar to what they are now. It is always possible to raise \u201cimportant\u201d concerns about the integrity of the process whose \u201csolution\u201d is naturally ever-more layers of redundancy and checking-the-checkers-who-check-the-checkers-of-the-setter-checkers. At moments like this, it is a good idea to pause and reflect on whether it is better for our students that each faculty member spend a cumulative two months doing literally nothing but assessment and higher-order practices related to assessment, vs. other activities that could benefit our students more (<strong>including <em>actual teaching</em></strong>, of which we do astonishingly little at Cambridge).</p>\n \n \n\n \n\n <h3>Soliciting feedback on IA Examination processes\u2026</h3>\n \n <p>Anyway, I believe I shall be Chair of Examiners for Part IA next year. 
Many of the actual questions of policy are naturally out of my hands in this role, but I hope to direct the examination and assessment process for IA in as anti-bureaucratic a way as can possibly be achieved whilst working within those policies. Any colleagues who have suggestions for practicable improvements to the process are very welcome to have a chat with me about it.</p>\n <p>\n One change I plan to make straightaway is to reform the Examiners\u2019 interaction with setters to be more directly collaborative and less bureaucratic.\n </p>\n <p>Right now, Examiners read all the questions and try their best to come up with criticisms of them (despite our individual uncertain expertise in at least two-thirds of the Paper content). These criticisms are then collated and sent to each setter as a list of requested changes; I believe that a list of changes passed indirectly from the Examiners to the setters is not the most collaborative way to achieve a high quality exam paper. <strong><em>(I sincerely apologise to anyone who has been on the other end of an interaction like this recently: I have heard you, and I will fix it).</em></strong></p>\n <p>A different approach, which I have heard was actually practised prior to the pandemic, was that Examiners would simply hold court and have walk-in chats with setters and iron things out in person without a bureaucratic back-and-forth. Count me in! We will do something like that next year, making necessary allowances for setters\u2019 availability.</p>\n <p>To be honest, every time I hear about how things used to be in the <a href=\"https://www.jonmsterling.com/camcl/\">Lab</a> before the pandemic, I think to myself that we should simply look for every single policy or procedural change that occurred since that era began and simply reverse it and then potentially revisit with the benefit of hindsight. My proposal? <code>git checkout -b new-era; git reset --hard HEAD~2000</code> and Godspeed.</p>",
+18
jonsterling/2025-W22_.json
···+"summary": "<h2>Mythic Beasts migration complete!</h2>\n \n <p>I\u2019ve now finished migrating <em>all</em> my domains and hosting to <a href=\"https://www.mythic-beasts.com/\">Mythic Beasts</a>, our wonderful local web host and registrar. Previously I\u2019d handled registration through Hover, which used to have great support but enshittified itself some years ago, to the point where they do not employ support staff who actually know the difference between HTTP and HTTPS. Thankfully, that\u2019s not a concern with Beasts, who ensure competent and knowledgeable support in an interesting way: <a href=\"https://www.mythic-beasts.com/blog/2022/10/21/the-secret-to-great-technical-support-no-support-staff/\">The secret to great technical support? No support staff</a>.</p>\n <p>For hosting, I had previously just thrown stuff up on GitHub Pages and built the static sites using GitHub Actions. This approach has been pretty common for a number of years, but I grew to hate it because every time I want to change one word on my site, I have to blow up the world and install TeX and OCaml and everything else in CI. Caching exists, but that doesn\u2019t stop the whole process from taking ten minutes and being incredibly wasteful of resources, and routinely breaking in inscrutable ways that take hours to debug remotely.</p>\n <p>I now use Mythic Beasts shared hosting with shell access. When I want to upload something, I just rsync it and it appears instantly. 
The best part is that I can use Apache\u2019s <code>.htaccess</code> files to set redirects, which has made it possible for me to do a major cleanup of my forest (the removal of the <code>jms-</code> prefix, which I now advise for new users).</p>\n \n \n\n \n\n <h2><a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> and the <a href=\"https://www.forester-notes.org/QHXS/\">intellectual junkyard</a></h2>\n \n <p>This week has been a big push for bug fixes in <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> prior to release. For a few weeks now, I\u2019ve been daily-driving <a href=\"https://www.jonmsterling.com/kentookura/\">Kento</a>\u2019s <a href=\"https://www.forester-notes.org/MZSF/\">Forester Language Server</a> in Neovim. There remain many paper cuts but the most exciting part about it for me is that, after all these months, the infrastructure for incremental compilation of forests is here and is being put to great use in the language server. This will enable many usability improvements in the coming months.</p>\n <p>I\u2019ve also written a somewhat personal reflection on forests and Zettelk\u00e4sten and blogging: <a href=\"https://www.forester-notes.org/QHXS/\">Intellectual junkyards</a>. I am not yet certain how controversial that post will be, but I hope it sparks some discussion.</p>\n \n \n\n \n\n <h2>Part II marking; PL/Theory in Part II</h2>\n \n <p>I\u2019m almost done with Part II dissertation marking (a few days late), which is a relief. Reconciliation with the primary markers is beginning, which I\u2019m hoping will go smoothly.</p>\n <p>I also took a moment to write some general advice for Theory-inclined students who are thinking about doing a Part II project in programming languages, semantics, etc.: <a href=\"https://www.jonmsterling.com/01BF/\">About Theory projects in Part II</a>. 
Long story short, I encourage students to beware of some likely failure modes for such projects, and keep in mind whether the difficulty and depth of their project is actually comparable to those of their peers in Systems or Machine Learning or Graphics, etc.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a> and <a href=\"https://www.jonmsterling.com/tchaikovsky-2023/\">Children of Memory</a></h2>\n \n <p>I\u2019ve been so busy that I haven\u2019t really had the time or space to read as much fiction as I would like in the past two weeks. However, I have finished <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Tchaikovsky</a>\u2019s <a href=\"https://www.jonmsterling.com/tchaikovsky-2019/\">Children of Ruin</a> which I greatly enjoyed, and moved onto the third book in the trilogy, <a href=\"https://www.jonmsterling.com/tchaikovsky-2023/\">Children of Memory</a>. There is something interestingly <em>progressive</em> and optimistic about these books\u2014without spoiling anything, the conclusion of the first two books was Humanity reaching an elevated understanding with a mortal enemy. <em>Memory</em> is fascinating to me because, so far, it reads to me as <em>Fairy Story</em> more than science fiction; there is something more delicate in Tchaikovsky\u2019s approach to story-telling that is beginning to emerge.</p>",
+18
jonsterling/2025-W23_.json
···+"summary": "<p>Part II dissertation marking is calming down, and I\u2019m almost fully reconciled with the UTO markers; also winding down our work on the <a href=\"https://www.jonmsterling.com/oopsla-2024-25/\">OOPSLA 24-25</a> review committee. Next week we\u2019ve got exams, which I\u2019ll be helping with in my capacity as IA Examiner. I\u2019ve also been continuing to put the finishing touches on the <a href=\"https://www.forester-notes.org/011P/\">Forester 5.0</a> beta which I hope to get out very soon.</p>\n \n\n \n\n <h2>Winding down two Masters/Part III projects</h2>\n \n <p>I\u2019ve been looking after two Masters-level students this year: <a href=\"https://www.jonmsterling.com/runzexue/\">Runze Xue</a> and <a href=\"https://www.jonmsterling.com/zhiyiliu/\">Zhiyi Liu</a>. For his MPhil, Runze has been working on <a href=\"https://www.jonmsterling.com/00AE/\">formalising synthetic domain theory in univalent foundations</a> with particular attention to the topology of the final lifting coalgebra; and for her Part III project, Zhiyi has been working with <a href=\"https://www.jonmsterling.com/marcelofiore/\">Marcelo Fiore</a> and myself on a language for <a href=\"https://homepages.inf.ed.ac.uk/gdp/publications/Abstract_Syn.pdf\">synthetic abstract syntax</a>, with an experimental simulation via Agda\u2019s rewriting system and (on paper) a formal canonicity and conservativity proof using Artin glueing.</p>\n <p>It has been a real pleasure to see both Runze and Zhiyi grow over the year. These are not easy topics to come to terms with, and there is no undergraduate course that prepares you to engage with this kind of material at all. 
So it is a credit to these students that they have been able, with some help, to do some very interesting work in the area.</p>\n \n \n\n \n\n <h2>What is <a href=\"https://www.jonmsterling.com/hottbook/\">homotopy type theory</a> good for?</h2>\n \n <p>One question I often get from people is:</p>\n <blockquote>\n I\u2019m doing ordinary mathematics, and ordinary mathematics fits into ordinary 1-dimensional type theory, like in <a href=\"https://www.jonmsterling.com/019G/\">Lean</a>. So what good is homotopy type theory to me?\n </blockquote>\n <p>One kind of answer is: \u201cWell, ordinary mathematics doesn\u2019t <em>actually</em> fit very well into 1-dimensional type theory in fact, because it usually involves things that cannot be given a 1-dimensional universal property (including universes, or the \u2018set of finite groups\u2019, etc.).\u201d This is a <em>true</em> answer, but not a particularly convincing one to <a href=\"https://www.ma.ic.ac.uk/~buzzard/\">people</a> for whom working around the deficiencies of 1-dimensional type theory is a way of life, or to whom the benefits of universal properties remain opaque.</p>\n <p>A different kind of answer is: \u201cIf you extend 1-dimensional type theory to homotopy type theory, you can often find theorems whose <em>statements</em> make sense in the former but whose <em>proofs</em> are simplest in the latter.\u201d This is identical to the way that working with real numbers becomes much simpler if you allow intermediate calculations to pass through the complex numbers. 
Anyway, something like that happened to me on Wednesday night.</p>\n <p>There is a kind of strange theorem (<a href=\"https://www.jonmsterling.com/01BK/\">Concerning initial lift algebras under lex modalities</a>) that I had conjectured on Monday to my student <a href=\"https://www.jonmsterling.com/lingyuanye/\">Lingyuan Ye</a>, which is totally low-dimensional and therefore can be stated without using univalence or any homotopical things at all. I got the idea for a proof just before <a href=\"https://www.jonmsterling.com/01BS/\">rowing</a>, and worked out the details the next morning. What was interesting about the proof is that it relied crucially on the univalence principle of homotopy type theory, which ensures that for an <a href=\"https://www.jonmsterling.com/rijke-shulman-spitters-2020/\">accessible left exact modality</a>, the universe of modal types is itself modal. Even though my proof \u201cpassed through\u201d homotopical notions, the result still applies to the non-homotopical models I originally was interested in.</p>\n <p>I suspect there must also be a more subtle non-homotopical proof of the result, but what matters is the proof you have, not the proof you want. At some point, when you live within the univalent foundations like I do, you stop worrying about it and embrace the fact that many subtle things become easy and direct in the presence of univalence. Taking univalence for granted in the 2020s is kind of like adopting the axiom of choice in the 1910s: the world is kind of skeptical that it can have true implications for the things they care about, but your own work is supercharged to the point that you would never turn back.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01BT/\">Rowing at </a><a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>: first time in the eight!</h2>\n \n <p>On Wednesday, I had my first outing in the <a href=\"https://en.wikipedia.org/wiki/Eight_(rowing)\">eight</a>! 
It was definitely very different from rowing in the tub. I need a few more outings to get comfortable with it\u2014and I think I need to adjust the placement of the shoe plate for next time. The next day, my calves were in sorry shape (and so were my arms\u2014a sign I need to correct my form!)...</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/tchaikovsky-2023/\">Children of Memory</a></h2>\n \n <p>I was unable to put down <a href=\"https://www.jonmsterling.com/tchaikovsky-2023/\">Children of Memory</a>; I finished it on Thursday and was left totally in shambles. There is a lot to unpack here, and I will need some time.</p>",
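For readers outside the area, here is a minimal sketch in LaTeX of the two ingredients the post appeals to: the univalence principle, and the fact (stated in the post, following the cited Rijke–Shulman–Spitters paper) that for a left exact modality the universe of modal types is itself modal. The notation is the standard HoTT Book style and is illustrative, not tied to any particular formalisation.

```latex
% Univalence: for types $A, B$ in a universe $\mathcal{U}$, the canonical map
% from identifications of types to equivalences is itself an equivalence:
\[
  \mathsf{ua} : (A =_{\mathcal{U}} B) \simeq (A \simeq B)
\]
% The property used in the proof: for a left exact modality $\bigcirc$, the
% subuniverse of $\bigcirc$-modal types is itself $\bigcirc$-modal:
\[
  \mathcal{U}_{\bigcirc} :\equiv \textstyle\sum_{A : \mathcal{U}} \mathsf{isModal}_{\bigcirc}(A),
  \qquad
  \mathsf{isModal}_{\bigcirc}(\mathcal{U}_{\bigcirc})
\]
```

The second statement is the one that has no analogue in 1-dimensional type theory: carving out $\mathcal{U}_{\bigcirc}$ as a type at all, let alone proving it modal, already requires the universe machinery that univalence makes well-behaved.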
+18
jonsterling/2025-W24_.json
···+"summary": "<p>This week: a <a href=\"https://www.jonmsterling.com/01BX/\">blog post</a> about my thoughts on Apple\u2019s design announcements at WWDC this week; some <a href=\"https://www.jonmsterling.com/01BW/\">thoughts</a> on the emergence of \u201cForester-likes\u201d, or alternative implementations of <a href=\"https://www.forester-notes.org/index/\">Forester</a>; and a <a href=\"https://www.jonmsterling.com/01C0/\">distinguished paper award at LICS 2025</a>.</p>\n \n\n \n\n <h2>Thoughts on Apple\u2019s new design language</h2>\n \n <p>As many predicted, Apple unveiled at its Worldwide Developer Conference a new design language for all its platforms centred around a material that they call <em>Liquid Glass</em>. I have some personal reflections about my time as an iOS platform app developer during the iOS 7 transition, and some thoughts about what the new design language may mean for the remaining independent developers whose businesses have not been destroyed by the App Store\u2019s race to the bottom.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmidhyzep3x5zm2cfjnxwvxtbd2x5mw7blzpevgqtzrn334mvxia7fi.png\" width=\"320px\">\n \nA screenshot of the Music app with the new Liquid Glass design. Source: Apple.\n <p>(I will not speak much here about the merits (or lack thereof) of the new design language. There is a lot to say and critique there, but there\u2019s also some reason for hope.)</p>\n \n\n \n\n <h3>Flat design was about de-skilling app development</h3>\n \n <p>If you believe that the purpose of a system is what it does, <strong>the purpose of the iOS 7 redesign was to de-skill app development</strong>. 
Admittedly this sounds like a conspiracy theory that ignores Apple designers\u2019 stated motivations, but my experience is that whenever there is a business case for something, that thing will simply happen and those involved in the transition tend to explain it to themselves in ways that flatter their sensibilities\u2014a macrocosm of the epiphenomenalist hypothesis for the world of business.</p>\n <p>The economic context of the transition, returning to the early 2010s, is that Apple\u2019s native platforms were losing ground to (objectively terrible for users) cross-platform alternatives in large part because of the exorbitantly high cost of designing platform-native apps to the standard set in the visually and materially rich design language of iOS 6 and below. Think about that terrible \u201cweb view inside an app\u201d thing that your phone provider makes you use in which scrolling is broken and back-buttons are dangerous to press, and which constantly logs you out in the middle of a task, or stalls on a 10-factor authentication labyrinth, or charges your credit card twice due to a lag in responding to a confirmation button press, and you will know exactly what I mean.</p>\n \n\n \n\n <h4>App development in the iOS 6 era</h4>\n \n <p>I was a native mobile app developer in both eras, and I\u2019ll tell you that a serious iOS 6 app would involve hundreds of designer-hours producing meticulous custom graphics for most controls\u2014designed to be thematically harmonious with the system appearance, but customised to delight and surprise: think wooden navigation bars with inset text that looks like it was carved with a router. 
After this artwork was produced (naturally at <code>1x</code> and <code>2x</code> resolution, as we were still in the throes of the Retina transition!), the engineers would take ahold of it and begin overriding the <code>-drawRect:</code> methods of many views, which was often non-trivial due to the need to change the behaviour of views managed deep within system classes.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmia3ecx4gmvm6dqjbsczhxk36cjjftakyacugtygjhdzp2cjvnu7ye.png\" width=\"320px\">\n \nA screenshot of <em>Runenstein</em>, a rune catalogue that I designed and built many years ago.\n \n<img src=\"https://www.jonmsterling.com/bafkrmibm6clyddrodrwlhhxltxaqw34rkmkjndk5ndiceyf5cjnq2abc5a.jpg\" width=\"320px\">\n \nA screenshot of <em>Yardsale</em>, the pre-iOS 7 iPhone app that I worked on with Ed McManus, Ryan Mickle, and Michael Sanders in the early 2010s. Source: <a href=\"https://www.wired.com/2012/06/yardsale-app/\">Wired</a>\n \n \n\n \n\n <h4>App development post iOS 7</h4>\n \n <p>By way of contrast, designing an app <em>post</em> iOS 7 is considerably less expensive: there are essentially no custom graphics at all, and the only thing the designers are doing is choosing colours and fonts to \u201chighlight the brand\u201d. If there are custom controls, they can be drawn without an expensive designer\u2019s intervention, as in nearly all cases, these are just ugly buttons with slightly non-standard shapes that someone with no skills at all can easily draw in Quartz\u2014or SwiftUI. Certainly there is no engineer sweating over pixels and perfecting the custom animations that support the delightful illusion of material.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmiadomrvb5rcomnozp7wc72gsmsj6mvctuwxpz5gysq2m5lq4w3paa.png\" width=\"320px\">\n \nA screenshot of <em>FOBO</em>, a live auction app that I built together with the Yardsale team during the iOS 7 transition. Suddenly one\u2019s brand could be reduced to a colour. 
Source: <a href=\"https://laughingsquid.com/fobo-an-app-for-auctioning-used-electronics-in-97-minutes/\">Laughing Squid</a>\n \n \n\n \n\n <h4>What did \u201ccentering the content\u201d achieve?</h4>\n \n <p>I have no doubt that behind Jony Ive\u2019s prattling about \u201ccentering the content\u201d, which Alan Dye has taken to new extremes more recently, was an actual business case that Apple considered to be of existential importance: if the cost of native application development is not lowered dramatically, native application development will (for all intents and purposes) cease. It is not lost on me that Apple\u2019s de-skilling strategy had the exact opposite of the likely intended effect: there have never been as many non-native apps on Apple platforms as there are today, and I believe there are two reasons for this.</p>\n <ol><li>With the advent of Apple Silicon, performance is no longer a strong differentiator for native apps. Many Electron apps (including Visual Studio Code) perform <em>better</em> than native alternatives.</li>\n <li>In the era of flat design, in which intricate and materially rich design has been \u201ccancelled\u201d, visual beauty and detail are no longer strong market differentiators for native apps, nor is respect for platform-specific functionality (like proxy icons on the Macintosh!) that is increasingly de-emphasised in Apple\u2019s native toolkits.</li></ol>\n \n \n \n\n \n\n <h3>Liquid Glass is a gift to the indies</h3>\n \n <p>I was listening to <a href=\"https://atp.fm/\">Accidental Tech Podcast</a>\u2019s <a href=\"https://atp.fm/643\">discussion of the new design language</a> and one thing that struck me was Marco Arment\u2019s prescient comment that essentially no corporate apps besides Apple\u2019s will adopt it. There are three reasons for this:</p>\n <ol><li>Large corporations have gotten used to treating Apple\u2019s decade-long <em>lack</em> of design as a blank slate on which to paint their \u201cbrand\u201d. 
Suppressing the \u201cbrand\u201d to unify with the system appearance is a complete non-starter in the corporate world. If you suggest something like that, you will be laughed out of the room.</li>\n\n <li>Most smaller corporate apps were designed and built by consultants rather than in-house, and no small company will be able to justify dropping an additional $200K+ on an app refresh.</li>\n\n <li>Most corporate apps are using some unwieldy cross-platform toolkit like React or Flutter anyway (enough said).</li></ol>\n <p>I think the Liquid Glass design presents an opportunity for independent app developers to differentiate themselves from the competition in ways that have not been possible since before iOS 7. The return of texture and depth and active light and subtle animation means that those who treat app development as <em>craft</em> will be able to create vastly different experiences from those created by consultants or even corporate in-house teams whose business motives do not include platform integration or, indeed, delight. (Not all is rosy: the changes to icon dimensions and composition represent a <em>new</em> de-skilling manoeuvre by Apple\u2014but for users, it is hard to say that this is worse than the present dystopia of soulless glyphs.)</p>\n <p>These prospects for craft seem not to depend on whether the Liquid Glass design is actually good and accessible for users\u2014it is just complex enough that (good or not) it will lead to the kind of differentiation that we had on both iOS and the Mac OS X platforms in the old days\u2014when an app that was either non-native or poorly crafted (usually both) stood out like a sore thumb in ways that regular users could notice just by touching a control and seeing what happens when you move things around, or finding they can select text that should not be selectable, or scrolling to reveal incorrect insets and content boundaries, etc. 
Attempts to replicate the Liquid Glass material using web technologies will likely lead to stuttering scrolling and drained batteries, which (again) regular users will be able to notice.</p>\n <p>So, whilst I\u2019m shaken by the potential for a further degraded user interface on the Macintosh, I\u2019m more optimistic than I thought I would be about the prospects for independent Apple platform application development in the next ten years. I\u2019m also not certain what this means for <a href=\"https://www.github.com/jonsterling/AquaUI\">AquaUI</a>\u2014I need to experiment with Liquid Glass to better understand its strengths and weaknesses before returning to that project.</p>\n \n \n \n\n \n\n <h2>Alternative Foresters: let them bloom!</h2>\n \n <p>I\u2019ve been really excited to see at least two \u201cForester-likes\u201d, i.e. projects aiming to provide alternative implementations of (or takes on) <a href=\"https://www.forester-notes.org/index/\">Forester</a>:</p>\n <ol><li><a href=\"https://github.com/kokic/kodama\">Kodama</a> is a Forester-like created by <a href=\"https://kokic.github.io/kokic/\">Kokic Liu</a> that aims to provide great support for Typst (rather than LaTeX), and Markdown for its source language. Kodama is also written in Rust, and licensed under the GPL. Here\u2019s a <a href=\"https://kokic.github.io/\">demonstration</a>.</li>\n \n <li><a href=\"https://tr-notes.srht.site/\">TR</a> is a Forester-like built on the Racket/Scribble ecosystem by <a href=\"https://www.jonmsterling.com/dannypsnl/\">L\u00eem Ts\u00fa-thu\u00e0n</a>; unlike Forester, TR aims to be <a href=\"https://josem.co/the-beauty-of-finished-software/\">finished software</a>. TR is licensed permissively.</li></ol>\n <p>Both of these projects are interesting in their own right. It\u2019s definitely a good idea to consider alternatives to the LaTeX ecosystem, and building on Typst could pay off; time will tell, which is one reason why I\u2019m glad someone is doing it. 
I\u2019m especially sympathetic to the goals of TR; although Forester will continue to evolve (and perhaps inspire further forks and reimplementations), I think that finished software plays an important and underrated role. So kudos to both <a href=\"https://kokic.github.io/kokic/\">Kokic Liu</a> and <a href=\"https://www.jonmsterling.com/dannypsnl/\">L\u00eem Ts\u00fa-thu\u00e0n</a>!</p>\n <p>On the technical side, I\u2019m skeptical of Markdown. There are strong reasons why I did not adopt it, many of which you can learn by reading between the lines <a href=\"https://www.forester-notes.org/tfmt-0005/\">here</a> and <a href=\"https://www.forester-notes.org/tfmt-000E/\">here</a>. I do believe that there could be a well-behaved <em>sublanguage</em> of Markdown that could be used. In the meanwhile, people with legacy notes in Markdown can either play with <a href=\"https://github.com/kokic/kodama/\">Kodama</a> or they can try out <a href=\"https://www.jonmsterling.com/patrickferris/\">Patrick Ferris</a>\u2019s awesome <a href=\"https://patrick.sirref.org/graft/\">Graft</a> tool, which is a standalone preprocessor to generate Forester source code from both Markdown and BibTeX.</p>\n <p>Going forward, I\u2019d love to see many more projects that either build on Forester or the ideas underlying Forester. Alternative implementations are, in some sense, even better than tools that build directly on Forester because they decrease development inertia for all parties and allow the emerging community to work more explicitly towards <em>interoperability</em> within the Open Web. 
Interop is the only weapon we have against platform and tool hegemony.</p>\n \n \n\n \n\n <h2>A distinguished paper at LICS 2025</h2>\n \n <p>I was really pleased to find out that my paper with <a href=\"https://www.jonmsterling.com/andrewslattery/\">Andrew Slattery</a> on <a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">Hofmann\u2013Streicher lifting of fibred categories</a> has been selected as a distinguished paper by the LICS 2025 programme committee. Andrew and I will be preparing a longer and more detailed version of this paper for publication in a special issue of <a href=\"https://www.jonmsterling.com/lmcs/\">Logical Methods in Computer Science</a>. Congratulations, Andrew!</p>",+"content": "<p>This week: a <a href=\"https://www.jonmsterling.com/01BX/\">blog post</a> about my thoughts on Apple\u2019s design announcements at WWDC this week; some <a href=\"https://www.jonmsterling.com/01BW/\">thoughts</a> on the emergence of \u201cForester-likes\u201d, or alternative implementations of <a href=\"https://www.forester-notes.org/index/\">Forester</a>; and a <a href=\"https://www.jonmsterling.com/01C0/\">distinguished paper award at LICS 2025</a>.</p>\n \n\n \n\n <h2>Thoughts on Apple\u2019s new design language</h2>\n \n <p>As many predicted, Apple unveiled at its Worldwide Developer Conference a new design language for all its platforms centred around a material that they call <em>Liquid Glass</em>. I have some personal reflections about my time as an iOS platform app developer during the iOS 7 transition, and some thoughts about what the new design language may mean for the remaining independent developers whose businesses have not been destroyed by the App Store\u2019s race to the bottom.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmidhyzep3x5zm2cfjnxwvxtbd2x5mw7blzpevgqtzrn334mvxia7fi.png\" width=\"320px\">\n \nA screenshot of the Music app with the new Liquid Glass design. 
Source: Apple.\n <p>(I will not speak much here about the merits (or lack thereof) of the new design language. There is a lot to say and critique there, but there\u2019s also some reason for hope.)</p>\n \n\n \n\n <h3>Flat design was about de-skilling app development</h3>\n \n <p>If you believe that the purpose of a system is what it does, <strong>the purpose of the iOS 7 redesign was to de-skill app development</strong>. Admittedly this sounds like a conspiracy theory that ignores Apple designers\u2019 stated motivations, but my experience is that whenever there is a business case for something, that thing will simply happen and those involved in the transition tend to explain it to themselves in ways that flatter their sensibilities\u2014a macrocosm of the epiphenomenalist hypothesis for the world of business.</p>\n <p>The economic context of the transition, returning to the early 2010s, is that Apple\u2019s native platforms were losing ground to (objectively terrible for users) cross-platform alternatives in large part because of the exorbitantly high cost of designing platform-native apps to the standard set in the visually and materially rich design language of iOS 6 and below. 
Think about that terrible \u201cweb view inside an app\u201d thing that your phone provider makes you use in which scrolling is broken and back-buttons are dangerous to press, and which constantly logs you out in the middle of a task, or stalls on a 10-factor authentication labyrinth, or charges your credit card twice due to a lag in responding to a confirmation button press, and you will know exactly what I mean.</p>\n \n\n \n\n <h4>App development in the iOS 6 era</h4>\n \n <p>I was a native mobile app developer in both eras, and I\u2019ll tell you that a serious iOS 6 app would involve hundreds of designer-hours producing meticulous custom graphics for most controls\u2014designed to be thematically harmonious with the system appearance, but customised to delight and surprise: think wooden navigation bars with inset text that looks like it was carved with a router. After this artwork was produced (naturally at <code>1x</code> and <code>2x</code> resolution, as we were still in the throes of the Retina transition!), the engineers would take ahold of it and begin overriding the <code>-drawRect:</code> methods of many views, which was often non-trivial due to the need to change the behaviour of views managed deep within system classes.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmia3ecx4gmvm6dqjbsczhxk36cjjftakyacugtygjhdzp2cjvnu7ye.png\" width=\"320px\">\n \nA screenshot of <em>Runenstein</em>, a rune catalogue that I designed and built many years ago.\n \n<img src=\"https://www.jonmsterling.com/bafkrmibm6clyddrodrwlhhxltxaqw34rkmkjndk5ndiceyf5cjnq2abc5a.jpg\" width=\"320px\">\n \nA screenshot of <em>Yardsale</em>, the pre-iOS 7 iPhone app that I worked on with Ed McManus, Ryan Mickle, and Michael Sanders in the early 2010s. 
Source: <a href=\"https://www.wired.com/2012/06/yardsale-app/\">Wired</a>\n \n \n\n \n\n <h4>App development post iOS 7</h4>\n \n <p>By way of contrast, designing an app <em>post</em> iOS 7 is considerably less expensive: there are essentially no custom graphics at all, and the only thing the designers are doing is choosing colours and fonts to \u201chighlight the brand\u201d. If there are custom controls, they can be drawn without an expensive designer\u2019s intervention, as in nearly all cases, these are just ugly buttons with slightly non-standard shapes that someone with no skills at all can easily draw in Quartz\u2014or SwiftUI. Certainly there is no engineer sweating over pixels and perfecting the custom animations that support the delightful illusion of material.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmiadomrvb5rcomnozp7wc72gsmsj6mvctuwxpz5gysq2m5lq4w3paa.png\" width=\"320px\">\n \nA screenshot of <em>FOBO</em>, a live auction app that I built together with the Yardsale team during the iOS 7 transition. Suddenly one\u2019s brand could be reduced to a colour. Source: <a href=\"https://laughingsquid.com/fobo-an-app-for-auctioning-used-electronics-in-97-minutes/\">Laughing Squid</a>\n \n \n\n \n\n <h4>What did \u201ccentering the content\u201d achieve?</h4>\n \n <p>I have no doubt that behind Jony Ive\u2019s prattling about \u201ccentering the content\u201d, which Alan Dye has taken to new extremes more recently, was an actual business case that Apple considered to be of existential importance: if the cost of native application development is not lowered dramatically, native application development will (for all intents and purposes) cease. 
It is not lost on me that Apple\u2019s de-skilling strategy had the exact opposite of the likely intended effect: there have never been as many non-native apps on Apple platforms as there are today, and I believe there are two reasons for this.</p>\n <ol><li>With the advent of Apple Silicon, performance is no longer a strong differentiator for native apps. Many Electron apps (including Visual Studio Code) perform <em>better</em> than native alternatives.</li>\n <li>In the era of flat design, in which intricate and materially rich design has been \u201ccancelled\u201d, visual beauty and detail are no longer strong market differentiators for native apps, nor is respect for platform-specific functionality (like proxy icons on the Macintosh!) that is increasingly de-emphasised in Apple\u2019s native toolkits.</li></ol>\n \n \n \n\n \n\n <h3>Liquid Glass is a gift to the indies</h3>\n \n <p>I was listening to <a href=\"https://atp.fm/\">Accidental Tech Podcast</a>\u2019s <a href=\"https://atp.fm/643\">discussion of the new design language</a> and one thing that struck me was Marco Arment\u2019s prescient comment that essentially no corporate apps besides Apple\u2019s will adopt it. There are three reasons for this:</p>\n <ol><li>Large corporations have gotten used to treating Apple\u2019s decade-long <em>lack</em> of design as a blank slate on which to paint their \u201cbrand\u201d. Suppressing the \u201cbrand\u201d to unify with the system appearance is a complete non-starter in the corporate world. 
If you suggest something like that, you will be laughed out of the room.</li>\n\n <li>Most smaller corporate apps were designed and built by consultants rather than in-house, and no small company will be able to justify dropping an additional $200K+ on an app refresh.</li>\n\n <li>Most corporate apps are using some unwieldy cross-platform toolkit like React or Flutter anyway (enough said).</li></ol>\n <p>I think the Liquid Glass design presents an opportunity for independent app developers to differentiate themselves from the competition in ways that have not been possible since before iOS 7. The return of texture and depth and active light and subtle animation means that those who treat app development as <em>craft</em> will be able to create vastly different experiences from those created by consultants or even corporate in-house teams whose business motives do not include platform integration or, indeed, delight. (Not all is rosy: the changes to icon dimensions and composition represent a <em>new</em> de-skilling manoeuvre by Apple\u2014but for users, it is hard to say that this is worse than the present dystopia of soulless glyphs.)</p>\n <p>These prospects for craft seem not to depend on whether the Liquid Glass design is actually good and accessible for users\u2014it is just complex enough that (good or not) it will lead to the kind of differentiation that we had on both iOS and the Mac OS X platforms in the old days\u2014when an app that was either non-native or poorly crafted (usually both) stood out like a sore thumb in ways that regular users could notice just by touching a control and seeing what happens when you move things around, or finding they can select text that should not be selectable, or scrolling to reveal incorrect insets and content boundaries, etc. 
Attempts to replicate the Liquid Glass material using web technologies will likely lead to stuttering scrolling and drained batteries, which (again) regular users will be able to notice.</p>\n <p>So, whilst I\u2019m shaken by the potential for a further degraded user interface on the Macintosh, I\u2019m more optimistic than I thought I would be about the prospects for independent Apple platform application development in the next ten years. I\u2019m also not certain what this means for <a href=\"https://www.github.com/jonsterling/AquaUI\">AquaUI</a>\u2014I need to experiment with Liquid Glass to better understand its strengths and weaknesses before returning to that project.</p>\n \n \n \n\n \n\n <h2>Alternative Foresters: let them bloom!</h2>\n \n <p>I\u2019ve been really excited to see at least two \u201cForester-likes\u201d, i.e. projects aiming to provide alternative implementations of (or takes on) <a href=\"https://www.forester-notes.org/index/\">Forester</a>:</p>\n <ol><li><a href=\"https://github.com/kokic/kodama\">Kodama</a> is a Forester-like created by <a href=\"https://kokic.github.io/kokic/\">Kokic Liu</a> that aims to provide great support for Typst (rather than LaTeX), and Markdown for its source language. Kodama is also written in Rust, and licensed under the GPL. Here\u2019s a <a href=\"https://kokic.github.io/\">demonstration</a>.</li>\n \n <li><a href=\"https://tr-notes.srht.site/\">TR</a> is a Forester-like built on the Racket/Scribble ecosystem by <a href=\"https://www.jonmsterling.com/dannypsnl/\">L\u00eem Ts\u00fa-thu\u00e0n</a>; unlike Forester, TR aims to be <a href=\"https://josem.co/the-beauty-of-finished-software/\">finished software</a>. TR is licensed permissively.</li></ol>\n <p>Both of these projects are interesting in their own right. It\u2019s definitely a good idea to consider alternatives to the LaTeX ecosystem, and building on Typst could pay off; time will tell, which is one reason why I\u2019m glad someone is doing it. 
I\u2019m especially sympathetic to the goals of TR; although Forester will continue to evolve (and perhaps inspire further forks and reimplementations), I think that finished software plays an important and underrated role. So kudos to both <a href=\"https://kokic.github.io/kokic/\">Kokic Liu</a> and <a href=\"https://www.jonmsterling.com/dannypsnl/\">L\u00eem Ts\u00fa-thu\u00e0n</a>!</p>\n <p>On the technical side, I\u2019m skeptical of Markdown. There are strong reasons why I did not adopt it, many of which you can learn by reading between the lines <a href=\"https://www.forester-notes.org/tfmt-0005/\">here</a> and <a href=\"https://www.forester-notes.org/tfmt-000E/\">here</a>. I do believe that there could be a well-behaved <em>sublanguage</em> of Markdown that could be used. In the meanwhile, people with legacy notes in Markdown can either play with <a href=\"https://github.com/kokic/kodama/\">Kodama</a> or they can try out <a href=\"https://www.jonmsterling.com/patrickferris/\">Patrick Ferris</a>\u2019s awesome <a href=\"https://patrick.sirref.org/graft/\">Graft</a> tool, which is a standalone preprocessor to generate Forester source code from both Markdown and BibTeX.</p>\n <p>Going forward, I\u2019d love to see many more projects that either build on Forester or the ideas underlying Forester. Alternative implementations are, in some sense, even better than tools that build directly on Forester because they decrease development inertia for all parties and allow the emerging community to work more explicitly towards <em>interoperability</em> within the Open Web. 
Interop is the only weapon we have against platform and tool hegemony.</p>\n \n \n\n \n\n <h2>A distinguished paper at LICS 2025</h2>\n \n <p>I was really pleased to find out that my paper with <a href=\"https://www.jonmsterling.com/andrewslattery/\">Andrew Slattery</a> on <a href=\"https://www.jonmsterling.com/slattery-sterling-2025/\">Hofmann\u2013Streicher lifting of fibred categories</a> has been selected as a distinguished paper by the LICS 2025 programme committee. Andrew and I will be preparing a longer and more detailed version of this paper for publication in a special issue of <a href=\"https://www.jonmsterling.com/lmcs/\">Logical Methods in Computer Science</a>. Congratulations, Andrew!</p>",
+18
jonsterling/2025-W25_.json
···+"summary": "<p>Somehow the end-of-term is more hectic than I remember it being last year; after I pass through each gauntlet, another one gapes before me. Long story short, I have finished marking all my exam scripts but I do still need to finish marking an MPhil dissertation. I did manage to prepare a first draft of my slides for my <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS 2025</a> <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">presentation</a>; the slides are too long, but I will cut them down in time.</p>\n \n\n \n\n <h2>Gowns in the Guildhall, taking examinations seriously</h2>\n \n <p>As I\u2019ve <a href=\"https://www.jonmsterling.com/01BE/\">mentioned before</a>, this year I\u2019ve served as an IA Examiner. As part of my duties, I showed up at the Guildhall last Friday to examine Paper 3, gown and all. The role of Examiners in a physical examination is to answer questions and issue clarifications concerning examination content as needed. Well, that is the idea at least...</p>\n <p>You see, in subjects like English or Physics, it is a reasonable expectation that any member of the faculty would be knowledgeable about all the topics represented in a given paper (and, in fact, it\u2019s common for each paper to be set by a single person). Computer Science is a little different, in that probably no single member of the faculty could get a passing mark on any one of our papers, much less set the entire paper. I don\u2019t know a thing about machine learning, and likewise our excellent faculty in machine learning would not be able to muddle their way through even the easiest supervision sheet in Semantics. I wish that weren\u2019t the case, for all our sakes, but that is how Computer Science works. 
We are not a single field; we are the refugees of a dozen other fields, thrown together and then bound by a shared ethos whose roots lie deep in the history of the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Laboratory</a>.</p>\n <p>Long story short, there is no hope of any Computer Science Examiner being able to answer almost any question posed by students during the exam\u2014except in the vanishingly remote chance that it happens to be concerning a question they have themselves set. So our policy is instead that Setters shall sit by their phones during the pertinent examination and await a phone call from the Examiner. That would work very well, if Setters would actually wait by their phones rather than setting their phones to send all calls direct to voicemail, taking meetings during the examination period, etc. As a responsible Setter, I actually had my wife call my phone to make sure that it would ring, and then stayed by it for the duration of the Paper 2 examination on which I had set two questions. This was not difficult, but it took some forethought and a sense of responsibility.</p>\n <p>Even more bafflingly, Examiners are not required to sit in the examination room for more than 20 minutes, which means that the invigilators must phone the Examiner when questions arise, creating a long-tailed game of Telephone in which important queries go unanswered for the longest time even in the best case. I decided to buck this bizarrely maladaptive practice, and just stayed throughout the entire exam to ensure that emerging matters could be dealt with swiftly (I actually got some writing done!). Indeed, if I were to take my duties as Examiner remotely seriously, then no time would have been saved by me leaving the examination room anyway (since I must stay by my phone the entire time and certainly could not take meetings or get deep into other work). 
I would like to see other Examiners take up this practice.</p>\n <p>Examinations happen only once per year, and they may seem of little consequence in the scheme of things to someone who sees all aspects of the life of the University as \u201cnecessary evils\u201d ancillary to Research. But the careful conduct of examinations is immensely consequential to students, and we should therefore take our role in the process with commensurate seriousness.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/clarke-1956/\">The City and the Stars</a> and <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-alien-clay/\">Alien Clay</a></h2>\n \n <p>I decided to give <a href=\"https://www.jonmsterling.com/arthurcclarke/\">Arthur C. Clarke</a> <a href=\"https://www.jonmsterling.com/019W/\">another chance</a>, and I\u2019m glad I did. In <a href=\"https://www.jonmsterling.com/clarke-1956/\">The City and the Stars</a>, the reader is introduced to the quiet world of Diaspar, a shining city of leisure on a dried up Earth a thousand million years in the future, in which all matter (both living and otherwise) is stored in Computer memory and materialised at will. There is the Mid-Century preoccupation with the survival of Humanity over the eons (and in what form?), the pseudo-scientific references to \u201cde-evolution\u201d, the <em>deus ex machina</em> explanations of speculative History that straighten out too many mysteries that could not be unraveled by the protagonists alone. But against this backdrop there is the tender care for Character that we find all too rarely in the science fiction of this era. 
From start to finish, <a href=\"https://www.jonmsterling.com/clarke-1956/\">The City and the Stars</a> was a marvel.</p>\n <p>Next, I took a chance on one of <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Tchaikovsky</a>\u2019s recent novels, <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-alien-clay/\">Alien Clay</a>. I knew one minute in that I would read this one ravenously. There is a bitter realness in Tchaikovsky\u2019s portrayal of \u201crevolutionary subcommittees\u201d and their dysfunction that will be familiar to anyone who has found themselves in such circles, the fear of betrayal (by whom, and since when?), and the implosive combination of weakness and overconfidence that shatters nearly every revolutionary conspiracy from the inside. Will it really take something from <em>outside</em> Earth\u2019s biological construct to lock us on to the path that leads away from self-defeat and ruin?</p>",+"content": "<p>Somehow the end-of-term is more hectic than I remember it being last year; after I pass through each gauntlet, another one gapes before me. Long story short, I have finished marking all my exam scripts but I do still need to finish marking an MPhil dissertation. I did manage to prepare a first draft of my slides for my <a href=\"https://www.jonmsterling.com/lics-2025/\">LICS 2025</a> <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">presentation</a>; the slides are too long, but I will cut them down in time.</p>\n \n\n \n\n <h2>Gowns in the Guildhall, taking examinations seriously</h2>\n \n <p>As I\u2019ve <a href=\"https://www.jonmsterling.com/01BE/\">mentioned before</a>, this year I\u2019ve served as an IA Examiner. As part of my duties, I showed up at the Guildhall last Friday to examine Paper 3, gown and all. The role of Examiners in a physical examination is to answer questions and issue clarifications concerning examination content as needed. 
Well, that is the idea at least...</p>\n <p>You see, in subjects like English or Physics, it is a reasonable expectation that any member of the faculty would be knowledgeable about all the topics represented in a given paper (and, in fact, it\u2019s common for each paper to be set by a single person). Computer Science is a little different, in that probably no single member of the faculty could get a passing mark on any one of our papers, much less set the entire paper. I don\u2019t know a thing about machine learning, and likewise our excellent faculty in machine learning would not be able to muddle their way through even the easiest supervision sheet in Semantics. I wish that weren\u2019t the case, for all our sakes, but that is how Computer Science works. We are not a single field; we are the refugees of a dozen other fields, thrown together and then bound by a shared ethos whose roots lie deep in the history of the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Laboratory</a>.</p>\n <p>Long story short, there is no hope of any Computer Science Examiner being able to answer almost any question posed by students during the exam\u2014except in the vanishingly remote chance that it happens to be concerning a question they have themselves set. So our policy is instead that Setters shall sit by their phones during the pertinent examination and await a phone call from the Examiner. That would work very well, if Setters would actually wait by their phones rather than setting their phones to send all calls direct to voicemail, taking meetings during the examination period, etc. As a responsible Setter, I actually had my wife call my phone to make sure that it would ring, and then stayed by it for the duration of the Paper 2 examination on which I had set two questions.
This was not difficult, but it took some forethought and a sense of responsibility.</p>\n <p>Even more bafflingly, Examiners are not required to sit in the examination room for more than 20 minutes, which means that the invigilators must phone the Examiner when questions arise, creating a long-tailed game of Telephone in which important queries go unanswered for the longest time even in the best case. I decided to buck this bizarrely maladaptive practice, and just stayed throughout the entire exam to ensure that emerging matters could be dealt with swiftly (I actually got some writing done!). Indeed, if I were to take my duties as Examiner remotely seriously, then no time would have been saved by me leaving the examination room anyway (since I must stay by my phone the entire time and certainly could not take meetings or get deep into other work). I would like to see other Examiners take up this practice.</p>\n <p>Examinations happen only once per year, and they may seem of little consequence in the scheme of things to someone who sees all aspects of the life of the University as \u201cnecessary evils\u201d ancillary to Research. But the careful conduct of examinations is immensely consequential to students, and we should therefore take our role in the process with commensurate seriousness.</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/clarke-1956/\">The City and the Stars</a> and <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-alien-clay/\">Alien Clay</a></h2>\n \n <p>I decided to give <a href=\"https://www.jonmsterling.com/arthurcclarke/\">Arthur C. Clarke</a> <a href=\"https://www.jonmsterling.com/019W/\">another chance</a>, and I\u2019m glad I did. 
In <a href=\"https://www.jonmsterling.com/clarke-1956/\">The City and the Stars</a>, the reader is introduced to the quiet world of Diaspar, a shining city of leisure on a dried up Earth a thousand million years in the future, in which all matter (both living and otherwise) is stored in Computer memory and materialised at will. There is the Mid-Century preoccupation with the survival of Humanity over the eons (and in what form?), the pseudo-scientific references to \u201cde-evolution\u201d, the <em>deus ex machina</em> explanations of speculative History that straighten out too many mysteries that could not be unraveled by the protagonists alone. But against this backdrop there is the tender care for Character that we find all too rarely in the science fiction of this era. From start to finish, <a href=\"https://www.jonmsterling.com/clarke-1956/\">The City and the Stars</a> was a marvel.</p>\n <p>Next, I took a chance on one of <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Tchaikovsky</a>\u2019s recent novels, <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-alien-clay/\">Alien Clay</a>. I knew one minute in that I would read this one ravenously. There is a bitter realness in Tchaikovsky\u2019s portrayal of \u201crevolutionary subcommittees\u201d and their dysfunction that will be familiar to anyone who has found themselves in such circles, the fear of betrayal (by whom, and since when?), and the implosive combination of weakness and overconfidence that shatters nearly every revolutionary conspiracy from the inside. Will it really take something from <em>outside</em> Earth\u2019s biological construct to lock us on to the path that leads away from self-defeat and ruin?</p>",
+18
jonsterling/2025-W27_.json
···+"summary": "<h2>Summer in Cambridge</h2>\n \n <p>The heat of summer has begun in full force, but this is also one of the most beautiful times for Cambridge. Our Fellows\u2019 Garden is in full bloom, and I have found myself taking many detours through the Garden to commune with the flowers, bees, and dragonflies alike.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmicn5sfxioitfur24525gs7zgnq2y2wb5a4whxjlcnmyjoib2jnwje.jpeg\" width=\"220px\">\n <img src=\"https://www.jonmsterling.com/bafkrmiez74vpcrxgi55ab2wnhqjzoyltk3tg4u4aa7n2pjjmyoch5aaboe.jpeg\" width=\"220px\">\n \nA sampling from the <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a> Fellows\u2019 Garden this year. Left: Opium poppies; right: Kniphofia (commonly, \u201cred hot poker\u201d) flower.\n <p>On a personal note, I purchased a wooden Adirondack chair for my garden at home and it has been life-changing. I always wanted to be able to sit out in my garden in the evenings and watch the birds and other critters. I spend a good bit of time out there every evening, cooling off with a refreshing beverage and becoming a person again.</p>\n \n \n\n \n\n <h2>Moving to Old Court</h2>\n \n <p>Since I became a Fellow of <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>, I have had a lovely little attic office in Memorial Court (across the street from the <a href=\"https://www.jonmsterling.com/00G5/\">University Library</a>). That\u2019s where I spend most of my time aside from meeting with my group, since the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Laboratory</a> is far too noisy to get any work done. The arrangement was, however, always temporary as I was meant to have a room in the Old Court\u2014but the <a href=\"https://stories.clare.cam.ac.uk/transforming-old-court/index.html\">generational restoration project</a> has made many rooms unavailable for a long time.
With the restoration only several months from completion, some rooms (including mine!) have been returned to the College and I was asked to kindly vacate Memorial Court by Friday the 5th so that my old room can be used for a student. This week, I moved all my effects to my new room and it is starting to feel like home.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmicd2y2mmhzeqh56ys7idypbywvubhub6lsel3rm7l7q4ows2kyn6e.jpeg\" width=\"300px\">\n \nA photograph of my new office in <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a> Old Court, which would not be complete without my <a href=\"https://www.jonmsterling.com/01AH/\">beloved 2006 iMac</a>.\n <p>Inspired by <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil</a>\u2019s idyllic room in <a href=\"https://www.jonmsterling.com/00VR/\">Pembroke</a>, I am hoping to get some plants and greenery in here but I have not yet decided specifically what I would like. Ideally something that can survive a couple weeks of neglect once or twice a year. I am also hoping that Estates will allow me to mount an enamel blackboard in here for teaching my undergraduates.</p>\n \n \n\n \n\n <h2>Too cool to exist: an idea bites the dust</h2>\n \n <p>I had a very \u201ccool\u201d idea last term for a version of synthetic domain theory that handles non-determinism with the same grace that ordinary synthetic domain theory handles recursion and continuity. 
The idea was to treat non-determinism as an orthogonality property, so that we would have special types in which you can take the \u201csum\u201d of two elements, and these sums would automatically be preserved by all functions without any need to check anything, in the same way that you can take the limit of a chain in the synthetic way and then these are preserved automatically by every function.</p>\n <p>To be precise, I had hoped to study the types that are orthogonal to the inclusion <code>2\\hookrightarrow T(2)</code> where <code>T</code> is <a href=\"https://www.jonmsterling.com/hyland-1991/\">Hyland</a>\u2019s \u201cco-partial map classifier\u201d. I finally got around to looking into this idea this week.</p>\n <p>Unfortunately, it will never work: in particular, the synthetic Sierpi\u0144ski space <code>\\Sigma </code> will pretty much <em>never</em> satisfy the orthogonality condition that I had in mind. In most cases, the Sierpi\u0144ski space will be orthogonal to the comparison map <code>2^\\top \\to T(2)</code> where <code>2^\\top </code> is the <em>inverted Sierpi\u0144ski cone</em> of the discrete space <code>2</code>; this would follow by dualising the results of <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">my recent LICS paper</a>, which hold so long as <code>\\Sigma </code> is closed under finite disjunctions and satisfies <a href=\"https://www.jonmsterling.com/00AD/\">Phoa\u2019s principle</a>. 
So in that case, we can consider just whether it is possible for <code>\\Sigma </code> to be orthogonal to the canonical closed embedding <code>2\\hookrightarrow 2^\\top </code>, and the answer is \u201cdefinitely not\u201d: because <code>\\Sigma ^{2^\\top }</code> is the space of co-spans in <code>\\Sigma </code> under <a href=\"https://www.jonmsterling.com/00AD/\">Phoa\u2019s principle</a>, this would imply that all upper bounds in <code>\\Sigma </code> are <em>least</em> upper bounds, which is certainly not the case!</p>\n <p>On the bright side, after disillusioning myself of the above, I did have a potentially promising idea for generalising some important notions from <a href=\"https://www.jonmsterling.com/alexsimpson/\">Alex Simpson</a>\u2019s <a href=\"https://www.jonmsterling.com/simpson-2004/\">Computational adequacy for recursive types in models of intuitionistic set theory</a> that might give a clearer picture of the type-level iteration that is used to compute solutions to recursive domain equations in synthetic domain theory. We will see!</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/lecarre-1965/\">The Looking Glass War</a> and <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a></h2>\n \n <p>I took a break from my science fiction bender to read <a href=\"https://www.jonmsterling.com/johnlecarre/\">John le Carr\u00e9</a>\u2019s <a href=\"https://www.jonmsterling.com/lecarre-1965/\">The Looking Glass War</a>; le Carr\u00e9 is a favourite author of mine, though I can\u2019t say that I have enjoyed all his works equally. This one was an enjoyable and quick read, and cost me refreshingly little. 
<em>The Looking Glass War</em> provides a glimpse of postwar British intelligence tomfoolery: the outmoded military intelligence Department decides to run agents for the first time in two decades to follow an obvious wild goose chase in East Germany, using aging talent and obsolete equipment that was <em>most graciously</em> supplied by their more competent rival intelligence agency, MI6 (\u201cthe Circus\u201d). It is a perfect depiction of the essentially <em>unserious</em> nature of those who love the game. Le Carr\u00e9 has a gift of writing characters who are so unlikeable as to make one physically sick. Only Smiley\u2014a light antagonist to the glory-seeking near-retirees of the Department\u2014seems to bear any redeeming quality whatsoever.</p>\n <p>Next I\u2019ve started reading <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a>, which is the first in le Carr\u00e9\u2019s Smiley series; aside from a few moments of clumsiness in the characterisation of Smiley\u2019s relationship with his vapid wife Lady Ann Sercombe, it\u2019s a good read so far. When I was a kid, I had read the \u201cKarla Trilogy\u201d (<em>Tinker Tailor Soldier Spy</em>, <em>The Honourable Schoolboy</em>, and <em>Smiley\u2019s People</em>), and aside from his role in <em>The Spy Who Came In From The Cold</em>, I had not realised until now that Smiley figured in so many works of le Carr\u00e9. After this le Carr\u00e9 binge, I will almost certainly re-watch Alec Guinness\u2019s show-stopping performance in the BBC adaptation of <em>Tinker Tailor</em>.</p>",+"content": "<h2>Summer in Cambridge</h2>\n \n <p>The heat of summer has begun in full force, but this is also one of the most beautiful times for Cambridge. 
Our Fellows\u2019 Garden is in full bloom, and I have found myself taking many detours through the Garden to commune with the flowers, bees, and dragonflies alike.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmicn5sfxioitfur24525gs7zgnq2y2wb5a4whxjlcnmyjoib2jnwje.jpeg\" width=\"220px\">\n <img src=\"https://www.jonmsterling.com/bafkrmiez74vpcrxgi55ab2wnhqjzoyltk3tg4u4aa7n2pjjmyoch5aaboe.jpeg\" width=\"220px\">\n \nA sampling from the <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a> Fellows\u2019 Garden this year. Left: Opium poppies; right: Kniphofia (commonly, \u201cred hot poker\u201d) flower.\n <p>On a personal note, I purchased a wooden Adirondack chair for my garden at home and it has been life-changing. I always wanted to be able to sit out in my garden in the evenings and watch the birds and other critters. I spend a good bit of time out there every evening, cooling off with a refreshing beverage and becoming a person again.</p>\n \n \n\n \n\n <h2>Moving to Old Court</h2>\n \n <p>Since I became a Fellow of <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>, I have had a lovely little attic office in Memorial Court (across the street from the <a href=\"https://www.jonmsterling.com/00G5/\">University Library</a>). That\u2019s where I spend most of my time aside from meeting with my group, since the <a href=\"https://www.jonmsterling.com/camcl/\">Computer Laboratory</a> is far too noisy to get any work done. The arrangement was, however, always temporary as I was meant to have a room in the Old Court\u2014but the <a href=\"https://stories.clare.cam.ac.uk/transforming-old-court/index.html\">generational restoration project</a> has made many rooms unavailable for a long time. With the restoration only several months from completion, some rooms (including mine!) have been returned to the College and I was asked to kindly vacate Memorial Court by Friday the 5th so that my old room can be used for a student.
This week, I moved all my effects to my new room and it is starting to feel like home.</p>\n \n<img src=\"https://www.jonmsterling.com/bafkrmicd2y2mmhzeqh56ys7idypbywvubhub6lsel3rm7l7q4ows2kyn6e.jpeg\" width=\"300px\">\n \nA photograph of my new office in <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a> Old Court, which would not be complete without my <a href=\"https://www.jonmsterling.com/01AH/\">beloved 2006 iMac</a>.\n <p>Inspired by <a href=\"https://www.jonmsterling.com/anilmadhavapeddy/\">Anil</a>\u2019s idyllic room in <a href=\"https://www.jonmsterling.com/00VR/\">Pembroke</a>, I am hoping to get some plants and greenery in here but I have not yet decided specifically what I would like. Ideally something that can survive a couple weeks of neglect once or twice a year. I am also hoping that Estates will allow me to mount an enamel blackboard in here for teaching my undergraduates.</p>\n \n \n\n \n\n <h2>Too cool to exist: an idea bites the dust</h2>\n \n <p>I had a very \u201ccool\u201d idea last term for a version of synthetic domain theory that handles non-determinism with the same grace that ordinary synthetic domain theory handles recursion and continuity. The idea was to treat non-determinism as an orthogonality property, so that we would have special types in which you can take the \u201csum\u201d of two elements, and these sums would automatically be preserved by all functions without any need to check anything, in the same way that you can take the limit of a chain in the synthetic way and then these are preserved automatically by every function.</p>\n <p>To be precise, I had hoped to study the types that are orthogonal to the inclusion <code>2\\hookrightarrow T(2)</code> where <code>T</code> is <a href=\"https://www.jonmsterling.com/hyland-1991/\">Hyland</a>\u2019s \u201cco-partial map classifier\u201d. 
I finally got around to looking into this idea this week.</p>\n <p>Unfortunately, it will never work: in particular, the synthetic Sierpi\u0144ski space <code>\\Sigma </code> will pretty much <em>never</em> satisfy the orthogonality condition that I had in mind. In most cases, the Sierpi\u0144ski space will be orthogonal to the comparison map <code>2^\\top \\to T(2)</code> where <code>2^\\top </code> is the <em>inverted Sierpi\u0144ski cone</em> of the discrete space <code>2</code>; this would follow by dualising the results of <a href=\"https://www.jonmsterling.com/pugh-sterling-2025/\">my recent LICS paper</a>, which hold so long as <code>\\Sigma </code> is closed under finite disjunctions and satisfies <a href=\"https://www.jonmsterling.com/00AD/\">Phoa\u2019s principle</a>. So in that case, we can consider just whether it is possible for <code>\\Sigma </code> to be orthogonal to the canonical closed embedding <code>2\\hookrightarrow 2^\\top </code>, and the answer is \u201cdefinitely not\u201d: because <code>\\Sigma ^{2^\\top }</code> is the space of co-spans in <code>\\Sigma </code> under <a href=\"https://www.jonmsterling.com/00AD/\">Phoa\u2019s principle</a>, this would imply that all upper bounds in <code>\\Sigma </code> are <em>least</em> upper bounds, which is certainly not the case!</p>\n <p>On the bright side, after disillusioning myself of the above, I did have a potentially promising idea for generalising some important notions from <a href=\"https://www.jonmsterling.com/alexsimpson/\">Alex Simpson</a>\u2019s <a href=\"https://www.jonmsterling.com/simpson-2004/\">Computational adequacy for recursive types in models of intuitionistic set theory</a> that might give a clearer picture of the type-level iteration that is used to compute solutions to recursive domain equations in synthetic domain theory. 
We will see!</p>\n \n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/lecarre-1965/\">The Looking Glass War</a> and <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a></h2>\n \n <p>I took a break from my science fiction bender to read <a href=\"https://www.jonmsterling.com/johnlecarre/\">John le Carr\u00e9</a>\u2019s <a href=\"https://www.jonmsterling.com/lecarre-1965/\">The Looking Glass War</a>; le Carr\u00e9 is a favourite author of mine, though I can\u2019t say that I have enjoyed all his works equally. This one was an enjoyable and quick read, and cost me refreshingly little. <em>The Looking Glass War</em> provides a glimpse of postwar British intelligence tomfoolery: the outmoded military intelligence Department decides to run agents for the first time in two decades to follow an obvious wild goose chase in East Germany, using aging talent and obsolete equipment that was <em>most graciously</em> supplied by their more competent rival intelligence agency, MI6 (\u201cthe Circus\u201d). It is a perfect depiction of the essentially <em>unserious</em> nature of those who love the game. Le Carr\u00e9 has a gift of writing characters who are so unlikeable as to make one physically sick. Only Smiley\u2014a light antagonist to the glory-seeking near-retirees of the Department\u2014seems to bear any redeeming quality whatsoever.</p>\n <p>Next I\u2019ve started reading <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a>, which is the first in le Carr\u00e9\u2019s Smiley series; aside from a few moments of clumsiness in the characterisation of Smiley\u2019s relationship with his vapid wife Lady Ann Sercombe, it\u2019s a good read so far. 
When I was a kid, I had read the \u201cKarla Trilogy\u201d (<em>Tinker Tailor Soldier Spy</em>, <em>The Honourable Schoolboy</em>, and <em>Smiley\u2019s People</em>), and aside from his role in <em>The Spy Who Came In From The Cold</em>, I had not realised until now that Smiley figured in so many works of le Carr\u00e9. After this le Carr\u00e9 binge, I will almost certainly re-watch Alec Guinness\u2019s show-stopping performance in the BBC adaptation of <em>Tinker Tailor</em>.</p>",
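For readers less familiar with the machinery in the note above, the orthogonality condition it relies on has a standard formulation; the following gloss is mine, in the usual notation, and is not a quotation from the post:

```latex
% A type X is orthogonal to a map i : A -> B when restriction along i
% is an equivalence, i.e. every map f : A -> X extends uniquely along i:
X \perp i
  \quad\Longleftrightarrow\quad
  (-\circ i) \colon X^{B} \xrightarrow{\;\sim\;} X^{A}.
```

In these terms, the negative result above is that the Sierpiński space <code>\Sigma </code> fails to be orthogonal to the closed embedding <code>2\hookrightarrow 2^\top </code>, since orthogonality would force every upper bound in <code>\Sigma </code> to be a least upper bound.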
+18
jonsterling/2025-W28_.json
···+"summary": "<p>This week has been spent wrapping up my duties as IA Examiner and our subject fair at <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>. I did also get a chance to do a bit of mathematics, as well as making arrangements for next year\u2019s module on Homotopy Type Theory and Univalent Foundations (which I am very excited to offer!).</p>\n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a>, <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-saturation-point/\">Saturation Point</a>, and <a href=\"https://www.jonmsterling.com/leguin-1974/\">The Dispossessed</a></h2>\n \n <p>I finished up <a href=\"https://www.jonmsterling.com/johnlecarre/\">le Carr\u00e9</a>\u2019s <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a> from <a href=\"https://www.jonmsterling.com/01C6/\">last week</a>, which quite satisfactorily tied up all loose ends. One thing that shouted to me is the way that, whatever their \u201cactual\u201d views, neither <a href=\"https://www.jonmsterling.com/johnlecarre/\">le Carr\u00e9</a> nor his avatar George Smiley could get away from the following fact: putting aside the serious flaws of the Soviet Union and its historic assay toward Socialism, those on its side fought for <em>civilisation</em> and a futuristic vision for Humanity whereas the Western cold warriors fought for pride and the preservation of barbarism:</p>\n <blockquote>Dieter had remembered and Smiley had not. They had come from different hemispheres of the night, from different worlds of thought and conduct. Dieter, mercurial, absolute, had fought to build a civilization. Smiley, rationalistic, protective, had fought to prevent him. 
\u2018Oh God,\u2019 said Smiley aloud, \u2018who was then the gentleman\u00a0\u2026?\u2019</blockquote>\n <p>Following <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a>, I did make a quickly aborted attempt to read the sequel, <em>A Murder of Quality</em>, but I decided to hold off on that until I am more in the right mindset for it. I found I was more in the mood for an easy science fiction read, so I picked up <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Adrian Tchaikovsky</a>\u2019s <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-saturation-point/\">Saturation Point</a>. It was indeed an easy read, but I can\u2019t say I enjoyed it very much. It seemed to lack the care and delicacy of Tchaikovsky\u2019s other works that I have read, and even compared to <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-alien-clay/\">Alien Clay</a> (a similar but better page-turner), I found <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-saturation-point/\">Saturation Point</a> a bit clumsy. Although written in 2024, it reads like a pandemic-era fever dream that hit all the right notes to resonate with the most extremely unhinged element\u2014the person we all know who talked incessantly about being \u201cin quarantine\u201d because they were in their house at liberty eating Deliveroo every day, and for whom five years later there is still no conversation that does not eventually return to their doubtful \u201cextremely long covid\u201d symptoms. All this I would give license for (it\u2019s fiction, after all!), but the plot was full of holes and elements far more implausible than the \u201cHygrometric Dehabitation Region\u201d (which is, on the face of it, not so hard to imagine given our disastrous environmental trajectory).</p>\n <p>After this, I realised I wanted to read something with a bit more literary substance, so I started <a href=\"https://www.jonmsterling.com/ursulakleguin/\">Ursula K.
Le Guin</a>\u2019s <a href=\"https://www.jonmsterling.com/leguin-1974/\">The Dispossessed</a> which has long been on my list. Like everything of Le Guin that I have read, this one is beautifully written and full of grace and depth and creativity. I have always wanted to write fiction, and if I could be like any author, it would be Le Guin. I will reserve any further comment on the actual content of the book until after I have finished it.</p>",+"content": "<p>This week has been spent wrapping up my duties as IA Examiner and our subject fair at <a href=\"https://www.jonmsterling.com/00GP/\">Clare College</a>. I did also get a chance to do a bit of mathematics, as well as making arrangements for next year\u2019s module on Homotopy Type Theory and Univalent Foundations (which I am very excited to offer!).</p>\n \n\n \n\n <h2><a href=\"https://www.jonmsterling.com/01AY/\">Reading corner</a>: <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a>, <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-saturation-point/\">Saturation Point</a>, and <a href=\"https://www.jonmsterling.com/leguin-1974/\">The Dispossessed</a></h2>\n \n <p>I finished up <a href=\"https://www.jonmsterling.com/johnlecarre/\">le Carr\u00e9</a>\u2019s <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a> from <a href=\"https://www.jonmsterling.com/01C6/\">last week</a>, which quite satisfactorily tied up all loose ends. 
One thing that shouted to me is the way that, whatever their \u201cactual\u201d views, neither <a href=\"https://www.jonmsterling.com/johnlecarre/\">le Carr\u00e9</a> nor his avatar George Smiley could get away from the following fact: putting aside the serious flaws of the Soviet Union and its historic assay toward Socialism, those on its side fought for <em>civilisation</em> and a futuristic vision for Humanity whereas the Western cold warriors fought for pride and the preservation of barbarism:</p>\n <blockquote>Dieter had remembered and Smiley had not. They had come from different hemispheres of the night, from different worlds of thought and conduct. Dieter, mercurial, absolute, had fought to build a civilization. Smiley, rationalistic, protective, had fought to prevent him. \u2018Oh God,\u2019 said Smiley aloud, \u2018who was then the gentleman\u00a0\u2026?\u2019</blockquote>\n <p>Following <a href=\"https://www.jonmsterling.com/lecarre-1961/\">Call for the Dead</a>, I did make a quickly aborted attempt to read the sequel, <em>A Murder of Quality</em>, but I decided to hold off on that until I am more in the right mindset for it. I found I was more in the mood for an easy science fiction read, so I picked up <a href=\"https://www.jonmsterling.com/adriantchaikovsky/\">Adrian Tchaikovsky</a>\u2019s <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-saturation-point/\">Saturation Point</a>. It was indeed an easy read, but I can\u2019t say I enjoyed it very much. It seemed to lack the care and delicacy of Tchaikovsky\u2019s other works that I have read, and even compared to <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-alien-clay/\">Alien Clay</a> (a similar but better page-turner), I found <a href=\"https://www.jonmsterling.com/tchaikovsky-2024-saturation-point/\">Saturation Point</a> a bit clumsy. 
Although written in 2024, it reads like a pandemic-era fever dream that hit all the right notes to resonate with the most extremely unhinged element\u2014the person we all know who talked incessantly about being \u201cin quarantine\u201d because they were in their house at liberty eating Deliveroo every day, and for whom five years later there is still no conversation that does not eventually return to their doubtful \u201cextremely long covid\u201d symptoms. All this I would give license for (it\u2019s fiction, after all!), but the plot was full of holes and elements far more implausible than the \u201cHygrometric Dehabitation Region\u201d (which is, on the face of it, not so hard to imagine given our disastrous environmental trajectory).</p>\n <p>After this, I realised I wanted to read something with a bit more literary substance, so I started <a href=\"https://www.jonmsterling.com/ursulakleguin/\">Ursula K. Le Guin</a>\u2019s <a href=\"https://www.jonmsterling.com/leguin-1974/\">The Dispossessed</a> which has long been on my list. Like everything of Le Guin that I have read, this one is beautifully written and full of grace and depth and creativity. I have always wanted to write fiction, and if I could be like any author, it would be Le Guin. I will reserve any further comment on the actual content of the book until after I have finished it.</p>",
+2
-2
jonsterling/metadata.json
+18
martinkl/2020_11_18_distributed-systems-and-elliptic-curves.html.json
···+"summary": "I have just published new educational materials that might be of interest to computing people: a new 8-lecture course on distributed systems, and a tutorial on elliptic curve cryptography. Distributed Systems Since last year I have been delivering an 8-lecture undergraduate course on distributed systems at the University of Cambridge....",+"content": "<p>I have just published new educational materials that might be of interest to computing people:\na new 8-lecture course on distributed systems, and a tutorial on elliptic curve cryptography.</p>\n\n<h2>Distributed Systems</h2>\n\n<p>Since last year I have been delivering an 8-lecture undergraduate course on distributed systems at the University of Cambridge.\nThe first time I delivered it, I inherited the slides and exercises from the people who lectured it in previous years (Richard Mortier, Anil Madhavapeddy, Robert Watson, Jean Bacon, and Steven Hand), and I just used those materials with minor modifications.\nIt was a good course, but it was getting quite dated (e.g. 
lots of material on <a href=\"https://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture\">CORBA</a>, which is now of mostly historical interest).</p>\n\n<p>Therefore, this year I decided to do a thorough refresh of the course content, and wrote a brand new set of slides and lecture notes.\nAlso, due to the pandemic we are not having any in-person lectures, so I recorded videos for all of the lectures.\nI decided to make all of this available publicly under a <a href=\"https://creativecommons.org/licenses/by-sa/4.0/\">creative commons CC BY-SA license</a>, which means that you\u2019re welcome to use it freely (including incorporating it into your own work), provided that you give credit to me, and that you share your derived work under the same license.</p>\n\n<p>The result is here:</p>\n\n<ul>\n <li><a href=\"https://www.cl.cam.ac.uk/teaching/2122/ConcDisSys/dist-sys-notes.pdf\">Lecture notes (PDF)</a> (including exercises)</li>\n <li>Slides: <a href=\"https://www.cl.cam.ac.uk/teaching/2122/ConcDisSys/dist-sys-slides.pdf\">slideshow</a> and <a href=\"https://www.cl.cam.ac.uk/teaching/2122/ConcDisSys/dist-sys-handout.pdf\">printable</a> (PDF)</li>\n <li><a href=\"https://www.youtube.com/playlist?list=PLeKd45zvjcDFUEv_ohr_HdUFe97RItdiB\">Lecture videos (YouTube)</a></li>\n <li><a href=\"https://www.cl.cam.ac.uk/teaching/2122/ConcDisSys/\">Course web page</a></li>\n <li>Solution notes for the exercises are available on demand (<a href=\"/contact.html\">email me</a> and convince me that you\u2019re not a student trying to cheat).\nCambridge supervisors can <a href=\"https://www.cl.cam.ac.uk/teaching/2122/ConcDisSys/supervisors/dist-sys-solutions.pdf\">download the solution notes directly</a> (Raven login required).</li>\n</ul>\n\n<p>The course is primarily designed for Cambridge undergraduate students, and it includes some cross-references to other courses.\nMany other courses also make their notes or slides publicly available, so you can still look them 
up if you\u2019re not at Cambridge by going to the <a href=\"https://www.cl.cam.ac.uk/teaching/2122/part1b.html\">course web pages</a>.\n(Many lecturers restrict their video recordings to Cambridge users only, so those might not be publicly available.)</p>\n\n<p>The distributed systems course comprises about 7 hours of video and 87 pages of lecture notes.\nIt covers the following topics:</p>\n\n<ol>\n <li>Introduction: distributed systems, computer networks, and RPC</li>\n <li>System models: network faults, crash and Byzantine faults, synchrony assumptions</li>\n <li>Physical clocks, clock synchronisation, and causality</li>\n <li>Logical time, broadcast protocols (reliable, FIFO, causal, total order)</li>\n <li>Replication, quorum protocols, state machine replication</li>\n <li>Consensus, details on the Raft consensus algorithm</li>\n <li>Replica consistency, two-phase commit, linearizability, eventual consistency</li>\n <li>Case studies: collaboration software, Google\u2019s Spanner</li>\n</ol>\n\n<p>The main focus of this course is on understanding the algorithms and the principles that allow us to build robust and reliable distributed systems.\nIt uses examples of practical systems as motivation, and the videos include a few live demos of real distributed systems in action.\nThe aim is to convey the fundamentals without being excessively theoretical; there are a few mathematical proofs in the exercises, but most of the discussion is informal and example-based.</p>\n\n<p>This course is aimed at second-year undergraduates.\nOur students at this level have reasonable fluency with mathematical notation, and some background in programming languages and operating systems, so that\u2019s what this course assumes.</p>\n\n<h2>Elliptic Curve Cryptography</h2>\n\n<p>Another document I\u2019m releasing today is called\n<a href=\"https://martin.kleppmann.com/papers/curve25519.pdf\">Implementing Curve25519/X25519: A Tutorial on Elliptic Curve 
Cryptography</a>.\nThere\u2019s no video for this one, just a 30-page PDF.</p>\n\n<p>Many textbooks cover the concepts behind Elliptic Curve Cryptography (ECC), but few explain how to go from the equations to a working, fast, and secure implementation.\nOn the other hand, while the code of many cryptographic libraries is available as open source, it can be <a href=\"https://github.com/jedisct1/libsodium/blob/master/src/libsodium/crypto_scalarmult/curve25519/ref10/x25519_ref10.c#L91-L132\">rather opaque to the untrained eye</a>, and it is rarely accompanied by detailed documentation explaining how the code came about and why it is correct.</p>\n\n<p>This tutorial bridges the gap between the mathematics and implementation of elliptic curve cryptography.\nIt is written for readers who are new to cryptography, and it assumes no more mathematical background than most undergraduate computer science courses.\nStarting from first principles, this document shows how to derive every line of code in an implementation of the <a href=\"https://tools.ietf.org/html/rfc7748\">X25519</a> Diffie-Hellman key agreement scheme, based on the <a href=\"https://ianix.com/pub/curve25519-deployment.html\">widely-used Curve25519 elliptic curve</a>.\nThe implementation is based on Dan Bernstein et al.\u2019s <a href=\"https://tweetnacl.cr.yp.to/\">TweetNaCl</a>.\nIt is fast and secure; in particular, it uses constant-time algorithms to prevent side-channel attacks.</p>\n\n<p>I wrote this because I wanted to learn how real implementations of ECC work, but I couldn\u2019t find good resources that explained it, so I wrote the document as I figured it out step-by-step from a number of sources (and by doing a lot of the calculations myself).\nI hope others will also find it useful.</p>",
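The X25519 scheme that the tutorial derives can be sketched compactly in Python. The following is an illustrative implementation of the RFC 7748 Montgomery ladder using Python's big integers; unlike TweetNaCl it is *not* constant-time, so it is a learning aid only, never production code.

```python
# Illustrative X25519 scalar multiplication (Montgomery ladder, RFC 7748).
# Unlike TweetNaCl, this sketch is NOT constant-time and must not be used
# where side-channel attacks matter.
P = 2**255 - 19   # the field prime underlying Curve25519
A24 = 121665      # the curve constant (486662 - 2) / 4

def x25519(scalar: bytes, u_point: bytes) -> bytes:
    # "Clamp" the 32-byte scalar as specified in RFC 7748.
    kb = bytearray(scalar)
    kb[0] &= 248
    kb[31] &= 127
    kb[31] |= 64
    k = int.from_bytes(kb, "little")
    # Decode the u-coordinate, masking the top bit.
    x1 = int.from_bytes(u_point, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in range(254, -1, -1):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:  # real code uses a constant-time conditional swap here
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        # One ladder step (the formulas from RFC 7748, all mod P).
        a, b = (x2 + z2) % P, (x2 - z2) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    # Divide x2 by z2 via Fermat's little theorem: z2^(P-2) = z2^-1 (mod P).
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")
```

The tutorial derives each of these steps (scalar clamping, the ladder formulas, and the final inversion via Fermat's little theorem) from first principles, and then shows how to make them fast and constant-time.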
+18
martinkl/2020_12_02_bloom-filter-hash-graph-sync.html.json
···+"summary": "This blog post uses MathJax to render mathematics. You need JavaScript enabled for MathJax to work. In some recent research, Heidi and I needed to solve the following problem. Say you want to sync a hash graph, such as a Git repository, between two nodes. In Git, each commit is...",+"content": "<p><em>This blog post uses <a href=\"https://www.mathjax.org/\">MathJax</a> to render mathematics. You need JavaScript enabled for MathJax to work.</em></p>\n\n<p>In some recent research, <a href=\"http://heidihoward.co.uk/\">Heidi</a> and I needed to solve the following problem.\nSay you want to sync a hash graph, such as a Git repository, between two nodes.\nIn Git, each commit is identified by its hash, and a commit may include the hashes of predecessor commits (a commit may include more than one hash if it\u2019s a merge commit).\nWe want to figure out the minimal set of commits that the two nodes need to send to each other in order to make their graphs the same.</p>\n\n<p>You might wonder: isn\u2019t this a solved problem?\nGit has to do this every time you do <code>git pull</code> or <code>git push</code>!\nYou\u2019re right, and some cases are easy, but other cases are a bit trickier.\nWhat\u2019s more, the algorithm used by Git is not particularly well-documented, and in any case we think that we can do better.</p>\n\n<p>For example, say we have two nodes, and each has one of the following two hash graphs (circles are commits, arrows indicate one commit referencing the hash of another).\nThe blue part (commit A and those to the left of it) is shared between the two graphs, while the dark grey and light grey parts exist in only one of the two graphs.</p>\n\n<p><a href=\"/2020/12/hash-dag.png\"><img alt=\"Illustration of two hash graphs\" height=\"258\" src=\"/2020/12/hash-dag.png\" width=\"550\"></a></p>\n\n<p>We want to reconcile the two nodes\u2019 states so that one node sends all of the dark-grey-coloured commits, the other sends all of the 
light-grey-coloured commits, and both end up with the following graph:</p>\n\n<p><a href=\"/2020/12/hash-dag2.png\"><img alt=\"Hash graph after reconciliation\" height=\"143\" src=\"/2020/12/hash-dag2.png\" width=\"550\"></a></p>\n\n<p>How do we efficiently figure out which commits the two nodes need to send to each other?</p>\n\n<h2>Traversing the graph</h2>\n\n<p>First, some terminology.\nLet\u2019s say commit A is a <em>predecessor</em> of commit B if B references the hash of A, or if there is some chain of hash references from B leading to A.\nIf A is a predecessor of B, then B is a <em>successor</em> of A.\nFinally, define the <em>heads</em> of the graph to be those commits that have no successors.\nIn the example above, the heads are B, C, and D.\n(This is slightly different from how Git defines <code>HEAD</code>.)</p>\n\n<p>The reconciliation algorithm is easy if it\u2019s a \u201cfast-forward\u201d situation: that is, if one node\u2019s heads are commits that the other node already has.\nIn that case, one node sends the other the hashes of its heads, and the other node replies with all commits that are successors of the first node\u2019s heads.\nHowever, the situation is trickier in the example above, where one node\u2019s heads B and C are unknown to the other node, and likewise head D is unknown to the first node.</p>\n\n<p>In order to reconcile the two graphs, we want to figure out which commits are the latest common predecessors of both graphs\u2019 heads (also known as <em>common ancestors</em>, marked A in the example), and then the nodes can send each other all commits that are successors of the common predecessors.</p>\n\n<p>As a first attempt, we can try this: the two nodes send each other their heads; if those contain any unknown predecessor hashes, they request those, and repeat until all hashes resolve to known commits.\nThus, the nodes gradually work their way from the heads towards the common predecessors.\nThis works, but it is slow if your 
graph contains long chains of commits, since the number of round trips required equals the length of the longest path from a head to a common predecessor.</p>\n\n<p>The \u201csmart\u201d transfer protocol used by Git essentially <a href=\"https://www.git-scm.com/docs/http-protocol\">works like this</a>, except that it sends 32 hashes at a time in order to reduce the number of round trips.\nWhy 32? Who knows.\nIt\u2019s a trade-off: send more hashes to reduce the number of round trips, but each request/response is bigger.\nPresumably they decided that 32 was a reasonable compromise between latency and bandwidth.</p>\n\n<p>Recent versions of Git also support an experimental <a href=\"https://github.com/git/git/commit/42cc7485a2ec49ecc440c921d2eb0cae4da80549\">\u201cskipping\u201d algorithm</a>, which can be enabled using the <a href=\"https://git-scm.com/docs/git-config#Documentation/git-config.txt-fetchnegotiationAlgorithm\"><code>fetch.negotiationAlgorithm</code> config option</a>.\nRather than moving forward by a fixed number of predecessors in each round trip, this algorithm allows some commits to be skipped, so that it reaches the common predecessors faster.\nThe skip size grows similarly to the Fibonacci sequence (i.e. 
exponentially) with each round trip.\nThis reduces the number of round trips to \\(O(\\log n)\\), but you can end up overshooting the common predecessors, and thus the protocol may end up unnecessarily transmitting commits that the other node already has.</p>\n\n<h2>Bloom filters to the rescue</h2>\n\n<p>In our new paper draft, which we are <a href=\"https://arxiv.org/abs/2012.00472\">making available on arXiv today</a>, Heidi and I propose a different algorithm for performing this kind of reconciliation.\nIt is quite simple if you know how <a href=\"https://en.wikipedia.org/wiki/Bloom_filter\">Bloom filters</a> work.</p>\n\n<p>In addition to sending the hashes of their heads, each node constructs a Bloom filter containing the hashes of the commits that it knows about.\nIn our prototype, we allocate 10 bits (1.25 bytes) per commit.\nThis number can be adjusted, but note that it is a lot more compact than sending the full 20-byte (for SHA-1, used by Git) or 32-byte (for SHA-256, which is more secure) hash for each commit.\nMoreover, we keep track of the heads from the last time we reconciled our state with a particular node, and then the Bloom filter only needs to include commits that were added since the last reconciliation.</p>\n\n<p>When a node receives such a Bloom filter, it checks its own commit hashes to see whether they appear in the filter.\nAny commit whose hash does not appear in the Bloom filter, and its successors, can immediately be sent to the other node, since we can be sure that the other node does not know about those commits.\nFor any commit whose hash does appear in the Bloom filter, it is likely that the other node knows about that commit, but due to false positives it is possible that the other node actually does not know about those commits.</p>\n\n<p>After receiving all the commits that did not appear in the Bloom filter, we check whether we know all of their predecessor hashes.\nIf any are missing, we request them in a separate round trip 
using the same graph traversal algorithm as before.\nDue to the way the false positive probabilities work, the probability of requiring n round trips decreases exponentially as n grows.\nFor example, you might have a 1% chance of requiring two round trips, a 0.01% chance of requiring three round trips, a 0.0001% chance of requiring four round trips, and so on.\nAlmost all reconciliations complete in one round trip.</p>\n\n<p>Unlike the skipping algorithm used by Git, our algorithm never unnecessarily sends any commits that the other side already has, and the Bloom filters are very compact, even for large commit histories.</p>\n\n<h2>Practical relevance</h2>\n\n<p>In the paper we also prove that this algorithm allows nodes to sync their state even in the presence of arbitrarily many malicious nodes, making it immune to <a href=\"https://en.wikipedia.org/wiki/Sybil_attack\">Sybil attacks</a>.\nWe then go on to prove a theorem that shows which types of applications can and cannot be implemented in this Sybil-immune way, without requiring any Sybil countermeasures such as <a href=\"https://en.wikipedia.org/wiki/Proof_of_work\">proof-of-work</a> or the centralised control of <a href=\"https://arxiv.org/pdf/1711.03936.pdf\">permissioned blockchains</a>.</p>\n\n<p>All of this is directly relevant for <a href=\"https://www.inkandswitch.com/local-first.html\">local-first</a> peer-to-peer applications in which apps running on different devices need to sync up their state without necessarily trusting each other or relying on any trusted servers.\nI assume it\u2019s also relevant for <a href=\"https://www.swirlds.com/downloads/SWIRLDS-TR-2016-01.pdf\">blockchains that use hash graphs</a>, but I don\u2019t know much about them.\nSo, syncing a Git commit history is just one of many possible use cases \u2013 I just used it because most developers will be at least roughly familiar with it!</p>\n\n<p>The details of the algorithm and the theorems are in the <a 
href=\"https://arxiv.org/abs/2012.00472\">paper</a>, so I won\u2019t repeat them here.\nInstead, I will briefly mention a few interesting things that didn\u2019t make it into the paper.</p>\n\n<h2>Why Bloom filters?</h2>\n\n<p>One thing you might be wondering: rather than creating a Bloom filter with 10 bits per commit, can we not just truncate the commit hashes to 10 bits and send those instead?\nThat would use the same amount of network bandwidth, and intuitively it may seem like it should be equivalent.</p>\n\n<p>However, that is not the case: Bloom filters perform vastly better than truncated hashes.\nI will use a small amount of probability theory to explain why.</p>\n\n<p>Say we have a hash graph containing \\(n\\) distinct items, and we want to use \\(b\\) bits per item (so the total size of the data structure is \\(m=bn\\) bits).\nIf we are using truncated hashes, there are \\(2^b\\) possible values for each \\(b\\)-bit hash.\nThus, given two independently chosen, uniformly distributed hashes, the probability that they are the same is \\(2^{-b}\\).</p>\n\n<p>If we have \\(n\\) uniformly distributed hashes, the probability that they are all different from a given \\(b\\)-bit hash is \\((1-2^{-b})^n\\).\nThe false positive probability is therefore the probability that a given \\(b\\)-bit hash equals one or more of the \\(n\\) hashes:</p>\n\n<p>\\[ P(\\text{false positive in truncated hashes}) = 1 - (1 - 2^{-b})^n \\]</p>\n\n<p>On the other hand, with a Bloom filter, we start out with all \\(m\\) bits set to zero, and then for each item, we set \\(k\\) bits to one.\nAfter one uniformly distributed bit-setting operation, the probability that a given bit is zero is \\(1 - 1/m\\).\nThus, after \\(kn\\) bit-setting operations, the probability that a given bit is still zero is \\((1 - 1/m)^{kn}\\).</p>\n\n<p>A Bloom filter has a false positive when we check \\(k\\) bits for some item and they are all one, even though that item was not in the set.\nThe probability 
of this happening is</p>\n\n<p>\\[ P(\\text{false positive in Bloom filter}) = (1 - (1 - 1/m)^{kn})^k \\]</p>\n\n<p>It\u2019s not obvious from those expressions which of the two is better, so I plotted the false positive probabilities of truncated hashes and Bloom filters for varying numbers of items \\(n\\), and with parameters \\(b=10\\), \\(k=7\\), \\(m=bn\\):</p>\n\n<p><a href=\"/2020/12/false-pos.png\"><img alt=\"Plot of false positive probability for truncated hashes and Bloom filters\" height=\"200\" src=\"/2020/12/false-pos.png\" width=\"550\"></a></p>\n\n<p>For a Bloom filter, as long as we grow the size of the filter proportionally to the number of items (here we have 10 bits per item), the false positive probability remains pretty much constant at about 0.8%.\nBut truncated hashes of the same size behave much worse, and with more than about 1,000 items the false positive probability exceeds 50%.</p>\n\n<p>The reason for this: with 10-bit truncated hashes there are only 1,024 possible hash values, and if we have 1,000 different items, then most of those 1,024 possible values are already taken.\nWith truncated hashes, if we wanted to keep the false positive probability constant, we would have to use more bits per item as the number of items grows, so the total size of the data structure would grow faster than linearly in the number of items.</p>\n\n<p>Viewing it like this, it is quite remarkable that Bloom filters work as well as they do, using only a constant number of bits per item!</p>\n\n<h2>Further details</h2>\n\n<p>The Bloom filter false positive formula given above is the one that is commonly quoted, but it\u2019s actually not quite correct.\nTo be precise, it is a <a href=\"https://www.sciencedirect.com/science/article/abs/pii/S0020019008001579\">lower bound</a> on the exact false positive probability (<a href=\"https://git.gnunet.org/bibliography.git/plain/docs/FalsepositiverateBloomFilter2008Bose.pdf\">open access paper</a>).</p>\n\n<p>Out of 
curiosity I wrote a <a href=\"https://gist.github.com/ept/83b91aa07e2495c86ddd8c364a8cfbc7\">little Python script</a> that calculates the false positive probability for truncated hashes, Bloom filters using the approximate formula, and Bloom filters using the exact formula.\nFortunately, for the parameter values we are interested in, the difference between approximate and exact probability is very small.\nThe <a href=\"https://gist.github.com/ept/83b91aa07e2495c86ddd8c364a8cfbc7\">gist</a> also contains a <a href=\"http://www.gnuplot.info/\">Gnuplot</a> script to produce the graph above.</p>\n\n<p><a href=\"https://twitter.com/pvh\">Peter</a> suggested that a <a href=\"https://en.wikipedia.org/wiki/Cuckoo_filter\">Cuckoo filter</a> may perform even better than a Bloom filter, but we haven\u2019t looked into that yet.\nTo be honest, the Bloom filter approach already works so well, and it\u2019s so simple, that I\u2019m not sure the added complexity of a more sophisticated data structure would really be worth it.</p>\n\n<p>That\u2019s all for today.\nOur paper is at <a href=\"https://arxiv.org/abs/2012.00472\">arxiv.org/abs/2012.00472</a>.\nHope you found this interesting, and please let us know if you end up using the algorithm!</p>",
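For readers who want to play with the numbers, the comparison in this post can be reproduced in a few lines of Python. This is a sketch in the spirit of the linked gist (not the gist itself), using the approximate Bloom filter formula quoted above:

```python
# Compare false positive probabilities: truncated b-bit hashes vs. a Bloom
# filter of the same total size (m = b*n bits, k hash functions), using the
# approximate formulas from the post.

def p_truncated(n: int, b: int) -> float:
    # P = 1 - (1 - 2^-b)^n
    return 1 - (1 - 2.0 ** -b) ** n

def p_bloom(n: int, b: int, k: int = 7) -> float:
    # P = (1 - (1 - 1/m)^(k*n))^k  with  m = b*n
    m = b * n
    return (1 - (1 - 1 / m) ** (k * n)) ** k

# With b = 10 bits per item, the Bloom filter's false positive probability
# stays roughly constant as n grows, while truncated hashes degrade badly
# (exceeding 50% beyond about 1,000 items, as in the plot above).
for n in (10, 100, 1000, 10000):
    print(f"n={n:>6}: truncated={p_truncated(n, 10):.4f}  bloom={p_bloom(n, 10):.4f}")
```

Note that this uses the commonly quoted approximate Bloom filter formula, which (as discussed below) is a lower bound on the exact probability; for these parameter values the difference is negligible.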
+18
martinkl/2021_01_13_decentralised-content-moderation.html.json
···+"summary": "Who is doing interesting work on decentralised content moderation? With Donald Trump suspended from Twitter and Facebook, and Parler kicked off AWS, there is renewed discussion about what sort of speech is acceptable online, and how it should be enforced. Let me say up front that I believe that these...",+"content": "<p><strong>Who is doing interesting work on decentralised content moderation?</strong></p>\n\n<p>With Donald Trump suspended from Twitter and Facebook, and\n<a href=\"https://en.wikipedia.org/wiki/Parler\">Parler</a> kicked off AWS, there is renewed discussion about\nwhat sort of speech is acceptable online, and how it should be enforced. Let me say up front that\nI believe that these bans were justified. However, they do raise questions that need to be\ndiscussed, especially within the technology community.</p>\n\n<p>As many have already pointed out, Twitter, Facebook and Amazon are corporations that are free to\nenforce their terms of service in whatever way they see fit, within the bounds of applicable law\n(e.g. anti-discrimination legislation). However, we should also realise that <em>almost all</em> social\nmedia, the public spaces of the digital realm, are in fact privately owned spaces subject to\na corporation\u2019s terms of service. There is currently no viable, non-corporate alternative space that\nwe could all move to. For better or for worse, Mark Zuckerberg, Jack Dorsey, and Jeff Bezos (and\ntheir underlings) are, for now, the arbiters of what can and cannot be said online.</p>\n\n<p>This situation draws attention to the <a href=\"https://redecentralize.org/\">decentralised web community</a>,\na catch-all for a broad set of projects that are aiming to reduce the degree of centralised\ncorporate control in the digital sphere. 
This includes self-hosted/federated social networks such as\n<a href=\"https://joinmastodon.org/\">Mastodon</a> and <a href=\"https://diasporafoundation.org/\">Diaspora</a>, peer-to-peer\nsocial networks such as <a href=\"https://scuttlebutt.nz/\">Scuttlebutt</a>, and miscellaneous blockchain\nprojects. The exact aims and technicalities of those projects are not important for this post.\nI will start by focussing on one particular design goal that is mentioned by many decentralised web\nprojects, and that is <em>censorship resistance</em>.</p>\n\n<h2>Censorship resistance</h2>\n\n<p>When we think of censorship, we think of totalitarian states exercising violent control over their\npopulation, crushing dissent and stifling the press. Against such an adversary, technologies that\nprovide censorship resistance seem like a positive step forward, since they promote individual\nliberty and human rights.</p>\n\n<p>However, often the adversary is not a totalitarian state, but other users. Censorship resistance\nmeans that anybody can say anything, without suffering consequences. And unfortunately there are\na lot of people out there who say and do rather horrible things. Thus, as soon as\na censorship-resistant social network becomes sufficiently popular, I expect that it will be filled\nwith messages from spammers, neo-nazis, and child pornographers (or any other type of content that\nyou consider despicable). One person\u2019s freedom from violence is another person\u2019s censorship, and\nthus, a system that emphasises censorship resistance will inevitably invite violence against some\npeople.</p>\n\n<p>I fear that many decentralised web projects are designed for censorship resistance not so much\nbecause they deliberately want to become hubs for neo-nazis, but rather out of a kind of naive\nutopian belief that more speech is always better. But I think we have learnt in the last decade that\nthis is not the case. 
If we want technologies to help build the type of society that we want to live\nin, then certain abusive types of behaviour must be restricted. Thus, content moderation is needed.</p>\n\n<h2>The difficulty of content moderation</h2>\n\n<p>If we want to declare some types of content as unacceptable, we need a process for distinguishing\nbetween acceptable and unacceptable material. But this is difficult. Where do you draw the line\nbetween healthy scepticism and harmful conspiracy theory? Where do you draw the line between healthy\nsatire, using exaggeration for comic effect, and harmful misinformation? Between legitimate\ndisagreement and harassment? Between honest misunderstanding and malicious misrepresentation?</p>\n\n<p>With all of these, some cases will be very clearly on one side or the other of the dividing line,\nbut there will always be a large grey area of cases that are unclear and a matter of subjective\ninterpretation. \u201c<a href=\"https://en.wikipedia.org/wiki/I_know_it_when_I_see_it\">I know it when I see it</a>\u201d\nis difficult to generalise into a rule that can be applied objectively and consistently; and without\nobjectivity and consistency, moderation can easily degenerate into a situation where one group of\npeople forces their opinions on everyone else, like them or not.</p>\n\n<p>In a service that is used around the world, there will be cultural differences on what is considered\nacceptable or not. Maybe one culture is sensitive about nudity and tolerant of depictions of\nviolence, while another culture is liberal about nudity and sensitive about violence. One person\u2019s\nterrorist is another person\u2019s freedom fighter. There is no single, globally agreed standard of what\nis or is not considered acceptable.</p>\n\n<p>Nevertheless, it is possible to come to agreement. For example, Wikipedia editors successfully\nmanage to agree on what should and should not be included in Wikipedia articles, even those on\ncontentious subjects. 
I won\u2019t say that this process is perfect: Wikipedia editors are predominantly\nwhite, male, and from the Anglo-American cultural sphere, so there is bound to be bias in their\neditorial decisions. I haven\u2019t participated in this community, but I assume the process of coming to\nagreement is sometimes messy and will not make everybody happy.</p>\n\n<p>Moreover, being an encyclopaedia, Wikipedia is focussed on widely accepted facts backed by evidence.\nAttempting to moderate social media in the same way as Wikipedia would make it joyless, with no room\nfor satire, comedy, experimental art, or many of the other things that make it interesting and\nhumane. Nevertheless, Wikipedia is an interesting example of decentralised content moderation that\nis not controlled by a private entity.</p>\n\n<p>Another example is federated social networks such as Mastodon or Diaspora. Here, each individual\nserver administrator has the authority to\n<a href=\"https://docs.joinmastodon.org/admin/moderation/\">set the rules for the users of their server</a>, but\nthey have no control over activity on other servers (other than to block another server entirely).\nDespite the decentralised architecture, there is a\n<a href=\"https://arxiv.org/pdf/1909.05801.pdf\">trend towards centralisation</a> (10% of Mastodon instances\naccount for almost half the users), leaving a lot of power in the hands of a small number of server\nadministrators. If these social networks are to go more mainstream, I expect these effects to be\namplified.</p>\n\n<h2>Filter bubbles</h2>\n\n<p>One form of social media is private chat for small groups, as provided e.g. by WhatsApp, Signal, or\neven email. Here, when you post a message to a group, the only people who can see it are members of\nthat group. In this setting, not much content moderation is needed: group members can kick out other\nmembers if they say things considered unacceptable. 
If one group says things that another group\nconsiders objectionable, that\u2019s no problem, because the two groups can\u2019t see each other\u2019s\nconversations anyway. If one user is harassing another, the victim can block the harasser. Thus,\nprivate groups are comparatively easy to deal with.</p>\n\n<p>The situation is harder with social media that is public (anyone can read) and open (anyone can join\na conversation), or when the groups are very large. Twitter is an example of this model (and\nFacebook to some degree, depending on your privacy settings). When anybody can write a message that\nyou will see (e.g. a reply to something you posted publicly), the door is opened to harassment and\nabuse.</p>\n\n<p>One response might be to retreat into our filter bubbles. For example, we could say that you see\nonly messages posted by your immediate friends and friends-of-friends. I am pretty sure that there\nare no neo-nazis among my direct friends, and probably also among my second-degree network, so such\na rule would shield me from extremist content of one sort, at least.</p>\n\n<p>It is also possible for users to collaborate on creating filters. For example,\n<a href=\"https://github.com/freebsdgirl/ggautoblocker\">ggautoblocker</a> was a tool to block abusive Twitter\naccounts during <a href=\"https://en.wikipedia.org/wiki/Gamergate_controversy\">GamerGate</a>, a 2014\nmisogynistic harassment campaign that\n<a href=\"https://www.theguardian.com/technology/2016/dec/01/gamergate-alt-right-hate-trump\">foreshadowed</a>\nthe rise of the alt-right and Trumpism. In the absence of central moderation by Twitter, victims of\nthis harassment could use this tool to automatically block a large number of harmful users so that\nthey wouldn\u2019t have to see the abusive messages.</p>\n\n<p>Of course, even though such filtering saves you from having to see things you don\u2019t like, it doesn\u2019t\nstop the objectionable content from existing. 
Moreover, other people may have the opposite sort of\nfilter bubble in which they see <em>lots</em> of extremist content, causing them to become radicalised.\nPersonalised filters also stop us from seeing alternative (valid) opinions that would help broaden\nour worldview and enable better mutual understanding of different groups in society.</p>\n\n<p>Thus, subjective filtering of who sees what, such as blocking users, is an important part of\nreducing harm on social media, but by itself it is not sufficient. It is also necessary to uphold\nminimum standards on what can be posted at all, for example by requiring a baseline of civility and\ntruthfulness.</p>\n\n<h2>Democratic content moderation</h2>\n\n<p>I previously argued that there is no universally agreed standard of acceptability of content; and\nyet, we must somehow keep the standard of discourse high enough that it does not become intolerable\nfor those involved, and to minimise the harms e.g. from harassment, radicalisation, and incitement\nof violence. How do we solve this contradiction? Leaving the power in the hands of a small number of\ntech company CEOs, or any other small and unelected group of people, does not seem like a good\nlong-term solution.</p>\n\n<p>A purely technical solution does not exist either, since code cannot make value judgements about\nwhat sort of behaviour is acceptable. It seems like some kind of democratic process is the only\nviable long-term solution here, perhaps supported by some technological mechanisms, such as\nAI/machine learning to flag potentially abusive material. But what might this democratic process\nlook like?</p>\n\n<p>Moderation should not be so heavy-handed that it drowns out legitimate disagreement. Disagreement\nneed not always be polite; indeed,\n<a href=\"https://everydayfeminism.com/2015/12/tone-policing-and-privilege/\">tone policing</a> should not be\na means of silencing legitimate complaints. 
On the other hand, aggressive criticism may quickly flip\ninto the realm of harassment, and it may be unclear when exactly this line has been crossed.\nSometimes it may be appropriate to take into account the power relationships between the people\ninvolved, and hold the privileged and powerful to a higher standard than the oppressed and\ndisadvantaged, since otherwise the system may end up reinforcing existing imbalances. But there are\nno hard and fast rules here, and much depends on the context and background of the people involved.</p>\n\n<p>This example indicates that the moderation process needs to embed ethical principles and values. One\nway of doing this would be to have a board of moderation overseers that is elected by the user base.\nIn their manifestos, candidates for this board can articulate the principles and values that they\nwill bring to the job. Different candidates may choose to represent people with different world\nviews, such as conservatives and liberals. Having a diverse set of opinions and cultures represented\non such a board would both legitimise its authority and improve the quality of its decision-making.\nIn time, parties and factions may even emerge, which I would regard as a democratic success.</p>\n\n<p>Facebook employs\n<a href=\"https://bhr.stern.nyu.edu/tech-content-moderation-june-2020\">around 15,000 content moderators</a>, and\nby all accounts it\u2019s\n<a href=\"https://www.theverge.com/2019/2/25/18229714/cognizant-facebook-content-moderator-interviews-trauma-working-conditions-arizona\">a horrible job.</a>\nWho would want to do it? On the other hand, 15,000 is a tiny number compared to Facebook\u2019s user\ncount. Rather than concentrating all the content moderation work on a comparatively small number of\nmoderators, maybe every user should have to do a stint at moderation from time to time as part of\ntheir conditions for using a service? 
Precedents for this sort of thing exist: in a number of\ncountries, individuals may be called to jury duty to help decide criminal cases; and researchers are\nregularly asked to review articles written by their peers. These things are not great fun either,\nbut we do them for the sake of the civic system that we all benefit from.</p>\n\n<p>Moderators with differing political views may disagree on whether a certain piece of content is\nacceptable or not. In cases of such disagreement, additional people can be brought in, hopefully\nallowing the question to be settled through debate. If no agreement can be found, the matter can be\nescalated to the elected board, which has the final say and which uses the experience to set\nguidelines for future moderation.</p>\n\n<h2>Implications for decentralised technologies</h2>\n\n<p>In decentralised social media, I believe that ultimately it should be the users themselves who\ndecide what is acceptable or not. This governance will have to take place through some human process\nof debate and deliberation, although technical tools and some degree of automation may be able to\nsupport the process and make it more efficient. Rather than simplistic censorship resistance, or\ngiving administrators dictatorial powers, we should work towards ethical principles, democratic\ncontrol, and accountability.</p>\n\n<p>I realise that my proposals are probably naive and smack of \u201ccomputer scientist finally discovers\nwhy the humanities are important\u201d. Therefore, if you know of any work that is relevant to this topic\nand can help technological systems learn from centuries of experience in democracy in the civil\nsociety, please send it to me \u2014 I am keen to learn more. 
Moreover, if there is existing work in the\ndecentralised web community on enabling this kind of grassroots democracy, I would love to hear\nabout it too.</p>\n\n<p>You can find me on Twitter <a href=\"https://twitter.com/martinkl\">@martinkl</a>, or contact me by email\n(firstname at lastname dot com). I will update this post with interesting things that are sent to\nme.</p>\n\n<h2>Updates: related work</h2>\n\n<p>Here are some related projects that have been pointed out to me since this post was published. I\nhave not vetted them, so don\u2019t take this as an endorsement.</p>\n\n<ul>\n <li>The <a href=\"https://oversightboard.com/\">Facebook/Instagram Oversight Board</a> is quite close to what\nI have in mind, and it has <a href=\"https://oversightboard.com/news/226612455899839-oversight-board-upholds-former-president-trump-s-suspension-finds-facebook-failed-to-impose-proper-penalty/\">upheld</a>\nthe suspension of Trump\u2019s account.</li>\n <li>The recently launched\n<a href=\"https://news.mit.edu/2021/center-constructive-communication-0113\">MIT Center for Constructive Communication</a>\nis an ambitious effort in this area.</li>\n <li>\u201c<a href=\"https://foundation.mozilla.org/en/blog/fellow-research-decentralized-web-hate/\">The Decentralized Web of Hate</a>\u201d\nis a detailed report by <a href=\"http://emmibevensee.com/\">Emmi Bevensee</a> on use of decentralised\ntechnologies by extremists.</li>\n <li><a href=\"https://homes.cs.washington.edu/~axz/publications.html\">Amy X. 
Zhang</a> and her collaborators have\ndone a lot of research on moderation.</li>\n <li><a href=\"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4005326\">Evelyn Douek argues</a> that it\u2019s not sufficient to\nview content moderation as lots of individual decisions on individual pieces of content, but that accountability\nrequires a new form of institution that provides a dynamic, continuous governance structure.</li>\n <li><a href=\"https://twitter.com/arcalinea\">Jay Graber</a> recently published a comprehensive\n<a href=\"https://twitter.com/arcalinea/status/1352316972654944257\">report comparing decentralised social protocols</a>, and a\n<a href=\"https://jaygraber.medium.com/designing-decentralized-moderation-a76430a8eab\">blog post</a>\non decentralised content moderation.</li>\n <li><a href=\"https://twitter.com/weschow\">Wes Chow</a> has written a\n<a href=\"https://medium.com/@wesc/opportunities-in-the-design-of-decentralized-social-networks-d66cce42d74b\">thoughtful and nuanced article</a>\non decentralised content moderation, with lots of references to further reading at the end.</li>\n <li>A few <a href=\"https://twitter.com/xmal/status/1349413781953273857\">people</a>\n<a href=\"https://twitter.com/weschow/status/1349417270179737604\">mentioned</a> Slashdot, Reddit, and Stack Overflow\nas successful examples of community-run moderation.</li>\n <li>On the other hand, J. 
Nathan Matias <a href=\"https://twitter.com/natematias/status/1496318787712344067\">is skeptical</a>\nthat volunteers will be able to handle the challenges of content moderation at scale, since Facebook reportedly\nspends $500m a year on it.</li>\n <li><a href=\"https://cblgh.org/articles/trustnet.html\">Trustnet</a> is a way of computing numerical scores for\nthe degree of trust in individual users, based on the social graph.</li>\n <li><a href=\"https://matrix.org/\">Matrix</a>, a federated messaging system, is\n<a href=\"https://matrix.org/blog/2020/10/19/combating-abuse-in-matrix-without-backdoors\">working on</a> a\ndecentralised, subjective reputation system.</li>\n <li><a href=\"https://freenetproject.org/\">Freenet</a> has a web-of-trust-based, decentralised\n<a href=\"https://www.draketo.de/english/freenet/friendly-communication-with-anonymity\">user reputation system</a>\n(see also this <a href=\"https://github.com/xor-freenet/plugin-WebOfTrust/blob/master/developer-documentation/core-developers-manual/OadSFfF-version1.2-non-print-edition.pdf\">Bachelor\u2019s thesis</a>).</li>\n <li><a href=\"https://www.waivlength.io/\">Waivlength</a> is exploring a <a href=\"https://waivlengthdev.medium.com/jury-duty-a-decentralised-moderation-model-for-governing-a-social-media-platform-b675b558dd6d\">governance approach inspired by jury duty</a>.</li>\n <li><a href=\"https://github.com/Freechains/README\">Freechains</a> is a peer-to-peer content distribution\nprotocol with an embedded user reputation system.</li>\n <li><a href=\"https://github.com/Murmuration-Labs/songbird-decentralized-moderation\">Songbird</a> is a sketch of a\ndecentralised moderation system for IPFS.</li>\n <li><a href=\"https://cabal.chat/\">Cabal</a> allows users to\n<a href=\"https://twitter.com/substack/status/1349471659653124098\">subscribe</a> to other users\u2019 moderation\nactions, such as blocking and hiding posts.</li>\n <li>An app called <a 
href=\"https://kc-fantastic-app.medium.com/decentralized-content-moderation-on-fantastic-app-3768989ced19\">Fantastic</a>\nis exploring mechanisms for moderation.</li>\n <li>Felix Dietze\u2019s <a href=\"https://github.com/fdietze/notes/blob/master/felix_dietze_master_thesis_2015.pdf\">2015 master\u2019s thesis</a>\nexplores community-run moderation. He is also working on\n<a href=\"https://felix.unote.io/hacker-news-scores\">ranking</a>\n<a href=\"https://github.com/fdietze/downvote-scoring\">algorithms</a>\nfor news aggregators.</li>\n <li>Twitter is trialling <a href=\"https://blog.twitter.com/en_us/topics/product/2021/introducing-birdwatch-a-community-based-approach-to-misinformation.html\">Birdwatch</a>,\na crowdsourced effort to tackle misinformation.</li>\n <li><a href=\"https://blog.coinbase.com/coinbases-philosophy-on-account-removal-and-content-moderation-c80d1aa452b7\">Coinbase\u2019s approach</a>\nis to ban only content that is illegal in jurisdictions where they operate, or content that is\n<a href=\"https://en.wikipedia.org/wiki/United_States_free_speech_exceptions\">not considered protected speech</a>\nunder the U.S. First Amendment.</li>\n</ul>",
+18
martinkl/2021_02_23_patreon.html.json
···+"summary": "For the last five or six years, since I bid goodbye to the startup scene and Silicon Valley, I have been increasingly working in public. I have written a book, given around 100 talks (many of which are available on YouTube), published over 20 research papers (all freely available from...",+"content": "<p>For the last five or six years, since I bid goodbye to the startup scene and Silicon Valley, I have\nbeen increasingly working in public. I have <a href=\"https://dataintensive.net/\">written a book</a>,\ngiven <a href=\"https://martin.kleppmann.com/talks.html\">around 100 talks</a> (many of which are\n<a href=\"https://www.youtube.com/playlist?list=PLeKd45zvjcDHJxge6VtYUAbYnvd_VNQCx\">available on YouTube</a>),\npublished <a href=\"https://martin.kleppmann.com/#publications\">over 20 research papers</a>\n(all freely available from my website), and released and maintained\n<a href=\"https://github.com/ept\">some open source projects</a>.\nJust a few months ago I released a new undergraduate-level course on distributed systems, consisting of\n<a href=\"https://www.youtube.com/playlist?list=PLeKd45zvjcDFUEv_ohr_HdUFe97RItdiB\">7 hours of video lectures</a> and\n<a href=\"https://www.cl.cam.ac.uk/teaching/2021/ConcDisSys/dist-sys-notes.pdf\">87 pages of notes</a> and\nexercises, all free; in student evaluations at the <a href=\"https://www.cst.cam.ac.uk/\">University of Cambridge</a>,\nover 80% rated my lectures and notes as \u201cexcellent\u201d.</p>\n\n<p>I love doing first-rate work and making it broadly available. In fact, apart from my book, I give\neverything away for free, because I want to be able to reach and help the broadest possible set of\npeople. And even my book is very cheap compared to the value that many people get out of it (just\n<a href=\"https://dataintensive.net/buy.html\">read the reviews</a>).</p>\n\n<p>Of course, nobody goes into academia because of the money (or the job security of untenured posts,\nfor that matter). 
I would probably be earning five times my current salary if I had stayed in\nindustry. But I have absolutely no regrets about taking that pay cut: I love the freedom to work on\nwhatever I find interesting, and the freedom to publish everything so that others can use it. If you\nhave found any of my talks, writing, or code useful, then you have also benefitted from the freedom\nthat I enjoy.</p>\n\n<p>Of course, like everybody else, I have bills to pay. At the moment I\u2019m employed at the\n<a href=\"https://www.cst.cam.ac.uk/\">University of Cambridge</a> on a fixed-term contract, funded by\na charitable research grant. This grant gives me wonderful freedom to pursue my research and make it\npublicly available, but it\u2019s a fixed amount of money, and once it runs out, my job disappears in\na puff of smoke. This sort of grant is not renewable, however amazing the work it has\nenabled. I can try applying for follow-on grants from other funders, but this takes a lot of time\nand has a low chance of success.</p>\n\n<p>Therefore I am setting up crowdfunding through <a href=\"https://www.patreon.com/martinkl\">Patreon</a>, in the\nhope of establishing a sustainable basic income that will allow me to continue my work of research\nand teaching long-term. I want to continue making most of my work freely available, so that the\nmaximum number of people can benefit from it.</p>\n\n<h2>Why support me?</h2>\n\n<p>I am offering <a href=\"https://www.patreon.com/martinkl\">three membership tiers</a> for anyone who wants to support my work:</p>\n\n<ol>\n <li>At the lowest tier, you will get regular news about new things I am working on, and exclusive\nearly access to drafts and work-in-progress. Keep your finger on the pulse of new research as it\nis happening. 
I will also send you some nice stickers (once I\u2019ve got them printed).</li>\n <li>At the middle tier, you will additionally be invited to participate in an exclusive community\nwith other supporters and myself, with both live and asynchronous discussions. I hope to\ncultivate thoughtful, high-quality exchange of ideas with likeminded people in this community.</li>\n <li>At the highest tier, you get all the aforementioned benefits, plus the ability to influence my\ndirection when I\u2019m choosing what to work on next. Not saying I will definitely do what you want;\nalso not saying that I will only take input from paying supporters (I still welcome ideas from\neveryone). However, I will consult and engage with supporters at this tier to get your opinions.\nI will also acknowledge you in any papers and books I write, making your name permanently etched\ninto the scientific literature.</li>\n</ol>\n\n<p>However, the biggest benefit is that by supporting me on Patreon you are enabling the creation of\nfuture work: that is, new thinking, writing, talks, and code that would not be created if I had to\nspend my time writing grant proposals or working for some company instead. If I have to go and get\na job somewhere, you will mostly hear me giving bland talks promoting the technology of whatever\ncompany I happen to work for. 
Being independent allows me to pick topics that I find interesting and\nimportant (such as <a href=\"https://www.youtube.com/watch?v=5ZjhNTM8XU8\">database transactions</a>,\n<a href=\"https://www.youtube.com/watch?v=Uav5jWHNghY\">formal verification</a>,\n<a href=\"https://www.youtube.com/watch?v=B5NULPSiOGw\">CRDTs</a>, or\n<a href=\"https://martin.kleppmann.com/papers/curve25519.pdf\">elliptic curve cryptography</a>),\nand present them in an accessible and neutral way.</p>\n\n<p>I will continue making most of my work publicly available for free (except for books): even if you\ncannot afford to be a Patreon supporter, it will still be available to you. Patreon supporters\nsimply get earlier access, plus the warm fuzzy feeling of knowing that you enabled the creation of\nnew work that, without your support, may never have existed. Supporting me on Patreon is <em>not\na donation</em>: it is an investment in future work that will hopefully be valuable to you.</p>\n\n<p>If you have found my work useful \u2013 for example, if you have applied ideas from my talks in your\nwork, or if my book helped you get a job \u2013 then I would be delighted to welcome you as a\n<a href=\"https://www.patreon.com/martinkl\">supporter</a>! And if your company uses my book for training\nengineers, please find out how your company can support me: even my highest supporter tier is a tiny\namount of money for a company that uses my work to improve the skills of their staff. I only get\naround $2 to $5 for every copy of my book that is sold; if you\u2019re getting a lot more value than this\nout of it, it would only be fair of you to <a href=\"https://www.patreon.com/martinkl\">support me more substantially</a>.</p>\n\n<p>If you cannot contribute financially, worry not. I equally appreciate your support in the form of\ncontributions to the open source community, discussing interesting ideas with me, and sharing useful\nmaterial with others. 
I will continue to engage with you and answer your questions, regardless of\nwhether you are a paying supporter. And most things I produce will continue to be free, so that\neveryone can benefit from them.</p>\n\n<h2>Planned work</h2>\n\n<p>Keep in mind that when you support me, you are not buying a product. You don\u2019t know exactly what\nyou\u2019re going to get, because I don\u2019t know exactly what I am going to do in advance either. That\u2019s\nwhy it\u2019s called research \u2013 it\u2019s open-ended, and part of its purpose is to go down unexpected\nrabbit-holes if they seem important! You are funding a person because this person has done good work\nin the past, and is likely to continue doing good work in the future.</p>\n\n<p>I do have a lot of plans, though. At a high level, I am hoping to do these things over the next few years:</p>\n\n<ul>\n <li>Write another book to complement <a href=\"https://dataintensive.net/\">Designing Data-Intensive Applications</a>;</li>\n <li>Develop the foundational technologies to enable the\n<a href=\"https://www.cl.cam.ac.uk/research/dtg/trve/\">next generation of collaboration software</a> (such as\nGoogle Docs), in a way that does not require\n<a href=\"https://www.inkandswitch.com/local-first.html\">giving Google all of our data</a>;</li>\n <li>Continue writing research papers, blog posts, and giving talks/making videos on distributed\nsystems and related topics.</li>\n</ul>\n\n<p>There is no concrete timescale for these things; most likely I will work on several of them in\ntandem, as I have been doing over the last several years.</p>\n\n<p>Part of this story is creating educational content on topics that I find important, and part is\na vision for the future of collaborative computing, which my collaborators and I are realising in\nthe form of <a href=\"https://github.com/automerge/automerge\">Automerge</a>, an open source project. 
Our vision\nis articulated in the essay-cum-manifesto on\n<a href=\"https://www.inkandswitch.com/local-first.html\">local-first software</a>, which I suggest you read if\nyou haven\u2019t already.</p>\n\n<h2>Research philosophy</h2>\n\n<p>For me it is important to have this mixture of research, open source software development, and\nteaching (through speaking and writing), because all of these activities feed off each other.\nI don\u2019t want to just work on open source without doing research, because that only leads to\nincremental improvements, no fundamental breakthroughs. I don\u2019t want to just do research without\napplying it, because that would mean losing touch with reality. And I don\u2019t want to just be\na YouTuber or writer without doing original research, because I would run out of ideas and my\ncontent would get stale and boring; good teaching requires actively working in the area.</p>\n\n<p>This interaction was articulated wonderfully by\n<a href=\"https://amturing.acm.org/award_winners/gray_3649936.cfm\">Turing award winner Jim Gray</a>:</p>\n\n<blockquote>\n <p>I aspire to be a scholar of computer science. All fields of scholarship, from religion to\nmedicine, emphasize three aspects: meditation, teaching and service. Meditation (called research\nby scientists) is the official part of research. But, teaching (writing papers, explaining your\nideas, and transferring technology) and service (making computer systems and helping people use\nthem) are also major aspects of the scholarly process. They keep the scholar in touch with\nreality.</p>\n\n <p>\u2014 <a href=\"http://jimgray.azurewebsites.net/papers/critiqueofibm%27scsresearch.pdf\">Jim Gray, 1980</a></p>\n</blockquote>\n\n<p>(That\u2019s from Gray\u2019s letter of resignation from IBM. The whole letter is a fascinating read if you\u2019re\ninto computing history. 
At the time Gray was working on\n<a href=\"https://people.eecs.berkeley.edu/~brewer/cs262/SystemR.pdf\">System R</a>, the precursor of all\nrelational databases we use today. It\u2019s fair to say that his work has had a huge impact.)</p>\n\n<p>Another aspect of my research philosophy is that good work rarely happens with one person alone, but\nthrough collaboration with other good people. Quoting Jim Gray again:</p>\n\n<blockquote>\n <p>Computer science is an empirical and multi-disciplinary field. The aspect of it that I work on,\ncomputer systems, requires lots of good people, time and equipment to produce anything of\ninterest. Projects of five or ten people working for five or ten years seem to be about the right\nscale. More modest projects are unable to attack significant problems. More ambitious projects\nhave unclear goals and have management problems.</p>\n</blockquote>\n\n<p>You might be wondering: even if I get enough Patreon funding to cover my own living expenses, it\nseems unlikely that I will be able to crowdfund a team of five to ten people. Fortunately, I have\nfound over the last years that collaboration does not require all team members to be funded out of\nthe same purse. I constantly collaborate with people without being responsible for their payroll.\nIn open source, it is common for contributors to a project to be employed by several different\norganisations, and indeed such diversity makes projects better and more resilient.</p>\n\n<p>I work closely with the <a href=\"https://www.inkandswitch.com/\">Ink & Switch lab</a>, who have their own\nfunding. Some of my collaborators are PhD students who have their own stipends, or research fellows\nwho have their own grants. We come together because of our common interests, and because nobody is\ntrying to profit from the others. 
We have a vision of the future that we want to realise, and the\nfunding just lets us pay the bills as we work towards the greater goal.</p>\n\n<p>Of course, if my Patreon ends up being successful and generates more money than I need for my own\nliving expenses, I will use it to help fund collaborators. I am not aiming to recreate the lavish\nSilicon Valley engineering salary that I left behind; I just want to do good work without having to\nspend a lot of time chasing grants.</p>\n\n<h2>Alternatives to crowdfunding</h2>\n\n<p>Before moving to <a href=\"https://www.patreon.com/martinkl\">Patreon</a> I considered several alternatives:</p>\n\n<ul>\n <li>Academic jobs and fellowships? It\u2019s difficult to get a stable position at a research-focussed\nuniversity. Both jobs and funding are fiercely competitive (hundreds of applicants for one place),\nand they require a strong track record of publications. Unfortunately, there is a\n<a href=\"https://cacm.acm.org/blogs/blog-cacm/248824-how-objective-is-peer-review/fulltext\">large degree of randomness</a>\nin the choice of papers that get accepted to top-tier publication venues. I am still interested in\nan academic career, but it seems unwise to put all my eggs in this uncertain basket. Oh, and due to\nthe pandemic my current university has a hiring freeze anyway, so no jobs anytime soon.</li>\n <li>Founding a startup? Been there, <a href=\"https://www.crunchbase.com/person/martin-kleppmann\">done that</a>\n(twice). A startup is a great way of productising technology on a 1\u20132 year time scale; it also\nneeds fast growth and/or a strong revenue model. My current work does not fit that model since it\nfocusses on foundational technologies with a longer time-scale (the 5\u201310 years mentioned by Jim\nGray), and it aims for public benefit rather than private profit.</li>\n <li>Getting a job at someone else\u2019s company? 
I want to be free to choose what to work on based on what\nI believe is important, not whatever happens to suit a company\u2019s agenda. I also want to be free to\npublish that work openly. Not many companies are willing to support such positions long-term.</li>\n <li>Consulting work and training? I could spend a fraction of my time helping companies solve problems\nwithin my area of expertise, or running training workshops. However, this type of income can\nfluctuate wildly, and generating a steady stream of clients is a lot of work and very distracting.\nIt\u2019s difficult to make consulting compatible with the deep thinking and long-term view required\nfor research.</li>\n <li>Becoming a professional author? I have been able to draw a reasonable income from\n<a href=\"https://martin.kleppmann.com/2020/09/29/is-book-writing-worth-it.html\">royalties for sales of my book</a>.\nHowever, I have no idea how long those sales will last, and I have no idea whether any future book\nI write will sell similarly well. Given this unpredictability, it seems unwise to bet on royalties\nas my only income. Moreover, book-writing is only one of several things I do, and I believe the\nother things generate value too. My funding situation should reflect that.</li>\n</ul>\n\n<p>With crowdfunding, I hope to not only generate a steady income stream, but also build a community of\npeople who are excited about the same topics as me, and who are invested in making these ideas\na reality. It is an opportunity for me to share early-stage work with enthusiasts, and to improve\nthat work through feedback from the community. 
And it is an opportunity for you to get an insider\nview of the research process as we build the future of computing.</p>\n\n<p>If you believe in our vision for\n<a href=\"https://www.inkandswitch.com/local-first.html\">a better future of collaborative computing</a>, or if\nyou want to see more high-quality educational materials for computer science, then why not head over\nto Patreon and <a href=\"https://www.patreon.com/martinkl\">pledge your support</a>? It will make a huge\ndifference. Thank you!</p>",
+18
martinkl/2021_04_14_goodbye-gpl.html.json
···+"summary": "The trigger for this post is the reinstating of Richard Stallman, a very problematic character, to the board of the Free Software Foundation (FSF). I am appalled by this move, and join others in the call for his removal. This occasion has caused me to reevaluate the position of the...",+"content": "<p>The trigger for this post is the\n<a href=\"https://www.fsf.org/news/statement-of-fsf-board-on-election-of-richard-stallman\">reinstating</a>\nof Richard Stallman, a very <a href=\"https://rms-open-letter.github.io/\">problematic character</a>, to the\nboard of the <a href=\"https://www.fsf.org/\">Free Software Foundation</a> (FSF). I am appalled by this move, and\njoin others in the call for his removal.</p>\n\n<p>This occasion has caused me to reevaluate the position of the FSF in computing. It is the steward of\nthe GNU project (a part of Linux distributions,\n<a href=\"https://www.gnu.org/gnu/incorrect-quotation.en.html\">loosely speaking</a>), and of a family of\nsoftware licenses centred around the\n<a href=\"https://en.wikipedia.org/wiki/GNU_General_Public_License\">GNU General Public License</a> (GPL). These\nefforts are unfortunately tainted by Stallman\u2019s behaviour. 
However, this is not what I actually want\nto talk about today.</p>\n\n<p>In this post I argue that we should move away from the GPL and related licenses (LGPL, AGPL), for\nreasons that have nothing to do with Stallman, but simply because I think they have failed to\nachieve their purpose, and they are more trouble than they are worth.</p>\n\n<p>First, brief background: the defining feature of the GPL family of licenses is the concept of\n<a href=\"https://en.wikipedia.org/wiki/Copyleft\">copyleft</a>, which states (roughly) that if you take some\nGPL-licensed code and modify it or build upon it, you must also make your modifications/extensions\n(known as a \u201c<a href=\"https://en.wikipedia.org/wiki/Derivative_work\">derivative work</a>\u201d) freely available\nunder the same license. This has the effect that the GPL\u2019ed source code cannot be incorporated into\nclosed-source software. At first glance, this seems like a great idea. So what is the problem?</p>\n\n<h2>The enemy has changed</h2>\n\n<p>In the 1980s and 1990s, when the GPL was written, the enemy of the free software movement was\nMicrosoft and other companies that sold closed-source (\u201cproprietary\u201d) software. The GPL intended to\ndisrupt this business model for two main reasons:</p>\n\n<ol>\n <li>Closed-source software cannot easily be modified by users; you can take it or leave it, but you\ncannot adapt it to your own needs. To counteract this, the GPL was designed to force companies to\nrelease the source code of their software, so that users of the software could study it, modify\nit, compile and use their modified version, and thus have the freedom to customise their\ncomputing devices to their needs.</li>\n <li>Moreover, GPL was motivated by a desire for fairness: if you write some software in your spare\ntime and release it for free, it\u2019s understandable that you don\u2019t want others to profit from your\nwork without giving something back to the community. 
Forcing derivative works to be open source\nensures at least some baseline of \u201cgiving back\u201d.</li>\n</ol>\n\n<p>While this made sense in 1990, I think the world has changed, and closed-source software is no\nlonger the main problem. <strong>In the 2020s, the enemy of freedom in computing is cloud software</strong> (aka\nsoftware as a service/SaaS, aka web apps) \u2013\u00a0i.e. software that runs primarily on the vendor\u2019s\nservers, with all your data also stored on those servers. Examples include Google Docs, Trello,\nSlack, Figma, Notion, and many others.</p>\n\n<p>This cloud software may have a client-side component (a mobile app, or the JavaScript running in\nyour web browser), but it only works in conjunction with the vendor\u2019s server. And there are lots of\nproblems with cloud software:</p>\n\n<ul>\n <li>If the company providing the cloud software goes out of business or decides to\n<a href=\"https://killedbygoogle.com/\">discontinue a product</a>, the software stops working, and you are\nlocked out of the documents and data you created with that software. This is an especially common\nproblem with software made by a startup, which may get\n<a href=\"https://ourincrediblejourney.tumblr.com/\">acquired by a bigger company</a> that has no interest in\ncontinuing to maintain the startup\u2019s product.</li>\n <li>Google and other cloud services may\n<a href=\"https://twitter.com/Demilogic/status/1358661840402845696\">suddenly suspend your account</a> with no\nwarning and <a href=\"https://www.paullimitless.com/google-account-suspended-no-reason-given/\">no recourse</a>,\nfor example if an automated system thinks you have violated its terms of service. Even if your own\nbehaviour has been faultless, someone else may have hacked into your account and used it to send\nmalware or phishing emails without your knowledge, triggering a terms of service violation. 
Thus,\nyou could suddenly find yourself permanently locked out of every document you ever created on\nGoogle Docs or another app.</li>\n <li>With software that runs on your own computer, even if the software vendor goes bust, you can\ncontinue running it forever (in a VM/emulator if it\u2019s no longer compatible with your OS, and\nassuming it doesn\u2019t need to contact a server for a license check). For example, the\nInternet Archive has a collection of\n<a href=\"https://archive.org/details/softwarelibrary\">over 100,000 historical software titles</a> that you\ncan run in an emulator inside your web browser! In contrast, if cloud software gets shut down,\nthere is no way for you to preserve it, because you never had a copy of the server-side software,\nneither as source code nor in compiled form.</li>\n <li>The 1990s problem of not being able to customise or extend software you use is aggravated further\nin cloud software. With closed-source software that runs on your own computer, at least someone\ncould reverse-engineer the file format it uses to store its data, so that you could load it into\nalternative software (think pre-<a href=\"https://en.wikipedia.org/wiki/Office_Open_XML\">OOXML</a> Microsoft\nOffice file formats, or Photoshop files before the\n<a href=\"https://www.adobe.com/devnet-apps/photoshop/fileformatashtml/\">spec</a> was published). With cloud\nsoftware, not even that is possible, since the data is only stored in the cloud, not in files on\nyour own computer.</li>\n</ul>\n\n<p>If all software was free and open source, these problems would all be solved. However, making the\nsource code available is not actually necessary to solve the problems with cloud software; even\nclosed-source software avoids the aforementioned problems, as long as it is running on your own\ncomputer rather than the vendor\u2019s cloud server. 
Note that the Internet Archive is able to keep\nhistorical software working without ever having its source code: for purposes of preservation,\nrunning the compiled machine code in an emulator is just fine. Maybe having the source code would\nmake it a little easier, but it\u2019s not crucial. The important thing is having a copy of the software\n<strong>at all</strong>.</p>\n\n<h2>Local-first software</h2>\n\n<p>My collaborators and I have previously argued for\n<a href=\"https://www.inkandswitch.com/local-first.html\">local-first software</a>, which is a response to these\nproblems with cloud software. Local-first software runs on your own computer, and stores its data on\nyour local hard drive, while also retaining the convenience of cloud software, such as real-time\ncollaboration and syncing your data across all of your devices. It is nice for local-first software\nto also be open source, but this is not necessary: 90% of its benefits apply equally to\nclosed-source local-first software.</p>\n\n<p>Cloud software, not closed-source software, is the real threat to software freedom, because the harm\nfrom being suddenly locked out of all of your data at the whim of a cloud provider is much greater\nthan the harm from not being able to view and modify the source code of your software. For that\nreason, it is much more important and pressing that we make local-first software ubiquitous. If, in\nthat process, we can also make more software open-source, then that would be nice, but that is less\ncritical. Focus on the biggest and most urgent challenges first.</p>\n\n<h2>Legal tools to promote software freedom</h2>\n\n<p>Copyleft software licenses are a legal tool that attempts to force more software vendors to release\ntheir source code. In particular, the\n<a href=\"https://en.wikipedia.org/wiki/Affero_General_Public_License\">AGPL</a> is an attempt to force providers\nof cloud services to release the source of their server-side software. 
However, this hasn\u2019t really\nworked: most vendors of cloud software simply refuse to use AGPL-licensed software, and either use\na different implementation with a more permissive license, or re-implement the necessary\nfunctionality themselves, or\n<a href=\"https://www.elastic.co/pricing/faq/licensing\">buy a commercial license</a> that comes without the\ncopyleft clauses. I don\u2019t think the license has caused any source code to become available that\nwouldn\u2019t have been open source anyway.</p>\n\n<p>As a legal tool to promote greater software freedom, I believe copyleft software licenses have\nlargely failed, since they have done nothing to stop the rise of cloud software, and probably not\ndone much to increase the share of software whose source is available. Open source software has\nbecome very successful, but much of this success is in projects with non-copyleft licenses (e.g.\nApache, MIT, or BSD licenses), and even in the GPL-licensed projects (e.g. Linux) I am skeptical\nthat the copyleft aspect was really an important factor in the project\u2019s success.</p>\n\n<p>I believe a much more promising legal tool to promote software freedom is in government regulation.\nFor example, the GDPR includes a\n<a href=\"https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/individual-rights/right-to-data-portability/\">right to data portability</a>,\nwhich means that users must be able to move their data from one service to another. Existing\nimplementations of portability, such as\n<a href=\"https://en.wikipedia.org/wiki/Google_Takeout\">Google Takeout</a>, are quite rudimentary (what can you\nreally do with a big zip archive of JSON files?), but we can lobby regulators to\n<a href=\"https://interoperability.news/\">push for better portability/interoperability</a>, e.g. 
requiring\nreal-time bidirectional sync of your data between two apps by competing providers.</p>\n\n<p>Another promising route I see is pushing\n<a href=\"https://joinup.ec.europa.eu/sites/default/files/document/2011-12/OSS-procurement-guideline%20-final.pdf\">public-sector procurement to prefer open source, local-first software</a>\nover closed-source cloud software. This creates a positive incentive for businesses to develop and\nmaintain high-quality open source software, in a way that copyleft clauses do not.</p>\n\n<p>You might argue that a software license is something that an individual developer can control,\nwhereas governmental regulation and public policy is a much bigger issue outside of any one\nindividual\u2019s power. Yes, but how much impact can you really have by choosing a software license?\nAnyone who doesn\u2019t like your license can simply choose not to use your software, in which case your\npower is zero. Effective change comes from collective action on big issues, not from one person\u2019s\nlittle open source side project choosing one license over another.</p>\n\n<h2>Other problems with GPL-family licenses</h2>\n\n<p>You can force a company to make their source code of a GPL-derived software project available, but\nyou cannot force them to be good citizens of the open source community (e.g. continuing to maintain\nthe features they have added, fixing bugs, helping other contributors, providing good documentation,\nparticipating in project governance). What worth is source code that is just \u201cthrown over the wall\u201d\nwithout genuine engagement in the open source project? 
At best it\u2019s worthless, and at worst it\u2019s\nharmful because it shifts the burden of maintenance to other contributors of the project.</p>\n\n<p>We need people to be good contributors to the open source community, and this is achieved by setting\nup the right incentives and by being welcoming, not by software licenses.</p>\n\n<p>Finally, a practical problem of GPL-family licenses is their\n<a href=\"http://gplv3.fsf.org/wiki/index.php/Compatible_licenses\">incompatibility with other widely-used licenses</a>,\nmaking it difficult to use certain combinations of libraries in the same project and unnecessarily\nfragmenting the open source ecosystem. Maybe it would be worth putting up with this problem if the\nGPL had other strong advantages, but as I have explained, I don\u2019t think those advantages exist.</p>\n\n<h2>Conclusion</h2>\n\n<p>The GPL and other copyleft licenses are not bad; I just think they\u2019re pointless. They have practical\nproblems, and they are tainted by the behaviour of the FSF, but most importantly, I do not believe\nthey have been an effective contributor to software freedom. The only real use for copyleft nowadays\nis by commercial software vendors\n(<a href=\"https://www.mongodb.com/licensing/server-side-public-license/faq\">MongoDB</a>,\n<a href=\"https://www.elastic.co/pricing/faq/licensing\">Elastic</a>) who want to stop Amazon from providing\ntheir software as a service \u2013\u00a0which is fine, but it\u2019s motivated purely by business concerns, not by\nsoftware freedom.</p>\n\n<p>Open source software has been tremendously successful, and it has come a long way since the origins\nof the free software movement born from 1990s anti-Microsoft sentiment. I will acknowledge that the\nFSF was instrumental in getting this all started. 
However, 30 years on, the ecosystem has changed,\nbut the FSF has failed to keep up, and has\n<a href=\"https://r0ml.medium.com/free-software-an-idea-whose-time-has-passed-6570c1d8218a\">become more and more out of touch</a>.\nIt has failed to establish a coherent response to cloud software and other recent threats to\nsoftware freedom, and it just continues to rehash tired old arguments from decades ago. Now, by\nreinstating Stallman and dismissing the concerns about him, the FSF is\n<a href=\"https://lu.is/blog/2021/04/07/values-centered-npos-with-kmaher/\">actively harming</a> the cause of\nfree software. We must distance ourselves from the FSF and their worldview.</p>\n\n<p>For all these reasons, I think it no longer makes sense to cling on to the GPL and copyleft. Let\nthem go. Instead, I would encourage you to adopt a permissive license for your projects (e.g.\n<a href=\"https://opensource.org/licenses/MIT\">MIT</a>, <a href=\"https://opensource.org/licenses/BSD-2-Clause\">BSD</a>,\n<a href=\"https://opensource.org/licenses/Apache-2.0\">Apache 2.0</a>), and then focus your energies on the\nthings that will really make a difference to software freedom:\n<a href=\"https://www.inkandswitch.com/local-first.html\">counteracting</a> the monopolising effects of cloud\nsoftware, developing sustainable business models that allow open source software to thrive, and\npushing for regulation that prioritises the interests of software users over the interests of\nvendors.</p>\n\n<p><em>Thank you to <a href=\"https://ramcq.net/\">Rob McQueen</a> for feedback on a draft of this post.</em></p>\n\n<p><em>Update: <a href=\"https://twitter.com/lexi_lambda/status/1295426437583982592\">related Twitter thread by Alexis King</a></em></p>",
+18
martinkl/2021_09_01_podcast-interviews.html.json
···+"summary": "I regularly get asked to give interviews on the topics that I work on, especially for podcasts. To make them easier to find for anybody who\u2019s interested, I thought I would make a list. They touch on a range of different topics, although there is also some overlap so I...",+"content": "<p>I regularly get asked to give interviews on the topics that I work on, especially for podcasts.\nTo make them easier to find for anybody who\u2019s interested, I thought I would make a list.\nThey touch on a range of different topics, although there is also some overlap so I wouldn\u2019t\nrecommend listening to them all in a row!</p>\n\n<p>(By the way, if you want a list of conference talks I have given, I have a\n<a href=\"https://www.youtube.com/playlist?list=PLeKd45zvjcDHJxge6VtYUAbYnvd_VNQCx\">YouTube playlist</a> for that.)</p>\n\n<p>Here\u2019s a list of interviews I\u2019ve given as of September 2021:</p>\n\n<ul>\n <li>\n <p>Interview with <a href=\"https://www.wix.engineering/\">Wix Engineering</a>, in which we discuss my book, the\nstate of Automerge, the convergence of streaming systems and databases, Kafka\u2019s move to replace\nZooKeeper with their own Raft implementation, impact of my research, and more.\nRecorded 16 June 2021, published 26 August 2021.\n<a href=\"https://www.youtube.com/watch?v=jtK7LOcP76s\">Video</a>,\n<a href=\"https://www.wix.engineering/post/wix-engineering-tech-interviews-martin-kleppmann-natan-silnitsky\">transcript</a>.</p>\n </li>\n <li>\n <p>Interview with the <a href=\"https://museapp.com/podcast/\">Metamuse podcast</a>, in which we discuss local-first\nsoftware: how the concept has evolved since we <a href=\"https://www.inkandswitch.com/local-first.html\">first articulated it</a>,\nand where it\u2019s heading in the future.\nRecorded 17 August 2021, published 14 October 2021.\n<a href=\"https://museapp.com/podcast/41-local-first-software/\">Episode link</a></p>\n </li>\n <li>\n <p>Interview with the <a 
href=\"https://www.torocloud.com/podcast\">Coding over Cocktails podcast</a>, in which we\ndiscuss making systems scalable, how data systems have evolved over the years, and local-first\nsoftware. Recorded 26 August 2021, published 30 August 2021.\n<a href=\"https://www.torocloud.com/podcast/designing-data-intensive-applications-martin-kleppmann\">Episode link and transcript</a>,\n<a href=\"https://soundcloud.com/codingovercocktails/designing-data-intensive-applications-with-martin-kleppman\">Soundcloud</a>,\n<a href=\"https://podcasts.apple.com/ph/podcast/designing-data-intensive-applications-with-martin/id1531450276?i=1000533284011\">iTunes</a>,\n<a href=\"https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5zb3VuZGNsb3VkLmNvbS91c2Vycy9zb3VuZGNsb3VkOnVzZXJzOjg3MjM0NTQxNi9zb3VuZHMucnNz/episode/dGFnOnNvdW5kY2xvdWQsMjAxMDp0cmFja3MvMTExMzg4MDIxNg?sa=X&ved=0CAUQkfYCahcKEwjo-NOKhdjyAhUAAAAAHQAAAAAQAQ\">Google Play</a>.</p>\n </li>\n <li>\n <p>Interview with the <a href=\"https://programming.love/\">Programming Love</a> podcast, in which we discuss\npeer-to-peer systems for collaboration, CRDTs and conflict resolution, undo and other challenges\nof collaboration software, my <a href=\"/2018/10/17/kafka-summit.html\">\u201cIs Kafka a Database?\u201d talk</a>, and more.\nRecorded 9 July 2020, published 19 October 2020.\n<a href=\"https://programming.love/programming-love-with-martin-kleppmann/\">Episode link</a>,\n<a href=\"https://podcasts.apple.com/us/podcast/programming-love-with-martin-kleppmann/id1518407590?i=1000495317576\">Apple Podcasts</a>,\n<a href=\"https://open.spotify.com/episode/7oc4i8h0LaFUx5l8ghJOOD\">Spotify</a>,\n<a href=\"https://www.stitcher.com/show/programming-love/episode/programming-love-with-martin-kleppmann-78699629\">Stitcher</a>.</p>\n </li>\n <li>\n <p>Interview with <a href=\"https://medium.com/csr-tales\">CSR (Computer Science Research) Tales</a>, in which we\ndiscuss formally proving the correctness of distributed systems, and verifying 
CRDTs in particular.\nPublished 30 July 2019.\n<a href=\"https://medium.com/csr-tales/csrtale-13-formal-verification-of-strong-eventual-consistency-1cc0af942e64\">Transcript</a>.</p>\n </li>\n <li>\n <p>Interview with the <a href=\"https://hydraconf.com/\">Hydra conference</a> on seeing through technology hype,\nthe CAP theorem, decentralisation, proving the correctness of CRDTs, event-based systems, and\npersonal growth. Recorded 3 June 2019, published 27 June 2019.\n<a href=\"https://medium.com/@hydraconference/the-big-interview-with-martin-kleppmann-figuring-out-the-future-of-distributed-data-systems-28a680d99ae6\">Transcript</a>.</p>\n </li>\n <li>\n <p>Interview with the <a href=\"https://codepodcast.com/\">Code Podcast</a>, in which we talked in depth about\nmy research on CRDTs, what they can and cannot do, and how we deal with time in distributed systems.\nRecorded 4 September 2018. Not sure this episode ever got published.</p>\n </li>\n <li>\n <p>Interview for an internal podcast at Booz Allen Hamilton. Recorded 9 April 2018. 
I don\u2019t think it\never got made publicly available.</p>\n </li>\n <li>\n <p>Interview with the <a href=\"https://www.investedinvestor.com/index\">Invested Investor podcast</a>, in which\nwe talked about my startup career before I got into academia, selling two companies, going\nthrough Y Combinator, moving to Silicon Valley, and all that jazz.\nRecorded 20 November 2017, published 24 January 2018.\n<a href=\"https://www.investedinvestor.com/articles/2018/1/23/martin-kleppmann\">Episode link</a>,\n<a href=\"https://audioboom.com/posts/6621031-martin-kleppmann-to-silicon-valley-and-back-again-with-two-exits-along-the-way\">Audioboom</a>,\n<a href=\"https://www.investedinvestor.com/martin-kleppmann-transcription\">Transcript</a>.</p>\n </li>\n <li>\n <p>First interview with <a href=\"https://softwareengineeringdaily.com/\">Software Engineering Daily</a>, in which\nwe talk about data-intensive applications, the CAP theorem, scalability, data models, data formats,\nthe challenges of distributed systems, and ideas for the future.\nRecorded 20 April 2017, published 2 May 2017.\nI am told that this was the most popular episode ever of this podcast!\n<a href=\"https://softwareengineeringdaily.com/2017/05/02/data-intensive-applications-with-martin-kleppmann/\">Episode link</a>,\n<a href=\"http://traffic.libsyn.com/sedaily/dataintensive_edited_fixed.mp3\">Download</a>,\n<a href=\"http://softwareengineeringdaily.com/wp-content/uploads/2017/05/SEDT15-Data-Intensive-Apps.pdf\">Transcript</a>.</p>\n </li>\n <li>\n <p>Second interview with <a href=\"https://softwareengineeringdaily.com/\">Software Engineering Daily</a>, in which\nwe talk about decentralisation, CRDTs, blockchains, consensus, concurrency, and how to make CRDTs\nwork in practice. 
Recorded 15 November 2017, published 8 December 2017.\n<a href=\"https://softwareengineeringdaily.com/2017/12/08/decentralized-objects-with-martin-kleppman/\">Episode link</a>,\n<a href=\"http://traffic.libsyn.com/sedaily/CRDTs_Decentralized_Files.mp3\">Download</a>,\n<a href=\"https://softwareengineeringdaily.com/wp-content/uploads/2017/12/SED477-CRDTs-Decentralized-Files.pdf\">Transcript</a>.</p>\n </li>\n <li>\n <p>Interview with <a href=\"https://advancetechmedia.org/\">Advance Tech Podcast</a>, in which we discuss a wide\nrange of topics: my past life in startups, security and decentralisation, event streaming systems,\ndata consistency, and formal verification.\nRecorded and published 27 October 2017.\n<a href=\"https://advancetechmedia.org/episode-008-martin-kleppmann/\">Episode link</a>.</p>\n </li>\n <li>\n <p>Interview with <a href=\"https://www.infoq.com/\">InfoQ</a> about log-based messaging, stream processing, and\nchange data capture. Recorded 24 April 2015, published 28 June 2015.\n<a href=\"https://www.infoq.com/interviews/kleppmann-data-infrastructure-logs-crdt/\">Video and transcript</a>.</p>\n </li>\n</ul>\n\n<p><strong>Update \u2014 later additions:</strong></p>\n\n<ul>\n <li><a href=\"https://nurkiewicz.com/70\">Short 4-minute episode on CRDTs</a> (April 2022)</li>\n <li><a href=\"https://www.youtube.com/watch?v=sMRpv0fBJLU\">Interview with Russian reading group {\u043c\u0435\u0436\u0434\u0443 \u0441\u043a\u043e\u0431\u043e\u043a} or {between brackets}</a> (July 2022)</li>\n</ul>",
+18
martinkl/2022_01_03_future-of-fusion-energy.html.json
···+"summary": "I give a five-star \u2b50\ufe0f\u2b50\ufe0f\u2b50\ufe0f\u2b50\ufe0f\u2b50\ufe0f rating to the following book: Jason Parisi and Justin Ball. The Future of Fusion Energy. World Scientific, 2019. ISBN 978-1-78634-749-7. Available from Amazon US, Amazon UK, and many other retailers. I came to this book looking for answers to questions such as: Is there still...",+"content": "<p>I give a five-star \u2b50\ufe0f\u2b50\ufe0f\u2b50\ufe0f\u2b50\ufe0f\u2b50\ufe0f rating to the following book:</p>\n\n<p>Jason Parisi and Justin Ball. <em>The Future of Fusion Energy</em>. World Scientific, 2019. ISBN 978-1-78634-749-7. Available from <a href=\"https://amzn.to/3sUypW6\">Amazon US</a>, <a href=\"https://amzn.to/3eHCpkB\">Amazon UK</a>, and many other retailers.</p>\n\n<p><img alt=\"Cover of the book 'The Future of Fusion Energy'\" src=\"/2022/01/fusion-book.jpg\" width=\"70%\"></p>\n\n<p>I came to this book looking for answers to questions such as: Is there still hope that a fusion power plant will ever be viable? If so, what exactly are the main obstacles on the way there? Why has progress in this field been so slow? And what should I make of the various startups claiming to have a fusion power plant just round the corner?</p>\n\n<p>The book provides an excellent, detailed answer to these questions, and more. It\u2019s the best kind of popular science book: you don\u2019t need a physics degree to read it, but it doesn\u2019t fob you off with oversimplified hand-waving either; all of the core arguments are convincingly backed up with evidence. There are some equations, but they are not necessary for understanding the book: as long as you know the difference between an electron, a proton, and a neutron, you\u2019ll be able to follow it.</p>\n\n<p>The book is clear about which constraints on fusion energy are fundamental limits of nature, and which constraints can be overcome with better technology. 
It offers optimism that fusion power is possible, highlighting the most promising paths to getting there, while remaining honest about the open problems that are yet to be solved. My take-away was that core problems, such as plasma turbulence, are very difficult, but likely solvable with more brainpower and experiments.</p>\n\n<p>The book also provides compelling arguments in favour of fusion: not only the obvious case of providing cheap energy without carbon emissions or seasonal variation, but also that compared to fission, there is much less risk that the technology will facilitate the proliferation of nuclear weapons.</p>\n\n<p>The need to transition away from fossil fuels is so urgent that we can\u2019t afford to wait for fusion \u2014 renewables and fission are still crucial. However, for the medium to long term, fusion offers optimism. From about 1970 to 2000, fusion research made very impressive progress, with the <a href=\"https://en.wikipedia.org/wiki/Lawson_criterion\">key performance metric</a> doubling every 1.8 years \u2014 faster even than Moore\u2019s law, and getting pretty close to the point where the fusion reaction is self-sustaining without having to continually feed in external energy (the dotted line labelled \u201cignition\u201d on the following diagram)!</p>\n\n<p><img alt=\"Figure 4.25 from the book. The x axis shows years from 1965 to 2030; the y axis shows the 'triple product' performance metric of various experimental reactors on a log scale. From about 1970 to 2000, progress follows a straight line on the log scale, i.e. exponential improvement. In the late 1990s it comes within less than an order of magnitude of 'ignition', which is where the fusion reaction becomes self-sustaining.\" src=\"/2022/01/fusion-progress.jpg\" width=\"100%\"></p>\n\n<p><i>Figure 4.25 from the book. 
Note the log scale on the y axis, so the straight line from 1970 to 2000 is actually exponential growth.</i></p>\n\n<p>Since 2000, progress has stalled, and the book argues that this is primarily because research in the field has been under-funded, not because of any particular fundamental limit. Of course, anybody can claim that more money will solve their problems, but in this case I\u2019m inclined to believe it. What changed in 2000 is that the fusion research community started putting all their eggs in one basket (<a href=\"https://en.wikipedia.org/wiki/ITER\">ITER</a>), because there wasn\u2019t the money for multiple baskets. More money would allow more parallel experiments to explore different approaches and see which ones work better.</p>\n\n<p>Investment in fusion research is small compared with investment in renewables and fission R&D, and tiny compared to things like agricultural and fossil fuel subsidies. Even if it\u2019s not guaranteed that fusion will work, given the potentially transformative nature of cheap, climate-friendly energy to human civilisation, it seems well worth putting some more money in it and giving it our best shot (in addition to faster ways of getting off fossil fuels, such as renewables, of course).</p>\n\n<p>I won\u2019t try to summarise the technical details of the book, but if you are interested in them, I can assure you that you will find this book worthwhile.</p>",
+18
martinkl/2022_10_12_verifying-distributed-systems-isabelle.html.json
···+"summary": "This post also appears on Larry Paulson\u2019s blog. We use distributed systems every day in the form of internet services. These systems are very useful, but also challenging to implement because networks are unpredictable. Whenever you send a message over the network, it is likely to arrive quite quickly, but...",+"content": "<p><em>This post also appears on <a href=\"https://lawrencecpaulson.github.io/2022/10/12/verifying-distributed-systems-isabelle.html\">Larry Paulson\u2019s blog</a>.</em></p>\n\n<p>We use distributed systems every day in the form of internet services. These systems are very useful, but also challenging to implement because networks are unpredictable. Whenever you send a message over the network, it is likely to arrive quite quickly, but it\u2019s possible that it might be delayed for a long time, or never arrive, or arrive several times.</p>\n\n<p>When you send a request to another process and don\u2019t receive a response, you have no idea what happened: was the request lost, or has the other process crashed, or was the response lost? Or maybe nothing was lost at all, but a message has simply been delayed and may yet arrive. There is no way of knowing what happened, because unreliable message-passing is the only way processes can communicate.</p>\n\n<p>Distributed algorithms work with this model of unreliable communication and build stronger guarantees on top of it. Examples of such stronger guarantees include database transactions and replication (maintaining copies of some data on multiple machines so that the data is not lost if one machine fails).</p>\n\n<p>Unfortunately, distributed algorithms are notoriously difficult to reason about, because they must uphold their guarantees regardless of the order in which messages are delivered, and even when some messages are lost or some processes crash. Many algorithms are very subtle, and informal reasoning is not sufficient for ensuring that they are correct. 
Moreover, the number of possible permutations and interleavings of concurrent activities quickly becomes too great for model-checkers to test exhaustively. For this reason, formal proofs of correctness are valuable for distributed algorithms.</p>\n\n<h2>Modelling a distributed system in Isabelle/HOL</h2>\n\n<p>In this blog post we will explore how to use the Isabelle/HOL proof assistant to formally verify a number of distributed algorithms. Isabelle/HOL does not have any built-in support for distributed computing, but fortunately it is quite straightforward to model a distributed system using structures that Isabelle/HOL provides: functions, lists, and sets.</p>\n\n<p>First, we assume each process (or <em>node</em>) in the system has a unique identifier, which could simply be an integer or a string. Depending on the algorithm, the set of process IDs in the system may be fixed and known, or unknown and unbounded (the latter is appropriate for systems where processes can join and leave over time).</p>\n\n<p>The execution of the algorithm then proceeds in discrete time steps. In each time step, an event occurs at one of the processes, and this event could be one of three things: receiving a message sent by another process, receiving user input, or the elapsing of a timeout.</p>\n\n\n<pre><code><span>datatype</span> <span>(</span><span>'proc</span><span>,</span> <span>'msg</span><span>,</span> <span>'val</span><span>)</span> <span>event</span>\n <span>=</span> <span>Receive</span> <span>(</span><span>msg_sender</span><span>:</span> <span>'proc</span><span>)</span> <span>(</span><span>recv_msg</span><span>:</span> <span>'msg</span><span>)</span>\n <span>|</span> <span>Request</span> <span>'val</span>\n <span>|</span> <span>Timeout</span></code></pre>\n\n<p>Triggered by one of these events, the process executes a function that may update its own state, and may send messages to other processes. 
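As a rough analogue (illustration only, not part of the post's Isabelle formalisation; the Python names are invented to mirror the type variables), the three-way event datatype could be sketched as:

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical Python mirror of the Isabelle 'event' datatype: an event
# is either a received message, a user request, or a timeout.
@dataclass(frozen=True)
class Receive:
    msg_sender: str    # plays the role of 'proc
    recv_msg: object   # plays the role of 'msg

@dataclass(frozen=True)
class Request:
    val: object        # plays the role of 'val

@dataclass(frozen=True)
class Timeout:
    pass

Event = Union[Receive, Request, Timeout]
```

Frozen dataclasses give the value-style equality that the Isabelle datatype constructors have.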
A message sent in one time step may be received at any future time step, or may never be received at all.</p>\n\n<p>Each process has a local state that is not shared with any other process. This state has a fixed initial value at the beginning of the execution, and is updated only when that process executes a step. One process cannot read the state of another process, but we can describe the state of the entire system as the collection of all the processes\u2019 individual states:</p>\n\n<p><img alt=\"Illustration of several processes executing steps, one at a time\" height=\"275\" src=\"/2022/10/time-steps.png\" width=\"550\"></p>\n\n<h2>Why a linear sequence of time steps is sufficient</h2>\n\n<p>Even though in reality processes may run in parallel, we do not need to model this parallelism since the only communication between processes is by sending and receiving messages, and we can assume that a process finishes processing one event before starting to process the next event. Every parallel execution is therefore equivalent to some linear sequence of execution steps. Other formalisations of distributed systems, such as the <a href=\"https://lamport.azurewebsites.net/tla/tla.html\">TLA+ language</a>, also use such a linear sequence of steps.</p>\n\n<p>We do not make any assumptions about which time step is executed by which process. It is possible that the processes fairly take turns to run, but it is equally possible for one process to execute a million steps while another process does nothing at all. By avoiding assumptions about process activity we ensure that the algorithm works correctly regardless of the timing in the system. 
For example, a process that is temporarily disconnected from the network is modelled simply by a process that does not experience any receive-message events, even while the other processes continue sending and receiving messages.</p>\n\n<p>In this model, a process crash is represented simply by a process that executes no more steps after some point in time; there is no need for a crash to be explicitly represented. If we want to allow processes to recover from a crash, we can add a fourth type of event that models a process restarting after a crash. When executing such a crash-recovery event, a process deletes any parts of its local state that are stored in volatile memory, but preserves those parts of its state that are in stable storage (on disk) and hence survive the crash.</p>\n\n<p>When reasoning about safety properties of algorithms, it is best not to assume anything about which process executes in which time step, since that ensures the algorithm can tolerate arbitrary message delays. If we wanted to reason about liveness (for example, that an algorithm eventually terminates), we would have to make some fairness assumptions, e.g. that every non-crashed process eventually executes a step. However, in our proofs so far we have only focussed on safety properties.</p>\n\n<p><img alt=\"System model: linear sequence of time steps; at each step, one process handles an event\" height=\"412\" src=\"/2022/10/system-model.png\" width=\"550\"></p>\n\n<p>We can now express a distributed algorithm as the <em>step function</em>, which takes three arguments: the ID of the process executing the current time step, the current local state of that process, and the event that has occurred (message receipt, user input, timeout, or crash recovery). 
The return value consists of the new state for that process, and a set of messages to send to other processes (each message tagged with the ID of the recipient process).</p>\n\n\n<pre><code><span>type_synonym</span><span> </span><span>(</span><span>'proc</span><span>,</span><span> </span><span>'state</span><span>,</span><span> </span><span>'msg</span><span>,</span><span> </span><span>'val</span><span>)</span><span> </span><span>step_func</span><span> </span><span>=</span><span>\n </span>\u2039'proc \u21d2 'state \u21d2 ('proc, 'msg, 'val) event \u21d2\n ('state \u00d7 ('proc \u00d7 'msg) set)\u203a</code></pre>\n\n<p>The current state of a process at one time step equals the new state after the previous step by the same process (or the initial state if there is no previous step). Assuming the step function is deterministic, we can now encode any execution of the system as a list of (processID, event) pairs indicating the series of events that occurred, and at which process they happened. The final state of the system is obtained by calling the step function one event at a time.</p>\n\n<h2>Defining what may happen</h2>\n\n<p>To prove a distributed algorithm correct, we need to show that it produces a correct result in every possible execution, i.e. for every possible list of (processID, event) pairs. But which executions are possible? There is only really one thing we can safely assume: if a message is received by a process, then that message must have been sent to that process. In other words, we assume the network does not fabricate messages out of thin air, and one process cannot impersonate another process. (In a public network where an attacker can inject fake packets, we would have to cryptographically authenticate the messages to ensure this property, but let\u2019s leave that out of scope for now.)</p>\n\n<p>Therefore, the only assumption we will make is that if a message is received in some time step, then it must have been sent in a previous time step. 
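To make the replay idea concrete, here is a minimal Python sketch (not the post's Isabelle code; the tuple event encoding and the toy counter algorithm are invented for illustration): fold a deterministic step function over a list of (process ID, event) pairs to obtain the final states and the set of messages sent.

```python
def run(step, init, events):
    """Replay an execution: apply the step function one event at a time."""
    states = dict(init)   # process ID -> local state
    msgs = set()          # (sender, recipient, message) triples sent so far
    for proc, event in events:
        new_state, sent = step(proc, states[proc], event)
        states[proc] = new_state
        msgs |= {(proc, recpt, msg) for (recpt, msg) in sent}
    return states, msgs

# Toy algorithm: count events; on a request, also send a message to "b".
def step(proc, state, event):
    kind = event[0]
    if kind == "request":
        return state + 1, {("b", ("got", event[1]))}
    if kind == "receive":
        return state + 1, set()
    return state, set()   # timeout: no state change, nothing sent

states, msgs = run(step, {"a": 0, "b": 0},
                   [("a", ("request", 7)),
                    ("b", ("receive", ("a", ("got", 7))))])
# states == {"a": 1, "b": 1}; msgs == {("a", "b", ("got", 7))}
```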
However, we will allow messages to be lost, reordered, or received multiple times. Let\u2019s encode this assumption in Isabelle/HOL.</p>\n\n<p>First, we define a function that tells us whether a single event is possible: <code>(valid_event evt proc msgs)</code> returns <code>true</code> if event <code>evt</code> is allowed to occur at process <code>proc</code> in a system in which <code>msgs</code> is the set of all messages that have been sent so far. <code>msgs</code> is a set of (sender, recipient, message) triples. We define that a <code>Receive</code> event is allowed to occur iff the received message is in <code>msgs</code>, and <code>Request</code> or <code>Timeout</code> events are allowed to happen anytime.</p>\n\n\n<pre><code><span>fun</span><span> </span><span>valid_event</span><span> </span><span>::</span><span> </span>\u2039('proc, 'msg, 'val) event \u21d2 'proc \u21d2\n ('proc \u00d7 'proc \u00d7 'msg) set \u21d2 bool\u203a<span>\n</span><span>where</span><span>\n </span>\u2039valid_event (Receive sender msg) recpt msgs =\n ((sender, recpt, msg) \u2208 msgs)\u203a<span> </span><span>|</span><span>\n </span>\u2039valid_event (Request _) _ _ = True\u203a<span> </span><span>|</span><span>\n </span>\u2039valid_event Timeout _ _ = True\u203a</code></pre>\n\n<p>Next, we define the set of all possible event sequences. For this we use an inductive predicate in Isabelle: <code>(execute step init procs events msgs states)</code> returns true if <code>events</code> is a valid sequence of events in an execution of the algorithm where <code>step</code> is the step function, <code>init</code> is the initial state of each process, and <code>procs</code> is the set of all processes in the system (which might be infinite if we want to allow any number of processes). 
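In plain Python, the same predicate might be sketched as follows (the tuple encoding of events as ("receive", sender, msg), ("request", v), and ("timeout",) is an assumption made for illustration):

```python
# Sketch of the valid_event predicate under a hypothetical tuple
# encoding of events: ("receive", sender, msg), ("request", v), ("timeout",).
def valid_event(event, proc, msgs):
    # A Receive is valid only if that exact message was previously sent
    # to this recipient; Request and Timeout may occur at any time.
    if event[0] == "receive":
        _, sender, msg = event
        return (sender, proc, msg) in msgs
    return event[0] in ("request", "timeout")
```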
The last two arguments keep track of the execution state: <code>msgs</code> is the set of all messages sent so far, and <code>states</code> is a map from process ID to the state of that process.</p>\n\n\n<pre><code><span>inductive</span><span> </span><span>execute</span><span> </span><span>::</span><span>\n </span>\u2039('proc, 'state, 'msg, 'val) step_func \u21d2 ('proc \u21d2 'state) \u21d2\n 'proc set \u21d2 ('proc \u00d7 ('proc, 'msg, 'val) event) list \u21d2\n ('proc \u00d7 'proc \u00d7 'msg) set \u21d2 ('proc \u21d2 'state) \u21d2 bool\u203a<span>\n</span><span>where</span><span>\n </span>\u2039execute step init procs [] {} init\u203a<span> </span><span>|</span><span>\n </span>\u2039\u27e6execute step init procs events msgs states;\n proc \u2208 procs;\n valid_event event proc msgs;\n step proc (states proc) event = (new_state, sent);\n events' = events @ [(proc, event)];\n msgs' = msgs \u222a {m. \u2203(recpt, msg) \u2208 sent.\n m = (proc, recpt, msg)};\n states' = states (proc := new_state)\n \u27e7 \u27f9 execute step init procs events' msgs' states'\u203a</code></pre>\n\n<p>This definition states that the empty list of events is valid when the system is in the initial state and no messages have been sent. Moreover, if <code>events</code> is a valid sequence of events so far, and <code>event</code> is allowed in the current state, then we can invoke the step function, add any messages it sends to <code>msgs</code>, update the state of the appropriate process, and the result is another valid sequence of events.</p>\n\n<p>And that\u2019s all we need to model the distributed system!</p>\n\n<h2>Proving an algorithm correct</h2>\n\n<p>Now we can take some algorithm (defined by its step function and initial state) and prove that for all possible lists of events, some property <em>P</em> holds. Since we do not fix a maximum number of time steps, there is an infinite number of possible lists of events. 
But that\u2019s not a problem, since we can use induction over lists to prove <em>P</em>.</p>\n\n<p><img alt=\"The Isabelle/HOL induction principle over lists\" height=\"292\" src=\"/2022/10/induction.png\" width=\"550\"></p>\n\n<p>We use the <code>List.rev_induct</code> induction rule in Isabelle/HOL. It requires showing that:</p>\n\n<ol>\n <li>the property <em>P</em> is true for the empty list (i.e. for a system in the initial state, which has not executed any time steps); and</li>\n <li>if the property <em>P</em> is true for some execution, and we add one more time step to the end of the execution, then <em>P</em> still holds after that time step.</li>\n</ol>\n\n<p>In other words, we prove that <em>P</em> is an invariant over all possible states of the whole system. In Isabelle, that proof looks roughly like this (where <code>step</code>, <code>init</code>, and <code>procs</code> are appropriately defined):</p>\n\n\n<pre><code><span>theorem</span><span> </span><span>prove_invariant</span><span>:</span><span>\n </span><span>assumes</span><span> </span>\u2039execute step init procs events msgs states\u203a<span>\n </span><span>shows</span><span> </span>\u2039some_invariant states\u203a<span>\n</span><span>using</span><span> </span><span>assms</span><span> </span><span>proof</span><span> </span><span>(</span><span>induction</span><span> </span><span>events</span><span> </span><span>arbitrary</span><span>:</span><span> </span><span>msgs</span><span> </span><span>states</span><span>\n </span><span>rule</span><span>:</span><span> </span><span>List</span><span>.</span><span>rev_induct</span><span>)</span><span>\n </span><span>case</span><span> </span><span>Nil</span><span>\n </span><span>then</span><span> </span><span>show</span><span> </span>\u2039some_invariant states\u203a<span> </span><span>sorry</span><span>\n</span><span>next</span><span>\n </span><span>case</span><span> </span><span>(</span><span>snoc</span><span> </span><span>event</span><span> 
</span><span>events</span><span>)</span><span>\n </span><span>then</span><span> </span><span>show</span><span> </span><span>?</span><span>case</span><span> </span><span>sorry</span><span>\n</span><span>qed</span></code></pre>\n\n<p>The real challenge in verifying distributed algorithms is to come up with the right invariant that is both true and also implies the properties you want your algorithm to have. Unfortunately, designing this invariant has to be done manually. However, once you have a candidate invariant, Isabelle is very helpful for checking whether it is correct and whether it is strong enough to meet your goals.</p>\n\n<p>For more detail on how to prove the correctness of a simple consensus algorithm in this model, I recorded a <a href=\"https://www.youtube.com/watch?v=Uav5jWHNghY\">2-hour video lecture</a> that runs through a demo from first principles (no prior Isabelle experience required). The <a href=\"https://gist.github.com/ept/b6872fc541a68a321a26198b53b3896b\">Isabelle code of the demo</a> is also available.</p>\n\n\n\n<p>If you want to work on this kind of thing, I will soon be looking for a PhD student to work with me on formalising distributed algorithms in Isabelle, based at <a href=\"https://www.in.tum.de/en/in/cover-page/\">TU Munich</a>. If this sounds like something you want to do, please <a href=\"https://martin.kleppmann.com/contact.html\">get in touch</a>!</p>",
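For intuition, the post's execution model (the `valid_event` check plus the `execute` predicate, replayed as a fold over the event list) can be sketched as executable Python. The names and the toy "ping" algorithm below are illustrative assumptions, not part of the post's Isabelle formalisation:

```python
# Executable sketch of the execution model described above
# (illustrative only; the actual formalisation is in Isabelle/HOL).
from typing import NamedTuple

class Receive(NamedTuple):   # a message delivered to a process
    sender: str
    msg: str

class Request(NamedTuple):   # an external request from a user
    val: str

class Timeout(NamedTuple):   # a timer firing
    pass

def valid_event(event, proc, msgs):
    # A Receive is only possible if the message was previously sent to
    # this process; Request and Timeout may occur at any time.
    if isinstance(event, Receive):
        return (event.sender, proc, event.msg) in msgs
    return True

def execute(step, init, procs, events):
    # Replay a list of (process, event) pairs, threading the set of
    # sent messages and the per-process states, like the inductive
    # predicate does one rule application at a time.
    msgs, states = set(), {p: init(p) for p in procs}
    for proc, event in events:
        assert proc in procs and valid_event(event, proc, msgs)
        new_state, sent = step(proc, states[proc], event)
        msgs |= {(proc, recpt, m) for recpt, m in sent}
        states[proc] = new_state
    return msgs, states

# A trivial algorithm: on a Request, "a" pings "b"; receivers log messages.
def step(proc, state, event):
    if isinstance(event, Request) and proc == "a":
        return state, {("b", "ping")}
    if isinstance(event, Receive):
        return state + [event.msg], set()
    return state, set()

msgs, states = execute(step, lambda p: [], {"a", "b"},
                       [("a", Request("go")), ("b", Receive("a", "ping"))])
```

An invalid execution, e.g. one starting with `("b", Receive("a", "pong"))`, fails the `valid_event` assertion, mirroring the restriction to executions in which every received message was previously sent.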
+18
martinkl/2024_01_04_year-in-review.html.json
···+"summary": "A lot has happened in the last year, so I thought it would be good to write up a review. My biggest change in 2023 was that my wife and I had a baby! This has brought a mixture of joys and frustrations, but overall it has been very good....",+"content": "<p>A lot has happened in the last year, so I thought it would be good to write up a review.</p>\n\n<p>My biggest change in 2023 was that my wife and I had a baby! This has brought a mixture of joys and frustrations, but overall it has been very good. I took three months of full-time parental leave after the birth, and since going back to work I\u2019ve been sharing the parenting responsibilities with my partner. Family has therefore been my top priority, but I won\u2019t talk much about family things in this post, since I prefer to keep it private. Lots of work things happened as well:</p>\n\n<h2>New job!</h2>\n\n<p>As of January 2024 I have a new job as <a href=\"https://www.cst.cam.ac.uk/news/new-associate-professor-computer-security-and-privacy\">Associate Professor in Cambridge</a>! Unlike all my previous academic positions, which were all fixed-term contracts of a few years, this is a permanent position. A huge number of people apply for this sort of position, and so I feel very fortunate that my colleagues had faith in my work and decided to choose me.</p>\n\n<p>(Technically, I have to pass a 5-year probation period before the position becomes permanent, but I\u2019m told that this is mostly a formality, and nothing like the problematic tenure-track system in the US.)</p>\n\n<p>I\u2019ve arranged to work part-time (65%) for the first year on the job, so that I can do a greater share of the parenting duties until our child goes to nursery (which we\u2019re hoping will be in approximately a year\u2019s time). Partly for this reason I\u2019ve not been given any teaching duties for this academic year. 
However, I\u2019ve been asked to offer a new master\u2019s module for next year, which will take some effort to prepare. I\u2019m planning to do it on cryptographic protocols.</p>\n\n<p>I had only started my previous job at TU Munich in October 2022, so it\u2019s a bit strange to leave again after just over a year. However, Cambridge is better for us for family reasons, and Cambridge was offering a permanent position whereas my job at TU Munich was fixed-term, so it made sense to move back to Cambridge.</p>\n\n<p>The biggest downside of moving is that I have lost the <a href=\"https://portal.volkswagenstiftung.de/search/projectDetails.do?siteLanguage=en&ref=9B116\">grant</a> that brought me to Munich in the first place (since that grant requires me to be at a German university). That\u2019s a shame, because it was a lot of money \u2013 enough for two PhD students and a postdoc for several years. One of my first activities in Cambridge will therefore be to start applying for new grants. \u00c7\u2019est la vie (acad\u00e9mique).</p>\n\n<h2>Research papers and projects</h2>\n\n<p>I had one big paper acceptance in 2023: our article \u201c<a href=\"https://arxiv.org/abs/2311.10825\">Pudding: Private User Discovery in Anonymity Networks</a>\u201d (with <a href=\"https://www.linkedin.com/in/cerenkocaogullar/\">Ceren Kocao\u011fullar</a>, <a href=\"https://www.danielhugenroth.com/\">Daniel Hugenroth</a>, and <a href=\"https://www.cl.cam.ac.uk/~arb33/\">Alastair Beresford</a>) was accepted at the <a href=\"https://sp2024.ieee-security.org/\">IEEE Symposium on Security and Privacy</a>, which will take place in May 2024. 
This paper solves a problem with the <a href=\"https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/piotrowska\">Loopix</a>/<a href=\"https://nymtech.net/\">Nym</a> anonymity network: previously you had to somehow find out someone\u2019s public key in order to contact them on the network, and our work makes it possible to contact people via a short, friendly username instead (while preserving the security properties of the anonymity network).</p>\n\n<p><a href=\"https://mattweidner.com/\">Matthew Weidner</a> and I went through several iterations of our paper \u201c<a href=\"https://arxiv.org/abs/2305.00583\">The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing</a>\u201d. The latest version is currently under submission at a journal, and a preprint is <a href=\"https://arxiv.org/abs/2305.00583\">available on arxiv</a>. This paper tackles a problem in many collaborative text editing algorithms: when different users insert text at the same place in a document (especially while working offline), the algorithms may mix up text from the different users. Our paper shows how to solve this problem.</p>\n\n<p>The paper \u201c<a href=\"https://www.inkandswitch.com/upwelling/\">Upwelling: Combining Real-time Collaboration with Version Control for Writers</a>\u201d (with <a href=\"https://okdistribute.xyz/\">Rae McKelvey</a>, <a href=\"https://jenson.org/\">Scott Jenson</a>, <a href=\"https://bumble.blue/\">Eileen Wagner</a>, and <a href=\"https://bcook.ca/\">Blaine Cook</a>) appeared on the <a href=\"https://www.inkandswitch.com/\">Ink & Switch website</a> in March. We also submitted it to an academic conference, but it was rejected, so we\u2019re just keeping it as a web article. 
The paper describes a prototype rich text editor that combines Google-Docs-style real-time collaboration with Git-style version control features (branching, merging, diffing, and editing history).</p>\n\n<p>My master\u2019s student <a href=\"https://liangrunda.com/\">Liangrun Da</a> published \u201c<a href=\"https://arxiv.org/abs/2311.14007\">Extending JSON CRDT with move operations</a>\u201d, a report from a research project he did with me in 2023. The goal of this project was to develop a move operation for Automerge, which could be used to reorder items in a list, or to move a subtree of a JSON document to a different location in the tree. The algorithm is not yet fully implemented within Automerge, but we\u2019re hoping to get there this year.</p>\n\n<p>My other master\u2019s student Leo Stewen\u2019s report \u201c<a href=\"https://github.com/TUM-DSE/research-work-archive/blob/main/archive/2023/summer/docs/gr_stewen_undo_and_redo_support_for_replicated_registers.pdf\">Undo and Redo Support for Replicated Registers</a>\u201d describes another algorithm prototype for Automerge \u2013 this one aiming to add support for undo and redo. This also turns out to not be entirely straightforward, especially when you consider the interaction with all the other features of Automerge.</p>\n\n<h2>Industrial collaborations: Automerge and Bluesky</h2>\n\n<p>I\u2019ve continued my long-standing collaboration with <a href=\"https://www.inkandswitch.com/\">Ink & Switch</a>, in particular around the <a href=\"https://automerge.org/\">Automerge</a> open-source project. <a href=\"https://www.memoryandthought.me/\">Alex Good</a>, who is funded by <a href=\"https://github.com/sponsors/automerge\">Automerge sponsors</a> and my <a href=\"https://www.patreon.com/martinkl\">Patreon supporters</a>, works full-time to maintain the project for our industrial users, while several others at Ink & Switch and in the open source community have been making valuable contributions. 
I\u2019ve moved into an advisory role and haven\u2019t been writing any actual code for the project lately.</p>\n\n<p>The two biggest milestones for Automerge in 2023 were:</p>\n\n<ul>\n <li>The release of <a href=\"https://automerge.org/blog/automerge-2/\">Automerge 2.0</a>, the rewrite of the original JavaScript code base in Rust. This has enabled huge performance improvements, and also made Automerge available on many more platforms: we compile Rust to WebAssembly and have a TypeScript/JavaScript <a href=\"https://automerge.org/docs/the_js_packages/\">wrapper</a> for web browsers and node, but we can also compile Rust to a native library and call it from <a href=\"https://github.com/automerge/automerge/tree/main/rust/automerge-c\">C</a>, <a href=\"https://github.com/automerge/automerge-go\">Go</a>, <a href=\"https://automerge.org/automerge-swift/documentation/automerge/\">Swift/iOS</a>, <a href=\"https://github.com/automerge/automerge-java\">Java/Android</a>, and others. The idea is to implement the hairy, performance-critical CRDT logic once in Rust, and then to have wrapper APIs for all common programming languages that all share the same data format and interoperate.</li>\n <li>Whereas Automerge itself is only an in-memory data structure library with no I/O, <a href=\"https://automerge.org/blog/2023/11/06/automerge-repo/\">Automerge-Repo</a> now provides out-of-the-box integrations with persistent storage (e.g. IndexedDB in a browser, and the filesystem in native apps) and with network protocols (e.g. WebSocket). Moreover, Automerge-Repo provides <a href=\"https://github.com/automerge/automerge-repo\">integrations</a> with frontend libraries (e.g. React and Svelte). 
Previously app developers had to figure out all of this for themselves, so Automerge-Repo is a huge step forward in terms of making it easier to build applications on top of Automerge.</li>\n</ul>\n\n<p>My other ongoing industrial collaboration is with <a href=\"https://blueskyweb.org/\">Bluesky</a>, a decentralised social network/protocol. Bluesky has had a tremendously successful year: launched into private beta in early 2023, it has grown to <a href=\"https://bsky.jazco.dev/stats\">3 million users</a> by the end of the year. I\u2019ve been advising the team since the beginning (they started development about two years ago) on topics around scalability, protocol design, architecture, and security.</p>\n\n<p>I also helped them write a research paper about the Bluesky architecture and comparing it to other decentralised social protocols; we\u2019ll be publishing that paper sometime in the next few months. I personally think Bluesky and the underlying <a href=\"https://atproto.com/\">AT Protocol</a> do many things much better than the alternatives, such as Mastodon/ActivityPub, and they have a real chance of becoming a mainstream Twitter successor. Bluesky wants to come out of private beta and open up public federation early this year; it\u2019s going to be an exciting time.</p>\n\n<p>I still have some Bluesky invitation codes to give out. If I know you personally, feel free to send me an email and I\u2019ll send you a code. (Sorry, I don\u2019t have enough codes to give out to people I don\u2019t know.)</p>\n\n<h2>Events, conferences, workshops</h2>\n\n<p>I co-organised three events last year:</p>\n\n<ul>\n <li>The first <a href=\"https://soft.vub.ac.be/dare23/\">summer school on Distributed and Replicated Environments</a> (DARE) in Brussels, Belgium. We had 40 master\u2019s and PhD students from all over Europe, and a few from further afield as well. 
I gave four hours of lectures (plus lots more time spent in informal conversations), and I think we succeeded in getting the students excited about research in distributed systems. One of the attending master\u2019s students is now applying to do a PhD with me in Cambridge.</li>\n <li>An <a href=\"https://lu.ma/localfirstswunconf-stlouis\">unconference on local-first software</a> in St. Louis, MO, USA the day after Strange Loop. We had space for about 100 people and the event sold out surprisingly quickly. Sadly I couldn\u2019t be there because I caught covid at the summer school, but my co-organisers told me that there were excellent discussions among the attendees. Notes and photos from the event have been collected in <a href=\"https://github.com/LoFiUnconf/stlouis2023\">this Git repository</a>.</li>\n <li>The <a href=\"https://2023.splashcon.org/home/plf-2023\">Programming Local-First Software</a> (PLF) workshop at SPLASH 2023 in Cascais, Portugal. This event aims to bring together industrial practitioners with researchers in the area of programming language design to discuss ways of improving how local-first software is developed. 
The event included a keynote by <a href=\"https://github.com/expede\">Brooklyn Zelenka</a>, and we had 15 submissions from which we were able to build an interesting and varied programme of talks.</li>\n</ul>\n\n<p>I also gave several public talks in 2023:</p>\n\n<ul>\n <li>At the <a href=\"/2023/06/29/goto-amsterdam.html\">GOTO Amsterdam conference</a> in June (<a href=\"https://www.youtube.com/watch?v=VJ_GeNfZXrQ\">recording</a>) I gave a talk introducing Automerge and local-first software to an audience of industrial software engineers, and I repeated the talk at the <a href=\"/2023/06/28/amsterdam-elixir.html\">Amsterdam Elixir meetup</a>.</li>\n <li>At the <a href=\"/2023/09/22/strange-loop.html\">Strange Loop conference</a> in September (<a href=\"https://www.youtube.com/watch?v=Mr0a5KyD6BU\">recording</a>) I spoke about the research we\u2019ve done over the last few years on collaborative text editing, especially bringing together real-time collaboration with Git-style version control: diffing, branching, and merging (featuring <a href=\"https://www.inkandswitch.com/upwelling/\">Upwelling</a>, <a href=\"https://automerge.org/\">Automerge</a>, and <a href=\"https://www.inkandswitch.com/peritext/\">Peritext</a>). 
I had to give the talk remotely and I couldn\u2019t see or hear the room, but I\u2019m told that it was full, with standing room only.</li>\n <li>At the <a href=\"/2023/10/19/kastel-distinguished-lecture.html\">KASTEL Distinguished Lecture Series</a> in Karlsruhe, Germany (<a href=\"https://www.youtube.com/watch?v=VKHBRU3cKXw\">recording</a>) I spoke about the security challenges that arise when you try making collaboration software peer-to-peer, and you have to make it work even though you don\u2019t know who you can trust.</li>\n <li>At the <a href=\"/2023/09/27/acm-tech-talks.html\">ACM Tech Talks</a> series (<a href=\"https://www.youtube.com/watch?v=VJ_GeNfZXrQ\">recording</a>) I gave a repeat of my GOTO Amsterdam talk, and there was a lively Q&A session afterwards with lots of good questions. There was a good turnout: around 400 people watched the talk live.</li>\n <li>At the IETF <a href=\"https://datatracker.ietf.org/meeting/118/session/dinrg\">Decentralization of the Internet Research Group</a> I gave a talk about local-first software. My collaborators and I have been discussing that we would like to eventually develop open standards for the protocols around local-first software (right now it\u2019s still too early, so this would be something to consider once they have matured a bit). I\u2019m hoping that this talk might be the beginning of a process of engagement that could eventually lead to such a standardisation effort.</li>\n</ul>\n\n<h2>Designing Data-Intensive Applications</h2>\n\n<p><a href=\"https://dataintensive.net/\">My book</a> continues to sell well, with now over 230,000 copies sold, and reviews continue to be very positive. However, it is gradually showing its age \u2013 it was published in 2017, but I wrote the first few chapters around 2014/15, so they are now almost a decade old. 
Moreover, I have learnt a lot in the meantime, and there are quite a few things in the book that I would now say differently.</p>\n\n<p>For that reason, I have been working on a second edition that brings the book up-to-date. However, my progress has been very slow, as I\u2019ve had to fit in the research and writing for the second edition alongside my various other work and family commitments. I actually already agreed to do the second edition with O\u2019Reilly in 2021, and the full manuscript was supposed to be complete by January 2023. Well\u2026 that didn\u2019t quite happen as planned.</p>\n\n<p>In fact, I only properly started writing in 2023, and so far I\u2019ve only completed the revision of the first three chapters. I\u2019m much happier with the revised version, but it takes a lot of time to do such thorough revisions, so I\u2019m not even going to try to give an updated completion date. I\u2019d much rather take the time to make it good, however long it takes, rather than rush to meet some artificial deadline. And I\u2019m in the lucky situation where I can get away with such a stance.</p>\n\n<p>In case you\u2019re wondering what\u2019s changing in the second edition: I\u2019m keeping the high-level structure and topics quite similar, but I\u2019m rewriting a lot of the actual text to be easier to follow and more nuanced. I also collected a lot of reference material over the years (books, papers, blog posts, etc.); a large part of my time is spent reading that material and incorporating it into the narrative.</p>\n\n<p>The biggest technological change since the first edition is probably that hosted cloud services are now a much bigger thing than they were a decade ago, and the resulting rise of \u201ccloud-native\u201d architecture. 
Other things: NoSQL as a buzzword is dead (though many of its ideas have been absorbed into mainstream systems), MapReduce is dead (replaced by cloud data warehouses, data lakes, and things like Spark), and GDPR arrived (though the degree to which it is influencing data systems architecture is still somewhat open).</p>\n\n<h2>Local-first is taking off</h2>\n\n<p>Together with some colleagues from Ink & Switch I coined the term <a href=\"https://www.inkandswitch.com/local-first/\">\u201clocal-first\u201d</a> in 2019 to describe the type of software we wanted to enable with Automerge and related projects. Initially the term was mostly used by ourselves and our direct collaborators, but in 2023 we have seen the idea catching on much more widely:</p>\n\n<ul>\n <li>More people have been writing about local-first, including <a href=\"https://www.wired.com/story/the-cloud-is-a-prison-can-the-local-first-software-movement-set-us-free/\">WIRED magazine in August</a>, a <a href=\"https://bricolage.io/some-notes-on-local-first-development/\">blog post by Kyle Mathews</a> in September, and a <a href=\"https://lwn.net/Articles/902463/\">LWN.net article</a> last year. These articles capture some of the excitement surrounding local-first software.</li>\n <li>A website and Discord server on <a href=\"https://localfirstweb.dev/\">local-first web development</a> was set up by members of the community in February 2023, and now has over 1,600 members. To date this community has organised ten online meetups, each with several speakers who are working in the area.</li>\n <li>Besides the aforementioned <a href=\"https://lu.ma/localfirstswunconf-stlouis\">local-first unconference</a> in St. 
Louis and the <a href=\"https://2023.splashcon.org/home/plf-2023\">programming local-first workshop</a> in Cascais that I co-organised, there were also in-person local-first meetups in <a href=\"https://lu.ma/6mux94ll\">Berlin</a> and <a href=\"https://guild.host/events/localfirst-software-dkh284\">London</a> that were organised independently by community members.</li>\n <li>Local-first appears prominently in the October 2022 edition of the <a href=\"https://www.thoughtworks.com/content/dam/thoughtworks/documents/radar/2022/10/tr_technology_radar_vol_27_en.pdf\">Thoughtworks Technology Radar</a>, an influential publication in enterprise software development circles.</li>\n <li>We\u2019ve seen at least a dozen products and startups advertising themselves as \u201clocal-first\u201d on their websites, including for example <a href=\"https://anytype.io/\">Anytype</a>, <a href=\"https://fission.codes/\">Fission</a>, <a href=\"https://replicache.dev/\">Replicache</a>, <a href=\"https://mycelial.com/\">Mycelial</a>, <a href=\"https://electric-sql.com/\">ElectricSQL</a>, <a href=\"https://odd.dev/\">Odd.dev</a>, <a href=\"https://tinybase.org/\">TinyBase</a>, <a href=\"https://aphrodite.sh/\">Aphrodite</a>, <a href=\"https://dxos.org/\">DXOS</a>, <a href=\"https://github.com/orbitdb/orbit-db\">OrbitDB</a>, <a href=\"https://p2panda.org/\">p2panda</a>, <a href=\"https://socketsupply.co/guides/\">Socket Supply</a>, and <a href=\"https://kde.org/for/travelers/\">KDE Itinerary</a>.</li>\n <li>In academia the idea is also catching on: our original local-first article now has around <a href=\"https://scholar.google.com/scholar?cites=792121589490097600&as_sdt=2005&sciodt=0,5&hl=en\">100 citations</a> according to Google Scholar, 15 of which even use the term \u201clocal-first\u201d in the paper title.</li>\n</ul>\n\n<p>It\u2019s exciting that so many people are buying into the idea. 
Over the coming years I hope we will continue to grow this community, and realise the advantages of the local-first approach in a broader range of software.</p>",
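The text-interleaving problem tackled by the Fugue paper mentioned above can be reproduced with a toy sketch of a naive fractional-position list CRDT (a Logoot-style strawman of my own construction, not Automerge's or Fugue's actual algorithm): each character gets the midpoint position between its neighbours, with ties broken by user ID, and concurrent insertions at the same spot end up interleaved.

```python
# Naive fractional-position text CRDT (illustrative strawman).

def type_text(text, user, left=0.0, right=1.0):
    # Assign each typed character a position between left and right.
    chars = []
    for ch in text:
        pos = (left + right) / 2
        chars.append((pos, user, ch))
        left = pos  # the next character goes just after this one
    return chars

def merge(*edits):
    # Merging concurrent edits is just a sort on (position, user).
    return "".join(ch for _, _, ch in sorted(c for e in edits for c in e))

# Two users concurrently type at the head of an empty document:
doc = merge(type_text("HI", "alice"), type_text("NO", "bob"))
# doc == "HNIO": the letters interleave, instead of one user's word
# appearing intact before the other's ("HINO" or "NOHI").
```

Because both users deterministically pick the same midpoints, the tiebreak on user ID alternates their characters, which is exactly the kind of garbled merge Fugue is designed to avoid.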
+18
martinkl/2024_07_05_pudding-user-discovery-anonymity-networks.html.json
···+"link": "http://martin.kleppmann.com/2024/07/05/pudding-user-discovery-anonymity-networks.html",+"summary": "I\u2019d like to introduce an exciting new research paper I worked on! It\u2019s about a system called Pudding, and it was presented by Ceren at the IEEE Symposium on Security and Privacy, one of the top academic conferences on computer security, in May. Daniel and Alastair also worked on this...",+"content": "<p>I\u2019d like to introduce an exciting new research paper I worked on! It\u2019s about a system called\n<a href=\"https://arxiv.org/abs/2311.10825\">Pudding</a>, and it was presented by\n<a href=\"https://twitter.com/ckocaogullar1\">Ceren</a> at the\n<a href=\"https://sp2024.ieee-security.org\">IEEE Symposium on Security and Privacy</a>, one of the top academic\nconferences on computer security, in May. <a href=\"https://www.danielhugenroth.com/\">Daniel</a> and\n<a href=\"https://www.cl.cam.ac.uk/~arb33/\">Alastair</a> also worked on this project. Ceren\u2019s presentation\n<a href=\"https://www.youtube.com/watch?v=EEUdslTwYZ8\">is now available</a>:</p>\n\n\n\n<p>Let me briefly explain what the paper is about.</p>\n\n<p>Anonymity systems allow internet users to hide who is communicating with whom \u2013 for example, think\na whistleblower talking to a journalist, or a group of activists organising protests against their\nrepressive regime. <a href=\"https://www.torproject.org/\">Tor</a> is the most popular anonymity network;\n<a href=\"https://nymtech.net/\">Nym</a> is a more recent design with stronger security (and incidentally, one of\nthe better cryptocurrency applications I\u2019ve seen). Nym is based on a research system called\n<a href=\"https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/piotrowska\">Loopix</a>.</p>\n\n<p>The trouble with these anonymity networks is that if you want to contact someone, you need to know\ntheir public key, and sometimes a bunch of other information as well. 
In the case of Tor, this is\nencoded in a \u201c<a href=\"https://community.torproject.org/onion-services/\">onion service</a>\u201d URL, which is an\nunreadable sequence of random letters and numbers (sometimes service operators use brute force to\npick a public key so that the first few letters of the hostname spell out the name of the service,\nbut the rest remains random). In Nym, it\u2019s an\n<a href=\"https://nymtech.net/docs/architecture/addressing-system.html\">even longer base58 string</a>. How are\nusers supposed to find the correct key for the person they\u2019re trying to contact? If they send the\nkey via a non-anonymous channel or query a server, they leak the information of who is talking to\nwho, which defeats the entire purpose of the anonymity network.</p>\n\n<p>Having to manually exchange public keys is a huge step backwards in terms of usability. A big part\nof why WhatsApp and Signal succeeded in bringing end-to-end encryption to billions of users, while\nPGP failed, is that today\u2019s secure messaging apps allow you to find your friends using only a phone\nnumber or some other friendly username, while PGP encouraged\n<a href=\"https://en.wikipedia.org/wiki/Key_signing_party\">weird, nerdy, in-person meetings</a> for exchanging keys.</p>\n\n<p>Pudding brings friendly usernames to the Loopix/Nym anonymity networks, so that users don\u2019t have to\ndeal with long random strings. We used email addresses rather than phone numbers, for reasons\nexplained in the paper, but the idea is the same. The challenge is providing the username lookup in\na way that doesn\u2019t leak who is talking to who. 
In fact, Pudding even goes further and hides whether\na given username is registered to the network or not.</p>\n\n<p>If you\u2019re wondering how this work on anonymity relates to my other work on\n<a href=\"https://crdt.tech/\">CRDTs</a>/<a href=\"https://www.inkandswitch.com/local-first/\">local-first software</a>: I see\nanonymity networks as one possible transport layer on top of which we can build decentralised\ncollaboration software. Not all collaboration apps will need the metadata privacy of an anonymity\nnetwork, but it\u2019s nice to be able to support high-risk users, such as investigative journalists, who\ndo have strong security needs.</p>\n\n<p>If you want to learn more, please <a href=\"https://www.youtube.com/watch?v=EEUdslTwYZ8\">watch the talk</a>,\n<a href=\"https://arxiv.org/abs/2311.10825\">read the paper</a>, or\n<a href=\"https://github.com/ckocaogullar/pudding-protocol\">check out the source code</a>! Just note that the\nimplementation is a research prototype and not fit for production use. We\u2019re hoping that Nym might\nofficially adopt something like Pudding in the future.</p>",
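The metadata leak described above (looking up a contact's key via a server) can be made concrete with a toy key-directory sketch. All names here are hypothetical, and Pudding's actual protocol is designed precisely to avoid both leaks the comments point out:

```python
# Strawman key directory: lookups work, but the operator learns the
# social graph -- the very metadata an anonymity network should hide.

class NaiveKeyDirectory:
    def __init__(self, keys):
        self.keys = keys        # username -> public key
        self.query_log = []     # everything the operator observes

    def lookup(self, requester, username):
        # The server sees who is asking and whom they want to reach.
        self.query_log.append((requester, username))
        return self.keys.get(username)

directory = NaiveKeyDirectory({"journalist@example.org": "pk_journalist"})

key = directory.lookup("whistleblower", "journalist@example.org")
missing = directory.lookup("whistleblower", "nobody@example.org")
# The log now records that the whistleblower wants to contact the
# journalist, and `missing` is None, so responses also reveal whether
# a username is registered at all -- two things Pudding hides.
```

Even with an honest operator, this query log is a single point whose compromise exposes who intends to talk to whom, which is why a plain directory service defeats the purpose of the anonymity network.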
+2
-2
martinkl/metadata.json
+18
mort/blog_21st-century-ide_.json
···+"summary": "<p>I finally decided to sit down and get the shiny new <a href=\"http://kiwi.iuwt.fr/~asmanur/blog/merlin/\">merlin</a> mode for OCaml\nworking with my emacs configuration. Basically, really rather simple in the end\nalthough (in the usual fashion!) I did end up spending considerable time\ntweaking various other customisations\u2026</p>\n<p>Most of the information below is based on the following sources:</p>\n<ul>\n<li><a href=\"http://github.com/def-lkb/merlin#emacs-interface\">http://github.com/def-lkb/merlin#emacs-interface</a></li>\n<li><a href=\"http://zheng.li/buzzlogs-ocaml/2013/08/23/irc.html\">http://zheng.li/buzzlogs-ocaml/2013/08/23/irc.html</a></li>\n<li><a href=\"http://www.ocamlpro.com/blog/2013/03/18/monthly-03.html\">http://www.ocamlpro.com/blog/2013/03/18/monthly-03.html</a></li>\n</ul>\n<p>Before we begin, install <code>merlin</code>:</p>\n<pre><code><span>$ opam install merlin\n</span></code></pre>\n<p>The complete <a href=\"https://github.com/mor1/rc-files/commit/4a2b0be59081d6df0640af39b48c75c20443c8dc\">commit</a> change is in my <a href=\"http://github.com/mor1\">github</a> account (combined with a\nlarge cleanup of various other aborted OCaml configurations). Breaking it down a\nbit, first setup some paths: where to find <code>ocp-indent</code>, <code>merlin.el</code> for\n<code>merlin-mode</code>, and the <code>ocamlmerlin</code> command itself. 
Note that this relies on\nthe current state of <code>opam</code>, so when you start <code>emacs</code> be sure to have selected\nthe <code>opam</code> compiler-switch that you installed the <code>merlin</code> package into, above.</p>\n<pre><code><span><span><span>;</span>; ocp-indent\n</span></span><span><span><span>(</span><span>load</span><span>-</span>file <span><span>(</span>concat\n</span></span></span><span><span><span> <span><span>(</span>substring <span><span>(</span>shell<span>-</span>command<span>-</span><span>to</span><span>-</span><span>string</span> <span><span>"</span>opam config var prefix<span>"</span></span><span>)</span></span> <span>0</span> <span>-</span><span>1</span><span>)</span></span>\n</span></span></span><span><span><span> <span><span>"</span>/share/typerex/ocp-indent/ocp-indent.el<span>"</span></span>\n</span></span></span><span><span><span> <span>)</span></span><span>)</span></span>\n</span><span>\n</span><span><span><span>;</span>; merlin-mode\n</span></span><span><span><span>(</span><span>push</span> <span><span>(</span>concat\n</span></span></span><span><span><span> <span><span>(</span>substring <span><span>(</span>shell<span>-</span>command<span>-</span><span>to</span><span>-</span><span>string</span> <span><span>"</span>opam config var share<span>"</span></span><span>)</span></span> <span>0</span> <span>-</span><span>1</span><span>)</span></span>\n</span></span></span><span><span><span> <span><span>"</span>/emacs/site-lisp<span>"</span></span>\n</span></span></span><span><span><span> <span>)</span></span>\n</span></span><span><span> <span>load</span><span>-</span>path<span>)</span></span>\n</span><span>\n</span><span><span><span>(</span><span>setq</span> merlin<span>-</span>command\n</span></span><span><span> <span><span>(</span>concat\n</span></span></span><span><span><span> <span><span>(</span>substring <span><span>(</span>shell<span>-</span>command<span>-</span><span>to</span><span>-</span><span>string</span> 
<span><span>"</span>opam config var bin<span>"</span></span><span>)</span></span> <span>0</span> <span>-</span><span>1</span><span>)</span></span>\n</span></span></span><span><span><span> <span><span>"</span>/ocamlmerlin<span>"</span></span>\n</span></span></span><span><span><span> <span>)</span></span><span>)</span></span>\n</span><span><span><span>(</span>autoload 'merlin<span>-</span>mode <span><span>"</span>merlin<span>"</span></span> <span><span>"</span>Merlin mode<span>"</span></span> <span>t</span><span>)</span></span>\n</span></code></pre>\n<p>Now the meat: when we select <code>tuareg-mode</code>, use <code>ocp-indent</code> to indent lines,\nturn on <code>merlin</code> auto-complete, and finally set a couple of local key bindings\nso that I can fix up <code>merlin</code> to not conflict with my\nnow-neurologically-hardwired navigation keys.</p>\n<pre><code><span><span><span>(</span>add<span>-</span>hook 'tuareg<span>-</span>mode<span>-</span>hook\n</span></span><span><span> '<span><span>(</span><span>lambda</span> <span><span>(</span><span>)</span></span>\n</span></span></span><span><span><span> <span><span>(</span>merlin<span>-</span>mode<span>)</span></span>\n</span></span></span><span><span><span> <span><span>(</span><span>setq</span> indent<span>-</span>line<span>-</span>function 'ocp<span>-</span>indent<span>-</span>line<span>)</span></span>\n</span></span></span><span><span><span> <span><span>(</span><span>setq</span> merlin<span>-</span>use<span>-</span>auto<span>-</span>complete<span>-</span>mode <span>t</span><span>)</span></span>\n</span></span></span><span><span><span> <span><span>(</span>local<span>-</span><span>set</span><span>-</span>key <span><span>(</span>kbd <span><span>"</span>C-S-<up><span>"</span></span><span>)</span></span> 'merlin<span>-</span>type<span>-</span>enclosing<span>-</span>go<span>-</span>up<span>)</span></span>\n</span></span></span><span><span><span> <span><span>(</span>local<span>-</span><span>set</span><span>-</span>key 
<span><span>(</span>kbd <span><span>"</span>C-S-<down><span>"</span></span><span>)</span></span> 'merlin<span>-</span>type<span>-</span>enclosing<span>-</span>go<span>-</span>down<span>)</span></span>\n</span></span></span><span><span><span> <span>)</span></span><span>)</span></span>\n</span></code></pre>\n<p>Finally, do the usual to use <code>tuareg-mode</code> for OCaml/F# editing.</p>\n<pre><code><span><span><span>(</span><span>push</span>'<span><span>(</span><span><span>"</span><span>\\\\</span>.ml[iylp]?<span>"</span></span> . tuareg<span>-</span>mode<span>)</span></span> auto<span>-</span>mode<span>-</span>alist<span>)</span></span>\n</span><span><span><span>(</span><span>push</span> '<span><span>(</span><span><span>"</span><span>\\\\</span>.fs[ix]?<span>"</span></span> . tuareg<span>-</span>mode<span>)</span></span> auto<span>-</span>mode<span>-</span>alist<span>)</span></span>\n</span></code></pre>\n<p>And that\u2019s it!</p>",
+18
mort/blog_arming-linuxkit_.json
···+"summary": "<p>As some may know, following the <a href=\"https://unikernels.com\">Unikernel Systems</a> acquisition, I\ncurrently do contract work for <a href=\"https://docker.com\">Docker Inc.</a> in addition to my day job\nhere at the <a href=\"https://www.cl.cam.ac.uk\">Cambridge University Computer Laboratory</a>. Recently this has\ncentred on <a href=\"https://github.com/linuxkit/linuxkit\">LinuxKit</a>, \u201c<em>A toolkit for building secure, portable and lean\noperating systems for containers</em>\u201d and, specifically, enabling ARM64 support.\nI\u2019m pleased to say that a basic proof-of-concept is now complete, and we\u2019re\nworking towards getting support merged upstream.</p>\n<p>The proof-of-concept was developed using the great ARM64 support provided\nby <a href=\"https://packet.net\"><code>packet.net</code></a>, on one of their <code>type 2A</code> boxes.</p>\n<p>If you fancy trying it out, then hopefully the following instructions will be of\nuse \u2013 or just bug me on the <a href=\"https://slack.packet.net/\"><code>packet.net</code> Slack</a>!</p>\n<h2><a href=\"https://mort.io/blog/arming-linuxkit/#building\">Building</a></h2>\n<p>Start by getting an ARM64 box setup. If you have one to hand, great! If not, you\ncould head over to <a href=\"https://packet.net\">packet.net</a> and create type 2A Ubuntu box to use as a build\nenvironment.</p>\n<p>Then clone the source, either <code>git clone</code> <a href=\"https://github.com/mor1/linuxkit/tree/project-arm64\">my dev branch</a>, or\nsee <a href=\"https://github.com/linuxkit/linuxkit/pull/1654\">https://github.com/linuxkit/linuxkit/pull/1654</a> for the open PR which may\nbe a bit more stable.</p>\n<p>The essence of it then is to build the containers based off <code>aarch64/alpine</code>,\nalong with an ARM64 version of the <a href=\"https://github.com/moby/moby\"><code>moby</code> CLI</a> if needed. 
Specifying the\ncontainer images you just built in your <code>moby.yml</code> file will then cause <code>moby</code>\nto assemble things that should boot on ARM64.</p>\n<p>The output should be a gzipped kernel, currently slightly misleadingly named\n<code>bzImage</code> as well as a suitable <code>initrd</code>.</p>\n<h2><a href=\"https://mort.io/blog/arming-linuxkit/#booting\">Booting</a></h2>\n<p>Setup another ARM64 box on which to boot the results. You could setup a\ntype 2A <a href=\"https://packet.net\">packet.net</a> box once more, but this time set it to <em>custom OS</em> and\n<em>iPXE boot</em>. For the iPXE boot URL, give a URL pointing to a suitable boot\nfile. I use:</p>\n<pre><code><span><span><span>#</span></span><span>!ipxe</span><span>\n</span></span><span><span><span>set</span></span><span> base-url URL-TO-DIRECTORY-HOLDING-IMAGES</span>\n</span><span><span><span>set</span></span><span> kernel-params ip=dhcp nomodeset ro serial console=ttyAMA0,115200 earlycon earlyprintk=serial,keep initrd=arm64-initrd.img</span>\n</span><span><span><span>initrd</span></span><span> <span><span>$</span><span>{</span></span><span><span>base</span></span><span><span>-</span></span><span>url</span><span><span>}</span></span>/arm64-initrd.img</span>\n</span><span><span><span>imgstat</span></span>\n</span><span><span><span>boot</span></span><span> <span><span>$</span><span>{</span></span><span><span>base</span></span><span><span>-</span></span><span>url</span><span><span>}</span></span>/arm64-bzImage <span><span>$</span><span>{</span></span><span><span>kernel</span></span><span><span>-</span></span><span>params</span><span><span>}</span></span></span>\n</span></code></pre>\n<p>Note that, currently at least, the <a href=\"https://packet.net\">packet.net</a> iPXE boot only occurs on the\nfirst boot as it is assumed that the iPXE boot will install a working image to\nthe local disk. 
Thus, if it doesn\u2019t work first time, get an SOS console and\nbreak in by hitting <code>^B</code> at the appropriate moment, before issuing <code>chain URL</code>\nwhere <code>URL</code> points to your iPXE boot file.</p>\n<h2><a href=\"https://mort.io/blog/arming-linuxkit/#conclusion\">Conclusion</a></h2>\n<p>This just does the barest minimum for now \u2013 I did say it was a\nproof-of-concept\u2026 :) Work is currently ongoing to upstream this rather than\ndeveloping this PoC further, but if anyone has a particular interest or would\nlike to provide patches to, e.g., support network devices on <a href=\"https://packet.net\">packet.net</a>,\nplease <a href=\"mailto:mort@cantab.net\">get in touch</a>, file an issue or send a pull\nrequest!</p>",
+18
mort/blog_back-to-the-future_.json
···+"summary": "<blockquote>\n<p>Ed: the first in a likely series, this is one of the posts I drafted but never\npublished during my seven years in the blogging wilderness</p>\n</blockquote>\n<p>I have been having persistent issues on MacOS with Time Machine backups\ninteracting badly with several configuration elements and other services. This\nis exacerbated by my use of maildirs to backup emails, with my Gmail account\nleading to a directory with some millions of files. (Yeah, ok, maybe not wise\nbut hey, we are where we are.)</p>\n<p>So in the course of trying to get backups to work reliably, I found the\nfollowing commands useful to at least some degree:</p>\n<ul>\n<li><code>mdutil -sav</code> to get info about spotlight indexing status</li>\n<li><code>sudo mdutil -Ea -i [off|on]</code> to turn off|on spotlight indexing where possible</li>\n<li><code>sudo fs_usage -w mdworker mds_stores backupd</code> obtain the current filesystem\nusage for those services</li>\n<li><code>sudo fs_usage -w -e iTerm2</code> to obtain current filesystem usage for services\n<em>excluding</em> those listed</li>\n<li><code>sudo sysctl debug.lowpri_throttle_enabled=[0|1]</code> stop|start throttling to go\nfaster|be nice!</li>\n<li>to obtain the various different hostnames that MacOS seems to want to give;\nnote that renaming <code>ComputerName</code> (also possible via\n<code>SysPrefs>Sharing>>name</code>) caused the Time Machine directory to be renamed\nwhen the backup started</li>\n</ul>\n<pre><code><span><span>for</span><span> n <span>in</span> ComputerName LocalHostName HostName</span> <span>;</span> <span>do</span> \n</span><span> <span><span>scutil</span></span><span><span><span> --</span>get</span> <span><span>$</span><span>n</span></span></span>\n</span><span><span>done</span>\n</span></code></pre>\n<ul>\n<li><code>tmutil listbackups</code> to list backups</li>\n<li><code>tmutil listlocalsnapshotdates /</code> to list local snapshots</li>\n<li><code>tmutil 
destinationinfo</code> to list the Time Machine destination volumes</li>\n<li>to stream Time Machine log entries live</li>\n</ul>\n<pre><code><span><span><span>log</span></span><span> stream<span><span> --</span>style</span> syslog<span><span> --</span>debug</span><span><span> --</span>info</span> <span>\\\n</span></span></span><span><span><span><span> --</span>predicate</span> <span><span>'</span>senderImagePath contains[cd] "TimeMachine"<span>'</span></span></span>\n</span></code></pre>\n<ul>\n<li>to show Time Machine log entries from YYYY-MM-DD</li>\n</ul>\n<pre><code><span><span><span>log</span></span><span> show<span><span> --</span>style</span> syslog<span><span> --</span>debug</span><span><span> --</span>info</span><span><span> --</span>start</span> YYYY-MM-DD <span>\\\n</span></span></span><span><span><span><span> --</span>predicate</span> <span><span>'</span>senderImagePath contains[cd] "TimeMachine"<span>'</span></span><span>\\\n</span></span></span></code></pre>",
+18
mort/blog_begin-again_.json
···+"summary": "<p>Specifically, I\u2019ve left <a href=\"http://www.horizon.ac.uk\">Horizon</a> and the\n<a href=\"http://www.cs.nott.ac.uk\">School of Computer Science</a> at the\n<a href=\"http://www.nottingham.ac.uk\">University of Nottingham</a> to (re-)join the\n<a href=\"http://www.cam.ac.uk\">Cambridge University</a>\n<a href=\"http://www.cl.cam.ac.uk\">Computer Laboratory</a>. In celebration, and frankly\nbecause it was long overdue anyway, I\u2019ve reworked my website. What do you think?</p>\n<p>For the curious, or the technically inclined, the site now uses\n<a href=\"http://foundation.zurb.com/\">ZURB Foundation</a> 5.5.0 (the current downloadable release as of\nyesterday), with some slightly customised CSS. The site itself is largely\nwritten in <a href=\"http://daringfireball.net/projects/markdown/\">Markdown</a> and currently generated using\n<a href=\"http://jekyllrb.com/\">Jekyll</a> to be hosted on <a href=\"http://github.com\">Github</a>.</p>\n<p>It\u2019s actually gone through an interim phase where it was parsed by the OCaml\n<a href=\"https://github.com/pw347/omd\">OMD</a> parser before being crunched into a <a href=\"https://github.com/mirage/mirage-types\">Mirage KV_RO</a>\nfilesystem which is then compiled into a type-safe, self-contained web appliance\nthat serves these pages and no other using the OCaml <a href=\"https://github.com/mirage/cowabloga\">Cowabloga</a>, <a href=\"https://github.com/mirage/ocaml-cow\">COW</a> and\n<a href=\"https://github.com/mirage/ocaml-cohttp\">CoHTTP</a> libraries. This could either be run as a <a href=\"https://github.com/mirage/mirage-platform/tree/master/unix\">POSIX binary</a>\nor a self-contained <a href=\"https://github.com/mirage/mirage-platform/tree/master/xen\">Xen VM</a> depending on what I felt like. 
Neat eh?\n(And for the sceptical among you, yes, a thing <em>can</em> be neat and yet appear\ncuriously over-engineered at the same time\u2026 :)</p>\n<p>For the time being however, I\u2019m using it as an excuse to think about what I\nmight do to better support site generation like this in <a href=\"https://github.com/mirage/cowabloga\">Cowabloga</a> so that I\ncan more seamlessly switch between <a href=\"http://jekyllrb.com/\">Jekyll</a> and <a href=\"http://openmirage.org/\">Mirage</a>.</p>",
+18
mort/blog_being-followed-postscript_.json
···+"summary": "<p>Turns out others were listening too \u2013 notably the USA\u2019s <a href=\"http://www.darpa.mil/\">DARPA</a>. The recent\nannouncement of the <a href=\"http://www.darpa.mil/NewsEvents/Releases/2015/03/11.aspx\">Brandeis</a> programme makes explicit reference to <a href=\"http://ssrn.com/abstract=2508051\">HDI</a>\nand <a href=\"http://hdiresearch.org/\">our website</a>. This has been picked up by <a href=\"http://gcn.com/articles/2015/03/12/darpa-brandeis.aspx\">GCN</a>, <a href=\"http://www.usatoday.com/story/nation/2015/03/16/data-privacy-darpa-brandeis/70222556/\">USAToday</a>,\n<a href=\"http://www.nbcnews.com/tech/security/darpa-unexpectedly-announces-program-improve-online-piracy-n322601\">NBCNews</a>, <a href=\"http://www.engadget.com/2015/03/12/darpa-is-trying-to-reinvent-online-privacy/\">Engadget</a> among others. With $60M potentially on the table, I\nhope that there\u2019ll be many more who get interested in pushing HDI forwards now\n:)</p>",
+18
mort/blog_bibtox_.json
···+"summary": "<p>After some time using various tools and scripts to format and sort my files of\nBibTeX/BibLaTeX entries, I finally gave up back in March and <a href=\"https://github.com/mor1/bibtox\">wrote one myself\n\u2013 <code>bibtox</code></a>. This replaced some very nasty\ncombination of server-side <a href=\"https://github.com/mor1/bibtox/blob/83eda34bc9e79bd5251b1ae9623b5e905532c599/bib2json.py\">Python</a> and in-page <a href=\"https://github.com/mor1/bibtox/blob/83eda34bc9e79bd5251b1ae9623b5e905532c599/papers.coffee\">CoffeeScript</a>, plus a\nthird-party tool <a href=\"https://github.com/backtracking/bibtex2html\"><code>bib2bib</code></a> I\nwould run on an ad hoc basis.</p>\n<p>Per the <a href=\"https://github.com/mor1/bibtox/blob/main/README.md\">README</a>, this\nprocesses either a stream of entries on <code>stdin</code> or a set of files arranged into\nsections specified by a simple configuration file. It outputs sorted or\nunsorted, either as canonicalised entries or marked up HTML.</p>\n<p>And it means that, hopefully, finally, my list of publications on this site is\nconsistently formatted and sorted. Completeness is, of course, a process rather\nthan a state so achieving that is left for other times and places.</p>\n<p><a href=\"https://github.com/mor1/bibtox/issues\">Issues</a> or <a href=\"https://github.com/mor1/bibtox/pulls\">pull\nrequests</a> welcome!</p>",+"content": "<p>After some time using various tools and scripts to format and sort my files of\nBibTeX/BibLaTeX entries, I finally gave up back in March and <a href=\"https://github.com/mor1/bibtox\">wrote one myself\n\u2013 <code>bibtox</code></a>. 
This replaced some very nasty\ncombination of server-side <a href=\"https://github.com/mor1/bibtox/blob/83eda34bc9e79bd5251b1ae9623b5e905532c599/bib2json.py\">Python</a> and in-page <a href=\"https://github.com/mor1/bibtox/blob/83eda34bc9e79bd5251b1ae9623b5e905532c599/papers.coffee\">CoffeeScript</a>, plus a\nthird-party tool <a href=\"https://github.com/backtracking/bibtex2html\"><code>bib2bib</code></a> I\nwould run on an ad hoc basis.</p>\n<p>Per the <a href=\"https://github.com/mor1/bibtox/blob/main/README.md\">README</a>, this\nprocesses either a stream of entries on <code>stdin</code> or a set of files arranged into\nsections specified by a simple configuration file. It outputs sorted or\nunsorted, either as canonicalised entries or marked up HTML.</p>\n<p>And it means that, hopefully, finally, my list of publications on this site is\nconsistently formatted and sorted. Completeness is, of course, a process rather\nthan a state so achieving that is left for other times and places.</p>\n<p><a href=\"https://github.com/mor1/bibtox/issues\">Issues</a> or <a href=\"https://github.com/mor1/bibtox/pulls\">pull\nrequests</a> welcome!</p>",
+18
mort/blog_bigtechday-mirage_.json
···+"summary": "<p>After a slew of HDI related items, a quick <a href=\"http://openmirage.org/\">MirageOS</a> note. I was invited to\ngive a presentation on MirageOS at the <a href=\"https://www.tngtech.com/en.html\">TNG</a>\n<a href=\"https://www.tngtech.com/en/big-techday.html\">Big TechDay 8</a> conference. This\nwas an interesting, and very broad based, event. It brought together about 500\npeople with about 25 speakers over one day, on topics from cognitive science to\nAI to chess playing algorithms to obviating testing through proof reasoning.\nAnd, of course, operating systems and <a href=\"http://openmirage.org/\">MirageOS</a>! If you\u2019re interested, the\nslides used are available at\n<a href=\"http://decks.openmirage.org/bigtechday8\">decks.openmirage.org</a> in the usual\nway, and if you really want to hear me droning on, TNG are making the video\n(with synchronised slides)\n<a href=\"http://www.techcast.com/events/bigtechday8/salvator-1130/?q=salvator-1130\">available</a>.</p>\n<p>As ever, comments welcome!</p>",+"content": "<p>After a slew of HDI related items, a quick <a href=\"http://openmirage.org/\">MirageOS</a> note. I was invited to\ngive a presentation on MirageOS at the <a href=\"https://www.tngtech.com/en.html\">TNG</a>\n<a href=\"https://www.tngtech.com/en/big-techday.html\">Big TechDay 8</a> conference. This\nwas an interesting, and very broad based, event. It brought together about 500\npeople with about 25 speakers over one day, on topics from cognitive science to\nAI to chess playing algorithms to obviating testing through proof reasoning.\nAnd, of course, operating systems and <a href=\"http://openmirage.org/\">MirageOS</a>! 
If you\u2019re interested, the\nslides used are available at\n<a href=\"http://decks.openmirage.org/bigtechday8\">decks.openmirage.org</a> in the usual\nway, and if you really want to hear me droning on, TNG are making the video\n(with synchronised slides)\n<a href=\"http://www.techcast.com/events/bigtechday8/salvator-1130/?q=salvator-1130\">available</a>.</p>\n<p>As ever, comments welcome!</p>",
+18
mort/blog_brew-plist_.json
···+"summary": "<p>As I could never remember the command to start the <code>offlineimap</code> service using\nmy customised configuration, here it is:</p>\n<pre><code>brew services start offlineimap ~/rc-files/homebrew.mxcl.offlineimap.plist\n</code></pre>",+"content": "<p>As I could never remember the command to start the <code>offlineimap</code> service using\nmy customised configuration, here it is:</p>\n<pre><code>brew services start offlineimap ~/rc-files/homebrew.mxcl.offlineimap.plist\n</code></pre>",
+18
mort/blog_building-up-your-arms_.json
···+"summary": "<p>Due to the impending finish of the EU FP7 funded <a href=\"https://usercentricnetworking.eu\">User Centric\nNetworking</a><a href=\"https://mort.io/blog/building-up-your-arms/#1\">1</a> I recently had cause to revisit the excellent work that\n<a href=\"https://github.com/talex5\">Thomas Leonard</a> did for the project in getting Xen/ARM running on the\n<a href=\"http://cubieboard.org/model/cb2/\">Cubieboard2</a> and <a href=\"http://cubieboard.org/model/cb3/\">Cubietruck</a> (aka <a href=\"http://cubieboard.org/model/cb3/\">Cubieboard3</a>).</p>\n<p>The resulting repo, <a href=\"https://github.com/mirage/xen-arm-builder\">mirage/xen-arm-builder</a>, had languished for several\nmonths and the past SD card images had some problems and had been allowed to\ndrop off the \u2019Net as a result. However, sterling work by <a href=\"https://github.com/ijc25\">Ian Campbell</a> at\na recent Mirage <a href=\"https://mirage.io/blog/2016-summer-hackathon-roundup\">hackathon</a> had started to resurrect this work based on\nthe <a href=\"https://alpinelinux.org/\">Alpine Linux</a> distribution. 
This seemed a promising place to start,\nso I did :)</p>\n<h2><a href=\"https://mort.io/blog/building-up-your-arms/#building-an-image\">Building an Image</a></h2>\n<p>The end result was an enormous <a href=\"https://github.com/mirage/xen-arm-builder/pull/71\">pull request</a> that splatted a Brave New\nWorld on top of <a href=\"https://github.com/talex5\">Thomas\u2019</a> work.\nThe <a href=\"https://github.com/mirage/xen-arm-builder/blob/master/README.md\"><code>README</code></a>\nis hopefully reasonably self-explanatory but in summary,</p>\n<ol>\n<li>\n<p>Clone the repo:</p>\n<pre><code><span><span><span>git</span></span><span> clone https://github.com/mor1/arm-image-builder.git</span>\n</span><span><span><span>cd</span></span><span> arm-image-builder</span>\n</span></code></pre>\n</li>\n<li>\n<p>Use the <code>make</code> targets:</p>\n<pre><code><span><span><span>make</span></span><span> all <span><span>#</span></span><span> runs `make prepare build image`</span><span>\n</span></span></span><span><span><span>#</span></span><span> make prepare # clones repos, pulls tarballs</span><span>\n</span></span><span><span><span>#</span></span><span> make build # use Docker to build the `linux/` and `u-boot/` trees</span><span>\n</span></span><span><span><span>#</span></span><span> make image # finally, create the on-disk `sdcard.img`</span><span>\n</span></span></code></pre>\n</li>\n</ol>\n<p>This clones the necessary repos (Linux, u-boot), builds them, and then puts\ntogether the image file <code>sdcard.img</code> in the current directory. If on OSX, <code>make sdcard</code> will then attempt to write that to a blank, mounted SD card. 
This does a\nrather hacky auto-discovery of where the SD card might be mounted; if in doubt,\nand in any case, always safer to simply</p>\n<pre><code><span><span>MNT</span><span>=</span><span>the-correct-mount-point</span> <span><span>make</span></span><span> sdcard</span>\n</span></code></pre>\n<p>\u2026or simply use your favourite tools to write the <code>sdcard.img</code> file to your SD\ncard.</p>\n<h2><a href=\"https://mort.io/blog/building-up-your-arms/#using-the-image\">Using the Image</a></h2>\n<p>The end result should be an SD card that you can use to boot your device into\n<a href=\"https://alpinelinux.org/\">Alpine Linux v3.4</a>. At present, completing installation requires then:</p>\n<ul>\n<li><a href=\"https://github.com/mirage/xen-arm-builder#first-boot--re-initialisation\">resetting the environment</a>,</li>\n<li><a href=\"https://github.com/mirage/xen-arm-builder#base-install\">completing Alpine setup</a> via\nthe <code>setup-alpine</code> script,</li>\n<li>(if desired) installing Xen via the\n<code>/media/mmcblk0p1/alpine-dom0-install.sh</code> script created as part of building\nthe SD card image,</li>\n<li>(if desired) finally,\nbuilding <a href=\"https://github.com/mirage/xen-arm-builder#alpine\">Alpine</a>\nand/or <a href=\"https://github.com/mirage/xen-arm-builder#debian\">Debian</a> <code>domU</code>s\nvia the <code>/media/mmcblk0p1/alpine-domU-install.sh</code> and\n<code>/media/mmcblk0p1/debian-domU-install.sh</code> scripts, also created as part of\nbuilding the image.</li>\n</ul>\n<p>Hopefully the net result is you end up with a Cubieboard2/3 running Xen with an\nAlpine Linux <code>dom0</code> and some <code>domU</code> images available.</p>\n<p>As ever, <a href=\"https://twitter.com/mort___\">comments, patches, pull requests welcome</a>!</p>\n<div>1\n<p>Grant No. 
611001 for those who care.</p>\n</div>",+"content": "<p>Due to the impending finish of the EU FP7 funded <a href=\"https://usercentricnetworking.eu\">User Centric\nNetworking</a><a href=\"https://mort.io/blog/building-up-your-arms/#1\">1</a> I recently had cause to revisit the excellent work that\n<a href=\"https://github.com/talex5\">Thomas Leonard</a> did for the project in getting Xen/ARM running on the\n<a href=\"http://cubieboard.org/model/cb2/\">Cubieboard2</a> and <a href=\"http://cubieboard.org/model/cb3/\">Cubietruck</a> (aka <a href=\"http://cubieboard.org/model/cb3/\">Cubieboard3</a>).</p>\n<p>The resulting repo, <a href=\"https://github.com/mirage/xen-arm-builder\">mirage/xen-arm-builder</a>, had languished for several\nmonths and the past SD card images had some problems and had been allowed to\ndrop off the \u2019Net as a result. However, sterling work by <a href=\"https://github.com/ijc25\">Ian Campbell</a> at\na recent Mirage <a href=\"https://mirage.io/blog/2016-summer-hackathon-roundup\">hackathon</a> had started to resurrect this work based on\nthe <a href=\"https://alpinelinux.org/\">Alpine Linux</a> distribution. 
This seemed a promising place to start,\nso I did :)</p>\n<h2><a href=\"https://mort.io/blog/building-up-your-arms/#building-an-image\">Building an Image</a></h2>\n<p>The end result was an enormous <a href=\"https://github.com/mirage/xen-arm-builder/pull/71\">pull request</a> that splatted a Brave New\nWorld on top of <a href=\"https://github.com/talex5\">Thomas\u2019</a> work.\nThe <a href=\"https://github.com/mirage/xen-arm-builder/blob/master/README.md\"><code>README</code></a>\nis hopefully reasonably self-explanatory but in summary,</p>\n<ol>\n<li>\n<p>Clone the repo:</p>\n<pre><code><span><span><span>git</span></span><span> clone https://github.com/mor1/arm-image-builder.git</span>\n</span><span><span><span>cd</span></span><span> arm-image-builder</span>\n</span></code></pre>\n</li>\n<li>\n<p>Use the <code>make</code> targets:</p>\n<pre><code><span><span><span>make</span></span><span> all <span><span>#</span></span><span> runs `make prepare build image`</span><span>\n</span></span></span><span><span><span>#</span></span><span> make prepare # clones repos, pulls tarballs</span><span>\n</span></span><span><span><span>#</span></span><span> make build # use Docker to build the `linux/` and `u-boot/` trees</span><span>\n</span></span><span><span><span>#</span></span><span> make image # finally, create the on-disk `sdcard.img`</span><span>\n</span></span></code></pre>\n</li>\n</ol>\n<p>This clones the necessary repos (Linux, u-boot), builds them, and then puts\ntogether the image file <code>sdcard.img</code> in the current directory. If on OSX, <code>make sdcard</code> will then attempt to write that to a blank, mounted SD card. 
This does a\nrather hacky auto-discovery of where the SD card might be mounted; if in doubt,\nand in any case, always safer to simply</p>\n<pre><code><span><span>MNT</span><span>=</span><span>the-correct-mount-point</span> <span><span>make</span></span><span> sdcard</span>\n</span></code></pre>\n<p>\u2026or simply use your favourite tools to write the <code>sdcard.img</code> file to your SD\ncard.</p>\n<h2><a href=\"https://mort.io/blog/building-up-your-arms/#using-the-image\">Using the Image</a></h2>\n<p>The end result should be an SD card that you can use to boot your device into\n<a href=\"https://alpinelinux.org/\">Alpine Linux v3.4</a>. At present, completing installation requires then:</p>\n<ul>\n<li><a href=\"https://github.com/mirage/xen-arm-builder#first-boot--re-initialisation\">resetting the environment</a>,</li>\n<li><a href=\"https://github.com/mirage/xen-arm-builder#base-install\">completing Alpine setup</a> via\nthe <code>setup-alpine</code> script,</li>\n<li>(if desired) installing Xen via the\n<code>/media/mmcblk0p1/alpine-dom0-install.sh</code> script created as part of building\nthe SD card image,</li>\n<li>(if desired) finally,\nbuilding <a href=\"https://github.com/mirage/xen-arm-builder#alpine\">Alpine</a>\nand/or <a href=\"https://github.com/mirage/xen-arm-builder#debian\">Debian</a> <code>domU</code>s\nvia the <code>/media/mmcblk0p1/alpine-domU-install.sh</code> and\n<code>/media/mmcblk0p1/debian-domU-install.sh</code> scripts, also created as part of\nbuilding the image.</li>\n</ul>\n<p>Hopefully the net result is you end up with a Cubieboard2/3 running Xen with an\nAlpine Linux <code>dom0</code> and some <code>domU</code> images available.</p>\n<p>As ever, <a href=\"https://twitter.com/mort___\">comments, patches, pull requests welcome</a>!</p>\n<div>1\n<p>Grant No. 611001 for those who care.</p>\n</div>",
+18
mort/blog_coping-and-capping_.json
···+"summary": "<p>Well that was fun! Quite high up there in the set of things that I never even\nconsidered I might do would\u2019ve been awarding degrees. But by dint of being\nPresident and thus standing in for\n<a href=\"https://en.wikipedia.org/wiki/Simon_McDonald%2C_Baron_McDonald_of_Salford\">Simon</a>,\non Saturday I did exactly that.</p>\n <img alt=\"Me, coped and capped in the superman cape and mortarboard\" height=\"1\" src=\"https://mort.io/blog/coping-and-capping/coped.jpg\" width=\"360\">\n<p>The University held a Congregation for those being admitted to the <a href=\"https://www.cambridgestudents.cam.ac.uk/your-course/graduation-and-what-next/cambridge-ma\">Cambridge\nMA</a>\n(\u201cMagistri in Artibus\u201d \u2013 Master of Arts). Degrees are conferred by the\nChancellor, Vice-Chancellor or nominated deputy. Apparently, typically, for this\none and the main undergraduate congregation in July, that nominated deputy\nVice-Chancellor is usually the Head of House for the College concerned. In this\ncase, as President at Christ\u2019s is effectively deputy Master (~ Head of House),\nit me.</p>\n<p>So instead of the usual Batman-style black gown, I got to wear the rather natty\nSuperman-style <em>cope</em>, hence I was <em>coped</em>. I also got to wear one of the fancy\nhats (\u201csquare cap\u201d or mortarboard) for the first time, hence I was also\n<em>capped</em>.</p>\n<p>And as many other officers were also appropriately hatted, and out of respect\nto the role would <em>cap</em> me at many available opportunities, the written advice I\nreceived was literally that \u201cofficers will cap you \u2013 you do not have to cap\nback\u201d because the cope makes it difficult to do so.</p>\n<p>So, in brief, it seems I coped with being coped and capped but could not cope\nwith capping while coped. 
To add to the excitement, I also had to Do Some Latin.\nThankfully it\u2019s a dead language so I assume all it could do was turn in its\ngrave while I butchered it.</p>\n<p>I hope everyone had a good time anyway!</p>",+"content": "<p>Well that was fun! Quite high up there in the set of things that I never even\nconsidered I might do would\u2019ve been awarding degrees. But by dint of being\nPresident and thus standing in for\n<a href=\"https://en.wikipedia.org/wiki/Simon_McDonald%2C_Baron_McDonald_of_Salford\">Simon</a>,\non Saturday I did exactly that.</p>\n <img alt=\"Me, coped and capped in the superman cape and mortarboard\" height=\"1\" src=\"https://mort.io/blog/coping-and-capping/coped.jpg\" width=\"360\">\n<p>The University held a Congregation for those being admitted to the <a href=\"https://www.cambridgestudents.cam.ac.uk/your-course/graduation-and-what-next/cambridge-ma\">Cambridge\nMA</a>\n(\u201cMagistri in Artibus\u201d \u2013 Master of Arts). Degrees are conferred by the\nChancellor, Vice-Chancellor or nominated deputy. Apparently, typically, for this\none and the main undergraduate congregation in July, that nominated deputy\nVice-Chancellor is usually the Head of House for the College concerned. In this\ncase, as President at Christ\u2019s is effectively deputy Master (~ Head of House),\nit me.</p>\n<p>So instead of the usual Batman-style black gown, I got to wear the rather natty\nSuperman-style <em>cope</em>, hence I was <em>coped</em>. 
I also got to wear one of the fancy\nhats (\u201csquare cap\u201d or mortarboard) for the first time, hence I was also\n<em>capped</em>.</p>\n<p>And as many other officers were also appropriately hatted, and out of respect\nto the role would <em>cap</em> me at many available opportunities, the written advice I\nreceived was literally that \u201cofficers will cap you \u2013 you do not have to cap\nback\u201d because the cope makes it difficult to do so.</p>\n<p>So, in brief, it seems I coped with being coped and capped but could not cope\nwith capping while coped. To add to the excitement, I also had to Do Some Latin.\nThankfully it\u2019s a dead language so I assume all it could do was turn in its\ngrave while I butchered it.</p>\n<p>I hope everyone had a good time anyway!</p>",
+18
mort/blog_dataviz_.json
···+"summary": "<p>Some possibly useful data visualisation links:</p>\n<ul>\n<li><a href=\"https://medium.economist.com/why-you-sometimes-need-to-break-the-rules-in-data-viz-4d8ece284919\">https://medium.economist.com/why-you-sometimes-need-to-break-the-rules-in-data-viz-4d8ece284919</a></li>\n<li><a href=\"https://colororacle.org/\">https://colororacle.org/</a></li>\n</ul>",+"content": "<p>Some possibly useful data visualisation links:</p>\n<ul>\n<li><a href=\"https://medium.economist.com/why-you-sometimes-need-to-break-the-rules-in-data-viz-4d8ece284919\">https://medium.economist.com/why-you-sometimes-need-to-break-the-rules-in-data-viz-4d8ece284919</a></li>\n<li><a href=\"https://colororacle.org/\">https://colororacle.org/</a></li>\n</ul>",
+18
mort/blog_discord_.json
···+"summary": "<p>So for some reason I wanted to do this \u2013 use Discord on an iPad without\ninstalling the app. This proved surprisingly tricky as Safari on the iPad\n<em>really</em> wanted to make you use the app and certainly wouldn\u2019t display the\ndesktop site.</p>\n<p>Firefox Focus could display it, though it still forced you into the app from the link\nin the invite email.</p>\n<p>Instead, you can go to the website to sign up and then manually invite yourself\nvia a real desktop browser. This then means that Firefox Focus on the iPad\nbelieves in the new account and just lets you in.</p>\n<p>Seems something of a palaver but hey, these twisted webs we weave.</p>",+"content": "<p>So for some reason I wanted to do this \u2013 use Discord on an iPad without\ninstalling the app. This proved surprisingly tricky as Safari on the iPad\n<em>really</em> wanted to make you use the app and certainly wouldn\u2019t display the\ndesktop site.</p>\n<p>Firefox Focus could display it, though it still forced you into the app from the link\nin the invite email.</p>\n<p>Instead, you can go to the website to sign up and then manually invite yourself\nvia a real desktop browser. This then means that Firefox Focus on the iPad\nbelieves in the new account and just lets you in.</p>\n<p>Seems something of a palaver but hey, these twisted webs we weave.</p>",
+18
mort/blog_docker-docker_.json
···+"summary": "<h1><a href=\"https://mort.io/blog/docker-docker/#bootstrapping-docker-for-arm64-aka-aarch64\">Bootstrapping Docker for ARM64 (aka AARCH64)</a></h1>\n<p>Basic process is:</p>\n<ul>\n<li>bootstrap ARM64 <code>go</code> toolchain on x86, and install</li>\n<li>build ARM64 <code>go1.7.5</code> toolchain needed for <code>docker</code> build</li>\n<li>bootstrap ARM64 <code>docker</code> v1.10.3</li>\n<li>use bootstrapped <code>docker</code> to provide containerised build environment for\nbuilding later versions</li>\n</ul>\n<p>Instructions below are for CentOS 7 for Reasons(tm). Package details and so on\nwill vary on other distros.</p>\n<h2><a href=\"https://mort.io/blog/docker-docker/#build-go-bootstrap-toolchain\">Build <code>go</code> bootstrap toolchain</a></h2>\n<p>On x86 host:</p>\n<ul>\n<li>build basic go1.4 sufficient to bootstrap</li>\n</ul>\n<pre><code><span><span><span>cd</span></span>\n</span><span><span><span>curl</span></span><span><span><span> -</span>O</span> https://storage.googleapis.com/golang/go1.4-bootstrap-20161024.tar.gz</span>\n</span><span><span><span>tar</span></span><span> xzvf go1.4-bootstrap-20161024.tar.gz</span>\n</span><span><span><span>mv</span></span><span> go go1.4</span>\n</span><span><span><span>cd</span></span><span> go1.4/src</span>\n</span><span><span><span>./make.bash</span></span>\n</span></code></pre>\n<ul>\n<li>cross-compile go1.7 (latest)</li>\n</ul>\n<pre><code><span><span><span>mkdir</span></span><span><span><span> -</span>p</span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>cd</span></span><span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>git</span></span><span> clone https://go.googlesource.com/go</span>\n</span><span><span><span>cd</span></span><span> go</span>\n</span><span><span><span>git</span></span><span> checkout go1.7.5</span>\n</span><span><span><span>cd</span></span><span> src</span>\n</span><span><span>GOOS</span><span>=</span><span>linux</span> 
<span>GOARCH</span><span>=</span><span>arm64</span> <span><span>./bootstrap.bash</span></span>\n</span></code></pre>\n<ul>\n<li>transfer cross-compiled toolchain to ARM64 host</li>\n</ul>\n<pre><code><span><span><span>scp</span></span><span> <span><span>~</span></span>/go/src/go-linux-arm64-bootstrap.tbz HOST:<span><span>~</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#build-arm64-go1-7-5\">Build ARM64 <code>go1.7.5</code></a></h2>\n<ul>\n<li>produce bootstrap toolchains</li>\n</ul>\n<pre><code><span><span><span>cd</span></span>\n</span><span><span><span>tar</span></span><span> xvf go-linux-arm64-bootstrap.tbz</span>\n</span></code></pre>\n<ul>\n<li>use the bootstrap toolchain to build a modern <code>go</code> install</li>\n</ul>\n<pre><code><span><span><span>mkdir</span></span><span><span><span> -</span>p</span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>cd</span></span><span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>git</span></span><span> clone https://go.googlesource.com/go</span>\n</span><span><span><span>cd</span></span><span> go</span>\n</span><span><span><span>git</span></span><span> checkout go1.7.5</span>\n</span><span><span><span>cd</span></span><span> src</span>\n</span><span><span>GOROOT_BOOTSTRAP</span><span>=</span><span><span><span>~</span></span>/go-linux-arm64-bootstrap</span> <span><span>./make.bash</span></span>\n</span><span><span><span>mv</span></span><span> <span><span>~</span></span>/go/src/go/bin <span><span>~</span></span>/go/bin</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#bootstrap-arm64-docker\">Bootstrap ARM64 Docker</a></h2>\n<ul>\n<li>get a recent <code>git</code></li>\n</ul>\n<pre><code><span><span><span>sudo</span></span><span> yum remove git</span>\n</span><span><span><span>wget</span></span><span> 
https://github.com/git/git/archive/v2.12.2.tar.gz</span>\n</span><span><span><span>tar</span></span><span> xvf v2.12.2.tar.gz</span>\n</span><span><span><span>cd</span></span><span> git-<span>*</span></span>\n</span><span><span><span>which</span></span><span> git</span>\n</span><span><span><span>sudo</span></span><span> yum install perl-devel hg curl-devel</span>\n</span><span><span><span>make</span></span><span> configure</span>\n</span><span><span><span>./configure</span></span><span><span><span> --</span>prefix</span><span>=</span>/usr/local</span>\n</span><span><span><span>make</span></span><span><span><span> -</span>j8</span></span>\n</span><span><span><span>sudo</span></span><span> make install</span>\n</span></code></pre>\n<ul>\n<li>install dev dependencies</li>\n</ul>\n<pre><code><span><span><span>sudo</span></span><span> yum install btrfs-progs-devel device-mapper-devel</span>\n</span></code></pre>\n<ul>\n<li>clone source</li>\n</ul>\n<pre><code><span><span><span>cd</span></span><span> <span><span>~</span></span>/go</span>\n</span><span><span><span>mkdir</span></span><span><span><span> -</span>p</span> src/github.com/docker</span>\n</span><span><span><span>cd</span></span><span> src/github.com/docker</span>\n</span><span><span><span>git</span></span><span> clone git@github.com:docker/docker</span>\n</span><span><span><span>cd</span></span><span> docker</span>\n</span></code></pre>\n<ul>\n<li>build components</li>\n</ul>\n<pre><code><span><span><span>git</span></span><span> co v1.10.3</span>\n</span><span><span>(</span> <span><span>cd</span></span><span> vendor</span> <span>&&</span> <span>for</span><span> n <span>in</span> src/<span>*</span></span> <span>;</span> <span>do</span> <span><span>ln</span></span><span><span><span> -</span>s</span> <span><span>$</span><span>n</span></span></span> <span>;</span> <span>done</span><span> </span><span>)</span>\n</span><span><span><span>./hack/make.sh</span></span><span> 
dynbinary</span>\n</span><span><span><span>rm</span></span><span><span><span> -</span>rf</span> vendor</span> <span>&&</span> <span><span>git</span></span><span> checkout . <span><span>#</span></span><span> tidy up symlinking</span><span>\n</span></span></span></code></pre>\n<p>Note that a current bug in Ubuntu packaging metadata means a small edit needs to\nbe made to <code>./Dockerfile.aarch64</code>: change the <code>apt-get update &&</code> to <code>apt-get update ;</code> so that the build doesn\u2019t stop at the first hurdle, updating packages.</p>\n<ul>\n<li>run daemon</li>\n</ul>\n<pre><code><span><span><span>#</span></span><span> sudo rm -rf /var/lib/docker /etc/docker/config.json # DANGEROUS!</span><span>\n</span></span><span><span><span>sudo</span></span><span> ./bundles/1.10.3/dynbinary/docker daemon<span><span> -</span>D</span><span><span> --</span>group</span><span>=</span>wheel</span>\n</span></code></pre>\n<ul>\n<li>run client to check</li>\n</ul>\n<pre><code><span><span><span>$</span></span><span> PATH=<span><span>$</span><span>(</span><span><span>pwd</span></span><span> <span><span>-</span>P</span></span><span>)</span></span>/bundles/1.10.3/dynbinary/:<span><span>$</span><span>PATH</span></span></span>\n</span><span><span><span>$</span></span><span> docker version</span>\n</span><span><span><span>Client:</span></span>\n</span><span> <span><span>Version:</span></span><span> 1.10.3</span>\n</span><span> <span><span>API</span></span><span> version: 1.22</span>\n</span><span> <span><span>Go</span></span><span> version: go1.7.5</span>\n</span><span> <span><span>Git</span></span><span> commit: 20f81dde9</span>\n</span><span> <span><span>Built:</span></span><span> Tue Apr 4 00:27:13 2017</span>\n</span><span> <span><span>OS/Arch:</span></span><span> linux/arm64</span>\n</span><span>\n</span><span><span><span>Server:</span></span>\n</span><span> <span><span>Version:</span></span><span> 1.10.3</span>\n</span><span> <span><span>API</span></span><span> 
version: 1.22</span>\n</span><span> <span><span>Go</span></span><span> version: go1.7.5</span>\n</span><span> <span><span>Git</span></span><span> commit: 20f81dde9</span>\n</span><span> <span><span>Built:</span></span><span> Tue Apr 4 00:27:13 2017</span>\n</span><span> <span><span>OS/Arch:</span></span><span> linux/arm64</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#build-docker\">Build Docker</a></h2>\n<ul>\n<li>reissue <code>make</code> using containerised build</li>\n</ul>\n<pre><code><span><span><span>export</span> <span>DOCKER_BUILDTAGS</span><span>=</span><span><span><span>'</span>apparmor selinux seccomp<span>'</span></span></span></span>\n</span><span><span><span>git</span></span><span> co v17.05.0-ce <span><span>#</span></span><span> or v1.12.3 or master or whatever</span><span>\n</span></span></span><span><span><span>make <span><span>#</span></span><span> transient failure of first build; restart succeeded</span><span>\n</span></span></span></span><span><span><span>make</span></span><span> deb</span>\n</span><span><span><span>mkdir</span></span><span> contrib/builder/rpm/aarch64</span>\n</span><span><span><span>make</span></span><span> rpm</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#notes\">Notes</a></h2>\n<ul>\n<li>Resulting <code>17.05-dev</code> binaries fail due to missing <code>libapparmor.so</code></li>\n<li>Static binary build no longer supported?</li>\n<li><code>DOCKER_BUILDTAGS</code> environment variable no longer supported?</li>\n<li><code>1.12.3</code> and <code>1.13.1</code> also built</li>\n<li>Build of RPM or DEB packages fails due, I think, to DIND not working</li>\n</ul>",+"content": "<h1><a href=\"https://mort.io/blog/docker-docker/#bootstrapping-docker-for-arm64-aka-aarch64\">Bootstrapping Docker for ARM64 (aka AARCH64)</a></h1>\n<p>Basic process is:</p>\n<ul>\n<li>bootstrap ARM64 <code>go</code> toolchain on x86, and install</li>\n<li>build ARM64 
<code>go1.7.5</code> toolchain needed for <code>docker</code> build</li>\n<li>bootstrap ARM64 <code>docker</code> v1.10.3</li>\n<li>use bootstrapped <code>docker</code> to provide containerised build environment for\nbuilding later versions</li>\n</ul>\n<p>Instructions below are for CentOS 7 for Reasons(tm). Package details and so on\nwill vary on other distros.</p>\n<h2><a href=\"https://mort.io/blog/docker-docker/#build-go-bootstrap-toolchain\">Build <code>go</code> bootstrap toolchain</a></h2>\n<p>On x86 host:</p>\n<ul>\n<li>build basic go1.4 sufficient to bootstrap</li>\n</ul>\n<pre><code><span><span><span>cd</span></span>\n</span><span><span><span>curl</span></span><span><span><span> -</span>O</span> https://storage.googleapis.com/golang/go1.4-bootstrap-20161024.tar.gz</span>\n</span><span><span><span>tar</span></span><span> xzvf go1.4-bootstrap-20161024.tar.gz</span>\n</span><span><span><span>mv</span></span><span> go go1.4</span>\n</span><span><span><span>cd</span></span><span> go1.4/src</span>\n</span><span><span><span>./make.bash</span></span>\n</span></code></pre>\n<ul>\n<li>cross-compile go1.7 (latest)</li>\n</ul>\n<pre><code><span><span><span>mkdir</span></span><span><span><span> -</span>p</span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>cd</span></span><span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>git</span></span><span> clone https://go.googlesource.com/go</span>\n</span><span><span><span>cd</span></span><span> go</span>\n</span><span><span><span>git</span></span><span> checkout go1.7.5</span>\n</span><span><span><span>cd</span></span><span> src</span>\n</span><span><span>GOOS</span><span>=</span><span>linux</span> <span>GOARCH</span><span>=</span><span>arm64</span> <span><span>./bootstrap.bash</span></span>\n</span></code></pre>\n<ul>\n<li>transfer cross-compiled toolchain to ARM64 host</li>\n</ul>\n<pre><code><span><span><span>scp</span></span><span> 
<span><span>~</span></span>/go/src/go-linux-arm64-bootstrap.tbz HOST:<span><span>~</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#build-arm64-go1-7-5\">Build ARM64 <code>go1.7.5</code></a></h2>\n<ul>\n<li>produce bootstrap toolchains</li>\n</ul>\n<pre><code><span><span><span>cd</span></span>\n</span><span><span><span>tar</span></span><span> xvf go-linux-arm64-bootstrap.tbz</span>\n</span></code></pre>\n<ul>\n<li>use the bootstrap toolchain to build a modern <code>go</code> install</li>\n</ul>\n<pre><code><span><span><span>mkdir</span></span><span><span><span> -</span>p</span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>cd</span></span><span> <span><span>~</span></span>/go/src</span>\n</span><span><span><span>git</span></span><span> clone https://go.googlesource.com/go</span>\n</span><span><span><span>cd</span></span><span> go</span>\n</span><span><span><span>git</span></span><span> checkout go1.7.5</span>\n</span><span><span><span>cd</span></span><span> src</span>\n</span><span><span>GOROOT_BOOTSTRAP</span><span>=</span><span><span><span>~</span></span>/go-linux-arm64-bootstrap</span> <span><span>./make.bash</span></span>\n</span><span><span><span>mv</span></span><span> <span><span>~</span></span>/go/src/go/bin <span><span>~</span></span>/go/bin</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#bootstrap-arm64-docker\">Bootstrap ARM64 Docker</a></h2>\n<ul>\n<li>get a recent <code>git</code></li>\n</ul>\n<pre><code><span><span><span>sudo</span></span><span> yum remove git</span>\n</span><span><span><span>wget</span></span><span> https://github.com/git/git/archive/v2.12.2.tar.gz</span>\n</span><span><span><span>tar</span></span><span> xvf v2.12.2.tar.gz</span>\n</span><span><span><span>cd</span></span><span> git-<span>*</span></span>\n</span><span><span><span>which</span></span><span> git</span>\n</span><span><span><span>sudo</span></span><span> yum install perl-devel 
hg curl-devel</span>\n</span><span><span><span>make</span></span><span> configure</span>\n</span><span><span><span>./configure</span></span><span><span><span> --</span>prefix</span><span>=</span>/usr/local</span>\n</span><span><span><span>make</span></span><span><span><span> -</span>j8</span></span>\n</span><span><span><span>sudo</span></span><span> make install</span>\n</span></code></pre>\n<ul>\n<li>install dev dependencies</li>\n</ul>\n<pre><code><span><span><span>sudo</span></span><span> yum install btrfs-progs-devel device-mapper-devel</span>\n</span></code></pre>\n<ul>\n<li>clone source</li>\n</ul>\n<pre><code><span><span><span>cd</span></span><span> <span><span>~</span></span>/go</span>\n</span><span><span><span>mkdir</span></span><span><span><span> -</span>p</span> src/github.com/docker</span>\n</span><span><span><span>cd</span></span><span> src/github.com/docker</span>\n</span><span><span><span>git</span></span><span> clone git@github.com:docker/docker</span>\n</span><span><span><span>cd</span></span><span> docker</span>\n</span></code></pre>\n<ul>\n<li>build components</li>\n</ul>\n<pre><code><span><span><span>git</span></span><span> co v1.10.3</span>\n</span><span><span>(</span> <span><span>cd</span></span><span> vendor</span> <span>&&</span> <span>for</span><span> n <span>in</span> src/<span>*</span></span> <span>;</span> <span>do</span> <span><span>ln</span></span><span><span><span> -</span>s</span> <span><span>$</span><span>n</span></span></span> <span>;</span> <span>done</span><span> </span><span>)</span>\n</span><span><span><span>./hack/make.sh</span></span><span> dynbinary</span>\n</span><span><span><span>rm</span></span><span><span><span> -</span>rf</span> vendor</span> <span>&&</span> <span><span>git</span></span><span> checkout . 
<span><span>#</span></span><span> tidy up symlinking</span><span>\n</span></span></span></code></pre>\n<p>Note that a current bug in Ubuntu packaging metadata means a small edit needs to\nbe made to <code>./Dockerfile.aarch64</code>: change the <code>apt-get update &&</code> to <code>apt-get update ;</code> so that the build doesn\u2019t stop at the first hurdle, updating packages.</p>\n<ul>\n<li>run daemon</li>\n</ul>\n<pre><code><span><span><span>#</span></span><span> sudo rm -rf /var/lib/docker /etc/docker/config.json # DANGEROUS!</span><span>\n</span></span><span><span><span>sudo</span></span><span> ./bundles/1.10.3/dynbinary/docker daemon<span><span> -</span>D</span><span><span> --</span>group</span><span>=</span>wheel</span>\n</span></code></pre>\n<ul>\n<li>run client to check</li>\n</ul>\n<pre><code><span><span><span>$</span></span><span> PATH=<span><span>$</span><span>(</span><span><span>pwd</span></span><span> <span><span>-</span>P</span></span><span>)</span></span>/bundles/1.10.3/dynbinary/:<span><span>$</span><span>PATH</span></span></span>\n</span><span><span><span>$</span></span><span> docker version</span>\n</span><span><span><span>Client:</span></span>\n</span><span> <span><span>Version:</span></span><span> 1.10.3</span>\n</span><span> <span><span>API</span></span><span> version: 1.22</span>\n</span><span> <span><span>Go</span></span><span> version: go1.7.5</span>\n</span><span> <span><span>Git</span></span><span> commit: 20f81dde9</span>\n</span><span> <span><span>Built:</span></span><span> Tue Apr 4 00:27:13 2017</span>\n</span><span> <span><span>OS/Arch:</span></span><span> linux/arm64</span>\n</span><span>\n</span><span><span><span>Server:</span></span>\n</span><span> <span><span>Version:</span></span><span> 1.10.3</span>\n</span><span> <span><span>API</span></span><span> version: 1.22</span>\n</span><span> <span><span>Go</span></span><span> version: go1.7.5</span>\n</span><span> <span><span>Git</span></span><span> commit: 
20f81dde9</span>\n</span><span> <span><span>Built:</span></span><span> Tue Apr 4 00:27:13 2017</span>\n</span><span> <span><span>OS/Arch:</span></span><span> linux/arm64</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#build-docker\">Build Docker</a></h2>\n<ul>\n<li>reissue <code>make</code> using containerised build</li>\n</ul>\n<pre><code><span><span><span>export</span> <span>DOCKER_BUILDTAGS</span><span>=</span><span><span><span>'</span>apparmor selinux seccomp<span>'</span></span></span></span>\n</span><span><span><span>git</span></span><span> co v17.05.0-ce <span><span>#</span></span><span> or v1.12.3 or master or whatever</span><span>\n</span></span></span><span><span><span>make <span><span>#</span></span><span> transient failure of first build; restart succeeded</span><span>\n</span></span></span></span><span><span><span>make</span></span><span> deb</span>\n</span><span><span><span>mkdir</span></span><span> contrib/builder/rpm/aarch64</span>\n</span><span><span><span>make</span></span><span> rpm</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/docker-docker/#notes\">Notes</a></h2>\n<ul>\n<li>Resulting <code>17.05-dev</code> binaries fail due to missing <code>libapparmor.so</code></li>\n<li>Static binary build no longer supported?</li>\n<li><code>DOCKER_BUILDTAGS</code> environment variable no longer supported?</li>\n<li><code>1.12.3</code> and <code>1.13.1</code> also built</li>\n<li>Build of RPM or DEB packages fails due, I think, to DIND not working</li>\n</ul>",
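The vendor symlink one-liner in the "build components" step above is terse; a side-effect-free sketch of what it appears to do (scratch directory and package names are illustrative, not the real docker checkout — the build seemingly wants each tree under `vendor/src/` reachable directly under `vendor/`):

```shell
# Sketch of "( cd vendor && for n in src/* ; do ln -s $n ; done )" from the
# v1.10.3 build step above: each package tree under vendor/src/ gets a
# sibling symlink directly under vendor/. Demonstrated in a scratch dir.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/vendor/src/github.com" "$tmp/vendor/src/golang.org"
( cd "$tmp/vendor" && for n in src/*; do ln -s "$n"; done )
ls "$tmp/vendor"
```

The matching `rm -rf vendor && git checkout .` in the original then undoes the symlinking in the real tree.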
+18
mort/blog_ecscw-and-aarhus_.json
···+"summary": "<p>I have to confess to being quite pleased to have a couple of\n<a href=\"http://hdiresearch.org/\">HDI</a>-related papers accepted recently (even if we\ncan\u2019t even get proper reviews on the\n<a href=\"http://ssrn.com/abstract=2508051\">original HDI paper</a> \u2013 recently judged\nout-of-scope for the third time, even though the Special Issue in question\nseemed bang on target!).</p>\n<p>The first is a full paper to <a href=\"http://www.ecscw2015.no/\">ECSCW\u201915</a> titled\n<a href=\"http://mor1.github.io/publications/pdf/ecscw15-hdi.pdf\"><em>Human Data Interaction: Historical Lessons from Social Studies and CSCW</em></a>.\nA collaboration with <a href=\"http://www.andy-crabtree.com/\">Dr Andy Crabtree</a>, it\nexamines particularly the role of <em>interaction</em> in HDI and explores how past\ntechnical approaches, such as\n<a href=\"http://mor1.github.io/publications/pdf/comsnets11-dataware.pdf\">Dataware</a>, to\nthe challenges posed by HDI haven\u2019t fully considered the inherently social\nnature of data.</p>\n<p>The second is a short paper to the decennial \u2013 seriously: every ten years! \u2013\n<a href=\"http://aarhus2015.org\">Aarhus 2015</a> conference titled\n<a href=\"http://mor1.github.io/publications/pdf/aarhus15-databox.pdf\"><em>Personal Data: Thinking Inside the Box</em></a>.\nThis sets out a vision for an embodiment of a <em>Databox</em>: a physical device\nsupported by online services that empowers us to take back control of our online\nlives. 
Building a Databox, using <a href=\"https://mirage.io/\">MirageOS</a>, is hopefully\ngoing to be a key activity for me in the coming months\u2026</p>",+"content": "<p>I have to confess to being quite pleased to have a couple of\n<a href=\"http://hdiresearch.org/\">HDI</a>-related papers accepted recently (even if we\ncan\u2019t even get proper reviews on the\n<a href=\"http://ssrn.com/abstract=2508051\">original HDI paper</a> \u2013 recently judged\nout-of-scope for the third time, even though the Special Issue in question\nseemed bang on target!).</p>\n<p>The first is a full paper to <a href=\"http://www.ecscw2015.no/\">ECSCW\u201915</a> titled\n<a href=\"http://mor1.github.io/publications/pdf/ecscw15-hdi.pdf\"><em>Human Data Interaction: Historical Lessons from Social Studies and CSCW</em></a>.\nA collaboration with <a href=\"http://www.andy-crabtree.com/\">Dr Andy Crabtree</a>, it\nexamines particularly the role of <em>interaction</em> in HDI and explores how past\ntechnical approaches, such as\n<a href=\"http://mor1.github.io/publications/pdf/comsnets11-dataware.pdf\">Dataware</a>, to\nthe challenges posed by HDI haven\u2019t fully considered the inherently social\nnature of data.</p>\n<p>The second is a short paper to the decennial \u2013 seriously: every ten years! \u2013\n<a href=\"http://aarhus2015.org\">Aarhus 2015</a> conference titled\n<a href=\"http://mor1.github.io/publications/pdf/aarhus15-databox.pdf\"><em>Personal Data: Thinking Inside the Box</em></a>.\nThis sets out a vision for an embodiment of a <em>Databox</em>: a physical device\nsupported by online services that empowers us to take back control of our online\nlives. Building a Databox, using <a href=\"https://mirage.io/\">MirageOS</a>, is hopefully\ngoing to be a key activity for me in the coming months\u2026</p>",
+18
mort/blog_elcapitan-maps_.json
···+"summary": "<p>A bit of a delay since the last post \u2013 lots going on! But anyway: I\n(relatively) recently upgraded my old skool Macbook Pro (look! built-in Ethernet\nport! DVD drive!) to El Capitan. This was generally rather less faff than the\nprevious upgrade, though it did seem to take rather more reboots than might have\nbeen assumed to be <em>strictly</em> necessary before it settled down, and I\u2019d\nremembered to fix up permissions for Homebrew with <code>sudo chown -R $(whoami):admin /usr/local</code>. So that was ok.</p>\n<p><img alt=\"Macbook Pro UK Keyboard\" src=\"https://mort.io/blog/elcapitan-maps/keyboard-small.png\" title=\"Macbook Pro\nUK Keyboard\"></p>\n<p>Except\u2026 I have a slightly odd keyboard and mouse setup. It\u2019s a UK Macbook\nwhich means a slightly tweaked keyboard layout compared to the standard US\nMacbook keyboard. At my desk, I also use a <em>Microsoft Digital Media Keyboard</em> \u2013\nnice action (for me!) plus some handy shortcut keys \u2013 and a <em>Microsoft 5-Button\nMouse with IntelliEye</em>. Now, until El Capitan I\u2019d happily been using the\nMicrosoft provided software to make use of the extra mouse buttons and shortcut\nkeys, coupled with a\n<a href=\"http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=ukelele\">Ukelele-generated</a>\nkeymap to handle the oddities of the UK laptop keyboard (like, who in the world\nreally needs <code>\u00a7</code> at the top-left key, below <code>escape</code> rather than <code>`</code>; and\ndoesn\u2019t need an easily accessible <code>#</code>?).</p>\n<p>This had never been entirely satisfactory \u2013 I had to have a standard keymap\ninstalled in addition to my modified one, and some apps (all of Microsoft\nOffice, I\u2019m looking at you) liked to intermittently flip the keymap away from my\nkeymap to the standard issue one, including undoing my remapping of <code>caps lock</code>\nto <code>ctrl</code>. 
This was annoying, but having it completely break was intolerable.</p>\n<p>So I went hunting for alternatives and am now very happy with\n<a href=\"https://pqrs.org/osx/karabiner/\">Karabiner.app</a> for standard keyboard remappings, and fairly happy\nwith <a href=\"http://www.usboverdrive.com\">USB Overdrive</a> to handle the mouse and the\nspecial Microsoft Digital Media Keyboard shortcut keys.</p>\n<h3><a href=\"https://mort.io/blog/elcapitan-maps/#usb-overdrive\">USB Overdrive</a></h3>\n<p>USB Overdrive seems to do the mouse mappings correctly, having detected the\ndevice as a \u201cMicrosoft 5-Button Mouse with IntelliEye(tm), Any Application\u201d \u2013\n<code>Button 4</code> and <code>Button 5</code> can be remapped to <code>forward</code> and <code>back</code>, just as I\nlike it.</p>\n<p><img alt=\"USB Overdrive\" src=\"https://mort.io/blog/elcapitan-maps/usboverdrive.png\" title=\"USB Overdrive Configuration\"></p>\n<p>It also allows me to repurpose some of the extra keys on my Microsoft keyboard\nthat <a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a> doesn\u2019t seem able to see \u2013 so I get one touch play/pause of\niTunes and other such delights.</p>\n<h3><a href=\"https://mort.io/blog/elcapitan-maps/#karabiner-app\">Karabiner.app</a></h3>\n<p><a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a> took a bit more setting up but does a very nice job. I needed to\nremap certain keys differently on the two different keyboards to make both\nconsistent and to fix some of the weirder (to my mind!) decisions both Microsoft\nand (particularly) Apple have taken with their layouts. The result is an\n<a href=\"https://github.com/mor1/rc-files/blob/master/karabiner.xml\">XML configuration file</a>, symlinked by <code>~/Library/Application Support/Karabiner/private.xml</code>. 
This applies two keymaps based on the detected\ndevice, using product ID codes determined by the <code>EventViewer</code> app that comes\nwith <a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a>:</p>\n<pre><code><span><span><span><</span><span>deviceproductdef</span><span>></span></span>\n</span><span> <span><span><</span><span>productname</span><span>></span></span>MACBOOK_PRO_UK_KEYBOARD<span><span></</span><span>productname</span><span>></span></span>\n</span><span> <span><span><</span><span>productid</span><span>></span></span>0x0253<span><span></</span><span>productid</span><span>></span></span>\n</span><span><span><span></</span><span>deviceproductdef</span><span>></span></span>\n</span><span>\n</span><span><span><span><</span><span>deviceproductdef</span><span>></span></span>\n</span><span> <span><span><</span><span>productname</span><span>></span></span>DIGITAL_MEDIA_KEYBOARD<span><span></</span><span>productname</span><span>></span></span>\n</span><span> <span><span><</span><span>productid</span><span>></span></span>0x00b4<span><span></</span><span>productid</span><span>></span></span>\n</span><span><span><span></</span><span>deviceproductdef</span><span>></span></span>\n</span><span>\n</span><span><span><span><</span><span>deviceproductdef</span><span>></span></span>\n</span><span> <span><span><</span><span>productname</span><span>></span></span>FIVE_BUTTON_MOUSE_WITH_INTELLIEYE<span><span></</span><span>productname</span><span>></span></span>\n</span><span> <span><span><</span><span>productid</span><span>></span></span>0x0039<span><span></</span><span>productid</span><span>></span></span>\n</span><span><span><span></</span><span>deviceproductdef</span><span>></span></span>\n</span></code></pre>\n<p>There are then two <code><item></item></code> stanzas that configure the two different\nkeyboards, e.g.,</p>\n<pre><code><span><span><span><</span><span>item</span><span>></span></span>\n</span><span> 
<span><span><</span><span>name</span><span>></span></span>Keyboard mappings for Microsoft keyboard<span><span></</span><span>name</span><span>></span></span>\n</span><span> <span><span><</span><span>identifier</span><span>></span></span>private.io.mort.microsoft_keyboard<span><span></</span><span>identifier</span><span>></span></span>\n</span><span> <span><span><</span><span>device_only</span><span>></span></span>\n</span><span> DeviceVendor::MICROSOFT,\n</span><span> DeviceProduct::DIGITAL_MEDIA_KEYBOARD\n</span><span> <span><span></</span><span>device</span><span>></span></span>\n</span><span> ...\n</span></code></pre>\n<p>Each of these contains a number of <code><autogen></autogen></code> stanza mapping specific\nkeycodes for that keymap. For example, I want the top-left key on the main block\nto be <code>`</code> and, when shifted, to be <code>\u20ac</code>. This leads to the following on the\nMicrosoft keyboard:</p>\n<pre><code><span><span><span><!--</span> shift-` to \u20ac <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::SHIFT_L | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::SHIFT_R | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span></code></pre>\n<p>\u2026but to the following on the Macbook built-in UK keyboard, to take account\nfirst of the different keycode it generates but also to ensure that when used\nwith command and command-shift, the standard behaviour of cycling between\nwindows 
works:</p>\n<pre><code><span><span><span><!--</span> top-left \u00a7 to ` <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::NONE,\n</span><span> KeyCode::BACKQUOTE\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><!--</span> ...with shift, to \u20ac <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::SHIFT_L | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::SHIFT_R | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><!--</span> ...with COMMAND/SHIFT, so that cycle-window-{forward,back} work <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::COMMAND_L | ModifierFlag::NONE,\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::COMMAND_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::COMMAND_L | ModifierFlag::SHIFT_L | ModifierFlag::NONE,\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::COMMAND_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span></code></pre>\n<p>There 
are a number of other mappings made in <a href=\"https://github.com/mor1/rc-files/blob/master/karabiner.xml\">karabiner.xml</a>: <code>shift-'</code> is\n<code>@</code>, <code>shift-2</code> is <code>\"</code>, <code>shift-3</code> is <code>\u00a3</code>, and resolving general confusion among\n<code>#</code>, <code>\\</code>, <code>~</code>, and <code>|</code>.</p>\n<h3><a href=\"https://mort.io/blog/elcapitan-maps/#emacs\">Emacs</a></h3>\n<p>That fixed things for the terminal and for most apps \u2013 the only remaining\nsticking point was Emacs. I don\u2019t pretend to understand the entire chain of\nevent processing but suffice it to say that Emacs was receiving <code>shift-@</code> and\n<code>shift-3</code> without knowing what to do with them. Fortunately, when coupled with\n<a href=\"https://github.com/mor1/rc-files/blob/master/emacs.d/init.el#L929-L1019\">my hacks to enforce a <code>my-keys-minor-mode</code> to override everything</a>,\nthe fix was pretty straightforward:</p>\n<pre><code><span><span><span>(</span>define<span>-</span>key my<span>-</span>keys<span>-</span>minor<span>-</span>mode<span>-</span><span>map</span> <span><span>(</span>kbd <span><span>"</span>s-@<span>"</span></span><span>)</span></span> <span><span>"</span>\u20ac<span>"</span></span><span>)</span></span>\n</span><span><span><span>(</span>define<span>-</span>key my<span>-</span>keys<span>-</span>minor<span>-</span>mode<span>-</span><span>map</span> <span><span>(</span>kbd <span><span>"</span>s-3<span>"</span></span><span>)</span></span>\n</span></span><span><span> '<span><span>(</span><span>lambda</span> <span><span>(</span><span>)</span></span> <span><span>(</span>interactive<span>)</span></span> <span><span>(</span>insert<span>-</span><span>char</span> <span><span>#</span>x00A3</span><span>)</span></span><span>)</span></span><span>)</span></span> <span><span>;</span> \u00a3\n</span></span></code></pre>\n<h3><a 
href=\"https://mort.io/blog/elcapitan-maps/#result\">Result?</a></h3>\n<p>A <strong>significant</strong> decrease in the need I feel to curse because my keyboard has\nchanged in the middle of typing! It seems that keyboards remain, like time and\nterminals, one of those <em>Really Hard</em> things for computers/manufacturers to\nhandle\u2026</p>\n<p><em>Note: Thanks to <a href=\"http://www.amp-what.com/unicode/search/\">http://www.amp-what.com/unicode/search/</a> for an easy way to\nhunt down some of the unicode symbols used above!</em></p>",+"content": "<p>A bit of a delay since the last post \u2013 lots going on! But anyway: I\n(relatively) recently upgraded my old skool Macbook Pro (look! built-in Ethernet\nport! DVD drive!) to El Capitan. This was generally rather less faff than the\nprevious upgrade, though it did seem to take rather more reboots than might have\nbeen assumed to be <em>strictly</em> necessary before it settled down, and I\u2019d\nremembered to fix up permissions for Homebrew with <code>sudo chown -R $(whoami):admin /usr/local</code>. So that was ok.</p>\n<p><img alt=\"Macbook Pro UK Keyboard\" src=\"https://mort.io/blog/elcapitan-maps/keyboard-small.png\" title=\"Macbook Pro\nUK Keyboard\"></p>\n<p>Except\u2026 I have a slightly odd keyboard and mouse setup. It\u2019s a UK Macbook\nwhich means a slightly tweaked keyboard layout compared to the standard US\nMacbook keyboard. At my desk, I also use a <em>Microsoft Digital Media Keyboard</em> \u2013\nnice action (for me!) plus some handy shortcut keys \u2013 and a <em>Microsoft 5-Button\nMouse with IntelliEye</em>. 
Now, until El Capitan I\u2019d happily been using the\nMicrosoft provided software to make use of the extra mouse buttons and shortcut\nkeys, coupled with a\n<a href=\"http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=ukelele\">Ukelele-generated</a>\nkeymap to handle the oddities of the UK laptop keyboard (like, who in the world\nreally needs <code>\u00a7</code> at the top-left key, below <code>escape</code> rather than <code>`</code>; and\ndoesn\u2019t need an easily accessible <code>#</code>?).</p>\n<p>This had never been entirely satisfactory \u2013 I had to have a standard keymap\ninstalled in addition to my modified one, and some apps (all of Microsoft\nOffice, I\u2019m looking at you) liked to intermittently flip the keymap away from my\nkeymap to the standard issue one, including undoing my remapping of <code>caps lock</code>\nto <code>ctrl</code>. This was annoying, but having it completely break was intolerable.</p>\n<p>So I went hunting for alternatives and am now very happy with\n<a href=\"https://pqrs.org/osx/karabiner/\">Karabiner.app</a> for standard keyboard remappings, and fairly happy\nwith <a href=\"http://www.usboverdrive.com\">USB Overdrive</a> to handle the mouse and the\nspecial Microsoft Digital Media Keyboard shortcut keys.</p>\n<h3><a href=\"https://mort.io/blog/elcapitan-maps/#usb-overdrive\">USB Overdrive</a></h3>\n<p>USB Overdrive seems to do the mouse mappings correctly, having detected the\ndevice as a \u201cMicrosoft 5-Button Mouse with IntelliEye(tm), Any Application\u201d \u2013\n<code>Button 4</code> and <code>Button 5</code> can be remapped to <code>forward</code> and <code>back</code>, just as I\nlike it.</p>\n<p><img alt=\"USB Overdrive\" src=\"https://mort.io/blog/elcapitan-maps/usboverdrive.png\" title=\"USB Overdrive Configuration\"></p>\n<p>It also allows me to repurpose some of the extra keys on my Microsoft keyboard\nthat <a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a> doesn\u2019t seem able to see \u2013 
so I get one touch play/pause of\niTunes and other such delights.</p>\n<h3><a href=\"https://mort.io/blog/elcapitan-maps/#karabiner-app\">Karabiner.app</a></h3>\n<p><a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a> took a bit more setting up but does a very nice job. I needed to\nremap certain keys differently on the two different keyboards to make both\nconsistent and to fix some of the weirder (to my mind!) decisions both Microsoft\nand (particularly) Apple have taken with their layouts. The result is an\n<a href=\"https://github.com/mor1/rc-files/blob/master/karabiner.xml\">XML configuration file</a>, symlinked by <code>~/Library/Application Support/Karabiner/private.xml</code>. This applies two keymaps based on the detected\ndevice, using product ID codes determined by the <code>EventViewer</code> app that comes\nwith <a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a>:</p>\n<pre><code><span><span><span><</span><span>deviceproductdef</span><span>></span></span>\n</span><span> <span><span><</span><span>productname</span><span>></span></span>MACBOOK_PRO_UK_KEYBOARD<span><span></</span><span>productname</span><span>></span></span>\n</span><span> <span><span><</span><span>productid</span><span>></span></span>0x0253<span><span></</span><span>productid</span><span>></span></span>\n</span><span><span><span></</span><span>deviceproductdef</span><span>></span></span>\n</span><span>\n</span><span><span><span><</span><span>deviceproductdef</span><span>></span></span>\n</span><span> <span><span><</span><span>productname</span><span>></span></span>DIGITAL_MEDIA_KEYBOARD<span><span></</span><span>productname</span><span>></span></span>\n</span><span> 
<span><span><</span><span>productid</span><span>></span></span>0x00b4<span><span></</span><span>productid</span><span>></span></span>\n</span><span><span><span></</span><span>deviceproductdef</span><span>></span></span>\n</span><span>\n</span><span><span><span><</span><span>deviceproductdef</span><span>></span></span>\n</span><span> <span><span><</span><span>productname</span><span>></span></span>FIVE_BUTTON_MOUSE_WITH_INTELLIEYE<span><span></</span><span>productname</span><span>></span></span>\n</span><span> <span><span><</span><span>productid</span><span>></span></span>0x0039<span><span></</span><span>productid</span><span>></span></span>\n</span><span><span><span></</span><span>deviceproductdef</span><span>></span></span>\n</span></code></pre>\n<p>There are then two <code><item></item></code> stanzas that configure the two different\nkeyboards, e.g.,</p>\n<pre><code><span><span><span><</span><span>item</span><span>></span></span>\n</span><span> <span><span><</span><span>name</span><span>></span></span>Keyboard mappings for Microsoft keyboard<span><span></</span><span>name</span><span>></span></span>\n</span><span> <span><span><</span><span>identifier</span><span>></span></span>private.io.mort.microsoft_keyboard<span><span></</span><span>identifier</span><span>></span></span>\n</span><span> <span><span><</span><span>device_only</span><span>></span></span>\n</span><span> DeviceVendor::MICROSOFT,\n</span><span> DeviceProduct::DIGITAL_MEDIA_KEYBOARD\n</span><span> <span><span></</span><span>device</span><span>></span></span>\n</span><span> ...\n</span></code></pre>\n<p>Each of these contains a number of <code><autogen></autogen></code> stanza mapping specific\nkeycodes for that keymap. For example, I want the top-left key on the main block\nto be <code>`</code> and, when shifted, to be <code>\u20ac</code>. 
This leads to the following on the\nMicrosoft keyboard:</p>\n<pre><code><span><span><span><!--</span> shift-` to \u20ac <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::SHIFT_L | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::SHIFT_R | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span></code></pre>\n<p>\u2026but to the following on the Macbook built-in UK keyboard, to take account\nfirst of the different keycode it generates but also to ensure that when used\nwith command and command-shift, the standard behaviour of cycling between\nwindows works:</p>\n<pre><code><span><span><span><!--</span> top-left \u00a7 to ` <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::NONE,\n</span><span> KeyCode::BACKQUOTE\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><!--</span> ...with shift, to \u20ac <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::SHIFT_L | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> 
__KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::SHIFT_R | ModifierFlag::NONE,\n</span><span> KeyCode::KEY_2, ModifierFlag::OPTION_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><!--</span> ...with COMMAND/SHIFT, so that cycle-window-{forward,back} work <span>--></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::COMMAND_L | ModifierFlag::NONE,\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::COMMAND_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span><span><span><span><</span><span>autogen</span><span>></span></span>\n</span><span> __KeyToKey__\n</span><span> KeyCode::DANISH_DOLLAR, ModifierFlag::COMMAND_L | ModifierFlag::SHIFT_L | ModifierFlag::NONE,\n</span><span> KeyCode::BACKQUOTE, ModifierFlag::COMMAND_R | ModifierFlag::SHIFT_R\n</span><span><span><span></</span><span>autogen</span><span>></span></span>\n</span></code></pre>\n<p>There are a number of other mappings made in <a href=\"https://github.com/mor1/rc-files/blob/master/karabiner.xml\">karabiner.xml</a>: <code>shift-'</code> is\n<code>@</code>, <code>shift-2</code> is <code>\"</code>, <code>shift-3</code> is <code>\u00a3</code>, and resolving general confusion among\n<code>#</code>, <code>\\</code>, <code>~</code>, and <code>|</code>.</p>\n<h3><a href=\"https://mort.io/blog/elcapitan-maps/#emacs\">Emacs</a></h3>\n<p>That fixed things for the terminal and for most apps \u2013 the only remaining\nsticking point was Emacs. I don\u2019t pretend to understand the entire chain of\nevent processing but suffice it to say that Emacs was receiving <code>shift-@</code> and\n<code>shift-3</code> without knowing what to do with them. 
Fortunately, when coupled with\n<a href=\"https://github.com/mor1/rc-files/blob/master/emacs.d/init.el#L929-L1019\">my hacks to enforce a <code>my-keys-minor-mode</code> to override everything</a>,\nthe fix was pretty straightforward:</p>\n<pre><code><span><span><span>(</span>define<span>-</span>key my<span>-</span>keys<span>-</span>minor<span>-</span>mode<span>-</span><span>map</span> <span><span>(</span>kbd <span><span>"</span>s-@<span>"</span></span><span>)</span></span> <span><span>"</span>\u20ac<span>"</span></span><span>)</span></span>\n</span><span><span><span>(</span>define<span>-</span>key my<span>-</span>keys<span>-</span>minor<span>-</span>mode<span>-</span><span>map</span> <span><span>(</span>kbd <span><span>"</span>s-3<span>"</span></span><span>)</span></span>\n</span></span><span><span> '<span><span>(</span><span>lambda</span> <span><span>(</span><span>)</span></span> <span><span>(</span>interactive<span>)</span></span> <span><span>(</span>insert<span>-</span><span>char</span> <span><span>#</span>x00A3</span><span>)</span></span><span>)</span></span><span>)</span></span> <span><span>;</span> \u00a3\n</span></span></code></pre>\n<h3><a href=\"https://mort.io/blog/elcapitan-maps/#result\">Result?</a></h3>\n<p>A <strong>significant</strong> decrease in the need I feel to curse because my keyboard has\nchanged in the middle of typing! It seems that keyboards remain, like time and\nterminals, one of those <em>Really Hard</em> things for computers/manufacturers to\nhandle\u2026</p>\n<p><em>Note: Thanks to <a href=\"http://www.amp-what.com/unicode/search/\">http://www.amp-what.com/unicode/search/</a> for an easy way to\nhunt down some of the unicode symbols used above!</em></p>",
+18
mort/blog_electron-cli_.json
···+"summary": "<p>Another short one, this time the magic flags to get debug output and access to\nwebdev tools under Electron apps such as Slack and whatnot on Wayland:</p>\n<pre><code><span><span><span>APPLICATION</span></span><span><span><span> --</span>enable-logging</span><span><span> --</span>devtools</span> <span>\\\n</span></span></span><span><span><span><span> --</span>platform</span><span>=</span>wayland<span><span> --</span>enable-features</span><span>=</span>UseOzonePlatform<span><span> --</span>password-store</span><span>=</span><span><span>"</span>gnome_libsecret<span>"</span></span></span>\n</span></code></pre>",+"content": "<p>Another short one, this time the magic flags to get debug output and access to\nwebdev tools under Electron apps such as Slack and whatnot on Wayland:</p>\n<pre><code><span><span><span>APPLICATION</span></span><span><span><span> --</span>enable-logging</span><span><span> --</span>devtools</span> <span>\\\n</span></span></span><span><span><span><span> --</span>platform</span><span>=</span>wayland<span><span> --</span>enable-features</span><span>=</span>UseOzonePlatform<span><span> --</span>password-store</span><span>=</span><span><span>"</span>gnome_libsecret<span>"</span></span></span>\n</span></code></pre>",
+18
mort/blog_falsehoods_.json
···+"summary": "<p>Being a list of some cool \u201cFalsehoods programmers believe about \u2026\u201d sites, now\nredundant thanks to <a href=\"https://github.com/kdeldycke/awesome-falsehood\">https://github.com/kdeldycke/awesome-falsehood</a>:</p>\n<ul>\n<li><a href=\"http://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time\">\u2026time</a></li>\n<li><a href=\"http://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time\">\u2026time, more</a></li>\n<li><a href=\"http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/\">\u2026names</a></li>\n<li><a href=\"https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/\">\u2026addresses</a></li>\n<li><a href=\"http://wiesmann.codiferes.net/wordpress/?p=15187&lang=en\">\u2026geography</a></li>\n</ul>",+"content": "<p>Being a list of some cool \u201cFalsehoods programmers believe about \u2026\u201d sites, now\nredundant thanks to <a href=\"https://github.com/kdeldycke/awesome-falsehood\">https://github.com/kdeldycke/awesome-falsehood</a>:</p>\n<ul>\n<li><a href=\"http://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time\">\u2026time</a></li>\n<li><a href=\"http://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time\">\u2026time, more</a></li>\n<li><a href=\"http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/\">\u2026names</a></li>\n<li><a href=\"https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/\">\u2026addresses</a></li>\n<li><a href=\"http://wiesmann.codiferes.net/wordpress/?p=15187&lang=en\">\u2026geography</a></li>\n</ul>",
+18
mort/blog_fontsizing_.json
···+"summary": "<p>I recently had colleagues hit an issue that I have hit myself in the past, and\nso I finally decided to figure out a fix.</p>\n<p>Specifically, when building EPSRC research proposals in LaTeX, getting a\ncomplaint that the font size is non-compliant \u2013 it should be 11pt Arial, but\nthe standard LaTeX options generate something slightly smaller, with Adobe\nAcrobat and Microsoft tools both reporting a size of 10.45pt or so.</p>\n<p>One proposed solution was to add the following in the preamble:</p>\n<pre><code><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>anyfontsize</span><span>}</span></span></span>\n</span><span><span><span>\\</span>AtBeginDocument</span><span><span>{</span><span><span>\\</span>fontsize</span><span><span>{</span>11bp<span>}</span></span><span><span>{</span>13.35bp<span>}</span></span><span><span>\\</span>selectfont</span><span>}</span></span> \n</span></code></pre>\n<p>\u2026but that did not work unfortunately.</p>\n<p>After some poking about and staring at output and searching the interwebs, it\nappears that this was triggered, at least for me, by the\n<a href=\"https://ctan.org/pkg/fontspec\"><code>fontspec</code></a> package that was being used to sort\nout fonts and unicode and so on in conjunction with\n<a href=\"https://xetex.sourceforge.net/\">XeLaTeX</a> as a backend driver.</p>\n<p>A key piece of debug logic was to add the following text in a document:</p>\n<pre><code><span>The quick fox --- <span><span>\\</span>the</span><span><span>\\</span>fontdimen</span>6<span><span>\\</span>font</span><span><span>\\</span>relax</span>\n</span></code></pre>\n<p>\u2026which ensured there was some text and then inserted the font dimensions\naccording to LaTeX. 
It did indeed produce the output <code>The quick fox \u2014 9.54147pt</code> when it should\u2019ve been <code>10pt</code>.</p>\n<p>The font runes I was using were</p>\n<pre><code><span><span><span><span>\\</span>usepackage</span><span><span>[</span>T1<span>]</span></span><span><span><span>{</span></span></span></span><span><span><span>fontenc</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>lmodern</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>amssymb</span>,amsmath<span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>eurosym</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>upquote</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>microtype</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>fontspec</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>xltxtra</span>,xunicode<span>}</span></span></span>\n</span><span><span><span>\\</span>defaultfontfeatures</span><span><span>{</span>Mapping=tex-text,Scale=MatchUppercase<span>}</span></span>\n</span><span><span><span><span><span>\\</span>renewcommand</span></span><span>{</span><span><span>\\familydefault</span></span><span>}</span><span>{</span><span><span>\\</span>rmdefault</span><span>}</span></span>\n</span><span><span><span>\\</span>setmainfont</span><span><span>{</span>Arial<spa
n>}</span></span>\n</span><span><span><span>\\</span>setmonofont</span><span><span>{</span>Hack Nerd Font<span>}</span></span>\n</span></code></pre>\n<p>\u2026and it seemed to be the <code>Scale=MatchUppercase</code> clause that caused the\nproblem. Further investigation suggested that most of that was actually\ncopypasta legacy code that was no longer required; replacing with</p>\n<pre><code><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>fontspec</span><span>}</span></span></span>\n</span><span><span><span>\\</span>setmainfont</span><span><span>{</span>Arial<span>}</span></span>\n</span><span><span><span>\\</span>setmonofont</span><span><span>{</span>Hack Nerd Font<span>}</span></span>\n</span></code></pre>\n<p>\u2026gave the output <code>The quick fox \u2014 10.0pt</code> in Arial as expected. Which was\nnice.</p>\n<p>I also realised in the course of doing this that <code>xelatex</code> is now deprecated as\na backend, so I have started using the still actively developed\n<a href=\"https://www.luatex.org/\"><code>luatex</code></a> backend driver instead by passing\n<code>-lualatex</code> to <a href=\"https://mgeier.github.io/latexmk.html\"><code>latexmk</code></a> and that has\nworked fine so far.</p>",+"content": "<p>I recently had colleagues hit an issue that I have hit myself in the past, and\nso I finally decided to figure out a fix.</p>\n<p>Specifically, when building EPSRC research proposals in LaTeX, getting a\ncomplaint that the font size is non-compliant \u2013 it should be 11pt Arial, but\nthe standard LaTeX options generate something slightly smaller, with Adobe\nAcrobat and Microsoft tools both reporting a size of 10.45pt or so.</p>\n<p>One proposed solution was to add the following in the 
preamble:</p>\n<pre><code><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>anyfontsize</span><span>}</span></span></span>\n</span><span><span><span>\\</span>AtBeginDocument</span><span><span>{</span><span><span>\\</span>fontsize</span><span><span>{</span>11bp<span>}</span></span><span><span>{</span>13.35bp<span>}</span></span><span><span>\\</span>selectfont</span><span>}</span></span> \n</span></code></pre>\n<p>\u2026but that did not work unfortunately.</p>\n<p>After some poking about and staring at output and searching the interwebs, it\nappears that this was triggered, at least for me, by the\n<a href=\"https://ctan.org/pkg/fontspec\"><code>fontspec</code></a> package that was being used to sort\nout fonts and unicode and so on in conjunction with\n<a href=\"https://xetex.sourceforge.net/\">XeLaTeX</a> as a backend driver.</p>\n<p>A key piece of debug logic was to add the following text in a document:</p>\n<pre><code><span>The quick fox --- <span><span>\\</span>the</span><span><span>\\</span>fontdimen</span>6<span><span>\\</span>font</span><span><span>\\</span>relax</span>\n</span></code></pre>\n<p>\u2026which ensured there was some text and then inserted the font dimensions\naccording to LaTeX. 
It did indeed produce the output <code>The quick fox \u2014 9.54147pt</code> when it should\u2019ve been <code>10pt</code>.</p>\n<p>The font runes I was using were</p>\n<pre><code><span><span><span><span>\\</span>usepackage</span><span><span>[</span>T1<span>]</span></span><span><span><span>{</span></span></span></span><span><span><span>fontenc</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>lmodern</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>amssymb</span>,amsmath<span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>eurosym</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>upquote</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>microtype</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>fontspec</span><span>}</span></span></span>\n</span><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>xltxtra</span>,xunicode<span>}</span></span></span>\n</span><span><span><span>\\</span>defaultfontfeatures</span><span><span>{</span>Mapping=tex-text,Scale=MatchUppercase<span>}</span></span>\n</span><span><span><span><span><span>\\</span>renewcommand</span></span><span>{</span><span><span>\\familydefault</span></span><span>}</span><span>{</span><span><span>\\</span>rmdefault</span><span>}</span></span>\n</span><span><span><span>\\</span>setmainfont</span><span><span>{</span>Arial<spa
n>}</span></span>\n</span><span><span><span>\\</span>setmonofont</span><span><span>{</span>Hack Nerd Font<span>}</span></span>\n</span></code></pre>\n<p>\u2026and it seemed to be the <code>Scale=MatchUppercase</code> clause that caused the\nproblem. Further investigation suggested that most of that was actually\ncopypasta legacy code that was no longer required; replacing with</p>\n<pre><code><span><span><span><span>\\</span>usepackage</span><span><span><span>{</span></span></span></span><span><span><span>fontspec</span><span>}</span></span></span>\n</span><span><span><span>\\</span>setmainfont</span><span><span>{</span>Arial<span>}</span></span>\n</span><span><span><span>\\</span>setmonofont</span><span><span>{</span>Hack Nerd Font<span>}</span></span>\n</span></code></pre>\n<p>\u2026gave the output <code>The quick fox \u2014 10.0pt</code> in Arial as expected. Which was\nnice.</p>\n<p>I also realised in the course of doing this that <code>xelatex</code> is now deprecated as\na backend, so I have started using the still actively developed\n<a href=\"https://www.luatex.org/\"><code>luatex</code></a> backend driver instead by passing\n<code>-lualatex</code> to <a href=\"https://mgeier.github.io/latexmk.html\"><code>latexmk</code></a> and that has\nworked fine so far.</p>",
+18
mort/blog_google-screening_.json
···+"summary": "<p>Some time ago, for reasons best known to themselves, a Google recruiter decided\nto \u201creach out\u201d on the basis of <a href=\"https://github.com/mor1/\">my GitHub profile</a> to\nsee if I were interested in a role as a Site-Reliability Engineer or possibly a\nSoftware Engineer. This entailed a short (~30min) telephone interview to answer\nsome questions. I made a note of those I recalled, in case anyone\u2019s interested.</p>\n<p>The hawk-eyed and keen-minded among you may discern a certain amount of\nambiguity in answers to some of the questions \u2013 e.g., is the opposite of\n<code>malloc()</code>, <code>free()</code> or a garbage collector? are we assuming an Ethernet MAC\naddress? \u2013 which the recruiter did not seem to be happy to deal with. But so\nlong as my answer included a reasonable approximation to (presumably) the string\nthey had written down, all was well.</p>\n<ul>\n<li>What is the Big-O complexity of quicksort?</li>\n<li>What is the search complexity for a red-black tree, a binary tree, a linked\nlist, a hashtable, and a B-tree?</li>\n<li>What\u2019s the opposite of <code>malloc()</code>?</li>\n<li>What are the semantics of an ACL?</li>\n<li>Which of the following fields are <em>not</em> part of the <code>passwd</code> file?\n<ul>\n<li>shell, comment, initial umask, login name, uid, home directory, gid,\npreferred language</li>\n</ul>\n</li>\n<li>What does the <code>fstat</code> syscall do?</li>\n<li>What\u2019s the default signal for <code>kill</code>?</li>\n<li>What\u2019s in an inode?</li>\n<li>How do you make a socket accept inbound connections?</li>\n<li>What are the packets involved in a TCP connection setup?</li>\n<li>How many bytes in a MAC address?</li>\n<li>How many hosts are in a /23 subnet?</li>\n<li>What\u2019s the DNS resource record type for an IPv6 address?</li>\n<li>Estimate the value of 2<sup>24</sup>.</li>\n</ul>\n<p>In the end, I passed even though I could only remember the name, not the number,\nof the default 
signal for <code>kill</code>. It then got mildly amusing: the next stage is\napparently to \u201cjump on a call\u201d (sigh) with a recruiter and an engineer to work\nthrough some coding problems. I explained that I generally refuse to engage in\nwhiteboard coding during interviews (it\u2019s not a useful measure of anything\nuseful, and I don\u2019t see why I should). They said oh but of course I could do it\non a call so it wouldn\u2019t actually be a whiteboard. I said, yes I could but no I\nwouldn\u2019t and I thought they were rather missing my point. They said really,\nit was very unusual for someone to refuse. I said, to be honest it makes little\nsense anyway given they contacted me because of <em>all the code I\u2019d written under\nmy GitHub account</em>. They said oh well.</p>\n<p>And then some time later \u2013 6 months I think \u2013 a different recruiter \u201creached\nout\u201d to ask why the process had stalled and did I want to jump on a call.</p>\n<p>I said No. They haven\u2019t called back since. Oh well\u2026</p>",+"content": "<p>Some time ago, for reasons best known to themselves, a Google recruiter decided\nto \u201creach out\u201d on the basis of <a href=\"https://github.com/mor1/\">my GitHub profile</a> to\nsee if I were interested in a role as a Site-Reliability Engineer or possibly a\nSoftware Engineer. This entailed a short (~30min) telephone interview to answer\nsome questions. I made a note of those I recalled, in case anyone\u2019s interested.</p>\n<p>The hawk-eyed and keen-minded among you may discern a certain amount of\nambiguity in answers to some of the questions \u2013 e.g., is the opposite of\n<code>malloc()</code>, <code>free()</code> or a garbage collector? are we assuming an Ethernet MAC\naddress? \u2013 which the recruiter did not seem to be happy to deal with. 
But so\nlong as my answer included a reasonable approximation to (presumably) the string\nthey had written down, all was well.</p>\n<ul>\n<li>What is the Big-O complexity of quicksort?</li>\n<li>What is the search complexity for a red-black tree, a binary tree, a linked\nlist, a hashtable, and a B-tree?</li>\n<li>What\u2019s the opposite of <code>malloc()</code>?</li>\n<li>What are the semantics of an ACL?</li>\n<li>Which of the following fields are <em>not</em> part of the <code>passwd</code> file?\n<ul>\n<li>shell, comment, initial umask, login name, uid, home directory, gid,\npreferred language</li>\n</ul>\n</li>\n<li>What does the <code>fstat</code> syscall do?</li>\n<li>What\u2019s the default signal for <code>kill</code>?</li>\n<li>What\u2019s in an inode?</li>\n<li>How do you make a socket accept inbound connections?</li>\n<li>How many bytes in a MAC address?</li>\n<li>What are the packets involved in a TCP connection setup?</li>\n<li>How many hosts are in a /23 subnet?</li>\n<li>What\u2019s the DNS resource record type for an IPv6 address?</li>\n<li>Estimate the value of 2<sup>24</sup>.</li>\n</ul>\n<p>In the end, I passed even though I could only remember the name, not the number,\nof the default signal for <code>kill</code>. It then got mildly amusing: the next stage is\napparently to \u201cjump on a call\u201d (sigh) with a recruiter and an engineer to work\nthrough some coding problems. I explained that I generally refuse to engage in\nwhiteboard coding during interviews (it\u2019s not a useful measure of anything\nuseful, and I don\u2019t see why I should). They said oh but of course I could do it\non a call so it wouldn\u2019t actually be a whiteboard. I said, yes I could but no I\nwouldn\u2019t and I thought they were rather missing my point. They said really,\nit was very unusual for someone to refuse. I said, to be honest it makes little\nsense anyway given they contacted me because of <em>all the code I\u2019d written under\nmy GitHub account</em>. 
They said oh well.</p>\n<p>And then some time later \u2013 6 months I think \u2013 a different recruiter \u201creached\nout\u201d to ask why the process had stalled and did I want to jump on a call.</p>\n<p>I said No. They haven\u2019t called back since. Oh well\u2026</p>",
+18
mort/blog_grubbing-around_.json
···+"summary": "<p>Nothing earth-shattering here: I recently had the \u201cpleasure\u201d of setting up an\nARM64 server. After considerable support, several firmware upgrades, corruption\nof the main HDD, reinstallation of CentOS7 (recommended, somewhat to my\nsurprise), all that remained was to get an up-to-date Linux built and installed\nwith 32-bit binary support. This took a bit of <code>make config</code> fiddling, but got\nthere after a few tries.</p>\n<p>And then I had to relearn how <code>grub</code>/<code>grub2</code> works in this brave new (to me)\nUEFI CentOS7 world. Herewith some brief commands I found useful while doing\nso\u2026</p>\n<pre><code><span><span><span>sudo</span></span><span> grep <span><span>"</span>^menuentry<span>"</span></span> /boot/efi/EFI/centos/grub.cfg <span>\\\n</span></span></span><span><span></span> <span>|</span> <span><span>tr</span></span><span><span><span> -</span>s</span> <span><span>"</span> <span>"</span></span></span> <span>|</span> <span><span>cut</span></span><span><span><span> -</span>f</span> 2<span><span> -</span>d</span> <span><span>"</span>'<span>"</span></span></span> <span>|</span> <span><span>cat</span></span><span><span><span> -</span>n</span></span>\n</span></code></pre>\n<p>Edit <code>/etc/default/grub</code> to set <code>GRUB_DEFAULT=N</code> for the desired value of <code>N</code>.</p>\n<p>Temporarily set the default for the next reboot:</p>\n<pre><code><span><span><span>sudo</span></span><span> grub2-reboot 1 <span><span>#</span></span><span> based on output of above</span><span>\n</span></span></span></code></pre>\n<p>Regenerate the grub2 configuration:</p>\n<pre><code><span><span><span>sudo</span></span><span> grub2-mkconfig<span><span> -</span>o</span> /boot/efi/EFI/centos/grub.cfg</span>\n</span></code></pre>",+"content": "<p>Nothing earth-shattering here: I recently had the \u201cpleasure\u201d of setting up an\nARM64 server. 
After considerable support, several firmware upgrades, corruption\nof the main HDD, reinstallation of CentOS7 (recommended, somewhat to my\nsurprise), all that remained was to get an up-to-date Linux built and installed\nwith 32-bit binary support. This took a bit of <code>make config</code> fiddling, but got\nthere after a few tries.</p>\n<p>And then I had to relearn how <code>grub</code>/<code>grub2</code> works in this brave new (to me)\nUEFI CentOS7 world. Herewith some brief commands I found useful while doing\nso\u2026</p>\n<pre><code><span><span><span>sudo</span></span><span> grep <span><span>"</span>^menuentry<span>"</span></span> /boot/efi/EFI/centos/grub.cfg <span>\\\n</span></span></span><span><span></span> <span>|</span> <span><span>tr</span></span><span><span><span> -</span>s</span> <span><span>"</span> <span>"</span></span></span> <span>|</span> <span><span>cut</span></span><span><span><span> -</span>f</span> 2<span><span> -</span>d</span> <span><span>"</span>'<span>"</span></span></span> <span>|</span> <span><span>cat</span></span><span><span><span> -</span>n</span></span>\n</span></code></pre>\n<p>Edit <code>/etc/default/grub</code> to set <code>GRUB_DEFAULT=N</code> for the desired value of <code>N</code>.</p>\n<p>Temporarily set the default for the next reboot:</p>\n<pre><code><span><span><span>sudo</span></span><span> grub2-reboot 1 <span><span>#</span></span><span> based on output of above</span><span>\n</span></span></span></code></pre>\n<p>Regenerate the grub2 configuration:</p>\n<pre><code><span><span><span>sudo</span></span><span> grub2-mkconfig<span><span> -</span>o</span> /boot/efi/EFI/centos/grub.cfg</span>\n</span></code></pre>",
+18
mort/blog_happy-day_.json
···+"summary": "<p><a href=\"https://2025.eurosys.org/index.html\">EuroSys 2025</a> was co-located with <a href=\"https://www.asplos-conference.org/asplos2025/\">ASPLOS\n2025</a> this year. Other\ncommitments meant I (again) couldn\u2019t stay for the whole conference, attending\nprimarily because <a href=\"https://mort.io/blog/tdis-accepted\">two students had papers in the TDIS\nworkshop</a>.</p>\n <img alt=\"A photograph of me in a yellow t-shirt receiving the award\" height=\"1\" src=\"https://mort.io/blog/happy-day/stage.jpg\" width=\"480\">\n<p>But happily I <em>was</em> able to stay for the first day of the conference \u2013\n\u201chappily\u201d not only because it gave me a chance to catch up with some old friends\nI hadn\u2019t seen in a decade or more, but also because <a href=\"https://doi.org/10.1145/2451116.2451167\">the Mirage unikernels\npaper</a> which appeared at <a href=\"http://asplos13.rice.edu/\">ASPLOS\n2013</a> won one of two <a href=\"https://www.asplos-conference.org/asplos2025/awards/\">ASPLOS 2025 Influential Papers\nawards</a> :)</p>\n<p>This is obviously very flattering \u2013 typically doing research is necessarily its\nown reward because the work can seem fruitless much of the time. Even when a\npaper gets written and submitted it will most likely be rejected \u2013 I think\nEuroSys this year reported something like a 12% acceptance rate, so rejection is\n<em>a priori</em> the most likely outcome. Finally, if the paper does finally get\naccepted, it will most likely sink without trace \u2013 perhaps a brief flurry of\ninterest for a few months or so, the paper gets cited a few times, and then it\nfades away. 
This seems inevitable in a reasonably fast moving field that is also\ngrowing at pace \u2013 EuroSys had ~160 attendees in 2006, growing to ~330 in the 10\nyears to 2016, but hitting ~1100 this year; while submissions grew from ~200 in\n2019 to 696 this year.</p>\n <img alt=\"A photograph of the certificate\" height=\"1\" src=\"https://mort.io/blog/happy-day/official.jpg\" width=\"320\">\n<p>So to win an award recognising that others feel a paper actually had some\ninfluence is rare, and makes me very happy :) At the same time, it reinforces a\ncouple of lessons that I really should\u2019ve internalised by now.</p>\n<p>The first is that papers inevitably get better for thoughtful considered\nfeedback from experts a step or several away from the work \u2013 so drafts should\nbe produced in plenty of time and distributed to anyone who\u2019s willing to take\nthe time for feedback. In the case of this paper the previous failed submission\nto <a href=\"https://www.usenix.org/conference/osdi12\">OSDI 2012</a> had, let\u2019s say, reviews\nof mixed quality. But one stood out, from Jon Howell (who signs his reviews so I\nknow it was him), who gave us a firm \u201creject\u201d which (in retrospect) was actually\nfairly well-deserved but in an incredibly constructive way. To paraphrase him\nslightly, the work was interesting but the paper was crap \u2013 <em>and here\u2019s how to\nrewrite it so it makes sense</em>. We basically did what he said, ASPLOS accepted\nit, and the rest is now history. (Over a decade ago, good grief.)</p>\n<p>The second is that I simply cannot predict whether any research I\u2019m doing is\nactually going to turn out to have any value. 
The only other equivalent award\nI\u2019ve had was an <a href=\"https://infocom2024.ieee-infocom.org/awards\">INFOCOM 2024 Test of Time\naward</a> for <a href=\"https://doi.org/10.1109/INFCOM.2012.6195845\">our 2012 paper on a\nsystem called <em>Thinkair</em></a>, about\nmobile-code offload from devices to the cloud. That paper had received\nconsiderably more than just one rejection prior to acceptance, and if I recall\nmy final contribution correctly, I recommended not submitting it to INFOCOM as I\ndidn\u2019t think we\u2019d done enough to address previous review comments.</p>\n<p>Shows what I know. But then, how boring would life be without a little\nignorance\u2026 :)</p>",+"content": "<p><a href=\"https://2025.eurosys.org/index.html\">EuroSys 2025</a> was co-located with <a href=\"https://www.asplos-conference.org/asplos2025/\">ASPLOS\n2025</a> this year. Other\ncommitments meant I (again) couldn\u2019t stay for the whole conference, attending\nprimarily because <a href=\"https://mort.io/blog/tdis-accepted\">two students had papers in the TDIS\nworkshop</a>.</p>\n <img alt=\"A photograph of me in a yellow t-shirt receiving the award\" height=\"1\" src=\"https://mort.io/blog/happy-day/stage.jpg\" width=\"480\">\n<p>But happily I <em>was</em> able to stay for the first day of the conference \u2013\n\u201chappily\u201d not only because it gave me a chance to catch up with some old friends\nI hadn\u2019t seen in a decade or more, but also because <a href=\"https://doi.org/10.1145/2451116.2451167\">the Mirage unikernels\npaper</a> which appeared at <a href=\"http://asplos13.rice.edu/\">ASPLOS\n2013</a> won one of two <a href=\"https://www.asplos-conference.org/asplos2025/awards/\">ASPLOS 2025 Influential Papers\nawards</a> :)</p>\n<p>This is obviously very flattering \u2013 typically doing research is necessarily its\nown reward because the work can seem fruitless much of the time. 
Even when a\npaper gets written and submitted it will most likely be rejected \u2013 I think\nEuroSys this year reported something like a 12% acceptance rate, so rejection is\n<em>a priori</em> the most likely outcome. Finally, if the paper does finally get\naccepted, it will most likely sink without trace \u2013 perhaps a brief flurry of\ninterest for a few months or so, the paper gets cited a few times, and then it\nfades away. This seems inevitable in a reasonably fast moving field that is also\ngrowing at pace \u2013 EuroSys had ~160 attendees in 2006, growing to ~330 in the 10\nyears to 2016, but hitting ~1100 this year; while submissions grew from ~200 in\n2019 to 696 this year.</p>\n <img alt=\"A photograph of the certificate\" height=\"1\" src=\"https://mort.io/blog/happy-day/official.jpg\" width=\"320\">\n<p>So to win an award recognising that others feel a paper actually had some\ninfluence is rare, and makes me very happy :) At the same time, it reinforces a\ncouple of lessons that I really should\u2019ve internalised by now.</p>\n<p>The first is that papers inevitably get better for thoughtful considered\nfeedback from experts a step or several away from the work \u2013 so drafts should\nbe produced in plenty of time and distributed to anyone who\u2019s willing to take\nthe time for feedback. In the case of this paper the previous failed submission\nto <a href=\"https://www.usenix.org/conference/osdi12\">OSDI 2012</a> had, let\u2019s say, reviews\nof mixed quality. But one stood out, from Jon Howell (who signs his reviews so I\nknow it was him), who gave us a firm \u201creject\u201d which (in retrospect) was actually\nfairly well-deserved but in an incredibly constructive way. To paraphrase him\nslightly, the work was interesting but the paper was crap \u2013 <em>and here\u2019s how to\nrewrite it so it makes sense</em>. We basically did what he said, ASPLOS accepted\nit, and the rest is now history. 
(Over a decade ago, good grief.)</p>\n<p>The second is that I simply cannot predict whether any research I\u2019m doing is\nactually going to turn out to have any value. The only other equivalent award\nI\u2019ve had was an <a href=\"https://infocom2024.ieee-infocom.org/awards\">INFOCOM 2024 Test of Time\naward</a> for <a href=\"https://doi.org/10.1109/INFCOM.2012.6195845\">our 2012 paper on a\nsystem called <em>Thinkair</em></a>, about\nmobile-code offload from devices to the cloud. That paper had received\nconsiderably more than just one rejection prior to acceptance, and if I recall\nmy final contribution correctly, I recommended not submitting it to INFOCOM as I\ndidn\u2019t think we\u2019d done enough to address previous review comments.</p>\n<p>Shows what I know. But then, how boring would life be without a little\nignorance\u2026 :)</p>",
+18
mort/blog_hdi-seminar_.json
···+"summary": "<p>Looks like I get a chance to run my mouth off again :) Upcoming\n<a href=\"http://hdiresearch.org/\">HDI</a> <a href=\"http://www.crassh.cam.ac.uk/events/26198\">research\nseminar</a>, organised by <a href=\"http://www.bigdata.cam.ac.uk/\">Cambridge Big\nData</a>/<a href=\"http://www.digitalhumanities.cam.ac.uk/\">Digital\nHumanities</a>. In short, details are:\n<strong>20th April 2015, 14:00\u201316:00</strong> in <strong>S1, Alison Richard Building, West Road,\nCambridge</strong>. If you\u2019d like to attend, please do register at\n<a href=\"http://www.eventbrite.co.uk/e/human-data-interaction-cambridge-big-datadigital-humanities-seminar-tickets-16337148852\">http://www.eventbrite.co.uk/e/human-data-interaction-cambridge-big-datadigital-humanities-seminar-tickets-16337148852</a>.</p>\n<p>And just because pixels are, in some loose sense, nearly free, here\u2019s the\nabstract from the seminar link above:</p>\n<blockquote>\n<p>The increasing generation and collection of personal data has created a\ncomplex ecosystem, often collaborative but sometimes combative, around\ncompanies and individuals engaging in the use of these data. We propose that\nthe interactions between these agents warrant a new topic of study: Human-Data\nInteraction (HDI), that sits at the intersection of various disciplines,\nincluding computer science, statistics, sociology, psychology and behavioural\neconomics.</p>\n</blockquote>\n<blockquote>\n<p>In this brief presentation I will pose some of the challenges that HDI raises,\norganised into three core themes of legibility, agency and negotiability. I\nwill also outline some of the technical work we are currently undertaking that\nattempts to address some of the underlying platform problems. 
My hope is to\nelicit discussion of both the HDI framework and the technical solutions we are\npursuing, as well as to engage in a broader conversation about the ways we\nshould approach the personal data ecosystem with other interested parties.</p>\n</blockquote>",+"content": "<p>Looks like I get a chance to run my mouth off again :) Upcoming\n<a href=\"http://hdiresearch.org/\">HDI</a> <a href=\"http://www.crassh.cam.ac.uk/events/26198\">research\nseminar</a>, organised by <a href=\"http://www.bigdata.cam.ac.uk/\">Cambridge Big\nData</a>/<a href=\"http://www.digitalhumanities.cam.ac.uk/\">Digital\nHumanities</a>. In short, details are:\n<strong>20th April 2015, 14:00\u201316:00</strong> in <strong>S1, Alison Richard Building, West Road,\nCambridge</strong>. If you\u2019d like to attend, please do register at\n<a href=\"http://www.eventbrite.co.uk/e/human-data-interaction-cambridge-big-datadigital-humanities-seminar-tickets-16337148852\">http://www.eventbrite.co.uk/e/human-data-interaction-cambridge-big-datadigital-humanities-seminar-tickets-16337148852</a>.</p>\n<p>And just because pixels are, in some loose sense, nearly free, here\u2019s the\nabstract from the seminar link above:</p>\n<blockquote>\n<p>The increasing generation and collection of personal data has created a\ncomplex ecosystem, often collaborative but sometimes combative, around\ncompanies and individuals engaging in the use of these data. We propose that\nthe interactions between these agents warrant a new topic of study: Human-Data\nInteraction (HDI), that sits at the intersection of various disciplines,\nincluding computer science, statistics, sociology, psychology and behavioural\neconomics.</p>\n</blockquote>\n<blockquote>\n<p>In this brief presentation I will pose some of the challenges that HDI raises,\norganised into three core themes of legibility, agency and negotiability. 
I\nwill also outline some of the technical work we are currently undertaking that\nattempts to address some of the underlying platform problems. My hope is to\nelicit discussion of both the HDI framework and the technical solutions we are\npursuing, as well as to engage in a broader conversation about the ways we\nshould approach the personal data ecosystem with other interested parties.</p>\n</blockquote>",
+18
mort/blog_inconstant-ruby_.json
···+"summary": "<p>As <a href=\"https://mort.io/blog/2015/01/15/begin-again/\">noted previously</a>, this site is basically a\n<a href=\"https://github.com/\">Github</a>-hosted <a href=\"http://jekyllrb.com/\">Jekyll</a> site at present, though one that can be built as a\n<a href=\"http://openmirage.org/\">Mirage</a> unikernel. Part of the <a href=\"http://openmirage.org/\">Mirage</a> workflow to publish a new post\ninvolves using <a href=\"https://travis-ci.org/\">Travis CI</a> to build and then commit back a new unikernel\nimage. Thus it is currently necessary to run <a href=\"http://jekyllrb.com/\">Jekyll</a> in the <a href=\"https://travis-ci.org/\">Travis</a> build\nscripts, and the dynamism of the Ruby environment meant that this broke (again)\nrecently as one of the <code>github-pages</code> gem\u2019s dependencies now depends on <code>Ruby >= 2.0</code> while the default Rubies on the <a href=\"https://travis-ci.org/\">Travis</a> Ubuntu image for <code>C</code> language\nbuilds is <code>1.8</code> (via Ubuntu packaging) or, if you remove that one, <code>1.9</code> (via\n<a href=\"https://rvm.io/\">rvm</a>). 
Read on to find out how to fix this\u2026</p>\n<p>The fix that currently works for me turns out to be relatively simple: remove\nall the rubies installed as Ubuntu packages, and then invoke <a href=\"https://rvm.io/\">rvm</a> to set the\ndefault ruby to something reasonable \u2013 in this case, 2.1.</p>\n<pre><code><span><span><span>#</span></span><span># remove old ubuntu rubies</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt-get<span><span> -</span>y</span> remove ruby ruby1.8</span>\n</span><span><span><span>sudo</span></span><span> apt-get<span><span> -</span>y</span> autoremove</span>\n</span><span>\n</span><span><span><span>#</span></span><span># use rvm and a modern-ish ruby</span><span>\n</span></span><span><span><span>source</span></span><span> <span><span>~</span></span>/.rvm/scripts/rvm</span>\n</span><span><span><span>rvm</span></span><span><span><span> --</span>default</span> use 2.1</span>\n</span><span>\n</span><span><span><span>#</span></span><span># check that all worked...</span><span>\n</span></span><span><span><span>which</span></span><span> ruby</span>\n</span><span><span><span>ruby</span></span><span><span><span> --</span>version</span></span>\n</span><span>\n</span><span><span><span>#</span></span><span># install jekyll and github-pages</span><span>\n</span></span><span><span><span>gem</span></span><span> install jekyll</span>\n</span><span><span><span>gem</span></span><span> install github-pages<span><span> --</span>no-rdoc</span><span><span> --</span>no-ri</span></span>\n</span><span><span><span>jekyll</span></span><span><span><span> -</span>v</span></span>\n</span></code></pre>\n<p>And that\u2019s all there is to it \u2013 you should now be able to call <code>jekyll</code> in your\n<a href=\"https://travis-ci.org/\">Travis</a> environment as you\u2019d expect\u2026</p>",+"content": "<p>As <a href=\"https://mort.io/blog/2015/01/15/begin-again/\">noted previously</a>, this site is basically a\n<a 
href=\"https://github.com/\">Github</a>-hosted <a href=\"http://jekyllrb.com/\">Jekyll</a> site at present, though one that can be built as a\n<a href=\"http://openmirage.org/\">Mirage</a> unikernel. Part of the <a href=\"http://openmirage.org/\">Mirage</a> workflow to publish a new post\ninvolves using <a href=\"https://travis-ci.org/\">Travis CI</a> to build and then commit back a new unikernel\nimage. Thus it is currently necessary to run <a href=\"http://jekyllrb.com/\">Jekyll</a> in the <a href=\"https://travis-ci.org/\">Travis</a> build\nscripts, and the dynamism of the Ruby environment meant that this broke (again)\nrecently as one of the <code>github-pages</code> gem\u2019s dependencies now depends on <code>Ruby >= 2.0</code> while the default Rubies on the <a href=\"https://travis-ci.org/\">Travis</a> Ubuntu image for <code>C</code> language\nbuilds is <code>1.8</code> (via Ubuntu packaging) or, if you remove that one, <code>1.9</code> (via\n<a href=\"https://rvm.io/\">rvm</a>). 
Read on to find out how to fix this\u2026</p>\n<p>The fix that currently works for me turns out to be relatively simple: remove\nall the rubies installed as Ubuntu packages, and then invoke <a href=\"https://rvm.io/\">rvm</a> to set the\ndefault ruby to something reasonable \u2013 in this case, 2.1.</p>\n<pre><code><span><span><span>#</span></span><span># remove old ubuntu rubies</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt-get<span><span> -</span>y</span> remove ruby ruby1.8</span>\n</span><span><span><span>sudo</span></span><span> apt-get<span><span> -</span>y</span> autoremove</span>\n</span><span>\n</span><span><span><span>#</span></span><span># use rvm and a modern-ish ruby</span><span>\n</span></span><span><span><span>source</span></span><span> <span><span>~</span></span>/.rvm/scripts/rvm</span>\n</span><span><span><span>rvm</span></span><span><span><span> --</span>default</span> use 2.1</span>\n</span><span>\n</span><span><span><span>#</span></span><span># check that all worked...</span><span>\n</span></span><span><span><span>which</span></span><span> ruby</span>\n</span><span><span><span>ruby</span></span><span><span><span> --</span>version</span></span>\n</span><span>\n</span><span><span><span>#</span></span><span># install jekyll and github-pages</span><span>\n</span></span><span><span><span>gem</span></span><span> install jekyll</span>\n</span><span><span><span>gem</span></span><span> install github-pages<span><span> --</span>no-rdoc</span><span><span> --</span>no-ri</span></span>\n</span><span><span><span>jekyll</span></span><span><span><span> -</span>v</span></span>\n</span></code></pre>\n<p>And that\u2019s all there is to it \u2013 you should now be able to call <code>jekyll</code> in your\n<a href=\"https://travis-ci.org/\">Travis</a> environment as you\u2019d expect\u2026</p>",
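For context, those commands run from the Travis build configuration; a sketch of how they might be wired into a <code>.travis.yml</code> (the key names are standard Travis ones, but the <code>script</code> step here is a simplified stand-in — the real build also produces the Mirage unikernel):

```yaml
language: c
before_install:
  ## remove old ubuntu rubies
  - sudo apt-get -y remove ruby ruby1.8
  - sudo apt-get -y autoremove
  ## use rvm and a modern-ish ruby
  - source ~/.rvm/scripts/rvm && rvm --default use 2.1
install:
  - gem install jekyll
  - gem install github-pages --no-rdoc --no-ri
script:
  - jekyll build   ## stand-in for the actual site/unikernel build
```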
+18
mort/blog_internalcl-wifi_.json
···+"summary": "<p>Using my fancy (?) new(-ish) Linux laptop running <a href=\"https://nixos.org/\">NixOS</a>, I\nfinally had cause to connect to our internal Wi-Fi network. This was not\nentirely trivial due to the various configuration options required. So here\ngoes, for the record, what I did as an aide memoir for me and in case it\u2019s\nuseful for anyone else\u2026</p>\n<p>First, create the connection \u2013 the Wi-Fi network in question is named\n<code>Internal-CL</code>:</p>\n<pre><code><span><span><span>$</span></span><span> sudo nmcli connection add type wifi con-name Internal-CL ssid Internal-CL</span>\n</span><span><span><span>Connection</span></span><span> <span><span>'</span>Internal-CL<span>'</span></span> (8f1ddcc9-4b1f-4e5d-9992-522714685eb4</span><span></span>) <span><span>successfully</span></span><span> added.</span>\n</span></code></pre>\n<p>Then, configure it:</p>\n<pre><code><span><span><span>$</span></span><span> sudo nmcli connection edit Internal-CL</span>\n</span><span>\n</span><span><span>=</span><span>==</span><span></span><span>|</span> <span><span>nmcli</span></span><span> interactive connection editor</span> <span>|</span><span>=</span><span>==</span>\n</span><span>\n</span><span><span><span>Editing</span></span><span> existing <span><span>'</span>802-11-wireless<span>'</span></span> connection: <span><span>'</span>Internal-CL<span>'</span></span></span>\n</span><span>\n</span><span><span><span>Type</span></span><span> <span><span>'</span>help<span>'</span></span> or <span><span>'</span>?<span>'</span></span> for available commands.</span>\n</span><span><span><span>Type</span></span><span> <span><span>'</span>print<span>'</span></span> to show all the connection properties.</span>\n</span><span><span><span>Type</span></span><span> <span><span>'</span>describe [<setting>.<prop>]<span>'</span></span> for detailed property description.</span>\n</span><span>\n</span><span><span><span>You</span></span><span> may edit the following 
settings: connection, 802-11-wireless (wifi</span><span></span>)<span><span>,</span></span><span> 802-11-wireless-security (wifi-sec</span><span></span>)<span><span>,</span></span><span> 802-1x, ethtool, match, ipv4, ipv6, hostname, link, tc, proxy</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.eap peap</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.phase2-auth mschapv2</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.identity YOUR-IDENTITY</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.password YOUR-PASSWORD</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set wifi-sec.key-mgmt wpa-eap</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> save</span>\n</span><span><span><span>Connection</span></span><span> <span><span>'</span>Internal-CL<span>'</span></span> (8f1ddcc9-4b1f-4e5d-9992-522714685eb4</span><span></span>) <span><span>successfully</span></span><span> updated.</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> activate</span>\n</span><span><span><span>Monitoring</span></span><span> connection activation (press any key to continue</span><span></span>)\n</span><span><span><span>Connection</span></span><span> successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/12</span><span></span>)\n</span><span>\n</span><span><span><span>nmcli</span></span><span><span>></span> quit</span>\n</span></code></pre>\n<p>Obviously you will need to provide your own values for <code>YOUR-IDENTITY</code> and\n<code>YOUR-PASSWORD</code> :)</p>",+"content": "<p>Using my fancy (?) new(-ish) Linux laptop running <a href=\"https://nixos.org/\">NixOS</a>, I\nfinally had cause to connect to our internal Wi-Fi network. This was not\nentirely trivial due to the various configuration options required. 
So here\ngoes, for the record, what I did as an aide memoir for me and in case it\u2019s\nuseful for anyone else\u2026</p>\n<p>First, create the connection \u2013 the Wi-Fi network in question is named\n<code>Internal-CL</code>:</p>\n<pre><code><span><span><span>$</span></span><span> sudo nmcli connection add type wifi con-name Internal-CL ssid Internal-CL</span>\n</span><span><span><span>Connection</span></span><span> <span><span>'</span>Internal-CL<span>'</span></span> (8f1ddcc9-4b1f-4e5d-9992-522714685eb4</span><span></span>) <span><span>successfully</span></span><span> added.</span>\n</span></code></pre>\n<p>Then, configure it:</p>\n<pre><code><span><span><span>$</span></span><span> sudo nmcli connection edit Internal-CL</span>\n</span><span>\n</span><span><span>=</span><span>==</span><span></span><span>|</span> <span><span>nmcli</span></span><span> interactive connection editor</span> <span>|</span><span>=</span><span>==</span>\n</span><span>\n</span><span><span><span>Editing</span></span><span> existing <span><span>'</span>802-11-wireless<span>'</span></span> connection: <span><span>'</span>Internal-CL<span>'</span></span></span>\n</span><span>\n</span><span><span><span>Type</span></span><span> <span><span>'</span>help<span>'</span></span> or <span><span>'</span>?<span>'</span></span> for available commands.</span>\n</span><span><span><span>Type</span></span><span> <span><span>'</span>print<span>'</span></span> to show all the connection properties.</span>\n</span><span><span><span>Type</span></span><span> <span><span>'</span>describe [<setting>.<prop>]<span>'</span></span> for detailed property description.</span>\n</span><span>\n</span><span><span><span>You</span></span><span> may edit the following settings: connection, 802-11-wireless (wifi</span><span></span>)<span><span>,</span></span><span> 802-11-wireless-security (wifi-sec</span><span></span>)<span><span>,</span></span><span> 802-1x, ethtool, match, ipv4, ipv6, hostname, link, tc, 
proxy</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.eap peap</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.phase2-auth mschapv2</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.identity YOUR-IDENTITY</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set 802-1x.password YOUR-PASSWORD</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> set wifi-sec.key-mgmt wpa-eap</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> save</span>\n</span><span><span><span>Connection</span></span><span> <span><span>'</span>Internal-CL<span>'</span></span> (8f1ddcc9-4b1f-4e5d-9992-522714685eb4</span><span></span>) <span><span>successfully</span></span><span> updated.</span>\n</span><span><span><span>nmcli</span></span><span><span>></span> activate</span>\n</span><span><span><span>Monitoring</span></span><span> connection activation (press any key to continue</span><span></span>)\n</span><span><span><span>Connection</span></span><span> successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/12</span><span></span>)\n</span><span>\n</span><span><span><span>nmcli</span></span><span><span>></span> quit</span>\n</span></code></pre>\n<p>Obviously you will need to provide your own values for <code>YOUR-IDENTITY</code> and\n<code>YOUR-PASSWORD</code> :)</p>",
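For scripting this rather than driving the interactive editor, the same properties can be set non-interactively with <code>nmcli connection modify</code> — a sketch I haven't verified against this particular network, using the same <code>YOUR-IDENTITY</code> / <code>YOUR-PASSWORD</code> placeholders:

```
## create the connection, as above
sudo nmcli connection add type wifi con-name Internal-CL ssid Internal-CL

## apply the same 802.1x settings in one non-interactive call
sudo nmcli connection modify Internal-CL \
    802-1x.eap peap \
    802-1x.phase2-auth mschapv2 \
    802-1x.identity YOUR-IDENTITY \
    802-1x.password YOUR-PASSWORD \
    wifi-sec.key-mgmt wpa-eap

## bring it up
sudo nmcli connection up Internal-CL
```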
+18
mort/blog_jetlag-fasting_.json
···+"summary": "<p>As I\u2019ve found myself repeating the same information several times recently, and\nhave to dig out the links in question every time, I figured it\u2019d be useful to\nwrite this down once so I can point at it.</p>\n<p>Jetlag is a first-world problem but can be an annoying one \u2013 in recent years\nI\u2019ve found I have a particular problem getting up in the morning when flying\neast. So, one day, bored in an airport in the US and with only the entire\nInternet to hand, I thought I\u2019d look around for an explanation at least.</p>\n<p>I ended up coming across reports of some physiological research \u2013 in rats of\ncourse \u2013 that suggested a way to avoid jetlag. I\u2019ve since tried it over a dozen\ntimes, and found it to work in all cases. At this point I no longer care if it\u2019s\nplacebo effect or a genuine explanation or something else :)</p>\n<p>The TL;DR is: fast for >16 hours before your time of arrival, then eat as\nappropriate. Definitely >16 hours though \u2013 on the occasions I\u2019ve not quite managed\nit (at least once I miscounted and managed only about 14 hours) it didn\u2019t seem\nto work as effectively.</p>\n<p>As I interpret the information in the\n<a href=\"https://dx.doi.org/10.1126/science.1153277\">article</a>, found via\n<a href=\"http://news.bbc.co.uk/2/hi/health/7414437.stm\">BBC</a> and\n<a href=\"http://news.harvard.edu/gazette/story/2008/05/study-identifies-food-related-clock-in-the-brain/\">Harvard</a>\nreports, through my \u201cI can barely do computer science and certainly not biology\u201d\nbrain, mammals have two body clocks, one driven by daylight and one by\nmetabolism. The daylight can\u2019t be shifted quickly, but the metabolic one can be\nmade to float by fasting. 
The metabolic also being a lower level clock means\nthat, when you resync it by consuming calories, it syncs back to whatever light\nconditions are current.</p>\n<p>As I say, this isn\u2019t my area of expertise \u2013 but it seems to work anyway.</p>\n<p>Also, because a surprisingly large (or perhaps not) number of people also ask \u2013\nas far as I know and have experienced, fasting means <strong>no</strong> calories,\n<strong>including alcohol</strong>, even if it\u2019s free\u2026 :)</p>",+"content": "<p>As I\u2019ve found myself repeating the same information several times recently, and\nhave to dig out the links in question every time, I figured it\u2019d be useful to\nwrite this down once so I can point at it.</p>\n<p>Jetlag is a first-world problem but can be an annoying one \u2013 in recent years\nI\u2019ve found I have a particular problem getting up in the morning when flying\neast. So, one day, bored in an airport in the US and with only the entire\nInternet to hand, I thought I\u2019d look around for an explanation at least.</p>\n<p>I ended up coming across reports of some physiological research \u2013 in rats of\ncourse \u2013 that suggested a way to avoid jetlag. I\u2019ve since tried it over a dozen\ntimes, and found it to work in all cases. At this point I no longer care if it\u2019s\nplacebo effect or a genuine explanation or something else :)</p>\n<p>The TL;DR is: fast for >16 hours before your time of arrival, then eat as\nappropriate. 
Definitely >16 hours though \u2013 on the occasions I\u2019ve not quite managed\nit (at least once I miscounted and managed only about 14 hours) it didn\u2019t seem\nto work as effectively.</p>\n<p>As I interpret the information in the\n<a href=\"https://dx.doi.org/10.1126/science.1153277\">article</a>, found via\n<a href=\"http://news.bbc.co.uk/2/hi/health/7414437.stm\">BBC</a> and\n<a href=\"http://news.harvard.edu/gazette/story/2008/05/study-identifies-food-related-clock-in-the-brain/\">Harvard</a>\nreports, through my \u201cI can barely do computer science and certainly not biology\u201d\nbrain, mammals have two body clocks, one driven by daylight and one by\nmetabolism. The daylight can\u2019t be shifted quickly, but the metabolic one can be\nmade to float by fasting. The metabolic also being a lower level clock means\nthat, when you resync it by consuming calories, it syncs back to whatever light\nconditions are current.</p>\n<p>As I say, this isn\u2019t my area of expertise \u2013 but it seems to work anyway.</p>\n<p>Also, because a surprisingly large (or perhaps not) number of people also ask \u2013\nas far as I know and have experienced, fasting means <strong>no</strong> calories,\n<strong>including alcohol</strong>, even if it\u2019s free\u2026 :)</p>",
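Since I find myself doing this arithmetic badly in airports, a trivial sketch of it using GNU <code>date</code>'s relative-date parsing (the one-hour safety margin is my own addition, not anything from the reports):

```shell
## the rule of thumb above: fast for >16 hours before (destination-local) arrival.
## 17 = 16 + 1 hour of margin, since cutting it to ~14 hours didn't seem to work.
arrival="2025-06-10 07:30"    # hypothetical example: landing at 07:30 local time
fast_start=$(date -d "$arrival 17 hours ago" '+%F %H:%M')
echo "stop eating by: $fast_start"
```

So for a 07:30 landing, stop eating by mid-afternoon the previous day.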
+18
mort/blog_jquery-console_.json
···+"summary": "<p>I had cause to do this recently, so here\u2019re the runes from\n<a href=\"http://stackoverflow.com/questions/7474354/include-jquery-in-the-javascript-console\">http://stackoverflow.com/questions/7474354/include-jquery-in-the-javascript-console</a>\nfor the record:</p>\n<pre><code><span><span><span>var</span> <span><span><span>script</span></span> </span><span>=</span> <span><span>document</span><span>.</span><span>createElement</span></span><span>(</span><span><span>'</span>script<span>'</span></span><span>)</span></span><span>;</span>\n</span><span><span>script</span><span>.</span><span>src</span> <span>=</span> <span><span>"</span>https://ajax.googleapis.com/ajax/libs/jquery/1.6.3/jquery.min.js<span>"</span></span><span>;</span>\n</span><span><span><span>document</span><span>.</span><span>getElementsByTagName</span></span><span>(</span><span><span>'</span>head<span>'</span></span><span>)</span><span><span>[</span><span>0</span><span>]</span></span><span><span>.</span><span>appendChild</span></span><span>(</span><span>script</span><span>)</span><span>;</span>\n</span></code></pre>",+"content": "<p>I had cause to do this recently, so here\u2019re the runes from\n<a href=\"http://stackoverflow.com/questions/7474354/include-jquery-in-the-javascript-console\">http://stackoverflow.com/questions/7474354/include-jquery-in-the-javascript-console</a>\nfor the record:</p>\n<pre><code><span><span><span>var</span> <span><span><span>script</span></span> </span><span>=</span> <span><span>document</span><span>.</span><span>createElement</span></span><span>(</span><span><span>'</span>script<span>'</span></span><span>)</span></span><span>;</span>\n</span><span><span>script</span><span>.</span><span>src</span> <span>=</span> 
<span><span>"</span>https://ajax.googleapis.com/ajax/libs/jquery/1.6.3/jquery.min.js<span>"</span></span><span>;</span>\n</span><span><span><span>document</span><span>.</span><span>getElementsByTagName</span></span><span>(</span><span><span>'</span>head<span>'</span></span><span>)</span><span><span>[</span><span>0</span><span>]</span></span><span><span>.</span><span>appendChild</span></span><span>(</span><span>script</span><span>)</span><span>;</span>\n</span></code></pre>",
+18
mort/blog_just-latex_.json
···+"summary": "<p>I have recently become a fan of <a href=\"https://just.systems/\"><code>just</code></a> as a replacement for the venerable\n<a href=\"https://www.gnu.org/software/make/manual/make.html\"><code>make</code></a>. I find that nowadays I rarely need the built-in dependency rules\nthat <a href=\"https://www.gnu.org/software/make/manual/make.html\"><code>make</code></a> provides. Perhaps more radically, I also rarely need to write my\nown as the prevalence of format-specific build tools such as <a href=\"https://doc.rust-lang.org/stable/cargo/\"><code>cargo</code></a>,\n<a href=\"https://docs.astral.sh/uv/\"><code>uv</code></a>, <a href=\"https://ctan.org/pkg/latexmk/\"><code>latexmk</code></a> and the like means I don\u2019t need to write my own either.</p>\n<p>Recently, while writing references and then helping to get submissions for\n<a href=\"https://www.cl.cam.ac.uk/events/rossfest/\">Rossfest</a> consistently formatted and building cleanly, I found myself\nextending my various <a href=\"https://just.systems/\"><code>just</code></a> targets for <a href=\"https://www.latex-project.org/\">LaTeX</a>. 
So I thought I\u2019d document\nthem here.</p>\n<p>I always begin my <code>Justfile</code> with the apparently idiomatic \u201cjust show me the\ntargets and associated help text\u201d target:</p>\n<pre><code><span><span><span>_default</span></span>:\n</span><span> <span>@</span>just --list\n</span></code></pre>\n<p>This seems considerably easier \u2013 and more powerful \u2013 than the equivalent hack\nI used to use in a <code>Makefile</code>!</p>\n<pre><code><span><span><span>.DEFAULT</span></span><span>:</span> <span><span>help</span></span><span>\n</span></span><span><span></span>\n</span><span><span><span>.PHONY</span></span><span>:</span> <span><span>help</span></span><span>\n</span></span><span><span></span><span><span>help</span></span><span>:</span>\n<span></span><span></span></span><span><span></span><span>\t<span>@</span><span><span><span>echo</span></span><span> <span><span>"</span>Targets are:<span>"</span></span></span></span>\n</span></span><span><span>\t<span>@</span><span><span><span>grep</span></span><span><span><span> -</span>E</span> <span><span>'</span>^(^## |[^.#][ a-zA-Z0-9-]+:.*#)<span>'</span></span> Makefile <span>\\\n</span></span></span></span></span><span><span><span><span></span>\t<span>|</span> <span><span>sed</span></span><span><span><span> -</span>E</span> <span><span>'</span>s/:[[:print:]]+#/:/;s/^([^#])/-- <span>\\1</span>/<span>'</span></span></span></span>\n</span></span></code></pre>\n<p>Next setup up some useful variables: the command we\u2019ll use (<code>latex</code>) plus\nsources (<code>texs</code>) and targets (<code>pdfs</code>), and droppings that might be produced but\nnot cleaned up by <code>latexmk -[cC]</code>:</p>\n<pre><code><span><span>latex</span> <span>:=</span> <span><span>"</span>latexmk -pdf<span>"</span></span>\n</span><span>\n</span><span><span>texs</span> <span>:=</span> <span><span><span>`</span>echo [0-9][0-9]-*.tex<span>`</span></span></span>\n</span><span><span>pdfs</span> <span>:=</span> 
<span><span>replace</span></span><span><span>(</span><span>texs</span><span>,</span> <span><span>'</span>.tex<span>'</span></span><span>,</span> <span><span>'</span>.pdf<span>'</span></span><span>)</span></span>\n</span><span><span>droppings</span> <span>:=</span> <span><span>"</span>$f.nav $f.snm $f.bbl<span>"</span></span>\n</span></code></pre>\n<p>Now to actually building and cleaning things; first, individual targets:</p>\n<pre><code><span><span><span># </span>build a PDF</span>\n</span><span><span><span>pdf</span></span> tgt:\n</span><span> <span><span>{{</span><span>latex</span><span>}}</span></span> <span><span>{{</span><span><span>file_stem</span></span><span><span>(</span><span>tgt</span><span>)</span></span><span>}}</span></span>.tex\n</span><span>\n</span><span><span><span># </span>clean generated files</span>\n</span><span><span><span>clean</span></span> tgt:\n</span><span> <span><span>{{</span><span>latex</span><span>}}</span></span> -C <span><span>{{</span><span><span>file_stem</span></span><span><span>(</span><span>tgt</span><span>)</span></span><span>}}</span></span>.tex\n</span><span> for f in <span><span>{{</span><span><span>file_stem</span></span><span><span>(</span><span>tgt</span><span>)</span></span><span>}}</span></span>; do rm -f <span><span>{{</span><span>droppings</span><span>}}</span></span> ; done\n</span></code></pre>\n<p>(Yes, ok, so it seems a bit silly to have to wrap a <code>for</code> loop around simply to\npropagate a variable from the <code>Justfile</code> into the shell. 
But no matter.)</p>\n<p>Next, all available targets:</p>\n<pre><code><span><span><span># </span>build all PDFs</span>\n</span><span><span><span>pdfs</span></span>:\n</span><span> for f in <span><span>{{</span><span>texs</span><span>}}</span></span>; do just pdf $f ; done\n</span><span>\n</span><span><span><span># </span>clean all PDFs</span>\n</span><span><span><span>clean-pdfs</span></span>:\n</span><span> for f in <span><span>{{</span><span>pdfs</span><span>}}</span></span>; do just clean $f ; done\n</span></code></pre>\n<p>Finally, <em>watch</em> a target, rebuilding on save \u2013 it may be helpful therefore to\navoid automatically saving the source while in a state in which it will not\nsuccessfully build!</p>\n<pre><code><span><span><span># </span>watch a file, rebuilding when saved</span>\n</span><span><span><span>watch</span></span> tgt:\n</span><span> while inotifywait -e close_write <span><span>{{</span><span>tgt</span><span>}}</span></span>* ; do just pdf <span><span>{{</span><span>tgt</span><span>}}</span></span> ; done\n</span></code></pre>",+"content": "<p>I have recently become a fan of <a href=\"https://just.systems/\"><code>just</code></a> as a replacement for the venerable\n<a href=\"https://www.gnu.org/software/make/manual/make.html\"><code>make</code></a>. I find that nowadays I rarely need the built-in dependency rules\nthat <a href=\"https://www.gnu.org/software/make/manual/make.html\"><code>make</code></a> provides. 
Perhaps more radically, I also rarely need to write my\nown as the prevelance of format-specific build tools such as <a href=\"https://doc.rust-lang.org/stable/cargo/\"><code>cargo</code></a>,\n<a href=\"https://docs.astral.sh/uv/\"><code>uv</code></a>, <a href=\"https://ctan.org/pkg/latexmk/\"><code>latexmk</code></a> and the like mean I don\u2019t need to write my own either.</p>\n<p>Recently, while writing references and then helping out get submissions for\n<a href=\"https://www.cl.cam.ac.uk/events/rossfest/\">Rossfest</a> consistently formatted and building cleanly, I found myself\nextending my various <a href=\"https://just.systems/\"><code>just</code></a> targets for <a href=\"https://www.latex-project.org/\">LaTeX</a>. So I thought I\u2019d document\nthem here.</p>\n<p>I always begin my <code>Justfile</code> with the apparently idiomatic \u201cjust show me the\ntargets and associated help text\u201d target:</p>\n<pre><code><span><span><span>_default</span></span>:\n</span><span> <span>@</span>just --list\n</span></code></pre>\n<p>This seems considerably easier \u2013 and more powerful \u2013 than the equivalent hack\nI used to use in a <code>Makefile</code>!</p>\n<pre><code><span><span><span>.DEFAULT</span></span><span>:</span> <span><span>help</span></span><span>\n</span></span><span><span></span>\n</span><span><span><span>.PHONY</span></span><span>:</span> <span><span>help</span></span><span>\n</span></span><span><span></span><span><span>help</span></span><span>:</span>\n<span></span><span></span></span><span><span></span><span>\t<span>@</span><span><span><span>echo</span></span><span> <span><span>"</span>Targets are:<span>"</span></span></span></span>\n</span></span><span><span>\t<span>@</span><span><span><span>grep</span></span><span><span><span> -</span>E</span> <span><span>'</span>^(^## |[^.#][ a-zA-Z0-9-]+:.*#)<span>'</span></span> Makefile <span>\\\n</span></span></span></span></span><span><span><span><span></span>\t<span>|</span> 
<span><span>sed</span></span><span><span><span> -</span>E</span> <span><span>'</span>s/:[[:print:]]+#/:/;s/^([^#])/-- <span>\\1</span>/<span>'</span></span></span></span>\n</span></span></code></pre>\n<p>Next setup up some useful variables: the command we\u2019ll use (<code>latex</code>) plus\nsources (<code>texs</code>) and targets (<code>pdfs</code>), and droppings that might be produced but\nnot cleaned up by <code>latexmk -[cC]</code>:</p>\n<pre><code><span><span>latex</span> <span>:=</span> <span><span>"</span>latexmk -pdf<span>"</span></span>\n</span><span>\n</span><span><span>texs</span> <span>:=</span> <span><span><span>`</span>echo [0-9][0-9]-*.tex<span>`</span></span></span>\n</span><span><span>pdfs</span> <span>:=</span> <span><span>replace</span></span><span><span>(</span><span>texs</span><span>,</span> <span><span>'</span>.tex<span>'</span></span><span>,</span> <span><span>'</span>.pdf<span>'</span></span><span>)</span></span>\n</span><span><span>droppings</span> <span>:=</span> <span><span>"</span>$f.nav $f.snm $f.bbl<span>"</span></span>\n</span></code></pre>\n<p>Now to actually building and cleaning things; first, individual targets:</p>\n<pre><code><span><span><span># </span>build a PDF</span>\n</span><span><span><span>pdf</span></span> tgt:\n</span><span> <span><span>{{</span><span>latex</span><span>}}</span></span> <span><span>{{</span><span><span>file_stem</span></span><span><span>(</span><span>tgt</span><span>)</span></span><span>}}</span></span>.tex\n</span><span>\n</span><span><span><span># </span>clean generated files</span>\n</span><span><span><span>clean</span></span> tgt:\n</span><span> <span><span>{{</span><span>latex</span><span>}}</span></span> -C <span><span>{{</span><span><span>file_stem</span></span><span><span>(</span><span>tgt</span><span>)</span></span><span>}}</span></span>.tex\n</span><span> for f in 
<span><span>{{</span><span><span>file_stem</span></span><span><span>(</span><span>tgt</span><span>)</span></span><span>}}</span></span>; do rm -f <span><span>{{</span><span>droppings</span><span>}}</span></span> ; done\n</span></code></pre>\n<p>(Yes, ok, so it seems a bit silly to have to wrap a <code>for</code> loop around simply to\npropagate a variable from the <code>Justfile</code> into the shell. But no matter.)</p>\n<p>Next, all available targets:</p>\n<pre><code><span><span><span># </span>build all PDFs</span>\n</span><span><span><span>pdfs</span></span>:\n</span><span> for f in <span><span>{{</span><span>texs</span><span>}}</span></span>; do just pdf $f ; done\n</span><span>\n</span><span><span><span># </span>clean all PDFs</span>\n</span><span><span><span>clean-pdfs</span></span>:\n</span><span> for f in <span><span>{{</span><span>pdfs</span><span>}}</span></span>; do just clean $f ; done\n</span></code></pre>\n<p>Finally, <em>watch</em> a target, rebuilding on save \u2013 it may be helpful therefore to\navoid automatically saving the source while in a state in which it will not\nsuccessfully build!</p>\n<pre><code><span><span><span># </span>watch a file, rebuilding when saved</span>\n</span><span><span><span>watch</span></span> tgt:\n</span><span> while inotifywait -e close_write <span><span>{{</span><span>tgt</span><span>}}</span></span>* ; do just pdf <span><span>{{</span><span>tgt</span><span>}}</span></span> ; done\n</span></code></pre>",
+18
mort/blog_just-ocaml_.json
···+"summary": "<p>In similar vein to a <a href=\"https://mort.io/blog/just-latex\">recent post</a>, I have also started using\n<a href=\"https://just.systems/\"><code>just</code></a> when I periodically need to rebuild my\n<a href=\"https://ocaml.org/\">OCaml</a> tool<a href=\"https://mort.io/blog/just-ocaml/#1\">1</a> <a href=\"https://github.com/mor1/ocal\"><code>ocal</code></a>. So\nI ended up replacing the old\n<a href=\"https://github.com/mor1/ocal/blob/6bb129627f9d1f27ab31cee810013b362ab80067/Makefile\"><code>Makefile</code></a>\nwith a shiny new\n<a href=\"https://github.com/mor1/ocal/blob/8ef8631ae5bbe0315e359d725d467e7d0403fd31/Justfile\"><code>Justfile</code></a>.</p>\n<p>As it also proved useful in another (more esoteric) tool I wrote <a href=\"https://github.com/mor1/cst-tools\">for parsing\nout exam results for my students so I can paste into email\neasily</a>, I thought I\u2019d put it here for the\nrecord. So here it is\u2026</p>\n<div>1\n<p>Largely due to <a href=\"https://nixos.org/\">NixOS</a> upgrades moving tools into\ndifferent locations.</p>\n</div>\n<p>Usual preamble of course:</p>\n<pre><code><span><span><span>_default</span></span>:\n</span><span> <span>@</span>just --list\n</span></code></pre>\n<p>Then set some common variables:</p>\n<pre><code><span><span>PWD</span> <span>:=</span> <span>env</span><span>(</span><span><span>"</span>PWD<span>"</span></span>)\n</span><span><span>DOCDIR</span> <span>:=</span> <span><span>"</span>_build/default/_doc/_html<span>"</span></span>\n</span><span><span>BUILDDIR</span> <span>:=</span> <span><span>"</span>_build/install/default/bin<span>"</span></span>\n</span></code></pre>\n<p>Then set the target \u2014 the tool name, in this case <code>ocal</code> (so named as this is\nan OCaml re-implementation of a tool approximating the trad Unix\n<a href=\"https://en.wikipedia.org/wiki/Cal_(command)\"><code>cal</code></a> tool):</p>\n<pre><code><span><span>TARGET</span> <span>:=</span> 
<span><span>"</span>ocal<span>"</span></span>\n</span></code></pre>\n<p>Now for the actually useful stuff: some targets. Mostly these just call out to\n<code>dune</code> but in a way I find more intuitive.</p>\n<pre><code><span><span><span># </span>build targets</span>\n</span><span><span><span>build</span></span>:\n</span><span> dune build @all\n</span><span>\n</span><span><span><span># </span>cleanup</span>\n</span><span><span><span>clean</span></span>:\n</span><span> dune clean\n</span><span>\n</span><span><span><span># </span>uninstall targets</span>\n</span><span><span><span>uninstall</span></span>:\n</span><span> dune uninstall\n</span><span>\n</span><span><span><span># </span>run any tests</span>\n</span><span><span><span>test</span></span>:\n</span><span> dune runtest\n</span><span>\n</span><span><span><span># </span>format sources</span>\n</span><span><span><span>format</span></span>:\n</span><span> dune fmt\n</span></code></pre>\n<p>Some compound calls next.</p>\n<p>First, before building we might need to install dependencies, so do so in the\ntime-honoured fashion:</p>\n<pre><code><span><span><span># </span>install dependencies</span>\n</span><span><span><span>depends</span></span>:\n</span><span> opam install --yes dune-release odoc\n</span><span> opam install --yes . 
--deps-only\n</span></code></pre>\n<p>Next, to install I first build ready to install, then symlink the resulting\nbinary into the right place in my home directory:</p>\n<pre><code><span><span><span># </span>install targets</span>\n</span><span><span><span>install</span></span>: build\n</span><span> dune build @install\n</span><span> ln -sf <span><span>{{</span><span>PWD</span><span>}}</span></span>/<span><span>{{</span><span>BUILDDIR</span><span>}}</span></span>/<span><span>{{</span><span>TARGET</span><span>}}</span></span> ~/.local/bin/\n</span></code></pre>\n<p>To lint all the things, invoke <code>dune</code> twice:</p>\n<pre><code><span><span><span># </span>lint everything</span>\n</span><span><span><span>lint</span></span>:\n</span><span> dune build @lint\n</span><span> dune-release lint\n</span></code></pre>\n<p>Similarly, to build the docs, build <em>all</em> the docs:</p>\n<pre><code><span><span><span># </span>build docs</span>\n</span><span><span><span>doc</span></span>:\n</span><span> dune build @doc\n</span><span> dune build @doc-private\n</span></code></pre>\n<p>Try to open the docs on Linux and if that fails, on MacOS:</p>\n<pre><code><span><span><span># </span>open the docs for reading</span>\n</span><span><span><span>read</span></span>: doc\n</span><span> handlr open <span><span>{{</span><span>DOCDIR</span><span>}}</span></span>/index.html || open <span><span>{{</span><span>DOCDIR</span><span>}}</span></span>\n</span></code></pre>\n<p>Finally, tag and create a release; not actually done this in ages so no idea if\n<code>dune-release</code> invocations are still a thing, let alone correct!</p>\n<pre><code><span><span><span># </span>tag and create a release</span>\n</span><span><span><span>release</span></span>:\n</span><span> dune-release tag\n</span><span> dune-release -vv\n</span></code></pre>",+"content": "<p>In similar vein to a <a href=\"https://mort.io/blog/just-latex\">recent post</a>, I have also started using\n<a 
href=\"https://just.systems/\"><code>just</code></a> when I periodically need to rebuild my\n<a href=\"https://ocaml.org/\">OCaml</a> tool<a href=\"https://mort.io/blog/just-ocaml/#1\">1</a> <a href=\"https://github.com/mor1/ocal\"><code>ocal</code></a>. So\nI ended up replacing the old\n<a href=\"https://github.com/mor1/ocal/blob/6bb129627f9d1f27ab31cee810013b362ab80067/Makefile\"><code>Makefile</code></a>\nwith a shiny new\n<a href=\"https://github.com/mor1/ocal/blob/8ef8631ae5bbe0315e359d725d467e7d0403fd31/Justfile\"><code>Justfile</code></a>.</p>\n<p>As it also proved useful in another (more esoteric) tool I wrote <a href=\"https://github.com/mor1/cst-tools\">for parsing\nout exam results for my students so I can paste into email\neasily</a>, I thought I\u2019d put it here for the\nrecord. So here it is\u2026</p>\n<div>1\n<p>Largely due to <a href=\"https://nixos.org/\">NixOS</a> upgrades moving tools into\ndifferent locations.</p>\n</div>\n<p>Usual preamble of course:</p>\n<pre><code><span><span><span>_default</span></span>:\n</span><span> <span>@</span>just --list\n</span></code></pre>\n<p>Then set some common variables:</p>\n<pre><code><span><span>PWD</span> <span>:=</span> <span>env</span><span>(</span><span><span>"</span>PWD<span>"</span></span>)\n</span><span><span>DOCDIR</span> <span>:=</span> <span><span>"</span>_build/default/_doc/_html<span>"</span></span>\n</span><span><span>BUILDDIR</span> <span>:=</span> <span><span>"</span>_build/install/default/bin<span>"</span></span>\n</span></code></pre>\n<p>Then set the target \u2014 the tool name, in this case <code>ocal</code> (so named as this is\nan OCaml re-implementation of a tool approximating the trad Unix\n<a href=\"https://en.wikipedia.org/wiki/Cal_(command)\"><code>cal</code></a> tool):</p>\n<pre><code><span><span>TARGET</span> <span>:=</span> <span><span>"</span>ocal<span>"</span></span>\n</span></code></pre>\n<p>Now for the actually useful stuff: some targets. 
Mostly these just call out to\n<code>dune</code> but in a way I find more intuitive.</p>\n<pre><code><span><span><span># </span>build targets</span>\n</span><span><span><span>build</span></span>:\n</span><span> dune build @all\n</span><span>\n</span><span><span><span># </span>cleanup</span>\n</span><span><span><span>clean</span></span>:\n</span><span> dune clean\n</span><span>\n</span><span><span><span># </span>uninstall targets</span>\n</span><span><span><span>uninstall</span></span>:\n</span><span> dune uninstall\n</span><span>\n</span><span><span><span># </span>run any tests</span>\n</span><span><span><span>test</span></span>:\n</span><span> dune runtest\n</span><span>\n</span><span><span><span># </span>format sources</span>\n</span><span><span><span>format</span></span>:\n</span><span> dune fmt\n</span></code></pre>\n<p>Some compound calls next.</p>\n<p>First, before building we might need to install dependencies, so do so in the\ntime-honoured fashion:</p>\n<pre><code><span><span><span># </span>install dependencies</span>\n</span><span><span><span>depends</span></span>:\n</span><span> opam install --yes dune-release odoc\n</span><span> opam install --yes . 
--deps-only\n</span></code></pre>\n<p>Next, to install I first build ready to install, then symlink the resulting\nbinary into the right place in my home directory:</p>\n<pre><code><span><span><span># </span>install targets</span>\n</span><span><span><span>install</span></span>: build\n</span><span> dune build @install\n</span><span> ln -sf <span><span>{{</span><span>PWD</span><span>}}</span></span>/<span><span>{{</span><span>BUILDDIR</span><span>}}</span></span>/<span><span>{{</span><span>TARGET</span><span>}}</span></span> ~/.local/bin/\n</span></code></pre>\n<p>To lint all the things, invoke <code>dune</code> twice:</p>\n<pre><code><span><span><span># </span>lint everything</span>\n</span><span><span><span>lint</span></span>:\n</span><span> dune build @lint\n</span><span> dune-release lint\n</span></code></pre>\n<p>Similarly, to build the docs, build <em>all</em> the docs:</p>\n<pre><code><span><span><span># </span>build docs</span>\n</span><span><span><span>doc</span></span>:\n</span><span> dune build @doc\n</span><span> dune build @doc-private\n</span></code></pre>\n<p>Try to open the docs on Linux and if that fails, on MacOS:</p>\n<pre><code><span><span><span># </span>open the docs for reading</span>\n</span><span><span><span>read</span></span>: doc\n</span><span> handlr open <span><span>{{</span><span>DOCDIR</span><span>}}</span></span>/index.html || open <span><span>{{</span><span>DOCDIR</span><span>}}</span></span>\n</span></code></pre>\n<p>Finally, tag and create a release; not actually done this in ages so no idea if\n<code>dune-release</code> invocations are still a thing, let alone correct!</p>\n<pre><code><span><span><span># </span>tag and create a release</span>\n</span><span><span><span>release</span></span>:\n</span><span> dune-release tag\n</span><span> dune-release -vv\n</span></code></pre>",
+18
mort/blog_lab-gitlab_.json
···+"summary": "<p>Recently had cause to do this as part of the <a href=\"https://www.cl.cam.ac.uk/research/srg/\">SRG\u2019s</a> and <a href=\"https://ocamllabs.io/\">OCaml Labs</a>\ninfrastructure. Thought it might be useful to make some notes, so here they are!\nAssuming your local <code>sys-admin</code> has kindly created you a suitable VM running\nUbuntu with login credentials, etc, read on\u2026</p>\n<p>Note that several commands that follow must be run as <code>root</code>, via use of <code>sudo</code>\nbelow. Given that, think twice before just cutting and pasting them in,\nobviously\u2026 And I am not held responsible for anything either way!</p>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#install-docker\">Install Docker</a></h2>\n<p>On a new Ubuntu stretch/sid (testing) VM:</p>\n<pre><code><span><span><span>$</span></span><span> lsb_release<span><span> -</span>drc</span></span>\n</span><span><span><span>Description:</span></span><span>\tUbuntu 16.04.1 LTS</span>\n</span><span><span><span>Release:</span></span><span>\t16.04</span>\n</span><span><span><span>Codename:</span></span><span>\txenial</span>\n</span></code></pre>\n<p>Next, install up-to-date <a href=\"https://docker.com/\">Docker</a>:</p>\n<pre><code><span><span><span>sudo</span></span><span> apt-get install apt-transport-https ca-certificates</span>\n</span><span><span><span>apt-key</span></span><span> adv<span><span> --</span>keyserver</span> hkp://p80.pool.sks-keyservers.net:80 <span>\\\n</span></span></span><span><span><span><span> --</span>recv-keys</span> 58118E89F3A912897C070ADBF76221572C52609D</span>\n</span><span><span><span>sudo</span></span><span> echo <span><span>"</span>deb https://apt.dockerproject.org/repo debian-stretch main<span>"</span></span> <span>\\\n</span></span></span><span><span> <span>></span> /etc/apt/sources.list.d/docker.list</span>\n</span><span><span><span>sudo</span></span><span> apt-get update</span>\n</span><span><span><span>sudo</span></span><span> apt-get 
install<span><span> -</span>y</span> docker-engine</span>\n</span></code></pre>\n<p>Tweak the <code>systemd</code> <a href=\"https://docker.com/\">Docker</a> configuration by adding a fragment to point\nall <a href=\"https://docker.com/\">Docker</a> to the <code>/data</code> partition, lest the root partition <code>/</code> fill:</p>\n<pre><code><span><span><span>cat</span></span><span> <span>></span> /etc/systemd/system/docker.service.d/data-disk.conf <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>[Service]\n</span></span></span><span><span><span>ExecStart=\n</span></span></span><span><span><span>ExecStart=/usr/bin/dockerd -H fd:// -g /data/docker\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<p>Then start the <a href=\"https://docker.com/\">Docker</a> daemon and run <code>hello-world</code> just to check all is\nwell:</p>\n<pre><code><span><span><span>sudo</span></span><span> systemctl daemon-reload</span>\n</span><span><span><span>sudo</span></span><span> service docker start</span>\n</span></code></pre>\n<p>Finally, test the install by running <code>hello-world</code>:</p>\n<pre><code><span><span><span>$</span></span><span> docker run hello-world</span>\n</span><span>\n</span><span><span><span>Hello</span></span><span> from Docker!</span>\n</span><span><span><span>This</span></span><span> message shows that your installation appears to be working correctly.</span>\n</span><span>\n</span><span><span><span>To</span></span><span> generate this message, Docker took the following steps:</span>\n</span><span> <span><span>1.</span></span><span> The Docker client contacted the Docker daemon.</span>\n</span><span> <span><span>2.</span></span><span> The Docker daemon pulled the <span><span>"</span>hello-world<span>"</span></span> image from the Docker Hub.</span>\n</span><span> <span><span>3.</span></span><span> The Docker daemon created a new container from that image which runs 
the</span>\n</span><span> <span><span>executable</span></span><span> that produces the output you are currently reading.</span>\n</span><span> <span><span>4.</span></span><span> The Docker daemon streamed that output to the Docker client, which sent it</span>\n</span><span> <span><span>to</span></span><span> your terminal.</span>\n</span><span>\n</span><span><span><span>To</span></span><span> try something more ambitious, you can run an Ubuntu container with:</span>\n</span><span> <span><span>$</span></span><span> docker run<span><span> -</span>it</span> ubuntu bash</span>\n</span><span>\n</span><span><span><span>Share</span></span><span> images, automate workflows, and more with a free Docker Hub account:</span>\n</span><span> <span><span>https://hub.docker.com</span></span>\n</span><span>\n</span><span><span><span>For</span></span><span> more examples and ideas, visit:</span>\n</span><span> <span><span>https://docs.docker.com/engine/userguide/</span></span>\n</span></code></pre>\n<p>If appropriate, you may also wish to add yourself to the <code>docker</code> user group:</p>\n<pre><code><span><span><span>sudo</span></span><span> usermod<span><span> -</span>aG</span> docker <span><span>$</span><span>(</span><span><span>whoami</span></span><span>)</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#gitlab\">GitLab</a></h2>\n<p>Assuming you have rights to run <code>docker</code>, install and run Gitlab-CE:</p>\n<pre><code><span><span>S</span><span>=</span><span>128.232.xxx.yyy</span>\n</span><span><span>H</span><span>=</span><span>gitlab.srg.cl.cam.ac.uk</span>\n</span><span><span><span>docker</span></span><span> run<span><span> --</span>detach</span> <span>\\\n</span></span></span><span><span><span><span> --</span>hostname</span> <span><span>$</span><span>H</span></span> <span>\\\n</span></span></span><span><span><span><span> --</span>publish</span> <span><span>$</span><span>S</span></span>:443:443 
<span>\\\n</span></span></span><span><span><span><span> --</span>publish</span> <span><span>$</span><span>S</span></span>:80:80 <span>\\\n</span></span></span><span><span><span><span> --</span>publish</span> <span><span>$</span><span>S</span></span>:2222:22 <span>\\\n</span></span></span><span><span><span><span> --</span>name</span> gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>restart</span> always <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/config:/etc/gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/logs:/var/log/gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/data:/var/opt/gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/backups:/var/opt/gitlab/backups <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/sync:/var/opt/gitlab/sync <span>\\\n</span></span></span><span><span><span><span> --</span>env</span> HOST_UID=<span><span>$</span><span>$</span></span>(id<span><span> -</span>u</span></span><span></span>) <span><span>--env</span></span><span> HOST_GID=<span><span>$</span><span>$</span></span>(id<span><span> -</span>g</span></span><span></span>) <span>\\\n</span></span><span> <span><span>mor1/gitlab-ce-cron:latest</span></span>\n</span></code></pre>\n<p>\u2026or use the <code>make start</code> target in the\n<a href=\"https://github.com/mor1/dockerfiles/blob/master/gitlab-ce-cron/Makefile\">Makefile</a>\nin the related <a href=\"https://github.com/mor1/dockerfiles/tree/master/gitlab-ce-cron\">GitHub\nrepo</a>.</p>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#tls-certificates\">TLS Certificates</a></h2>\n<p>Self-certified certificates:</p>\n<pre><code><span><span><span>openssl</span></span><span> req<span><span> -</span>nodes</span><span><span> -</span>newkey</span> rsa:2048<span><span> 
-</span>keyout</span> gitlab.srg.cl.cam.ac.uk.key<span><span> -</span>out</span> gitlab.srg.cl.cam.ac.uk.csr</span>\n</span><span><span><span>cd</span></span><span> ssl</span>\n</span><span><span><span>chmod</span></span><span> 600 <span>*</span></span>\n</span><span><span><span>openssl</span></span><span> x509<span><span> -</span>req</span><span><span> -</span>days</span> 1460<span><span> -</span>in</span> gitlab.srg.cl.cam.ac.uk.csr<span><span> -</span>signkey</span> gitlab.srg.cl.cam.ac.uk.key<span><span> -</span>out</span> gitlab.srg.cl.cam.ac.uk.crt</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#run-backups\">Run Backups</a></h2>\n<ul>\n<li><code>backup</code> script to create backup tarballs and extract</li>\n<li><code>sync</code> script to rsync extracted tarballs to filer</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#recovering-password\">Recovering Password</a></h2>\n<p>To change the <code>root</code> password you need to use the Ruby-on-Rails console to\naccess the relevant object, modify it, and save it back:</p>\n<pre><code><span>gitlab<span>-</span>rails console production\n</span><span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>01</span><span>:</span><span>0</span><span>></span> user <span>=</span> <span>User</span><span>.</span>where<span>(</span><span>id<span>:</span></span> <span>1</span><span>)</span><span>.</span>first\n</span><span><span>=></span> <span><span>#</span><User id: 1, email: "admin@example.com", created_at: "2016-11-16 22:57:21", updated_at: "2016-12-05 23:42:50", name: "Administrator", admin: true, projects_limit: 10, skype: "", linkedin: "", twitter: "", authentication_token: "secrettoken", theme_id: 2, bio: nil, username: "root", can_create_group: true, can_create_team: false, state: "active", color_scheme_id: 1, password_expires_at: nil, created_by_id: nil, last_credential_check_at: nil, avatar: nil, hide_no_ssh_key: false, website_url: "", 
notification_email: "admin@example.com", hide_no_password: false, password_automatically_set: false, location: nil, encrypted_otp_secret: nil, encrypted_otp_secret_iv: nil, encrypted_otp_secret_salt: nil, otp_required_for_login: false, otp_backup_codes: nil, public_email: "", dashboard: 0, project_view: 0, consumed_timestep: nil, layout: 0, hide_project_limit: false, otp_grace_period_started_at: nil, ldap_email: false, external: false, organization: nil>\n</span></span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>02</span><span>:</span><span>0</span><span>></span> user<span>.</span>password <span>=</span> <span><span><span>'</span>secretpassword<span>'</span></span></span>\n</span><span><span>=></span> <span><span><span>"</span>secretpassword<span>"</span></span></span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>03</span><span>:</span><span>0</span><span>></span> user<span>.</span>password_confirmation <span>=</span> <span><span><span>'</span>secretpassword<span>'</span></span></span>\n</span><span><span>=></span> <span><span><span>"</span>secretpassword<span>"</span></span></span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>04</span><span>:</span><span>0</span><span>></span> user<span>.</span>save!\n</span><span><span>Enqueued</span> <span>ActionMailer</span><span>::</span>DeliveryJob <span>(</span><span>Job</span> <span>ID<span>:</span></span> 5f74573d<span>-</span>dfa2<span>-</span><span>4778</span><span>-</span>b365<span>-</span>cbebd88e454e<span>)</span> to <span>Sidekiq</span><span>(</span>mailers<span>)</span> with <span>arguments<span>:</span></span> <span><span><span>"</span>DeviseMailer<span>"</span></span></span><span>,</span> <span><span><span>"</span>password_change<span>"</span></span></span><span>,</span> <span><span><span>"</span>deliver_now<span>"</span></span></span><span>,</span> 
<span>gid<span>:</span></span><span><span><span>/</span><span>/</span></span></span>gitlab<span>/</span><span>User</span><span>/</span><span>1</span>\n</span><span><span>=></span> <span>true</span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>05</span><span>:</span><span>0</span><span>></span>\n</span><span>\n</span><span>gitlab<span>-</span>ctl reconfigure\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#hook-up-to-github\">Hook up to GitHub</a></h2>\n<p>Per\n<a href=\"https://docs.gitlab.com/ce/integration/omniauth.html#initial-omniauth-configuration\">https://docs.gitlab.com/ce/integration/omniauth.html#initial-omniauth-configuration</a> and\n<a href=\"https://docs.gitlab.com/ce/integration/github.html\">https://docs.gitlab.com/ce/integration/github.html</a>:</p>\n<p>Edit via <code>sudo docker exec -it gitlab /bin/bash</code>:</p>\n<pre><code><span><span><span>root@gitlab:/#</span></span><span> vi /etc/gitlab/gitlab.rb</span>\n</span></code></pre>\n<pre><code><span>gitlab_rails<span>[</span><span><span><span>'</span>omniauth_enabled<span>'</span></span></span><span>]</span> <span>=</span> <span>true</span>\n</span><span>gitlab_rails<span>[</span><span><span><span>'</span>omniauth_allow_single_sign_on<span>'</span></span></span><span>]</span> <span>=</span> <span>[</span><span><span><span>'</span>saml<span>'</span></span></span><span>,</span> <span><span><span>'</span>github<span>'</span></span></span><span>]</span>\n</span><span>gitlab_rails<span>[</span><span><span><span>'</span>omniauth_block_auto_created_users<span>'</span></span></span><span>]</span> <span>=</span> <span>true</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#email\">Email</a></h2>\n<p>Use SMTP via <ppsw.cam.ac.uk>, for which the from address must have a valid <code>MX</code>\nrecord <strong>and</strong> not be under <cam.ac.uk> per\n<a 
href=\"http://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/technical/sending\">http://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/technical/sending</a>.</p>\n<p>Configuration can be tested via the console:</p>\n<pre><code><span><span>Notify</span><span>.</span>test_email<span>(</span><span><span><span>'</span>your@email.address, <span>'</span></span></span><span>Hello</span> <span>World</span><span><span><span>'</span>, <span>'</span></span></span><span>This</span> is a <span>test</span> message<span><span><span>'</span>).deliver_now\n</span></span></span></code></pre>",+"content": "<p>Recently had cause to do this as part of the <a href=\"https://www.cl.cam.ac.uk/research/srg/\">SRG\u2019s</a> and <a href=\"https://ocamllabs.io/\">OCaml Labs</a>\ninfrastructure. Thought it might be useful to make some notes, so here they are!\nAssuming your local <code>sys-admin</code> has kindly created you a suitable VM running\nUbuntu with login credentials, etc, read on\u2026</p>\n<p>Note that several commands that follow must be run as <code>root</code>, via use of <code>sudo</code>\nbelow. 
Given that, think twice before just cutting and pasting them in,\nobviously\u2026 And I am not held responsible for anything either way!</p>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#install-docker\">Install Docker</a></h2>\n<p>On a new Ubuntu stretch/sid (testing) VM:</p>\n<pre><code><span><span><span>$</span></span><span> lsb_release<span><span> -</span>drc</span></span>\n</span><span><span><span>Description:</span></span><span>\tUbuntu 16.04.1 LTS</span>\n</span><span><span><span>Release:</span></span><span>\t16.04</span>\n</span><span><span><span>Codename:</span></span><span>\txenial</span>\n</span></code></pre>\n<p>Next, install up-to-date <a href=\"https://docker.com/\">Docker</a>:</p>\n<pre><code><span><span><span>sudo</span></span><span> apt-get install apt-transport-https ca-certificates</span>\n</span><span><span><span>apt-key</span></span><span> adv<span><span> --</span>keyserver</span> hkp://p80.pool.sks-keyservers.net:80 <span>\\\n</span></span></span><span><span><span><span> --</span>recv-keys</span> 58118E89F3A912897C070ADBF76221572C52609D</span>\n</span><span><span><span>sudo</span></span><span> echo <span><span>"</span>deb https://apt.dockerproject.org/repo debian-stretch main<span>"</span></span> <span>\\\n</span></span></span><span><span> <span>></span> /etc/apt/sources.list.d/docker.list</span>\n</span><span><span><span>sudo</span></span><span> apt-get update</span>\n</span><span><span><span>sudo</span></span><span> apt-get install<span><span> -</span>y</span> docker-engine</span>\n</span></code></pre>\n<p>Tweak the <code>systemd</code> <a href=\"https://docker.com/\">Docker</a> configuration by adding a fragment to point\nall <a href=\"https://docker.com/\">Docker</a> to the <code>/data</code> partition, lest the root partition <code>/</code> fill:</p>\n<pre><code><span><span><span>cat</span></span><span> <span>></span> /etc/systemd/system/docker.service.d/data-disk.conf 
<span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>[Service]\n</span></span></span><span><span><span>ExecStart=\n</span></span></span><span><span><span>ExecStart=/usr/bin/dockerd -H fd:// -g /data/docker\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<p>Then start the <a href=\"https://docker.com/\">Docker</a> daemon and run <code>hello-world</code> just to check all is\nwell:</p>\n<pre><code><span><span><span>sudo</span></span><span> systemctl daemon-reload</span>\n</span><span><span><span>sudo</span></span><span> service docker start</span>\n</span></code></pre>\n<p>Finally, test the install by running <code>hello-world</code>:</p>\n<pre><code><span><span><span>$</span></span><span> docker run hello-world</span>\n</span><span>\n</span><span><span><span>Hello</span></span><span> from Docker!</span>\n</span><span><span><span>This</span></span><span> message shows that your installation appears to be working correctly.</span>\n</span><span>\n</span><span><span><span>To</span></span><span> generate this message, Docker took the following steps:</span>\n</span><span> <span><span>1.</span></span><span> The Docker client contacted the Docker daemon.</span>\n</span><span> <span><span>2.</span></span><span> The Docker daemon pulled the <span><span>"</span>hello-world<span>"</span></span> image from the Docker Hub.</span>\n</span><span> <span><span>3.</span></span><span> The Docker daemon created a new container from that image which runs the</span>\n</span><span> <span><span>executable</span></span><span> that produces the output you are currently reading.</span>\n</span><span> <span><span>4.</span></span><span> The Docker daemon streamed that output to the Docker client, which sent it</span>\n</span><span> <span><span>to</span></span><span> your terminal.</span>\n</span><span>\n</span><span><span><span>To</span></span><span> try something more ambitious, you can run an Ubuntu container 
with:</span>\n</span><span> <span><span>$</span></span><span> docker run<span><span> -</span>it</span> ubuntu bash</span>\n</span><span>\n</span><span><span><span>Share</span></span><span> images, automate workflows, and more with a free Docker Hub account:</span>\n</span><span> <span><span>https://hub.docker.com</span></span>\n</span><span>\n</span><span><span><span>For</span></span><span> more examples and ideas, visit:</span>\n</span><span> <span><span>https://docs.docker.com/engine/userguide/</span></span>\n</span></code></pre>\n<p>If appropriate, you may also wish to add yourself to the <code>docker</code> user group:</p>\n<pre><code><span><span><span>sudo</span></span><span> usermod<span><span> -</span>aG</span> docker <span><span>$</span><span>(</span><span><span>whoami</span></span><span>)</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#gitlab\">GitLab</a></h2>\n<p>Assuming you have rights to run <code>docker</code>, install and run Gitlab-CE:</p>\n<pre><code><span><span>S</span><span>=</span><span>128.232.xxx.yyy</span>\n</span><span><span>H</span><span>=</span><span>gitlab.srg.cl.cam.ac.uk</span>\n</span><span><span><span>docker</span></span><span> run<span><span> --</span>detach</span> <span>\\\n</span></span></span><span><span><span><span> --</span>hostname</span> <span><span>$</span><span>H</span></span> <span>\\\n</span></span></span><span><span><span><span> --</span>publish</span> <span><span>$</span><span>S</span></span>:443:443 <span>\\\n</span></span></span><span><span><span><span> --</span>publish</span> <span><span>$</span><span>S</span></span>:80:80 <span>\\\n</span></span></span><span><span><span><span> --</span>publish</span> <span><span>$</span><span>S</span></span>:2222:22 <span>\\\n</span></span></span><span><span><span><span> --</span>name</span> gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>restart</span> always <span>\\\n</span></span></span><span><span><span><span> 
--</span>volume</span> /data/gitlab/config:/etc/gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/logs:/var/log/gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/data:/var/opt/gitlab <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/backups:/var/opt/gitlab/backups <span>\\\n</span></span></span><span><span><span><span> --</span>volume</span> /data/gitlab/sync:/var/opt/gitlab/sync <span>\\\n</span></span></span><span><span><span><span> --</span>env</span> HOST_UID=<span><span>$</span></span>(id<span><span> -</span>u</span></span><span></span>) <span><span>--env</span></span><span> HOST_GID=<span><span>$</span></span>(id<span><span> -</span>g</span></span><span></span>) <span>\\\n</span></span><span> <span><span>mor1/gitlab-ce-cron:latest</span></span>\n</span></code></pre>\n<p>\u2026or use the <code>make start</code> target in the\n<a href=\"https://github.com/mor1/dockerfiles/blob/master/gitlab-ce-cron/Makefile\">Makefile</a>\nin the related <a href=\"https://github.com/mor1/dockerfiles/tree/master/gitlab-ce-cron\">GitHub\nrepo</a>.</p>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#tls-certificates\">TLS Certificates</a></h2>\n<p>Self-signed certificates:</p>\n<pre><code><span><span><span>openssl</span></span><span> req<span><span> -</span>nodes</span><span><span> -</span>newkey</span> rsa:2048<span><span> -</span>keyout</span> gitlab.srg.cl.cam.ac.uk.key<span><span> -</span>out</span> gitlab.srg.cl.cam.ac.uk.csr</span>\n</span><span><span><span>cd</span></span><span> ssl</span>\n</span><span><span><span>chmod</span></span><span> 600 <span>*</span></span>\n</span><span><span><span>openssl</span></span><span> x509<span><span> -</span>req</span><span><span> -</span>days</span> 1460<span><span> -</span>in</span> gitlab.srg.cl.cam.ac.uk.csr<span><span> -</span>signkey</span> 
gitlab.srg.cl.cam.ac.uk.key<span><span> -</span>out</span> gitlab.srg.cl.cam.ac.uk.crt</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#run-backups\">Run Backups</a></h2>\n<ul>\n<li><code>backup</code> script to create backup tarballs and extract</li>\n<li><code>sync</code> script to rsync extracted tarballs to filer</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#recovering-password\">Recovering Password</a></h2>\n<p>To change the <code>root</code> password you need to use the Ruby-on-Rails console to\naccess the relevant object, modify it, and save it back:</p>\n<pre><code><span>gitlab<span>-</span>rails console production\n</span><span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>01</span><span>:</span><span>0</span><span>></span> user <span>=</span> <span>User</span><span>.</span>where<span>(</span><span>id<span>:</span></span> <span>1</span><span>)</span><span>.</span>first\n</span><span><span>=></span> <span><span>#</span><User id: 1, email: "admin@example.com", created_at: "2016-11-16 22:57:21", updated_at: "2016-12-05 23:42:50", name: "Administrator", admin: true, projects_limit: 10, skype: "", linkedin: "", twitter: "", authentication_token: "secrettoken", theme_id: 2, bio: nil, username: "root", can_create_group: true, can_create_team: false, state: "active", color_scheme_id: 1, password_expires_at: nil, created_by_id: nil, last_credential_check_at: nil, avatar: nil, hide_no_ssh_key: false, website_url: "", notification_email: "admin@example.com", hide_no_password: false, password_automatically_set: false, location: nil, encrypted_otp_secret: nil, encrypted_otp_secret_iv: nil, encrypted_otp_secret_salt: nil, otp_required_for_login: false, otp_backup_codes: nil, public_email: "", dashboard: 0, project_view: 0, consumed_timestep: nil, layout: 0, hide_project_limit: false, otp_grace_period_started_at: nil, ldap_email: false, external: false, organization: 
nil>\n</span></span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>02</span><span>:</span><span>0</span><span>></span> user<span>.</span>password <span>=</span> <span><span><span>'</span>secretpassword<span>'</span></span></span>\n</span><span><span>=></span> <span><span><span>"</span>secretpassword<span>"</span></span></span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>03</span><span>:</span><span>0</span><span>></span> user<span>.</span>password_confirmation <span>=</span> <span><span><span>'</span>secretpassword<span>'</span></span></span>\n</span><span><span>=></span> <span><span><span>"</span>secretpassword<span>"</span></span></span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>04</span><span>:</span><span>0</span><span>></span> user<span>.</span>save!\n</span><span><span>Enqueued</span> <span>ActionMailer</span><span>::</span>DeliveryJob <span>(</span><span>Job</span> <span>ID<span>:</span></span> 5f74573d<span>-</span>dfa2<span>-</span><span>4778</span><span>-</span>b365<span>-</span>cbebd88e454e<span>)</span> to <span>Sidekiq</span><span>(</span>mailers<span>)</span> with <span>arguments<span>:</span></span> <span><span><span>"</span>DeviseMailer<span>"</span></span></span><span>,</span> <span><span><span>"</span>password_change<span>"</span></span></span><span>,</span> <span><span><span>"</span>deliver_now<span>"</span></span></span><span>,</span> <span>gid<span>:</span></span><span><span><span>/</span><span>/</span></span></span>gitlab<span>/</span><span>User</span><span>/</span><span>1</span>\n</span><span><span>=></span> <span>true</span>\n</span><span>irb<span>(</span>main<span>)</span><span>:</span><span><span>0</span>05</span><span>:</span><span>0</span><span>></span>\n</span><span>\n</span><span>gitlab<span>-</span>ctl reconfigure\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#hook-up-to-github\">Hook up to 
GitHub</a></h2>\n<p>Per\n<a href=\"https://docs.gitlab.com/ce/integration/omniauth.html#initial-omniauth-configuration\">https://docs.gitlab.com/ce/integration/omniauth.html#initial-omniauth-configuration</a> and\n<a href=\"https://docs.gitlab.com/ce/integration/github.html\">https://docs.gitlab.com/ce/integration/github.html</a>:</p>\n<p>Edit via <code>sudo docker exec -it gitlab /bin/bash</code>:</p>\n<pre><code><span><span><span>root@gitlab:/#</span></span><span> vi /etc/gitlab/gitlab.rb</span>\n</span></code></pre>\n<pre><code><span>gitlab_rails<span>[</span><span><span><span>'</span>omniauth_enabled<span>'</span></span></span><span>]</span> <span>=</span> <span>true</span>\n</span><span>gitlab_rails<span>[</span><span><span><span>'</span>omniauth_allow_single_sign_on<span>'</span></span></span><span>]</span> <span>=</span> <span>[</span><span><span><span>'</span>saml<span>'</span></span></span><span>,</span> <span><span><span>'</span>github<span>'</span></span></span><span>]</span>\n</span><span>gitlab_rails<span>[</span><span><span><span>'</span>omniauth_block_auto_created_users<span>'</span></span></span><span>]</span> <span>=</span> <span>true</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/lab-gitlab/#email\">Email</a></h2>\n<p>Use SMTP via <code>ppsw.cam.ac.uk</code>, for which the from address must have a valid <code>MX</code>\nrecord <strong>and</strong> not be under <code>cam.ac.uk</code> per\n<a href=\"http://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/technical/sending\">http://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/technical/sending</a>.</p>\n<p>Configuration can be tested via the console:</p>\n<pre><code>Notify.test_email('your@email.address', 'Hello World', 'This is a test\nmessage').deliver_now\n</code></pre>",
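As a hedged aside on the TLS runes in the post above: the two-step CSR-then-sign flow can be collapsed into a single command, and a subjectAltName added at the same time. This is my sketch, not what the post ran; it assumes OpenSSL 1.1.1 or later for the `-addext` flag, and reuses the post's hostname.

```shell
# One-step self-signed certificate, sketched as an alternative to the
# separate "openssl req" + "openssl x509 -req" pair above.
# Assumption: OpenSSL >= 1.1.1 (needed for -addext).
openssl req -x509 -nodes -newkey rsa:2048 -days 1460 \
  -keyout gitlab.srg.cl.cam.ac.uk.key \
  -out gitlab.srg.cl.cam.ac.uk.crt \
  -subj "/CN=gitlab.srg.cl.cam.ac.uk" \
  -addext "subjectAltName=DNS:gitlab.srg.cl.cam.ac.uk"

# Inspect what was produced.
openssl x509 -in gitlab.srg.cl.cam.ac.uk.crt -noout -subject -enddate
```

Current browsers validate the SAN rather than the CN, which the plain `req`/`x509 -req` pair does not set, so this variant tends to produce fewer warnings when you click through the self-signed prompt.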
+18
mort/blog_local-knowledge_.json
···+"summary": "<p>A relatively long hiatus this time \u2013 blame the Cambridge housing market and the\nneed to simultaneously act both incredibly quickly and mind-numbingly slowly.\nAlso obtaining a <a href=\"http://www.christs.cam.ac.uk/content/dr-richard-mortier\">College\nFellowship</a> and\nexperiencing a <a href=\"https://blog.docker.com/2016/01/unikernel/\">Company Acquisition</a>\nwere considerable (interesting, welcome) distractions\u2026 :)</p>\n<p>As an interim measure, and because I\u2019ve been asked relatively frequently over\nthe last few months, I thought I\u2019d collect some local knowledge^Wopinion,\ngarnered over many long hard years of hanging about in both Cambridge and\nNottingham. What follows should be viewed as indicating nothing more than my\nlack of imagination.</p>\n<p>Both Cambridge and Nottingham are blessed with many fine pubs, as well as\nseveral truly terrible ones. Less said about the latter the better. But among\nthose that I like, and will occasionally drag various people to (including the\n<a href=\"http://www.cl.cam.ac.uk/research/srg/\">SRG</a>) are, arranged roughly in\ngeographical order from the Computer Lab to the Station:</p>\n<ul>\n<li><a href=\"http://www.theploughcoton.co.uk/\">The Plough, Coton</a>. Gastro-pub-ish.\nFrequently changed management over recent years, but currently seems pretty\nconsistent and (finally!) is not limited to Greene King beers.</li>\n<li><a href=\"http://thecastleinncambridge.com/\">The Castle Inn</a>. Major long-standing\nAdnams \u201cproper pub\u201d. Very consistent. Castle Burger and chips please.</li>\n<li><a href=\"http://www.taylor-walker.co.uk/pub/pickerel-cambridge/c3602/\">The Pickerel</a>.\nAlso largely unchanged in a couple of decades. Good for a pint before entering\nthe Greene King desert.</li>\n<li><a href=\"https://whatpub.com/pubs/CAM/184/st-radegund-cambridge\">The Radegund</a>.\nTiniest pub I know of. Recently quite radically refurbished. 
But, while not\nwhat it was, seems pretty decent nonetheless.</li>\n<li><a href=\"https://whatpub.com/pubs/CAM/79/free-press-cambridge\">The Free Press</a>. Has\nimproved significantly while I was in Nottingham. Very good food and beer.\nStocks the only actually pleasant Greene King beer\n(<a href=\"http://www.ratebeer.com/beer/greene-king-xx-mild/14879/\">XX Mild</a>), as well\nas an interesting range of others. Quite small and often packed if it\u2019s too\ncold to sit outside.</li>\n<li><a href=\"http://www.the-cambridgeblue.co.uk/\">The Cambridge Blue</a>. Surely one of the\nbest pubs for many many miles. Excellent and huge range of beer (and,\napparently, cider; won\u2019t touch the stuff myself). Good food. Large and with a\nlarge garden too \u2013 rarely a problem finding somewhere to sit during the week.</li>\n<li><a href=\"http://www.kingston-arms.co.uk/\">The Kingston Arms</a>. Another excellent pub in\nthe station area. Possibly better food than the Blue, potentially slightly\nless wide ranging set of beers (and, certainly, ciders). Also a nice garden,\nthough in all respects somewhat smaller than the Blue so prone to being\nabsolutely packed. Doesn\u2019t serve lager \u2013 a certain German colleague makes do\nwith JHB though.</li>\n<li><a href=\"https://whatpub.com/pubs/CAM/118/live-let-live-cambridge\">The Live and Let Live</a>.\nYet one more excellent and, though small, often less busy pub near the\nstation. No food in the evenings.</li>\n<li><a href=\"https://www.individualpubs.co.uk/devonshire/\">The Devonshire Arms</a>. Largest\nrange of Milton Brewery beers I know of. 
Food = pizzas I believe (though not,\nlast time I tried, as good as those at\n<a href=\"http://www.carpentersarmscambridge.co.uk/\">The Carpenters Arms</a>).</li>\n</ul>\n<p>I don\u2019t get out to the villages as often as I\u2019d like now, but <a href=\"http://www.thegreenmangrantchester.co.uk/#the-green-man-grantchester\">The Green Man,\nGrantchester</a>,\n<a href=\"http://www.bluelionhardwick.co.uk/\">The Blue Lion, Hardwick</a> and especially\n<a href=\"http://theredlionhiston.co.uk/\">The Red Lion, Histon</a> are (or were last time I\nwent) all excellent too. The latter is still possibly my favourite pub anywhere\nin fact.</p>\n<p>I only spent a few years in Nottingham, and it\u2019s a Proper City unlike Cambridge,\nso I can\u2019t claim to have tested it thoroughly. However, places I did visit\nfairly regularly in Nottingham, Beeston, and Wollaton that I certainly enjoyed\ninclude:</p>\n<ul>\n<li><a href=\"http://www.theroundhousenottingham.co.uk/\">The Roundhouse</a>. A little pricey\nfor Nottingham, but good quality beer and food, nice atmosphere.</li>\n<li><a href=\"http://www.castlerockbrewery.co.uk/pubs/lincolnshire-poacher/\">The Lincolnshire Poacher</a>.\nVery excellent pub. Good whisky range too.</li>\n<li><a href=\"http://www.fellowsmortonandclayton.co.uk/\">Fellows, Morton & Clayton</a>.\nConsistently good, proper pub food, about 4 minutes walk from the station\nplatforms.</li>\n<li><a href=\"http://www.castlerockbrewery.co.uk/pubs/vat-and-fiddle/\">The Vat & Fiddle</a>.\nOnly finally tried more recently. Castle Rock brewery tap. Good food and beer,\nalso about 4 minutes walk from your station platform.</li>\n<li><a href=\"http://www.thehandandheart.co.uk/\">The Hand & Heart</a>. In a cave! It\u2019s a pub\nin a cave! With excellent beer and good food! 
And not owned by Greene King!\n(Unlike <a href=\"http://triptojerusalem.com/\">Ye Olde Trip to Jerusalem</a> which has\nbigger, better caves but markedly less good beer or food.)</li>\n<li><a href=\"http://www.bluemonkeybrewery.com/pubs/organ-grinder-nottingham\">The Organ Grinder</a>.\nAnother fine pub at Canning Circus. Blue Monkey brewery this time \u2013 rather\ngood. Food limited to (nice!) pork pies and such, but the beer is good.</li>\n<li><a href=\"http://www.nottinghambrewery.co.uk/the_plough_inn.html\">The Plough Inn</a>.\nClosest (decent) pub to Jubilee Campus. Nottingham City brewery tap. Excellent\nbeer, no food in the evenings, and prices that make me think I\u2019m 15 years\nyounger\u2026</li>\n<li><a href=\"http://www.victoriabeeston.co.uk/\">The Victoria Hotel, Beeston</a>. Consistently\nexcellent pub. Excellent beer, very good food.</li>\n<li><a href=\"http://www.everards.co.uk/our-pubs/crown-inn-beeston/\">The Crown Inn, Beeston</a>.\nAlso consistently excellent beer, though no food in the evenings.</li>\n<li><a href=\"http://www.molefacepubcompany.co.uk/the-wollaton-pub-and-kitchen.html\">The Wollaton Pub and Kitchen, Wollaton</a>.\nGastro-pub-ish. Generally good beer, often excellent but occasionally\npatchy food.</li>\n</ul>\n<p>So there you go. Some opinions if relatively little knowledge. YMMV. Etc.</p>\n<p>(PS. If I know you, I\u2019ll also be happy to give recommendations of solicitors,\nbuilders and other house-associated professionals too. 
And a couple of\nwarnings.)</p>",+"content": "<p>A relatively long hiatus this time \u2013 blame the Cambridge housing market and the\nneed to simultaneously act both incredibly quickly and mind-numbingly slowly.\nAlso obtaining a <a href=\"http://www.christs.cam.ac.uk/content/dr-richard-mortier\">College\nFellowship</a> and\nexperiencing a <a href=\"https://blog.docker.com/2016/01/unikernel/\">Company Acquisition</a>\nwere considerable (interesting, welcome) distractions\u2026 :)</p>\n<p>As an interim measure, and because I\u2019ve been asked relatively frequently over\nthe last few months, I thought I\u2019d collect some local knowledge^Wopinion,\ngarnered over many long hard years of hanging about in both Cambridge and\nNottingham. What follows should be viewed as indicating nothing more than my\nlack of imagination.</p>\n<p>Both Cambridge and Nottingham are blessed with many fine pubs, as well as\nseveral truly terrible ones. Less said about the latter the better. But among\nthose that I like, and will occasionally drag various people to (including the\n<a href=\"http://www.cl.cam.ac.uk/research/srg/\">SRG</a>) are, arranged roughly in\ngeographical order from the Computer Lab to the Station:</p>\n<ul>\n<li><a href=\"http://www.theploughcoton.co.uk/\">The Plough, Coton</a>. Gastro-pub-ish.\nFrequently changed management over recent years, but currently seems pretty\nconsistent and (finally!) is not limited to Greene King beers.</li>\n<li><a href=\"http://thecastleinncambridge.com/\">The Castle Inn</a>. Major long-standing\nAdnams \u201cproper pub\u201d. Very consistent. Castle Burger and chips please.</li>\n<li><a href=\"http://www.taylor-walker.co.uk/pub/pickerel-cambridge/c3602/\">The Pickerel</a>.\nAlso largely unchanged in a couple of decades. Good for a pint before entering\nthe Greene King desert.</li>\n<li><a href=\"https://whatpub.com/pubs/CAM/184/st-radegund-cambridge\">The Radegund</a>.\nTiniest pub I know of. Recently quite radically refurbished. 
But, while not\nwhat it was, seems pretty decent nonetheless.</li>\n<li><a href=\"https://whatpub.com/pubs/CAM/79/free-press-cambridge\">The Free Press</a>. Has\nimproved significantly while I was in Nottingham. Very good food and beer.\nStocks the only actually pleasant Greene King beer\n(<a href=\"http://www.ratebeer.com/beer/greene-king-xx-mild/14879/\">XX Mild</a>), as well\nas an interesting range of others. Quite small and often packed if it\u2019s too\ncold to sit outside.</li>\n<li><a href=\"http://www.the-cambridgeblue.co.uk/\">The Cambridge Blue</a>. Surely one of the\nbest pubs for many many miles. Excellent and huge range of beer (and,\napparently, cider; won\u2019t touch the stuff myself). Good food. Large and with a\nlarge garden too \u2013 rarely a problem finding somewhere to sit during the week.</li>\n<li><a href=\"http://www.kingston-arms.co.uk/\">The Kingston Arms</a>. Another excellent pub in\nthe station area. Possibly better food than the Blue, potentially slightly\nless wide ranging set of beers (and, certainly, ciders). Also a nice garden,\nthough in all respects somewhat smaller than the Blue so prone to being\nabsolutely packed. Doesn\u2019t serve lager \u2013 a certain German colleague makes do\nwith JHB though.</li>\n<li><a href=\"https://whatpub.com/pubs/CAM/118/live-let-live-cambridge\">The Live and Let Live</a>.\nYet one more excellent and, though small, often less busy pub near the\nstation. No food in the evenings.</li>\n<li><a href=\"https://www.individualpubs.co.uk/devonshire/\">The Devonshire Arms</a>. Largest\nrange of Milton Brewery beers I know of. 
Food = pizzas I believe (though not,\nlast time I tried, as good as those at\n<a href=\"http://www.carpentersarmscambridge.co.uk/\">The Carpenters Arms</a>).</li>\n</ul>\n<p>I don\u2019t get out to the villages as often as I\u2019d like now, but <a href=\"http://www.thegreenmangrantchester.co.uk/#the-green-man-grantchester\">The Green Man,\nGrantchester</a>,\n<a href=\"http://www.bluelionhardwick.co.uk/\">The Blue Lion, Hardwick</a> and especially\n<a href=\"http://theredlionhiston.co.uk/\">The Red Lion, Histon</a> are (or were last time I\nwent) all excellent too. The latter is still possibly my favourite pub anywhere\nin fact.</p>\n<p>I only spent a few years in Nottingham, and it\u2019s a Proper City unlike Cambridge,\nso I can\u2019t claim to have tested it thoroughly. However, places I did visit\nfairly regularly in Nottingham, Beeston, and Wollaton that I certainly enjoyed\ninclude:</p>\n<ul>\n<li><a href=\"http://www.theroundhousenottingham.co.uk/\">The Roundhouse</a>. A little pricey\nfor Nottingham, but good quality beer and food, nice atmosphere.</li>\n<li><a href=\"http://www.castlerockbrewery.co.uk/pubs/lincolnshire-poacher/\">The Lincolnshire Poacher</a>.\nVery excellent pub. Good whisky range too.</li>\n<li><a href=\"http://www.fellowsmortonandclayton.co.uk/\">Fellows, Morton & Clayton</a>.\nConsistently good, proper pub food, about 4 minutes walk from the station\nplatforms.</li>\n<li><a href=\"http://www.castlerockbrewery.co.uk/pubs/vat-and-fiddle/\">The Vat & Fiddle</a>.\nOnly finally tried more recently. Castle Rock brewery tap. Good food and beer,\nalso about 4 minutes walk from your station platform.</li>\n<li><a href=\"http://www.thehandandheart.co.uk/\">The Hand & Heart</a>. In a cave! It\u2019s a pub\nin a cave! With excellent beer and good food! 
And not owned by Greene King!\n(Unlike <a href=\"http://triptojerusalem.com/\">Ye Olde Trip to Jerusalem</a> which has\nbigger, better caves but markedly less good beer or food.)</li>\n<li><a href=\"http://www.bluemonkeybrewery.com/pubs/organ-grinder-nottingham\">The Organ Grinder</a>.\nAnother fine pub at Canning Circus. Blue Monkey brewery this time \u2013 rather\ngood. Food limited to (nice!) pork pies and such, but the beer is good.</li>\n<li><a href=\"http://www.nottinghambrewery.co.uk/the_plough_inn.html\">The Plough Inn</a>.\nClosest (decent) pub to Jubilee Campus. Nottingham City brewery tap. Excellent\nbeer, no food in the evenings, and prices that make me think I\u2019m 15 years\nyounger\u2026</li>\n<li><a href=\"http://www.victoriabeeston.co.uk/\">The Victoria Hotel, Beeston</a>. Consistently\nexcellent pub. Excellent beer, very good food.</li>\n<li><a href=\"http://www.everards.co.uk/our-pubs/crown-inn-beeston/\">The Crown Inn, Beeston</a>.\nAlso consistently excellent beer, though no food in the evenings.</li>\n<li><a href=\"http://www.molefacepubcompany.co.uk/the-wollaton-pub-and-kitchen.html\">The Wollaton Pub and Kitchen, Wollaton</a>.\nGastro-pub-ish. Generally good beer, often excellent but occasionally\npatchy food.</li>\n</ul>\n<p>So there you go. Some opinions if relatively little knowledge. YMMV. Etc.</p>\n<p>(PS. If I know you, I\u2019ll also be happy to give recommendations of solicitors,\nbuilders and other house-associated professionals too. And a couple of\nwarnings.)</p>",
+18
mort/blog_looping-the-loop_.json
···+"summary": "<p>In a fit of blogging mania, here\u2019s another one literally barely days after the\nprevious one. Maybe I\u2019ll crack this yet.</p>\n<p>Anyway, this is just a short one with what verges on a Technical Contribution.\nTo wit: I recently sorted out <a href=\"http://mort.io/\">this domain</a> and was having\nsome issues getting some consistency between what <code>dig</code>, Chrome and my\n<a href=\"http://gandi.net\">domain provider</a> believed to be the correct state. In\nparticular, I was switching over to make the domain properly live rather than\nsimply a <code>301 Moved Permanently</code> redirect to my old pages at Nottingham.</p>\n<p>It turns out this was probably mostly Chrome being confused. It seems that it\ncaches <code>301 Moved Permanently</code> redirects fairly aggressively and the cached\nentries are <strong>not</strong> discarded when you go through the standard mechanisms to\nclear caches.</p>\n<p>After a bit of experimentation and browsing, it seems that one way to clear this\nis to <code>view-source</code> on the page but pass a spurious parameter to defeat the\ncache. So, to force the browser to fetch <a href=\"http://mort.io\">http://mort.io</a> properly, all I had to\ndo was <code>view-source:mort.io?spurious=parameter</code>. And lo! All was well.</p>",+"content": "<p>In a fit of blogging mania, here\u2019s another one literally barely days after the\nprevious one. Maybe I\u2019ll crack this yet.</p>\n<p>Anyway, this is just a short one with what verges on a Technical Contribution.\nTo wit: I recently sorted out <a href=\"http://mort.io/\">this domain</a> and was having\nsome issues getting some consistency between what <code>dig</code>, Chrome and my\n<a href=\"http://gandi.net\">domain provider</a> believed to be the correct state. 
In\nparticular, I was switching over to make the domain properly live rather than\nsimply a <code>301 Moved Permanently</code> redirect to my old pages at Nottingham.</p>\n<p>It turns out this was probably mostly Chrome being confused. It seems that it\ncaches <code>301 Moved Permanently</code> redirects fairly aggressively and the cached\nentries are <strong>not</strong> discarded when you go through the standard mechanisms to\nclear caches.</p>\n<p>After a bit of experimentation and browsing, it seems that one way to clear this\nis to <code>view-source</code> on the page but pass a spurious parameter to defeat the\ncache. So, to force the browser to fetch <a href=\"http://mort.io\">http://mort.io</a> properly, all I had to\ndo was <code>view-source:mort.io?spurious=parameter</code>. And lo! All was well.</p>",
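The cache-busting trick in that post generalises: any URL-keyed cache, Chrome's redirect cache included, can be sidestepped by appending a query parameter it has never seen. A small illustration of the idea; the `bust` variable and the use of `date`/`$RANDOM` are my additions, assuming a bash-like shell.

```shell
# Build a view-source URL with a fresh query parameter so a cached 301
# for the bare URL cannot match. The parameter name is arbitrary;
# uniqueness of the value is what defeats the cache.
bust="spurious=$(date +%s)$RANDOM"
echo "view-source:mort.io?${bust}"
```

Each invocation yields a URL no existing cache entry can match; the same trick is handy with `curl` or `wget` when debugging a stubborn intermediary cache.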
+18
mort/blog_mediapc_.json
···+"summary": "<p>Some notes from my first attempt to renovate an old media PC that had a SYSLINUX\ninstall without any package management, and a crufty BIOS. Probably outdated\nnow, but I may go back to it one day\u2026</p>\n<p>First, some background links:</p>\n<ul>\n<li><a href=\"https://en.wikipedia.org/wiki/Cylinder-head-sector\">https://en.wikipedia.org/wiki/Cylinder-head-sector</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/FAT_boot_sector\">https://en.wikipedia.org/wiki/FAT_boot_sector</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/Logical_Block_Addressing#CHS_conversion\">https://en.wikipedia.org/wiki/Logical_Block_Addressing#CHS_conversion</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/Master_Boot_Record\">https://en.wikipedia.org/wiki/Master_Boot_Record</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/Volume_boot_record\">https://en.wikipedia.org/wiki/Volume_boot_record</a></li>\n<li><a href=\"https://wiki.archlinux.org/index.php/Syslinux\">https://wiki.archlinux.org/index.php/Syslinux</a></li>\n<li><a href=\"https://wiki.syslinux.org/wiki/index.php?title=Common_Problems#Failed_to_load_ldlinux\">https://wiki.syslinux.org/wiki/index.php?title=Common_Problems#Failed_to_load_ldlinux</a></li>\n<li><a href=\"https://wiki.syslinux.org/wiki/index.php?title=Hardware_Compatibility#USB_related_problems\">https://wiki.syslinux.org/wiki/index.php?title=Hardware_Compatibility#USB_related_problems</a></li>\n<li><a href=\"https://wiki.syslinux.org/wiki/index.php?title=Hdt_(Hardware_Detection_Tool)\">https://wiki.syslinux.org/wiki/index.php?title=Hdt_(Hardware_Detection_Tool)</a></li>\n</ul>\n<p>I explored two ways forward: <a href=\"https://www.syslinux.org/\">SYSLINUX</a> and\n<a href=\"https://www.gnu.org/software/grub/index.html\">GRUB</a>.</p>\n<h2><a href=\"https://mort.io/blog/mediapc/#syslinux\">SYSLINUX</a></h2>\n<p>I found that getting SYSLINUX working required moving the partition to 0/1/1 \u2013\nusing sectors per track of 63 or 
32, and heads per cylinder of 16 or 64 with\nappropriate cylinder values simply didn\u2019t help.</p>\n<p>Diagnosed by observing that the console displayed only CRLF but no banner \u2013\nthe banner SYSLINUX wants to display ends up falling into the\nsecond sector on the disk, so it can\u2019t be read unless the geometry is correct.\nDon\u2019t ask why old-fashioned whirling metal disk geometry needs to be set for a\nUSB stick, you\u2019ll be sad.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#formatting-the-usb-stick\">Formatting the USB stick</a></h3>\n<p>Some runes, use at your own risk.</p>\n<pre><code><span><span><span>sudo</span></span><span> dd if=/dev/zero of=/dev/sdd status=progress bs=1M count=256</span>\n</span><span><span><span>sudo</span></span><span> fdisk /dev/sdd <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>o\n</span></span></span><span><span><span>x\n</span></span></span><span><span><span>h\n</span></span></span><span><span><span>64\n</span></span></span><span><span><span>s\n</span></span></span><span><span><span>32\n</span></span></span><span><span><span>r\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>p\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>t\n</span></span></span><span><span><span>6\n</span></span></span><span><span><span>a\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>w\n</span></span></span><span><span><span>\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span><span><span>sudo</span></span><span> mkfs.fat /dev/sdd1</span>\n</span></code></pre>\n<p>One exciting gotcha: the <code>fdisk</code> utility in the <code>util-linux</code> package <strong>didn\u2019t\nwork</strong> \u2013 but the one in <code>busybox</code> did!</p>\n<h3><a 
href=\"https://mort.io/blog/mediapc/#putting-mbr-in-place\">Putting MBR in place</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> dd bs=440 count=1 conv=notrunc if=/usr/share/syslinux/mbr.bin of=/dev/sdd</span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#obtaining-and-installing-memtest86\">Obtaining and installing </a><a href=\"https://www.memtest86.com/\"><code>memtest86</code></a></h3>\n<pre><code><span><span><span>cd</span></span>\n</span><span><span><span>wget</span></span><span> http://memtest.org/download/4.10/memtest86+-4.10.zip</span>\n</span><span><span><span>unzip</span></span><span> memtest86+-4.10.zip</span>\n</span><span><span><span>sudo</span></span><span> cp <span><span>~</span></span>/memtest86+-4.10.bin /mnt/boot/</span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#putting-locally-built-syslinux-in-place\">Putting locally built SYSLINUX in place</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> mount /dev/sdd1 /mnt</span>\n</span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>p</span> /mnt/boot/syslinux</span>\n</span><span><span><span>sudo</span></span><span> syslinux<span><span> --</span>directory</span> boot/syslinux<span><span> --</span>install</span> /dev/sdd1</span>\n</span><span><span><span>sudo</span></span><span> cp /usr/share/syslinux/<span>*</span>.c32 /mnt/boot/syslinux/</span>\n</span><span><span><span>cd</span></span><span> <span><span>~</span></span>/syslinux</span>\n</span><span><span><span>make</span></span><span> bios</span>\n</span><span><span><span>sudo</span></span><span> cp <span><span>~</span></span>/syslinux/bios/com32/hdt/hdt.c32 /mnt/boot/syslinux/</span>\n</span><span><span><span>sudo</span></span><span> cp /usr/share/hwdata/pci.ids /mnt/boot/syslinux</span>\n</span></code></pre>\n<pre><code><span><span><span>sudo</span></span><span> sh<span><span> -</span>c</span> <span><span>"</span>cat > 
/mnt/boot/syslinux/syslinux.cfg<span>"</span></span> <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span># UI menu.c32\n</span></span></span><span><span><span>PROMPT 1\n</span></span></span><span><span><span>DEFAULT hdt\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL some_label\n</span></span></span><span><span><span> LINUX memdisk\n</span></span></span><span><span><span> INITRD ../alpine-standard-3.12.0-x86_64.iso\n</span></span></span><span><span><span> APPEND iso-scan/filename=../alpine-standard-3.12.0-x86_64.iso\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL memtest\n</span></span></span><span><span><span> LINUX ../memtest86+-4.10.bin\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL hdt\n</span></span></span><span><span><span> COM32 hdt.c32\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL reboot\n</span></span></span><span><span><span> COM32 reboot.c32\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL poweroff\n</span></span></span><span><span><span> COM32 poweroff.c32\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<p>Unfortunately, getting <code>hdt</code> working required rebuilding as the Alpine package\nversion doesn\u2019t appear to statically link against libupload.a from SYSLINUX\ntree so doesn\u2019t work. 
Fixing required <code>make bios</code> in the SYSLINUX tree after\ninstalling dependencies including:</p>\n<pre><code><span><span><span>sudo</span></span><span> apk<span><span> -</span>U</span> add nasm xz linux-headers util-linux-dev</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mediapc/#grub\">GRUB</a></h2>\n<p>Similar behaviour: GRUB2 displayed the <code>GRUB </code> message and nothing else. The\n<a href=\"https://help.ubuntu.com/community/Grub2/Troubleshooting#GRUB\">Ubuntu wiki</a> says\nthis is the \u201ccan\u2019t find MBR or equivalent\u201d information. In fact, it\u2019s the same\nissue: subsequent progress requires reading the second sector, but a\nCHS/LBA mismatch meant it wasn\u2019t reading from the right sector, and so it hung.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#to-wipe-the-stick\">To wipe the stick</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> dd if=/dev/zero of=/dev/sdd status=progress bs=4M</span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#to-partition-the-stick\">To partition the stick</a></h3>\n<p>In this case, to be bootable with a single <code>ext4</code> partition:</p>\n<pre><code><span><span><span>sudo</span></span><span> parted /dev/sdd</span>\n</span><span><span><span>mklabel</span></span><span> msdos</span>\n</span><span><span><span>unit</span></span><span> s</span>\n</span><span><span><span>mkpart</span></span><span> primary ext2 2048s 100<span><span>%</span></span></span>\n</span><span><span><span>set</span></span><span> 1 boot on</span>\n</span><span><span><span>set</span></span><span> 1 lba off</span>\n</span></code></pre>\n<p>\u2026or alternatively, possibly</p>\n<pre><code><span><span><span>sudo</span></span><span> fdisk /dev/sdd 
<span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>o\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>p\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>a\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#to-format-the-partition-and-install-grub-and-the-master-boot-record\">To format the partition, and install <code>grub</code> and the master boot record</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> mkfs.ext4 /dev/sdd1</span>\n</span><span><span><span>sudo</span></span><span> mount /dev/sdd1 /mnt</span>\n</span><span><span><span>sudo</span></span><span> grub-install<span><span> --</span>recheck</span><span><span> --</span>boot-directory</span><span>=</span>/mnt/boot /dev/sdd</span>\n</span></code></pre>\n<p>At this point, booting off the stick will bring htpc to <code>GRUB </code> error stage,\nindicating GRUB has loaded but doesn\u2019t know anything about how to continue.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#install-memtest\">Install memtest</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> cp memtest86+.bin /mnt</span>\n</span><span><span><span>sudo</span></span><span> cat <span>></span>/mnt/boot/grub/grub.cfg <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>set timeout=10\n</span></span></span><span><span><span>set default=0\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>menuentry "Memtest 86+" {\n</span></span></span><span><span><span> linux16 
/memtest86+.bin\n</span></span></span><span><span><span>}\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#install-alpine-iso-for-booting\">Install Alpine ISO for booting</a></h3>\n<p>Add the following stanza to GRUB config, above:</p>\n<pre><code><span><span><span>insmod</span></span><span> loopback</span>\n</span><span>\n</span><span><span><span>menuentry</span></span><span> <span><span>"</span>alpine<span>"</span></span> <span><span>{</span>\n</span></span></span><span><span><span> set isofile=/boot/alpine-standard-3.12.0-x86_64.iso\n</span></span></span><span><span><span> loopback loop <span><span>$</span><span>isofile</span></span>\n</span></span></span><span><span><span> linux (loop)/boot/vmlinuz-lts iso-scan/filename=<span><span>$</span><span>isofile</span></span> modules=loop<span>,</span>squashfs<span>,</span>sd-mod<span>,</span>usb-storage modloop=(loop)/boot/modloop-lts\n</span></span></span><span><span><span> initrd (loop)/boot/initramfs-lts\n</span></span></span><span><span><span><span>}</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mediapc/#miscellaneous-cribs\">Miscellaneous cribs</a></h2>\n<p>I did of course forget that the Bluetooth keyboard requires a dongle to be\nplugged in, which is stored inside the battery compartment. Doh. 
Making a note so I don\u2019t forget again.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#adding-internal-hard-disk\">Adding internal hard-disk</a></h3>\n<p>Required partitioning and formatting:</p>\n<ul>\n<li>find where disk is mounted: <code>sudo lshw -C disk -short</code>\nassume new disk is <code>/dev/sdX</code>, flash disk is <code>/dev/sdF</code></li>\n<li>partition disk: <code>sudo parted /dev/sdX mkpart primary ext4 1 -1</code></li>\n<li>format disk: <code>sudo mkfs.ext4 -m0 /dev/sdX1</code></li>\n<li>label disk: <code>sudo e2label /dev/sdX1 Harddisk</code></li>\n<li>remount flash disk rw: <code>mount -o remount,rw /dev/sdF</code></li>\n<li>edit <code>/boot/extlinux.conf</code> so APPEND line reads:\n<code>APPEND boot=LABEL=System disk=LABEL=Harddisk quiet</code></li>\n</ul>\n<p>I screwed up the first time by not correctly labelling the disk so had to make\nan Ubuntu rescue USB stick. Couldn\u2019t get this to work using MacOS, though didn\u2019t\ntry putting GRUB on via MacOS.</p>\n<ul>\n<li>download ISO: <a href=\"http://ubuntu-rescue-remix.org/\">http://ubuntu-rescue-remix.org/</a></li>\n<li>boot <code>ubuntu-rescue-remix-12-04.iso</code> via virtualbox</li>\n<li>mount USB stick on <code>/dev/sdX</code> at <code>/mnt</code>: <code>mount /dev/sdX /mnt</code></li>\n<li>format the stick: <code>mkfs.vfat -n multiboot /dev/sdX1</code></li>\n<li><code>cd /mnt && mkdir boot iso</code></li>\n<li><code>grub-install --force --no-floppy --boot-directory=/mnt/boot /dev/sdX</code></li>\n<li>create ISO from mounted cd:\n<code>dd if=/dev/cdrom of=/mnt/iso/ubuntu-rescue-remix-12-04.iso</code></li>\n<li>create <code>/boot/grub/grub.cfg</code> with</li>\n</ul>\n<pre><code><span><span><span>menuentry</span></span><span> <span><span>'</span>Ubuntu Rescue Remix ISO <span>'</span></span> <span><span>{</span>\n</span></span></span><span><span><span> set 
isofile=<span><span>"</span>/iso/ubuntu-rescue-remix-12-04.iso<span>"</span></span>\n</span></span></span><span><span><span> loopback loop (hd0<span>,</span>N)<span><span>$</span><span>isofile</span></span>\n</span></span></span><span><span><span> linux (loop)/casper/vmlinuz boot=casper iso-scan/filename=<span><span>$</span><span>isofile</span></span> noprompt noeject\n</span></span></span><span><span><span> initrd (loop)/casper/initrd.gz\n</span></span></span><span><span><span><span>}</span></span></span>\n</span></code></pre>\n<p>where <code>N</code> is partition number, typically 1.</p>\n<p>Finally, for backup purposes, addons are stored in\n<code>/storage/.xbmc/addons/packages</code>, and the following Alpine packages were useful\nto install for some of the above, diagnostics, etc:</p>\n<pre><code><span><span><span>sudo</span></span><span> apk add busybox-static apk-tools-static</span>\n</span><span><span><span>sudo</span></span><span> vi /etc/apk/repositories</span>\n</span><span><span><span>sudo</span></span><span> apk.static update</span>\n</span><span><span><span>sudo</span></span><span> apk.static upgrade<span><span> --</span>no-self-upgrade</span><span><span> --</span>available</span></span>\n</span><span><span><span>sudo</span></span><span> apk add lshw lshw-doc</span>\n</span><span><span><span>sudo</span></span><span> lshw<span><span> -</span>C</span> storage<span><span> -</span>short</span><span><span> -</span>numeric</span></span>\n</span><span><span><span>sudo</span></span><span> apk add lsblk</span>\n</span><span><span><span>sudo</span></span><span> lsblk</span>\n</span></code></pre>",+"content": "<p>Some notes from my first attempt to renovate an old media PC that had a SYSLINUX\ninstall without any package management, and a crufty BIOS. 
Probably outdated\nnow, but I may go back to it one day\u2026</p>\n<p>First, some background links:</p>\n<ul>\n<li><a href=\"https://en.wikipedia.org/wiki/Cylinder-head-sector\">https://en.wikipedia.org/wiki/Cylinder-head-sector</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/FAT_boot_sector\">https://en.wikipedia.org/wiki/FAT_boot_sector</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/Logical_Block_Addressing#CHS_conversion\">https://en.wikipedia.org/wiki/Logical_Block_Addressing#CHS_conversion</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/Master_Boot_Record\">https://en.wikipedia.org/wiki/Master_Boot_Record</a></li>\n<li><a href=\"https://en.wikipedia.org/wiki/Volume_boot_record\">https://en.wikipedia.org/wiki/Volume_boot_record</a></li>\n<li><a href=\"https://wiki.archlinux.org/index.php/Syslinux\">https://wiki.archlinux.org/index.php/Syslinux</a></li>\n<li><a href=\"https://wiki.syslinux.org/wiki/index.php?title=Common_Problems#Failed_to_load_ldlinux\">https://wiki.syslinux.org/wiki/index.php?title=Common_Problems#Failed_to_load_ldlinux</a></li>\n<li><a href=\"https://wiki.syslinux.org/wiki/index.php?title=Hardware_Compatibility#USB_related_problems\">https://wiki.syslinux.org/wiki/index.php?title=Hardware_Compatibility#USB_related_problems</a></li>\n<li><a href=\"https://wiki.syslinux.org/wiki/index.php?title=Hdt_(Hardware_Detection_Tool)\">https://wiki.syslinux.org/wiki/index.php?title=Hdt_(Hardware_Detection_Tool)</a></li>\n</ul>\n<p>I explored two ways forward: <a href=\"https://www.syslinux.org/\">SYSLINUX</a> and\n<a href=\"https://www.gnu.org/software/grub/index.html\">GRUB</a>.</p>\n<h2><a href=\"https://mort.io/blog/mediapc/#syslinux\">SYSLINUX</a></h2>\n<p>I found that getting SYSLINUX working required moving the partition to 0/1/1 \u2013\nusing sectors per track of 63 or 32, and heads per cylinder of 16 or 64 with\nappropriate cylinder values simply didn\u2019t help.</p>\n<p>Diagnosed by observing that the console displayed only a CRLF 
but no banner \u2013\nSYSLINUX code ends up with the banner to be displayed just falling into the\nsecond sector on the disk, so it can\u2019t be read unless the geometry is correct.\nDon\u2019t ask why old fashioned whirling metal disk geometry needs to be set for a\nUSB stick, you\u2019ll be sad.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#formatting-the-usb-stick\">Formatting the USB stick</a></h3>\n<p>Some runes, use at your own risk.</p>\n<pre><code><span><span><span>sudo</span></span><span> dd if=/dev/zero of=/dev/sdd status=progress bs=1M count=256</span>\n</span><span><span><span>sudo</span></span><span> fdisk /dev/sdd <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>o\n</span></span></span><span><span><span>x\n</span></span></span><span><span><span>h\n</span></span></span><span><span><span>64\n</span></span></span><span><span><span>s\n</span></span></span><span><span><span>32\n</span></span></span><span><span><span>r\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>p\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>t\n</span></span></span><span><span><span>6\n</span></span></span><span><span><span>a\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>w\n</span></span></span><span><span><span>\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span><span><span>sudo</span></span><span> mkfs.fat /dev/sdd1</span>\n</span></code></pre>\n<p>One exciting gotcha: the <code>fdisk</code> utility in the <code>util-linux</code> package <strong>didn\u2019t\nwork</strong> \u2013 but the one in <code>busybox</code> did!</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#putting-mbr-in-place\">Putting MBR in place</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> dd bs=440 count=1 conv=notrunc 
if=/usr/share/syslinux/mbr.bin of=/dev/sdd</span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#obtaining-and-installing-memtest86\">Obtaining and installing </a><a href=\"https://www.memtest86.com/\"><code>memtest86</code></a></h3>\n<pre><code><span><span><span>cd</span></span>\n</span><span><span><span>wget</span></span><span> http://memtest.org/download/4.10/memtest86+-4.10.zip</span>\n</span><span><span><span>unzip</span></span><span> memtest86+-4.10.zip</span>\n</span><span><span><span>sudo</span></span><span> cp <span><span>~</span></span>/memtest86+-4.10.bin /mnt/boot/</span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#putting-locally-built-syslinux-in-place\">Putting locally built SYSLINUX in place</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> mount /dev/sdd1 /mnt</span>\n</span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>p</span> /mnt/boot/syslinux</span>\n</span><span><span><span>sudo</span></span><span> syslinux<span><span> --</span>directory</span> boot/syslinux<span><span> --</span>install</span> /dev/sdd1</span>\n</span><span><span><span>sudo</span></span><span> cp /usr/share/syslinux/<span>*</span>.c32 /mnt/boot/syslinux/</span>\n</span><span><span><span>cd</span></span><span> <span><span>~</span></span>/syslinux</span>\n</span><span><span><span>make</span></span><span> bios</span>\n</span><span><span><span>sudo</span></span><span> cp <span><span>~</span></span>/syslinux/bios/com32/hdt/hdt.c32 /mnt/boot/syslinux/</span>\n</span><span><span><span>sudo</span></span><span> cp /usr/share/hwdata/pci.ids /mnt/boot/syslinux</span>\n</span></code></pre>\n<pre><code><span><span><span>sudo</span></span><span> sh<span><span> -</span>c</span> <span><span>"</span>cat > /mnt/boot/syslinux/syslinux.cfg<span>"</span></span> <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span># UI menu.c32\n</span></span></span><span><span><span>PROMPT 
1\n</span></span></span><span><span><span>DEFAULT hdt\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL some_label\n</span></span></span><span><span><span> LINUX memdisk\n</span></span></span><span><span><span> INITRD ../alpine-standard-3.12.0-x86_64.iso\n</span></span></span><span><span><span> APPEND iso-scan/filename=../alpine-standard-3.12.0-x86_64.iso\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL memtest\n</span></span></span><span><span><span> LINUX ../memtest86+-4.10.bin\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL hdt\n</span></span></span><span><span><span> COM32 hdt.c32\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL reboot\n</span></span></span><span><span><span> COM32 reboot.c32\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>LABEL poweroff\n</span></span></span><span><span><span> COM32 poweroff.c32\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<p>Unfortunately, getting <code>hdt</code> working required rebuilding as the Alpine package\nversion doesn\u2019t appear to statically link against libupload.a from SYSLINUX\ntree so doesn\u2019t work. Fixing required <code>make bios</code> in the SYSLINUX tree after\ninstalling dependencies including:</p>\n<pre><code><span><span><span>sudo</span></span><span> apk<span><span> -</span>U</span> add nasm xz linux-headers util-linux-dev</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mediapc/#grub\">GRUB</a></h2>\n<p>Similar behaviour: GRUB2 displayed the <code>GRUB </code> message and nothing else. The\n<a href=\"https://help.ubuntu.com/community/Grub2/Troubleshooting#GRUB\">Ubuntu wiki</a> says\nthis is the \u201ccan\u2019t find MBR or equivalent\u201d information. 
In fact, it\u2019s the same\nissue: subsequent progress requires reading the second sector, but a\nCHS/LBA mismatch meant it wasn\u2019t reading from the right sector, and so it hung.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#to-wipe-the-stick\">To wipe the stick</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> dd if=/dev/zero of=/dev/sdd status=progress bs=4M</span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#to-partition-the-stick\">To partition the stick</a></h3>\n<p>In this case, to be bootable with a single <code>ext4</code> partition:</p>\n<pre><code><span><span><span>sudo</span></span><span> parted /dev/sdd</span>\n</span><span><span><span>mklabel</span></span><span> msdos</span>\n</span><span><span><span>unit</span></span><span> s</span>\n</span><span><span><span>mkpart</span></span><span> primary ext2 2048s 100<span><span>%</span></span></span>\n</span><span><span><span>set</span></span><span> 1 boot on</span>\n</span><span><span><span>set</span></span><span> 1 lba off</span>\n</span></code></pre>\n<p>\u2026or alternatively, possibly</p>\n<pre><code><span><span><span>sudo</span></span><span> fdisk /dev/sdd <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>o\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>p\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>a\n</span></span></span><span><span><span>1\n</span></span></span><span><span><span>\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#to-format-the-partition-and-install-grub-and-the-master-boot-record\">To format the partition, and install <code>grub</code> and the master boot record</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> mkfs.ext4 
/dev/sdd1</span>\n</span><span><span><span>sudo</span></span><span> mount /dev/sdd1 /mnt</span>\n</span><span><span><span>sudo</span></span><span> grub-install<span><span> --</span>recheck</span><span><span> --</span>boot-directory</span><span>=</span>/mnt/boot /dev/sdd</span>\n</span></code></pre>\n<p>At this point, booting off the stick will bring htpc to <code>GRUB </code> error stage,\nindicating GRUB has loaded but doesn\u2019t know anything about how to continue.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#install-memtest\">Install memtest</a></h3>\n<pre><code><span><span><span>sudo</span></span><span> cp memtest86+.bin /mnt</span>\n</span><span><span><span>sudo</span></span><span> cat <span>></span>/mnt/boot/grub/grub.cfg <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>set timeout=10\n</span></span></span><span><span><span>set default=0\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>menuentry "Memtest 86+" {\n</span></span></span><span><span><span> linux16 /memtest86+.bin\n</span></span></span><span><span><span>}\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<h3><a href=\"https://mort.io/blog/mediapc/#install-alpine-iso-for-booting\">Install Alpine ISO for booting</a></h3>\n<p>Add the following stanza to GRUB config, above:</p>\n<pre><code><span><span><span>insmod</span></span><span> loopback</span>\n</span><span>\n</span><span><span><span>menuentry</span></span><span> <span><span>"</span>alpine<span>"</span></span> <span><span>{</span>\n</span></span></span><span><span><span> set isofile=/boot/alpine-standard-3.12.0-x86_64.iso\n</span></span></span><span><span><span> loopback loop <span><span>$</span><span>isofile</span></span>\n</span></span></span><span><span><span> linux (loop)/boot/vmlinuz-lts iso-scan/filename=<span><span>$</span><span>isofile</span></span> 
modules=loop<span>,</span>squashfs<span>,</span>sd-mod<span>,</span>usb-storage modloop=(loop)/boot/modloop-lts\n</span></span></span><span><span><span> initrd (loop)/boot/initramfs-lts\n</span></span></span><span><span><span><span>}</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mediapc/#miscellaneous-cribs\">Miscellaneous cribs</a></h2>\n<p>I did of course forget that the Bluetooth keyboard requires a dongle to be\nplugged in, which is stored inside the battery compartment. Doh. Making a note so I don\u2019t forget again.</p>\n<h3><a href=\"https://mort.io/blog/mediapc/#adding-internal-hard-disk\">Adding internal hard-disk</a></h3>\n<p>Required partitioning and formatting:</p>\n<ul>\n<li>find where disk is mounted: <code>sudo lshw -C disk -short</code>\nassume new disk is <code>/dev/sdX</code>, flash disk is <code>/dev/sdF</code></li>\n<li>partition disk: <code>sudo parted /dev/sdX mkpart primary ext4 1 -1</code></li>\n<li>format disk: <code>sudo mkfs.ext4 -m0 /dev/sdX1</code></li>\n<li>label disk: <code>sudo e2label /dev/sdX1 Harddisk</code></li>\n<li>remount flash disk rw: <code>mount -o remount,rw /dev/sdF</code></li>\n<li>edit <code>/boot/extlinux.conf</code> so APPEND line reads:\n<code>APPEND boot=LABEL=System disk=LABEL=Harddisk quiet</code></li>\n</ul>\n<p>I screwed up the first time by not correctly labelling the disk so had to make\nan Ubuntu rescue USB stick. 
Couldn\u2019t get this to work using MacOS, though didn\u2019t\ntry putting GRUB on via MacOS.</p>\n<ul>\n<li>download ISO: <a href=\"http://ubuntu-rescue-remix.org/\">http://ubuntu-rescue-remix.org/</a></li>\n<li>boot <code>ubuntu-rescue-remix-12-04.iso</code> via virtualbox</li>\n<li>mount USB stick on <code>/dev/sdX</code> at <code>/mnt</code>: <code>mount /dev/sdX /mnt</code></li>\n<li>format the stick: <code>mkfs.vfat -n multiboot /dev/sdX1</code></li>\n<li><code>cd /mnt && mkdir boot iso</code></li>\n<li><code>grub-install --force --no-floppy --boot-directory=/mnt/boot /dev/sdX</code></li>\n<li>create ISO from mounted cd:\n<code>dd if=/dev/cdrom of=/mnt/iso/ubuntu-rescue-remix-12-04.iso</code></li>\n<li>create <code>/boot/grub/grub.cfg</code> with</li>\n</ul>\n<pre><code><span><span><span>menuentry</span></span><span> <span><span>'</span>Ubuntu Rescue Remix ISO <span>'</span></span> <span><span>{</span>\n</span></span></span><span><span><span> set isofile=<span><span>"</span>/iso/ubuntu-rescue-remix-12-04.iso<span>"</span></span>\n</span></span></span><span><span><span> loopback loop (hd0<span>,</span>N)<span><span>$</span><span>isofile</span></span>\n</span></span></span><span><span><span> linux (loop)/casper/vmlinuz boot=casper iso-scan/filename=<span><span>$</span><span>isofile</span></span> noprompt noeject\n</span></span></span><span><span><span> initrd (loop)/casper/initrd.gz\n</span></span></span><span><span><span><span>}</span></span></span>\n</span></code></pre>\n<p>where <code>N</code> is partition number, typically 1.</p>\n<p>Finally, for backup purposes, addons are stored in\n<code>/storage/.xbmc/addons/packages</code>, and the following Alpine packages were useful\nto install for some of the above, diagnostics, etc:</p>\n<pre><code><span><span><span>sudo</span></span><span> apk add busybox-static apk-tools-static</span>\n</span><span><span><span>sudo</span></span><span> vi 
/etc/apk/repositories</span>\n</span><span><span><span>sudo</span></span><span> apk.static update</span>\n</span><span><span><span>sudo</span></span><span> apk.static upgrade<span><span> --</span>no-self-upgrade</span><span><span> --</span>available</span></span>\n</span><span><span><span>sudo</span></span><span> apk add lshw lshw-doc</span>\n</span><span><span><span>sudo</span></span><span> lshw<span><span> -</span>C</span> storage<span><span> -</span>short</span><span><span> -</span>numeric</span></span>\n</span><span><span><span>sudo</span></span><span> apk add lsblk</span>\n</span><span><span><span>sudo</span></span><span> lsblk</span>\n</span></code></pre>",
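The post's CHS woes come down to the standard mapping between logical block addresses and cylinder/head/sector triples (see the CHS_conversion link in the background list). A minimal sketch of that arithmetic, assuming the 64-head/32-sector geometry forced via `fdisk` above; the values are illustrative, not taken from the post:

```shell
#!/bin/sh
# LBA -> CHS for a drive with HPC heads per cylinder and SPT sectors per track:
#   C = LBA / (HPC * SPT);  H = (LBA / SPT) % HPC;  S = (LBA % SPT) + 1
lba=32 hpc=64 spt=32
c=$(( lba / (hpc * spt) ))
h=$(( (lba / spt) % hpc ))
s=$(( (lba % spt) + 1 ))
echo "LBA $lba -> CHS $c/$h/$s"   # with 64/32 geometry, LBA 32 is CHS 0/1/1
```

Under this geometry, LBA 32 lands exactly at CHS 0/1/1, which is consistent with the post's report that moving the partition to 0/1/1 was what finally let SYSLINUX read past the first sector.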
+18
mort/blog_mess-with-my-keyboard_.json
···+"summary": "<p>I recently took the plunge and upgraded my OS X. Not to vN of <em>Sierra</em> as I\u2019d\nhoped, but to v0 <em>High Sierra</em> \u2013 the perils of waiting too long\u2026</p>\n<p>Unfortunately, this toasted<a href=\"https://mort.io/blog/mess-with-my-keyboard/#1\">1</a> my carefully curated keyboard remappings, as\n<a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a> used a kernel extension, for which everything changed. All was not\nlost, however, as the rewrite to support Sierra/High Sierra was well underway. Or\nso I thought until I realised that the configuration file had changed from XML\nto JSON. And so my configuration journey began. (But it all ends well, so that\u2019s\ngood.)</p>\n<div>1\n<p>To be honest, I suspect even the <em>Sierra</em> upgrade would\u2019ve done this.</p>\n</div>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#controlling-the-config\">Controlling the config</a></h2>\n<p>The first thing was to get the new configuration under control. I did\nthis per the documentation, symlinking the config subdirectory from my\n<code>rc-files</code> repo:</p>\n<pre><code><span><span><span>cd</span></span><span> <span><span>~</span></span>/.config/</span>\n</span><span><span><span>mv</span></span><span> karabiner/ <span><span>~</span></span>/rc-files/</span>\n</span><span><span><span>ln</span></span><span><span><span> -</span>s</span> <span><span>~</span></span>/rc-files/karabiner</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#internal-apple-keyboard\">Internal Apple keyboard</a></h2>\n<p>In the interests of keeping all configuration in one place (but see below), I\ndecided to do this via a set of <a href=\"https://github.com/mor1/rc-karabiner/blob/master/assets/complex_modifications/mort-keymap.json\">complex modifications</a>. 
In summary this\nmeant:</p>\n<ul>\n<li>swap <code>(caps_lock)</code> and <code>(control)</code>:</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>description<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>mort: caps_lock -> ctrl<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>manipulators<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>caps_lock<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>optional<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>any<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> 
<span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>left_control<span>"</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span></code></pre>\n<ul>\n<li>swap <code>"</code> (glyph <code>S-'</code>) with <code>@</code> (glyph <code>S-2</code>):</li>\n</ul>\n<pre><code>    {\n
        "description": "mort: S-' (\\") <-> S-2 (@)",\n
        "manipulators": [\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "quote",\n
                    "modifiers": {"mandatory": ["shift"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "2",\n
                        "modifiers": ["shift"]\n
                    }\n
                ]\n
            },\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "2",\n
                    "modifiers": {"mandatory": ["shift"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "quote",\n
                        "modifiers": ["shift"]\n
                    }\n
                ]\n
            }\n
        ]\n
    },\n
</code></pre>\n<ul>\n<li>map <code>(backslash)</code> (glyph <code>\\</code>) to <code>#</code>, and <code>S-\\</code> (glyph <code>|</code>) to\n<code>~</code>:</li>\n</ul>\n<pre><code>    {\n
        "description": "mort: \\\\ -> #; S-\\\\ (|) -> ~",\n
        "manipulators": [\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "backslash"\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "3",\n
                        "modifiers": ["option"]\n
                    }\n
                ]\n
            },\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "backslash",\n
                    "modifiers": {"mandatory": ["shift"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "grave_accent_and_tilde",\n
                        "modifiers": ["shift"]\n
                    }\n
                ]\n
            },\n
            ...\n
    }\n
</code></pre>\n<ul>\n<li>map <code>(non_us_backslash)</code> (glyph <code>\u00a7</code>) to <code>`</code> and <code>S-(non_us_backslash)</code>\n(glyph <code>\u00b1</code>) to <code>\u20ac</code>, and then patch things up so that the usual window\nswitching works (using <code>(command)-`</code>):</li>\n</ul>\n<pre><code>    {\n
        "description": "mort: \u00a7 -> `; \u00b1 (S-\u00a7) -> \u20ac",\n
        "manipulators": [\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "non_us_backslash"\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "grave_accent_and_tilde"\n
                    }\n
                ]\n
            },\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "non_us_backslash",\n
                    "modifiers": {"mandatory": ["shift"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "2",\n
                        "modifiers": ["option"]\n
                    }\n
                ]\n
            },\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "non_us_backslash",\n
                    "modifiers": {"mandatory": ["command"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "grave_accent_and_tilde",\n
                        "modifiers": ["command"]\n
                    }\n
                ]\n
            },\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "non_us_backslash",\n
                    "modifiers": {"mandatory": ["command", "shift"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "grave_accent_and_tilde",\n
                        "modifiers": ["command", "shift"]\n
                    }\n
                ]\n
            }\n
        ]\n
    },\n
</code></pre>\n<ul>\n<li>finally, map <code>`</code> to <code>\\</code> and <code>S-`</code> to <code>|</code>:</li>\n</ul>\n<pre><code>    {\n
        "description": "mort: ` -> \\\\; S-` (~) -> |",\n
        "manipulators": [\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "grave_accent_and_tilde"\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "backslash"\n
                    }\n
                ]\n
            },\n
            {\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "grave_accent_and_tilde",\n
                    "modifiers": {"mandatory": ["shift"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "backslash",\n
                        "modifiers": ["shift"]\n
                    }\n
                ]\n
            }\n
        ]\n
    },\n
</code></pre>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#iterm2\">iTerm2</a></h2>\n<p>Unfortunately for me, iTerm2 then gets a bit confused as it wants to leave\n<code>(command)</code> alone, only allowing mapping of <code>(option)</code> to <code>(meta)</code> (or, in fact,\n<code>(esc+)</code>). In the past I swapped <code>(left_command)</code> and <code>(left_option)</code> to make\nthe usual shell (<code>bash</code>) CLI editing combinations (roughly, <code>emacs</code>) work. That\nwasn\u2019t ideal though, as I then had to fix up the window cycling commands\n(<code>(command)-`</code> and so on). Fortunately, the fix this time seems easier: configure\nthe two tricky mappings (those that generate a keypress modified with\n<code>(option)</code>) so that iTerm2 simply sends the appropriate text\nthrough. 
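</p>
<p>As an aside, those entries end up keyed by strings like <code>0x32-0x80000</code> in the
plist shown next. A throwaway sketch of how I read that format, assuming the first
field is the hex keycode and the second the macOS modifier-flag mask, with
<code>0x80000</code> taken to be the option flag and action <code>12</code> read as "send text"
(the helper below is mine, not part of iTerm2):</p>

```python
# Sketch: split an iTerm2 "Keyboard Map" plist key of the form
# "<keycode>-<modifier flags>" (both hex). Treating 0x80000 as the
# option-key mask is an assumption (NSEventModifierFlagOption == 1 << 19).

OPTION_FLAG = 0x80000  # assumed option-key modifier mask

def decode_keymap_key(key: str) -> dict:
    """Decode an iTerm2 keyboard-map key into keycode and modifier flags."""
    keycode_hex, flags_hex = key.split("-")
    flags = int(flags_hex, 16)
    return {
        "keycode": int(keycode_hex, 16),
        "flags": flags,
        "option": bool(flags & OPTION_FLAG),
    }

entry = decode_keymap_key("0x32-0x80000")
print(entry)  # {'keycode': 50, 'flags': 524288, 'option': True}
```

<p>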
Again, I did this in the UI (Preferences > Profiles > Keys), but the\nresulting configuration change is also straightforward:</p>\n<pre><code>\t\t\t<key>Keyboard Map</key>\n
\t\t\t<dict>\n
...\n
\t\t\t\t<key>0x32-0x80000</key>\n
\t\t\t\t<dict>\n
\t\t\t\t\t<key>Action</key>\n
\t\t\t\t\t<integer>12</integer>\n
\t\t\t\t\t<key>Text</key>\n
\t\t\t\t\t<string>\u20ac</string>\n
\t\t\t\t</dict>\n
...\n
\t\t\t\t<key>0x33-0x80000</key>\n
\t\t\t\t<dict>\n
\t\t\t\t\t<key>Action</key>\n
\t\t\t\t\t<integer>12</integer>\n
\t\t\t\t\t<key>Text</key>\n
\t\t\t\t\t<string>#</string>\n
\t\t\t\t</dict>\n
...\n
\t\t\t</dict>\n
</code></pre>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#microsoft-digital-media-keyboard\">Microsoft Digital Media keyboard</a></h2>\n<p>Examining the key codes using the Karabiner Event-Viewer, it seemed that the\nfirst thing to do was to swap <code>(grave_accent_and_tilde)</code> (glyph <code>`</code>) and\n<code>(non_us_backslash)</code> (slightly confusingly, glyph <code>\\</code> on my keyboard). I started\nout trying to do this as a complex modification so that all the remappings were\nin <a href=\"https://github.com/mor1/rc-karabiner/blob/master/assets/complex_modifications/mort-keymap.json\">one file</a>, but gave up: I couldn\u2019t figure out how to control the\norder in which the mappings in that file are applied. However, simple modifications are\napplied before complex modifications, and this <em>is</em> a simple modification as\nit\u2019s a direct swap, so I just used the UI and did it there. 
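</p>
<p>Since a direct swap is just two mirrored <code>from</code>/<code>to</code> entries, it is easy to
generate. A quick sketch (my own helper, not part of Karabiner) that builds the pair in
the <code>simple_modifications</code> shape this configuration uses:</p>

```python
def simple_swap(a: str, b: str) -> list:
    """Build the two mirrored simple_modifications entries that swap keys a and b."""
    return [
        {"from": {"key_code": a}, "to": {"key_code": b}},
        {"from": {"key_code": b}, "to": {"key_code": a}},
    ]

# The grave/non_us_backslash swap from this post:
mods = simple_swap("grave_accent_and_tilde", "non_us_backslash")
```

<p>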
For the sake of\ncompleteness, the resulting modification to <a href=\"https://github.com/mor1/rc-karabiner/blob/master/karabiner.json\"><code>karabiner.json</code></a> is:</p>\n<pre><code>{\n
...\n
    "profiles": [\n
        {\n
...\n
            "devices": [\n
                {\n
                    "disable_built_in_keyboard_if_exists": false,\n
                    "fn_function_keys": [],\n
                    "identifiers": {\n
                        "is_keyboard": true,\n
                        "is_pointing_device": false,\n
                        "product_id": 180,\n
                        "vendor_id": 1118\n
                    },\n
                    "ignore": false,\n
                    "simple_modifications": [\n
                        {\n
                            "from": {\n
                                "key_code": "grave_accent_and_tilde"\n
                            },\n
                            "to": {\n
                                "key_code": "non_us_backslash"\n
                            }\n
                        },\n
                        {\n
                            "from": {\n
                                "key_code": "non_us_backslash"\n
                            },\n
                            "to": {\n
                                "key_code": "grave_accent_and_tilde"\n
                            }\n
                        }\n
                    ]\n
                }\n
            ],\n
...\n
        }\n
    ]\n
}\n
</code></pre>\n<p>The next step was to patch up the complex modifications. 
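</p>
<p>Each of the rules that follow carries the same <code>device_if</code> condition so that it
only fires on this keyboard (vendor id 1118, product id 180, as above). A small sketch
(my helper, not Karabiner's) of stamping that condition onto a manipulator:</p>

```python
DEVICE = {"vendor_id": 1118, "product_id": 180}  # the Microsoft keyboard

def scope_to_device(manipulator: dict, device: dict = DEVICE) -> dict:
    """Return a copy of a manipulator restricted to one keyboard via device_if."""
    scoped = dict(manipulator)
    scoped["conditions"] = [{"type": "device_if", "identifiers": [device]}]
    return scoped

swap_option = scope_to_device({
    "type": "basic",
    "from": {"key_code": "left_option", "modifiers": {"optional": ["any"]}},
    "to": [{"key_code": "left_command"}],
})
```

<p>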
Once I realised that\nthe event viewer was claiming that the key with glyph <code>#</code> was emitting\n<code>(backslash)</code> while it was, in fact, emitting <code>(non_us_pound)</code>, this was fairly\nstraightforward:</p>\n<ul>\n<li>swap <code>(command)</code> (glyph <code>Alt</code>) and <code>(option)</code> (glyph <code>Start</code>):</li>\n</ul>\n<pre><code>            {\n
                "conditions": [\n
                    {\n
                        "type": "device_if",\n
                        "identifiers": [{"vendor_id": 1118, "product_id": 180}]\n
                    }\n
                ],\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "left_option",\n
                    "modifiers": {"optional": ["any"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "left_command"\n
                    }\n
                ]\n
            },\n
            {\n
                "conditions": [\n
                    {\n
                        "type": "device_if",\n
                        "identifiers": [{"vendor_id": 1118, "product_id": 180}]\n
                    }\n
                ],\n
                "type": "basic",\n
                "from": {\n
                    "key_code": "left_command",\n
                    "modifiers": {"optional": ["any"]}\n
                },\n
                "to": [\n
                    {\n
                        "key_code": "left_option"\n
                    }\n
                ]\n
            }\n
        ]\n
    },\n
</code></pre>\n<ul>\n<li>add 
coverage of <code>(non_us_pound)</code> to the rule that remaps <code>\\</code> to <code>#</code>:</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>conditions<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>device_if<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>identifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>{</span></span><span><span><span>"</span>vendor_id<span>"</span></span></span><span><span>:</span> </span><span><span>1118</span><span>,</span> </span><span><span><span>"</span>product_id<span>"</span></span></span><span><span>:</span> </span><span><span>180</span><span>}</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_pound<span>"</span></span>\n</span></span></span><span><span><span> <span>}</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>3<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>option<span>"</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>conditions<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>device_if<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>identifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>{</span></span><span><span><span>"</span>vendor_id<span>"</span></span></span><span><span>:</span> </span><span><span>1118</span><span>,</span> </span><span><span><span>"</span>product_id<span>"</span></span></span><span><span>:</span> </span><span><span>180</span><span>}</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_pound<span>"</span></span><span>,</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span><span><span><span> <span>}</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>\n</span></code></pre>\n<p>\u2026and that\u2019s it. My keyboard is, once again, my castle.</p>",+"content": "<p>I recently took the plunge and upgraded my OS X. 
Not to vN of <em>Sierra</em> as I\u2019d\nhoped, but to v0 <em>High Sierra</em> \u2013 the perils of waiting too long\u2026</p>\n<p>Unfortunately, this toasted<a href=\"https://mort.io/blog/mess-with-my-keyboard/#1\">1</a> my carefully curated keyboard remappings as\n<a href=\"https://pqrs.org/osx/karabiner/\">Karabiner</a> used a kernel extension, for which everything changed. All was not\nlost however, as the rewrite to support Sierra/High Sierra was well underway. Or\nso I thought until I realised that the configuration file had changed from XML\nto JSON. And so my configuration journey began. (But it all ends well, so that\u2019s\ngood.)</p>\n<div>1\n<p>To be honest, I suspect even the <em>Sierra</em> upgrade would\u2019ve done this.</p>\n</div>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#controlling-the-config\">Controlling the config</a></h2>\n<p>The first thing was to get the new configuration matters under control. I did\nthis per the documentation, symlinking the config subdirectory from my\n<code>rc-files</code> repo:</p>\n<pre><code><span><span><span>cd</span></span><span> <span><span>~</span></span>/.config/</span>\n</span><span><span><span>mv</span></span><span> karabiner/ <span><span>~</span></span>/rc-files/</span>\n</span><span><span><span>ln</span></span><span><span><span> -</span>s</span> <span><span>~</span></span>/rc-files/karabiner</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#internal-apple-keyboard\">Internal Apple keyboard</a></h2>\n<p>In the interests of keeping all configuration in one place (but see below), I\ndecided to do this via a set of <a href=\"https://github.com/mor1/rc-karabiner/blob/master/assets/complex_modifications/mort-keymap.json\">complex modifications</a>. 
In summary this\nmeant:</p>\n<ul>\n<li>swap <code>(caps_lock)</code> and <code>(control)</code>:</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>description<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>mort: caps_lock -> ctrl<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>manipulators<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>caps_lock<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>optional<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>any<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> 
<span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>left_control<span>"</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span></code></pre>\n<ul>\n<li>swap <code>\"</code> (glyph <code>S-'</code>) with <code>@</code> (glyph <code>S-2</code>):</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>description<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>mort: S-' (<span>\\"</span>) <-> S-2 (@)<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>manipulators<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>quote<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> 
</span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>2<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> 
</span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>2<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>quote<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span></code></pre>\n<ul>\n<li>map <code>(backslash)</code> (glyph <code>\\</code>) to <code>#</code>, and <code>S-\\</code> (glyph <code>|</code>) to 
<code>~</code>:</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>description<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>mort: <span>\\\\</span> -> #; S-<span>\\\\</span> (|) -> ~<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>manipulators<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>backslash<span>"</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>3<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>[</span><span><span>"</span>option<span>"</span></span><span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>backslash<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> 
</span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span><span><span><span><span>.</span><span>.</span><span>.</span>\n</span></span></span><span><span><span> <span>}</span>\n</span></span></span></code></pre>\n<ul>\n<li>map <code>(non_us_backslash)</code> (glyph <code>\u00a7</code>) to <code>`</code> and <code>S-(non_us_backslash)</code>\n(glyph <code>\u00b1</code>) to <code>\u20ac</code>, and then patch things up so that the usual window\nswitching works (using <code>(command)-`</code>):</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>description<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>mort: \u00a7 -> `; \u00b1 (S-\u00a7) -> \u20ac<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>manipulators<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> 
</span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_backslash<span>"</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_backslash<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> 
</span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>2<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>option<span>"</span></span><span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_backslash<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>command<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>command<span>"</span></span><span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> 
</span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_backslash<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>command<span>"</span></span><span>,</span> <span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>command<span>"</span></span><span>,</span> 
<span><span>"</span>shift<span>"</span></span><span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span></code></pre>\n<ul>\n<li>finally, map <code>`</code> to <code>\\</code> and <code>S-`</code> to <code>|</code></li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>description<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>mort: ` -> <span>\\\\</span>; S-` (~) -> |<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>manipulators<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>backslash<span>"</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span><span>,</span>\n</span></span></span></span></span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>backslash<span>"</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#iterm2\">iTerm2</a></h2>\n<p>Unfortunately for me, iTerm2 then gets a bit confused as it wants to leave\n<code>(command)</code> alone, only allowing mapping of <code>(option)</code> to <code>(meta)</code> (or, in fact,\n<code>(esc+)</code>). In the past I swapped <code>(left_command)</code> and <code>(left_option)</code> to make\nthe usual shell (<code>bash</code>) CLI editing combinations (roughly, <code>emacs</code>) work. That\nwasn\u2019t ideal though as I then had to fix up the window cycling commands\n(<code>(command)-` </code> and so on). Fortunately, the fix this time seems easier: just\nconfigure the two tricky mappings (involving generating a keypress modified with\n<code>(option)</code>) to be interpreted by iTerm2 to just send the appropriate text\nthrough. 
Again, I did this in the UI (Preferences > Profiles > Keys) but the\nresulting configuration change is also straightforward:</p>\n<pre><code><span>\t\t\t<span><span><</span><span>key</span><span>></span></span>Keyboard Map<span><span></</span><span>key</span><span>></span></span>\n</span><span>\t\t\t<span><span><</span><span>dict</span><span>></span></span>\n</span><span>...\n</span><span>\t\t\t\t<span><span><</span><span>key</span><span>></span></span>0x32-0x80000<span><span></</span><span>key</span><span>></span></span>\n</span><span>\t\t\t\t<span><span><</span><span>dict</span><span>></span></span>\n</span><span>\t\t\t\t\t<span><span><</span><span>key</span><span>></span></span>Action<span><span></</span><span>key</span><span>></span></span>\n</span><span>\t\t\t\t\t<span><span><</span><span>integer</span><span>></span></span>12<span><span></</span><span>integer</span><span>></span></span>\n</span><span>\t\t\t\t\t<span><span><</span><span>key</span><span>></span></span>Text<span><span></</span><span>key</span><span>></span></span>\n</span><span>\t\t\t\t\t<span><span><</span><span>string</span><span>></span></span>\u20ac<span><span></</span><span>string</span><span>></span></span>\n</span><span>\t\t\t\t<span><span></</span><span>dict</span><span>></span></span>\n</span><span>...\n</span><span>\t\t\t\t<span><span><</span><span>key</span><span>></span></span>0x33-0x80000<span><span></</span><span>key</span><span>></span></span>\n</span><span>\t\t\t\t<span><span><</span><span>dict</span><span>></span></span>\n</span><span>\t\t\t\t\t<span><span><</span><span>key</span><span>></span></span>Action<span><span></</span><span>key</span><span>></span></span>\n</span><span>\t\t\t\t\t<span><span><</span><span>integer</span><span>></span></span>12<span><span></</span><span>integer</span><span>></span></span>\n</span><span>\t\t\t\t\t<span><span><</span><span>key</span><span>></span></span>Text<span><span></</span><span>key</span><span>></span></span>\n</span><span>\t\t\t\t\t<spa
n><span><</span><span>string</span><span>></span></span>#<span><span></</span><span>string</span><span>></span></span>\n</span><span>\t\t\t\t<span><span></</span><span>dict</span><span>></span></span>\n</span><span>...\n</span><span>\t\t\t<span><span></</span><span>dict</span><span>></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/mess-with-my-keyboard/#microsoft-digital-media-keyboard\">Microsoft Digital Media keyboard</a></h2>\n<p>Examining the key codes using the Karabiner Event-Viewer, it seemed that the\nfirst thing to do was to swap <code>(grave_accent_and_tilde)</code> (glyph <code>`</code>) and\n<code>(non_us_backslash)</code> (slightly confusingly, glyph <code>\\</code> on my keyboard). I started\nout trying to do this as a complex modification so that all the remappings were\nin <a href=\"https://github.com/mor1/rc-karabiner/blob/master/assets/complex_modifications/mort-keymap.json\">one file</a>, but couldn\u2019t: I couldn\u2019t figure out how to control the\napplication order of mappings in that file. However, simple modifications are\napplied before complex modifications, and this <em>is</em> a simple modification as\nit\u2019s a direct swap, so I just used the UI and did it there. 
For the sake of\ncompleteness, the resulting modification to <a href=\"https://github.com/mor1/rc-karabiner/blob/master/karabiner.json\"><code>karabiner.json</code></a> is:</p>\n<pre><code><span><span><span>{</span>\n</span></span><span><span><span>.</span><span>.</span><span>.</span>\n</span></span><span><span> </span><span><span><span>"</span>profiles<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span><span>.</span><span>.</span><span>.</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>devices<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>disable_built_in_keyboard_if_exists<span>"</span></span></span><span><span>:</span> </span><span><span>false</span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>fn_function_keys<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span>]</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>identifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> </span><span><span><span>"</span>is_keyboard<span>"</span></span></span><span><span>:</span> </span><span><span>true</span><span>,</span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> </span><span><span><span>"</span>is_pointing_device<span>"</span></span></span><span><span>:</span> 
</span><span><span>false</span><span>,</span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> </span><span><span><span>"</span>product_id<span>"</span></span></span><span><span>:</span> </span><span><span>180</span><span>,</span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> </span><span><span><span>"</span>vendor_id<span>"</span></span></span><span><span>:</span> </span><span><span>1118</span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>ignore<span>"</span></span></span><span><span>:</span> </span><span><span>false</span><span>,</span>\n</span></span></span></span></span></span><span><span><span><span><span><span> </span><span><span><span>"</span>simple_modifications<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span> 
</span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_backslash<span>"</span></span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> <span><span>{</span>\n</span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_backslash<span>"</span></span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> <span>}</span></span><span>,</span>\n</span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span>\n</span></span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span></span></span></span><span><span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span></span></span><span><span><span><span><span><span><span> <span>]</span></span>\n</span></span></span></span></span></span><span><span><span><span><span><span> <span>}</span></span>\n</span></span></span></span></span><span><span><span><span><span> <span>]</span></span><span>,</span>\n</span></span></span></span><span><span><span><span><span>.</span><span>.</span><span>.</span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span><span>}</span></span>\n</span></code></pre>\n<p>The next step was to patch up the complex modifications. 
Once I realised that\nthe event viewer was claiming that the key with glyph <code>#</code> was emitting\n<code>(backslash)</code> while it was, in fact, emitting <code>(non_us_pound)</code>, this was fairly\nstraightforward:</p>\n<ul>\n<li>swap <code>(command)</code> (glyph <code>Alt</code>) and <code>(option)</code> (glyph <code>Start</code>):</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>conditions<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>device_if<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>identifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>{</span></span><span><span><span>"</span>vendor_id<span>"</span></span></span><span><span>:</span> </span><span><span>1118</span><span>,</span> </span><span><span><span>"</span>product_id<span>"</span></span></span><span><span>:</span> </span><span><span>180</span><span>}</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>"</span>left_option<span>"</span></span><span>,</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>optional<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>any<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span><span><span><span> <span>}</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>left_command<span>"</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>conditions<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>device_if<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>identifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>{</span></span><span><span><span>"</span>vendor_id<span>"</span></span></span><span><span>:</span> </span><span><span>1118</span><span>,</span> 
</span><span><span><span>"</span>product_id<span>"</span></span></span><span><span>:</span> </span><span><span>180</span><span>}</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>left_command<span>"</span></span><span>,</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>optional<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>any<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span><span><span><span> <span>}</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>left_option<span>"</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>\n</span><span> ]\n</span><span> },\n</span></code></pre>\n<ul>\n<li>add 
coverage of <code>(non_us_pound)</code> to the rule that remaps <code>\\</code> to <code>#</code>:</li>\n</ul>\n<pre><code><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>conditions<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>device_if<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>identifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>{</span></span><span><span><span>"</span>vendor_id<span>"</span></span></span><span><span>:</span> </span><span><span>1118</span><span>,</span> </span><span><span><span>"</span>product_id<span>"</span></span></span><span><span>:</span> </span><span><span>180</span><span>}</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_pound<span>"</span></span>\n</span></span></span><span><span><span> <span>}</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>3<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>option<span>"</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>,\n</span><span> <span><span>{</span>\n</span></span><span><span> </span><span><span><span>"</span>conditions<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>device_if<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>identifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>{</span></span><span><span><span>"</span>vendor_id<span>"</span></span></span><span><span>:</span> </span><span><span>1118</span><span>,</span> </span><span><span><span>"</span>product_id<span>"</span></span></span><span><span>:</span> </span><span><span>180</span><span>}</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>type<span>"</span></span></span><span><span>:</span> 
</span><span><span><span>"</span>basic<span>"</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>from<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>non_us_pound<span>"</span></span><span>,</span>\n</span></span></span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>{</span></span><span><span><span>"</span>mandatory<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span><span>}</span></span>\n</span></span></span><span><span><span> <span>}</span></span><span>,</span>\n</span></span><span><span> </span><span><span><span>"</span>to<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span>\n</span></span></span><span><span><span> <span><span>{</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>key_code<span>"</span></span></span><span><span>:</span> </span><span><span><span>"</span>grave_accent_and_tilde<span>"</span></span><span>,</span>\n</span></span></span></span><span><span><span><span> </span><span><span><span>"</span>modifiers<span>"</span></span></span><span><span>:</span> </span><span><span><span>[</span><span><span>"</span>shift<span>"</span></span><span>]</span></span>\n</span></span></span></span><span><span><span><span> <span>}</span></span>\n</span></span></span><span><span><span> <span>]</span></span>\n</span></span><span><span> <span>}</span></span>\n</span></code></pre>\n<p>\u2026and that\u2019s it. My keyboard is, once again, my castle.</p>",
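The highlighter-wrapped listings above are hard to scan once flattened. Stripped of markup, the device-scoped complex-modification rule the post describes (remapping the `#` key, `(non_us_pound)`, to `option-3` on the Microsoft keyboard only) has roughly this shape — the key codes and the `vendor_id`/`product_id` values are taken from the listings above, but treat this as a readability sketch rather than a verified drop-in config:

```json
{
  "description": "sketch: # key (non_us_pound) -> option-3, MS keyboard only",
  "manipulators": [
    {
      "type": "basic",
      "conditions": [
        {
          "type": "device_if",
          "identifiers": [{ "vendor_id": 1118, "product_id": 180 }]
        }
      ],
      "from": { "key_code": "non_us_pound" },
      "to": [{ "key_code": "3", "modifiers": ["option"] }]
    }
  ]
}
```

The `device_if` condition is what keeps the rule from firing on the built-in keyboard, which is the reason the post uses complex rather than simple modifications for the device-specific remaps.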
+18
mort/blog_moving-onto-mirage_.json
···+"summary": "<p>For a little while I\u2019ve had <a href=\"http://github.com/mor1/mor1.github.io\">this site</a> running as a <a href=\"http://openmirage.org/\">MirageOS</a>\nunikernel, shadowing the main site hosted on <a href=\"http://github.com/\">GitHub</a>. I\u2019ve finally decided to\nmake the switch, as part of moving over to take advantage of Mirage\u2019s DNS and\nTLS libraries.</p>\n<p>Following the usual pattern, as previously explained by <a href=\"http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Amir</a>, <a href=\"http://www.somerandomidiot.com/blog/2014/08/19/i-am-unikernel/\">Mindy</a> and\nothers, the process is:</p>\n<ul>\n<li>Construct a static <a href=\"http://jekyllrb.com\">Jekyll</a> site.</li>\n<li>Write a <a href=\"http://travis-ci.com/\">Travis</a> YAML file to cause <a href=\"http://travis-ci.com/\">Travis</a> to build the unikernel image\nand commit it back to the deployment repository.</li>\n<li>Write a Git <code>post-merge</code> hook for the deployment repository, so that the\nlatest unikernel is automatically booted when a merge is detected, i.e., there\nis a new unikernel image.</li>\n<li>Write a <code>cron</code> job that periodically polls the deployment repository, pulling\nany changes.</li>\n</ul>\n<p>Building a <a href=\"http://jekyllrb.com\">Jekyll</a> site is well-documented \u2013 I did find that I had to tweak\nmy <a href=\"https://github.com/mor1/mor1.github.io/blob/master/_config.yml\"><code>_config.yml</code></a> so as to make sure my local toolchain matched the\none used by Github, ensuring consistency between versions of the site. 
For\nconvenience:</p>\n<pre><code><span><span><span>make</span></span><span> site</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/moving-onto-mirage/#bringing-up-the-network\">Bringing up the network</a></h2>\n<p>The <a href=\"https://github.com/mor1/mor1.github.io/blob/master/.travis.yml\"><code>.travis.yml</code></a> file then specifies the three main targets for\nthe CI test build to carry out: Unix with a standard sockets backend\n(<code>MIRAGE_BACKEND=unix</code>, <code>MIRAGE_NET=socket</code>) and with the Mirage network stack\n(<code>MIRAGE_BACKEND=unix</code>, <code>MIRAGE_NET=direct</code>), and with the Xen backend\n(<code>MIRAGE_BACKEND=xen</code>). For the latter case, we must also specify the static IP\nconfiguration to be used (<code>MIRAGE_ADDR</code>, <code>..._GWS</code>, and <code>..._MASK</code>). The\n<a href=\"https://github.com/mor1/mor1.github.io/blob/master/.travis.sh\"><code>.travis.sh</code></a> script then calls the standard skeleton\n<a href=\"https://github.com/ocaml/ocaml-travisci-skeleton/blob/master/.travis-mirage.sh\"><code>.travis-mirage.sh</code></a> script after first building the site\ncontent using Jekyll.</p>\n<p>This tests the three basic combinations of network backend for a Mirage\nappliance:</p>\n<pre><code><span><span><span>$</span></span><span> make configure.socket build</span>\n</span></code></pre>\n<ul>\n<li><strong>UNIX/socket</strong> requires no configuration. The network device is configured\nwith the loopback address, <code>127.0.0.1</code>. Appliances can be run without\nrequiring <code>root</code> privileges, assuming they only bind to non-privileged ports.</li>\n</ul>\n<pre><code><span><span><span>$</span></span><span> make configure.direct build</span>\n</span></code></pre>\n<ul>\n<li><strong>UNIX/direct/dhcp</strong> requires no configuration if a DHCP server is running and\ncan respond. 
The appliance must be run with <code>root</code> privileges to use the new\nnetwork bridging capability of OSX 10.10, whereupon the DHCP client in the\nappliance follows the usual protocol.</li>\n</ul>\n<pre><code><span><span><span>$</span></span><span> make configure.xen build <span>\\\n</span></span></span><span><span> ADDR=<span><span>"</span>46.43.42.137<span>"</span></span> GWS=<span><span>"</span>46.43.42.129<span>"</span></span> MASK=<span><span>"</span>255.255.255.128<span>"</span></span></span>\n</span></code></pre>\n<ul>\n<li><strong>Xen</strong> uses the Mirage network stack and expects static configuration of the\nnetwork device.</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/moving-onto-mirage/#using-travis-ci\">Using Travis CI</a></h2>\n<p>Of course, all that is for local development \u2013 for the live site, this is\nactually all wrapped up using <a href=\"http://travis-ci.com/\">Travis CI</a>. Due to a small pull request\nwaiting on the <a href=\"https://github.com/ocaml/ocaml-travisci-skeleton\">OCaml Travis CI skeleton scripts</a> and a few\nMirage releases currently being readied, this looks a little more complex than\nit needs to (the <code>FORK_USER</code> and <code>DEV_REMOTE</code> variables shouldn\u2019t need to be\nspecified in the long run) but anyway:</p>\n<pre><code><span><span><span>language</span></span><span>:</span> <span>c</span>\n</span><span><span><span>script</span></span><span>:</span> <span>bash -ex .travis.sh</span>\n</span><span><span><span>env</span></span><span>:</span>\n</span><span> <span><span>matrix</span></span><span>:</span>\n</span><span> <span>-</span> <span>FORK_USER=mor1 DEV_REMOTE=git://github.com/mirage/mirage-dev</span>\n</span><span> <span>OCAML_VERSION=4.02 MIRAGE_BACKEND=unix MIRAGE_NET=socket</span>\n</span><span> <span>-</span> <span>FORK_USER=mor1 DEV_REMOTE=git://github.com/mirage/mirage-dev</span>\n</span><span> <span>OCAML_VERSION=4.02 MIRAGE_BACKEND=unix MIRAGE_NET=direct</span>\n</span><span> 
<span>-</span> <span>FORK_USER=mor1 DEV_REMOTE=git://github.com/mirage/mirage-dev</span>\n</span><span> <span>UPDATE_GCC_BINUTILS=1</span>\n</span><span> <span>OCAML_VERSION=4.02 MIRAGE_BACKEND=xen</span>\n</span><span> <span>MIRAGE_ADDR="46.43.42.137" MIRAGE_GWS="46.43.42.129" MIRAGE_MASK="255.255.255.128"</span>\n</span><span> <span>XENIMG=mortio MIRDIR=_mirage DEPLOY=1</span>\n</span></code></pre>\n<p>This uses the local <a href=\"https://github.com/mor1/mor1.github.io/blob/master/.travis.sh\"><code>.travis.sh</code></a> script to build the three versions\nof the site, using the <a href=\"https://github.com/mirage/mirage-dev\">Mirage development OPAM repository</a> so as to\npick up the latest versions of all the various packages, and updating the Travis\n<code>gcc</code> and <code>binutils</code> to ensure the stubs for a couple of packages (notably\n<code>mirage-entropy-xen</code>) build.</p>\n<p>Next stop: adding TLS and DNS support\u2026</p>",+"content": "<p>For a little while I\u2019ve had <a href=\"http://github.com/mor1/mor1.github.io\">this site</a> running as a <a href=\"http://openmirage.org/\">MirageOS</a>\nunikernel, shadowing the main site hosted on <a href=\"http://github.com/\">GitHub</a>. 
I\u2019ve finally decided to\nmake the switch, as part of moving over to take advantage of Mirage\u2019s DNS and\nTLS libraries.</p>\n<p>Following the usual pattern, as previously explained by <a href=\"http://amirchaudhry.com/from-jekyll-to-unikernel-in-fifty-lines/\">Amir</a>, <a href=\"http://www.somerandomidiot.com/blog/2014/08/19/i-am-unikernel/\">Mindy</a> and\nothers, the process is:</p>\n<ul>\n<li>Construct a static <a href=\"http://jekyllrb.com\">Jekyll</a> site.</li>\n<li>Write a <a href=\"http://travis-ci.com/\">Travis</a> YAML file to cause <a href=\"http://travis-ci.com/\">Travis</a> to build the unikernel image\nand commit it back to the deployment repository.</li>\n<li>Write a Git <code>post-merge</code> hook for the deployment repository, so that the\nlatest unikernel is automatically booted when a merge is detected, i.e., there\nis a new unikernel image.</li>\n<li>Write a <code>cron</code> job that periodically polls the deployment repository, pulling\nany changes.</li>\n</ul>\n<p>Building a <a href=\"http://jekyllrb.com\">Jekyll</a> site is well-documented \u2013 I did find that I had to tweak\nmy <a href=\"https://github.com/mor1/mor1.github.io/blob/master/_config.yml\"><code>_config.yml</code></a> so as to make sure my local toolchain matched the\none used by Github, ensuring consistency between versions of the site. 
For\nconvenience:</p>\n<pre><code><span><span><span>make</span></span><span> site</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/moving-onto-mirage/#bringing-up-the-network\">Bringing up the network</a></h2>\n<p>The <a href=\"https://github.com/mor1/mor1.github.io/blob/master/.travis.yml\"><code>.travis.yml</code></a> file then specifies the three main targets for\nthe CI test build to carry out: Unix with a standard sockets backend\n(<code>MIRAGE_BACKEND=unix</code>, <code>MIRAGE_NET=socket</code>) and with the Mirage network stack\n(<code>MIRAGE_BACKEND=unix</code>, <code>MIRAGE_NET=direct</code>), and with the Xen backend\n(<code>MIRAGE_BACKEND=xen</code>). For the latter case, we must also specify the static IP\nconfiguration to be used (<code>MIRAGE_ADDR</code>, <code>..._GWS</code>, and <code>..._MASK</code>). The\n<a href=\"https://github.com/mor1/mor1.github.io/blob/master/.travis.sh\"><code>.travis.sh</code></a> script then calls the standard skeleton\n<a href=\"https://github.com/ocaml/ocaml-travisci-skeleton/blob/master/.travis-mirage.sh\"><code>.travis-mirage.sh</code></a> script after first building the site\ncontent using Jekyll.</p>\n<p>This tests the three basic combinations of network backend for a Mirage\nappliance:</p>\n<pre><code><span><span><span>$</span></span><span> make configure.socket build</span>\n</span></code></pre>\n<ul>\n<li><strong>UNIX/socket</strong> requires no configuration. The network device is configured\nwith the loopback address, <code>127.0.0.1</code>. Appliances can be run without\nrequiring <code>root</code> privileges, assuming they only bind to non-privileged ports.</li>\n</ul>\n<pre><code><span><span><span>$</span></span><span> make configure.direct build</span>\n</span></code></pre>\n<ul>\n<li><strong>UNIX/direct/dhcp</strong> requires no configuration if a DHCP server is running and\ncan respond. 
The appliance must be run with <code>root</code> privileges to use the new\nnetwork bridging capability of OSX 10.10, whereupon the DHCP client in the\nappliance follows the usual protocol.</li>\n</ul>\n<pre><code><span><span><span>$</span></span><span> make configure.xen build <span>\\\n</span></span></span><span><span> ADDR=<span><span>"</span>46.43.42.137<span>"</span></span> GWS=<span><span>"</span>46.43.42.129<span>"</span></span> MASK=<span><span>"</span>255.255.255.128<span>"</span></span></span>\n</span></code></pre>\n<ul>\n<li><strong>Xen</strong> uses the Mirage network stack and expects static configuration of the\nnetwork device.</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/moving-onto-mirage/#using-travis-ci\">Using Travis CI</a></h2>\n<p>Of course, all that is for local development \u2013 for the live site, this is\nactually all wrapped up using <a href=\"http://travis-ci.com/\">Travis CI</a>. Due to a small pull request\nwaiting on the <a href=\"https://github.com/ocaml/ocaml-travisci-skeleton\">OCaml Travis CI skeleton scripts</a> and a few\nMirage releases currently being readied, this looks a little more complex than\nit needs to (the <code>FORK_USER</code> and <code>DEV_REMOTE</code> variables shouldn\u2019t need to be\nspecified in the long run) but anyway:</p>\n<pre><code><span><span><span>language</span></span><span>:</span> <span>c</span>\n</span><span><span><span>script</span></span><span>:</span> <span>bash -ex .travis.sh</span>\n</span><span><span><span>env</span></span><span>:</span>\n</span><span> <span><span>matrix</span></span><span>:</span>\n</span><span> <span>-</span> <span>FORK_USER=mor1 DEV_REMOTE=git://github.com/mirage/mirage-dev</span>\n</span><span> <span>OCAML_VERSION=4.02 MIRAGE_BACKEND=unix MIRAGE_NET=socket</span>\n</span><span> <span>-</span> <span>FORK_USER=mor1 DEV_REMOTE=git://github.com/mirage/mirage-dev</span>\n</span><span> <span>OCAML_VERSION=4.02 MIRAGE_BACKEND=unix MIRAGE_NET=direct</span>\n</span><span> 
<span>-</span> <span>FORK_USER=mor1 DEV_REMOTE=git://github.com/mirage/mirage-dev</span>\n</span><span> <span>UPDATE_GCC_BINUTILS=1</span>\n</span><span> <span>OCAML_VERSION=4.02 MIRAGE_BACKEND=xen</span>\n</span><span> <span>MIRAGE_ADDR="46.43.42.137" MIRAGE_GWS="46.43.42.129" MIRAGE_MASK="255.255.255.128"</span>\n</span><span> <span>XENIMG=mortio MIRDIR=_mirage DEPLOY=1</span>\n</span></code></pre>\n<p>This uses the local <a href=\"https://github.com/mor1/mor1.github.io/blob/master/.travis.sh\"><code>.travis.sh</code></a> script to build the three versions\nof the site, using the <a href=\"https://github.com/mirage/mirage-dev\">Mirage development OPAM repository</a> so as to\npick up the latest versions of all the various packages, and updating the Travis\n<code>gcc</code> and <code>binutils</code> to ensure the stubs for a couple of packages (notably\n<code>mirage-entropy-xen</code>) build.</p>\n<p>Next stop: adding TLS and DNS support\u2026</p>",
+18
mort/blog_nexus-4-rescue_.json
···+"summary": "<p>A little while ago, before I\u2019d done the smart thing and got myself a case for my\nNexus 4, I dropped it a couple of inches onto a hard surface at the wrong angle.\nThe screen promptly shattered \u2013 and this was bad because without the touch\nscreen, I couldn\u2019t interact with it, I had some photos on it from son#1 birthday\nparty that hadn\u2019t been copied off, and I hadn\u2019t got round to enabling USB access\nto the filesystem or any of the debug/developer options.</p>\n<p>So what to do? I <em>really</em> didn\u2019t want to lose those photos. A couple of hours\nsearching the Interwebs and a little bit of experimentation later, and I managed\nit. Basically, download and apply the clockwork mod bootloader, and this turns\non the developer options that allow access to the filesystem via the Android SDK\ntools. To find out the details, read on\u2026</p>\n<p>First, download the recovery image:</p>\n<pre><code><span><span><span>$</span></span><span> wget http://download2.clockworkmod.com/recoveries/recovery-clockwork-touch-6.0.3.1-mako.img</span>\n</span></code></pre>\n<p>Next, install the Android SDK \u2013 I\u2019m on OSX using <a href=\"https://brew.sh\">Homebrew</a> so I do:</p>\n<pre><code><span><span><span>$</span></span><span> brew install android-sdk</span>\n</span></code></pre>\n<p>Now, power off and disconnect the phone! Then boot it into fastboot mode by\nholding down <code>power</code> and <code>volume-down</code>. Once it boots you should be in the\nfastboot list \u2013 the volume keys will cycle you through the list. 
You should now\nalso be able to see the device once connected to USB, and you can then OEM\nunlock it:</p>\n<pre><code><span><span><span>$</span></span><span> sudo fastboot devices<span><span> -</span>l</span></span>\n</span><span><span><span>04f02d4bdcd3b6e2</span></span><span> fastboot usb:FD123000</span>\n</span><span><span><span>$</span></span><span> sudo fastboot oem unlock</span>\n</span><span><span><span>...</span></span>\n</span><span><span><span>OKAY</span></span><span> <span>[</span> 17.937s<span>]</span></span>\n</span><span><span><span>finished.</span></span><span> total time: 17.937s</span>\n</span></code></pre>\n<p>Having unlocked it, you can now install the clockwork recovery bootloader you\ndownloaded (assuming it\u2019s in the local directory):</p>\n<pre><code><span><span><span>$</span></span><span> sudo fastboot flash recovery recovery-clockwork-touch-6.0.3.1-mako.img</span>\n</span><span><span><span>sending</span></span><span> <span><span>'</span>recovery<span>'</span></span> (7560 KB</span><span></span>)<span><span>...</span></span>\n</span><span><span><span>OKAY</span></span><span> <span>[</span> 0.526s<span>]</span></span>\n</span><span><span><span>writing</span></span><span> <span><span>'</span>recovery<span>'</span></span>...</span>\n</span><span><span><span>OKAY</span></span><span> <span>[</span> 0.448s<span>]</span></span>\n</span><span><span><span>finished.</span></span><span> total time: 0.975s</span>\n</span></code></pre>\n<p>When you now use the volume keys to cycle through the list, you should now see\n<strong>recovery mode</strong> as an option \u2013 select it, and you should be able to see the\ndevice listed in the usual way via <code>adb</code>:</p>\n<pre><code><span><span><span>:</span></span><span> mort@greyjay:phone$</span><span>;</span> <span><span>sudo</span></span><span> adb devices<span><span> -</span>l</span></span>\n</span><span><span><span>List</span></span><span> of devices 
attached</span>\n</span><span><span><span>04f02d4bdcd3b6e2</span></span><span> recovery usb:FD123000 product:occam model:Nexus_4 device:mako</span>\n</span></code></pre>\n<p>Finally, pull all the contents off the sdcard:</p>\n<pre><code><span><span><span>$</span></span><span> adb pull /sdcard/0 ./sdcard/</span>\n</span><span><span><span>$</span></span><span> adb pull /data/ ./data/</span>\n</span><span><span><span>$</span></span><span> adb pull /system/ ./system/</span>\n</span></code></pre>\n<p>\u2026and that\u2019s it \u2013 you should now have a local copy of everything off the\nphone, and you can send it away for repair (or whatever you feel like\notherwise), possibly while sobbing quietly.</p>",+"content": "<p>A little while ago, before I\u2019d done the smart thing and got myself a case for my\nNexus 4, I dropped it a couple of inches onto a hard surface at the wrong angle.\nThe screen promptly shattered \u2013 and this was bad because without the touch\nscreen, I couldn\u2019t interact with it, I had some photos on it from son#1 birthday\nparty that hadn\u2019t been copied off, and I hadn\u2019t got round to enabling USB access\nto the filesystem or any of the debug/developer options.</p>\n<p>So what to do? I <em>really</em> didn\u2019t want to lose those photos. A couple of hours\nsearching the Interwebs and a little bit of experimentation later, and I managed\nit. Basically, download and apply the clockwork mod bootloader, and this turns\non the developer options that allow access to the filesystem via the Android SDK\ntools. 
To find out the details, read on\u2026</p>\n<p>First, download the recovery image:</p>\n<pre><code><span><span><span>$</span></span><span> wget http://download2.clockworkmod.com/recoveries/recovery-clockwork-touch-6.0.3.1-mako.img</span>\n</span></code></pre>\n<p>Next, install the Android SDK \u2013 I\u2019m on OSX using <a href=\"https://brew.sh\">Homebrew</a> so I do:</p>\n<pre><code><span><span><span>$</span></span><span> brew install android-sdk</span>\n</span></code></pre>\n<p>Now, power off and disconnect the phone! Then boot it into fastboot mode by\nholding down <code>power</code> and <code>volume-down</code>. Once it boots you should be in the\nfastboot list \u2013 the volume keys will cycle you through the list. You should now\nalso be able to see the device once connected to USB, and you can then OEM\nunlock it:</p>\n<pre><code><span><span><span>$</span></span><span> sudo fastboot devices<span><span> -</span>l</span></span>\n</span><span><span><span>04f02d4bdcd3b6e2</span></span><span> fastboot usb:FD123000</span>\n</span><span><span><span>$</span></span><span> sudo fastboot oem unlock</span>\n</span><span><span><span>...</span></span>\n</span><span><span><span>OKAY</span></span><span> <span>[</span> 17.937s<span>]</span></span>\n</span><span><span><span>finished.</span></span><span> total time: 17.937s</span>\n</span></code></pre>\n<p>Having unlocked it, you can now install the clockwork recovery bootloader you\ndownloaded (assuming it\u2019s in the local directory):</p>\n<pre><code><span><span><span>$</span></span><span> sudo fastboot flash recovery recovery-clockwork-touch-6.0.3.1-mako.img</span>\n</span><span><span><span>sending</span></span><span> <span><span>'</span>recovery<span>'</span></span> (7560 KB</span><span></span>)<span><span>...</span></span>\n</span><span><span><span>OKAY</span></span><span> <span>[</span> 0.526s<span>]</span></span>\n</span><span><span><span>writing</span></span><span> 
<span><span>'</span>recovery<span>'</span></span>...</span>\n</span><span><span><span>OKAY</span></span><span> <span>[</span> 0.448s<span>]</span></span>\n</span><span><span><span>finished.</span></span><span> total time: 0.975s</span>\n</span></code></pre>\n<p>When you now use the volume keys to cycle through the list, you should now see\n<strong>recovery mode</strong> as an option \u2013 select it, and you should be able to see the\ndevice listed in the usual way via <code>adb</code>:</p>\n<pre><code><span><span><span>:</span></span><span> mort@greyjay:phone$</span><span>;</span> <span><span>sudo</span></span><span> adb devices<span><span> -</span>l</span></span>\n</span><span><span><span>List</span></span><span> of devices attached</span>\n</span><span><span><span>04f02d4bdcd3b6e2</span></span><span> recovery usb:FD123000 product:occam model:Nexus_4 device:mako</span>\n</span></code></pre>\n<p>Finally, pull all the contents off the sdcard:</p>\n<pre><code><span><span><span>$</span></span><span> adb pull /sdcard/0 ./sdcard/</span>\n</span><span><span><span>$</span></span><span> adb pull /data/ ./data/</span>\n</span><span><span><span>$</span></span><span> adb pull /system/ ./system/</span>\n</span></code></pre>\n<p>\u2026and that\u2019s it \u2013 you should now have a local copy of everything off the\nphone, and you can send it away for repair (or whatever you feel like\notherwise), possibly while sobbing quietly.</p>",
+18
mort/blog_nixos-channels_.json
···+"summary": "<p>I don\u2019t pretend to understand <a href=\"https://nixos.org/\">NixOS</a> configuration fully\nyet, what with Flakes and channels and so forth. But I did find the following\nuseful to set up channels consistently so that I could have a single config that\nused both.</p>\n<pre><code><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>list <span><span>#</span></span><span> to list known channels</span><span>\n</span></span></span></span><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>remove</span> nixos <span><span>#</span></span><span> to remove a channel</span><span>\n</span></span></span><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>add</span> https://nixos.org/channels/nixos-unstable nixos <span><span>#</span></span><span> to go bleeding edge</span><span>\n</span></span></span></code></pre>\n<p>\u2026ultimately resulting in:</p>\n<pre><code><span><span><span>$</span></span><span> nix-channel<span><span> --</span>list <span><span>#</span></span><span> channels are stored per-user, and $(whoami) != root</span><span>\n</span></span></span></span><span><span><span>$</span></span><span> sudo nix-channel<span><span> --</span>list</span></span>\n</span><span><span><span>nixos</span></span><span> https://nixos.org/channels/nixos-unstable</span>\n</span><span><span><span>nixpkgs</span></span><span> https://nixos.org/channels/nixos-unstable</span>\n</span></code></pre>\n<p>Upgrading to the latest release is then something like:</p>\n<pre><code><span><span><span>pushd</span></span><span> <span><span>~</span></span>/rc-files/nixos/</span>\n</span><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>update</span></span>\n</span><span><span><span>nix</span></span><span> flake update</span>\n</span><span><span><span>sudo</span></span><span> nixos-rebuild switch<span><span> 
--</span>upgrade-all</span></span>\n</span><span><span><span>popd</span></span>\n</span></code></pre>\n<p>\u2026and finally, garbage collecting old versions once you\u2019re satisfied the new\none works:</p>\n<pre><code><span><span><span>sudo</span></span><span> nix-collect-garbage<span><span> -</span>d</span></span>\n</span><span><span><span>nix-collect-garbage</span></span><span><span><span> -</span>d</span></span>\n</span><span><span><span>sudo</span></span><span> nix-store<span><span> --</span>gc</span></span>\n</span><span><span><span>nix-store</span></span><span><span><span> --</span>gc</span> </span>\n</span></code></pre>\n<p>And yes, some of the incantations above might be a little cargo-cultish and not\nstrictly necessary. But at various points they\u2019ve seemed necessary to me, and\nnow they\u2019re in my shell history, they\u2019re what I got.</p>",+"content": "<p>I don\u2019t pretend to understand <a href=\"https://nixos.org/\">NixOS</a> configuration fully\nyet, what with Flakes and channels and so forth. 
But I did find the following\nuseful to set up channels consistently so that I could have a single config that\nused both.</p>\n<pre><code><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>list <span><span>#</span></span><span> to list known channels</span><span>\n</span></span></span></span><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>remove</span> nixos <span><span>#</span></span><span> to remove a channel</span><span>\n</span></span></span><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>add</span> https://nixos.org/channels/nixos-unstable nixos <span><span>#</span></span><span> to go bleeding edge</span><span>\n</span></span></span></code></pre>\n<p>\u2026ultimately resulting in:</p>\n<pre><code><span><span><span>$</span></span><span> nix-channel<span><span> --</span>list <span><span>#</span></span><span> channels are stored per-user, and $(whoami) != root</span><span>\n</span></span></span></span><span><span><span>$</span></span><span> sudo nix-channel<span><span> --</span>list</span></span>\n</span><span><span><span>nixos</span></span><span> https://nixos.org/channels/nixos-unstable</span>\n</span><span><span><span>nixpkgs</span></span><span> https://nixos.org/channels/nixos-unstable</span>\n</span></code></pre>\n<p>Upgrading to the latest release is then something like:</p>\n<pre><code><span><span><span>pushd</span></span><span> <span><span>~</span></span>/rc-files/nixos/</span>\n</span><span><span><span>sudo</span></span><span> nix-channel<span><span> --</span>update</span></span>\n</span><span><span><span>nix</span></span><span> flake update</span>\n</span><span><span><span>sudo</span></span><span> nixos-rebuild switch<span><span> --</span>upgrade-all</span></span>\n</span><span><span><span>popd</span></span>\n</span></code></pre>\n<p>\u2026and finally, garbage collecting old versions once you\u2019re satisfied the new\none 
works:</p>\n<pre><code><span><span><span>sudo</span></span><span> nix-collect-garbage<span><span> -</span>d</span></span>\n</span><span><span><span>nix-collect-garbage</span></span><span><span><span> -</span>d</span></span>\n</span><span><span><span>sudo</span></span><span> nix-store<span><span> --</span>gc</span></span>\n</span><span><span><span>nix-store</span></span><span><span><span> --</span>gc</span> </span>\n</span></code></pre>\n<p>And yes, some of the incantations above might be a little cargo-cultish and not\nstrictly necessary. But at various points they\u2019ve seemed necessary to me, and\nnow they\u2019re in my shell history, they\u2019re what I got.</p>",
+18
mort/blog_nixos-onedrive_.json
···+"summary": "<p>Starting by reading instructions at:</p>\n<ul>\n<li><a href=\"https://github.com/abraunegg/onedrive/blob/master/docs/sharepoint-libraries.md\">https://github.com/abraunegg/onedrive/blob/master/docs/sharepoint-libraries.md</a></li>\n<li><a href=\"https://github.com/abraunegg/onedrive/blob/master/docs/business-shared-items.md\">https://github.com/abraunegg/onedrive/blob/master/docs/business-shared-items.md</a></li>\n<li><a href=\"https://github.com/NixOS/nixpkgs/pull/77734#issuecomment-575874225\">https://github.com/NixOS/nixpkgs/pull/77734#issuecomment-575874225</a></li>\n</ul>\n<p>FWIW I also use that package on a NixOS system (via a Nix package that can presumably be installed on other systems if you add nix as a package manager), <a href=\"https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/networking/onedrive.nix\">https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/networking/onedrive.nix</a>.</p>\n<p>I have successfully used it with the University and my startup\u2019s tenancies (both personal OneDrive and enterprise Sharepoint sites) simultaneously \u2013 this was fairly simple in the end but I recall it took a while to do the token dance to get the necessary tokens, and then to get all the runes right. 
In short it was something like:</p>\n<ol>\n<li>\n<p>Follow the instructions Robert pointed to sort out access to the various accounts, refresh tokens, and the like.</p>\n</li>\n<li>\n<p>Create file <code>~/.config/onedrive-launcher</code> comprising each subdirectory of <code>~/.config</code> that is a onedrive configuration directory corresponding to each account \u2013 I named these after the accounts in question so mine contained</p>\n</li>\n</ol>\n<pre><code><span>onedrive-rmm1002@cam.ac.uk\n</span><span>onedrive-mort@ikva.ai\n</span><span>sharepoint-mort@ikva.ai-iKVALimited\n</span></code></pre>\n<p>(I now prefix the last two lines with <code># </code> to comment them out as I don\u2019t need those synced any more.)</p>\n<p>The systemd service <code>onedrive-launcher.service</code> then uses the file to kick off a systemd <code>onedrive@...</code> service for each entry.</p>\n<ol>\n<li>Edit the <code>~/.config/ACCOUNT/config</code> files appropriately; the only changes I made were to</li>\n</ol>\n<pre><code><span><span><span># for my University account\n</span></span></span><span><span><span>sync_dir</span> <span>=</span></span> <span>"~/OneDrive/rmm1002@cam.ac.uk"</span>\n</span><span>\n</span><span><span><span># for my startup personal OneDrive\n</span></span></span><span><span><span>sync_dir</span> <span>=</span></span> <span>"~/OneDrive/mort@ikva.ai"</span>\n</span><span><span><span>sync_business_shared_folders</span> <span>=</span></span> <span>"true"</span>\n</span><span>\ufeff\n</span><span><span><span># for startup Sharepoint sites\n</span></span></span><span><span><span>sync_dir</span> <span>=</span></span> <span>"~/OneDrive/mort@ikva.ai-iKVA_Limited"</span>\n</span><span><span><span>drive_id</span> <span>=</span></span> <span>"..."</span><span> <span># rune found per instructions Robert pointed to I think\n</span></span></span></code></pre>",+"content": "<p>Starting by reading instructions at:</p>\n<ul>\n<li><a 
href=\"https://github.com/abraunegg/onedrive/blob/master/docs/sharepoint-libraries.md\">https://github.com/abraunegg/onedrive/blob/master/docs/sharepoint-libraries.md</a></li>\n<li><a href=\"https://github.com/abraunegg/onedrive/blob/master/docs/business-shared-items.md\">https://github.com/abraunegg/onedrive/blob/master/docs/business-shared-items.md</a></li>\n<li><a href=\"https://github.com/NixOS/nixpkgs/pull/77734#issuecomment-575874225\">https://github.com/NixOS/nixpkgs/pull/77734#issuecomment-575874225</a></li>\n</ul>\n<p>FWIW I also use that package on a NixOS system (via a Nix package that can presumably be installed on other systems if you add nix as a package manager), <a href=\"https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/networking/onedrive.nix\">https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/networking/onedrive.nix</a>.</p>\n<p>I have successfully used it with the University and my startup\u2019s tenancies (both personal OneDrive and enterprise Sharepoint sites) simultaneously \u2013 this was fairly simple in the end but I recall it took a while to do the token dance to get the necessary tokens, and then to get all the runes right. 
In short it was something like:</p>\n<ol>\n<li>\n<p>Follow the instructions Robert pointed to sort out access to the various accounts, refresh tokens, and the like.</p>\n</li>\n<li>\n<p>Create file <code>~/.config/onedrive-launcher</code> comprising each subdirectory of <code>~/.config</code> that is a onedrive configuration directory corresponding to each account \u2013 I named these after the accounts in question so mine contained</p>\n</li>\n</ol>\n<pre><code><span>onedrive-rmm1002@cam.ac.uk\n</span><span>onedrive-mort@ikva.ai\n</span><span>sharepoint-mort@ikva.ai-iKVALimited\n</span></code></pre>\n<p>(I now prefix the last two lines with <code># </code> to comment them out as I don\u2019t need those synced any more.)</p>\n<p>The systemd service <code>onedrive-launcher.service</code> then uses the file to kick off a systemd <code>onedrive@...</code> service for each entry.</p>\n<ol>\n<li>Edit the <code>~/.config/ACCOUNT/config</code> files appropriately; the only changes I made were to</li>\n</ol>\n<pre><code><span><span><span># for my University account\n</span></span></span><span><span><span>sync_dir</span> <span>=</span></span> <span>"~/OneDrive/rmm1002@cam.ac.uk"</span>\n</span><span>\n</span><span><span><span># for my startup personal OneDrive\n</span></span></span><span><span><span>sync_dir</span> <span>=</span></span> <span>"~/OneDrive/mort@ikva.ai"</span>\n</span><span><span><span>sync_business_shared_folders</span> <span>=</span></span> <span>"true"</span>\n</span><span>\ufeff\n</span><span><span><span># for startup Sharepoint sites\n</span></span></span><span><span><span>sync_dir</span> <span>=</span></span> <span>"~/OneDrive/mort@ikva.ai-iKVA_Limited"</span>\n</span><span><span><span>drive_id</span> <span>=</span></span> <span>"..."</span><span> <span># rune found per instructions Robert pointed to I think\n</span></span></span></code></pre>",
+18
mort/blog_nu-posix_.json
···+"summary": "<p>A slight delay to this post, but happily I now have no more lecturing to do\nuntil 2027<a href=\"https://mort.io/blog/nu-posix/#1\">1</a> :)</p>\n<p>I switched a year or two ago to using NixOS as my daily driver following about\n15 years as an increasingly irritated MacOS user. Shortly before I had become\ninterested in Rust as a systems programming language that seemed to marry\nseveral things I like about OCaml with several other desirable things from C and\nPython.</p>\n<p>I then more recently observed something that I thought was interesting: there\nseem to be a <strong>lot</strong> of recent replacements of what were once completely\nstandard and changeless POSIX utilities. I\u2019m thinking things like <code>grep</code>,\n<code>find</code>, <code>ls</code> and the like that I\u2019ve been using uninterrupted, other than the\noccasional quibble over whether it was the original version or the GNU version,\nfor about 30 years. Indeed, I have already raved (slightly) about\n<a href=\"https://just.systems/\"><code>just</code></a> as a\n<a href=\"https://www.gnu.org/software/make/manual/make.html\"><code>make</code></a> replacement and its\nuse with <a href=\"https://mort.io/blog/just-ocaml/\">OCaml</a> and <a href=\"https://mort.io/blog/just-latex/\">LaTeX</a>.</p>\n<p>NixOS\u2019 declarative configuration meant that I could actually see the list\ngrowing, all in one place \u2013 I suspect on other systems I wouldn\u2019t have noticed\nin quite the same way because it would\u2019ve been a much more incremental and\ndiffuse process of change without a clear record of the choices made.</p>\n<p>I thus find in my\n<a href=\"https://github.com/mor1/rc-files/blob/main/nixos/modules/home-manager/cli.nix#L44-L65\"><code>cli.nix</code></a>\nconfig that describes the CLI tools I expect, to have the following collection:</p>\n<pre><code><span> <span>nu_posix</span> <span>=</span> <span>[</span>\n</span><span> <span>bat</span> <span># better 
cat</span>\n</span><span> <span>bottom</span> <span># btm ~ better top, htop, etc</span>\n</span><span> <span>broot</span> <span># interactive directory navigation</span>\n</span><span> <span>chafa</span> <span># terminal graphics viewer</span>\n</span><span> <span>ctpv</span> <span># terminal file previewer</span>\n</span><span> <span>cyme</span> <span># better `lsusb`</span>\n</span><span> <span>delta</span> <span># better syntax highlighting diff</span>\n</span><span> <span>dua</span> <span># disk usage, interactively</span>\n</span><span> <span>eza</span> <span># improved `ls`</span>\n</span><span> <span>fd</span> <span># `find` replacement</span>\n</span><span> <span>fend</span> <span># better CLI calculator</span>\n</span><span> <span>hexyl</span> <span># hex pretty printer</span>\n</span><span> <span>htop</span> <span># graphical top</span>\n</span><span> <span>iotop</span> <span># io top</span>\n</span><span> <span>jujutsu</span> <span># better git</span>\n</span><span> <span>just</span> <span># updated gnumake replacement</span>\n</span><span> <span>procs</span> <span># better ps</span>\n</span><span> <span>ripgrep</span> <span># rg ~ `grep` replacement</span>\n</span><span> <span>sudo-rs</span> <span># memory-safe `sudo`</span>\n</span><span> <span>uutils-coreutils-noprefix</span> <span># replaces GNU `coreutils`</span>\n</span><span> <span>viddy</span> <span># better watch</span>\n</span><span> <span>]</span><span>;</span>\n</span></code></pre>\n<p>I think that most, if not all, of these are written in Rust: that particular\nlanguage community seems to have a real enthusiasm for re-implementing\nlong-standing tools but better, and I have to say I really appreciate it! When I\nsay \u201cbetter\u201d I\u2019m not particularly thinking of esoteric language features or\ndevelopment ideologies either. I mean better in two very particular senses:</p>\n<ol>\n<li>\n<p><strong>Usability</strong>. 
Many of the older tools simply did not have great user\ninterfaces and, when they were ok, they were not built using modern tooling.\nAs a result getting documentation was somewhere between good and great if\nthere was a decent <code>man</code>-page, with a range of potential switches for more\nshort form help or for cases where the <code>man</code>-page was not installed \u2013\nwhether <code>-h</code>, <code>--help</code>, <code>-help</code>, <code>-?</code>, <code>help</code>, or something else. The\nshort-form help would, of course, be formatted in arbitrary ways.</p>\n<p>The modern Rust-y replacements tend to use\n<a href=\"https://docs.rs/clap/latest/clap/\"><code>clap</code></a> as a reasonably standard\ncommand-line parser. As a result, they are remarkably consistent in usage and\nformat, typically producing something that looks a lot like <code>man</code>-page output\nin response to their <code>-h|--help</code> switch. In a world where <code>man</code>-pages are\noften an afterthought or, even worse, replaced by <code>info</code> documentation, I\nfind this invaluable. They are also generally inclined to make greater use of\nmodern terminal environments \u2013 <a href=\"https://github.com/eza-community/eza\"><code>eza</code></a>\nas a replacement for\n<a href=\"https://www.gnu.org/software/coreutils/manual/html_node/ls-invocation.html\"><code>ls</code></a>\nis a good example of this.</p>\n</li>\n<li>\n<p><strong>Performance</strong>. Old tools were originally built for old computers in old\nlanguages (largely C) and, whether this is language ideology or just the\npracticalities of engineering long-standing widely-used codebases, tended not\nto be radically updated.</p>\n<p>Rust re-implementations, on the other hand, are from scratch \u2013 and Rust\u2019s\nmemory model appears to make it relatively easy for them to be made\nmulti-threaded. On modern hardware this seems to make them startlingly higher\nperformance than the alternatives. 
Tools I particularly appreciate for this\ninclude <a href=\"https://github.com/sharkdp/fd\"><code>fd</code></a> replacing\n<a href=\"https://www.gnu.org/software/findutils/\"><code>find</code></a> and <a href=\"https://github.com/BurntSushi/ripgrep\">ripgrep,\n<code>rg</code>,</a> replacing\n<a href=\"https://www.gnu.org/software/grep/\"><code>grep</code></a>.</p>\n</li>\n</ol>\n<p>Perhaps the most immediate example of the benefits of this that I\u2019ve experienced\nis <a href=\"https://github.com/Byron/dua-cli\"><code>dua</code></a> via <code>dua i</code>. Traditionally, when\ntrying to clean up an uncomfortably full hard disk I would\u2019ve ended up using\nsome manual iterative application of either <code>du -hS *</code> or possibly something\nlike <code>find ... | xargs du</code>. Or possibly written a Python script to do it for me.\nAnd it would\u2019ve taken <em>O</em>(hours) for me to find where the space was being used\nand to do something about it. And I would\u2019ve found it tedious and deeply\nirritating.<a href=\"https://mort.io/blog/nu-posix/#2\">2</a></p>\n<p>In contrast, <code>dua i</code> gives me a TUI interface to navigate the filesystem from\nwherever I run it, the ability to cumulatively mark files and directories for\ntrashing or immediate deletion, with subdirectory space summaries \u2013 and does so\nacross ~850GB / 3 million files in about 10-15 seconds without using any form of\ncaching, database, or other such thing. As far as I can tell, simply by being\nefficient and multi-threaded.</p>\n<p>If this is the future, sign me up. (At least for the bits like this that are\ngood.)</p>\n<div>1\n<p>\u2026assuming I get back the same courses after my sabbatical that is.</p>\n</div>\n<div>2\n<p>I\u2019m easily irritated. 
What can I say.</p>\n</div>",+"content": "<p>A slight delay to this post, but happily I now have no more lecturing to do\nuntil 2027<a href=\"https://mort.io/blog/nu-posix/#1\">1</a> :)</p>\n<p>I switched a year or two ago to using NixOS as my daily driver following about\n15 years as an increasingly irritated MacOS user. Shortly before I had become\ninterested in Rust as a systems programming language that seemed to marry\nseveral things I like about OCaml with several other desirable things from C and\nPython.</p>\n<p>I then more recently observed something that I thought was interesting: there\nseem to be a <strong>lot</strong> of recent replacements of what were once completely\nstandard and changeless POSIX utilities. I\u2019m thinking things like <code>grep</code>,\n<code>find</code>, <code>ls</code> and the like that I\u2019ve been using uninterrupted, other than the\noccasional quibble over whether it was the original version or the GNU version,\nfor about 30 years. Indeed, I have already raved (slightly) about\n<a href=\"https://just.systems/\"><code>just</code></a> as a\n<a href=\"https://www.gnu.org/software/make/manual/make.html\"><code>make</code></a> replacement and its\nuse with <a href=\"https://mort.io/blog/just-ocaml/\">OCaml</a> and <a href=\"https://mort.io/blog/just-latex/\">LaTeX</a>.</p>\n<p>NixOS\u2019 declarative configuration meant that I could actually see the list\ngrowing, all in one place \u2013 I suspect on other systems I wouldn\u2019t have noticed\nin quite the same way because it would\u2019ve been a much more incremental and\ndiffuse process of change without a clear record of the choices made.</p>\n<p>I thus find in my\n<a href=\"https://github.com/mor1/rc-files/blob/main/nixos/modules/home-manager/cli.nix#L44-L65\"><code>cli.nix</code></a>\nconfig that describes the CLI tools I expect, to have the following collection:</p>\n<pre><code><span> <span>nu_posix</span> <span>=</span> <span>[</span>\n</span><span> <span>bat</span> 
<span># better cat</span>\n</span><span> <span>bottom</span> <span># btm ~ better top, htop, etc</span>\n</span><span> <span>broot</span> <span># interactive directory navigation</span>\n</span><span> <span>chafa</span> <span># terminal graphics viewer</span>\n</span><span> <span>ctpv</span> <span># terminal file previewer</span>\n</span><span> <span>cyme</span> <span># better `lsusb`</span>\n</span><span> <span>delta</span> <span># better syntax highlighting diff</span>\n</span><span> <span>dua</span> <span># disk usage, interactively</span>\n</span><span> <span>eza</span> <span># improved `ls`</span>\n</span><span> <span>fd</span> <span># `find` replacement</span>\n</span><span> <span>fend</span> <span># better CLI calculator</span>\n</span><span> <span>hexyl</span> <span># hex pretty printer</span>\n</span><span> <span>htop</span> <span># graphical top</span>\n</span><span> <span>iotop</span> <span># io top</span>\n</span><span> <span>jujutsu</span> <span># better git</span>\n</span><span> <span>just</span> <span># updated gnumake replacement</span>\n</span><span> <span>procs</span> <span># better ps</span>\n</span><span> <span>ripgrep</span> <span># rg ~ `grep` replacement</span>\n</span><span> <span>sudo-rs</span> <span># memory-safe `sudo`</span>\n</span><span> <span>uutils-coreutils-noprefix</span> <span># replaces GNU `coreutils`</span>\n</span><span> <span>viddy</span> <span># better watch</span>\n</span><span> <span>]</span><span>;</span>\n</span></code></pre>\n<p>I think that most, if not all, of these are written in Rust: that particular\nlanguage community seems to have a real enthusiasm for re-implementing\nlong-standing tools but better, and I have to say I really appreciate it! When I\nsay \u201cbetter\u201d I\u2019m not particularly thinking of esoteric language features or\ndevelopment ideologies either. I mean better in two very particular senses:</p>\n<ol>\n<li>\n<p><strong>Usability</strong>. 
Many of the older tools simply did not have great user\ninterfaces and, when they were ok, they were not built using modern tooling.\nAs a result getting documentation was somewhere between good and great if\nthere was a decent <code>man</code>-page, with a range of potential switches for more\nshort form help or for cases where the <code>man</code>-page was not installed \u2013\nwhether <code>-h</code>, <code>--help</code>, <code>-help</code>, <code>-?</code>, <code>help</code>, or something else. The\nshort-form help would, of course, be formatted in arbitrary ways.</p>\n<p>The modern Rust-y replacements tend to use\n<a href=\"https://docs.rs/clap/latest/clap/\"><code>clap</code></a> as a reasonably standard\ncommand-line parser. As a result, they are remarkably consistent in usage and\nformat, typically producing something that looks a lot like <code>man</code>-page output\nin response to their <code>-h|--help</code> switch. In a world where <code>man</code>-pages are\noften an afterthought or, even worse, replaced by <code>info</code> documentation, I\nfind this invaluable. They are also generally inclined to make greater use of\nmodern terminal environments \u2013 <a href=\"https://github.com/eza-community/eza\"><code>eza</code></a>\nas a replacement for\n<a href=\"https://www.gnu.org/software/coreutils/manual/html_node/ls-invocation.html\"><code>ls</code></a>\nis a good example of this.</p>\n</li>\n<li>\n<p><strong>Performance</strong>. Old tools were originally built for old computers in old\nlanguages (largely C) and, whether this is language ideology or just the\npracticalities of engineering long-standing widely-used codebases, tended not\nto be radically updated.</p>\n<p>Rust re-implementations, on the other hand, are from scratch \u2013 and Rust\u2019s\nmemory model appears to make it relatively easy for them to be made\nmulti-threaded. On modern hardware this seems to make them startlingly higher\nperformance than the alternatives. 
Tools I particularly appreciate for this\ninclude <a href=\"https://github.com/sharkdp/fd\"><code>fd</code></a> replacing\n<a href=\"https://www.gnu.org/software/findutils/\"><code>find</code></a> and <a href=\"https://github.com/BurntSushi/ripgrep\">ripgrep,\n<code>rg</code>,</a> replacing\n<a href=\"https://www.gnu.org/software/grep/\"><code>grep</code></a>.</p>\n</li>\n</ol>\n<p>Perhaps the most immediate example of the benefits of this that I\u2019ve experienced\nis <a href=\"https://github.com/Byron/dua-cli\"><code>dua</code></a> via <code>dua i</code>. Traditionally, when\ntrying to clean up an uncomfortably full hard disk I would\u2019ve ended up using\nsome manual iterative application of either <code>du -hS *</code> or possibly something\nlike <code>find ... | xargs du</code>. Or possibly written a Python script to do it for me.\nAnd it would\u2019ve taken <em>O</em>(hours) for me to find where the space was being used\nand to do something about it. And I would\u2019ve found it tedious and deeply\nirritating.<a href=\"https://mort.io/blog/nu-posix/#2\">2</a></p>\n<p>In contrast, <code>dua i</code> gives me a TUI interface to navigate the filesystem from\nwherever I run it, the ability to cumulatively mark files and directories for\ntrashing or immediate deletion, with subdirectory space summaries \u2013 and does so\nacross ~850GB / 3 million files in about 10-15 seconds without using any form of\ncaching, database, or other such thing. As far as I can tell, simply by being\nefficient and multi-threaded.</p>\n<p>If this is the future, sign me up. (At least for the bits like this that are\ngood.)</p>\n<div>1\n<p>\u2026assuming I get back the same courses after my sabbatical that is.</p>\n</div>\n<div>2\n<p>I\u2019m easily irritated. What can I say.</p>\n</div>",
+18
mort/blog_ocaml-operators_.json
···+"summary": "<p>An <a href=\"https://www.brendanlong.com/ocaml-operator-cheatsheet.html\">OCaml operator\ncheatsheet</a> for\n<a href=\"https://ocaml.org/\">OCaml</a> operators that I have found useful.</p>",+"content": "<p>An <a href=\"https://www.brendanlong.com/ocaml-operator-cheatsheet.html\">OCaml operator\ncheatsheet</a> for\n<a href=\"https://ocaml.org/\">OCaml</a> operators that I have found useful.</p>",
+18
mort/blog_part-ii-projects_.json
···+"summary": "<p>Undergraduate final-year (\u201cPart II\u201d) project supervision goes in fits and\nstarts. After a couple of years of having almost no interest, this year I\u2019ve had\nseveral enquiries and it seems I might end supervising 3\u20134 projects. So\nherewith a record of the things I\u2019ve found myself repeating!</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#project-structure\">Project structure</a></h2>\n<p>The key thing for the structure of the project is to make sure that there is a\ncore piece that is (essentially) guaranteed to be deliverable. This is the piece\nthat you know you can do, and once done and written up, you know you can get an\nadequate (if not great) mark. Ensuring this takes the risk out of the project.</p>\n<p>On top of this core piece, it\u2019s then usually sensible to build \u201ca few\u201d (2? 3?\n4?) extensions which will make the project spicy if done well. You may wish to\nphrase these extensions as \u201cfor example, extensions might include\u2026\u201d or words\nto that effect, to give some wiggle room in the final dissertation. Getting\nthese done successfully is what should put you in line for a very good mark\nrather than a simply adequate mark.</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#project-framing\">Project framing</a></h2>\n<p>It is also very helpful, particularly if you are aiming for a high mark, to try\nto frame the project in terms of <strong>a research question you will answer</strong> rather\nthan <strong>an artefact you will build</strong>. 
Often in systems the appropriate way to\nanswer the research questions we pose will be to build an artefact \u2013 but by\nframing it in terms of the question you seek to answer, it makes it easier to\nwrite the dissertation as a piece of research rather than a <a href=\"https://en.wikipedia.org/wiki/Small_matter_of_programming\">small matter of\nprogramming</a>.\nEmpirically, this seems to have an outsize effect on the chances of the\nexaminers thinking the project difficult and marking accordingly.</p>\n<p>It can also be useful to try to be explicit about where your project requires\nyou to go beyond the CST taught material, particularly from Part IA/IB.</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#project-proposal\">Project proposal</a></h2>\n<p>To my mind there are two critical pieces of the proposal around which everything\nelse sits.</p>\n<p>First, the <strong>evaluation plan</strong>: if you can write this well, then you understand\nwhat you\u2019re going to build, and what it means to have done it well (or badly).\nWriting the evaluation plan therefore usually means you have, for the most part,\ngot the rest of the project figured out.</p>\n<p>Second, the <strong>workplan</strong>: this divides time until submission into two week\n(never larger, sometimes smaller) chunks, each of which has attached a <em>calendar\ndate</em> (so there\u2019s no confusion over weeks in term or suchlike) and a\n<em>milestone</em>/<em>deliverable</em> (so that we can immediately tell whether you\u2019ve\ncompleted, or at least are making progress against, that chunk of work in our\nweekly meeting). Don\u2019t forget to take account of any relevant module assessment\ndeadlines in your plan!</p>\n<p>Note that it\u2019s a plan not a contract! 
You don\u2019t lose marks because you deviate\nfrom the plan\u2013 but if you can\u2019t tell whether you\u2019re ahead or behind, you might\nwell find yourself in a sticky position at the end of Lent term or start of\nEaster term when you find you\u2019ve got module assessments to complete, revision to\nstart, two weeks to go until dissertation submission and still four weeks of\nwork to do on your project\u2026</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#supervision-process\">Supervision process</a></h2>\n<p>I normally supervise projects by scheduling weekly half-hour meetings with each\nstudent. Longer meetings can be arranged on an ad hoc basis as required. The key\npurpose of the meeting is to check progress against the workplan, and to make\nsure that any difficulties and roadblocks are aired and dealt with (whether in\nthe meeting or by scheduling a longer discussion).</p>\n<p>For an example of a reasonable target timeline, consider trying to get\nimplementation completed by the end of Christmas vacation, evaluation completed\nby the division of Lent Term, and the dissertation completed by the end of Lent\nTerm. That then gives you flexibility as to whether you do more project work,\nextensions etc., or focus on exam revision, or whatever.</p>\n\n\n<p>Hopefully that\u2019s helpful. At the very least, I can now point potential project\nstudents at it, so it\u2019s helpful for me :) Some of the above may also be relevant\nwriting research proposals (Part III / MPhil projects, even Ph.D.s) but that\u2019s a\ntopic for another day.</p>",+"content": "<p>Undergraduate final-year (\u201cPart II\u201d) project supervision goes in fits and\nstarts. After a couple of years of having almost no interest, this year I\u2019ve had\nseveral enquiries and it seems I might end supervising 3\u20134 projects. 
So\nherewith a record of the things I\u2019ve found myself repeating!</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#project-structure\">Project structure</a></h2>\n<p>The key thing for the structure of the project is to make sure that there is a\ncore piece that is (essentially) guaranteed to be deliverable. This is the piece\nthat you know you can do, and once done and written up, you know you can get an\nadequate (if not great) mark. Ensuring this takes the risk out of the project.</p>\n<p>On top of this core piece, it\u2019s then usually sensible to build \u201ca few\u201d (2? 3?\n4?) extensions which will make the project spicy if done well. You may wish to\nphrase these extensions as \u201cfor example, extensions might include\u2026\u201d or words\nto that effect, to give some wiggle room in the final dissertation. Getting\nthese done successfully is what should put you in line for a very good mark\nrather than a simply adequate mark.</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#project-framing\">Project framing</a></h2>\n<p>It is also very helpful, particularly if you are aiming for a high mark, to try\nto frame the project in terms of <strong>a research question you will answer</strong> rather\nthan <strong>an artefact you will build</strong>. 
Often in systems the appropriate way to\nanswer the research questions we pose will be to build an artefact \u2013 but by\nframing it in terms of the question you seek to answer, it makes it easier to\nwrite the dissertation as a piece of research rather than a <a href=\"https://en.wikipedia.org/wiki/Small_matter_of_programming\">small matter of\nprogramming</a>.\nEmpirically, this seems to have an outsize effect on the chances of the\nexaminers thinking the project difficult and marking accordingly.</p>\n<p>It can also be useful to try to be explicit about where your project requires\nyou to go beyond the CST taught material, particularly from Part IA/IB.</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#project-proposal\">Project proposal</a></h2>\n<p>To my mind there are two critical pieces of the proposal around which everything\nelse sits.</p>\n<p>First, the <strong>evaluation plan</strong>: if you can write this well, then you understand\nwhat you\u2019re going to build, and what it means to have done it well (or badly).\nWriting the evaluation plan therefore usually means you have, for the most part,\ngot the rest of the project figured out.</p>\n<p>Second, the <strong>workplan</strong>: this divides time until submission into two week\n(never larger, sometimes smaller) chunks, each of which has attached a <em>calendar\ndate</em> (so there\u2019s no confusion over weeks in term or suchlike) and a\n<em>milestone</em>/<em>deliverable</em> (so that we can immediately tell whether you\u2019ve\ncompleted, or at least are making progress against, that chunk of work in our\nweekly meeting). Don\u2019t forget to take account of any relevant module assessment\ndeadlines in your plan!</p>\n<p>Note that it\u2019s a plan not a contract! 
You don\u2019t lose marks because you deviate\nfrom the plan\u2013 but if you can\u2019t tell whether you\u2019re ahead or behind, you might\nwell find yourself in a sticky position at the end of Lent term or start of\nEaster term when you find you\u2019ve got module assessments to complete, revision to\nstart, two weeks to go until dissertation submission and still four weeks of\nwork to do on your project\u2026</p>\n<h2><a href=\"https://mort.io/blog/part-ii-projects/#supervision-process\">Supervision process</a></h2>\n<p>I normally supervise projects by scheduling weekly half-hour meetings with each\nstudent. Longer meetings can be arranged on an ad hoc basis as required. The key\npurpose of the meeting is to check progress against the workplan, and to make\nsure that any difficulties and roadblocks are aired and dealt with (whether in\nthe meeting or by scheduling a longer discussion).</p>\n<p>For an example of a reasonable target timeline, consider trying to get\nimplementation completed by the end of Christmas vacation, evaluation completed\nby the division of Lent Term, and the dissertation completed by the end of Lent\nTerm. That then gives you flexibility as to whether you do more project work,\nextensions etc., or focus on exam revision, or whatever.</p>\n\n\n<p>Hopefully that\u2019s helpful. At the very least, I can now point potential project\nstudents at it, so it\u2019s helpful for me :) Some of the above may also be relevant\nwriting research proposals (Part III / MPhil projects, even Ph.D.s) but that\u2019s a\ntopic for another day.</p>",
+18
mort/blog_past-present-future_.json
···+"summary": "<p>I recently decided to refresh and update my <a href=\"https://github.com/mor1/ocal/\">ocal</a> package,<a href=\"https://mort.io/blog/past-present-future/#1\">1</a> primarily to\nport it to use the excellent <a href=\"https://github.com/pqwy/notty/\">notty</a> before adding support for indicating\nweek-of-year. At the same time, I took the opportunity to update the build\ninfrastructure now that the OCaml world has some shiny new packaging and build\ntools to go with <a href=\"https://github.com/ocaml/opam/\">OPAM</a>, namely <a href=\"https://github.com/dbuenzli/topkg/\"><code>topkg</code></a> and <a href=\"https://github.com/janestreet/jbuilder/\"><code>jbuilder</code></a>. So, starting\nfrom <a href=\"http://github.com/djs55/\">Dave Scott\u2019s</a> <a href=\"https://mirage.io/wiki/packaging\">wiki entry</a> about how to package <a href=\"https://mirage.io/\">Mirage</a> libraries,\nhere\u2019s what I had to do\u2026</p>\n<div>1\n<p>A somewhat over-featured replacement for the standard UNIX <code>cal</code> utility,\nbecause I got irritated by its American-centricity and my\ninitial <a href=\"https://github.com/mor1/python-scripts/blob/master/cal.py\">Python replacement</a> was just too slow\u2026</p>\n</div>\n<h2><a href=\"https://mort.io/blog/past-present-future/#remove-oasis-remnants\">Remove Oasis remnants</a></h2>\n<pre><code><span><span><span>git</span></span><span> rm _oasis setup.ml Makefile<span>*</span> _tags myocamlbuild.ml .merlin</span>\n</span><span><span><span>mv</span></span><span> ocal.opam/opam o</span> <span>&&</span> <span><span>git</span></span><span> rm<span><span> -</span>rf</span> ocal.opam</span> <span>&&</span> <span><span>mv</span></span><span> o ocal.opam</span> <span>&&</span> <span><span>git</span></span><span> add ocal.opam</span>\n</span><span><span><span>cat</span></span><span> <span>></span></span><span>|</span> <span><span>.gitignore</span></span><span> 
<span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>_build\n</span></span></span><span><span><span>*.merlin\n</span></span></span><span><span><span>*.install\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<p>Although we\u2019re removing the <code>ocal.opam/descr</code> file, we\u2019re not going to lose the\ncontent: we\u2019re going to let <code>topkg opam pkg</code> use its default <code>--readme</code> option\nto extract the relevant info from the first marked up section of the\n<a href=\"https://github.com/mor1/ocal/blob/0.2.0/README.md\"><code>README.md</code></a>:</p>\n<pre><code><span><span><span><span>#</span> </span><span><span>ocal \u2014 An improved Unix <span><span>`</span>cal<span>`</span></span> utility</span><span>\n</span></span></span></span><span>\n</span><span><span>%%VERSION%%\n</span></span><span><span>\n</span></span><span><span>A replacement for the standard Unix <span><span>`</span>cal<span>`</span></span> utility. 
Partly because I could,\n</span></span><span><span>partly because I'd become too irritated with its command line interface.\n</span></span></code></pre>\n<p>We also remove but don\u2019t lose the functionality of the <code>.merlin</code> and OPAM\n<code>ocal.install</code> files, as <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> will generate them for us.</p>\n<h2><a href=\"https://mort.io/blog/past-present-future/#create-src-jbuild-file\">Create <code>src/jbuild</code> file</a></h2>\n<pre><code><span><span><span>cat</span></span><span> <span>></span></span><span>|</span> <span><span>src/jbuild</span></span><span> <span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>(jbuild_version 1)\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>(executable\n</span></span></span><span><span><span> ((public_name ocal)\n</span></span></span><span><span><span> (package ocal)\n</span></span></span><span><span><span> (name main)\n</span></span></span><span><span><span>\n</span></span></span><span><span><span> (libraries\n</span></span></span><span><span><span> (\n</span></span></span><span><span><span> astring\n</span></span></span><span><span><span> calendar\n</span></span></span><span><span><span> cmdliner\n</span></span></span><span><span><span> notty\n</span></span></span><span><span><span> notty.unix\n</span></span></span><span><span><span> ))\n</span></span></span><span><span><span> (flags (:standard -w "A-44-48-52" -safe-string))\n</span></span></span><span><span><span> ))\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<p>This corresponds to the <a href=\"https://github.com/mor1/ocal/releases/tag/0.2.0\">0.2.0</a>\nrelease of <a href=\"https://github.com/mor1/ocal/\">ocal</a>. 
Note that the <code>name</code> parameter refers to the module that\ncontains the entrypoint for the executable, and that we turn on all warnings\n(<code>A</code>) except for three that we wish to ignore:</p>\n<ul>\n<li><code>44</code>: Open statement shadows an already defined identifier.</li>\n<li><code>48</code>: Implicit elimination of optional arguments.</li>\n<li><code>52</code>: (see 8.5.1) Fragile constant pattern.</li>\n</ul>\n<p>After I did some tidying up of the code to deal with the newly imposed warnings,\n<code>make</code> and <code>make install</code> satisfactorily (and quickly!) used <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> to\nbuild and install the executable as <code>~/.opam/system/bin/ocal</code> (thanks to the\n<code>public_name</code> stanza in the <code>src/jbuild</code> file, above). <code>make uninstall</code> then\ncaused <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> to remove it, before I <code>opam</code> pinned it and then reinstall\nthrough <code>opam</code> to check that workflow worked as well:</p>\n<pre><code><span><span><span>opam</span></span><span> remove ocal</span>\n</span><span><span><span>opam</span></span><span> pin add<span><span> -</span>yn</span><span><span> --</span>dev-repo</span> ocal .</span>\n</span><span><span><span>opam</span></span><span> install ocal</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/past-present-future/#create-the-topkg-skeletons\">Create the <code>topkg</code> skeletons</a></h2>\n<p>Having refreshed the basic build infrastructure, next it\u2019s time to update the\npackaging workflow. 
For a simple library we could use the automatic\n<a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a>/<a href=\"https://github.com/dbuenzli/topkg/\">topkg</a> plugin per the <a href=\"https://mirage.io/wiki/packaging\">wiki entry</a>:</p>\n<pre><code><span><span><span>mkdir</span></span><span> pkg</span>\n</span><span><span><span>cat</span></span><span> <span>></span></span><span>|</span> <span><span>pkg/pkg.ml</span></span><span> <span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>#!/usr/bin/env ocaml\n</span></span></span><span><span><span>#use "topfind"\n</span></span></span><span><span><span>#require "topkg-jbuilder.auto"\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<p>However, this isn\u2019t a library so we don\u2019t have documentation to build so we\ndon\u2019t bother with the <code>odoc</code> skeleton. As a result we also need to customise\n<a href=\"https://github.com/mor1/ocal/blob/0.2.0/pkg/pkg.ml\"><code>pkg/pkg.ml</code></a> so as to stop <code>topkg publish</code> failing when it can\u2019t build docs:</p>\n<pre><code><span><span>#</span><span>!</span><span>/</span>usr<span>/</span>bin<span>/</span>env ocaml\n</span><span><span>#use</span> <span><span>"</span>topfind<span>"</span></span>\n</span><span><span>#require</span> <span><span>"</span>topkg-jbuilder<span>"</span></span>\n</span><span>\n</span><span><span><span>open</span> <span>Topkg</span>\n</span></span><span>\n</span><span><span>let</span> <span>publish</span> <span>=</span>\n</span><span> <span>Pkg.</span>publish <span>~artefacts<span>:</span></span><span><span>[</span><span>`Distrib</span><span>]</span></span> <span>(<span>)</span></span>\n</span><span>\n</span><span><span>let</span> <span>(<span>)</span></span> <span>=</span>\n</span><span> <span>Topkg_jbuilder.</span>describe <span>~publish</span> <span>(<span>)</span></span>\n</span></code></pre>\n<h2><a 
href=\"https://mort.io/blog/past-present-future/#prepare-a-release\">Prepare a release</a></h2>\n<p>Finally, we follow the standard <a href=\"https://github.com/dbuenzli/topkg/\">topkg</a> workflow to prepare a release. First,\nadd an entry to <a href=\"https://github.com/mor1/ocal/blob/0.2.0/CHANGES.md\"><code>CHANGES.md</code></a> with the correct formatting and commit the\nresult, and then:</p>\n<pre><code><span><span><span>distrib</span></span><span>:</span>\n<span></span><span></span></span><span><span></span><span>\t<span><span>[</span><span> <span><span>-</span>x</span> <span><span>$$(</span>opam config var root<span>)</span></span>/plugins/opam-publish/repos/ocal <span>]</span></span> <span>||</span> <span>\\\n</span></span></span></span><span><span><span>\t <span><span>opam-publish</span></span><span> repo add ocal mor1/ocal</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> tag</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> distrib</span></span>\n</span></span></code></pre>\n<p>\u2026which creates tokens for accessing the GitHub repo for this project (if they\ndon\u2019t already exist), creates a release tag based on entries in <a href=\"https://github.com/mor1/ocal/blob/0.2.0/CHANGES.md\"><code>CHANGES.md</code></a>,\nand then creates the release tarballs (without the edits to <a href=\"https://github.com/mor1/ocal/blob/0.2.0/pkg/pkg.ml\"><code>pkg/pkg.ml</code></a> this\nwould also build the docs, but we have none).</p>\n<h2><a href=\"https://mort.io/blog/past-present-future/#publish-a-release\">Publish a release</a></h2>\n<p>Finally, we publish the release to GitHub and issue a pull request to\nthe <a href=\"https://github.com/ocaml/opam/\">OPAM repository</a> to add the new release into OPAM after linting and\ntests have 
passed.</p>\n<pre><code><span><span><span>publish</span></span><span>:</span>\n<span></span><span></span></span><span><span></span><span>\t<span><span><span>topkg</span></span><span> publish</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> opam pkg</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> opam submit</span></span>\n</span></span></code></pre>\n<p>Given that this repo has only a single package, we could in fact simply issue</p>\n<pre><code><span>topkg tag && topkg bistro\n</span></code></pre>\n<p>Also, as an alternative to customising the <a href=\"https://github.com/mor1/ocal/blob/0.2.0/pkg/pkg.ml\"><code>pkg/pkg.ml</code></a> as indicated above, we\ncould simply remember to indicate the appropriate customisation on the command\nline:</p>\n<pre><code><span>topkg publish distrib\n</span></code></pre>\n<p>\u2026but <code>topkg bistro</code> wouldn\u2019t then work.</p>\n<h2><a href=\"https://mort.io/blog/past-present-future/#conclusion\">Conclusion</a></h2>\n<p>So that\u2019s it: a simple executable distribution taken from old-school <a href=\"http://oasis.forge.ocamlcore.org/\">Oasis</a> and\n<a href=\"https://ocaml.org/learn/tutorials/ocamlbuild/\">OCamlBuild</a> infrastructure to shiny new modern <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> and <a href=\"https://github.com/dbuenzli/topkg/\">topkg</a>. 
The new\nscheme seems to me to be an improvement: faster build times, simpler (to my\neyes) metadata, autogeneration of more of the repeated metadata (<code>.merlin</code> etc),\nand a reasonably simple <a href=\"https://github.com/mor1/ocal/blob/0.2.0/Makefile\"><code>Makefile</code></a> that I actually think I understand.\nDefinitely progress :)</p>",+"content": "<p>I recently decided to refresh and update my <a href=\"https://github.com/mor1/ocal/\">ocal</a> package,<a href=\"https://mort.io/blog/past-present-future/#1\">1</a> primarily to\nport it to use the excellent <a href=\"https://github.com/pqwy/notty/\">notty</a> before adding support for indicating\nweek-of-year. At the same time, I took the opportunity to update the build\ninfrastructure now that the OCaml world has some shiny new packaging and build\ntools to go with <a href=\"https://github.com/ocaml/opam/\">OPAM</a>, namely <a href=\"https://github.com/dbuenzli/topkg/\"><code>topkg</code></a> and <a href=\"https://github.com/janestreet/jbuilder/\"><code>jbuilder</code></a>. 
So, starting\nfrom <a href=\"http://github.com/djs55/\">Dave Scott\u2019s</a> <a href=\"https://mirage.io/wiki/packaging\">wiki entry</a> about how to package <a href=\"https://mirage.io/\">Mirage</a> libraries,\nhere\u2019s what I had to do\u2026</p>\n<div>1\n<p>A somewhat over-featured replacement for the standard UNIX <code>cal</code> utility,\nbecause I got irritated by its American-centricity and my\ninitial <a href=\"https://github.com/mor1/python-scripts/blob/master/cal.py\">Python replacement</a> was just too slow\u2026</p>\n</div>\n<h2><a href=\"https://mort.io/blog/past-present-future/#remove-oasis-remnants\">Remove Oasis remnants</a></h2>\n<pre><code><span><span><span>git</span></span><span> rm _oasis setup.ml Makefile<span>*</span> _tags myocamlbuild.ml .merlin</span>\n</span><span><span><span>mv</span></span><span> ocal.opam/opam o</span> <span>&&</span> <span><span>git</span></span><span> rm<span><span> -</span>rf</span> ocal.opam</span> <span>&&</span> <span><span>mv</span></span><span> o ocal.opam</span> <span>&&</span> <span><span>git</span></span><span> add ocal.opam</span>\n</span><span><span><span>cat</span></span><span> <span>></span></span><span>|</span> <span><span>.gitignore</span></span><span> <span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>_build\n</span></span></span><span><span><span>*.merlin\n</span></span></span><span><span><span>*.install\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<p>Although we\u2019re removing the <code>ocal.opam/descr</code> file, we\u2019re not going to lose the\ncontent: we\u2019re going to let <code>topkg opam pkg</code> use its default <code>--readme</code> option\nto extract the relevant info from the first marked up section of the\n<a href=\"https://github.com/mor1/ocal/blob/0.2.0/README.md\"><code>README.md</code></a>:</p>\n<pre><code><span><span><span><span>#</span> </span><span><span>ocal \u2014 An improved Unix 
<span><span>`</span>cal<span>`</span></span> utility</span><span>\n</span></span></span></span><span>\n</span><span><span>%%VERSION%%\n</span></span><span><span>\n</span></span><span><span>A replacement for the standard Unix <span><span>`</span>cal<span>`</span></span> utility. Partly because I could,\n</span></span><span><span>partly because I'd become too irritated with its command line interface.\n</span></span></code></pre>\n<p>We also remove but don\u2019t lose the functionality of the <code>.merlin</code> and OPAM\n<code>ocal.install</code> files, as <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> will generate them for us.</p>\n<h2><a href=\"https://mort.io/blog/past-present-future/#create-src-jbuild-file\">Create <code>src/jbuild</code> file</a></h2>\n<pre><code><span><span><span>cat</span></span><span> <span>></span></span><span>|</span> <span><span>src/jbuild</span></span><span> <span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>(jbuild_version 1)\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>(executable\n</span></span></span><span><span><span> ((public_name ocal)\n</span></span></span><span><span><span> (package ocal)\n</span></span></span><span><span><span> (name main)\n</span></span></span><span><span><span>\n</span></span></span><span><span><span> (libraries\n</span></span></span><span><span><span> (\n</span></span></span><span><span><span> astring\n</span></span></span><span><span><span> calendar\n</span></span></span><span><span><span> cmdliner\n</span></span></span><span><span><span> notty\n</span></span></span><span><span><span> notty.unix\n</span></span></span><span><span><span> ))\n</span></span></span><span><span><span> (flags (:standard -w "A-44-48-52" -safe-string))\n</span></span></span><span><span><span> ))\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<p>This corresponds to the <a 
href=\"https://github.com/mor1/ocal/releases/tag/0.2.0\">0.2.0</a>\nrelease of <a href=\"https://github.com/mor1/ocal/\">ocal</a>. Note that the <code>name</code> parameter refers to the module that\ncontains the entrypoint for the executable, and that we turn on all warnings\n(<code>A</code>) except for three that we wish to ignore:</p>\n<ul>\n<li><code>44</code>: Open statement shadows an already defined identifier.</li>\n<li><code>48</code>: Implicit elimination of optional arguments.</li>\n<li><code>52</code>: (see 8.5.1) Fragile constant pattern.</li>\n</ul>\n<p>After I did some tidying up of the code to deal with the newly imposed warnings,\n<code>make</code> and <code>make install</code> satisfactorily (and quickly!) used <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> to\nbuild and install the executable as <code>~/.opam/system/bin/ocal</code> (thanks to the\n<code>public_name</code> stanza in the <code>src/jbuild</code> file, above). <code>make uninstall</code> then\ncaused <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> to remove it, before I <code>opam</code> pinned it and then reinstalled\nthrough <code>opam</code> to check that workflow worked as well:</p>\n<pre><code><span><span><span>opam</span></span><span> remove ocal</span>\n</span><span><span><span>opam</span></span><span> pin add<span><span> -</span>yn</span><span><span> --</span>dev-repo</span> ocal .</span>\n</span><span><span><span>opam</span></span><span> install ocal</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/past-present-future/#create-the-topkg-skeletons\">Create the <code>topkg</code> skeletons</a></h2>\n<p>Having refreshed the basic build infrastructure, next it\u2019s time to update the\npackaging workflow. 
For a simple library we could use the automatic\n<a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a>/<a href=\"https://github.com/dbuenzli/topkg/\">topkg</a> plugin per the <a href=\"https://mirage.io/wiki/packaging\">wiki entry</a>:</p>\n<pre><code><span><span><span>mkdir</span></span><span> pkg</span>\n</span><span><span><span>cat</span></span><span> <span>></span></span><span>|</span> <span><span>pkg/pkg.ml</span></span><span> <span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>#!/usr/bin/env ocaml\n</span></span></span><span><span><span>#use "topfind"\n</span></span></span><span><span><span>#require "topkg-jbuilder.auto"\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<p>However, this isn\u2019t a library, so there is no documentation to build and we\ndon\u2019t bother with the <code>odoc</code> skeleton. As a result we also need to customise\n<a href=\"https://github.com/mor1/ocal/blob/0.2.0/pkg/pkg.ml\"><code>pkg/pkg.ml</code></a> so as to stop <code>topkg publish</code> failing when it can\u2019t build docs:</p>\n<pre><code><span><span>#</span><span>!</span><span>/</span>usr<span>/</span>bin<span>/</span>env ocaml\n</span><span><span>#use</span> <span><span>"</span>topfind<span>"</span></span>\n</span><span><span>#require</span> <span><span>"</span>topkg-jbuilder<span>"</span></span>\n</span><span>\n</span><span><span><span>open</span> <span>Topkg</span>\n</span></span><span>\n</span><span><span>let</span> <span>publish</span> <span>=</span>\n</span><span> <span>Pkg.</span>publish <span>~artefacts<span>:</span></span><span><span>[</span><span>`Distrib</span><span>]</span></span> <span>(<span>)</span></span>\n</span><span>\n</span><span><span>let</span> <span>(<span>)</span></span> <span>=</span>\n</span><span> <span>Topkg_jbuilder.</span>describe <span>~publish</span> <span>(<span>)</span></span>\n</span></code></pre>\n<h2><a 
href=\"https://mort.io/blog/past-present-future/#prepare-a-release\">Prepare a release</a></h2>\n<p>Finally, we follow the standard <a href=\"https://github.com/dbuenzli/topkg/\">topkg</a> workflow to prepare a release. First,\nadd an entry to <a href=\"https://github.com/mor1/ocal/blob/0.2.0/CHANGES.md\"><code>CHANGES.md</code></a> with the correct formatting and commit the\nresult, and then:</p>\n<pre><code><span><span><span>distrib</span></span><span>:</span>\n<span></span><span></span></span><span><span></span><span>\t<span><span>[</span><span> <span><span>-</span>x</span> <span><span>$$(</span>opam config var root<span>)</span></span>/plugins/opam-publish/repos/ocal <span>]</span></span> <span>||</span> <span>\\\n</span></span></span></span><span><span><span>\t <span><span>opam-publish</span></span><span> repo add ocal mor1/ocal</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> tag</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> distrib</span></span>\n</span></span></code></pre>\n<p>\u2026which creates tokens for accessing the GitHub repo for this project (if they\ndon\u2019t already exist), creates a release tag based on entries in <a href=\"https://github.com/mor1/ocal/blob/0.2.0/CHANGES.md\"><code>CHANGES.md</code></a>,\nand then creates the release tarballs (without the edits to <a href=\"https://github.com/mor1/ocal/blob/0.2.0/pkg/pkg.ml\"><code>pkg/pkg.ml</code></a> this\nwould also build the docs, but we have none).</p>\n<h2><a href=\"https://mort.io/blog/past-present-future/#publish-a-release\">Publish a release</a></h2>\n<p>Finally, we publish the release to GitHub and issue a pull request to\nthe <a href=\"https://github.com/ocaml/opam/\">OPAM repository</a> to add the new release into OPAM after linting and\ntests have 
passed.</p>\n<pre><code><span><span><span>publish</span></span><span>:</span>\n<span></span><span></span></span><span><span></span><span>\t<span><span><span>topkg</span></span><span> publish</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> opam pkg</span></span>\n</span></span><span><span>\t<span><span><span>topkg</span></span><span> opam submit</span></span>\n</span></span></code></pre>\n<p>Given that this repo has only a single package, we could in fact simply issue</p>\n<pre><code><span>topkg tag && topkg bistro\n</span></code></pre>\n<p>Also, as an alternative to customising the <a href=\"https://github.com/mor1/ocal/blob/0.2.0/pkg/pkg.ml\"><code>pkg/pkg.ml</code></a> as indicated above, we\ncould simply remember to indicate the appropriate customisation on the command\nline:</p>\n<pre><code><span>topkg publish distrib\n</span></code></pre>\n<p>\u2026but <code>topkg bistro</code> wouldn\u2019t then work.</p>\n<h2><a href=\"https://mort.io/blog/past-present-future/#conclusion\">Conclusion</a></h2>\n<p>So that\u2019s it: a simple executable distribution taken from old-school <a href=\"http://oasis.forge.ocamlcore.org/\">Oasis</a> and\n<a href=\"https://ocaml.org/learn/tutorials/ocamlbuild/\">OCamlBuild</a> infrastructure to shiny new modern <a href=\"https://github.com/janestreet/jbuilder/\">jbuilder</a> and <a href=\"https://github.com/dbuenzli/topkg/\">topkg</a>. The new\nscheme seems to me to be an improvement: faster build times, simpler (to my\neyes) metadata, autogeneration of more of the repeated metadata (<code>.merlin</code> etc),\nand a reasonably simple <a href=\"https://github.com/mor1/ocal/blob/0.2.0/Makefile\"><code>Makefile</code></a> that I actually think I understand.\nDefinitely progress :)</p>",
+18
mort/blog_phd-viva_.json
···+"summary": "<p>Having recently, happily, had several PhD students completing in short order,\nI\u2019ve been approaching external PhD examiners. Occasionally I find myself asking\nsomeone who\u2019s not done any / many in the UK previously. As our system \u2013 as all\nsuch systems! \u2013 is a bit different to those in other parts of the world, I\u2019ve\nwritten a few notes on a couple of occasions about what to expect. So I figured\nI might as well publish them.</p>\n<p>What follows is my impression / understanding based on experience here in the\n<a href=\"https://www.cst.cam.ac.uk/\">Department of Computer Science & Technology</a>,\n<a href=\"https://www.cam.ac.uk/\">Cambridge University</a> ca. 2025. Your Mileage May Vary\nanywhen and anywhere else, including anywhere else in the UK \u2013 check local\nregulations to be sure.</p>\n<p>In terms of process, the system here is that candidates submit their complete\ndissertation and then undergo a \u201cviva voce\u201d (oral examination). It would be\nusual for the viva to take place within 2\u20143 months of submission. 
It\u2019s better\n(IMO, much better) if it can happen in person but we may still be able to\narrange to do it online in extremis.</p>\n<p>There are two examiners:</p>\n<ol>\n<li>The <em>internal examiner</em> (typically connected to the Department) who ensures\nthe process is followed properly but may not be a deep expert in the specific\ntopic, and</li>\n<li>The <em>external examiner</em> (from outside the University) who is there as the\nsubject matter expert.</li>\n</ol>\n<p>The viva consists of the two examiners asking the candidate questions about\ntheir dissertation until they\u2019re satisfied; typically this takes at least 2h and\ncan go longer, though more than 3.5\u20144h is unusual in my experience.</p>\n<p>The examiners are each expected to read the dissertation in detail before the\nviva and each write a short (typically 1\u20142pp) independent report giving their\nopinion, outlining any concerns they will have and the resulting\ntopics/questions they will be exploring in the viva, and indicating what their a\npriori judgement is in terms of recommendation (roughly: pass/pass with\ncorrections/revise & resubmit/no Ph.D. but you can have a Masters/fail).</p>\n<p>The examiners will then typically meet ~30min or so before the viva to discuss\ntheir independent reports and decide on the approach to take in the viva. After\nthe viva they write a joint report (usually shorter than their independent\nreports; perhaps 0.5pp) outlining what happened in the viva, as well as making a\nfinal recommendation and providing (if appropriate) a list of corrections that\nmust be satisfied for the candidate to pass.</p>\n<p>Finally, the University pays a (risibly small) honorarium to the external\nexaminer for doing the viva plus reasonable expenses.</p>",+"content": "<p>Having recently, happily, had several PhD students completing in short order,\nI\u2019ve been approaching external PhD examiners. 
Occasionally I find myself asking\nsomeone who\u2019s not done any / many in the UK previously. As our system \u2013 as all\nsuch systems! \u2013 is a bit different to those in other parts of the world, I\u2019ve\nwritten a few notes on a couple of occasions about what to expect. So I figured\nI might as well publish them.</p>\n<p>What follows is my impression / understanding based on experience here in the\n<a href=\"https://www.cst.cam.ac.uk/\">Department of Computer Science & Technology</a>,\n<a href=\"https://www.cam.ac.uk/\">Cambridge University</a> ca. 2025. Your Mileage May Vary\nanywhen and anywhere else, including anywhere else in the UK \u2013 check local\nregulations to be sure.</p>\n<p>In terms of process, the system here is that candidates submit their complete\ndissertation and then undergo a \u201cviva voce\u201d (oral examination). It would be\nusual for the viva to take place within 2\u20143 months of submission. It\u2019s better\n(IMO, much better) if it can happen in person but we may still be able to\narrange to do it online in extremis.</p>\n<p>There are two examiners:</p>\n<ol>\n<li>The <em>internal examiner</em> (typically connected to the Department) who ensures\nthe process is followed properly but may not be a deep expert in the specific\ntopic, and</li>\n<li>The <em>external examiner</em> (from outside the University) who is there as the\nsubject matter expert.</li>\n</ol>\n<p>The viva consists of the two examiners asking the candidate questions about\ntheir dissertation until they\u2019re satisfied; typically this takes at least 2h and\ncan go longer, though more than 3.5\u20144h is unusual in my experience.</p>\n<p>The examiners are each expected to read the dissertation in detail before the\nviva and each write a short (typically 1\u20142pp) independent report giving their\nopinion, outlining any concerns they will have and the resulting\ntopics/questions they will be exploring in the viva, and indicating what their a\npriori judgement 
is in terms of recommendation (roughly: pass/pass with\ncorrections/revise & resubmit/no Ph.D. but you can have a Masters/fail).</p>\n<p>The examiners will then typically meet ~30min or so before the viva to discuss\ntheir independent reports and decide on the approach to take in the viva. After\nthe viva they write a joint report (usually shorter than their independent\nreports; perhaps 0.5pp) outlining what happened in the viva, as well as making a\nfinal recommendation and providing (if appropriate) a list of corrections that\nmust be satisfied for the candidate to pass.</p>\n<p>Finally, the University pays a (risibly small) honorarium to the external\nexaminer for doing the viva plus reasonable expenses.</p>",
+18
mort/blog_post-covid-tpc_.json
···+"summary": "<p>I do not participate in a huge number of TPCs (Technical Programme Committees)\nas a general rule \u2013 partly time constraints but mostly no-one knows who I am so\nI don\u2019t often get asked\u2026 (!)</p>\n<p>I have done a few though, some big (e.g., <a href=\"https://www.usenix.org/conference/nsdi15\">USENIX NSDI</a>, <a href=\"https://conferences.sigcomm.org/imc/2018/\">ACM IMC</a>),\nsome small (<a href=\"https://uksystems.org/\">UK Systems</a>,\n<a href=\"https://link.springer.com/conference/pam\">PAM</a>), and perhaps because I only do\na couple every few years, while doing <a href=\"https://conferences2.sigcomm.org/co-next/\">ACM\nCoNEXT</a> and <a href=\"https://acm-ieee-sec.org/list/\">ACM/IEEE\nSEC</a> this week, I found myself particularly\nnoticing some changes in practice since the last TPCs I recall (notably\n<a href=\"https://www.usenix.org/conference/nsdi15\">NSDI</a> and <a href=\"https://conferences.sigcomm.org/imc/2018/\">IMC</a>). So here are three observations.</p>\n<h2><a href=\"https://mort.io/blog/post-covid-tpc/#1-online-first-and-only\">1. Online, first and only</a></h2>\n<p>The biggest obvious change is that TPC meetings are now online rather than\nin-person. This has one big disadvantage for me: I really enjoyed travelling to\nthe meeting to meet colleagues and (usually) participate in some TPC-oriented\nworkshop with presentations of recent and in-progress work. In many ways, I found\nthis sort of activity more interesting than the conference itself (sorry!). It\nalso has the unfortunate effect that, at least for international TPCs, timezones\nmake scheduling tricky \u2013 one benefit of travelling was that at least the\nmeeting took place in localtime for (almost) everyone.</p>\n<p>However, it also has clear benefits: the CO2 footprint of the event is\ndramatically reduced, which can only be a good thing. 
The financial cost\nreduction probably also opens up the experience to attendees who would never\npreviously have been able to make it. I\u2019ve rarely seen really poor meeting\nbehaviour on the TPCs I\u2019ve been involved in, but I also find that in an online\nmeeting, chairing tends to be more easily kept rigorous, and the sometimes\ndominating effect of a single confident (perhaps I might say over-confident, or\neven just loud) individual is significantly reduced. Which is good.</p>\n<h2><a href=\"https://mort.io/blog/post-covid-tpc/#2-offline-dominance\">2. Offline dominance</a></h2>\n<p>I now see <strong>dramatically</strong> more use being made of commenting and discussions in\n<a href=\"https://read.seas.harvard.edu/~kohler/hotcrp/\">HotCRP</a>, which remains the only\nconference management and paper reviewing platform I will willingly use (thanks\n<a href=\"https://en.wikipedia.org/wiki/Eddie_Kohler\">Eddie</a>!). I don\u2019t know whether this\nis a post-pandemic effect, or just the fact that the old old guard has largely\nshuffled out of active TPC duties and we now have a new old guard (i.e., people\nof my era) and younger who are perhaps happier to communicate and express views\nwithout needing to be in the same room.</p>\n<p>I think this is another dramatic improvement in process, and puts us in a\nsimilar place to how, for example, we handled marking final year undergraduate\nprojects at <a href=\"https://www.nottingham.ac.uk/\">Nottingham University</a>. It means\nthat discussion is recorded, and usually more coherently and explicitly argued\ndue to the need to write it down.</p>\n<h2><a href=\"https://mort.io/blog/post-covid-tpc/#3-reduction-in-extremism\">3. 
Reduction in extremism</a></h2>\n<p>The one arguably slightly negative comment I would have is \u2013 and this is only\nmy anecdotal impression, and not something I can pretend to have data on \u2013 that\nI think I see a tendency for reviewers to perhaps be a little less clear in\ntheir scoring. Once the across the board strong rejects (a fair number) and\nstrong accepts (a much smaller number) were taken out, I saw an awful lot of\nweak reject (but happy to consider weak accept after discussion) and weak accept\n(but happy to consider weak reject after discussion). This made it quite hard\nfor me, as another reviewer, sometimes to get a clear signal as to what the\nother reviews were recommending.</p>\n<p>This is a challenge I sometimes see in admissions interviewing: less experienced\ninterviewers are sometimes reluctant to give a clear signal, exhibiting instead\na tendency to score near or perhaps just above the middle of the scale. (I\nremember doing this myself.) This feels more comfortable \u2013 putting one\u2019s head\nabove the parapet by taking a clear stance often feels more socially awkward\nthan giving a \u201cfine, ok, pretty good, sure\u201d response, not least because it needs\nstronger supporting argument \u2013 but in the end I think it misses the point of\nbeing an interviewer / a TPC member which is exactly to accept or reject papers\nbased on having been recognised for your expertise.</p>\n\n\n<p>So in the end, two good points, two potential (but minor) negatives, and one\ncompletely unjustifiable and purely selfish negative. Which on the whole is a\ngood thing. Though having come back to this after a break for a couple of years,\nI find I am even more sceptical of the whole process than I was. 
More later\nperhaps, once I get my thoughts in order.</p>",+"content": "<p>I do not participate in a huge number of TPCs (Technical Programme Committees)\nas a general rule \u2013 partly time constraints but mostly no-one knows who I am so\nI don\u2019t often get asked\u2026 (!)</p>\n<p>I have done a few though, some big (e.g., <a href=\"https://www.usenix.org/conference/nsdi15\">USENIX NSDI</a>, <a href=\"https://conferences.sigcomm.org/imc/2018/\">ACM IMC</a>),\nsome small (<a href=\"https://uksystems.org/\">UK Systems</a>,\n<a href=\"https://link.springer.com/conference/pam\">PAM</a>), and perhaps because I only do\na couple every few years, while doing <a href=\"https://conferences2.sigcomm.org/co-next/\">ACM\nCoNEXT</a> and <a href=\"https://acm-ieee-sec.org/list/\">ACM/IEEE\nSEC</a> this week, I found myself particularly\nnoticing some changes in practice since the last TPCs I recall (notably\n<a href=\"https://www.usenix.org/conference/nsdi15\">NSDI</a> and <a href=\"https://conferences.sigcomm.org/imc/2018/\">IMC</a>). So here are three observations.</p>\n<h2><a href=\"https://mort.io/blog/post-covid-tpc/#1-online-first-and-only\">1. Online, first and only</a></h2>\n<p>The biggest obvious change is that TPC meetings are now online rather than\nin-person. This has one big disadvantage for me: I really enjoyed travelling to\nthe meeting to meet colleagues and (usually) participate in some TPC-oriented\nworkshop with presentations of recent and in-progress work. In many ways, I found\nthis sort of activity more interesting than the conference itself (sorry!). It\nalso has the unfortunate effect that, at least for international TPCs, timezones\nmake scheduling tricky \u2013 one benefit of travelling was that at least the\nmeeting took place in localtime for (almost) everyone.</p>\n<p>However, it also has clear benefits: the CO2 footprint of the event is\ndramatically reduced, which can only be a good thing. 
The financial cost\nreduction probably also opens up the experience to attendees who would never\npreviously have been able to make it. I\u2019ve rarely seen really poor meeting\nbehaviour on the TPCs I\u2019ve been involved in, but I also find that in an online\nmeeting, chairing tends to be more easily kept rigorous, and the sometimes\ndominating effect of a single confident (perhaps I might say over-confident, or\neven just loud) individual is significantly reduced. Which is good.</p>\n<h2><a href=\"https://mort.io/blog/post-covid-tpc/#2-offline-dominance\">2. Offline dominance</a></h2>\n<p>I now see <strong>dramatically</strong> more use being made of commenting and discussions in\n<a href=\"https://read.seas.harvard.edu/~kohler/hotcrp/\">HotCRP</a>, which remains the only\nconference management and paper reviewing platform I will willingly use (thanks\n<a href=\"https://en.wikipedia.org/wiki/Eddie_Kohler\">Eddie</a>!). I don\u2019t know whether this\nis a post-pandemic effect, or just the fact that the old old guard has largely\nshuffled out of active TPC duties and we now have a new old guard (i.e., people\nof my era) and younger who are perhaps happier to communicate and express views\nwithout needing to be in the same room.</p>\n<p>I think this is another dramatic improvement in process, and puts us in a\nsimilar place to how, for example, we handled marking final year undergraduate\nprojects at <a href=\"https://www.nottingham.ac.uk/\">Nottingham University</a>. It means\nthat discussion is recorded, and usually more coherently and explicitly argued\ndue to the need to write it down.</p>\n<h2><a href=\"https://mort.io/blog/post-covid-tpc/#3-reduction-in-extremism\">3. 
Reduction in extremism</a></h2>\n<p>The one arguably slightly negative comment I would have is \u2013 and this is only\nmy anecdotal impression, and not something I can pretend to have data on \u2013 that\nI think I see a tendency for reviewers to perhaps be a little less clear in\ntheir scoring. Once the across the board strong rejects (a fair number) and\nstrong accepts (a much smaller number) were taken out, I saw an awful lot of\nweak reject (but happy to consider weak accept after discussion) and weak accept\n(but happy to consider weak reject after discussion). This made it quite hard\nfor me, as another reviewer, sometimes to get a clear signal as to what the\nother reviews were recommending.</p>\n<p>This is a challenge I sometimes see in admissions interviewing: less experienced\ninterviewers are sometimes reluctant to give a clear signal, exhibiting instead\na tendency to score near or perhaps just above the middle of the scale. (I\nremember doing this myself.) This feels more comfortable \u2013 putting one\u2019s head\nabove the parapet by taking a clear stance often feels more socially awkward\nthan giving a \u201cfine, ok, pretty good, sure\u201d response, not least because it needs\nstronger supporting argument \u2013 but in the end I think it misses the point of\nbeing an interviewer / a TPC member which is exactly to accept or reject papers\nbased on having been recognised for your expertise.</p>\n\n\n<p>So in the end, two good points, two potential (but minor) negatives, and one\ncompletely unjustifiable and purely selfish negative. Which on the whole is a\ngood thing. Though having come back to this after a break for a couple of years,\nI find I am even more sceptical of the whole process than I was. More later\nperhaps, once I get my thoughts in order.</p>",
+18
mort/blog_quelle-dommage_.json
···+"summary": "<blockquote>\n<p>Ed: this tool is perhaps less relevant now that both\n<a href=\"https://mirage.io/\">Mirage</a> and <a href=\"https://ocaml.org/opam/\">OPAM</a> have moved\non. But perhaps it\u2019ll be resurrected one day so here it is.</p>\n</blockquote>\n<p>Largely because I wanted to make a feeble attempt at a French pun,\n<a href=\"https://github.com/mor1/dommage/\"><code>dommage</code></a> is a tool for\n<a href=\"https://docker.com/\">Docker</a> containerising Mirage unikernels. From the\n<a href=\"https://github.com/mor1/dommage\">README</a>:</p>\n<h2><a href=\"https://mort.io/blog/quelle-dommage/#dommage-dockerised-mirage\">Dommage, Dockerised Mirage</a></h2>\n<p><code>dommage</code> is a shell script that wraps the <a href=\"https://mirage.io\">Mirage</a> CLI to make use of Docker\ncontainers meaning that:</p>\n<ul>\n<li>you can cache the OPAM build artefacts in the container image, speeding up\nlocal builds;</li>\n<li>you can re-use the build container image in Travis builds by publishing it,\nspeeding those up considerably; and</li>\n<li>you can easily test build <code>-t xen</code> targets on OSX.</li>\n</ul>\n<p>I\u2019ve tried to minimise interference with the normal operation of <a href=\"https://mirage.io\">Mirage</a> CLI so\nsimply replacing <code>mirage</code> with <code>dommage</code> is supposed to work. 
To publish the\nresulting container image, <code>dommage publish <image></code>.</p>\n<p>Issues, comments, suggestions and bug fixes all welcome!</p>\n<h3><a href=\"https://mort.io/blog/quelle-dommage/#operation\">Operation</a></h3>\n<p>To start, <code>dommage</code> provides a few management commands to manipulate the build\ncontainer:</p>\n<ul>\n<li><code>dommage init BASE-IMAGE</code> creates a new container, based off <code>BASE-IMAGE</code>\nfrom the <a href=\"https://hub.docker.com\">Docker Hub</a></li>\n<li><code>dommage publish IMAGE</code> commits the current container and pushes it to\n<a href=\"https://hub.docker.com\">Docker Hub</a> as <code>IMAGE</code></li>\n<li><code>dommage destroy</code> stops and removes the current build container</li>\n<li><code>dommage run ...</code> executes a command inside the current build container</li>\n</ul>\n<p>In addition, it wraps the main <a href=\"https://mirage.io\">Mirage</a> CLI commands:</p>\n<ul>\n<li><code>dommage configure ...</code> runs <code>mirage configure ... && make depends</code> inside\nthe build container</li>\n<li><code>dommage build ...</code> runs <code>mirage build ...</code> inside the build container</li>\n<li><code>dommage clean ...</code> runs <code>mirage clean ...</code> inside the build container</li>\n</ul>",+"content": "<blockquote>\n<p>Ed: this tool is perhaps less relevant now that both\n<a href=\"https://mirage.io/\">Mirage</a> and <a href=\"https://ocaml.org/opam/\">OPAM</a> have moved\non. But perhaps it\u2019ll be resurrected one day so here it is.</p>\n</blockquote>\n<p>Largely because I wanted to make a feeble attempt at a French pun,\n<a href=\"https://github.com/mor1/dommage/\"><code>dommage</code></a> is a tool for\n<a href=\"https://docker.com/\">Docker</a> containerising Mirage unikernels. 
From the\n<a href=\"https://github.com/mor1/dommage\">README</a>:</p>\n<h2><a href=\"https://mort.io/blog/quelle-dommage/#dommage-dockerised-mirage\">Dommage, Dockerised Mirage</a></h2>\n<p><code>dommage</code> is a shell script that wraps the <a href=\"https://mirage.io\">Mirage</a> CLI to make use of Docker\ncontainers meaning that:</p>\n<ul>\n<li>you can cache the OPAM build artefacts in the container image, speeding up\nlocal builds;</li>\n<li>you can re-use the build container image in Travis builds by publishing it,\nspeeding those up considerably; and</li>\n<li>you can easily test build <code>-t xen</code> targets on OSX.</li>\n</ul>\n<p>I\u2019ve tried to minimise interference with the normal operation of <a href=\"https://mirage.io\">Mirage</a> CLI so\nsimply replacing <code>mirage</code> with <code>dommage</code> is supposed to work. To publish the\nresulting container image, <code>dommage publish <image></code>.</p>\n<p>Issues, comments, suggestions and bug fixes all welcome!</p>\n<h3><a href=\"https://mort.io/blog/quelle-dommage/#operation\">Operation</a></h3>\n<p>To start, <code>dommage</code> provides a few management commands to manipulate the build\ncontainer:</p>\n<ul>\n<li><code>dommage init BASE-IMAGE</code> creates a new container, based off <code>BASE-IMAGE</code>\nfrom the <a href=\"https://hub.docker.com\">Docker Hub</a></li>\n<li><code>dommage publish IMAGE</code> commits the current container and pushes it to\n<a href=\"https://hub.docker.com\">Docker Hub</a> as <code>IMAGE</code></li>\n<li><code>dommage destroy</code> stops and removes the current build container</li>\n<li><code>dommage run ...</code> executes a command inside the current build container</li>\n</ul>\n<p>In addition, it wraps the main <a href=\"https://mirage.io\">Mirage</a> CLI commands:</p>\n<ul>\n<li><code>dommage configure ...</code> runs <code>mirage configure ... 
&& make depends</code> inside\nthe build container</li>\n<li><code>dommage build ...</code> runs <code>mirage build ...</code> inside the build container</li>\n<li><code>dommage clean ...</code> runs <code>mirage clean ...</code> inside the build container</li>\n</ul>",
+18
mort/blog_reinstall-maestral_.json
···+"summary": "<p>A short one this; just a crib of commands to redo <code>maestral</code> (unofficial\n<a href=\"https://dropbox.com/\">Dropbox</a> client) configuration.</p>\n<pre><code><span><span><span>#</span></span><span> remove existing configuration</span><span>\n</span></span><span><span><span>rm</span></span><span> <span><span>~</span></span>/.config/maestral/maestral.ini</span>\n</span><span><span><span>#</span></span><span> appears to fail with `/usr/bin/env 'bash': No such file or directory`</span><span>\n</span></span><span><span><span>maestral</span></span><span> autostart</span> <span>&&</span> <span><span>systemctl</span></span><span><span><span> --</span>user</span> daemon-reload </span>\n</span><span><span><span>#</span></span><span> ...so instead, just remember to do the following on restart</span><span>\n</span></span><span><span><span>maestral</span></span><span> stop</span> <span>&&</span> <span><span>maestral</span></span><span> start </span>\n</span><span><span><span>#</span></span><span> if desired, add autocompletion for `maestral` in bash</span><span>\n</span></span><span><span><span>[[</span> <span><span>$</span><span>(</span><span><span>which</span></span><span> maestral <span>2</span><span>></span>/dev/null</span><span>)</span></span> <span>]]</span></span> <span>&&</span> <span><span>source</span></span><span> <span><</span><span>(</span><span><span>maestral</span></span><span> completion bash</span><span>)</span></span>\n</span></code></pre>",+"content": "<p>A short one this; just a crib of commands to redo <code>maestral</code> (unofficial\n<a href=\"https://dropbox.com/\">Dropbox</a> client) configuration.</p>\n<pre><code><span><span><span>#</span></span><span> remove existing configuration</span><span>\n</span></span><span><span><span>rm</span></span><span> <span><span>~</span></span>/.config/maestral/maestral.ini</span>\n</span><span><span><span>#</span></span><span> appears to fail with `/usr/bin/env 'bash': No such file or 
directory`</span><span>\n</span></span><span><span><span>maestral</span></span><span> autostart</span> <span>&&</span> <span><span>systemctl</span></span><span><span><span> --</span>user</span> daemon-reload </span>\n</span><span><span><span>#</span></span><span> ...so instead, just remember to do the following on restart</span><span>\n</span></span><span><span><span>maestral</span></span><span> stop</span> <span>&&</span> <span><span>maestral</span></span><span> start </span>\n</span><span><span><span>#</span></span><span> if desired, add autocompletion for `maestral` in bash</span><span>\n</span></span><span><span><span>[[</span> <span><span>$</span><span>(</span><span><span>which</span></span><span> maestral <span>2</span><span>></span>/dev/null</span><span>)</span></span> <span>]]</span></span> <span>&&</span> <span><span>source</span></span><span> <span><</span><span>(</span><span><span>maestral</span></span><span> completion bash</span><span>)</span></span>\n</span></code></pre>",
+18
mort/blog_restic-discovery_.json
···+"summary": "<p>I recently had cause to try to recover some files from my\n<a href=\"https://restic.net/\"><code>restic</code></a> backups. These go back for over a year now, and\nI could not remember at which point I\u2019d mistakenly nuked the directory I now\nwanted to recover. <code>restic find</code> purports to be able to do this by searching\nthrough snapshots but I found that it\u2019s quite slow, and can only search within a\ntime range which is not that helpful when you don\u2019t know the time range you\nneed.</p>\n<p>So I did it by hand, which turned out to be rather faster.</p>\n<pre><code><span> <span>RESTIC_PASSWORD_FILE</span><span>=</span><span>/your/backup/password/file</span> <span>RESTIC_REPOSITORY</span><span>=</span><span>/your/backup/repository/</span> <span>\\\n</span></span><span> <span><span>#</span></span><span> list snapshots, filtering by DATE regex, grabbing just the snapshot hash</span><span>\n</span></span><span> <span><span>sudo</span></span><span><span><span> -</span>E</span> restic snapshots<span><span> -</span>c</span> <span>\\\n</span></span></span><span><span></span> <span>|</span> <span><span>rg</span></span><span> DATE <span>\\\n</span></span></span><span><span></span> <span>|</span> <span><span>cut</span></span><span><span><span> -</span>b1-8</span> <span>\\\n</span></span></span><span><span></span> <span>|</span> <span>while</span> <span><span>read</span></span><span> <span>ss</span></span><span>;</span> <span>do</span> \n</span><span> <span><span>echo</span></span><span> <span><span>"</span>=== <span><span>$</span><span>ss</span></span><span>"</span></span></span>\n</span><span> <span><span>sudo</span></span><span><span><span> -</span>E</span> restic ls <span><span>$</span><span>ss</span></span></span> <span>|</span> <span><span>rg</span></span><span> /DIRECTORY/</span> <span>;</span> <span>done</span><span> </span>\n</span></code></pre>\n<p>The key thing with the above approach is that
it\u2019s also quite amenable to\nbisection, which makes it a lot faster.</p>",+"content": "<p>I recently had cause to try to recover some files from my\n<a href=\"https://restic.net/\"><code>restic</code></a> backups. These go back for over a year now, and\nI could not remember at which point I\u2019d mistakenly nuked the directory I now\nwanted to recover. <code>restic find</code> purports to be able to do this by searching\nthrough snapshots but I found that it\u2019s quite slow, and can only search within a\ntime range which is not that helpful when you don\u2019t know the time range you\nneed.</p>\n<p>So I did it by hand, which turned out to be rather faster.</p>\n<pre><code><span> <span>RESTIC_PASSWORD_FILE</span><span>=</span><span>/your/backup/password/file</span> <span>RESTIC_REPOSITORY</span><span>=</span><span>/your/backup/repository/</span> <span>\\\n</span></span><span> <span><span>#</span></span><span> list snapshots, filtering by DATE regex, grabbing just the snapshot hash</span><span>\n</span></span><span> <span><span>sudo</span></span><span><span><span> -</span>E</span> restic snapshots<span><span> -</span>c</span> <span>\\\n</span></span></span><span><span></span> <span>|</span> <span><span>rg</span></span><span> DATE <span>\\\n</span></span></span><span><span></span> <span>|</span> <span><span>cut</span></span><span><span><span> -</span>b1-8</span> <span>\\\n</span></span></span><span><span></span> <span>|</span> <span>while</span> <span><span>read</span></span><span> <span>ss</span></span><span>;</span> <span>do</span> \n</span><span> <span><span>echo</span></span><span> <span><span>"</span>=== <span><span>$</span><span>ss</span></span><span>"</span></span></span>\n</span><span> <span><span>sudo</span></span><span><span><span> -</span>E</span> restic ls <span><span>$</span><span>ss</span></span></span> <span>|</span> <span><span>rg</span></span><span> /DIRECTORY/</span> <span>;</span> <span>done</span><span> </span>\n</span><span> 
\n</span></code></pre>\n<p>The key thing with the above approach is that it\u2019s also quite amenable to\nbisection, which makes it a lot faster.</p>",
+18
mort/blog_reverse-find_.json
···+"summary": "<p>In the last few days I discovered I needed to search back up the filesystem from\n<code>$CWD</code> to find the first occurrence of a file (specifically, a <code>Justfile</code> but\nthat\u2019s by-the-by). Got bored of doing it by hand so wrote a\n<a href=\"https://www.gnu.org/software/bash/\"><code>bash</code></a> shell function; here \u2019tis:</p>\n<pre><code><span><span><span>rf</span> <span>(</span><span>)</span> <span>{</span>\n</span></span><span><span><span> <span>local</span> <span>D</span></span>\n</span></span><span><span> <span>while</span> <span><span>!</span></span><span> eza<span><span> -</span>l</span> <span><span>"</span><span><span>$</span><span>{</span></span><span><span>D</span></span><span><span>:=</span></span><span>.</span><span><span>}</span></span>/<span><span>$</span><span>1</span></span><span>"</span></span></span><span>;</span> <span>do</span> <span><span>#</span></span><span> first, check `$CWD`</span><span>\n</span></span></span><span><span> <span>[</span><span> <span><span>"</span><span><span>$</span><span>(</span><span><span>realpath</span></span><span> <span><span>"</span><span><span>$</span><span>D</span></span>/<span><span>$</span><span>1</span></span><span>"</span></span></span><span>)</span></span><span>"</span></span> <span>==</span> <span><span>"</span>/<span><span>$</span><span>1</span></span><span>"</span></span> <span>]</span></span> <span>&&</span> <span>break</span> <span><span>#</span></span><span> stop if we hit `/` already</span><span>\n</span></span></span><span><span> <span>D</span><span>=</span><span><span><span>$</span><span>D</span></span>/..</span> <span><span>#</span></span><span> else, iterate one layer up</span><span>\n</span></span></span><span><span> <span>done</span>\n</span></span><span><span><span>}</span></span>\n</span></code></pre>\n<p>Invoke as (e.g.,) <code>rf Justfile</code>. 
Alternatively, as a one-liner:</p>\n<pre><code><span><span>F</span><span>=</span><span>Justfile</span><span></span><span>;</span> <span>while</span> <span><span>!</span></span><span> eza<span><span> -</span>l</span> <span><span>$</span><span>{</span></span><span><span>D</span></span><span><span>:=</span></span><span>.</span><span><span>}</span></span>/<span><span>$</span><span>F</span></span></span><span>;</span> <span>do</span> <span>[</span><span> <span><span>"</span><span><span>$</span><span>(</span><span><span>realpath</span></span><span> <span><span>$</span><span>D</span></span>/<span><span>$</span><span>F</span></span></span><span>)</span></span><span>"</span></span> <span>==</span> <span><span>"</span>/<span><span>$</span><span>F</span></span><span>"</span></span> <span>]</span></span> <span>&&</span> <span>break</span><span>;</span> <span>D</span><span>=</span><span><span><span>$</span><span>D</span></span>/..</span><span></span><span>;</span> <span>done</span><span>;</span> <span><span>unset</span></span><span> D</span>\n</span></code></pre>",+"content": "<p>In the last few days I discovered I needed to search back up the filesystem from\n<code>$CWD</code> to find the first occurrence of a file (specifically, a <code>Justfile</code> but\nthat\u2019s by-the-by). 
Got bored of doing it by hand so wrote a\n<a href=\"https://www.gnu.org/software/bash/\"><code>bash</code></a> shell function; here \u2019tis:</p>\n<pre><code><span><span><span>rf</span> <span>(</span><span>)</span> <span>{</span>\n</span></span><span><span><span> <span>local</span> <span>D</span></span>\n</span></span><span><span> <span>while</span> <span><span>!</span></span><span> eza<span><span> -</span>l</span> <span><span>"</span><span><span>$</span><span>{</span></span><span><span>D</span></span><span><span>:=</span></span><span>.</span><span><span>}</span></span>/<span><span>$</span><span>1</span></span><span>"</span></span></span><span>;</span> <span>do</span> <span><span>#</span></span><span> first, check `$CWD`</span><span>\n</span></span></span><span><span> <span>[</span><span> <span><span>"</span><span><span>$</span><span>(</span><span><span>realpath</span></span><span> <span><span>"</span><span><span>$</span><span>D</span></span>/<span><span>$</span><span>1</span></span><span>"</span></span></span><span>)</span></span><span>"</span></span> <span>==</span> <span><span>"</span>/<span><span>$</span><span>1</span></span><span>"</span></span> <span>]</span></span> <span>&&</span> <span>break</span> <span><span>#</span></span><span> stop if we hit `/` already</span><span>\n</span></span></span><span><span> <span>D</span><span>=</span><span><span><span>$</span><span>D</span></span>/..</span> <span><span>#</span></span><span> else, iterate one layer up</span><span>\n</span></span></span><span><span> <span>done</span>\n</span></span><span><span><span>}</span></span>\n</span></code></pre>\n<p>Invoke as (e.g.,) <code>rf Justfile</code>. 
Alternatively, as a one-liner:</p>\n<pre><code><span><span>F</span><span>=</span><span>Justfile</span><span></span><span>;</span> <span>while</span> <span><span>!</span></span><span> eza<span><span> -</span>l</span> <span><span>$</span><span>{</span></span><span><span>D</span></span><span><span>:=</span></span><span>.</span><span><span>}</span></span>/<span><span>$</span><span>F</span></span></span><span>;</span> <span>do</span> <span>[</span><span> <span><span>"</span><span><span>$</span><span>(</span><span><span>realpath</span></span><span> <span><span>$</span><span>D</span></span>/<span><span>$</span><span>F</span></span></span><span>)</span></span><span>"</span></span> <span>==</span> <span><span>"</span>/<span><span>$</span><span>F</span></span><span>"</span></span> <span>]</span></span> <span>&&</span> <span>break</span><span>;</span> <span>D</span><span>=</span><span><span><span>$</span><span>D</span></span>/..</span><span></span><span>;</span> <span>done</span><span>;</span> <span><span>unset</span></span><span> D</span>\n</span></code></pre>",
+18
mort/blog_sermonising_.json
···+"summary": "<p>Our inestimable and most excellent Chaplain, Revd Dr Helen Orchard, likes to\nhave a theme for the Sunday evensong services for the term. Back in Michaelmas\n2023 it was \u2026 AI. I said I\u2019d help find someone to give a sermon from a\ntechnical perspective but then signally failed to do so (sorry!). So in the end\nI said I\u2019d do it, even though AI is not my thing and I\u2019d never given a sermon\nbefore. Or, for that matter, attended evensong. Take the opportunities offered\nand all that.</p>\n<p>I realised this week that, although a few people at the time had asked for\ncopies, I\u2019d also done nothing about that (I am nothing if not consistently\nrubbish). So here\u2019s the text, more or less as given, on 15 October 2023. Note\nthat the golden eagle I mount is a rather fine lectern in our Chapel (pictured).\nNothing more salacious than that. Filthy minds.</p>\n<p>Three editorial notes given that it\u2019s been over a year and a half since I gave\nthis (my! how time flies\u2026):</p>\n<ol>\n<li>I allude to this but should be clear: the neural network is not the only\ntechnological approach to producing AI \u2013 several others exist and are both\nuseful and used, machine learning being one that\u2019s particularly productive in\nrecent years. However the most hyped was and still seems to be various forms\nof neural network so that\u2019s what I focused on.</li>\n<li>I refer to \u201cstatic datasets\u201d because the versions of ChatGPT at the time were\ntrained infrequently on a given dataset of the moment. Training updates now\nseem much more frequent (perhaps weekly), user context is maintained\nthroughout a chat session, and user feedback sought at the end. 
So while it\u2019s\nstill technically true that the datasets involved are static, it\u2019s much less\nnoticeable.</li>\n<li>The example of \u201cGod save the\u201d worked particularly because this was only about\na year after Queen Elizabeth II died, so \u201cqueen\u201d was likely still the\ninstinctive response of many.</li>\n</ol>\n<p>Finally, just in case it\u2019s not clear \u2013 I tend toward the sceptical end\nregarding AI. Potentially a useful tool in some circumstances but all claims\nabout AGI are nonsense and the singularity won\u2019t happen because of the machines.\nHuman stupidity on the other hand seems without bound. And always follow the\nmoney.</p>\n \n<a href=\"https://www.christs.cam.ac.uk/facilities/chapel\"><img alt=\"A photograph of a fine golden-coloured lectern, the head of which is an eagle\" height=\"1\" src=\"https://mort.io/blog/sermonising/Christ's College Chapel 6.jpg\" width=\"480\"></a>\n<a href=\"https://www.christs.cam.ac.uk/sites/default/files/inline-images/Christ%27s%20College%20Chapel%206.jpg\">Original</a>\n<blockquote>\n<p>As I mount the golden eagle for the first time, I should say that I am not\nnormally given to preaching \u2013 though my children might disagree with that\nstatement \u2013 but as the theme this term is Artificial Intelligence, Helen\nasked me to speak to you about that from the perspective of a computer\nscientist. Unless you catch me in a pub after a couple of pints, I am also not\ngiven to philosophising, so I will limit myself to the physical reality of\n<em>Artificial Intelligence</em>, or <em>AI</em>. Specifically, what is it and what does it\ncost. I will use AIs that generate text as examples, as these so-called <em>Large\nLanguage Models</em> have been the focus of considerable interest in recent\nmonths, but the same basic mechanisms and problems apply to AIs used to\ngenerate images, music, videos and so on.</p>\n<p>First, what is it. 
AI is a catch-all term for a set of technologies that\nattempt to replicate whatever we call \u201cintelligence\u201d. Computer scientists,\ncognitive psychologists and mathematicians have worked on these various\ntechnologies for decades, but the current vogue is very much for a particular\nset of mathematical techniques that try to produce brain-like behaviour by\nmodelling inter-connected neurons.</p>\n<p>Each neuron is stimulated by one or more input signals which it combines to\nproduce an output signal with some probability. The outputs of some neurons\nare connected to the inputs of some other neurons, creating an enormous\nnetwork. The effect in our brains might be that an input signal \u201cI want a\nbiscuit\u201d results in an output signal that causes us to move an arm to pick up\na biscuit. In a modern \u201cgenerative AI\u201d, the input might be a sentence or\nparagraph or two of text, and the resulting output might be an image or a\nsequence of words.</p>\n<p>As a simple example of what I mean, if I asked you to give the next few words\nin the phrase starting \u201cGod save the\u201d you might say \u201cking send him\nvictorious\u201d. You have just performed inference using your own language model,\ngenerating some likely output text given three words of input. I\u2019ll come back\nto that example later.</p>\n</blockquote>\n<blockquote>\n<p>I said the inputs were combined to produce the output with some probability,\nbut how exactly? The process for combining inputs involves a set of parameters\nthat are determined by finding the values that give the best fit to some a priori\ndata. This is known as training if you\u2019re an AI specialist, or parameter\nfitting if you\u2019re a statistician.</p>\n<p>A simple analogy: you may recall that a straight line is defined by two\nparameters, its slope and any point on the line. 
If you had a set of two\ndimensional data points that you thought were straightforwardly related, you\nmight try to discover that relationship by drawing the best straight line you\ncould through them; but which particular line would you think was the best? A\nreasonable choice might be the one that minimised the total distance from the\nline to each point. For an AI the maths is a little more complex, but that\u2019s\nbasically what happens: training finds the parameter values that give the best\nfit to a large set of training data.</p>\n<p>So that\u2019s a modern AI: a statistical model that, when stimulated by one or\nmore inputs, produces outputs with some probability. The inputs might be words\nor images or some other thing, and the outputs might be words or images or\nsome other thing. The underlying model might be wrapped up by other models\nthat, for example, try to filter out undesirable outputs or provide for\ndifferent ways of consuming inputs.</p>\n<p>It is the sheer scale that makes this work: your brain has perhaps 100 billion\nneurons each of which might connect to 10,000 other neurons for a total of\nperhaps one million billion connections, whereas an AI such as a recent\nversion of ChatGPT might have 175 billion parameters but each connected to\njust hundreds of others. The underlying mathematics has been known for\ndecades; it is the combination of massive training datasets and the enormous\ncomputational resources of the cloud that have enabled us to build these AIs.</p>\n</blockquote>\n<blockquote>\n<p>Second, ignoring the hysteria around so-called Artificial General Intelligence\nand The Singularity, what costs do these AIs incur?</p>\n<p>To return to the example I used, I said that you might have completed the\nphrase \u201cGod save the\u201d with the words \u201cking send him victorious\u201d. In some sense\nthat is the \u201ccorrect\u201d completion. But perhaps some of you would have initially\nthought \u201cqueen send her victorious\u201d. 
And I have at least one friend who would\nnaturally respond \u201cqueen and her fascist regime\u201d.</p>\n<p>Human experience is varied and personal \u2013 the training process I described\ntypically uses large static datasets collected by scraping the Internet. While\nthe resulting AI can be configured not always to produce identical outputs\ngiven identical inputs, the training process does naturally lead to a kind of\nhomogenisation. Simplistically, if your group is not represented in that\ntraining dataset, its experience will not be represented in the AI and thus\nwill not be reproduced in the output. Worse, if the training data contains\nmisrepresentations or attacks on your group, the AI will by default capture\nand perpetuate them, already observed to be a particular problem for women,\nJews, and many minorities.</p>\n<p>Further, I mentioned that training data is scraped from the Internet \u2013 but as\nthe musical Avenue Q famously put it, \u201cthe Internet is for porn\u201d. A lot of\nthat text is rather fantastical and describes actions generally unacceptable\nin polite society, so the companies producing and operating AIs try to create\nguardrails by building other models that filter offensive outputs generated by\ntheir AIs \u2013 but how do you train such a model? You need to start with\nexamples of offensive output that are labelled as such so that you can train a\nmodel to differentiate between what is offensive and what is inoffensive. But\ncreating that labelled data involves human labour. For example, OpenAI were\nreported as outsourcing this activity to workers in Kenya paid less than $2\nper day to label perhaps 200 paragraphs per day of offensive input text with\nthe type of offensiveness: rape, torture, incest, and so on. Unpleasant and\npsychologically damaging work.</p>\n</blockquote>\n<blockquote>\n<p>There are also more practical problems posed by the resources used to create\nand operate AIs. 
In particular, energy and water.</p>\n<p>It takes a lot of computation to train and operate a large popular AI \u2013\nOpenAI reported about three and a half thousand petaflops-per-second-days in\n2020 to train their GPT model, where a petaflop represents a million billion\ncomputations. That is, about 10 years of a computer running at one petaflop\nper second. For comparison, your phone might achieve 0.1% of that performance.\nBut as the bumper sticker has it, the cloud is just someone else\u2019s computer \u2013\nin the case of a training run for a large AI model, several hundred thousand\ncomputers in a datacenter. For example, Microsoft\u2019s Iowa datacenter was built\nout for training models for OpenAI and has 285,000 standard processor cores\nand 10,000 GPUs (more powerful and power-hungry processors that you might be\nfamiliar with using if you\u2019re a gamer).</p>\n<p>This means CO2 from the energy to power the computers plus water to\ncool them. How much? Well, estimates computed for earlier, smaller, models put\nthe CO2 footprint of a single training run at roughly the same as a\nround-trip flight from New York to San Francisco. Once trained, individual\nqueries are comparatively cheap \u2013 but ChatGPT experienced the fastest ever\ngrowth of an Internet service. Earlier this year it was estimated as serving\nhundreds of millions of queries per day resulting in power consumption of\nperhaps 1 gigawatt-hour each day \u2013 the equivalent of 33,000 American\nhouseholds.</p>\n<p>As for water, Microsoft has reported that its global water usage increased 34%\nfrom 2021 to 2022; Google\u2019s increased 20% in the same period, but from a\nhigher baseline. The increase is believed to be substantially due to training\nand operating AI. 
A group from University of California at Riverside estimate\nthat each \u201cconversation\u201d with ChatGPT uses, directly and indirectly, about a\npint of water \u2013 and this generally needs to be clean drinking water that will\nnot leave residues that clog systems. The month before GPT-4 training was\ncompleted, Microsoft\u2019s Iowa datacenters consumed 11.5 million gallons, about\n6% of the district\u2019s drinking water. The amounts vary based on season and\nlocation of the datacenter but it seems clear that water consumption is very\nsubstantial and could impact local communities and ecosystems. And of course,\nthere is a tension here: cheap and green solar energy improves the carbon\nfootprint but the associated higher temperatures usually also worsen the\nwater footprint as more cooling is required.</p>\n</blockquote>\n<blockquote>\n<p>So there\u2019s a view of AI \u2013 an impressive set of mathematical and computational\ntechniques that can recreate some human behaviours to some extent in some\ncircumstances, at significant practical and moral cost. My own view is\nthreefold.</p>\n<p>First, using the phrase \u201cArtificial Intelligence\u201d to describe these\ntechnologies, rather than something less emotive such as Computationally\nIntensive Statistics, inevitably generates a very strong hype cycle, and we\nare currently at a point in that cycle where a welcome degree of scepticism is\nstarting to come in and people are more actively questioning what exactly\nthese technologies can and can\u2019t do.</p>\n<p>Second, we have largely proceeded to date without concern for any of the costs\nI discussed earlier, and \u2013 also welcome \u2013 that is changing: the costs are\nsignificant and we cannot ignore them.</p>\n<p>Third, there are interesting legal and economic tussles taking place as to who\nowns the training data, who owns the weights \u2013 that is, the AIs \u2013 produced,\nand by whom and how should AIs be regulated. 
In particular, it is notable that\nmany companies are claiming that there is a need for regulatory barriers to be\nintroduced \u2013 but those are the companies that have already reached a scale\nwhere they can overcome those barriers, so such barriers will serve only to\nkeep newcomers out of the marketplace, entrenching the existing power of \u201cbig\ntech\u201d (OpenAI, Google, Microsoft, Amazon, Meta, etc).</p>\n<p>Finally, as I used the word hysteria earlier to describe hyped fears of\nArtificial General Intelligence and the Singularity \u2013 <strong>please</strong> be sceptical\nof anyone claiming that as a serious existential risk, <strong>particularly</strong> if\nthey are associated with aforementioned \u201cbig tech\u201d! I view most of that\ndiscourse as a \u201cdead cat\u201d strategy, an attempt to distract from the current\nharms they are causing today by pointing to vague, nebulous, yet potentially\ninfinite future harms. For more about the quite startling beliefs of many of\nthose sounding those alarms, I recommend reading about the TESCREAL set of\nideologies \u2013 Transhumanism, Extropianism, Singularitarianism, Cosmism,\nRationalism, Effective Altruism, Longtermism.</p>\n<p>Thank-you.</p>\n</blockquote>\n<h2><a href=\"https://mort.io/blog/sermonising/#references\">References</a></h2>\n<h3><a href=\"https://mort.io/blog/sermonising/#background\">Background</a></h3>\n<ul>\n<li>\u201cLanguage Models are Few-Shot Learners\u201d, OpenAI, 2020.\n<a href=\"https://arxiv.org/abs/2005.14165\">https://arxiv.org/abs/2005.14165</a></li>\n<li>\u201cOn the Dangers of Stochastic Parrots: Can Language Models Be Too Big?\u201d,\nBender et al, FAcct\u201921. 
<a href=\"https://doi.org/10.1145/3442188.3445922\">https://doi.org/10.1145/3442188.3445922</a></li>\n<li>\u201cThe Internet is for porn\u201d, Stephanie D\u2019Abruzzo & Rick Lyon, Avenue Q.\n<a href=\"https://genius.com/Stephanie-dabruzzo-and-rick-lyon-the-internet-is-for-porn-lyrics\">https://genius.com/Stephanie-dabruzzo-and-rick-lyon-the-internet-is-for-porn-lyrics</a></li>\n</ul>\n<h3><a href=\"https://mort.io/blog/sermonising/#hidden-work\">Hidden Work</a></h3>\n<ul>\n<li>\u201cOpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less\nToxic\u201d, Time.com, 2023,\n<a href=\"https://time.com/6247678/openai-chatgpt-kenya-workers/\">https://time.com/6247678/openai-chatgpt-kenya-workers/</a></li>\n<li>\u201cBehind the secretive work of the many, many humans helping to train AI\u201d,\nNPR, 2023.\n<a href=\"https://www.npr.org/2023/06/26/1184392406/behind-the-secretive-work-of-the-many-many-humans-helping-to-train-ai\">https://www.npr.org/2023/06/26/1184392406/behind-the-secretive-work-of-the-many-many-humans-helping-to-train-ai</a></li>\n</ul>\n<h3><a href=\"https://mort.io/blog/sermonising/#energy\">Energy</a></h3>\n<ul>\n<li>\u201cEnergy and Policy Considerations for Deep Learning in NLP\u201d, Strubell et al,\n2019. 
<a href=\"https://arxiv.org/abs/1906.02243\">https://arxiv.org/abs/1906.02243</a></li>\n<li>\u201cTraining a single AI model can emit as much carbon as five cars in their\nlifetimes\u201d, MIT Technology Review, 2019.\n<a href=\"https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/\">https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/</a></li>\n</ul>\n<h3><a href=\"https://mort.io/blog/sermonising/#water\">Water</a></h3>\n<ul>\n<li>\u201cArtificial intelligence technology behind ChatGPT was built in Iowa \u2014 with a\nlot of water\u201d, AP News, 2023.\n<a href=\"https://apnews.com/article/chatgpt-gpt4-iowa-ai-water-consumption-microsoft-f551fde98083d17a7e8d904f8be822c4\">https://apnews.com/article/chatgpt-gpt4-iowa-ai-water-consumption-microsoft-f551fde98083d17a7e8d904f8be822c4</a></li>\n<li>\u201cA.I. tools fueled a 34% spike in Microsoft\u2019s water consumption, and one city\nwith its data centers is concerned about the effect on residential supply\u201d,\nFortune, 2023.\n<a href=\"https://fortune.com/2023/09/09/ai-chatgpt-usage-fuels-spike-in-microsoft-water-consumption/\">https://fortune.com/2023/09/09/ai-chatgpt-usage-fuels-spike-in-microsoft-water-consumption/</a></li>\n<li>\u201cMaking AI Less \u201cThirsty\u201d: Uncovering and Addressing the Secret Water\nFootprint of AI Models\u201c, Pengfei Li et al, 2023.\n<a href=\"https://arxiv.org/abs/2304.03271\">https://arxiv.org/abs/2304.03271</a></li>\n</ul>",+"content": "<p>Our inestimable and most excellent Chaplain, Revd Dr Helen Orchard, likes to\nhave a theme for the Sunday evensong services for the term. Back in Michaelmas\n2023 it was \u2026 AI. I said I\u2019d help find someone to give a sermon from a\ntechnical perspective but then signally failed to do so (sorry!). 
So in the end\nI said I\u2019d do it, even though AI is not my thing and I\u2019d never given a sermon\nbefore. Or, for that matter, attended evensong. Take the opportunities offered\nand all that.</p>\n<p>I realised this week that, although a few people at the time had asked for\ncopies, I\u2019d also done nothing about that (I am nothing if not consistently\nrubbish). So here\u2019s the text, more or less as given, on 15 October 2023. Note\nthat the golden eagle I mount is a rather fine lectern in our Chapel (pictured).\nNothing more salacious than that. Filthy minds.</p>\n<p>Three editorial notes given that it\u2019s been over a year and a half since I gave\nthis (my! how time flies\u2026):</p>\n<ol>\n<li>I allude to this but should be clear: the neural network is not the only\ntechnological approach to producing AI \u2013 several others exist and are both\nuseful and used, machine learning being one that\u2019s particularly productive in\nrecent years. However the most hyped was and still seems to be various forms\nof neural network so that\u2019s what I focused on.</li>\n<li>I refer to \u201cstatic datasets\u201d because the versions of ChatGPT at the time were\ntrained infrequently on a given dataset of the moment. Training updates now\nseem much more frequent (perhaps weekly), user context is maintained\nthroughout a chat session, and user feedback sought at the end. So while it\u2019s\nstill technically true that the datasets involved are static, it\u2019s much less\nnoticeable.</li>\n<li>The example of \u201cGod save the\u201d worked particularly because this was only about\na year after Queen Elizabeth II died, so \u201cqueen\u201d was likely still the\ninstinctive response of many.</li>\n</ol>\n<p>Finally, just in case it\u2019s not clear \u2013 I tend toward the sceptical end\nregarding AI. 
Potentially a useful tool in some circumstances but all claims\nabout AGI are nonsense and the singularity won\u2019t happen because of the machines.\nHuman stupidity on the other hand seems without bound. And always follow the\nmoney.</p>\n \n<a href=\"https://www.christs.cam.ac.uk/facilities/chapel\"><img alt=\"A photograph of a fine golden-coloured lectern, the head of which is an eagle\" height=\"1\" src=\"https://mort.io/blog/sermonising/Christ's College Chapel 6.jpg\" width=\"480\"></a>\n<a href=\"https://www.christs.cam.ac.uk/sites/default/files/inline-images/Christ%27s%20College%20Chapel%206.jpg\">Original</a>\n<blockquote>\n<p>As I mount the golden eagle for the first time, I should say that I am not\nnormally given to preaching \u2013 though my children might disagree with that\nstatement \u2013 but as the theme this term is Artificial Intelligence, Helen\nasked me to speak to you about that from the perspective of a computer\nscientist. Unless you catch me in a pub after a couple of pints, I am also not\ngiven to philosophising, so I will limit myself to the physical reality of\n<em>Artificial Intelligence</em>, or <em>AI</em>. Specifically, what is it and what does it\ncost. I will use AIs that generate text as examples, as these so-called <em>Large\nLanguage Models</em> have been the focus of considerable interest in recent\nmonths, but the same basic mechanisms and problems apply to AIs used to\ngenerate images, music, videos and so on.</p>\n<p>First, what is it. AI is a catch-all term for a set of technologies that\nattempt to replicate whatever we call \u201cintelligence\u201d. 
Computer scientists,\ncognitive psychologists and mathematicians have worked on these various\ntechnologies for decades, but the current vogue is very much for a particular\nset of mathematical techniques that try to produce brain-like behaviour by\nmodelling inter-connected neurons.</p>\n<p>Each neuron is stimulated by one or more input signals which it combines to\nproduce an output signal with some probability. The outputs of some neurons\nare connected to the inputs of some other neurons, creating an enormous\nnetwork. The effect in our brains might be that an input signal \u201cI want a\nbiscuit\u201d results in an output signal that causes us to move an arm to pick up\na biscuit. In a modern \u201cgenerative AI\u201d, the input might be a sentence or\nparagraph or two of text, and the resulting output might be an image or a\nsequence of words.</p>\n<p>As a simple example of what I mean, if I asked you to give the next few words\nin the phrase starting \u201cGod save the\u201d you might say \u201cking send him\nvictorious\u201d. You have just performed inference using your own language model,\ngenerating some likely output text given three words of input. I\u2019ll come back\nto that example later.</p>\n</blockquote>\n<blockquote>\n<p>I said the inputs were combined to produce the output with some probability,\nbut how exactly? The process for combining inputs involves a set of parameters\nthat are determined by finding the values that give the best fit to some a priori\ndata. This is known as training if you\u2019re an AI specialist, or parameter\nfitting if you\u2019re a statistician.</p>\n<p>A simple analogy: you may recall that a straight line is defined by two\nparameters, its slope and any point on the line. 
If you had a set of two\ndimensional data points that you thought were straightforwardly related, you\nmight try to discover that relationship by drawing the best straight line you\ncould through them; but which particular line would you think was the best? A\nreasonable choice might be the one that minimised the total distance from the\nline to each point. For an AI the maths is a little more complex, but that\u2019s\nbasically what happens: training finds the parameter values that give the best\nfit to a large set of training data.</p>\n<p>So that\u2019s a modern AI: a statistical model that, when stimulated by one or\nmore inputs, produces outputs with some probability. The inputs might be words\nor images or some other thing, and the outputs might be words or images or\nsome other thing. The underlying model might be wrapped up by other models\nthat, for example, try to filter out undesirable outputs or provide for\ndifferent ways of consuming inputs.</p>\n<p>It is the sheer scale that makes this work: your brain has perhaps 100 billion\nneurons each of which might connect to 10,000 other neurons for a total of\nperhaps one million billion connections, whereas an AI such as a recent\nversion of ChatGPT might have 175 billion parameters but each connected to\njust hundreds of others. The underlying mathematics has been known for\ndecades; it is the combination of massive training datasets and the enormous\ncomputational resources of the cloud that have enabled us to build these AIs.</p>\n</blockquote>\n<blockquote>\n<p>Second, ignoring the hysteria around so-called Artificial General Intelligence\nand The Singularity, what costs do these AIs incur?</p>\n<p>To return to the example I used, I said that you might have completed the\nphrase \u201cGod save the\u201d with the words \u201cking send him victorious\u201d. In some sense\nthat is the \u201ccorrect\u201d completion. But perhaps some of you would have initially\nthought \u201cqueen send her victorious\u201d. 
And I have at least one friend who would\nnaturally respond \u201cqueen and her fascist regime\u201d.</p>\n<p>Human experience is varied and personal \u2013 the training process I described\ntypically uses large static datasets collected by scraping the Internet. While\nthe resulting AI can be configured not always to produce identical outputs\ngiven identical inputs, the training process does naturally lead to a kind of\nhomogenisation. Simplistically, if your group is not represented in that\ntraining dataset, its experience will not be represented in the AI and thus\nwill not be reproduced in the output. Worse, if the training data contains\nmisrepresentations or attacks on your group, the AI will by default capture\nand perpetuate them, already observed to be a particular problem for women,\nJews, and many minorities.</p>\n<p>Further, I mentioned that training data is scraped from the Internet \u2013 but as\nthe musical Avenue Q famously put it, \u201cthe Internet is for porn\u201d. A lot of\nthat text is rather fantastical and describes actions generally unacceptable\nin polite society, so the companies producing and operating AIs try to create\nguardrails by building other models that filter offensive outputs generated by\ntheir AIs \u2013 but how do you train such a model? You need to start with\nexamples of offensive output that are labelled as such so that you can train a\nmodel to differentiate between what is offensive and what is inoffensive. But\ncreating that labelled data involves human labour. For example, OpenAI were\nreported as outsourcing this activity to workers in Kenya paid less than $2\nper day to label perhaps 200 paragraphs per day of offensive input text with\nthe type of offensiveness: rape, torture, incest, and so on. Unpleasant and\npsychologically damaging work.</p>\n</blockquote>\n<blockquote>\n<p>There are also more practical problems posed by the resources used to create\nand operate AIs. 
In particular, energy and water.</p>\n<p>It takes a lot of computation to train and operate a large popular AI \u2013\nOpenAI reported about three and a half thousand petaflops-per-second-days in\n2020 to train their GPT model, where a petaflop represents a million billion\ncomputations. That is, about 10 years of a computer running at one petaflop\nper second. For comparison, your phone might achieve 0.1% of that performance.\nBut as the bumper sticker has it, the cloud is just someone else\u2019s computer \u2013\nin the case of a training run for a large AI model, several hundred thousand\ncomputers in a datacenter. For example, Microsoft\u2019s Iowa datacenter was built\nout for training models for OpenAI and has 285,000 standard processor cores\nand 10,000 GPUs (more powerful and power-hungry processors that you might be\nfamiliar with using if you\u2019re a gamer).</p>\n<p>This means CO2 from the energy to power the computers plus water to\ncool them. How much? Well, estimates computed for earlier, smaller, models put\nthe CO2 footprint of a single training run at roughly the same as a\nround-trip flight from New York to San Francisco. Once trained, individual\nqueries are comparatively cheap \u2013 but ChatGPT experienced the fastest ever\ngrowth of an Internet service. Earlier this year it was estimated as serving\nhundreds of millions of queries per day resulting in power consumption of\nperhaps 1 gigawatt-hour each day \u2013 the equivalent of 33,000 American\nhouseholds.</p>\n<p>As for water, Microsoft has reported that its global water usage increased 34%\nfrom 2021 to 2022; Google\u2019s increased 20% in the same period, but from a\nhigher baseline. The increase is believed to be substantially due to training\nand operating AI. 
A group from University of California at Riverside estimate\nthat each \u201cconversation\u201d with ChatGPT uses, directly and indirectly, about a\npint of water \u2013 and this generally needs to be clean drinking water that will\nnot leave residues that clog systems. The month before GPT-4 training was\ncompleted, Microsoft\u2019s Iowa datacenters consumed 11.5 million gallons, about\n6% of the district\u2019s drinking water. The amounts vary based on season and\nlocation of the datacenter but it seems clear that water consumption is very\nsubstantial and could impact local communities and ecosystems. And of course,\nthere is a tension here: cheap and green solar energy improves the carbon\nfootprint but the associated higher temperatures usually also worsen the\nwater footprint as more cooling is required.</p>\n</blockquote>\n<blockquote>\n<p>So there\u2019s a view of AI \u2013 an impressive set of mathematical and computational\ntechniques that can recreate some human behaviours to some extent in some\ncircumstances, at significant practical and moral cost. My own view is\nthreefold.</p>\n<p>First, using the phrase \u201cArtificial Intelligence\u201d to describe these\ntechnologies, rather than something less emotive such as Computationally\nIntensive Statistics, inevitably generates a very strong hype cycle, and we\nare currently at a point in that cycle where a welcome degree of scepticism is\nstarting to come in and people are more actively questioning what exactly\nthese technologies can and can\u2019t do.</p>\n<p>Second, we have largely proceeded to date without concern for any of the costs\nI discussed earlier, and \u2013 also welcome \u2013 that is changing: the costs are\nsignificant and we cannot ignore them.</p>\n<p>Third, there are interesting legal and economic tussles taking place as to who\nowns the training data, who owns the weights \u2013 that is, the AIs \u2013 produced,\nand by whom and how should AIs be regulated. 
In particular, it is notable that\nmany companies are claiming that there is a need for regulatory barriers to be\nintroduced \u2013 but those are the companies that have already reached a scale\nwhere they can overcome those barriers, so such barriers will serve only to\nkeep newcomers out of the marketplace, entrenching the existing power of \u201cbig\ntech\u201d (OpenAI, Google, Microsoft, Amazon, Meta, etc).</p>\n<p>Finally, as I used the word hysteria earlier to describe hyped fears of\nArtificial General Intelligence and the Singularity \u2013 <strong>please</strong> be sceptical\nof anyone claiming that as a serious existential risk, <strong>particularly</strong> if\nthey are associated with aforementioned \u201cbig tech\u201d! I view most of that\ndiscourse as a \u201cdead cat\u201d strategy, an attempt to distract from the current\nharms they are causing today by pointing to vague, nebulous, yet potentially\ninfinite future harms. For more about the quite startling beliefs of many of\nthose sounding those alarms, I recommend reading about the TESCREAL set of\nideologies \u2013 Transhumanism, Extropianism, Singularitarianism, Cosmism,\nRationalism, Effective Altruism, Longtermism.</p>\n<p>Thank-you.</p>\n</blockquote>\n<h2><a href=\"https://mort.io/blog/sermonising/#references\">References</a></h2>\n<h3><a href=\"https://mort.io/blog/sermonising/#background\">Background</a></h3>\n<ul>\n<li>\u201cLanguage Models are Few-Shot Learners\u201d, OpenAI, 2020.\n<a href=\"https://arxiv.org/abs/2005.14165\">https://arxiv.org/abs/2005.14165</a></li>\n<li>\u201cOn the Dangers of Stochastic Parrots: Can Language Models Be Too Big?\u201d,\nBender et al, FAcct\u201921. 
<a href=\"https://doi.org/10.1145/3442188.3445922\">https://doi.org/10.1145/3442188.3445922</a></li>\n<li>\u201cThe Internet is for porn\u201d, Stephanie D\u2019Abruzzo & Rick Lyon, Avenue Q.\n<a href=\"https://genius.com/Stephanie-dabruzzo-and-rick-lyon-the-internet-is-for-porn-lyrics\">https://genius.com/Stephanie-dabruzzo-and-rick-lyon-the-internet-is-for-porn-lyrics</a></li>\n</ul>\n<h3><a href=\"https://mort.io/blog/sermonising/#hidden-work\">Hidden Work</a></h3>\n<ul>\n<li>\u201cOpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less\nToxic\u201d, Time.com, 2023,\n<a href=\"https://time.com/6247678/openai-chatgpt-kenya-workers/\">https://time.com/6247678/openai-chatgpt-kenya-workers/</a></li>\n<li>\u201cBehind the secretive work of the many, many humans helping to train AI\u201d,\nNPR, 2023.\n<a href=\"https://www.npr.org/2023/06/26/1184392406/behind-the-secretive-work-of-the-many-many-humans-helping-to-train-ai\">https://www.npr.org/2023/06/26/1184392406/behind-the-secretive-work-of-the-many-many-humans-helping-to-train-ai</a></li>\n</ul>\n<h3><a href=\"https://mort.io/blog/sermonising/#energy\">Energy</a></h3>\n<ul>\n<li>\u201cEnergy and Policy Considerations for Deep Learning in NLP\u201d, Strubell et al,\n2019. 
<a href=\"https://arxiv.org/abs/1906.02243\">https://arxiv.org/abs/1906.02243</a></li>\n<li>\u201cTraining a single AI model can emit as much carbon as five cars in their\nlifetimes\u201d, MIT Technology Review, 2019.\n<a href=\"https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/\">https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/</a></li>\n</ul>\n<h3><a href=\"https://mort.io/blog/sermonising/#water\">Water</a></h3>\n<ul>\n<li>\u201cArtificial intelligence technology behind ChatGPT was built in Iowa \u2014 with a\nlot of water\u201d, AP News, 2023.\n<a href=\"https://apnews.com/article/chatgpt-gpt4-iowa-ai-water-consumption-microsoft-f551fde98083d17a7e8d904f8be822c4\">https://apnews.com/article/chatgpt-gpt4-iowa-ai-water-consumption-microsoft-f551fde98083d17a7e8d904f8be822c4</a></li>\n<li>\u201cA.I. tools fueled a 34% spike in Microsoft\u2019s water consumption, and one city\nwith its data centers is concerned about the effect on residential supply\u201d,\nFortune, 2023.\n<a href=\"https://fortune.com/2023/09/09/ai-chatgpt-usage-fuels-spike-in-microsoft-water-consumption/\">https://fortune.com/2023/09/09/ai-chatgpt-usage-fuels-spike-in-microsoft-water-consumption/</a></li>\n<li>\u201cMaking AI Less \u201cThirsty\u201d: Uncovering and Addressing the Secret Water\nFootprint of AI Models\u201c, Pengfei Li et al, 2023.\n<a href=\"https://arxiv.org/abs/2304.03271\">https://arxiv.org/abs/2304.03271</a></li>\n</ul>",
+18
mort/blog_setup-hotcrp_.json
···+"summary": "<p>I once had cause to setup\n<a href=\"https://read.seas.harvard.edu/~kohler/hotcrp/\">HotCRP</a> for local hosting.\nSpecifically on a local Lab-hosted VM image. Some of what follows is specific\nto the CUCL VM hosting service, but I think most of it is HotCRP generic and so\nmay be of use. Anyway, here\u2019s the crib sheet, starting from\n<a href=\"https://mbtech.github.io/Setting-up-hotcrp/\">https://mbtech.github.io/Setting-up-hotcrp/</a>\u2026</p>\n<pre><code><span><span><span>#</span></span><span> setup some variables</span><span>\n</span></span><span><span>YOUR-DOMAIN</span><span>=</span><span><span><span>"</span>hotcrp-test.cl.cam.ac.uk<span>"</span></span></span>\n</span><span><span>YOUR-WORKSHOP</span><span>=</span><span><span><span>"</span>sysws18<span>"</span></span></span>\n</span><span><span>YOUR-PASSWORD</span><span>=</span><span><span><span>"</span>mybestpassword<span>"</span></span></span>\n</span><span><span>YOUR-EMAIL</span><span>=</span><span><span><span>"</span>postmaster@example.com<span>"</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/setup-hotcrp/#lab-specifics\">Lab specifics</a></h2>\n<p>Assume we start from a default Ubuntu template VM, and then\u2026</p>\n<ol>\n<li>Configure the VM</li>\n</ol>\n<pre><code><span><span><span>cl-asuser</span></span><span> passwd <span><span>#</span></span><span> set UNIX password for sudo</span><span>\n</span></span></span><span>\n</span><span><span><span>#</span></span><span> create some space</span><span>\n</span></span><span><span>for</span><span> d <span>in</span> /usr/src/<span>*</span></span> <span>;</span> <span>do</span>\n</span><span><span> <span>export</span> <span>K</span><span>=</span><span><span><span>$</span><span>(</span><span><span>uname</span></span><span><span><span> -</span>r</span></span> <span>|</span> <span><span>sed</span></span><span> 
<span><span>'</span>s/-generic$//<span>'</span></span></span><span>)</span></span></span></span>\n</span><span> <span><span>echo</span></span><span> <span><span>-</span>n</span> <span><span>$</span><span>K</span></span> <span><span>$</span><span>d</span></span> ...</span>\n</span><span> <span><span>case</span> <span><span>$</span><span>d</span></span> <span>in</span>\n</span></span><span><span> </span><span><span><span>"</span>/usr/src/linux-headers-<span><span>$</span><span>K</span></span><span>"</span></span> <span>|</span> <span><span>"</span>/usr/src/linux-headers-<span><span>$</span><span>{</span></span><span><span>K</span></span><span><span>}</span></span>-generic<span>"</span></span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> keep</span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> </span><span><span>*</span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> remove</span>\n</span></span><span><span> <span><span>sudo</span></span><span> rm<span><span> -</span>rf</span> <span><span>$</span><span>d</span></span></span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> <span>esac</span></span>\n</span><span><span>done</span>\n</span><span>\n</span><span><span><span>#</span></span><span> THIS IS UNSAFE! BE CAREFUL! 
IT CALLS `sudo rm -rf`!</span><span>\n</span></span><span><span>for</span><span> d <span>in</span> /lib/modules/<span>*</span></span> <span>;</span> <span>do</span>\n</span><span> <span><span>echo</span></span><span> <span><span>$</span><span>d</span></span> ...</span>\n</span><span> <span><span>case</span> <span><span>$</span><span>d</span></span> <span>in</span>\n</span></span><span><span> </span><span><span><span>"</span>/lib/modules/<span><span>$</span><span>(</span><span><span>uname</span></span><span><span><span> -</span>r</span></span><span>)</span></span><span>"</span></span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> keep</span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> </span><span><span>*</span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> remove</span>\n</span></span><span><span> <span><span>sudo</span></span><span> rm<span><span> -</span>rf</span> <span><span>$</span><span>d</span></span></span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> <span>esac</span></span>\n</span><span><span>done</span>\n</span><span>\n</span><span><span><span>#</span></span><span> if necessary, resize the partition. this shouldn't be necessary with the new</span><span>\n</span></span><span><span><span>#</span></span><span> VM image! 
if you need more than ~1GB space for papers, set up xvdb1</span><span>\n</span></span><span><span><span>sudo</span></span><span> fdisk /dev/xvda <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>p\n</span></span></span><span><span><span>d\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>w\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span><span><span>sudo</span></span><span> partprobe</span>\n</span><span>\n</span><span><span><span>#</span></span><span> resize the default filesystem to use the entire partition</span><span>\n</span></span><span><span><span>sudo</span></span><span> resize2fs /dev/xvda1 <span><span>#</span></span><span> blank SIZE means use whole partition</span><span>\n</span></span></span></code></pre>\n<ol>\n<li>Install packages</li>\n</ol>\n<pre><code><span><span><span>#</span></span><span> sort out packages</span><span>\n</span></span><span><span><span>export</span> <span>TZ</span><span>=</span><span>Europe/London</span></span>\n</span><span><span><span>sudo</span></span><span> apt update</span> <span>&&</span> <span><span>sudo</span></span><span> apt install<span><span> --</span>no-install-recommends</span><span><span> -</span>qq</span><span><span> -</span>yy</span> <span>\\\n</span></span></span><span><span> apache2 <span>\\\n</span></span></span><span><span> ca-certificates <span>\\\n</span></span></span><span><span> git <span>\\\n</span></span></span><span><span> libapache2-mod-php <span>\\\n</span></span></span><span><span> mailutils <span>\\\n</span></span></span><span><span> mysql-server \\</span>\n</span><span> <span><span>php-curl</span></span><span> <span>\\\n</span></span></span><span><span> php-json <span>\\\n</span></span></span><span><span> php-mysql 
<span>\\\n</span></span></span><span><span> poppler-utils <span>\\\n</span></span></span><span><span> postfix <span>\\\n</span></span></span><span><span> zip</span>\n</span></code></pre>\n<ol>\n<li>Configure <code>postfix</code></li>\n</ol>\n<pre><code><span><span><span>#</span></span><span> configure postfix: accept defaults if offered, setup postfix to use ppsw</span><span>\n</span></span><span><span><span>sudo</span></span><span> sed<span><span> -</span>i</span> <span><span>'</span>s/relayhost =/relayhost = ppsw.cam.ac.uk/<span>'</span></span> /etc/postfix/main.cf</span>\n</span><span><span><span>sudo</span></span><span> /etc/init.d/postfix reload</span>\n</span><span><span><span>sudo</span></span><span> systemctl restart postfix.service</span>\n</span><span><span><span>#</span></span><span> test mail sending</span><span>\n</span></span><span><span><span>echo</span></span><span> <span><span>"</span>Test mail from postfix<span>"</span></span></span> <span>|</span> <span><span>mail</span></span><span><span><span> -</span>s</span> <span><span>"</span>Test Postfix<span>"</span></span> <span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>EMAIL</span><span><span>}</span></span></span>\n</span></code></pre>\n<p>For more email help, see\n<a href=\"https://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/specialist-email-advice/sending-email\">https://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/specialist-email-advice/sending-email</a>\nusing <code>YOUR-DOMAIN</code> as mail domain, and <code>ppsw.cam.ac.uk</code> as relay host.</p>\n<ol>\n<li>Install HotCRP</li>\n</ol>\n<p>Get latest release:</p>\n<pre><code><span><span><span>git</span></span><span> clone https://github.com/kohler/hotcrp.git</span>\n</span><span><span><span>cd</span></span><span> hotcrp</span>\n</span><span><span><span>git</span></span><span> checkout tags/v2.101<span><span> -</span>b</span> 
v2.101</span>\n</span></code></pre>\n<ol>\n<li>Setup <code>root</code> account for MySQL</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> /etc/init.d/mysql stop <span><span>#</span></span><span> stop the running service</span><span>\n</span></span></span><span>\n</span><span><span><span>#</span></span><span> configure and run mysql in the console</span><span>\n</span></span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>p</span> /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> chown mysql:mysql /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> mysqld_safe<span><span> --</span>skip-grant-tables</span></span> <span>&</span> <span><span>sleep</span></span><span> 5</span>\n</span><span>\n</span><span><span><span>#</span></span><span> smash a new `root` password in place </span><span>\n</span></span><span><span><span>sudo</span></span><span> mysql</span>\n</span><span><span><span>ALTER</span></span><span> USER <span><span>'</span>root<span>'</span></span>@<span><span>'</span>localhost<span>'</span></span> IDENTIFIED WITH mysql_native_password BY <span><span>'</span>${YOUR-PASSWORD}<span>'</span></span></span><span>;</span> \n</span><span><span><span>FLUSH</span></span><span> PRIVILEGES</span><span>;</span>\n</span><span><span><span>exit</span></span><span>;</span>\n</span><span>\n</span><span><span><span>#</span></span><span> restart mysql properly as a service</span><span>\n</span></span><span><span><span>mysqladmin</span></span><span><span><span> -</span>uroot</span><span><span> -</span>p<span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span></span><span><span> -</span>h127</span>.0.0.1<span><span> --</span>protocol</span><span>=</span>tcp shutdown</span>\n</span><span><span><span>sudo</span></span><span> /etc/init.d/mysql 
start</span>\n</span></code></pre>\n<p>\u2026alternatively</p>\n<pre><code><span><span><span>mysql</span></span><span><span><span> -</span>uroot</span><span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>USE mysql;\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>UPDATE mysql.user SET authentication_string = PASSWORD('<span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span>')\n</span></span></span><span><span><span>WHERE User = 'root' AND Host = 'localhost';\n</span></span></span><span><span><span>FLUSH PRIVILEGES;\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>QUIT\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<ol>\n<li>Secure your MySQL installation</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> systemctl stop mysql</span>\n</span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>p</span> /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> chown mysql:mysql /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> mysqld_safe<span><span> --</span>skip-grant-tables</span><span><span> --</span>skip-networking</span></span> <span>&</span>\n</span><span><span><span>sudo</span></span><span> mysql_secure_installation<span><span> 
-</span>p<span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span></span><span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<ol>\n<li>Setup the HotCRP MySQL tables and config</li>\n</ol>\n<pre><code><span><span><span>lib/createdb.sh</span></span><span><span><span> --</span>user</span><span>=</span>root<span><span> --</span>password</span><span>=</span><span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span> <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>ok\n</span></span></span><span><span><span>YOUR-WORKSHOP\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span>\n</span><span><span><span>#</span></span><span> edit conf/options.php</span><span>\n</span></span><span><span><span>#</span></span><span> - contactName</span><span>\n</span></span><span><span><span>#</span></span><span> - contactEmail</span><span>\n</span></span><span><span><span>#</span></span><span> - sendEmail</span><span>\n</span></span><span><span><span>#</span></span><span> - emailFrom</span><span>\n</span></span><span><span><span>#</span></span><span> - emailSender</span><span>\n</span></span><span><span><span>#</span></span><span> - timezone</span><span>\n</span></span><span><span><span>#</span></span><span> - upload_max_filesize [ if you care ]</span><span>\n</span></span></code></pre>\n<ol>\n<li>Turn on the HotCRP site in your Apache 
configuration</li>\n</ol>\n<pre><code><span><span><span>#</span></span><span> apache2: turn on hotcrp site</span><span>\n</span></span><span><span><span>sudo</span></span><span> sh<span><span> -</span>c</span> <span><span>'</span>cat >>/etc/apache2/conf-available/hotcrp.conf <<_EOF\n</span></span></span><span><span><span><Directory "$(pwd -P)">\n</span></span></span><span><span><span> Options Indexes Includes FollowSymLinks\n</span></span></span><span><span><span> AllowOverride all\n</span></span></span><span><span><span> Require all granted\n</span></span></span><span><span><span></Directory>\n</span></span></span><span><span><span>Alias /YOUR-WORKSHOP $(pwd -P)\n</span></span></span><span><span><span>_EOF\n</span></span></span><span><span><span><span>'</span></span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> a2enconf <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>hotcrp\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> chgrp www-data conf/options.php</span>\n</span><span><span><span>sudo</span></span><span> service apache2 reload</span>\n</span><span><span><span>sudo</span></span><span> apache2ctl graceful</span>\n</span></code></pre>\n<p>\u2026and you should now be able to access your hotcrp site at <a href=\"http://$%7BYOUR-DOMAIN%7D/$%7BYOUR-WORKSHOP%7D\">http://${YOUR-DOMAIN}/${YOUR-WORKSHOP}</a></p>\n<ol>\n<li>Use <a href=\"https://letsencrypt.org/\">Let\u2019s Encrypt</a> to create and configure\ncertificates for HTTPS support</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> apt install<span><span> -</span>yy</span> software-properties-common</span>\n</span><span><span><span>sudo</span></span><span> add-apt-repository ppa:certbot/certbot</span>\n</span><span><span><span>sudo</span></span><span> apt update</span>\n</span><span><span><span>sudo</span></span><span> apt 
install<span><span> -</span>yy</span> certbot-auto</span>\n</span><span><span><span>wget</span></span><span> https://dl.eff.org/certbot-auto</span>\n</span><span><span><span>chmod</span></span><span> a+x ./certbot-auto</span>\n</span><span><span><span>sudo</span></span><span> ./certbot-auto<span><span> -</span>n</span><span><span> --</span>os-packages-only</span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> ./certbot-auto<span><span> -</span>a</span> webroot<span><span> -</span>i</span> apache<span><span> -</span>w</span> <span><span>$</span><span>(</span><span><span>pwd</span></span><span> <span><span>-</span>P</span></span><span>)</span></span> <span>\\\n</span></span></span><span><span><span><span> --</span>agree-tos</span><span><span> --</span>redirect</span><span><span> --</span>uir</span><span><span> --</span>hsts</span><span><span> --</span>staple-ocsp</span> <span>\\\n</span></span></span><span><span><span><span> -</span>d</span> YOUR-DOMAIN<span><span> --</span>email</span> YOUR-EMAIL</span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> ./certbot-auto<span><span> --</span>install-only</span></span>\n</span></code></pre>\n\n<ol>\n<li>Set permissions on the certificate directories</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> chgrp www-data /etc/letsencrypt/live</span>\n</span><span><span><span>sudo</span></span><span> chmod g+rx /etc/letsencrypt/live</span>\n</span><span><span><span>sudo</span></span><span> chgrp www-data /etc/letsencrypt/archive/</span>\n</span><span><span><span>sudo</span></span><span> chmod g+rx /etc/letsencrypt/archive/</span>\n</span></code></pre>\n<p>End state is the Apache config looks something like the following, with\nunindented lines being those I added:</p>\n<pre><code><span><span><span>$</span></span><span> cat /etc/apache2/sites-available/hotcrp.conf</span>\n</span><span><span><</span>IfModule 
<span><span>mod_ssl.c</span></span><span><span>></span>\n</span></span><span>\n</span><span><span><span>SSLStaplingCache</span></span><span> shmcb:/var/run/apache2/stapling_cache(128000</span><span></span>)\n</span><span>\n</span><span>\t<span><</span>VirtualHost <span><span>_default_:443</span></span><span><span>></span>\n</span></span><span>\t\t<span><span>ServerAdmin</span></span><span> webmaster@localhost</span>\n</span><span>\t\t<span><span>DocumentRoot</span></span><span> /home/hotcrp/hotcrp</span>\n</span><span>\t\t<span><span>ErrorLog</span></span><span> <span><span>$</span><span>{</span></span><span><span>APACHE_LOG_DIR</span></span><span><span>}</span></span>/error.log</span>\n</span><span>\t\t<span><span>CustomLog</span></span><span> <span><span>$</span><span>{</span></span><span><span>APACHE_LOG_DIR</span></span><span><span>}</span></span>/access.log combined</span>\n</span><span>\t\t<span><span>SSLEngine</span></span><span> on</span>\n</span><span>\n</span><span><span><span>SSLCACertificateFile</span></span><span> /etc/letsencrypt/live/hotcrp.sysws.org.uk/fullchain.pem</span>\n</span><span><span><span>SSLUseStapling</span></span><span> on</span>\n</span><span>\n</span><span>\t\t<span><</span>FilesMatch <span><span><span><span>"</span>\\.(cgi|shtml|phtml|php)$<span>"</span></span></span></span><span><span>></span>\n</span></span><span>\t\t\t\t<span><span>SSLOptions</span></span><span> +StdEnvVars</span>\n</span><span>\t\t<span><</span>/FilesMatch<span>></span>\n</span><span>\t\t<span><</span>Directory <span><span>/usr/lib/cgi-bin</span></span><span><span>></span>\n</span></span><span>\t\t\t\t<span><span>SSLOptions</span></span><span> +StdEnvVars</span>\n</span><span>\t\t<span><</span>/Directory<span>></span>\n</span><span>\n</span><span><span><span>ServerName</span></span><span> hotcrp.sysws.org.uk</span>\n</span><span><span><span>SSLCertificateFile</span></span><span> 
/etc/letsencrypt/live/hotcrp.sysws.org.uk-0001/fullchain.pem</span>\n</span><span><span><span>SSLCertificateKeyFile</span></span><span> /etc/letsencrypt/live/hotcrp.sysws.org.uk-0001/privkey.pem</span>\n</span><span><span><span>Include</span></span><span> /etc/letsencrypt/options-ssl-apache.conf</span>\n</span><span><span><span>Header</span></span><span> always set Strict-Transport-Security <span><span>"</span>max-age=31536000<span>"</span></span></span>\n</span><span><span><span>Header</span></span><span> always set Content-Security-Policy upgrade-insecure-requests</span>\n</span><span>\n</span><span>\t<span><</span>/VirtualHost<span>></span>\n</span><span><span><</span>/IfModule<span>></span>\n</span></code></pre>\n<ol>\n<li>Add DNS entry for the name assigned (in my case, <code>hotcrp.DOMAIN</code>).</li>\n</ol>",+"content": "<p>I once had cause to setup\n<a href=\"https://read.seas.harvard.edu/~kohler/hotcrp/\">HotCRP</a> for local hosting.\nSpecifically on a local Lab-hosted VM image. Some of what follows is specific\nto the CUCL VM hosting service, but I think most of it is HotCRP generic and so\nmay be of use.
Anyway, here\u2019s the crib sheet, starting from\n<a href=\"https://mbtech.github.io/Setting-up-hotcrp/\">https://mbtech.github.io/Setting-up-hotcrp/</a>\u2026</p>\n<pre><code><span><span><span>#</span></span><span> setup some variables</span><span>\n</span></span><span><span>YOUR-DOMAIN</span><span>=</span><span><span><span>"</span>hotcrp-test.cl.cam.ac.uk<span>"</span></span></span>\n</span><span><span>YOUR-WORKSHOP</span><span>=</span><span><span><span>"</span>sysws18<span>"</span></span></span>\n</span><span><span>YOUR-PASSWORD</span><span>=</span><span><span><span>"</span>mybestpassword<span>"</span></span></span>\n</span><span><span>YOUR-EMAIL</span><span>=</span><span><span><span>"</span>postmaster@example.com<span>"</span></span></span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/setup-hotcrp/#lab-specifics\">Lab specifics</a></h2>\n<p>Assume we start from a default Ubuntu template VM, and then\u2026</p>\n<ol>\n<li>Configure the VM</li>\n</ol>\n<pre><code><span><span><span>cl-asuser</span></span><span> passwd <span><span>#</span></span><span> set UNIX password for sudo</span><span>\n</span></span></span><span>\n</span><span><span><span>#</span></span><span> create some space</span><span>\n</span></span><span><span>for</span><span> d <span>in</span> /usr/src/<span>*</span></span> <span>;</span> <span>do</span>\n</span><span><span> <span>export</span> <span>K</span><span>=</span><span><span><span>$</span><span>(</span><span><span>uname</span></span><span><span><span> -</span>r</span></span> <span>|</span> <span><span>sed</span></span><span> <span><span>'</span>s/-generic$//<span>'</span></span></span><span>)</span></span></span></span>\n</span><span> <span><span>echo</span></span><span> <span><span>-</span>n</span> <span><span>$</span><span>K</span></span> <span><span>$</span><span>d</span></span> ...</span>\n</span><span> <span><span>case</span> <span><span>$</span><span>d</span></span> <span>in</span>\n</span></span><span><span> 
</span><span><span><span>"</span>/usr/src/linux-headers-<span><span>$</span><span>K</span></span><span>"</span></span> <span>|</span> <span><span>"</span>/usr/src/linux-headers-<span><span>$</span><span>{</span></span><span><span>K</span></span><span><span>}</span></span>-generic<span>"</span></span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> keep</span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> </span><span><span>*</span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> remove</span>\n</span></span><span><span> <span><span>sudo</span></span><span> rm<span><span> -</span>rf</span> <span><span>$</span><span>d</span></span></span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> <span>esac</span></span>\n</span><span><span>done</span>\n</span><span>\n</span><span><span><span>#</span></span><span> THIS IS UNSAFE! BE CAREFUL! 
IT CALLS `sudo rm -rf`!</span><span>\n</span></span><span><span>for</span><span> d <span>in</span> /lib/modules/<span>*</span></span> <span>;</span> <span>do</span>\n</span><span> <span><span>echo</span></span><span> <span><span>$</span><span>d</span></span> ...</span>\n</span><span> <span><span>case</span> <span><span>$</span><span>d</span></span> <span>in</span>\n</span></span><span><span> </span><span><span><span>"</span>/lib/modules/<span><span>$</span><span>(</span><span><span>uname</span></span><span><span><span> -</span>r</span></span><span>)</span></span><span>"</span></span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> keep</span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> </span><span><span>*</span> <span><span>)</span></span></span><span>\n</span></span><span><span> <span><span>echo</span></span><span> remove</span>\n</span></span><span><span> <span><span>sudo</span></span><span> rm<span><span> -</span>rf</span> <span><span>$</span><span>d</span></span></span>\n</span></span><span><span> </span><span><span>;;</span></span><span>\n</span></span><span><span> <span>esac</span></span>\n</span><span><span>done</span>\n</span><span>\n</span><span><span><span>#</span></span><span> if necessary, resize the partition. this shouldn't be necessary with the new</span><span>\n</span></span><span><span><span>#</span></span><span> VM image! 
if you need more than ~1GB space for papers, setup xvdb1</span><span>\n</span></span><span><span><span>sudo</span></span><span> fdisk /dev/xvda <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>p\n</span></span></span><span><span><span>d\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>w\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span><span><span>sudo</span></span><span> partprobe</span>\n</span><span>\n</span><span><span><span>#</span></span><span> resize the default filesystem to use the entire partition</span><span>\n</span></span><span><span><span>sudo</span></span><span> resize2fs /dev/xvda1 <span><span>#</span></span><span> blank SIZE means use whole partition</span><span>\n</span></span></span></code></pre>\n<ol>\n<li>Install packages</li>\n</ol>\n<pre><code><span><span><span>#</span></span><span> sort out packages</span><span>\n</span></span><span><span><span>export</span> <span>TZ</span><span>=</span><span>Europe/London</span></span>\n</span><span><span><span>sudo</span></span><span> apt update</span> <span>&&</span> <span><span>sudo</span></span><span> apt install<span><span> --</span>no-install-recommends</span><span><span> -</span>qq</span><span><span> -</span>yy</span> <span>\\\n</span></span></span><span><span> apache2 <span>\\\n</span></span></span><span><span> ca-certificates <span>\\\n</span></span></span><span><span> git <span>\\\n</span></span></span><span><span> libapache2-mod-php <span>\\\n</span></span></span><span><span> mailutils <span>\\\n</span></span></span><span><span> mysql-server <span>\\\n</span></span></span><span><span> php-curl <span>\\\n</span></span></span><span><span> php-json <span>\\\n</span></span></span><span><span> php-mysql 
<span>\\\n</span></span></span><span><span> poppler-utils <span>\\\n</span></span></span><span><span> postfix <span>\\\n</span></span></span><span><span> zip</span>\n</span></code></pre>\n<ol>\n<li>Configure <code>postfix</code></li>\n</ol>\n<pre><code><span><span><span>#</span></span><span> configure postfix: accept defaults if offered, setup postfix to use ppsw</span><span>\n</span></span><span><span><span>sudo</span></span><span> sed<span><span> -</span>i</span> <span><span>'</span>s/relayhost =/relayhost = ppsw.cam.ac.uk/<span>'</span></span> /etc/postfix/main.cf</span>\n</span><span><span><span>sudo</span></span><span> /etc/init.d/postfix reload</span>\n</span><span><span><span>sudo</span></span><span> systemctl restart postfix.service</span>\n</span><span><span><span>#</span></span><span> test mail sending</span><span>\n</span></span><span><span><span>echo</span></span><span> <span><span>"</span>Test mail from postfix<span>"</span></span></span> <span>|</span> <span><span>mail</span></span><span><span><span> -</span>s</span> <span><span>"</span>Test Postfix<span>"</span></span> <span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>EMAIL</span><span><span>}</span></span></span>\n</span></code></pre>\n<p>For more email help, see\n<a href=\"https://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/specialist-email-advice/sending-email\">https://help.uis.cam.ac.uk/email-telephony-and-collaboration/email/specialist-email-advice/sending-email</a>\nusing <code>YOUR-DOMAIN</code> as mail domain, and <code>ppsw.cam.ac.uk</code> as relay host.</p>\n<ol>\n<li>Install HotCRP</li>\n</ol>\n<p>Get latest release:</p>\n<pre><code><span><span><span>git</span></span><span> clone https://github.com/kohler/hotcrp.git</span>\n</span><span><span><span>cd</span></span><span> hotcrp</span>\n</span><span><span><span>git</span></span><span> checkout tags/v2.101<span><span> -</span>b</span> 
v2.101</span>\n</span></code></pre>\n<ol>\n<li>Setup <code>root</code> account for MySQL</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> /etc/init.d/mysql stop <span><span>#</span></span><span> stop the running service</span><span>\n</span></span></span><span>\n</span><span><span><span>#</span></span><span> configure and run mysql in the console</span><span>\n</span></span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>p</span> /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> chown mysql:mysql /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> mysqld_safe<span><span> --</span>skip-grant-tables</span></span> <span>&</span> <span><span>sleep</span></span><span> 5</span>\n</span><span>\n</span><span><span><span>#</span></span><span> smash a new `root` password in place </span><span>\n</span></span><span><span><span>sudo</span></span><span> mysql</span>\n</span><span><span><span>ALTER</span></span><span> USER <span><span>'</span>root<span>'</span></span>@<span><span>'</span>localhost<span>'</span></span> IDENTIFIED WITH mysql_native_password BY <span><span>'</span>${YOUR-PASSWORD}<span>'</span></span></span><span>;</span> \n</span><span><span><span>FLUSH</span></span><span> PRIVILEGES</span><span>;</span>\n</span><span><span><span>exit</span></span><span>;</span>\n</span><span>\n</span><span><span><span>#</span></span><span> restart mysql properly as a service</span><span>\n</span></span><span><span><span>mysqladmin</span></span><span><span><span> -</span>uroot</span><span><span> -</span>p<span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span></span><span><span> -</span>h127</span>.0.0.1<span><span> --</span>protocol</span><span>=</span>tcp shutdown</span>\n</span><span><span><span>sudo</span></span><span> /etc/init.d/mysql 
start</span>\n</span></code></pre>\n<p>\u2026alternatively</p>\n<pre><code><span><span><span>mysql</span></span><span><span><span> -</span>uroot</span><span><span><<</span><span>_EOF</span></span><span>\n</span></span></span><span><span><span>USE mysql;\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>UPDATE mysql.user SET authentication_string = PASSWORD('<span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span>')\n</span></span></span><span><span><span>WHERE User = 'root' AND Host = 'localhost';\n</span></span></span><span><span><span>FLUSH PRIVILEGES;\n</span></span></span><span><span><span>\n</span></span></span><span><span><span>QUIT\n</span></span></span><span><span><span><span>_EOF</span></span></span>\n</span></code></pre>\n<ol>\n<li>Secure your MySQL installation</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> systemctl stop mysql</span>\n</span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>p</span> /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> chown mysql:mysql /var/run/mysqld</span>\n</span><span><span><span>sudo</span></span><span> mysqld_safe<span><span> --</span>skip-grant-tables</span><span><span> --</span>skip-networking</span></span> <span>&</span>\n</span><span><span><span>sudo</span></span><span> mysql_secure_installation<span><span> 
-</span>p<span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span></span><span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>n\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span>y\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span></code></pre>\n<ol>\n<li>Setup the HotCRP MySQL tables and config</li>\n</ol>\n<pre><code><span><span><span>lib/createdb.sh</span></span><span><span><span> --</span>user</span><span>=</span>root<span><span> --</span>password</span><span>=</span><span><span>$</span><span>{</span></span><span><span>YOUR</span></span><span><span>-</span></span><span>PASSWORD</span><span><span>}</span></span> <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>ok\n</span></span></span><span><span><span>YOUR-WORKSHOP\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span>\n</span><span><span><span>#</span></span><span> edit conf/options.php</span><span>\n</span></span><span><span><span>#</span></span><span> - contactName</span><span>\n</span></span><span><span><span>#</span></span><span> - contactEmail</span><span>\n</span></span><span><span><span>#</span></span><span> - sendEmail</span><span>\n</span></span><span><span><span>#</span></span><span> - emailFrom</span><span>\n</span></span><span><span><span>#</span></span><span> - emailSender</span><span>\n</span></span><span><span><span>#</span></span><span> - timezone</span><span>\n</span></span><span><span><span>#</span></span><span> - upload_max_filesize [ if you care ]</span><span>\n</span></span></code></pre>\n<ol>\n<li>Turn on the HotCRP site in your Apache 
configuration</li>\n</ol>\n<pre><code><span><span><span>#</span></span><span> apache2: turn on hotcrp site</span><span>\n</span></span><span><span><span>sudo</span></span><span> sh<span><span> -</span>c</span> <span><span>'</span>cat >>/etc/apache2/conf-available/hotcrp.conf <<_EOF\n</span></span></span><span><span><span><Directory "$(pwd -P)">\n</span></span></span><span><span><span> Options Indexes Includes FollowSymLinks\n</span></span></span><span><span><span> AllowOverride all\n</span></span></span><span><span><span> Require all granted\n</span></span></span><span><span><span></Directory>\n</span></span></span><span><span><span>Alias /YOUR-WORKSHOP $(pwd -P)\n</span></span></span><span><span><span>_EOF\n</span></span></span><span><span><span><span>'</span></span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> a2enconf <span><span><<</span><span>EOF</span></span><span>\n</span></span></span><span><span><span>hotcrp\n</span></span></span><span><span><span><span>EOF</span></span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> chgrp www-data conf/options.php</span>\n</span><span><span><span>sudo</span></span><span> service apache2 reload</span>\n</span><span><span><span>sudo</span></span><span> apache2ctl graceful</span>\n</span></code></pre>\n<p>\u2026and you should now be able to access your hotcrp site at <a href=\"http://$%7BYOUR-DOMAIN%7D/$%7BYOUR-WORKSHOP%7D\">http://${YOUR-DOMAIN}/${YOUR-WORKSHOP}</a></p>\n<ol>\n<li>Use <a href=\"https://letsencrypt.org/\">Let\u2019s Encrypt</a> to create and configure\ncertificates for HTTPS support</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> apt install<span><span> -</span>yy</span> software-properties-common</span>\n</span><span><span><span>sudo</span></span><span> add-apt-repository ppa:certbot/certbot</span>\n</span><span><span><span>sudo</span></span><span> apt update</span>\n</span><span><span><span>sudo</span></span><span> apt 
install<span><span> -</span>yy</span> certbot-auto</span>\n</span><span><span><span>wget</span></span><span> https://dl.eff.org/certbot-auto</span>\n</span><span><span><span>chmod</span></span><span> a+x ./certbot-auto</span>\n</span><span><span><span>sudo</span></span><span> ./certbot-auto<span><span> -</span>n</span><span><span> --</span>os-packages-only</span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> ./certbot-auto<span><span> -</span>a</span> webroot<span><span> -</span>i</span> apache<span><span> -</span>w</span> <span><span>$</span><span>(</span><span><span>pwd</span></span><span> <span><span>-</span>P</span></span><span>)</span></span> <span>\\\n</span></span></span><span><span><span><span> --</span>agree-tos</span><span><span> --</span>redirect</span><span><span> --</span>uir</span><span><span> --</span>hsts</span><span><span> --</span>staple-ocsp</span> <span>\\\n</span></span></span><span><span><span><span> -</span>d</span> YOUR-DOMAIN<span><span> --</span>email</span> YOUR-EMAIL</span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> ./certbot-auto<span><span> --</span>install-only</span></span>\n</span></code></pre>\n\n<ol>\n<li>Set permissions on the certificate directories</li>\n</ol>\n<pre><code><span><span><span>sudo</span></span><span> chgrp www-data /etc/letsencrypt/live</span>\n</span><span><span><span>sudo</span></span><span> chmod g+rx /etc/letsencrypt/live</span>\n</span><span><span><span>sudo</span></span><span> chgrp www-data /etc/letsencrypt/archive/</span>\n</span><span><span><span>sudo</span></span><span> chmod g+rx /etc/letsencrypt/archive/</span>\n</span></code></pre>\n<p>End state is the Apache config looks something like the following, with\nunindented lines being those I added:</p>\n<pre><code><span><span><span>$</span></span><span> cat /etc/apache2/sites-available/hotcrp.conf</span>\n</span><span><span><</span>IfModule 
<span><span>mod_ssl.c</span></span><span><span>></span>\n</span></span><span>\n</span><span><span><span>SSLStaplingCache</span></span><span> shmcb:/var/run/apache2/stapling_cache(128000</span><span></span>)\n</span><span>\n</span><span>\t<span><</span>VirtualHost <span><span>_default_:443</span></span><span><span>></span>\n</span></span><span>\t\t<span><span>ServerAdmin</span></span><span> webmaster@localhost</span>\n</span><span>\t\t<span><span>DocumentRoot</span></span><span> /home/hotcrp/hotcrp</span>\n</span><span>\t\t<span><span>ErrorLog</span></span><span> <span><span>$</span><span>{</span></span><span><span>APACHE_LOG_DIR</span></span><span><span>}</span></span>/error.log</span>\n</span><span>\t\t<span><span>CustomLog</span></span><span> <span><span>$</span><span>{</span></span><span><span>APACHE_LOG_DIR</span></span><span><span>}</span></span>/access.log combined</span>\n</span><span>\t\t<span><span>SSLEngine</span></span><span> on</span>\n</span><span>\n</span><span><span><span>SSLCACertificateFile</span></span><span> /etc/letsencrypt/live/hotcrp.sysws.org.uk/fullchain.pem</span>\n</span><span><span><span>SSLUseStapling</span></span><span> on</span>\n</span><span>\n</span><span>\t\t<span><</span>FilesMatch <span><span><span><span>"</span>\\.(cgi|shtml|phtml|php)$<span>"</span></span></span></span><span><span>></span>\n</span></span><span>\t\t\t\t<span><span>SSLOptions</span></span><span> +StdEnvVars</span>\n</span><span>\t\t<span><</span>/FilesMatch<span>></span>\n</span><span>\t\t<span><</span>Directory <span><span>/usr/lib/cgi-bin</span></span><span><span>></span>\n</span></span><span>\t\t\t\t<span><span>SSLOptions</span></span><span> +StdEnvVars</span>\n</span><span>\t\t<span><</span>/Directory<span>></span>\n</span><span>\n</span><span><span><span>ServerName</span></span><span> hotcrp.sysws.org.uk</span>\n</span><span><span><span>SSLCertificateFile</span></span><span> 
/etc/letsencrypt/live/hotcrp.sysws.org.uk-0001/fullchain.pem</span>\n</span><span><span><span>SSLCertificateKeyFile</span></span><span> /etc/letsencrypt/live/hotcrp.sysws.org.uk-0001/privkey.pem</span>\n</span><span><span><span>Include</span></span><span> /etc/letsencrypt/options-ssl-apache.conf</span>\n</span><span><span><span>Header</span></span><span> always set Strict-Transport-Security <span><span>"</span>max-age=31536000<span>"</span></span></span>\n</span><span><span><span>Header</span></span><span> always set Content-Security-Policy upgrade-insecure-requests</span>\n</span><span>\n</span><span>\t<span><</span>/VirtualHost<span>></span>\n</span><span><span><</span>/IfModule<span>></span>\n</span></code></pre>\n<ol>\n<li>Add DNS entry for the name assigned (in my case, <code>hotcrp.DOMAIN</code>).</li>\n</ol>",
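An editorial aside on the crib sheet above: hyphenated names like `YOUR-DOMAIN` and `YOUR-PASSWORD` are hand-substituted placeholders, not shell variables, because hyphens are not legal in shell identifiers. In particular, `${YOUR-PASSWORD}` is the POSIX default-value expansion (the value of `$YOUR`, or the literal string `PASSWORD` when `$YOUR` is unset), which is easy to trip over if you copy the snippets verbatim. A minimal sketch of the distinction, using an underscored name for the real variable:

```shell
# ${YOUR-DOMAIN} is NOT a variable named "YOUR-DOMAIN": it is the POSIX
# "use default" expansion, i.e. $YOUR if set, else the literal "DOMAIN".
unset YOUR
echo "${YOUR-DOMAIN}"                    # prints the fallback: DOMAIN

# Underscored names are valid identifiers and behave as intended.
YOUR_DOMAIN="hotcrp-test.cl.cam.ac.uk"
echo "${YOUR_DOMAIN}"                    # prints: hotcrp-test.cl.cam.ac.uk
```

So when scripting these steps rather than pasting them, rename the placeholders to `YOUR_DOMAIN`, `YOUR_PASSWORD`, etc.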
+18
mort/blog_setup-new-laptop_.json
···+"summary": "<p>This is the set of things that I roughly did to set up my old new Macbook:\nconfigurations, packages, etc. Not guaranteed complete, but hopefully captures\nmuch of it.</p>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#keyboard\">Keyboard</a></h2>\n<ul>\n<li>Set the touchbar to expanded</li>\n<li>Turn off various emacs conflicts (e.g., <code>S-M-6</code>)</li>\n<li>Use <a href=\"https://karabiner-elements.pqrs.org/\">Karabiner Elements</a> to make\n<code>CAPSLOCK</code> become <code>ESC</code> on a single press</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#homebrew-packages\">Homebrew packages</a></h2>\n<ul>\n<li>Install <a href=\"https://brew.sh/\">Homebrew</a> and then</li>\n</ul>\n<pre><code><span><span><span>#</span></span><span> install packages</span><span>\n</span></span><span><span><span>brew</span></span><span> install <span>\\\n</span></span></span><span><span> aspcud <span>\\\n</span></span></span><span><span> aspell <span>\\\n</span></span></span><span><span> bash-completion <span>\\\n</span></span></span><span><span> coreutils <span>\\\n</span></span></span><span><span> direnv <span>\\\n</span></span></span><span><span> emacs <span>\\\n</span></span></span><span><span> evernote <span>\\\n</span></span></span><span><span> ffmpeg <span>\\\n</span></span></span><span><span> font-hack-nerd-font <span>\\\n</span></span></span><span><span> gawk <span>\\\n</span></span></span><span><span> gcc <span>\\\n</span></span></span><span><span> get_iplayer <span>\\\n</span></span></span><span><span> ghostscript <span>\\\n</span></span></span><span><span> git <span>\\\n</span></span></span><span><span> git-lfs <span>\\\n</span></span></span><span><span> gnupg2 <span>\\\n</span></span></span><span><span> gnuplot <span>\\\n</span></span></span><span><span> gpg-agent <span>\\\n</span></span></span><span><span> graphviz <span>\\\n</span></span></span><span><span> imagemagick <span>\\\n</span></span></span><span><span> jq 
<span>\\\n</span></span></span><span><span> lua <span>\\\n</span></span></span><span><span> mu <span>\\\n</span></span></span><span><span> ncftp <span>\\\n</span></span></span><span><span> nmap <span>\\\n</span></span></span><span><span> ocaml <span>\\\n</span></span></span><span><span> offlineimap <span>\\\n</span></span></span><span><span> omnigraffle <span>\\\n</span></span></span><span><span> opam <span>\\\n</span></span></span><span><span> python <span>\\\n</span></span></span><span><span> python@2 <span>\\\n</span></span></span><span><span> qemu <span>\\\n</span></span></span><span><span> rcs <span>\\\n</span></span></span><span><span> readline <span>\\\n</span></span></span><span><span> rsync <span>\\\n</span></span></span><span><span> socat <span>\\\n</span></span></span><span><span> sshfs <span>\\\n</span></span></span><span><span> telnet <span>\\\n</span></span></span><span><span> tmux <span>\\\n</span></span></span><span><span> unrar <span>\\\n</span></span></span><span><span> wget</span>\n</span></code></pre>\n<pre><code><span><span><span>#</span></span><span> install casks</span><span>\n</span></span><span><span><span>brew</span></span><span> tap homebrew/cask-versions</span>\n</span><span><span><span>brew</span></span><span> cask install <span>\\\n</span></span></span><span><span> adium <span>\\\n</span></span></span><span><span> disk-inventory-x <span>\\\n</span></span></span><span><span> docker <span>\\\n</span></span></span><span><span> dropbox <span>\\\n</span></span></span><span><span> emacs <span>\\\n</span></span></span><span><span> evernote <span>\\\n</span></span></span><span><span> font-hack-nerd-font <span>\\\n</span></span></span><span><span> get-iplayer-automator <span>\\\n</span></span></span><span><span> google-backup-and-sync <span>\\\n</span></span></span><span><span> google-drive-file-stream <span>\\\n</span></span></span><span><span> gpodder <span>\\\n</span></span></span><span><span> inkscape 
<span>\\\n</span></span></span><span><span> iterm2 <span>\\\n</span></span></span><span><span> karabiner-elements <span>\\\n</span></span></span><span><span> keepingyouawake <span>\\\n</span></span></span><span><span> keybase <span>\\\n</span></span></span><span><span> mactex <span>\\\n</span></span></span><span><span> mp3tag <span>\\\n</span></span></span><span><span> musicbrainz-picard <span>\\\n</span></span></span><span><span> omnigraffle6 <span>\\\n</span></span></span><span><span> onedrive <span>\\\n</span></span></span><span><span> osxfuse <span>\\\n</span></span></span><span><span> sharepod <span>\\\n</span></span></span><span><span> signal <span>\\\n</span></span></span><span><span> slack <span>\\\n</span></span></span><span><span> vlc</span>\n</span></code></pre>\n<pre><code><span><span><span>git</span></span><span> lfs install<span><span> --</span>system <span><span>#</span></span><span> enable `git-lfs` system-wide</span><span>\n</span></span></span></span><span><span><span>mu</span></span><span> index<span><span> --</span>rebuild <span><span>#</span></span><span> rebuild `mu` index</span><span>\n</span></span></span></span><span><span><span>opam</span></span><span> init <span><span>#</span></span><span> initialise `opam`</span><span>\n</span></span></span></code></pre>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#podcasts\">Podcasts</a></h2>\n<ul>\n<li>Extract RSS feeds from iTunes:</li>\n</ul>\n<pre><code><span><span><span>grep</span></span><span> Location Library.xml</span> <span>|</span> <span><span>sed</span></span><span> <span><span>'</span>s/[[:space:]]//g<span>'</span></span></span> <span>|</span> <span><span>sort</span></span> <span>|</span> <span>\\\n</span></span><span> <span><span>grep</span></span><span><span><span> -</span>iv</span> <span><span>"</span>\\.\\(mp[34]\\|m4[av]\\)<span>"</span></span></span> <span>|</span> <span><span>grep</span></span><span> http</span> <span>|</span> <span><span>cut</span></span><span><span><span> 
-</span>b</span> 28-</span> <span>|</span> <span>\\\n</span></span><span> <span><span>cut</span></span><span><span><span> -</span>f</span> 1<span><span> -</span>d</span><span><span>"</span><<span>"</span></span> <span>></span> PODCAST.URLS</span>\n</span></code></pre>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#ocal\"></a><a href=\"https://github.com/mor1/ocal\">OCal</a></h2>\n<pre><code><span><span><span>opam</span></span><span> install<span><span> -</span>y</span> ocal</span>\n</span></code></pre>",+"content": "<p>This is the set of things that I roughly did to set up my old new Macbook:\nconfigurations, packages, etc. Not guaranteed complete, but hopefully captures\nmuch of it.</p>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#keyboard\">Keyboard</a></h2>\n<ul>\n<li>Set the touchbar to expanded</li>\n<li>Turn off various emacs conflicts (e.g., <code>S-M-6</code>)</li>\n<li>Use <a href=\"https://karabiner-elements.pqrs.org/\">Karabiner Elements</a> to make\n<code>CAPSLOCK</code> become <code>ESC</code> on a single press</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#homebrew-packages\">Homebrew packages</a></h2>\n<ul>\n<li>Install <a href=\"https://brew.sh/\">Homebrew</a> and then</li>\n</ul>\n<pre><code># install packages\nbrew install \\\n  aspcud \\\n  aspell \\\n  bash-completion \\\n  coreutils \\\n  direnv \\\n  emacs \\\n  evernote \\\n  ffmpeg \\\n  font-hack-nerd-font \\\n  gawk \\\n  gcc \\\n  get_iplayer \\\n  ghostscript \\\n  git \\\n  git-lfs \\\n  gnupg2 \\\n  gnuplot \\\n  gpg-agent \\\n  graphviz \\\n  imagemagick \\\n  jq \\\n  lua \\\n  mu \\\n  ncftp \\\n  nmap \\\n  ocaml \\\n  offlineimap \\\n  omnigraffle \\\n  opam \\\n  python \\\n  python@2 \\\n  qemu \\\n  rcs \\\n  readline \\\n  rsync \\\n  socat \\\n  sshfs \\\n  telnet \\\n  tmux \\\n  unrar \\\n  wget\n</code></pre>\n<pre><code># install casks\nbrew tap homebrew/cask-versions\nbrew cask install \\\n  adium \\\n  disk-inventory-x \\\n  docker \\\n  dropbox \\\n  emacs \\\n  evernote \\\n  font-hack-nerd-font \\\n  get-iplayer-automator \\\n  google-backup-and-sync \\\n  google-drive-file-stream \\\n  gpodder \\\n  inkscape \\\n  iterm2 \\\n  karabiner-elements \\\n  keepingyouawake \\\n  keybase \\\n  mactex \\\n  mp3tag \\\n  musicbrainz-picard \\\n  omnigraffle6 \\\n  onedrive \\\n  osxfuse \\\n  sharepod \\\n  signal \\\n  slack \\\n  vlc\n</code></pre>\n<pre><code>git lfs install --system  # enable `git-lfs` system-wide\nmu index --rebuild        # rebuild `mu` index\nopam init                 # initialise `opam`\n</code></pre>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#podcasts\">Podcasts</a></h2>\n<ul>\n<li>Extract RSS feeds from iTunes:</li>\n</ul>\n<pre><code>grep Location Library.xml | sed 's/[[:space:]]//g' | sort | \\\n  grep -iv "\\.\\(mp[34]\\|m4[av]\\)" | grep http | cut -b 28- | \\\n  cut -f 1 -d"<" > PODCAST.URLS\n</code></pre>\n<h2><a href=\"https://mort.io/blog/setup-new-laptop/#ocal\"></a><a href=\"https://github.com/mor1/ocal\">OCal</a></h2>\n<pre><code>opam install -y ocal\n</code></pre>",
+18
mort/blog_software-folklore_.json
···+"summary": "<p>Truly cursed bugs that have become <a href=\"http://beza1e1.tuxen.de/lore/\">software\nfolklore</a>.</p>",+"content": "<p>Truly cursed bugs that have become <a href=\"http://beza1e1.tuxen.de/lore/\">software\nfolklore</a>.</p>",
+18
mort/blog_someones-following-me_.json
···+"summary": "<p>It\u2019s always<a href=\"https://mort.io/blog/someones-following-me/#1\">1</a> nice when someone notices what you\u2019re doing, and I was\npleasantly surprised recently to find that someone had indeed been watching.<a href=\"https://mort.io/blog/someones-following-me/#2\">2</a></p>\n<div>1\n<p>Well, usually. Unless you\u2019re doing something embarrassing like, e.g.,\ndancing.</p>\n</div>\n<div>2\n<p>Though in fact, given the pattern involved, I suspect it may be my\ncolleague, <a href=\"http://www.eecs.qmul.ac.uk/~hamed/\">Dr Hamed Haddadi</a> who\u2019s\nactually being followed. To wit, the fact that they also picked up on <a href=\"https://www.technologyreview.com/2014/09/12/171400/the-murky-world-of-third-party-web-tracking/\">a\nstudy</a>\nof third-party web-tracking from last year.</p>\n</div>\n<p>Specifically, it appears that someone at MIT Technology Review has noticed the\nwork of me and some of my pals: our work on <a href=\"https://www.technologyreview.com/2015/01/05/169715/the-emerging-science-of-human-data-interaction/\">HDI</a> and a recent sketch of some\nfollow-on work we\u2019re pursuing around building a personal <a href=\"https://www.technologyreview.com/2015/01/26/169495/how-a-box-could-solve-the-personal-data-conundrum/\">Databox</a>.\nIndependently of that, the latter also happened to get picked up by\n<a href=\"http://www.theguardian.com/profile/johnnaughton\">John Naughton</a> in the <a href=\"http://www.theguardian.com/technology/2015/feb/01/control-personal-data-databox-end-user-agreement\">Guardian</a>, which ended up with a pretty\nactive comments thread, including one from Richard Stallman himself\u2026</p>\n<p>Guess we\u2019d better get on and deliver it now then :)</p>",+"content": "<p>It\u2019s always<a href=\"https://mort.io/blog/someones-following-me/#1\">1</a> nice when someone notices what you\u2019re doing, and I was\npleasantly surprised recently to find that someone had indeed been watching.<a 
href=\"https://mort.io/blog/someones-following-me/#2\">2</a></p>\n<div>1\n<p>Well, usually. Unless you\u2019re doing something embarrassing like, e.g.,\ndancing.</p>\n</div>\n<div>2\n<p>Though in fact, given the pattern involved, I suspect it may be my\ncolleague, <a href=\"http://www.eecs.qmul.ac.uk/~hamed/\">Dr Hamed Haddadi</a> who\u2019s\nactually being followed. To wit, the fact that they also picked up on <a href=\"https://www.technologyreview.com/2014/09/12/171400/the-murky-world-of-third-party-web-tracking/\">a\nstudy</a>\nof third-party web-tracking from last year.</p>\n</div>\n<p>Specifically, it appears that someone at MIT Technology Review has noticed the\nwork of me and some of my pals: our work on <a href=\"https://www.technologyreview.com/2015/01/05/169715/the-emerging-science-of-human-data-interaction/\">HDI</a> and a recent sketch of some\nfollow-on work we\u2019re pursuing around building a personal <a href=\"https://www.technologyreview.com/2015/01/26/169495/how-a-box-could-solve-the-personal-data-conundrum/\">Databox</a>.\nIndependently of that, the latter also happened to get picked up by\n<a href=\"http://www.theguardian.com/profile/johnnaughton\">John Naughton</a> in the <a href=\"http://www.theguardian.com/technology/2015/feb/01/control-personal-data-databox-end-user-agreement\">Guardian</a>, which ended up with a pretty\nactive comments thread, including one from Richard Stallman himself\u2026</p>\n<p>Guess we\u2019d better get on and deliver it now then :)</p>",
+18
mort/blog_stop-start_.json
···+"summary": "<p>Well, here we go again, again. Having stalled out back in 2013 to restart in\n2015, I stalled out again in 2017, so am finally restarting seven years later.\nAgain. I do have a dozen or more half-written posts from that period that may\nleak out slowly, in <a href=\"https://mort.io/tags/old\">backdated form</a>.</p>\n<p>The stack for the blog has been updated again, naturally \u2013 the tools got better\nafter all. So now this is using <a href=\"https://www.getzola.org\">Zola</a> with, initially,\nthe <a href=\"https://github.com/jieiku/abridge/\">abridge theme</a>. I think it\u2019s shinier,\nand hopefully also involves (almost) no JavaScript, is finally accessible, and\nnot too heavyweight. In removing the excessive JavaScript, I also updated my\n<a href=\"https://mort.io/blog/stop-start/./research#publications\">publication list</a> <a href=\"https://github.com/mor1/bib2html\">generation\nprocess</a>: no more CoffeeScript and JavaScript,\njust some plain ol\u2019 <a href=\"https://www.python.org/\">Python</a> using the slick new\n<a href=\"https://github.com/astral-sh/uv\">uv</a> tooling. The\n<a href=\"https://github.com/casey/just\">Justfile</a> incantation is something like:</p>\n<pre><code>papers := "../templates/shortcodes/publications.html"\nbibinputs := "../papers/bibinputs.json"\n\n# build papers data for site\n@papers:\n    cd bib2html && uv sync && uv run -- ./src/bib2html {{bibinputs}} >| {{papers}}\n</code></pre>\n<p>I\u2019ve also added a <a href=\"https://mort.io/blog/stop-start/./register/\">register of interests</a> so I stop having to\nscrabble around remembering what I\u2019m officially doing when I\u2019m occasionally\nasked.</p>\n<p>The stack underneath has also changed. I finally kicked the Mac OS habit after\nover 10 years \u2013 decided I\u2019d had enough of Apple hiding things and generally\ngetting in my way \u2013 at the same time I switched from Android to iPhone. Never\ndo things the easy way, that\u2019s my motto. So now I\u2019m back on a PC \u2026 but it\nturns out that 2024 (well, 2023 in fact) is the year of Linux on the Laptop for\nme. Not something I thought would actually happen when we joked about it in the\nlate 90s. Truly, times change. And not just any old wannabe-Windows distribution\nbut the resolutely idiosyncratic and copiously yet poorly documented\n<a href=\"https://nixos.org/\">NixOS</a> using Flakes and Home-Manager of course, with\n<a href=\"https://wayland.freedesktop.org/\">Wayland</a> and <a href=\"https://swaywm.org/\">Sway</a>.\nI\u2019ve even finally made the leap to <a href=\"https://vscodium.com/\">Codium</a>, the\nde-Microsofted version of VSCode, from\n<a href=\"https://www.gnu.org/software/emacs/\">Emacs</a> (mostly \u2013\n<a href=\"https://orgmode.org/\">org-mode</a> has not yet been replaced). I blame\n<a href=\"https://www.jeffas.net/\">Andrew</a> and <a href=\"https://www.jentek.dev/\">Chris</a>.</p>\n<p>For the terminally curious, this meant a fairly heavy reworking of my 30-year-old\n<a href=\"https://github.com/mor1/rc-files/\">dotfiles</a> \u2013 still work in progress but\nadequately organised for now I think. See the <a href=\"https://github.com/mor1/rc-files/tree/main/nixos\">NixOS specific\nbits</a> if curious.</p>\n<p>And, because I\u2019m going to try to include at least one vaguely technical bit (or\nat least, one bit which isn\u2019t just me bemoaning that I haven\u2019t written anything\nin years), I confess to being particularly pleased with the following fragment\nto enable <a href=\"https://github.com/mor1/rc-files/blob/main/nixos/modules/home-manager/sway.nix#L64-L122\">my Sway startup\nscript</a>\nto start several apps on particular workspaces, by switching to the workspace,\nstarting the app, and then waiting for a suitable window to appear. The latter\nstep entails</p>\n<ul>\n<li>subscribing <code>swaymsg</code> to <code>window</code> messages with\n<a href=\"https://jqlang.github.io/jq/\">jq</a> to extract the messages of interest\nindicating a new window in the background,</li>\n<li>executing the command to start the app,</li>\n<li>using <code>grep</code>/<code>pkill</code> in tandem to kill the subscription when the new window is\ndetected, and finally</li>\n<li>waiting for the subscription to have been killed.</li>\n</ul>\n<p>It seems fairly (indeed, surprisingly) reliable.</p>\n<pre><code>wait_for () {\n  { swaymsg -r -m -t subscribe '["window"]' |\n      jq -c --unbuffered '. | select(.change == "new")' |\n      { grep -m1 . >/dev/null ; pkill swaymsg ;} &\n  } 2>/dev/null\n  pid=$!\n  swaymsg -- "exec $*" && sleep 0.5\n  wait $pid 2>/dev/null\n}\n</code></pre>\n<p>Use via something like:</p>\n<pre><code>wayland.windowManager.sway.config.startup =\n  let\n    msg = cmds: "swaymsg '${builtins.concatStringsSep ", " cmds}'";\n    workspace = ws: msg [ "workspace --no-auto-back-and-forth ${ws}" ];\n    after = delay: cmds: "sleep ${toString delay} && ${msg cmds}";\n    startup = pkgs.writeShellScriptBin "startup.sh" ''\n      wait_for () {\n        { swaymsg -r -m -t subscribe '["window"]' |\n            jq -c --unbuffered '. | select(.change == "new")' |\n            { grep -m1 . >/dev/null ; pkill swaymsg ;} &\n        } 2>/dev/null\n        pid=$!\n        swaymsg -- "exec $*" && sleep 0.5\n        wait $pid 2>/dev/null\n      }\n\n      ${workspace "${mediaws}"}\n      wait_for "rhythmbox"\n...\n      '';\n  in\n  [ { command = "${startup}/bin/startup.sh"; } ];\n</code></pre>",+"content": "<p>Well, here we go again, again. Having stalled out back in 2013 to restart in\n2015, I stalled out again in 2017, so am finally restarting seven years later.\nAgain. I do have a dozen or more half-written posts from that period that may\nleak out slowly, in <a href=\"https://mort.io/tags/old\">backdated form</a>.</p>\n<p>The stack for the blog has been updated again, naturally \u2013 the tools got better\nafter all. So now this is using <a href=\"https://www.getzola.org\">Zola</a> with, initially,\nthe <a href=\"https://github.com/jieiku/abridge/\">abridge theme</a>. I think it\u2019s shinier,\nand hopefully also involves (almost) no JavaScript, is finally accessible, and\nnot too heavyweight. In removing the excessive JavaScript, I also updated my\n<a href=\"https://mort.io/blog/stop-start/./research#publications\">publication list</a> <a href=\"https://github.com/mor1/bib2html\">generation\nprocess</a>: no more CoffeeScript and JavaScript,\njust some plain ol\u2019 <a href=\"https://www.python.org/\">Python</a> using the slick new\n<a href=\"https://github.com/astral-sh/uv\">uv</a> tooling. The\n<a href=\"https://github.com/casey/just\">Justfile</a> incantation is something like:</p>\n<pre><code>papers := "../templates/shortcodes/publications.html"\nbibinputs := "../papers/bibinputs.json"\n\n# build papers data for site\n@papers:\n    cd bib2html && uv sync && uv run -- ./src/bib2html {{bibinputs}} >| {{papers}}\n</code></pre>\n<p>I\u2019ve also added a <a href=\"https://mort.io/blog/stop-start/./register/\">register of interests</a> so I stop having to\nscrabble around remembering what I\u2019m officially doing when I\u2019m occasionally\nasked.</p>\n<p>The stack underneath has also changed. I finally kicked the Mac OS habit after\nover 10 years \u2013 decided I\u2019d had enough of Apple hiding things and generally\ngetting in my way \u2013 at the same time I switched from Android to iPhone. Never\ndo things the easy way, that\u2019s my motto. So now I\u2019m back on a PC \u2026 but it\nturns out that 2024 (well, 2023 in fact) is the year of Linux on the Laptop for\nme. Not something I thought would actually happen when we joked about it in the\nlate 90s. Truly, times change. And not just any old wannabe-Windows distribution\nbut the resolutely idiosyncratic and copiously yet poorly documented\n<a href=\"https://nixos.org/\">NixOS</a> using Flakes and Home-Manager of course, with\n<a href=\"https://wayland.freedesktop.org/\">Wayland</a> and <a href=\"https://swaywm.org/\">Sway</a>.\nI\u2019ve even finally made the leap to <a href=\"https://vscodium.com/\">Codium</a>, the\nde-Microsofted version of VSCode, from\n<a href=\"https://www.gnu.org/software/emacs/\">Emacs</a> (mostly \u2013\n<a href=\"https://orgmode.org/\">org-mode</a> has not yet been replaced). I blame\n<a href=\"https://www.jeffas.net/\">Andrew</a> and <a href=\"https://www.jentek.dev/\">Chris</a>.</p>\n<p>For the terminally curious, this meant a fairly heavy reworking of my 30-year-old\n<a href=\"https://github.com/mor1/rc-files/\">dotfiles</a> \u2013 still work in progress but\nadequately organised for now I think. See the <a href=\"https://github.com/mor1/rc-files/tree/main/nixos\">NixOS specific\nbits</a> if curious.</p>\n<p>And, because I\u2019m going to try to include at least one vaguely technical bit (or\nat least, one bit which isn\u2019t just me bemoaning that I haven\u2019t written anything\nin years), I confess to being particularly pleased with the following fragment\nto enable <a href=\"https://github.com/mor1/rc-files/blob/main/nixos/modules/home-manager/sway.nix#L64-L122\">my Sway startup\nscript</a>\nto start several apps on particular workspaces, by switching to the workspace,\nstarting the app, and then waiting for a suitable window to appear. The latter\nstep entails</p>\n<ul>\n<li>subscribing <code>swaymsg</code> to <code>window</code> messages with\n<a href=\"https://jqlang.github.io/jq/\">jq</a> to extract the messages of interest\nindicating a new window in the background,</li>\n<li>executing the command to start the app,</li>\n<li>using <code>grep</code>/<code>pkill</code> in tandem to kill the subscription when the new window is\ndetected, and finally</li>\n<li>waiting for the subscription to have been killed.</li>\n</ul>\n<p>It seems fairly (indeed, surprisingly) reliable.</p>\n<pre><code>wait_for () {\n  { swaymsg -r -m -t subscribe '["window"]' |\n      jq -c --unbuffered '. | select(.change == "new")' |\n      { grep -m1 . >/dev/null ; pkill swaymsg ;} &\n  } 2>/dev/null\n  pid=$!\n  swaymsg -- "exec $*" && sleep 0.5\n  wait $pid 2>/dev/null\n}\n</code></pre>\n<p>Use via something like:</p>\n<pre><code>wayland.windowManager.sway.config.startup =\n  let\n    msg = cmds: "swaymsg '${builtins.concatStringsSep ", " cmds}'";\n    workspace = ws: msg [ "workspace --no-auto-back-and-forth ${ws}" ];\n    after = delay: cmds: "sleep ${toString delay} && ${msg cmds}";\n    startup = pkgs.writeShellScriptBin "startup.sh" ''\n      wait_for () {\n        { swaymsg -r -m -t subscribe '["window"]' |\n            jq -c --unbuffered '. | select(.change == "new")' |\n            { grep -m1 . >/dev/null ; pkill swaymsg ;} &\n        } 2>/dev/null\n        pid=$!\n        swaymsg -- "exec $*" && sleep 0.5\n        wait $pid 2>/dev/null\n      }\n\n      ${workspace "${mediaws}"}\n      wait_for "rhythmbox"\n...\n      '';\n  in\n  [ { command = "${startup}/bin/startup.sh"; } ];\n</code></pre>",
+18
mort/blog_talks-old-and-new_.json
···+"summary": "<p>Thanks to an invitation from <a href=\"http://research.microsoft.com/en-us/um/people/hiballan/\">Hitesh</a>, I recently got the chance to revisit my\nold stomping ground at <a href=\"http://research.microsoft.com/en-us/labs/cambridge/\">Microsoft Research Cambridge</a>. Well, I say \u201cold\u201d\n\u2013 in the intervening 7 years, they\u2019ve moved to a rather splendid new building\nat the other end of Cambridge, just next to the station. (And improved the\ncoffee too, not that it wasn\u2019t pretty good to start with!)</p>\n<p>Anyway, this was a pleasant chance to catch up with some old colleagues, meet\nsome new ones, and even speak to my most recently graduated Ph.D. student,\n<a href=\"http://research.microsoft.com/en-us/people/a-ewluge/\">Dr Ewa Luger</a> \u2013 and who\u2019d\u2019ve thought that I\u2019d ever end up supervising\nsomeone coming from the discipline of Political Science too!</p>\n<p>The ostensible reason was to talk about the <a href=\"http://homenetworks.ac.uk/\">Homework</a> project \u2013 a talk I\u2019ve\ngiven a <a href=\"https://www.youtube.com/watch?v=AdtVSrazVaQ\">few times now</a> \u2013 and to lead from that into discussing some\nof my current agenda, around <a href=\"http://hdiresearch.org/\">Human-Data Interaction</a> and\n<a href=\"http://mort.io/research/\">User-Centred Systems</a>. 
It seemed to go pretty well \u2013 a few\nquestions, and some lively discussion over lunch followed.</p>\n<p>Happily, the fine folk at <a href=\"http://research.microsoft.com/en-us/labs/cambridge/\">MSRC</a> recorded it and have made it available, so if\nyou\u2019re really interested, you can take a look \u2013 direct your browser of choice to <a href=\"http://research.microsoft.com/apps/video/default.aspx?id=238157\">Interacting with Infrastructure: Home Networking and Beyond</a>.</p>\n<p>I have to say, when I checked that link in the course of writing this, one thing\nthat did come as something of a surprise was to notice that a talk I didn\u2019t\nremember giving \u2013 it was 8 years ago to be fair! \u2013 is also available. So if\nyou\u2019ve a burning desire to find out about\n<a href=\"http://research.microsoft.com/apps/video/default.aspx?id=104278\">Measuring and Monitoring Microsoft\u2019s Enterprise Network</a>,\nit turns out you can do that too. The past haunts us they say. In my case, it\nseems that the haunting has happened already, and turned my hair mostly white in\nthe process\u2026 :)</p>",+"content": "<p>Thanks to an invitation from <a href=\"http://research.microsoft.com/en-us/um/people/hiballan/\">Hitesh</a>, I recently got the chance to revisit my\nold stomping ground at <a href=\"http://research.microsoft.com/en-us/labs/cambridge/\">Microsoft Research Cambridge</a>. Well, I say \u201cold\u201d\n\u2013 in the intervening 7 years, they\u2019ve moved to a rather splendid new building\nat the other end of Cambridge, just next to the station. (And improved the\ncoffee too, not that it wasn\u2019t pretty good to start with!)</p>\n<p>Anyway, this was a pleasant chance to catch up with some old colleagues, meet\nsome new ones, and even speak to my most recently graduated Ph.D. 
student,\n<a href=\"http://research.microsoft.com/en-us/people/a-ewluge/\">Dr Ewa Luger</a> \u2013 and who\u2019d\u2019ve thought that I\u2019d ever end up supervising\nsomeone coming from the discipline of Political Science too!</p>\n<p>The ostensible reason was to talk about the <a href=\"http://homenetworks.ac.uk/\">Homework</a> project \u2013 a talk I\u2019ve\ngiven a <a href=\"https://www.youtube.com/watch?v=AdtVSrazVaQ\">few times now</a> \u2013 and to lead from that into discussing some\nof my current agenda, around <a href=\"http://hdiresearch.org/\">Human-Data Interaction</a> and\n<a href=\"http://mort.io/research/\">User-Centred Systems</a>. It seemed to go pretty well \u2013 a few\nquestions, and some lively discussion over lunch followed.</p>\n<p>Happily, the fine folk at <a href=\"http://research.microsoft.com/en-us/labs/cambridge/\">MSRC</a> recorded it and have made it available, so if\nyou\u2019re really interested, you can take a look \u2013 direct your browser of choice to <a href=\"http://research.microsoft.com/apps/video/default.aspx?id=238157\">Interacting with Infrastructure: Home Networking and Beyond</a>.</p>\n<p>I have to say, when I checked that link in the course of writing this, one thing\nthat did come as something of a surprise was to notice that a talk I didn\u2019t\nremember giving \u2013 it was 8 years ago to be fair! \u2013 is also available. So if\nyou\u2019ve a burning desire to find out about\n<a href=\"http://research.microsoft.com/apps/video/default.aspx?id=104278\">Measuring and Monitoring Microsoft\u2019s Enterprise Network</a>,\nit turns out you can do that too. The past haunts us they say. In my case, it\nseems that the haunting has happened already, and turned my hair mostly white in\nthe process\u2026 :)</p>",
+18
mort/blog_tar-includes_.json
···+"summary": "<p>I recently discovered, to some irritation, that the <code>--include PATTERN</code> option\nto <code>tar</code> seems only to apply to directories \u2013 and if the <code>PATTERN</code> doesn\u2019t\nmatch, it won\u2019t traverse subdirectories. But I wanted to include <code>*.php</code> for\nsome reason. So instead pipe the output of <code>find</code>, or better these days,\n<a href=\"https://github.com/sharkdp/fd\"><code>fd</code></a>:</p>\n<pre><code>fd -e php -0 | tar -cvjf TARBALL.bz2 --null --files-from -\n</code></pre>",+"content": "<p>I recently discovered, to some irritation, that the <code>--include PATTERN</code> option\nto <code>tar</code> seems only to apply to directories \u2013 and if the <code>PATTERN</code> doesn\u2019t\nmatch, it won\u2019t traverse subdirectories. But I wanted to include <code>*.php</code> for\nsome reason. So instead pipe the output of <code>find</code>, or better these days,\n<a href=\"https://github.com/sharkdp/fd\"><code>fd</code></a>:</p>\n<pre><code>fd -e php -0 | tar -cvjf TARBALL.bz2 --null --files-from -\n</code></pre>",
+18
mort/blog_tdis-accepted_.json
···+"summary": "<p>As I find myself once more on a train to parts unknown (to me at least), a brief\nupdate :)</p>\n<p>The parts unknown in question is Rotterdam, NL (so really quite well-known to\nquite a lot of people, just not me) for <a href=\"https://2025.eurosys.org/\">EURO/SYS\n2025</a> (being held jointly with <a href=\"https://www.asplos-conference.org/asplos2025\">ASPLOS\n2025</a>, although I can\u2019t stay for\nthe whole thing unfortunately) and specifically the <a href=\"https://tdis.gitlab.io/tdis25/\">3rd International Workshop\nof Testing Distributed Internet of Things Systems\n(TDIS)</a>.</p>\n<p>Why? Happily the programme committee decided to accept two papers from my\n(ex-)students \u2013 which is nice :) The two in question are</p>\n<ol>\n<li>\n<p><strong><a href=\"https://doi.org/10.1145/3719159.3721222\">Reckon-ing Kubernetes at the Edge using Emulated\nClusters</a></strong> with Alessandro Sassi\n(University of Cambridge / Politecnico di Milano) and Christopher Jensen\n(University of Cambridge / Microsoft Research). This describes Alessandro\u2019s\nM.Sc. research project undertaken as a visitor with my group. He built on\nChris\u2019 earlier work on <a href=\"https://doi.org/10.1145/3447851.3458739\">Reckon, an emulator setup for examining consensus\nsystem behaviour</a>. Alessandro\nextended this to use ContainerNet enabling it to emulate Kubernetes clusters\non a single node, and used this to examine Kubernetes performance in some\nedge network scenarios. Source available on GitHub at\n<a href=\"https://github.com/AleSassi/reckon-k8s\">https://github.com/AleSassi/reckon-k8s</a>.</p>\n</li>\n<li>\n<p><strong><a href=\"https://doi.org/10.1145/3719159.3721221\">LoRaLive: Efficient LoRaWAN Traffic\nGeneration</a></strong> with Vadim Safronov\n(University of Oxford / University of Cambridge). This reports a component of\nVadim\u2019s Ph.D. 
work where he built a system to enable dense deployment LoRaWAN\ntrace-playback using a minimal number of nodes while respecting legal\nconstraints on duty cycles. Source available on GitHub at\n<a href=\"https://github.com/LoRaLive/LoRaLive\">https://github.com/LoRaLive/LoRaLive</a>.</p>\n</li>\n</ol>\n<p>Both nice tools that we hope might be of community interest!</p>\n<p>The Ph.D. in question is Chris Jensen\u2019s \u2013 happily he passed his viva on\nThursday just gone, titled \u201cSeparating conflict-recovery from failure-recovery\nin distributed consensus\u201d, examined by <a href=\"https://timharris.uk/\">Tim Harris</a> and\n<a href=\"https://charap.co/\">Aleksey Charapko</a>. Other recent passes include Al-Amjad\nTawfiq Isstaif, titled \u201cContention-resilient overcommitment for serverless\ndeployments\u201d and Andrew Jeffery, titled \u201cModelling orchestration\u201d. The race<a href=\"https://mort.io/blog/tdis-accepted/#1\">1</a>\nis now on for the first to <a href=\"https://www.cl.cam.ac.uk/techreports/\">tech\nreport</a>\u2026</p>\n<div>1\n<p>It\u2019s not really a race. That would be weird.</p>\n</div>",+"content": "<p>As I find myself once more on a train to parts unknown (to me at least), a brief\nupdate :)</p>\n<p>The parts unknown in question is Rotterdam, NL (so really quite well-known to\nquite a lot of people, just not me) for <a href=\"https://2025.eurosys.org/\">EURO/SYS\n2025</a> (being held jointly with <a href=\"https://www.asplos-conference.org/asplos2025\">ASPLOS\n2025</a>, although I can\u2019t stay for\nthe whole thing unfortunately) and specifically the <a href=\"https://tdis.gitlab.io/tdis25/\">3rd International Workshop\nof Testing Distributed Internet of Things Systems\n(TDIS)</a>.</p>\n<p>Why? 
Happily the programme committee decided to accept two papers from my\n(ex-)students \u2013 which is nice :) The two in question are</p>\n<ol>\n<li>\n<p><strong><a href=\"https://doi.org/10.1145/3719159.3721222\">Reckon-ing Kubernetes at the Edge using Emulated\nClusters</a></strong> with Alessandro Sassi\n(University of Cambridge / Politecnico di Milano) and Christopher Jensen\n(University of Cambridge / Microsoft Research). This describes Alessandro\u2019s\nM.Sc. research project undertaken as a visitor with my group. He built on\nChris\u2019 earlier work on <a href=\"https://doi.org/10.1145/3447851.3458739\">Reckon, an emulator setup for examining consensus\nsystem behaviour</a>. Alessandro\nextended this to use ContainerNet enabling it to emulate Kubernetes clusters\non a single node, and used this to examine Kubernetes performance in some\nedge network scenarios. Source available on GitHub at\n<a href=\"https://github.com/AleSassi/reckon-k8s\">https://github.com/AleSassi/reckon-k8s</a>.</p>\n</li>\n<li>\n<p><strong><a href=\"https://doi.org/10.1145/3719159.3721221\">LoRaLive: Efficient LoRaWAN Traffic\nGeneration</a></strong> with Vadim Safronov\n(University of Oxford / University of Cambridge). This reports a component of\nVadim\u2019s Ph.D. work where he built a system to enable dense deployment LoRaWAN\ntrace-playback using a minimal number of nodes while respecting legal\nconstraints on duty cycles. Source available on GitHub at\n<a href=\"https://github.com/LoRaLive/LoRaLive\">https://github.com/LoRaLive/LoRaLive</a>.</p>\n</li>\n</ol>\n<p>Both nice tools that we hope might be of community interest!</p>\n<p>The Ph.D. in question is Chris Jensen\u2019s \u2013 happily he passed his viva on\nThursday just gone, titled \u201cSeparating conflict-recovery from failure-recovery\nin distributed consensus\u201d, examined by <a href=\"https://timharris.uk/\">Tim Harris</a> and\n<a href=\"https://charap.co/\">Aleksey Charapko</a>. 
Other recent passes include Al-Amjad\nTawfiq Isstaif, titled \u201cContention-resilient overcommitment for serverless\ndeployments\u201d and Andrew Jeffery, titled \u201cModelling orchestration\u201d. The race<a href=\"https://mort.io/blog/tdis-accepted/#1\">1</a>\nis now on for the first to <a href=\"https://www.cl.cam.ac.uk/techreports/\">tech\nreport</a>\u2026</p>\n<div>1\n<p>It\u2019s not really a race. That would be weird.</p>\n</div>",
+18
mort/blog_topkg-addendum_.json
···+"summary": "<p>This is a short addendum to my <a href=\"http://mort.io/blog/2017/08/28/past-present-future/\">post of a couple of days\nago</a> caused by my\ncarelessness in writing the <a href=\"https://github.com/mor1/ocal/blob/13a9a7f5b8f2e0be4c2b55941a00a885df202cf8/ocal.opam#L16-L22\">OPAM\nfile</a>.\nCareful readers will observe the lack of any dependency on <a href=\"https://github.com/pqwy/notty/\">notty</a>. Read on for\nwhat happened next\u2026</p>\n<p>The result of this carelessness was that everything worked just fine locally,\nbut <a href=\"https://github.com/ocaml/opam-repository/pull/10176\">my PR to the OPAM package\nrepository</a> failed. Cue\nmuch wailing and gnashing of teeth.</p>\n<p>However, thanks to a moment\u2019s assistance\nfrom <a href=\"http://erratique.ch/contact.en\">Daniel B\u00fcnzli</a>, this was easy to fix:</p>\n<pre><code><span><span><span>$</span></span><span> git checkout 0.2.0 <span><span>#</span></span><span> checkout the relevant release version tag</span><span>\n</span></span></span><span><span><span>$</span></span><span> topkg opam pkg <span><span>#</span></span><span> create the release metadata</span><span>\n</span></span></span><span><span><span>$</span></span><span> e _build/ocal.0.2.0/opam <span><span>#</span></span><span> invoke editor so I can add the missing dep</span><span>\n</span></span></span><span><span><span>$</span></span><span> topkg opam submit <span><span>#</span></span><span> submit the updated OPAM metadata, updating the PR</span><span>\n</span></span></span><span><span><span>Submitting</span></span><span> _build/ocal.0.2.0</span>\n</span><span><span><span>[ocal-0.2.0.tbz]</span></span><span> http://github.com/mor1/ocal/releases/download/0.2.0/ocal-0.2.0.tbz downloaded</span>\n</span><span><span><span>Updating</span></span><span> existing pull-request <span><span>#</span></span><span>10176</span><span>\n</span></span></span><span><span><span>Pull-requested:</span></span><span> 
https://github.com/ocaml/opam-repository/pull/10176</span>\n</span></code></pre>\n<p>For me, the main thing to note here is that the OPAM metadata in the repo at the\ncommit ref tagged for release doesn\u2019t match that which OPAM uses to install the\nrelease. But as <a href=\"http://seb.mondet.org/\">Sebastien Mondet</a> pointed out to me,\nthis is neither relevant nor (in the long term) likely, as (e.g.) version\nconstraints on dependencies may need to be added to old versions of dependent\npackages to keep them working. (Though I did add and commit the dependency to\n<code>master</code>, naturally.)</p>\n<p>So, all-in-all, an easy fix to a common problem. Which is the way it should\nbe\u2026</p>",+"content": "<p>This is a short addendum to my <a href=\"http://mort.io/blog/2017/08/28/past-present-future/\">post of a couple of days\nago</a> caused by my\ncarelessness in writing the <a href=\"https://github.com/mor1/ocal/blob/13a9a7f5b8f2e0be4c2b55941a00a885df202cf8/ocal.opam#L16-L22\">OPAM\nfile</a>.\nCareful readers will observe the lack of any dependency on <a href=\"https://github.com/pqwy/notty/\">notty</a>. Read on for\nwhat happened next\u2026</p>\n<p>The result of this carelessness was that everything worked just fine locally,\nbut <a href=\"https://github.com/ocaml/opam-repository/pull/10176\">my PR to the OPAM package\nrepository</a> failed. 
Cue\nmuch wailing and gnashing of teeth.</p>\n<p>However, thanks to a moment\u2019s assistance\nfrom <a href=\"http://erratique.ch/contact.en\">Daniel B\u00fcnzli</a>, this was easy to fix:</p>\n<pre><code><span><span><span>$</span></span><span> git checkout 0.2.0 <span><span>#</span></span><span> checkout the relevant release version tag</span><span>\n</span></span></span><span><span><span>$</span></span><span> topkg opam pkg <span><span>#</span></span><span> create the release metadata</span><span>\n</span></span></span><span><span><span>$</span></span><span> e _build/ocal.0.2.0/opam <span><span>#</span></span><span> invoke editor so I can add the missing dep</span><span>\n</span></span></span><span><span><span>$</span></span><span> topkg opam submit <span><span>#</span></span><span> submit the updated OPAM metadata, updating the PR</span><span>\n</span></span></span><span><span><span>Submitting</span></span><span> _build/ocal.0.2.0</span>\n</span><span><span><span>[ocal-0.2.0.tbz]</span></span><span> http://github.com/mor1/ocal/releases/download/0.2.0/ocal-0.2.0.tbz downloaded</span>\n</span><span><span><span>Updating</span></span><span> existing pull-request <span><span>#</span></span><span>10176</span><span>\n</span></span></span><span><span><span>Pull-requested:</span></span><span> https://github.com/ocaml/opam-repository/pull/10176</span>\n</span></code></pre>\n<p>For me, the main thing to note here is that the OPAM metadata in the repo at the\ncommit ref tagged for release doesn\u2019t match that which OPAM uses to install the\nrelease. But as <a href=\"http://seb.mondet.org/\">Sebastien Mondet</a> pointed out to me,\nthis is neither relevant nor (in the long term) likely, as (e.g.) version\nconstraints on dependencies may need to be added to old versions of dependent\npackages to keep them working. (Though I did add and commit the dependency to\n<code>master</code>, naturally.)</p>\n<p>So, all-in-all, an easy fix to a common problem. 
Which is the way it should\nbe\u2026</p>",
+18
mort/blog_tum-retreat_.json
···+"summary": "<p>Ok, ok, I exaggerate \u2013 it\u2019s not really that far. But any time I have to set the\nalarm for 02.30, it feels like it\u2019s a long long way away!</p>\n<p><a href=\"https://www.ce.cit.tum.de/cm/research-group/\">TU Munchen</a> host an <a href=\"https://www.ce.cit.tum.de/cm/events/mir3/\">annual\nretreat</a>, and thanks to <a href=\"https://www.ce.cit.tum.de/cm/research-group/joerg-ott/\">Prof. Joerg\nOtt</a> I was invited this\nyear for the first time. It\u2019s held in TUM\u2019s <a href=\"https://www.raitenhaslach.tum.de/en/raitenhaslach/home/\">Retreat Centre at\nRaitenhaslach</a>,\noriginally a Cistercian monastery which is a lovely location except that it\u2019s a\nbus ride from the hotel used and a bus and three trains from the nearest airport\n\u2013 which is, mildly confusingly,\n<a href=\"https://www.salzburg-airport.com/en/\">Salzburg</a> not\n<a href=\"https://www.munich-airport.com/\">Munich</a>. The latter point made me assume that\nthere would be good directions via public transport from Salzburg, but that\nturned out not to be the case. And as Google Maps is, at best, patchy in terms\nof public transport coverage in this part of the world \u2013 it doesn\u2019t know about\nall the buses at least \u2013 I thought it might be useful to record the process of\ngetting there from Cambridge.</p>\n <a href=\"https://www.ce.cit.tum.de/cm/events/mir3/mir3-2024-10/\"><img alt=\"A photograph of the presenter next to a slide in an ornately\n decorated room\" height=\"1\" src=\"https://mort.io/blog/tum-retreat/photo.jpg\" width=\"480\"></a>\n<p>First, to fly to Salzburg or Munich? In the end I picked Salzburg as it was\ncloser to the retreat centre itself and meant I didn\u2019t need to fly via Heathrow\nor Gatwick. As I had to be in Cambridge the night before for family reasons,\npicking a closer airport for my very early flight seemed sensible. 
Unfortunately\nthis means flying Wizzair UK from <a href=\"https://www.london-luton.co.uk/\">London Luton\nAirport</a>, which after just twice flying from\nthere remains my least favourite UK airport. Given the flight departed at 05.55,\na \u00a377.50 taxi from Cambridge was the only option.</p>\n<p>Per Panther\u2019s recommendation I allowed 70 minutes for the journey, and on my\nown recommendation I aimed to arrive ~1.5h before the flight resulting in a taxi\nbooking for 03.15. Happily when I got in the taxi the driver said it would be\naround 45-50 minutes, which was nice. Less happily when I checked in the day\nbefore \u2013 using a laptop as neither Safari nor Firefox on iOS was able to\ndisplay the boarding pass \u2013 I got an automatic email from Wizzair telling me\nthat I needed to arrive at the airport by 02.15 as the airport was upgrading\ntheir \u201ccentral search facilities\u201d. After trying to call the airport half a dozen\ntimes but getting trapped in the IVR menus at the inevitable \u201cread the FAQ on\nthe website\u201d end state, I gave up trying to check if that really was necessary\nand decided to risk my original timings.</p>\n<p>I did however spot one useful thing: \u201cprebooked security check\u201d. This appears to\nbe the ability to book, at 15 minute granularity, a security check during peak\nhours (03.00-04.30). Given my estimated arrival time of 04.15, that seemed\nperfect.\n <img alt=\"A photograph of a road being worked\" height=\"1\" src=\"https://mort.io/blog/tum-retreat/road.jpg\" width=\"240\">\n\nAs it was also <strong>completely free</strong> requiring nothing more than my\nflight number and an email address (not even the one associated with my ticket),\nI booked it resulting in an email with a QR code to show to security. This\nworked perfectly when I arrived at the airport: show the person on security the\nQR code on the phone screen, and they simply jump you past the security queue to\nthe front. 
Did I mention this was completely free? Seems a strange system to me\nbut hey, I\u2019ll take it!</p>\n<p>In the end, the public transport worked though there was a short walk at the end\nthat was longer than I expected \u2013 rebuilding a road meant I could get through\nbut the bus couldn\u2019t! And then the retreat happened \u2013 but you can <a href=\"https://www.ce.cit.tum.de/cm/events/mir3/mir3-2024-10/\">read about\nthat elsewhere</a> :)</p>",+"content": "<p>Ok, ok, I exaggerate \u2013 it\u2019s not really that far. But any time I have to set the\nalarm for 02.30, it feels like it\u2019s a long long way away!</p>\n<p><a href=\"https://www.ce.cit.tum.de/cm/research-group/\">TU Munchen</a> host an <a href=\"https://www.ce.cit.tum.de/cm/events/mir3/\">annual\nretreat</a>, and thanks to <a href=\"https://www.ce.cit.tum.de/cm/research-group/joerg-ott/\">Prof. Joerg\nOtt</a> I was invited this\nyear for the first time. It\u2019s held in TUM\u2019s <a href=\"https://www.raitenhaslach.tum.de/en/raitenhaslach/home/\">Retreat Centre at\nRaitenhaslach</a>,\noriginally a Cistercian monastery which is a lovely location except that it\u2019s a\nbus ride from the hotel used and a bus and three trains from the nearest airport\n\u2013 which is, mildly confusingly,\n<a href=\"https://www.salzburg-airport.com/en/\">Salzburg</a> not\n<a href=\"https://www.munich-airport.com/\">Munich</a>. The latter point made me assume that\nthere would be good directions via public transport from Salzburg, but that\nturned out not to be the case. 
And as Google Maps is, at best, patchy in terms\nof public transport coverage in this part of the world \u2013 it doesn\u2019t know about\nall the buses at least \u2013 I thought it might be useful to record the process of\ngetting there from Cambridge.</p>\n <a href=\"https://www.ce.cit.tum.de/cm/events/mir3/mir3-2024-10/\"><img alt=\"A photograph of the presenter next to a slide in an ornately\n decorated room\" height=\"1\" src=\"https://mort.io/blog/tum-retreat/photo.jpg\" width=\"480\"></a>\n<p>First, to fly to Salzburg or Munich? In the end I picked Salzburg as it was\ncloser to the retreat centre itself and meant I didn\u2019t need to fly via Heathrow\nor Gatwick. As I had to be in Cambridge the night before for family reasons,\npicking a closer airport for my very early flight seemed sensible. Unfortunately\nthis means flying Wizzair UK from <a href=\"https://www.london-luton.co.uk/\">London Luton\nAirport</a>, which after just twice flying from\nthere remains my least favourite UK airport. Given the flight departed at 05.55,\na \u00a377.50 taxi from Cambridge was the only option.</p>\n<p>Per Panther\u2019s recommendation I allowed 70 minutes for the journey, and on my\nown recommendation I aimed to arrive ~1.5h before the flight resulting in a taxi\nbooking for 03.15. Happily when I got in the taxi the driver said it would be\naround 45-50 minutes, which was nice. Less happily when I checked in the day\nbefore \u2013 using a laptop as neither Safari nor Firefox on iOS was able to\ndisplay the boarding pass \u2013 I got an automatic email from Wizzair telling me\nthat I needed to arrive at the airport by 02.15 as the airport was upgrading\ntheir \u201ccentral search facilities\u201d. 
After trying to call the airport half a dozen\ntimes but getting trapped in the IVR menus at the inevitable \u201cread the FAQ on\nthe website\u201d end state, I gave up trying to check if that really was necessary\nand decided to risk my original timings.</p>\n<p>I did however spot one useful thing: \u201cprebooked security check\u201d. This appears to\nbe the ability to book, at 15 minute granularity, a security check during peak\nhours (03.00-04.30). Given my estimated arrival time of 04.15, that seemed\nperfect.\n <img alt=\"A photograph of a road being worked\" height=\"1\" src=\"https://mort.io/blog/tum-retreat/road.jpg\" width=\"240\">\n\nAs it was also <strong>completely free</strong> requiring nothing more than my\nflight number and an email address (not even the one associated with my ticket),\nI booked it resulting in an email with a QR code to show to security. This\nworked perfectly when I arrived at the airport: show the person on security the\nQR code on the phone screen, and they simply jump you past the security queue to\nthe front. Did I mention this was completely free? Seems a strange system to me\nbut hey, I\u2019ll take it!</p>\n<p>In the end, the public transport worked though there was a short walk at the end\nthat was longer than I expected \u2013 rebuilding a road meant I could get through\nbut the bus couldn\u2019t! And then the retreat happened \u2013 but you can <a href=\"https://www.ce.cit.tum.de/cm/events/mir3/mir3-2024-10/\">read about\nthat elsewhere</a> :)</p>",
+18
mort/blog_unikernel-revolution_.json
···+"summary": "<p>I\u2019ve had the pleasure of giving a couple of talks at some fun venues recently,\nextolling both the virtues of <a href=\"http://unikernel.org/\">unikernels</a> and talking a bit about where we\ncurrently see them as usefully being deployed.</p>\n<p>Specifically, <a href=\"https://operability.io/\">Operability.io 2016</a> a couple of weeks ago was enlightening\nabout some of the problems faced in operating production systems. Some great\naudience questions and follow-ups after the talk, including some who were even\nwondering when we\u2019ll see unikernels as ready for the desktop! Of course, with\nthe release of the <a href=\"https://docker.com/...\">Docker for Mac</a> and Docker for Windows products,\nit\u2019s arguable that we\u2019ve beaten Linux to that accolade, as both products make\nextensive use of <a href=\"https://mirage.io\">MirageOS</a> unikernel libraries. Having said that, I was\npleased to be told that the message about unikernels having a range of\ndeployment scenarios, and particularly partial deployments into micro-service\nenvironments made sense to many who came to speak to me afterwards.</p>\n<p>This was followed by a slightly expanded version of that talk earlier today at\nthe <a href=\"https://devoxx.be/\">Devoxx Belgium</a> conference. <a href=\"https://devoxx.be/\">Devoxx</a> is primarily a Java community\nso I was interested to see how the talk would go down given that <a href=\"https://mirage.io\">MirageOS</a> is\nstaunchly OCaml-centric, and the <a href=\"http://unikernel.org/\">unikernels</a> movement in general is\nlanguage-specific and (at least until now) somewhat weighted toward functional\nprogramming, our good friends at <a href=\"http://www.includeos.org/\">IncludeOS</a> notwithstanding. In the end it\nseemed to go pretty well, based on what little I could see through the bright\nlights \u2013 maybe one day I\u2019ll get used to that when being videoed! 
Certainly some\ngood questions again, on the specific utility of unikernels to IoT, the\nrelationship between unikernels and Docker, and more besides.</p>\n<p>Anyway, I hope anyone who came to either talk enjoyed it and found it\ninteresting. Happy to respond to comments or questions via email or\non <a href=\"https://twitter.com/mort___\">Twitter</a>!</p>",+"content": "<p>I\u2019ve had the pleasure of giving a couple of talks at some fun venues recently,\nextolling both the virtues of <a href=\"http://unikernel.org/\">unikernels</a> and talking a bit about where we\ncurrently see them as usefully being deployed.</p>\n<p>Specifically, <a href=\"https://operability.io/\">Operability.io 2016</a> a couple of weeks ago was enlightening\nabout some of the problems faced in operating production systems. Some great\naudience questions and follow-ups after the talk, including some who were even\nwondering when we\u2019ll see unikernels as ready for the desktop! Of course, with\nthe release of the <a href=\"https://docker.com/...\">Docker for Mac</a> and Docker for Windows products,\nit\u2019s arguable that we\u2019ve beaten Linux to that accolade, as both products make\nextensive use of <a href=\"https://mirage.io\">MirageOS</a> unikernel libraries. Having said that, I was\npleased to be told that the message about unikernels having a range of\ndeployment scenarios, and particularly partial deployments into micro-service\nenvironments made sense to many who came to speak to me afterwards.</p>\n<p>This was followed by a slightly expanded version of that talk earlier today at\nthe <a href=\"https://devoxx.be/\">Devoxx Belgium</a> conference. 
<a href=\"https://devoxx.be/\">Devoxx</a> is primarily a Java community\nso I was interested to see how the talk would go down given that <a href=\"https://mirage.io\">MirageOS</a> is\nstaunchly OCaml-centric, and the <a href=\"http://unikernel.org/\">unikernels</a> movement in general is\nlanguage-specific and (at least until now) somewhat weighted toward functional\nprogramming, our good friends at <a href=\"http://www.includeos.org/\">IncludeOS</a> notwithstanding. In the end it\nseemed to go pretty well, based on what little I could see through the bright\nlights \u2013 maybe one day I\u2019ll get used to that when being videoed! Certainly some\ngood questions again, on the specific utility of unikernels to IoT, the\nrelationship between unikernels and Docker, and more besides.</p>\n<p>Anyway, I hope anyone who came to either talk enjoyed it and found it\ninteresting. Happy to respond to comments or questions via email or\non <a href=\"https://twitter.com/mort___\">Twitter</a>!</p>",
+18
mort/blog_whither-ai_.json
···+"summary": "<p>I am hardly the first person to comment<a href=\"https://mort.io/blog/whither-ai/#1\">1</a> on this \u2013 I am given to understand\nAI has been a topic of some interest to many for a few years now. I\u2019m sure I\u2019ve\nseen, and possibly even <a href=\"https://mastodon.me.uk/@mort\">re-tooted</a> things about\nit in fact. I\u2019m afraid I just don\u2019t keep up.</p>\n<div>1\n<p>Ok fine. I admit it. This is a rant.</p>\n</div>\n<p>But recent experiences reviewing for a couple of systems/networking venues have\nled me to feel I need to ask: <strong>WHY</strong>? More pointedly, why does the following\nseem like good motivation for a research paper?</p>\n<ol>\n<li>There is a complex and important task that currently requires considerable\nexpertise to carry out because it is important to be precise and get it\nright.</li>\n<li>The task in question can be described imprecisely using natural language by\nnon-experts.</li>\n<li>AI (inevitably, some large-language model) can take that natural language\ndescription and, after training, produce some output that is stochastically\nlike unto what an expert might produce given the same underlying problem,\nhaving brought to bear their expertise.</li>\n<li>Thus we build an AI that can take the non-expert\u2019s imprecise description and\nshow that sometimes the output it produces is not so wrong as to fail some\n<em>ad hoc</em> tests of utility that we introduce.</li>\n</ol>\n<p>Based on things I\u2019ve recently reviewed, \u201cnot so wrong\u201d above means \u201cerror rate of\nno more than 25\u201330% when taking expertly generated natural language prompts as\ninput\u201d. Which is to say, probably not the sorts of input prompt that a\nnon-expert might produce.</p>\n<p>Network configuration and management is the domain I\u2019ve seen this argument made\nin most recently. 
Which seems quite strange to me because I always thought that\na 25% error rate in configuring, e.g., your enterprise network security\nperimeter would be bad. But apparently not if it\u2019s done by an AI.</p>\n<p>More generally, why do we want to build tools that allow untrained experts to do\na job when mistakes are high impact, it requires a trained expert to detect\nthose mistakes, and those tools <em>by design</em> only produce statistically valid\noutput? An error rate of once in a blue moon is categorically worse than a zero\nerror rate if the error involved can leave your entire digital estate open to\ncompromise.</p>\n<p>If the big issue here is that experts sometimes make typos when editing the\nconfiguration files, maybe building some domain-specific languages or better\nuser interfaces or verification techniques or other tooling would be a better\nway to help them not do that than replacing them with tools that <strong>by design</strong>\nare only ever probably about right.</p>\n<p>So please stop justifying your AI application research by saying simply that it\nallows non-experts to carry out expert work! I\u2019m much more likely to be\nconvinced by uses of AI that make experts <em>more productive</em> \u2013 though don\u2019t get\nme started on how to measure productivity because I don\u2019t know except via means\nwhich are expensive and time consuming, and it really seems that very few people\ncan be bothered doing that.</p>",+"content": "<p>I am hardly the first person to comment<a href=\"https://mort.io/blog/whither-ai/#1\">1</a> on this \u2013 I am given to understand\nAI has been a topic of some interest to many for a few years now. I\u2019m sure I\u2019ve\nseen, and possibly even <a href=\"https://mastodon.me.uk/@mort\">re-tooted</a> things about\nit in fact. I\u2019m afraid I just don\u2019t keep up.</p>\n<div>1\n<p>Ok fine. I admit it. 
This is a rant.</p>\n</div>\n<p>But recent experiences reviewing for a couple of systems/networking venues have\nled me to feel I need to ask: <strong>WHY</strong>? More pointedly, why does the following\nseem like good motivation for a research paper?</p>\n<ol>\n<li>There is a complex and important task that currently requires considerable\nexpertise to carry out because it is important to be precise and get it\nright.</li>\n<li>The task in question can be described imprecisely using natural language by\nnon-experts.</li>\n<li>AI (inevitably, some large-language model) can take that natural language\ndescription and, after training, produce some output that is stochastically\nlike unto what an expert might produce given the same underlying problem,\nhaving brought to bear their expertise.</li>\n<li>Thus we build an AI that can take the non-expert\u2019s imprecise description and\nshow that sometimes the output it produces is not so wrong as to fail some\n<em>ad hoc</em> tests of utility that we introduce.</li>\n</ol>\n<p>Based on things I\u2019ve recently reviewed, \u201cnot so wrong\u201d above means \u201cerror rate of\nno more than 25\u201330% when taking expertly generated natural language prompts as\ninput\u201d. Which is to say, probably not the sorts of input prompt that a\nnon-expert might produce.</p>\n<p>Network configuration and management is the domain I\u2019ve seen this argument made\nin most recently. Which seems quite strange to me because I always thought that\na 25% error rate in configuring, e.g., your enterprise network security\nperimeter would be bad. But apparently not if it\u2019s done by an AI.</p>\n<p>More generally, why do we want to build tools that allow untrained experts to do\na job when mistakes are high impact, it requires a trained expert to detect\nthose mistakes, and those tools <em>by design</em> only produce statistically valid\noutput? 
An error rate of once in a blue moon is categorically worse than a zero\nerror rate if the error involved can leave your entire digital estate open to\ncompromise.</p>\n<p>If the big issue here is that experts sometimes make typos when editing the\nconfiguration files, maybe building some domain-specific languages or better\nuser interfaces or verification techniques or other tooling would be a better\nway to help them not do that than replacing them with tools that <strong>by design</strong>\nare only ever probably about right.</p>\n<p>So please stop justifying your AI application research by saying simply that it\nallows non-experts to carry out expert work! I\u2019m much more likely to be\nconvinced by uses of AI that make experts <em>more productive</em> \u2013 though don\u2019t get\nme started on how to measure productivity because I don\u2019t know except via means\nwhich are expensive and time consuming, and it really seems that very few people\ncan be bothered doing that.</p>",
+18
mort/blog_windows-wsl2_.json
···+"summary": "<p>I naively thought I could just use WSL2 on Windows on my new laptop. But it\nturned out this was the year of Linux on the Laptop for me. For posterity\nhere\u2019s the crib sheet though.</p>\n<pre><code><span><span><span>wsl</span></span><span><span><span> --</span>set-default-version</span> 2</span>\n</span><span><span><span>sudo</span></span><span> apt update</span> <span>&&</span> <span><span>sudo</span></span><span> apt upgrade<span><span> -</span>yy</span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> apt install locales</span>\n</span><span><span><span>sudo</span></span><span> locale-gen en_GB.UTF-8</span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> apt install<span><span> -</span>yy</span> emacs-gtk direnv gedit</span>\n</span><span><span><span>git</span></span><span> clone ./..rc-files</span>\n</span><span><span><span>./scripts/install.sh</span></span>\n</span></code></pre>\n<p>Some Windows native packages using <a href=\"https://chocolatey.org/\">Chocolatey</a></p>\n<pre><code><span><span><span>choco</span></span><span> install signal skype wire slack zoom</span>\n</span><span><span><span>choco</span></span><span> install git</span>\n</span><span><span><span>choco</span></span><span> install rustup.install rust-analyzer python</span>\n</span><span><span><span>choco</span></span><span> install docker</span>\n</span><span><span><span>choco</span></span><span> install powertoys dropbox googledrive wiztree</span>\n</span></code></pre>\n<p>However, <code>choco install texlive</code> didn\u2019t work so well, so I fell back to WSL2:\n<code>sudo apt install latexmk texlive-latex-base texlive-xetex ttf-mscorefonts-installer</code>.</p>\n<p>Use MS Powertoys to remap keyboard for <code>CAPSLOCK</code>, <code>\u20ac</code>.</p>\n<p>Timesync is a bit broken, cf <a href=\"https://stackoverflow.com/a/72318510\">https://stackoverflow.com/a/72318510</a>.</p>\n<p>Unfortunately it all went pear-shaped when 
I tried to <code>rsync</code> files across from\nMacOS into Windows/WSL2.</p>\n<p>I mapped the network drive via Network and Sharing Center > Settings > Network &\ninternet > Advanced network settings > Advanced sharing settings > Public\nnetworks > Network discovery = ON</p>\n<p>\u2026and then</p>\n<pre><code><span>\n</span><span><span>for</span><span> d <span>in</span> admin christs docs rc-files research src teaching me</span> <span>;</span> <span>do</span>\n</span><span> <span><span>echo</span></span><span> <span><span>"</span>=== <span><span>$</span><span>d</span></span><span>"</span></span></span>\n</span><span> <span><span>rsync</span></span><span><span><span> -</span>uavzsP</span><span><span> --</span>log-file</span><span>=</span><span><span>$</span><span>d</span></span>.<span><span>$</span><span>(</span><span><span>date</span></span><span><span><span> -</span>Iseconds</span></span><span>)</span></span><span><span> -</span>e</span> ssh mort@IPADDRESS:/Users/mort/<span><span>$</span><span>d</span></span>/ ./<span><span>$</span><span>d</span></span></span>\n</span><span><span>done</span>\n</span></code></pre>\n<p>\u2026but found that moving files to the host mashed things a bit (<code>rw</code> bits\ncarried, but <code>x</code> not; hidden files not; no content translation; owner carried)\nwhile moving host files to wsl was sort-of ok (owner carried, rw bits carried)\nexcept that <code>group</code> and <code>other</code> access bits are all set to whatever the <code>user</code>\naccess bits were.</p>\n<p>See\n<a href=\"https://stackoverflow.com/questions/41513597/how-do-i-access-the-wsl-linux-file-system-from-windows\">https://stackoverflow.com/questions/41513597/how-do-i-access-the-wsl-linux-file-system-from-windows</a>\nfor more, perhaps.</p>\n<p>Getting Docker installed was also rather painful:</p>\n<pre><code><span><span><span>#</span></span><span> remove old distribution dockers</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt 
remove docker.io containerd runc</span> <span>&&</span> <span><span>sudo</span></span><span> apt autoremove</span>\n</span><span>\n</span><span><span><span>#</span></span><span> install dependencies to use an alternative package repo</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt-get update</span> <span>&&</span> <span><span>sudo</span></span><span> apt-get install ca-certificates curl gnupg lsb-release</span>\n</span><span>\n</span><span><span><span>#</span></span><span> install the new package repo</span><span>\n</span></span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>m</span> 0755<span><span> -</span>p</span> /etc/apt/keyrings</span>\n</span><span><span><span>curl</span></span><span><span><span> -</span>fsSL</span> https://download.docker.com/linux/ubuntu/gpg</span> <span>|</span> <span><span>sudo</span></span><span> gpg<span><span> --</span>dearmor</span><span><span> -</span>o</span> /etc/apt/keyrings/docker.gpg</span>\n</span><span><span><span>echo</span></span><span> <span><span>"</span>deb [arch=<span><span>$</span><span>(</span><span><span>dpkg</span></span><span><span><span> --</span>print-architecture</span></span><span>)</span></span> signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu <span><span>$</span><span>(</span><span><span>lsb_release</span></span><span><span><span> -</span>cs</span></span><span>)</span></span> stable<span>"</span></span></span> <span>|</span> <span><span>sudo</span></span><span> tee /etc/apt/sources.list.d/docker.list <span>></span> /dev/null</span>\n</span><span>\n</span><span><span><span>#</span></span><span> install up-to-date Docker</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt-get update</span> <span>&&</span> <span><span>sudo</span></span><span> apt-get install <span>\\\n</span></span></span><span><span> docker-ce <span>\\\n</span></span></span><span><span> docker-ce-cli 
<span>\\\n</span></span></span><span><span> containerd.io <span>\\\n</span></span></span><span><span> docker-buildx-plugin <span>\\\n</span></span></span><span><span> docker-compose-plugin</span>\n</span></code></pre>\n<p>Finally, some further references that may or may not be useful:</p>\n<ul>\n<li><a href=\"https://stephenreescarter.net/how-to-shrink-a-wsl2-virtual-disk/\">https://stephenreescarter.net/how-to-shrink-a-wsl2-virtual-disk/</a></li>\n<li><a href=\"https://www.linkedin.com/pulse/acceso-wsl2-desde-windows-con-samba-manuel-nicol%C3%A1s-ortu%C3%B1o/\">https://www.linkedin.com/pulse/acceso-wsl2-desde-windows-con-samba-manuel-nicol%C3%A1s-ortu%C3%B1o/</a></li>\n<li><a href=\"https://www.howtogeek.com/193013/how-to-create-an-encrypted-container-file-with-bitlocker-on-windows/\">https://www.howtogeek.com/193013/how-to-create-an-encrypted-container-file-with-bitlocker-on-windows/</a></li>\n</ul>",+"content": "<p>I naively thought I could just use WSL2 on Windows on my new laptop. But it\nturned out this was the year of Linux on the Laptop for me. 
For posterity\nhere\u2019s the crib sheet though.</p>\n<pre><code><span><span><span>wsl</span></span><span><span><span> --</span>set-default-version</span> 2</span>\n</span><span><span><span>sudo</span></span><span> apt update</span> <span>&&</span> <span><span>sudo</span></span><span> apt upgrade<span><span> -</span>yy</span></span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> apt install locales</span>\n</span><span><span><span>sudo</span></span><span> locale-gen en_GB.UTF-8</span>\n</span><span>\n</span><span><span><span>sudo</span></span><span> apt install<span><span> -</span>yy</span> emacs-gtk direnv gedit</span>\n</span><span><span><span>git</span></span><span> clone ./..rc-files</span>\n</span><span><span><span>./scripts/install.sh</span></span>\n</span></code></pre>\n<p>Some Windows native packages using <a href=\"https://chocolatey.org/\">Chocolatey</a></p>\n<pre><code><span><span><span>choco</span></span><span> install signal skype wire slack zoom</span>\n</span><span><span><span>choco</span></span><span> install git</span>\n</span><span><span><span>choco</span></span><span> install rustup.install rust-analyzer python</span>\n</span><span><span><span>choco</span></span><span> install docker</span>\n</span><span><span><span>choco</span></span><span> install powertyos dropbox googledrive wiztree</span>\n</span></code></pre>\n<p>However, <code>choco install texlive</code> didn\u2019t work so well, so I fell back to WSL2:\n<code>sudo apt install latexmk texlive-latex-base texlive-xetex ttf-mscorefonts-installer</code>.</p>\n<p>Use MS Powertoys to remap keyboard for <code>CAPSLOCK</code>, <code>\u20ac</code>.</p>\n<p>Timesync is a bit broken, cf <a href=\"https://stackoverflow.com/a/72318510\">https://stackoverflow.com/a/72318510</a>.</p>\n<p>Unfortunately it all went pear-shaped when I tried to <code>rsync</code> files across from\nMacOS into Windows/WSL2.</p>\n<p>I mapped the network drive via Network and Sharing Center > Settings > 
Network &\ninternet > Advanced network settings > Advanced sharing settings > Public\nnetworks > Network discovery = ON</p>\n<p>\u2026and then</p>\n<pre><code><span>\n</span><span><span>for</span><span> d <span>in</span> admin christs docs rc-files research src teaching me</span> <span>;</span> <span>do</span>\n</span><span> <span><span>echo</span></span><span> <span><span>"</span>=== <span><span>$</span><span>d</span></span><span>"</span></span></span>\n</span><span> <span><span>rsync</span></span><span><span><span> -</span>uavzsP</span><span><span> --</span>log-file</span><span>=</span><span><span>$</span><span>d</span></span>.<span><span>$</span><span>(</span><span><span>date</span></span><span><span><span> -</span>Iseconds</span></span><span>)</span></span><span><span> -</span>e</span> ssh mort@IPADDRESS:/Users/mort/<span><span>$</span><span>d</span></span>/ ./<span><span>$</span><span>d</span></span></span>\n</span><span><span>done</span>\n</span></code></pre>\n<p>\u2026but found that moving files to the host mashed things a bit (<code>rw</code> bits\ncarried, but <code>x</code> not; hidden files not; no content translation; owner carried)\nwhile moving host files to wsl was sort-of ok (owner carried, rw bits carried)\nexcept that <code>group</code> and <code>other</code> access bits are all set to whatever the <code>user</code>\naccess bits were.</p>\n<p>See\n<a href=\"https://stackoverflow.com/questions/41513597/how-do-i-access-the-wsl-linux-file-system-from-windows\">https://stackoverflow.com/questions/41513597/how-do-i-access-the-wsl-linux-file-system-from-windows</a>\nfor more, perhaps.</p>\n<p>Getting Docker installed was also rather painful:</p>\n<pre><code><span><span><span>#</span></span><span> remove old distribution dockers</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt remove docker.io containerd runc</span> <span>&&</span> <span><span>sudo</span></span><span> apt 
autoremove</span>\n</span><span>\n</span><span><span><span>#</span></span><span> install dependencies to use an alternative package repo</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt-get update</span> <span>&&</span> <span><span>sudo</span></span><span> apt-get install ca-certificates curl gnupg lsb-release</span>\n</span><span>\n</span><span><span><span>#</span></span><span> install the new package repo</span><span>\n</span></span><span><span><span>sudo</span></span><span> mkdir<span><span> -</span>m</span> 0755<span><span> -</span>p</span> /etc/apt/keyrings</span>\n</span><span><span><span>curl</span></span><span><span><span> -</span>fsSL</span> https://download.docker.com/linux/ubuntu/gpg</span> <span>|</span> <span><span>sudo</span></span><span> gpg<span><span> --</span>dearmor</span><span><span> -</span>o</span> /etc/apt/keyrings/docker.gpg</span>\n</span><span><span><span>echo</span></span><span> <span><span>"</span>deb [arch=<span><span>$</span><span>(</span><span><span>dpkg</span></span><span><span><span> --</span>print-architecture</span></span><span>)</span></span> signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu <span><span>$</span><span>(</span><span><span>lsb_release</span></span><span><span><span> -</span>cs</span></span><span>)</span></span> stable<span>"</span></span></span> <span>|</span> <span><span>sudo</span></span><span> tee /etc/apt/sources.list.d/docker.list <span>></span> /dev/null</span>\n</span><span>\n</span><span><span><span>#</span></span><span> install up-to-date Docker</span><span>\n</span></span><span><span><span>sudo</span></span><span> apt-get update</span> <span>&&</span> <span><span>sudo</span></span><span> apt-get install <span>\\\n</span></span></span><span><span> docker-ce <span>\\\n</span></span></span><span><span> docker-ce-cli <span>\\\n</span></span></span><span><span> containerd.io <span>\\\n</span></span></span><span><span> docker-buildx-plugin 
<span>\\\n</span></span></span><span><span> docker-compose-plugin</span>\n</span></code></pre>\n<p>Finally, some further references that may or may not be useful:</p>\n<ul>\n<li><a href=\"https://stephenreescarter.net/how-to-shrink-a-wsl2-virtual-disk/\">https://stephenreescarter.net/how-to-shrink-a-wsl2-virtual-disk/</a></li>\n<li><a href=\"https://www.linkedin.com/pulse/acceso-wsl2-desde-windows-con-samba-manuel-nicol%C3%A1s-ortu%C3%B1o/\">https://www.linkedin.com/pulse/acceso-wsl2-desde-windows-con-samba-manuel-nicol%C3%A1s-ortu%C3%B1o/</a></li>\n<li><a href=\"https://www.howtogeek.com/193013/how-to-create-an-encrypted-container-file-with-bitlocker-on-windows/\">https://www.howtogeek.com/193013/how-to-create-an-encrypted-container-file-with-bitlocker-on-windows/</a></li>\n</ul>",
+18
mort/blog_wipeout_.json
···+"summary": "<p>I recently decided to go through some old PCs and hard disks (yes, actual\nspinning bits of metal) and recycle or simply junk them. I figured I should wipe\nthem properly first, and given that they had been installed with a range of OSs,\nsetup a bootable USB stick so that I could boot and wipe in one easy motion.</p>\n<ul>\n<li>Download <a href=\"https://alpinelinux.org/downloads/\">Alpine Linux</a> <a href=\"https://dl-cdn.alpinelinux.org/alpine/v3.8/releases/x86_64/alpine-extended-3.8.1-x86_64.iso\">3.8.1\nISO</a>\n\u2013 I had only <code>x86_64</code> machines, YMMV obviously</li>\n<li>Write this to a USB stick using <code>dd</code> (on Linux), or <a href=\"https://etcher.balena.io/\">Balena\nEtcher</a> on Windows</li>\n<li>Insert the stick and boot the machine after making any necessary BIOS changes</li>\n<li>If the machine has been unbooted for too long or (as one of mine) has a flat\nCMOS clock battery so cannot retain time across reboots, set the time\n<ul>\n<li>manually: <code>date --set=\"20181217\"</code></li>\n<li>automatically: <code>setup-alpine</code> to start configuring things, then <code>CTRL-C</code>\nafter network setup, and execute <code>setup-ntp</code></li>\n</ul>\n</li>\n<li>Then <code>setup-alpine; apk add coreutils</code></li>\n<li>Finally, <code>shred --verbose /dev/sdXN</code> where <code>X</code> is the device id and <code>N</code> the\npartition number (e.g., <code>/dev/sda2</code>), or <code>shred --verbose -n1 /dev/sdXN</code> if\nyou\u2019re using a modern disk (apparently) and only want one pass of random data</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/wipeout/#installing-alpine\">Installing Alpine</a></h2>\n<p>After installing Alpine as above:</p>\n<ul>\n<li><code>adduser mort</code></li>\n<li>create <code>~mort/.ssh/authorized_keys</code> containing your preferred public key, and\nset permissions (<code>chmod 600 ~mort/.ssh/authorized_keys</code>)</li>\n<li><code>apk add sudo</code> and then <code>visudo</code> to allow members of group <code>sudo</code> to <code>sudo</code></li>\n<li>add <code>mort</code> to group <code>sudo</code></li>\n<li>logout and then back in</li>\n</ul>\n<p>You can then configure storage as you see fit; it seems I once did, probably\nusing <code>sudo</code>:</p>\n<pre><code>apk add lvm2 git bash xfsprogs\npvcreate /dev/sd[bc] # create physical volumes\nvgextend vg0 /dev/sdb # assign storage devices to volume group\nvgextend vg0 /dev/sdc #\nlvcreate --name lv_home --size 60G vg0 # create logical volume `lv_home`\nmkfs.xfs /dev/vg0/lv_home # format `lv_home` using XFS\nlvcreate --name lv_backup --size 60G vg0 # create logical volume `lv_backup`\nmkfs.xfs /dev/vg0/lv_backup # format `lv_backup` using XFS\n\n# create `/etc/fstab` to match the above configuration\ncat >>/etc/fstab <<EOF\n/dev/vg0/lv_home\t/home\txfs\tdefaults\t0 0\n/dev/vg0/lv_backup\t/backup\txfs\tdefaults\t0 0\nEOF\nmount -a -v # mount everything, verbosely\n</code></pre>",+"content": "<p>I recently decided to go through some old PCs and hard disks (yes, actual\nspinning bits of metal) and recycle or simply junk them. I figured I should wipe\nthem properly first, and given that they had been installed with a range of OSs,\nsetup a bootable USB stick so that I could boot and wipe in one easy motion.</p>\n<ul>\n<li>Download <a href=\"https://alpinelinux.org/downloads/\">Alpine Linux</a> <a href=\"https://dl-cdn.alpinelinux.org/alpine/v3.8/releases/x86_64/alpine-extended-3.8.1-x86_64.iso\">3.8.1\nISO</a>\n\u2013 I had only <code>x86_64</code> machines, YMMV obviously</li>\n<li>Write this to a USB stick using <code>dd</code> (on Linux), or <a href=\"https://etcher.balena.io/\">Balena\nEtcher</a> on Windows</li>\n<li>Insert the stick and boot the machine after making any necessary BIOS changes</li>\n<li>If the machine has been unbooted for too long or (as one of mine) has a flat\nCMOS clock battery so cannot retain time across reboots, set the time\n<ul>\n<li>manually: <code>date --set=\"20181217\"</code></li>\n<li>automatically: <code>setup-alpine</code> to start configuring things, then <code>CTRL-C</code>\nafter network setup, and execute <code>setup-ntp</code></li>\n</ul>\n</li>\n<li>Then <code>setup-alpine; apk add coreutils</code></li>\n<li>Finally, <code>shred --verbose /dev/sdXN</code> where <code>X</code> is the device id and <code>N</code> the\npartition number (e.g., <code>/dev/sda2</code>), or <code>shred --verbose -n1 /dev/sdXN</code> if\nyou\u2019re using a modern disk (apparently) and only want one pass of random data</li>\n</ul>\n<h2><a href=\"https://mort.io/blog/wipeout/#installing-alpine\">Installing Alpine</a></h2>\n<p>After installing Alpine as above:</p>\n<ul>\n<li><code>adduser mort</code></li>\n<li>create <code>~mort/.ssh/authorized_keys</code> containing your preferred public key, and\nset permissions (<code>chmod 600 ~mort/.ssh/authorized_keys</code>)</li>\n<li><code>apk add sudo</code> and then <code>visudo</code> to allow members of group <code>sudo</code> to <code>sudo</code></li>\n<li>add <code>mort</code> to group <code>sudo</code></li>\n<li>logout and then back in</li>\n</ul>\n<p>You can then configure storage as you see fit; it seems I once did, probably\nusing <code>sudo</code>:</p>\n<pre><code>apk add lvm2 git bash xfsprogs\npvcreate /dev/sd[bc] # create physical volumes\nvgextend vg0 /dev/sdb # assign storage devices to volume group\nvgextend vg0 /dev/sdc #\nlvcreate --name lv_home --size 60G vg0 # create logical volume `lv_home`\nmkfs.xfs /dev/vg0/lv_home # format `lv_home` using XFS\nlvcreate --name lv_backup --size 60G vg0 # create logical volume `lv_backup`\nmkfs.xfs /dev/vg0/lv_backup # format `lv_backup` using XFS\n\n# create `/etc/fstab` to match the above configuration\ncat >>/etc/fstab <<EOF\n/dev/vg0/lv_home\t/home\txfs\tdefaults\t0 0\n/dev/vg0/lv_backup\t/backup\txfs\tdefaults\t0 0\nEOF\nmount -a -v # mount everything, verbosely\n</code></pre>",
+18
mort/blog_workshopping-edgeless_.json
···+"summary": "<p>One of the pleasures of being an academic is to travel to nice places to meet\ninteresting people, and interesting places to meet nice people. In one of my\nfirst such trips for a few years I recently went to Sweden to participate in the\n<a href=\"https://cloudresearch.org/workshops/17th/\">17th Cloud Control Workshop</a>.</p>\n <a href=\"https://cloudresearch.org/workshops/17th/\"><img alt=\"A photograph of the presenter in front of a slide\" height=\"1\" src=\"https://mort.io/blog/workshopping-edgeless/photo.jpg\" width=\"480\"></a>\n<p>I\u2019d previously attended the <a href=\"https://cloudresearch.org/workshops/15th/\">15th Cloud Control\nWorkshop</a> shortly before the pandemic\nhappened, causing the series to pause briefly. This was the reboot, and as\nbefore, it was a great deal of fun: good company, good food, beautiful location.\nIf you get the chance to go, take it if you can!</p>\n<p>The workshop is a really nice mix of keynote presentations \u2013 not too many and\nnot too long at 20 minutes (mostly) \u2013 and discussion sessions proposed\nbeforehand or on site by participants. I gave one of the keynotes, talking about\nthe challenges posed and opportunities offered by edge computing. I can share\nslides on request, or when I decide on a good way to publish them!</p>\n<p>I also enjoyed many good discussions and conversations, topped off with a great\nBBQ, but two topics stand out. First, a great discussion session organised by\n<a href=\"https://anakli.inf.ethz.ch/\">Prof. Ana Klimovi\u0107</a> about serverless computing,\nwhich immediately triggered some thoughts about possible followup publications\nfrom <a href=\"https://edgeless-project.eu/\">EDGELESS</a>, and collaboration possibilities.</p>\n<p>Second, several conversations on a topic I know little about, low-earth orbit\n(LEO) satellite networking, and what could be done with it.</p>\n<p>For what it\u2019s worth, my thought was whether a LEO constellation plus a little\nedge compute could provide a difficult-to-disrupt out-of-band monitoring network\nfor critical infrastructure like datacenters and power grids: although data\nbandwidth is limited and a bit complicated due to the constraints on ground\nstations, they have pretty decent cameras, so why not use those to capture and\nlocally process images of the roofs of large rectangular buildings like\ndatacenters and the like, which could continually display various status data?</p>\n<p>Such status information would only need to be locally generated, so it would\ntake fairly substantial physical disruption of the facility (also presumably\nnoticeable by camera) to prevent that working. Alternatively, the constellation\n(perhaps shared across multiple facilities in different jurisdictions) would\nneed to be substantially disrupted to prevent it being able to monitor targets.</p>\n<p>It was a fun discussion anyway, combining a network technology I knew little\nabout with possibly interesting applications of edge computing to resiliency.</p>\n<p>Postscript: in case you\u2019re curious, the title is a passing reference to <a href=\"https://en.wikipedia.org/wiki/Star_Trekkin%27\">Star\nTrekkin\u2019</a>, a fine popular music\nsingle from my youth. Which has a much longer back-story on Wikipedia than I had\nanticipated.</p>",+"content": "<p>One of the pleasures of being an academic is to travel to nice places to meet\ninteresting people, and interesting places to meet nice people. In one of my\nfirst such trips for a few years I recently went to Sweden to participate in the\n<a href=\"https://cloudresearch.org/workshops/17th/\">17th Cloud Control Workshop</a>.</p>\n <a href=\"https://cloudresearch.org/workshops/17th/\"><img alt=\"A photograph of the presenter in front of a slide\" height=\"1\" src=\"https://mort.io/blog/workshopping-edgeless/photo.jpg\" width=\"480\"></a>\n<p>I\u2019d previously attended the <a href=\"https://cloudresearch.org/workshops/15th/\">15th Cloud Control\nWorkshop</a> shortly before the pandemic\nhappened, causing the series to pause briefly. This was the reboot, and as\nbefore, it was a great deal of fun: good company, good food, beautiful location.\nIf you get the chance to go, take it if you can!</p>\n<p>The workshop is a really nice mix of keynote presentations \u2013 not too many and\nnot too long at 20 minutes (mostly) \u2013 and discussion sessions proposed\nbeforehand or on site by participants. I gave one of the keynotes, talking about\nthe challenges posed and opportunities offered by edge computing. I can share\nslides on request, or when I decide on a good way to publish them!</p>\n<p>I also enjoyed many good discussions and conversations, topped off with a great\nBBQ, but two topics stand out. First, a great discussion session organised by\n<a href=\"https://anakli.inf.ethz.ch/\">Prof. Ana Klimovi\u0107</a> about serverless computing,\nwhich immediately triggered some thoughts about possible followup publications\nfrom <a href=\"https://edgeless-project.eu/\">EDGELESS</a>, and collaboration possibilities.</p>\n<p>Second, several conversations on a topic I know little about, low-earth orbit\n(LEO) satellite networking, and what could be done with it.</p>\n<p>For what it\u2019s worth, my thought was whether a LEO constellation plus a little\nedge compute could provide a difficult-to-disrupt out-of-band monitoring network\nfor critical infrastructure like datacenters and power grids: although data\nbandwidth is limited and a bit complicated due to the constraints on ground\nstations, they have pretty decent cameras, so why not use those to capture and\nlocally process images of the roofs of large rectangular buildings like\ndatacenters and the like, which could continually display various status data?</p>\n<p>Such status information would only need to be locally generated, so it would\ntake fairly substantial physical disruption of the facility (also presumably\nnoticeable by camera) to prevent that working. Alternatively, the constellation\n(perhaps shared across multiple facilities in different jurisdictions) would\nneed to be substantially disrupted to prevent it being able to monitor targets.</p>\n<p>It was a fun discussion anyway, combining a network technology I knew little\nabout with possibly interesting applications of edge computing to resiliency.</p>\n<p>Postscript: in case you\u2019re curious, the title is a passing reference to <a href=\"https://en.wikipedia.org/wiki/Star_Trekkin%27\">Star\nTrekkin\u2019</a>, a fine popular music\nsingle from my youth. Which has a much longer back-story on Wikipedia than I had\nanticipated.</p>",
+18
mort/blog_zen-and-the-art-of-research-management_.json
···+"summary": "<p>I think this is a bit of a classic, the first written form of which I came\nacross in <a href=\"https://www.cl.cam.ac.uk/misc/obituaries/needham/\">Prof. Roger Needham</a>\u2019s <a href=\"https://www.cl.cam.ac.uk/events/50+5/\">50+5 Festschrift</a> celebrating his\ntime at the <a href=\"https://www.cl.cam.ac.uk/\">Cambridge University Computer Lab</a> and\n<a href=\"https://www.microsoft.com/en-us/research/lab/microsoft-research-cambridge/\">Microsoft Research Cambridge</a>. I don\u2019t know who originated it, but the\ncopy there is certainly due to <a href=\"https://memex.naughtons.org/\">John Naughton</a> and <a href=\"https://en.wikipedia.org/wiki/Robert_Taylor_(computer_scientist)\">Bob Taylor</a>. I\nsuppose one might quibble point 12, in that I seem to recall Roger did a lot of\npacing about, but a good chair is certainly a worthwhile thing to provide.</p>\n<p>Anyway, I find myself wanting to point at it from time-to-time, so here it is!</p>\n<p>By <a href=\"https://memex.naughtons.org/\">John Naughton</a> (<em>Open University, Milton Keynes, England</em>), and <a href=\"https://en.wikipedia.org/wiki/Robert_Taylor_(computer_scientist)\">Robert\nW. Taylor</a> (<em>Woodside, California, USA</em>).</p>\n<ol>\n<li>\n<p>HIRE ONLY THE VERY BEST PEOPLE, EVEN IF THEY ARE CUSSED. Perhaps especially\nif they are cussed. Your guiding principle should be to employ people who\nare smarter than you. One superb researcher is worth dozens of merely good\nones.</p>\n</li>\n<li>\n<p>ONCE YOU\u2019VE GOT THEM, TRUST THEM. Do not attempt to micro-manage talented\npeople. (Remember rule #1.) Set broad goals and leave them to it.\nConcentrate your own efforts on strategy and nurturing the environment.</p>\n</li>\n<li>\n<p>PROTECT YOUR RESEARCHERS FROM EXTERNAL INTERFERENCE, whether from company\npersonnel officers, senior executives or security personnel. 
Remember that\nyour job is to create a supportive and protective space within which they\ncan work.</p>\n</li>\n<li>\n<p>MUCH OF WHAT YOU DO WILL FALL INTO THE CATEGORY OF ABSORBING THE UNCERTAINTY\nOF YOUR RESEARCHERS.</p>\n</li>\n<li>\n<p>REMEMBER THAT YOU ARE A CONDUCTOR, NOT A SOLOIST. (Rule #1 again.) The Lab\nis your performance.</p>\n</li>\n<li>\n<p>DO NOT PAY TOO MUCH ATTENTION TO \u2018RELEVANCE,\u2019 \u2018DELIVERABLES\u2019 and other\nconcepts beloved of Senior Management.</p>\n</li>\n<li>\n<p>REMEMBER THAT CREATIVE PEOPLE ARE LIKE HEARTS \u2013 they go where they are\nappreciated. They can be inspired or led, but not managed.</p>\n</li>\n<li>\n<p>KEEP THE ORGANISATION CHART SHALLOW. Never let the Lab grow beyond the point\nwhere you cannot fit everyone comfortably in the same room.</p>\n</li>\n<li>\n<p>MAKE YOUR RESEARCHERS DEBATE WITH ONE ANOTHER REGULARLY. Let them tear one\nanother\u2019s ideas to pieces. Ensure frank communication among them. Observe\nthe strengths and weaknesses which emerge in the process.</p>\n</li>\n<li>\n<p>BE NICE TO GRADUATE STUDENTS. One day they may keep you, even if only as a\nmascot. (Moreover, they are a lot of fun!)</p>\n</li>\n<li>\n<p>INSTALL A WORLD-CLASS COFFEE MACHINE and provide plenty of free soft drinks.</p>\n</li>\n<li>\n<p>BUY AERON CHAIRS. Remember that most computer science research is done\nsitting down.</p>\n</li>\n<li>\n<p>INSTITUTE A \u2018TOY BUDGET\u2019, enabling anyone in the Lab to buy anything costing\nless than a specified amount on their own authority. And provide a darkened\nrecovery room for accountants shocked by the discovery of this budget.</p>\n</li>\n<li>\n<p>PAY ATTENTION TO WHAT GOES ON IN UNIVERSITIES. 
Every significant\nbreakthrough in computing in the last four decades has involved both the\nuniversity and corporate sectors at some point in its evolution.</p>\n</li>\n<li>\n<p>REMEMBER TO INITIATE AND SPONSOR CELEBRATIONS when merited.</p>\n</li>\n<li>\n<p>WHEN IN DOUBT, ASK YOURSELF: \u201cWHAT WOULD ROGER NEEDHAM DO IN SIMILAR\nCIRCUMSTANCES?\u201d</p>\n</li>\n</ol>",+"content": "<p>I think this is a bit of a classic, the first written form of which I came\nacross in <a href=\"https://www.cl.cam.ac.uk/misc/obituaries/needham/\">Prof. Roger Needham</a>\u2019s <a href=\"https://www.cl.cam.ac.uk/events/50+5/\">50+5 Festschrift</a> celebrating his\ntime at the <a href=\"https://www.cl.cam.ac.uk/\">Cambridge University Computer Lab</a> and\n<a href=\"https://www.microsoft.com/en-us/research/lab/microsoft-research-cambridge/\">Microsoft Research Cambridge</a>. I don\u2019t know who originated it, but the\ncopy there is certainly due to <a href=\"https://memex.naughtons.org/\">John Naughton</a> and <a href=\"https://en.wikipedia.org/wiki/Robert_Taylor_(computer_scientist)\">Bob Taylor</a>. I\nsuppose one might quibble point 12, in that I seem to recall Roger did a lot of\npacing about, but a good chair is certainly a worthwhile thing to provide.</p>\n<p>Anyway, I find myself wanting to point at it from time-to-time, so here it is!</p>\n<p>By <a href=\"https://memex.naughtons.org/\">John Naughton</a> (<em>Open University, Milton Keynes, England</em>), and <a href=\"https://en.wikipedia.org/wiki/Robert_Taylor_(computer_scientist)\">Robert\nW. Taylor</a> (<em>Woodside, California, USA</em>).</p>\n<ol>\n<li>\n<p>HIRE ONLY THE VERY BEST PEOPLE, EVEN IF THEY ARE CUSSED. Perhaps especially\nif they are cussed. Your guiding principle should be to employ people who\nare smarter than you. One superb researcher is worth dozens of merely good\nones.</p>\n</li>\n<li>\n<p>ONCE YOU\u2019VE GOT THEM, TRUST THEM. Do not attempt to micro-manage talented\npeople. (Remember rule #1.) 
Set broad goals and leave them to it.\nConcentrate your own efforts on strategy and nurturing the environment.</p>\n</li>\n<li>\n<p>PROTECT YOUR RESEARCHERS FROM EXTERNAL INTERFERENCE, whether from company\npersonnel officers, senior executives or security personnel. Remember that\nyour job is to create a supportive and protective space within which they\ncan work.</p>\n</li>\n<li>\n<p>MUCH OF WHAT YOU DO WILL FALL INTO THE CATEGORY OF ABSORBING THE UNCERTAINTY\nOF YOUR RESEARCHERS.</p>\n</li>\n<li>\n<p>REMEMBER THAT YOU ARE A CONDUCTOR, NOT A SOLOIST. (Rule #1 again.) The Lab\nis your performance.</p>\n</li>\n<li>\n<p>DO NOT PAY TOO MUCH ATTENTION TO \u2018RELEVANCE,\u2019 \u2018DELIVERABLES\u2019 and other\nconcepts beloved of Senior Management.</p>\n</li>\n<li>\n<p>REMEMBER THAT CREATIVE PEOPLE ARE LIKE HEARTS \u2013 they go where they are\nappreciated. They can be inspired or led, but not managed.</p>\n</li>\n<li>\n<p>KEEP THE ORGANISATION CHART SHALLOW. Never let the Lab grow beyond the point\nwhere you cannot fit everyone comfortably in the same room.</p>\n</li>\n<li>\n<p>MAKE YOUR RESEARCHERS DEBATE WITH ONE ANOTHER REGULARLY. Let them tear one\nanother\u2019s ideas to pieces. Ensure frank communication among them. Observe\nthe strengths and weaknesses which emerge in the process.</p>\n</li>\n<li>\n<p>BE NICE TO GRADUATE STUDENTS. One day they may keep you, even if only as a\nmascot. (Moreover, they are a lot of fun!)</p>\n</li>\n<li>\n<p>INSTALL A WORLD-CLASS COFFEE MACHINE and provide plenty of free soft drinks.</p>\n</li>\n<li>\n<p>BUY AERON CHAIRS. Remember that most computer science research is done\nsitting down.</p>\n</li>\n<li>\n<p>INSTITUTE A \u2018TOY BUDGET\u2019, enabling anyone in the Lab to buy anything costing\nless than a specified amount on their own authority. And provide a darkened\nrecovery room for accountants shocked by the discovery of this budget.</p>\n</li>\n<li>\n<p>PAY ATTENTION TO WHAT GOES ON IN UNIVERSITIES. 
Every significant\nbreakthrough in computing in the last four decades has involved both the\nuniversity and corporate sectors at some point in its evolution.</p>\n</li>\n<li>\n<p>REMEMBER TO INITIATE AND SPONSOR CELEBRATIONS when merited.</p>\n</li>\n<li>\n<p>WHEN IN DOUBT, ASK YOURSELF: \u201cWHAT WOULD ROGER NEEDHAM DO IN SIMILAR\nCIRCUMSTANCES?\u201d</p>\n</li>\n</ol>",
+2
-2
mort/metadata.json
+20
mte/2014_01_02_narcissistic-numbers.json
···+"summary": "I heard about these on BBC Radio 4 More or Less and they just intrigued me, perhaps in part because they have no known application! In the past similar obsessions have appeared with the calculation of PI and right back to my childhood calculating powers of 2 on a BBC Micro.",+"content": "<p>I heard about these on <a href=\"http://www.bbc.co.uk/programmes/b006qshd\">BBC Radio 4 More or\nLess</a> and they just intrigued\nme, perhaps in part because they have no known application! In the past\nsimilar obsessions have appeared with the calculation of PI and right\nback to my childhood calculating powers of 2 on a BBC Micro.</p>\n\n<p>The full definition, as for everything, is on\n<a href=\"https://en.wikipedia.org/wiki/Narcissistic_number\">Wikipedia</a> but in\nshort a narcissistic number is one where the sum of the digits raised to\nthe power of the number of digits equals the number itself. For example</p>\n\n\\[153\u00a0=\u00a01^3 + 5^3 + 3^3\\]\n\n<p>Here\u2019s some quick and dirty Perl code to calculate them:</p>\n\n<div><div><pre><code>use strict;\nuse warnings;\n\nfor (my $i = 10; $i < 10000; $i++) {\n my $pwr = length($i);\n my $total = 0;\n for (my $j = 0; $j < $pwr; $j++) {\n $total += int(substr $i, $j, 1) ** $pwr;\n }\n if ($total == $i) {\n print $i . \" is narcissistic\\n\";\n }\n}\n</code></pre></div></div>\n\n<p>This yields this output</p>\n\n<div><div><pre><code>153 is narcissistic\n370 is narcissistic\n371 is narcissistic\n407 is narcissistic\n1634 is narcissistic\n8208 is narcissistic\n9474 is narcissistic\n</code></pre></div></div>\n\n<p>However, due to the typical limitation in the implementation of integers\nthis doesn\u2019t get you very far. 
Perl\u2019s <code>Math::BigInt</code> gets you further if\nyou are very patient</p>\n\n<div><div><pre><code>use strict;\nuse warnings;\nuse Math::BigInt;\n\nmy $i = Math::BigInt->bone();\n\nwhile ((my $pwr = $i->length()) < 10) {\n my $total = Math::BigInt->bzero;\n for (my $j = 0; $j < $pwr; $j++) {\n my $t = Math::BigInt->new($i->digit($j));\n $total->badd($t->bpow($pwr));\n }\n if ($total == $i) {\n print $i . \" is narcissistic\\n\";\n }\n $i->binc();\n}\n</code></pre></div></div>",
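The same search translates naturally into other languages. Here is a minimal Python sketch of the digit-power test (names are mine, not from the post); since Python integers are arbitrary precision, no `Math::BigInt` equivalent is needed, only patience for large ranges:

```python
def is_narcissistic(n: int) -> bool:
    """True when the sum of each digit raised to the number of digits equals n."""
    digits = str(n)
    power = len(digits)
    return n == sum(int(d) ** power for d in digits)

# All multi-digit narcissistic numbers below 10,000
found = [n for n in range(10, 10_000) if is_narcissistic(n)]
print(found)  # [153, 370, 371, 407, 1634, 8208, 9474]
```

The output matches the seven numbers the Perl version prints.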
+20
mte/2015_01_19_mandlebrot-set.json
···+"summary": "The Mandelbrot set is created from this very simple formula in which both Z and C are complex numbers.",+"content": "<p>The Mandelbrot set is created from this very simple formula in which both Z and C are complex numbers.</p>\n\n\\[Z_{n+1}=Z_n^2+c\\]\n\n<p>The formula is iterated to determine whether Z is bounded or tends to infinity. To demonstrate this, assume a test case where the imaginary part is zero and focus just on the real part. In this case, the formula is trivial to evaluate starting with Z = 0. The table below shows the outcome at C=0.2 and C=0.3, where one is clearly bounded and the other is not!</p>\n\n<table>\n<thead>\n<tr><th>Iteration</th><th>C = 0.2</th><th>C = 0.3</th></tr>\n</thead>\n<tbody>\n<tr><td>0</td><td>0</td><td>0</td></tr>\n<tr><td>1</td><td>0.2</td><td>0.3</td></tr>\n<tr><td>2</td><td>0.24</td><td>0.39</td></tr>\n<tr><td>3</td><td>0.2576</td><td>0.4521</td></tr>\n<tr><td>4</td><td>0.266358</td><td>0.504394</td></tr>\n<tr><td>5</td><td>0.270946</td><td>0.554414</td></tr>\n<tr><td>6</td><td>0.273412</td><td>0.607375</td></tr>\n<tr><td>7</td><td>0.274754</td><td>0.668904</td></tr>\n<tr><td>8</td><td>0.27549</td><td>0.747432</td></tr>\n<tr><td>9</td><td>0.275895</td><td>0.858655</td></tr>\n<tr><td>10</td><td>0.276118</td><td>1.037289</td></tr>\n<tr><td>11</td><td>0.276241</td><td>1.375968</td></tr>\n<tr><td>12</td><td>0.276309</td><td>2.193288</td></tr>\n<tr><td>13</td><td>0.276347</td><td>5.110511</td></tr>\n<tr><td>14</td><td>0.276368</td><td>26.41732</td></tr>\n<tr><td>15</td><td>0.276379</td><td>698.1747</td></tr>\n<tr><td>16</td><td>0.276385</td><td>487448.2</td></tr>\n<tr><td>17</td><td>0.276389</td><td>2.38E+11</td></tr>\n<tr><td>18</td><td>0.276391</td><td>5.65E+22</td></tr>\n</tbody>\n</table>\n\n<p>C=0.2 is said to be part of the set where C=0.3 is not. Typically the point is coloured by some arbitrary function of the number of iterations it took for the modulus of Z to exceed 2.</p>\n\n<p>The set is plotted on the complex number plane with the real part using the x-axis and the imaginary part using the y-axis, thus:</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/complex-plane.svg\"></p>\n\n<p>Given that computers don\u2019t natively work with complex numbers we need to break the formula down into manageable pieces. 
Firstly write the formula including both the real and complex parts then expand the brackets and group the terms.</p>\n\n\\[Z_{n+1}=Z_n^2+c\\]\n\n\\[Z_{n+1}=(Z_{re}+Z_{im}i)^2+c_{re}+c_{im}i\\]\n\n\\[Z_{n+1}=Z_{re}^2-Z_{im}^2+2Z_{re}Z_{im}i+c_{re}+c_{im}i\\]\n\n\\[\\mathbb R(Z_{n+1})=Z_{re}^2-Z_{im}^2+c_{re}\\]\n\n\\[\\mathbb I(Z_{n+1})=2Z_{re}Z_{im}+c_{im}\\]\n\n<p>Here\u2019s a Perl program to generate a PNG file. Over the years I\u2019ve written this same program in many languages starting with Pascal at school, PostScript at University and <a href=\"https://www.tunbury.org/downloads/mandelbrot.xlsm\">Excel VBA</a> and JavaScript\u2026</p>\n\n<div><div><pre><code>#!/usr/bin/perl -w\n\nuse strict;\nuse GD;\n\nmy $width = 1024;\nmy $height = 1024;\n\nGD::Image->trueColor(1);\nmy $img = new GD::Image($width, $height);\n</code></pre></div></div>\n\n<p>Focus on an interesting bit. Real should be between -2.5 and 1 and\nimaginary between -1 and 1.</p>\n\n<div><div><pre><code>my $MINre = -0.56;\nmy $MAXre = -0.55;\nmy $MINim = -0.56;\nmy $MAXim = -0.55;\n</code></pre></div></div>\n\n<p>Maximum number of iterations before the point is classified as bounded.\nI\u2019ve used 255 because I am using this as the colour component later</p>\n\n<div><div><pre><code>my $max = 255;\n</code></pre></div></div>\n\n<p>Set up the loops to move through all the pixels in the image. The value\nof C is calculated from the image size and scale. Note that GD creates\nimages with the origin in the top left.</p>\n\n<div><div><pre><code>for my $row (1 .. $height) {\n my $Cim = $MINim + ($MAXim - $MINim) * $row / $height;\n for my $col (0 .. 
$width - 1) {\n my $Cre = $MINre + ($MAXre - $MINre) * $col / $width;\n</code></pre></div></div>\n\n<p>Z starts at the origin</p>\n\n<div><div><pre><code> my $Zre = 0;\n my $Zim = 0;\n my $iteration = 0;\n</code></pre></div></div>\n\n<p>Loop until the modulus of Z < 2 or the maximum number of iterations\nhave passed. Note that I\u2019ve squared both sides to avoid a wasting time\ncalculating the square root</p>\n\n<div><div><pre><code>while ($Zre * $Zre + $Zim * $Zim <= 4 && $iteration < $max) {\n</code></pre></div></div>\n\n<p>Here\u2019s the formula from above to calculate the next value</p>\n\n<div><div><pre><code> my $ZNre = $Zre * $Zre - $Zim * $Zim + $Cre;\n $Zim = 2 * $Zre * $Zim + $Cim;\n $Zre = $ZNre;\n</code></pre></div></div>\n\n<p>Move on to the next iteration</p>\n\n<div><div><pre><code> $iteration++;\n }\n</code></pre></div></div>\n\n<p>Determine why we finished the loop - was it bound or not - and then\ncolour the pixel appropriately</p>\n\n<div><div><pre><code> if ($iteration < $max) {\n $img->setPixel($col, $height - $row, $iteration * 0x010101);\n } else {\n $img->setPixel($col, $height - $row, 0x00);\n }\n }\n}\n</code></pre></div></div>\n\n<p>Output the PNG file to STDOUT</p>\n\n<div><div><pre><code>binmode STDOUT;\nprint $img->png;\n</code></pre></div></div>",
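The real/imaginary split above is language-agnostic. Here is a minimal Python sketch of the inner escape-time loop (the function name is mine); note the simultaneous tuple assignment, which plays the same role as the Perl temporary `$ZNre`:

```python
def escape_count(c_re: float, c_im: float, max_iter: int = 255) -> int:
    """Iterate Z -> Z^2 + C using the expanded real/imaginary parts,
    returning the iteration at which |Z| exceeds 2 (squared test avoids sqrt)."""
    z_re = z_im = 0.0
    for i in range(max_iter):
        if z_re * z_re + z_im * z_im > 4.0:
            return i
        # Both new parts are computed from the old values simultaneously
        z_re, z_im = z_re * z_re - z_im * z_im + c_re, 2 * z_re * z_im + c_im
    return max_iter

print(escape_count(0.2, 0.0))  # 255: never escapes, bounded
print(escape_count(0.3, 0.0))  # 12: escapes where the table first exceeds 2
```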
+20
mte/2015_01_19_shape-files.json
···+"content": "<p>Below is a perl script to create a PNG from a Shape file.</p>\n\n<p><a href=\"https://www.tunbury.org/downloads/shapefile.pdf\">Shape file specification</a></p>\n\n<p><a href=\"https://www.tunbury.org/downloads/ROADNODE.zip\">UK Road network as a shape file </a></p>\n\n<div><div><pre><code>use strict;\nuse warnings;\n\nuse GD;\nGD::Image->trueColor(1);\n\nmy $width = 8 * 1024;\nmy $height = 8 * 1024;\n\nmy $shpfile = $ARGV[0];\nopen(FH, \"<$shpfile\") or die(\"No input file\\n\");\nbinmode(FH); \n\nmy $csvfile = $shpfile;\n$csvfile =~ s/.shp$/.csv/g;\nopen(POLYOUT, \">$csvfile\");\n\nmy $buffer;\nmy $num_bytes = read(FH, $buffer, 100);\nmy ($code, $u1, $u2, $u3, $u4, $u5, $filelength, $version, $type, $BBminX, $BBminY, $BBmaxX, $BBmaxY, $BBminZ, $BBmaxZ, $BBminM, $BBmaxM) = unpack(\"N N N N N N N V V F F F F F F F F\", $buffer);\nprint \"code = $code\\n\";\nprint \"filelength = $filelength\\n\";\nprint \"version = $version\\n\";\nprint \"minX = $BBminX\\n\";\nprint \"minY = $BBminY\\n\";\nprint \"maxX = $BBmaxX\\n\";\nprint \"maxY = $BBmaxY\\n\";\nprint \"minZ = $BBminZ\\n\";\nprint \"maxZ = $BBmaxZ\\n\";\nprint \"minM = $BBminM\\n\";\nprint \"maxM = $BBmaxM\\n\";\n\nsub mapx {\n my $x = shift;\n return ($x - $BBminX) / ($BBmaxX - $BBminX) * $width;\n}\n\nsub mapy {\n my $y = shift;\n return $height - ($y - $BBminY) / ($BBmaxY - $BBminY) * $height;\n}\n\nmy $polyCount = 0;\n\nmy $img = new GD::Image($width, $height);\n\nwhile (read(FH, $buffer, 12)) {\n my ($recordnumber, $recordlength, $shapetype) = unpack(\"N N V\", $buffer);\n if ($shapetype == 5) {\n # Polygon\n read(FH, $buffer, 4 * 8 + 2 * 4);\n my ($minX, $minY, $maxX, $maxY, $NumParts, $NumPoints) = unpack(\"F F F F V V\", $buffer);\n my @parts;\n foreach my $part (1 .. 
$NumParts) {\n read(FH, $buffer, 4);\n my ($part) = unpack(\"V\", $buffer);\n push @parts, $part;\n #syswrite(SHPOUT, pack(\"V\", $part), 4);\n }\n push @parts, $NumPoints;\n @parts = reverse @parts;\n while (@parts) {\n my $firstpoint = pop @parts;\n my $lastpoint = pop @parts;\n my $poly = new GD::Polygon;\n $polyCount++;\n foreach ($firstpoint .. $lastpoint - 1) {\n read(FH, $buffer, 16);\n my ($x, $y) = unpack(\"F F\", $buffer);\n print POLYOUT \"$x,$y,$polyCount\\n\";\n $poly->addPt(mapx($x), mapy($y));\n }\n $img->openPolygon($poly, 0xff0000);\n push @parts, $lastpoint if (@parts);\n }\n } elsif ($shapetype == 3) {\n # PolyLine\n read(FH, $buffer, 4 * 8 + 2 * 4);\n my ($minX, $minY, $maxX, $maxY, $NumParts, $NumPoints) = unpack(\"F F F F V V\", $buffer);\n my @parts;\n foreach my $part (1 .. $NumParts) {\n read(FH, $buffer, 4);\n my ($part) = unpack(\"V\", $buffer);\n push @parts, $part;\n }\n push @parts, $NumPoints;\n @parts = reverse @parts;\n while (@parts) {\n my $firstpoint = pop @parts;\n my $lastpoint = pop @parts;\n read(FH, $buffer, 16);\n my ($x1, $y1) = unpack(\"F F\", $buffer);\n print POLYOUT \"$x1,$y1\\n\";\n foreach ($firstpoint .. $lastpoint - 2) {\n read(FH, $buffer, 16);\n my ($x2, $y2) = unpack(\"F F\", $buffer);\n print POLYOUT \"$x2,$y2\\n\";\n $img->line(mapx($x1), mapy($y1), mapx($x2), mapy($y2), 0xff0000);\n $x1 = $x2;\n $y1 = $y2;\n }\n push @parts, $lastpoint if (@parts);\n }\n\n } elsif ($shapetype == 1) {\n read(FH, $buffer, 2 * 8);\n my ($x, $y) = unpack(\"F F\", $buffer);\n $img->setPixel(mapx($x), mapy($y), 0xff0000);\n print POLYOUT \"$x,$y\\n\";\n } else {\n print \"unhandled type shapetype = $shapetype\\n\";\n read(FH, $buffer, $recordlength * 2 - 4);\n }\n}\n\nclose(POLYOUT);\n\nmy $pngfile = $shpfile;\n$pngfile =~ s/.shp$/.png/g;\nopen(PNGOUT, \">$pngfile\");\nbinmode(PNGOUT);\nprint PNGOUT $img->png;\nclose(PNGOUT);\n</code></pre></div></div>",
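The fixed 100-byte header unpack can be sketched in Python with explicit byte order (Perl's `F` template is a native double, which is little-endian on the machines this was written for; the function name and the synthetic test values below are mine, not from the script):

```python
import struct

def read_shp_header(buf: bytes):
    """Parse the fixed 100-byte shapefile header: big-endian file code and
    file length, then little-endian version, shape type and bounding box."""
    assert len(buf) >= 100
    code, = struct.unpack_from(">i", buf, 0)        # always 9994
    file_len, = struct.unpack_from(">i", buf, 24)   # length in 16-bit words
    version, shape_type = struct.unpack_from("<ii", buf, 28)
    # minX, minY, maxX, maxY, minZ, maxZ, minM, maxM
    bbox = struct.unpack_from("<8d", buf, 36)
    return code, file_len, version, shape_type, bbox
```

Feeding it a hand-packed header with code 9994, version 1000 and a point shape type returns those values unchanged, which makes a convenient sanity check before reading records.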
+21
mte/2016_08_24_place-notation.json
···+"summary": "Thomas Barlow has taught me place notation using Strike Back Surprise Major as the example. The notation for that is x38x14x58x16x12x38x14.12.78 l.e. 12. There are plenty of guides online on how to interpret it, such as this one on the CCCBR website.",+"content": "<p>Thomas Barlow has taught me place notation using <a href=\"https://www.tunbury.org/downloads/Strike-Back-Surprise-Major.pdf\">Strike Back Surprise Major</a> as the example. The notation for that is <code>x38x14x58x16x12x38x14.12.78 l.e. 12</code>. There are plenty of guides online on how to interpret it, such as this one on the <a href=\"http://www.cccbr.org.uk/education/thelearningcurve/pdfs/200404.pdf\">CCCBR website</a>.</p>\n\n<p>Briefly an x in the notation causes all bells to swap places. A group of numbers indicates that the bells in these places remain fixed while all others swap places. In this example, giving a starting order of rounds: 12345678 the first x would yield 21436587. The subsequent 38 indicates that the 3rd placed and 8th placed bells are fixed, so bells in position 1 and 2 swap as do 4 and 5 and 6 and 7 resulting in 12463857 and so on. As many methods are symmetrical, typically only half is written out. The second half is the reverse of the first with the given lead end appended.</p>\n\n<p>My attempt to write out <a href=\"https://www.tunbury.org/downloads/Ajax-Surprise-Major.pdf\">Ajax Surprise Major</a> <code>x58x14x56x16x14x1258x12x58,12</code> by hand went wrong in the early stages so I turned to Perl to do the job for me.</p>\n\n<p>The first part of the script parses the place notation into an array, unwraps the symmetry and tags on the lead end. 
I don\u2019t much like parsers as they tend to be messy as they have to deal with the real world, so moving swiftly on to the core of the script with the assumption that the place notation of the method is held in the array <code>@method</code>.</p>\n\n<div><div><pre><code>x 58 x 14 x 56 x 16 x 14 x 1258 x 12 x 58 x 12 x 1258 x 14 x 16 x 56 x 14 x 58 x 12\n</code></pre></div></div>\n\n<p>Define <code>@rounds</code> to be rounds and then set the current bell arrangement to be rounds!</p>\n\n<div><div><pre><code>my @rounds = (1..$stage);\nmy @bells = @rounds;\ndo {\n</code></pre></div></div>\n\n<p>Loop through each of the elements in the method (<code>@method</code>)</p>\n\n<div><div><pre><code> foreach my $m (@method) {\n</code></pre></div></div>\n\n<p><code>$stage</code> is the number of bells involved in the method. Our examples have all been <em>major</em> methods so <code>$stage</code> is 8. Perl arrays are inconveniently numbered from zero so we actually want number 0 through 7 so I\u2019ve used pop to remove the last one</p>\n\n<div><div><pre><code> my @changes = (0..$stage);\n pop @changes;\n</code></pre></div></div>\n\n<p>If the current step contains bell places (noting that 0 = 10, E = 11, T = 12) we split up the string into an array which we process in <em>reverse</em> order (to preserve the position numbering) and we remove these numbers from the array of changes. The function numeric returns the integer value from the character (T=12 etc).</p>\n\n<div><div><pre><code> if ($m =~ /[0-9ET]*/) {\n my @fixed = split //, $m;\n while (@fixed) {\n splice @changes, numeric(pop @fixed) - 1, 1;\n }\n }\n</code></pre></div></div>\n\n<p>For example, taking <code>$m</code> to be <code>1258</code> then <code>@changes</code> and <code>@fixed</code> will iterate as shown. 
Note the annoying -1 to align the bell position to the array index</p>\n\n<table>\n<thead>\n<tr><th>Iteration</th><th><code>@changes</code></th><th><code>@fixed</code></th></tr>\n</thead>\n<tbody>\n<tr><td>0</td><td>0 1 2 3 4 5 6 7</td><td>1 2 5 8</td></tr>\n<tr><td>1</td><td>0 1 2 3 4 5 6</td><td>1 2 5</td></tr>\n<tr><td>2</td><td>0 1 2 3 5 6</td><td>1 2</td></tr>\n<tr><td>3</td><td>0 2 3 5 6</td><td>1</td></tr>\n<tr><td>4</td><td>2 3 5 6</td><td>\u00a0</td></tr>\n</tbody>\n</table>\n\n<p>The resulting array <code>@changes</code> contains the pairs of bell place indices which need to be swapped. Changes need to be made in order working up to the back as place notation can omit implied changes. For example 18 could be shortened to just 1 as by the time 2nd and 3rd, 4th and 5th, 6th and 7th have all swapped, 8th place must be fixed.</p>\n\n<div><div><pre><code> while (@changes) {\n my ($swap1, $swap2) = splice @changes, 0, 2;\n @bells[$swap1, $swap2] = @bells[$swap2, $swap1];\n last if (scalar @changes < 2);\n }\n</code></pre></div></div>\n\n<p>Now we need to output the current arrangement which at this point will just be a print statement.</p>\n\n<div><div><pre><code> print \"@bells\\n\";\n }\n</code></pre></div></div>\n\n<p>Keep going until we are back in rounds.</p>\n\n<div><div><pre><code>} while (not @bells ~~ @rounds);\n</code></pre></div></div>\n\n<p>Now that that is working the natural desire is to produce beautiful output. Since I was coding in Perl and ultimately I\u2019d like a webpage out of this I experimented using Perl\u2019s GD::Graph library to draw a line graph of the place of each bell. GD::Graph can display the point value on the graph which was used to show the bell number. The output was functional although far from high resolution. The font of the point values cannot be controlled. See Bob Doubles output below</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/bob-doubles.png\"></p>\n\n<p>Since the GD::Graph output wasn\u2019t great, I\u2019ve coded a version which creates the output using SVG. 
Have a go:</p>",
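The swapping rules themselves fit in a few lines of Python (a sketch, not the Perl; `numeric` mirrors the post's helper where 0 = 10, E = 11, T = 12). Places not named in the element swap in adjacent pairs, which also handles implied places such as 18 written as just 1:

```python
def numeric(ch: str) -> int:
    """Bell place characters: 1-9, then 0 = 10, E = 11, T = 12."""
    return {"0": 10, "E": 11, "T": 12}.get(ch) or int(ch)

def apply_change(bells: list, element: str) -> list:
    """Apply one place-notation element: 'x' swaps every adjacent pair,
    otherwise the named places stay fixed while the rest swap in pairs."""
    fixed = set() if element == "x" else {numeric(c) for c in element}
    out = list(bells)
    p = 1                                   # 1-based place
    while p < len(out):
        if p in fixed:
            p += 1
        else:
            out[p - 1], out[p] = out[p], out[p - 1]
            p += 2
    return out

row = apply_change(list(range(1, 9)), "x")
print(row)                      # [2, 1, 4, 3, 6, 5, 8, 7]
print(apply_change(row, "38"))  # [1, 2, 4, 6, 3, 8, 5, 7]
```

The two printed rows reproduce the worked example in the post: rounds under an `x` gives 21436587, and a following 38 gives 12463857.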
+20
mte/2016_08_25_pentominoes.json
···+"summary": "One day I was clearing out some old papers and I came across this programming assignment from university. I can\u2019t recall which of the problems I tackled at the time, after all it was twenty-five years ago, but glancing over it now the pentomino problem caught my eye",+"content": "<p>One day I was clearing out some old papers and I came across this programming assignment from university. I can\u2019t recall which of the problems I tackled at the time, after all it was twenty-five years ago, but glancing over it now the pentomino problem caught my eye</p>\n\n<blockquote>\n <p>5 The Pentomino Problem\nThere are twelve different (ie. non-congruent) pentominos, shown below left. The pentomino problem is to fit them into a tray of dimensions 6 x 10 without overlapping. Some of the 2339 possible solutions are shown below right. Write a program to find a solution to the pentomino problem. {Note. Pretty output is not required.)</p>\n</blockquote>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/pentomino-graphic.png\"></p>\n\n<p>Looking on <a href=\"https://en.wikipedia.org/wiki/Pentomino\">Wikipedia</a> it seems that the shapes have been named by <a href=\"https://en.wikipedia.org/wiki/Solomon_W._Golomb\">Golomb</a> so I\u2019m going to use those names too.</p>\n\n<p>I started out by creating some data structures to hold the definition of each pentomino.</p>\n\n<p>So laying out on a x, y co-ordinate system I\u2019m create a point_t structure containing values</p>\n\n<div><div><pre><code>typedef struct {\n int x, y;\n} point_t;\n</code></pre></div></div>\n\n<p>Any pentomino will have exactly five points</p>\n\n<div><div><pre><code>typedef struct {\n point_t point[5]; /* 5 points in each */\n} pentomino_t;\n</code></pre></div></div>\n\n<p>Considering the \u2018F\u2019 pentomino it may be rotated and reflected in different ways \u2013 a maximum of 8 different versions may exist. 
Some, such as \u2018X\u2019, only have one.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/F.svg\"></p>\n\n<p>I have created a structure to hold the pentomino name along with a count of the number of unique rotations/reflections of the shape and an array to hold the co-ordinates</p>\n\n<div><div><pre><code>typedef struct {\n char ch; /* name of the shape by letter */\n int count; /* number of unique rotations */\n pentomino_t rotation[8]; /* max of 4 possible rotations and then double for the mirrors */\n} pentominoRotations_t;\n</code></pre></div></div>\n\n<p>The 6\u00d710 board that we will try to place them on is as simple as this</p>\n\n<div><div><pre><code>char board[60];\n</code></pre></div></div>\n\n<p>The algorithm couldn\u2019t be simpler really, take the first pentomino in the first rotation and put it on the board in the top left corner, if that works try the second pentomino in the second position in the first rotation and repeat. At each step check no parts of any pentomino are outside the board area and that nothing is on top of anything else. If it is, remove the last piece added and try to add it again in the next rotation. 
Based upon the assignment the key here is to recognise that this is a recursive algorithm \u2013 in pseudo code it looks like this</p>\n\n<div><div><pre><code>function calculate(pentomino p, board)\n for each position on the board\n for each pentomino rotation\n let shape_ok = true\n for each point in pentomino shape\n if the co-ordinate is out of bounds then shape_ok = false\n if the board position is already used then shape_ok = false\n next\n if shape_ok is true then\n draw the shape on the current board\n if p < 12 then\n calculate(p + 1, current board layout)\n else\n we have a solution!\n next\n next\n</code></pre></div></div>\n\n<p>Here is the first solution that it generates given the order of shapes as I have them</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/solution-1.svg\"></p>\n\n<p>The big problem with this is it takes a very long time! The main reason for this is that the algorithm wastes masses of time trying to fit all 12 pieces in even when the early piece positions have given a board which can\u2019t possibly be solved. In the example below there is no point trying to place the other 11 pentominos including all their rotations when there is an isolated single square.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/F-bad-placement.svg\"></p>\n\n<p>My initial solution to this is to add a check after drawing the shape to look for regions which have an area of less than 5. However this can be extended to check for regions that have areas which are not multiples of 5 as clearly all pentominos have an area of 5!</p>\n\n<p>Take a look at the example below. This has two regions, on the left the area is 13 and on the right the area is 22. This can\u2019t be solved as we will never be able to pack objects with an area of 5 into a region of area 13.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/small-region.svg\"></p>\n\n<p>I was quite surprised how easy it was to calculate the area of the regions. 
I\u2019ve always thought that the fill/flood tools on paint programs were cool and here we are just doing the same thing. Here\u2019s some pseudo code to explain it. I presume I\u2019d get twice the marks for this assignment for having two recursive functions!</p>\n\n<div><div><pre><code>Create a copy of the board\nLoop through all squares on the board\n if the square is empty\n call the flood function with starting at these co-ordinates\n if the returned value modulus 5 is not zero then the board cannot be solved\n\nfunction flood(start co-ordinates)\n let r = 1 and for that to be the size of the region\n mark the current co-ordinate position as filled\n if the square to the left is empty then call the flood function with those co-ordinates and add the returned value to r\n if the square to the right is empty then call the flood function with those co-ordinates and add the returned value to r\n if the square above is empty then call the flood function with those co-ordinates and add the returned value to r\n if the square below is empty then call the flood function with those co-ordinates and add the returned value to r\n return r\n</code></pre></div></div>\n\n<p>If you let these run to completion you find that you have 9356 solutions \u2013 exactly 4 times the number we should. This is because the board has rotation symmetry and both vertical and horizontal symmetry. We could check each solution against the ones already created for possible duplicates but we could also amend the algorithm so at the first level we only consider start position in the first quarter of the board.</p>\n\n<p>With this amended algorithm my average computer produced all 2339 solutions in around twenty minutes.</p>",
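The flood-fill pruning check can be sketched in Python (names are mine; an explicit stack replaces the recursion in the pseudocode, which also sidesteps recursion-depth limits on large boards):

```python
def regions_ok(board, width: int, height: int) -> bool:
    """Flood-fill each empty region and reject any board where a region's
    area is not a multiple of 5 - no set of pentominoes can ever fill it."""
    seen = set()

    def flood(x: int, y: int) -> int:
        stack, area = [(x, y)], 0
        while stack:
            cx, cy = stack.pop()
            if (cx, cy) in seen or not (0 <= cx < width and 0 <= cy < height):
                continue
            if board[cy][cx]:            # occupied square
                continue
            seen.add((cx, cy))
            area += 1
            stack += [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]
        return area

    for y in range(height):
        for x in range(width):
            if not board[y][x] and (x, y) not in seen:
                if flood(x, y) % 5 != 0:
                    return False
    return True
```

An empty 6x10 tray passes (one region of area 60), while any placement that strands an isolated square fails immediately, which is exactly the cheap rejection the text describes.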
+20
mte/2016_11_21_splicing-three-strand-rope.json
···+"summary": "My sudden interest in rope splicing stems entirely from bell ropes. There seems to be three, perhaps four, splices to learn for this application. Links below to YouTube videos explaining how to do them:",+"content": "<p>My sudden interest in rope splicing stems entirely from bell ropes. There seems to be three, perhaps four, splices to learn for this application. Links below to YouTube videos explaining how to do them:</p>\n\n<ul>\n <li><a href=\"https://youtu.be/QeYBkMCQ8WY\">Eye Splice</a></li>\n <li><a href=\"https://youtu.be/PFFeDH2u7E0\">Short Splice</a></li>\n <li><a href=\"https://youtu.be/sN-cnO8Fqrc\">Long Splice</a></li>\n <li><a href=\"https://youtu.be/bRjqMKLS99A\">End/Back Splice</a></li>\n</ul>\n\n<p>Above the sally you\u2019d probably use a long splice as it\u2019s thinner than the short splice for running over any pulleys. Below the sally, either a short splice to the tail end if it doesn\u2019t see much wear, or an eye splice if the tail end is changed frequently, typical on larger bells. The back splice could be used on the top end to give a nice finish to the rope.</p>\n\n<p>I\u2019m amazed how straightforward they are to do and how strong they are given that it\u2019s just an over-under weave of strands without a knot in sight!</p>",
+20
mte/2017_05_01_prime-numbers-in-powershell.json
···+"summary": "Dylan was using a number square to calculate prime numbers so it amused me to code up a couple of algorithms to show just how quick the sieve method actually is. I\u2019ve done these in PowerShell because \u2026 reasons.",+"content": "<p>Dylan was using a number square to calculate prime numbers so it amused me to code up a couple of algorithms to show just how quick the sieve method actually is. I\u2019ve done these in PowerShell because \u2026 reasons.</p>\n\n<p>So as a baseline, here\u2019s a basic way to calculate a prime. Start with a number and try to divide it by every number starting from 2 up to the square root of the number. I\u2019ve used <code>throw</code> in a <code>try</code>/<code>catch</code> block to move to the next iteration of the outer loop without executing the <code>Write-Host</code> line.</p>\n\n<div><div><pre><code>for ($n = 3; $n -lt 100000; $n++) {\n try {\n for ($d = 2; $d -le [Math]::Sqrt($n); $d++) {\n if ($n % $d -eq 0) {\n throw\n }\n }\n Write-Host -NoNewLine \"$n \"\n }\n catch { }\n}\n</code></pre></div></div>\n\n<p>Interestingly, all those exceptions add quite an overhead because this same algorithm using a local variable ran three times quicker on my machine (27 seconds for the first and 9 seconds for this)</p>\n\n<div><div><pre><code>for ($n = 3; $n -lt 100000; $n++) {\n $prime = $true\n for ($d = 2; $d -le [Math]::Sqrt($n); $d++) {\n if ($n % $d -eq 0) {\n $prime = $false\n break;\n }\n }\n if ($prime) {\n Write-Host -NoNewLine \"$n \"\n }\n}\n</code></pre></div></div>\n\n<p>Obviously we should optimise this by removing even numbers as below and this, as you\u2019d expect, halves the run time.</p>\n\n<div><div><pre><code>for ($n = 3; $n -lt 100000; $n += 2) {\n $prime = $true\n for ($d = 3; $d -le [Math]::Sqrt($n); $d += 2) {\n if ($n % $d -eq 0) {\n $prime = $false\n break;\n }\n }\n if ($prime) {\n }\n}\n</code></pre></div></div>\n\n<p>Anyway, the sieve is all done in 0.75 
seconds:</p>\n\n<div><div><pre><code>$ints = 0..100000\nfor ($i = 2; $i -lt [Math]::Sqrt($ints.length); $i++) {\n if ($ints[$i] -eq 0) {\n continue\n }\n for ($j = $i * $i; $j -lt $ints.length; $j += $i) {\n $ints[$j] = 0\n }\n}\n$ints | foreach { if ($_) { Write-Host -NoNewLine \"$_ \" } }\n</code></pre></div></div>\n\n<p>As the maximum number increases the differences become even more stark. At 1,000,000 the sieve completed in 11 seconds but the simple method took 129 seconds</p>\n\n<p>For my timings, I used <code>measure-command</code> and removed the <code>Write-Host</code> lines.</p>",
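For comparison, here is the same sieve as a short Python sketch (crossing off multiples of each surviving `i` starting at `i * i`, exactly as the PowerShell does):

```python
import math

def sieve(limit: int) -> list:
    """Sieve of Eratosthenes: mark composites, then collect the survivors."""
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False
    for i in range(2, math.isqrt(limit) + 1):
        if flags[i]:
            for j in range(i * i, limit + 1, i):
                flags[j] = False
    return [n for n, is_prime in enumerate(flags) if is_prime]

print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Starting the inner loop at `i * i` rather than `2 * i` is safe because every smaller multiple of `i` has a prime factor below `i` and was already crossed off.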
+20
mte/2018_07_13_latin-square.json
···+"summary": "Looking at the latest video from Presh Talwalkar about solving the Latin square where each row is the first row multiplied by the row number I decided it was time to see if I could remember any C++ and code a solution.",+"content": "<p>Looking at the latest video from Presh Talwalkar about solving the Latin square where each row is the first row multiplied by the row number I decided it was time to see if I could remember any C++ and code a solution.</p>\n\n<p><a href=\"https://youtu.be/KXOjtmNUSH0\">Can you figure out the special 6 digit number?</a></p>\n\n<p>Include the standard C++ header files we need</p>\n\n<div><div><pre><code>#include <iostream>\n#include <algorithm>\n#include <vector>\n#include <sstream>\n#include <string>\n#include <iomanip>\n\nusing namespace std;\n</code></pre></div></div>\n\n<p><code>CheckDuplicates()</code> comes from ideas presented in this <a href=\"https://stackoverflow.com/questions/2860634/checking-for-duplicates-in-a-vector\">Stack Overflow question</a>. The function determines whether there are any repeated digits in a vector by sorting the vector and then searching for adjacent items which are the same. Since <code>std::sort</code> changes the source vector I\u2019ve created a local copy using the vector constructor function.</p>\n\n<div><div><pre><code>bool CheckDuplicates(vector<unsigned int>* v) {\n vector<unsigned int> c (v->begin(), v->end());\n sort(c.begin(), c.end());\n vector<unsigned int>::iterator it = adjacent_find(c.begin(), c.end());\n if (it == c.end())\n return false;\n else\n return true;\n}\n</code></pre></div></div>\n\n<p>On to the body of the program</p>\n\n<div><div><pre><code>int main () {\n</code></pre></div></div>\n\n<p>Create a loop which covers all possible six digit numbers. 
The result can\u2019t be smaller than 123456 and it must be less than 1,000,000 \u00f7 6 = 166,666 but change the loop to 0 to 1,000,000 shows that there really aren\u2019t any other solutions.</p>\n\n<div><div><pre><code> for (unsigned int t = 123456; t < 166666; t++) {\n</code></pre></div></div>\n\n<p>I\u2019ll use a vector of vectors to hold the digits of each number.</p>\n\n<div><div><pre><code> vector< vector<unsigned int>* > square;\n</code></pre></div></div>\n\n<p>This first block of code initialises the first vector with the value from the outer loop. It only adds the value to the square if it doesn\u2019t contain any duplicate digits.</p>\n\n<div><div><pre><code> {\n vector<unsigned int>* row = new vector<unsigned int>;\n unsigned int n = t;\n for (int i = 0; i < 6; i++) {\n row->insert(row->begin(), n % 10);\n n /= 10;\n }\n if (!CheckDuplicates(row))\n square.push_back(row);\n else\n delete row;\n }\n</code></pre></div></div>\n\n<p>By looking at the size of the <code>square</code> vector we can see if we have a row to work with or not. If we do, attempt the multiplication of the first row by 2 through 6 to generate the other rows. As we want full multiplication not just the multiplication of each digit we need to compute the carry at each step and add it on to the next column. If there is a carry into the seventh column then the row can be discarded. Lastly, check for duplicates and if none are found added the number/row to the square. 
An alternative approach here would be to multiply t and separate the result into the individual digits in a vector as we did above.</p>\n\n<div><div><pre><code> if (square.size() == 1) {\n for (unsigned int j = 2; j <= 6; j++) {\n unsigned int carry = 0;\n vector<unsigned int>* row = new vector<unsigned int>;\n for (int i = 5; i >= 0; i--) {\n unsigned int n = square.at(0)->at(i) * j + carry;\n if (n > 9) {\n carry = n / 10;\n n %= 10;\n } else {\n carry = 0;\n }\n row->insert(row->begin(), n);\n }\n if (carry) {\n delete row;\n break;\n } else {\n if (!CheckDuplicates(row))\n square.push_back(row);\n else\n delete row;\n }\n }\n }\n</code></pre></div></div>\n\n<p>So, if we get to here we have six rows each of different digits in each row. We now need to check for duplication in the columns. This strictly isn\u2019t necessary because only one solution makes it this far, but for the sake of completeness I generate a vector for each column and check it for duplicates. If no duplicates are found then it\u2019s a possible solution.</p>\n\n<div><div><pre><code> if (square.size() == 6) {\n bool duplicates = false;\n for (int i = 5; i >= 0; i--) {\n vector<unsigned int> column;\n for (vector<unsigned int>* row : square)\n column.push_back(row->at(i));\n if (CheckDuplicates(&column)) {\n duplicates = true;\n break;\n }\n }\n if (!duplicates) {\n cout << \"\\nSolution\\n\";\n for (vector<unsigned int>* row : square) {\n for (unsigned int c : *row) {\n cout << c << ' ';\n }\n cout << '\\n';\n }\n }\n }\n</code></pre></div></div>\n\n<p>Tidy up by deleting each of the row vectors</p>\n\n<div><div><pre><code> for (vector<unsigned int>* row : square)\n delete row;\n square.erase(square.begin(), square.end());\n }\n\n return 0;\n}\n</code></pre></div></div>\n\n<p>You can download the full version of the code from <a href=\"https://github.com/mtelvers/LatinSquare\">Github</a></p>",
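The same search collapses to a few lines in a language where strings and sets do the digit bookkeeping. This Python sketch (mine, not a translation of the C++) uses the same bounds, insists each multiple is a 6-digit permutation of the first row's digits, and then applies the column check:

```python
def latin_rows(t: int):
    """Rows of the candidate square: t multiplied by 1 through 6."""
    return [t * k for k in range(1, 7)]

def special_numbers():
    """Brute-force search over the C++ bounds: the first row needs six
    distinct digits, every multiple must be a 6-digit permutation of them,
    and each of the six columns must also hold six distinct digits."""
    results = []
    for t in range(123456, 166667):
        digits = set(str(t))
        if len(digits) != 6:
            continue
        rows = [str(r) for r in latin_rows(t)]
        if all(len(r) == 6 and set(r) == digits for r in rows):
            if all(len({r[i] for r in rows}) == 6 for i in range(6)):
                results.append(t)
    return results

print(special_numbers())  # [142857]
```

The single survivor is the familiar cyclic number 142857, whose multiples by 1 through 6 are rotations of itself.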
+20
mte/2018_08_27_which-funds-have-exposure-to-netflix.json
+20
mte/2018_08_27_which-funds-have-exposure-to-netflix.json
···+"summary": "Dabbling in the markets by way of investment funds is amusing. I use Hargreaves Lansdown to do this. HL have a fund research section which lets you look at a given fund and view the top 10 holdings so you can base your decision to invest on your belief in the underlying stock.",+"content": "<p>Dabbling in the markets by way of investment funds is amusing. I use <a href=\"https://www.tunbury.org/2018/08/27/which-funds-have-exposure-to-netflix/www.hl.co.uk\">Hargreaves Lansdown</a> to do this. HL have a fund research section which lets you look at a given fund and view the top 10 holdings so you can base your decision to invest on your belief in the underlying stock.</p>\n\n<p>How do you tackle it from the other direction? Suppose you want to invest in Netflix, but which fund(s) have exposure to its stock? The search tool on HL\u2019s website doesn\u2019t let you search the fund\u2019s holdings.</p>\n\n<p>Firstly, we can get a list of funds starting with <code>a</code> by visiting the link https://www.hl.co.uk/funds/fund-discounts,-prices\u2013and\u2013factsheets/search-results/a. There are 25 more to go plus 0 for anything starting with a number. These pages are HTML unordered lists <code>ul</code> of hyperlinks <code>href</code>. 
We can get the alphabet as an array in a tidy loop such as this <code>foreach ($l in [char[]]([char]'a'..[char]'z') + '0') { }</code> (assuming ASCII)</p>\n\n<p>We can download the HTML using PowerShell\u2019s <code>Invoke-WebRequest</code> and then extract tags using <code>getElementsByTagName</code>; however, it can be desperately slow in some circumstances, so I prefer to just get the HTML as a string using <code>$_.RawContent</code> and then process it with <code>IndexOf()</code>.</p>\n\n<p>The code, and basically the methodology for the rest of this script, is shown below:</p>\n\n<div><div><pre><code>$baseURL = \"https://www.hl.co.uk/funds/fund-discounts,-prices--and--factsheets/search-results\"\n$html = $(Invoke-WebRequest -uri \"$baseURL/a\").RawContent\n$x1 = $html.IndexOf('<ul class=\"list-unstyled list-indent\"')\n$x1 = $html.IndexOf('>', $x1) + 1\n$x2 = $html.IndexOf('</ul', $x1)\n$tbl = $html.substring($x1, $x2 - $x1).trim()\n</code></pre></div></div>\n\n<p>Search the HTML for the start of the <code>ul</code> tag and save it in <code>$x1</code>. As tags can be of variable length we move <code>$x1</code> to the end of the tag by searching for the close tag marker <code>></code> and adding 1. Now, just search for the end of the list by looking for the <code></ul</code> tag and store that in <code>$x2</code>. The table can now be extracted as the substring between <code>$x1</code> and <code>$x2</code>.</p>\n\n<p>Each list item <code>li</code> contains a hyperlink tag <code><a href=</code> including the URL of the page with the fund details and the fund name. We can use a <code>for</code> loop to move through the string and build up an array of fund URLs. 
The backtick is the escape character in PowerShell.</p>\n\n<div><div><pre><code>$funds = @()\nfor ($x1 = $tbl.IndexOf(\"href=\"); $x1 -ge 0; $x1 = $tbl.IndexOf(\"href=\", $x2)) {\n $x1 = $tbl.IndexOf('\"', $x1) + 1 # x1 is the start of the string\n $x2 = $tbl.IndexOf('\"', $x1) # x2 is the end of the string\n $funds += $tbl.Substring($x1, $x2 - $x1)\n}\n</code></pre></div></div>\n\n<p>At this point we can examine our funds in <code>$funds</code>, or perhaps write them to a CSV: <code>$funds | Export-Csv funds.csv</code>.</p>\n\n<p>What we really want is the list of holdings for each fund. So using the techniques above, download the HTML for each fund detail page and extract the fund size where it appears on the page. Then locate the Top 10 holdings table and build a PowerShell object based upon the table headings and populate the values:</p>\n\n<div><div><pre><code>$holdings = @()\nfor ($f = 0; $f -lt $funds.count; $f++) {\n $html = $(Invoke-WebRequest -uri $funds[$f]).RawContent\n if ($html.IndexOf(\"Factsheet unavailable\") -ge 0 -or\n $html.IndexOf(\"Market data not available\") -ge 0 -or\n $html.IndexOf(\"holdings currently unavailable\") -ge 0) {\n Write-Host -ForegroundColor Red $f $funds[$f].substring($baseURL.length) \"- unavailable\"\n continue\n }\n\n $x1 = $html.IndexOf('Fund size')\n $x1 = $html.IndexOf('<td', $x1)\n $x1 = $html.IndexOf(\">\", $x1) + 1\n $x2 = $html.IndexOf('</td', $x1)\n $fundSize = $html.Substring($x1, $x2 - $x1).trim()\n $fundSize = $fundSize -replace \"&pound;\", \"GBP \"\n $fundSize = $fundSize -replace \"&euro;\", \"EUR \"\n $fundSize = $fundSize -replace \"\\$\", \"USD \"\n\n $x1 = $html.IndexOf('<table class=\"factsheet-table\" summary=\"Top 10 holdings\"')\n $x1 = $html.IndexOf('>', $x1) + 1\n $x2 = $html.IndexOf('</table>', $x1)\n $tbl = $html.substring($x1, $x2 - $x1).trim()\n\n $headings = @()\n for ($x1 = $tbl.IndexOf('<th', 1); $x1 -gt 0; $x1 = $tbl.IndexOf('<th', $x2)) {\n $x1 = $tbl.IndexOf(\">\", $x1) + 1\n $x2 = 
$tbl.IndexOf(\"</th>\", $x1)\n $headings += $tbl.Substring($x1, $x2 - $x1)\n }\n\n if ($headings.count -eq 0) {\n Write-Host -ForegroundColor Red $f $funds[$f].substring($baseURL.length) \"- no table\"\n continue\n }\n\n $i = 0\n for ($x1 = $tbl.IndexOf('<td'); $x1 -gt 0; $x1 = $tbl.IndexOf('<td', $x2)) {\n if ($i % $headings.count -eq 0) {\n $h = New-Object -TypeName PSObject -Property @{Fund=$funds[$f].substring($baseURL.length);Size=$fundSize}\n }\n $x1 = $tbl.IndexOf(\">\", $x1) + 1\n $x2 = $tbl.IndexOf(\"</td\", $x1)\n $cell = $tbl.Substring($x1, $x2 - $x1).trim()\n if ($cell.Substring(0, 1) -eq '<') {\n $x1 = $tbl.IndexOf(\">\", $x1) + 1\n $x2 = $tbl.IndexOf(\"</a\", $x1)\n $cell = $tbl.Substring($x1, $x2 - $x1).trim()\n }\n Add-Member -InputObject $h -MemberType NoteProperty -Name $headings[$i % $headings.count] -Value $cell\n $i++\n if ($i % $headings.count -eq 0) {\n $holdings += $h\n }\n }\n Write-Host $f $funds[$f].substring($baseURL.length) $fundSize ($i / 2) \"holdings\"\n}\n</code></pre></div></div>\n\n<p>As I mentioned, most of the code is as explained before but the PowerShell object bit deserves a mention. I use an iterator <code>$i</code> to count the cells in the table (note this assumes that the table has equal number of cells per row which isn\u2019t necessarily true in HTML). We have two column headings, so <code>$i % $headings.count -eq 0</code> is true for 0, 2, 4 etc and this happens at the start of the loop so we use it to create the object.</p>\n\n<p>Once we have the cells content, we can use <code>Add-Member</code> to add the property to the object. 
The property name is given by <code>$headings[$i % $headings.count]</code>: either zero or one in this case.</p>\n\n<p>At the end of the loop we increment <code>$i</code> and test whether we are now on the next row <code>$i % $headings.count -eq 0</code> and if so add the current object to the output array (as it will be overwritten at the start of the next iteration of the loop).</p>\n\n<p>After all that work, let\u2019s save the results as a CSV: <code>$holdings | Export-Csv holdings.csv</code></p>\n\n<p>We now know the percentages of each holding and the total fund value so we can calculate a new column with the monetary value invested in a fund as follows:</p>\n\n<div><div><pre><code>$holdings |% {\n [decimal]$w = $_.weight -replace '[^\\d.]'\n [decimal]$s = $_.size -replace '[^\\d.]'\n Add-Member -InputObject $_ -MemberType NoteProperty -Name Value -Value ($w * $s / 100) -Force\n}\n</code></pre></div></div>\n\n<p>Perhaps save it again? <code>$holdings | Export-Csv -Force holdings.csv</code></p>\n\n<div><div><pre><code>import-csv .\\holdings.csv |? Security -match \"Netflix\" | sort -Property Value\n</code></pre></div></div>\n\n<p>The full code can be downloaded from <a href=\"https://github.com/mtelvers/Hargreaves-Lansdown/blob/master/fund-holdings.ps1\">GitHub</a> or probably more usefully you can get <a href=\"https://raw.githubusercontent.com/mtelvers/Hargreaves-Lansdown/master/holdings.csv\">holdings.csv</a></p>\n\n<h1>Addendum</h1>\n\n<p>To make the analysis easier it would help to standardise the currencies. 
Most are in GBP by some margin so let\u2019s convert to that:-</p>\n\n<div><div><pre><code>$ExchangeRates = @{GBP = 1; YEN = 0.00698098; EUR = 0.905805; USD = 0.776454; AUSD = 0.567308}\n\n$holdings |% {\n [decimal]$s = $_.size -replace '[^\\d.]'\n [decimal]$w = $_.weight -replace '[^\\d.]'\n if ($s -gt 0) {\n $currency = $_.size.substring(0, $_.size.IndexOf(\" \"))\n $sGBP = $s * $ExchangeRates[$currency]\n } else {\n $sGBP = 0\n }\n Add-Member -InputObject $_ -MemberType NoteProperty -Name SizeGBP -Value $sGBP -Force\n Add-Member -InputObject $_ -MemberType NoteProperty -Name ValueGBP -Value ($w * $sGBP / 100) -Force\n}\n</code></pre></div></div>",
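The marker-scanning technique used throughout (find the opening tag, skip past its closing `>`, then cut up to the end marker) ports directly to other languages. A Python sketch of the same idea; the function name and sample HTML are illustrative, not from the original script:

```python
def extract_between(html, start_marker, end_marker, begin=0):
    """Cut out the text between start_marker's closing '>' and end_marker,
    mirroring the $x1/$x2 IndexOf() dance in the PowerShell above."""
    x1 = html.index(start_marker, begin)
    x1 = html.index(">", x1) + 1      # skip to the end of the opening tag
    x2 = html.index(end_marker, x1)
    return html[x1:x2].strip(), x2

html = '<table class="factsheet-table" summary="Top 10 holdings"><th>Security</th><th>Weight</th></table>'
tbl, _ = extract_between(html, '<table class="factsheet-table"', "</table>")
print(tbl)  # <th>Security</th><th>Weight</th>
```

Like the PowerShell original, this raises if a marker is missing, so real pages still need the "Factsheet unavailable" style guards.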
+21
mte/2018_09_24_retro-gaming-space-raiders.json
+21
mte/2018_09_24_retro-gaming-space-raiders.json
···+"summary": "Dylan\u2019s favourite t-shirt is his Game Over shirt which always reminds me of Space Raiders from the ZX Spectrum days. I found the cassette tape quite easily but it took a significant amount of searching to find the Spectrum itself and included in the box was the tape recorder as well!",+"content": "<p>Dylan\u2019s favourite t-shirt is his Game Over shirt which always reminds me of Space Raiders from the ZX Spectrum days. I found the cassette tape quite easily but it took a significant amount of searching to find the Spectrum itself and included in the box was the tape recorder as well!</p>\n\n<p>Unfortunately when I set about loading the game it didn\u2019t work. It probably was a lot to ask after 30+ years. The audio sounded a bit low and the tape player was at maximum. I tried connecting it via an amplifier but that didn\u2019t help.</p>\n\n<p>I connected the tape drive to my Mac and looked at the file in Audacity.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/original-tape-player.png\"></p>\n\n<p>Apart from being very quiet, zooming in showed that after the guard tone it was impossible to see the signal as described in this <a href=\"http://www.myprius.co.za/tape_storage.htm\">excellent post</a>.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/nothing-to-see.png\"></p>\n\n<p>I tried the Fuse utilities to convert the WAV into a TZX file but these failed. I found more tools here which I installed on my Raspberry PI but the result was the same.</p>\n\n<p>Eventually, I decided to see if I could find another tape player and I found an old compact media centre. I played the tape straight into Audacity just to see if I could see a difference. 
Clearly this find is significantly better:</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/compact-media-centre.png\"></p>\n\n<p>I tried <code>audio2tape</code> but that give me a bunch of CRC errors, but processing the file with <code>tzxwav</code> worked perfectly:</p>\n\n<div><div><pre><code>pi@raspberrypi:~/.local/bin $ ./tzxwav -p -v -o ~/raiders.tzx -D ~/raiders.wav \n=== Program: raiders ---------------------------------| 1:56\nExpected length: 40\nLeader: @1055530, Sync: @1275725, End: @1279885\nProgram: raiders (40 bytes)\n--- data########----------------------------------------| 1:51\nLength: 40\nLeader: @1323967, Sync: @1412003, End: @1421770\n40 bytes of data\n=== Program: RAIDERS ---------------------------------| 1:44\nExpected length: 68\nLeader: @1510973, Sync: @1731454, End: @1735476\nProgram: RAIDERS (68 bytes)\n--- data###########-------------------------------------| 1:40\nLength: 68\nLeader: @1778815, Sync: @1866811, End: @1882863\n68 bytes of data\n=== Bytes: T #----------------------------------| 1:33\nStart: 16384, Expected length: 6912\nLeader: @1964171, Sync: @2184510, End: @2188446\nScreen: T \n--- data#########################-----------------------| 1:27\nLength: 6912\nLeader: @2231875, Sync: @2319891, End: @3680454\n6912 bytes of data\n=== Bytes: C ##############---------------------| 1:16\nStart: 24576, Expected length: 7860\nLeader: @3778730, Sync: @3989417, End: @3993362\nBytes: C (start: 24576, 7860 bytes)\n--- data###########################################-----| 0:19\nLength: 7860\nLeader: @4036807, Sync: @4124864, End: @6093760\n7860 bytes of data\n100% |##################################################| 0:00\n</code></pre></div></div>\n\n<p>I loaded the TZX file into Fuse and it worked as expected.</p>\n\n<p>Armed with a working tape player I loaded the game on the real ZX Spectrum on the first attempt</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/space-raiders-on-tv.jpg\"></p>\n\n<p>Lastly, can we 
have this on our Raspberry PI? Well of course, just install Fuse and load up the TZX images:</p>\n\n<div><div><pre><code>sudo apt-get install fuse-emulator-common\nsudo apt-get install spectrum-roms fuse-emulator-utils\n</code></pre></div></div>",
+20
mte/2019_01_17_mount-an-iso-from-your-desktop-via-powercli.json
+20
mte/2019_01_17_mount-an-iso-from-your-desktop-via-powercli.json
···+"summary": "Normally, I\u2019d use a Windows NFS Server to host my ISO files. The steps couldn\u2019t be simpler",+"content": "<p>Normally, I\u2019d use a Windows NFS Server to host my ISO files. The steps couldn\u2019t be simpler</p>\n\n<div><div><pre><code>Add-WindowsFeature FS-NFS-Service\nImport-Module NFS\nNew-NfsShare -Name ISO -Path C:\\ISO -access readonly\n</code></pre></div></div>\n\n<p>However, this only works if you have a Windows Server installation as you can\u2019t install the NFS Service on a Windows desktop.</p>\n\n<p>There is a standalone executable version of an NFS server available called WinNFSd.exe which can be downloaded from <a href=\"https://github.com/winnfsd/winnfsd/releases\">GitHub</a>. I\u2019ve saved this to <code>C:\\WinNFSd</code></p>\n\n<p>Create a firewall rule on your desktop to allow the ESXi host to communicate with WinNFSd, thus:</p>\n\n<div><div><pre><code>New-NetFirewallRule -DisplayName \"NFS Server\" -Direction Inbound -Action Allow -Program C:\\WinNFSd\\WinNFSd.exe\n</code></pre></div></div>\n\n<p>Run <code>WinNFSd</code>. The argument list is the local folder hosting your ISO files to be shared and the path that it will have on the NFS server\u2019s export list. 
The path name needs to match the <code>New-DataStore</code> command later:</p>\n\n<div><div><pre><code>Start-Process C:\\WinNFSd\\WinNFSd.exe -ArgumentList \"C:\\ISO /ISO\"\n</code></pre></div></div>\n\n<p>You should now have a CMD window open along with the PowerCLI prompt.</p>\n\n<p>Now you need to know the IP Address of your machine:</p>\n\n<div><div><pre><code>$myIPAddress = \"Your IP Address\"\n</code></pre></div></div>\n\n<p>You can automate this as follows, but it may need to be tweaked depending upon which network card you are using etc.</p>\n\n<div><div><pre><code>$myIPAddress = $(Get-NetIPAddress -InterfaceAlias Ethernet0 -AddressFamily IPv4).IPAddress\n</code></pre></div></div>\n\n<p>Create a variable for your ESXi host(s).</p>\n\n<div><div><pre><code>$esxHosts = @( \"Your Host\" )\n</code></pre></div></div>\n\n<p>If you have a cluster you can include them all like this:</p>\n\n<div><div><pre><code>$esxHosts = Get-Datacenter yourDC | Get-Cluster yourCluster | Get-VMHost\n</code></pre></div></div>\n\n<p>Instruct the ESXi host to mount the datastore. Note that the final <code>/ISO</code> needs to match the final argument to <code>WinNFSd</code></p>\n\n<div><div><pre><code>$esxHosts |% { New-Datastore -VMHost $_ -Name ISO -NfsHost $myIPAddress -Path /ISO }\n</code></pre></div></div>\n\n<p>Now set the ISO that you have, such as <code>c:\\iso\\myiso.iso</code> to be the CD Drive on your VM</p>\n\n<div><div><pre><code>Get-CDDrive $vm | Set-CDDrive -IsoPath \"[ISO] myiso.iso\" -Connected:$true -Confirm:$false\n</code></pre></div></div>\n\n<p>Now you can use the CD Drive in the VM as you wish.</p>\n\n<p>Of course, it\u2019s important to tidy up in the correct sequence. 
Don\u2019t just close the CMD prompt before disconnecting the CD drive and unmounting the datastore.</p>\n\n<p>Disconnect the CD Drive</p>\n\n<div><div><pre><code>Get-CDDrive $vm | Set-CDDrive -NoMedia -Confirm:$false\n</code></pre></div></div>\n\n<p>Remove the datastore</p>\n\n<div><div><pre><code>$esxHosts |% { Remove-Datastore -VMHost $_ -Datastore ISO -Confirm:$false }\n</code></pre></div></div>\n\n<p>Stop WinNFSd and remove the firewall rule</p>\n\n<div><div><pre><code>Stop-Process -Name WinNFSd\nRemove-NetFirewallRule -DisplayName \"NFS Server\"\n</code></pre></div></div>",
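The tidy-up mirrors the setup in reverse: the CD drive was attached last so it is released first, while the firewall rule and WinNFSd, created first, go last. In Python this last-in-first-out discipline is what `contextlib.ExitStack` encodes; a toy sketch where the step names simply label the resources above:

```python
from contextlib import ExitStack

log = []

def acquire(name):
    # Pretend to set up a resource; return its matching teardown action
    log.append(f"setup {name}")
    return lambda: log.append(f"teardown {name}")

with ExitStack() as stack:
    for resource in ["firewall rule + WinNFSd", "datastore", "CD drive"]:
        stack.callback(acquire(resource))

# Callbacks run LIFO, so teardown order is the reverse of setup order
print(log[3:])  # teardown CD drive, then datastore, then firewall rule + WinNFSd
```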
+21
mte/2019_02_28_most-popular-methods.json
+21
mte/2019_02_28_most-popular-methods.json
···+"summary": "There are ~72,000 Surprise Major performances on Bell Board. Bell Board displays results in pages of 200 performances. Thus we will need to download all the pages and concatenate them into a single file:",+"content": "<p>There are ~72,000 Surprise Major performances on Bell Board. Bell Board displays results in pages of 200 performances. Thus we will need to download all the pages and concatenate them into a single file:</p>\n\n<div><div><pre><code>for i in {1..366}; do wget \"https://bb.ringingworld.co.uk/search.php?title=surprise+major&page=$i\" -O - >> surprise-major.txt; done\n</code></pre></div></div>\n\n<p>Quick analysis with awk/sed/sort and uniq:</p>\n\n<div><div><pre><code>awk '/class=\"title\"/ { print $3, $4, $5, $6, $7, $8, $9}' surprise-major.txt | sed 's/<\\/td>//' | sort | uniq -c | sort -gr | less\n</code></pre></div></div>\n\n<p>As expected, the Standard 8 are right there:</p>\n\n<div><div><pre><code>10732 Yorkshire Surprise Major\n 7633 Cambridge Surprise Major\n 6908 Bristol Surprise Major\n 3629 Superlative Surprise Major\n 3425 Lincolnshire Surprise Major\n 3048 Rutland Surprise Major\n 2716 London Surprise Major\n 1556 Pudsey Surprise Major\n 957 Glasgow Surprise Major\n 931 Lessness Surprise Major\n 666 Belfast Surprise Major\n 645 Uxbridge Surprise Major\n 568 Cornwall Surprise Major\n</code></pre></div></div>\n\n<p>Repeating for the ~3,800 Delight Major performances</p>\n\n<div><div><pre><code>for i in {1..30}; do wget \"https://bb.ringingworld.co.uk/search.php?title=delight+major&page=$i\" -O - >> delight-major.txt; done\nawk '/class=\"title\"/ { print $3, $4, $5, $6, $7, $8, $9}' delight-major.txt | sed 's/<\\/td>//' | sort | uniq -c | sort -gr | less\n</code></pre></div></div>\n\n<p>Gives us these</p>\n\n<div><div><pre><code>141 Cooktown Orchid Delight Major\n 36 Christmas Delight Major\n 30 Wedding Delight Major\n 28 Coniston Bluebird Delight Major\n 27 Diamond Delight Major\n 26 Ruby Delight Major\n 22 Birthday Delight 
Major\n 19 Anniversary Delight Major\n 18 Dordrecht Delight Major\n 16 Yelling Delight Major\n 16 Lye Delight Major\n 16 Burnopfield Delight Major\n 15 Winchester Delight Major\n 15 Hunsdon Delight Major\n 13 Uttlesford Delight Major\n 13 Magna Carta Delight Major\n 12 Sussex Delight Major\n 12 Sunderland Delight Major\n 12 Sleaford Delight Major\n 12 Heptonstall Delight Major\n 11 Windy Gyle Delight Major\n 11 Spitfire Delight Major\n 11 Ketteringham Delight Major\n 11 Keele University Delight Major\n 11 Ian's Delight Major\n 11 Eardisland Delight Major\n 11 Dingley Delight Major\n 10 West Bridgford Delight Major\n 10 Paisley Delight Major\n 10 Morville Delight Major\n 10 Longstanton Delight Major\n 10 Knotty Ash Delight Major\n</code></pre></div></div>\n\n<p>And once again for the 2,200 Delight Minor performances</p>\n\n<div><div><pre><code>for i in {1..12}; do wget \"https://bb.ringingworld.co.uk/search.php?title=delight+minor&page=$i\" -O - >> delight-minor.txt; done\nawk '/class=\"title\"/ { print $3, $4, $5, $6, $7, $8, $9}' delight-minor.txt | sed 's/<\\/td>//' | sort | uniq -c | sort -gr | less\n</code></pre></div></div>\n\n<p>Gives</p>\n\n<div><div><pre><code> 85 Woodbine Delight Minor\n 78 Old Oxford Delight Minor\n 46 Oswald Delight Minor\n 41 Elston Delight Minor\n 30 College Bob IV Delight Minor\n 25 Morning Exercise Delight Minor\n 23 Kirkstall Delight Minor\n 22 Francis Genius Delight Minor\n 20 St Albans Delight Minor\n 20 Julie McDonnell Delight Minor\n 19 Southwark Delight Minor\n 18 Burslem Delight Minor\n 18 Barham Delight Minor\n 17 Kentish Delight Minor\n 17 Darton Exercise Delight Minor\n 17 Burnaby Delight Minor\n 16 Edinburgh Delight Minor\n 15 Disley Delight Minor\n 14 Neasden Delight Minor\n 14 London Delight Minor\n 14 Glastonbury Delight Minor\n 14 Bedford Delight Minor\n 13 Croome d'Abitot Delight Minor\n 13 Christmas Pudding Delight Minor\n 13 Charlwood Delight Minor\n 12 Wragby Delight Minor\n 11 Willesden Delight Minor\n 11 
Newdigate Delight Minor\n 10 Combermere Delight Minor\n 10 Cambridge Delight Minor\n</code></pre></div></div>",
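The shell pipeline above is the classic count/sort idiom (`sort | uniq -c | sort -gr`); the same tally in Python, where the sample rows are made up for illustration and the awk field-slicing is simplified to a tag strip:

```python
import re
from collections import Counter

# Hypothetical rows of the kind the awk '/class="title"/' filter matches
rows = [
    '<td class="title">Yorkshire Surprise Major</td>',
    '<td class="title">Yorkshire Surprise Major</td>',
    '<td class="title">Cambridge Surprise Major</td>',
]

counts = Counter(
    re.sub(r"<[^>]+>", "", row).strip()
    for row in rows
    if 'class="title"' in row
)

# Equivalent of "sort | uniq -c | sort -gr"
for name, n in counts.most_common():
    print(n, name)
```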
+21
mte/2019_09_01_internet-radio-from-raspberry-pi.json
+21
mte/2019_09_01_internet-radio-from-raspberry-pi.json
···+"content": "<p>Install the software packages needed</p>\n\n<div><div><pre><code>sudo apt-get install libmp3lame0 libtwolame0\nsudo apt-get install darkice\nsudo apt-get install icecast2\n</code></pre></div></div>\n\n<p>During the installation you will be asked to set the icecast password which you\u2019ll need to enter into the configuration file below</p>\n\n<p>Check your recording device is present</p>\n\n<div><div><pre><code>pi@raspberrypi:~ $ arecord -l\n**** List of CAPTURE Hardware Devices ****\ncard 1: AK5371 [AK5371], device 0: USB Audio [USB Audio]\nSubdevices: 0/1\nSubdevice #0: subdevice #0\n</code></pre></div></div>\n\n<p>Try to make a recording:</p>\n\n<div><div><pre><code>arecord -D plughw:1,0 temp.wav\n</code></pre></div></div>\n\n<p>If the volume is too quiet, you can adjust it with <code>alsamixer -c 1</code> where 1 is your audio device. Note that 0 is the Raspberry PI default output device.</p>\n\n<p>Create a configuration file for darkice</p>\n\n<div><div><pre><code># this section describes general aspects of the live streaming session\n[general]\nduration = 0 # duration of encoding, in seconds. 0 means forever\nbufferSecs = 5 # size of internal slip buffer, in seconds\nreconnect = yes # reconnect to the server(s) if disconnected\n\n\n# this section describes the audio input that will be streamed\n[input]\n# device = /dev/dsp # OSS DSP soundcard device for the audio input\ndevice = plughw:1,0 # OSS DSP soundcard device for the audio input\nsampleRate = 22050 # sample rate in Hz. try 11025, 22050 or 44100\nbitsPerSample = 16 # bits per sample. try 16\nchannel = 2 # channels. 1 = mono, 2 = stereo\n\n\n# this section describes a streaming connection to an IceCast2 server\n# there may be up to 8 of these sections, named [icecast2-0] ... 
[icecast2-7]\n# these can be mixed with [icecast-x] and [shoutcast-x] sections\n[icecast2-0]\nbitrateMode = abr # average bit rate\nformat = mp3 # format of the stream: ogg vorbis\nbitrate = 96 # bitrate of the stream sent to the server\nserver = localhost # host name of the server\nport = 8000 # port of the IceCast2 server, usually 8000\npassword = password # source password to the IceCast2 server\nmountPoint = mic # mount point of this stream on the IceCast2 server\nname = Microphone Raspberry Pi # name of the stream\ndescription = Broadcast from 2nd room # description of the stream\nurl = http://example.com/ # URL related to the stream\ngenre = my own # genre of the stream\npublic = no # advertise this stream?\n</code></pre></div></div>\n\n<p>Invoke the server by running darkice at the prompt.</p>\n\n<p>Set darkice to run at boot up</p>\n\n<div><div><pre><code>update-rc.d darkice defaults\n</code></pre></div></div>\n\n<p>Open a web browser to <code>http://<pi-ip-address>:8000</code> to view the installation. Add the url source to your Internet radio appliance via <code>http://<pi-ip-address>:8000/mic</code></p>",
+20
mte/2019_09_14_raspberry-pi-zero-w-headless-setup.json
+20
mte/2019_09_14_raspberry-pi-zero-w-headless-setup.json
···+"summary": "Copy 2019-07-10-raspbian-buster-lite.img to the SD card with Etcher. Then remove and reinsert the card.",+"content": "<p>Copy <code>2019-07-10-raspbian-buster-lite.img</code> to the SD card with Etcher. Then remove and reinsert the card.</p>\n\n<p>Enable ssh by creating a zero length file called <code>ssh</code>:</p>\n\n<div><div><pre><code>touch /Volumes/boot/ssh\n</code></pre></div></div>\n\n<p>Create a file <code>/Volumes/boot/wpa_supplicant.conf</code> using your favourite plain text editor:</p>\n\n<div><div><pre><code>ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\nupdate_config=1\ncountry=GB\n\nnetwork={\n ssid=\"your SSID\"\n psk=\"xxxxxxxx\"\n key_mgmt=WPA-PSK\n}\n</code></pre></div></div>",
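When flashing several cards it can be handy to generate that file from a script. A small Python sketch that emits the same `wpa_supplicant.conf` structure shown above; the helper function is mine, not part of the original write-up:

```python
def wpa_supplicant_conf(ssid, psk, country="GB"):
    # Emits the same structure as the hand-written file above
    return (
        "ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\n"
        "update_config=1\n"
        f"country={country}\n"
        "\n"
        "network={\n"
        f'    ssid="{ssid}"\n'
        f'    psk="{psk}"\n'
        "    key_mgmt=WPA-PSK\n"
        "}\n"
    )

print(wpa_supplicant_conf("your SSID", "xxxxxxxx"))
```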
+20
mte/2019_09_16_raspberry-pi-ssh-keys.json
+20
mte/2019_09_16_raspberry-pi-ssh-keys.json
···+"summary": "This is my cheatsheet based upon Passwordless SSH access on the official Raspberry PI website.",+"content": "<p>This is my cheatsheet based upon <a href=\"https://www.raspberrypi.org/documentation/remote-access/ssh/passwordless.md\">Passwordless SSH access</a> on the official Raspberry PI website.</p>\n\n<p>On the Mac create a key (once) with a passphrase</p>\n\n<div><div><pre><code>ssh-keygen\n</code></pre></div></div>\n\n<p>Add the key to your Mac keychain</p>\n\n<div><div><pre><code>ssh-add -K ~/.ssh/id_rsa\n</code></pre></div></div>\n\n<p>Optionally create a file <code>~/.ssh/config</code> with these contents; the <code>UseKeychain yes</code> line tells OSX to look at the keychain for the passphrase.</p>\n\n<div><div><pre><code>Host *\n UseKeychain yes\n AddKeysToAgent yes\n IdentityFile ~/.ssh/id_rsa\n</code></pre></div></div>\n\n<p>Then copy your key to your Raspberry PI</p>\n\n<div><div><pre><code>ssh-copy-id pi@192.168.1.x\n</code></pre></div></div>\n\n<p>SSH to the PI</p>\n\n<div><div><pre><code>ssh pi@192.168.1.x\n</code></pre></div></div>\n\n<p>Next edit your <code>/etc/ssh/sshd_config</code> to turn off plain text password authentication and restart <code>sshd</code>.</p>\n\n<div><div><pre><code>sudo sed -i \"s/#PasswordAuthentication yes/PasswordAuthentication no/g\" /etc/ssh/sshd_config\nsudo /etc/init.d/ssh restart\n</code></pre></div></div>\n\n<p>Now you can SSH without a password and without getting pestered that the default password hasn\u2019t been changed.</p>",
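The sed one-liner just uncomments and flips a single directive. The same edit sketched in Python, e.g. for use from a provisioning script; the function is illustrative, not from the original cheatsheet:

```python
def disable_password_auth(sshd_config: str) -> str:
    # Mirror of: sed "s/#PasswordAuthentication yes/PasswordAuthentication no/g"
    return sshd_config.replace(
        "#PasswordAuthentication yes", "PasswordAuthentication no"
    )

before = "Port 22\n#PasswordAuthentication yes\n"
print(disable_password_auth(before))
```

Like the sed version, this only matches the exact commented default, so a hand-edited line would need a regex instead.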
+21
mte/2019_09_20_bridged-wifi-access-point-with-raspberry-pi.json
+21
mte/2019_09_20_bridged-wifi-access-point-with-raspberry-pi.json
···+"summary": "Run ifconfig and determine your network device names. Typically these will be eth0 and wlan0.",+"content": "<p>Run <code>ifconfig</code> and determine your network device names. Typically these will be <code>eth0</code> and <code>wlan0</code>.</p>\n\n<p>Install the packages we\u2019ll need</p>\n\n<div><div><pre><code>apt-get install hostapd bridge-utils\n</code></pre></div></div>\n\n<p>Create a file <code>/etc/network/interfaces.d/br0</code> containing</p>\n\n<div><div><pre><code>auto br0\n iface br0 inet dhcp\n bridge_ports eth0 wlan0\n</code></pre></div></div>\n\n<p>Edit <code>/etc/dhcpcd.conf</code> and add the following line to the end of the file</p>\n\n<div><div><pre><code>denyinterfaces eth0 wlan0\n</code></pre></div></div>\n\n<p>Reboot your Pi to apply the configuration.</p>\n\n<p>Create the configuration file <code>/etc/hostapd/hostapd.conf</code> for <code>hostapd</code>.</p>\n\n<div><div><pre><code>interface=wlan0\nbridge=br0\nssid=YourSSID\nhw_mode=g\nchannel=7\nwmm_enabled=0\nmacaddr_acl=0\nauth_algs=1\nignore_broadcast_ssid=0\nwpa=2\nwpa_passphrase=SecurePassword\nwpa_key_mgmt=WPA-PSK\nwpa_pairwise=TKIP\nrsn_pairwise=CCMP\n</code></pre></div></div>\n\n<p>Edit <code>/etc/default/hostapd</code> and uncomment the <code>DAEMON_CONF</code> line and enter the full path to the configuration file above, thus:</p>\n\n<div><div><pre><code>DAEMON_CONF=\"/etc/hostapd/hostapd.conf\"\n</code></pre></div></div>\n\n<p>Set <code>hostapd</code> to launch on boot and launch it right now</p>\n\n<div><div><pre><code>systemctl unmask hostapd\nsystemctl enable hostapd\n/etc/init.d/hostapd start\n</code></pre></div></div>",
+21
mte/2019_09_20_oled-module-for-pi.json
+21
mte/2019_09_20_oled-module-for-pi.json
···+"content": "<p>Run <code>raspi-config</code> and turn on the i2c interface</p>\n\n<p>Install the i2c tools</p>\n\n<div><div><pre><code>apt-get install i2c-tools\n</code></pre></div></div>\n\n<p>Then check for your module by running <code>i2cdetect -y 1</code></p>\n\n<div><div><pre><code>root@pi2b:~ # i2cdetect -y 1\n 0 1 2 3 4 5 6 7 8 9 a b c d e f\n00: -- -- -- -- -- -- -- -- -- -- -- -- -- \n10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- \n20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- \n30: -- -- -- -- -- -- -- -- -- -- -- -- 3c -- -- -- \n40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- \n50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- \n60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- \n70: -- -- -- -- -- -- -- -- \n</code></pre></div></div>\n\n<p>This shows that you\u2019ve connected up the hardware correctly!</p>\n\n<p>Install the Python modules required by the Adafruit SSD1306 module.</p>\n\n<div><div><pre><code>apt-get install -y python3-dev python3-setuptools python3-pip python3-pil python3-rpi.gpio\n</code></pre></div></div>\n\n<p>Download the library from Github</p>\n\n<div><div><pre><code>git clone https://github.com/adafruit/Adafruit_Python_SSD1306.git\n</code></pre></div></div>\n\n<p>Install the library</p>\n\n<div><div><pre><code>sudo python3 setup.py install\n</code></pre></div></div>\n\n<p>Then run one of the examples such as <code>shapes.py</code></p>",
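The i2cdetect grid is easy to consume programmatically too. A small Python sketch that pulls the detected addresses out of captured i2cdetect output; the helper is mine, shown for illustration:

```python
def detected_addresses(i2cdetect_output):
    """Return the addresses i2cdetect reports as present
    (any cell in the grid body that isn't '--')."""
    found = []
    for line in i2cdetect_output.splitlines():
        if ":" not in line:
            continue                      # skip the column-header row
        _, _, cells = line.partition(":")
        for cell in cells.split():
            if cell != "--":
                found.append(int(cell, 16))
    return found

sample = (" 0 1 2 3 4 5 6 7 8 9 a b c d e f\n"
          "30: -- -- -- -- -- -- -- -- -- -- -- -- 3c -- -- --\n")
print([hex(a) for a in detected_addresses(sample)])  # ['0x3c']
```

For the grid above this yields 0x3c, the usual SSD1306 OLED address.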
+20
mte/2019_09_20_srx-firmware.json
+20
mte/2019_09_20_srx-firmware.json
···+"content": "<p>Download the latest version of the software and copy it over to the SRX</p>\n\n<div><div><pre><code>scp junos-srxsme-12.3X48-D65.1-domestic.tgz root@192.168.1.1:/var/tmp\n</code></pre></div></div>\n\n<p>On the SRX install the software into the alternative root partition</p>\n\n<div><div><pre><code>request system software add /var/tmp/junos-srxsme-12.3X48-D65.1-domestic.tgz no-copy no-validate unlink\n</code></pre></div></div>\n\n<p>Reboot</p>\n\n<div><div><pre><code>request system reboot\n</code></pre></div></div>\n\n<p>Once it has rebooted, update the alternate image to the new version.</p>\n\n<div><div><pre><code>request system snapshot slice alternate\n</code></pre></div></div>",
+20
mte/2019_09_21_bose-soundtouch-and-mini-dlna.json
+20
mte/2019_09_21_bose-soundtouch-and-mini-dlna.json
···+"summary": "Bose have a Windows application that can host your music library; however, I don\u2019t have a Windows machine turned on permanently and I\u2019d prefer a low power Raspberry PI option.",+"content": "<p><a href=\"https://www.bose.co.uk\">Bose</a> have a Windows application that can host your music library; however, I don\u2019t have a Windows machine turned on permanently and I\u2019d prefer a low power Raspberry PI option.</p>\n\n<p>Install Mini DLNA</p>\n\n<div><div><pre><code>apt-get install minidlna\n</code></pre></div></div>\n\n<p>Copy the Music over to the staging folder. I have my MP3 files on an external hard disk so I\u2019ll copy them over like this</p>\n\n<div><div><pre><code>tar -C /mnt/Music -cvf - . | tar -C /var/lib/minidlna -xf -\n</code></pre></div></div>\n\n<p>Set the file ownership</p>\n\n<div><div><pre><code>chown -R minidlna:minidlna /var/lib/minidlna /var/cache/minidlna\n</code></pre></div></div>\n\n<p>Sometimes you need to delete the database from <code>/var/cache/minidlna/files.db</code> and restart the service</p>\n\n<div><div><pre><code>service minidlna stop\nrm /var/cache/minidlna/files.db\nservice minidlna start\n</code></pre></div></div>\n\n<p>Check the status at <code>http://<host_ip>:8200</code></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/minidlna-status.png\"></p>\n\n<p>Now on the Bose SoundTouch app go to Add Service, Music Library on NAS and select your Pi from the list:</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/soundtouch-app.jpg\"></p>",
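The tar-to-tar pipe is just a recursive copy. The Python standard-library equivalent, useful if you are scripting the same staging step; the paths here are temporary stand-ins for /mnt/Music and /var/lib/minidlna:

```python
import pathlib
import shutil
import tempfile

# Stand-ins for /mnt/Music (source) and /var/lib/minidlna (destination)
src = pathlib.Path(tempfile.mkdtemp())
dst = pathlib.Path(tempfile.mkdtemp()) / "minidlna"

(src / "album").mkdir()
(src / "album" / "track01.mp3").write_text("not really an mp3")

# Recursive copy, like: tar -C src -cvf - . | tar -C dst -xf -
shutil.copytree(src, dst, dirs_exist_ok=True)
print((dst / "album" / "track01.mp3").read_text())
```

Unlike the tar pipeline run as root, this keeps the invoking user's ownership, so the `chown -R minidlna:minidlna` step above is still needed afterwards.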
+20
mte/2020_02_06_import-text-file-of-events-into-apple-calendar-using-applescript.json
+20
mte/2020_02_06_import-text-file-of-events-into-apple-calendar-using-applescript.json
···+"id": "https://www.tunbury.org/2020/02/06/import-text-file-of-events-into-apple-calendar-using-applescript",+"link": "https://www.tunbury.org/2020/02/06/import-text-file-of-events-into-apple-calendar-using-applescript/",+"summary": "The Church of England has a very useful calendar page, but I\u2019d really like it in my iPhone calendar so I can have reminders for Saints\u2019 days particularly red letter days when the flag goes up.",+"content": "<p>The Church of England has a very useful <a href=\"https://www.churchofengland.org/prayer-and-worship/worship-texts-and-resources/common-worship/prayer-and-worship/worship-texts-and-resources/common-worship/churchs-year/calendar\">calendar</a> page, but I\u2019d really like it in my iPhone calendar so I can have reminders for Saints\u2019 days particularly red letter days when the flag goes up.</p>\n\n<p>I\u2019ve never used AppleScript before but with a little searching online it seemed relatively easy to create a script to import a text file copy of the web page into my Mac calendar which is synchronised with my phone.</p>\n\n<div><div><pre><code>set OldDelimiters to AppleScript's text item delimiters\nset LF to ASCII character 10\nset tab to ASCII character 9\nset theFile to choose file with prompt \"Select TAB delimited file calendar file\"\nset theLines to read theFile\nset AppleScript's text item delimiters to {LF}\nset theLines to paragraphs of theLines\nset AppleScript's text item delimiters to {tab}\nrepeat with ThisLine in theLines\nif (count of ThisLine) > 0 then\nset theStartDate to current date\nset hours of theStartDate to 0\nset minutes of theStartDate to 0\nset seconds of theStartDate to 0\n\nif text item 1 of ThisLine is not \"0\" then\nset year of theStartDate to text item 1 of ThisLine as number\nend if\n\nif text item 2 of ThisLine is equal to \"January\" then\nset month of theStartDate to 1\nelse if text item 2 of ThisLine is equal to \"February\" then\nset month of theStartDate to 2\nelse if text 
item 2 of ThisLine is equal to \"March\" then\nset month of theStartDate to 3\nelse if text item 2 of ThisLine is equal to \"April\" then\nset month of theStartDate to 4\nelse if text item 2 of ThisLine is equal to \"May\" then\nset month of theStartDate to 5\nelse if text item 2 of ThisLine is equal to \"June\" then\nset month of theStartDate to 6\nelse if text item 2 of ThisLine is equal to \"July\" then\nset month of theStartDate to 7\nelse if text item 2 of ThisLine is equal to \"August\" then\nset month of theStartDate to 8\nelse if text item 2 of ThisLine is equal to \"September\" then\nset month of theStartDate to 9\nelse if text item 2 of ThisLine is equal to \"October\" then\nset month of theStartDate to 10\nelse if text item 2 of ThisLine is equal to \"November\" then\nset month of theStartDate to 11\nelse if text item 2 of ThisLine is equal to \"December\" then\nset month of theStartDate to 12\nelse\nlog text item 2 of ThisLine\nend if\n\nset day of theStartDate to text item 3 of ThisLine\n\nset theEndDate to theStartDate + (23 * hours)\n\nlog theStartDate\n\ntell application \"Calendar\"\nif text item 5 of ThisLine is \"RED\" then\ntell calendar \"CofE RED\"\nif text item 1 of ThisLine is not \"0\" then\nset newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true}\nelse\nset newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true, recurrence:\"freq=Yearly\"}\nend if\nend tell\nelse\ntell calendar \"CofE\"\nif text item 1 of ThisLine is not \"0\" then\nset newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true}\nelse\nset newEvent to make new event with properties {summary:text item 4 of ThisLine, start date:theStartDate, end date:theEndDate, allday event:true, recurrence:\"freq=Yearly\"}\nend if\nend 
tell\nend if\nend tell\n\nend if\n\nend repeat\n \nset AppleScript's text item delimiters to OldDelimiters\n</code></pre></div></div>\n\n<p><a href=\"https://www.tunbury.org/downloads/cofe-calendar.txt\">cofe-calendar</a></p>",
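The AppleScript above maps a TAB-delimited line (year-or-"0", month name, day, summary, colour) onto a calendar event, using a long if/else chain for the month names and `recurrence:"freq=Yearly"` when the year field is "0". The same transformation can be sketched in Python; the function name, the placeholder year 1900 and the sample lines below are my own illustration, not from the post:

```python
import datetime

# Month-name lookup replacing the AppleScript if/else chain
MONTHS = {m: i for i, m in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

def parse_event(line: str):
    """Parse one TAB-delimited calendar line into
    (date, summary, is_red, repeats_yearly). A year field of "0" marks
    an annually recurring event; 1900 is used as its placeholder year."""
    year, month, day, summary, colour = line.rstrip("\n").split("\t")
    repeats = year == "0"
    date = datetime.date(1900 if repeats else int(year),
                         MONTHS[month], int(day))
    return date, summary, colour == "RED", repeats
```

As in the AppleScript, the colour column only distinguishes "RED" (red letter days) from everything else, which selects which calendar the event lands in.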
+20
mte/2020_02_25_how-to-github.json
+20
mte/2020_02_25_how-to-github.json
···+"summary": "I really don\u2019t use GitHub often enough to remember the commands without searching for them each time, which means that I use GitHub even less as I can\u2019t remember the commands. Here\u2019s a short cheat sheet on the most common things I need to do in GitHub.",+"content": "<p>I really don\u2019t use GitHub often enough to remember the commands without searching for them each time, which means that I use GitHub even less as I can\u2019t remember the commands. Here\u2019s a short cheat sheet on the most common things I need to do in GitHub.</p>\n\n<p>Navigate to your project folder then create a repository for that directory</p>\n\n<div><div><pre><code>git init\n</code></pre></div></div>\n\n<p>Add all the files in the current directory to the Git index. Of course you can be more selective here and iteratively add files one at a time</p>\n\n<div><div><pre><code>git add .\n</code></pre></div></div>\n\n<p>The current status can be checked at any time using</p>\n\n<div><div><pre><code>git status\n</code></pre></div></div>\n\n<p>Now commit the files in their current state to the repository with whatever comment is appropriate</p>\n\n<div><div><pre><code>git commit -m \"Initial commit\"\n</code></pre></div></div>\n\n<p>You may well be prompted to set your global username and email if you\u2019ve not done it before:</p>\n\n<div><div><pre><code>git config --global user.email \"you@yourdomain.com\"\ngit config --global user.name \"Your Name\"\n</code></pre></div></div>\n\n<p>At some time later, after you have made changes, you need to add the changed files again and commit, or do a combined add/commit like this</p>\n\n<div><div><pre><code>git commit -a -m \"great new code added\"\n</code></pre></div></div>\n\n<p>To see the current changes compared to the repository</p>\n\n<div><div><pre><code>git diff\n</code></pre></div></div>\n\n<p>And finally if things went south you can commit the current state and then revert to the last commit 
point</p>\n\n<div><div><pre><code>git commit -a -m \"Oops\"\ngit revert HEAD --no-edit\n</code></pre></div></div>\n\n<h1>Working Online</h1>\n\n<p>That\u2019s all very well and I could continue to work like that but I want to keep a copy at GitHub so create an RSA key for authentication</p>\n\n<div><div><pre><code>ssh-keygen -t rsa -b 4096 -C \"you@yourdomain.com\"\n</code></pre></div></div>\n\n<p>Add this key to your SSH Agent</p>\n\n<div><div><pre><code>ssh-add ~/.ssh/id_rsa\n</code></pre></div></div>\n\n<p>Sign in to GitHub and copy and paste the public key into the SSH and GPG Keys section</p>\n\n<div><div><pre><code>cat ~/.ssh/id_rsa.pub\n</code></pre></div></div>\n\n<p>Create an empty repository on the website. Note the SSH address and add it as a remote repository on your local system</p>\n\n<div><div><pre><code>git remote add origin git@github.com:username/project.git\n</code></pre></div></div>\n\n<p>And then push your local copy to GitHub</p>\n\n<div><div><pre><code>git push -u origin master\n</code></pre></div></div>",
+21
mte/2020_04_12_music-library.json
+21
mte/2020_04_12_music-library.json
···+"summary": "Using a Raspberry PI with a USB CD drive to read all my CDs and create a master, FLAC format, repository and from that create MP3 and AAC versions for the car and iTunes.",+"content": "<p>Using a Raspberry PI with a USB CD drive to read all my CDs and create a master, FLAC format, repository and from that create MP3 and AAC versions for the car and iTunes.</p>\n\n<div><div><pre><code>sudo apt-get install abcde\nsudo apt-get install flac\n</code></pre></div></div>\n\n<p>Then rip a CD with</p>\n\n<div><div><pre><code>abcde -a cddb,read,getalbumart,encode,tag,move,clean -j 4 -B -o flac -N \n</code></pre></div></div>\n\n<p>To make <code>abcde</code> create file names in the format that I prefer, create <code>.abcde.conf</code> in the users\u2019 home directory containing:</p>\n\n<div><div><pre><code>OUTPUTFORMAT='${OUTPUT}/${ARTISTFILE}/${ALBUMFILE}/${TRACKNUM} - ${TRACKFILE}'\n\nmungefilename ()\n{\n echo \"$@\" | sed -e 's/^\\.*//' | tr -d \":><|*/\\\"'?[:cntrl:]\"\n}\n</code></pre></div></div>\n\n<p>And encode it as AAC using</p>\n\n<div><div><pre><code>ffmpeg -i \"01 - Santas Coming for Us.flac\" -c:v mjpeg -vf scale=500:500 -c:a aac -b:a 128k -threads 4 \"01 - Santas Coming for Us.m4a\"\n</code></pre></div></div>\n\n<p>This could be rolled up as follows with find/xargs</p>\n\n<div><div><pre><code>find . -name \"*.flac\" -print0 | xargs -0 -P 4 -I{} ffmpeg -i {} -c:v mjpeg -vf scale=500:500 -c:a aac -b:a 128k -n {}.m4a\n</code></pre></div></div>\n\n<p>The <code>-n</code> here causes it to skip files where the output file already exists so the command can be run again on an existing directory tree. <code>-P 4</code> forks 4 copies of <code>ffmpeg</code>.</p>\n\n<p>Finally copy the m4a files to <code>~/Music/Music/Media/Automatically Add to Music.localized</code></p>",
+21
mte/2020_04_18_minecraft-java-edition-server-on-ubuntu-18-04.json
+21
mte/2020_04_18_minecraft-java-edition-server-on-ubuntu-18-04.json
···+"content": "<p>See <a href=\"https://linuxize.com/post/how-to-install-minecraft-server-on-ubuntu-18-04/\">How to install a Minecraft Bedrock Server on Ubuntu</a></p>\n\n<blockquote>\n <p>I\u2019ll note here that this works perfectly, but it doesn\u2019t do what I wanted it to! What I discovered afterwards is that there is Minecraft Java Edition which is the original product but Java Edition only supports cross play with Java Edition endpoints such as a PC or Mac. iPhones/iPad use the newer C++ Edition and there is a new Bedrock Edition server which works across both Java and C++ endpoints.</p>\n</blockquote>\n\n<p>Install Ubuntu 18.04.4 using VMware Fusion. Create a bridged connection to the LAN not the default NAT\u2019ed connection. Allow SSH. Install my SSH key using <code>ssh-copy-id user@192.168.1.127</code></p>\n\n<p>Sign in on the console and run <code>sudo -Es</code>, then install the essentials</p>\n\n<div><div><pre><code>apt update\napt install git build-essential\napt install openjdk-8-jre-headless\n</code></pre></div></div>\n\n<p>Create, and then switch to a user account</p>\n\n<div><div><pre><code>useradd -r -m -U -d /opt/minecraft -s /bin/bash minecraft\nsu - minecraft\n</code></pre></div></div>\n\n<p>Create a folder structure to work with</p>\n\n<div><div><pre><code>mkdir -p ~/{backups,tools,server}\n</code></pre></div></div>\n\n<p>Clone the git repository for the <code>mcrcon</code> tool</p>\n\n<div><div><pre><code>cd ~/tools && git clone https://github.com/Tiiffi/mcrcon.git\n</code></pre></div></div>\n\n<p>Compile it</p>\n\n<div><div><pre><code>cd ~/tools/mcrcon && gcc -std=gnu11 -pedantic -Wall -Wextra -O2 -s -o mcrcon mcrcon.c\n</code></pre></div></div>\n\n<p>Download the JAR file</p>\n\n<div><div><pre><code>wget https://launcher.mojang.com/v1/objects/bb2b6b1aefcd70dfd1892149ac3a215f6c636b07/server.jar -P ~/server\n</code></pre></div></div>\n\n<p>Make an initial run of the server</p>\n\n<div><div><pre><code>cd ~/server\njava -Xmx1024M -Xms512M -jar server.jar 
nogui\n</code></pre></div></div>\n\n<p>Update the eula.txt to accept the EULA</p>\n\n<div><div><pre><code>sed -i \"s/false/true/g\" ~/server/eula.txt\n</code></pre></div></div>\n\n<p>Edit <code>server.properties</code> to enable RCON and set the password</p>\n\n<div><div><pre><code>sed -i \"s/enable-rcon=false/enable-rcon=true/g\" ~/server/server.properties\nsed -i \"s/rcon.password=/rcon.password=s3cr3t/g\" ~/server/server.properties\n</code></pre></div></div>\n\n<p>Create a cron job to create backups</p>\n\n<div><div><pre><code>cat > /opt/minecraft/tools/backup.sh <<'EOF'\n#!/bin/bash\n\nfunction rcon {\n/opt/minecraft/tools/mcrcon/mcrcon -H 127.0.0.1 -P 25575 -p s3cr3t \"$1\"\n}\n\nrcon \"save-off\"\nrcon \"save-all\"\ntar -cvpzf /opt/minecraft/backups/server-$(date +%F-%H-%M).tar.gz /opt/minecraft/server\nrcon \"save-on\"\n\n## Delete older backups\nfind /opt/minecraft/backups/ -type f -mtime +7 -name '*.gz' -delete\nEOF\n</code></pre></div></div>\n\n<p>Make it executable</p>\n\n<div><div><pre><code>chmod +x /opt/minecraft/tools/backup.sh\n</code></pre></div></div>\n\n<p>Schedule the backup to run at 3am via CRON using <code>crontab -e</code></p>\n\n<div><div><pre><code>0 3 * * * /opt/minecraft/tools/backup.sh\n</code></pre></div></div>\n\n<p>As root, create <code>/etc/systemd/system/minecraft.service</code></p>\n\n<div><div><pre><code>cat > /etc/systemd/system/minecraft.service <<'EOF'\n[Unit]\nDescription=Minecraft Server\nAfter=network.target\n\n[Service]\nUser=minecraft\nNice=1\nKillMode=none\nSuccessExitStatus=0 1\nProtectHome=true\nProtectSystem=full\nPrivateDevices=true\nNoNewPrivileges=true\nWorkingDirectory=/opt/minecraft/server\nExecStart=/usr/bin/java -Xmx2048M -Xms1024M -jar server.jar nogui\nExecStop=/opt/minecraft/tools/mcrcon/mcrcon -H 127.0.0.1 -P 25575 -p s3cr3t stop\n\n[Install]\nWantedBy=multi-user.target\nEOF\n</code></pre></div></div>\n\n<p>Refresh <code>systemd</code>, set the service to start at boot, start the service and check the 
status:</p>\n\n<div><div><pre><code>sudo systemctl daemon-reload\nsudo systemctl enable minecraft\nsudo systemctl start minecraft\nsudo systemctl status minecraft\n</code></pre></div></div>\n\n<p>Open the firewall port</p>\n\n<div><div><pre><code>sudo ufw allow 25565/tcp\n</code></pre></div></div>\n\n<p>If, down the road, you want to create a new world, just stop the server and delete <code>/opt/minecraft/server/world</code>. Alternatively, edit <code>server.properties</code> and set a new name on <code>level-name=world</code>.</p>",
+20
mte/2020_04_19_square-root.json
+20
mte/2020_04_19_square-root.json
···+"summary": "As a first step in calculating a square root, look at the order of magnitude of the number and this will quickly allow the determination of the number of digits in the solution. Consider squaring numbers less than 10; the solutions will be less than 100. Squaring numbers less than 100 gives solutions less than 10,000 and numbers less than 1,000 will square to numbers less than 1,000,000 etc. In general terms the square root of a number with an even number of digits will have half the number of digits as the original number. For numbers with an odd number of digits then the solution will have one more than half the number of digits.",+"content": "<p>As a first step in calculating a square root, look at the order of magnitude of the number and this will quickly allow the determination of the number of digits in the solution. Consider squaring numbers less than 10; the solutions will be less than 100. Squaring numbers less than 100 gives solutions less than 10,000 and numbers less than 1,000 will square to numbers less than 1,000,000 etc. In general terms the square root of a number with an even number of digits will have half the number of digits as the original number. For numbers with an odd number of digits then the solution will have one more than half the number of digits.</p>\n\n<p>The second point of note is that the square root of a number 100 times larger gives a solution 10 times larger.</p>\n\n\\[10\\sqrt{x}=\\sqrt{100x}\\]\n\n<p>To work through the method, let\u2019s consider calculating the square root of 65,000. From the above, we know that the solution will be a three digit number. 
We can think of the three digit solution as h hundreds, t tens and u units.</p>\n\n\\[\\sqrt{x}=h+t+u\\]\n\n<p>Therefore</p>\n\n\\[x=(h+t+u)^2\\]\n\n<p>This can be visualised geometrically as a square:</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/square3.svg\"></p>\n\n<p>The area of the <em>hundred</em> square is the largest <em>h</em> which satisfies</p>\n\n\\[h^2<65000\\]\n\n<p>Trying successive h values</p>\n\n\\[200^2=40000\\]\n\n\\[300^2=90000\\]\n\n<p>Therefore <em>h</em> is 200</p>\n\n<p>This can be written out using a form of long division</p>\n\n<div><div><pre><code> 2 0 0\n +-------\n |6 50 00\n200x200 4 00 00\n -------\n 2 50 00\n</code></pre></div></div>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/square2.svg\"></p>\n\n<p>Now looking at the geometric representation we can write down the area of the <em>hundred</em> square and the two rectangles of sides <em>h</em> and <em>t</em> and a square with sides <em>t</em> as being less than the total area. This can be shown in this formula:</p>\n\n\\[x>h^2+2ht+t^2\\]\n\n<p>Substituting for <em>h</em> and rearranging:</p>\n\n\\[65000-40000>2(200t)+t^2\\]\n\n\\[25000>t(400+t)\\]\n\n<p>Since <em>t</em> is a tens number, we are looking for the largest value which satisfies</p>\n\n\\[25000>4\\_0\\times \\_0\\]\n\n<p>Trying possible numbers</p>\n\n\\[440\\times 40=17600\\]\n\n\\[450\\times 50=22500\\]\n\n\\[460\\times 60=27600\\]\n\n<p>Therefore, <em>t</em> is 50</p>\n\n<div><div><pre><code> 2 5 0\n +-------\n |6 50 00\n200x200 4 00 00\n -------\n 2 50 00\n450x50 2 25 00\n -------\n 25 00\n</code></pre></div></div>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/sqaure.svg\"></p>\n\n<p>Returning to the geometric representation we can write down the area of the <em>hundred</em> square and the two rectangles of sides <em>h</em> and <em>t</em> and the tens square as above and additionally include the two rectangles of sides <em>h + t</em> by <em>u</em> and the <em>units</em> square. 
This can be shown in this formula:</p>\n\n\\[x>h^2+2ht+t^2+2(h+t)u+u^2\\]\n\n<p>The first part of the formula is the same as above so the values are already known and additionally substituting for <em>h</em> and <em>t</em>:</p>\n\n\\[65000>40000+22500+2(200+50)u+u^2\\]\n\n\\[2500>u(500+u)\\]\n\n<p>Since <em>u</em> is a units number, we are looking for the largest value which satisfies</p>\n\n\\[2500>50\\_\\times \\_\\]\n\n<p>Trying possible numbers</p>\n\n\\[503\\times 3=1509\\]\n\n\\[504\\times 4=2016\\]\n\n\\[505\\times 5=2525\\]\n\n<p>Therefore, <em>u</em> is 4</p>\n\n<div><div><pre><code> 2 5 4\n +-------\n |6 50 00\n200x200 4 00 00\n -------\n 2 50 00\n450x50 2 25 00\n -------\n 25 00\n504x4 20 16\n -----\n 4 84\n</code></pre></div></div>\n\n<p>We could extend this into fractions where f is 1/10:</p>\n\n\\[x>h^2+2ht+t^2+2(h+t)u+u^2+2(h+t+u)f+f^2\\]\n\n<p>However, this is unnecessary because realising that at each step we are using double the current solution it is evident that:</p>\n\n\\[254\\times 2=508\\]\n\n\\[508.\\_\\times 0.\\_\\]\n\n<div><div><pre><code> 2 5 4. 9\n +----------\n |6 50 00.00\n200x200 4 00 00.00\n ----------\n 2 50 00.00\n450x50 2 25 00.00\n ----------\n 25 00.00\n504x4 20 16.00\n --------\n 4 84.00\n508.9x0.9 4 58.01\n -------\n 25.99\n</code></pre></div></div>\n\n<p>And once again, solving for:</p>\n\n\\[254.9\\times 2=509.8\\]\n\n\\[509.8\\_\\times 0.0\\_\\]\n\n<div><div><pre><code> 2 5 4. 9 5\n +-------------\n |6 50 00.00 00\n200x200 4 00 00.00 00\n -------------\n 2 50 00.00 00\n450x50 2 25 00.00 00\n -------------\n 25 00.00 00\n504x4 20 16.00 00\n -----------\n 4 84.00 00\n508.9x0.9 4 58.01 00\n ----------\n 25.99 00\n509.85x0.05 25.49 25\n --------\n .49 75\n</code></pre></div></div>",
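The long-division procedure above generalises to any number of digits: keep the root found so far and a remainder, bring down the next pair of digits, and take the largest digit d with (20·root + d)·d ≤ remainder, which is exactly the 4_0 × _0 and 50_ × _ patterns in the worked example. A Python sketch of the integer version (my own code, not from the post):

```python
def digit_sqrt(n: int) -> int:
    """Integer square root by the digit-by-digit (long-division) method.

    Digits of n are processed in pairs from the left; at each step the
    next root digit d is the largest with (20*root + d) * d <= remainder.
    """
    # Split n into base-100 "digit pairs", most significant first
    pairs = []
    while n > 0:
        pairs.append(n % 100)
        n //= 100
    pairs.reverse()

    root, remainder = 0, 0
    for pair in pairs:
        remainder = remainder * 100 + pair   # bring down the next pair
        d = 0
        while (20 * root + d + 1) * (d + 1) <= remainder:
            d += 1
        remainder -= (20 * root + d) * d     # subtract, as in the long division
        root = root * 10 + d
    return root
```

For 65,000 this walks the same steps as the tableau: digits 2, 5, 4 with 484 left over, matching 254² = 64,516.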
+20
mte/2020_05_30_civilization-iii-on-os-x.json
+20
mte/2020_05_30_civilization-iii-on-os-x.json
···+"content": "<p>Install Oracle VirtualBox and install Windows XP 32 bit.</p>\n\n<p>Mount the Guest Additions image and install them.</p>\n\n<p>Create an ISO from the Civ 3 installation CD using</p>\n\n<div><div><pre><code>hdiutil makehybrid -iso -joliet -o civ3.iso /Volumes/CIV3/\n</code></pre></div></div>\n\n<p>Mount the ISO on VirtualBox and install the game.</p>\n\n<p>Download and install the following patch to bring the installation up to 1.29f. See this <a href=\"https://support.2k.com/hc/en-us/articles/201333523-Civilization-III-1-29f-Patch\">site</a>.</p>\n\n<p><a href=\"https://www.tunbury.org/downloads/Civ3v129f.zip\">Civ3v129f</a></p>\n\n<p>Download the No CD patch from the PC Gamer <a href=\"https://www.pcgames.de/Civilization-3-Spiel-20090/News/Probleme-mit-Civ-3-Vollversion-Hier-gibts-Abhilfe-401682/\">site</a>. Specifically, I needed this file: <code>Civilization 3 PC Games Patch mit Conquest v1.29f (d).zip</code> provided below.</p>\n\n<p><a href=\"https://www.tunbury.org/downloads/Civilization3.zip\">Civilization3</a></p>\n\n<p>Lastly with VirtualBox running full screen Civ 3 doesn\u2019t fill the screen. Edit <code>Civilization3.ini</code> from <code>C:\\Program Files\\Infogrames Interactive\\Civilization III</code> and add <code>KeepRes=1</code></p>\n\n<div><div><pre><code>[Civilizaion III]\nKeepRes=1\n</code></pre></div></div>",
+20
mte/2020_06_04_raspberry-pi-as-rtsp-source-for-obs.json
+20
mte/2020_06_04_raspberry-pi-as-rtsp-source-for-obs.json
···+"summary": "Using the new Raspberry Pi Imager I\u2019ve installed the latest Raspberry Pi OS Lite (32 bit).",+"content": "<p>Using the new <a href=\"https://www.raspberrypi.org/downloads/\">Raspberry Pi Imager</a> I\u2019ve installed the latest Raspberry Pi OS Lite (32 bit).</p>\n\n<p>Boot the Pi and enable the camera module and SSH, both under Interfaces in <code>raspi-config</code>. You need to reboot before the camera is activated.</p>\n\n<p>Sign in and run <code>sudo -Es</code> to get an elevated prompt.</p>\n\n<p>Install <code>cmake</code> and <code>git</code>.</p>\n\n<div><div><pre><code>apt update && apt install git cmake\n</code></pre></div></div>\n\n<p>Download the code from GitHub</p>\n\n<div><div><pre><code>git clone https://github.com/mpromonet/v4l2rtspserver.git\n</code></pre></div></div>\n\n<p>Build the application and install it</p>\n\n<div><div><pre><code>cd v4l2rtspserver && cmake . && make && make install\n</code></pre></div></div>\n\n<p>Edit <code>/etc/rc.local</code> and add this line before the final line <code>exit 0</code> and reboot.</p>\n\n<div><div><pre><code>v4l2rtspserver -P 554 -W 1920 -H 1080 /dev/video0 &\n</code></pre></div></div>\n\n<p>For testing, install VLC Media Player and open a network stream to the following path:</p>\n\n<div><div><pre><code>rtsp://<pi_ip_address>/unicast\n</code></pre></div></div>\n\n<p>In Open Broadcaster Software (OBS) create a new Media Source and untick the check box for Local File and enter the RTSP URL in the input box.</p>",
+21
mte/2020_08_07_powershell-snmp.json
+21
mte/2020_08_07_powershell-snmp.json
···+"summary": "Potentially, I\u2019ve got a bit carried away here. There isn\u2019t a native PowerShell module to query SNMP which I found a bit surprising. How hard could it be? I\u2019ve got a SYSLOG server and client in PowerShell so this felt like a simple extension. The SNMP client needs to send a request over UDP to the SNMP server on port 161 and wait for the response. Sending via .NET\u2019s UDPClient is easy enough",+"content": "<p>Potentially, I\u2019ve got a bit carried away here. There isn\u2019t a native PowerShell module to query SNMP which I found a bit surprising. How hard could it be? I\u2019ve got a SYSLOG server and client in PowerShell so this felt like a simple extension. The SNMP client needs to send a request over UDP to the SNMP server on port 161 and wait for the response. Sending via .NET\u2019s UDPClient is easy enough</p>\n\n<div><div><pre><code>$UDPCLient = New-Object -TypeName System.Net.Sockets.UdpClient\n$UDPCLient.Connect($Server, $UDPPort)\n$UDPCLient.Send($ByteMessage, $ByteMessage.Length)\n</code></pre></div></div>\n\n<p>Receiving is just a case of waiting on the socket with a timeout in case the host is down!</p>\n\n<div><div><pre><code>$asyncResult = $UDPCLient.BeginReceive($null, $null)\nif ($asyncResult.AsyncWaitHandle.WaitOne($Timeout)) {\n $UDPClient.EndReceive($asyncResult, [ref]$serverEndPoint)\n}\n$UDPCLient.Close()\n</code></pre></div></div>\n\n<p>Using Wireshark I captured the packets to take a look at the protocol in action. 
Below is an SNMP Request</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/snmp-request.png\"></p>\n\n<p>And this is an SNMP Reply</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/snmp-reply.png\"></p>\n\n<h1>ASN.1 and X.690</h1>\n\n<p>Reading <a href=\"https://tools.ietf.org/pdf/rfc1157.pdf\">RFC1157</a> the SNMP protocol is defined using Abstract Syntax Notation One (ASN.1) notation and is encoded using Basic Encoding Rules (BER) as defined in <a href=\"https://en.wikipedia.org/wiki/X.69\">X.690</a>.</p>\n\n<h1>.NET Methods</h1>\n\n<p>.NET has methods for <code>BerConverter.Encode()</code> and <code>BerConverter.Decode()</code> which on face value look pretty promising. Taking the data above, it can decode a chunk of it:</p>\n\n<div><div><pre><code>[System.Reflection.Assembly]::LoadWithPartialName(\"System.DirectoryServices.Protocols\")\n[System.DirectoryServices.Protocols.BerConverter]::Decode(\"{ia[iii]}\", @(0x30, 0x17, 0x2, 0x1, 0x0, 0x4, 0x6, 0x70, 0x75, 0x62, 0x6c, 0x69, 0x63, 0xa0, 0xa, 0x2, 0x2, 0x65, 0x2e, 0x2, 0x1, 0x0, 0x2, 0x1, 0x0))\n0\npublic\n25902\n0\n0\n</code></pre></div></div>\n\n<p>And it can encode, although:</p>\n\n<ul>\n <li>it unnecessarily uses the long form encoding for length, for example: <code>84-00-00-00-1B</code> could easily be just <code>1B</code> thereby saving 4 bytes; and</li>\n <li>the <em>choice</em> section is encoded as a <em>set</em>.</li>\n</ul>\n\n<p>While these limitations make these functions unsuitable, they do a good job given the input specification is just a text string and a byte array.</p>\n\n<div><div><pre><code>$data = [System.DirectoryServices.Protocols.BerConverter]::Encode(\"{is[iii]}\", @(0, \"public\", 25902, 0, 0))\n[System.BitConverter]::ToString($data)\n30-84-00-00-00-1B-02-01-00-04-06-70-75-62-6C-69-63-31-84-00-00-00-0A-02-02-65-2E-02-01-00-02-01-00\n</code></pre></div></div>\n\n<h1>Packet Structure</h1>\n\n<p>You can\u2019t really get around the nested nature of the packets, particularly 
when it comes to encoding, as the length of each block incorporates the length of all the nested blocks.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/get-request.svg\"></p>\n\n<h1>BER Parser in PowerShell</h1>\n\n<p>To match the nested nature of the packet I\u2019m going to create a tree of PowerShell Objects (PSObject). Leaf nodes will be actual data aka <em>Primitives</em> (P) from X.690 while the other nodes will have child nodes, <em>Constructed</em> (C) in X.690.</p>\n\n<h1>Node Structure</h1>\n\n<p>Each PSObject will have the following properties</p>\n\n<ul>\n <li>Class [enumerated type]</li>\n <li>Constructed/Primitive [boolean]</li>\n <li>Tag [enumerated type]</li>\n <li>content [byte[]]</li>\n <li>inner [PSObject[]]</li>\n</ul>\n\n<p>A recursive function such as this produces the required structure:</p>\n\n<div><div><pre><code>Function DecodeBER {\n Param (\n [Parameter(mandatory = $true)]\n [ValidateNotNullOrEmpty()]\n [byte[]] \n $berInput\n )\n\n $ret = [PSObject[]]@()\n $length = 0\n\n for ($i = 0; $i -lt $berInput.length; $i += $length) {\n $tag = [asn1tag]($berInput[$i] -band 0x1f)\n $constructed = [boolean]($berInput[$i] -band 0x20)\n $class = [asn1class](($berInput[$i] -band 0xc0) -shr 6)\n\n $i++\n\n if ($tag -eq 31) {\n $tag = 0\n do {\n $tag = ($tag -shl 7) -bor ($berInput[$i] -band 0x7f)\n } while ($berInput[$i++] -band 0x80)\n }\n\n $length = $berInput[$i] -band 0x7f\n if ($berInput[$i++] -band 0x80) {\n $end = $i + $length\n $length = 0\n for (; $i -lt $end; $i++) {\n $length = ($length -shl 8) -bor $berInput[$i]\n }\n }\n\n $content = $berInput[$i..($i + $length - 1)]\n\n if ($constructed) {\n $ret += New-Object PSObject -Property @{class=$class; constructed=$true; tag=$tag; content=$null; inner=(DecodeBER $content)}\n } else {\n $ret += New-Object PSObject -Property @{class=$class; constructed=$false; tag=$tag; content=$content}\n }\n }\n return ,$ret\n}\n</code></pre></div></div>\n\n<p>Taking the payload from the Wireshark 
capture from above</p>\n\n<div><div><pre><code>$data = [Byte[]]@(0x30, 0x30, 0x02, 0x01, 0x00, 0x04,\n 0x06, 0x70, 0x75, 0x62, 0x6c, 0x69, 0x63, 0xa2, 0x23, 0x02, 0x02, 0x65, 0x2e, 0x02, 0x01, 0x00,\n 0x02, 0x01, 0x00, 0x30, 0x17, 0x30, 0x15, 0x06, 0x08, 0x2b, 0x06, 0x01, 0x02, 0x01, 0x01, 0x05,\n 0x00, 0x04, 0x09, 0x4e, 0x50, 0x49, 0x46, 0x30, 0x30, 0x46, 0x45, 0x34)\n</code></pre></div></div>\n\n<p>And passing that through the BER decoder and visualising it as JSON for the purpose this post (and I\u2019ve manually merged some lines in a text editor)</p>\n\n<div><div><pre><code>DecodeBER $data | ConvertTo-Json -Depth 10\n{\n\"value\": [\n {\n \"content\": null,\n \"tag\": 16,\n \"constructed\": true,\n \"class\": 0,\n \"inner\": [\n {\n \"content\": [ 0 ],\n \"tag\": 2,\n \"constructed\": false,\n \"class\": 0\n },\n {\n \"content\": [ 112, 117, 98, 108, 105, 99 ],\n \"tag\": 4,\n \"constructed\": false,\n \"class\": 0\n },\n {\n \"content\": null,\n \"tag\": 2,\n \"constructed\": true,\n \"class\": 2,\n \"inner\": [\n {\n \"content\": [ 101, 46 ],\n \"tag\": 2,\n \"constructed\": false,\n \"class\": 0\n },\n {\n \"content\": [ 0 ],\n \"tag\": 2,\n \"constructed\": false,\n \"class\": 0\n },\n {\n \"content\": [ 0 ],\n \"tag\": 2,\n \"constructed\": false,\n \"class\": 0\n },\n {\n \"content\": null,\n \"tag\": 16,\n \"constructed\": true,\n \"class\": 0,\n \"inner\": [\n {\n \"content\": null,\n \"tag\": 16,\n \"constructed\": true,\n \"class\": 0,\n \"inner\": [\n {\n \"content\": [ 43, 6, 1, 2, 1, 1, 5, 0 ],\n \"tag\": 6,\n \"constructed\": false,\n \"class\": 0\n },\n {\n \"content\": [ 78, 80, 73, 70, 48, 48, 70, 69, 52 ],\n \"tag\": 4,\n \"constructed\": false,\n \"class\": 0\n }\n ]\n }\n ]\n }\n ]\n }\n ]\n }\n ],\n\"Count\": 1\n}\n</code></pre></div></div>\n\n<p>To convert it back the other way we need an EncodeBER function</p>\n\n<div><div><pre><code>Function EncodeBER {\n Param (\n [Parameter(mandatory = $true)]\n [ValidateNotNullOrEmpty()]\n 
[PSObject[]] \n $berObj\n )\n\n $bytes = [byte[]]@()\n foreach ($b in $berObj) {\n $bits = (($b.class.value__ -band 0x3) -shl 6)\n if ($b.constructed) {\n $bits = $bits -bor 0x20\n }\n if ($b.tag -lt 31) {\n $bytes += $bits -bor $b.tag.value__\n } else {\n $bytes += $bits -bor 0x1f\n $num = $b.tag\n $tmp = @()\n do {\n $bits = [byte]($num -band 0x7f)\n if ($tmp.length -gt 0) {\n $bits = $bits -bor 0x80\n }\n $tmp += $bits\n $num = $num -shr 7\n } while ($num -gt 0)\n $bytes += $tmp[-1..-($tmp.length)]\n }\n\n if ($b.constructed) {\n $content = EncodeBER $b.inner\n } else {\n $content = $b.content\n }\n\n if ($content.length -lt 127) {\n $bytes += $content.length\n } else {\n $num = $content.length\n $len = [byte[]]@()\n do {\n $len += [byte]($num -band 0xff)\n $num = $num -shr 8\n } while ($num -gt 0)\n $bytes += $len.length -bor 0x80\n $bytes += $len[-1..-($len.length)]\n }\n\n if ($content.length -gt 0) {\n $bytes += $content\n }\n }\n return ,$bytes\n}\n</code></pre></div></div>\n\n<p>Thus a superficial check of encoding and decoding:</p>\n\n<div><div><pre><code>[System.BitConverter]::ToString($data)\n30-30-02-01-00-04-06-70-75-62-6C-69-63-A2-23-02-02-65-2E-02-01-00-02-01-00-30-17-30-15-06-08-2B-06-01-02-01-01-05-00-04-09-4E-50-49-46-30-30-46-45-34\n$obj = DecodeBER $data\n[System.BitConverter]::ToString(EncodeBER $obj)\n30-30-02-01-00-04-06-70-75-62-6C-69-63-A2-23-02-02-65-2E-02-01-00-02-01-00-30-17-30-15-06-08-2B-06-01-02-01-01-05-00-04-09-4E-50-49-46-30-30-46-45-34\n</code></pre></div></div>\n\n<p>The next steps here are to convert the <code>PSObject[]</code> tree into some sort of representation of an SNMP request and also create the reverse function to create an SNMP request from the tree structure. I\u2019m not going to bother pasting those here as the code is available on <a href=\"https://github.com/mtelvers/PS-SNMP\">GitHub</a>. 
They need some work to do better error checking etc., but they work. To use the function, run <code>$x = Get-SNMP -Server 172.29.0.89 -OIDs @('1.3.6.1.2.1.1.5.0', '1.3.6.1.2.1.1.3.0', '1.3.6.1.2.1.25.3.2.1.3.1', '1.3.6.1.2.1.43.5.1.1.17.1')</code> and then check <code>$x.varbind</code></p>\n\n<div><div><pre><code>Name Value\n---- -----\n1.3.6.1.2.1.1.3.0 70328978\n1.3.6.1.2.1.43.5.1.1.17.1 JPBVK7C09V\n1.3.6.1.2.1.1.5.0 NPI27362C\n1.3.6.1.2.1.25.3.2.1.3.1 HP Color LaserJet M553\n</code></pre></div></div>",
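The identifier and length rules that `EncodeBER` implements can be sketched outside PowerShell as well. The following Python is a minimal illustration (not part of the original post) of BER's definite-length encoding — short form for lengths under 128, long form otherwise — and the base-128 encoding used for tag numbers of 31 and above:

```python
def encode_length(n: int) -> bytes:
    """BER definite length: short form for n < 128, else long form."""
    if n < 128:
        return bytes([n])
    body = []
    while n > 0:
        body.append(n & 0xFF)  # accumulate little-endian
        n >>= 8
    body.reverse()  # emit big-endian
    return bytes([0x80 | len(body)]) + bytes(body)

def encode_tag(cls: int, constructed: bool, tag: int) -> bytes:
    """Identifier octets: class (2 bits), primitive/constructed bit, tag number."""
    bits = ((cls & 0x3) << 6) | (0x20 if constructed else 0)
    if tag < 31:
        return bytes([bits | tag])
    groups = []
    while True:  # base-128 groups; high bit set on all but the last group
        groups.append(tag & 0x7F)
        tag >>= 7
        if tag == 0:
            break
    groups.reverse()
    return bytes([bits | 0x1F]) + bytes(g | 0x80 for g in groups[:-1]) + bytes([groups[-1]])
```

For example, `encode_tag(0, True, 16)` gives `0x30` (a SEQUENCE, the first byte of the packet above) and `encode_tag(2, True, 2)` gives `0xA2` (the context-class GetResponse PDU).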
+20
mte/2020_08_12_netatalk-on-a-raspberry-pi.json
···+"summary": "Using the Raspberry PI imager application copy the Raspberry PI OS Lite to an SD card. Then remove and reinsert the card.",+"content": "<p>Using the <a href=\"https://www.raspberrypi.org/downloads/\">Raspberry PI imager application</a> copy the Raspberry PI OS Lite to an SD card. Then remove and reinsert the card.</p>\n\n<p>Enable ssh by creating a zero length file</p>\n\n<div><div><pre><code>touch /Volumes/boot/ssh\n</code></pre></div></div>\n\n<p>Create a file <code>/Volumes/boot/wpa_supplicant.conf</code> using your favourite text editor:</p>\n\n<div><div><pre><code>ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\nupdate_config=1\ncountry=GB\n\nnetwork={\n ssid=\"your SSID\"\n psk=\"xxxxxxxx\"\n key_mgmt=WPA-PSK\n}\n</code></pre></div></div>\n\n<p>Copy over your SSH key</p>\n\n<div><div><pre><code>ssh-copy-id pi@192.168.1.89\n</code></pre></div></div>\n\n<p>It\u2019s recommended to disable text password and/or change the pi user\u2019s password. See this <a href=\"https://www.tunbury.org/raspberry-pi-ssh-keys/\">post</a>.</p>\n\n<p>Switch to working as root to avoid adding <code>sudo</code> in front of everything</p>\n\n<div><div><pre><code>sudo -Es\n</code></pre></div></div>\n\n<p>Update your PI, which shouldn\u2019t take too long if you\u2019ve just downloaded a new version of the image, but there\u2019s always something!</p>\n\n<div><div><pre><code>apt update && apt upgrade -y\n</code></pre></div></div>\n\n<p>The key package we need here is <code>netatalk</code> so let\u2019s install that next:</p>\n\n<div><div><pre><code>apt-get install netatalk -y\n</code></pre></div></div>\n\n<p>The configuration is done via <code>/etc/netatalk/afp.conf</code>. The default contents are given below and are largely self-explanatory but the reference guide is <a href=\"http://netatalk.sourceforge.net/3.1/htmldocs/afp.conf.5.html\">here</a>. 
Uncomment/edit the lines as required by your configuration.</p>\n\n<div><div><pre><code>;\n; Netatalk 3.x configuration file\n;\n\n[Global]\n; Global server settings\n\n; [Homes]\n; basedir regex = /xxxx\n\n; [My AFP Volume]\n; path = /path/to/volume\n\n; [My Time Machine Volume]\n; path = /path/to/backup\n; time machine = yes\n</code></pre></div></div>\n\n<p>I\u2019ve created a test folder as follows</p>\n\n<div><div><pre><code>mkdir /a\nchown pi:pi /a\nchmod 777 /a\n</code></pre></div></div>\n\n<p>And then updated the configuration file as follows</p>\n\n<div><div><pre><code>[Global]\n uam list = uams_guest.so\n guest account = pi\n log file = /var/log/netatalk.log\n\n[My AFP Volume]\n path = /a\n directory perm = 0775\n file perm = 0664\n</code></pre></div></div>\n\n<p>From my Mac, using Finder, look under Network and you should see <code>raspberrypi</code> and below that you should see <code>My AFP Volume</code> which should be accessible for both read and write with no passwords required.</p>",
+20
mte/2020_08_22_dump-process-memory.json
···+"summary": "Yesterday in a stroke of good fortune, I remembered a job that I\u2019d set running a little while back and I checked in to see how it was doing. It\u2019s an MPI console app running on 22 distributed Ubuntu nodes. My application was set to output the time periodically and it currently reported a runtime of 15837421 seconds (just over six months). Unfortunately I couldn\u2019t see the current \u2018best\u2019 result as the results aren\u2019t displayed until the end. I was intrigued to see how it was doing.",+"content": "<p>Yesterday in a stroke of good fortune, I remembered a job that I\u2019d set running a little while back and I checked in to see how it was doing. It\u2019s an MPI console app running on 22 distributed Ubuntu nodes. My application was set to output the time periodically and it currently reported a runtime of 15837421 seconds (just over six months). Unfortunately I couldn\u2019t see the current \u2018best\u2019 result as the results aren\u2019t displayed until the end. I was intrigued to see how it was doing.</p>\n\n<p>From <code>ps</code> I could see that the <em>manager</em> of my MPI application was process id 28845. I knew that the application had a string representation of the current best result as all the child nodes reported back to this process.</p>\n\n<p>I found <a href=\"https://github.com/Nopius/pmap-dump\">pmap-dump</a> on GitHub which seemed to fit the bill. 
I cloned the repository, compiled and installed:</p>\n\n<div><div><pre><code>git clone https://github.com/Nopius/pmap-dump.git\ncd pmap-dump\nmake install\n</code></pre></div></div>\n\n<p>Then in Bash save the process id of my application in a variable:</p>\n\n<div><div><pre><code>pid=28845\n</code></pre></div></div>\n\n<p>Using <code>pmap</code>, I could dump the memory segments in use by the application which can be built into the appropriate command line for <code>pmap-dump</code>.</p>\n\n<div><div><pre><code>pmap -x $pid | awk -vPID=$pid 'BEGIN{ printf(\"pmap-dump -p \" PID)};($5~/^r/){printf(\" 0x\" $1 \" \" $2)};END{printf(\"\\n\")}'\n</code></pre></div></div>\n\n<p>This yielded a toxic command line like this\u2026.</p>\n\n<div><div><pre><code>pmap-dump -p 28845 0x0000560fc10e3000 124 0x0000560fc10e3000 0 0x0000560fc1302000 4 0x0000560fc1302000 0 0x0000560fc1303000 4 ...\n</code></pre></div></div>\n\n<p>\u2026 which when executed produced 65 binary .hex files.</p>\n\n<p>Since I knew my result was a lengthy string, I obtained it with</p>\n\n<div><div><pre><code>strings -w -n 30 *.hex\n</code></pre></div></div>\n\n<p>Today the router crashed and the connection was broken\u2026</p>",
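The awk one-liner above can equally be expressed in Python; here is a sketch that applies the same rule — keep rows of `pmap -x` output whose fifth (permissions) column starts with `r`, taking the address and Kbytes columns. The sample lines are made up for illustration, not taken from the real process:

```python
def build_pmap_dump_cmd(pid: int, pmap_output: str) -> str:
    """Build a pmap-dump command line from the readable segments of `pmap -x` output."""
    parts = ["pmap-dump", "-p", str(pid)]
    for line in pmap_output.splitlines():
        fields = line.split()
        # columns: Address Kbytes RSS Dirty Mode Mapping; keep Mode starting with 'r'
        if len(fields) >= 5 and fields[4].startswith("r"):
            parts += ["0x" + fields[0], fields[1]]
    return " ".join(parts)

# hypothetical pmap -x rows for demonstration
sample = """0000560fc10e3000     124     112       0 r-x-- app
0000560fc1302000       4       4       4 rw--- app
00007ffd1c9f0000     132      12      12 ----- [stack]"""

print(build_pmap_dump_cmd(28845, sample))
```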
+20
mte/2020_08_23_mandlebrot-set-3d.json
···+"summary": "Back in 2015 in one of the earliest posts on this site I wrote about my fascination with the Mandelbrot set.",+"content": "<p>Back in 2015 in one of the earliest posts on this site I wrote about my fascination with the Mandelbrot set.</p>\n\n\\[Z_{n+1}=Z_n^2+c\\]\n\n<p>In that <a href=\"https://www.tunbury.org/mandlebrot-set/\">post</a>, I presented a table giving two example iterations with different values of C showing both a <em>bound</em> and <em>unbound</em> condition. I\u2019d never really thought about the actual value the bound series tended towards, after all the final plot was the number of iterations it took to become unbound, i.e. where \\(\\lvert Z \\rvert > 2\\).</p>\n\n<p>Watching an episode of <a href=\"https://youtu.be/ETrYE4MdoLQ\">Numberphile on YouTube</a>, it became clear that I\u2019d really missed out on some interesting behaviour\u2026 about rabbits, which then led me to a <a href=\"https://youtu.be/ovJcsL7vyrk\">second video</a> and a view of the Mandelbrot set as I\u2019d never seen it before.</p>\n\n<p>The table below mirrors the one I presented in my original post but additionally shows the outcome at \\(C=-1.3\\).</p>\n\n\n\n \n \n \u00a0\n C = 0.2\n C = 0.3\n C = -1.3\n \n \n \n \n 0\n 0.000000\n 0.000000\n 0.000000\n \n \n 1\n 0.200000\n 0.300000\n -1.300000\n \n \n 2\n 0.240000\n 0.390000\n 0.390000\n \n \n 3\n 0.257600\n 0.452100\n -1.147900\n \n \n 4\n 0.266358\n 0.504394\n 0.017674\n \n \n 5\n 0.270946\n 0.554414\n -1.299688\n \n \n 6\n 0.273412\n 0.607375\n 0.389188\n \n \n 7\n 0.274754\n 0.668904\n -1.148533\n \n \n 8\n 0.275490\n 0.747432\n 0.019128\n \n \n 9\n 0.275895\n 0.858655\n -1.299634\n \n \n 10\n 0.276118\n 1.037289\n 0.389049\n \n \n 11\n 0.276241\n 1.375968\n -1.148641\n \n \n 12\n 0.276309\n 2.193288\n 0.019376\n \n \n 13\n 0.276347\n 5.110511\n -1.299625\n \n \n 14\n 0.276368\n 26.417318\n 0.389024\n \n \n 15\n 0.276379\n 698.174702\n -1.148660\n \n \n 16\n 0.276385\n #NUM!\n 0.019421\n \n \n 17\n 
0.276389\n #NUM!\n -1.299623\n \n \n 18\n 0.276391\n #NUM!\n 0.389020\n \n \n 19\n 0.276392\n #NUM!\n -1.148664\n \n \n 20\n 0.276392\n #NUM!\n 0.019429\n \n \n 21\n 0.276393\n #NUM!\n -1.299623\n \n \n 22\n 0.276393\n #NUM!\n 0.389019\n \n \n 23\n 0.276393\n #NUM!\n -1.148664\n \n \n 24\n 0.276393\n #NUM!\n 0.019430\n \n \n 25\n 0.276393\n #NUM!\n -1.299622\n \n \n 26\n 0.276393\n #NUM!\n 0.389019\n \n \n 27\n 0.276393\n #NUM!\n -1.148665\n \n \n 28\n 0.276393\n #NUM!\n 0.019430\n \n \n 29\n 0.276393\n #NUM!\n -1.299622\n \n \n 30\n 0.276393\n #NUM!\n 0.389019\n \n \n 31\n 0.276393\n #NUM!\n -1.148665\n \n \n\n\n<p>At \\(C=-1.3\\) there is a clear repeating pattern of four values.</p>\n\n<p>In Excel set row 1 as the value of C starting at -2 and incrementing by say 0.02 up to 0.0. Then run the iterations in columns below each value starting at 0. Extend the columns for perhaps 40 iterations.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/Excel-Formulas-Shown.png\"></p>\n\n<p>Now plot iterations 20-40 (when the values are typically stable) against the value of C.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/Excel-Plot.png\"></p>\n\n<p>I want to plot the real component of C on the x-axis, then imaginary component on the y-axis and the real part of the iterated sequence on the z-axis. Where the sequence repeats I\u2019ll plot all points within the sequence which looks to be what was done in the YouTube clip.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/3d-axis.svg\"></p>\n\n<p>I\u2019m sitting here with my new, albeit secondhand, Mac Pro so let\u2019s write this in Swift and do all the calculation and graphics on the GPU using Metal.</p>\n\n<p>The problem is well suited to GPU based calculations with a small kernel running once for each possible set of input coordinates, however the output of a massive sparsely populated three dimensional array seemed unfortunate. 
A resolution of 2048 x 2048 with iterative sequences of up to 1024 gives potentially 4 billion points\u2026 Therefore, I have opted for an output vector/array indexed with a shared atomically-incremented counter.</p>\n\n<p>To use the GPU to perform the calculations the program needs to be written in Metal Shading Language which is a variation on C++, but first the GPU needs to be initialised from Swift which for this project is pretty straightforward. We\u2019ll need a buffer for the output vector and another one for the counter:</p>\n\n<div><div><pre><code>vertexBuffer = device.makeBuffer(length: MemoryLayout<Vertex>.stride * 2048 * 2048, options: [])\ncounterBuffer = device.makeBuffer(length: MemoryLayout<UInt>.size, options: [])\n</code></pre></div></div>\n\n<p>Then we create a library within the GPU device where the name parameter exactly matches the MTL function name we want to call</p>\n\n<div><div><pre><code>let library = device.makeDefaultLibrary()\nlet calculate_func = library?.makeFunction(name: \"calculate_func\")\npipeLineState = try device.makeComputePipelineState(function: calculate_func!)\n</code></pre></div></div>\n\n<p>The <code>calculate_func</code> is defined as follows</p>\n\n<div><div><pre><code>kernel void calculate_func(device VertexIn* result,\n uint2 index [[ thread_position_in_grid ]],\n device atomic_uint &counter [[ buffer(1) ]]) {\n\n float bufRe[1024];\n float bufIm[1024];\n\n float Cre = (float(index.x) * 3 / 2048) - 2;\n float Cim = (float(index.y) * 3 / 2048) - 1.5;\n\n float Zre = 0;\n float Zim = 0;\n \n bufRe[0] = 0;\n bufIm[0] = 0;\n\n for (int iteration = 1; (iteration < 1024) && ((Zre * Zre + Zim * Zim) <= 4); iteration++) {\n float ZNre = Zre * Zre - Zim * Zim + Cre;\n Zim = 2 * Zre * Zim + Cim;\n Zre = ZNre;\n \n bufRe[iteration] = Zre;\n bufIm[iteration] = Zim;\n \n for (int i = iteration - 1; i; i--) {\n if ((bufRe[iteration] == bufRe[i]) && (bufIm[iteration] == bufIm[i])) {\n for (; i < iteration; i++) 
{\n float red = abs(bufIm[i]) * 5;\n float green = abs(bufRe[i]) / 2;\n float blue = 0.75;\n \n uint value = atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);\n result[value].position = float3(Cre, Cim, bufRe[i]);\n result[value].color = float4(red, green, blue, 1);\n }\n return;\n }\n }\n }\n}\n</code></pre></div></div>\n\n<p>The first section is the standard calculation for \\(Z_{n+1}\\). The nested loop searches back through the previous values to see if we have had this value before. While this should be an exhaustive check of every value, I haven\u2019t done that for performance reasons, but I did leave the check to be the exact floating point value rather than just 2 or 3 decimal places. If there is a match then all the points are copied to the output vector in a pretty colour.</p>\n\n<p>You can see the full code on <a href=\"https://github.com/mtelvers/threeDbrot\">Github</a>.</p>\n\n ",
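The kernel's cycle detection can be sanity-checked on the real axis. This is a small Python sketch of the same idea (using approximate comparison after a long warm-up, rather than the exact floating-point equality the Metal code uses), confirming the period-4 behaviour at C = -1.3 shown in the table:

```python
def cycle_period(c: complex, warmup: int = 1000, tol: float = 1e-9) -> int:
    """Iterate z -> z^2 + c past the transient, then return the smallest p
    with |z_{n+p} - z_n| < tol, i.e. the length of the attracting cycle."""
    z = 0j
    for _ in range(warmup):
        z = z * z + c  # skip the transient so the orbit settles onto the cycle
    orbit = [z]
    for _ in range(64):
        z = z * z + c
        orbit.append(z)
    for p in range(1, 64):
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return 0  # unbounded or period longer than 64

print(cycle_period(-1.3))  # the repeating pattern of four values from the table
print(cycle_period(0.2))   # a bound series tending to a single fixed point
```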
+21
mte/2020_08_29_raspberry-pi-as-rtsp-source-for-obs-using-vlc.json
···+"summary": "Using the new Raspberry Pi Imager I\u2019ve installed the latest Raspberry Pi OS Lite (32 bit).",+"content": "<p>Using the new <a href=\"https://www.raspberrypi.org/downloads/\">Raspberry Pi Imager</a> I\u2019ve installed the latest Raspberry Pi OS Lite (32 bit).</p>\n\n<p>Enable ssh by creating a zero length file called ssh on the boot volume</p>\n\n<div><div><pre><code>touch /Volumes/boot/ssh\n</code></pre></div></div>\n\n<p>Create a file <code>/Volumes/boot/wpa_supplicant.conf</code> using your favourite text editor:</p>\n\n<div><div><pre><code>ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\nupdate_config=1\ncountry=GB\n\nnetwork={\n ssid=\"your SSID\"\n psk=\"xxxxxxxx\"\n key_mgmt=WPA-PSK\n}\n</code></pre></div></div>\n\n<p>Boot the Pi and enable the camera module using <code>raspi-config</code>. You need to reboot before the camera is activated.</p>\n\n<p>Sign in and run <code>sudo -Es</code> to get an elevated prompt. Update the base software to the latest version then install <code>vlc</code>. 
This step will take a while\u2026</p>\n\n<div><div><pre><code>apt install vlc\n</code></pre></div></div>\n\n<p>Create a script containing this command line</p>\n\n<div><div><pre><code>#!/bin/bash\nraspivid -o - -t 0 -rot 180 -w 1920 -h 1080 -fps 30 -b 2000000 | cvlc -vvv stream:///dev/stdin --sout '#rtp{sdp=rtsp://:8554/stream}' :demux=h264\n</code></pre></div></div>\n\n<p>Test the stream by connecting to ip:8554 using vlc player on the desktop</p>\n\n<div><div><pre><code>rtsp://192.168.1.137:8554/stream\n</code></pre></div></div>\n\n<p>Automate the startup process by creating a service wrapper in <code>/etc/systemd/system/rtsp-stream.service</code> containing the following:</p>\n\n<div><div><pre><code>[Unit]\nDescription=auto start stream\nAfter=multi-user.target\n\n[Service]\nType=simple\nExecStart=/home/pi/rtsp-stream.sh\nUser=pi\nWorkingDirectory=/home/pi\nRestart=on-failure\n\n[Install]\nWantedBy=multi-user.target\n</code></pre></div></div>\n\n<p>Enable the service and then reboot</p>\n\n<div><div><pre><code>systemctl enable rtsp-stream.service\n</code></pre></div></div>\n\n<p>In Open Broadcast Studio (OBS) create a new Media Source and untick the check box for Local File and enter the RTSP URL in the input box.</p>",
+20
mte/2020_10_05_hard-disk-failure.json
···+"content": "<p>Check the status with <code>sudo mdadm --detail /dev/md0</code></p>\n\n<div><div><pre><code>/dev/md0:\n Version : 1.2\n Creation Time : Wed Sep 2 21:55:39 2015\n Raid Level : raid5\n Array Size : 878509056 (837.81 GiB 899.59 GB)\n Used Dev Size : 292836352 (279.27 GiB 299.86 GB)\n Raid Devices : 4\n Total Devices : 4\n Persistence : Superblock is persistent\n\n Update Time : Sun Oct 4 07:35:23 2020\n State : clean, degraded \n Active Devices : 3\n Working Devices : 3\n Failed Devices : 1\n Spare Devices : 0\n\n Layout : left-symmetric\n Chunk Size : 512K\n\nConsistency Policy : resync\n\n Name : plum:0 (local to host plum)\n UUID : 4a462153:dde89a43:0a4dd678:451bb2b4\n Events : 24024\n\n Number Major Minor RaidDevice State\n 0 8 17 0 active sync /dev/sdb1\n 1 8 33 1 active sync /dev/sdc1\n 5 8 49 2 active sync /dev/sdd1\n - 0 0 3 removed\n\n 4 8 65 - faulty /dev/sde1\n</code></pre></div></div>\n\n<p>Check which disks are which <code>sudo lshw -class disk</code>.</p>\n\n\n\n \n \n Mount\n Model\n Description\n \n \n \n \n /dev/sdb\n ST9300603SS\n Seagate Savvio 10 K.3 St9300603ss\n \n \n \u00a0\n MBE2073RC\n Fujitsu MBE2073RC 73.5GB SAS Hard Drive\n \n \n \u00a0\n MBE2073RC\n Fujitsu MBE2073RC 73.5GB SAS Hard Drive\n \n \n /dev/sdc\n ST9300603SS\n Seagate Savvio 10 K.3 St9300603ss\n \n \n /dev/sdd\n ST300MM0006\n Seagate Enterprise Performance 10K HDD ST300MM0006 300 GB\n \n \n /dev/sde\n ST9300603SS\n Seagate Savvio 10 K.3 St9300603ss\n \n \n\n\n<p>The boot drive is a hardware RAID1 using the two 73GB disks. 
<code>/var</code> is made up of the 300GB disks in a software RAID5 configuration.</p>\n\n<p>The ST9300603SS is still available on Amazon but the newer 10k.5 generation equivalent, the ST9300605SS, is available for same-day delivery and it\u2019s cheaper as well!</p>\n\n<p>Remove the disk</p>\n\n<div><div><pre><code>mdadm -r /dev/md0 /dev/sde1\n</code></pre></div></div>\n\n<p>This server does support hot plug but there were some zombie processes which I wanted to clear out and operationally a five minute outage would be fine.</p>\n\n<p>Shutdown the server and replace the disk. New disk (slot 2) during boot:</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/perc-bios.jpg\"></p>\n\n<p>After the reboot copy the partition table from one of the existing disks over to the new disk.</p>\n\n<div><div><pre><code>sfdisk -d /dev/sdb | sfdisk /dev/sde\n</code></pre></div></div>\n\n<p>Add the new disk into the array</p>\n\n<div><div><pre><code>mdadm /dev/md0 -a /dev/sde1\n</code></pre></div></div>\n\n<p>Monitor the rebuild process</p>\n\n<div><div><pre><code>watch -n 60 cat /proc/mdstat\n</code></pre></div></div>",
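The device table at the end of `mdadm --detail` can also be scanned programmatically when checking several hosts. A rough Python sketch against the output shown above; the parsing is deliberately naive and assumes the component device is the last whitespace-separated field on its line:

```python
def faulty_devices(mdadm_detail: str) -> list:
    """Return component devices whose state line contains 'faulty'."""
    found = []
    for line in mdadm_detail.splitlines():
        fields = line.split()
        if "faulty" in fields and fields and fields[-1].startswith("/dev/"):
            found.append(fields[-1])
    return found

# the device table from the mdadm --detail output above
detail = """   Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       5       8       49        2      active sync   /dev/sdd1
       -       0        0        3      removed

       4       8       65        -      faulty   /dev/sde1"""

print(faulty_devices(detail))  # ['/dev/sde1']
```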
+20
mte/2020_12_26_temper-usb-temperature-sensor.json
···+"summary": "These USB sensors are available pretty cheaply from PiHut and Amazon and are great for monitoring the temperature remotely (where you have a Pi).",+"content": "<p>These USB sensors are available pretty cheaply from PiHut and Amazon and\nare great for monitoring the temperature remotely (where you have a Pi).</p>\n\n<p>Install the necessary prerequisites:</p>\n\n<div><div><pre><code>sudo apt install libhidapi-dev/stable cmake bc\n</code></pre></div></div>\n\n<p>There is a <a href=\"https://github.com/edorfaus/TEMPered\">GitHub repository by Frode Austvik</a>:</p>\n\n<blockquote>\n <p>This project is a C implementation of a library and program to read all the\nvarious types of TEMPer thermometer and hygrometer USB devices, as produced by\nRDing Technology and sold under the name PCsensor.</p>\n</blockquote>\n\n<p>Download the software</p>\n\n<div><div><pre><code>git clone https://github.com/edorfaus/TEMPered\n</code></pre></div></div>\n\n<p>And build and install it:</p>\n\n<div><div><pre><code>cd TEMPered\ncmake .\nmake\nsudo cp utils/hid-query /usr/bin\n</code></pre></div></div>\n\n<p>Create a simple script to query the device and display the temperature.</p>\n\n<div><div><pre><code>#!/bin/bash\nOUTLINE=$(/usr/bin/hid-query /dev/hidraw1 0x01 0x80 0x33 0x01 0x00 0x00 0x00 0x00 | grep -A1 ^Response | tail -1)\nOUTNUM=$(echo $OUTLINE | sed -e 's/^[^0-9a-f]*[0-9a-f][0-9a-f] [0-9a-f][0-9a-f] \\([0-9a-f][0-9a-f]\\) \\([0-9a-f][0-9a-f]\\) .*$/0x\\1\\2/')\nHEX4=${OUTNUM:2:4}\nDVAL=$(( 16#$HEX4 ))\nCTEMP=$(bc <<< \"scale=2; $DVAL/100\")\necho $(date) $CTEMP\n</code></pre></div></div>\n\n<p>This works perfectly but it must be executed with <code>sudo</code>, or by first\nrunning <code>chmod 666 /dev/hidraw1</code>. 
This can be automated by creating\n<code>/etc/udev/rules.d/99-hidraw.rules</code> with the content below which creates\nthe <code>/dev</code> node with the appropriate permissions.</p>\n\n<div><div><pre><code>KERNEL==\"hidraw*\", SUBSYSTEM==\"hidraw\", MODE=\"0666\", GROUP=\"root\"\n</code></pre></div></div>\n\n<p>I\u2019ve added a cron job (<code>crontab -e</code>) to record the temperature every 5\nminutes:</p>\n\n<div><div><pre><code>0,5,10,15,20,25,30,35,40,45,50,55 * * * * /home/pi/temp.sh >> /home/pi/temperature.txt\n</code></pre></div></div>",
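The shell pipeline above extracts the third and fourth bytes of the HID response and divides the 16-bit big-endian value by 100 to get degrees Celsius. The same decode, sketched in Python; the example response bytes are illustrative, not captured from a real device, and treating the value as signed (for sub-zero readings) is my assumption — the original script reads it as unsigned, which is equivalent above freezing:

```python
def decode_temper(response: bytes) -> float:
    """Decode a TEMPer HID response: bytes 2-3 hold the temperature as a
    16-bit big-endian value in hundredths of a degree Celsius.
    Assumption: signed, so sub-zero temperatures decode sensibly."""
    raw = int.from_bytes(response[2:4], byteorder="big", signed=True)
    return raw / 100

# illustrative response whose 3rd and 4th bytes are 0x0a 0x23 (2595 -> 25.95 C)
print(decode_temper(bytes([0x80, 0x80, 0x0A, 0x23, 0x00, 0x00, 0x00, 0x00])))
```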
+20
mte/2021_01_01_normalise-mp3-files.json
···+"summary": "I have hundreds of MP3 files but the levels aren\u2019t standardised in any way which makes streaming them a bit hit and miss. I can normalise them using Audacity but I\u2019d really like an automatic way of doing it.",+"content": "<p>I have hundreds of MP3 files but the levels aren\u2019t standardised in any way which makes streaming them a bit hit and miss. I can normalise them using <a href=\"https://www.audacityteam.org/\">Audacity</a> but I\u2019d really like an automatic way of doing it.</p>\n\n<p>Install MP3GAIN</p>\n\n<div><div><pre><code>apt install mp3gain\n</code></pre></div></div>\n\n<p>It doesn\u2019t seem to run for some reason as it can\u2019t find the library.</p>\n\n<div><div><pre><code>==617==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.\n</code></pre></div></div>\n\n<p>Set <code>LD_PRELOAD</code></p>\n\n<div><div><pre><code>export LD_PRELOAD=/usr/lib/arm-linux-gnueabihf/libasan.so.4\n</code></pre></div></div>\n\n<p>Now it works!</p>\n\n<div><div><pre><code>mp3gain -e -c -r *.mp3\n</code></pre></div></div>",
+21
mte/2021_01_06_raspberry-pi-camera-with-m12-lens.json
···+"summary": "I really need a good lens on my Raspberry PI camera to use it with OBS from a decent distance. The new high resolution Raspberry Pi cameras look excellent but they also come with a hefty price tag which I just can\u2019t justify.",+"content": "<p>I really need a good lens on my Raspberry PI camera to use it with OBS from a decent distance. The new high resolution Raspberry Pi cameras look excellent but they also come with a hefty price tag which I just can\u2019t justify.</p>\n\n<blockquote>\n <p>First off, the mounting holes on both v1 and v2 RPi cameras are on 21 mm centers, so the 20 mm spacing of the M12 mount you link isn\u2019t a perfect fit. Depending on your mounting screw size, you may still be able to force it. Second, you have to manually cut or file down a notch in the M12 mount for the micro-flex cable that comes out of the camera module. That isn\u2019t too hard, but if you want, there is also a M12 mount specifically designed for the RPi cameras, with a notch already.</p>\n\n <p>The v1 and v2 sensor sizes are the same, the so-called 1/4-inch format. On V1 the lens focal length is f=3.6mm with Angle of View: 54 x 41 degrees and on V2 it is f=3.0mm with Angle of View: 62.2 x 48.8 degrees [1]. Note the angle of view is quoted at full-frame; remember some video modes use a cropped subset of the full frame. This is a moderately wide angle lens. If you double the focal length, you\u2019ll get half the field of view. If you get a 8mm lens that\u2019s a moderate telephoto, and a 16mm lens is definitely telephoto. I\u2019ve tried a number of cheap M12 lenses that work \u201cok\u201d but don\u2019t expect perfectly sharp images with the tiny 1.4 or 1.1 micron pixels these camera sensors use. Lower f-number lenses are \u201cfaster\u201d (let in more light) but will have more shallow depth of field and more blurry overall. 
You will see f/1.4 or lower sold for use in low light, but I have not had good images with those; I would recommend f/2.0 or above if you want decent resolution.</p>\n\n <p><a href=\"https://www.raspberrypi.org/forums/viewtopic.php?t=150344#p988445\">https://www.raspberrypi.org/forums/viewtopic.php?t=150344#p988445</a></p>\n</blockquote>\n\n<p>With that as the inspiration I bought a pack of ten M12 lens adapters from Amazon for \u00a35 and started out by creating a notch for the cable. While the 20mm spacing wasn\u2019t ideal I have found some variation in hole positions on the PCB and by using thin M2 bolts I was able to <em>force</em> them.</p>\n\n<p>I removed the lens in a rather destructive way from the front of the camera by cutting around the raised area on three sides with a craft knife. It wasn\u2019t pretty but it did the job.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/pi-camera-m12-1.jpg\"></p>\n\n<p>On the first camera I modified I went on to remove the IR filter by gently cutting it across the diagonal with side cutters. Surprisingly it popped off without too much effort leaving this.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/pi-camera-m12-2.jpg\"></p>\n\n<p>For my application, removing the IR filter was a mistake as (tungsten) lights and candles produce lots of infrared!</p>\n\n<p>I mounted the M12 adapters on 3mm plywood with short M2 bolt screwed in from the front.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/pi-camera-m12-3.jpg\"></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/pi-camera-m12-4.jpg\"></p>\n\n<p>I had an old Foscam WiFi camera which has an M12 lens marked as <em>f=2.8mm</em>. This pretty much matched the field of view I got from the camera\u2019s native lens.</p>\n\n<p>I have had good results with <em>f=8mm</em>, <em>f=15mm</em> and <em>f=25mm</em> lens as well as cheap zoom lens offering a range of <em>f=3mm</em> to <em>f=12mm</em>. 
It\u2019s curious that on Amazon a focal length of 8mm is typically sold as <em>wide angle</em> rather than telephoto! What I really notice is that the depth of field becomes increasingly narrow as the focal length increases.</p>\n\n<p>I installed Raspberry Pi OS Lite using the Pi Imager and enabled SSH before removing the SD card.</p>\n\n<p>After assembling the unit, check that the camera is connected up and enabled with <code>vcgencmd get_camera</code></p>\n\n<div><div><pre><code>supported=1 detected=1\n</code></pre></div></div>\n\n<p><code>raspivid</code> can be configured to send an h.264 stream, but it exits when the connection drops. Therefore, I have wrapped <code>raspivid</code> in a service so systemd will restart it each time.</p>\n\n<p>Create <code>/etc/systemd/system/stream.service</code> containing</p>\n\n<div><div><pre><code>[Unit]\nDescription=auto start stream\nAfter=multi-user.target\n\n[Service]\nType=simple\nExecStart=/usr/bin/raspivid -v -fps 30 -md 2 -n -ih -t 0 -l -stm -fl -o tcp://0.0.0.0:5001\nUser=pi\nWorkingDirectory=/home/pi\nRestart=always\n\n[Install]\nWantedBy=multi-user.target\n</code></pre></div></div>\n\n<p>Enable and start the service as follows:</p>\n\n<div><div><pre><code>systemctl enable stream\nservice stream start\n</code></pre></div></div>\n\n<p>You can open the stream with VLC by using the address <code>tcp/h264://192.168.1.88:5001</code> which is useful for testing.</p>\n\n<p>Finally, in OBS add a media source <code>tcp://192.168.0.88:5001</code>.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/2_8mm.png\" title=\"f=2.8mm\">\n<img alt=\"\" src=\"https://www.tunbury.org/images/8mm.png\" title=\"f=8mm\">\n<img alt=\"\" src=\"https://www.tunbury.org/images/16mm.png\" title=\"f=16mm\">\n<img alt=\"\" src=\"https://www.tunbury.org/images/22mm.png\" title=\"f=22mm\"></p>\n\n<h1>Parts list</h1>\n\n\n\n \n \n Part\n Cost\n \n \n \n \n <a 
href=\"https://www.amazon.co.uk/Raspberry-Pi-Model-Quad-Motherboard/dp/B01CD5VC92\">Pi 3B</a>\n \u00a334\n \n \n <a href=\"https://www.amazon.co.uk/gp/product/B07WCGY2QY/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1\">PoE Splitter - 2 pack</a>\n \u00a317\n \n \n <a href=\"https://www.amazon.co.uk/gp/product/B07ZZ2K7WP/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1\">5MP Camera Module - 2 pack</a>\n \u00a39\n \n \n <a href=\"https://www.amazon.co.uk/gp/product/B08FDVYC98/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1\">Zoom lens</a>\n \u00a310\n \n \n <a href=\"https://www.amazon.co.uk/gp/product/B00R1J42T8/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1\">M12 Mount - 10 pack</a>\n \u00a35\n \n \n <a href=\"https://www.amazon.co.uk/gp/product/B075QMCYZM/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1\">3mm plywood - 25 pack</a>\n \u00a324\n \n \n <a href=\"https://www.amazon.co.uk/gp/product/B003WIRFD2/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1\">SD Card</a>\n \u00a33.70\n \n \n\n\n<p>A single camera would cost \u00a362.</p>",
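The rule of thumb quoted above — doubling the focal length roughly halves the field of view — follows from the pinhole relation AoV = 2·atan(d/2f). A Python sketch under stated assumptions: the sensor width is back-computed from the v2 camera's quoted 62.2 degrees at f=3.0mm (about 3.62 mm), and the simple pinhole model ignores lens distortion:

```python
import math

# sensor width implied by the quoted v2 figures: 62.2 deg horizontal at f = 3.0 mm
SENSOR_W = 2 * 3.0 * math.tan(math.radians(62.2 / 2))  # ~3.62 mm (assumption)

def aov_deg(focal_mm: float, width_mm: float = SENSOR_W) -> float:
    """Horizontal angle of view of a pinhole-model lens, in degrees."""
    return math.degrees(2 * math.atan(width_mm / (2 * focal_mm)))

# the v1 lens (f=3.6mm) comes out close to its quoted 54 degrees,
# and each doubling of f roughly halves the angle
for f in (3.0, 3.6, 8.0, 16.0, 25.0):
    print(f"f={f}mm -> {aov_deg(f):.1f} deg")
```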
+21
mte/2021_04_28_mini-itx-as-windows-2008-server.json
···+"summary": "Unfortunately without a DVD drive and with no capability to boot from USB I\u2019m struggling to get a clean OS on my Mini ITX machine. The internal drive is IDE and I don\u2019t have any other machines with IDE around and I don\u2019t know the password for the installed OS.",+"content": "<p>Unfortunately without a DVD drive and with no capability to boot from USB I\u2019m struggling to get a clean OS on my Mini ITX machine. The internal drive is IDE and I don\u2019t have any other machines with IDE around and I don\u2019t know the password for the installed OS.</p>\n\n<p>Install Windows 2008 x86 Server (with GUI) in a VM</p>\n\n<p>Turn on Remote Desktop and turn off the firewall</p>\n\n<p>Add Windows Server role WDS and AD DS</p>\n\n<p>Set static IP address 192.168.10.10/24 DNS 127.0.0.1</p>\n\n<p>Set local administrator password to a complex password</p>\n\n<p>Run <code>dcpromo</code>, set domain to montdor.local.</p>\n\n<p>Install DHCP and follow the wizard to create a scope 192.168.10.128\u2013192.168.10.254. DNS 192.168.10.10. No router.</p>\n\n<p>Configure WDS using the wizard</p>\n\n<ul>\n <li>Do not listen on port 67</li>\n <li>Configure DHCP option 60</li>\n <li>Respond to all clients</li>\n</ul>\n\n<p>Switch to the Windows AIK for Windows 7 ISO <code>KB3AIK_EN.ISO</code> and install Windows Automated Installation Kit (to get Windows PE)</p>\n\n<p>In WDS, add the WinPE boot WIM as a boot image. The WIM is in <code>C:\\Program Files\\Windows AIK\\Tools\\PETools\\x86\\winpe.wim</code></p>\n\n<p>Copy the Windows 2008 Server Standard x86 DVD to <code>c:\\Win2K8x86</code>. 
Create a share of the same name.</p>\n\n<p>Windows 2008 Server installation requires 512MB of RAM but my computer only has 256MB and only reports 248 after the video RAM is subtracted.</p>\n\n<p>Hack the Windows setup program to make it run anyway:</p>\n\n<p>Find the file <code>WINSETUP.DLL</code> in the sources folder and using a hex editor such as <a href=\"http://mh-nexus.de/en/hxd/\">HxD</a>, search for the hex string <code>77 07 3D 78 01</code> and replace it with <code>E9 04 00 00 00</code>.</p>\n\n<p>Now Windows really did need 512MB of RAM: setup fails with error <code>0xE0000100</code> caused by insufficient memory. Therefore, create a partition and then a swap file.</p>\n\n<p>Open <code>DISKPART.EXE</code> and run the following to create a working drive:</p>\n\n<div><div><pre><code>SELECT DISK 0\nCLEAN\nCREATE PART PRIMARY\nSELECT VOLUME 0\nASSIGN\nFORMAT FS=NTFS QUICK\n</code></pre></div></div>\n\n<p>Create a paging file</p>\n\n<div><div><pre><code>wpeutil createpagefile /path:c=\\pf.sys\n</code></pre></div></div>\n\n<p>Now run Windows Setup.</p>\n\n<p>Download Sil3124 driver for Windows 7 x86. 
Copy it to a network share and mount it from the Windows 2008 Server and run:</p>\n\n<div><div><pre><code>pnputil -i -a *.inf\n</code></pre></div></div>\n\n<p>Then use DISKPART.EXE again, similar to above</p>\n\n<div><div><pre><code>SELECT DISK 1\nCREATE PART PRI\nSELECT VOLUME 1\nASSIGN\nFORMAT FS=NTFS QUICK\n</code></pre></div></div>\n\n<p>Now we need Windows Updates I suppose</p>\n\n<div><div><pre><code>cscript c:\\windows\\system32\\scregedit.wsf /au 4\nnet stop wuauserv\nnet start wuauserv\nwuauclt /detectnow\n</code></pre></div></div>\n\n<p>Enable Remote Desktop with</p>\n\n<div><div><pre><code>cscript c:\\windows\\system32\\scregedit.wsf /ar 0\n</code></pre></div></div>\n\n<p>Create a share</p>\n\n<div><div><pre><code>net share sharename=d:\\share /grant:everyone,full\n</code></pre></div></div>\n\n<p>Make it visible</p>\n\n<div><div><pre><code>netsh firewall set service fileandprint enable\n</code></pre></div></div>",
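The WINSETUP.DLL patch above is a plain byte-pattern replacement, so it can be scripted instead of done in a GUI hex editor. A Python sketch — work on a copy of the file, and note the equal-length, exactly-one-match checks, which are my own safety assumptions rather than anything the original post describes:

```python
def patch_bytes(data: bytes, find: bytes, replace: bytes) -> bytes:
    """Replace a single occurrence of `find` with an equal-length `replace`."""
    assert len(find) == len(replace), "patch must not change the file size"
    count = data.count(find)
    assert count == 1, f"expected exactly one match, found {count}"
    return data.replace(find, replace)

# the memory-check patch from the post: jump over the 512MB RAM test
FIND = bytes.fromhex("77 07 3D 78 01")
REPLACE = bytes.fromhex("E9 04 00 00 00")

# usage sketch against a copy of the file:
# data = open("WINSETUP.DLL", "rb").read()
# open("WINSETUP.patched.DLL", "wb").write(patch_bytes(data, FIND, REPLACE))
```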
+21
mte/2021_05_25_wordpress-to-jekyll-test.json
···+"summary": "Install the WordPress plugin UpdraftPlus. Create a new WordPress site and install the UpdraftPlus plugin and restore the database.",+"content": "<p>Install the WordPress plugin <em>UpdraftPlus</em>. Create a new WordPress site and install the <em>UpdraftPlus</em> plugin and restore the database.</p>\n\n<p>Use the following MySQL commands to fix the database</p>\n\n<div><div><pre><code>UPDATE wp_options SET option_value = replace(option_value, 'cccbr.org.uk', 'cccbr.tunbury.org') WHERE option_name = 'home' OR option_name = 'siteurl';\nUPDATE wp_posts SET guid = replace(guid, 'cccbr.org.uk','cccbr.tunbury.org');\nUPDATE wp_posts SET post_content = replace(post_content, 'cccbr.org.uk', 'cccbr.tunbury.org');\nUPDATE wp_postmeta SET meta_value = replace(meta_value,'cccbr.org.uk','cccbr.tunbury.org');\n</code></pre></div></div>\n\n<p>Set the user password (mainly to make it different from the original site)</p>\n\n<div><div><pre><code>UPDATE `wp_users` SET `user_pass`= MD5('yourpassword') WHERE `user_login`='melvers';\n</code></pre></div></div>\n\n<p>Install the <em>Jekyll Exporter</em> plugin, activate it and then create the export using Tools -> Export to Jekyll.</p>\n\n<p>Create a new Jekyll site by running</p>\n\n<div><div><pre><code>jekyll new c:\\cccbr\n</code></pre></div></div>\n\n<p>Extract <code>jekyll-export.zip</code> into the <code>c:\\cccbr</code> folder but don\u2019t overwrite <code>_config.yml</code></p>\n\n<div><div><pre><code>jekyll serve\n</code></pre></div></div>\n\n<p>Visit <a href=\"http://localhost:4000\">http://localhost:4000</a> to see how it looks.</p>\n\n<p>Tidy the exported Markdown files with this PowerShell loop:</p>\n\n<div><div><pre><code>$mdFiles = Get-ChildItem . 
*.md -rec\nforeach ($file in $mdFiles) {\n (Get-Content $file.PSPath) |\n Foreach-Object { $_ -replace \"&#8211;\", \"-\" } |\n Foreach-Object { $_ -replace \"&#038;\", \"&\" } |\n Foreach-Object { $_ -replace \"&#8217;\", \"&apos;\" } |\n Foreach-Object { $_ -replace \"cccbr.tunbury.org/wp-content/uploads/\", \"cccbr.org.uk/wp-content/uploads/\" } |\n Foreach-Object { $_ -replace \"cccbr.tunbury.org/\", \"/\" } |\n Foreach-Object { $_ -replace \"layout: page\", \"layout: single\" } |\n Foreach-Object { $_ -replace \"layout: post\", \"layout: single\" } |\n Set-Content $file.PSPath\n}\n</code></pre></div></div>\n\n<p>Edit <code>Gemfile</code> to use the new theme by commenting out <code>minima</code> and adding <code>minimal-mistakes</code>:</p>\n\n<div><div><pre><code># gem \"minima\", \"~> 2.5\"\ngem \"minimal-mistakes-jekyll\"\n</code></pre></div></div>\n\n<p>Run <code>bundle</code> in the folder to download the dependencies. Edit <code>_config.yml</code> and set the theme</p>\n\n<div><div><pre><code>theme: minimal-mistakes-jekyll\n</code></pre></div></div>\n\n<p>Create the top-level menu by creating <code>_data/navigation.yml</code>:</p>\n\n<div><div><pre><code>main:\n- title: \"About\"\n url: /about\n- title: \"Bells and Ringing\"\n url: /bellringing\n</code></pre></div></div>\n\n<p>Create secondary menus in the same <code>_data/navigation.yml</code> file such as:</p>\n\n<div><div><pre><code>about:\n- title: About\n children:\n - title: \"About the Council\"\n url: /about\n - title: \"Continuing CCCBR Reforms\"\n url: /about/reforms/\n - title: \"Governance\"\n url: /about/governance/\n</code></pre></div></div>\n\n<p>Then on the appropriate pages set the front matter:</p>\n\n<div><div><pre><code>sidebar:\n nav: \"about\"\ntoc: true\n</code></pre></div></div>\n\n<p>Create a custom skin by duplicating and renaming a file in <code>_sass\\minimal-mistakes\\skins</code>. 
I created <code>cccbr.scss</code> and then in <code>_config.yml</code> applied the skin like this:</p>\n\n<div><div><pre><code>theme: minimal-mistakes-jekyll\nminimal_mistakes_skin: \"cccbr\"\n</code></pre></div></div>\n\n<p>Create a repository on GitHub.</p>\n\n<div><div><pre><code>git init\ngit add .\ngit commit -m \"initial commit\"\ngit remote add origin https://github.com/mtelvers/cccbr.git\ngit push -u origin master\n</code></pre></div></div>\n\n<p>On GitHub, under the repo\u2019s Settings \\ Pages, publish the site using the master branch.</p>\n\n<p>Changes to make it work on GitHub:</p>\n\n<ol>\n <li>Update <code>Gemfile</code> and then run <code>bundle</code>.</li>\n <li>Update all the posts and pages to use the <code>single</code> template.</li>\n <li>Update <code>_config.yml</code> to set baseurl to match the Git repository name.</li>\n <li>Update <code>_config.yml</code> to change the remote theme.</li>\n</ol>\n\n<p>Remove unwanted front matter tags with this Ruby script</p>\n\n<div><div><pre><code>require \"yaml\"\n\nYAML_FRONT_MATTER_REGEXP = /\\A(---\\s*\\n.*?\\n?)^((---|\\.\\.\\.)\\s*$\\n?)/m\n\nDir.glob('**/*.md', File::FNM_DOTMATCH) do |f|\n puts f\n\n file = File.open(f)\n source = file.read\n file.close\n\n if source =~ YAML_FRONT_MATTER_REGEXP\n data, content = YAML.load($1), Regexp.last_match.post_match\n [\"id\", \"guid\",\n \"ep_tilt_migration\",\n \"classic-editor-remember\",\n \"ssb_old_counts\",\n \"ssb_total_counts\",\n \"ssb_cache_timestamp\",\n \"colormag_page_layout\",\n \"wp_featherlight_disable\",\n \"catchbox-sidebarlayout\",\n \"complete_open_graph\"].each {|x| data.delete(x)}\n\n file = File.open(f, \"w\")\n YAML.dump(data, file)\n file.puts(\"---\", content)\n file.close\n end\nend\n</code></pre></div></div>",
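The PowerShell clean-up loop in the WordPress-to-Jekyll entry above is a plain ordered search-and-replace, which is easy to port. This is a minimal Python sketch of the same idea, using the same replacement pairs; it is my own illustration, not part of the original migration, and might be useful on systems without PowerShell.

```python
# Port of the PowerShell Markdown clean-up loop: apply ordered replacements
# to every .md file under a root directory.
from pathlib import Path

REPLACEMENTS = [
    ("&#8211;", "-"),
    ("&#038;", "&"),
    ("&#8217;", "&apos;"),
    # Order matters: rewrite the uploads path before the general domain rule.
    ("cccbr.tunbury.org/wp-content/uploads/", "cccbr.org.uk/wp-content/uploads/"),
    ("cccbr.tunbury.org/", "/"),
    ("layout: page", "layout: single"),
    ("layout: post", "layout: single"),
]

def clean(text: str) -> str:
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text

def clean_tree(root: str) -> None:
    """Rewrite every Markdown file under root in place."""
    for md in Path(root).rglob("*.md"):
        md.write_text(clean(md.read_text(encoding="utf-8")), encoding="utf-8")
```

Note the ordering dependency: the uploads-path rule must run before the bare `cccbr.tunbury.org/` rule, exactly as in the PowerShell pipeline.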
+21
mte/2021_06_22_syncthing-on-openbsd.json
···+"content": "<h2>Network Installation of OpenBSD</h2>\n\n<p>Set up a machine to facilitate network installation of OpenBSD. Download the 6.9 installation ISO from the <a href=\"https://www.openbsd.org/faq/faq4.html#Download\">OpenBSD website</a> and install it in a virtual machine. I\u2019m using VMware Fusion and have a dedicated LAN port connected to the remote machine.</p>\n\n<p>Create <code>hostname.vic0</code> containing the following (rather than <code>dhcp</code>):</p>\n\n<div><div><pre><code>inet 192.168.2.1 255.255.255.0 NONE\n</code></pre></div></div>\n\n<h3>DHCPD</h3>\n\n<p>Create <code>/etc/dhcpd.conf</code> with the key attributes:</p>\n\n<ul>\n <li><code>filename</code> for the boot image name, and</li>\n <li><code>next-server</code> for the TFTP server address.</li>\n</ul>\n\n<p>I have added a host section for the specific MAC of my machine but for this one-time build process it could be a global option.</p>\n\n<div><div><pre><code>subnet 192.168.2.0 netmask 255.255.255.0 {\n option routers 192.168.2.1;\n range 192.168.2.32 192.168.2.127;\n \n host mini-itx {\n hardware ethernet 00:40:63:d5:6f:4f;\n filename \"auto_install\";\n next-server 192.168.2.1;\n option host-name \"mini-itx\";\n }\n}\n</code></pre></div></div>\n\n<h3>TFTPD</h3>\n\n<p>Create the default TFTP root folder and configuration folder</p>\n\n<div><div><pre><code>mkdir -p /tftpboot/etc\n</code></pre></div></div>\n\n<p>Download <a href=\"http://ftp.openbsd.org/pub/OpenBSD/6.9/i386/pxeboot\">pxeboot</a> and <a href=\"http://ftp.openbsd.org/pub/OpenBSD/6.9/i386/bsd.rd\">bsd.rd</a> and put them in <code>/tftpboot</code>.</p>\n\n<p>Create a symbolic link for <code>auto_install</code></p>\n\n<div><div><pre><code>ln -s pxeboot /tftpboot/auto_install\n</code></pre></div></div>\n\n<p>Create <code>/tftpboot/etc/boot.conf</code> containing the following</p>\n\n<div><div><pre><code>boot tftp:/bsd.rd\n</code></pre></div></div>\n\n<h3>HTTPD</h3>\n\n<p>Create <code>/etc/httpd.conf</code> to share the 
folder <code>/var/www/htdocs</code></p>\n\n<div><div><pre><code>#[ MACROS ]\next_ip = \"*\"\n\n# [ GLOBAL CONFIGURATION ]\n# none\n\n# [ SERVERS ]\nserver \"default\" {\n listen on $ext_ip port 80\n root \"/htdocs\"\n}\n\n# [ TYPES ]\ntypes {\n include \"/usr/share/misc/mime.types\"\n}\n</code></pre></div></div>\n\n<p>Stage the installation files on a local web server by copying them from the boot ISO downloaded at the start:</p>\n\n<div><div><pre><code>mount /dev/cd0a /mnt/\nmkdir -p /var/www/htdocs/pub/OpenBSD\ncp -rv /mnt/6.9/ /var/www/htdocs/pub/OpenBSD/6.9\nls -l /var/www/htdocs/pub/OpenBSD/6.9 > /var/www/htdocs/pub/OpenBSD/6.9/index.txt\n</code></pre></div></div>\n\n<p>Create <code>/var/www/htdocs/install.conf</code> containing the following automatic configuration answer file</p>\n\n<div><div><pre><code>Password for root = Password\nSetup a user = user\nPassword for user = Password\nPublic ssh key for user = ssh-rsa AAAA...ZV user@Marks-Mac-mini.local\nWhich disk is the root disk = wd0\nWhat timezone are you in = Europe/London\nUnable to connect using https. 
Use http instead = yes\nLocation of sets = http\nHTTP Server = 192.168.2.1\nSet name(s) = -all bsd* base* etc* man* site* comp*\nContinue without verification = yes\n</code></pre></div></div>\n\n<p>Enable the services using <code>rcctl</code>, which edits the configuration file <code>rc.conf.local</code> to add the appropriate <code>service_flags=\"\"</code> lines</p>\n\n<div><div><pre><code>rcctl enable dhcpd\nrcctl enable tftpd\nrcctl enable httpd\n</code></pre></div></div>\n\n<p>The remote system should now boot from the network and install OpenBSD hands-free!</p>\n\n<p>After the new system boots, <code>su</code> and then overwrite <code>/etc/installurl</code> with a standard value</p>\n\n<div><div><pre><code>echo https://ftp.openbsd.org/pub/OpenBSD > /etc/installurl\n</code></pre></div></div>\n\n<h2>RAID5 Volume</h2>\n\n<p>Create a RAID5 volume over the four attached disks</p>\n\n<div><div><pre><code>for a in sd0 sd1 sd2 sd3 ; do fdisk -iy $a ; done\nfor a in sd0 sd1 sd2 sd3 ; do printf \"a\\n\\n\\n\\nRAID\\nw\\nq\\n\" | disklabel -E $a ; done\nbioctl -c 5 -l /dev/sd0a,/dev/sd1a,/dev/sd2a,/dev/sd3a softraid0\n</code></pre></div></div>\n\n<p>Partition and format the volume</p>\n\n<div><div><pre><code>fdisk -iy sd4\nprintf \"a\\n\\n\\n\\n4.2BSD\\nw\\nq\\n\" | disklabel -E sd4\nnewfs /dev/rsd4a \n</code></pre></div></div>\n\n<h2>Syncthing</h2>\n\n<p>Install <code>syncthing</code> using</p>\n\n<div><div><pre><code>pkg_add syncthing\n</code></pre></div></div>\n\n<p>Edit <code>/etc/login.conf</code> and append:</p>\n\n<div><div><pre><code>syncthing:\\\n :openfiles-max=60000:\\\n :tc=daemon:\n</code></pre></div></div>\n\n<p>Rebuild the login capability database and raise the kernel open-file limit</p>\n\n<div><div><pre><code>cap_mkdb /etc/login.conf\necho \"kern.maxfiles=80000\" >> /etc/sysctl.conf\n</code></pre></div></div>\n\n<p>Edit <code>/etc/rc.d/syncthing</code> and update the <code>daemon_flags</code>:</p>\n\n<div><div><pre><code>daemon_flags=\"-no-browser -gui-address=0.0.0.0:8384\"\n</code></pre></div></div>\n\n<p>Edit 
<code>/etc/fstab</code> to mount the drive, then set the ownership</p>\n\n<div><div><pre><code>/dev/sd4a /var/syncthing ffs rw,softdep 0 0\nchown -R _syncthing:_syncthing /var/syncthing\n</code></pre></div></div>\n\n<p>Enable and start syncthing:</p>\n\n<div><div><pre><code>rcctl enable syncthing\nrcctl start syncthing\n</code></pre></div></div>",
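The `dhcpd.conf` host section in the OpenBSD netboot entry above is the piece you would duplicate per machine. As a sketch (my own helper, not part of the post's workflow; names and defaults are hypothetical), rendering the stanza from a hostname/MAC pair keeps the semicolons and quoting consistent:

```python
# Render a dhcpd.conf host block like the one used for PXE auto-installation.
def host_stanza(name: str, mac: str, filename: str = "auto_install",
                next_server: str = "192.168.2.1") -> str:
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac};\n"     # MAC of the machine to netboot
        f"    filename \"{filename}\";\n"     # boot image served by tftpd
        f"    next-server {next_server};\n"   # TFTP server address
        f"    option host-name \"{name}\";\n"
        f"}}\n"
    )
```

Every statement inside the block must end with a semicolon or `dhcpd` will reject the configuration at startup.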
+21
mte/2021_07_14_jitsis.json
···+"summary": "I need to remotely control OBS during a live stream. This is quite simply achieved via VNC but I need to see and hear what\u2019s going on at the same time. VNC doesn\u2019t support audio on the free license and watching the YouTube stream is out of the question as it\u2019s nearly 30 seconds behind real time.",+"content": "<p>I need to remotely control OBS during a live stream. This is quite simply achieved via VNC but I need to see and hear what\u2019s going on at the same time. VNC doesn\u2019t support audio on the free license and watching the YouTube stream is out of the question as it\u2019s nearly 30 seconds behind real time.</p>\n\n<p>As the computer has a USB web camera and microphone attached I thought of a private LAN-based video conferencing solution. A quick Internet search found a <a href=\"https://www.reddit.com/r/sysadmin/comments/gmray4/recommendation_for_free_lanonly_video/\">post on Reddit</a> talking about Jitsi.</p>\n\n<p>After installing an Ubuntu 20.04 server VM, I followed the Jitsi <a href=\"https://jitsi.github.io/handbook/docs/devops-guide/devops-guide-quickstart\">Self-Hosting Guide</a> which takes just a few minutes. Since it was a private LAN implementation I skipped the optional FQDN section of the instructions and used the self-signed certificate.</p>\n\n<p>Connecting to the DHCP-assigned address over https brought the expected certificate warnings but I was able to create and join a room. The camera and microphone did not start. Every 30 seconds or so this message appeared about reconnecting:</p>\n\n<p><img alt=\"Jitsi Disconnected\" src=\"https://www.tunbury.org/images/jitsi-disconnected.png\"></p>\n\n<p>The fix to this was to use a hostname, not an IP address. On Windows machines edit <code>C:\\Windows\\System32\\Drivers\\etc\\hosts</code> and on a Mac edit <code>/etc/hosts</code>. 
In both cases I added the DHCP-issued IP address and hostname of the Ubuntu server:</p>\n\n<div><div><pre><code>192.168.1.76\tjitsi\n</code></pre></div></div>\n\n<p>Connecting to Jitsi using <a href=\"https://jitsi\">https://jitsi</a> and skipping past the certificate warnings brought me to a working implementation. Certainly impressive and easy to set up!</p>",
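The hosts-file fix above is easy to fumble when repeated on several machines. A tiny idempotent helper (my own sketch, nothing to do with Jitsi itself) appends the `IP name` line only if the name is not already mapped:

```python
# Append "ip<TAB>name" to hosts-file text unless the hostname is already present.
def ensure_hosts_entry(hosts_text: str, ip: str, name: str) -> str:
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        if len(fields) >= 2 and name in fields[1:]:
            return hosts_text  # already mapped, leave the file alone
    sep = "" if hosts_text.endswith("\n") or not hosts_text else "\n"
    return hosts_text + sep + f"{ip}\t{name}\n"
```

Running it twice returns the text unchanged the second time, so it is safe to script against `/etc/hosts` (with a backup, and elevated rights on Windows).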
+21
mte/2021_07_27_audio-stream.json
···+"summary": "Now that singing has returned to churches, I need to add an additional microphone to pick up the choir. I\u2019d like this to be completely separate from the Church PA system to avoid playing this sound out through the speakers. A Raspberry PI Zero W with a USB sound card looks to be a good option to capture the audio and stream it to OBS.",+"content": "<p>Now that singing has returned to churches, I need to add an additional microphone to pick up the choir. I\u2019d like this to be completely separate from the Church PA system to avoid playing this sound out through the speakers. A Raspberry PI Zero W with a USB sound card looks to be a good option to capture the audio and stream it to OBS.</p>\n\n<p>Run <code>arecord -l</code> to get a list of available mixer devices. In my case my USB audio device is #2.</p>\n\n<p>Set the mixer level for the microphone:</p>\n\n<div><div><pre><code>amixer -c 2 -q set 'Mic',0 100%\n</code></pre></div></div>\n\n<p>Install <code>ffmpeg</code> which pulls down around 750MB on a lite installation.</p>\n\n<div><div><pre><code>apt install ffmpeg\n</code></pre></div></div>\n\n<p>Run <code>ffmpeg</code> to create the stream specifying the mixer device name as the input <code>-i</code></p>\n\n<div><div><pre><code>ffmpeg -ar 44100 -ac 1 -f alsa -i plughw:2,0 -f wav -listen 1 tcp://0.0.0.0:5002\n</code></pre></div></div>\n\n<p>You can play this stream using VideoLAN\u2019s VLC using <em>Open Network Stream</em> <code>tcp/wav://192.168.1.104:5002</code> where 192.168.1.104 is the IP address of the PI.</p>\n\n<p>In OBS create a new Media Source and set the network buffer to zero (to avoid excessive delay) and turn off <em>Restart playback when source becomes active</em> which keeps the stream alive even when it\u2019s not the active scene:</p>\n\n<div><div><pre><code>tcp://192.168.1.104:5002\n</code></pre></div></div>\n\n<p>Wrap the ffmpeg command as a service by creating <code>/etc/systemd/system/stream.service</code> 
containing</p>\n\n<div><div><pre><code>[Unit]\nDescription=auto start stream\nAfter=multi-user.target\n\n[Service]\nType=simple\nExecStartPre=/usr/bin/amixer -c 2 -q set 'Mic',0 100%\nExecStart=/usr/bin/ffmpeg -ar 44100 -ac 1 -f alsa -i plughw:2,0 -f wav -listen 1 tcp://0.0.0.0:5002\nUser=pi\nWorkingDirectory=/home/pi\nRestart=always\n\n[Install]\nWantedBy=multi-user.target\n</code></pre></div></div>\n\n<p>Enable and start the service as follows:</p>\n\n<div><div><pre><code>systemctl enable stream\nservice stream start\n</code></pre></div></div>\n\n<h2>Practical Issues</h2>\n\n<p>After successfully testing a Raspberry PI Zero W with a USB audio dongle over a 30m WiFi link in an empty church, I decided to use it as a secondary device in a live broadcast. This was immediately scuppered on the day as I was unable to maintain the WiFi link. I put this down to the interference created by the in-house PA system, induction loop, and the mobile phones of the congregation.</p>\n\n<p>I added a UFL connector to the Pi Zero W as described by <a href=\"https://www.briandorey.com/post/raspberry-pi-zero-w-external-antenna-mod\">Brian Dorey</a>. 
Using this with a 5dB D-Link antenna did marginally increase the antenna signal level and quality of most networks but not sufficiently to make the difference.</p>\n\n<h3>Internal antenna</h3>\n\n<div><div><pre><code>pi@raspberrypi:~ $ sudo iwlist wlan0 scan | grep 'Cell\\|Signal' | sed '$!N;s/\\n/ /'\n Cell 01 - Address: 6C:xx:xx:xx:xx:10 Quality=69/70 Signal level=-41 dBm \n Cell 02 - Address: 5C:xx:xx:xx:xx:9E Quality=26/70 Signal level=-84 dBm \n Cell 03 - Address: 5E:xx:xx:xx:xx:9F Quality=27/70 Signal level=-83 dBm \n Cell 04 - Address: 9C:xx:xx:xx:xx:62 Quality=35/70 Signal level=-75 dBm \n Cell 05 - Address: 78:xx:xx:xx:xx:8E Quality=21/70 Signal level=-89 dBm \n Cell 06 - Address: 9C:xx:xx:xx:xx:72 Quality=37/70 Signal level=-73 dBm \n Cell 07 - Address: 80:xx:xx:xx:xx:6A Quality=17/70 Signal level=-93 dBm \n</code></pre></div></div>\n\n<h3>External antenna</h3>\n\n<div><div><pre><code>pi@raspberrypi:~ $ sudo iwlist wlan0 scan | grep 'Cell\\|Signal' | sed '$!N;s/\\n/ /'\n Cell 01 - Address: 6C:xx:xx:xx:xx:10 Quality=70/70 Signal level=-29 dBm \n Cell 02 - Address: 5C:xx:xx:xx:xx:9E Quality=22/70 Signal level=-88 dBm \n Cell 03 - Address: 5E:xx:xx:xx:xx:9F Quality=23/70 Signal level=-87 dBm \n Cell 04 - Address: 9C:xx:xx:xx:xx:62 Quality=41/70 Signal level=-69 dBm \n Cell 05 - Address: 78:xx:xx:xx:xx:8E Quality=30/70 Signal level=-80 dBm \n Cell 06 - Address: 9C:xx:xx:xx:xx:72 Quality=41/70 Signal level=-69 dBm \n Cell 07 - Address: 80:xx:xx:xx:xx:6A Quality=24/70 Signal level=-86 dBm \n</code></pre></div></div>\n\n<p>Switching to a Raspberry PI 3 gave easy access to an Ethernet port without resorting to a USB hub. 
Following that there were no further connection issues!</p>\n\n<p><code>FFMPEG</code> can also create an MP3 stream rather than a WAV stream by simply changing the output format <code>-f mp3</code></p>\n\n<div><div><pre><code>/usr/bin/ffmpeg -ar 44100 -ac 1 -f alsa -i plughw:2,0 -f mp3 -listen 1 tcp://0.0.0.0:5002\n</code></pre></div></div>\n\n<p>The Raspberry PI 3 didn\u2019t really have sufficient processing capacity to keep up with the MP3 encoding. Switching to MP2, <code>-f mp2</code>, reduced the processor requirement significantly with no noticeable change in quality.</p>",
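The ffmpeg invocation in the entry above changes only in the card number, output format and port. If you script the Pi (for example to try WAV, MP3 and MP2 in turn), assembling the command as an argument list avoids shell-quoting mistakes. This is a sketch under the same assumptions as the post (ALSA capture device `plughw:<card>,0`, mono at 44100 Hz); the function name is my own:

```python
# Build the ffmpeg streaming command as an argv list for subprocess.run.
def ffmpeg_stream_cmd(card: int, port: int, fmt: str = "wav",
                      rate: int = 44100, channels: int = 1) -> list[str]:
    return [
        "ffmpeg",
        "-ar", str(rate),      # sample rate
        "-ac", str(channels),  # mono capture
        "-f", "alsa", "-i", f"plughw:{card},0",
        "-f", fmt,             # wav, mp3 or mp2 as discussed above
        "-listen", "1", f"tcp://0.0.0.0:{port}",
    ]
```

For example, `subprocess.run(ffmpeg_stream_cmd(2, 5002, "mp2"))` would reproduce the lighter-weight MP2 variant from the post.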
+21
mte/2021_08_16_ratchet-adapter.json
···+"summary": "I want to electrically drive this ratchet mechanism to avoid the manual labour of turning it by hand. I found a motor with a 1600:1 gearbox on eBay (shipping from China of course) which looks perfect; however, it has a 10mm diameter keyed output shaft which doesn\u2019t nicely couple to my 3/4\u201d square ratchet shaft.",+"content": "<p>I want to electrically drive this ratchet mechanism to avoid the manual labour of turning it by hand. I found a motor with a 1600:1 gearbox on eBay (shipping from China of course) which looks perfect; however, it has a 10mm diameter keyed output shaft which doesn\u2019t nicely couple to my 3/4\u201d square ratchet shaft.</p>\n\n<p><img alt=\"Ratchet with pipe\" src=\"https://www.tunbury.org/images/ratchet-with-pipe.png\"></p>\n\n<p>From the photo it is clear that a 1\u201d steel tube fits reasonably well over the shaft. A wooden plug and a little bit of brute force provided a flat surface which was pre-drilled and a flange screwed on.</p>\n\n<p><img alt=\"Wooden block version\" src=\"https://www.tunbury.org/images/wooden-block.png\"></p>\n\n<p>This worked fairly well except that the grub screw on the flange was insufficient to withstand the forces required. Therefore a keyway was cut into the flange to prevent slipping.</p>\n\n<p><img alt=\"Flange with keyway\" src=\"https://www.tunbury.org/images/flang-key-1.png\"></p>\n\n<p>And a key was made to fit.</p>\n\n<p><img alt=\"Flange with key\" src=\"https://www.tunbury.org/images/flang-key-2.png\"></p>\n\n<p>This worked very well, but unfortunately about two years later things took a nasty turn. One of the screws snapped and others were about to pull out.</p>\n\n<p><img alt=\"Wear and tear\" src=\"https://www.tunbury.org/images/wear-and-tear.png\"></p>\n\n<p>Taking the 1\u201d tube and turning it sideways gave a metal surface onto which the flange could be bolted. 
Cutting a hole in the bottom side of the tube would accommodate the 3/4\u201d ratchet shaft.</p>\n\n<p><img alt=\"Pipe with holes and cutout\" src=\"https://www.tunbury.org/images/ratchet-connector-with-cutout.png\"></p>\n\n<p>And with the flange in place it looks ready for use.</p>\n\n<p><img alt=\"Flange in place\" src=\"https://www.tunbury.org/images/ratchet-connector-flang.png\"></p>\n\n<p>Hopefully this will last a little longer this time.</p>\n\n<p><img alt=\"Ready for operation\" src=\"https://www.tunbury.org/images/in-operation.png\"></p>",
+20
mte/2021_08_29_arduino-gas-sensor.json
···+"summary": "With the current emphasis on ventilation to reduce the risks associated with inhaled droplets, I have put together a simple gas sensor to record concentrations over time. The output is a CSV file which can be graphed in Excel.",+"content": "<p>With the current emphasis on ventilation to reduce the risks associated with inhaled droplets, I have put together a simple gas sensor to record concentrations over time. The output is a <code>CSV</code> file which can be graphed in Excel.</p>\n\n<p>I have used an Arduino Nano for this project, which imposed some serious memory constraints on the coding, particularly as I needed libraries for the real time clock, SD card and OLED display.</p>\n\n<p>The modules used are:</p>\n<ul>\n <li><a href=\"https://www.amazon.co.uk/dp/B072BMYZ18/ref=cm_sw_em_r_mt_dp_dl_WPWV0XM72DEW1A4HBDGE?_encoding=UTF8&psc=1\">Arduino Nano</a></li>\n <li><a href=\"https://www.amazon.co.uk/dp/B07BRFL7V7/ref=cm_sw_em_r_mt_dp_K5YWV6VZJJRT1D4WF9VJ?_encoding=UTF8&psc=1\">DS3231 Real time clock</a></li>\n <li><a href=\"https://www.amazon.co.uk/dp/B01L9GC470/ref=cm_sw_em_r_mt_dp_QQ8BPJQJP4G62QVRSNS3\">SSD1306 OLED display</a></li>\n <li><a href=\"https://www.amazon.co.uk/dp/B077MB17JB/ref=cm_sw_em_r_mt_dp_WYZQY0ZZKJRPV83WH8R3\">SD card reader</a></li>\n <li><a href=\"https://www.amazon.co.uk/dp/B07CYYB82F/ref=cm_sw_em_r_mt_dp_9S4XZ9QD8NBH1V6M7HV5\">Gas sensor</a></li>\n</ul>\n\n<h2>Hardware Connections</h2>\n\n<p>I used a veroboard to assemble the circuit as follows</p>\n<ol>\n <li>Scatter the modules around the board and solder all VCC and GND pins</li>\n <li>On the Arduino Nano, pins A4 and A5 are used for the Inter-Integrated Circuit (I2C) bus\n <ul>\n <li>Connect SDA (A4 on Nano) to the display and clock module\u2019s SDA pin</li>\n <li>Connect SCL (A5 on Nano) to the display and clock module\u2019s SCL pin</li>\n </ul>\n </li>\n</ol>\n\n<blockquote>\n <p>At this point, the clock and display module can be tested and the time set on the 
clock.</p>\n</blockquote>\n\n<ol start=\"3\">\n <li>Connect the A0 output from the gas sensor to the A0 pin on the Arduino</li>\n</ol>\n\n<blockquote>\n <p>Reading from A0 returns an integer between 0 and 1023 representing a gas concentration between 200 and 10000 ppm</p>\n</blockquote>\n\n<ol start=\"4\">\n <li>The SD card uses the Serial Peripheral Interface (SPI) and requires four connections\n <ul>\n <li>Nano D10 to CS on the SD card module</li>\n <li>Nano D11 to MOSI on the SD card module</li>\n <li>Nano D12 to MISO on the SD card module</li>\n <li>Nano D13 to SCK on the SD card module</li>\n </ul>\n </li>\n</ol>\n\n<p>With the wiring complete, load the Arduino sketch from my <a href=\"https://github.com/mtelvers/Arduino-MQ2/blob/113a2348ce65966b738dc55d9ddace36824ec49f/mq2.ino\">GitHub page</a>.</p>\n\n<h2>Software Overview</h2>\n\n<p>After the basic library initialization, the code creates two 64-element arrays to store the samples taken each second and the average of those samples calculated each minute. These arrays will hold the latest sample in the first position, so before a new value is added all the other values are shifted down by one. 
There are certainly more efficient ways of handling this, but with a small number of values this simple approach is workable.</p>\n\n<div><div><pre><code>#define SAMPLES 64\nuint16_t historySeconds[SAMPLES];\nuint16_t historyMinutes[SAMPLES];\n</code></pre></div></div>\n\n<p>The <em>main</em> loop of the program remembers the number of seconds on the clock in the variable <code>lastS</code> and waits for it to change, thus running the inner code once per second:</p>\n\n<div><div><pre><code>int lastS = -1;\n\nvoid loop(void) {\n DateTime dt = RTClib::now();\n\n if (lastS != dt.second()) {\n lastS = dt.second();\n\n // Inner code here runs once each second\n\n }\n delay(250);\n}\n</code></pre></div></div>\n\n<p>The inner code clears the display,</p>\n\n<div><div><pre><code>u8x8.clear();\nu8x8.setCursor(0, 0);\n</code></pre></div></div>\n\n<p>and then writes the date</p>\n\n<div><div><pre><code>toString(tmp, dt.year() - 2000, dt.month(), dt.day(), '-');\nu8x8.println(tmp);\n</code></pre></div></div>\n\n<p>If the time has just rolled over to a new minute (i.e. the number of seconds is 0), take an average of the <em>seconds</em> samples and store that as the minute average. 
Finally, open a file named with the current date.</p>\n\n<div><div><pre><code>if (dt.second() == 0) {\n unsigned long total = 0;\n for (int h = 0; h < SAMPLES; h++)\n total += historySeconds[h];\n memmove(historyMinutes + 1, historyMinutes, (SAMPLES - 1) * sizeof(uint16_t));\n historyMinutes[0] = total / SAMPLES;\n strcat(tmp, \".csv\");\n txtFile = SD.open(tmp, FILE_WRITE);\n}\n</code></pre></div></div>\n\n<p>Read the next gas value and store it</p>\n\n<div><div><pre><code>uint16_t gasVal = analogRead(0);\nmemmove(historySeconds + 1, historySeconds, (SAMPLES - 1) * sizeof(uint16_t));\nhistorySeconds[0] = gasVal;\n</code></pre></div></div>\n\n<p>Display the current time</p>\n\n<div><div><pre><code>toString(tmp, dt.hour(), dt.minute(), dt.second(), ':');\nu8x8.println(tmp);\n</code></pre></div></div>\n\n<p>If there\u2019s a file open, write the time value to the file</p>\n\n<div><div><pre><code>if (txtFile) {\n strcat(tmp, \",\");\n txtFile.print(tmp);\n}\n</code></pre></div></div>\n\n<p>Display the gas value</p>\n\n<div><div><pre><code>itoa(gasVal, tmp, 10);\nu8x8.println(tmp);\n</code></pre></div></div>\n\n<p>And similarly, if there is a file open, write the current value to the file and close it</p>\n\n<div><div><pre><code>if (txtFile) {\n txtFile.println(tmp);\n txtFile.close();\n}\n</code></pre></div></div>\n\n<p>Lastly, draw two graphs of the current samples</p>\n\n<div><div><pre><code>drawGraph(8, 3, historySeconds);\ndrawGraph(8, 7, historyMinutes);\n</code></pre></div></div>\n\n<p>The graphs were tricky to draw as the slimmed-down U8x8 version of the <a href=\"https://github.com/olikraus/u8g2\">U8g2</a> library doesn\u2019t provide any drawing functions. However, you can create and display a custom font glyph. 
This mess of nested loops creates thirty-two 8 by 8 pixel glyphs to display a bar graph of 64 values with a maximum <em>y</em> value of 32.</p>\n\n<div><div><pre><code>void drawGraph(uint8_t col, uint8_t row, uint16_t *values) {\n uint8_t tmp[8];\n for (uint8_t r = 0; r < 4; r++) {\n for (uint8_t h = 0; h < SAMPLES; h += 8) {\n for (uint8_t i = 0; i < 8; i++) {\n int x = values[SAMPLES - h - 1 - i] / 16;\n x -= 8 * r;\n tmp[i] = 0;\n for (uint8_t b = 0; b < 8 && x > 0; b++, x--) {\n if (x) {\n tmp[i] |= (1 << (7 - b));\n }\n }\n }\n u8x8.drawTile(col + h / 8, row - r, 1, tmp);\n }\n }\n}\n</code></pre></div></div>\n\n<p>The graph below shows the recording during morning ringing and during the quarter peal in the afternoon (plus some messing around blowing directly into the sensor at the end). Windows open as usual!</p>\n\n<p><img alt=\"Graph\" src=\"https://www.tunbury.org/images/sample-values-recorded.png\"></p>",
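The sampling logic in the gas-sensor entry above is easy to check outside the Arduino. This Python sketch mirrors the newest-sample-first shift and the per-minute averaging; the linear ADC-to-ppm mapping is my own assumption based on the quoted 0–1023 reading and 200–10000 ppm ranges, not something the sketch on GitHub necessarily does.

```python
# Model of the rolling sample history from the Arduino sketch above.
SAMPLES = 64

def adc_to_ppm(raw: int) -> float:
    """Assumed linear mapping of a 10-bit ADC reading to 200-10000 ppm."""
    return 200 + (10000 - 200) * raw / 1023

def push(history: list[int], value: int) -> None:
    history.insert(0, value)   # newest sample lives in the first position
    del history[SAMPLES:]      # keep at most SAMPLES entries

def minute_average(history: list[int]) -> int:
    # Integer average, matching the Arduino's total / SAMPLES
    return sum(history) // SAMPLES
```

Note the averaging uses integer division, like the `total / SAMPLES` on the `unsigned long` in the C code, so it truncates rather than rounds.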
+21
mte/2021_09_04_foot-operated-timer.json
···+"summary": "At the end of a quarter peal there is always the question of how long it took and whether anyone really noted the start time. Mike proposed a foot-operated timer.",+"content": "<p>At the end of a quarter peal there is always the question of how long it took and whether anyone really noted the start time. Mike proposed a foot-operated timer.</p>\n\n<p>I wanted the display to be large enough that it could be seen while standing, and I chose this <a href=\"https://www.amazon.co.uk/gp/product/B08BC8JY8T/\">MAX7219 dot matrix display from Amazon</a>. This turned out to be a bit of a bad purchase, but more on that later.</p>\n\n<p>Using <a href=\"https://www.festi.info/boxes.py/\">boxes.py</a> to create a basic box that was just large enough to accommodate the display, battery, on/off switch and foot switch, I modified the design in Adobe Illustrator to shorten the top and add in a <em>shelf</em> for the display to sit on.</p>\n\n<p><img alt=\"net\" src=\"https://www.tunbury.org/images/foot-operated-timer-net.png\"></p>\n\n<p>This was cut on the laser cutter.</p>\n\n<p><img alt=\"net\" src=\"https://www.tunbury.org/images/foot-operated-timer-laser-cutting.jpg\"></p>\n\n<p>When assembling the electronics it became apparent that it would have been better to have a slightly taller box, but rather than waste the materials I decided to mount the Arduino upside down thereby fitting in a height of 12mm.</p>\n\n<p><img alt=\"Arduino\" src=\"https://www.tunbury.org/images/foot-operated-timer-arduino.jpg\"></p>\n\n<p>The DS3231 real time clock module was modified by bending the pins to fit in with the vero board spacing. 
Ultimately the battery holder was also removed to save space.</p>\n\n<p><img alt=\"DS3231\" src=\"https://www.tunbury.org/images/foot-operated-timer-clock-module.jpg\"></p>\n\n<p>The vero board was drilled to cut the tracks.</p>\n\n<p><img alt=\"Vero Board\" src=\"https://www.tunbury.org/images/foot-operated-timer-vero-board.jpg\"></p>\n\n<p><img alt=\"Vero Board\" src=\"https://www.tunbury.org/images/foot-operated-timer-assembly.jpg\"></p>\n\n<p>After the initial assembly, the unit was tested on battery for the first time. This showed that it didn\u2019t actually run on batteries. The code just crashed randomly after the display was initialised. Reading this <a href=\"https://arduinoplusplus.wordpress.com/2015/09/12/max7219-and-led-matrix-power-requirements/\">post</a> online, I found the problem with cheap display units!</p>\n\n<blockquote>\n <p>Most of the cheap generic modules have very low values for RSET, which would significantly increase the power/current required by the module. This seems to be 10k\u03a9 for the eBay specials, for a segment current exceeding 40mA, the specified minimum value for RSET in Table 11 being 11.8k\u03a9 for VLED = 2V.</p>\n</blockquote>\n\n<p>The full data sheet is available from <a href=\"https://datasheets.maximintegrated.com/en/ds/MAX7219-MAX7221.pdf\">Maxim</a></p>\n\n<p>I had some 100K\u03a9 surface mount resistors in 0603 format left over from another project. These were smaller than the 0805 format resistors used but they were relatively easy to change. Fortunately these fixed the problem.</p>\n\n<p>As an afterthought, a voltage divider was added to pin A0 to measure the battery voltage.</p>\n\n<p><img alt=\"Vero Board\" src=\"https://www.tunbury.org/images/foot-operated-timer-voltage-divider.jpg\"></p>\n\n<p>I wired the I2C bus from the Arduino to the DS3231 and the square wave output from the DS3231 to pin 2 on the Arduino. Pin 3 was connected to the push button. 
On the Arduino Nano only pins 2 and 3 can be used for interrupts. This configuration gave lots of options when it came to the code which wasn\u2019t actually written yet!</p>\n\n<p><img alt=\"Electronics\" src=\"https://www.tunbury.org/images/foot-operated-timer-electronics.jpg\"></p>\n\n<p>Assembling the rest of the box was straightforward although a bit fiddly.</p>\n\n<p><img alt=\"Finished project\" src=\"https://www.tunbury.org/images/foot-operated-timer-off.jpg\"></p>\n\n<p>The code is available on <a href=\"https://github.com/mtelvers/foot-timer\">GitHub</a></p>\n\n<p><img alt=\"Finished project running\" src=\"https://www.tunbury.org/images/foot-operated-timer.jpg\"></p>",
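The voltage divider on pin A0 in the foot-timer entry above implies a simple readback calculation. This sketch shows the arithmetic; the resistor values `r1`/`r2` and the 5V reference are hypothetical, since the post doesn't state them:

```python
# Convert a 10-bit ADC reading on A0 back to the battery voltage,
# undoing an r1/r2 voltage divider. r1, r2 and vref are assumed values.
def battery_voltage(adc: int, r1: float = 10_000, r2: float = 10_000,
                    vref: float = 5.0) -> float:
    v_a0 = adc * vref / 1023        # voltage actually seen at the pin
    return v_a0 * (r1 + r2) / r2    # undo the divider ratio
```

With equal resistors the pin sees half the battery voltage, so a full-scale reading of 1023 corresponds to 10V at the battery, safely above two 9V-range supplies while keeping the pin below the 5V limit.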
+21
mte/2023_08_08_3d-printed-train.json
+21
mte/2023_08_08_3d-printed-train.json
···+"summary": "Creating a new OO train body drawn from scratch in Fusion 360 to mimic the original damaged version.",+"content": "<p>Creating a new OO train body drawn from scratch in Fusion 360 to mimic\nthe original damaged version.</p>\n\n<h1>Early versions</h1>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/IMG_1919.jpg\">\n<img alt=\"\" src=\"https://www.tunbury.org/images/IMG_1918.jpg\"></p>\n\n<h1>Printed with tree support</h1>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/IMG_1917.jpg\"></p>\n\n<h1>Finished</h1>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/IMG_1920.jpg\"></p>",
+21
mte/2025_01_18_arduino-pwm-train-controller.json
+21
mte/2025_01_18_arduino-pwm-train-controller.json
···+"content": "<h1>Circuit</h1>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/train-controller-diagram.png\"></p>\n\n<h1>Case</h1>\n\n<p>3D printable STL files are available for download: <a href=\"https://www.tunbury.org/images/train-controller.stl\">STL files</a></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/train-controller-fusion-360.png\"></p>\n\n<h1>Arduino Code</h1>\n\n<div><div><pre><code>/*\n * Arduino Nano PWM Dual Train Controller\n * This sketch reads values from two potentiometers connected to A0 and A1\n * and uses these values to control the speed and direction of a motor via\n * an L298N motor driver. The motor speed is controlled using PWM signals\n * on pins D5 and D10, and the direction is controlled using digital signals\n * on pins D6, D7, D8, and D9.\n */\n\n// Pin definitions\nconst int potLeftPin = A0;\nconst int potRightPin = A1;\nconst int enaPin = 10;\nconst int in1Pin = 9;\nconst int in2Pin = 8;\nconst int in3Pin = 7;\nconst int in4Pin = 6;\nconst int enbPin = 5;\n\nvoid setup() {\n // Initialize serial communication\n Serial.begin(9600);\n\n // Set motor control pins as outputs\n pinMode(enbPin, OUTPUT);\n pinMode(enaPin, OUTPUT);\n pinMode(in1Pin, OUTPUT);\n pinMode(in2Pin, OUTPUT);\n pinMode(in3Pin, OUTPUT);\n pinMode(in4Pin, OUTPUT);\n}\n\nvoid loop() {\n // Read potentiometer values\n int potLeft = analogRead(potLeftPin);\n int potRight = analogRead(potRightPin);\n\n // Map potentiometer values to PWM range\n int pwmLeft = pow(potLeft - 512, 2) / 1024;\n int pwmRight = pow(potRight - 512, 2) / 1024;\n\n // Control motor speed and direction\n analogWrite(enaPin, pwmLeft);\n analogWrite(enbPin, pwmRight);\n\n // Set motor direction based on potentiometer values\n if (potLeft < 512) {\n digitalWrite(in1Pin, LOW);\n digitalWrite(in2Pin, HIGH);\n } else {\n digitalWrite(in1Pin, HIGH);\n digitalWrite(in2Pin, LOW);\n }\n\n if (potRight < 512) {\n digitalWrite(in3Pin, LOW);\n digitalWrite(in4Pin, HIGH);\n } else 
{\n digitalWrite(in3Pin, HIGH);\n digitalWrite(in4Pin, LOW);\n }\n\n // Print values to serial monitor for debugging\n Serial.print(\"potLeft: \");\n Serial.print(potLeft);\n Serial.print(\" PWMLeft: \");\n Serial.print(pwmLeft);\n Serial.print(\" potRight: \");\n Serial.print(potRight);\n Serial.print(\" PWMRight: \");\n Serial.println(pwmRight);\n\n // Small delay to stabilize readings\n delay(100);\n}\n</code></pre></div></div>",
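The interesting part of the sketch above is the speed mapping: squaring the centred pot reading gives zero at mid-travel and full speed at either end, with direction taken from which side of centre the pot sits. Note that the extremes evaluate to 512²/1024 = 256, one past analogWrite's 0–255 duty range, so a clamp is prudent. A sketch of the mapping in OCaml (the clamp is my addition, not in the original sketch):

```ocaml
(* The controller maps a centred 10-bit pot reading (0..1023) to a PWM
   duty: duty = (pot - 512)^2 / 1024, so mid-travel is stop and either
   extreme is full speed. The extremes give 512^2/1024 = 256, one past
   the 0..255 PWM range, hence the clamp. *)
let pwm_of_pot pot =
  let d = (pot - 512) * (pot - 512) / 1024 in
  min d 255

(* Direction mirrors the sketch: below centre is reverse. *)
let reverse_of_pot pot = pot < 512

let () =
  List.iter
    (fun p -> Printf.printf "pot %4d -> pwm %3d reverse %b\n"
        p (pwm_of_pot p) (reverse_of_pot p))
    [0; 256; 512; 768; 1023]
```

The quadratic curve also gives finer low-speed control around the centre detent than a linear map would.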
+21
mte/2025_03_12_deekseek-r1-on-raspberry-pi.json
+21
mte/2025_03_12_deekseek-r1-on-raspberry-pi.json
···+"content": "<p>I\u2019ve heard a lot about Deepseek and wanted to try it for myself.</p>\n\n<p>Using a Raspberry Pi 5 with 8GB of RAM and an NVMe, I installed Ollama:</p>\n\n<div><div><pre><code>curl <span>-fsSL</span> https://ollama.com/install.sh | sh\n</code></pre></div></div>\n\n<p>Next, I downloaded the smallest Deepseek model</p>\n\n<div><div><pre><code>ollama pull deepseek-r1:1.5b\n</code></pre></div></div>\n\n<p>And ran it using:</p>\n\n<div><div><pre><code>ollama run deepseek-r1:1.5b\n</code></pre></div></div>\n\n<p>Things went downhill after that, as I couldn\u2019t get a complete response.\nThe Pi would lock up with no output on the display:</p>\n\n<div><div><pre><code>$ ollama run deepseek-r1:1.5b\n>>> what is the capital of england?\n<think>\nOkay, so I need to figure out whatclient_loop: send disconnect: Broken pipe\n</code></pre></div></div>\n\n<p>Switching from the Apple iPad charger I was using to an official Pi\npower supply resolved this problem.</p>\n\n<div><div><pre><code>$ ollama run deepseek-r1:1.5b\n>>> what is the capital of england?\n<think>\n\n</think>\n\nThe capital of England is London.\n\n>>>\n</code></pre></div></div>\n\n<p>However, it\u2019s not so good with a more niche question:</p>\n\n<div><div><pre><code>$ ollama run deepseek-r1:1.5b\n>>> what is OCaml?\n<think>\n\n</think>\n\nOCaml is a high-level, multi-paradigm programming language that combines\nfeatures from several other languages and frameworks. It was created by\nMartin Abadi, the founding father of Google, and has gained popularity\nin both academic research and industrial applications.\n\n...\n</code></pre></div></div>\n\n<p>How about an easier question?</p>\n\n<div><div><pre><code>$ ollama run deepseek-r1:1.5b\n>>> Who invented the C programming language?\n<think>\nOkay, so I need to figure out who invented the C programming\nlanguage. 
Hmm, I remember that C is a high-level programming language\nused for system programming and embedded systems because it's efficient\nand portable. But I'm not sure about its origin.\n\nFirst, I'll think about other languages that are often mentioned\nbefore C. For example, C was developed by Dennis Ritchie in the late\n1960s. He wrote the header files, which made the language more readable\nand maintainable. He probably worked for Bell Labs since that's where\nDennis is associated.\n\nI wonder if there were any other contributors. Maybe someone else helped\ndevelop parts of the language before Ritchie did everything. But I don't\nrecall any major alternatives to C being widely used before it became\ndominant in certain areas like operating systems and embedded devices.\n\nSo, putting it all together, Dennis Ritchie is credited with inventing\nC. He created a lot of header files that made the language structured\nand easier to use, which helped it become very popular for its efficiency\nand portability.\n</think>\n\nThe C programming language was invented by Dennis Ritchie, who worked at\nBell Labs in the late 1960s. His work led to the creation of significant\nheader files that enhanced readability and maintainability. Ritchie's\ncontributions made C a widely used language in areas like operating\nsystems and embedded systems.\n\n**Answer:**\n\nDennis Ritchie invented the C programming language. He worked at Bell Labs\n(now part of Tufts University) in the late 1960s, contributing to its\ndevelopment by creating essential header files that made the language\nmore readable and maintainable. His work led to a widely popularized\nlanguage used for system programming and embedded systems.\n\n**Step-by-Step Explanation:**\n\n1. **Identify the Problem:** The question is about the origin of C,\nan early high-level programming language.\n2. **Recall Other Languages Before C:** Mention languages like Pascal,\nAda, and Modula-2 that were around before C.\n3. 
**Consider Developers Known for C:** Dennis Ritchie stands out as a\nkey figure in C's development.\n4. **Details about Ritchie\u2019s Contributions:** He worked at Bell Labs\n(now Tufts University) in the 1960s, creating header files to improve\nC's structure and readability.\n5. **Link to C Language Development:** This led to C becoming prominent\nfor system programming due to its efficiency and portability.\n\n**Conclusion:**\nDennis Ritchie is credited with inventing C by developing header files\nthat enhanced the language\u2019s usability and structure.\n</code></pre></div></div>",
+21
mte/2025_03_14_pi-day.json
+21
mte/2025_03_14_pi-day.json
···+"content": "<p>It\u2019s <a href=\"https://en.wikipedia.org/wiki/Pi_Day\">Pi Day</a> 2025.</p>\n\n<p>Archimedes calculated the perimeter of inscribed regular polygons\nwithin a circle to approximate the value of \u03c0.</p>\n\n<p>A square inscribed in a unit circle can be divided into four right\ntriangles with two sides of unit length, corresponding to the radius of\nthe circle. The third side can be calculated by Pythagoras\u2019 theorem to\nbe \u221a2. The perimeter of the square would be 4\u221a2. Given C=\u03c0d, we\ncan calculate \u03c0 from the circumference by dividing it by the diameter,\n2, giving 2\u221a2.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/pi-archimedes-triangle.png\"></p>\n\n<p>CA, CD and CB are all the unit radius. AB is \u221a2 as calculated above. The\nangle ACB can be bisected with the line CD. EB is half of AB. Using\nPythagoras\u2019 theorem on the triangle BCE we can calculate CE. DE is then\n1 - CE, allowing us to use Pythagoras\u2019 theorem for a final time on BDE to\ncalculate BD. 
The improved approximation of the perimeter is now 8 x BD.</p>\n\n<p>We can iterate on this process using the following code:</p>\n\n<div><div><pre><code><span>let</span> <span>rec</span> <span>pi</span> <span>edge_squared</span> <span>sides</span> <span>=</span> <span>function</span>\n <span>|</span> <span>0</span> <span>-></span> <span>sides</span> <span>*.</span> <span>Float</span><span>.</span><span>sqrt</span><span>(</span><span>edge_squared</span><span>)</span> <span>/.</span> <span>2</span><span>.</span>\n <span>|</span> <span>n</span> <span>-></span>\n <span>let</span> <span>edge_squared</span> <span>=</span> <span>2</span><span>.</span> <span>-.</span> <span>2</span><span>.</span> <span>*.</span> <span>Float</span><span>.</span><span>sqrt</span> <span>(</span><span>1</span><span>.</span> <span>-.</span> <span>edge_squared</span> <span>/.</span> <span>4</span><span>.</span><span>)</span> <span>in</span>\n <span>let</span> <span>sides</span> <span>=</span> <span>sides</span> <span>*.</span> <span>2</span><span>.</span> <span>in</span>\n <span>pi</span> <span>edge_squared</span> <span>sides</span> <span>(</span><span>n</span> <span>-</span> <span>1</span><span>)</span>\n\n<span>let</span> <span>approximation</span> <span>=</span> <span>pi</span> <span>2</span><span>.</span> <span>4</span><span>.</span> <span>13</span>\n<span>let</span> <span>()</span> <span>=</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"pi %.31f</span><span>\\n</span><span>\"</span> <span>approximation</span>\n</code></pre></div></div>\n\n<p>I found this method quite interesting. Usually, as the number of\niterations increases the approximation of \u03c0 becomes more accurate\nwith the delta between each step becoming smaller until the difference\nis effectively zero (given the limited precision of the floating\ncalculation). 
However, in this case, after 13 iterations the\napproximation becomes worse!</p>\n\n<table>\n<thead>\n<tr><th>iteration</th><th>approximation</th><th>% error</th></tr>\n</thead>\n<tbody>\n<tr><td>0</td><td>2.8284271247461902909492437174777</td><td>9.968368</td></tr>\n<tr><td>1</td><td>3.0614674589207178101446515938733</td><td>2.550464</td></tr>\n<tr><td>2</td><td>3.1214451522580528575190328410827</td><td>0.641315</td></tr>\n<tr><td>3</td><td>3.1365484905459406483885231864406</td><td>0.160561</td></tr>\n<tr><td>4</td><td>3.1403311569547391890466769837076</td><td>0.040155</td></tr>\n<tr><td>5</td><td>3.1412772509327568926096319046337</td><td>0.010040</td></tr>\n<tr><td>6</td><td>3.1415138011441454679584239784162</td><td>0.002510</td></tr>\n<tr><td>7</td><td>3.1415729403678827047485810908256</td><td>0.000627</td></tr>\n<tr><td>8</td><td>3.1415877252799608854161306226160</td><td>0.000157</td></tr>\n<tr><td>9</td><td>3.1415914215046352175875199463917</td><td>0.000039</td></tr>\n<tr><td>10</td><td>3.1415923456110768086091411532834</td><td>0.000010</td></tr>\n<tr><td>11</td><td>3.1415925765450043449789063743083</td><td>0.000002</td></tr>\n<tr><td>12</td><td>3.1415926334632482408437681442592</td><td>0.000001</td></tr>\n<tr><td>13</td><td>3.1415926548075892021927302266704</td><td>-0.000000</td></tr>\n<tr><td>14</td><td>3.1415926453212152935634549066890</td><td>0.000000</td></tr>\n<tr><td>15</td><td>3.1415926073757196590463536267634</td><td>0.000001</td></tr>\n<tr><td>16</td><td>3.1415929109396727447744979144773</td><td>-0.000008</td></tr>\n<tr><td>17</td><td>3.1415941251951911006301543238806</td><td>-0.000047</td></tr>\n<tr><td>18</td><td>3.1415965537048196054570325941313</td><td>-0.000124</td></tr>\n<tr><td>19</td><td>3.1415965537048196054570325941313</td><td>-0.000124</td></tr>\n<tr><td>20</td><td>3.1416742650217575061333263874985</td><td>-0.002598</td></tr>\n<tr><td>21</td><td>3.1418296818892015309643284126651</td><td>-0.007545</td></tr>\n<tr><td>22</td><td>3.1424512724941338071005247911671</td><td>-0.027331</td></tr>\n<tr><td>23</td><td>3.1424512724941338071005247911671</td><td>-0.027331</td></tr>\n<tr><td>24</td><td>3.1622776601683795227870632515987</td><td>-0.658424</td></tr>\n<tr><td>25</td><td>3.1622776601683795227870632515987</td><td>-0.658424</td></tr>\n<tr><td>26</td><td>3.4641016151377543863532082468737</td><td>-10.265779</td></tr>\n<tr><td>27</td><td>4.0000000000000000000000000000000</td><td>-27.323954</td></tr>\n<tr><td>28</td><td>0.0000000000000000000000000000000</td><td>100.000000</td></tr>\n</tbody>\n</table>\n\n<p>Using the <a href=\"https://opam.ocaml.org/packages/decimal/\">decimal</a> package\nwe can specify the floating point precision we want, allowing us to\nget to 
100 decimal places in 165 steps.</p>\n\n<div><div><pre><code><span>open</span> <span>Decimal</span>\n\n<span>let</span> <span>context</span> <span>=</span> <span>Context</span><span>.</span><span>make</span> <span>~</span><span>prec</span><span>:</span><span>200</span> <span>()</span>\n<span>let</span> <span>two</span> <span>=</span> <span>of_int</span> <span>2</span>\n<span>let</span> <span>four</span> <span>=</span> <span>of_int</span> <span>4</span>\n\n<span>let</span> <span>rec</span> <span>pi</span> <span>edge_squared</span> <span>sides</span> <span>n</span> <span>=</span>\n <span>match</span> <span>n</span> <span>with</span>\n <span>|</span> <span>0</span> <span>-></span> <span>mul</span> <span>~</span><span>context</span> <span>sides</span> <span>(</span><span>div</span> <span>~</span><span>context</span> <span>(</span><span>sqrt</span> <span>~</span><span>context</span> <span>edge_squared</span><span>)</span> <span>two</span><span>)</span>\n <span>|</span> <span>n</span> <span>-></span>\n <span>let</span> <span>edge_squared</span> <span>=</span>\n <span>sub</span> <span>~</span><span>context</span> <span>two</span>\n <span>(</span><span>mul</span> <span>~</span><span>context</span> <span>two</span>\n <span>(</span><span>sqrt</span> <span>~</span><span>context</span> <span>(</span><span>sub</span> <span>~</span><span>context</span> <span>one</span> <span>(</span><span>div</span> <span>~</span><span>context</span> <span>edge_squared</span> <span>four</span><span>))))</span>\n <span>in</span>\n <span>let</span> <span>sides</span> <span>=</span> <span>mul</span> <span>~</span><span>context</span> <span>sides</span> <span>two</span> <span>in</span>\n <span>pi</span> <span>edge_squared</span> <span>sides</span> <span>(</span><span>Int</span><span>.</span><span>pred</span> <span>n</span><span>)</span>\n\n<span>let</span> <span>()</span> <span>=</span> <span>pi</span> <span>two</span> <span>four</span> <span>165</span> <span>|></span> <span>to_string</span> 
<span>~</span><span>context</span> <span>|></span> <span>Printf</span><span>.</span><span>printf</span> <span>\"%s</span><span>\\n</span><span>\"</span>\n</code></pre></div></div>\n\n<p>This code is available on <a href=\"https://github.com/mtelvers/pi-archimedes\">GitHub</a></p>",
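The worsening after iteration 13 is catastrophic cancellation: once `edge_squared` is tiny, `2 − 2√(1 − edge_squared/4)` subtracts two nearly equal numbers and the leading digits are lost. Multiplying top and bottom by the conjugate gives the algebraically identical but numerically stable update `edge_squared / (2 + 2√(1 − edge_squared/4))`, since `(2 − 2c)(2 + 2c) = 4 − 4c² = edge_squared` with `c = √(1 − edge_squared/4)`. This rewrite is a standard trick rather than something from the post; a sketch:

```ocaml
(* Same Archimedes doubling as the post, but with the cancellation-free
   conjugate form of the edge update, so plain 64-bit floats converge. *)
let rec pi edge_squared sides = function
  | 0 -> sides *. sqrt edge_squared /. 2.
  | n ->
    let edge_squared =
      (* algebraically equal to 2. -. 2. *. sqrt (1. -. edge_squared /. 4.) *)
      edge_squared /. (2. +. 2. *. sqrt (1. -. edge_squared /. 4.))
    in
    pi edge_squared (sides *. 2.) (n - 1)

let () = Printf.printf "%.15f\n" (pi 2. 4. 30)
```

With the stable form, 30 doublings (a 4·2³⁰-gon) agree with π essentially to double precision, with no need for the decimal package unless more digits are wanted.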
+21
mte/2025_03_15_bluesky-pds.json
+21
mte/2025_03_15_bluesky-pds.json
···+"content": "<p>Today I have set up my own Bluesky Personal Data Server (PDS).</p>\n\n<p>I followed the README at\n<a href=\"https://github.com/bluesky-social/pds\">https://github.com/bluesky-social/pds</a>\nusing an Ubuntu 22.04 VM. The basic steps are:</p>\n\n<ol>\n <li>Publish DNS records pointing to your machine.</li>\n <li>As root, run <a href=\"https://raw.githubusercontent.com/bluesky-social/pds/main/installer.sh\">install.sh</a>.</li>\n <li>Enter your email address and preferred handle.</li>\n</ol>\n\n<p>It wasn\u2019t entirely obvious how to set your handle to be the same\nas the domain name when you have something else already published\non the domain, such as your web server.</p>\n\n<p><a href=\"https://github.com/bluesky-social/pds/issues/103\">Issue #103</a> shows how this should be achieved.</p>\n\n<ol>\n <li>Publish the DNS record for <code>pds.yourdomain.com</code>.</li>\n <li>Use <code>pds.yourdomain.com</code> during setup.</li>\n <li>At the final stage, where a handle is created, use <code>tmphandle.pds.yourdomain.com</code>.</li>\n <li>Change this to your preferred handle via the Bluesky app.</li>\n</ol>\n\n<p>Log in using the custom server pds.yourdomain.com and the handle you created.</p>\n\n<p>Next go to Account > Handle and select \u2018I have my own domain\u2019. Enter\nthe domain name, which should be the new handle that you want. In\nmy case, <code>mtelvers.tunbury.org</code>. Next, publish a DNS TXT record\nfor <code>_atproto.mtelvers.tunbury.org</code> containing your DID record\n<code>did=did:plc:5le6ofipuf6sdk6czluurgjc</code></p>\n\n<div><div><pre><code>Check service status : sudo systemctl status pds\nWatch service logs : sudo docker logs -f pds\nBackup service data : /pds\nPDS Admin command : pdsadmin\n\nTo see pdsadmin commands, run \"pdsadmin help\"\n</code></pre></div></div>",
+21
mte/2025_03_16_setup-tangled-with-bluesky.json
+21
mte/2025_03_16_setup-tangled-with-bluesky.json
···+"summary": "To set this up, I\u2019m using a modified version of Anil\u2019s repo. My repo is here. Firstly, clone the repo and run gen-key.sh.",+"content": "<p>To set this up, I\u2019m using a modified version of Anil\u2019s <a href=\"https://tangled.sh/@anil.recoil.org/knot-docker\">repo</a>. My repo is <a href=\"https://tangled.sh/@mtelvers.tunbury.org/knot-docker\">here</a>. Firstly, clone the repo and run <code>gen-key.sh</code>.</p>\n\n<p>Go to <a href=\"https://tangled.sh/login\">https://tangled.sh/login</a> and click the <a href=\"https://bsky.app/settings/app-passwords\">link</a> to generate an app password. Copy the created password and return to <a href=\"https://tangled.sh/login\">https://tangled.sh/login</a> and sign in using your handle and the newly created app password.</p>\n\n<p>Go to <a href=\"https://tangled.sh/knots\">https://tangled.sh/knots</a>, enter your knot hostname and click on generate key. Copy <code>knot.env.template</code> to <code>.env</code> and enter the key in <code>KNOT_SERVER_SECRET</code>. In the same file, also set the server name.</p>\n\n<p>The original <code>Dockerfile</code> didn\u2019t quite work for me as <code>adduser -D</code> (from alpine/busybox) creates a disabled user which cannot sign in, even over SSH. Instead, I generate a random password for the <code>git</code> user. My diff looks like this:</p>\n\n<div><div><pre><code>- adduser -D -u 1000 -G git -h /home/git git && \\\n+ pw=\"$(head -c 20 /dev/urandom | base64 | head -c 10)\" \\\n+ printf \"$pw\\n$pw\\n\" | \\\n+ adduser -u 1000 -G git -h /home/git git && \\\n</code></pre></div></div>\n\n<p>Run <code>docker compose up -d</code> then check on <a href=\"https://tangled.sh/knots\">https://tangled.sh/knots</a>. 
Click on initialize and wait for the process to complete.</p>\n\n<p>Add a remote repo as normal:</p>\n\n<div><div><pre><code>git remote add knot git@git.tunbury.org:mtelvers.tunbury.org/pi-archimedes\n</code></pre></div></div>\n<p>Then push as you would to any other remote</p>\n<div><div><pre><code>git push knot\n</code></pre></div></div>",
+21
mte/2025_03_17_capnproto.json
+21
mte/2025_03_17_capnproto.json
···+"summary": "Cap\u2019n Proto has become a hot topic recently and, while it is used by many OCaml-CI services, I spent some time creating a minimal application.",+"content": "<p>Cap\u2019n Proto has become a hot topic recently and, while it is used by many OCaml-CI services, I spent some time creating a minimal application.</p>\n\n<p>Firstly, create a schema with a single interface which accepts a file name and returns the content.</p>\n\n<div><div><pre><code>interface Foo {\n get @0 (name :Text) -> (reply :Text);\n}\n</code></pre></div></div>\n\n<p>This schema can then be compiled into the bindings for your required language, e.g. <code>capnp compile -o ocaml:. schema.capnp</code></p>\n\n<p>In practice this need not be done by hand, as we can use a <code>dune</code> rule to do this.</p>\n\n<div><div><pre><code>(rule\n (targets foo_api.ml foo_api.mli)\n (deps foo_api.capnp)\n (action (run capnp compile -o %{bin:capnpc-ocaml} %{deps})))\n</code></pre></div></div>\n\n<p>On the server side we now need to extend the automatically generated code to actually implement the interface. 
This code is largely boilerplate.</p>\n\n<div><div><pre><code><span>module</span> <span>Api</span> <span>=</span> <span>Foo_api</span><span>.</span><span>MakeRPC</span><span>(</span><span>Capnp_rpc</span><span>)</span>\n\n<span>open</span> <span>Capnp_rpc</span><span>.</span><span>Std</span>\n\n<span>let</span> <span>read_from_file</span> <span>filename</span> <span>=</span> <span>In_channel</span><span>.</span><span>with_open_text</span> <span>filename</span> <span>@@</span> <span>fun</span> <span>ic</span> <span>-></span> <span>In_channel</span><span>.</span><span>input_all</span> <span>ic</span>\n\n<span>let</span> <span>local</span> <span>=</span>\n <span>let</span> <span>module</span> <span>Foo</span> <span>=</span> <span>Api</span><span>.</span><span>Service</span><span>.</span><span>Foo</span> <span>in</span>\n <span>Foo</span><span>.</span><span>local</span> <span>@@</span> <span>object</span>\n <span>inherit</span> <span>Foo</span><span>.</span><span>service</span>\n\n <span>method</span> <span>get_impl</span> <span>params</span> <span>release_param_caps</span> <span>=</span>\n <span>let</span> <span>open</span> <span>Foo</span><span>.</span><span>Get</span> <span>in</span>\n <span>let</span> <span>name</span> <span>=</span> <span>Params</span><span>.</span><span>name_get</span> <span>params</span> <span>in</span>\n <span>release_param_caps</span> <span>()</span><span>;</span>\n <span>let</span> <span>response</span><span>,</span> <span>results</span> <span>=</span> <span>Service</span><span>.</span><span>Response</span><span>.</span><span>create</span> <span>Results</span><span>.</span><span>init_pointer</span> <span>in</span>\n <span>Results</span><span>.</span><span>reply_set</span> <span>results</span> <span>(</span><span>read_from_file</span> <span>name</span><span>);</span>\n <span>Service</span><span>.</span><span>return</span> <span>response</span>\n <span>end</span>\n</code></pre></div></div>\n\n<p>The server needs to generate the capability file 
needed to access the service and wait for incoming connections.</p>\n\n<div><div><pre><code><span>let</span> <span>cap_file</span> <span>=</span> <span>\"echo.cap\"</span>\n\n<span>let</span> <span>serve</span> <span>config</span> <span>=</span>\n <span>Switch</span><span>.</span><span>run</span> <span>@@</span> <span>fun</span> <span>sw</span> <span>-></span>\n <span>let</span> <span>service_id</span> <span>=</span> <span>Capnp_rpc_unix</span><span>.</span><span>Vat_config</span><span>.</span><span>derived_id</span> <span>config</span> <span>\"main\"</span> <span>in</span>\n <span>let</span> <span>restore</span> <span>=</span> <span>Restorer</span><span>.</span><span>single</span> <span>service_id</span> <span>(</span><span>Foo</span><span>.</span><span>local</span><span>)</span> <span>in</span>\n <span>let</span> <span>vat</span> <span>=</span> <span>Capnp_rpc_unix</span><span>.</span><span>serve</span> <span>~</span><span>sw</span> <span>~</span><span>restore</span> <span>config</span> <span>in</span>\n <span>match</span> <span>Capnp_rpc_unix</span><span>.</span><span>Cap_file</span><span>.</span><span>save_service</span> <span>vat</span> <span>service_id</span> <span>cap_file</span> <span>with</span>\n <span>|</span> <span>Error</span> <span>`Msg</span> <span>m</span> <span>-></span> <span>failwith</span> <span>m</span>\n <span>|</span> <span>Ok</span> <span>()</span> <span>-></span>\n <span>traceln</span> <span>\"Server running. 
Connect using %S.\"</span> <span>cap_file</span><span>;</span>\n <span>Fiber</span><span>.</span><span>await_cancel</span> <span>()</span>\n</code></pre></div></div>\n\n<p>The client application imports the capability file and calls the service <code>Foo.get</code>.</p>\n\n<div><div><pre><code><span>let</span> <span>run_client</span> <span>service</span> <span>=</span>\n <span>let</span> <span>x</span> <span>=</span> <span>Foo</span><span>.</span><span>get</span> <span>service</span> <span>\"client.ml\"</span> <span>in</span>\n <span>traceln</span> <span>\"%S\"</span> <span>x</span>\n\n<span>let</span> <span>connect</span> <span>net</span> <span>uri</span> <span>=</span>\n <span>Switch</span><span>.</span><span>run</span> <span>@@</span> <span>fun</span> <span>sw</span> <span>-></span>\n <span>let</span> <span>client_vat</span> <span>=</span> <span>Capnp_rpc_unix</span><span>.</span><span>client_only_vat</span> <span>~</span><span>sw</span> <span>net</span> <span>in</span>\n <span>let</span> <span>sr</span> <span>=</span> <span>Capnp_rpc_unix</span><span>.</span><span>Vat</span><span>.</span><span>import_exn</span> <span>client_vat</span> <span>uri</span> <span>in</span>\n <span>Capnp_rpc_unix</span><span>.</span><span>with_cap_exn</span> <span>sr</span> <span>run_client</span>\n</code></pre></div></div>\n\n<p>Where <code>Foo.get</code> is defined like this</p>\n\n<div><div><pre><code><span>module</span> <span>Foo</span> <span>=</span> <span>Api</span><span>.</span><span>Client</span><span>.</span><span>Foo</span>\n\n<span>let</span> <span>get</span> <span>t</span> <span>name</span> <span>=</span>\n <span>let</span> <span>open</span> <span>Foo</span><span>.</span><span>Get</span> <span>in</span>\n <span>let</span> <span>request</span><span>,</span> <span>params</span> <span>=</span> <span>Capability</span><span>.</span><span>Request</span><span>.</span><span>create</span> <span>Params</span><span>.</span><span>init_pointer</span> <span>in</span>\n 
<span>Params</span><span>.</span><span>name_set</span> <span>params</span> <span>name</span><span>;</span>\n <span>Capability</span><span>.</span><span>call_for_value_exn</span> <span>t</span> <span>method_id</span> <span>request</span> <span>|></span> <span>Results</span><span>.</span><span>reply_get</span>\n</code></pre></div></div>\n\n<p>Run the server application passing it parameters of where to save the private key and which interface/port to listen on.</p>\n\n<div><div><pre><code><span>$ </span>dune <span>exec</span> <span>--</span> ./server.exe <span>--capnp-secret-key-file</span> ./server.pem <span>--capnp-listen-address</span> tcp:127.0.0.1:7000\n+Server running. Connect using <span>\"echo.cap\"</span><span>.</span>\n</code></pre></div></div>\n\n<p>The <code>.cap</code> looks like this</p>\n\n<div><div><pre><code>capnp://sha-256:f5BAo2n_2gVxUdkyzYsIuitpA1YT_7xFg31FIdNKVls@127.0.0.1:7000/6v45oIvGQ6noMaLOh5GHAJnGJPWEO5A3Qkt0Egke4Ic\n</code></pre></div></div>\n\n<p>In another window, invoke the client.</p>\n\n<div><div><pre><code><span>$ </span>dune <span>exec</span> <span>--</span> ./client.exe ./echo.cap\n</code></pre></div></div>\n\n<p>The full code is available on <a href=\"https://github.com/mtelvers/capnp-minimum\">Github</a>.</p>",
+21
mte/2025_03_17_irmin.json
+21
mte/2025_03_17_irmin.json
···+"content": "<p>After Thomas\u2019 talk today I wanted to try <a href=\"https://irmin.org\">Irmin</a> for myself.</p>\n\n<p>In a new switch I installed Irmin via opam <code>opam install irmin-git</code> and then built the <a href=\"https://irmin.org/tutorial/getting-started/\">example code</a></p>\n\n<div><div><pre><code><span>open</span> <span>Lwt</span><span>.</span><span>Syntax</span>\n<span>module</span> <span>Git_store</span> <span>=</span> <span>Irmin_git_unix</span><span>.</span><span>FS</span><span>.</span><span>KV</span> <span>(</span><span>Irmin</span><span>.</span><span>Contents</span><span>.</span><span>String</span><span>)</span>\n<span>module</span> <span>Git_info</span> <span>=</span> <span>Irmin_unix</span><span>.</span><span>Info</span> <span>(</span><span>Git_store</span><span>.</span><span>Info</span><span>)</span>\n\n<span>let</span> <span>git_config</span> <span>=</span> <span>Irmin_git</span><span>.</span><span>config</span> <span>~</span><span>bare</span><span>:</span><span>true</span> <span>\"./db\"</span>\n<span>let</span> <span>info</span> <span>message</span> <span>=</span> <span>Git_info</span><span>.</span><span>v</span> <span>~</span><span>author</span><span>:</span><span>\"Example\"</span> <span>\"%s\"</span> <span>message</span>\n\n<span>let</span> <span>main_branch</span> <span>config</span> <span>=</span>\n <span>let</span><span>*</span> <span>repo</span> <span>=</span> <span>Git_store</span><span>.</span><span>Repo</span><span>.</span><span>v</span> <span>config</span> <span>in</span>\n <span>Git_store</span><span>.</span><span>main</span> <span>repo</span>\n\n<span>let</span> <span>main</span> <span>=</span>\n <span>let</span><span>*</span> <span>t</span> <span>=</span> <span>main_branch</span> <span>git_config</span> <span>in</span>\n <span>(* Set a/b/c to \"Hello, Irmin!\" *)</span>\n <span>let</span><span>*</span> <span>()</span> <span>=</span>\n <span>Git_store</span><span>.</span><span>set_exn</span> <span>t</span> 
<span>[</span> <span>\"a\"</span><span>;</span> <span>\"b\"</span><span>;</span> <span>\"c\"</span> <span>]</span> <span>\"Hello, Irmin!\"</span>\n <span>~</span><span>info</span><span>:</span><span>(</span><span>info</span> <span>\"my first commit\"</span><span>)</span>\n <span>in</span>\n <span>(* Get a/b/c *)</span>\n <span>let</span><span>+</span> <span>s</span> <span>=</span> <span>Git_store</span><span>.</span><span>get</span> <span>t</span> <span>[</span> <span>\"a\"</span><span>;</span> <span>\"b\"</span><span>;</span> <span>\"c\"</span> <span>]</span> <span>in</span>\n <span>assert</span> <span>(</span><span>s</span> <span>=</span> <span>\"Hello, Irmin!\"</span><span>)</span>\n\n<span>let</span> <span>()</span> <span>=</span> <span>Lwt_main</span><span>.</span><span>run</span> <span>main</span>\n</code></pre></div></div>\n\n<p>I\u2019m pretty excited about the possibilities.</p>",
+21
mte/2025_03_23_real-time-trains.json
+21
mte/2025_03_23_real-time-trains.json
···+"summary": "After the Heathrow substation electrical fire, I found myself in Manchester with a long train ride ahead. Checking on Real Time Trains for the schedule, I noticed that they had an API. With time to spare, I registered for an account and downloaded the sample code from ocaml-cohttp.",+"content": "<p>After the Heathrow substation electrical fire, I found myself in Manchester with a long train ride ahead. Checking on <a href=\"https://www.realtimetrains.co.uk\">Real Time Trains</a> for the schedule, I noticed that they had an API. With time to spare, I registered for an account and downloaded the sample code from <a href=\"https://github.com/mirage/ocaml-cohttp\">ocaml-cohttp</a>.</p>\n\n<p>The API uses HTTP basic authentication; the account details are added via an HTTP header:</p>\n\n<div><div><pre><code> <span>let</span> <span>headers</span> <span>=</span> <span>Cohttp</span><span>.</span><span>Header</span><span>.</span><span>init</span> <span>()</span> <span>in</span>\n <span>let</span> <span>headers</span> <span>=</span>\n <span>Cohttp</span><span>.</span><span>Header</span><span>.</span><span>add_authorization</span> <span>headers</span> <span>(</span><span>`Basic</span> <span>(</span><span>user</span><span>,</span> <span>password</span><span>))</span>\n</code></pre></div></div>\n\n<p>The response from the API can be converted to JSON using <a href=\"https://github.com/ocaml-community/yojson\">Yojson</a>.</p>\n\n<div><div><pre><code><span>let</span> <span>json</span> <span>=</span>\n <span>Eio</span><span>.</span><span>Buf_read</span><span>.(</span><span>parse_exn</span> <span>take_all</span><span>)</span> <span>body</span> <span>~</span><span>max_size</span><span>:</span><span>max_int</span>\n <span>|></span> <span>Yojson</span><span>.</span><span>Safe</span><span>.</span><span>from_string</span>\n</code></pre></div></div>\n\n<p>The JSON fields can be read using the <code>Util</code> functions. 
For example, <code>Yojson.Basic.Util.member \"services\" json</code> will read the <code>services</code> entry. Elements can be converted to lists with <code>Yojson.Basic.Util.to_list</code>. After a bit of hacking, this turned out to be quite tedious to code.</p>\n\n<p>As an alternative, I decided to use <code>ppx_deriving_yojson.runtime</code>. I described the JSON blocks as OCaml types, e.g. <code>station</code> as below.</p>\n\n<div><div><pre><code><span>type</span> <span>station</span> <span>=</span> <span>{</span>\n <span>tiploc</span> <span>:</span> <span>string</span><span>;</span>\n <span>description</span> <span>:</span> <span>string</span><span>;</span>\n <span>workingTime</span> <span>:</span> <span>string</span><span>;</span>\n <span>publicTime</span> <span>:</span> <span>string</span><span>;</span>\n<span>}</span>\n<span>[</span><span>@@</span><span>deriving</span> <span>yojson</span><span>]</span>\n</code></pre></div></div>\n\n<p>The preprocessor automatically generates two functions: <code>station_of_yojson</code> and <code>station_to_yojson</code>, which handle the conversion.</p>\n\n<p>The only negative of this approach is that RTT doesn\u2019t emit empty JSON fields, so they need to be flagged as possibly missing and a default value provided. 
For example, <code>realtimeArrivalNextDay</code> is not emitted unless the value is <code>true</code>.</p>\n\n<div><div><pre><code> <span>realtimeArrivalNextDay</span> <span>:</span> <span>(</span><span>bool</span><span>[</span><span>@</span><span>default</span> <span>false</span><span>]);</span>\n</code></pre></div></div>\n\n<p>Now once the JSON has been received we can just convert it to OCaml types very easily:</p>\n\n<div><div><pre><code> <span>match</span> <span>reply_of_yojson</span> <span>json</span> <span>with</span>\n <span>|</span> <span>Ok</span> <span>reply</span> <span>-></span>\n <span>(* Use reply.services *)</span>\n <span>|</span> <span>Error</span> <span>err</span> <span>-></span> <span>Printf</span><span>.</span><span>printf</span> <span>\"Error %s</span><span>\\n</span><span>\"</span> <span>err</span>\n</code></pre></div></div>\n\n<p>My work in progress code is available on <a href=\"https://github.com/mtelvers/ocaml-rtt\">GitHub</a></p>\n\n<div><div><pre><code>dune exec --release -- rtt --user USER --pass PASS --station RTR\nrtt: [DEBUG] received 3923 bytes of body\nrtt: [DEBUG] received 4096 bytes of body\nrtt: [DEBUG] received 4096 bytes of body\nrtt: [DEBUG] received 4096 bytes of body\nrtt: [DEBUG] received 1236 bytes of body\nrtt: [DEBUG] end of inbound body\n2025-03-23 2132 W16178 1C69 1 Ramsgate St Pancras International\n2025-03-23 2132 W25888 9P59 2 Plumstead Rainham (Kent)\n2025-03-23 2136 J00119 1U28 2 London Victoria Ramsgate\n2025-03-23 2144 W25927 9P86 1 Rainham (Kent) Plumstead\n2025-03-23 2157 W16899 1C66 2 St Pancras International Ramsgate\n2025-03-23 2202 W25894 9P61 2 Plumstead Rainham (Kent)\n2025-03-23 2210 J26398 1U80 1 Ramsgate London Victoria\n2025-03-23 2214 W25916 9P70 1 Rainham (Kent) Plumstead\n2025-03-23 2232 W16910 1C73 1 Ramsgate St Pancras International\n2025-03-23 2232 W25900 9P63 2 Plumstead Rainham (Kent)\n2025-03-23 2236 J00121 1U30 2 London Victoria Ramsgate\n2025-03-23 2244 W25277 9A92 1 Rainham (Kent) 
Dartford\n2025-03-23 2257 W16450 1F70 2 St Pancras International Faversham\n2025-03-23 2302 W25906 9P65 2 Plumstead Rainham (Kent)\n2025-03-23 2314 W25283 9A94 1 Rainham (Kent) Dartford\n2025-03-23 2318 J00155 1U82 1 Ramsgate London Victoria\n2025-03-23 2332 W25912 9P67 2 Plumstead Gillingham (Kent)\n2025-03-23 2336 J00123 1U32 2 London Victoria Ramsgate\n2025-03-23 2344 W25289 9A96 1 Rainham (Kent) Dartford\n2025-03-23 2357 W16475 1F74 2 St Pancras International Faversham\n2025-03-23 0002 W25915 9P69 2 Plumstead Gillingham (Kent)\n2025-03-23 0041 J26381 1Z34 2 London Victoria Faversham\n</code></pre></div></div>",
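To make the default-for-missing-fields behaviour concrete, here is a stdlib-only sketch (my illustration, not the post's code, and without the ppx): an association list stands in for a decoded JSON object, and the lookup falls back to a default exactly as `[@default false]` arranges.

```ocaml
(* RTT omits realtimeArrivalNextDay from the JSON unless it is true,
   so the decoder must fall back to a default when the field is
   absent.  The assoc list stands in for a decoded JSON object. *)
let bool_field ?(default = false) name fields =
  match List.assoc_opt name fields with
  | Some b -> b
  | None -> default

let () =
  (* Field present: use the emitted value. *)
  assert (bool_field "realtimeArrivalNextDay" [ ("realtimeArrivalNextDay", true) ]);
  (* Field absent: fall back to the default. *)
  assert (not (bool_field "realtimeArrivalNextDay" []))
```

The ppx-derived decoder does the same fallback per annotated field while decoding the whole record.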
+21
mte/2025_03_24_recent-ocaml-version.json
···+"summary": "Following my post on discuss.ocaml.org, I have created a new release of ocurrent/ocaml-version that moves the minimum version of OCaml considered recent from 4.02 to 4.08.",+"content": "<p>Following my <a href=\"https://discuss.ocaml.org/t/docker-base-images-and-ocaml-ci-support-for-ocaml-4-08/16229\">post on discuss.ocaml.org</a>, I have created a new release of <a href=\"https://github.com/ocurrent/ocaml-version\">ocurrent/ocaml-version</a> that moves the minimum version of OCaml considered <em>recent</em> from 4.02 to 4.08.</p>\n\n<div><div><pre><code><span>let</span> <span>recent</span> <span>=</span> <span>[</span> <span>v4_08</span><span>;</span> <span>v4_09</span><span>;</span> <span>v4_10</span><span>;</span> <span>v4_11</span><span>;</span> <span>v4_12</span><span>;</span> <span>v4_13</span><span>;</span> <span>v4_14</span><span>;</span> <span>v5_0</span><span>;</span> <span>v5_1</span><span>;</span> <span>v5_2</span><span>;</span> <span>v5_3</span> <span>]</span>\n</code></pre></div></div>\n\n<p>This may feel like a mundane change, but <a href=\"https://github.com/ocurrent/ocaml-ci\">OCaml-CI</a>, <a href=\"https://github.com/ocurrent/opam-repo-ci\">opam-repo-ci</a>, and the <a href=\"https://github.com/ocurrent/docker-base-images\">Docker base image builder</a>, among other things, use this to determine the set of versions of OCaml to test against. Therefore, as these services are updated, testing on the old releases will be removed.</p>",
+21
mte/2025_03_25_topological-sort.json
···+"summary": "Given a list of packages and their dependencies, what order should those packages be installed in?",+"content": "<p>Given a list of packages and their dependencies, what order should those packages be installed in?</p>\n\n<p>The above graph gives a simple example of the dependencies of the package <code>dune</code> nicely ordered right to left.</p>\n\n<p>We might choose to model this in OCaml using a map with the package name as the key and a set of the dependent packages:</p>\n\n<div><div><pre><code><span>module</span> <span>PackageSet</span> <span>=</span> <span>Set</span><span>.</span><span>Make</span> <span>(</span><span>String</span><span>);;</span>\n<span>module</span> <span>PackageMap</span> <span>=</span> <span>Map</span><span>.</span><span>Make</span> <span>(</span><span>String</span><span>);;</span>\n</code></pre></div></div>\n\n<p>Thus, the <code>dune</code> example could be defined like this.</p>\n\n<div><div><pre><code><span>let</span> <span>dune</span> <span>=</span> <span>PackageMap</span><span>.(</span><span>empty</span> <span>|></span>\n <span>add</span> <span>\"ocaml\"</span> <span>(</span><span>PackageSet</span><span>.(</span><span>empty</span> <span>|></span> <span>add</span> <span>\"ocaml-config\"</span> <span>|></span> <span>add</span> <span>\"ocaml-variants\"</span><span>))</span> <span>|></span>\n <span>add</span> <span>\"ocaml-config\"</span> <span>(</span><span>PackageSet</span><span>.(</span><span>empty</span> <span>|></span> <span>add</span> <span>\"ocaml-variants\"</span><span>))</span> <span>|></span>\n <span>add</span> <span>\"dune\"</span> <span>(</span><span>PackageSet</span><span>.(</span><span>empty</span> <span>|></span> <span>add</span> <span>\"ocaml\"</span> <span>|></span> <span>add</span> <span>\"base-unix.base\"</span> <span>|></span> <span>add</span> <span>\"base-threads.base\"</span><span>))</span> <span>|></span>\n <span>add</span> <span>\"ocaml-variants\"</span> 
<span>(</span><span>PackageSet</span><span>.</span><span>empty</span><span>)</span> <span>|></span>\n <span>add</span> <span>\"base-unix.base\"</span> <span>(</span><span>PackageSet</span><span>.</span><span>empty</span><span>)</span> <span>|></span>\n <span>add</span> <span>\"base-threads.base\"</span> <span>(</span><span>PackageSet</span><span>.</span><span>empty</span><span>)</span>\n <span>);;</span>\n</code></pre></div></div>\n\n<p>We can create a topological sort by first choosing any package with an empty set of dependencies. This package should then be removed from the map of packages and also removed as a dependency from any of the sets. This can be written concisely in OCaml</p>\n\n<div><div><pre><code><span>let</span> <span>rec</span> <span>topological_sort</span> <span>pkgs</span> <span>=</span>\n <span>match</span> <span>PackageMap</span><span>.</span><span>is_empty</span> <span>pkgs</span> <span>with</span>\n <span>|</span> <span>true</span> <span>-></span> <span>[]</span>\n <span>|</span> <span>false</span> <span>-></span>\n <span>let</span> <span>installable</span> <span>=</span> <span>PackageMap</span><span>.</span><span>filter</span> <span>(</span><span>fun</span> <span>_</span> <span>deps</span> <span>-></span> <span>PackageSet</span><span>.</span><span>is_empty</span> <span>deps</span><span>)</span> <span>pkgs</span> <span>in</span>\n <span>let</span> <span>()</span> <span>=</span> <span>assert</span> <span>(</span><span>not</span> <span>(</span><span>PackageMap</span><span>.</span><span>is_empty</span> <span>installable</span><span>))</span> <span>in</span>\n <span>let</span> <span>i</span> <span>=</span> <span>PackageMap</span><span>.</span><span>choose</span> <span>installable</span> <span>|></span> <span>fst</span> <span>in</span>\n <span>let</span> <span>pkgs</span> <span>=</span> <span>PackageMap</span><span>.</span><span>remove</span> <span>i</span> <span>pkgs</span> <span>|></span> <span>PackageMap</span><span>.</span><span>map</span> 
<span>(</span><span>fun</span> <span>deps</span> <span>-></span> <span>PackageSet</span><span>.</span><span>remove</span> <span>i</span> <span>deps</span><span>)</span> <span>in</span>\n <span>i</span> <span>::</span> <span>topological_sort</span> <span>pkgs</span>\n</code></pre></div></div>\n\n<p>This gives us the correct installation order:</p>\n\n<div><div><pre><code># topological_sort dune;;\n- : PackageMap.key list =\n[\"base-threads.base\"; \"base-unix.base\"; \"ocaml-variants\"; \"ocaml-config\"; \"ocaml\"; \"dune\"]\n</code></pre></div></div>",
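The `assert` in the post's code aborts when every remaining package still has unmet dependencies, i.e. the graph contains a cycle. A gentler variant (my sketch, not the post's code) returns a `result` so the caller can report the cycle instead:

```ocaml
module PackageSet = Set.Make (String)
module PackageMap = Map.Make (String)

(* Same algorithm as before, but returning a result instead of
   asserting: if no package has an empty dependency set while
   packages remain, the graph contains a cycle and the remaining
   package names are reported. *)
let rec topological_sort pkgs =
  if PackageMap.is_empty pkgs then Ok []
  else
    let installable = PackageMap.filter (fun _ deps -> PackageSet.is_empty deps) pkgs in
    match PackageMap.choose_opt installable with
    | None -> Error (List.map fst (PackageMap.bindings pkgs))
    | Some (i, _) ->
        let pkgs = PackageMap.remove i pkgs |> PackageMap.map (PackageSet.remove i) in
        Result.map (fun rest -> i :: rest) (topological_sort pkgs)
```

A two-package cycle such as `a → b → a` now comes back as `Error ["a"; "b"]` rather than a crashed process.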
+21
mte/2025_03_26_freebsd-14.2.json
···+"content": "<p>CI workers <code>spring</code> and <code>summer</code> run FreeBSD and need to be updated.</p>\n\n<p>Check the current version of FreeBSD with <code>uname -a</code>.</p>\n\n<div><div><pre><code>FreeBSD summer 14.1-RELEASE-p5 FreeBSD 14.1-RELEASE-p5 GENERIC amd64\n</code></pre></div></div>\n\n<p>Run <code>freebsd-update fetch</code> to download the latest versions of the system components, particularly the <code>freebsd-update</code> utility. It even reported that it really is time to upgrade!</p>\n\n<div><div><pre><code><span># freebsd-update fetch</span>\n...\nWARNING: FreeBSD 14.1-RELEASE-p5 is approaching its End-of-Life date.\nIt is strongly recommended that you upgrade to a newer\nrelease within the next 5 days.\n</code></pre></div></div>\n\n<p>Install these updates.</p>\n\n<div><div><pre><code>freebsd-update <span>install</span>\n</code></pre></div></div>\n\n<p>Now use <code>freebsd-update</code> to fetch the 14.2-RELEASE and install it.</p>\n\n<div><div><pre><code><span># freebsd-update upgrade -r 14.2-RELEASE</span>\n...\n<span>#\u00a0freebsd-update install</span>\nsrc component not installed, skipped\nInstalling updates...\nKernel updates have been installed. 
Please reboot and run\n<span>'freebsd-update [options] install'</span> again to finish installing updates.\n</code></pre></div></div>\n\n<p>Reboot the system using <code>reboot</code> and then finish installing updates.</p>\n\n<div><div><pre><code><span># freebsd-update install</span>\nsrc component not installed, skipped\nInstalling updates...\nRestarting sshd after upgrade\nPerforming sanity check on sshd configuration.\nStopping sshd.\nWaiting <span>for </span>PIDS: 707.\nPerforming sanity check on sshd configuration.\nStarting sshd.\nScanning /usr/share/certs/untrusted <span>for </span>certificates...\nScanning /usr/share/certs/trusted <span>for </span>certificates...\nScanning /usr/local/share/certs <span>for </span>certificates...\n <span>done</span><span>.</span>\n</code></pre></div></div>\n\n<p>Now use <code>pkg</code> to upgrade any applications.</p>\n\n<div><div><pre><code><span># pkg upgrade</span>\nUpdating FreeBSD repository catalogue...\nFetching data.pkg: 100% 7 MiB 7.5MB/s 00:01 \nProcessing entries: 100%\nFreeBSD repository update completed. 35885 packages processed.\nAll repositories are up to date.\nChecking <span>for </span>upgrades <span>(</span>28 candidates<span>)</span>: 100%\nProcessing candidates <span>(</span>28 candidates<span>)</span>: 100%\nThe following 28 package<span>(</span>s<span>)</span> will be affected <span>(</span>of 0 checked<span>)</span>:\n\nInstalled packages to be UPGRADED:\n\tcurl: 8.10.1 -> 8.11.1_1\n...\n\txxd: 9.1.0764 -> 9.1.1199\n\nNumber of packages to be upgraded: 28\n\nThe process will require 3 MiB more space.\n77 MiB to be downloaded.\n\nProceed with this action? 
<span>[</span>y/N]: y\n</code></pre></div></div>\n\n<p>Finally, reboot the system and check <code>uname -a</code>.</p>\n\n<div><div><pre><code><span># uname -a</span>\nFreeBSD spring 14.2-RELEASE-p1 FreeBSD 14.2-RELEASE-p1 GENERIC amd64\n</code></pre></div></div>\n\n<p>To update the FreeBSD base images used by the CI services, I applied <a href=\"https://github.com/ocurrent/freebsd-infra/pull/13\">PR#13</a> to <a href=\"https://github.com/ocurrent/freebsd-infra\">ocurrent/freebsd-infra</a>.</p>\n\n<p>This was followed up by <a href=\"https://github.com/ocurrent/ocaml-ci/pull/1007\">PR#1007</a> on ocurrent/ocaml-ci and <a href=\"https://github.com/ocurrent/opam-repo-ci/pull/427\">PR#427</a> to ocurrent/opam-repo-ci.</p>",
+21
mte/2025_03_27_dell-poweredge-r640.json
···+"summary": "We have received our first batch of 7.68TB Kingston SSD drives for deployment in some Dell PowerEdge R640 servers, which will be used to create a large storage pool.",+"content": "<p>We have received our first batch of 7.68TB Kingston SSD drives for deployment in some Dell PowerEdge R640 servers, which will be used to create a large storage pool.</p>\n\n<p>The first job was to mount each of the drives in a caddy.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/kingston-with-caddy.png\"></p>\n\n<p>And then install them in the server.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/kingston-in-slot.png\"></p>\n\n<p>These R640 servers are equipped with the Dell PERC H740P RAID controller. They support either hardware RAID (0, 1, 5, 10, 50, etc.) or Enhanced HBA mode.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/r640-enhanced-hba.png\"></p>\n\n<p>In eHBA mode, the disks operate in a passthrough mode, presenting the raw disks to the OS; however, each disk needs to be specifically selected in an additional step after enabling eHBA mode.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/r640-jbod.png\"></p>\n\n<p>In RAID mode, one or more virtual disks need to be created to present the disks to the OS. Preconfigured profiles are available to complete this step easily.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/r640-raid5.png\"></p>\n\n<p>We will run these with a ZFS file system, so we need to decide whether we want to use the hardware RAID features or follow the advice on Wikipedia on the <a href=\"https://en.wikipedia.org/wiki/ZFS#Avoidance_of_hardware_RAID_controllers\">Avoidance of hardware RAID controllers</a>. Online opinion is divided. My summary is that hardware RAID will be easier to manage when a disk fails, but ZFS on the raw disks will have some integrity advantages.</p>",
+21
mte/2025_03_30_box-diff.json
···+"summary": "Box has an unlimited storage model but has an upload limit of 1TB per month. I have been uploading various data silos but would now like to verify that the data is all present. Box has an extensive API, but I only need the list items in folder call.",+"content": "<p>Box has an unlimited storage model but has an upload limit of 1TB per month. I have been uploading various data silos but would now like to verify that the data is all present. Box has an extensive <a href=\"https://developer.box.com/reference/\">API</a>, but I only need the <a href=\"https://developer.box.com/reference/get-folders-id-items/\">list items in folder</a> call.</p>\n\n<p>The list-items call assumes that you have a folder ID which you would like to query. The root of the tree is always ID 0. To check for the presence of file <code>foo</code> in a folder tree <code>a/b/c/foo</code>, we need to call the API with folder ID 0. This returns a list of entries in that folder. e.g.</p>\n\n<div><div><pre><code><span>{</span><span>\n </span><span>\"entries\"</span><span>:</span><span> </span><span>[</span><span>\n </span><span>{</span><span>\n </span><span>\"id\"</span><span>:</span><span> </span><span>\"12345\"</span><span>,</span><span>\n </span><span>\"type\"</span><span>:</span><span> </span><span>\"folder\"</span><span>,</span><span>\n </span><span>\"name\"</span><span>:</span><span> </span><span>\"a\"</span><span>\n </span><span>}</span><span>\n </span><span>]</span><span>\n</span><span>}</span><span>\n</span></code></pre></div></div>\n\n<p>The API must now be called again with the new ID number to get the contents of folder <code>a</code>. This is repeated until we finally have the entries for folder <code>c</code> which would contain the file itself. 
I have used a <code>Hashtbl</code> to cache the results of each call.</p>\n\n<div><div><pre><code><span>{</span><span>\n </span><span>\"entries\"</span><span>:</span><span> </span><span>[</span><span>\n </span><span>{</span><span>\n </span><span>\"id\"</span><span>:</span><span> </span><span>\"78923434\"</span><span>,</span><span>\n </span><span>\"type\"</span><span>:</span><span> </span><span>\"file\"</span><span>,</span><span>\n </span><span>\"name\"</span><span>:</span><span> </span><span>\"foo\"</span><span>\n </span><span>}</span><span>\n </span><span>]</span><span>\n</span><span>}</span><span>\n</span></code></pre></div></div>\n\n<p>Each call defaults to returning at most 100 entries. This can be increased to a maximum of 1000 by passing <code>?limit=1000</code> to the GET request. For more results, Box offers two pagination systems: <code>offset</code> and <code>marker</code>. Offset allows you to pass a starting item number along with the call, but this is limited to 10,000 entries.</p>\n\n<blockquote>\n <p>Queries with offset parameter value exceeding 10000 will be rejected with a 400 response.</p>\n</blockquote>\n\n<p>To deal with folders of any size, we should use the marker system. For this, we pass <code>?usemarker=true</code> to the first GET request, which causes the API to return <code>next_marker</code> and <code>prev_marker</code> as additional JSON properties when required. Subsequent calls would use <code>?usemarker=true&marker=XXX</code>. The end is detected by the absence of <code>next_marker</code> when no more entries are available.</p>\n\n<p>The project can be found on GitHub in <a href=\"https://github.com/mtelvers/ocaml-box-diff\">mtelvers/ocaml-box-diff</a>.</p>",
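The marker protocol can be sketched independently of HTTP. In this illustration (mine, not the project's code) `fetch` is a hypothetical stand-in for the authenticated GET: it returns one page of entries plus an optional `next_marker`, and the loop keeps requesting until the marker is absent.

```ocaml
(* One page of Box list-items results: the entry names plus the
   next_marker property, which is absent on the final page. *)
type page = { entries : string list; next_marker : string option }

(* [collect fetch] follows next_marker until it disappears.  [fetch]
   stands in for the GET with ?usemarker=true (and &marker=... on
   subsequent calls). *)
let collect fetch =
  let rec go marker acc =
    let page = fetch marker in
    let acc = acc @ page.entries in
    match page.next_marker with
    | Some m -> go (Some m) acc
    | None -> acc
  in
  go None []
```

Wiring this to the real API only requires `fetch` to perform the HTTP request and decode the two fields from the response JSON.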
+21
mte/2025_03_31_opam-post-deps.json
···+"summary": "Previously, I discussed the installation order for a simple directed acyclic graph without any cycles. However, opam packages include post dependencies. Rather than package A depending upon B where B would be installed first, post dependencies require X to be installed after Y. The post dependencies only occur in a small number of core OCaml packages. They are quite often empty and exist to direct the solver. Up until now, I had been using a base layer with an opam switch containing the base compiler and, therefore, did not need to deal with any post dependencies.",+"content": "<p>Previously, I discussed the installation order for a simple directed acyclic graph without any cycles. However, <code>opam</code> packages include <em>post</em> dependencies. Rather than package A depending upon B where B would be installed first, <em>post</em> dependencies require X to be installed after Y. The <em>post</em> dependencies only occur in a small number of core OCaml packages. They are quite often empty and exist to direct the solver. Up until now, I had been using a base layer with an opam switch containing the base compiler and, therefore, did not need to deal with any <em>post</em> dependencies.</p>\n\n<p>Here is the graph of <a href=\"https://www.tunbury.org/images/0install.2.18-with-post-with-colour.pdf\">0install</a> with <em>post</em> dependencies coloured in red.</p>\n\n<p>Removing the <em>post</em> dependencies gives an unsatisfying graph with orphaned dependencies. <a href=\"https://www.tunbury.org/images/0install.2.18-without-post.pdf\">0install without post</a>. Note <code>base-nnp.base</code> and <code>base-effects.base</code>. However, this graph can be used to produce a linear installation order. 
The orphaned packages can be removed with a recursive search.</p>\n\n<p>When opam wants to decide the installation order, it uses OCamlgraph\u2019s topological sort capability.</p>\n\n<blockquote>\n <p>This functor provides functions which allow iterating over a graph in topological order. Cycles in graphs are allowed. Specification is the following: If vertex [x] is visited before vertex [y] then either there is a path from [x] to [y], or there is no path from [y] to [x]. In the particular case of a DAG, this simplifies to: if there is an edge from [x] to [y], then [x] is visited before [y].</p>\n</blockquote>\n\n<p>The description of <code>fold</code> is particularly interesting as the order for cycles is unspecified.</p>\n\n<blockquote>\n <p>[fold action g seed] allows iterating over the graph [g] in topological order. [action node accu] is called repeatedly, where [node] is the node being visited, and [accu] is the result of the [action]\u2019s previous invocation, if any, and [seed] otherwise. If [g] contains cycles, the order is unspecified inside the cycles and every node in the cycles will be presented exactly once</p>\n</blockquote>\n\n<p>In my testing, the installation order matches the order used by opam within the variation allowed above.</p>\n\n<p>Layers can be built up using the intersection of packages installed so far and the required dependencies.</p>",
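The recursive search for orphans can be sketched as a reachability walk (my sketch, using the map/set representation from the earlier topological-sort post, not the actual implementation): keep everything reachable from the root package through the dependency sets and drop the rest.

```ocaml
module PackageSet = Set.Make (String)
module PackageMap = Map.Make (String)

(* Drop every package not reachable from [root]: a depth-first walk
   over the dependency sets collects the reachable names, then the
   map is filtered down to them.  This removes the orphans left
   behind when the post-dependency edges are deleted. *)
let prune root pkgs =
  let rec visit name reached =
    if PackageSet.mem name reached then reached
    else
      let reached = PackageSet.add name reached in
      match PackageMap.find_opt name pkgs with
      | None -> reached
      | Some deps -> PackageSet.fold visit deps reached
  in
  let reached = visit root PackageSet.empty in
  PackageMap.filter (fun name _ -> PackageSet.mem name reached) pkgs
```

On the 0install graph this would discard nodes such as `base-nnp.base` and `base-effects.base` once their only incoming edges (the post dependencies) are gone.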
+21
mte/2025_04_01_go-docker.json
···+"summary": "For some time, we have had issues on Ubuntu Noble when extracting tar files within Docker containers. See ocaml/infrastructure#121. This is only an issue on exotic architectures like RISCV and PPC64LE.",+"content": "<p>For some time, we have had issues on Ubuntu Noble when extracting\ntar files within Docker containers. See\n<a href=\"https://github.com/ocaml/infrastructure/issues/121\">ocaml/infrastructure#121</a>.\nThis is only an issue on exotic architectures like RISCV and PPC64LE.</p>\n\n<div><div><pre><code><span># docker run --rm -it ubuntu:noble</span>\nroot@cf3491db4abd:/# <span>cd\n</span>root@cf3491db4abd:~# <span>mkdir </span>foo\nroot@cf3491db4abd:~# <span>tar</span> <span>-cf</span> bar.tar foo\nroot@cf3491db4abd:~# <span>rmdir </span>foo\nroot@cf3491db4abd:~# <span>tar</span> <span>-xf</span> bar.tar\n<span>tar</span>: foo: Cannot change mode to rwxr-xr-x: Operation not permitted\n<span>tar</span>: Exiting with failure status due to previous errors\n</code></pre></div></div>\n\n<p>The combination of Docker version and <code>libseccomp2</code> version prevents\nthe container from running the <code>fchmodat2</code> system call. 
There is a\nbug report on Ubuntu\u2019s bug tracker for the issue.</p>\n\n<p>I have been working around this by building Docker from scratch.</p>\n\n<div><div><pre><code>apt <span>install </span>golang\ngit clone https://github.com/moby/moby\n<span>cd </span>moby\n<span>AUTO_GOPATH</span><span>=</span>1 ./hack/make.sh binary\n<span>mv </span>bundles/binary-daemon/<span>*</span> /usr/bin/\nservice docker restart\n</code></pre></div></div>\n\n<p>When provisioning some new RISCV machines, I have once again hit this\nissue, but now the version of Go installed by <code>apt</code> on Ubuntu Noble is\ntoo old to build Docker!</p>\n\n<div><div><pre><code>go: vendor.mod requires go >= 1.23.0 (running go 1.22.2; GOTOOLCHAIN=local)\n</code></pre></div></div>\n\n<p>As this needs to be repeated multiple times, it makes sense\nto wrap the installation steps into an Ansible Playbook.\n<a href=\"https://gist.github.com/mtelvers/ced9d981b9137c491c95780390ce802c\">golang+docker.yml</a></p>",
+21
mte/2025_04_02_ubuntu-with-zfs-root.json
···+"summary": "The installation of Ubuntu on ZFS contains about 50 steps of detailed configuration. I have 10 servers to install, so I would like to script this process as much as possible.",+"content": "<p>The installation of <a href=\"https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubuntu/Ubuntu%2022.04%20Root%20on%20ZFS.html\">Ubuntu on ZFS</a>\ncontains about 50 steps of detailed configuration. I have 10 servers to install, so I would like to script this process as much as possible.</p>\n\n<p>To test my script, I have created a new VM on VMware ESXi with 10 x 16GB\ndisks, 16GB RAM, 4 vCPU. In the advanced options, I have set the boot to\nEFI and set <code>disk.EnableUUID = \"TRUE\"</code> in the <code>.vmx</code> file. Doing this\nensures that <code>/dev/disk</code> aliases are created in the guest.</p>\n\n<p>Boot Ubuntu 24.04 from the Live CD and install SSH.</p>\n\n<div><div><pre><code><span>sudo</span> <span>-i</span>\napt update\napt <span>install </span>openssh-server <span>-y</span>\n</code></pre></div></div>\n\n<p>Use <code>wget</code> to download https://github.com/mtelvers.keys into <code>~/.ssh/authorized_keys</code>.</p>\n\n<div><div><pre><code>wget https://github.com/mtelvers.keys <span>-O</span> ~/.ssh/authorized_keys\n</code></pre></div></div>\n\n<p>In your Ansible <code>hosts</code> file, add your new machine and its IP address</p>\n\n<div><div><pre><code>your.fqdn ansible_host=<ip>\n</code></pre></div></div>\n\n<p>Run the playbook with</p>\n\n<div><div><pre><code>ansible-playbook <span>-i</span> hosts <span>--limit</span> your.fqdn ubuntu-zfs.yml\n</code></pre></div></div>\n\n<p>The playbook is available as a GitHub gist <a href=\"https://gist.github.com/mtelvers/2cbeb5e35f43f5e461aa0c14c4a0a6b8\">zfs-ubuntu.yml</a>.</p>",
+21
mte/2025_04_03_kingston-drives.json
···+"summary": "We have received the second batch of 40 x 7.68TB Kingston SSD drives, bringing the total to 50 drives.",+"content": "<p>We have received the second batch of 40 x 7.68TB Kingston SSD drives, bringing the total to 50 drives.</p>\n\n<p>We now have 5 fully populated Dell PowerEdge R640 with a total raw capacity of 384TB.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/kingston-forty-with-caddies.png\"></p>",
+21
mte/2025_04_04_opam-repo-ci.json
···+"summary": "It\u2019s Tuesday morning, and virtually all opam repo ci jobs are failing with timeouts. This comes at a critical time as these are the first jobs following the update of ocurrent/ocaml-version noted on 24th March.",+"content": "<p>It\u2019s Tuesday morning, and virtually all opam repo ci jobs are failing with timeouts. This comes at a critical time as these are the first jobs following the update of <a href=\"https://github.com/ocurrent/ocaml-version\">ocurrent/ocaml-version</a> <a href=\"https://www.tunbury.org/recent-ocaml-version/\">noted</a> on 24th March.</p>\n\n<p>The <a href=\"https://opam.ci.ocaml.org/github/ocaml/opam-repository\">opam repo ci</a> tests all PRs on <a href=\"https://github.com/ocaml/opam-repository\">opam-repository</a>. The pipeline downloads Docker images, which contain the root filesystem for various Linux distributions, architectures, and OCaml versions, which are used as the base environment to run the tests. These base images are created by the <a href=\"https://images.ci.ocaml.org\">base image builder</a>. 
<a href=\"https://github.com/ocurrent/docker-base-images/pull/317\">PR#317</a> update these base images in three ways:</p>\n\n<ul>\n <li>Images for OCaml < 4.08 were removed.</li>\n <li>The <code>opam-repository-archive</code> overlay was removed as this contained the < 4.08 opam packages.</li>\n <li>The <code>ocaml-patches-overlay</code> overlay was removed as this was only needed to build OCaml < 4.08 on GCC 14.</li>\n</ul>\n\n<p>Given these changes, I immediately assumed some element of these was the culprit.</p>\n\n<p>Here\u2019s an example of a failure as reported in the log.</p>\n\n<div><div><pre><code>2025-04-01 07:27.45 ---> using \"9dd47386dd0565c83eac2e9d589d75bdd268a7f34f3c854d1db189e7a2e5f77b\" from cache\n\n/: (user (uid 1000) (gid 1000))\n\n/: (workdir /home/opam)\n\n/home/opam: (run (shell \"sudo ln -f /usr/bin/opam-dev /usr/bin/opam\"))\n2025-04-01 07:27.45 ---> using \"132d861be153666fd67b2e16b21c4de16e15e26f8d7d42f3bcddf0360ad147be\" from cache\n\n/home/opam: (run (network host)\n (shell \"opam init --reinit --config .opamrc-sandbox -ni\"))\nConfiguring from /home/opam/.opamrc-sandbox, then /home/opam/.opamrc, and finally from built-in defaults.\nChecking for available remotes: rsync and local, git.\n - you won't be able to use mercurial repositories unless you install the hg command on your system.\n - you won't be able to use darcs repositories unless you install the darcs command on your system.\n\nThis development version of opam requires an update to the layout of /home/opam/.opam from version 2.0 to version 2.2, which can't be reverted.\nYou may want to back it up before going further.\n\nContinue? [Y/n] y\n[NOTE] The 'jobs' option was reset, its value was 39 and its new value will vary according to the current number of cores on your machine. 
You can restore the fixed value using:\n opam option jobs=39 --global\nFormat upgrade done.\n\n<><> Updating repositories ><><><><><><><><><><><><><><><><><><><><><><><><><><>\n2025-04-01 09:27.34: Cancelling: Timeout (120.0 minutes)\nJob cancelled\n2025-04-01 09:27.40: Timeout (120.0 minutes)\n</code></pre></div></div>\n\n<p>With nearly all jobs taking 2 hours to run, the cluster was understandably backlogged!</p>\n\n<p>The issue could be reproduced with this Dockerfile:</p>\n\n<div><div><pre><code>cd $(mktemp -d)\ngit clone --recursive \"https://github.com/ocaml/opam-repository.git\" && cd \"opam-repository\" && git fetch origin \"refs/pull/27696/head\" && git reset --hard 46b8cc5a\ngit fetch origin master\ngit merge --no-edit 4d8fa0fb8fce3b6c8b06f29ebcfa844c292d4f3e\ncat > ../Dockerfile <<'END-OF-DOCKERFILE'\nFROM ocaml/opam:debian-12-ocaml-4.09@sha256:13bd7f0979922adb13049eecc387d65d7846a3058f7dd6509738933e88bc8d4a\nUSER 1000:1000\nWORKDIR /home/opam\nRUN sudo ln -f /usr/bin/opam-dev /usr/bin/opam\nRUN opam init --reinit -ni\nRUN opam option solver=builtin-0install && opam config report\nENV OPAMDOWNLOADJOBS=\"1\"\nENV OPAMERRLOGLEN=\"0\"\nENV OPAMPRECISETRACKING=\"1\"\nENV CI=\"true\"\nENV OPAM_REPO_CI=\"true\"\nRUN rm -rf opam-repository/\nCOPY --chown=1000:1000 . 
opam-repository/\nRUN opam repository set-url --strict default opam-repository/\nRUN opam update --depexts || true\nRUN opam pin add -k version -yn chrome-trace.3.18.0~alpha0 3.18.0~alpha0\nRUN opam reinstall chrome-trace.3.18.0~alpha0; \\\n res=$?; \\\n test \"$res\" != 31 && exit \"$res\"; \\\n export OPAMCLI=2.0; \\\n build_dir=$(opam var prefix)/.opam-switch/build; \\\n failed=$(ls \"$build_dir\"); \\\n partial_fails=\"\"; \\\n for pkg in $failed; do \\\n if opam show -f x-ci-accept-failures: \"$pkg\" | grep -qF \"\\\"debian-12\\\"\"; then \\\n echo \"A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.\"; \\\n fi; \\\n test \"$pkg\" != 'chrome-trace.3.18.0~alpha0' && partial_fails=\"$partial_fails $pkg\"; \\\n done; \\\n test \"${partial_fails}\" != \"\" && echo \"opam-repo-ci detected dependencies failing: ${partial_fails}\"; \\\n exit 1\n\nEND-OF-DOCKERFILE\ndocker build -f ../Dockerfile .\n</code></pre></div></div>\n\n<p>It was interesting to note which jobs still work. For example, builds on macOS and FreeBSD ran normally. This makes sense as these architectures don\u2019t use the Docker base images. Looking further, opam repo ci attempts builds on opam 2.0, 2.1, 2.2, and 2.3 on Debian. These builds succeeded. Interesting. All the other builds use the latest version of opam built from the head of the master branch.</p>\n\n<p>Taking the failing Dockerfile above and replacing <code>sudo ln -f /usr/bin/opam-dev /usr/bin/opam</code> with <code>sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam</code> immediately fixed the issue!</p>\n\n<p>I pushed commit <a href=\"https://github.com/ocurrent/opam-repo-ci/commit/7174953145735a54ecf668c7387e57b3f2d2a411\">7174953</a> to force opam repo ci to use opam 2.3 and opened <a href=\"https://github.com/ocaml/opam/issues/6448\">issue#6448</a> on ocaml/opam. 
The working theory is that some change associated with <a href=\"https://github.com/ocaml/opam/pull/5892\">PR#5892</a>, which replaced GNU patch with the OCaml patch library, is the root cause.</p>\n\n<p>Musing on this issue with David, the idea of using the latest tag rather than the head commit seemed like a good compromise. This allowed us to test pre-release versions of opam as soon as they were tagged, without sitting at the cutting edge and risking the impact on a key service.</p>\n\n<p>We need the latest tag by version number, not by date, as we wouldn\u2019t want to revert to testing on, for example, 2.1.7 if something caused a new release of the 2.1 series. The result was a function which runs <code>git tag --format %(objectname) %(refname:strip=2)</code> and semantically sorts the version numbers using <code>OpamVersion.compare</code>. See <a href=\"https://github.com/ocurrent/docker-base-images/pull/318\">PR#318</a>.</p>",
+21
mte/2025_04_07_ocaml-claude-box.json
···+"summary": "Over the weekend, I decided to extend my Box tool to incorporate file upload. There is a straightforward POST API for this with a curl one-liner given in the Box documentation. Easy.",+"content": "<p>Over the weekend, I decided to extend my <a href=\"https://box.com\">Box</a> <a href=\"https://github.com/mtelvers/ocaml-box-diff\">tool</a> to incorporate file upload. There is a straightforward POST API for this with a <code>curl</code> one-liner given in the Box <a href=\"https://developer.box.com/reference/post-files-content/\">documentation</a>. Easy.</p>\n\n<p>The documentation for <a href=\"https://mirage.github.io/ocaml-cohttp/cohttp-eio/Cohttp_eio/Client/index.html\">Cohttp-eio.Client</a> only gives the function signature for <code>post</code>, but it looked pretty similar to <code>get</code>, which I had already been working with. The <a href=\"https://github.com/mirage/ocaml-cohttp\">README</a> for Cohttp gave me pause when I read this comment about multipart forms.</p>\n\n<blockquote>\n <p>Multipart form data is not supported out of the box but is provided by external libraries</p>\n</blockquote>\n\n<p>Of the three options given, the second looked abandoned, while the third said it didn\u2019t support streaming, so I went with the first one, <a href=\"https://github.com/dinosaure/multipart_form\">dinosaure/multipart_form</a>.</p>\n\n<p>The landing page included an example encoder. A couple of external functions are mentioned, and I found example code for these in <a href=\"https://github.com/dinosaure/multipart_form/blob/main/test/test.ml\">test/test.ml</a>. This built, but didn\u2019t work against Box. I ran <code>nc -l 127.0.0.1 6789</code> and set that as the API endpoint for both the <code>curl</code> command and my application. This showed I was missing the <code>Content-Type</code> header in the part boundary. 
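</p>\n\n<p>For reference, a well-formed file part in a multipart body carries its own <code>Content-Type</code> line alongside the <code>Content-Disposition</code>; the boundary, field name and filename below are illustrative:</p>\n\n<div><div><pre><code>--------------------------d74496d66958873e\nContent-Disposition: form-data; name=\"file\"; filename=\"data.bin\"\nContent-Type: application/octet-stream\n\n&lt;binary data&gt;\n</code></pre></div></div>\n\n<p>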
It should be <code>application/octet-stream</code>.</p>\n\n<p>There is a <code>~header</code> parameter to <code>part</code>, and I hoped for a <code>Header.add</code> like Cohttp\u2019s, but sadly not. See the <a href=\"https://ocaml.org/p/multipart_form/latest/doc/Multipart_form/Header/index.html\">documentation</a>. There is <code>Header.content_type</code>, but that returns the content type. How do you make one? <code>Header.of_list</code> requires a <code>Field.field list</code>.</p>\n\n<p>In a bit of frustration, I decided to ask Claude. I\u2019ve not tried it before, but I\u2019ve seen some impressive demonstrations. My first lesson here was to be specific. Claude is not a mind reader. After a few questions, I got to this:</p>\n\n<div><div><pre><code><span>Field</span><span>.(</span><span>make</span> <span>Content_type</span><span>.</span><span>name</span> <span>(</span><span>Content_type</span><span>.</span><span>v</span> <span>`Application</span> <span>`Octet_stream</span><span>));</span>\n</code></pre></div></div>\n\n<p>I can see why this was suggested, as <code>Content_disposition.v</code> exists, but <code>Content_type.v</code> does not, nor does <code>Field.make</code>. Claude quickly obliged with a new version when I pointed this out, but added the <code>Content_type</code> to the HTTP header rather than the boundary header. This went back and forth for a while, with Claude repeatedly suggesting functions which did not exist. I gave up.</p>\n\n<p>On OCaml.org, the <a href=\"https://ocaml.org/p/multipart_form/latest\">multipart_form</a> documentation includes a <em>Used by</em> section that listed <code>dream</code> as the only (external) application which used the library. From the source, I could see <code>Field.Field (field_name, Field.Content_type, v)</code>, which looked good.</p>\n\n<p>There is a function <code>Content_type.of_string</code>. 
I used <code>:MerlinLocate</code> to find the source, which turned out to be an Angstrom parser returning a <code>Content_type.t</code>. This led me to <code>Content_type.make</code>, and ultimately, I was able to write these two lines:</p>\n\n<div><div><pre><code><span>let</span> <span>v</span> <span>=</span> <span>Content_type</span><span>.</span><span>make</span> <span>`Application</span> <span>(</span><span>`Iana_token</span> <span>\"octet-stream\"</span><span>)</span> <span>Content_type</span><span>.</span><span>Parameters</span><span>.</span><span>empty</span>\n<span>let</span> <span>p0</span> <span>=</span> <span>part</span> <span>~</span><span>header</span><span>:</span><span>(</span><span>Header</span><span>.</span><span>of_list</span> <span>[</span> <span>Field</span> <span>(</span><span>Field_name</span><span>.</span><span>content_type</span><span>,</span> <span>Content_type</span><span>,</span> <span>v</span><span>)</span> <span>])</span> <span>...</span>\n</code></pre></div></div>\n\n<p>For me, as a relatively new adopter of OCaml as my language of choice, the most significant challenge is documentation, particularly when I find a library on opam which I want to use. I find this an interesting contrast with others in the community, where it is often cited that tooling is the most significant barrier to adoption. In my opinion, the time taken to set up a build environment is dwarfed by the time spent in that environment iterating on code.</p>\n\n<p>I would like to take this opportunity to thank all the contributors to the opam repository for their time and effort in making packages available. This post mentions specific packages, but only to illustrate my point.</p>",
+21
mte/2025_04_10_dell-r640-installation.json
···+"summary": "Today we have racked the five 14th generation Dell R640 servers and a Dell N4032 switch.",+"content": "<p>Today we have racked the five 14th generation Dell R640 servers and a Dell N4032 switch.</p>\n\n<p>When inspecting the rack rails, I noticed that some of the left-hand rails had an extra tab on them while the others did not. For the first server, I used a rail with a tab, only to discover that the tab prevented the server from being pushed in all the way. The tabs were easily removed, but the server needed to be removed from the rack first.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-rail.jpg\"></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-rail-removal.jpg\"></p>\n\n<p>First server installed</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-first-one.jpg\"></p>\n\n<p>The last server on the rails</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-last-one.jpg\"></p>\n\n<p>Front view</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-front-view.jpg\"></p>\n\n<p>Rear view</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-rear-view.jpg\"></p>\n\n<p>Cabling</p>\n\n<ul>\n <li>Yellow CAT5 for iDRAC ports</li>\n <li>Red CAT6 for 10GBase-T</li>\n</ul>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-cabled.jpg\"></p>\n\n<p>The initial iDRAC configuration was carried out using a crash cart.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-idrac-config.jpg\"></p>\n\n<p>The servers are called:</p>\n\n<ul>\n <li>myrina</li>\n <li>thalestris</li>\n <li>lampedo</li>\n <li>otrera</li>\n <li>antiope</li>\n</ul>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/dell-r640-final.jpg\"></p>\n\n<p>We had some difficulty with the 40G uplink from the switch, and we could only get the link to come up by splitting it into 4 x 10G channels, as 
follows.</p>\n\n<div><div><pre><code>console>enable\nconsole#configure\nconsole(config)#interface Fo1/1/1\nconsole(config-if-Fo1/1/1)#hardware profile portmode 4x10g\n</code></pre></div></div>\n\n<p>Then reboot with <code>do reload</code>. The 4 x 10G uplinks have been configured as an LACP port channel (Po1).</p>\n\n<h1>R640 Configuration</h1>\n\n<p>Each server has:</p>\n\n<ul>\n <li>2 x Intel Xeon Gold 6244 3.6G 8C / 16T</li>\n <li>8 x 16GB DIMM</li>\n <li>10 x Kingston 7.68TB SSD</li>\n</ul>\n\n<p><a href=\"https://www.dell.com/support/manuals/en-uk/poweredge-r640/per640_ism_pub/general-memory-module-installation-guidelines?guid=guid-acbc0f13-dedb-492b-a0b0-18303ded565a&lang=en-us\">Dell R640 has 24 DIMM slots</a></p>",
+21
mte/2025_04_11_dell-r640-ubuntu.json
···+"summary": "I could have scripted this via Ansible, but there would always be a manual element, such as configuring the H740P controller and booting from the network to get to the point where you can SSH to the machine. Therefore, I decided to just document the steps required.",+"content": "<p>I could have scripted this via Ansible, but there would always be a manual element, such as configuring the H740P controller and booting from the network to get to the point where you can SSH to the machine. Therefore, I decided to just document the steps required.</p>\n\n<p>After powering the system on, press F2 to open setup and follow this path through the menu</p>\n\n<div><div><pre><code>Device Configuration > Integrated RAID Controller H740P > Configure > Clear Configuration\n</code></pre></div></div>\n\n<p>then</p>\n\n<div><div><pre><code>View Server Profile > Controller Management > Advanced Controller Management > Manage Controller Mode\n</code></pre></div></div>\n\n<p>Choose <code>Switch to Enhanced HBA Controller Mode</code>, then Confirm and reset the system.</p>\n\n<p>Boot to the Ubuntu installer. I used <code>netboot.xyz</code> running in a Docker container.</p>\n\n<p>I will use a software RAID set configured by <code>mdadm</code> for the Ubuntu root drive. In this configuration, the EFI partition needs special attention as EFI does not understand software RAID. GRUB can be configured to create and update multiple copies of the EFI partition. 
For consistency, I will create an EFI partition on all the drives.</p>\n\n<p>These commands will create the following partitions:</p>\n\n<div><div><pre><code><span>for </span>a <span>in </span>sd<span>{</span>a..j<span>}</span> <span>;</span> <span>do </span>sgdisk <span>-n1</span>:1M:+512M <span>-t1</span>:EF00 /dev/<span>$a</span> <span>;</span> <span>done\nfor </span>a <span>in </span>sd<span>{</span>a..j<span>}</span> <span>;</span> <span>do </span>sgdisk <span>-n2</span>:0:+16G <span>-t2</span>:FD00 /dev/<span>$a</span> <span>;</span> <span>done\nfor </span>a <span>in </span>sd<span>{</span>a..j<span>}</span> <span>;</span> <span>do </span>sgdisk <span>-n3</span>:0:0 <span>-t3</span>:BF00 /dev/<span>$a</span> <span>;</span> <span>done</span>\n</code></pre></div></div>\n\n<p>Next, format the EFI drives with a DOS filesystem and create the RAID device:</p>\n\n<div><div><pre><code><span>for </span>a <span>in </span>sd<span>{</span>a..j<span>}</span> <span>;</span> <span>do </span>mkdosfs <span>-F</span> 32 <span>-s</span> 1 <span>-n</span> EFI /dev/<span>${</span><span>a</span><span>}</span>1 <span>;</span> <span>done\n</span>mdadm <span>--create</span> /dev/md0 <span>--metadata</span><span>=</span>1.2 <span>--level</span><span>=</span>raid5 <span>--raid-devices</span><span>=</span>10 /dev/sd[a-j]2\n</code></pre></div></div>\n\n<p>Check the partition tables with <code>sgdisk -p /dev/sda</code>, and the soft RAID setup with <code>cat /proc/mdstat</code>.</p>\n\n<p>Install Ubuntu via the setup program selecting the software RAID as the root volume and the first drive as the boot drive.</p>\n\n<p>After the system reboots, delete the current EFI entries from <code>/etc/fstab</code>:</p>\n\n<div><div><pre><code>umount /boot/efi\n<span>sed</span> <span>-i</span> <span>'/\\/efi/d'</span> /etc/fstab\n</code></pre></div></div>\n\n<p>Then add the entries for <code>/dev/sda1</code> and <code>/dev/sdb1</code>.</p>\n\n<div><div><pre><code><span>echo</span> 
/dev/disk/by-uuid/<span>$(</span>blkid <span>-s</span> UUID <span>-o</span> value /dev/sda1<span>)</span> /boot/efi vfat defaults 0 0 <span>>></span> /etc/fstab\n<span>mkdir</span> <span>-p</span> /boot/efi-alt\n<span>echo</span> /dev/disk/by-uuid/<span>$(</span>blkid <span>-s</span> UUID <span>-o</span> value /dev/sdb1<span>)</span> /boot/efi-alt vfat defaults 0 0 <span>>></span> /etc/fstab\nsystemctl daemon-reload\nmount <span>-a</span>\n</code></pre></div></div>\n\n<p>Run <code>dpkg-reconfigure grub-efi-amd64</code> to configure GRUB. Accept all of the defaults and select <code>/dev/sda1</code> and <code>/dev/sdb1</code> as the boot drives. Reboot the system.</p>\n\n<p>After the reboot, install the ZFS utils.</p>\n\n<div><div><pre><code>apt <span>install </span>zfsutils-linux\n</code></pre></div></div>\n\n<p>Create a ZFS <em>tank</em> using the <em>by-id</em> values.</p>\n\n<div><div><pre><code>zpool create <span>\\</span>\n <span>-o</span> <span>ashift</span><span>=</span>12 <span>\\</span>\n <span>-o</span> <span>autotrim</span><span>=</span>on <span>\\</span>\n <span>-O</span> <span>acltype</span><span>=</span>posixacl <span>-O</span> <span>xattr</span><span>=</span>sa <span>-O</span> <span>dnodesize</span><span>=</span>auto <span>\\</span>\n <span>-O</span> <span>normalization</span><span>=</span>formD <span>\\</span>\n <span>-O</span> <span>relatime</span><span>=</span>on <span>\\</span>\n tank raidz /dev/disk/by-id/wwn-<span>*</span><span>-part3</span>\n</code></pre></div></div>\n\n<p>Check it is available:</p>\n\n<div><div><pre><code><span># zfs list</span>\nNAME USED AVAIL REFER MOUNTPOINT\ntank 789K 61.8T 171K /tank\n</code></pre></div></div>",
+21
mte/2025_04_12_box-diff.json
···+"summary": "Over the weekend, I extended mtelvers/ocaml-box-diff to include the ability to upload files over 50MB. This is a more complex API which requires a call to https://upload.box.com/api/2.0/files/upload_sessions by posting JSON containing the name of the file, the folder ID and the file size. Box replies with various session endpoints which give the URIs to use to upload the parts and to commit the file. Box also specifies the size of each part.",+"content": "<p>Over the weekend, I extended <a href=\"https://github.com/mtelvers/ocaml-box-diff\">mtelvers/ocaml-box-diff</a> to include the ability to upload files over 50MB. This is a more complex API which requires a call to <a href=\"https://upload.box.com/api/2.0/files/upload_sessions\">https://upload.box.com/api/2.0/files/upload_sessions</a> by posting JSON containing the name of the file, the folder ID and the file size. Box replies with various <em>session endpoints</em> which give the URIs to use to upload the parts and to commit the file. Box also specifies the size of each part.</p>\n\n<p>Each part is uploaded with an HTTP PUT of the binary data, with header fields giving the byte range within the overall file along with the SHA for this chunk. Box replies with a part identifier. Once all the parts have been uploaded, an HTTP POST is required to the commit URI, passing a JSON array of all the parts as well as the overall SHA for the file.</p>\n\n<p>I was pleased to be able to reuse <code>stream_of_file</code>, which was written for the small file upload. Additionally, I was able to keep a running SHA for the data uploaded so far using <code>Sha1.update_string ctx chunk</code>, meaning that I did not need to recompute the overall file SHA at the end.</p>",
+21
mte/2025_04_13_gnu-parallel.json
···+"summary": "If you haven\u2019t used it before, or perhaps it has been so long that it has been swapped out to disk, let me commend GNU\u2019s Parallel to you.",+"content": "<p>If you haven\u2019t used it before, or perhaps it has been so long that it has been swapped out to disk, let me commend GNU\u2019s <a href=\"https://www.gnu.org/software/parallel/parallel.html\">Parallel</a> to you.</p>\n\n<p>Parallel executes shell commands in parallel! A trivial example would be <code>parallel echo ::: A B C</code>, which runs <code>echo A</code>, <code>echo B</code> and <code>echo C</code>. <code>{}</code> can be used as a placeholder for the parameter in cases where it isn\u2019t simply appended to the command line.</p>\n\n<p>Multiple parameters can be read from an input file using four colons, <code>parallel echo :::: params_file</code>. This is particularly useful as it correctly deals with parameters/file names with spaces. For example, create a tab-delimited list of source and destination paths in <code>paths.tsv</code> and then run:</p>\n\n<div><div><pre><code>parallel <span>--jobs</span> 8 <span>--colsep</span> <span>'\\t'</span> <span>--progress</span> rsync <span>-avh</span> <span>{</span>1<span>}</span> <span>{</span>2<span>}</span> :::: paths.tsv\n</code></pre></div></div>",
+21
mte/2025_04_14_slurm-workload-manager.json
···+"summary": "Sadiq mentioned slurm as a possible way to better schedule the group\u2019s compute resources. Many resources are available showing how to create batch jobs for Slurm clusters, but far fewer on how to set up a cluster. This is a quick walkthrough of the basic steps to set up a two-node compute cluster on Ubuntu 24.04. Note that slurmd and slurmctld can run on the same machine.",+"content": "<p>Sadiq mentioned <code>slurm</code> as a possible way to better schedule the group\u2019s compute resources. Many resources are available showing how to create batch jobs for Slurm clusters, but far fewer on how to set up a cluster. This is a quick walkthrough of the basic steps to set up a two-node compute cluster on Ubuntu 24.04. Note that <code>slurmd</code> and <code>slurmctld</code> can run on the same machine.</p>\n\n<p>Create three VMs: <code>node1</code>, <code>node2</code> and <code>head</code>.</p>\n\n<p>On <code>head</code>, install these components.</p>\n\n<div><div><pre><code>apt <span>install </span>munge slurmd slurmctld\n</code></pre></div></div>\n\n<p>On <code>node1</code> and <code>node2</code>, install these components.</p>\n\n<div><div><pre><code>apt <span>install </span>munge slurmd\n</code></pre></div></div>\n\n<p>Copy <code>/etc/munge/munge.key</code> from <code>head</code> to the same location on <code>node1</code> and <code>node2</code>. Then restart <code>munge</code> on the other nodes with <code>service munge restart</code>.</p>\n\n<p>You should now be able to <code>munge -n | unmunge</code> without error. This should also work via SSH, e.g. 
<code>ssh head munge -n | ssh node1 unmunge</code></p>\n\n<p>If you don\u2019t have DNS, add <code>node1</code> and <code>node2</code> to the <code>/etc/hosts</code> file on <code>head</code> and add <code>head</code> to the <code>/etc/hosts</code> on <code>node1</code> and <code>node2</code>.</p>\n\n<p>On <code>head</code>, create the daemon spool directory:</p>\n\n<div><div><pre><code><span>mkdir</span> /var/spool/slurmctld\n<span>chown</span> <span>-R</span> slurm:slurm /var/spool/slurmctld/\n<span>chmod </span>775 /var/spool/slurmctld/\n</code></pre></div></div>\n\n<p>Create <code>/etc/slurm/slurm.conf</code>, as below. Update the compute node section by running <code>slurmd -C</code> on each node to generate the configuration line. This file should be propagated to all the machines. The configuration file can be created using this <a href=\"https://slurm.schedmd.com/configurator.html\">tool</a>.</p>\n\n<div><div><pre><code>ClusterName=cluster\nSlurmctldHost=head\nProctrackType=proctrack/linuxproc\nReturnToService=1\nSlurmctldPidFile=/var/run/slurmctld.pid\nSlurmctldPort=6817\nSlurmdPidFile=/var/run/slurmd.pid\nSlurmdPort=6818\nSlurmdSpoolDir=/var/spool/slurmd\nSlurmUser=slurm\nStateSaveLocation=/var/spool/slurmctld\nTaskPlugin=task/affinity,task/cgroup\n\n# TIMERS\nInactiveLimit=0\nKillWait=30\nMinJobAge=300\nSlurmctldTimeout=120\nSlurmdTimeout=300\nWaittime=0\n\n# SCHEDULING\nSchedulerType=sched/backfill\nSelectType=select/cons_tres\n\n# LOGGING AND ACCOUNTING\nJobCompType=jobcomp/none\nJobAcctGatherFrequency=30\nSlurmctldDebug=info\nSlurmctldLogFile=/var/log/slurmctld.log\nSlurmdDebug=info\nSlurmdLogFile=/var/log/slurmd.log\n\n# COMPUTE NODES\nNodeName=node1 CPUs=1 Boards=1 SocketsPerBoard=1 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=1963\nNodeName=node2 CPUs=1 Boards=1 SocketsPerBoard=1 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=1963\nPartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP\n</code></pre></div></div>\n\n<p>On 
<code>head</code>, start the control daemon.</p>\n\n<div><div><pre><code>service slurmctld start\n</code></pre></div></div>\n\n<p>And on the nodes, start the slurm daemon.</p>\n\n<div><div><pre><code>service slurmd start\n</code></pre></div></div>\n\n<p>From <code>head</code>, you can now run a command simultaneously on both nodes.</p>\n\n<div><div><pre><code><span># srun -N2 -l /bin/hostname</span>\n0: node1\n1: node2\n</code></pre></div></div>\n\n<p>The optional <code>Gres</code> parameter on <code>NodeName</code> allows nodes to be configured with extra resources such as GPUs.</p>\n\n<p>Typical configurations use an NFS server to make /home available on all the nodes. Note that users only need to be created on the head node and don\u2019t need SSH access to the compute nodes.</p>",
+21
mte/2025_04_16_ubuntu-cloud-init.json
···+"summary": "Testing cloud-init is painful on real (server) hardware, as the faster the server, the longer it seems to take to complete POST. Therefore, I highly recommend testing with a virtual machine before moving to real hardware.",+"content": "<p>Testing cloud-init is painful on real (server) hardware, as the faster the server, the longer it seems to take to complete POST. Therefore, I highly recommend testing with a virtual machine before moving to real hardware.</p>\n\n<p>I have set up a QEMU machine to simulate the Dell R640 machines with 10 x 8T disks. I\u2019ll need to set this machine up and tear it down several times for testing, so I have wrapped the setup commands into a <code>Makefile</code>. QCOW2 is a thinly provisioned format, so you don\u2019t actually need 80T of disk space to do this!</p>\n\n<p>The Dell machines use EFI, so I have used EFI on the QEMU machine. Note the <code>OVMF</code> lines in the configuration. Ensure that you emulate a hard disk controller that is supported by the EFI BIOS. For example, <code>-device megasas,id=scsi0</code> won\u2019t boot as the EFI BIOS can\u2019t see the drives. 
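</p>\n\n<p>As an aside, the ten repetitive <code>-device</code>/<code>-drive</code> argument pairs in the recipe below could be generated with a small shell loop rather than written out longhand; this is just a sketch, not what the Makefile actually does:</p>\n\n<div><div><pre><code># Build the -device/-drive arguments for ten qcow2 disks\nargs=\"\"\nfor i in $(seq 0 9); do\n  args=\"$args -device scsi-hd,drive=drive$i,bus=scsi0.0,channel=0,scsi-id=$i,lun=0\"\n  args=\"$args -drive file=disk$i.qcow2,if=none,id=drive$i\"\ndone\necho \"$args\"\n</code></pre></div></div>\n\n<p>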
I have enabled VNC access, but I primarily used the serial console to interact with the machine.</p>\n\n<div><div><pre><code>machine: disk0.qcow2 disk1.qcow2 disk2.qcow2 disk3.qcow2 disk4.qcow2 disk5.qcow2 disk6.qcow2 disk7.qcow2 disk8.qcow2 disk9.qcow2 OVMF_VARS.fd\n\tqemu-system-x86_64 -m 8G -smp 4 -machine accel=kvm,type=pc -cpu host -display none -vnc :0 \\\n\t\t-drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd \\\n\t\t-drive if=pflash,format=raw,file=OVMF_VARS.fd \\\n\t\t-serial stdio \\\n\t\t-device virtio-scsi-pci,id=scsi0 \\\n\t\t-device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \\\n\t\t-drive file=disk0.qcow2,if=none,id=drive0 \\\n\t\t-device scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=1,lun=0 \\\n\t\t-drive file=disk1.qcow2,if=none,id=drive1 \\\n\t\t-device scsi-hd,drive=drive2,bus=scsi0.0,channel=0,scsi-id=2,lun=0 \\\n\t\t-drive file=disk2.qcow2,if=none,id=drive2 \\\n\t\t-device scsi-hd,drive=drive3,bus=scsi0.0,channel=0,scsi-id=3,lun=0 \\\n\t\t-drive file=disk3.qcow2,if=none,id=drive3 \\\n\t\t-device scsi-hd,drive=drive4,bus=scsi0.0,channel=0,scsi-id=4,lun=0 \\\n\t\t-drive file=disk4.qcow2,if=none,id=drive4 \\\n\t\t-device scsi-hd,drive=drive5,bus=scsi0.0,channel=0,scsi-id=5,lun=0 \\\n\t\t-drive file=disk5.qcow2,if=none,id=drive5 \\\n\t\t-device scsi-hd,drive=drive6,bus=scsi0.0,channel=0,scsi-id=6,lun=0 \\\n\t\t-drive file=disk6.qcow2,if=none,id=drive6 \\\n\t\t-device scsi-hd,drive=drive7,bus=scsi0.0,channel=0,scsi-id=7,lun=0 \\\n\t\t-drive file=disk7.qcow2,if=none,id=drive7 \\\n\t\t-device scsi-hd,drive=drive8,bus=scsi0.0,channel=0,scsi-id=8,lun=0 \\\n\t\t-drive file=disk8.qcow2,if=none,id=drive8 \\\n\t\t-device scsi-hd,drive=drive9,bus=scsi0.0,channel=0,scsi-id=9,lun=0 \\\n\t\t-drive file=disk9.qcow2,if=none,id=drive9 \\\n\t\t-net nic,model=virtio-net-pci,macaddr=02:00:00:00:00:01 \\\n\t\t-net bridge,br=br0\n\ndisk%.qcow2:\n\tqemu-img create -f qcow2 $@ 8T\n\nOVMF_VARS.fd:\n\tcp 
/usr/share/OVMF/OVMF_VARS.fd OVMF_VARS.fd\n\nclean:\n\trm *.qcow2 OVMF_VARS.fd\n</code></pre></div></div>\n\n<p>We are using <a href=\"https://netboot.xyz\">netboot.xyz</a> to network boot the machine via PXE. The easiest way to use netboot.xyz is to run the prebuilt Docker container. This can be set up using a <code>docker-compose.yml</code> file. Start the container with <code>docker compose up -d</code>.</p>\n\n<div><div><pre><code>version: \"2.1\"\nservices:\n  netbootxyz:\n    image: ghcr.io/netbootxyz/netbootxyz\n    container_name: netbootxyz\n    environment:\n      - NGINX_PORT=80 # optional\n      - WEB_APP_PORT=3000 # optional\n    volumes:\n      - /netbootxyz/config:/config # optional\n      - /netbootxyz/assets:/assets # optional\n    ports:\n      - 3000:3000 # optional, destination should match ${WEB_APP_PORT} variable above.\n      - 69:69/udp\n      - 8080:80 # optional, destination should match ${NGINX_PORT} variable above.\n    restart: unless-stopped\n</code></pre></div></div>\n\n<p>We have a Ubiquiti EdgeMax providing DHCP services. The DHCP options should point new clients to the Docker container.</p>\n\n<div><div><pre><code>set service dhcp-server bootfile-server doc.caelum.ci.dev\nset service dhcp-server global-parameters \"class &quot;BIOS-x86&quot; { match if option arch = 00:00; filename &quot;netboot.xyz.kpxe&quot;; }\"\nset service dhcp-server global-parameters \"class &quot;UEFI-x64&quot; { match if option arch = 00:09; filename &quot;netboot.xyz.efi&quot;; }\"\nset service dhcp-server global-parameters \"class &quot;UEFI-bytecode&quot; { match if option arch = 00:07; filename &quot;netboot.xyz.efi&quot;; }\"\n</code></pre></div></div>\n\n<p>I also recommend staging the Ubuntu installation ISO, <code>vmlinuz</code>, and <code>initrd</code> locally, as this will speed up the machine\u2019s boot time. 
The files needed are:</p>\n\n<ul>\n <li>https://releases.ubuntu.com/24.04.2/ubuntu-24.04.2-live-server-amd64.iso</li>\n <li>https://github.com/netbootxyz/ubuntu-squash/releases/download/24.04.2-dac09526/vmlinuz</li>\n <li>https://github.com/netbootxyz/ubuntu-squash/releases/download/24.04.2-dac09526/initrd</li>\n</ul>\n\n<p>Create a <code>user-data</code> file containing the following cloud-init configuration. In this case, it primarily includes the storage configuration. The goal here is to configure each disk identically, with a tiny EFI partition, an MD RAID partition and the rest given over to the ZFS datastore. Additionally, create empty files <code>meta-data</code> and <code>vendor-data</code>. None of the files have an extension. The encrypted password is <code>ubuntu</code>.</p>\n\n<div><div><pre><code>#cloud-config\nautoinstall:\n version: 1\n storage:\n config:\n - { ptable: gpt, path: /dev/sda, preserve: false, name: '', grub_device: false, id: disk-sda, type: disk }\n - { ptable: gpt, path: /dev/sdb, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdb, type: disk }\n - { ptable: gpt, path: /dev/sdc, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdc, type: disk }\n - { ptable: gpt, path: /dev/sdd, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdd, type: disk }\n - { ptable: gpt, path: /dev/sde, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sde, type: disk }\n - { ptable: gpt, path: /dev/sdf, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdf, type: disk }\n - { ptable: gpt, path: /dev/sdg, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdg, type: disk }\n - { ptable: gpt, path: /dev/sdh, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdh, type: disk }\n - { ptable: gpt, path: /dev/sdi, wipe: 
superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdi, type: disk }\n - { ptable: gpt, path: /dev/sdj, wipe: superblock-recursive, preserve: false, name: '', grub_device: false, id: disk-sdj, type: disk }\n - { device: disk-sda, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: true, offset: 1048576, id: efi-0, type: partition }\n - { device: disk-sdb, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: true, offset: 1048576, id: efi-1, type: partition }\n - { device: disk-sdc, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-2, type: partition }\n - { device: disk-sdd, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-3, type: partition }\n - { device: disk-sde, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-4, type: partition }\n - { device: disk-sdf, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-5, type: partition }\n - { device: disk-sdg, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-6, type: partition }\n - { device: disk-sdh, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-7, type: partition }\n - { device: disk-sdi, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-8, type: partition }\n - { device: disk-sdj, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: false, offset: 1048576, id: efi-9, type: partition }\n - { device: disk-sda, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-0, type: partition }\n - { device: disk-sdb, size: 16G, 
wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-1, type: partition }\n - { device: disk-sdc, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-2, type: partition }\n - { device: disk-sdd, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-3, type: partition }\n - { device: disk-sde, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-4, type: partition }\n - { device: disk-sdf, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-5, type: partition }\n - { device: disk-sdg, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-6, type: partition }\n - { device: disk-sdh, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-7, type: partition }\n - { device: disk-sdi, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-8, type: partition }\n - { device: disk-sdj, size: 16G, wipe: superblock, number: 2, preserve: false, grub_device: false, id: md-9, type: partition }\n - { device: disk-sda, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-0, type: partition }\n - { device: disk-sdb, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-1, type: partition }\n - { device: disk-sdc, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-2, type: partition }\n - { device: disk-sdd, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-3, type: partition }\n - { device: disk-sde, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-4, type: partition }\n - { device: disk-sdf, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-5, type: partition }\n - { device: disk-sdg, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, 
id: zfs-6, type: partition }\n - { device: disk-sdh, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-7, type: partition }\n - { device: disk-sdi, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-8, type: partition }\n - { device: disk-sdj, size: -1, wipe: superblock, number: 3, preserve: false, grub_device: false, id: zfs-9, type: partition }\n - { name: md0, raidlevel: raid5, devices: [ md-0, md-1, md-2, md-3, md-4, md-5, md-6, md-7, md-8, md-9 ], spare_devices: [], preserve: false, wipe: superblock, id: raid-0, type: raid }\n - { fstype: fat32, volume: efi-0, preserve: false, id: efi-dos-0, type: format }\n - { fstype: fat32, volume: efi-1, preserve: false, id: efi-dos-1, type: format }\n - { fstype: ext4, volume: raid-0, preserve: false, id: root-ext4, type: format }\n - { path: /, device: root-ext4, id: mount-2, type: mount }\n - { path: /boot/efi, device: efi-dos-0, id: mount-0, type: mount }\n - { path: /boot/efi-alt, device: efi-dos-1, id: mount-1, type: mount }\n identity:\n hostname: ubuntu-server\n password: \"$6$exDY1mhS4KUYCE/2$zmn9ToZwTKLhCw.b4/b.ZRTIZM30JZ4QrOQ2aOXJ8yk96xpcCof0kxKwuX1kqLG/ygbJ1f8wxED22bTL4F46P0\"\n username: ubuntu\n ssh:\n install-server: yes\n authorized-keys:\n - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA7UrJmBFWR3c7jVzpoyg4dJjON9c7t9bT9acfrj6G7i\n allow-pw: no\n packages:\n - zfsutils-linux\n user-data:\n disable_root: false\n</code></pre></div></div>\n\n<p>The binaries and configuration files should be stored in the assets folder used by netbootxyz.</p>\n\n<div><div><pre><code>/netbootxyz/assets/r640/initrd\n/netbootxyz/assets/r640/meta-data\n/netbootxyz/assets/r640/ubuntu-24.04.2-live-server-amd64.iso\n/netbootxyz/assets/r640/user-data\n/netbootxyz/assets/r640/vendor-data\n/netbootxyz/assets/r640/vmlinuz\n</code></pre></div></div>\n\n<p>The kernel command line used for iPXE needs to include <code>autoinstall</code> and 
<code>ds=nocloud;s=http://your_server</code>. We could modify one of the existing <code>ipxe</code> scripts to do this, but it is more flexible to create <code>/netbootxyz/config/menus/MAC-020000000001.ipxe</code> where <code>020000000001</code> represents the MAC address <code>02:00:00:00:00:01</code> and should be updated to reflect the actual server\u2019s MAC address.</p>\n\n<div><div><pre><code>#!ipxe\n\n# Set a timeout (in milliseconds) for automatic selection\nset timeout 30000\n\n# Define a title for the menu\n:start\nmenu Boot Menu\nitem --key 1 local Boot from local hdd\nitem --key 2 ubuntu Autoinstall Ubuntu Noble\nitem --key r reboot Reboot system\nitem --key x exit Exit to iPXE shell\nchoose --timeout ${timeout} --default local option && goto ${option}\n\n# boot local system\n:local\necho Booting from local disks ...\nexit 1\n\n# Ubuntu boot configuration\n:ubuntu\nimgfree\necho Autoinstall Ubuntu Noble...\nset base-url http://doc.caelum.ci.dev:8080/r640\nkernel ${base-url}/vmlinuz\ninitrd ${base-url}/initrd\nimgargs vmlinuz root=/dev/ram0 ramdisk_size=3500000 cloud-config-url=/dev/null ip=dhcp url=${base-url}/ubuntu-24.04.2-live-server-amd64.iso initrd=initrd.magic console=ttyS0,115200n8 autoinstall ds=nocloud;s=${base-url}\nboot || goto failed\n\n# Error handling\n:failed\necho Boot failed, waiting 5 seconds...\nsleep 5\ngoto start\n\n# Reboot option\n:reboot\nreboot\n\n# Exit to shell\n:exit\necho Exiting to iPXE shell...\nexit\n</code></pre></div></div>\n\n<p>With this setup, we can now boot a machine from the network and automatically install Ubuntu with our chosen disk configuration.</p>",
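The per-host menu filename convention described above can be sketched in a couple of lines of shell (the MAC address and menu path are the ones from the post; nothing else is assumed):

```shell
# Derive the netboot.xyz per-host menu filename from a MAC address:
# strip the colons and prefix with "MAC-", as described in the post.
mac="02:00:00:00:00:01"
name="MAC-$(printf '%s' "$mac" | tr -d ':')"
path="/netbootxyz/config/menus/${name}.ipxe"
echo "$path"   # -> /netbootxyz/config/menus/MAC-020000000001.ipxe
```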
+21
mte/2025_04_19_gluster.json
+21
mte/2025_04_19_gluster.json
···+"summary": "Gluster is a free and open-source software network filesystem. It has been a few years since I last looked at the project, and I was interested in taking another look. Some features, like automatic tiering of hot/cold data, have been removed, and the developers now recommend dm-cache with LVM instead.",+"content": "<p>Gluster is a free and open-source software network filesystem. It has been a few years since I last looked at the project, and I was interested in taking another look. Some features, like automatic tiering of hot/cold data, have been removed, and the developers now recommend <code>dm-cache</code> with LVM instead.</p>\n\n<p>I am going to use four QEMU VMs on which I have installed Ubuntu via PXE boot. For easy repetition, I have wrapped my <code>qemu-system-x86_64</code> commands into a <code>Makefile</code>.</p>\n\n<div><div><pre><code>machine: disk0.qcow2 disk1.qcow2 OVMF_VARS.fd\n qemu-system-x86_64 -m 8G -smp 4 -machine accel=kvm,type=pc -cpu host -display none -vnc :11 \\\n -drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd \\\n -drive if=pflash,format=raw,file=OVMF_VARS.fd \\\n -serial stdio \\\n -device virtio-scsi-pci,id=scsi0 \\\n -device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \\\n -drive file=disk0.qcow2,if=none,id=drive0 \\\n -device scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=1,lun=0 \\\n -drive file=disk1.qcow2,if=none,id=drive1 \\\n -net nic,model=virtio-net-pci,macaddr=02:00:00:00:00:11 \\\n -net bridge,br=br0\n\ndisk%.qcow2:\n qemu-img create -f qcow2 $@ 1T\n\nOVMF_VARS.fd:\n cp /usr/share/OVMF/OVMF_VARS.fd OVMF_VARS.fd\n\nclean:\n rm -f *.qcow2 OVMF_VARS.fd\n</code></pre></div></div>\n\n<p>Gluster works on any file system that supports extended attributes <em>xattr</em>, which includes <code>ext[2-4]</code>. However, XFS is typically used as it performs well with parallel read/write operations and large files. 
I have used 512-byte inodes, <code>-i size=512</code>, which is recommended as this creates extra space for the extended attributes.</p>\n\n<div><div><pre><code>mkfs.xfs <span>-i</span> <span>size</span><span>=</span>512 /dev/sdb\n<span>mkdir</span> <span>-p</span> /gluster/sdb\n<span>echo</span> <span>\"/dev/sdb /gluster/sdb xfs defaults 0 0\"</span> <span>>></span> /etc/fstab\nmount <span>-a</span>\n</code></pre></div></div>\n\n<p>With the filesystem prepared, install and start Gluster. Gluster stores its settings in <code>/var/lib/glusterd</code>, so if you need to reset your installation, stop the gluster daemon and remove that directory.</p>\n\n<div><div><pre><code>apt <span>install </span>glusterfs-server\nsystemctl <span>enable </span>glusterd\nsystemctl start glusterd\n</code></pre></div></div>\n\n<p>From one node, probe all the other nodes. You can do this by IP address or by hostname.</p>\n\n<div><div><pre><code>gluster peer probe node222\ngluster peer probe node200\ngluster peer probe node152\n</code></pre></div></div>\n\n<p><code>gluster pool list</code> should now list all the nodes. <code>localhost</code> indicates your current host.</p>\n\n<div><div><pre><code>UUID Hostname State\n8d2a1ef0-4c23-4355-9faa-8f3387054d41 node222 Connected\n4078f192-b2bb-4c74-a588-35d5475dedc7 node200 Connected\n5b2fc21b-b0ab-401e-9848-3973121bfec7 node152 Connected\nd5878850-0d40-4394-8dd8-b9b0d4266632 localhost Connected\n</code></pre></div></div>\n\n<p>Now we need to add a volume. A Gluster volume can be distributed, replicated or dispersed. It is possible to combine distributed with either of the other two types, giving a distributed replicated volume or a distributed dispersed volume. Briefly, distributed splits the data across the nodes without redundancy but gives a performance advantage. Replicated creates 2 or more copies of the data. 
Dispersed uses erasure coding, which can be considered as RAID5 over nodes.</p>\n\n<p>Once a volume has been created, it needs to be started. The commands to create and start the volume only need to be executed on one of the nodes.</p>\n\n<div><div><pre><code>gluster volume create vol1 disperse 4 transport tcp node<span>{</span>200,222,223,152<span>}</span>:/gluster/sdb/vol1\ngluster volume start vol1\n</code></pre></div></div>\n\n<p>On each node, or on a remote machine, you can now mount the Gluster volume. Here I have mounted it to <code>/mnt</code> from the node itself. All writes to <code>/mnt</code> will be dispersed to the other nodes.</p>\n\n<div><div><pre><code>echo \"localhost:/vol1 /mnt glusterfs defaults 0 0\" >> /etc/fstab\nmount -a\n</code></pre></div></div>\n\n<p>The volume can be inspected with <code>gluster volume info</code>.</p>\n\n<div><div><pre><code>Volume Name: vol1\nType: Disperse\nVolume ID: 31e165b2-da96-40b2-bc09-e4607a02d14b\nStatus: Started\nSnapshot Count: 0\nNumber of Bricks: 1 x (3 + 1) = 4\nTransport-type: tcp\nBricks:\nBrick1: node200:/gluster/sdb/vol1\nBrick2: node222:/gluster/sdb/vol1\nBrick3: node223:/gluster/sdb/vol1\nBrick4: node152:/gluster/sdb/vol1\nOptions Reconfigured:\nnetwork.ping-timeout: 4\nstorage.fips-mode-rchecksum: on\ntransport.address-family: inet\nnfs.disable: on\n</code></pre></div></div>\n\n<p>In initial testing, any file operation on the mounted volume appeared to hang when a node went down. This is because Gluster has a default timeout of 42 seconds. This command will set a lower value:</p>\n\n<div><div><pre><code>gluster volume set vol1 network.ping-timeout 4\n</code></pre></div></div>\n\n<p>The video below shows the four VMs running. One is writing random data to <code>/mnt/random</code>. The other machines are running <code>ls -phil /mnt</code> so we can watch the file growing. <code>node222</code> is killed, and after the 4-second pause, the other nodes continue. 
When the node is rebooted, it automatically recovers.</p>\n\n\n\n<blockquote>\n <p>While I used 4 nodes, this works equally well with 3 nodes.</p>\n</blockquote>",
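The capacity trade-off of a dispersed volume can be made concrete with a small sketch. The 4-brick, redundancy-1 layout matches the `disperse 4` volume above (its info output shows `1 x (3 + 1) = 4`); the 1 TiB brick size is an assumption taken from the qcow2 disk images, ignoring filesystem overhead:

```shell
# Usable capacity of a dispersed (erasure-coded) volume:
# n bricks in total, r of which hold redundancy data.
n=4             # bricks: node200, node222, node223, node152
r=1             # redundancy: the volume survives one node failure
brick_gb=1024   # assumed brick size in GiB (illustrative)
usable_gb=$(( brick_gb * (n - r) ))
raw_gb=$(( brick_gb * n ))
echo "raw ${raw_gb} GiB, usable ${usable_gb} GiB"
```

So one brick's worth of space is spent on redundancy, analogous to the parity disk in RAID5.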
+21
mte/2025_04_21_clock-winder-repair.json
+21
mte/2025_04_21_clock-winder-repair.json
···+"summary": "The galvanised steel wire rope on one of my clock winders has snapped. This is a 3mm rope, so it would have a rating of greater than 500 kg. I am quite surprised that it snapped, as the load on this wire rope is much lower than that of others in use in the same system.",+"content": "<p>The galvanised steel wire rope on one of my clock winders has snapped. This is a 3mm rope, so it would have a rating of greater than 500 kg. I am quite surprised that it snapped, as the load on this wire rope is much lower than that of others in use in the same system.</p>\n\n<p>I suspect that the failure is due to the pulley. There is a significant gap between the frame and the pulley wheel where the wire may get jammed (right-hand picture). My initial thought was to 3D-print a spacer washer, but instead, I was able to squash the entire assembly, removing all the play while still allowing the pulley to rotate (left-hand picture).</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/aylesford-pulley.jpg\"></p>\n\n<p>When the clock is being wound, either by hand or via the clock winder, the tension is removed from the drive wheel, resulting in a reduced impulse on the escapement. In early versions of the winder, I had ignored the counterweight by tying it out of the way, but this caused the clock to lose almost 10 minutes per day. The counterweight is an ingeniously simple workaround which keeps tension on the drive wheel by pulling on one of the gear teeth. This particular part of the clock winder lifts the counterweight before the winder lifts the weight.</p>",
+21
mte/2025_04_21_ubuntu-dm-cache.json
+21
mte/2025_04_21_ubuntu-dm-cache.json
···+"summary": "dm-cache has been part of the mainline Linux kernel for over a decade, making it possible for faster SSD and NVMe drives to be used as a cache within a logical volume. This technology brief from Dell gives a good overview of dm-cache and the performance benefits. Skip to the graph on page 25, noting the logarithmic scale.",+"content": "<p><a href=\"https://en.wikipedia.org/wiki/Dm-cache\">dm-cache</a> has been part of the mainline Linux kernel for over a decade, making it possible for faster SSD and NVMe drives to be used as a cache within a logical volume. <a href=\"https://videos.cdn.redhat.com/summit2015/presentations/17856_getting-the-most-out-of-your-nvme-ssd.pdf\">This technology brief from Dell</a> gives a good overview of <code>dm-cache</code> and the performance benefits. Skip to the graph on page 25, noting the logarithmic scale.</p>\n\n<p>Given a system with a small SATADOM module, <code>/dev/sdd</code>, an SSD drive <code>/dev/sdc</code> and a couple of large-capacity spinning disks, <code>/dev/sd[ab]</code>, can we use cloud-init to configure RAID1 on the capacity disks with the SSD being used as a cache?</p>\n\n<p>Unfortunately, the <code>storage:</code> / <code>config:</code> nodes are not very flexible when it comes to even modest complexity. For example, given an LVM volume group consisting of multiple disk types, it isn\u2019t possible to create a logical volume on a specific disk as <code>devices:</code> is not a parameter to <code>lvm_partition</code>. It is also not possible to specify <code>raid: raid1</code>.</p>\n\n<p>I have taken the approach of creating two volume groups, <code>vg_raid</code> and <code>vg_cache</code>, on disks <code>/dev/sd[ab]</code> and <code>/dev/sdc</code>, respectively, thereby forcing the use of the correct devices. On the <code>vg_raid</code> group, I have created a single logical volume without RAID. 
On <code>vg_cache</code>, I have created the two cache volumes, <code>lv-cache</code> and <code>lv-cache-meta</code>.</p>\n\n<p>The <code>lv-cache</code> and <code>lv-cache-meta</code> should be sized in the ratio 1000:1.</p>\n\n<p>As the final step of the installation, I used <code>late-commands</code> to configure the system as I want it. These implement RAID1 for the root logical volume, deactivate the two cache volumes as a necessary step before merging <code>vg_raid</code> and <code>vg_cache</code>, create the cache pool from the cache volumes, and finally enable the cache. The cache pool can be either <em>writethrough</em> or <em>writeback</em>, with the default being <em>writethrough</em>. In this mode, data is written to both the cache and the original volume, so a failure in the cache device doesn\u2019t result in any data loss. <em>Writeback</em> has better performance as writes initially only go to the cache volume and are only written to the original volume later.</p>\n\n<div><div><pre><code>lvconvert -y --type raid1 -m 1 /dev/vg_raid/lv_data\nlvchange -an vg_cache/lv_cache\nlvchange -an vg_cache/lv_cache_meta\nvgmerge vg_raid vg_cache\nlvconvert -y --type cache-pool --poolmetadata vg_raid/lv_cache_meta vg_raid/lv_cache\nlvconvert -y --type cache --cachemode writethrough --cachepool vg_raid/lv_cache vg_raid/lv_data\n</code></pre></div></div>\n\n<p>I have placed <code>/boot</code> and <code>/boot/efi</code> on the SATADOM so that the system can be booted.</p>\n\n<p>My full configuration is given below.</p>\n\n<div><div><pre><code>#cloud-config\nautoinstall:\n version: 1\n storage:\n config:\n # Define the physical disks\n - { id: disk-sda, type: disk, ptable: gpt, path: /dev/sda, preserve: false }\n - { id: disk-sdb, type: disk, ptable: gpt, path: /dev/sdb, preserve: false }\n - { id: disk-sdc, type: disk, ptable: gpt, path: /dev/sdc, preserve: false }\n - { id: disk-sdd, type: disk, ptable: gpt, path: /dev/sdd, preserve: false }\n\n # Define the 
partitions\n - { id: efi-part, type: partition, device: disk-sdd, size: 512M, wipe: superblock, flag: boot, number: 1, preserve: false, grub_device: true, offset: 1048576}\n - { id: boot-part, type: partition, device: disk-sdd, size: 1G, wipe: superblock, number: 2, preserve: false, grub_device: false }\n\n # Create volume groups\n - { id: vg-raid, type: lvm_volgroup, name: vg_raid, devices: [disk-sda, disk-sdb] }\n - { id: vg-cache, type: lvm_volgroup, name: vg_cache, devices: [disk-sdc] }\n\n # Create logical volume which will be for RAID\n - { id: lv-data, type: lvm_partition, volgroup: vg-raid, name: lv_data, size: 1000G, preserve: false}\n\n # Create cache metadata logical volume on SSD VG (ratio 1000:1 with cache data)\n - { id: lv-cache-meta, type: lvm_partition, volgroup: vg-cache, name: lv_cache_meta, size: 1G, preserve: false }\n\n # Create cache data logical volume on SSD VG\n - { id: lv-cache, type: lvm_partition, volgroup: vg-cache, name: lv_cache, size: 1000G, preserve: false }\n\n # Format the volumes\n - { id: root-fs, type: format, fstype: ext4, volume: lv-data, preserve: false }\n - { id: efi-fs, type: format, fstype: fat32, volume: efi-part, preserve: false }\n - { id: boot-fs, type: format, fstype: ext4, volume: boot-part, preserve: false }\n\n # Mount the volumes\n - { id: mount-1, type: mount, path: /, device: root-fs }\n - { id: mount-2, type: mount, path: /boot, device: boot-fs }\n - { id: mount-3, type: mount, path: /boot/efi, device: efi-fs }\n identity:\n hostname: unnamed-server\n password: \"$6$exDY1mhS4KUYCE/2$zmn9ToZwTKLhCw.b4/b.ZRTIZM30JZ4QrOQ2aOXJ8yk96xpcCof0kxKwuX1kqLG/ygbJ1f8wxED22bTL4F46P0\"\n username: mte24\n ssh:\n install-server: yes\n authorized-keys:\n - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA7UrJmBFWR3c7jVzpoyg4dJjON9c7t9bT9acfrj6G7i mark.elvers@tunbury.org\n allow-pw: no\n packages:\n - lvm2\n - thin-provisioning-tools\n user-data:\n disable_root: false\n late-commands:\n - lvconvert -y --type raid1 -m 1 
/dev/vg_raid/lv_data\n - lvchange -an vg_cache/lv_cache\n - lvchange -an vg_cache/lv_cache_meta\n - vgmerge vg_raid vg_cache\n - lvconvert -y --type cache-pool --poolmetadata vg_raid/lv_cache_meta vg_raid/lv_cache\n - lvconvert -y --type cache --cachemode writethrough --cachepool vg_raid/lv_cache vg_raid/lv_data\n</code></pre></div></div>",
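The 1000:1 sizing rule for the cache volumes can be checked with a line of shell arithmetic; with the 1000G <code>lv_cache</code> from the configuration above, the metadata volume comes out at the 1G used in the config:

```shell
# Size lv_cache_meta from lv_cache at the recommended 1000:1 ratio.
cache_mb=$(( 1000 * 1024 ))     # lv_cache: 1000G expressed in MiB
meta_mb=$(( cache_mb / 1000 ))  # metadata: 1/1000th of the cache
echo "lv_cache ${cache_mb} MiB -> lv_cache_meta ${meta_mb} MiB"
```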
+21
mte/2025_04_22_ocaml-fedora-gcc.json
+21
mte/2025_04_22_ocaml-fedora-gcc.json
···+"summary": "Late last week, @MisterDA added Fedora 42 support to the Docker base image builder. The new base images attempted to build over the weekend, but there have been a few issues!",+"content": "<p>Late last week, @MisterDA added Fedora 42 support to the <a href=\"https://images.ci.ocaml.org\">Docker base image builder</a>. The new base images attempted to build over the weekend, but there have been a few issues!</p>\n\n<p>The code I had previously added to force Fedora 41 to use the DNF version 5 syntax was specifically for version 41. For reference, the old syntax was <code>yum groupinstall -y 'C Development Tools and Libraries'</code>, and the new syntax is <code>yum group install -y 'c-development'</code>. Note the extra space.</p>\n\n<div><div><pre><code><span>let</span> <span>c_devtools_libs</span> <span>:</span> <span>(</span><span>t</span><span>,</span> <span>unit</span><span>,</span> <span>string</span><span>,</span> <span>t</span><span>)</span> <span>format4</span> <span>=</span>\n <span>match</span> <span>d</span> <span>with</span>\n <span>|</span> <span>`Fedora</span> <span>`V41</span> <span>-></span> <span>{</span><span>|</span><span>\"c-development\"</span><span>|</span><span>}</span>\n <span>|</span> <span>`Fedora</span> <span>_</span> <span>-></span> <span>{</span><span>|</span><span>\"C Development Tools and Libraries\"</span><span>|</span><span>}</span>\n <span>|</span> <span>_</span> <span>-></span> <span>{</span><span>|</span><span>\"Development Tools\"|}\n...\nlet dnf_version = match d with `Fedora `V41 -> 5 | _ -> 3\n</span></code></pre></div></div>\n\n<p>To unburden ourselves of this maintenance in future releases, I have inverted the logic so unmatched versions will use the new syntax.</p>\n\n<div><div><pre><code><span>let</span> <span>(</span><span>dnf_version</span><span>,</span> <span>c_devtools_libs</span><span>)</span> <span>:</span> <span>int</span> <span>*</span> <span>(</span><span>t</span><span>,</span> 
<span>unit</span><span>,</span> <span>string</span><span>,</span> <span>t</span><span>)</span> <span>format4</span> <span>=</span>\n <span>match</span> <span>d</span> <span>with</span>\n <span>|</span> <span>`Fedora</span>\n <span>(</span> <span>`V21</span> <span>|</span> <span>`V22</span> <span>|</span> <span>`V23</span> <span>|</span> <span>`V24</span> <span>|</span> <span>`V25</span> <span>|</span> <span>`V26</span> <span>|</span> <span>`V27</span> <span>|</span> <span>`V28</span> <span>|</span> <span>`V29</span>\n <span>|</span> <span>`V30</span> <span>|</span> <span>`V31</span> <span>|</span> <span>`V32</span> <span>|</span> <span>`V33</span> <span>|</span> <span>`V34</span> <span>|</span> <span>`V35</span> <span>|</span> <span>`V36</span> <span>|</span> <span>`V37</span> <span>|</span> <span>`V38</span>\n <span>|</span> <span>`V39</span> <span>|</span> <span>`V40</span> <span>)</span> <span>-></span>\n <span>(</span><span>3</span><span>,</span> <span>{</span><span>|</span><span>\"C Development Tools and Libraries\"</span><span>|</span><span>})</span>\n <span>|</span> <span>`Fedora</span> <span>_</span> <span>-></span> <span>(</span><span>5</span><span>,</span> <span>{</span><span>|</span><span>\"c-development\"</span><span>|</span><span>})</span>\n <span>|</span> <span>_</span> <span>-></span> <span>(</span><span>3</span><span>,</span> <span>{</span><span>|</span><span>\"Development Tools\"</span><span>|</span><span>})</span>\n</code></pre></div></div>\n\n<p>Fedora 42 also removed <code>awk</code>, so it now needs to be specifically included as a dependency. However, this code is shared with Oracle Linux, which does not have a package called <code>awk</code>. Fortunately, both have a package called <code>gawk</code>!</p>\n\n<p>The next issue is that Fedora 42 is the first of the distributions we build base images for that has moved to GCC 15, specifically GCC 15.0.1. 
This breaks all versions of OCaml < 4.14.</p>\n\n<p>The change concerns the declaration below (see <code>runtime/caml/prims.h</code>), which previously gave no information about the number or type of parameters:</p>\n\n<div><div><pre><code><span>typedef</span> <span>value</span> <span>(</span><span>*</span><span>c_primitive</span><span>)();</span>\n</code></pre></div></div>\n\n<p>Under C23, it now means that there are no parameters, i.e.:</p>\n\n<div><div><pre><code><span>typedef</span> <span>value</span> <span>(</span><span>*</span><span>c_primitive</span><span>)(</span><span>void</span><span>);</span>\n</code></pre></div></div>\n\n<p>This is caused by a change in the compiler\u2019s default C language version. See the <a href=\"https://gcc.gnu.org/gcc-15/changes.html\">GCC change log</a>:</p>\n\n<blockquote>\n <p>C23 by default: GCC 15 changes the default language version for C compilation from <code>-std=gnu17</code> to <code>-std=gnu23</code>. If your code relies on older versions of the C standard, you will need to either add <code>-std=</code> to your build flags, or port your code; see the porting notes.</p>\n</blockquote>\n\n<p>Also see the <a href=\"https://gcc.gnu.org/gcc-15/porting_to.html#c23\">porting notes</a>, and <a href=\"https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118112\">this bug report</a>.</p>\n\n<p>This is <em>not</em> an immediate problem as OCaml-CI and opam-repo-ci only test against OCaml 4.14.2 and 5.3.0 on Fedora. I have opened <a href=\"https://github.com/ocurrent/docker-base-images/issues/320\">issue#320</a> to track this problem.</p>",
+21
mte/2025_04_23_blade-allocation.json
+21
mte/2025_04_23_blade-allocation.json
···+"summary": "Equinix has stopped commercial sales of Metal and will sunset the service at the end of June 2026. Equinix has long been a supporter of OCaml and has provided free credits to use on their Metal platform. These credits expire at the end of this month, meaning that we need to move some of our services away from Equinix. We have two new four-node blade servers, which will become the new home for these services. The blades have dual 10C/20T processors with either 192GB or 256GB of RAM and a combination of SSD and spinning disk.",+"content": "<p>Equinix has stopped commercial sales of Metal and will sunset the service at the end of June 2026. Equinix has long been a supporter of OCaml and has provided free credits to use on their Metal platform. These credits expire at the end of this month, meaning that we need to move some of our services away from Equinix. We have two new four-node blade servers, which will become the new home for these services. 
The blades have dual 10C/20T processors with either 192GB or 256GB of RAM and a combination of SSD and spinning disk.</p>\n\n<p>192GB, 20C/40T with 1.1TB SSD, 2 x 6T disks</p>\n<ul>\n <li>rosemary: FreeBSD CI Worker (releasing spring & summer)</li>\n <li>oregano: OpenBSD CI Worker (releasing bremusa)</li>\n <li>basil: docs-ci (new implementation, eventually replacing eumache)</li>\n <li>mint: spare</li>\n</ul>\n\n<p>256GB, 20C/40T with 1.5TB SSD, 2 x 8T disks</p>\n<ul>\n <li>thyme: Equinix c2-2 (registry.ci.dev)</li>\n <li>chives: Equinix c2-4 (opam-repo-ci) + Equinix c2-3 (OCaml-ci) + Equinix c2-1 (preview.dune.dev)</li>\n</ul>\n\n<p>256GB, 20C/40T with 1.1TB SSD, 2 x 6T disks</p>\n<ul>\n <li>dill: spare</li>\n <li>sage: spare</li>\n</ul>\n\n<p>VMs currently running on hopi can be redeployed to chives, allowing hopi to be redeployed.</p>\n\n<p>Machines which can then be recycled are:</p>\n<ul>\n <li>sleepy (4C)</li>\n <li>grumpy (4C)</li>\n <li>doc (4C)</li>\n <li>spring (8T)</li>\n <li>tigger</li>\n <li>armyofdockerness</li>\n</ul>",
+21
mte/2025_04_24_infra-map.json
+21
mte/2025_04_24_infra-map.json
···+"summary": "Yesterday, we were talking about extending the current infrastructure database to incorporate additional information, such as prompts to return machines to the pool of resources after they have completed their current role/loan, etc. There is also a wider requirement to bring these services back to Cambridge from Equinix/Scaleway, which will be the subject of a follow-up post. However, the idea of extending the database made me think that it would be amusing to overlay the machines\u2019 positions onto Google Maps.",+"content": "<p>Yesterday, we were talking about extending the current infrastructure database to incorporate additional information, such as prompts to return machines to the pool of resources after they have completed their current role/loan, etc. There is also a wider requirement to bring these services back to Cambridge from Equinix/Scaleway, which will be the subject of a follow-up post. However, the idea of extending the database made me think that it would be amusing to overlay the machines\u2019 positions onto Google Maps.</p>\n\n<p>I added positioning data to the Jekyll collection <code>_machines/*.md</code> for each machine, e.g. 
<a href=\"https://raw.githubusercontent.com/ocaml/infrastructure/refs/heads/master/_machines/ainia.md\">ainia.md</a></p>\n\n<div><div><pre><code>---\nname: ainia\n...\nlatitude: 52.2109\nlongitude: 0.0917\n---\n</code></pre></div></div>\n\n<p>Then Jekyll\u2019s Liquid templating engine can create a JavaScript array for us.</p>\n\n<div><div><pre><code>\n <span>// Define machines data array from Jekyll collection</span>\n <span>const</span> <span>machinesData</span> <span>=</span> <span>[</span>\n <span>{</span><span>%</span> <span>for</span> <span>machine</span> <span>in</span> <span>site</span><span>.</span><span>machines</span> <span>%</span><span>}</span>\n <span>{</span><span>%</span> <span>if</span> <span>machine</span><span>.</span><span>latitude</span> <span>and</span> <span>machine</span><span>.</span><span>longitude</span> <span>%</span><span>}</span>\n <span>{</span>\n <span>name</span><span>:</span> <span>\"</span><span>{{ machine.name }}</span><span>\"</span><span>,</span>\n <span>lat</span><span>:</span> <span>{{</span> <span>machine</span><span>.</span><span>latitude</span> <span>}},</span>\n <span>lng</span><span>:</span> <span>{{</span> <span>machine</span><span>.</span><span>longitude</span> <span>}},</span>\n <span>{</span><span>%</span> <span>if</span> <span>machine</span><span>.</span><span>description</span> <span>%</span><span>}</span>\n <span>description</span><span>:</span> <span>\"</span><span>{{ machine.description | escape }}</span><span>\"</span><span>,</span>\n <span>{</span><span>%</span> <span>endif</span> <span>%</span><span>}</span>\n <span>// Add any other properties you need</span>\n <span>},</span>\n <span>{</span><span>%</span> <span>endif</span> <span>%</span><span>}</span>\n <span>{</span><span>%</span> <span>endfor</span> <span>%</span><span>}</span>\n <span>];</span>\n\n</code></pre></div></div>\n\n<p>This array can be converted into an array of map markers. 
Google has an API for clustering the markers into a count of machines. I added a random offset to each location to avoid all the markers piling up on a single spot.</p>\n\n<p>The interactive map can be seen at <a href=\"https://infra.ocaml.org/machines.html\">machines.html</a>.</p>",
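The random offset for co-located markers can be sketched as below. The base coordinates are the ones from <code>ainia.md</code>; the ±0.005° spread is an assumption for illustration, as the post doesn't state the value actually used:

```shell
# Add a small random jitter to a marker position so machines at the
# same site don't stack on exactly the same map point.
lat=52.2109
lng=0.0917
jlat=$(awk -v v="$lat" 'BEGIN { srand(); printf "%.4f", v + (rand() - 0.5) * 0.01 }')
jlng=$(awk -v v="$lng" 'BEGIN { srand(); printf "%.4f", v + (rand() - 0.5) * 0.01 }')
echo "$jlat,$jlng"
```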
+21
mte/2025_04_25_blade-reallocation.json
+21
mte/2025_04_25_blade-reallocation.json
···+"summary": "We have changed our mind about using dm-cache in the SSD/RAID1 configuration. The current thinking is that the mechanical drives would be better served as extra capacity for our distributed ZFS infrastructure, where we intend to have two copies of all data, and these disks represent ~100TB of storage.",+"content": "<p>We have changed our mind about using <code>dm-cache</code> in the SSD/RAID1 configuration. The current thinking is that the mechanical drives would be better served as extra capacity for our distributed ZFS infrastructure, where we intend to have two copies of all data, and these disks represent ~100TB of storage.</p>\n\n<p>As mentioned previously, we have a deadline of Wednesday, 30th April, to move the workloads from the Equinix machines or incur hosting fees.</p>\n\n<p>I also noted that the SSD capacity is 1.7TB in all cases. The new distribution is:</p>\n\n<ul>\n <li>rosemary: FreeBSD CI Worker (releasing spring & summer)</li>\n <li>oregano: OpenBSD CI Worker (releasing bremusa)</li>\n <li>basil: Equinix c2-2 (registry.ci.dev)</li>\n <li>mint: @mte24 workstation</li>\n <li>thyme: spare</li>\n <li>chives: Equinix c2-4 (opam-repo-ci) + Equinix c2-3 (OCaml-ci) + Equinix c2-1 (preview.dune.dev)</li>\n <li>dill: spare</li>\n <li>sage: docs-ci (new implementation, eventually replacing eumache)</li>\n</ul>",
+21
mte/2025_04_25_bluesky-ssh-authentication.json
+21
mte/2025_04_25_bluesky-ssh-authentication.json
···+"summary": "If you have signed up to tangled.sh, you will have published your SSH public key on the Bluesky ATproto network. Have a browse to your Bluesky ID, or mine. Look under sh.tangled.publicKey.",+"content": "<p>If you have signed up to <a href=\"https://tangled.sh\">tangled.sh</a>, you will have published your SSH public key on the Bluesky ATproto network. Have a browse to your Bluesky ID, or <a href=\"https://www.atproto-browser.dev/at/did:plc:476rmswt6ji7uoxyiwjna3ti\">mine</a>. Look under <code>sh.tangled.publicKey</code>.</p>\n\n<p><a href=\"https://github.com/mtelvers/bluesky-ssh-key-extractor.git\">BlueSky ATproto SSH Public Key Extractor</a> extracts this public key information and outputs one public key at a time. The format is suitable for use with the <code>AuthorizedKeysCommand</code> parameter in your <code>/etc/ssh/sshd_config</code> file.</p>\n\n<p>Build the project:</p>\n\n<div><div><pre><code>opam <span>install</span> <span>.</span> <span>--deps-only</span>\ndune build\n</code></pre></div></div>\n\n<p>Install the binary by copying it to the local system. 
Setting the ownership and permissions is essential.</p>\n\n<div><div><pre><code><span>cp </span>_build/install/default/bin/bluesky-ssh-key-extractor /usr/local/bin\n<span>chmod </span>755 /usr/local/bin/bluesky-ssh-key-extractor\n<span>chown </span>root:root /usr/local/bin/bluesky-ssh-key-extractor\n</code></pre></div></div>\n\n<p>Test that the command is working:</p>\n\n<div><div><pre><code><span>$ </span>bluesky-ssh-key-extractor mtelvers.tunbury.org\nssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA7UrJmBFWR3c7jVzpoyg4dJjON9c7t9bT9acfrj6G7i mark.elvers@tunbury.org\n</code></pre></div></div>\n\n<p>If that works, then edit your <code>/etc/ssh/sshd_config</code>:</p>\n\n<div><div><pre><code>AuthorizedKeysCommand /usr/local/bin/bluesky-ssh-key-extractor your_bluesky_handle\nAuthorizedKeysCommandUser nobody\n</code></pre></div></div>\n\n<p>Now you should be able to SSH to the machine using your published key.</p>\n\n<div><div><pre><code>ssh root@your_host\n</code></pre></div></div>\n\n<blockquote>\n <p>Note, this program was intended as a proof of concept rather than something you\u2019d actually use.</p>\n</blockquote>\n\n<p>If you have a 1:1 mapping between Bluesky accounts and system usernames, you might get away with:</p>\n\n<div><div><pre><code>AuthorizedKeysCommand /usr/local/bin/bluesky-ssh-key-extractor %u.bsky.social\nAuthorizedKeysCommandUser nobody\n</code></pre></div></div>",
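Since sshd silently ignores malformed lines from an <code>AuthorizedKeysCommand</code>, it is worth sanity-checking the output shape. This sketch validates a sample line against the <code>type key [comment]</code> format; the sample key is the one shown in the post, while the check itself is my own illustration, not part of the tool:

```shell
# Minimal shape check for an OpenSSH authorized_keys line:
# a known key type followed by a base64 blob (and optional comment).
line='ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIA7UrJmBFWR3c7jVzpoyg4dJjON9c7t9bT9acfrj6G7i mark.elvers@tunbury.org'
case "$line" in
  'ssh-ed25519 '*|'ssh-rsa '*|'ecdsa-sha2-'*) result=valid ;;
  *) result=invalid ;;
esac
echo "$result"   # -> valid
```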
+21
mte/2025_04_26_bluesky-ssh-authentication-2.json
+21
mte/2025_04_26_bluesky-ssh-authentication-2.json
···+"summary": "Addressing the glaring omissions from yesterday\u2019s proof of concept, such as the fact that you could sign in as any user, you couldn\u2019t revoke access, all hosts had the same users, and there was no mapping between Bluesky handles and POSIX users, I have updated mtelvers/bluesky-ssh-key-extractor and newly published mtelvers/bluesky-collection.",+"content": "<p>Addressing the glaring omissions from yesterday\u2019s proof of concept, such as the fact that you could sign in as any user, you couldn\u2019t revoke access, all hosts had the same users, and there was no mapping between Bluesky handles and POSIX users, I have updated <a href=\"https://github.com/mtelvers/bluesky-ssh-key-extractor\">mtelvers/bluesky-ssh-key-extractor</a> and newly published <a href=\"https://github.com/mtelvers/bluesky-collection.git\">mtelvers/bluesky-collection</a>.</p>\n\n<p>The tool creates ATProto collections using <code>app.bsky.graph.list</code> and populates them with <code>app.bsky.graph.listitem</code> records.</p>\n\n<p>Each list should be named with a friendly identifier such as the FQDN of the host being secured. List entries have a <code>subject_did</code>, which is the DID of the user you are giving access to, and a <code>displayName</code>, which is used as the POSIX username on the system you are connecting to.</p>\n\n<p>A typical usage would be creating a collection and adding records. Here I have made a collection called <code>rosemary.caelum.ci.dev</code> and then added two users, <code>anil.recoil.org</code> and <code>mtelvers.tunbury.org</code>, with POSIX usernames of <code>avsm2</code> and <code>mte24</code> respectively. 
See my <a href=\"https://www.atproto-browser.dev/at/did:plc:476rmswt6ji7uoxyiwjna3ti\">Bluesky record</a>.</p>\n\n<div><div><pre><code>bluesky_collection create --handle mtelvers.tunbury.org --password *** --collection rosemary.caelum.ci.dev\nbluesky_collection add --handle mtelvers.tunbury.org --password *** --collection rosemary.caelum.ci.dev --user-handle anil.recoil.org --user-id avsm2\nbluesky_collection add --handle mtelvers.tunbury.org --password *** --collection rosemary.caelum.ci.dev --user-handle mtelvers.tunbury.org --user-id mte24\n</code></pre></div></div>\n\n<p>When authenticating using SSHD, the companion tool <a href=\"https://github.com/mtelvers/bluesky-ssh-key-extractor\">mtelvers/bluesky-ssh-key-extractor</a> would have command line parameters of the Bluesky user account holding the collection, the collection name (aka the hostname), and the POSIX username (provided by SSHD). The authenticator queries the Bluesky network to find the collection matching the FQDN, then scans the list entries, comparing each to the given POSIX user. If there is a match, the <code>subject_did</code> is used to look up the associated <code>sh.tangled.publicKey</code>. The authenticator requires no password to access Bluesky, as all the records are public.</p>",
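The lookup step — collection named after the FQDN, then POSIX username mapped to a DID — can be sketched in a few lines of Python. The dict shapes here are assumptions mirroring the `displayName`/`subject_did` fields described in the post, not the tool's actual types, and the DIDs are placeholders.

```python
def did_for_posix_user(list_items, posix_user):
    # Each list item maps a POSIX username (displayName) to the DID
    # whose sh.tangled.publicKey records should be served for login.
    for item in list_items:
        if item["displayName"] == posix_user:
            return item["subject_did"]
    return None  # no match: sshd receives no keys and denies the login

# Entries mirroring the example collection rosemary.caelum.ci.dev
# (placeholder DIDs, for illustration only).
items = [
    {"displayName": "avsm2", "subject_did": "did:plc:placeholder-anil"},
    {"displayName": "mte24", "subject_did": "did:plc:placeholder-mark"},
]
print(did_for_posix_user(items, "mte24"))
```

Returning nothing on a miss is what makes revocation work: remove the list item and the next login attempt finds no keys.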
+21
mte/2025_04_27_ocaml-ci.json
+21
mte/2025_04_27_ocaml-ci.json
···+"summary": "As noted on Thursday, the various OCaml services will need to be moved away from Equinix. Below are my notes on moving OCaml-CI.",+"content": "<p>As noted on Thursday, the various OCaml services will need to be moved away from Equinix. Below are my notes on moving OCaml-CI.</p>\n\n<p>Generate an SSH key on the new server <code>chives</code> using <code>ssh-keygen -t ed25519</code>. Copy the public key to <code>c2-3.equinix.ci.dev</code> and save it under <code>~/.ssh/authorized_keys</code>.</p>\n\n<p>Use <code>rsync</code> to mirror the Docker volumes. <code>-z</code> did improve performance as there appears to be a rate limiter somewhere in the path.</p>\n\n<div><div><pre><code>rsync <span>-azvh</span> <span>--progress</span> c2-3.equinix.ci.dev:/var/lib/docker/volumes/ /var/lib/docker/volumes/\n</code></pre></div></div>\n\n<p>After completing the copy, I waited for a quiet moment, and then scaled all of the Docker services to 0. I prefer to scale the services rather than remove them, as the recovery is much easier.</p>\n\n<div><div><pre><code>docker service scale <span>infra_grafana</span><span>=</span>0\ndocker service scale <span>infra_prometheus</span><span>=</span>0\ndocker service scale ocaml-ci_ci<span>=</span>0\ndocker service scale ocaml-ci_gitlab<span>=</span>0\ndocker service scale ocaml-ci_web<span>=</span>0\n</code></pre></div></div>\n\n<p>For the final copy, I used <code>--checksum</code> and also added <code>--delete</code>, as the Prometheus database creates segment files that are periodically merged into the main database.</p>\n\n<div><div><pre><code>rsync <span>-azvh</span> <span>--checksum</span> <span>--delete</span> <span>--progress</span> c2-3.equinix.ci.dev:/var/lib/docker/volumes/ /var/lib/docker/volumes/\n</code></pre></div></div>\n\n<p>The machine configuration is held in an Ansible Playbook, which includes the Docker stack for Grafana and Prometheus. 
It can be easily applied to the new machine:</p>\n\n<div><div><pre><code>ansible-playbook <span>-e</span> @secrets/ocaml.ci.dev.yml <span>--vault-password-file</span> secrets/vault-password ocaml.ci.dev.yml\n</code></pre></div></div>\n\n<p>OCaml-CI\u2019s Docker stack is held on GitHub <a href=\"https://github.com/ocurrent/ocaml-ci\">ocurrent/ocaml-ci</a> and can be deployed with:</p>\n\n<div><div><pre><code>make deploy-stack\n</code></pre></div></div>",
+21
mte/2025_04_29_distributed-zfs-storage.json
+21
mte/2025_04_29_distributed-zfs-storage.json
···+"summary": "Following Anil\u2019s note, we will design and implement a distributed storage archive system for ZFS volumes and associated metadata. Metadata here refers to key information about the dataset itself:",+"content": "<p>Following Anil\u2019s <a href=\"https://anil.recoil.org/notes/syncoid-sanoid-zfs\">note</a>, we will design and implement a distributed storage archive system for ZFS volumes and associated metadata. <em>Metadata</em> here refers to key information about the dataset itself:</p>\n\n<ul>\n <li>A summary of what the dataset is</li>\n <li>Data retention requirement (both legal and desirable)</li>\n <li>Time/effort/cost required to reproduce the data</li>\n <li>Legal framework under which the data is available, restrictions on the distribution of the data, etc.</li>\n</ul>\n\n<p>And also refers to the more <em>systems</em> style meanings such as:</p>\n\n<ul>\n <li>Size of the dataset</li>\n <li>List of machines/ZFS pools where the data is stored</li>\n <li>Number and distribution of copies required</li>\n <li>Snapshot and replication frequency/policy</li>\n</ul>\n\n<p>These data will be stored in a JSON/YAML or other structured file format.</p>\n\n<p>The system would have a database of machines and their associated storage (disks/zpools/etc) and location. Each item of storage would have a \u2018failure domain\u2019 to logically group resources for redundancy. This would allow copies of a dataset to be placed in different domains to meet the redundancy requirements. For example, given that we are committed to holding two distinct copies of the data, would we use RAIDZ on the local disks or just a dynamic stripe, RAID0, to maximise capacity?</p>\n\n<p>While under development, the system will output recommended actions - shell commands - to perform the snapshot and replication steps necessary to meet the replication and redundancy policies. 
Ultimately, these commands could be executed automatically.</p>\n\n<p>Utilising ZFS encryption, the remote pools can be stored as an encrypted filesystem without the encryption keys.</p>\n\n<p>When the data is being processed, it will be staged locally on the worker\u2019s NVMe drive for performance, and the resultant dataset <em>may</em> be uploaded with a new dataset of metadata.</p>",
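As a minimal sketch of the metadata record and the failure-domain placement check described above — every field name here is a hypothetical assumption, since no schema has been settled:

```python
# Hypothetical dataset metadata record; all field names are assumptions.
dataset = {
    "name": "tank/experiments",
    "summary": "Raw capture data; expensive to reproduce",
    "copies_required": 2,
    "replicas": [
        {"host": "chives", "pool": "tank", "failure_domain": "machine-room"},
        {"host": "rosemary", "pool": "tank", "failure_domain": "offsite"},
    ],
}

def placement_ok(ds) -> bool:
    # Redundancy is only met when copies land in enough *distinct*
    # failure domains, not merely on enough machines.
    domains = {r["failure_domain"] for r in ds["replicas"]}
    return len(domains) >= ds["copies_required"]

print(placement_ok(dataset))
```

A check like this is also what justifies the RAID0-vs-RAIDZ question: if the policy already guarantees two copies in distinct domains, local striping for capacity becomes defensible.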
+21
mte/2025_04_29_equinix-moves.json
+21
mte/2025_04_29_equinix-moves.json
···+"summary": "The moves of registry.ci.dev, opam-repo-ci, and get.dune.build have followed the template of OCaml-CI. Notable differences have been that I have hosted get.dune.build in a VM, as the services required very little disk space or CPU/RAM. For opam-repo-ci, the rsync was pretty slow, so I tried running multiple instances using GNU parallel with marginal gains.",+"content": "<p>The moves of registry.ci.dev, opam-repo-ci, and get.dune.build have followed the template of <a href=\"https://www.tunbury.org/ocaml-ci/\">OCaml-CI</a>. Notable differences have been that I have hosted <code>get.dune.build</code> in a VM, as the services required very little disk space or CPU/RAM. For opam-repo-ci, the <code>rsync</code> was pretty slow, so I tried running multiple instances using GNU parallel with marginal gains.</p>\n\n<div><div><pre><code><span>cd</span> /var/lib/docker/volumes2/opam-repo-ci_data/_data/var/job\n<span>ls</span> <span>-d</span> <span>*</span> | parallel <span>-j</span> 5 rsync <span>-azh</span> c2-4.equinix.ci.dev:/var/lib/docker/volumes/opam-repo-ci_data/_data/var/job/<span>{}</span>/ <span>{}</span>/\n</code></pre></div></div>\n\n<p>The Ansible configuration script for OCaml-CI is misnamed as it configures the machine and deploys infrastructure: Caddy, Grafana, Prometheus and Docker secrets, but not the Docker stack. The Docker stack for OCaml-CI is deployed by <code>make deploy-stack</code> from <a href=\"https://github.com/ocurrent/ocaml-ci\">ocurrent/ocaml-ci</a>. Conversely, opam-repo-ci <em>is</em> deployed from the Ansible playbook, but there is a <code>Makefile</code> and an outdated <code>stack.yml</code> in <a href=\"https://github.com/ocurrent/opam-repo-ci\">ocurrent/opam-repo-ci</a>.</p>\n\n<p>As part of the migration away from Equinix, these services have been merged into a single large machine <code>chives.caelum.ci.dev</code>. 
With this change, I have moved the Docker stack configuration for opam-repo-ci back to the repository <a href=\"https://github.com/ocurrent/opam-repo-ci/pull/428\">PR#428</a> and merged and renamed the machine configuration <a href=\"https://github.com/mtelvers/ansible/pull/44\">PR#44</a>.</p>\n\n<p>We want to thank Equinix for supporting OCaml over the years.</p>",
+21
mte/2025_04_29_raptor-talos-ii.json
+21
mte/2025_04_29_raptor-talos-ii.json
···+"summary": "We have two Raptor Computing Talos II POWER9 machines. One of these has had issues for some time and cannot run for more than 20 minutes before locking up completely. Over the last few days, our second machine has exhibited similar issues and needs to be power-cycled every ~24 hours. I spent some time today trying to diagnose the issue with the first machine, removing the motherboard as recommended by Raptor support, to see if the issue still exists with nothing else connected. Sadly, it does. I noted that a firmware update is available, which would move from v2.00 to v2.10.",+"content": "<p>We have two Raptor Computing Talos II POWER9 machines. One of these has had issues for some time and cannot run for more than 20 minutes before locking up completely. Over the last few days, our second machine has exhibited similar issues and needs to be power-cycled every ~24 hours. I spent some time today trying to diagnose the issue with the first machine, removing the motherboard as recommended by Raptor support, to see if the issue still exists with nothing else connected. Sadly, it does. I noted that a firmware update is available, which would move from v2.00 to v2.10.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-computing.jpeg\"></p>",
+21
mte/2025_05_01_removing-mdadm.json
+21
mte/2025_05_01_removing-mdadm.json
···+"summary": "Cloud providers automatically configure their machines as they expect you to use them. For example, a machine with 4 x 8T disks might come configured with an mdadm RAID5 array spanning the disks. This may be what most people want, but we don\u2019t want this configuration, as we want to see the bare disks. Given you have only a serial console (over SSH) and no access to the cloud-init environment, how do you boot the machine in a different configuration?",+"content": "<p>Cloud providers automatically configure their machines as they expect you to use them. For example, a machine with 4 x 8T disks might come configured with an mdadm RAID5 array spanning the disks. This may be what most people want, but we don\u2019t want this configuration, as we want to see the bare disks. Given you have only a serial console (over SSH) and no access to the cloud-init environment, how do you boot the machine in a different configuration?</p>\n\n<p>Example configuration:</p>\n\n<div><div><pre><code>$ lsblk\nNAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS\nfd0 2:0 1 4K 0 disk\nsda 8:0 0 4G 0 disk\n\u251c\u2500sda1 8:1 0 512M 0 part /boot/efi\n\u2514\u2500sda2 8:2 0 3.5G 0 part\n \u2514\u2500md0 9:0 0 10.5G 0 raid5 /\nsdb 8:16 0 4G 0 disk\n\u2514\u2500sdb1 8:17 0 4G 0 part\n \u2514\u2500md0 9:0 0 10.5G 0 raid5 /\nsdc 8:32 0 4G 0 disk\n\u2514\u2500sdc1 8:33 0 4G 0 part\n \u2514\u2500md0 9:0 0 10.5G 0 raid5 /\nsdd 8:48 0 4G 0 disk\n\u2514\u2500sdd1 8:49 0 4G 0 part\n \u2514\u2500md0 9:0 0 10.5G 0 raid5 /\n</code></pre></div></div>\n\n<p>My initial approach was to create a tmpfs root filesystem and then use <code>pivot_root</code> to switch it. This worked except <code>/dev/md0</code> was still busy, so I could not unmount it.</p>\n\n<p>It occurred to me that I could remove one of the partitions from the RAID5 set and use that as the new root disk. <code>mdadm --fail /dev/md0 /dev/sda2</code>, followed by <code>mdadm --remove /dev/md0 /dev/sda2</code> frees up a disk. 
<code>debootstrap</code> can then be used to install Ubuntu on the partition. As we have a working system, we can preserve the key configuration settings such as <code>/etc/hostname</code>, <code>/etc/netplan</code>, <code>/etc/fstab</code> etc by just copying them from <code>/etc</code> to <code>/mnt/etc</code>. Unfortunately, Ansible\u2019s copy module does not preserve ownership. Therefore, I used <code>rsync</code> instead. <code>/etc/fstab</code> must be edited to reflect the new root partition.</p>\n\n<p>Lastly, run <code>grub-install</code> using <code>chroot</code> to the new environment and reboot.</p>\n\n<div><div><pre><code># lsblk\nNAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS\nfd0 2:0 1 4K 0 disk\nsda 8:0 0 4G 0 disk\n\u251c\u2500sda1 8:1 0 512M 0 part /boot/efi\n\u2514\u2500sda2 8:2 0 3.5G 0 part /\nsdb 8:16 0 4G 0 disk\n\u2514\u2500sdb1 8:17 0 4G 0 part\nsdc 8:32 0 4G 0 disk\n\u2514\u2500sdc1 8:33 0 4G 0 part\nsdd 8:48 0 4G 0 disk\n\u2514\u2500sdd1 8:49 0 4G 0 part\n</code></pre></div></div>\n\n<p>The redundant RAID5 partitions can be removed with <code>wipefs -af /dev/sd[b-d]</code></p>\n\n<p>I have wrapped all the steps in an Ansible <a href=\"https://gist.github.com/mtelvers/1fe3571830d982eb8adbcf5a513edb2c\">playbook</a>, which is available as a GitHub gist.</p>\n\n<h1>Addendum</h1>\n\n<p>I had tested this in QEMU with EFI under the assumption that a newly provisioned cloud machine would use EFI. 
However, when I ran the script against the machine, I found it used a legacy bootloader, and it was even more complicated than I had envisioned, as there were three separate MDADM arrays in place:</p>\n\n<div><div><pre><code># cat /proc/mdstat \nPersonalities : [raid1] [raid6] [raid5] [raid4] [raid0] [raid10] \nmd2 : active raid5 sdb4[0] sdd4[2] sda4[4] sdc4[1]\n 34252403712 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]\n bitmap: 2/86 pages [8KB], 65536KB chunk\n\nmd1 : active raid5 sdd3[1] sda3[2] sdc3[0] sdb3[4]\n 61381632 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]\n \nmd0 : active raid1 sdd2[1] sda2[2] sdb2[3] sdc2[0]\n 523264 blocks super 1.2 [4/4] [UUUU]\n \nunused devices: <none>\n</code></pre></div></div>\n\n<p>With <code>lsblk</code> showing four disks each configured as below:</p>\n\n<div><div><pre><code>NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS\nsda 8:0 0 10.9T 0 disk \n\u251c\u2500sda1 8:1 0 1M 0 part \n\u251c\u2500sda2 8:2 0 512M 0 part \n\u2502 \u2514\u2500md0 9:0 0 511M 0 raid1 \n\u2502 \u2514\u2500md0p1 259:0 0 506M 0 part /boot\n\u251c\u2500sda3 8:3 0 19.5G 0 part \n\u2502 \u2514\u2500md1 9:1 0 58.5G 0 raid5 \n\u2502 \u2514\u2500md1p1 259:1 0 58.5G 0 part /\n\u251c\u2500sda4 8:4 0 10.6T 0 part \n\u2502 \u2514\u2500md2 9:2 0 31.9T 0 raid5 \n\u2502 \u2514\u2500md2p1 259:2 0 31.9T 0 part /data\n\u2514\u2500sda5 8:5 0 512M 0 part [SWAP]\n</code></pre></div></div>\n\n<p>The boot device is a RAID1 mirror (four copies), so removing one of these copies is no issue. There is also a 1MB BIOS boot partition first to give some space for GRUB. The root device was RAID5 as I had anticipated.</p>\n\n<p>The playbook could be adapted: double up on the <code>mdadm</code> commands to break two arrays, update two entries in <code>/etc/fstab</code> and use <code>grub-pc</code> rather than <code>grub-efi-amd64</code>. 
The updated playbook is <a href=\"https://gist.github.com/mtelvers/ba3b7a5974b50422e2c2e594bed0bdb2\">here</a>.</p>\n\n<p>For testing, I installed Ubuntu using this <a href=\"https://gist.github.com/mtelvers/d2d333bf5c9bd94cb905488667f0cae1\">script</a> to simulate the VM.</p>\n\n<p>Improvements could be made: <code>/boot</code> could be merged into <code>/</code>, as there is no reason to separate them when not using EFI. There never <em>needed</em> to be a <code>/boot</code>, as GRUB2 will boot from a RAID5 MDADM array.</p>\n\n<p>The system is a pretty minimal installation of Ubuntu; a more typical set of tools could be installed with:</p>\n\n<div><div><pre><code>apt install ubuntu-standard\n</code></pre></div></div>",
+21
mte/2025_05_02_zfs-send-streams.json
+21
mte/2025_05_02_zfs-send-streams.json
···+"summary": "We often say that ZFS is an excellent replicated file system, but not the best local filesystem. This led me to think that if we run zfs send on one machine, we might want to write that out as a different filesystem. Is that even possible?",+"content": "<p>We often say that ZFS is an excellent replicated file system, but not the best <em>local</em> filesystem. This led me to think that if we run <code>zfs send</code> on one machine, we might want to write that out as a different filesystem. Is that even possible?</p>\n\n<p>What is in a ZFS stream?</p>\n\n<div><div><pre><code>fallocate <span>-l</span> 10G temp.zfs\nzpool create tank <span>`</span><span>pwd</span><span>`</span>/temp.zfs \nzfs create tank/home\n<span>cp </span>README.md /tank/home\nzfs snapshot tank/home@send\nzfs send tank/home@send | hexdump\n</code></pre></div></div>\n\n<p>I spent a little time writing an OCaml application to parse the record structure before realising that there already was a tool to do this: <code>zstreamdump</code>. Using the <code>-d</code> flag shows the contents; you can see your file in the dumped output.</p>\n\n<div><div><pre><code>zfs send tank/home@send | zstreamdump <span>-d</span>\n</code></pre></div></div>\n\n<p>However, this is <em>not</em> like a <code>tar</code> file. It is not a list of file names and their content. It is a list of block changes. ZFS is a tree structure with a snapshot and a volume being tree roots. The leaves of the tree may be unchanged between two snapshots. <code>zfs send</code> operates at the block level below the file system layer.</p>\n\n<p>To emphasise this point, consider a <code>ZVOL</code> formatted as XFS. 
The structure of the send stream is the same: a record of block changes.</p>\n\n<div><div><pre><code>zfs create <span>-V</span> 1G tank/vol\nmkfs.xfs /dev/zvol/tank/vol\nzfs snapshot tank/vol@send\nzfs send tank/vol@send | zstreamdump <span>-d</span>\n</code></pre></div></div>\n\n<p>ZVOLs are interesting as they give you a snapshot capability on a file system that doesn\u2019t have one. However, some performance metrics I saw posted online showed disappointing results compared with creating a file and using a loopback device. Furthermore, the snapshot would only be in a crash-consistent state, as the file system would be unaware of the snapshot being taken. XFS does have <code>xfsdump</code> and <code>xfsrestore</code>, but they are pretty basic tools.</p>\n\n<p>[1] See also the <a href=\"https://openzfs.org/wiki/Documentation/ZfsSend\">ZfsSend Documentation</a>.</p>",
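To illustrate the point that a send stream is a sequence of block changes rather than a file archive, here is a toy replay of (offset, data) records into a flat image. This is purely illustrative and not the real `dmu_replay_record` format: the receiver reproduces the blocks faithfully, but recovering file names would mean interpreting the filesystem structures itself.

```python
# Toy model: a stream of (offset, bytes) WRITE records applied to a
# flat image, the way a block-level replication stream is consumed.
def replay(image: bytearray, records) -> bytearray:
    for offset, data in records:
        image[offset:offset + len(data)] = data
    return image

image = bytearray(32)
# Hypothetical records: a data block and a metadata block; nothing in
# the stream itself says which file the bytes belong to.
records = [(0, b"block for README.md"), (24, b"meta")]
replay(image, records)
print(bytes(image))
```

This is why writing a `zfs send` stream out "as a different filesystem" is not straightforward: the stream carries blocks below the file layer, so a converter would have to understand ZFS's on-disk tree to reconstruct files.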
+21
mte/2025_05_05_ventoy.json
+21
mte/2025_05_05_ventoy.json
···+"summary": "I need to install a chunky Windows application (90GB download, +250 GB install), but all my Windows VMs are pretty small, so I decided to use a spare Dell OptiPlex 7090. It had Windows 10 installed, but it was pretty messy from the previous use, so I decided to install Windows 11. I had a Windows 11 ISO on hand, so I wrote that to a USB memory stick using the Raspberry Pi Imaging tool (effectively dd in this use case). The machine booted without issue, but the installation failed, citing \u201cA media driver your computer needs is missing\u201d. This error looked familiar: a mass storage driver was missing. I often see this in QEMU or similar situations, and it\u2019s also common on server hardware. However, pressing Shift-F10 and opening diskpart showed all my storage.",+"content": "<p>I need to install a chunky Windows application (90GB download, +250 GB install), but all my Windows VMs are pretty small, so I decided to use a spare Dell OptiPlex 7090. It had Windows 10 installed, but it was pretty messy from the previous use, so I decided to install Windows 11. I had a Windows 11 ISO on hand, so I wrote that to a USB memory stick using the Raspberry Pi Imaging tool (effectively <code>dd</code> in this use case). The machine booted without issue, but the installation failed, citing \u201cA media driver your computer needs is missing\u201d. This error looked familiar: a mass storage driver was missing. I often see this in QEMU or similar situations, and it\u2019s also common on server hardware. However, pressing Shift-F10 and opening <code>diskpart</code> showed all my storage.</p>\n\n<p>It\u2019s been a while since I installed Windows on real hardware. Mostly, I use QEMU and an ISO and an <code>autounattend.xml</code> or PXE boot with Windows Deployment Services and Microsoft Deployment Toolkit. 
It seems that some time ago, the ISO files that Microsoft publish started to contain files that were larger than the standard allows, and thus, the <code>dd</code> approach to creating an image no longer works.</p>\n\n<p>Microsoft produces a USB creation tool, but I couldn\u2019t see how to tell it to use the ISO file that I already had! This happily led me to <a href=\"https://www.ventoy.net/en/index.html\">Ventoy</a>. The tool installs a small bootloader (~30M) on the memory stick and formats the rest as an exFAT partition. Copy your ISO file(s) to the exFAT partition, and boot the machine from the memory stick. You are then presented with a simple menu allowing you to boot from any of the ISO files. I couldn\u2019t help myself, I had to try to see if another OS would work too!</p>",
+21
mte/2025_05_06_freebsd-uefi.json
+21
mte/2025_05_06_freebsd-uefi.json
···+"summary": "I had assumed that booting FreeBSD over the network using iPXE would be pretty simple. There is even a freebsd.ipxe file included with Netboot.xyz. However, I quickly realised that most of the Internet wisdom on this process centred around legacy BIOS rather than UEFI. When booting with UEFI, the Netboot.xyz menu omits the FreeBSD option as it only supports legacy BIOS. Even in legacy mode, it uses memdisk from the Syslinux project rather than a FreeBSD loader.",+"content": "<p>I had assumed that booting FreeBSD over the network using iPXE would be pretty simple. There is even a <code>freebsd.ipxe</code> file included with Netboot.xyz. However, I quickly realised that most of the Internet wisdom on this process centred around legacy BIOS rather than UEFI. When booting with UEFI, the Netboot.xyz menu omits the FreeBSD option as it only supports legacy BIOS. Even in legacy mode, it uses <code>memdisk</code> from the Syslinux project rather than a FreeBSD loader.</p>\n\n<p>FreeBSD expects to use <code>loader.efi</code> to boot and to mount the root directory over NFS based upon the DHCP scope option <code>root-path</code>. I didn\u2019t want to provide an NFS server just for this process, but even when I gave in and set one up, it still didn\u2019t work. I\u2019m pleased that, in the final configuration, I didn\u2019t need an NFS server.</p>\n\n<p>Much of the frustration around doing this came from setting the <code>root-path</code> option. FreeBSD\u2019s <code>loader.efi</code> sends its own DHCP request to the DHCP server, ignoring the options <code>set root-path</code> or <code>set dhcp.root-path</code> configured in iPXE.</p>\n\n<p>Many <code>dhcpd.conf</code> snippets suggest a block similar to below, but usually with the comment that it doesn\u2019t work. 
Most authors proceed by setting <code>root-path</code> for the entire scope.</p>\n\n<div><div><pre><code>if exists user-class and option user-class = \"FreeBSD\" {\n option root-path \"your-path\";\n}\n</code></pre></div></div>\n\n<p>I used <code>dhcpdump -i br0</code> to examine the DHCP packets. This showed an ASCII BEL character (0x07) before <code>FreeBSD</code> in the <code>user-class</code> string.</p>\n\n<div><div><pre><code> TIME: 2025-05-07 08:51:03.811\n IP: 0.0.0.0 (2:0:0:0:0:22) > 255.255.255.255 (ff:ff:ff:ff:ff:ff)\n OP: 1 (BOOTPREQUEST)\n HTYPE: 1 (Ethernet)\n HLEN: 6\n HOPS: 0\n XID: 00000001\n SECS: 0\n FLAGS: 0\nCIADDR: 0.0.0.0\nYIADDR: 0.0.0.0\nSIADDR: 0.0.0.0\nGIADDR: 0.0.0.0\nCHADDR: 02:00:00:00:00:22:00:00:00:00:00:00:00:00:00:00\n SNAME: .\n FNAME: .\nOPTION: 53 ( 1) DHCP message type 3 (DHCPREQUEST)\nOPTION: 50 ( 4) Request IP address x.y.z.250\nOPTION: 54 ( 4) Server identifier x.y.z.1\nOPTION: 51 ( 4) IP address leasetime 300 (5m)\nOPTION: 60 ( 9) Vendor class identifier PXEClient\nOPTION: 77 ( 8) User-class Identification 0746726565425344 .FreeBSD\nOPTION: 55 ( 7) Parameter Request List 17 (Root path)\n\t\t\t\t\t 12 (Host name)\n\t\t\t\t\t 16 (Swap server)\n\t\t\t\t\t 3 (Routers)\n\t\t\t\t\t 1 (Subnet mask)\n\t\t\t\t\t 26 (Interface MTU)\n\t\t\t\t\t 54 (Server identifier)\n</code></pre></div></div>\n\n<p>There is a <code>substring</code> command, so I was able to set the <code>root-path</code> like this successfully:</p>\n\n<div><div><pre><code>if exists user-class and substring ( option user-class, 1, 7 ) = \"FreeBSD\" {\n option root-path \"your-path\";\n}\n</code></pre></div></div>\n\n<p>The situation is further complicated as we are using a Ubiquiti Edge router. 
This requires the command to be encoded as a <code>subnet-parameters</code> value, which is injected into <code>/opt/vyatta/etc/dhcpd.conf</code>.</p>\n\n<div><div><pre><code>set service dhcp-server shared-network-name lab subnet x.y.z.0/24 subnet-parameters 'if exists user-class and substring( option user-class, 1, 7 ) = &quot;FreeBSD&quot; { option root-path &quot;tftp://x.y.z.240/freebsd14&quot;;}'\n</code></pre></div></div>\n\n<p>The FreeBSD 14.2 installation <a href=\"https://download.freebsd.org/releases/amd64/amd64/ISO-IMAGES/14.2/FreeBSD-14.2-RELEASE-amd64-disc1.iso\">ISO</a> contains the required <code>boot/loader.efi</code>, but we cannot use the extracted ISO as a root file system.</p>\n\n<p>Stage <code>loader.efi</code> on a TFTP server; in my case, the TFTP root is <code>/netbootxyz/config/menus</code>. The iPXE file only needs to contain the <code>chain</code> command.</p>\n\n<div><div><pre><code>#!ipxe\nchain loader.efi\n</code></pre></div></div>\n\n<p>Download <a href=\"https://mfsbsd.vx.sk/files/iso/14/amd64/mfsbsd-14.2-RELEASE-amd64.iso\">mfsBSD</a>, and extract the contents to a subfolder on the TFTP server. I went with <code>freebsd14</code>. This ISO contains the kernel, <code>loader.conf</code> and a minimal root file system, <code>mfsroot.gz</code>.</p>\n\n<p>With the content of the mfsBSD ISO staged on the TFTP server and the modification to the DHCP scope options, the machine will boot into FreeBSD. Sign in with <code>root</code>/<code>mfsroot</code> and invoke <code>bsdinstall</code>.</p>\n\n<p>On real hardware, rather than QEMU, I found that I needed to explicitly set the serial console by adding these lines to the end of <code>boot/loader.conf</code>:</p>\n\n<div><div><pre><code># Serial console\nconsole=\"comconsole\"\ncomconsole_port=\"0x2f8\"\ncomconsole_speed=\"115200\"\n</code></pre></div></div>",
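The failing and working matches can be reproduced offline from the captured bytes. The leading 0x07 is plausibly an RFC 3004 length prefix for the seven-byte instance "FreeBSD" rather than a stray control character, which is exactly why the `substring( option user-class, 1, 7 )` form works:

```python
# user-class option data as captured by dhcpdump: 07 46 72 65 65 42 53 44
raw = bytes.fromhex("0746726565425344")

naive = raw == b"FreeBSD"           # what the common dhcpd.conf snippet tests
fixed = raw[1:1 + 7] == b"FreeBSD"  # substring(option user-class, 1, 7)

print(naive, fixed)
```

The same one-byte offset explains the `.FreeBSD` rendering in the dhcpdump output: the non-printable length byte is shown as a dot.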
+21
mte/2025_05_07_otter-wiki-with-raven.json
+21
mte/2025_05_07_otter-wiki-with-raven.json
···+"summary": "We\u2019d like to have a go using Otter Wiki, but rather than having yet more usernames and passwords, we would like to integrate this into the Raven authentication system. There is a guide on using SAML2 with Apache.",+"content": "<p>We\u2019d like to have a go using <a href=\"https://otterwiki.com\">Otter Wiki</a>, but rather than having yet more usernames and passwords, we would like to integrate this into the Raven authentication system. There is a <a href=\"https://docs.raven.cam.ac.uk/en/latest/apache-saml2/\">guide on using SAML2 with Apache</a>.</p>\n\n<p>The steps are:</p>\n<ol>\n <li>Start the provided container.</li>\n <li>Visit http://your-container/Shibboleth.sso/Metadata and download the <code>Metadata</code>.</li>\n <li>Go to <a href=\"https://metadata.raven.cam.ac.uk\">https://metadata.raven.cam.ac.uk</a> and create a new site by pasting in the metadata.</li>\n <li>Wait one minute and try to connect to http://your-container</li>\n</ol>\n\n<p>Otter Wiki, when started with the environment variable <code>AUTH_METHOD=PROXY_HEADER</code>, reads HTTP header fields <code>x-otterwiki-name</code>, <code>x-otterwiki-email</code> and <code>x-otterwiki-permissions</code>. See <a href=\"https://github.com/redimp/otterwiki/blob/main/docs/auth_examples/header-auth/README.md\">this example</a>.</p>\n\n<p>Apache can be configured to set these header fields based upon the SAML user who is authenticated with Raven:</p>\n\n<div><div><pre><code>ShibUseEnvironment On\nRequestHeader set x-otterwiki-name %{displayName}e\nRequestHeader set x-otterwiki-email %{REMOTE_USER}s\nRequestHeader set x-otterwiki-permissions \"READ,WRITE,UPLOAD,ADMIN\"\n</code></pre></div></div>\n\n<p>I have created a <code>docker-compose.yml</code> file, which incorporates Apache running as a reverse proxy and an Otter Wiki container, and includes HTTPS support with a Let\u2019s Encrypt certificate. 
The files are available on <a href=\"https://github.com/mtelvers/doc-samples/commit/5ca2f8934a4cf1269e60b2b18de563352f764f66\">GitHub</a>.</p>\n\n<p>The test site is <a href=\"https://otterwiki.tunbury.uk\">https://otterwiki.tunbury.uk</a>.</p>",
+21
mte/2025_05_08_debugging-obuilder-macos.json
+21
mte/2025_05_08_debugging-obuilder-macos.json
···+"summary": "The log from an OBuilder job starts with the steps needed to reproduce the job locally. This boilerplate output assumes that all OBuilder jobs start from a Docker base image, but on some operating systems, such as FreeBSD and macOS, OBuilder uses ZFS base images. On OpenBSD and Windows, it uses QEMU images. The situation is further complicated when the issue only affects a specific architecture that may be unavailable to the user.",+"content": "<p>The log from an <a href=\"https://github.com/ocurrent/obuilder\">OBuilder</a> job starts with the steps needed to reproduce the job locally. This boilerplate output assumes that all OBuilder jobs start from a Docker base image, but on some operating systems, such as FreeBSD and macOS, OBuilder uses ZFS base images. On OpenBSD and Windows, it uses QEMU images. The situation is further complicated when the issue only affects a specific architecture that may be unavailable to the user.</p>\n\n<div><div><pre><code>2025-05-08 13:29.37: New job: build bitwuzla-cxx.0.7.0, using opam 2.3\n from https://github.com/ocaml/opam-repository.git#refs/pull/27768/head (55a47416d532dc829d9111297970934a21a1b1c4)\n on macos-homebrew-ocaml-4.14/amd64\n\nTo reproduce locally:\n\ncd $(mktemp -d)\ngit clone --recursive \"https://github.com/ocaml/opam-repository.git\" && cd \"opam-repository\" && git fetch origin \"refs/pull/27768/head\" && git reset --hard 55a47416\ngit fetch origin master\ngit merge --no-edit b8a7f49af3f606bf8a22869a1b52b250dd90092e\ncat > ../Dockerfile <<'END-OF-DOCKERFILE'\n\nFROM macos-homebrew-ocaml-4.14\nUSER 1000:1000\nRUN ln -f ~/local/bin/opam-2.3 ~/local/bin/opam\nRUN opam init --reinit -ni\nRUN opam option solver=builtin-0install && opam config report\nENV OPAMDOWNLOADJOBS=\"1\"\nENV OPAMERRLOGLEN=\"0\"\nENV OPAMPRECISETRACKING=\"1\"\nENV CI=\"true\"\nENV OPAM_REPO_CI=\"true\"\nRUN rm -rf opam-repository/\nCOPY --chown=1000:1000 . 
opam-repository/\nRUN opam repository set-url -k local --strict default opam-repository/\nRUN opam update --depexts || true\nRUN opam pin add -k version -yn bitwuzla-cxx.0.7.0 0.7.0\nRUN opam reinstall bitwuzla-cxx.0.7.0; \\\n res=$?; \\\n test \"$res\" != 31 && exit \"$res\"; \\\n export OPAMCLI=2.0; \\\n build_dir=$(opam var prefix)/.opam-switch/build; \\\n failed=$(ls \"$build_dir\"); \\\n partial_fails=\"\"; \\\n for pkg in $failed; do \\\n if opam show -f x-ci-accept-failures: \"$pkg\" | grep -qF \"\\\"macos-homebrew\\\"\"; then \\\n echo \"A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.\"; \\\n fi; \\\n test \"$pkg\" != 'bitwuzla-cxx.0.7.0' && partial_fails=\"$partial_fails $pkg\"; \\\n done; \\\n test \"${partial_fails}\" != \"\" && echo \"opam-repo-ci detected dependencies failing: ${partial_fails}\"; \\\n exit 1\n\n\nEND-OF-DOCKERFILE\ndocker build -f ../Dockerfile .\n</code></pre></div></div>\n\n<p>It is, therefore, difficult to diagnose the issue on these operating systems and on esoteric architectures. Is it an issue with the CI system or the job itself?</p>\n\n<p>My approach is to get myself into an interactive shell at the point in the build where the failure occurs. On Linux and FreeBSD, the log is available in <code>/var/log/syslog</code> or <code>/var/log/messages</code> respectively. On macOS, this log is written to <code>ocluster.log</code>. macOS workers are single-threaded, so the worker must be paused before progressing.</p>\n\n<p>Each step in an OBuilder job consists of taking a snapshot of the previous layer, running a command in that layer, and keeping or discarding the layer depending on the command\u2019s success or failure. On macOS, layers are ZFS snapshots mounted over the Homebrew directory and the CI users\u2019 home directory. 
We can extract the appropriate command from the logs.</p>\n\n<div><div><pre><code>2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"-o\" \"canmount=noauto\" \"--\" \"obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec@snap\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"mount\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"-o\" \"mountpoint=none\" \"--\" \"obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec/brew@snap\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"-o\" \"mountpoint=none\" \"--\" \"obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec/home@snap\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home\"\ncannot open 'obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40@snap': dataset does not exist\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"--\" \"obuilder/cache/c-opam-archives@snap\" \"obuilder/cache-tmp/8608-c-opam-archives\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"--\" \"obuilder/cache/c-homebrew@snap\" \"obuilder/cache-tmp/8609-c-homebrew\"\n2025-05-08 14:31.18 obuilder [INFO] result_tmp = /Volumes/obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/Users/mac1000\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home\"\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/usr/local\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew\"\n2025-05-08 
14:31.18 obuilder [INFO] src = /Volumes/obuilder/cache-tmp/8608-c-opam-archives, dst = /Users/mac1000/.opam/download-cache, type rw\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/Users/mac1000/.opam/download-cache\" \"obuilder/cache-tmp/8608-c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8608-c-opam-archives\n2025-05-08 14:31.18 obuilder [INFO] src = /Volumes/obuilder/cache-tmp/8609-c-homebrew, dst = /Users/mac1000/Library/Caches/Homebrew, type rw\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/Users/mac1000/Library/Caches/Homebrew\" \"obuilder/cache-tmp/8609-c-homebrew\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8609-c-homebrew\n2025-05-08 14:31.19 application [INFO] Exec \"sudo\" \"dscl\" \".\" \"list\" \"/Users\"\n2025-05-08 14:31.19 application [INFO] Exec \"sudo\" \"-u\" \"mac1000\" \"-i\" \"getconf\" \"DARWIN_USER_TEMP_DIR\"\n2025-05-08 14:31.19 application [INFO] Fork exec \"sudo\" \"su\" \"-l\" \"mac1000\" \"-c\" \"--\" \"source ~/.obuilder_profile.sh && env 'TMPDIR=/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/' 'OPAM_REPO_CI=true' 'CI=true' 'OPAMPRECISETRACKING=1' 'OPAMERRLOGLEN=0' 'OPAMDOWNLOADJOBS=1' \"$0\" \"$@\"\" \"/usr/bin/env\" \"bash\" \"-c\" \"opam reinstall bitwuzla-cxx.0.7.0;\n res=$?;\n test \"$res\" != 31 && exit \"$res\";\n export OPAMCLI=2.0;\n build_dir=$(opam var prefix)/.opam-switch/build;\n failed=$(ls \"$build_dir\");\n partial_fails=\"\";\n for pkg in $failed; do\n if opam show -f x-ci-accept-failures: \"$pkg\" | grep -qF \"\\\"macos-homebrew\\\"\"; then\n echo \"A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.\";\n fi;\n test \"$pkg\" != 'bitwuzla-cxx.0.7.0' && partial_fails=\"$partial_fails $pkg\";\n done;\n test \"${partial_fails}\" != \"\" && echo \"opam-repo-ci detected dependencies failing: ${partial_fails}\";\n exit 1\"\n2025-05-08 14:31.28 worker [INFO] OBuilder partition: 27% free, 2081 
items\n2025-05-08 14:31.58 worker [INFO] OBuilder partition: 27% free, 2081 items\n2025-05-08 14:32.28 worker [INFO] OBuilder partition: 27% free, 2081 items\n2025-05-08 14:32.43 application [INFO] Exec \"zfs\" \"inherit\" \"mountpoint\" \"obuilder/cache-tmp/8608-c-opam-archives\"\nUnmount successful for /Users/mac1000/.opam/download-cache\n2025-05-08 14:32.44 application [INFO] Exec \"zfs\" \"inherit\" \"mountpoint\" \"obuilder/cache-tmp/8609-c-homebrew\"\nUnmount successful for /Users/mac1000/Library/Caches/Homebrew\n2025-05-08 14:32.45 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=none\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home\"\nUnmount successful for /Users/mac1000\n2025-05-08 14:32.45 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=none\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew\"\nUnmount successful for /usr/local\n2025-05-08 14:32.46 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache/c-homebrew\" \"obuilder/cache-tmp/8610-c-homebrew\"\nUnmount successful for /Volumes/obuilder/cache/c-homebrew\n2025-05-08 14:32.46 application [INFO] Exec \"zfs\" \"promote\" \"obuilder/cache-tmp/8609-c-homebrew\"\n2025-05-08 14:32.46 application [INFO] Exec \"zfs\" \"destroy\" \"-f\" \"--\" \"obuilder/cache-tmp/8610-c-homebrew\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8610-c-homebrew\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew@snap\" \"obuilder/cache-tmp/8609-c-homebrew@old-2152\"\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"destroy\" \"-d\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew@old-2152\"\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"snapshot\" \"-r\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew@snap\"\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew\" \"obuilder/cache/c-homebrew\"\nUnmount 
successful for /Volumes/obuilder/cache-tmp/8609-c-homebrew\n2025-05-08 14:32.49 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache/c-opam-archives\" \"obuilder/cache-tmp/8611-c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache/c-opam-archives\n2025-05-08 14:32.50 application [INFO] Exec \"zfs\" \"promote\" \"obuilder/cache-tmp/8608-c-opam-archives\"\n2025-05-08 14:32.50 application [INFO] Exec \"zfs\" \"destroy\" \"-f\" \"--\" \"obuilder/cache-tmp/8611-c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8611-c-opam-archives\n2025-05-08 14:32.51 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives@snap\" \"obuilder/cache-tmp/8608-c-opam-archives@old-2152\"\n2025-05-08 14:32.51 application [INFO] Exec \"zfs\" \"destroy\" \"-d\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives@old-2152\"\n2025-05-08 14:32.51 application [INFO] Exec \"zfs\" \"snapshot\" \"-r\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives@snap\"\n2025-05-08 14:32.52 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives\" \"obuilder/cache/c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8608-c-opam-archives\n2025-05-08 14:32.52 application [INFO] Exec \"zfs\" \"destroy\" \"-r\" \"-f\" \"--\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\"\nUnmount successful for /Volumes/obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\n2025-05-08 14:32.58 worker [INFO] OBuilder partition: 27% free, 2081 items\n2025-05-08 14:33.04 worker [INFO] Job failed: \"/usr/bin/env\" \"bash\" \"-c\" \"opam reinstall bitwuzla-cxx.0.7.0;\n res=$?;\n test \"$res\" != 31 && exit \"$res\";\n export OPAMCLI=2.0;\n build_dir=$(opam var prefix)/.opam-switch/build;\n failed=$(ls \"$build_dir\");\n partial_fails=\"\";\n for pkg in $failed; do\n if opam show -f x-ci-accept-failures: \"$pkg\" | grep -qF 
\"\\\"macos-homebrew\\\"\"; then\n echo \"A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.\";\n fi;\n test \"$pkg\" != 'bitwuzla-cxx.0.7.0' && partial_fails=\"$partial_fails $pkg\";\n done;\n test \"${partial_fails}\" != \"\" && echo \"opam-repo-ci detected dependencies failing: ${partial_fails}\";\n exit 1\" failed with exit status 1\n\n</code></pre></div></div>\n\n<p>Run each of the <em>Exec</em> commands at the command prompt up to the <em>Fork exec</em>. We do need to run it, but we want an interactive shell, so let\u2019s change the final part of the command to <code>bash</code>:</p>\n\n<div><div><pre><code>sudo su -l mac1000 -c -- \"source ~/.obuilder_profile.sh && env 'TMPDIR=/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/' 'OPAM_REPO_CI=true' 'CI=true' 'OPAMPRECISETRACKING=1' 'OPAMERRLOGLEN=0' 'OPAMDOWNLOADJOBS=1' bash\"\n</code></pre></div></div>\n\n<p>Now, at the shell prompt, we can try <code>opam reinstall bitwuzla-cxx.0.7.0</code>. Hopefully, this fails, which proves we have successfully recreated the environment!</p>\n\n<div><div><pre><code>$ opam source bitwuzla-cxx.0.7.0\n$ cd bitwuzla-cxx.0.7.0\n$ dune build\nFile \"vendor/dune\", lines 201-218, characters 0-436:\n201 | (rule\n202 | (deps\n203 | (source_tree bitwuzla)\n.....\n216 | %{p0002}\n217 | (run patch -p1 --directory bitwuzla))\n218 | (write-file %{target} \"\")))))\n(cd _build/default/vendor && /usr/bin/patch -p1 --directory bitwuzla) < _build/default/vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\npatching file 'include/bitwuzla/cpp/bitwuzla.h'\nCan't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw', output is in '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw': Permission denied\npatch: **** can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw': Permission denied\n</code></pre></div></div>\n\n<p>This matches the output 
we see in the CI logs. <code>/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T</code> is the <code>TMPDIR</code> value set in the environment. <code>Permission denied</code> suggests a file system permissions issue, yet <code>ls -l</code> and <code>touch</code> show we can write to this directory.</p>\n\n<p>As we are running on macOS, and Dune is invoking <code>patch</code>, my thought goes to Apple\u2019s <code>patch</code> vs GNU\u2019s <code>patch</code>. Editing <code>vendor/dune</code> to use <code>gpatch</code> rather than <code>patch</code> allows the project to build.</p>\n\n<div><div><pre><code>$ dune build\n(cd _build/default/vendor && /usr/local/bin/gpatch --directory bitwuzla -p1) < _build/default/vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\nFile include/bitwuzla/cpp/bitwuzla.h is read-only; trying to patch anyway\npatching file include/bitwuzla/cpp/bitwuzla.h\n</code></pre></div></div>\n\n<p>Running Apple\u2019s <code>patch</code> directly,</p>\n\n<div><div><pre><code>$ patch -p1 < ../../../../vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\npatching file 'include/bitwuzla/cpp/bitwuzla.h'\nCan't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI', output is in '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI': Permission denied\npatch: **** can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI': Permission denied\n</code></pre></div></div>\n\n<p>However, <code>touch /var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI</code> succeeds.</p>\n\n<p>Looking back at the output from GNU <code>patch</code>, it reports that the file itself is read-only.</p>\n\n<div><div><pre><code>$ ls -l include/bitwuzla/cpp/bitwuzla.h\n-r--r--r-- 1 mac1000 admin 52280 May 8 15:05 include/bitwuzla/cpp/bitwuzla.h\n</code></pre></div></div>\n\n<p>Let\u2019s try to adjust the permissions:</p>\n\n<div><div><pre><code>$ chmod 644 include/bitwuzla/cpp/bitwuzla.h\n$ patch -p1 < 
../../../../vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\npatching file 'include/bitwuzla/cpp/bitwuzla.h'\n</code></pre></div></div>\n\n<p>And now, it succeeds. The issue is that GNU\u2019s <code>patch</code> and Apple\u2019s <code>patch</code> act differently when the file being patched is read-only. Apple\u2019s <code>patch</code> fails with a misleading error, while GNU\u2019s <code>patch</code> emits a warning and makes the change anyway.</p>\n\n<p>Updating the <code>dune</code> file to include <code>chmod</code> should both clear the warning and allow the use of the native patch.</p>\n\n<div><div><pre><code>(rule\n (deps\n (source_tree bitwuzla)\n (:p0001\n (file patch/0001-api-Add-hook-for-ocaml-z-value.patch))\n (:p0002\n (file patch/0002-binding-Fix-segfault-with-parallel-instances.patch)))\n (target .bitwuzla_tree)\n (action\n (no-infer\n (progn\n (run chmod -R u+w bitwuzla)\n (with-stdin-from\n %{p0001}\n (run patch -p1 --directory bitwuzla))\n (with-stdin-from\n %{p0002}\n (run patch -p1 --directory bitwuzla))\n (write-file %{target} \"\")))))\n</code></pre></div></div>\n\n<p>As a final step, we must tidy up on this machine: exit the shell, refer back to the log file for the job, and run all the remaining ZFS commands. On macOS this is essential to keep the jobs database in sync with the layers.</p>",
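Replaying the *Exec* lines from the log by hand is tedious, so it can help to pull the commands out first. The following is a sketch of my own, not part of OBuilder or ocluster: the `extract_exec` helper name is made up, and it assumes the `application [INFO] Exec` line format shown above, where no argument contains an embedded double quote.

```shell
# Hypothetical helper (not part of OBuilder/ocluster): list the commands from
# the 'application [INFO] Exec' lines of an ocluster.log excerpt so they can
# be replayed one at a time at a root shell.
extract_exec() {
  # Drop everything up to 'Exec ', then strip the per-argument quoting.
  grep ' Exec ' "$1" | sed -e 's/^.*\[INFO\] Exec //' -e 's/"//g'
}

# Example with a single captured log line:
printf '%s\n' '2025-05-08 14:31.17 application [INFO] Exec "zfs" "mount" "obuilder/result/abc"' > /tmp/ocluster-excerpt.log
extract_exec /tmp/ocluster-excerpt.log
```

For the sample line this prints `zfs mount obuilder/result/abc`; note the capitalised ` Exec ` pattern deliberately skips the *Fork exec* line, which needs the manual treatment described above.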
+21
mte/2025_05_09_worker-moves.json
···+"summary": "Following the setup of rosemary with FreeBSD 14 (with 20C/40T), I have paused spring and summer (which combined have 12C/24T) and rosemary is now handling all of the FreeBSD workload.",+"content": "<p>Following the setup of <em>rosemary</em> with <a href=\"https://www.tunbury.org/freebsd-uefi/\">FreeBSD 14</a> (with 20C/40T), I have paused <em>spring</em> and <em>summer</em> (which combined have 12C/24T), and <em>rosemary</em> is now handling all of the <a href=\"https://github.com/ocurrent/freebsd-infra/pull/14\">FreeBSD workload</a>.</p>\n\n<p><em>Oregano</em> has now taken the OpenBSD workload from <em>bremusa</em>. <em>bremusa</em> has been redeployed in the <code>linux-x86_64</code> pool. With the extra processing capacity, I have paused the Scaleway workers <em>x86-bm-c1</em> through <em>x86-bm-c9</em>.</p>\n\n<p>These changes, plus the <a href=\"https://www.tunbury.org/equinix-moves/\">removal of the Equinix machines</a>, are now reflected in <a href=\"https://infra.ocaml.org\">https://infra.ocaml.org</a>.</p>",
+21
mte/2025_05_12_posthog.json
···+"summary": "Sabine would like to switch OCaml.org from using Plausible over to Posthog. The underlying reason for the move is that the self-hosted product from Posthog has more features than the equivalent from Plausible. Of particular interest is the heatmap feature to assess the number of visitors who finish the Tour of OCaml.",+"content": "<p>Sabine would like to switch <a href=\"https://ocaml.org\">OCaml.org</a> from using <a href=\"https://plausible.io\">Plausible</a> over to <a href=\"https://posthog.com\">Posthog</a>. The underlying reason for the move is that the self-hosted product from Posthog has more features than the equivalent from Plausible. Of particular interest is the heatmap feature to assess the number of visitors who finish the <a href=\"https://ocaml.org/docs/tour-of-ocaml\">Tour of OCaml</a>.</p>\n\n<p>Posthog has <a href=\"https://posthog.com/docs/self-host\">documentation</a> on the self-hosted solution. In short, create a VM with 4 vCPU, 16GB RAM, and 30GB storage and run the setup script:</p>\n\n<div><div><pre><code>/bin/bash <span>-c</span> <span>\"</span><span>$(</span>curl <span>-fsSL</span> https://raw.githubusercontent.com/posthog/posthog/HEAD/bin/deploy-hobby<span>)</span><span>\"</span>\n</code></pre></div></div>\n\n<p>Any subsequent upgrades can be achieved with:</p>\n\n<div><div><pre><code>/bin/bash <span>-c</span> <span>\"</span><span>$(</span>curl <span>-fsSL</span> https://raw.githubusercontent.com/posthog/posthog/HEAD/bin/upgrade-hobby<span>)</span><span>\"</span>\n</code></pre></div></div>\n\n<p>After installation, I created a <a href=\"https://posthog.ci.dev/shared/seqtamWuMXLwxJEAX1XNjwhzciAajw\">public dashboard</a> as with <a href=\"https://plausible.ci.dev/ocaml.org\">Plausible</a>. I also enabled the option <code>Discard client IP data</code>.</p>\n\n<p>The OCaml website can be updated with <a href=\"https://github.com/ocaml/ocaml.org/pull/3101\">PR#3101</a>.</p>",
+21
mte/2025_05_13_ubuntu-apparmor.json
···+"content": "<p>Patrick reported issues with OCaml-CI running tests on <code>ocaml-ppx</code>.</p>\n\n<blockquote>\n <p>Fedora seems to be having some issues: https://ocaml.ci.dev/github/ocaml-ppx/ppxlib/commit/0d6886f5bcf22287a66511817e969965c888d2b7/variant/fedora-40-5.3_opam-2.3</p>\n <div><div><pre><code>sudo: PAM account management error: Authentication service cannot retrieve authentication info\nsudo: a password is required\n\"/usr/bin/env\" \"bash\" \"-c\" \"sudo dnf install -y findutils\" failed with exit status 1\n2025-05-12 08:55.09: Job failed: Failed: Build failed\n</code></pre></div> </div>\n</blockquote>\n\n<p>I took this problem at face value and replied that the issue would be related to Fedora 40, which is EOL. I created <a href=\"https://github.com/ocurrent/ocaml-ci/pull/1011\">PR#1011</a> for OCaml-CI and deployed it. However, the problem didn\u2019t go away. We were now testing Fedora 42, but jobs were still failing. I created a minimal obuilder job specification:</p>\n\n<div><div><pre><code>((from ocaml/opam:fedora-42-ocaml-4.14@sha256:475a852401de7d578efec2afce4384d87b505f5bc610dc56f6bde3b87ebb7664)\n(user (uid 1000) (gid 1000))\n(run (shell \"sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam\")))\n</code></pre></div></div>\n\n<p>Submitting the job to the cluster showed it worked on all machines except for <code>bremusa</code>.</p>\n\n<div><div><pre><code><span>$ </span>ocluster-client submit-obuilder <span>--connect</span> mtelvers.cap <span>--pool</span> linux-x86_64 <span>--local-file</span> fedora-42.spec\nTailing log:\nBuilding on bremusa.ocamllabs.io\n\n<span>(</span>from ocaml/opam:fedora-42-ocaml-4.14@sha256:475a852401de7d578efec2afce4384d87b505f5bc610dc56f6bde3b87ebb7664<span>)</span>\n2025-05-12 16:55.42 <span>---</span><span>></span> using <span>\"aefb7551cd0db7b5ebec7e244d5637aef02ab3f94c732650de7ad183465adaa0\"</span> from cache\n\n/: <span>(</span>user <span>(</span>uid 1000<span>)</span> <span>(</span>gid 1000<span>))</span>\n\n/: 
<span>(</span>run <span>(</span>shell <span>\"sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam\"</span><span>))</span>\n<span>sudo</span>: PAM account management error: Authentication service cannot retrieve authentication info\n<span>sudo</span>: a password is required\n<span>\"/usr/bin/env\"</span> <span>\"bash\"</span> <span>\"-c\"</span> <span>\"sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam\"</span> failed with <span>exit </span>status 1\nFailed: Build failed.\n</code></pre></div></div>\n\n<p>Changing the image to <code>opam:debian-12-ocaml-4.14</code> worked, so the issue only affects Fedora images and only on <code>bremusa</code>. I was able to reproduce the issue directly using <code>runc</code>.</p>\n\n<div><div><pre><code><span># runc run test</span>\n<span>sudo</span>: PAM account management error: Authentication service cannot retrieve authentication info\n<span>sudo</span>: a password is required\n</code></pre></div></div>\n\n<p>Running <code>ls -l /etc/shadow</code> in the container showed that the permissions on <code>/etc/shadow</code> are 000. If these are changed to <code>640</code>, then <code>sudo</code> works correctly. Permissions are set 000 for <code>/etc/shadow</code> in some distributions as access is limited to processes with the capability <code>DAC_OVERRIDE</code>.</p>\n\n<p>Having seen a permission issue with <code>runc</code> and <code>libseccomp</code> compatibility <a href=\"https://github.com/ocaml/infrastructure/issues/121\">before</a>, I went down a rabbit hole investigating that. Ultimately, I compiled <code>runc</code> without <code>libseccomp</code> support, <code>make MAKETAGS=\"\"</code>, and this still had the same issue.</p>\n\n<p>All the machines in the <code>linux-x86_64</code> pool are running Ubuntu 22.04 except for <code>bremusa</code>. I configured a spare machine with Ubuntu 24.04 and tested. 
The problem appeared on this machine as well.</p>\n\n<p>Is there a change in Ubuntu 24.04?</p>\n\n<p>I temporarily disabled AppArmor by editing <code>/etc/default/grub</code> and added <code>apparmor=0</code> to <code>GRUB_CMDLINE_LINUX</code>, ran <code>update-grub</code> and rebooted. Disabling AppArmor entirely like this can create security vulnerabilities, so this isn\u2019t recommended, but it did clear the issue.</p>\n\n<p>After enabling AppArmor again, I disabled the configuration for <code>runc</code> by running:</p>\n\n<div><div><pre><code><span>ln</span> <span>-s</span> /etc/apparmor.d/runc /etc/apparmor.d/disable/\napparmor_parser <span>-R</span> /etc/apparmor.d/runc\n</code></pre></div></div>\n\n<p>This didn\u2019t help - in fact, this was worse as now <code>runc</code> couldn\u2019t run at all. I restored the configuration and added <code>capability dac_override</code>, but this didn\u2019t help either.</p>\n\n<p>Looking through the profiles with <code>grep shadow -r /etc/apparmor.d</code>, I noticed <code>unix-chkpwd</code>, which could be the source of the issue. I disabled this profile and the issue was resolved.</p>\n\n<div><div><pre><code><span>ln</span> <span>-s</span> /etc/apparmor.d/unix-chkpwd /etc/apparmor.d/disable\napparmor_parser <span>-R</span> /etc/apparmor.d/unix-chkpwd\n</code></pre></div></div>\n\n<p>Armed with the answer, it\u2019s pretty easy to find other people with related issues:</p>\n<ul>\n <li>https://github.com/docker/build-push-action/issues/1302</li>\n <li>https://github.com/moby/moby/issues/48734</li>\n</ul>",
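The post disables a profile twice and restores one in between; the pattern generalises. The sketch below wraps it as functions (the `aa_disable`/`aa_enable` names are mine, and both require root on an AppArmor-enabled Ubuntu host, so they are only defined here, not run).

```shell
# Sketch of the disable/re-enable pattern used above, wrapped as functions.
# The names aa_disable/aa_enable are invented; both need root and AppArmor,
# and follow Ubuntu's /etc/apparmor.d/disable convention.
aa_disable() {
  ln -s "/etc/apparmor.d/$1" /etc/apparmor.d/disable/   # mark disabled at boot
  apparmor_parser -R "/etc/apparmor.d/$1"               # unload it now
}

aa_enable() {
  rm "/etc/apparmor.d/disable/$1"                       # clear the marker
  apparmor_parser -a "/etc/apparmor.d/$1"               # load it again
}
```

With these, `aa_disable unix-chkpwd` mirrors the two commands above, and `aa_enable unix-chkpwd` restores the profile afterwards.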
+21
mte/2025_05_14_opam-health-check-oxcaml.json
···+"summary": "Arthur mentioned that it would be great to know which packages build successfully with OxCaml and which don\u2019t.",+"content": "<p>Arthur mentioned that it would be great to know which packages build successfully with OxCaml and which don\u2019t.</p>\n\n<p>With a little effort and <a href=\"https://github.com/ocurrent/opam-health-check/pull/106\">PR#106</a>, I was able to get <a href=\"https://github.com/ocurrent/opam-health-check\">opam-health-check</a> to build OxCaml from the Jane Street branch and test the latest version of all the packages in opam.</p>\n\n<p>I created the switch using the branch <code>janestreet/opam-repository#with-extensions</code>. However, I ran into issues as <code>autoconf</code> isn\u2019t included in the base images. I added an <code>extra-command</code> to install it, but found that these are executed last, after the switch has been created, and I needed <code>autoconf</code> before the switch was created. My PR moved the extra commands earlier in the build process.</p>\n\n<p>Here is my <code>config.yaml</code>.</p>\n\n<div><div><pre><code>name: default\nport: 8080\npublic-url: http://oxcaml.check.ci.dev\nadmin-port: 9999\nauto-run-interval: 1680\nprocesses: 100\nenable-dune-cache: false\nenable-logs-compression: true\ndefault-repository: ocaml/opam-repository\nextra-repositories:\n- janestreet-with-extensions: janestreet/opam-repository#with-extensions\nwith-test: false\nwith-lower-bound: false\nlist-command: opam list --available --installable --columns=package --short\nextra-command: sudo apt install autoconf -y\nplatform:\n os: linux\n arch: x86_64\n custom-pool:\n distribution: debian-unstable\n image: ocaml/opam:debian-12-ocaml-5.2@sha256:a17317e9abe385dc16b4390c64a374046d6dd562e80aea838d91c6c1335da357\nocaml-switches:\n- 5.2.0+flambda2:\n switch: 5.2.0+flambda2\n build-with: opam\n</code></pre></div></div>\n\n<p>This results in these commands, which build the switch for 
testing:</p>\n\n<div><div><pre><code>sudo ln -f /usr/bin/opam-dev /usr/bin/opam\nrm -rf ~/opam-repository && git clone -q 'https://github.com/ocaml/opam-repository' ~/opam-repository && git -C ~/opam-repository checkout -q dbc9ec7b83bac3673185542221a571372b6abb35\nrm -rf ~/.opam && opam init -ya --bare --config ~/.opamrc-sandbox ~/opam-repository\nsudo apt install autoconf -y\ngit clone -q 'https://github.com/janestreet/opam-repository' ~/'janestreet-with-extensions' && git -C ~/'janestreet-with-extensions' checkout -q 55a5d4c5e35a7365ddd6ffb3b87274a77f77deb5\nopam repository add --dont-select 'janestreet-with-extensions' ~/'janestreet-with-extensions'\nopam switch create --repositories=janestreet-with-extensions,default '5.2.0+flambda2' '5.2.0+flambda2'\nopam update --depexts\n</code></pre></div></div>\n\n<p>The results are available at <a href=\"https://oxcaml.check.ci.dev\">https://oxcaml.check.ci.dev</a>.</p>",
+21
mte/2025_05_15_zfs-system-concept.json
···+"summary": "How would the distributed ZFS storage system look in practical terms? Each machine with a ZFS store would have an agent application installed. Centrally, there would be a tracker server, and users would interact with the system using a CLI tool. The elements will interact with each other using Cap\u2019n Proto capability files.",+"content": "<p>How would the distributed ZFS storage system look in practical terms? Each machine with a ZFS store would have an agent application installed. Centrally, there would be a tracker server, and users would interact with the system using a CLI tool. The elements will interact with each other using Cap\u2019n Proto capability files.</p>\n\n<h1>Tracker</h1>\n\n<p>The tracker would generate capability files on first invocation, one per <em>location</em>, where the location could be as granular as a specific rack in a datacenter or a larger grouping, such as at the institution level. The purpose of the location grouping is to allow users to see where the data is held. As a prototype, the command could be something like:</p>\n\n<div><div><pre><code>tracker --capnp-listen-address tcp:1.2.3.4:1234 --locations datacenter-01,datacenter-02,datacenter-03\n</code></pre></div></div>\n\n<h1>Agent</h1>\n\n<p>Each machine would have the agent application. The agent would register with the tracker using the capability file generated by the tracker. The agent command line would be used to provide a list of zpools that are in scope for management. The zpools will be scanned to compile a list of available datasets, which will be passed to the tracker. Perhaps an invocation like this:</p>\n\n<div><div><pre><code>agent --connect datacenter-01.cap --name machine-01 --zpools tank-01,tank-02\n</code></pre></div></div>\n\n<h1>CLI</h1>\n\n<p>The CLI tool will display the system state by connecting to the tracker. 
Perhaps a command like <code>cli --connect user.cap show</code>, which would output a list of datasets and where they are:</p>\n\n<div><div><pre><code>dataset-01: datacenter-01\\machine-01\\tank-01 (online), datacenter-02\\machine-03\\tank-06 (online)\ndataset-02: datacenter-01\\machine-01\\tank-02 (online), datacenter-02\\machine-04\\tank-07 (offline)\n</code></pre></div></div>\n\n<p>Another common use case would be to fetch a dataset: <code>cli --connect user.cap download dataset-02</code>. This would set up a <code>zfs send | zfs receive</code> between the agent and the current machine.</p>\n\n<p>Potentially, all machines would run the agent, and rather than <code>download</code>, we would initiate a <code>copy</code> of a dataset to another location in the form <code>datacenter\\machine\\tank</code>.</p>",
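For the `download` case, the pipe the agents would set up is standard ZFS replication. Below is a sketch with invented host and dataset names; since nothing in this design exists yet, the command string is only constructed and printed, not executed against a real pool.

```shell
# Illustrative only: what `cli --connect user.cap download dataset-02` would
# boil down to, a zfs send piped into zfs receive across SSH.
# Host and dataset names are invented for the sketch.
src_host=machine-04.datacenter-02
src=tank-07/dataset-02
dst=tank-01/dataset-02
cmd="ssh $src_host zfs send $src@snap | zfs receive -u $dst"
echo "$cmd"
```

The `-u` flag tells `zfs receive` not to mount the received dataset, which is likely what we'd want for an unattended transfer.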
+21
mte/2025_05_16_zfs-replcation-ansible.json
···+"summary": "Rather than using the agent-based approach proposed yesterday, it\u2019s worth considering an Ansible-based solution instead.",+"content": "<p>Rather than using the agent-based approach proposed yesterday, it\u2019s worth considering an Ansible-based solution instead.</p>\n\n<p>Suppose we have a set of YAML files, one per dataset, containing any metadata we would like for administrative purposes, along with required fields such as those below. We can also override any default snapshot and replication frequencies by adding those parameters to the file.</p>\n\n<div><div><pre><code><span>dataset_path</span><span>:</span> <span>\"</span><span>tank/dataset-02\"</span>\n<span>source_host</span><span>:</span> <span>\"</span><span>x86-bm-c1.sw.ocaml.org\"</span>\n<span>target_host</span><span>:</span> <span>\"</span><span>x86-bm-c3.sw.ocaml.org\"</span>\n</code></pre></div></div>\n\n<p>The YAML files would be aggregated to create an overall picture of which datasets must be replicated between hosts. Ansible templates would then generate the necessary configuration files for <code>syncoid</code> and <code>sanoid</code>, and register the cron jobs on each machine.</p>\n\n<p>Syncoid uses SSH authentication, so the keys must be generated on the source machines, and the public keys must be deployed on the replication targets. Ansible can be used to manage the configuration of the keys.</p>\n\n<p>Given the overall picture, we can automatically generate a markdown document describing the current setup and use Mermaid to include a visual representation.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/zfs-replication-graphic.png\"></p>\n\n<p>I have published a working version of this concept on <a href=\"https://github.com/mtelvers/zfs-replication-ansible\">GitHub</a>. 
The <a href=\"https://github.com/mtelvers/zfs-replication-ansible/blob/master/README.md\">README.md</a> contains additional information.</p>\n\n<p>The replication set defined in the repository, <a href=\"https://github.com/mtelvers/zfs-replication-ansible/blob/master/docs/replication_topology.md\">ZFS Replication Topology</a>, is currently running for testing.</p>",
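To make the template concrete, the cron entry generated for the example dataset above would run something along these lines on the source host. This is an assumption on my part: only the basic `syncoid SOURCE user@host:TARGET` form is taken from syncoid's documented usage, and the remote user is a guess.

```shell
# Sketch: the replication command a generated cron job might run on the
# source host for the example YAML above. Only the command string is built
# here; no ZFS pool is touched, and the root@ remote user is assumed.
dataset_path="tank/dataset-02"
target_host="x86-bm-c3.sw.ocaml.org"
cmd="syncoid $dataset_path root@$target_host:$dataset_path"
echo "$cmd"
```

Sanoid's own cron job would run alongside this to take and prune the snapshots that syncoid replicates.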
+21
mte/2025_05_19_macos-sequoia.json
···+"content": "<p>We have 8 Mac Minis running <a href=\"https://github.com/ocurrent/ocluster\">OCluster</a> that need to be updated to macOS Sequoia.</p>\n\n<p>I\u2019d been putting this off for some time, as the downloads are huge even in an ideal scenario. After the OS installation, there are usually updates to Xcode and OpenZFS. We have 4 x i7 units and 4 x M1 units.</p>\n\n<p>Rather than using the software update button, I went to the App Store and downloaded the <a href=\"https://support.apple.com/en-gb/102662\">Sequoia installer</a>. This is approximately 15GB. I copied <code>/Applications/Install macOS Sequoia.app</code> to the other three systems of the same architecture using <code>rsync</code> to avoid downloading it on each machine. The OS updated from <code>Darwin 23.4.0</code> to <code>Darwin 24.5.0</code>.</p>\n\n<p>After the OS update, I updated Xcode via Settings, Software Update. This was a 1.65GB download. This moved from <code>Command Line Tools for Xcode 15.3</code> to <code>Command Line Tools for Xcode 16.3</code>, upgrading <code>clang</code> from 25.0.0 to 27.0.0. Before moving on to the remaining machines, I tested <a href=\"https://github.com/ocurrent/obuilder\">obuilder</a>, OpenZFS, etc.</p>\n\n<p><code>softwareupdate --history</code> lists all the updates/OS installations.</p>\n\n<p>Wall clock time elapsed: ~3 days.</p>",
+21
mte/2025_05_26_retire-legacy-opam.json
···+"summary": "On the eve of the release of opam 2.4, is it time to stop testing with opam < 2.2?",+"content": "<p>On the eve of the release of opam 2.4, is it time to stop testing with opam < 2.2?</p>\n\n<p>Over the weekend, we have been seeing numerous failures across the ecosystem due to the unavailability of <a href=\"http://camlcity.org\">camlcity.org</a>. This website hosts the source for the <code>findlib</code> package. A typical error report is shown below:</p>\n\n<div><div><pre><code>#32 [build-opam-doc 5/14] RUN opam install odoc\n#32 258.6 [ERROR] Failed to get sources of ocamlfind.1.9.6: curl error code 504\n#32 258.6\n#32 258.6 #=== ERROR while fetching sources for ocamlfind.1.9.6 =========================#\n#32 258.6 OpamSolution.Fetch_fail(\"http://download.camlcity.org/download/findlib-1.9.6.tar.gz (curl: code 504 while downloading http://download.camlcity.org/download/findlib-1.9.6.tar.gz)\")\n#32 259.0\n#32 259.0\n#32 259.0 <><> Error report <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>\n#32 259.0 +- The following actions failed\n#32 259.0 | - fetch ocamlfind 1.9.6\n#32 259.0 +-\n</code></pre></div></div>\n\n<p>The most high-profile failure has been the inability to update <a href=\"https://opam.ocaml.org\">opam.ocaml.org</a>. See <a href=\"https://github.com/ocaml/infrastructure/issues/172\">issue#172</a>. This has also affected the deployment of <a href=\"https://ocaml.org\">ocaml.org</a>.</p>\n\n<p>Late last year, Hannes proposed adding our archive mirror to the base image builder; see <a href=\"https://github.com/ocurrent/docker-base-images/issues/306\">issue#306</a>. However, this requires opam 2.2 or later. 
We have long maintained that while supported <a href=\"https://repology.org/project/opam/versions\">distributions</a> still package legacy versions, we should continue to test against these versions.</p>\n\n<p>The testing of the legacy versions is limited to <a href=\"https://opam.ci.ocaml.org\">opam-repo-ci</a> testing on Debian 12 on AMD64 using a test matrix of OCaml 4.14 and 5.3 with each of opam 2.0, 2.1 and 2.2. These tests often fail to find a solution within the timeout. We have tried increasing the timeout by a factor of 10 to no avail. All of opam-repo-ci\u2019s other tests use the current development version. OCaml-CI only tests using the current release version.</p>\n\n<div><div><pre><code>[ERROR] Sorry, resolution of the request timed out.\n Try to specify a simpler request, use a different solver, or increase the allowed time by setting OPAMSOLVERTIMEOUT to a bigger value (currently, it is set to 60.0 seconds).\n</code></pre></div></div>\n\n<p>The base image default is opam 2.0, as <code>~/.opam</code> can\u2019t be downgraded; therefore, we can\u2019t set a mirror archive flag in the base images.</p>\n\n<p>A typical <code>Dockerfile</code> starts by replacing opam 2.0 with the latest version and reinitialising.</p>\n\n<div><div><pre><code>FROM ocaml/opam:debian-12-ocaml-4.14 AS build\nRUN sudo ln -sf /usr/bin/opam-2.3 /usr/bin/opam && opam init --reinit -ni\n...\n</code></pre></div></div>\n\n<p>To include the archive mirror, we should add a follow-up step:</p>\n\n<div><div><pre><code>RUN opam option --global 'archive-mirrors+=\"https://opam.ocaml.org/cache\"'\n</code></pre></div></div>\n\n<p>Dropping 2.0 and 2.1, and arguably 2.2 as well, from the base images would considerably decrease the time taken to build the base images, as opam is built from source each week for each distribution/architecture.</p>\n\n<div><div><pre><code>RUN git clone https://github.com/ocaml/opam /tmp/opam && cd /tmp/opam && cp -P -R -p . 
../opam-sources && git checkout 4267ade09ac42c1bd0b84a5fa61af8ccdaadef48 && env MAKE='make -j' shell/bootstrap-ocaml.sh && make -C src_ext cache-archives\nRUN cd /tmp/opam-sources && cp -P -R -p . ../opam-build-2.0 && cd ../opam-build-2.0 && git fetch -q && git checkout adc1e1829a2bef5b240746df80341b508290fe3b && ln -s ../opam/src_ext/archives src_ext/archives && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" ./configure --enable-cold-check && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" make lib-ext all && mkdir -p /usr/bin && cp /tmp/opam-build-2.0/opam /usr/bin/opam-2.0 && chmod a+x /usr/bin/opam-2.0 && rm -rf /tmp/opam-build-2.0\nRUN cd /tmp/opam-sources && cp -P -R -p . ../opam-build-2.1 && cd ../opam-build-2.1 && git fetch -q && git checkout 263921263e1f745613e2882745114b7b08f3608b && ln -s ../opam/src_ext/archives src_ext/archives && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" ./configure --enable-cold-check --with-0install-solver && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" make lib-ext all && mkdir -p /usr/bin && cp /tmp/opam-build-2.1/opam /usr/bin/opam-2.1 && chmod a+x /usr/bin/opam-2.1 && rm -rf /tmp/opam-build-2.1\nRUN cd /tmp/opam-sources && cp -P -R -p . ../opam-build-2.2 && cd ../opam-build-2.2 && git fetch -q && git checkout 01e9a24a61e23e42d513b4b775d8c30c807439b2 && ln -s ../opam/src_ext/archives src_ext/archives && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" ./configure --enable-cold-check --with-0install-solver --with-vendored-deps && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" make lib-ext all && mkdir -p /usr/bin && cp /tmp/opam-build-2.2/opam /usr/bin/opam-2.2 && chmod a+x /usr/bin/opam-2.2 && rm -rf /tmp/opam-build-2.2\nRUN cd /tmp/opam-sources && cp -P -R -p . 
../opam-build-2.3 && cd ../opam-build-2.3 && git fetch -q && git checkout 35acd0c5abc5e66cdbd5be16ba77aa6c33a4c724 && ln -s ../opam/src_ext/archives src_ext/archives && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" ./configure --enable-cold-check --with-0install-solver --with-vendored-deps && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" make lib-ext all && mkdir -p /usr/bin && cp /tmp/opam-build-2.3/opam /usr/bin/opam-2.3 && chmod a+x /usr/bin/opam-2.3 && rm -rf /tmp/opam-build-2.3\nRUN cd /tmp/opam-sources && cp -P -R -p . ../opam-build-master && cd ../opam-build-master && git fetch -q && git checkout 4267ade09ac42c1bd0b84a5fa61af8ccdaadef48 && ln -s ../opam/src_ext/archives src_ext/archives && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" ./configure --enable-cold-check --with-0install-solver --with-vendored-deps && env PATH=\"/tmp/opam/bootstrap/ocaml/bin:$PATH\" make lib-ext all && mkdir -p /usr/bin && cp /tmp/opam-build-master/opam /usr/bin/opam-master && chmod a+x /usr/bin/opam-master && rm -rf /tmp/opam-build-master\n</code></pre></div></div>\n\n<p>Furthermore, after changing the opam version, we must run <code>opam init --reinit -ni</code>, which is an <em>expensive</em> command. If the base images defaulted to the current version, we would have faster builds.</p>\n\n<p>The final benefit, of course, would be that we could set the <code>archive-mirror</code> and reduce the number of transient failures due to network outages.</p>",
+21
mte/2025_05_27_raptor-talos-ii-update.json
···+"summary": "Almost a month ago, I wrote about the onset of unreliability in our Raptor Talos II machines. Since then, I have been working with Raptor Computing to diagnose the issue.",+"content": "<p>Almost a month ago, I wrote about the onset of <a href=\"https://www.tunbury.org/raptor-talos-ii\">unreliability in our Raptor Talos II</a> machines. Since then, I have been working with Raptor Computing to diagnose the issue.</p>\n\n<p>We have two Raptor Talos II machines: <em>Orithia</em> and <em>Scyleia</em>. Each has two processors, for a total of 176 cores, 512GB of RAM, and 2 x 1.8TB NVMe drives. These machines were expensive, so having to power cycle them several times a day was annoying.</p>\n\n<p>I reported the problem as the system freezing. Raptor Support asked me to run <code>stress</code> on the machines while recording the output from <code>sensors</code> from the <code>lm-sensors</code> package. They also asked me to install <code>opal-prd</code>, which outputs logging data to <code>/var/log/opal-prd.log</code>. The output from <code>sensors</code> was unremarkable, and the machines didn\u2019t particularly freeze more often under load than when sitting idle.</p>\n\n<p>Diagnostics then moved to what we were running on the machines. That part was easy as these machines run <a href=\"https://github.com/ocurrent/ocluster\">OCluster</a>/<a href=\"https://github.com/ocurrent/obuilder\">OBuilder</a>, which we run across all of our workers. Raptor Support suspected an out-of-memory condition, but they were perplexed by the lack of an error report on the XMON debug console.</p>\n\n<p>Raptor Support provided access to a Talos II machine in their datacenter. As our configuration is held in Ansible Playbooks, it was simple to deploy to the test machine. The machine was much smaller than ours: 64GB of RAM, 460GB NVMe. This limited the number of concurrent OBuilder jobs to about 16. 
We run our machines at 44 using the rudimentary <code>nproc / 4</code> calculation. The loan machine was solid; ours still froze frequently.</p>\n\n<p>Raptor Support asked an insightful question about the system state after the freeze. As I am remote from the machine, it\u2019s hard to tell whether it is on or not. The BMC reported that the machine was on. However, when I inspected the state physically, the power indicator light on the front panel was off, and the indicator lights on the PSU were amber. In the image, the top system is powered off.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-talos-ii-front-panel.png\"></p>\n\n<p>Issuing these <code>i2cget</code> commands via the BMC console allowed the cause of the power-off event to be determined.</p>\n\n<div><div><pre><code>bmc-orithia:~# i2cget <span>-y</span> 12 0x31 0x07\n0x2e\nbmc-orithia:~# i2cget <span>-y</span> 12 0x31 0x18\n0x00\nbmc-orithia:~# i2cget <span>-y</span> 12 0x31 0x19\n0x02\n</code></pre></div></div>\n\n<p>Using the BMC, you can query the power status using <code>obmcutil power</code> and power on and off the system using <code>obmcutil poweron</code> and <code>obmcutil poweroff</code>, respectively.</p>\n\n<blockquote>\n <p>The indication is one of the power rails (VCS for CPU1) dropping offline, which causes a full system power off to ensure further hardware damage does not occur. This would be a hardware fault, and is either a failing regulator on the mainboard or a failing CPU shorting out the VCS B power rail. \u2026 There is a chance the actual problem is instability in the +12V rail from the PDU.</p>\n</blockquote>\n\n<p>The suggested course of action was to try powering the system using a standard 1000W ATX power supply, which would isolate whether the supply was the root cause of the failure. 
Raptor Support confirmed that, provided the plastic air guide is in place inside the chassis, there should be sufficient airflow to run the test for an extended period.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-talos-ii-with-atx.jpg\"></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-talos-ii-with-atx-running.jpg\"></p>\n\n<p>After an hour or so of running, the system spontaneously rebooted, so I decided to stop the test to avoid possible damage.</p>\n\n<blockquote>\n <p>The next step would be to swap CPU0 on Scyleia with CPU1 on Orithia, to determine if the CPU itself may be at fault. CPU0 is nearest the rear connectors, while CPU1 is nearest the chassis fans.</p>\n</blockquote>\n\n<p>Orithia CPU</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-talos-ii-orithia-cpu-screwdriver.jpg\"></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-talos-ii-orithia-cpu-removed.jpg\"></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-talos-ii-orithia-cpu.jpg\"></p>\n\n<p>Scyleia CPU</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/raptor-talos-ii-scyleia-cpu-screwdriver.jpg\"></p>\n\n<p>Following the CPU swap, both systems have been stable for over 30 hours.</p>",
+21
mte/2025_05_28_opam2web.json
···+"summary": "We maintain a mirror (archive) of all opam packages. To take advantage of this, add the archive mirror to opam by setting the global option.",+"content": "<p>We maintain a mirror (archive) of all opam packages. To take advantage of this, add the archive mirror to opam by setting the global option.</p>\n\n<div><div><pre><code>opam option <span>--global</span> <span>'archive-mirrors+=\"https://opam.ocaml.org/cache\"'</span>\n</code></pre></div></div>\n\n<h1>How is the mirror generated and maintained?</h1>\n\n<p>opam has a command that generates the mirror, which defaults to reading <code>packages</code> from the current directory.</p>\n\n<div><div><pre><code>opam admin cache <span>--link</span><span>=</span>archives ./cache\n</code></pre></div></div>\n\n<div>\nsequenceDiagram\n participant BIB as Base Image Builder\n participant DH as Docker Hub\n participant O2W as opam2web\n\n Note over DH: ocaml/opam:archive\n DH-->>BIB: Pull ocaml/opam:archive\n\n Note over BIB: opam admin cache\n BIB->>DH: Push image\n\n Note over DH: ocaml/opam:archive\n DH->>O2W: Pull ocaml/opam:archive\n\n Note over O2W: opam admin cache\n Note over O2W: Publish https://opam.ocaml.org/cache\n</div>\n\n<p>The base image builder pulls <code>ocaml/opam:archive</code>, runs <code>opam admin cache</code> to update the cache, and then pushes it back as <code>ocaml/opam:archive</code>.</p>\n\n<p>opam2web, which publishes <a href=\"https://opam.ocaml.org\">opam.ocaml.org</a>, pulls <code>ocaml/opam:archive</code>, then runs <code>opam admin cache</code> to populate any new items in the cache, and makes the cache available at <a href=\"https://opam.ocaml.org/cache\">https://opam.ocaml.org/cache</a>.</p>\n\n<p>Until today, the step indicated by the dotted line was missing. 
Kate had pointed this out as long ago as 2023 with <a href=\"https://github.com/ocurrent/docker-base-images/issues/249\">issue #249</a> and <a href=\"https://github.com/ocurrent/docker-base-images/pull/248\">PR #248</a>, but, for whatever reason, this was never actioned.</p>\n\n<p>With the current unavailability of <a href=\"http://camlcity.org\">camlcity.org</a>, this has become a problem. On Monday, I patched opam2web\u2019s <code>Dockerfile</code> to include access to the mirror/cache, which allowed opam2web to build. However, subsequent builds failed because the updated <a href=\"https://opam.ocaml.org\">opam.ocaml.org</a> used the latest version of <code>ocaml/opam:archive</code>. This was built on Sunday when camlcity.org was down; therefore, the source for <code>ocamlfind</code> had been dropped from the mirror.</p>\n\n<h1>How do we get out of this problem?</h1>\n\n<p>Updating the base image builder does not fix the problem, as camlcity.org is still down and the current <code>ocaml/opam:archive</code> does not contain the missing packages. We only tag the latest version on Docker Hub, but looking through the base image builder logs allowed me to find the SHA256 for last week\u2019s build: <code>ocaml/opam:archive@sha256:a0e2cd50e1185fd9a17a193f52d17981a6f9ccf0b56285cbc07f396d5e3f7882</code></p>\n\n<p>Taking <a href=\"https://github.com/ocurrent/docker-base-images/pull/248\">PR #248</a> and pointing it to the older image, I used the base image builder locally to push an updated <code>ocaml/opam:archive</code>. 
This is <code>ocaml/opam:archive@sha256:fb7b62ee305b0b9fff82748803e57a655ca92130ab8624476cd7af428101a643</code>.</p>\n\n<div><div><pre><code>- from ~alias:\"opam-archive\" \"ocaml/opam:archive\" @@\n+ from ~alias:\"opam-archive\" \"ocaml/opam:archive@sha256:a0e2cd50e1185fd9a17a193f52d17981a6f9ccf0b56285cbc07f396d5e3f7882\" @@\n</code></pre></div></div>\n\n<p>Now I need to update opam.ocaml.org, but <code>opam2web</code> doesn\u2019t build due to the missing <code>ocamlfind</code>. Checking the <code>opam</code> file showed two source files are needed. One is on GitHub so that\u2019ll be ok.</p>\n\n<div><div><pre><code>...\nurl {\n src: \"http://download.camlcity.org/download/findlib-1.9.6.tar.gz\"\n checksum: [\n \"md5=96c6ee50a32cca9ca277321262dbec57\"\n \"sha512=cfaf1872d6ccda548f07d32cc6b90c3aafe136d2aa6539e03143702171ee0199add55269bba894c77115535dc46a5835901a5d7c75768999e72db503bfd83027\"\n ]\n}\navailable: os != \"win32\"\nextra-source \"0001-Harden-test-for-OCaml-5.patch\" {\n src:\n \"https://raw.githubusercontent.com/ocaml/opam-source-archives/main/patches/ocamlfind/0001-Harden-test-for-OCaml-5.patch\"\n checksum: [\n \"sha256=6fcca5f2f7abf8d6304da6c385348584013ffb8602722a87fb0bacbab5867fe8\"\n \"md5=3cddbf72164c29d4e50e077a92a37c6c\"\n ]\n}\n</code></pre></div></div>\n\n<p>Luck was on my side, as <code>find ~/.opam/download-cache/ -name 96c6ee50a32cca9ca277321262dbec57</code> showed that I had the source in my local opam download cache. 
I checked out opam2web, copied in the file <code>96c6ee50a32cca9ca277321262dbec57</code> and patched the <code>Dockerfile</code> to inject it into the cache:</p>\n\n<div><div><pre><code>diff --git i/Dockerfile w/Dockerfile\nindex eaf0567..84c9db8 100644\n--- i/Dockerfile\n+++ w/Dockerfile\n@@ -34,6 +34,7 @@ RUN sudo mkdir -p /usr/local/bin \\\n && sudo chmod a+x /usr/local/bin/man2html\n RUN sudo mv /usr/bin/opam-2.3 /usr/bin/opam && opam update\n RUN opam option --global 'archive-mirrors+=\"https://opam.ocaml.org/cache\"'\n+COPY 96c6ee50a32cca9ca277321262dbec57 /home/opam/.opam/download-cache/md5/96/96c6ee50a32cca9ca277321262dbec57\n RUN opam install odoc\n RUN git clone https://github.com/ocaml/opam --single-branch --depth 1 --branch master /home/opam/opam\n WORKDIR /home/opam/opam\n</code></pre></div></div>\n\n<p>The final step is to build and deploy an updated opam2web incorporating the updated mirror cache. In conjunction with the updated base image builder, this will be self-sustaining. 
I wrapped the necessary steps into a <code>Makefile</code>.</p>\n\n<div><div><pre><code><span>OPAM_REPO_GIT_SHA</span> <span>:=</span> <span>$(</span><span>shell</span> git <span>-C</span> ~/opam-repository fetch upstream <span>&&</span> git <span>-C</span> ~/opam-repository rev-parse upstream/master<span>)</span>\n<span>BLOG_GIT_SHA</span> <span>:=</span> bdef1bbf939db6797dcd51faef2ea9ac1826f4a5\n<span>OPAM_GIT_SHA</span> <span>:=</span> 46234090daf4f9c5f446af56a50f78809c04a20a\n\n<span>all</span><span>:</span> <span>opam2web</span>\n <span>cd</span> <span>opam2web</span> <span>&&</span> <span>docker</span> <span>--context</span> <span>registry.ci.dev</span> <span>build</span> <span>--pull</span> <span>\\</span>\n <span>--build-arg</span> <span>OPAM_REPO_GIT_SHA</span><span>=</span><span>$(OPAM_REPO_GIT_SHA)</span> <span>\\</span>\n <span>--build-arg</span> <span>BLOG_GIT_SHA</span><span>=</span><span>$(BLOG_GIT_SHA)</span> <span>\\</span>\n <span>--build-arg</span> <span>OPAM_GIT_SHA</span><span>=</span><span>$(OPAM_GIT_SHA)</span> <span>\\</span>\n <span>-f</span> Dockerfile <span>--iidfile</span> ../docker-iid <span>--</span> .\n <span>@</span><span>SHA256</span><span>=</span><span>$$</span><span>(</span><span>cat </span>docker-iid<span>)</span>\n <span>docker --context registry.ci.dev tag $$SHA256 registry.ci.dev/opam.ocaml.org</span><span>:</span><span>live</span>\n <span>docker</span> <span>--context</span> <span>registry.ci.dev</span> <span>login</span> <span>-u</span> <span>$(USERNAME)</span> <span>-p</span> <span>$(PASSWORD)</span> <span>registry.ci.dev</span>\n <span>docker --context registry.ci.dev push registry.ci.dev/opam.ocaml.org</span><span>:</span><span>live</span>\n <span>docker --context opam-4.ocaml.org pull registry.ci.dev/opam.ocaml.org</span><span>:</span><span>live</span>\n <span>docker</span> <span>--context</span> <span>opam-4.ocaml.org</span> <span>service</span> <span>update</span> <span>infra_opam_live</span> <span>--image</span> 
<span>$$SHA256</span>\n <span>docker --context opam-5.ocaml.org pull registry.ci.dev/opam.ocaml.org</span><span>:</span><span>live</span>\n <span>docker</span> <span>--context</span> <span>opam-5.ocaml.org</span> <span>service</span> <span>update</span> <span>infra_opam_live</span> <span>--image</span> <span>$$SHA256</span>\n\n<span>opam2web</span><span>:</span>\n <span>git clone --recursive \"https</span><span>:</span><span>//github.com/ocaml-opam/opam2web.git\" -b \"live\"</span>\n</code></pre></div></div>\n\n<p>Check that <code>ocamlfind</code> is included in the new cache</p>\n\n<div><div><pre><code>wget https://opam-4.ocaml.org/cache/md5/96/96c6ee50a32cca9ca277321262dbec57\nwget https://opam-5.ocaml.org/cache/md5/96/96c6ee50a32cca9ca277321262dbec57\n\n</code></pre></div></div>",
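As an aside on the cache layout used in the checks above: the opam download cache stores each archive under `<hash-kind>/<first-two-hex-digits>/<digest>`, which is why the `findlib` tarball lives at `md5/96/96c6ee50a32cca9ca277321262dbec57`. A trivial sketch of that mapping (the helper name is illustrative):

```python
# Reproduce the opam cache path layout: <kind>/<first two hex chars>/<digest>.

def cache_path(kind: str, hexdigest: str) -> str:
    return f"{kind}/{hexdigest[:2]}/{hexdigest}"

print(cache_path("md5", "96c6ee50a32cca9ca277321262dbec57"))
# md5/96/96c6ee50a32cca9ca277321262dbec57
```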
+21
mte/2025_05_29_overlayfs.json
···+"summary": "OBuilder takes a build script (similar to a Dockerfile) and performs the steps in it in a sandboxed environment. After each step, OBuilder uses the snapshot feature to store the state of the build as a layer. Repeating a build will reuse the cached results where possible.",+"content": "<p><a href=\"https://github.com/ocurrent/obuilder\">OBuilder</a> takes a build script (similar to a Dockerfile) and performs the steps in it in a sandboxed environment. After each step, OBuilder uses the snapshot feature to store the state of the build as a <code>layer</code>. Repeating a build will reuse the cached results where possible.</p>\n\n<p>Depending upon the platform, different snapshot systems can be used along with different sandboxes. The tables below give a cross-section of the supported configurations.</p>\n\n<h1>Sandboxes</h1>\n\n\n\n \n \n \u00a0\n RUNC\n QEMU\n Jails\n Docker\n User Isolation\n \n \n \n \n Linux\n \u2705\n \u2705\n \u274c\n \u2705\n \u274c\n \n \n FreeBSD\n \u274c\n \u274c\n \u2705\n \u274c\n \u274c\n \n \n Windows\n \u274c\n \u274c\n \u274c\n \u2705\n \u274c\n \n \n macOS\n \u274c\n \u274c\n \u274c\n \u274c\n \u2705\n \n \n\n\n<ul>\n <li>QEMU support could be extended to other platforms; however, the real limitation is which operating systems can be run in a QEMU virtual machine.</li>\n <li>User isolation could be implemented on Windows.</li>\n</ul>\n\n<h1>Snapshots</h1>\n\n\n\n \n \n \u00a0\n Linux\n FreeBSD\n Windows\n macOS\n \n \n \n \n Docker\n \u2705\n \u274c\n \u2705\n \u274c\n \n \n ZFS\n \u2705\n \u2705\n \u274c\n \u2705\n \n \n BTRFS\n \u2705\n \u274c\n \u274c\n \u274c\n \n \n XFS\n \u2705\n \u274c\n \u274c\n \u274c\n \n \n OVERLAYFS\n \u2705\n \u274c\n \u274c\n \u274c\n \n \n RSYNC\n \u2705\n \u2705\n \u274c\n \u2705\n \n \n\n\n<ul>\n <li>QEMU uses <code>qemu-img</code> to perform snapshots.</li>\n</ul>\n\n<p>Our default implementation is to use BTRFS, as this outperforms 
ZFS. ZFS snapshots and XFS reflinks perform similarly. <code>rsync</code> performs badly, but is a useful reference case as it runs on any native filesystem.</p>\n\n<p>OverlayFS can be run on top of any filesystem, but the interesting case is running it on top of TMPFS. This is the fastest configuration for any system with enough RAM. Until this week, I had never tested this beyond AMD64; however, with the recent problems on the Talos II machines, I had the opportunity to experiment with different configurations on POWER9.</p>\n\n<div><div><pre><code>ocluster-worker -c pool.cap --name=scyleia --obuilder-store=overlayfs:/var/cache/obuilder --capacity=22 ...\nocluster-worker -c pool.cap --name=orithia --obuilder-store=btrfs:/var/cache/obuilder --capacity=22 ...\n</code></pre></div></div>\n\n<p>Comparing my favourite metric of the number of jobs accepted per hour shows that OverlayFS on TMPFS is twice as fast as BTRFS. Scyleia had TMPFS configured at 400GB. Orithia had BTRFS on a dedicated 1.8TB NVMe.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/jobs-accepted-per-hour-orithia-scyleia.png\"></p>\n\n<p>This side-by-side graphic showing <code>btop</code> running on both systems gives a good look at what is happening. I/O is saturated on the NVMe, preventing the CPUs from getting the needed data, while the RAM footprint is tiny. Conversely, TMPFS consumes 50% of the RAM, with most cores working flat out.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/btop-orithia-scyleia.png\"></p>\n\n<p>I found that TMPFS can run out of inodes just like a regular filesystem. You can specify the number of inodes in <code>/etc/fstab</code>.</p>\n\n<div><div><pre><code>tmpfs /var/cache/obuilder tmpfs noatime,size=400g,nr_inodes=10000000 0 1\n</code></pre></div></div>",
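The `nr_inodes` option above has to be sized by hand. A rough sizing sketch, where the 2x headroom factor and the expected file count are my own assumptions rather than kernel guidance:

```python
# Build the tmpfs fstab line from an expected file count: tmpfs needs one
# inode per file or directory, so size nr_inodes from the expected file
# count with some headroom (the 2x factor here is an assumption).

def tmpfs_fstab_line(mountpoint, size_gb, expected_files, headroom=2):
    nr_inodes = expected_files * headroom
    return (f"tmpfs {mountpoint} tmpfs "
            f"noatime,size={size_gb}g,nr_inodes={nr_inodes} 0 1")

print(tmpfs_fstab_line("/var/cache/obuilder", 400, 5_000_000))
# tmpfs /var/cache/obuilder tmpfs noatime,size=400g,nr_inodes=10000000 0 1
```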
+21
mte/2025_06_02_update-opam-repo-ci.json
···+"summary": "This is a high-level view of the steps required to update opam-repo-ci to use a new OCaml version.",+"content": "<p>This is a high-level view of the steps required to update <a href=\"https://opam.ci.ocaml.org\">opam-repo-ci</a> to use a new OCaml version.</p>\n\n<p><a href=\"https://github.com/ocurrent/opam-repo-ci\">opam-repo-ci</a> uses Docker images as the container\u2019s root file system. The <a href=\"https://images.ci.ocaml.org\">base image builder</a> creates and maintains these images using <a href=\"https://github.com/ocurrent/ocaml-dockerfile\">ocurrent/ocaml-dockerfile</a>. Both applications use the <a href=\"https://github.com/ocurrent/ocaml-version\">ocurrent/ocaml-version</a> library as the definitive list of OCaml versions.</p>\n\n<p>1. Update <a href=\"https://github.com/ocurrent/ocaml-version\">ocurrent/ocaml-version</a></p>\n\n<p>Create a PR for changes to <a href=\"https://github.com/ocurrent/ocaml-version/blob/master/ocaml_version.ml\">ocaml_version.ml</a> with the details of the new release.</p>\n\n<p>2. Create and publish a new release of <code>ocurrent/ocaml-version</code></p>\n\n<p>Create the new release on GitHub and publish it to <code>ocaml/opam-repository</code> using <code>opam</code>, e.g.</p>\n\n<div><div><pre><code>opam publish <span>--tag</span> v4.0.1 https://github.com/ocurrent/ocaml-version/releases/download/v4.0.1/ocaml-version-4.0.1.tbz\n</code></pre></div></div>\n\n<p>3. Update <a href=\"https://github.com/ocurrent/docker-base-images\">ocurrent/docker-base-images</a></p>\n\n<p>The change required is to update the opam repository SHA in the <a href=\"https://github.com/ocurrent/docker-base-images/blob/master/Dockerfile\">Dockerfile</a> to pick up the latest version of <a href=\"https://github.com/ocurrent/ocaml-version\">ocurrent/ocaml-version</a>.</p>\n\n<p>Run <code>dune runtest --auto-promote</code> to update the <code>builds.expected</code> file. 
Create a PR for these changes.</p>\n\n<p>When the PR is pushed to the <code>live</code> branch <a href=\"https://deploy.ci.ocaml.org/?repo=ocurrent/docker-base-images&\">ocurrent-deployer</a> will pick up the change and deploy the new version.</p>\n\n<p>4. Wait for the base images to build</p>\n\n<p>The <a href=\"https://images.ci.ocaml.org\">base image builder</a> refreshes the base images every seven days. Wait for the cycle to complete and the new images to be pushed to Docker Hub.</p>\n\n<p>5. Update <a href=\"https://github.com/ocurrent/opam-repo-ci\">ocurrent/opam-repo-ci</a></p>\n\n<p>Update the opam repository SHA in the <a href=\"https://github.com/ocurrent/opam-repo-ci/blob/master/Dockerfile\">Dockerfile</a>. Update the <a href=\"https://github.com/ocurrent/opam-repo-ci/blob/master/doc/platforms.md\">doc/platforms.md</a> and <a href=\"https://github.com/ocurrent/opam-repo-ci/blob/master/test/specs.expected\">test/specs.expected</a> using the following two commands.</p>\n\n<div><div><pre><code>dune build @doc\ndune runtest <span>--auto-promote</span>\n</code></pre></div></div>\n\n<p>Create a PR for this update. When the PR is pushed to the <code>live</code> branch <a href=\"https://deploy.ci.ocaml.org/?repo=ocurrent/opam-repo-ci\">ocurrent-deployer</a> will pick up the change and deploy the new version.</p>",
+21
mte/2025_06_03_inveniordm.json
···+"summary": "Zenodo describes itself as a thin layer on top of the Invenio framework, which states that the bulk of the current development effort is on the InvenioRDM project. There is a demonstration instance hosted by CERN. Along with the web interface, there is a comprehensive API.",+"content": "<p><a href=\"https://github.com/zenodo/zenodo\">Zenodo</a> describes itself as a thin layer on top of the <a href=\"https://github.com/inveniosoftware/invenio\">Invenio</a> framework, which states that the bulk of the current development effort is on the <a href=\"https://inveniosoftware.org/products/rdm/\">InvenioRDM project</a>. There is a demonstration <a href=\"https://inveniordm.web.cern.ch\">instance</a> hosted by CERN. Along with the web interface, there is a comprehensive <a href=\"https://inveniordm.docs.cern.ch/install/run/\">API</a>.</p>\n\n<p>The quick start <a href=\"https://inveniordm.docs.cern.ch/install/\">documentation</a> guides you through the setup, which is summarized by</p>\n\n<div><div><pre><code>pip <span>install </span>invenio-cli\ninvenio-cli init rdm <span>-c</span> v12.0\n<span>cd </span>my-site\ninvenio-cli containers start <span>--lock</span> <span>--build</span> <span>--setup</span>\n</code></pre></div></div>\n\n<p>I\u2019m a Python noob, so getting this running wasn\u2019t easy (for me). Using an Ubuntu 22.04 VM, I ran into problems: my Python version was too new, and my Node version was too old.</p>\n\n<p>Using Ubuntu 24.04 gave me a supported Node version, > v18, but only NPM version 9.2, when I needed > 10. The bundled Python was 3.12, when I needed 3.9.</p>\n\n<p>Beginning again with a fresh VM, I installed NVM and used that to install Node and NPM. 
This gave me Node v24.1.0 and NPM v11.3.0.</p>\n\n<div><div><pre><code>curl <span>-o-</span> https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash\nnvm <span>install </span>node\n</code></pre></div></div>\n\n<p>To get Python 3.9, I found I could use the <em>deadsnakes</em> PPA repository, but I decided not to, as it didn\u2019t give me the necessary virtual environment setup. Possibly it does, and I just don\u2019t know how!</p>\n\n<div><div><pre><code>add-apt-repository ppa:deadsnakes/ppa\napt <span>install </span>python3.9 python3.9-distutils\n</code></pre></div></div>\n\n<p>Instead, I went with <code>pyenv</code>.</p>\n\n<div><div><pre><code>curl https://pyenv.run | bash\n<span>echo</span> <span>-e</span> <span>'export PYENV_ROOT=\"$HOME/.pyenv\"\\nexport PATH=\"$PYENV_ROOT/bin:$PATH\"'</span> <span>>></span> ~/.bashrc\n<span>echo</span> <span>-e</span> <span>'eval \"$(pyenv init --path)\"\\neval \"$(pyenv init -)\"'</span> <span>>></span> ~/.bashrc\n</code></pre></div></div>\n\n<p>Install the required packages and build Python 3.9.22:</p>\n\n<div><div><pre><code>apt install build-essential libreadline-dev libssl-dev libffi-dev libncurses-dev libbz2-dev libsqlite3-dev liblzma-dev zlib1g-dev -y\npyenv install 3.9.22\npyenv global 3.9.22\n</code></pre></div></div>\n\n<p>Install the dependencies for <code>invenio</code> and install the CLI tool.</p>\n\n<div><div><pre><code>apt <span>install </span>docker.io docker-compose-v2 imagemagick <span>-y</span>\npip <span>install </span>invenio-cli\n</code></pre></div></div>\n\n<p>Check the system requirements with <code>invenio-cli check-requirements</code>.</p>\n\n<div><div><pre><code>Checking pre-requirements...\nChecking Python version...\nPython version OK. Got 3.9.22.\nChecking Pipenv is installed...\nPipenv OK. Got version 2025.0.3.\nChecking Docker version...\nDocker version OK. Got 27.5.1.\nChecking Docker Compose version...\nDocker Compose version OK. 
Got 2.33.0.\nAll requisites are fulfilled.\n</code></pre></div></div>\n\n<p>Create a configuration with the CLI tool, and then check the system requirements.</p>\n\n<div><div><pre><code>invenio-cli init rdm <span>-c</span> v12.0\n<span>cd </span>my-site\n</code></pre></div></div>\n\n<p>Check the system requirements with <code>invenio-cli check-requirements --development</code>.</p>\n\n<div><div><pre><code>Checking pre-requirements...\nChecking Python version...\nPython version OK. Got 3.9.22.\nChecking Pipenv is installed...\nPipenv OK. Got version 2025.0.3.\nChecking Docker version...\nDocker version OK. Got 27.5.1.\nChecking Docker Compose version...\nDocker Compose version OK. Got 2.33.0.\nChecking Node version...\nNode version OK. Got 24.1.0.\nChecking NPM version...\nNPM version OK. Got 11.3.0.\nChecking ImageMagick version...\nImageMagick version OK. Got 6.9.12.\nChecking git version...\ngit version OK. Got 2.43.0.\nAll requisites are fulfilled.\n</code></pre></div></div>\n\n<p>Edit the <code>Pipfile</code> and add these two lines.</p>\n\n<div><div><pre><code>[packages]\nsetuptools = \"<80.8.0\"\nflask-admin = \"<=1.6.1\"\n</code></pre></div></div>\n\n<p>Recent versions of <code>setuptools</code> emit a deprecation warning, so the build isn\u2019t clean; the pin restricts the version to before the warning was added. Without the <code>flask-admin</code> restriction, the build fails with this error.</p>\n\n<div><div><pre><code>File \"/usr/local/lib/python3.9/site-packages/invenio_admin/ext.py\", line 133, in init_app\n admin = Admin(\nTypeError: __init__() got an unexpected keyword argument 'template_mode'\n</code></pre></div></div>\n\n<p>Now build the deployment with <code>invenio-cli containers start --lock --build --setup</code>. This takes a fair amount of time, but at the end you can connect to https://127.0.0.1.</p>",
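The checks performed by `invenio-cli check-requirements` boil down to version comparisons like the sketch below. This is an illustration of the idea, not invenio-cli's actual code; the `meets` helper is hypothetical, and the minimum versions mirror the ones that bit me above (Node > 18, NPM > 10).

```python
# Illustrative version-gate check in the style of check-requirements:
# compare dotted version strings componentwise as integer tuples.

def version_tuple(v: str):
    return tuple(int(part) for part in v.split("."))

def meets(installed: str, minimum: str) -> bool:
    return version_tuple(installed) >= version_tuple(minimum)

# Installed versions from the check-requirements output above; minimums
# are my assumed thresholds.
checks = {
    "node": ("24.1.0", "18.0.0"),
    "npm": ("11.3.0", "10.0.0"),
    "python": ("3.9.22", "3.9.0"),
}
for tool, (installed, minimum) in checks.items():
    assert meets(installed, minimum), f"{tool} too old"
```

This also shows why Ubuntu 24.04's NPM 9.2 failed the gate while NVM's 11.3.0 passed.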
+21
mte/2025_06_04_gps-clock.json
···+"summary": "Jeff Geerling recently posted on Level 2 Jeff about a GPS clock from Mitxela. This reminded me of a project I did in the early days of the first COVID lockdown. I dug it out, and it still works. After powering on, it took around 60 seconds to find a signal and display the time - not bad for being in a box for 5 years.",+"content": "<p>Jeff Geerling recently posted on <a href=\"https://www.youtube.com/@Level2Jeff/videos\">Level 2 Jeff</a> about a <a href=\"https://www.youtube.com/watch?v=aBDgD032DEI\">GPS clock</a> from Mitxela. This reminded me of a project I did in the early days of the first COVID lockdown. I dug it out, and it still works. After powering on, it took around 60 seconds to find a signal and display the time - not bad for being in a box for 5 years.</p>\n\n<p>Here\u2019s a basic diagram showing the connections. I used an Arduino Nano and a UBlox NEO-M8N-0-10 GPS receiver. The UBlox is connected to the Nano\u2019s hardware serial port, the synchronisation pulse to pin D2, and the MAX7219 8 x 7-segment display to the Nano\u2019s SPI interface.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/gps-clock-circuit.png\"></p>\n\n<p>The time pulse function can be configured using the <a href=\"https://www.tunbury.org/images/u-blox8-M8_ReceiverDescrProtSpec_UBX-13003221.pdf\">UBX-CFG-TP5</a> message. I configured a 100Hz pulse to be handled by the interrupt service routine to increment the time in centiseconds. Furthermore, I configured a <a href=\"https://www.tunbury.org/images/u-blox8-M8_ReceiverDescrProtSpec_UBX-13003221.pdf\">UBX-TIM-TP</a> time stamp message to be generated 10 times per second. 
After the time stamp message is sent on the serial port, the next pulse indicates that the time should be set.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/ubx-tim-tp.png\"></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/gps-clock-top.jpg\"></p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/gps-clock-bottom.jpg\"></p>",
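UBX configuration messages such as UBX-CFG-TP5 carry a two-byte checksum computed with the 8-bit Fletcher algorithm over the class, ID, length, and payload bytes. As a sketch (in Python rather than the project's Arduino C), using an empty-payload poll of CFG-TP5, whose class/ID bytes are 0x06/0x31:

```python
def ubx_checksum(frame: bytes) -> tuple:
    """8-bit Fletcher checksum used by the UBX protocol.

    `frame` covers the class byte, ID byte, the two little-endian
    length bytes, and the message payload (sync chars excluded).
    """
    ck_a = ck_b = 0
    for b in frame:
        ck_a = (ck_a + b) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return ck_a, ck_b

# Empty-payload poll of UBX-CFG-TP5: class 0x06, ID 0x31, length 0.
ck_a, ck_b = ubx_checksum(bytes([0x06, 0x31, 0x00, 0x00]))
print(hex(ck_a), hex(ck_b))
```

The same routine verifies incoming UBX-TIM-TP frames before trusting their timestamp.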
+21
mte/2025_06_04_libvirt-moves.json
···+"summary": "I need to migrate some libvirt/qemu machines from one host to another. These workloads can easily be stopped for a few minutes while the move happens.",+"content": "<p>I need to migrate some libvirt/qemu machines from one host to another. These workloads can easily be stopped for a few minutes while the move happens.</p>\n\n<p>1. Identify the names of the VMs which are going to be moved. If the machines have already been shut down, then adding <code>--all</code> will list them.</p>\n\n<div><div><pre><code><span># virsh list</span>\n</code></pre></div></div>\n\n<p>2. Shut down the machine, either by connecting to it and issuing a <code>poweroff</code> command or by sending the shutdown request via <code>virsh</code>. You can verify that it is powered off with <code>virsh domstate vm_name</code>.</p>\n\n<div><div><pre><code><span># virsh shutdown vm_name</span>\n</code></pre></div></div>\n\n<p>3. Export the configuration of the machine.</p>\n\n<div><div><pre><code><span># virsh dumpxml vm_name > vm_name.xml</span>\n</code></pre></div></div>\n\n<p>4. List the block devices attached to the machine.</p>\n\n<div><div><pre><code><span># virsh domblklist vm_name</span>\n</code></pre></div></div>\n\n<p>Then for each block device, check for any backing files using <code>qemu-img</code>. Backing files are caused by snapshots or by building multiple machines from a single master image.</p>\n\n<div><div><pre><code>qemu-img info image.qcow2\n</code></pre></div></div>\n\n<p>5. Transfer the files to the new machine. This could be done via <code>scp</code>, but in my case I\u2019m going to use <code>nc</code>. 
On the target machine I\u2019ll run this (port 5678 is an arbitrary choice).</p>\n\n<div><div><pre><code><span># nc -l 5678 | tar -xvf -</span>\n</code></pre></div></div>\n\n<p>And on the source machine, I\u2019ll send the files to the target machine at IP 1.2.3.4 (replace with the actual IP) on port 5678.</p>\n\n<div><div><pre><code><span># tar -cf - *.qcow2 *.xml | nc 1.2.3.4 5678</span>\n</code></pre></div></div>\n\n<p>6. On the target machine, the VM now needs to be <em>defined</em>. This is done by importing the XML file exported from the original machine. To keep things simple, my disk images are in the same paths on the source and target machines. If not, edit the XML file before the import to reflect the new disk locations.</p>\n\n<div><div><pre><code><span># virsh define vm_name.xml</span>\n</code></pre></div></div>\n\n<p>7. Start the VM.</p>\n\n<div><div><pre><code><span># virsh start vm_name</span>\n</code></pre></div></div>\n\n<p>8. Delete the source VM. On the <em>source</em> machine, run this command.</p>\n\n<div><div><pre><code><span># virsh undefine vm_name --remove-all-storage</span>\n</code></pre></div></div>\n\n<p>9. Open a remote console.</p>\n\n<p>If things have gone wrong, it may be necessary to look at the console of the machine. If you are remote from both host machines, this can be achieved using an <code>ssh</code> tunnel.</p>\n\n<p>Determine the VNC port number being used by your VM.</p>\n\n<div><div><pre><code><span># virsh vncdisplay vm_name</span>\n127.0.0.1:8\n</code></pre></div></div>\n\n<p>In the above output, <code>:8</code> tells us that the VNC port number is <code>5908</code>. Create the SSH tunnel like this:</p>\n\n<div><div><pre><code><span># ssh -L 5908:127.0.0.1:5908 fqdn.remote.host</span>\n</code></pre></div></div>\n\n<p>Once the <code>ssh</code> connection is established, open your favourite VNC viewer on your machine and connect to <code>127.0.0.1:5908</code>.</p>",
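The display-number-to-port arithmetic in step 9 can be captured in a couple of lines. This sketch assumes the conventional VNC mapping of port = 5900 + display number, which is what `virsh vncdisplay` output follows:

```python
def vnc_port(display: str) -> int:
    """Convert `virsh vncdisplay` output such as '127.0.0.1:8' to a TCP port.

    Assumes the conventional VNC mapping of port = 5900 + display number.
    """
    host, _, number = display.rpartition(":")
    return 5900 + int(number)

print(vnc_port("127.0.0.1:8"))  # 5908
```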
+21
mte/2025_06_07_claude-animates-in-ocaml.json
···+"summary": "In the week, Jon mentioned UTM, which uses Apple\u2019s Hypervisor virtualisation framework to run ARM64 operating systems on Apple Silicon. It looked awesome, and the speed of virtualised macOS was fantastic. It also offers x86_64 emulation; we mused about how well it would perform running Windows, but found it disappointing.",+"content": "<p>In the week, Jon mentioned <a href=\"https://mac.getutm.app\">UTM</a>, which uses Apple\u2019s Hypervisor virtualisation framework to run ARM64 operating systems on Apple Silicon. It looked awesome, and the speed of virtualised macOS was fantastic. It also offers x86_64 emulation; we mused about how well it would perform running Windows, but found it disappointing.</p>\n\n<p>I was particularly interested in this because I am stuck in the past with macOS Monterey on my Intel Mac Pro \u2018trashcan\u2019, as I have a niche Windows application that I can\u2019t live without. A few years ago, I got a prototype, written in Swift, up and running. I never finished it as other events got in the way. The learning curve of <a href=\"https://youtu.be/8Jb3v2HRv_E\">SceneKit and Blender</a> was intense. I still had the Collada files on my machine and today, of course, we have Claude.</p>\n\n<p>\u201cHow would I animate a Collada (.dae) file using OCaml?\u201d Claude acknowledged the complexity, proposed that <code>lablgl</code>, the OCaml bindings for OpenGL, would be a good starting point, and then obliged by writing the entire pipeline, giving me opam commands and Dune configuration files.</p>\n\n<p>The code wouldn\u2019t build, so I looked for the API for <code>lablgl</code>. The library seemed old, with no recent activity. I mentioned this to Claude; he was happy to suggest an alternative approach of <code>tgls</code>, thin OpenGL bindings, with <code>tsdl</code>, SDL2 bindings, or the higher-level API from <code>raylib</code>. 
The idea of a high-level API sounded better, so I asked Claude to rewrite it with <code>raylib</code>.</p>\n\n<p>The code had some compilation issues. Claude had proposed <code>Mesh.gen_cube</code>, which didn\u2019t exist. Claude consulted the API documentation and found <code>gen_mesh_cube</code> instead. This went through several iterations, with <code>Model.load</code> becoming <code>load_model</code> and <code>Model.draw_ex</code> becoming <code>draw_model_ex</code>, etc. Twenty-two versions later, the code nearly compiles. This block continued to fail with two issues. The first being <code>Array.find</code> doesn\u2019t exist and the second being that the type inferred for <code>a</code> was wrong. There are two types and they both contain <code>target: string;</code>. I manually fixed this with <code>(a:animation_channel)</code> and used <code>match Array.find_opt ... with</code> instead of the <code>try ... with</code>.</p>\n\n<div><div><pre><code><span>(* Update animations *)</span>\n<span>let</span> <span>update_object_animations</span> <span>objects</span> <span>animations</span> <span>elapsed_time</span> <span>=</span>\n <span>Array</span><span>.</span><span>map</span> <span>(</span><span>fun</span> <span>obj</span> <span>-></span>\n <span>try</span>\n <span>let</span> <span>anim</span> <span>=</span> <span>Array</span><span>.</span><span>find</span> <span>(</span><span>fun</span> <span>a</span> <span>-></span> <span>a</span><span>.</span><span>target</span> <span>=</span> <span>obj</span><span>.</span><span>name</span><span>)</span> <span>animations</span> <span>in</span>\n <span>(* Loop animation *)</span>\n <span>let</span> <span>loop_time</span> <span>=</span> <span>mod_float</span> <span>elapsed_time</span> <span>anim</span><span>.</span><span>duration</span> <span>in</span>\n <span>let</span> <span>new_transform</span> <span>=</span> <span>interpolate_animation</span> <span>anim</span> <span>loop_time</span> <span>in</span>\n <span>{</span> 
<span>obj</span> <span>with</span> <span>current_transform</span> <span>=</span> <span>new_transform</span> <span>}</span>\n <span>with</span>\n <span>Not_found</span> <span>-></span> <span>obj</span>\n <span>)</span> <span>objects</span>\n</code></pre></div></div>\n\n<p>There were still many unused variables, but the code could be built using <code>dune build --release</code>.</p>\n\n<p>Unfortunately, it couldn\u2019t load my Collada file as the load functions were just stubs! Claude duly obliged and wrote a simple XML parser using regular expressions through the <code>Str</code> library, but interestingly suggested that I include <code>xmlm</code> as a dependency. Adding the parser broke the code, and it no longer compiled. The issue was similar to above; the compiler had inferred a type that wasn\u2019t what Claude expected. I fixed this as above. The code also had some issues with the ordering - functions were used before they were defined. Again, this was an easy fix.</p>\n\n<p>The parser still didn\u2019t work, so I suggested ditching the regular expression-based approach and using <code>xmlm</code> instead. This loaded the mesh; it looked bad, but I could see that it was my mesh. However, it still didn\u2019t animate, and I took a wrong turn here. I told Claude that the Collada file contained both the mesh and the animation, but that\u2019s not right. It has been a while since I created the Collada files, and I had forgotten that the animation and the mesh definitions were in different files.</p>\n\n<p>I asked Claude to improve the parser so that it would expect the animation data to be in the same file as the mesh. This is within the specification for Collada, but this was not the structure of my file.</p>\n\n<p>Is there a better approach than dealing with the complexity of writing a Collada XML parser? 
What formats are supported by <code>raylib</code>?</p>\n\n<p>In a new thread, I asked, \u201cUsing OCaml with Raylib, what format should I use for my 3D model and animation data?\u201d Claude suggested GLTF 2.0. As my animation is in Blender, it can be exported in GLTF format. Let\u2019s try it!</p>\n\n<p>Claude used the <code>raylib</code> library to read and display a GLTF file and run the animation. The code was much shorter, but \u2026 it didn\u2019t compile. I wrote to Claude, \u201cThe API for Raylib appears to be different to the one you have used. For example, <code>camera3d.create</code> doesn\u2019t take named parameters, <code>camera3d.perspective</code> should be <code>cameraprojection.perspective</code>, etc.\u201d We set to work, and a dozen versions later, we built it successfully.</p>\n\n<p>It didn\u2019t work, though; the console produced an error over and over:</p>\n\n<div><div><pre><code>Joint attribute data format not supported, use vec4 u8\n</code></pre></div></div>\n\n<p>This looked like a problem with the model. I wondered if my GLTF file was compatible with <code>raylib</code>. I asked Claude if he knew of any validation tools, and he suggested an online viewer. This loaded my file perfectly and animated it in the browser. 
Claude also gave me some simple code to validate, which only loaded the model.</p>\n\n<div><div><pre><code><span>let</span> <span>main</span> <span>()</span> <span>=</span>\n <span>init_window</span> <span>800</span> <span>600</span> <span>\"Static Model Test\"</span><span>;</span>\n <span>let</span> <span>camera</span> <span>=</span> <span>Camera3D</span><span>.</span><span>create</span>\n <span>(</span><span>Vector3</span><span>.</span><span>create</span> <span>25</span><span>.</span><span>0</span> <span>25</span><span>.</span><span>0</span> <span>25</span><span>.</span><span>0</span><span>)</span>\n <span>(</span><span>Vector3</span><span>.</span><span>create</span> <span>0</span><span>.</span><span>0</span> <span>0</span><span>.</span><span>0</span> <span>0</span><span>.</span><span>0</span><span>)</span>\n <span>(</span><span>Vector3</span><span>.</span><span>create</span> <span>0</span><span>.</span><span>0</span> <span>1</span><span>.</span><span>0</span> <span>0</span><span>.</span><span>0</span><span>)</span>\n <span>45</span><span>.</span><span>0</span> <span>CameraProjection</span><span>.</span><span>Perspective</span> <span>in</span>\n\n <span>let</span> <span>model</span> <span>=</span> <span>load_model</span> <span>\"assets/character.gltf\"</span> <span>in</span>\n\n <span>while</span> <span>not</span> <span>(</span><span>window_should_close</span> <span>()</span><span>)</span> <span>do</span>\n <span>begin_drawing</span> <span>()</span><span>;</span>\n <span>clear_background</span> <span>Color</span><span>.</span><span>darkgray</span><span>;</span>\n <span>begin_mode_3d</span> <span>camera</span><span>;</span>\n <span>draw_model</span> <span>model</span> <span>(</span><span>Vector3</span><span>.</span><span>create</span> <span>0</span><span>.</span><span>0</span> <span>0</span><span>.</span><span>0</span> <span>0</span><span>.</span><span>0</span><span>)</span> <span>1</span><span>.</span><span>0</span> 
<span>Color</span><span>.</span><span>white</span><span>;</span>\n <span>draw_grid</span> <span>10</span> <span>1</span><span>.</span><span>0</span><span>;</span>\n <span>end_mode_3d</span> <span>()</span><span>;</span>\n <span>draw_text</span> <span>\"Static Model Test\"</span> <span>10</span> <span>10</span> <span>20</span> <span>Color</span><span>.</span><span>white</span><span>;</span>\n <span>end_drawing</span> <span>()</span>\n <span>done</span><span>;</span>\n\n <span>unload_model</span> <span>model</span><span>;</span>\n <span>close_window</span> <span>()</span>\n</code></pre></div></div>\n\n<p>Even this didn\u2019t work! As I said at the top, it\u2019s been a few years since I looked at this, and I still had Blender installed on my machine: version 2.83.4. The current version is 4.4, so I decided to upgrade. The GLTF export in 4.4 didn\u2019t work on my Mac and instead displayed a page of Python warnings about <code>numpy</code>. On the Blender Forum, this <a href=\"https://blenderartists.org/t/multiple-addons-giving-numpy-errors-blender-4-4-mac/1590436/2\">thread</a> showed me how to fix it. Armed with a new GLTF file, the static test worked. Returning to the animation code showed that it worked with the updated file; however, there are some significant visual distortions. These aren\u2019t present when viewed in Blender, which I think comes down to how the library interpolates between keyframes. I will look into this another day.</p>\n\n<p>I enjoyed the collaborative approach. I\u2019m annoyed with myself for not remembering the separate file with the animation data. However, I think the change of direction from Collada to GLTF was a good decision, and the speed at which Claude can explore ideas is very impressive.</p>",
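The simplest interpolation scheme a loader might use is linear interpolation between the two surrounding keyframes, with the animation looped by taking the elapsed time modulo the duration. Here is a toy sketch in Python; the keyframe times and values are made up, and raylib's real behaviour may well differ:

```python
def interpolate(keyframes, elapsed, duration):
    """Linearly interpolate a looping animation channel.

    keyframes: list of (time, value) pairs, sorted by strictly
    increasing time, covering the interval [0, duration].
    """
    t = elapsed % duration  # loop the animation, like mod_float in the OCaml
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            alpha = (t - t0) / (t1 - t0)
            return v0 + alpha * (v1 - v0)
    return keyframes[-1][1]  # fall back to the final keyframe

frames = [(0.0, 0.0), (1.0, 10.0), (2.0, 0.0)]
print(interpolate(frames, 0.5, 2.0))  # 5.0
print(interpolate(frames, 2.5, 2.0))  # loops back to t=0.5, so also 5.0
```

Visual glitches like the ones described often come from interpolating rotations componentwise instead of via quaternion slerp, which is one place a library and Blender can disagree.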
+21
mte/2025_06_09_windows-sandbox.json
···+"summary": "For a long time, we have struggled to match the performance and functionality of runc on Windows. Antonin wrote the Docker-based isolation for ocurrent/obuilder with PR#127, and I wrote machine-level isolation using QEMU PR#195. Sadly, the most obvious approach of using runhcs doesn\u2019t work, see issue#2156.",+"content": "<p>For a long time, we have struggled to match the performance and functionality of <code>runc</code> on Windows. Antonin wrote the Docker-based isolation for <a href=\"https://github.com/ocurrent/obuilder\">ocurrent/obuilder</a> with <a href=\"https://github.com/ocurrent/obuilder/pull/127\">PR#127</a>, and I wrote machine-level isolation using QEMU <a href=\"https://github.com/ocurrent/obuilder/pull/195\">PR#195</a>. Sadly, the most obvious approach of using <code>runhcs</code> doesn\u2019t work, see <a href=\"https://github.com/microsoft/hcsshim/issues/2156\">issue#2156</a>.</p>\n\n<p>On macOS, we use user isolation and ZFS mounts. We mount filesystems over <code>/Users/<user></code> and <code>/usr/local/Homebrew</code> (or <code>/opt/Homebrew</code> on Apple Silicon). Each command is executed with <code>su</code>, then the filesystems are unmounted, and snapshots are taken before repeating the cycle. This approach has limitations, primarily because we can only run one job at a time. Firstly, the Homebrew location is per machine, and secondly, opam switches are not relocatable, so mounting as <code>/Users/<another user></code> wouldn\u2019t work.</p>\n\n<p>In a similar vein, we could make user isolation work under Windows. On Windows, opam manages the Cygwin installation in <code>%LOCALAPPDATA%\\opam</code>, so the shared Homebrew limitation of macOS doesn\u2019t apply. Can we create users with the same home directory? 
This isn\u2019t as crazy as it sounds because Windows has drive letters, and right back to the earliest Windows networks I can remember (NetWare 3!), it was common practice for all users to have their home directory available as <code>H:\\</code>. These days, it\u2019s unfortunate that many applications <em>see through</em> drive letters and convert them to the corresponding UNC paths. Excel is particularly annoying as it does this with linked sheets, preventing administrators from easily migrating to a new file server, thereby invalidating UNC paths.</p>\n\n<h1>Windows user isolation</h1>\n\n<p>Windows drive mappings are per user and can be created using the command <a href=\"https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/subst\">subst</a>. We might try to set the home directory and profile path when we create a user <code>net user foo bar /add /homedir:h:\\ /profilepath:h:\\</code>, but since <code>h:</code> does not exist in the user\u2019s context, the user is given a temporary profile, which is lost when they log out. If you specify just <code>/homedir</code>, the profile is retained in <code>c:\\users\\foo</code>.</p>\n\n<p>We could now try to map <code>h:</code> using <code>subst h: c:\\cache\\layer</code>, but <code>subst</code> drives don\u2019t naturally persist between sessions. Alternatively, we could use <code>net use h: \\\\DESKTOP-BBBSRML\\cache\\layer /persistent:yes</code>.</p>\n\n<p>Ultimately, the path where <code>%APPDATA%</code> is held must exist when the profile is loaded; it can\u2019t be created as a result of loading the profile. Note that for a new user, the path doesn\u2019t exist at all, but the parent directory where it will be created does exist. In Active Directory/domain environments, the profile and home paths are on network shares, one directory per user. 
These exist before the user signs in; all users can have <code>h:</code> mapped to their personal space.</p>\n\n<p>Ultimately, it doesn\u2019t matter whether we can redirect <code>%LOCALAPPDATA%</code> or not, as we can control the location opam uses by setting the environment variable <code>OPAMROOT</code>.</p>\n\n<h1>opam knows</h1>\n\n<p>Unfortunately, there\u2019s no fooling opam. It sees through both <code>subst</code> and network drives and embeds the path into files like <code>opam\\config</code>.</p>\n\n<h2>subst</h2>\n\n<div><div><pre><code>subst h: c:<span>\\h</span>ome<span>\\f</span>oo\n<span>set </span><span>OPAMROOT</span><span>=</span>h:<span>\\o</span>pam\nopam init <span>-y</span>\n...\n\n In normal operation, opam only alters files within your opam root\n <span>(</span>~<span>\\A</span>ppData<span>\\L</span>ocal<span>\\o</span>pam by default<span>;</span> currently C:<span>\\h</span>ome<span>\\f</span>oo<span>\\o</span>pam<span>)</span><span>.</span>\n\n...\n</code></pre></div></div>\n\n<h2>net use</h2>\n\n<div><div><pre><code>net share <span>home</span><span>=</span>c:<span>\\h</span>ome\nnet use h: <span>\\\\</span>DESKTOP-BBBSRML<span>\\h</span>ome<span>\\f</span>oo /persistent:yes\nSET <span>OPAMROOT</span><span>=</span>h:<span>\\o</span>pam\nopam init <span>-y</span>\n...\n\n In normal operation, opam only alters files within your opam root\n <span>(</span>~<span>\\A</span>ppData<span>\\L</span>ocal<span>\\o</span>pam by default<span>;</span> currently UNC<span>\\D</span>ESKTOP-BBBSRML<span>\\h</span>ome<span>\\f</span>oo<span>\\o</span>pam<span>)</span><span>.</span>\n\n...\n</code></pre></div></div>\n\n<p>Unless David has some inspiration, I don\u2019t know where to go with this.</p>\n\n<p>Here\u2019s an example from the Windows API.</p>\n\n<div><div><pre><code><span>// If you have: subst X: C:\\SomeFolder</span>\n<span>QueryDosDevice</span><span>(</span><span>L\"X:\"</span><span>,</span> <span>buffer</span><span>,</span> 
<span>size</span><span>);</span> <span>// Returns: \"C:\\SomeFolder\"</span>\n<span>GetCurrentDirectory</span><span>();</span> <span>// Returns: \"X:\\\" (if current)</span>\n</code></pre></div></div>\n\n<h1>Windows Sandbox</h1>\n\n<p>Windows has a new(?) feature called <em>Windows Sandbox</em> that I hadn\u2019t seen before. It allows commands to be executed in a lightweight VM based on an XML definition. For example, a simple <code>test.wsb</code> would contain.</p>\n\n<div><div><pre><code><span><Configuration></span>\n <span><MappedFolders></span>\n <span><MappedFolder></span>\n <span><HostFolder></span>C:\\home\\foo\\opam<span></HostFolder></span>\n <span><SandboxFolder></span>C:\\Users\\WDAGUtilityAccount\\AppData\\Local\\opam<span></SandboxFolder></span>\n <span><ReadOnly></span>false<span></ReadOnly></span>\n <span></MappedFolder></span>\n <span></MappedFolders></span>\n<span></Configuration></span>\n</code></pre></div></div>\n\n<p>The sandbox started quickly and worked well until I tried to run a second instance. The command returns an error stating that only one is allowed. Even doing <code>runas /user:bar \"WindowsSandbox.exe test.wsb\"</code> fails with the same error.</p>\n\n<h1>Full circle</h1>\n\n<p>I think this brings us back to Docker. I wrote the QEMU implementation because of Docker\u2019s poor performance on Windows, coupled with the unreliability of OBuilder on Windows. 
However, I wonder if today\u2019s use case means that it warrants a second look.</p>\n\n<div><div><pre><code><span># Install Docker Engine</span><span>\n</span><span>Invoke-WebRequest</span><span> </span><span>-UseBasicParsing</span><span> </span><span>\"https://download.docker.com/win/static/stable/x86_64/docker-28.2.2.zip\"</span><span> </span><span>-OutFile</span><span> </span><span>docker.zip</span><span>\n</span><span>Expand-Archive</span><span> </span><span>docker.zip</span><span> </span><span>-DestinationPath</span><span> </span><span>\"C:\\Program Files\"</span><span>\n </span><span>[</span><span>Environment</span><span>]::</span><span>SetEnvironmentVariable</span><span>(</span><span>\"Path\"</span><span>,</span><span> </span><span>$</span><span>env</span><span>:</span><span>Path</span><span> </span><span>+</span><span> </span><span>\";C:\\Program Files\\docker\"</span><span>,</span><span> </span><span>\"Machine\"</span><span>)</span><span>\n\n</span><span># Start Docker service</span><span>\n</span><span>dockerd</span><span> </span><span>--register-service</span><span>\n</span><span>Start-Service</span><span> </span><span>docker</span><span>\n</span></code></pre></div></div>\n\n<p>Create a simple <code>Dockerfile</code> and build the image using <code>docker build . 
-t opam</code>.</p>\n\n<div><div><pre><code><span>FROM</span><span> mcr.microsoft.com/windows/servercore:ltsc2022</span>\n\n<span># Download opam</span>\n<span>ADD</span><span> https://github.com/ocaml/opam/releases/download/2.3.0/opam-2.3.0-x86_64-windows.exe C:\\\\windows\\\\opam.exe</span>\n\n<span>RUN </span>net user opam /add /passwordreq:no\n\n<span>USER</span><span> opam</span>\n\n<span># Run something as the opam user to create c:\\\\users\\\\opam</span>\n<span>RUN </span>opam <span>--version</span>\n\n<span>WORKDIR</span><span> c:\\\\users\\\\opam</span>\n\n<span>CMD</span><span> [\"cmd\"]</span>\n</code></pre></div></div>\n\n<p>Test with <code>opam init</code>.</p>\n\n<div><div><pre><code>docker run <span>--isolation</span><span>=</span>process <span>--rm</span> <span>-it</span> <span>-v</span> C:<span>\\c</span>ache<span>\\t</span>emp<span>\\:</span>c:<span>\\U</span>sers<span>\\o</span>pam<span>\\A</span>ppData<span>\\L</span>ocal<span>\\o</span>pam opam:latest opam init <span>-y</span>\n</code></pre></div></div>",
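Generating the `.wsb` mapping file programmatically keeps the host and sandbox paths in one place. This sketch is a hypothetical helper, not part of any tool mentioned above; it builds the same minimal XML shown earlier:

```python
import xml.etree.ElementTree as ET

def make_wsb(host_folder: str, sandbox_folder: str, read_only: bool = False) -> str:
    """Build a minimal Windows Sandbox configuration document."""
    cfg = ET.Element("Configuration")
    folders = ET.SubElement(cfg, "MappedFolders")
    folder = ET.SubElement(folders, "MappedFolder")
    ET.SubElement(folder, "HostFolder").text = host_folder
    ET.SubElement(folder, "SandboxFolder").text = sandbox_folder
    ET.SubElement(folder, "ReadOnly").text = "true" if read_only else "false"
    return ET.tostring(cfg, encoding="unicode")

# Map the host opam root into the sandbox user's %LOCALAPPDATA%\opam.
print(make_wsb(r"C:\home\foo\opam",
               r"C:\Users\WDAGUtilityAccount\AppData\Local\opam"))
```

Write the result to `test.wsb` and launch it with `WindowsSandbox.exe test.wsb` as above; the one-instance-at-a-time limit still applies.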
+21
mte/2025_06_10_oxcaml-base-images.json
···+"summary": "As @dra27 suggested, I first added support in ocurrent/ocaml-version. I went with the name flambda2, which matched the name in the opam package.",+"content": "<p>As @dra27 suggested, I first added support in <a href=\"https://github.com/ocurrent/ocaml-version.git\">ocurrent/ocaml-version</a>. I went with the name <code>flambda2</code>, which matched the name in the <code>opam</code> package.</p>\n\n<p>Wherever I found the type <code>Flambda</code>, I added <code>Flambda2</code>. I added a list of OxCaml versions in the style of the unreleased betas and a function <code>is_oxcaml</code> to test if the variant is of type <code>Flambda2</code>, closely following the <code>is_multicore</code> design! The final change was to <code>additional_packages</code>, which concatenates <code>ocaml-options-only-</code> with <code>flambda2</code>; again, this change was also needed for multicore.</p>\n\n<p>It was a relatively minor change to the base-image-builder, adding <code>Ocaml_version.Releases.oxcaml</code> to the available switches on AMD64 and ARM64. Following the precedent set by <code>maybe_add_beta</code> and <code>maybe_add_multicore</code>, I added <code>maybe_add_jst</code>, which added the Jane Street opam repository for these builds.</p>\n\n<p>The builds mostly failed because they depended on <code>autoconf</code>, which isn\u2019t included by default on most distributions. Looking in the <code>dockerfile</code>, there is a function called <code>ocaml_depexts</code>, which includes <code>zstd</code> for OCaml > 5.1.0. 
I extended this function to include <code>autoconf</code> when building OxCaml.</p>\n\n<p>The Arch Linux builds failed due to missing <code>which</code>, so I added this as I did for <code>autoconf</code></p>\n\n<p>The following are working:</p>\n\n<ul>\n <li>Ubuntu 24.10, 24.04, 22.04</li>\n <li>OpenSUSE Tumbleweed</li>\n <li>Fedora 42, 41</li>\n <li>Debian Unstable, Testing, 12</li>\n <li>Arch</li>\n</ul>\n\n<p>Failures</p>\n\n<ul>\n <li>Alpine 3.21\n <ul>\n <li>missing <code>linux/auxvec.h</code> header</li>\n </ul>\n </li>\n <li>OpenSUSE 15.6\n <ul>\n <li>autoconf is too old in the distribution</li>\n </ul>\n </li>\n <li>Debian 11\n <ul>\n <li>autoconf is too old in the distribution</li>\n </ul>\n </li>\n <li>Oracle Linux 9, 8\n <ul>\n <li>autoconf is too old in the distribution</li>\n </ul>\n </li>\n</ul>\n\n<p>There is some discussion about whether building these with the <a href=\"https://images.ci.ocaml.org\">base image builder</a> is the best approach, so I won\u2019t create PRs at this time. My branches are:</p>\n<ul>\n <li><a href=\"https://github.com/mtelvers/ocaml-version.git\">https://github.com/mtelvers/ocaml-version.git</a></li>\n <li><a href=\"https://github.com/mtelvers/ocaml-dockerfile.git#oxcaml\">https://github.com/mtelvers/ocaml-dockerfile.git#oxcaml</a></li>\n <li><a href=\"https://github.com/mtelvers/docker-base-images#oxcaml\">https://github.com/mtelvers/docker-base-images#oxcaml</a></li>\n</ul>",
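The depext selection described above amounts to a couple of conditionals. Sketched in Python: the function name echoes `ocaml_depexts`, but this is not the actual ocaml-dockerfile code, and folding the Arch `which` case into the same function is purely illustrative.

```python
# Illustrative sketch of the depext selection logic; not ocaml-dockerfile code.
def ocaml_depexts(version: tuple, oxcaml: bool = False, distro: str = "debian") -> list:
    """Pick the extra system packages a build needs."""
    pkgs = []
    if version > (5, 1, 0):
        pkgs.append("zstd")       # needed by OCaml > 5.1.0
    if oxcaml:
        pkgs.append("autoconf")   # OxCaml builds run autoconf
        if distro == "arch":
            pkgs.append("which")  # Arch images don't ship `which`
    return pkgs

print(ocaml_depexts((5, 2, 0), oxcaml=True, distro="arch"))
```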
+21
mte/2025_06_11_windows-containerd.json
···+"summary": "The tricky part of using runhcs has been getting the layers correct. While I haven\u2019t had any luck, I have managed to create Windows containers using ctr and containerd.",+"content": "<p>The tricky part of using <a href=\"https://github.com/microsoft/hcsshim/issues/2156\">runhcs</a> has been getting the layers correct. While I haven\u2019t had any luck, I have managed to create Windows containers using <code>ctr</code> and <code>containerd</code>.</p>\n\n<p>Installing <code>containerd</code> is a manual process on Windows. These steps give general guidance on what is needed: enable the <code>containers</code> feature in Windows, download the tar file from GitHub, extract it, add it to the path, generate a default configuration file, register the service, and start it.</p>\n\n<div><div><pre><code><span>Enable-WindowsOptionalFeature</span><span> </span><span>-Online</span><span> </span><span>-FeatureName</span><span> </span><span>containers</span><span> </span><span>-All</span><span>\n</span><span>mkdir</span><span> </span><span>\"c:\\Program Files\\containerd\"</span><span>\n</span><span>curl.exe</span><span> </span><span>-L</span><span> </span><span>https://github.com/containerd/containerd/releases/download/v2.2.1/containerd-2.2.1-windows-amd64.tar.gz</span><span> </span><span>-o</span><span> </span><span>containerd-windows-amd64.tar.gz</span><span>\n</span><span>tar.exe</span><span> </span><span>xvf</span><span> </span><span>.</span><span>\\containerd-windows-amd64.tar.gz</span><span> </span><span>-C</span><span> </span><span>\"c:\\Program Files\\containerd\"</span><span>\n</span><span>$Path</span><span> </span><span>=</span><span> </span><span>[</span><span>Environment</span><span>]::</span><span>GetEnvironmentVariable</span><span>(</span><span>\"PATH\"</span><span>,</span><span> </span><span>\"Machine\"</span><span>)</span><span> </span><span>+</span><span> 
</span><span>[</span><span>IO.Path</span><span>]::</span><span>PathSeparator</span><span> </span><span>+</span><span> </span><span>\"</span><span>$</span><span>Env</span><span>:</span><span>ProgramFiles</span><span>\\containerd\\bin\"</span><span>\n </span><span>[</span><span>Environment</span><span>]::</span><span>SetEnvironmentVariable</span><span>(</span><span> </span><span>\"Path\"</span><span>,</span><span> </span><span>$Path</span><span>,</span><span> </span><span>\"Machine\"</span><span>)</span><span>\n</span><span>containerd.exe</span><span> </span><span>config</span><span> </span><span>default</span><span> </span><span>|</span><span> </span><span>Out-File</span><span> </span><span>\"c:\\Program Files\\containerd\\config.toml\"</span><span> </span><span>-Encoding</span><span> </span><span>ascii</span><span>\n</span><span>containerd</span><span> </span><span>--register-service</span><span>\n</span><span>net</span><span> </span><span>start</span><span> </span><span>containerd</span><span>\n</span></code></pre></div></div>\n\n<p>With that out of the way, pull <code>nanoserver:ltsc2022</code> from Microsoft\u2019s container registry.</p>\n\n<pre><code>c:\\> ctr image pull mcr.microsoft.com/windows/nanoserver:ltsc2022\n</code></pre>\n\n<p>List which snapshots are available: <code>nanoserver</code> has one, but <code>servercore</code> has two.</p>\n\n<pre><code>c:\\> ctr snapshot ls\nKEY PARENT KIND\nsha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355 Committed\n</code></pre>\n\n<p>Take a snapshot of <code>nanoserver</code>, which creates a writeable scratch layer. <code>--mounts</code> is key here. Without it, you won\u2019t know where the layers are. They are held below <code>C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots</code> in numbered folders. The mapping between numbers and keys is stored in <code>metadata.db</code> in BoltDB format. 
With the <code>--mounts</code> command line option, we see the <code>source</code> path and list of paths in <code>parentLayerPaths</code>.</p>\n\n<pre><code>c:\\> ctr snapshots prepare --mounts my-test sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355\n[\n {\n \"Type\": \"windows-layer\",\n \"Source\": \"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\21\",\n \"Target\": \"\",\n \"Options\": [\n \"rw\",\n \"parentLayerPaths=[\\\"C:\\\\\\\\ProgramData\\\\\\\\containerd\\\\\\\\root\\\\\\\\io.containerd.snapshotter.v1.windows\\\\\\\\snapshots\\\\\\\\20\\\"]\"\n ]\n }\n]\n</code></pre>\n\n<p>As you can see from <code>ctr snapshot ls</code> and <code>ctr snapshot info</code>, the layer paths aren\u2019t readily available. This <a href=\"https://github.com/containerd/containerd/discussions/10053\">discussion</a> is a sample of the creative approaches to getting the paths!</p>\n\n<pre><code>c:\\> ctr snapshot ls\nKEY PARENT KIND\nmy-test sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355 Active\nsha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355 Committed\nc:\\> ctr snapshot info my-test\n{\n \"Kind\": \"Active\",\n \"Name\": \"my-test\",\n \"Parent\": \"sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355\",\n \"Labels\": {\n \"containerd.io/gc.root\": \"2025-06-11T12:28:43Z\"\n },\n \"Created\": \"2025-06-11T16:33:43.144011Z\",\n \"Updated\": \"2025-06-11T16:33:43.144011Z\"\n}\n</code></pre>\n\n<p>Here\u2019s the directory listing for reference.</p>\n\n<pre><code>c:\\> dir C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\n\n Volume in drive C has no label.\n Volume Serial Number is F0E9-1E81\n\n Directory of C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\n\n11/06/2025 16:33 <DIR> .\n11/06/2025 08:19 <DIR> ..\n11/06/2025 08:31 <DIR> 2\n11/06/2025 16:32 <DIR> 20\n11/06/2025 
16:33 <DIR> 21\n11/06/2025 08:20 <DIR> rm-1\n11/06/2025 08:20 <DIR> rm-2\n11/06/2025 08:22 <DIR> rm-3\n</code></pre>\n\n<p>Now we need to prepare a <code>config.json</code> file. The <code>layerFolders</code> structure can be populated with the information from above. The order is important; preserve the order from <code>parentLayerPaths</code>, then append the scratch layer. It looks obvious when there are just two layers, but for <code>servercore:ltsc2022</code> where there are two parent layers, the order looks curious as the parent layers are given in reverse order and the scratch layer is last, e.g. <code>24, 23, 25</code> where 23 and 24 are the parents and 25 is the snapshot.</p>\n\n<div><div><pre><code><span>{</span><span>\n </span><span>\"ociVersion\"</span><span>:</span><span> </span><span>\"1.1.0\"</span><span>,</span><span>\n </span><span>\"process\"</span><span>:</span><span> </span><span>{</span><span>\n </span><span>\"user\"</span><span>:</span><span> </span><span>{</span><span>\n </span><span>\"uid\"</span><span>:</span><span> </span><span>0</span><span>,</span><span>\n </span><span>\"gid\"</span><span>:</span><span> </span><span>0</span><span>,</span><span>\n </span><span>\"username\"</span><span>:</span><span> </span><span>\"ContainerUser\"</span><span>\n </span><span>},</span><span>\n </span><span>\"args\"</span><span>:</span><span> </span><span>[</span><span>\n </span><span>\"cmd\"</span><span>,</span><span>\n </span><span>\"/c\"</span><span>,</span><span>\n </span><span>\"echo test\"</span><span>\n </span><span>],</span><span>\n </span><span>\"cwd\"</span><span>:</span><span> </span><span>\"\"</span><span>\n </span><span>},</span><span>\n </span><span>\"root\"</span><span>:</span><span> </span><span>{</span><span>\n </span><span>\"path\"</span><span>:</span><span> </span><span>\"\"</span><span>\n </span><span>},</span><span>\n </span><span>\"windows\"</span><span>:</span><span> </span><span>{</span><span>\n 
</span><span>\"layerFolders\"</span><span>:</span><span> </span><span>[</span><span>\n </span><span>\"C:</span><span>\\\\</span><span>ProgramData</span><span>\\\\</span><span>containerd</span><span>\\\\</span><span>root</span><span>\\\\</span><span>io.containerd.snapshotter.v1.windows</span><span>\\\\</span><span>snapshots</span><span>\\\\</span><span>20\"</span><span>,</span><span>\n </span><span>\"C:</span><span>\\\\</span><span>ProgramData</span><span>\\\\</span><span>containerd</span><span>\\\\</span><span>root</span><span>\\\\</span><span>io.containerd.snapshotter.v1.windows</span><span>\\\\</span><span>snapshots</span><span>\\\\</span><span>21\"</span><span>\n </span><span>],</span><span>\n </span><span>\"ignoreFlushesDuringBoot\"</span><span>:</span><span> </span><span>true</span><span>,</span><span>\n </span><span>\"network\"</span><span>:</span><span> </span><span>{</span><span>\n </span><span>\"allowUnqualifiedDNSQuery\"</span><span>:</span><span> </span><span>true</span><span>\n </span><span>}</span><span>\n </span><span>}</span><span>\n</span><span>}</span><span>\n</span></code></pre></div></div>\n\n<p>We can now run the container.</p>\n\n<pre><code>c:\\> ctr run --rm --config .\\config.json my-container\n</code></pre>",
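The layer-ordering rule described above can be sketched in Python (a hypothetical helper, not part of containerd): parse the mount array printed by `ctr snapshots prepare --mounts`, keep the `parentLayerPaths` entries in the order given, and append the writable scratch layer (`Source`) last.

```python
import json

def layer_folders(mounts_json):
    """Build the layerFolders list for an OCI config.json from the output
    of `ctr snapshots prepare --mounts`: parent layers first, in the order
    given by parentLayerPaths, then the writable scratch layer (Source)."""
    mount = json.loads(mounts_json)[0]  # a single windows-layer mount
    parents = []
    for opt in mount["Options"]:
        if opt.startswith("parentLayerPaths="):
            parents = json.loads(opt.split("=", 1)[1])
    return parents + [mount["Source"]]

# Sample mirrors the `ctr snapshots prepare --mounts my-test ...` output above.
base = "C:\\ProgramData\\containerd\\root\\io.containerd.snapshotter.v1.windows\\snapshots\\"
sample = json.dumps([{
    "Type": "windows-layer",
    "Source": base + "21",
    "Target": "",
    "Options": ["rw", "parentLayerPaths=" + json.dumps([base + "20"])],
}])

print(layer_folders(sample))  # parent layer 20 first, scratch layer 21 last
```

The same ordering holds for `servercore`'s two parent layers: the list from `parentLayerPaths` is used verbatim and the snapshot path always goes at the end.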
+21
mte/2025_06_12_oxcaml-repository.json
···+"summary": "This morning, Anil proposed that it would be good to have an opam-repository without the old versions of packages that require patches to work with OxCaml.",+"content": "<p>This morning, Anil proposed that it would be good to have an opam-repository without the old versions of packages that require patches to work with OxCaml.</p>\n\n<p>This is a fast-moving area, so this post is likely to be outdated very quickly, but at the time of writing, the development repository is <a href=\"https://github.com/janestreet/opam-repository/tree/with-extensions\">https://github.com/janestreet/opam-repository#with-extensions</a>. This is a fork of <a href=\"https://github.com/ocaml/opam-repository\">opam-repository</a> but with some patched packages designated with <code>+ox</code>.</p>\n\n<p>I have a short shell script which clones both <a href=\"https://github.com/ocaml/opam-repository\">opam-repository</a> and <a href=\"https://github.com/janestreet/opam-repository/tree/with-extensions\">https://github.com/janestreet/opam-repository#with-extensions</a> and searches for all packages with <code>+ox</code>. All versions of these packages are removed from opam-repository and replaced with the single <code>+ox</code> version. The resulting repository is pushed to <a href=\"https://github.com/mtelvers/opam-repository-ox\">https://github.com/mtelvers/opam-repository-ox</a>.</p>\n\n<p>To test the repository (and show that <code>eio</code> doesn\u2019t build), I have created a <code>Dockerfile</code> based largely on the base-image-builder format. This <code>Dockerfile</code> uses the modified opam-repository to build an OxCaml switch.</p>\n\n<p>My build script and test Dockerfile are in <a href=\"https://github.com/mtelvers/opam-repo-merge\">https://github.com/mtelvers/opam-repo-merge</a>. Thanks to David for being the sounding board during the day!</p>",
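The merge the shell script performs can be modelled in a few lines of Python (a simplified sketch, not the actual script): any package that has a `+ox` version keeps only that version, and every other package passes through unchanged.

```python
def merge_repositories(upstream, ox):
    """Simplified model of the repository merge: `upstream` and `ox` map
    package names to sets of version strings. A package with a +ox version
    keeps only that version; all other packages pass through unchanged."""
    merged = dict(upstream)
    for pkg, versions in ox.items():
        ox_versions = {v for v in versions if "+ox" in v}
        if ox_versions:
            merged[pkg] = ox_versions  # drop all upstream versions
    return merged

# Hypothetical package/version data for illustration only.
upstream = {"eio": {"0.12", "1.0"}, "fmt": {"0.9.0"}}
ox = {"eio": {"1.0+ox"}}

print(merge_repositories(upstream, ox))
# {'eio': {'1.0+ox'}, 'fmt': {'0.9.0'}}
```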
+21
mte/2025_06_14_borg-backup.json
···+"summary": "Our PeerTube installation at watch.ocaml.org holds hundreds of videos we wouldn\u2019t want to lose! It\u2019s a VM hosted at Scaleway so the chances of a loss are pretty small, but having a second copy would give us extra reassurance. I\u2019m going to use Borg Backup.",+"content": "<p>Our PeerTube installation at <a href=\"https://watch.ocaml.org/\">watch.ocaml.org</a> holds hundreds of videos we wouldn\u2019t want to lose! It\u2019s a VM hosted at Scaleway so the chances of a loss are pretty small, but having a second copy would give us extra reassurance. I\u2019m going to use <a href=\"https://www.borgbackup.org\">Borg Backup</a>.</p>\n\n<p>Here\u2019s the list of features (taken directly from their website):</p>\n\n<ul>\n <li>Space-efficient storage of backups.</li>\n <li>Secure, authenticated encryption.</li>\n <li>Compression: lz4, zstd, zlib, lzma or none.</li>\n <li>Mountable backups with FUSE.</li>\n <li>Easy installation on multiple platforms: Linux, macOS, BSD, \u2026</li>\n <li>Free software (BSD license).</li>\n <li>Backed by a large and active open source community.</li>\n</ul>\n\n<p>We have several OBuilder workers with one or more unused hard disks, which would make ideal backup targets.</p>\n\n<p>In this case, I will format and mount <code>sdc</code> as <code>/home</code> on one of the workers.</p>\n\n<div><div><pre><code>parted /dev/sdc mklabel gpt\nparted /dev/sdc mkpart primary ext4 0% 100%\nmkfs.ext4 /dev/sdc1\n</code></pre></div></div>\n\n<p>Add this to /etc/fstab and run <code>mount -a</code>.</p>\n\n<div><div><pre><code>/dev/sdc1 /home ext4 defaults 0 2\n</code></pre></div></div>\n\n<p>Create a user <code>borg</code>.</p>\n\n<div><div><pre><code>adduser <span>--disabled-password</span> <span>--gecos</span> <span>'@borg'</span> <span>--home</span> /home/borg borg\n</code></pre></div></div>\n\n<p>On both machines, install the application <code>borg</code>.</p>\n\n<div><div><pre><code>apt <span>install 
</span>borgbackup\n</code></pre></div></div>\n\n<p>On the machine we want to back up, generate an SSH key and copy it to the <code>authorized_keys</code> file for user <code>borg</code> on the target server. Ensure that <code>chmod</code> and <code>chown</code> are correct.</p>\n\n<div><div><pre><code>ssh-keygen <span>-t</span> ed25519 <span>-f</span> ~/.ssh/borg_backup_key\n</code></pre></div></div>\n\n<p>Add lines to the <code>.ssh/config</code> for ease of connection. We can now <code>ssh backup-server</code> without any prompts.</p>\n\n<div><div><pre><code>Host backup-server\n HostName your.backup.server.com\n User borg\n IdentityFile ~/.ssh/borg_backup_key\n ServerAliveInterval 60\n ServerAliveCountMax 3\n</code></pre></div></div>\n\n<p>Borg supports encrypting the backup at rest on the target machine. The data is publicly available in this case, so encryption seems unnecessary.</p>\n\n<p>On the machine to be backed up, run.</p>\n\n<div><div><pre><code>borg init <span>--encryption</span><span>=</span>none backup-server:repo\n</code></pre></div></div>\n\n<p>We can now perform a backup or two and see how the deduplication works.</p>\n\n<div><div><pre><code><span># borg create backup-server:repo::test /var/lib/docker/volumes/postgres --compression lz4 --stats --progress</span>\n<span>------------------------------------------------------------------------------</span>\nRepository: ssh://backup-server/./repo\nArchive name: <span>test\n</span>Archive fingerprint: 627242cb5b65efa23672db317b4cdc8617a78de4d8e195cdd1e1358ed02dd937\nTime <span>(</span>start<span>)</span>: Sat, 2025-06-14 13:32:27\nTime <span>(</span>end<span>)</span>: Sat, 2025-06-14 13:32:38\nDuration: 11.03 seconds\nNumber of files: 3497\nUtilization of max. 
archive size: 0%\n<span>------------------------------------------------------------------------------</span>\n Original size Compressed size Deduplicated size\nThis archive: 334.14 MB 136.28 MB 132.79 MB\nAll archives: 334.14 MB 136.28 MB 132.92 MB\n\n Unique chunks Total chunks\nChunk index: 942 1568\n<span>------------------------------------------------------------------------------</span>\n<span># borg create backup-server:repo::test2 /var/lib/docker/volumes/postgres --compression lz4 --stats --progress</span>\n<span>------------------------------------------------------------------------------</span>\nRepository: ssh://backup-server/./repo\nArchive name: test2\nArchive fingerprint: 572bf2225b3ab19afd32d44f058a49dc2b02cb70c8833fa0b2a1fb5b95526bff\nTime <span>(</span>start<span>)</span>: Sat, 2025-06-14 13:33:05\nTime <span>(</span>end<span>)</span>: Sat, 2025-06-14 13:33:06\nDuration: 1.43 seconds\nNumber of files: 3497\nUtilization of max. archive size: 0%\n<span>------------------------------------------------------------------------------</span>\n Original size Compressed size Deduplicated size\nThis archive: 334.14 MB 136.28 MB 9.58 MB\nAll archives: 668.28 MB 272.55 MB 142.61 MB\n\n Unique chunks Total chunks\nChunk index: 971 3136\n<span>------------------------------------------------------------------------------</span>\n<span># borg list backup-server:repo</span>\n<span>test </span>Sat, 2025-06-14 13:32:27 <span>[</span>627242cb5b65efa23672db317b4cdc8617a78de4d8e195cdd1e1358ed02dd937]\ntest2 Sat, 2025-06-14 13:33:05 <span>[</span>572bf2225b3ab19afd32d44f058a49dc2b02cb70c8833fa0b2a1fb5b95526bff]\n</code></pre></div></div>\n\n<p>Let\u2019s run this every day by placing a script <code>borgbackup</code> in <code>/etc/cron.daily</code>. 
The paths given are just examples\u2026</p>\n\n<div><div><pre><code><span>#!/bin/bash</span>\n\n<span># Configuration</span>\n<span>REPOSITORY</span><span>=</span><span>\"backup-server:repo\"</span>\n\n<span># What to backup</span>\n<span>BACKUP_PATHS</span><span>=</span><span>\"\n/home\n\"</span>\n\n<span># What to exclude</span>\n<span>EXCLUDE_ARGS</span><span>=</span><span>\"\n--exclude '*.tmp'\n--exclude '*.log'\n\"</span>\n\n<span># Logging function</span>\nlog<span>()</span> <span>{</span>\n logger <span>-t</span> <span>\"borg-backup\"</span> <span>\"</span><span>$1</span><span>\"</span>\n <span>echo</span> <span>\"</span><span>$(</span><span>date</span> <span>'+%Y-%m-%d %H:%M:%S'</span><span>)</span><span> - </span><span>$1</span><span>\"</span>\n<span>}</span>\n\nlog <span>\"========================================\"</span>\nlog <span>\"Starting Borg backup\"</span>\n\n<span># Check if borg is installed</span>\n<span>if</span> <span>!</span> <span>command</span> <span>-v</span> borg &> /dev/null<span>;</span> <span>then\n </span>log <span>\"ERROR: borg command not found\"</span>\n <span>exit </span>1\n<span>fi</span>\n\n<span># Test repository access</span>\n<span>if</span> <span>!</span> borg info <span>\"</span><span>$REPOSITORY</span><span>\"</span> &> /dev/null<span>;</span> <span>then\n </span>log <span>\"ERROR: Cannot access repository </span><span>$REPOSITORY</span><span>\"</span>\n log <span>\"Make sure repository exists and SSH key is set up\"</span>\n <span>exit </span>1\n<span>fi</span>\n\n<span># Create backup</span>\nlog <span>\"Creating backup archive...\"</span>\n<span>if </span>borg create <span>\\</span>\n <span>\"</span><span>$REPOSITORY</span><span>::backup-{now}\"</span> <span>\\</span>\n <span>$BACKUP_PATHS</span> <span>\\</span>\n <span>$EXCLUDE_ARGS</span> <span>\\</span>\n <span>--compression</span> lz4 <span>\\</span>\n <span>--stats</span> 2>&1 | logger <span>-t</span> <span>\"borg-backup\"</span><span>;</span> <span>then\n 
</span>log <span>\"Backup created successfully\"</span>\n<span>else\n </span>log <span>\"ERROR: Backup creation failed\"</span>\n <span>exit </span>1\n<span>fi</span>\n\n<span># Prune old backups</span>\nlog <span>\"Pruning old backups...\"</span>\n<span>if </span>borg prune <span>\"</span><span>$REPOSITORY</span><span>\"</span> <span>\\</span>\n <span>--keep-daily</span><span>=</span>7 <span>\\</span>\n <span>--keep-weekly</span><span>=</span>4 <span>\\</span>\n <span>--keep-monthly</span><span>=</span>6 <span>\\</span>\n <span>--stats</span> 2>&1 | logger <span>-t</span> <span>\"borg-backup\"</span><span>;</span> <span>then\n </span>log <span>\"Pruning completed successfully\"</span>\n<span>else\n </span>log <span>\"WARNING: Pruning failed, but backup was successful\"</span>\n<span>fi</span>\n\n<span># Monthly repository check (on the 1st of each month)</span>\n<span>if</span> <span>[</span> <span>\"</span><span>$(</span><span>date</span> +%d<span>)</span><span>\"</span> <span>=</span> <span>\"01\"</span> <span>]</span><span>;</span> <span>then\n </span>log <span>\"Running monthly repository check...\"</span>\n <span>if </span>borg check <span>\"</span><span>$REPOSITORY</span><span>\"</span> 2>&1 | logger <span>-t</span> <span>\"borg-backup\"</span><span>;</span> <span>then\n </span>log <span>\"Repository check passed\"</span>\n <span>else\n </span>log <span>\"WARNING: Repository check failed\"</span>\n <span>fi\nfi\n\n</span>log <span>\"Backup completed successfully\"</span>\nlog <span>\"========================================\"</span>\n</code></pre></div></div>\n\n<p>Check the logs\u2026</p>\n\n<div><div><pre><code>journalctl <span>-t</span> borg-backup\n</code></pre></div></div>",
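The tiny deduplicated size of the second, identical archive above can be understood with a toy model. Borg splits files into content-defined chunks and stores each unique chunk once, keyed by a cryptographic hash; this sketch uses fixed-size chunks purely to illustrate the bookkeeping, so the sizes are illustrative, not Borg's.

```python
import hashlib

def dedup_store(store, data, chunk_size=64):
    """Add `data` to a chunk store keyed by SHA-256 and return how many
    bytes of new (previously unseen) chunk data had to be stored."""
    new_bytes = 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk
            new_bytes += len(chunk)
    return new_bytes

store = {}
payload = bytes(range(256)) * 4          # 1 KiB with lots of repetition

first = dedup_store(store, payload)      # first backup: unique chunks stored
second = dedup_store(store, payload)     # identical second backup: nothing new
print(first, second)  # 256 0
```

The second call stores nothing because every chunk hash is already in the store, which is exactly why the `test2` archive deduplicated down to a few megabytes of changed database pages.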
+21
mte/2025_06_14_windows-containerd-2.json
···+"summary": "If you were following along with my previous post on containerd on Windows, you may recall that I lamented the lack of an installer. Since then, I have found a PowerShell script on Microsoft\u2019s GitHub, which does a lot of the grunt work for us.",+"content": "<p>If you were following along with my previous post on <a href=\"https://www.tunbury.org/windows-containerd/\">containerd on Windows</a>, you may recall that I lamented the lack of an installer. Since then, I have found a PowerShell <a href=\"https://github.com/microsoft/Windows-Containers/blob/Main/helpful_tools/Install-ContainerdRuntime/install-containerd-runtime.ps1\">script</a> on Microsoft\u2019s GitHub, which does a lot of the grunt work for us.</p>\n\n<p>Trying anything beyond my <code>echo Hello</code> test showed an immediate problem: there is no network. <code>ipconfig</code> didn\u2019t display any network interfaces.</p>\n\n<pre><code>C:\\>ctr run --rm mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ipconfig\n\nWindows IP Configuration\n</code></pre>\n\n<p>Checking the command line options, there is one called <code>--net-host</code>, which sounded promising, only for that hope to be immediately dashed:</p>\n\n<pre><code>C:\\>ctr run --rm --net-host mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ipconfig\nctr: Cannot use host mode networking with Windows containers\n</code></pre>\n\n<p>The solution is <code>--cni</code>, but more work is required to get that working. We need to download the plugins and place them in the <code>cni/bin</code> subdirectory. 
Fortunately, the installation script does all of this for us but leaves it unconfigured.</p>\n\n<pre><code>C:\\Windows\\System32>ctr run --rm --cni mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ipconfig\nctr: no network config found in C:\\Program Files\\containerd\\cni\\conf: cni plugin not initialized\n</code></pre>\n\n<p>From the top, this is how you get from a fresh install of Windows 11 to a container with networking. Firstly, use the installation script to install <code>containerd</code>.</p>\n\n<pre><code>curl.exe https://raw.githubusercontent.com/microsoft/Windows-Containers/refs/heads/Main/helpful_tools/Install-ContainerdRuntime/install-containerd-runtime.ps1 -o install-containerd-runtime.ps1\nSet-ExecutionPolicy Bypass\n.\\install-containerd-runtime.ps1 -ContainerDVersion 2.1.1 -WinCNIVersion 0.3.1 -ExternalNetAdapter Ethernet\n</code></pre>\n\n<p>Now create <code>C:\\Program Files\\containerd\\cni\\conf\\0-containerd-nat.conf</code> containing the following:</p>\n\n<div><div><pre><code>{\n \"cniVersion\": \"0.3.0\",\n \"name\": \"nat\",\n \"type\": \"nat\",\n \"master\": \"Ethernet\",\n \"ipam\": {\n \"subnet\": \"172.20.0.0/16\",\n \"routes\": [\n {\n \"gateway\": \"172.20.0.1\"\n }\n ]\n },\n \"capabilities\": {\n \"portMappings\": true,\n \"dns\": true\n }\n}\n</code></pre></div></div>\n\n<p>Easy when you know how\u2026</p>\n\n<pre><code>C:\\>ctr run --rm --cni mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container ping 1.1.1.1\n\nPinging 1.1.1.1 with 32 bytes of data:\nReply from 1.1.1.1: bytes=32 time=5ms TTL=58\nReply from 1.1.1.1: bytes=32 time=7ms TTL=58\nReply from 1.1.1.1: bytes=32 time=7ms TTL=58\nReply from 1.1.1.1: bytes=32 time=6ms TTL=58\n\nPing statistics for 1.1.1.1:\n Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),\nApproximate round trip times in milli-seconds:\n Minimum = 5ms, Maximum = 7ms, Average = 6ms\n</code></pre>\n\n<p>The next challenge is: what do you put in your own <code>config.json</code> to reproduce 
this behaviour?</p>\n\n<p>Firstly, we need our <code>layerFolders</code>:</p>\n\n<pre><code>C:\\>ctr snapshot ls\nKEY PARENT KIND\nsha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355 Committed\n</code></pre>\n\n<pre><code>C:\\>ctr snapshot prepare --mounts my-snapshot sha256:44b913d145adda5364b5465664644b11282ed3c4b9bd9739aa17832ee4b2b355\n[\n {\n \"Type\": \"windows-layer\",\n \"Source\": \"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\14\",\n \"Target\": \"\",\n \"Options\": [\n \"rw\",\n \"parentLayerPaths=[\\\"C:\\\\\\\\ProgramData\\\\\\\\containerd\\\\\\\\root\\\\\\\\io.containerd.snapshotter.v1.windows\\\\\\\\snapshots\\\\\\\\1\\\"]\"\n ]\n }\n]\n</code></pre>\n\n<p>Let\u2019s create a <code>config.json</code> without a network stanza just to check we can create a container:</p>\n\n<div><div><pre><code>{\n \"ociVersion\": \"1.1.0\",\n \"process\": {\n \"terminal\": false,\n \"user\": { \"uid\": 0, \"gid\": 0 },\n \"args\": [\n \"cmd\", \"/c\",\n \"ipconfig && ping 1.1.1.1\"\n ],\n \"cwd\": \"c:\\\\\"\n },\n \"root\": { \"path\": \"\", \"readonly\": false },\n \"hostname\": \"builder\",\n \"windows\": {\n \"layerFolders\": [\n \"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\1\",\n \"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\14\"\n ],\n \"ignoreFlushesDuringBoot\": true\n }\n}\n</code></pre></div></div>\n\n<p>The container runs, but there is no network as we\u2019d expect.</p>\n\n<pre><code>C:\\>ctr run --rm --config config.json my-container\n\nWindows IP Configuration\n\n\nPinging 1.1.1.1 with 32 bytes of data:\nPING: transmit failed. General failure.\nPING: transmit failed. General failure.\nPING: transmit failed. General failure.\nPING: transmit failed. 
General failure.\n</code></pre>\n\n<p>If we turn on CNI, it cryptically tells us what we need to do:</p>\n\n<pre><code>C:\\>ctr run --rm --cni --config config.json my-container\nctr: plugin type=\"nat\" name=\"nat\" failed (add): required env variables [CNI_NETNS] missing\n</code></pre>\n\n<p>So we need to populate the <code>network.networkNamespace</code> with the name (ID) of the network we want to use. This should be a GUID, and I don\u2019t know how to get the right value. I would have assumed that it was one of the many GUIDs returned by <code>Get-HnsNetwork</code> but it isn\u2019t.</p>\n\n<pre><code>PS C:\\> Get-HnsNetwork\n\n\nActivityId : 92018CF0-6DCB-4AAF-A14E-DC61120FC958\nAdditionalParams :\nCurrentEndpointCount : 0\nExtensions : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering Platform},\n @{Id=F74F241B-440F-4433-BB28-00F89EAD20D8; IsEnabled=False; Name=Microsoft Azure VFP Switch Filter Extension},\n @{Id=430BDADD-BAB0-41AB-A369-94B67FA5BE0A; IsEnabled=True; Name=Microsoft NDIS Capture}}\nFlags : 8\nHealth : @{LastErrorCode=0; LastUpdateTime=133943927149605101}\nID : 3EB2B18B-A1DD-46A8-A425-256F6B3DF26D\nIPv6 : False\nLayeredOn : 20791F67-012C-4C9B-9C93-530FDA5DE4FA\nMacPools : {@{EndMacAddress=00-15-5D-C3-DF-FF; StartMacAddress=00-15-5D-C3-D0-00}}\nMaxConcurrentEndpoints : 1\nName : nat\nNatName : NATAC317D6D-8A2E-4E4E-9BCF-33435FE4CD8F\nPolicies : {@{Type=VLAN; VLAN=1}}\nState : 1\nSubnets : {@{AdditionalParams=; AddressPrefix=172.20.0.0/16; Flags=0; GatewayAddress=172.20.0.1; Health=;\n ID=5D56CE8D-1AD2-47FF-85A7-A0E6D530565D; IpSubnets=System.Object[]; ObjectType=5; Policies=System.Object[];\n State=0}}\nSwitchGuid : 3EB2B18B-A1DD-46A8-A425-256F6B3DF26D\nTotalEndpoints : 2\nType : NAT\nVersion : 64424509440\nResources : @{AdditionalParams=; AllocationOrder=2; Allocators=System.Object[]; CompartmentOperationTime=0; Flags=0; Health=;\n ID=92018CF0-6DCB-4AAF-A14E-DC61120FC958; PortOperationTime=0; State=1; SwitchOperationTime=0; VfpOperationTime=0;\n parentId=71FB2758-F714-4838-8764-7079378D6CB6}\n</code></pre>\n\n<p>I ran <code>ctr run --rm --cni mcr.microsoft.com/windows/nanoserver:ltsc2022 my-container cmd /c \"ping 1.1.1.1 && pause\"</code> in one window and ran <code>ctr c info my-container</code> in another, which revealed that the GUID was <code>5f7d467c-3011-48bc-9337-ce78cf399345</code>.</p>\n\n<p>Adding this to my <code>config.json</code>:</p>\n\n<div><div><pre><code>{\n \"ociVersion\": \"1.1.0\",\n \"process\": {\n \"terminal\": false,\n \"user\": { \"uid\": 0, \"gid\": 0 },\n \"args\": [\n \"cmd\", \"/c\",\n \"ipconfig && ping 1.1.1.1\"\n ],\n \"cwd\": \"c:\\\\\"\n },\n \"root\": { \"path\": \"\", \"readonly\": false },\n \"hostname\": \"builder\",\n \"windows\": {\n \"layerFolders\": [\n \"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\1\",\n \"C:\\\\ProgramData\\\\containerd\\\\root\\\\io.containerd.snapshotter.v1.windows\\\\snapshots\\\\14\"\n ],\n \"ignoreFlushesDuringBoot\": true,\n \"network\": {\n \"allowUnqualifiedDNSQuery\": true,\n \"networkNamespace\": \"5f7d467c-3011-48bc-9337-ce78cf399345\"\n }\n }\n}\n</code></pre></div></div>\n\n<p>And now I have a network!</p>\n\n<pre><code>C:\\>ctr run --rm --cni --config config.json my-container\n\nWindows IP Configuration\n\n\nEthernet adapter vEthernet (default-my-container2_nat):\n\n Connection-specific DNS Suffix . : Home\n Link-local IPv6 Address . . . . . : fe80::921d:1ce7:a445:8dfa%49\n IPv4 Address. . . . . . . . . . . : 172.20.95.58\n Subnet Mask . . . . . . . . . . . : 255.255.0.0\n Default Gateway . . . . . . . . . : 172.20.0.1\n\nPinging 1.1.1.1 with 32 bytes of data:\nReply from 1.1.1.1: bytes=32 time=5ms TTL=58\nReply from 1.1.1.1: bytes=32 time=6ms TTL=58\nReply from 1.1.1.1: bytes=32 time=6ms TTL=58\nReply from 1.1.1.1: bytes=32 time=6ms TTL=58\n\nPing statistics for 1.1.1.1:\n Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),\nApproximate round trip times in milli-seconds:\n Minimum = 5ms, Maximum = 6ms, Average = 5ms\n</code></pre>",
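Rather than eyeballing the `ctr c info` output, the namespace GUID can be dug out of the spec JSON programmatically. This assumes the spec embeds a `windows.network.networkNamespace` field shaped like the `config.json` above; the helper and sample data are illustrative only.

```python
import json

def network_namespace(spec_json):
    """Return windows.network.networkNamespace from an OCI runtime spec,
    or None if the spec has no Windows network section."""
    spec = json.loads(spec_json)
    return spec.get("windows", {}).get("network", {}).get("networkNamespace")

# Minimal sample spec mirroring the config.json shown above.
sample = json.dumps({
    "ociVersion": "1.1.0",
    "windows": {"network": {"networkNamespace": "5f7d467c-3011-48bc-9337-ce78cf399345"}},
})

print(network_namespace(sample))  # 5f7d467c-3011-48bc-9337-ce78cf399345
```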
+21
mte/2025_06_17_static-linking.json
···+"summary": "Most of the time, you don\u2019t think about how your file is linked. We\u2019ve come to love dynamically linked files with their small file sizes and reduced memory requirements, but there are times when the convenience of a single binary download from a GitHub release page is really what you need.",+"content": "<p>Most of the time, you don\u2019t think about how your file is linked. We\u2019ve come to love dynamically linked files with their small file sizes and reduced memory requirements, but there are times when the convenience of a single binary download from a GitHub release page is really what you need.</p>\n\n<p>To do this in OCaml, we need to add <code>-ccopt -static</code> to the <code>ocamlopt</code> command line. I\u2019m building with <code>dune</code>, so I can configure that in my <code>dune</code> file using a <code>flags</code> directive.</p>\n\n<div><div><pre><code>(flags (:standard -ccopt -static))\n</code></pre></div></div>\n\n<p>This can be extended for maximum compatibility by additionally adding <code>-ccopt -march=x86-64</code>, which ensures the generated code will run on any x86_64 processor and will not use newer instruction set extensions like SSE3, AVX, etc.</p>\n\n<p>So what about Windows? The MinGW toolchain accepts <code>-static</code>. Including <code>(flags (:standard -ccopt \"-link -Wl,-static -v\"))</code> in my <code>dune</code> file got the options applied to the link command:</p>\n\n<div><div><pre><code>x86_64-w64-mingw32-gcc -mconsole -L. 
-I\"C:/Users/Administrator/my-app/_opam/lib/ocaml\" -I\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\mccs\" -I\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\mccs\\glpk/internal\" -I\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\opam-core\" -I\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\sha\" -I\"C:/Users/Administrator/my-app/_opam/lib/ocaml\\flexdll\" -L\"C:/Users/Administrator/my-app/_opam/lib/ocaml\" -L\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\mccs\" -L\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\mccs\\glpk/internal\" -L\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\opam-core\" -L\"C:\\Users\\Administrator\\my-app\\_opam\\lib\\sha\" -L\"C:/Users/Administrator/my-app/_opam/lib/ocaml\\flexdll\" -o \"bin/main.exe\" \"C:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\2\\build_d62d04_dune\\dyndllb7e0e8.o\" \"@C:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\2\\build_d62d04_dune\\camlrespec7816\" \"-municode\" \"-Wl,-static\"\n</code></pre></div></div>\n\n<p>However, <code>ldd</code> showed that this wasn\u2019t working:</p>\n\n<div><div><pre><code>$ ldd main.exe | grep mingw\n libstdc++-6.dll => /mingw64/bin/libstdc++-6.dll (0x7ffabf3e0000)\n libgcc_s_seh-1.dll => /mingw64/bin/libgcc_s_seh-1.dll (0x7ffac3130000)\n libwinpthread-1.dll => /mingw64/bin/libwinpthread-1.dll (0x7ffac4b40000)\n</code></pre></div></div>\n\n<p>I tried <em>a lot</em> of different variations. I asked Claude\u2026 then I asked <a href=\"https://www.dra27.uk/blog/\">@dra27</a> who recalled @kit-ty-kate working on this for opam. <a href=\"https://github.com/ocaml/opam/pull/5680\">PR#5680</a></p>\n\n<p>The issue is the auto-response file, which precedes my static option. 
We can remove that by adding <code>-noautolink</code>, but now we must do all the work by hand and build a massive command line.</p>\n\n<div><div><pre><code>(executable\n (public_name main)\n (name main)\n (flags (:standard -noautolink -cclib -lunixnat -cclib -lmccs_stubs -cclib -lmccs_glpk_stubs -cclib -lsha_stubs -cclib -lopam_core_stubs -cclib -l:libstdc++.a -cclib -l:libpthread.a -cclib -Wl,-static -cclib -ladvapi32 -cclib -lgdi32 -cclib -luser32 -cclib -lshell32 -cclib -lole32 -cclib -luuid -cclib -luserenv -cclib -lwindowsapp))\n (libraries opam-client))\n</code></pre></div></div>\n\n<p>It works, but it\u2019s not for the faint-hearted.</p>\n\n<p>I additionally added <code>(enabled_if (= %{os_type} Win32))</code> to my rule so it only runs on Windows.</p>",
+21
mte/2025_06_18_windows-reflinks.json
···+"summary": "Who knew there was a limit on creating hard links? I didn\u2019t even consider this until my hard links started to fail. On NTFS, the limit is 1024 links to any given file. Subsequent research shows that the limit varies between file systems, with NTFS at the lower end of the scale.",+"content": "<p>Who knew there was a limit on creating hard links? I didn\u2019t even consider this until my hard links started to fail. On NTFS, the limit is 1024 links to any given file. Subsequent research shows that the limit varies between file systems, with NTFS at the lower end of the scale.</p>\n\n<p>Here\u2019s an excerpt from <a href=\"https://en.wikipedia.org/wiki/Hard_link\">Wikipedia</a> on the subject.</p>\n\n<blockquote>\n <p>In AT&T Unix System 6, released in 1975, the number of hard links allowed was 127. On Unix-like systems, the in-memory counter is 4,294,967,295 (on 32-bit machines) or 18,446,744,073,709,551,615 (on 64-bit machines). In some file systems, the number of hard links is limited more strictly by their on-disk format. For example, as of Linux 3.11, the ext4 file system limits the number of hard links on a file to 65,000. Windows enforces a limit of 1024 hard links to a file on NTFS volumes.</p>\n</blockquote>\n\n<p>This restriction probably doesn\u2019t even come close to being a practical limit for most normal use cases, but it\u2019s worth noting that <code>git.exe</code> has 142 hard links on a standard Cygwin installation.</p>\n\n<div><div><pre><code>fsutil hardlink list %LOCALAPPDATA%\\opam\\.cygwin\\root\\bin\\git.exe\n</code></pre></div></div>\n\n<p>Back in 2012, Microsoft released ReFS as an alternative to NTFS. The feature gap has closed over the years, with hard links being introduced in the preview of Windows Server 2022. 
ReFS supports 1 million hard links per file, but even more interestingly, it supports <a href=\"https://learn.microsoft.com/en-us/windows/win32/fileio/block-cloning\">block cloning</a>, aka <a href=\"https://blogs.oracle.com/linux/post/xfs-data-block-sharing-reflink\">reflinks</a>, whereby files can share common data blocks. When changes are written to a block, it is copied, and its references are updated.</p>\n\n<p>The implementation is interesting because it doesn\u2019t work in quite the way that one would think. It can only be used to clone complete clusters. Therefore, we must first call <a href=\"https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_get_integrity_information\">FSCTL_GET_INTEGRITY_INFORMATION</a>, which returns <a href=\"https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ns-winioctl-fsctl_get_integrity_information_buffer\">FSCTL_GET_INTEGRITY_INFORMATION_BUFFER</a> with the cluster size in bytes.</p>\n\n<p>Despite <a href=\"https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_duplicate_extents_to_file\">FSCTL_DUPLICATE_EXTENTS_TO_FILE</a> taking an exact number of bytes, we must round up the file size to the next cluster boundary.</p>\n\n<p>Additionally, the target file needs to exist before the clone and be large enough to hold the cloned clusters. In practice, this means calling <a href=\"https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew\">CreateFileW</a> to create the file and then calling <a href=\"https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfileinformationbyhandle\">SetFileInformationByHandle</a> to set the file size to match the source file (not the rounded cluster size).</p>\n\n<p>Taking an example file of 23075 bytes, this would be rounded to 24576 bytes (6 clusters). 
We can use <code>fsutil file queryextents</code> to get detailed information about the clusters used in the source file:</p>\n\n<div><div><pre><code>D:\\> fsutil file queryextents source.txt\nVCN: 0x0 Clusters: 0x6 LCN: 0x2d3d801\n</code></pre></div></div>\n\n<p>Now we clone the file <code>ReFS-clone d:\\source.txt d:\\target.txt</code> and then query the extents which it uses.</p>\n\n<div><div><pre><code>D:\\> fsutil file queryextents target.txt\nVCN: 0x0 Clusters: 0x5 LCN: 0x2d3d801\nVCN: 0x5 Clusters: 0x1 LCN: 0x2d3c801\n</code></pre></div></div>\n\n<p>The first five whole clusters are shared between the two files, while the final partial cluster has been copied. When trying to implement this, I initially used a text file of just a few bytes and couldn\u2019t get it to clone. After I rounded up the size to 4096, the API returned successfully, but there were no shared clusters. It wasn\u2019t until I tried a larger file with the size rounded up that I started to see actual shared clusters.</p>\n\n<div><div><pre><code>D:\\>echo hello > foo.txt\n\nD:\\>fsutil file queryextents foo.txt\nVCN: 0x0 Clusters: 0x1 LCN: 0x2d3dc04\n\nD:\\>ReFS-clone.exe foo.txt bar.txt\nReFS File Clone Utility\nReFS Clone: foo.txt -> bar.txt\nCluster size: 4096 bytes\nFile size: 8 bytes -> 4096 bytes (1 clusters)\nCloning 4096 bytes...\nSuccess!\nReFS cloning completed successfully.\n\nD:\\>fsutil file queryextents bar.txt\nVCN: 0x0 Clusters: 0x1 LCN: 0x2d3d807\n</code></pre></div></div>\n\n<p>The code is on GitHub in <a href=\"https://github.com/mtelvers/ReFS-Clone\">ReFS-Clone</a>.</p>",
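The cluster-rounding step described above is easy to get wrong, so here is a minimal sketch of just the arithmetic in Python. This is illustrative only: the real utility calls the Win32 APIs, and `round_up_to_cluster` is a hypothetical helper name, not part of ReFS-Clone.

```python
def round_up_to_cluster(size: int, cluster_size: int) -> int:
    """Round a byte count up to the next cluster boundary.

    FSCTL_DUPLICATE_EXTENTS_TO_FILE can only clone whole clusters, so the
    length passed to the clone must be the file size rounded up like this,
    even though the target file's logical size stays at the original value.
    """
    return ((size + cluster_size - 1) // cluster_size) * cluster_size

# The worked example from the post: a 23075-byte file on a volume
# with 4096-byte clusters rounds up to 24576 bytes, i.e. 6 clusters.
rounded = round_up_to_cluster(23075, 4096)
print(rounded, rounded // 4096)  # 24576 6
```

The target's logical size is then set back to the exact source size (23075 bytes here) with SetFileInformationByHandle, as the post notes.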
+21
mte/2025_06_20_tailscale.json
···+"summary": "On a typical day, I sit at my antique Mac Pro Trashcan with every window running SSH to some remote machine. When I\u2019m away from home and using my MacBook, I can still SSH to those remote machines; however, with my recent Windows work, I\u2019ve been connecting to a Dell OptiPlex on my home LAN over Remote Desktop. How can I work remotely when I want to access my Windows machine?",+"content": "<p>On a typical day, I sit at my antique Mac Pro Trashcan with every window running SSH to some remote machine. When I\u2019m away from home and using my MacBook, I can still SSH to those remote machines; however, with my recent Windows work, I\u2019ve been connecting to a Dell OptiPlex on my home LAN over Remote Desktop. How can I work remotely when I want to access my Windows machine?</p>\n\n<p>It\u2019s the age-old issue of connecting to your home network, which is hidden behind your home broadband router with a dynamic public IP address. I could use a dynamic DNS service to track my home router and configure port forwarding, but would you open RDP to the Internet?</p>\n\n<p>I love VNC, but the recent change in the licensing model, whereby the free tier now has only three machines, combined with frustrating performance on the low bandwidth and intermittent connections we get on train WiFi, made me try an alternate solution. Thomas has Tailscale set up in the Paris office, and I decided to create a setup for home.</p>\n\n<p>I\u2019d rather not install any software on my Windows machine, as I wipe it pretty frequently, and I don\u2019t need a VPN interfering with my <code>containerd</code> implementation. However, Tailscale supports a configuration whereby you can route to local networks.</p>\n\n<p>After signing up for a free personal account, I installed the Tailscale client on my MacBook and Mac Pro (at home). 
On the Mac Pro, I enabled \u2018Allow Local Network Access\u2019 and from a Terminal window, I went to <code>/Applications/Tailscale.app/Contents/MacOS</code> and ran <code>./Tailscale set --advertise-routes=192.168.0.0/24</code>. With this done, looking at the machine list on the <a href=\"https://login.tailscale.com/admin/machines\">Tailscale console</a>, my Mac Pro lists <code>Subnets</code>. Clicking on the three dots, and opening <code>Edit route settings</code>, I could enable the advertised subnet, 192.168.0.0/24.</p>\n\n<p>Checking <code>netstat -rn</code> on my MacBook shows that 192.168.0 is routed over the VPN.</p>\n\n<div><div><pre><code>Routing tables\n\nInternet:\nDestination Gateway Flags Netif Expire\ndefault 10.101.2.1 UGScg en0\ndefault link#36 UCSIg utun12\n10.101.2/24 link#6 UCS en0 !\n10.101.2.1/32 link#6 UCS en0 !\n...\n192.168.0 link#36 UCS utun12\n...\n</code></pre></div></div>\n\n<p>From my MacBook, I can now use Microsoft Remote Desktop to connect to the private IP address of my Windows machine.</p>\n\n<p>OpenSSH is an optional feature on Windows 11. It can be turned on via Settings -> Apps -> Optional Features, clicking \u201cAdd a feature\u201d and installing \u201cOpenSSH Server\u201d. Then, open Services and set the startup type for \u201cOpenSSH SSH Server\u201d to Automatic.</p>\n\n<p>It didn\u2019t make the train WiFi any better, but connecting over SSH was pretty convenient when the bandwidth was low.</p>\n\n<p>Note that you may want to disable key expiry on your home machine; otherwise, it might require you to reauthenticate at a critical moment.</p>",
+21
mte/2025_06_21_macos-sequoia-include-path.json
···+"summary": "@mseri raised issue #175 as the macOS workers cannot find the most basic C++ headers. I easily eliminated Obuilder, as opam install mccs.1.1+19 didn\u2019t work on the macOS workers natively.",+"content": "<p>@mseri raised <a href=\"https://github.com/ocaml/infrastructure/issues/175\">issue #175</a> as the macOS workers cannot find the most basic C++ headers. I easily eliminated <a href=\"https://github.com/ocurrent/obuilder\">Obuilder</a>, as <code>opam install mccs.1.1+19</code> didn\u2019t work on the macOS workers natively.</p>\n\n<p>At face value, the problem appears pretty common, and there are numerous threads on <a href=\"https://stackoverflow.com\">Stack Overflow</a> such as this <a href=\"https://stackoverflow.com/questions/77250743/mac-xcode-g-cannot-compile-even-a-basic-c-program-issues-with-standard-libr\">one</a>; however, the resolutions I tried didn\u2019t work. I was reluctant to try some of the more intrusive changes, like creating a symlink of every header from <code>/usr/include/</code> to <code>/Library/Developer/CommandLineTools/usr/include/c++/v1</code>, as this doesn\u2019t seem to be what Apple intends.</p>\n\n<p>For the record, a program such as this:</p>\n\n<div><div><pre><code><span>#include</span> <span><iostream></span><span>\n</span>\n<span>using</span> <span>namespace</span> <span>std</span><span>;</span>\n\n<span>int</span> <span>main</span><span>()</span> <span>{</span>\n <span>cout</span> <span><<</span> <span>\"Hello World!\"</span> <span><<</span> <span>endl</span><span>;</span>\n <span>return</span> <span>0</span><span>;</span>\n<span>}</span>\n</code></pre></div></div>\n\n<p>Fails like this:</p>\n\n<div><div><pre><code>% c++ hello.cpp <span>-o</span> hello <span>-v</span>\nApple clang version 17.0.0 <span>(</span>clang-1700.0.13.3<span>)</span>\nTarget: x86_64-apple-darwin24.5.0\nThread model: posix\nInstalledDir: /Library/Developer/CommandLineTools/usr/bin\n 
<span>\"/Library/Developer/CommandLineTools/usr/bin/clang\"</span> <span>-cc1</span> <span>-triple</span> x86_64-apple-macosx15.0.0 <span>-Wundef-prefix</span><span>=</span>TARGET_OS_ <span>-Wdeprecated-objc-isa-usage</span> <span>-Werror</span><span>=</span>deprecated-objc-isa-usage <span>-Werror</span><span>=</span>implicit-function-declaration <span>-emit-obj</span> <span>-dumpdir</span> hello- <span>-disable-free</span> <span>-clear-ast-before-backend</span> <span>-disable-llvm-verifier</span> <span>-discard-value-names</span> <span>-main-file-name</span> hello.cpp <span>-mrelocation-model</span> pic <span>-pic-level</span> 2 <span>-mframe-pointer</span><span>=</span>all <span>-fno-strict-return</span> <span>-ffp-contract</span><span>=</span>on <span>-fno-rounding-math</span> <span>-funwind-tables</span><span>=</span>2 <span>-target-sdk-version</span><span>=</span>15.4 <span>-fvisibility-inlines-hidden-static-local-var</span> <span>-fdefine-target-os-macros</span> <span>-fno-assume-unique-vtables</span> <span>-fno-modulemap-allow-subdirectory-search</span> <span>-target-cpu</span> penryn <span>-tune-cpu</span> generic <span>-debugger-tuning</span><span>=</span>lldb <span>-fdebug-compilation-dir</span><span>=</span>/Users/administrator/x <span>-target-linker-version</span> 1167.4.1 <span>-v</span> <span>-fcoverage-compilation-dir</span><span>=</span>/Users/administrator/x <span>-resource-dir</span> /Library/Developer/CommandLineTools/usr/lib/clang/17 <span>-isysroot</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk <span>-internal-isystem</span> /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1 <span>-internal-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include <span>-internal-isystem</span> /Library/Developer/CommandLineTools/usr/lib/clang/17/include <span>-internal-externc-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include <span>-internal-externc-isystem</span> 
/Library/Developer/CommandLineTools/usr/include <span>-Wno-reorder-init-list</span> <span>-Wno-implicit-int-float-conversion</span> <span>-Wno-c99-designator</span> <span>-Wno-final-dtor-non-final-class</span> <span>-Wno-extra-semi-stmt</span> <span>-Wno-misleading-indentation</span> <span>-Wno-quoted-include-in-framework-header</span> <span>-Wno-implicit-fallthrough</span> <span>-Wno-enum-enum-conversion</span> <span>-Wno-enum-float-conversion</span> <span>-Wno-elaborated-enum-base</span> <span>-Wno-reserved-identifier</span> <span>-Wno-gnu-folding-constant</span> <span>-fdeprecated-macro</span> <span>-ferror-limit</span> 19 <span>-stack-protector</span> 1 <span>-fstack-check</span> <span>-mdarwin-stkchk-strong-link</span> <span>-fblocks</span> <span>-fencode-extended-block-signature</span> <span>-fregister-global-dtors-with-atexit</span> <span>-fgnuc-version</span><span>=</span>4.2.1 <span>-fno-cxx-modules</span> <span>-fskip-odr-check-in-gmf</span> <span>-fcxx-exceptions</span> <span>-fexceptions</span> <span>-fmax-type-align</span><span>=</span>16 <span>-fcommon</span> <span>-fcolor-diagnostics</span> <span>-clang-vendor-feature</span><span>=</span>+disableNonDependentMemberExprInCurrentInstantiation <span>-fno-odr-hash-protocols</span> <span>-clang-vendor-feature</span><span>=</span>+enableAggressiveVLAFolding <span>-clang-vendor-feature</span><span>=</span>+revert09abecef7bbf <span>-clang-vendor-feature</span><span>=</span>+thisNoAlignAttr <span>-clang-vendor-feature</span><span>=</span>+thisNoNullAttr <span>-clang-vendor-feature</span><span>=</span>+disableAtImportPrivateFrameworkInImplementationError <span>-D__GCC_HAVE_DWARF2_CFI_ASM</span><span>=</span>1 <span>-o</span> /var/folders/sh/9c8b7hzd2wb1g2_ky78vqw5r0000gn/T/hello-a268ab.o <span>-x</span> c++ hello.cpp\nclang <span>-cc1</span> version 17.0.0 <span>(</span>clang-1700.0.13.3<span>)</span> default target x86_64-apple-darwin24.5.0\nignoring nonexistent directory 
<span>\"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include\"</span>\nignoring nonexistent directory <span>\"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/SubFrameworks\"</span>\nignoring nonexistent directory <span>\"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/Library/Frameworks\"</span>\n<span>#include \"...\" search starts here:</span>\n<span>#include <...> search starts here:</span>\n /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1\n /Library/Developer/CommandLineTools/usr/lib/clang/17/include\n /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include\n /Library/Developer/CommandLineTools/usr/include\n /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks <span>(</span>framework directory<span>)</span>\nEnd of search list.\nhello.cpp:1:10: fatal error: <span>'iostream'</span> file not found\n 1 | <span>#include <iostream></span>\n | ^~~~~~~~~~\n1 error generated.\n</code></pre></div></div>\n\n<p>That first folder looked strange: <code>bin/../include/c++/v1</code>. Really? What\u2019s in there? 
Not much:</p>\n\n<div><div><pre><code>% <span>ls</span> <span>-l</span> /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1\ntotal 40\n<span>-rw-r--r--</span> 1 root wheel 44544 7 Apr 2022 __functional_03\n<span>-rw-r--r--</span> 1 root wheel 6532 7 Apr 2022 __functional_base_03\n<span>-rw-r--r--</span> 1 root wheel 2552 7 Apr 2022 __sso_allocator\n</code></pre></div></div>\n\n<p>I definitely have <code>iostream</code> on the machine:</p>\n\n<div><div><pre><code>% <span>ls</span> <span>-l</span> /Library/Developer/CommandLineTools/SDKs/MacOSX<span>*</span>.sdk/usr/include/c++/v1/iostream\n<span>-rw-r--r--</span> 1 root wheel 1507 8 Mar 03:36 /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/iostream\n<span>-rw-r--r--</span> 1 root wheel 1391 13 Nov 2021 /Library/Developer/CommandLineTools/SDKs/MacOSX12.1.sdk/usr/include/c++/v1/iostream\n<span>-rw-r--r--</span> 1 root wheel 1583 13 Apr 2024 /Library/Developer/CommandLineTools/SDKs/MacOSX14.5.sdk/usr/include/c++/v1/iostream\n<span>-rw-r--r--</span> 1 root wheel 1583 13 Apr 2024 /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/usr/include/c++/v1/iostream\n<span>-rw-r--r--</span> 1 root wheel 1583 10 Nov 2024 /Library/Developer/CommandLineTools/SDKs/MacOSX15.2.sdk/usr/include/c++/v1/iostream\n<span>-rw-r--r--</span> 1 root wheel 1507 8 Mar 03:36 /Library/Developer/CommandLineTools/SDKs/MacOSX15.4.sdk/usr/include/c++/v1/iostream\n<span>-rw-r--r--</span> 1 root wheel 1507 8 Mar 03:36 /Library/Developer/CommandLineTools/SDKs/MacOSX15.sdk/usr/include/c++/v1/iostream\n</code></pre></div></div>\n\n<p>I tried on my MacBook, which compiled the test program without issue. However, that was running Monterey, whereas the workers were running Sequoia. The <em>include</em> paths on my laptop looked much better. 
Where are they configured?</p>\n\n<div><div><pre><code>% c++ <span>-v</span> <span>-o</span> <span>test </span>test.cpp\nApple clang version 15.0.0 <span>(</span>clang-1500.3.9.4<span>)</span>\nTarget: x86_64-apple-darwin23.5.0\nThread model: posix\nInstalledDir: /Library/Developer/CommandLineTools/usr/bin\n <span>\"/Library/Developer/CommandLineTools/usr/bin/clang\"</span> <span>-cc1</span> <span>-triple</span> x86_64-apple-macosx14.0.0 <span>-Wundef-prefix</span><span>=</span>TARGET_OS_ <span>-Wdeprecated-objc-isa-usage</span> <span>-Werror</span><span>=</span>deprecated-objc-isa-usage <span>-Werror</span><span>=</span>implicit-function-declaration <span>-emit-obj</span> <span>-mrelax-all</span> <span>--mrelax-relocations</span> <span>-disable-free</span> <span>-clear-ast-before-backend</span> <span>-disable-llvm-verifier</span> <span>-discard-value-names</span> <span>-main-file-name</span> test.cpp <span>-mrelocation-model</span> pic <span>-pic-level</span> 2 <span>-mframe-pointer</span><span>=</span>all <span>-fno-strict-return</span> <span>-ffp-contract</span><span>=</span>on <span>-fno-rounding-math</span> <span>-funwind-tables</span><span>=</span>2 <span>-target-sdk-version</span><span>=</span>14.4 <span>-fvisibility-inlines-hidden-static-local-var</span> <span>-target-cpu</span> penryn <span>-tune-cpu</span> generic <span>-debugger-tuning</span><span>=</span>lldb <span>-target-linker-version</span> 1053.12 <span>-v</span> <span>-fcoverage-compilation-dir</span><span>=</span>/Users/mtelvers/x <span>-resource-dir</span> /Library/Developer/CommandLineTools/usr/lib/clang/15.0.0 <span>-isysroot</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk <span>-I</span>/usr/local/include <span>-internal-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1 <span>-internal-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include <span>-internal-isystem</span> 
/Library/Developer/CommandLineTools/usr/lib/clang/15.0.0/include <span>-internal-externc-isystem</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include <span>-internal-externc-isystem</span> /Library/Developer/CommandLineTools/usr/include <span>-Wno-reorder-init-list</span> <span>-Wno-implicit-int-float-conversion</span> <span>-Wno-c99-designator</span> <span>-Wno-final-dtor-non-final-class</span> <span>-Wno-extra-semi-stmt</span> <span>-Wno-misleading-indentation</span> <span>-Wno-quoted-include-in-framework-header</span> <span>-Wno-implicit-fallthrough</span> <span>-Wno-enum-enum-conversion</span> <span>-Wno-enum-float-conversion</span> <span>-Wno-elaborated-enum-base</span> <span>-Wno-reserved-identifier</span> <span>-Wno-gnu-folding-constant</span> <span>-fdeprecated-macro</span> <span>-fdebug-compilation-dir</span><span>=</span>/Users/mtelvers/x <span>-ferror-limit</span> 19 <span>-stack-protector</span> 1 <span>-fstack-check</span> <span>-mdarwin-stkchk-strong-link</span> <span>-fblocks</span> <span>-fencode-extended-block-signature</span> <span>-fregister-global-dtors-with-atexit</span> <span>-fgnuc-version</span><span>=</span>4.2.1 <span>-fno-cxx-modules</span> <span>-fcxx-exceptions</span> <span>-fexceptions</span> <span>-fmax-type-align</span><span>=</span>16 <span>-fcommon</span> <span>-fcolor-diagnostics</span> <span>-clang-vendor-feature</span><span>=</span>+disableNonDependentMemberExprInCurrentInstantiation <span>-fno-odr-hash-protocols</span> <span>-clang-vendor-feature</span><span>=</span>+enableAggressiveVLAFolding <span>-clang-vendor-feature</span><span>=</span>+revert09abecef7bbf <span>-clang-vendor-feature</span><span>=</span>+thisNoAlignAttr <span>-clang-vendor-feature</span><span>=</span>+thisNoNullAttr <span>-mllvm</span> <span>-disable-aligned-alloc-awareness</span><span>=</span>1 <span>-D__GCC_HAVE_DWARF2_CFI_ASM</span><span>=</span>1 <span>-o</span> /var/folders/15/4zw4hb9s40b8cmff3z5bdszc0000gp/T/test-71e229.o 
<span>-x</span> c++ test.cpp\nclang <span>-cc1</span> version 15.0.0 <span>(</span>clang-1500.3.9.4<span>)</span> default target x86_64-apple-darwin23.5.0\nignoring nonexistent directory <span>\"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/local/include\"</span>\nignoring nonexistent directory <span>\"/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/Library/Frameworks\"</span>\n<span>#include \"...\" search starts here:</span>\n<span>#include <...> search starts here:</span>\n /usr/local/include\n /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1\n /Library/Developer/CommandLineTools/usr/lib/clang/15.0.0/include\n /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include\n /Library/Developer/CommandLineTools/usr/include\n /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/System/Library/Frameworks <span>(</span>framework directory<span>)</span>\nEnd of search list.\n <span>\"/Library/Developer/CommandLineTools/usr/bin/ld\"</span> <span>-demangle</span> <span>-lto_library</span> /Library/Developer/CommandLineTools/usr/lib/libLTO.dylib <span>-no_deduplicate</span> <span>-dynamic</span> <span>-arch</span> x86_64 <span>-platform_version</span> macos 14.0.0 14.4 <span>-syslibroot</span> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk <span>-o</span> <span>test</span> <span>-L</span>/usr/local/lib /var/folders/15/4zw4hb9s40b8cmff3z5bdszc0000gp/T/test-71e229.o <span>-lc</span>++ <span>-lSystem</span> /Library/Developer/CommandLineTools/usr/lib/clang/15.0.0/lib/darwin/libclang_rt.osx.a\n</code></pre></div></div>\n\n<p>I\u2019ve been meaning to upgrade my MacBook, and this looked like the perfect excuse. I updated to Sequoia and then updated the Xcode command-line tools. 
The test compilation worked, and the paths looked good, but I had clang 1700.0.13.5, whereas the workers had 1700.0.13.3.</p>\n\n<div><div><pre><code>% c++ <span>-v</span> <span>-o</span> <span>test </span>test.cpp\nApple clang version 17.0.0 <span>(</span>clang-1700.0.13.5<span>)</span>\nTarget: x86_64-apple-darwin24.5.0\nThread model: posix\nInstalledDir: /Library/Developer/CommandLineTools/usr/bin\n</code></pre></div></div>\n\n<p>I updated the workers to 1700.0.13.5, which didn\u2019t make any difference. The workers still had that funny <code>/../</code> path, which wasn\u2019t present anywhere else. I searched <code>/Library/Developer/CommandLineTools/usr/bin/../include/c++/v1 site:stackoverflow.com</code> and the answer was the top <a href=\"https://stackoverflow.com/a/79606435\">match</a>.</p>\n\n<blockquote>\n <p>Rename or if you\u2019re confident enough, delete /Library/Developer/CommandLineTools/usr/include/c++, then clang++ will automatically search headers under /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1 and find your header. That directory is very likely an artifact of OS upgrade and by deleting it clang++ will realise that it should search in the header paths of new SDKs.</p>\n</blockquote>\n\n<p>I wasn\u2019t confident, so I moved it: <code>sudo mv c++ ~</code>. With that done, the test program builds correctly! Have a read of the <a href=\"https://stackoverflow.com/a/79606435\">answer</a> on Stack Overflow.</p>\n\n<p>Now, rather more cavalierly, I removed the folder on all the i7 and m1 workers:</p>\n\n<div><div><pre><code><span>$ </span><span>for </span>a <span>in</span> <span>{</span>01..04<span>}</span> <span>;</span> <span>do </span>ssh m1-worker-<span>$a</span>.macos.ci.dev <span>sudo rm</span> <span>-r</span> /Library/Developer/CommandLineTools/usr/include/c++ <span>;</span> <span>done</span>\n</code></pre></div></div>",
+21
mte/2025_06_23_transitive-reduction.json
···+"summary": "I have previously written about using a topological sort of a directed acyclic graph (DAG) of package dependencies to create an ordered list of installation operations. I now want to create a transitive reduction, giving a graph with the same vertices and the fewest number of edges possible.",+"content": "<p>I have previously written about using a <a href=\"https://www.tunbury.org/topological-sort/\">topological sort</a> of a directed acyclic graph (DAG) of package dependencies to create an ordered list of installation operations. I now want to create a transitive reduction, giving a graph with the same vertices and the fewest number of edges possible.</p>\n\n<p>This is interesting in opam, where a typical package is defined to depend upon both OCaml and Dune. However, Dune depends upon OCaml, so minimally the package only depends upon Dune. For opam, we would typically list both, as they may have version constraints.</p>\n\n<div><div><pre><code><span>depends</span><span>:</span> <span>[</span>\n <span>\"</span><span>dune\"</span> <span>{</span><span>></span><span>= \"3.17\"</span><span>}</span>\n <span>\"</span><span>ocaml\"</span>\n<span>]</span>\n</code></pre></div></div>\n\n<p>Given a topologically sorted list of packages, we can fold over the list to build a map of the packages and dependencies. 
As each package is considered in turn, it must either have no dependencies or the dependent package must already be in the map.</p>\n\n<div><div><pre><code><span>let</span> <span>pkg_deps</span> <span>solution</span> <span>=</span>\n <span>List</span><span>.</span><span>fold_left</span> <span>(</span><span>fun</span> <span>map</span> <span>pkg</span> <span>-></span>\n <span>let</span> <span>deps_direct</span> <span>=</span> <span>PackageMap</span><span>.</span><span>find</span> <span>pkg</span> <span>solution</span> <span>in</span>\n <span>let</span> <span>deps_plus_children</span> <span>=</span> <span>PackageSet</span><span>.</span><span>fold</span> <span>(</span><span>fun</span> <span>pkg</span> <span>acc</span> <span>-></span>\n <span>PackageSet</span><span>.</span><span>union</span> <span>acc</span> <span>(</span><span>PackageMap</span><span>.</span><span>find</span> <span>pkg</span> <span>map</span><span>))</span> <span>deps_direct</span> <span>deps_direct</span> <span>in</span>\n <span>PackageMap</span><span>.</span><span>add</span> <span>pkg</span> <span>deps_plus_children</span> <span>map</span><span>)</span> <span>PackageMap</span><span>.</span><span>empty</span><span>;;</span>\n</code></pre></div></div>\n\n<p>To generate the transitive reduction, take each set of dependencies for every package in the solution and remove those where the package is a member of the set of all the dependencies of any other directly descendant package.</p>\n\n<div><div><pre><code><span>let</span> <span>reduce</span> <span>dependencies</span> <span>=</span>\n <span>PackageMap</span><span>.</span><span>map</span> <span>(</span><span>fun</span> <span>u</span> <span>-></span>\n <span>PackageSet</span><span>.</span><span>filter</span> <span>(</span><span>fun</span> <span>v</span> <span>-></span>\n <span>let</span> <span>others</span> <span>=</span> <span>PackageSet</span><span>.</span><span>remove</span> <span>v</span> <span>u</span> <span>in</span>\n 
<span>PackageSet</span><span>.</span><span>fold</span> <span>(</span><span>fun</span> <span>o</span> <span>acc</span> <span>-></span>\n <span>acc</span> <span>||</span> <span>PackageSet</span><span>.</span><span>mem</span> <span>v</span> <span>(</span><span>PackageMap</span><span>.</span><span>find</span> <span>o</span> <span>dependencies</span><span>)</span>\n <span>)</span> <span>others</span> <span>false</span> <span>|></span> <span>not</span>\n <span>)</span> <span>u</span>\n <span>);;</span>\n</code></pre></div></div>\n\n<p>Let\u2019s create a quick print function and then test the code:</p>\n\n<div><div><pre><code><span>let</span> <span>print</span> <span>=</span> <span>PackageMap</span><span>.</span><span>iter</span> <span>(</span><span>fun</span> <span>p</span> <span>deps</span> <span>-></span>\n <span>print_endline</span> <span>(</span><span>p</span> <span>^</span> <span>\": \"</span> <span>^</span> <span>(</span><span>PackageSet</span><span>.</span><span>to_list</span> <span>deps</span> <span>|></span> <span>String</span><span>.</span><span>concat</span> <span>\",\"</span><span>))</span>\n<span>);;</span>\n</code></pre></div></div>\n\n<p>The original solution is</p>\n\n<div><div><pre><code><span>#</span> <span>print</span> <span>dune</span><span>;;</span>\n<span>base</span><span>-</span><span>threads</span><span>.</span><span>base</span><span>:</span>\n<span>base</span><span>-</span><span>unix</span><span>.</span><span>base</span><span>:</span>\n<span>dune</span><span>:</span> <span>base</span><span>-</span><span>threads</span><span>.</span><span>base</span><span>,</span><span>base</span><span>-</span><span>unix</span><span>.</span><span>base</span><span>,</span><span>ocaml</span>\n<span>ocaml</span><span>:</span> <span>ocaml</span><span>-</span><span>config</span><span>,</span><span>ocaml</span><span>-</span><span>variants</span>\n<span>ocaml</span><span>-</span><span>config</span><span>:</span> 
<span>ocaml</span><span>-</span><span>variants</span>\n<span>ocaml</span><span>-</span><span>variants</span><span>:</span>\n<span>-</span> <span>:</span> <span>unit</span> <span>=</span> <span>()</span>\n</code></pre></div></div>\n\n<p>And the reduced solution is:</p>\n\n<div><div><pre><code><span>#</span> <span>let</span> <span>dependencies</span> <span>=</span> <span>pkg_deps</span> <span>dune</span> <span>(</span><span>topological_sort</span> <span>dune</span><span>);;</span>\n<span>val</span> <span>dependencies</span> <span>:</span> <span>PackageSet</span><span>.</span><span>t</span> <span>PackageMap</span><span>.</span><span>t</span> <span>=</span> <span><</span><span>abstr</span><span>></span>\n<span>#</span> <span>print</span> <span>(</span><span>reduce</span> <span>dependencies</span> <span>dune</span><span>);;</span>\n<span>base</span><span>-</span><span>threads</span><span>.</span><span>base</span><span>:</span>\n<span>base</span><span>-</span><span>unix</span><span>.</span><span>base</span><span>:</span>\n<span>dune</span><span>:</span> <span>base</span><span>-</span><span>threads</span><span>.</span><span>base</span><span>,</span><span>base</span><span>-</span><span>unix</span><span>.</span><span>base</span><span>,</span><span>ocaml</span>\n<span>ocaml</span><span>:</span> <span>ocaml</span><span>-</span><span>config</span>\n<span>ocaml</span><span>-</span><span>config</span><span>:</span> <span>ocaml</span><span>-</span><span>variants</span>\n<span>ocaml</span><span>-</span><span>variants</span><span>:</span>\n<span>-</span> <span>:</span> <span>unit</span> <span>=</span> <span>()</span>\n</code></pre></div></div>\n\n<p>This doesn\u2019t look like much of a difference, but when applied to a larger graph, for example, 0install.2.18, the reduction is quite dramatic.</p>\n\n<p>Initial graph</p>\n\n<p><img alt=\"opam installation graph for 0install\" src=\"https://www.tunbury.org/images/0install-graph.png\"></p>\n\n<p>Transitive reduction</p>\n\n<p><img 
alt=\"Transitive reduction of the opam installation graph for 0install\" src=\"https://www.tunbury.org/images/0install-reduced-graph.png\"></p>",
+21
mte/2025_06_24_opam2web.json
···+"summary": "The opam2web image for opam.ocaml.org is huge, weighing in at more than 25 GB. The bulk of this data is opam archives, which are updated and copied into a stock caddy image.",+"content": "<p>The opam2web image for <a href=\"https://opam.ocaml.org\">opam.ocaml.org</a> is huge, weighing in at more than 25 GB. The bulk of this data is opam archives, which are updated and copied into a stock caddy image.</p>\n\n<p>There are two archives, <code>ocaml/opam.ocaml.org-legacy</code>, which hasn\u2019t changed for 5 years and holds the cache for opam 1.x, and <code>ocaml/opam:archive</code>, which is updated weekly.</p>\n\n<p>The current <code>Dockerfile</code> copies these files into a new layer each time opam2web builds.</p>\n\n<div><div><pre><code><span>FROM</span><span> </span><span>--platform=linux/amd64 ocaml/opam:archive</span><span> </span><span>as</span><span> </span><span>opam-archive</span>\n<span>FROM</span><span> </span><span>ocaml/opam.ocaml.org-legacy</span><span> </span><span>as</span><span> </span><span>opam-legacy</span>\n<span>FROM</span><span> </span><span>alpine:3.20</span><span> </span><span>as</span><span> </span><span>opam2web</span>\n...\n<span>COPY</span><span> --from=opam-legacy . 
/www</span>\n...\n<span>RUN </span><span>--mount</span><span>=</span><span>type</span><span>=</span><span>bind</span>,target<span>=</span>/cache,from<span>=</span>opam-archive rsync <span>-aH</span> /cache/cache/ /www/cache/\n...\n</code></pre></div></div>\n\n<p>And later, the entire <code>/www</code> structure is copied into a <code>caddy:2.8.4</code> image.</p>\n\n<div><div><pre><code><span>FROM</span><span> caddy:2.8.4</span>\n<span>WORKDIR</span><span> /srv</span>\n<span>COPY</span><span> --from=opam2web /www /usr/share/caddy</span>\n<span>COPY</span><span> Caddyfile /etc/caddy/Caddyfile</span>\n<span>ENTRYPOINT</span><span> [\"caddy\", \"run\", \"--config\", \"/etc/caddy/Caddyfile\", \"--adapter\", \"caddyfile\"]</span>\n</code></pre></div></div>\n\n<p>This method is considered \u201cbest practice\u201d when creating Docker images, but in this case, it produces a very large image, which takes a long time to deploy.</p>\n\n<p>For Docker to use an existing layer, we need the final <code>FROM ...</code> to be the layer we want to use as the base. 
In the above snippet, the <code>caddy:2.8.4</code> layer will be the base layer and will be reused.</p>\n\n<p>The archive, <code>ocaml/opam:archive</code>, is created by this Dockerfile, which ultimately uses <code>alpine:latest</code>.</p>\n\n<div><div><pre><code><span>FROM</span><span> </span><span>ocaml/opam:archive</span><span> </span><span>AS</span><span> </span><span>opam-archive</span>\n<span>FROM</span><span> </span><span>ocurrent/opam-staging@sha256:f921cd51dda91f61a52a2c26a8a188f8618a2838e521d3e4afa3ca1da637903e</span><span> </span><span>AS</span><span> </span><span>archive</span>\n<span>WORKDIR</span><span> /home/opam/opam-repository</span>\n<span>RUN </span><span>--mount</span><span>=</span><span>type</span><span>=</span><span>bind</span>,target<span>=</span>/cache,from<span>=</span>opam-archive rsync <span>-aH</span> /cache/cache/ /home/opam/opam-repository/cache/\n<span>RUN </span>opam admin cache <span>--link</span><span>=</span>/home/opam/opam-repository/cache\n\n<span>FROM</span><span> alpine:latest</span>\n<span>COPY</span><span> --chown=0:0 --from=archive [ \"/home/opam/opam-repository/cache\", \"/cache\" ]</span>\n</code></pre></div></div>\n\n<p>In our opam2web build, we could use <code>FROM ocaml/opam:archive</code> and then <code>apk add caddy</code>, which would reuse the entire 15GB layer and add the few megabytes for <code>caddy</code>.</p>\n\n<p><code>ocaml/opam.ocaml.org-legacy</code> is another 8GB. This legacy data could be integrated by adding it to <code>ocaml/opam:archive</code> in a different directory to ensure compatibility with anyone else using this image. 
This is <a href=\"https://github.com/ocurrent/docker-base-images/pull/324\">PR#324</a></p>\n\n<div><div><pre><code> <span>let</span> <span>install_package_archive</span> <span>opam_image</span> <span>=</span>\n <span>let</span> <span>open</span> <span>Dockerfile</span> <span>in</span>\n<span>+</span> <span>from</span> <span>~</span><span>alias</span><span>:</span><span>\"opam-legacy\"</span> <span>\"ocaml/opam.ocaml.org-legacy\"</span> <span>@@</span>\n <span>from</span> <span>~</span><span>alias</span><span>:</span><span>\"opam-archive\"</span> <span>\"ocaml/opam:archive\"</span> <span>@@</span>\n <span>from</span> <span>~</span><span>alias</span><span>:</span><span>\"archive\"</span> <span>opam_image</span> <span>@@</span>\n <span>workdir</span> <span>\"/home/opam/opam-repository\"</span> <span>@@</span>\n <span>run</span> <span>~</span><span>mounts</span><span>:</span><span>[</span><span>mount_bind</span> <span>~</span><span>target</span><span>:</span><span>\"/cache\"</span> <span>~</span><span>from</span><span>:</span><span>\"opam-archive\"</span> <span>()</span><span>]</span> <span>\"rsync -aH /cache/cache/ /home/opam/opam-repository/cache/\"</span> <span>@@</span>\n <span>run</span> <span>\"opam admin cache --link=/home/opam/opam-repository/cache\"</span> <span>@@</span>\n <span>from</span> <span>\"alpine:latest\"</span> <span>@@</span>\n<span>+</span> <span>copy</span> <span>~</span><span>chown</span><span>:</span><span>\"0:0\"</span> <span>~</span><span>from</span><span>:</span><span>\"opam-legacy\"</span> <span>~</span><span>src</span><span>:</span><span>[</span><span>\"/\"</span><span>]</span> <span>~</span><span>dst</span><span>:</span><span>\"/legacy\"</span> <span>()</span> <span>@@</span>\n <span>copy</span> <span>~</span><span>chown</span><span>:</span><span>\"0:0\"</span> <span>~</span><span>from</span><span>:</span><span>\"archive\"</span> 
<span>~</span><span>src</span><span>:</span><span>[</span><span>\"/home/opam/opam-repository/cache\"</span><span>]</span> <span>~</span><span>dst</span><span>:</span><span>\"/cache\"</span> <span>()</span>\n</code></pre></div></div>\n\n<p>Finally, we need to update <a href=\"https://github.com/ocaml-opam/opam2web\">opam2web</a> to use <code>ocaml/opam:archive</code> as the base layer rather than <code>caddy:2.8.4</code>, resulting in the final part of the <code>Dockerfile</code> looking like this.</p>\n\n<div><div><pre><code><span>FROM</span><span> ocaml/opam:archive</span>\n<span>RUN </span>apk add <span>--update</span> git curl rsync libstdc++ rdfind caddy\n<span>COPY</span><span> --from=build-opam2web /opt/opam2web /usr/local</span>\n<span>COPY</span><span> --from=build-opam-doc /usr/bin/opam-dev /usr/local/bin/opam</span>\n<span>COPY</span><span> --from=build-opam-doc /opt/opam/doc /usr/local/share/opam2web/content/doc</span>\n<span>COPY</span><span> ext/key/opam-dev-team.pgp /www/opam-dev-pubkey.pgp</span>\n<span>ADD</span><span> bin/opam-web.sh /usr/local/bin</span>\n<span>ARG</span><span> DOMAIN=opam.ocaml.org</span>\n<span>ARG</span><span> OPAM_REPO_GIT_SHA=master</span>\n<span>ARG</span><span> BLOG_GIT_SHA=master</span>\n<span>RUN </span><span>echo</span> <span>${</span><span>OPAM_REPO_GIT_SHA</span><span>}</span> <span>>></span> /www/opam_git_sha\n<span>RUN </span><span>echo</span> <span>${</span><span>BLOG_GIT_SHA</span><span>}</span> <span>>></span> /www/blog_git_sha\n<span>RUN </span>/usr/local/bin/opam-web.sh <span>${</span><span>DOMAIN</span><span>}</span> <span>${</span><span>OPAM_REPO_GIT_SHA</span><span>}</span> <span>${</span><span>BLOG_GIT_SHA</span><span>}</span>\n<span>WORKDIR</span><span> /srv</span>\n<span>COPY</span><span> Caddyfile /etc/caddy/Caddyfile</span>\n<span>ENTRYPOINT</span><span> [\"caddy\", \"run\", \"--config\", \"/etc/caddy/Caddyfile\", \"--adapter\", \"caddyfile\"]</span>\n</code></pre></div></div>\n\n<p>I acknowledge that 
this final image now contains some extra unneeded packages such as <code>git</code>, <code>curl</code>, etc, but this seems a minor inconvenience.</p>\n\n<p>The <code>Caddyfile</code> can be adjusted to make everything still appear to be in the same place:</p>\n\n<div><div><pre><code>:80 {\n\tredir /install.sh https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh\n\tredir /install.ps1 https://raw.githubusercontent.com/ocaml/opam/master/shell/install.ps1\n\n\t@version_paths path /1.1/* /1.2.0/* /1.2.2/*\n\thandle @version_paths {\n\t\troot * /legacy\n\t\tfile_server\n\t}\n\n\thandle /cache/* {\n\t\troot * /\n\t\tfile_server\n\t}\n\n\thandle {\n\t\troot * /www\n\t\tfile_server\n\t}\n}\n</code></pre></div></div>\n\n<p>In this configuration, the Docker <em>push</em> is only 650MB rather than 25GB.</p>\n\n<p>The changes to opam2web are in <a href=\"https://github.com/ocaml-opam/opam2web/pull/245\">PR#245</a></p>\n\n<p>Test with some external URLs:</p>\n\n<ul>\n <li><a href=\"https://staging.opam.ocaml.org/index.tar.gz\">https://staging.opam.ocaml.org/index.tar.gz</a></li>\n <li><a href=\"https://staging.opam.ocaml.org/archives/0install.2.18/0install-2.18.tbz\">https://staging.opam.ocaml.org/archives/0install.2.18/0install-2.18.tbz</a></li>\n <li><a href=\"https://staging.opam.ocaml.org/cache/0install.2.18/0install-2.18.tbz\">https://staging.opam.ocaml.org/cache/0install.2.18/0install-2.18.tbz</a></li>\n <li><a href=\"https://staging.opam.ocaml.org/1.2.2/archives/0install.2.12.3+opam.tar.gz\">https://staging.opam.ocaml.org/1.2.2/archives/0install.2.12.3+opam.tar.gz</a></li>\n <li><a href=\"https://staging.opam.ocaml.org/1.2.0/archives/0install.2.12.1+opam.tar.gz\">https://staging.opam.ocaml.org/1.2.0/archives/0install.2.12.1+opam.tar.gz</a></li>\n <li><a href=\"https://staging.opam.ocaml.org/1.1/archives/0install.2.10+opam.tar.gz\">https://staging.opam.ocaml.org/1.1/archives/0install.2.10+opam.tar.gz</a></li>\n <li><a 
href=\"https://staging.opam.ocaml.org/opam_git_sha\">https://staging.opam.ocaml.org/opam_git_sha</a></li>\n <li><a href=\"https://staging.opam.ocaml.org/blog_git_sha\">https://staging.opam.ocaml.org/blog_git_sha</a></li>\n <li><a href=\"https://staging.opam.ocaml.org/opam-dev-pubkey.pgp\">https://staging.opam.ocaml.org/opam-dev-pubkey.pgp</a></li>\n</ul>",
+21
mte/2025_06_27_windows-containerd-3.json
···+"summary": "Everything was going fine until I ran out of disk space. My NVMe, C: drive, is only 256GB, but I have a large, 1.7TB SSD available as D:. How trivial, change a few paths and carry on, but it wasn\u2019t that simple, or was it?",+"content": "<p>Everything was going fine until I ran out of disk space. My NVMe, <code>C:</code> drive, is only 256GB, but I have a large, 1.7TB SSD available as <code>D:</code>. How trivial, change a few paths and carry on, but it wasn\u2019t that simple, or was it?</p>\n\n<p>Distilling the problem down to the minimum and excluding all code written by me, the following command fails, but changing <code>src=d:\\cache\\opam</code> to <code>src=c:\\cache\\opam</code> works. It\u2019s not the content, as it\u2019s just an empty folder.</p>\n\n<pre><code>ctr run --rm --cni -user ContainerAdministrator -mount type=bind,src=d:\\cache\\opam,dst=c:\\Users\\ContainerAdministrator\\AppData\\Local\\opam mcr.microsoft.com/windows/servercore:ltsc2022 my-container cmd /c \"curl.exe -L -o c:\\Windows\\opam.exe https://github.com/ocaml/opam/releases/download/2.3.0/opam-2.3.0-x86_64-windows.exe && opam.exe init --debug-level=3 -y\"\n</code></pre>\n\n<p>The failure point is the ability to create the lock file <code>config.lock</code>. Checking the code, the log entry is written before the lock is acquired. If <code>c:\\Users\\ContainerAdministrator\\AppData\\Local\\opam</code> is not a bind mount, or the bind mount is on <code>C:</code>, then it works.</p>\n\n<div><div><pre><code>01:26.722 CLIENT updating repository state\n01:26.722 GSTATE LOAD-GLOBAL-STATE @ C:\\Users\\ContainerAdministrator\\AppData\\Local\\opam\n01:26.723 SYSTEM LOCK C:\\Users\\ContainerAdministrator\\AppData\\Local\\opam\\lock (none => read)\n01:26.723 SYSTEM LOCK C:\\Users\\ContainerAdministrator\\AppData\\Local\\opam\\config.lock (none => write)\n</code></pre></div></div>\n\n<p>Suffice it to say, I spent a long time trying to resolve this. 
I\u2019ll mention a couple of interesting points that appeared along the way. Firstly, files created on <code>D:</code> effectively appear as hard links, and the Update Sequence Number, USN, is 0.</p>\n\n<div><div><pre><code><span>C:\\</span><span>></span><span> </span><span>fsutil</span><span> </span><span>file</span><span> </span><span>layout</span><span> </span><span>d:\\cache\\opam\\lock</span><span>\n\n</span><span>*********</span><span> </span><span>File</span><span> </span><span>0x000400000001d251</span><span> </span><span>*********</span><span>\n</span><span>File</span><span> </span><span>reference</span><span> </span><span>number</span><span> </span><span>:</span><span> </span><span>0x000400000001d251</span><span>\n</span><span>File</span><span> </span><span>attributes</span><span> </span><span>:</span><span> </span><span>0x00000020:</span><span> </span><span>Archive</span><span>\n</span><span>File</span><span> </span><span>entry</span><span> </span><span>flags</span><span> </span><span>:</span><span> </span><span>0x00000000</span><span>\n</span><span>Link</span><span> </span><span>(</span><span>ParentID:</span><span> </span><span>Name</span><span>)</span><span> </span><span>:</span><span> </span><span>0</span><span>x000c00000000002d:</span><span> </span><span>HLINK</span><span> </span><span>Name</span><span> </span><span>:</span><span> </span><span>\\cache\\opam\\lock</span><span>\n</span><span>...</span><span>\n</span><span>LastUsn</span><span> </span><span>:</span><span> </span><span>0</span><span>\n</span><span>...</span><span>\n</span></code></pre></div></div>\n\n<p>The reason behind this is down to Windows defaults:</p>\n\n<ol>\n <li>Windows still likes to create the legacy 8.3 MS-DOS file names on the system volume, <code>C:</code>, which explains the difference between <code>HLINK</code> and <code>NTFS+DOS</code>. 
Running <code>fsutil 8dot3name set d: 0</code> will enable the creation of the old-style file names.</li>\n <li>Drive <code>C:</code> has a USN journal created automatically, as it\u2019s required for Windows to operate, but it isn\u2019t created by default on other drives. Running <code>fsutil usn createjournal d: m=32000000 a=8000000</code> will create the journal.</li>\n</ol>\n\n<div><div><pre><code><span>C:\\</span><span>></span><span> </span><span>fsutil</span><span> </span><span>file</span><span> </span><span>layout</span><span> </span><span>c:\\cache\\opam\\lock</span><span>\n\n</span><span>*********</span><span> </span><span>File</span><span> </span><span>0x000300000002f382</span><span> </span><span>*********</span><span>\n</span><span>File</span><span> </span><span>reference</span><span> </span><span>number</span><span> </span><span>:</span><span> </span><span>0x000300000002f382</span><span>\n</span><span>File</span><span> </span><span>attributes</span><span> </span><span>:</span><span> </span><span>0x00000020:</span><span> </span><span>Archive</span><span>\n</span><span>File</span><span> </span><span>entry</span><span> </span><span>flags</span><span> </span><span>:</span><span> </span><span>0x00000000</span><span>\n</span><span>Link</span><span> </span><span>(</span><span>ParentID:</span><span> </span><span>Name</span><span>)</span><span> </span><span>:</span><span> </span><span>0</span><span>x000b0000000271d1:</span><span> </span><span>NTFS</span><span>+</span><span>DOS</span><span> </span><span>Name:</span><span> </span><span>\\cache\\opam\\lock</span><span>\n</span><span>...</span><span>\n</span><span>LastUsn</span><span> </span><span>:</span><span> </span><span>16</span><span>,</span><span>897</span><span>,</span><span>595</span><span>,</span><span>224</span><span>\n</span><span>...</span><span>\n</span></code></pre></div></div>\n\n<p>Sadly, neither of these insights makes any difference to my problem. 
I did notice that <code>containerd</code> 2.1.3 had been released, whereas I had been using 2.1.1. Upgrading didn\u2019t fix the issue, but it did affect how the network namespaces were created. More later.</p>\n\n<p>I decided to both ignore the problem and try it on another machine. After all, this problem was only a problem because <em>my</em> <code>C:</code> was too small. I created a QEMU VM with a 40GB <code>C:</code> and a 1TB <code>D:</code> and installed everything, and it worked fine with the bind mount on <code>D:</code> even <em>without</em> any of the above tuning and even with <code>D:</code> formatted using ReFS, rather than NTFS.</p>\n\n<p>Trying on another physical machine with a single large spinning disk as <code>C:</code> also worked as anticipated.</p>\n\n<p>In both of these new installations, I used <code>containerd</code> 2.1.3 and noticed that the behaviour I had come to rely upon seemed to have changed. If you recall, in this <a href=\"https://www.tunbury.org/2025/06/14/windows-containerd-2/\">post</a>, I <em>found</em> the network namespace GUID by running <code>ctr run</code> on a standard Windows container and then <code>ctr container info</code> in another window. This no longer worked reliably, as the namespace was removed when the container exited. Perhaps it always should have been?</p>\n\n<p>I need to find out how to create these namespaces. PowerShell has a cmdlet <code>Get-HnsNetwork</code>, but none of the GUID values there match the currently running namespaces I observe from <code>ctr container info</code>. The source code of <a href=\"https://github.com/containerd/containerd\">containerd</a> is on GitHub.</p>\n\n<p>When you pass <code>--cni</code> to the <code>ctr</code> command, it populates the network namespace from <code>NewNetNS</code>. 
Snippet from <code>cmd/ctr/commands/run/run_windows.go</code></p>\n\n<div><div><pre><code> <span>if</span> <span>cliContext</span><span>.</span><span>Bool</span><span>(</span><span>\"cni\"</span><span>)</span> <span>{</span>\n <span>ns</span><span>,</span> <span>err</span> <span>:=</span> <span>netns</span><span>.</span><span>NewNetNS</span><span>(</span><span>\"\"</span><span>)</span>\n <span>if</span> <span>err</span> <span>!=</span> <span>nil</span> <span>{</span>\n <span>return</span> <span>nil</span><span>,</span> <span>err</span>\n <span>}</span>\n <span>opts</span> <span>=</span> <span>append</span><span>(</span><span>opts</span><span>,</span> <span>oci</span><span>.</span><span>WithWindowsNetworkNamespace</span><span>(</span><span>ns</span><span>.</span><span>GetPath</span><span>()))</span>\n <span>}</span>\n</code></pre></div></div>\n\n<p><code>NewNetNS</code> is defined in <code>pkg/netns/netns_windows.go</code></p>\n\n<div><div><pre><code><span>// NetNS holds network namespace for sandbox</span>\n<span>type</span> <span>NetNS</span> <span>struct</span> <span>{</span>\n <span>path</span> <span>string</span>\n<span>}</span>\n\n<span>// NewNetNS creates a network namespace for the sandbox.</span>\n<span>func</span> <span>NewNetNS</span><span>(</span><span>baseDir</span> <span>string</span><span>)</span> <span>(</span><span>*</span><span>NetNS</span><span>,</span> <span>error</span><span>)</span> <span>{</span>\n <span>temp</span> <span>:=</span> <span>hcn</span><span>.</span><span>HostComputeNamespace</span><span>{}</span>\n <span>hcnNamespace</span><span>,</span> <span>err</span> <span>:=</span> <span>temp</span><span>.</span><span>Create</span><span>()</span>\n <span>if</span> <span>err</span> <span>!=</span> <span>nil</span> <span>{</span>\n <span>return</span> <span>nil</span><span>,</span> <span>err</span>\n <span>}</span>\n\n <span>return</span> <span>&</span><span>NetNS</span><span>{</span><span>path</span><span>:</span> 
<span>hcnNamespace</span><span>.</span><span>Id</span><span>},</span> <span>nil</span>\n<span>}</span>\n</code></pre></div></div>\n\n<p>Following the thread, and cutting out a few steps in the interest of brevity, we end up in <code>vendor/github.com/Microsoft/hcsshim/hcn/zsyscall_windows.go</code> which calls a Win32 API.</p>\n\n<div><div><pre><code><span>func</span> <span>_hcnCreateNamespace</span><span>(</span><span>id</span> <span>*</span><span>_guid</span><span>,</span> <span>settings</span> <span>*</span><span>uint16</span><span>,</span> <span>namespace</span> <span>*</span><span>hcnNamespace</span><span>,</span> <span>result</span> <span>**</span><span>uint16</span><span>)</span> <span>(</span><span>hr</span> <span>error</span><span>)</span> <span>{</span>\n <span>hr</span> <span>=</span> <span>procHcnCreateNamespace</span><span>.</span><span>Find</span><span>()</span>\n <span>if</span> <span>hr</span> <span>!=</span> <span>nil</span> <span>{</span>\n <span>return</span>\n <span>}</span>\n <span>r0</span><span>,</span> <span>_</span><span>,</span> <span>_</span> <span>:=</span> <span>syscall</span><span>.</span><span>SyscallN</span><span>(</span><span>procHcnCreateNamespace</span><span>.</span><span>Addr</span><span>(),</span> <span>uintptr</span><span>(</span><span>unsafe</span><span>.</span><span>Pointer</span><span>(</span><span>id</span><span>)),</span> <span>uintptr</span><span>(</span><span>unsafe</span><span>.</span><span>Pointer</span><span>(</span><span>settings</span><span>)),</span> <span>uintptr</span><span>(</span><span>unsafe</span><span>.</span><span>Pointer</span><span>(</span><span>namespace</span><span>)),</span> <span>uintptr</span><span>(</span><span>unsafe</span><span>.</span><span>Pointer</span><span>(</span><span>result</span><span>)))</span>\n <span>if</span> <span>int32</span><span>(</span><span>r0</span><span>)</span> <span><</span> <span>0</span> <span>{</span>\n <span>if</span> <span>r0</span><span>&</span><span>0x1fff0000</span> 
<span>==</span> <span>0x00070000</span> <span>{</span>\n <span>r0</span> <span>&=</span> <span>0xffff</span>\n <span>}</span>\n <span>hr</span> <span>=</span> <span>syscall</span><span>.</span><span>Errno</span><span>(</span><span>r0</span><span>)</span>\n <span>}</span>\n <span>return</span>\n<span>}</span>\n</code></pre></div></div>\n\n<p>PowerShell provides <code>Get-HnsNamespace</code> to list available namespaces. These <em>are</em> the <del>droids</del> values I\u2019ve been looking for to put in <code>config.json</code>! However, by default there are no cmdlets to create them. The installation PowerShell <a href=\"https://github.com/microsoft/Windows-Containers/blob/Main/helpful_tools/Install-ContainerdRuntime/install-containerd-runtime.ps1\">script</a> for <code>containerd</code> pulls in <a href=\"https://github.com/microsoft/SDN/blob/master/Kubernetes/windows/hns.psm1\">hns.psm1</a>, which has a lot of interesting cmdlets, such as <code>New-HnsNetwork</code>, but not a cmdlet to create a namespace. 
There is also <a href=\"https://github.com/microsoft/SDN/blob/master/Kubernetes/windows/hns.v2.psm1\">hns.v2.psm1</a>, which does have <code>New-HnsNamespace</code>.</p>\n\n<div><div><pre><code><span>PS</span><span> </span><span>C:\\Users\\Administrator</span><span>></span><span> </span><span>curl.exe</span><span> </span><span>-o</span><span> </span><span>hns.v2.psm1</span><span> </span><span>-L</span><span> </span><span>https://raw.githubusercontent.com/microsoft/SDN/refs/heads/master/Kubernetes/windows/hns.v2.psm1</span><span>\n </span><span>%</span><span> </span><span>Total</span><span> </span><span>%</span><span> </span><span>Received</span><span> </span><span>%</span><span> </span><span>Xferd</span><span> </span><span>Average</span><span> </span><span>Speed</span><span> </span><span>Time</span><span> </span><span>Time</span><span> </span><span>Time</span><span> </span><span>Current</span><span>\n </span><span>Dload</span><span> </span><span>Upload</span><span> </span><span>Total</span><span> </span><span>Spent</span><span> </span><span>Left</span><span> </span><span>Speed</span><span>\n</span><span>100</span><span> </span><span>89329</span><span> </span><span>100</span><span> </span><span>89329</span><span> </span><span>0</span><span> </span><span>0</span><span> </span><span>349</span><span>k</span><span> </span><span>0</span><span> </span><span>--</span><span>:</span><span>--</span><span>:</span><span>--</span><span> </span><span>--</span><span>:</span><span>--</span><span>:</span><span>--</span><span> </span><span>--</span><span>:</span><span>--</span><span>:</span><span>--</span><span> </span><span>353k</span><span>\n\n</span><span>PS</span><span> </span><span>C:\\Users\\Administrator</span><span>></span><span> </span><span>Import-Module</span><span> </span><span>.</span><span>\\hns.v2.psm1</span><span>\n</span><span>WARNING:</span><span> </span><span>The</span><span> </span><span>names</span><span> </span><span>of</span><span> 
</span><span>some</span><span> </span><span>imported</span><span> </span><span>commands</span><span> </span><span>from</span><span> </span><span>the</span><span> </span><span>module</span><span> </span><span>'hns.v2'</span><span> </span><span>include</span><span> </span><span>unapproved</span><span> </span><span>verbs</span><span> </span><span>that</span><span> </span><span>might</span><span> </span><span>make</span><span> </span><span>them</span><span> </span><span>less</span><span> </span><span>discoverable.</span><span> </span><span>To</span><span> </span><span>find</span><span> </span><span>the</span><span> </span><span>commands</span><span> </span><span>with</span><span> </span><span>unapproved</span><span> </span><span>verbs</span><span>,</span><span> </span><span>run</span><span> </span><span>the</span><span> </span><span>Import-Module</span><span> </span><span>command</span><span> </span><span>again</span><span> </span><span>with</span><span> </span><span>the</span><span> </span><span>Verbose</span><span> </span><span>parameter.</span><span> </span><span>For</span><span> </span><span>a</span><span> </span><span>list</span><span> </span><span>of</span><span> </span><span>approved</span><span> </span><span>verbs</span><span>,</span><span> </span><span>type</span><span> </span><span>Get-Verb.</span><span>\n\n</span><span>PS</span><span> </span><span>C:\\Users\\Administrator</span><span>></span><span> </span><span>New-HnsNamespace</span><span>\n</span><span>HcnCreateNamespace</span><span> </span><span>--</span><span> </span><span>HRESULT:</span><span> </span><span>2151350299.</span><span> </span><span>Result:</span><span> </span><span>{</span><span>\"Success\"</span><span>:</span><span>false</span><span>,</span><span>\"Error\"</span><span>:</span><span>\"Invalid JSON document string. 
&#123;&#123;CreateWithCompartment,UnknownField}}\"</span><span>,</span><span>\"ErrorCode\"</span><span>:</span><span>2151350299</span><span>}</span><span>\n</span><span>At</span><span> </span><span>C:\\Users\\Administrator\\hns.v2.psm1:2392</span><span> </span><span>char:13</span><span>\n</span><span>+</span><span> </span><span>throw</span><span> </span><span>$errString</span><span>\n</span><span>+</span><span> </span><span>~~~~~~~~~~~~~~~~</span><span>\n </span><span>+</span><span> </span><span>CategoryInfo</span><span> </span><span>:</span><span> </span><span>OperationStopped:</span><span> </span><span>(</span><span>HcnCreateNamesp...de</span><span>\":2151350299}:String) [], RuntimeException\n + FullyQualifiedErrorId : HcnCreateNamespace -- HRESULT: 2151350299. Result: {\"</span><span>Success</span><span>\":false,\"</span><span>Error</span><span>\":\"</span><span>Invalid</span><span> </span><span>JSON</span><span> </span><span>document</span><span> </span><span>string.</span><span> </span><span>&</span><span>#123;&#123;CreateWithCompartment,UnknownField}}\",\"ErrorCode\":2151350299}</span><span>\n</span></code></pre></div></div>\n\n<p>With a lot of frustration, I decided to have a go at calling the Win32 API from OCaml. This resulted in <a href=\"https://github.com/mtelvers/hcn-namespace\">mtelvers/hcn-namespace</a>, which allows me to create the namespaces by running <code>hcn-namespace create</code>. These namespaces appear in the output from <code>Get-HnsNamespace</code> and work correctly in <code>config.json</code>.</p>\n\n<p>Run <code>hcn-namespace.exe create</code>, and then populate <code>\"networkNamespace\": \"<GUID>\"</code> with the GUID provided and run with <code>ctr run --rm -cni --config config.json</code>.</p>",
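For the final step of putting the GUID into <code>config.json</code>, the fragment can be generated mechanically. The sketch below is an assumption-laden illustration: the GUID is a placeholder (substitute the one printed by <code>hcn-namespace create</code>), and the nesting <code>windows.network.networkNamespace</code> follows the OCI Windows runtime spec rather than anything verified against ctr's source here:

```ocaml
(* Sketch: build the config fragment that carries the namespace GUID.
   The GUID below is a placeholder, and the windows/network nesting is
   assumed from the OCI runtime spec for Windows, not taken from the post. *)
let network_namespace_fragment guid =
  Printf.sprintf
    {|{ "windows": { "network": { "networkNamespace": "%s" } } }|} guid

let () =
  print_endline
    (network_namespace_fragment "12345678-1234-1234-1234-123456789abc")
```

In practice the GUID simply replaces the value of the existing <code>"networkNamespace"</code> key in the generated <code>config.json</code> before running <code>ctr run --rm --cni --config config.json</code>.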
+21
mte/2025_07_01_ocaml-functors.json
···+"summary": "In my OCaml project, I\u2019d like to abstract away the details of running containers into specific modules based on the OS. Currently, I have working container setups for Windows and Linux, and I\u2019ve haphazardly peppered if Sys.win32 then where I need differentiation, but this is OCaml, so let us use functors!",+"content": "<p>In my OCaml project, I\u2019d like to abstract away the details of running containers into specific modules based on the OS. Currently, I have working container setups for Windows and Linux, and I\u2019ve haphazardly peppered <code>if Sys.win32 then</code> where I need differentiation, but this is OCaml, so let us use <em>functors</em>!</p>\n\n<p>I started by fleshing out the bare bones in a new project. After <code>dune init project functor</code>, I created <code>bin/s.ml</code> containing the signature of the module <code>CONTAINER</code>.</p>\n\n<div><div><pre><code><span>module</span> <span>type</span> <span>CONTAINER</span> <span>=</span> <span>sig</span>\n <span>val</span> <span>run</span> <span>:</span> <span>string</span> <span>-></span> <span>unit</span>\n<span>end</span>\n</code></pre></div></div>\n\n<p>Then a trivial <code>bin/linux.ml</code>.</p>\n\n<div><div><pre><code><span>let</span> <span>run</span> <span>s</span> <span>=</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"Linux container '%s'</span><span>\\n</span><span>\"</span> <span>s</span>\n</code></pre></div></div>\n\n<p>And <code>bin/windows.ml</code>.</p>\n\n<div><div><pre><code><span>let</span> <span>run</span> <span>s</span> <span>=</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"Windows container '%s'</span><span>\\n</span><span>\"</span> <span>s</span>\n</code></pre></div></div>\n\n<p>Then in <code>bin/main.ml</code>, I can select the container system once and from then on use <code>Container.foo</code> to run the appropriate OS specific function.</p>\n\n<div><div><pre><code><span>let</span> 
<span>container</span> <span>=</span> <span>if</span> <span>Sys</span><span>.</span><span>win32</span> <span>then</span> <span>(</span><span>module</span> <span>Windows</span> <span>:</span> <span>S</span><span>.</span><span>CONTAINER</span><span>)</span> <span>else</span> <span>(</span><span>module</span> <span>Linux</span> <span>:</span> <span>S</span><span>.</span><span>CONTAINER</span><span>)</span>\n\n<span>module</span> <span>Container</span> <span>=</span> <span>(</span><span>val</span> <span>container</span><span>)</span>\n\n<span>let</span> <span>()</span> <span>=</span> <span>Container</span><span>.</span><span>run</span> <span>\"Hello, World!\"</span>\n</code></pre></div></div>\n\n<p>You can additionally create <code>windows.mli</code> and <code>linux.mli</code> containing simply <code>include S.CONTAINER</code>.</p>\n\n<p>Now, let\u2019s imagine that we needed to have some specific configuration options depending upon whether we are running on Windows or Linux. For demonstration purposes, let\u2019s use the user account. 
On Windows, this is a string, typically <code>ContainerAdministrator</code>, whereas on Linux, it\u2019s an integer UID of value 0.</p>\n\n<p>We can update the module type in <code>bin/s.ml</code> to include the type <code>t</code>, and add an <code>init</code> function to return a <code>t</code> and add <code>t</code> as a parameter to <code>run</code>.</p>\n\n<div><div><pre><code><span>module</span> <span>type</span> <span>CONTAINER</span> <span>=</span> <span>sig</span>\n <span>type</span> <span>t</span>\n\n <span>val</span> <span>init</span> <span>:</span> <span>unit</span> <span>-></span> <span>t</span>\n <span>val</span> <span>run</span> <span>:</span> <span>t</span> <span>-></span> <span>string</span> <span>-></span> <span>unit</span>\n<span>end</span>\n</code></pre></div></div>\n\n<p>In <code>bin/linux.ml</code>, we can add the type and define <code>uid</code> as an integer, then add the <code>init</code> function to return the populated structure. <code>run</code> now accepts <code>t</code> as the first parameter.</p>\n\n<div><div><pre><code><span>type</span> <span>t</span> <span>=</span> <span>{</span>\n <span>uid</span> <span>:</span> <span>int</span><span>;</span>\n<span>}</span>\n\n<span>let</span> <span>init</span> <span>()</span> <span>=</span> <span>{</span> <span>uid</span> <span>=</span> <span>0</span> <span>}</span>\n\n<span>let</span> <span>run</span> <span>t</span> <span>s</span> <span>=</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"Linux container user id %i says '%s'</span><span>\\n</span><span>\"</span> <span>t</span><span>.</span><span>uid</span> <span>s</span>\n</code></pre></div></div>\n\n<p>In a similar vein, <code>bin/windows.ml</code> is updated like this</p>\n\n<div><div><pre><code><span>type</span> <span>t</span> <span>=</span> <span>{</span>\n <span>username</span> <span>:</span> <span>string</span><span>;</span>\n<span>}</span>\n\n<span>let</span> <span>init</span> <span>()</span> <span>=</span> <span>{</span> 
<span>username</span> <span>=</span> <span>\"ContainerAdministrator\"</span> <span>}</span>\n\n<span>let</span> <span>run</span> <span>t</span> <span>s</span> <span>=</span> <span>Printf</span><span>.</span><span>printf</span> <span>\"Windows container user name %s says '%s'</span><span>\\n</span><span>\"</span> <span>t</span><span>.</span><span>username</span> <span>s</span>\n</code></pre></div></div>\n\n<p>And finally, in <code>bin/main.ml</code> we run <code>Container.init ()</code> and use the returned type as a parameter to <code>Container.run</code>.</p>\n\n<div><div><pre><code><span>let</span> <span>container</span> <span>=</span> <span>if</span> <span>Sys</span><span>.</span><span>win32</span> <span>then</span> <span>(</span><span>module</span> <span>Windows</span> <span>:</span> <span>S</span><span>.</span><span>CONTAINER</span><span>)</span> <span>else</span> <span>(</span><span>module</span> <span>Linux</span> <span>:</span> <span>S</span><span>.</span><span>CONTAINER</span><span>)</span>\n\n<span>module</span> <span>Container</span> <span>=</span> <span>(</span><span>val</span> <span>container</span><span>)</span>\n\n<span>let</span> <span>c</span> <span>=</span> <span>Container</span><span>.</span><span>init</span> <span>()</span>\n<span>let</span> <span>()</span> <span>=</span> <span>Container</span><span>.</span><span>run</span> <span>c</span> <span>\"Hello, World!\"</span>\n</code></pre></div></div>",
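For readers who don't write OCaml, the select-once pattern can be sketched in Python (a hedged analogue — the class and attribute names here are hypothetical, not from the post): pick a backend class at startup, then call through a single name, just as `module Container = (val container)` fixes the module once.

```python
import os

# Hypothetical Python analogue of the OCaml first-class-module selection:
# two backends exposing the same informal interface, chosen once at startup.
class LinuxContainer:
    def __init__(self):
        self.uid = 0  # Linux identifies the container user by an integer UID

    def run(self, msg):
        return f"Linux container user id {self.uid} says '{msg}'"

class WindowsContainer:
    def __init__(self):
        self.username = "ContainerAdministrator"  # Windows uses an account name

    def run(self, msg):
        return f"Windows container user name {self.username} says '{msg}'"

# Select the implementation once, like `module Container = (val container)`.
Container = WindowsContainer if os.name == "nt" else LinuxContainer

c = Container()  # plays the role of Container.init ()
print(c.run("Hello, World!"))
```

As in the OCaml version, the OS test happens exactly once; every later call site uses the single `Container` name and never repeats the `if`.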
+21
mte/2025_07_02_bon-in-a-box.json
···+"summary": "On a suggestion from Michael, I have had a quick look at BON in a Box, which is a web-based biodiversity analysis platform using Docker containerised pipelines running R, Julia, and Python scripts.",+"content": "<p>On a suggestion from Michael, I have had a quick look at <a href=\"https://geo-bon.github.io/bon-in-a-box-pipeline-engine/\">BON in a Box</a>, which is a web-based biodiversity analysis platform using Docker containerised pipelines running R, Julia, and Python scripts.</p>\n\n<p>It couldn\u2019t be easier to get started. Install Docker and Docker Compose, and make sure you can access GitHub via SSH using a public key. [Run <code>ssh-keygen -t ed25519</code> and then publish the resulting <code>~/.ssh/id_ed25519.pub</code> to your GitHub account.]</p>\n\n<div><div><pre><code>apt <span>install </span>docker.io docker-compose-v2\n</code></pre></div></div>\n\n<p>Clone GEO-BON\u2019s repository and make a working copy of the <code>runner.env</code> file. This file can be edited to add API keys for datasets, but I don\u2019t have any so the default file is fine.</p>\n\n<div><div><pre><code>git clone git@github.com:GEO-BON/bon-in-a-box-pipelines.git\n<span>cd </span>bon-in-a-box-pipelines\n<span>cp </span>runner-sample.env runner.env\n</code></pre></div></div>\n\n<p>To start the server, run <code>./server-up.sh</code>. There is also <code>./server-down.sh</code> to stop the server.</p>\n\n<p>The first run downloads the required Docker containers, so it takes a few minutes. 
Once complete visit <a href=\"http://localhost\">http://localhost</a> to see the web GUI.</p>\n\n<p>I ran the \u201cGet Country Polygon\u201d script, creating a nice Colombia polygon.</p>\n\n<p>There is a drag and drop pipeline editor which felt a lot like Microsoft Access.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/geobon-pipeline.png\"></p>\n\n<p>I followed along with the tutorial and created an R script and a YAML file of the same name in the <code>/scripts</code> directory. These appeared in the GUI, allowing me to run them and use them in the pipeline editor. Annoyingly, the dataset was not provided in the tutorial, so I couldn\u2019t run the code.</p>\n\n<p><code>TestScript.R</code></p>\n\n<p>The <code>biab</code> functions are how the script interacts with the BON in a Box system.</p>\n\n<div><div><pre><code><span>library</span><span>(</span><span>rjson</span><span>)</span><span>\n</span><span>library</span><span>(</span><span>sf</span><span>)</span><span>\n</span><span>library</span><span>(</span><span>terra</span><span>)</span><span>\n</span><span>library</span><span>(</span><span>dplyr</span><span>)</span><span>\n</span><span>library</span><span>(</span><span>ggplot2</span><span>)</span><span>\n\n</span><span>input</span><span> </span><span><-</span><span> </span><span>biab_inputs</span><span>()</span><span>\n\n</span><span>dat</span><span> </span><span><-</span><span> </span><span>st_read</span><span>(</span><span>input</span><span>$</span><span>country_polygon</span><span>)</span><span>\n\n</span><span>if</span><span> </span><span>(</span><span>nrow</span><span>(</span><span>dat</span><span>)</span><span>==</span><span>0</span><span>)</span><span> </span><span>{</span><span>\n </span><span>biab_error_stop</span><span>(</span><span>\"Country polygon does not exist\"</span><span>)</span><span>\n</span><span>}</span><span> \n \n</span><span>dat.transformed</span><span> </span><span><-</span><span> 
</span><span>st_transform</span><span>(</span><span>dat</span><span>,</span><span> </span><span>crs</span><span>=</span><span>input</span><span>$</span><span>crs</span><span>)</span><span>\n\n</span><span>rasters</span><span> </span><span><-</span><span> </span><span>terra</span><span>::</span><span>rast</span><span>(</span><span>c</span><span>(</span><span>input</span><span>$</span><span>rasters</span><span>,</span><span> </span><span>crs</span><span>=</span><span>input</span><span>$</span><span>crs</span><span>))</span><span>\n\n</span><span>country_vect</span><span> </span><span><-</span><span> </span><span>vect</span><span>(</span><span>dat.transformed</span><span>)</span><span>\n \n</span><span>raster.cropped</span><span> </span><span><-</span><span> </span><span>mask</span><span>(</span><span>rasters</span><span>,</span><span> </span><span>country_vect</span><span>)</span><span> \n \n</span><span>raster_change</span><span> </span><span><-</span><span> </span><span>rasters</span><span>[[</span><span>1</span><span>]]</span><span>-</span><span>rasters</span><span>[[</span><span>2</span><span>]]</span><span>\n\n</span><span>raster_change_path</span><span> </span><span><-</span><span> </span><span>file.path</span><span>(</span><span>outputFolder</span><span>,</span><span> </span><span>\"raster_change.tif\"</span><span>)</span><span>\n</span><span>writeRaster</span><span>(</span><span>raster_change</span><span>,</span><span> </span><span>raster_change_path</span><span>)</span><span>\n\n</span><span>biab_output</span><span>(</span><span>\"raster_change\"</span><span>,</span><span> </span><span>raster_change_path</span><span>)</span><span>\n\n</span><span>layer_means</span><span> </span><span><-</span><span> </span><span>global</span><span>(</span><span>raster.cropped</span><span>,</span><span> </span><span>fun</span><span>=</span><span>\"mean\"</span><span>,</span><span> 
</span><span>na.rm</span><span>=</span><span>TRUE</span><span>)</span><span>\n</span><span>layer_means</span><span>$</span><span>name</span><span> </span><span><-</span><span> </span><span>names</span><span>(</span><span>raster.cropped</span><span>)</span><span>\n \n</span><span>means_plot</span><span> </span><span><-</span><span> </span><span>ggplot</span><span>(</span><span>layer_means</span><span>,</span><span> </span><span>aes</span><span>(</span><span>x</span><span>=</span><span>name</span><span>,</span><span> </span><span>y</span><span>=</span><span>mean</span><span>))</span><span> </span><span>+</span><span> </span><span>geom_point</span><span>()</span><span>\n \n</span><span>means_plot_path</span><span> </span><span><-</span><span> </span><span>file.path</span><span>(</span><span>outputFolder</span><span>,</span><span> </span><span>\"means_plot.png\"</span><span>)</span><span>\n</span><span>ggsave</span><span>(</span><span>means_plot_path</span><span>,</span><span> </span><span>means_plot</span><span>)</span><span>\n \n</span><span>biab_output</span><span>(</span><span>\"means_plot\"</span><span>,</span><span> </span><span>means_plot_path</span><span>)</span><span>\n</span></code></pre></div></div>\n\n<p><code>TestScript.yaml</code></p>\n\n<p>The <code>inputs</code> and <code>outputs</code> sections define the script\u2019s inputs and outputs; the names must match those used in the script above. The environment is set up using conda. 
A specific version can be specified like this: <code>r-terra=0.9-12</code></p>\n\n<div><div><pre><code><span>script</span><span>:</span> <span>TestScript.R</span>\n<span>name</span><span>:</span> <span>Test script</span>\n<span>description</span><span>:</span> <span>Demo script</span>\n<span>author</span><span>:</span>\n <span>-</span> <span>name</span><span>:</span> <span>ME</span>\n<span>inputs</span><span>:</span>\n <span>country_polygon</span><span>:</span>\n <span>label</span><span>:</span> <span>Country Polygon</span>\n <span>description</span><span>:</span> <span>Polygon of the country of interest</span>\n <span>type</span><span>:</span> <span>application/geo+json</span>\n <span>example</span><span>:</span> <span>null</span>\n <span>crs</span><span>:</span>\n <span>label</span><span>:</span> <span>Coordinate reference system</span>\n <span>description</span><span>:</span> <span>Coordinate reference system</span>\n <span>type</span><span>:</span> <span>text</span>\n <span>example</span><span>:</span> <span>\"</span><span>EPSG:3857\"</span>\n <span>rasters</span><span>:</span>\n <span>label</span><span>:</span> <span>Rasters</span>\n <span>description</span><span>:</span> <span>Raster layers of variable of interest</span>\n <span>type</span><span>:</span> <span>image/tiff;application=geotiff[]</span>\n <span>example</span><span>:</span> <span>null</span> \n<span>outputs</span><span>:</span>\n <span>raster_change</span><span>:</span>\n <span>label</span><span>:</span> <span>Rasters</span>\n <span>description</span><span>:</span> <span>Differences between raster values</span>\n <span>type</span><span>:</span> <span>image/tiff;application=geotiff</span>\n <span>means_plot</span><span>:</span>\n <span>label</span><span>:</span> <span>Plot of raster means</span>\n <span>description</span><span>:</span> <span>Plot of means of raster layers</span>\n <span>type</span><span>:</span> <span>image/png</span>\n<span>conda</span><span>:</span>\n 
<span>channels</span><span>:</span>\n <span>-</span> <span>conda-forge</span>\n <span>-</span> <span>r</span>\n <span>dependencies</span><span>:</span>\n <span>-</span> <span>r-rjson</span>\n <span>-</span> <span>r-sf</span>\n <span>-</span> <span>r-dplyr</span>\n <span>-</span> <span>r-terra</span>\n <span>-</span> <span>r-ggplot2</span>\n</code></pre></div></div>\n\n<p>The architecture appears to be designed as a single-server instance without built-in job queuing or concurrent execution limits.</p>",
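Since the YAML input names must match the keys the script requests from `biab_inputs()`, a small consistency check catches a misspelled input name before a pipeline run. This is a sketch using plain dicts in place of the parsed files; the real system would load `TestScript.yaml` with a YAML parser, and the `check_inputs` helper is made up for illustration.

```python
# The sets/dicts below stand in for the parsed R script and YAML description.
script_inputs = {"country_polygon", "crs", "rasters"}  # keys read via biab_inputs()
yaml_spec = {
    "inputs": {
        "country_polygon": {"type": "application/geo+json"},
        "crs": {"type": "text"},
        "rasters": {"type": "image/tiff;application=geotiff[]"},
    },
}

def check_inputs(script_keys, spec):
    """Return (missing, unused): inputs the script uses but the YAML lacks,
    and inputs the YAML declares but the script never reads."""
    declared = set(spec["inputs"])
    missing = script_keys - declared
    unused = declared - script_keys
    return missing, unused

missing, unused = check_inputs(script_inputs, yaml_spec)
assert not missing and not unused  # names line up; the GUI will wire them correctly
```

A one-letter transposition in either file would surface here as a `missing`/`unused` pair rather than as a confusing runtime failure inside the container.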
+21
mte/2025_07_07_refs-monteverde.json
···+"summary": "In addition to the post from last week covering BON in a Box and OCaml Functors, below are some additional notes.",+"content": "<p>In addition to the post from last week covering <a href=\"https://www.tunbury.org/2025/07/02/bon-in-a-box/\">BON in a Box</a> and <a href=\"https://www.tunbury.org/2025/07/01/ocaml-functors/\">OCaml Functors</a>, below are some additional notes.</p>\n\n<h1>Resilient File System, ReFS</h1>\n\n<p>I have previously stated that <a href=\"https://www.tunbury.org/windows-reflinks\">ReFS</a> supports 1 million hard links per file; however, this is not the case. The maximum is considerably lower at 8191. That\u2019s eight times more than NTFS, but still not very many.</p>\n\n<div><div><pre><code><span>PS</span><span> </span><span>D:\\</span><span>></span><span> </span><span>touch</span><span> </span><span>foo</span><span>\n</span><span>PS</span><span> </span><span>D:\\</span><span>></span><span> </span><span>foreach</span><span> </span><span>(</span><span>$i</span><span> </span><span>in</span><span> </span><span>1</span><span>..</span><span>8192</span><span>)</span><span> </span><span>{</span><span>\n</span><span>>></span><span> </span><span>New-Item</span><span> </span><span>-ItemType</span><span> </span><span>HardLink</span><span> </span><span>-Path</span><span> </span><span>\"foo-</span><span>$i</span><span>\"</span><span> </span><span>-Target</span><span> </span><span>\"foo\"</span><span>\n</span><span>>></span><span> </span><span>}</span><span>\n\n\n </span><span>Directory:</span><span> </span><span>D:\\</span><span>\n\n\n</span><span>Mode</span><span> </span><span>LastWriteTime</span><span> </span><span>Length</span><span> </span><span>Name</span><span>\n</span><span>----</span><span> </span><span>-------------</span><span> </span><span>------</span><span> </span><span>----</span><span>\n</span><span>-a</span><span>----</span><span> </span><span>07</span><span>/07/2025</span><span> </span><span>01:00</span><span> 
</span><span>0</span><span> </span><span>foo-1</span><span>\n</span><span>-a</span><span>----</span><span> </span><span>07</span><span>/07/2025</span><span> </span><span>01:00</span><span> </span><span>0</span><span> </span><span>foo-2</span><span>\n</span><span>-a</span><span>----</span><span> </span><span>07</span><span>/07/2025</span><span> </span><span>01:00</span><span> </span><span>0</span><span> </span><span>foo-3</span><span>\n</span><span>-a</span><span>----</span><span> </span><span>07</span><span>/07/2025</span><span> </span><span>01:00</span><span> </span><span>0</span><span> </span><span>foo-4</span><span>\n</span><span>...</span><span>\n</span><span>-a</span><span>----</span><span> </span><span>07</span><span>/07/2025</span><span> </span><span>01:00</span><span> </span><span>0</span><span> </span><span>foo-8190</span><span>\n</span><span>-a</span><span>----</span><span> </span><span>07</span><span>/07/2025</span><span> </span><span>01:00</span><span> </span><span>0</span><span> </span><span>foo-8191</span><span>\n</span><span>New-Item</span><span> </span><span>:</span><span> </span><span>An</span><span> </span><span>attempt</span><span> </span><span>was</span><span> </span><span>made</span><span> </span><span>to</span><span> </span><span>create</span><span> </span><span>more</span><span> </span><span>links</span><span> </span><span>on</span><span> </span><span>a</span><span> </span><span>file</span><span> </span><span>than</span><span> </span><span>the</span><span> </span><span>file</span><span> </span><span>system</span><span> </span><span>supports</span><span>\n</span><span>At</span><span> </span><span>line:2</span><span> </span><span>char:5</span><span>\n</span><span>+</span><span> </span><span>New-Item</span><span> </span><span>-ItemType</span><span> </span><span>HardLink</span><span> </span><span>-Path</span><span> </span><span>\"foo-</span><span>$i</span><span>\"</span><span> </span><span>-Target</span><span> 
</span><span>\"foo\"</span><span>\n</span><span>+</span><span> </span><span>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~</span><span>\n </span><span>+</span><span> </span><span>CategoryInfo</span><span> </span><span>:</span><span> </span><span>NotSpecified:</span><span> </span><span>(:)</span><span> </span><span>[</span><span>New</span><span>-Item</span><span>],</span><span> </span><span>Win32Exception</span><span>\n </span><span>+</span><span> </span><span>FullyQualifiedErrorId</span><span> </span><span>:</span><span> </span><span>System.ComponentModel.Win32Exception</span><span>,</span><span>Microsoft.PowerShell.Commands.NewItemCommand</span><span>\n</span></code></pre></div></div>\n\n<p>I had also investigated ReFS block cloning, which removed the requirement to create hard links, and wrote a <a href=\"https://github.com/mtelvers/ReFS-Clone\">ReFS-clone</a> tool for Windows Server 2022. This works well until containerd is used to bind mount a directory on the volume. Once this has happened, attempts to create a block clone fail. To exclude my code as the root cause, I have tried Windows Server 2025, where commands such as <code>copy</code> and <code>robocopy</code> automatically perform block clones. Block cloning can be restored by rebooting the machine. I note that restarting containerd is not sufficient.</p>\n\n<p>Removing files and folders on ReFS is impressively fast; however, this comes at a cost: freeing the blocks is a background activity that may take some time to be scheduled.</p>\n\n<h1>File system performance with a focus on ZFS</h1>\n\n<p>Several EEG interns started last week with this <a href=\"https://anil.recoil.org/ideas/zfs-filesystem-perf\">project</a> under my supervision. 
In brief, we will examine file system performance on the filesystems supported by <a href=\"https://github.com/ocurrent/obuilder\">OBuilder</a> before conducting more detailed investigations into factors affecting ZFS performance.</p>\n\n<h1>Monteverde</h1>\n\n<p>monteverde.cl.cam.ac.uk has been installed in the rack. It has two AMD EPYC 9965 192-Core Processors, giving a total of 384 cores and 768 threads, and 3TB of RAM.</p>\n\n<p><img alt=\"\" src=\"https://www.tunbury.org/images/monteverde.jpg\"></p>\n\n<p>From the logs, there are still some teething issues:</p>\n\n<div><div><pre><code>[130451.620482] Large kmem_alloc(98304, 0x1000), please file an issue at:\n https://github.com/openzfs/zfs/issues/new\n[130451.620486] CPU: 51 UID: 0 PID: 8594 Comm: txg_sync Tainted: P O 6.14.0-23-generic #23-Ubuntu\n[130451.620488] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE\n[130451.620489] Hardware name: Dell Inc. PowerEdge R7725/0KRFPX, BIOS 1.1.3 02/25/2025\n[130451.620490] Call Trace:\n[130451.620490] <TASK>\n[130451.620492] show_stack+0x49/0x60\n[130451.620493] dump_stack_lvl+0x5f/0x90\n[130451.620495] dump_stack+0x10/0x18\n[130451.620497] spl_kmem_alloc_impl.cold+0x17/0x1c [spl]\n[130451.620503] spl_kmem_zalloc+0x19/0x30 [spl]\n[130451.620508] multilist_create_impl+0x3f/0xc0 [zfs]\n[130451.620586] multilist_create+0x31/0x50 [zfs]\n[130451.620650] dmu_objset_sync+0x4c4/0x4d0 [zfs]\n[130451.620741] dsl_pool_sync_mos+0x34/0xc0 [zfs]\n[130451.620832] dsl_pool_sync+0x3c1/0x420 [zfs]\n[130451.620910] spa_sync_iterate_to_convergence+0xda/0x220 [zfs]\n[130451.620990] spa_sync+0x333/0x660 [zfs]\n[130451.621056] txg_sync_thread+0x1f5/0x270 [zfs]\n[130451.621137] ? __pfx_txg_sync_thread+0x10/0x10 [zfs]\n[130451.621207] ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]\n[130451.621213] thread_generic_wrapper+0x5b/0x70 [spl]\n[130451.621217] kthread+0xf9/0x230\n[130451.621219] ? __pfx_kthread+0x10/0x10\n[130451.621221] ret_from_fork+0x44/0x70\n[130451.621223] ? 
__pfx_kthread+0x10/0x10\n[130451.621224] ret_from_fork_asm+0x1a/0x30\n[130451.621226] </TASK>\n</code></pre></div></div>",
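The PowerShell probe of the ReFS link limit translates readily into a portable sketch. The 8191 cap is ReFS-specific (ext4, for comparison, reportedly allows 65000 links per file), and the function name here is made up for illustration — this only demonstrates the probing technique, not any of the tools above.

```python
import os
import tempfile

def count_link_headroom(path, attempts):
    """Create up to `attempts` extra hard links to `path` and return how
    many succeed before the filesystem refuses (EMLINK at the limit)."""
    created = 0
    for i in range(attempts):
        try:
            os.link(path, f"{path}-{i}")
            created += 1
        except OSError:  # e.g. "too many links" once the per-file cap is hit
            break
    return created

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "foo")
    open(target, "w").close()
    # Small probe; raise the bound towards the filesystem limit to find the cap.
    made = count_link_headroom(target, 10)
    # st_nlink counts the original name plus every extra link.
    assert os.stat(target).st_nlink == made + 1
```

Raising `attempts` past the filesystem's limit reproduces the PowerShell experiment: the loop stops at the first refused link rather than erroring out.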
+21
mte/2025_07_08_unix-or-sys.json
···+"summary": "When you recursively scan a massive directory tree, would you use Sys.readdir or Unix.readdir? My inclination is that Sys.readdir feels more convenient to use, and thus the lower-level Unix.readdir would have the performance edge. Is it significant enough to bother with?",+"content": "<p>When you recursively scan a massive directory tree, would you use <code>Sys.readdir</code> or <code>Unix.readdir</code>? My inclination is that <code>Sys.readdir</code> feels more convenient to use, and thus the lower-level <code>Unix.readdir</code> would have the performance edge. Is it significant enough to bother with?</p>\n\n<p>I quickly coded up the two options for comparison. Here\u2019s the <code>Unix.readdir</code> version, running <code>Unix.opendir</code> then recursively calling <code>Unix.readdir</code> until the <code>End_of_file</code> exception is raised.</p>\n\n<div><div><pre><code><span>let</span> <span>rec</span> <span>traverse_directory_unix</span> <span>path</span> <span>x</span> <span>=</span>\n <span>let</span> <span>stats</span> <span>=</span> <span>Unix</span><span>.</span><span>lstat</span> <span>path</span> <span>in</span>\n <span>match</span> <span>stats</span><span>.</span><span>st_kind</span> <span>with</span>\n <span>|</span> <span>Unix</span><span>.</span><span>S_REG</span> <span>-></span> <span>x</span> <span>+</span> <span>1</span>\n <span>|</span> <span>S_LNK</span> <span>|</span> <span>S_CHR</span> <span>|</span> <span>S_BLK</span> <span>|</span> <span>S_FIFO</span> <span>|</span> <span>S_SOCK</span> <span>-></span> <span>x</span>\n <span>|</span> <span>S_DIR</span> <span>-></span>\n <span>try</span>\n <span>let</span> <span>dir_handle</span> <span>=</span> <span>Unix</span><span>.</span><span>opendir</span> <span>path</span> <span>in</span>\n <span>let</span> <span>rec</span> <span>read_entries</span> <span>acc</span> <span>=</span>\n <span>try</span>\n <span>match</span> 
<span>Unix</span><span>.</span><span>readdir</span> <span>dir_handle</span> <span>with</span>\n <span>|</span> <span>\".\"</span> <span>|</span> <span>\"..\"</span> <span>-></span> <span>read_entries</span> <span>acc</span>\n <span>|</span> <span>entry</span> <span>-></span>\n <span>let</span> <span>full_path</span> <span>=</span> <span>Filename</span><span>.</span><span>concat</span> <span>path</span> <span>entry</span> <span>in</span>\n <span>read_entries</span> <span>(</span><span>traverse_directory_unix</span> <span>full_path</span> <span>acc</span><span>)</span>\n <span>with</span> <span>End_of_file</span> <span>-></span>\n <span>Unix</span><span>.</span><span>closedir</span> <span>dir_handle</span><span>;</span>\n <span>acc</span>\n <span>in</span>\n <span>read_entries</span> <span>x</span>\n <span>with</span> <span>_</span> <span>-></span> <span>x</span>\n</code></pre></div></div>\n\n<p>The <code>Sys.readdir</code> version nicely gives us an array so we can idiomatically use <code>Array.fold_left</code>.</p>\n\n<div><div><pre><code><span>let</span> <span>traverse_directory_sys</span> <span>source</span> <span>=</span>\n <span>let</span> <span>rec</span> <span>process_directory</span> <span>s</span> <span>current_source</span> <span>=</span>\n <span>let</span> <span>entries</span> <span>=</span> <span>Sys</span><span>.</span><span>readdir</span> <span>current_source</span> <span>in</span>\n <span>Array</span><span>.</span><span>fold_left</span>\n <span>(</span><span>fun</span> <span>acc</span> <span>entry</span> <span>-></span>\n <span>let</span> <span>source</span> <span>=</span> <span>Filename</span><span>.</span><span>concat</span> <span>current_source</span> <span>entry</span> <span>in</span>\n <span>try</span>\n <span>let</span> <span>stat</span> <span>=</span> <span>Unix</span><span>.</span><span>lstat</span> <span>source</span> <span>in</span>\n <span>match</span> <span>stat</span><span>.</span><span>st_kind</span> <span>with</span>\n <span>|</span> 
<span>Unix</span><span>.</span><span>S_REG</span> <span>-></span> <span>acc</span> <span>+</span> <span>1</span>\n <span>|</span> <span>Unix</span><span>.</span><span>S_DIR</span> <span>-></span> <span>process_directory</span> <span>acc</span> <span>source</span>\n <span>|</span> <span>S_LNK</span> <span>|</span> <span>S_CHR</span> <span>|</span> <span>S_BLK</span> <span>|</span> <span>S_FIFO</span> <span>|</span> <span>S_SOCK</span> <span>-></span> <span>acc</span>\n <span>with</span> <span>Unix</span><span>.</span><span>Unix_error</span> <span>_</span> <span>-></span> <span>acc</span><span>)</span>\n <span>s</span> <span>entries</span>\n <span>in</span>\n <span>process_directory</span> <span>0</span> <span>source</span>\n</code></pre></div></div>\n\n<p>The file system may have a big impact, so I tested NTFS, ReFS, and ext4, running each a couple of times to ensure the cache was primed.</p>\n\n<p><code>Sys.readdir</code> was quicker in my test cases up to 500,000 files. Reaching 750,000 files, <code>Unix.readdir</code> edged ahead. 
I was surprised by the outcome and wondered whether it was my code rather than the module I used.</p>\n\n<p>Pushing for the result I expected/wanted, I rewrote the function so it more closely mirrors the <code>Sys.readdir</code> version.</p>\n\n<div><div><pre><code><span>let</span> <span>traverse_directory_unix_2</span> <span>path</span> <span>=</span>\n <span>let</span> <span>rec</span> <span>process_directory</span> <span>s</span> <span>path</span> <span>=</span>\n <span>try</span>\n <span>let</span> <span>dir_handle</span> <span>=</span> <span>Unix</span><span>.</span><span>opendir</span> <span>path</span> <span>in</span>\n <span>let</span> <span>rec</span> <span>read_entries</span> <span>acc</span> <span>=</span>\n <span>try</span>\n <span>let</span> <span>entry</span> <span>=</span> <span>Unix</span><span>.</span><span>readdir</span> <span>dir_handle</span> <span>in</span>\n <span>match</span> <span>entry</span> <span>with</span>\n <span>|</span> <span>\".\"</span> <span>|</span> <span>\"..\"</span> <span>-></span> <span>read_entries</span> <span>acc</span>\n <span>|</span> <span>entry</span> <span>-></span>\n <span>let</span> <span>full_path</span> <span>=</span> <span>Filename</span><span>.</span><span>concat</span> <span>path</span> <span>entry</span> <span>in</span>\n <span>let</span> <span>stats</span> <span>=</span> <span>Unix</span><span>.</span><span>lstat</span> <span>full_path</span> <span>in</span>\n <span>match</span> <span>stats</span><span>.</span><span>st_kind</span> <span>with</span>\n <span>|</span> <span>Unix</span><span>.</span><span>S_REG</span> <span>-></span> <span>read_entries</span> <span>(</span><span>acc</span> <span>+</span> <span>1</span><span>)</span>\n <span>|</span> <span>S_LNK</span> <span>|</span> <span>S_CHR</span> <span>|</span> <span>S_BLK</span> <span>|</span> <span>S_FIFO</span> <span>|</span> <span>S_SOCK</span> <span>-></span> <span>read_entries</span> <span>acc</span>\n <span>|</span> <span>S_DIR</span> <span>-></span> 
<span>read_entries</span> <span>(</span><span>process_directory</span> <span>acc</span> <span>full_path</span><span>)</span>\n <span>with</span> <span>End_of_file</span> <span>-></span>\n <span>Unix</span><span>.</span><span>closedir</span> <span>dir_handle</span><span>;</span>\n <span>acc</span>\n <span>in</span>\n <span>read_entries</span> <span>s</span>\n <span>with</span> <span>_</span> <span>-></span> <span>s</span>\n <span>in</span>\n <span>process_directory</span> <span>0</span> <span>path</span>\n</code></pre></div></div>\n\n<p>This version is indeed faster than <code>Sys.readdir</code> in all cases. However, at 750,000 files the speed up was < 0.5%.</p>",
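The same two strategies exist outside OCaml. As a point of comparison (not from the post), here is a hedged Python sketch: `os.listdir` returns the whole name array up front, much like `Sys.readdir`, while `os.scandir` streams entries whose file type comes from the directory read itself, so the per-entry `lstat` that both OCaml versions pay can be skipped entirely.

```python
import os
import stat

def count_files_listdir(path):
    # Eager: fetch the whole name array (like Sys.readdir), then lstat each entry.
    total = 0
    for name in os.listdir(path):
        full = os.path.join(path, name)
        st = os.lstat(full)
        if stat.S_ISREG(st.st_mode):
            total += 1
        elif stat.S_ISDIR(st.st_mode):
            total += count_files_listdir(full)
    return total

def count_files_scandir(path):
    # Streaming: os.scandir yields entries carrying type information from the
    # directory read itself, so no separate lstat call is needed per entry.
    total = 0
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                total += 1
            elif entry.is_dir(follow_symlinks=False):
                total += count_files_scandir(entry.path)
    return total
```

On large trees the saved `lstat` calls tend to dominate, rather than array-versus-iterator overhead — which may be why the two OCaml versions, both of which `lstat` every entry, ended up so close.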
+20
mte/2025_07_09_jupyter.json
···+"content": "<p>Brief notes on publishing a Jupyter notebook as a Docker container.</p>\n\n<p>My starting point is a GitHub <a href=\"https://github.com/ucam-eo/tessera-interactive-map\">repo</a> containing a Jupyter notebook and a <code>requirements.txt</code>.</p>\n\n<div><div><pre><code>git clone https://github.com/ucam-eo/tessera-interactive-map\n<span>cd </span>tessera-interactive-map\n</code></pre></div></div>\n\n<p>I created a <code>Dockerfile</code> which pulls in a standard Python container. I used 3.11 as that is the minimum version supported by <a href=\"https://github.com/ucam-eo/geotessera.git\">https://github.com/ucam-eo/geotessera.git</a>.</p>\n\n<p><code>pip</code> installs the packages listed in <code>requirements.txt</code> plus the additional <a href=\"https://github.com/ucam-eo/geotessera.git\">geotessera</a> library. The extra library is noted in the <a href=\"https://github.com/ucam-eo/tessera-interactive-map/blob/main/README.md\">README.md</a>.</p>\n\n<div><div><pre><code>FROM python:3.11\nWORKDIR /app\nCOPY <span>.</span> /app\nRUN pip <span>install</span> <span>--no-cache-dir</span> <span>-r</span> requirements.txt\nRUN pip <span>install </span>git+https://github.com/ucam-eo/geotessera.git\nRUN pip <span>install </span>jupyter\nEXPOSE 8888\nENV NAME World\nCMD <span>[</span><span>\"jupyter\"</span>, <span>\"notebook\"</span>, <span>\"--ip=0.0.0.0\"</span>, <span>\"--port=8888\"</span>, <span>\"--no-browser\"</span>, <span>\"--allow-root\"</span><span>]</span>\n</code></pre></div></div>\n\n<p>Build the Docker image.</p>\n\n<div><div><pre><code>docker build <span>-t</span> my-jupyter <span>.</span>\n</code></pre></div></div>\n\n<p>And run the container.</p>\n\n<div><div><pre><code><span># docker run --rm -it -p 8888:8888 my-jupyter</span>\n<span>[</span>I 2025-07-09 16:11:37.739 ServerApp] jupyter_lsp | extension was successfully linked.\n<span>[</span>I 2025-07-09 16:11:37.743 ServerApp] jupyter_server_terminals | extension was successfully 
linked.\n<span>[</span>I 2025-07-09 16:11:37.746 ServerApp] jupyterlab | extension was successfully linked.\n<span>[</span>I 2025-07-09 16:11:37.749 ServerApp] notebook | extension was successfully linked.\n<span>[</span>I 2025-07-09 16:11:37.751 ServerApp] Writing Jupyter server cookie secret to /root/.local/share/jupyter/runtime/jupyter_cookie_secret\n<span>[</span>I 2025-07-09 16:11:38.089 ServerApp] notebook_shim | extension was successfully linked.\n<span>[</span>I 2025-07-09 16:11:38.102 ServerApp] notebook_shim | extension was successfully loaded.\n<span>[</span>I 2025-07-09 16:11:38.104 ServerApp] jupyter_lsp | extension was successfully loaded.\n<span>[</span>I 2025-07-09 16:11:38.105 ServerApp] jupyter_server_terminals | extension was successfully loaded.\n<span>[</span>I 2025-07-09 16:11:38.107 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.11/site-packages/jupyterlab\n<span>[</span>I 2025-07-09 16:11:38.107 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab\n<span>[</span>I 2025-07-09 16:11:38.107 LabApp] Extension Manager is <span>'pypi'</span><span>.</span>\n<span>[</span>I 2025-07-09 16:11:38.156 ServerApp] jupyterlab | extension was successfully loaded.\n<span>[</span>I 2025-07-09 16:11:38.159 ServerApp] notebook | extension was successfully loaded.\n<span>[</span>I 2025-07-09 16:11:38.160 ServerApp] Serving notebooks from <span>local </span>directory: /app\n<span>[</span>I 2025-07-09 16:11:38.160 ServerApp] Jupyter Server 2.16.0 is running at:\n<span>[</span>I 2025-07-09 16:11:38.160 ServerApp] http://0ad4fce9b94e:8888/tree?token<span>=</span>c11c0f007dd99a785ff67331514fb44e87269055952a253b\n<span>[</span>I 2025-07-09 16:11:38.160 ServerApp] http://127.0.0.1:8888/tree?token<span>=</span>c11c0f007dd99a785ff67331514fb44e87269055952a253b\n</code></pre></div></div>\n\n<p>Note the URL in the log output and open it in the browser. 
You are prompted to enter the token if you don\u2019t include it as part of the URL.</p>",
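If you'd rather not copy the token out of the log each time, it can be pinned when starting the container. A sketch, assuming the `my-jupyter` image built above and that this Jupyter version accepts the `--ServerApp.token` setting; the token value here is a made-up example:

```shell
# Override the image's CMD to pin the token (hypothetical value) so the
# URL is predictable, and mount the working copy so notebook edits persist.
docker run --rm -it -p 8888:8888 -v "$PWD":/app my-jupyter \
  jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser --allow-root \
    --ServerApp.token=change-me
# then open http://127.0.0.1:8888/tree?token=change-me
```

This leaves the `Dockerfile` untouched; the flag could equally be baked into the `CMD` line.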
+20
mte/2025_07_10_dune-unfmt.json
···+"summary": "When working across machines, it\u2019s easy to make changes and reconcile them using git. However, I made a mistake and inadvertently ran dune fmt, and now my git diff is a total mess.",+"content": "<p>When working across machines, it\u2019s easy to make changes and reconcile them using git. However, I made a mistake and inadvertently ran <code>dune fmt</code>, and now my <code>git diff</code> is a total mess.</p>\n\n<p>My thought, to get myself out of this situation, is to go back to the previous commit and create a new branch with no changes other than a <code>dune fmt</code>. I can then cherry-pick my latest work onto that branch, which should then give me a clean diff.</p>\n\n<div><div><pre><code>git commit <span>-am</span> <span>'inadvertent reformatted version'</span>\n</code></pre></div></div>\n\n<p>Run <code>git log</code> to find the commit that was just made and the previous one.</p>\n\n<p>Check out the previous commit and make a new branch, in my case called <code>pre-fmt</code>.</p>\n\n<div><div><pre><code>git checkout <previous commit>\ngit switch <span>-c</span> pre-fmt\n</code></pre></div></div>\n\n<p>Format the code in this branch and commit that version.</p>\n\n<div><div><pre><code>dune <span>fmt\n</span>git commit <span>-am</span> <span>'dune fmt'</span>\n</code></pre></div></div>\n\n<p>Now cherry-pick the original commit.</p>\n\n<div><div><pre><code>git cherry-pick <latest commit>\n</code></pre></div></div>\n\n<p>The cherry-pick reports lots of merge conflicts; these should be trivial to resolve, but it is a manual process. Once done, add the changed files and finish the cherry-pick.</p>\n\n<div><div><pre><code>git add bin/<span>*</span>.ml\ngit cherry-pick <span>--continue</span>\n</code></pre></div></div>\n\n<p><code>git diff</code> now shows just the actual changes rather than the code formatting changes. Do you have any suggestions on a better workflow?</p>",
+2
-2
mte/metadata.json
+14
mwd/blog_building-ocaml-on-haiku_.json
···+"summary": "<p>In what has to be a niche of a niche, post-wise, I was intrigued when I spotted <a href=\"https://www.haiku-os.org/blog/anarchos/2024-04-09_an_odissey_to_port_compcert/\">this post</a> recently that someone had built <a href=\"https://ocaml.org\">OCaml</a> for <a href=\"https://www.haiku-os.org/\">Haiku</a>. I'd been playing with Haiku a little recently, as I wanted to understand its file-system, and so I thought I'd have a go. It turns out it's quite simple, and although the above post does kinda tell you what you need, there's a few gaps, so this post is just recording what I did. But all credit has to go to Sylvain Kerjean for that original post which gets you most of the way; this is just trying to make it easier for me to cut and paste later!</p>\n<p>First up:</p>\n<ul>\n<li>Ensure <code>/boot/home/config/non-packaged/bin</code> is on your <code>PATH</code> variable.</li>\n<li>Get a checkout of OCaml from <a href=\"https://github.com/ocaml/ocaml\">https://github.com/ocaml/ocaml</a>.</li>\n<li>Configure it with the appropriate prefix, make, and install. It really does just work!</li>\n</ul>\n<pre><code>$ export PATH=$PATH:/boot/home/config/non-packaged/bin\n$ git clone https://github.com/ocaml/ocaml.git\n$ cd ocaml\n$ ./configure --prefix=/boot/home/config/non-packaged\n...\n$ make\n...\n$ make install\n</code></pre>\n<p>Now that you have OCaml, you're almost certainly going to want to install opam, OCaml's package manager, also. Thankfully that also mostly works, with the one caveat that opam needs some OCaml modules installed to build, which you could install with opam, but you don't have it yet! Thankfully there's an option for that:</p>\n<ul>\n<li>Get a checkout of opam from <a href=\"https://github.com/ocaml/opam\">https://github.com/ocaml/opam</a>.</li>\n<li>Configure it with the prefix and vendor flags, then make and install. 
Also really easy!</li>\n</ul>\n<pre><code>$ cd ..\n$ git clone https://github.com/ocaml/opam.git\n$ cd opam\n$ ./configure --prefix=/boot/home/config/non-packaged --with-vendored-deps\n...\n$ make\n...\n$ make install\n</code></pre>\n<p>At this stage, you're almost good to go. Opam will need a couple of tools installed before it'll work:</p>\n<pre><code>$ pkgman install rsync getconf\n$ opam init\n</code></pre>\n<p>And now you're good to go! Much to my surprise I was even able to get my SDL2-based retro graphics library for OCaml running very quickly. I just had to make sure I had a few extra <code>_devel</code> packages installed for things like SDL2 and libffi.</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of a Haiku OS session running, with a bunch of windows open, one of which shows the shell history of how I built and installed all these tools, and another showing some low resolution graphics.\" src=\"screenshot2.png\">\n \n </div>\n</div>\n<p>It is, however, incredibly slow: my graphics library isn't very well optimised, as on modern hardware it usually doesn't need to be to push old-school VGA-like graphics around, but running natively on my AMD Ryzen machine it was really quite poor, low single-digit frames per second. In part, I assume, this is related to Haiku not knowing about my fancy NVIDIA graphics card, and just using the stock framebuffer driver, and in part because OCaml doesn't know about Haiku enough to build a native binary and is instead using the bytecode backend.</p>",+"content": "<p>In what has to be a niche of a niche, post-wise, I was intrigued when I spotted <a href=\"https://www.haiku-os.org/blog/anarchos/2024-04-09_an_odissey_to_port_compcert/\">this post</a> recently that someone had built <a href=\"https://ocaml.org\">OCaml</a> for <a href=\"https://www.haiku-os.org/\">Haiku</a>. I'd been playing with Haiku a little recently, as I wanted to understand its file-system, and so I thought I'd have a go. 
It turns out it's quite simple, and although the above post does kinda tell you what you need, there's a few gaps, so this post is just recording what I did. But all credit has to go to Sylvain Kerjean for that original post which gets you most of the way; this is just trying to make it easier for me to cut and paste later!</p>\n<p>First up:</p>\n<ul>\n<li>Ensure <code>/boot/home/config/non-packaged/bin</code> is on your <code>PATH</code> variable.</li>\n<li>Get a checkout of OCaml from <a href=\"https://github.com/ocaml/ocaml\">https://github.com/ocaml/ocaml</a>.</li>\n<li>Configure it with the appropriate prefix, make, and install. It really does just work!</li>\n</ul>\n<pre><code>$ export PATH=$PATH:/boot/home/config/non-packaged/bin\n$ git clone https://github.com/ocaml/ocaml.git\n$ cd ocaml\n$ ./configure --prefix=/boot/home/config/non-packaged\n...\n$ make\n...\n$ make install\n</code></pre>\n<p>Now that you have OCaml, you're almost certainly going to want to install opam, OCaml's package manager, also. Thankfully that also mostly works, with the one caveat that opam needs some OCaml modules installed to build, which you could install with opam, but you don't have it yet! Thankfully there's an option for that:</p>\n<ul>\n<li>Get a checkout of opam from <a href=\"https://github.com/ocaml/opam\">https://github.com/ocaml/opam</a>.</li>\n<li>Configure it with the prefix and vendor flags, then make and install. Also really easy!</li>\n</ul>\n<pre><code>$ cd ..\n$ git clone https://github.com/ocaml/opam.git\n$ cd opam\n$ ./configure --prefix=/boot/home/config/non-packaged --with-vendored-deps\n...\n$ make\n...\n$ make install\n</code></pre>\n<p>At this stage, you're almost good to go. Opam will need a couple of tools installed before it'll work:</p>\n<pre><code>$ pkgman install rsync getconf\n$ opam init\n</code></pre>\n<p>And now you're good to go! Much to my surprise I was even able to get my SDL2-based retro graphics library for OCaml running very quickly. 
I just had to make sure I had a few extra <code>_devel</code> packages installed for things like SDL2 and libffi.</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of a Haiku OS session running, with a bunch of windows open, one of which shows the shell history of how I built and installed all these tools, and another showing some low resolution graphics.\" src=\"screenshot2.png\">\n \n </div>\n</div>\n<p>It is, however, incredibly slow: my graphics library isn't very well optimised, as on modern hardware it usually doesn't need to be to push old-school VGA-like graphics around, but running natively on my AMD Ryzen machine it was really quite poor, low single-digit frames per second. In part, I assume, this is related to Haiku not knowing about my fancy NVIDIA graphics card, and just using the stock framebuffer driver, and in part because OCaml doesn't know about Haiku enough to build a native binary and is instead using the bytecode backend.</p>",
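On that last point, whether an OCaml install can produce native binaries is easy to probe. A sketch (my own check, not from the original post): it tries to compile a one-liner with `ocamlopt` and falls back to reporting bytecode-only, which is what I'd expect on Haiku today:

```shell
# Probe whether this OCaml install has a working native-code backend.
probe=$(mktemp -d)
printf 'print_endline "native ok"\n' > "$probe/p.ml"
if command -v ocamlopt >/dev/null 2>&1 \
   && (cd "$probe" && ocamlopt -o p p.ml >/dev/null 2>&1); then
  result=$("$probe/p")
else
  result="bytecode only"
fi
echo "$result"
```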
+14
mwd/blog_go-wasm-workers_.json
···+"summary": "<p>These are some notes for myself about trying to use Wasm and Web Workers to achieve some level of parallelisation in the browser. This isn't meant to be a comprehensive tutorial, but there are so many broken tutorials or half bits of documentation out there that I thought I should leave myself a note here. This is just the result of an afternoon of spelunking to try and work out how to do this, and should not be considered comprehensive.</p>\n<h1>Example</h1>\n<div>\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n</div>\n<p>If you're viewing this page directly (rather than via an RSS reader) and your browser supports Wasm, then you should see a <a href=\"https://en.wikipedia.org/wiki/Mandelbrot_set\">Mandelbrot fractal</a> render into place above, with different chunks appearing at different points (and on Safari you might see some banding, which is Safari failing to align the canvas tiles properly rather than being an issue with the fractal generation). Each tile is rendered in parallel in Go code, using a mix of the aforementioned technologies. The source code for this <a href=\"https://github.com/mdales/go-wasm-web-worker-test\">can be found here</a>.</p>\n<p>The slightly fun thing, which may vary for you depending on the speed of your machine, how many cores you have, the browser you're using, etc., is that you can see the tiles with more black in them render more slowly, rather than the tiles rendering in order. 
That's the sales pitch anyway, but it's a bit more like writing plugins<a href=\"#fn-1\">[1]</a> for a web page, as you still need Javascript to act as the loader for the Wasm blob, and it has a constrained set of ways it can interact with the page (you can work with the DOM, but for Canvas drawing you'll need to have some Javascript code for that also). Your Wasm components are also constrained by the Wasm runtime, which means you won't get all the features of your language that you're used to. In particular (related to my interests), the Wasm virtual machine is still running in a similar context to Javascript in the browser, so can only be single threaded, as exemplified by this quote from the <a href=\"https://go.dev/blog/wasmexport\">most recent Go update on the topic</a>:</p>\n<blockquote>\n<p>While Go 1.24 has made significant enhancements to its Wasm capabilities, there are still some notable limitations.</p>\n<p>Wasm is a single-threaded architecture with no parallelism. A <code>go:wasmexport</code> function can spawn new goroutines. But if a function creates a background goroutine, it will not continue executing when the <code>go:wasmexport</code> function returns, until calling back into the Go-based Wasm module.</p>\n</blockquote>\n<p>That's not to say there aren't benefits from being able to use a language other than Javascript in the browser, but it's important to understand its constraints.</p>\n\n\n<p>The second bit of context here is related to that lack of parallelism, which is clearly desirable for certain applications. There is now a model in Javascript to get a level of parallelism, which is <a href=\"https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers\">Web Workers</a>. In Javascript you can now instantiate a Javascript script as a "worker", aka a thread of execution, to which you can pass messages asking it to do some work and receive a message back when it's got something to tell you. 
A worker is single threaded again, and if you ask it to do multiple things it'll just queue them up, but you can instantiate multiple workers and ask them each to do a thing, and now you're starting to find a level of parallelism.</p>\n<p>The Web Worker model is only available to Javascript, but you can from your worker Javascript instantiate a Wasm component, and thus we can now have a somewhat convoluted way to run non-Javascript code in the browser in parallel.</p>\n<p>At least, that's the theory, so now let's have a go with Go.</p>\n<h1>Using Go for Wasm</h1>\n<h2>Which way to Go</h2>\n<p>Know that whatever I write here will age poorly. I'm having to write my own notes here because notes written by others have also inevitably decayed. It appears that Wasm support is still at the stage where people are working out the best ways to do things, and so there's a lot of posts that either don't work any more or are conflicting.</p>\n<p>The waters are further muddied in the Go world as there's two toolchains which use slightly different options and syntax for supporting Wasm. There's the <a href=\"https://go.dev\">main Go toolchain</a>, and then there's <a href=\"https://tinygo.org\">TinyGo</a>, which is Go aimed at embedded systems. Both of these toolchains support Wasm, but it looks like TinyGo tried to do a better job, then main Go caught up, but with a slightly different syntax for things, and so you may have either Go or the corresponding Javascript code that works for one and not the other (the Javascript has to change due to slight differences in the exports the Wasm modules make from the two different toolchains).</p>\n<p>TinyGo seems like a good choice: being aimed at embedded systems the runtime library is smaller than the regular Go runtime, and so your compiled Wasm blob will be smaller with TinyGo than Full Fat Go\u2122. 
However, TinyGo's toolchain (at least for Wasm) relies on the regular Go toolchain at points, but is <a href=\"https://github.com/tinygo-org/tinygo/issues/4719\">currently lagging behind</a> on support:</p>\n<pre><code>$ GOOS=js GOARCH=wasm tinygo build -o main.wasm ./main.go\nrequires go version 1.19 through 1.23, got go1.24\n</code></pre>\n<p>And given I had 1.24 installed on my machine and I didn't want to mess about with it, the rest of this document will be based on using the main Go toolchain.</p>\n<h2>A minimal Go blob</h2>\n<p>I want to title this section "a minimal Go module", as from a Wasm point of view that's how I see the result, but the term <em>module</em> in Go has a very specific meaning which is not the same, and so I'll keep using the term blob.</p>\n<p>If we imagine a Wasm blob in Go that exports a function to add two numbers, we can write that thus using the latest version of Go:</p>\n<pre><code>package main\n\n//go:wasmexport add\nfunc add(a int32, b int32) int32 {\n\treturn a + b;\n}\n\nfunc main() {}\n</code></pre>\n<p>Three things to note here:</p>\n<ol>\n<li>There is a comment annotation that <em>sort of</em> exports the method in the Javascript world.</li>\n<li>Only certain types are allowed for Wasm exposed functions, as Wasm has a limited set of datatypes it supports. You can see the list of supported Go types <a href=\"https://pkg.go.dev/cmd/compile#hdr-WebAssembly_Directives\">here</a>.</li>\n<li>You still have to have a main function, though it can be totally empty if you're exporting things this way.</li>\n</ol>\n<p>To clarify the uncertainty in that first point: if you use the function annotation here, then your method isn't added to the global Javascript namespace, but rather is in a list of functions in the instance object for your Wasm blob in the Javascript world (see next section). 
You can use the older style of pushing your function into the Javascript namespace if you like also:</p>\n<pre><code>package main\n\nimport "syscall/js"\n\nfunc add(a int, b int) int {\n\treturn a + b;\n}\n\nfunc main() {\n\tc := make(chan struct{}, 0)\n\n\tjs.Global().Set("add", js.FuncOf(func(this js.Value, args []js.Value) any {\n\t\treturn add(args[0].Int(), args[1].Int())\n\t}))\n\n\t<-c\n}\n</code></pre>\n<p>I've not seen <a href=\"https://pkg.go.dev/cmd/compile#hdr-WebAssembly_Directives\">in the documentation</a> why it is that for the new style of annotation you don't need anything in main, compared to this older style. The advantage of this is that now <code>add</code> is just a thing you can call from any Javascript as it's in the global namespace, assuming you think it's an advantage to do so. But given I prefer to have more control over where things appear, I'll just stick with the new style of coding.</p>\n<p>It's worth noting that if you use TinyGo then that had the annotations before the main Go compiler, and uses a slightly different syntax for them, so as far as I can tell you need to code for one or the other currently, you can't code for both. I believe TinyGo will also convert different types (based on an example I was reading). 
I assume at some point they'll align, but for now it feels like you're going to write code for either the Full Fat Go\u2122 toolchain or the TinyGo toolchain, rather than just you're writing Go for Wasm.</p>\n<p>You at least do compile them the same way:</p>\n<pre><code>$ GOOS=js GOARCH=wasm go build -o main.wasm ./main.go\n</code></pre>\n<p>The one thing to note there is that there is a second <code>GOOS</code> target, <code>wasip1</code>, which you can use if you don't want to use the browser but instead are targeting a standalone Wasm runtime like <a href=\"https://github.com/bytecodealliance/wasmtime\">wasmtime</a>.</p>\n<h2>Loading the Wasm blob</h2>\n<p>Now we have some Go code compiled into a Wasm blob, we want to load it into the browser. To do that with Go you first want to locate the helper Javascript file that comes with the Go toolchain. You can copy that into your project directory like so:</p>\n<pre><code>$ cp `go env GOROOT`/lib/wasm/wasm_exec.js .\n</code></pre>\n<p>Then you can load and call your Wasm thus:</p>\n<pre><code><!doctype HTML>\n<html>\n\t<head>\n\t\t<script src="wasm_exec.js"></script>\n\t\t<script>\n\t\t\tconst go = new Go();\n\t\t\tlet inst;\n\t\t\tWebAssembly.instantiateStreaming(fetch("main.wasm"), go.importObject).then((result) => {\n\t\t\t\tinst = result.instance;\n\t\t\t\tgo.run(inst);\n\t\t\t\tconsole.log(inst.exports.add(3, 4));\n\t\t\t}).catch((err) => {\n\t\t\t\tconsole.error(err);\n\t\t\t});\n\t\t</script>\n\t</head>\n\t<body>\n\t</body>\n</html>\n</code></pre>\n<p>The important thing to note here is that loading Wasm is an asynchronous operation: until that <code>go.run(inst)</code> line has run, you can't assume your Wasm code is accessible, so you should default to having any controls on your page related to the Wasm plugin disabled and only enable them in the <code>then</code> block after loading the Wasm blob. 
You need to doubly pay attention to this with Web Workers, as we'll see.</p>\n<p>Note also the <code>inst.exports.add</code> call - that's because I used the annotation to publish my interface. If I'd used the <code>js.Global().Set("add"...</code> technique then I could just have called <code>add</code> directly.</p>\n<p>One gotcha you will face at this point is that if you have a bug in your Wasm code, in the browser console it'll appear as an error in <code>wasm_exec.js</code> rather than you getting anything useful about your Go code.</p>\n<h1>Web Workers</h1>\n<h2>Just Web Workers</h2>\n<p>Javascript has always been single threaded, and Wasm follows in that model. But for the projects I have in mind around geospatial work, I'm interested in whether we can run things in parallel on the client side. Thankfully we now have <a href=\"https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers\">Web Workers</a>, which is a way to set up one or more worker threads. These threads are just Javascript modules (files) that you can ask to be set up as a worker, and they have a message queue to which you can request they do work, and another queue through which they can send responses. The workers themselves are also single threaded, so if you send them two requests they will service one fully before they service the next one.</p>\n<p>The API for this is really quite simple and clean. 
You first create your worker logic in a Javascript file:</p>\n<pre><code>// worker.js\n\n// something to do some work\nfunction add(a, b) { return a + b }\n\nonmessage = ({ data }) => {\n\tconst { action, payload } = data;\n\tswitch (action) {\n\t\tcase "add":\n\t\t\tconst { x, y } = payload;\n\t\t\tconst res = add(x, y);\n\t\t\tpostMessage({ action: "result", payload: res });\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tthrow (`unknown action '${action}'`);\n\t}\n};\n</code></pre>\n<p>And then in your main Javascript file you write:</p>\n<pre><code>const worker = new Worker("worker.js");\nworker.onmessage = ({ data }) => {\n\tlet { action, payload } = data;\n\tswitch (action) {\n\t\tcase "result":\n\t\t\tconsole.log("we got a result: ", payload);\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tconsole.error(`Unknown action: ${action}`);\n\t}\n};\n\n...\n\nworker.postMessage({action: "add", payload: {x: 4, y: 4}});\n\n</code></pre>\n<p>As mentioned before, a worker can not itself do parallel work, but you can call <code>new Worker()</code> multiple times to create many worker threads that you then send tasks to.\n\u00a0</p>\n<h2>Web Workers with Wasm</h2>\n<p>Finally we can pull this together. The basic architecture here is that we load a web worker that in turn loads a Wasm module. 
This is simple enough, with the one caveat that you need to have the web worker tell you when it's ready due to that aforementioned loading delay for Wasm blobs (or, I guess, you could have requests fail before then, but that's not how I roll).</p>\n<p>So, our worker now looks something like this:</p>\n<pre><code>// worker.js\n\nimportScripts("wasm_exec.js");\n\nconst go = new Go();\nlet exports;\nWebAssembly.instantiateStreaming(fetch("main.wasm"), go.importObject).then((result) => {\n\texports = result.instance.exports;\n\tgo.run(result.instance);\n\tpostMessage({ action: "ready", payload: null });\n}).catch((err) => {\n\tconsole.error("Worker failed to load WASM module: ", err)\n});\n\nonmessage = ({ data }) => {\n\tconst { action, payload } = data;\n\tswitch (action) {\n\t\tcase "add":\n\t\t\tconst { x, y } = payload;\n\t\t\tconst res = exports.add(x, y);\n\t\t\tpostMessage({ action: "result", payload: res });\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tthrow (`unknown action '${action}'`);\n\t}\n};\n</code></pre>\n<p>Note that extra <code>postMessage</code> in the success handler for loading the Wasm blob; it is there to tell the main Javascript code that this worker is now actually ready to do something.</p>\n<p>In the main code we can have something like:</p>\n<pre><code>const worker = new Worker("worker.js");\nworker.onmessage = ({ data }) => {\n\tlet { action, payload } = data;\n\tswitch (action) {\n\t\tcase "ready":\n\t\t\tworker.postMessage({action: "add", payload: {x: 4, y: 4}});\n\t\t\tbreak;\n\t\tcase "result":\n\t\t\tconsole.log("we got a result: ", payload);\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tconsole.error(`Unknown action: ${action}`);\n\t}\n};\n</code></pre>\n<p>This is a bit of a silly artificial example I've used in the code snippets here, but you can see a real working version for generating that opening fractal using 12 parallel Wasm workers <a href=\"https://github.com/mdales/go-wasm-web-worker-test\">here</a>.</p>\n\n<ol>\n<li>\n<p>I found myself shuddering 
slightly at this apparent return to <a href=\"https://en.wikipedia.org/wiki/Java_applet\">Java applets</a> and <a href=\"https://en.wikipedia.org/wiki/ActiveX\">ActiveX</a>. At least the security model is better thought out this time around it seems.</p>\n<span><a href=\"#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",+"content": "<p>These are some notes for myself about trying to use Wasm and Web Workers to achieve some level of parallelisation in the browser. This isn't meant to be a comprehensive tutorial, but there are so many broken tutorials or half bits of documentation out there that I thought I should leave myself a note here. This is just the result of an afternoon of spelunking to try and work out how to do this, and should not be considered comprehensive.</p>\n<h1>Example</h1>\n<div>\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n\t\n</div>\n<p>If you're viewing this page directly (rather than via an RSS reader) and your browser supports Wasm, then you should see a <a href=\"https://en.wikipedia.org/wiki/Mandelbrot_set\">Mandelbrot fractal</a> render into place above, with different chunks appearing at different points (and on Safari you might see some banding, which is Safari failing to align the canvas tiles properly rather than being an issue with the fractal generation). Each tile is rendered in parallel in Go code, using a mix of the aforementioned technologies. The source code for this <a href=\"https://github.com/mdales/go-wasm-web-worker-test\">can be found here</a>.</p>\n<p>The slightly fun thing, which may vary for you depending on the speed of your machine, how many cores you have, the browser you're using, etc., is that you can see the tiles with more black in them render more slowly, rather than the tiles rendering in order. 
This is some indication of parallelism: the black part of a fractal is the slowest part to render (the black is actually the algorithm giving up after it hits a maximum number of iterations), so the fact that all the lighter tiles show up first and the others take longer is a nice indicator that they're not just being run in order.</p>\n<h1>Context</h1>\n<p><a href=\"https://webassembly.org\">Web Assembly</a>, aka Wasm, is a way to write code in a language other than Javascript and have it run in a browser. That's the sales pitch anyway, but it's a bit more like writing plugins<a href=\"#fn-1\">[1]</a> for a web page, as you still need Javascript to act as the loader for the Wasm blob, and it has a constrained set of ways it can interact with the page (you can work with the DOM, but for Canvas drawing you'll need to have some Javascript code for that also). Your Wasm components are also constrained by the Wasm runtime, which means you won't get all the features of your language that you're used to. In particular (related to my interests), the Wasm virtual machine is still running in a similar context to Javascript in the browser, so can only be single threaded, as exemplified by this quote from the <a href=\"https://go.dev/blog/wasmexport\">most recent Go update on the topic</a>:</p>\n<blockquote>\n<p>While Go 1.24 has made significant enhancements to its Wasm capabilities, there are still some notable limitations.</p>\n<p>Wasm is a single-threaded architecture with no parallelism. A <code>go:wasmexport</code> function can spawn new goroutines. 
But if a function creates a background goroutine, it will not continue executing when the <code>go:wasmexport</code> function returns, until calling back into the Go-based Wasm module.</p>\n</blockquote>\n<p>That's not to say there aren't benefits from being able to use a language other than Javascript in the browser, but it's important to understand its constraints.</p>\n\n\n<p>The second bit of context here is related to that lack of parallelism; parallelism is clearly desirable for certain applications. There is now a model in Javascript to get a level of parallelism, which is <a href=\"https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers\">Web Workers</a>. In Javascript you can now instantiate a Javascript script as a "worker", aka a thread of execution, to which you can pass messages asking it to do some work and receive a message back when it's got something to tell you. A worker is single threaded again, and if you ask it to do multiple things it'll just queue them up, but you can instantiate multiple workers and ask them each to do a thing, and now you're starting to find a level of parallelism.</p>\n<p>The Web Worker model is only available to Javascript, but you can from your worker Javascript instantiate a Wasm component, and thus we can now have a somewhat convoluted way to run non-Javascript code in the browser in parallel.</p>\n<p>At least, that's the theory, so now let's have a go with Go.</p>\n<h1>Using Go for Wasm</h1>\n<h2>Which way to Go</h2>\n<p>Know that whatever I write here will age poorly. I'm having to write my own notes here because notes written by others have also inevitably decayed. 
It appears that Wasm support is still at the stage where people are working out the best ways to do things, and so there's a lot of posts that either don't work any more or are conflicting.</p>\n<p>The waters are further muddied in the Go world as there's two toolchains which use slightly different options and syntax for supporting Wasm. There's the <a href=\"https://go.dev\">main Go toolchain</a>, and then there's <a href=\"https://tinygo.org\">TinyGo</a>, which is Go aimed at embedded systems. Both of these toolchains support Wasm, but it looks like TinyGo tried to do a better job, then main Go caught up, but with a slightly different syntax for things, and so you may have either Go or the corresponding Javascript code that works for one and not the other (the Javascript has to change due to slight differences in the exports the Wasm modules make from the two different toolchains).</p>\n<p>TinyGo seems like a good choice: being aimed at embedded systems the runtime library is smaller than the regular Go runtime, and so your compiled Wasm blob will be smaller with TinyGo than Full Fat Go\u2122. 
However, TinyGo's toolchain (at least for Wasm) relies on the regular Go toolchain at points, but is <a href=\"https://github.com/tinygo-org/tinygo/issues/4719\">currently lagging behind</a> on support:</p>\n<pre><code>$ GOOS=js GOARCH=wasm tinygo build -o main.wasm ./main.go\nrequires go version 1.19 through 1.23, got go1.24\n</code></pre>\n<p>And given I had 1.24 installed on my machine and I didn't want to mess about with it, the rest of this document will be based on using the main Go toolchain.</p>\n<h2>A minimal Go blob</h2>\n<p>I want to title this section "a minimal Go module", as from a Wasm point of view that's how I see the result, but the term <em>module</em> in Go has a very specific meaning which is not the same, and so I'll keep using the term blob.</p>\n<p>If we imagine a Wasm blob in Go that exports a function to add two numbers, we can write that thus using the latest version of Go:</p>\n<pre><code>package main\n\n//go:wasmexport add\nfunc add(a int32, b int32) int32 {\n\treturn a + b;\n}\n\nfunc main() {}\n</code></pre>\n<p>Three things to note here:</p>\n<ol>\n<li>There is a comment annotation that <em>sort of</em> exports the method in the Javascript world.</li>\n<li>Only certain types are allowed for Wasm exposed functions, as Wasm has a limited set of datatypes it supports. You can see the list of supported Go types <a href=\"https://pkg.go.dev/cmd/compile#hdr-WebAssembly_Directives\">here</a>.</li>\n<li>You still have to have a main function, though it can be totally empty if you're exporting things this way.</li>\n</ol>\n<p>To clarify the uncertainty in that first point: if you use the function annotation here, then your method isn't added to the global Javascript namespace, but rather is in a list of functions in the instance object for your Wasm blob in the Javascript world (see next section). 
You can use the older style of pushing your function into the Javascript namespace if you like also:</p>\n<pre><code>package main\n\nimport "syscall/js"\n\nfunc add(a int, b int) int {\n\treturn a + b\n}\n\nfunc main() {\n\tc := make(chan struct{}, 0)\n\n\tjs.Global().Set("add", js.FuncOf(func(this js.Value, args []js.Value) any {\n\t\treturn add(args[0].Int(), args[1].Int())\n\t}))\n\n\t<-c\n}\n</code></pre>\n<p>I've not seen <a href=\"https://pkg.go.dev/cmd/compile#hdr-WebAssembly_Directives\">in the documentation</a> why it is that for the new style of annotation you don't need anything in main, compared to this older style. The advantage of this is that now <code>add</code> is just a thing you can call from any Javascript as it's in the global namespace, assuming you think it's an advantage to do so. But given I prefer to have more control over where things appear, I'll just stick with the new style of coding.</p>\n<p>It's worth noting that TinyGo had these annotations before the main Go compiler, and uses a slightly different syntax for them, so as far as I can tell you currently need to code for one or the other; you can't code for both. I believe TinyGo will also convert different types (based on an example I was reading). 
I assume at some point they'll align, but for now it feels like you're going to write code for either the Full Fat Go\u2122 toolchain or the TinyGo toolchain, rather than just you're writing Go for Wasm.</p>\n<p>You at least do compile them the same way:</p>\n<pre><code>$ GOOS=js GOARCH=wasm go build -o main.wasm ./main.go\n</code></pre>\n<p>The one thing to note there is that there is a second <code>GOOS</code> target, <code>wasip1</code>, which you can use if you don't want to use the browser but instead are targeting a standalone Wasm runtime like <a href=\"https://github.com/bytecodealliance/wasmtime\">wasmtime</a>.</p>\n<h2>Loading the Wasm blob</h2>\n<p>Now we have some Go code compiled into a Wasm blob, we want to load it into the browser. To do that with Go you first want to locate the helper Javascript file that comes with the Go toolchain. You can copy that into your project directory like so:</p>\n<pre><code>$ cp `go env GOROOT`/lib/wasm/wasm_exec.js .\n</code></pre>\n<p>Then you can load and call your Wasm thus:</p>\n<pre><code><!doctype HTML>\n<html>\n\t<head>\n\t\t<script src="wasm_exec.js"></script>\n\t\t<script>\n\t\t\tconst go = new Go();\n\t\t\tlet inst;\n\t\t\tWebAssembly.instantiateStreaming(fetch("main.wasm"), go.importObject).then((result) => {\n\t\t\t\tinst = result.instance;\n\t\t\t\tgo.run(inst);\n\t\t\t\tconsole.log(inst.exports.add(3, 4));\n\t\t\t}).catch((err) => {\n\t\t\t\tconsole.error(err);\n\t\t\t});\n\t\t</script>\n\t</head>\n\t<body>\n\t</body>\n</html>\n</code></pre>\n<p>The important thing to note here is that loading Wasm is an asynchronous operation: until that <code>go.run(inst)</code> line has run, you can't assume your Wasm code is accessible, so you should default to having any controls on your page related to the Wasm plugin disabled and only enable them in the <code>then</code> block after loading the Wasm blob. 
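Since that asynchronous load bites repeatedly, one way to keep the "controls disabled until ready" rule honest is to gate every use of the exports behind a promise. This is a hypothetical helper of my own, not something `wasm_exec.js` or the Go toolchain provides; all the names here are illustrative:

```javascript
// A sketch of gating access to the Wasm exports behind a promise, so no
// caller can race the asynchronous load. None of these names come from
// wasm_exec.js; they are illustrative only.

let resolveExports;
const exportsPromise = new Promise((resolve) => {
	resolveExports = resolve;
});

// Call this from the .then() block, right after go.run(inst):
function markWasmReady(exports) {
	resolveExports(exports);
}

// Elsewhere on the page, await the exports rather than touching inst directly:
async function callAdd(a, b) {
	const exports = await exportsPromise;
	return exports.add(a, b);
}
```

In the loader above you would call `markWasmReady(inst.exports)` right after `go.run(inst)`; any UI handler can then `await callAdd(...)` safely, even if it fires before the blob has finished loading.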
You need to doubly pay attention to this with Web Workers, as we'll see.</p>\n<p>Note also the <code>inst.exports.add</code> call - that's because I used the annotation to publish my interface. If I'd used the <code>js.Global().Set("add"...</code> technique then I could just have called <code>add</code> directly.</p>\n<p>One gotcha you will face at this point is that if you have a bug in your Wasm code, in the browser console it'll appear as an error in <code>wasm_exec.js</code> rather than you getting anything useful about your Go code.</p>\n<h1>Web Workers</h1>\n<h2>Just Web Workers</h2>\n<p>Javascript has always been single threaded, and Wasm follows in that model. But for the projects I have in mind around geospatial work, I'm interested in whether we can run things in parallel on the client side. Thankfully we now have <a href=\"https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers\">Web Workers</a>, which is a way to set up one or more worker threads. These threads are just Javascript modules (files) that you can ask to be set up as a worker, and they have a message queue to which you can request they do work, and another queue through which they can send responses. The workers themselves are also single threaded, so if you send them two requests they will service one fully before they service the next one.</p>\n<p>The API for this is really quite simple and clean. 
You first create your worker logic in a Javascript file:</p>\n<pre><code>// worker.js\n\n// something to do some work\nfunction add(a, b) { return a + b }\n\nonmessage = ({ data }) => {\n\tconst { action, payload } = data;\n\tswitch (action) {\n\t\tcase "add":\n\t\t\tconst { x, y } = payload;\n\t\t\tconst res = add(x, y);\n\t\t\tpostMessage({ action: "result", payload: res });\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tthrow (`unknown action '${action}'`);\n\t}\n};\n</code></pre>\n<p>And then in your main Javascript file you write:</p>\n<pre><code>const worker = new Worker("worker.js");\nworker.onmessage = ({ data }) => {\n\tlet { action, payload } = data;\n\tswitch (action) {\n\t\tcase "result":\n\t\t\tconsole.log("we got a result: ", payload);\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tconsole.error(`Unknown action: ${action}`);\n\t}\n};\n\n...\n\nworker.postMessage({action: "add", payload: {x: 4, y: 4}});\n\n</code></pre>\n<p>As mentioned before, a worker cannot itself do parallel work, but you can call <code>new Worker()</code> multiple times to create many worker threads that you then send tasks to.</p>\n<h2>Web Workers with Wasm</h2>\n<p>Finally we can pull this together. The basic architecture here is that we load a web worker that in turn loads a Wasm module. 
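Picking up the earlier point that you can call `new Worker()` multiple times, the fan-out can be as simple as a round-robin dispatcher. The `WorkerPool` class below is my own sketch, not part of the Web Workers API:

```javascript
// A minimal round-robin dispatcher over Worker-like objects. Anything with
// a postMessage method will do, so in a browser you could construct it as:
//   new WorkerPool([...Array(4)].map(() => new Worker("worker.js")))
// WorkerPool is an illustrative name, not a standard API.
class WorkerPool {
	constructor(workers) {
		this.workers = workers;
		this.next = 0;
	}

	// Send a task to the next worker in rotation.
	post(message) {
		this.workers[this.next].postMessage(message);
		this.next = (this.next + 1) % this.workers.length;
	}
}
```

Each worker still services its own queue serially, so with four workers at most four tasks run at once; anything beyond that just queues on whichever worker it was posted to.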
This is simple enough, with the one caveat that you need to have the web worker tell you when it's ready due to that aforementioned loading delay for Wasm blobs (or, I guess, you could have requests fail before then, but that's not how I roll).</p>\n<p>So, our worker now looks something like this:</p>\n<pre><code>// worker.js\n\nimportScripts("wasm_exec.js");\n\nconst go = new Go();\nlet exports;\nWebAssembly.instantiateStreaming(fetch("main.wasm"), go.importObject).then((result) => {\n\texports = result.instance.exports;\n\tgo.run(result.instance);\n\tpostMessage({ action: "ready", payload: null });\n}).catch((err) => {\n\tconsole.error("Worker failed to load WASM module: ", err)\n});\n\nonmessage = ({ data }) => {\n\tconst { action, payload } = data;\n\tswitch (action) {\n\t\tcase "add":\n\t\t\tconst { x, y } = payload;\n\t\t\tconst res = exports.add(x, y);\n\t\t\tpostMessage({ action: "result", payload: res });\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tthrow (`unknown action '${action}'`);\n\t}\n};\n</code></pre>\n<p>Note that extra <code>postMessage</code> in the success handler for loading the Wasm blob, that is there to tell the main Javascript code that this worker is now actually ready to do something.</p>\n<p>In the main code we can have something like:</p>\n<pre><code>const worker = new Worker("worker.js");\nworker.onmessage = ({ data }) => {\n\tlet { action, payload } = data;\n\tswitch (action) {\n\t\tcase "ready":\n\t\t\tworker.postMessage({action: "add", payload: {x: 4, y: 4}});\n\t\t\tbreak;\n\t\tcase "result":\n\t\t\tconsole.log("we got a result: ", payload);\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tconsole.error(`Unknown action: ${action}`);\n\t}\n};\n</code></pre>\n<p>This is a bit of a silly artificial example I've used in the code snippets here, but you can see a real working version for generating that opening fractal using 12 parallel Wasm workers <a href=\"https://github.com/mdales/go-wasm-web-worker-test\">here</a>.</p>\n\n<ol>\n<li>\n<p>I found myself shuddering 
slightly at this apparent return to <a href=\"https://en.wikipedia.org/wiki/Java_applet\">Java applets</a> and <a href=\"https://en.wikipedia.org/wiki/ActiveX\">ActiveX</a>. At least the security model is better thought out this time around it seems.</p>\n<span><a href=\"#ref-1-fn-1\">\u21a9\ufe0e\ufe0e</a></span></li></ol>",
+14
mwd/blog_hosting24_.json
···+"summary": "<p>Starting in around 2018 or so I decided to start reclaiming my Internet. Before then, as most people had I suspect, I'd drifted into relying on cloud services for just about everything I could possibly want, but by that year I think it'd become apparent to me that I didn't like relying on other companies as the main host of my digital legacy. There was no one thing, but it seems time and time again large internet service providers like to make sure I feel glad that I've done this by finding new ways to exploit their users' data.</p>\n<p>That said, I'm somewhat pragmatic about it, as hosting all the things is hard and time consuming, and I really don't like doing system administration. There's nothing wrong with it, and I know many people who enjoy doing that, but it doesn't really bring me joy. So I've ended up with a mish-mash of things that I personally host locally, personally host on other servers, and let other people host for me.</p>\n<p>Anyway, this page is an attempt to snapshot what I have done and why, along with what has worked and what needs work still. I do this both to help future me and as I occasionally see posts of people wondering if they should or shouldn't host their own stuff, and perhaps this'll help them decide if that's right for their needs or not.</p>\n<h1>Self hosted cloud</h1>\n<p>As already stated, I dislike having to do system administration any more than necessary, so cloud servers I host have to be very simple to both set up and maintain, and it\u2019s why some things I just don\u2019t do (e.g., see email later). 
But I have ended up hosting a bunch of things that are web based on a small set of VPSs (aka virtual machines someone else hosts), all of which are mostly quite simple to set up, and all have very similar dependencies which makes maintenance easier.</p>\n<p>For this I still use <a href=\"https://linode.com/\">Linode</a> for hosting as their control panel is super easy for me to use, they are relatively cheap, and my needs are quite simple.</p>\n<h2>Websites</h2>\n<p>I host three websites, my <a href=\"https://mynameismwd.org/\">personal one</a>, this one for tech stuff, and <a href=\"https://mwdales-guitars.uk/\">one for my guitar building and other maker things</a>. At times it\u2019s felt important to draw a line between the three, though these days I\u2019m less sure of the need to do so, but I know I\u2019m in good company of having far too many blogs.</p>\n<p>All my sites are static sites: that is the content is served from files of content on disk, and not generated on request from some service using live code and a database. In the past I've used those, but they need updating and they store content in a database which makes it hard to fiddle with. At some point I said to <a href=\"https://youtu.be/oZAg00QRgOs?si=9cX4EI14Xrey8cYN&t=237\">Niddrie</a> with all that, and hosted my things on Squarespace, but as I got more and more content I was frustrated at the lack of flexibility, so finally went back to hosting myself.</p>\n<p>This time though I write everything in <a href=\"https://daringfireball.net/projects/markdown/\">markdown</a> and convert that to HTML pages using a tool called <a href=\"https://gohugo.io\">Hugo</a>, and then I use rsync to copy the generated pages and images from my laptop to the server. The server is just using <a href=\"https://nginx.org\">Nginx</a> for actually serving the data, and <a href=\"https://letsencrypt.org\">Let\u2019s Encrypt</a> for the HTTPS certificates. 
As I'm a nerd I have taken advantage of a bunch of features of Hugo in terms of letting me generate things from data as well as text, but that's not necessary if you don't want to.</p>\n<p>The nice thing about this setup is that there\u2019s next to no state on the machine that can\u2019t be recreated by just recompiling the site on my laptop and copying it over again, so if the machine was deleted I don\u2019t really care, I could set it back up again pretty quickly. That said, I do use Linode's backup system so that I can just restore the machine if I need to.</p>\n<p>One downside of my current setup is that it means I need a copy of my site to edit it - I can't just log into a web portal to add new content from any old device I happen to be on. To make this post I had to get my laptop that has the site code on it. The site is in git, so I could check it out on another device, but you get the idea - this isn't ideal for those who need to share the editing responsibility for instance.</p>\n<p>All of what I do here is fairly typical of anyone hosting a static website, but with one slight oddity: search. I wanted a search facility, and for that I need something dynamic to do the looking up when someone enters a search term. I did a long time ago use a client-side search library, where the client downloads a corpus of pages and terms and searches it using javascript in their browser, but at some point that no longer scaled. I failed to find an existing search system for a static site, so in the end I <a href=\"https://github.com/mdales/GuiltySpark\">wrote my own</a> - a little search engine in Swift that does run on my server. 
Thankfully this is just a single binary to run and it doesn't use a database, so it's easy for me to administer.</p>\n<p>One final note here: I have found that having my own website that's simple to administer is fun, in that I can play a little bit when the mood takes me with new CSS stuff for instance, but the static nature of the website stops me getting too carried away and creating a beast that would be un-fun again.</p>\n<h2>Matrix</h2>\n<p>The bane of my computing life is that I seem to need to run half a dozen different IM clients to talk to people, and all of those are based on commercial, closed servers (even supposedly open services like Signal and Wire). At work we use <a href=\"https://matrix.org\">Matrix</a> for our group and one-to-one chat, and although I could just register on someone else's server, I opted to set my own Matrix server up so I can own my identity here - Matrix is a federated system, so although I exist via my own server, I can partake in discussions on other servers as if they were one place. I feel that, where possible, I want to control my own identity, which is probably a theme of what I've done with all this hosting.</p>\n<p>Thankfully for me running a Matrix server was just a case of installing <a href=\"https://www.postgresql.org\">Postgres</a> and <a href=\"https://github.com/element-hq/synapse\">Matrix Synapse</a>, and hiding them behind my existing Nginx/Let\u2019s Encrypt set up. Postgres is non-trivial if you're not a computer person, but I've done it before, and once set up it's low maintenance for light workloads. I\u2019ve seen others say that Matrix servers are a pain to manage, but my experience has been that it\u2019s needed very little maintenance - but with the caveat that I don't do any bridging between Matrix and other services like IRC or Slack, which might be why I see so many people complain about the process. 
But for me, touch keyboard, it's been hassle free.</p>\n<h2>Fediverse</h2>\n<p>I have in the past made a few stabs at hosting my own social media (remember <a href=\"https://mynameismwd.org/posts/who-do-you-trust-with-your-140-characters/\">identi.ca?</a>), and generally found it a pain to manage so went back to other services. I switched to Mastodon in 2017, which is another federated system where you can join one server and still see things from people on other servers (like email somewhat). Specifically I joined the server <a href=\"https://mastodon.me.uk/\">mastodon.me.uk</a>, set up by James Smith, someone I know and trust. Whilst James has done, and continues to do, a great job with that server, I knew at some point I\u2019d like to try hosting my own server as, well, that's how I am. However, this was tempered by seeing how much pain it was to run a Mastodon instance yourself.</p>\n<p>But I've been watching for a couple of years the progress of <a href=\"https://gotosocial.org\">GoToSocial</a>, a fediverse server aimed at just one or a few users, and once I felt it had enough basic features for me to get by with - particularly it had support for user migrations so I could move without losing my social graph - I made the hop and now I\u2019m on my own instance. GoToSocial is just a single binary to run and needs the same set of dependencies I already run (Postgres, Nginx, Let\u2019s Encrypt) and so has been (thus far) easy going. 
In hindsight, my fear that it was trying to balloon into some other service that would own more of my data was unfounded, and most people still seem happy with Dropbox, but I\u2019ve no real regrets from jumping ship, as it was a key inspiration for starting to own my own digital footprint, which started at home.</p>\n<h2>File storage</h2>\n<p>I have more data than fits on a single computer, or indeed I\u2019d want on a single computer. I\u2019m fortunate to have grown up in a university environment where networked drives were the normal way of hosting data, so that\u2019s what I do today: I keep all of my data on a NAS device. I use a Synology device with 4 discs in it giving me a redundant array with about 12TB of space on it (redundant in that if one drive fails I can survive until it is replaced with no data loss). 12TB is enough for me currently, and in terms of performance it's been fine. Even before I added wired ethernet in my home office, and added SSD caches to the Synology, I was able to edit photo RAW files stored on the NAS from my laptop without much issue - and now with all those improvements it's pretty seamless. It worked a bit more smoothly on Windows than on Mac, as Windows will automount references to remote drives if it can, whereas on my Mac I need to remember to mount the drive when I want to use it, but small details.</p>\n<p>The downside with taking all my files home with me is sharing large files with other people. Whilst Synology do have a way to make it possible to share files over the Internet, I don't want the one box with all my personal data on it to be that accessible from the Internet, so I don't allow that. As it is, I don\u2019t have a good solution for sharing files with other people via the Internet currently, other than dumping files to my web server and sending people the URLs. 
This isn\u2019t a very scalable solution, but given how infrequently I need to do this in practice currently I\u2019ve not been motivated to do better.</p>\n<p>There are other friction points, like Apple Photos won't work with a NAS drive, and I have so many photos that I'd need to be giving Apple an awful lot of money each month to keep them in iCloud. So I have this weird system where the last couple of years of photos are easy to get from Apple Photos, and everything else I need to go looking for it on my NAS. I don't mind this, but it is a friction point and so you might not like that. Synology has some photo management software, and I use it as it's easy to add, but it's pretty poor compared to Photos.</p>\n<h2>Source code management</h2>\n<p>This is I feel a bit of a failed experiment. I have a <a href=\"https://github.com/go-gitea/gitea\">Gitea</a> instance running on my NAS, which I use to host all my code repositories so I have a working copy on premise, which means I can work even if my connection to the Internet goes away, and is a place I can host client code when doing contract work and rest a little easier than hosting it in a private repo on GitHub where an accidental click might expose it. Also I can use as much space as I want without hitting paid limits as I have done on GitHub too - git LFS makes it easy to store a lot of data in git, but GitHub really wants you to pay for that. There's no way I could affordably store my websites on GitHub for instance.</p>\n<p>The failure though is that I have a lot of code that I want to be public and accessible, and for that I still end up keeping it on GitHub and my Gitea instance, but I mostly forget to update both, so I just end up using GitHub for most public things in practice. 
I could host Gitea on a VPS in theory, but Gitea is a pain to admin too, with certain updates needing you to make sure you've done all the in-between updates, and so I am running some old version at home, and I'd not trust that on a public server. And also because it's not federated I'd lose the benefits of having my code where people can easily fork it and submit changes etc. So this whole thing I consider a bit of a failure.</p>\n<h1>Using third party cloud services</h1>\n<p>As much as I like to host my own things, as I said in the opening, I don\u2019t actually like tinkering, so there\u2019s a limit to what I\u2019ll host myself. Similarly there\u2019s just some things where you do want to be in the place other people are due to lack of federation.</p>\n<h2>Domain hosting</h2>\n<p>All my domains are hosted by <a href=\"https://www.mythic-beasts.com\">Mythic Beasts</a>, who are a local company and I know the folk who run it, so that\u2019s a bit of a no brainer.</p>\n<h2>B2</h2>\n<p>Whilst my NAS is locally redundant, if my office were to vaporise, then having redundant disks in the same place doesn\u2019t help me. This is particularly worrying as I back up my laptop to my NAS.</p>\n<p>To help give me a safety net here, I used the Synology cloud-sync feature to back up my NAS to <a href=\"https://www.backblaze.com/cloud-storage\">BackBlaze\u2019s B2 storage service</a>. This gives me offsite data redundancy. I do know others who have a second NAS in another place and sync the two, but this is considerably cheaper for the most part, at the expense of being a lot slower to recover from.</p>\n<h2>Email and VPN</h2>\n<p>I tried many options here, before moving all my email to <a href=\"https://proton.me/mail\">Proton mail</a>. Running email is hard, and after I decided to move away from GMail and the like, I tried a bunch of different small hosting companies, and I had problems with all of them with Google and co marking my email as spam. 
Given I was running a business this was not acceptable. In the end I found Proton to be big enough to not get flagged, but small enough I feel like I'm supporting a company with good intent.</p>\n<p>Similarly I use Proton's VPN service. At times I do wish I ran my own VPN back to my office so that I could access resources I have on my home network (like my Synology), but that's not been enough of a pain point for me to want to go to the efforts of setting it up and maintaining it.</p>\n<h2>Misc others</h2>\n<p>I\u2019ve never stopped using RSS as the way I keep up with the internet, and for a while after the demise of Google Reader I did host my own instance of <a href=\"https://tt-rss.org\">Tiny Tiny RSS</a>, but it was something that was never quite good enough, so it became a management overhead as I kept updating it in hope it\u2019d get better. In the end I moved to a paid cloud service called <a href=\"https://feedbin.com\">Feedbin</a> and I\u2019ve never looked back. Feedbin has been a great service and I\u2019ve no issues with recommending it.</p>\n<p>I do a lot of photography, and although I\u2019ve spent a lot of effort trying to make my photography presentable on my own site, I do still also post things to Flickr, as I find the community there to be a good one, from which I get some good feedback. Truth be told I actually post to Flickr first and then sync to my personal site, as I already had a script for going from Flickr to markdown. If I had the time and the inclination I\u2019d swap that around to post to my own site first then sync that to Flickr, but what I do today works well enough that I\u2019ve never needed to switch that.</p>\n<p>Source code we've covered already.</p>\n<h1>Summary</h1>\n<p>As you can see, it's a very mixed bag of solutions I've come up with as I try to keep ownership of my Internet presence. 
I make no attempt to suggest any of this is optimal - more it's just a mix of pragmatism based on what I need versus how much effort I'm willing to put into it. But still, I see people talking about self hosting often, and so perhaps this'll both show you some options, and normalise the idea that it doesn't need to be perfect or all-or-nothing for you to make some inroads into taking back control of how you exist on the Internet. I don't use Azure or AWS either, for instance; for a while I did, but I don't really operate at the scale where the complexity is justified - I'm not really dealing with a lot of traffic for most things. I don't deal with CDNs like Cloudflare either, as I'm not that big, and the few people who do want to see my websites probably don't care if it's a few seconds slower than it could be.</p>\n<p>I'll perhaps try to do a follow-on post in a couple of years to see how much, if anything, has changed.</p>",+"content": "<p>Starting in around 2018 or so I decided to start reclaiming my Internet. Before then, as most people had I suspect, I'd drifted into relying on cloud services for just about everything I could possibly want, but by that year I think it'd become apparent to me that I didn't like relying on other companies as the main host of my digital legacy. There was no one thing, but it seems time and time again large internet service providers like to make sure I feel glad that I've done this by finding new ways to exploit their users' data.</p>\n<p>That said, I'm somewhat pragmatic about it, as hosting all the things is hard and time consuming, and I really don't like doing system administration. There's nothing wrong with it, and I know many people who enjoy doing that, but it doesn't really bring me joy. 
So I've ended up with a mish-mash of things that I personally host locally, personally host on other servers, and let other people host for me.</p>\n<p>Anyway, this page is an attempt to snapshot what I have done and why, along with what has worked and what needs work still. I do this both to help future me and as I occasionally see posts of people wondering if they should or shouldn't host their own stuff, and perhaps this'll help them decide if that's right for their needs or not.</p>\n<h1>Self hosted cloud</h1>\n<p>As already stated, I dislike having to do system administration any more than necessary, so cloud servers I host have to be very simple to both set up and maintain, and it\u2019s why some things I just don\u2019t do (e.g., see email later). But I have ended up hosting a bunch of things that are web based on a small set of VPSs (aka virtual machines someone else hosts), all of which are mostly quite simple to set up, and all have very similar dependencies which makes maintenance easier.</p>\n<p>For this I still use <a href=\"https://linode.com/\">Linode</a> for hosting as their control panel is super easy for me to use, they are relatively cheap, and my needs are quite simple.</p>\n<h2>Websites</h2>\n<p>I host three websites, my <a href=\"https://mynameismwd.org/\">personal one</a>, this one for tech stuff, and <a href=\"https://mwdales-guitars.uk/\">one for my guitar building and other maker things</a>. At times it\u2019s felt important to draw a line between the three, though these days I\u2019m less sure of the need to do so, but I know I\u2019m in good company of having far too many blogs.</p>\n<p>All my sites are static sites: that is the content is served from files of content on disk, and not generated on request from some service using live code and a database. In the past I've used those, but they need updating and they store content in a database which makes it hard to fiddle with. 
At some point I said to <a href=\"https://youtu.be/oZAg00QRgOs?si=9cX4EI14Xrey8cYN&t=237\">Niddrie</a> with all that, and hosted my things on Squarespace, but as I got more and more content I was frustrated at the lack of flexibility, so finally went back to hosting myself.</p>\n<p>This time though I write everything in <a href=\"https://daringfireball.net/projects/markdown/\">markdown</a> and convert that to HTML pages using a tool called <a href=\"https://gohugo.io\">Hugo</a>, and then I use rsync to copy the generated pages and images from my laptop to the server. The server is just using <a href=\"https://nginx.org\">Nginx</a> for actually serving the data, and <a href=\"https://letsencrypt.org\">Let\u2019s Encrypt</a> for the HTTPS certificates. As I'm a nerd I have taken advantage of a bunch of features of Hugo in terms of letting me generate things from data as well as text, but that's not necessary if you don't want to.</p>\n<p>The nice thing about this setup is that there\u2019s next to no state on the machine that can\u2019t be recreated by just recompiling the site on my laptop and copying it over again, so if the machine was deleted I don\u2019t really care, I could set it back up again pretty quickly. That said, I do use Linode's backup system so that I can just restore the machine if I need to.</p>\n<p>One downside of my current setup is that it means I need a copy of my site to edit it - I can't just log into a web portal to add new content from any old device I happen to be on. To make this post I had to get my laptop that has the site code on it. The site is in git, so I could check it out on another device, but you get the idea - this isn't ideal for those who need to share the editing responsibility for instance.</p>\n<p>All of what I do here is fairly typical of anyone hosting a static website, but with one slight oddity: search. 
I wanted a search facility, and for that I need something dynamic to do the looking up when someone enters a search term. I did a long time ago use a client-side search library, where the client downloads a corpus of pages and terms and searches it using javascript in their browser, but at some point that no longer scaled. I failed to find an existing search system for a static site, so in the end I <a href=\"https://github.com/mdales/GuiltySpark\">wrote my own</a> - a little search engine in Swift that does run on my server. Thankfully this is just a single binary to run and it doesn't use a database, so it's easy for me to administer.</p>\n<p>One final note here: I have found that having my own website that's simple to administer is fun, in that I can play a little bit when the mood takes me with new CSS stuff for instance, but the static nature of the website stops me getting too carried away and creating a beast that would be un-fun again.</p>\n<h2>Matrix</h2>\n<p>The bane of my computing life is that I seem to need to run half a dozen different IM clients to talk to people, and all of those are based on commercial, closed servers (even supposedly open services like Signal and Wire). At work we use <a href=\"https://matrix.org\">Matrix</a> for our group and one-to-one chat, and although I could just register on someone else's server, I opted to set my own Matrix server up so I can own my identity here - Matrix is a federated system, so although I exist via my own server, I can partake in discussions on other servers as if they were one place. I feel that, where possible, I want to control my own identity, which is probably a theme of what I've done with all this hosting.</p>\n<p>Thankfully for me running a Matrix server was just a case of installing <a href=\"https://www.postgresql.org\">Postgres</a> and <a href=\"https://github.com/element-hq/synapse\">Matrix Synapse</a>, and hiding them behind my existing Nginx/Let\u2019s Encrypt set up. 
Postgres is non-trivial if you're not a computer person, but I've done it before, and once set up it's low maintenance for light workloads. I\u2019ve seen others say that Matrix servers are a pain to manage, but my experience has been that it\u2019s needed very little maintenance - but with the caveat that I don't do any bridging between Matrix and other services like IRC or Slack, which might be why I see so many people complain about the process. But for me, touch keyboard, it's been hassle free.</p>\n<h2>Fediverse</h2>\n<p>I have in the past made a few stabs at hosting my own social media (remember <a href=\"https://mynameismwd.org/posts/who-do-you-trust-with-your-140-characters/\">identi.ca?</a>), and generally found it a pain to manage so went back to other services. I switched to Mastodon in 2017, which is another federated system where you can join one server and still see things from people on other servers (somewhat like email). Specifically I joined the server <a href=\"https://mastodon.me.uk/\">mastodon.me.uk</a>, set up by James Smith, someone I know and trust. Whilst James has done, and continues to do, a great job with that server, I knew at some point I\u2019d like to try hosting my own server as, well, that's how I am. However, this was tempered by seeing how much pain it was to run a Mastodon instance yourself.</p>\n<p>But I've been watching for a couple of years the progress of <a href=\"https://gotosocial.org\">GoToSocial</a>, a fediverse server aimed at just one or a few users, and once I felt it had enough basic features for me to get by with - particularly support for user migrations, so I could move without losing my social graph - I made the hop and now I\u2019m on my own instance. GoToSocial is just a single binary to run and needs the same set of dependencies I already run (Postgres, Nginx, Let\u2019s Encrypt) and so has been (thus far) easy going. 
You can even use an SQLite file database if you don't want to run Postgres, and the authors think that's good enough for a single-user, low-traffic instance.</p>\n<h1>Self hosted at home</h1>\n<p>I remember when Dropbox went from being a thing that synced just a few files to trying to be a portal for all kinds of things, and at that point I deleted my account. In hindsight, my fear that it was trying to balloon into some other service that would own more of my data was unfounded, and most people still seem happy with Dropbox, but I\u2019ve no real regrets from jumping ship, as it was a key inspiration for starting to own my own digital footprint, which started at home.</p>\n<h2>File storage</h2>\n<p>I have more data than fits on a single computer, or indeed than I\u2019d want on a single computer. I\u2019m fortunate to have grown up in a university environment where networked drives were the normal way of hosting data, so that\u2019s what I do today: I keep all of my data on a NAS device. I use a Synology device with 4 discs in it giving me a redundant array with about 12TB of space on it (redundant in that if one drive fails I can survive until it is replaced with no data loss). 12TB is enough for me currently, and in terms of performance it's been fine. Even before I added wired ethernet in my home office, and added SSD caches to the Synology, I was able to edit photo RAW files stored on the NAS from my laptop without much issue - and now with all those improvements it's pretty seamless. It worked a bit more smoothly on Windows than on Mac, as Windows will automount references to remote drives if it can, whereas on my Mac I need to remember to mount the drive when I want to use it, but these are small details.</p>\n<p>The downside with taking all my files home with me is sharing large files with other people. 
Whilst Synology do have a way to make it possible to share files over the Internet, I don't want the one box with all my personal data on it to be that accessible from the Internet, so I don't allow that. As it is, I don\u2019t currently have a good solution for sharing files with other people via the Internet, other than dumping files to my web server and sending people the URLs. This isn\u2019t a very scalable solution, but given how infrequently I need to do this in practice I\u2019ve not been motivated to do better.</p>\n<p>There are other friction points, like Apple Photos won't work with a NAS drive, and I have so many photos that I'd need to be giving Apple an awful lot of money each month to keep them in iCloud. So I have this weird system where the last couple of years of photos are easy to get from Apple Photos, and for everything else I need to go looking on my NAS. I don't mind this, but it is a friction point and so you might not like that. Synology has some photo management software, and I use it as it's easy to add, but it's pretty poor compared to Photos.</p>\n<h2>Source code management</h2>\n<p>This is, I feel, a bit of a failed experiment. I have a <a href=\"https://github.com/go-gitea/gitea\">Gitea</a> instance running on my NAS, which I use to host all my code repositories so I have a working copy on premise, which means I can work even if my connection to the Internet goes away, and it's a place I can host client code when doing contract work and rest a little easier than hosting it in a private repo on GitHub where an accidental click might expose it. Also I can use as much space as I want without hitting paid limits as I have done on GitHub too - git LFS makes it easy to store a lot of data in git, but GitHub really wants you to pay for that. 
There's no way I could affordably store my websites on GitHub, for instance.</p>\n<p>The failure though is that I have a lot of code that I want to be public and accessible, and for that I still end up keeping it on GitHub as well as my Gitea instance, but I mostly forget to update both, so in practice I just end up using GitHub for most public things. I could host Gitea on a VPS in theory, but Gitea is a pain to admin too, with certain updates needing you to make sure you've done all the in-between updates, and so I am running some old version at home, and I'd not trust that on a public server. And also, because it's not federated, I'd lose the benefits of having my code where people can easily fork it and submit changes etc. So this whole thing I consider a bit of a failure.</p>\n<h1>Using third party cloud services</h1>\n<p>As much as I like to host my own things, as I said in the opening, I don\u2019t actually like tinkering, so there\u2019s a limit to what I\u2019ll host myself. Similarly there\u2019s just some things where you do want to be in the place other people are due to lack of federation.</p>\n<h2>Domain hosting</h2>\n<p>All my domains are hosted by <a href=\"https://www.mythic-beasts.com\">Mythic Beasts</a>, who are a local company and I know the folk who run it, so that\u2019s a bit of a no brainer.</p>\n<h2>B2</h2>\n<p>Whilst my NAS is locally redundant, if my office were to vaporise, then having redundant disks in the same place doesn\u2019t help me. This is particularly worrying as I back up my laptop to my NAS.</p>\n<p>To help give me a safety net here, I use the Synology cloud-sync feature to back up my NAS to <a href=\"https://www.backblaze.com/cloud-storage\">Backblaze\u2019s B2 storage service</a>. This gives me offsite data redundancy. 
I do know others who have a second NAS in another place and sync the two, but this is considerably cheaper for the most part, at the expense of being a lot slower to recover from.</p>\n<h2>Email and VPN</h2>\n<p>I tried many options here, before moving all my email to <a href=\"https://proton.me/mail\">Proton Mail</a>. Running email is hard, and after I decided to move away from GMail and the like, I tried a bunch of different small hosting companies, and I had problems with all of them with Google and co marking my email as spam. Given I was running a business this was not acceptable. In the end I found Proton to be big enough to not get flagged, but small enough I feel like I'm supporting a company with good intent.</p>\n<p>Similarly I use Proton's VPN service. At times I do wish I ran my own VPN back to my office so that I could access resources I have on my home network (like my Synology), but that's not been enough of a pain point for me to want to go to the efforts of setting it up and maintaining it.</p>\n<h2>Misc others</h2>\n<p>I\u2019ve never stopped using RSS as the way I keep up with the internet, and for a while after the demise of Google Reader I did host my own instance of <a href=\"https://tt-rss.org\">Tiny Tiny RSS</a>, but it was never quite good enough, so it became a management overhead as I kept updating it in the hope it\u2019d get better. In the end I moved to a paid cloud service called <a href=\"https://feedbin.com\">Feedbin</a> and I\u2019ve never looked back. Feedbin has been a great service and I\u2019ve no issues with recommending it.</p>\n<p>I do a lot of photography, and although I\u2019ve spent a lot of effort trying to make my photography presentable on my own site, I do still also post things to Flickr, as I find the community there to be a good one, from which I get some good feedback. 
Truth be told I actually post to Flickr first and then sync to my personal site, as I already had a script for going from Flickr to markdown. If I had the time and the inclination I\u2019d swap that around to post to my own site first then sync that to Flickr, but what I do today works well enough that I\u2019ve never needed to switch.</p>\n<p>Source code we've covered already.</p>\n<h1>Summary</h1>\n<p>As you can see, it's a very mixed bag of solutions I've come up with as I try to keep ownership of my Internet presence. I make no attempt to suggest any of this is optimal - more it's just a mix of pragmatism based on what I need versus how much effort I'm willing to put into it. But still, I see people talking about self hosting often, and so perhaps this'll both show you some options, and normalise the idea that it doesn't need to be perfect or all-or-nothing for you to make some inroads to taking back control of how you exist on the Internet. I don't use Azure or AWS either, for instance - for a while I did, but I don't really operate at the scale where the complexity is justified, as I'm not really dealing with a lot of traffic for most things. I don't deal with CDNs like Cloudflare either, as I'm not that big, and the few people who do want to see my websites probably don't care if it's a few seconds slower than it could be.</p>\n<p>I'll perhaps try to do a follow on post in a couple of years to see how much, if anything, has changed.</p>",
+14
mwd/blog_more-on-icons_.json
···+"summary": "<p>This is a sort of follow on to a post I wrote about <a href=\"/blog/old-icons/\">how icons becoming homogenous and hard to distinguish</a>, and inspired by this image <a href=\"https://twitter.com/flarup/status/1717578963684364578\">posted to social media</a> by <a href=\"https://www.pixelresort.com\">Michael Flarup</a> showing the evolution of the macOS default dock over many versions:</p>\n<div>\n <div>\n \n\n <img src=\"c08167ed83c115c3.jpeg\">\n \n </div>\n</div>\n<p>Whilst I'd be the first to admit to not being a fan of the heavily skeuomorphic apps like the old Notes.app with its faux leather titlebar and the old game center app, I also think that the older icons for notes and such were much more distinctive when you could recognise items by external shape as well as the content.</p>\n<p>This reminder of what we've lost in terms of both usability and character in our icons inspired me to go back to this vibe with my placeholder icon for a little desktop app I've been writing for myself. The app has the working title of "BAM", and so a literal explosion of shape and colour seemed appropriate, and something that would make it easily recognisable whilst hunting for it through all the rounded rectangles that otherwise seem to be all we're allowed:</p>\n<div>\n <div>\n \n\n <img src=\"bamswitcher2.png\">\n \n </div>\n</div>\n<p>Ignoring the amateur quality of my ability to draw for the moment, the BAM icon being something I'd commission someone to replace should I ever release this app (at the moment I'm quite happy with this app having an audience of one), I feel having lived with my attempt to make an icon that harks back to the older days works really will in terms of usability for me: it's never a struggle to pick it out on either the dock or the task switcher in macOS. 
Whilst it doesn't really indicate what the app does, that makes it no worse than say half the other apps on my dock, including Finder itself, and that learning process is a one-off task versus the repeated attempt to pick it out on screen as I want to use it.</p>\n<p>Even when I'm not looking for that app, the child-like playfulness of this icon makes me smile whenever I spot it; there's a bit of character in a sea of icons all trying to be quite serious. It has (for me) a bit of the charm that I miss from computers of the past.</p>\n<p>I'm sure this style doesn't work for everyone, even if it was made nicer by someone with drawing skills: I remember comparing notes with my friend Jason about how we recognised icons, each of us finding icons with letters, colours and shapes differently distinctive, but for now that's the joy of having an app with an audience of one. It'd be interesting to see if there's research into how people respond to different icon shapes to work out if even having a single consistent icon is best, or really apps should come with multiple options that respond to some system-wide preference about what works best for that user. 
Currently many apps do come with themeable icons, such as <a href=\"https://nova.app\">Nova</a>, which is the only non-round-rect on my iconbar, and that's not its default icon, it has dozens for me to pick from, though all but one are round-rects:</p>\n<div>\n <div>\n \n\n <img src=\"novaprefs.png\">\n \n </div>\n</div>\n<p>So there's no technology barrier here, we just need to convince product managers that there's a distinction between making things fit in with an aesthetic and making them usable, which requires a different sort of fitting in.</p>",+"content": "<p>This is a sort of follow on to a post I wrote about <a href=\"/blog/old-icons/\">how icons are becoming homogeneous and hard to distinguish</a>, and inspired by this image <a href=\"https://twitter.com/flarup/status/1717578963684364578\">posted to social media</a> by <a href=\"https://www.pixelresort.com\">Michael Flarup</a> showing the evolution of the macOS default dock over many versions:</p>\n<div>\n <div>\n \n\n <img src=\"c08167ed83c115c3.jpeg\">\n \n </div>\n</div>\n<p>Whilst I'd be the first to admit to not being a fan of the heavily skeuomorphic apps like the old Notes.app with its faux leather titlebar and the old game center app, I also think that the older icons for notes and such were much more distinctive when you could recognise items by external shape as well as the content.</p>\n<p>This reminder of what we've lost in terms of both usability and character in our icons inspired me to go back to this vibe with my placeholder icon for a little desktop app I've been writing for myself. 
The app has the working title of "BAM", and so a literal explosion of shape and colour seemed appropriate, and something that would make it easily recognisable whilst hunting for it through all the rounded rectangles that otherwise seem to be all we're allowed:</p>\n<div>\n <div>\n \n\n <img src=\"bamswitcher2.png\">\n \n </div>\n</div>\n<p>Ignoring the amateur quality of my ability to draw for the moment, the BAM icon being something I'd commission someone to replace should I ever release this app (at the moment I'm quite happy with this app having an audience of one), I feel, having lived with my attempt to make an icon that harks back to the older days, that it works really well in terms of usability for me: it's never a struggle to pick it out on either the dock or the task switcher in macOS. Whilst it doesn't really indicate what the app does, that makes it no worse than say half the other apps on my dock, including Finder itself, and that learning process is a one-off task versus the repeated attempt to pick it out on screen as I want to use it.</p>\n<p>Even when I'm not looking for that app, the child-like playfulness of this icon makes me smile whenever I spot it; there's a bit of character in a sea of icons all trying to be quite serious. It has (for me) a bit of the charm that I miss from computers of the past.</p>\n<p>I'm sure this style doesn't work for everyone, even if it was made nicer by someone with drawing skills: I remember comparing notes with my friend Jason about how we recognised icons, each of us finding icons with letters, colours and shapes differently distinctive, but for now that's the joy of having an app with an audience of one. It'd be interesting to see if there's research into how people respond to different icon shapes to work out if even having a single consistent icon is best, or really apps should come with multiple options that respond to some system-wide preference about what works best for that user. 
Currently many apps do come with themeable icons, such as <a href=\"https://nova.app\">Nova</a>, which is the only non-round-rect on my iconbar, and that's not its default icon, it has dozens for me to pick from, though all but one are round-rects:</p>\n<div>\n <div>\n \n\n <img src=\"novaprefs.png\">\n \n </div>\n</div>\n<p>So there's no technology barrier here, we just need to convince product managers that there's a distinction between making things fit in with an aesthetic and making them usable, which requires a different sort of fitting in.</p>",
+14
mwd/blog_nordic-rse-25_.json
···+"summary": "<p>This is a summary of last week's <a href=\"https://nordic-rse.org/nrse2025/\">2025 Nordic-RSE conference</a>, held in Gothenburg, Sweden. Whilst I'm not technically an Research Software Engineer (RSE), a lot of my role involves essentially the same activities in working on ecology pipelines like <a href=\"https://github.com/quantifyearth/LIFE/\">LIFE</a>, <a href=\"https://github.com/quantifyearth/STAR\">STAR</a>, and so on; indeed I'm a member of the UK <a href=\"https://society-rse.org\">Society of Research Software Engineering</a>. Not only do I effectively act as an RSE a good amount of my time, but it's also a part of my job I enjoy: collaborating with experts in other fields whilst getting to use my own expertise and learning something along the way is often quite satisfying.</p>\n<p>My role at the conference was twofold: to learn more about how others are working in the domain so I can pick up things for when I am an acting-RSE, but then also with the other side of my role as someone who is trying to build tools to support reproducible/repeatable scientific pipelines, looking at how our work to date on things like <a href=\"https://github.com/quantifyearth/shark/\">Shark</a> might connect with that.</p>\n<p>Disclaimer: all these summaries are projected through my own thoughts, so what I put here isn't necessarily the opinion of the speaker, but rather my interpretation. If you want a simpler summary of just the facts, you can <a href=\"https://hackmd.io/yivTsaSzR3qGDXSSwD4JIQ?both\">look at the group notes form the event</a>. 
Apologies to speakers if I've misinterpreted their words - please do correct me if so!</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A group photo of about forty research software engineers stood or knelt for a group photo inside a building.\" src=\"NRSE25_group.jpg\">\n\n</div>\n</div>\n\n<p></p>\n<p>(Thanks to the organisers for taking a picture of us all!)</p>\n<h1>Day 1</h1>\n<h2>Intro by <a href=\"https://www.gu.se/en/about/find-staff/matteotomasini2\">Matteo Tomasini</a>, <a href=\"https://nordic-rse.org\">Nordic RSE</a></h2>\n<p>One of the things I loved about the conference was that it was still small enough that I got to know a good proportion of the attendees throughout the conference. In the introduction Matteo Tomasini revealed that there were 45 people this year, up from 30 last year, which was also the first year.</p>\n<p>There was a bit about what makes an RSE, particularly as in most institutions in the nordics (except Aalto) there is no official RSE job (unlike in UK universities where RSE is now an officially recognised role). Generally in the RSE community, both in the UK and in the Nordics, it is recognised that a lot of people act as de facto RSEs without having the term in their job title, and as such I've found both communities to be welcoming to those of us who self-identify as RSEs, and thus it was with this conference. 
Matteo defined it as:</p>\n<ul>\n<li>If you develop software for research</li>\n<li>You're the go-to in your group for software work/questions</li>\n<li>You support the other researchers in your group</li>\n<li>If you feel like one</li>\n</ul>\n<p>I liked this broad definition in the opening, as it made it clear that everyone was welcome here.</p>\n<p>Matteo also touched on what Nordic-RSE does:</p>\n<ul>\n<li>This conference</li>\n<li>Has a community Zulip chat for members</li>\n<li>A weekly online coffee meet (9am CET on Thursdays)</li>\n<li>Bi-weekly online community meeting</li>\n</ul>\n<p>It's clear the group has ambitions to help foster the RSE role in the Nordics, and throughout the conference the UK's <a href=\"https://society-rse.org\">Society of Research Software Engineering</a> (of which I'm a member, though I couldn't make their conference last year) was cited as being about 5 years ahead of where this group wanted to be.</p>\n<h2>Keynote: Clarity, Not Consensus: Rethinking Unity in Open Science by <a href=\"https://rmwillen.info/wp-content/uploads/2024/09/rebecca-willen-cv-2024-1.pdf\">Rebecca Will\u00e9n</a>, <a href=\"https://igdore.org\">IGDORE</a></h2>\n<p>This was an interesting keynote on the quest for "open science". Rebecca Will\u00e9n is the founder of <a href=\"https://igdore.org\">IGDORE</a>, the Institute for Globally Distributed Open Research and Education, which she founded after the end of her PhD as a champion for reproducible science.</p>\n<p>She started by explaining there was a revolution in psychology in 2012, with research looking at the field of psychology and questioning the reproducibility of the results and how selective people were being about what they presented. This isn't necessarily scientific misconduct, but with the push to get published people might slip into what are defined as Questionable Research Practices (QRPs). 
Examples of this were:</p>\n<ul>\n<li>P-hacking or data torture (selective results)</li>\n<li>HARKing (Hypothesising After the Results are Known) - the practice of finding a thing of interest in the data and then pretending that this was your hypothesis all along</li>\n</ul>\n<p>The QRP framing is meant to go beyond the deliberate misleading, and as a computer scientist interested in tools for reproducibility who has worked with many <a href=\"https://www.researchgate.net/publication/359725248_Myths_and_mythconceptions_what_does_it_mean_to_be_a_programming_language_anyhow\">vernacular programmers</a>, I think that computers amplify QRPs, by making it hard to do a good job of understanding lineage/provenance. I need to dig more into QRPs, and I think the citations for this are:</p>\n<ul>\n<li><a href=\"https://journals.sagepub.com/doi/10.1177/0956797611417632\">False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant by Joseph P. Simmons, Leif D. Nelson, and Uri Simonsohn, 2011</a></li>\n<li><a href=\"https://www.cmu.edu/dietrich/sds/docs/loewenstein/MeasPrevalQuestTruthTelling.pdf\">Measuring the Prevalence of Questionable Research Practices With Incentives for Truth Telling by Leslie K. John, George Loewenstein, and Drazen Prelec</a></li>\n</ul>\n<p>I also found this more recent 2021 book, <a href=\"https://academic.oup.com/book/39705\">The Problem with Science: The Reproducibility Crisis and What to do About It by R. Barker Bausell</a> (specifically in <a href=\"https://academic.oup.com/book/39705/chapter/350374120\">chapter 3</a>) that seems to cover the topic in detail. Lots of interesting things to follow up on.</p>\n<p>Back to the talk. 
From this epiphany in the psychology research community in 2012 spun out an attempt to do better - a theme we'll see repeated later in Ina P\u00f6hner's talk in the Pharmacy community - and a push to open science.</p>\n<p>Rebecca then presented what she felt were the five tenets of open science that people talked about, each of which had many subcategories which I didn't manage to record, but the high levels were:</p>\n<ul>\n<li>Open access to knowledge and resources</li>\n<li>Access to infrastructure and enabling tools</li>\n<li>Reproducibility and scientific quality</li>\n<li>Research culture and work environment</li>\n<li>Diversity and inclusion</li>\n</ul>\n<p>The first two were listed as being accepted requirements in the open science world, at least in IGDORE, and the last three were still being debated.</p>\n<p>Rebecca made a comparison at this point to the open source software movement, giving a historic overview and pointing out how over time that movement went from being a moral movement (people should have the right to examine and modify the code they run) to being more of a quality bar (aka, <a href=\"https://en.wikipedia.org/wiki/Gratis_versus_libre\">libre vs gratis</a>).</p>\n<blockquote>\n<p>"The Free Software movement and the Open Source movement are today separate movements with different views and goals, although we can and do work together on some practical projects." - <a href=\"https://www.gnu.org/philosophy/free-software-for-freedom.html\">https://www.gnu.org/philosophy/free-software-for-freedom.html</a></p>\n</blockquote>\n<p>Rebecca identified this theme in the timeline of open science also:</p>\n<ul>\n<li>Open access, arxiv, CC - late 1990s</li>\n<li>Protocols for clinical trials mandatory in 2005 - open and version controlled</li>\n<li>Work showing QRPs are common in 2011</li>\n<li><a href=\"https://osf.io\">Open science framework</a> - developed for psychology, now used in all social science. 
Describes the process of pre-registration - saying what you're doing before the research -</li>\n<li>Added reproducibility to open science with intent that it prevents QRPs</li>\n<li><a href=\"https://cos.io\">The Center for Open Science</a> similar time to the Open Science Framework, but starts to shift from morality to quality similar to that shift in the OSS world</li>\n<li>Another reference to the <a href=\"https://www.unesco.org/en/open-science/toolkit\">UNESCO open science toolkit factsheet 2022</a>, specifically the <a href=\"https://unesdoc.unesco.org/ark:/48223/pf0000383323\">enumeration of its tenets</a> - the quality shift is now appearing here</li>\n</ul>\n<p>My personal opinion is that tech culture did lose track of that morality of openness amid the 'openness speeds up the tech sector' discussion - part of the <a href=\"https://pluralistic.net/2022/11/28/enshittification/\">enshittification</a> we see today I guess, though some of that is just also unchecked capitalism having caught up with naive tech optimism from the prior decades. But I digress.</p>\n<p>At this point I got a little confused as to which tenets Rebecca was advocating for - as I wasn't sure which bits of the original five-tenet list and the UNESCO definition of open science she saw as being about the moral purpose of open science vs the box-ticking of open science whilst doing what you were going to do anyway. But what was clear was that in IGDORE they'd had a loss of momentum because of this pull in different directions over what it means to be open science, and they'd not realised that this split was happening, and so consensus was lost in the organisation and little useful got done for many years as a result.</p>\n<p>So I'm not sure I agree about which tenets should be in or out of a definition of open science, but I do see that the split that happened in the tech community around libre/gratis could also be a challenge for the science community. 
But for me the main takeaway was learning about QRPs, as this has given a name to a whole bunch of things I've thought about but never had a way to tie together.</p>\n<h2>Design Patterns: The secret to smarter, reproducible research by <a href=\"https://codingresearcher.com\">Marine Guyot, CodingResearcher</a></h2>\n<p>The next talk was by Marine Guyot, who is a freelance RSE, and gave a talk on using design patterns in building software for research. The motivation for the talk is what I feel must be a very common pattern, which she told via the persona Anna:</p>\n<ul>\n<li>Anna makes a script to save time for her own research</li>\n<li>Others use it</li>\n<li>Other users ask for small modifications....</li>\n<li>Now Anna is trying to juggle hacking this script vs her own work - bad quality etc. due to time pressures</li>\n</ul>\n<p>Then either at some point it will be recognised as critical and a team will form around it, or Anna will still carry on trying to maintain it and burn out.</p>\n<p>I feel there is another option, which is that the software is abandoned and something is lost, but I guess that's not part of the narrative for a talk on how to design better software.</p>\n<p>The rest of the talk focussed on design patterns in software, a topic I won't try to reiterate here as there are good books on this. The premise is that if you make something useful, others will want changes, and unless you put structure in place to manage those changes early on then you'll pay for it later. Something I suspect most people know (at least by the time they write software a second time :), but I suspect few people think of software as being anything other than a quick thing they do to try to get a result for their work. 
It's like the old question about when is a number of things "many".</p>\n<p>The best nugget was in the Q&A at the end:</p>\n<p>Audience Q: what's the best thing I should do for the handover (from RSE to researcher)\nMarine A: documentation</p>\n<h2>In the modern era, what can we learn from the history of free and open software? by <a href=\"https://research.aalto.fi/en/persons/richard-darst\">Richard Darst</a>, <a href=\"https://www.aalto.fi/en\">Aalto University</a></h2>\n<p>Richard Darst gave a talk on the history of open source software, looking at how it has evolved over time, and then how to deal with some challenges in opening up code (and maybe data or science?) today. Richard's slides are quite readable and <a href=\"https://cryptpad.fr/presentation/#/2/presentation/view/EiU5tmOdvJtbHsybb+DXYYLaHScbxcSN7LXJEJ9R+f8/embed/\">available here</a>, so I won't attempt to recap them here.</p>\n<p>I enjoyed the talk, and learned a bunch about the details of how Debian views things via his overview of <a href=\"https://wiki.debian.org/DebianFreeSoftwareGuidelines\">the Debian Free Software Guidelines</a>, and how they have tests to help decide if a thing is truly open, such as the <a href=\"https://wiki.debian.org/DesertIslandTest\">desert island test</a> and the <a href=\"https://wiki.debian.org/DissidentTest\">dissident test</a>.</p>\n<p>One note that struck a chord after some recent experiences we've had with primary data sources:</p>\n<blockquote>\n<p>"In short term closed may be better, but more people will improve the open option long-term"</p>\n</blockquote>\n<p>In our case, a group making open digital elevation maps that we've used in the past has switched to restrictive licensing, with an open version and a paid version if you want to avoid that, and that feels quite short-sighted, particularly given we're in the midst of a climate emergency.</p>\n<h2>Tutorial: 3D visualisation and manipulation of scientific data in static web applications by <a 
href=\"https://research.chalmers.se/person/joajohan\">Joakim Bohlin</a>, <a href=\"https://infravis.se\">InfraVis, Chalmers University of Technology</a></h2>\n<p>This talk by Joakim Bohlin was on building static web sites for visualising science data. The <a href=\"https://github.com/Akodiat/tutorialNordicRSE25/\">code examples he used are here</a>.</p>\n<p>In the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG group</a> we have quite a strong static-site, self-hosting theme (this website is currently hosted on a raspberry-pi and running <a href=\"https://github.com/mdales/digitalflapjack.com/\">its own static-site generator</a>!), and I also have close to zero interest building frontends for our work that involve me working in React, Vue, or any of the larger contemporary Javascript frameworks that a lot of geospatial visualisation libraries assume you're using. Indeed, I think this is somewhat a point of contention within the group, as there's a clear need for communicating what we do, but because we're effectively mostly people who work at the bottom of the stack, it means no one wants to take time to learn those frameworks, and so we've been poor in communicating what we do.</p>\n<p>I guess this is another RSE thing - we write software, but we can't write <em>all</em> software individually.</p>\n<p>So with that context, I was interested to learn what Joakim had to share: although he can't solve the problem with geospatial visualisation libraries requiring React etc., it was good to know that people are having success delivering usable visualisations with a minimal stack, and if more people are doing that, hopefully we'll eventually see more tooling begin to support this approach.</p>\n<p>Some particularly interesting bits of tooling to me were:</p>\n<ul>\n<li><a href=\"https://pyodide.org/en/stable/\">Pyodide</a> - this lets you run Python in the browser, which Joakim pointed out isn't the best solution, but often if your group works in Python they might 
have existing things that use plotting libraries to generate graphs, and as a first cut at getting that in front of more people, it can just be an easy way to get started. You can combine this with <a href=\"https://pypi.org/project/micropip/\">micropip</a> to include Python packages from the javascript wrapper you use to load Pyodide.</li>\n<li><a href=\"https://vega.github.io/vega-lite/\">Vega-Lite</a> - a native javascript interactive graphing library, which I pronounce to rhyme the first half of the name with \"Sega\" but I fear is a pun based on Vegemite :) In the past I've used <a href=\"https://c3js.org\">C3.js</a> for this sort of thing, but Vega-Lite looked a little easier for making the data interactive.</li>\n</ul>\n<p>There were more, so if this sort of thing catches your interest, do check out the linked examples.</p>\n<h2>Donated data and what Google already knows by <a href=\"https://github.com/rantahar\">Jarno Rantaharju</a>, <a href=\"https://www.aalto.fi/en\">Aalto University</a></h2>\n<p>The premise of this talk was that collecting data in studies of people is hard:</p>\n<ul>\n<li>Takes time, is expensive</li>\n<li>Requires participant effort</li>\n<li>Impacts subject behaviour</li>\n<li>Data is only collected after the study starts</li>\n</ul>\n<p>That last one might seem obvious, but it's a valid point if you wanted to, say, study how the COVID pandemic changed behaviours. Jarno Rantaharju's point was that for a lot of studies the data you might want could already exist in the various cloud services you use, knowingly or not: Google or Netflix already have a lot of data on your behaviours, and thanks to GDPR you can get access to that data as a participant. 
This is being worked on by, amongst others, the <a href=\"https://digitraceslab.com\">DigiTraces Lab</a> at Aalto University, and is referred to as Facilitating Donated Data.</p>\n<p><a href=\"https://www.cogitatiopress.com/mediaandcommunication/article/view/9362\">An example publication</a> was made using this data-gathering technique on Netflix data.</p>\n<p>Jarno then went on to walk through how Google's \"takeout\" service works to facilitate extracting user data, how to filter it, and so forth, all of which can be quite complicated. So then Jarno showed a browser extension they'd made that will automate much of the \"takeout\" process, show the user what it has, and then talk to a data collection website they were hosting for an experiment (all of which is open source, I believe).</p>\n<p>There are also other tools out there, such as <a href=\"https://github.com/eyra/port\">PORT</a>, which are designed to allow the user to do some introspection and filtering of the donated data before uploading it, as \"takeout\" for instance doesn't make it easy to time-restrict data: you have to give the science team a lot of data they don't necessarily want, and you might not want them to have more than is necessary.</p>\n<p>I noted Jarno was using <a href=\"https://github.com/digitraceslab/niimpy\">Niimpy</a> in his demo showing what was in the \"takeout\" data; it's a Python package for working with behavioural data that looked quite useful if you were into that sort of thing.</p>\n<h2>Unreviewed and Unrunnable? 
On the role of peer review in publishing reproducible research software by <a href=\"https://uefconnect.uef.fi/en/ina.pohner/\">Ina P\u00f6hner</a>, <a href=\"https://www.uef.fi/en/unit/school-of-pharmacy\">School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland</a></h2>\n<p>This talk was one of the highlights for me in terms of how it related to existing work we've done on the topic in our group here, e.g., <a href=\"https://undonecs.sciencesconf.org/data/Undonecs_2024_abstract_43.pdf\">our Undone paper on how CS hinders climate research</a>.</p>\n<p>Ina P\u00f6hner started out with context that echoed the opening keynote talk, looking at how in their domain there are papers <a href=\"https://pubmed.ncbi.nlm.nih.gov/23758509/\">from over a decade ago</a> flagging issues with reproducibility of work, and then <a href=\"https://www.nature.com/articles/533452a\">another large survey in 2016</a> calling it a \"reproducibility crisis\". Since then there has been an increased requirement to provide code alongside publications, but the question is: does a code requirement really equate to reproducibility?</p>\n<p>Ina and her group examined 1200 articles published between 2020 and 2024, looking at how many had code, and then how many of those could actually be used. Some headline figures: of those articles, only 481 had code repositories associated with them. Of those, they tried to run the code, and only 10% worked; some repositories no longer even exist, having been deleted after publication. They also did a dive into those that didn't run and worked out why, looking at whether it was lack of documentation, lack of dependencies, and so forth. I made a lot of notes here, but given the paper for this is still in review I feel it best to wait for it to emerge.</p>\n<p>One of the more interesting comments was how it is seen in the review process. 
Of 75 journals that were surveyed, 65% mandate code be published, 34% ask for it (I assume without it blocking publication if not available), but only 4% give the reviewers any guidelines on how to review the code itself, and so effectively very little is done beyond checking the presence of code. Some reviewers interviewed did say they looked for a README or such, but they also had some reviewers say \"we'd not try to rerun wet-lab experiments, so why would we try run the code?\"</p>\n<p>I think this is all a great survey, and the fact that the group did a lot of actual grind to check all these papers is valuable versus the gut instinct (that the entire audience shared) that published code isn't runnable. I think there's a second question here which would also cover data availability, but I don't want to let that detract from this work, which I appreciated.</p>\n<p>Ina went through a list of possible things publishers should do to address this, the most interesting of which I thought was drafting in early-career researchers to help with code review for papers, and ensuring they get credit for this, obviously. I kinda like this idea; though it might be hard to get a perfect match, it's a great way to not only get review, but build up code-for-publication as a habit in new researchers.</p>\n<p>As a final note to this session, Ina mentioned <a href=\"https://www.nature.com/articles/sdata201618\">FAIR</a> (and <a href=\"https://www.nature.com/articles/s41597-022-01710-x\">here</a>), which I'd not come across before: a set of guiding principles for scientific data management, which Ina was advocating should be used for code also.</p>\n<h2>RSE Career panel</h2>\n<p>Day 1 closed with a group discussion on RSE careers, <a href=\"https://hackmd.io/@nordic-rse/nre2025-career-panel\">the notes for this are online</a>. 
Common themes mostly stemmed from the fact that in the Nordics this isn't most people's full-time role; they work in other departments (e.g., the university HPC group), and so there was talk of how to get funded for it, and how to ring-fence time for such work.</p>\n<h1>Day 2</h1>\n<p>Day 2 was mostly short lightning talks of about ten minutes, with a couple of longer talks and two panels thrown in also.</p>\n<h2>Panel: How to finance your RSE group - <a href=\"https://en.uit.no/ansatte/person?p_document_id=486227\">J\u00f8rn Dietze</a>, <a href=\"https://hpc.uit.no/\">HPC group at UiT The Arctic University of Norway</a></h2>\n<p>J\u00f8rn Dietze is a member of the <a href=\"https://research-software.uit.no\">RSE group at UiT</a>, but it's not really funded; they are part of HPC/central admin for the university. The RSE side is done as slack time, so roughly 20% per person. They do two hours a week of office hours where students come along with problems.</p>\n<p>This was then held in contrast to the <a href=\"https://www.aalto.fi/en/services/research-software-engineers\">Research Software Engineers service at Aalto</a>, which is part of the computing department, and was represented by Richard Darst (who gave the previous day's talk on what we can learn from the history of open source). It started with no funding, just helping out, then began helping projects with funding, where in theory they can bill hours. Finance pushed back, saying nothing under a month is worth doing the billing for. Then a centre was set up for AI, which funded a research engineer; in theory they work for the centre, but any spare time is used as general RSE. 
The Aalto group also came out of the university HPC group originally, so they have experience of working with other departments.</p>\n<p>Their funding breaks down as</p>\n<ul>\n<li>Big enough (more than a month): on grant</li>\n<li>Small projects: out of the department's general funding</li>\n</ul>\n<p>Inspiration from the UK:</p>\n<ul>\n<li><a href=\"https://www.software.ac.uk/blog/visibility-research-software-engineers-research-funding\">https://www.software.ac.uk/blog/visibility-research-software-engineers-research-funding</a></li>\n<li><a href=\"https://imperialcollegelondon.github.io/research-computing/funding.html\">https://imperialcollegelondon.github.io/research-computing/funding.html</a></li>\n<li><a href=\"https://www.software.ac.uk/guide/how-fund-research-software-development\">https://www.software.ac.uk/guide/how-fund-research-software-development</a></li>\n<li><a href=\"https://www.software.ac.uk/programmes/research-software-maintenance-fund\">Research Software Maintenance fund</a> (UK only)</li>\n</ul>\n<p>Another topic was acknowledgements for work, so as to try to show the group's value.</p>\n<ul>\n<li>Some RSE groups require acknowledgements in papers (not co-authorship)</li>\n<li>At Aalto they collate the publications they assisted with every year to show contribution to the department</li>\n</ul>\n<p>This section is a bit disjoint, but we covered a lot of topics in an hour!</p>\n<h2>CodeRefinery: Where Research Software Engineers can begin and grow by <a href=\"https://www.software.ac.uk/fellowship-programme/samantha-wittke\">Samantha Wittke</a>, <a href=\"https://csc.fi/en/\">CSC - IT Center for Science</a></h2>\n<p>Samantha Wittke talked about <a href=\"https://coderefinery.org\">CodeRefinery</a>, which is a collaborative project that:</p>\n<ul>\n<li>Provides hands-on training for coding for research</li>\n<li>Focuses on good-enough</li>\n<li>Supports Open Science and FAIR software development</li>\n</ul>\n<p>The teaching sits between introductory programming basics and high perf/GPU training. 
They're not the only ones doing it, and it sounds like they exchange ideas with other groups, e.g., <a href=\"https://carpentries-incubator.github.io/fair-research-software/01-fair-research-software.html\">The Carpentries FAIR Research Software course</a>. The courses are openly licensed <a href=\"https://creativecommons.org/licenses/by/4.0/\">CC-BY</a>.</p>\n<p>CodeRefinery run workshops twice a year with global access, via both online and some in-person classrooms. Currently they serve about 500 students per year and have 30 instructors/speakers.</p>\n<p>They also run a Zulip channel to go alongside the course and provide networking (it's the same Zulip used by Nordic-RSE).</p>\n<p>Ways to get involved:</p>\n<ul>\n<li>Become a co-instructor</li>\n<li>Contribute to lesson materials</li>\n<li>Join as an observer</li>\n</ul>\n<p>They have had people become RSE types after completing CodeRefinery courses.</p>\n<h2>LUMI AI Guide by <a href=\"https://www.linkedin.com/in/gregor-decristoforo-ab8741213/?originalSubdomain=no\">Gregor Decristoforo</a>, UiT The Arctic University of Norway and <a href=\"https://scholar.google.com/citations?user=JhJxD981dnsC&hl=en\">Oskar Taubert</a>, <a href=\"https://csc.fi/en/\">CSC - IT Center for Science</a></h2>\n<p>Gregor Decristoforo and Oskar Taubert talked about the <a href=\"https://www.lumi-supercomputer.eu\">Finnish national supercomputer LUMI</a>, in particular how it can be used for AI work, despite not being designed for that sort of workload. Apparently now 29% of users are using it for AI, 41% for machine learning. Most of this is done with <a href=\"https://pytorch.org\">PyTorch</a>.</p>\n<p>The main challenges users face when using LUMI:</p>\n<ul>\n<li>Software installation: how do people get their software on to LUMI? 
Options include <a href=\"https://docs.sylabs.io/guides/latest/user-guide/\">singularity containers</a> and <a href=\"https://docs.lumi-supercomputer.eu/software/installing/easybuild/\">EasyBuild modules</a>. This is typically set up by the support team.</li>\n<li>LUMI uses AMD GPUs, so I guess there's no CUDA support, which is the somewhat more common option</li>\n<li>It uses the <a href=\"https://www.lustre.org\">Lustre file system</a>, but that isn't well suited to many small files, which is common in Python environments</li>\n<li>Helping people scale training jobs to multiple nodes</li>\n<li>Monitoring and profiling</li>\n</ul>\n<p>To this end they've put together a <a href=\"https://github.com/Lumi-supercomputer/LUMI-AI-Guide\">LUMI AI Guide</a> on how to go from laptop to LUMI, and Gregor and Oskar walked us through select parts of that.</p>\n<p>It uses Slurm for job access, which I chatted to Gregor about over lunch, and which will crop up again in a later talk. I'll put some notes on Slurm and what we do in the EEG below.</p>\n<h2>Harnessing power and synergy \u2013 Using game engines to produce visualisation and simulation by <a href=\"https://www.kth.se/profile/berendt?l=en\">Filip Berendt</a>, KTH Royal Institute of Technology</h2>\n<p>Filip Berendt gave a talk on using game engines in research software, which is something that I've <a href=\"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4977986\">seen used in our group</a> before, but it was interesting to see a broader appraisal of how it can be applied.</p>\n<p>The top level was that they can be quite useful, though not ideally matched; they are great for prototyping and mixing visualisation into the core of your work; and licensing can be an issue, for example the <a href=\"https://unity3dperformance.com/index.php/2024/09/22/unity-license-change/\">Unity controversy from last year</a> - ultimately the game engine developers structure the payment model around games, not science!</p>\n<p>The engines covered in the 
discussion were <a href=\"https://unity.com\">Unity 3D</a>, which Filip did use until the licensing issue, <a href=\"https://en.wikipedia.org/wiki/Unreal_Engine\">Unreal Engine</a>, and <a href=\"https://godotengine.org\">Godot</a> (which is open source).</p>\n<p>Filip showed an example he'd built using Unity for <a href=\"https://kth.diva-portal.org/smash/get/diva2:1708157/FULLTEXT01.pdf\">implementing a model of a pedestrian steering algorithm</a>, compared with an established algorithm. Related works developed their own testing environments, with visualisation done after the fact using a second environment - a game engine lets you do both. I think it's generally under-appreciated how important visualisation of results is for spotting issues in large data sets, so I like this a lot.</p>\n<h2>How Alarming by <a href=\"https://callumrollo.com\">Callum Rollo</a>, <a href=\"https://voiceoftheocean.org\">Voice of the Ocean Foundation</a></h2>\n<p>Callum Rollo works for Voice of the Ocean, who have several autonomous underwater drones in the waters around Sweden: you can <a href=\"https://observations.voiceoftheocean.org/\">see a live feed of their location online</a>. They will occasionally surface to report back data using an Iridium back-haul, and if they need support, they will stay on the surface until given more instructions. This is the most dangerous time for the drones, as they can get hit by boats, whereas lower down in the water they're relatively safe, and so when a drone surfaces and requests assistance, the right expert must be fetched quickly.</p>\n<p>Callum had to build a system to handle emergency calling of people in the company, with redundancy, and slow escalation up the staff hierarchy if calls aren't handled. 
Building a reliable system like this is hard - it's not a job I'd relish taking on, given that a false positive is going to annoy a lot of people, and a false negative can be very expensive.</p>\n<p>It was a nice reminder that RSE software isn't just about data-processing pipelines or HPCs or perhaps even embedded sensor software. The tooling Callum put in place here is essential to the science work, but because it isn't on the data collection or processing path, it probably isn't what we usually think of RSEs doing. Yet the tooling itself can be quite similar, as Callum pulled all this together using Python.</p>\n<h2>The RSE experience at the northernmost university in the world by <a href=\"https://www.linkedin.com/in/gregor-decristoforo-ab8741213/?originalSubdomain=no\">Gregor Decristoforo</a>, <a href=\"https://hpc.uit.no/\">HPC group at UiT The Arctic University of Norway</a></h2>\n<p>Gregor Decristoforo gave a follow-up to a talk last year about the nascent RSE group forming within the HPC group at UiT. 
They are now six part-time RSEs, having started 2.5 years ago, growing out of the HPC group, which is part of the IT department at UiT.</p>\n<p>Mostly they collaborate with groups that they were part of before they joined (as in, the individuals were in a particular other discipline research group, and they now help those groups under this new banner).</p>\n<p>Challenges of forming an RSE group:</p>\n<ul>\n<li>Visibility of the group to researchers</li>\n<li>Convincing higher-ups RSE is valuable</li>\n<li>Mostly working as HPC engineers, so time is limited for RSE jobs</li>\n<li>People come with R problems, but it's often a stats problem, and so not their area of expertise</li>\n</ul>\n<p>That last one is an interesting one that hasn't come up for us in our cluster in the EEG group, but perhaps that's because everyone knows I'm not good at R or stats :)</p>\n<p>It's not in my notes, but IIRC they hold an office hour once a week, rotating between members, to help people.</p>\n<h2>The periodic table: R package and visualized with ggplot2 by <a href=\"https://cv.solarchemist.se\">Taha Ahmed</a></h2>\n<p>Another data-visualisation talk, this time Taha Ahmed talking about an R package he built to make <a href=\"https://github.com/solarchemist/periodicdata\">a customisable version of the periodic table</a>. 
There was a lack of freely licensed periodic tables available for customising, so he made his own.</p>\n<p>The internal data store is in YAML, split into two parts for data and metadata, which is flexible but gives rise to data-hygiene issues when reading YAML into R (the usual JSON/YAML issues with lack of typing).</p>\n<p>It works nicely in a notebook: you can set values per entry and visualise them on the table.</p>\n<h2>Experiences from 'High-performance R' course by <a href=\"https://helijuottonen.wordpress.com\">Heli Juottonen</a>, <a href=\"https://csc.fi/en/\">CSC - IT Center for Science</a></h2>\n<p>The next talk was by Heli Juottonen, talking about how at CSC they try to teach people to use R in an HPC context and the <a href=\"https://github.com/csc-training/high-performance-r\">training course</a> they run. The slides for this talk are <a href=\"https://a3s.fi/heli/hpr_hj_nrse25_wide.pdf\">here</a>. The course was made by a pair of people: one a biologist looking at R as a tool, and the other a computer scientist.</p>\n<p>Heli maintains R on a supercomputer; they use <a href=\"https://docs.csc.fi/apps/r-env/\">r-env</a> for running R on the HPC machines.</p>\n<p>Common question: \"Blah in R is taking too long, it runs out of memory, what now?\" 
This certainly echoes the questions we get on our computer cluster, and it's frustrating (to Michael) that it's so hard to answer such seemingly simple questions - (though not unexpected: yay, <a href=\"https://en.wikipedia.org/wiki/Halting_problem\">halting problem</a>).</p>\n<p>The course aims:</p>\n<ul>\n<li>Understanding resource usage, finding bottlenecks</li>\n<li>Parallel and distributed computing</li>\n</ul>\n<p>Audience:</p>\n<ul>\n<li>RStudio users on the supercomputer who don't know how to utilise the resources well</li>\n</ul>\n<p>It's a two-day course, with the first day being about measurement, and the second day about batch jobs and dealing with distribution over multiple cores/nodes.</p>\n<p>One problem they hit was with users bringing their own data, which needs cleaning before use and so slows down the course.</p>\n<h2>N-min-ct - a research software to tell how sustainable is the neighbourhood. by <a href=\"https://www.ntnu.edu/employees/ruslan.zhuravchak\">Ruslan Zhuravchak</a>, Norwegian University of Science and Technology</h2>\n<p>Ruslan Zhuravchak gave a talk on how he helped implement a project to facilitate the interactive analysis of urban form and mobility in smart (15-minute) cities, and of various relevant performance metrics as suggested by <a href=\"https://fmezen.no\">FME/ZEN</a> - Zero Emission Neighbourhoods.</p>\n<p>The project was to try to assess how well a city is meeting ZEN KPIs based on sensor data.</p>\n<p>Unfortunately the project was only internally available, so whilst we got a demo which was quite interesting, I can't link to it, alas. I had a chat with Ruslan afterwards, and he hopes to get it published. 
In the EEG group we have a few people working on smart city metrics, and Ruslan seemed keen to chat to those interested.</p>\n<h2>Modern HPC-access alternatives for the bash averse by <a href=\"https://www.chalmers.se/en/persons/vikren/\">Viktor Rehnberg</a>, Chalmers University of Technology</h2>\n<p>This session was quite interesting to me, as I semi-manage a compute \"cluster\" of machines shared by a set of ecologists. This is not an HPC setup, rather a set of very large regular computers (256 cores per machine, 1TB RAM, etc.). We have deliberately taken a hands-off approach to access, just leaving it as ssh access to a particular machine, and whilst we've got away with that, I'd like to see what else we can do here.</p>\n<p>One of the themes I've seen consistently in this conference is the adoption of <a href=\"https://slurm.schedmd.com/overview.html\">Slurm</a>, as was the case here. This talk wasn't about Slurm per se, but it did show me different ways our compute cluster could be presented, even if this talk was about HPC and we're just a bag of big computers (BBoC :).</p>\n<p>Viktor Rehnberg gave this talk, and he started by trying to define what an HPC cluster is:</p>\n<ul>\n<li>Users access it over a login node</li>\n<li>It contains many nodes</li>\n<li>It is managed by a scheduler (typically Slurm)</li>\n<li>It has shared storage for the nodes (which enables the scheduler to distribute jobs)</li>\n</ul>\n<p>By this measure, perhaps our BBoC does count as HPC; all it's missing is the login node and the scheduler. I usually think of HPC as having rather more GPUs or other specialist hardware attached.</p>\n<p>The typical way you'd access the HPC is you ssh to the login node, use the command line to make jobs via the scheduler, and then it'll run your work at some point as resources allow. 
Currently we run a social scheduler (aka a Slack channel that people rarely use), and quite often I have to go nag people about this.</p>\n<p>The other topic that came up in a lunch discussion (I think with Gregor and Maria) was the realisation that, by not using Slurm, which is the de facto standard for HPC, we're not preparing our users for when they migrate up to proper HPC. There will always be a lot of learning needed when moving from a big normal computer to a dedicated HPC, but if we made our environment a little more like this it might both make things run smoother and help our users prepare for the wider world of scientific computing? In the past I've been resistant to using Slurm as it adds overhead, but now we have more help on the server management side, perhaps it's time to reconsider that.</p>\n<p>Anyway, back to the talk! The main thrust of Viktor's talk was about what if you don't want to use ssh, can you use other tools to access the HPC? Graphical tools for instance. The answer was yes, and the options he presented were:</p>\n<ol>\n<li>Use X-forwarding - as an old person I love that this is still considered an option</li>\n<li>Remote desktop - <a href=\"https://www.cendio.com\">thinlinc</a> is most common, but commercial. When you connect you are still using <code>sbatch</code> etc. from a terminal to launch jobs, but matlab etc. can also X-forward from compute nodes.</li>\n<li><a href=\"https://github.com/lunarc/gfxlauncher\">Gfxlauncher</a> - runs on the login node</li>\n<li>IDEs like <a href=\"https://code.visualstudio.com\">Visual Studio Code</a> or <a href=\"https://www.jetbrains.com/pycharm/\">PyCharm</a>, using ssh remote. 
I suspect VSCode is what most of our ecologists use to access the BBoC.</li>\n<li>Language environments like <a href=\"https://jupyter.org\">Jupyter</a> (for Python), <a href=\"https://posit.co/download/rstudio-server/\">RStudio server</a>, and matlab-proxy, which can be tunnelled over ssh.</li>\n<li>Web portals that set up the above, like <a href=\"https://openondemand.org\">Open OnDemand</a></li>\n<li>Language libraries that let you code up job submission: e.g., <a href=\"https://github.com/facebookincubator/submitit\">submitit</a> and <a href=\"https://www.nextflow.io\">nextflow</a></li>\n</ol>\n<h2>Training Triumphs: Maximizing Impact with External Domain Experts by <a href=\"https://liu.se/en/employee/yonwa58\">Yonglei Wang</a>, <a href=\"https://liu.se\">Link\u00f6pings Universitet</a></h2>\n<p>Yonglei Wang works at Link\u00f6pings Universitet in the <a href=\"https://liu.se/en/organisation/liu/nsc\">national supercomputer centre</a> there, and gave an overview of all the various bits of training that are available, specifically from ENCCS - <a href=\"https://enccs.se\">EuroCC National Competence Centre Sweden</a>. It aims to enable HPC, AI, and high-performance data analytics (HPDA).</p>\n<p>They run domain-specific events: biomolecular, fluid dynamics, quantum chemistry, quantum computing - no ecology! They have had 3600 participants: 80% academic, 8% public sector, 7% large companies, and 5% SMEs. Gender breakdown was 23% female, 73% male.</p>\n<p>There was a long list of the training, but alas too much for me to note here - check out the <a href=\"https://enccs.se/lessons/\">ENCCS lessons list</a> for more - but there's definitely some <a href=\"https://enccs.github.io/gpu-programming/\">I want to check out</a>.</p>\n<h2>My discussion session on lineage in scientific data-processing pipelines</h2>\n<p>This I'll write up and link to shortly as an independent post! 
But it was (at least from my perspective) a success, with many interesting tools and techniques I'd not been aware of before. As I say, a proper post on that so everyone can share the results will come soon!</p>\n<h1>Misc other notes</h1>\n<ul>\n<li>Oxford has a <a href=\"https://www.rse.ox.ac.uk\">dedicated RSE group</a>, hat tip to Gregor Decristoforo for pointing them out to me.</li>\n<li><a href=\"https://digital-strategy.ec.europa.eu/en/policies/ai-factories\">AI Factories</a> seemed to be somewhat of a contentious term I'd not come across before, which seems to be an EU initiative to power AI projects. I suspect it's seen as hype that is draining money and people don't quite know what it means.</li>\n<li>The discussion/panel sessions were run via the audience <a href=\"https://hackmd.io\">collaboratively editing a markdown document</a> that was on the projector, with the moderator calling out interesting things and asking whoever wrote them to speak a little on that. As a technique it worked really well with this size of audience, both live and in leaving everyone with notes for afterwards!</li>\n<li>Nordic-RSE 2026 will be in Troms\u00f8, Jun 9-10!</li>\n</ul>",+"content": "<p>This is a summary of last week's <a href=\"https://nordic-rse.org/nrse2025/\">2025 Nordic-RSE conference</a>, held in Gothenburg, Sweden. Whilst I'm not technically a Research Software Engineer (RSE), a lot of my role involves essentially the same activities in working on ecology pipelines like <a href=\"https://github.com/quantifyearth/LIFE/\">LIFE</a>, <a href=\"https://github.com/quantifyearth/STAR\">STAR</a>, and so on; indeed I'm a member of the UK <a href=\"https://society-rse.org\">Society of Research Software Engineering</a>. 
Not only do I effectively act as an RSE a good amount of my time, but it's also a part of my job I enjoy: collaborating with experts in other fields whilst getting to use my own expertise and learning something along the way is often quite satisfying.</p>\n<p>My role at the conference was twofold: to learn more about how others are working in the domain, so I can pick up things for when I'm an acting RSE; but also, from the other side of my role as someone trying to build tools to support reproducible/repeatable scientific pipelines, to look at how our work to date on things like <a href=\"https://github.com/quantifyearth/shark/\">Shark</a> might connect with that.</p>\n<p>Disclaimer: all these summaries are projected through my own thoughts, so what I put here isn't necessarily the opinion of the speaker, but rather my interpretation. If you want a simpler summary of just the facts, you can <a href=\"https://hackmd.io/yivTsaSzR3qGDXSSwD4JIQ?both\">look at the group notes from the event</a>. Apologies to speakers if I've misinterpreted their words - please do correct me if so!</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A group photo of about forty research software engineers stood or knelt for a group photo inside a building.\" src=\"NRSE25_group.jpg\">\n\n</div>\n</div>\n\n<p></p>\n<p>(Thanks to the organisers for taking a picture of us all!)</p>\n<h1>Day 1</h1>\n<h2>Intro by <a href=\"https://www.gu.se/en/about/find-staff/matteotomasini2\">Matteo Tomasini</a>, <a href=\"https://nordic-rse.org\">Nordic RSE</a></h2>\n<p>One of the things I loved about the conference was that it was still small enough that I got to know a good proportion of the attendees throughout the conference. 
In the introduction Matteo Tomasini revealed that there were 45 people this year, up from 30 last year, which was also the first year.</p>\n<p>There was a bit about what makes an RSE, particularly as in most institutions in the Nordics (except Aalto) there is no official RSE job (unlike in UK universities, where RSE is now an officially recognised role). Generally in the RSE community, both in the UK and in the Nordics, it is recognised that a lot of people act as de facto RSEs without having the term in their job title, and as such I've found both communities to be welcoming to those of us who self-identify as RSEs, and so it was with this conference. Matteo defined it as:</p>\n<ul>\n<li>If you develop software for research</li>\n<li>If you're the go-to in your group for software work/questions</li>\n<li>If you support the other researchers in your group</li>\n<li>If you feel like one</li>\n</ul>\n<p>I liked this broad definition in the opening, as it made it clear that everyone was welcome here.</p>\n<p>Matteo also touched on what Nordic-RSE does:</p>\n<ul>\n<li>This conference</li>\n<li>A community Zulip chat for members</li>\n<li>A weekly online coffee meet (9am CET on Thursdays)</li>\n<li>A bi-weekly online community meeting</li>\n</ul>\n<p>It's clear the group has ambitions to help foster the RSE role in the Nordics, and throughout the conference the UK's <a href=\"https://society-rse.org\">Society of Research Software Engineering</a> (of which I'm a member, though I couldn't make their conference last year) was cited as being about 5 years ahead of where this group wanted to be.</p>\n<h2>Keynote: Clarity, Not Consensus: Rethinking Unity in Open Science by <a href=\"https://rmwillen.info/wp-content/uploads/2024/09/rebecca-willen-cv-2024-1.pdf\">Rebecca Will\u00e9n</a>, <a href=\"https://igdore.org\">IGDORE</a></h2>\n<p>This was an interesting keynote on the quest for "open science". 
Rebecca Will\u00e9n is the founder of <a href=\"https://igdore.org\">IGDORE</a>, the Institute for Globally Distributed Open Research and Education, which she founded at the end of her PhD as a champion for reproducible science.</p>\n<p>She started by explaining there was a revolution in psychology in 2012, with research looking at the field of psychology and questioning the reproducibility of the results and how selective people were being about what they presented. This isn't necessarily scientific misconduct, but with the push to get published people might slip into what are defined as Questionable Research Practices (QRPs). Examples of this were:</p>\n<ul>\n<li>P-hacking or data torture (selective results)</li>\n<li>HARKing - the practice of finding a thing of interest in the data and then pretending that this was your hypothesis all along</li>\n</ul>\n<p>The QRP framing is meant to go beyond deliberate deception, and as a computer scientist interested in tools for reproducibility who has worked with many <a href=\"https://www.researchgate.net/publication/359725248_Myths_and_mythconceptions_what_does_it_mean_to_be_a_programming_language_anyhow\">vernacular programmers</a>, I think that computers amplify QRPs by making it hard to do a good job of understanding lineage/provenance. I need to dig more into QRPs, and I think the citations for this are:</p>\n<ul>\n<li><a href=\"https://journals.sagepub.com/doi/10.1177/0956797611417632\">False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant by Joseph P. Simmons, Leif D. Nelson, and Uri Simonsohn, 2011</a></li>\n<li><a href=\"https://www.cmu.edu/dietrich/sds/docs/loewenstein/MeasPrevalQuestTruthTelling.pdf\">Measuring the Prevalence of Questionable Research Practices With Incentives for Truth Telling by Leslie K. 
John, George Loewenstein, and Drazen Prelec</a></li>\n</ul>\n<p>I also found this more recent 2021 book, <a href=\"https://academic.oup.com/book/39705\">The Problem with Science: The Reproducibility Crisis and What to do About It by R. Barker Bausell</a> (specifically in <a href=\"https://academic.oup.com/book/39705/chapter/350374120\">chapter 3</a>), that seems to cover the topic in detail. Lots of interesting things to follow up on.</p>\n<p>Back to the talk. From this epiphany in the psychology research community in 2012 spun out an attempt to do better - a theme we'll see repeated later in Ina P\u00f6hner's talk in the Pharmacy community - and a push to open science.</p>\n<p>Rebecca then presented what she felt were the five tenets of open science that people talked about, each of which had many subcategories which I didn't manage to record, but the high levels were:</p>\n<ul>\n<li>Open access to knowledge and resources</li>\n<li>Access to infrastructure and enabling tools</li>\n<li>Reproducibility and scientific quality</li>\n<li>Research culture and work environment</li>\n<li>Diversity and inclusion</li>\n</ul>\n<p>The first two were listed as being accepted requirements in the open science world, at least in IGDORE, and the last three were still being debated.</p>\n<p>Rebecca made a comparison at this point to the open source software movement, giving a historic overview and pointing out how over time that movement shifted from being a moral movement (people should have the right to examine and modify the code they run) to being more of a quality bar (aka, <a href=\"https://en.wikipedia.org/wiki/Gratis_versus_libre\">libre vs gratis</a>).</p>\n<blockquote>\n<p>"The Free Software movement and the Open Source movement are today separate movements with different views and goals, although we can and do work together on some practical projects." 
- <a href=\"https://www.gnu.org/philosophy/free-software-for-freedom.html\">https://www.gnu.org/philosophy/free-software-for-freedom.html</a></p>\n</blockquote>\n<p>Rebecca identified this theme in the timeline of open science also:</p>\n<ul>\n<li>Open access, arxiv, CC - late 1990s</li>\n<li>Protocols for clinical trials made mandatory in 2005 - open and version controlled</li>\n<li>Work showing QRPs are common in 2011</li>\n<li><a href=\"https://osf.io\">Open Science Framework</a> - developed for psychology, now used across all social science. Describes the process of pre-registration - saying what you're going to do before doing the research</li>\n<li>Added reproducibility to open science with the intent that it prevents QRPs</li>\n<li><a href=\"https://cos.io\">The Center for Open Science</a> - from a similar time to the Open Science Framework, but starting to shift from morality to quality, similar to that shift in the OSS world</li>\n<li>Another reference to the <a href=\"https://www.unesco.org/en/open-science/toolkit\">UNESCO open science toolkit factsheet 2022</a>, specifically the <a href=\"https://unesdoc.unesco.org/ark:/48223/pf0000383323\">enumeration of its tenets</a> - the quality shift is now appearing here</li>\n</ul>\n<p>My personal opinion is that tech culture did lose track of the morality of openness amid the "open speeds up the tech sector" framing - part of the <a href=\"https://pluralistic.net/2022/11/28/enshittification/\">enshittification</a> we see today I guess, though some of that is also just unchecked capitalism having caught up with naive tech optimism from the prior decades. But I digress.</p>\n<p>At this point I got a little confused as to which tenets Rebecca was advocating for, as I wasn't sure which bits of the original five-tenet list and the UNESCO definition of open science she saw as being about the moral purpose of open science vs the checkbox-ticking of open science to do what you were going to do anyway. 
But what was clear was that at IGDORE they'd had a loss of momentum because of this pull in different directions over what it means to be open science; they'd not realised that this split was happening, and so consensus was lost in the organisation and little useful got done for many years as a result.</p>\n<p>So I'm not sure I agree about which tenets should be in or out of a definition of open science, but I do see that the split that happened in the tech community around libre/gratis could also be a challenge for the science community. But for me the main takeaway was learning about QRPs, as this has given a name to a whole bunch of things I've thought about but never had a way to tie together.</p>\n<h2>Design Patterns: The secret to smarter, reproducible research by <a href=\"https://codingresearcher.com\">Marine Guyot, CodingResearcher</a></h2>\n<p>The next talk was by Marine Guyot, a freelance RSE, on using design patterns in building software for research. The motivation for the talk is what I feel must be a very common pattern, which she told via the persona Anna:</p>\n<ul>\n<li>Anna makes a script to save time for her own research</li>\n<li>Others use it</li>\n<li>Other users ask for small modifications....</li>\n<li>Now Anna is trying to juggle hacking on this script vs her own work - bad quality etc. due to time pressures</li>\n</ul>\n<p>Then either at some point it will be recognised as critical and a team will form around it, or Anna will carry on trying to maintain it and burn out.</p>\n<p>I feel there is another option, which is that the software is abandoned and then something is lost, but I guess that's not part of the narrative for a talk on how to design better software.</p>\n<p>The rest of the talk focussed on design patterns in software, a topic I won't try to reiterate here as there's good books on this. 
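To give a flavour of the sort of structure she meant, here's a minimal sketch of one such pattern - the Strategy pattern - in Python. This is my own illustrative example, not code from the talk; the names are invented:

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A "strategy" is any callable matching this signature, so new
# behaviours can be added without editing the code that uses them.
Metric = Callable[[Sequence[float]], float]

def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)

def total(values: Sequence[float]) -> float:
    return sum(values)

@dataclass
class Analysis:
    metric: Metric  # the behaviour is injected, chosen by the caller

    def run(self, samples: Sequence[float]) -> float:
        return self.metric(samples)

# Users wanting a different metric pass in a new function,
# rather than asking for an edit to the core script.
print(Analysis(metric=mean).run([1.0, 2.0, 3.0]))   # 2.0
print(Analysis(metric=total).run([1.0, 2.0, 3.0]))  # 6.0
```

The point is that the behaviour a user wants to vary is injected rather than hard-coded, so each "small modification" becomes a new function the user can supply, rather than another branch hacked into Anna's script.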
The premise is that if you make something useful, others will want changes, and unless you put structure in place to manage those changes early on, you'll pay for it later. Something most people know (at least by the time they write software a second time :), but I suspect few people think of software as being anything other than a quick thing they do to try to get a result for their work. It's like the old question of when a number of things becomes "many".</p>\n<p>The best nugget was in the Q&A at the end:</p>\n<p>Audience Q: what's the best thing I should do for the handover (from RSE to researcher)?\nMarine A: documentation</p>\n<h2>In the modern era, what can we learn from the history of free and open software? by <a href=\"https://research.aalto.fi/en/persons/richard-darst\">Richard Darst</a>, <a href=\"https://www.aalto.fi/en\">Aalto University</a></h2>\n<p>Richard Darst gave a talk on the history of open source software, looking at how it has evolved over time, and then how to deal with some challenges in opening up code (and maybe data or science?) today. 
Richard's slides are quite readable and <a href=\"https://cryptpad.fr/presentation/#/2/presentation/view/EiU5tmOdvJtbHsybb+DXYYLaHScbxcSN7LXJEJ9R+f8/embed/\">available here</a>, so I won't attempt to recap them.</p>\n<p>I enjoyed the talk, and learned a bunch about the details of how Debian views things via his overview of <a href=\"https://wiki.debian.org/DebianFreeSoftwareGuidelines\">the Debian Free Software Guidelines</a>, and how they have tests to help decide if a thing is truly open, such as the <a href=\"https://wiki.debian.org/DesertIslandTest\">desert island test</a> and the <a href=\"https://wiki.debian.org/DissidentTest\">dissident test</a>.</p>\n<p>One note that struck a chord after some recent experiences we've had with primary data sources:</p>\n<blockquote>\n<p>"In short term closed may be better, but more people will improve the open option long-term"</p>\n</blockquote>\n<p>In our case, a group making open digital elevation maps that we've used in the past has switched to restrictive licensing for the open version, with a paid version if you want to avoid that, which feels quite short-sighted, particularly given we're in the midst of a climate emergency.</p>\n<h2>Tutorial: 3D visualisation and manipulation of scientific data in static web applications by <a href=\"https://research.chalmers.se/person/joajohan\">Joakim Bohlin</a>, <a href=\"https://infravis.se\">InfraVis, Chalmers University of Technology</a></h2>\n<p>This talk by Joakim Bohlin was on building static web sites for visualising science data. 
The <a href=\"https://github.com/Akodiat/tutorialNordicRSE25/\">code examples he used are here</a>.</p>\n<p>In the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">EEG group</a> we have quite a strong static-site, self-hosting theme (this website is currently hosted on a raspberry-pi and running <a href=\"https://github.com/mdales/digitalflapjack.com/\">its own static-site generator</a>!), and I also have close to zero interest in building frontends for our work that involve me working in React, Vue, or any of the larger contemporary Javascript frameworks that a lot of geospatial visualisation libraries assume you're using. Indeed, I think this is somewhat of a point of contention within the group: there's a clear need for communicating what we do, but because we're effectively mostly people who work at the bottom of the stack, no one wants to take the time to learn those frameworks, and so we've been poor at communicating what we do.</p>\n<p>I guess this is another RSE thing - we write software, but we can't write <em>all</em> software individually.</p>\n<p>So with that context, I was interested to learn what Joakim had to share: although he can't solve the problem of geospatial visualisation libraries requiring React etc., it was good to know that people are having success delivering usable visualisations with a minimal stack, and if more people are doing that, hopefully we'll eventually see more tooling begin to support this approach.</p>\n<p>Some particularly interesting bits of tooling to me were:</p>\n<ul>\n<li><a href=\"https://pyodide.org/en/stable/\">Pyodide</a> - this lets you run Python in the browser, which Joakim pointed out isn't the best solution, but often if your group works in Python they might have existing things that use plotting libraries to generate graphs, and as a first cut at getting that in front of more people, it can be an easy way to get started. 
You can combine this with <a href=\"https://pypi.org/project/micropip/\">micropip</a> to include Python packages from the JavaScript wrapper you use to load Pyodide.</li>\n<li><a href=\"https://vega.github.io/vega-lite/\">Vega-Lite</a> - a native JavaScript interactive graphing library, which I pronounce so the first half of the name rhymes with "Sega", but which I fear is a pun on Vegemite :) In the past I've used <a href=\"https://c3js.org\">C3.js</a> for this sort of thing, but Vega-Lite looked a little easier for making the data interactive.</li>\n</ul>\n<p>There were more, so if this sort of thing catches your interest, do check out the linked examples.</p>\n<h2>Donated data and what Google already knows by <a href=\"https://github.com/rantahar\">Jarno Rantaharju</a>, <a href=\"https://www.aalto.fi/en\">Aalto University</a></h2>\n<p>The premise of this talk was that collecting data in studies of people is hard:</p>\n<ul>\n<li>Takes time, expensive</li>\n<li>Requires participant effort</li>\n<li>Impacts subject behaviour</li>\n<li>Data is only collected after the study starts</li>\n</ul>\n<p>That last one might seem obvious, but I guess it's a valid point if you wanted to, say, study how the covid pandemic changed behaviours. Jarno Rantaharju's point was that actually for a lot of studies the data you might want could already exist in the various cloud services you use, knowingly or not: Google or Netflix already have a lot of data on your behaviours, and thanks to GDPR you can get access to that data as a participant. 
This is being worked on by, amongst others, the <a href=\"https://digitraceslab.com\">DigiTraces Lab</a> at Aalto University, and is referred to as Facilitating Donated Data.</p>\n<p><a href=\"https://www.cogitatiopress.com/mediaandcommunication/article/view/9362\">An example publication</a> was made using this data gathering technique on Netflix data.</p>\n<p>Jarno then went on to walk through how Google's "takeout" service works to facilitate extracting user data, how to filter it, and so forth, all of which can be quite complicated. So Jarno showed a browser extension they'd made that will automate much of the "takeout" process, show the user what it has, and then talk to a data collection website they were hosting for an experiment (all of which is open source, I believe).</p>\n<p>There are also other tools out there, such as <a href=\"https://github.com/eyra/port\">PORT</a>, which are designed to allow the user to do some introspection and filtering of the donated data before uploading it; "takeout" for instance doesn't make it easy to time-restrict data, so you have to give the science team a lot of data they don't necessarily want, and you might not want them to have more than is necessary.</p>\n<p>I noted Jarno was using <a href=\"https://github.com/digitraceslab/niimpy\">Niimpy</a> in his demo showing what was in the "takeout" data - a Python package for working with behavioural data that looked quite useful if you're into that sort of thing.</p>\n<h2>Unreviewed and Unrunnable? 
On the role of peer review in publishing reproducible research software by <a href=\"https://uefconnect.uef.fi/en/ina.pohner/\">Ina P\u00f6hner</a>, <a href=\"https://www.uef.fi/en/unit/school-of-pharmacy\">School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland</a></h2>\n<p>This talk was one of the highlights for me in terms of how it related to existing work we've done on the topic in our group here, e.g., <a href=\"https://undonecs.sciencesconf.org/data/Undonecs_2024_abstract_43.pdf\">our Undone paper on how CS hinders climate research</a>.</p>\n<p>Ina P\u00f6hner started out with context that echoed the opening keynote talk, looking at how in their domain there are papers <a href=\"https://pubmed.ncbi.nlm.nih.gov/23758509/\">from over a decade ago</a> flagging issues with reproducibility of work, and then <a href=\"https://www.nature.com/articles/533452a\">another large survey in 2016</a> calling it a "reproducibility crisis". Since then there has been an increased requirement to provide code alongside publications, but the question is: does a code requirement really equate to reproducibility?</p>\n<p>Ina and her group examined 1200 articles published between 2020 and 2024, looking at how many had code, and then how many of those could actually be used. Some headline figures: of those articles, only 481 had code repositories associated with them. They tried to run those, and only 10% worked; some even no longer exist, having been deleted after publication, and so forth. They also dived into those that didn't run and worked out why - was it lack of documentation, lack of dependencies, and so forth. I made a lot of notes here, but given the paper for this is still in review, I feel it's best to wait for it to emerge.</p>\n<p>One of the more interesting comments was on how code is seen in the review process. 
Of 75 journals that were surveyed, 65% mandate code be published and 34% ask for it (I assume without it blocking publication if not available), but only 4% give the reviewers any guidelines on how to review the code itself, and so effectively very little is done beyond checking the presence of code. Some reviewers interviewed did say they looked for a README or such, but some reviewers also said "we'd not try to rerun wet-lab experiments, so why would we try to run the code?"</p>\n<p>I think this is all a great survey, and the fact that the group did a lot of actual grind to check all these papers is valuable versus the gut instinct (which the entire audience shared) that published code isn't runnable. I think there's a second question here which would also cover data availability, but I don't want to let that detract from this work, which I appreciated.</p>\n<p>Ina went through a list of possible things publishers should do to address this, the most interesting of which I thought was drafting in early-career researchers to help with code review for papers (and ensuring they get credit for this, obviously). I kinda like this idea: though it might be hard to get a perfect match, it's a great way not only to get review, but to build up code-for-publication as a habit in new researchers.</p>\n<p>As a final note to this session, Ina mentioned <a href=\"https://www.nature.com/articles/sdata201618\">FAIR</a> (and <a href=\"https://www.nature.com/articles/s41597-022-01710-x\">here</a>), which I'd not come across before - a set of guiding principles for scientific data management that Ina was advocating should be used for code also.</p>\n<h2>RSE Career panel</h2>\n<p>Day 1 closed with a group discussion on RSE careers; <a href=\"https://hackmd.io/@nordic-rse/nre2025-career-panel\">the notes for this are online</a>. 
Common themes mostly stemmed from the fact that in the Nordics this isn't, for most people, their full-time role - they work in other departments (e.g., the university HPC group) - and so there was talk of how to get funded for it, and how to ring-fence time for such work.</p>\n<h1>Day 2</h1>\n<p>Day 2 was mostly short lightning talks of about ten minutes each, with a couple of longer talks and two panels thrown in also.</p>\n<h2>Panel: How to finance your RSE group - <a href=\"https://en.uit.no/ansatte/person?p_document_id=486227\">J\u00f8rn Dietze</a>, <a href=\"https://hpc.uit.no/\">HPC group at UiT The Arctic University of Norway</a></h2>\n<p>J\u00f8rn Dietze is a member of the <a href=\"https://research-software.uit.no\">RSE group at UiT</a>, but it's not really funded; they are part of HPC/central admin for the university. The RSE side is done as slack time, so roughly 20% per person. They do two hours of office hours a week where students come along with problems.</p>\n<p>This was then held in contrast to the <a href=\"https://www.aalto.fi/en/services/research-software-engineers\">Research Software Engineers service at Aalto</a>, which is part of the computing department, and was represented by Richard Darst (who gave the previous day's talk on what we can learn from the history of open source). It started with no funding, just helping out, then began helping projects that had funding, where in theory they could bill hours. Finance pushed back, saying nothing under a month is worth doing the billing for. Then a centre was set up for AI, which funded a research engineer; in theory they work for the centre, but any spare time is used for general RSE work. 
They also came out of the university HPC group originally, so they have experience of working with other departments.</p>\n<p>Their funding breaks down as:</p>\n<ul>\n<li>Big enough (more than a month): on the grant</li>\n<li>Small projects: out of the department's general funding</li>\n</ul>\n<p>Inspiration from the UK:</p>\n<ul>\n<li><a href=\"https://www.software.ac.uk/blog/visibility-research-software-engineers-research-funding\">https://www.software.ac.uk/blog/visibility-research-software-engineers-research-funding</a></li>\n<li><a href=\"https://imperialcollegelondon.github.io/research-computing/funding.html\">https://imperialcollegelondon.github.io/research-computing/funding.html</a></li>\n<li><a href=\"https://www.software.ac.uk/guide/how-fund-research-software-development\">https://www.software.ac.uk/guide/how-fund-research-software-development</a></li>\n<li><a href=\"https://www.software.ac.uk/programmes/research-software-maintenance-fund\">Research Software Maintenance Fund</a> (UK only)</li>\n</ul>\n<p>Another topic was acknowledgements for work, so as to try to show the group's value.</p>\n<ul>\n<li>Some RSE groups require acknowledgements in papers (not co-authorship)</li>\n<li>At Aalto they collate the publications they assisted with every year to show their contribution to the department</li>\n</ul>\n<p>This section is a bit disjointed, but we covered a lot of topics in an hour!</p>\n<h2>CodeRefinery: Where Research Software Engineers can begin and grow by <a href=\"https://www.software.ac.uk/fellowship-programme/samantha-wittke\">Samantha Wittke</a>, <a href=\"https://csc.fi/en/\">CSC - IT Center for Science</a></h2>\n<p>Samantha Wittke talked about <a href=\"https://coderefinery.org\">CodeRefinery</a>, which is a collaborative project that:</p>\n<ul>\n<li>Provides hands-on training in coding for research</li>\n<li>Focuses on good-enough practices</li>\n<li>Supports Open Science and FAIR software development</li>\n</ul>\n<p>The teaching sits between introductory programming basics and high perf/GPU training. 
They're not the only ones doing it, and it sounds like they exchange ideas with other groups, e.g., <a href=\"https://carpentries-incubator.github.io/fair-research-software/01-fair-research-software.html\">The Carpentries FAIR Research Software course</a>. The courses are openly licensed <a href=\"https://creativecommons.org/licenses/by/4.0/\">CC-BY</a>.</p>\n<p>CodeRefinery run workshops twice a year with global access, via both online and some in-person classrooms. Currently they serve about 500 students per year and have 30 instructors/speakers.</p>\n<p>They also run a Zulip channel to go alongside the course and provide networking (it's the same Zulip used by Nordic-RSE).</p>\n<p>Ways to get involved:</p>\n<ul>\n<li>Become a co-instructor</li>\n<li>Contribute to lesson materials</li>\n<li>Join as an observer</li>\n</ul>\n<p>They have had people become RSE types after completing CodeRefinery courses.</p>\n<h2>LUMI AI Guide by <a href=\"https://www.linkedin.com/in/gregor-decristoforo-ab8741213/?originalSubdomain=no\">Gregor Decristoforo</a>, UiT The Arctic University of Norway and <a href=\"https://scholar.google.com/citations?user=JhJxD981dnsC&hl=en\">Oskar Taubert</a>, <a href=\"https://csc.fi/en/\">CSC - IT Center for Science</a></h2>\n<p>Gregor Decristoforo and Oskar Taubert talked about the <a href=\"https://www.lumi-supercomputer.eu\">Finnish national supercomputer LUMI</a>, in particular how it can be used for AI work, despite not being designed for that sort of workload. Apparently 29% of users now use it for AI and 41% for machine learning. Most of this is done with <a href=\"https://pytorch.org\">PyTorch</a>.</p>\n<p>The main challenges users face when using LUMI:</p>\n<ul>\n<li>Software installation: how do people get their software on to LUMI? 
Options include <a href=\"https://docs.sylabs.io/guides/latest/user-guide/\">Singularity containers</a> and <a href=\"https://docs.lumi-supercomputer.eu/software/installing/easybuild/\">EasyBuild modules</a>. This is typically set up by the support team.</li>\n<li>LUMI uses AMD GPUs, so no CUDA support, which I guess is the more common setup elsewhere</li>\n<li>It uses the <a href=\"https://www.lustre.org\">Lustre file system</a>, but that isn't well suited to many small files, which are common in Python environments</li>\n<li>Helping people scale training jobs to multiple nodes</li>\n<li>Monitoring and profiling</li>\n</ul>\n<p>To this end they've put together a <a href=\"https://github.com/Lumi-supercomputer/LUMI-AI-Guide\">LUMI AI Guide</a> on how to go from laptop to LUMI, and Gregor and Oskar walked us through select parts of it.</p>\n<p>It uses Slurm for job access, which I chatted to Gregor about over lunch, and which will crop up again in a later talk. I'll put some notes on Slurm and what we do in the EEG below.</p>\n<h2>Harnessing power and synergy \u2013 Using game engines to produce visualisation and simulation by <a href=\"https://www.kth.se/profile/berendt?l=en\">Filip Berendt</a>, KTH Royal Institute of Technology</h2>\n<p>Filip Berendt gave a talk on using game engines in research software, which is something I've <a href=\"https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4977986\">seen used in our group</a> before, but it was interesting to see a broader appraisal of how they can be applied.</p>\n<p>The top level was that they can be quite useful: though not an ideal match, they are great for prototyping and for mixing visualisation into the core of your work, though licensing can be an issue - for example the <a href=\"https://unity3dperformance.com/index.php/2024/09/22/unity-license-change/\">Unity controversy from last year</a> - ultimately the game engine developers structure the payment model around games, not science!</p>\n<p>The engines covered in the 
discussion were <a href=\"https://unity.com\">Unity 3D</a>, which Filip used until the licensing issue, <a href=\"https://en.wikipedia.org/wiki/Unreal_Engine\">Unreal Engine</a>, and <a href=\"https://godotengine.org\">Godot</a> (which is open source).</p>\n<p>Filip showed an example he'd built using Unity for <a href=\"https://kth.diva-portal.org/smash/get/diva2:1708157/FULLTEXT01.pdf\">implementing a model of a pedestrian steering algorithm</a>, compared with an established algorithm. Related works developed their own testing environments, with visualisation done after the fact using a second environment - a game engine lets you do both. I think it's generally under-appreciated how important visualisation of results is for spotting issues in large data sets, so I like this a lot.</p>\n<h2>How Alarming by <a href=\"https://callumrollo.com\">Callum Rollo</a>, <a href=\"https://voiceoftheocean.org\">Voice of the Ocean Foundation</a></h2>\n<p>Callum Rollo works for Voice of the Ocean, who have several autonomous underwater drones in the waters around Sweden: you can <a href=\"https://observations.voiceoftheocean.org/\">see a live feed of their location online</a>. They will occasionally surface to report back data using an Iridium back-haul, and if they need support, they will stay on the surface until given more instructions. This is the most dangerous time for the drones, as they can get hit by boats, whereas lower down in the water they're relatively safe; so when a drone surfaces and requests assistance, the right expert must be fetched quickly.</p>\n<p>Callum had to build a system to handle emergency calling of people in the company, with redundancy, and slow escalation up the staff hierarchy if calls aren't handled. 
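Schematically, that escalation logic might look something like the following. This is purely my own sketch of the idea, not Callum's code; the `place_call` hook and the contact names are invented for illustration:

```python
import time
from typing import Callable, Optional, Sequence

def escalate(
    contacts: Sequence[str],
    place_call: Callable[[str], bool],
    retry_delay_s: float = 60.0,
) -> Optional[str]:
    """Call each contact in priority order until someone acknowledges.

    `place_call` stands in for whatever telephony API is used; it
    returns True if the callee acknowledged the alert.
    """
    for contact in contacts:
        if place_call(contact):
            return contact  # someone has taken responsibility
        time.sleep(retry_delay_s)  # pause before escalating further
    return None  # nobody answered: time for a louder alarm

# Example: the second person on the rota picks up.
answered = {"pilot-on-duty": False, "ops-lead": True, "director": True}
who = escalate(["pilot-on-duty", "ops-lead", "director"],
               place_call=lambda c: answered[c], retry_delay_s=0.0)
print(who)  # ops-lead
```

The hard part, of course, is everything around this loop: making the calls themselves reliable, and deciding what counts as an acknowledgement.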
Building a reliable system like this is hard - it's not a job I'd relish taking on, given that a false positive is going to annoy a lot of people and a false negative can be very expensive.</p>\n<p>It was a nice reminder that RSE software isn't just about data-processing pipelines or HPCs, or perhaps even embedded sensor software. The tooling Callum put in place here is essential to the science work, but, not being on the data collection or processing path, it probably isn't something we think of RSEs doing. Yet the tooling itself can be quite similar, as Callum pulled all this together using Python.</p>\n<h2>The RSE experience at the northernmost university in the world by <a href=\"https://www.linkedin.com/in/gregor-decristoforo-ab8741213/?originalSubdomain=no\">Gregor Decristoforo</a>, <a href=\"https://hpc.uit.no/\">HPC group at UiT The Arctic University of Norway</a></h2>\n<p>Gregor Decristoforo gave a follow-up to a talk last year about the nascent RSE group forming within the HPC group at UiT. 
They are now six part-time RSEs, having started 2.5 years ago, growing out of the HPC group, which is part of the IT department at UiT.</p>\n<p>Mostly they collaborate with groups that they were part of before they joined (as in, the individuals were in a particular other-discipline research group, and they now help those groups under this new banner).</p>\n<p>Challenges of forming an RSE group:</p>\n<ul>\n<li>Visibility of the group to researchers</li>\n<li>Convincing higher-ups RSE is valuable</li>\n<li>Mostly working as HPC engineers, so time is limited for RSE jobs</li>\n<li>People come with R problems, but it's often a stats problem, and so not their area of expertise</li>\n</ul>\n<p>That last one is an interesting one that hasn't come up for us in our cluster in the EEG group, but perhaps that's because everyone knows I'm not good at R or stats :)</p>\n<p>It's not in my notes, but IIRC they hold an office hour once a week, rotating between members, to help people.</p>\n<h2>The periodic table: R package and visualized with ggplot2 by <a href=\"https://cv.solarchemist.se\">Taha Ahmed</a></h2>\n<p>Another data-visualisation talk, this time Taha Ahmed talking about an R package he built to make <a href=\"https://github.com/solarchemist/periodicdata\">a customisable version of the periodic table</a>. 
There was a lack of freely licensed periodic tables for customising, so he made his own.</p>\n<p>The internal data store is in yaml for both data and metadata, which are split into two parts; this is flexible, but gives rise to data-hygiene issues when reading yaml into R (the usual JSON/YAML issues with lack of typing).</p>\n<p>Works nicely in a notebook, you can set values per entry and visualise on the table.</p>\n<h2>Experiences from 'High-performance R' course by <a href=\"https://helijuottonen.wordpress.com\">Heli Juottonen</a>, <a href=\"https://csc.fi/en/\">CSC - IT Center for Science</a></h2>\n<p>The next talk was by Heli Juottonen talking about how at CSC they try to teach people to use R in an HPC context and the <a href=\"https://github.com/csc-training/high-performance-r\">training course</a> they run. The slides for this talk are <a href=\"https://a3s.fi/heli/hpr_hj_nrse25_wide.pdf\">here</a>. The course was made by a pair of people: one a biologist looking at R as a tool, and the other a comp-sci.</p>\n<p>Heli maintains R on a supercomputer; they use <a href=\"https://docs.csc.fi/apps/r-env/\">r-env</a> for running R on the HPC machines.</p>\n<p>Common question: "Blah in R is taking too long, it runs out of memory, what now?" 
This certainly echoes the questions we get on our computer cluster, and it's frustrating (to Michael) that it's so hard to answer such seemingly simple questions - (though not unexpected: yay, <a href=\"https://en.wikipedia.org/wiki/Halting_problem\">halting problem</a>).</p>\n<p>The course aims:</p>\n<ul>\n<li>Understanding resource usage, finding bottlenecks</li>\n<li>Parallel and distributed computing</li>\n</ul>\n<p>Audience:</p>\n<ul>\n<li>RStudio user on supercomputer but doesn't know how to utilise the resources well</li>\n</ul>\n<p>It's a two-day course, with the first day being about measurement, and the second day about batch jobs and dealing with distribution over multiple cores/nodes.</p>\n<p>One problem they hit was with users bringing their own data - it needs cleaning before use, which slows down the course.</p>\n<h2>N-min-ct - a research software to tell how sustainable is the neighbourhood. by <a href=\"https://www.ntnu.edu/employees/ruslan.zhuravchak\">Ruslan Zhuravchak</a>, Norwegian University of Science and Technology</h2>\n<p>Ruslan Zhuravchak gave a talk on how he helped implement a project to facilitate the interactive analysis of urban form and mobility in smart (15-minute) cities, and of various relevant performance metrics as suggested by <a href=\"https://fmezen.no\">FME/ZEN</a> - Zero Emission Neighbourhoods.</p>\n<p>The project was to try to assess how well a city is meeting ZEN KPIs based on sensor data.</p>\n<p>Unfortunately the project was only internally available, so whilst we got a demo which was quite interesting, I can't link to it, alas. I had a chat with Ruslan afterwards, and he hopes to get it published. 
In the EEG group we have a few people working on smart city metrics, and Ruslan seemed keen to chat to those interested.</p>\n<h2>Modern HPC-access alternatives for the bash averse by <a href=\"https://www.chalmers.se/en/persons/vikren/\">Viktor Rehnberg</a>, Chalmers University of Technology</h2>\n<p>This session was quite interesting to me, as I semi-manage a compute "cluster" of machines shared by a set of ecologists. This is not an HPC setup, rather a set of very-large regular computers (256 cores per machine, 1TB RAM, etc.). We have deliberately taken a hands-off approach to access, just leaving it as ssh access to a particular machine, and whilst we've got away with that, I'd like to see what else we can do here.</p>\n<p>One of the themes I've seen consistently in this conference is the adoption of <a href=\"https://slurm.schedmd.com/overview.html\">Slurm</a>, as was the case here. This talk wasn't about Slurm per se, but it did show me different ways our compute cluster could be presented, even if this talk was about HPC and we're just a bag of big computers (BBoC :).</p>\n<p>Viktor Rehnberg gave this talk, and he started by trying to define what an HPC cluster is:</p>\n<ul>\n<li>Users access over a login node</li>\n<li>It contains many nodes</li>\n<li>Is managed by a scheduler (typically Slurm)</li>\n<li>Has shared storage for the nodes (which enables the scheduler to distribute jobs)</li>\n</ul>\n<p>By this definition, perhaps our BBoC does count as HPC; all it's missing is the login node and the scheduler. I usually think of HPC as having slightly more GPUs or other specialist hardware to them.</p>\n<p>The typical way you'd access the HPC is you ssh to the login node, use the command line to make jobs via the scheduler, and then it'll run your work at some point as resources allow. 
Currently we run a social scheduler (aka a Slack channel that people rarely use), and quite often I have to go nag people about this.</p>\n<p>The other topic that came up in a lunch discussion (I think with Gregor and Maria) was the realisation that, by not using Slurm, which is the de-facto standard for HPC, we're not preparing our users for when they migrate up to proper HPC. There will always be a lot of learning needed when moving from a big normal computer to a dedicated HPC, but if we made our environment a little more like this it might both make things run smoother and help our users prepare for the wider world of scientific computing? In the past I've been resistant to using Slurm purely because it adds overhead, but now we have more help on the server management side, perhaps it's time to reconsider that.</p>\n<p>Anyway, back to the talk! The main thrust of Viktor's talk was: what if you don't want to use ssh, can you use other tools to access the HPC? Graphical tools for instance. The answer was yes, and the options he presented were:</p>\n<ol>\n<li>Use X-forwarding - as an old person I love that this is still considered an option</li>\n<li>Remote desktop - <a href=\"https://www.cendio.com\">ThinLinc</a> is the most common, but commercial. When you connect you are still using <code>sbatch</code> etc. from a terminal to launch jobs, but matlab etc. can also x-forward from compute nodes.</li>\n<li><a href=\"https://github.com/lunarc/gfxlauncher\">Gfxlauncher</a> - runs on the login node</li>\n<li>IDEs like <a href=\"https://code.visualstudio.com\">Visual Studio Code</a> or <a href=\"https://www.jetbrains.com/pycharm/\">PyCharm</a>, using ssh remote. 
I suspect VSCode is what most of our ecologists use to access the BBoC.</li>\n<li>Language environments like <a href=\"https://jupyter.org\">Jupyter</a> (for Python), <a href=\"https://posit.co/download/rstudio-server/\">RStudio Server</a>, matlab-proxy, which can be tunnelled over ssh.</li>\n<li>Web portals that set up the above, like <a href=\"https://openondemand.org\">Open OnDemand</a></li>\n<li>Language libraries that let you code up job submission: e.g., <a href=\"https://github.com/facebookincubator/submitit\">submitit</a> and <a href=\"https://www.nextflow.io\">nextflow</a></li>\n</ol>\n<h2>Training Triumphs: Maximizing Impact with External Domain Experts by <a href=\"https://liu.se/en/employee/yonwa58\">Yonglei Wang</a>, <a href=\"https://liu.se\">Link\u00f6pings Universitet</a></h2>\n<p>Yonglei Wang works at Link\u00f6pings Universitet in the <a href=\"https://liu.se/en/organisation/liu/nsc\">national supercomputer centre</a> there, and gave an overview of all the various bits of training that are available, specifically from ENCCS - <a href=\"https://enccs.se\">EuroCC National Competence Centre Sweden</a>. It aims to enable HPC, AI, and high-performance data analytics (HPDA).</p>\n<p>They run domain-specific events: bio-molecular, fluid dynamics, quantum chemistry, quantum computing - no ecology! They have had 3600 participants: 80% academic, 8% public sector, 7% large companies, and 5% SMEs. Gender breakdown was 23% female, 73% male.</p>\n<p>There was a long list of the training, but alas too much for me to note here - check out the <a href=\"https://enccs.se/lessons/\">ENCCS lessons list</a> for more - but there's definitely some <a href=\"https://enccs.github.io/gpu-programming/\">I want to check out</a>.</p>\n<h2>My discussion session on lineage in scientific data-processing pipelines</h2>\n<p>This I'll write up and link to shortly as an independent post! 
But it was (at least from my perspective) a success, with many interesting tools and techniques I'd not been aware of before. I'll write that up as a proper post so everyone can share in the results soon!</p>\n<h1>Misc other notes</h1>\n<ul>\n<li>Oxford has a <a href=\"https://www.rse.ox.ac.uk\">dedicated RSE group</a>, hat tip to Gregor Decristoforo for pointing them out to me.</li>\n<li><a href=\"https://digital-strategy.ec.europa.eu/en/policies/ai-factories\">AI Factories</a> seemed to be somewhat of a contentious term I'd not come across before, which seems to be an EU initiative to power AI projects. I suspect it's seen as hype that is draining money and people don't quite know what it means.</li>\n<li>The discussion/panel sessions were run via the audience <a href=\"https://hackmd.io\">collaboratively editing a markdown document</a> that was on the projector, with the moderator calling out interesting things and asking whoever wrote them to speak a little on that. As a technique it worked really well with this size of audience, both live and in leaving everyone with notes for afterwards!</li>\n<li>Nordic-RSE 2026 will be in Troms\u00f8, Jun 9-10!</li>\n</ul>",
+14
mwd/blog_pandas-vs-efficiency_.json
···+"summary": "<p>As part of my role at the <a href=\"https://4c.cst.cam.ac.uk\">Cambridge Center for Carbon Credits (4C)</a>, working closely with ecologists on processing data, I've become an accidental data scientist, by which I mean instead of loading CSV files directly I now use <a href=\"https://pandas.pydata.org\">pandas</a>. Pandas is a popular library that makes working with large arrays of data easier. Unlike <a href=\"https://numpy.org\">numpy</a>, which is there to make it easy to process data in multidimensional arrays, pandas works one abstraction higher, giving columns names and treating the data as tabular, rather than just raw numbers. But similar to numpy, it provides nice syntax to help you work with that data without worrying about individual elements: you tend to work at the tabular level. That makes it particularly powerful for someone who isn't a native programmer, but rather a domain expert in whatever they have data about, trying to process that data to derive a new insight.</p>\n<p>I'm quite a fan of numpy: it makes it simple to reason about large amounts of data without worrying about each element,\nand at the same time it is really quite efficient at doing so. I recently tried rewriting some Python code that used numpy in a compiled language thinking it'd be faster, but under the hood numpy already uses compiled code to do vectorized operations, and as a result my native code version was no faster and harder to read.</p>\n<p>So given its popularity, and the fact that it uses numpy under the hood, I'd assumed that pandas would similarly provide that double win of simplicity of expression with efficiency of computation, but I was mistaken: using pandas to process data turned out to be very inefficient. 
In this post I'm going to walk through a particular problem I was trying to solve, then look into how I managed to speed it up, and then worry about what this means for the regular data scientist who isn't also steeped in computer-science knowledge.</p>\n<h1>The problem spec</h1>\n<p>I was implementing some code that tried to find pairings between two sets of data, which we shall refer to as Set K and Set S (as that's what they were called in my code :). The theory is that for every element in K, we want to find the closest match in S, based on two criteria. For certain properties of our element in K there must be direct matches on the matching element, as in they must have the same value. Then for other properties we just want to find the closest approximation.</p>\n<p>To make that more concrete, the data I'm dealing with are points of ecological interest, so set K is a set of points in a region of interest, and I'm trying to find the closest match in some other area so I can then do other comparisons later. Certain properties must match absolutely, such as the type of land for the pixel (land use class), and the regional biome class (ecoregion), but for other properties like elevation and distance from population I'm only interested in finding a rough match. For that rough match, because you might get conflicting nearnesses across say elevation and distance from population, we're going to use an off-the-shelf distance function that takes multiple variables and gives you a single distance value, called the <a href=\"https://en.wikipedia.org/wiki/Mahalanobis_distance\">Mahalanobis distance</a>. 
It doesn't really matter for this discussion what that is; just know that when you see it in the rest of this document, that's what it's doing.</p>\n<p>Now, there are many ways to frame this problem, but I'm going to start with the naive pandas implementation, as I think it does a very good job on the simplicity-of-expression side of things.</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\n# This statement does the first half of the matching!\n# The result is a new table that has every match between K and S on it\n# with the values from both, so can be quite large!\nk_and_s_joined = k_set.merge(\n\ts_set,\n\ton=['ecoregion', 'luc10', 'luc5', 'luc0'],\n\thow='inner',\n\tsuffixes=['_k', '_s']\n)\n\n# This is our function to apply to each of our initial matches.\n# Note that each row contains both the data of K and S, which is\n# why this just takes one row as a parameter.\ndef distance(row: Series, iconv: np.ndarray) -> float:\n\treturn mahalanobis(\n\t\t(row.elevation_k, row.slope_k, row.population_k),\n\t\t(row.elevation_s, row.slope_s, row.population_s),\n\t\ticonv\n\t)\n\n# Now we make a new column of data, which is the distance in each\n# row between the K data and S data, and add that back into our table\nk_and_s_joined['distance'] = k_and_s_joined.apply(\n\tdistance,\n\taxis=1,\n\targs=(invconv,)\n)\n\n# Now we just want for each element in K, one result where the distance\n# value is the lowest, so we group the results based on K's lat/lng,\n# and pick the row with the smallest distance in each group\nresult = k_and_s_joined.loc[\n\tk_and_s_joined.groupby(['lat_k', 'lng_k'])['distance'].idxmin()\n]\n\n# Finally we can save that result!\nresult.to_parquet('result.parquet')\n</code></pre>\n<p>I mostly like this code. 
There is some algorithmic know-how required certainly, around the idea of the merge/join and the groupby/min, but if you've taken time to learn pandas, this is a nice succinct way to record what's going on: your code does not obfuscate the methodology unnecessarily.</p>\n<p>Unfortunately, in terms of performance this code is terrible.</p>\n<p>It is both slow to execute (I have to confess, I never let this version finish, as with my data it took more than a few hours), and very memory hungry. I'm now going to move through a few versions where I rework this to get it to a good place in terms of performance, all of which will come at the expense of clarity of intent in the code.</p>\n<h1>Too much data</h1>\n<p>The first thing I want to tackle is just the memory usage. For me the sets S and K are usually in the tens of thousands of values. If we assume that there is a fairly high hit rate on the first stage matching, this means that the table <code>k_and_s_joined</code> is going to be in the tens of millions, which is unfortunate as most of that data will be thrown away, because ultimately we want one match per element in K.</p>\n<p>When I ran this with my dataset the Python process was sat at around 60GB, which is quite a lot of memory to be using - on most personal computers and laptops that would not fit for instance. 
We have some large compute servers where this is not an issue, but having so much memory in use means I can't run many copies of this code at once, so most of the many CPU cores we have sit idle.</p>\n<p>So the first thing I'm going to do is not merge K and S with a join, but split this into a set of loops, so that we only have one copy of each set in memory, rather than the product of the two:</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\nresults = []\nfor _, k_row in k_set.iterrows():\n\n\t# Create a new set which is just the rows in s_set\n\t# that have a direct match to the row in K we're\n\t# dealing with - equivalent to joined_k_and_s for\n\t# a single row of K\n\tfiltered_s = s_set[\n\t\t(s_set.ecoregion == k_row.ecoregion) &\n\t\t(s_set.luc10 == k_row.luc10) &\n\t\t(s_set.luc5 == k_row.luc5) &\n\t\t(s_set.luc0 == k_row.luc0)\n\t]\n\n\t# If there are no matches move on. 
This isn't just an\n\t# optimisation, it's to avoid exceptions when later on\n\t# we try to take a result!\n\tif len(filtered_s) == 0:\n\t\tcontinue\n\n\t# This is our function to apply to each of our initial matches.\n\t# Note that each row contains both the data of K and S, which is\n\t# why this just takes one row as a parameter.\n\tdef distance(s_row: Series, k_row: Series, iconv: np.ndarray) -> float:\n\t\treturn mahalanobis(\n\t\t\t(k_row.elevation, k_row.slope, k_row.population),\n\t\t\t(s_row.elevation, s_row.slope, s_row.population),\n\t\t\ticonv\n\t\t)\n\n\t# Now we make a new column of data, which is the distance in each\n\t# row between the K data and S data, and add that back into our table\n\tfiltered_s['distance'] = filtered_s.apply(\n\t\tdistance,\n\t\taxis=1,\n\t\targs=(k_row, invconv,)\n\t)\n\n\t# Now find the one result where the distance value is the lowest.\n\tminimum_row = filtered_s[filtered_s.distance==filtered_s.distance.min()].iloc[0]\n\tresults.append(minimum_row)\n\n# Finally we can save that result!\npd_results = pd.DataFrame(results)\npd_results.to_parquet('result.parquet')\n</code></pre>\n<p>On the plus side, this code now uses much less memory! With the same sample data I'm now only using around 5GB of memory, which means we're now into the realm of being able to run this on a personal computer, or I can run ten times as many instances of this process concurrently on my server. Not only that, but this version runs faster too - completing in around 75 minutes on my dataset.</p>\n<p>The cost is that the code is now further away from the methodology; it's harder to read at a glance to see what it's doing. 
I'm having to somewhat micromanage the computer by telling it what to do for each element of the set K rather than letting the computer figure out what's best.</p>\n<p>This already annoys me - I've not really done much but I've already got a huge win in terms of performance of my code, and I feel really I should have been able to get the computer to figure this out for me. But what annoys me more is that as a computer scientist I knew to do this, but pandas is meant for data scientists who are experts in domains other than computing, and here we are forcing them to become people who understand the way programs use memory. And for my program to get better, this burden is going to get yet worse.</p>\n<h1>Why is this taking an hour?</h1>\n<p>At this point, running on the computer I have access to with 100s of CPU cores and enough memory I can use all those CPUs with the 5GB per process I have, I was ready to move on. But then we ran this code on a more reasonable computer and it took three hours to run for this data set, and longer for the next batch, and so I was forced to go back to the code, and wonder: why is it so slow?</p>\n<p>Is it because the Mahalanobis calculation is very slow? Is it that doing filtering on pandas data sets is very slow? This code doesn't really do much, and even if you think we need to process tens of millions of rows, computers are really fast these days: a GHz processor will do a billion operations per second, and so the math really shouldn't be slowing it down.</p>\n<p>Now, I could start putting in print statements with timestamps, but being a computerist I reached for <a href=\"https://docs.python.org/3/library/profile.html\">cProfile</a>, which is Python's built-in profiling library, and ran my code again. Profiling like this instruments the program, recording every function call and return, at a fine enough granularity that it'll even see what's happening inside function calls that complete very fast. 
The downside of this is that it will slow down the program - what took 75 minutes now took almost three hours to run.</p>\n<p>Still, run it did, and then I get an output that is just a list of all function calls made, how often they were made, how much time was spent in them, and how much of that time was spent in that function specifically rather than functions it called. On one hand, this is just another version of doing data science, only on the program itself, but again the data scientists I work with are experts in ecology not computering, and so I'd not say that this sort of program introspection is something they'd benefit from.</p>\n<pre><code>\t\t 25949560575 function calls (25949402038 primitive calls) in 9805.165 seconds\n\n Ordered by: standard name\n\n ncalls tottime percall cumtime percall filename:lineno(function)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:100(acquire)\n\t 32/6 0.000 0.000 0.202 0.034 <frozen importlib._bootstrap>:1022(_find_and_load)\n\t\t3 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:1038(_gcd_import)\n\t 1646 0.004 0.000 0.007 0.000 <frozen importlib._bootstrap>:1053(_handle_fromlist)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:125(release)\n\t 32 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:165(__init__)\n\t 32 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:169(__enter__)\n\t 32 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:173(__exit__)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:179(_get_module_lock)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:198(cb)\n\t\t3 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:216(_lock_unlock_module)\n\n...1263 lines later...\n\n802394413 88.968 0.000 88.968 0.000 {pandas._libs.lib.is_scalar}\n42419 0.008 0.000 0.008 0.000 {pandas._libs.lib.item_from_zerodim}\n 6546 18.868 0.003 18.931 0.003 {pandas._libs.lib.maybe_convert_objects}\n\t2 0.000 0.000 0.000 0.000 
{pandas._libs.lib.to_object_array_tuples}\n\t1 0.000 0.000 0.000 0.000 {pandas._libs.lib.to_object_array}\n\t1 0.179 0.179 0.179 0.179 {pyarrow._s3fs.ensure_s3_initialized}\n\t1 0.000 0.000 0.000 0.000 {pyarrow.lib.cpu_count}\n 12 0.000 0.000 0.000 0.000 {pyarrow.lib.field}\n\t2 0.000 0.000 0.000 0.000 {pyarrow.lib.int64}\n\t2 0.000 0.000 0.000 0.000 {pyarrow.lib.register_extension_type}\n\t1 0.000 0.000 0.000 0.000 {pyarrow.lib.schema}\n\t1 0.000 0.000 0.000 0.000 {pyarrow.lib.struct}\n\t2 0.298 0.149 0.298 0.149 {pyarrow.lib.table_to_blocks}\n</code></pre>\n<p>As an aside, note the top line: 25.9 billion function calls! That's a lot of function calls, just to process tens of millions of rows of data. But I guess those calls add up quickly when you're working with data this big.</p>\n<p>Anyway, most of the information was not interesting, but two things stood out. Firstly was this line:</p>\n<pre><code>\t 1631 0.020 0.000 9792.367 6.004 frame.py:9266(apply)\n</code></pre>\n<p>This tells me that apply is being called 1631 times, which is once per entry in K for this run, which is what I'd expect, but it also tells me that 9792 seconds were spent in apply, which means the apply call is where we spend most of our time! So we have a good clue here: of the two stages of filtering the data, it's not the explicit matching stage that's slow, but working out the distances.</p>\n<p>The obvious conclusion to jump to then would be that it's the distance function itself that is slow, but if we find that in the profiler output:</p>\n<pre><code> 89146415 468.679 0.000 874.615 0.000 distance.py:979(mahalanobis)\n</code></pre>\n<p>We can see this is called a lot, nearly 90 million times, which is the product of K and S after you filter out the first stage matching, but it only accounts for a small fraction of our 9792 seconds. Where is the rest of the time going? 
So I scroll on and then I spot this:</p>\n<pre><code>802376453 829.165 0.000 7993.752 0.000 generic.py:5975(__getattr__)\n</code></pre>\n<p>Now, unless you understand how Python works under the hood, this is just yet another internal call that Python does that you have no control over, but because this isn't my first rodeo, I happen to know what this means, and what it is telling me. Python's <code>getattr</code> is used when you try to access a property on an object in Python. We know that this is happening in the loop of apply, and we can see that it's being called a lot, and so from that I can infer it's this code here that's the problem:</p>\n<pre><code>\t(row.elevation_k, row.slope_k, row.population_k),\n\t(row.elevation_s, row.slope_s, row.population_s),\n</code></pre>\n<p>The problem is when we access the data on the row by name like this. Pandas has been super helpful and made it possible for us to access the data in each column by name as if it was a property on the row, but in practice to do this it has to do a bunch of look-up work to make this happen, going back to the table, finding the column names, checking you have provided one that is right, then finding the data and passing it back. It turns out that, whilst this might be fast once, if you do it a lot of times it all adds up.</p>\n<p>In fact, confession time, the code I'm showing you here is a simplified version of the real code, which used a lot more variables, and looked like this:</p>\n<pre><code>\t# The data for this row will be the same every time,\n\t# so don't do it in the loop\n\tk_info = (k_row.elevation, k_row.slope, k_row.population,\n\t\tk_row.cpc0_u, k_row.cpc0_d,\n\t\tk_row.cpc5_u, k_row.cpc5_d,\n\t\tk_row.cpc10_u, k_row.cpc10_d)\n\n\t# This is our function to apply to each of our initial matches.\n\tdef distance(s_row: Series, k_info: Tuple, iconv: np.ndarray) -> float:\n\t\treturn mahalanobis(\n\t\t\tk_info,\n\t\t\t(s_row.elevation, s_row.slope, 
s_row.population,\n\t\t\t\ts_row.cpc0_u, s_row.cpc0_d,\n\t\t\t\ts_row.cpc5_u, s_row.cpc5_d,\n\t\t\t\ts_row.cpc10_u, s_row.cpc10_d),\n\t\t\ticonv\n\t\t)\n\n\t# Now we make a new column of data, which is the distance in each\n\t# row between the K data and S data, and add that back into our table\n\tfiltered_s['distance'] = filtered_s.apply(\n\t\tdistance,\n\t\taxis=1,\n\t\targs=(k_info, invconv,)\n\t)\n</code></pre>\n<p>I'd already pulled out the calculation of the bit of K I needed from the apply loop, out of habit - as someone who's coded a bunch I know that if I can do a thing once and re-use that result it's almost always better to do so. So my instinct had saved me from being even slower here. And now you can see the numbers add up - we process 90 million rows, and we make a tuple from 9 fields inside that loop, which is our 800 million calls to getattr!</p>\n<p>So what can one do about this? Well, for better or worse (better in this case, worse in general) there are multiple ways in pandas to achieve the same thing. Rather than access each item on the row by a property on the object, I can just pass a list of column names, and it'll narrow things down for me. So now my code is:</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\nresults = []\nfor _, k_row in k_set.iterrows():\n\n\t# Create a new set which is just the rows in s_set\n\t# that have a direct match to the row in K we're\n\t# dealing with - equivalent to joined_k_and_s for\n\t# a single row of K\n\tfiltered_s = s_set[\n\t\t(s_set.ecoregion == k_row.ecoregion) &\n\t\t(s_set.luc10 == k_row.luc10) &\n\t\t(s_set.luc5 == k_row.luc5) &\n\t\t(s_set.luc0 == k_row.luc0)\n\t]\n\n\t# If there are no matches move on. 
This isn't just an\n\t# optimisation, it's to avoid exceptions when later on\n\t# we try to take a result!\n\tif len(filtered_s) == 0:\n\t\tcontinue\n\n\t# The data for this row will be the same every time,\n\t# so don't do it in the loop\n\tk_info = k_row[["elevation", "slope", "population"]]\n\n\t# This is our function to apply to each of our initial matches.\n\tdef distance(s_row: Series, k_info: Series, iconv: np.ndarray) -> float:\n\t\treturn mahalanobis(\n\t\t\tk_info,\n\t\t\ts_row[["elevation", "slope", "population"]],\n\t\t\ticonv\n\t\t)\n\n\t# Now we make a new column of data, which is the distance in each\n\t# row between the K data and S data, and add that back into our table\n\tfiltered_s['distance'] = filtered_s.apply(\n\t\tdistance,\n\t\taxis=1,\n\t\targs=(k_info, invconv,)\n\t)\n\n\t# Now find the one result where the distance value is the lowest.\n\tmin_distance = filtered_s.distance.min()\n\tminimum_row = filtered_s[filtered_s.distance==min_distance].iloc[0]\n\tresults.append(minimum_row)\n\n# Finally we can save that result!\npd_results = pd.DataFrame(results)\npd_results.to_parquet('result.parquet')\n</code></pre>\n<p>The code has hardly changed here, we're just using a list of column names rather than directly accessing the values on the row in turn, but this dropped the run time of the program with my sample data from 75 minutes to just under 10 minutes!</p>\n<p>This is one tiny change, but the method by which I discovered it was not obvious, and I'd argue not easily discoverable by someone who's an expert ecologist data scientist. Perhaps this tip might have been listed somewhere and thus they'd know to avoid this, but that solution doesn't scale well. How many other tips are there out there that they're missing out on? By looking more into the profile output I found some other small performance wins, but what's interesting isn't those wins, but the level of knowledge of how computer programs work required to know to apply them. 
Pandas does such a good job of helping at a semantic level, but to get good performance out of it requires a whole other level of expertise. This is in contrast to say numpy, which (albeit in a different domain) manages to pull off the trick of providing both semantic clarity and computational efficiency. Even numpy will, eventually, break down this way, but the non-computer-domain-expert will get further before they hit that.</p>\n<p>This is another rendition of the tension <a href=\"/blog/yirgacheffe/\">I highlighted a few posts ago</a>, as captured in the <a href=\"https://dreamsongs.com/RiseOfWorseIsBetter.html\">\u201cWorse is Better\u201d</a> trilogy of papers by Richard P. Gabriel, between:</p>\n<ul>\n<li>\u201cThe right thing\u201d - having an elegant interface with all the icky complexity hidden inside</li>\n<li>\u201cWorse is better\u201d - having an elegant implementation that exposes the underlying complexities to the system\u2019s user</li>\n</ul>\n<p>At some point "The right thing" will break down, stranding the user, which is what is happening with pandas here. The counter argument is that you should make the user understand the complexity from the start so they're prepared for this. My personal preference is to try to push "The right thing" as far as you can and then provide ways to flag what's going wrong - more people are enabled by doing the former than will succeed at the latter, and I'd rather enable ecologists to save the planet, even if that's sometimes inefficient. But I digress, as I have one more stage of performance work that I did, which kinda sidesteps that entire debate.</p>\n<h1>Using pandas where it's good, then getting it out the way</h1>\n<p>Recently I made a (poor) joke to my partner that I realised I'd become a data scientist when I started opening CSV files with pandas rather than just reading the contents directly and splitting the file up myself, as was my habit before this last year. 
The nugget of gold in that glib statement is that, despite my lambasting it thus far, pandas is really good when doing its thing. Pandas makes it really easy to reason about tables of data when you're not worrying about individual values, but it seems to struggle when doing bulk calculations on that data; but I've already said that was an area where numpy is good, so why not just let each side do what it's best at?</p>\n<p>Thus, I eventually ran with this code, where I use pandas to do everything up to the point where I have to access discrete values, at which point I move the data wholesale into numpy world:</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\nresults = []\nfor _, k_row in k_set.iterrows():\n\n\t# Create a new set which is just the rows in s_set\n\t# that have a direct match to the row in K we're\n\t# dealing with - equivalent to joined_k_and_s for\n\t# a single row of K\n\tfiltered_s = s_set[\n\t\t(s_set.ecoregion == k_row.ecoregion) &\n\t\t(s_set.luc10 == k_row.luc10) &\n\t\t(s_set.luc5 == k_row.luc5) &\n\t\t(s_set.luc0 == k_row.luc0)\n\t]\n\n\t# If there are no matches move on. 
This isn't just an\n\t# optimisation, it's to avoid exceptions when later on\n\t# we try to take a result!\n\tif len(filtered_s) == 0:\n\t\tcontinue\n\n\t# The data for this row will be the same every time,\n\t# so don't do it in the loop\n\tk_info = np.array(k_row[["elevation", "slope", "population"]].tolist())\n\n\t# Select the data we need for the distance calculation, and\n\t# export that as a large numpy 2D array\n\ts_subset = filtered_s[["elevation", "slope", "population"]]\n\ts_subset_raw = s_subset.to_numpy()\n\n\t# Now work over the numpy array to find the minimum distance\n\tmin_distance = float("inf")\n\tmin_index = None\n\tfor index in range(len(s_subset_raw)):\n\t\ts_info = s_subset_raw[index]\n\t\tdistance = mahalanobis(k_info, s_info, invconv)\n\t\tif distance < min_distance:\n\t\t\tmin_distance = distance\n\t\t\tmin_index = index\n\n\t# Now find the corresponding data in the original pandas data\n\tminimum_row = filtered_s.iloc[min_index]\n\tresults.append(minimum_row)\n\n# Finally we can save that result!\npd_results = pd.DataFrame(results)\npd_results.to_parquet('result.parquet')\n</code></pre>\n<p>The key bits to note here are that I used pandas to take the data I'd filtered at the first stage, and select just the columns I need for the distance comparison (a thing pandas is good at), and then convert the data straight to a large numpy array and process the data from that (handing over to a thing numpy is good at). 
I now have to do some more accounting as I iterate over the data and find the minimum, but the result was I dropped from 10 minutes to 6 minutes, getting me faster again, and I'm well below 10% of my original run time (not including the one that didn't finish!).</p>\n<p>The cost is that my code now is definitely very micro-managery, and doesn't reflect the original methodology very well - it's still following the methodology, but you need to reconstruct it from the code.</p>\n<h1>Why have you made me read all this?</h1>\n<p>There are two readings of this post. Firstly, if you're stuck trying to improve the performance of your pandas code, then consider exporting it to numpy if you're doing bulk calculations on the data rather than just dealing with columns etc. It'll save you some time and memory and your electricity bill will be lower. But then it'd also be valid to say for this kind of task you might also want to look at tools like <a href=\"https://spark.apache.org\">Spark</a> and <a href=\"https://dask.org/\">Dask</a> which do some of the lifting for you, at the expense of learning yet another framework properly before it'll really be able to help you.</p>\n<p>But secondly, and perhaps more interestingly: if you're an expert in a domain that isn't computer science, how do you figure this stuff out? Or, from my perspective: as someone making libraries for ecologists to use, how do I make it so they don't get into this trap? Perhaps it'd be better if pandas didn't have the apply function to loop over the data, and just had the "dump data to numpy" function instead? Providing nothing would have worked for me, as I already know numpy, but that might have just put off other data scientists.</p>\n<p>Or put another way, does everyone doing significant data science in all domains but one need to have a part-time computerist on their team? 
Should we just acknowledge that this stuff requires some under-the-hood systems knowledge to get right, and that the way forward is a pairing of experts? You hope that most of the time the tools do a good job, but at some point you want to have a systems expert review things. This falls down, I imagine, when it comes to funding - who wants to add another person to the project in the name of efficiency when you can kludge by and your budget is already tight?</p>\n<p>I don't know what the answer is, but I do know that having to apply me to even a small set of ecologists doesn't scale, and given the state of the climate we need to be enabling as many ecologists as we can. So with projects like <a href=\"https://github.com/carboncredits/yirgacheffe/\">yirgacheffe</a> I plan to continue trying to do "the right thing" to empower and enable my ecologist colleagues, but then perhaps I need to learn to explicitly signal when my way isn't the best way and perhaps expert review is needed.</p>",+"content": "<p>As part of my role at the <a href=\"https://4c.cst.cam.ac.uk\">Cambridge Centre for Carbon Credits (4C)</a>, working closely with ecologists on data processing, I've become an accidental data scientist, by which I mean instead of loading CSV files directly I now use <a href=\"https://pandas.pydata.org\">pandas</a>. Pandas is a popular library that makes working with large arrays of data easier. Unlike <a href=\"https://numpy.org\">numpy</a>, which is there to make it easy to process data in multidimensional arrays, pandas works one abstraction higher, giving columns names and treating the data as tabular, rather than just raw numbers. 
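</p>\n<p>As a toy contrast (the values here are made up purely for illustration), here is the same two-column data set as a raw numpy array versus a pandas table:</p>\n<pre><code>import numpy as np\nimport pandas as pd\n\n# numpy: just raw numbers, accessed by position\nraw = np.array([[120.0, 3.5], [88.0, 1.2]])\nelevations = raw[:, 0]\n\n# pandas: the same data, but the columns carry names\ndf = pd.DataFrame(raw, columns=["elevation", "slope"])\nelevations = df["elevation"]\n</code></pre>\n<p>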
But similar to numpy, it provides nice syntax to help you work with that data without worrying about individual elements: you tend to work at the tabular level, and that makes it particularly powerful for someone who isn't a native programmer, but rather a domain expert in whatever the data is about, trying to process that data to derive a new insight.</p>\n<p>I'm quite a fan of numpy: it makes it simple to reason about large amounts of data without worrying about each element,\nand at the same time it is really quite efficient at doing so. I recently tried rewriting some Python code that used numpy in a compiled language, thinking it'd be faster, but under the hood numpy already uses compiled code to do vectorised operations, so it is actually quite efficient, and as a result my native code version was no faster and harder to read.</p>\n<p>So given its popularity, and the fact that it uses numpy under the hood, I'd assumed that pandas would similarly provide that double win of simplicity of expression with efficiency of computation, but I was mistaken: using pandas to process data turned out to be very inefficient. In this post I'm going to walk through a particular problem I was trying to solve, then look into how I managed to speed it up, and then worry about what this means for the regular data scientist who isn't also steeped in computer-science knowledge.</p>\n<h1>The problem spec</h1>\n<p>I was implementing some code that tried to find pairings between two sets of data, which we shall refer to as Set K and Set S (as that's what they were called in my code :). The theory is that for every element in K, we want to find the closest match in S, based on two criteria. For certain properties of our element in K there must be a direct match on the same property of the matching element, as in they must have the same value. 
Then for other properties we just want to find the closest approximation.</p>\n<p>To make that more concrete, the data I'm dealing with are points of ecological interest, so set K is a set of points in a region of interest, and I'm trying to find the closest match in some other area so I can then do other comparisons later. Certain properties must match absolutely, such as the type of land for the pixel (land use class), and the regional biome class (ecoregion), but then for other properties like elevation and distance from population I'm only interested in finding a rough match. For that rough match, because you might get conflicting nearnesses across, say, elevation and distance from population, we're going to use an off-the-shelf distance function that takes multiple variables and gives you a single distance value, called a <a href=\"https://en.wikipedia.org/wiki/Mahalanobis_distance\">Mahalanobis distance</a>. It doesn't really matter for this discussion what that is; just know that when you see it in the rest of this document, that's what it's doing.</p>\n<p>Now, there are many ways to frame this problem, but I'm going to start with the naive pandas implementation, as I think it does a very good job on the simplicity-of-expression side of things.</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\n# This statement does the first half of the matching!\n# The result is a new table that has every match between K and S on it\n# with the values from both, so can be quite large!\nk_and_s_joined = k_set.merge(\n\ts_set,\n\ton=['ecoregion', 'luc10', 'luc5', 'luc0'],\n\thow='inner',\n\tsuffixes=['_k', '_s']\n)\n\n# This is our function to apply to each of our initial matches.\n# Note that each row contains both the data of K and S, 
which is\n# why this just takes one row as a parameter.\ndef distance(row: Series, iconv: np.ndarray) -> float:\n\treturn mahalanobis(\n\t\t(row.elevation_k, row.slope_k, row.population_k),\n\t\t(row.elevation_s, row.slope_s, row.population_s),\n\t\ticonv\n\t)\n\n# Now we make a new column of data, which is the distance in each\n# row between the K data and S data, and add that back into our table\nk_and_s_joined['distance'] = k_and_s_joined.apply(\n\tdistance,\n\taxis=1,\n\targs=(invconv,)\n)\n\n# Now we just want, for each element in K, the one result where the\n# distance value is the lowest, so we group the results based on K's\n# lat/lng and pick the row with the smallest distance in each group\nresult = k_and_s_joined.loc[\n\tk_and_s_joined.groupby(['lat_k', 'lng_k'])['distance'].idxmin()\n]\n\n# Finally we can save that result!\nresult.to_parquet('result.parquet')\n</code></pre>\n<p>I mostly like this code. There is certainly some algorithmic know-how required, around the idea of the merge/join and the groupby/idxmin, but if you've taken time to learn pandas, this is a nice succinct way to record what's going on: your code does not obfuscate the methodology unnecessarily.</p>\n<p>Unfortunately, in terms of performance this code is terrible.</p>\n<p>It is both slow to execute (I have to confess, I never let this version finish, as with my data it took more than a few hours), and very memory hungry. I'm now going to move through a few versions where I rework this to get it to a good place in terms of performance, all of which will come at the expense of clarity of intent in the code.</p>\n<h1>Too much data</h1>\n<p>The first thing I want to tackle is just the memory usage. For me the sets S and K are usually in the tens of thousands of values. 
If we assume that there is a fairly high hit rate on the first-stage matching, this means that the table <code>k_and_s_joined</code> is going to be in the tens of millions, which is unfortunate as most of that data will be thrown away, because ultimately we want one match per element in K.</p>\n<p>When I ran this with my dataset the Python process was sat at around 60GB, which is quite a lot of memory to be using - on most personal computers and laptops that would not fit, for instance. We have some large compute servers where this is not an issue, but having so much memory in use means I can't run many copies of this code at once, so most of the many CPU cores we have sit idle.</p>\n<p>So the first thing I'm going to do is not merge K and S with a join, but split this into a set of loops, so that we only have one copy of each set in memory, rather than the product of the two:</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\nresults = []\nfor _, k_row in k_set.iterrows():\n\n\t# Create a new set which is just the rows in s_set\n\t# that have a direct match to the row in K we're\n\t# dealing with - equivalent to joined_k_and_s for\n\t# a single row of K\n\tfiltered_s = s_set[\n\t\t(s_set.ecoregion == k_row.ecoregion) &\n\t\t(s_set.luc10 == k_row.luc10) &\n\t\t(s_set.luc5 == k_row.luc5) &\n\t\t(s_set.luc0 == k_row.luc0)\n\t]\n\n\t# If there are no matches move on. 
This isn't just an\n\t# optimisation, it's to avoid exceptions when later on\n\t# we try to take a result!\n\tif len(filtered_s) == 0:\n\t\tcontinue\n\n\t# This is our function to apply to each of our initial matches.\n\t# Note that the matching row from K is now passed in\n\t# separately, via the args parameter of apply.\n\tdef distance(s_row: Series, k_row: Series, iconv: np.ndarray) -> float:\n\t\treturn mahalanobis(\n\t\t\t(k_row.elevation, k_row.slope, k_row.population),\n\t\t\t(s_row.elevation, s_row.slope, s_row.population),\n\t\t\ticonv\n\t\t)\n\n\t# Now we make a new column of data, which is the distance in each\n\t# row between the K data and S data, and add that back into our table\n\tfiltered_s['distance'] = filtered_s.apply(\n\t\tdistance,\n\t\taxis=1,\n\t\targs=(k_row, invconv,)\n\t)\n\n\t# Now find the one result where the distance value is the lowest.\n\tminimum_row = filtered_s[filtered_s.distance==filtered_s.distance.min()].iloc[0]\n\tresults.append(minimum_row)\n\n# Finally we can save that result!\npd_results = pd.DataFrame(results)\npd_results.to_parquet('result.parquet')\n</code></pre>\n<p>On the plus side, this code now uses much less memory! With the same sample data I'm now only using around 5GB of memory, which means we're now into the realm of being able to run this on a personal computer, or I can run ten times as many instances of this process concurrently on my server. Not only that, but this version runs faster too - completing in around 75 minutes on my dataset.</p>\n<p>The cost is that the code is now further away from the methodology; it's harder to read at a glance to learn what it's doing. 
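</p>\n<p>If you want to sanity-check memory numbers like this on your own runs, Python's built-in <code>tracemalloc</code> module reports the peak memory allocated by Python code. A minimal sketch (the workload here is a made-up stand-in for the matching loop, and tracemalloc only sees Python-level allocations, so it will under-report what the OS sees):</p>\n<pre><code>import tracemalloc\n\ntracemalloc.start()\n\n# Stand-in workload: replace with the real matching loop\ndata = [list(range(1000)) for _ in range(1000)]\n\ncurrent, peak = tracemalloc.get_traced_memory()\ntracemalloc.stop()\nprint(f"peak Python memory: {peak / 1024 / 1024:.1f} MiB")\n</code></pre>\n<p>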
I'm having to somewhat micromanage the computer by telling it what to do for each element of the set K rather than letting the computer figure out what's best.</p>\n<p>This already annoys me - I've not really done much but I've already got a huge win in terms of performance of my code, and I feel really I should have been able to get the computer to figure this out for me. But what annoys me more is that as a computer scientist I knew to do this, but pandas is meant for data scientists who are experts in domains other than computing, and here we are requiring them to become people who understand the way programs use memory. And for my program to get better, this burden is going to get yet worse.</p>\n<h1>Why is this taking an hour?</h1>\n<p>At this point, running on the computer I have access to with 100s of CPU cores and enough memory I can use all those CPUs with the 5GB per process I have, I was ready to move on. But then we ran this code on a more reasonable computer and it took three hours to run for this data set, and longer for the next batch, and so I was forced to go back to the code, and wonder: why is it so slow?</p>\n<p>Is it because the Mahalanobis calculation is very slow? Is it that doing filtering on pandas data sets is very slow? This code doesn't really do much, and even if you think we need to process tens of millions of rows, computers are really fast these days: a GHz processor will do one billion operations per second, and so the math really shouldn't be slowing it down.</p>\n<p>Now, I could start putting in print statements with timestamps in, but being a computerist I reached for <a href=\"https://docs.python.org/3/library/profile.html\">cProfile</a>, which is the Python profiling library, and ran my code again. Profiling like this instruments my program so that every function call and return is recorded, at a fine enough granularity that it'll even see what's happening inside function calls that complete very fast. 
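</p>\n<p>For reference, the profiler can be driven entirely from Python. This sketch uses a placeholder <code>work</code> function standing in for the matching code, records a profile, and prints the ten most expensive calls by cumulative time:</p>\n<pre><code>import cProfile\nimport pstats\n\ndef work():\n\t# Placeholder for the slow matching code\n\treturn sum(i * i for i in range(1_000_000))\n\nprofiler = cProfile.Profile()\nprofiler.enable()\nwork()\nprofiler.disable()\n\n# Show the ten most expensive calls, sorted by cumulative time\nstats = pstats.Stats(profiler)\nstats.sort_stats("cumulative").print_stats(10)\n</code></pre>\n<p>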
The downside of this is that it will slow down the program - what took 75 minutes now took almost three hours to run.</p>\n<p>Still, run it did, and then I get an output that is just a list of all function calls made, how often they were made, how much time was spent in them, and how much of that time was spent in that function specifically rather than functions it called. On one hand, this is just another version of doing data science, only on the program itself, but again the data scientists I work with are experts in ecology not computering, and so I'd not say that this sort of program introspection is something they'd benefit from.</p>\n<pre><code>\t\t 25949560575 function calls (25949402038 primitive calls) in 9805.165 seconds\n\n Ordered by: standard name\n\n ncalls tottime percall cumtime percall filename:lineno(function)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:100(acquire)\n\t 32/6 0.000 0.000 0.202 0.034 <frozen importlib._bootstrap>:1022(_find_and_load)\n\t\t3 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:1038(_gcd_import)\n\t 1646 0.004 0.000 0.007 0.000 <frozen importlib._bootstrap>:1053(_handle_fromlist)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:125(release)\n\t 32 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:165(__init__)\n\t 32 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:169(__enter__)\n\t 32 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:173(__exit__)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:179(_get_module_lock)\n\t 35 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:198(cb)\n\t\t3 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:216(_lock_unlock_module)\n\n...1263 lines later...\n\n802394413 88.968 0.000 88.968 0.000 {pandas._libs.lib.is_scalar}\n42419 0.008 0.000 0.008 0.000 {pandas._libs.lib.item_from_zerodim}\n 6546 18.868 0.003 18.931 0.003 {pandas._libs.lib.maybe_convert_objects}\n\t2 0.000 0.000 0.000 0.000 
{pandas._libs.lib.to_object_array_tuples}\n\t1 0.000 0.000 0.000 0.000 {pandas._libs.lib.to_object_array}\n\t1 0.179 0.179 0.179 0.179 {pyarrow._s3fs.ensure_s3_initialized}\n\t1 0.000 0.000 0.000 0.000 {pyarrow.lib.cpu_count}\n 12 0.000 0.000 0.000 0.000 {pyarrow.lib.field}\n\t2 0.000 0.000 0.000 0.000 {pyarrow.lib.int64}\n\t2 0.000 0.000 0.000 0.000 {pyarrow.lib.register_extension_type}\n\t1 0.000 0.000 0.000 0.000 {pyarrow.lib.schema}\n\t1 0.000 0.000 0.000 0.000 {pyarrow.lib.struct}\n\t2 0.298 0.149 0.298 0.149 {pyarrow.lib.table_to_blocks}\n</code></pre>\n<p>As an aside, note the top line: 25.9 billion function calls! That's a lot of function calls, just to process tens of millions of rows of data. But I guess those calls add up quickly when you're working with data this big.</p>\n<p>Anyway, most of the information was not interesting, but two things stood out. Firstly was this line:</p>\n<pre><code>\t 1631 0.020 0.000 9792.367 6.004 frame.py:9266(apply)\n</code></pre>\n<p>This tells me that Apply is being called 1631 times, which is once per entry in K for this run, which is what I'd expect, but it also tells me that it spent 9792 seconds in apply, which means that the apply call for the code is where we spend most of our time! So we have a good clue here: of the two stages to filtering the data, it's not the explicit matching stage that's slow, but working out the distances.</p>\n<p>The obvious conclusion to jump to then would be that it's the distance function itself that is slow, but if we find that in the profiler output:</p>\n<pre><code> 89146415 468.679 0.000 874.615 0.000 distance.py:979(mahalanobis)\n</code></pre>\n<p>We can see this is called a lot, nearly 90 million times, which is the product of K and S after you filter out the first stage matching, but it only accounts for a small fraction of our 9792 seconds. Where is the rest of the time going? 
So I scroll on and then I spot this:</p>\n<pre><code>802376453 829.165 0.000 7993.752 0.000 generic.py:5975(__getattr__)\n</code></pre>\n<p>Now, unless you understand how Python works under the hood, this is just yet another internal call that Python does that you have no control over, but because this isn't my first rodeo, I happen to know what this means, and what it is telling me. Python's <code>getattr</code> is used when you try to access a property on an object in Python. We know that this is happening in the loop of apply, and we can see that it's being called a lot, and so from that I can infer it's this code here that's the problem:</p>\n<pre><code>\t(row.elevation_k, row.slope_k, row.population_k),\n\t(row.elevation_s, row.slope_s, row.population_s),\n</code></pre>\n<p>The problem is when we access the data on the row by name like this. Pandas has been super helpful and made it possible for us to access the data in each column by name as if it was a property on the row, but in practice to do this it has to do a bunch of lookup work to make this happen: going back to the table, finding the column names, checking you have provided one that is right, then finding the data and passing it back. Whilst that might be fast done once, if you do it a lot of times it all adds up.</p>\n<p>In fact, confession time, the code I'm showing you here is a simplified version of the real code, which used a lot more variables, and looked like this:</p>\n<pre><code>\t# The data for this row will be the same every time,\n\t# so don't do it in the loop\n\tk_info = (k_row.elevation, k_row.slope, k_row.population,\n\t\tk_row.cpc0_u, k_row.cpc0_d,\n\t\tk_row.cpc5_u, k_row.cpc5_d,\n\t\tk_row.cpc10_u, k_row.cpc10_d)\n\n\t# This is our function to apply to each of our initial matches.\n\tdef distance(s_row: Series, k_info: Tuple, iconv: np.ndarray) -> float:\n\t\treturn mahalanobis(\n\t\t\tk_info,\n\t\t\t(s_row.elevation, s_row.slope, 
s_row.population,\n\t\t\t\ts_row.cpc0_u, s_row.cpc0_d,\n\t\t\t\ts_row.cpc5_u, s_row.cpc5_d,\n\t\t\t\ts_row.cpc10_u, s_row.cpc10_d),\n\t\t\ticonv\n\t\t)\n\n\t# Now we make a new column of data, which is the distance in each\n\t# row between the K data and S data, and add that back into our table\n\tfiltered_s['distance'] = filtered_s.apply(\n\t\tdistance,\n\t\taxis=1,\n\t\targs=(k_info, invconv,)\n\t)\n</code></pre>\n<p>I'd already pulled the calculation of the bit of K I needed out of the apply loop, out of habit - as someone who's coded a bunch I know that if I can do a thing once and re-use that result it's almost always better to do so. So my instinct had saved me from being even slower here. So now you can see the numbers add up - we process 90 million rows, and we make a tuple from 9 fields inside that loop, which is our 800 million calls to getattr!</p>\n<p>So what can one do about this? Well, for better or worse (better in this case, worse in general) there are multiple ways in pandas to achieve the same thing. Rather than access each item on the row by a property on the object, I can just pass a list of column names, and it'll narrow things down for me. So now my code is:</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\nresults = []\nfor _, k_row in k_set.iterrows():\n\n\t# Create a new set which is just the rows in s_set\n\t# that have a direct match to the row in K we're\n\t# dealing with - equivalent to joined_k_and_s for\n\t# a single row of K\n\tfiltered_s = s_set[\n\t\t(s_set.ecoregion == k_row.ecoregion) &\n\t\t(s_set.luc10 == k_row.luc10) &\n\t\t(s_set.luc5 == k_row.luc5) &\n\t\t(s_set.luc0 == k_row.luc0)\n\t]\n\n\t# If there are no matches move on. 
This isn't just an\n\t# optimisation, it's to avoid exceptions when later on\n\t# we try to take a result!\n\tif len(filtered_s) == 0:\n\t\tcontinue\n\n\t# The data for this row will be the same every time,\n\t# so don't do it in the loop\n\tk_info = k_row[["elevation", "slope", "population"]]\n\n\t# This is our function to apply to each of our initial matches.\n\tdef distance(s_row: Series, k_info: Series, iconv: np.ndarray) -> float:\n\t\treturn mahalanobis(\n\t\t\tk_info,\n\t\t\ts_row[["elevation", "slope", "population"]],\n\t\t\ticonv\n\t\t)\n\n\t# Now we make a new column of data, which is the distance in each\n\t# row between the K data and S data, and add that back into our table\n\tfiltered_s['distance'] = filtered_s.apply(\n\t\tdistance,\n\t\taxis=1,\n\t\targs=(k_info, invconv,)\n\t)\n\n\t# Now find the one result where the distance value is the lowest.\n\tmin_distance = filtered_s.distance.min()\n\tminimum_row = filtered_s[filtered_s.distance==min_distance].iloc[0]\n\tresults.append(minimum_row)\n\n# Finally we can save that result!\npd_results = pd.DataFrame(results)\npd_results.to_parquet('result.parquet')\n</code></pre>\n<p>The code has hardly changed here; we're just using a list of column names rather than directly accessing the values on the row in turn, but this dropped the run time of the program with my sample data from 75 minutes to just under 10 minutes!</p>\n<p>This is one tiny change, but the method by which I discovered it was not obvious, and I'd argue not easily discoverable by someone who's an expert ecologist data scientist. Perhaps this tip might have been listed somewhere and thus they'd know to avoid this, but that solution doesn't scale well. How many other tips are there out there that they're missing out on? By looking more into the profile output I found some other small performance wins, but what's interesting isn't those wins, but the level of knowledge of how computer programs work required to know to apply them. 
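</p>\n<p>If you want to check whether this tip holds on your own data before committing to a rewrite, a small <code>timeit</code> harness like this (with made-up random data) lets you compare the two access styles directly:</p>\n<pre><code>import timeit\n\nimport numpy as np\nimport pandas as pd\n\ndf = pd.DataFrame(np.random.rand(10_000, 3),\n\tcolumns=["elevation", "slope", "population"])\n\n# Style 1: attribute access, one getattr per field per row\nby_attr = timeit.timeit(\n\tlambda: df.apply(lambda r: (r.elevation, r.slope, r.population), axis=1),\n\tnumber=1)\n\n# Style 2: a list of column names per row\nby_cols = timeit.timeit(\n\tlambda: df.apply(lambda r: r[["elevation", "slope", "population"]], axis=1),\n\tnumber=1)\n\nprint(f"attribute access: {by_attr:.2f}s, column list: {by_cols:.2f}s")\n</code></pre>\n<p>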
Pandas does such a good job at helping at a semantic level, but to get good performance out of it requires a whole other level of expertise. This is in contrast to, say, numpy, which (albeit in a different domain) manages to pull off the trick of providing both semantic and computational efficiency. Even numpy will, eventually, break down this way, but the non-computer-domain-expert will get further before they hit that.</p>\n<p>This is another rendition of the tension <a href=\"/blog/yirgacheffe/\">I highlighted a few posts ago</a>, as captured in the <a href=\"https://dreamsongs.com/RiseOfWorseIsBetter.html\">\u201cWorse is Better\u201d</a> trilogy of papers by Richard P. Gabriel, between:</p>\n<ul>\n<li>\u201cThe right thing\u201d - having an elegant interface with all the icky complexity hidden inside</li>\n<li>\u201cWorse is better\u201d - having an elegant implementation that exposes the underlying complexities to the system\u2019s user</li>\n</ul>\n<p>At some point "The right thing" will break down, stranding the user, which is what is happening with pandas here. The counter argument is that you should make the user understand the complexity from the start so they're prepared for this. My personal preference is to try to push "The right thing" as far as you can and then provide ways to flag what's going wrong - more people are enabled by doing the former than will succeed at the latter, and I'd rather enable ecologists to save the planet, even if that's sometimes inefficient. But I digress, as I have one more performance step to cover, which kinda sidesteps that entire debate.</p>\n<h1>Using pandas where it's good, then getting it out the way</h1>\n<p>Recently I made a (poor) joke to my partner that I realised I'd become a data scientist when I started opening CSV files with pandas rather than just reading the contents directly and splitting the file up myself as was my habit before this last year. 
The nugget of gold in that glib statement is that, despite my lambasting it thus far, pandas is really good when doing its thing. Pandas makes it really easy to reason about tables of data when you're not worrying about individual values, but it seems to struggle when doing bulk calculations on that data; but I've already said that was an area where numpy is good, so why not just let each side do what it's best at?</p>\n<p>Thus, I eventually ran with this code, where I use pandas to do everything up to the point where I have to access discrete values, at which point I move the data wholesale into numpy world:</p>\n<pre><code># Let us load our two sets\nk_set = pd.read_parquet(k_parquet_filename)\ns_set = pd.read_parquet(s_parquet_filename)\n\n# For Mahalanobis we need to calculate the relationship between the\n# variables we want as part of the distance function.\ninvconv = calculate_coveriance_inverse(s_set)\n\nresults = []\nfor _, k_row in k_set.iterrows():\n\n\t# Create a new set which is just the rows in s_set\n\t# that have a direct match to the row in K we're\n\t# dealing with - equivalent to joined_k_and_s for\n\t# a single row of K\n\tfiltered_s = s_set[\n\t\t(s_set.ecoregion == k_row.ecoregion) &\n\t\t(s_set.luc10 == k_row.luc10) &\n\t\t(s_set.luc5 == k_row.luc5) &\n\t\t(s_set.luc0 == k_row.luc0)\n\t]\n\n\t# If there are no matches move on. 
This isn't just an\n\t# optimisation, it's to avoid exceptions when later on\n\t# we try to take a result!\n\tif len(filtered_s) == 0:\n\t\tcontinue\n\n\t# The data for this row will be the same every time,\n\t# so don't do it in the loop\n\tk_info = np.array(k_row[["elevation", "slope", "population"]].tolist())\n\n\t# Select the data we need for the distance calculation, and\n\t# export that as a large numpy 2D array\n\ts_subset = filtered_s[["elevation", "slope", "population"]]\n\ts_subset_raw = s_subset.to_numpy()\n\n\t# Now work over the numpy array to find the minimum distance\n\tmin_distance = float("inf")\n\tmin_index = None\n\tfor index in range(len(s_subset_raw)):\n\t\ts_info = s_subset_raw[index]\n\t\tdistance = mahalanobis(k_info, s_info, invconv)\n\t\tif distance < min_distance:\n\t\t\tmin_distance = distance\n\t\t\tmin_index = index\n\n\t# Now find the corresponding data in the original pandas data\n\tminimum_row = filtered_s.iloc[min_index]\n\tresults.append(minimum_row)\n\n# Finally we can save that result!\npd_results = pd.DataFrame(results)\npd_results.to_parquet('result.parquet')\n</code></pre>\n<p>The key bits to note here are that I used pandas to take the data I'd filtered at the first stage, and select just the columns I need for the distance comparison (a thing pandas is good at), and then convert the data straight to a large numpy array and process the data from that (handing over to a thing numpy is good at). 
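</p>\n<p>As an aside, if you have scipy to hand, the remaining per-row loop can likely be vectorised away too: scipy's <code>cdist</code> supports the Mahalanobis metric directly via its <code>VI</code> parameter, computing every distance in one call. A sketch with made-up stand-ins for the K row and the S subset, not the project's actual code:</p>\n<pre><code>import numpy as np\nfrom scipy.spatial.distance import cdist\n\n# Made-up stand-ins for the real data\nk_info = np.array([[120.0, 3.5, 250.0]])  # shape (1, 3)\ns_subset_raw = np.random.rand(500, 3)  # shape (N, 3)\n\ncov = np.cov(s_subset_raw, rowvar=False)\ninvconv = np.linalg.inv(cov)\n\n# One call computes the distance from k_info to every row of S\ndistances = cdist(k_info, s_subset_raw, metric="mahalanobis", VI=invconv)[0]\nmin_index = int(np.argmin(distances))\nmin_distance = distances[min_index]\n</code></pre>\n<p>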
I now have to do some more accounting as I iterate over the data and find the minimum, but the result was I dropped from 10 minutes to 6 minutes, getting me faster again, and I'm well below 10% of my original run time (not including the one that didn't finish!).</p>\n<p>The cost is that my code now is definitely very micro-managery, and doesn't reflect the original methodology very well - it's still following the methodology, but you need to reconstruct it from the code.</p>\n<h1>Why have you made me read all this?</h1>\n<p>There are two readings of this post. Firstly, if you're stuck trying to improve the performance of your pandas code, then consider exporting it to numpy if you're doing bulk calculations on the data rather than just dealing with columns etc. It'll save you some time and memory and your electricity bill will be lower. But then it'd also be valid to say for this kind of task you might also want to look at tools like <a href=\"https://spark.apache.org\">Spark</a> and <a href=\"https://dask.org/\">Dask</a> which do some of the lifting for you, at the expense of learning yet another framework properly before it'll really be able to help you.</p>\n<p>But secondly, and perhaps more interestingly: if you're an expert in a domain that isn't computer science, how do you figure this stuff out? Or, from my perspective: as someone making libraries for ecologists to use, how do I make it so they don't get into this trap? Perhaps it'd be better if pandas didn't have the apply function to loop over the data, and just had the "dump data to numpy" function instead? Providing nothing would have worked for me, as I already know numpy, but that might have just put off other data scientists.</p>\n<p>Or put another way, does everyone doing significant data science in all domains but one need to have a part-time computerist on their team? 
Should we just acknowledge that this stuff requires some under-the-hood systems knowledge to get right, and so the way forward is a pairing of experts? You hope that most of the time the tools do good, but at some point you want to have a domain expert review things? This falls down I imagine when it comes to funding - who wants to add another person to the project in the name of efficiency when you can kludge by and your budget is already tight?</p>\n<p>I don't know what the answer is, but I do know that having to apply me to even a small set of ecologists doesn't scale, and given the state of the climate we need to be enabling as many ecologists as we can. So with projects like <a href=\"https://github.com/carboncredits/yirgacheffe/\">yirgacheffe</a> I plan to continue trying to do "the right thing" to empower and enable my ecologist colleagues, but then perhaps I need to learn to explicitly signal when my way isn't the best way and perhaps expert review is needed.</p>",
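To make that pandas-to-numpy handoff concrete, here is a small self-contained sketch. The column names match the post, but the values are invented, and an identity matrix stands in for the real inverse covariance; it also shows that once the data is a plain numpy array, the inner Python loop can become a single broadcasted pass:

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the post's filtered_s and k_row (values are invented)
filtered_s = pd.DataFrame({
    "elevation": [10.0, 200.0, 35.0],
    "slope": [1.0, 9.0, 2.0],
    "population": [100.0, 5000.0, 150.0],
})
k_info = np.array([12.0, 1.5, 120.0])
inv_cov = np.eye(3)  # identity standing in for the real inverse covariance

# Select just the columns needed (pandas' strength), then hand a plain
# 2D array over to numpy (numpy's strength)
s_raw = filtered_s[["elevation", "slope", "population"]].to_numpy()

# One vectorised pass instead of a per-row Python loop: the squared
# Mahalanobis distance diff^T V^-1 diff for every row at once
diff = s_raw - k_info
d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

# Map the winning numpy index back to the original pandas row
minimum_row = filtered_s.iloc[int(np.argmin(d2))]
```

With the identity matrix this reduces to squared Euclidean distance; with the real inverse covariance it is the same quantity the loop's `mahalanobis` call computed, minus the square root, which doesn't change which row wins.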
+14
mwd/blog_slack-bad-for-gis-rasters_.json
···+"summary": "<p>In the past <a href=\"/blog/some-notes-on-processing-and-display-geospatial-data/\">I've written about</a> how <a href=\"https://www.ogc.org/standard/geotiff/\">GeoTIFF</a> is one of the most common data formats used for geospatial data. A GeoTIFF is just a standard TIFF image with a few extra fields on it that mean geospatial tools such as <a href=\"https://qgis.org/en/site/\">QGIS</a> or libraries such as <a href=\"https://gdal.org\">GDAL</a> know what geographic region this data is referring to, and what map projection system it's stored relative to. For example, here is a GeoTIFF I generated showing me where in Sweden you might find moose (meese?):</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of QGIS, showing a map of the world focussed on Scandinavia, over which is a black and white rectangle showing some map data over Sweden.\" src=\"good.png\">\n \n </div>\n</div>\n<p>Perhaps I'm sufficiently excited by the notion of where the moose can be found that I want to share this with one of my colleagues via our workplace <a href=\"https://slack.com/\">Slack</a> channel. So I drop the file in, they download it, but then they complain that the image is now all wrong:</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of QGIS again, only now the image data is stretched into a square, and surrounded by featureless blue rather than a map of Scandinavia.\" src=\"bad.png\">\n \n </div>\n</div>\n<p>If we zoom out a bit, we find not only does the image look funny, it's no longer in the right place! 
I couldn't really make a screenshot of this, as the moose data has been rendered so small that by the time I zoom out enough that you see the coast of Africa it's just a couple of pixels wide on my screen:</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of QGIS showing the west coast of Africa, with a bunch of hand drawn annotations pointing to where the pixels would be.\" src=\"where.png\">\n \n </div>\n</div>\n<p>If I look closely I can see that QGIS is warning me something is up with this layer that has been sent via Slack:</p>\n<div>\n <div>\n \n\n <img alt=\"There is a ? icon next to the layer in QGIS, and a tooltip is shown saying 'Layer has no coordinate reference system set! The layer is not georeferenced and has no geographic location available.'\" src=\"warning.png\">\n \n </div>\n</div>\n<p>So what's happened here? This is particularly mysterious as in the past my colleagues and I have successfully sent GeoTIFFs through Slack with no ill effect.</p>\n\n\n<p>Let's take a look with some command line tools at the TIFF itself. 
If I use <code>gdalinfo</code> to query the original file, I see what I'd expect:</p>\n<pre><code>$ gdalinfo smb.tif\nDriver: GTiff/GeoTIFF\nFiles: smb.tif\nSize is 2867, 3060\nCoordinate System is:\nGEOGCRS["WGS 84",\n\tENSEMBLE["World Geodetic System 1984 ensemble",\n\t\tMEMBER["World Geodetic System 1984 (Transit)"],\n\t\tMEMBER["World Geodetic System 1984 (G730)"],\n\t\tMEMBER["World Geodetic System 1984 (G873)"],\n\t\tMEMBER["World Geodetic System 1984 (G1150)"],\n\t\tMEMBER["World Geodetic System 1984 (G1674)"],\n\t\tMEMBER["World Geodetic System 1984 (G1762)"],\n\t\tMEMBER["World Geodetic System 1984 (G2139)"],\n\t\tELLIPSOID["WGS 84",6378137,298.257223563,\n\t\t\tLENGTHUNIT["metre",1]],\n\t\tENSEMBLEACCURACY[2.0]],\n\tPRIMEM["Greenwich",0,\n\t\tANGLEUNIT["degree",0.0174532925199433]],\n\tCS[ellipsoidal,2],\n\t\tAXIS["geodetic latitude (Lat)",north,\n\t\t\tORDER[1],\n\t\t\tANGLEUNIT["degree",0.0174532925199433]],\n\t\tAXIS["geodetic longitude (Lon)",east,\n\t\t\tORDER[2],\n\t\t\tANGLEUNIT["degree",0.0174532925199433]],\n\tUSAGE[\n\t\tSCOPE["Horizontal component of 3D system."],\n\t\tAREA["World."],\n\t\tBBOX[-90,-180,90,180]],\n\tID["EPSG",4326]]\nData axis to CRS axis mapping: 2,1\nOrigin = (11.026820112567126,69.106496492030672)\nPixel Size = (0.004491576420598,-0.004491869987684)\nMetadata:\n AREA_OR_POINT=Area\nImage Structure Metadata:\n INTERLEAVE=BAND\nCorner Coordinates:\nUpper Left ( 11.0268201, 69.1064965) ( 11d 1'36.55"E, 69d 6'23.39"N)\nLower Left ( 11.0268201, 55.3613743) ( 11d 1'36.55"E, 55d21'40.95"N)\nUpper Right ( 23.9041697, 69.1064965) ( 23d54'15.01"E, 69d 6'23.39"N)\nLower Right ( 23.9041697, 55.3613743) ( 23d54'15.01"E, 55d21'40.95"N)\nCenter ( 17.4654949, 62.2339354) ( 17d27'55.78"E, 62d14' 2.17"N)\nBand 1 Block=2867x2 Type=Byte, ColorInterp=Gray\n</code></pre>\n<p>But if we do the same on the file that went via Slack we see:</p>\n<pre><code>$ gdalinfo ~/Downloads/smb.tif\nDriver: GTiff/GeoTIFF\nFiles: 
/Users/michael/Downloads/smb.tif\nSize is 2867, 3060\nImage Structure Metadata:\n INTERLEAVE=BAND\nCorner Coordinates:\nUpper Left ( 0.0, 0.0)\nLower Left ( 0.0, 3060.0)\nUpper Right ( 2867.0, 0.0)\nLower Right ( 2867.0, 3060.0)\nCenter ( 1433.5, 1530.0)\nBand 1 Block=2867x352 Type=Byte, ColorInterp=Gray\n</code></pre>\n<p>GDAL doesn't really know anything about the file. Why this is so becomes more apparent if we just look at the TIFF file metadata in both. GeoTIFFs are just regular TIFF files with a few special header tags. You can see this with <code>tiffinfo</code> from the command line on the original file:</p>\n<pre><code>$ tiffinfo smb.tif\nTIFFReadDirectory: Warning, Unknown field with tag 33550 (0x830e) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 33922 (0x8482) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 34735 (0x87af) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 34736 (0x87b0) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 34737 (0x87b1) encountered.\n=== TIFF directory 0 ===\nTIFF Directory at offset 0x8 (8)\n Image Width: 2867 Image Length: 3060\n Bits/Sample: 8\n Sample Format: unsigned integer\n Compression Scheme: LZW\n Photometric Interpretation: min-is-black\n Samples/Pixel: 1\n Rows/Strip: 2\n Planar Configuration: single image plane\n Tag 33550: 0.004492,0.004492,0.000000\n Tag 33922: 0.000000,0.000000,0.000000,11.026820,69.106496,0.000000\n Tag 34735: 1,1,0,7,1024,0,1,2,1025,0,1,1,2048,0,1,4326,2049,34737,7,0,2054,0,1,9102,2057,34736,1,1,2059,34736,1,0\n Tag 34736: 298.257224,6378137.000000\n Tag 34737: WGS 84|\n Predictor: none 1 (0x1)\n</code></pre>\n<p>All those "Unknown field with tag..." warnings are the GeoTIFF extensions that tiffinfo doesn't understand. 
And if we again look at the file that we downloaded from Slack:</p>\n<pre><code>$ tiffinfo ~/Downloads/smb.tif\n=== TIFF directory 0 ===\nTIFF Directory at offset 0x85dda4 (8773028)\n Image Width: 2867 Image Length: 3060\n Bits/Sample: 8\n Compression Scheme: None\n Photometric Interpretation: min-is-black\n FillOrder: msb-to-lsb\n Orientation: row 0 top, col 0 lhs\n Samples/Pixel: 1\n Rows/Strip: 352\n Planar Configuration: single image plane\n Page Number: 0-1\n</code></pre>\n<p>All those GeoTIFF tags are gone, and what was a GeoTIFF has now just become a regular old TIFF after its journey through Slack.</p>\n\n\n<p>What was particularly puzzling about this is that I know for a fact I've sent GeoTIFFs through Slack before without hitting this issue. And when I tried to reproduce the issue I found I couldn't - it just seemed to be some particular GeoTIFFs it didn't like. I tried many things to narrow it down: was it some particular tags it didn't like? was it certain map projections? was it file size? was it data type? For each idea I was able to find a counter example where it worked. As I tested this the Slack channel where you can talk to yourself (which I finally had a use for!) filled up with GeoTIFFs:</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of a Slack channel where I send myself images, some of which have icons and some of which have previews. In each one I have written a silly message with things like 'more testing' 'meese!' and 'does this get a preview'\" src=\"slack.png\">\n \n </div>\n</div>\n<p>Eventually I realised that the issue is related to when Slack decides to generate a preview: if there is a preview, the downloaded file has no metadata, whereas if there isn't a preview you get the original file!</p>\n<p>Now, on one hand this isn't really progress, as I still don't know what settings on the GeoTIFF cause Slack to render a preview versus not - I have the same problem and just a quicker way to diagnose it perhaps. 
But then on the other hand at least I can diagnose it now: if I send a colleague a GeoTIFF and it renders a preview, it is now ruined. I assume that if you have a preview and you download the image again then you get a new image from the original data rather than the one you uploaded. Or perhaps, as my colleague <a href=\"https://patrick.sirref.org\">Patrick</a> pointed out, once Slack decides to treat it like an image it <a href=\"https://techcrunch.com/2020/05/11/slack-strips-location-data/\">strips out metadata for security reasons</a> - but then if that's the case it's poor that I can get images through that don't get cleaned up like that.</p>\n<p>It does remind me a bit of the early days of mobile networks, where your mobile operator would rewrite images you downloaded on their servers to be more highly compressed, so as to save them bandwidth. You hear less of that these days, and I have checked the actual image data, and I've not yet found one where the data itself was changed, just the metadata.</p>\n\n\n<p>Slack fiddling in this way cost me the better part of a day's work, because we didn't spot that it had mangled one of our datasets and so we had confused results, and so it's particularly vexing that I can't find a root cause, but at least I have a workaround now: I've taken to putting any GeoTIFFs I send to colleagues into a zip file now - not to save space (given the GeoTIFFs are compressed putting them in a zip often makes them slightly larger), but rather to stop Slack fiddling with them. Not the best, but if I do it as a matter of course, I never have to think about this again, at least until Slack decide to fiddle with zip files.</p>",+"content": "<p>In the past <a href=\"/blog/some-notes-on-processing-and-display-geospatial-data/\">I've written about</a> how <a href=\"https://www.ogc.org/standard/geotiff/\">GeoTIFF</a> is one of the most common data formats used for geospatial data. 
A GeoTIFF is just a standard TIFF image with a few extra fields on it that mean geospatial tools such as <a href=\"https://qgis.org/en/site/\">QGIS</a> or libraries such as <a href=\"https://gdal.org\">GDAL</a> know what geographic region this data is referring to, and what map projection system it's stored relative to. For example, here is a GeoTIFF I generated showing me where in Sweden you might find moose (meese?):</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of QGIS, showing a map of the world focussed on Scandinavia, over which is a black and white rectangle showing some map data over Sweden.\" src=\"good.png\">\n \n </div>\n</div>\n<p>Perhaps I'm sufficiently excited by the notion of where the moose can be found that I want to share this with one of my colleagues via our workplace <a href=\"https://slack.com/\">Slack</a> channel. So I drop the file in, they download it, but then they complain that the image is now all wrong:</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of QGIS again, only now the image data is stretched into a square, and surrounded by featureless blue rather than a map of Scandinavia.\" src=\"bad.png\">\n \n </div>\n</div>\n<p>If we zoom out a bit, we find not only does the image look funny, it's no longer in the right place! I couldn't really make a screenshot of this, as the moose data has been rendered so small that by the time I zoom out enough that you see the coast of Africa it's just a couple of pixels wide on my screen:</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of QGIS showing the west coast of Africa, with a bunch of hand drawn annotations pointing to where the pixels would be.\" src=\"where.png\">\n \n </div>\n</div>\n<p>If I look closely I can see that QGIS is warning me something is up with this layer that has been sent via Slack:</p>\n<div>\n <div>\n \n\n <img alt=\"There is a ? icon next to the layer in QGIS, and a tooltip is shown saying 'Layer has no coordinate reference system set! 
The layer is not georeferenced and has no geographic location available.'\" src=\"warning.png\">\n \n </div>\n</div>\n<p>So what's happened here? This is particularly mysterious as in the past my colleagues and I have successfully sent GeoTIFFs through Slack with no ill effect.</p>\n\n\n<p>Let's take a look with some command line tools at the TIFF itself. If I use <code>gdalinfo</code> to query the original file, I see what I'd expect:</p>\n<pre><code>$ gdalinfo smb.tif\nDriver: GTiff/GeoTIFF\nFiles: smb.tif\nSize is 2867, 3060\nCoordinate System is:\nGEOGCRS["WGS 84",\n\tENSEMBLE["World Geodetic System 1984 ensemble",\n\t\tMEMBER["World Geodetic System 1984 (Transit)"],\n\t\tMEMBER["World Geodetic System 1984 (G730)"],\n\t\tMEMBER["World Geodetic System 1984 (G873)"],\n\t\tMEMBER["World Geodetic System 1984 (G1150)"],\n\t\tMEMBER["World Geodetic System 1984 (G1674)"],\n\t\tMEMBER["World Geodetic System 1984 (G1762)"],\n\t\tMEMBER["World Geodetic System 1984 (G2139)"],\n\t\tELLIPSOID["WGS 84",6378137,298.257223563,\n\t\t\tLENGTHUNIT["metre",1]],\n\t\tENSEMBLEACCURACY[2.0]],\n\tPRIMEM["Greenwich",0,\n\t\tANGLEUNIT["degree",0.0174532925199433]],\n\tCS[ellipsoidal,2],\n\t\tAXIS["geodetic latitude (Lat)",north,\n\t\t\tORDER[1],\n\t\t\tANGLEUNIT["degree",0.0174532925199433]],\n\t\tAXIS["geodetic longitude (Lon)",east,\n\t\t\tORDER[2],\n\t\t\tANGLEUNIT["degree",0.0174532925199433]],\n\tUSAGE[\n\t\tSCOPE["Horizontal component of 3D system."],\n\t\tAREA["World."],\n\t\tBBOX[-90,-180,90,180]],\n\tID["EPSG",4326]]\nData axis to CRS axis mapping: 2,1\nOrigin = (11.026820112567126,69.106496492030672)\nPixel Size = (0.004491576420598,-0.004491869987684)\nMetadata:\n AREA_OR_POINT=Area\nImage Structure Metadata:\n INTERLEAVE=BAND\nCorner Coordinates:\nUpper Left ( 11.0268201, 69.1064965) ( 11d 1'36.55"E, 69d 6'23.39"N)\nLower Left ( 11.0268201, 55.3613743) ( 11d 1'36.55"E, 55d21'40.95"N)\nUpper Right ( 23.9041697, 69.1064965) ( 23d54'15.01"E, 69d 6'23.39"N)\nLower Right ( 
23.9041697, 55.3613743) ( 23d54'15.01"E, 55d21'40.95"N)\nCenter ( 17.4654949, 62.2339354) ( 17d27'55.78"E, 62d14' 2.17"N)\nBand 1 Block=2867x2 Type=Byte, ColorInterp=Gray\n</code></pre>\n<p>But if we do the same on the file that went via Slack we see:</p>\n<pre><code>$ gdalinfo ~/Downloads/smb.tif\nDriver: GTiff/GeoTIFF\nFiles: /Users/michael/Downloads/smb.tif\nSize is 2867, 3060\nImage Structure Metadata:\n INTERLEAVE=BAND\nCorner Coordinates:\nUpper Left ( 0.0, 0.0)\nLower Left ( 0.0, 3060.0)\nUpper Right ( 2867.0, 0.0)\nLower Right ( 2867.0, 3060.0)\nCenter ( 1433.5, 1530.0)\nBand 1 Block=2867x352 Type=Byte, ColorInterp=Gray\n</code></pre>\n<p>GDAL doesn't really know anything about the file. Why this is so becomes more apparent if we just look at the TIFF file metadata in both. GeoTIFFs are just regular TIFF files with a few special header tags. You can see this with <code>tiffinfo</code> from the command line on the original file:</p>\n<pre><code>$ tiffinfo smb.tif\nTIFFReadDirectory: Warning, Unknown field with tag 33550 (0x830e) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 33922 (0x8482) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 34735 (0x87af) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 34736 (0x87b0) encountered.\nTIFFReadDirectory: Warning, Unknown field with tag 34737 (0x87b1) encountered.\n=== TIFF directory 0 ===\nTIFF Directory at offset 0x8 (8)\n Image Width: 2867 Image Length: 3060\n Bits/Sample: 8\n Sample Format: unsigned integer\n Compression Scheme: LZW\n Photometric Interpretation: min-is-black\n Samples/Pixel: 1\n Rows/Strip: 2\n Planar Configuration: single image plane\n Tag 33550: 0.004492,0.004492,0.000000\n Tag 33922: 0.000000,0.000000,0.000000,11.026820,69.106496,0.000000\n Tag 34735: 1,1,0,7,1024,0,1,2,1025,0,1,1,2048,0,1,4326,2049,34737,7,0,2054,0,1,9102,2057,34736,1,1,2059,34736,1,0\n Tag 34736: 298.257224,6378137.000000\n Tag 34737: WGS 84|\n Predictor: none 1 
(0x1)\n</code></pre>\n<p>All those "Unknown field with tag..." warnings are the GeoTIFF extensions that tiffinfo doesn't understand. And if we again look at the file that we downloaded from Slack:</p>\n<pre><code>$ tiffinfo ~/Downloads/smb.tif\n=== TIFF directory 0 ===\nTIFF Directory at offset 0x85dda4 (8773028)\n Image Width: 2867 Image Length: 3060\n Bits/Sample: 8\n Compression Scheme: None\n Photometric Interpretation: min-is-black\n FillOrder: msb-to-lsb\n Orientation: row 0 top, col 0 lhs\n Samples/Pixel: 1\n Rows/Strip: 352\n Planar Configuration: single image plane\n Page Number: 0-1\n</code></pre>\n<p>All those GeoTIFF tags are gone, and what was a GeoTIFF has now just become a regular old TIFF after its journey through Slack.</p>\n\n\n<p>What was particularly puzzling about this is that I know for a fact I've sent GeoTIFFs through Slack before without hitting this issue. And when I tried to reproduce the issue I found I couldn't - it just seemed to be some particular GeoTIFFs it didn't like. I tried many things to narrow it down: was it some particular tags it didn't like? was it certain map projections? was it file size? was it data type? For each idea I was able to find a counter example where it worked. As I tested this the Slack channel where you can talk to yourself (which I finally had a use for!) filled up with GeoTIFFs:</p>\n<div>\n <div>\n \n\n <img alt=\"A screenshot of a slack channel where I send myself images, some of which have icons and some of which have previews. In each one I have written a silly message with things like 'more testing' 'meese!' 
and 'does this get a preview'\" src=\"slack.png\">\n \n </div>\n</div>\n<p>Eventually I realised that the issue is related to when Slack decides to generate a preview: if there is a preview, the downloaded file has no metadata, whereas if there isn't a preview you get the original file!</p>\n<p>Now, on one hand this isn't really progress, as I still don't know what settings on the GeoTIFF cause Slack to render a preview versus not - I have the same problem and just a quicker way to diagnose it perhaps. But then on the other hand at least I can diagnose it now: if I send a colleague a GeoTIFF and it renders a preview, it is now ruined. I assume that if you have a preview and you download the image again then you get a new image from the original data rather than the one you uploaded. Or perhaps, as my colleague <a href=\"https://patrick.sirref.org\">Patrick</a> pointed out, once Slack decides to treat it like an image it <a href=\"https://techcrunch.com/2020/05/11/slack-strips-location-data/\">strips out metadata for security reasons</a> - but then if that's the case it's poor that I can get images through that don't get cleaned up like that.</p>\n<p>It does remind me a bit of the early days of mobile networks, where your mobile operator would rewrite images you downloaded on their servers to be more highly compressed, so as to save them bandwidth. 
You hear less of that these days, and I have checked the actual image data, and I've not yet found one where the data itself was changed, just the metadata.</p>\n\n\n<p>Slack fiddling in this way cost me the better part of a day's work, because we didn't spot that it had mangled one of our datasets and so we had confused results, and so it's particularly vexing that I can't find a root cause, but at least I have a workaround now: I've taken to putting any GeoTIFFs I send to colleagues into a zip file now - not to save space (given the GeoTIFFs are compressed putting them in a zip often makes them slightly larger), but rather to stop Slack fiddling with them. Not the best, but if I do it as a matter of course, I never have to think about this again, at least until Slack decide to fiddle with zip files.</p>",
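Since the tiffinfo output above shows exactly which tags vanish (33550, 33922, 34735, 34736 and 34737), one quick way to check whether a downloaded file survived Slack is to scan the TIFF directory for them yourself. Here is a sketch using only the Python standard library; it's deliberately minimal, handles classic TIFF only (not BigTIFF), and the tag list is just the ones from the dump above:

```python
import struct

# GeoTIFF tags seen in the tiffinfo dump: pixel scale, tiepoint,
# geo key directory, double params, ascii params
GEOTIFF_TAGS = {33550, 33922, 34735, 34736, 34737}

def is_georeferenced(data: bytes) -> bool:
    """Return True if a classic TIFF byte string carries any GeoTIFF tags."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little-endian
    elif byte_order == b"MM":
        endian = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    while ifd_offset:
        # Each IFD is a 2-byte entry count followed by 12-byte entries
        (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
        for i in range(count):
            entry = ifd_offset + 2 + (i * 12)
            (tag,) = struct.unpack(endian + "H", data[entry:entry + 2])
            if tag in GEOTIFF_TAGS:
                return True
        # The 4 bytes after the entries point at the next IFD (0 = none)
        next_ptr = ifd_offset + 2 + (count * 12)
        (ifd_offset,) = struct.unpack(endian + "I", data[next_ptr:next_ptr + 4])
    return False
```

Something like `is_georeferenced(open("smb.tif", "rb").read())` should then come back true for the original file and false for the copy that went through a Slack preview, without needing GDAL installed at all.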
+14
mwd/blog_tcc-part2_.json
···+"summary": "<p>This post is a follow-on to the previous <a href=\"/blog/tcc/\">part 1 about Tiny Code Christmas 2022</a>, and is about what I did for TCC 2023. In order to get more info on what TCC is and why it's a fun thing to do, I recommend you start there, but if you're just interested in me messing around with <a href=\"https://ocaml.org/\">OCaml</a>, then you're in the right place.</p>\n\n\n<p>There were two things that motivated me to tackle this year's TCC in the OCaml language. Firstly, a bunch of my colleagues at work use OCaml, and indeed work on the language, and so there's been a bunch of encouragement from them that I should join in. However, I find I have two modes when it comes to working with programming languages: I can either work on a problem I'm not familiar with in a language I'm comfortable with, or I can do the inverse and tackle problems that I'm comfortable with in a language I'm having to pick up as I go, but I can't do both. Given my current work requires that I spend my time implementing ecological things and trying to do new things from a computer science perspective, I've just been leaning on my go-to set of languages: Go, Python, and occasionally Swift.</p>\n<p>In a parallel thread, a couple of months ago I was at the return of <a href=\"https://twelve.barcamplondon.org\">London BarCamp</a>, and I happened to bump into <a href=\"https://www.jonathanhogg.com\">Jonathan Hogg</a>, someone with whom I used to share an office when I was doing my PhD. Jonathan was giving a talk, so I went to see what he was currently up to, and learned about <a href=\"https://github.com/jonathanhogg/flitter\">Flitter</a>, a purely functional programming language that he'd created to help him program visuals for live performances. I had a brief play with this after, as it seemed very cool, but a lack of free time eventually meant I didn't get far. 
But I liked the idea of having a declarative way to describe a light show rather than a long list of do this then that.</p>\n<p>Thus it was when TCC 2023 was announced, and especially as the main challenges were mostly based on last year's, it felt like a great chance to take OCaml, a mostly functional language that I don't know well, and apply it to the domain of programming visuals that I'm somewhat familiar with now, and do it in a declarative way as inspired by Jonathan's Flitter work.</p>\n\n\n<p>For those who don't know OCaml or aren't that familiar with functional programming, a very very brief primer. In most regular programming languages you come across, like C, Python, Lua (as used for last year's TCC), your program is basically a long list of "do this, then do that, and optionally do this thing". This is known in computer science terms as <a href=\"https://en.wikipedia.org/wiki/Imperative_programming\">imperative programming</a>. Each of these statements will typically either change the program's state or the state of the environment in which the code runs, and so slowly over time you build up your list of statements to sum up to some desired impact on the world.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Functional_programming\">Functional programming languages</a> take a different approach, whereby it's much more like expressing a series of mathematical equations that plug together to describe how state changes, and if you plug in your current state of the world into the equation, the result will be your new state. 
If you're not into this sort of thing, this perhaps sounds like a bit of a made up difference, but the point is that you're not telling the computer to do individual steps that will get you to the end result, rather you specify the maths for generating an end result from a starting position, and then the compiler does the work of turning that into the low-level steps the computer carries out, and this generally makes your changes in state explicit, whereas in imperative languages a lot of the state is implicit, and thus easy to get wrong, and this is where you find bugs.</p>\n<p>So in theory purely functional languages will make it easier to write safe software, but that comes at the expense that they don't map to the mental model most people have of how to do tasks - the real world is naturally imperative: get the kettle, put water in it, turn it on, get the cup, etc. And indeed, some tasks in computing are imperative in nature too, and to express them functionally is awkward. Hence these days you get languages like Swift and OCaml that have a mix of functional and imperative behaviours - Swift leaning somewhat more on the imperative and OCaml more on the functional, but still both try to achieve that sweet spot of giving you the safety of a functional language, whilst the task composability of an imperative language. 
SwiftUI, Apple's new(ish) way of doing user interfaces, is an attempt to make UI implementation more functional.</p>\n<p>If you want to know more about OCaml's place in the world as a language, then I can recommend <a href=\"https://podcasts.apple.com/gb/podcast/happy-path-programming/id1531666706?i=1000629688702\">this podcast episode</a> where Sabine Schmaltz, the maintainer of the <a href=\"https://ocaml.org\">ocaml.org</a> website (hosted with OCaml of course), gives a good overview of the language, its influences, and how it compares to things like Go and Rust.</p>\n\n\n<p>Rather than labour that point any more, let's look at how it went trying to solve Tiny Code Christmas in OCaml (all my code <a href=\"https://github.com/mdales/tcc23\">is on GitHub</a>, and I have a <a href=\"https://mynameismwd.org/tags/tcc/\">gallery of the outputs</a>). Whilst I can lean on the imperative side of OCaml to keep things familiar, the idea is to try to follow the path inspired by Flitter and use the functional side of OCaml as much as I can. And indeed, it turns out that functional programming is quite a good model for a bunch of the effects I made in TCC.</p>\n<p>To start with I was just trying to find my feet, both with OCaml and a way of getting pixels onto the screen. For the former I mixed doing with reading bits of <a href=\"https://dev.realworldocaml.org/index.html\">Real World OCaml</a> - it's a good book, but I'd failed to make headway with it before as I learn best by doing, and I found doing a challenge in some way, then reading a bit of the book to see how I could have used the language better, and doing better the next day, and repeating this worked really well for me.</p>\n<p>For the pixel pushing I used the <a href=\"https://ocaml.github.io/graphics/graphics/Graphics/index.html\">Graphics</a> module, which gives a very simple way to plot pixels and basic shapes on the screen via X11. 
Whilst not the most advanced way of doing things, it being X11 meant that I could run my code on any of the computers I happen to be using, as macOS still has XQuartz support, and WSL under Windows supports X11 now too, and so I think I used both those and Linux directly to solve my challenges over the course of the month, which was nice. But beyond that, my code was very imperative to start with, as per <a href=\"https://github.com/mdales/tcc23/blob/main/day2/bin/main.ml\">the first couple of days</a>:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>Where you end up with sequences of statements to build the tree of primitive shapes:</p>\n<pre><code>let draw_background () =\n let horizon_height = (480 / 3) in\n\tset_color white;\n\tfill_rect 0 0 640 horizon_height;\n\tset_color blue;\n\tfill_rect 0 horizon_height 640 (480 - horizon_height)\n\n...\n\nlet draw_scene (t: int) =\n draw_background ();\n draw_snow t 42 1;\n draw_tree ((size_x ()) / 2) 60;\n draw_snow t 62 2\n</code></pre>\n<p>This works, and got me started, but isn't really what I wanted to be doing. I did however put in place some TIC80-isms (<a href=\"https://tic80.com\">TIC80</a> being the platform I used for last year's TCC), so my code from the start was built around the idea that you'd have a sort of implicit runloop behind the scenes like in TIC80 or even in, say, Arduino programming, whereby you have two functions that get called: one (which I called <code>boot</code>) that gets called just once at the start of execution, and then a second function (which I called <code>tick</code>) that gets called repeatedly, with a counter passed in as a time indicator.</p>\n<pre><code>let tick (t: int) =\n draw_scene t\n</code></pre>\n<p>It's not doing much here, but later this is how we really end up with a functional programming style demo system. 
Obviously unlike in TIC80 and Arduino and things, I had to build the runloop myself, and so quickly I started trying to hide that code away into a library, so by the end of TCC, my <code>main.ml</code> really just had the demo code in it and nothing else - all the things I'd built on top of OCaml's graphics code were out of sight.</p>\n\n\n<p>And what is "all the things I'd built" there? Well, my aim wasn't just to implement the TCC challenges directly, but also to keep things relatable to the rest of the community that was using things like TIC80, so I ended up building a fantasy console emulation layer over the course of the 21 challenges I did (there were 24 in total this year, the 12 from last year and then another 12 "extra" challenges for those who wanted to go beyond what was done last year). For instance, TIC80 keeps the idea of your video card having a fixed palette of 16 colours, and your demo code is drawing in that palette. So I wrote my own <a href=\"https://en.wikipedia.org/wiki/Framebuffer\">framebuffer</a> abstraction that worked with a fixed palette that you define at the start of your program. 
This also gave me a place to add some scaling code so I was creating effects in low resolutions that befit retro computers and then scaling them up so they don't look tiny on modern displays.</p>\n<p>I must confess, although I kept to the 16 colours (or fewer) of TIC80, I did alternate between the 240x136 resolution of TIC80 and 640x480 VGA resolution depending on the demo, as some just looked really good at the slightly higher pixel count, and I feel 640x480x16 still is a retro display in 2023 :)</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>If we look at the <code>tick</code> loop for the above example, you can perhaps see, if you squint a bit, that this is starting to be a lot less imperative and a lot more functional in style:</p>\n<pre><code>let tick (t : int) =\n let height = size_y () and width = size_x () and ft = (float_of_int t) and colors = (List.length a_palette) in\n let fcolors = float_of_int colors in\n for j = 0 to height do\n\t\tfor i = 0 to width do\n\t \tlet x = float_of_int (i - (width / 2))\n\t \tand y = float_of_int (j - (height / 2)) in\n\t \tlet d1 = (float_of_int width) /. sqrt ((x *. x) +. (y *. y) +. 1.0)\n\t \tand c1 = ((atan2 y x) +. Float.pi) *. (fcolors /. (2.0 *. Float.pi)) in\n\t \tlet c2 = c1 +. (sin (ft /. 70.0) *. Float.pi *. 2.0)\n\t \tand d2 = d1 +. (Float.rem (ft /. 
10.0) fcolors) in\n\t \tlet p = (int_of_float (Float.floor c2)) lxor (int_of_float (Float.floor d2)) in\n\t \tlet pindex = (p mod colors) in\n\t \tlet color = List.nth a_palette (if pindex < 0 then (colors + pindex) else pindex) in\n\t \tset_color color;\n\t \tplot i j\n\t\tdone\n done\n</code></pre>\n<p>This was before I added the framebuffer abstraction, so there are some imperative bits to do the actual drawing (you set the colour then you plot a point for example), but most of this code is just stacked mathematical equations and the value of each pixel only derives from the position on screen (the i, j loops) and the tick count (a proxy for time) - there is no other state happening here - a sort of perfect fit for functional programming.</p>\n<p>If I look back at last year's solution for this in Lua, then the code is in that form anyway, and so I'd argue that this sort of demo coding is inherently functional, and thus not only was this an opportunistic way for me to learn OCaml, it was actually a very well aligned way too, which I'd not considered when I started.</p>\n<pre><code>function TIC()\n\tfor j=0,h-1 do\n\t\tfor i=0,w-1 do\n\t\t\tx=i-(w/2)\n\t\t\ty=j-(h/2)\n\t\t\td=400/math.sqrt((x*x)+(y*y)+1)\n\t\t\tc=(math.atan2(y,x)+pi)*(16/(2*pi))\n\t\t\tc=c+(math.sin(t/70)*pi*2)\n\t\t\td=d+((t/10)%16)\n\t\t\tp=(d//1)~(c//1)\n\t\t\tpix(i,j,(p&11)+8)\n\t\tend\n\tend\n\tt=t+1\nend\n</code></pre>\n<p>Indeed, by the time I'd completed TCC and moved on to <a href=\"https://genuary.art\">Genuary</a> (a generative art prompt per day for January), my entire program was very functional in style for doing graphics effects:</p>\n<pre><code>open Claudius\n\nlet tick t s _prev =\n\tlet palsize = Palette.size (Screen.palette s) in\n\tFramebuffer.init (Screen.dimensions s) (fun x y ->\n\t\tlet ft = (Float.of_int t) /. 10.\n\t\tand fx = (Float.of_int x) /. 140.\n\t\tand fy = (Float.of_int y) /. 140. in\n\t\tlet z = 10. +. (sin (ft /. 1000.) *. 5.)\n\t\tand d = 10. +. (cos (ft /. 
1000.) *. 5.) in\n\t\tlet fc = (sin (sin ((fx +. ft) /. z)) +. sin (sin ((fy +. ft) /. d))) *. Float.of_int(palsize / 2) in\n\t\tlet rc = ((int_of_float fc)) mod palsize in\n\t\tif rc >= 0 then rc else (rc + palsize)\n\t)\n\nlet () =\n\tlet screen = Screen.create 640 480 1 (Palette.generate_plasma_palette 1024) in\n\tTcc.run screen "Genuary Day 2: No Palette" None tick\n</code></pre>\n<p>Now there are no direct state changes happening in the code, rather you create a framebuffer with a function that is called for every pixel. Quite a few of the old-school raster effects do fit this pattern of only having each pixel depend on x, y, and t.</p>\n<p>Some effects are more complicated, particularly the vector or pseudo 3D effects, and do require a sort of imperative style "set up scene, do a transform, and then render to screen" flow, but because none of these stages rely on external state, they are still effectively functional, just keyed to time at a slightly more macroscopic scale, as you can see in <a href=\"https://github.com/mdales/tcc23/blob/main/day11extraII/bin/main.ml\">this loop</a>:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<pre><code>let tick (t : int) (s : Tcc.screen) (_prev : Framebuffer.t) : Framebuffer.t =\n\tlet buffer = Framebuffer.init (Screen.dimensions s) (fun _x _y -> 15) in\n\n\tlet ft = Float.of_int t in\n\n\tgenerate_torus ft\n\t|> List.map (fun p ->\n\t\trotate_y (0.02 *. ft) p |> rotate_x (0.01 *. ft) |> rotate_z (0.005 *. 
ft)\n\t)\n\t|> List.sort point_z_cmp\n\t|> render_to_primatives ft s\n\t|> Framebuffer.render buffer;\n\n\tbuffer\n</code></pre>\n<p>The <code>|></code> operator in OCaml just takes the output of the previous function and feeds it in as the last argument of the next function, letting you build up these pipelines which look sort of imperative but, because they're self-contained equations, are still functional.</p>\n<p>And as <a href=\"https://github.com/mdales/tcc23/blob/main/day3extra/bin/main.ml\">another effect that looks stateful</a> but turns out to be functional, here is one of my favourite effects of the set:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>This looks like it's tracking the movement of a bunch of random particles, and so you'd expect some state to be there, but in fact whilst the points are randomly generated, they're done so from the same seed each tick, so you can recreate the world and move the points based on the tick count you want, and so there is no state required.</p>\n<pre><code>let generat_points (count : int) (t : int) (screen : screen) : point list =\n\tRandom.init 42;\n\tList.init count (fun index ->\n\t\t{\n\t\t\tx = ((Random.int screen.width) + (((index + 1) * t) / 20)) mod screen.width ;\n\t\t\ty = ((Random.int screen.height) + (((index + 1) * t) / 20)) mod screen.height ;\n\t\t}\n\t)\n</code></pre>\n<p>For those wondering, the index is added to give them all different speeds, a neat trick provided by the <a href=\"https://tcc.lovebyte.party/day3extra/\">TCC challenge that day</a>. Ultimately, give this function a point in time and it'll recreate the world from nothing consistently. 
This relates to something Jonathan said about his aims for Flitter:</p>\n<blockquote>\n<p>My fave thing of purely functional graphical systems is being able to play with the clock: stopping it, running it backwards, skipping, etc.</p>\n</blockquote>\n<p>Not only does the lack of state make things easier to reason about in terms of testing your code, but it also unlocks these kinds of creative flows that would be more complicated otherwise.</p>\n\n\n<p>So, as a way of learning a new language, TCC is a pretty good route. I wrote a little bit of OCaml every day for the better part of a month, and slowly my code got more idiomatic thanks to the guidance of my colleagues like the ever patient <a href=\"https://patrick.sirref.org\">Patrick</a>. I was pleased to see that Jonathan also joined in TCC in the end with Flitter, to give another take on how functional some of these are, given Flitter is a much more purely functional language than OCaml (an advantage of being domain specific).</p>\n<p>Do I know OCaml inside out yet? Certainly not, but I feel I've got enough familiarity that I might try using it in lieu of Go for a few projects this year. 
I know that I'm not yet an OCaml natural, as I stick to the comfort of explicit type declarations rather than letting the compiler use type inference for example, but I do have colleagues to help me get over that in the coming year.</p>\n<p>I'm going to use the follow-on challenge of Genuary, a generative art prompt a day for January, to try to take my OCaml fantasy retro computer library to a sort of completed point, so that I can draw a line under it and feel I have a suitable conclusion, and something that anyone wanting to try TCC in OCaml in the future can pick up and use without having to worry about finding suitable abstractions and graphics libraries whilst also doing TCC in a language not many use.</p>\n<p>I do want to give a quick shout out once again to the <a href=\"https://lovebyte.party\">LoveByte</a> community - not only is TCC a nicely rounded set of challenges that makes for a great little puzzle a day, but both the organisers and the Discord were very welcoming to the idea of someone making their own thing of it rather than just using one of the traditional retro/fantasy platforms. At no point did anyone object to OCaml being thrown into the usual mix of Lua solutions - it was accepted as a fun variation, and my solutions (all of which are <a href=\"https://mynameismwd.org/tags/tcc/\">up on my personal blog here</a>) made it into the end-of-TCC live stream, which was nice. A great community of people interested in learning and helping others have fun.</p>",+"content": "<p>This post is a follow on to the previous <a href=\"/blog/tcc/\">part 1 about Tiny Code Christmas 2022</a>, and is about what I did for TCC 2023. In order to get more info on what TCC is and why it's a fun thing to do, I recommend you start there, but if you're just interested in me messing around with <a href=\"https://ocaml.org/\">OCaml</a>, then you're in the right place.</p>\n\n\n<p>There were two things that motivated me to tackle this year's TCC in the OCaml language. 
Firstly, a bunch of my colleagues at work use OCaml, and indeed work on the language itself, and so there's been a bunch of encouragement from them that I should join in. However, I find I have two modes when it comes to working with programming languages: I can either work on a problem I'm not familiar with in a language I'm comfortable with, or I can do the inverse and tackle problems that I'm comfortable with in a language I'm having to pick up as I go, but I can't do both. Given my current work requires that I spend my time implementing ecological things and trying to do new things from a computer science perspective, I've just been leaning on my go-to set of languages: Go, Python, and occasionally Swift.</p>\n<p>In a parallel thread, a couple of months ago I was at the return of <a href=\"https://twelve.barcamplondon.org\">London BarCamp</a>, and I happened to bump into <a href=\"https://www.jonathanhogg.com\">Jonathan Hogg</a>, someone with whom I used to share an office when I was doing my PhD. Jonathan was giving a talk, so I went to see what he was currently up to, and learned about <a href=\"https://github.com/jonathanhogg/flitter\">Flitter</a>, a purely functional programming language that he'd created to help him program visuals for live performances. I had a brief play with this afterwards, as it seemed very cool, but a lack of free time eventually meant I didn't get far. 
But I liked the idea of having a declarative way to describe a light show rather than a long list of do this then that.</p>\n<p>Thus, when TCC 2023 was announced, and especially as the main challenges were mostly based on last year's, it felt like a great chance to take OCaml, a mostly functional language that I don't know well, and apply it to the domain of programming visuals that I'm somewhat familiar with now, and do it in a declarative way as inspired by Jonathan's Flitter work.</p>\n\n\n<p>For those who don't know OCaml or aren't that familiar with functional programming, a very, very brief primer. In most regular programming languages you come across, like C, Python, or Lua (as used for last year's TCC), your program is basically a long list of "do this, then do that, and optionally do this thing". This is known in computer science terms as <a href=\"https://en.wikipedia.org/wiki/Imperative_programming\">imperative programming</a>. Each of these statements will typically either change the program's state or the state of the environment in which the code runs, and so slowly over time you build up your list of statements to sum up to some desired impact on the world.</p>\n<p><a href=\"https://en.wikipedia.org/wiki/Functional_programming\">Functional programming languages</a> take a different approach, whereby it's much more like expressing a series of mathematical equations that plug together to describe how state changes, and if you plug in your current state of the world into the equation, the result will be your new state. 
If you're not into this sort of thing, this perhaps sounds like a bit of a made-up difference, but the point is that you're not telling the computer to do individual steps that will get you to the end result; rather, you specify the maths for generating an end result from a starting position, and then the compiler turns that into the low-level steps the computer carries out. This generally makes your changes in state explicit, whereas in imperative languages a lot of the state is implicit, and thus easy to get wrong - and that is where you find bugs.</p>\n<p>So in theory purely functional languages will make it easier to write safe software, but that comes at the expense that they don't map to the mental model most people have of how to do tasks - the real world is naturally imperative: get the kettle, put water in it, turn it on, get the cup, etc. And indeed, some tasks in computing are imperative in nature too, and to express them functionally is awkward. Hence these days you get languages like Swift and OCaml that have a mix of functional and imperative behaviours - Swift leaning somewhat more on the imperative and OCaml more on the functional, but still both try to achieve that sweet spot of giving you the safety of a functional language, whilst retaining the task composability of an imperative language. 
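</p>\n<p>As a toy illustration of that difference in OCaml itself (my own example, not from any of the TCC code), the same sum can be written either by mutating an accumulator step by step or as a single equation over the input:</p>\n<pre><code>(* imperative style: mutate a reference, one statement at a time *)\nlet sum_imperative xs =\n\tlet total = ref 0 in\n\tList.iter (fun x -> total := !total + x) xs;\n\t!total\n\n(* functional style: express the result as one folding equation *)\nlet sum_functional xs = List.fold_left (+) 0 xs\n</code></pre>\n<p>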
SwiftUI, Apple's new(ish) way of doing user interfaces, is an attempt to make UI implementation more functional.</p>\n<p>If you want to know more about OCaml's place in the world as a language, then I can recommend <a href=\"https://podcasts.apple.com/gb/podcast/happy-path-programming/id1531666706?i=1000629688702\">this podcast episode</a> where Sabine Schmaltz, the maintainer of the <a href=\"https://ocaml.org\">ocaml.org</a> website (hosted with OCaml of course), gives a good overview of the language, its influences, and how it compares to things like Go and Rust.</p>\n\n\n<p>Rather than labour that point any more, let's look at how it went trying to solve Tiny Code Christmas in OCaml (all my code <a href=\"https://github.com/mdales/tcc23\">is on github</a>, and I have a <a href=\"https://mynameismwd.org/tags/tcc/\">gallery of the outputs</a>). Whilst I can lean on the imperative side of OCaml to keep things familiar, the idea is to try to follow the path inspired by Flitter and use the functional side of OCaml as much as I can. And indeed, it turns out that functional programming is quite a good model for a bunch of the effects I made in TCC.</p>\n<p>To start with I was just trying to find my feet, both with OCaml and a way of getting pixels onto the screen. For the former I mixed doing with reading bits of <a href=\"https://dev.realworldocaml.org/index.html\">Real World OCaml</a> - it's a good book, but I'd failed to make headway with it before as I learn best by doing, and I found that doing a challenge in some way, then reading a bit of the book to see how I could have used the language better, and doing better the next day, and repeating this, worked really well for me.</p>\n<p>For the pixel pushing I used the <a href=\"https://ocaml.github.io/graphics/graphics/Graphics/index.html\">Graphics</a> module, which gives a very simple way to plot pixels and basic shapes on the screen via X11. 
Whilst not the most advanced way of doing things, it being X11 meant that I could run my code on any of the computers I happen to be using, as macOS still has XQuartz support, and WSL under Windows supports X11 now too, and so I think I used both those and Linux directly to solve my challenges over the course of the month, which was nice. But beyond that, my code was very imperative to start with, as per <a href=\"https://github.com/mdales/tcc23/blob/main/day2/bin/main.ml\">the first couple of days</a>:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>Where you end up with sequences of statements to build the tree of primitive shapes:</p>\n<pre><code>let draw_background () =\n let horizon_height = (480 / 3) in\n\tset_color white;\n\tfill_rect 0 0 640 horizon_height;\n\tset_color blue;\n\tfill_rect 0 horizon_height 640 (480 - horizon_height)\n\n...\n\nlet draw_scene (t: int) =\n draw_background ();\n draw_snow t 42 1;\n draw_tree ((size_x ()) / 2) 60;\n draw_snow t 62 2\n</code></pre>\n<p>This works, and got me started, but isn't really what I wanted to be doing. I did however put in place some TIC80-isms (<a href=\"https://tic80.com\">TIC80</a> being the platform I used for last year's TCC), so my code from the start was built around the idea that you'd have a sort of implicit runloop behind the scenes like in TIC80 or even in, say, Arduino programming, whereby you have two functions that are called for you, one (which I called <code>boot</code>) that gets called just once at the start of execution, and then a second function (which I called <code>tick</code>) that gets called repeatedly, with a counter passed in as a time indicator.</p>\n<pre><code>let tick (t: int) =\n draw_scene t\n</code></pre>\n<p>It's not doing much here, but later this is how we really end up with a functional programming style demo system. 
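</p>\n<p>For a rough sense of what that implicit runloop amounts to, here is a simplified sketch (not my actual library code, which also deals with window setup and frame timing) of the kind of loop that sits behind <code>boot</code> and <code>tick</code>:</p>\n<pre><code>(* sketch: call boot once, then call tick forever with a rising counter *)\nlet run (boot : unit -> unit) (tick : int -> unit) =\n\topen_graph "";\n\tboot ();\n\tlet t = ref 0 in\n\twhile true do\n\t\ttick !t;\n\t\tincr t\n\tdone\n</code></pre>\n<p>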
Obviously unlike in TIC80 and Arduino and things, I had to build the runloop myself, and so quickly I started trying to hide that code away into a library, so by the end of TCC, my <code>main.ml</code> really just had the demo code in it and nothing else - all the things I'd built on top of OCaml's graphics code were out of sight.</p>\n\n\n<p>And what is "all the things I'd built" there? Well, my aim wasn't just to implement the TCC challenges directly, but also to keep things relatable to the rest of the community that was using things like TIC80, so I ended up building a fantasy console emulation layer over the course of the 21 challenges I did (there were 24 in total this year, the 12 from last year and then another 12 "extra" challenges for those who wanted to go beyond what was done last year). For instance, TIC80 keeps the idea of your video card having a fixed palette of 16 colours, and your demo code is drawing in that palette. So I wrote my own <a href=\"https://en.wikipedia.org/wiki/Framebuffer\">framebuffer</a> abstraction that worked with a fixed palette that you define at the start of your program. 
This also gave me a place to add some scaling code so I was creating effects in low resolutions that befit retro computers and then scaling them up so they don't look tiny on modern displays.</p>\n<p>I must confess, although I kept to the 16 colours (or fewer) of TIC80, I did alternate between the 240x136 resolution of TIC80 and 640x480 VGA resolution depending on the demo, as some just looked really good at the slightly higher pixel count, and I feel 640x480x16 still is a retro display in 2023 :)</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>If we look at the <code>tick</code> loop for the above example, you can perhaps see, if you squint a bit, that this is starting to be a lot less imperative and a lot more functional in style:</p>\n<pre><code>let tick (t : int) =\n let height = size_y () and width = size_x () and ft = (float_of_int t) and colors = (List.length a_palette) in\n let fcolors = float_of_int colors in\n for j = 0 to height - 1 do\n\t\tfor i = 0 to width - 1 do\n\t \tlet x = float_of_int (i - (width / 2))\n\t \tand y = float_of_int (j - (height / 2)) in\n\t \tlet d1 = (float_of_int width) /. sqrt ((x *. x) +. (y *. y) +. 1.0)\n\t \tand c1 = ((atan2 y x) +. Float.pi) *. (fcolors /. (2.0 *. Float.pi)) in\n\t \tlet c2 = c1 +. (sin (ft /. 70.0) *. Float.pi *. 2.0)\n\t \tand d2 = d1 +. (Float.rem (ft /. 
10.0) fcolors) in\n\t \tlet p = (int_of_float (Float.floor c2)) lxor (int_of_float (Float.floor d2)) in\n\t \tlet pindex = (p mod colors) in\n\t \tlet color = List.nth a_palette (if pindex < 0 then (colors + pindex) else pindex) in\n\t \tset_color color;\n\t \tplot i j\n\t\tdone\n done\n</code></pre>\n<p>This was before I added the framebuffer abstraction, so there are some imperative bits to do the actual drawing (you set the colour then you plot a point for example), but most of this code is just stacked mathematical equations and the value of each pixel only derives from the position on screen (the i, j loops) and the tick count (a proxy for time) - there is no other state happening here - a sort of perfect fit for functional programming.</p>\n<p>If I look back at last year's solution for this in Lua, then the code is in that form anyway, and so I'd argue that this sort of demo coding is inherently functional, and thus not only was this an opportunistic way for me to learn OCaml, it was actually a very well aligned way too, which I'd not considered when I started.</p>\n<pre><code>function TIC()\n\tfor j=0,h-1 do\n\t\tfor i=0,w-1 do\n\t\t\tx=i-(w/2)\n\t\t\ty=j-(h/2)\n\t\t\td=400/math.sqrt((x*x)+(y*y)+1)\n\t\t\tc=(math.atan2(y,x)+pi)*(16/(2*pi))\n\t\t\tc=c+(math.sin(t/70)*pi*2)\n\t\t\td=d+((t/10)%16)\n\t\t\tp=(d//1)~(c//1)\n\t\t\tpix(i,j,(p&11)+8)\n\t\tend\n\tend\n\tt=t+1\nend\n</code></pre>\n<p>Indeed, by the time I'd completed TCC and moved on to <a href=\"https://genuary.art\">Genuary</a> (a generative art prompt per day for January), my entire program was very functional in style for doing graphics effects:</p>\n<pre><code>open Claudius\n\nlet tick t s _prev =\n\tlet palsize = Palette.size (Screen.palette s) in\n\tFramebuffer.init (Screen.dimensions s) (fun x y ->\n\t\tlet ft = (Float.of_int t) /. 10.\n\t\tand fx = (Float.of_int x) /. 140.\n\t\tand fy = (Float.of_int y) /. 140. in\n\t\tlet z = 10. +. (sin (ft /. 1000.) *. 5.)\n\t\tand d = 10. +. (cos (ft /. 
1000.) *. 5.) in\n\t\tlet fc = (sin (sin ((fx +. ft) /. z)) +. sin (sin ((fy +. ft) /. d))) *. Float.of_int(palsize / 2) in\n\t\tlet rc = ((int_of_float fc)) mod palsize in\n\t\tif rc >= 0 then rc else (rc + palsize)\n\t)\n\nlet () =\n\tlet screen = Screen.create 640 480 1 (Palette.generate_plasma_palette 1024) in\n\tTcc.run screen "Genuary Day 2: No Palette" None tick\n</code></pre>\n<p>Now there are no direct state changes happening in the code, rather you create a framebuffer with a function that is called for every pixel. Quite a few of the old-school raster effects do fit this pattern of only having each pixel depend on x, y, and t.</p>\n<p>Some effects are more complicated, particularly the vector or pseudo 3D effects, and do require a sort of imperative style "set up scene, do a transform, and then render to screen" flow, but because none of these stages rely on external state, they are still effectively functional, just keyed to time at a slightly more macroscopic scale, as you can see in <a href=\"https://github.com/mdales/tcc23/blob/main/day11extraII/bin/main.ml\">this loop</a>:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<pre><code>let tick (t : int) (s : Tcc.screen) (_prev : Framebuffer.t) : Framebuffer.t =\n\tlet buffer = Framebuffer.init (Screen.dimensions s) (fun _x _y -> 15) in\n\n\tlet ft = Float.of_int t in\n\n\tgenerate_torus ft\n\t|> List.map (fun p ->\n\t\trotate_y (0.02 *. ft) p |> rotate_x (0.01 *. ft) |> rotate_z (0.005 *. 
ft)\n\t)\n\t|> List.sort point_z_cmp\n\t|> render_to_primatives ft s\n\t|> Framebuffer.render buffer;\n\n\tbuffer\n</code></pre>\n<p>The <code>|></code> operator in OCaml just takes the output of the previous function and feeds it in as the last argument of the next function, letting you build up these pipelines which look sort of imperative but, because they're self-contained equations, are still functional.</p>\n<p>And as <a href=\"https://github.com/mdales/tcc23/blob/main/day3extra/bin/main.ml\">another effect that looks stateful</a> but turns out to be functional, here is one of my favourite effects of the set:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>This looks like it's tracking the movement of a bunch of random particles, and so you'd expect some state to be there, but in fact whilst the points are randomly generated, they're done so from the same seed each tick, so you can recreate the world and move the points based on the tick count you want, and so there is no state required.</p>\n<pre><code>let generat_points (count : int) (t : int) (screen : screen) : point list =\n\tRandom.init 42;\n\tList.init count (fun index ->\n\t\t{\n\t\t\tx = ((Random.int screen.width) + (((index + 1) * t) / 20)) mod screen.width ;\n\t\t\ty = ((Random.int screen.height) + (((index + 1) * t) / 20)) mod screen.height ;\n\t\t}\n\t)\n</code></pre>\n<p>For those wondering, the index is added to give them all different speeds, a neat trick provided by the <a href=\"https://tcc.lovebyte.party/day3extra/\">TCC challenge that day</a>. Ultimately, give this function a point in time and it'll recreate the world from nothing consistently. 
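</p>\n<p>The underlying trick is just that seeding the generator with a fixed value makes the "random" sequence repeatable, so anything derived from it becomes a pure function of the tick count. A minimal sketch of the idea (a hypothetical function for illustration, not from my library):</p>\n<pre><code>(* same seed on every call, so the same t always yields the same list *)\nlet positions_at (t : int) : int list =\n\tRandom.init 42;\n\tList.init 5 (fun i -> ((Random.int 100) + (i * t)) mod 100)\n</code></pre>\n<p>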
This relates to something Jonathan said about his aims for Flitter:</p>\n<blockquote>\n<p>My fave thing of purely functional graphical systems is being able to play with the clock: stopping it, running it backwards, skipping, etc.</p>\n</blockquote>\n<p>Not only does the lack of state make things easier to reason about in terms of testing your code, but it also unlocks these kinds of creative flows that would be more complicated otherwise.</p>\n\n\n<p>So, as a way of learning a new language, TCC is a pretty good route. I wrote a little bit of OCaml every day for the better part of a month, and slowly my code got more idiomatic thanks to the guidance of my colleagues like the ever patient <a href=\"https://patrick.sirref.org\">Patrick</a>. I was pleased to see that Jonathan also joined in TCC in the end with Flitter, to give another take on how functional some of these are, given Flitter is a much more purely functional language than OCaml (an advantage of being domain specific).</p>\n<p>Do I know OCaml inside out yet? Certainly not, but I feel I've got enough familiarity that I might try using it in lieu of Go for a few projects this year. 
I know that I'm not yet an OCaml natural, as I stick to the comfort of explicit type declarations rather than letting the compiler use type inference for example, but I do have colleagues to help me get over that in the coming year.</p>\n<p>I'm going to use the follow-on challenge of Genuary, a generative art prompt a day for January, to try to take my OCaml fantasy retro computer library to a sort of completed point, so that I can draw a line under it and feel I have a suitable conclusion, and something that anyone wanting to try TCC in OCaml in the future can pick up and use without having to worry about finding suitable abstractions and graphics libraries whilst also doing TCC in a language not many use.</p>\n<p>I do want to give a quick shout out once again to the <a href=\"https://lovebyte.party\">LoveByte</a> community - not only is TCC a nicely rounded set of challenges that makes for a great little puzzle a day, but both the organisers and the Discord were very welcoming to the idea of someone making their own thing of it rather than just using one of the traditional retro/fantasy platforms. At no point did anyone object to OCaml being thrown into the usual mix of Lua solutions - it was accepted as a fun variation, and my solutions (all of which are <a href=\"https://mynameismwd.org/tags/tcc/\">up on my personal blog here</a>) made it into the end-of-TCC live stream, which was nice. A great community of people interested in learning and helping others have fun.</p>",
+14
mwd/blog_tcc_.json
···+"summary": "<p>Normally, as someone who codes for a living, xmas is a time for me to down tools and step away from the computer for a bit, spend time with family, spend some time in the <a href=\"https://electricflapjack.com/\">workshop</a> and generally not stare at a computer so much. But these last two years I've had a lot of fun doing some small amounts of coding for fun. In this and the next post, I'm going to go through what got me doing this, what I learned from it, and why I recommend it to others. But the TL;DR is along the lines of: small regular constrained challenges, community, and exploration.</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n\n\n<p>In a post I wrote a couple of years ago I <a href=\"/blog/a-journey-into-minimal-computing-or-my-slow-evolving-2021-side-project/\">mentioned in passing</a> the idea of fantasy consoles like <a href=\"https://tic80.com\">TIC80</a>: these are programs that pretend to emulate a retro computer from the 8- or 16-bit era of computing, but remove a bunch of the practical friction in doing so with a real computer from that era. TIC80 for instance lets you write in <a href=\"https://www.lua.org\">Lua</a>, which is a very approachable language for anyone that's done any sort of programming, doesn't require you find a CRT or floppy disks, and lets you quickly get up and going writing fun little graphical things in 240x136 pixels and 16 colours. There's no networking, there's no 3D, it's just very basic old-school style computing.</p>\n<p>In that earlier blog post I wrote:</p>\n<blockquote>\n<p>I think they appeal to me because they mostly avoid the pitfalls of being just a place for wallowing in computer nostalgia, and exhibit the fun that there is in building software for a more limited domain. 
I follow a bunch of people building software for Uxn on social media, and there\u2019s just a sense of fun and enthusiasm there for building software that I think is interesting and contagious, particularly as a way to try and make low-level computers more accessible, as they were back in those early days of the personal computer era.</p>\n</blockquote>\n<p>I think that freedom to have fun in this small sandbox is something that encourages experimentation and learning, and the community that then springs from this further reinforces that. Indeed, I got started because of that community aspect: before I tried TIC80 for myself I enjoyed watching the <a href=\"https://www.twitch.tv/fieldfxdemo\">Field-FX Monday night demo streams</a>, where they get four people to code up simple graphical demos live over the course of a couple of hours. It's super chill: the limitations of the TIC80 system mean you can't get super nerdy about tweaking graphics card registers or such, and whilst that'll limit its appeal to some, I'm sure, I enjoyed watching people with very different levels of experience taking part, and all making something fun. The nice thing also about the simplicity of the system is that it's fairly easy to follow the coders as they write their demos live (some even put in comments to talk to the audience).</p>\n<p>Thus, when I spotted that the same set of people were going to do a sort of challenge-a-day in the run up to xmas, dubbed <a href=\"https://tcc.lovebyte.party\">Tiny Code Christmas</a>, I felt inspired to take part: I knew the platform constraints would make this something that couldn't get out of hand, but at the same time doing these graphical style demos was something I'm quite rusty at, so there was a chance to learn some new techniques here. 
I compared it to doing crosswords or similar puzzles when trying to explain it to someone recently: a chance to push yourself a little, but it's very bounded.</p>\n\n\n<p>So in the 12 days that followed I had fun taking part, and dipping into that community to share what I'd done and find inspiration and know-how in order to do better each time. We did classic old-school demo effects like shader bobs:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>and scrolling effects where you move the framebuffer along:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>I didn't really go in for the size-coding aspect, which is where you try to get your code down to as few bytes as possible: having worked in security software I now find I'm somewhat allergic to anything that makes code hard to read as a human, but I do understand the appeal of the challenge. You can find the code for all my entries <a href=\"https://github.com/mdales/tcc22\">posted here</a> - none of them are that exciting to read, but perhaps serve as an example that you can have a lot of fun with very little code in the correct context, particularly if you're in a community of others doing the same.</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>TIC80 really does take a lot of the friction out of coding, which is key if you're trying to make this an entertaining experience rather than a chore: you write your code in TIC80, hit cmd-R to run it, press escape to get back to your code: so there's no compiling or having to drop out to the command line etc. The edit-run-retry loop is really short and does a great job at encouraging exploration and experimentation as the overheads of doing so are so low.</p>\n\n\n<p>Off the back of TCC22 I did two follow-up things. 
<a href=\"https://lovebyte.party\">LoveByte</a>, the community behind TCC, have their annual competition at the start of February, so I did overcome my size-coding fear and submitted a small 254-byte demo for their newbie section:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>This was a re-creation of an effect from the 1993 demo <a href=\"https://en.wikipedia.org/wiki/Second_Reality\">Second Reality</a> which had totally blown my mind back then, and required a fun abuse of the memory map TIC80 provides, as I needed to have a second screen of data hidden away to get the effect. I did very poorly in the competition - I suspect recreating a 30-year-old effect was not a way to win over the audience who voted on the competition - but I had a lot of fun doing it, particularly taking part in the online demo party itself.</p>\n\n\n<p>The second thing I did thanks to TCC was spending a little time <a href=\"https://github.com/mdales/isometric-test\">writing a sort of game-engine</a>, something I'd never got around to doing before. Watching one of the FieldFX byte jams I'd spotted someone making a sort of isometric-projection landscape thing using just the simple 2D triangle drawing primitive in TIC80, and so I gave that a whirl, learning as I went about <a href=\"https://www.gamedeveloper.com/design/interview-the-making-of-dwarf-fortress\">how videogames generate landscapes procedurally</a> using techniques like layering <a href=\"https://en.wikipedia.org/wiki/Perlin_noise\">Perlin noise</a>:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>The joy of something like TIC80 is that its simplicity gives you this constrained environment where it's really easy to start doing this sort of thing.
I'd tried to write game-like things in the past, but always got quickly overwhelmed by all the possible directions I could take things, and ended up running out of steam and/or free time before I'd managed to build something cohesive - the fact that TIC80 doesn't let you do a whole bunch of things is an advantage then, keeping me focused on the key bits to the thing I'm trying to build.</p>\n<p>But as and when you do want to grow, there are incremental routes from TIC80 that don't mean you need to start over if you do want to build out a full game, say, from your TIC80 idea (although, to be clear, plenty of people do publish full games in TIC80). I ended up graduating my little game engine to <a href=\"https://www.love2d.org\">Love2D</a>, a small Lua-based 2D game engine where most of my code would just work with a little bit of renaming of the drawing functions. From that and some sprites I bought from <a href=\"https://itch.io/\">itch.io</a> I wrapped up this little bit of fun into a little game that made me happy to have spent a bit of time on it:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>By the end I had items to collect, simple NPCs, mountains, and sorted occlusion problems. Enough to satisfy my itch before I had to turn my coding attention back to work. One day I'd like to pick this back up and do something with it - I have an idea for building a community-based little open-world game - but there's no urgency, and it's parked in a nice place.</p>\n\n\n<p>As you can see if you've made it this far, TCC 2022 wasn't just fun for me because it was a coding thing and I'm a nerd, rather it was fun as it promised me a way to have some fun without it being a big drain on my holidays, let me work with a community of nice people via the LoveByte Discord, and inspired me to try a few things afterwards based on that which I'd not otherwise have tried.
The barrier to entry to TCC is very low - TIC80 can be downloaded for free, Lua is a super easy language to learn, and the Discord is there if you want to chat to people. Best yet, the challenges are just there, you don't need it to be christmas to take part, you can do so at your own leisure.</p>\n<p>In the next post, I'll talk about what I did for TCC 2023, where I took it in quite a different direction.</p>",+"content": "<p>Normally, as someone who codes for a living, xmas is a time for me to down tools and step away from the computer for a bit, spend time with family, spend some time in the <a href=\"https://electricflapjack.com/\">workshop</a> and generally not stare at a computer so much. But these last two years I've had a lot of fun doing some small amounts of coding for fun. In this and the next post, I'm going to go through what got me doing this, what I learned from it, and why I recommend it to others. But the TL;DR is along the lines of: small regular constrained challenges, community, and exploration.</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n\n\n<p>In a post I wrote a couple of years ago I <a href=\"/blog/a-journey-into-minimal-computing-or-my-slow-evolving-2021-side-project/\">mentioned in passing</a> the idea of fantasy consoles like <a href=\"https://tic80.com\">TIC80</a>: these are programs that pretend to emulate a retro computer from the 8- or 16-bit era of computing, but remove a bunch of the practical friction in doing so with a real computer from that era. TIC80 for instance lets you write in <a href=\"https://www.lua.org\">Lua</a>, which is a very approachable language for anyone that's done any sort of programming, doesn't require you to find a CRT or floppy disks, and lets you quickly get up and going writing fun little graphical things in 240x136 pixels and 16 colours.
There's no networking, there's no 3D, it's just very basic old-school style computing.</p>\n<p>In that earlier blog post I wrote:</p>\n<blockquote>\n<p>I think they appeal to me because they mostly avoid the pitfalls of being just a place for wallowing in computer nostalgia, and exhibit the fun that there is in building software for a more limited domain. I follow a bunch of people building software for Uxn on social media, and there\u2019s just a sense of fun and enthusiasm there for building software that I think is interesting and contagious, particularly as a way to try and make low-level computers more accessible, as they were back in those early days of the personal computer era.</p>\n</blockquote>\n<p>I think that freedom to have fun in this small sandbox is something that encourages experimentation and learning, and the community that then springs from this further reinforces that. Indeed, I got started because of that community aspect: before I tried TIC80 for myself I enjoyed watching the <a href=\"https://www.twitch.tv/fieldfxdemo\">Field-FX Monday night demo streams</a>, where they get four people to code up simple graphical demos live over the course of a couple of hours. It's super chill: the limitations of the TIC80 system mean you can't get super nerdy about tweaking graphics card registers or such, and whilst that'll limit its appeal to some I'm sure, I enjoyed watching people with very different levels of experience taking part, and all making something fun.
The nice thing also about the simplicity of the system is that it's fairly easy to follow the coders as they write their demos live (some even put in comments to talk to the audience).</p>\n<p>Thus, when I spotted that the same set of people were going to do a sort of challenge-a-day in the run-up to xmas, dubbed <a href=\"https://tcc.lovebyte.party\">Tiny Code Christmas</a>, I felt inspired to take part: I knew the platform constraints would make this something that couldn't get out of hand, but at the same time doing these graphical-style demos was something I'm quite rusty at, so there was a chance to learn some new techniques here. I compared it to doing crosswords or similar puzzles when trying to explain it to someone recently: a chance to push yourself a little, but it's very bounded.</p>\n\n\n<p>So in the 12 days that followed I had fun taking part, and dipping into that community to share what I'd done and find inspiration and know-how in order to do better each time. We did classic old-school demo effects like shader bobs:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>and scrolling effects where you move the framebuffer along:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>I didn't really go in for the size-coding aspect, which is where you try to get your code down to as few bytes as possible: having worked in security software I now find I'm somewhat allergic to anything that makes code hard to read as a human, but I do understand the appeal of the challenge.
You can find the code for all my entries <a href=\"https://github.com/mdales/tcc22\">posted here</a> - none of them are that exciting to read, but perhaps serve as an example that you can have a lot of fun with very little code in the correct context, particularly if you're in a community of others doing the same.</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>TIC80 really does take a lot of the friction out of coding, which is key if you're trying to make this an entertaining experience rather than a chore: you write your code in TIC80, hit cmd-R to run it, press escape to get back to your code: so there's no compiling or having to drop out to the command line etc. The edit-run-retry loop is really short and does a great job at encouraging exploration and experimentation as the overheads of doing so are so low.</p>\n\n\n<p>Off the back of TCC22 I did two follow-up things. <a href=\"https://lovebyte.party\">LoveByte</a>, the community behind TCC, have their annual competition at the start of February, so I did overcome my size-coding fear and submitted a small 254-byte demo for their newbie section:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>This was a re-creation of an effect from the 1993 demo <a href=\"https://en.wikipedia.org/wiki/Second_Reality\">Second Reality</a> which had totally blown my mind back then, and required a fun abuse of the memory map TIC80 provides, as I needed to have a second screen of data hidden away to get the effect. I did very poorly in the competition - I suspect recreating a 30-year-old effect was not a way to win over the audience who voted on the competition - but I had a lot of fun doing it, particularly taking part in the online demo party itself.</p>\n\n\n<p>The second thing I did thanks to TCC was spending a little time <a href=\"https://github.com/mdales/isometric-test\">writing a sort of game-engine</a>, something I'd never got around to doing before.
Watching one of the FieldFX byte jams I'd spotted someone making a sort of isometric-projection landscape thing using just the simple 2D triangle drawing primitive in TIC80, and so I gave that a whirl, learning as I went about <a href=\"https://www.gamedeveloper.com/design/interview-the-making-of-dwarf-fortress\">how videogames generate landscapes procedurally</a> using techniques like layering <a href=\"https://en.wikipedia.org/wiki/Perlin_noise\">Perlin noise</a>:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>The joy of something like TIC80 is that its simplicity gives you this constrained environment where it's really easy to start doing this sort of thing. I'd tried to write game-like things in the past, but always got quickly overwhelmed by all the possible directions I could take things, and ended up running out of steam and/or free time before I'd managed to build something cohesive - the fact that TIC80 doesn't let you do a whole bunch of things is an advantage then, keeping me focused on the key bits to the thing I'm trying to build.</p>\n<p>But as and when you do want to grow, there are incremental routes from TIC80 that don't mean you need to start over if you do want to build out a full game, say, from your TIC80 idea (although, to be clear, plenty of people do publish full games in TIC80). I ended up graduating my little game engine to <a href=\"https://www.love2d.org\">Love2D</a>, a small Lua-based 2D game engine where most of my code would just work with a little bit of renaming of the drawing functions. From that and some sprites I bought from <a href=\"https://itch.io/\">itch.io</a> I wrapped up this little bit of fun into a little game that made me happy to have spent a bit of time on it:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>By the end I had items to collect, simple NPCs, mountains, and sorted occlusion problems.
Enough to satisfy my itch before I had to turn my coding attention back to work. One day I'd like to pick this back up and do something with it - I have an idea for building a community-based little open-world game - but there's no urgency, and it's parked in a nice place.</p>\n\n\n<p>As you can see if you've made it this far, TCC 2022 wasn't just fun for me because it was a coding thing and I'm a nerd, rather it was fun as it promised me a way to have some fun without it being a big drain on my holidays, let me work with a community of nice people via the LoveByte Discord, and inspired me to try a few things afterwards based on that which I'd not otherwise have tried. The barrier to entry to TCC is very low - TIC80 can be downloaded for free, Lua is a super easy language to learn, and the Discord is there if you want to chat to people. Best yet, the challenges are just there, you don't need it to be christmas to take part, you can do so at your own leisure.</p>\n<p>In the next post, I'll talk about what I did for TCC 2023, where I took it in quite a different direction.</p>",
+14
mwd/blog_the-partially-dynamic-web_.json
+14
mwd/blog_the-partially-dynamic-web_.json
···+"summary": "<h1>Background</h1>\n<p>I have three websites (this one, my <a href=\"https://mynameismwd.org/\">personal site</a>, and one for <a href=\"https://mwdales-guitars.uk/\">my luthiery endeavours</a>), and despite each starting out with a different technology stack, for the last few years I\u2019d migrated them all to the <a href=\"https://gohugo.io\">Hugo static site generator</a>, as a way of making it easier for me to mess around with. Without a fixed database, I could more readily structure the content as I wanted it, I had more freedom over templating, and ultimately it\u2019s less resource-intensive to compile the site occasionally and just serve static files than to keep dynamic infrastructure running for what is a set of low-traffic websites. At least in theory, we\u2019ll come back to this last point.</p>\n<p>Like most static site generators, Hugo uses a system called <a href=\"https://gohugo.io/content-management/front-matter/\">Front matter</a>, where you store each page as a <a href=\"https://en.wikipedia.org/wiki/Markdown\">markdown file</a> of content, and some <a href=\"https://en.wikipedia.org/wiki/YAML\">YAML</a> at the top of that file to store metadata, such as page title and publication date, which isn\u2019t something markdown supports. With these two parts Hugo can generate your website based on where files are in directories, and with the appropriate bits from the front matter. Hugo will use a template system to turn your markdown into HTML files, roughly following the structure of the folders you store the markdown files in.</p>\n<p>The templates for my sites I\u2019d made by hand myself, which I think is a key part to unlocking the power of Hugo. Because not only can you decide how your markdown looks from your template, but you can also query the front matter, and so change how your page looks based on the metadata. I heavily used this feature, using it a bit like a database entry for each page.
This let me add a synopsis to each page, or a title image with alt-text that becomes the thumbnail on the list views. For photos I store all the EXIF data in there too.</p>\n<p>The final feature of Hugo that is very powerful is that it lets you go beyond the standard markdown-to-HTML rendering by adding <a href=\"https://gohugo.io/content-management/shortcodes/\">shortcodes</a> - so in addition to standard markdown notation for links or images, you can add your own. So I added some to embed YouTube videos and audio, and I made my own image tag that gave me more control over rendering and let me specify an alt text etc.</p>\n<p>Between all this, Hugo worked pretty well for me, and was much lower maintenance than running a dynamic site that requires a database to store all the content in etc. But in the end I\u2019ve replaced it with <a href=\"https://github.com/mdales/webplats/\">my own semi-static-but-actually-dynamic system</a>, and I wanted to make some notes as to why.</p>\n<h1>Motivation</h1>\n<p>Firstly, let\u2019s talk about the resource usage. In general, I still think a static site is going to have lower overall resource requirements than a dynamic website, and I think that\u2019s true for two of my three websites still. However, for my personal site, it was demonstrably worse. My personal website has content going back over 20 years, and contains a lot of high-resolution media in my photos sections. I have about 10k pages, but then when you add thumbnails and display-sized images and all other things, that goes up to about 70k resources which Hugo had to prepare for display.
I\u2019m not that famous or interesting, and so most of those pages are never going to be looked at in a given unit of time, yet if I make a change to my templates, they all get recalculated, and that\u2019s a lot of files to generate and copy to my server just for no one to look at them.</p>\n<p>I treat the source material for my website as an archive, and I keep in it all the images and video data at the highest resolution I have and then scale it down for the website at compile time. Even though I\u2019m keeping very high-resolution primary data, my source directory is 8GB of data, while the compiled static website is currently 12.5GB. That\u2019s a lot of bytes that no one is going to look at. And I have to keep both a copy of the raw site and the compiled site so I don\u2019t need to rebuild all of it every time, so I\u2019m over 20GB on disk.</p>\n<p>So, in terms of resources alone for my personal website, I think it\u2019s safe to say that even with the sensible caching that Hugo does, the static site is somewhat wasteful at this scale.</p>\n\n\n<p>The next motivation for change is how Hugo handles <a href=\"https://gohugo.io/content-management/taxonomies/\">Taxonomies</a>. That is to say, alternative structures to present data from the raw \u201chere is a list of things over time\u201d. An easy example of that is albums of photos. I have my main feed of photos in the website, but I also like to group them into thematic albums. Hugo lets me express this, but the way it does this has to be constrained by the fact it\u2019s compiling to raw HTML files. So I can generate a page for the album, but when you click through to an item in the album, there is only a single page for the photo, so the previous and next links are just to the global feed previous and next, not the album version. This makes some sense, otherwise it\u2019d have to generate copies of each page, and for my photos that\u2019d cause that 12.5GB to shoot up even more, clearly not desirable.
But the fact that albums can\u2019t have forward/back buttons that keep you in the album annoys me, because this sort of arrangement is something I do a lot across all my sites. The correct solution is that you need to know at page render time how the visitor got to a page to generate the right forward/backward links, and so you need a dynamic renderer.</p>\n\n\n<p>A minor one, but I've been learning Swedish, so I have a small number of posts that are in both Swedish and English. Being a static site, I can't have a page served in either Swedish or English, I have to give each their own URL, but then linking becomes challenging if I don't want to duplicate each page.</p>\n\n\n<p>The final point I hit with Hugo was just that it\u2019s built to do one style of website and do it well: one with lists of pages that you drill down into. It\u2019s absolutely great for that, but I found that at times I wanted to, say, generate a list of items and not have a page associated with them, just the list, and to do that I\u2019d still have to make the page and not link to it. Data-driven virtual pages of this sort are something Hugo seems to be slowly coming to, but ultimately Hugo needs some structure and folk like me will always find corner cases where it doesn\u2019t work for them.</p>\n<h1>Rolling my own semi-dynamic site</h1>\n<p>So, whilst I still would recommend Hugo as an excellent static site generator, as someone who likes to play with their websites and cares about how the content is structured, I\u2019ve decided to <a href=\"https://github.com/mdales/webplats/\">make my own dynamic website renderer</a>, called simply Webplats.</p>\n<p>The design goal is that I want a dynamically evaluated static site.
This means the content will be stored just as it was for Hugo, with a series of frontmatter and markdown files on disk for all the content, but Webplats will do the rendering on demand, so I don\u2019t need to generate thousands of pages and thumbnails and resized images that are never going to be viewed. It also means I can take over the mapping from content to URL, and so I can fix that problem with having many views on the same content, with the content rendering being aware of how the viewer got there.</p>\n<p>My other goal is that I\u2019m not trying to make something that will work for everyone. How I used Hugo was highly customised, and that\u2019s what I\u2019m going to support. All three of my websites use the same set of tricks and extensions I\u2019d built using shortcodes and custom template logic on top of Hugo, so I\u2019ll build something generic enough to support them, but I\u2019ve no interest in maintaining some general-purpose bit of software like this for other people. It\u2019s open source to act as inspiration to others perhaps, but that\u2019s it. I think the power here is that I can tailor this to just what I want, and keep the footprint small and manageable as a hobby platform.</p>\n<h1>Webplats</h1>\n<p>To my surprise, getting something up and running took a few hours, and then it took me another week or so to get it to where I deployed it, just doing an hour or so a day. As a spare-time project, I'm quite amazed how fast I went from idea to having <a href=\"https://mynameismwd.org/\">my personal site deployed with this</a>.</p>\n<p>It's not the cleanest of implementations: the code is in flux as I'm still figuring out things, and it'll be interesting as I move my other sites over to it, as I've hardcoded certain things for my personal website. But on the other hand that site is by far the most complicated one, so it's a good place to start.
In the transition I\u2019ve certainly still got a bunch of things that are broken, but it\u2019s not a huge amount, and I already have improvements in terms of, say, album links working correctly now. This is the joy of doing it on my personal website, which is low traffic and low expectation - a small amount of regression really won\u2019t be noticed, and if it is it\u2019s not really that important.</p>\n<p>I\u2019m using the <a href=\"https://aantron.github.io/dream/\">Dream</a> library for <a href=\"https://ocaml.org/\">OCaml</a>, which has both built-in routing and templating. I made sure to keep the URL layout that Hugo used as best I could, so in theory the transition shouldn\u2019t be noticed by most people, as all the content remains at the same URL it was.</p>\n<p>Using a functional language for this kind of work actually maps very nicely: all I\u2019m doing is taking data in one format and presenting it in another, so functional transforms are what I need. The way the website is stored on disk for a static site generator means I'm mostly doing a translation of that structure into the URIs for the website, so I was starting from a good place for this project.</p>\n<p>Thanks to Hugo encouraging me to use shortcodes for all resources in a page (I never used the markdown image tags), it was low effort to ensure all resources in a page have their own URL to render them on demand, as I don\u2019t ever need to parse the markdown myself beyond pulling out shortcodes. For images I\u2019m just using <a href=\"https://gallium.inria.fr/camlimages/\">Camlimages</a> which is quite an old library and doesn\u2019t support all the image formats I have acquired over 20-plus years, but it\u2019s enough to get started with.
Performance-wise, this will be a regression, as images are resized and stored in a small cache the first time they\u2019re viewed, but given that most people consume my site via RSS, and that I view each new page myself to check it works, most folk will never see that delay.</p>\n<p>The aim so far has just been to get as close to the Hugo version as I can without changing the data on disk. What I'm looking forward to doing now I've switched is making changes to the on-disk representation to let me simplify the OCaml code, and add some new fun features.</p>",+"content": "<h1>Background</h1>\n<p>I have three websites (this one, my <a href=\"https://mynameismwd.org/\">personal site</a>, and one for <a href=\"https://mwdales-guitars.uk/\">my luthiery endeavours</a>), and despite each starting out with a different technology stack, for the last few years I\u2019d migrated them all to the <a href=\"https://gohugo.io\">Hugo static site generator</a>, as a way of making it easier for me to mess around with. Without a fixed database, I could more readily structure the content as I wanted it, I had more freedom over templating, and ultimately it\u2019s less resource-intensive to compile the site occasionally and just serve static files than to keep dynamic infrastructure running for what is a set of low-traffic websites. At least in theory, we\u2019ll come back to this last point.</p>\n<p>Like most static site generators, Hugo uses a system called <a href=\"https://gohugo.io/content-management/front-matter/\">Front matter</a>, where you store each page as a <a href=\"https://en.wikipedia.org/wiki/Markdown\">markdown file</a> of content, and some <a href=\"https://en.wikipedia.org/wiki/YAML\">YAML</a> at the top of that file to store metadata, such as page title and publication date, which isn\u2019t something markdown supports.
With these two parts Hugo can generate your website based on where files are in directories, and with the appropriate bits from the front matter. Hugo will use a template system to turn your markdown into HTML files, roughly following the structure of the folders you store the markdown files in.</p>\n<p>The templates for my sites I\u2019d made by hand myself, which I think is a key part to unlocking the power of Hugo. Because not only can you decide how your markdown looks from your template, but you can also query the front matter, and so change how your page looks based on the metadata. I heavily used this feature, using it a bit like a database entry for each page. This let me add a synopsis to each page, or a title image with alt-text that becomes the thumbnail on the list views. For photos I store all the EXIF data in there too.</p>\n<p>The final feature of Hugo that is very powerful is that it lets you go beyond the standard markdown-to-HTML rendering by adding <a href=\"https://gohugo.io/content-management/shortcodes/\">shortcodes</a> - so in addition to standard markdown notation for links or images, you can add your own. So I added some to embed YouTube videos and audio, and I made my own image tag that gave me more control over rendering and let me specify an alt text etc.</p>\n<p>Between all this, Hugo worked pretty well for me, and was much lower maintenance than running a dynamic site that requires a database to store all the content in etc. But in the end I\u2019ve replaced it with <a href=\"https://github.com/mdales/webplats/\">my own semi-static-but-actually-dynamic system</a>, and I wanted to make some notes as to why.</p>\n<h1>Motivation</h1>\n<p>Firstly, let\u2019s talk about the resource usage. In general, I still think a static site is going to have lower overall resource requirements than a dynamic website, and I think that\u2019s true for two of my three websites still. However, for my personal site, it was demonstrably worse.
My personal website has content going back over 20 years, and contains a lot of high-resolution media in my photos sections. I have about 10k pages, but then when you add thumbnails and display-sized images and all other things, that goes up to about 70k resources which Hugo had to prepare for display. I\u2019m not that famous or interesting, and so most of those pages are never going to be looked at in a given unit of time, yet if I make a change to my templates, they all get recalculated, and that\u2019s a lot of files to generate and copy to my server just for no one to look at them.</p>\n<p>I treat the source material for my website as an archive, and I keep in it all the images and video data at the highest resolution I have and then scale it down for the website at compile time. Even though I\u2019m keeping very high-resolution primary data, my source directory is 8GB of data, while the compiled static website is currently 12.5GB. That\u2019s a lot of bytes that no one is going to look at. And I have to keep both a copy of the raw site and the compiled site so I don\u2019t need to rebuild all of it every time, so I\u2019m over 20GB on disk.</p>\n<p>So, in terms of resources alone for my personal website, I think it\u2019s safe to say that even with the sensible caching that Hugo does, the static site is somewhat wasteful at this scale.</p>\n\n\n<p>The next motivation for change is how Hugo handles <a href=\"https://gohugo.io/content-management/taxonomies/\">Taxonomies</a>. That is to say, alternative structures to present data from the raw \u201chere is a list of things over time\u201d. An easy example of that is albums of photos. I have my main feed of photos in the website, but I also like to group them into thematic albums. Hugo lets me express this, but the way it does this has to be constrained by the fact it\u2019s compiling to raw HTML files.
So I can generate a page for the album, but when you click through to an item in the album, there is only a single page for the photo, so the previous and next links are just to the global feed previous and next, not the album version. This makes some sense, otherwise it\u2019d have to generate copies of each page, and for my photos that\u2019d cause that 12.5GB to shoot up even more, clearly not desirable. But the fact that albums can\u2019t have forward/back buttons that keep you in the album annoys me, because this sort of arrangement is something I do a lot across all my sites. The correct solution is that you need to know at page render time how the visitor got to a page to generate the right forward/backward links, and so you need a dynamic renderer.</p>\n\n\n<p>A minor one, but I've been learning Swedish, so I have a small number of posts that are in both Swedish and English. Being a static site, I can't have a page served in either Swedish or English, I have to give each their own URL, but then linking becomes challenging if I don't want to duplicate each page.</p>\n\n\n<p>The final point I hit with Hugo was just that it\u2019s built to do one style of website and do it well: one with lists of pages that you drill down into. It\u2019s absolutely great for that, but I found that at times I wanted to, say, generate a list of items and not have a page associated with them, just the list, and to do that I\u2019d still have to make the page and not link to it.
Data-driven virtual pages of this sort are something Hugo seems to be slowly coming to, but ultimately Hugo needs some structure and folk like me will always find corner cases where it doesn\u2019t work for them.</p>\n<h1>Rolling my own semi-dynamic site</h1>\n<p>So, whilst I still would recommend Hugo as an excellent static site generator, as someone who likes to play with their websites and cares about how the content is structured, I\u2019ve decided to <a href=\"https://github.com/mdales/webplats/\">make my own dynamic website renderer</a>, called simply Webplats.</p>\n<p>The design goal is that I want a dynamically evaluated static site. This means the content will be stored just as it was for Hugo, with a series of frontmatter and markdown files on disk for all the content, but Webplats will do the rendering on demand, so I don\u2019t need to generate thousands of pages and thumbnails and resized images that are never going to be viewed. It also means I can take over the mapping from content to URL, and so I can fix that problem with having many views on the same content, with the content rendering being aware of how the viewer got there.</p>\n<p>My other goal is that I\u2019m not trying to make something that will work for everyone. How I used Hugo was highly customised, and that\u2019s what I\u2019m going to support. All three of my websites use the same set of tricks and extensions I\u2019d built using shortcodes and custom template logic on top of Hugo, so I\u2019ll build something generic enough to support them, but I\u2019ve no interest in maintaining some general-purpose bit of software like this for other people. It\u2019s open source to act as inspiration to others perhaps, but that\u2019s it.
I think the power here is that I can tailor this to just what I want, and keep the footprint small and manageable as a hobby platform.</p>\n<h1>Webplats</h1>\n<p>To my surprise, getting something up and running took a few hours, and then it took me another week or so to get it to where I deployed it, just doing an hour or so a day. As a spare time project, I'm quite amazed how fast I went from idea to having <a href=\"https://mynameismwd.org/\">my personal site deployed with this</a>.</p>\n<p>It's not the cleanest of implementations: the code is in flux as I'm still figuring out things, and it'll be interesting as I move my other sites over to it, as I've hardcoded certain things for my personal website. But on the other hand that site is by far the most complicated one, so it's a good place to start. In the transition I\u2019ve certainly still got a bunch of things that are broken, but it\u2019s not a huge amount, and I already have improvements in terms of, say, album links working correctly now. This is the joy of doing it on my personal website, which is low traffic and low expectation - a small amount of regression really won\u2019t be noticed, and if it is it\u2019s not really that important.</p>\n<p>I\u2019m using the <a href=\"https://aantron.github.io/dream/\">Dream</a> library for <a href=\"https://ocaml.org/\">OCaml</a>, which has both built-in routing and templating. I made sure to keep the URL layout that Hugo used as best I could, so in theory the transition shouldn\u2019t be noticed by most people, as all the content remains at the same URL it was.</p>\n<p>Using a functional language for this kind of work actually maps very nicely: all I\u2019m doing is taking data in one format and presenting it in another, so functional transforms are what I need. 
The way the website is stored on disk for a static site generator means I'm mostly doing a translation of that structure into the URIs for the website, so I was starting from a good place for this project.</p>\n<p>Thanks to Hugo encouraging me to use shortcodes for all resources in a page (I never used the markdown image tags), it was low effort to ensure all resources in a page have their own URL to render them on demand, as I don\u2019t ever need to parse the markdown myself beyond pulling out shortcodes. For images I\u2019m just using <a href=\"https://gallium.inria.fr/camlimages/\">Camlimages</a> which is quite an old library and doesn\u2019t support all the image formats I have acquired over 20 plus years, but it\u2019s enough to get started with. Performance wise, this will be a regression, as images are resized and stored in a small cache the first time they\u2019re viewed, but given most people consume my site via RSS, when I add a new page and look at it myself to check it works, it\u2019ll mean for most folk they don\u2019t see that.</p>\n<p>The aim so far has just been to get as close to the Hugo version as I can without changing the data on disk. What I'm looking forward to doing now I've switched is making changes to the on disk representation to let me simplify the OCaml code, and add some new fun features.</p>",
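The context-aware forward/backward links described above come down to computing a page's neighbours against whichever ordered list of slugs the visitor arrived through: the global feed, or one album. A minimal OCaml sketch of that idea, with an illustrative `neighbours` helper that is not Webplats' actual code:

```ocaml
(* Given the ordered slugs for the context the visitor came from,
   return the previous and next slugs around the current one.
   Hypothetical helper for illustration, not part of Webplats. *)
let neighbours slugs current =
  let rec go = function
    | prev :: x :: next :: _ when x = current -> (Some prev, Some next)
    | x :: next :: _ when x = current -> (None, Some next)
    | prev :: x :: [] when x = current -> (Some prev, None)
    | _ :: rest -> go rest
    | [] -> (None, None)
  in
  go slugs

let () =
  (* The same photo "b" gets different neighbours depending on whether
     the visitor arrived via the feed or via an album, which is exactly
     why the renderer needs to know the navigation context. *)
  let feed = [ "a"; "b"; "c"; "d" ] in
  let album = [ "b"; "d" ] in
  assert (neighbours feed "b" = (Some "a", Some "c"));
  assert (neighbours album "b" = (None, Some "d"))
```

A static generator would have to emit one copy of the page per context to get this; a dynamic renderer just picks the list at request time from the URL.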
+2
-2
mwd/metadata.json
+14
mwd/weeknotes_2025-05-05_.json
···+"summary": "<h1>Last Week</h1>\n<h2>OCaml GeoTIFF progress</h2>\n<p>I made some good progress on building on <a href=\"https://patrick.sirref.org\">Patrick</a>'s and George's work with the <a href=\"https://github.com/geocaml/ocaml-tiff\">OCaml GeoTIFF</a> library:</p>\n<ul>\n<li>I added reading of compressed LZW data</li>\n<li>Added support for more pixel formats</li>\n<li>Added support for reading from different planes within a file</li>\n<li>Added some unittests</li>\n</ul>\n<p>That last one turned out to cause some trouble, and I'm grateful to Patrick for his help fixing things. Whilst they ran locally, the tests were failing in CI, apparently as both <a href=\"\">Ounit2</a>'s test runner and EIO, which I was using to get data for the tests, were using fork, and double forking is often a recipe for trouble.</p>\n<p>Patrick and I also had some discussion on issues around performance if you're not using EIO: the TIFF library's interface for reading data is based on <a href=\"https://github.com/mirage/ocaml-cstruct\"><code>Cstruct</code></a>, which I assume is to align with what EIO uses, but if you're not an EIO user, and indeed you're coming from a &quot;new-to-ocaml&quot; world, then you'll be looking to load data with <a href=\"https://ocaml.org/manual/5.3/api/In_channel.html\"><code>In_channel</code></a>, which presents a problem then, as the best you can do via <code>In_channel</code> is load the data into a <code>bytes</code> value and then copy it to a <code>Cstruct</code> value and then have the TIFF library consume it. 
Patrick kindly spent some time to come up with <a href=\"https://github.com/geocaml/ocaml-tiff/pull/28\">a more direct interface</a> for those not using EIO.</p>\n<p>This was nice, as although I was using EIO for the unittests, for manual testing I was hooking up the library to a simple <a href=\"https://github.com/claudiusFX/claudius\">Claudius</a>-based visualiser I have for geo-data, making it work with GeoTIFFs and that's not using EIO or such yet, and so Patrick's fix made loading data for this a lot nippier:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>Here I'm visualising one of the elevation maps we use in the <a href=\"https://github.com/quantifyearth/life/\">LIFE pipeline</a>. The tool I'm using is not really that usable yet, but it's a slow burn project to let me load 3D data in actual 3D: it does load GeoJSON and CSV data already, and now with GeoTIFF perhaps it'll be almost useful enough I'll start to put some effort into it. It clearly isn't a high quality rendering, but a quick visualisation like this is great for telling me that I'm extracting not just the image data but also the right geospatial data with TIFF, and in future it'll be a useful sanity-check tool for the pipelines I work on.</p>\n<h2>LIFE</h2>\n<p>I generated some new scenario versions of LIFE as needed by Ali for some investigations she was doing into how to present the LIFE metric. It does lead me to think we need a guide as to not just how to run LIFE but how to alter it to make certain experiments. Ali has already started on a methodology guide, perhaps we also need a method guide (and a hat tip to Tom Swinfield for educating me recently to the difference between those two terms). The downside of this is it's just yet another thing to do and we're all quite busy.</p>\n<h2>STAR</h2>\n<p>Simon Tarr has finally tried running my STAR implementation, which is great news. 
Inevitably, as the first person who isn't me to try run it he hit some issues, but we can hopefully now just play the game where I fix a thing and he runs it until we hit the next issue.</p>\n<p>The one big thing that he hit, not having a compute server as big as the one I tend to use, is that for a bunch of the base layers that we need to resize/reproject but don't change over time and aren't a variable in the STAR method, they are super slow to calculate - which you do once and never again. To save Simon some time, after he demonstrated they started running, I just uploaded all the results to our shared cloud storage, as they're not that big. I think in general though we should push them to Zenodo, so that others can skip this stage also.</p>\n<p>Anyway, great news that we've started this, and Simon and I plan to sit down together in the DAB this coming week to try get through the rest of the issues.</p>\n<h2>Den Stora \u00c4lgvandringen \u00e4r \u00f6ver</h2>\n<p>This year's <a href=\"https://www.svtplay.se/den-stora-algvandringen\">Great Moose Migration</a> has come to a close, with 70 meese swimming over the river at the area near the cameras as they migrate north. It was an interesting one, as spring was very early this year, so they had to start the stream a week early, as the ice had already melted and meese were starting to be in the area. Indeed, most swam within that first week or so, and very few in the final week. 
This was the opposite of 2023, when spring was very late, and on the date of the official close no meese had swum, so they had to extend it a week the other way.</p>\n<p>It was a fun few weeks, and I have a plan for a geospatial-related hack for next year's event, so hopefully I'll find a little time for that in the latter half of the year.</p>\n<h1>This Week</h1>\n<h2>OCaml GeoTIFF</h2>\n<p>On the OCaml GeoTIFF side of things, writing data is the next big thing to tackle if this is to be a usable tool, and TIFF is not a great format from that perspective, as its flexibility leads to a bunch of challenges whereby the file itself can suffer internal fragmentation. TIFF data is stored in strips held in a dictionary, which is fine if your data is uncompressed and the length of those strips is a constant, but if your data is compressed, then the length of those strips can change depending on the data, so if you modify data in an existing image then the strip can shrink, leaving dead space in the middle of the file, or it can grow so that you won't have enough room, and you'll need to relocate the strip to the end of the file and now you have even more dead space in the middle of the file. You can compact the file, but on a 150GB file that's a lot of data churn if you modify the first strip...</p>\n<h2>STAR and LIFE</h2>\n<p>Specific things:</p>\n<ul>\n<li>Sit down with Simon and get him running my STAR code.</li>\n<li>We have another LIFE meeting around future work, and for once I think I've done all my action items for this one!</li>\n</ul>\n<p>On a more general note though, for both I need to complete the <a href=\"https://gmd.copernicus.org/articles/15/5093/2022/\">Dahal et al</a> validation method, which requires using occurrence data from <a href=\"https://www.gbif.org\">GBIF</a>. 
We've been mirroring GBIF locally, so I need to work with <a href=\"https://anil.recoil.org/\">Anil</a> to get access to that so I can start using it.</p>",+"content": "<h1>Last Week</h1>\n<h2>OCaml GeoTIFF progress</h2>\n<p>I made some good progress on building on <a href=\"https://patrick.sirref.org\">Patrick</a>'s and George's work with the <a href=\"https://github.com/geocaml/ocaml-tiff\">OCaml GeoTIFF</a> library:</p>\n<ul>\n<li>I added reading of compressed LZW data</li>\n<li>Added support for more pixel formats</li>\n<li>Added support for reading from different planes within a file</li>\n<li>Added some unittests</li>\n</ul>\n<p>That last one turned out to cause some trouble, and I'm grateful to Patrick for his help fixing things. Whilst they ran locally, the tests were failing in CI, apparently as both <a href=\"\">Ounit2</a>'s test runner and EIO, which I was using to get data for the tests, were using fork, and double forking is often a recipe for trouble.</p>\n<p>Patrick and I also had some discussion on issues around performance if you're not using EIO: the TIFF library's interface for reading data is based on <a href=\"https://github.com/mirage/ocaml-cstruct\"><code>Cstruct</code></a>, which I assume is to align with what EIO uses, but if you're not an EIO user, and indeed you're coming from a &quot;new-to-ocaml&quot; world, then you'll be looking to load data with <a href=\"https://ocaml.org/manual/5.3/api/In_channel.html\"><code>In_channel</code></a>, which presents a problem then, as the best you can do via <code>In_channel</code> is load the data into a <code>bytes</code> value and then copy it to a <code>Cstruct</code> value and then have the TIFF library consume it. 
Patrick kindly spent some time to come up with <a href=\"https://github.com/geocaml/ocaml-tiff/pull/28\">a more direct interface</a> for those not using EIO.</p>\n<p>This was nice, as although I was using EIO for the unittests, for manual testing I was hooking up the library to a simple <a href=\"https://github.com/claudiusFX/claudius\">Claudius</a>-based visualiser I have for geo-data, making it work with GeoTIFFs and that's not using EIO or such yet, and so Patrick's fix made loading data for this a lot nippier:</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>Here I'm visualising one of the elevation maps we use in the <a href=\"https://github.com/quantifyearth/life/\">LIFE pipeline</a>. The tool I'm using is not really that usable yet, but it's a slow burn project to let me load 3D data in actual 3D: it does load GeoJSON and CSV data already, and now with GeoTIFF perhaps it'll be almost useful enough I'll start to put some effort into it. It clearly isn't a high quality rendering, but a quick visualisation like this is great for telling me that I'm extracting not just the image data but also the right geospatial data with TIFF, and in future it'll be a useful sanity-check tool for the pipelines I work on.</p>\n<h2>LIFE</h2>\n<p>I generated some new scenario versions of LIFE as needed by Ali for some investigations she was doing into how to present the LIFE metric. It does lead me to think we need a guide as to not just how to run LIFE but how to alter it to make certain experiments. Ali has already started on a methodology guide, perhaps we also need a method guide (and a hat tip to Tom Swinfield for educating me recently to the difference between those two terms). The downside of this is it's just yet another thing to do and we're all quite busy.</p>\n<h2>STAR</h2>\n<p>Simon Tarr has finally tried running my STAR implementation, which is great news. 
Inevitably, as the first person who isn't me to try run it he hit some issues, but we can hopefully now just play the game where I fix a thing and he runs it until we hit the next issue.</p>\n<p>The one big thing that he hit, not having a compute server as big as the one I tend to use, is that for a bunch of the base layers that we need to resize/reproject but don't change over time and aren't a variable in the STAR method, they are super slow to calculate - which you do once and never again. To save Simon some time, after he demonstrated they started running, I just uploaded all the results to our shared cloud storage, as they're not that big. I think in general though we should push them to Zenodo, so that others can skip this stage also.</p>\n<p>Anyway, great news that we've started this, and Simon and I plan to sit down together in the DAB this coming week to try get through the rest of the issues.</p>\n<h2>Den Stora \u00c4lgvandringen \u00e4r \u00f6ver</h2>\n<p>This year's <a href=\"https://www.svtplay.se/den-stora-algvandringen\">Great Moose Migration</a> has come to a close, with 70 meese swimming over the river at the area near the cameras as they migrate north. It was an interesting one, as spring was very early this year, so they had to start the stream a week early, as the ice had already melted and meese were starting to be in the area. Indeed, most swam within that first week or so, and very few in the final week. 
This was the opposite of 2023, when spring was very late, and on the date of the official close no meese had swum, so they had to extend it a week the other way.</p>\n<p>It was a fun few weeks, and I have a plan for a geospatial-related hack for next year's event, so hopefully I'll find a little time for that in the latter half of the year.</p>\n<h1>This Week</h1>\n<h2>OCaml GeoTIFF</h2>\n<p>On the OCaml GeoTIFF side of things, writing data is the next big thing to tackle if this is to be a usable tool, and TIFF is not a great format from that perspective, as its flexibility leads to a bunch of challenges whereby the file itself can suffer internal fragmentation. TIFF data is stored in strips held in a dictionary, which is fine if your data is uncompressed and the length of those strips is a constant, but if your data is compressed, then the length of those strips can change depending on the data, so if you modify data in an existing image then the strip can shrink, leaving dead space in the middle of the file, or it can grow so that you won't have enough room, and you'll need to relocate the strip to the end of the file and now you have even more dead space in the middle of the file. You can compact the file, but on a 150GB file that's a lot of data churn if you modify the first strip...</p>\n<h2>STAR and LIFE</h2>\n<p>Specific things:</p>\n<ul>\n<li>Sit down with Simon and get him running my STAR code.</li>\n<li>We have another LIFE meeting around future work, and for once I think I've done all my action items for this one!</li>\n</ul>\n<p>On a more general note though, for both I need to complete the <a href=\"https://gmd.copernicus.org/articles/15/5093/2022/\">Dahal et al</a> validation method, which requires using occurrence data from <a href=\"https://www.gbif.org\">GBIF</a>. We've been mirroring GBIF locally, so I need to work with <a href=\"https://anil.recoil.org/\">Anil</a> to get access to that so I can start using it.</p>",
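The `In_channel`-to-`Cstruct` double copy discussed in the notes above can be sketched with the standard library alone: a `Cstruct` is backed by a Bigarray, so a plain Bigarray buffer stands in for it here. `buffer_of_file` is an illustrative name, not part of the OCamlTIFF API:

```ocaml
(* Stdlib-only sketch of the double copy a non-EIO user was forced into:
   In_channel can only hand back a heap string/bytes, which must then be
   copied again into the off-heap (Bigarray-backed) buffer that a
   Cstruct-based reader consumes. *)
let buffer_of_file path =
  In_channel.with_open_bin path (fun ic ->
    (* Copy 1: file contents into a heap-allocated string. *)
    let data = In_channel.input_all ic in
    let open Bigarray in
    let buf = Array1.create char c_layout (String.length data) in
    (* Copy 2: byte by byte into the off-heap buffer. *)
    String.iteri (fun i c -> buf.{i} <- c) data;
    buf)
```

The more direct interface in the pull request linked above exists precisely so readers not using EIO can avoid that second copy.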
+14
mwd/weeknotes_2025-05-12_.json
···+"summary": "<h1>Last week</h1>\n<h2>Part II project submissions</h2>\n<p>I'm supervising a couple of part II projects (for those not at Cambridge, part II is what they call the third and final undergraduate year here), and the submission deadline is at the end of this week, so last week I've been doing draft reviewing for them both.</p>\n<h2>Storage</h2>\n<p><a href=\"https://www.tunbury.org\">Mark</a>, <a href=\"https://anil.recoil.org/\">Anil</a>, and <a href=\"https://patrick.sirref.org\">Patrick</a> had a useful discussion about our plans for large scale storage of data in our group. When I started in the EEG a few years ago we had a 128 TB disk that seemed like it'd last forever: six months ago it filled up, and despite attempts to garbage collect it's remained stubbornly filled ever since. We could just build it up to be even bigger, but I think we'd rather learn from this and try do something else that straddles the line between &quot;not being annoying overhead when your supervisor is demanding progress updates&quot; and &quot;why is our large/expensive storage system full of data no one needs or cares about any more, and how do we weed that out from the precious data we must preserve&quot;.</p>\n<p>We came up with some plans around how we can use ZFS to create datasets for people and then realise them on demand on our various compute servers, and then these can act as a unit of either garbage collection or publication as and when a project wraps up. I think this is fine in theory, but still needs a bit more detail put into the design, and I'm worried that (at least speaking for myself) it's too easy for this project to be pushed back compared to other near term demands on our time. 
To be clear, this isn't a criticism of others, just of myself as I know I'm a bit overloaded for the next month or so, and I'm trying to use my weeknotes here as a stick with which to beat myself later :) Or it's another way of saying Mark and I should have coffee soon :)</p>\n<h2>OCaml GeoTIFF progress</h2>\n<p>I stalled a bit on the OCaml GeoTIFF work, as it turns out trying to write to a TIFF file is a messy business. I alluded to some of this last week, in terms of challenges with data fragmentation within a TIFF if you are using compression and update the file, but I realised the same is true of the tagged metadata too: some tags that use more than a single unit of data don't store the value directly, but an offset to where the value is stored, and that makes it awkward to build up the metadata block on disk incrementally. The flexibility of TIFF is clearly a feature, but also does make it more challenging to write.</p>\n<p>The first stage then to getting the <a href=\"https://github.com/geocaml/ocaml-tiff/\">OCamlTIFF</a> library to be ready for writing is to change how it does reading. Firstly, it currently does exclusively on-demand loading of metadata from disk when a getter function is called, but I think we need to move to loading all the metadata into a struct so we can then conversely build that struct up for writing in a single pass. 
Then similarly I started to add data-strip caching, which is actually a useful feature anyway: currently the data is fetched from disk each time it is accessed, and so a block cache (which is configurable) will be useful for applications where you read the same data a lot (e.g., the base habitat maps in the AoH calculations I do), and it also gives me a place to store data being written to the TIFF before it is flushed out to disk.</p>\n<p>I also fell over on some of the clever typing that OCamlTIFF uses that is inherited from the typing used by the OCaml <a href=\"https://ocaml.org/manual/5.3/api/Bigarray.html\">Bigarray</a> library, and I need to sit down with Patrick at some point and make sure I understand what's going on there.</p>\n\n\n<p>As a bit of fun I also continued to tie together my little bits of map visualisation code, which still proves to be a good debugging tool, as adding the incremental loading of raster data you see in the last example in this video showed up some subtle bugs in the OCamlTIFF library's handling of loading data in chunks that don't necessarily align with the way the data is striped in the file. It was also a good excuse to refresh my memory on <a href=\"https://ocaml.org/manual/5.3/parallelism.html\">parallel programming in OCaml</a>, mixing the loading and the rendering in parallel.</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>I suspect the slowness you see with respect to the data loading here is down to my somewhat naive implementation of LZW. I do wish I had a bit of time to explore the TIFF compression space in general. 
I've occasionally done point explorations of playing with the different options of how TIFF stores data, but it'd be good to do a full matrix of the different compression options and tiling versus striped for the AoH style calculations I do.</p>\n<p>In particular, I'd also throw into this what seems to be an ad-hoc standard that <a href=\"https://gdal.org/en/stable/drivers/raster/gtiff.html#sparse-files\">GDAL has adopted</a> for sparse file support (at least, a cursory glance through the <a href=\"https://gdal.org/en/stable/drivers/raster/gtiff.html#sparse-files\">official GeoTIFF specification</a> didn't show this as being a standard feature). It seems if you just encode the data offset and length of a strip as zero then GDAL will assume this to be zero or whatever NODATA value the file has specified. Given that I spend a lot of time working with terrestrial species, it feels like switching from strips to tiling and not encoding the oceans would be a <a href=\"https://www.bbc.co.uk/news/entertainment-arts-33805593\">nice little earner</a> both performance- and storage-wise.</p>\n<h2>STAR</h2>\n<p>In June we have an IUCN redlist workshop, and one of my goals is to ensure that my STAR pipeline is runnable by Simon Tarr by then. We started on this the week before, and last week I finally updated my copy of the <a href=\"https://www.iucnredlist.org\">IUCN Red List</a> from the 2023-1 release to the 2025-1 release, and did a full run through. 
This shook out a couple of manual bits of the pipeline I'd still not added to the shell scripts I have for people to run it.</p>\n<h2>Self hosting</h2>\n<p>Back in March I accidentally made my self hosting world worse by migrating from a well-specced VPS to a Raspberry Pi - I wrote about this <a href=\"/weeknotes/2025-03-17/\">at the time</a>, but whilst I thought the Pi might be a little slower, I wasn't prepared for it to be an order of magnitude slower, even after I'd taken steps to speed things up for the new hosting setup.</p>\n<p>As an aside, in those weeknotes I talked about how moving from using <a href=\"https://opam.ocaml.org/packages/camlimages/\">Camlimages</a> in my OCaml based blog hosting software to just calling out to <a href=\"https://imagemagick.org\">ImageMagick</a> took image processing down from 29 seconds to 4 seconds an image. Since then I've switched from ImageMagick to <a href=\"http://www.graphicsmagick.org\">GraphicsMagick</a>, after seeing a Reddit post suggesting it was faster on a Raspberry Pi, and it did indeed take me down from around 4 seconds to 3.5 seconds, but it's still an order of magnitude too slow.</p>\n<p>At the time I thought I'd just live with it, but the sucky performance is making me not just sad with how my websites behave, but also just stopping me work on other improvements I had planned for them. We have a bunch of momentum around blogging culture in our group at the Computer Lab now that is exciting, and I was on the leading edge of that, but I've since fallen behind.</p>\n<p>At the weekend I started to migrate my long term backups from <a href=\"https://www.backblaze.com/cloud-storage\">Backblaze B2</a> to <a href=\"https://www.hetzner.com/storage/object-storage/\">Hetzner Cloud Storage</a>, and whilst clicking around Hetzner's VPS offerings it looks like their ARM Ampere-based servers are reasonably priced/specced, and are green energy certified! 
So I'll try kicking the tyres on one of those at some point soon, only this time I won't pay for a year in advance as I did for the Pi :)</p>\n<h2>Weeknotes meta</h2>\n<p>On the topic of weeknotes, here are some other weeknotes I've read this week talking about weeknotes.</p>\n<p>Firstly this, by <a href=\"https://www.jonmsterling.com/2025-W19/index.xml\">Jon Sterling</a>, where he muses on how our group's weeknotes culture fosters a sense of team despite us not often all being in the office together. I particularly liked this quote:</p>\n<blockquote>\n<p>Blogging is not an alternative to meeting and talking in person; but I am starting to think that it is a prerequisite for the moments of serendipity that the latter can engender, because the ongoing dialogue of blogs and weeknotes makes me sufficiently informed to have a conversation that goes beyond the superficial.</p>\n</blockquote>\n<p>Another frequent weeknoter I follow is <a href=\"http://www.mcqn.net/mcfilter/archives/links/interesting_things_on_the_internet_may_12th_2025_edition.html\">Adrian McEwan</a> and in this latest update he shared a link to <a href=\"https://www.experimental-history.com/p/28-slightly-rude-notes-on-writing\">28 slightly rude notes on writing</a> by Adam Mastroianni, in particular point 18, which is a response to an earlier point about what motivates people to write beyond &quot;my course assessor said I had to&quot;. I felt I had a good handle on that, as I try to focus on writing things where the reader will take something away they didn't know before they started (hence heavy linking, a focus on failure as being more interesting, etc.), but Adam challenges this:</p>\n<blockquote>\n<p>Usually, we try to teach motive by asking: \u201cWhy should I, the reader, care about this?\u201d</p>\n<p>This is reasonable advice, but it\u2019s also wrong. You, the writer, don\u2019t know me. You don\u2019t have a clue what I care about. The only reasons you can give me are the reasons you could give to literally anyone. 
\u201cThis issue is important because understanding it could increase pleasure and reduce pain.\u201d Uh huh, cool!</p>\n<p>What I really want to know is: why do you care? You could have spent your time knitting a pair of mittens or petting your cat or eating a whole tube of Pringles. Why did you do this instead? What kind of sicko closes the YouTube tab and types 10,000 words into a Google doc? What\u2019s wrong with you? If you show me that\u2014implicitly, explicitly, I don\u2019t care\u2014I might just close my own YouTube tab and read what you wrote.</p>\n</blockquote>\n<p>I suspect you need to go through the caring about the reader to get to the first level of writing (see having read two part II reports this week...) but I think it'll be interesting to try be more conscious/deliberate of why I think things are interesting, rather than just relying on that happening naturally when I blog.</p>\n<p>Which I didn't do this week, obviously.</p>\n<h1>This week</h1>\n<p>Next week I'll be hosting a discussion on lineage at the <a href=\"https://nordic-rse.org/nrse2025/\">Nordic RSE conference</a>. Outside of other scheduled duties this week, if you see I'm not doing that, please ask me why I'm not, as currently I just have a vague idea in my head of how it'll go, and that just won't cut it when stood in front of a large audience of RSEs and consuming an hour of their time!</p>",+"content": "<h1>Last week</h1>\n<h2>Part II project submissions</h2>\n<p>I'm supervising a couple of part II projects (for those not at Cambridge, part II is what they call the third and final undergraduate year here), and the submission deadline is at the end of this week, so last week I've been doing draft reviewing for them both.</p>\n<h2>Storage</h2>\n<p><a href=\"https://www.tunbury.org\">Mark</a>, <a href=\"https://anil.recoil.org/\">Anil</a>, and <a href=\"https://patrick.sirref.org\">Patrick</a> had a useful discussion about our plans for large scale storage of data in our group. 
When I started in the EEG a few years ago we had a 128 TB disk that seemed like it'd last forever: six months ago it filled up, and despite attempts to garbage collect it's remained stubbornly filled ever since. We could just build it up to be even bigger, but I think we'd rather learn from this and try do something else that straddles the line between &quot;not being annoying overhead when your supervisor is demanding progress updates&quot; and &quot;why is our large/expensive storage system full of data no one needs or cares about any more, and how do we weed that out from the precious data we must preserve&quot;.</p>\n<p>We came up with some plans around how we can use ZFS to create datasets for people and then realise them on demand on our various compute servers, and then these can act as a unit of either garbage collection or publication as and when a project wraps up. I think this is fine in theory, but still needs a bit more detail put into the design, and I'm worried that (at least speaking for myself) it's too easy for this project to be pushed back compared to other near term demands on our time. To be clear, this isn't a criticism of others, just of myself as I know I'm a bit overloaded for the next month or so, and I'm trying to use my weeknotes here as a stick with which to beat myself later :) Or it's another way of saying Mark and I should have coffee soon :)</p>\n<h2>OCaml GeoTIFF progress</h2>\n<p>I stalled a bit on the OCaml GeoTIFF work, as it turns out trying to write to a TIFF file is a messy business. I alluded to some of this last week, in terms of challenges with data fragmentation within a TIFF if you are using compression and update the file, but I realised the same is true of the tagged metadata too: some tags that use more than a single unit of data don't store the value directly, but an offset to where the value is stored, and that makes it awkward to build up the metadata block on disk incrementally. 
The flexibility of TIFF is clearly a feature, but also does make it more challenging to write.</p>\n<p>The first stage then to getting the <a href=\"https://github.com/geocaml/ocaml-tiff/\">OCamlTIFF</a> library to be ready for writing is to change how it does reading. Firstly, it currently does exclusively on-demand loading of metadata from disk when a getter function is called, but I think we need to move to loading all the metadata into a struct so we can then conversely build that struct up for writing in a single pass. Then similarly I started to add data-strip caching, which is actually a useful feature anyway: currently the data is fetched from disk each time it is accessed, and so a block cache (which is configurable) will be useful for applications where you read the same data a lot (e.g., the base habitat maps in the AoH calculations I do), and it also gives me a place to store data being written to the TIFF before it is flushed out to disk.</p>\n<p>I also fell over on some of the clever typing that OCamlTIFF uses that is inherited from the typing used by the OCaml <a href=\"https://ocaml.org/manual/5.3/api/Bigarray.html\">Bigarray</a> library, and I need to sit down with Patrick at some point and make sure I understand what's going on there.</p>\n\n\n<p>As a bit of fun I also continued to tie together my little bits of map visualisation code, which still proves to be a good debugging tool, as adding the incremental loading of raster data you see in the last example in this video showed up some subtle bugs in the OCamlTIFF library's handling of loading data in chunks that don't necessarily align with the way the data is striped in the file. 
It was also a good excuse to refresh my memory on <a href=\"https://ocaml.org/manual/5.3/parallelism.html\">parallel programming in OCaml</a>, mixing the loading and the rendering in parallel.</p>\n<div>\n \n \n Your browser does not support the video element.\n \n</div>\n<p>I suspect the slowness you see with respect to the data loading here is down to my somewhat naive implementation of LZW. I do wish I had a bit of time to explore the TIFF compression space in general. I've occasionally done point explorations of playing with the different options of how TIFF stores data, but it'd be good to do a full matrix of the different compression options and tiling versus striped for the AoH style calculations I do.</p>\n<p>In particular, I'd also throw into this what seems to be an ad-hoc standard that <a href=\"https://gdal.org/en/stable/drivers/raster/gtiff.html#sparse-files\">GDAL has adopted</a> for sparse file support (at least, a cursory glance through the <a href=\"https://gdal.org/en/stable/drivers/raster/gtiff.html#sparse-files\">official GeoTIFF specification</a> didn't show this as being a standard feature). It seems that if you just encode the data offset and length of a strip as zero, then GDAL will assume the data to be zero or whatever NODATA value the file has specified. Given that I spend a lot of time working with terrestrial species, it feels like switching from strips to tiling and not encoding the oceans would be a <a href=\"https://www.bbc.co.uk/news/entertainment-arts-33805593\">nice little earner</a> both performance- and storage-wise.</p>\n<h2>STAR</h2>\n<p>In June we have an IUCN redlist workshop, and one of my goals is to ensure that my STAR pipeline is runnable by Simon Tarr by then. We started on this the week before, and last week I finally updated my copy of the <a href=\"https://www.iucnredlist.org\">IUCN Red List</a> from the 2023-1 release to the 2025-1 release, and did a full run through. 
This shook out a couple of manual bits of the pipeline I'd still not added to the shell scripts I have for people to run it.</p>\n<h2>Self hosting</h2>\n<p>Back in March I accidentally made my self hosting world worse by migrating from a well-specced VPS to a Raspberry Pi - I wrote about this <a href=\"/weeknotes/2025-03-17/\">at the time</a>, but whilst I thought the Pi might be a little slower, I wasn't prepared for it to be an order of magnitude slower, even after I'd taken steps to speed things up for the new hosting setup.</p>\n<p>As an aside, in those weeknotes I talked about how moving from using <a href=\"https://opam.ocaml.org/packages/camlimages/\">Camlimages</a> in my OCaml based blog hosting software to just calling out to <a href=\"https://imagemagick.org\">ImageMagick</a> took image processing down from 29 seconds to 4 seconds an image. Since then I've switched from ImageMagick to <a href=\"http://www.graphicsmagick.org\">GraphicsMagick</a>, after seeing a reddit post suggesting it was faster on a Raspberry Pi, and it did indeed take me down from around 4 seconds to 3.5 seconds, but it's still an order of magnitude too slow.</p>\n<p>At the time I thought I'd just live with it, but the sucky performance is making me not just sad with how my websites behave, but also stopping me working on other improvements I had planned for them. We have a bunch of momentum around blogging culture in our group at the Computer Lab now that is exciting, and I was on the leading edge of that, but I've since fallen behind.</p>\n<p>At the weekend I started to migrate my long term backups from <a href=\"https://www.backblaze.com/cloud-storage\">Backblaze B2</a> to <a href=\"https://www.hetzner.com/storage/object-storage/\">Hetzner Cloud Storage</a>, and whilst clicking around Hetzner's VPS offerings it looks like their ARM Ampere based servers are reasonably priced/specced, and are green energy certified! 
So I'll try kicking the tyres on one of those at some point soon, only this time I won't pay for a year in advance as I did for the Pi :)</p>\n<h2>Weeknotes meta</h2>\n<p>On the topic of weeknotes, here are some other weeknotes I've read this week that talk about weeknotes.</p>\n<p>Firstly this, by <a href=\"https://www.jonmsterling.com/2025-W19/index.xml\">Jon Sterling</a>, where he muses on how our group's weeknotes culture fosters a sense of team despite us not often all being in the office together. I particularly liked this quote:</p>\n<blockquote>\n<p>Blogging is not an alternative to meeting and talking in person; but I am starting to think that it is a prerequisite for the moments of serendipity that the latter can engender, because the ongoing dialogue of blogs and weeknotes makes me sufficiently informed to have a conversation that goes beyond the superficial.</p>\n</blockquote>\n<p>Another frequent weeknoter I follow is <a href=\"http://www.mcqn.net/mcfilter/archives/links/interesting_things_on_the_internet_may_12th_2025_edition.html\">Adrian McEwan</a> and in this latest update he shared a link to <a href=\"https://www.experimental-history.com/p/28-slightly-rude-notes-on-writing\">28 slightly rude notes on writing</a> by Adam Mastroianni, in particular point 18, which is a response to an earlier point about what motivates people to write beyond "my course assessor said I had to". I felt I had a good handle on that, as I try to focus on writing things where the reader will take something away they didn't know before they started (hence heavy linking, a focus on failure as being more interesting, etc.), but Adam challenges this:</p>\n<blockquote>\n<p>Usually, we try to teach motive by asking: \u201cWhy should I, the reader, care about this?\u201d</p>\n<p>This is reasonable advice, but it\u2019s also wrong. You, the writer, don\u2019t know me. You don\u2019t have a clue what I care about. The only reasons you can give me are the reasons you could give to literally anyone. 
\u201cThis issue is important because understanding it could increase pleasure and reduce pain.\u201d Uh huh, cool!</p>\n<p>What I really want to know is: why do you care? You could have spent your time knitting a pair of mittens or petting your cat or eating a whole tube of Pringles. Why did you do this instead? What kind of sicko closes the YouTube tab and types 10,000 words into a Google doc? What\u2019s wrong with you? If you show me that\u2014implicitly, explicitly, I don\u2019t care\u2014I might just close my own YouTube tab and read what you wrote.</p>\n</blockquote>\n<p>I suspect you need to go through the caring about the reader to get to the first level of writing (see having read two part II reports this week...) but I think it'll be interesting to try to be more conscious/deliberate about why I think things are interesting, rather than just relying on that happening naturally when I blog.</p>\n<p>Which I didn't do this week, obviously.</p>\n<h1>This week</h1>\n<p>Next week I'll be hosting a discussion on lineage at the <a href=\"https://nordic-rse.org/nrse2025/\">Nordic RSE conference</a>. Outside of other scheduled duties this week I should be preparing for that, and if you see that I'm not, please ask me why, as currently I just have a vague idea in my head of how it'll go, and that just won't cut it when stood in front of a large audience of RSEs and consuming an hour of their time!</p>",
+14
mwd/weeknotes_2025-05-19b_.json
···+"summary": "<p></p><div>\n<div>\n\n\n<img alt=\"A photo of an island in a sea taken from the window of an airplane. The island has a swirly shape, which is emphasised by the sand banks around it.\" src=\"IMG_8920.JPG\">\n\n</div>\n</div>\n\n<p></p>\n<p>A photo of the Danish island of <a href=\"https://en.wikipedia.org/wiki/L%C3%A6s%C3%B8\">L\u00e6s\u00f8</a> taken from the plane as we descended towards Gothenburg airport.</p>\n<h1>Last Week</h1>\n<h2>Nordic-RSE</h2>\n<p>I did my prep for the <a href=\"https://nordic-rse.org/nrse2025/\">Nordic-RSE conference</a>, or at least as much as I feel I can, where I'll be hosting a discussion session towards the end of the conference on lineage in data-science pipelines. I've limited experience hosting discussion panels before, and so I've been reading through <a href=\"https://www.oreilly.com/library/view/gamestorming/9781449391195/\">Gamestorming by Gray et al</a>, a book we happened to have at home. It's one of those books where a lot of what it says (at least in the opening chapters) is perhaps somewhat obvious (e.g., a session should have an opening, an exploration, and a closing), but it's really useful to have this spelled out and formalised a little, and will hopefully lead to my managing of the session being a bit more deliberate and focussed.</p>\n<p>My ultimate aim for this is to try and tease out what tools and techniques other RSEs have been using to help preserve lineage and ensure repeatability and reproducibility of the projects they work on. At the end of this we'll hopefully have a bunch of suggestions which I'll then write up and host in a git repo somewhere so that other participants can fill in bits I missed or add more details. If at the end of the process we have a page with a set of things that the community can refer to in the future to make it easier to ensure lineage is preserved, then this will be a success. 
If we don't achieve that, modulo my bad facilitating, then this will also show that there is a gap here that could be used to direct our research, so that's also a success.</p>\n<h2>STAR</h2>\n<p>I squished the last known issue with my <a href=\"https://github.com/quantifyearth/star/\">STAR implementation</a>, which was another <a href=\"https://gdal.org/\">GDAL</a> oddity throwing an error that got lost in the sheer volume of species we process. By default GDAL will not load GeoJSON polygons that are over 200MB in size. The fix is simple: you remove the limit by setting the <code>OGR_GEOJSON_MAX_OBJ_SIZE</code> environment variable to <code>0</code>. Note that the files themselves aren't over 200MB in size, so I assume this refers to the in-memory representation for GDAL.</p>\n<p>I'd applied this earlier in the pipeline to get the AoHs to work, but the problem was I'd not set it for one of the later stages. If I actually used the dockerised version of the pipeline this wouldn't be an issue, as I'd have set it in the environment once and could forget it, but because I tend to run in my native environment it has to be set in every script it might impact. I should probably just punt this into <a href=\"https://github.com/quantifyearth/yirgacheffe/\">yirgacheffe</a>, as working with the IUCN range data you regularly end up with polygons that exceed this for species that have coastal ranges (see my <a href=\"/weeknotes/2025-04-14/\">recent rant</a> on this).</p>\n\n\n<p>That out of the way, I sat down with Simon Tarr from the IUCN and we ran through getting the docker version of my STAR pipeline to run on his laptop. I normally don't use the docker version myself, and so there were inevitably a few teething issues. 
Still, we got Simon to a point where he was generating AoH maps, and more importantly the process helped him understand how the pipeline works internally a little.</p>\n<p>I spent a little more time afterwards and got other parts of the pipeline also working in docker, like the model validation checks on the AoH maps. This required me to install <a href=\"https://www.r-project.org\">R</a> in the container, as I had forgotten that the <a href=\"https://eshinjolly.com/pymer4/\">Python stats package</a> I ended up using when porting over Chess's R code actually just calls R under the hood \ud83e\udd26 Still, good progress.</p>\n<p>I do think I need to add some better CI around the docker images, and we should push both the LIFE and STAR docker images to a registry for people.</p>\n<h2>OCaml TIFF</h2>\n<p>Still tip-toeing my way around getting to the writing of files from <a href=\"https://github.com/geocaml/ocaml-tiff/\">OCaml-TIFF</a>, and instead this week I moved specifying the type of the data to be read from the file to the time you open it, and then worked on a couple of GDAL-specific GeoTIFF extensions that are quite useful. In fact, I have to confess that I'd naively assumed these GDALisms were part of <a href=\"https://www.ogc.org/standards/geotiff/\">the standard for GeoTIFF</a> as I've come across them in datasets from others and popular tools like <a href=\"https://qgis.org\">QGIS</a> seem to honour them (though I suspect that's because QGIS is using GDAL under the hood).</p>\n<p>The extensions are: setting a NODATA value, and sparse TIFFs. NODATA lets you nominate a value in the TIFF file that should act as a sort of mask - if I set NODATA to 42, all pixels with value 42 in the image are ignored, and in tools like QGIS won't be displayed. I see this used a lot, say to mask out the ocean or areas outside a given spatial range of interest. 
All fine, until you look at how it's been added to GeoTIFF by GDAL: the value is not stored encoded as the same type as the data in the file (e.g., an int value for int files, or float for float files), but rather as an ASCII string of the value. This means there is a bunch of inference that has to happen, for example if you have a uint8 TIFF, here are some variations on the data you get for different NODATA values from GDAL:</p>\n<div>\n<table>\n<thead><tr><th>NODATA</th><th>synthesised data</th></tr></thead>\n<tbody>\n<tr><td>3.175</td><td>3</td></tr>\n<tr><td>3.9</td><td>4</td></tr>\n<tr><td>-32</td><td>0</td></tr>\n<tr><td>321</td><td>255</td></tr>\n<tr><td>nan</td><td>0</td></tr>\n</tbody>\n</table>\n</div><p>For our library, which is meant to be promoting type safety, I think we'll not try to mimic however GDAL does the conversion, and just throw an error if, for example, you have an unsigned integer layer and you provide a negative NODATA.</p>\n<p>The other extension is sparse TIFF files. TIFF files store the image data in strips or tiles, and in the header of the file have a table indicating the offset and length of those blocks within the file itself. GDAL has the nice extension that if you set the offset and length of a block to 0, then it'll synthesise the data for that block rather than reading it from the file. So if you have a block that's all a default value, or your NODATA value, you don't need to put that in the file. In particular, if you're using tiles and have areas of ocean and all you care about is land, this seems a neat saving. 
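A reader honouring this convention could look something like the following Python sketch (a hypothetical helper for illustration, not how OCamlTIFF or GDAL actually structure it):

```python
def read_strip(read_at, offset, length, strip_bytes, nodata_byte=None):
    """Fetch one strip of image data, honouring the GDAL sparse-file
    convention: if both the recorded offset and length are zero, the
    strip isn't in the file at all and is synthesised as all NODATA
    (or all zero if no NODATA is set). `read_at(offset, length)` is a
    stand-in for the real file access."""
    if offset == 0 and length == 0:
        fill = 0 if nodata_byte is None else nodata_byte
        return bytes([fill]) * strip_bytes
    return read_at(offset, length)
```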
The synthesised block is initialised with either zero, or if one is specified, the NODATA value.</p>\n<p>I've not yet knowingly run into sparse TIFFs in the wild (unlike NODATA values, which I see frequently), but I definitely intend to use them now I know of them.</p>\n<p>Also, excitingly, <a href=\"https://patrick.sirref.org\">Patrick</a> took my LZW implementation and <a href=\"https://github.com/geocaml/ocaml-tiff/pull/36\">speeded it up</a> by replacing my list-based implementation with Strings and some magic calls to internal functions to reduce the allocations made.</p>\n<h2>Edge effects</h2>\n<p>I've been reading some papers on "edge effects" - that is, how does a species interact with the edge of its habitat range. For example, if a species likes forest, it won't necessarily live in every part of the forest, keeping away from the edges where it transitions to other habitats it doesn't like. I've been asked to implement edge-effects for my <a href=\"https://github.com/quantifyearth/aoh-calculator/\">AoH code</a>, and I have a general idea of how I'd implement this using something like a standard image processing convolution process, but I wanted to know how others have implemented this, to see if I was missing anything about the problem. To this end I'm currently reading my way through the original paper <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew</a> worked on <a href=\"https://www.researchgate.net/publication/292670611_To_what_extent_could_edge_effects_and_habitat_fragmentation_diminish_the_potential_benefits_of_land_sparing\">To What Extent Could Edge Effects And Habitat Fragmentation Diminish The Potential Benefits Of Land Sparing? 
by Lamb et al</a> and a more recent look at the topic that attempts to add more nuance to how the edge effects are implemented (which then is going to be computationally more complicated I suspect) <a href=\"https://link.springer.com/article/10.1007/s10980-024-01865-5\">A Mechanistic Approach To Weighting Edge\u2011effects In Landscape Connectivity Assessments by Dennis et al</a>.</p>\n<h1>This Week</h1>\n<p>This week I'll be working from Gothenburg, with the Nordic-RSE conference taking up Tuesday and Wednesday. Monday I'll be doing more prep for my discussion session on Wednesday on lineage in scientific systems, Friday I'll be travelling back, and Thursday I'll hopefully do a little exploring and practicing my Swedish.</p>",+"content": "<p></p><div>\n<div>\n\n\n<img alt=\"A photo of an island in a sea taken from the window of an airplane. The island has a swirly shape, which is emphasised by the sand banks around it.\" src=\"IMG_8920.JPG\">\n\n</div>\n</div>\n\n<p></p>\n<p>A photo of the Danish island of <a href=\"https://en.wikipedia.org/wiki/L%C3%A6s%C3%B8\">L\u00e6s\u00f8</a> taken from the plane as we descended towards Gothenburg airport.</p>\n<h1>Last Week</h1>\n<h2>Nordic-RSE</h2>\n<p>I did my prep for the <a href=\"https://nordic-rse.org/nrse2025/\">Nordic-RSE conference</a>, or at least as much as I feel I can, where I'll be hosting a discussion session towards the end of the conference on lineage in data-science pipelines. I've limited experience hosting discussion panels before, and so I've been reading through <a href=\"https://www.oreilly.com/library/view/gamestorming/9781449391195/\">Gamestorming by Gray et al</a>, a book we happened to have at home. 
It's one of those books where a lot of what it says (at least in the opening chapters) is perhaps somewhat obvious (e.g., a session should have an opening, an exploration, and a closing), but it's really useful to have this spelled out and formalised a little, and will hopefully lead to my managing of the session being a bit more deliberate and focussed.</p>\n<p>My ultimate aim for this is to try and tease out what tools and techniques other RSEs have been using to help preserve lineage and ensure repeatability and reproducibility of the projects they work on. At the end of this we'll hopefully have a bunch of suggestions which I'll then write up and host in a git repo somewhere so that other participants can fill in bits I missed or add more details. If at the end of the process we have a page with a set of things that the community can refer to in the future to make it easier to ensure lineage is preserved, then this will be a success. If we don't achieve that, modulo my bad facilitating, then this will also show that there is a gap here that could be used to direct our research, so that's also a success.</p>\n<h2>STAR</h2>\n<p>I squished the last known issue with my <a href=\"https://github.com/quantifyearth/star/\">STAR implementation</a>, which was another <a href=\"https://gdal.org/\">GDAL</a> oddity throwing an error that got lost in the sheer volume of species we process. By default GDAL will not load GeoJSON polygons that are over 200MB in size. The fix is simple: you remove the limit by setting the <code>OGR_GEOJSON_MAX_OBJ_SIZE</code> environment variable to <code>0</code>. Note that the files themselves aren't over 200MB in size, so I assume this refers to the in-memory representation for GDAL.</p>\n<p>I'd applied this earlier in the pipeline to get the AoHs to work, but the problem was I'd not set it for one of the later stages. 
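For reference, applying the fix in a Python entry-point script is just a matter of seeding the process environment before GDAL opens anything; the placement is what matters:

```python
import os

# GDAL picks up OGR_GEOJSON_MAX_OBJ_SIZE from the environment when it
# first opens a GeoJSON; "0" disables the ~200MB object-size limit.
# This needs to run before any GDAL/OGR read is triggered, hence
# putting it at the very top of each entry-point script when running
# outside the dockerised pipeline.
os.environ["OGR_GEOJSON_MAX_OBJ_SIZE"] = "0"
```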
If I actually used the dockerised version of the pipeline this wouldn't be an issue, as I'd have set it in the environment once and could forget it, but because I tend to run in my native environment it has to be set in every script it might impact. I should probably just punt this into <a href=\"https://github.com/quantifyearth/yirgacheffe/\">yirgacheffe</a>, as working with the IUCN range data you regularly end up with polygons that exceed this for species that have coastal ranges (see my <a href=\"/weeknotes/2025-04-14/\">recent rant</a> on this).</p>\n\n\n<p>That out of the way, I sat down with Simon Tarr from the IUCN and we ran through getting the docker version of my STAR pipeline to run on his laptop. I normally don't use the docker version myself, and so there were inevitably a few teething issues. Still, we got Simon to a point where he was generating AoH maps, and more importantly the process helped him understand how the pipeline works internally a little.</p>\n<p>I spent a little more time afterwards and got other parts of the pipeline also working in docker, like the model validation checks on the AoH maps. 
This required me to install <a href=\"https://www.r-project.org\">R</a> in the container, as I had forgotten that the <a href=\"https://eshinjolly.com/pymer4/\">Python stats package</a> I ended up using when porting over Chess's R code actually just calls R under the hood \ud83e\udd26 Still, good progress.</p>\n<p>I do think I need to add some better CI around the docker images, and we should push both the LIFE and STAR docker images to a registry for people.</p>\n<h2>OCaml TIFF</h2>\n<p>Still tip-toeing my way around getting to the writing of files from <a href=\"https://github.com/geocaml/ocaml-tiff/\">OCaml-TIFF</a>, and instead this week I moved specifying the type of the data to be read from the file to the time you open it, and then worked on a couple of GDAL-specific GeoTIFF extensions that are quite useful. In fact, I have to confess that I'd naively assumed these GDALisms were part of <a href=\"https://www.ogc.org/standards/geotiff/\">the standard for GeoTIFF</a> as I've come across them in datasets from others and popular tools like <a href=\"https://qgis.org\">QGIS</a> seem to honour them (though I suspect that's because QGIS is using GDAL under the hood).</p>\n<p>The extensions are: setting a NODATA value, and sparse TIFFs. NODATA lets you nominate a value in the TIFF file that should act as a sort of mask - if I set NODATA to 42, all pixels with value 42 in the image are ignored, and in tools like QGIS won't be displayed. I see this used a lot, say to mask out the ocean or areas outside a given spatial range of interest. All fine, until you look at how it's been added to GeoTIFF by GDAL: the value is not stored encoded as the same type as the data in the file (e.g., an int value for int files, or float for float files), but rather as an ASCII string of the value. 
This means there is a bunch of inference that has to happen, for example if you have a uint8 TIFF, here are some variations on the data you get for different NODATA values from GDAL:</p>\n<div>\n<table>\n<thead><tr><th>NODATA</th><th>synthesised data</th></tr></thead>\n<tbody>\n<tr><td>3.175</td><td>3</td></tr>\n<tr><td>3.9</td><td>4</td></tr>\n<tr><td>-32</td><td>0</td></tr>\n<tr><td>321</td><td>255</td></tr>\n<tr><td>nan</td><td>0</td></tr>\n</tbody>\n</table>\n</div><p>For our library, which is meant to be promoting type safety, I think we'll not try to mimic however GDAL does the conversion, and just throw an error if, for example, you have an unsigned integer layer and you provide a negative NODATA.</p>\n<p>The other extension is sparse TIFF files. TIFF files store the image data in strips or tiles, and in the header of the file have a table indicating the offset and length of those blocks within the file itself. GDAL has the nice extension that if you set the offset and length of a block to 0, then it'll synthesise the data for that block rather than reading it from the file. So if you have a block that's all a default value, or your NODATA value, you don't need to put that in the file. In particular, if you're using tiles and have areas of ocean and all you care about is land, this seems a neat saving. The synthesised block is initialised with either zero, or if one is specified, the NODATA value.</p>\n<p>I've not yet knowingly run into sparse TIFFs in the wild (unlike NODATA values, which I see frequently), but I definitely intend to use them now I know of them.</p>\n<p>Also, excitingly, <a href=\"https://patrick.sirref.org\">Patrick</a> took my LZW implementation and <a href=\"https://github.com/geocaml/ocaml-tiff/pull/36\">speeded it up</a> by replacing my list-based implementation with Strings and some magic calls to internal functions to reduce the allocations made.</p>\n<h2>Edge effects</h2>\n<p>I've been reading some papers on "edge effects" - that is, how does a species interact with the edge of its habitat range. 
For example, if a species likes forest, it won't necessarily live in every part of the forest, keeping away from the edges where it transitions to other habitats it doesn't like. I've been asked to implement edge-effects for my <a href=\"https://github.com/quantifyearth/aoh-calculator/\">AoH code</a>, and I have a general idea of how I'd implement this using something like a standard image processing convolution process, but I wanted to know how others have implemented this, to see if I was missing anything about the problem. To this end I'm currently reading my way through the original paper <a href=\"https://www.zoo.cam.ac.uk/directory/andrew-balmford\">Andrew</a> worked on <a href=\"https://www.researchgate.net/publication/292670611_To_what_extent_could_edge_effects_and_habitat_fragmentation_diminish_the_potential_benefits_of_land_sparing\">To What Extent Could Edge Effects And Habitat Fragmentation Diminish The Potential Benefits Of Land Sparing? by Lamb et al</a> and a more recent look at the topic that attempts to add more nuance to how the edge effects are implemented (which then is going to be computationally more complicated I suspect) <a href=\"https://link.springer.com/article/10.1007/s10980-024-01865-5\">A Mechanistic Approach To Weighting Edge\u2011effects In Landscape Connectivity Assessments by Dennis et al</a>.</p>\n<h1>This Week</h1>\n<p>This week I'll be working from Gothenburg, with the Nordic-RSE conference taking up Tuesday and Wednesday. Monday I'll be doing more prep for my discussion session on Wednesday on lineage in scientific systems, Friday I'll be travelling back, and Thursday I'll hopefully do a little exploring and practicing my Swedish.</p>",
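The convolution-style approach mentioned in the edge effects section can be sketched in a few lines of Python. This is just my reading of the general idea (erode habitat cells within some depth of an edge), not the Lamb et al or Dennis et al method, and the helper name is made up:

```python
def core_habitat(grid, depth=1):
    """Erode a binary habitat raster: a cell only counts as core
    habitat if every cell within `depth` of it (Chebyshev distance,
    i.e. a square kernel) is also habitat, so habitat within `depth`
    of a habitat edge is discounted. Cells beyond the raster border
    are treated as habitat, which is itself an assumption to question."""
    h, w = len(grid), len(grid[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if not grid[r][c]:
                continue
            ok = True
            for dr in range(-depth, depth + 1):
                for dc in range(-depth, depth + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and not grid[rr][cc]:
                        ok = False
            out[r][c] = 1 if ok else 0
    return out
```

The more nuanced approaches in the literature weight cells by distance to the edge rather than applying a hard cut-off, but the neighbourhood scan is the same shape of computation.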
+14
mwd/weeknotes_2025-05-26_.json
···+"summary": "<h1>Last Week</h1>\n<h2>Nordic-RSE 2025</h2>\n<p></p><div>\n<div>\n\n\n<img alt=\"A photo of a name-badge saying &quot;Nordic RSE Conference&quot; and &quot;Michael Dales&quot;\" src=\"DSCF5744.jpg\">\n\n</div>\n</div>\n\n<p></p>\n<p>I have a lot of notes from <a href=\"https://nordic-rse.org/nrse2025/\">the Nordic-RSE conference</a> that I hope to turn into a blog post, so I won't say much here other than it was a great event: I met interesting folk that work in other science disciplines as RSEs, I saw a bunch of interesting talks and learned a lot, and once again got to appreciate Gothenburg. Very much worth going for me, and I hopefully can join it again next year when it'll be in Troms\u00f8.</p>\n<h2>LIFE</h2>\n<p>Before heading out to the conference Ali came to me with a bit of weirdness she was seeing with one of the tools in the <a href=\"https://github.com/quantifyearth/life\">LIFE pipeline</a>. The LIFE pipeline does all its work in the WGS84 map projection, where the pixel grid aligns with the latitude/longitude grid on the globe. This is a popular projection, I assume on account of it being easy to reason about, but it is also quite a distorted one, given it is the same number of pixels wide at the equator as it is at the poles. This means pixels at the equator cover a much larger area of land than they do at the poles; other map projections like <a href=\"https://en.wikipedia.org/wiki/Mollweide_projection\">Mollweide</a> attempt to keep a roughly equal area per pixel, but at the cost of being less easy to work with in other ways.</p>\n<p>Because of this WGS84 area-per-pixel distortion, you can't compare pixels in a map directly without taking into account their area. So when we do the <a href=\"\">Area of habitat</a> calculations in LIFE we multiply the contents of each pixel by another raster that contains in it just the approximate area of each pixel. 
I say approximate because to avoid excessive work given the resolution we work at, we make the simplifying assumption that every pixel at the same latitude has the same area, and there are nice formulas for calculating that, which <a href=\"https://github.com/quantifyearth/LIFE/blob/main/prepare_layers/make_area_map.py\">I codified into a simple script</a>.</p>\n<p>So far, so good.</p>\n<p>However, if you look at the script, you'll see it makes an optimisation based on that simplifying assumption: the map it generates is only one pixel wide. There are two reasons for this. Firstly, LIFE works at a resolution of 100m per side per pixel at the equator, which means a global map is 150GB per byte per pixel, and storing the area as a float32 as we do, that would be 600 GB uncompressed. Then if you look at the most commonly used compression in a TIFF file, <a href=\"https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch\">LZW compression</a>, that requires that to read a pixel in a row in the image, you have to decompress all the preceding pixels in that row: okay if you're calculating values for Alaska (being on the left of the map, so values are early in the row), but not so much for New Zealand (on the right of the map, so you need to decompress everything to the left first). Very early on in my time in the LIFE project I spotted that this was causing a lot of slowdown, so I added a <a href=\"https://github.com/quantifyearth/yirgacheffe/blob/main/yirgacheffe/layers/area.py#L10-L18\">special mode</a> to my geospatial library <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a> that lets you provide it a one pixel wide image, and it'll just extrapolate that out to fill rows with that same value. 
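The "nice formula" in question, at least on a spherical-Earth approximation (the actual script may differ in detail), gives every pixel in a latitude row the area of the band slice between its top and bottom edges:

```python
import math

R = 6_371_000.0  # mean Earth radius in metres (spherical approximation)

def pixel_area_m2(top_lat_deg, pixel_deg):
    """Approximate area of one WGS84 pixel whose top edge sits at
    `top_lat_deg`, using A = R^2 * dlambda * (sin(phi1) - sin(phi2))
    on a sphere. Every pixel in the same row gets the same value,
    which is the simplifying assumption that lets the area raster be
    a single pixel wide."""
    phi1 = math.radians(top_lat_deg)
    phi2 = math.radians(top_lat_deg - pixel_deg)
    return R * R * math.radians(pixel_deg) * (math.sin(phi1) - math.sin(phi2))
```

Summing one full column of latitude bands and multiplying by the number of columns recovers the surface area of the sphere, which makes for a handy sanity check on any implementation.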
A simple trick, and one which gave the pipeline a significant performance boost.</p>\n<p>So far, still so good.</p>\n<p>The problem came when Ali tried to use that same script to generate an area layer for some analysis she was doing at 10m per side per pixel at the equator, and then rather than generate an area-per-pixel raster that was 1 pixel wide, it made one that was 2 pixels wide. Yirgacheffe does some sanity checks when you use the special mode for these maps, and checks that you have passed it something that is a single pixel wide, and so was rejecting the new map. I dug into this, and once again it was floating point weirdness that was biting me.</p>\n<p>When you generate a GeoTIFF you need to specify the spatial location of the pixels. So when I generate my 1 pixel wide image, I was giving the location of -180\u02da longitude, which sort of made sense to me, as in Yirgacheffe I then expand all the pixels out to the right. This meant that in the internal logic I was generating a map that goes from -180\u02da to (-180\u02da + size of pixel), and despite Ali using a pixel size value of 0.0001, which seems to a human like a nice round number, when pushed back and forth through floating point, it turns out that -180.0 + 0.0001 rounds ever so slightly larger than -179.9999, and Yirgacheffe when creating GeoTIFFs based on area will always round up so as to not lose data, and thus we end up at 2 pixels. 
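The same failure mode is easy to demonstrate with a different trio of values whose floating point representations overshoot (a toy illustration, not Yirgacheffe's actual code):

```python
import math

def width_in_pixels(left, right, scale):
    """Naive width computation that always rounds up so as not to
    lose data -- which turns a quotient a hair over 1.0 into 2 pixels."""
    return math.ceil((right - left) / scale)

def width_in_pixels_robust(left, right, scale, eps=1e-9):
    """Snap to the nearest integer when the quotient is within a
    small tolerance of it, and only round up otherwise."""
    q = (right - left) / scale
    return round(q) if abs(q - round(q)) < eps else math.ceil(q)

# 0.1 + 0.2 overshoots 0.3 in binary floating point, so the naive
# version reports a width of 2 for what is logically one pixel.
```

Snapping to the nearest integer within a tolerance before rounding up avoids the spurious extra pixel, at the cost of choosing an epsilon.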
To make things more icky, if you specify a pixel scale of 0.000100000000001 it all works, as the floating point approximations play nice.</p>\n<p>Fixing this properly is awkward, and I think I need to track in Yirgacheffe whether you really wanted to make a raster that was from -180.0 to -179.9999 and a bit, or you just wanted to make something 1 pixel wide, and I didn't really have time for that plumbing, so I <a href=\"https://github.com/quantifyearth/yirgacheffe/issues/26\">filed a bug against myself</a>, and just moved to offsetting the area-per-pixel raster at 0\u02da longitude, as the math works there, and internally the special mode in Yirgacheffe for expanding these area maps never checks the longitude, just the latitude. Not proud, but it got Ali unblocked before I vanished for a week.</p>\n<h2>Self-hosting fails</h2>\n<p>I got bit once more by my choice of self-hosting platform being a Raspberry-pi, when I tried to share my slides from Nordic-RSE, which take the form of a 17MB PDF. Turns out that my quick and admittedly bad approach of handling static files by \"read all the data to memory then pass it to the web framework\" broke, as <a href=\"https://ocaml.org\">OCaml</a>'s default max size for a string on a 32-bit system (which <a href=\"https://www.raspberrypi.com/software/\">Raspbian Linux</a> is, despite the CPU being 64 bit) <a href=\"https://stackoverflow.com/questions/34973522/ocaml-string-length-limitation-when-reading-from-stdin-file\">is 16 MB</a>. I could just change an environment variable for this, but I really shouldn't be loading files like this anyway, so I thought (sat in Gothenburg airport where they have eduroam), I'd try to do this properly and serve the data in a more stream-like way.</p>\n<p>Looking at the Dream source code (<a href=\"https://aantron.github.io/dream/\">Dream</a> is the OCaml web framework I use), it uses <a href=\"https://github.com/ocsigen/lwt\">LWT</a> under the hood for this. 
LWT is one of the many ways of doing concurrent work in the OCaml ecosystem, and I've been trying to avoid learning it, because if I tried to learn every competing concurrency framework for OCaml I'd be late for dinner, and our group is in team <a href=\"https://github.com/ocaml-multicore/eio\">EIO</a>, so I was going to invest time into that at some point. Anyway, the code was small enough that I could just borrow the LWT bit of Dream's static loader (which I'm not using directly because it doesn't set <code>last-modified</code> headers).</p>\n<p>This works, but I still can't share my slides, as now I get an error of <code>Invalid_argument(\"Bytes.create\")</code> from somewhere within Dream when I use a large file - I assume I'm hitting a similar limit to that for strings - which implies the Dream/LWT implementation I based my updated static file handler on isn't as clever as I hoped.</p>\n<p>And there the matter rests for now. My slides weren't <em>that</em> interesting (as I led a discussion session).</p>\n<h1>This Week</h1>\n<h2>Open Hardware Summit</h2>\n<p>This will be a somewhat short week again, as I need to head up to Edinburgh for <a href=\"https://2025.oshwa.org\">Open Hardware Summit 2025</a>, an event I last went to in Denver in 2017. Most of it isn't directly EEG-related, though there is a <a href=\"https://2025.oshwa.org/panels/030-open-source-environmental/\">panel on environmental monitoring</a> which might relate to the <a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder</a> work <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh</a> is working on, so I'll try to get to that.</p>\n<h2>Edge effects</h2>\n<p>I need to take a stab at implementing edge effects on habitat rasters. 
I've been reading papers to see what others do, and I still have some unanswered questions about the nuance of this, but I think it's probably at the stage where I should make a thing just so I can get a sense of how good/bad that is, and then refine from there.</p>\n<h2>Write up Nordic-RSE</h2>\n<p>I need to write up both my discussion session and a general overview. So many notes, so many good ideas and learnings.</p>\n<h2>GBIF processing</h2>\n<p>If there's any time left I still need to get into processing occurrence data for species based on GBIF data. This might be a good chance to see if the <a href=\"https://duckdb.org/2025/05/21/announcing-duckdb-130.html#spatial-join-operator\">performance increases duckdb announced for spatial joins</a> are meaningful for the sort of thing I do.</p>",+"content": "<h1>Last Week</h1>\n<h2>Nordic-RSE 2025</h2>\n<p></p><div>\n<div>\n\n\n<img alt=\"A photo of a name-badge saying &quot;Nordic RSE Conference&quot; and &quot;Michael Dales&quot;\" src=\"DSCF5744.jpg\">\n\n</div>\n</div>\n\n<p></p>\n<p>I have a lot of notes from <a href=\"https://nordic-rse.org/nrse2025/\">the Nordic-RSE conference</a> that I hope to turn into a blog post, so I won't say much here other than it was a great event: I met interesting folk who work in other science disciplines as RSEs, I saw a bunch of interesting talks and learned a lot, and once again got to appreciate Gothenburg. Very much worth going for me, and hopefully I can join it again next year when it'll be in Troms\u00f8.</p>\n<h2>LIFE</h2>\n<p>Before heading out to the conference, Ali came to me with a bit of weirdness she was seeing with one of the tools in the <a href=\"https://github.com/quantifyearth/life\">LIFE pipeline</a>. The LIFE pipeline does all its work in the WGS84 map projection, where the pixel grid aligns with the latitude/longitude grid on the globe. 
This is a popular projection, I assume on account of it being easy to reason about, but it is also quite distorted, given a map is the same number of pixels wide at the equator as it is at the poles. This means pixels at the equator cover a much larger area of land than they do at the poles; other projections like <a href=\"https://en.wikipedia.org/wiki/Mollweide_projection\">Mollweide</a> attempt to keep a roughly equal area per pixel, but at the cost of being less easy to work with in other ways.</p>\n<p>Because of this WGS84 area-per-pixel distortion, you can't compare pixels in a map directly without taking into account their area. So when we do the <a href=\"\">Area of habitat</a> calculations in LIFE we multiply the contents of each pixel by another raster that contains just the approximate area of each pixel. I say approximate because, to avoid excessive work given the resolution we work at, we make the simplifying assumption that every pixel at the same latitude has the same area, and there are nice formulas for calculating that, which <a href=\"https://github.com/quantifyearth/LIFE/blob/main/prepare_layers/make_area_map.py\">I codified into a simple script</a>.</p>\n<p>So far, so good.</p>\n<p>However, if you look at the script, you'll see it makes an optimisation based on that simplifying assumption: the map it generates is only one pixel wide. There are two reasons for this. Firstly, LIFE works at a resolution of 100m per side per pixel at the equator, which means a global map is 150GB per byte per pixel, and storing the area as a float32 as we do, that would be 600 GB uncompressed. 
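As a sketch of the kind of "nice formula" involved (my guess at the approach, assuming a spherical Earth; the real script may well use an ellipsoid): on a sphere the area of a latitude/longitude cell is R² · Δλ · (sin φ_top − sin φ_bottom), which depends only on latitude, so a one-pixel-wide column plus broadcasting covers the whole map.

```python
import math
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # mean radius: a spherical approximation

def pixel_area_m2(lat_top_deg: float, pixel_deg: float) -> float:
    """Area of a WGS84-grid pixel whose top edge sits at lat_top_deg.
    On a sphere this is R^2 * dlon * (sin(top) - sin(bottom)), so it
    depends only on latitude, never on longitude."""
    top = math.radians(lat_top_deg)
    bottom = math.radians(lat_top_deg - pixel_deg)
    return EARTH_RADIUS_M ** 2 * math.radians(pixel_deg) * (math.sin(top) - math.sin(bottom))

pixel_deg = 1.0  # coarse for the demo; the real grids are far finer
lats = np.arange(90.0, -90.0, -pixel_deg)
column = np.array([[pixel_area_m2(lat, pixel_deg)] for lat in lats])  # shape (180, 1)

# Expanding the one-pixel-wide column out across each row is just numpy
# broadcasting, which is in spirit what the special mode described below does.
raster = np.ones((len(lats), 360))
weighted = raster * column

# Sanity check: the bands should sum to the surface area of the sphere.
print(abs(weighted.sum() / (4 * math.pi * EARTH_RADIUS_M ** 2) - 1) < 1e-9)  # True
```

The column is tiny compared with a full-width raster, which is what makes the one-pixel-wide storage trick worthwhile.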
Secondly, if you look at the most commonly used compression in a TIFF file, <a href=\"https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch\">LZW compression</a>, that requires that, to read a pixel in a row of the image, you decompress all the preceding pixels in that row: okay if you're calculating values for Alaska (being on the left of the map, so values are early in the row), but not so much for New Zealand (on the right of the map, so you need to decompress everything to the left first). Very early on in my time in the LIFE project I spotted that this was causing a lot of slowdown, so I added a <a href=\"https://github.com/quantifyearth/yirgacheffe/blob/main/yirgacheffe/layers/area.py#L10-L18\">special mode</a> to my geospatial library <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a> that lets you provide it a one pixel wide image, and it'll just extrapolate that out to fill rows with that same value. A simple trick, and one which gave the pipeline a significant performance boost.</p>\n<p>So far, still so good.</p>\n<p>The problem came when Ali tried to use that same script to generate an area layer for some analysis she was doing at 10m per side per pixel at the equator, but rather than generating an area-per-pixel raster that was 1 pixel wide, it made one that was 2 pixels wide. Yirgacheffe does some sanity checks when you use the special mode for these maps, checking that you have passed it something that is a single pixel wide, and so it was rejecting the new map. I dug into this, and once again it was floating point weirdness that was biting me.</p>\n<p>When you generate a GeoTIFF you need to specify the spatial location of the pixels. So when I generated my 1 pixel wide image, I gave it the location of -180\u02da longitude, which sort of made sense to me, as in Yirgacheffe I then expand all the pixels out to the right. 
This meant that in the internal logic I was generating a map that goes from -180\u02da to (-180\u02da + size of pixel), and despite Ali using a pixel size value of 0.0001, which seems to a human like a nice round number, when pushed back and forth through floating point it turns out that -180.0 + 0.0001 rounds ever so slightly larger than -179.9999, and Yirgacheffe when creating GeoTIFFs based on area will always round up so as to not lose data, and thus we end up at 2 pixels. To make things more icky, if you specify a pixel scale of 0.000100000000001 it all works, as the floating point approximations play nice.</p>\n<p>Fixing this properly is awkward, and I think I need to track in Yirgacheffe whether you really wanted to make a raster that was from -180.0 to -179.9999 and a bit, or you just wanted to make something 1 pixel wide, and I didn't really have time for that plumbing, so I <a href=\"https://github.com/quantifyearth/yirgacheffe/issues/26\">filed a bug against myself</a>, and just moved to offsetting the area-per-pixel raster at 0\u02da latitude, as the math works there, and internally the special mode in Yirgacheffe for expanding these area maps never checks the longitude, just the latitude. Not proud, but it got Ali unblocked before I vanished for a week.</p>\n<h2>Self-hosting fails</h2>\n<p>I got bitten once more by my choice of self-hosting platform being a Raspberry Pi, when I tried to share my slides from Nordic-RSE, which take the form of a 17MB PDF. Turns out that my quick and admittedly bad approach of handling static files by \"read all the data into memory then pass it to the web framework\" broke, as <a href=\"https://ocaml.org\">OCaml</a>'s default max size for a string on a 32-bit system (which <a href=\"https://www.raspberrypi.com/software/\">Raspbian Linux</a> is, despite the CPU being 64 bit) <a href=\"https://stackoverflow.com/questions/34973522/ocaml-string-length-limitation-when-reading-from-stdin-file\">is 16 MB</a>. 
I could just change an environment variable for this, but I really shouldn't be loading files like this anyway, so I thought (sitting in Gothenburg airport, where they have eduroam) I'd try to do this properly and serve the data in a more stream-like way.</p>\n<p>Looking at the Dream source code (<a href=\"https://aantron.github.io/dream/\">Dream</a> is the OCaml web framework I use), it uses <a href=\"https://github.com/ocsigen/lwt\">LWT</a> under the hood for this. LWT is one of the many ways of doing concurrent work in the OCaml ecosystem, and I've been trying to avoid learning it, because if I tried to learn every competing concurrency framework for OCaml I'd be late for dinner, and our group is in team <a href=\"https://github.com/ocaml-multicore/eio\">EIO</a>, so I was going to invest time into that at some point. Anyway, the code was small enough that I could just borrow the LWT bit of Dream's static loader (which I'm not using directly because it doesn't set <code>last-modified</code> headers).</p>\n<p>This works, but I still can't share my slides, as now I get an error of <code>Invalid_argument(\"Bytes.create\")</code> from somewhere within Dream when I use a large file - I assume I'm hitting a similar limit to that for strings - which implies the Dream/LWT implementation I based my updated static file handler on isn't as clever as I hoped.</p>\n<p>And there the matter rests for now. My slides weren't <em>that</em> interesting (as I led a discussion session).</p>\n<h1>This Week</h1>\n<h2>Open Hardware Summit</h2>\n<p>This will be a somewhat short week again, as I need to head up to Edinburgh for <a href=\"https://2025.oshwa.org\">Open Hardware Summit 2025</a>, an event I last went to in Denver in 2017. 
Most of it isn't directly EEG-related, though there is a <a href=\"https://2025.oshwa.org/panels/030-open-source-environmental/\">panel on environmental monitoring</a> which might relate to the <a href=\"https://anil.recoil.org/papers/2024-terracorder\">Terracorder</a> work <a href=\"https://profiles.imperial.ac.uk/joshua.millar22\">Josh</a> is working on, so I'll try to get to that.</p>\n<h2>Edge effects</h2>\n<p>I need to take a stab at implementing edge effects on habitat rasters. I've been reading papers to see what others do, and I still have some unanswered questions about the nuance of this, but I think it's probably at the stage where I should make a thing just so I can get a sense of how good/bad that is, and then refine from there.</p>\n<h2>Write up Nordic-RSE</h2>\n<p>I need to write up both my discussion session and a general overview. So many notes, so many good ideas and learnings.</p>\n<h2>GBIF processing</h2>\n<p>If there's any time left I still need to get into processing occurrence data for species based on GBIF data. This might be a good chance to see if the <a href=\"https://duckdb.org/2025/05/21/announcing-duckdb-130.html#spatial-join-operator\">performance increases duckdb announced for spatial joins</a> are meaningful for the sort of thing I do.</p>",
+14
mwd/weeknotes_2025-06-02_.json
···+"summary": "<h1>Last week</h1>\n<h2>Adding image processing support to Yirgacheffe</h2>\n<p>For some upcoming follow-on work to <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">LIFE</a>, I want to be able to model some things that look a lot to me like they could be implemented as <a href=\"https://en.wikipedia.org/wiki/Kernel_(image_processing)\">conventional image filters</a>. To this end I spent some time adding support for this style of operation to <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, my declarative geospatial library for Python. In general, it's much easier for me to implement fundamental operations like this in Yirgacheffe, where I can unit test them well and get confidence in the implementation before I then use them in the more cumbersome scientific pipeline.</p>\n<p>Yirgacheffe these days has two computational backends that I now need to support when I add new features: a CPU-targeted backend that uses <a href=\"https://numpy.org\">numpy</a>, and a Metal GPU-based backend that uses <a href=\"https://ml-explore.github.io/mlx/build/html/index.html\">MLX</a> (one day a CUDA backend will be added via <a href=\"https://cupy.dev/\">CUPY</a>, but I've not needed that yet, and so it hasn't been :). I had a dig around, and whilst numpy doesn't really support the 2D convolution matrix operations needed to run this style of image processing, <a href=\"https://pytorch.org\">PyTorch</a> does (<a href=\"https://docs.pytorch.org/docs/stable/generated/torch.nn.Conv2d.html\">see conv2d</a>), so that gave me a path to adding support for the CPU-targeted backend. MLX supports the <a href=\"https://ml-explore.github.io/mlx/build/html/python/nn/_autosummary/mlx.nn.Conv2d.html\">same API</a> as PyTorch for this operation, though weirdly it orders the dimensions of its matrices differently, which made the code a little messier for me. 
CUPY also looks to have a <a href=\"https://docs.cupy.dev/en/stable/reference/generated/cupyx.scipy.ndimage.convolve.html\">similar API</a>, so this isn't going to block a CUDA backend either.</p>\n<p>Hooking this into Yirgacheffe was a little more nuanced than my previous expansion of supported operators in Yirgacheffe. Under the hood, when you, say, add two raster layers together, Yirgacheffe does two things:</p>\n<ul>\n<li>It doesn't execute the expression when you define it, just when you save it. Instead it builds up a full s-expression of the operation at definition time.</li>\n<li>When it does eventually execute the expression, it carries out the operation in chunks, to avoid using too much memory (we regularly work with raster layers that are many times bigger than available RAM), and to optionally allow parallelism.</li>\n</ul>\n<p>So in the case where I'm adding two rasters, we read in the same data chunk from both sources, add them together, and then write that chunk to the result raster. This is fine when you have a one-to-one mapping of input to output pixels, but for a convolution matrix you need to load more data than the result needs, otherwise you get edge artefacts, because the matrix will read null values when you go along the edges of the input layer and the matrix goes over said edges. 
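The chunk-edge problem just described can be demonstrated with a small numpy stand-in for the convolution (illustrative only; the real code goes through PyTorch/MLX conv2d): filtering a chunk in isolation pads the chunk boundary with zeros, whereas expanding the read window by the kernel radius and cropping afterwards reproduces the full-raster result.

```python
import numpy as np

def mean3x3(block: np.ndarray) -> np.ndarray:
    """3x3 mean filter with zero padding and same-size output: a plain
    numpy stand-in for a conv2d call."""
    padded = np.pad(block, 1, mode="constant")
    out = np.zeros_like(block, dtype=float)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            out += padded[dy:dy + block.shape[0], dx:dx + block.shape[1]]
    return out / 9.0

raster = np.arange(100.0).reshape(10, 10)
full = mean3x3(raster)  # ground truth: filter the whole raster at once

# Naive chunking: filtering rows 4..7 in isolation zero-pads at the chunk
# boundary, so the chunk's first and last rows disagree with the full result.
naive = mean3x3(raster[4:8, :])

# Expanding the read window by the kernel radius (1 row here) and cropping
# after the filter reproduces the full result exactly.
expanded = mean3x3(raster[3:9, :])[1:5, :]

print(np.allclose(expanded, full[4:8, :]))  # True
print(np.allclose(naive, full[4:8, :]))     # False
```

The "expand then crop" step is exactly the window growth the chunking machinery has to propagate down to the leaves of the expression.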
Thus I needed to break the assumption that a chunk of data in the s-expression is always the same size wherever you are in the expression: if you go past a convolution matrix operation then the chunk window needs to expand all the way to the leaf nodes of the expression.</p>\n<p>Tedious stuff, but this again is why I hide all this stuff in Yirgacheffe: I do it once, then I never have to think about it again no matter how often I need it!</p>\n<h2>Nordic-RSE follow up</h2>\n<p>I wrote a <a href=\"/blog/nordic-rse-25/\">blog post summarising Nordic RSE 2025</a> - this took a good amount of time, and I've no idea how <a href=\"https://anil.recoil.org/\">Anil</a> and others live blog events!</p>\n<h2>Geocaml</h2>\n<p>I had a good catchup with <a href=\"https://patrick.sirref.org/\">Patrick</a> about <a href=\"https://github.com/geocaml/ocaml-tiff/\">OCaml-TIFF</a>, our OCaml library for working with GeoTIFF files. TIFF is a somewhat awkward format to deal with in terms of modifications, and so we drew out a plan for a minimal set of features we'd support for writing TIFF files, as that's the big blocker right now in using this for anything useful.</p>\n<h2>Experiments with TIFF formats</h2>\n<p>In an idle moment I did some small initial experiments to see if changing the way the rasters for LIFE are stored would make a performance difference to the various Area of Habitat (AoH) based pipelines I maintain. I'd wondered if switching from using the default TIFF storage format of storing the image by rows to storing it by tiles might make sense, as normally we're reading data for a species from a narrow band within the overall image width, but the results were actually slightly slower when I did this. 
I suspect this is down to <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, which reads in chunks of data based on a fixed chunk height, and unless that happens to align with the tile boundaries, you'll end up reading a lot of the tiles twice. Trying to solve this for an arbitrary set of input files with different tile layouts will be a pain, but perhaps I can just special case the version where all rasters have the same tile layout, which is something I can ensure in my pipelines.</p>\n<h2>Self-hosting</h2>\n<p>I did a little bit of learning <a href=\"https://ocsigen.org/lwt/latest/manual/manual\">LWT</a>, which is a promise-based concurrency library for OCaml, since that's what <a href=\"https://aantron.github.io/dream/\">Dream</a> (the web framework I use) is based on, and I wanted to see if I could at least improve on the current performance issues I'm having by moving image processing from the request handler directly onto a concurrent promise. I did a minimal job, but indeed using <code>Lwt_process</code> to invoke <a href=\"http://www.graphicsmagick.org\">GraphicsMagick</a> does seem to have improved responsiveness on the more image-heavy pages I have on my <a href=\"https://mynameismwd.org/\">personal site</a>. The processing time for images that have yet to be cached is still poor, but at least the rest of the site doesn't bog down too much whilst it's doing that. 
There's more I can do now that I understand LWT a little, and I can chip away at it over time.</p>\n<h1>Next week</h1>\n<p>I'll be working from <a href=\"https://en.wikipedia.org/wiki/Wirral_Peninsula\">The Wirral</a> for at least the first half of the week.</p>\n<ul>\n<li>Write up my discussion session from Nordic-RSE - hopefully will take a little less time as it's more on a topic I'm familiar with, but then there were lots of new things shared, so maybe not.</li>\n<li>Try applying my new convolution code to LIFE.</li>\n<li>I have an <a href=\"https://www.outreachy.org\">Outreachy</a> intern starting this week to do fun stuff with <a href=\"https://github.com/claudiusFX/claudius/\">Claudius</a>, so looking forward to seeing what new features we get from that.</li>\n</ul>",+"content": "<h1>Last week</h1>\n<h2>Adding image processing support to Yirgacheffe</h2>\n<p>For some upcoming follow-on work to <a href=\"https://royalsocietypublishing.org/doi/10.1098/rstb.2023.0327\">LIFE</a>, I want to be able to model some things that look a lot to me like they could be implemented as <a href=\"https://en.wikipedia.org/wiki/Kernel_(image_processing)\">conventional image filters</a>. To this end I spent some time adding support for this style of operation to <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, my declarative geospatial library for Python. 
In general, it's much easier for me to implement fundamental operations like this in Yirgacheffe, where I can unit test them well and get confidence in the implementation before I then use them in the more cumbersome scientific pipeline.</p>\n<p>Yirgacheffe these days has two computational backends that I now need to support when I add new features: a CPU-targeted backend that uses <a href=\"https://numpy.org\">numpy</a>, and a Metal GPU-based backend that uses <a href=\"https://ml-explore.github.io/mlx/build/html/index.html\">MLX</a> (one day a CUDA backend will be added via <a href=\"https://cupy.dev/\">CUPY</a>, but I've not needed that yet, and so it hasn't been :). I had a dig around, and whilst numpy doesn't really support the 2D convolution matrix operations needed to run this style of image processing, <a href=\"https://pytorch.org\">PyTorch</a> does (<a href=\"https://docs.pytorch.org/docs/stable/generated/torch.nn.Conv2d.html\">see conv2d</a>), so that gave me a path to adding support for the CPU-targeted backend. MLX supports the <a href=\"https://ml-explore.github.io/mlx/build/html/python/nn/_autosummary/mlx.nn.Conv2d.html\">same API</a> as PyTorch for this operation, though weirdly it orders the dimensions of its matrices differently, which made the code a little messier for me. CUPY also looks to have a <a href=\"https://docs.cupy.dev/en/stable/reference/generated/cupyx.scipy.ndimage.convolve.html\">similar API</a>, so this isn't going to block a CUDA backend either.</p>\n<p>Hooking this into Yirgacheffe was a little more nuanced than my previous expansion of supported operators in Yirgacheffe. Under the hood, when you, say, add two raster layers together, Yirgacheffe does two things:</p>\n<ul>\n<li>It doesn't execute the expression when you define it, just when you save it. 
Instead it builds up a full s-expression of the operation at definition time.</li>\n<li>When it does eventually execute the expression, it carries out the operation in chunks, to avoid using too much memory (we regularly work with raster layers that are many times bigger than available RAM), and to optionally allow parallelism.</li>\n</ul>\n<p>So in the case where I'm adding two rasters, we read in the same data chunk from both sources, add them together, and then write that chunk to the result raster. This is fine when you have a one-to-one mapping of input to output pixels, but for a convolution matrix you need to load more data than the result needs, otherwise you get edge artefacts, because the matrix will read null values when you go along the edges of the input layer and the matrix goes over said edges. Thus I needed to break the assumption that a chunk of data in the s-expression is always the same size wherever you are in the expression: if you go past a convolution matrix operation then the chunk window needs to expand all the way to the leaf nodes of the expression.</p>\n<p>Tedious stuff, but this again is why I hide all this stuff in Yirgacheffe: I do it once, then I never have to think about it again no matter how often I need it!</p>\n<h2>Nordic-RSE follow up</h2>\n<p>I wrote a <a href=\"/blog/nordic-rse-25/\">blog post summarising Nordic RSE 2025</a> - this took a good amount of time, and I've no idea how <a href=\"https://anil.recoil.org/\">Anil</a> and others live blog events!</p>\n<h2>Geocaml</h2>\n<p>I had a good catchup with <a href=\"https://patrick.sirref.org/\">Patrick</a> about <a href=\"https://github.com/geocaml/ocaml-tiff/\">OCaml-TIFF</a>, our OCaml library for working with GeoTIFF files. 
TIFF is a somewhat awkward format to deal with in terms of modifications, and so we drew out a plan for a minimal set of features we'd support for writing TIFF files, as that's the big blocker right now in using this for anything useful.</p>\n<h2>Experiments with TIFF formats</h2>\n<p>In an idle moment I did some small initial experiments to see if changing the way the rasters for LIFE are stored would make a performance difference to the various Area of Habitat (AoH) based pipelines I maintain. I'd wondered if switching from using the default TIFF storage format of storing the image by rows to storing it by tiles might make sense, as normally we're reading data for a species from a narrow band within the overall image width, but the results were actually slightly slower when I did this. I suspect this is down to <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, which reads in chunks of data based on a fixed chunk height, and unless that happens to align with the tile boundaries, you'll end up reading a lot of the tiles twice. Trying to solve this for an arbitrary set of input files with different tile layouts will be a pain, but perhaps I can just special case the version where all rasters have the same tile layout, which is something I can ensure in my pipelines.</p>\n<h2>Self-hosting</h2>\n<p>I did a little bit of learning <a href=\"https://ocsigen.org/lwt/latest/manual/manual\">LWT</a>, which is a promise-based concurrency library for OCaml, since that's what <a href=\"https://aantron.github.io/dream/\">Dream</a> (the web framework I use) is based on, and I wanted to see if I could at least improve on the current performance issues I'm having by moving image processing from the request handler directly onto a concurrent promise. 
I did a minimal job, but indeed using <code>Lwt_process</code> to invoke <a href=\"http://www.graphicsmagick.org\">GraphicsMagick</a> does seem to have improved responsiveness on the more image-heavy pages I have on my <a href=\"https://mynameismwd.org/\">personal site</a>. The processing time for images that have yet to be cached is still poor, but at least the rest of the site doesn't bog down too much whilst it's doing that. There's more I can do now that I understand LWT a little, and I can chip away at it over time.</p>\n<h1>Next week</h1>\n<p>I'll be working from <a href=\"https://en.wikipedia.org/wiki/Wirral_Peninsula\">The Wirral</a> for at least the first half of the week.</p>\n<ul>\n<li>Write up my discussion session from Nordic-RSE - hopefully will take a little less time as it's more on a topic I'm familiar with, but then there were lots of new things shared, so maybe not.</li>\n<li>Try applying my new convolution code to LIFE.</li>\n<li>I have an <a href=\"https://www.outreachy.org\">Outreachy</a> intern starting this week to do fun stuff with <a href=\"https://github.com/claudiusFX/claudius/\">Claudius</a>, so looking forward to seeing what new features we get from that.</li>\n</ul>",
+14
mwd/weeknotes_2025-06-16_.json
···+"summary": "<h1>Previous week (at work)</h1>\n<p>I was on vacation last week driving around the Netherlands on my motorbike with my partner, so this is mostly what I did the week before that. The Netherlands was lovely, and their cycle-biased road system puts Cambridge to shame: most of the places we stayed had push-bikes for us to use, and I felt safer cycling there on roadways atypical to me than I did cycling to meet Anil this morning on home turf.</p>\n<p>We also had fun learning about land management in the Netherlands: we walked on <a href=\"https://dezandmotor.nl/en/\">the sand motor</a>, an artificial sandbank that is an experiment in reinforcing the coast; we drove over <a href=\"https://theafsluitdijk.com\">Afsluitdijk</a>, a 20 mile long dyke that separates the open sea from the Ijsselmeer lake; we visited the museum at <a href=\"https://np-debiesbosch.nl/english-information/discover-the-park/\">Biesbosch National Park</a>, where we learned about what happens if you don't maintain your wetlands infrastructure; and rounded it off with a guided tour of <a href=\"https://en.wikipedia.org/wiki/Maeslantkering\">Maeslantkering</a>, a huge set of swinging doors at Hoek van Holland for blocking the sea from reaching Rotterdam if the sea level looks like it'll swell too high, which are sufficiently large that, were it not for the special glass-impregnated white paint, they would expand 70cm in the sun (as it is, the paint limits that to \"just\" 30 cm). Again, we do water management in East Anglia, but it's just at another scale in The Netherlands (I guess important when the centre of the country is six metres below sea level).</p>\n<h2>Area of Habitat Edge Effects</h2>\n<p>I spent some time trying to get my head around how to implement edge effects for Area of Habitat maps as part of LIFE. 
Edge effects refer to the fact that species that occupy certain habitats will sometimes not actually exist all the way to the edge of that habitat: if you have a habitat a species likes surrounded by habitat(s) it doesn't like, you can effectively shrink the habitat in by a set amount to allow for where they will not venture, making the population more concentrated within the inner region, and if areas of habitat are sufficiently small then species may not live there at all, despite it being a type they prefer.</p>\n<p>Edges are quite impactful in terms of land use change, as I tried to illustrate in this picture:</p>\n<div>\n <div>\n \n\n <img src=\"edges.svg\">\n \n </div>\n</div>\n<ol>\n<li>This just shows the edge on the area of habitat. The total splodge is the suitable habitat area, and the core is where the species will choose to live, avoiding the area marked edge.</li>\n<li>We may then think that if we change the land use of an area in the edge we don't impact the species...</li>\n<li>But in fact we just create a larger edge that eats into the core area by an amount larger than just the area changed.</li>\n<li>Similarly for changing an area in the middle of the core.</li>\n<li>The actual impact is amplified as there is an edge buffer all around the changed area, making it more impactful.</li>\n</ol>\n<p>Taking this into account is subtle though, I think: on one hand you do want to account for the edge effect when working out the area that can support a species, but if you're looking to monitor the area where any changes could impact that population you need to use the entire habitable area, as even changes in the edge zone will impact the habitable core zone. 
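As a toy illustration of the shrinking step (my own sketch, not the LIFE implementation), eroding a boolean habitat mask by an edge distance shows how a single cleared pixel removes far more core area than its own footprint:

```python
import numpy as np

def core_habitat(habitat: np.ndarray, edge_pixels: int) -> np.ndarray:
    """Shrink a boolean habitat mask inward by edge_pixels: a crude
    edge-effect model where a cell counts as core only if everything
    within the edge distance is also habitat (a binary erosion with a
    square window)."""
    h, w = habitat.shape
    padded = np.pad(habitat, edge_pixels, mode="constant", constant_values=False)
    core = np.ones_like(habitat, dtype=bool)
    for dy in range(2 * edge_pixels + 1):
        for dx in range(2 * edge_pixels + 1):
            core &= padded[dy:dy + h, dx:dx + w]
    return core

habitat = np.zeros((7, 7), dtype=bool)
habitat[1:6, 1:6] = True               # a 5x5 habitat patch
print(core_habitat(habitat, 1).sum())  # 9: only the inner 3x3 core remains

habitat[3, 3] = False                  # clear one pixel in the middle...
print(core_habitat(habitat, 1).sum())  # 0: the new edge wipes out the core
```

Losing all nine core cells to one cleared pixel is exactly the amplification the numbered points above describe.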
My job now is to follow that through for the biodiversity metric pipelines I have and ensure I use the appropriate version of AoH at each step, which might mean I need to calculate both for each species.</p>\n<h2>Data pipeline tools</h2>\n<p>I made a start on writing up the discussion session I ran at the Nordic-RSE conference, but got sucked into trying to understand the detail of <a href=\"https://dvc.org\">DVC</a> and <a href=\"https://snakemake.github.io/\">Snakemake</a>, both of which had strong advocates in the session. The current tl;dr is I like the idea of DVC and how it ties code and data together, but it lacks the ability to do the detailed dependency analysis that I'd want from a build system, and Snakemake has that level of detail, but has a much poorer user experience (subjective, I appreciate).</p>\n<p>My secondary motivation here is that right now for both the <a href=\"https://github.com/quantifyearth/LIFE/\">LIFE</a> and <a href=\"https://github.com/quantifyearth/STAR/\">STAR</a> pipelines I've written, the best sharable way to run them is via a shell script. Both were developed using <a href=\"https://github.com/quantifyearth/shark/\">Shark</a>, our own experimental data pipelining tool, but that's a bit too experimental for me to expect others to run, so I fell back on the shell script solution; but that does a bad job of only rebuilding the necessary parts of the pipeline if any of the inputs update, for which I want a proper build system, so I'm hoping that something from this exploration will give me another way out.</p>\n<h2>Outreachy</h2>\n<p>Outreachy kicked off, with <a href=\"https://pawaskar-shreya-outreachy.hashnode.dev/outreachy-week-1\">Shreya Pawaskar joining</a> to help with <a href=\"https://github.com/claudiusFX/claudius/\">Claudius</a>. 
This led to me doing a bit of work to tidy up a few loose ends that I'd been putting off that I didn't want Shreya to have to deal with, but it was great to have them around to review my PRs for that!</p>\n<p>It's also forced me to be pragmatic, and work around a problem I have with OCaml's build system <a href=\"https://dune.readthedocs.io/en/stable/\">dune</a>. None of my work is in Opam yet, as I don't feel it's met the quality bar required in terms of documentation etc., and so if people want to use libraries I've built then, on the guidance of others, I point them at <a href=\"https://dune.readthedocs.io/en/stable/tutorials/dune-package-management/pinning.html\">dependency pinning</a>, whereby you can specify a github repository for a dependency in your project's dune file, and then you run <code>dune pkg lock</code> and it'll fetch the pinned dependencies directly for you.</p>\n<p>This works fine, unless you have a submodule in your project. Claudius does use submodules for certain non-code resources, like the default font that is used for rendering text. Although this could be added as a subtree, my non-humble opinion is that a submodule is more appropriate here, as we don't care about the font's history, or indeed tracking updates. But <code>dune pkg lock</code> <a href=\"https://github.com/ocaml/dune/issues/11606\">does not cause submodules to be fetched</a>, and so currently Claudius breaks if you try to add it as a pinned dependency. The ticket for this on dune has sat for a while now, and given that Claudius is yet to gain the fame and attention it deserves, I suspect my complaints won't move the needle there. 
Thus I'm going to have to <a href=\"https://github.com/claudiusFX/Claudius/pull/101\">add my resources as subtrees</a> and accept the history pollution this will cause - but it's a lot better than not having Claudius usable at all.</p>\n<h2>Summer interns</h2>\n<p>Looks like we have one undergrad interested in helping with <a href=\"https://anil.recoil.org/ideas/3d-print-world\">3D printing geospatial data</a> over the summer, and I'm chatting to <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiff Ki</a> about providing support for 3D-printing camera jigs for <a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">digitising insect collections</a>.</p>\n<h1>This week</h1>\n<ul>\n<li>Write up some research ideas that have sat in my head for a while that I'm not getting time to act on - my hope is that by at least documenting them I can justify parking some of my current tasks or encourage others to at least run with the ideas so they might have impact.</li>\n<li>I need to write a quick overview of the AoH methodology for inclusion in some guidelines that the IUCN are pulling together.</li>\n<li>More on trying to write up my Nordic-RSE session on data pipelines.</li>\n<li>COVID booster jab - apologies in advance if I'm whinging on Friday :)</li>\n</ul>",+"content": "<h1>Previous week (at work)</h1>\n<p>I was on vacation last week driving around the Netherlands on my motorbike with my partner, so this is mostly what I did the week before that. 
The Netherlands was lovely, and their cycle-biased road system puts Cambridge to shame: most of the places we stayed had push-bikes for us to use, and I felt safer cycling there on atypical-to-me roadways than I did cycling to meet Anil this morning on home turf.</p>\n<p>We also had fun learning about the land management in the Netherlands: we walked on <a href=\"https://dezandmotor.nl/en/\">the sand motor</a>, an artificial sandbank that is an experiment in reinforcing the coast; we drove over <a href=\"https://theafsluitdijk.com\">Afsluitdijk</a>, a 20-mile-long dyke that separates the open sea from the Ijsselmeer lake; we visited the museum at <a href=\"https://np-debiesbosch.nl/english-information/discover-the-park/\">Biesbosch National Park</a> where we learned about what happens if you don't maintain your wetlands infrastructure; and rounded it off with a guided tour of <a href=\"https://en.wikipedia.org/wiki/Maeslantkering\">Maeslantkering</a>, a huge set of swinging doors at Hoek van Holland for blocking the sea from reaching Rotterdam if the sea level looks like it'll swell too high, which are sufficiently large that, if it wasn't for the special glass-impregnated white paint, they would expand 70 cm in the sun (as it is, the paint limits that to "just" 30 cm). Again, we do water management in East Anglia, but it's just at another scale in The Netherlands (I guess important when the centre of the country is six metres below sea level).</p>\n<h2>Area of Habitat Edge Effects</h2>\n<p>I spent some time trying to get my head around how to implement edge effects for Area of Habitat maps as part of LIFE. 
Edge effects refer to the fact that species that occupy certain habitats will sometimes not actually exist all the way to the edge of that habitat: if you have a habitat a species likes surrounded by habitat(s) it doesn't like, you can effectively shrink the habitat by a set amount to allow for where they will not venture, making the population more concentrated within the inner region. And if areas of habitat are sufficiently small, species may not live there at all, despite it being a type they prefer.</p>\n<p>Edges are quite impactful in terms of land use change, as I tried to illustrate in this picture:</p>\n<div>\n <div>\n \n\n <img src=\"edges.svg\">\n \n </div>\n</div>\n<ol>\n<li>This just shows the edge on the area of habitat. The total splodge is the suitable habitat area, and the core is where the species will choose to live, avoiding the area marked edge.</li>\n<li>We may then think that if we change the land use of an area in the edge we don't impact the species...</li>\n<li>But in fact we just create a new edge that eats into the core area by an amount larger than just the area changed.</li>\n<li>Similarly for changing an area in the middle of the core.</li>\n<li>The actual impact is amplified as there is an edge buffer all around the changed area, making it more impactful.</li>\n</ol>\n<p>Working out how to use this is subtle, I think: on the one hand you do want to account for the edge effect when working out the area that can support a species, but if you're looking to monitor the area where any changes could impact that population you need to use the entire habitable area, as even changes in the edge zone will impact the habitable core zone. 
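The shrink-by-a-set-amount idea can be sketched numerically. This is just a toy illustration on a boolean raster, not the LIFE implementation; the 2-pixel edge width is an arbitrary stand-in for a species-specific edge distance:

```python
import numpy as np

def erode(mask: np.ndarray) -> np.ndarray:
    """Remove any habitat pixel that touches (8-neighbourhood) a non-habitat pixel."""
    padded = np.pad(mask, 1, constant_values=False)
    out = np.ones_like(mask)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= padded[1 + dy : 1 + dy + mask.shape[0],
                          1 + dx : 1 + dx + mask.shape[1]]
    return out

def core_habitat(habitat: np.ndarray, edge_pixels: int) -> np.ndarray:
    """Strip edge_pixels worth of edge buffer off a habitat mask, leaving the core."""
    core = habitat.copy()
    for _ in range(edge_pixels):
        core = erode(core)
    return core

habitat = np.zeros((9, 9), dtype=bool)
habitat[1:8, 1:8] = True                          # a 7x7 patch of suitable habitat
print(int(habitat.sum()), int(core_habitat(habitat, 2).sum()))   # 49 9

habitat[4, 4] = False                             # change land use of one core pixel...
print(int(core_habitat(habitat, 2).sum()))        # 0 -- new edge wipes out the whole core
```

The second print is the amplification point from the list: flipping a single pixel creates an edge buffer all around it, and here that buffer swallows the entire 3x3 core, not just the one pixel changed.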
My job now is to follow that through for the biodiversity metric pipelines I have and ensure I use the appropriate version of AoH at each step, which might mean I need to calculate both for each species.</p>\n<h2>Data pipeline tools</h2>\n<p>I made a start on writing up the discussion session I ran at the Nordic-RSE conference, but got sucked into trying to understand the detail of <a href=\"https://dvc.org\">DVC</a> and <a href=\"https://snakemake.github.io/\">Snakemake</a>, both of which had strong advocates in the session. The current tl;dr is I like the idea of DVC and how it ties code and data together, but it lacks the ability to do the detailed dependency analysis that I'd want from a build system, and Snakemake has that level of detail, but has a much poorer user experience (subjective, I appreciate).</p>\n<p>My secondary motivation here is that right now for both the <a href=\"https://github.com/quantifyearth/LIFE/\">LIFE</a> and <a href=\"https://github.com/quantifyearth/STAR/\">STAR</a> pipelines I've written, the best sharable way to run them is via a shell script. Both were developed using <a href=\"https://github.com/quantifyearth/shark/\">Shark</a>, our own experimental data pipelining tool, but that's a bit too experimental for me to expect others to run, so I fell back on the shell script solution. That does a bad job of only rebuilding the necessary parts of the pipeline if any of the inputs update; for that I want a proper build system, so I'm hoping that something from this exploration will give me another way out.</p>\n<h2>Outreachy</h2>\n<p>Outreachy kicked off, with <a href=\"https://pawaskar-shreya-outreachy.hashnode.dev/outreachy-week-1\">Shreya Pawaskar joining</a> to help with <a href=\"https://github.com/claudiusFX/claudius/\">Claudius</a>. 
This led to me doing a bit of work to tidy up a few loose ends that I'd been putting off that I didn't want Shreya to have to deal with, but it was great to have them around to review my PRs for that!</p>\n<p>It's also forced me to be pragmatic, and work around a problem I have with OCaml's build system <a href=\"https://dune.readthedocs.io/en/stable/\">dune</a>. None of my work is in Opam yet, as I don't feel it's met the quality bar required in terms of documentation etc., and so if people want to use libraries I've built then, on the guidance of others, I point them at <a href=\"https://dune.readthedocs.io/en/stable/tutorials/dune-package-management/pinning.html\">dependency pinning</a>, whereby you can specify a github repository for a dependency in your project's dune file, and then you run <code>dune pkg lock</code> and it'll fetch the pinned dependencies directly for you.</p>\n<p>This works fine, unless you have a submodule in your project. Claudius does use submodules for certain non-code resources, like the default font that is used for rendering text. Although this could be added as a subtree, my non-humble opinion is that a submodule is more appropriate here, as we don't care about the font's history, or indeed tracking updates. But <code>dune pkg lock</code> <a href=\"https://github.com/ocaml/dune/issues/11606\">does not cause submodules to be fetched</a>, and so currently Claudius breaks if you try to add it as a pinned dependency. The ticket for this on dune has sat for a while now, and given that Claudius is yet to gain the fame and attention it deserves, I suspect my complaints won't move the needle there. 
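For reference, the pinning setup in question looks roughly like this, based on dune's package-management tutorial (the repository URL here is just Claudius's as an example, and the exact stanza shape may differ between dune versions):

```
; in dune-project
(pin
 (url "git+https://github.com/claudiusFX/Claudius")
 (package (name claudius)))
```

Running `dune pkg lock` then resolves and fetches the pin, and that fetch is exactly the step that currently skips submodule checkout.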
Thus I'm going to have to <a href=\"https://github.com/claudiusFX/Claudius/pull/101\">add my resources as subtrees</a> and accept the history pollution this will cause - but it's a lot better than not having Claudius usable at all.</p>\n<h2>Summer interns</h2>\n<p>Looks like we have one undergrad interested in helping with <a href=\"https://anil.recoil.org/ideas/3d-print-world\">3D printing geospatial data</a> over the summer, and I'm chatting to <a href=\"https://www.cambridgephilosophicalsociety.org/funding/henslow-fellows/dr-tiffany-ki\">Tiff Ki</a> about providing support for 3D-printing camera jigs for <a href=\"https://anil.recoil.org/ideas/digitisation-of-insects\">digitising insect collections</a>.</p>\n<h1>This week</h1>\n<ul>\n<li>Write up some research ideas that have sat in my head for a while that I'm not getting time to act on - my hope is that by at least documenting them I can justify parking some of my current tasks or encourage others to at least run with the ideas so they might have impact.</li>\n<li>I need to write a quick overview of the AoH methodology for inclusion in some guidelines that the IUCN are pulling together.</li>\n<li>More on trying to write up my Nordic-RSE session on data pipelines.</li>\n<li>COVID booster jab - apologies in advance if I'm whinging on Friday :)</li>\n</ul>",
+14
mwd/weeknotes_2025-06-23_.json
···+"summary": "<h1>Last week</h1>\n<p>Brief weeknotes as I'm a bit behind and it's a busy week ahead, apologies!</p>\n<h2>LIFE</h2>\n<p>We had a discussion in the LIFE team about the impact of the <a href=\"/weeknotes/2025-06-16/\">Area of Habitat edge effects I discussed last week</a>, and after some good discussion we came to a conclusion of how we want to approach it in the first instance. Alas, this means I now need to get coding.</p>\n<h2>Yirgacheffe available via pip</h2>\n<p>I had to set up a new compute server this week, meaning I had to rebuild my development environment for the various pipelines I run. One of the friction points in this is that I write a lot of libraries for myself that I then need to set up, the most common one of which is <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, my declarative geospatial library in which I hide all the complex things I don't want to deal with on a day-to-day basis. I've used this for many years on many projects and so finally decided to stop manually installing it for myself and <a href=\"https://pypi.org/project/yirgacheffe/\">get it into pypi</a>, which is the most common way of getting Python libraries out there. This involved less pain than I was expecting, and it's now set up so that merges of PRs on github should lead to the pip package being updated. It's a silly thing, but feels like a good milestone to have hit.</p>\n<p>There are other Python distributions out there, notably <a href=\"https://docs.conda.io/projects/conda/en/stable/index.html\">conda</a>, so at some point I should probably also get it working on that too, but for now, given all my pipelines currently use pip for dependency management, just having it in pypi helps a lot.</p>\n<h2>PROPL paper</h2>\n<p>On the topic of Yirgacheffe, Anil suggested I put something in to <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">PROPL</a> about it, so I've made a start on that this week. 
The deadline is July 3rd, but it's also only 5 pages, so hopefully enough time to get something reasonable written. The challenge is seeing it from the outside - I have to confess that as much as Yirgacheffe is useful to me, and it's now reasonably powerful, it all feels a bit obvious from the inside. Anil's been trying to get me to see that there's value in what I've built here, even if it doesn't feel novel in the scientific research sense to me.</p>\n<h2>OCaml-H3 wrapper into opam</h2>\n<p>This was motivated in part by my success getting Yirgacheffe into pypi, and in part by the ongoing work with Shreya Pawaskar on <a href=\"https://github.com/claudiusFX/Claudius/\">Claudius</a>, the OCaml graphics library I started, which will eventually need to be more accessible, i.e., available via <a href=\"https://opam.ocaml.org/\">opam</a>, the standard OCaml package repository. I thought rather than start with Claudius, I'd try getting my first thing into opam by submitting the <a href=\"https://github.com/geocaml/ocaml-h3\">OCaml wrapper I maintain</a> for the <a href=\"https://github.com/uber/h3\">Uber H3 geospatial library</a>, which is in theory a simpler project.</p>\n<p>That turned out to be somewhat wrong, as I needed to deal with the fact that the OCaml wrapper requires the H3 C library to be installed first. Initially I just assumed I'd rely on platform package managers to install it, but it turns out that although some platforms do have it (e.g., homebrew on macOS and Ubuntu), it seems more platforms do not have it as an option in their default package lists, and so I'd have to get opam to build and install it. In theory opam supports this, but getting it to work was a bit more nuanced, and after struggling for a bit I turned to fellow EEG member <a href=\"https://www.dra27.uk/\">David</a> for assistance, as I knew he spent a lot of time dealing with packages in opam. 
He soon had me pointed on the right track, and now I've got my <a href=\"https://github.com/ocaml/opam-repository/pull/28067\">first PR on opam repositories open</a>.</p>\n<h2>Outreachy/Claudius</h2>\n<p>On the topic of Claudius, Shreya got a prototype of <a href=\"https://github.com/claudiusFX/Claudius/pull/103\">saving Claudius output to animated GIFs</a> working, which is pretty cool! I'd show an example here, but my self-built stack for this website <a href=\"https://github.com/mdales/webplats/issues/4\">doesn't know how to deal with animated GIFs</a> and I don't have time to fix that right now \ud83e\udd26</p>\n<h2>Limited acceptance of the future</h2>\n<p>I'm generally a luddite when it comes to AI-related things, but I have to confess I've been using <a href=\"https://claude.ai\">Claude</a> on a limited basis with some success after both <a href=\"https://anil.recoil.org/\">Anil</a> and <a href=\"https://lbj20.blogspot.com/\">Laura</a> have been talking about how they use it. I'm not about to start vibe coding, but as a sort of natural language search engine, and a way to use it as a <a href=\"https://en.wikipedia.org/wiki/Rubber_duck_debugging\">rubber duck</a>, it's shown enough utility that I'll keep using it for now; I think like any tool, working out how and when to use it is key, and my stance of everything being "no" is probably ignoring some of the upsides of it. I just wish it were easier to defend ethically.</p>\n<h2>TODO list</h2>\n<p>As I mentioned in the todos for this week in last week's notes, I wrote down all my various things that need working on or that I would like to be working on - a useful exercise, as at least I now know why I often feel like I'm jumping between too many tasks: it's because I'm jumping between too many tasks.</p>\n<h1>This week</h1>\n<ul>\n<li>IUCN workshop - there's a three-day IUCN workshop taking place at the DAB this week, with one of the main themes being around their data-processing pipeline. 
Given I've been working with them on the implementation of both their STAR biodiversity metric and our own LIFE biodiversity metric that uses IUCN data, this should hopefully be a useful workshop for getting ahead of any planned changes they have, and aligning my efforts with their own.</li>\n<li>PROPL paper - I need to work more on the PROPL paper given the deadline is a week on Thursday!</li>\n</ul>",+"content": "<h1>Last week</h1>\n<p>Brief weeknotes as I'm a bit behind and it's a busy week ahead, apologies!</p>\n<h2>LIFE</h2>\n<p>We had a discussion in the LIFE team about the impact of the <a href=\"/weeknotes/2025-06-16/\">Area of Habitat edge effects I discussed last week</a>, and after some good discussion we came to a conclusion of how we want to approach it in the first instance. Alas, this means I now need to get coding.</p>\n<h2>Yirgacheffe available via pip</h2>\n<p>I had to set up a new compute server this week, meaning I had to rebuild my development environment for the various pipelines I run. One of the friction points in this is that I write a lot of libraries for myself that I then need to set up, the most common one of which is <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a>, my declarative geospatial library in which I hide all the complex things I don't want to deal with on a day-to-day basis. I've used this for many years on many projects and so finally decided to stop manually installing it for myself and <a href=\"https://pypi.org/project/yirgacheffe/\">get it into pypi</a>, which is the most common way of getting Python libraries out there. This involved less pain than I was expecting, and it's now set up so that merges of PRs on github should lead to the pip package being updated. 
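For anyone wanting to do similar, the moving parts are small. This is a minimal sketch of the packaging side, not Yirgacheffe's actual configuration - the name, version, description, and setuptools backend here are all illustrative:

```
# pyproject.toml (illustrative sketch)
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "yirgacheffe"
version = "0.0.1"
description = "Declarative geospatial library"
```

With something like that in place, `python -m build` produces the sdist and wheel under `dist/`, and `twine upload dist/*` (or a CI job using pypi's trusted-publishing support) pushes them to pypi.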
It's a silly thing, but feels like a good milestone to have hit.</p>\n<p>There are other Python distributions out there, notably <a href=\"https://docs.conda.io/projects/conda/en/stable/index.html\">conda</a>, so at some point I should probably also get it working on that too, but for now, given all my pipelines currently use pip for dependency management, just having it in pypi helps a lot.</p>\n<h2>PROPL paper</h2>\n<p>On the topic of Yirgacheffe, Anil suggested I put something in to <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">PROPL</a> about it, so I've made a start on that this week. The deadline is July 3rd, but it's also only 5 pages, so hopefully enough time to get something reasonable written. The challenge is seeing it from the outside - I have to confess that as much as Yirgacheffe is useful to me, and it's now reasonably powerful, it all feels a bit obvious from the inside. Anil's been trying to get me to see that there's value in what I've built here, even if it doesn't feel novel in the scientific research sense to me.</p>\n<h2>OCaml-H3 wrapper into opam</h2>\n<p>This was motivated in part by my success getting Yirgacheffe into pypi, and in part by the ongoing work with Shreya Pawaskar on <a href=\"https://github.com/claudiusFX/Claudius/\">Claudius</a>, the OCaml graphics library I started, which will eventually need to be more accessible, i.e., available via <a href=\"https://opam.ocaml.org/\">opam</a>, the standard OCaml package repository. I thought rather than start with Claudius, I'd try getting my first thing into opam by submitting the <a href=\"https://github.com/geocaml/ocaml-h3\">OCaml wrapper I maintain</a> for the <a href=\"https://github.com/uber/h3\">Uber H3 geospatial library</a>, which is in theory a simpler project.</p>\n<p>That turned out to be somewhat wrong, as I needed to deal with the fact that the OCaml wrapper requires the H3 C library to be installed first. 
Initially I just assumed I'd rely on platform package managers to install it, but it turns out that although some platforms do have it (e.g., homebrew on macOS and Ubuntu), it seems more platforms do not have it as an option in their default package lists, and so I'd have to get opam to build and install it. In theory opam supports this, but getting it to work was a bit more nuanced, and after struggling for a bit I turned to fellow EEG member <a href=\"https://www.dra27.uk/\">David</a> for assistance, as I knew he spent a lot of time dealing with packages in opam. He soon had me pointed on the right track, and now I've got my <a href=\"https://github.com/ocaml/opam-repository/pull/28067\">first PR on opam repositories open</a>.</p>\n<h2>Outreachy/Claudius</h2>\n<p>On the topic of Claudius, Shreya got a prototype of <a href=\"https://github.com/claudiusFX/Claudius/pull/103\">saving Claudius output to animated GIFs</a> working, which is pretty cool! I'd show an example here, but my self-built stack for this website <a href=\"https://github.com/mdales/webplats/issues/4\">doesn't know how to deal with animated GIFs</a> and I don't have time to fix that right now \ud83e\udd26</p>\n<h2>Limited acceptance of the future</h2>\n<p>I'm generally a luddite when it comes to AI-related things, but I have to confess I've been using <a href=\"https://claude.ai\">Claude</a> on a limited basis with some success after both <a href=\"https://anil.recoil.org/\">Anil</a> and <a href=\"https://lbj20.blogspot.com/\">Laura</a> have been talking about how they use it. I'm not about to start vibe coding, but as a sort of natural language search engine, and a way to use it as a <a href=\"https://en.wikipedia.org/wiki/Rubber_duck_debugging\">rubber duck</a>, it's shown enough utility that I'll keep using it for now; I think like any tool, working out how and when to use it is key, and my stance of everything being "no" is probably ignoring some of the upsides of it. 
I just wish it were easier to defend ethically.</p>\n<h2>TODO list</h2>\n<p>As I mentioned in the todos for this week in last week's notes, I wrote down all my various things that need working on or that I would like to be working on - a useful exercise, as at least I now know why I often feel like I'm jumping between too many tasks: it's because I'm jumping between too many tasks.</p>\n<h1>This week</h1>\n<ul>\n<li>IUCN workshop - there's a three-day IUCN workshop taking place at the DAB this week, with one of the main themes being around their data-processing pipeline. Given I've been working with them on the implementation of both their STAR biodiversity metric and our own LIFE biodiversity metric that uses IUCN data, this should hopefully be a useful workshop for getting ahead of any planned changes they have, and aligning my efforts with their own.</li>\n<li>PROPL paper - I need to work more on the PROPL paper given the deadline is a week on Thursday!</li>\n</ul>",
+14
mwd/weeknotes_2025-06-30_.json
···+"summary": "<h1>Last week</h1>\n<p>Last week time-wise was dominated by two things: a three-day IUCN workshop on applying new technologies to the <a href=\"https://www.iucnredlist.org\">IUCN Red List</a>, and attempting to pull together a paper on <a href=\"https://github.com/quantifyearth/yirgacheffe/\">yirgacheffe</a> for <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">PROPL</a>. I'll do a separate blog post on the former, and I now have a complete draft of the latter, so will have something to submit by the deadline of Thursday.</p>\n<h2>Opam submission failure</h2>\n<p>I mentioned <a href=\"/weeknotes/2025-06-23/\">last week</a> I <a href=\"https://github.com/ocaml/opam-repository/pull/28067\">submitted a package</a> to <a href=\"https://opam.ocaml.org/\">opam</a> for the first time, mostly as a learning experience. It was good that my main objective was to learn rather than get a package into opam, as that way I can claim success; otherwise the whole thing has been a frustrating exercise.</p>\n<p>The style of package I needed to submit isn't well documented, so I scanned through opam to see if I could find an example that was like what I wanted and based my PR on that. Unfortunately I picked poorly, and I got two responses to my pull request telling me to do it a different way, but without links to examples or documentation. I had a third tell me to change the name, despite the fact I'm wrapping an existing popular C library, and that was where the name came from.</p>\n<p>All of which would be fine if I were a seasoned contributor, but I'd argue it's a poor response for a first-time contributor to an open-source project to receive. 
If OCaml wants to draw in new contributors and widen its user pool, which is the tone I got from events like the FunOCaml conference I attended last year, then telling new contributors they're doing it wrong without support isn't going to encourage that, meaning you only want contributors who do this full time or are already embedded in the community. If it's a more casual contributor who isn't doing this as their primary function, this sort of interaction may well be the end of their attempt to participate.</p>\n<p>On the plus side, I did receive <a href=\"https://github.com/claudiusFX/bdfparser/pull/1\">a small PR</a> for one of my other obscure OCaml packages (which isn't in opam) last week, so that was a positive OCaml-community-wise, and shows that not having things in opam isn't the end of discoverability.</p>\n<h2>Bon In A Box</h2>\n<p>I had a play with <a href=\"https://boninabox.geobon.org\">Bon In A Box</a>, a containerised environment for running ecology data-science pipelines, and was quite impressed with it. 
Firstly, it provides an environment in which to run your Python or R scripts, and to do so in a way that encourages reproducibility by making inputs and outputs explicit at a higher level (no more hidden side effects in scripts, which I think is a major problem with using general-purpose languages in this domain), and because it's containerised you have to use the package dependencies via their metadata setup, which is good also - a common reproducibility problem with running other people's scripts is missing package dependencies, as they happened to have more installed on their system than what was in the requirements.txt or such (assuming that exists).</p>\n<p>Beyond that they also have this very cool way of building up pipelines where you drag scripts into a visual editor, and because the inputs and outputs of each script are defined in a metadata file, they can link together scripts visually, which is super exciting to me, as someone who spends a lot of time trying to generate this visualisation as an after-the-event view:</p>\n<div>\n <div>\n \n </div>\n</div>\n<p>Anil and I had a chat with some of the BIAB team, and it's still a work in progress, so I hope there are some opportunities for us to collaborate there, as this solves a bunch of problems we were looking to tackle, and does so in a more packaged and robust way than, say, our <a href=\"https://github.com/quantifyearth/shark\">Shark</a> project has. Not that time on Shark was wasted, but rather I'd be interested to see if we can contribute to an existing effort that does a lot of things right even if not quite how we'd do them, rather than duplicate a lot of effort to get our particular spin on things production-ready.</p>\n<p>As a concrete version of that: I originally built the LIFE and STAR pipelines to be executed by Shark, which was great for me, but because Shark is very much a work in progress, I couldn't ask other people to do that, so I also had to ship a shell script to run the pipeline. 
The shell script isn't great, as it lacks the nuance of a proper build system. It looks like BIAB will be a nice in-between, so I now want to try porting a part of LIFE or STAR to BIAB to get a feel for how it goes.</p>\n<h1>This week</h1>\n<p>I very much need to make technical progress this week, and submit my PROPL paper. On the technical front:</p>\n<ul>\n<li>Make some progress on AoH edge effects</li>\n<li>Start to work with the land cover foundation model from our group so I can try to apply it to the projects I've been working on</li>\n</ul>\n<p>Thankfully I'm hidden away up on the Wirral for the next two weeks, so hopefully I can just get my head down and get on with things.</p>\n<p>I'll also be at the aforementioned <a href=\"https://liverpoolmakefest.org\">Liverpool Makefest</a> on Saturday 5th, trying to demonstrate to the people of Liverpool how to get started in building guitars. And so in the unlikely event you're in the area next weekend, do say hi.</p>",+"content": "<h1>Last week</h1>\n<p>Last week time-wise was dominated by two things: a three-day IUCN workshop on applying new technologies to the <a href=\"https://www.iucnredlist.org\">IUCN Red List</a>, and attempting to pull together a paper on <a href=\"https://github.com/quantifyearth/yirgacheffe/\">yirgacheffe</a> for <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025#Call-for-Papers\">PROPL</a>. I'll do a separate blog post on the former, and I now have a complete draft of the latter, so will have something to submit by the deadline of Thursday.</p>\n<h2>Opam submission failure</h2>\n<p>I mentioned <a href=\"/weeknotes/2025-06-23/\">last week</a> I <a href=\"https://github.com/ocaml/opam-repository/pull/28067\">submitted a package</a> to <a href=\"https://opam.ocaml.org/\">opam</a> for the first time, mostly as a learning experience. 
It was good that my main objective was to learn rather than get a package into opam, as that way I can claim success; otherwise the whole thing has been a frustrating exercise.</p>\n<p>The style of package I needed to submit isn't well documented, so I scanned through opam to see if I could find an example that was like what I wanted and based my PR on that. Unfortunately I picked poorly, and I got two responses to my pull request telling me to do it a different way, but without links to examples or documentation. I had a third tell me to change the name, despite the fact I'm wrapping an existing popular C library, and that was where the name came from.</p>\n<p>All of which would be fine if I were a seasoned contributor, but I'd argue it's a poor response for a first-time contributor to an open-source project to receive. If OCaml wants to draw in new contributors and widen its user pool, which is the tone I got from events like the FunOCaml conference I attended last year, then telling new contributors they're doing it wrong without support isn't going to encourage that, meaning you only want contributors who do this full time or are already embedded in the community. If it's a more casual contributor who isn't doing this as their primary function, this sort of interaction may well be the end of their attempt to participate.</p>\n<p>On the plus side, I did receive <a href=\"https://github.com/claudiusFX/bdfparser/pull/1\">a small PR</a> for one of my other obscure OCaml packages (which isn't in opam) last week, so that was a positive OCaml-community-wise, and shows that not having things in opam isn't the end of discoverability.</p>\n<h2>Bon In A Box</h2>\n<p>I had a play with <a href=\"https://boninabox.geobon.org\">Bon In A Box</a>, a containerised environment for running ecology data-science pipelines, and was quite impressed with it. 
Firstly it provides an environment in which for you to run your Python or R scripts, and to do so in a way that encourages reproducibility by making inputs and outputs explicitly at a higher level (no more hidden sideeffects in scripts, which I think is a major problem with using general purpose languages in this domain), and because it's containerised you have to use the package dependancies via their metadata setup, which is good also - a common reproducibility problem with running other people's scripts is missing package dependancies as they happened to have more installed on their system than what was in the requirements.txt or such (assuming that exists).</p>\n<p>Beyond that they also have this very cool way for building up pipelines where you drag scripts into a visual editor, and because the inputs and outputs of each script are defined in a metadata file, they can link together scripts visually, which is super exciting to me who spends a lot of time trying to generate this visualisation as an after the event view:</p>\n<div>\n <div>\n \n </div>\n</div>\n<p>Anil and I had a chat with some of the BIAB team, and it's still work in progress, so I hope there's some opportunities for us to collaborate there, as this solves a bunch of problems we were looking to tackle, and does so in more packaged and robust way than say our <a href=\"https://github.com/quantifyearth/shark\">Shark</a> project has. Not that time on Shark was wasted, but rather I'd be interested to see if we can contribute to an existing effort that does a lot of things right even if not quite how we'd do them, rather than duplicate a lot of effort to get our particular spin on things production ready.</p>\n<p>As a concrete version of that: I originally built the LIFE and STAR pipelines to be executed by Shark, which was great for me, but because Shark is very much work in progress, I couldn't ask other people to do that, so I also had to ship a shell script to run the pipeline. 
The shell script isn't great, as it lacks the nuance of a proper build system. It looks like BIAB will be a nice in between, so I now want to try porting a part of LIFE or STAR to BIAB to get a feel for how it goes.</p>\n<h1>This week</h1>\n<p>I very much need to make technical progress this week, and submit my PRORL paper. On the technical front:</p>\n<ul>\n<li>Make some progress on AoH edge effects</li>\n<li>Start to work with the land cover foundation model from our group so I can try to apply it to the projects I've been working on</li>\n</ul>\n<p>Thankfully I'm hidden up on the Wirral for the next two weeks, so hopefully I can just get my head down and get on with things.</p>\n<p>I'll also be at the aforementioned <a href=\"https://liverpoolmakefest.org\">Liverpool Makefest</a> on Saturday 5th, trying to demonstrate to the people of Liverpool how to get started in building guitars. And so in the unlikely event you're in the area next weekend, do say hi.</p>",
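The explicit-inputs-and-outputs idea that makes BIAB appealing can be sketched in plain Python; `run_step` and its argument names here are my own invention for illustration, not BIAB's actual metadata format or API:

```python
from pathlib import Path

def run_step(step, inputs, outputs):
    """Run one pipeline step, but only after checking its declared
    inputs exist, and verify afterwards that it produced its declared
    outputs -- missing files fail loudly rather than mid-pipeline."""
    missing = [p for p in inputs if not Path(p).exists()]
    if missing:
        raise FileNotFoundError(f"missing inputs: {missing}")
    step()
    unproduced = [p for p in outputs if not Path(p).exists()]
    if unproduced:
        raise RuntimeError(f"step did not produce: {unproduced}")
```

A real system would read these declarations from each script's metadata file rather than function arguments, which is also what lets BIAB wire steps together in its visual editor.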
+14
mwd/weeknotes_2025-07-07_.json
+14
mwd/weeknotes_2025-07-07_.json
···+"summary": "<h1>Last week</h1>\n<h2>Edge effects</h2>\n<p>The main technical achievement this week was finally generating some initial edge-effect results. This isn't yet at a level where we're answering the scientific questions the LIFE team has, but I have made the first crude Area of Habitat (AoH) maps where there is some impact of the species within a habitat choosing not to occupy the areas that border with a habitat they do not like. Not all species do this, and those that do do so by different amounts, and whilst the depth of the edges discussed is only tens of metres, it does make a significant impact on fragmented landscapes, whereby although the total area of available habitat might be large, the large number of edges eats into that area quickly.</p>\n<p>In the projects that I work on, habitat types are encoded to the <a href=\"https://www.iucnredlist.org/resources/habitat-classification-scheme\">IUCN Habitat Classification Scheme</a>, which is a hierarchy of types: type 1 is forest, 2 is savanna, and so on, and then Type 1.1 is boreal forest, 1.2 subarctic forest, and so on. These are referred to as level 1 (the broad classification) and level 2 (more detailed). For LIFE, due to the limitations on historic data, we approximate everything to level 1, which means we have to simplify the current day habitat maps to match, by converting all their level 2 types to the more general level 1 type.</p>\n<p>I'd assumed that because of this, we'd not see much in the way of edge effects when processing AOH maps based on level 1 data, as we've lost a lot of subtlety in the data. 
However, it turns out I was wrong, as these before and after images show, with the standard AOH and edge impacted AOH for the <a href=\"https://www.iucnredlist.org/species/22702843/93891938\">White-browed Foliage-gleaner</a> (bird names are the best :), a bird that lives in south east Brazil:</p>\n<p></p><div>\n<div>\n\n\n<img src=\"before.png\">\n\n</div>\n</div>\n\n<p></p>\n<p></p><div>\n<div>\n\n\n<img src=\"after.png\">\n\n</div>\n</div>\n\n<p></p>\n<p>You can see, removing the edges has quite a large impact. Now, this being a test, I'm using a very harsh edge impact rule, more so than we'd apply in practice, but it's useful here to see that there are a lot of edges in this area, even at the reduced detail of just using a Level 1 habitat map.</p>\n<p>I ran this for a set of species the LIFE team had identified as good candidates for testing with, and I've sent over a bunch of rasters for them to assess and see if my implementation of edges matches their expectations, or whether I need to adjust my algorithm at all.</p>\n<p>The other consequence of my assumptions being proven wrong (about how fragmented the level 1 map is), is that in LIFE we downsample the habitat map before generating the AOH, but with edges we can't do that, as a downsampled pixel that is 50% covered could be because the left side is one habitat and the right side is another (low fragmentation) or because every alternating pixel is one habitat and then another. Both look the same once downsampled, but in one edges have little impact, and in the other edges will wipe out that pixel. As such, it means calculating species metrics with edge considerations will be considerably more compute intensive due to having to work at the finest resolution we have at the AOH level and then downsampling to the target resolution afterwards.</p>\n<h2>PROPL paper/yirgacheffe</h2>\n<p>The PROPL paper is nearly there, but I've been struggling to get some meaningful performance metrics for it. 
That will be this afternoon's task, as the paper deadline is tomorrow. It has been useful trying to profile a few bits of yirgacheffe for the paper, as I found a few simple, and in hindsight obvious, <a href=\"https://github.com/quantifyearth/yirgacheffe/issues/41\">things to improve upon</a>, and writing the paper made me also write down my various thoughts <a href=\"https://github.com/quantifyearth/yirgacheffe/issues/38\">about breaking API changes</a> I need to make for 2.0 to simplify the API.</p>\n<h2>Claudius</h2>\n<p>I mentioned a while ago I'm working with Shreya Pawaskar, an <a href=\"https://www.outreachy.org\">outreachy</a> intern, on <a href=\"https://github.com/claudiusFX/claudius/\">Claudius</a>, the OCaml graphics library I bootstrapped a year or so ago. Shreya just posted <a href=\"https://pawaskar-shreya-outreachy.hashnode.dev/outreachy-week-3\">a progress update blog post</a> covering her recent work on building in animated GIF recording support to the library. I'm really pleased with this, as it'll make it so much easier for people to share what they've built in Claudius on social media etc., which is what that community wants to do after they've built some cool new demo or visualisation.</p>\n<h2>3D printing geospatial data</h2>\n<p>This week sees Finley Stirk join us for the summer to help with building tools to help people 3D print geospatial data. I've done a little playing with this in the past, and it was very painful to get to work well, so I'm hoping with Finley's help we can lower the barrier to entry for getting geospatial data out of the computer and into the real world, where it can have greater impact.</p>\n<h2>Bon in a box</h2>\n<p><a href=\"https://anil.recoil.org/\">Anil</a> and I had another call with the <a href=\"https://boninabox.geobon.org\">GEOBON</a> team, including a long chat about whether any of the tooling we've developed for parallelisation of tasks might be useful for their containerized reproducible data pipelines. 
I think there's something there, it'll just be how much time we both have to push that forward. I'm still super keen to use their "bon in a box" tooling to test out our pipelines over the summer, so perhaps it can align with that.</p>\n<h2>Liverpool Makefest</h2>\n<p>This weekend was the tenth <a href=\"https://liverpoolmakefest.org\">Liverpool Makefest</a>, where the UK maker community takes over Liverpool Central Library for a day to show the public all the things they've been doing, in an attempt to inspire others to try new things and see them in a different light. It being the tenth anniversary, and given that it's close to that for <a href=\"https://mwdales-guitars.uk/\">me building guitars</a>, I did a sort of retrospective to try show people how they might get started building guitars themselves. It's always a fun day, and as ever it was flat out for most of the day as people from Liverpool (including a lot that just wanted to visit the library!) came by and asked questions and had a go on the guitars. 
As ever, I was too busy talking to people to remember to take photos, but here is one taken of me and the <a href=\"https://liverpool.gov.uk/council/councillors-and-committees/lord-mayor/\">Lord Mayor of Liverpool</a> who'd stopped by to ask about the guitars.</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A photo of me and the Lord Mayor of Liverpool, wearing a large gold neck piece that is the badge of office, stood in front of a table of guitars, which in turn are in front of shelves of books.\" src=\"IMG_9326.jpeg\">\n\n</div>\n</div>\n\n<p></p>\n<h1>This week</h1>\n<p>I'm very behind on things still, but this week I hope to:</p>\n<ul>\n<li>Get the PROPL paper submitted with some performance data</li>\n<li>Look at what Ian has been doing on the plant front</li>\n<li>Consider doing a more nuanced edge effect run closer to what we'd need to do</li>\n<li>Look into TESSERA if there's any free time</li>\n</ul>",+"content": "<h1>Last week</h1>\n<h2>Edge effects</h2>\n<p>The main technical achievement this week was finally generating some initial edge-effect results. This isn't yet at a level where we're answering the scientific questions the LIFE team has, but I have made the first crude Area of Habitat (AoH) maps where there is some impact of the species within a habitat choosing not to occupy the areas that border with a habitat they do not like. 
Not all species do this, and those that do do so by different amounts, and whilst the depth of the edges discussed is only tens of metres, it does make a significant impact on fragmented landscapes, whereby although the total area of available habitat might be large, the large number of edges eats into that area quickly.</p>\n<p>In the projects that I work on, habitat types are encoded to the <a href=\"https://www.iucnredlist.org/resources/habitat-classification-scheme\">IUCN Habitat Classification Scheme</a>, which is a hierarchy of types: type 1 is forest, 2 is savanna, and so on, and then Type 1.1 is boreal forest, 1.2 subarctic forest, and so on. These are referred to as level 1 (the broad classification) and level 2 (more detailed). For LIFE, due to the limitations on historic data, we approximate everything to level 1, which means we have to simplify the current day habitat maps to match, by converting all their level 2 types to the more general level 1 type.</p>\n<p>I'd assumed that because of this, we'd not see much in the way of edge effects when processing AOH maps based on level 1 data, as we've lost a lot of subtlety in the data. However, it turns out I was wrong, as these before and after images show, with the standard AOH and edge impacted AOH for the <a href=\"https://www.iucnredlist.org/species/22702843/93891938\">White-browed Foliage-gleaner</a> (bird names are the best :), a bird that lives in south east Brazil:</p>\n<p></p><div>\n<div>\n\n\n<img src=\"before.png\">\n\n</div>\n</div>\n\n<p></p>\n<p></p><div>\n<div>\n\n\n<img src=\"after.png\">\n\n</div>\n</div>\n\n<p></p>\n<p>You can see, removing the edges has quite a large impact. 
Now, this being a test, I'm using a very harsh edge impact rule, more so than we'd apply in practice, but it's useful here to see that there are a lot of edges in this area, even at the reduced detail of just using a Level 1 habitat map.</p>\n<p>I ran this for a set of species the LIFE team had identified as good candidates for testing with, and I've sent over a bunch of rasters for them to assess and see if my implementation of edges matches their expectations, or whether I need to adjust my algorithm at all.</p>\n<p>The other consequence of my assumptions being proven wrong (about how fragmented the level 1 map is), is that in LIFE we downsample the habitat map before generating the AOH, but with edges we can't do that, as a downsampled pixel that is 50% covered could be because the left side is one habitat and the right side is another (low fragmentation) or because every alternating pixel is one habitat and then another. Both look the same once downsampled, but in one edges have little impact, and in the other edges will wipe out that pixel. As such, it means calculating species metrics with edge considerations will be considerably more compute intensive due to having to work at the finest resolution we have at the AOH level and then downsampling to the target resolution afterwards.</p>\n<h2>PROPL paper/yirgacheffe</h2>\n<p>The PROPL paper is nearly there, but I've been struggling to get some meaningful performance metrics for it. That will be this afternoon's task, as the paper deadline is tomorrow. 
It has been useful trying to profile a few bits of yirgacheffe for the paper, as I found a few simple, and in hindsight obvious, <a href=\"https://github.com/quantifyearth/yirgacheffe/issues/41\">things to improve upon</a>, and writing the paper made me also write down my various thoughts <a href=\"https://github.com/quantifyearth/yirgacheffe/issues/38\">about breaking API changes</a> I need to make for 2.0 to simplify the API.</p>\n<h2>Claudius</h2>\n<p>I mentioned a while ago I'm working with Shreya Pawaskar, an <a href=\"https://www.outreachy.org\">outreachy</a> intern, on <a href=\"https://github.com/claudiusFX/claudius/\">Claudius</a>, the OCaml graphics library I bootstrapped a year or so ago. Shreya just posted <a href=\"https://pawaskar-shreya-outreachy.hashnode.dev/outreachy-week-3\">a progress update blog post</a> covering her recent work on building in animated GIF recording support to the library. I'm really pleased with this, as it'll make it so much easier for people to share what they've built in Claudius on social media etc., which is what that community wants to do after they've built some cool new demo or visualisation.</p>\n<h2>3D printing geospatial data</h2>\n<p>This week sees Finley Stirk join us for the summer to help with building tools to help people 3D print geospatial data. I've done a little playing with this in the past, and it was very painful to get to work well, so I'm hoping with Finley's help we can lower the barrier to entry for getting geospatial data out of the computer and into the real world, where it can have greater impact.</p>\n<h2>Bon in a box</h2>\n<p><a href=\"https://anil.recoil.org/\">Anil</a> and I had another call with the <a href=\"https://boninabox.geobon.org\">GEOBON</a> team, including a long chat about whether any of the tooling we've developed for parallelisation of tasks might be useful for their containerized reproducible data pipelines. 
I think there's something there, it'll just be how much time we both have to push that forward. I'm still super keen to use their "bon in a box" tooling to test out our pipelines over the summer, so perhaps it can align with that.</p>\n<h2>Liverpool Makefest</h2>\n<p>This weekend was the tenth <a href=\"https://liverpoolmakefest.org\">Liverpool Makefest</a>, where the UK maker community takes over Liverpool Central Library for a day to show the public all the things they've been doing, in an attempt to inspire others to try new things and see them in a different light. It being the tenth anniversary, and given that it's close to that for <a href=\"https://mwdales-guitars.uk/\">me building guitars</a>, I did a sort of retrospective to try show people how they might get started building guitars themselves. It's always a fun day, and as ever it was flat out for most of the day as people from Liverpool (including a lot that just wanted to visit the library!) came by and asked questions and had a go on the guitars. As ever, I was too busy talking to people to remember to take photos, but here is one taken of me and the <a href=\"https://liverpool.gov.uk/council/councillors-and-committees/lord-mayor/\">Lord Mayor of Liverpool</a> who'd stopped by to ask about the guitars.</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A photo of me and the Lord Mayor of Liverpool, wearing a large gold neck piece that is the badge of office, stood in front of a table of guitars, which in turn are in front of shelves of books.\" src=\"IMG_9326.jpeg\">\n\n</div>\n</div>\n\n<p></p>\n<h1>This week</h1>\n<p>I'm very behind on things still, but this week I hope to:</p>\n<ul>\n<li>Get the PROPL paper submitted with some performance data</li>\n<li>Look at what Ian has been doing on the plant front</li>\n<li>Consider doing a more nuanced edge effect run closer to what we'd need to do</li>\n<li>Look into TESSERA if there's any free time</li>\n</ul>",
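The downsampling ambiguity described above is easy to demonstrate: two habitat maps with identical 50% cover lose very different amounts to a crude one-pixel, four-neighbour erosion (a much harsher and simpler rule than the team's real edge model, purely for illustration):

```python
import numpy as np

def erode(habitat):
    """Remove habitat pixels that touch non-habitat on any of the
    four sides -- a crude one-pixel edge-effect rule."""
    padded = np.pad(habitat, 1, constant_values=False)
    return (padded[1:-1, 1:-1]
            & padded[:-2, 1:-1] & padded[2:, 1:-1]
            & padded[1:-1, :-2] & padded[1:-1, 2:])

# Two 8x8 maps that both downsample to a 50%-covered pixel:
halves = np.zeros((8, 8), dtype=bool)
halves[:, :4] = True               # left half habitat (low fragmentation)
stripes = np.zeros((8, 8), dtype=bool)
stripes[:, ::2] = True             # alternating columns (high fragmentation)

print(halves.mean(), stripes.mean())          # 0.5 0.5
print(erode(halves).sum(), erode(stripes).sum())  # 12 0
```

Half-and-half keeps a 12-pixel core; the striped map is wiped out entirely, which is exactly why the edge calculation has to happen before downsampling.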
+14
mwd/weeknotes_2025-07-14_.json
+14
mwd/weeknotes_2025-07-14_.json
···+"summary": "<h1>This week</h1>\n<h2>Yirgacheffe</h2>\n<p>The short paper on the design and use of <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a> was submitted to <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">PROPL</a> on time, but not without a little stressing to the end, which is the downside of paper deadlines: something always turns up that makes them a rush, even if you felt you had things mostly in hand the week before.</p>\n<p>Context: for those who haven't seen it before, one of the main features of Yirgacheffe is that you can specify numerical operations directly on geospatial datasets, so you can add/multiply/filter these large rasters or polygons directly, and it'll do all the bookkeeping about aligning pixels, rasterizing polygons, etc., and at the end you either save the result to another raster layer, or you perform some aggregation like summing all the pixels or finding the min/max.</p>\n<p>One of the less used features of Yirgacheffe, at least by me, is that when doing that save or aggregation, Yirgacheffe can attempt to do so in parallel using multiple CPU cores. Normally the pipelines I work on don't use this feature as they tend towards data flows that work better if I run the same script many times in parallel, rather than one script that does everything within it. Partly this is down to Python being generally poor at parallelism, but mostly down to the data flows: when processing thousands of area of habitat calculations at a time, it's just easier to run the AoH script once per species, and I can use an external tool like GNU Parallel or <a href=\"https://github.com/quantifyearth/littlejohn/\">Littlejohn</a> to orchestrate that.</p>\n<p>But, there are times when you just want one script to do some calculation on a big raster as fast as possible, and for that I added the option to use multiple cores for the calculations. 
Internally you can imagine Yirgacheffe breaks down each calculation into say rows of pixels and does them one at a time to avoid having to load too much data into memory, so it's a small logical leap to say we'll do several of those rows at a time in parallel, as they're independent of each other. Yirgacheffe doesn't try to do anything very clever here, but I found when I benchmarked the feature it performed much worse than I'd expected, actually being several times slower than just using a single thread in some instances, one being over 6 times slower!</p>\n<div>\n</div>\n\n<p>My test case was processing 277 different species AoHs. I did specifically go for a mix of ranges, but the data for species sizes does tend to skew small, so most runs don't process much data. Whilst I said above you could imagine Yirgacheffe processes a row of pixel data at a time, it actually does larger chunks than that: partly to get better disk behaviour and partly because polygon rasterization works very poorly at that scale, as it still has to process the entire polygon each time you want to rasterize a small chunk of it, and for species with ranges defined by detailed coastlines that can be a lot of data.</p>\n<p>So I realised that for many small species it was doing a single chunk of data, and if I set the parallelize flag it was still trying to do that work on a worker thread, which in Python is quite expensive to set up. 
So I added some checks to see if you would actually need parallelism, and if the calculation was just one chunk of data, then it'd revert to the single thread code path.</p>\n<div>\n</div>\n\n<p>This still isn't great, with still quite a few instances being slower than single threaded, but it did bring the mean run time down to less than a third of the original, with the fastest case around 12% of the original run.</p>\n<p>The overhead of processing one chunk like this did make me then wonder about how I was defining the chunk size, and whether I should look at the current default work unit size. I played a little with reducing it to encourage more parallelism, but that only seemed to make things worse, as the rasterization overheads kicked in, and given the paper deadline, I didn't really have the time to try explore that space nor work out how to automatically infer what might be reasonable, so I had to park that. I also tried another, larger dataset, processing all 1,600-odd mammals from the STAR metric, and this also gave me mixed results performance wise, and I didn't have time to dig into that: I assume the species' range distribution was different from my normal test sample set.</p>\n<p>Ultimately, on average the parallel save feature on Yirgacheffe does better than not having it, but is pretty poor given how many CPU cores it can use, and so overall I'm left quite unhappy with the feature. I feel that even allowing for Python related problems, something better could be done, but there was no time to look before the deadline passed \ud83d\ude20</p>\n<p>It's not like this was even a critical part of the narrative to the paper, and isn't a feature I use that much, but the process made me realise there's something going wrong and I don't understand why, and I don't have time to figure it out, and that is deeply frustrating.</p>\n<h2>LIFE</h2>\n<p>I started generating a new LIFE run using the latest RedList update from 2025. 
All the LIFE paper work was done with RedList data from when the project started in 2023, and there's now a 2025 update out, so we want to publish updated layers. I did a visual inspection of the new maps, and there are some differences, particularly around amphibians, but they generally look good, and I've passed them over to Alison, who as a zoologist is actually capable of interpreting the results properly.</p>\n<p>Whilst doing this I'm also doing a little modernisation of the code, and changing the default results you get when you use the script that comes with the repo so that it just runs things we're still interested in, rather than everything that was in the original LIFE paper.</p>\n<h2>Claudius</h2>\n<p><a href=\"https://pawaskar-shreya-outreachy.hashnode.dev/\">Shreya</a>, the Outreachy intern working on <a href=\"https://github.com/claudiusFX/claudius\">Claudius</a>, has been working for the last few weeks on getting a feature to record animations out to an animated-GIF file, and that's now merged. I'd include an example here, but my <a href=\"https://github.com/mdales/webplats/\">self-written website publishing tool</a> doesn't have a way to let me include it, so I'll try fix that for next week \ud83e\udd26 We made some progress towards getting Claudius into opam, as I got the <a href=\"https://github.com/claudiusFX/claudius/\">OCaml-GIF library</a> that it depends on and that we maintain <a href=\"https://github.com/ocaml/opam-repository/pull/28146\">into opam</a>.</p>\n<p>The next challenge will be getting Claudius in, as the obvious paths don't quite work due to Claudius using a submodule to add a resource dependency. 
Specifically, github releases don't include submodules in the produced tarball, which means Claudius won't build from a github release unfortunately, which is how I did the release for the GIF library.</p>\n<h2>3D-Printing maps</h2>\n<p>UROP studently Finley started, and impressed me by very quickly getting up and running generating models for 3D printing from digital elevation maps:</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A screen shot of a square area of hilly land rendered in some 3D-printer slicer software.\" src=\"srtm_print_hr_2.png\">\n\n</div>\n</div>\n\n<p></p>\n<p>Finley is going to try write up some weeknotes, so I'll link to those here as and when and not spoil his work, but I'm super excited about what we might get done this summer. I was working out of <a href=\"https://doesliverpool.com/\">DoES Liverpool</a> for part of last week, and I did spot this lovely CNC-routed landscape and I must resist trying to derail this project into even more time-consuming construction methods :)</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A photo of me holding a wooden block into which a mountain range has been carved\" src=\"IMG_9342.jpeg\">\n\n</div>\n</div>\n\n<p></p>\n<p>I did find out the computer lab has some Prusa 3D-printers, so hopefully Finley and I can get trained on those.</p>\n<h1>This week</h1>\n<ul>\n<li>Make sure we have everything we need for the next LIFE manuscript ready for zenodo.</li>\n<li>Get some of Finley's results 3D-printed and try get him able to print on his own.</li>\n<li>Try to schedule a meeting on AoH validation with interested peeps. 
This was discussed around the IUCN workshop a few weeks back, and I need to try arrange that before people vanish for summer holidays (myself included).</li>\n<li>Look into TESSERA if there's any free time</li>\n</ul>",+"content": "<h1>This week</h1>\n<h2>Yirgacheffe</h2>\n<p>The short paper on the design and use of <a href=\"https://github.com/quantifyearth/yirgacheffe/\">Yirgacheffe</a> was submitted to <a href=\"https://conf.researchr.org/home/icfp-splash-2025/propl-2025\">PROPL</a> was submitted on time, but not without a little stressing to the end, which is the downside of paper deadlines: something always turns up that makes them a rush, even if you felt you had things mostly in hand the week before.</p>\n<p>Context: for those who haven't seen it before, one of the main features of Yirgacheffe is that you can specify numerical operations directly on geospatial datasets, so you can add/multiply/filter these large rasters or polygons directly, and it'll do all the book keeping about aligning pixels, rasterizing polygons, etc., and at the end you either save the result to another raster layer, or you perform some aggregation like summing all the pixels or finding the min/max.</p>\n<p>One of the less used features of Yirgacheffe, at least by me, is that when doing that save or aggregation, Yirgacheffe can attempt to do so in parallel using multiple CPU cores. Normally the pipelines I work on don't use this feature as they tend towards data flows that work better if I run the same script many times in parallel, rather than one script that does everything within it. 
Partly this is down to Python being generally poor at parallelism, but mostly down to the data flows, e.g., processing thousands of area of habitat calculations at a time, it's jsut easier to run the AoH script once per species, and I can use an external tool like GNU Parallel or <a href=\"https://github.com/quantifyearth/littlejohn/\">Littlejohn</a> to orchestrate that.</p>\n<p>But, there are times when you just one script to do some calculation on a big raster as fast as possible, and for that I added the option to use multiple cores for the calculations. Internally you can imagine Yirgacheffe breaks down each calculation into say rows of pixels and does them one at time to avoid having to load too much data into memory, so it's a small logical leap to say we'll do several of those rows at a time in parallel, as they're independent of each other. Yirgacheffe doesn't try to do anything very clever here, but I found when I benchmarked the feature it performed much poorer than I'd expected, actually being several times slower than just using a single thread in some instances, one being over 6 times slower!</p>\n<div>\n</div>\n\n<p>My test case was processing 277 different species AoHs. I did specifically go for a mix of ranges, but the data for species sizes does tend to skew small, so don't process much data. Whilst I said above you could imagine Yirgacheffe processes a row of pixel data at a time, it actually does larger chunks than that: partly to get better disk behaviour and partly because polygon rasterization works very poorly at that scale, as it still has to process the entire polygon each time you want to rasterize a small chunk of it, and for species with ranges defined by detailed coastlines that can be a lot of data.</p>\n<p>So I realised that for many small species it was doing a single chunk of data, and if I set the parallelize flag it was still trying to do that work on a worker thread, which in Python is quite expensive to set up. 
So I added some checks to see if you would actually need parallelism, and if the calculation was just one chunk of data, then it'd revert to the single thread code path.</p>\n<div>\n</div>\n\n<p>This still isn't great, with still quite a few instances being slower than single threaded, but did bring the mean down taking less than a third of the original performance, with the min being around 12% of the original run.</p>\n<p>The overhead of processing one chunk like this did make me then wonder about how I was defining the chunk size, and whether I should look at the current default work unit size. I played a little with reducing it to encourage more parallelism, but that only seemed to make things worse, as the rasterization overheads kicked in, and given paper deadline, I didn't really have the time to try explore that space nor work out how to automatically infer what might be reasonable, so I had to park that. I also tried another, larger dataset, processing all 1600 odd mammals from the STAR metric, and this also gave me mixed results performance wise, and I didn't have time to dig into that: I assume the species' range distribution was different from my normal test sample set.</p>\n<p>Ultimately, on average the parallel save feature on Yirgacheffe does better than not having it, but is pretty poor given how many CPU cores it can use, and so overall I'm left quite unhappy with the feature. I feel that even allowing for Python related problems, something better could be done, but there was no time to look before the deadline passed \ud83d\ude20</p>\n<p>It's not like this was even a critical part of the narrative to the paper, and isn't a feature I use that much, but the process made me realise there's something going wrong and I don't understand why, and I don't have time to figure it out, and that is deeply frustrating.</p>\n<h2>LIFE</h2>\n<p>I started generating a new LIFE run using the latest RedList update from 2025. 
All the LIFE paper work was done with RedList data from when the project started in 2023, and there's now a 2025 update out, so we want to publish updated layers. I did a visual inspection of the new maps, and there are some differences, particularly around amphibians, but they generally look good. I've passed them over to Alison, who as a zoologist is actually capable of interpreting the results properly.</p>\n<p>Whilst doing this I'm also doing a little modernisation of the code, and changing the default results you get when you use the script that comes with the repo so that it just runs things we're still interested in, rather than everything that was in the original LIFE paper.</p>\n<h2>Claudius</h2>\n<p><a href=\"https://pawaskar-shreya-outreachy.hashnode.dev/\">Shreya</a>, the Outreachy intern working on <a href=\"https://github.com/claudiusFX/claudius\">Claudius</a>, has been working for the last few weeks on getting a feature to record animations out to an animated-GIF file, and that's now merged. I'd include an example here, but my <a href=\"https://github.com/mdales/webplats/\">self-written website publishing tool</a> doesn't have a way to let me include it, so I'll try fix that for next week \ud83e\udd26 We made some progress towards getting Claudius into opam, as I got the <a href=\"https://github.com/claudiusFX/claudius/\">OCaml-GIF library</a> that it depends on that we maintain <a href=\"https://github.com/ocaml/opam-repository/pull/28146\">into opam</a>.</p>\n<p>The next challenge will be getting Claudius in, as the obvious paths don't quite work due to Claudius using a submodule to add a resource dependency. 
Specifically, github releases don't include submodules in the produced tarball, which unfortunately means Claudius won't build from a github release, which is how I did the release for the GIF library.</p>\n<h2>3D-Printing maps</h2>\n<p>UROP student Finley started, and impressed me by very quickly getting up and running generating models for 3D printing from digital elevation maps:</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A screen shot of a square area of hilly land rendered in some 3D-printer slicer software.\" src=\"srtm_print_hr_2.png\">\n\n</div>\n</div>\n\n<p></p>\n<p>Finley is going to try write up some weeknotes, so I'll link to those here as and when and not spoil his work, but I'm super excited about what we might get done this summer. I was working out of <a href=\"https://doesliverpool.com/\">DoES Liverpool</a> for part of last week, and I did spot this lovely CNC-routed landscape and I must resist trying to derail this project into even more time-consuming construction methods :)</p>\n<p></p><div>\n<div>\n\n\n<img alt=\"A photo of me holding a wooden block into which a mountain range has been carved\" src=\"IMG_9342.jpeg\">\n\n</div>\n</div>\n\n<p></p>\n<p>I did find out the computer lab has some Prusa 3D-printers, so hopefully Finley and I can get trained on those.</p>\n<h1>This week</h1>\n<ul>\n<li>Make sure we have everything we need for the next LIFE manuscript ready for zenodo.</li>\n<li>Get some of Finley's results 3D-printed and try get him able to print on his own.</li>\n<li>Try to schedule a meeting on AoH validation with interested peeps. This was discussed around the IUCN workshop a few weeks back, and I need to try arrange that before people vanish for summer holidays (myself included).</li>\n<li>Look into TESSERA if there's any free time</li>\n</ul>",
+18
onkar/2025_02_28_biospace.json
···+"content": "<h1>Blogging BIOSPACE25!</h1>\n\n<p>28th February, 2025</p>\n\n<p><strong>Hello world</strong>, this is my first Jekyll blog post.</p>\n\n<p>With that formality out the way\u2026 a couple weeks ago I headed off to the Biospace conference at the ESA-ESRIN Observation Center in Frascati, Italy. While I was only there for 2 days, there was a lot to be excited about.</p>\n\n<p>\n <img alt=\"\" src=\"http://localhost:4000/images/posts/biospace/esa.jpg\">\n <br>\n <em>Just past the entrance to the ESA-ESRIN Observation Center, and also past some fairly intense security!</em>\n</p>\n\n<p>My big takeaway from the opening speeches was that this is the <strong>first</strong> year that the ESA is spending <em>more</em> on building out its data science capabilities than it is on putting satellites into space. To me, this is indicative of the fact that the marginal benefit from putting effort into effectively wrangling huge amounts of data is now greater than that from collecting huge amounts of data at a faster pace.</p>\n\n<p>There was a lot of discussion around the Kunming-Montreal Global Biodiversity Framework, which introduces a new set of Essential Biodiversity Variables (EBVs). A lot of the discussion here was frankly making the case for much of my PhD research for me. Some quotes of note:</p>\n\n<p><em>\u201cWhy create even more indicators when we can\u2019t even measure the ones we already have?\u201d</em></p>\n\n<p><em>\u201cWe want to ensure that the mistakes that were made with the SDG indicators are not made again. 
These mistakes you only begin to learn about as you dive into the data.\u201d</em></p>\n\n<p>And yet, even with those mistakes, the SDGs are the targets that countries broadly have agreed upon.</p>\n\n<p>\n <img alt=\"\" src=\"http://localhost:4000/images/posts/biospace/opening.jpg\">\n <br>\n <em>Don't mind the satellites hanging from the ceiling</em>\n</p>\n\n<p>A key point multiple speakers made note of (there were a dozen or so speakers talking for perhaps ~10 minutes each) was that introducing frameworks and methodologies to give countries national ownership of their data and the ability to independently generate compatible statistics was the priority, not introducing new data products. If we can move towards all countries using the same standards, we can enable the aggregation of statistics in a reliable manner.</p>\n\n<p>That\u2019s not to say that the ESA <em>isn\u2019t</em> introducing new data products: they noted that future missions include Biomass and Flex, which are designed to capture forest biomass and vegetation fluorescence respectively, both very on point for the theme of the conference.</p>\n\n<p>There was also a palpable frustration around the lack of people who exist in the zones between science, economics, and policy (a point that Simon Sharpe repeatedly makes in his fantastic book, Five Times Faster, which I should be reading at least two times faster given that I\u2019m only halfway through it\u2026). 
<em>\u201cWe can\u2019t drive impact without more of these people popping up over the next few years.\u201d</em></p>\n\n<p>Jilian Campbell of the Convention on Biological Diversity very astutely noted that even once we do have standardised EO-derived indicators, we still need robust mechanisms of connecting these to on-the-ground data for validation and verification purposes.</p>\n\n<p>Ilaria Dimatteo of the UN Statistical Commission explained how even though they carefully map out both environmental and economic circumstances for policy-making, <em>\u201cwhen decisions are taken at the national level, the environment does not really come into play.\u201d</em> In 2021, the commission adopted the System of Environmental-Economic Accounting Ecosystem Accounting (SEEA EA, not a typo) to essentially force the connection between these two domains in a spatially explicit manner. <em>\u201cFrom a statistical perspective, we really want international compatibility. Methods to ensure that information generated is reliable, replicable, and widely known.\u201d</em></p>\n\n<p>The conference also featured a great talk from my fellow Cambridge PhD Andr\u00e9s Z\u00fa\u00f1iga Gonz\u00e1lez focusing on on-device scalable learning to aid urban tree management with a hardware focus.</p>\n\n<p>\n <img alt=\"\" src=\"http://localhost:4000/images/posts/biospace/andres.jpg\">\n <br>\n <em>I particularly enjoyed his hedgehog networking analogies!</em>\n</p>\n\n<p>At my own poster presentation, it turned out that almost everyone at the conference had an opinion on causal relationships between indicators and was either incredibly glad that someone was trying to tackle it or incredibly frustrated with the concept as a whole, having bumped their own heads against it. 
The most valuable feedback I received involved mentions of additive and generalised statistical models, which can better capture characteristics of causality in specific sorts of systems.</p>\n\n<p>My last day at the conference also involved a workshop focusing on the EBVs (I forgive you for forgetting what this stands for - Essential Biodiversity Variables), in which we went through the entire list of EBVs thinking through whether these were observable from satellite-based data products and how feasible accurate measurement of them was at scale. It turned out that many people were only just beginning to gain familiarity with these, so it was a good learning exercise for us all.</p>\n\n<p>A wonderful view and delicious food to wrap up my trip. It is Italy after all!</p>\n\n<p>\n <img alt=\"\" src=\"http://localhost:4000/images/posts/biospace/gandolfo.jpg\">\n <br>\n <em>Ciao</em>\n</p>",
+18
onkar/2025_04_30_bookofwhy.json
···+"content": "<h1>The Book of Why - My Thoughts on Causality in the 21st Century</h1>\n\n<p>30th April, 2025</p>\n\n<p>I\u2019ve probably said the word \u2018causality\u2019 more in these last couple of months than I ever have before in my life. Chances are, with everything going on in the world, I\u2019m not the only one.</p>\n\n<p>As policymakers debate the true levers of growth and inflation in a globally uncertain economic environment and I mull over whether statistical and machine learning models are capable of discerning causal links between development indicators, it seems like a good time to visit the concept.</p>\n\n<p><a href=\"https://svr-sk818-web.cl.cam.ac.uk/keshav/wiki/index.php/Main_Page\">Prof. Srinivasan Keshav</a> of the <a href=\"https://www.cst.cam.ac.uk/research/eeg\">Energy and Environment Group</a> at the Computer Laboratory recommended I check out Judea Pearl\u2019s The Book of Why when I had first begun to think about how geospatial machine learning might help uncover causal influences on the ground.</p>\n\n<p>What I found within was a fantastic retelling of how academics and statisticians have thought about causality (or done their best to avoid the concept entirely), and how we can leverage the tools of the causal revolution to ask better questions and seek clearer answers.</p>\n\n<p>To summarise Pearl\u2019s key theses, which he drives home from the very beginning of the book:<span></span></p>\n<ul>\n <li>The human brain is the most sophisticated causal processing machine on the planet.</li>\n <li>We can place causal thinking on three hierarchical levels, collectively termed the \u2018Ladder of Causation\u2019. These correspond (from bottom to top) with the concepts of \u2018association\u2019, \u2018intervention\u2019, and \u2018counterfactuals\u2019.</li>\n <li>Data alone cannot answer causal enquiries. 
We require machines specifically constructed for understanding causal relationships to do this, and by doing so we can arrive at artificial general intelligence (AGI).</li>\n</ul>\n\n<p>Pearl attacks the notion of causality head on, which traditionally statistics has cowered away from. I\u2019ll avoid getting into the hairy details of the maths presented in the book, but Pearl notably takes the step of differentiating between the <em>do</em> operator, which explicitly encodes causation by forcing an event to occur, and the oft-seen conditional probability notation we\u2019re all familiar with \u2013 <em>doing</em> instead of merely <em>seeing</em>.</p>\n\n<p>In practice, however, to say that this is challenging would be an understatement. Counterfactuals inherently cannot be directly observed. Construction of effective controls representative of counterfactuals often requires knowledge of causative factors which isn\u2019t available (if you need perfect understanding of existing causal links to make new ones, where do you begin!?) or is restricted by data availability.</p>\n\n<p>So, in reality, if you wish to truly predict inflation from fundamentals, you would first need to create the universe from scratch (just as you would were you to bake an apple pie from scratch\u2026), track the deterministic behaviour of every elementary particle, and find a way to correct for quantum mechanical fluctuations.</p>\n\n<p>As with everything, we settle for an adequate level of abstraction. 
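Pearl's seeing/doing distinction can be made concrete with a toy worked example of my own (the numbers and variable names are invented, not from the book): a confounder Z drives both X and Y, X has no causal effect on Y at all, and yet conditioning on X=1 suggests a strong association, while intervening with do(X=1) changes nothing.

```python
# Toy "seeing vs doing" calculation: my own illustration, not Pearl's.
# A confounder Z drives both X and Y; X has no causal effect on Y.
P_Z1 = 0.5                       # P(Z = 1)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}  # P(X = 1 | Z): Z pushes X up
P_Y1_GIVEN_Z = {0: 0.1, 1: 0.9}  # P(Y = 1 | Z): Y depends only on Z, never on X

def p_z(z):
    return P_Z1 if z == 1 else 1.0 - P_Z1

# Seeing: P(Y=1 | X=1) = sum_z P(Y=1 | z) * P(z | X=1), with P(z | X=1) via Bayes.
p_x1 = sum(p_z(z) * P_X1_GIVEN_Z[z] for z in (0, 1))
seeing = sum(P_Y1_GIVEN_Z[z] * p_z(z) * P_X1_GIVEN_Z[z] / p_x1 for z in (0, 1))

# Doing: P(Y=1 | do(X=1)) = sum_z P(z) * P(Y=1 | z); do(X) severs the Z -> X arrow.
doing = sum(p_z(z) * P_Y1_GIVEN_Z[z] for z in (0, 1))

print(seeing)  # roughly 0.74: observing X=1 makes Y=1 look much more likely
print(doing)   # roughly 0.5: forcing X=1 changes nothing, X is causally inert here
```

The gap between the two numbers is entirely the confounder's doing, which is exactly the kind of trap conditional probability alone cannot warn you about.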
The level of abstraction will limit the bounds around our answer, but also require us to process only a manageable quantity of information in reaching that answer.</p>\n\n<p>I defer to macroeconomic examples because of both their current relevance and their immense (but often unseen) consequences for our individual lives.</p>\n\n<p>Pearl tends to arrive at the conclusion that assessing associations within data alone \u2013 that is, staying on the first rung of association \u2013 is insufficient for causal analysis. And yet, many would argue that large language models (LLMs) are capable of some degree of causal comprehension. Have they then climbed up these rungs without us noticing? Pearl himself has stated in recent interviews that what he didn\u2019t account for was the possibility that the data that models are trained on may subtly contain causal relationships without them being explicitly coded in, as occurs with text in the case of LLMs.</p>\n\n<p>If you\u2019re wondering whether LLMs may be the first step towards true causal inference machines: both Pearl and I would push back on this being anywhere near a certainty. Traditional statistical models are not only up to the task of being the forerunners of causal inference but remain much more explainable than their neural network counterparts.</p>\n\n<p>I can\u2019t say I agree with everything in Pearl\u2019s book. What I am quite sure of, however, is that a combination of these causality-informed approaches, traditional statistics, and cutting-edge deep learning approaches holds the keys to making it all the way up our ladder of causation.</p>\n\n<p>The further we get into this, the larger the temptation grows to just say \u2018screw causality, I\u2019m happy with correlation\u2019. 
What I agree with Pearl on most strongly is that science and statistics should not shy away from causality because it is tough to explain, but should tackle it head on for that very same reason, especially with the technology that we are fortunate enough to have in today\u2019s world.</p>\n\n<p>In order to know what levers to pull or push at the policy level to optimise economic well-being while enhancing sustainability and health outcomes, we need the most sophisticated causal inference machine ever created, and we need policymakers to listen to it.</p>\n\n<p>For a more applied look into working with causality in research contexts, I highly enjoyed reading <a href=\"https://mixtape.scunning.com/01-introduction\">Causal Inference: The Mixtape</a> by Scott Cunningham, which builds up to the intuition behind difference-in-differences and synthetic control approaches, and discusses how these are actually applied in a variety of contexts.</p>\n\n\n\n\n<p>Separately, I was fortunate to be appointed as the de facto interviewer for those coming out of the <a href=\"https://www.linkedin.com/posts/university-of-cambridge_shapingaiforeveryone-ai-cambridgeuniversity-activity-7318287515705118720-PYjT?utm_source=li_share&utm_content=feedcontent&utm_medium=g_dt_web&utm_campaign=copy\">VR supercomputer zone of the Cambridge festival</a>.</p>",
+2
-2
onkar/metadata.json
+2
-2
pf341/metadata.json
+18
pf341/rest-of-monthly-2025-06_.json
···+"summary": "<p>I have been away on holidays. Some of that was spent hiking in the Massif de Calanques in Marseille; unfortunately, this week <a href=\"https://www.theguardian.com/world/2025/jul/08/marseille-airport-cancels-all-flights-as-wildfire-encroaches-on-city\">they were on fire</a>. Anyway, this \"weekly\" fills in the blanks for the rest of June 2025.</p>\n <p>\n <img alt=\"A Calanque just outside of Marseille, a canyon-like structure of calcified stone on the edge of the Mediterranean.\" src=\"/bafkrmigalfmuwbf6l6lpj5wbkmxz25qrwtoe536sjogt42rhagtorfyu54.jpg\" width=\"320\">\n </p>\n \n\n \n\n <h2>Revisiting <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a></h2>\n \n <p>I spent some time thinking and writing about <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a> and the benefits of OCaml as a source language and Hazel as a target language. The stability of OCaml with a powerful type-system makes it very expressive.</p>\n <p>Hazel, on the other hand, is a hotbed of PL theory research. The typed holes, which are fully supported within the language, are particularly powerful targets for a transpiler. They allow transpiler developers to incrementally support the language whilst still getting full feedback on their generated source code. Whilst it is a niche use-case, I thought it was an interesting point to note.</p>\n \n \n\n \n\n <h2>A second year report</h2>\n \n <p>I wrote my second year report at the end of June. 
The process was a bit chaotic, but I think ultimately it was useful for clarifying some ideas I had rattling about in my head.</p>\n <p>Part of that was working on my <a href=\"https://patrick.sirref.org/publications/\">publications</a> page and, through a conversation with <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a>, realising that my work on <a href=\"https://patrick.sirref.org/graft/\">Graft</a> was in many ways a part of my own research.</p>\n <p>If you are interested, feel free to peruse both my <a href=\"/bafkrmigsvxp4qr3tltethz6oznlgxkkx2dwjjkovjvhemi3k3ljpcydvu4.pdf\">first year</a> and <a href=\"/bafkrmifhsh5b6mgzdomyfz6harcm2hrzxzxtupije7mqjiwxotupmz4bgy.pdf\">second year</a> reports. Take them with a pinch of salt! I think it is useful to be fairly open with this sort of thing when you can be. Maybe someone will find them useful.</p>\n \n\n \n\n <h3>What is <em>scientific programming</em>?</h3>\n \n <p>This question arose whilst writing my second year report. It got me thinking about how we distinguish scientific programming from other kinds of programming. I came across a paper by Tim Storer: <a href=\"https://patrick.sirref.org/storer2017bridging/\">Bridging the chasm: A survey of Software Engineering Practice in Scientific Programming</a>. I enjoyed a few aspects of this paper.</p>\n <ul>\n <li>\n <p>Relating scientific programming to the scientific method so directly was a pretty useful insight for me. <em>Falsifiable hypotheses</em>, <em>repeatable experiments</em> and <em>reproducible results</em> when applied as characteristics of the computer stack (e.g. OS, programming language, filesystem etc.) 
might be a useful lens to argue for better tools to enable this.</p>\n </li>\n <li>\n <p>The breadth of Storer's case-study analysis on how scientific programming has \"gone wrong\" is very good; if anyone needs a reference to draw on to help support their tools or research, this seems like a good index to use.</p>\n </li>\n </ul>\n \n \n \n\n \n\n <h2>Ppxlib</h2>\n \n <p>I was reminded again of the pain of helping maintain <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a>. We are coming to the conclusion that, in its current form, it is becoming unmaintainable. I took a stab at updating <a href=\"https://github.com/aantron/bisect_ppx/pull/448\">bisect_ppx</a> to the latest ppxlib. With OCaml entering a potentially turbulent parsetree era, it might be time to take stock of this and propose some fillers to help. Nathan Rebours and I are meeting next week to discuss ideas he has to make parsetree migrations smooth sailing for all!</p>\n \n \n\n \n\n <h2>Geocaml</h2>\n \n <p>Work on <a href=\"https://patrick.sirref.org/ocaml-tiff/\">ocaml-tiff</a> has stalled, which is totally fine. <a href=\"https://patrick.sirref.org/mdales/\">Michael</a> and I independently are working on other things and have obligations to fulfill. Maybe some day we'll carve out enough time together to push forward on this project, but not this week.</p>",+"content": "<p>I have been away on holidays. Some of that was spent hiking in the Massif de Calanques in Marseille; unfortunately, this week <a href=\"https://www.theguardian.com/world/2025/jul/08/marseille-airport-cancels-all-flights-as-wildfire-encroaches-on-city\">they were on fire</a>. 
Anyway, this \"weekly\" fills in the blanks for the rest of June 2025.</p>\n <p>\n <img alt=\"A Calanque just outside of Marseille, a canyon-like structure of calcified stone on the edge of the Mediterranean.\" src=\"/bafkrmigalfmuwbf6l6lpj5wbkmxz25qrwtoe536sjogt42rhagtorfyu54.jpg\" width=\"320\">\n </p>\n \n\n \n\n <h2>Revisiting <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a></h2>\n \n <p>I spent some time thinking and writing about <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a> and the benefits of OCaml as a source language and Hazel as a target language. The stability of OCaml with a powerful type-system makes it very expressive.</p>\n <p>Hazel, on the other hand, is a hotbed of PL theory research. The typed holes, which are fully supported within the language, are particularly powerful targets for a transpiler. They allow transpiler developers to incrementally support the language whilst still getting full feedback on their generated source code. Whilst it is a niche use-case, I thought it was an interesting point to note.</p>\n \n \n\n \n\n <h2>A second year report</h2>\n \n <p>I wrote my second year report at the end of June. The process was a bit chaotic, but I think ultimately it was useful for clarifying some ideas I had rattling about in my head.</p>\n <p>Part of that was working on my <a href=\"https://patrick.sirref.org/publications/\">publications</a> page and, through a conversation with <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a>, realising that my work on <a href=\"https://patrick.sirref.org/graft/\">Graft</a> was in many ways a part of my own research.</p>\n <p>If you are interested, feel free to peruse both my <a href=\"/bafkrmigsvxp4qr3tltethz6oznlgxkkx2dwjjkovjvhemi3k3ljpcydvu4.pdf\">first year</a> and <a href=\"/bafkrmifhsh5b6mgzdomyfz6harcm2hrzxzxtupije7mqjiwxotupmz4bgy.pdf\">second year</a> reports. Take them with a pinch of salt! 
I think it is useful to be fairly open with this sort of thing when you can be. Maybe someone will find them useful.</p>\n \n\n \n\n <h3>What is <em>scientific programming</em>?</h3>\n \n <p>This question arose whilst writing my second year report. It got me thinking about how we distinguish scientific programming from other kinds of programming. I came across a paper by Tim Storer: <a href=\"https://patrick.sirref.org/storer2017bridging/\">Bridging the chasm: A survey of Software Engineering Practice in Scientific Programming</a>. I enjoyed a few aspects of this paper.</p>\n <ul>\n <li>\n <p>Relating scientific programming to the scientific method so directly was a pretty useful insight for me. <em>Falsifiable hypotheses</em>, <em>repeatable experiments</em> and <em>reproducible results</em> when applied as characteristics of the computer stack (e.g. OS, programming language, filesystem etc.) might be a useful lens to argue for better tools to enable this.</p>\n </li>\n <li>\n <p>The breadth of Storer's case-study analysis on how scientific programming has \"gone wrong\" is very good; if anyone needs a reference to draw on to help support their tools or research, this seems like a good index to use.</p>\n </li>\n </ul>\n \n \n \n\n \n\n <h2>Ppxlib</h2>\n \n <p>I was reminded again of the pain of helping maintain <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a>. We are coming to the conclusion that, in its current form, it is becoming unmaintainable. I took a stab at updating <a href=\"https://github.com/aantron/bisect_ppx/pull/448\">bisect_ppx</a> to the latest ppxlib. With OCaml entering a potentially turbulent parsetree era, it might be time to take stock of this and propose some fillers to help. 
Nathan Rebours and I are meeting next week to discuss ideas he has to make parsetree migrations smooth sailing for all!</p>\n \n \n\n \n\n <h2>Geocaml</h2>\n \n <p>Work on <a href=\"https://patrick.sirref.org/ocaml-tiff/\">ocaml-tiff</a> has stalled, which is totally fine. <a href=\"https://patrick.sirref.org/mdales/\">Michael</a> and I independently are working on other things and have obligations to fulfill. Maybe some day we'll carve out enough time together to push forward on this project, but not this week.</p>",
+14
pf341/weekly-2025-01-20_.json
···+"summary": "<p>This week was my first full week back from the break and I found it challenging trying to get back into what I had been working on previously.</p>\n <p>\n <strong>ICFP Papers</strong>\n </p>\n <p>In conversation with <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a>, we looked at options for submitting a paper to ICFP. I wrote up some notes on some options <a href=\"https://patrick.sirref.org/icfp25-ideas/\">we discussed</a>.</p>\n <p>\n <strong>Forester</strong>\n </p>\n <p>I spent some time this week converting this website to using <a href=\"https://www.jonmsterling.com/foreign-forester-jms-005P.xml\">Forester</a>. I'm not a huge fan of the syntax, especially as a lot of my site's content was already in markdown. So I wrote a markdown frontend to forester which is <a href=\"https://github.com/patricoferris/ocaml-forester/tree/markdown\">available on Github</a>.</p>\n <p>The markdown frontend integrates very nicely and only a few changes were needed in the core logic of <a href=\"https://patrick.sirref.org/forester/\">Forester</a> itself. Additionally, for any features not directly supported in markdown there is an escape hatch using code blocks such as:</p>\n <pre>```forester\n\\put\\transclude/numbered{false}\n\n\\transclude{pxf-1000}\n```</pre>\n <p>Personally, I'm still getting to grips with the <em>bottom-up</em> approach to building this site, atomically creating notes and reference cards that then are linked in many places.</p>\n <p>I'm excited to see how I can integrate some of the Forester concepts into \"Shark\".</p>\n <p>\n <strong>OCaml</strong>\n </p>\n <p>In the OCaml world, I spent time on <code>ppxlib</code>, reviewing PRs to update the lower bounds of the library, fixing effect syntax problems and bumping internal AST to 5.2. I've also spent some time looking into de-objecting Eio, making the OCaml types more friendly to new users. 
I need to revive my port of Vpnkit to Eio for the thoughts on <a href=\"https://patrick.sirref.org/icfp25-ideas/\">ICFP 2025</a> too.</p>\n <p>I also want to modernise and make more public my OCaml code for creating little shells in OCaml too -- I think the ideas here really have legs and would like to find a conference to submit them to.</p>\n <p>I also met with the single OCaml Outreachy intern working on <a href=\"https://github.com/ocaml-semver/ocaml-api-watch\">ocaml-api-watch</a>.</p>\n <p>\n <strong>Misc.</strong>\n </p>\n <p>I met with most of my <a href=\"https://patrick.sirref.org/part-ii-2024/\">Part II</a> students this week, which was nice to catch up and see how their projects are going. I also started marking work for my first year students who are at the <em>induction</em> part of their <a href=\"https://patrick.sirref.org/discrete-maths/\">Discrete Maths</a> course.</p>",+"content": "<p>This week was my first full week back from the break and I found it challenging trying to get back into what I had been working on previously.</p>\n <p>\n <strong>ICFP Papers</strong>\n </p>\n <p>In conversation with <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a>, we looked at options for submitting a paper to ICFP. I wrote up some notes on some options <a href=\"https://patrick.sirref.org/icfp25-ideas/\">we discussed</a>.</p>\n <p>\n <strong>Forester</strong>\n </p>\n <p>I spent some time this week converting this website to using <a href=\"https://www.jonmsterling.com/foreign-forester-jms-005P.xml\">Forester</a>. I'm not a huge fan of the syntax, especially as a lot of my site's content was already in markdown. So I wrote a markdown frontend to forester which is <a href=\"https://github.com/patricoferris/ocaml-forester/tree/markdown\">available on Github</a>.</p>\n <p>The markdown frontend integrates very nicely and only a few changes were needed in the core logic of <a href=\"https://patrick.sirref.org/forester/\">Forester</a> itself. 
Additionally, for any features not directly supported in markdown there is an escape hatch using code blocks such as:</p>\n <pre>```forester\n\\put\\transclude/numbered{false}\n\n\\transclude{pxf-1000}\n```</pre>\n <p>Personally, I'm still getting to grips with the <em>bottom-up</em> approach to building this site, atomically creating notes and reference cards that then are linked in many places.</p>\n <p>I'm excited to see how I can integrate some of the Forester concepts into \"Shark\".</p>\n <p>\n <strong>OCaml</strong>\n </p>\n <p>In the OCaml world, I spent time on <code>ppxlib</code>, reviewing PRs to update the lower bounds of the library, fixing effect syntax problems and bumping internal AST to 5.2. I've also spent some time looking into de-objecting Eio, making the OCaml types more friendly to new users. I need to revive my port of Vpnkit to Eio for the thoughts on <a href=\"https://patrick.sirref.org/icfp25-ideas/\">ICFP 2025</a> too.</p>\n <p>I also want to modernise and make more public my OCaml code for creating little shells in OCaml too -- I think the ideas here really have legs and would like to find a conference to submit them to.</p>\n <p>I also met with the single OCaml Outreachy intern working on <a href=\"https://github.com/ocaml-semver/ocaml-api-watch\">ocaml-api-watch</a>.</p>\n <p>\n <strong>Misc.</strong>\n </p>\n <p>I met with most of my <a href=\"https://patrick.sirref.org/part-ii-2024/\">Part II</a> students this week, which was nice to catch up and see how their projects are going. I also started marking work for my first year students who are at the <em>induction</em> part of their <a href=\"https://patrick.sirref.org/discrete-maths/\">Discrete Maths</a> course.</p>",
+14
pf341/weekly-2025-01-27_.json
···+"summary": "<p>\n <strong>AT Protocol</strong>\n </p>\n <p>This week I've been diving into the <a href=\"https://atproto.com/\">AT Protocol</a>.</p>\n <blockquote>\n <p>The Authenticated Transfer Protocol, aka atproto, is a decentralized protocol for large-scale social web applications.</p>\n </blockquote>\n <p>The protocol could be a candidate for the glue that holds together a distributed, computational wiki network. The protocol, it seems, is very similar to <a href=\"https://patrick.sirref.org/ipfs/\">IPFS</a>. Thankfully, a few years ago, I was working on building out a suite of OCaml libraries for working with <a href=\"https://patrick.sirref.org/ipfs/\">IPFS</a>. For example, <a href=\"https://github.com/patricoferris/ocaml-cid\">ocaml-cid</a>, for self-describing content-addressed identifiers.</p>\n <pre> <code># let s = \"zb2rhe5P4gXftAwvA4eXQ5HJwsER2owDyS9sKaQRRVQPn93bA\";;\nval s : string = \"zb2rhe5P4gXftAwvA4eXQ5HJwsER2owDyS9sKaQRRVQPn93bA\"\n# let cid = Cid.of_string s |> Result.get_ok;;\nval cid : Cid.t = &lt;abstr&gt;\n# Cid.pp_human Format.std_formatter cid;;\ncidv1 - base58btc - raw - ident(sha2-256) length(32) digest(6e 6f f7 95 0a 36 18 7a 80 16 13 42 6e 85 8d ce\n68 6c d7 d7 e3 c0 fc 42 ee 03 30 07 2d 24 5c 95\n</code>\n </pre>\n <p>To this end I have built out some more OCaml libraries for working with atproto, including:</p>\n <ul>\n <li>\n <p><a href=\"https://github.com/patricoferris/ocaml-atproto-data\">atproto-data</a>: the atproto data model, similar to JSON-LD.</p>\n </li>\n <li>\n <p><a href=\"https://github.com/patricoferris/ocaml-did\">ocaml-did</a>: an OCaml library for working with decentralized identifiers.</p>\n </li>\n <li>\n <p><a href=\"https://github.com/patricoferris/ocaml-atproto-lexicon\">atproto-lexicon</a>: atproto's schema format; I've been building a quick tool for doing an OCaml translation from these schemas.</p>\n </li>\n </ul>\n <p>I managed to get <a 
href=\"https://bsky.app/profile/patrick.sirref.org/post/3lh24rrjngw24\">a post published from the OCaml library</a> after fixing it up and porting it to <a href=\"https://patrick.sirref.org/eio/\">Eio</a>.</p>\n <p>\n <strong>An IR for Wikis</strong>\n </p>\n <p>I started working on a proof-of-concept intermediate representation for Wikis -- I imagine it a bit like <a href=\"https://github.com/stedolan/malfunction\">malfunction</a> but for computational wikis i.e. a target for Wiki building tools that allows different front-ends and servers to communicate in a common IR for exposing key functionalities of a wiki:</p>\n <ul>\n <li>\n <p>Links: External links, cross-wiki backlinks</p>\n </li>\n <li>\n <p>Versioned, temporal feeds</p>\n </li>\n <li>\n <p>Etc.</p>\n </li>\n </ul>\n <p>\n <strong>Other PhD Work</strong>\n </p>\n <p>I met with most of my <a href=\"https://patrick.sirref.org/part-ii/\">Part II</a> students this week, and I'm excited about their work. Progress reports are due this week and next they have a presentation to give.</p>\n <p>In <a href=\"https://patrick.sirref.org/discrete-maths/\">discrete maths</a> this week we did induction. 
Next up is a big section on sets, functions, bijections etc.</p>\n <p>\n <strong>Misc.</strong>\n </p>\n <p>I was happy to find the <a href=\"https://www.opentech.fund/fellowships/icfp/\">Information Controls Fellowship Program</a>.</p>\n <blockquote>\n <p>The Information Controls Fellowship Program (ICFP) cultivates research, outputs, and creative collaboration on topics related to repressive internet censorship and surveillance.</p>\n </blockquote>",+"content": "<p>\n <strong>AT Protocol</strong>\n </p>\n <p>This week I've been diving into the <a href=\"https://atproto.com/\">AT Protocol</a>.</p>\n <blockquote>\n <p>The Authenticated Transfer Protocol, aka atproto, is a decentralized protocol for large-scale social web applications.</p>\n </blockquote>\n <p>The protocol could be a candidate for the glue that holds together a distributed, computational wiki network. The protocol, it seems, is very similar to <a href=\"https://patrick.sirref.org/ipfs/\">IPFS</a>. Thankfully, a few years ago, I was working on building out a suite of OCaml libraries for working with <a href=\"https://patrick.sirref.org/ipfs/\">IPFS</a>. 
For example, <a href=\"https://github.com/patricoferris/ocaml-cid\">ocaml-cid</a>, self-describing content-addressed identifiers.</p>\n <pre> <code><span>#</span><span> </span><span>let</span><span> </span><span>s</span><span> </span><span>=</span><span> </span><span>\"</span><span>zb2rhe5P4gXftAwvA4eXQ5HJwsER2owDyS9sKaQRRVQPn93bA</span><span>\"</span><span>;</span><span>;</span><span>\n</span>\n<span>val</span><span> </span><span>s</span><span> </span><span>:</span><span> </span><span>string</span><span> </span><span>=</span><span> </span><span>\"</span><span>zb2rhe5P4gXftAwvA4eXQ5HJwsER2owDyS9sKaQRRVQPn93bA</span><span>\"</span><span>\n</span>\n<span>#</span><span> </span><span>let</span><span> </span><span>cid</span><span> </span><span>=</span><span> </span><span>Cid</span><span>.</span><span>of_string</span><span> </span><span>s</span><span> </span><span>|></span><span> </span><span>Result</span><span>.</span><span>get_ok</span><span>;</span><span>;</span><span>\n</span>\n<span>val</span><span> </span><span>cid</span><span> </span><span>:</span><span> </span><span>Cid</span><span>.</span><span>t</span><span> </span><span>=</span><span> </span><span><</span><span>abstr</span><span>></span><span>\n</span>\n<span>#</span><span> </span><span>Cid</span><span>.</span><span>pp_human</span><span> </span><span>Format</span><span>.</span><span>std_formatter</span><span> </span><span>cid</span><span>;</span><span>;</span><span>\n</span>\n<span>cidv1</span><span> </span><span>-</span><span> </span><span>base58btc</span><span> </span><span>-</span><span> </span><span>raw</span><span> </span><span>-</span><span> </span><span>ident</span><span>( </span><span>sha2</span><span>-</span><span>256</span><span>) </span><span> </span><span>length</span><span>( </span><span>32</span><span>) </span><span> </span><span>digest</span><span>( </span><span>6e 6f </span><span>f7</span><span> </span><span>95</span><span> 0a </span><span>36</span><span> </span><span>18</span><span> 7a 
</span><span>80</span><span> </span><span>16</span><span> </span><span>13</span><span> </span><span>42</span><span> 6e </span><span>85</span><span> 8d </span><span>ce</span><span>\n</span>\n<span> </span><span>68</span><span> 6c </span><span>d7</span><span> </span><span>d7</span><span> </span><span>e3</span><span> </span><span>c0</span><span> </span><span>fc</span><span> </span><span>42</span><span> </span><span>ee</span><span> </span><span>03</span><span> </span><span>30</span><span> </span><span>07</span><span> 2d </span><span>24</span><span> 5c </span><span>95</span><span>\n</span></code>\n </pre>\n <p>To this end I have built out some more OCaml libraries for working with atproto, including:</p>\n <ul>\n <li>\n <p><a href=\"https://github.com/patricoferris/ocaml-atproto-data\">atproto-data</a>: the atproto data model, similar to JSON-LD.</p>\n </li>\n <li>\n <p><a href=\"https://github.com/patricoferris/ocaml-did\">ocaml-did</a>: an OCaml library for working with decentralized identifiers.</p>\n </li>\n <li>\n <p><a href=\"https://github.com/patricoferris/ocaml-atproto-lexicon\">atproto-lexicon</a>: atproto's schema format, I've been building a quick tool for doing an OCaml translation from these schemas.</p>\n </li>\n </ul>\n <p>I managed to get <a href=\"https://bsky.app/profile/patrick.sirref.org/post/3lh24rrjngw24\">a post published from the OCaml library</a> after fixing it up and porting it to <a href=\"https://patrick.sirref.org/eio/\">Eio</a>.</p>\n <p>\n <strong>An IR for Wikis</strong>\n </p>\n <p>I started working on a proof-of-concept intermediate representation for Wikis -- I imagine it a bit like <a href=\"https://github.com/stedolan/malfunction\">malfunction</a> but for computational wikis i.e. 
a target for Wiki building tools that allows different front-ends and servers to communicate in a common IR for exposing key functionalities of a wiki:</p>\n <ul>\n <li>\n <p>Links: External links, cross-wiki backlinks</p>\n </li>\n <li>\n <p>Versioned, temporal feeds</p>\n </li>\n <li>\n <p>Etc.</p>\n </li>\n </ul>\n <p>\n <strong>Other PhD Work</strong>\n </p>\n <p>I met with most of my <a href=\"https://patrick.sirref.org/part-ii/\">Part II</a> students this week, and I'm excited about their work. Progress reports are due this week and next they have a presentation to give.</p>\n <p>In <a href=\"https://patrick.sirref.org/discrete-maths/\">discrete maths</a> this week we did induction. Next up is a big section on sets, functions, bijections etc.</p>\n <p>\n <strong>Misc.</strong>\n </p>\n <p>I was happy to find the <a href=\"https://www.opentech.fund/fellowships/icfp/\">Information Controls Fellowship Program</a>.</p>\n <blockquote>\n <p>The Information Controls Fellowship Program (ICFP) cultivates research, outputs, and creative collaboration on topics related to repressive internet censorship and surveillance.</p>\n </blockquote>",
+14
pf341/weekly-2025-02-10_.json
+14
pf341/weekly-2025-02-10_.json
···+"summary": "<p>On paper, I don't have that many students. I teach four undergraduates (first year students at Pembroke College) <a href=\"https://patrick.sirref.org/discrete-maths/\">Discrete Maths</a>. I supervise three third year students for their <a href=\"https://patrick.sirref.org/part-ii-2024/\">final year project</a> and another one I co-supervise with <a href=\"https://patrick.sirref.org/mdales/\">Michael Dales</a>. However, I do end up spending at least two full days a week on teaching. Something I really enjoy and take seriously. The time it takes is also quite unpredictable; last week for instance all the third year students had their mid-project demonstrations (a five-minute presentation in front of their peers and a few professor-types). My first year students also found two slides particularly challenging to understand from their lectures and asked if I could help explain what was going on, so I <a href=\"https://patrick.sirref.org/dm-note.pdf\">produced some materials for that</a>.</p>\n <p>\n <strong>OCaml</strong>\n </p>\n <p>Within the OCaml universe I spent a good bit of time trying to maintain a couple of different packages:</p>\n <ol>\n <li>\n <p>Mirage Crypto: this library provides cryptographic primitives for OCaml programs. Unfortunately, the maintainers removed direct support for Eio, replacing it with a \"Unix\" alternative. This is not a fair swap as now Eio programs must take a dependency on Unix! Speaking to <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a>, the best approach might be to make Eio programs handle this directly. 
I think this highlighted how fragmented open-source maintenance can be as no user of Eio seems to have bumped into this yet and the upstream maintainers did not communicate a rather large breaking change.</p>\n </li>\n <li>\n <p>Ppxlib: In addition to the <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/514\">5.2 AST bump</a> (which is nearly ready to be merged), I queued up a <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/558\">5.3 AST bump</a> right behind it. I plan to write up a more detailed post about the challenges of maintaining this part of ppxlib.</p>\n </li>\n </ol>\n <p>\n <strong>Paris</strong>\n </p>\n <p>I spent some time in Paris at the end of the week. I enjoyed visiting the Mus\u00e9e d'Orsay and in particular their collection of impressionist paintings. Here's my favourite from that visit by Camille Pissarro.</p>\n \n\n <img alt=\"Woman in an Orchard (Spring Sunshine in the Meadow at Eragny).\" src=\"pissarro.jpeg\" width=\"400\">\n \nWoman in an Orchard (Spring Sunshine in the Meadow at Eragny)",+"content": "<p>On paper, I don't have that many students. I teach four undergraduates (first year students at Pembroke College) <a href=\"https://patrick.sirref.org/discrete-maths/\">Discrete Maths</a>. I supervise three third year students for their <a href=\"https://patrick.sirref.org/part-ii-2024/\">final year project</a> and another one I co-supervise with <a href=\"https://patrick.sirref.org/mdales/\">Michael Dales</a>. However, I do end up spending at least two full days a week on teaching. Something I really enjoy and take seriously. The time it takes is also quite unpredictable; last week for instance all the third year students had their mid-project demonstrations (a five-minute presentation in front of their peers and a few professor-types). 
My first year students also found two slides particularly challenging to understand from their lectures and asked if I could help explain what was going on, so I <a href=\"https://patrick.sirref.org/dm-note.pdf\">produced some materials for that</a>.</p>\n <p>\n <strong>OCaml</strong>\n </p>\n <p>Within the OCaml universe I spent a good bit of time trying to maintain a couple of different packages:</p>\n <ol>\n <li>\n <p>Mirage Crypto: this library provides cryptographic primitives for OCaml programs. Unfortunately, the maintainers removed direct support for Eio, replacing it with a \"Unix\" alternative. This is not a fair swap as now Eio programs must take a dependency on Unix! Speaking to <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a>, the best approach might be to make Eio programs handle this directly. I think this highlighted how fragmented open-source maintenance can be, as no user of Eio seems to have bumped into this yet and the upstream maintainers did not communicate a rather large breaking change.</p>\n </li>\n <li>\n <p>Ppxlib: In addition to the <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/514\">5.2 AST bump</a> (which is nearly ready to be merged), I queued up a <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/558\">5.3 AST bump</a> right behind it. I plan to write up a more detailed post about the challenges of maintaining this part of ppxlib.</p>\n </li>\n </ol>\n <p>\n <strong>Paris</strong>\n </p>\n <p>I spent some time in Paris at the end of the week. I enjoyed visiting the Mus\u00e9e d'Orsay and in particular their collection of impressionist paintings. Here's my favourite from that visit by Camille Pissarro.</p>\n \n\n <img alt=\"Woman in an Orchard (Spring Sunshine in the Meadow at Eragny).\" src=\"pissarro.jpeg\" width=\"400\">\n \nWoman in an Orchard (Spring Sunshine in the Meadow at Eragny)",
+14
pf341/weekly-2025-02-17_.json
+14
pf341/weekly-2025-02-17_.json
···+"summary": "<p>Previous <a href=\"https://patrick.sirref.org/weeklies/\">weeklies</a> used <strong>strong</strong> emphasis to distinguish sections. This comes from <a href=\"https://patrick.sirref.org/forester/\">Forester</a>'s philosophy about atomicity of the content in your <em>forest</em>.</p>\n <p>However, <em>subtrees</em> are supported! I quickly hacked together the ability to use <em>subheadings</em> to indicate <em>subtrees</em>. This is strictly less expressive than the <code>\\subtree{}</code> of <a href=\"https://patrick.sirref.org/forester/\">Forester</a>'s default syntax as we cannot <em>close</em> heading sections in Markdown.</p>\n <p>This weekly uses subtrees.</p>\n \n\n \n\n <h2>Vpnkit</h2>\n \n <p>I spent some time this week trying to upgrade vpnkit to OCaml 5. I was originally working on <a href=\"https://patrick.sirref.org/vpnkit-er/\">a paper idea</a> which might need benchmarks, but <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a> and I decided we could simply point to the port I did and show how it has simplified much of the code.</p>\n \n \n\n \n\n <h2>Void Processes</h2>\n \n <p>Work continued on implementing (and fully exploring) <a href=\"https://patrick.sirref.org/void-process/\">void processes</a>. A lot of the groundwork already exists in <a href=\"https://blog.hillion.co.uk/posts/void-processes/dissertation/jsh77-dissertation.pdf\">Jake Hillion's master's thesis</a>.</p>\n <p>This week I added a feature that I needed to help build the processes we need for the shell I'm building: mount points with modes!</p>\n <p>In addition to the root mount (taken care of with <a href=\"https://man7.org/linux/man-pages/man2/pivot_root.2.html\">pivot_root</a>), we need to be able to add additional mounts into the process' environment. These can now be added. 
All mount points can be mounted <code>readonly</code> or <code>readwrite</code>.</p>\n <p>Here is the \"Hello, World!\" example (the <code>/say/hey</code> program has been statically compiled using <code>musl-gcc</code>).</p>\n <pre> <code><span>let</span><span> </span><span>status</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>void</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>empty</span><span> \n</span>\n<span> </span><span>|></span><span> </span><span>mount</span><span> ~</span><span>mode</span><span>:</span><span>R</span><span> ~</span><span>src</span><span>:</span><span>hey_dir</span><span> ~</span><span>tgt</span><span>:</span><span>\"</span><span>say</span><span>\"</span><span>\n</span>\n<span> </span><span>|></span><span> </span><span>exec</span><span> </span><span>[ </span><span> </span><span>\"</span><span>/say/hey</span><span>\"</span><span> </span><span>] </span><span>\n</span>\n<span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>t</span><span> </span><span>=</span><span> </span><span>Void</span><span>.</span><span>spawn</span><span> ~</span><span>sw</span><span> </span><span>void</span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>Promise</span><span>.</span><span>await</span><span> </span><span>( </span><span>Void</span><span>.</span><span>exit_status</span><span> </span><span>t</span><span>) </span><span>\n</span></code>\n </pre>\n <p>There really is nothing else in there. Without specifying a <code>root</code> mount, the void process is started with an empty <code>tmpfs</code> root. Next on the list is networking!</p>\n \n \n\n \n\n <h2>LSP Servers</h2>\n \n <p>I got a little side-tracked building a library for writing <a href=\"https://microsoft.github.io/language-server-protocol/\">LSP</a> servers in OCaml: <a href=\"https://patrick.sirref.org/mlsp/\">mlsp</a>. 
This may seem a little unrelated, but it isn't. The LSP has become the de facto standard for communicating between an editor and a programming language environment. If you have used VSCode to write a program in Python, chances are you are using <a href=\"https://marketplace.visualstudio.com/items?itemName=ms-python.python\">the official extension</a> which gives you linting, formatting, code navigation etc. All of these features are communicating using the LSP.</p>\n <p>It seems <a href=\"https://github.com/FurqanSoftware/codemirror-languageserver\">Code Mirror</a> can already proxy over a websocket for LSP support too (we might not even need that as we can compile OCaml directly to JavaScript/Webassembly and have the whole thing running locally!).</p>\n \n \n\n \n\n <h2>Open-Source & Community</h2>\n \n <p><a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a> and I had a great conversation this week about building community especially as it pertains to open-source and OCaml.</p>\n <p>I've been going back over <a href=\"https://patrick.sirref.org/ostrom-gtc/\">Governing the Commons</a>, but have already discovered <a href=\"https://patrick.sirref.org/franklin-rwt/\">The Real World of Technology</a>!</p>\n <p>More thoughts on all of this soon... maybe</p>",+"content": "<p>Previous <a href=\"https://patrick.sirref.org/weeklies/\">weeklies</a> used <strong>strong</strong> emphasis to distinguish sections. This comes from <a href=\"https://patrick.sirref.org/forester/\">Forester</a>'s philosophy about atomicity of the content in your <em>forest</em>.</p>\n <p>However, <em>subtrees</em> are supported! I quickly hacked together the ability to use <em>subheadings</em> to indicate <em>subtrees</em>. 
This is strictly less expressive than the <code>\\subtree{}</code> of <a href=\"https://patrick.sirref.org/forester/\">Forester</a>'s default syntax as we cannot <em>close</em> heading sections in Markdown.</p>\n <p>This weekly uses subtrees.</p>\n \n\n \n\n <h2>Vpnkit</h2>\n \n <p>I spent some time this week trying to upgrade vpnkit to OCaml 5. I was originally working on <a href=\"https://patrick.sirref.org/vpnkit-er/\">a paper idea</a> which might need benchmarks, but <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a> and I decided we could simply point to the port I did and show how it has simplified much of the code.</p>\n \n \n\n \n\n <h2>Void Processes</h2>\n \n <p>Work continued on implementing (and fully exploring) <a href=\"https://patrick.sirref.org/void-process/\">void processes</a>. A lot of the groundwork already exists in <a href=\"https://blog.hillion.co.uk/posts/void-processes/dissertation/jsh77-dissertation.pdf\">Jake Hillion's master's thesis</a>.</p>\n <p>This week I added a feature that I needed to help build the processes we need for the shell I'm building: mount points with modes!</p>\n <p>In addition to the root mount (taken care of with <a href=\"https://man7.org/linux/man-pages/man2/pivot_root.2.html\">pivot_root</a>), we need to be able to add additional mounts into the process' environment. These can now be added. 
All mount points can be mounted <code>readonly</code> or <code>readwrite</code>.</p>\n <p>Here is the \"Hello, World!\" example (the <code>/say/hey</code> program has been statically compiled using <code>musl-gcc</code>).</p>\n <pre> <code><span>let</span><span> </span><span>status</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>void</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>empty</span><span> \n</span>\n<span> </span><span>|></span><span> </span><span>mount</span><span> ~</span><span>mode</span><span>:</span><span>R</span><span> ~</span><span>src</span><span>:</span><span>hey_dir</span><span> ~</span><span>tgt</span><span>:</span><span>\"</span><span>say</span><span>\"</span><span>\n</span>\n<span> </span><span>|></span><span> </span><span>exec</span><span> </span><span>[ </span><span> </span><span>\"</span><span>/say/hey</span><span>\"</span><span> </span><span>] </span><span>\n</span>\n<span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>t</span><span> </span><span>=</span><span> </span><span>Void</span><span>.</span><span>spawn</span><span> ~</span><span>sw</span><span> </span><span>void</span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>Promise</span><span>.</span><span>await</span><span> </span><span>( </span><span>Void</span><span>.</span><span>exit_status</span><span> </span><span>t</span><span>) </span><span>\n</span></code>\n </pre>\n <p>There really is nothing else in there. Without specifying a <code>root</code> mount, the void process is started with an empty <code>tmpfs</code> root. Next on the list is networking!</p>\n \n \n\n \n\n <h2>LSP Servers</h2>\n \n <p>I got a little side-tracked building a library for writing <a href=\"https://microsoft.github.io/language-server-protocol/\">LSP</a> servers in OCaml: <a href=\"https://patrick.sirref.org/mlsp/\">mlsp</a>. 
This may seem a little unrelated, but it isn't. The LSP has become the de facto standard for communicating between an editor and a programming language environment. If you have used VSCode to write a program in Python, chances are you are using <a href=\"https://marketplace.visualstudio.com/items?itemName=ms-python.python\">the official extension</a> which gives you linting, formatting, code navigation etc. All of these features are communicating using the LSP.</p>\n <p>It seems <a href=\"https://github.com/FurqanSoftware/codemirror-languageserver\">Code Mirror</a> can already proxy over a websocket for LSP support too (we might not even need that as we can compile OCaml directly to JavaScript/Webassembly and have the whole thing running locally!).</p>\n \n \n\n \n\n <h2>Open-Source & Community</h2>\n \n <p><a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a> and I had a great conversation this week about building community especially as it pertains to open-source and OCaml.</p>\n <p>I've been going back over <a href=\"https://patrick.sirref.org/ostrom-gtc/\">Governing the Commons</a>, but have already discovered <a href=\"https://patrick.sirref.org/franklin-rwt/\">The Real World of Technology</a>!</p>\n <p>More thoughts on all of this soon... maybe</p>",
+18
pf341/weekly-2025-03-31_.json
+18
pf341/weekly-2025-03-31_.json
···+"summary": "<p>Last week I focused on <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> -- our idea that shells should have the same ability as reproducible build tools like Nix or Docker. To this end I now have a fairly fleshed-out prototype.</p>\n \n\n \n\n <h2>Shelter Prototype</h2>\n \n <p>Shelter is a spin-off from the work <a href=\"https://patrick.sirref.org/mdales/\">Michael</a> and I started with <a href=\"https://github.com/quantifyearth/shark\">Shark</a>. It takes the same ideas but applies them directly to a shell-like interface.</p>\n <p>We're still in the middle of working all of this out, but you can read more about it at <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a>.</p>\n \n \n\n \n\n <h2>Forester Hacking</h2>\n \n <p>As you can probably tell, my website is still using <a href=\"https://patrick.sirref.org/forester/\">Forester</a>. I rebased my Markdown branch to include the new Atom syndication feature.</p>\n <p>Alongside that I added support for arbitrary HTML injection into Forester via codeblocks in Markdown. This was actually very straightforward thanks to <a href=\"https://ocaml.org/p/markup\">Markup</a> and being able to re-parse Forester syntax in the middle of converting a Markdown document. The HTML for the shell in <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> uses this feature.</p>\n <p>If you are interested in taking this custom Forester for a spin, there's <a href=\"https://github.com/patricoferris/ocaml-forester/tree/5-dev-md\">a branch on GitHub</a>. In fact, nearly the only change beyond letting the core engine know about markdown files is <a href=\"https://github.com/patricoferris/ocaml-forester/blob/5-dev-md/lib/compiler/Parse_md.ml\">adding a new parser frontend</a>.</p>\n \n \n\n \n\n <h2>Hazel</h2>\n \n <p>For one of my <a href=\"https://patrick.sirref.org/part-ii-2024/\">Part II</a> students, I've been prototyping a transpiler from OCaml to Hazel. 
This has gone pretty well and now supports type annotations as well as straightforward implementation translation.</p>\n <p>Consider the following OCaml <code>map</code> function.</p>\n <pre> <code><span>let</span><span> </span><span>rec </span><span>map</span><span> </span><span>f</span><span> </span><span>=</span><span> </span><span>function</span><span>\n</span>\n<span> </span><span>|</span><span> </span><span>[] </span><span> </span><span>-></span><span> </span><span>[] </span><span>\n</span>\n<span> </span><span>|</span><span> </span><span>x</span><span> </span><span>:</span><span>:</span><span> </span><span>xs</span><span> </span><span>-></span><span> </span><span>( </span><span>f</span><span> </span><span>x</span><span>) </span><span> </span><span>:</span><span>:</span><span> </span><span>map</span><span> </span><span>f</span><span> </span><span>xs</span><span>\n</span></code>\n </pre>\n <p>The tool, <a href=\"https://github.com/patricoferris/hazel_of_ocaml\"><code>hazel_of_ocaml</code></a>, can translate this to Hazel code, including making the polymorphism explicit.</p>\n <pre>let map : forall a -> forall b -> (a -> b) -> [a] -> [b] \n = typfun a -> typfun b -> fun f -> fun x1 -> case x1\n | [] => []\n | x :: xs => f(x) :: map(f)(xs)\nend in ?</pre>\n <p>You can copy and paste that codeblock into the <a href=\"https://hazel.org/build/dev/\">hazel playground</a>. But do note that it still needs some manual editing to add the type applications in directly ( <code>map@<a>@<b></code>). With the right amount of type-inferencing and scoping I actually think that you could place those type applications in yourself. This could make a nice Part II project I think.</p>\n \n \n\n \n\n <h2>Ppxlib</h2>\n \n <p>I recently wrote about the painful experience of <a href=\"https://patrick.sirref.org/ppxlib-5-2/\">migrating ppxlib to the 5.2 OCaml AST</a>. 
This week, Nathan Rebours and I merged a PR to add the <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/558\">5.3 AST</a>! Ppxlib has been playing catch-up with the compiler and we decided it was best to try to catch up quickly and deal with the ecosystem fallout all at once rather than incrementally. With this new AST merged, ppx authors can now use the new <code>Pexp_effect</code> parsetree node. I'll write a little more about this in a separate post soon.</p>",+"content": "<p>Last week I focused on <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> -- our idea that shells should have the same ability as reproducible build tools like Nix or Docker. To this end I now have a fairly fleshed-out prototype.</p>\n \n\n \n\n <h2>Shelter Prototype</h2>\n \n <p>Shelter is a spin-off from the work <a href=\"https://patrick.sirref.org/mdales/\">Michael</a> and I started with <a href=\"https://github.com/quantifyearth/shark\">Shark</a>. It takes the same ideas but applies them directly to a shell-like interface.</p>\n <p>We're still in the middle of working all of this out, but you can read more about it at <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a>.</p>\n \n \n\n \n\n <h2>Forester Hacking</h2>\n \n <p>As you can probably tell, my website is still using <a href=\"https://patrick.sirref.org/forester/\">Forester</a>. I rebased my Markdown branch to include the new Atom syndication feature.</p>\n <p>Alongside that I added support for arbitrary HTML injection into Forester via codeblocks in Markdown. This was actually very straightforward thanks to <a href=\"https://ocaml.org/p/markup\">Markup</a> and being able to re-parse Forester syntax in the middle of converting a Markdown document. 
The HTML for the shell in <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> uses this feature.</p>\n <p>If you are interested in taking this custom Forester for a spin, there's <a href=\"https://github.com/patricoferris/ocaml-forester/tree/5-dev-md\">a branch on GitHub</a>. In fact, nearly the only change beyond letting the core engine know about markdown files is <a href=\"https://github.com/patricoferris/ocaml-forester/blob/5-dev-md/lib/compiler/Parse_md.ml\">adding a new parser frontend</a>.</p>\n \n \n\n \n\n <h2>Hazel</h2>\n \n <p>For one of my <a href=\"https://patrick.sirref.org/part-ii-2024/\">Part II</a> students, I've been prototyping a transpiler from OCaml to Hazel. This has gone pretty well and now supports type annotations as well as straightforward implementation translation.</p>\n <p>Consider the following OCaml <code>map</code> function.</p>\n <pre> <code><span>let</span><span> </span><span>rec </span><span>map</span><span> </span><span>f</span><span> </span><span>=</span><span> </span><span>function</span><span>\n</span>\n<span> </span><span>|</span><span> </span><span>[] </span><span> </span><span>-></span><span> </span><span>[] </span><span>\n</span>\n<span> </span><span>|</span><span> </span><span>x</span><span> </span><span>:</span><span>:</span><span> </span><span>xs</span><span> </span><span>-></span><span> </span><span>( </span><span>f</span><span> </span><span>x</span><span>) </span><span> </span><span>:</span><span>:</span><span> </span><span>map</span><span> </span><span>f</span><span> </span><span>xs</span><span>\n</span></code>\n </pre>\n <p>The tool, <a href=\"https://github.com/patricoferris/hazel_of_ocaml\"><code>hazel_of_ocaml</code></a>, can translate this to Hazel code, including making the polymorphism explicit.</p>\n <pre>let map : forall a -> forall b -> (a -> b) -> [a] -> [b] \n = typfun a -> typfun b -> fun f -> fun x1 -> case x1\n | [] => []\n | x :: xs => f(x) :: map(f)(xs)\nend in ?</pre>\n <p>You can copy and 
paste that code block into the <a href=\"https://hazel.org/build/dev/\">hazel playground</a>. But do note that it still needs some manual editing to add the type applications in directly ( <code>map@<a>@<b></code>). With the right amount of type inference and scoping, I actually think those type applications could be placed in automatically. I think this could make a nice Part II project.</p>\n \n \n\n \n\n <h2>Ppxlib</h2>\n \n <p>I recently wrote about the painful experience of <a href=\"https://patrick.sirref.org/ppxlib-5-2/\">migrating ppxlib to the 5.2 OCaml AST</a>. This week, Nathan Rebours and I merged a PR to add the <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/558\">5.3 AST</a>! Ppxlib has been playing catch-up with the compiler and we decided it was best to try to catch up quickly and deal with the ecosystem fallout all at once rather than incrementally. With this new AST merged, ppx authors can now use the new <code>Pexp_effect</code> parsetree node. I'll write a little more about this in a separate post soon.</p>",
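For reference, the `map` example from the post above can be restated as plain OCaml source (the transpiler's input); this is just the snippet rendered in the post, cleaned up and runnable:

```ocaml
(* The OCaml input to hazel_of_ocaml from the post above: a recursive
   map over lists, with the polymorphism left implicit. The transpiler
   makes the polymorphism explicit on the Hazel side (typfun a -> ...). *)
let rec map f = function
  | [] -> []
  | x :: xs -> f x :: map f xs

let () =
  (* e.g. incrementing every element of a list *)
  assert (map (fun x -> x + 1) [ 1; 2; 3 ] = [ 2; 3; 4 ])
```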
+18
pf341/weekly-2025-04-14_.json
···+"summary": "<p>This is a bit of a catch-up post. A week or so in Belfast has thrown me a little off course with the weeklies.</p>\n \n\n \n\n <h2>Outreachy December 2024 and Beyond</h2>\n \n <p>In the last week or so, we have come to the end of the December 2024 round of <a href=\"https://outreachy.org/\">Outreachy</a>. For those who do not know about Outreachy, it facilitates and organises internships in open-source. Internships for those who are historically underrepresented and impacted by systematic bias in tech.</p>\n <p>We, the OCaml community, held our biannual <em>Demo Day</em> where our current cohort of interns can show off what they have been building for three months. This round, we only had one intern, <a href=\"https://github.com/azzsal\">Abdulaziz</a>, who worked on the <a href=\"https://github.com/ocaml-semver/ocaml-api-watch\">OCaml API diffing tool</a>.</p>\n <p>\n \n</p>\n <p>The next round has just completed the contribution phase and our next cohort of interns is being selected.</p>\n \n \n\n \n\n <h2>Ppxlib Supports OCaml 5.4 (mostly) </h2>\n \n <p>This week, I <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/570\">added support for OCaml 5.4</a> to <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a>. A rather non-trivial change to the codebase due to changes in the representation in <code>Longident</code>s (which now have location information for all segments of the <code>Longident</code>). 
OCaml 5.4 has <em>labelled tuples</em>, a light-weight record-like syntax similar to labelled arguments in functions.</p>\n <pre> <code><span>let</span><span> </span><span>x</span><span> </span><span>=</span><span> ~</span><span>x</span><span>:</span><span>1</span><span>,</span><span> </span><span>10</span><span>,</span><span> ~</span><span>y</span><span>:</span><span>1</span><span> \n</span>\n<span>val</span><span> </span><span>x</span><span> </span><span>:</span><span> </span><span>x</span><span>:</span><span>int</span><span> </span><span>*</span><span> </span><span>int</span><span> </span><span>*</span><span> </span><span>y</span><span>:</span><span>int</span><span>\n</span>\n<span>\n</span>\n<span>let</span><span> </span><span>add</span><span> </span><span>( </span><span>~</span><span>x</span><span>,</span><span> ~</span><span>y</span><span>) </span><span> </span><span>=</span><span> </span><span>x</span><span> </span><span>+</span><span> </span><span>y</span><span>\n</span>\n<span>val</span><span> </span><span>add</span><span> </span><span>:</span><span> </span><span>( </span><span>x</span><span>:</span><span>int</span><span> </span><span>*</span><span> </span><span>y</span><span>:</span><span>int</span><span>) </span><span> </span><span>-></span><span> </span><span>int</span><span>\n</span></code>\n </pre>\n <p>As of writing this post, we are waiting for the full OCaml 5.4 feature freeze and magic number bump. Note this is <em>support for OCaml 5.4</em> (in ppxlib terms, we now have migration functions for 5.3 <-> 5.4), not a bump to use the 5.4 parsetree in ppxlib.</p>\n <p>\n <strong>The Future of Ppxlib</strong>\n </p>\n <p>I have now been helping maintain <a href=\"https://patrick.sirref.org/ppxlib/\">Ppxlib</a> for a while and I am beginning to wonder about the long-term vision for the project. 
<a href=\"https://patrick.sirref.org/ppxlib/\">Ppxlib</a> is a central component to any modern OCaml library or tool, <a href=\"https://sherlocode.com/\">Sherlocode</a> reckons there's <em>18.3k</em> instances of <code>[@@deriving</code>. Yet, every compiler release is a painful, two-phase process for the maintainers. First, we have to support the new release of the compiler. This is mostly okay but does involve adding lots of code to Ppxlib (we need a whole copy of the Parsetree!). Later on down the line, we have to bump the AST and this often breaks many ppxes.</p>\n <p>The biggest issue is still the tension between abstract types and pattern-matching. By exposing the Parsetree directly to user's, they can pattern-match against the AST and build their ppxes in a rather straight forward way. However, this comes with the risk that the type may change thus breaking their code.</p>\n <p>I have mentioned <a href=\"https://patrick.sirref.org/ppxlib-5-2/\">before</a> about plans to use \"views\" to help with this, but I do not see that happening any time soon. It feels to me, we should be able to allow a user to <em>select</em> their AST to work with. How this interacts with modules like <code>Ast_builder</code> is unclear to me, but it would mean user's can remain on older ASTs even when <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a> bumps the internal, main AST.</p>\n \n \n\n \n\n <h2>Sherlorocq</h2>\n \n <p><a href=\"https://rocq-prover.org/\">Rocq</a> is a theorem prover (originally called <code>coq</code>) from many of the same folks who brought you OCaml. I have been interested in theorem provers for a while, more from an engineering perspective and less from a theoretical one. My good friend, <a href=\"https://patrick.sirref.org/dhsorens/\">Derek Sorenson</a>, taught me about Rocq when we <a href=\"https://github.com/dhsorens/mrdt\">encoded <em>mergeable replicated datatypes</em></a>. 
Later I had a go at encoding <a href=\"https://github.com/patricoferris/coq-difc\"><em>decentralised information flow control</em></a> (also in <a href=\"https://github.com/patricoferris/difc-star\">Fstar</a>).</p>\n <p>During this process, I found it quite challenging to find Rocq code that I could learn from. Compared to using, say, OCaml's <code>Stdlib</code>, using Rocq's standard library of datatypes and proofs did not come naturally (not helped by the fact that at the time I could not get the LSP to play ball).</p>\n <p>This led me to port <a href=\"https://patrick.sirref.org/artw/\">Arthur Wendling's</a> excellent <a href=\"https://sherlocode.com/\">Sherlocode</a> to do the same thing, only for Rocq code. Luckily, it uses <a href=\"https://swtch.com/~rsc/regexp/regexp4.html\">regular expression matching with a trigram index</a> so it is not tied to any particular programming language. Additionally, it is somewhat opam-centric (not actually that much), which is perfect for Rocq!</p>\n <p>Anyway, I've had some interest from the people at Inria to bring the server back to life. Their wish is my command!</p>\n <p><a href=\"https://sherlorocq.sirref.org/\">Sherlorocq</a> is back online!</p>",+"content": "<p>This is a bit of a catch-up post. A week or so in Belfast has thrown me a little off course with the weeklies.</p>\n \n\n \n\n <h2>Outreachy December 2024 and Beyond</h2>\n \n <p>In the last week or so, we have come to the end of the December 2024 round of <a href=\"https://outreachy.org/\">Outreachy</a>. For those who do not know about Outreachy, it facilitates and organises internships in open-source. Internships for those who are historically underrepresented and impacted by systematic bias in tech.</p>\n <p>We, the OCaml community, held our biannual <em>Demo Day</em> where our current cohort of interns can show off what they have been building for three months. 
This round, we only had one intern, <a href=\"https://github.com/azzsal\">Abdulaziz</a>, who worked on the <a href=\"https://github.com/ocaml-semver/ocaml-api-watch\">OCaml API diffing tool</a>.</p>\n <p>\n \n</p>\n <p>The next round has just completed the contribution phase and our next cohort of interns is being selected.</p>\n \n \n\n \n\n <h2>Ppxlib Supports OCaml 5.4 (mostly) </h2>\n \n <p>This week, I <a href=\"https://github.com/ocaml-ppx/ppxlib/pull/570\">added support for OCaml 5.4</a> to <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a>. A rather non-trivial change to the codebase due to changes in the representation in <code>Longident</code>s (which now have location information for all segments of the <code>Longident</code>). OCaml 5.4 has <em>labelled tuples</em>, a light-weight record-like syntax similar to labelled arguments in functions.</p>\n <pre> <code><span>let</span><span> </span><span>x</span><span> </span><span>=</span><span> ~</span><span>x</span><span>:</span><span>1</span><span>,</span><span> </span><span>10</span><span>,</span><span> ~</span><span>y</span><span>:</span><span>1</span><span> \n</span>\n<span>val</span><span> </span><span>x</span><span> </span><span>:</span><span> </span><span>x</span><span>:</span><span>int</span><span> </span><span>*</span><span> </span><span>int</span><span> </span><span>*</span><span> </span><span>y</span><span>:</span><span>int</span><span>\n</span>\n<span>\n</span>\n<span>let</span><span> </span><span>add</span><span> </span><span>( </span><span>~</span><span>x</span><span>,</span><span> ~</span><span>y</span><span>) </span><span> </span><span>=</span><span> </span><span>x</span><span> </span><span>+</span><span> </span><span>y</span><span>\n</span>\n<span>val</span><span> </span><span>add</span><span> </span><span>:</span><span> </span><span>( </span><span>x</span><span>:</span><span>int</span><span> </span><span>*</span><span> </span><span>y</span><span>:</span><span>int</span><span>) 
</span><span> </span><span>-></span><span> </span><span>int</span><span>\n</span></code>\n </pre>\n <p>As of writing this post, we are waiting for the full OCaml 5.4 feature freeze and magic number bump. Note this is <em>support for OCaml 5.4</em> (in ppxlib terms, we now have migration functions for 5.3 <-> 5.4), not a bump to use the 5.4 parsetree in ppxlib.</p>\n <p>\n <strong>The Future of Ppxlib</strong>\n </p>\n <p>I have now been helping maintain <a href=\"https://patrick.sirref.org/ppxlib/\">Ppxlib</a> for a while and I am beginning to wonder about the long-term vision for the project. <a href=\"https://patrick.sirref.org/ppxlib/\">Ppxlib</a> is a central component of any modern OCaml library or tool; <a href=\"https://sherlocode.com/\">Sherlocode</a> reckons there are <em>18.3k</em> instances of <code>[@@deriving</code>. Yet, every compiler release is a painful, two-phase process for the maintainers. First, we have to support the new release of the compiler. This is mostly okay but does involve adding lots of code to Ppxlib (we need a whole copy of the Parsetree!). Later on down the line, we have to bump the AST and this often breaks many ppxes.</p>\n <p>The biggest issue is still the tension between abstract types and pattern-matching. By exposing the Parsetree directly to users, they can pattern-match against the AST and build their ppxes in a rather straightforward way. However, this comes with the risk that the type may change, thus breaking their code.</p>\n <p>I have mentioned <a href=\"https://patrick.sirref.org/ppxlib-5-2/\">before</a> plans to use \"views\" to help with this, but I do not see that happening any time soon. It feels to me that we should be able to allow a user to <em>select</em> their AST to work with. 
How this interacts with modules like <code>Ast_builder</code> is unclear to me, but it would mean users can remain on older ASTs even when <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a> bumps the internal, main AST.</p>\n \n \n\n \n\n <h2>Sherlorocq</h2>\n \n <p><a href=\"https://rocq-prover.org/\">Rocq</a> is a theorem prover (originally called <code>coq</code>) from many of the same folks who brought you OCaml. I have been interested in theorem provers for a while, more from an engineering perspective and less from a theoretical one. My good friend, <a href=\"https://patrick.sirref.org/dhsorens/\">Derek Sorenson</a>, taught me about Rocq when we <a href=\"https://github.com/dhsorens/mrdt\">encoded <em>mergeable replicated datatypes</em></a>. Later I had a go at encoding <a href=\"https://github.com/patricoferris/coq-difc\"><em>decentralised information flow control</em></a> (also in <a href=\"https://github.com/patricoferris/difc-star\">Fstar</a>).</p>\n <p>During this process, I found it quite challenging to find Rocq code that I could learn from. Compared to using, say, OCaml's <code>Stdlib</code>, using Rocq's standard library of datatypes and proofs did not come naturally (not helped by the fact that at the time I could not get the LSP to play ball).</p>\n <p>This led me to port <a href=\"https://patrick.sirref.org/artw/\">Arthur Wendling's</a> excellent <a href=\"https://sherlocode.com/\">Sherlocode</a> to do the same thing, only for Rocq code. Luckily, it uses <a href=\"https://swtch.com/~rsc/regexp/regexp4.html\">regular expression matching with a trigram index</a> so it is not tied to any particular programming language. Additionally, it is somewhat opam-centric (not actually that much), which is perfect for Rocq!</p>\n <p>Anyway, I've had some interest from the people at Inria to bring the server back to life. Their wish is my command!</p>\n <p><a href=\"https://sherlorocq.sirref.org/\">Sherlorocq</a> is back online!</p>",
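Sherlocode's language-agnostic matching (which makes the Sherlorocq port possible) rests on trigram indexing: a document can only match a query if it contains every trigram of the query's literal text, so the index prunes candidates before any regex runs. A minimal sketch of that idea; the function names are illustrative, not from the actual Sherlocode codebase:

```ocaml
(* Illustrative sketch of trigram extraction, the core of the
   regex-with-trigram-index approach Sherlocode uses. *)
let trigrams (s : string) : string list =
  let n = String.length s in
  if n < 3 then []
  else List.init (n - 2) (fun i -> String.sub s i 3)

(* A document whose trigram set misses any query trigram can be
   skipped without running the full regular expression over it. *)
let may_match ~query ~doc =
  List.for_all (fun t -> List.mem t (trigrams doc)) (trigrams query)
```

Because the filter only looks at raw character trigrams, it works the same whether the indexed corpus is OCaml or Rocq source.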
+18
pf341/weekly-2025-04-21_.json
···+"summary": "<p>I spent much of this week working on Shelter and things related to it. Some of that time was also spent on Forester.</p>\n \n\n \n\n <h2>Forester</h2>\n \n <p>I added two important quality-of-life features to my frontend to Forester this week.</p>\n \n\n \n\n <h3>Bibtex Support</h3>\n \n <p>I had previously mentioned adding support to <a href=\"https://patrick.sirref.org/forester/\">Forester</a> for <a href=\"https://patrick.sirref.org/weekly-2025-01-20/\">Markdown</a>. This week I added support for Bibtex too. From any <code>*.bib</code> file in your forest, <a href=\"https://patrick.sirref.org/forester/\">Forester</a> will now dutifully recognise it as a Bibtex file and convert, as best it can, all the entries into <code>Reference</code> trees.</p>\n <p>I'm becoming quite convinced of this model at the moment. I'm using Forester's <code>Code.t</code> as a target representation. In fact, to ease the process (I really shouldn't spend <em>all my time</em> on my website), I have reused my <code>Yaml.t -> Code.t</code> and <code>Markdown.t -> Code.t</code> functions in the Bibtex parser.</p>\n <p>To see it in action, you could have a look at the <a href=\"https://patrick.sirref.org/mokhov-build-systems/\">Build Systems \u00e0 la Carte</a> paper, which is generated completely from Bibtex.</p>\n \n \n\n \n\n <h3>Full Heading Support</h3>\n \n <p>The eagle-eyed viewer may have noticed that the table of contents for this page has <em>more than one level</em>. 
I finally caved and spent an evening rejigging my <code>Cmarkit.Doc.t -> Tree</code> code, which was hacky and broken and is now less hacky and less broken.</p>\n <p>In addition, headings support links and emphasis etc.</p>\n \n \n\n \n\n <h3>Lunch with <a href=\"https://patrick.sirref.org/jonmsterling/\">Jon Sterling</a></h3>\n \n <p>I had a delightful lunch with <a href=\"https://patrick.sirref.org/jonmsterling/\">Jon Sterling</a> discussing the future of Forester, the nature of the Web (old and new) and the success of posting weekly updates for our colleagues. Thanks Jon.</p>\n \n \n \n\n \n\n <h2><a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> Fixes</h2>\n \n <p>I spent a good chunk of my week fixing bugs in <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> with the aim to perhaps set up a VM somewhere and let people kick the tyres of what we've got so far.</p>\n <p>The first bug is pretty annoying. At the moment, our filesystem backend is ZFS and we make heavy use of snapshots and cloning in order to provide time-travelling capabilities. Unfortunately, ZFS will take a snapshot before data has fully made it to disk (or whatever is the equivalent point it should reach in ZFS). Commands that generated lots of disk activity would be snapshotted in a half-finished state and this would cause all sorts of problems. Thanks to <a href=\"https://patrick.sirref.org/mtelvers/\">Mark Elvers</a> for the pointer to how OBuilder uses ZFS for the OCaml macOS builders, which unmounts datasets immediately, therefore inducing a <em>flush</em> of sorts. Shelter now follows a similar model with all of the slowdowns that brings. 
<a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a> and I discussed some amalgamation of overlayfs, tmpfs and ZFS to alleviate some of this but for now that's a premature optimisation.</p>\n \n\n \n\n <h3>A small eDSL for <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a></h3>\n \n <p>Whilst testing <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a>, I ended up wanting a way to programmatically invoke the different run commands. This is similar to say a Dockerfile, but maybe with a little more expressivity.</p>\n <p>This lead me to revisit the <a href=\"https://patrick.sirref.org/mokhov-build-systems/\">Build systems \u00e0 la Carte</a> paper and rediscover <a href=\"https://patrick.sirref.org/mokhov-selective-2019/\">selective applicative functors</a>.</p>\n <p>I started playing around with a selective applicative interface to Shelter, this would allow you to express your dependencies statically but select them dynamically (as the paper says).</p>\n <pre> <code><span>module</span><span> </span><span>D</span><span> </span><span>=</span><span> </span><span>Shl</span><span> </span><span>( </span><span>Identity</span><span>) </span><span>\n</span>\n<span>\n</span>\n<span>let</span><span> </span><span>shelterfile</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>open</span><span> </span><span>D</span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>base_image</span><span> </span><span>=</span><span> </span><span>from</span><span> </span><span>\"</span><span>alpine</span><span>\"</span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>is_node_lst</span><span> </span><span>img</span><span> </span><span>=</span><span> </span><span>String</span><span>.</span><span>equal</span><span> </span><span>\"</span><span>v22.15.0</span><span>\"</span><span> </span><span>( </span><span>stdout</span><span> 
</span><span>img</span><span>) </span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>cmds</span><span> </span><span>base</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>node_version</span><span> </span><span>=</span><span> </span><span>run</span><span> </span><span>\"</span><span>node --version</span><span>\"</span><span> </span><span>base</span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>Select</span><span>.</span><span>if'</span><span>\n</span>\n<span> </span><span>( </span><span>Select</span><span>.</span><span>map</span><span> ~</span><span>f</span><span>:</span><span>is_node_lst</span><span> </span><span>node_version</span><span>) </span><span>\n</span>\n<span> </span><span>( </span><span>run</span><span> </span><span>\"</span><span>node -e 'console.log('success!') </span><span>\"</span><span>) </span><span>\n</span>\n<span> </span><span>( </span><span>run</span><span> </span><span>\"</span><span>node -e 'console.log('failure!') </span><span>\"</span><span>) </span><span>\n</span>\n<span> </span><span>base</span><span>\n</span>\n<span> </span><span>in</span><span>\n</span>\n<span> </span><span>with_session</span><span> </span><span>\"</span><span>node</span><span>\"</span><span> </span><span>( </span><span>cmds</span><span> </span><span>base_image</span><span>) </span><span>\n</span></code>\n </pre>\n <p>From this, we get a slightly more expressive way to describe images.</p>\n \n \n \n\n \n\n <h2><a href=\"https://patrick.sirref.org/geocaml/\">Geocaml</a> TIFF Library</h2>\n \n <p>I was pleasantly surprised to receive a pull request from <a href=\"https://patrick.sirref.org/mdales/\">Michael</a> adding support to ocaml-tiff for reading TIFF files compressed using LZW. 
I was also surprised to hear the TIFF LZW is a little different to others.</p>\n <p>In trying to get this PR merged, I moved the initialisation of the Eio eventloop to outside each individual test case. This one change then completely broke the entire test suite. After a period of debugging and help from <a href=\"https://patrick.sirref.org/talex5/\">Thomas Leonard</a>, the root cause was OCaml's <code>OUnit2</code> library using process-level parallelism (via <code>Unix.fork</code>); sharing the ring between the parent and the child led to the issues.</p>\n <p><a href=\"https://github.com/ocaml-multicore/eio/issues/801\">Read more about that issue on the Eio issue tracker</a>.</p>\n \n \n\n \n\n <h2><a href=\"https://patrick.sirref.org/part-ii-2024/\">Part II</a> Students</h2>\n \n <p>As the new term begins, it signals that there are only just over two weeks for the final year undergrads at <a href=\"https://patrick.sirref.org/ucam/\">Cambridge</a> to submit their dissertations.</p>\n <p>The four students that I help supervise have been sending me drafts of their work (and a reminder that you can <a href=\"https://patrick.sirref.org/part-ii-2024/\">read about their projects</a>) and I'm very impressed. I'm sure the next two weeks will be stressful, but I'm proud of what they have accomplished over the past academic year.</p>",+"content": "<p>I spent much of this week working on Shelter and things related to it. Some of that time was also spent on Forester.</p>\n \n\n \n\n <h2>Forester</h2>\n \n <p>I added two important quality-of-life features to my frontend to Forester this week.</p>\n \n\n \n\n <h3>Bibtex Support</h3>\n \n <p>I had previously mentioned adding support to <a href=\"https://patrick.sirref.org/forester/\">Forester</a> for <a href=\"https://patrick.sirref.org/weekly-2025-01-20/\">Markdown</a>. This week I added support for Bibtex too. 
From any <code>*.bib</code> file in your forest, <a href=\"https://patrick.sirref.org/forester/\">Forester</a> will now dutifully recognise it as a Bibtex file and convert, as best it can, all the entries into <code>Reference</code> trees.</p>\n <p>I'm becoming quite convinced of this model at the moment. I'm using Forester's <code>Code.t</code> as a target representation. In fact, to ease the process (I really shouldn't spend <em>all my time</em> on my website), I have reused my <code>Yaml.t -> Code.t</code> and <code>Markdown.t -> Code.t</code> functions in the Bibtex parser.</p>\n <p>To see it in action, you could have a look at the <a href=\"https://patrick.sirref.org/mokhov-build-systems/\">Build Systems \u00e0 la Carte</a> paper, which is generated completely from Bibtex.</p>\n \n \n\n \n\n <h3>Full Heading Support</h3>\n \n <p>The eagle-eyed viewer may have noticed that the table of contents for this page has <em>more than one level</em>. I finally caved and spent an evening rejigging my <code>Cmarkit.Doc.t -> Tree</code> code, which was hacky and broken and is now less hacky and less broken.</p>\n <p>In addition, headings support links and emphasis etc.</p>\n \n \n\n \n\n <h3>Lunch with <a href=\"https://patrick.sirref.org/jonmsterling/\">Jon Sterling</a></h3>\n \n <p>I had a delightful lunch with <a href=\"https://patrick.sirref.org/jonmsterling/\">Jon Sterling</a> discussing the future of Forester, the nature of the Web (old and new) and the success of posting weekly updates for our colleagues. Thanks Jon.</p>\n \n \n \n\n \n\n <h2><a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> Fixes</h2>\n \n <p>I spent a good chunk of my week fixing bugs in <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a> with the aim to perhaps set up a VM somewhere and let people kick the tyres of what we've got so far.</p>\n <p>The first bug is pretty annoying. 
At the moment, our filesystem backend is ZFS and we make heavy use of snapshots and cloning in order to provide time-travelling capabilities. Unfortunately, ZFS will take a snapshot before data has fully made it to disk (or whatever is the equivalent point it should reach in ZFS). Commands that generated lots of disk activity would be snapshotted in a half-finished state and this would cause all sorts of problems. Thanks to <a href=\"https://patrick.sirref.org/mtelvers/\">Mark Elvers</a> for the pointer to how OBuilder uses ZFS for the OCaml macOS builders, which unmounts datasets immediately, therefore inducing a <em>flush</em> of sorts. Shelter now follows a similar model with all of the slowdowns that brings. <a href=\"https://patrick.sirref.org/anilmadhavapeddy/\">Anil</a> and I discussed some amalgamation of overlayfs, tmpfs and ZFS to alleviate some of this but for now that's a premature optimisation.</p>\n \n\n \n\n <h3>A small eDSL for <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a></h3>\n \n <p>Whilst testing <a href=\"https://patrick.sirref.org/shelter/\">Shelter</a>, I ended up wanting a way to programmatically invoke the different run commands. 
This is similar to say a Dockerfile, but maybe with a little more expressivity.</p>\n <p>This lead me to revisit the <a href=\"https://patrick.sirref.org/mokhov-build-systems/\">Build systems \u00e0 la Carte</a> paper and rediscover <a href=\"https://patrick.sirref.org/mokhov-selective-2019/\">selective applicative functors</a>.</p>\n <p>I started playing around with a selective applicative interface to Shelter, this would allow you to express your dependencies statically but select them dynamically (as the paper says).</p>\n <pre> <code><span>module</span><span> </span><span>D</span><span> </span><span>=</span><span> </span><span>Shl</span><span> </span><span>( </span><span>Identity</span><span>) </span><span>\n</span>\n<span>\n</span>\n<span>let</span><span> </span><span>shelterfile</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>open</span><span> </span><span>D</span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>base_image</span><span> </span><span>=</span><span> </span><span>from</span><span> </span><span>\"</span><span>alpine</span><span>\"</span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>is_node_lst</span><span> </span><span>img</span><span> </span><span>=</span><span> </span><span>String</span><span>.</span><span>equal</span><span> </span><span>\"</span><span>v22.15.0</span><span>\"</span><span> </span><span>( </span><span>stdout</span><span> </span><span>img</span><span>) </span><span> </span><span>in</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>cmds</span><span> </span><span>base</span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>let</span><span> </span><span>node_version</span><span> </span><span>=</span><span> </span><span>run</span><span> </span><span>\"</span><span>node --version</span><span>\"</span><span> </span><span>base</span><span> 
</span><span>in</span><span>\n</span>\n<span> </span><span>Select</span><span>.</span><span>if'</span><span>\n</span>\n<span> </span><span>( </span><span>Select</span><span>.</span><span>map</span><span> ~</span><span>f</span><span>:</span><span>is_node_lst</span><span> </span><span>node_version</span><span>) </span><span>\n</span>\n<span> </span><span>( </span><span>run</span><span> </span><span>\"</span><span>node -e 'console.log('success!') </span><span>\"</span><span>) </span><span>\n</span>\n<span> </span><span>( </span><span>run</span><span> </span><span>\"</span><span>node -e 'console.log('failure!') </span><span>\"</span><span>) </span><span>\n</span>\n<span> </span><span>base</span><span>\n</span>\n<span> </span><span>in</span><span>\n</span>\n<span> </span><span>with_session</span><span> </span><span>\"</span><span>node</span><span>\"</span><span> </span><span>( </span><span>cmds</span><span> </span><span>base_image</span><span>) </span><span>\n</span></code>\n </pre>\n <p>From this, we get a slightly more expressive way to describe images.</p>\n \n \n \n\n \n\n <h2><a href=\"https://patrick.sirref.org/geocaml/\">Geocaml</a> TIFF Library</h2>\n \n <p>I was pleasantly surprised to receive a pull request from <a href=\"https://patrick.sirref.org/mdales/\">Michael</a> adding support to ocaml-tiff for reading TIFF files compressed using LZW. I was also surprised to hear the TIFF LZW is a little different to others.</p>\n <p>In trying to get this PR merged, I moved the initialisation of the Eio eventloop to outside each individual test case. This one change then completely broke the entire test suite. 
After a period of debugging and help from <a href=\"https://patrick.sirref.org/talex5/\">Thomas Leonard</a>, the root cause was OCaml's <code>OUnit2</code> library using process-level parallelism (via <code>Unix.fork</code>); sharing the ring between the parent and the child led to the issues.</p>\n <p><a href=\"https://github.com/ocaml-multicore/eio/issues/801\">Read more about that issue on the Eio issue tracker</a>.</p>\n \n \n\n \n\n <h2><a href=\"https://patrick.sirref.org/part-ii-2024/\">Part II</a> Students</h2>\n \n <p>As the new term begins, it signals that there are only just over two weeks for the final year undergrads at <a href=\"https://patrick.sirref.org/ucam/\">Cambridge</a> to submit their dissertations.</p>\n <p>The four students that I help supervise have been sending me drafts of their work (and a reminder that you can <a href=\"https://patrick.sirref.org/part-ii-2024/\">read about their projects</a>) and I'm very impressed. I'm sure the next two weeks will be stressful, but I'm proud of what they have accomplished over the past academic year.</p>",
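The selective interface behind the Shelter eDSL above can be sketched over the Identity functor. This is my own minimal reconstruction of the `Select.if'` shape from the Mokhov et al. selective-functors paper, not code from Shelter itself; in Identity, `select` degenerates to ordinary branching, but the shape of the interface is what lets a build system see both branches statically while executing only one dynamically:

```ocaml
(* Minimal sketch of a selective applicative over the Identity functor,
   mirroring the Select.if' used in the Shelter eDSL example above. *)
module Identity = struct
  type 'a t = 'a

  let pure (x : 'a) : 'a t = x

  (* select: run the function only in the Left case; the Right case
     carries an already-computed result. *)
  let select (e : ('a, 'b) Either.t t) (f : ('a -> 'b) t) : 'b t =
    match e with Either.Left a -> f a | Either.Right b -> b

  (* if' declares both branches but selects one of them. *)
  let if' (c : bool t) (on_true : 'x t) (on_false : 'x t) : 'x t =
    select
      (if c then Either.Left () else Either.Right on_false)
      (fun () -> on_true)
end
```

A static analysis over a non-Identity instance could enumerate both branches of `if'` before running anything, which is exactly the "static dependencies, dynamic selection" property the post mentions.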
+18
pf341/weekly-2025-05-04_.json
···+"summary": "<p>I missed a week of posting last week, mainly because I spent more time writing <a href=\"https://patrick.sirref.org/posts/\">posts</a>.</p>\n \n\n \n\n <h2>Hazel of OCaml</h2>\n \n <p>I mentioned previously that I was building a tool to transpile OCaml code to Hazel. This work is now in a good enough state that one of my students and I have transpiled a good number of OCaml programs to help them write the evaluation for their third-year project.</p>\n <p>I wrote up a little summary of that work, which I've <a href=\"https://www.jonmsterling.com/foreign/www.forester-notes.org/jms-007L/index.xml\">transcluded</a> below.</p>\n \n\n \n\n <h3>A Transpiler from OCaml to Hazel</h3>\n \n <p>Over the past few months, I have been piecing together a transpiler from OCaml to <a href=\"https://patrick.sirref.org/hazel/\">Hazel</a>. This is, in part, to help one of my third-year undergraduate students who is working on <a href=\"https://patrick.sirref.org/part-ii-hazel/\">type error debugging in Hazel</a>.</p>\n \n\n \n\n <h4>Typed Holes</h4>\n \n <p><a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> is a <a href=\"https://patrick.sirref.org/omar-hazel-2017/\">functional programming language with typed holes</a>. Holes are pieces of your program that have not yet been filled in. Holes can appear anywhere in your program, both as expressions and types. Hazel can still evaluate your program in the presence of holes.</p>\n <p>To get a flavour of Hazel, take a regular map function for lists.</p>\n <pre>let map = fun f -> fun xs -> case xs\n | [] => []\n | x :: xs => f (x) :: map(f)(xs) \nend in\nmap(fun x -> ?)([1, 2, 3])</pre>\n <p>The question mark ( <code>?</code>) is a hole. The program evaluates to the following expression of type <code>[?]</code> (for people more familiar with OCaml types, <code>? list</code>).</p>\n <pre>[ ?, ?, ? 
]</pre>\n <p>Hazel supports <a href=\"https://patrick.sirref.org/zhao-typeerror-2024/\">local type inference</a> but nothing involving unification variables. For example, a simple <code>add_one</code> function in <a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> ( <code>fun x -> x + 1</code>) has type <code>? -> Int</code>.</p>\n \n \n\n \n\n <h4>From OCaml to Hazel</h4>\n \n <p>The ability to transpile OCaml programs to Hazel programs is motivated by one simple thought: there are more OCaml programs than there are Hazel programs. This could help bootstrap projects by alleviating the need to rewrite boilerplate code (e.g. URI parsing or standard library functions for strings).</p>\n \n\n \n\n <h5>A Transformation of Syntax</h5>\n \n <p>Hazel markets itself as an \"Elm/ML-like functional programming language\". From the previous example of <code>map</code>, it should be apparent just how close to OCaml the language is.</p>\n <p>It turns out that a majority of the transpiler is a <em>transformation of syntax</em>. 
Take a simple ADT for an arithmetic programming language.</p>\n <pre> <code>type expr =\n  | Float of float\n  | Add of expr * expr\n  | Sub of expr * expr\n  | Mul of expr * expr\n  | Div of expr * expr\n</code>\n </pre>\n <p>And when we run <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a> over this OCaml type declaration.</p>\n <pre>type expr =\n + Float(Float)\n + Add((expr, expr))\n + Sub((expr, expr))\n + Mul((expr, expr))\n + Div((expr, expr))\n in ?</pre>\n <p>Not much has changed except some syntax. <a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> does not have a notion of top-level expressions, so <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a> wraps the program into one set of value bindings. For the most part, Hazel acts as a subset of the pure, functional part of OCaml. 
At the time of writing, this subset is fairly limited with no support for modules or labelled records out of the box (there are plenty of development branches with these features).</p>\n <p>If we try out the same <code>map</code> function, but written in OCaml and transpiled to Hazel, we get.</p>\n <pre> <code>let rec map f = function\n  | [] -> []\n  | x :: xs -> f x :: map f xs\n</code>\n </pre>\n <p>Which becomes the following Hazel program.</p>\n <pre>let map = fun f -> fun x1 -> case x1\n | [] => []\n | x :: xs => f(x) :: map(f)(xs)\nend in ?</pre>\n <p>We could have a field day discussing the syntax of OCaml and Hazel (parentheses for function arguments, well-scoped cases for pattern-matching, a different arrow for pattern-matching, etc.). What would be more interesting is taking a look at how to handle polymorphism in Hazel.</p>\n \n \n\n \n\n <h5>Explicit Polymorphism</h5>\n \n <p>Hazel has <em>explicit polymorphism</em>. So far, we have not seen it as we have let the types have holes in them. 
The <code>map</code> function in OCaml has the following type.</p>\n <pre> <code>val map :\n  ('a -> 'b) -> 'a list -> 'b list\n</code>\n </pre>\n <p>We must remind ourselves (by reading <a href=\"https://www.craigfe.io/posts/polymorphic-type-constraints\">Craig's excellent blogpost on the matter</a>) that in OCaml</p>\n <blockquote>\n <p>... type variables in signatures are implicitly universally-quantified</p>\n </blockquote>\n <p>So in reality, we have that <code>map</code> has the following type.</p>\n <pre>val map : \u2200 a b. (a -> b) -> a list -> b list</pre>\n <p>In Hazel, we have to explicitly type our <code>map</code> function to be polymorphic. Not only does this mean the type annotation requires universally quantified type variables, but we must also perform type application wherever we choose to apply the <code>map</code> function (whether that be recursively or somewhere later in our program).</p>\n <pre>let map : forall a -> forall b -> (a -> b) -> [a] -> [b] =\n typfun a -> typfun b -> fun f -> fun xs -> case xs\n | [] => []\n | x :: xs => f (x) :: map@<a>@<b>(f)(xs) \nend in\nmap@<Int>@<Int>(fun x -> ?)([1, 2, 3])</pre>\n <p><code>forall</code> introduces a universally quantified type variable into our type annotation, and <code>typfun</code> introduces it into the function itself (\u00e0 la System F). Type application requires <code>@<T></code> where <code>T</code> is some type. 
This allows Hazel to support higher-rank polymorphism quite easily, but we will not worry too much about that.</p>\n \n \n\n \n\n <h5>Propagating OCaml Types into Hazel</h5>\n \n <p>Most often, OCaml users interact with <em>prenex</em> polymorphism (rank-1) where the universal quantifiers are at the front of the type. <a href=\"https://ocaml.org/manual/5.2/polymorphism.html#s:higher-rank-poly\">OCaml does support quantifiers inside certain types like records</a>.</p>\n <p>What this means for the transpiler is that we can <strong>reuse OCaml's type inference</strong> to safely instantiate the correct type annotations and type applications in Hazel! To do this, <code>hazel_of_ocaml</code> uses <a href=\"https://ocaml.github.io/merlin/\">Merlin</a> to inspect the type of the function in either a value binding or at the point of a function application.</p>\n <p>Take a simple, polymorphic <code>length</code> function.</p>\n <pre> <code>let rec length = function\n  | [] -> 0\n  | _ :: xs -> 1 + length xs\n\nlet int_len = length [ 1; 2; 3 ]\nlet str_len = length [ \"only\"; \"two\" ]\n</code>\n </pre>\n <p>When we run this through <code>hazel_of_ocaml</code> with the <code>-type</code> flag, we get.</p>\n <pre>let length : forall a -> [a] -> Int = typfun a -> fun x1 -> case x1\n | [] => 0\n | _ :: xs => 1 + length@<a>(xs)\nend in\nlet int_len : Int = length@<Int>(1 :: 2 :: [3]) in\nlet str_len : Int = length@<String>(\"only\" :: [\"two\"])\nin ?</pre>\n <p><code>hazel_of_ocaml</code> has correctly instantiated the type for <code>length</code> inside the recursive function and then in each case with the integer list and the string list.</p>\n \n \n \n\n \n\n <h4>A Corpus of Hazel Programs</h4>\n \n <p>The impetus for this work was to derive a corpus of ill-typed Hazel programs. Luckily, such a corpus exists for OCaml! <a href=\"https://patrick.sirref.org/ocaml-corpus/\">Seidel et al.</a> created a corpus of OCaml programs from their undergraduate students at UC San Diego. <a href=\"https://github.com/patricoferris/hazel-corpus\">Some of these programs have been transpiled to Hazel</a>.</p>\n \n \n\n \n\n <h4>Future Work</h4>\n \n <p><a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> is a fun, research programming language. Potential third-year students may find it interesting to take this work further. For example, how would this look in terms of a module system? From a purely engineering perspective, plenty of work would be needed to convert a multi-library OCaml project to Hazel (e.g. 
handling the <code>cmi</code> files).</p>\n <p>Another line of research would be to have Hazel target one of the intermediate representations in OCaml, which would give Hazel a fully functioning compiler to \"native\" code.</p>\n \n \n \n \n\n \n\n <h2>OxCaml</h2>\n \n <p>I spent some time this week getting more familiar with <a href=\"https://patrick.sirref.org/oxcaml-2024/\">Oxidized OCaml</a>. I have a habit of wrapping <em>new</em> OCaml tools and libraries into toplevel browser applications. For example, <a href=\"https://patricoferris.github.io/try-irmin\">try-irmin</a> and <a href=\"https://patricoferris.github.io/try-eio/\">try-eio</a>.</p>\n <p>Naturally, I tried to wrap OxCaml into a toplevel so people could play around with the new modes that are part of the OxCaml type system. This turned out to be a lengthy debugging session (where type declarations did not align, so the raw <code>Obj.repr</code> js_of_ocaml representation was broken for some parts of the toplevel). I do question the time-spent/value trade-off, but a mostly working toplevel with OxCaml is available at: <a href=\"https://patrick.sirref.org/oxcaml\">https://patrick.sirref.org/oxcaml</a>.</p>",+"content": "<p>I missed a week of posting last week, mainly because I spent more time writing <a href=\"https://patrick.sirref.org/posts/\">posts</a>.</p>\n \n\n \n\n <h2>Hazel of OCaml</h2>\n \n <p>I mentioned previously that I was building a tool to transpile OCaml code to Hazel. 
This work is now in a good enough state that, along with one of my students, we have transpiled a good number of OCaml programs to help them write their evaluation for their third-year project.</p>\n <p>I wrote up a little summary of that work, which I've <a href=\"https://www.jonmsterling.com/foreign/www.forester-notes.org/jms-007L/index.xml\">transcluded</a> below.</p>\n \n\n \n\n <h3>A Transpiler from OCaml to Hazel</h3>\n \n <p>Over the past few months, I have been piecing together a transpiler from OCaml to <a href=\"https://patrick.sirref.org/hazel/\">Hazel</a>. This is, in part, to help one of my third-year undergraduate students who is working on <a href=\"https://patrick.sirref.org/part-ii-hazel/\">type error debugging in Hazel</a>.</p>\n \n\n \n\n <h4>Typed Holes</h4>\n \n <p><a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> is a <a href=\"https://patrick.sirref.org/omar-hazel-2017/\">functional programming language with typed holes</a>. Holes are pieces of your program that have not yet been filled in. Holes can appear anywhere in your program, both as expressions and as types. Hazel can still evaluate your program in the presence of holes.</p>\n <p>To get a flavour of Hazel, take a regular map function for lists.</p>\n <pre>let map = fun f -> fun xs -> case xs\n | [] => []\n | x :: xs => f (x) :: map(f)(xs) \nend in\nmap(fun x -> ?)([1, 2, 3])</pre>\n <p>The question mark ( <code>?</code>) is a hole. The program evaluates to the following expression of type <code>[?]</code> (for people more familiar with OCaml types, <code>? list</code>).</p>\n <pre>[ ?, ?, ? ]</pre>\n <p>Hazel supports <a href=\"https://patrick.sirref.org/zhao-typeerror-2024/\">local type inference</a> but nothing involving unification variables. For example, a simple <code>add_one</code> function in <a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> ( <code>fun x -> x + 1</code>) has type <code>? 
-> Int</code>.</p>\n \n \n\n \n\n <h4>From OCaml to Hazel</h4>\n \n <p>The ability to transpile OCaml programs to Hazel programs is motivated by one simple thought: there are more OCaml programs than there are Hazel programs. This could help bootstrap projects by alleviating the need to rewrite boilerplate code (e.g. URI parsing or standard library functions for strings).</p>\n \n\n \n\n <h5>A Transformation of Syntax</h5>\n \n <p>Hazel markets itself as an \"Elm/ML-like functional programming language\". From the previous example of <code>map</code>, it should be apparent just how close to OCaml the language is.</p>\n <p>It turns out that a majority of the transpiler is a <em>transformation of syntax</em>. Take a simple ADT for an arithmetic programming language.</p>\n <pre> <code>type expr =\n  | Float of float\n  | Add of expr * expr\n  | Sub of expr * expr\n  | Mul of expr * expr\n  | Div of expr * expr\n</code>\n </pre>\n <p>And when we run <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a> over this OCaml type declaration.</p>\n <pre>type expr =\n + Float(Float)\n + Add((expr, expr))\n + Sub((expr, expr))\n + Mul((expr, expr))\n + Div((expr, expr))\n in ?</pre>\n <p>Not much has changed except some syntax. <a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> does not have a notion of top-level expressions, so <a href=\"https://patrick.sirref.org/hazel_of_ocaml/\">hazel_of_ocaml</a> wraps the program into one set of value bindings. For the most part, Hazel acts as a subset of the pure, functional part of OCaml. At the time of writing, this subset is fairly limited with no support for modules or labelled records out of the box (there are plenty of development branches with these features).</p>\n <p>If we try out the same <code>map</code> function, but written in OCaml and transpiled to Hazel, we get.</p>\n <pre> <code>let rec map f = function\n  | [] -> []\n  | x :: xs -> f x :: map f xs\n</code>\n </pre>\n <p>Which becomes the following Hazel program.</p>\n <pre>let map = fun f -> fun x1 -> case x1\n | [] => []\n | x :: xs => f(x) :: map(f)(xs)\nend in ?</pre>\n <p>We could have a field day discussing the syntax of OCaml and Hazel (parentheses for function arguments, well-scoped cases for pattern-matching, a different arrow for pattern-matching, etc.). What would be more interesting is taking a look at how to handle polymorphism in Hazel.</p>\n \n \n\n \n\n <h5>Explicit Polymorphism</h5>\n \n <p>Hazel has <em>explicit polymorphism</em>. 
So far, we have not seen it as we have let the types have holes in them. The <code>map</code> function in OCaml has the following type.</p>\n <pre> <code>val map :\n  ('a -> 'b) -> 'a list -> 'b list\n</code>\n </pre>\n <p>We must remind ourselves (by reading <a href=\"https://www.craigfe.io/posts/polymorphic-type-constraints\">Craig's excellent blogpost on the matter</a>) that in OCaml</p>\n <blockquote>\n <p>... type variables in signatures are implicitly universally-quantified</p>\n </blockquote>\n <p>So in reality, we have that <code>map</code> has the following type.</p>\n <pre>val map : \u2200 a b. (a -> b) -> a list -> b list</pre>\n <p>In Hazel, we have to explicitly type our <code>map</code> function to be polymorphic. Not only does this mean the type annotation requires universally quantified type variables, but we must also perform type application wherever we choose to apply the <code>map</code> function (whether that be recursively or somewhere later in our program).</p>\n <pre>let map : forall a -> forall b -> (a -> b) -> [a] -> [b] =\n typfun a -> typfun b -> fun f -> fun xs -> case xs\n | [] => []\n | x :: xs => f (x) :: map@<a>@<b>(f)(xs) \nend in\nmap@<Int>@<Int>(fun x -> ?)([1, 2, 3])</pre>\n <p><code>forall</code> introduces a universally quantified type variable into our type annotation, and <code>typfun</code> introduces it into the function itself (\u00e0 la System F). Type application requires <code>@<T></code> where <code>T</code> is some type. 
This allows Hazel to support higher-rank polymorphism quite easily, but we will not worry too much about that.</p>\n \n \n\n \n\n <h5>Propagating OCaml Types into Hazel</h5>\n \n <p>Most often, OCaml users interact with <em>prenex</em> polymorphism (rank-1) where the universal quantifiers are at the front of the type. <a href=\"https://ocaml.org/manual/5.2/polymorphism.html#s:higher-rank-poly\">OCaml does support quantifiers inside certain types like records</a>.</p>\n <p>What this means for the transpiler is that we can <strong>reuse OCaml's type inference</strong> to safely instantiate the correct type annotations and type applications in Hazel! To do this, <code>hazel_of_ocaml</code> uses <a href=\"https://ocaml.github.io/merlin/\">Merlin</a> to inspect the type of the function in either a value binding or at the point of a function application.</p>\n <p>Take a simple, polymorphic <code>length</code> function.</p>\n <pre> <code>let rec length = function\n  | [] -> 0\n  | _ :: xs -> 1 + length xs\n\nlet int_len = length [ 1; 2; 3 ]\nlet str_len = length [ \"only\"; \"two\" ]\n</code>\n </pre>\n <p>When we run this through <code>hazel_of_ocaml</code> with the <code>-type</code> flag, we get.</p>\n <pre>let length : forall a -> [a] -> Int = typfun a -> fun x1 -> case x1\n | [] => 0\n | _ :: xs => 1 + length@<a>(xs)\nend in\nlet int_len : Int = length@<Int>(1 :: 2 :: [3]) in\nlet str_len : Int = length@<String>(\"only\" :: [\"two\"])\nin ?</pre>\n <p><code>hazel_of_ocaml</code> has correctly instantiated the type for <code>length</code> inside the recursive function and then in each case with the integer list and the string list.</p>\n \n \n \n\n \n\n <h4>A Corpus of Hazel Programs</h4>\n \n <p>The impetus for this work was to derive a corpus of ill-typed Hazel programs. Luckily, such a corpus exists for OCaml! <a href=\"https://patrick.sirref.org/ocaml-corpus/\">Seidel et al.</a> created a corpus of OCaml programs from their undergraduate students at UC San Diego. <a href=\"https://github.com/patricoferris/hazel-corpus\">Some of these programs have been transpiled to Hazel</a>.</p>\n \n \n\n \n\n <h4>Future Work</h4>\n \n <p><a href=\"https://patrick.sirref.org/hazel/\">Hazel</a> is a fun, research programming language. Potential third-year students may find it interesting to take this work further. For example, how would this look in terms of a module system? From a purely engineering perspective, plenty of work would be needed to convert a multi-library OCaml project to Hazel (e.g. 
handling the <code>cmi</code> files).</p>\n <p>Another line of research would be to have Hazel target one of the intermediate representations in OCaml, which would give Hazel a fully functioning compiler to \"native\" code.</p>\n \n \n \n \n\n \n\n <h2>OxCaml</h2>\n \n <p>I spent some time this week getting more familiar with <a href=\"https://patrick.sirref.org/oxcaml-2024/\">Oxidized OCaml</a>. I have a habit of wrapping <em>new</em> OCaml tools and libraries into toplevel browser applications. For example, <a href=\"https://patricoferris.github.io/try-irmin\">try-irmin</a> and <a href=\"https://patricoferris.github.io/try-eio/\">try-eio</a>.</p>\n <p>Naturally, I tried to wrap OxCaml into a toplevel so people could play around with the new modes that are part of the OxCaml type system. This turned out to be a lengthy debugging session (where type declarations did not align, so the raw <code>Obj.repr</code> js_of_ocaml representation was broken for some parts of the toplevel). I do question the time-spent/value trade-off, but a mostly working toplevel with OxCaml is available at: <a href=\"https://patrick.sirref.org/oxcaml\">https://patrick.sirref.org/oxcaml</a>.</p>",
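As an aside on the explicit-polymorphism discussion above, OCaml itself lets you write the quantifier explicitly, which is handy when checking what the transpiler must emit. The sketch below is plain OCaml for illustration (not hazel_of_ocaml output): the `'a.` annotation is the explicitly universally-quantified form, and it is also what permits polymorphic recursion.

```ocaml
(* An explicitly universally-quantified annotation in OCaml: the ['a. ...]
   form mirrors the [forall a -> ...] annotations in the Hazel output.
   Plain OCaml for illustration, not hazel_of_ocaml output. *)
let rec length : 'a. 'a list -> int = function
  | [] -> 0
  | _ :: xs -> 1 + length xs

let () =
  (* the same [length], instantiated at int list and then string list *)
  assert (length [ 1; 2; 3 ] = 3);
  assert (length [ "only"; "two" ] = 2)
```

With the annotation, the compiler checks that the definition really is as general as the signature claims, which is exactly the information Merlin reports back to the transpiler.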
+18
pf341/weekly-2025-05-12_.json
···+"summary": "<p>This week, I feel I have been stuck fighting the OCaml ecosystem trying to keep my <a href=\"https://patrick.sirref.org/try-oxcaml/\">OxCaml work afloat</a>. Aside from that, <a href=\"https://patrick.sirref.org/ryangibb/\">Ryan</a> and I made some really nice progress with Shelter, culminating in <a href=\"https://patrick.sirref.org/ryangibb/\">Ryan</a> describing it as a <em>metashell</em>.</p>\n \n\n \n\n <h2>Shelter the metashell</h2>\n \n <p>The main progress this week with Shelter was composing <a href=\"https://github.com/opencontainers/runc\">runc</a>'s <code>terminal</code> mode with entering raw terminal input mode on the Shelter side. This is inspired by <a href=\"https://patrick.sirref.org/ryangibb/\">Ryan</a>'s own work on <a href=\"https://github.com/ryangibb/eon\">capability interfaces</a>.</p>\n <p>Shelter remains mostly intact, acting as an interactive shell. However, just before executing a command we switch to receiving and sending raw terminal inputs and outputs. This means tools like <code>vim</code> now work in Shelter! Not only that, but users can now <em>activate</em> an inner shell (e.g. <code>zsh</code>) and enjoy all the usual features of a fully-fledged shell (tab complete, fuzzy history search etc.) and upon exiting that shell, Shelter will snapshot the session. This lets you alter the granularity of snapshots from the command-line.</p>\n \n \n\n \n\n <h2>Louis Pouzin's \"Shell\"</h2>\n \n <p>I spent some time reading <a href=\"https://patrick.sirref.org/pouzin-shell-2013/\">part of the multics design documentation</a> this week. Louis Pouzin coined the term \"Shell\" in this document and I was reminded yet again just how important it is to be a good writer even as a \"computer science researcher\". 
For example, this excerpt from the requirements section of the document:</p>\n <blockquote>\n <p>The previous definitions imply that a command MUST be designed while keeping in mind the user, sitting at his console, wondering about what might be going on, mistyping or forgetting arguments, even if fully aware of the conventions, and possibly interfering with the command by hasty quits, carriage returns, and other temperamental reactions.</p>\n </blockquote>\n <p>And then later, when defining the \"SHELL\".</p>\n <blockquote>\n <p>We may envision a common procedure called automatically by the supervisor whenever a user types in some message at his console, at a time when he has no other process in active execution under console control (presently called command level). This procedure acts as an interface between console messages and subroutine. The purpose of such a procedure is to create a medium of exchange into which one could activate any procedure, <em>inside of another program if it were called</em>. Hereafter, for simplification, we shall refer to that procedure as the \"SHELL\".</p>\n </blockquote>\n <p>It still surprises me how little the undergraduate degree in computer science at <a href=\"https://patrick.sirref.org/ucam/\">Cambridge</a> focuses on writing skills.</p>\n \n \n\n \n\n <h2>OxCaml</h2>\n \n <p>Last week, I got a <a href=\"https://patrick.sirref.org/try-oxcaml/\">toplevel with OxCaml working</a>. This required a serious amount of work to understand the changes Janestreet have made to obscure parts of the OCaml compiler and then working those into tools like <code>js_of_ocaml</code>.</p>\n <p>This week, Janestreet pushed their latest rounds of changes and of course everything broke! I spent some more time fixing it all back up. I'm not entirely sure how maintainable this is. The problem is that, whilst things compile, the programs do not work together! 
Only when someone uses the program do the bugs surface.</p>\n \n \n\n \n\n <h2>Other OCaml Work</h2>\n \n <p>I worked on some other parts of the ecosystem this week.</p>\n \n\n \n\n <h3>Ppxlib</h3>\n \n <p>I helped review some changes to enable Janestreet to have ppx rewriters via attributes (usually they are via extension points). It is a bit of a controversial change to <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a> as we try to keep the API predictable for users:</p>\n <ol>\n <li>\n <p>Extension points are rewritten: this means the part that is rewritten is nicely delimited by the extension point's start and end.</p>\n </li>\n <li>\n <p>Attributes extend: attributes do not rewrite the code they are attached to but rather extend the code with new AST nodes.</p>\n </li>\n </ol>\n <p><a href=\"https://github.com/ocaml-ppx/ppxlib/pull/574\">We will see what we decide to do (most likely provide the functionality behind some kind of \"expert\" interface) </a>.</p>\n \n \n\n \n\n <h3>Tiff</h3>\n \n <p>See <a href=\"https://patrick.sirref.org/mdales/\">Michael</a>'s <a href=\"https://digitalflapjack.com/weeknotes/2025-05-19b/\">notes</a>.</p>\n <p>I spent some time trying to speed up the LZW decompression of TIFF files in the pure OCaml tiff library this week(end). The two big changes to help with this are pretty common when speeding up these parts of OCaml programs:</p>\n <ol>\n <li>\n <p>Allocate less</p>\n </li>\n <li>\n <p>Do less work</p>\n </li>\n </ol>\n <p>In terms of allocating less, the original implementation was using a <code>char list</code> to represent LZW strings. Manipulating these becomes quite costly, particularly since the most common operation is appending a single character to the end of a list. Converting this to use OCaml's immutable <code>string</code> saved a ton of allocations.</p>\n <p>In terms of doing less work, I opted to bypass <code>Cstruct</code>'s sane (but slow) bounds checks in some of the \"hotter\" parts of the code. 
In particular, LZW ends up reading potentially <em>huge</em> arrays full of bytes one-by-one. So reading each byte needs to be quite snappy. This is a bit of a trade-off in terms of \"safety\" but we are in control of this code so I'm not too worried about that.</p>\n <p>Here are some results decompressing a fairly large array of some elevation data.</p>\n <pre>before: | tiff/lzw/cea \u2502 523851.2259 mjw/run \u2502 3289761.5414 mnw/run \u2502 9806796.7121 ns/run\u2502\nafter: \u2502 tiff/lzw/cea \u2502 27846.2408 mjw/run \u2502 587928.7527 mnw/run \u2502 8457161.3761 ns/run\u2502</pre>",+"content": "<p>This week, I feel I have been stuck fighting the OCaml ecosystem trying to keep my <a href=\"https://patrick.sirref.org/try-oxcaml/\">OxCaml work afloat</a>. Aside from that, <a href=\"https://patrick.sirref.org/ryangibb/\">Ryan</a> and I made some really nice progress with Shelter, culminating in <a href=\"https://patrick.sirref.org/ryangibb/\">Ryan</a> describing it as a <em>metashell</em>.</p>\n \n\n \n\n <h2>Shelter the metashell</h2>\n \n <p>The main progress this week with Shelter was composing <a href=\"https://github.com/opencontainers/runc\">runc</a>'s <code>terminal</code> mode with entering raw terminal input mode on the Shelter side. This is inspired by <a href=\"https://patrick.sirref.org/ryangibb/\">Ryan</a>'s own work on <a href=\"https://github.com/ryangibb/eon\">capability interfaces</a>.</p>\n <p>Shelter remains mostly intact, acting as an interactive shell. However, just before executing a command we switch to receiving and sending raw terminal inputs and outputs. This means tools like <code>vim</code> now work in Shelter! Not only that, but users can now <em>activate</em> an inner shell (e.g. <code>zsh</code>) and enjoy all the usual features of a fully-fledged shell (tab complete, fuzzy history search etc.) and upon exiting that shell, Shelter will snapshot the session. 
This lets you alter the granularity of snapshots from the command-line.</p>\n \n \n\n \n\n <h2>Louis Pouzin's \"Shell\"</h2>\n \n <p>I spent some time reading <a href=\"https://patrick.sirref.org/pouzin-shell-2013/\">part of the multics design documentation</a> this week. Louis Pouzin coined the term \"Shell\" in this document and I was reminded yet again just how important it is to be a good writer even as a \"computer science researcher\". For example, this excerpt from the requirements section of the document:</p>\n <blockquote>\n <p>The previous definitions imply that a command MUST be designed while keeping in mind the user, sitting at his console, wondering about what might be going on, mistyping or forgetting arguments, even if fully aware of the conventions, and possibly interfering with the command by hasty quits, carriage returns, and other temperamental reactions.</p>\n </blockquote>\n <p>And then later, when defining the \"SHELL\".</p>\n <blockquote>\n <p>We may envision a common procedure called automatically by the supervisor whenever a user types in some message at his console, at a time when he has no other process in active execution under console control (presently called command level). This procedure acts as an interface between console messages and subroutine. The purpose of such a procedure is to create a medium of exchange into which one could activate any procedure, <em>inside of another program if it were called</em>. Hereafter, for simplification, we shall refer to that procedure as the \"SHELL\".</p>\n </blockquote>\n <p>It still surprises me how little the undergraduate degree in computer science at <a href=\"https://patrick.sirref.org/ucam/\">Cambridge</a> focuses on writing skills.</p>\n \n \n\n \n\n <h2>OxCaml</h2>\n \n <p>Last week, I got a <a href=\"https://patrick.sirref.org/try-oxcaml/\">toplevel with OxCaml working</a>. 
This required a serious amount of work to understand the changes Janestreet have made to obscure parts of the OCaml compiler and then to work those into tools like <code>js_of_ocaml</code>.</p>\n <p>This week, Janestreet pushed their latest round of changes and, of course, everything broke! I spent some more time fixing it all back up. I'm not entirely sure how maintainable this is. The problem is that, whilst things compile, the programs do not work together! Only when someone uses the program do the bugs surface.</p>\n \n \n\n \n\n <h2>Other OCaml Work</h2>\n \n <p>I worked on some other parts of the ecosystem this week.</p>\n \n\n \n\n <h3>Ppxlib</h3>\n \n <p>I helped review some changes to enable Janestreet to have ppx rewriters via attributes (usually they are via extension points). It is a bit of a controversial change to <a href=\"https://patrick.sirref.org/ppxlib/\">ppxlib</a> as we try to keep the API predictable for users:</p>\n <ol>\n <li>\n <p>Extension points are rewritten: this means the part that is rewritten is nicely delimited by the extension point's start and end.</p>\n </li>\n <li>\n <p>Attributes extend: attributes do not rewrite the code they are attached to but rather extend the code with new AST nodes.</p>\n </li>\n </ol>\n <p><a href=\"https://github.com/ocaml-ppx/ppxlib/pull/574\">We will see what we decide to do (most likely provide the functionality behind some kind of \"expert\" interface) </a>.</p>\n \n \n\n \n\n <h3>Tiff</h3>\n \n <p>See <a href=\"https://patrick.sirref.org/mdales/\">Michael</a>'s <a href=\"https://digitalflapjack.com/weeknotes/2025-05-19b/\">notes</a>.</p>\n <p>I spent some time trying to speed up the LZW decompression of TIFF files in the pure OCaml tiff library this week(end). 
The two big changes to help with this are pretty common when speeding up these parts of OCaml programs:</p>\n <ol>\n <li>\n <p>Allocate less</p>\n </li>\n <li>\n <p>Do less work</p>\n </li>\n </ol>\n <p>In terms of allocating less, the original implementation was using a <code>char list</code> to represent LZW strings. Manipulating these becomes quite costly, particularly since the most common operation is appending a single character to the end of a list. Converting this to use OCaml's immutable <code>string</code> saved a ton of allocations.</p>\n <p>In terms of doing less work, I opted to bypass <code>Cstruct</code>'s sane (but slow) bounds checks in some of the \"hotter\" parts of the code. In particular, LZW ends up reading potentially <em>huge</em> arrays full of bytes one-by-one. So reading each byte needs to be quite snappy. This is a bit of a trade-off in terms of \"safety\" but we are in control of this code so I'm not too worried about that.</p>\n <p>Here are some results decompressing a fairly large array of some elevation data.</p>\n <pre>before: \u2502 tiff/lzw/cea \u2502 523851.2259 mjw/run \u2502 3289761.5414 mnw/run \u2502 9806796.7121 ns/run\u2502\nafter: \u2502 tiff/lzw/cea \u2502 27846.2408 mjw/run \u2502 587928.7527 mnw/run \u2502 8457161.3761 ns/run\u2502</pre>",
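The two changes above can be sketched in a few lines of OCaml. This is an illustrative sketch only, not the tiff library's actual code; the function names here are invented for the example.

```ocaml
(* Illustrative sketch of the two LZW speedups; not the tiff library's code. *)

(* Allocate less: appending a char to a [char list] rebuilds the spine of
   the list, while an immutable [string] keeps the bytes contiguous. *)
let append_char (s : string) (c : char) : string =
  s ^ String.make 1 c

(* Do less work: [Bytes.unsafe_get] elides the per-access bounds check.
   This is only safe because the loop bound is validated once up front. *)
let sum_bytes (buf : bytes) : int =
  let total = ref 0 in
  for i = 0 to Bytes.length buf - 1 do
    total := !total + Char.code (Bytes.unsafe_get buf i)
  done;
  !total
```

In a hot loop over a multi-megabyte decompression buffer, skipping the bounds check on every single read is exactly the kind of per-byte saving that shows up in benchmarks like the one above.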
+18
pf341/weekly-2025-05-26_.json
+18
pf341/weekly-2025-05-26_.json
···+"summary": "<p>Over the past two weeks I have mainly split my time (amongst many things) developing <a href=\"https://patrick.sirref.org/open-trace/\">opentrace</a> and doing revision supervisions.</p>\n \n\n \n\n <h2>Opentrace</h2>\n \n <p>Thanks to <a href=\"https://github.com/koonwen/\">Koonwen's</a> excellent <a href=\"https://github.com/koonwen/ocaml-libbpf\">libbpf bindings in OCaml</a>, I have been building a little tool called <code>opentrace</code> to make it easier to track an executable's inputs and outputs.</p>\n <p>This work was inspired by <a href=\"https://patrick.sirref.org/mdales/\">Michael's</a> self-proclaimed \"gross hack\": <a href=\"https://github.com/quantifyearth/pyshark\">pyshark</a>. Whilst pyshark achieves its goals by injecting code into commonly used Python objects and methods, <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> uses <a href=\"https://ebpf.io/\">eBPF</a>. By using a lower-level API (hooks in the kernel), <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> can remain programming language agnostic. 
However, less information is known about the user's intent compared to something like pyshark.</p>\n \n\n \n\n <h3>Monitoring the System</h3>\n \n <p><a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> has an <code>all</code> command that will trace the entire system.</p>\n <pre>$ sudo opentrace all --flags=O_WRONLY\npid,cgid,comm,kind,flags,mode,filename,return\n16324,4417,\".nheko-wrapped\",openat,577,438,\"/home/patrick/.cache/nheko/nheko/curl_alt_svc_cache.txt\",47\n16324,4417,\".nheko-wrapped\",openat,193,33188,\"/home/patrick/.cache/nheko/nheko/cr3ZUHqBErIOe3PlwJ1SuU8zFKKxL12VzrRoHYMH.tmp\",47\n16324,4417,\".nheko-wrapped\",openat,577,438,\"/home/patrick/.cache/nheko/nheko/curl_alt_svc_cache.txt\",47\n16324,4417,\".nheko-wrapped\",openat,193,33188,\"/home/patrick/.cache/nheko/nheko/QfZTdZSuC56NdcDQ3aMxJc3BhMhAj8PmtYW1zFDP.tmp\",47\n2530,4235,\"systemd\",openat,524865,438,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/cgroup.subtree_control\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.min\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.low\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.high\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.max\",41</pre>\n <p>The <code>--flags=O_WRONLY</code> argument filters the events where the <code>O_WRONLY</code> flag was set in the call to <code>open</code>.</p>\n <p>We also get the name of the current executable linked to the task ( <code>comm</code>). 
The <code>-wrapped</code> is an artefact of using Nix.</p>\n \n \n\n \n\n <h3>Tracing an Executable</h3>\n \n <p>The primary use case for this tool is to inspect what files your program might be reading and writing.</p>\n <pre>$ sudo opentrace exec --format=json --flags=O_CREAT -- opam list\n$ cat trace.json | jq \".[] | .fname\"\n\"/home/patrick/.opam/cshell/.opam-switch/packages/cache\"\n\"/home/patrick/.opam/log/log-118747-29da3d.out\"\n\"/home/patrick/.opam/log/log-118747-29da3d.err\"\n\"/home/patrick/.opam/log/log-118747-29da3d.env\"\n\"/home/patrick/.opam/log/log-118747-29da3d.info\"</pre>\n <p>The \"flags\" argument can specify a small boolean formula for checking the open flags of a particular event with <code>|</code> (or), <code>&</code> (and), and <code>~</code> (not). Parentheses can be used for precedence.</p>\n <pre>$ sudo opentrace exec --flags=\"O_WRONLY|O_RDONLY\" -- ocaml --version</pre>\n \n \n\n \n\n <h3>Spawning Subprocesses</h3>\n \n <p>One feature <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> needs (in this proof-of-concept phase) is the ability to also trace subprocesses.</p>\n <p><a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> is primarily an eBPF program that is loaded into the kernel and communicates with an OCaml program. Events are communicated via a ring buffer and most of the post-processing happens in OCaml. To capture subprocesses, <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> creates a new control group (cgroup) and places the new process into that group. 
This gives <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> a new identifier to track, namely the cgroup.</p>\n <p>So consider the following program.</p>\n <pre> <code><span>let</span><span> </span><span>() </span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>Eio_posix</span><span>.</span><span>run</span><span> </span><span>@@</span><span> </span><span>fun</span><span> </span><span>env</span><span> </span><span>-></span><span>\n</span>\n<span> </span><span>Eio</span><span>.</span><span>Path</span><span>.</span><span>( </span><span>save</span><span> ~</span><span>create</span><span>:</span><span>( </span><span>`Or_truncate</span><span> </span><span>0o664</span><span>) </span><span> </span><span>( </span><span>env</span><span>#</span><span>fs</span><span> </span><span>/</span><span> </span><span>\"</span><span>hello.txt</span><span>\"</span><span>) </span><span> </span><span>\"</span><span>hello</span><span>\"</span><span>) </span><span>;</span><span>\n</span>\n<span> </span><span>Eio</span><span>.</span><span>Process</span><span>.</span><span>run</span><span> </span><span>env</span><span>#</span><span>process_mgr</span><span>\n</span>\n<span> </span><span>[ </span><span> </span><span>\"</span><span>/bin/bash</span><span>\"</span><span>;</span><span> </span><span>\"</span><span>-c</span><span>\"</span><span>;</span><span> </span><span>\"</span><span>echo 'heya' > heya.txt</span><span>\"</span><span> </span><span>] </span><span>\n</span></code>\n </pre>\n <p>It first creates a file using direct calls to functions like <code>openat</code>. Then it spawns a process which creates a new file called <code>heya.txt</code>. This happens in a separate process. 
However, with the <code>--cgroups</code> flag we can capture both interactions with the operating system.</p>\n <pre>$ sudo opentrace exec --cgroups --flags=\"O_CREAT|O_TRUNC\" ./main.exe\npid,cgid,comm,kind,flags,mode,filename,return\n153187,530807,\"main.exe\",openat,526914,436,\"hello.txt\",5\n153192,530807,\"bash\",openat,577,438,\"heya.txt\",3</pre>\n \n\n \n\n <h4>Eio's Process API</h4>\n \n <p>I have used the <code>Eio_unix</code> <a href=\"https://ocaml.org/p/eio/latest/doc/Eio_unix/Process/index.html\">fork action process API</a> to be able to extend what happens in the child process. Loading most eBPF programs into the kernel requires special privileges, hence the need for <code>sudo</code>. When a user requests that a particular program be executed and traced, <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> spawns a process via the Eio Process API. <a href=\"https://patrick.sirref.org/opentrace/\">Opentrace</a> defines a few new so-called \"fork actions\", little fragments of C code that are run after the call to <code>fork</code> ( <code>clone</code>). Most likely this ends with a call to <code>execve</code>, but other calls are possible, for example <code>setuid</code>, allowing <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> to change the user of the child process so it does not run as <code>root</code>. Similarly, this is where (if used) we create the cgroup and place the process into that group.</p>\n \n \n \n\n \n\n <h3>Limitations: Io_uring</h3>\n \n <p>Whilst testing <code>opentrace</code> against some of the tools I use nearly daily, I noticed some events were being missed. I tried tracing <a href=\"https://patrick.sirref.org/forester/\">forester</a>, and only the initial read of <code>forest.toml</code> was logged. 
It dawned on me that the reason for this was that <a href=\"https://patrick.sirref.org/forester/\">forester</a> (via <a href=\"https://patrick.sirref.org/eio/\">eio</a>) was using <a href=\"https://patrick.sirref.org/io_uring/\">io_uring</a> to perform most of the IO. Most attempts to open files were bypassing the open system calls, and instead they were being performed by the kernel after reading a submission request for an <code>openat2</code>-style call!</p>\n <p>This is not news to seasoned Linux systems programmers. Io_uring <a href=\"https://blog.0x74696d.com/posts/iouring-and-seccomp/\">bypasses <code>SECCOMP</code> filters</a> for exactly the same reasons.</p>\n <pre>$ sudo opentrace exec -- forester build\n$ cat trace.csv\npid,cgid,comm,kind,flags,mode,filename,return\n155007,535570,\"forester\",openat,524288,0,\"forest.toml\",5\n155007,535570,\"forester\",Uring,2621440,0,\"\",0\n155021,535570,\"cp\",openat,131072,0,\"/home/patrick/documents/forest/theme/favicon-32x32.png\",4\n155007,535570,\"forester\",Uring,2686976,0,\"\",0\n155007,535570,\"iou-wrk-155007\",Uring,557634,420,\"\",0\n155007,535570,\"iou-wrk-155007\",Uring,557634,420,\"\",0\n155007,535570,\"iou-wrk-155007\",Uring,557634,420,\"\",0</pre>\n <p>It is interesting to note two things here:</p>\n <ol>\n <li>\n <p>We can tell that <a href=\"https://patrick.sirref.org/forester/\">forester</a> probably reads the configuration file using something like <code>In_channel</code> in OCaml ( <a href=\"https://git.sr.ht/~jonsterling/ocaml-forester/tree/7f275290e211db2590b0d715d8fb47fc1de36550/item/lib/frontend/Config.mlL22\">it does</a>).</p>\n </li>\n <li>\n <p>It appears that Uring is performing IO in both worker threads and directly.</p>\n </li>\n </ol>\n <p>The file paths are empty at the moment as I cannot find a clean way to trace openat submission requests into Uring without it sometimes going very wrong. I have tried quite a few methods (e.g. 
tracing <code>do_filp_open</code>) and at the moment I am tracing <code>io_openat2</code>, but this seems quite brittle, and often the filename is completely garbled, so I do not set it. If anyone has any ideas to trace Io_uring more reliably, I am all ears!</p>\n \n \n \n\n \n\n <h2>Supervisions</h2>\n \n <p>The first year students I teach will be doing their first year exams soon. I have been helping them revise for foundations of computer science and discrete mathematics. Whilst doing so, I have been thinking a good deal about <a href=\"https://patrick.sirref.org/jonmsterling/\">Jon Sterling</a>'s <a href=\"https://www.jonmsterling.com/2025-W21/index.xml\">post about assessment bureaucracy</a>. In particular:</p>\n <blockquote>\n <p>At moments like this, it is a good idea to pause and reflect on whether it is better for our students that each faculty member spend a cumulative two months doing literally nothing but assessment and higher-order practices related to assessment, vs. other activities that could benefit our students more (including actual teaching, of which we do astonishingly little at Cambridge).</p>\n </blockquote>",+"content": "<p>Over the past two weeks I have mainly split my time (amongst many things) developing <a href=\"https://patrick.sirref.org/open-trace/\">opentrace</a> and doing revision supervisions.</p>\n \n\n \n\n <h2>Opentrace</h2>\n \n <p>Thanks to <a href=\"https://github.com/koonwen/\">Koonwen's</a> excellent <a href=\"https://github.com/koonwen/ocaml-libbpf\">libbpf bindings in OCaml</a>, I have been building a little tool called <code>opentrace</code> to make it easier to track an executable's inputs and outputs.</p>\n <p>This work was inspired by <a href=\"https://patrick.sirref.org/mdales/\">Michael's</a> self-proclaimed \"gross hack\": <a href=\"https://github.com/quantifyearth/pyshark\">pyshark</a>. 
Whilst pyshark achieves its goals by injecting code into commonly used Python objects and methods, <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> uses <a href=\"https://ebpf.io/\">eBPF</a>. By using a lower-level API (hooks in the kernel), <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> can remain programming language agnostic. However, less information is known about the user's intent compared to something like pyshark.</p>\n \n\n \n\n <h3>Monitoring the System</h3>\n \n <p><a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> has an <code>all</code> command that will trace the entire system.</p>\n <pre>$ sudo opentrace all --flags=O_WRONLY\npid,cgid,comm,kind,flags,mode,filename,return\n16324,4417,\".nheko-wrapped\",openat,577,438,\"/home/patrick/.cache/nheko/nheko/curl_alt_svc_cache.txt\",47\n16324,4417,\".nheko-wrapped\",openat,193,33188,\"/home/patrick/.cache/nheko/nheko/cr3ZUHqBErIOe3PlwJ1SuU8zFKKxL12VzrRoHYMH.tmp\",47\n16324,4417,\".nheko-wrapped\",openat,577,438,\"/home/patrick/.cache/nheko/nheko/curl_alt_svc_cache.txt\",47\n16324,4417,\".nheko-wrapped\",openat,193,33188,\"/home/patrick/.cache/nheko/nheko/QfZTdZSuC56NdcDQ3aMxJc3BhMhAj8PmtYW1zFDP.tmp\",47\n2530,4235,\"systemd\",openat,524865,438,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/cgroup.subtree_control\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.min\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.low\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.high\",41\n2530,4235,\"systemd\",openat,524545,0,\"/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice/gammastep.service/memory.max\",41</pre>\n <p>The 
<code>--flags=O_WRONLY</code> argument filters the events where the <code>O_WRONLY</code> flag was set in the call to <code>open</code>.</p>\n <p>We also get the name of the current executable linked to the task ( <code>comm</code>). The <code>-wrapped</code> is an artefact of using Nix.</p>\n \n \n\n \n\n <h3>Tracing an Executable</h3>\n \n <p>The primary use case for this tool is to inspect what files your program might be reading and writing.</p>\n <pre>$ sudo opentrace exec --format=json --flags=O_CREAT -- opam list\n$ cat trace.json | jq \".[] | .fname\"\n\"/home/patrick/.opam/cshell/.opam-switch/packages/cache\"\n\"/home/patrick/.opam/log/log-118747-29da3d.out\"\n\"/home/patrick/.opam/log/log-118747-29da3d.err\"\n\"/home/patrick/.opam/log/log-118747-29da3d.env\"\n\"/home/patrick/.opam/log/log-118747-29da3d.info\"</pre>\n <p>The \"flags\" argument can specify a small boolean formula for checking the open flags of a particular event with <code>|</code> (or), <code>&</code> (and), and <code>~</code> (not). Parentheses can be used for precedence.</p>\n <pre>$ sudo opentrace exec --flags=\"O_WRONLY|O_RDONLY\" -- ocaml --version</pre>\n \n \n\n \n\n <h3>Spawning Subprocesses</h3>\n \n <p>One feature <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> needs (in this proof-of-concept phase) is the ability to also trace subprocesses.</p>\n <p><a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> is primarily an eBPF program that is loaded into the kernel and communicates with an OCaml program. Events are communicated via a ring buffer and most of the post-processing happens in OCaml. To capture subprocesses, <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> creates a new control group (cgroup) and places the new process into that group. 
This gives <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> a new identifier to track, namely the cgroup.</p>\n <p>So consider the following program.</p>\n <pre> <code><span>let</span><span> </span><span>() </span><span> </span><span>=</span><span>\n</span>\n<span> </span><span>Eio_posix</span><span>.</span><span>run</span><span> </span><span>@@</span><span> </span><span>fun</span><span> </span><span>env</span><span> </span><span>-></span><span>\n</span>\n<span> </span><span>Eio</span><span>.</span><span>Path</span><span>.</span><span>( </span><span>save</span><span> ~</span><span>create</span><span>:</span><span>( </span><span>`Or_truncate</span><span> </span><span>0o664</span><span>) </span><span> </span><span>( </span><span>env</span><span>#</span><span>fs</span><span> </span><span>/</span><span> </span><span>\"</span><span>hello.txt</span><span>\"</span><span>) </span><span> </span><span>\"</span><span>hello</span><span>\"</span><span>) </span><span>;</span><span>\n</span>\n<span> </span><span>Eio</span><span>.</span><span>Process</span><span>.</span><span>run</span><span> </span><span>env</span><span>#</span><span>process_mgr</span><span>\n</span>\n<span> </span><span>[ </span><span> </span><span>\"</span><span>/bin/bash</span><span>\"</span><span>;</span><span> </span><span>\"</span><span>-c</span><span>\"</span><span>;</span><span> </span><span>\"</span><span>echo 'heya' > heya.txt</span><span>\"</span><span> </span><span>] </span><span>\n</span></code>\n </pre>\n <p>It first creates a file using direct calls to functions like <code>openat</code>. Then it spawns a process which creates a new file called <code>heya.txt</code>. This happens in a separate process. 
However, with the <code>--cgroups</code> flag we can capture both interactions with the operating system.</p>\n <pre>$ sudo opentrace exec --cgroups --flags=\"O_CREAT|O_TRUNC\" ./main.exe\npid,cgid,comm,kind,flags,mode,filename,return\n153187,530807,\"main.exe\",openat,526914,436,\"hello.txt\",5\n153192,530807,\"bash\",openat,577,438,\"heya.txt\",3</pre>\n \n\n \n\n <h4>Eio's Process API</h4>\n \n <p>I have used the <code>Eio_unix</code> <a href=\"https://ocaml.org/p/eio/latest/doc/Eio_unix/Process/index.html\">fork action process API</a> to be able to extend what happens in the child process. Loading most eBPF programs into the kernel requires special privileges, hence the need for <code>sudo</code>. When a user requests that a particular program be executed and traced, <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> spawns a process via the Eio Process API. <a href=\"https://patrick.sirref.org/opentrace/\">Opentrace</a> defines a few new so-called \"fork actions\", little fragments of C code that are run after the call to <code>fork</code> ( <code>clone</code>). Most likely this ends with a call to <code>execve</code>, but other calls are possible, for example <code>setuid</code>, allowing <a href=\"https://patrick.sirref.org/opentrace/\">opentrace</a> to change the user of the child process so it does not run as <code>root</code>. Similarly, this is where (if used) we create the cgroup and place the process into that group.</p>\n \n \n \n\n \n\n <h3>Limitations: Io_uring</h3>\n \n <p>Whilst testing <code>opentrace</code> against some of the tools I use nearly daily, I noticed some events were being missed. I tried tracing <a href=\"https://patrick.sirref.org/forester/\">forester</a>, and only the initial read of <code>forest.toml</code> was logged. 
It dawned on me that the reason for this was that <a href=\"https://patrick.sirref.org/forester/\">forester</a> (via <a href=\"https://patrick.sirref.org/eio/\">eio</a>) was using <a href=\"https://patrick.sirref.org/io_uring/\">io_uring</a> to perform most of the IO. Most attempts to open files were bypassing the open system calls, and instead they were being performed by the kernel after reading a submission request for an <code>openat2</code>-style call!</p>\n <p>This is not news to seasoned Linux systems programmers. Io_uring <a href=\"https://blog.0x74696d.com/posts/iouring-and-seccomp/\">bypasses <code>SECCOMP</code> filters</a> for exactly the same reasons.</p>\n <pre>$ sudo opentrace exec -- forester build\n$ cat trace.csv\npid,cgid,comm,kind,flags,mode,filename,return\n155007,535570,\"forester\",openat,524288,0,\"forest.toml\",5\n155007,535570,\"forester\",Uring,2621440,0,\"\",0\n155021,535570,\"cp\",openat,131072,0,\"/home/patrick/documents/forest/theme/favicon-32x32.png\",4\n155007,535570,\"forester\",Uring,2686976,0,\"\",0\n155007,535570,\"iou-wrk-155007\",Uring,557634,420,\"\",0\n155007,535570,\"iou-wrk-155007\",Uring,557634,420,\"\",0\n155007,535570,\"iou-wrk-155007\",Uring,557634,420,\"\",0</pre>\n <p>It is interesting to note two things here:</p>\n <ol>\n <li>\n <p>We can tell that <a href=\"https://patrick.sirref.org/forester/\">forester</a> probably reads the configuration file using something like <code>In_channel</code> in OCaml ( <a href=\"https://git.sr.ht/~jonsterling/ocaml-forester/tree/7f275290e211db2590b0d715d8fb47fc1de36550/item/lib/frontend/Config.mlL22\">it does</a>).</p>\n </li>\n <li>\n <p>It appears that Uring is performing IO in both worker threads and directly.</p>\n </li>\n </ol>\n <p>The file paths are empty at the moment as I cannot find a clean way to trace openat submission requests into Uring without it sometimes going very wrong. I have tried quite a few methods (e.g. 
tracing <code>do_filp_open</code>) and at the moment I am tracing <code>io_openat2</code>, but this seems quite brittle, and often the filename is completely garbled, so I do not set it. If anyone has any ideas to trace Io_uring more reliably, I am all ears!</p>\n \n \n \n\n \n\n <h2>Supervisions</h2>\n \n <p>The first year students I teach will be doing their first year exams soon. I have been helping them revise for foundations of computer science and discrete mathematics. Whilst doing so, I have been thinking a good deal about <a href=\"https://patrick.sirref.org/jonmsterling/\">Jon Sterling</a>'s <a href=\"https://www.jonmsterling.com/2025-W21/index.xml\">post about assessment bureaucracy</a>. In particular:</p>\n <blockquote>\n <p>At moments like this, it is a good idea to pause and reflect on whether it is better for our students that each faculty member spend a cumulative two months doing literally nothing but assessment and higher-order practices related to assessment, vs. other activities that could benefit our students more (including actual teaching, of which we do astonishingly little at Cambridge).</p>\n </blockquote>",
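To make the `--flags` formula language concrete, here is a hypothetical sketch in OCaml of how such formulae could be represented and evaluated against an event's open flags. This is illustrative only, not opentrace's actual implementation; the constructor names and flag constants are assumptions (the octal values match Linux on x86-64 but are platform-dependent).

```ocaml
(* Hypothetical model of --flags formulae such as "O_CREAT & ~O_TRUNC";
   not opentrace's actual code. *)
type formula =
  | Flag of int                 (* a single open(2) flag bit *)
  | Or of formula * formula     (* '|' *)
  | And of formula * formula    (* '&' *)
  | Not of formula              (* '~' *)

(* open(2) flag values on Linux/x86-64 (octal, platform-dependent). *)
let o_wronly = 0o1
let o_creat = 0o100
let o_trunc = 0o1000

(* Does an event's [flags] word satisfy the formula? *)
let rec eval (flags : int) : formula -> bool = function
  | Flag bit -> flags land bit <> 0
  | Or (a, b) -> eval flags a || eval flags b
  | And (a, b) -> eval flags a && eval flags b
  | Not f -> not (eval flags f)
```

For example, `eval 0o101 (And (Flag o_creat, Not (Flag o_trunc)))` holds for an event that opened a file with `O_WRONLY | O_CREAT` and without `O_TRUNC`.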
+18
pf341/weekly-2025-06-02_.json
+18
pf341/weekly-2025-06-02_.json
···+"summary": "<p>This week included some time finishing <a href=\"https://patrick.sirref.org/open-trace/\">opentrace</a> and subsequently folding it into <a href=\"https://patrick.sirref.org/shelter/\">shelter</a>. I have been writing up some more of the draft paper for <a href=\"https://patrick.sirref.org/shelter/\">shelter</a>, which I am excited to share in the near future.</p>\n <p>I revisited the <a href=\"https://patrick.sirref.org/vpnkit-upgrade/\">upgrading vpnkit</a> PR and pushed some more fixes. I have been thinking, again, about the promise of a direct-style world for OCaml that just hasn't quite landed yet. <em>C'est la vie</em>.</p>\n \n\n \n\n <h2>Forester and Graft</h2>\n \n <p>I spent a bit of time finally pulling out my changes to <a href=\"https://patrick.sirref.org/forester/\">Forester</a> to add Markdown and Bibtex support into a standalone tool: <a href=\"https://patrick.sirref.org/graft/\">graft</a>.</p>\n \n\n \n\n <h3>Graft</h3>\n \n <p>Graft is a <a href=\"https://patrick.sirref.org/forester/\">Forester</a> preprocessor.</p>\n <p>It takes a forest (a directory of trees) written in a mixture of Markdown, Bibtex and <a href=\"https://patrick.sirref.org/forester/\">Forester</a> syntax and produces a new forest completely written in <a href=\"https://patrick.sirref.org/forester/\">Forester</a> syntax.</p>\n \n\n \n\n <h4>Usage</h4>\n \n <p><code>graft</code> simply preprocesses a forest, generating Forester trees from <code>.md</code>, <code>.bib</code> and <code>.tree</code> files. 
It will copy the structure of the input directory in the output directory.</p>\n <pre> <code><span>$ graft preprocess --output=grafted-trees trees\n</span>\n<span>$ forester build\n</span></code>\n </pre>\n <p>This assumes that you have updated your Forester <code>toml</code> file to put the <code>grafted-trees</code> directory as your source of trees.</p>\n <pre>[forest]\ntrees = [ \"grafted-trees\" ]</pre>\n \n \n\n \n\n <h4>Example</h4>\n \n <p>A typical \"tree\" might look something like</p>\n <pre>---\ntitle: Opentrace and Supervisions\ndate: 2025-05-26\nauthor: Patrick Ferris\n---\n\nOver the past two weeks I have mainly split my time (amongst many things) developing [opentrace](open-trace)\nand doing revision supervisions.\n\n```forester\n\\put\\transclude/numbered{false}\n\\transclude{open-trace}\n```</pre>\n <p>A few things to note:</p>\n <ol>\n <li>\n <p>The <code>yaml</code> frontmatter allows you to add some of the metadata fields from Forester.</p>\n </li>\n <li>\n <p>At any point in your markdown there is an escape hatch to Forester using a <code>forester</code> codeblock.</p>\n </li>\n </ol>\n \n \n <p>It is very satisfying to find the separation of concerns works quite well. For a while I had been rebasing my development branch on <a href=\"https://patrick.sirref.org/forester/\">Forester</a>. I was also worried about trying to get the code upstream as it pulled in many dependencies. It seems that I have a very workable solution. I welcome contributions to <a href=\"https://patrick.sirref.org/graft/\">graft</a> including extra input formats. I am also considering extending the <a href=\"https://patrick.sirref.org/forester/\">Forester</a> configuration to contain some <a href=\"https://patrick.sirref.org/graft/\">graft</a> configuration for how it should generate new trees (e.g. 
at the moment every entry in a bibtex file is given a new tree).</p>\n \n\n \n\n <h3>Maths Support</h3>\n \n <p>As part of that process, <a href=\"https://patrick.sirref.org/graft/\">graft</a> now supports Markdown KaTeX. For example:</p>\n \n\n \n\n <h4>Mergeable Replicated Data Type Implementation</h4>\n \n <p>A <strong>mergeable replicated data type (MRDT) implementation</strong> for a data type <code>\\tau </code> is a tuple <code>D_{\\tau } = (\\Sigma , \\sigma _{0}, do, merge)</code> where:</p>\n <ul>\n <li>\n <p><code>\\Sigma </code> is the set of all possible states at a branch,</p>\n </li>\n <li>\n <p><code>\\sigma _{0} \\in \\Sigma </code> is the initial state,</p>\n </li>\n <li>\n <p><code>do : Op_{\\tau } \\times \\Sigma \\times Timestamp \\rightarrow \\Sigma \\times Val_{\\tau }</code> implements every data type operation,</p>\n </li>\n <li>\n <p><code>merge : \\Sigma \\times \\Sigma \\times \\Sigma \\rightarrow \\Sigma </code> implements the <em>three-way merge strategy</em>.</p>\n </li>\n </ul>\n <p><a href=\"https://patrick.sirref.org/kcrsk-mrdts-2022/\">Definition 2.1 from \"Certified Mergeable Replicated Data Types\"</a>.</p>\n \n \n \n \n\n \n\n <h2>Revision Supervisions</h2>\n \n <p>I have been doing revision supervisions for <a href=\"https://patrick.sirref.org/discrete-maths/\">Discrete Maths</a> and <a href=\"https://patrick.sirref.org/focs.md\">Foundations of Computer Science</a>. I have also been marking Operating Systems past paper questions for the same group of first years.</p>",+"content": "<p>This week included some time finishing <a href=\"https://patrick.sirref.org/open-trace/\">opentrace</a> and subsequently folding it into <a href=\"https://patrick.sirref.org/shelter/\">shelter</a>. 
I have been writing up some more of the draft paper for <a href=\"https://patrick.sirref.org/shelter/\">shelter</a>, which I am excited to share in the near future.</p>\n <p>I revisited the <a href=\"https://patrick.sirref.org/vpnkit-upgrade/\">upgrading vpnkit</a> PR and pushed some more fixes. I have been thinking, again, about the promise of a direct-style world for OCaml that just hasn't quite landed yet. <em>C'est la vie</em>.</p>\n \n\n \n\n <h2>Forester and Graft</h2>\n \n <p>I spent a bit of time finally pulling out my changes to <a href=\"https://patrick.sirref.org/forester/\">Forester</a> to add Markdown and Bibtex support into a standalone tool: <a href=\"https://patrick.sirref.org/graft/\">graft</a>.</p>\n \n\n \n\n <h3>Graft</h3>\n \n <p>Graft is a <a href=\"https://patrick.sirref.org/forester/\">Forester</a> preprocessor.</p>\n <p>It takes a forest (a directory of trees) written in a mixture of Markdown, Bibtex and <a href=\"https://patrick.sirref.org/forester/\">Forester</a> syntax and produces a new forest completely written in <a href=\"https://patrick.sirref.org/forester/\">Forester</a> syntax.</p>\n \n\n \n\n <h4>Usage</h4>\n \n <p><code>graft</code> simply preprocesses a forest, generating Forester trees from <code>.md</code>, <code>.bib</code> and <code>.tree</code> files. 
It will copy the structure of the input directory in the output directory.</p>\n <pre> <code><span>$ graft preprocess --output=grafted-trees trees\n</span>\n<span>$ forester build\n</span></code>\n </pre>\n <p>This assumes that you have updated your Forester <code>toml</code> file to put the <code>grafted-trees</code> directory as your source of trees.</p>\n <pre>[forest]\ntrees = [ \"grafted-trees\" ]</pre>\n \n \n\n \n\n <h4>Example</h4>\n \n <p>A typical \"tree\" might look something like</p>\n <pre>---\ntitle: Opentrace and Supervisions\ndate: 2025-05-26\nauthor: Patrick Ferris\n---\n\nOver the past two weeks I have mainly split my time (amongst many things) developing [opentrace](open-trace)\nand doing revision supervisions.\n\n```forester\n\\put\\transclude/numbered{false}\n\\transclude{open-trace}\n```</pre>\n <p>A few things to note:</p>\n <ol>\n <li>\n <p>The <code>yaml</code> frontmatter allows you to add some of the metadata fields from Forester.</p>\n </li>\n <li>\n <p>At any point in your markdown there is an escape hatch to Forester using a <code>forester</code> codeblock.</p>\n </li>\n </ol>\n \n \n <p>It is very satisfying to find the separation of concerns works quite well. For a while I had been rebasing my development branch on <a href=\"https://patrick.sirref.org/forester/\">Forester</a>. I was also worried about trying to get the code upstream as it pulled in many dependencies. It seems that I have a very workable solution. I welcome contributions to <a href=\"https://patrick.sirref.org/graft/\">graft</a> including extra input formats. I am also considering extending the <a href=\"https://patrick.sirref.org/forester/\">Forester</a> configuration to contain some <a href=\"https://patrick.sirref.org/graft/\">graft</a> configuration for how it should generate new trees (e.g. 
at the moment every entry in a bibtex file is given a new tree).</p>\n \n\n \n\n <h3>Maths Support</h3>\n \n <p>As part of that process, <a href=\"https://patrick.sirref.org/graft/\">graft</a> now supports Markdown KaTeX. For example:</p>\n \n\n \n\n <h4>Mergeable Replicated Data Type Implementation</h4>\n \n <p>A <strong>mergeable replicated data type (MRDT) implementation</strong> for a data type <code>\\tau </code> is a tuple <code>D_{\\tau } = (\\Sigma , \\sigma _{0}, do, merge)</code> where:</p>\n <ul>\n <li>\n <p><code>\\Sigma </code> is the set of all possible states at a branch,</p>\n </li>\n <li>\n <p><code>\\sigma _{0} \\in \\Sigma </code> is the initial state,</p>\n </li>\n <li>\n <p><code>do : Op_{\\tau } \\times \\Sigma \\times Timestamp \\rightarrow \\Sigma \\times Val_{\\tau }</code> implements every data type operation,</p>\n </li>\n <li>\n <p><code>merge : \\Sigma \\times \\Sigma \\times \\Sigma \\rightarrow \\Sigma </code> implements the <em>three-way merge strategy</em>.</p>\n </li>\n </ul>\n <p><a href=\"https://patrick.sirref.org/kcrsk-mrdts-2022/\">Definition 2.1 from \"Certified Mergeable Replicated Data Types\"</a>.</p>\n \n \n \n \n\n \n\n <h2>Revision Supervisions</h2>\n \n <p>I have been doing revision supervisions for <a href=\"https://patrick.sirref.org/discrete-maths/\">Discrete Maths</a> and <a href=\"https://patrick.sirref.org/focs.md\">Foundations of Computer Science</a>. I have also been marking Operating Systems past paper questions for the same group of first years.</p>",
+2
-2
sadiqj/metadata.json
+20
sadiqj/www.toao.com_2017-01-15__blog_getting-ocaml-running-on-the-esp32.json
···+"summary": "<p><img alt=\"End result\" src=\"/static/wemos_board.gif\" title=\"End result\"></p>\n<p>I was looking for some small Christmas stocking-fillers to give to techie friends and decided to try to find some interesting electronics boards from China.</p>\n<p>In the end, I went with the <a href=\"https://wiki.wemos.cc/products:lolin32:lolin32_lite\">WEMOS Lolin32 Lite</a> which features Espressif's ESP32. If you're not familiar with the ESP32, it's an awesome little \u2026</p>",+"content": "<p><img alt=\"End result\" src=\"/static/wemos_board.gif\" title=\"End result\"></p>\n<p>I was looking for some small Christmas stocking-fillers to give to techie friends and decided to try to find some interesting electronics boards from China.</p>\n<p>In the end, I went with the <a href=\"https://wiki.wemos.cc/products:lolin32:lolin32_lite\">WEMOS Lolin32 Lite</a> which features Espressif's ESP32. If you're not familiar with the ESP32, it's an awesome little chip that features the following:</p>\n<ul>\n<li>Dual-core 240MHz 32-bit Xtensa LX6s</li>\n<li>Wi-Fi (802.11 b/g/n) and Bluetooth (v4.2 + BLE)</li>\n<li>520KB of SRAM</li>\n<li>A separate ultra-low power processor</li>\n</ul>\n<p>The Lolin32 Lite couples that with 4MB of flash, a micro-USB connection and Li-Po charging circuitry.</p>\n<p>Espressif has a gcc-based toolchain and an <a href=\"https://github.com/espressif/esp-idf\">\"IoT Development Framework\"</a> which provides a port of Newlib, FreeRTOS, LWIP and a whole host of other frameworks.</p>\n<p>My friend <a href=\"http://anil.recoil.org/\">Anil</a> suggested that with a gcc and libc, porting of the OCaml interpreter would be fairly easy. He was mostly right.</p>\n<h3>Caml runtime</h3>\n<p>It took a little while to understand how OCaml's build system worked and thankfully it seems there's been good support for cross compilation since 4.02. 
The configure script has a pretty funky way of determining features of the compiler and runtime by compiling lots of small C programs and seeing what builds. This required a few small changes where features were detected but only partially available via Espressif's port of Newlib. POSIX signals and BSD sockets were two cases where this happened. I should expand the code for the tests to cover the missing functionality and try to upstream it, which would avoid the configure script hacks.</p>\n<h3>Rebuilding Newlib</h3>\n<p>Unfortunately, either I was doing something wrong or Espressif's build of Newlib doesn't include signal(). This meant rebuilding Newlib without the SIGNAL_PROVIDED flag, so that Newlib includes its own implementation of signal(). There's also no support for directories in the IDF, so I had to stub out some parts of sys.c and unix.c. With those changes it was possible to get libcamlrun.a compiled. </p>\n<p>Once I had a cross compiled bytecode runtime, I was most of the way there.</p>\n<h3>Building an image</h3>\n<p>The next step was to get some OCaml compiled which could then be incorporated into the image to flash. This is actually pretty simple with <a href=\"https://github.com/sadiqj/hello_caml/blob/master/main/component.mk#L12\">ocamlc and custom runtimes</a> and you end up with a C source file you can then throw into the rest of the IDF component build system. I wrote a <a href=\"https://github.com/sadiqj/hello_caml/blob/master/main/hello_world_main.c#L34\">little bit of C</a> that kickstarted OCaml via caml_startup and had a buildable image to flash.</p>\n<h3>Debugging</h3>\n<p>I flashed the board and immediately got an abort after malloc failed. First things to tweak were the <a href=\"https://github.com/sadiqj/ocaml-esp32/blob/2798033d8e113f5da6c03ff8ef5ac9edec3e54f9/byterun/caml/config.h\">garbage collection settings</a> which were not designed for 512KB of RAM. 
I tuned many of those and was still getting an abort, but after some instrumentation it turned out that the runtime allocates a 64KB buffer for both stdin and stdout. After reducing those buffers considerably, the interpreter no longer aborted! It didn't, however, print anything out - which concerned me.</p>\n<p>After a fair amount of debugging, I still have no idea where stdout goes. It's certainly not the same place as printf, which makes it to the monitor. Once I accepted that, I realised I had a functioning interpreter!</p>\n<h3>State of play</h3>\n<p>There's a <a href=\"https://github.com/sadiqj/ocaml-esp32-docker/blob/master/Dockerfile\">Dockerfile</a> for the whole build process:</p>\n<ul>\n<li>Installs the prerequisites, Xtensa gcc port, Espressif IDF</li>\n<li>Rebuilds Newlib</li>\n<li>Installs OCaml via OPAM, then builds the OCaml ESP32 bytecode runtime</li>\n<li>Finally builds a simple Hello World OCaml project and builds an image</li>\n</ul>\n<p>You should be able to then flash the resulting image with <code>make flash</code> if you have a dev board connected and have passed the USB-serial device through to the container with <code>--device=/dev/ttyUSB0</code> (on Linux).</p>\n<h3>Short term TODOs</h3>\n<p>There are a couple of TODOs that probably need to be cleaned up or fixed. As I mentioned earlier, we could expand some of the hasgot tests to include functionality Espressif's Newlib build doesn't have and this would simplify some of the configure changes. Figuring out how to redirect stdout and stderr to the monitor would also be incredibly useful.</p>\n<h3>Longer term plans</h3>\n<p>A native compiler backend for Xtensa would mean we could produce a more compact and hopefully more performant image which would be very useful in environments with tight power budgets. 
Speaking of low-power, some kind of DSL for programming the ultra-low power core on the board would also probably be very useful.</p>\n<p>In terms of networking, the Espressif IDF ships with an lwip port for networking but there are sufficiently <a href=\"https://github.com/espressif/esp-idf/blob/3a271a4ae7df8a9049fbbb801feafca5043c31eb/components/esp32/include/esp_wifi_internal.h\">low level interfaces</a> available for the Wi-Fi device that could work with <a href=\"https://mirage.io/\">Mirage</a>'s <a href=\"https://github.com/mirage/mirage-tcpip\">tcpip</a> direct driver.</p>\n<p>Any volunteers?</p>\n<h3>End result</h3>\n<p><img alt=\"Hello from OCaml!\" src=\"/static/hello_caml.png\" title=\"Hello from OCaml!\"></p>",
+20
sadiqj/www.toao.com_2020-12-25__blog_teaching-bloom-filters-new-tricks.json
···+"summary": "<p>In this post we're going to discuss how to teach Bloom Filters new tricks. We'll start with examining Partitioned Bloom Filters and then look at ways we can generalise Bloom Filters to new and interesting uses. By the end of the post you will be able to use this generalised framework to come up with novel probabilistic data structures.</p>",+"content": "<p>I've had a few interesting conversations around text indexing recently with friends and something that's fallen out is that there are some really cool data structures you can build that are extensions of Bloom filters. The techniques for constructing them don't seem to be that widely known so this blog post intends to remedy that.</p>\n<p>We will start by examining a variant of the traditional Bloom filter and then we'll look at ways we can generalise the model to produce new types of probabilistic data structures and understand what properties they might have.</p>\n<h3>Bloom Filters</h3>\n<p>To begin with, let's take a look at what a Bloom filter is and how it works so we're all on the same page.</p>\n<p>Loosely, a Bloom filter is a data structure into which you can add a set of elements. Later you can query it with any of those elements and it will return true. Crucially, it <em>might</em> also return true for elements that weren't in the set with some (tunable) probability. The benefit you get from this trade-off is that Bloom filters can be very compact.</p>\n<h4>Trade-offs</h4>\n<p>How big this benefit is depends on two things. First, how high a probability of an incorrect answer (a false positive) your application can tolerate and second, how large the items are in the set you want to track.</p>\n<p>For the first case, let's look at a simple example. Imagine you were a social network and you wanted to maintain a cache of your most active ten million users. Each user is identified by a 32-bit integer. 
If we naively stored the identifiers of our most active ten million users we'd need 320 million bits or 40 megabytes. On the other hand, if we're willing to accept a 1% false positive rate then a Bloom filter needs just over 11 megabytes. A decent win.</p>\n<p>The benefits become much greater as soon as you have large keys. As we'll see in a minute this is because Bloom filters use hashing to avoid storing the keys themselves. If instead of 32-bit user identifiers we are storing URLs which could be 320 bits upwards then the win becomes significantly greater. Storing ten million is suddenly >400 megabytes but a Bloom filter still clocks in at 11 megabytes for a 1% false positive rate or 17 megabytes for a 0.1% false positive rate.</p>\n<h4>False positives</h4>\n<p>Bloom filters are suited to situations where they can filter out the need to do some work and where a false positive just means some wasted work. A good example would be a local Bloom filter sitting in front of a remote key-value cache. The filter can contain the set of keys the remote cache holds the values for and can filter out needless network requests for keys that aren't present. A false positive in this case means we end up doing a request to the remote cache when not necessary.</p>\n<h4>How they work</h4>\n<p>Let's look at how a Bloom filter actually works. For those already familiar with Bloom filters, for presentation purposes we're only going to discuss Partitioned Bloom filters as this makes the mathematics simpler and exact.</p>\n<p>A Bloom filter consists of an array of <span>\\(m\\)</span> bits <span>\\(B\\)</span> and <span>\\(k\\)</span> hash functions <span>\\(h_0..h_{k-1}\\)</span>. The hash functions take a key <span>\\(x\\)</span> and map it to a location in <span>\\(B\\)</span>. The hash functions are assumed to be independent of each other. 
That is, if we hash some value \"foo\" with <span>\\(h_0\\)</span> then the result should give us no information about the result from hashing with <span>\\(h_1\\)</span>.</p>\n<h4>Adding to a filter</h4>\n<p>To add to the Bloom filter we hash the key with the hash functions and use their results to set bits in the array <span>\\(B\\)</span>. More specifically:</p>\n<div><pre><span></span><code><span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> </span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span> </span><span>=</span><span> </span><span>true</span>\n</code></pre></div>\n\n<p>Each hash function indexes into a subrange of size <span>\\(\\frac{m}{k}\\)</span> i.e. if there are 4 hash functions and array <span>\\(B\\)</span> is 256 bits then the first hash function picks a position in 0 to 63, the second hash function 64 to 127 and so on. These are later referred to as a hash function's <em>partition</em>.</p>\n<h4>Querying a filter</h4>\n<p>To check a Bloom filter for an element we again use the hash functions to hash the key (resulting in the same positions we would have had if we had added the element) and check the array positions they generate in <span>\\(B\\)</span>. 
Again, some pseudocode:</p>\n<div><pre><span></span><code><span>p</span><span> </span><span>=</span><span> </span><span>true</span><span> </span><span>/* whether the element is present or not */</span>\n<span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> </span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>p</span><span> </span><span>=</span><span> </span><span>p</span><span> </span><span>&</span><span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span>\n<span>return</span><span> </span><span>p</span>\n</code></pre></div>\n\n<h4>Why they work</h4>\n<p>Let's reason out how this works. If we have the element \"foo\" and add it to a Bloom filter, it will result in a number (<span>\\(k\\)</span>) of positions in the array <span>\\(B\\)</span> being set to true. If we then test that same Bloom filter with \"foo\" we check those same positions and if they are all true then we probably added \"foo\" at some point in the past to the filter. There can never be a false negative: if we added \"foo\", those bits <em>must</em> have been set.</p>\n<p>There can, however, be false positives: we may have added some other keys which just so happened to generate positions that covered all the ones that \"foo\" would hash to. This is the source of false positives in Bloom filters. We can actually calculate the probability that this will occur.</p>\n<h4>False positive probability for k=1</h4>\n<p>To simplify, let us have a Bloom filter with a single hash function (so <span>\\(k=1\\)</span>) and with array <span>\\(B\\)</span> of size <span>\\(m\\)</span> bits.</p>\n<p>A single element is added with the routine we specified above. What is the probability that a bit in <span>\\(B\\)</span> is set? 
Since <span>\\(h_{0}\\)</span> can pick any position<a href=\"#fn:1\">1</a>, the probability that a bit is set is <span>\\(\\frac{1}{m}\\)</span>.</p>\n<p>Now comes testing. We test the single element we added. The position will be the same as when we added it, so we'll correctly return that it is present. What if we tested a different element to the one we just added to the Bloom filter? What is the probability that we get a false positive? For a false positive to occur the element we are testing must map to a position that was already set in the array <span>\\(B\\)</span>. We've already calculated that as <span>\\(\\frac{1}{m}\\)</span>.</p>\n<p>Now instead of a single element, let us consider what happens if we add <span>\\(n\\)</span> elements and then test something not in those elements.</p>\n<p>For each element we add, the probability that a bit position <em>won't</em> be set is <span>\\(1-\\frac{1}{m}\\)</span>. After adding <span>\\(n\\)</span> elements, the probability the bit hasn't been set is <span>\\((1 - \\frac{1}{m})^n\\)</span>.</p>\n<p>Testing an element that wasn't one we added, we want to know the probability that the bit has been set by one of the previous <span>\\(n\\)</span> elements. This is the complement of the event that it hasn't been set previously, so:</p>\n<div>$$1 - (1 - \\frac{1}{m})^n$$</div>\n<p>An intuitive way of looking at this is that it is the percentage of 1s set in <span>\\(B\\)</span> after we add <span>\\(n\\)</span> items. To test out our simple filter, if the size of <span>\\(B\\)</span> (<span>\\(m\\)</span>) is 512 bits and we add 32 elements then the false positive rate if we test a new element that wasn't in the original set is:</p>\n<div>$$1 - (1 - \\frac{1}{512})^{32} = 0.061$$</div>\n<p>So about 6.1%. If we add another 32 elements this increases to 11.8% and so on.</p>\n<h4>Generalising to multiple hash functions</h4>\n<p>How do we extend this to multiple hash functions? 
For partitioned Bloom filters we don't need to deal with collisions between hash functions e.g. where <span>\\(h_0(\"foo\")\\)</span> and <span>\\(h_1(\"foo\")\\)</span> give the same result. To reiterate, a false positive is where all bit positions in <span>\\(B\\)</span> for an element tested, but not initially added, are set.</p>\n<p>Let's add a second hash function to our simple filter. Now we have <span>\\(h_0\\)</span> and <span>\\(h_1\\)</span>. To get a false positive we would have needed the bit in <span>\\(h_0\\)</span>'s partition to be set <em>and</em> the bit in <span>\\(h_1\\)</span>'s partition. The probability that a bit is set in the 0th partition when an element is added is <span>\\(\\frac{1}{\\frac{m}{2}} = \\frac{2}{m}\\)</span> (because each partition is now <span>\\(m/2\\)</span> in size, as there are two). That the bit is not set after <span>\\(n\\)</span> elements are added is <span>\\((1 - \\frac{2}{m})^n\\)</span>. For the two (independent) hash functions we end up with:</p>\n<div>$$(1 - (1 - \\frac{2}{m})^n)^2$$</div>\n<p>This generalises to:</p>\n<div>$$(1 - (1 - \\frac{k}{m})^n)^k$$</div>\n<p>We can quickly work through our earlier example again. If B is 512 bits (<span>\\(m = 512\\)</span>), we have 4 hash functions (<span>\\(k = 4\\)</span>) and we add 32 elements (<span>\\(n = 32\\)</span>) then we end up with:</p>\n<div>$$(1 - (1 - \\frac{4}{512})^{32})^4 \\approx 0.002$$</div>\n<p>Which is about 0.2%. Increasing k initially reduces the false positive rate but there's an optimum number (<span>\\(k_{opt}\\)</span>) before the rate starts to rise. In this example that is <span>\\(k = 11\\)</span>:</p>\n<p><img alt=\"Graph of k vs false positive rate for m = 512, n = 32\" src=\"/static/bloom_false_positive_graph.png\"></p>\n<h3>Generalising Bloom filters</h3>\n<p>Now we understand how a Bloom filter works, let's look at how we can generalise the technique and apply it to create some data structures with interesting properties. 
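Before moving on, the false positive formula above is easy to check numerically. A short Python sketch (not from the original post; the function name is invented here) reproduces the 6.1% single-hash rate, the roughly 0.2% four-hash rate and the optimum of k = 11:

```python
# False positive rate of a partitioned Bloom filter:
# fp(k) = (1 - (1 - k/m)^n)^k, for m bits and n inserted elements.
def false_positive_rate(m: int, n: int, k: int) -> float:
    return (1 - (1 - k / m) ** n) ** k

m, n = 512, 32
print(round(false_positive_rate(m, n, 1), 3))  # 0.061 -> the 6.1% worked above
print(round(false_positive_rate(m, n, 4), 4))  # 0.0024 -> roughly 0.2%

# Increasing k only helps up to a point; scan for the optimum.
k_opt = min(range(1, 21), key=lambda k: false_positive_rate(m, n, k))
print(k_opt)  # 11, matching the graph
```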
We'll start by sketching out a general structure and then look at how traditional Bloom filters fit within it.</p>\n<p>For the generalised data structure we have an array <span>\\(B\\)</span> as before but instead of bits we have state of type <span>\\(T\\)</span>, i.e. <span>\\(B[0]\\)</span> is data of type T at the first position in array <span>\\(B\\)</span>.</p>\n<p>To add to the generalised Bloom filter we have an input element <span>\\(x\\)</span> and associated data <span>\\(y\\)</span>. We use <span>\\(k\\)</span> hash functions (<span>\\(h_0\\)</span>, <span>\\(h_1\\)</span>, etc.) as before. The algorithm is:</p>\n<div><pre><span></span><code><span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> </span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span> </span><span>=</span><span> </span><span>combine</span><span>(</span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span>,</span><span> </span><span>y</span><span>)</span>\n</code></pre></div>\n\n<p>So the main difference from before is that we use the <em>combine</em> function and the element's associated data to update the state at the hash partition location.</p>\n<p>Now let's look at how we would query the generalised Bloom filter:</p>\n<div><pre><span></span><code><span>p</span><span> </span><span>=</span><span> </span><span>list</span><span>()</span><span> </span><span>/* empty list */</span>\n\n<span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> 
</span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>p</span><span>.</span><span>append</span><span>(</span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span>)</span>\n\n<span>return</span><span> </span><span>reduce</span><span>(</span><span>p</span><span>)</span>\n</code></pre></div>\n\n<p>Here we take the state at each hash partition location, make a list of them and pass them to a <em>reduce</em> function. The <em>reduce</em> function returns a result that is some value <span>\\(d\\)</span> or none.</p>\n<p>Lastly we're going to state that the result from querying our generalised Bloom filter will have a <em>one-sided</em> error. We define a third function <em>error</em> and say that if an element <span>\\(x\\)</span> with associated data <span>\\(y\\)</span> is added to the generalised Bloom filter then querying for element <span>\\(x\\)</span> will always return some value <span>\\(d\\)</span> and the function error(<span>\\(y\\)</span>, <span>\\(d\\)</span>) will always return true. If an element <span>\\(x\\)</span> with associated data <span>\\(y\\)</span> was not added to the filter then querying will return none or, with some (tunable) probability, a value <span>\\(d\\)</span> which error(<span>\\(y\\)</span>, <span>\\(d\\)</span>) could return true or false for.</p>\n<p>So to summarise, our parameters are a type <span>\\(T\\)</span> for the data array and three functions <em>combine</em>, <em>reduce</em> and <em>error</em>. It turns out that if we pick those four parameters carefully, we can maintain a one-sided error. We'll start with the traditional Bloom filter and then get more adventurous.</p>\n<h3>Fitting the traditional filter</h3>\n<p><span>\\(T\\)</span> in the traditional filter is a single bit. 
<em>combine</em> is the max function<a href=\"#fn:2\">2</a>, <em>reduce</em> is the min function and <em>error</em> is equals.</p>\n<p>We also need to modify the input slightly: <span>\\(x\\)</span> is still the element to be added but the associated data <span>\\(y\\)</span> is 1 if the element is in the set we added to the filter and 0 if it is not.</p>\n<p>If you plug these in to the earlier pieces of pseudocode you should be able to convince yourself that the filter performs identically to the standard description of a partitioned Bloom filter.</p>\n<h3>Maxmin-based Bloom filters</h3>\n<p>It turns out if we leave <em>combine</em> as the max function, <em>reduce</em> as the min function and <em>error</em> is <span>\\(d \\geq y\\)</span> then we can still get one-sided errors as long as the elements of type T have a total order.</p>\n<p>By a total order we mean that we can compare any two elements of type T<a href=\"#fn:3\">3</a>. We'll see later why we're being very specific about this.</p>\n<p>Here's an example filter that would satisfy the above requirements. Assuming the associated data <span>\\(y\\)</span> for each element is a positive integer (think files with corresponding sizes in kilobytes) then we could use a maxmin-based Bloom filter with T being an unsigned integer. 
We initialise all elements of <span>\\(B\\)</span> to 0 and don't allow 0 as a <span>\\(y\\)</span>.</p>\n<p>To add to the example maxmin filter:</p>\n<div><pre><span></span><code><span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> </span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span> </span><span>=</span><span> </span><span>max</span><span>(</span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span>,</span><span> </span><span>y</span><span>)</span>\n</code></pre></div>\n\n<p>And to query:</p>\n<div><pre><span></span><code><span>p</span><span> </span><span>=</span><span> </span><span>list</span><span>()</span><span> </span><span>/* empty list */</span>\n\n<span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> </span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>p</span><span>.</span><span>append</span><span>(</span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span>)</span>\n\n<span>return</span><span> </span><span>min</span><span>(</span><span>*</span><span>p</span><span>)</span><span> </span><span>/* min in list */</span>\n</code></pre></div>\n\n<p>For an <span>\\(x\\)</span> added to the example filter, a query returns a result greater than or equal to the associated data <span>\\(y\\)</span> seen with <span>\\(x\\)</span>. 
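To make this concrete, here is a minimal Python sketch of such a maxmin filter (hypothetical code following the pseudocode above; the class name and the SHA-256-based hashing are choices made here, not part of the original post):

```python
import hashlib


class MaxminFilter:
    """Maxmin-based Bloom filter sketch: combine = max, reduce = min.

    Stores, for a key x, an over-estimate of the largest positive
    integer y it was added with. A slot value of 0 means "never set".
    """

    def __init__(self, m: int, k: int):
        assert m % k == 0, "m must split into k equal partitions"
        self.m, self.k = m, k
        self.slots = [0] * m

    def _index(self, i: int, x: str) -> int:
        # Hash i picks a position j inside partition i (each of size m/k).
        h = hashlib.sha256(f"{i}:{x}".encode()).digest()
        j = int.from_bytes(h[:8], "big") % (self.m // self.k)
        return i * (self.m // self.k) + j

    def add(self, x: str, y: int) -> None:
        assert y > 0
        for i in range(self.k):
            idx = self._index(i, x)
            self.slots[idx] = max(self.slots[idx], y)  # combine = max

    def query(self, x: str):
        values = [self.slots[self._index(i, x)] for i in range(self.k)]
        d = min(values)                                # reduce = min
        return None if d == 0 else d


f = MaxminFilter(m=512, k=4)
f.add("report.pdf", 120)   # a 120KB file
f.add("logo.png", 4)
print(f.query("report.pdf"))   # >= 120: never an under-estimate
```

The filter never under-estimates a key it has seen; the probabilities of a false positive or an over-estimate follow the analysis in the text.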
For <span>\\(x\\)</span> that wasn't added to the filter we will return <em>none</em> or, with some probability, a random number from the filter.</p>\n<p>There are actually two types of error here:</p>\n<ol>\n<li>We test an element <span>\\(x\\)</span> that we haven't added to the Bloom filter and get a value back for it instead of <em>none</em>. We call this a false positive.</li>\n<li>We test an element <span>\\(x\\)</span> that we <em>have</em> added to the Bloom filter and get a value back that is higher than the associated value <span>\\(y\\)</span> we originally added.</li>\n</ol>\n<p>Let's start with the false positive probability. We can reuse most of the reasoning from the partitioned Bloom filter false positive derivation earlier. Instead of the bit in <span>\\(B\\)</span> having been set by a different <span>\\(x\\)</span>, we're interested in the element in <span>\\(B\\)</span> being greater than zero, which means it has been set before.</p>\n<p>Since we have <span>\\(m\\)</span> elements in <span>\\(B\\)</span>, the probability that an element is set to greater than zero on adding to the filter is still <span>\\(\\frac{1}{m}\\)</span>. After <span>\\(n\\)</span> additions the probability that it is still zero is <span>\\((1 - \\frac{1}{m})^n\\)</span>. So we again have the same probability of giving a false positive as the traditional case:</p>\n<div>$$(1 - (1 - \\frac{k}{m})^n)^k$$</div>\n<p>What about the second error? For this to happen each of the <span>\\(k\\)</span> hash partition locations for an element <span>\\(x\\)</span> must contain a value that is higher than the associated data <span>\\(y\\)</span>. They all must because we take the minimum. There's a bit of difficulty involved in calculating this directly since this kind of error requires a hash collision in each partition <em>and</em> for that collision to result in a higher value. 
The probability clearly depends on <span>\\(y\\)</span>: if <span>\\(y\\)</span> is very low (or even the minimum we could see, which is 1) then any collision may lead to this error. If <span>\\(y\\)</span> is the maximum we could see then <em>no</em> collision will lead to this error.</p>\n<p>For now let us go with a very pessimistic assumption that a collision results in a higher value, implying <span>\\(y\\)</span> is the minimum we will see. This gives us an upper bound on the probability of the second type of error.</p>\n<p>Assume we add <span>\\(x\\)</span> to the filter and we now add a second item. Again we start with a single hash function <span>\\(k = 1\\)</span> and element array <span>\\(B\\)</span> of size <span>\\(m\\)</span>. What is the probability there was a collision? We picked a location randomly (using <span>\\(h_0(x)\\)</span>) for our element <span>\\(x\\)</span> and now we're picking one again for the second one - the probability of picking the same one is <span>\\(\\frac{1}{m}\\)</span>.</p>\n<p>Now we add <span>\\(n\\)</span> items, what's the chance we <em>didn't</em> collide? Not colliding after one is <span>\\(1-\\frac{1}{m}\\)</span>. After <span>\\(n\\)</span> items we end up with <span>\\((1-\\frac{1}{m})^n\\)</span>. Similar to earlier on with the traditional Bloom filter but reversed, an intuitive way of looking at this is the percentage of unset elements in <span>\\(B\\)</span>.</p>\n<p>So for a single hash function, our estimate (<span>\\(d\\)</span>) for the associated data <span>\\(y\\)</span> is exact with probability at least <span>\\((1-\\frac{1}{m})^n\\)</span>. Specifically:</p>\n<div>$$p(d-y > 0) \\leq 1 - (1-\\frac{1}{m})^n$$</div>\n<p>What is the situation with multiple hash functions? 
As we are taking the <em>min</em>, we only need one of the partitions not to have a collision to get an exact answer; an error requires a collision in every partition.</p>\n<p>The probability of having at least one collision in a partition after <span>\\(n\\)</span> elements is:</p>\n<div>$$1 - (1 - \\frac{k}{m})^n$$</div>\n<p>for <span>\\(k\\)</span> partitions this is:</p>\n<div>$$p(d - y > 0) \\leq (1 - (1 - \\frac{k}{m})^n)^k$$</div>\n<p>Which is the same as the equation we worked out for the first class of error! If we step back though this should make intuitive sense. Both classes of error depend on the proportion of data elements that are non-zero in <span>\\(B\\)</span>. In the first class of error we might read all non-zero elements and incorrectly return a value instead of none, and in the second class of error the more set elements we have, the greater the chance we have of colliding in every partition.</p>\n<p>I should point out at this point that this is a very pessimistic bound. It assumes that any collision leads to an increase in the stored value. There's also a question we've skirted around in this discussion and that is if we make an error of the second class, how big is our over-estimate? I think there's a follow-up blog post on that - though at this stage I'm not sure we can make statements on the over-estimate without making assumptions as to the distribution of <span>\\(y\\)</span>. 
I may be wrong, let's discuss on twitter: <a href=\"https://twitter.com/intent/tweet?screen_name=sadiqj&ref_src=twsrc%5Etfw\">tweet me</a>.</p>\n<p>Another thing to note about this filter is that while we only specified it for elements <span>\\(x\\)</span> with accompanying data <span>\\(y\\)</span>, you can add <span>\\(x\\)</span> multiple times to the filter with <em>different</em> <span>\\(y\\)</span>s and the filter will return an estimate for the maximum <span>\\(y\\)</span> it encountered.</p>\n<h3>Generalising this model further</h3>\n<p>Just to recap. For Maxmin filters we have <em>combine</em> as the max function, <em>reduce</em> as the min function and <em>error</em> is <span>\\(d \\geq y\\)</span>. Our elements are of type T and have a total order.</p>\n<p>We can go further than this with a technique <a href=\"https://arxiv.org/abs/cs/0306046\">Boldi & Vigna showed in 2004</a> called <em>Compressed Approximators</em>.</p>\n<p>For Compressed Approximators we have <em>combine</em> as the least upper bound, <em>reduce</em> as the greatest lower bound and <em>error</em> is still <span>\\(d \\geq y\\)</span>. We'll define these terms in a second. The elements are of type T and form a lattice. For T, this means two things.</p>\n<p>First, it means we have some way of comparing two elements of type T, a <em>partial order</em>. 
The partial part means not all elements are comparable - this is in contrast to a total order, where all elements are.</p>\n<p>The second is that for any two elements of type T we can find:</p>\n<ol>\n<li>the smallest element of type T that is greater than or equal to those two elements (the least upper bound)</li>\n<li>the largest element of type T that is less than or equal to those two elements (the greatest lower bound)</li>\n</ol>\n<p>This is convenient because we need to be able to do those two things for <em>combine</em> and <em>reduce</em>.</p>\n<p>We'll look at an example now to try to build up an intuitive feeling of what's going on. We base our example around the <em>inclusion</em> order of sets, which is a partial order<a href=\"#fn:4\">4</a>. Inclusion order means that if all elements of A are also in B, i.e. A is a subset of B (<span>\\(A \\subseteq B\\)</span>), then <span>\\(A \\leq B\\)</span>. For inclusion order, the least upper bound of two sets is their union and the greatest lower bound is their intersection<a href=\"#fn:5\">5</a>.</p>\n<h3>A multi-member Bloom Filter</h3>\n<p>Armed with this lattice, let's turn to a practical application. Imagine you have assets with a key <span>\\(x\\)</span> that may be cached on one or more of multiple caching servers, all of which are in front of some authoritative source (an origin for a CDN, for example). The caching servers may be ephemeral and so we don't want to simply rely on hashing the key to a single server.</p>\n<p>The most naive option would be to maintain a Bloom filter for each of the servers, but this requires us to potentially do as many tests as we have servers. 
We can construct a Compressed Approximator for this problem that we only need to test once.</p>\n<p>To add to the multi-member Bloom filter we do the following, assuming <span>\\(x\\)</span> is our key, <span>\\(y\\)</span> is the server it is now cached on, <span>\\(k\\)</span> is the number of hashes and <span>\\(B\\)</span> is an array of <span>\\(m\\)</span> sets.</p>\n<div><pre><span></span><code><span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> </span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span> </span><span>=</span><span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span>.</span><span>add</span><span>(</span><span>y</span><span>)</span>\n</code></pre></div>\n\n<p>So for each hash function and partition, we simply add <span>\\(y\\)</span> to the set at the hash's index (each partition spans <span>\\(m/k\\)</span> slots of <span>\\(B\\)</span>). We assume <em>add</em> inserts into the set if it exists and, if not, instantiates it to the singleton set containing just the parameter.</p>\n<p>To query the filter we do:</p>\n<div><pre><span></span><code><span>p</span><span> </span><span>=</span><span> </span><span>set</span><span>()</span><span> </span><span>/* empty set */</span>\n\n<span>for</span><span> </span><span>i</span><span> </span><span>=</span><span> </span><span>0</span><span> </span><span>to</span><span> </span><span>k</span><span>-</span><span>1</span>\n<span> </span><span>j</span><span> </span><span>=</span><span> </span><span>h</span><span>[</span><span>i</span><span>]</span><span>(</span><span>x</span><span>)</span>\n<span> </span><span>if</span><span> </span><span>i</span><span> </span><span>==</span><span> </span><span>0</span><span> </span><span>then</span>\n<span> </span><span>p</span><span> </span><span>=</span><span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span>\n<span> </span><span>else</span>\n<span> </span><span>p</span><span> </span><span>=</span><span> </span><span>intersection</span><span>(</span><span>p</span><span>,</span><span> </span><span>B</span><span>[</span><span>i*(m/k)+j</span><span>]</span><span>)</span>\n\n<span>return</span><span> </span><span>p</span>\n</code></pre></div>\n\n<p>We loop over each hash function and partition, taking the intersection of all the sets at those locations<a href=\"#fn:6\">6</a>.</p>\n<p>Again we have a data structure with two possible types of error. 
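</p>\n<p>To make the pseudocode concrete, here's a small Python sketch of the whole structure. This is my own illustrative code rather than anything from a real library: the salted-hash construction and the parameter choices are assumptions, and each of the <span>\\(k\\)</span> partitions spans <span>\\(m/k\\)</span> slots of <span>\\(B\\)</span>:</p>\n<div><pre><code>import hashlib\n\nclass MultiMemberBloom:\n    def __init__(self, m=1024, k=4):\n        assert m % k == 0\n        self.m, self.k = m, k\n        self.b = [set() for _ in range(m)]  # B: an array of m sets\n\n    def _h(self, i, x):\n        # salted hash for partition i, giving an index in [0, m/k)\n        digest = hashlib.sha256(f'{i}:{x}'.encode()).hexdigest()\n        return int(digest, 16) % (self.m // self.k)\n\n    def add(self, x, y):\n        # record that key x is cached on server y\n        for i in range(self.k):\n            j = self._h(i, x)\n            self.b[i * (self.m // self.k) + j].add(y)\n\n    def query(self, x):\n        # intersect the k sets that x hashes to\n        p = None\n        for i in range(self.k):\n            j = self._h(i, x)\n            s = self.b[i * (self.m // self.k) + j]\n            p = s.copy() if p is None else p.intersection(s)\n        return p\n</code></pre></div>\n<p>Querying a key we added always returns a set containing at least the right server; querying an unseen key returns the empty set unless every partition happens to collide.</p>\n<p>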
As with the maxmin filter, the first type of error is that we test an element we have not added to the filter and mistakenly get back a set of servers it may be cached on instead of the empty set.</p>\n<p>The second type of error is that when we test an element we have added to the filter, we may get back a set of caches that contains servers the item is not cached on, in addition to the right server.</p>\n<p>We can use the derivation from earlier for the first type of error; the probability of a false positive is:</p>\n<div>$$(1 - (1 - \\frac{k}{m})^n)^k$$</div>\n<p>For the second error we can also make a similar argument to the one for the maxmin filter: our probability of getting an extra caching server back is at most that of our false positive rate<a href=\"#fn:7\">7</a>.</p>\n<h3>Recap and conclusion</h3>\n<p>So to recap, we've discussed the standard Bloom filter and derived a probability for the false positive rate of the partitioned variant. We then looked at maxmin Bloom filters, which generalise them to elements that have a total order, including deriving probabilities for the false positive rate and the event that the value the filter returns is higher than the original value added. Finally, we looked at a further generalisation for elements that have a partial order, Compressed Approximators, and looked at one specific data structure using that model.</p>\n<p>We should be able to use this knowledge to come up with other novel probabilistic data structures based on Bloom filters that have properties which make them better suited to practical problems. I have a whole laundry list of other ones to write up for the next holiday. Want to stay tuned? Follow me on twitter: <a href=\"https://twitter.com/sadiqj?ref_src=twsrc%5Etfw\">@sadiqj</a></p>\n<div>\n\n\n<ol>\n<li>\n<p>Hopefully with uniform probability, if it's a good hash function. 
<a href=\"#fnref:1\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li>\n<p>Technically it doesn't have to be the max function, since you're effectively ignoring the existing state, but the symmetry with min as reduce makes the narrative a bit easier later on. <a href=\"#fnref:2\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n<li>\n<p>There are further restrictions on how the comparison must work; see <a href=\"https://en.wikipedia.org/wiki/Total_order\">the Wikipedia page</a> for more information. <a href=\"#fnref:3\" title=\"Jump back to footnote 3 in the text\">↩</a></p>\n</li>\n<li>\n<p>We're just going to casually steam past this, but Wikipedia <a href=\"https://en.wikipedia.org/wiki/Partially_ordered_set#Formal_definition\">has a page with the formal definitions</a>. We're also going to handwave a bit around the set definitions in this bit to keep things short. The takeaway should be how to use the framework to develop new data structures. <a href=\"#fnref:4\" title=\"Jump back to footnote 4 in the text\">↩</a></p>\n</li>\n<li>\n<p>This only works if all the elements that could be in either set come from some other set, say S. More specifically, we say that both of the sets belong to the power set of S. <a href=\"#fnref:5\" title=\"Jump back to footnote 5 in the text\">↩</a></p>\n</li>\n<li>\n<p>We could optimise this by early-outing if we ever get an empty set from a hash location. <a href=\"#fnref:6\" title=\"Jump back to footnote 6 in the text\">↩</a></p>\n</li>\n<li>\n<p>We can probably make the bound stronger by making some assumptions on how we use the filter and how many caching servers we add items to. That's material for another post though! <a href=\"#fnref:7\" title=\"Jump back to footnote 7 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
+20
sadiqj/www.toao.com_2025-01-30__blog_json-output-from-deepseek-r1-and-distills-with-llamacpp.json
···+"id": "tag:www.toao.com,2025-01-30:/blog/json-output-from-deepseek-r1-and-distills-with-llamacpp",+"summary": "<p>Here's a workaround to get JSON output from Deepseek R1 and distills when using the llama.cpp OpenAI-compatible server endpoint</p>",+"content": "<p>We're evaluating Deepseek R1 and its distills for our project on <a href=\"https://anil.recoil.org/projects/ce\">identifying and extracting evidence from literature</a>. One part of our pipeline needs a strong reasoning model with JSON structured output.</p>\n<p>Using llama-server's OpenAI compatible completion endpoint with response_format and JSON schema conflicts with the model's reasoning in <think> tags. Here's a workaround using the grammar functionality.</p>\n<p>First, grab <a href=\"https://github.com/ggerganov/llama.cpp/blob/master/examples/json_schema_to_grammar.py\">json_schema_to_grammar</a> from the llama.cpp repo.</p>\n<p>Next you can use the following:</p>\n<div><pre><span></span><code><span>def</span> <span>convert_schema_to_grammar</span><span>(</span><span>json_schema</span><span>:</span> <span>dict</span><span>)</span> <span>-></span> <span>str</span><span>:</span>\n <span>converter</span> <span>=</span> <span>json_schema_to_grammar</span><span>.</span><span>SchemaConverter</span><span>(</span>\n <span>prop_order</span><span>=</span><span>{},</span>\n <span>allow_fetch</span><span>=</span><span>False</span><span>,</span>\n <span>dotall</span><span>=</span><span>False</span><span>,</span>\n <span>raw_pattern</span><span>=</span><span>False</span>\n <span>)</span>\n\n <span>converter</span><span>.</span><span>visit</span><span>(</span><span>json_schema</span><span>,</span> <span>'json-schema'</span><span>)</span>\n <span>json_grammar</span> <span>=</span> <span>converter</span><span>.</span><span>format_grammar</span><span>()</span>\n\n <span># do the root gbnf grammar that handles the <think> and </think> tags</span>\n <span># lack of less-than didn't hinder performance for us but 
ymmv!</span>\n <span>base_rules</span> <span>=</span> <span>"""</span>\n<span>root ::= "<think>" [^<]+ "</think>" [</span><span>\\\\</span><span>n]* json-schema</span>\n<span>"""</span>\n\n <span>return</span> <span>base_rules</span> <span>+</span> <span>json_grammar</span>\n</code></pre></div>\n\n<p>Then use the 'extra_body' parameter in your request:</p>\n<div><pre><span></span><code><span>client</span> <span>=</span> <span>OpenAI</span><span>()</span>\n\n<span>params</span> <span>=</span> <span>{</span>\n <span>"model"</span><span>:</span> <span>"..."</span><span>,</span>\n <span>"extra_body"</span><span>:</span> <span>{</span>\n <span>"grammar"</span><span>:</span> <span>grammar</span>\n <span>}</span>\n<span>}</span>\n\n<span>response</span> <span>=</span> <span>client</span><span>.</span><span>chat</span><span>.</span><span>completions</span><span>.</span><span>create</span><span>(</span>\n <span>**</span><span>params</span>\n<span>)</span>\n</code></pre></div>\n\n<p>Example below:</p>\n<div><pre><span></span><code>Provide a fictional name and address using the following JSON schema:\n\n{\n "type": "object",\n "properties": {\n "name": {\n "type": "string"\n },\n "age": {\n "type": "integer"\n },\n "address": {\n "type": "object",\n "properties": {\n "street": {\n "type": "string"\n },\n "city": {\n "type": "string"\n },\n "zip": {\n "type": "string"\n }\n }\n }\n }\n}\n\n<think>\nOkay, so the user wants me to provide a fictional name and address following a specific JSON schema. Let me break this down.\n\nFirst, the schema has an object with name, age, and address. The address itself is another object containing street, city, and zip. I need to make sure each field is of the correct type: strings for name, street, city, zip, and an integer for age.\n\nI should come up with a plausible name. Maybe something common like Emily Carter. Age should be an adult, say 32. For the address, I'll pick a street name, perhaps Maple Street, a number like 147. 
The city could be something like Riverton, and the zip code needs to be a 5-digit number, maybe 12345.\n\nWait, is Riverton a real city? I think there are places named Riverton in various states, but since it's fictional, it doesn't matter. Alternatively, I could make up a city name, but using a real one might be easier.\n\nPutting it all together, I'll structure the JSON with these values. I need to ensure that the syntax is correct, with proper commas and brackets. Also, the keys should match exactly: name, age, address, street, city, zip.\n\nI should double-check that all the types are correct. Name is a string, age is an integer, and each part of the address is a string. That should satisfy the schema requirements.\n\nI think that's all. Time to put it together in the response.\n</think>\n\n{\n "name": "Emily Carter",\n "age": 32,\n "address": {\n "street": "147 Maple Street",\n "city": "Riverton",\n "zip": "12345"\n }\n}\n</code></pre></div>\n\n<p>This approach lets you preserve model reasoning while enforcing JSON output structure. Enjoy!</p>",
+20
sadiqj/www.toao.com_2025-05-01__blog_opam-archive-dataset.json
···+"summary": "<p>Introducing opam-archive-dataset: a Parquet dataset containing code from OCaml packages, designed to improve performance of language models for OCaml development through better training data</p>",+"content": "<p>If you have tried to use locally-hostable language models to develop OCaml code, then you will have noticed that performance in more niche languages significantly lags behind Python or JavaScript. <a href=\"https://www.cst.cam.ac.uk/people/jjl25\">Jon Ludlam</a>, <a href=\"https://anil.recoil.org/\">Anil Madhavapeddy</a> and I have been doing some work on this recently and there will be more on that soon.</p>\n<p>To improve code models, we first need data. To help with that, I've created <a href=\"https://huggingface.co/datasets/sadiqj/opam-archive-dataset\">opam-archive-dataset</a>, which periodically takes the code for all packages from the <a href=\"https://hub.docker.com/r/ocaml/opam/tags?name=archive\">ocaml/opam:archive</a> Docker image, filters for the most recent version of each package, and then converts everything into the columnar Parquet format. 
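</p>\n<p>One way to consume it is via the Hugging Face datasets library; here's a minimal sketch (my code, not from the dataset docs, and the split name is an assumption):</p>\n<div><pre><code>from datasets import load_dataset\n\n# downloads the Parquet files and exposes them as a Dataset\nds = load_dataset('sadiqj/opam-archive-dataset', split='train')\n\n# keep only the .ml implementation files\nml_files = ds.filter(lambda row: row['file_type'] == 'ml')\nprint(len(ml_files))\n</code></pre></div>\n<p>The underlying files are plain Parquet, so any Parquet-aware tool can read them directly. 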
This is a very efficient format and results in a ~800MB set of files.</p>\n<p>To use the dataset and run queries over it, you can use the <a href=\"https://huggingface.co/docs/datasets/en/load_hub\">Hugging Face datasets library</a> or if you prefer SQL then you can do the following:</p>\n<div><pre><span></span><code><span># </span>clone<span> </span>the<span> </span>dataset<span> </span>from<span> </span>huggingface\n<span>sadiq@server:opam-archive$ </span>git<span> </span>clone<span> </span>https://huggingface.co/datasets/sadiqj/opam-archive-dataset\n<span>Cloning into 'opam-archive-dataset'...</span>\n<span>remote: Enumerating objects: 17, done.</span>\n<span>remote: Total 17 (delta 0), reused 0 (delta 0), pack-reused 17 (from 1)</span>\n<span>Unpacking objects: 100% (17/17), 4.31 KiB | 315.00 KiB/s, done.</span>\n<span>Filtering content: 100% (3/3), 388.79 MiB | 14.30 MiB/s, done.</span>\n\n<span># </span>grab<span> </span>clickhouse\n<span>sadiq@server:opam-archive$ </span>curl<span> </span>https://clickhouse.com/<span> </span><span>|</span><span> </span>sh\n<span>Successfully downloaded the ClickHouse binary, you can run it as:</span>\n<span> ./clickhouse</span>\n\n<span>You can also install it:</span>\n<span>sudo ./clickhouse install</span>\n\n<span># </span>we<span> </span><span>do</span><span> </span>not<span> </span>need<span> </span>to<span> </span>install<span> </span>it!<span> </span>We<span> </span>use<span> </span>clickhouse<span> </span><span>local</span>\n<span>sadiq@server:opam-archive$ </span>./clickhouse<span> </span><span>local</span>\n\n<span>./clickhouse local</span>\n<span>ClickHouse local version 25.5.1.1804 (official build).</span>\n\n<span>:) -- let's have a look at a few rows</span>\n<span>SELECT * FROM file('opam-archive-dataset/data/', Parquet) LIMIT 1;</span>\n\n<span>Query id: 0f786705-1568-40ac-837b-004457c3519d</span>\n\n<span>Row 1:</span>\n<span>\u2500\u2500\u2500\u2500\u2500\u2500</span>\n<span>package_name: 
dune-action-plugin</span>\n<span>version: 3.18.1</span>\n<span>license: MIT</span>\n<span>homepage: https://github.com/ocaml/dune</span>\n<span>dev_repo: git+https://github.com/ocaml/dune.git</span>\n<span>file_type: dune</span>\n<span>file_path: dune-3.18.1/test/blackbox-tests/test-cases/formatting/feature.t/enabled/dune-ocaml-syntax/dune</span>\n<span>file_contents: (* -*- tuareg -*- *)</span>\n\n<span>let</span>\n<span>()</span> <span>=</span>\n<span>Jbuild_plugin.V1.send {|</span>\n<span>(alias</span>\n<span> (name runtest)</span>\n<span> (action (echo "ocaml syntax")))</span>\n<span>|}</span>\n\n<span>:) -- Let's count how many rows we have</span>\n<span>SELECT COUNT(*) FROM file('opam-archive-dataset/data/', Parquet);</span>\n\n<span>SELECT COUNT(*)</span>\n<span>FROM file('opam-archive-dataset/data/', Parquet)</span>\n\n<span>Query id: 3ee6eb4b-13b7-47aa-be67-d027c81b47b0</span>\n\n<span> \u250c\u2500COUNT()\u2500\u2510</span>\n<span>1. \u2502 198862 \u2502</span>\n<span> \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518</span>\n\n<span>1 row in set. Elapsed: 0.013 sec. </span>\n\n<span>:) -- How many unique packages are spawning Domains?</span>\n<span>SELECT COUNT(DISTINCT package_name) FROM file('opam-archive-dataset/data/', Parquet) WHERE position('Domain.spawn', file_contents) > 0;</span>\n\n<span>SELECT COUNTDistinct(package_name)</span>\n<span>FROM file('opam-archive-dataset/data/', Parquet)</span>\n<span>WHERE position('Domain.spawn', file_contents) > 0</span>\n\n<span>Query id: 6f0978d9-3907-4572-bf5e-99aa4e2fceb8</span>\n\n<span> \u250c\u2500COUNTDistinct(package_name)\u2500\u2510</span>\n<span>1. \u2502 193 \u2502</span>\n<span> \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518</span>\n\n<span>1 row in set. Elapsed: 0.723 sec. 
Processed 197.86 thousand rows, 402.85 MB (273.81 thousand rows/s., 557.48 MB/s.)</span>\n<span>Peak memory usage: 385.88 MiB.</span>\n</code></pre></div>\n\n<p>We currently extract the package name, version, license, dev repo, file type (dune, opam, mli, ml, .c and .h), file path and the contents itself.</p>\n<p>If there are any extra fields that would be useful, let <a href=\"https://bsky.app/profile/sadiq.toao.com\">me</a> know. Enjoy!</p>",
+20
sadiqj/www.toao.com_2025-05-06__blog_ocaml-local-code-models.json
···+"title": "Qwen3 Leads the Pack: Evaluating how Local LLMs tackle First Year CS OCaml exercises",+"summary": "<p>How well can locally-runnable language models handle OCaml code generation? We evaluate 19 open-weight LLMs on first-year Computer Science exercises, exploring the balance between model size, architecture, and reasoning capabilities for less mainstream programming languages.</p>",+"content": "<p><em>TL;DR</em> Qwen3's self-hostable-sized models are very strong and would score top marks on Cambridge's first year Computer Science OCaml exercises.</p>\n<p>Large language models (LLMs) are transforming software development. What if you use a language outside the mainstream that has relatively little training data though? Worse, what if you want to use features that are still under development or only recently released? The model can't have seen those features during training and so won't make use of them when generating unless explicitly prompted to do so. My colleagues <a href=\"https://www.cst.cam.ac.uk/people/jjl25\">Jon Ludlam</a>, Anil Madhavapeddy and I <a href=\"https://anil.recoil.org/notes/claude-copilot-sandbox\">have been thinking about this</a> in the context of OCaml and some of the <a href=\"https://blog.janestreet.com/oxidizing-ocaml-locality/\">new extensions under development at Jane Street</a>. Having a process for teaching models to use novel language features would not just be useful for interactive code generation but could also enable retrofitting existing codebases or, if cheap enough, could be used in the design process for the features themselves.</p>\n<p>Before we dived in though, we wanted to understand just how good the kinds of models are that a developer could run either locally or on a shared local inference server. 
To do that we used the OCaml tutorial problems that first year Computer Science students at the university tackle on the <a href=\"https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/\">Foundations of Computer Science</a> course taught by Anil and Jon. These are <a href=\"https://jon.recoil.org/blog/2025/04/this-site.html\">interactive Jupyter notebooks</a> where students populate answer cells and their solutions are automatically checked. Each tutorial is known as a 'tick' and has one or more questions. Students must complete earlier questions in order to progress to later questions. There are also 'starred' ticks, which are essentially stretch goals and harder than the non-starred ones.</p>\n<h1>Terminology</h1>\n<p>A quick bit of terminology that will help in the later sections:</p>\n<p><em>Tokens</em> are the units that Large Language Models take as input and output. A token is roughly three quarters of a word in length on average, and tokens come from a fixed vocabulary for a given model. Here's <a href=\"https://huggingface.co/deepseek-ai/DeepSeek-R1/raw/main/tokenizer.json\">Deepseek R1's vocab</a> (warning: it's about 8 MB of JSON).</p>\n<p><em>Reasoning models</em>, or models with a thinking mode, are given some space in which to reason through their approach to the problem before generating an answer. After pre-training on very large text corpora, reasoning models undergo a phase of Reinforcement Learning. If you want to know more, Sebastian Raschka has a <a href=\"https://sebastianraschka.com/blog/2025/the-state-of-reinforcement-learning-for-llm-reasoning.html\">great overview article</a>. The latest Qwen3 models from Alibaba have the ability to selectively enable thinking mode. This lets you trade off latency and performance. More on this later.</p>\n<p><em>OpenRouter</em> is a marketplace for LLM providers. 
It provides a unified OpenAI-compatible API and then routes requests to different model providers based on the model you specify and any filtering options (e.g., no training on data, support for certain features). If you only use proprietary models then this is less interesting: they are often only available via a small handful of APIs that largely have similar pricing (e.g., for Gemini you can go via the Gemini or Vertex API). If you use open weights models though, there are a plethora of providers all competing on price, latency and throughput. For example, if you want <a href=\"https://openrouter.ai/meta-llama/llama-3.3-70b-instruct\">Llama 3.3 70B</a> you can currently choose the cheapest (inference.net) at $0.10 per million tokens input and $0.25 per million tokens output with 1.02s time-to-first-token and 28.41 tokens per second throughput. On the other hand, if throughput is your primary concern then there's Groq at 372.2 tokens per second, but you'll pay a premium at $0.59 in and $0.79 out.</p>\n<p><em>Distillation</em> is when you use the output of a much larger model to improve the performance of a smaller model. When releasing Deepseek-R1, Deepseek also released a series of smaller (originally non-Deepseek) models that had been fine-tuned on the outputs of Deepseek-R1.</p>\n<p><em>Mixture of experts</em> is a type of architecture where only a subset of the network weights are activated for each token processed. 
This can lead to a performance win at inference time and is in contrast to a <em>dense</em> model in which every processed token interacts with every non-embedding model parameter.</p>\n<h1>Models</h1>\n<p>We took the following open weight models, using instruct variants where available:</p>\n<ul>\n <li>Google Gemma3 12B and 27B</li>\n <li>Deepseek-R1-Llama 70B</li>\n <li>Meta Llama-3.1 8B and 70B</li>\n <li>Microsoft Phi-4</li>\n <li>Mistral 8B</li>\n <li>Mistral Nemo (12B)</li>\n <li>Mistral Small 24B</li>\n <li>Qwen 2.5 7B and 72B</li>\n <li>Qwen 2.5 Coder 7B and 32B</li>\n <li>Qwen 3 8B, 14B, 32B and 30BA3B (Thinking and non-thinking)</li>\n</ul>\n\n<p>We gave the models a similar setup to the students. Each model had 3 attempts to complete each question by producing a code block that would pass the notebook's tests. If they produced a passing block of code, they proceeded to the next question. We repeated this 5 times per model per tick. We used OpenRouter for inference.</p>\n<p>The tick questions tested a range of concepts such as recursion, data structures, streams, etc. We'll look at them in more depth in the results.</p>\n<h1>Results</h1>\n<p>Here's a graph of the success rate against parameter count for all models. Note the logarithmic x-axis:</p>\n\n\n<p><img alt=\"Performance against model parameters\" src=\"/static/model_parameters.png\"></p>\n<p>And a table of <b>overall success rate</b>. 
This is the percentage of tick-question pairs that a model was able to solve given 3 attempts.</p>\n\n\n<div>\n<table>\n<thead>\n<tr><th>Model</th><th>Overall Success Rate</th></tr>\n</thead>\n<tbody>\n<tr><td>anthropic/claude-3.7-sonnet:thinking</td><td>96.4%</td></tr>\n<tr><td>qwen/qwen3-32b:thinking</td><td>95.2%</td></tr>\n<tr><td>qwen/qwen3-14b:thinking</td><td>84.2%</td></tr>\n<tr><td>qwen/qwen3-30b-a3b:thinking</td><td>83.0%</td></tr>\n<tr><td>qwen/qwen-2.5-coder-32b-instruct</td><td>77.6%</td></tr>\n<tr><td>meta-llama/llama-3.3-70b-instruct</td><td>69.1%</td></tr>\n<tr><td>qwen/qwen3-8b:thinking</td><td>66.9%</td></tr>\n<tr><td>qwen/qwen3-32b</td><td>62.4%</td></tr>\n<tr><td>mistralai/mistral-small-3.1-24b-instruct</td><td>58.2%</td></tr>\n<tr><td>qwen/qwen3-14b</td><td>56.4%</td></tr>\n<tr><td>qwen/qwen3-30b-a3b</td><td>48.5%</td></tr>\n<tr><td>deepseek/deepseek-r1-distill-llama-70b</td><td>44.2%</td></tr>\n<tr><td>google/gemma-3-27b-it</td><td>33.9%</td></tr>\n<tr><td>qwen/qwen3-8b</td><td>33.9%</td></tr>\n<tr><td>microsoft/phi-4</td><td>32.1%</td></tr>\n<tr><td>google/gemma-3-12b-it</td><td>28.5%</td></tr>\n<tr><td>mistralai/mistral-nemo</td><td>26.1%</td></tr>\n<tr><td>meta-llama/llama-3.1-8b-instruct</td><td>25.5%</td></tr>\n<tr><td>mistral/ministral-8b</td><td>15.8%</td></tr>\n</tbody>\n</table>\n</div>\n\n<h1>High-level takeaways</h1>\n<p><b>Qwen 3 models performed very well.</b> The best-performing model was qwen3-32b in thinking mode at 95.2%. This is very close to Anthropic's Claude 3.7 Sonnet with thinking. At Qwen3 32B's current <a href=\"https://openrouter.ai/qwen/qwen3-32b\">$0.10 per million input tokens and $0.30 per million output tokens</a> this is up to 50x cheaper than <a href=\"https://www.anthropic.com/pricing#api\">Claude 3.7's $3 M in / $15 M out pricing</a>. Even better, the Qwen3 32B model is Apache2 licensed and so can be self-hosted. In addition, all Qwen3 models in non-thinking mode outperformed comparably-sized models from other families.</p>\n<p><b>Qwen2.5's coding variant was the best non-thinking model.</b> The Qwen2.5-Coder-32B model performed well at 77.6% and was the best non-thinking model. 
It outperformed Qwen3 32B, which suggests <a href=\"https://x.com/ggerganov/status/1918373399891513571\">the upcoming Qwen3-Coder model</a> should be very strong.</p>\n<p><b>Qwen3 with thinking on performed better than non-thinking variants.</b> In every case, allowing Qwen3 to think significantly improved performance. The qwen3-8b model with thinking on was very close in performance to llama-3.3-70b, a model almost an order of magnitude larger. Note though that this comes at the expense of increased latency and cost. In most cases thinking added 2,000-3,000 tokens of reasoning before the model produced its final answer; for Qwen3 32B on OpenRouter that is roughly a minute of thinking. There were also cases where the model entered a loop and kept reasoning until it hit the token limit.</p>\n<p><b>The Qwen3 30B-A3B MoE model could be a good compromise.</b> Whilst it has 30 billion parameters, the MoE model only activates about 3 billion of them for each token processed. This can be a substantial performance win when doing inference and indeed you can see the throughput ratio on OpenRouter between <a href=\"https://openrouter.ai/qwen/qwen3-32b\">Qwen3 32B</a> and <a href=\"https://openrouter.ai/qwen/qwen3-30b-a3b\">Qwen3 30B-A3B</a>. I suspect that gap will widen as providers do more optimisation of their MoE serving code.</p>\n<p><b>Within a family, more parameters means better performance.</b> Within the Gemma and Qwen families, larger parameter counts generally led to better performance, as seen in the graph. 
Presumably there is <em>some</em> OCaml in the pretraining corpus and the larger models have a higher chance of retaining information from it.</p>\n<p><b>Deepseek R1's distilled Llama 3.3 70B reasoning model did not beat the base model.</b> The Deepseek-R1-Llama-3.3-70B model, which Deepseek produced by distilling from their large Deepseek R1 model, performed worse than Llama-3.3-70B, the model it was based on.</p>\n<h1>What went wrong?</h1>\n<p>There were some consistent themes amongst failures, especially across the smaller models. Here are some of the most common:</p>\n<h2>Syntax errors</h2>\n<p>Many models had basic syntax errors in their solutions, such as incorrect comment syntax or a lack of <code>in</code>. There were also cases of incorrect function application syntax. A very common mistake was not including <code>rec</code> for recursive functions.</p>\n<h2>Type system confusion</h2>\n<p>On several of the tick questions this involved confusing the int and float operators (+/+., */*.). A common pattern was changing just one of the operators in response to a compile error, then changing another on the next error, and so on.</p>\n<h2>Hallucinated functions</h2>\n<p>Some failures came from hallucinating functions such as <code>List.sub</code> or <code>List.combinations</code>, as well as assuming things like <code>Core</code> and <code>Format</code> were available. This is curious as even small models (like Qwen2.5-Coder-7B) assumed Core was available and yet still made fundamental OCaml syntactic errors.</p>\n<h2>Recursion</h2>\n<p>Beyond the failure to include <code>rec</code>, the models struggled with recursion. There were several incorrect base cases in questions involving streams, which resulted in a fair bit of non-termination in those questions. 
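</p>\n<p>To make these failure modes concrete, here's an invented snippet in the spirit of what we saw (my own illustration, not actual model output):</p>\n<div><pre><code>(* typical model output: missing 'rec' and an int operator applied to floats *)\nlet sum_sq n =\n  if n = 0 then 0. else float_of_int (n * n) + sum_sq (n - 1)\n\n(* corrected: 'rec' included and the float operator +. used *)\nlet rec sum_sq n =\n  if n = 0 then 0. else float_of_int (n * n) +. sum_sq (n - 1)\n</code></pre></div>\n<p>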
It would be interesting to try rewriting some of the tick questions as iterative and seeing whether success rates increase.</p>\n<h1>Limitations</h1>\n<p>These are only introductory tasks aimed at first year Computer Science students and so are not representative of the complexity of larger OCaml codebases that may also use advanced language features.</p>\n<h1>Summary</h1>\n<p>Qwen3 had an impressive showing. I had actually already written a draft of this blog post before Qwen3 was released, and re-running the evaluation to include it led to a very different outcome!</p>\n<p>So what next? It would be interesting to understand where the gap in performance is when thinking mode is on and off. Are there common mistakes Qwen3 is making when thinking is off?</p>\n<p>As a next step we clearly need to move beyond first-year Computer Science questions, especially as the top models are already saturating on this test. Anil reports that performance on vibe-coded OCaml projects through Claude lags behind its abilities in Python.</p>\n<p>Next, how do we improve their performance? For this we really need more data. We now have a dataset of <a href=\"https://huggingface.co/datasets/sadiqj/opam-archive-dataset\">OCaml code extracted from opam</a> that is regularly updated but still small (~80k ml/mli files), and the difficulty level of each bit of code is unclear. This is important because there is good evidence (<a href=\"https://arxiv.org/abs/2403.09472\">Sun et al</a>, <a href=\"https://arxiv.org/abs/2504.00829\">Ji et al</a>) that going from easy to difficult examples when training via reinforcement learning can improve performance.</p>\n<p>One promising avenue is <a href=\"https://arxiv.org/abs/2308.09895\">Cassano et al</a> who have done some work leveraging training data from 'high resource' languages to generate code and validation tests in target 'low resource' languages. 
This approach could be extended to novel language features, as well as being used for reinforcement learning rather than just fine-tuning.</p>\n<p>Unrelated to OCaml, I'm looking forward to testing out how these models perform on the <a href=\"https://anil.recoil.org/projects/ce/\">Conservation Evidence work</a> that Anil and I are involved in. We currently self-host Deepseek-R1-Llama-3.3-70B and switching to Qwen3 32B could substantially improve the performance at one of our crucial bottlenecks.</p>",
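For readers less familiar with OCaml, here is a minimal sketch (my own illustration, not taken from the tick questions themselves) of the idioms the models most often got wrong: the `rec` keyword, `in` for local bindings, and the separate int/float operators:

```ocaml
(* [rec] is required for recursive definitions; omitting it was one of
   the most common mistakes. *)
let rec sum_list = function
  | [] -> 0
  | x :: xs -> x + sum_list xs

(* Int and float arithmetic use distinct operators: + vs +., * vs *.
   Mixing them is a type error, not a warning. *)
let mean xs =
  List.fold_left ( +. ) 0.0 xs /. float_of_int (List.length xs)

(* Local bindings inside a function body need [in]. *)
let hypot a b =
  let a2 = a *. a in
  let b2 = b *. b in
  sqrt (a2 +. b2)
```

A model that writes `sum_list` without `rec`, or sums floats with `+`, gets a compile error immediately, which is exactly the churn described above.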
+20
sadiqj/www.toao.com_2025-06-27__blog_check-with-gemini.json
···+"summary": "<p>A simple MCP server that enables Claude Code (or any MCP host) to check things with gemini-cli</p>",+"content": "<p><a href=\"https://github.com/google-gemini/gemini-cli\">gemini-cli</a> was released a few days ago with a generous free tier that gives access to Gemini 2.5 Pro, a model with a million token context window. I'm already a user of <a href=\"https://www.anthropic.com/claude-code\">Claude Code</a>, so I wondered if I could have the two talk to each other. Enter <a href=\"https://github.com/sadiqj/check-with-gemini/\">check-with-gemini</a>, a simple MCP server that lets Claude Code call Gemini to help analyse codebases or check plans.</p>\n<p>The MCP server has a single tool, <code>check_with_gemini</code>, which takes a prompt and content. Here are some examples of it in use.</p>\n<h2>Examples</h2>\n<p>I asked Claude Code to explain a codebase to me without Gemini:</p>\n<div><pre><span></span><code><span>></span><span> </span><span>Analyse</span><span> </span><span>this</span><span> </span><span>codebase</span><span> </span><span>in</span><span> </span><span>depth</span><span>.</span><span> </span><span>Produce</span><span> </span><span>a</span><span> </span><span>short</span><span> </span><span>two</span><span> </span><span>paragraph</span><span> </span><span>summary</span><span> </span><span>for</span><span> </span><span>a</span><span> </span><span>potential</span><span> </span><span>new</span><span> </span><span>contributor</span><span>.</span>\n\n<span>\u25cf</span><span> </span><span>I</span><span>'ll analyze this codebase to understand its structure and purpose, then provide a summary for potential contributors.</span>\n\n<span>\u25cf</span><span> </span><span>Task</span><span>(</span><span>Analyze</span><span> </span><span>codebase</span><span> </span><span>structure</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Done</span><span> </span><span>(</span><span>10</span><span> 
</span><span>tool</span><span> </span><span>uses</span><span> </span><span>\u00b7</span><span> </span><span>44.6</span><span>k</span><span> </span><span>tokens</span><span> </span><span>\u00b7</span><span> </span><span>1</span><span>m</span><span> </span><span>28.6</span><span>s</span><span>)</span><span> </span>\n\n<span>\u25cf</span><span> </span><span>This</span><span> </span><span>codebase</span><span> </span><span>is</span><span> </span><span>an</span><span> </span><span>AI</span><span> </span><span>Pipeline</span><span> </span><span>Visualization</span><span> </span><span>project</span><span> </span><span>that</span><span> </span><span>creates</span><span> </span><span>an</span><span> </span><span>interactive</span><span> </span><span>3</span><span>D</span><span> </span><span>visualization</span><span> </span><span>of</span><span> </span><span>academic</span><span> </span><span>papers</span><span> </span><span>and</span><span> </span><span>their</span><span> </span><span>hierarchical</span><span> </span><span>clustering</span><span> </span><span>relationships</span><span>.</span><span> </span><span>Built</span><span> </span><span>with</span><span> </span><span>C</span><span>++</span><span>20</span><span> </span><span>and</span><span> </span><span>OpenGL</span><span> </span><span>4.1</span><span>,</span><span> </span><span>it</span><span> </span><span>renders</span>\n<span> </span><span>thousands</span><span> </span><span>of</span><span> </span><span>papers</span><span> </span><span>as</span><span> </span><span>animated</span><span> </span><span>points</span><span> </span><span>in</span><span> </span><span>3</span><span>D</span><span> </span><span>space</span><span>,</span><span> </span><span>grouped</span><span> </span><span>into</span><span> </span><span>transparent</span><span> </span><span>convex</span><span> </span><span>hull</span><span> </span><span>clusters</span><span> </span><span>representing</span><span> </span><span>semantic</span><span> 
</span><span>topics</span><span>.</span><span> </span><span>The</span><span> </span><span>system</span><span> </span><span>loads</span><span> </span><span>paper</span><span> </span><span>data</span><span> </span><span>from</span><span> </span><span>CSV</span><span> </span><span>files</span><span> </span><span>containing</span><span> </span><span>titles</span><span>,</span><span> </span><span>coordinates</span><span>,</span><span> </span><span>and</span>\n<span> </span><span>cluster</span><span> </span><span>assignments</span><span> </span><span>at</span><span> </span><span>multiple</span><span> </span><span>hierarchical</span><span> </span><span>depths</span><span> </span><span>(</span><span>4</span><span>-</span><span>64</span><span> </span><span>clusters</span><span>),</span><span> </span><span>then</span><span> </span><span>provides</span><span> </span><span>real</span><span>-</span><span>time</span><span> </span><span>animation</span><span> </span><span>showing</span><span> </span><span>paper</span><span> </span><span>exploration</span><span> </span><span>progress</span><span> </span><span>with</span><span> </span><span>interactive</span><span> </span><span>camera</span><span> </span><span>controls</span><span> </span><span>and</span><span> </span><span>dynamic</span><span> </span><span>statistics</span><span>.</span>\n\n<span> </span><span>For</span><span> </span><span>contributors</span><span>,</span><span> </span><span>the</span><span> </span><span>project</span><span> </span><span>follows</span><span> </span><span>a</span><span> </span><span>clean</span><span> </span><span>architecture</span><span> </span><span>with</span><span> </span><span>clear</span><span> </span><span>separation</span><span> </span><span>of</span><span> </span><span>concerns</span><span>:</span><span> </span><span>PaperLoader</span><span> </span><span>handles</span><span> </span><span>data</span><span> </span><span>parsing</span><span>,</span><span> 
</span><span>ClusterRenderer</span><span> </span><span>manages</span><span> </span><span>3</span><span>D</span><span> </span><span>visualization</span><span>,</span><span> </span><span>and</span><span> </span><span>the</span><span> </span><span>OpenGL</span><span> </span><span>framework</span><span> </span><span>provides</span>\n<span> </span><span>efficient</span><span> </span><span>instanced</span><span> </span><span>rendering</span><span>.</span><span> </span><span>The</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>uses</span><span> </span><span>CMake</span><span> </span><span>with</span><span> </span><span>dependencies</span><span> </span><span>on</span><span> </span><span>GLFW3</span><span>,</span><span> </span><span>GLM</span><span>,</span><span> </span><span>FreeType2</span><span>,</span><span> </span><span>and</span><span> </span><span>Assimp</span><span>.</span><span> </span><span>Key</span><span> </span><span>areas</span><span> </span><span>for</span><span> </span><span>contribution</span><span> </span><span>include</span><span> </span><span>improving</span><span> </span><span>Windows</span><span> </span><span>platform</span><span> </span><span>support</span><span> </span><span>(</span><span>currently</span><span> </span><span>incomplete</span><span>),</span>\n<span> </span><span>optimizing</span><span> </span><span>rendering</span><span> </span><span>performance</span><span> </span><span>for</span><span> </span><span>larger</span><span> </span><span>datasets</span><span>,</span><span> </span><span>extending</span><span> </span><span>the</span><span> </span><span>UI</span><span> </span><span>with</span><span> </span><span>additional</span><span> </span><span>visualization</span><span> </span><span>modes</span><span>,</span><span> </span><span>or</span><span> </span><span>enhancing</span><span> </span><span>the</span><span> </span><span>clustering</span><span> </span><span>algorithms</span><span>.</span><span> 
</span><span>The</span><span> </span><span>codebase</span><span> </span><span>is</span><span> </span><span>well</span><span>-</span><span>structured</span><span> </span><span>with</span><span> </span><span>logical</span><span> </span><span>file</span><span> </span><span>organization</span>\n<span> </span><span>under</span><span> </span><span>src</span><span>/</span><span> </span><span>for</span><span> </span><span>core</span><span> </span><span>components</span><span> </span><span>and</span><span> </span><span>data</span><span>/</span><span> </span><span>for</span><span> </span><span>assets</span><span>,</span><span> </span><span>making</span><span> </span><span>it</span><span> </span><span>approachable</span><span> </span><span>for</span><span> </span><span>developers</span><span> </span><span>familiar</span><span> </span><span>with</span><span> </span><span>modern</span><span> </span><span>C</span><span>++</span><span> </span><span>and</span><span> </span><span>OpenGL</span><span> </span><span>programming</span><span>.</span>\n</code></pre></div>\n\n<p>Now with <code>check_with_gemini</code>:</p>\n<div><pre><span></span><code><span>></span><span> </span><span>Analyse</span><span> </span><span>this</span><span> </span><span>codebase</span><span> </span><span>in</span><span> </span><span>depth</span><span>.</span><span> </span><span>Check</span><span> </span><span>with</span><span> </span><span>Gemini</span><span> </span><span>that</span><span> </span><span>your</span><span> </span><span>understanding</span><span> </span><span>is</span><span> </span><span>correct</span><span>.</span><span> </span><span>Produce</span><span> </span><span>a</span><span> </span><span>short</span><span> </span><span>two</span><span> </span><span>paragraph</span><span> </span><span>summary</span><span> </span><span>for</span><span> </span><span>a</span><span> </span><span>potential</span><span> </span><span>new</span><span> 
</span><span>contributor</span><span>.</span>\n\n<span>\u25cf</span><span> </span><span>I</span><span>'ll analyze this codebase in depth and verify my understanding with Gemini before providing a summary for potential contributors.</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2610</span><span> </span><span>Explore</span><span> </span><span>project</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>identify</span><span> </span><span>main</span><span> </span><span>components</span>\n<span> </span><span>\u2610</span><span> </span><span>Analyze</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u2610</span><span> </span><span>Examine</span><span> </span><span>main</span><span>.</span><span>cpp</span><span> </span><span>and</span><span> </span><span>core</span><span> </span><span>functionality</span>\n<span> </span><span>\u2610</span><span> </span><span>Check</span><span> </span><span>with</span><span> </span><span>Gemini</span><span> </span><span>for</span><span> </span><span>understanding</span><span> </span><span>verification</span>\n<span> </span><span>\u2610</span><span> </span><span>Write</span><span> </span><span>contributor</span><span> </span><span>summary</span>\n<span> </span><span>\u2610</span><span> </span><span>Review</span><span> </span><span>recent</span><span> </span><span>commits</span><span> </span><span>for</span><span> </span><span>project</span><span> </span><span>direction</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2610</span><span> </span><span>Explore</span><span> </span><span>project</span><span> </span><span>structure</span><span> 
</span><span>and</span><span> </span><span>identify</span><span> </span><span>main</span><span> </span><span>components</span>\n<span> </span><span>\u2610</span><span> </span><span>Analyze</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u2610</span><span> </span><span>Examine</span><span> </span><span>main</span><span>.</span><span>cpp</span><span> </span><span>and</span><span> </span><span>core</span><span> </span><span>functionality</span>\n<span> </span><span>\u2610</span><span> </span><span>Check</span><span> </span><span>with</span><span> </span><span>Gemini</span><span> </span><span>for</span><span> </span><span>understanding</span><span> </span><span>verification</span>\n<span> </span><span>\u2610</span><span> </span><span>Write</span><span> </span><span>contributor</span><span> </span><span>summary</span>\n<span> </span><span>\u2610</span><span> </span><span>Review</span><span> </span><span>recent</span><span> </span><span>commits</span><span> </span><span>for</span><span> </span><span>project</span><span> </span><span>direction</span>\n\n<span>\u25cf</span><span> </span><span>List</span><span>(</span><span>.</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Listed</span><span> </span><span>1119</span><span> </span><span>paths</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Search</span><span>(</span><span>pattern</span><span>:</span><span> </span><span>"**/*"</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Found</span><span> </span><span>100</span><span> </span><span>files</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> 
</span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>README</span><span>.</span><span>md</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>65</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2612</span><span> </span><span>Explore</span><span> </span><span>project</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>identify</span><span> </span><span>main</span><span> </span><span>components</span>\n<span> </span><span>\u2610</span><span> </span><span>Analyze</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u2610</span><span> </span><span>Examine</span><span> </span><span>main</span><span>.</span><span>cpp</span><span> </span><span>and</span><span> </span><span>core</span><span> </span><span>functionality</span>\n<span> </span><span>\u2610</span><span> </span><span>Check</span><span> </span><span>with</span><span> </span><span>Gemini</span><span> </span><span>for</span><span> </span><span>understanding</span><span> </span><span>verification</span>\n<span> </span><span>\u2610</span><span> </span><span>Write</span><span> </span><span>contributor</span><span> </span><span>summary</span>\n<span> </span><span>\u2610</span><span> </span><span>Review</span><span> </span><span>recent</span><span> </span><span>commits</span><span> </span><span>for</span><span> </span><span>project</span><span> </span><span>direction</span>\n\n<span>\u25cf</span><span> 
</span><span>Read</span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>92</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>main</span><span>.</span><span>cpp</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>557</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>csv_loader</span><span>.</span><span>cpp</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>174</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2612</span><span> </span><span>Explore</span><span> </span><span>project</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>identify</span><span> </span><span>main</span><span> </span><span>components</span>\n<span> </span><span>\u2612</span><span> </span><span>Analyze</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u2612</span><span> </span><span>Examine</span><span> </span><span>main</span><span>.</span><span>cpp</span><span> 
</span><span>and</span><span> </span><span>core</span><span> </span><span>functionality</span>\n<span> </span><span>\u2610</span><span> </span><span>Review</span><span> </span><span>recent</span><span> </span><span>commits</span><span> </span><span>for</span><span> </span><span>project</span><span> </span><span>direction</span>\n<span> </span><span>\u2610</span><span> </span><span>Check</span><span> </span><span>with</span><span> </span><span>Gemini</span><span> </span><span>for</span><span> </span><span>understanding</span><span> </span><span>verification</span>\n<span> </span><span>\u2610</span><span> </span><span>Write</span><span> </span><span>contributor</span><span> </span><span>summary</span>\n\n<span>\u25cf</span><span> </span><span>Bash</span><span>(</span><span>git</span><span> </span><span>log</span><span> </span><span>--</span><span>oneline</span><span> </span><span>-</span><span>10</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>903</span><span>a1c0</span><span> </span><span>Start</span><span> </span><span>animation</span><span> </span><span>paused</span>\n<span> </span><span>ab728d8</span><span> </span><span>Reduce</span><span> </span><span>vignette</span>\n<span> </span><span>59</span><span>cb10e</span><span> </span><span>Update</span><span> </span><span>README</span><span>.</span><span>md</span>\n<span> </span><span>\u2026</span><span> </span><span>+</span><span>7</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>src</span><span>/</span><span>paper_loader</span><span>.</span><span>h</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>122</span><span> </span><span>lines</span><span> 
</span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>src</span><span>/</span><span>clusters</span><span>.</span><span>h</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>112</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2612</span><span> </span><span>Explore</span><span> </span><span>project</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>identify</span><span> </span><span>main</span><span> </span><span>components</span>\n<span> </span><span>\u2612</span><span> </span><span>Analyze</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u2612</span><span> </span><span>Examine</span><span> </span><span>main</span><span>.</span><span>cpp</span><span> </span><span>and</span><span> </span><span>core</span><span> </span><span>functionality</span>\n<span> </span><span>\u2612</span><span> </span><span>Review</span><span> </span><span>recent</span><span> </span><span>commits</span><span> </span><span>for</span><span> </span><span>project</span><span> </span><span>direction</span>\n<span> </span><span>\u2610</span><span> </span><span>Check</span><span> </span><span>with</span><span> </span><span>Gemini</span><span> </span><span>for</span><span> </span><span>understanding</span><span> </span><span>verification</span>\n<span> </span><span>\u2610</span><span> </span><span>Write</span><span> 
</span><span>contributor</span><span> </span><span>summary</span>\n\n<span>\u25cf</span><span> </span><span>check_with_gemini</span><span>:</span><span>check_with_gemini</span><span> </span><span>(</span><span>MCP</span><span>)(</span><span>prompt</span><span>:</span><span> </span><span>"Review this analysis of a C++ OpenGL visualization project for scientific papers. Is my understanding correct?</span><span>\\n\\n</span><span>This is an AI Pipeline Visualization project</span>\n<span> </span><span>that</span><span>:</span>\\<span>n1</span><span>.</span><span> </span><span>Loads</span><span> </span><span>scientific</span><span> </span><span>paper</span><span> </span><span>data</span><span> </span><span>from</span><span> </span><span>CSV</span><span> </span><span>files</span><span> </span><span>containing</span><span> </span><span>paper</span><span> </span><span>titles</span><span>,</span><span> </span><span>inclusion</span><span> </span><span>status</span><span>,</span><span> </span><span>2</span><span>D</span><span>/</span><span>3</span><span>D</span><span> </span><span>positions</span><span>,</span><span> </span><span>and</span><span> </span><span>hierarchical</span><span> </span><span>cluster</span><span> </span><span>assignments</span><span> </span><span>(</span><span>clusters</span><span> </span><span>at</span><span> </span><span>depths</span>\n<span> </span><span>2</span><span>-</span><span>6</span><span>)</span>\\<span>n2</span><span>.</span><span> </span><span>Visualizes</span><span> </span><span>papers</span><span> </span><span>as</span><span> </span><span>instanced</span><span> </span><span>3</span><span>D</span><span> </span><span>cubes</span><span> </span><span>using</span><span> </span><span>OpenGL</span><span> </span><span>with</span><span> </span><span>diffuse</span><span>/</span><span>ambient</span><span> </span><span>lighting</span>\\<span>n3</span><span>.</span><span> </span><span>Groups</span><span> </span><span>papers</span><span> 
</span><span>into</span><span> </span><span>hierarchical</span><span> </span><span>clusters</span><span> </span><span>represented</span><span> </span><span>by</span><span> </span><span>transparent</span>\n<span> </span><span>convex</span><span> </span><span>hull</span><span> </span><span>models</span>\\<span>n4</span><span>.</span><span> </span><span>Provides</span><span> </span><span>interactive</span><span> </span><span>exploration</span><span> </span><span>with</span><span> </span><span>animation</span><span> </span><span>that</span><span> </span><span>progresses</span><span> </span><span>through</span><span> </span><span>papers</span><span> </span><span>chronologically</span>\\<span>n5</span><span>.</span><span> </span><span>Features</span><span> </span><span>include</span><span> </span><span>bar</span><span> </span><span>charts</span><span> </span><span>showing</span><span> </span><span>cluster</span>\n<span> </span><span>distributions</span><span>,</span><span> </span><span>viewing</span><span> </span><span>mode</span><span> </span><span>toggles</span><span>,</span><span> </span><span>and</span><span> </span><span>animation</span><span> </span><span>speed</span><span> </span><span>controls</span>\\<span>n6</span><span>.</span><span> </span><span>Uses</span><span> </span><span>modern</span><span> </span><span>C</span><span>++</span><span> </span><span>(</span><span>C</span><span>++</span><span>20</span><span>),</span><span> </span><span>CMake</span><span> </span><span>build</span><span> </span><span>system</span><span>,</span><span> </span><span>and</span><span> </span><span>libraries</span><span>:</span><span> </span><span>GLFW3</span><span>,</span><span> </span><span>GLAD</span><span>,</span><span> </span><span>GLM</span><span>,</span><span> </span><span>Assimp</span><span>,</span><span> </span><span>FreeType2</span><span>,</span>\n<span> </span><span>and</span><span> </span><span>convhull_3d</span>\\<span>n7</span><span>.</span><span> 
</span><span>Recent</span><span> </span><span>development</span><span> </span><span>focused</span><span> </span><span>on</span><span> </span><span>UI</span><span> </span><span>improvements</span><span> </span><span>(</span><span>vignette</span><span> </span><span>effects</span><span>,</span><span> </span><span>post</span><span>-</span><span>processing</span><span>)</span><span> </span><span>and</span><span> </span><span>documentation</span><span> </span><span>updates</span>\\<span>n</span>\\<span>nThe</span><span> </span><span>project</span><span> </span><span>appears</span><span> </span><span>designed</span><span> </span><span>for</span>\n<span> </span><span>exploring</span><span> </span><span>relationships</span><span> </span><span>between</span><span> </span><span>scientific</span><span> </span><span>papers</span><span> </span><span>in</span><span> </span><span>a</span><span> </span><span>3</span><span>D</span><span> </span><span>space</span><span> </span><span>with</span><span> </span><span>hierarchical</span><span> </span><span>clustering</span><span> </span><span>visualization</span><span>.</span><span> </span><span>Is</span><span> </span><span>this</span><span> </span><span>assessment</span><span> </span><span>accurate</span><span>?</span><span>", content: "</span><span>Key</span><span> </span><span>code</span>\n<span> </span><span>structure</span><span>:</span>\\<span>n</span><span>-</span><span> </span><span>main</span><span>.</span><span>cpp</span><span>:</span><span> </span><span>Core</span><span> </span><span>rendering</span><span> </span><span>loop</span><span>,</span><span> </span><span>handles</span><span> </span><span>user</span><span> </span><span>input</span><span>,</span><span> </span><span>manages</span><span> </span><span>animation</span><span> </span><span>state</span>\\<span>n</span><span>-</span><span> </span><span>paper_loader</span><span>.</span><span>h</span><span>/</span><span>cpp</span><span>:</span><span> 
</span><span>Loads</span><span> </span><span>paper</span><span> </span><span>data</span><span> </span><span>from</span><span> </span><span>CSV</span><span>,</span><span> </span><span>manages</span><span> </span><span>clustering</span>\n<span> </span><span>hierarchy</span>\\<span>n</span><span>-</span><span> </span><span>clusters</span><span>.</span><span>h</span><span>/</span><span>cpp</span><span>:</span><span> </span><span>Renders</span><span> </span><span>cluster</span><span> </span><span>convex</span><span> </span><span>hulls</span><span> </span><span>using</span><span> </span><span>Assimp</span><span>-</span><span>loaded</span><span> </span><span>models</span>\\<span>n</span><span>-</span><span> </span><span>opengl</span><span>/</span><span>:</span><span> </span><span>Contains</span><span> </span><span>shader</span><span> </span><span>management</span><span>,</span><span> </span><span>mesh</span><span>/</span><span>model</span><span> </span><span>loading</span><span>,</span><span> </span><span>camera</span><span>,</span><span> </span><span>fonts</span><span>,</span><span> </span><span>etc</span><span>.</span>\\<span>n</span><span>-</span>\n<span> </span><span>CMakeLists</span><span>.</span><span>txt</span><span>:</span><span> </span><span>Linux</span><span>-</span><span>focused</span><span> </span><span>build</span><span> </span><span>(</span><span>Windows</span><span> </span><span>build</span><span> </span><span>marked</span><span> </span><span>as</span><span> </span><span>faulty</span><span>)</span>\\<span>n</span><span>-</span><span> </span><span>Data</span><span> </span><span>files</span><span>:</span><span> </span><span>CSV</span><span> </span><span>with</span><span> </span><span>paper</span><span> </span><span>metadata</span><span>,</span><span> </span><span>pre</span><span>-</span><span>generated</span><span> </span><span>cluster</span><span> </span><span>models</span><span> </span><span>(</span><span>.</span><span>obj</span><span>),</span><span> 
</span><span>shaders</span><span>,</span><span> </span><span>fonts</span><span>,</span>\n<span> </span><span>textures</span><span>")</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Yes</span><span>,</span><span> </span><span>your</span><span> </span><span>assessment</span><span> </span><span>is</span><span> </span><span>highly</span><span> </span><span>accurate</span><span>.</span><span> </span><span>Based</span><span> </span><span>on</span><span> </span><span>the</span><span> </span><span>file</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>your</span><span> </span><span>description</span><span>,</span><span> </span><span>this</span><span> </span><span>project</span><span> </span><span>is</span><span> </span><span>indeed</span><span> </span><span>a</span><span> </span><span>3</span><span>D</span><span> </span><span>visualization</span><span> </span><span>tool</span><span> </span><span>for</span><span> </span><span>exploring</span><span> </span><span>hierarchically</span><span> </span><span>clustered</span><span> </span><span>scientific</span><span> </span><span>paper</span><span> </span><span>data</span><span>.</span>\n\n<span> </span><span>Your</span><span> </span><span>breakdown</span><span> </span><span>of</span><span> </span><span>the</span><span> </span><span>components</span><span> </span><span>and</span><span> </span><span>their</span><span> </span><span>functions</span><span> </span><span>is</span><span> </span><span>spot</span><span> </span><span>on</span><span>:</span>\n<span> </span><span>\u2026</span><span> </span><span>+</span><span>9</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2612</span><span> 
</span><span>Explore</span><span> </span><span>project</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>identify</span><span> </span><span>main</span><span> </span><span>components</span>\n<span> </span><span>\u2612</span><span> </span><span>Analyze</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u2612</span><span> </span><span>Examine</span><span> </span><span>main</span><span>.</span><span>cpp</span><span> </span><span>and</span><span> </span><span>core</span><span> </span><span>functionality</span>\n<span> </span><span>\u2612</span><span> </span><span>Check</span><span> </span><span>with</span><span> </span><span>Gemini</span><span> </span><span>for</span><span> </span><span>understanding</span><span> </span><span>verification</span>\n<span> </span><span>\u2612</span><span> </span><span>Review</span><span> </span><span>recent</span><span> </span><span>commits</span><span> </span><span>for</span><span> </span><span>project</span><span> </span><span>direction</span>\n<span> </span><span>\u2610</span><span> </span><span>Write</span><span> </span><span>contributor</span><span> </span><span>summary</span>\n\n<span>\u25cf</span><span> </span><span>This</span><span> </span><span>is</span><span> </span><span>an</span><span> </span><span>OpenGL</span><span>-</span><span>based</span><span> </span><span>3</span><span>D</span><span> </span><span>visualization</span><span> </span><span>tool</span><span> </span><span>for</span><span> </span><span>exploring</span><span> </span><span>hierarchically</span><span> </span><span>clustered</span><span> </span><span>scientific</span><span> </span><span>paper</span><span> </span><span>data</span><span> </span><span>from</span><span> </span><span>AI</span><span> </span><span>research</span><span> </span><span>pipelines</span><span>.</span><span> 
</span><span>The</span><span> </span><span>project</span><span> </span><span>renders</span><span> </span><span>thousands</span><span> </span><span>of</span><span> </span><span>papers</span><span> </span><span>as</span><span> </span><span>colored</span><span> </span><span>cubes</span><span> </span><span>in</span><span> </span><span>3</span><span>D</span><span> </span><span>space</span><span>,</span><span> </span><span>with</span>\n<span> </span><span>their</span><span> </span><span>positions</span><span> </span><span>and</span><span> </span><span>cluster</span><span> </span><span>assignments</span><span> </span><span>derived</span><span> </span><span>from</span><span> </span><span>dimensionality</span><span> </span><span>reduction</span><span> </span><span>techniques</span><span>.</span><span> </span><span>Papers</span><span> </span><span>are</span><span> </span><span>grouped</span><span> </span><span>into</span><span> </span><span>transparent</span><span> </span><span>convex</span><span> </span><span>hull</span><span> </span><span>clusters</span><span> </span><span>at</span><span> </span><span>multiple</span><span> </span><span>hierarchical</span><span> </span><span>levels</span><span> </span><span>(</span><span>2</span><span>-</span><span>6</span><span>),</span><span> </span><span>allowing</span><span> </span><span>researchers</span><span> </span><span>to</span>\n<span> </span><span>explore</span><span> </span><span>relationships</span><span> </span><span>between</span><span> </span><span>papers</span><span> </span><span>through</span><span> </span><span>interactive</span><span> </span><span>navigation</span><span> </span><span>with</span><span> </span><span>real</span><span>-</span><span>time</span><span> </span><span>animation</span><span> </span><span>that</span><span> </span><span>progresses</span><span> </span><span>chronologically</span><span> </span><span>through</span><span> </span><span>the</span><span> </span><span>dataset</span><span>.</span>\n\n<span> 
</span><span>The</span><span> </span><span>codebase</span><span> </span><span>uses</span><span> </span><span>modern</span><span> </span><span>C</span><span>++</span><span>20</span><span> </span><span>with</span><span> </span><span>a</span><span> </span><span>CMake</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>targeting</span><span> </span><span>Linux</span><span> </span><span>(</span><span>Windows</span><span> </span><span>support</span><span> </span><span>is</span><span> </span><span>incomplete</span><span>)</span><span>.</span><span> </span><span>Core</span><span> </span><span>dependencies</span><span> </span><span>include</span><span> </span><span>GLFW3</span><span>/</span><span>GLAD</span><span> </span><span>for</span><span> </span><span>OpenGL</span><span> </span><span>context</span><span>,</span><span> </span><span>GLM</span><span> </span><span>for</span><span> </span><span>3</span><span>D</span><span> </span><span>math</span><span>,</span><span> </span><span>Assimp</span><span> </span><span>for</span><span> </span><span>loading</span><span> </span><span>cluster</span>\n<span> </span><span>models</span><span>,</span><span> </span><span>and</span><span> </span><span>FreeType2</span><span> </span><span>for</span><span> </span><span>text</span><span> </span><span>rendering</span><span>.</span><span> </span><span>The</span><span> </span><span>architecture</span><span> </span><span>cleanly</span><span> </span><span>separates</span><span> </span><span>concerns</span><span>:</span><span> </span><span>data</span><span> </span><span>loading</span><span> </span><span>from</span><span> </span><span>CSV</span><span> </span><span>files</span><span>,</span><span> </span><span>instanced</span><span> </span><span>rendering</span><span> </span><span>for</span><span> </span><span>performance</span><span>,</span><span> </span><span>cluster</span><span> </span><span>hull</span><span> 
</span><span>generation</span><span>/</span><span>rendering</span><span>,</span><span> </span><span>and</span><span> </span><span>UI</span><span> </span><span>overlays</span>\n<span> </span><span>showing</span><span> </span><span>statistics</span><span> </span><span>like</span><span> </span><span>cluster</span><span> </span><span>distributions</span><span> </span><span>and</span><span> </span><span>exploration</span><span> </span><span>progress</span><span>.</span><span> </span><span>Recent</span><span> </span><span>development</span><span> </span><span>has</span><span> </span><span>focused</span><span> </span><span>on</span><span> </span><span>post</span><span>-</span><span>processing</span><span> </span><span>effects</span><span> </span><span>and</span><span> </span><span>UI</span><span> </span><span>improvements</span><span> </span><span>to</span><span> </span><span>enhance</span><span> </span><span>the</span><span> </span><span>visual</span><span> </span><span>exploration</span><span> </span><span>experience</span><span>.</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2612</span><span> </span><span>Explore</span><span> </span><span>project</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>identify</span><span> </span><span>main</span><span> </span><span>components</span>\n<span> </span><span>\u2612</span><span> </span><span>Analyze</span><span> </span><span>build</span><span> </span><span>system</span><span> </span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u2612</span><span> </span><span>Examine</span><span> </span><span>main</span><span>.</span><span>cpp</span><span> </span><span>and</span><span> </span><span>core</span><span> </span><span>functionality</span>\n<span> </span><span>\u2612</span><span> </span><span>Check</span><span> 
</span><span>with</span><span> </span><span>Gemini</span><span> </span><span>for</span><span> </span><span>understanding</span><span> </span><span>verification</span>\n<span> </span><span>\u2612</span><span> </span><span>Write</span><span> </span><span>contributor</span><span> </span><span>summary</span>\n<span> </span><span>\u2612</span><span> </span><span>Review</span><span> </span><span>recent</span><span> </span><span>commits</span><span> </span><span>for</span><span> </span><span>project</span><span> </span><span>direction</span>\n</code></pre></div>\n\n<p>We can also have Gemini do the initial work and have Claude Code check it:</p>\n<div><pre><span></span><code><span>></span><span> </span><span>Have</span><span> </span><span>Gemini</span><span> </span><span>analyse</span><span> </span><span>this</span><span> </span><span>codebase</span><span> </span><span>in</span><span> </span><span>depth</span><span>.</span><span> </span><span>Critically</span><span> </span><span>evaluate</span><span> </span><span>the</span><span> </span><span>results</span><span> </span><span>Gemini</span><span> </span><span>gives</span><span> </span><span>you</span><span> </span><span>and</span><span> </span><span>make</span><span> </span><span>sure</span><span> </span><span>they</span><span> </span><span>are</span><span> </span><span>correct</span><span>.</span><span> </span><span>Produce</span><span> </span><span>a</span><span> </span><span>short</span><span> </span><span>two</span><span> </span><span>paragraph</span><span> </span><span>summary</span><span> </span><span>for</span><span> </span><span>a</span><span> </span><span>potential</span><span> </span><span>new</span><span> </span><span>contributor</span><span>.</span>\n\n<span>\u25cf</span><span> </span><span>I</span><span>'ll analyze this codebase using Gemini and provide a comprehensive summary for potential contributors.</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> 
</span><span>\u23bf</span><span> \u00a0</span><span>\u2610</span><span> </span><span>Explore</span><span> </span><span>codebase</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>files</span><span> </span>\n<span> </span><span>\u2610</span><span> </span><span>Have</span><span> </span><span>Gemini</span><span> </span><span>analyze</span><span> </span><span>the</span><span> </span><span>codebase</span><span> </span><span>in</span><span> </span><span>depth</span>\n<span> </span><span>\u2610</span><span> </span><span>Critically</span><span> </span><span>evaluate</span><span> </span><span>Gemini</span><span>'s analysis</span>\n<span> </span><span>\u2610</span><span> </span><span>Create</span><span> </span><span>two</span><span> </span><span>paragraph</span><span> </span><span>summary</span><span> </span><span>for</span><span> </span><span>new</span><span> </span><span>contributors</span>\n\n<span>\u25cf</span><span> 
</span><span>List</span><span>(</span><span>.</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Listed</span><span> </span><span>1119</span><span> </span><span>paths</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>CMakeLists</span><span>.</span><span>txt</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>92</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>main</span><span>.</span><span>cpp</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>557</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Search</span><span>(</span><span>pattern</span><span>:</span><span> </span><span>"**/*.{cpp,h,hpp,c,txt,md,json}"</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Found</span><span> </span><span>100</span><span> </span><span>files</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Update</span><span> </span><span>Todos</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>\u2612</span><span> </span><span>Explore</span><span> </span><span>codebase</span><span> </span><span>structure</span><span> </span><span>and</span><span> </span><span>files</span><span> </span>\n<span> 
</span><span>\u2610</span><span> </span><span>Have</span><span> </span><span>Gemini</span><span> </span><span>analyze</span><span> </span><span>the</span><span> </span><span>codebase</span><span> </span><span>in</span><span> </span><span>depth</span>\n<span> </span><span>\u2610</span><span> </span><span>Critically</span><span> </span><span>evaluate</span><span> </span><span>Gemini</span><span>'s analysis</span>\n<span> </span><span>\u2610</span><span> </span><span>Create</span><span> </span><span>two</span><span> </span><span>paragraph</span><span> </span><span>summary</span><span> </span><span>for</span><span> </span><span>new</span><span> </span><span>contributors</span>\n\n<span>\u25cf</span><span> </span><span>Read</span><span>(</span><span>README</span><span>.</span><span>md</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Read</span><span> </span><span>65</span><span> </span><span>lines</span><span> </span><span>(</span><span>ctrl</span><span>+</span><span>r</span><span> </span><span>to</span><span> </span><span>expand</span><span>)</span>\n\n<span>\u25cf</span><span> </span><span>Task</span><span>(</span><span>Analyze</span><span> </span><span>source</span><span> </span><span>files</span><span> </span><span>structure</span><span>)</span>\n<span> </span><span>\u23bf</span><span> \u00a0</span><span>Done</span><span> </span><span>(</span><span>15</span><span> </span><span>tool</span><span> </span><span>uses</span><span> </span><span>\u00b7</span><span> </span><span>50.0</span><span>k</span><span> </span><span>tokens</span><span> </span><span>\u00b7</span><span> </span><span>1</span><span>m</span><span> </span><span>6.4</span><span>s</span><span>)</span><span> </span>\n\n<span>\u25cf</span><span> </span><span>Now</span><span> </span><span>I</span><span>'ll have Gemini analyze the codebase comprehensively:</span>\n\n<span>\u25cf</span><span> </span><span>check_with_gemini</span><span>:</span><span>check_with_gemini</span><span> 
</span><span>(</span><span>MCP</span><span>)(</span><span>prompt</span><span>:</span><span> </span><span>"Please provide a comprehensive analysis of this codebase. This is a 3D visualization application for academic papers that uses OpenGL. I want you to analyze</span>\n<span> </span><span>the</span><span> </span><span>code</span><span> </span><span>quality</span><span>,</span><span> </span><span>architecture</span><span>,</span><span> </span><span>potential</span><span> </span><span>issues</span><span>,</span><span> </span><span>and</span><span> </span><span>provide</span><span> </span><span>insights</span><span> </span><span>about</span><span> </span><span>its</span><span> </span><span>design</span><span> </span><span>patterns</span><span> </span><span>and</span><span> </span><span>implementation</span><span> </span><span>approaches</span><span>.</span><span> </span><span>Consider</span><span>:</span>\\<span>n</span>\\<span>n1</span><span>.</span><span> </span><span>Code</span><span> </span><span>organization</span><span> </span><span>and</span>\n<span> </span><span>structure</span>\\<span>n2</span><span>.</span><span> </span><span>OpenGL</span><span> </span><span>and</span><span> </span><span>graphics</span><span> </span><span>programming</span><span> </span><span>best</span><span> </span><span>practices</span>\\<span>n3</span><span>.</span><span> </span><span>Memory</span><span> </span><span>management</span><span> </span><span>and</span><span> </span><span>resource</span><span> </span><span>handling</span>\\<span>n4</span><span>.</span><span> </span><span>Performance</span><span> </span><span>considerations</span>\\<span>n5</span><span>.</span><span> </span><span>Error</span><span> </span><span>handling</span><span> </span><span>and</span>\n<span> </span><span>robustness</span>\\<span>n6</span><span>.</span><span> </span><span>Code</span><span> </span><span>maintainability</span><span> </span><span>and</span><span> 
</span><span>extensibility</span>\\<span>n7</span><span>.</span><span> </span><span>Any</span><span> </span><span>potential</span><span> </span><span>security</span><span> </span><span>or</span><span> </span><span>stability</span><span> </span><span>issues</span>\\<span>n8</span><span>.</span><span> </span><span>Overall</span><span> </span><span>software</span><span> </span><span>engineering</span><span> </span><span>quality</span>\\<span>n</span>\\<span>nBe</span><span> </span><span>thorough</span><span> </span><span>and</span><span> </span><span>critical</span>\n<span> </span><span>in</span><span> </span><span>your</span><span> </span><span>analysis</span><span>.</span><span> </span><span>Point</span><span> </span><span>out</span><span> </span><span>both</span><span> </span><span>strengths</span><span> </span><span>and</span><span> </span><span>weaknesses</span><span>.</span><span>", content: "</span><span>This</span><span> </span><span>codebase</span><span> </span><span>is</span><span> </span><span>a</span><span> </span><span>3</span><span>D</span><span> </span><span>visualization</span><span> </span><span>application</span><span> </span><span>called</span><span> </span>\\<span>"AI Pipeline Visualisation</span><span>\\"</span><span> that renders</span>\n<span> </span><span>academic</span><span> </span><span>papers</span><span> </span><span>as</span><span> </span><span>cubes</span><span> </span><span>in</span><span> </span><span>3</span><span>D</span><span> </span><span>space</span><span>,</span><span> </span><span>organized</span><span> </span><span>into</span><span> </span><span>clusters</span><span>.</span><span> </span><span>Here</span><span>'s the key information:</span><span>\\n\\n</span><span>## Core Architecture:</span><span>\\n</span><span>- Built with C++20, OpenGL 4.1, CMake build</span>\n<span> </span><span>system</span>\\<span>n</span><span>-</span><span> </span><span>Uses</span><span> </span><span>GLFW</span><span> </span><span>for</span><span> 
</span><span>windowing</span><span>,</span><span> </span><span>GLAD</span><span> </span><span>for</span><span> </span><span>OpenGL</span><span> </span><span>loading</span><span>,</span><span> </span><span>GLM</span><span> </span><span>for</span><span> </span><span>math</span><span>,</span><span> </span><span>Assimp</span><span> </span><span>for</span><span> </span><span>3</span><span>D</span><span> </span><span>models</span><span>,</span><span> </span><span>FreeType</span><span> </span><span>for</span><span> </span><span>fonts</span>\\<span>n</span><span>-</span><span> </span><span>Main</span><span> </span><span>components</span><span>:</span><span> </span><span>App</span><span> </span><span>(</span><span>OpenGL</span><span> </span><span>wrapper</span><span>),</span><span> </span><span>PaperLoader</span>\n<span> </span><span>(</span><span>data</span><span> </span><span>management</span><span>),</span><span> </span><span>Clusters</span><span> </span><span>(</span><span>3</span><span>D</span><span> </span><span>cluster</span><span> </span><span>visualization</span><span>),</span><span> </span><span>Shader</span><span> </span><span>management</span>\\<span>n</span>\\<span>n</span><span>## Main Files Structure:\\n- main.cpp: Main application loop with animation, rendering, and</span>\n<span> </span><span>UI</span>\\<span>n</span><span>-</span><span> </span><span>src</span><span>/</span><span>opengl</span><span>/</span><span>app</span><span>.</span><span>cpp</span><span>:</span><span> </span><span>Core</span><span> </span><span>OpenGL</span><span> </span><span>application</span><span> </span><span>framework</span>\\<span>n</span><span>-</span><span> </span><span>src</span><span>/</span><span>paper_loader</span><span>.</span><span>cpp</span><span>:</span><span> </span><span>Loads</span><span> </span><span>papers</span><span> </span><span>from</span><span> </span><span>CSV</span><span>,</span><span> </span><span>generates</span><span> 
</span><span>clusters</span>\\<span>n</span><span>-</span><span> </span><span>src</span><span>/</span><span>clusters</span><span>.</span><span>cpp</span><span>:</span><span> </span><span>Generates</span><span> </span><span>and</span><span> </span><span>renders</span>\n<span> </span><span>3</span><span>D</span><span> </span><span>convex</span><span> </span><span>hulls</span><span> </span><span>for</span><span> </span><span>clusters</span>\\<span>n</span><span>-</span><span> </span><span>src</span><span>/</span><span>opengl</span><span>/</span><span>shader</span><span>.</span><span>cpp</span><span>:</span><span> </span><span>Shader</span><span> </span><span>program</span><span> </span><span>management</span>\\<span>n</span>\\<span>n</span><span>## Key Code Snippets:\\n\\n### main.cpp (main loop):\\n```cpp\\n// Main rendering loop</span>\n<span> </span><span>with</span><span> </span><span>instanced</span><span> </span><span>paper</span><span> </span><span>rendering</span>\\<span>nwhile</span><span> </span><span>(</span><span>!</span><span>app</span><span>.</span><span>shouldClose</span><span>())</span>\\<span>n</span><span>{</span>\\<span>n</span><span> </span><span>app</span><span>.</span><span>handleInput</span><span>();</span>\\<span>n</span><span> </span><span>app</span><span>.</span><span>enablePostProcessing</span><span>();</span>\\<span>n</span><span> </span><span>app</span><span>.</span><span>clear</span><span>();</span>\\<span>n</span><span> </span>\\<span>n</span><span> </span><span>//</span><span> </span><span>Render</span><span> </span><span>papers</span><span> </span><span>as</span>\n<span> </span><span>instanced</span><span> </span><span>cubes</span>\\<span>n</span><span> </span><span>pointShader</span><span>.</span><span>use</span><span>();</span>\\<span>n</span><span> </span><span>pointShader</span><span>.</span><span>setMat4</span><span>(</span>\\<span>"projection</span><span>\\"</span><span>, app.getPerspectiveMatrix());</span><span>\\n</span><span> 
pointShader.setMat4(</span><span>\\"</span><span>view</span><span>\\"</span><span>, app.getViewMatrix());</span><span>\\n</span>\n<span> </span><span>glDrawArraysInstanced</span><span>(</span><span>GL_TRIANGLES</span><span>,</span><span> </span><span>0</span><span>,</span><span> </span><span>36</span><span>,</span><span> </span><span>static_cast</span><span><</span><span>int</span><span>></span><span>(</span><span>paperData</span><span>.</span><span>size</span><span>()</span><span> </span><span>/</span><span> </span><span>5</span><span>));</span>\\<span>n</span><span> </span>\\<span>n</span><span> </span><span>//</span><span> </span><span>Render</span><span> </span><span>transparent</span><span> </span><span>clusters</span><span> </span><span>with</span><span> </span><span>depth</span><span> </span><span>sorting</span>\\<span>n</span><span> </span><span>std</span><span>::</span><span>map</span><span><</span><span>float</span><span>,</span>\n<span> </span><span>std</span><span>::</span><span>pair</span><span><</span><span>int</span><span>,</span><span> </span><span>glm</span><span>::</span><span>vec3</span><span>>></span><span> </span><span>sortedClusters</span><span>{};</span>\\<span>n</span><span> </span><span>for</span><span> </span><span>(</span><span>int</span><span> </span><span>c</span><span> </span><span>=</span><span> </span><span>0</span><span>;</span><span> </span><span>c</span><span> </span><span><</span><span> </span><span>std</span><span>::</span><span>pow</span><span>(</span><span>2</span><span>,</span><span> </span><span>CLUSTER_DEPTH</span><span>);</span><span> </span><span>++</span><span>c</span><span>)</span><span> </span><span>{</span>\\<span>n</span><span> </span><span>//</span><span> </span><span>Sort</span><span> </span><span>clusters</span><span> </span><span>by</span><span> </span><span>distance</span><span> </span><span>for</span><span> </span><span>proper</span><span> </span><span>transparency</span>\\<span>n</span>\n<span> 
</span><span>distance</span><span> </span><span>=</span><span> </span><span>glm</span><span>::</span><span>length</span><span>(</span><span>app</span><span>.</span><span>getCameraPosition</span><span>()</span><span> </span><span>-</span><span> </span><span>clusterData</span><span>-></span><span>position</span><span>);</span>\\<span>n</span><span> </span><span>sortedClusters</span><span>[</span><span>distance</span><span>]</span><span> </span><span>=</span><span> </span><span>std</span><span>::</span><span>make_pair</span><span>(</span><span>c</span><span>,</span><span> </span><span>color</span><span>);</span>\\<span>n</span><span> </span><span>}</span>\\<span>n</span><span> </span>\\<span>n</span><span> </span><span>//</span>\n<span> </span><span>Animation</span><span> </span><span>and</span><span> </span><span>UI</span><span> </span><span>updates</span>\\<span>n</span><span> </span><span>animationProgress</span><span> </span><span>+=</span><span> </span><span>ANIMATION_SPEED</span><span> </span><span>*</span><span> </span><span>app</span><span>.</span><span>getDeltaTime</span><span>();</span>\\<span>n</span><span>}</span>\\<span>n</span><span>```</span>\\<span>n</span>\\<span>n</span><span>### src/opengl/app.cpp (OpenGL framework):\\n```cpp\\nclass App {\\n</span>\n<span> </span><span>GLFWwindow</span><span>*</span><span> </span><span>window</span><span>;</span>\\<span>n</span><span> </span><span>Camera</span><span> </span><span>camera</span><span>;</span>\\<span>n</span><span> </span><span>PostProcessor</span><span>*</span><span> </span><span>postProcessor</span><span>;</span>\\<span>n</span><span> </span>\\<span>n</span><span> </span><span>void</span><span> </span><span>init</span><span>(</span><span>int</span><span> </span><span>width</span><span>,</span><span> </span><span>int</span><span> </span><span>height</span><span>,</span><span> </span><span>const</span><span> </span><span>std</span><span>::</span><span>string</span><span>&</span><span> 
</span><span>title</span><span>)</span><span> </span><span>{</span>\\<span>n</span><span> </span><span>if</span>\n<span> </span><span>(</span><span>!</span><span>glfwInit</span><span>())</span><span> </span><span>{</span><span> </span><span>/*</span><span> </span><span>error</span><span> </span><span>handling</span><span> </span><span>*/</span><span> </span><span>}</span>\\<span>n</span><span> </span>\\<span>n</span><span> </span><span>glfwWindowHint</span><span>(</span><span>GLFW_CONTEXT_VERSION_MAJOR</span><span>,</span><span> </span><span>4</span><span>);</span>\\<span>n</span><span> </span><span>glfwWindowHint</span><span>(</span><span>GLFW_CONTEXT_VERSION_MINOR</span><span>,</span><span> </span><span>1</span><span>);</span>\\<span>n</span>\n<span> </span><span>glfwWindowHint</span><span>(</span><span>GLFW_OPENGL_PROFILE</span><span>,</span><span> </span><span>GLFW_OPENGL_CORE_PROFILE</span><span>);</span>\\<span>n</span><span> </span>\\<span>n</span><span> </span><span>window</span><span> </span><span>=</span><span> </span><span>glfwCreateWindow</span><span>(</span><span>width</span><span>,</span><span> </span><span>height</span><span>,</span><span> </span><span>title</span><span>.</span><span>c_str</span><span>(),</span><span> </span><span>nullptr</span><span>,</span><span> </span><span>nullptr</span><span>);</span>\\<span>n</span><span> </span><span>if</span>\n<span> </span><span>(</span><span>!</span><span>gladLoadGLLoader</span><span>((</span><span>GLADloadproc</span><span>)</span><span>glfwGetProcAddress</span><span>))</span><span> </span><span>{</span>\\<span>n</span><span> </span><span>/*</span><span> </span><span>error</span><span> </span><span>handling</span><span> </span><span>*/</span>\\<span>n</span><span> </span><span>}</span>\\<span>n</span><span> </span><span>}</span>\\<span>n</span><span>};</span>\\<span>n</span><span>```</span>\\<span>n</span>\\<span>n</span><span>### src/paper_loader.cpp (data</span>\n<span> 
management):
```cpp
class PaperLoader {
    std::vector<Paper> papers;
    std::map<int, std::map<int, Cluster>> clusters;

    void loadFromFile(const std::string& filename, float scale) {
        // Parse CSV with wide string support for UTF-8
        std::wifstream file(filename);
        file.imbue(std::locale("en_US.UTF-8"));

        // Process each paper line
        while (std::getline(file, line)) {
            Paper paper;
            // Parse coordinates, title, inclusion status
            paper.x = std::stof(fields[1]) * scale;
            paper.y = std::stof(fields[2]) * scale;
            papers.push_back(paper);
        }
    }

    void generateClusters() {
        // Generate hierarchical clusters (levels 2-6)
        for (int level = 2; level <= 6; ++level) {
            // Group papers into 2^level clusters
            clusters[level] = generateClusterLevel(level);
        }
    }
};
```

### src/clusters.cpp (3D cluster visualization):
```cpp
class ClusterRenderer {
    std::map<int, std::map<int, ClusterData>> clusterData;

    void generateClusters(const std::map<int, std::map<int, Cluster>>& clusters) {
        for (const auto& [level, levelClusters] : clusters) {
            for (const auto& [id, cluster] : levelClusters) {
                // Generate convex hull from paper positions
                ConvexHull hull = convhull_3d_build(cluster.vertices);

                // Export as .obj file
                std::ofstream objFile("data/cluster_models/cluster_" +
                    std::to_string(level) + "_" + std::to_string(id) + ".obj");
                // Write vertices and faces
            }
        }
    }

    void renderCluster(const Shader& shader, const glm::mat4& projection,
                       const glm::mat4& view, const glm::vec3& color, int level, int id) {
        ClusterData* data = getClusterData(level, id);
        shader.setVec3("color", color);
        shader.setMat4("model", data->modelMatrix);
        glBindVertexArray(data->VAO);
        glDrawElements(GL_TRIANGLES, data->indexCount, GL_UNSIGNED_INT, 0);
    }
};
```

### Shader examples:
```glsl
// points.vert - Instanced paper rendering
#version 410 core
layout (location = 0) in vec3 aPos;
layout (location = 1) in vec3 aNormal;
layout (location = 2) in vec3 aOffset;    // per-instance position
layout (location = 3) in float aIncluded; // per-instance inclusion flag

uniform mat4 model, view, projection;
out vec3 FragPos, Normal;
out float Included;

void main() {
    vec3 worldPos = aPos + aOffset;
    gl_Position = projection * view * model * vec4(worldPos, 1.0);
    FragPos = worldPos;
    Normal = aNormal;
    Included = aIncluded;
}

// cluster.frag - Transparent cluster rendering
#version 410 core
in vec3 FragPos, Normal;
uniform vec3 color, CameraPos;

void main() {
    // Blinn-Phong lighting
    vec3 lightDir = normalize(CameraPos - FragPos);
    float diff = max(dot(Normal, lightDir), 0.0);
    vec3 result = (0.3 + 0.7 * diff) * color;
    gl_FragColor = vec4(result, 0.3); // Semi-transparent
}
```

## CMakeLists.txt build configuration:
```cmake
project(main VERSION 0.2 DESCRIPTION "AI Pipeline Visualisation" LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 20)

# Cross-platform library linking
if (CMAKE_SYSTEM MATCHES Linux)
    set(GL_LIBS GL GLU glfw3 assimp freetype)
    include_directories(${CMAKE_SOURCE_DIR}/include/linux)
endif()

# Copy assets and shaders to build directory
add_custom_target(copy_assets
    COMMAND ${CMAKE_COMMAND} -E copy_directory ${CMAKE_CURRENT_LIST_DIR}/data ${CMAKE_CURRENT_BINARY_DIR}/data)
```

The application loads academic papers from CSV, renders them as animated 3D cubes, groups them into hierarchical clusters with convex hull visualization, and provides interactive controls for exploration.")
  ⎿  Error: Gemini request timed out after 30 seconds

● Let me try with a more concise analysis request:
● check_with_gemini:check_with_gemini (MCP)(prompt: "Analyze this 3D visualization codebase for code quality, architecture, and potential issues. Focus on OpenGL best practices, memory management, error handling, and overall software engineering quality. Provide both strengths and weaknesses.", content: "C++ OpenGL 4.1 academic paper visualization app with these key components:

**Architecture:**
- App class: OpenGL/GLFW wrapper; handles window, camera, rendering loop
- PaperLoader: CSV data loading, cluster generation, UTF-8 support
- ClusterRenderer: 3D convex hull generation/rendering for clusters
- Shader management system

**Key Code Patterns:**

Main loop with instanced rendering:
```cpp
while (!app.shouldClose()) {
    app.handleInput();
    app.clear();

    // Instanced paper cubes
    glDrawArraysInstanced(GL_TRIANGLES, 0, 36, paperCount);

    // Transparent clusters with depth sorting
    std::map<float, std::pair<int, glm::vec3>> sortedClusters;
    for (clusters) {
        distance = glm::length(cameraPos - clusterPos);
        sortedClusters[distance] = clusterData;
    }
    // Render back-to-front
}
```

Resource management:
```cpp
class App {
    GLFWwindow* window;
    unsigned int framebuffer, colorTexture, depthTexture;

    ~App() {
        glDeleteFramebuffers(1, &framebuffer);
        glfwTerminate();
    }
};
```

Data loading with wide strings:
```cpp
void PaperLoader::loadFromFile(const std::string& filename, float scale) {
    std::wifstream file(filename);
    file.imbue(std::locale("en_US.UTF-8"));
    // Parse CSV data, handle UTF-8 paper titles
}
```

OpenGL state management:
```cpp
void App::enableDepthTesting() {
    glEnable(GL_DEPTH_TEST);
    glDepthFunc(GL_LESS);
}

void App::enableBlending() {
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
}
```

**Libraries:** GLFW, GLAD, GLM, Assimp, FreeType, ConvexHull3D
**Features:** Instanced rendering, transparent clusters, post-processing, font rendering, CSV data loading, animation system")
  ⎿  First, I'll conduct a comprehensive search of the codebase to identify all C++ and header files. This will give me a complete picture of the project's structure and all its components. Okay, that's a lot of files. Based on your description and the file list, I'll focus on the core application logic in the `src` directory and the root-level `.cpp` files. The `include` and `lib` directories appear to contain third-party libraries, so I'll exclude those from the primary analysis.
     … +57 lines (ctrl+r to expand)
● Update Todos
  ⎿  ☒ Explore codebase structure and files
     ☒ Have Gemini analyze the codebase in depth
     ☐ Critically evaluate Gemini's analysis
     ☐ Create two paragraph summary for new contributors

● Let me critically evaluate Gemini's analysis by examining specific code examples to verify the claims:

● Read(src/opengl/app.h)
  ⎿  Read 183 lines (ctrl+r to expand)

● Read(src/opengl/app.cpp)
  ⎿  Read 100 lines (ctrl+r to expand)

● Search(pattern: "delete|new|malloc|free", include: "*.cpp")
  ⎿  Found 8 files (ctrl+r to expand)

● Update Todos
  ⎿  ☒ Explore codebase structure and files
     ☒ Have Gemini analyze the codebase in depth
     ☒ Critically evaluate Gemini's analysis
     ☐ Create two paragraph summary for new contributors

● After critically evaluating Gemini's analysis against the actual codebase, I found it to be largely accurate and insightful.
The analysis correctly identifies the use of raw pointers (_defaultShader and _postProcessor) in app.h:146 and app.h:151, the memory management issue in wstring2string (main.cpp:500-503), and the hardcoded paths throughout the codebase. The assessment of OpenGL best practices, instanced rendering usage, and transparency handling is also correct based on the code structure.
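[Editor's note] The two concrete findings above map onto standard RAII fixes. A hedged sketch only: the `Shader` and `PostProcessor` types below are empty stand-ins so the example compiles, and this `wstring2string` is a naive ASCII-only replacement, not the project's actual implementation:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Stand-in types so the sketch compiles; the real ones live in the app.
struct Shader {};
struct PostProcessor {};

class App {
    // std::unique_ptr frees the objects automatically on destruction,
    // replacing the raw _defaultShader / _postProcessor pointers
    // flagged in the review.
    std::unique_ptr<Shader> _defaultShader = std::make_unique<Shader>();
    std::unique_ptr<PostProcessor> _postProcessor = std::make_unique<PostProcessor>();
};

// Leak-free conversion: std::string owns its buffer, so there is no
// manually allocated char array to forget to delete. Naive per-character
// narrowing, adequate only for ASCII content.
std::string wstring2string(const std::wstring& ws) {
    std::string out;
    out.reserve(ws.size());
    for (wchar_t wc : ws)
        out.push_back(static_cast<char>(wc));
    return out;
}
```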
● Update Todos
  ⎿  ☒ Explore codebase structure and files
     ☒ Have Gemini analyze the codebase in depth
     ☒ Critically evaluate Gemini's analysis
     ☒ Create two paragraph summary for new contributors

● Summary for New Contributors

  This is a sophisticated 3D data visualization application built in C++20 with OpenGL 4.1 that renders academic papers as animated cubes in 3D space, organized into hierarchical clusters with convex hull visualizations. The codebase demonstrates solid software engineering practices with clear separation of concerns across its main components: the App class manages the OpenGL context and rendering pipeline, PaperLoader handles CSV data ingestion with UTF-8 support and cluster generation, ClusterRenderer creates and renders 3D convex hulls using the ConvexHull3D library, and a comprehensive shader management system ties them together. The application uses modern OpenGL techniques including instanced rendering for performance, proper transparency handling with depth sorting, and post-processing effects, while supporting real-time animation controls and interactive 3D navigation.

  The architecture is well-organized with appropriate use of modern C++ features, RAII principles for resource management, and established graphics programming patterns, though it has
</span><span>some</span><span> </span><span>areas</span><span> </span><span>for</span><span> </span><span>improvement</span>\n<span> </span><span>including</span><span> </span><span>inconsistent</span><span> </span><span>error</span><span> </span><span>handling</span><span>,</span><span> </span><span>occasional</span><span> </span><span>use</span><span> </span><span>of</span><span> </span><span>raw</span><span> </span><span>pointers</span><span> </span><span>instead</span><span> </span><span>of</span><span> </span><span>smart</span><span> </span><span>pointers</span><span>,</span><span> </span><span>and</span><span> </span><span>hardcoded</span><span> </span><span>configuration</span><span> </span><span>values</span><span>.</span><span> </span><span>New</span><span> </span><span>contributors</span><span> </span><span>should</span><span> </span><span>be</span><span> </span><span>comfortable</span><span> </span><span>with</span><span> </span><span>OpenGL</span><span>/</span><span>graphics</span><span> </span><span>programming</span>\n<span> </span><span>concepts</span><span>,</span><span> </span><span>understand</span><span> </span><span>3</span><span>D</span><span> </span><span>mathematics</span><span> </span><span>(</span><span>GLM</span><span> </span><span>library</span><span>),</span><span> </span><span>and</span><span> </span><span>be</span><span> </span><span>familiar</span><span> </span><span>with</span><span> </span><span>build</span><span> </span><span>systems</span><span> </span><span>like</span><span> </span><span>CMake</span><span>.</span><span> </span><span>The</span><span> </span><span>codebase</span><span> </span><span>includes</span><span> </span><span>comprehensive</span><span> </span><span>asset</span><span> </span><span>handling</span><span> </span><span>(</span><span>3</span><span>D</span><span> </span><span>models</span><span>,</span><span> </span><span>shaders</span><span>,</span><span> </span><span>fonts</span><span>,</span><span> 
</span><span>textures</span><span>),</span><span> </span><span>supports</span>\n<span> </span><span>cross</span><span>-</span><span>platform</span><span> </span><span>development</span><span> </span><span>(</span><span>Linux</span><span> </span><span>focus</span><span> </span><span>with</span><span> </span><span>Windows</span><span> </span><span>compatibility</span><span>),</span><span> </span><span>and</span><span> </span><span>implements</span><span> </span><span>a</span><span> </span><span>complete</span><span> </span><span>data</span><span> </span><span>pipeline</span><span> </span><span>from</span><span> </span><span>CSV</span><span> </span><span>loading</span><span> </span><span>through</span><span> </span><span>hierarchical</span><span> </span><span>clustering</span><span> </span><span>to</span><span> </span><span>real</span><span>-</span><span>time</span><span> </span><span>3</span><span>D</span><span> </span><span>visualization</span><span> </span><span>with</span><span> </span><span>interactive</span>\n<span> </span><span>controls</span><span> </span><span>for</span><span> </span><span>exploring</span><span> </span><span>academic</span><span> </span><span>paper</span><span> </span><span>datasets</span><span>.</span>\n</code></pre></div>\n\n<p>As you can see one of the downsides is that some of the calls to <code>check_with_gemini</code> take longer than 30 seconds and so timeout.</p>",
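Since a tool like check_with_gemini ultimately shells out to a CLI, one way to avoid tripping a client's 30-second tool limit is to enforce a shorter deadline inside the tool itself and return a clean error instead. A minimal sketch of that pattern, not the actual check-with-gemini implementation (the `gemini` binary name and `--prompt` flag are assumptions about gemini-cli's interface):

```python
# Sketch: run an external CLI with a hard deadline so slow calls fail
# cleanly rather than hitting the MCP client's tool timeout.
import subprocess
from typing import Optional

def call_cli(cmd: list, timeout_s: float = 25.0) -> Optional[str]:
    """Run an external command, returning its stdout, or None on timeout."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout_s, check=True)
        return result.stdout
    except subprocess.TimeoutExpired:
        return None  # caller can retry with a narrower question

def check_with_gemini(prompt: str) -> str:
    # "gemini" and "--prompt" are placeholders; substitute the real
    # gemini-cli invocation.
    out = call_cli(["gemini", "--prompt", prompt])
    return out if out is not None else "error: gemini review timed out"
```

Keeping the internal deadline below the client's 30 seconds leaves headroom to return a useful error message the agent can act on.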
+20
sadiqj/www.toao.com_2025-07-15__blog_ocaml-0725.json
···+"summary": "<p>A look at recent OCaml projects, from benchmarking AI code models and building new agentic tools to improvements in the garbage collector.</p>",+"content": "<p>Thanks to sponsorship from <a href=\"https://www.tarides.com\">Tarides</a> I've been able to spend some time over the last few months hacking on various OCaml-related projects. Some of these I've already published as blog posts, while others are still works in progress, but this post gives brief summaries and ties some of them together for a bigger picture.</p>\n<h1>AI Coding Agents</h1>\n<p><a href=\"https://jon.recoil.org/\">Jon Ludlam</a>, <a href=\"https://anil.recoil.org/\">Anil Madhavapeddy</a> and I have spent a fair amount of time thinking broadly about AI coding agents. In addition to this post it's worth checking this <a href=\"https://www.cl.cam.ac.uk/~avsm2/2025-ocaml-ai-draft1.pdf\">short draft paper</a> we've written that summarises our work so far and our thoughts on what could be needed going forward for OCaml to work well with AI tooling.</p>\n<h2>Benchmarking</h2>\n<p>Anil and Jon lecture the first year <a href=\"https://www.cl.cam.ac.uk/teaching/2425/FoundsCS/\">Foundations of Computer Science</a> course here at Cambridge, so we <a href=\"https://toao.com/blog/ocaml-local-code-models\">tested how well self-hostable models performed on first year coding exercises</a>. We found the Qwen3 models were very effective, rivaling much bigger and older models. We plan to do some more benchmarking, especially as we've heard good things from community members about newer models like <a href=\"https://mistral.ai/news/devstral\">Mistral's devstral</a>, which has <a href=\"https://mistral.ai/news/devstral-2507\">just been updated</a>. Before we can do that though, we want to build a more extensive set of benchmarks. A step in that direction is <a href=\"https://toao.com/blog/opam-archive-dataset\">opam-archive-dataset</a>, a continuously-updated parquet dataset of all source code in opam. 
We've also been looking at <a href=\"https://arxiv.org/abs/2504.14757\">SWE-Synth</a> and how we would apply a similar method for synthesising across community OCaml projects.</p>\n<h2>Agentic tooling</h2>\n<p>There are coding models that we can't modify ourselves<a href=\"#fn:1\">1</a> and which users access remotely. Nearly all agents using these coding models support tool use through the <a href=\"https://modelcontextprotocol.io/introduction\">Model Context Protocol</a>. We've started working on a tool, <a href=\"https://github.com/sadiqj/odoc-llm\">odoc-llm</a>, which enables natural language search of functionality over all packages and libraries in opam - in a way that can be hosted centrally as a remote MCP server. It's still a work in progress and Jon is planning to write something more extensive once we've fixed the last few issues. We've mentioned a few other potential tools in the <a href=\"https://www.cl.cam.ac.uk/~avsm2/2025-ocaml-ai-draft1.pdf\">draft piece</a> and if anyone in the community is working on any of them or would like to collaborate then please do get in touch!</p>\n<p>I've also been exploring simple collaboration between coding models and created <a href=\"https://github.com/sadiqj/check-with-gemini\">a very simple MCP server</a> that lets you use gemini-cli as a tool in another agent. Mostly I use this to have Claude Code call Gemini to check its work - this can be helpful for tasks that involve analysing larger code bases where Gemini's larger context window is useful.</p>\n<h1>The OCaml runtime</h1>\n<p>In addition to the forward-looking work around AI models, I've also been doing a bit of runtime maintenance. The <a href=\"https://github.com/ocaml/ocaml/pull/13616\">change to the shared heap's free list representation</a> has gone through two stages of review and should land fairly soon. 
We're just waiting on some final performance testing before merging.</p>\n<p>I had a fun debugging session with <a href=\"https://janmidtgaard.dk/\">Jan Midtgaard</a> trying to figure out what could have caused <a href=\"https://github.com/ocaml/ocaml/issues/13739\">ocaml issue 13739</a>. It turned out that terminating domains could orphan shared pools in parallel with a running stop-the-world section, which was unexpected and bad, as it could lead to a segfault or memory corruption if you were very unfortunate. <a href=\"https://gallium.inria.fr/~scherer/\">Gabriel Scherer</a> fixed this in <a href=\"https://github.com/ocaml/ocaml/pull/14025\">ocaml pr 14025</a>.</p>\n<p>Lastly, with the GC there is the long-running compactor unification between the <a href=\"https://github.com/ocaml/ocaml/blob/trunk/runtime/shared_heap.c#L1066\">trunk ocaml compactor</a> and the <a href=\"https://github.com/oxcaml/oxcaml/blob/main/runtime/shared_heap.c#L2166\">OxCaml compactor</a><a href=\"#fn:2\">2</a>. The OxCaml compactor is a bit more expensive than trunk but plays much better with virtual memory. This helps reduce memory usage and improve performance in large, long-running applications. I suspect proposing the OxCaml compactor for trunk will be something worth doing next quarter.</p>\n<p>Finally, there have been a number of <a href=\"https://github.com/ocaml/ocaml/pull/14035\">small</a> <a href=\"https://github.com/ocaml/ocaml/pull/13970\">fixes</a> <a href=\"https://github.com/ocaml/ocaml/pull/13785\">to</a> runtime events. It's nice to see these and new feature requests appearing.</p>\n<h2>Projects</h2>\n<p>I had hoped to kick off a couple of undergraduate or masters projects along with <a href=\"https://kcsrk.info/\">KC Sivaramakrishnan</a>, either at Cambridge or at IIT Madras. These are intended to be ways of mentoring students who think they might like to become contributors to the OCaml runtime. 
Unfortunately, February and March are a bit late for picking projects on academic courses.</p>\n<p>The first is implementing 'Early Release' for the OCaml minor collector. At present, all domains pause for a minor collection and wait until all domains have finished promoting their minor heaps before resuming. Consider each domain: its minor heap consists of values that are reachable from other domains but also values that only it can reach. If all domains have promoted the global roots, their local roots, and their remembered set, then any domain that has finished promoting everything in its minor heap should be free to leave the minor collection and resume the mutator. This should mean that domains that do few minor heap allocations spend very little time in minor collections. We had an early prototype of this implemented a few years ago on the ocaml-multicore repo but we weren't convinced it was correct and ran out of time to fix it before upstreaming multicore. I think now is a good time to try again.</p>\n<p>The second project is 'Domain runtime work-sharing'. At present, the number of domains needs to be less than the number of available physical cores. Going over that number can significantly reduce performance as domains wait to enter or complete a stop-the-world section in the runtime. Could we address this by restricting the number of domains running in a stop-the-world section to the number of physical cores and have that set of domains take on the work of all stopped domains? This is probably most advantageous for minor collections but could be done for major stop-the-world sections too.</p>\n<p>A third wild card project is exploring how we could use information from the GC and <a href=\"https://docs.kernel.org/scheduler/sched-ext.html\">Linux's sched_ext</a> to better schedule highly parallel OCaml programs. 
This is a lot more speculative but might make for an interesting Masters project.</p>\n<p>If you are interested in any of these projects for the next academic year, please do get in touch with me or <a href=\"https://kcsrk.info/\">KC</a>.</p>\n<div>\n\n\n<ol>\n<li>\n<p>Mainly models from proprietary providers like OpenAI, Anthropic and Google, or where the models are huge and uneconomical to run in anything less than large multi-user deployments, like DeepSeek V3/R1 or Kimi K2. <a href=\"#fnref:1\" title=\"Jump back to footnote 1 in the text\">↩</a></p>\n</li>\n<li>\n<p>OxCaml actually has two compactors, the trunk compactor and a new one. I only mean the new one here. <a href=\"#fnref:2\" title=\"Jump back to footnote 2 in the text\">↩</a></p>\n</li>\n</ol>\n</div>",
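The Early Release argument above can be made concrete with a toy model: under the current scheme every domain's pause is the promotion time of the slowest domain, whereas with Early Release a domain resumes once the shared roots are promoted and its own work is done. A back-of-the-envelope sketch (the two-phase split and the millisecond figures are illustrative assumptions, not measurements of the OCaml runtime):

```python
# Toy model of per-domain pause times in a minor collection.

def pause_times_full_barrier(promotion_ms):
    # Current scheme: every domain stays in the stop-the-world section
    # until the slowest domain has finished promoting its minor heap.
    slowest = max(promotion_ms)
    return [slowest for _ in promotion_ms]

def pause_times_early_release(promotion_ms, root_ms):
    # Early Release: all domains must first promote globally reachable
    # roots (root_ms), but each may resume the mutator as soon as its
    # own remaining promotion work is done.
    root_phase = max(root_ms)
    return [max(root_phase, own) for own in promotion_ms]

promotion = [2, 3, 50]   # one domain allocates heavily on its minor heap
roots = [1, 1, 1]
print(pause_times_full_barrier(promotion))          # [50, 50, 50]
print(pause_times_early_release(promotion, roots))  # [2, 3, 50]
```

The point of the model is the last line: domains that do few minor heap allocations would pause for roughly their own promotion time rather than the slowest domain's.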