{
  "id": "https://www.tunbury.org/2025/05/08/debugging-obuilder-macos",
  "title": "Debugging OBuilder on macOS",
  "link": "https://www.tunbury.org/2025/05/08/debugging-obuilder-macos/",
  "updated": "2025-05-08T12:00:00",
  "published": "2025-05-08T12:00:00",
  "summary": "The log from an OBuilder job starts with the steps needed to reproduce the job locally. This boilerplate output assumes that all OBuilder jobs start from a Docker base image, but on some operating systems, such as FreeBSD and macOS, OBuilder uses ZFS base images. On OpenBSD and Windows, it uses QEMU images. The situation is further complicated when the issue only affects a specific architecture that may be unavailable to the user.",
  "content": "<p>The log from an <a href=\"https://github.com/ocurrent/obuilder\">OBuilder</a> job starts with the steps needed to reproduce the job locally. This boilerplate output assumes that all OBuilder jobs start from a Docker base image, but on some operating systems, such as FreeBSD and macOS, OBuilder uses ZFS base images. On OpenBSD and Windows, it uses QEMU images. 
The situation is further complicated when the issue only affects a specific architecture that may be unavailable to the user.</p>\n\n<div><div><pre><code>2025-05-08 13:29.37: New job: build bitwuzla-cxx.0.7.0, using opam 2.3\n from https://github.com/ocaml/opam-repository.git#refs/pull/27768/head (55a47416d532dc829d9111297970934a21a1b1c4)\n on macos-homebrew-ocaml-4.14/amd64\n\nTo reproduce locally:\n\ncd $(mktemp -d)\ngit clone --recursive \"https://github.com/ocaml/opam-repository.git\" &amp;&amp; cd \"opam-repository\" &amp;&amp; git fetch origin \"refs/pull/27768/head\" &amp;&amp; git reset --hard 55a47416\ngit fetch origin master\ngit merge --no-edit b8a7f49af3f606bf8a22869a1b52b250dd90092e\ncat &gt; ../Dockerfile &lt;&lt;'END-OF-DOCKERFILE'\n\nFROM macos-homebrew-ocaml-4.14\nUSER 1000:1000\nRUN ln -f ~/local/bin/opam-2.3 ~/local/bin/opam\nRUN opam init --reinit -ni\nRUN opam option solver=builtin-0install &amp;&amp; opam config report\nENV OPAMDOWNLOADJOBS=\"1\"\nENV OPAMERRLOGLEN=\"0\"\nENV OPAMPRECISETRACKING=\"1\"\nENV CI=\"true\"\nENV OPAM_REPO_CI=\"true\"\nRUN rm -rf opam-repository/\nCOPY --chown=1000:1000 . 
opam-repository/\nRUN opam repository set-url -k local --strict default opam-repository/\nRUN opam update --depexts || true\nRUN opam pin add -k version -yn bitwuzla-cxx.0.7.0 0.7.0\nRUN opam reinstall bitwuzla-cxx.0.7.0; \\\n res=$?; \\\n test \"$res\" != 31 &amp;&amp; exit \"$res\"; \\\n export OPAMCLI=2.0; \\\n build_dir=$(opam var prefix)/.opam-switch/build; \\\n failed=$(ls \"$build_dir\"); \\\n partial_fails=\"\"; \\\n for pkg in $failed; do \\\n if opam show -f x-ci-accept-failures: \"$pkg\" | grep -qF \"\\\"macos-homebrew\\\"\"; then \\\n echo \"A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.\"; \\\n fi; \\\n test \"$pkg\" != 'bitwuzla-cxx.0.7.0' &amp;&amp; partial_fails=\"$partial_fails $pkg\"; \\\n done; \\\n test \"${partial_fails}\" != \"\" &amp;&amp; echo \"opam-repo-ci detected dependencies failing: ${partial_fails}\"; \\\n exit 1\n\n\nEND-OF-DOCKERFILE\ndocker build -f ../Dockerfile .\n</code></pre></div></div>\n\n<p>It is, therefore, difficult to diagnose the issue on these operating systems and on esoteric architectures. Is it an issue with the CI system or the job itself?</p>\n\n<p>My approach is to get myself into an interactive shell at the point in the build where the failure occurs. On Linux and FreeBSD, the log is available in <code>/var/log/syslog</code> or <code>/var/log/messages</code> respectively. On macOS, this log is written to <code>ocluster.log</code>. macOS workers are single-threaded, so the worker must be paused before progressing.</p>\n\n<p>Each step in an OBuilder job consists of taking a snapshot of the previous layer, running a command in that layer, and keeping or discarding the layer depending on the command’s success or failure. On macOS, layers are ZFS snapshots mounted over the Homebrew directory and the CI users’ home directory. 
We can extract the appropriate command from the logs.</p>\n\n<div><div><pre><code>2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"-o\" \"canmount=noauto\" \"--\" \"obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec@snap\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"mount\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"-o\" \"mountpoint=none\" \"--\" \"obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec/brew@snap\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"-o\" \"mountpoint=none\" \"--\" \"obuilder/result/a67e6d3b460fa52b5c57581e7c01fa74ddca0a0b5462fef34103a09e87f3feec/home@snap\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home\"\ncannot open 'obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40@snap': dataset does not exist\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"--\" \"obuilder/cache/c-opam-archives@snap\" \"obuilder/cache-tmp/8608-c-opam-archives\"\n2025-05-08 14:31.17 application [INFO] Exec \"zfs\" \"clone\" \"--\" \"obuilder/cache/c-homebrew@snap\" \"obuilder/cache-tmp/8609-c-homebrew\"\n2025-05-08 14:31.18 obuilder [INFO] result_tmp = /Volumes/obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/Users/mac1000\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home\"\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/usr/local\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew\"\n2025-05-08 
14:31.18 obuilder [INFO] src = /Volumes/obuilder/cache-tmp/8608-c-opam-archives, dst = /Users/mac1000/.opam/download-cache, type rw\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/Users/mac1000/.opam/download-cache\" \"obuilder/cache-tmp/8608-c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8608-c-opam-archives\n2025-05-08 14:31.18 obuilder [INFO] src = /Volumes/obuilder/cache-tmp/8609-c-homebrew, dst = /Users/mac1000/Library/Caches/Homebrew, type rw\n2025-05-08 14:31.18 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=/Users/mac1000/Library/Caches/Homebrew\" \"obuilder/cache-tmp/8609-c-homebrew\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8609-c-homebrew\n2025-05-08 14:31.19 application [INFO] Exec \"sudo\" \"dscl\" \".\" \"list\" \"/Users\"\n2025-05-08 14:31.19 application [INFO] Exec \"sudo\" \"-u\" \"mac1000\" \"-i\" \"getconf\" \"DARWIN_USER_TEMP_DIR\"\n2025-05-08 14:31.19 application [INFO] Fork exec \"sudo\" \"su\" \"-l\" \"mac1000\" \"-c\" \"--\" \"source ~/.obuilder_profile.sh &amp;&amp; env 'TMPDIR=/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/' 'OPAM_REPO_CI=true' 'CI=true' 'OPAMPRECISETRACKING=1' 'OPAMERRLOGLEN=0' 'OPAMDOWNLOADJOBS=1' \"$0\" \"$@\"\" \"/usr/bin/env\" \"bash\" \"-c\" \"opam reinstall bitwuzla-cxx.0.7.0;\n res=$?;\n test \"$res\" != 31 &amp;&amp; exit \"$res\";\n export OPAMCLI=2.0;\n build_dir=$(opam var prefix)/.opam-switch/build;\n failed=$(ls \"$build_dir\");\n partial_fails=\"\";\n for pkg in $failed; do\n if opam show -f x-ci-accept-failures: \"$pkg\" | grep -qF \"\\\"macos-homebrew\\\"\"; then\n echo \"A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.\";\n fi;\n test \"$pkg\" != 'bitwuzla-cxx.0.7.0' &amp;&amp; partial_fails=\"$partial_fails $pkg\";\n done;\n test \"${partial_fails}\" != \"\" &amp;&amp; echo \"opam-repo-ci detected dependencies failing: ${partial_fails}\";\n exit 1\"\n2025-05-08 14:31.28 worker [INFO] OBuilder 
partition: 27% free, 2081 items\n2025-05-08 14:31.58 worker [INFO] OBuilder partition: 27% free, 2081 items\n2025-05-08 14:32.28 worker [INFO] OBuilder partition: 27% free, 2081 items\n2025-05-08 14:32.43 application [INFO] Exec \"zfs\" \"inherit\" \"mountpoint\" \"obuilder/cache-tmp/8608-c-opam-archives\"\nUnmount successful for /Users/mac1000/.opam/download-cache\n2025-05-08 14:32.44 application [INFO] Exec \"zfs\" \"inherit\" \"mountpoint\" \"obuilder/cache-tmp/8609-c-homebrew\"\nUnmount successful for /Users/mac1000/Library/Caches/Homebrew\n2025-05-08 14:32.45 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=none\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/home\"\nUnmount successful for /Users/mac1000\n2025-05-08 14:32.45 application [INFO] Exec \"zfs\" \"set\" \"mountpoint=none\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40/brew\"\nUnmount successful for /usr/local\n2025-05-08 14:32.46 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache/c-homebrew\" \"obuilder/cache-tmp/8610-c-homebrew\"\nUnmount successful for /Volumes/obuilder/cache/c-homebrew\n2025-05-08 14:32.46 application [INFO] Exec \"zfs\" \"promote\" \"obuilder/cache-tmp/8609-c-homebrew\"\n2025-05-08 14:32.46 application [INFO] Exec \"zfs\" \"destroy\" \"-f\" \"--\" \"obuilder/cache-tmp/8610-c-homebrew\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8610-c-homebrew\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew@snap\" \"obuilder/cache-tmp/8609-c-homebrew@old-2152\"\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"destroy\" \"-d\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew@old-2152\"\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"snapshot\" \"-r\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew@snap\"\n2025-05-08 14:32.48 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8609-c-homebrew\" 
\"obuilder/cache/c-homebrew\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8609-c-homebrew\n2025-05-08 14:32.49 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache/c-opam-archives\" \"obuilder/cache-tmp/8611-c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache/c-opam-archives\n2025-05-08 14:32.50 application [INFO] Exec \"zfs\" \"promote\" \"obuilder/cache-tmp/8608-c-opam-archives\"\n2025-05-08 14:32.50 application [INFO] Exec \"zfs\" \"destroy\" \"-f\" \"--\" \"obuilder/cache-tmp/8611-c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8611-c-opam-archives\n2025-05-08 14:32.51 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives@snap\" \"obuilder/cache-tmp/8608-c-opam-archives@old-2152\"\n2025-05-08 14:32.51 application [INFO] Exec \"zfs\" \"destroy\" \"-d\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives@old-2152\"\n2025-05-08 14:32.51 application [INFO] Exec \"zfs\" \"snapshot\" \"-r\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives@snap\"\n2025-05-08 14:32.52 application [INFO] Exec \"zfs\" \"rename\" \"--\" \"obuilder/cache-tmp/8608-c-opam-archives\" \"obuilder/cache/c-opam-archives\"\nUnmount successful for /Volumes/obuilder/cache-tmp/8608-c-opam-archives\n2025-05-08 14:32.52 application [INFO] Exec \"zfs\" \"destroy\" \"-r\" \"-f\" \"--\" \"obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\"\nUnmount successful for /Volumes/obuilder/result/af09425cd7744c7b32ed000b11db90295142f3d3430fddb594932d5c02343b40\n2025-05-08 14:32.58 worker [INFO] OBuilder partition: 27% free, 2081 items\n2025-05-08 14:33.04 worker [INFO] Job failed: \"/usr/bin/env\" \"bash\" \"-c\" \"opam reinstall bitwuzla-cxx.0.7.0;\n res=$?;\n test \"$res\" != 31 &amp;&amp; exit \"$res\";\n export OPAMCLI=2.0;\n build_dir=$(opam var prefix)/.opam-switch/build;\n failed=$(ls \"$build_dir\");\n partial_fails=\"\";\n for pkg in $failed; do\n if opam show -f 
x-ci-accept-failures: \"$pkg\" | grep -qF \"\\\"macos-homebrew\\\"\"; then\n echo \"A package failed and has been disabled for CI using the 'x-ci-accept-failures' field.\";\n fi;\n test \"$pkg\" != 'bitwuzla-cxx.0.7.0' &amp;&amp; partial_fails=\"$partial_fails $pkg\";\n done;\n test \"${partial_fails}\" != \"\" &amp;&amp; echo \"opam-repo-ci detected dependencies failing: ${partial_fails}\";\n exit 1\" failed with exit status 1\n\n</code></pre></div></div>\n\n<p>Run each of the <em>Exec</em> commands at the command prompt, in order, up to but not including the <em>Fork exec</em>. We do need to run the <em>Fork exec</em> command as well, but we want an interactive shell, so let’s replace the final part of that command with <code>bash</code>:</p>\n\n<div><div><pre><code>sudo su -l mac1000 -c -- \"source ~/.obuilder_profile.sh &amp;&amp; env 'TMPDIR=/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/' 'OPAM_REPO_CI=true' 'CI=true' 'OPAMPRECISETRACKING=1' 'OPAMERRLOGLEN=0' 'OPAMDOWNLOADJOBS=1' bash\"\n</code></pre></div></div>\n\n<p>Now, at the shell prompt, we can try <code>opam reinstall bitwuzla-cxx.0.7.0</code>. 
Hopefully, this fails in the same way as the CI job, which confirms that we have recreated the environment.</p>\n\n<div><div><pre><code>$ opam source bitwuzla-cxx.0.7.0\n$ cd bitwuzla-cxx.0.7.0\n$ dune build\nFile \"vendor/dune\", lines 201-218, characters 0-436:\n201 | (rule\n202 | (deps\n203 | (source_tree bitwuzla)\n.....\n216 | %{p0002}\n217 | (run patch -p1 --directory bitwuzla))\n218 | (write-file %{target} \"\")))))\n(cd _build/default/vendor &amp;&amp; /usr/bin/patch -p1 --directory bitwuzla) &lt; _build/default/vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\npatching file 'include/bitwuzla/cpp/bitwuzla.h'\nCan't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw', output is in '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw': Permission denied\npatch: **** can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/build_9012b8_dune/patchoEyVbKAjSTw': Permission denied\n</code></pre></div></div>\n\n<p>This matches the output we see in the CI logs. <code>/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T</code> is the <code>TMPDIR</code> value set in the environment. <code>Permission denied</code> suggests a file system permissions problem, yet <code>ls -l</code> and <code>touch</code> show that we can write to this directory.</p>\n\n<p>As we are running on macOS and Dune is invoking <code>patch</code>, my first thought is the difference between Apple’s <code>patch</code> and GNU’s <code>patch</code>. 
Editing <code>vendor/dune</code> to use <code>gpatch</code> rather than <code>patch</code> allows the project to build.</p>\n\n<div><div><pre><code>$ dune build\n(cd _build/default/vendor &amp;&amp; /usr/local/bin/gpatch --directory bitwuzla -p1) &lt; _build/default/vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\nFile include/bitwuzla/cpp/bitwuzla.h is read-only; trying to patch anyway\npatching file include/bitwuzla/cpp/bitwuzla.h\n</code></pre></div></div>\n\n<p>Running Apple’s <code>patch</code> directly gives the same error:</p>\n\n<div><div><pre><code>$ patch -p1 &lt; ../../../../vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\npatching file 'include/bitwuzla/cpp/bitwuzla.h'\nCan't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI', output is in '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI': Permission denied\npatch: **** can't create '/var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI': Permission denied\n</code></pre></div></div>\n\n<p>However, <code>touch /var/folders/s_/z7_t3bvn5txfn81hk9p3ntfw0000z8/T/patchorVrfBtHVDI</code> succeeds.</p>\n\n<p>The output from GNU <code>patch</code> above reports that the file itself is read-only.</p>\n\n<div><div><pre><code>$ ls -l include/bitwuzla/cpp/bitwuzla.h\n-r--r--r-- 1 mac1000 admin 52280 May 8 15:05 include/bitwuzla/cpp/bitwuzla.h\n</code></pre></div></div>\n\n<p>Let’s try to adjust the permissions:</p>\n\n<div><div><pre><code>$ chmod 644 include/bitwuzla/cpp/bitwuzla.h\n$ patch -p1 &lt; ../../../../vendor/patch/0001-api-Add-hook-for-ocaml-z-value.patch\npatching file 'include/bitwuzla/cpp/bitwuzla.h'\n</code></pre></div></div>\n\n<p>And now, it succeeds. The issue is that GNU’s <code>patch</code> and Apple’s <code>patch</code> behave differently when the file being patched is read-only. 
Apple’s <code>patch</code> fails with a misleading error about the temporary file, while GNU’s <code>patch</code> emits a warning and makes the change anyway.</p>\n\n<p>Updating the <code>dune</code> file to run <code>chmod</code> before patching should both clear the GNU warning and allow the use of the native <code>patch</code>.</p>\n\n<div><div><pre><code>(rule\n (deps\n (source_tree bitwuzla)\n (:p0001\n (file patch/0001-api-Add-hook-for-ocaml-z-value.patch))\n (:p0002\n (file patch/0002-binding-Fix-segfault-with-parallel-instances.patch)))\n (target .bitwuzla_tree)\n (action\n (no-infer\n (progn\n (run chmod -R u+w bitwuzla)\n (with-stdin-from\n %{p0001}\n (run patch -p1 --directory bitwuzla))\n (with-stdin-from\n %{p0002}\n (run patch -p1 --directory bitwuzla))\n (write-file %{target} \"\")))))\n</code></pre></div></div>\n\n<p>As an essential last step, we need to tidy up on this machine. Exit the shell, then refer back to the log file for the job and run all the remaining ZFS commands (those logged after the <em>Fork exec</em>). This is particularly important on macOS, as it keeps the jobs database in sync with the layers.</p>",
  "content_type": "html",
  "author": {
    "name": "Mark Elvers",
    "email": "mark.elvers@tunbury.org",
    "uri": null
  },
  "categories": [
    "macOS,OBuilder",
    "tunbury.org"
  ],
  "source": "https://www.tunbury.org/atom.xml"
}