{
  "id": "https://www.tunbury.org/2025/07/14/tessera-workshop",
  "title": "Tessera Workshop",
  "link": "https://www.tunbury.org/2025/07/14/tessera-workshop/",
  "updated": "2025-07-14T00:00:00",
  "published": "2025-07-14T00:00:00",
  "summary": "I wrote previously about setting up a Jupyter notebook in a Docker container. This worked well for a single user, but we intend to hold a workshop and so need a multi-user setup.",
  "content": "<p>I wrote previously about setting up a <a href=\"https://www.tunbury.org/2025/07/09/jupyter/\">Jupyter notebook in a Docker container</a>. This worked well for a single user, but we intend to hold a workshop and so need a multi-user setup.</p>\n\n<p>We would prefer that as much of the per-user setup as possible be completed automatically so participants don’t need to waste time setting up the environment.</p>\n\n<p>There is a great resource at <a href=\"https://github.com/jupyterhub/jupyterhub-the-hard-way/blob/HEAD/docs/installation-guide-hard.md\">jupyterhub/jupyterhub-the-hard-way</a> walking you through the manual setup.</p>\n\n<p>However, rather than building everything by hand, there are many Docker images that we can use as the base, including <code>python:3.11</code>. I have decided to use <code>jupyter/datascience-notebook:latest</code>. These images are expected to be customised with a <code>Dockerfile</code>.</p>\n\n<p>In my <code>Dockerfile</code>, I first installed JupyterHub, JupyterLab and the other Python dependencies so that users do not need to install these manually later.</p>\n\n<div><div><pre><code>RUN pip install --no-cache-dir \\\n jupyterhub \\\n jupyterlab \\\n notebook \\\n numpy \\\n matplotlib \\\n scikit-learn \\\n ipyleaflet \\\n ipywidgets \\\n ipykernel\n</code></pre></div></div>\n\n<p>Then the system dependencies. 
A selection of editors and <code>git</code>, which is needed for <code>pip install git+https</code>.</p>\n\n<div><div><pre><code>USER root\nRUN apt-get update &amp;&amp; apt-get install -y \\\n curl git vim nano \\\n &amp;&amp; rm -rf /var/lib/apt/lists/*\n</code></pre></div></div>\n\n<p>Then our custom package from GitHub.</p>\n\n<div><div><pre><code>RUN pip install git+https://github.com/ucam-eo/geotessera.git\n</code></pre></div></div>\n\n<p>The default user database is PAM, so create UNIX users for the workshop participants with password login disabled.</p>\n\n<div><div><pre><code>RUN for user in user1 user2 user3; do \\\n adduser --disabled-password --gecos '' $user; \\\n done\n</code></pre></div></div>\n\n<p>Finally, set the default command for the container:</p>\n\n<div><div><pre><code>CMD [\"jupyterhub\", \"-f\", \"/srv/jupyterhub/jupyterhub_config.py\"]\n</code></pre></div></div>\n\n<p>Next, I created the <code>jupyterhub_config.py</code>. I think most of these lines are self-explanatory. Everyone signs in with the same password. 
Global environment variables can be set using <code>c.Spawner.environment</code>.</p>\n\n<div><div><pre><code>from jupyterhub.auth import DummyAuthenticator\n\nc.JupyterHub.authenticator_class = DummyAuthenticator\nc.DummyAuthenticator.password = \"Workshop\"\n\n# Allow all users\nc.Authenticator.allow_all = True\n\n# Use JupyterLab by default\nc.Spawner.default_url = '/lab'\n\n# Set timeouts (in seconds)\nc.Spawner.start_timeout = 300\nc.Spawner.http_timeout = 120\n\n# Environment variables for every spawned server\nc.Spawner.environment = {\n 'TESSERA_DATA_DIR': '/tessera'\n}\n\n# Basic configuration\nc.JupyterHub.ip = '0.0.0.0'\nc.JupyterHub.port = 8000\n</code></pre></div></div>\n\n<p>I’m going to use Caddy as a reverse proxy for this setup. For this I need a <code>Caddyfile</code> containing the public FQDN and the Docker container name and port:</p>\n\n<div><div><pre><code>workshop.cam.ac.uk {\n\treverse_proxy jupyterhub:8000\n}\n</code></pre></div></div>\n\n<p>The services are defined in <code>docker-compose.yml</code>: Caddy, with volumes that preserve its TLS certificates between restarts, and <code>jupyterhub</code>, with a volume for the home directories so they are preserved and a mapping for our shared dataset.</p>\n\n<div><div><pre><code>services:\n caddy:\n image: caddy:latest\n ports:\n - \"80:80\"\n - \"443:443\"\n volumes:\n - ./Caddyfile:/etc/caddy/Caddyfile\n - caddy_data:/data\n - caddy_config:/config\n\n jupyterhub:\n build: .\n volumes:\n - ./jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py\n - jupyter_home:/home\n - tessera_data:/tessera\n\nvolumes:\n caddy_data:\n caddy_config:\n jupyter_home:\n tessera_data:\n</code></pre></div></div>\n\n<p>Reset UFW to defaults</p>\n\n<div><div><pre><code>ufw --force reset\n</code></pre></div></div>\n\n<p>Set default policies</p>\n\n<div><div><pre><code>ufw default deny incoming\nufw default allow outgoing\n</code></pre></div></div>\n\n<p>Allow SSH and HTTP(S) services</p>\n\n<div><div><pre><code>ufw allow ssh\nufw allow http\nufw allow 
https\n</code></pre></div></div>\n\n<p>Enable UFW</p>\n\n<div><div><pre><code>ufw enable\n</code></pre></div></div>\n\n<p>Check status</p>\n\n<div><div><pre><code>ufw status verbose\n</code></pre></div></div>",
  "content_type": "html",
  "author": {
    "name": "Mark Elvers",
    "email": "mark.elvers@tunbury.org",
    "uri": null
  },
  "categories": [
    "jupyter"
  ],
  "source": "https://www.tunbury.org/atom.xml"
}