1{ 2 "id": "https://www.tunbury.org/2025/04/19/gluster", 3 "title": "Gluster", 4 "link": "https://www.tunbury.org/2025/04/19/gluster/", 5 "updated": "2025-04-19T00:00:00", 6 "published": "2025-04-19T00:00:00", 7 "summary": "Gluster is a free and open-source software network filesystem. It has been a few years since I last looked at the project, and I was interested in taking another look. Some features, like automatic tiering of hot/cold data, have been removed, and the developers now recommend dm-cache with LVM instead.", 8 "content": "<p>Gluster is a free and open-source software network filesystem. It has been a few years since I last looked at the project, and I was interested in taking another look. Some features, like automatic tiering of hot/cold data, have been removed, and the developers now recommend <code>dm-cache</code> with LVM instead.</p>\n\n<p>I am going to use four QEMU VMs on which I have installed Ubuntu via PXE boot. For easy repetition, I have wrapped my <code>qemu-system-x86_64</code> commands into a <code>Makefile</code>.</p>\n\n<div><div><pre><code>machine: disk0.qcow2 disk1.qcow2 OVMF_VARS.fd\n qemu-system-x86_64 -m 8G -smp 4 -machine accel=kvm,type=pc -cpu host -display none -vnc :11 \\\n -drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE.fd \\\n -drive if=pflash,format=raw,file=OVMF_VARS.fd \\\n -serial stdio \\\n -device virtio-scsi-pci,id=scsi0 \\\n -device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \\\n -drive file=disk0.qcow2,if=none,id=drive0 \\\n -device scsi-hd,drive=drive1,bus=scsi0.0,channel=0,scsi-id=1,lun=0 \\\n -drive file=disk1.qcow2,if=none,id=drive1 \\\n -net nic,model=virtio-net-pci,macaddr=02:00:00:00:00:11 \\\n -net bridge,br=br0\n\ndisk%.qcow2:\n qemu-img create -f qcow2 $@ 1T\n\nOVMF_VARS.fd:\n cp /usr/share/OVMF/OVMF_VARS.fd OVMF_VARS.fd\n\nclean:\n rm -f *.qcow2 OVMF_VARS.fd\n</code></pre></div></div>\n\n<p>Gluster works on any file system that supports extended attributes <em>xattr</em>, which includes <code>ext[2-4]</code>. However, XFS is typically used as it performs well with parallel read/write operations and large files. I have used 512-byte inodes, <code>-i size=512</code>, which is recommended as this creates extra space for the extended attributes.</p>\n\n<div><div><pre><code>mkfs.xfs <span>-i</span> <span>size</span><span>=</span>512 /dev/sdb\n<span>mkdir</span> <span>-p</span> /gluster/sdb\n<span>echo</span> <span>\"/dev/sdb /gluster/sdb xfs defaults 0 0\"</span> <span>&gt;&gt;</span> /etc/fstab\nmount <span>-a</span>\n</code></pre></div></div>\n\n<p>With the filesystem prepared, install and start Gluster. Gluster stores its settings in <code>/var/lib/glusterd</code>, so if you need to reset your installation, stop the gluster daemon and remove that directory.</p>\n\n<div><div><pre><code>apt <span>install </span>glusterfs-server\nsystemctl <span>enable </span>glusterd\nsystemctl start glusterd\n</code></pre></div></div>\n\n<p>From one node, probe all the other nodes. You can do this by IP address or by hostname.</p>\n\n<div><div><pre><code>gluster peer probe node222\ngluster peer probe node200\ngluster peer probe node152\n</code></pre></div></div>\n\n<p><code>gluster pool list</code> should now list all the nodes. 
<code>localhost</code> indicates your current host.</p>\n\n<div><div><pre><code>UUID Hostname State\n8d2a1ef0-4c23-4355-9faa-8f3387054d41 node222 Connected\n4078f192-b2bb-4c74-a588-35d5475dedc7 node200 Connected\n5b2fc21b-b0ab-401e-9848-3973121bfec7 node152 Connected\nd5878850-0d40-4394-8dd8-b9b0d4266632 localhost Connected\n</code></pre></div></div>\n\n<p>Now we need to add a volume. A Gluster volume can be distributed, replicated or dispersed. It is possible to have mix distributed with the other two types, giving a distributed replicated volume or a distributed dispersed volume. Briefly, distributed splits the data across the nodes without redundancy but gives a performance advantage. Replicated creates 2 or more copies of the data. Dispersed uses erasure coding, which can be considered as RAID5 over nodes.</p>\n\n<p>Once a volume has been created, it needs to be started. The commands to create and start the volume only need to be executed on one of the nodes.</p>\n\n<div><div><pre><code>gluster volume create vol1 disperse 4 transport tcp node<span>{</span>200,222,223,152<span>}</span>:/gluster/sdb/vol1\ngluster volume start vol1\n</code></pre></div></div>\n\n<p>On each node, or on a remote machine, you can now mount the Gluster volume. Here I have mounted it to <code>/mnt</code> from the node itself. All writes to <code>/mnt</code> will be dispersed to the other nodes.</p>\n\n<div><div><pre><code>echo \"localhost:/vol1 /mnt glusterfs defaults 0 0\" &gt;&gt; /etc/fstab\nmount -a\n</code></pre></div></div>\n\n<p>The volume can be inspected with <code>gluster volume info</code>.</p>\n\n<div><div><pre><code>Volume Name: vol1\nType: Disperse\nVolume ID: 31e165b2-da96-40b2-bc09-e4607a02d14b\nStatus: Started\nSnapshot Count: 0\nNumber of Bricks: 1 x (3 + 1) = 4\nTransport-type: tcp\nBricks:\nBrick1: node200:/gluster/sdb/vol1\nBrick2: node222:/gluster/sdb/vol1\nBrick3: node223:/gluster/sdb/vol1\nBrick4: node152:/gluster/sdb/vol1\nOptions Reconfigured:\nnetwork.ping-timeout: 4\nstorage.fips-mode-rchecksum: on\ntransport.address-family: inet\nnfs.disable: on\n</code></pre></div></div>\n\n<p>In initial testing, any file operation on the mounted volume appeared to hang when a node went down. This is because Gluster has a default timeout of 42 seconds. This command will set a lower value:</p>\n\n<div><div><pre><code>gluster volume set vol1 network.ping-timeout 4\n</code></pre></div></div>\n\n<p>The video below shows the four VMs running. One is writing random data to <code>/mnt/random</code>. The other machines are running <code>ls -phil /mnt</code> so we can watch the file growing. <code>node222</code> is killed, and after the 4-second pause, the other nodes continue. When the node is rebooted, it automatically recovers.</p>\n\n\n\n<blockquote>\n <p>While I used 4 nodes, this works equally well with 3 nodes.</p>\n</blockquote>", 9 "content_type": "html", 10 "author": { 11 "name": "Mark Elvers", 12 "email": "mark.elvers@tunbury.org", 13 "uri": null 14 }, 15 "categories": [ 16 "Gluster,Ubuntu", 17 "tunbury.org" 18 ], 19 "source": "https://www.tunbury.org/atom.xml" 20}