{"id":285,"date":"2010-10-02T19:05:08","date_gmt":"2010-10-02T16:05:08","guid":{"rendered":"http:\/\/www.held.org.il\/blog\/?p=285"},"modified":"2011-11-07T23:04:07","modified_gmt":"2011-11-07T20:04:07","slug":"booting-linux-from-iscsi","status":"publish","type":"post","link":"http:\/\/www.held.org.il\/blog\/2010\/10\/booting-linux-from-iscsi\/","title":{"rendered":"Booting Linux from iSCSI"},"content":{"rendered":"<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">What is this long post about?<\/span><\/p>\n<p><a href=\"http:\/\/en.wikipedia.org\/wiki\/ISCSI\">iSCSI<\/a> is a standard for accessing block devices (e.g. disks) over a network, just as if they were local SCSI devices. That's similar to <a href=\"http:\/\/en.wikipedia.org\/wiki\/ATA_over_Ethernet\">AoE<\/a> and <a href=\"http:\/\/en.wikipedia.org\/wiki\/FCoE\">FCoE<\/a>, although the latter two work on the LAN only, while iSCSI runs over IP and is therefore usable over a WAN as well. This article focuses on iSCSI, but it could be used as a base for doing similar things with AoE and FCoE.<\/p>\n<p>So iSCSI, in its simplest configuration, allows us to mount and manage a data disk that is physically connected to a remote computer (the \"server\", aka <em>target<\/em>) from our own computer (the client, aka <em>initiator<\/em>).<\/p>\n<p>In this post I'll discuss the deep details of a more advanced setup: <strong>having the root (also boot) disk on a remote computer<\/strong>, so that the client can boot from it remotely. Surprisingly, it can be done even with relatively old hardware.<\/p>\n<p><!--more--><\/p>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">But.. why?<\/span><\/p>\n<p>There are several possible uses for this neat technology; I'll mention some:<\/p>\n<ul>\n<li>A live, read-write rescue disk (USB alternative: no need to physically touch the machine)\n<ul>\n<li>My original motivation was, by the way, an old machine that required rescuing but didn't support booting from USB. 
Plus I hate optical media.<\/li>\n<\/ul>\n<\/li>\n<li>Diskless machines have always been cool<\/li>\n<li>The root disk is an image file\n<ul>\n<li>Allows easy backups and, even better, snapshots of the root disk of important machines.<\/li>\n<li>In a similar fashion, it allows quickly moving\/copying the root disk from machine A to machine B, as it can be just an image file.<\/li>\n<\/ul>\n<\/li>\n<li>Polymorphic machines: a simple script could make a machine boot a specific image, run a task, and later boot another image and run another task. Very useful for automated nightly tests: why waste four machines on testing different OSes if the same machine can boot all four?<\/li>\n<\/ul>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">HW\/SW Requirements<\/span><\/p>\n<ul>\n<li>A client (<em>initiator<\/em>) machine that supports network booting (<a href=\"http:\/\/en.wikipedia.org\/wiki\/Preboot_Execution_Environment\">PXE<\/a>-<a href=\"http:\/\/en.wikipedia.org\/wiki\/Universal_Network_Device_Interface\">UNDI<\/a>). Even relatively old machines (e.g. from 2005) can do this. They don't need to support iSCSI boot directly, although native support can make the process much easier. This post discusses machines that <em>cannot<\/em> boot iSCSI natively.<\/li>\n<li>The client's OS should support booting from iSCSI. Recent Debuntu versions support that well, and I've heard that Windows 7 and 2008 do too, while Win2003 needs some tweaks.<\/li>\n<li>A server machine that acts as a DHCP server, a TFTP server and an iSCSI target. This post discusses the setup on Debuntu machines.<\/li>\n<\/ul>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">The theory in a nutshell<\/span><\/p>\n<p>So, what do we need in order to boot from iSCSI on a computer that doesn't support iSCSI boot? 
There's a quite crazy, repeating bootstrapping process:<\/p>\n<ol>\n<li>The BIOS or NIC sends a DHCP request to set up the IP network settings and find a network boot server, using the PXE-UNDI mechanisms<\/li>\n<li>The gPXE image is found and downloaded using TFTP. gPXE sends yet another DHCP request and should now find the iSCSI address of the remote boot disk<\/li>\n<li>gPXE starts as an iSCSI initiator that logs in to the iSCSI target, reads the remote boot disk's MBR and starts its boot loader (grub)<\/li>\n<li>grub loads the kernel and initrd<\/li>\n<li>The initrd sends yet another DHCP request, sets up the IP network, and uses iscsistart, which sets up the iSCSI initiator and logs in (yes, again) to the iSCSI target.<\/li>\n<li>The initrd then mounts this disk and uses pivot_root (as usual) to make it the new root<\/li>\n<li>The boot process now continues from the real root, running \/sbin\/init<\/li>\n<\/ol>\n<hr \/>\n<p>So.. Let's get going!<\/p>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">Ok, just a quick disclaimer:<\/span><\/p>\n<ul>\n<li>I wrote the following instructions partly from memory, so some parts might be imperfect. If you find any, let me know and I'll fix them.<\/li>\n<li>Don't blame me if any of the instructions below ruin your data\/life\/relationship.<\/li>\n<\/ul>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">STEP I: set up DHCP+TFTP+gPXE on the server machine:<\/span><\/p>\n<p>gPXE is a neat project that lets us boot from iSCSI and AoE. 
If your BIOS supports iSCSI or AoE boot, I guess you could skip this step.<\/p>\n<p>The following steps are my paraphrase of this <a href=\"http:\/\/etherboot.org\/wiki\/pxechaining\">gPXE chainloading howto<\/a>.<\/p>\n<ol>\n<li><strong>Install the DHCP + TFTP daemons:<\/strong><br \/>\n<blockquote><p><code>$ sudo aptitude install isc-dhcp-server atftpd<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Configure DHCP:<\/strong> in this example, the subnet is 192.168.1.0\/24, and the server is 192.168.1.100. (<strong>Beware<\/strong> of running this DHCP server in your workplace or similar, so as not to interfere with the other DHCP servers there)<br \/>\n<blockquote><p><code>subnet 192.168.1.0 netmask 255.255.255.0 {<br \/>\nallow booting;<br \/>\nallow bootp;<br \/>\nnext-server 192.168.1.100;<br \/>\nif exists user-class and option user-class = \"gPXE\" {<br \/>\nfilename \"\";<br \/>\noption root-path \"iscsi:192.168.1.100::::iqn.my-laptop:target1\";<br \/>\n} else {<br \/>\nfilename \"undionly.kpxe\";<br \/>\n}<br \/>\nrange 192.168.1.100 192.168.1.200;<br \/>\n}<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Put undionly.kpxe (the gPXE UNDI chain loader) in the tftp root:<\/strong>\n<ol>\n<li><a href=\"http:\/\/etherboot.org\/wiki\/download\">Get gPXE<\/a> and take the undionly.kpxe file from it. (needs compiling first?)<\/li>\n<li>Place it in the tftp root directory, e.g. \/srv\/tftp or \/tftproot, depending on your tftpd configuration.<\/li>\n<li>Test that everything is fine:<br \/>\n<blockquote><p><code>$ tftp localhost<br \/>\ntftp&gt; get \/undionly.kpxe<\/code><\/p><\/blockquote>\n<\/li>\n<\/ol>\n<\/li>\n<li><strong>Test booting a client machine<\/strong>: just boot a client from the network, and check that gPXE loads and tries to connect to an iSCSI target. 
As we haven't set up the target yet, it should fail at that stage; but if it doesn't even get that far, you'd better go fix that first.<\/li>\n<\/ol>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">STEP II: set up an iSCSI target which shares a disk<\/span><\/p>\n<ol>\n<li><strong>Create the disk<\/strong>. It's more fun and flexible with an image file instead of a real physical disk. Let's create a 500MiB image and represent it as a loop block device:<br \/>\n<blockquote><p><code>$ sudo dd if=\/dev\/zero of=\/data\/my_root_disk.img bs=1024k count=500<br \/>\n$ sudo losetup \/dev\/loop0 \/data\/my_root_disk.img<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Install the iSCSI target tools:<\/strong><br \/>\n<blockquote><p><code>$ sudo aptitude install iscsitarget<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Configure \/etc\/iet\/ietd.conf<\/strong> to share our block device as Lun 0 (zero) on target1:<br \/>\n<blockquote><p><code>Target iqn.my-laptop:target1<br \/>\nLun 0 Path=\/dev\/loop0,Type=fileio,ScsiId=xyz,ScsiSN=xyz<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Test the target<\/strong> by setting up an initiator to log in to the target. This can be done locally on the target machine:<br \/>\n<blockquote><p><code>$ sudo iscsiadm -m discovery -t st -p &lt;target's IP&gt;<br \/>\n$ sudo iscsiadm -m node -L all<\/code><\/p><\/blockquote>\n<p>If everything worked well, the commands above discovered and logged in to the iSCSI target, and you should see new SCSI devices in \/dev (and notes about these new devices in <em>\/var\/log\/messages<\/em>).<\/p>\n<p><strong>Note:<\/strong> from this stage on, it's possible to do bad things such as mounting the same image twice (directly or over iSCSI, e.g. 
from different initiators), so avoid doing that.<\/li>\n<\/ol>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">STEP III: Create the root disk<\/span><\/p>\n<ol>\n<li><strong>Partitioning: <\/strong>I've used the modern gpt partitions, but it should be possible with the ancient DOS partitions as well.<br \/>\nSo, using <em>parted<\/em> I created the gpt partition table, then created two partitions:<br \/>\n#1 for the grub boot loader (note the <em>bios_grub<\/em> flag; grub requires us to set this flag from <em>parted<\/em>)<br \/>\n#2 for the root disk itself<br \/>\nThis is my eventual partition table:<br \/>\n<blockquote><p><code>Disk \/data\/my_root_disk.img: 419MB<br \/>\nSector size (logical\/physical): 512B\/512B<br \/>\nPartition Table: gpt<br \/>\nNumber \u00a0Start \u00a0 End \u00a0 \u00a0Size \u00a0 File system \u00a0Name \u00a0Flags<br \/>\n1 \u00a0 \u00a0 \u00a017.4kB \u00a0132kB \u00a0114kB \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 grub \u00a0bios_grub<br \/>\n2 \u00a0 \u00a0 \u00a0132kB \u00a0 419MB \u00a0419MB \u00a0ext4 \u00a0 \u00a0 \u00a0 \u00a0 root<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Create the root filesystem on partition 2 and mount it on <em>\/mnt\/tmp<\/em>: <\/strong>Unfortunately it doesn't seem easy to access a partition on a loop device (i.e. \/dev\/loop0p1 doesn't show up). <strong>[Update: see Petr's comment below, it IS possible in modern kernels] <\/strong>So my quick-and-dirty trick was simply doing it over local iSCSI, as mentioned in STEP II-4. Then we get \/dev\/sdX1 and \/dev\/sdX2; see <em>\/var\/log\/messages<\/em> to find what X stands for. If you see multiple devices which seem the same (e.g. both sdc and sdd), that's because there are multiple IP paths to them (e.g. 
127.0.0.1 and ::1 of IPv6); any of them would do.<br \/>\n<blockquote><p><code>$ sudo mkfs.ext4 \/dev\/sdX2 # double-check the device name: a mistake here could ruin your life<br \/>\n$ sudo mount \/dev\/sdX2 \/mnt\/tmp # make sure it's not already mounted anywhere<br \/>\n<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Populate the root filesystem<\/strong>: I used the amazing <em><a href=\"http:\/\/wiki.debian.org\/Debootstrap\">debootstrap<\/a> <\/em>tool to put Debian sid on it:<br \/>\n<blockquote><p><code><br \/>\n$ sudo debootstrap sid \/mnt\/tmp<br \/>\n<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Prepare to chroot inside the new root filesystem:<\/strong> the chroot will be useful for several of the following steps. But we'd better also have \/dev, \/sys, \/proc there:<br \/>\n<blockquote><p><code>$ sudo mount -o bind \/dev \/mnt\/tmp\/dev<br \/>\n$ sudo mount -o bind \/sys \/mnt\/tmp\/sys<br \/>\n$ sudo mount -o bind \/proc \/mnt\/tmp\/proc<br \/>\n<\/code><\/p><\/blockquote>\n<p>Now chroot:<\/p>\n<blockquote><p><code># chroot \/mnt\/tmp<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Update the initial ram fs (initramfs) to support iSCSI root<\/strong> from within the chrooted environment:<br \/>\nAs the initrd's responsibility is to mount the real final root device, and our real final root device is on iSCSI, the initrd should have iSCSI capabilities. 
The initramfs of recent Debuntu versions is capable of that.\n<ol>\n<li><strong>Enable iscsi-initramfs:<\/strong> This is done by setting up the <em>\/etc\/iscsi\/iscsi.initramfs<\/em> file, making sure it exists and contains a <em>unique<\/em> IQN (can be generated by the <em>iscsi-iname<\/em> tool) in this format:<br \/>\n<blockquote><p><code>InitiatorName=&lt;unique IQN&gt;<\/code><\/p><\/blockquote>\n<\/li>\n<li><strong>Create the new initrd:<\/strong><br \/>\n<blockquote><p><code># update-initramfs -u<\/code><\/p><\/blockquote>\n<\/li>\n<\/ol>\n<\/li>\n<li><strong>Install the grub boot loader on the root disk:<\/strong> I believe it's best to do this from the chrooted environment as well:<br \/>\n<blockquote><p><code>If grub is not there, install it:<br \/>\n# aptitude install grub<br \/>\nInstall grub boot code on the MBR:<br \/>\n# grub-install \/dev\/sdX<\/code><\/p><\/blockquote>\n<\/li>\n<\/ol>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">STEP IV: Boot the client<\/span><br \/>\nI hope it all worked \ud83d\ude09<br \/>\nPlease comment about your experiences, your additions or any mistakes you've found in this post.<\/p>\n<p><span style=\"font-weight: bold; color: #666; font-size: 125%;\">References<\/span><\/p>\n<ul>\n<li><a href=\"http:\/\/etherboot.org\/wiki\/sanboot\">Good HOWTOs for various SAN-boot configurations for a variety of operating systems<\/a>.<\/li>\n<li><a href=\"http:\/\/www.thogan.com\/site2\/archives\/10\">Booting Windows 7 from SAN<\/a> while also taking LVM snapshots of the Windows root disk.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>What is this long post about? iSCSI is standard for accessing block devices (e.g. disks) over network, just as if they were local SCSI devices. That&#8217;s similar to AoE and FCoE, although the latter two are good for the LAN only, while iSCSI is over IP thus is good on WAN. 
This article would focus [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[8],"tags":[143,141,32,147,198,144,140,203,145,142,146,119],"_links":{"self":[{"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/posts\/285"}],"collection":[{"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/comments?post=285"}],"version-history":[{"count":0,"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/posts\/285\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/media?parent=285"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/categories?post=285"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.held.org.il\/blog\/wp-json\/wp\/v2\/tags?post=285"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}