Having fun with Qemu, Busybox, and Linux.
Clemens Lahme <email@example.com>
Build a tiny functional Linux OS from source by yourself in about half an hour.
Original source: http://techinvest.li/tinux/tiny_linux.html
Get the Source Stuff
Get a kernel and check its authenticity. Older kernels have less bloat than newer ones, and for this exercise we don't need recent security fixes. Version 4.4 is no longer supported, but it was a long-term support kernel from 2016 until 2022. You can grab a newer one if you like.
mkdir tiny    # Make a new project directory.
cd tiny       # Work there.
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.4.302.tar.xz
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.4.302.tar.sign    # PGP signature.
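If you do pick a different kernel version, the download URLs change in two places: the version itself and the major-version directory. The following is a minimal sketch of a helper that builds them. The function name kernel_url is made up for illustration, and it assumes the v<major>.x directory layout seen in the URLs above also holds for the version you choose.

```shell
# Hypothetical helper: print the cdn.kernel.org URL of a kernel tarball.
# $1 = full version such as 4.4.302, $2 = file suffix (default: tar.xz).
# Assumption: the tarball lives in a v<major>.x directory, as in the URLs above.
kernel_url() {
    version=$1
    suffix=${2:-tar.xz}
    major=${version%%.*}    # 4.4.302 -> 4
    printf 'https://cdn.kernel.org/pub/linux/kernel/v%s.x/linux-%s.%s\n' \
        "$major" "$version" "$suffix"
}

# Usage, mirroring the two downloads above:
#   wget "$(kernel_url 4.4.302)"
#   wget "$(kernel_url 4.4.302 tar.sign)"
```

Keeping the version in one place makes it harder to fetch a tarball and a signature that do not belong together.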
And next we get the busybox source code (for branch 1.36.0), which after compilation packages the Unix shell and its basic commands into a single binary.
git clone -b 1_36_0 https://git.busybox.net/busybox busybox-1.36.0
Verify the Source
unxz -k linux-4.4.302.tar.xz    # tar file for the signature, but keep xz file around.
gpg --verify linux-4.4.302.tar.sign
The output should be something like:
gpg: assuming signed data in 'linux-4.4.302.tar'
gpg: Signature made Thu 03 Feb 2022 09:29:04 AM CET
gpg:                using RSA key 647F28654894E3BD457199BE38DBBDC86092693E
gpg: checking the trustdb
gpg: marginals needed: 3 completes needed: 1 trust model: pgp
gpg: depth: 0 valid: 4 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 4u
gpg: Good signature from "Greg Kroah-Hartman <firstname.lastname@example.org>" [ultimate]
gpg:                 aka "Greg Kroah-Hartman <email@example.com>" [ultimate]
gpg:                 aka "Greg Kroah-Hartman (Linux kernel stable release signing key) <firstname.lastname@example.org>" [ultimate]
Now we can remove the tar file again, leaving just the xz version for later use with tar.
And for busybox, let's make sure that we both got the identical version by comparing your last git commit with the commit I got and signed (my public key is here).
cd busybox-1.36.0/
cat > git.log.asc <<EOF
-----BEGIN PGP SIGNATURE-----

iQJMBAABCAA6FiEEUM1qhcDHq+cYGhvwlRStT3MlFn4FAmPjzS0cHGNsZW1lbnMu
bGFobWVAdGVjaGludmVzdC5saQAKCRCVFK1PcyUWfhl1D90ftjrfBq02wcIlxf3j
fe/8QSfHPZdYgR/LOzvMTwjKWhXDSzrjg1GP80YIoAO/TGKIjhhSnS/bQfqKIeED
tnMGmzBmHeGMakHBb8F6nqd9ksKCMGX5xF4LTQmf7RX3FL4VCyhKUTMv9am3FTeh
Y/+67hta2OFDPXc/A5LVNUIkunn7Anc7t7B7/53etA89swt8QDeflH9/QiaOUxwt
3yVIX1dPXhEMGMdtxcKcLgJMA3378IwT3lDWRLpOr40mT2PI08HAF5hCroPlysLl
gYxfgEuh9/mofFORFJiO/WaonhHJyvc20m/P6Yhghb61JUyIvUR39O8ZjLG2szzU
IPs4qMHnb1PxuAKHOae1joW2n95WNAqns9Y2ZAU/MRWVovKr5LlhYpJgnn4IYUNp
cSYYyUVbU0rLhE9OVsMHuP1IPZDVJ+a3V5v3btYErIMKsYYA1lW4sy7f6gV0mbP/
LrE8X7C5A3HlmlkEPPZUYT7oB6HpMpMWmjH+jsNXd8B14BVMqGRfhDquNHD3RXxi
wZ5u4rix3OR2EdVQqo/iKSqPjwdideUuwViH0/b0sjkKSdj8gXidDmBZ1LE5quag
HsB10IQlW+YLWGfphl/bDDVBtIilcpyYoFXsx6xRpBDB/eN8PyuRf8Byofqgt5Cn
DvoshwZh4Bdv1xwC6vE7
=kVD/
-----END PGP SIGNATURE-----
EOF
git log --no-decorate -n 1 | tee git.log
gpg --verify git.log.asc
Which should result in something like (assuming you trusted my public PGP key before):
commit 70f77e4617e06077231b8b63c3fb3406d7f8865d
Author: Denys Vlasenko <email@example.com>
Date:   Tue Jan 3 15:15:41 2023 +0100

    Bump version to 1.36.0

    Signed-off-by: Denys Vlasenko <firstname.lastname@example.org>
gpg: assuming signed data in 'git.log'
gpg: Signature made Wed 08 Feb 2023 05:26:21 PM CET
gpg:                using RSA key 50CD6A85C0C7ABE7181A1BF09514AD4F7325167E
gpg:                issuer "email@example.com"
gpg: Good signature from "Clemens Lahme <firstname.lastname@example.org>" [ultimate]
If everything went smoothly, proceed to the next level.
Build the Kernel
tar xf linux-4.4.302.tar.xz
cd linux-4.4.302
make mrproper      # Clean up before, just in case.
make tinyconfig    # Turns on as few options as imaginable, even fewer than 'allnoconfig' does.
grep =y .config | wc -l
And we get 202 options activated in the kernel config, like 32 bit and X86.
Now this kernel won't do much yet. We want at least console output to see what the kernel is doing, so we turn on the two options relevant to it and make sure that the options these two depend on are turned on as well. As we are developing in 2023 on x86-64 machines, let's also first turn on 64-bit support, since everything else gets compiled for it by default.
./scripts/config --set-val CONFIG_64BIT y
./scripts/config --set-val CONFIG_PRINTK y
./scripts/config --set-val CONFIG_TTY y
make olddefconfig
grep =y .config | wc -l
Furthermore the kernel must be able to execute some binary code as well as shell scripts to do something useful. So we turn on ELF support and shebang support for shell scripts (this #! magic at the beginning of a script file).
./scripts/config --set-val CONFIG_BINFMT_ELF y
./scripts/config --set-val CONFIG_BINFMT_SCRIPT y
make olddefconfig
grep =y .config | wc -l
Now that the kernel knows how to start an executable or script, it also needs a place to find them. So we need support for a temporary file system, the so-called initial RAM file system, or initramfs (the successor of initrd). If you use Qemu with a real disk image that contains both the root file system and the kernel image itself, you don't need initramfs/initrd support. In our case, however, we can skip creating a proper disk image by invoking Qemu without one and booting the kernel directly, letting an initramfs stand in for the on-disk file system for the most basic stuff.
./scripts/config --set-val CONFIG_BLK_DEV_INITRD y
make olddefconfig
grep =y .config | wc -l
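Counting the =y lines shows that something changed, but not whether the options we asked for survived: make olddefconfig drops an option again if its dependencies are not met. As a minimal sketch, a small helper can check for the specific options by name. The function name check_config is made up for illustration and is not part of the kernel tree.

```shell
# Hypothetical helper: report which of the given options are not set to 'y'.
# $1 = path to a kernel .config, remaining arguments = option names.
# Returns 0 only if every listed option is enabled.
check_config() {
    config=$1
    shift
    missing=0
    for opt in "$@"; do
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "not enabled: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Usage inside the kernel directory:
#   check_config .config CONFIG_64BIT CONFIG_PRINTK CONFIG_TTY \
#       CONFIG_BINFMT_ELF CONFIG_BINFMT_SCRIPT CONFIG_BLK_DEV_INITRD
```

If the helper prints nothing and returns 0, all six options from the steps above made it into the final config.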
So, finally, let's build the kernel by using all the computer cores and threads available to us while also timing the whole thing.
time make -j$(nproc)
A few minutes later we get something like this output...
...
  BUILD   arch/x86/boot/bzImage
Setup is 15260 bytes (padded to 15360 bytes).
System is 469 kB
CRC fbd7975d
Kernel: arch/x86/boot/bzImage is ready  (#1)

real    0m52.611s
user    2m39.835s
sys     0m21.433s
So in the next step we build Busybox, so that we have a shell to invoke for the Linux kernel.
We are still located in the kernel directory. So move over to our busybox repository and create a default config and make sure we will build a static binary.
cd ../busybox-1.36.0
make defconfig
echo CONFIG_STATIC=y >> .config
time make -j$(nproc)
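Appending the option leaves the .config with two entries for CONFIG_STATIC, the original "is not set" marker and the new line, and relies on the later one winning. A minimal sketch of an alternative is to rewrite the marker in place. The function name enable_option is made up for illustration. It only handles plain on/off options and assumes the file uses the usual "# CONFIG_X is not set" marker.

```shell
# Hypothetical helper: enable a boolean option in a kconfig-style file.
# $1 = config file, $2 = option name (e.g. CONFIG_STATIC).
enable_option() {
    file=$1
    opt=$2
    if grep -q "^# ${opt} is not set\$" "$file"; then
        # Rewrite the "is not set" marker in place (via a temporary copy).
        sed "s/^# ${opt} is not set\$/${opt}=y/" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    elif ! grep -q "^${opt}=y\$" "$file"; then
        # No marker and not yet enabled: fall back to appending the option.
        echo "${opt}=y" >> "$file"
    fi
}

# Usage in the busybox directory, after 'make defconfig':
#   enable_option .config CONFIG_STATIC
```

Either way, it is worth checking afterwards that the resulting binary really is statically linked, since the initramfs contains no shared libraries.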
The result is the busybox binary, placed directly in the current directory. We use it in the next step for our initramfs.
Create the Initramfs
So let's leave the busybox directory for the parent tiny directory, copy the freshly built busybox binary into a new initramfs tree, and pack everything into a gzip-compressed cpio archive. Then we hop back out into the top directory.
cd ..
mkdir -p initramfs/bin
cp -p busybox-1.36.0/busybox initramfs/bin/
cd initramfs/bin
ln -s busybox sh
cd ..
find . | cpio -ov --format=newc | gzip -9 >../initramfs.cpio.gz
cd ..
Now the moment of truth approaches. With both the kernel and initramfs files available we invoke Qemu with them both directly.
qemu-system-x86_64 -kernel ./linux-4.4.302/arch/x86/boot/bzImage -initrd ./initramfs.cpio.gz
By default the kernel looks for a program /init. If there is none, it tries /sbin/init, /etc/init, and /bin/init in turn, and as a last resort /bin/sh, which we provided in the form of busybox.
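If you would rather have the kernel find an /init than fall back to the shell, a small script is enough, since we turned on shebang support earlier. The following is an untested sketch, not part of the original setup: the function name write_init and the welcome message are made up for illustration, and it assumes the busybox binary was built with its --install option (which the default config should include).

```shell
# Hypothetical helper: write a minimal /init script into an initramfs tree.
# $1 = initramfs directory (the one that contains bin/busybox).
write_init() {
    root=$1
    cat > "$root/init" <<'EOF'
#!/bin/sh
# Create symlinks for all busybox applets so commands like ls work.
/bin/busybox --install -s /bin
echo "Welcome to tiny Linux"
# Replace this script with an interactive shell.
exec /bin/sh
EOF
    chmod +x "$root/init"
}

# Usage in the tiny directory, before running the find | cpio step:
#   write_init initramfs
```

Because the shell then runs as process 1, leaving it with exit makes the kernel panic, which in this toy setup is simply the signal to close Qemu.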