Ever wondered how computers start up, or where the term booting comes from in the first place? I was experimenting with boot loaders (in its simplest form, a boot loader is a program that loads an operating system from a disk into memory) and came across several core concepts behind the boot process of a computer.
The word booting is generally used for the set of steps a computer performs after it is powered up, until it becomes fully functional for use. Modern computers do this in a matter of seconds. Historically, the term comes from a paradox: if you want to load software into a computer's memory using other software inside the same computer, something has to load that first piece of software too. Loading here means moving software into primary memory from somewhere else. The solution is analogous to the phrase "pull oneself up by one's bootstraps".
Basically, when the computer is powered up, its primary memory is empty (or filled with garbage, which can result in unexpected behaviour if executed). So, once ROMs became popular in the industry, engineers put a ROM on the board with a standard initialisation program that loads a specific set of programs into primary memory. Historically this program is called the BIOS (Basic Input/Output System). With the advancements in computing, more advanced systems like UEFI were designed to replace the BIOS, although many UEFI implementations can still boot in a BIOS-compatible mode. Since the BIOS or UEFI is written to an on-board ROM chip, it is called firmware rather than software or a program.
In simple words, the job of this firmware is to load a program, generally called a boot loader, into primary memory. It is this boot loader that gives booting its name.
A boot loader is, in simple words, a small program that lives in the boot sector of one of the disks available on the system. We will see what a boot sector is in the next section. The boot loader code must fit in 510 bytes or less. Not much can be done within that size limit, apart from perhaps printing some messages and handing the rest of the job to a more capable program. This is why booting is done in multiple stages. We will look at those stages later, but this 510-byte program is generally called the first stage boot loader, or simply the boot loader.
Disks (floppy disks, to be precise) are physically divided into small sections called sectors. The main reason for dividing disks into sectors is to make it quick to locate a specific file. For example, one could say that a file is at Disk-0, Track-3, Sector-19. Such a quick access mechanism was needed because the read-write head of a disk drive is moved by electric motors, and there is always some delay in moving it from one place to another across the disk. Sectors are traditionally 512 bytes in size on both hard disks and floppies. A boot sector is simply a sector that contains a boot loader program followed by the boot signature. The boot signature is the magic number 0xAA55, which must occupy the last two bytes of the boot sector. The boot sector must also be the first sector of the disk. A disk whose first sector holds a boot loader program is called a boot disk.
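To make the layout concrete, here is a minimal host-side sketch in Python of how a 512-byte boot sector could be assembled from raw boot code and then checked. The names `build_boot_sector` and `is_boot_sector` are hypothetical helpers made up for illustration; they are not part of any tool used in this post.

```python
SECTOR_SIZE = 512              # traditional sector size, in bytes
BOOT_SIGNATURE = b"\x55\xaa"   # 0xAA55 as stored on disk (low byte first)


def build_boot_sector(code: bytes) -> bytes:
    """Pad boot loader code with zeros and append the boot signature."""
    max_code = SECTOR_SIZE - len(BOOT_SIGNATURE)  # 510 bytes left for code
    if len(code) > max_code:
        raise ValueError(f"boot code is {len(code)} bytes, limit is {max_code}")
    return code.ljust(max_code, b"\x00") + BOOT_SIGNATURE


def is_boot_sector(sector: bytes) -> bool:
    """Return True if the sector is 512 bytes long and ends with 0x55 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[-2:] == BOOT_SIGNATURE


if __name__ == "__main__":
    # 0xEB 0xFE is the x86 encoding of an endless `jmp $` loop.
    sector = build_boot_sector(b"\xeb\xfe")
    print(len(sector), is_boot_sector(sector))
```

Reading the first 512 bytes of a disk image and passing them to `is_boot_sector` would be one way to tell whether the image looks bootable.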
The image above shows the contents of a boot sector. Notice that the last two bytes are equal to 0xAA55. The rest of the 512 bytes holds the boot loader code. Also notice that the bytes actually appear as 0x55 followed by 0xAA, because x86 is little-endian: the lower byte of a 16-bit value is stored first.
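The byte order can be checked quickly on any machine with Python's standard `struct` module. This is only an illustration of little-endian storage and has nothing to do with how the BIOS itself reads the sector.

```python
import struct

# "<H" means: little-endian, unsigned 16-bit integer.
on_disk = struct.pack("<H", 0xAA55)
print(on_disk.hex(" "))   # 55 aa

# Going the other way: the two bytes as they sit on disk, read back as one value.
(value,) = struct.unpack("<H", bytes([0x55, 0xAA]))
print(hex(value))         # 0xaa55
```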
Since a boot loader has to fit in a single 512-byte sector, not much can be done by the boot loader program itself. But it can load other programs into primary memory for execution, and this gives us an opportunity: the boot loader can load a more capable program into memory, and that program is not bound by the 512-byte limit. Most modern operating systems use this approach. The boot loader loads another, more capable program, that program loads more files and programs, and so on. The booting process is therefore described in stages: the first 512-byte boot loader is the first stage boot loader, the next is the second stage boot loader, and so on.
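Disks are read in whole sectors, so a first stage usually needs to know how many sectors the next stage occupies. Here is a small Python sketch of that arithmetic, assuming the second stage is stored in consecutive 512-byte sectors. The function name `sectors_needed` is made up for illustration.

```python
SECTOR_SIZE = 512  # bytes per sector


def sectors_needed(image_size: int) -> int:
    """Number of whole sectors needed to hold `image_size` bytes."""
    if image_size < 0:
        raise ValueError("image size cannot be negative")
    # Round up: a partly filled last sector still has to be read in full.
    return (image_size + SECTOR_SIZE - 1) // SECTOR_SIZE


if __name__ == "__main__":
    for size in (1, 512, 513, 4096):
        print(size, sectors_needed(size))
```

For example, a 513-byte second stage needs two sectors, because the single extra byte spills into a second one.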
I have been experimenting with boot loaders and managed to write a simple one that any BIOS-compatible x86 computer should load automatically. For this experiment, I created a VirtualBox VM with a basic x86 configuration. Then I wrote a simple boot loader and assembled it with NASM. Here is the code.
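org 0x7c00                  ; assume the code is loaded at address 0x7c00

start:
    xor ax, ax
    mov ds, ax              ; DS = 0, so [si] matches the org-based addresses
    mov si, message         ; SI points to the string to print
    call print_string
    jmp $                   ; hang here forever

print_string:               ; prints the zero-terminated string at DS:SI
    pusha
.next_char:
    mov al, [si]            ; fetch the next character
    cmp al, 0               ; a zero byte marks the end of the string
    je .done
    call print_char
    add si, 1
    jmp .next_char
.done:
    popa
    ret

print_char:                 ; prints the character in AL
    pusha
    mov ah, 0x0e            ; BIOS teletype output function
    int 0x10                ; BIOS video interrupt
    popa
    ret

message: db 'Booting OS...', 0

times 510-($-$$) db 0       ; pad with zeros up to byte 510
dw 0xaa55                   ; boot signature in the last two bytes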
The first line of the program is an assembler directive. It tells the assembler to compute addresses on the assumption that the program will be loaded at address 0x7c00, which is where the BIOS places the boot sector. The label message marks a string, Booting OS..., and the rest of the program does nothing but print that message to the computer's console in text mode. Notice that no OS-provided interrupt routines are available at this stage, because there is no OS at this level, of course. Interrupt 0x10 is provided by the BIOS, and with AH set to 0x0e it prints a character to the text-mode console. Here is the result of this experiment.
This example boot loader does nothing but print a message to the console. As you can see, even a trivial task like this takes a lot of effort and an understanding of the underlying system. Some call this bare-metal programming, as there is not much abstraction available compared to other programming environments. The next task would obviously be to create a second stage boot loader, then load and execute it from this level. We might also want to switch to 32-bit protected mode, since the 16-bit real mode in which all first stage boot loaders run offers no memory protection. I will keep posting updates on this blog.
Thanks for reading.