Marco Scilipoti's blog

Table of contents
Introduction
Design and Challenges Encountered
Implementation of test programs and their modifications
Limitations
Conclusion
Appendix
- Compilation and execution instructions
- Citations of code snippets reused from other sources

Introduction

The following report presents the operating system developed for Operating Systems 72.11. The name comes from the Argentinian footbaler Pipita Higuain, who repeatedly failed penalties and became a meme, thus gdh comes to represent 'gol de higuain'—getting this thing working was just as likely as a penalty goal of Higuain.

Among the main functionalities we implemented:

Two memory managers (defined in build time)
Scheduler supporting process creation/blocking/killing
Inter-process communication (IPC) mechanisms
Process synchronization (semaphores).
System calls and exceptions handling through IDT table.

To achieve this, we worked based on the knowledge developed in the first half of 72.11 on UNIX-type Operating Systems as well as the kernel built for Computer Architecture 72.08

Design and Challenges Encountered

Memory Manager

Based on the Linux kernel, the memory manager relies on the Buddy Algorithm. The primary limitation of this algorithm is directly tied to its design, the way memory division is defined. The memory is divided into powers of two, allowing the assigned blocks to be represented in the form of a tree (thus making search and deletion O(log n)). However, this also means that memory block sizes grow exponentially. For instance, if we want 66Kb (2^16), the block that the algorithm would provide is the next power, i.e., 122Kb (2^17). This issue is known as internal fragmentation.

The other algorithm presented is based on a doubly linked list that allocates memory in a First fit way. This algorithm goes through memory blocks and returns the first one that can store the required memory. It's important to note that the implementation of this algorithm lays the groundwork to implement realloc without major difficulties.

This section of the operating system was extremely insightful for understanding how dynamic memory works. From the library malloc, free, to the system calls, to the kernel level management of memory, it's something that looks like magic, but it is just engineering.

Scheduler

The scheduling algorithm implemented is Round Robin (RR) with priorities. It has five priorities, with 0 being the highest and 4 being the lowest. When a new process is added, it starts with an initial priority of 2, and the user can raise or lower this priority using the nice command. Processes can also be blocked using the block system call, which will remove the process from the RR and store it in a list of blocked processes. Each process has its table of file descriptors, which can be modified using dup2. It's worth noting that when a new process is created, then its file descriptors are inherited from the parent process, analogous to processes in UNIX. Also, each process is associated with its PID and the type of process it is, with 0 being a foreground process and 1 a background process.

In terms of user space applications, it is possible to run programs both in foreground or background much like in Linux systems. For example, the following command will run the program test in the background with the argument memory (to test the memory manager), thus allowing the user to interact with the operating system as the test in being executed.

user@device ~$ &test memory

Synchronization

In terms of process synchronization a core design requirement was that synchronization must be independent of the processes that use it, meaning the kernel act as a mediator between a set of independent processes. In the kernel, we implemented a series of syscalls, which perform the following actions:

wait: Decreases the semaphore by 1, and if it gets a value less than zero, the process is blocked until another process wakes it up using signal (see next item).
signal: Increases the semaphore by 1, and if its original value was zero, it wakes up a process that was blocked in a wait.
open: Simply, it initializes and returns a semaphore if it doesn't exist with that id, or it just returns the existing semaphore with that id.
close: It unlinks the calling process from that semaphore, and if it is the last linked process, the semaphore releases its resources and completely closes.
state: Lists all active semaphores with: their id, number of processes using them, and value.

Two main difficulties were encountered.

Firstly, we had to carefully model the behavior of semWait(semaphore) and semSignal(semaphore), as various cases arose, such as what happens when semSignal wakes up a process, in what order are the processes awakened? Among other cases. For this, we implemented a FIFO (First In First Out) type data structure called pidQueue. This way, the process that performs a wait with the semaphore value at 0, its PID will be sent to the pidQueue in order. When a semSignal is performed, if I have any process waiting, I will not increase the semaphore value, as I will take the first process from pidQueue to unblock it and let it continue.

Secondly, even though the OS is single-core for the implementation of synchronization mechanisms, it was a design requirement to consider the situation where we might have another CPU core that also wants to access the information of the same semaphore. For this, we have used the Spin Lock algorithm. This algorithm guarantees us through the xchg instruction, which allows us to block the data bus while a value is read, and another one is written to the same memory address. In this way, it ensures that only one core can 'reserve' the turn to operate on a semaphore, and there is no overlap when operating with the same semaphore.

Inter Process Communication

In terms of IPC, the kernel provides pipes through pipeEngine.h, and to support the pipes correclty, an entire file descriptor system is implemented, the logic of which is defined in fileDescriptorManager.h. Both systems maintain circular array, so we can access information from both pipes and file descriptors in O(1), and maintain simplicity in the overall design. Also, the fact that it is a circular array means that as pipes/fds are closed, they can be reused by the same system.

An important design decision was to handle file descriptors as integers, not pointers, encapsulating all the logic within the fileDescriptorManager, which considerably simplifies the handling of file descriptors and, therefore, pipes.

Finally, it's worth noting that the dup2 function does not receive the pid to modify its file descriptor table, it can only modify its own (otherwise there is a security issue in being able to modify the file descriptor table of another process).

In terms of user space application of IPC there are different processes that support redirection of standard input and output to another process (much like linux). For example, the command

user@device ~$ man phylo | wc

will redirect the standard output of man phylo to wc

User Space Applications

We have implemented multiple applications in user testing, many of them simply run tests or are wrappers for system calls. The most interesting of which is `phylo.

This programs it aims to simulate the concurrency problem of the dining philosophers. The variation of the problem implemented considers that the number of philosophers can vary during execution between 5 and 10. The main difficulty was to avoid incurring any deadlock when adding or removing philosophers. As there are multiple solutions to this problem with a fixed number of philosophers, we have taken the asynchronous solution where each philosopher takes the cutlery in a different order depending on whether they are in an even or odd position.

To add or remove philosophers without inhibiting all of them, a semaphore has been implemented for each one, where the manager of these performs semWait to block a specific semaphore. In the same way, to unlock a new philosopher, a simple semSignal is done. Given this solution, the philosopher will always finish thinking-eating-sleeping before being blocked. This way we avoid any possible inconsistent state.

Implementation of test programs and their modifications

For the implementation of the tests, we decided to create a process which will receive through an argument which test we want to run. To run the tests, the command is as follows:

user@device ~$ test {nameTest}

Where nameTest must be one of the following:

memory: Tests the memory manager with which it has been compiled.
prio: Tests the Round Robin algorithm with priorities implemented in Scheduling.
sync: Tests the implementation of semaphores as synchronization mechanisms.
processes: Tests the creation, blocking and termination of processes exhaustively

Note: The sync test is the only one that receives additional parameters, as follows:

user@device ~$ test sync {X} {Y}

Where X = number of increments. Y = 1 if semaphores are desired, Y = 0 otherwise. Finally, the test command can be run both in foreground and background, by adding the & symbol before it as Linux does.

Finally, we can highlight the operation of the terminal, which initially separates the commands into two groups, those without a pipe and those with a pipe. Note that this is due to the semantics of the arguments, with the pipe command arguments being one on the left and one on the right, while all other arguments are only on the right.

Limitations

One of the main limitations is that if a process runs in the foreground and it doesnt stop by itself, then it cannot be killed with any combination of keys. Another implementation that has not been achieved is to be able to do cat | filter or any combination that has cat as the first pipe parameter as the pipe waits for the first process to finish so that the second one can execute and read from the pipe.

Finally, the commandsEngine does not parse the commands through a finite automata, we had not yet taken the class when we built the project. The solution we proposed used KMP to check the suffix of the command (after removing the spaces the first string should always represent the command) and then split the command around spaces to parse the arguments passed to the command.

Conclusion

In conclusion, we have been able to implement many concepts that we studied at the beginning of the course applied in UNIX-type Operating Systems. This way we have deepened our understanding of operating systems, since their implementation required a low-level understanding of them. In addition to this, it has allowed us to gain a level of understanding that translates to virtually all of software.

Appendix

Compilation and execution instructions

A makefile system is provided to compile the program, as well as a compilation parameter that determines the memory manager to be used once compiled.

To compile with the default memory manager, the first fit, use the following command standing at the root of the project:

bash make all

To compile with the buddy memory manager you need to pass a compilation argument:

make all mm=buddy

To run the program you can flash the image present in the Image folder or if you have the dependencies mentioned in the README installed, you can use the run script:

./run.sh

Finally, to delete all objects and compilation files, run the command:

make clean

Citations of code snippets reused from other sources

When implementing the 'phylo' application, we searched for solutions to the problem of philosophers with fixed amounts, to try to guide the solution and avoid errors from the starting point.

Finally, for the memory manager, we based ourselves on different sources, among them the most significant were:

for the implementation of both the buddy and the first fit.

Scilim

gdhOS

Table of contents