Marco Scilipoti's blog

Table of contents
Introduction
Project structure
Challenges Encountered and Design Decisions
- Selector and State Machine
- Parser
Testing
Client Implementation
Server Implementation
Conclusions
Appendix

Introduction

In this report we describe the implementation of a non-blocking concurrent POP3 server. In addition, we designed and implemented our own monitoring protocol for the server, and a client that uses it. We also expand on the design decisions we made and the challenges we encountered, as well as the tests the server was subjected to (and passed).

Project structure

Even though the project was designed for educational purposes, it was important that the layout of the code reflect the architecture of the application. Along the same lines, the server was checked for correctness with PVS-Studio, compiled with two different C compilers, and tested with Valgrind for memory leaks.

The libraries live in the lib/ directory, and the client/ directory holds the implementation of the client.

The server is implemented in the pop3 folder, with main defined in main.c. Here the server creates two pairs of passive sockets to listen for incoming connections: one pair for POP3 connections and one pair for monitor connections. Each pair has two sockets because the server supports both IPv4 and IPv6 connections.
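As an illustration of the listening sockets described above, the following is a minimal sketch of how one non-blocking passive socket per address family could be created. The function name passive_socket and the choice to bind to every local address are assumptions made for the example, not the project's actual code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a non-blocking listening TCP socket for the
 * given family (AF_INET or AF_INET6), bound to every local address, or -1. */
int passive_socket(int family, unsigned short port) {
    assert(family == AF_INET || family == AF_INET6);

    int fd = socket(family, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0) return -1;

    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    struct sockaddr_storage addr;
    socklen_t len;
    memset(&addr, 0, sizeof(addr));
    if (family == AF_INET6) {
        /* Keep this socket IPv6-only so it does not clash with the IPv4 one. */
        setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
        struct sockaddr_in6 *a6 = (struct sockaddr_in6 *)&addr;
        a6->sin6_family = AF_INET6;
        a6->sin6_addr = in6addr_any;
        a6->sin6_port = htons(port);
        len = sizeof(*a6);
    } else {
        struct sockaddr_in *a4 = (struct sockaddr_in *)&addr;
        a4->sin_family = AF_INET;
        a4->sin_addr.s_addr = htonl(INADDR_ANY);
        a4->sin_port = htons(port);
        len = sizeof(*a4);
    }

    /* Non-blocking mode, then bind and listen; clean up on any failure. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 ||
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ||
        bind(fd, (struct sockaddr *)&addr, len) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Under these assumptions, a server with two protocols would call the helper four times: once per address family for the POP3 port and once per address family for the monitor port.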

The file descriptors are then registered in the selector defined in lib/selector. This is the part of the server responsible for switching between the different connections. Internally it uses pselect to wait on all the connections at once without blocking on any single one.
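To make the role of pselect more concrete, here is a minimal sketch of the multiplexing step at the heart of a selector. The helper wait_readable is hypothetical and only handles read interest; the actual lib/selector also tracks write interest and per-descriptor handlers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: waits until at least one descriptor in fds is readable
 * or the timeout expires. ready[i] is set to 1 for every readable fds[i].
 * Returns what pselect() returns: the number of ready descriptors,
 * 0 on timeout, or -1 on error. */
int wait_readable(const int *fds, size_t n, int *ready,
                  const struct timespec *timeout) {
    fd_set read_set;
    int max_fd = -1;

    FD_ZERO(&read_set);
    for (size_t i = 0; i < n; i++) {
        /* select-style sets can only hold descriptors below FD_SETSIZE */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &read_set);
        if (fds[i] > max_fd) max_fd = fds[i];
        ready[i] = 0;
    }

    int count = pselect(max_fd + 1, &read_set, NULL, NULL, timeout, NULL);
    for (size_t i = 0; count > 0 && i < n; i++) {
        ready[i] = FD_ISSET(fds[i], &read_set) ? 1 : 0;
    }
    return count;
}
```

A selector loop would call a function like this repeatedly and then invoke the read handler of each descriptor flagged as ready, so no single slow connection stalls the rest.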

The connection states are defined through a state machine implemented in lib/stm. This way we can clearly define the different states a connection can be in, as well as the handlers that are executed when a connection arrives at or leaves a state.
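The idea of running handlers when a connection arrives at or leaves a state can be sketched as follows. The state names come from the POP3 specification; the structures, function names, and counting callbacks are assumptions for the example and do not reproduce the lib/stm interface.

```c
#include <assert.h>
#include <stddef.h>

/* POP3 session states; the real server may define more (or different) ones. */
enum conn_state { ST_AUTHORIZATION, ST_TRANSACTION, ST_UPDATE, ST_COUNT };

struct state_def {
    void (*on_arrival)(void *ctx);   /* run when the connection enters the state */
    void (*on_departure)(void *ctx); /* run when the connection leaves the state */
};

struct state_machine {
    const struct state_def *states;  /* table indexed by enum conn_state */
    enum conn_state current;
};

/* Sets the initial state and runs its arrival handler. */
void stm_init(struct state_machine *m, const struct state_def *states,
              enum conn_state initial, void *ctx) {
    assert(initial < ST_COUNT);
    m->states = states;
    m->current = initial;
    if (states[initial].on_arrival != NULL) states[initial].on_arrival(ctx);
}

/* Moves to another state: departure handler first, then arrival handler. */
void stm_jump(struct state_machine *m, enum conn_state next, void *ctx) {
    assert(next < ST_COUNT);
    if (next == m->current) return; /* staying put triggers no handlers */
    if (m->states[m->current].on_departure != NULL)
        m->states[m->current].on_departure(ctx);
    m->current = next;
    if (m->states[next].on_arrival != NULL) m->states[next].on_arrival(ctx);
}

/* Demo callbacks that only count how often they were invoked. */
struct trace { int arrivals; int departures; };
void note_arrival(void *ctx) { ((struct trace *)ctx)->arrivals++; }
void note_departure(void *ctx) { ((struct trace *)ctx)->departures++; }

const struct state_def demo_states[ST_COUNT] = {
    [ST_AUTHORIZATION] = { note_arrival, note_departure },
    [ST_TRANSACTION]   = { note_arrival, note_departure },
    [ST_UPDATE]        = { note_arrival, NULL },
};
```

In a real server the arrival handler is a natural place to allocate per-state resources, and the departure handler the place to release them.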

Finally, since the two protocols are text-based, we use another finite state machine (implemented in lib/parser) to process the commands sent by a POP3 or monitor client. The implementation of the parser is rather straightforward because the grammar of both protocols is trivial: a command followed by at most two parameters, delimited by whitespace.
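The grammar described above (a command plus at most two space-separated parameters, terminated by CRLF) is small enough to parse byte by byte. The following is a sketch of such a parser written from scratch for illustration; it is not the parser.c library used by the project, and the 40-character token limit is an assumption taken from the POP3 argument limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN 40 /* assumption: POP3 arguments are at most 40 characters */

/* The first three values double as indexes into the tokens array. */
enum parse_state { IN_COMMAND, IN_ARG1, IN_ARG2, SAW_CR, DONE, FAILED };

struct line_parser {
    enum parse_state state;
    char tokens[3][MAX_TOKEN + 1]; /* command, first and second argument */
    size_t len;                    /* length of the token being filled */
};

void line_parser_reset(struct line_parser *p) {
    memset(p, 0, sizeof(*p)); /* also leaves every token NUL-terminated */
    p->state = IN_COMMAND;
}

/* Feeds one byte and returns the new state; DONE means a full line was read. */
enum parse_state line_parser_feed(struct line_parser *p, char c) {
    switch (p->state) {
    case IN_COMMAND:
    case IN_ARG1:
    case IN_ARG2:
        if (c == '\r') {
            /* an empty command is not a valid line */
            p->state = (p->state == IN_COMMAND && p->len == 0) ? FAILED : SAW_CR;
        } else if (c == ' ') {
            /* a space closes the token; empty tokens and a third argument fail */
            if (p->len == 0 || p->state == IN_ARG2) {
                p->state = FAILED;
            } else {
                p->state = (enum parse_state)(p->state + 1);
                p->len = 0;
            }
        } else if (p->len < MAX_TOKEN) {
            p->tokens[p->state][p->len++] = c;
        } else {
            p->state = FAILED; /* token too long */
        }
        break;
    case SAW_CR:
        p->state = (c == '\n') ? DONE : FAILED;
        break;
    default:
        break; /* DONE and FAILED are terminal until the parser is reset */
    }
    return p->state;
}

/* Convenience wrapper: resets the parser and feeds a whole string. */
enum parse_state parse_line(struct line_parser *p, const char *line) {
    assert(p != NULL && line != NULL);
    line_parser_reset(p);
    while (*line != '\0' && p->state != DONE && p->state != FAILED) {
        line_parser_feed(p, *line++);
    }
    return p->state;
}
```

Feeding one byte at a time matters in a non-blocking server, because a command may arrive split across several reads and the parser has to resume where it stopped.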

Challenges Encountered and Design Decisions

Selector and State Machine

We ran into many challenges while working on the server, most of them related to the state machine (stm) and the selector. With the selector, we struggled to understand how reads and writes should be orchestrated. We also had to work out how to release the memory associated with each file descriptor (fd), both when a client quits and when the user presses Ctrl+C. This took a lot of time because we prioritized having no memory leaks; after testing with both -fsanitize=address and Valgrind, we found none.

In the state machine, the initial problem was the transition between the states of a connection. Once transitions worked smoothly, the idea arose of grouping the behavior of the commands in their own files. In this way, we managed to encapsulate the specific logic of each command in its respective file. For example, pop3/commands/monitor/authorization defines all the commands available in the authorization step of the protocol. Initially, we applied this only to the reading stage, through read_commands, which interacted directly with the parser and was responsible for executing the handler of the parsed command. After implementing the writing logic, we found it appropriate to reuse the existing file structure so that each command also manages its writing logic in its respective file.

Finally, the litmus test of this structure was adapting it to the monitor protocol. The truth is that quite a few bugs appeared when repeating the structure, due to C's lack of generic types. However, we successfully implemented the same command reading/writing structure for the monitor.

Parser

The parser is an adaptation of the parser.c library provided by the course, used to parse what the user enters. Because each POP3 command takes at most two arguments separated by a space, the parser was designed to generically recognize a command and two arguments. After reading the second argument, a flag is set in the parser_feed response. The read_commands function then detects this flag and calls the process_commands function, which compares the entered command with the array of commands available in that state. If it finds the command, it executes the corresponding handler (through the stored function pointers) with the respective arguments and the state of the connection.
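The lookup of a parsed command in a per-state array of function pointers can be sketched as below. The structures, the handler bodies, and the name dispatch_command are hypothetical and only illustrate the technique; they are not the project's process_commands implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h> /* strcasecmp */

#define COMMAND_UNKNOWN (-2) /* the command is not valid in this state */

/* Hypothetical per-connection data. */
struct connection {
    const char *user;
    int closing;
};

typedef int (*command_handler)(struct connection *conn,
                               const char *arg1, const char *arg2);

struct command {
    const char *name;
    command_handler handler;
};

int handle_user(struct connection *conn, const char *arg1, const char *arg2) {
    (void)arg2;
    if (arg1 == NULL || arg1[0] == '\0') return -1; /* USER needs a name */
    conn->user = arg1;
    return 0;
}

int handle_quit(struct connection *conn, const char *arg1, const char *arg2) {
    (void)arg1;
    (void)arg2;
    conn->closing = 1;
    return 0;
}

/* Commands accepted in one (illustrative) state. */
const struct command authorization_commands[] = {
    { "USER", handle_user },
    { "QUIT", handle_quit },
};
#define AUTHORIZATION_COMMAND_COUNT \
    (sizeof(authorization_commands) / sizeof(authorization_commands[0]))

/* Runs the handler whose name matches, ignoring case as POP3 keywords do.
 * Returns the handler's result, or COMMAND_UNKNOWN if nothing matches. */
int dispatch_command(const struct command *table, size_t count,
                     struct connection *conn, const char *name,
                     const char *arg1, const char *arg2) {
    assert(table != NULL && conn != NULL && name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcasecmp(table[i].name, name) == 0) {
            return table[i].handler(conn, arg1, arg2);
        }
    }
    return COMMAND_UNKNOWN;
}
```

Keeping one table per state means that a command sent in the wrong state simply fails the lookup, with no extra checks inside the handlers.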

Once the parser has broken the input into tokens, the command is matched against the list of commands valid for the current protocol and connection state, and the matching handler is executed.

Testing

An integral part of implementing the POP3 server was testing its capabilities, as well as the design choices and their impact on speed. This section covers the buffer-size tests aimed at finding the optimal speed, as well as stress tests with the server running close to full capacity.

Buffer Testing

2K Buffer

For the first test case, we used a 2K buffer for the email. Keep in mind that, to obtain the best performance, the server was compiled with -O3 and without -fsanitize=address. During development, however, -fsanitize=address was enabled, and the server was also tested with Valgrind to ensure there were no memory leaks.

As we can see in Figure 1, the server takes 13 seconds to process a 1 GB email with read and write buffers of 2048 bytes.

4K Buffer

For the second test, we decided to double both the read and write buffer to compare them with the other cases.

In this case, the time is 11 seconds, 2 seconds faster than with the 2K buffer. We consider this a significant improvement, as the time dropped by about 15%. CPU usage was also monitored: while the RETR command is being processed, usage is 100%, and once processing finishes it drops sharply. This shows that when the mail has been processed, the resources are released and no leftover work keeps running.

8K Buffer

In the third test, we increased the buffers further to see how the server reacted to the change. Performance kept improving: the time dropped by another second, an improvement of roughly 9% over the 11 seconds of the 4K test.

16K Buffer

We then doubled the buffer again, to 16K. The result was interesting: there was no increase in performance, and the time remained the same as with the 8K buffer. This suggests that a larger buffer does not necessarily mean better server performance.

32K Buffer

Finally, we ran the same test with a 32K buffer. As we can see in Figure 5, the time was the same as when we ran the test with an 8K buffer, which reinforces our hypothesis that the improvement in performance due to buffer size has a limit, which we have reached.

Conclusions

Thanks to these tests, we concluded that server performance does not depend only on the size of the buffer: from 8K upward, the performance was exactly the same. That is why we settled on 8K as the final buffer size.
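To make concrete what the buffer size controls in these tests, the following sketch copies data between two descriptors through a buffer of configurable size: a smaller buffer means more read and write system calls for the same amount of data. The function copy_with_buffer is hypothetical and uses blocking I/O for brevity, whereas the server performs the equivalent work in non-blocking steps driven by the selector.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Copies everything from in_fd to out_fd through a buffer of buffer_size
 * bytes. Returns the number of bytes copied, or -1 on error. */
long copy_with_buffer(int in_fd, int out_fd, size_t buffer_size) {
    assert(buffer_size > 0);

    char *buffer = malloc(buffer_size);
    if (buffer == NULL) return -1;

    long total = 0;
    ssize_t got;
    while ((got = read(in_fd, buffer, buffer_size)) > 0) {
        /* write() may accept fewer bytes than requested, so loop until done */
        ssize_t sent = 0;
        while (sent < got) {
            ssize_t n = write(out_fd, buffer + sent, (size_t)(got - sent));
            if (n < 0) {
                free(buffer);
                return -1;
            }
            sent += n;
        }
        total += got;
    }

    free(buffer);
    return got < 0 ? -1 : total;
}
```

Timing a function like this with 2048, 4096, and 8192 bytes on the same input is one way to reproduce the kind of comparison described in this section.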

1GB Email Test

For this test, we created a 1 GB file and converted it to base64. We then retrieved the email with RETR and ran a diff between the two.

As we can see in Figure 6, the difference between both emails is only in the last line, as we saw in the previous case. This is why we can say that there is no difference between the original email and the one returned by the POP3 server.

Stress Test

Using a bash script (which is in the versioned repository within the test directory), we created 500 connections from 500 different POP3 server users. We aim to validate that concurrent connections are handled correctly. Additionally, after the user authentication, a LIST is performed to ensure that the users actually interact with the server.

With this, we validate the correct management of concurrent connections for 500 users. Since the server can handle up to 1024 file descriptors, there is still margin in its file descriptor management; however, with a maximum of 500 users, no more connections can be made, as the same user cannot hold two connections at once.

To verify that the 500 users have connected, we configure the logger to display the [DEBUG] messages. Then we transfer the standard output to a file and count the number of lines with the message '[DEBUG] yyyy-mm-dd hh:mm:ss ADD_USER' (where the monitor adds each user) and '[DEBUG] yyyy-mm-dd hh:mm:ss Registered a new connection' where a new connection is registered.

A word count could also be performed on the log file.

$ wc log_file

Concurrency Testing with Email Reading

Extending the previous test, we aim to test reading an email for each of the 500 concurrent connections.

For this, another bash script was created (also present in the test directory) where the necessary folders were created for reading the emails of each user. Analogous to the previous test, 500 connections are made and for each one, RETR 1 is performed. All users have the same email of approximately 1 MB.

To verify the results, in addition to using the logger as in the previous test, the output of the test script must be checked against the email present in each user's directory. This can also be verified through the scripts in the test directory, but it is fair to say that the server passed these tests.

Client Implementation

The client application was created to have a simple communication with the monitoring server. That's why it has functionalities limited to what the monitoring protocol offers.

Arguments

The monitor client receives everything it needs through command-line arguments, some mandatory and some optional. The mandatory arguments are -p, which receives the port where the monitoring server is running; -a, which receives the address where the monitoring server is running; and -u user:password, which is used for authentication. There are three optional arguments: -n user:password, which adds a new user to the POP3 server; -m, which prints all the metrics the POP3 server exposes; and -d directoryPath, which sets a new main directory for the POP3 server.
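The argument handling described above maps naturally onto POSIX getopt. The following is a minimal sketch under that assumption; the structure client_args and the function parse_client_args are hypothetical names, and the real client may parse its arguments differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

struct client_args {
    const char *address;     /* -a, mandatory */
    const char *port;        /* -p, mandatory */
    const char *credentials; /* -u user:password, mandatory */
    const char *new_user;    /* -n user:password, optional */
    const char *directory;   /* -d directoryPath, optional */
    int show_metrics;        /* -m, optional */
};

/* Returns 0 on success, or -1 if an option is unknown, lacks its value,
 * or one of the mandatory options is missing. */
int parse_client_args(int argc, char *const argv[], struct client_args *out) {
    assert(out != NULL);

    struct client_args parsed = {0};
    int opt;

    optind = 1; /* restart scanning so the function can be called again */
    opterr = 0; /* stay quiet; the caller decides how to report errors */
    while ((opt = getopt(argc, argv, "a:p:u:n:d:m")) != -1) {
        switch (opt) {
        case 'a': parsed.address = optarg; break;
        case 'p': parsed.port = optarg; break;
        case 'u': parsed.credentials = optarg; break;
        case 'n': parsed.new_user = optarg; break;
        case 'd': parsed.directory = optarg; break;
        case 'm': parsed.show_metrics = 1; break;
        default:  return -1;
        }
    }

    if (parsed.address == NULL || parsed.port == NULL ||
        parsed.credentials == NULL) {
        return -1;
    }
    *out = parsed;
    return 0;
}
```

Validating the mandatory options after the loop keeps the order of the arguments on the command line irrelevant.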

Security Considerations

The POP3 monitor protocol does not establish additional security mechanisms to those already present in the POP3 protocol. The client's authentication to the monitor server is performed using the USERNAME and PASSWORD commands, which must be adequately protected to prevent unauthorized access to the monitor server.

It is recommended to implement additional security measures, such as communication encryption, to ensure the confidentiality and integrity of the data transmitted between the client and the monitor server.

Implementation Considerations

The monitor server must be able to manage multiple client connections and must implement the functionalities defined in the protocol.

Communication between the client and the monitor server is carried out through TCP connections. The monitor server must be configured to listen on a specific port and manage incoming connection requests from clients.

Extensibility Considerations

The POP3 monitor protocol is extensible and can be adapted to include new functionalities and commands in future versions, especially in case of implementing new metrics.

When adding new commands or metrics, it is important to follow the conventions established in the protocol, such as the format of command and response messages, to ensure interoperability between different implementations.

It is recommended to clearly document any extension made to the protocol and provide the necessary information for clients to use the new functionalities appropriately.

Server Implementation

New connection diagram

The diagram shows the flow from when a new connection is established and how, through the selector, it iterates as the connection state changes.

The diagram simplifies the RETR operation of a user's email. In this case, the file containing the email is opened and added to the selector. The connection at this point goes into the NOOP state.

Limitations

Building the POP3 server required certain decisions that resulted in limitations. First, at most 500 users can be connected simultaneously, since select() is used and it restricts the server to 1024 file descriptors. Second, the length of the mail directory path is limited to 4096 characters, the maximum path length on Linux, so longer paths are not accepted. Third, usernames are limited to 40 characters, although this is hardly a limitation, since the POP3 RFC sets 40 characters as the maximum length. Finally, the number of emails a user can have in their cur folder is limited: during development we decided on a maximum of 500 emails per user, so if a user has more than 500 emails, the POP3 server only processes the first 500.
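The limits listed above can be captured as compile-time constants with small checks around them. The constant and function names below are hypothetical; only the numeric values come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>

#define MAX_USERS           500  /* simultaneous users */
#define MAX_PATH_LENGTH     4096 /* longest accepted mail directory path */
#define MAX_USERNAME_LENGTH 40   /* limit taken from the POP3 RFC */
#define MAX_EMAILS          500  /* emails processed per user */

/* select()-style sets can only track descriptors below FD_SETSIZE. */
int fd_fits_in_select(int fd) {
    return fd >= 0 && fd < FD_SETSIZE;
}

/* A username must be non-empty and within the length limit. */
int username_is_valid(const char *name) {
    assert(name != NULL);
    size_t length = strlen(name);
    return length > 0 && length <= MAX_USERNAME_LENGTH;
}

/* A mail directory path must be non-empty and within the length limit. */
int path_is_valid(const char *path) {
    assert(path != NULL);
    size_t length = strlen(path);
    return length > 0 && length <= MAX_PATH_LENGTH;
}

/* Only the first MAX_EMAILS emails found in a mailbox are processed. */
size_t emails_to_process(size_t found) {
    return found < MAX_EMAILS ? found : MAX_EMAILS;
}
```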

Possible Extensions

In terms of the implementation of the POP3 server, commands that are not currently supported could be added, as well as features such as spam-handling mechanisms for emails.

The monitoring protocol, on the other hand, could gain the ability to perform real-time analysis of the server metrics, such as identifying usage patterns or detecting anomalies in server behavior. The monitor client could also be extended to graph the server metrics. In addition, the commands that allow remote server configuration could be expanded, giving administrators greater control over the operation of the POP3 server. One of the main possible extensions is the addition of a dele_user command to remove users from the POP3 server. Analogously, the possibility of creating and deleting administrator users, with access to the monitor, could be added. Currently, monitor users can only be defined through the server's command-line arguments.

Conclusions

The implementation of the POP3 protocol and its monitor server has allowed us to deepen our understanding of internet protocols and the design and implementation of server-client systems. Through this project, we have been able to apply and develop skills in concurrent programming, the use of state machines, and socket handling.

Moreover, the introduction of the design of a monitoring protocol (and its respective client) has provided a mechanism to better understand considerations when creating a protocol and its respective RFC.

In summary, this project has been a valuable opportunity to learn about the design and implementation of protocols by building client-server systems and designing our own protocol. Beyond implementing servers or designing protocols, this project gave us deep insight into the backbone of any project on the internet: there is a server serving content and a client receiving it, with all of that content following a specific communication protocol.

Appendix

Installation Guide

Generation of binaries: From the root of the repository, run make all to compile both the server and the client. You can remove all binaries with make clean. The generated binaries were tested on both Linux and macOS.
Client Compilation: To compile only the client, run make client from the root of the repository.
Server Compilation: To compile only the server, run make server from the root of the repository.

Server and Client Configuration Instructions

Client Execution: To run the client, run the command bin/client -a ipServer -p portServer -u user:pass, where -a ipServer specifies the IP address of the server supporting the monitor protocol, -p portServer specifies the port of that server, and -u user:pass specifies an administrator user that already exists on the server.
Server Execution: To run the server, you can run the command bin/server -u user:pass -u user2:pass2 -a admin:admin-pass -a admin2:admin-pass2 -d mails/, where -u user:pass defines a user with the name 'user' and password 'pass', and -a admin:admin-pass defines an administrator user with the name 'admin' and password 'admin-pass'. Finally, the -d mails/ argument is the path where the mails that the server will manage are located.

Monitoring Examples

Adding a new POP3 user to the server would be executed as bin/client -n new-user:new-user-pass -p portServer -u user:pass, where -u user:pass are the credentials of the server administrator.
Viewing the server metrics would be executed as bin/client -m -p portServer -a serverAddress -u user:pass, where -u user:pass are the credentials of the server administrator.
