This document explains the functionalities of Rabia packages along with some assumptions and design rationales. The content below is gathered from package-level comments from each package, which is the most up-to-date source developer should consult.
The config package defines the Config object that stores most (if not all) parameters needed by a Rabia server, a client, or a controller. Nearly all other packages access a global Config object named Conf to retrieve OS environmental variables, command-line arguments, and hard-coded or calculated parameters.
The ledger package defines the Ledger object, and exactly one copy of a Ledger is held by each server. A Ledger is made by an array of Slot objects, and the Slot associated with index i (of the ledger array) stores all current information this server holds regarding the replicated log entry i, including: 1) this server's proposal for this replicated log entry and this server's binary consensus messages, 2) other servers' proposals and binary consensus messages, 3) whether a decision message has received, 4) whether this server has made a decision, 5) what is the decision, 6) what's the current phase and round of the consensus protocol that this server has proceeded so far. Due to the network latency and other factors, each server may have (slightly) different views about a replicated log entry at a certain time, so it is natural that they hold (slightly) different values in some fields of a Slot object. Nevertheless, Rabia's goal is to let all servers in the cluster agree on the same sequence of decisions eventually, and as fast as possible
Note: At each server, the ledger is read by the proxy layer and read and written by the consensus layer.
The logger package defines the function that initializes a Rabia Logger and several logger-related functions. A Logger can be set as a terminal logger (output to stdout only), or file logger (output to a file in the log folder), or a terminal & file logger.
Note:
- If the logger write to a file, the file pointer is also returned and needs to be managed by the user. The user needs to call .Sync() method of the file pointer to flush writes to the disk. Otherwise, the log file's last line may be ill-formatted. See the last a few lines of Executor.Executor() function for an example
- If we let the logger outputs to a file, the file's name is a specific prefix based on some current configurations in Conf. This strategy helps us organize logs that are generated by different benchmarking tests.
- We choose the package zerolog logger for Rabia version 4 because it offers better visual effects for command-line logs. Some of our developers, based outside the States, constantly have trouble installing zap, our logging module for Rabia version 3.
- Each entity (e.g., a client, a proxy, or a network layer) can have one or more loggers by having loggers of different subId fields.
The msg package contains multiple files. msg.proto defines Rabia's messaging objects' formats, including Command, ConsensusObj, and Msg objects, where ConsensusObj is embedded in Msg for server-server transmission. msg.go defines several messaging-related helper functions that facilitate ConsensusObj comparison, Command serialization, and Msg serialization. msg.pb.go, defines the serialization & de-serialization schema of messaging objects, and it is auto-generated by gogo-protobuf based on definitions in msg.proto.
The queue package defines an implementation of the ConsensusObj priority queue, which is used to sort (based on the proxy id and proxy sequence fields) and store pending proxy-batched requests.
Note:
- A proxy-batched request contains one or more client-batched requests that are batched again by the proxy. The size (number of requests) of the batch depends on several parameters in Conf and the arrival time of client requests.
- Each consensus instance has its pending request queue, which means a server can have more than one queue when Concurrency is greater than 1.
- The priority queue follows the example at https://golang.org/pkg/container/heap
The rstring package contains only one function, which generates a random string in a fast way. Possible characters of the string are from a to z and from A to Z.
Note:
- Random strings are used as the primary key of clients' read and write requests to the database.
- The random string generation function follows the RandStringBytesMaskImprSrcUnsafe() function at https://stackoverflow.com/a/31832326
The package system contains only one function, which should be instantiated as a Go routine: it listens to OS signals like SIGINT and SIGTEM, and when it receives a signal, it closes the channel given as the sole argument to notify the caller routine.
Note: The use case is when a user wants to terminate Rabia (in case Rabia errs) before it normally exits.
The tcp package defines ClientTCP, ProxyTCP, and NetTCP objects, which are the TCP communication component of a client, a server's proxy layer, and a server network layer, respectively. These objects and their functions help establish and maintain TCP connections and let Go routines to listen to SendChan (the channel that queues messages to be sent) and forward messages received over TCP to RecvChan (to be subsequently accessed by caller routines).
Note:
-
This package assumes Conf is initialized.
-
This version of TCP endpoints does no support reconfiguration. Reconfiguring a server requires having multiple TCP endpoint objects
The client package defines the struct and functions of a Rabia client. There are two types of clients, open-loop ones and closed-loop ones. The former sends requests one after another until they send Conf.NClientRequests requests, and waits are requests are replied. The latter waits for a reply after sending a request until the benchmark time ends, i.e., when Conf.ClientTimeout is reached. Each request contains one more key-value store operations.
Note:
- The purpose of ClientBatchSize:
When ClientBatchSize = 1, each Client routine can be conceived as a KV store client, i.e., a kv-store client process that sends one request and receives one reply at a time.
When ClientBatchSize = n >= 1, each Client routine can be conceived as a client proxy, which batches one or more kv-store client processes' requests, and sends a batch of requests and receives a batch of requests at a time.
See comments in msg.proto for more discussion.
The controller package defines the struct and functions of the benchmark controller. The controller (1) informs all clients to start submitting requests at around the same time. (2) When all clients have done sending or the specified execution time is up, they signal the controller, informing all servers to exit. (1) attempts to maximize the read and write pressure against Rabia, and purpose (2) helps the shell scripts of Rabia to determine when all servers and clients exit so that the scripts can start to gather logs.
Note:
-
When Rabia runs in a production environment (i.e., when we are not doing benchmarking), there is no need to have a benchmark controller. We need to modify some "superficial" code to remove dependencies on the controller.
-
The caller shell script waits for the controller instead of sending the controller process to the background as it does to the servers and clients. So after servers and clients normally exit, the controller exits, and then the script knows it is time to collect logs from the cluster of machines.
-
The controller communicates with a Receiver object at each server and client to coordinate necessary steps in benchmarking.
The server package defines a Rabia server's struct and its initialization, preparation, and main function. Note: A server has many Goroutines, and each routine belongs to one of the three server layers: the proxy/application layer, the peer-communicating network layer, and the consensus instance layer. There are one or two main Go routines at each layer. When a server wants to exit, it needs to wait for them all. Besides, the proxy and network layers maintain their TCP connections, which require a few more Goroutines to send requests and to wait for responses.
The consensus package defines the struct of a consensus instance and methods shared by the consensus executor and the message handler. Both routines modify states in the server Ledger, but the executor follows the pseudo-code of Rabia. The handler does the dirty work that is not described in the pseudo-code -- basically, send and receive messages on the behalf of this executor (but the proposal request and reply phase is an exception). See descriptions in executor.go and msg_handler.go for more details.
Note:
- The implemented code for deciding each slot is somewhat different from the algorithm in our paper; The implementation follows a more verbose version of the algorithm presented in the SOSP paper, see the document in the docs folder
The network package defines the network layer of a server, which in charge of communicating with server peers. It sends messages produced by executors, messages received from network layer message handlers, and messages forwarded from the proxy to one or more servers (may include itself). When it receives a message from its peer, it routes the message to a channel (to a proxy, or an executor, or a handler) according to the message's type.
Note:
-
For messages of type ProposalRequest and ProposalReply, some fields besides the type fields are also used in determining routing destination. See the comment in msg.proto for more details.
-
Comments on the sequence number / logical slot number / message sequence number: They mean the same thing, and I use them interchangeably. Why "message sequence number" means the same is a little obscure, basically, messages except those of type ClientRequests has a slot number associated with it, and that number is called "message sequence number."
The proxy package defines the proxy/application layer of a server. The proxy connects to one or more Rabia clients to send and receive client requests. It also executes client commands decided by consensus instance(s). For these two reasons, it has two primary routines that run concurrently, one is CmdReceiver (client command receiver), and the other is KVSExecutor (KV-store executor).
The main package contains Rabia's entry function, which loads configurations and spawns a Rabia server, or a client, or a controller based on provided arguments.