
Functional Block: GK


The Gatekeeper (GK) block can be considered the main component of Gatekeeper servers, as its task is to accept incoming packets, look up policy decisions for those packets, and queue requests and granted packets for transmission to Grantor. It is in the data plane and can scale across multiple lcores. It only runs when the Gatekeeper executable is run as a Gatekeeper server (as opposed to a Grantor server). To better balance the load among the NUMA nodes, we spread the GK instances among all the NUMA nodes.


Description

Each instance of the GK block listens on a separate queue and processes packets that arrive at that queue. We utilize RSS to distribute incoming packets among the GK instances. RSS also guarantees that packets belonging to the same flow will be directed to the same queue. Therefore, each GK instance maintains its flow table without locks, because each block is the only reader and writer of its table. However, all the GK instances look up policy decisions in a global LPM table.

The basic algorithm of the GK block is to repeatedly load a set of packets from the front interface, and for each packet in the set do the following:

  • If the pair of source and destination addresses is in the flow table, proceed as the forwarding entry instructs (send request to Grantor, transmit packet according to some rate limit, or drop packet).
  • Otherwise, look up the destination address in the global LPM table.
    • If there is an entry for the destination, initialize an entry in the flow table. Proceed with the policy decision that the LPM entry instructs: send request to Grantor, transmit packet according to some rate limit, or drop packet.
    • Otherwise, drop the packet.

This is the path that most packets will take. However, the GK block also forwards flows that have been configured to bypass the Gatekeeper algorithm, as well as packets from the back interface to the front interface.
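
The decision flow above can be sketched in a few lines of Lua. The sketch is purely illustrative: flow_table, lpm_lookup, and process_packet are made-up names for the example, and the real implementation is C code operating on DPDK structures.

-- Minimal sketch of the per-packet decision flow described above.
local flow_table = {}  -- per-GK-instance table, keyed by "src|dst"

local function lpm_lookup(dst)
        -- Stand-in for the global LPM lookup; returns an action such as
        -- "request", "grant", or "drop", or nil when no entry matches.
        return nil
end

local function process_packet(src, dst)
        local key = src .. "|" .. dst
        local entry = flow_table[key]
        if entry then
                return entry.action  -- follow the existing flow entry
        end
        local action = lpm_lookup(dst)
        if action then
                flow_table[key] = { action = action }  -- new flow entry
                return action
        end
        return "drop"  -- no FIB entry for the destination
end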

Static Configuration

To run the executable as a Gatekeeper server (as opposed to a Grantor server), the variable local gatekeeper_server in lua/main_config.lua should be set to true.

To change the number of lcores assigned to the GK block on all NUMA nodes (i.e. the number of GK instances), change the variable local n_gk_lcores in lua/main_config.lua.

All other static configuration variables can be configured in lua/gk.lua.
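
For instance, a deployment running as a Gatekeeper server with four GK instances would set the following in lua/main_config.lua (the value of n_gk_lcores below is illustrative):

local gatekeeper_server = true  -- run as a Gatekeeper server, not a Grantor server
local n_gk_lcores = 4           -- number of GK instances across all NUMA nodes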

Variables to Change for Basic Operation

These variables are likely to change from deployment to deployment based on the operator's preferences.

Log Level

log_level

The log level for the GK block. Can be set to any one of the following values: RTE_LOG_EMERG, RTE_LOG_ALERT, RTE_LOG_CRIT, RTE_LOG_ERR, RTE_LOG_WARNING, RTE_LOG_NOTICE, RTE_LOG_INFO, RTE_LOG_DEBUG.

Since RTE_LOG_ERR is typically the most severe condition that Gatekeeper logs, we recommend not setting this value below RTE_LOG_ERR.
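
As a hypothetical snippet for lua/gk.lua, a typical setting could look like the line below; the staticlib binding that exposes the DPDK log-level constants is an assumption here, so consult lua/gk.lua for the exact form.

-- Hypothetical: keep GK logs at warnings and above. The staticlib
-- binding used to reach the DPDK constant is an assumption.
log_level = staticlib.c.RTE_LOG_WARNING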

BPF programs

bpf_base_directory & bpf_programs

bpf_base_directory is the directory in which Gatekeeper servers find the BPF programs that will be associated with flow entries. bpf_programs is a table whose keys are program indexes and whose values are the names of the programs. Currently, program indexes go from 0 to 255, and the indexes from 0 to 15 should be reserved for BPF programs that come with Gatekeeper.

Notice that these program indexes are used by policies. Thus, a mismatch of these program indexes between Gatekeeper and Grantor servers can lead to unexpected behavior.
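
As an illustration, the two variables could be set as follows in lua/gk.lua; the directory and the program names are made up, and only the 0-15 reservation comes from the text above.

-- Illustrative values only: the directory and file names are made up.
bpf_base_directory = "./bpf"
bpf_programs = {
        [0] = "granted.bpf",    -- example slot reserved for a bundled program
        [16] = "my_policy.bpf", -- first index available for operator programs
}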

Variables to Change for Performance Reasons

It is not crucial to change these variables; they only need to be changed to fine-tune the performance of Gatekeeper. Otherwise, the default values are likely fine.

Mailbox Maximum Entries (Exponential)

mailbox_max_entries_exp

The log (base 2) of the maximum size of the GK mailbox. For example, if the variable is set to 7, then room for 2^7 = 128 entries will be made in the mailbox.

Also used to determine how many entries will actually be available for use in the mailbox, which for efficiency reasons is one less than the maximum size of the mailbox (127 in the example above).

Mailbox Cache Size

mailbox_mem_cache_size

Number of mailbox entries to keep in the cache for more efficient use of the mailbox. Set to 0 to disable the cache of the memory pool for the mailbox.

Mailbox Burst Size

mailbox_burst_size

Maximum number of entries to receive in a burst every time the mailbox is checked.

Log Rate Limit Interval

log_ratelimit_interval_ms

The interval at which logs are rate limited (in milliseconds). For a given interval, only log_ratelimit_burst log entries are permitted. The count of entries is reset for each new interval.

Log Rate Limit Burst

log_ratelimit_burst

The number of entries per interval allowed to be logged. When the number of log entries exceeds this limit in a given interval, the excess entries are dropped.
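
For example, the two rate-limiting parameters combine as follows (the values are illustrative):

-- Illustrative values: allow at most 20 GK log entries per 5-second
-- interval; entries beyond the 20th within an interval are dropped.
log_ratelimit_interval_ms = 5000
log_ratelimit_burst = 20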

Interface Maximum Packet Burst (Front and Back)

max_pkt_burst_front & max_pkt_burst_back

A suggestion for the maximum number of packets received in each burst on the front and back interfaces in the GK instances, respectively. For each interface, this parameter is only a suggestion since the true burst size is chosen as the maximum between this configured value and the number of ports that compose the interface.

FIB Dump Batch Size

fib_dump_batch_size

The batch size for dumping the GK FIB. The motivation for this parameter can be found in issue #492.

Flow Hash Table

  • flow_ht_size
  • flow_table_scan_iter
  • scan_del_thresh
  • flow_ht_max_probes
  • flow_ht_scale_num_bucket

Each GK instance maintains its own flow table for flows that it has handled.

flow_ht_size specifies the number of entries of the flow hash table for each GK instance.

flow_table_scan_iter is the number of iterations of the GK block's main loop that occur between scanning entries of the flow table. Set to 0 to scan an entry every iteration of the loop.

scan_del_thresh is the number of flow entries that need to be released before a new flow entry can be added again to the flow table. Trying to add a flow entry is very costly when the flow table is full or nearly full. Thus, a GK block stops adding new flow entries when its flow table is nearly full until scan_del_thresh flow entries are released.

Until version 1.1, Gatekeeper used the DPDK Hash library for its flow table. Starting with version 1.2, Gatekeeper employs its own Hopscotch hash library. Version 1.2 introduced the parameters flow_ht_max_probes and flow_ht_scale_num_bucket.

flow_ht_max_probes specifies the maximum number of buckets that are queried to find an empty bucket. The larger this number, the higher the occupancy of the hash table can be, but the worse the performance of adding new flows when the hash table is close to its maximum occupancy. The trade-off is therefore between occupancy and performance.

flow_ht_scale_num_bucket scales flow_ht_size to obtain the number of buckets in the hash table. The more buckets, the fewer flows collide into the same bucket, the fewer entries per bucket (i.e. faster lookups), and the more memory is allocated to hold the buckets. flow_ht_scale_num_bucket must be greater than zero. Values less than one are useful for stress testing the code in test environments.
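
A small sizing sketch ties these parameters together; all values below are illustrative.

-- Illustrative sizing: each GK instance gets a flow table with room for
-- 1024 flow entries spread over 1024 * 2 = 2048 buckets, probing at most
-- 4 buckets when searching for an empty one.
flow_ht_size = 1024
flow_ht_scale_num_bucket = 2
flow_ht_max_probes = 4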

LPM Table

  • max_num_ipv4_rules & max_num_ipv6_rules
  • num_ipv4_tbl8s & num_ipv6_tbl8s
  • max_num_ipv6_neighbors

For its global LPM table shared among the GK instances, Gatekeeper uses the DPDK LPM library. This library implements the DIR-24-8 algorithm using two types of tables: (1) tbl24, a table with 2^24 entries; (2) tbl8, a table with 2^8 entries. To configure an LPM component instance, one needs to specify: (1) the maximum number of rules to support and (2) the number of tbl8 tables.

max_num_ipv4_rules and max_num_ipv6_rules represent the maximum number of rules to support for the IPv4 and IPv6 LPM tables, respectively. The IPv4 LPM reserves 24 bits for the next-hop field, whereas the IPv6 LPM reserves 21 bits. Since each FIB entry needs its own next-hop value, the bit length of the next-hop field effectively limits the size of the FIB tables to 2^24 and 2^21 entries, respectively. These limits are enough for the current size of routing tables.

num_ipv4_tbl8s and num_ipv6_tbl8s represent the number of tbl8 tables used in the DIR-24-8 algorithm for the IPv4 and IPv6 LPM tables, respectively.

max_num_ipv6_neighbors is the maximum number of neighbor entries for the IPv6 forwarding table. For IPv4, the address space is small enough that we can allocate enough room for all possible neighbors. For IPv6, the address space is too large; therefore, we specify the expected maximum size of this table in the configuration.
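
The five parameters above might be set as follows; all values are illustrative, and real deployments should size them according to their routing tables and expected IPv6 neighborhood.

-- Illustrative sizes for the LPM tables and the IPv6 neighbor table.
max_num_ipv4_rules = 1024
num_ipv4_tbl8s = 256
max_num_ipv6_rules = 1024
num_ipv6_tbl8s = 65536
max_num_ipv6_neighbors = 65536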

ICMP Rate Limiting

  • front_icmp_msgs_per_sec & front_icmp_msgs_burst
  • back_icmp_msgs_per_sec & back_icmp_msgs_burst

It is necessary to limit the number of ICMP replies Gatekeeper issues per second to avoid a scenario where Gatekeeper servers are the source of an attack. We limit the rate of ICMP messages using the token bucket algorithm, and the limit is set per network interface.

front_icmp_msgs_per_sec and front_icmp_msgs_burst represent the rate and burst size of the ICMP messages for the front interface, and back_icmp_msgs_per_sec and back_icmp_msgs_burst represent the rate and burst size of the ICMP messages for the back interface.

GK blocks only issue ICMP messages when a packet to be forwarded exceeds its hop count limit.
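
For instance, to cap each interface at a sustained 1000 ICMP messages per second while absorbing bursts of up to 50 (both values illustrative):

-- Illustrative token-bucket parameters for ICMP replies, per interface.
front_icmp_msgs_per_sec = 1000
front_icmp_msgs_burst = 50
back_icmp_msgs_per_sec = 1000
back_icmp_msgs_burst = 50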

Basic Measurement Interval

basic_measurement_logging_ms

The interval at which basic measurements of the GK instances are taken (in milliseconds). These measurements include the number of packets and bytes seen, requested, declined, granted, etc., and are written to the Gatekeeper log every interval.

Variables Unlikely to Change

BPF programs

bpf_enable_jit

When bpf_enable_jit is false, Gatekeeper servers run BPF programs in a VM; when it is true (the default), Gatekeeper servers compile BPF programs to host instructions and run them natively. The only reasons to disable just-in-time (JIT) compilation are (1) Gatekeeper cannot compile the programs for your platform, or (2) you are diagnosing whether the JIT compiler is miscompiling the programs.

Dynamic Configuration

There are various ways to configure and inspect the GK block at runtime using Gatekeeper's dynamic configuration.

Adding and Deleting FIB Entries

To add an entry to the forwarding table, a Lua script can be used with the dynamic configuration library to call:

int add_fib_entry(const char *prefix, const char *gt_ip, const char *gw_ip,
        enum gk_fib_action action, struct gk_config *gk_conf)
  • prefix is the IP prefix, in CIDR notation, of the entry to add
  • gt_ip is the IP address of the Grantor server to forward packets in this prefix to (if applicable)
  • gw_ip is the IP address of the gateway through which packets in this prefix should be forwarded (if applicable)
  • action is one of GK_FWD_GRANTOR, GK_FWD_GATEWAY_FRONT_NET, GK_FWD_GATEWAY_BACK_NET, or GK_DROP to indicate that flows from this prefix should be forwarded to a Grantor server, forwarded to a destination in the front or back network through a gateway, or dropped, respectively
  • gk_conf is the context for the GK block currently running

FIB entries can also be deleted through the dynamic configuration by calling:

int del_fib_entry(const char *ip_prefix, struct gk_config *gk_conf)
  • ip_prefix is the IP prefix, in CIDR notation, of the entry to delete
  • gk_conf is the context for the GK block currently running

For examples, see lua/examples/example_of_dynamic_config_request.lua.
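
As a hypothetical sketch of such a request script, the snippet below adds and later deletes a drop entry. The dylib binding and the dyc handle are assumptions; the example file above shows how a request script actually obtains them.

-- Hypothetical dynamic configuration request; dylib and dyc are
-- assumptions borrowed from the example script's setup.
local ret = dylib.c.add_fib_entry("198.51.100.0/24", nil, nil,
        dylib.c.GK_DROP, dyc.gk)
if ret < 0 then
        return "add_fib_entry failed"
end

-- Later, remove the same entry.
ret = dylib.c.del_fib_entry("198.51.100.0/24", dyc.gk)
return ret < 0 and "del_fib_entry failed" or "done"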

Dumping a FIB

The IPv4 and IPv6 forwarding tables can be separately dumped to the console by calling either list_gk_fib4() or list_gk_fib6(). They accept as parameters a GK configuration structure, a callback function that is invoked once for every entry in the forwarding table, and an accumulator string.

For examples, see lua/examples/example_of_dynamic_config_request.lua.
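
A hypothetical call matching the parameter description above; the callback signature, the way an entry is rendered, and the dylib/dyc names are assumptions.

-- Hypothetical sketch: fold every IPv4 FIB entry into one string.
return dylib.list_gk_fib4(dyc.gk,
        function (fib_entry, acc)
                return acc .. tostring(fib_entry) .. "\n"
        end, "")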

Dumping a Neighbor Table

Similar to the FIB, the IPv4 and IPv6 neighbor tables can be separately dumped to the console by calling either list_gk_neighbor4() or list_gk_neighbor6(). They accept as parameters a GK configuration structure, a callback function that is invoked once for every entry in the neighbor table, and an accumulator string.

For examples, see lua/examples/example_of_dynamic_config_request.lua.

Flushing Entries from Flow Tables

The dynamic configuration enables the user to flush entries from the GK flow tables using the following C function:

int gk_flush_flow_table(const char *src_prefix,
        const char *dst_prefix, struct gk_config *gk_conf)
  • src_prefix is an IP address range in CIDR notation or NULL
  • dst_prefix is an IP address range in CIDR notation or NULL
  • gk_conf is the context for the GK block currently running

Each GK instance will prune from its flow table any entries whose source and destination IP addresses fall within these ranges.

For examples, see lua/examples/example_of_dynamic_config_request.lua.
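
For instance, a hypothetical request that flushes every flow destined to a given /24, regardless of source; nil is assumed here to stand for the NULL "match any" prefix, and dylib/dyc are assumptions as above.

-- Hypothetical sketch: flush all flows toward 198.51.100.0/24.
return dylib.c.gk_flush_flow_table(nil, "198.51.100.0/24", dyc.gk)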

Logging Flow State from Flow Table

The dynamic configuration enables the user to log a flow state from the corresponding GK flow table using the following C function:

int gk_log_flow_state(const char *src_addr,
        const char *dst_addr, struct gk_config *gk_conf)
  • src_addr is the source IP address of the flow
  • dst_addr is the destination IP address of the flow
  • gk_conf is the context for the GK block currently running

For examples, see lua/examples/example_of_dynamic_config_request.lua.
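
A hypothetical request; the addresses are illustrative and dylib/dyc are assumptions as above.

-- Hypothetical sketch: log the state of the flow from 198.51.100.2
-- to 203.0.113.5, if some GK instance currently holds it.
return dylib.c.gk_log_flow_state("198.51.100.2", "203.0.113.5", dyc.gk)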

Loading BPF programs

The dynamic configuration can load BPF programs at runtime using the following C function:

int gk_load_bpf_flow_handler(struct gk_config *gk_conf, unsigned int index,
        const char *filename, int jit)
  • gk_conf is the context for the GK blocks currently running;
  • index is the index to be used for the new BPF program. Lua policies must refer to this index to associate the program with a flow;
  • filename is the name of the file from which to load the BPF program;
  • jit, if true and supported, compiles the BPF program to host machine instructions.

For examples, see lua/examples/example_of_dynamic_config_request.lua.
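
A hypothetical request; the index and file path are illustrative (index 20 is outside the 0-15 range reserved for bundled programs), and dylib/dyc are assumptions as above.

-- Hypothetical sketch: load a BPF program into program index 20 with
-- JIT enabled (1 = true for the C int parameter).
return dylib.c.gk_load_bpf_flow_handler(dyc.gk, 20,
        "/path/to/my_policy.bpf", 1)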