Frank Denis edited this page Jul 31, 2016 · 7 revisions

Labeled Tab-separated Values (LTSV) is a simple and clever format for encoding a set of key/value pairs.

A key/value pair is represented by: key:value. The split happens at the first colon (:), so it is perfectly valid for a value to contain colons.

Key/value pairs are delimited by TAB characters. Therefore, keys can contain any character except : and TAB, whereas values can contain any character except TAB and the record delimiter (usually \n).
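The splitting rules above can be sketched in a few lines of Python. This is an illustration only, not Flowgger's decoder: the function name parse_ltsv is hypothetical, and skipping empty fields is an assumption made for this example.

```python
def parse_ltsv(record: str) -> dict:
    """Parse one LTSV record into a dict of string keys and string values."""
    pairs = {}
    # Fields are separated by TAB; the trailing record delimiter is dropped.
    for field in record.rstrip("\n").split("\t"):
        if not field:
            continue  # assumption: silently skip empty fields
        # Split at the FIRST colon only, so values may contain colons.
        key, sep, value = field.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in field {field!r}")
        pairs[key] = value
    return pairs


record = "time:1469996508\thost:testhostname\turl:http://example.com:8080/"
parsed = parse_ltsv(record)
```

Because only the first colon is significant, the url value above keeps its own colons intact.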

Keys and values must be valid UTF-8 strings.

Generating LTSV records is extremely fast and easy. If a value contains TAB characters, they can usually just be replaced with spaces.
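As a rough sketch of how a producer might emit a record, the hypothetical helper below (format_ltsv is not part of Flowgger) joins the pairs with TAB characters and replaces TABs in values with spaces. It assumes \n is the record delimiter, so newlines in values are replaced as well and keys containing them are rejected.

```python
def format_ltsv(pairs: dict) -> str:
    """Serialize key/value pairs as a single LTSV record (without the delimiter)."""
    fields = []
    for key, value in pairs.items():
        key = str(key)
        # Keys cannot contain ':' or TAB; rejecting '\n' too is an assumption
        # that follows from using '\n' as the record delimiter.
        if ":" in key or "\t" in key or "\n" in key:
            raise ValueError(f"invalid character in key {key!r}")
        # Values cannot contain TAB or the record delimiter: use spaces instead.
        value = str(value).replace("\t", " ").replace("\n", " ")
        fields.append(f"{key}:{value}")
    return "\t".join(fields)


line = format_ltsv({"time": 1469996508, "host": "testhostname", "message": "a\tb"})
```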

Flowgger expects LTSV records to match some very basic constraints:

  • A message MUST include a timestamp, under a key named time. The timestamp can be expressed in RFC 3339 format, as a Unix timestamp, or in English format.
  • A message MUST include a source host name, under the key host.
  • A message MAY include a description, under the key message.
  • A message MAY include a severity level, as a number (between 0 and 7, matching syslog severity levels), under the key level.
  • A message MAY include any number of additional key/value pairs.
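Before sending records, a producer could check these constraints itself. The sketch below is a hypothetical pre-flight check, not Flowgger's own validation: it only verifies that time and host are present and non-empty and that level, if given, is a number between 0 and 7. It does not try to validate the timestamp format.

```python
def check_flowgger_constraints(pairs: dict) -> list:
    """Return a list of problems found in already-parsed LTSV pairs."""
    problems = []
    # 'time' and 'host' are mandatory; treating empty values as missing
    # is an assumption of this sketch.
    for required in ("time", "host"):
        if not pairs.get(required):
            problems.append(f"missing required key: {required}")
    # 'level' is optional, but must be a syslog severity (0 to 7) when present.
    level = pairs.get("level")
    if level is not None:
        try:
            valid = 0 <= int(level) <= 7
        except ValueError:
            valid = False
        if not valid:
            problems.append("level must be a number between 0 and 7")
    return problems


ok = check_flowgger_constraints({"time": "1469996508", "host": "testhostname"})
bad = check_flowgger_constraints({"host": "testhostname", "level": "9"})
```

Here ok is an empty list, while bad reports both the missing time key and the out-of-range level.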

Here is an example of a valid LTSV record (\t has to be replaced with actual TAB characters):

time:1469996508\thost:testhostname\tname1:value1\tname 2: value 2\tn3:v3

In addition to being easy to generate, LTSV is also the fastest option in Flowgger.

The LTSV decoder can be enabled in the [input] section of Flowgger's configuration file:

[input]
format = "ltsv"

LTSV schemas

By design, and unlike JSON-based formats, values in LTSV records are not typed, and are assumed to be strings by default.

However, it may be desirable to enforce type constraints, and to retain the types when converting LTSV to typed formats such as JSON.

In order to do so, a schema can be defined for LTSV inputs, in an [input.ltsv_schema] section of the Flowgger configuration file:

[input.ltsv_schema]
counter = "u64"
amount = "f64"

Supported types are:

  • string
  • bool (boolean value)
  • f64 (floating-point number)
  • i64 (signed integer)
  • u64 (unsigned integer)
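To illustrate what applying such a schema amounts to, here is a minimal Python sketch that converts string values according to a schema dictionary. It is not Flowgger's implementation: the helper names are hypothetical, and the accepted boolean spellings ("true" and "false") are an assumption.

```python
import json

SCHEMA = {"counter": "u64", "amount": "f64"}


def to_bool(text: str) -> bool:
    # Assumption: only the literal strings "true" and "false" are accepted.
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"not a boolean: {text!r}")


def to_i64(text: str) -> int:
    n = int(text)
    if not -2**63 <= n <= 2**63 - 1:
        raise ValueError(f"out of range for i64: {text!r}")
    return n


def to_u64(text: str) -> int:
    n = int(text)
    if not 0 <= n <= 2**64 - 1:
        raise ValueError(f"out of range for u64: {text!r}")
    return n


CONVERTERS = {"string": str, "bool": to_bool, "f64": float,
              "i64": to_i64, "u64": to_u64}


def apply_schema(pairs: dict, schema: dict) -> dict:
    """Convert values to their schema type; keys not in the schema stay strings."""
    return {key: CONVERTERS[schema.get(key, "string")](value)
            for key, value in pairs.items()}


typed = apply_schema({"host": "h", "counter": "42", "amount": "3.14"}, SCHEMA)
as_json = json.dumps(typed)  # numbers are emitted without quotes
```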

Note that some of these values may not have an exact representation in the target format. For example, JavaScript, hence JSON (hence GELF), can only represent integers up to 2^53-1 without losing precision.
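The precision limit can be demonstrated with a short Python check, since Python floats are the same 64-bit doubles that JavaScript uses for numbers. The function name survives_as_double is made up for this example.

```python
MAX_SAFE_INTEGER = 2**53 - 1  # 9007199254740991


def survives_as_double(n: int) -> bool:
    """Return True if n is unchanged after a round trip through a 64-bit float."""
    return int(float(n)) == n


small = survives_as_double(MAX_SAFE_INTEGER)        # within the safe range
rounded = survives_as_double(MAX_SAFE_INTEGER + 2)  # 2**53 + 1 gets rounded
largest_u64 = survives_as_double(2**64 - 1)         # largest u64 gets rounded too
```

Values of type i64 or u64 above this limit may therefore change when a consumer decodes them as JavaScript numbers.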

LTSV automatic suffixing

Suffixes can be automatically added to keys with non-string values, as defined by the schema.

For example, Flowgger can ensure that keys with i64 and u64 values are always suffixed with _long, that keys with f64 values are always suffixed with _double, and that keys with boolean values are always suffixed with _bool.

This can be enabled via the configuration file:

[input.ltsv_schema]
quantity = "u64"
amount = "f64"
done = "bool"

[input.ltsv_suffixes]
i64 = "_long"
u64 = "_long"
f64 = "_double"
bool = "_bool"

In the previous example, an incoming message with the following key/value pairs:

{
    "quantity": 42,
    "amount": 3.14,
    "done": false
}

will be automatically transformed to:

{
    "quantity_long": 42,
    "amount_double": 3.14,
    "done_bool": false
}

This can be especially useful with Elasticsearch, which expects each field in an index to have a fixed type.

Property names will be transparently rewritten with the correct suffix for their value type, unless they are already properly suffixed.
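The renaming rule can be sketched as follows. This is a simplified Python model of the behavior described above, not Flowgger's code: the function name add_suffixes is hypothetical, and the schema is looked up using the key exactly as it appears in the message.

```python
SCHEMA = {"quantity": "u64", "amount": "f64", "done": "bool"}
SUFFIXES = {"i64": "_long", "u64": "_long", "f64": "_double", "bool": "_bool"}


def add_suffixes(pairs: dict, schema: dict, suffixes: dict) -> dict:
    """Append the suffix configured for each key's type, unless already present."""
    renamed = {}
    for key, value in pairs.items():
        # Keys without a schema entry are strings and get no suffix.
        suffix = suffixes.get(schema.get(key, "string"), "")
        if suffix and not key.endswith(suffix):
            key = key + suffix
        renamed[key] = value
    return renamed


message = {"quantity": 42, "amount": 3.14, "done": False}
result = add_suffixes(message, SCHEMA, SUFFIXES)
```

With the configuration shown earlier, result holds the keys quantity_long, amount_double and done_bool, and a key such as total_long declared as u64 would be left unchanged.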
