Skip to content

Commit

Permalink
Merge pull request #23 from jeqo/multi-modes
Browse files Browse the repository at this point in the history
Add storage modes to distribute storage and aggregation
  • Loading branch information
jeqo authored Mar 14, 2019
2 parents 12e83d4 + 63b5197 commit 731516a
Show file tree
Hide file tree
Showing 41 changed files with 1,891 additions and 1,095 deletions.
111 changes: 49 additions & 62 deletions DESIGN.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,10 +2,11 @@

## Goals

* Remove need for additional storage when Kafka is in place.
* Provide a fast and reliable storage that enable extensability via Kafka
Consumers.
* Provide dependency graph building on real-time via stream-processing.
* Provide a fast and reliable storage that enable extensibility via Kafka Topics.
* Provide full storage functionality via streaming aggregations (e.g., dependency graph).
* Create a processing space where additional enrichment can be plugged in into the processing
pipeline.
* Remove need for additional storage when Kafka is available.

### Zipkin Storage Component

Expand All @@ -32,101 +33,87 @@ A Zipkin Storage component has the following internal parts:
- `long endTs, lookback;`
- `int limit;`

Then we have 2 main interfaces to implement: `SpanConsumer` and `SpanStore`.

### Kafka Zipkin Storage

#### `KafkaSpanConsumer`

Kafka Span Consumer will be supported by a Kafka Producer which will receive a list
of Spans and will store them on a Kafka Topic (e.g. `zipkin-spans_v2`).

Then this Kafka topic will be used as a source for stream processors in charge
of support queries, covered by Span Store.

#### `KafkaSpanStore`

Kafka Span Store will need to support different kind of queries:

##### Get Trace

A Stream processor will take `spans` as input, and aggragate them into a trace
with key/value format: `key=trace_id, value=trace_payload` that will be used as
a source for this query. Target topic will be called: `zipkin-traces_v2`
Span Consumer ingest Spans collected by different transports (potentially Kafka as well).

##### Get Service Names
Collected Spans will be known as *Raw Spans* that represents spans with all metadata, including
tags and annotations. Raw spans are stored as-is in a Spans Kafka topic.

This query will be served by a Stream processor that will take `spans` as input
and turn it into a key/value representation with `key=service_name,
value=operation_name`. All keys will be used as a source.
A later process will turn `Raw Spans` into Spans without annotations and tags, known as
*Light Spans*.

##### Get Span Names
- `Raw Spans` are meant to be used for indexing as they contain most metadata. This spans are
partitioned by span id.
- `Light Spans` are meant to be used on aggregations as traces and dependency graphs. This spans are
partitioned by trace id, in order to be processed together.

Based on the stream processor described above, and in the key (i.e., service
name) selected by the previous operation, this query will be serve by a look by
key on the same store.

##### Get Dependencies
#### `KafkaSpanStore`

This query will be supported by a topic sourced from a stream-processor that
will be based on a previous work done here:
<https://github.com/sysco-middleware/zipkin-dependencies-streaming>
Span Store is expecting 2 source topics:

##### Get Traces
- `Raw Spans` and
- `Light Spans`

This query is the most complex one and will require additional capabilities
that are not supported by Kafka Streams yet: we need an index that support
`QueryRequest` properties. As an initial option, Lucene or
[Luwak](https://github.com/flaxsearch/luwak) will be used to create an
in-memory index that can handle these queries.
> These can be created by Span Consumer, or can be **enriched** by other Stream Processors, outside of
Zipkin Server.

## Implementation
Kafka Span Store will need to support different kind of queries:

This Zipkin storage implementation is based on Kafka Streams State Store.

### Kafka Span Consumer
##### Get Service Names/Get Span Names

`KafkaSpanConsumer` implementation is based on Kafka Producer API, that stores spans individually on
a Spans Kafka topic.
Service name to Span names pairs are indexed by aggregating `light` spans.

### Kafka Span Store
##### Get Trace/Find Traces

`KafkaSpanStore` is based on Kafka Streams and [Lucene](https://lucene.apache.org/).
Traces are processed in 2 different processes:

To support `getTrace`, `getServiceNames` and `getSpanNames` a stream-processor is:
- `Light spans` are aggregated into traces. This traces are stored to represent span DAG.
- `Raw spans` are indexed using a `Lucene` state store. This index enabled trace searches.

- Polling data from incoming spans created by `KafkaSpanConsumer`.
- Group spans on traces.
- Record service names and span names related.
- Creating service dependencies.
When search requests are received, span index is used to search for trace ids. After a list is
retrieved, trace DAG is retrieved from trace state store.

To support `getTraces` requires an index of spans tagged by service names, span names, id,
timestamp, tags, etc.
As trace are retrieved, trace are hydrated from index to return a complete trace.

A custom Kafka Stream store was implemented to collect traces and index them using Lucene.
##### Get Dependencies

**All Stores are using Global state, which consumes all partitions. This means that every instance
stores all data on a defined directory.** All these with the trade-off of removing the need for an
additional database.
After `light spans` are aggregated into traces, traces are processed to collect dependencies.
Dependencies changelog are stored in a Kafka topic to be be stored as materialized view on
Zipkin instances.

### Stream processors

#### Aggregation Stream Processor
#### Trace Aggregation Stream Processor

This is the main processors that take incoming spans and aggregate them into:

- Traces
- Service Names
- Dependencies

#### Store Stream Processor
![trace aggregation](docs/trace-aggregation-stream.png)

#### Store Stream Processors

Global tables for traces, service names and dependencies to be available on local state.

![trace store](docs/trace-store-stream.png)

![service store](docs/service-store-stream.png)

![dependency store](docs/dependency-store-stream.png)

#### Index Stream Processor

Custom processor to full-text indexing of traces using Lucene as back-end.

![span index](docs/span-index-stream.png)

#### Retention Stream Processor

This is the processor that keeps track of trace timestamps for cleanup.
This is the processor that keeps track of trace timestamps for cleanup.

![trace retention](docs/trace-retention-stream.png)
2 changes: 1 addition & 1 deletion Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ FROM openjdk:8
ARG KAFKA_STORAGE_VERSION=0.1.1

ENV ZIPKIN_REPO https://jcenter.bintray.com
ENV ZIPKIN_VERSION 2.12.1
ENV ZIPKIN_VERSION 2.12.5

# Use to set heap, trust store or other system properties.
ENV JAVA_OPTS -Djava.security.egd=file:/dev/./urandom
Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ all: build

OPEN := 'xdg-open'
MAVEN := './mvnw'
VERSION := '0.2.1-SNAPSHOT'
VERSION := '0.2.2-SNAPSHOT'

.PHONY: run
run: build zipkin-local
Expand Down
35 changes: 30 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,16 +2,36 @@

[![Build Status](https://www.travis-ci.org/jeqo/zipkin-storage-kafka.svg?branch=master)](https://www.travis-ci.org/jeqo/zipkin-storage-kafka)

Proof of concept of fully-featured Kafka-based storage for Zipkin.
Kafka-based storage for Zipkin.

*This is in experimentation phase at the moment. Don't use in production!*
> This is in experimentation phase at the moment.
```
+-----------------------------zipkin------------------------------------------
| +-->( service:span )
| +-->( span-index )
( collected-spans )-|->[ span-consumer ] [ span-store ]--+-->( traces )
| | ^ | +-->( dependencies )
+---------|-----------------------------------|----|--------------------------
| | |
------------------------------|-----------------------------------|----|--------------------
| | |
kafka +-->( raw-spans )--//potential//--+ +-->( traces )
| | |
+-->( light-spans )--//enriching//--+ +-->( dependencies )
--------------------------------------------------------------------------------------------
```

- [Design notes](DESIGN.md)

## Configuration

| Configuration | Description | Default |
|---------------|-------------|---------|
| `KAFKA_STORE_SPAN_CONSUMER_ENABLED` | Process spans collected by Zipkin server | `true` |
| `KAFKA_STORE_SPAN_STORE_ENABLED` | Aggregate and store Zipkin data | `true` |
| `KAFKA_BOOTSTRAP_SERVERS` | Kafka bootstrap servers, format: `host:port` | `localhost:9092` |
| `KAFKA_STORE_ENSURE_TOPICS` | Ensure topics are created if don't exist | `true` |
| `KAFKA_STORE_DIRECTORY` | Root path where Zipkin stores tracing data | `/tmp/zipkin` |
Expand All @@ -24,9 +44,9 @@ Proof of concept of fully-featured Kafka-based storage for Zipkin.
| `KAFKA_STORE_TRACES_TOPIC` | Topic where aggregated traces are stored. | `zipkin-traces` |
| `KAFKA_STORE_TRACES_TOPIC_PARTITIONS` | Traces topic number of partitions. | `1` |
| `KAFKA_STORE_TRACES_TOPIC_REPLICATION_FACTOR` | Traces topic replication factor. | `1` |
| `KAFKA_STORE_SERVICES_TOPIC` | Topic where aggregated service names are stored. | `zipkin-services` |
| `KAFKA_STORE_SERVICES_TOPIC_PARTITIONS` | Services topic number of partitions. | `1` |
| `KAFKA_STORE_SERVICES_TOPIC_REPLICATION_FACTOR` | Services topic replication factor. | `1` |
| `KAFKA_STORE_TRACE_SPANS_TOPIC` | Topic where aggregated service names are stored. | `zipkin-services` |
| `KAFKA_STORE_TRACE_SPANS_TOPIC_PARTITIONS` | Services topic number of partitions. | `1` |
| `KAFKA_STORE_TRACE_SPANS_TOPIC_REPLICATION_FACTOR` | Services topic replication factor. | `1` |
| `KAFKA_STORE_DEPENDENCIES_TOPIC` | Topic where aggregated service dependencies names are stored. | `zipkin-dependencies` |
| `KAFKA_STORE_DEPENDENCIES_TOPIC_PARTITIONS` | Services topic number of partitions. | `1` |
| `KAFKA_STORE_DEPENDENCIES_TOPIC_REPLICATION_FACTOR` | Services topic replication factor. | `1` |
Expand All @@ -35,6 +55,11 @@ Proof of concept of fully-featured Kafka-based storage for Zipkin.

To build the project you will need Java 8+.

```bash
make build
make test
```

### Run locally

To run locally, first you need to get Zipkin binaries:
Expand Down
Loading

0 comments on commit 731516a

Please sign in to comment.