Server components to receive, validate, convert, store, and process Telemetry data from the Mozilla Firefox browser.
Talk to us on irc.mozilla.org in the #telemetry channel, or visit the Project Wiki for more information.
See the TODO list for some outstanding tasks.
See StorageFormat for details.
See StorageLayout for details.
- Use RevisionCache to load the correct Histograms.json for a given payload
  - Use revision if possible
  - Fall back to appUpdateChannel and appBuildID or appVersion as needed
  - Use the Mercurial history to export each version of Histograms.json with the date range it was in effect for each repo (mozilla-central, -aurora, -beta, -release)
  - Keep local cache of Histograms.json versions to avoid re-fetching
- Filter out bad submission data
  - Invalid histogram names
  - Histogram configs that don't match the expected parameters (histogram type, num buckets, etc)
  - Keep metrics for bad data
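The validation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the function names, the shape of the spec dictionary, the payload field names (histogram_type, bucket_count), and the histogram names are all assumptions made for the example. The fallback rule is read here as "channel plus either build ID or version", which is also an assumption.

```python
from collections import Counter


def spec_lookup_key(info):
    """Pick how to look up Histograms.json: exact revision first, then fallbacks."""
    if info.get("revision"):
        return ("revision", info["revision"])
    channel = info.get("appUpdateChannel")
    build = info.get("appBuildID") or info.get("appVersion")
    if channel and build:
        return ("channel", channel, build)
    return None  # nothing usable; the caller decides how to handle this


def filter_histograms(histograms, spec, metrics):
    """Keep histograms that match the spec and count the rejected ones."""
    good = {}
    for name, hist in histograms.items():
        expected = spec.get(name)
        if expected is None:
            metrics["invalid_name"] += 1
        elif (hist.get("histogram_type") != expected["histogram_type"]
              or hist.get("bucket_count") != expected["bucket_count"]):
            metrics["config_mismatch"] += 1
        else:
            good[name] = hist
    return good


# Hypothetical spec and payload, invented for this example.
spec = {
    "EXAMPLE_TIMER": {"histogram_type": 0, "bucket_count": 50},
    "EXAMPLE_FLAG": {"histogram_type": 3, "bucket_count": 3},
}
histograms = {
    "EXAMPLE_TIMER": {"histogram_type": 0, "bucket_count": 50},
    "EXAMPLE_FLAG": {"histogram_type": 3, "bucket_count": 10},
    "EXAMPLE_TYPO": {"histogram_type": 0, "bucket_count": 50},
}
metrics = Counter()
good = filter_histograms(histograms, spec, metrics)
print(sorted(good), dict(metrics))
```

Counting rejections in a Counter keeps the "metrics for bad data" separate from the filtering itself, so the counts can be reported however the surrounding code prefers.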
We have implemented a lightweight MapReduce framework that uses the operating system's support for parallelism. It relies on simple Python functions for the Map, Combine, and Reduce phases.
For data stored on multiple machines, each machine will run a combine phase, with the final reduce combining output for the entire cluster.
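To make the three phases concrete, here is a small self-contained sketch. It is not the framework's actual API: run_partition, run_job, and the example job functions are hypothetical names, and each inner list stands in for the data held by one machine or process.

```python
from collections import defaultdict
from functools import partial


def run_partition(mapper, combiner, records):
    """Map every record of one partition, then combine values per key locally."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return {key: combiner(key, values) for key, values in grouped.items()}


def run_job(partitions, mapper, combiner, reducer, map_fn=map):
    """Run map and combine per partition, then reduce across all partitions.

    map_fn is pluggable: the builtin map runs the partitions serially.
    """
    worker = partial(run_partition, mapper, combiner)
    merged = defaultdict(list)
    for partition_result in map_fn(worker, partitions):
        for key, value in partition_result.items():
            merged[key].append(value)
    return {key: reducer(key, values) for key, values in merged.items()}


# Example job: count submissions per operating system.
def count_os(record):
    yield record["os"], 1


def total(key, values):
    return sum(values)


partitions = [
    [{"os": "Linux"}, {"os": "WINNT"}],
    [{"os": "WINNT"}, {"os": "Darwin"}, {"os": "WINNT"}],
]
result = run_job(partitions, count_os, total, total)
print(sorted(result.items()))
```

Because the combine step runs once per partition, only one small dictionary per partition reaches the final reduce. Passing a process pool's map method as map_fn (for example from Python's multiprocessing module) should spread partitions across processes, provided the job functions are defined at module level so they can be pickled.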
Telemetry data can optionally be imported into MongoDB. The benefit is a shorter run time for multiple map-reduce jobs on the same dataset, because MongoDB keeps as much data as possible in memory.
- Start mongodb, e.g.
mongod --nojournal
- Fetch a dataset from S3, e.g.
aws s3 cp s3://... /mnt/yourdataset --recursive
- Import the dataset, e.g.
python3 -m mongodb.importer /mnt/yourdataset
- Run a map-reduce job, e.g.
mongo localhost/telemetry mongodb/examples/osdistribution.js
Once we have the converter and MapReduce framework available, we can easily consume from the existing Telemetry data source. This will mark the first point that the new dashboards can be fed with live data.
Integration with the existing pipeline is discussed in more detail on the Bagheera Integration page.
When everything is ready and productionized, we will route the client (Firefox) submissions directly into the new pipeline.
These are the important parts of the Telemetry Server architecture.
Contains the Node.js HTTP server for receiving payloads. The server's job is simply to write incoming submissions to disk as quickly as possible.
It accepts single submissions using the same type of URLs supported by Bagheera, and expects (but doesn't require) the partition information to be submitted as part of the URL.
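As an illustration of partition information carried in the URL, the sketch below splits a submission path into a document id and a list of partition values. The function name and the interpretation (first segment is the id, remaining segments are partition values) are assumptions made for the example; the server's real parsing rules may differ.

```python
def parse_submission_path(path, prefix="/submit/telemetry/"):
    """Split a submission path into (document id, partition values)."""
    if not path.startswith(prefix):
        raise ValueError("unexpected path: %r" % path)
    parts = [p for p in path[len(prefix):].split("/") if p]
    if not parts:
        raise ValueError("missing document id: %r" % path)
    # Partition values are optional, so the list may be empty.
    return parts[0], parts[1:]


doc_id, partitions = parse_submission_path("/submit/telemetry/foo/bar/baz")
print(doc_id, partitions)
```

Returning an empty list when no partition values are present mirrors the "expects but doesn't require" behaviour described above.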
To set up a test server locally:
- Install node.js (left as an exercise to the reader)
- Edit http/server_config.json, replacing log_path and stats_log_file with directories suitable to your machine
- Run the server using
cd http; node ./server.js ./server_config.json
- Send some test data to the server. Using curl:
curl -X POST http://127.0.0.1:8080/submit/telemetry/foo/bar/baz -d '{"test": 1}'
- Stop the server, and check that there is a telemetry.log.<something>.finished file in the directory you specified in step 2 above.
You can examine the resulting file in python (from the root of the repo):
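from __future__ import print_function  # same output on Python 2 and 3
import telemetry.util.files as fu

# Print the fields of every record in the finished log file.
for r in fu.unpack('/path/to/telemetry.log.<something>.finished'):
    print("URL Path:", r.path)
    print("JSON Payload:", r.data)
    print("Submission Timestamp:", r.timestamp)
    print("Submission IP:", r.ip)
    print("Error (if any):", r.error)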
Contains the Converter class, which is used to convert a JSON payload from the raw form submitted by Firefox to the more compact storage format for on-disk storage and processing.
You can run the main method in this file to process a given data file (the expected format is one record per line, each line containing an id followed by a tab character, followed by a json string).
You can also use the Converter class to convert data in a more flexible way.
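The expected input format (an id, a tab character, then a JSON string on each line) can be read with a few lines of Python. This is only a sketch of the file format; parse_record and read_records are hypothetical helpers, not part of the Converter class, and the sample line is invented.

```python
import json


def parse_record(line):
    """Split one 'id<TAB>json' line into (id, decoded payload)."""
    record_id, sep, raw = line.rstrip("\n").partition("\t")
    if not sep:
        raise ValueError("no tab separator in line")
    return record_id, json.loads(raw)


def read_records(lines):
    """Yield (id, payload) pairs, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_record(line)


# Invented sample data: one valid record and one blank line.
sample = ['abc123\t{"ver": 1, "histograms": {}}\n', "\n"]
records = list(read_records(sample))
print(records)
```

Splitting on the first tab only means tab characters inside the JSON string do not break the parse.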
Contains code to export data to Amazon S3.
Contains the StorageLayout class, which is used to save payloads to disk using the directory structure as documented in the storage layout section above.
Contains the RevisionCache class, which provides a mechanism for fetching the Histograms.json spec file for a given revision URL. Histogram data is cached locally on disk and in-memory as revisions are requested.
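The disk-plus-memory caching described here can be sketched as below. This is not the RevisionCache API: TwoLevelCache, its methods, the on-disk file naming, and the fake fetch function are all hypothetical, and a real fetch would download Histograms.json over HTTP.

```python
import hashlib
import json
import os
import tempfile


class TwoLevelCache(object):
    """Memory-then-disk cache keyed by revision URL (hypothetical sketch)."""

    def __init__(self, cache_dir, fetch):
        self._cache_dir = cache_dir
        self._fetch = fetch  # callable: revision URL -> parsed JSON spec
        self._memory = {}

    def _disk_path(self, revision_url):
        digest = hashlib.sha1(revision_url.encode("utf-8")).hexdigest()
        return os.path.join(self._cache_dir, digest + ".json")

    def get(self, revision_url):
        if revision_url in self._memory:
            return self._memory[revision_url]
        path = self._disk_path(revision_url)
        if os.path.exists(path):
            with open(path) as f:
                spec = json.load(f)
        else:
            spec = self._fetch(revision_url)
            with open(path, "w") as f:
                json.dump(spec, f)
        self._memory[revision_url] = spec
        return spec


calls = []


def fake_fetch(url):
    """Stand-in for a network fetch; records each call it receives."""
    calls.append(url)
    return {"EXAMPLE_HISTOGRAM": {"kind": "linear"}}


cache_dir = tempfile.mkdtemp()
url = "http://hg.mozilla.org/releases/mozilla-release/rev/a55c55edf302"
first = TwoLevelCache(cache_dir, fake_fetch)
first.get(url)
first.get(url)                      # served from memory
second = TwoLevelCache(cache_dir, fake_fetch)
spec_from_disk = second.get(url)    # a new instance reads the disk copy
print(len(calls))
```

Hashing the URL for the file name avoids having to escape slashes and colons; any stable, filesystem-safe encoding would do.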
Contains the TelemetrySchema class, which encapsulates logic used by the StorageLayout and MapReduce code.
Contains the multi-process version of the data-transformation code. This is used to download incoming data (as received by the HTTP server), validate and convert it, then publish the results back to S3.
Contains the C++ data validation and conversion routines.
- Clang 3.1 or GCC 4.7.0 or Visual Studio 10
- CMake (2.8.7+) - http://cmake.org/cmake/resources/software.html
- Boost (1.54.0) - http://www.boost.org/users/download/
- zlib
- OpenSSL
- Protobuf
- Graphviz (2.28.0) - http://graphviz.org/Download..php
- Doxygen (1.8+) - http://www.stack.nl/~dimitri/doxygen/download.html#latestsrc
mkdir release
cd release
cmake -DCMAKE_BUILD_TYPE=release ..
make
- heka_server (string) - Hostname:port of the heka log/stats service.
- histogram_server (string) - Hostname:port of the histogram.json web service.
- telemetry_schema (string) - JSON file containing the dimension mapping.
- storage_path (string) - Converter output directory.
- upload_path (string) - Staging directory for S3 uploads.
- max_uncompressed (int) - Maximum uncompressed size of a telemetry record.
- memory_constraint (int) -
- compression_preset (int) -
{
    "heka_server": "localhost:5565",
    "telemetry_schema": "telemetry_schema.json",
    "histogram_server": "localhost:9898",
    "storage_path": "storage",
    "upload_path": "upload",
    "max_uncompressed": 1048576,
    "memory_constraint": 1000,
    "compression_preset": 0
}
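A configuration like the one above can be sanity-checked before starting the converter. The sketch below is a hypothetical helper, not part of the project: it only verifies that each documented key is present and has the documented type (string or int), and it assumes Python 3, where JSON strings decode to str.

```python
import json

# Key names and types as listed in the option descriptions.
EXPECTED_TYPES = {
    "heka_server": str,
    "telemetry_schema": str,
    "histogram_server": str,
    "storage_path": str,
    "upload_path": str,
    "max_uncompressed": int,
    "memory_constraint": int,
    "compression_preset": int,
}


def check_config(text):
    """Return a sorted list of problems found in a JSON configuration document."""
    config = json.loads(text)
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append("missing key: " + key)
        elif isinstance(config[key], bool) or not isinstance(config[key], expected):
            # bool is rejected explicitly because it is a subclass of int
            problems.append("wrong type for " + key)
    return sorted(problems)


sample = """{
    "heka_server": "localhost:5565",
    "telemetry_schema": "telemetry_schema.json",
    "histogram_server": "localhost:9898",
    "storage_path": "storage",
    "upload_path": "upload",
    "max_uncompressed": 1048576,
    "memory_constraint": 1000,
    "compression_preset": 0
}"""
print(check_config(sample))
```

The check is deliberately shallow: it does not verify that the paths exist or that the servers are reachable.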
pushd http
../bin/get_histogram_tools.sh
popd
python -m http.histogram_server
In the release directory:
mkdir input
./convert convert.json input.txt
# input.txt should contain a list of files to process (newline delimited)
# i.e. /<path to telemetry-server>/release/input/telemetry1.log
From another shell, in the release directory:
cp ../process_incoming/worker/common/test/data/telemetry1.log input
Without the histogram server running it will produce something like this:
processing file:"telemetry1.log"
LoadHistogram - connect: Connection refused
ConvertHistogramData - histogram not found: http://hg.mozilla.org/releases/mozilla-release/rev/a55c55edf302
done processing file:"telemetry1.log" processed:1 failures:1 time:0.001871 throughput (MiB/s):9.3563 data in (B):18356 data out (B):0
With the histogram server running:
processing file:"telemetry1.log"
done processing file:"telemetry1.log" processed:1 failures:0 time:0.013622 throughput (MiB/s):1.2851 data in (B):18356 data out (B):45909
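The throughput figure in these summary lines appears to be the input size divided by the elapsed time, expressed in MiB/s. The sketch below recomputes it from the second sample line; parse_summary is a hypothetical helper, and the formula is inferred from the numbers shown rather than taken from the converter's source.

```python
import re


def parse_summary(line):
    """Extract elapsed seconds, input bytes, and reported MiB/s from a summary line."""
    seconds = float(re.search(r"time:([\d.]+)", line).group(1))
    bytes_in = int(re.search(r"data in \(B\):(\d+)", line).group(1))
    reported = float(re.search(r"throughput \(MiB/s\):([\d.]+)", line).group(1))
    return seconds, bytes_in, reported


line = ('done processing file:"telemetry1.log" processed:1 failures:0 '
        'time:0.013622 throughput (MiB/s):1.2851 data in (B):18356 '
        'data out (B):45909')
seconds, bytes_in, reported = parse_summary(line)

# bytes per second, converted to MiB (1 MiB = 1024 * 1024 bytes)
computed = bytes_in / seconds / (1024.0 * 1024.0)
print(round(computed, 4), reported)
```

For both sample lines the recomputed value agrees with the logged throughput to four decimal places, which supports the inferred formula.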
apt-get install cmake libprotoc-dev zlib1g-dev libboost-system1.54-dev \
libboost-filesystem1.54-dev libboost-thread1.54-dev libboost-test1.54-dev \
libboost-log1.54-dev libboost-regex1.54-dev protobuf-compiler libssl-dev \
liblzma-dev xz-utils
Contains the MapReduce code. This is the interface for running jobs on Telemetry data. There are example job scripts and input filters in the examples/ directory.
Contains scripts to provision and launch various kinds of cloud services. This includes launching a telemetry server node, a MapReduce job, or a node to process incoming data.
Contains the configuration used by Heka to process server logs.