Skip to content

Latest commit

 

History

History
63 lines (50 loc) · 1.98 KB

README.md

File metadata and controls

63 lines (50 loc) · 1.98 KB

ServiceX DataStream

Event Data Streaming Service for ServiceX. This service accepts flattened N-tuples ans streams them out for analysis using Kafka.

Installation

The Datastream Service runs inside a Kubernetes cluster.

Install Kafka

$ helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
$ kubectl create ns kafka
$ helm install --name my-kafka --namespace kafka incubator/kafka

Create a namespace

We will put all of our applicaton pods in the servicex namespace.

% kubectl create namespace servicex

Set up a shared volume for Event Data

Create a persistent volume claim called servicex-pvc. We create this with

% kubectl -n servicex create -f kube/pvc.yml

To make it easier to work with this persistent volume, we will create a busybox pod with the volume mounted.

% kubectl -n servicex create -f busybox.yml

When the pod is ready, you can create a shell with

% kubectl exec -it -n servicex busybox sh

you can see the mount under /servicex

You can copy a sample xAOD Root file into the shared volume using this pod with

% kubectl cp AOD.11182705._000001.pool.root.1 servicex/busybox:servicex/AOD.11182705._000001.pool.root.1

Run the Transformer on the Data

We use a containerized transformer from ServiceX_transformer to read the xAOD File and reduce it to flattened n-tuples.

% kubectl -n servicex create -f transform_job.yml

When this job complets, there will be two files in the shared volume:

  • flat_file.root: Flattened n-tuple root file
  • xaodBranches.txt: Dump of all of the branch names from the original file

Acknowledgements

This project is supported by National Science Foundation under Cooperative Agreement OAC-1836650. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.