uprush/rosetta

Rosetta

Log Management Made Easy

Rosetta makes log management easier by automating the setup of several open source log management tools.

Features include:

  • Real-time log collection.
  • Log search.
  • Dashboard.
  • Sending logs to Amazon S3.
  • Fully automated setup.

Logs are also stored in S3

Logs are stored in JSON format, which is easy to parse and makes it simple to add new fields to each entry.

{"host":"104.24.56.37","user":null,"method":"GET","path":"/item/jewelry/113","code":200,"size":138,"referer":null,"agent":"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)","@node":"ip-172-31-11-76","@timestamp":"2013-11-13T04:01:17.000Z","@version":"1","type":"apache_access","tags":["apache_access"]}
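Because each entry is a single JSON object on one line, any JSON library can read it and attach extra fields. The following is a minimal Python sketch under that one-object-per-line assumption; the `add_field` helper and the `env` field are hypothetical illustrations and not part of Rosetta.

```python
import json

# Shortened copy of the sample entry above.
line = ('{"host":"104.24.56.37","method":"GET","path":"/item/jewelry/113",'
        '"code":200,"size":138,"type":"apache_access","tags":["apache_access"]}')


def add_field(raw_line, key, value):
    """Parse one JSON log line, add a field, and re-serialize it compactly."""
    entry = json.loads(raw_line)
    entry[key] = value
    return json.dumps(entry, separators=(",", ":"))


entry = json.loads(line)
print(entry["path"], entry["code"])

# "env" is a hypothetical field name used only for illustration.
enriched = add_field(line, "env", "aws")
print(json.loads(enriched)["env"])
```

Existing fields are left untouched, so consumers that do not know about the new field can keep reading the entry as before.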

Use Hive to query JSON logs.

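-- Each line of the S3 files is loaded as a single JSON string column.
CREATE EXTERNAL TABLE apache_logs
(
  log STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE
LOCATION 's3://rosetta-logs/apache';

-- Extract three JSON fields and list up to 100 non-200 responses.
-- json_tuple returns strings, so code is compared against '200'.
-- The alias ts avoids TIMESTAMP, which is a reserved word in newer Hive releases.
SELECT b.*
FROM apache_logs a
LATERAL VIEW json_tuple(a.log, '@timestamp', 'code', 'path') b
AS ts, code, path
WHERE b.code != '200'
LIMIT 100;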

Find more Hive and Redshift query samples in rosetta/aws/commands.sh.
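For spot-checking a small sample of log lines without a Hive cluster, the same filter as the Hive query above can be approximated locally. This is a rough Python sketch, not part of Rosetta; the `non_200_entries` function name and the sample lines are hypothetical.

```python
import json


def non_200_entries(lines, limit=100):
    """Yield (@timestamp, code, path) for entries whose code is not 200."""
    found = 0
    for raw in lines:
        try:
            entry = json.loads(raw)
        except ValueError:
            continue  # skip malformed lines instead of aborting the scan
        if not isinstance(entry, dict):
            continue
        code = entry.get("code")
        # Entries without a code are skipped, as a NULL comparison would be in SQL.
        if code is None or code == 200:
            continue
        yield entry.get("@timestamp"), code, entry.get("path")
        found += 1
        if found >= limit:
            return


# Hypothetical sample lines for illustration only.
sample = [
    '{"@timestamp":"2013-11-13T04:01:17.000Z","code":200,"path":"/item/jewelry/113"}',
    '{"@timestamp":"2013-11-13T04:01:18.000Z","code":404,"path":"/missing"}',
    'not json',
]
results = list(non_200_entries(sample))
print(results)
```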

Pre-requisites

  • Ubuntu 13.04 (Raring Ringtail)

Architecture

There are five components:

  • Rosetta Agent: A Fluentd daemon running on each server where logs are generated.
  • Rosetta Broker: A Redis server acting as a FIFO queue. Logs collected by the agents are sent to the broker.
  • Rosetta Filter: A LogStash server that consumes input from the broker, filters and manipulates each entry, and outputs the result to multiple targets.
  • Rosetta Indexer: An ElasticSearch cluster that indexes the logs.
  • Rosetta Dashboard: Kibana as the front end of the ElasticSearch cluster. It provides a search interface and customizable dashboards.
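The data flow between the components can be pictured with a small in-memory simulation. This Python sketch is only a conceptual model: a `deque` stands in for the Redis FIFO queue, plain lists stand in for the indexer and an archive target (assumed here to be S3), and the function names are hypothetical. It does not use Fluentd, Redis, or LogStash.

```python
import json
from collections import deque

broker = deque()  # stands in for the Redis FIFO queue


def agent_collect(raw_line, node):
    """Agent: wrap a raw log line as JSON and push it to the broker."""
    broker.append(json.dumps({"message": raw_line, "@node": node}))


def filter_process(outputs):
    """Filter: drain the broker in FIFO order, enrich each entry, fan out."""
    while broker:
        entry = json.loads(broker.popleft())
        entry["tags"] = ["apache_access"]  # example manipulation step
        for out in outputs:
            out.append(entry)


indexer, archive = [], []  # stand-ins for the indexer and an archive target
agent_collect("GET /item/jewelry/113 200", "ip-172-31-11-76")
agent_collect("GET /missing 404", "ip-172-31-11-76")
filter_process([indexer, archive])
print(len(indexer), len(archive), indexer[0]["message"])
```

Placing a queue between the agents and the filter lets the agents keep shipping logs while the filter is busy or briefly unavailable, and entries leave the queue in the order they arrived.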

Rosetta Architecture

Setup

Deployment is automated using Chef-solo and Capistrano. Setup should take only a few commands.

The first step is to bootstrap a Rosetta operation center.

curl https://raw.github.com/uprush/rosetta/master/bootstrap.sh | bash

All operations are defined as Capistrano tasks. The tasks are expected to be run from the operation center.

Execute the following steps on the operation center to set up Rosetta components:

  1. Export Rosetta global environment variables, including ROSETTA_ENV and ROSETTA_ZONE.
  2. Change your configuration in rosetta/chef/attributes/<ROSETTA_ENV>.yml.
  3. Configure the server IPs of each of the components in rosetta/config/<ROSETTA_ENV>-<ROSETTA_ZONE>.rb.
  4. Distribute Chef cookbooks to target servers.
  5. Set up the component.

A sample setup flow is as follows:

# 1. Export environment variables
export ROSETTA_ENV=aws
export ROSETTA_ZONE=a

# 2. Change environment attributes
vi rosetta/chef/attributes/aws.yml

# 3. Change zone specific configurations like component server IPs
vi rosetta/config/aws-a.rb

# 4. Distribute Chef cookbooks to rosetta agents
cd rosetta
TARGET=agent cap rosetta:common:dist_cookbook

# 5. Set up rosetta agents
cap rosetta:agent:setup
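The two configuration files edited in steps 2 and 3 are named after ROSETTA_ENV and ROSETTA_ZONE. As a quick way to see which files a given environment and zone map to, here is a small Python sketch; the `rosetta_config_paths` helper is hypothetical and not shipped with Rosetta, and it only builds path strings following the naming pattern described above.

```python
import os


def rosetta_config_paths(env=None, zone=None, root="rosetta"):
    """Return the config file paths for an environment and zone.

    Falls back to the ROSETTA_ENV and ROSETTA_ZONE environment variables
    when the arguments are not given.
    """
    env = env or os.environ.get("ROSETTA_ENV")
    zone = zone or os.environ.get("ROSETTA_ZONE")
    if not env or not zone:
        raise ValueError("ROSETTA_ENV and ROSETTA_ZONE must both be set")
    return {
        "attributes": f"{root}/chef/attributes/{env}.yml",
        "zone_config": f"{root}/config/{env}-{zone}.rb",
    }


paths = rosetta_config_paths("aws", "a")
print(paths["attributes"])
print(paths["zone_config"])
```

With the sample values above, the two paths match the files opened in steps 2 and 3 of the sample flow.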

To see a list of all Capistrano tasks:

cd ~/rosetta && cap -T
