- Python 3.6+

```
git clone https://github.com/Qumulo/power-tools
cd power-tools
```
This python script uses the Qumulo API to quickly walk a Qumulo filesystem tree in parallel. The output is a single file listing every file and directory along with its size. Look in the script to see other available file metadata you might wish to search or track.
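At a high level, the walk fans out over directories as they are discovered, roughly like the sketch below. This is illustrative only: `list_children()` and the entry fields are stand-ins for the actual Qumulo API calls and response shapes the script uses.

```python
import concurrent.futures

def walk(root, list_children, workers=16):
    """Breadth-first parallel walk; collects (path, size) for every entry.

    list_children(path) is a hypothetical stand-in for a Qumulo API call
    that returns the entries of a single directory.
    """
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        in_flight = {pool.submit(list_children, root)}
        while in_flight:
            # Collect whichever directory listings have finished, then
            # immediately queue up listings for any subdirectories found.
            done, in_flight = concurrent.futures.wait(
                in_flight, return_when=concurrent.futures.FIRST_COMPLETED)
            for future in done:
                for entry in future.result():
                    results.append((entry["path"], entry["size"]))
                    if entry["type"] == "directory":
                        in_flight.add(
                            pool.submit(list_children, entry["path"]))
    return results
```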
Performance results:

- 3,000+ files per second
- 100+ directories per second

Measured using a 4-node Qumulo QC24 cluster and a Mac with a 2.3 GHz 8-core Intel i7 CPU.
- Make sure you have the Python requirements installed:

```
pip install -r requirements.txt
```
Run the script against your cluster, for example:

```
python api-tree-walk.py -s product -p ********* -d /home
```
We know you're excited to get your Qumulo API data into your centralized databases and monitoring systems. Use this script to send activity (throughput, data and metadata IOPS) by path and client into InfluxDB, Elasticsearch, PostgreSQL, Splunk, and/or CSV files.
- Install the Python 3 prerequisites:

```
pip3 install qumulo_api
```
- Copy `sample-config.json` to `config.json`, then specify the databases and Qumulo clusters you wish to use in the new `config.json` file.
- Run `python api-to-dbs.py` to confirm everything works.
- Add the `python api-to-dbs.py` command to your crontab to run every one or two minutes with something like:

```
* * * * * cd /location/of/the-power-tools; python api-to-dbs.py >> api-to-dbs.log.txt 2>&1
```
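If the cron job runs but nothing shows up in your databases, one quick sanity check is to make sure `config.json` is valid JSON. This is a generic check, not part of the power-tools themselves:

```python
import json

# json.load raises an error with a line/column number if config.json
# contains a syntax mistake (a trailing comma, a missing quote, etc.).
with open("config.json") as f:
    config = json.load(f)

print(sorted(config.keys()))  # the top-level sections you configured
```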
The "QUMULO_CLUSTERS"
section allows for multiple Qumulo clusters to be tracked.
"QUMULO_CLUSTERS":[
sample-config.json
is currently only set up to save data to hourly csv files. If you want to send Qumulo data to other databases, here is the json configurations you can add to your config.json
:
"host": "product.example.com",
"user": "admin",
"password": ""
"client_ip_regex": "10.0.0.*"
host (required): Hostname or IP of the Qumulo cluster API server
user (required): Username of account with API access
password (required): Password for API user
client_ip_regex (optional): a regex based filter for the clients accessing the Qumulo cluster. This allows for cleaning up metrics for only those clients that you are interested in storing metrics.
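As a rough illustration of how a `client_ip_regex` of `"10.0.0.*"` narrows the data down (the field names and exact matching logic in api-to-dbs.py may differ):

```python
import re

client_ip_regex = re.compile("10.0.0.*")

# Hypothetical per-client activity entries, shaped only for this example.
entries = [
    {"ip": "10.0.0.15", "path": "/home/alice", "iops": 120},
    {"ip": "192.168.1.9", "path": "/home/bob", "iops": 80},
]

# Keep only clients whose IP matches the configured pattern.
kept = [e for e in entries if client_ip_regex.match(e["ip"])]
print(kept)  # only the 10.0.0.15 entry remains
```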
"influx": {"host": "influxdb.example.com",
"db": "qumulo",
"measurement": "qumulo_fs_activity"
},
"postgres": {"host": "db.example.com",
"db": "qumulo_data",
"user": "postgres",
"pass": ""
},
"elastic": {"host": "elasticdb.example.com",
"index": "qumulo",
"type": "qumulo_fs_activity"
},
"splunk": {"host": "elasticdb.example.com",
"token": "2ea5c4dd-dc73-4b89-af26-2ea6026d0d39",
"event": "qumulo_fs_activity"
},
For the various databases, you will need to have your own DB server set up to receive the data. You will also potentially need to create a new database within the DB server. For Splunk, you will need to enable the HTTP Event Collector (the /services/collector/event endpoint) in Splunk and generate a token for the configuration file.
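What the Splunk output amounts to is an HTTP Event Collector POST, roughly like the sketch below. The host, port (8088 is the HEC default), and token are placeholders, and the event fields are made up for illustration; api-to-dbs.py builds its own payload.

```python
import json
import urllib.request

url = "https://splunk.example.com:8088/services/collector/event"
token = "2ea5c4dd-dc73-4b89-af26-2ea6026d0d39"  # the HEC token from config.json

# A single activity event; the real field names come from api-to-dbs.py.
payload = {"event": {"source": "qumulo_fs_activity",
                     "path": "/home", "ip": "10.0.0.15", "iops": 120}}

request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Authorization": "Splunk " + token,
             "Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(response.read())  # Splunk answers with {"text": "Success", ...}
```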
The configuration parameters are pretty self-explanatory and described below.
"IOPS_THRESHOLD": 1,
"THROUGHPUT_THRESHOLD": 10000,
"DIRECTORY_DEPTH_LIMIT": 4,
"DIRECTORIES_ONLY": true,
"DEBUG": true
- "IOPS_THRESHOLD"- the minimum total IOPS value per path+client required for being saved into the database
- "THROUGHPUT_THRESHOLD" - the minimum total throughput value per path+client required for being saved into - the database
- "DIRECTORY_DEPTH_LIMIT" - the maximum directory depth to be tracked in the paths
- "DIRECTORIES_ONLY" - only story directory names. if false then we go down to file names.
- "DEBUG" - verbose logging