A scalable machine learning crawler framework that automatically crawls and downloads data from a list of 'known' devices, processes that data using a device-specific model (which automatically updates when new data is found), and uploads the processed data back into ChainAPI.
chaindataprocessor is part of LearnAir, a master's thesis.
## Documentation and examples
Code written for LearnAir includes:
- chainCrawler and chainSearcher - a web crawler and a breadth-first-search tool for the semantic web data architecture ChainAPI
- chainTraverser and chainDataPush - a stateful web spider to traverse, upload, modify, and interact with ChainAPI nodes and data, including pushing data from Excel files
- chainProcessor - a scalable machine learning crawler framework that automatically crawls and downloads data from a list of 'known' device types in ChainAPI, processes that data using a device-specific model (which automatically updates when new data is found), and uploads the processed data back into ChainAPI
- an Air Quality Ontology Adaptation of ChainAPI (original tool written by Spencer Russell et al.) - an air quality data ontology written with ChainAPI, a semantic web, RESTful sensor API
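
As a rough illustration of the crawl-then-process idea behind the tools listed above, here is a minimal sketch. It is not the code from this repo: the in-memory `GRAPH` dictionary stands in for ChainAPI resources and their links, and the names `crawl`, `process_known_devices`, `MODELS`, and the `device_type` field are all hypothetical. A real run would fetch resources over HTTP, upload the results back to ChainAPI, and retrain each model as new data arrives; none of that is shown here.

```python
from collections import deque

# Hypothetical in-memory stand-in for a ChainAPI link graph: URI -> resource.
GRAPH = {
    "/sites/1": {"links": ["/devices/1", "/devices/2"]},
    "/devices/1": {"device_type": "known_sensor", "links": [], "data": [1.0, 2.0, 3.0]},
    "/devices/2": {"device_type": "unknown_sensor", "links": [], "data": [5.0]},
}


def crawl(start, graph):
    """Yield (uri, resource) pairs in breadth-first order, visiting each URI once."""
    seen = {start}
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        resource = graph[uri]
        yield uri, resource
        for link in resource["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)


def process_known_devices(start, graph, models):
    """Apply the model registered for each known device type; skip everything else."""
    processed = {}
    for uri, resource in crawl(start, graph):
        model = models.get(resource.get("device_type"))
        if model is not None:
            processed[uri] = [model(x) for x in resource["data"]]
    return processed


# Placeholder "model": a fixed scaling, standing in for a trained device-specific model.
MODELS = {"known_sensor": lambda x: 2.0 * x}

result = process_known_devices("/sites/1", GRAPH, MODELS)
print(result)
```

Only devices whose type has a registered model are processed, which mirrors the 'known' device list described above; everything else is crawled past without being touched.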
Additional resources include:
- the thesis document (full documentation and motivation, especially Chapter 6, ChainAPI for Air Quality)
- the repo for the thesis document
- Jupyter notebooks used in data pre-processing, machine learning, and plot generation (with raw data)
- a quick video introducing the LearnAir concept
- the original ChainAPI project