Adaptor jobs have two fundamental modes of operation:
- Bounded mode - The pipeline runs once each time the specified schedule fires.
- Unbounded mode - The pipeline is always running.
A pipeline may be specified in JSON format and submitted to the framework server, which auto-generates the JAR files and runs them. A configuration file should follow the spec outline below.
{
  "name": "<unique name for this adaptor>",
  "schedulePattern": "<cron-like schedule pattern>",
  "adaptorType": "ETL",
  "failureRecoverySpec": {
  },
  "inputSpec": {
  },
  "parseSpec": {
  },
  "deduplicationSpec": {
  },
  "transformSpec": {
  },
  "publishSpec": {
  }
}
Detailed explanations of the individual specs are given below.
- Meta Spec
- Failure Recovery Spec
- Input Spec
- Parse Spec
- Deduplication Spec
- Transformation Spec
- Publish Spec
The spec can then be submitted to the adaptor server, which validates it and generates a JAR for the entire pipeline.
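
Before submitting, it can help to check a configuration file locally against the outline above. The following is a minimal sketch, not part of the framework: the function name `check_outline` is hypothetical, and it assumes that every top-level key in the outline is required and that each `*Spec` entry must be a JSON object. The server's own validation is authoritative and may apply different or stricter rules.

```python
import json

# Top-level keys taken from the spec outline above.
# Assumption: all of them are required; the server may treat some as optional.
REQUIRED_KEYS = [
    "name",
    "schedulePattern",
    "adaptorType",
    "failureRecoverySpec",
    "inputSpec",
    "parseSpec",
    "deduplicationSpec",
    "transformSpec",
    "publishSpec",
]

# The nested sections are the keys whose names end in "Spec".
SPEC_SECTIONS = [key for key in REQUIRED_KEYS if key.endswith("Spec")]


def check_outline(spec_text):
    """Return a list of problems in the top-level outline (empty if none)."""
    spec = json.loads(spec_text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(spec, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in spec]
    problems += [
        f"{key} must be a JSON object"
        for key in SPEC_SECTIONS
        if key in spec and not isinstance(spec[key], dict)
    ]
    return problems
```

Calling `check_outline(open("adaptor.json").read())` (with `adaptor.json` standing in for your own file name) returns an empty list when the outline is complete. It only inspects the top level and says nothing about the contents of the individual specs.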