This README provides a brief guide to setting up Dozer for real-time data ingestion from an AWS S3 bucket. For a more comprehensive tutorial, please refer to our blog post.

Prerequisites:
- AWS account with access to S3 services
- AWS CLI installed and configured
- Python installed
- Dozer installed

Steps:

- Generate and Upload Data to S3: Use a Python script to generate a dataset and upload it to an S3 bucket:
```bash
python create_dataset_and_upload_to_s3.py
```
If you already have a dataset in your S3 bucket, you can skip this step.
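
If you prefer to write your own script, here is a minimal sketch of the same idea using boto3. The column names and rows are illustrative rather than the exact dataset the script above produces; the bucket name matches the sample configuration below.

```python
# Minimal sketch: generate a small stock-price CSV and upload it to S3.
# Assumes boto3 is installed and AWS credentials are configured (e.g. via `aws configure`).
# The bucket name matches dozer-config.yaml; the columns and tickers are illustrative.
import csv
import io
import random

import boto3

BUCKET = "aws-s3-sample-stock-data-dozer"

# Build a small CSV in memory.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["ticker", "open", "close"])
for ticker in ["AAPL", "MSFT", "GOOG"]:
    open_price = round(random.uniform(100, 500), 2)
    close_price = round(open_price * random.uniform(0.95, 1.05), 2)
    writer.writerow([ticker, open_price, close_price])

# Upload the CSV to the bucket root, where the connector's `path: .` will find it.
s3 = boto3.client("s3")
s3.put_object(Bucket=BUCKET, Key="stocks.csv", Body=buf.getvalue().encode("utf-8"))
print(f"Uploaded stocks.csv to s3://{BUCKET}")
```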
- Configure Dozer: Create a YAML configuration file that defines the data sources, transformations, and APIs. Check out the sample configuration file dozer-config.yaml, which uses the AWS S3 connector:
```yaml
connections:
  - config: !S3Storage
      details:
        access_key_id: {{YOUR_ACCESS_KEY}}
        secret_access_key: {{YOUR_SECRET_KEY}}
        region: {{YOUR_REGION}}
        bucket_name: aws-s3-sample-stock-data-dozer
      tables:
        - !Table
          name: stocks
          config: !CSV
            path: . # path to files or a folder inside the bucket
            extension: .csv
    name: s3
```
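
The sample file also defines the SQL transformation and the API endpoint that the querying step below hits. Here is a hedged sketch of what those sections can look like, with the SQL and table names chosen for illustration to match the /analysis/ticker path used later; the actual contents of dozer-config.yaml may differ.

```yaml
# Illustrative sketch: aggregate the stocks table into an `analysis` table
# and expose it at the /analysis/ticker path queried in the steps below.
sql: |
  SELECT ticker, MIN(close) AS min_close, MAX(close) AS max_close
  INTO analysis
  FROM stocks
  GROUP BY ticker;

endpoints:
  - name: analysis
    path: /analysis/ticker
    table_name: analysis
```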
- Run Dozer: Start Dozer by running the following command in the terminal:

```bash
dozer -c dozer-config.yaml
```
- Query the Dozer APIs: Query the Dozer endpoints to get the results of your SQL queries. You can query the cache using gRPC or REST. Example query:

```bash
# REST
curl -X GET http://localhost:8080/analysis/ticker
```
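
For a scripted check, here is a minimal sketch of the same REST call from Python, assuming the requests library is installed and Dozer is serving REST on the default port 8080 shown above:

```python
# Query the Dozer REST endpoint from the curl example above.
# Assumes the `requests` library; the URL and port come from that example.
import requests

response = requests.get("http://localhost:8080/analysis/ticker")
response.raise_for_status()
print(response.json())  # records served from the Dozer cache
```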
- Append New Data & Query: Dozer automatically detects and ingests new data files added to the bucket, which lets you process recurring data without changing any configuration. Upload a new file to the bucket (as sketched below) and you will see Dozer ingesting the newly uploaded file in the console log.
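
As a minimal sketch of that append step, again assuming boto3 with configured AWS credentials; the local file name is hypothetical:

```python
# Upload an additional CSV to the same bucket to trigger Dozer's automatic ingestion.
# Assumes boto3 and AWS credentials; "stocks_day2.csv" is an illustrative file name.
import boto3

BUCKET = "aws-s3-sample-stock-data-dozer"

s3 = boto3.client("s3")
s3.upload_file("stocks_day2.csv", BUCKET, "stocks_day2.csv")
print(f"Uploaded stocks_day2.csv to s3://{BUCKET}; watch the Dozer console log for ingestion")
```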
If you encounter any issues or have suggestions, please file an issue in the issue tracker on our GitHub page or reach out to us on Discord.
Happy coding with Dozer!
We love contributions! Please check our Contributing Guidelines if you're interested in helping!