We will use the Community Edition (https://community.cloud.databricks.com/), which offers free Spark clusters with no time limit.
Important
Community Edition signup uses the same form as the regular paid service. At step 2 of the signup process there is a link, Get started with Community Edition, which enables the free, no-time-limit Community service.
At the signup page https://www.databricks.com/try-databricks
Fill out the form and click Continue.
On the next screen, DO NOT select a cloud provider.
Instead, click the link at the bottom of the form: Get started with Community Edition.
You will be emailed a verification link, and from there you will set up your credentials.
The signup documentation
https://docs.databricks.com/en/getting-started/community-edition.html
Databricks allows integration with many popular external data sources. For example, here is how to connect to MongoDB: https://docs.databricks.com/en/external-data/mongodb.html
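To make the shape of such a connection concrete, here is a sketch of reading a MongoDB collection into a Spark DataFrame. The `atlas_uri` helper is our own convenience (not part of any library); the option keys follow the MongoDB Spark connector v10 naming in the linked docs, and the database and collection names are placeholders:

```python
from urllib.parse import quote_plus

def atlas_uri(user: str, password: str, host: str) -> str:
    """Build a MongoDB Atlas connection URI; credentials are URL-encoded
    so special characters in the password do not break the URI."""
    return f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{host}"

# In a Databricks notebook (where `spark` is predefined and the MongoDB
# Spark connector is installed on the cluster), the URI would be used as:
#
# df = (spark.read.format("mongodb")
#       .option("spark.mongodb.read.connection.uri",
#               atlas_uri(user, pw, "cluster0.mongodb.net"))  # placeholder host
#       .option("spark.mongodb.read.database", "sample_db")     # placeholder
#       .option("spark.mongodb.read.collection", "sample_coll") # placeholder
#       .load())
```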
We will build an entirely cloud-based data pipeline, managed by Databricks, that lands its data in the MongoDB Atlas service.
The ETL quick start: https://docs.databricks.com/en/getting-started/etl-quick-start.html
An example notebook on complex nested structured data: https://docs.databricks.com/en/_extras/notebooks/source/complex-nested-structured.html
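Following the quick-start pattern, the pipeline's transform step might normalize raw records before the connector writes them to Atlas. The function, field names, and option values below are illustrative assumptions, not taken from the linked docs:

```python
def clean_record(raw: dict) -> dict:
    """Drop empty fields and lower-case keys so documents land in
    MongoDB Atlas with a consistent shape. Field names are hypothetical."""
    return {k.lower(): v for k, v in raw.items() if v not in (None, "")}

# On Databricks, the cleaned DataFrame would then be written with the
# MongoDB Spark connector (connection details omitted):
#
# (df.write.format("mongodb")
#    .mode("append")
#    .option("spark.mongodb.write.connection.uri", "<atlas-uri>")  # placeholder
#    .option("spark.mongodb.write.database", "pipeline_db")        # placeholder
#    .option("spark.mongodb.write.collection", "events")           # placeholder
#    .save())
```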
There are datasets that can be used directly from your Databricks workspace: https://docs.databricks.com/en/dbfs/databricks-datasets.html
An introduction to the sample datasets: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/7567016051051197/2957017482322847/7192593576348604/latest.html
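A notebook cell can browse these bundled datasets and load one. `dbutils` and `spark` exist only inside Databricks, so the runnable part below is just a small path helper, a hypothetical convenience and not part of the Databricks API; the sample CSV path is one of the bundled examples:

```python
def sample_path(*parts: str) -> str:
    """Hypothetical helper: build a DBFS path under the sample-dataset root."""
    return "/".join(("/databricks-datasets",) + parts)

# Inside a Databricks notebook (where dbutils and spark are predefined):
#
# display(dbutils.fs.ls(sample_path()))            # browse the bundled datasets
# df = (spark.read.option("header", "true")
#       .csv(sample_path("samples", "population-vs-price", "data_geo.csv")))
```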
A notebook with code for importing data from the Chicago Data Portal to MongoDB Atlas is published here: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/7567016051051197/1979567446034704/7192593576348604/latest.html
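The general shape of such an import can be sketched as follows. The Chicago Data Portal is served by Socrata, whose SODA API exposes each dataset as a JSON endpoint; the dataset id, database, and collection names below are placeholders, and the `pymongo` usage is an assumed pattern rather than the published notebook's exact code:

```python
from urllib.parse import urlencode

def soda_url(dataset_id: str, limit: int = 1000) -> str:
    """Build a Socrata (SODA) JSON endpoint URL for a Chicago Data Portal
    dataset. The dataset id is a placeholder for a real 4x4 code."""
    return (f"https://data.cityofchicago.org/resource/{dataset_id}.json?"
            + urlencode({"$limit": limit}))

# A notebook would then fetch a batch and write it to Atlas, roughly:
#
# import requests, pymongo
# rows = requests.get(soda_url("xxxx-xxxx")).json()    # placeholder dataset id
# client = pymongo.MongoClient("<atlas-uri>")          # placeholder URI
# client["chicago"]["records"].insert_many(rows)       # hypothetical names
```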