All 3 questions use the database collection sample_mflix.embedded_movies. To load the sample database, follow the instructions here: https://github.com/mirkenstein/MSAI-339-NoSQL/blob/master/Mongo/README_ATLAS.md

Write a single query that satisfies all 3 requirements:
- Movies with year between 1975 and 1980
- Display only 3 columns: title, year, runtime
- Order by runtime (ascending or descending)
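
The three requirements map onto a filter, a projection, and a sort. The following is a minimal sketch of one possible shape in Python with pymongo, not a definitive solution. It assumes the year bounds are inclusive, picks descending order arbitrarily, and uses a hypothetical helper name (top_movies); the collection argument is expected to be a pymongo handle to sample_mflix.embedded_movies.

```python
# Query pieces as plain dicts, using standard MongoDB query operators.
# Assumption: "between 1975 and 1980" is read as inclusive on both ends.
YEAR_FILTER = {"year": {"$gte": 1975, "$lte": 1980}}

# Keep only the three requested fields; _id is returned unless excluded.
PROJECTION = {"_id": 0, "title": 1, "year": 1, "runtime": 1}

# -1 sorts descending, 1 sorts ascending (the assignment allows either).
SORT_SPEC = [("runtime", -1)]


def top_movies(collection, limit=5):
    """Hypothetical helper: filter, project, sort, and return `limit` docs.

    `collection` is assumed to be a pymongo Collection, for example
    MongoClient(uri)["sample_mflix"]["embedded_movies"].
    """
    cursor = collection.find(YEAR_FILTER, PROJECTION).sort(SORT_SPEC).limit(limit)
    return list(cursor)
```

Keeping the filter, projection, and sort as separate dicts makes it easy to flip the sort direction or adjust the bounds without touching the query call.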

Important

Submission: Return the top 5 results and include them in your homework submission.

Returned results should look like this:

title,year,runtime
The Terminator,1980,120

Helpful documentation

Write an aggregation, grouped by year, that calculates the sum of runtime for movies whose year is between 1975 and 1980, inclusive.
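
An aggregation of this kind is usually a match stage followed by a group stage. The sketch below shows one possible pipeline in Python with pymongo, under the assumption that the output field is named sumRuntime to match the expected result layout; the helper name runtime_by_year is hypothetical, and the final sort by year is an optional convenience rather than a stated requirement.

```python
# Aggregation pipeline expressed as a list of stage documents.
PIPELINE = [
    # Keep only movies from 1975 through 1980, both ends included.
    {"$match": {"year": {"$gte": 1975, "$lte": 1980}}},
    # One output document per year; $sum ignores non-numeric runtimes.
    {"$group": {"_id": "$year", "sumRuntime": {"$sum": "$runtime"}}},
    # Rename the grouping key from _id to year for readability.
    {"$project": {"_id": 0, "year": "$_id", "sumRuntime": 1}},
    # Optional: order the output by year ascending.
    {"$sort": {"year": 1}},
]


def runtime_by_year(collection):
    """Hypothetical helper: run the pipeline on a pymongo Collection."""
    return list(collection.aggregate(PIPELINE))
```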

Important

Submission: Include the year and the sum of runtime in your homework submission.

Returned results should look like this:

year,sumRuntime
1975,1234

Helpful documentation

Every cluster in Databricks has access to preloaded datasets.

The dataset /databricks-datasets/adult/adult.data is part of that preloaded collection.

See the notebook demo discussed in class.

Read into a dataframe the sample dataset /databricks-datasets/adult/adult.data
Display the top 5 rows ordered by age ascending, then by education_num ascending.
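
The two steps above could be sketched in PySpark roughly as follows. This is an illustration under assumptions, not the class notebook's code: it assumes the file has no header row and 15 comma-separated columns, with names taken from the UCI Adult dataset description (only age and education_num are named in this assignment), and it relies on the spark session that Databricks notebooks predefine. The helper names load_adult and youngest_rows are hypothetical.

```python
ADULT_PATH = "/databricks-datasets/adult/adult.data"

# Assumed column names, following the UCI Adult dataset description.
ADULT_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]


def load_adult(spark, path=ADULT_PATH):
    """Read the CSV into a DataFrame and attach the assumed column names."""
    df = spark.read.csv(
        path,
        header=False,                  # assumption: the file has no header row
        inferSchema=True,              # let Spark detect numeric columns
        ignoreLeadingWhiteSpace=True,  # values are separated by ", "
    )
    return df.toDF(*ADULT_COLUMNS)


def youngest_rows(df, n=5):
    """Order by age, then education_num, both ascending; keep n rows."""
    return df.orderBy(df["age"].asc(), df["education_num"].asc()).limit(n)
```

In a Databricks notebook, calling youngest_rows(load_adult(spark)).show() would print the ordered rows. If the file turns out to have a different number of columns, toDF raises an error and the name list needs adjusting.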

Important

Submission: Submit the 5 rows from the result in point 2 as part of your homework submission.
