Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CR_CREATE_DATE and update CR_SUBLEVEL, CR_COU_COPYRIGHT_HOLDER #11

Merged
merged 15 commits into from
Feb 28, 2024

Conversation

ibrahimjaved12
Copy link
Collaborator

@ibrahimjaved12 ibrahimjaved12 commented Jan 31, 2024

What are the relevant tickets?

Closes https://github.com/mitodl/hq/issues/3514

Description (What does it do?)

Adds CR_CREATE_DATE and updates CR_SUBLEVEL, and CR_COU_COPYRIGHT_HOLDER using the instructions mentioned in the ticket and ongoing discussion.

We're still being updated on the exact months to use for semesters, but that would be a trivial change so this can be reviewed in the meantime.

Updated CSV Link

https://docs.google.com/spreadsheets/d/1uBFHJD0LpMqR62JQdXrGMV8k4TQwW5FGz07GTtOAYL0/edit#gid=660003245

How can this be tested?

Make sure the changes are working as expected in the above CSV link.

Now checkout to this branch, run the program, make sure everything is good:

  1. Build the container:
docker compose build
  1. Start the container:
docker compose run --rm app
  1. To generate a JSON file containing complete API data:
docker compose run --rm app --create_json
  1. To create a CSV file from the local JSON file:
docker compose run --rm app --create_csv --source=json
  1. To run unit tests:
docker run --rm ocw_oer_export python -m unittest discover

@ibrahimjaved12 ibrahimjaved12 changed the title Add course start date Add CR_CREATE_DATE and update CR_SUBLEVEL Jan 31, 2024
@ibrahimjaved12 ibrahimjaved12 marked this pull request as ready for review January 31, 2024 13:13
@ibrahimjaved12 ibrahimjaved12 added the Needs Review An open Pull Request that is ready for review label Jan 31, 2024
Comment on lines +71 to +78
def get_description_in_plain_text(description):
"""Get Course Resource plain text description by cleaning up markdown and HTML."""
cleaned_description = text_cleanup(description)
return cleaned_description


Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only moved this function some lines up, this isn't a new function.

"Gender, Power, Leadership, and the Workplace",https://ocw.mit.edu/courses/wgs-s10-gender-power-leadership-and-the-workplace-spring-2014,Full Course,Text/HTML,null,"This course will provide students with an analytic framework to understand the roles that gender, race, and class play in defining and determining access to leadership and power in the U.S., especially in the context of the workplace. We will explore women and men in leadership positions within the corporate, political and non-profit sectors, with attention to the roles of women of color and immigrant women within this context. We will also look at specific policies such as affirmative action, parental leave, child-care policy, and working-time policies and the role they play–or could play–in achieving parity. We will further investigate ways in which these policies address gender, racial, and class inequities, and think critically about mechanisms for change. The course will be highly interactive, and will combine texts, theater, videos and visual arts.",en,Creative Commons Attribution Non Commercial Share Alike 4.0,student|teacher,Business and Communication|Management|Political Science|Social Science|Women’s Studies,Gender|Power|Leadership|Workplace|Race|Poverty|Inequality|Ethnicity|Labor|Corporate|Work|Family|Work-Life Balance|Government|Policy|Politics|Sexual Harassment|Affirmative Action|Parental Leave|Child Care,"Fried, Mindy",MIT,MIT OpenCourseWare,https://creativecommons.org/licenses/by-nc-sa/4.0/,MIT,Curriculum/Instruction|Assessment|Professional Development,Visual|Textual
Theory of Probability,https://ocw.mit.edu/courses/18-175-theory-of-probability-spring-2014,Full Course,Text/HTML,null,"This course covers topics such as sums of independent random variables, central limit phenomena, infinitely divisible laws, Levy processes, Brownian motion, conditioning, and martingales.",en,Creative Commons Attribution Non Commercial Share Alike 4.0,student|teacher,Mathematics|Statistics and Probability,Laws of Large Numbers|Central Limit Theorems for Sums of Independent Random Variables|Conditioning and Martingales|Brownian Motion and Elements of Diffusion Theory|Functional Limit Theorems.,"Sheffield, Scott",MIT,MIT OpenCourseWare,https://creativecommons.org/licenses/by-nc-sa/4.0/,MIT,Curriculum/Instruction,Visual|Textual
Entrepreneurial Marketing,https://ocw.mit.edu/courses/15-835-entrepreneurial-marketing-spring-2002,Full Course,Text/HTML,null,"This course clarifies key marketing concepts, methods, and strategic issues relevant for start-up and early-stage entrepreneurs. At this course, there are two major questions:
CR_TITLE,CR_URL,CR_MATERIAL_TYPE,CR_MEDIA_FORMATS,CR_SUBLEVEL,CR_ABSTRACT,CR_LANGUAGE,CR_COU_TITLE,CR_PRIMARY_USER,CR_SUBJECT,CR_KEYWORDS,CR_CREATE_DATE,CR_AUTHOR_NAME,CR_PROVIDER,CR_PROVIDER_SET,CR_COU_URL,CR_COU_COPYRIGHT_HOLDER,CR_EDUCATIONAL_USE,CR_ACCESSIBILITY
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests / Tests' JSON data had to be updated for two reasons:

  1. To capitalize CR_Media_Formats
  2. level is now an array in the API data, previously it wasn't

Base automatically changed from schedule-csv-generation to main February 1, 2024 06:22
@ibrahimjaved12
Copy link
Collaborator Author

Rebased the branch to main, now that schedule-csv-generation was merged to main

@ChristopherChudzicki ChristopherChudzicki self-assigned this Feb 1, 2024
@@ -89,6 +107,20 @@ def get_cr_keywords(fm_ocw_keywords_mapping, list_of_topics_objs, course_url):
return "|".join(topic["name"] for topic in list_of_topics_objs)


def get_cr_create_date(semester, year):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Some OCW courses do not have a year. In that case, the current code will return None-01-01.

Instead, add if not year: return "".

However: A bug is currently preventing MIT Open from including these courses. See mitodl/mit-open#441. So a little hard to test this change on real data right now.

Copy link
Collaborator

@ChristopherChudzicki ChristopherChudzicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ibrahimjaved12 I think you should make this change now, even though it's hard to test it: https://github.com/mitodl/ocw_oer_export/pull/11/files#r1474815558

If everything requires a CR_CREATE_DATE... I'm not sure what to do quite yet. Hopefully we'll have the real creation date soon.

Comment on lines 59 to 70
def get_cr_sublevel(levels):
"""Set the value(s) of CR_SUBLEVEL based on the course levels."""
level_mappings = {
"Undergraduate": ["Community College/Lower Division", "College/Upper Division"],
"Graduate": ["Graduate/Professional"],
"High School": ["High School", "Community College/Lower Division"],
"Non-Credit": ["Career/Technical Education"],
}
sublevels = [sublevel for level in levels for sublevel in level_mappings.get(level)]
return "|".join(sorted(set(sublevels)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We recently merged a breaking change to the run.level array in MIT Open. It now an array of { "code": "id-like-string", "name": "human-readable-string" } objects.

This could should be updated to reflect that.

Within MIT Open, the change is not on RC or Prod yet, but it is available locally in main. I think it would also be worthwhile to allow changing the API baseurl via an env variable or something so it's easier to develop this against local/RC/prod

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ibrahimjaved12 This change is live on RC now.

@ChristopherChudzicki ChristopherChudzicki removed the Needs Review An open Pull Request that is ready for review label Feb 12, 2024

import logging

from .create_csv import create_csv
from .create_json import create_json
from .cli import main
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To remove the warning:
<frozen runpy>:128: RuntimeWarning: 'ocw_oer_export.cli' found in sys.modules after import of package 'ocw_oer_export', but prior to execution of 'ocw_oer_export.cli'; this may result in unpredictable behaviour
I think this was also slowing down the program execution

@ibrahimjaved12 ibrahimjaved12 added Needs Review An open Pull Request that is ready for review and removed Waiting on Author labels Feb 19, 2024
@ibrahimjaved12
Copy link
Collaborator Author

ibrahimjaved12 commented Feb 19, 2024

PR ready to be re-reviewed (also updated the CSV file in PR description)

@ibrahimjaved12 ibrahimjaved12 changed the title Add CR_CREATE_DATE and update CR_SUBLEVEL Add CR_CREATE_DATE and update CR_SUBLEVEL, CR_COU_COPYRIGHT_HOLDER Feb 21, 2024
@ibrahimjaved12
Copy link
Collaborator Author

Added a quick one line change to change CR_COU_COPYRIGHT_HOLDER from MIT to instructors' names. Ready to be reviewed again.

Copy link
Collaborator

@ChristopherChudzicki ChristopherChudzicki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me.

Comment on lines 4 to 7
API_BASE_URLS = {
"PRODUCTION": "https://mitopen.odl.mit.edu/api/v1/courses/?platform=ocw",
"RC": "https://mitopen-rc-2ea15531374d.herokuapp.com/api/v1/courses/?platform=ocw",
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: IMO, just make the env variable the base URL:

API_BASE_URL=https://mitopen.odl.mit.edu
# or
API_BASE_URL=https://mitopen-rc.odl.mit.edu
# or
API_BASE_URL=http://localhost:8063

and the library can add api/v1/courses/?platform=ocw

That way you can set the base url to whatever you want—local, rc, prod.

@ibrahimjaved12 ibrahimjaved12 merged commit b4e7ade into main Feb 28, 2024
2 checks passed
@ibrahimjaved12 ibrahimjaved12 deleted the create-date-and-sublevel branch February 28, 2024 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review An open Pull Request that is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants