CourseGraph plugin for Tutor
CourseGraph is a tool that allows Open edX developers and support specialists to inspect their platform instance's learning content. It consists of two simple components:
- The CourseGraph support application, which loads courses from the CMS's internal course store and dumps them into an instance of...
- Neo4j, a popular open-source graph database. Staff for an Open edX instance can query the course graph via Neo4j's Web console using the Cypher query language.
CourseGraph was initially an internal tool at edX, Inc., but as of the Maple release it was shared with the greater Open edX community. This Tutor plugin aims to provide an easy mechanism for developers and operators to trial and deploy CourseGraph.
This is Beta software. It was developed as part of the Tutor Adoption Initiative for the purpose of learning about the Tutor Plugin API, and it has been tested to work with Tutor and Open edX for local development, local deployment, and Kubernetes deployment. However, it has not yet been tested with a production Open edX instance. Although this plugin is nominally maintained by the Center for Reimagining Learning, we do not currently have plans to do any comprehensive production-tier testing. You have been warned!
If you choose to use this plugin, we're eager to hear about your experiences! Please let us know by opening an issue in this repository or posting on the Open edX or Tutor forums. Also, if you're interested in contributing, feel free to open an issue or a pull request. We'll try to give it a first look within a week.
This plugin uses the same branching model as Tutor:
Branch | Description |
---|---|
nightly | Latest development. Compatible with Tutor Nightly and Open edX master. Merged into master at each Open edX named release. |
master | Latest stable release and patches. Compatible with Tutor master and latest Open edX named release. Merged into nightly continuously. |
The syncing between branches is currently done manually.
If your change is backwards-compatible with the last Open edX named release, then propose it against master. If not, then propose it against nightly.
This repository aims to adhere to all relevant Open edX Proposals, including OEP-55, Conventional Commits.
Install the latest stable version (requires the latest Tutor release):
pip install tutor-contrib-coursegraph
Or, install the latest nightly version (requires Tutor Nightly):
pip install git+https://github.com/openedx/tutor-contrib-coursegraph@nightly
Or, install the plugin to be hacked on:
git clone git@github.com:openedx/tutor-contrib-coursegraph
cd tutor-contrib-coursegraph
source {{ PATH TO A VIRTUAL ENVIRONMENT }}
make dev-requirements
Enable the plugin and re-generate your Tutor environment:
tutor plugins enable coursegraph
tutor config save
Then, run initialization in order to dump your platform's existing courses into CourseGraph:
tutor [dev|local|k8s] init --limit=coursegraph
Start CourseGraph:
tutor [dev|local|k8s] start coursegraph
Once CourseGraph is started and courses have been dumped to it, the tool can be viewed at:
http://coursegraph.{{ LMS_HOST }}
For example, if your LMS is at openedx.example.edu and you're using the default Neo4j HTTP port, that'd be:
http://coursegraph.openedx.example.edu
When running in development mode, you'll also need to specify port 7474. For example:
http://coursegraph.local.overhang.io:7474
The Connect URL should already be populated correctly. If it's not, here's what to enter, replacing openedx.example.edu with your own LMS root domain:
Mode | Connect URL |
---|---|
tutor k8s | bolt://bolt.coursegraph.openedx.example.edu:80 |
tutor local | bolt://bolt.coursegraph.openedx.example.edu:80 |
tutor dev | bolt://coursegraph.local.overhang.io:7687 |
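The table above can be condensed into a small shell helper. This is only a sketch: the function name `coursegraph_connect_url` is hypothetical and not part of this plugin, and the `dev` branch assumes your development LMS host is `local.overhang.io`, as in the table.

```shell
# Hypothetical helper (not shipped with this plugin): print the Bolt
# Connect URL for a given Tutor mode and LMS host, following the table above.
coursegraph_connect_url() {
    local mode="$1"
    local lms_host="$2"
    case "$mode" in
        # k8s and local are reached through the web proxy on port 80.
        k8s|local) echo "bolt://bolt.coursegraph.${lms_host}:80" ;;
        # dev talks to Neo4j's Bolt port directly.
        dev)       echo "bolt://coursegraph.${lms_host}:7687" ;;
        *)         echo "unknown mode: ${mode}" >&2; return 1 ;;
    esac
}
```

For example, `coursegraph_connect_url local openedx.example.edu` prints `bolt://bolt.coursegraph.openedx.example.edu:80`.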
By default, you can log in using Username / Password authentication:
Username | "neo4j" |
Password | $(tutor config printvalue COURSEGRAPH_NEO4J_PASSWORD) |
If you set COURSEGRAPH_NEO4J_PASSWORD to null before initializing CourseGraph, then instead select No Authentication.
Now that you're in, try querying your courses!
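If you would rather run a first query from a terminal than from the Web console, something like the following may work. It is a sketch under several assumptions: that you are running `tutor local`, that password authentication is enabled, and that the `cypher-shell` client bundled with the official Neo4j image is available inside the `coursegraph` container. The function name is hypothetical. The query only counts nodes per label, so it assumes nothing about the CourseGraph schema.

```shell
# Hypothetical helper: count the nodes in CourseGraph, grouped by label.
# Assumes `tutor local` and that cypher-shell exists in the coursegraph container.
coursegraph_count_nodes_by_label() {
    local password
    password="$(tutor config printvalue COURSEGRAPH_NEO4J_PASSWORD)" || return 1
    # --encryption false: connect over unencrypted Bolt inside the container.
    tutor local exec coursegraph cypher-shell --encryption false \
        -u neo4j -p "$password" \
        "MATCH (n) RETURN labels(n) AS labels, count(*) AS nodes ORDER BY nodes DESC;"
}
```

The quoted Cypher statement can also be pasted as-is into the Neo4j Web console.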
Operating CourseGraph is fairly straightforward, especially if you treat CourseGraph data as a non-critical secondary view into the CMS's course data. That is: you should be willing to completely drop and re-generate the CourseGraph data stored in Neo4j. By doing so, you avoid needing to back up Neo4j, and you de-risk the Neo4j schema version upgrades that you'll need to perform over time with new Open edX releases.
By default, this plugin configures CMS to dump each course to CourseGraph whenever it is published, allowing you to "set and forget" the tool. You can disable this behavior by setting COURSEGRAPH_DUMP_COURSE_ON_PUBLISH to false, regenerating your Tutor environment, and restarting CMS.
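A minimal sketch of those three steps for a `tutor local` deployment follows. The function name `disable_dump_on_publish` is hypothetical; the commands inside it are ordinary Tutor CLI calls. For `tutor dev` or `tutor k8s`, adapt the restart step to your mode.

```shell
# Hypothetical helper: turn off automatic dumping for a `tutor local` deployment.
disable_dump_on_publish() {
    # Persist the setting and regenerate the Tutor environment in one step.
    tutor config save --set COURSEGRAPH_DUMP_COURSE_ON_PUBLISH=false || return 1
    # Restart CMS so that it picks up the regenerated settings.
    tutor local restart cms
}
```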
If you have disabled automatic dumping, then you'll need to periodically refresh the data in CourseGraph manually. You can do so in the CMS administration console, on the COURSE GRAPH COURSE DUMPS page of the COURSE GRAPH app.
Alternatively, you can skip the admin console by using a CMS management command:
tutor [dev|local|k8s] exec cms ./manage.py cms dump_to_neo4j
By default, CourseGraph's Neo4j service is accessible to the outside world for HTTP and Bolt traffic via Caddy, the proxy provided by Tutor. Neo4j's basic authentication scheme (described in Managing Users) may or may not be sufficient for your security needs. If it is not sufficient, then you have two options:
- Use your own Web proxy in place of Tutor's Caddy service by setting ENABLE_WEB_PROXY=false. Via your Web proxy, restrict ingress access to CourseGraph as desired. If you're not already using your own Web proxy, then this is probably not what you want to do, as it'd affect your entire Tutor deployment just for the sake of this plugin.
- Set COURSEGRAPH_RUN_NEO4J=false and self-manage a separate Neo4j service, whose ingress access you could control as desired. Then, configure this plugin to point at your self-managed Neo4j service so that CMS can dump content to it. See the Configuration section for specifics. If you choose to go this route, the authors of this plugin recommend the official Neo4j Docker image as a starting point for self-hosting Neo4j.
This plugin currently ships with Neo4j 3.5 Community Edition, which provides some very rudimentary authentication controls via the Neo4j Web console. The Neo4j 3.5 page on procedures for native user management describes the available operations. Take note that only four operations are available in the Community Edition:
CALL dbms.security.createUser(<username>, <password>, <requirePasswordChange>)
CALL dbms.security.changePassword(<password>, <requirePasswordChange>)
CALL dbms.security.deleteUser(<username>)
CALL dbms.security.listUsers()
Since authorization control is not available in the Neo4j Community Edition, all users will have full administrative control over Neo4j data, including the ability to read all graphs, modify nodes, modify relationships, and create new users. Because of this, access to any set of CourseGraph Neo4j credentials confers read access to your entire course catalog, as well as the ability to insert fake data into CourseGraph. As such, if your CourseGraph instance is accessible by the public, then all Neo4j credential sets should be treated as production secrets.
By default, this plugin initializes Neo4j with one user, whose username is neo4j and whose password is set from the COURSEGRAPH_NEO4J_PASSWORD Tutor setting. When CMS pushes data to Neo4j, it also authenticates with neo4j as its username and COURSEGRAPH_NEO4J_PASSWORD as its password. Keep in mind that:
- If the neo4j user is destroyed, CMS will be unable to dump new data to Neo4j until a new neo4j user is created.
- Once CourseGraph is initialized, changing COURSEGRAPH_NEO4J_PASSWORD alone does not update the neo4j user's password. You will need to use CALL dbms.security.changePassword(...) from the Neo4j Web console for the password change to take effect.
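The second point implies that a password rotation has to touch both Neo4j and the Tutor configuration. One way this might be scripted for `tutor local` is sketched below. Everything about it is an assumption rather than a documented procedure: the function name is hypothetical, it relies on `cypher-shell` being available inside the `coursegraph` container instead of using the Web console, and it restarts only the `cms` service.

```shell
# Hypothetical helper: rotate the neo4j user's password under `tutor local`.
# Usage: rotate_coursegraph_password 'new-password'
rotate_coursegraph_password() {
    local new_password="$1"
    local old_password
    # The currently configured password is needed to authenticate to Neo4j.
    old_password="$(tutor config printvalue COURSEGRAPH_NEO4J_PASSWORD)" || return 1
    # 1. Change the password inside Neo4j (applies to the logged-in neo4j user).
    tutor local exec coursegraph cypher-shell --encryption false \
        -u neo4j -p "$old_password" \
        "CALL dbms.security.changePassword('${new_password}', false);" || return 1
    # 2. Record the new password so CMS authenticates with it from now on.
    tutor config save --set "COURSEGRAPH_NEO4J_PASSWORD=${new_password}" || return 1
    # 3. Restart CMS so that it picks up the regenerated settings.
    tutor local restart cms
}
```

A password containing a single quote would break the quoted Cypher string, so this sketch is only suitable for passwords without one.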
Finally, initializing CourseGraph with COURSEGRAPH_NEO4J_PASSWORD set to !!null disables authentication entirely. Anyone with access to CourseGraph will have full read and write access to the Neo4j database. Obviously, this is only suitable when CourseGraph endpoint access is restricted to a trusted private network.
The Tutor plugin can be configured with several settings. The names of all settings below are prefixed with COURSEGRAPH_.
CourseGraph Setting | Type | Default | Description |
---|---|---|---|
NEO4J_PASSWORD | str | (20 random characters) | Initial password set for the neo4j user, as well as the password used by CMS to authenticate as the neo4j user. See Managing Users for more details. |
NEO4J_VERSION | str | "3.5.28" | Version of Neo4j to use. Appended to default image. Overriding NEO4J_DOCKER_IMAGE annuls this setting. |
NEO4J_DOCKER_IMAGE | str | "docker.io/neo4j:3.5.28" | Neo4j Docker image to be pulled and used. By default, based on your DOCKER_REGISTRY and COURSEGRAPH_NEO4J_VERSION. |
NEO4J_HOST | str | Prod: "coursegraph.www.openedx.com", Dev: "coursegraph.local.overhang.io" | Hostname of CourseGraph. By default, based on your LMS_HOST. |
DUMP_COURSE_ON_PUBLISH | bool | true | Should CMS automatically dump a course to CourseGraph whenever it's published? If disabled, you will instead need to periodically dump courses via the management command or admin console. |
RUN_NEO4J | bool | true | Whether the Neo4j service should be started. Only disable if you wish to manage your own Neo4j instance outside of Tutor. If you disable this, then set COURSEGRAPH_NEO4J_HOST to the hostname of your self-managed instance, set COURSEGRAPH_NEO4J_PASSWORD to the password of your instance's neo4j user, and expose port 7687 for unencrypted Bolt traffic. |
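As an illustration of how these settings combine, the sketch below points the plugin at a self-managed Neo4j instance, following the RUN_NEO4J row. The function name and the example hostname in the usage note are hypothetical. Exposing port 7687 on your own instance is up to you and is not covered here.

```shell
# Hypothetical helper: configure the plugin to use a self-managed Neo4j.
# Usage: use_self_managed_neo4j <neo4j-hostname> <neo4j-user-password>
use_self_managed_neo4j() {
    local neo4j_host="$1"      # hostname of your own Neo4j instance
    local neo4j_password="$2"  # password of that instance's neo4j user
    # Save all three settings at once; this also regenerates the environment.
    tutor config save \
        --set COURSEGRAPH_RUN_NEO4J=false \
        --set "COURSEGRAPH_NEO4J_HOST=${neo4j_host}" \
        --set "COURSEGRAPH_NEO4J_PASSWORD=${neo4j_password}"
}
```

For example: `use_self_managed_neo4j neo4j.internal.example.edu 'your-password'`, followed by restarting CMS.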
Upgrade version pins:
make upgrade
Run just static checks:
make test-format test-lint test-types
Run all tests:
cp $(tutor config printroot)/config.yml tutor_config.bak.yml
make test  # clobbers some Tutor configuration
mv tutor_config.bak.yml $(tutor config printroot)/config.yml  # restore original config
This software is licensed under the terms of the AGPLv3.