Order Hierarchy API

Jj edited this page Aug 6, 2018 · 2 revisions

The Order Hierarchy API is available on the Dataset and Project classes. It lets you organize the order of variables (for a dataset) and of datasets (for a project).

Consider a freshly created project named my project and two datasets named 1st dataset and 2nd dataset which do not belong to the project yet. If a project is likely to hold a lot of datasets in the future, you might want to organize them in groups.

The respective Order classes let you access those groups much like folders in a filesystem. You can also list items or place an entity somewhere in the order tree. The pipe | separates the levels of a group path, and on its own it refers to the root of the order tree. You can also access groups with relative paths, that is, paths without the leading pipe.
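To make the path convention concrete, here is a minimal sketch of pipe-separated path resolution over a plain nested dict. The resolve function and the dict layout are hypothetical illustrations, not part of Scrunch, and the sketch assumes a lone | means "the node you started from".

```python
def resolve(tree, path):
    """Walk a nested dict of groups using a pipe-separated path.

    Hypothetical helper for illustration only; not the Scrunch implementation.
    """
    # Drop empty segments so that '|' alone and a leading '|' both work.
    names = [name for name in path.split('|') if name]
    node = tree
    for name in names:
        node = node[name]  # raises KeyError for an unknown group
    return node


# Toy order tree: group name -> nested groups
order = {'A Group': {'A SubGroup': {}}}

root = resolve(order, '|')                             # the tree itself
absolute = resolve(order, '|A Group|A SubGroup')       # from the root
relative = resolve(order['A Group'], 'A SubGroup')     # from a group
```

Both the absolute and the relative lookup end at the same nested group, which mirrors the two access styles used throughout the examples on this page.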

Here are some basic usage examples:

>>> from scrunch import get_project

>>> pro = get_project('my project')
>>> pro.order
[
]
>>> type(pro.order)
<class 'scrunch.order.ProjectDatasetsOrder'>
>>> type(pro.order['|'])
<class 'scrunch.order.Group'>
>>> pro.order['|'].is_root
True
>>> pro.order['|'].create_group('A Group')
>>> pro.order
[
  {
    "A Group": []
  }
]

You can create groups on any level in the order tree.

>>> pro.order['|A Group'].create_group('A SubGroup')
# or
>>> a_group = pro.order['A Group']
>>> a_group.create_group('A SubGroup')
>>> pro.order
[
  {
    "A Group": [
      {
        "A SubGroup": []
      }
    ]
  }
]

Rename a group by using the rename method:

# absolute
>>> pro.order['|A Group|A SubGroup'].rename('renamed SubGroup')
# or, equivalently, relative
>>> pro.order['A Group|A SubGroup'].rename('renamed SubGroup')
>>> pro.order
[
  {
    "A Group": [
      {
        "renamed SubGroup": []
      }
    ]
  }
]

Moving datasets into a project

This can be easily achieved by changing the owner of a dataset to a project. The change_owner method can take your project object as an argument:

>>> from scrunch import get_dataset

>>> pro
<Project: name='my project'; id='...'>
>>> ds1 = get_dataset('1st dataset')
>>> ds1.owner
<User: email='[email protected]'; id='...'>
>>> ds1.change_owner(project=pro)
>>> ds1.owner
<Project: name='my project'; id='...'>

Note that we have to refresh the project's order object for the change to show up:

>>> pro.order.load()
>>> pro.order
[
  {
    "A Group": [
      {
        "renamed SubGroup": []
      }
    ]
  },
  "1st dataset"
]

Organizing datasets inside a project

You can organize datasets with the place method. Note that only absolute paths are allowed here! You can also arrange datasets in a desired order by passing a dataset's id (or, as in the last example below, a group name) via the before and after keyword arguments.

>>> pro.order.place(ds1, '|A Group')
>>> pro.order
[
  {
    "A Group": [
      {
        "renamed SubGroup": []
      },
      "1st dataset"
    ]
  }
]
# get the 2nd dataset into our project
>>> ds2 = get_dataset('2nd dataset')
>>> ds2.change_owner(project=pro)
>>> pro.order.load()
>>> pro.order.place(ds2, '|A Group', before=ds1.id)
>>> pro.order
[
  {
    "A Group": [
      {
        "renamed SubGroup": []
      },
      "2nd dataset",
      "1st dataset"
    ]
  }
]
>>> pro.order.place(ds1, '|A Group', before='renamed SubGroup')
>>> pro.order
[
  {
    "A Group": [
      "1st dataset",
      {
        "renamed SubGroup": []
      },
      "2nd dataset"
    ]
  }
]
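The before/after behaviour shown above can be modelled on a plain Python list. This is only a sketch under stated assumptions: the place function below is a hypothetical stand-in that works on names, whereas the real method takes a dataset object, an absolute path, and ids.

```python
def place(children, item, before=None, after=None):
    """Move or insert `item` in `children`, optionally next to a sibling.

    Hypothetical list-based model; not the Scrunch implementation.
    """
    if before is not None and after is not None:
        raise ValueError("pass only one of 'before' or 'after'")
    # Remove the item first so that moving within the same list works.
    if item in children:
        children.remove(item)
    if before is not None:
        children.insert(children.index(before), item)
    elif after is not None:
        children.insert(children.index(after) + 1, item)
    else:
        children.append(item)  # default: place at the end
    return children


group = ['renamed SubGroup', '1st dataset']
place(group, '2nd dataset', before='1st dataset')
# group is now ['renamed SubGroup', '2nd dataset', '1st dataset']
place(group, '1st dataset', before='renamed SubGroup')
# group is now ['1st dataset', 'renamed SubGroup', '2nd dataset']
```

The two calls reproduce the final two orderings of "A Group" from the session above.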

Managing sub projects

With recent API changes, Crunch is dropping support for a per-project Shoji order in favor of a nested project hierarchy for organizing datasets.

A project's index members now contain both the datasets that belong to it and any projects nested inside it.

As a result, Scrunch now uses that API to organize projects. The Project class itself provides the same methods as the old project.order helper, and project.order is still available for compatibility purposes.

>>> from scrunch import get_project

>>> pro = get_project('my project')
>>> type(pro.order)
<class 'scrunch.datasets.Project'>
>>> type(pro.order['|'])
<class 'scrunch.datasets.Project'>
>>> pro.order['|'].is_root  # There is no concept of root anymore
True
>>> project_a = pro.order['|'].create_group('A Group')
>>> project_a

The methods of the original Scrunch API are still available:

  • rename: Renames a project
  • place: Moves a project or dataset inside the current project
  • create_group: Creates a new sub project as child of the current one
  • reorder: Receives a list of the datasets and projects in the current project and updates their order

New Scrunch API methods are also available:

  • create_project: Preferred method, create_group is an alias of this
  • move_here: Receives a list of projects or datasets and places them inside the current project
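The nested-project model described by the two lists above can be sketched with a small in-memory class. ToyProject is a hypothetical stand-in written for this page, not the Scrunch Project class: it only tracks child order locally, talks to no server, and does not remove moved items from a previous parent.

```python
class ToyProject:
    """In-memory stand-in for a nested project; not the Scrunch Project class."""

    def __init__(self, name):
        self.name = name
        self.children = []  # sub projects (ToyProject) and dataset names (str)

    def create_project(self, name):
        # Create a sub project as a child of this one and return it.
        child = ToyProject(name)
        self.children.append(child)
        return child

    # The page describes create_group as an alias of create_project.
    create_group = create_project

    def move_here(self, items):
        # Append each project or dataset unless it is already a child.
        for item in items:
            if item not in self.children:
                self.children.append(item)

    def reorder(self, items):
        # The new order must list exactly the current children.
        same_members = (
            len(items) == len(self.children)
            and all(i in self.children for i in items)
            and all(c in items for c in self.children)
        )
        if not same_members:
            raise ValueError('reorder expects every current child exactly once')
        self.children = list(items)


root = ToyProject('my project')
sub = root.create_group('A Group')
root.move_here(['1st dataset', '2nd dataset'])
root.reorder(['2nd dataset', sub, '1st dataset'])
names = [c.name if isinstance(c, ToyProject) else c for c in root.children]
```

In this model a sub project and a dataset are siblings in the same children list, which is the main difference from the older group-based order tree.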