Skip to content

Releases: NarrativeScience-old/delver

Minor release for testing

18 Feb 23:13
d0f4123
Compare
Choose a tag to compare
Pre-release

No new changes compared to v0.0.5, used for testing CI pipeline.

Update code styles

18 Feb 22:58
5411889
Compare
Choose a tag to compare
  • Updates the code styles to meet flake8 and black guidelines
  • Updates copyright year to 2020

Rename pypi package to pydelver

16 Dec 20:00
6212825
Compare
Choose a tag to compare
  • Renames pypi package to pydelver to avoid collisions with existing delver package.

Initial public release

16 Dec 19:37
fe567c1
Compare
Choose a tag to compare

Delver

The Delver tool allows for the visual exploration of nested objects, which can
be useful for coming to grips with unfamiliar data or learning the structure of
a new codebase. In particular, this package exposes a command line tool delve
as well as a Python library which can be used to understand JSON structures
and arbitrary Python objects.

Command Line Tool

The delve command allows for the exploration of JSON data from the
command line, with the ability to see the types of data within as well as your
current location:

$  delve test_delver.json
-------------------------------------------------------------------------------
Dict (length 3)
+-----+--------+------------------------+
| Idx | Key    | Data                   |
+-----+--------+------------------------+
| 0   | league | MidwestDataAwesomeness |
| 1   | sport  | Data Innovation        |
| 2   | teams  | <list, length 2>       |
+-----+--------+------------------------+
[<key index>, u, q] -->

This displays the top level keys as well as a description of their values for the
test_delver.json file. A number of input options are printed at the bottom which
indicate that a user can either:

  • Select a key index from the available Idx values in the column on the left
  • Select u
  • Select q

Selecting a key index will navigate into that value and display information
about any keys and/or values at that level. For example, selecting 2 would navigate
into the teams object, which we can now see is a list of dictionaries:

-------------------------------------------------------------------------------
At path teams
List (length 2)
+-----+------------------+
| Idx | Data             |
+-----+------------------+
| 0   | <dict, length 4> |
| 1   | <dict, length 4> |
+-----+------------------+
[<int>, u, q] --> 0

From this point, the user can select u to go back up one level to the top, or they can
further delve by selecting an index. For example if the user selects 0:

-------------------------------------------------------------------------------
At path teams->0
Dict (length 4)
+-----+-------------+------------------+
| Idx | Key         | Data             |
+-----+-------------+------------------+
| 0   | mascot      | TRex             |
| 1   | players     | <list, length 6> |
| 2   | team symbol | ☃                |
| 3   | teamname    | editors          |
+-----+-------------+------------------+
[<key index>, u, q] -->

At this point, the user can continue navigating using the indices, return to a higher
level using u, or enter q to exit.

Python Library

The Delver class, which powers the delve tool above, can be used
directly when working in a python interpreter:

In [1]: import delver

In [2]: test_object = {'foo': 200, 'bar': False, 'baz': [1, 2, 3]}

In [3]: delver.run(test_object)
-------------------------------------------------------------------------------
At path: root
Dict (length 3)
+-----+-----+------------------+
| Idx | Key | Data             |
+-----+-----+------------------+
| 0   | bar | False            |
| 1   | baz | <list, length 3> |
| 2   | foo | 200              |
+-----+-----+------------------+
[<key index>, u, q] -->

This can be useful when debugging software (it works the same in
pdb or
ipdb), as well as when working in a new or
unfamiliar codebase. For example, it's very easy to see what public methods
and classes are defined in a package:

In [1]: import delver

In [2]: import unittest

In [3]: delver.run(unittest)
-------------------------------------------------------------------------------
At path: root
+-----+-------------------+---------------------------------------------------+
| Idx | Attribute         | Data                                              |
+-----+-------------------+---------------------------------------------------+
| 0   | BaseTestSuite     | <class 'unittest.suite.BaseTestSuite'>            |
| 1   | FunctionTestCase  | <class 'unittest.case.FunctionTestCase'>          |
| 2   | SkipTest          | <class 'unittest.case.SkipTest'>                  |
| 3   | TestCase          | <class 'unittest.case.TestCase'>                  |
| 4   | TestLoader        | <class 'unittest.loader.TestLoader'>              |
| 5   | TestProgram       | <class 'unittest.main.TestProgram'>               |
| 6   | TestResult        | <class 'unittest.result.TestResult'>              |
| 7   | TestSuite         | <class 'unittest.suite.TestSuite'>                |
| 8   | TextTestResult    | <class 'unittest.runner.TextTestResult'>          |
| 9   | TextTestRunner    | <class 'unittest.runner.TextTestRunner'>          |
| 10  | case              | <module 'unittest.case' from                      |
|     |                   | '/Users/nscience/python2.7/unittest/case.pyc'>    |
| 11  | defaultTestLoader | <unittest.loader.TestLoader object at 0x10eeb0d10>|
| 12  | expectedFailure   | <function expectedFailure at 0x10eeab8c0>         |
| 13  | findTestCases     | <function findTestCases at 0x10eeb59b0>           |
| 14  | getTestCaseNames  | <function getTestCaseNames at 0x10eeb58c0>        |
| 15  | installHandler    | <function installHandler at 0x10eeb5f50>          |
| 16  | loader            | <module 'unittest.loader' from                    |
|     |                   | '/Users/nscience/python2.7/unittest/loader.pyc'>  |
| 17  | main              | <class 'unittest.main.TestProgram'>               |
| 18  | makeSuite         | <function makeSuite at 0x10eeb5938>               |
| 19  | registerResult    | <function registerResult at 0x10eeb5e60>          |
| 20  | removeHandler     | <function removeHandler at 0x10eec2050>           |
| 21  | removeResult      | <function removeResult at 0x10eeb5ed8>            |
| 22  | result            | <module 'unittest.result' from                    |
|     |                   | '/Users/nscience/python2.7/unittest/result.pyc'>  |
| 23  | runner            | <module 'unittest.runner' from                    |
|     |                   | '/Users/nscience/python2.7/unittest/runner.pyc'>  |
| 24  | signals           | <module 'unittest.signals' from                   |
|     |                   | '/Users/nscience/python2.7/unittest/signals.pyc'> |
| 25  | skip              | <function skip at 0x10eeab758>                    |
| 26  | skipIf            | <function skipIf at 0x10eeab7d0>                  |
| 27  | skipUnless        | <function skipUnless at 0x10eeab848>              |
| 28  | suite             | <module 'unittest.suite' from                     |
|     |                   | '/Users/nscience/python2.7/unittest/suite.pyc'>   |
| 29  | util              | <module 'unittest.util' from                      |
|     |                   | '/Users/nscience/python2.7/unittest/util.pyc'>    |
+-----+-------------------+---------------------------------------------------+
[<attr index>, u, q] -->

Moving through the hierarchy, then, enables quickly understanding all the parts
that make up the unfamiliar library.

Advanced Features

This tool is typically used to look through large JSON payloads where seeing
the entirety of the file in a text editor or on a web page is
unwieldy/inconvenient. The advanced features allow for simplifying payloads or making
them easier to navigate and explore.

Specifying a Data Transform from the Command Line

The delve script allows for the ability to specify a 'transform' step to
be applied before the data is actually explored. This might be used in the case
where unwanted fields in the JSON should be ignored. For example, consider the
following dataset:

{
   "company_name": "MegaCorp",
   "company_location": "Gotham",
   "company_description": "Innovator in the corporate activity space",
   "subsidiary_companies": [
     {
       "company_name": "tinycorp",
       "company_location": "Gotham",
       "company_id": "2391235091875091348523472634782352354981723409128734019283471203941239085"
    },
    {
      "company_name": "smallcompany",
      "company_location": "Podunk",
      "company_id": "3912750918273410928347120938751098234712034981250917123049817234091283471"
    }
  ]
}

When viewing/exploring the data, it may not be necessary to see the large
company_id field on each of the subsidiary_companies. If we defined the
following transform function in a module called transform.py which is within
the current directories scope (i.e. is listed in the PYTHONPATH or is
within the current directory), then we can appropriately ignore that field when
exploring.

def remove_company_ids(payload):
    """Given a company payload, remove all of the 'company_id' fields
    within the company dictionaries listed under 'subsidiary_companies'.

    :param payload: dictionary containing company information with company
        id fields to remove
    :type payload: ``dict``

    :return: a modified *payload* without any 'company_id' fields
    :rtype: ``dict``
    """
    for company in payload.get('subsidiary_companies', []):
        del company['company_id']
    return payload

To run the delve command with the transform, just specify the transform-func
parameter:

$  delve company_info.json --transform-func transform:remove_company_ids
-------------------------------------------------------------------------------
Dict (length 4)
+-----+----------------------+-------------------------------------------+
| Idx | Key                  | Data                                      |
+-----+----------------------+-------------------------------------------+
| 0   | company_description  | Innov...
Read more