queryType: select #13

Merged
merged 1 commit into from
Sep 8, 2014
46 changes: 46 additions & 0 deletions pydruid/client.py
@@ -275,6 +275,8 @@ def build_query(self, args):
                query_dict['postAggregations'] = Postaggregator.build_post_aggregators(val)
            elif key == 'datasource':
                query_dict['dataSource'] = val
            elif key == 'paging_spec':
                query_dict['pagingSpec'] = val
            elif key == "filter":
                query_dict[key] = Filter.build_filter(val)
            else:
@@ -491,3 +493,47 @@ def time_boundary(self, **kwargs):
        self.validate_query(valid_parts, kwargs)
        self.build_query(kwargs)
        return self.__post(self.query_dict)

    def select(self, **kwargs):
        """
        A select query returns raw Druid rows and supports pagination.

        Required key/value pairs:

        :param str datasource: Data source to query
        :param str granularity: Time bucket to aggregate data by (hour, day, minute, etc.)
        :param dict paging_spec: Indicates offsets into different scanned segments
        :param intervals: ISO-8601 intervals over which to run the query
        :type intervals: str or list

        Optional key/value pairs:

        :param pydruid.utils.filters.Filter filter: Indicates which rows of data to include in the query
        :param list dimensions: The list of dimensions to select. If left empty, all dimensions are returned
        :param list metrics: The list of metrics to select. If left empty, all metrics are returned

        :return: The query result
        :rtype: list[dict]

        Example:

        .. code-block:: python
            :linenos:

                >>> raw_data = query.select(
                        datasource='twitterstream',
                        granularity='all',
                        intervals='2013-06-14/pt1h',
                        paging_spec={'pagingIdentifiers': {}, 'threshold': 1}
                    )
                >>> print(raw_data)
                [{'timestamp': '2013-06-14T00:00:00.000Z', 'result': {'pagingIdentifiers': {'twitterstream_2013-06-14T00:00:00.000Z_2013-06-15T00:00:00.000Z_2013-06-15T08:00:00.000Z_v1': 1}, 'events': [{'segmentId': 'twitterstream_2013-06-14T00:00:00.000Z_2013-06-15T00:00:00.000Z_2013-06-15T08:00:00.000Z_v1', 'offset': 0, 'event': {'timestamp': '2013-06-14T00:00:00.000Z', 'dim': 'value'}}]}}]
        """
        self.query_type = 'select'
        valid_parts = [
            'datasource', 'granularity', 'filter', 'dimensions', 'metrics',
            'paging_spec', 'intervals'
        ]
        self.validate_query(valid_parts, kwargs)
        self.build_query(kwargs)
        return self.__post(self.query_dict)
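
The docstring says a select query supports pagination, but shows only a single page. Below is a rough sketch of how a caller might walk through several pages by feeding the returned `pagingIdentifiers` into the next `paging_spec`. The helpers `next_paging_spec` and `iter_events`, and the `run_select` callable, are hypothetical names used for illustration and are not part of this change. The sketch assumes the returned offsets can be passed back unchanged; depending on the Druid version, the offsets may need to be advanced first.

```python
from typing import Callable, Dict, Iterator, List


def next_paging_spec(result: List[dict], threshold: int) -> dict:
    """Build the paging_spec for the following page from a select result.

    Hypothetical helper: merges the pagingIdentifiers of every time bucket
    in the result into a single dict.
    """
    identifiers: Dict[str, int] = {}
    for bucket in result:
        identifiers.update(bucket["result"].get("pagingIdentifiers", {}))
    return {"pagingIdentifiers": identifiers, "threshold": threshold}


def iter_events(
    run_select: Callable[[dict], List[dict]], threshold: int
) -> Iterator[dict]:
    """Yield events page by page until a page comes back empty.

    run_select is any callable that takes a paging_spec dict and returns
    a result shaped like the docstring example (an assumption).
    """
    paging_spec = {"pagingIdentifiers": {}, "threshold": threshold}
    while True:
        result = run_select(paging_spec)
        events = [
            event
            for bucket in result
            for event in bucket["result"].get("events", [])
        ]
        if not events:
            return
        for event in events:
            yield event
        following = next_paging_spec(result, threshold)
        if following == paging_spec:
            # The offsets did not move, so stop rather than loop forever.
            return
        paging_spec = following
```

With a client instance, `run_select` could be a small wrapper such as `lambda spec: query.select(datasource='twitterstream', granularity='all', intervals='2013-06-14/pt1h', paging_spec=spec)`.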