
Databricks SQL Query Logger

Retrieves DBSQL query history and merges it into a Delta Lake table.

A System Table containing similar data is on the Databricks roadmap, so solutions like this will be largely unnecessary in the near future.

Features

  • Ingests DBSQL query history based on supplied parameters
  • Supports incremental ingestion
  • Works in triggered or continuous mode:
    • triggered - ingests data and exits
    • continuous - ingests data every N seconds until cancelled

Getting Started

Dependencies

  • Deletion Vectors (DBR 14.1 and above) and Liquid Clustering (DBR 13.3 and above). If you need to support an earlier DBR, update the create_target_table function accordingly (the sketch after this list shows the table features involved).
  • The Python whl includes Databricks SDK 0.25.1 as a dependency. If you use the module directly, test any other SDK version before relying on it.
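
As a rough illustration of what those dependencies imply for the target table, the sketch below creates a Delta table with Liquid Clustering and Deletion Vectors enabled. The table name and columns are placeholders, not the package's actual schema; create_target_table in the package defines the real one.

spark.sql("""
CREATE TABLE IF NOT EXISTS chris_koester.observe.query_history_example (
  query_id STRING,            -- placeholder columns for illustration only
  query_start_time TIMESTAMP
)
CLUSTER BY (query_start_time)                           -- Liquid Clustering (DBR 13.3+)
TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')  -- Deletion Vectors (DBR 14.1+)
""")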

Installation

  • Download the Python whl file from the latest release. It is not published to PyPI.
  • Upload the whl file as a workspace file or to a Unity Catalog volume
  • Option 1: Install as cluster-scoped library
    • Go to the Libraries menu of your cluster
    • Select Install new and navigate to where you saved the whl
  • Option 2: Install as notebook-scoped library
    • Run the following in a notebook cell: %pip install /path/to/my_package.whl (a volume-path example follows this list)
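
For example, if the whl was uploaded to a Unity Catalog volume, the notebook-scoped install might look like the following; the path is illustrative, not an actual release artifact.

%pip install /Volumes/<catalog>/<schema>/<volume>/dbsql_query_logger-<version>-py3-none-any.whl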

See the libraries documentation for full details and additional options.

Usage

The code below shows how to run the DBSQL Query Logger. Refer to docstrings for more details.

The examples folder includes a notebook that uses widgets for easy parameterization.

import logging
from dbsql_query_logger import QueryLogger

logging.basicConfig(
    format="%(asctime)s %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S%z"
)

logger = logging.getLogger('dbsql_query_logger')
logger.setLevel(logging.INFO)

query_logger = QueryLogger(
    catalog = 'chris_koester',
    schema = 'observe',
    table = 'query_history',
    pipeline_mode = 'triggered',   # 'triggered' ingests once and exits; 'continuous' ingests every N seconds until cancelled
    backfill_period = '24 hours',  # how far back in query history to ingest
    reset = 'no'
)

query_logger.run()
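
To run continuously, change pipeline_mode to 'continuous'; the polling interval and remaining options are described in the docstrings. This is a minimal sketch that assumes the other parameters stay the same as above.

query_logger = QueryLogger(
    catalog = 'chris_koester',
    schema = 'observe',
    table = 'query_history',
    pipeline_mode = 'continuous',  # ingests every N seconds until the run is cancelled
    backfill_period = '24 hours',
    reset = 'no'
)

query_logger.run()

Because a continuous run keeps going until cancelled, query the target from another cell or notebook. It is an ordinary Delta table:

display(spark.table('chris_koester.observe.query_history'))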

Authentication

Authentication is handled by the Databricks SDK, which implements the Databricks client unified authentication standard. Examples include:

  • Notebook - uses native notebook authentication. No configuration necessary.
  • Environment variables - environment variables can be set as part of the cluster configuration for scheduled jobs (see the sketch below).
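
As a sketch of the environment-variable approach, the Databricks SDK's unified authentication reads variables such as DATABRICKS_HOST and DATABRICKS_TOKEN. The values below are placeholders; in practice, set them in the cluster configuration (ideally referencing a secret) rather than in code.

import os

# Placeholder values for illustration only; set these in the cluster configuration
# (ideally via a secret scope) instead of hardcoding them.
os.environ['DATABRICKS_HOST'] = 'https://<workspace-url>'
os.environ['DATABRICKS_TOKEN'] = '<personal-access-token>'

# The SDK picks these up automatically, so QueryLogger needs no explicit credentials.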