Support "Advanced" level validation (#28)
* Created tests for advanced support

* Contributed Test Harness Project - AddColumn, init.sql

Set up support to begin running contributed tests with the init script, with addColumn change type tests passing

* Add more contributed tests

* Update some foreign key constraints - Base Test Harness

* Add more Base Tests

* Add yet more base test functionality

* Add table/columns/view remarks support - update Readme with progress

* Add advanced tests

* Finish Contributed Testing

1. Finished Contributed Tests (loadUpdateData, etc.)
2. Make GitHub Actions run each testing level in series so the levels do not disturb each other

* update readme

* Add more advanced testing

1. Fix Foreign Key Snapshotting
2. Start createIndex change type mapping to CLUSTER BY

* Add Support For: CheckConstraints, Optimize, Analyze, ClusteredTables

Add Databricks specific change type support.

* Update Terraform to use new DBR LTS 13.3

* Add support for creating partitioned tables

Add support and tests for creating partitioned tables

* Test Terraform Catalog Issue

* Finish Advanced Test Harness

Finish Advanced Test Harness

* Update Terraform to use DBSQL warehouse for testing

* Troubleshoot Terraform deployment

* Switch back from id to name in Terraform deployment

* Update test.yml

Updating to the most recent version of the build logic to see if the Sonar scan issue is fixed.

* Update DBSQL creation Terraform for testing

* Remove bad Terraform parameter from test harness

---------

Co-authored-by: CodyAustinDavis <[email protected]>
Co-authored-by: Kevin Chappell <[email protected]>
3 people authored Sep 28, 2023
1 parent cf899ab commit d063292
Showing 177 changed files with 4,190 additions and 223 deletions.
9 changes: 5 additions & 4 deletions .github/workflows/lth.yml
Original file line number Diff line number Diff line change
@@ -17,12 +17,13 @@ jobs:
TF_VAR_DBX_HOST: ${{ secrets.TH_DATABRICKS_WORKSPACE_HOST }}
TF_VAR_DBX_TOKEN: ${{ secrets.TH_DATABRICKS_WORKSPACE_TOKEN }}
TF_VAR_TEST_CATALOG: main
TF_VAR_TEST_SCHEMA: lb_test_harness
TF_VAR_TEST_SCHEMA: liquibase_harness_test_ds
WORKSPACE_ID: ${{ secrets.TH_DATABRICKS_WORKSPACE_ID }}

strategy:
max-parallel: 1
matrix:
liquibase-support-level: [Foundational] # Define the different test levels to run
liquibase-support-level: [Foundational, Contributed, Advanced] # Define the different test levels to run
fail-fast: false # Set fail-fast to false to run all test levels even if some of them fail

steps:
@@ -41,9 +42,9 @@ jobs:
- name: Collect Databricks Config
working-directory: src/test/terraform
run: |
CLUSTER_ID=$(terraform output -raw cluster_url)
CLUSTER_ID=$(terraform output -raw endpoint_url)
DATABRICKS_HOST=${TF_VAR_DBX_HOST#https://}
echo "DATABRICKS_URL=jdbc:databricks://$DATABRICKS_HOST:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/$WORKSPACE_ID/$CLUSTER_ID;AuthMech=3;ConnCatalog=$TF_VAR_TEST_CATALOG;ConnSchema=$TF_VAR_TEST_SCHEMA;EnableArrow=0" >> "$GITHUB_ENV"
echo "DATABRICKS_URL=jdbc:databricks://$DATABRICKS_HOST:443/default;transportMode=http;ssl=1;httpPath=/sql/1.0/warehouses/$CLUSTER_ID;AuthMech=3;ConnCatalog=$TF_VAR_TEST_CATALOG;ConnSchema=$TF_VAR_TEST_SCHEMA;EnableArrow=0" >> "$GITHUB_ENV"
- name: Setup Temurin Java 17
uses: actions/setup-java@v3
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -10,5 +10,5 @@ on:

jobs:
build-test:
uses: liquibase/build-logic/.github/workflows/os-extension-test.yml@v0.3.1
uses: liquibase/build-logic/.github/workflows/os-extension-test.yml@v0.4.3
secrets: inherit
105 changes: 87 additions & 18 deletions README.md
@@ -1,29 +1,98 @@
# liquibase-databricks
# Liquibase-Databricks Connector


## Current Summary
Base and Foundational Change types should be supported at this stage. Change types such as procedures, triggers, merge column, indexes are not supported.
Databricks specific change types that are added are listed below along with their completion status.


## To Do:
## Summary
This is the Liquibase Extension for Managing Delta Tables on DatabricksSQL.

1. Add unit tests with liquibase test harness - Cody Davis - Done
2. Pass Foundational Test Harness - Cody Davis - Done
3. Pass Base Test Harness - Cody Davis - In Progress - ETA May 15, 2023
4. Pass Advanced Test Harness - Unassigned - Not Started

## Change Types to Add:
Base/Contributed and Foundational change types should be supported at this stage. Change types such as procedures, triggers, sequences, and indexes are not supported.
Databricks specific change types that are added are listed below along with their completion status.
Databricks tables created with Liquibase are automatically created with the Delta configs/versions required for all passing change types, including: 'delta.feature.allowColumnDefaults' = 'supported', 'delta.columnMapping.mode' = 'name'


## NOTE! ONLY TABLES CREATED WITH UNITY CATALOG ARE SUPPORTED FOR MOST ADVANCED OPERATIONS
This extension utilizes Unity Catalog system tables for many advanced operations, such as snapshotting and identifying various constraints (PK/FK/NOT NULL, etc.).
If hive_metastore is used, it is not tested and may not provide all of the functionality below.


## Harness Status:

1. [x] Add unit tests with liquibase test harness - Cody Davis - DONE
2. [x] Pass Foundational Test Harness - Cody Davis - DONE 4/1/2023
3. [x] Pass Contributed Test Harness - Cody Davis - DONE 9/15/2023
4. [x] Pass Advanced Test Harness - Cody Davis - DONE 9/28/2023


## Currently Supported Change Types:

### Contributed / Base
1. [x] createTable/dropTable
2. [x] addColumn/dropColumn
3. [x] addPrimaryKey/dropPrimaryKey
4. [x] addForeignKey/dropForeignKey
5. [x] addNotNullConstraint/dropNotNullConstraint
6. [x] createTable/createTableDataTypeText/createTableTimestamp/dropTable
7. [x] createView/dropView
8. [x] dropAllForeignKeyConstraints
9. [x] createView/dropView
10. [x] setTableRemarks - supported, but not returned in snapshot because the JDBC driver does not populate it
11. [x] setColumnRemarks
12. [x] setViewRemarks (set in TBLPROPERTIES ('comment' = '<comment>'))
13. [x] executeCommand
14. [x] mergeColumns
15. [x] modifySql
16. [x] renameColumn
17. [x] renameView
18. [x] sql
19. [x] sqlFile
20. [x] Change Data Test: apply delete
21. [x] Change Data Test: apply insert
22. [x] Change Data Test: apply loadData
23. [x] Change Data Test: apply loadDataUpdate
24. [ ] Add/Drop Check Constraints - TO DO: need to create a snapshot generator, but the change type works
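To make the list above concrete, here is a minimal changelog sketch exercising a few of these contributed change types (the table and column names are invented for illustration, not taken from the test harness):

```xml
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd">

    <!-- createTable, as in item 1 above -->
    <changeSet id="1" author="example">
        <createTable tableName="customers">
            <column name="id" type="bigint"/>
            <column name="email" type="varchar(50)"/>
        </createTable>
    </changeSet>

    <!-- addColumn and setTableRemarks, as in items 2 and 10 above -->
    <changeSet id="2" author="example">
        <addColumn tableName="customers">
            <column name="created_at" type="timestamp"/>
        </addColumn>
        <setTableRemarks tableName="customers" remarks="Customer dimension table"/>
    </changeSet>

</databaseChangeLog>
```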

### Advanced
1. [x] addColumn snapshot
2. [x] addPrimaryKey snapshot
3. [x] addForeignKey snapshot
4. [x] schemaAndCatalogSnapshot snapshot
5. [x] createTable snapshot
6. [x] createView snapshot
7. [x] generateChangelog
8. [x] addUniqueConstraint - not supported
9. [x] createIndex - not supported; use the changeClusterColumns change type for Databricks to map to CLUSTER BY ALTER TABLE statements for Delta Tables


### Databricks Specific:
1. [x] OPTIMIZE - optimizeTable - optimize with zorderCols options - <b> SUPPORTED </b> in Contributed Harness
2. [x] CLUSTER BY (DDL) - createClusteredTable - createTable with clusterColumns as additional option for liquid - <b> SUPPORTED </b> in Contributed Harness
3. [x] ANALYZE TABLE - analyzeTable - change type with compute stats column options - <b> SUPPORTED </b> in Contributed Harness
4. [x] VACUUM - vacuumTable - change type with retentionHours parameter (default is 168) - <b> SUPPORTED </b> in Contributed Harness
5. [ ] ALTER CLUSTER KEY - changeClusterColumns - change type that will be used until index change types are mapped with CLUSTER BY columns for snapshot purposes
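A sketch of how the Databricks-specific change types above are invoked, assuming the `ext` namespace is declared as in this repository's example changelogs (the table name and parameter values here are hypothetical):

```xml
<!-- OPTIMIZE with Z-ORDER columns -->
<changeSet id="10" author="example">
    <ext:optimizeTable tableName="events" zorderColumns="event_date"/>
</changeSet>

<!-- ANALYZE TABLE with compute-stats columns -->
<changeSet id="11" author="example">
    <ext:analyzeTable tableName="events" analyzeColumns="event_date"/>
</changeSet>

<!-- VACUUM with the default 168-hour retention stated explicitly -->
<changeSet id="12" author="example">
    <ext:vacuumTable tableName="events" retentionHours="168"/>
</changeSet>
```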


## Remaining Required Change Types to Finish in Base/Contributed
1. [ ] (nice to have, not required) createFunction/dropFunction - in Liquibase Pro, should work in Databricks, but change type not accessible from Liquibase Core
2. [x] (nice to have, not required) addCheckConstraint/dropCheckConstraint - in Liquibase Pro, should work in Databricks, but change type not accessible from Liquibase Core
3. [ ] addDefaultValue (of various types). Databricks/Delta tables support this, but it does not get populated by Databricks in the JDBC driver (the COLUMN_DEF property is always None even with a default)

The remaining change types are not relevant to Databricks and have been marked with INVALID TEST.


## Aspirational Roadmap - Databricks Specific Additional Change Types to Add:

1. COPY INTO
2. MERGE
3. RESTORE VERSION AS OF
4. ANALYZE TABLE - Code Complete - Cody Davis
5. SET TBL PROPERTIES - In Progress - Cody Davis
4. ANALYZE TABLE - Code Complete - Adding Tests - Cody Davis
5. SET TBL PROPERTIES - (Defaults are in createTable change type with min required table props to support Liquibase)
6. CLONE
7. BLOOM FILTERS
8. OPTIMIZE / ZORDER - Code Complete - No Test Yet - Cody Davis
9. VACUUM - Code Complete - Cody Davis
7. BLOOM FILTERS - Maybe do not support, CLUSTER BY should be the primary indexing mechanism long term
8. OPTIMIZE / ZORDER - Code Complete - Adding Tests - Cody Davis
9. VACUUM - Code Complete - Adding Tests - Cody Davis
10. SYNC IDENTITY
11. VOLUMES
12. GRANT / REVOKE statements
13. CLUSTER BY - Similar to indexes, important to support as a create table / alter table set of change types (params in the createTable change), plus an addClusterKey change type for ALTER TABLE



7 changes: 6 additions & 1 deletion pom.xml
@@ -70,6 +70,11 @@
<artifactId>liquibase-core</artifactId>
<version>${liquibase.version}</version>
</dependency>
<dependency>
<groupId>org.liquibase</groupId>
<artifactId>liquibase-commercial</artifactId>
<version>${liquibase.version}</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
@@ -163,7 +168,7 @@
<configuration>
<attach>true</attach>
<author>false</author>
<doctitle>Liquibase Databricks ${version} API</doctitle>
<doctitle>Liquibase Databricks ${project.version} API</doctitle>
<quiet>true</quiet>
<doclint>none</doclint>
<encoding>UTF-8</encoding>
8 changes: 8 additions & 0 deletions project/app-changelog.xml
@@ -0,0 +1,8 @@
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd">

<!-- Changes to include -->
<includeAll path="project/changes/"/>

</databaseChangeLog>
28 changes: 28 additions & 0 deletions project/changes/example.changelog.xml
@@ -0,0 +1,28 @@
<?xml version="1.0" encoding="UTF-8"?>


<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:ext="http://www.liquibase.org/xml/ns/dbchangelog-ext"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd http://www.liquibase.org/xml/ns/dbchangelog-ext http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-ext.xsd">

<changeSet id="3" author="example">
<ext:createTable tableName="managed_system" tableFormat="delta">
<column name="id" type="int"/>
<column name="name" type="varchar(20)"/>
</ext:createTable>
</changeSet>

<changeSet id="4" author="example">
<createTable tableName="user_table">
<column name="id" type="int"/>
<column name="username" type="varchar(20)"/>
<column name="password" type="varchar(20)"/>
</createTable>
</changeSet>

<changeSet id="5" author="example">
<ext:optimize tableName="user_table" zorderColumns="id"/>
</changeSet>

</databaseChangeLog>
23 changes: 23 additions & 0 deletions project/changes/example.changelog.yml
@@ -0,0 +1,23 @@
databaseChangeLog:

- changeSet:
id: 1
author: codydavis
changes:
- createTable:
catalogName: main
schemaName: liquibase
tableName: test_yml
columns:
- column:
name: id
type: bigint
autoIncrement: true

- column:
name: name_of_col
type: string

- column:
name: test_struct
type: STRUCT<ID string, name string>
10 changes: 9 additions & 1 deletion project/example.changelog.xml
@@ -22,7 +22,15 @@
</changeSet>

<changeSet id="3" author="example">
<ext:optimize tableName="secure_user_table" zorderColumns="col1"/>
<ext:optimizeTable tableName="user_table" zorderColumns="id"/>
</changeSet>

<changeSet id="4" author="example">
<ext:analyzeTable tableName="user_table" analyzeColumns="id"/>
</changeSet>

<changeSet id="5" author="example">
<ext:vacuumTable tableName="user_table" retentionHours="72"/>
</changeSet>

</databaseChangeLog>
@@ -0,0 +1,105 @@
package liquibase.ext.databricks.change.addCheckConstraint;

import liquibase.change.*;
import liquibase.database.Database;
import liquibase.ext.databricks.change.dropCheckConstraint.DropCheckConstraintChangeDatabricks;
import liquibase.ext.databricks.database.DatabricksDatabase;
import liquibase.statement.SqlStatement;
import java.text.MessageFormat;

@DatabaseChange(name = "addCheckConstraint",
description = "Adds check constraint to Delta Table",
priority = DatabricksDatabase.PRIORITY_DEFAULT + 200,
appliesTo = {"column"}
)
public class AddCheckConstraintChangeDatabricks extends AbstractChange {

private String catalogName;
private String schemaName;
private String tableName;

private String constraintName;

private String constraintBody;

public String getCatalogName() {
return catalogName;
}

public void setCatalogName (String catalogName) {
this.catalogName = catalogName;
}

public String getTableName() {
return tableName;
}

public void setTableName (String tableName) {
this.tableName = tableName;
}

public String getSchemaName() {
return schemaName;
}

public void setSchemaName (String schemaName) {
this.schemaName = schemaName;
}


// Name of Delta Table Constraint
@DatabaseChangeProperty(
description = "Name of the check constraint"
)
public String getConstraintName() {
return this.constraintName;
}

public void setConstraintName(String name) {
this.constraintName = name;
}


// This is the SQL expression defining the constraint

@DatabaseChangeProperty(
serializationType = SerializationType.DIRECT_VALUE
)
public String getConstraintBody() {
return this.constraintBody;
}

public void setConstraintBody(String body) {
this.constraintBody = body;
}

@Override
public String getConfirmationMessage() {
return MessageFormat.format("{0}.{1}.{2} successfully added check constraint {3}.", getCatalogName(), getSchemaName(), getTableName(), getConstraintName());
}

@Override
protected Change[] createInverses() {
DropCheckConstraintChangeDatabricks var1 = new DropCheckConstraintChangeDatabricks();
var1.setTableName(getTableName());
var1.setSchemaName(getSchemaName());
var1.setCatalogName(getCatalogName());
var1.setConstraintName(getConstraintName());

return new Change[]{var1};
}

@Override
public SqlStatement[] generateStatements(Database database) {

AddCheckConstraintStatementDatabricks statement = new AddCheckConstraintStatementDatabricks();

statement.setCatalogName(getCatalogName());
statement.setSchemaName(getSchemaName());
statement.setTableName(getTableName());
statement.setConstraintName(getConstraintName());
statement.setConstraintBody(getConstraintBody());

return new SqlStatement[] {statement};
}
}
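As a usage sketch for the change type defined above, an addCheckConstraint changeSet might look like the following. The names and the catalog/schema values are hypothetical; the constraint body is given as the element's text because getConstraintBody is serialized with SerializationType.DIRECT_VALUE:

```xml
<!-- Assumes the ext namespace is bound to the Databricks extension,
     as in this repository's example changelogs -->
<changeSet id="20" author="example">
    <ext:addCheckConstraint catalogName="main"
                            schemaName="liquibase"
                            tableName="customers"
                            constraintName="valid_id">id &gt; 0</ext:addCheckConstraint>
</changeSet>
```

Because the class implements createInverses(), rolling back this changeSet should generate the corresponding DropCheckConstraintChangeDatabricks automatically.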
