Checkpoint, vacuum and compact lead to corruption of a Delta Lake table #3057

Open
valentinmk opened this issue Dec 13, 2024 · 3 comments
Labels: bug Something isn't working

@valentinmk

Environment

Delta-rs version: 0.22.3

Binding: Python

Environment:

  • Cloud provider: custom S3 (where we hit the problem in production)
  • OS: reproduced on macOS with Python 3.12

Bug

What happened: DeltaTable.is_deltatable() returns False for an existing Delta table after a few iterations of ingestion and cleanup via compact + vacuum + checkpoint.

What you expected to happen: write_delta with mode="append" or mode="overwrite" should not break the Delta table; create_checkpoint should not break the Delta table; vacuum should clean up all outdated files.

How to reproduce it: I've prepared a demo script that shows my findings. Please correct me if I'm doing something wrong; my intuition is that some sort of state is stored within delta-rs, e.g. before vacuum it seems to be a good idea to re-initialize dt with DeltaTable(filename).
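
In short (a condensed sketch of the full transcript below; same folder and configuration, assuming polars and deltalake 0.22.3 are installed):

import polars as pl
from datetime import datetime
from deltalake import DeltaTable

filename = "lake_delta"
configuration = {
    "delta.deletedFileRetentionDuration": "interval 0 days",
    "delta.logRetentionDuration": "interval 0 days",
    "delta.targetFileSize": str(128 * 1024 * 1024),
}

def ingest():
    # Append one small batch, same shape as in the transcript below.
    now = int(datetime.now().timestamp() * 1000)
    df = pl.DataFrame([[now], [1], [2], [3], [4], [5]], schema=["dt", "a", "b", "c", "d", "e"])
    df.write_delta(filename, delta_write_options={"configuration": configuration}, mode="append")

for _ in range(2):
    ingest()

# Compact the small files, then write a checkpoint from a freshly loaded table.
DeltaTable(filename).optimize.compact(target_size=128 * 1024 * 1024)
DeltaTable(filename).create_checkpoint()

# Expected True; right after the checkpoint this prints False.
print(DeltaTable.is_deltatable(filename))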

More details:

Start a Python session. Everything below writes to the folder lake_delta.

ipython
Python 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:13:44) [Clang 16.0.6 ]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.30.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import polars as pl
   ...: from deltalake import DeltaTable
   ...: from datetime import datetime

In [2]: filename = "lake_delta"
   ...: configuration = {
   ...:     "delta.deletedFileRetentionDuration": "interval 0 days",
   ...:     "delta.logRetentionDuration": "interval 0 days",
   ...:     "delta.targetFileSize": str(128 * 1024 * 1024),
   ...: }

In [3]: for i in range(2):
   ...:     filename = "lake_delta"
   ...:     now = int(datetime.now().timestamp()*1000)
   ...:     data = [[now], [1], [2], [3], [4], [5]]
   ...:     df = pl.DataFrame(data, schema=["dt", "a", "b", "c", "d", "e"])
   ...:     df.write_delta(filename, delta_write_options={"configuration": configuration},mode="append")
   ...:

Define a helper function to print some information about the table.

In [4]: def print_debug():
   ...:     dt = DeltaTable(filename)
   ...:     print(
   ...:         f"{DeltaTable.is_deltatable(filename)}, Version: {dt.version()}, "
   ...:         f"Num. files: {len(dt.files())} , {dt.metadata()}"
   ...:         )
   ...:     !cat lake_delta/_delta_log/_last_checkpoint
   ...:     print()
   ...: print_debug()
True, Version: 1, Num. files: 2 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
cat: lake_delta/_delta_log/_last_checkpoint: No such file or directory

I run compact to merge the small files into a bigger one.

In [5]: dt = DeltaTable(filename)
   ...: dt.optimize.compact(target_size=128 * 1024 * 1024)
   ...: print_debug()
True, Version: 2, Num. files: 1 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728', 'delta.logRetentionDuration': 'interval 0 days'})
cat: lake_delta/_delta_log/_last_checkpoint: No such file or directory

In [6]: dt = DeltaTable(filename)

In [7]: dt.files()
Out[7]: ['part-00001-a6540901-96a1-4668-a831-44220dd1e39f-c000.zstd.parquet']

In [8]: !ls lake_delta
_delta_log								part-00001-a6540901-96a1-4668-a831-44220dd1e39f-c000.zstd.parquet
part-00001-354066a1-e0bb-4df2-be1a-2bd2aee27390-c000.snappy.parquet	part-00001-dc16787a-1d23-4a16-bd89-b405085bf934-c000.snappy.parquet

Creating a checkpoint corrupts the table: DeltaTable.is_deltatable(filename) now returns False.

In [9]: dt = DeltaTable(filename)
   ...: dt.create_checkpoint()
   ...: print_debug()
False, Version: 2, Num. files: 1 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":3,"sizeInBytes":22878,"version":2}

I continue to ingest new portions of data 5 times (don't ask me why).
Each iteration shows False, until the 5th one, which returns True and the table "heals" itself (?)

In [10]: for i in range(5):
    ...:     now = int(datetime.now().timestamp()*1000)
    ...:     df = pl.DataFrame([[now], [1], [2], [3], [4], [5]], schema=["dt", "a", "b", "c", "d", "e"])
    ...:     df.write_delta(filename, delta_write_options={"configuration": configuration}, mode="append")
    ...:     print_debug()
    ...:
False, Version: 3, Num. files: 2 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":3,"sizeInBytes":22878,"version":2}
False, Version: 4, Num. files: 3 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days'})
{"size":3,"sizeInBytes":22878,"version":2}
False, Version: 5, Num. files: 4 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":3,"sizeInBytes":22878,"version":2}
False, Version: 6, Num. files: 5 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days'})
{"size":3,"sizeInBytes":22878,"version":2}
True, Version: 7, Num. files: 6 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":3,"sizeInBytes":22878,"version":2}

Try to use the existing dt object to run a vacuum.

In [11]: vacuum_log = dt.vacuum(retention_hours=0, enforce_retention_duration=False, dry_run=False)
    ...: display(vacuum_log)
    ...: print_debug()
---------------------------------------------------------------------------
DeltaError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 vacuum_log = dt.vacuum(retention_hours=0, enforce_retention_duration=False, dry_run=False)
      2 display(vacuum_log)
      3 print_debug()

File ~/<....>/.venv/lib/python3.12/site-packages/deltalake/table.py:811, in DeltaTable.vacuum(self, retention_hours, dry_run, enforce_retention_duration, custom_metadata, post_commithook_properties, commit_properties)
    808     if retention_hours < 0:
    809         raise ValueError("The retention periods should be positive.")
--> 811 return self._table.vacuum(
    812     dry_run,
    813     retention_hours,
    814     enforce_retention_duration,
    815     commit_properties,
    816     post_commithook_properties,
    817 )

DeltaError: Generic error: Failed to read delta log object: Object at location /Users/<....>/lake_delta/_delta_log/00000000000000000001.json not found: No such file or directory (os error 2)

Now I try to fix it by re-reading the dt variable. It seems to work, but it already leaves 2 orphan files behind (there should be 6 data files, but the directory contains 8).

In [12]: dt = DeltaTable(filename)
    ...: vacuum_log = dt.vacuum(retention_hours=0, enforce_retention_duration=False, dry_run=False)
    ...: display(vacuum_log)
    ...: print_debug()
[]
True, Version: 7, Num. files: 6 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":3,"sizeInBytes":22878,"version":2}

In [13]: !ls lake_delta
_delta_log								part-00001-9f2cdee3-37b7-4ad0-aba0-ab38f979b9dd-c000.snappy.parquet
part-00001-2543cbb8-eb0d-4df9-937b-6ddae56f0af0-c000.snappy.parquet	part-00001-a6540901-96a1-4668-a831-44220dd1e39f-c000.zstd.parquet
part-00001-354066a1-e0bb-4df2-be1a-2bd2aee27390-c000.snappy.parquet	part-00001-c736372a-4085-4077-a86c-bdc27e482d70-c000.snappy.parquet
part-00001-48098419-e244-4bff-8700-31aa0f442115-c000.snappy.parquet	part-00001-dc16787a-1d23-4a16-bd89-b405085bf934-c000.snappy.parquet
part-00001-77649b97-50b6-4cb9-8877-2600580bb91b-c000.snappy.parquet
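
A quick way to see the orphans (a sketch; it just compares the directory listing with the files referenced by the current snapshot):

import os
from deltalake import DeltaTable

dt = DeltaTable("lake_delta")
on_disk = {f for f in os.listdir("lake_delta") if f.endswith(".parquet")}
referenced = set(dt.files())
# Data files still on disk that the current table version no longer references.
print(sorted(on_disk - referenced))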

Create a checkpoint again

In [14]: dt = DeltaTable(filename)
    ...: dt.create_checkpoint()
    ...: print_debug()
True, Version: 7, Num. files: 6 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":8,"sizeInBytes":23573,"version":7}

The next ingestion via write_delta doesn't break anything.

In [15]: now = int(datetime.now().timestamp()*1000)
    ...: df = pl.DataFrame([[now], [1], [2], [3], [4], [5]], schema=["dt", "a", "b", "c", "d", "e"])
    ...: df.write_delta(filename, delta_write_options={"configuration": configuration},mode="append")
    ...: print_debug()
True, Version: 8, Num. files: 7 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days'})
{"size":8,"sizeInBytes":23573,"version":7}

I execute compaction and cleanup exactly as implemented in my project.

In [16]: dt = DeltaTable(filename)
    ...: optimize = dt.optimize.compact(target_size=128 * 1024 * 1024)
    ...: display(optimize)
    ...: vacuum_log = dt.vacuum(retention_hours=0, enforce_retention_duration=False, dry_run=False)
    ...: display(vacuum_log)
    ...: print(DeltaTable.is_deltatable(filename), dt.version(), dt.metadata(), len(dt.files()))
    ...: dt.create_checkpoint()
    ...: print_debug()
{'numFilesAdded': 1,
 'numFilesRemoved': 7,
 'filesAdded': '{"avg":2082.0,"max":2082,"min":2082,"totalFiles":1,"totalSize":2082}',
 'filesRemoved': '{"avg":1974.0,"max":2052,"min":1961,"totalFiles":7,"totalSize":13818}',
 'partitionsOptimized': 1,
 'numBatches': 7,
 'totalConsideredFiles': 7,
 'totalFilesSkipped': 0,
 'preserveInsertionOrder': True}
['part-00001-9f2cdee3-37b7-4ad0-aba0-ab38f979b9dd-c000.snappy.parquet',
 'part-00001-77649b97-50b6-4cb9-8877-2600580bb91b-c000.snappy.parquet',
 'part-00001-c736372a-4085-4077-a86c-bdc27e482d70-c000.snappy.parquet',
 'part-00001-2543cbb8-eb0d-4df9-937b-6ddae56f0af0-c000.snappy.parquet',
 'part-00001-fbad1b06-feec-4f39-84f7-e7aecfc7eb28-c000.snappy.parquet',
 'part-00001-a6540901-96a1-4668-a831-44220dd1e39f-c000.zstd.parquet',
 'part-00001-48098419-e244-4bff-8700-31aa0f442115-c000.snappy.parquet']
True 9 Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days'}) 1
True, Version: 11, Num. files: 1 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days'})
{"size":3,"sizeInBytes":22884,"version":9}

In [17]: !ls lake_delta
_delta_log								part-00001-41eae111-d48f-4a56-b2f1-615ff26c6246-c000.zstd.parquet
part-00001-354066a1-e0bb-4df2-be1a-2bd2aee27390-c000.snappy.parquet	part-00001-dc16787a-1d23-4a16-bd89-b405085bf934-c000.snappy.parquet

In [18]: now = int(datetime.now().timestamp()*1000)
    ...: df = pl.DataFrame([[now], [1], [2], [3], [4], [5]], schema=["dt", "a", "b", "c", "d", "e"])
    ...: df.write_delta(filename, delta_write_options={"configuration": configuration},mode="append")
    ...: print_debug()
True, Version: 12, Num. files: 2 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":3,"sizeInBytes":22884,"version":9}

In [19]: dt = DeltaTable(filename)
    ...: dt.create_checkpoint()
    ...: print_debug()
True, Version: 12, Num. files: 2 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":4,"sizeInBytes":23090,"version":12}

At version 13 it breaks again ...

In [20]: now = int(datetime.now().timestamp()*1000)
    ...: df = pl.DataFrame([[now], [1], [2], [3], [4], [5]], schema=["dt", "a", "b", "c", "d", "e"])
    ...: df.write_delta(filename, delta_write_options={"configuration": configuration},mode="append")
    ...: print_debug()
False, Version: 13, Num. files: 3 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":4,"sizeInBytes":23090,"version":12}

In [21]: DeltaTable(filename).files()
Out[21]:
['part-00001-ff70cc93-b5c3-41aa-bef9-ba4eb2ec9650-c000.snappy.parquet',
 'part-00001-649c0a34-b19c-470c-8342-9fec50dd0491-c000.snappy.parquet',
 'part-00001-41eae111-d48f-4a56-b2f1-615ff26c6246-c000.zstd.parquet']

In [22]: !ls lake_delta
_delta_log								part-00001-649c0a34-b19c-470c-8342-9fec50dd0491-c000.snappy.parquet
part-00001-354066a1-e0bb-4df2-be1a-2bd2aee27390-c000.snappy.parquet	part-00001-dc16787a-1d23-4a16-bd89-b405085bf934-c000.snappy.parquet
part-00001-41eae111-d48f-4a56-b2f1-615ff26c6246-c000.zstd.parquet	part-00001-ff70cc93-b5c3-41aa-bef9-ba4eb2ec9650-c000.snappy.parquet

We do 14 more ingestions and the table eventually heals itself ...

In [23]: for i in range(14):
    ...:     now = int(datetime.now().timestamp()*1000)
    ...:     df = pl.DataFrame([[now], [1], [2], [3], [4], [5]], schema=["dt", "a", "b", "c", "d", "e"])
    ...:     df.write_delta(filename, delta_write_options={"configuration": configuration},mode="append")
    ...:     print_debug()
    ...:
False, Version: 14, Num. files: 4 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 15, Num. files: 5 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 16, Num. files: 6 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 17, Num. files: 7 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 18, Num. files: 8 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 19, Num. files: 9 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 20, Num. files: 10 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 21, Num. files: 11 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 22, Num. files: 12 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 23, Num. files: 13 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 24, Num. files: 14 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.logRetentionDuration': 'interval 0 days', 'delta.deletedFileRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 25, Num. files: 15 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days', 'delta.targetFileSize': '134217728'})
{"size":4,"sizeInBytes":23090,"version":12}
False, Version: 26, Num. files: 16 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}
True, Version: 27, Num. files: 17 , Metadata(id: c1b8a787-1fd0-47c8-948a-d9b816b8880d, name: None, description: None, partition_columns: [], created_time: 1734107942684, configuration: {'delta.targetFileSize': '134217728', 'delta.deletedFileRetentionDuration': 'interval 0 days', 'delta.logRetentionDuration': 'interval 0 days'})
{"size":4,"sizeInBytes":23090,"version":12}

I hope you can reproduce it locally. I need help with this.

PS: Whatever the current bugs, many thanks for this project <3

@ion-elgreco (Collaborator)

Honestly, the issue report is quite convoluted, so I don't quite follow what you are trying to achieve and what is going wrong.

The only thing I can mention for now concerns this stage: "Try to use the existing dt object to run a vacuum."

The writes you did prior to that were done without passing the dt object, so you never updated the snapshot of your dt object; it was still referencing an older version.
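
Something roughly like this (just a sketch of what I mean) before the vacuum, so the operation runs against the latest snapshot instead of the cached one:

from deltalake import DeltaTable

# Re-read the log so the snapshot includes the commits made since the old dt was created.
dt = DeltaTable("lake_delta")
vacuum_log = dt.vacuum(retention_hours=0, enforce_retention_duration=False, dry_run=False)
print(vacuum_log)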

@valentinmk (Author)

Hi Ion!
The main problem so far: if I run .optimize.compact and right after that run create_checkpoint, something breaks internally and the next DeltaTable.is_deltatable(filename) returns False.
Please check cells 7-9 of the code above.
If you prefer, I can split this issue into smaller ones, but I believe the problem is in the integration of these operations rather than in any single "unit" step.

@ion-elgreco (Collaborator)

I found the issue with the vacuum; I will have to dive into it deeper for a fix. That might also fix the is_deltatable issue by itself.
