Pipeline is not executed for parameter with name size or nfiles #10296

Open
@shcheklein

Bug Report

Description

See this Stack Overflow question: https://stackoverflow.com/questions/77962532/dvc-using-cached-run-although-parameter-changed

Reproduce

Use this repo: https://github.com/shcheklein/test-dvc-so-77962532

Run the pipeline with size set to 30, then change size to 40, run dvc status, and run dvc repro again. DVC does not re-run the stage and prints this instead:

Stage 'data_ingestion' is cached - skipping run, checking out outputs
Updating lock file 'dvc.lock'

To track the changes with git, run:

	git add dvc.lock

To enable auto staging, run:

	dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.

The output file size stays the same, so the stage command did not run with the new value.

Logs

2024-02-08 20:48:29,997 DEBUG: v3.44.0 (pip), CPython 3.11.4 on macOS-13.3.1-arm64-arm-64bit
2024-02-08 20:48:29,998 DEBUG: command: /Users/ivan/Projects/test-dvc-so/.venv/bin/dvc repro -v
2024-02-08 20:48:30,158 DEBUG: Dependency 'params.yaml' of stage: 'data_ingestion' changed because it is '{'size': 'modified'}'.
2024-02-08 20:48:30,159 DEBUG: stage: 'data_ingestion' changed.
2024-02-08 20:48:30,159 DEBUG: Removing output 'artifacts/data_ingestion' of stage: 'data_ingestion'.
2024-02-08 20:48:30,160 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/artifacts/data_ingestion'
2024-02-08 20:48:30,161 DEBUG: {}
2024-02-08 20:48:30,161 DEBUG: defaultdict(<class 'dict'>, {'params.yaml': {'size': 'modified'}})
Stage 'data_ingestion' is cached - skipping run, checking out outputs
2024-02-08 20:48:30,163 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/artifacts/.COXZdYuRz3gn4oeArpSdWQ.tmp'
2024-02-08 20:48:30,164 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/artifacts/.COXZdYuRz3gn4oeArpSdWQ.tmp'
2024-02-08 20:48:30,164 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/.dvc/cache/files/md5/.wnNey-IUBNjTwgwkjkVJoQ.tmp'
2024-02-08 20:48:30,170 DEBUG: built tree 'object 3d7dd9c155ee06ec6ff8fa04e49f49fe.dir'
2024-02-08 20:48:30,170 DEBUG: Computed stage: 'data_ingestion' md5: '91baabba76b22d5f1480db2cfe105d8b'
2024-02-08 20:48:30,173 DEBUG: built tree 'object 3d7dd9c155ee06ec6ff8fa04e49f49fe.dir'
2024-02-08 20:48:30,173 DEBUG: Preparing to transfer data from 'memory://dvc-staging-md5/2b21226c06eec22f3477afe4c6de75a80828635723b703713230e4c3c4c39626' to '/Users/ivan/Projects/test-dvc-so/.dvc/cache/files/md5'
2024-02-08 20:48:30,173 DEBUG: Preparing to collect status from '/Users/ivan/Projects/test-dvc-so/.dvc/cache/files/md5'
2024-02-08 20:48:30,173 DEBUG: Collecting status from '/Users/ivan/Projects/test-dvc-so/.dvc/cache/files/md5'
2024-02-08 20:48:30,174 DEBUG: built tree 'object 3d7dd9c155ee06ec6ff8fa04e49f49fe.dir'
2024-02-08 20:48:30,174 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/artifacts/.z_523r89dhvz_hXD3vW61g.tmp'
2024-02-08 20:48:30,174 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/artifacts/.z_523r89dhvz_hXD3vW61g.tmp'
2024-02-08 20:48:30,174 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/.dvc/cache/files/md5/.OiTk5AM8wHoSOkDtcj45sA.tmp'
2024-02-08 20:48:30,175 DEBUG: Removing '/Users/ivan/Projects/test-dvc-so/artifacts/data_ingestion/test_data.csv'
2024-02-08 20:48:30,177 DEBUG: stage: 'data_ingestion' was reproduced
Updating lock file 'dvc.lock'

To track the changes with git, run:

	git add dvc.lock

To enable auto staging, run:

	dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
2024-02-08 20:48:30,182 DEBUG: Analytics is enabled.
2024-02-08 20:48:30,222 DEBUG: Trying to spawn ['daemon', 'analytics', '/var/folders/8f/fbysfztx1mb953_gpwl477p80000gn/T/tmpf_cyrru9', '-v']
2024-02-08 20:48:30,226 DEBUG: Spawned ['daemon', 'analytics', '/var/folders/8f/fbysfztx1mb953_gpwl477p80000gn/T/tmpf_cyrru9', '-v'] with pid 6119
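
The log above shows a contradiction: the params.yaml dependency is reported as changed ({'size': 'modified'}), yet the stage is then served from the run cache. One way this could happen, and this is an assumption rather than something confirmed in this report, is that the run-cache key is computed after stripping every field named size or nfiles, names that dvc.lock also uses for output metadata. The sketch below is not DVC's code; the helper names, the excluded-key set, and the stage command are hypothetical and only illustrate how such a recursive filter would make two different size values hash to the same key.

```python
import hashlib
import json

# Hypothetical set of keys treated as output metadata and dropped before hashing.
EXCLUDED_KEYS = {"size", "nfiles"}


def strip_keys(obj, excluded):
    """Recursively drop every dict key in `excluded`, at any nesting depth."""
    if isinstance(obj, dict):
        return {k: strip_keys(v, excluded) for k, v in obj.items() if k not in excluded}
    if isinstance(obj, list):
        return [strip_keys(v, excluded) for v in obj]
    return obj


def cache_key(entry, excluded=EXCLUDED_KEYS):
    """Hash a stage entry after stripping the excluded keys."""
    filtered = strip_keys(entry, excluded)
    payload = json.dumps(filtered, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def stage_entry(size):
    # Hypothetical lock-file-like entry; the command and layout are illustrative.
    return {
        "cmd": "python ingest.py",
        "params": {"params.yaml": {"size": size}},
        "outs": ["artifacts/data_ingestion"],
    }


# With the recursive filter, the user parameter named "size" is dropped too,
# so both values produce the same key and the old run would look reusable.
print(cache_key(stage_entry(30)) == cache_key(stage_entry(40)))

# Without the filter, the keys differ and the stage would be re-run.
print(cache_key(stage_entry(30), excluded=set()) == cache_key(stage_entry(40), excluded=set()))
```

If this assumption holds, a parameter named nfiles would be affected in the same way, which matches the issue title. A parameter with any other name would change the key.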

Expected

The data_ingestion stage is re-run because the size parameter changed.

Environment information

(.venv) √ Projects/test-dvc-so % dvc version
DVC version: 3.44.0 (pip)
-------------------------
Platform: Python 3.11.4 on macOS-13.3.1-arm64-arm-64bit
Subprojects:
	dvc_data = 3.11.0
	dvc_objects = 5.0.0
	dvc_render = 1.0.1
	dvc_task = 0.3.0
	scmrepo = 3.1.0
Supports:
	http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3)
Config:
	Global: /Users/ivan/Library/Application Support/dvc
	System: /Library/Application Support/dvc
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk3s1s1
Caches: local
Remotes: None
Workspace directory: apfs on /dev/disk3s1s1
Repo: dvc, git
Repo.site_cache_dir: /Library/Caches/dvc/repo/4883da32ce8435ea352f10b710b4a968

Metadata

Assignees: No one assigned
Labels: A: pipelines (Related to the pipelines feature), bug (Did we break something?)
Status: Backlog
Milestone: No milestone
Development: No branches or pull requests