`GpsBatchJobs._start_job` is a vital part of batch job handling in the geopyspark driver, but it has a very high tech debt score and (as far as I can tell) zero unit test coverage.
Some quick observations:

- 500+ lines of code
- a giant `if isKube: ... else: ...` construct, with roughly 200 LoC in the `if` branch and 100 LoC in the `else` branch
- a weird construction around `sparkapplication.yaml.j2`, where a Jinja template is used to generate YAML, which is then parsed and eventually converted to a dict. This feels a bit like overkill, but more importantly: a lot can go wrong there, and there is zero unit testing
- likewise, the `submit_batch_job_spark3.sh` code path is pretty cumbersome: a giant positional argument list with ad-hoc serialization and deserialization in bash (also see #627: Simplify "submit_batch_job_spark3.sh" workflow)
- a lot of hardcoded assumptions and VITO-specific references
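To illustrate the "overkill" point about the Jinja-to-YAML-to-dict round trip: since the end product is a Python dict anyway, the body could be built directly in Python, which is trivially unit-testable. A minimal, hypothetical sketch (the helper name and the trimmed-down field set are mine, not the real `sparkapplication.yaml.j2` contents):

```python
def spark_application_body(job_name: str, driver_memory: str = "4g") -> dict:
    """Hypothetical alternative: build the SparkApplication body directly as a dict,
    instead of rendering sparkapplication.yaml.j2 and parsing the YAML back.
    Field names here loosely follow the Spark operator CRD shape; the real
    template has many more fields."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": job_name},
        "spec": {"driver": {"memory": driver_memory}},
    }


# A unit test for this seam needs no Kubernetes cluster at all:
def test_spark_application_body():
    body = spark_application_body("job-123", driver_memory="8g")
    assert body["kind"] == "SparkApplication"
    assert body["metadata"]["name"] == "job-123"
    assert body["spec"]["driver"]["memory"] == "8g"
```

Even if the template is kept, the render-then-parse step could be isolated behind a similar function and covered with the same kind of test.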
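Similarly, the fragile positional hand-off to `submit_batch_job_spark3.sh` could be collapsed into a single serialized argument, so bash only passes one opaque blob through instead of re-parsing a dozen positional fields. A rough sketch of that idea, assuming a hypothetical `--job-options` flag on the script (not the script's current interface):

```python
import json


def build_submit_command(job_options: dict) -> list:
    """Hypothetical alternative to the giant positional argument list:
    serialize all job options as one JSON argument. The receiving (Python)
    entry point then deserializes with a single json.loads, and bash never
    has to do ad-hoc parsing."""
    return [
        "./submit_batch_job_spark3.sh",
        "--job-options",
        json.dumps(job_options),
    ]


cmd = build_submit_command({"job_id": "j-42", "driver_memory": "4g"})
```

This also makes the command construction a pure function that can be asserted on in a unit test, without launching anything.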
FYI: the lack of test coverage makes it risky to add features here, e.g. as I did with #845/#914. I basically have to merge into master and wait for integration tests to discover a problem. This slows down my own work cycle, and I potentially break the cycle of other people/projects.
soxofaan changed the title from `GpsBatchJobs_start_job` to `GpsBatchJobs._start_job` on Oct 23, 2024