Tech debt: GpsBatchJobs._start_job #915

Open
soxofaan opened this issue Oct 23, 2024 · 1 comment

Comments

@soxofaan
Member

GpsBatchJobs._start_job is a vital part of batch job handling in the geopyspark driver, but it has a very high tech debt score and (I think) zero unit test coverage.

Some quick observations:

  • 500+ lines of code
  • giant if isKube: ... else: ... construct with 200 LoC in the if branch and 100 LoC in the else branch
  • weird construction with sparkapplication.yaml.j2, where a Jinja template is used to generate YAML that is then parsed and eventually converted to a dict. Feels a bit like overkill, but more importantly: a lot can go wrong and there is zero unit testing
  • likewise, the submit_batch_job_spark3.sh code path is also pretty cumbersome: giant positional argument list with ad hoc serialization and deserialization in bash. Also see Simplify "submit_batch_job_spark3.sh" workflow #627
  • lots of hardcoded assumptions and VITO references
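
To illustrate the sparkapplication.yaml.j2 point: a rough sketch of building the manifest directly as a dict from a small settings object, which skips the template/YAML/parse round trip and can be unit tested without a cluster. Everything here is an assumption for illustration: JobSettings, build_spark_application and the field selection are hypothetical and not taken from the actual template or from GpsBatchJobs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSettings:
    """Hypothetical subset of what a batch job launch needs."""

    job_id: str
    image: str
    driver_memory: str = "2g"
    executor_memory: str = "4g"
    executor_instances: int = 2


def build_spark_application(settings: JobSettings) -> dict:
    """Build a SparkApplication-style manifest as a plain dict.

    The layout is illustrative only; the real sparkapplication.yaml.j2
    contains far more fields than shown here.
    """
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": f"job-{settings.job_id}"},
        "spec": {
            "image": settings.image,
            "driver": {"memory": settings.driver_memory},
            "executor": {
                "memory": settings.executor_memory,
                "instances": settings.executor_instances,
            },
        },
    }


if __name__ == "__main__":
    manifest = build_spark_application(
        JobSettings(job_id="j-123", image="example/image:latest")
    )
    print(manifest["metadata"]["name"])
```

Because the builder is a pure function, a test can assert on individual keys of the returned dict instead of rendering and re-parsing YAML. The same settings object could in principle also be serialized as JSON for the submit_batch_job_spark3.sh path instead of a long positional argument list, but that is a separate design choice.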
@soxofaan
Member Author

soxofaan commented Oct 23, 2024

FYI: the lack of test coverage makes it risky to add features here, e.g. as I have with #845/#914. I basically have to merge into master and wait for the integration tests to discover a problem. This slows down my own work cycle and potentially breaks the cycle of other people/projects.
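
One possible direction to shorten that feedback loop, sketched under assumptions: move the Kubernetes and non-Kubernetes submission behind a small launcher interface, so that the option handling can be tested with a fake launcher. The names JobLauncher, RecordingLauncher and start_job, as well as the "executor-memory" default, are hypothetical and do not reflect the current code.

```python
from typing import Protocol


class JobLauncher(Protocol):
    """Hypothetical interface that a Kubernetes or script-based launcher would implement."""

    def launch(self, job_id: str, options: dict) -> str: ...


class RecordingLauncher:
    """Test double: records calls instead of contacting a cluster."""

    def __init__(self) -> None:
        self.calls = []

    def launch(self, job_id: str, options: dict) -> str:
        self.calls.append((job_id, options))
        return f"app-{job_id}"


def start_job(job_id: str, options: dict, launcher: JobLauncher) -> str:
    """Fill in defaults, then delegate to whichever launcher was injected."""
    if "executor-memory" not in options:
        # Illustrative default only, not the driver's actual value.
        options = {**options, "executor-memory": "2G"}
    return launcher.launch(job_id, options)


if __name__ == "__main__":
    launcher = RecordingLauncher()
    print(start_job("j-1", {}, launcher))
    print(launcher.calls)
```

With this split, a new job option would be covered by a fast unit test against the recording launcher, and the integration tests would only need to cover the real launchers.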

@soxofaan soxofaan changed the title Tech debt: GpsBatchJobs_start_job Tech debt: GpsBatchJobs._start_job Oct 23, 2024