Unstoppable "running" job #2841

Closed
Tracked by #2824
praiskup opened this issue Aug 8, 2023 · 7 comments · Fixed by #2867


@praiskup
Member

praiskup commented Aug 8, 2023

I even tried "cancel", but the source build still stays in the "running" state:
https://copr.stg.fedoraproject.org/coprs/frostyx/custom-1-TEST1687524244666456862/build/2914596/
https://copr.stg.fedoraproject.org/status/running/

@FrostyX
Member

FrostyX commented Aug 10, 2023

Related frontend log:

2023-08-09 16:31:45,706 [INFO][/usr/share/copr/coprs_frontend/coprs/logic/builds_logic.py:973|builds_logic:update_state_from_dict][backend: 2600:1f18:8ee:ae00:6c8c:e094:1c5b:c2f9] Updating build 2914596 by: {'timeout': 108000, 'frontend_base_url': 'https://copr.stg.fedoraproject.org', 'memory_reqs': None, 'enable_net': True, 'project_owner': 'frostyx', 'project_name': 'custom-1-TEST1687524244666456862', 'project_dirname': 'custom-1-TEST1687524244666456862', 'submitter': 'frostyx', 'ended_on': 1687524906.7894292, 'started_on': 1687524678.1798966, 'submitted_on': None, 'status': 1, 'chroot': 'srpm-builds', 'arch': 'x86_64', 'buildroot_pkgs': None, 'task_id': '2914596', 'build_id': 2914596, 'package_name': 'quick-package', 'package_version': None, 'git_repo': None, 'git_hash': None, 'git_branch': None, 'source_type': 9, 'source_json': '{"script": "#! /bin/sh -x\\n\\nset -e\\n\\ngenerate_specfile()\\n{\\n    test -n \\"$DESTDIR\\" && mkdir -p \\"$DESTDIR\\"\\n\\n    test -n \\"$BUILDDEPS\\" && {\\n        for i in $BUILDDEPS; do\\n            rpm -q $i\\n        done\\n    }\\n\\n    if ${HOOK_PAYLOAD-false}; then\\n        test -f hook_payload\\n        test \\"$(cat hook_payload)\\" = \\"{\\\\\\"a\\\\\\": \\\\\\"b\\\\\\"}\\"\\n    else\\n        ! 
test -f hook_payload\\n    fi\\n\\ncat > \\"${DESTDIR-.}\\"/quick-package.spec <<\\\\EOF\\nName:           quick-package\\nVersion:        0\\nRelease:        0%{?dist}\\nSummary:        dummy package\\nLicense:        GPL\\nURL:            http://example.com/\\n\\n%{!?_pkgdocdir: %global _pkgdocdir %{_docdir}/%{name}-%{version}}\\n\\n%description\\nnothing\\n\\n\\n%install\\nmkdir -p $RPM_BUILD_ROOT/%{_pkgdocdir}\\necho \\"this does nothing\\" > $RPM_BUILD_ROOT/%{_pkgdocdir}/README\\n\\n\\n%files\\n%doc %{_pkgdocdir}/README\\n\\n\\n%changelog\\n* Thu Jun 05 2014 Pavel Raiskup <[email protected]> - 0-1\\n- does nothing!\\nEOF\\n}\\ngenerate_specfile\\n", "chroot": "fedora-rawhide-x86_64", "builddeps": null, "resultdir": null, "repos": ""}', 'pkg_name': 'quick-package', 'pkg_main_version': None, 'pkg_epoch': None, 'pkg_release': None, 'srpm_url': 'https://download.copr-dev.fedorainfracloud.org/results/frostyx/custom-1-TEST1687524244666456862/srpm-builds/02914596/quick-package-0-0.src.rpm', 'uses_devel_repo': None, 'sandbox': 'frostyx/custom-1-TEST1687524244666456862--frostyx', 'results': {'packages': [{'name': 'quick-package', 'epoch': 0, 'version': '0', 'release': '0', 'arch': 'src'}]}, 'appstream': False, 'background': False, 'repos': [], 'destdir': '/var/lib/copr/public_html/results/frostyx/custom-1-TEST1687524244666456862', 'results_repo_url': 'https://download.copr-dev.fedorainfracloud.org/results/frostyx/custom-1-TEST1687524244666456862', 'result_dir': '02914596', 'built_packages': '', 'tags': ['arch_x86_64'], 'pkg_version': '0-0', 'id': 2914596, 'mockchain_macros': {'copr_username': 'frostyx', 'copr_projectname': 'custom-1-TEST1687524244666456862', 'vendor': 'Fedora Project COPR (frostyx/custom-1-TEST1687524244666456862)'}}
2023-08-09 16:31:45,732 [ERROR][/usr/share/copr/coprs_frontend/coprs/error_handlers.py:102|error_handlers:_log_admin_only_exception][backend: 2600:1f18:8ee:ae00:6c8c:e094:1c5b:c2f9] Admin-only exception
Request: POST https://copr.stg.fedoraproject.org/backend/update/
User: None
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/copr/coprs_frontend/coprs/views/misc.py", line 231, in decorated_function
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/share/copr/coprs_frontend/coprs/views/backend_ns/backend_general.py", line 416, in update
    logic_cls.update_state_from_dict(obj, to_update[i])
  File "/usr/share/copr/coprs_frontend/coprs/logic/builds_logic.py", line 1037, in update_state_from_dict
    exclusivearch = upd_dict["results"]["exclusivearch"]
                    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'exclusivearch'
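
The traceback above comes from indexing `upd_dict["results"]["exclusivearch"]` directly, while the logged payload only carries `'results': {'packages': [...]}`. A minimal sketch of a tolerant lookup follows. The helper name `extract_exclusivearch` and the sample `["x86_64"]` value are hypothetical, and this is not necessarily how the frontend handles it:

```python
def extract_exclusivearch(upd_dict):
    """Return the reported 'exclusivearch' value, or None when it is absent.

    Hypothetical helper: tolerates payloads from builders that do not
    report 'exclusivearch' (or do not report 'results' at all).
    """
    results = upd_dict.get("results") or {}
    return results.get("exclusivearch")


# Payload shaped like the one in the log above (trimmed to the relevant key).
old_builder_payload = {
    "results": {
        "packages": [
            {"name": "quick-package", "epoch": 0, "version": "0",
             "release": "0", "arch": "src"},
        ]
    }
}

# Hypothetical payload from a builder that does report the key.
new_builder_payload = {"results": {"exclusivearch": ["x86_64"]}}
```

With a lookup like this, a missing key yields `None` instead of a `KeyError`, so the caller decides what to do rather than the request failing.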

@FrostyX
Member

FrostyX commented Aug 10, 2023

So it seems to me that I incorrectly specified the minimal copr-rpmbuild version in PR #2769.

MIN_BUILDER_VERSION = "0.68.dev"

Could it be that this broken build used an old copr-rpmbuild together with the new frontend?

@praiskup
Member Author

praiskup commented Aug 11, 2023

Maybe? Not sure. It seems there is still some issue even now, because the beaker test for the custom method created more unstoppable builds:

| Running | Project | Build | Package Name | Package Version | Chroot |
|---|---|---|---|---|---|
| 23 hours | praiskup/custom-1-TEST1691652763585729547 | 2914821 | quick-package | - | Source build |
| a day | praiskup/custom-1-TEST1691598245047470908 | 2914800 | quick-package | - | Source build |
| a month | frostyx/custom-1-TEST1687524244666456862 | 2914596 | quick-package | - | Source build |

@praiskup praiskup assigned praiskup and unassigned FrostyX Aug 11, 2023
@praiskup
Member Author

Taking this for a while (per Slack chat).

@praiskup
Member Author

praiskup commented Aug 11, 2023

It seems the build worker is still alive and keeps trying to communicate with the frontend:

User: None 
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/copr/coprs_frontend/coprs/views/misc.py", line 231, in decorated_function
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/share/copr/coprs_frontend/coprs/views/backend_ns/backend_general.py", line 416, in update
    logic_cls.update_state_from_dict(obj, to_update[i])
  File "/usr/share/copr/coprs_frontend/coprs/logic/builds_logic.py", line 1037, in update_state_from_dict
    exclusivearch = upd_dict["results"]["exclusivearch"]
                    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 'exclusivearch'
2023-08-11 10:35:46,063 [ERROR][/usr/share/copr/coprs_frontend/coprs/error_handlers.py:42|error_handlers:handle_error][backend: 2600:1f18:8ee:ae00:6c8c:e094:1c5b:c2f9] Response error: 500 Request wasn't successful, there is probably a bug in the Copr code.

praiskup added a commit to praiskup/copr that referenced this issue Aug 11, 2023
This actually causes error 500 upon "update" request from copr backend
background worker, eventually leading to never-stopped builder process.

Relates: fedora-copr#2841
@praiskup praiskup assigned FrostyX and unassigned praiskup Aug 11, 2023
@praiskup
Member Author

Handing this back. There's this problem:

[2023-08-11 10:19:53,602][  INFO][PID:2580356][/usr/bin/copr-backend-process-build.managed.pid-2580356][background_worker_build.py:_get_srpm_build_details:631] Retrieving SRPM info from /var/lib/copr/public_html/results/praiskup/custom-1-TEST1691748849870238588/srpm-builds/02914960
[2023-08-11 10:19:53,603][ ERROR][PID:2580356][/usr/bin/copr-backend-process-build.managed.pid-2580356][background_worker_build.py:_get_build_details:681] Can't collect build results for 2914960
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/copr_backend/background_worker_build.py", line 674, in _get_build_details
    build_details = self._get_srpm_build_details(job)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/copr_backend/background_worker_build.py", line 637, in _get_srpm_build_details
    build_details["pkg_name"] = results["name"]
                                ~~~~~~~^^^^^^^^
KeyError: 'name'

The custom method produces a results.json in a different format than the other methods:
https://download.copr-dev.fedorainfracloud.org/results/praiskup/custom-1-TEST1691751431073416528/srpm-builds/02914961/builder-live.log.gz

Running RPMResults tool
Package info:
{
    "packages": [
        {
            "name": "quick-package",
            "epoch": null,
            "version": "0",
            "release": "0",
            "arch": "src"
        }
    ]
}
RPMResults finished

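Note the RPMResults vs. SRPMResults difference: the builder ran the RPMResults tool for a source build, so the package data is nested under "packages" and the backend finds no top-level "name" key.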
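
To make the mismatch concrete, here is a minimal sketch of a reader that accepts both shapes. It assumes the SRPM-style format has a top-level `"name"` key (which is what `results["name"]` in the backend traceback expects) and that the RPM-style format is the `"packages"` list shown in the log. The function name `srpm_name` is hypothetical, and a tolerant reader is only one possible way to handle this, not necessarily what the project does:

```python
def srpm_name(results):
    """Return the source package name from either results.json shape.

    Hypothetical helper. Flat shape (assumed): {"name": "..."}.
    Nested shape (as in the log): {"packages": [{"name": "...", "arch": "src"}]}.
    """
    if "name" in results:
        return results["name"]
    for package in results.get("packages", []):
        if package.get("arch") == "src":
            return package["name"]
    # Neither shape matched; fail the same way a plain lookup would.
    raise KeyError("name")


# Nested shape, copied from the builder-live.log excerpt above.
nested = {
    "packages": [
        {"name": "quick-package", "epoch": None, "version": "0",
         "release": "0", "arch": "src"},
    ]
}

# Flat shape; only the "name" key is known from the traceback.
flat = {"name": "quick-package"}
```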

@praiskup
Member Author

> KeyError: 'exclusivearch'

This is embarrassing, you wrote this 12 hours ago. Sorry I missed that :( Anyway, the Custom method seems to be broken now.

FrostyX added a commit to FrostyX/copr that referenced this issue Aug 11, 2023
Fix fedora-copr#2841

The previous check worked for all SRPM methods except for custom
method. It uses `self.chroot` to specify "what chroot to run the
script in". Checking `source_type` should be more reliable.
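
A rough sketch of the idea in this commit message, under stated assumptions: the `Task` class, both function names, and the exact conditions are hypothetical, not the actual copr-rpmbuild code. It assumes the previous check treated "no chroot" as "source build", and that a task with `source_type` set is a source build. The value `9` and the chroot `fedora-rawhide-x86_64` come from the frontend log in this thread; `2` is a made-up value for some other SRPM method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Hypothetical, minimal stand-in for a build task."""
    chroot: Optional[str]
    source_type: Optional[int]


def is_source_build_by_chroot(task):
    # Assumed old heuristic: a source build has no chroot.
    return task.chroot is None


def is_source_build_by_source_type(task):
    # Assumed new heuristic: a source build always has a source_type.
    return task.source_type is not None


# The custom method sets a chroot to say where the script should run.
custom_srpm = Task(chroot="fedora-rawhide-x86_64", source_type=9)
# Some other SRPM method (hypothetical source_type value).
other_srpm = Task(chroot=None, source_type=2)
# A binary RPM build (assumed to carry no source_type).
rpm_build = Task(chroot="fedora-rawhide-x86_64", source_type=None)
```

Under these assumptions the chroot-based check misclassifies only the custom-method source build, which would match the symptom of the RPMResults tool running for a source build.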
praiskup pushed a commit that referenced this issue Aug 14, 2023
Fix #2841

The previous check worked for all SRPM methods except for custom
method. It uses `self.chroot` to specify "what chroot to run the
script in". Checking `source_type` should be more reliable.
praiskup added a commit that referenced this issue Aug 14, 2023
This actually causes error 500 upon "update" request from copr backend
background worker, eventually leading to never-stopped builder process.

Relates: #2841