Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cocoapods support to package.py #119

Merged
merged 13 commits into from
Sep 19, 2024
Merged

Conversation

johnmhoran
Copy link
Member

Reference: #116

@pombredanne @TG1999 @keshav-space Please review when you have a chance. Meanwhile, I'll start work on the tests.

Reference: #116

Signed-off-by: John M. Horan <[email protected]>
@johnmhoran
Copy link
Member Author

FYI I'm refactoring the three primary cocoapods functions, beginning to pull out some code blocks into separate (and testable) functions, and creating tests. One so far. One question: I could break out and create half a dozen tests (and will try to do so) -- but where do we draw the line? ;-)

filemode="w",
)
input_purl = purl
purl = PackageURL.from_string(purl)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this in try/except block, given input may not be a valid PURL

Copy link
Member Author

@johnmhoran johnmhoran Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @TG1999 . I'm in the midst of refactoring but will add this to the updated code. One note: there are nearly a dozen other uses of that same syntax by other supported PURL types in package.py and none uses a try/except (but perhaps should?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TG1999 On second thought, purldb-toolkit's purlcli.py already handles invalid PURL inputs by checking the validate endpoint (including for the metadata command, which is the command that calls the fetchcode package.py info() function) and prints a warning to the output JSON warnings list, so I don't think a try/except is needed in the package.py cocoapods code. E.g.,

(venv) Mon Apr 29, 2024 12:25 PM  /home/jmh/dev/nexb/purldb jmh (365-update-cocoapods-pypi-support)
$ python -m purldb_toolkit.purlcli metadata --purl pkg:cocoapods/# --output -
{
    "headers": [
        {
            "tool_name": "purlcli",
            "tool_version": "0.2.0",
            "options": {
                "command": "metadata",
                "--purl": [
                    "pkg:cocoapods/#"
                ],
                "--file": null,
                "--output": "<stdout>"
            },
            "purls": [
                "pkg:cocoapods/#"
            ],
            "errors": [],
            "warnings": [
                "'pkg:cocoapods/#' not valid"
            ]
        }
    ],
    "packages": []
}
(venv) Mon Apr 29, 2024 12:29 PM  /home/jmh/dev/nexb/purldb jmh (365-update-cocoapods-pypi-support)
$

src/fetchcode/package.py Outdated Show resolved Hide resolved
@@ -362,6 +376,290 @@ def get_gnu_data_from_purl(purl):
)


@router.route("pkg:cocoapods/.*")
def get_cocoapods_data_from_purl(purl):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnmhoran after refactoring get_cocoapods_data_from_purl into multiple functions, please put those functions in package_util.py and only keep the top-level get_cocoapods_data_from_purl function in package.py file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keshav-space -- I was wondering about that, given how the other existing, relatively short @router.route() functions in package.py have related functions in both package_util.py and utils.py. I've already added a handful of utilities to utils.py for cocoapods support (siblings of existing utilities, but these do not throw exceptions because that stops the purlcli metadata command, which we don't want to do) and will do as you suggest with the now 4 additional functions for cocoapods created by my almost-finished refactoring. And then I have 3 or 4 mock tests to create.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keshav-space Moving these related functions to package_util.py raises one question: in order to facilitate the collection and sharing of cocoapods data from a number of different sources, I've created a dictionary at the top of package.py which all functions can access. When I move some functions to package_util.py, will continued access be as simple as importing that dictionary from package.py into package_util.py? That's my plan atm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keshav-space I am having trouble importing and accessing in package_util.py the logger I've defined and use widely in my package.py code. I'll dig into this soon, but meanwhile, do you have any guidance on how to share a logging function -- this prints to screen and to the "errors"/"warnings" keys in the JSON output. I now import in package_util.py with from fetchcode.package import logger but get this error running metadata:

(venv) Wed May 01, 2024 08:33 AM  /home/jmh/dev/nexb/purldb jmh (365-update-cocoapods-pypi-support)
$ python -m purldb_toolkit.purlcli metadata --purl pkg:cocoapods/[email protected] --output -
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/jmh/dev/nexb/purldb/purldb-toolkit/src/purldb_toolkit/purlcli.py", line 19, in <module>
    from fetchcode.package import info
  File "/home/jmh/dev/nexb/fetchcode/src/fetchcode/package.py", line 32, in <module>
    from fetchcode.package_util import GITHUB_SOURCE_BY_PACKAGE
  File "/home/jmh/dev/nexb/fetchcode/src/fetchcode/package_util.py", line 25, in <module>
    from fetchcode.package import logger
ImportError: cannot import name 'logger' from partially initialized module 'fetchcode.package' (most likely due to a circular import) (/home/jmh/dev/nexb/fetchcode/src/fetchcode/package.py)

(venv) Wed May 01, 2024 08:48 AM  /home/jmh/dev/nexb/purldb jmh (365-update-cocoapods-pypi-support)
$

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnmhoran please don't share the same logger across different files. Define a new logger for package_util.py and avoid any circular dependencies i.e. don't import anything from package.py in package_util.py. The error above is due to a circular dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keshav-space . I've defined the logger in each of package.py and package_util.py (configured in get_cocoapods_data_from_purl()), and have defined the pod_summary dictionary in package_util.py and import it into package.py (pod_summary is shared among functions in both files), and everything seems to still work as desired. 👍

@johnmhoran
Copy link
Member Author

@keshav-space @TG1999 I've committed and pushed my refactored work including mock tests for 4 of the 5 main cocoapods functions (those 4 now reside in package_util.py) -- all of this is ready for review when you have time.

Meanwhile I'll be troubleshooting my mocking and related code in the test for the 5th function (get_cocoapods_data_from_purl() in package.py) -- initial print testing has uncovered some "actual_results" data that could only have come from a live network call bc not in my mocked objects.

@johnmhoran johnmhoran requested review from TG1999 and keshav-space May 8, 2024 01:36
Copy link
Member

@keshav-space keshav-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @johnmhoran, few nitpicks for your consideration.

src/fetchcode/package.py Outdated Show resolved Hide resolved
src/fetchcode/package.py Outdated Show resolved Hide resolved
src/fetchcode/package.py Outdated Show resolved Hide resolved
src/fetchcode/package.py Outdated Show resolved Hide resolved
src/fetchcode/package.py Outdated Show resolved Hide resolved
src/fetchcode/package_util.py Outdated Show resolved Hide resolved
Comment on lines 1000 to 1001
if source.get("tag") and source.get("tag").startswith("v"):
corrected_tag = source.get("tag")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnmhoran why do we need to use source.get("tag") only when it has leading v?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. In my recent exploration of the cocoapods data ecosystem, I've noticed that when, in the podspec.json file for a cocoapod, there is a tag value nested in the source value and the tag value has a "v" prefix, the download_url has a "v" prefix in the path segment containing the version being downloaded -- without the "v" there, the download_url is wrong. And otherwise, no "v" for one means no "v" for the other. So my code adds a "v" to the download_url when needed. No idea if there are exceptions to this practice in the cocoapods data, but the spec permits both (and a number of other approaches as well). See, e.g.,

And a few examples of some of the variations I've encountered recently:

Several that use "v":

input_purl: pkg:cocoapods/[email protected]
corrected_tag = v2.2
homepage_url = https://github.com/bigfish24/ABFRealmMapView 
download_url: https://github.com/bigfish24/ABFRealmMapView/archive/refs/tags/v2.2.tar.gz 
api_url: https://raw.githubusercontent.com/CocoaPods/Specs/master/Specs/5/d/e/ABFRealmMapView/2.2/ABFRealmMapView.podspec.json 

  "source": {
    "git": "https://github.com/bigfish24/ABFRealmMapView.git",
    "tag": "v2.2"
  },

input_purl: pkg:cocoapods/[email protected]
corrected_tag = v4.1.0
homepage_url = https://github.com/danielgindi/Charts 
download_url: https://github.com/danielgindi/Charts/archive/refs/tags/v4.1.0.tar.gz 
api_url: https://raw.githubusercontent.com/CocoaPods/Specs/master/Specs/5/1/e/Charts/4.1.0/Charts.podspec.json 

  "source": {
    "git": "https://github.com/danielgindi/Charts.git",
    "tag": "v4.1.0"
  },

An example of a cocopoad that does not use the "v":

input_purl: pkg:cocoapods/[email protected]
homepage_url: https://github.com/zendesk/zendesk_sdk_ios 
download_url: https://github.com/zendesk/zendesk_sdk_ios/archive/refs/tags/4.0.1.tar.gz 
api_url: https://raw.githubusercontent.com/CocoaPods/Specs/master/Specs/6/5/9/ZendeskSDK/4.0.1/ZendeskSDK.podspec.json 

  "source": {
    "git": "https://github.com/zendesk/zendesk_sdk_ios.git",
    "tag": "4.0.1"
  },

An example that uses "http" rather than "git" and "tag" (or some other combo) as the nested "source" value:

input_purl: pkg:cocoapods/[email protected]
homepage_url: https://github.com/clemensg/sqlite3pod 
download_url: https://www.sqlite.org/2024/sqlite-src-3450100.zip 
api_url: https://raw.githubusercontent.com/CocoaPods/Specs/master/Specs/d/c/2/sqlite3/3.45.1/sqlite3.podspec.json 

  "source": {
    "http": "https://www.sqlite.org/2024/sqlite-src-3450100.zip"
  },

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Thanks for the explanation, @johnmhoran.

In that case, since we are accessing the GitHub tag three times, it's better to store it in a variable.

Suggested change
if source.get("tag") and source.get("tag").startswith("v"):
corrected_tag = source.get("tag")
github_tag = source.get("tag")
if github_tag and github_tag.startswith("v"):
corrected_tag = github_tag

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @keshav-space -- thanks and done.

Comment on lines 183 to 202
def get_github_rest_no_exception(url):
headers = None
gh_token = get_github_token()
if gh_token:
headers = {
"Authorization": f"Bearer {gh_token}",
}

return get_json_response(url, headers)


def get_json_response(url, headers=None):
"""
Generate `Package` object for a `url` string
"""
resp = requests.get(url, headers=headers)
if resp.status_code == 200:
return resp.json()

return f"Failed to fetch: {url}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have these function available here https://github.com/nexB/fetchcode/blob/ff0bad32881464afbfa6f70a65909ed0170e94e5/src/fetchcode/utils.py#L157-L176

Suggested change
def get_github_rest_no_exception(url):
headers = None
gh_token = get_github_token()
if gh_token:
headers = {
"Authorization": f"Bearer {gh_token}",
}
return get_json_response(url, headers)
def get_json_response(url, headers=None):
"""
Generate `Package` object for a `url` string
"""
resp = requests.get(url, headers=headers)
if resp.status_code == 200:
return resp.json()
return f"Failed to fetch: {url}"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keshav-space . I'd added these new helper functions when I decided to use logging and messaging in lieu of raising exceptions/letting errors bubble up. I've now removed all of that logging and related code and testing including these helper functions.

Comment on lines +258 to +264
# for FIPS support
sys_v0 = sys.version_info[0]
sys_v1 = sys.version_info[1]
if sys_v0 == 3 and sys_v1 >= 9:
md5_hasher = partial(hashlib.md5, usedforsecurity=False)
else:
md5_hasher = hashlib.md5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this even needed? I don't think we're using FIPS enabled system.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keshav-space . I copied this over from https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/cocoapods.py#L89-L118 at the start of the project. Don't know whether this is needed or not. Shall I remove this FIPS block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pombredanne For the time being I'm leaving the get_hashed_path() function and related code from packagedcode/cocoapods.py as is in fetchcode/utils.py, though I think you'd mentioned a better way of handling this (the hash is used to get the path to a pod's podspec).

@keshav-space had raised the question of whether we need the FIPs block above. Is this needed?

Comment on lines 194 to 210
'Date': 'Thu, 02 May 2024 06:02:10 GMT',
'Content-Type': 'text/html;charset=utf-8',
'Connection': 'keep-alive',
'Report-To': '{"group":"heroku-nel","max_age":3600,"endpoints":[{"url":"https://nel.heroku.com/reports?ts=1714629728&sid=c46efe9b-d3d2-4a0c-8c76-bfafa16c5add&s=rPI0KHQY0J7GvkjgHpmcuMxWDuTga0k8UEFRezWRyrU%3D"}]}',
'Reporting-Endpoints': 'heroku-nel=https://nel.heroku.com/reports?ts=1714629728&sid=c46efe9b-d3d2-4a0c-8c76-bfafa16c5add&s=rPI0KHQY0J7GvkjgHpmcuMxWDuTga0k8UEFRezWRyrU%3D',
'Nel': '{"report_to":"heroku-nel","max_age":3600,"success_fraction":0.005,"failure_fraction":0.05,"response_headers":["Via"]}',
'Cache-Control': 'public, max-age=20, s-maxage=60',
'Location': 'https://github.com/juxingzhutou/BSSimpleHTTPNetworking',
'X-Xss-Protection': '1; mode=block',
'X-Content-Type-Options': 'nosniff',
'X-Frame-Options': 'SAMEORIGIN',
'Via': '1.1 vegur',
'CF-Cache-Status': 'HIT',
'Age': '2',
'Vary': 'Accept-Encoding',
'Server': 'cloudflare',
'CF-RAY': '87d5cd05bd2cf973-SJC',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to mock unused header information

Suggested change
'Date': 'Thu, 02 May 2024 06:02:10 GMT',
'Content-Type': 'text/html;charset=utf-8',
'Connection': 'keep-alive',
'Report-To': '{"group":"heroku-nel","max_age":3600,"endpoints":[{"url":"https://nel.heroku.com/reports?ts=1714629728&sid=c46efe9b-d3d2-4a0c-8c76-bfafa16c5add&s=rPI0KHQY0J7GvkjgHpmcuMxWDuTga0k8UEFRezWRyrU%3D"}]}',
'Reporting-Endpoints': 'heroku-nel=https://nel.heroku.com/reports?ts=1714629728&sid=c46efe9b-d3d2-4a0c-8c76-bfafa16c5add&s=rPI0KHQY0J7GvkjgHpmcuMxWDuTga0k8UEFRezWRyrU%3D',
'Nel': '{"report_to":"heroku-nel","max_age":3600,"success_fraction":0.005,"failure_fraction":0.05,"response_headers":["Via"]}',
'Cache-Control': 'public, max-age=20, s-maxage=60',
'Location': 'https://github.com/juxingzhutou/BSSimpleHTTPNetworking',
'X-Xss-Protection': '1; mode=block',
'X-Content-Type-Options': 'nosniff',
'X-Frame-Options': 'SAMEORIGIN',
'Via': '1.1 vegur',
'CF-Cache-Status': 'HIT',
'Age': '2',
'Vary': 'Accept-Encoding',
'Server': 'cloudflare',
'CF-RAY': '87d5cd05bd2cf973-SJC',
'Date': 'Thu, 02 May 2024 06:02:10 GMT',
'Content-Type': 'text/html;charset=utf-8',
'Connection': 'keep-alive',
'Location': 'https://github.com/juxingzhutou/BSSimpleHTTPNetworking',

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @keshav-space . This passage was part of the now-deleted test for get_cocoapods_org_url_status(), one of several functions I've removed in my latest refactoring.

Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Here is some feedback .... there are some changes needed wrt. logging that we discussed in other PRs. Also the management of errors could be entirely simplified: errors are best left alone and not caught in most cases when in a library.

setup.cfg Outdated
@@ -59,6 +59,7 @@ install_requires =
requests
python-dateutil
python-dotenv
univers == 30.11.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please never pin the version here. Only express what is your minimal required version. The pin goes in the requirements file.

Suggested change
univers == 30.11.0
univers >= 30.11.0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad you caught that -- thanks and fixed.

purl_to_cocoapods_org_url_status = get_cocoapods_org_url_status(purl, name, cocoapods_org_url)
cocoa_org_url_status = purl_to_cocoapods_org_url_status["return_message"]

status_values = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have all these statuses in other package implementations?
I am not sure we need to track all these. Having errors and exceptions bubble up is fine here, especially in a low level library. If we need to do more refined handling of errors when we fetch, this should not be something special to cocoapods IMHO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These have been removed along with all related logging, messaging etc.

"""
Generate `Package` object from the `purl` string of cocoapods type
"""
logging.basicConfig(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we want logging in the library, but I am sure we do not want logging to be enabled and configured here at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming that my added logging has been removed.

name = purl.name
version = purl.version
cocoapods_org_url = f"https://cocoapods.org/pods/{name}"
repository_homepage_url = f"https://cocoapods.org/pods/{name}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why track two variables with the exact same value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deleted.

@johnmhoran
Copy link
Member Author

Thanks for your feedback @pombredanne . I'd begun updating the fetchcode cocoapods code along the lines we discussed earlier this week when the cocoapods.org site went down 2 days ago, preventing me from running the fetchcode cocoapods-related code as I made changes.

Meanwhile I began to refactor the purlcli.py and related code/files, a somewhat complex process given the broad use of logging/messaging up to now. Although cocoapods.org was back up and available when I checked this morning, it will be simpler/more efficient for me to finish my purlcli/purldb-toolkit refactoring first before returning to fetchcode.

Reference: #116

Signed-off-by: John M. Horan <[email protected]>
@johnmhoran
Copy link
Member Author

@pombredanne @keshav-space I've finished and pushed my fetchcode changes focused on removing the logging and messaging and otherwise updating the cocoapods-specific code and tests.

I was going to ask you to take a look when schedules permit, but several GH checks have failed and it seems at least one test has failed (def test_get_cocoapods_data_from_purl()). All relevant tests passed locally just before I committed and pushed, so a puzzle atm. Will take a look and report back here when I know more.

One note: this work does not include changes to the other supported PURL types in package.py that do not produce the metadata output @pombredanne wants (metadata for just 1 version if the input PURL has a version, otherwise metadata for each version of the PURL). I could tackle those next (after resolving the failed GH checks) if that makes sense.

@johnmhoran
Copy link
Member Author

The GH failures were triggered by an expected language value of "Swift" and an actual value of null in a test that uses mocking for values that would otherwise be provided with the help of the requests library.

In my local branch, all fetchcode tests pass including def test_get_cocoapods_data_from_purl() (seems to be the only failed test on GH). I experimented by making changes to the language key in two relevant mocked function .json files (from "Swift" to some other value), and this seemed to confirm that the test is sensitive to the language values. And it also confirmed that the test passes locally. I have no idea atm why it fails (for 3 GH checks) up on GH. 🤔 Will rerun the GH tests one more time....

@johnmhoran
Copy link
Member Author

This time just 2 GH checks failed (same failed test) and the other 9 passed. No changes made in the interim.

@johnmhoran
Copy link
Member Author

@pombredanne @keshav-space I've rerun all GH checks once again and this time, all 11 pass. Yesterday 3 failed, I reran and 2 failed, and now all pass -- no changes in the interim to the code, tests, data or anything else. Puzzling.

In any event, the updated fetchcode code and tests are ready for you to review when you can, and please also take a look at my recently-updated PURLCLI PR with similar changes (removing logging etc.).

Reference: #116
Signed-off-by: John M. Horan <[email protected]>
Reference: #116
Signed-off-by: John M. Horan <[email protected]>
- We now have a check_package function that can
  load a file and, in the future, regenerate the
  test file(s).

Reference: #116
Signed-off-by: John M. Horan <[email protected]>
Reference: #116
Signed-off-by: John M. Horan <[email protected]>
@johnmhoran
Copy link
Member Author

@pombredanne @keshav-space Like the failing tests I commented on last week, once again all tests pass locally but 1 or 2 GitHub checks fail and the error is always the test_get_cocoapods_data_from_purl test, and the same supposed failure:

  • we expect "primary_language": "Swift" but GH generates "primary_language": null.

This is bizarre because the various input values are mocked and thus not changing from local to GitHub. Last week after I reran the failing tests over several days, they all mysteriously passed. I'm rerunning today as well but so far I keep getting the same failure. The cause remains a mystery.

OK, many minutes have passed since I started writing this comment, during which I kept rerunning and rerunning the failed checks, and now, all of them pass. Is there some way for me to avoid this going forward? Please share ideas if you have any. 🙏 ;-)

Comment on lines 116 to 140
def test_get_cocoapods_data_from_purl(
mock_get_hashed_path,
mock_get_cocoapod_tags,
mock_get_github_rest,
mock_get_response,
):
mock_get_hashed_path.return_value = "5/5/b"
mock_get_cocoapod_tags.return_value = [
'0.1.5',
'0.1.4',
'0.1.3',
'0.1.2',
'0.1.1',
'0.1.0',
]
mock_get_github_rest.return_value = load_json("tests/data/cocoapods/mock_get_github_rest_return_value.json")
mock_get_response.side_effect = file_json("tests/data/cocoapods/mock_get_response_side_effect.json")
expected_result_to_dict = file_json("tests/data/cocoapods/expected_result_to_dict.json")
purl = "pkg:cocoapods/ASNetworking"
actual_result = get_cocoapods_data_from_purl(purl)

for pkg, expected_pkg_to_dict in zip(list(actual_result), expected_result_to_dict):
pkg_to_json = json.dumps(pkg.to_dict())
expected_pkg_to_dict_json_dumps = json.dumps(expected_pkg_to_dict)
assert pkg_to_json == expected_pkg_to_dict_json_dumps
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pombredanne @keshav-space Like the failing tests I commented on last week, once again all tests pass locally but 1 or 2 GitHub checks fail and the error is always the test_get_cocoapods_data_from_purl test, and the same supposed failure:

  • we expect "primary_language": "Swift" but GH generates "primary_language": null.

This is bizarre because the various input values are mocked and thus not changing from local to GitHub. Last week after I reran the failing tests over several days, they all mysteriously passed. I'm rerunning today as well but so far I keep getting the same failure. The cause remains a mystery.

OK, many minutes have passed since I started writing this comment, during which I kept rerunning and rerunning the failed checks, and now, all of them pass. Is there some way for me to avoid this going forward? Please share ideas if you have any. 🙏 ;-)

@johnmhoran This is the test which is failing, get_cocoapods_data_from_purl depends on make_head_request. It seems like you forgot to mock the make_head_request function. Since make_head_request is not mocked, it will perform live network calls, and if the network call fails for some reason, your test will also fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @keshav-space . I see that my primary function in this test calls construct_cocoapods_package(), which contains a head request utils.make_head_request(api_url) that I overlooked here and in a related test test_construct_cocoapods_package(). Thanks for your eagle eye.

- Adjusted data output for bitbucket, cargo, npm, pypi and rubygems
  types to return metadata (1) for all versions when the input PURL has
  no version and (2) for just the specified version when the input PURL
  has a version.
Signed-off-by: John M. Horan <[email protected]>
@johnmhoran
Copy link
Member Author

Still waiting for the last 2 GitHub checks to run.

@johnmhoran
Copy link
Member Author

@pombredanne @keshav-space For some reason the 2 remaining GitHub checks have not yet completed. I've refreshed the web page several times hoping that everything had actually finished but no such luck. I'll check from time to time.

Should these remaining GH checks succeed, my fetchcode updates will be ready for your review. This PR covers inter alia the addition of cocoapods support in package.py and the refinement of the data output for the already-supported bitbucket, cargo, npm, pypi and rubygems PURL types.

@johnmhoran
Copy link
Member Author

I "neutral" check (?) and 1 failed ##[error]The request: 269860 was abandoned due to an infrastructure failure. Notification of assignment to an agent was never received. CrowdStrike? Rerunning all 11 checks. Interesting that all purldb checks passed for that PR. (aboutcode-org/purldb#436)

Copy link
Member

@keshav-space keshav-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @johnmhoran LGTM!

@keshav-space keshav-space merged commit a9ad33c into master Sep 19, 2024
11 checks passed
@keshav-space keshav-space deleted the 116-cocoapods-pypi-support branch September 19, 2024 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants