Create data directory hierarchy if not present #201
Comments
Hmm, after messing around a bit more I'm starting to feel like I'm missing an important step between |
Re-opening, didn't mean to close the issue. |
@gregoryfoster I filed a quick PR to fix the issue you identified: #202 However, the |
Thanks, @konklone, for the quick fix. It does take care of creating the I'm a little puzzled and honestly a little distressed to hear that the |
No no no, I re-wrote the bills task last year to convert the new official bill XML (from fdsys) into the existing JSON data format. Since GovTrack relies on the JSON format and I don't have the capacity to re-write GovTrack's importer to use the fdsys XML directly, I'm still invested in keeping the bills task running. |
The mkdir issue probably stemmed from my rewrite last year, btw. Sorry about breaking it on clean directories (which I never test on). |
Whew, glad to hear, @JoshData! Returning to the original edge case of an absent and now clean Let me know if you want me to open a separate issue. And if you can sketch an outline of what needs to be done, I'd be happy to contribute a PR. |
Apologies for confusing the issue! And I can verify what @gregoryfoster says -- #202 fixes the errors, but it still doesn't cause the |
Hello. This issue and #202 seem to be related to my error, even though this issue is over 2 years old and still open. #202 says "This fixes #201 by using mkdir_p as necessary when examining data paths on disk.", but gives no specific directions on how or where that fix should be applied. After reading the last 2 comments here, I have to ask: is this scraper still being maintained? If so, where can I find directions on how to fix this issue? Thanks. |
Hi. At GovTrack we use this project extensively. Unfortunately, we don't have the resources to fix problems that we're not experiencing ourselves. This repository was created at a time when multiple well-funded organizations (besides us) were investing in creating a shared data ecosystem for legislative data, but now some of those organizations effectively don't exist anymore. |
Thanks for your quick response. I started a project several years ago with GovTrack (GT) bulk data. When I came back to it last year, the GT data was no longer online. I found parts of it on ProPublica and elsewhere, but some parts I can’t find, like the set of Amendments. I will spend some time over the next few days trying to figure this scraper out. If it can produce what I’m looking for, I will post the fix. I might even try to fork it to Python 3, since Python 2 is due to be obsolete next year. |
@jox58 Can I ask specifically which scraper you're running that it doesn't create a data directory? I ask because I cloned the repository into a new directory and ran |
You are right. My mistake for not reading the instructions carefully. I did a |
Hello, and thank you for sharing and maintaining such a valuable project. I'm just getting started by way of `legis-graph` and intend to become a frequent user and hopefully a helpful contributor.

I've set up a fresh installation and Python 2.7 virtual environment. As a heads up for potential future `congress` users, I ran into an SSL handshake issue sourced to `scrapelib` which prevents execution of the `fdsys` task (and likely others). That issue and workaround are detailed here.

Currently, I'm attempting to `./run bills --congress=115` and the task fails because there is no `data` hierarchy in the filesystem yet. Even after `mkdir -p data/115`, a subsequent `os.listdir` call will fail because there are no bill types. This is easy enough to work around with some knowledge of the expected hierarchy, but it seems like something we could also easily fix.

I see there's a `mkdir_p` function in `utils.py` we could reuse - is there a good central place in the codebase to anticipate this edge case? I'd be happy to put together a pull request with a little guidance.

Thanks again for this very useful project!
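
For illustration, here is a minimal sketch of the kind of guard described above: create the directory hierarchy before listing it, so a clean checkout yields an empty list rather than an `OSError`. This is not the project's actual code. The `mkdir_p` below is a stand-in written for this example (the real helper in `utils.py` may differ), `list_bill_types` is a hypothetical name, and the `data/<congress>/bills` layout is an assumption.

```python
import errno
import os


def mkdir_p(path):
    """Create path and any missing parents; do nothing if it already exists.

    Stand-in for the project's own helper, whose signature may differ.
    """
    try:
        os.makedirs(path)
    except OSError as exc:
        # Swallow the error only when the directory is already there.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise


def list_bill_types(data_root, congress):
    """Return the bill-type directory names found for a congress.

    Hypothetical helper; assumes a data/<congress>/bills layout.
    The hierarchy is created first, so os.listdir cannot fail on a
    clean checkout.
    """
    bills_dir = os.path.join(data_root, str(congress), "bills")
    mkdir_p(bills_dir)
    return sorted(
        name
        for name in os.listdir(bills_dir)
        if os.path.isdir(os.path.join(bills_dir, name))
    )
```

With this guard, calling `list_bill_types("data", 115)` on a fresh clone would return an empty list and leave `data/115/bills` in place for later steps. The `try`/`except` form is used instead of `os.makedirs(path, exist_ok=True)` because that keyword is not available in Python 2.7.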