Migrate all issues from Google Code to github [rt.cpan.org #104912] #99
Comments
Looking into this.
I've scraped and de-wayback-machined the issues from the 2015 Wayback Machine link (using WWW::Mechanize and WWW::WebArchive::WaybackMachine to clean the HTML). I'll get the individual issues parsed and extracted soon.

How would you like the tickets? Should I just bulk import them with appropriate labels, or provide a spreadsheet for triaging?

Snapshot of my code in progress:

#!/usr/bin/env perl
use strict;
use warnings;
use WWW::Mechanize;    # loaded explicitly, since we construct our own Mech below
use WWW::WebArchive::WaybackMachine;

my $url = "http://code.google.com/p/www-mechanize/issues/list";

# Download specific older wayback urls, since the most
# current versions 301 redirect to an empty page.
my @wayback_urls = (
    "https://web.archive.org/web/20150227111857/$url",
    "https://web.archive.org/web/20130510011416/http://code.google.com/p/www-mechanize/issues/list?num=100&start=100"
);

my $wayback = WWW::WebArchive::WaybackMachine->new(
    url     => $url,
    verbose => 1,
);

# make our own Mech, to use with WWW::WA::WM methods
my $ua = WWW::Mechanize->new();
$ua->agent_alias('Windows IE 6');
$ua->stack_depth(1);    # keep only one page of history to limit memory use

my $file = $url;
$file =~ s,https?://,,;
my $dir    = './';
my $domain = 'code.google.com';

for my $wayback_url (@wayback_urls) {
    $ua->get($wayback_url);

    # Issue links on the list page: the link text is the numeric issue id
    # and the href points at the issue detail page.
    my @links = $ua->find_all_links(
        tag        => "a",
        text_regex => qr/^\d+$/,
        url_regex  => qr/issues.detail/,
    );

    foreach my $link (@links) {
        printf("text:%s url:%s\n", $link->text, $link->url);
        $wayback->mirror($ua, $link->url, $link->text, $dir, $domain);
    }
}
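For readers wondering what "de-wayback-machining" a link involves, here is a minimal sketch of one part of it: recovering the original URL from an archived one. The helper name `strip_wayback_prefix` is hypothetical and is not part of WWW::WebArchive::WaybackMachine; the regex assumes the usual `web.archive.org/web/<timestamp>/` prefix, optionally with a two-letter modifier such as `id_`.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::WebArchive::WaybackMachine):
# strip the "https://web.archive.org/web/<timestamp>/" prefix from a URL
# and return the originally archived URL. URLs without that prefix are
# returned unchanged.
sub strip_wayback_prefix {
    my ($url) = @_;
    # Assumption: the timestamp is a run of digits, optionally followed
    # by a two-letter modifier and an underscore (for example "id_").
    $url =~ s{^https?://web\.archive\.org/web/\d+(?:[a-z]{2}_)?/}{};
    return $url;
}

print strip_wayback_prefix(
    "https://web.archive.org/web/20150227111857/http://code.google.com/p/www-mechanize/issues/list"
), "\n";
```

Cleaning a full page would also mean rewriting links inside the HTML and removing the archive toolbar, which this sketch does not attempt.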
Nearly ready to load the issues. I've skipped thinking about tagging or assigning the issues, as I don't have the privs necessary in the API.
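As a rough illustration of loading issues without labels or assignees, the sketch below builds a JSON body that carries only a title and a body, which is the kind of payload an issue-creation request would need. The helper name `issue_payload` and the sample values are hypothetical and are not taken from the migration scripts; the field names `title`, `body`, and `labels` are assumed to match what the GitHub issues API accepts.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::PP;    # core module, so no extra install is needed

# Hypothetical helper: build the JSON body for creating one issue.
# Labels are included only when the caller supplies a non-empty list,
# so an account without label privileges can simply leave them out.
sub issue_payload {
    my (%issue) = @_;
    my %payload = (
        title => $issue{title},
        body  => $issue{body},
    );
    $payload{labels} = $issue{labels}
        if $issue{labels} && @{ $issue{labels} };
    # canonical() sorts the keys so the output is stable across runs.
    return JSON::PP->new->canonical->encode(\%payload);
}

print issue_payload(
    title => "Example issue",
    body  => "Originally reported on Google Code.",
), "\n";
```

Sending the payload (authentication, rate limiting, error handling) is left out on purpose, since those details depend on the account and client library used.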
Finished migrating the issues from Google Code into new issues #116 through #225. I've created a repo with the scripts and parsed data here: https://github.com/spazm/google_code_to_github_issues #CPAN-PRC #TeamZiprecruiter. Please close this ticket. Thanks.
@spazm thanks so much. This is great to have!
Migrated from rt.cpan.org#104912 (status was 'open')
Requestors:
From [email protected] on 2015-06-03 17:05:30:
From [email protected] on 2015-06-09 11:05:13:
From [email protected] on 2015-06-09 16:45:33:
From [email protected] on 2015-07-20 19:19:34:
From [email protected] on 2015-07-20 19:59:50:
From [email protected] on 2015-07-20 20:57:14:
From [email protected] on 2015-07-20 21:06:36:
From [email protected] on 2015-07-20 21:53:27:
From [email protected] on 2015-07-21 14:33:13: