Auto-retry when dataverse is slow #48

Open
mayeulk opened this issue Jun 19, 2019 · 3 comments

@mayeulk
Contributor

mayeulk commented Jun 19, 2019

The Dataverse server is often slow, and update_icews() then stops with an error. It would be great to have an option to relaunch it automatically in these cases (maybe after a delay, specified in seconds). There are at least two types of errors for which simply relaunching works:

  • Gateway Timeout (HTTP 504).

  • parse error: premature EOF

> update_icews(dryrun = FALSE); date()
Error in value[[3L]](cond) : 
  Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181129-icews-events.zip'
Ingesting records from '20181129-icews-events.tab'
Downloading '20181130-icews-events.zip'
Ingesting records from '20181130-icews-events.tab'
Downloading '20181203-icews-events.zip'
Ingesting records from '20181203-icews-events.tab'
Downloading '20181204-icews-events.zip'
Ingesting records from '20181204-icews-events.tab'
Downloading '20181205-icews-events.zip'
Ingesting records from '20181205-icews-events.tab'
Downloading '20181206-icews-events.zip'
Ingesting records from '20181206-icews-events.tab'
Downloading '20181207-icews-events.zip'
Ingesting records from '20181207-icews-events.tab'
Downloading '20181208-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Error in value[[3L]](cond) : 
  Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
parse error: premature EOF
                                       
                     (right here) ------^
> update_icews(dryrun = FALSE); date()
Downloading '20181208-icews-events.zip'
Ingesting records from '20181208-icews-events.tab'
Downloading '20181209-icews-events.zip'
Ingesting records from '20181209-icews-events.tab'
Downloading '20181210-icews-events.zip'
Ingesting records from '20181210-icews-events.tab'
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Error in value[[3L]](cond) : 
  Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Error in get_file(file_ref, get_doi()[[repo]]) : 
  Gateway Timeout (HTTP 504).
> update_icews(dryrun = FALSE); date()
Downloading '20181211-icews-events.zip'
Ingesting records from '20181211-icews-events.tab'
Downloading '20181212-icews-events.zip'
Ingesting records from '20181212-icews-events.tab'
Downloading '20181213-icews-events.zip'
@mayeulk
Contributor Author

mayeulk commented Jun 19, 2019

I just launched this in one go:

update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()
update_icews(dryrun = FALSE)
date()

This worked at first, but then Harvard's Dataverse "crashed":


> > update_icews(dryrun = FALSE)
> Downloading '20181215-icews-events.zip'
> Ingesting records from '20181215-icews-events.tab'
> Downloading '20181216-icews-events.zip'
> Ingesting records from '20181216-icews-events.tab'
> Downloading '20181217-icews-events.zip'
> Ingesting records from '20181217-icews-events.tab'
> Downloading '20181218-icews-events.zip'
> Ingesting records from '20181218-icews-events.tab'
> Downloading '20181219-icews-events.zip'
> Ingesting records from '20181219-icews-events.tab'
> Downloading '20181220-icews-events.zip'
> Ingesting records from '20181220-icews-events.tab'
> Downloading '20181221-icews-events.zip'
> Error in get_file(file_ref, get_doi()[[repo]]) : 
>   Internal Server Error (HTTP 500).
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:17 2019"
> > update_icews(dryrun = FALSE)
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > date()
> [1] "Wed Jun 19 18:31:18 2019"
> > update_icews(dryrun = FALSE); date()
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^
> > update_icews(dryrun = FALSE); date()
> Error in value[[3L]](cond) : 
>   Something went wrong in 'dataverse' or the Dataverse API, try again. Original error message:
> lexical error: invalid char in json text.
>                                        <!DOCTYPE HTML PUBLIC "-//IETF/
>                      (right here) ------^

Going to https://dataverse.harvard.edu/ at that point gave a "503 - Service Unavailable" page:
"Service Unavailable
The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later."

Two or three minutes later it was working fine again, and update_icews(dryrun = FALSE) succeeded.
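
The failures seen so far (HTTP 504, HTTP 500, "premature EOF", and an HTML page coming back where JSON was expected) can all be recognized from the error message text. Here is a minimal sketch of that idea. `is_transient_error()` is a hypothetical helper, not something that exists in icews or dataverse, and the pattern list is only an assumption based on the messages pasted above:

```r
# Hypothetical helper (not part of icews or dataverse): guess from the
# error message whether a failure looks like a temporary server problem.
is_transient_error <- function(err) {
  msg <- conditionMessage(err)
  patterns <- c(
    "HTTP 50[0234]",              # 500, 502, 503, 504 status codes
    "premature EOF",              # JSON response was cut off
    "invalid char in json text"   # HTML error page returned instead of JSON
  )
  # TRUE if any of the patterns occurs anywhere in the message
  any(vapply(patterns, grepl, logical(1), x = msg))
}

# Example with a message copied from the log above
err <- simpleError("Gateway Timeout (HTTP 504).")
is_transient_error(err)
```

Matching on message text is fragile (the wording could change between versions), so this is only meant to separate "probably worth retrying" from everything else.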

@mayeulk
Contributor Author

mayeulk commented Jun 19, 2019

Note: after what was possibly a server restart of dataverse.harvard.edu, things went more smoothly, with about 107 downloads completing before it stopped again with the following issue: #45 (comment)

@andybega
Owner

Hmm. I'm wondering if this is something that should be done in the actual client (https://github.com/IQSS/dataverse-client-r), but at the moment it is not being actively maintained for lack of a new owner.

Any suggestions for how this should properly be done? Upon encountering one of these errors, wait a short time and try again, repeating until either the call succeeds or some retry limit is reached?
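
One possible shape for such a loop, as a rough sketch rather than a worked-out design: `retry()` and its arguments `max_tries`, `delay`, and `should_retry` are hypothetical names chosen for illustration, and nothing here is existing icews or dataverse API.

```r
# Hypothetical wrapper (not an existing icews or dataverse function).
# Calls f() until it succeeds or max_tries attempts have failed,
# sleeping `delay` seconds between attempts.
retry <- function(f, max_tries = 5, delay = 30,
                  should_retry = function(err) TRUE) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(
      list(ok = TRUE, value = f()),
      error = function(err) list(ok = FALSE, error = err)
    )
    if (result$ok) return(result$value)
    # Give up on the last attempt, or if the error is not worth retrying;
    # the original error is re-signalled unchanged.
    if (attempt == max_tries || !should_retry(result$error)) {
      stop(result$error)
    }
    message("Attempt ", attempt, " failed: ",
            conditionMessage(result$error),
            "; retrying in ", delay, " seconds")
    Sys.sleep(delay)
  }
}

# Intended use (assumes update_icews() can safely be re-run after a failure,
# as the logs above suggest):
# retry(function() update_icews(dryrun = FALSE), max_tries = 10, delay = 60)
```

The `should_retry` argument is there so that only errors that look temporary (for example, ones whose message mentions HTTP 504) trigger another attempt, while anything else fails immediately. A fixed delay keeps the sketch short; a delay that grows with each attempt would be gentler on a server that is already struggling.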
