Consider using long waiting times and a high number of retries. Retrying is handled gracefully with httr::RETRY, respecting the waiting time requested by the server when it returns error 429 ("too many requests"). Even so, the whole process is likely to take a long time.
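
For orientation, here is a minimal sketch of what such a retried request can look like when built directly on httr::RETRY with the same back-off parameters. This is an illustration rather than the package's internal code: the Wayback Machine "Save Page Now" endpoint and the example.com placeholder are assumptions made for the example.

library(httr)

# Attempt to save one page, retrying with exponential back-off on failure.
response <- httr::RETRY(
  verb = "GET",
  url = "https://web.archive.org/save/https://example.com",
  times = 64,        # cf. retry_times
  pause_base = 16,   # cf. pause_base
  pause_cap = 1024,  # cf. pause_cap
  pause_min = 64     # cf. pause_min
)

httr::status_code(response)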


cas_ia_save(
  url = NULL,
  wait = 32,
  retry_times = 64,
  pause_base = 16,
  pause_cap = 1024,
  pause_min = 64,
  only_if_unavailable = TRUE,
  ia_check = TRUE,
  ia_check_wait = 2,
  db_connection = NULL,
  check_db = TRUE,
  write_db = TRUE,
  ...
)



url

A character vector of length one; a URL.


wait

Defaults to 32. I have found no authoritative guidance online on what wait time is suitable by itself, but I have noticed that with wait times shorter than 10 seconds the whole process soon stops getting positive replies from the server.


retry_times

Defaults to 64. Number of times to retry the download in case of errors.

pause_base, pause_cap

This method uses exponential back-off with full jitter: each request waits a random time between pause_min and pause_base * 2^attempt seconds, up to a maximum of pause_cap seconds (see the back-off sketch after the argument list).


pause_min

Minimum time to wait in the back-off; generally only necessary if you need pauses of less than one second (which may not be kind to the server, so use with caution!).


only_if_unavailable

Defaults to TRUE. If TRUE, checks for the availability of URLs before attempting to save them (a direct availability query is sketched after the argument list).


ia_check

Defaults to TRUE. If TRUE, checks the URL again after saving it and keeps a record in the local database.


ia_check_wait

Defaults to 2; passed to cas_ia_check(). Can generally be kept low, as this is a lightweight API.


check_db

Defaults to TRUE. If TRUE, checks whether the given URL has already been checked in the local database, and queries APIs only for URLs that have not previously been checked.


write_db

Defaults to TRUE. If TRUE, writes the result to a local database.


...

Passed to cas_get_db_file().
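
To make the full-jitter formula mentioned under pause_base and pause_cap concrete, the small illustration below approximates it in plain R; backoff_wait() is a hypothetical helper written for this example, not httr's exact implementation.

# Approximate full-jitter back-off: each attempt waits a random time
# between pause_min and pause_base * 2^attempt, capped at pause_cap.
backoff_wait <- function(attempt,
                         pause_base = 16,
                         pause_cap = 1024,
                         pause_min = 64) {
  upper <- min(pause_base * 2^attempt, pause_cap)
  runif(n = 1, min = pause_min, max = max(pause_min, upper))
}

set.seed(1)
round(sapply(1:6, backoff_wait))
# Average waits grow with each attempt but never exceed pause_cap.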

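A direct availability query, as referenced under only_if_unavailable, can be sketched as follows. This uses the public Wayback Machine availability endpoint, which is an assumption about what cas_ia_check() relies on internally, and requires the jsonlite package.

library(jsonlite)

# Ask the Wayback Machine whether a snapshot of this URL already exists.
availability <- jsonlite::fromJSON(
  "https://archive.org/wayback/available?url=example.com"
)

# A non-empty archived_snapshots element means a snapshot exists, so a new
# save could be skipped when only_if_unavailable = TRUE.
str(availability$archived_snapshots)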

if (FALSE) {
if (interactive()) {
  # Once the usual parameters are set with `cas_set_options()`, it is
  # generally fine to let it pick up URLs from the database and run without
  # any additional parameters.