This function relies on data stored in the database.

Usage

cas_get_path_to_files(
  urls = NULL,
  id = NULL,
  batch = "latest",
  status = 200,
  index = FALSE,
  index_group = NULL,
  custom_folder = NULL,
  custom_path = NULL,
  file_format = "html",
  sample = FALSE,
  db_connection = NULL,
  db_folder = NULL,
  disconnect_db = TRUE,
  ...
)

Arguments

batch

Defaults to "latest", which returns only the path to the file with the highest batch identifier available. Valid values are "latest", "all", or a numeric identifier corresponding to the desired batch.

status

Defaults to 200. Keeps only files downloaded with the given status (can be more than one, given as a vector). If NULL, no filter based on status is applied.
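How batch and status interact can be pictured on a toy table. The sketch below uses base R only and does not show the package's internals: the data frame files_df, its columns (id, batch, status, path), and the helper filter_files() are all hypothetical, and applying the status filter before the batch filter is an assumption.

```r
# Hypothetical download log: one row per download attempt (not the real schema).
files_df <- data.frame(
  id     = c(1, 1, 2, 2, 3),
  batch  = c(1, 2, 1, 2, 1),
  status = c(200, 200, 404, 200, 500),
  path   = c("1/1_1.html", "2/1_2.html", "1/2_1.html", "2/2_2.html", "1/3_1.html"),
  stringsAsFactors = FALSE
)

# Illustrative helper, not part of the package.
filter_files <- function(df, batch = "latest", status = 200) {
  # Keep only rows with one of the requested statuses; NULL disables the filter.
  if (!is.null(status)) {
    df <- df[df$status %in% status, , drop = FALSE]
  }
  if (identical(batch, "latest")) {
    # For each id, keep the row with the highest batch identifier.
    df <- df[df$batch == ave(df$batch, df$id, FUN = max), , drop = FALSE]
  } else if (is.numeric(batch)) {
    # Keep only the requested batch(es).
    df <- df[df$batch %in% batch, , drop = FALSE]
  }
  # batch = "all" applies no batch filter.
  df
}

latest_ok <- filter_files(files_df)                           # defaults
everything <- filter_files(files_df, batch = "all", status = NULL)
first_batch <- filter_files(files_df, batch = 1, status = c(200, 404))
```

With the defaults, each id that has at least one download with status 200 contributes a single row; passing status = NULL and batch = "all" returns every row of the toy table.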

index

Logical, defaults to FALSE. If TRUE, downloaded files are treated as index files. If FALSE, they are treated as contents files. See the README for a more extensive explanation.

sample

Defaults to FALSE. If TRUE, the download order is randomised. If a numeric is given, the download order is randomised and at most the given number of items is downloaded.
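The three accepted forms of sample (FALSE, TRUE, or a number) can be sketched in base R as follows. This is only an illustration of the described behaviour on a hypothetical table; paths_df and sample_rows() are made-up names, not part of the package.

```r
# Hypothetical table of items, in their original order.
paths_df <- data.frame(
  id   = 1:5,
  path = sprintf("%d.html", 1:5),
  stringsAsFactors = FALSE
)

# Illustrative helper, not part of the package.
sample_rows <- function(df, sample = FALSE) {
  # FALSE: leave the order untouched.
  if (isFALSE(sample)) {
    return(df)
  }
  # TRUE or a number: shuffle the row order.
  df <- df[base::sample(nrow(df)), , drop = FALSE]
  # A number additionally caps how many rows are kept.
  if (is.numeric(sample)) {
    df <- utils::head(df, n = sample)
  }
  df
}

set.seed(1)
unchanged <- sample_rows(paths_df, sample = FALSE)
shuffled  <- sample_rows(paths_df, sample = TRUE)
capped    <- sample_rows(paths_df, sample = 2)
oversized <- sample_rows(paths_df, sample = 10)
```

A number larger than the number of available rows simply returns all rows in random order, which matches the "at most" wording above.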

db_connection

Defaults to NULL. If NULL, a local SQLite database is used. If given, must be a connection object or a list with relevant connection settings (see example).

...

Passed to cas_get_db_file().

Value

A data frame with one row if batch is set to "latest"; possibly more than one row otherwise.