Downloads one file at a time with the default R function for downloading files
Source: R/cas_download_internal.R
Mostly used internally by cas_download(); relies on the download.file() function.
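As a rough illustration of the one-file-at-a-time approach described above, the sketch below loops over a data frame with `url` and `path` columns and calls `download.file()` for each row, pausing between requests. The helper name `sketch_download()` is hypothetical and is not part of the package; the skip-if-exists logic and the error handling are assumptions, not the package's actual implementation.

```r
# Hypothetical helper, NOT the package's implementation: a simplified
# sequential download loop built on utils::download.file().
sketch_download <- function(download_df, wait = 1, overwrite_file = FALSE) {
  downloaded <- logical(nrow(download_df))
  for (i in seq_len(nrow(download_df))) {
    destination <- download_df$path[i]
    # Assumption: existing files are skipped unless overwriting is requested
    if (file.exists(destination) && !overwrite_file) {
      next
    }
    dir.create(dirname(destination), recursive = TRUE, showWarnings = FALSE)
    # download.file() returns 0 on success; treat errors as failures
    status <- tryCatch(
      utils::download.file(
        url = download_df$url[i],
        destfile = destination,
        quiet = TRUE
      ),
      error = function(e) 1L
    )
    downloaded[i] <- identical(as.integer(status), 0L)
    # Pause after each download attempt to limit server load
    Sys.sleep(wait)
  }
  downloaded
}

# Offline demonstration with a local file served through a file:// URL
source_file <- tempfile(fileext = ".html")
writeLines("<html><body>example</body></html>", source_file)
demo_df <- data.frame(
  id = 1L,
  url = paste0("file://", source_file),
  path = file.path(tempfile(), "1.html"),
  type = "html",
  stringsAsFactors = FALSE
)
sketch_download(demo_df, wait = 0)
```

Running the same call a second time skips the row, because the destination file already exists and `overwrite_file` is FALSE.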
Usage
cas_download_internal(
  download_df = NULL,
  index = FALSE,
  index_group = NULL,
  overwrite_file = FALSE,
  ignore_id = TRUE,
  wait = 1,
  ignore_ssl_certificates = FALSE,
  create_folder_if_missing = NULL,
  db_connection = NULL,
  disconnect_db = FALSE,
  sample = FALSE,
  file_format = "html",
  download_again = FALSE,
  download_again_if_status_is_not = NULL,
  ...
)
Arguments
- download_df
A data frame with four columns: id, url, path, type.
- index
Logical, defaults to FALSE. If TRUE, downloaded files will be considered index files. If not, they will be considered contents files. See Readme for a more extensive explanation.
- overwrite_file
Logical, defaults to FALSE.
- wait
Defaults to 1. Number of seconds to wait between downloading one page and the next. Can be increased to reduce server load, or can be set to 0 when this is not an issue.
- ignore_ssl_certificates
Logical, defaults to FALSE. If TRUE, it uses wget to download the page, and does not check if the SSL certificate is valid. Useful, for example, for https pages with an expired or misconfigured SSL certificate.
- db_connection
Defaults to NULL. If NULL, uses local SQLite database. If given, must be a connection object or a list with relevant connection settings (see example).
- disconnect_db
Defaults to FALSE in the usage above. If FALSE, leaves the connection to the database open.
- sample
Defaults to FALSE. If TRUE, the download order is randomised. If a numeric is given, the download order is randomised and at most the given number of items is downloaded.
- file_format
Defaults to html. Used for storing files in dedicated folders, but also for determining processing options. For example, if a sitemap is downloaded as an index with file_format set to xml, it will be processed accordingly. If it is stored as xml.gz, it will be automatically decompressed for correct processing.
- ...
Passed to cas_get_db_file().
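The `sample` and `file_format` descriptions above can be made more concrete with a small sketch. Both helpers below, `sketch_apply_sample()` and `sketch_read_index()`, are hypothetical names introduced only for illustration; they mimic the documented behaviour in base R and may well differ from how the package does it internally.

```r
# Hypothetical helper: mimic the documented `sample` semantics.
# FALSE keeps the original order; TRUE shuffles all rows; a number
# shuffles the rows and keeps at most that many.
sketch_apply_sample <- function(download_df, sample = FALSE) {
  if (isFALSE(sample)) {
    return(download_df)
  }
  shuffled <- download_df[sample.int(nrow(download_df)), , drop = FALSE]
  if (is.numeric(sample)) {
    shuffled <- utils::head(shuffled, n = sample)
  }
  shuffled
}

# Hypothetical helper: read an index file as text, decompressing it
# on the fly when the path ends in .gz (for example a sitemap stored
# as xml.gz).
sketch_read_index <- function(path) {
  con <- if (grepl("\\.gz$", path)) {
    gzfile(path, open = "rt")
  } else {
    file(path, open = "rt")
  }
  on.exit(close(con))
  readLines(con, warn = FALSE)
}

# Placeholder URLs, used only to populate the data frame
demo_df <- data.frame(
  id = 1:10,
  url = paste0("https://example.com/page/", 1:10),
  stringsAsFactors = FALSE
)
nrow(sketch_apply_sample(demo_df, sample = 3))
```

Capping after shuffling, rather than before, means a numeric `sample` draws a random subset instead of always taking the first rows.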