This function is typically used to check a web page when extracting links from index pages, or contents from contents pages.
Usage
cas_browse(
index = FALSE,
remote = TRUE,
id = NULL,
batch = NULL,
index_group = NULL,
file_format = "html",
sample = 1,
disconnect_db = TRUE,
...
)
Arguments
- index
Logical, defaults to FALSE. If TRUE, downloaded files will be considered index files. If not, they will be considered contents files. See Readme for a more extensive explanation.
- remote
Defaults to TRUE. If TRUE, opens the relevant URL online. If FALSE, opens the locally stored file.
- file_format
Defaults to html. Used for storing files in dedicated folders, but also for determining processing options. For example, if a sitemap is downloaded as an index with file_format set to xml, it will be processed accordingly. If it is stored as xml.gz, it will be automatically decompressed for correct processing.
- sample
Defaults to 1, in which case it opens one random URL.
- ...
Passed to cas_get_db_file().
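Examples
The following is a minimal sketch of how the arguments above might be combined, not an example taken from the package itself. It assumes that castarter is installed and that a project with previously downloaded index files has already been set up. It also assumes that setting sample to 3 opens three random pages, by analogy with the documented default of 1. The browse_args name is introduced here only for illustration.

```r
# Illustrative arguments only: every name matches the Usage signature above,
# but the values are assumptions chosen for this sketch.
browse_args <- list(
  index = TRUE,         # treat the files as index files, not contents files
  remote = FALSE,       # open the locally stored file instead of the online URL
  file_format = "html", # the documented default, stated explicitly
  sample = 3            # assumed to open three random pages instead of one
)

# Opening a browser only makes sense in an interactive session, and the call
# needs castarter to be installed and a project already configured.
if (interactive() && requireNamespace("castarter", quietly = TRUE)) {
  do.call(castarter::cas_browse, browse_args)
}
```

Keeping the arguments in a list makes it easy to switch between checking the local copy and the live page by changing remote alone.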