Get path to folder where the corpus is stored.
Source: R/cas_get_corpus_path.R
Usage
cas_get_corpus_path(
  ...,
  corpus_folder = "corpus",
  file_format = "parquet",
  partition = NULL,
  token = "full_text"
)
Arguments
- ...
Passed to cas_get_db_file().
- file_format
Defaults to "parquet". Currently, other options are not implemented.
- partition
Defaults to NULL. If NULL, the parquet file is not partitioned. "year" is a common alternative: if set to "year", the parquet file is partitioned by year. If a year column does not exist, it is created based on the assumption that a date column exists and it is (or can be coerced to) a vector of class Date.
- token
Defaults to "full_text", which does not tokenise the text column. If different from "full_text", it is passed to tidytext::unnest_tokens() (see its help for details). Accepted values include "words", "sentences", and "paragraphs".
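Examples
The fallback described for partition = "year" can be illustrated with a minimal base R sketch. This is not the package's internal code: the helper add_year_if_missing() and the sample data frame are hypothetical, and storing the year as an integer is an assumption made for illustration.

```r
# Hypothetical helper, not part of castarter: adds a `year` column only if
# one is not already present.
add_year_if_missing <- function(corpus_df) {
  if (!"year" %in% names(corpus_df)) {
    # Assumes a `date` column that is, or can be coerced to, class Date.
    corpus_df$year <- as.integer(format(as.Date(corpus_df$date), "%Y"))
  }
  corpus_df
}

# Made-up sample corpus with dates stored as character strings.
corpus_df <- data.frame(
  date = c("2021-03-15", "2022-11-02"),
  text = c("First document.", "Second document."),
  stringsAsFactors = FALSE
)

corpus_with_year_df <- add_year_if_missing(corpus_df)
```

Going by the signature shown under Usage, the corresponding call in castarter would be cas_get_corpus_path(partition = "year").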