Count strings in a corpus relative to the number of words
Source: R/cas_count_relative.R
Usage
cas_count_relative(
corpus,
pattern,
text = text,
group_by = date,
ignore_case = TRUE,
fixed = FALSE,
full_words_only = FALSE,
pattern_column_name = pattern,
n_column_name = n,
locale = "en"
)
Arguments
- corpus
A textual corpus as a data frame.
- pattern
A character vector of one or more words or strings to be counted.
- text
Defaults to text. The unquoted name of the column of the corpus data frame to be used for matching.
- group_by
Defaults to date. The unquoted name of the column to be used for grouping (e.g. date, doc_id, or source).
- ignore_case
Defaults to TRUE. If TRUE, matching is case-insensitive.
- full_words_only
Defaults to FALSE. If FALSE, the string is counted even when it is found in the middle of a word (e.g. "ratio" would be counted as a match in the word "irrational").
- pattern_column_name
Defaults to pattern. The unquoted name of the column to be used for the word or string in the output (if include_string is set to TRUE, as per default).
- n_column_name
Defaults to n. The unquoted name of the column to be used for the count in the output.
- locale
Locale to be used when ignore_case is set to TRUE. Passed to stringr::str_to_lower; defaults to "en".
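The difference between substring and whole-word matching described for full_words_only can be illustrated with stringr alone. This is a minimal sketch, not the function's own implementation: the sample sentences are invented, and the use of regex word boundaries to approximate full_words_only = TRUE is an assumption.

```r
library(stringr)

# Invented sample sentences, lower-cased as ignore_case = TRUE would imply
sample_text <- c("An irrational ratio", "Ratio of cats to dogs")
sample_lower <- str_to_lower(sample_text, locale = "en")

# Substring matching (the behaviour described for full_words_only = FALSE):
# "ratio" is also found inside "irrational"
substring_counts <- str_count(sample_lower, fixed("ratio"))

# Whole-word matching (assumed approximation of full_words_only = TRUE),
# using regex word boundaries
whole_word_counts <- str_count(sample_lower, "\\bratio\\b")

substring_counts   # 2 1
whole_word_counts  # 1 1
```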
Examples
if (FALSE) { # \dontrun{
cas_count_relative(
corpus = corpus,
pattern = c("dogs", "cats", "horses"),
text = text,
group_by = date,
n_column_name = n
)
} # }
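
To make the idea of a count "relative to the number of words" concrete, the sketch below computes such a ratio by hand on a small invented corpus. It assumes that the relative count is the number of matches divided by the total number of words in each group; the function's actual computation may differ, and the column names matches and words are hypothetical.

```r
library(dplyr)
library(stringr)

# Invented toy corpus with the default column names `date` and `text`
toy_corpus <- tibble(
  date = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02")),
  text = c("Dogs chase cats", "Cats ignore dogs and horses", "Horses run")
)

relative_counts <- toy_corpus %>%
  mutate(
    text_lower = str_to_lower(text, locale = "en"),
    # occurrences of the pattern in each document
    matches = str_count(text_lower, fixed("cats")),
    # number of words in each document
    words = str_count(text_lower, boundary("word"))
  ) %>%
  group_by(date) %>%
  # assumed definition: matches divided by total words per group
  summarise(n = sum(matches) / sum(words))

relative_counts$n  # 0.25 0.00
```

On the first date, "cats" appears twice among eight words, giving 0.25; on the second date it does not appear at all.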