Summarise word counts, typically calculated with cas_count(), for a given time period
Source: R/cas_summarise.R
Summarise word counts, typically calculated with cas_count(), for a given time period.
Usage
cas_summarise(
count_df,
date_column_name = date,
n_column_name = n,
pattern_column_name = pattern,
period = NULL,
f = mean,
period_summary_function = sum,
every = 1L,
before = 0L,
after = 0L,
complete = FALSE,
auto_convert = FALSE
)
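As a rough illustration of the signature above, the following sketch builds a toy input and passes it to cas_summarise(). The data frame, its values, and the chosen argument values are hypothetical; only the argument names come from the usage shown here. The call is guarded so that the snippet still runs when castarter is not installed, and no output is shown because the exact shape of the result is not described in this page.

```r
# Hypothetical toy input using the default column names: date, pattern, n.
count_df <- data.frame(
  date = as.Date("2022-01-01") + 0:89,  # 90 consecutive days, Jan-Mar 2022
  pattern = "example",
  n = 1
)

# Assumption: castarter is installed; otherwise the call is skipped.
if (requireNamespace("castarter", quietly = TRUE)) {
  by_month <- castarter::cas_summarise(
    count_df,
    period = "month",                 # group daily counts by month
    period_summary_function = sum,    # total occurrences per month
    f = mean,                         # function applied over the sliding window
    before = 1L,                      # include the previous period in the window
    after = 0L
  )
  print(by_month)
}
```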
Arguments
- count_df
A data frame. Must include at least a date or date-time column and a column with the number of occurrences for the given time.
- period
Defaults to NULL. A string describing the time unit to be used for summarising. Possible values include "year", "quarter", "month", "day", "hour", "minute", "second", "millisecond".
- f
Defaults to mean. Function to be applied over n for all the values in a given time period. Common choices are mean or median.
- period_summary_function
Defaults to sum. This is applied when grouping by period (e.g. when period is set to "year"). When calculating absolute word frequency, the default (sum) is fine. When calculating relative frequencies, mean would be more appropriate, but extra consideration should be given to the implications if a rolling average is then applied.
- every
[positive integer(1)]
The number of periods to group together.
For example, if the period was set to "year" with an every value of 2, then the years 1970 and 1971 would be placed in the same group.
- before, after
[integer(1) / Inf]
The number of values before or after the current element to include in the sliding window. Set to Inf to select all elements before or after the current element. Negative values are allowed, which allows you to "look forward" from the current element if used as the before value, or "look backwards" if used as the after value.
- complete
[logical(1)]
Should the function be evaluated on complete windows only? If FALSE, the default, then partial computations will be allowed.
- auto_convert
Defaults to FALSE. If FALSE, the date column is returned in the same format as the input; the minimum value in the given group is used as reference (e.g. all values for January 2022 are summarised as 2022-01-01 if the data were originally given as dates). If TRUE, it tries to adapt the output to the most intuitive corresponding type: for year, a numeric column with only the year number; for quarter, the format 2022.1; for month, the format 2022-01.
- date_column_name
Defaults to date. Unquoted name of a column having either date or date-time as class.
- n_column_name
Defaults to n. Unquoted name of a column having the number of occurrences per time unit.
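The interaction between period, period_summary_function, f, before, after and every described above can be sketched in base R. This is a conceptual illustration only, not the package's implementation: the helper slide_partial() and the toy data are hypothetical, and the grouping of years for every assumes that groups are counted from 1970, as in the example given for that argument.

```r
# Hypothetical toy data: one occurrence per day, January to March 2022.
daily <- data.frame(
  date = as.Date("2022-01-01") + 0:89,
  n = 1
)

# Step 1: group by period ("month") and apply the period summary function (sum).
# The first day of each month is used as the label for the group.
daily$month <- as.Date(format(daily$date, "%Y-%m-01"))
monthly <- aggregate(n ~ month, data = daily, FUN = sum)
monthly$n  # 31 28 31

# Step 2: apply f over a sliding window of periods.
# Hypothetical helper: partial windows are allowed, as with complete = FALSE.
slide_partial <- function(x, f = mean, before = 0L, after = 0L) {
  vapply(seq_along(x), function(i) {
    idx <- seq(max(1L, i - before), min(length(x), i + after))
    f(x[idx])
  }, numeric(1))
}

# Mean of the current and the previous month (before = 1, after = 0).
monthly$rolling <- slide_partial(monthly$n, f = mean, before = 1L, after = 0L)
monthly$rolling  # 31.0 29.5 29.5

# every = 2 with period = "year": pairs of years share a group.
# Assumption: groups are counted from 1970, matching the example in the text.
years <- 1970:1973
year_group <- (years - 1970L) %/% 2L
year_group  # 0 0 1 1
```

The first rolling value is computed on a partial window (January only), which is the behaviour described for complete = FALSE; with complete windows only, that first value would have no result.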