
Adds columns with the n words before and after the selected pattern, to show keywords in context.


  text = text,
  words_before = 5,
  words_after = 5,
  same_sentence = TRUE,
  period_at_end_of_sentence = TRUE,
  ignore_case = TRUE,
  regex = TRUE,
  full_words_only = FALSE,
  full_word_with_partial_match = TRUE,
  pattern_column_name = pattern



A textual corpus as a data frame.


A pattern, typically of one or more words, to be used to break text. Should be of length 1 or length equal to the number of rows.


Defaults to text. The unquoted name of the column of the corpus data frame to be used for matching.


Integer, defaults to 5. Number of words to include in the before column.


Integer, defaults to 5. Number of words to include in the after column.


Logical, defaults to TRUE. If TRUE, before and after include only words found in the sentence including the matched pattern.


Logical, defaults to TRUE. If TRUE, a period (".") is always included at the end of a sentence. Relevant only if same_sentence is set to TRUE.


Logical, defaults to TRUE. If TRUE, the pattern is matched regardless of case.


Defaults to TRUE. Treat pattern as regex.


Logical, defaults to FALSE. If FALSE, the pattern is counted even when it is found in the middle of a word (e.g. "ratio" would be counted as a match in the word "irrational").


Logical, defaults to TRUE. If TRUE and the pattern matches only part of a word, the pattern column still includes the full word in which the match was found. Relevant only when full_words_only is set to FALSE.


Defaults to 'pattern'. The unquoted name of the column to be used for the word in the output.
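
The difference between partial and full-word matching described above can be illustrated with base R regular expressions. This is a minimal sketch, not the package's implementation: the `words` vector, the `\\b` word-boundary pattern, and the `\\w*` expansion are assumptions chosen for illustration.

```r
# Illustrative only: base R regex calls, not the package's own code.
words <- c("ratio", "irrational", "Rational")

# Partial, case-insensitive matching: the pattern may sit inside a longer word.
partial_match <- grepl("ratio", words, ignore.case = TRUE)

# Full-word matching: word boundaries (\\b) exclude matches inside longer words.
full_word_match <- grepl("\\bratio\\b", words, ignore.case = TRUE, perl = TRUE)

# On a partial match, recover the whole word that contains the pattern.
sentence <- "An irrational decision"
containing_word <- regmatches(
  sentence,
  regexpr("\\w*ratio\\w*", sentence, ignore.case = TRUE, perl = TRUE)
)
```

Here `partial_match` is TRUE for all three words, `full_word_match` is TRUE only for "ratio", and `containing_word` is "irrational".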


A data frame (a tibble), with the same columns as input, plus three columns: before, pattern, and after. Only rows where the pattern is found are included.
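
How the before, pattern, and after values relate to each other can be sketched for a single string in base R. `kwic_window()` is a hypothetical helper written for illustration, not a function of the package; it assumes words are separated by whitespace, uses only the first match, and ignores the sentence-related options.

```r
# Hypothetical helper, not part of the package: returns the words around
# the first whitespace-separated word that contains `pattern`.
kwic_window <- function(text, pattern, words_before = 5, words_after = 5) {
  words <- strsplit(text, "\\s+")[[1]]
  hit <- grep(pattern, words, ignore.case = TRUE)[1]
  if (is.na(hit)) {
    return(NULL) # no match: nothing to report for this text
  }
  before_idx <- seq_len(hit - 1)
  after_idx <- seq_len(length(words) - hit) + hit
  list(
    # keep at most `words_before` words immediately preceding the match
    before = paste(tail(words[before_idx], words_before), collapse = " "),
    pattern = words[hit],
    # keep at most `words_after` words immediately following the match
    after = paste(head(words[after_idx], words_after), collapse = " ")
  )
}

res <- kwic_window("Trade between Russia and China grew last year", "china", 2, 2)
```

With two words on each side, `res$before` is "Russia and", `res$pattern` is "China", and `res$after` is "grew last".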


  corpus = tifkremlinen::kremlin_en,
  pattern = c("china", "india")
#> # A tibble: 6,517 × 11
#>    doc_id text  date       title location link     id term  before pattern after
#>    <chr>  <chr> <date>     <chr> <chr>    <chr> <dbl> <chr> <chr>  <chr>   <chr>
#>  1 presi… The … 2000-01-14 Acti… The Gov… http… 37752 Puti… milit… China   and …
#>  2 presi… Acti… 2000-01-18 Acti… The Kre… http… 37781 Puti… Minis… China's Mini…
#>  3 presi… Acti… 2000-01-18 Acti… The Kre… http… 37781 Puti… the C… China   Janu…
#>  4 presi… Mr P… 2000-01-18 Acti… The Kre… http… 37781 Puti… for t… China   .    
#>  5 presi… He a… 2000-01-18 Acti… The Kre… http… 37781 Puti… He al… China   had …
#>  6 presi… The … 2000-02-01 Acti… Preside… http… 37849 Puti… Autho… China   as w…
#>  7 presi… Acti… 2000-02-17 Acti… NA       http… 37943 Puti… Russi… China   .    
#>  8 presi… Acti… 2000-02-28 Acti… The Kre… http… 38018 Puti… to th… China   sche…
#>  9 presi… As t… 2000-03-01 Vlad… The Kre… http… 38761 Puti… and c… China   were…
#> 10 presi… Mr T… 2000-03-01 Vlad… The Kre… http… 38761 Puti… to Mr… China   with…
#> # ℹ 6,507 more rows