Text mining using tidy tools :sparkles::page_facing_up::sparkles:
Added alt text to figures in vignettes and README (#233)
Update vignette for quanteda::dfm() v4 (#242)
stm() tidiers for high FREX and lift words (#223)
dfm because of the upcoming release of Matrix (#218)
scale_x/y_reordered() now uses a function labels as its main input (#200)
to_lower is passed to underlying tokenization function for character shingles (#208)
content, thanks to @jonathanvoelkle (#209)
collapse argument to unnest_functions(). This argument now takes either NULL (do not collapse text across rows for tokenizing) or a character vector of variables (use said variables to collapse text across rows for tokenizing). This fixes a long-standing bug and provides more consistent behavior, but does change results for many situations (such as n-gram tokenization).
reorder_within() now handles multiple variables, thanks to @tmastny (#170)
to_lower argument to other tokenizing functions, for more consistent behavior (#175)
glance() method for stm's estimated regressions, thanks to @vincentarelbundock (#176)
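
The collapse entry above describes two modes: NULL (tokenize each row on its own) and a character vector of variables (join the text of rows that share those variables before tokenizing). The following is a minimal sketch of that difference using unnest_tokens() with bigram tokenization. The data frame d and its columns doc and txt are made up for illustration and do not come from the changelog.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data: two rows belong to document "a", one to document "b"
d <- tibble(
  doc = c("a", "a", "b"),
  txt = c("the quick brown", "fox jumps over", "the lazy dog")
)

# collapse = NULL: every row is tokenized separately,
# so no bigram can span two rows
by_row <- d %>%
  unnest_tokens(bigram, txt, token = "ngrams", n = 2, collapse = NULL)

# collapse = "doc": rows sharing the same doc value are joined first,
# so bigrams may cross the original row boundary within a document
by_doc <- d %>%
  unnest_tokens(bigram, txt, token = "ngrams", n = 2, collapse = "doc")

print(by_row)
print(by_doc)
```

In this sketch, a bigram such as "brown fox" straddles the first two rows, so it should only show up in by_doc. This is the kind of change in n-gram results the entry warns about.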