Extra 'Recipes' for Text Processing


Documentation for package ‘textrecipes’ version 1.1.0

Help Pages

all_tokenized Role Selection
all_tokenized_predictors Role Selection
count_functions List of all feature counting functions
emoji_samples Sample sentences with emojis
show_tokens Show token output of recipe
step_clean_levels Clean Categorical Levels
step_clean_names Clean Variable Names
step_dummy_hash Indicator Variables via Feature Hashing
step_lda Calculate LDA Dimension Estimates of Tokens
step_lemma Lemmatization of Token Variables
step_ngram Generate n-grams From Token Variables
step_pos_filter Part of Speech Filtering of Token Variables
step_sequence_onehot Positional One-Hot Encoding of Tokens
step_stem Stemming of Token Variables
step_stopwords Filtering of Stop Words for Token Variables
step_textfeature Calculate Set of Text Features
step_texthash Feature Hashing of Tokens
step_text_normalization Normalization of Character Variables
step_tf Term Frequency of Tokens
step_tfidf Term Frequency-Inverse Document Frequency of Tokens
step_tokenfilter Filter Tokens Based on Term Frequency
step_tokenize Tokenization of Character Variables
step_tokenize_bpe BPE Tokenization of Character Variables
step_tokenize_sentencepiece SentencePiece Tokenization of Character Variables
step_tokenize_wordpiece WordPiece Tokenization of Character Variables
step_tokenmerge Combine Multiple Token Variables Into One
step_untokenize Untokenization of Token Variables
step_word_embeddings Pretrained Word Embeddings of Tokens
tidy.step_clean_levels Clean Categorical Levels
tidy.step_clean_names Clean Variable Names
tidy.step_dummy_hash Indicator Variables via Feature Hashing
tidy.step_lda Calculate LDA Dimension Estimates of Tokens
tidy.step_lemma Lemmatization of Token Variables
tidy.step_ngram Generate n-grams From Token Variables
tidy.step_pos_filter Part of Speech Filtering of Token Variables
tidy.step_sequence_onehot Positional One-Hot Encoding of Tokens
tidy.step_stem Stemming of Token Variables
tidy.step_stopwords Filtering of Stop Words for Token Variables
tidy.step_textfeature Calculate Set of Text Features
tidy.step_texthash Feature Hashing of Tokens
tidy.step_text_normalization Normalization of Character Variables
tidy.step_tf Term Frequency of Tokens
tidy.step_tfidf Term Frequency-Inverse Document Frequency of Tokens
tidy.step_tokenfilter Filter Tokens Based on Term Frequency
tidy.step_tokenize Tokenization of Character Variables
tidy.step_tokenize_bpe BPE Tokenization of Character Variables
tidy.step_tokenize_sentencepiece SentencePiece Tokenization of Character Variables
tidy.step_tokenize_wordpiece WordPiece Tokenization of Character Variables
tidy.step_tokenmerge Combine Multiple Token Variables Into One
tidy.step_untokenize Untokenization of Token Variables
tidy.step_word_embeddings Pretrained Word Embeddings of Tokens
tokenlist Create Token Object
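As a quick orientation, the `step_*` functions above are chained inside a `recipes::recipe()` specification. A minimal sketch (the example data frame is invented for illustration) combining `step_tokenize`, `step_tokenfilter`, and `step_tfidf`:

```r
library(recipes)
library(textrecipes)

# Toy data: one character column to be tokenized, one outcome.
dat <- data.frame(
  text  = c("I would not eat them here or there.",
            "I would not eat them anywhere."),
  class = c("a", "b")
)

rec <- recipe(class ~ text, data = dat) |>
  step_tokenize(text) |>                    # character -> token variable
  step_tokenfilter(text, max_tokens = 10) |># keep the most frequent tokens
  step_tfidf(text)                          # tokens -> tf-idf features

# prep() estimates the steps; bake() returns the transformed data.
prep(rec) |> bake(new_data = NULL)
```

The output replaces the `text` column with numeric `tfidf_text_*` columns, one per retained token, ready for a modeling step.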