Package org.languagetool.rules.ngrams
Class LanguageModelUtils
java.lang.Object
org.languagetool.rules.ngrams.LanguageModelUtils
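Field Summary
Modifier and Type | Field
private static final org.slf4j.Logger | logger
Constructor Summary
Modifier | Constructor
private | LanguageModelUtils()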
Method Summary
Modifier and Type | Method | Description
static double | get3gramProbabilityFor(Language lang, LanguageModel lm, int position, AnalyzedSentence sentence, String candidate) |
(package private) static double | get3gramProbabilityFor(Language lang, LanguageModel lm, GoogleToken token, List<GoogleToken> tokens, String term) |
static double | get4gramProbabilityFor(Language lang, LanguageModel lm, int position, AnalyzedSentence sentence, String candidate) |
(package private) static double | get4gramProbabilityFor(Language lang, LanguageModel lm, GoogleToken token, List<GoogleToken> tokens, String term) |
(package private) static List<String> | getContext(GoogleToken token, List<GoogleToken> tokens, String newToken, int toLeft, int toRight) |
(package private) static List<String> | getContext(GoogleToken token, List<GoogleToken> tokens, List<GoogleToken> newTokens, int toLeft, int toRight) |
static <T> List<T> | getContext(T token, List<T> tokens, List<T> newTokens, int toLeft, int toRight, Predicate<T> isWhitespace, T endToken) |
(package private) static Tokenizer | getGoogleStyleWordTokenizer(Language language) | Returns a tokenizer that behaves more like the one Google uses for its ngram index (whose tokenization does not seem to be properly documented).
Field Details
logger
private static final org.slf4j.Logger logger
Constructor Details
LanguageModelUtils
private LanguageModelUtils()
Method Details
getGoogleStyleWordTokenizer
static Tokenizer getGoogleStyleWordTokenizer(Language language)
Returns a tokenizer that behaves more like the one Google uses for its ngram index (whose tokenization does not seem to be properly documented).
getContext
static List<String> getContext(GoogleToken token, List<GoogleToken> tokens, String newToken, int toLeft, int toRight)
getContext
static List<String> getContext(GoogleToken token, List<GoogleToken> tokens, List<GoogleToken> newTokens, int toLeft, int toRight)
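getContext
public static <T> List<T> getContext(T token, List<T> tokens, List<T> newTokens, int toLeft, int toRight, Predicate<T> isWhitespace, T endToken)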
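The generic getContext overload is listed without a description, so the following is only a sketch of one plausible reading of its parameter names, not LanguageTool's implementation. It assumes that toLeft and toRight count non-whitespace tokens on either side of the target, that newTokens replaces the target token, and that endToken is added once when the sentence boundary is reached before the window is full. The class ContextWindowSketch and the method contextWindow are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class ContextWindowSketch {

  /**
   * Hypothetical helper (not LanguageTool code). Builds a window of up to
   * toLeft non-whitespace tokens before the target, then the replacement
   * tokens, then up to toRight non-whitespace tokens after the target.
   * If a sentence boundary is reached first, one boundary marker is added
   * on that side (an assumption about how endToken is meant to be used).
   */
  public static <T> List<T> contextWindow(T token, List<T> tokens, List<T> newTokens,
                                          int toLeft, int toRight,
                                          Predicate<T> isWhitespace, T boundary) {
    // indexOf returns the first equal element, so duplicates resolve to the first match.
    int pos = tokens.indexOf(token);
    if (pos == -1) {
      throw new IllegalArgumentException("Token not found: " + token);
    }
    List<T> result = new ArrayList<>();

    // Walk left from the target, skipping whitespace tokens.
    int added = 0;
    for (int i = pos - 1; added < toLeft; i--) {
      if (i < 0) {
        result.add(0, boundary);
        break;
      }
      T t = tokens.get(i);
      if (!isWhitespace.test(t)) {
        result.add(0, t);
        added++;
      }
    }

    // The replacement tokens take the place of the target token.
    result.addAll(newTokens);

    // Walk right from the target, skipping whitespace tokens.
    added = 0;
    for (int i = pos + 1; added < toRight; i++) {
      if (i >= tokens.size()) {
        result.add(boundary);
        break;
      }
      T t = tokens.get(i);
      if (!isWhitespace.test(t)) {
        result.add(t);
        added++;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> tokens = List.of("I", " ", "saw", " ", "there", " ", "car");
    Predicate<String> isWhitespace = s -> s.trim().isEmpty();
    // Replace "there" with "their", one token of context on each side.
    System.out.println(contextWindow("there", tokens, List.of("their"), 1, 1, isWhitespace, "."));
    // Target at the sentence start: the boundary marker fills the left side.
    System.out.println(contextWindow("I", tokens, List.of("I"), 2, 0, isWhitespace, "."));
  }
}
```

Under these assumptions, the first call yields the trigram [saw, their, car] and the second yields [., I]. Windows of this kind are what an n-gram lookup would be given as input; how the library itself pads or truncates at sentence boundaries should be checked against its source.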
get3gramProbabilityFor
public static double get3gramProbabilityFor(Language lang, LanguageModel lm, int position, AnalyzedSentence sentence, String candidate)
get4gramProbabilityFor
public static double get4gramProbabilityFor(Language lang, LanguageModel lm, int position, AnalyzedSentence sentence, String candidate)
get3gramProbabilityFor
static double get3gramProbabilityFor(Language lang, LanguageModel lm, GoogleToken token, List<GoogleToken> tokens, String term)
get4gramProbabilityFor
static double get4gramProbabilityFor(Language lang, LanguageModel lm, GoogleToken token, List<GoogleToken> tokens, String term)