[ngrams] improve function documentation, add types, add unit tests

I want to understand the ngrams algorithms better.
parent d7a70fd4
Pipeline #7169 passed with stages in 55 minutes and 26 seconds