Commit bd4e8f10 authored by Alexandre Delanoë

[DOC] references.

parent be7e5a1c
@@ -10,8 +10,9 @@ Portability : POSIX
# Implementation of Unsupervized Word Segmentation
References:
- EleVe Python implementation and discussions with Korantin August and Bruno Gaume
[git repo](https://github.com/kodexlab/eleve.git)
- Python implementation (Korantin August, Emmanuel Navarro):
[EleVe](https://github.com/kodexlab/eleve.git)
- Unsupervized Word Segmentation: the case for Mandarin Chinese. Pierre
Magistry, Benoît Sagot, Alpage, INRIA & Univ. Paris 7, Proceedings of
@@ -19,9 +20,8 @@ References:
, pages 383–387. [PDF](https://www.aclweb.org/anthology/P12-2075)
Notes for current implementation:
- The node count is correct; TODO AD add tests to keep track of it
- NP fix normalization
- NP extract longer ngrams (see paper above, viterbi algo can be used)
- TODO fix normalization
- TODO extract longer ngrams (see paper above, viterbi algo can be used)
- TODO AD TEST: prop (Node c _e f) = c == Map.size f
- AD: Real ngrams extraction test
@@ -31,7 +31,6 @@ Notes for current implementation:
$ catMaybes
$ Gargantext.map _hyperdataDocument_abstract docs
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}
......
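The note `TODO AD TEST: prop (Node c _e f) = c == Map.size f` can be read as a check that a node's stored count equals its number of children. Below is a minimal sketch of that check, not the project's actual test. The `Trie` type, the `Leaf` constructor, and the names `propNodeCount`, `propAllNodes`, `good`, and `bad` are assumptions made for illustration; only the pattern `Node c _e f` and the equation `c == Map.size f` come from the note. Whether the count in the real module is meant to be a child count or something else (an occurrence count, for instance) is not stated in this diff.

```haskell
module Main where

import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Hypothetical trie shape, modelled only on the pattern `Node c _e f`
-- from the note: a count, an extra payload, and a map of children.
data Trie k e
  = Node Int e (Map k (Trie k e))
  | Leaf Int
  deriving (Show)

-- The property quoted in the note, applied to a single node.
-- Leaves have no children map, so they pass trivially (an assumption).
propNodeCount :: Trie k e -> Bool
propNodeCount (Node c _e f) = c == Map.size f
propNodeCount (Leaf _)      = True

-- Apply the property to a node and, recursively, to all of its children.
propAllNodes :: Trie k e -> Bool
propAllNodes t@(Node _ _ f) = propNodeCount t && all propAllNodes (Map.elems f)
propAllNodes (Leaf _)       = True

-- A small trie whose counts match the number of children at every node.
good :: Trie Char ()
good = Node 2 ()
  (Map.fromList
    [ ('a', Leaf 1)
    , ('b', Node 1 () (Map.fromList [('c', Leaf 1)]))
    ])

-- A trie whose root claims three children but has only one.
bad :: Trie Char ()
bad = Node 3 () (Map.fromList [('a', Leaf 1)])

main :: IO ()
main = do
  print (propAllNodes good)  -- True
  print (propAllNodes bad)   -- False
```

With an `Arbitrary` instance for the real trie type, the same predicate could be handed to a property-testing library; that step is left out here because the instance would depend on how the module builds its tries.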