humanities / gargantext
Commit 42298c2e, authored Jul 28, 2016 by c24b
[TO TEST] lang_detection
Parent: 67cd43b0
Showing 3 changed files with 6 additions and 7 deletions (+6 -7)
gargantext/util/languages.py  +4 -6
gargantext/util/toolchain/parsing.py  +1 -1
install/gargamelle/requirements.txt  +1 -0
gargantext/util/languages.py
 from gargantext.constants import *
-from langdetect import detect
-from langdetect import DetectorFactory
+from langdetect import detect, DetectorFactory
 class Language:
     def __init__(self, iso2=None, iso3=None, full_name=None, name=None):
...
@@ -18,9 +16,6 @@ class Language:
         return result
     __repr__ = __str__
-    def detect_lang(self, text):
-        DetectorFactory.seed = 0
-        return Languages[detect(text)].iso2
 class Languages(dict):
     def __missing__(self, key):
...
@@ -30,6 +25,9 @@ class Languages(dict):
         raise KeyError
 languages = Languages()
+def detect_lang(self, text):
+    DetectorFactory.seed = 0
+    return languages[detect(text)].iso2
 import pycountry
 pycountry_keys = (
...
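Note on the relocated `detect_lang`: the page does not preserve indentation, but if the new definition sits at module level (after `languages = Languages()`), its leftover `self` parameter would force callers to pass a placeholder first argument. Below is a minimal sketch of a variant without `self`, under stated assumptions: `LANGUAGE_TABLE` is a hypothetical stand-in for the `languages` registry, and `detector` is an injected callable standing in for `langdetect.detect`, so the sketch runs without the library installed. It is not the author's implementation.

```python
from typing import Callable, Dict

# Hypothetical stand-in for the `languages` registry in languages.py:
# maps a detected code to the ISO 639-1 (iso2) code.
LANGUAGE_TABLE: Dict[str, str] = {"en": "en", "fr": "fr"}


def detect_lang(text: str, detector: Callable[[str], str]) -> str:
    """Return the iso2 code for `text` using an injected detector.

    In the commit the detector is langdetect.detect, and
    DetectorFactory.seed is set to 0 first so results are deterministic.
    """
    code = detector(text)
    # A code missing from the table raises KeyError, like a dict lookup.
    return LANGUAGE_TABLE[code]


def fake_detector(text: str) -> str:
    # Toy rule for illustration only; this is not how langdetect works.
    return "fr" if "bonjour" in text.lower() else "en"
```

With the real library, `detector` would be `langdetect.detect`, called after setting `DetectorFactory.seed = 0`.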
gargantext/util/toolchain/parsing.py
...
@@ -47,7 +47,7 @@ def parse(corpus):
     indexed = False
     # a simple census to raise language info at corpus level
     for l in ["iso2", "iso3", "full_name"]:
-        if hyperdata["indexed"] is True:
+        if indexed is True:
             break
         lang_field = "language_" + l
         if lang_field in hyperdata.keys():
...
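The one-line change in `parse` swaps a dictionary lookup for the local flag. A minimal sketch of why that matters, assuming `hyperdata` behaves like a plain dict that has no `"indexed"` key: the old test would raise `KeyError` on the first iteration, while the local `indexed` flag is always defined. The body of the final `if` is elided on the page, so the lines marked as assumptions below are illustrative only, and `first_language_field` is a hypothetical helper name.

```python
from typing import Optional


def first_language_field(hyperdata: dict) -> Optional[str]:
    """Return the first language_* field present in hyperdata, or None."""
    indexed = False
    found = None
    for l in ["iso2", "iso3", "full_name"]:
        # New check: reads the local flag, never touches hyperdata["indexed"].
        if indexed is True:
            break
        lang_field = "language_" + l
        if lang_field in hyperdata.keys():
            found = lang_field   # assumption: the real body is not shown
            indexed = True       # assumption: the flag is set once a field is found
    return found


def old_check(hyperdata: dict) -> bool:
    # Old check from the diff: fails when "indexed" is not a key.
    return hyperdata["indexed"] is True
```

For example, `old_check({"language_iso3": "fra"})` raises `KeyError`, whereas `first_language_field` on the same input returns `"language_iso3"`.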
install/gargamelle/requirements.txt
...
@@ -14,6 +14,7 @@ html5lib==0.9999999
 python-igraph>=0.7.1
 jdatetime==1.7.2
 kombu==3.0.33 # messaging
+langdetect==1.0.6 #detectinglanguage
 nltk==3.1
 numpy==1.10.4
 psycopg2==2.6.1
...