Commit 4c00539c authored by delanoe

Merge branch 'testing' into testing-share

parents 3d3724d1 9a6805bf
......@@ -2,6 +2,12 @@
* Guided Tour
* Sources form: highlighting of crawlers
## Version 3.0.6.8
* REPEC Crawler (connection to https://multivac.iscpif.fr)
* HAL Crawler (connection to https://hal.archives-ouvertes.fr/)
* New Graph Feature: color nodes by growth
## Version 3.0.6.4
* COOC SQL improved
......
# Definitions and notation for the documentation (!= python notation)
## Node
The table (nodes) is a list of nodes: [Node]
Each Node has:
- a typename
- a parent_id
- a name
### Each Node has a parent_id
Node A
├── Node B
└── Node C
If Node A is the parent of Node B and Node C,
then NodeA.id == NodeB.parent_id == NodeC.parent_id.
### Each Node has a typename
Notation: Node[foo](bar) is a Node with typename "foo" and name "bar".
Then:
- Node[project] is a project.
- Node[corpus] is a corpus.
- Node[document] is a document.
### Each Node has a typename and a parent
Node[user](name)
├── Node[project](myProject1)
│   ├── Node[corpus](myCorpus1)
│   ├── Node[corpus](myCorpus2)
│   └── Node[corpus](myCorpus3)
└── Node[project](myProject2)
/!\ There are 3 ways to manage the rights of a Node (a traversal sketch follows below):
1) Node[user] is a folder containing all of the user's projects, corpora and
documents (i.e. Node[user] is the parent_id of its children).
2) Each Node has a user_id (mainly used today).
3) Rights management for groups (already implemented but not
used, since it is not connected to the frontend).
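A minimal traversal sketch, assuming the session and Node model used elsewhere in this codebase (gargantext.util.db, gargantext.models.nodes) and that parent_id and typename are queryable columns:
``` python
from gargantext.util.db import session
from gargantext.models.nodes import Node

def children_of(node_id, typename=None):
    '''Return the child Nodes of node_id, optionally restricted to a typename.'''
    q = session.query(Node).filter(Node.parent_id == node_id)
    if typename is not None:
        q = q.filter(Node.typename == typename)
    return q.all()

# e.g. all corpora of a project:
# corpora = children_of(project.id, typename='CORPUS')
```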
## Global Parameters
The global user is Gargantua (a Node with typename user).
This node is the parent of the other Nodes that store global parameters.
Node[user](gargantua) (gargantua.id == Node[user].user_id)
├── Node[TFIDF-Global](global) : without group
│   ├── Node[tfidf](database1)
│   ├── Node[tfidf](database2)
│   └── Node[tfidf](database3)
└── Node[anotherMetric](global)
## NodeNgram
NodeNgram is a relation between a Node and an ngram:
- document and ngrams
- metrics and ngrams (the position of the metrics node indicates the
context)
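A heavily hedged sketch of reading this relation; the import path and the field names node_id, ngram_id and weight are assumptions, since the NodeNgram model is not shown in this diff:
``` python
from gargantext.util.db import session
from gargantext.models import NodeNgram   # assumed import path

# ngrams attached to one document node (field names are assumptions):
for row in session.query(NodeNgram).filter(NodeNgram.node_id == document.id):
    print(row.ngram_id, row.weight)
```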
# Community Parameters
# User Parameters
......@@ -8,6 +8,9 @@ Gargantext is a web platform to explore your corpora using text-mining[...](about.md)
* [Take a tour](demo.md) of the different features offered by Gargantext
## Architecture
* [Architecture](architecture.md) Architecture of Gargantext
## Need some help?
Ask the community at:
......
* Create the user gargantua
The main user of Gargantext is Gargantua (a role for Pantagruel is coming soon)!
``` bash
sudo adduser --disabled-password --gecos "" gargantua
```
* Create the directories you need
In this example, the gargantext package will be installed in /srv/
``` bash
for dir in "/srv/gargantext"
"/srv/gargantext_lib"
"/srv/gargantext_static"
"/srv/gargantext_media"
"/srv/env_3-5"; do
sudo mkdir -p $dir ;
sudo chown gargantua:gargantua $dir ;
done
```
You should see:
```bash
$ tree /srv
/srv
├── env_3-5
├── gargantext
├── gargantext_lib
├── gargantext_media
└── gargantext_static
```
* Get the main libraries
Download and uncompress the archive, then give the main user access to it.
Please be patient: due to the size of the library packages (27 GB),
this step can take a while.
``` bash
wget http://dl.gargantext.org/gargantext_lib.tar.bz2 \
&& tar xvjf gargantext_lib.tar.bz2 -C /srv/gargantext_lib \
&& sudo chown -R gargantua:gargantua /srv/gargantext_lib \
&& echo "Libs installed"
```
* Get the source code of Gargantext
by cloning the gargantext repository
``` bash
git clone ssh://gitolite@delanoe.org:1979/gargantext /srv/gargantext \
&& cd /srv/gargantext \
&& git fetch origin refactoring \
&& git checkout refactoring
```
TODO(soon): git clone https://gogs.iscpif.fr/gargantext.git
See the [next steps of installation procedure](install.md#Install)
tools/manual_install.md
\ No newline at end of file
......@@ -181,8 +181,6 @@ def get_tagger(lang):
return tagger()
RESOURCETYPES = [
{ "type": 1,
'name': 'Europresse',
......@@ -242,19 +240,44 @@ RESOURCETYPES = [
'crawler': None,
},
{ "type": 9,
"name": 'SCOAP [XML]',
"name": 'SCOAP [API/XML]',
"parser": "CernParser",
"format": 'MARC21',
'file_formats':["zip","xml"],
"crawler": "CernCrawler",
},
# { "type": 10,
# "name": 'REPEC [RIS]',
# "parser": "RISParser",
# "format": 'RIS',
# 'file_formats':["zip","ris", "txt"],
# "crawler": None,
# },
#
{ "type": 10,
"name": 'REPEC [RIS]',
"parser": "RISParser",
"format": 'RIS',
'file_formats':["zip","ris", "txt"],
"crawler": None,
"name": 'REPEC [MULTIVAC API]',
"parser": "MultivacParser",
"format": 'JSON',
'file_formats':["zip","json"],
"crawler": "MultivacCrawler",
},
{ "type": 11,
"name": 'HAL [API]',
"parser": "HalParser",
"format": 'JSON',
'file_formats':["zip","json"],
"crawler": "HalCrawler",
},
{ "type": 12,
"name": 'ISIDORE [SPARQLE API /!\ BETA]',
"parser": "IsidoreParser",
"format": 'JSON',
'file_formats':["zip","json"],
"crawler": "IsidoreCrawler",
},
]
#shortcut for resources declaration in template
PARSERS = [(n["type"],n["name"]) for n in RESOURCETYPES if n["parser"] is not None]
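A small usage sketch for these declarations: get_resource and load_crawler are imported by the moissonneurs views later in this diff, but the lookup below is illustrative, not code from this commit.
``` python
from gargantext.constants import get_resource, load_crawler

# Look up the HAL resource declared above (type 11) and
# instantiate its crawler class from the "crawler" field:
source  = get_resource(11)          # -> the {"type": 11, "name": 'HAL [API]', ...} dict
crawler = load_crawler(source)()    # -> a HalCrawler instance
print(source["name"], source["format"])
```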
......
......@@ -28,19 +28,20 @@ import graph.urls
import moissonneurs.urls
urlpatterns = [ url(r'^admin/' , admin.site.urls )
, url(r'^api/' , include( gargantext.views.api.urls ) )
, url(r'^' , include( gargantext.views.pages.urls ) )
urlpatterns = [ url(r'^admin/' , admin.site.urls )
, url(r'^api/' , include( gargantext.views.api.urls ) )
, url(r'^' , include( gargantext.views.pages.urls ) )
, url(r'^favicon.ico$', Redirect.as_view( url=static.url('favicon.ico')
, permanent=False), name="favicon")
, permanent=False), name="favicon" )
# Module Graph
, url(r'^' , include( graph.urls ) )
, url(r'^' , include( graph.urls ) )
# Module Annotation
# tempo: unchanged doc-annotations routes --
, url(r'^annotations/', include( annotations_urls ) )
, url(r'^projects/(\d+)/corpora/(\d+)/documents/(\d+)/(focus=[0-9,]+)?$', annotations_main_view)
, url(r'^annotations/', include( annotations_urls ) )
, url(r'^projects/(\d+)/corpora/(\d+)/documents/(\d+)/(focus=[0-9,]+)?$'
, annotations_main_view)
# Module Scrapers (Moissonneurs in French)
, url(r'^moissonneurs/' , include( moissonneurs.urls ) )
......
......@@ -4,7 +4,7 @@
# ***** CERN Scraper *****
# ****************************
# Author:c24b
# Date: 27/05/2015
# Date: 27/05/2016
import hmac, hashlib
import requests
import os
......@@ -96,10 +96,12 @@ class CernCrawler(Crawler):
print(self.results_nb, "res")
#self.generate_urls()
return(self.ids)
def generate_urls(self):
''' generate raw urls of ONE record'''
self.urls = ["http://repo.scoap3.org/record/%i/export/xm?ln=en" %rid for rid in self.ids]
return self.urls
def fetch_records(self, ids):
''' for NEXT time'''
raise NotImplementedError
......
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# **** HAL Scraper ***
# ****************************
# CNRS COPYRIGHTS
# SEE LEGAL LICENCE OF GARGANTEXT.ORG
from ._Crawler import *
import json
from gargantext.constants import UPLOAD_DIRECTORY
from math import trunc
from gargantext.util.files import save
class HalCrawler(Crawler):
''' HAL API CLIENT'''
def __init__(self):
# Main EndPoints
self.BASE_URL = "https://api.archives-ouvertes.fr"
self.API_URL = "search"
# Final EndPoints
# TODO : Change the endpoint according to the type of database
self.URL = self.BASE_URL + "/" + self.API_URL
self.status = []
def __format_query__(self, query=None):
'''format the query'''
#search_field="title_t"
search_field="abstract_t"
return (search_field + ":" + "(" + query + ")")
def _get(self, query, fromPage=1, count=10, lang=None):
# Parameters
fl = """ title_s
, abstract_s
, submittedDate_s
, journalDate_s
, authFullName_s
, uri_s
, isbn_s
, issue_s
, journalPublisher_s
"""
#, authUrl_s
#, type_s
wt = "json"
querystring = { "q" : query
, "rows" : count
, "start" : fromPage
, "fl" : fl
, "wt" : wt
}
# Specify Headers
headers = { "cache-control" : "no-cache" }
# Do Request and get response
response = requests.request( "GET"
, self.URL
, headers = headers
, params = querystring
)
#print(querystring)
# Validation: 200 if OK, else raise ValueError
if response.status_code == 200:
charset = ( response.headers["Content-Type"]
.split("; ")[1]
.split("=" )[1]
)
return (json.loads(response.content.decode(charset)))
else:
raise ValueError(response.status_code, response.reason)
def scan_results(self, query):
'''
scan_results : Returns the number of results
Query String -> Int
'''
self.results_nb = 0
total = ( self._get(query)
.get("response", {})
.get("numFound" , 0)
)
self.results_nb = total
return self.results_nb
def download(self, query):
downloaded = False
self.status.append("fetching results")
corpus = []
paging = 100
self.query_max = self.scan_results(query)
#print("self.query_max : %s" % self.query_max)
if self.query_max > QUERY_SIZE_N_MAX:
msg = "Invalid sample size N = %i (max = %i)" % ( self.query_max
, QUERY_SIZE_N_MAX
)
print("ERROR (scrap: Multivac d/l ): " , msg)
self.query_max = QUERY_SIZE_N_MAX
#for page in range(1, trunc(self.query_max / 100) + 2):
for page in range(0, self.query_max, paging):
print("Downloading page %s to %s results" % (page, paging))
docs = (self._get(query, fromPage=page, count=paging)
.get("response", {})
.get("docs" , [])
)
for doc in docs:
corpus.append(doc)
self.path = save( json.dumps(corpus).encode("utf-8")
, name='HAL.json'
, basedir=UPLOAD_DIRECTORY
)
downloaded = True
return downloaded
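A hypothetical interactive session with this crawler, assuming requests and QUERY_SIZE_N_MAX are available through the starred _Crawler import as the code above expects:
``` python
crawler = HalCrawler()
n = crawler.scan_results("complex systems")   # number of matching HAL records
print(n)
if crawler.download("complex systems"):
    print("corpus saved to", crawler.path)    # JSON file under UPLOAD_DIRECTORY
```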
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# **** ISIDORE Scraper ***
# ****************************
# CNRS COPYRIGHTS
# SEE LEGAL LICENCE OF GARGANTEXT.ORG
from ._Crawler import *
import json
from gargantext.constants import UPLOAD_DIRECTORY
from math import trunc
from gargantext.util.files import save
from gargantext.util.crawlers.sparql.bool2sparql import bool2sparql, isidore
class IsidoreCrawler(Crawler):
''' ISIDORE SPARQL API CLIENT'''
def __init__(self):
# Main EndPoints
self.BASE_URL = "https://www.rechercheisidore.fr"
self.API_URL = "sparql"
# Final EndPoints
# TODO : Change the endpoint according to the type of database
self.URL = self.BASE_URL + "/" + self.API_URL
self.status = []
def __format_query__(self, query=None, count=False, offset=None, limit=None):
'''format the query'''
return (bool2sparql(query, count=count, offset=offset, limit=limit))
def _get(self, query, offset=0, limit=None, lang=None):
'''Get the data from the SPARQL endpoint'''
return isidore(query, count=False, offset=offset, limit=limit)
def scan_results(self, query):
'''
scan_results : Returns the number of results
Query String -> Int
'''
self.results_nb = [n for n in isidore(query, count=True)][0]
return self.results_nb
def download(self, query):
downloaded = False
self.status.append("fetching results")
corpus = []
limit = 1000
self.query_max = self.scan_results(query)
print("self.query_max : %s" % self.query_max)
if self.query_max > QUERY_SIZE_N_MAX:
msg = "Invalid sample size N = %i (max = %i)" % ( self.query_max
, QUERY_SIZE_N_MAX
)
print("WARNING (scrap: ISIDORE d/l ): " , msg)
self.query_max = QUERY_SIZE_N_MAX
for offset in range(0, self.query_max, limit):
print("Downloading result %s to %s" % (offset, self.query_max))
for doc in isidore(query, offset=offset, limit=limit) :
corpus.append(doc)
self.path = save( json.dumps(corpus).encode("utf-8")
, name='ISIDORE.json'
, basedir=UPLOAD_DIRECTORY
)
downloaded = True
return downloaded
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# **** MULTIVAC Scraper ***
# ****************************
# CNRS COPYRIGHTS
# SEE LEGAL LICENCE OF GARGANTEXT.ORG
from ._Crawler import *
import json
......
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# **** MULTIVAC Scraper ***
# ****************************
# CNRS COPYRIGHTS
# SEE LEGAL LICENCE OF GARGANTEXT.ORG
from ._Crawler import *
import json
from gargantext.settings import API_TOKENS
from gargantext.constants import UPLOAD_DIRECTORY
from math import trunc
from gargantext.util.files import save
class MultivacCrawler(Crawler):
''' Multivac API CLIENT'''
def __init__(self):
self.apikey = API_TOKENS["MULTIVAC"]
# Main EndPoints
self.BASE_URL = "https://api.iscpif.fr/v2"
self.API_URL = "pvt/economy/repec/search"
# Final EndPoints
# TODO : Change the endpoint according to the type of database
self.URL = self.BASE_URL + "/" + self.API_URL
self.status = []
def __format_query__(self, query=None):
'''format the query'''
return None
def _get(self, query, fromPage=1, count=10, lang=None):
# Parameters
querystring = { "q" : query
, "count" : count
, "from" : fromPage
, "api_key" : API_TOKENS["MULTIVAC"]["APIKEY"]
}
if lang is not None:
querystring["lang"] = lang
# Specify Headers
headers = { "cache-control" : "no-cache" }
# Do Request and get response
response = requests.request( "GET"
, self.URL
, headers = headers
, params = querystring
)
#print(querystring)
# Validation: 200 if OK, else raise ValueError
if response.status_code == 200:
charset = ( response.headers["Content-Type"]
.split("; ")[1]
.split("=" )[1]
)
return (json.loads(response.content.decode(charset)))
else:
raise ValueError(response.status_code, response.reason)
def scan_results(self, query):
'''
scan_results : Returns the number of results
Query String -> Int
'''
self.results_nb = 0
total = ( self._get(query)
.get("results", {})
.get("total" , 0)
)
self.results_nb = total
return self.results_nb
def download(self, query):
downloaded = False
self.status.append("fetching results")
corpus = []
paging = 100
self.query_max = self.scan_results(query)
#print("self.query_max : %s" % self.query_max)
if self.query_max > QUERY_SIZE_N_MAX:
msg = "Invalid sample size N = %i (max = %i)" % ( self.query_max
, QUERY_SIZE_N_MAX
)
print("ERROR (scrap: Multivac d/l ): " , msg)
self.query_max = QUERY_SIZE_N_MAX
for page in range(1, trunc(self.query_max / 100) + 2):
print("Downloading page %s to %s results" % (page, paging))
docs = (self._get(query, fromPage=page, count=paging)
.get("results", {})
.get("hits" , [])
)
for doc in docs:
corpus.append(doc)
self.path = save( json.dumps(corpus).encode("utf-8")
, name='Multivac.json'
, basedir=UPLOAD_DIRECTORY
)
downloaded = True
return downloaded
# Scrapers config
QUERY_SIZE_N_MAX = 1000
from gargantext.constants import get_resource
from gargantext.constants import get_resource, QUERY_SIZE_N_MAX
from gargantext.util.scheduling import scheduled
from gargantext.util.db import session
from requests_futures.sessions import FuturesSession
......@@ -18,31 +18,34 @@ class Crawler:
# the name of the corpus
# that will be built in case of internal file parsing
self.record = record
self.name = record["corpus_name"]
self.project_id = record["project_id"]
self.user_id = record["user_id"]
self.resource = record["source"]
self.type = get_resource(self.resource)
self.query = record["query"]
self.record = record
self.name = record["corpus_name"]
self.project_id = record["project_id"]
self.user_id = record["user_id"]
self.resource = record["source"]
self.type = get_resource(self.resource)
self.query = record["query"]
#format the sampling
self.n_last_years = 5
self.YEAR = date.today().year
self.YEAR = date.today().year
# not pretty
# but the easy version
self.MONTH = str(date.today().month)
self.MONTH = str(date.today().month)
if len(self.MONTH) == 1:
self.MONTH = "0"+self.MONTH
self.MAX_RESULTS = 1000
self.MAX_RESULTS = QUERY_SIZE_N_MAX
try:
self.results_nb = int(record["count"])
except KeyError:
# does not exist yet
self.results_nb = 0
try:
self.webEnv = record["webEnv"]
self.webEnv = record["webEnv"]
self.queryKey = record["queryKey"]
self.retMax = record["retMax"]
self.retMax = record["retMax"]
except KeyError:
# does not exist yet
self.queryKey = None
......@@ -67,6 +70,7 @@ class Crawler:
if self.download():
self.create_corpus()
return self.corpus_id
def get_sampling_dates(self):
'''Create a sample list of min and max dates based on Y and M
for N_LAST_YEARS results'''
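For reference, a hypothetical record dict carrying the keys that __init__ above actually reads; the values are invented for illustration:
``` python
record = { "corpus_name" : "My HAL corpus"    # name of the corpus Node to create
         , "project_id"  : 42                 # parent project Node id
         , "user_id"     : 1
         , "source"      : 11                 # resource type, resolved via get_resource()
         , "query"       : "complex systems"
         # optional keys read in the try/except blocks:
         # "count", "webEnv", "queryKey", "retMax"
         }
```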
......
import subprocess
import re
from .sparql import Service
#from sparql import Service
def bool2sparql(rawQuery, count=False, offset=None, limit=None):
"""
bool2sparql :: String -> Bool -> Int -> String
Translate a boolean query into a SPARQL request.
You need to build the bool2sparql binary first.
See: https://github.com/delanoe/bool2sparql
"""
query = re.sub("\"", "\'", rawQuery)
bashCommand = ["/srv/gargantext/gargantext/util/crawlers/sparql/bool2sparql-exe","-q",query]
if count is True :
bashCommand.append("-c")
else :
if offset is not None :
for command in ["--offset", str(offset)] :
bashCommand.append(command)
if limit is not None :
for command in ["--limit", str(limit)] :
bashCommand.append(command)
process = subprocess.Popen(bashCommand, stdout=subprocess.PIPE)
output, error = process.communicate()
if error is not None :
raise(error)
else :
print(output)
return(output.decode("utf-8"))
def isidore(query, count=False, offset=None, limit=None):
"""
isidore :: String -> Bool -> Int -> Either (Dict String) Int
Use the sparql client either to search (yield documents) or to scan (yield the count).
"""
query = bool2sparql(query, count=count, offset=offset, limit=limit)
go = Service("https://www.rechercheisidore.fr/sparql/", "utf-8", "GET")
results = go.query(query)
if count is False:
for r in results:
doc = dict()
doc_values = dict()
doc["url"], doc["title"], doc["date"], doc["abstract"], doc["source"] = r
for k in doc.keys():
doc_values[k] = doc[k].value
yield(doc_values)
else :
count = []
for r in results:
n, = r
count.append(int(n.value))
yield count[0]
def test():
query = "delanoe"
limit = 100
offset = 10
for d in isidore(query, offset=offset, limit=limit):
print(d["date"])
#print([n for n in isidore(query, count=True)])
if __name__ == '__main__':
test()
This diff is collapsed.
......@@ -8,29 +8,12 @@ import random
_members = [
{ 'first_name' : 'Constance', 'last_name' : 'de Quatrebarbes',
'mail' : '4barbesATgmail.com',
'website' : 'http://c24b.github.io/',
'picture' : 'constance.jpg',
'role' : 'developer'},
{ 'first_name' : 'David', 'last_name' : 'Chavalarias',
'mail' : 'david.chavalariasATiscpif.fr',
'website' : 'http://chavalarias.com',
'picture' : 'david.jpg',
'role':'principal investigator'},
# { 'first_name' : 'Elias', 'last_name' : 'Showk',
# 'mail' : '',
# 'website' : 'https://github.com/elishowk',
# 'picture' : '', 'role' : 'developer'},
{ 'first_name' : 'Mathieu', 'last_name' : 'Rodic',
'mail' : '',
'website' : 'http://rodic.fr',
'picture' : 'mathieu.jpg',
'role' : 'developer'},
{ 'first_name' : 'Samuel', 'last_name' : 'Castillo J.',
'mail' : 'kaisleanATgmail.com',
'website' : 'http://www.pksm3.droppages.com',
......@@ -43,12 +26,6 @@ _members = [
'picture' : 'maziyar.jpg',
'role' : 'developer'},
{ 'first_name' : 'Romain', 'last_name' : 'Loth',
'mail' : '',
'website' : 'http://iscpif.fr',
'picture' : 'romain.jpg',
'role' : 'developer'},
{ 'first_name' : 'Alexandre', 'last_name' : 'Delanoë',
'mail' : 'alexandre+gargantextATdelanoe.org',
'website' : 'http://alexandre.delanoe.org',
......@@ -59,9 +36,34 @@ _members = [
# copy-paste the line above and write your informations please
]
_membersPast = [
{ 'first_name' : 'Constance', 'last_name' : 'de Quatrebarbes',
'mail' : '4barbesATgmail.com',
'website' : 'http://c24b.github.io/',
'picture' : 'constance.jpg',
'role' : 'developer'},
{ 'first_name' : 'Mathieu', 'last_name' : 'Rodic',
'mail' : '',
'website' : 'http://rodic.fr',
'picture' : 'mathieu.jpg',
'role' : 'developer'},
{ 'first_name' : 'Romain', 'last_name' : 'Loth',
'mail' : '',
'website' : 'http://iscpif.fr',
'picture' : 'romain.jpg',
'role' : 'developer'},
{ 'first_name' : 'Elias', 'last_name' : 'Showk',
'mail' : '',
'website' : 'https://github.com/elishowk',
'picture' : '', 'role' : 'developer'},
]
_institutions = [
{ 'name' : 'Mines ParisTech', 'website' : 'http://mines-paristech.fr', 'picture' : 'mines.png', 'funds':''},
{ 'name' : 'Institut Pasteur', 'website' : 'http://www.pasteur.fr', 'picture' : 'pasteur.png', 'funds':''},
#{ 'name' : 'Institut Pasteur', 'website' : 'http://www.pasteur.fr', 'picture' : 'pasteur.png', 'funds':''},
{ 'name' : 'EHESS', 'website' : 'http://www.ehess.fr', 'picture' : 'ehess.png', 'funds':''},
#{ 'name' : '', 'website' : '', 'picture' : '', 'funds':''},
# copy paste the line above and write your informations please
......@@ -74,9 +76,10 @@ _labs = [
]
_grants = [
{ 'name' : 'Institut Mines Telecom', 'website' : 'https://www.imt.fr', 'picture' : 'IMT.jpg', 'funds':''},
{ 'name' : 'Forccast', 'website' : 'http://forccast.hypotheses.org/', 'picture' : 'forccast.png', 'funds':''},
{ 'name' : 'Mastodons', 'website' : 'http://www.cnrs.fr/mi/spip.php?article53&lang=fr', 'picture' : 'mastodons.png', 'funds':''},
{ 'name' : 'ADEME', 'website' : 'http://www.ademe.fr', 'picture' : 'ademe.png', 'funds':''},
#{ 'name' : 'ADEME', 'website' : 'http://www.ademe.fr', 'picture' : 'ademe.png', 'funds':''},
#{ 'name' : '', 'website' : '', 'picture' : '', 'funds':''},
# copy paste the line above and write your informations please
]
......@@ -86,6 +89,10 @@ def members():
random.shuffle(_members)
return _members
def membersPast():
random.shuffle(_membersPast)
return _membersPast
def institutions():
random.shuffle(_institutions)
return _institutions
......
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# **** HAL Parser ***
# ****************************
# CNRS COPYRIGHTS 2017
# SEE LEGAL LICENCE OF GARGANTEXT.ORG
from ._Parser import Parser
from datetime import datetime
import json
class HalParser(Parser):
def parse(self, filebuf):
'''
parse :: FileBuff -> [Hyperdata]
'''
contents = filebuf.read().decode("UTF-8")
data = json.loads(contents)
filebuf.close()
json_docs = data
hyperdata_list = []
hyperdata_path = { "id" : "isbn_s"
, "title" : "title_s"
, "abstract" : "abstract_s"
, "source" : "journalPublisher_s"
, "url" : "uri_s"
, "authors" : "authFullName_s"
}
uris = set()
for doc in json_docs:
hyperdata = {}
for key, path in hyperdata_path.items():
field = doc.get(path, "NOT FOUND")
if isinstance(field, list):
hyperdata[key] = ", ".join(field)
else:
hyperdata[key] = field
if hyperdata["url"] in uris:
print("Document already parsed")
else:
uris.add(hyperdata["url"])
# hyperdata["authors"] = ", ".join(
# [ p.get("person", {})
# .get("name" , "")
#
# for p in doc.get("hasauthor", [])
# ]
# )
#
maybeDate = doc.get("submittedDate_s", None)
if maybeDate is not None:
date = datetime.strptime(maybeDate, "%Y-%m-%d %H:%M:%S")
else:
date = datetime.now()
hyperdata["publication_date"] = date
hyperdata["publication_year"] = str(date.year)
hyperdata["publication_month"] = str(date.month)
hyperdata["publication_day"] = str(date.day)
hyperdata_list.append(hyperdata)
return hyperdata_list
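A hypothetical smoke test for this parser; the sample record uses the HAL field names from hyperdata_path above, and it assumes HalParser can be instantiated without arguments (the base Parser signature is not shown in this diff):
``` python
import io, json

sample = [ { "title_s"            : ["A sample title"]
           , "abstract_s"         : ["A sample abstract"]
           , "uri_s"              : "https://hal.archives-ouvertes.fr/hal-0000001"
           , "authFullName_s"     : ["Ada Lovelace", "Alan Turing"]
           , "journalPublisher_s" : "Some Publisher"
           , "submittedDate_s"    : "2017-01-01 00:00:00"
           } ]
buf  = io.BytesIO(json.dumps(sample).encode("UTF-8"))
docs = HalParser().parse(buf)
print(docs[0]["title"], "|", docs[0]["publication_year"])
```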
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# **** ISIDORE Parser ***
# ****************************
# CNRS COPYRIGHTS
# SEE LEGAL LICENCE OF GARGANTEXT.ORG
from ._Parser import Parser
from datetime import datetime
import json
class IsidoreParser(Parser):
def parse(self, filebuf):
'''
parse :: FileBuff -> [Hyperdata]
'''
contents = filebuf.read().decode("UTF-8")
data = json.loads(contents)
filebuf.close()
json_docs = data
hyperdata_list = []
hyperdata_path = { "title" : "title"
, "abstract" : "abstract"
, "authors" : "authors"
, "url" : "url"
, "source" : "source"
}
uniq_id = set()
for doc in json_docs:
hyperdata = {}
for key, path in hyperdata_path.items():
hyperdata[key] = doc.get(path, "")
if hyperdata["url"] not in uniq_id:
# Removing the duplicates implicitly
uniq_id.add(hyperdata["url"])
# Source is the Journal Name
hyperdata["source"] = doc.get("source", "ISIDORE Database")
# Working on the date
maybeDate = doc.get("date" , None)
if maybeDate is None:
date = datetime.now()
else:
try :
# Model of date: 1958-01-01T00:00:00
date = datetime.strptime(maybeDate, '%Y-%m-%dT%H:%M:%S')
except :
print("FIX DATE ISIDORE please >%s<" % maybeDate)
date = datetime.now()
hyperdata["publication_date"] = date
hyperdata["publication_year"] = str(date.year)
hyperdata["publication_month"] = str(date.month)
hyperdata["publication_day"] = str(date.day)
hyperdata_list.append(hyperdata)
return hyperdata_list
......@@ -13,20 +13,21 @@ class ISTexParser(Parser):
hyperdata_list = []
hyperdata_path = {
"id" : "id",
"source" : 'corpusName',
"title" : 'title',
"source" : "corpusName",
"title" : "title",
"genre" : "genre",
"language_iso3" : 'language',
"doi" : 'doi',
"host" : 'host',
"publication_date" : 'publicationDate',
"abstract" : 'abstract',
"language_iso3" : "language",
"doi" : "doi",
"host" : "host",
"publication_date" : "publicationDate",
"abstract" : "abstract",
# "authors" : 'author',
"authorsRAW" : 'author',
"authorsRAW" : "author",
#"keywords" : "keywords"
}
suma = 0
for json_doc in json_docs:
hyperdata = {}
......@@ -103,7 +104,7 @@ class ISTexParser(Parser):
RealDate = RealDate[0]
# print( RealDate ," | length:",len(RealDate))
Decision=""
Decision = True
if len(RealDate)>4:
if len(RealDate)>8:
try: Decision = datetime.strptime(RealDate, '%Y-%b-%d').date()
......
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# **** MULTIVAC Parser ***
# ****************************
# CNRS COPYRIGHTS
# SEE LEGAL LICENCE OF GARGANTEXT.ORG
from ._Parser import Parser
from datetime import datetime
import json
class MultivacParser(Parser):
def parse(self, filebuf):
'''
parse :: FileBuff -> [Hyperdata]
'''
contents = filebuf.read().decode("UTF-8")
data = json.loads(contents)
filebuf.close()
json_docs = data
hyperdata_list = []
hyperdata_path = { "id" : "id"
, "title" : "title"
, "abstract" : "abstract"
, "type" : "type"
}
for json_doc in json_docs:
hyperdata = {}
doc = json_doc["_source"]
for key, path in hyperdata_path.items():
hyperdata[key] = doc.get(path, "")
hyperdata["source"] = doc.get("serial" , {})\
.get("journaltitle", "REPEC Database")
try:
hyperdata["url"] = doc.get("file", {})\
.get("url" , "")
except:
pass
hyperdata["authors"] = ", ".join(
[ p.get("person", {})
.get("name" , "")
for p in doc.get("hasauthor", [])
]
)
year = doc.get("serial" , {})\
.get("issuedate", None)
if year == "Invalide date":
year = doc.get("issuedate" , None)
if year is None:
date = datetime.now()
else:
try:
date = datetime.strptime(year, '%Y')
except:
print("FIX DATE MULTIVAC REPEC %s" % year)
date = datetime.now()
hyperdata["publication_date"] = date
hyperdata["publication_year"] = str(date.year)
hyperdata["publication_month"] = str(date.month)
hyperdata["publication_day"] = str(date.day)
hyperdata_list.append(hyperdata)
return hyperdata_list
......@@ -78,7 +78,7 @@ class PubmedParser(Parser):
if "publication_month" in hyperdata: PubmedDate+=" "+hyperdata["publication_month"]
if "publication_day" in hyperdata: PubmedDate+=" "+hyperdata["publication_day"]
Decision=""
Decision=True
if len(RealDate)>4:
if len(RealDate)>8:
try: Decision = datetime.strptime(RealDate, '%Y %b %d').date()
......
......@@ -175,7 +175,6 @@ def parse(corpus):
hyperdata = hyperdata,
)
session.add(document)
session.commit()
documents_count += 1
if pending_add_error_stats:
......@@ -190,6 +189,9 @@ def parse(corpus):
session.add(corpus)
session.commit()
# Commit any pending document
session.commit()
# update info about the resource
resource['extracted'] = True
#print( "resource n°",i, ":", d, "docs inside this file")
......
......@@ -47,7 +47,8 @@ def about(request):
context = {
'user': request.user,
'date': datetime.datetime.now(),
'team': credits.members(),
'team' : credits.members(),
'teamPast': credits.membersPast(),
'institutions': credits.institutions(),
'labos': credits.labs(),
'grants': credits.grants(),
......
#!/bin/bash
### Update and install base dependencies
echo "############ DEBIAN LIBS ###############"
apt-get update && \
......@@ -32,26 +34,26 @@ update-locale LC_ALL=fr_FR.UTF-8
libxml2-dev xml-core libgfortran-6-dev \
libpq-dev \
python3.5 \
python3-dev \
python3.5-dev \
python3-six python3-numpy python3-setuptools \
python3-numexpr \
python3-pip \
libxml2-dev libxslt-dev zlib1g-dev
libxml2-dev libxslt-dev zlib1g-dev libigraph0-dev
#libxslt1-dev
UPDATE AND CLEAN
# UPDATE AND CLEAN
apt-get update && apt-get autoclean
# NB: removing /var/lib avoids significantly filling up the /var/ folder on your native system
########################################################################
### PYTHON ENVIRONMENT (as ROOT)
########################################################################
#adduser --disabled-password --gecos "" gargantua
cd /srv/
pip3 install virtualenv
virtualenv /srv/env_3-5
virtualenv /srv/env_3-5 -p /usr/bin/python3.5
echo 'alias venv="source /srv/env_3-5/bin/activate"' >> ~/.bashrc
# CONFIG FILES
......@@ -60,9 +62,9 @@ update-locale LC_ALL=fr_FR.UTF-8
source /srv/env_3-5/bin/activate && pip3 install -r /srv/gargantext/install/gargamelle/requirements.txt && \
pip3 install git+https://github.com/zzzeek/sqlalchemy.git@rel_1_1 && \
python3 -m nltk.downloader averaged_perceptron_tagger -d /usr/local/share/nltk_data
chown gargantua:gargantua -R /srv/env_3-5
#######################################################################
## POSTGRESQL DATA (as ROOT)
#######################################################################
......
......@@ -14,7 +14,7 @@ echo "::::: DJANGO :::::"
/bin/su gargantua -c 'source /env_3-5/bin/activate &&\
su gargantua -c 'source /srv/env_3-5/bin/activate &&\
echo "Activated env" &&\
/srv/gargantext/manage.py makemigrations &&\
/srv/gargantext/manage.py migrate && \
......@@ -24,4 +24,4 @@ echo "::::: DJANGO :::::"
/srv/gargantext/dbmigrate.py && \
/srv/gargantext/manage.py createsuperuser'
/usr/sbin/service postgresql stop
service postgresql stop
##
# You should look at the following URL's in order to grasp a solid understanding
# of Nginx configuration files in order to fully unleash the power of Nginx.
# http://wiki.nginx.org/Pitfalls
# http://wiki.nginx.org/QuickStart
# http://wiki.nginx.org/Configuration
#
# Generally, you will want to move this file somewhere, and start with a clean
# file but keep this around for reference. Or just disable in sites-enabled.
#
# Please see /usr/share/doc/nginx-doc/examples/ for more detailed examples.
##
# the upstream component nginx needs to connect to
upstream gargantext {
server unix:///tmp/gargantext.sock; # for a file socket
#server 127.0.0.1:8001; # for a web port socket (we'll use this first)
}
# Default server configuration
#
server {
listen 80 default_server;
listen [::]:80 default_server;
# SSL configuration
#
# listen 443 ssl default_server;
# listen [::]:443 ssl default_server;
#
# Note: You should disable gzip for SSL traffic.
# See: https://bugs.debian.org/773332
#
# Read up on ssl_ciphers to ensure a secure configuration.
# See: https://bugs.debian.org/765782
#
# Self signed certs generated by the ssl-cert package
# Don't use them in a production server!
#
# include snippets/snakeoil.conf;
client_max_body_size 800M;
client_body_timeout 12;
client_header_timeout 12;
keepalive_timeout 15;
send_timeout 10;
root /var/www/html;
# Add index.php to the list if you are using PHP
#index index.html index.htm index.nginx-debian.html;
server_name _ stable.gargantext.org gargantext.org ;
# Django media
location /media {
alias /var/www/gargantext/media; # your Django project's media files - amend as required
}
location /static {
alias /srv/gargantext_static; # your Django project's static files - amend as required
}
# Finally, send all non-media requests to the Django server.
location / {
uwsgi_pass gargantext;
include uwsgi_params;
}
#access_log off;
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
}
server {
listen 80 ;
listen [::]:80;
server_name dl.gargantext.org ;
error_page 404 /index.html;
location / {
root /var/www/dl ;
proxy_set_header Host $host;
proxy_buffering off;
}
access_log /var/log/nginx/dl.gargantext.org-access.log;
error_log /var/log/nginx/dl.gargantext.org-error.log;
}
# try bottleneck
eventlet==0.20.1
amqp==1.4.9
anyjson==0.3.3
billiard==3.3.0.23
......
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# ***** HAL Crawler *****
# ****************************
# LICENCE: GARGANTEXT.org Licence
RESOURCE_TYPE_HAL = 11
from django.shortcuts import redirect, render
from django.http import Http404, HttpResponseRedirect \
, HttpResponseForbidden
from gargantext.constants import get_resource, load_crawler, QUERY_SIZE_N_MAX
from gargantext.models.nodes import Node
from gargantext.util.db import session
from gargantext.util.db_cache import cache
from gargantext.util.http import JsonHttpResponse
from gargantext.util.scheduling import scheduled
from gargantext.util.toolchain import parse_extract_indexhyperdata
from traceback import print_tb
def query( request):
'''get GlobalResults()'''
if request.method == "POST":
query = request.POST["query"]
source = get_resource(RESOURCE_TYPE_HAL)
if source["crawler"] is not None:
crawlerbot = load_crawler(source)()
#old raw way to get results_nb
results = crawlerbot.scan_results(query)
#ids = crawlerbot.get_ids(query)
print(results)
return JsonHttpResponse({"results_nb":crawlerbot.results_nb})
def save(request, project_id):
'''save'''
if request.method == "POST":
query = request.POST.get("query")
try:
N = int(request.POST.get("N"))
except:
N = 0
print(query, N)
#for next time
#ids = request.POST["ids"]
source = get_resource(RESOURCE_TYPE_HAL)
if N == 0:
raise Http404()
if N > QUERY_SIZE_N_MAX:
N = QUERY_SIZE_N_MAX
try:
project_id = int(project_id)
except ValueError:
raise Http404()
# do we have a valid project?
project = session.query( Node ).filter(Node.id == project_id).first()
if project is None:
raise Http404()
user = cache.User[request.user.id]
if not user.owns(project):
return HttpResponseForbidden()
# corpus node instantiation as a Django model
corpus = Node(
name = query,
user_id = request.user.id,
parent_id = project_id,
typename = 'CORPUS',
hyperdata = { "action" : "Scrapping data"
}
)
#download_file
crawler_bot = load_crawler(source)()
#for now no way to force downloading X records
#the long running command
filename = crawler_bot.download(query)
corpus.add_resource(
type = source["type"]
#, name = source["name"]
, path = crawler_bot.path
)
session.add(corpus)
session.commit()
#corpus_id = corpus.id
try:
scheduled(parse_extract_indexhyperdata)(corpus.id)
except Exception as error:
print('WORKFLOW ERROR')
print(error)
try:
print_tb(error.__traceback__)
except:
pass
# IMPORTANT ---------------------------------
# sanitize session after interrupted transact
session.rollback()
# --------------------------------------------
return render(
template_name = 'pages/projects/wait.html',
request = request,
context = {
'user' : request.user,
'project': project,
},
)
# non-POST requests are not supported
raise Http404()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# ***** ISIDORE Crawler *****
# ****************************
RESOURCE_TYPE_ISIDORE = 12
from django.shortcuts import redirect, render
from django.http import Http404, HttpResponseRedirect, HttpResponseForbidden
from gargantext.constants import get_resource, load_crawler, QUERY_SIZE_N_MAX
from gargantext.models.nodes import Node
from gargantext.util.db import session
from gargantext.util.db_cache import cache
from gargantext.util.http import JsonHttpResponse
from gargantext.util.scheduling import scheduled
from gargantext.util.toolchain import parse_extract_indexhyperdata
from traceback import print_tb
def query( request):
'''get GlobalResults()'''
if request.method == "POST":
query = request.POST["query"]
source = get_resource(RESOURCE_TYPE_ISIDORE)
if source["crawler"] is not None:
crawlerbot = load_crawler(source)()
#old raw way to get results_nb
results = crawlerbot.scan_results(query)
#ids = crawlerbot.get_ids(query)
return JsonHttpResponse({"results_nb":crawlerbot.results_nb})
def save(request, project_id):
'''save'''
if request.method == "POST":
query = request.POST.get("query")
try:
N = int(request.POST.get("N"))
except:
N = 0
print(query, N)
#for next time
#ids = request.POST["ids"]
source = get_resource(RESOURCE_TYPE_ISIDORE)
if N == 0:
raise Http404()
if N > QUERY_SIZE_N_MAX:
N = QUERY_SIZE_N_MAX
try:
project_id = int(project_id)
except ValueError:
raise Http404()
# do we have a valid project?
project = session.query( Node ).filter(Node.id == project_id).first()
if project is None:
raise Http404()
user = cache.User[request.user.id]
if not user.owns(project):
return HttpResponseForbidden()
# corpus node instantiation as a Django model
corpus = Node(
name = query,
user_id = request.user.id,
parent_id = project_id,
typename = 'CORPUS',
hyperdata = { "action" : "Scrapping data"
, "language_id" : "fr"
}
)
#download_file
crawler_bot = load_crawler(source)()
#for now no way to force downloading X records
#the long running command
filename = crawler_bot.download(query)
corpus.add_resource(
type = source["type"]
#, name = source["name"]
, path = crawler_bot.path
)
session.add(corpus)
session.commit()
#corpus_id = corpus.id
try:
scheduled(parse_extract_indexhyperdata)(corpus.id)
except Exception as error:
print('WORKFLOW ERROR')
print(error)
try:
print_tb(error.__traceback__)
except:
pass
# IMPORTANT ---------------------------------
# sanitize session after interrupted transact
session.rollback()
# --------------------------------------------
return render(
template_name = 'pages/projects/wait.html',
request = request,
context = {
'user' : request.user,
'project': project,
},
)
# non-POST requests are not supported
raise Http404()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ****************************
# ***** MULTIVAC Crawler *****
# ****************************
# LICENCE: GARGANTEXT.org Licence
RESOURCE_TYPE_MULTIVAC = 10
from django.shortcuts import redirect, render
from django.http import Http404, HttpResponseRedirect, HttpResponseForbidden
from gargantext.constants import get_resource, load_crawler, QUERY_SIZE_N_MAX
from gargantext.models.nodes import Node
from gargantext.util.db import session
from gargantext.util.db_cache import cache
from gargantext.util.http import JsonHttpResponse
from gargantext.util.scheduling import scheduled
from gargantext.util.toolchain import parse_extract_indexhyperdata
from traceback import print_tb
def query( request):
'''get GlobalResults()'''
if request.method == "POST":
query = request.POST["query"]
source = get_resource(RESOURCE_TYPE_MULTIVAC)
if source["crawler"] is not None:
crawlerbot = load_crawler(source)()
#old raw way to get results_nb
results = crawlerbot.scan_results(query)
#ids = crawlerbot.get_ids(query)
print(results)
return JsonHttpResponse({"results_nb":crawlerbot.results_nb})
def save(request, project_id):
'''save'''
if request.method == "POST":
query = request.POST.get("query")
try:
N = int(request.POST.get("N"))
except:
N = 0
print(query, N)
#for next time
#ids = request.POST["ids"]
source = get_resource(RESOURCE_TYPE_MULTIVAC)
if N == 0:
raise Http404()
if N > QUERY_SIZE_N_MAX:
N = QUERY_SIZE_N_MAX
try:
project_id = int(project_id)
except ValueError:
raise Http404()
# do we have a valid project?
project = session.query( Node ).filter(Node.id == project_id).first()
if project is None:
raise Http404()
user = cache.User[request.user.id]
if not user.owns(project):
return HttpResponseForbidden()
# corpus node instantiation as a Django model
corpus = Node(
name = query,
user_id = request.user.id,
parent_id = project_id,
typename = 'CORPUS',
hyperdata = { "action" : "Scrapping data"
, "language_id" : "en"
}
)
#download_file
crawler_bot = load_crawler(source)()
#for now no way to force downloading X records
#the long running command
filename = crawler_bot.download(query)
corpus.add_resource(
type = source["type"]
#, name = source["name"]
, path = crawler_bot.path
)
session.add(corpus)
session.commit()
#corpus_id = corpus.id
try:
scheduled(parse_extract_indexhyperdata)(corpus.id)
except Exception as error:
print('WORKFLOW ERROR')
print(error)
try:
print_tb(error.__traceback__)
except:
pass
# IMPORTANT ---------------------------------
# sanitize session after interrupted transact
session.rollback()
# --------------------------------------------
return render(
template_name = 'pages/projects/wait.html',
request = request,
context = {
'user' : request.user,
'project': project,
},
)
# non-POST requests are not supported
raise Http404()
......@@ -10,32 +10,35 @@
# moissonneurs == getting data from external databases
# Available databases :
## Pubmed
## IsTex,
## CERN
from django.conf.urls import url
import moissonneurs.pubmed as pubmed
import moissonneurs.istex as istex
import moissonneurs.cern as cern
# TODO
#import moissonneurs.hal as hal
#import moissonneurs.revuesOrg as revuesOrg
# Available databases :
import moissonneurs.pubmed as pubmed
import moissonneurs.istex as istex
import moissonneurs.cern as cern
import moissonneurs.multivac as multivac
import moissonneurs.hal as hal
import moissonneurs.isidore as isidore
# TODO ?
# REST API for the moissonneurs
# TODO : ISIDORE
# /!\ urls patterns here are *without* the trailing slash
urlpatterns = [ url(r'^pubmed/query$' , pubmed.query )
, url(r'^pubmed/save/(\d+)' , pubmed.save )
, url(r'^istex/query$' , istex.query )
, url(r'^istex/save/(\d+)' , istex.save )
, url(r'^cern/query$' , cern.query )
, url(r'^cern/save/(\d+)' , cern.save )
urlpatterns = [ url(r'^pubmed/query$' , pubmed.query )
, url(r'^pubmed/save/(\d+)' , pubmed.save )
, url(r'^istex/query$' , istex.query )
, url(r'^istex/save/(\d+)' , istex.save )
, url(r'^cern/query$' , cern.query )
, url(r'^cern/save/(\d+)' , cern.save )
, url(r'^multivac/query$' , multivac.query )
, url(r'^multivac/save/(\d+)' , multivac.save )
, url(r'^hal/query$' , hal.query )
, url(r'^hal/save/(\d+)' , hal.save )
, url(r'^isidore/query$' , isidore.query )
, url(r'^isidore/save/(\d+)' , isidore.save )
]
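A hypothetical client-side check of one of these routes; it assumes a local development server and ignores the authentication and CSRF handling that the real views require:
``` python
import requests

# Ask the HAL moissonneur how many results a query would return:
r = requests.post("http://localhost:8000/moissonneurs/hal/query",
                  data={"query": "complex systems"})
print(r.json())   # e.g. {"results_nb": 1234}
```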
......@@ -183,9 +183,55 @@
</div>
</div>
</div>
{% endif %}
{% if teamPast %}
<div class="panel panel-default">
<div class="panel-heading">
<h2 class="panel-title">
<a data-toggle="collapse" data-parent="#accordion" href="#collapseTeamPast">
<center>
<h2>
<span class="glyphicon glyphicon-question-sign" aria-hidden="true"></span>
Former Developers
<span class="glyphicon glyphicon-question-sign" aria-hidden="true"></span>
</h2>
</center>
</a>
</h2>
</div>
<div id="collapseTeamPast" class="panel-collapse collapse" role="tabpanel">
<div class="panel-body">
<div class="container">
<div class="row">
<div class="thumbnails">
{% for member in teamPast %}
<div class="col-md-5 ">
<div class="thumbnail">
<div class="caption">
<center>
<h3>{{ member.first_name }} {{member.last_name }}</h3>
{% if member.role %}
<p class="description">{{ member.role }}</p>
{% endif %}
</center>
</div>
</div>
</div>
{% endfor %}
</div>
</div>
</div>
</div>
</div>
</div>
{% endif %}
</div>
</div>
<div class="panel panel-default">
<div class="panel-heading">
......
......@@ -367,7 +367,7 @@
<p>
Gargantext
<span class="glyphicon glyphicon-registration-mark" aria-hidden="true"></span>
, version 3.0.6.7,
, version 3.0.6.9.4,
<a href="http://www.cnrs.fr" target="blank" title="Institution that enables this project.">
Copyrights
<span class="glyphicon glyphicon-copyright-mark" aria-hidden="true"></span>
......
......@@ -86,12 +86,12 @@
<button type="button" class="close" data-dismiss="modal" aria-label="Close">
<span aria-hidden="true">&times;</span>
</button>
<h2 class="modal-title"><h2><span class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Uploading corpus...</h2>
<h2 class="modal-title"><h2><span class="glyphicon glyphicon-info-sign" aria-hidden="true"></span>Building corpus...</h2>
</div>
<div class="modal-body">
<h5>
Your file has been uploaded !
Gargantext need some time to eat it.
Gargantext is gathering your texts
and needs some time to eat them.
Duration depends on the size of the dish.
</h5>
</div>
......
......@@ -209,9 +209,11 @@
function CustomForSelect( selected ) {
// show Radio-Inputs and trigger FileOrNotFile>@upload-file events
selected = selected.toLowerCase()
var is_pubmed = (selected.indexOf('pubmed') != -1);
var is_istex = (selected.indexOf('istex') != -1);
if (is_pubmed || is_istex) {
var is_pubmed = (selected.indexOf('pubmed') != -1);
var is_istex = (selected.indexOf('istex' ) != -1);
var is_repec = (selected.indexOf('repec' ) != -1);
if (is_pubmed || is_istex || is_repec) {
// if(selected=="pubmed") {
console.log("show the button for: " + selected)
$("#pubmedcrawl").css("visibility", "visible");
......
......@@ -41,39 +41,42 @@
<div class="container theme-showcase" role="main">
<div class="jumbotron">
<div class="row">
<div class="col-md-4">
<h1>
<span class="glyphicon glyphicon-home" aria-hidden="true"></span>
Projects
</h1>
</div>
<div class="col-md-3"></div>
<div class="col-md-5">
<p id="project" class="help">
<br>
<button id="add" type="button" class="btn btn-primary btn-lg help" data-container="body" data-toggle="popover" data-placement="bottom">
<span class="glyphicon glyphicon-plus" aria-hidden="true"></span>
Add a new project
</button>
<div id="popover-content" class="hide">
<div id="createForm" class="form-group">
{% csrf_token %}
<div id="status-form" class="collapse">
</div>
<div class="row inline">
<label class="col-lg-3" for="inputName" ><span class="pull-right">Name:</span></label>
<input class="col-lg-8" type="text" id="inputName" class="form-control">
</div>
<div class="row inline">
<div class="col-lg-3"></div>
<button id="createProject" class="btn btn-primary btn-sm col-lg-8 push-left">Add Project</button>
<div class="col-lg-2"></div>
<div class="col-md-4">
<h1>
<span class="glyphicon glyphicon-home" aria-hidden="true"></span>
Projects
</h1>
</div>
<div class="col-md-3"></div>
<div class="col-md-5">
<p id="project" class="help">
<br>
<button id="add" type="button" class="btn btn-primary btn-lg help" data-container="body" data-toggle="popover" data-placement="bottom">
<span class="glyphicon glyphicon-plus" aria-hidden="true"></span>
Add a new project
</button>
<div id="popover-content" class="hide">
<form>
<div id="createForm" class="form-group">
{% csrf_token %}
<div id="status-form" class="collapse"></div>
<div class="row inline">
<label class="col-lg-3" for="inputName" ><span class="pull-right">Name:</span></label>
<input class="col-lg-8" type="text" id="inputName" class="form-control">
</div>
<div class="row inline">
<div class="col-lg-3"></div>
<button id="createProject" class="btn btn-primary btn-sm col-lg-8 push-left">Add Project</button>
<div class="col-lg-2"></div>
</div>
</div>
</form>
</div>
</div>
</div>
</p>
</p>
</div>
</div>
</div>
</div>
......@@ -87,7 +90,7 @@
</div>
<!-- CHECKBOX EDITION -->
<!--
<!--
<div class="row collapse" id="editor">
<button title="delete selected project" type="button" class="btn btn-danger" id="delete">
<span class="glyphicon glyphicon-trash " aria-hidden="true" ></span>
......@@ -98,9 +101,8 @@
<!-- <button type="button" class="btn btn-info" id="recalculate">
<span class="glyphicon glyphicon-refresh " aria-hidden="true" onclick="recalculateProjects()"></span>
</button>
-->
</div>
-->
<br />
......
This diff is collapsed.
......@@ -199,12 +199,12 @@
<button type="button" class="close" data-dismiss="modal" aria-label="Close">
<span aria-hidden="true">&times;</span>
</button>
<h2 class="modal-title"><h2><span class="glyphicon glyphicon-info-sign" aria-hidden="true"></span> Uploading corpus...</h2>
<h2 class="modal-title"><h2><span class="glyphicon glyphicon-info-sign" aria-hidden="true"></span>Building the corpus...</h2>
</div>
<div class="modal-body">
<p>
Your file has been uploaded !
Gargantext need some time to eat it.
Gargantext is gathering your texts
and needs some time to eat them.
Duration depends on the size of the dish.
</p>
</div>
......