gargantext / crawlers / pubmed
Commit 1e853d8b, authored Dec 11, 2023 by Alfredo Di Napoli

Build cleanly on GHC 8.10.7 and GHC 9.4.7

parent 66196687

Showing 5 changed files with 22 additions and 8 deletions

cabal.ghc-9.4.7.project  +8 -0
cabal.project  +6 -0
crawlerPubMed.cabal  +5 -3
src/PUBMED.hs  +1 -2
src/PUBMED/Parser.hs  +2 -3

cabal.ghc-9.4.7.project  0 → 100644
+with-compiler: ghc-9.4.7
+packages: .
+tests: True
+source-repository-package
+    type: git
+    location: https://github.com/delanoe/data-time-segment.git
+    tag: 4e3d57d80e9dfe6624c8eeaa8595fc8fe64d8723

cabal.project  0 → 100644
+packages: .
+with-compiler: ghc-8.10.7
+source-repository-package
+    type: git
+    location: https://github.com/delanoe/data-time-segment.git
+    tag: 4e3d57d80e9dfe6624c8eeaa8595fc8fe64d8723

crawlerPubMed.cabal
@@ -59,9 +59,9 @@ library
     , lens
     , mtl
     , optparse-applicative
-    , servant
-    , servant-client
-    , servant-client-core
+    , servant >= 0.18.3 && < 0.20
+    , servant-client >= 0.18.3 && < 0.20
+    , servant-client-core >= 0.18.3 && < 0.20
     , taggy
     , taggy-lens
     , text
@@ -143,6 +143,8 @@ test-suite crawlerPubMed-test
     , servant-client-core
     , taggy
     , taggy-lens
+    , tasty
+    , tasty-hunit
     , text
     , time
   default-language: Haskell2010

src/PUBMED.hs
@@ -26,7 +26,6 @@ import Network.HTTP.Client (newManager)
 import Network.HTTP.Client.TLS (tlsManagerSettings)
 import PUBMED.Client
 import PUBMED.Parser
-import Panic (panic)
 import Prelude hiding (takeWhile)
 import Servant.Client (runClientM, mkClientEnv, BaseUrl(..), ClientEnv(..), ClientError, Scheme(..))
 import qualified Data.ByteString.Lazy as LBS
@@ -91,7 +90,7 @@ getPage pageNum = do
   debugLog $ "[getPage] getting page " <> show pageNum <> ", offset: " <> show offset <> ", perPage: " <> show perPage <> ", query: " <> T.unpack query <> ", apiKey: " <> show apiKey
   eDocs <- runSimpleFindPubmedAbstractRequest (Just offset)
   case eDocs of
-    Left err -> panic $ "[getPage] error: " <> show err
+    Left err -> error $ "[getPage] error: " <> show err
     Right docs -> do
       _ <- liftIO $ threadDelay 2_000_000 -- two seconds
       debugLog $ "[getPage] docs length: " <> show (length docs)

src/PUBMED/Parser.hs
@@ -23,7 +23,6 @@ import qualified Data.ByteString.Lazy as DBL
 import qualified Data.Text as T
 import qualified Data.Text.Lazy as TL
 import qualified Data.Text.Lazy.IO as TLIO
-import Panic (panic)
 import qualified Text.Read as TR
 import qualified Text.Taggy.Lens as TTL
@@ -40,7 +39,7 @@ parseDocIds txt = map parseId parsed
   where
     parsed = txt ^.. TTL.html . TTL.allNamed (only "eSearchResult") . namedEl "IdList" . namedEl "Id" . TTL.contents
     parseId s = case (TR.readMaybe (T.unpack s) :: Maybe Integer) of
-      Nothing -> panic $ "Can't read doc id from: " <> (T.unpack s)
+      Nothing -> error $ "Can't read doc id from: " <> (T.unpack s)
       Just cnt -> cnt
 parseDocCount :: TL.Text -> Maybe Integer
@@ -97,7 +96,7 @@ parsePubMed txt = catMaybes $ txt ^.. pubmedArticle . to pubMed
     articleId = medline . namedEl "PMID"
     pubDate = namedEl "PubmedData" . namedEl "History" . namedEl "PubMedPubDate" . TTL.attributed (ix "PubStatus" . only "pubmed")
     pubMedId el = case (TR.readMaybe $ T.unpack $ el ^. TTL.contents) of
-      Nothing -> panic $ "Cannot parse id: " <> (T.unpack $ el ^. TTL.contents)
+      Nothing -> error $ "Cannot parse id: " <> (T.unpack $ el ^. TTL.contents)
       Just id -> id
     pubMedDate el = PubMedDate { pubmedDate_date = jour y m d
                                , pubmedDate_year = y
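
The swap from `panic` to `error` in `parseId` can be tried in isolation. The following is a minimal sketch rather than the project's code: it assumes only the `base` and `text` packages, and the `Main` module, the `main` driver and the sample ids are made up for illustration. The body of `parseId` follows the post-commit hunk above.

```haskell
module Main (main) where

import qualified Data.Text as T
import qualified Text.Read as TR

-- Same shape as parseId after this commit: Prelude's `error` stands in
-- for `panic`, so no `import Panic (panic)` is needed.
parseId :: T.Text -> Integer
parseId s = case (TR.readMaybe (T.unpack s) :: Maybe Integer) of
  Nothing  -> error $ "Can't read doc id from: " <> T.unpack s
  Just cnt -> cnt

-- Hypothetical driver with made-up ids, only to exercise parseId.
main :: IO ()
main = print (map parseId [T.pack "12345", T.pack "678"])
-- prints [12345,678]
```

As with `panic`, a malformed id still aborts with an exception; the only difference visible in this sketch is that `error` comes from the Prelude, which removes one import.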