gargantext / crawlers / arxiv-api

Commit 84b68adb
Authored Apr 07, 2022 by Przemyslaw Kaminski

bring back Arxiv module, better Result type

Parent: a7a556b6

Showing 4 changed files with 31 additions and 27 deletions

app/Main.hs                          +1   -1
crawlerArxiv.cabal                   +7   -1
package.yaml                         +3   -0
src/Arxiv.hs                         +20  -25

app/Main.hs
 module Main where
-import Arxiv.Wrapper
+import Arxiv
 import Conduit
 main :: IO ()
...

crawlerArxiv.cabal
...
@@ -25,11 +25,13 @@ source-repository head
 library
   exposed-modules:
-      Arxiv.Wrapper
+      Arxiv
   other-modules:
       Paths_crawlerArxiv
   hs-source-dirs:
       src
+  default-extensions:
+      RecordWildCards
   build-depends:
       arxiv
     , base >=4.7 && <5
...
@@ -50,6 +52,8 @@ executable arxiv-exe
       Paths_crawlerArxiv
   hs-source-dirs:
       app
+  default-extensions:
+      RecordWildCards
   ghc-options: -threaded -rtsopts -with-rtsopts=-N
   build-depends:
       arxiv
...
@@ -73,6 +77,8 @@ test-suite arxiv-test
       Paths_crawlerArxiv
   hs-source-dirs:
       test
+  default-extensions:
+      RecordWildCards
   ghc-options: -threaded -rtsopts -with-rtsopts=-N
   build-depends:
       arxiv
...

package.yaml
...
@@ -22,6 +22,9 @@ extra-source-files:
 # common to point users to the README.md file.
 description: Please see the README on GitHub at <https://github.com/githubuser/arxiv#readme>
 
+default-extensions:
+- RecordWildCards
+
 dependencies:
 - arxiv
 - base >= 4.7 && < 5
...

src/Arxiv/Wrapper.hs → src/Arxiv.hs
-module Arxiv.Wrapper
+module Arxiv
 where
 import Control.Applicative ((<$>))
...
@@ -14,6 +14,7 @@ import Network.HTTP.Conduit (parseRequest)
 import Network.HTTP.Simple as HT
 import Network.HTTP.Types.Status
 import Text.HTML.TagSoup
+import Text.Read (readMaybe)
 import qualified Conduit as C
 import qualified Data.ByteString as B hiding (unpack)
 import qualified Data.ByteString.Char8 as B (unpack)
...
@@ -129,38 +130,32 @@ results q sp = Ax.forEachEntryM sp (C.yield . mkResult)
 ----------------------------------------------------------------------
 -- Get data and format
 ----------------------------------------------------------------------
-data Result = Result { doi :: String
-                     , url :: String
-                     , primaryCategory :: Maybe Ax.Category
+data Result = Result { abstract :: String
+                     , authors :: [Ax.Author]
                      , categories :: [Ax.Category]
+                     , doi :: String
                      , journal :: String
-                     , authors :: [Ax.Author]
+                     , primaryCategory :: Maybe Ax.Category
                      , publication_date :: String
-                     , year :: String
                      , title :: String
-                     , abstract :: String
                      , total :: Int
+                     , url :: String
+                     , year :: Maybe Int
                      } deriving (Show)
 mkResult :: [Soup] -> Result
-mkResult sp = let doi' = Ax.getDoi sp
-                  url' = Ax.getPdf sp
-                  primaryCategory' = Ax.getPrimaryCategory sp
-                  categories' = Ax.getCategories sp
-                  journal' = Ax.getJournal sp
-                  authors' = Ax.getAuthors sp
-                  publication_date' = Ax.getPublished sp
-                  year' = Ax.getYear sp
-                  title' = Ax.getTitle sp & clean'
-                  abstract' = Ax.getSummary sp & clean'
-                  total' = Ax.totalResults sp
-              in (Result doi' url'
-                         primaryCategory' categories'
-                         journal' authors'
-                         publication_date' year'
-                         title' abstract'
-                         total'
-                 )
+mkResult sp = let abstract = Ax.getSummary sp & clean'
+                  authors = Ax.getAuthors sp
+                  categories = Ax.getCategories sp
+                  doi = Ax.getDoi sp
+                  journal = Ax.getJournal sp
+                  primaryCategory = Ax.getPrimaryCategory sp
+                  publication_date = Ax.getPublished sp
+                  title = Ax.getTitle sp & clean'
+                  total = Ax.totalResults sp
+                  url = Ax.getPdf sp
+                  year = readMaybe $ Ax.getYear sp
+              in (Result { .. })
   where clean' x = let x' = clean ['\n', '\r', '\t'] x
                    in if null x' then "Not found" else x'
 clean _ [] = []
...
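
Note on the new `mkResult`: it builds the record with `Result { .. }`, which depends on the `RecordWildCards` extension this commit enables, and it parses the year with `readMaybe`. The sketch below shows that pattern in isolation. `Entry`, `Raw`, `mkEntry` and `orNotFound` are hypothetical names invented for this illustration (they are not part of the repository); `Raw` merely stands in for the parsed `[Soup]` entry.

```haskell
{-# LANGUAGE RecordWildCards #-}

import Data.Function ((&))
import Text.Read (readMaybe)

-- Hypothetical, trimmed-down stand-in for the Result record in the diff.
data Entry = Entry
  { title :: String
  , url   :: String
  , year  :: Maybe Int
  } deriving (Show, Eq)

-- Hypothetical raw input; in the real module the values come from Ax.get* calls.
data Raw = Raw
  { rawTitle :: String
  , rawUrl   :: String
  , rawYear  :: String
  }

-- Each let-binding carries the name of a record field, so `Entry {..}`
-- fills every field from the local binding of the same name.
mkEntry :: Raw -> Entry
mkEntry raw =
  let title = rawTitle raw & orNotFound
      url   = rawUrl raw
      year  = readMaybe (rawYear raw)  -- Nothing when the text is not an Int
  in Entry {..}
  where
    -- Mirrors the "Not found" fallback used by clean' in the diff.
    orNotFound s = if null s then "Not found" else s
```

With this style, reordering the fields of the record (as the commit does, alphabetically) cannot silently swap two `String` arguments, which was possible with the earlier positional `Result doi' url' ...` call. An unparsable year becomes `Nothing` instead of an arbitrary string.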