Closed
Open
Opened Sep 18, 2018 by delanoe @anoe
3 of 3 tasks completed

Lang corpus

  • wget lang-wikimedia.xml.bz2 (dumps are listed at https://dumps.wikimedia.org/backup-index.html)

  • bunzip2 lang-wikimedia.xml.bz2

  • function to extract the articles in a specific format

Expected function:

wiki2text :: FilePath -> [Article]

data Article = Article { title :: Text, abstract :: Text, text :: Text }
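
One possible shape for this function, as a rough sketch only and not the implementation that closed the issue. Several things here are assumptions: the result is wrapped in `IO` because reading a file cannot be done by the pure `FilePath -> [Article]` signature above; the helpers `element`, `pages`, `toArticle`, `parseArticles` and the `sample` input are hypothetical names; and since the dump has no dedicated abstract field, the abstract is taken to be the first paragraph of the wikitext. Tags are found by naive string scanning with `Data.Text`, with no real XML parser and no entity decoding, and the whole file is read into memory, so this would only suit a small extract of a dump.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.Maybe   (mapMaybe)
import           Data.Text    (Text)
import qualified Data.Text    as T
import qualified Data.Text.IO as TIO

data Article = Article
  { title    :: Text
  , abstract :: Text
  , text     :: Text
  } deriving (Show, Eq)

-- | Content of the first <tag ...>...</tag> element in the input, if any.
-- Naive scanning (assumption): no nesting, no self-closing tags, no entities.
element :: Text -> Text -> Maybe Text
element tag input =
  case T.breakOn ("<" <> tag) input of
    (_, rest)
      | T.null rest -> Nothing
      | otherwise   ->
          let body = T.drop 1 (T.dropWhile (/= '>') rest)
          in  Just (fst (T.breakOn ("</" <> tag <> ">") body))

-- | Split a dump into the raw contents of its <page> elements.
pages :: Text -> [Text]
pages = map (fst . T.breakOn "</page>") . drop 1 . T.splitOn "<page>"

-- | Build an Article from one page; pages lacking a title or text are skipped.
toArticle :: Text -> Maybe Article
toArticle page = do
  t    <- element "title" page
  body <- T.strip <$> element "text" page
  -- Assumption: the abstract is everything before the first blank line.
  let abstr = fst (T.breakOn "\n\n" body)
  pure (Article t abstr body)

-- | Pure core, easy to test without touching the file system.
parseArticles :: Text -> [Article]
parseArticles = mapMaybe toArticle . pages

-- | Reads the whole decompressed dump strictly, using the locale encoding.
wiki2text :: FilePath -> IO [Article]
wiki2text path = parseArticles <$> TIO.readFile path

-- | Tiny hand-written input imitating the dump layout (hypothetical data).
sample :: Text
sample = T.unlines
  [ "<mediawiki>"
  , "  <page>"
  , "    <title>Haskell</title>"
  , "    <revision>"
  , "      <text xml:space=\"preserve\">Haskell is a language.\n\nIt is lazy.</text>"
  , "    </revision>"
  , "  </page>"
  , "</mediawiki>"
  ]

main :: IO ()
main = mapM_ print (parseArticles sample)
```

Keeping the parsing in a pure `parseArticles` and the file access in a thin `wiki2text` wrapper makes it possible to check the extraction on small inline samples. For a full dump, a streaming XML parser would likely be needed instead of `TIO.readFile`.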

Edited Oct 05, 2018 by Mael NICOLAS
Assignee: None
Milestone: None
Due date: Sep 20, 2018
Labels: None
Reference: gargantext/haskell-gargantext#4