Merged
Opened Feb 22, 2024 by Alfredo Di Napoli (@AlfredoDiNapoli)
Improve PhyloMaker performance up to 50%

@anoe This is the first of many patches that I will land trying to improve the performance and the readability of the Phylo code.

I have spent a few weeks on this, getting familiar with the code, which is quite complex and intricate. After a few false starts, I went back to the drawing board and started with the low-hanging fruit: small wins where we could immediately see the performance gains. At a glance, this is what I have done:

  1. I have replaced the nub and sort functions in PhyloMaker with nubOrd (O(n*logn)) or, when possible, with nub and sort from the discrimination package, which run in O(n). By contrast, Data.List.nub is notoriously slow and quadratic (O(n^2)), and Data.List.sort is a merge sort that runs in O(n*logn).
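As a minimal sketch of the kind of swap described above (not the actual PhyloMaker code), the following compares Data.List.nub with nubOrd from the containers package. The names sample, dedupSlow and dedupFast are hypothetical and only serve the illustration. The discrimination variants are left out here to avoid an extra dependency.

```haskell
import Data.Containers.ListUtils (nubOrd)
import qualified Data.List as List

-- Hypothetical sample input; the real PhyloMaker lists hold phylo data.
sample :: [Int]
sample = [3, 1, 3, 2, 1, 2]

-- Quadratic: each element is compared against every element kept so far.
-- Only needs an Eq instance.
dedupSlow :: [Int] -> [Int]
dedupSlow = List.nub

-- O(n log n): already-seen elements are tracked in a Set.
-- Needs an Ord instance; keeps the first occurrence of each element.
dedupFast :: [Int] -> [Int]
dedupFast = nubOrd

main :: IO ()
main = do
  print (dedupSlow sample)
  print (dedupFast sample)
  -- Both functions keep first occurrences in order, so the results agree.
  print (dedupSlow sample == dedupFast sample)
```

Because both functions keep the first occurrence of each element in its original position, nubOrd can stand in for nub wherever the element type has an Ord instance.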

Furthermore, the majority of the PhyloMaker's time is spent calculating relatedComponents, and we do this over the whole similarity Set, so that work can be parallelised. I have therefore added a strategic parMap.
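For readers unfamiliar with parMap, here is a minimal sketch of the pattern, assuming the parallel package is available. The function expensive is a hypothetical stand-in for a costly per-element computation; it is not relatedComponents, and parallelResults is likewise an illustrative name.

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Hypothetical stand-in for an expensive, independent per-element computation.
expensive :: Int -> Int
expensive n = sum [1 .. n]

-- Evaluate every element in parallel, forcing each result to normal form
-- (rdeepseq) so the work really happens inside the sparks.
parallelResults :: [Int] -> [Int]
parallelResults = parMap rdeepseq expensive

main :: IO ()
main = print (parallelResults [10, 100, 1000])
```

To see any actual parallelism the program has to be compiled with -threaded and run with +RTS -N. Without these flags it still produces the same results, only sequentially.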

This patch also adds a new (test) executable called garg-phylo-profile, which shares code with garg-phylo but is meant to be used for profiling, as it loads some canned data.

Results

Before my patch, running garg-phylo-profile would take:

real	0m17,210s
user	0m25,466s
sys	0m0,747s

After my patch:

real	0m9,230s
user	0m34,577s
sys	0m0,465s
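To put the two runs side by side, the small sketch below works out the ratios from the timings above. By this arithmetic the wall-clock ("real") time drops by roughly 46% (about a 1.86x speedup), while the CPU ("user") time rises by roughly 36%, which is what one would expect when the same work is spread over several cores. The helper names reduction and speedup are illustrative only.

```haskell
-- Wall-clock ("real") and CPU ("user") times in seconds, copied from the runs above.
realBefore, realAfter, userBefore, userAfter :: Double
realBefore = 17.210
realAfter  = 9.230
userBefore = 25.466
userAfter  = 34.577

-- Fractional reduction of a measurement relative to its old value.
reduction :: Double -> Double -> Double
reduction old new = (old - new) / old

-- How many times faster the new run is than the old one.
speedup :: Double -> Double -> Double
speedup old new = old / new

main :: IO ()
main = do
  putStrLn ("wall-clock reduction: " ++ show (reduction realBefore realAfter))
  putStrLn ("wall-clock speedup:   " ++ show (speedup realBefore realAfter))
  putStrLn ("user-time increase:   " ++ show (userAfter / userBefore - 1))
```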

The eventlog also looks a bit better, but I need to investigate further: I think there are parts which are highly parallel and can be improved. That will happen in a separate round of investigation:

[Screenshot: eventlog after the patch (Screenshot_2024-02-22_at_09.32.23)]

As you can see, this is now much more parallel (green means "doing work" here); productivity went up a bit and GC went down a bit. However, there is a huge gap where we are (likely) waiting for work to be computed (my hunch is that we are waiting for the similarities to be computed). This can be improved, but that will happen in due course.

Edited Feb 22, 2024 by Alfredo Di Napoli

Check out, review, and merge locally

Step 1. Fetch and check out the branch for this merge request

git fetch origin
git checkout -b adinapoli/phylo-profile-2 origin/adinapoli/phylo-profile-2

Step 2. Review the changes locally

Step 3. Merge the branch and fix any conflicts that come up

git fetch origin
git checkout dev
git merge --no-ff adinapoli/phylo-profile-2

Step 4. Push the result of the merge to GitLab

git push origin dev

Note that pushing to GitLab requires write access to this repository.

Reference: gargantext/haskell-gargantext!249
