Improve PhyloMaker performance by up to 50%
@anoe This is the first of several patches I plan to land to improve the performance and readability of the Phylo code.
I have spent a few weeks getting familiar with the code, which is quite complex and intricate. After a few false starts, I went back to the drawing board and started with the low-hanging fruit: small wins where the performance gains are immediately visible. At a glance, this is what I have done:
- I have replaced the `nub` and `sort` functions in `PhyloMaker` with `nubOrd` or, where possible, with `nub` and `sort` from the `discrimination` package, which run in O(n); by contrast, `Data.List.nub` is notoriously slow and quadratic (O(n^2)), and `Data.List.sort` is a merge sort that runs in O(n*log n). A minimal sketch of the substitution follows below.
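For illustration, here is a minimal sketch of the kind of substitution involved. The `dedupSort*` helpers are made up for this example and are not from the gargantext code base; only the imports and the complexity notes reflect the real libraries:

```haskell
module NubSortSketch where

import qualified Data.List                 as L        -- O(n^2) nub, O(n log n) merge sort
import           Data.Containers.ListUtils (nubOrd)    -- O(n log n), from `containers`
import qualified Data.Discrimination       as D        -- O(n) nub/sort, from `discrimination`

-- Hypothetical "deduplicate then sort" helpers, just to compare the variants.
dedupSortSlow, dedupSortLinear :: [Int] -> [Int]
dedupSortSlow   = L.sort . L.nub     -- before: quadratic nub, then merge sort
dedupSortLinear = D.sort . D.nub     -- after: both linear via discrimination

dedupOnly :: Ord a => [a] -> [a]
dedupOnly = nubOrd                   -- fallback when only an Ord instance is available
```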
Furthermore, the majority of the time in `PhyloMaker` is spent calculating `relatedComponents`, but since we do this over every similarity `Set` independently, we can parallelise it: I have added a strategic `parMap` (a sketch of the pattern is below).
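This is roughly the shape of the change, assuming a hypothetical worker that stands in for `relatedComponents`; the real signatures in the code base differ:

```haskell
module ParallelSketch where

import Control.DeepSeq             (NFData)
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Apply the per-set worker to every similarity set, evaluating the results
-- in parallel. `similaritySet` and `component` are placeholder type variables.
relatedComponentsAll
  :: NFData component
  => (similaritySet -> [component])  -- per-set worker, sequential inside
  -> [similaritySet]                 -- one entry per similarity Set
  -> [[component]]
relatedComponentsAll worker = parMap rdeepseq worker
-- previously: map worker, i.e. each similarity Set processed one after another
```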
This patch also adds a new (test) executable called `garg-phylo-profile`, which shares code with `garg-phylo` but is meant to be used for profiling, as it loads some canned data.
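For context, a profiling driver of this kind typically just loads the canned input, forces the result, and exits, so the profile measures the build itself. Here is a self-contained toy sketch; `loadCannedCorpus`, `buildPhylo`, and the input path are hypothetical stand-ins, not the actual `garg-phylo-profile` code:

```haskell
module Main where

import Control.DeepSeq   (force)
import Control.Exception (evaluate)

-- Stand-in loader: one document per line of the canned file.
loadCannedCorpus :: FilePath -> IO [String]
loadCannedCorpus = fmap lines . readFile

-- Stand-in for the real phylo-building pipeline.
buildPhylo :: [String] -> [(String, Int)]
buildPhylo = map (\doc -> (doc, length doc))

main :: IO ()
main = do
  corpus <- loadCannedCorpus "bench/canned-corpus.txt"  -- hypothetical path
  phylo  <- evaluate (force (buildPhylo corpus))        -- force full evaluation
  putStrLn ("built " <> show (length phylo) <> " entries")
```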
Results
Before my patch, running `garg-phylo-profile` would take:

    real 0m17,210s
    user 0m25,466s
    sys  0m0,747s
After my patch:

    real 0m9,230s
    user 0m34,577s
    sys  0m0,465s
In terms of the eventlog, we are doing a bit better, but I need to investigate further: I think there are parts that are highly parallel and can still be improved, though that will happen in a separate round of investigation:
As you can see, this is now much more parallel (green means "doing work" here); productivity went up a bit and GC time went down a bit. However, there is a huge gap where we are (likely) waiting for work to be computed (my hunch is that we are waiting for the similarities to be computed). This can be improved, but that will happen in due course.