Performance measurement suite #73

kodonnell · 2016-10-05T02:07:25Z

As discussed in #67 and #68. Code isn't ready for a full merge - I'm looking for some initial feedback on the general approach. Key points:

created matching-tools module (like graphhopper-tools)
copied Measurement.java from GH, and added two tests: one for location lookups (i.e. findNClosest) and one for testing a bunch of randomly created GPX entries (randomly pick start/end, then find route, then down-sample points).
learned (some of) Maven, the hard way
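
For readers following along, the down-sampling step of the random-track test could look roughly like the sketch below. It is an illustration only, not the code in Measurement.java: the class name `TrackSampler`, the method `downSample`, and the `double[]{lat, lon}` point representation are assumptions made for this example (the PR itself works with GraphHopper routes and GPX entries).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the PR): thins out a route to mimic a
// sparse GPS track.
final class TrackSampler {
    // Keeps every step-th point and always keeps the last point, so the
    // sampled track still spans the whole route.
    static List<double[]> downSample(List<double[]> route, int step) {
        if (step < 1) {
            throw new IllegalArgumentException("step must be >= 1");
        }
        List<double[]> sampled = new ArrayList<>();
        for (int i = 0; i < route.size(); i += step) {
            sampled.add(route.get(i));
        }
        // The loop above misses the end point unless it falls on a multiple of step.
        int last = route.size() - 1;
        if (last >= 0 && last % step != 0) {
            sampled.add(route.get(last));
        }
        return sampled;
    }
}
```

With a five-point route and `step = 3`, this keeps the points at indices 0, 3 and 4.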

Todo:

general tidy up
tweak random creation of GPX entries (above) to be near the path, not on it.
do I need to warm up the JVM?
...?
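
One possible way to tackle the "near the path, not on it" item is to add a small random offset to every sampled point. The sketch below is an assumption about how that might look, not something this PR implements: `TrackJitter`, `jitter`, and the constant of roughly 111,320 m per degree of latitude (a spherical approximation) are all illustrative choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper (not part of the PR): moves track points off the route.
final class TrackJitter {
    // Approximate length of one degree of latitude in meters (assumption).
    private static final double METERS_PER_DEGREE_LAT = 111_320.0;

    // Returns a copy of the points, each shifted by Gaussian noise with the
    // given standard deviation in meters. Passing a seeded Random keeps the
    // generated tracks repeatable between runs.
    static List<double[]> jitter(List<double[]> points, double sigmaMeters, Random rnd) {
        List<double[]> out = new ArrayList<>(points.size());
        for (double[] p : points) {
            // A degree of longitude shrinks with the cosine of the latitude.
            double metersPerDegreeLon = METERS_PER_DEGREE_LAT * Math.cos(Math.toRadians(p[0]));
            double lat = p[0] + rnd.nextGaussian() * sigmaMeters / METERS_PER_DEGREE_LAT;
            double lon = p[1] + rnd.nextGaussian() * sigmaMeters / metersPerDegreeLon;
            out.add(new double[]{lat, lon});
        }
        return out;
    }
}
```

The longitude conversion breaks down close to the poles, which should not matter for city-sized test maps such as the Leipzig extract used here.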

Example usage:

$ ./map-matching.sh action=measurement map-data/leipzig_germany.osm.pbf 3
...
$ cat measurement.20161005_145346 
commits (in order tested):
3b305dc8a76123a12375da91a69d8ecf47466ed0 merged most recent
cacb7a3e0d2033042ca11d7280af43501a2a06dc ok, seems to be working ...
59a01fcc3307a05e9b8e74119ec1948f8a9de7c9 getting multiple commits working ...

measurements:
-------------

location_index_match.max      5.978906            6.654648            6.393774            
location_index_match.mean     0.1895065804        0.194572196         0.1902064926        
location_index_match.min      0.001334            0.00126             0.001377            
location_index_match.sum      947.532902          972.86098           951.032463          
map_match.max                 8.049864            7.153459            7.29994             
map_match.mean                0.7924976776        0.7868942498        0.7843470472        
map_match.min                 0.094859            0.128563            0.091647            
map_match.sum                 3962.488388         3934.471249         3921.735236         
measurement.count             5000                5000                5000                
measurement.seed              123                 123                 123                 
measurement.time              7010                7007                7027                
measurement.totalMB           655                 655                 655                 
measurement.usedMB            19                  19                  21

karussell · 2016-10-05T09:00:17Z

Really nice - thanks!

learned (some of) Maven, the hard way

oh, what was the problem here? I do like gradle a bit more (see also an issue on GH core) but the work would be too big to make sense now.

do I need to warm up the JVM?

That should be done already from the MiniPerfTest

tweak random creation of GPX entries (above) to be near the path, not on it.

Yes, this sounds good. I think I've done this in the tests already somewhere.

kodonnell · 2016-10-05T18:12:43Z

oh, what was the problem here

I've just never had experience with Maven (or Java) before (I'm more python/JS/etc.) - so figuring out how to make it all work (including in Eclipse) took a bit of head-scratching. Thankfully, I could just copy from e.g. graphhopper-tools.

I think I've done this in the tests already somewhere.

Ah, didn't realise that. From what I see, there's no randomness added - both in which points are chosen (it seems one GPXEntry is created per instruction point) and the GPXEntry aligns exactly with the route. As an aside, I was thinking about adding in some new tests which create a random route and create some GPX entries, and then check that the map-matched route is the same - what do you think?

I will tidy this PR up now. For anyone else reading, some next steps (which I'll try to add as TODO's in the code):

tidy up map-matching.sh ... following the graphhopper equivalent, which is more robust.
add some more perf. tests to Measurement.java
figure out git a bit more nicely when measuring the last commits: currently you can only do it after you've just committed (as you can't check out another commit without committing any existing changes), and it doesn't come back to where you started (I seem to end up in a detached HEAD state).
similarly, when comparing the last three commits, it assumes you've got the same measurement properties - it won't merge things nicely if e.g. you add a new test (which isn't present in old commits).

karussell · 2016-10-05T18:53:53Z

including in Eclipse

I recommend using NetBeans or IntelliJ instead (especially when you use Maven). Eclipse is a no-go for me personally ;)

figure out git a bit more nicely when measuring the last commits

For perf comparison you could make the parameters you want to change configurable instead. But for me it was often a lot easier to create a separate branch, test a config change here and something else there, and commit these to the branch. Then it doesn't matter where I end up: I can always type either git checkout master or git checkout some_branch afterwards.

From what I see, there's no randomness added

Oops - indeed. That is something we can/should do. I probably didn't add the randomness because the old algorithm was not good enough to handle all situations.

I was thinking about adding in some new tests which create a random route and create some GPX entries, and then check that the map-matched route is the same - what do you think?

Yes, sounds good. I would distort the routing output a bit before the map matching and compare the original route length R (not the length G calculated from gpx entries) with the map matched length M and allow only <1% difference or something. If one would compare G and M the difference could be bigger.
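
The length check suggested above can be sketched as follows. This is a minimal illustration under stated assumptions: `LengthCheck`, its method names, the `double[]{lat, lon}` point format and the haversine-based length are choices made for this example, not GraphHopper API; in a real test R and M would come from the routing and map-matching results.

```java
import java.util.List;

// Hypothetical helper (not part of the PR): compares route length R with
// map-matched length M.
final class LengthCheck {
    private static final double EARTH_RADIUS_M = 6_371_000.0;

    // Great-circle distance in meters between two lat/lon points (haversine formula).
    static double haversine(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    // Sums the segment lengths of a polyline given as {lat, lon} pairs.
    static double length(List<double[]> points) {
        double sum = 0;
        for (int i = 1; i < points.size(); i++) {
            double[] p = points.get(i - 1);
            double[] q = points.get(i);
            sum += haversine(p[0], p[1], q[0], q[1]);
        }
        return sum;
    }

    // True if M deviates from R by less than the given fraction of R
    // (0.01 corresponds to the "<1%" mentioned above).
    static boolean withinTolerance(double routeLength, double matchedLength, double tolerance) {
        return Math.abs(matchedLength - routeLength) < tolerance * routeLength;
    }
}
```

Comparing against R rather than G matters because the down-sampled GPX points cut corners, so G is usually shorter than the route that produced them.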

kodonnell · 2016-10-06T01:21:51Z

parameters you want to change configurable

I was more thinking of changes to the code base (which you want to test before you commit). I actually just slipped up after running the test as it left me in a detached head state, which I then committed to.

Yes, sounds good

Submitted a new issue to remind us.

kodonnell · 2016-10-13T22:52:53Z

Is it OK to do this on my fork, to update my user details for this PR, given it hasn't been merged? I'm not sure if that'll mess with anything else (including the PR you just merged ...)

karussell · 2016-10-14T09:15:34Z

Yes, you should be able to change your user details here and push (force) without causing a problem on my side.

kodonnell · 2016-10-14T19:18:23Z

Excellent, seems to have worked. Want me to merge in the new API, or leave that to you?

karussell · 2016-10-14T19:21:07Z

matching-tools/src/main/java/com/graphhopper/matching/tools/Measurement.java

+                        }
+
+                        // TODO: do we need to return something non-trivial?
+                        return 0;


For all return values we need some result from the calculation, to prevent JVM optimizations from kicking in and removing the whole code above (although unlikely here IMO).

kodonnell · Oct 15, 2016

Ah, didn't know that. Fixed, and also fixed a few other bugs with the test.
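
The point about return values can be illustrated with a generic timing loop. This is a sketch, not the MiniPerfTest or Measurement.java code: `PerfLoop` and `run` are hypothetical names, and the `Integer.bitCount` task merely stands in for a real lookup or map-matching call.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical example (not part of the PR): a measured loop whose result
// depends on every iteration.
final class PerfLoop {
    // Runs the task count times and returns the sum of its results. Because
    // the sum is returned and used by the caller, the JIT compiler cannot
    // treat the task's work as dead code and drop it.
    static long run(int count, IntUnaryOperator task) {
        long checksum = 0;
        for (int i = 0; i < count; i++) {
            checksum += task.applyAsInt(i);
        }
        return checksum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long checksum = run(5000, i -> Integer.bitCount(i));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Printing the checksum makes the result observable outside the loop.
        System.out.println("checksum=" + checksum + ", took " + elapsedMs + " ms");
    }
}
```

Returning a constant such as 0 would leave the loop body with no observable effect, which is exactly the situation the review comment warns about.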

karussell · 2016-11-14T09:30:51Z

Sorry, will have a look at your latest changes - thanks!

karussell

Looks good. Will test it locally.

karussell · 2016-11-14T10:05:31Z

map-matching.sh

@@ -1,51 +1,98 @@
 #!/bin/bash

+# bail if any errors ...
+set -e


This strict option might be useful for graphhopper.sh too

karussell · 2016-11-14T10:07:50Z

map-matching.sh

+            git checkout "$commit"
+            git log -n 1 --pretty=oneline >> "$fname"
+            startMeasurement
+            while read -r line; do


where does this loop through?

kodonnell · Nov 14, 2016

This is my way of merging results across multiple commits into a single file. Outputs are written temporarily to $measurement_fname (see lines 43 and 57) and read in line by line (see line 78). Then lines 66-77 are the logic for stripping the key and value from each line, and

adding a new line if it's a new measurement key

adding the value to the next column if it's an existing key

This is taken care of in the $values file. My advice: try it and see = )
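
The PR does this merging in bash inside map-matching.sh. Purely as an illustration of the same idea, the sketch below merges one key/value map per commit into side-by-side columns. Everything in it is an assumption made for the example: the class `MeasurementMerger`, the column widths (30 and 20, chosen to resemble the sample output), and the "-" placeholder for keys that a commit does not have, which the script described here does not handle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of the PR): merges per-commit measurements
// into one row per key with one column per commit.
final class MeasurementMerger {
    static List<String> merge(List<Map<String, String>> perCommit) {
        // TreeMap keeps the rows sorted by measurement key.
        Map<String, String[]> rows = new TreeMap<>();
        for (int col = 0; col < perCommit.size(); col++) {
            for (Map.Entry<String, String> e : perCommit.get(col).entrySet()) {
                String[] row = rows.get(e.getKey());
                if (row == null) {
                    // New measurement key: start a row with placeholders.
                    row = new String[perCommit.size()];
                    Arrays.fill(row, "-");
                    rows.put(e.getKey(), row);
                }
                // Existing key: put the value into this commit's column.
                row[col] = e.getValue();
            }
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String[]> e : rows.entrySet()) {
            StringBuilder sb = new StringBuilder(String.format("%-30s", e.getKey()));
            for (String value : e.getValue()) {
                sb.append(String.format("%-20s", value));
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

Each input map would be filled from one per-commit measurement file, in the order the commits were tested.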

karussell · Nov 14, 2016

My advice: try it and see = )

Will definitely do this. But first: why would I want to have multiple results in one file? If I have one file per commit I know exactly what is what and can give a computer those files as input (I already have a somewhat hidden UI here - wait a bit and then click e.g. routing).

kodonnell · Nov 14, 2016

See first PR comment - I find it much easier to compare performance across multiple commits when all the values are side-by-side, etc. The previous approach meant you had to open e.g. 10 files, and manually eyeball through until you found the right value, then memorise it and compare against all others, etc. If you wanted to compare across multiple metrics, it was even worse. Of course, if you're making a separate UI, there's not much need.

It's very easy to keep the original raw files per commit, if you're that way inclined (I just delete them, i.e. line 85).

karussell · Dec 2, 2016

You convinced me to invest a bit of time to make the UI work easily. It wasn't easy to get it working locally, but with nodejs this was rather simple:

graphhopper/graphhopper#894

Would you mind making this work with this UI?

kodonnell · 2016-11-29T14:04:21Z

@karussell, is there anything I can do to move this along? I ask because I've anecdotally found odd results. E.g. one would expect deduping on node ID (see #81) to improve performance (by reducing the number of candidates), but I actually found it decreased it (in a specific case, and including when I'm caching the paths and using DIJKSTRA_ONE_TO_MANY). Similarly, it'd be good to know how #83 affects performance, etc.

kodonnell · 2016-12-04T06:11:46Z

Updated to address some of the above concerns, and added a new feature. Specifically, running it for e.g. the last 5 commits will create five separate files (named measurement<time at start of test>.properties as per the usual GH behaviour) and one simple merged file measurement_<first_commit>_<last_commit>.properties, which allows for easy comparison between commits, including graphs (!). That is:

measurement2016-12-04_19_06_37.properties
measurement2016-12-04_19_07_01.properties
measurement2016-12-04_19_07_25.properties
measurement2016-12-04_19_07_49.properties
measurement2016-12-04_19_08_14.properties
measurement_f83558d_758529c.properties

The start of the corresponding output (which is the same as the merged file):

commits:
--------------------------

f83558d [2016-12-04] start with oldest commit first
173bed8 [2016-12-04] playing with graphs
5441639 [2016-12-04] show charts one by one
2a990e1 [2016-12-04] show charts one by one
758529c [2016-12-04] Merge branch 'issue_68' of github.com:kodonnell/map-matching into issue_68

measurements:
-------------

                              f83558d             173bed8             5441639             2a990e1             758529c             
                              -------             -------             -------             -------             -------             
location_index_match.max      6.218712            6.221624            6.680946            6.379524            6.493001            
location_index_match.mean     0.190912689         0.1928028062        0.19456209100000002 0.19487444040000002 0.1991708052        
location_index_match.min      0.001219            0.001199            0.001186            0.001306            0.001221            
location_index_match.sum      954.563445          964.014031          972.810455          974.372202          995.854026          
map_match.max                 1518.55109          1655.122742         1527.293507         1656.642537         1640.558294         
map_match.mean                120.16302784999999  129.7930923         124.57563984000001  126.85586235000001  126.61892128        
map_match.min                 3.718201            3.810781            3.795458            3.840196            3.79461             
map_match.sum                 12016.302785        12979.30923         12457.563984        12685.586235        12661.892128        
measurement.count             5000                5000                5000                5000                5000                
measurement.seed              123                 123                 123                 123                 123                 
measurement.time              17178               18165               17749               17939               17943               
measurement.totalMB           1449                1528                1343                1302                1469                
measurement.usedMB            35                  37                  33                  33                  35                  



                                                                               
                          location_index_match.max                             
                                                                               
                                   ********                                    
      ********      *********      *      *      *********      ********       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *      *      *       *      *      *      *       *      *      *       
      *  +   *      *   +   *      *   +  *      *   +   *      *   +  *       
  +---********------*********------********------*********------********---+   
      f83558d        173bed8        5441639       2a990e1        758529c       
                                                                               
Press any key to view the next chart, or CTRL + C to exit ...

In future (this PR?), we could/should

add an error quantification (e.g. std) to the measurement class (inc. in GH), and add it to the outputs (inc. plots)
cache some of the builds/tests to speed things up if the developer's looking for quick feedback
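
As a rough idea of what the error quantification could involve, the sketch below tracks mean and standard deviation incrementally with Welford's algorithm, so individual timings do not have to be stored. `RunningStats` is a hypothetical class written for this example; it is not part of GraphHopper's measurement classes, and how it would be wired into them is left open.

```java
// Hypothetical helper (not part of the PR): running mean and standard
// deviation using Welford's online algorithm.
final class RunningStats {
    private long count;
    private double mean;
    private double m2; // sum of squared differences from the current mean

    // Adds one observation, e.g. the duration of a single map-matching call.
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
    }

    long count() {
        return count;
    }

    double mean() {
        return mean;
    }

    // Sample standard deviation (n - 1 denominator); NaN for fewer than two values.
    double std() {
        return count > 1 ? Math.sqrt(m2 / (count - 1)) : Double.NaN;
    }
}
```

A `std` row next to each `mean` row in the merged file would make it easier to judge whether differences between commits, such as those in the table above, are larger than run-to-run noise.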

Would you mind making this work with this UI?

I guess it should work if you copy the files into the right place. (I'm assuming you don't want me to add a duplicate UI to the map-matching module? As an aside, that's another good reason for merging).

As discussed here ... maybe we have something like this for the individual developer use case, and leave the UI as a separate thing for the full history browsing use case?

kodonnell · 2016-12-04T06:14:56Z

PS - we can remove all the merging stuff and graphs, if you want. It's kind of a proof-of-concept, but it might be pointless, depending what happens with the GUI.

karussell · 2016-12-04T15:28:35Z

which allows for easy comparison between commits, including graphs (!).

Hmmh, you rely on gnuplot. I really like that you try for simple solutions, but this will definitely cause problems on other OSes (and I say this although I love gnuplot and always have it installed :) )

(I'm assuming you don't want me to add a duplicate UI to the map-matching module?

No. Duplication would be ugly. Maybe I'll keep it out of GH completely, then.

As discussed here ... maybe we have something like this for the individual developer use case

If we really want a separate thing here to get a fast overview directly after the run I would highly prefer a pure Java solution for the graphing/pictures and e.g. use jfreechart (which can be used to export images too). See e.g. the usage in the pt branch here

karussell · 2016-12-04T15:34:43Z

Totally forgot: thanks for creating the separate measurement properties; the merged measurement file is useful too. Now we need to decide about the plotting: make it optional, do it in pure Java, or leave it out of this PR entirely for now.

kodonnell · 2016-12-04T17:16:27Z

Hmmh, you rely on gnuplot

Kind of - it works without, and you get notified if you don't have it. If a developer wants graphs (they may not need them) and if they're like me (unix based and install things willy nilly) then it shouldn't be an issue. (It's also worth noting that the UI here relies on NPM etc.) However, this is more of a POC and I think (?) I could pretty easily write a bash-based solution, or we could use a different tool. (I've actually never used gnuplot before - I just chose it as it gives you ASCII charts.)

I would highly prefer a pure Java solution for the graphing/pictures

I agree. However, the only way I can see this working (which I think you're alluding to as well) is

we create a Java runner to take care of e.g. switching commits, rebuilding, saving stats in memory, displaying, etc. I don't know enough Java to comment on how easy that'd be, but I think this is probably the 'best' solution.
or we leave the current method for generating the measurement* files, but use something else (Java or html/js) for viewing the graphs. I don't like this approach as it complicates the workflow etc.
stick with status quo, and wait and see if others want more than what we currently have.

Totally forgot: thanks

Thanks = ) I enjoy weird little puzzles like this.

karussell · 2016-12-05T10:28:52Z

It's also worth noting that the UI here relies on NPM etc.

NPM is something we already require for the web UI, also as you said we can load measurement files from disc directly or even reject the entire PR as not really helpful. What do you mean with 'etc'?

Kind of - it works without, and you get notified if you don't have it.

Would you still mind removing it? Maybe you can create a separate script for your own usage?

Keep in mind: everything we add needs to be maintained (somehow, by someone) and for that reason I would try to reduce complexity - at least for now. (step by step approach :))

stick with status quo, and wait and see if others want more than what we currently have.

Probably that is the best. Still we need to figure out how to improve step by step, with a clear vision of the final goal (historic overview vs. developer trial and also regarding Java plotter).

kodonnell · 2016-12-06T18:39:11Z

What do you mean with 'etc'?

Just my shorthand way for expanding the way you did above.

Would you still mind removing it?

No worries at all - it was more a test than anything else. All removed.

everything we add needs to be maintained

Good point - I don't think I've fully appreciated that.

Probably that is the best

Agreed.

karussell

I'll merge this here now.

btw: afterwards I'll start to merge map matching repo with the core repo

karussell · 2016-12-20T17:20:58Z

Would you squash into as few commits as necessary for you (in your opinion)?

kodonnell · 2016-12-20T18:23:31Z

btw: afterwards I'll start to merge map matching repo with the core repo

Great - hope it isn't too painful!

Would you squash into as few commits as necessary for you (in your opinion)?

I haven't done this before sorry, and on the merge button options it says that 'squash and merge' is not enabled for this repository? Any tips?

karussell · 2016-12-20T20:42:21Z

I haven't done this before sorry, and on the merge button options it says that 'squash and merge' is not enabled for this repository? Any tips?

You can do it either with your IDE or with the command line: git reset --soft HEAD~X where X is the number of commits or use git rebase -i HEAD~X if you want to e.g. concatenate the existing commit messages.

I've explicitly disabled squashing for merges here to avoid problems like lost authorship etc.

karussell · 2016-12-20T20:48:53Z

Ah and afterwards you will have e.g. just one commit different on your branch issue_68, then do a git push -f which does a force push to the issue_68 branch. Note: you should never ever do a forced push if you share the branch with someone (will cause them trouble especially if they have unpushed commits)! But doing it here is good and afterwards we can merge it easily into master.

If that is too much for the beginning I can do it :)

kodonnell · 2016-12-20T21:36:38Z

If that is too much for the beginning I can do it :)

Yes please! It looked simple enough, except I've also got merges in there, and for now I'd appreciate if you dealt with those, if it makes sense. (If it were a private repo, I might play, but I don't want to muck this up for others.) I'm happy for you to squash it all into a single commit (if possible). Thanks!

karussell · 2016-12-20T21:47:50Z

Ah, yes. Then I do it via a workaround which I found somewhere:

# update master to latest origin/master
git checkout master
git pull
# now merge the branch into the master but reset afterwards
git merge somebranch
git reset origin/master

Now the merge is reverted but all changes are still in the working tree and we can commit all changes on master or if on the other branch then do:

# create a fresh branch with the same name
git branch -D somebranch
git checkout -b somebranch
# now commit the changes with the original author but your 'hash'
git commit -m "one commit" --author "Old Author Name <[email protected]>"
git push -f

karussell · 2016-12-20T21:54:12Z

Ok, decided to go the simple way and enable squashed merges temporarily ... I hope this was okay?

kodonnell · 2016-12-20T22:03:22Z

Doesn't worry me - looks like I'm still down as contributing. Thanks @karussell.

karussell added architecture improvement labels Oct 5, 2016

kodonnell mentioned this pull request Oct 6, 2016

Improve generic map-matching tests #74

Open

kodonnell added 10 commits October 15, 2016 08:14

initial work based on graphhopper Measurement.java

eb4ed17

getting multiple commits working ...

8a587be

getting multiple commits working ...

d60985b

ok, seems to be working ...

0b34d2d

fixed merge while rebasing

ee586c4

tidying up

77818af

formatting msg

727ad77

formatting code

b08edf8

rats, committed on the wrong branch ...

0373ee5

tab to space ...

48cd988

karussell reviewed Oct 14, 2016

kodonnell added 4 commits October 16, 2016 07:12

merging master

6a62784

updated to new API change, plus fixed test bug

696820c

fixed infinite loop

3f3829a

OK, perf tests working correctly ... hopefully

d540a14

karussell reviewed Nov 14, 2016

karussell mentioned this pull request Nov 19, 2016

Reduce memory usage via using max visited nodes for collection initialization size #80

Closed

kodonnell mentioned this pull request Nov 23, 2016

route cache for performance? #81

Closed

kodonnell added 4 commits December 4, 2016 17:56

playing with graphs

173bed8

show charts one by one

5441639

show charts one by one

2a990e1

Merge branch 'issue_68' of github.com:kodonnell/map-matching into issue_68

758529c

kodonnell added 2 commits December 4, 2016 20:25

woops, fixing trap

c38265d

clear charts

38790e4

kodonnell added 2 commits December 7, 2016 07:21

removing gnuplots

028add7

tidied up doco a bit

9d78657

This was referenced Dec 7, 2016

correctly utilise dijkstra one to many #82

Closed

map matching 'quality' test suite #89

Open

karussell approved these changes Dec 20, 2016

karussell merged commit 3cd6476 into graphhopper:master Dec 20, 2016

karussell added this to the 0.9 milestone Dec 20, 2016

kodonnell mentioned this pull request Dec 24, 2016

Penalize inner-link U-turns (builds on #83) #88

Merged

karussell mentioned this pull request Dec 31, 2016

Add measurement suite for matching and lookup of some samples #68

Closed
