create account

How to compare GPS tracks by bitcalm

View this thread on: hive.blogpeakd.comecency.com
· @bitcalm ·
$17.23
How to compare GPS tracks
I've been using my Garmin Forerunner 210 to record my running progress. Like many gadgets it has a built in GPS receiver, allowing the watch to not only record how long my run is, but also where I run.

Running the same route every time can be really boring. To keep things interesting, I take slight detours on my routes. Then a crazy idea popped into my head: can I automatically compare GPS tracks to see where they differ?

![](http://steemit.bitcalm.mm.st/posts/compare-gps/thun_1_raw.png)
![](http://steemit.bitcalm.mm.st/posts/compare-gps/thun_2_raw.png)

# Aligning Protein Sequences

[Aligning protein sequences][1] is really useful. It allows bioinformaticians to find regions of similarity which may be a consequence of functional, structural, or evolutionary relationships between the sequences.

    Sequence 1: TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
    Sequence 2: TCCPSIVARRSNFNVCRLTPEAICATYTGCIIIPGATSPGDYAN
    
When the above sequences are aligned, gaps are shown to indicate insertions and deletions.

    Alignment 1: TTCCPSIVA-RSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
    Alignment 2: -TCCPSIVARRSNFNVCRL--TPEAICATYTGCIIIPGATSPGDYAN

One of the earliest methods for aligning sequences is the [Needleman-Wunsch algorithm][2], published in 1970.

At the heart of the algorithm is the scoring system, which is used to determine how a pair of residues should align. The higher the score for a given combination, the more it is biased in the resulting alignment.

A gap penalty is also applied during alignment. If the gap penalty is greater than the score for aligning two residues, a gap is introduced in the alignment instead.

You can use any scoring system you like, depending on the problem you're trying to solve. [BLOSUM][3] is a similarity matrix commonly used as the scoring system to align protein sequences. It takes into account the liklihood that a residue will mutate into another based on what has already been observed in nature.

# Aligning GPS tracks

How many of you skipped the bit about protein sequences? Come on, be honest. It's okay, I don't mind, but you might be confused if you don't read that first.

Hopefully you can already see the connection: instead of aligning residues in protein sequences, we'll be aligning co-ordinates in GPS tracks. The bits that don't align, the gaps, will be the sections of the tracks that differ.

But what scoring system and gap penalty should be used?

When we compare tracks by eye, we look at the distance between points. We can use the same metric to score GPS points in the alignment; however, the Needleman-Wunsch algorithm considers higher values a match.

Higher distances mean the points are further apart and therefore non-matching, so we need to use the negative distance as the score, which results in larger distances having smaller values.

The gap penalty works as the cut-off for when the distance between two points is considered large enough that they don't represent the same location. As the distance is always negative, the gap penalty distance must also be negative.

There is no universal value for this cut-off. It depends on the quality of the data and the context.

If the GPS device is accurate to 5 metres, setting the cut-off to 2 metres will result in a noisy alignment. Likewise, if the tracks being compared are from planes, you might consider them on the same flight path if they are within 200 metres of one another; for tracks of people walking, this distance will be far too large.

So what happens if we use all this information to align two GPS tracks stored in a GPX file and plot them on a map, showing the matched points in blue and the differences in red and orange?

![](http://steemit.bitcalm.mm.st/posts/compare-gps/thun_raw.png)

It's not bad, but also not perfect. It's found the major detour at the bottom-left of the map (in red), but there are also a lot of incorrect mismatches spread throughout the tracks (in orange). What's causing this?

The GPS points in the two tracks are not evenly distributed, resulting in points that are far apart being compared when the density in a section of one track differs greatly from the other. If you look at the locations of the orange points in the alignment, they match very closely to the low-density regions in the plot of the first track shown in the introduction.

This is fixed by pre-processing the GPX files to distribute GPS points evenly over the track. The closer together the points, the more accurate the result, but the longer the alignment will take.

So how does it look now? Amazing!

![](http://steemit.bitcalm.mm.st/posts/compare-gps/thun_even.png)

# Show me the code

The implementation of the Needleman-Wunsch algorithm, adapted for GPS tracks, is shown below:

```
def align_tracks(track1, track2, gap_penalty):
    """ Needleman-Wunsch algorithm adapted for gps tracks. """

    _log.info("Aligning tracks")

    def similarity(p1, p2):
        d = gpxpy.geo.distance(p1.latitude, p1.longitude, p1.elevation,
                               p2.latitude, p2.longitude, p2.elevation)
        return -d

    # construct f-matrix
    f = numpy.zeros((len(track1), len(track2)))
    for i in range(0, len(track1)):
        f[i][0] = gap_penalty * i
    for j in range(0, len(track2)):
        f[0][j] = gap_penalty * j
    for i in range(1, len(track1)):
        t1 = track1[i]
        for j in range(1, len(track2)):
            t2 = track2[j]
            match = f[i-1][j-1] + similarity(t1, t2)
            delete = f[i-1][j] + gap_penalty
            insert = f[i][j-1] + gap_penalty
            f[i, j] = max(match, max(delete, insert))

    # backtrack to create alignment
    a1 = []
    a2 = []
    i = len(track1) - 1
    j = len(track2) - 1
    while i > 0 or j > 0:
        if i > 0 and j > 0 and \
           f[i, j] == f[i-1][j-1] + similarity(track1[i], track2[j]):
            a1.insert(0, track1[i])
            a2.insert(0, track2[j])
            i -= 1
            j -= 1
        elif i > 0 and f[i][j] == f[i-1][j] + gap_penalty:
            a1.insert(0, track1[i])
            a2.insert(0, None)
            i -= 1
        elif j > 0 and f[i][j] == f[i][j-1] + gap_penalty:
            a1.insert(0, None)
            a2.insert(0, track2[j])
            j -= 1
    return a1, a2
```

There are more examples, along with the code to generate your own alignments, in the [jonblack/cmpgpx][4] repository on GitHub. It's in the public domain, too, so do what you please with it. Yay!

---

This was originally [posted on my blog](http://www.humblecoder.com/how-to-compare-gps-tracks/).

[1]: https://en.wikipedia.org/wiki/Sequence_alignment
[2]: https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm
[3]: https://en.wikipedia.org/wiki/BLOSUM
[4]: https://github.com/jonblack/cmpgpx
👍  , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
properties (23)
authorbitcalm
permlinkhow-to-compare-gps-tracks
categoryprogramming
json_metadata{"tags":["programming","coding","science","bioinformatics","algorithms"],"image":["http://steemit.bitcalm.mm.st/posts/compare-gps/thun_1_raw.png"]}
created2016-08-04 19:34:45
last_update2016-08-04 19:34:45
depth0
children5
last_payout2016-09-04 16:42:33
cashout_time1969-12-31 23:59:59
total_payout_value13.322 HBD
curator_payout_value3.905 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length6,800
author_reputation24,919,530,803,138
root_title"How to compare GPS tracks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd10,000
post_id608,320
net_rshares8,175,877,502,291
author_curate_reward""
vote details (40)
@gunpower ·
properties (23)
authorgunpower
permlinkhow-to-compare-gps-tracks
categoryprogramming
json_metadata""
created2016-08-04 19:35:33
last_update2016-08-04 19:35:33
depth1
children0
last_payout2016-09-04 16:42:33
cashout_time1969-12-31 23:59:59
total_payout_value0.000 HBD
curator_payout_value0.000 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length37
author_reputation-1,328,386,749,651
root_title"How to compare GPS tracks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd10,000
post_id608,329
net_rshares335,227,481
author_curate_reward""
vote details (41)
@isaac.asimov ·
Flesch Kincaid Grade Level
Hi! This post has a <a href="https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests">Flesch-Kincaid</a> grade level of 8.9 and reading ease of 68%. This puts the writing level on par with Leo Tolstoy and David Foster Wallace.
properties (22)
authorisaac.asimov
permlinkre-how-to-compare-gps-tracks-20160804t193534
categoryprogramming
json_metadata""
created2016-08-04 19:35:33
last_update2016-08-04 19:35:33
depth1
children0
last_payout2016-09-04 16:42:33
cashout_time1969-12-31 23:59:59
total_payout_value0.000 HBD
curator_payout_value0.000 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length241
author_reputation-982,572,424,326
root_title"How to compare GPS tracks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd10,000
post_id608,332
net_rshares0
@jedau ·
I've been trying to break into bioinformatics (to no avail) for some time now, so I really appreciate this example. I really appreciate that the code you posted is written in Python, because that's what I'm heavily into these days. There are tons of applications for your idea especially for traffic management and rerouting.

Based on the problem you posited, my first thought for a solution was similar to how noise from images are removed. Since we're talking about coordinates on a 2d map, I feel that it could be used in conjunction with your sequencing algorithm. I've had more experience (and success) dealing with images that genetic sequencing, so my stand is a bit biased.

Here's another resource I found that discussed a different algorithm for the same problem: http://gis.stackexchange.com/questions/81551/matching-gps-tracks

When you told me that you weren't in the field of psychology, I didn't imagine that we belong in the same field. Though I bet you're miles ahead from where I'm currently at.
properties (22)
authorjedau
permlinkre-bitcalm-how-to-compare-gps-tracks-20160805t054431747z
categoryprogramming
json_metadata{"tags":["programming"],"links":["http://gis.stackexchange.com/questions/81551/matching-gps-tracks"]}
created2016-08-05 05:44:30
last_update2016-08-05 05:44:30
depth1
children2
last_payout2016-09-04 16:42:33
cashout_time1969-12-31 23:59:59
total_payout_value0.000 HBD
curator_payout_value0.000 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length1,014
author_reputation50,429,040,590,557
root_title"How to compare GPS tracks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd10,000
post_id617,611
net_rshares0
@bitcalm ·
The book [An Introduction to Bioinformatics Algorithms](https://www.amazon.com/Introduction-Bioinformatics-Algorithms-Computational-Molecular/dp/0262101068) is a good place to start. The author also has a course on Coursera.

It's funny you mention image noise removal because Needleman-Wunsch is also used for that, particularly with stereoscopic 3D images, where one side can be considered a mutated form of the other.

There are lots of other possibilities as well.  Needleman-Wunsch is a global alignment algorithm which finds the optimal solution. With two sequences/tracks this isn't a problem, but if you want to compare a track to a database, it would take too long. To solve this problem, a local alignment algorithm would be much more efficient but not optimal. A popular algorithm for this is BLAST (Basic Local Alignment Search Tool). I've been meaning to implement this for GPS tracks but haven't found the time.

The psychology articles are things I'm interested in, plus it helps that my girlfriend is a psychologist :)
properties (22)
authorbitcalm
permlinkre-jedau-re-bitcalm-how-to-compare-gps-tracks-20160805t055322406z
categoryprogramming
json_metadata{"tags":["programming"],"links":["https://www.amazon.com/Introduction-Bioinformatics-Algorithms-Computational-Molecular/dp/0262101068"]}
created2016-08-05 05:53:18
last_update2016-08-05 05:53:18
depth2
children1
last_payout2016-09-04 16:42:33
cashout_time1969-12-31 23:59:59
total_payout_value0.000 HBD
curator_payout_value0.000 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length1,034
author_reputation24,919,530,803,138
root_title"How to compare GPS tracks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd10,000
post_id617,689
net_rshares0
@jedau ·
I've actually enrolled in the Coursera specialization. I didn't get the verified course, but I passed all the courses. It's hard to break into the industry though. Having a terrible time finding (remote) employment in the field.

I've actually used BLAST in genetic sequencing, so I already have a brief notion of its capabilities for comparing GPS tracks.
properties (22)
authorjedau
permlinkre-bitcalm-re-jedau-re-bitcalm-how-to-compare-gps-tracks-20160805t061241119z
categoryprogramming
json_metadata{"tags":["programming"]}
created2016-08-05 06:12:42
last_update2016-08-05 06:12:42
depth3
children0
last_payout2016-09-04 16:42:33
cashout_time1969-12-31 23:59:59
total_payout_value0.000 HBD
curator_payout_value0.000 HBD
pending_payout_value0.000 HBD
promoted0.000 HBD
body_length356
author_reputation50,429,040,590,557
root_title"How to compare GPS tracks"
beneficiaries[]
max_accepted_payout1,000,000.000 HBD
percent_hbd10,000
post_id617,881
net_rshares0