Tuesday, 30 May 2017

Faster way to loop through array of points and find if within polygons

I have a Node app which allows users to plot 'events' on dot plot graphs. An event is represented by an array of floats, e.g 1 event may look like this.

[ 242841.86914496083,
1090.0027001134586,
11711.344635666988,
142639.20305005432 ]

Users can plot the 1st element(242841.86914496083) against the 2nd element(1090.0027001134586) on a graph, and for example the 3rd(11711.344635666988) against the 4th(142639.20305005432), and all combinations in between.

In the following example, all values range from 0 to 262,144. So to plot on a 200x200 graph I get the ratio and plot accordingly. A graph of 20 events will look something like this (this graph plots the first element of an event on the x axis against the 2 element of the same event on the y axis):

enter image description here

My users can then do what they call a 'gate'. They can draw a polygon around certain events to isolate these events. For example here, I've draw a gate around these events, colored the events in the gate red, and I'm only showing these events on the graph:

enter image description here

You can see i have 'gated' 12 events. Now I'd like to see these gated events plotted with the 4th element on the x axis and the 3rd element on the y axis:

enter image description here

Now I'd like to draw another gate on this. This new gate will be a child of the previous gate i.e. an event considered in this gate is also in the first gate:

enter image description here

Now on my original graph (with the 1st element plotted on x axis and 2nd on y axis), I want to see all the events colored correctly. I.e non-gated events will be white, events that are in both gates will be green, and events in the first gate will be red. The result is this:

enter image description here

Its important to notice that the gates were made on different elements: the first gate was made on the 1st and 2nd elements of an event, and the 2nd on the 3rd and 4th.

The issue I'm having is that some users are uploading files with 800,000 events. My algorithms work well for anything below 200,000 but then everything becomes very slow. I have made a detailed plunker here.

My algorithm is based on getting an array of gate 'chains'. Each of these chains is an array. If an event is within the polygon defined in each gate element, then it gets the color of the furthest down child gate. If you look at the console on Plunkr for example here, you'll see what the gate chain array looks like:

enter image description here

So the problem I have is that when looping through the events, I then have to loop through each 'chain' and working out if the event is within all the polygons in the chain. This is really slow after about 200k. At 800k, it can take over a minute.

To check if an event is actually in a polygon, I first check if it's within the bounding box (the rectangle made from getting the lowest and highest x and y points), then I use a small NPM package https://www.npmjs.com/package/point-in-polygon

All the code in plunker.

I know this is complex but I've racked my grains and cant think of how to improve it. Is there a faster way?

Any help greatly appreciated.

In case you missed it (its a long question!) the plunker is here.



via Mark

No comments:

Post a Comment