Describing 3D swiping hand movement as a curve on an XY plane
I'm trying to describe a 3D swipe gesture (only vertical or horizontal, no diagonals) above a given flat surface, using conventional geometry or similar non-machine-learning techniques as far as possible (hidden Markov models, artificial neural networks etc. are therefore excluded). From multiple observations of the data retrieved from the device I concluded that a swipe can "easily" be described as a curve (or, in some cases, as an almost straight line). With this question I would like to know how a curve and curved movement can be described in simple geometric terms in the most efficient way (mostly speed-wise, but also memory-wise).
The post is divided into two parts - one that gives information on the data that is used and one that gives an overview of what I have come up with so far. Sorry in advance for my poor Paint skills. :D
The 3D position data
The device I'm using streams 3D points each representing the hand's position at a given point in time. I can capture and evaluate these. The following image visualizes the plot of the data from two different perspectives - top-down and isometric (more or less):
- XY plane view (on the left, aka top-down view) - for each sample only the values along the X and Y axes are taken into consideration. This view represents the surface of the device above which the movement of the hand is detected
- XYZ view (on the right, aka isometric view) - for each sample all three axes are taken into consideration. This view represents the full 3D movement in a volume above the device surface which defines the space where gestures can be detected
In the next image I have added the movement of the hand as detected by the device:
The actual movement looks more like this:
Comparing the actual movement with the one detected by the device, I can mark almost half of the samples the device has given me as invalid. These are the border values (along each axis a position can be between 0 and 65534), which do not describe the actual movement of the hand from the perspective of the user of the device. In the image below the invalid data is the part of the trajectory covered by a polygon:
Of course sometimes the "valid" portion of the trajectory is rather small compared to the invalid data:
The algorithm I describe below doesn't care how much valid data there is, as long as at least 2 samples are not border positions, meaning that neither X nor Y equals 0 or 65534. An issue arises from this, which I will elaborate on in the next part of this post.
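The validity rule above can be sketched in a few lines of Python. This is only an illustration: the names `Sample`, `is_border` and `valid_samples` are made up for this example, and it assumes that only X and Y decide whether a sample is a border position (Z is ignored).

```python
from typing import List, Tuple

Sample = Tuple[int, int, int]  # (x, y, z); hypothetical layout of one device sample

AXIS_MIN = 0
AXIS_MAX = 65534  # upper bound of each axis, as stated in the post


def is_border(sample: Sample) -> bool:
    """True if X or Y lies on the edge of the sensing area (Z is ignored here)."""
    x, y, _z = sample
    return x in (AXIS_MIN, AXIS_MAX) or y in (AXIS_MIN, AXIS_MAX)


def valid_samples(samples: List[Sample]) -> List[Sample]:
    """Keep only the samples whose X and Y are both strictly inside the range."""
    return [s for s in samples if not is_border(s)]


def has_enough_valid_data(samples: List[Sample]) -> bool:
    """At least 2 valid samples are needed to form one direction vector."""
    return len(valid_samples(samples)) >= 2
```

The filter is a single pass over the samples, so it should cost very little compared to the 200-300 ms recognition budget.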
Describing the movement
I have given it some thought and this is what I came up with:
Extract only the set of valid samples, that is, exclude all samples that have a border position
For each sample generate a local XY coordinate system which is aligned with the XY coordinate system of the surface of the device (to make things easier :)):
Next I'm thinking of calculating the vector between the current sample and the next one (if present) and then the angle between that vector and the X axis (this can also be done with the Y axis):
Using the magnitude of each angle I can determine whether the movement between the current and the next sample leans more towards horizontal or vertical, and also in which direction.
This should allow me to determine the general direction of the swipe movement as well as how it is positioned above the surface. I have done a lot of swiping :D but since I want to describe this in a more formal way, I need a way to describe and classify a curve based on its properties. Perhaps calculate the curvature of the whole trajectory?
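The angle-per-segment idea from the steps above could look roughly like this. It is a minimal sketch with several assumptions that are not part of the original data description: the points are already filtered XY pairs, +Y is treated as "up", the 45 degree sector boundaries are an arbitrary choice, and the overall direction is picked by a simple majority vote over the segments.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) of a valid sample


def segment_angles(points: List[Point]) -> List[float]:
    """Angle in degrees between each consecutive segment and the +X axis."""
    return [
        math.degrees(math.atan2(y1 - y0, x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if (x0, y0) != (x1, y1)  # repeated samples carry no direction
    ]


def classify_segment(angle_deg: float) -> str:
    """Map one angle to 'right', 'up', 'left' or 'down' using 90 degree sectors."""
    a = angle_deg % 360.0
    if a < 45.0 or a >= 315.0:
        return "right"
    if a < 135.0:
        return "up"
    if a < 225.0:
        return "left"
    return "down"


def classify_swipe(points: List[Point]) -> str:
    """Majority vote over all segments; 'unknown' if no segment is usable."""
    votes: Dict[str, int] = {}
    for angle in segment_angles(points):
        label = classify_segment(angle)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get) if votes else "unknown"
```

`math.atan2` handles all four quadrants and vertical segments without a division, which is why it is used here instead of computing the angle from a dot product. Weighting each vote by segment length would be a possible refinement if the sample spacing is uneven.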
There are of course some issues with this algorithm that came to my mind:
Since the user swipes in full 3D space and doesn't just rub their finger across a surface, it might happen that they make a swipe movement with the hand and ALL the values are border values. I'm thinking of handling this by simply introducing two cases for a swipe gesture:
border gesture - all samples have border position. I've noticed that when doing such a swipe movement the samples are placed along only one of the borders. This makes things easier since I don't have to think about the case where all the samples have border position but are also distributed along 2 or more borders (in which case I would actually have two portions of the trajectory in a 90deg different direction to each other). This gesture is also not a curve but a straight line so evaluating it is much, much easier (note that in the isometric view the samples are all glued to the wall of the volume that is closer to the user with all Y values equal to 0):
non-border gesture - a portion or even all samples have position different from a border position. In this case I can exclude all samples aligned at the border and extract only those that are really part of a curve
- Efficiency - since I'm doing relatively simple calculations it should be fast enough (for example, a swipe gesture should be recognized within 200-300 ms in order to provide smooth interaction). Memory-wise things also look relatively good.
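The split into border and non-border gestures described above could be sketched as follows. The border names (`x_min`, `y_max`, ...) and function names are invented for this example, and it relies on the observation from the post that a border gesture sticks to a single border. As before, +Y is assumed to mean "up".

```python
from typing import List, Optional, Set, Tuple

Sample = Tuple[int, int, int]  # (x, y, z); hypothetical sample layout
AXIS_MIN, AXIS_MAX = 0, 65534


def touched_borders(sample: Sample) -> Set[str]:
    """Names of the XY borders a sample lies on (empty set if none)."""
    x, y, _z = sample
    borders = set()
    if x == AXIS_MIN:
        borders.add("x_min")
    if x == AXIS_MAX:
        borders.add("x_max")
    if y == AXIS_MIN:
        borders.add("y_min")
    if y == AXIS_MAX:
        borders.add("y_max")
    return borders


def gesture_kind(samples: List[Sample]) -> str:
    """'border' if every sample lies on a border, otherwise 'non-border'."""
    if not samples:
        raise ValueError("no samples")
    return "border" if all(touched_borders(s) for s in samples) else "non-border"


def border_swipe_direction(samples: List[Sample]) -> Optional[str]:
    """Direction of a border gesture, or None if it is not on exactly one border."""
    if not samples:
        return None
    common = set.intersection(*(touched_borders(s) for s in samples))
    if len(common) != 1:
        return None
    if next(iter(common)).startswith("y"):
        # Glued to a Y border, so the hand can only move along X.
        delta = samples[-1][0] - samples[0][0]
        return "right" if delta > 0 else "left" if delta < 0 else None
    # Glued to an X border, so the hand can only move along Y.
    delta = samples[-1][1] - samples[0][1]
    return "up" if delta > 0 else "down" if delta < 0 else None
```

For a border gesture only the first and last samples are compared, since the trajectory is a straight line along the border anyway.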
I've searched online before I started thinking of creating the algorithm I've described above but couldn't find anything. Even the topic of classifying curves seems to be either not that popular or the search terms I've used are too broad/restricting. The classification here is not that essential imho (unlike what follows) but it would still be nice to be able to split the resulting curves in sets each representing a swipe gesture.
The next thing I have been thinking about is curve fitting. I have read articles about this, but frankly, besides a couple of tasks during the math course at my university, I haven't given it much thought except for Bezier curves. Can anyone tell me if curve fitting is a plausible solution for my case? Since it's curve fitting, one would be right to guess that we need some initial curve to fit against. This would require gathering swipe movements and then extracting a possible optimal curve, which is something of an "average" of all curves for a given swipe. I can use the first algorithm I have described above to get a compact description of a curve and then store and analyze multiple curves for a given swipe in order to get the "perfect" curve. How does one proceed when handling curve classification?
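To make the "average curve" idea concrete, here is one possible sketch. Note that this is not curve fitting in the strict sense: it swaps in a simpler technique, resampling every recorded trajectory to the same number of points along its arc length and averaging them point by point. The function names and the default of 16 points are arbitrary choices for illustration, and the sketch assumes all recordings share the same coordinate frame (no translation or scale normalization is done).

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def resample(points: List[Point], n: int = 16) -> List[Point]:
    """Return n points spaced evenly along the arc length of a polyline."""
    if len(points) < 2 or n < 2:
        raise ValueError("need at least 2 points and n >= 2")
    cum = [0.0]  # cumulative arc length at each input point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        raise ValueError("trajectory has zero length")
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else min(1.0, (target - cum[seg]) / span)
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def average_curve(curves: List[List[Point]], n: int = 16) -> List[Point]:
    """Point-wise mean of several trajectories after resampling each to n points."""
    resampled = [resample(c, n) for c in curves]
    k = len(resampled)
    return [
        (sum(c[i][0] for c in resampled) / k, sum(c[i][1] for c in resampled) / k)
        for i in range(n)
    ]


def mean_distance(curve: List[Point], template: List[Point]) -> float:
    """Average point-to-point distance between a curve and a template."""
    a = resample(curve, len(template))
    return sum(
        math.hypot(px - qx, py - qy) for (px, py), (qx, qy) in zip(a, template)
    ) / len(template)
```

A new swipe could then be assigned to the template with the smallest `mean_distance`. Every template costs only `n` points of storage, and a comparison is linear in `n`.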