Arm Gesture Classification for Human-Robot Interaction -- Pattern Recognition Dataset

The Robotic Life Group has a Vicon motion capture system, and we are offering a dataset of arm gestures for use in classification problems. The gestures are intended for mixed-reality communication and interaction with an on-screen animated robot character (shown in the figure). If you decide to work on this problem, we'll have you come to the lab and see exactly how the data was collected so you can better understand the dataset you're working with.

The dataset has 15 classes of gestures (3 gestures at 5 locations). Each gesture is a sequence of frames captured by the Vicon system from a person wearing several markers on their right arm. We have two kinds of features from the Vicon markers: the raw marker positions (x,y,z) and higher-level skeletal information (e.g., wrist, elbow, ...). Ideally we would like to classify gestures using only the raw markers, but that is a much more challenging problem because there is no correspondence between the raw points from one frame to the next. To classify based on just the raw points, it might be necessary to derive higher-level features (moments) from the bag of data points in each frame.
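
As one illustration of the "moments" idea, the sketch below summarizes a frame's raw points by their mean and covariance, which do not depend on the order in which the points are listed. The function name frame_moments and the choice of first and second moments are assumptions made for this example, not a prescribed feature set.

```python
import numpy as np


def frame_moments(points):
    """Order-invariant summary of one frame's raw marker positions.

    points: array-like of shape (n_markers, 3) holding (x, y, z) rows.
    Returns a 9-vector: the mean (3 values) followed by the upper
    triangle of the 3x3 covariance matrix (6 values).
    """
    pts = np.asarray(points, dtype=float).reshape(-1, 3)
    mean = pts.mean(axis=0)                  # first moment
    centered = pts - mean
    cov = centered.T @ centered / len(pts)   # second central moment
    upper = np.triu_indices(3)               # xx, xy, xz, yy, yz, zz
    return np.concatenate([mean, cov[upper]])
```

Because the result is the same for any ordering of the rows, it can be computed per frame without knowing which raw point belongs to which marker.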

Data Description

Download: gesture data

This directory has several files representing 15 classes of gestures. Currently there are 3 or 4 examples of each class for exploration, and over the next week there will be about 100 examples of each of the 15 classes.
class1 = grasping to location 1
class2 = grasping to location 2
class3 = grasping to location 3
class4 = grasping to location 4
class5 = grasping to location 5
class6-10 = pointing to location 1-5
class11-15 = pushing down at location 1-5
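
The numbering above follows a simple pattern: five consecutive class numbers per gesture, one per location. A minimal sketch of that mapping is shown here; the names GESTURES, class_number, and class_label are hypothetical helpers for this example.

```python
# Gesture order taken from the class list: classes 1-5, 6-10, 11-15.
GESTURES = ("grasping", "pointing", "pushing down")
NUM_LOCATIONS = 5


def class_number(gesture, location):
    """Map a gesture name and a location (1-5) to a class number (1-15)."""
    if not 1 <= location <= NUM_LOCATIONS:
        raise ValueError(f"location must be 1..{NUM_LOCATIONS}, got {location}")
    return GESTURES.index(gesture) * NUM_LOCATIONS + location


def class_label(number):
    """Inverse mapping: class number (1-15) to (gesture, location)."""
    if not 1 <= number <= len(GESTURES) * NUM_LOCATIONS:
        raise ValueError(f"class number out of range: {number}")
    gesture_idx, loc_idx = divmod(number - 1, NUM_LOCATIONS)
    return GESTURES[gesture_idx], loc_idx + 1
```

For example, class_number("pointing", 3) gives 8, matching "class6-10 = pointing to location 1-5".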

Each file contains one example of one of the three gestures at one of the five locations. The file names indicate the gesture/location class with the following syntax: (gesturename)L(locationnumber)(examplenumber).pi
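
A file name in this syntax could be split into its parts roughly as follows. This is only a sketch: it assumes the gesture name consists of letters, that the location number is a single digit from 1 to 5, and that the example number is whatever digits follow it. The name "grasp" in the usage note is a made-up gesture name, since the actual names used in the directory are not listed here.

```python
import re
from typing import NamedTuple


class ExampleName(NamedTuple):
    gesture: str    # gesture name as it appears in the file name
    location: int   # 1-5
    example: int    # example number


# Assumed pattern: letters, then 'L', one location digit, example digits, '.pi'
_NAME_RE = re.compile(
    r"^(?P<gesture>[A-Za-z]+)L(?P<location>[1-5])(?P<example>\d+)\.pi$"
)


def parse_example_name(filename):
    """Split '(gesturename)L(locationnumber)(examplenumber).pi' into parts."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return ExampleName(
        match["gesture"], int(match["location"]), int(match["example"])
    )
```

Under these assumptions, parse_example_name("graspL23.pi") would be read as gesture "grasp", location 2, example 3.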

The first line of data in each file gives the length of the example (number of frames). Each line after that is one frame of data.

Each frame has several vectors separated by ';', and each vector is a comma-separated (x,y,z) position.

The first 5 vectors in each frame are the higher-level skeletal features. The remaining vectors in each frame are the lower-level data, the raw marker positions.
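
Putting the format together, a reader for one example might look like the sketch below. It assumes the first line is a bare integer, tolerates a trailing ';' and blank lines, and allows the number of raw markers to differ from frame to frame; the function names parse_frame and parse_example are hypothetical.

```python
import numpy as np

NUM_SKELETAL = 5  # the first 5 vectors of each frame are skeletal features


def parse_frame(line):
    """Parse one frame line into (skeletal, raw) arrays of (x, y, z) rows."""
    # Drop empty pieces so a trailing ';' does not produce a bogus vector.
    vectors = [v for v in line.strip().split(";") if v.strip()]
    pts = np.array([[float(c) for c in v.split(",")] for v in vectors])
    if pts.ndim != 2 or pts.shape[1] != 3:
        raise ValueError(f"expected (x,y,z) vectors, got: {line!r}")
    return pts[:NUM_SKELETAL], pts[NUM_SKELETAL:]


def parse_example(text):
    """Parse a whole example file given as a string.

    Returns a list with one (skeletal, raw) pair per frame and checks the
    frame count announced on the first line.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_frames = int(lines[0].strip())
    frames = [parse_frame(ln) for ln in lines[1:]]
    if len(frames) != n_frames:
        raise ValueError(f"header says {n_frames} frames, found {len(frames)}")
    return frames
```

A file could then be loaded with parse_example(open(path).read()), where path points at one of the downloaded example files.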