Current Profiling Dataset for MAS.622j Fall 2006

Source Data

This dataset was generated by processing current and voltage time series measurements of consumer electronics, such as laptop power supplies. In particular, a Plug was used to measure the input voltage and the current drawn by each device. All measurements were taken at 8 kHz for at least one minute.

Features

The current data has been processed to compute a 256-bin power spectral density (PSD) over intervals of approximately one second (8192 samples, or 1.024 s at 8 kHz) of time series data. The time series voltage and current data used to generate the PSDs are also included. The time series data could be used, for example, to determine the phase difference between the voltage and current signals. All of these data are labeled with an identifier unique to the consumer electronics device (or to a particular mode of use of the device) from which the data were taken.

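
For readers who want to recompute features from the raw time series, the following is a minimal sketch and not the procedure actually used to build the dataset. It assumes each 8192-sample frame is split into 16 segments of 512 samples whose Hann-windowed periodograms are averaged, with the Nyquist bin dropped to leave 256 bins; the text does not say how the 256 bins were obtained. The phase estimate assumes a 60 Hz mains fundamental. The function names psd_256 and phase_difference are hypothetical.

```python
import numpy as np

FS = 8000.0    # sampling rate stated in the text (8 kHz)
FRAME = 8192   # samples per PSD frame stated in the text
NBINS = 256    # number of PSD bins stated in the text


def psd_256(current, fs=FS, nbins=NBINS):
    """Averaged-periodogram PSD of one frame (assumed method, see lead-in)."""
    x = np.asarray(current, dtype=float)
    seg_len = 2 * nbins                        # 512 samples per segment
    n_seg = x.size // seg_len                  # 16 segments for 8192 samples
    segs = x[: n_seg * seg_len].reshape(n_seg, seg_len)
    segs = segs - segs.mean(axis=1, keepdims=True)   # remove per-segment DC
    win = np.hanning(seg_len)
    spec = np.fft.rfft(segs * win, axis=1)     # 257 one-sided bins per segment
    # Density scaling without the one-sided factor of 2; only the relative
    # shape matters when the PSD is used as a feature vector.
    power = np.abs(spec) ** 2 / (fs * np.sum(win ** 2))
    psd = power.mean(axis=0)[:nbins]           # drop the Nyquist bin
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)[:nbins]
    return freqs, psd


def phase_difference(voltage, current, fs=FS, mains_hz=60.0):
    """Phase of current relative to voltage at mains_hz, in radians.

    A negative value means the current lags the voltage. mains_hz=60.0 is
    an assumption; it is not stated in the dataset description.
    """
    v = np.asarray(voltage, dtype=float)
    i = np.asarray(current, dtype=float)
    n = min(v.size, i.size)
    t = np.arange(n) / fs
    ref = np.exp(-2j * np.pi * mains_hz * t)   # single-frequency DFT kernel
    v_bin = np.sum(v[:n] * ref)
    i_bin = np.sum(i[:n] * ref)
    return float(np.angle(i_bin * np.conj(v_bin)))
```

The single-frequency estimate is most accurate when the record spans a whole number of mains cycles; over an 8192-sample frame (61.44 cycles at 60 Hz) spectral leakage adds a small error.
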
Project Ideas

1) Use a clustering algorithm to discover classes of devices. Do these classes match easily observable classifications, such as "laptop power supplies", "resistive loads", and "halogen lamps"? How well does the clustering algorithm perform? Can individual devices be identified?

2) Use a dimensionality reduction algorithm to determine the minimum number of dimensions needed for reasonably accurate classification/identification as described above. Determine the performance of the classification/identification algorithm as a function of the number of dimensions. Taking this further, assign a justified cost to each dimension and find the optimal set of dimensions for a given performance metric of the classification/identification algorithm. For example, lower-frequency components of the PSD might cost more than higher-frequency components because of time constraints, whereas the opposite might be true because of memory constraints.
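
Both project ideas can be prototyped on the feature matrix alone. The sketch below is one possible starting point, not a prescribed method. It assumes the PSDs have already been loaded into a NumPy array X of shape (n_frames, 256) with a parallel array of device labels; the layout of the archive is not described here, so loading is left out. K-means with farthest-first initialization, cluster purity as the score, PCA as the reduction step, and a nearest-centroid classifier are all illustrative choices, and every function name is hypothetical.

```python
import numpy as np


def _nearest(X, centers):
    """Index of the closest center (squared Euclidean distance) per row."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-first initialization."""
    X = np.asarray(X, dtype=float)
    # Start at X[0], then repeatedly add the point farthest from all
    # centers chosen so far.
    centers = [X[0]]
    d2 = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        idx = int(d2.argmax())
        centers.append(X[idx])
        d2 = np.minimum(d2, ((X - X[idx]) ** 2).sum(axis=1))
    centers = np.array(centers)
    for _ in range(n_iter):
        assign = _nearest(X, centers)
        updated = centers.copy()
        for j in range(k):
            members = X[assign == j]
            if len(members):          # keep the old center if a cluster empties
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centers):
            break
        centers = updated
    return _nearest(X, centers), centers


def purity(assign, labels):
    """Fraction of frames whose cluster's majority label matches their own."""
    assign, labels = np.asarray(assign), np.asarray(labels)
    hits = 0
    for c in np.unique(assign):
        _, counts = np.unique(labels[assign == c], return_counts=True)
        hits += counts.max()
    return hits / len(labels)


def accuracy_vs_dims(X_train, y_train, X_test, y_test, dims):
    """Nearest-centroid accuracy after projecting onto the top-d PCA axes."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train, y_test = np.asarray(y_train), np.asarray(y_test)
    mean = X_train.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    classes = np.unique(y_train)
    scores = {}
    for d in dims:
        basis = vt[:d].T
        z_train = (X_train - mean) @ basis
        z_test = (X_test - mean) @ basis
        centroids = np.stack([z_train[y_train == c].mean(axis=0) for c in classes])
        pred = classes[_nearest(z_test, centroids)]
        scores[d] = float((pred == y_test).mean())
    return scores
```

A call such as accuracy_vs_dims(X_train, y_train, X_test, y_test, range(1, 33)) gives the accuracy curve asked for in idea 2. Taking the logarithm of the PSD before clustering or projection may be worth trying, since raw power values can span several orders of magnitude.
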

Dataset

current_profile_dataset.tar.gz (76 MB)