Mobile Sensing III: Signal Processing for Sensor Data Analysis
Petteri Nurmi, Spring 2015, 17.3.2015
Learning Objectives
Understand the basics of time and frequency domain representations of signals.
Understand the different phases of the sensor data analysis cycle.
Learn the most common preprocessing techniques.
Learn the most common types of features.
Understand how to measure the similarity of sensor measurements.
Signals
In sensor analysis, signals are typically multidimensional functions of time.
Example: 3D accelerometer data, with the x, y, and z axes plotted against time.
Note: the coordinate axes are relative to the orientation of the device.
Signals: Sampling
Continuous signals cannot be collected or stored, so we operate on discrete samples instead.
The sampling rate, measured in Hz, defines how often samples are taken; the sampling interval is its reciprocal.
Example: samples taken within 1 s (magnetometer, 95 Hz).
Signals: Noise
In practice, measurements are noisy versions of the true underlying signal.
Linear model: $z_i = x_i + v_i$, where $v_i$ is the noise, $z_i$ is the observed measurement, and $x_i$ is the true signal value.
The noise is often assumed to be Gaussian, zero-mean, and independent and identically distributed (i.i.d.).
Frames
Measurements are typically processed in frames, i.e., sequences of successive measurements.
Frame width refers to the number of measurements included in a frame.
The sampling rate is not necessarily uniform → use time instead of a measurement count.
The overlap of a frame refers to the fraction of measurements it has in common with the previous frame.
[Figure: example frames with no overlap and with 50% overlap.]
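The framing scheme can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `split_into_frames` is hypothetical, frames are defined by a measurement count rather than by time, and a trailing partial frame is dropped.

```python
def split_into_frames(samples, width, overlap=0.0):
    """Split samples into frames of `width` measurements.

    `overlap` is the fraction of measurements a frame shares with the
    previous frame (0.0 = no overlap, 0.5 = 50% overlap).
    """
    if width <= 0 or not 0.0 <= overlap < 1.0:
        raise ValueError("width must be positive and overlap in [0, 1)")
    # Distance between the starts of two consecutive frames.
    step = max(1, int(round(width * (1.0 - overlap))))
    return [samples[start:start + width]
            for start in range(0, len(samples) - width + 1, step)]


samples = list(range(8))
frames_no_overlap = split_into_frames(samples, width=4)
frames_half_overlap = split_into_frames(samples, width=4, overlap=0.5)
```

With non-uniform sampling, the same idea would apply to timestamps instead of indices.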
Sample Synchronization
Sample 1: A: 1320133789651, M: 1320133789652
Sample 2: A: 1320133789663, M: 1320133789664
Measurements from different sensors often arrive at different times → the samples need to be synchronized.
Consider, e.g., the example above, with two samples from the accelerometer (A) and the magnetometer (M).
Possible ways to achieve sample synchronization:
1. Take the time indices from the sensor with the highest sampling rate and align the measurements from the other sensor with the closest timestamp.
2. Fix a uniform time window size and interpolate values at specific times. This is computationally heavier, but recommended particularly when the sampling rate of a sensor is non-uniform.
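A minimal sketch of the first option (closest-timestamp alignment) could look as follows. The helper name `align_nearest` and the magnetometer readings `"m1"` and `"m2"` are placeholders for illustration; the code assumes the timestamps of the other sensor are sorted and non-empty.

```python
from bisect import bisect_left


def align_nearest(reference_times, other_times, other_values):
    """For each reference timestamp, pick the other sensor's value
    whose timestamp is closest. `other_times` must be sorted."""
    aligned = []
    for t in reference_times:
        pos = bisect_left(other_times, t)
        # Only the neighbours just before and just after t can be closest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(other_times)]
        best = min(candidates, key=lambda i: abs(other_times[i] - t))
        aligned.append(other_values[best])
    return aligned


acc_times = [1320133789651, 1320133789663]   # accelerometer timestamps
mag_times = [1320133789652, 1320133789664]   # magnetometer timestamps
mag_values = ["m1", "m2"]                    # placeholder readings
aligned = align_nearest(acc_times, mag_times, mag_values)
```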
Interpolation
Method of constructing new data within the range of a discrete set of points.
Assumes that two points $(x_0, y_0)$ and $(x_1, y_1)$ (i.e., time and sensor value) are given.
Linear interpolation: $y = y_0 + (x - x_0)\,\frac{y_1 - y_0}{x_1 - x_0}$. Effectively a weighted average, where the weight depends on the distance from the known points.
Spline interpolation: the intervals between two points are modelled with low-order polynomials. The polynomial pieces for the intervals are selected so that they fit smoothly with each other.
Interpolation: Example
t:       0      1     14     23     33
value: -1.59  -1.32  -1.05  -0.98  -0.94
Consider the data given in the table above. What is the value of the signal at t = 10?
Linear interpolation: -1.13, with $y_0 = -1.32$, $y_1 = -1.05$, $x_0 = 1$, $x_1 = 14$, $x = 10$. Equivalently: $\frac{4}{13}\,y_0 + \frac{9}{13}\,y_1$.
Cubic interpolation: -0.83.
[Figure: original data and cubic interpolation.]
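The linear case of this example can be reproduced with a short sketch; the function name `linear_interpolate` is an assumption made for illustration.

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Weighted average of y0 and y1; the weight depends on where x
    lies between x0 and x1."""
    weight = (x - x0) / (x1 - x0)   # 0 at x0, 1 at x1
    return (1.0 - weight) * y0 + weight * y1


value = linear_interpolate(1, -1.32, 14, -1.05, 10)
print(round(value, 2))  # -1.13
```

Here the weight is 9/13, so the result equals 4/13 * y0 + 9/13 * y1.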
Time vs. Frequency Domain
Thus far we have looked at the time-domain representation of signals: time on the x-axis, value of the signal on the y-axis.
The alternative is to look at the frequency domain of the signal, so-called spectral analysis.
The function is decomposed into sinusoid components.
The analysis looks at the frequency, phase, and amplitude of these components.
Fourier Transform
Transformation that decomposes a function $f(x)$ into the frequencies that make it up.
Captures the energy of the signal within each frequency.
When $x$ is time (seconds), the output represents frequency (in hertz).
Essential for capturing periodicity in signals.
Invertible transform, i.e., the original signal can be recovered from the transformed signal.
Fourier transform: $\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \xi}\, dx$
Inverse Fourier transform: $f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)\, e^{2\pi i x \xi}\, d\xi$
Discrete Fourier Transform (DFT)
Fourier transform for discrete and periodic signals.
Discrete approximation of the Fourier transform that estimates the transform at N possible output values.
The integral in the Fourier transform is estimated as a discrete sum: $X_n = \sum_{k=0}^{N-1} x_k\, e^{-2\pi i k n / N}$, for $n = 0, \dots, N-1$.
Each $X_n$ is a complex number.
Effectively converts the signal into a sum of cosine and sine waves.
Inverse Discrete Fourier Transform: the reverse operation that recovers the original signal, $x_k = \frac{1}{N} \sum_{n=0}^{N-1} X_n\, e^{2\pi i k n / N}$.
Fast Fourier Transform (FFT): algorithm for calculating the DFT and its inverse.
Complexity: DFT $O(N^2)$, FFT $O(N \log N)$.
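To make the discrete sum concrete, the following is a direct (quadratic-time) sketch of the DFT and its inverse, using the common convention with a negative exponent in the forward transform and a 1/N factor in the inverse. The function names `dft` and `idft` are illustrative; a real application would use an FFT library instead.

```python
import cmath


def dft(x):
    """Direct DFT: X[n] = sum_k x[k] * exp(-2*pi*i*k*n/N). O(N^2)."""
    size = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * n / size)
                for k in range(size))
            for n in range(size)]


def idft(X):
    """Inverse DFT: x[k] = (1/N) * sum_n X[n] * exp(+2*pi*i*k*n/N)."""
    size = len(X)
    return [sum(X[n] * cmath.exp(2j * cmath.pi * k * n / size)
                for n in range(size)) / size
            for k in range(size)]


signal = [1.0, 2.0, 0.0, -1.0]
spectrum = dft(signal)                       # complex coefficients
recovered = [v.real for v in idft(spectrum)]  # back to the time domain
```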
Computing FFT: Cooley-Tukey
Danielson-Lanczos lemma: a discrete Fourier transform of length N can be written as the sum of two discrete Fourier transforms of length N/2, one formed from the even-numbered points and the other from the odd-numbered points.
The lemma can be applied recursively to further divide the computations.
Cooley-Tukey algorithm: an implementation of the FFT that builds on the Danielson-Lanczos lemma.
Practical implementations rely on so-called bit-reversal ordering of the elements to significantly speed up calculations.
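Illustration of FFT
Data (signal): 10.9 14.9 17.0 16.1 16.4 19.1 28.0 20.3 17.8 19.0
FFT output (real part + i × imaginary part):
 k    real     imag
 0   179.50    0.00   (DC)
 1   -38.37  -86.94
 2    25.74    3.50
 3    -5.69  -16.92
 4     0.10  -16.60
 5     7.44   13.92
 6    -1.14  -18.50
 7     9.01    9.50
 8     0.70    0.00
 9     9.01   -9.50
10    -1.14   18.50
11     7.44  -13.92
12     0.10   16.60
13    -5.69   16.92
14    25.74   -3.50
15   -38.37   86.94
Coefficients 1 to 7 and 15 to 9 are complex conjugates of each other.
[Figure: the signal, its FFT, and IFFT(FFT(signal)).]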
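The even/odd recursion can be sketched as a textbook radix-2 FFT. This is not the implementation behind the slides: it omits the bit-reversal optimisation and requires a power-of-two length. The 16 coefficients in the illustration are consistent with the 10 data points being zero-padded to length 16, which is what the example assumes; the DC term then equals the sum of the samples (179.5).

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT (length must be a power of two)."""
    n = len(x)
    if n <= 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])   # DFT of the even-indexed points
    odd = fft(x[1::2])    # DFT of the odd-indexed points
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


data = [10.9, 14.9, 17.0, 16.1, 16.4, 19.1, 28.0, 20.3, 17.8, 19.0]
padded = data + [0.0] * (16 - len(data))   # assumption: zero-pad to 16
spectrum = fft(padded)
dc = spectrum[0]                            # sum of the samples
```

For real-valued input, `spectrum[k]` and `spectrum[16 - k]` are complex conjugates, which matches the symmetry visible in the illustration.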
Sensor Data Analysis Pipeline (Revisited from Lecture I)
Input: raw sensor data.
1. Sensor data extraction: frames, synchronization.
2. Preprocessing: smoothing, noise removal, transformation.
3. Feature extraction: extracting time domain and frequency domain features from measurements.
4. Inference: classification/regression.
Output: high-level contexts, e.g., walking, running, vacuuming, driving.
Phase I: Preprocessing
Preprocessing refers to steps that are carried out on the data before analysing them.
Filtering: removing specific sources of noise.
Smoothing: reducing spikes in the measurements, with the aim of capturing the main trend better.
Windowing: improving the frequency domain representation (discussed during a later lecture).
Possible data transformations: some applications operate on alternative signal representations, e.g., the empirical cumulative distribution function (ECDF) in activity recognition and hyperbolic signals in WiFi.
Preprocessing: Filter Types
Noise removal is often done by removing the parts of the signal that fall into a specific frequency range.
Low-pass filter: frequencies lower than a threshold are allowed to pass, higher frequencies are cut off.
High-pass filter: frequencies higher than a threshold are allowed to pass, lower frequencies are cut off.
Band-pass filter: frequencies within a given range are allowed to pass, others are discarded.
Several ways to implement:
FFT-based: cut off the frequencies outside the given range.
Windowed filters: allow a smoother form for the resulting signal.
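The FFT-based approach can be sketched as follows: transform the signal, zero the coefficients outside the pass band, and transform back. For self-containment the sketch uses a direct DFT instead of an FFT, and the names `dft` and `low_pass` as well as the bin-index cutoff are assumptions made for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct DFT (or inverse DFT when inverse=True). O(N^2)."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = n if inverse else 1
    return [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                for k in range(n)) / scale
            for m in range(n)]


def low_pass(signal, cutoff_bin):
    """Zero every frequency bin above cutoff_bin and transform back."""
    n = len(signal)
    spectrum = dft(signal)
    for m in range(n):
        # Bins m and n - m carry the same frequency (conjugate pair),
        # so they are kept or removed together.
        if min(m, n - m) > cutoff_bin:
            spectrum[m] = 0
    return [value.real for value in dft(spectrum, inverse=True)]


n = 32
slow = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]
fast = [0.5 * math.sin(2 * math.pi * 10 * i / n) for i in range(n)]
mixed = [s + f for s, f in zip(slow, fast)]
filtered = low_pass(mixed, cutoff_bin=4)   # removes the fast component
```

A high-pass or band-pass variant only changes the condition that decides which bins are zeroed.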
Example: FFT preprocessing
[Figure: time-domain plots of the original data (before) and the filtered data (after), with frequency-domain plots of the FFT of the original data and the FFT of the filtered data.]
Smoothing: Mean and Median Filters
Smoothing approximates the original signal by reducing short-term fluctuations.
Mean and median filters are common ways to implement smoothing.
Set a window around the current sample, $[i - w, i + w]$, where $w$ is the width of the filter.
Replace the value of the sample with the mean/median of the window.
The median is more robust to outliers, but the mean can be calculated online.
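A small sketch of both filters follows. The helper name `window_filter` is hypothetical, and truncating the window at the signal boundaries is one possible edge-handling choice, not something the slides prescribe.

```python
from statistics import mean, median


def window_filter(samples, w, reducer):
    """Replace each sample with reducer(window), where the window is
    [i - w, i + w], truncated at the boundaries of the signal."""
    return [reducer(samples[max(0, i - w):i + w + 1])
            for i in range(len(samples))]


spiky = [1.0, 1.0, 1.0, 10.0, 1.0, 1.0, 1.0]
mean_filtered = window_filter(spiky, 1, mean)      # spike is spread out
median_filtered = window_filter(spiky, 1, median)  # spike is removed
```

In this example the single outlier leaks into three samples of the mean-filtered signal, while the median filter removes it completely.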
Mean and Median Filters: Example
[Figure: output of the mean filter and of the median filter.]
Phase II: Feature Extraction
Feature extraction: identifying and extracting key characteristics of the signals / frames.
Typically, as many features as possible are extracted during the design phase, and the most useful ones are selected for the final sensing application.
Two main feature categories:
Time domain: features related to summary and order statistics.
Frequency domain: spectral characteristics, extracted from FFT-transformed signals.
A necessary evil: even the best classifier cannot reach high accuracy if the features are not discriminative!
Descriptive Statistics
Location (central tendency): arithmetic mean, median, mode.
Spread (statistical dispersion): variance/standard deviation, min, max, range, interquartile range (IQR).
IQR: difference between the 75th and 25th percentiles; requires ordering the measurements.
Distribution-related features: shape parameters such as skewness (bias to the right or left in the distribution) and kurtosis (peakedness of the distribution).
Example
Data: 10.9 14.9 17.0 16.1 16.4 19.1 28.0 20.3 17.8 19.0
Location: mean = 18.0, median = 17.4. The mean is influenced more by the extreme value 28.
Spread: standard deviation = 4.5, max = 28, min = 10.9, range = 17.1.
IQR: 25th percentile 16.1, 75th percentile 19.1 → IQR = 3.
Distribution: skewness = 0.8, kurtosis = 4.0.
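The location and range values of this example can be checked with Python's standard `statistics` module, as in the sketch below. The standard deviation, percentiles, skewness, and kurtosis depend on the estimator convention used (for instance, sample versus population variance), so a library may give slightly different values from those on the slide.

```python
import statistics

data = [10.9, 14.9, 17.0, 16.1, 16.4, 19.1, 28.0, 20.3, 17.8, 19.0]

mean_value = statistics.mean(data)      # 17.95, i.e. 18.0 after rounding
median_value = statistics.median(data)  # average of the two middle values
value_range = max(data) - min(data)
sample_std = statistics.stdev(data)     # sample (n - 1) estimator
```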
Other statistical features
Sample difference (derivative): $\Delta_i = x_{i+1} - x_i$. Effectively the rate of change in the measurements; useful for identifying peaks in signals.
Zero/mean crossing rate: the number of times the signal passes through zero/the mean. Can be used, e.g., to identify events with symmetric changes in magnitude.
Magnitude (or L1-norm): $\|x\|_1 = \sum_i |x_i|$.
Root mean square (RMS): $\sqrt{\left(\sum_i x_i^2\right) / n}$.
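These features are simple to compute; a possible sketch is shown below. The function names are illustrative, and counting a crossing only when two consecutive samples lie strictly on opposite sides of the level is one of several reasonable conventions.

```python
import math


def differences(x):
    """Sample differences x[i+1] - x[i]."""
    return [b - a for a, b in zip(x, x[1:])]


def crossings(x, level=0.0):
    """Number of times consecutive samples fall on opposite sides of
    `level` (samples exactly on the level are not counted)."""
    return sum(1 for a, b in zip(x, x[1:]) if (a - level) * (b - level) < 0)


def l1_norm(x):
    return sum(abs(v) for v in x)


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


x = [1.0, -1.0, 2.0, -2.0, 3.0]
zero_crossings = crossings(x)
mean_crossings = crossings(x, level=sum(x) / len(x))
```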
Example
How many zero-crossings? How many mean-crossings?
[Figure: example signal.] The slide gives the answers 8 and 13.
Frequency Domain Features
Frequency domain features are extracted from the FFT output of a signal.
DC component: the zero-frequency (constant) coefficient.
Spectral energy: squared sum of the spectral coefficients, normalized by the length of the sample window.
FFT coefficients / sum of FFT coefficients (normalized coefficients).
Energy at a given frequency, or at a given set of frequencies.
Dominant frequency, i.e., the frequency with the highest energy content.
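A rough sketch of how such features might be computed for one frame is given below. Several choices here are assumptions rather than part of the slides: the function name `spectral_features`, excluding the DC term from the energy, considering only the bins up to the Nyquist frequency, and using a direct DFT in place of an FFT.

```python
import cmath
import math


def dft(x):
    """Direct DFT, used here instead of an FFT to stay self-contained."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n)
                for k in range(n))
            for m in range(n)]


def spectral_features(frame, sampling_rate_hz):
    """Return (DC component, spectral energy, dominant frequency in Hz)."""
    n = len(frame)
    spectrum = dft(frame)
    dc = spectrum[0].real
    bins = range(1, n // 2 + 1)           # non-DC bins up to Nyquist
    energy = sum(abs(spectrum[m]) ** 2 for m in bins) / n
    dominant_bin = max(bins, key=lambda m: abs(spectrum[m]))
    return dc, energy, dominant_bin * sampling_rate_hz / n


# Hypothetical frame: offset 2 plus a 3 Hz sine, sampled at 32 Hz for 1 s.
frame = [2.0 + math.sin(2 * math.pi * 3 * i / 32) for i in range(32)]
dc, energy, dominant_hz = spectral_features(frame, sampling_rate_hz=32)
```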
Examples
[Figure: original (speech) signal and its energy (frame size = 100).]
Other classes of features
Wavelets: an alternative to the FFT that represents the signal as a function of wavelet base functions. Several different transformations exist, e.g., Haar and Daubechies. Contrary to the FFT, wavelets lack a physical interpretation.
String features: quantize values to a discrete set of possible ones, e.g., using piecewise approximations. Assign a character to each possible level → signals can be transformed into strings. String similarity metrics can then be used as features.
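The string-feature idea can be illustrated with a very simple quantizer. Equal-width levels between the minimum and maximum of the signal are an assumption made here for brevity; the function name `to_string` and the four-letter alphabet are likewise illustrative.

```python
def to_string(samples, alphabet="abcd"):
    """Quantize samples into len(alphabet) equal-width levels and
    map each level to one character."""
    lo, hi = min(samples), max(samples)
    # Fall back to width 1.0 for a constant signal to avoid dividing by 0.
    width = (hi - lo) / len(alphabet) or 1.0
    last = len(alphabet) - 1
    return "".join(alphabet[min(int((v - lo) / width), last)]
                   for v in samples)


encoded = to_string([0.0, 1.0, 2.0, 3.0, 4.0])
```

Once signals are strings, any string similarity measure (for example, edit distance) can be applied to them.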
Phase III: Inference
As discussed, the final phase refers to inference, where the goal is to estimate a given output.
Discrete categories → classification.
Numeric outputs/scores → regression.
Common classification techniques: Naive Bayes classifier, Support Vector Machines (SVM), decision trees, k-NN classifiers.
Common regression techniques: Support Vector Regression, Gaussian processes, linear regression.
Measuring Similarity
In many applications, features correspond to the similarity of two (or more) signals.
Applications that compare signals across devices, e.g., proximity sensing, authentication, and social sensing.
The moving object databases literature has applications that require finding similar movement traces.
Comparison of users in participatory and persuasive sensing applications.
Nearest neighbour classifiers rely on a similarity measure for signals (as discussed).
Several similarity measures are possible; the best choice is often application dependent.
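Correlation
Measure of dependency between two signals; can be used as a measure of similarity.
Continuous data: Pearson product-moment correlation, $r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$.
Ordinal (ranked) and skewed data: Spearman rank correlation, which converts the data into ranks and measures the differences $d_i$ between the ranks, $\rho = 1 - \frac{6 \sum_i d_i^2}{n(n^2 - 1)}$; and Kendall's tau.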
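The difference between the Pearson and Spearman coefficients can be seen in a small sketch. The function names are illustrative, and the rank-difference form of the Spearman coefficient used here assumes that there are no tied values.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation via squared rank differences d_i."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]          # monotone but non-linear in x
r = pearson(x, y)
rho = spearman(x, y)
```

For this monotone but non-linear relationship the Spearman coefficient is exactly 1, while the Pearson coefficient is slightly below 1.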
Autocorrelation and Cross-Correlation
Cross-correlation: similarity of two signals as a function of the lag of one relative to the other. Useful for identifying signals that are copies of each other but shifted in time. Can be used, e.g., to synchronize two audio streams by using the lag with maximal cross-correlation.
Autocorrelation: cross-correlation of a signal with itself, i.e., it measures the correlation of the signal values with a lagged copy of the same signal. Useful for finding periodic patterns; e.g., step counting can use autocorrelation to identify the gait cycle.
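Lag estimation via cross-correlation can be sketched as follows. The sketch uses the raw (unnormalized) sum of products and a brute-force search over lags; the function names and the toy signals are assumptions for illustration.

```python
def cross_correlation(x, y, lag):
    """Sum of x[i] * y[i + lag] over the indices where both exist."""
    return sum(x[i] * y[i + lag]
               for i in range(len(x)) if 0 <= i + lag < len(y))


def best_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] that maximizes the cross-correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: cross_correlation(x, y, lag))


x = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]   # x delayed by 3 samples
lag = best_lag(x, y, max_lag=5)
```

Autocorrelation is the special case `cross_correlation(x, x, lag)`; peaks at non-zero lags indicate periodicity.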
Dynamic Time Warping
Elastic similarity measure that calculates the optimal match between two time series.
Allows warping in the time dimension, i.e., measurements can stretch or compress (e.g., due to different speeds).
Most widely used time-series similarity measure.
Does not satisfy the triangle inequality → not a metric.
Calculated using dynamic programming.
Closely related to edit distance, but uses a data-dependent distance as the penalty instead of a constant.
Dynamic Time Warping: Illustration
[Figure: alignment between two time series A and B.]
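Dynamic Time Warping: Calculating
Initialization: $DTW(0,0) = 0$, $DTW(0,i) = DTW(i,0) = \infty$ for $i > 0$.
Iterate over the cost matrix, setting each value to $DTW(i,j) = d(a_i, b_j) + \min\{DTW(i-1,j),\ DTW(i,j-1),\ DTW(i-1,j-1)\}$, where $d$ is a distance between the measurements $a_i$ and $b_j$ of the two signals, e.g., Euclidean.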
Dynamic Time Warping: Example
Signal 1 (rows 1 to 10): -1.59 -1.32 -1.05 -0.98 -0.94 -0.89 -0.31 -0.31 -0.19 -0.83
Signal 2 (columns 1 to 8): 6.05 5.72 5.71 5.97 6.25 6.25 5.30 5.30
Cost matrix:
        0     1     2     3     4     5     6     7     8
  0     0
  1          7.6  15.0  22.3  29.8  37.7  45.5  52.4  59.3
  2         15.0  14.7  21.7  29.0  36.6  44.1  50.8  57.4
  3         22.1  21.5  21.4  28.5  35.8  43.1  49.4  55.7
  4         29.1  28.2  28.1  28.4  35.6  42.8  49.1  55.4
  5         36.1  34.8  34.8  35.0  35.6  42.8  49.0  55.2
  6         43.1  41.4  41.4  41.6  42.2  42.7  48.9  55.1
  7         49.4  47.5  47.4  47.6  48.2  48.7  48.3  53.9
  8         55.8  53.5  53.4  53.7  54.2  54.8  53.9  53.9
  9         62.0  59.4  59.3  59.6  60.1  60.7  59.4  59.4
 10         68.9  65.9  65.8  66.1  66.6  67.2  65.6  65.5
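The dynamic program can be sketched in Python and applied to the two signals of this example. The function name `dtw_matrix` is illustrative, and the absolute difference is assumed as the distance between two scalar measurements (the one-dimensional Euclidean distance). Because the listed signal values are rounded, the computed entries may differ from the table in the last digit.

```python
import math


def dtw_matrix(a, b):
    """Return the (len(a)+1) x (len(b)+1) DTW cost matrix."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0                      # only (0, 0) is a valid start
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # distance between measurements
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost


signal_1 = [-1.59, -1.32, -1.05, -0.98, -0.94, -0.89, -0.31, -0.31, -0.19, -0.83]
signal_2 = [6.05, 5.72, 5.71, 5.97, 6.25, 6.25, 5.30, 5.30]
cost = dtw_matrix(signal_1, signal_2)
dtw_distance = cost[len(signal_1)][len(signal_2)]   # bottom-right entry
```

The bottom-right entry of the matrix is the DTW distance between the two signals.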
Final Note: System Level View
As discussed during Lecture II, resource efficiency is a crucial design goal for mobile sensing systems.
Frequency-domain analysis is often quoted as energy-heavy, but this is not really the case; the cost depends heavily on the FFT implementation.
The cost of distributing all sensor data is often MUCH higher than doing the processing on the device.
Offloading is mainly beneficial for heavy classifiers, and especially for their training phase.
Big-data frameworks (Hadoop, Spark, ...) are currently mainly useful for analysing large collections of small data, e.g., small sets of features collected from many devices.
Summary
Signals, discrete samples, noise.
Signals can be analyzed in the time or frequency domain; the (I)FFT is used to convert between the time and frequency domain representations.
Preprocessing techniques are needed to overcome noise in measurements: smoothing, filtering.
Feature extraction: summary and order statistic features, frequency domain features.
Similarity is another important aspect.
References
Figo, D.; Diniz, P.; Ferreira, D. & Cardoso, J., Preprocessing techniques for context recognition from accelerometer data, Personal and Ubiquitous Computing, 2010, 14-7, 645-662
Krumm, J., Processing Sequential Sensor Data, Ubiquitous Computing Fundamentals, CRC Press, 2009, 353-380
Keogh, E. & Ratanamahatana, C. A., Exact Indexing of Dynamic Time Warping, Knowledge and Information Systems, 2004, 7, 358-386
Additional reading: ANY book on Digital Signal Processing, e.g.,
Smith, S. W., Digital Signal Processing, Newnes, 2003
AND the following HIGHLY RECOMMENDED book:
Press, W. H.; Teukolsky, S. A.; Vetterling, W. T. & Flannery, B. P., Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, 2007