Review Petteri Nurmi

Review. Petteri Nurmi, 21.2.2012

Overview of the Course
Part I: Measuring and estimating location information. Representing location, location systems, positioning and tracking.
Part II: Analysing and understanding location data. Place identification, mobility modeling, trajectory analysis.

Part I: Measuring and Estimating Location Information

Representing Location Information
Absolute: coordinates with respect to a given reference system. Requires an origin and a reference (unit) distance.
Relative: location specified relative to an object. Requires a reference point, a distance and an angle.
Symbolic: location expressed using a semantic description (place or room).

Ellipsoidal Coordinate Systems
Contemporary geographic reference systems rely on ellipsoidal coordinate systems, in which the shape of Earth is represented as an ellipsoid.
The origin is specified by two orthogonal planes that intersect at the geocenter of Earth.
Latitude (φ): parallel to the Equator.
Longitude (λ): perpendicular to the Equator.
The shape of a geographic area is approximated using a reference ellipsoid, which is essential for distance calculations. The ellipsoid is specified by a combination of semi-major axis (or diameter) and inverse flattening.

Geoid
Approximation of mean sea level: the equipotential surface of the gravitational field that coincides with mean sea level.
Standard for measuring altitudes from sea level.
Figure legend: 1) true ocean level, 2) reference ellipsoid, 3) local plumb, 4) continent, 5) geoid.

Measuring distances between locations
Euclidean distance: useful for short distances. Requires a unit distance, i.e., how many meters a difference of 0.01 (or so) in coordinates corresponds to at a given latitude.
Geodetic distance: distance along the surface of an ellipsoid.
The geodetic problem defines how differences in coordinates are mapped into distances and vice versa.
Direct: given an angle and a distance, determine the new location from a known origin.
Indirect: given two points, determine the angle and distance between them.
Vincenty's formulae provide an iterative algorithm for solving geodetic problems.
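The unit distance mentioned above can be sketched as follows. This is a minimal illustration, not the course's own code: it assumes a spherical Earth with a mean radius of 6,371 km (an assumption, not a value from the slides), and the function names are hypothetical. It is only reasonable for short distances. For geodetic distances on an ellipsoid, an iterative method such as Vincenty's is needed.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean radius of a spherical Earth


def metres_per_degree(lat_deg):
    """Return (metres per degree of latitude, metres per degree of longitude)."""
    per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    # Meridians converge towards the poles, so a degree of longitude shrinks.
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat_deg))
    return per_deg_lat, per_deg_lon


def local_euclidean_distance(lat1, lon1, lat2, lon2):
    """Approximate distance in metres between two nearby points."""
    mid_lat = (lat1 + lat2) / 2.0
    per_lat, per_lon = metres_per_degree(mid_lat)
    dy = (lat2 - lat1) * per_lat
    dx = (lon2 - lon1) * per_lon
    return math.hypot(dx, dy)
```

At latitude 60 degrees, a longitude difference of 0.01 degrees is roughly half as long in meters as the same difference in latitude.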

Determining Location
Location system: component that provides measurements that can be used for determining the position of an entity.
Types of measurements:
Identifier: GSM cell identifier, WiFi MAC address.
RSS: received signal strength (dBm).
Geographic: distance or angle.
Positioning algorithm: technique for determining the position of an entity using the measurements provided by the location system. Examples: triangulation, trilateration, multilateration, fingerprinting.

Geometric Positioning Algorithms
Basic idea: angle and/or distance measurements from two or more reference points are used for determining position.
Triangulation: the use of angle measurements for determining the position of an entity.
Trilateration: the use of distance measurements for determining the position of an entity.
Multilateration: the use of differences in distances for determining the position of an entity.
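Trilateration can be illustrated with a small sketch. It assumes a 2D plane, exactly three non-collinear reference points and noise-free distances. The function name and the linearisation are illustrative choices, not something given in the slides.

```python
def trilaterate_2d(anchors, distances):
    """Estimate (x, y) from three reference points and measured distances.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With noisy distances the three circles no longer meet in one point. A least-squares fit over more than three reference points would then be the usual extension.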

Error of Geometric Positioning
Distance and/or angle from reference points can seldom be measured exactly.
Obstacles cause attenuation and multipath effects.
Other error sources: synchronization, reference point errors, interference.
Each measurement defines an error region. The intersection of multiple error regions defines an area of uncertainty for the position estimate.
The size of the area of uncertainty is referred to as the Dilution of Precision (DoP).
The geometry of the reference points influences the size of the error region.

Location Systems for Geometric Positioning
Distance can be measured using:
Time-of-flight. One-way: from reference point to receiver or vice versa; requires time synchronization (e.g., GPS). Round-trip time (e.g., radar).
Radio propagation models: mathematical formulation of how signals vary as a function of distance. Example: the log-distance path loss model.
The alternative is to use angle measurements, e.g., ultrasound, ZigBee.
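The log-distance path loss model mentioned above is commonly written as RSS(d) = RSS(d0) - 10 n log10(d / d0). A minimal sketch follows. The default reference power, path loss exponent and reference distance are placeholder assumptions, and in practice they are fitted to measurements from the environment.

```python
import math


def rss_at_distance(d, rss_d0=-40.0, n=3.0, d0=1.0):
    """Predicted RSS (dBm) at distance d under the log-distance model.

    rss_d0: RSS at the reference distance d0 (assumed value).
    n: path loss exponent (assumed value).
    """
    return rss_d0 - 10.0 * n * math.log10(d / d0)


def distance_from_rss(rss, rss_d0=-40.0, n=3.0, d0=1.0):
    """Invert the model: estimate distance from an observed RSS value."""
    return d0 * 10.0 ** ((rss_d0 - rss) / (10.0 * n))
```

The inverted model gives the distance estimates that a trilateration step needs.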

Other sources of error
Atmospheric effects: propagation follows the speed of light only in a vacuum. Refraction is a change in the direction of a wave, which occurs when the wave passes from one medium to another.
Multipath propagation: phenomenon where signals reach the receiver along multiple paths. Caused by reflection, diffraction or scattering, depending on wavelength and size of obstacles.
Clock errors: synchronization errors between two clocks can cause huge errors due to the speed of the signals.

Fingerprinting
Technique that exploits spatial variations in observed signal characteristics for positioning. Two phases:
Calibration: construct a database (radio map) that characterizes the signal variations at different locations. Operates on measurements containing both location and signal characteristics.
Estimation: compare a new measurement against the radio map to determine the location of an entity. Given a new signal characteristics measurement, use the radio map to estimate the most likely position where the measurement was taken.

Deterministic Fingerprinting
Basic idea: given a measurement s, calculate a distance d(s,x) between s and all measurements x in the radio map, typically Euclidean distance or correlation. Use the most similar measurements x (as defined by the distance) to estimate the location of the client.
kNN and WkNN:
Find the best k measurements, i.e., the measurements x for which d(s,x) is smallest.
Assign a weight to each measurement (kNN = uniform, WkNN = non-uniform).
The position is estimated as a weighted average of the locations associated with the best matching measurements.
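The kNN and WkNN steps can be sketched as below. The radio map layout (a list of location and RSS vector pairs), the inverse-distance weighting and the function name are assumptions made for illustration. The slides do not fix a particular weighting scheme.

```python
import math


def wknn_locate(radio_map, s, k=3, weighted=True):
    """Estimate a position from an RSS vector s.

    radio_map: list of ((x, y), [rss values]) calibration entries.
    weighted=False gives kNN (uniform weights); weighted=True gives WkNN
    with inverse-distance weights (one possible choice).
    """
    # Distance in signal space between s and every fingerprint.
    scored = sorted((math.dist(s, fp), loc) for loc, fp in radio_map)
    best = scored[:k]
    if weighted:
        weights = [1.0 / (d + 1e-9) for d, _ in best]
    else:
        weights = [1.0] * len(best)
    total = sum(weights)
    x = sum(w * loc[0] for w, (_, loc) in zip(weights, best)) / total
    y = sum(w * loc[1] for w, (_, loc) in zip(weights, best)) / total
    return x, y
```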

Probabilistic Fingerprinting
Basic idea: given a measurement s, calculate the probability of seeing the measurement at different locations. Estimate the location of the client based on the resulting probabilities.
The probability of a location is given by a signal model, which specifies the probability distribution of observing particular signal values at a given location. The model can be histogram-based or parametric (e.g., Gaussian).

Location Systems: Proximity Sensing
Position is based on closeness to a reference point.
Examples: infrared, RFID, Bluetooth.
Special case: mobile call-detail records (CDR), where location is estimated as the coordinates of the current cell tower.

Location Systems: Satellite Positioning
Trilateration-based positioning approach. The reference points are satellites on specific orbits.
Distances from the satellites are measured using one-way time-of-flight measurements.
Satellites broadcast messages that contain the orbital position and system time of the satellite.
Receivers listen for the broadcasts and estimate the distance from the satellites and the clock offset.
Pseudorange: the measured distance to a satellite, which still includes the effect of the receiver clock offset.

Location Systems: Mobile Networks
The network is divided into base stations and cells.
Base Transceiver Station (BTS): equipment responsible for handling communications within a geographic area.
Cell: antenna that serves a specific geographic area. One BTS can be responsible for multiple cells.
Location Area Identifier (LAI): globally unique id for a cell.
Location Area (LA): cluster of cells, the smallest unit for which the network maintains position information.
Position can be determined using:
Proximity sensing: BTS coordinates, Timing Advance.
Multilateration: differences in arrival times.
Fingerprinting: on the handset, based on observed RSS values of cells.

Position Tracking
Monitoring the location of an entity over time.
State space models provide a generic framework for implementing tracking.
The location of an entity is represented using a state $x_k$, which cannot be directly observed.
The evolution of the state over time is controlled by the state equation $x_k = A x_{k-1} + v$.
Measurements (or observations) $y_k$ provide cues about the true state of the system.
The measurement equation specifies how the measurements relate to the true system state: $y_k = U x_k + w$.

Bayesian Optimal Filter
Probabilistic approach to state space models.
Given a sequence of measurements $y_{1:k}$, returns a probability distribution $p(x_k \mid y_{1:k})$ over the current state (location of the object).
The evolution of the state is specified by a probability distribution $p(x_k \mid x_{k-1})$, also known as the (state) transition probability.
Measurements relate to the state through another probability distribution $p(y_k \mid x_{1:k}, y_{1:k-1})$.
Usual assumptions: the future and the past are independent of each other given the present, and measurements are conditionally independent given the state.

Kalman Filter
Closed form solution of the Bayesian optimal filter.
Assumes that noise is uncorrelated and Gaussian and that relationships between variables are linear.
Under these assumptions, the filtering distribution $p(x_k \mid y_{1:k})$ is also Gaussian. Implementing the filter thus requires maintaining estimates of the mean and the covariance matrix.
Two phases:
Prediction: guess the most likely values of the parameters given the current values.
Update: correct the predictions when a new measurement becomes available.

Kalman Filter
Prediction: propagate the current values using the system dynamics:
$m^*_k = A_{k-1} m_{k-1}$
$P^*_k = A_{k-1} P_{k-1} A_{k-1}^T + Q$
Update: calculate the residual between the predicted value and the measured value, and correct the predicted values.
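The two phases can be sketched for a scalar state. The prediction step follows the equations above. The update step uses the standard Kalman gain form and reuses U from the measurement equation $y_k = U x_k + w$. The measurement noise variance R and the function names are assumptions introduced for this illustration.

```python
def kalman_predict(m, P, A=1.0, Q=0.1):
    """Prediction: propagate mean m and variance P through the dynamics."""
    m_pred = A * m
    P_pred = A * P * A + Q
    return m_pred, P_pred


def kalman_update(m_pred, P_pred, y, U=1.0, R=1.0):
    """Update: correct the prediction with measurement y.

    R is the measurement noise variance (assumed, not given in the slides).
    """
    residual = y - U * m_pred       # measurement minus predicted measurement
    S = U * P_pred * U + R          # variance of the residual
    K = P_pred * U / S              # Kalman gain
    m = m_pred + K * residual
    P = (1.0 - K * U) * P_pred
    return m, P
```

With equal prediction and measurement variance the gain is 0.5, so the corrected mean lies halfway between the prediction and the measurement.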

Particle Filters
In real-world applications noise is often non-Gaussian and the state space is non-linear:
$x_k = z(x_{k-1}) + q_k$
$y_k = h(x_k) + r_k$
Particle filtering uses Monte Carlo integration recursively to approximate the filtering distribution.
For a function $g(\cdot)$ and a distribution $f(x)$, we have $E[g(x)] = \int g(x) f(x)\, dx$.
When $f(x)$ equals the filtering density, we get $E[g(x_k)] = \int g(x_k)\, p(x_k \mid y_{1:k})\, dx_k \approx \frac{1}{K} \sum_j g(x_k^{(j)})$.

Particle Filters: Sequential Importance Resampling
Initialization: draw K particles according to the prior distribution and set $w^{(j)} = 1/K$ for all particles.
Estimation step:
Draw K samples from the proposal distribution: $x_k^{(j)} \sim \pi(x_k \mid x_{k-1}^{(j)}, y_{1:k})$.
Update the importance weight of particle j: $w_k^{(j)} = w_{k-1}^{(j)} \, p(y_k \mid x_k^{(j)}) \, p(x_k^{(j)} \mid x_{k-1}^{(j)}) / \pi(x_k^{(j)} \mid x_{1:k-1}^{(j)}, y_{1:k})$, followed by normalization so that the weights sum to one.
Calculate the number of effective particles $N_{EFF} = 1 / \sum_j (w_k^{(j)})^2$ and resample the particles if the value is below a threshold.
The state of the filter can be estimated using $\sum_j w_k^{(j)} x_k^{(j)}$.
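The bookkeeping part of the procedure above (normalization, effective particle count, resampling and state estimate) can be sketched as follows. Multinomial resampling via `random.choices` is one common choice and is an assumption here, as are the function names. The slides do not specify the resampling scheme.

```python
import random


def normalise(weights):
    """Scale the weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def effective_particles(weights):
    """N_EFF = 1 / sum_j w_j^2, for normalised weights."""
    return 1.0 / sum(w * w for w in weights)


def resample(particles, weights, rng=random):
    """Multinomial resampling: draw K particles in proportion to their weights."""
    k = len(particles)
    chosen = rng.choices(particles, weights=weights, k=k)
    return chosen, [1.0 / k] * k   # weights are reset to uniform


def estimate(particles, weights):
    """Weighted mean of the particles (scalar state)."""
    return sum(w * x for w, x in zip(weights, particles))
```

With uniform weights N_EFF equals K. When one particle carries all the weight it drops to 1, which is the degenerate case resampling is meant to repair.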

Part II: Analysing Location Measurements

Preprocessing
Errors in location measurements should be rectified before the data can be analyzed.
Measurement validity: remove extreme values or values that are otherwise known to be erroneous (e.g., fewer than 4 GPS satellites).
Measurement uncertainty: remove measurements with high uncertainty (e.g., high HDOP value) or filter measurements over time (Kalman filter or particle filter) to smooth them.

Partitioning Algorithms: K-Means
One of the best-known clustering algorithms.
Iterative relocation algorithm that optimizes the squared loss $\sum_i \sum_{x \in C_i} \lVert x - m_i \rVert^2$, where $m_i$ corresponds to the center of cluster i and $C_i$ is the set of points allocated to cluster i.
Basic structure:
Initialization: generate k cluster centers according to some criterion (e.g., random selection from the data).
During each iteration: allocate each point to the cluster whose center is closest, then revise the cluster centers based on the points that are assigned to each cluster.
Repeat until the values no longer change.
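The basic structure above can be sketched in a few lines. The initial centers are passed in explicitly so that the example is deterministic. The function signature and the handling of empty clusters (keeping the old center) are illustrative assumptions.

```python
import math


def kmeans(points, centers, max_iter=100):
    """points, centers: lists of (x, y) tuples. Returns (centers, clusters)."""
    centers = list(centers)
    for _ in range(max_iter):
        # Allocation step: assign each point to its closest center.
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # Revision step: move each center to the mean of its points.
        new_centers = [
            tuple(sum(v) / len(c) for v in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:   # no change: converged
            break
        centers = new_centers
    return centers, clusters
```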

Partitioning Algorithms: Gaussian Mixture Models
The data are assumed to be generated by k random variables, each variable $X_i$ characterized by a probability density function $f_i(\theta_i)$.
For each point, a hidden and unobservable variable $c_i$ determines the cluster to which the point belongs. The clusters are called mixture components.
Each $f_i(\theta_i)$ is assumed to be Gaussian: the mean $\mu_i$ determines the center of the cluster and the covariance matrix $\Sigma_i$ determines the shape of the cluster.
Cluster parameters can be determined using the expectation maximization (EM) algorithm.

Density-Based Clustering
Class of algorithms that represents clusters as dense regions of objects.
Epsilon neighborhood: collection of points that are within distance Eps from a point.
Dense neighborhood: Epsilon neighborhood that contains at least MinPts points.
Radius-based clustering: merge all points within distance Eps, then prune clusters using a density criterion.

Density-Based Clustering: DBSCAN
Algorithm that recursively merges Epsilon neighborhoods together to identify dense regions.
A point that has at least MinPts points within its Epsilon neighborhood is called a core object.
Let c be a core object; the points within the Epsilon neighborhood of c are considered seed points.
A cluster is expanded with (previously unallocated) points that are within the Epsilon neighborhood of a seed point.
Non-core objects that do not belong to the Epsilon neighborhood of any core object are noise.
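A compact sketch of the expansion procedure follows. It uses a brute-force neighborhood search and counts a point as part of its own neighborhood. Both are simplifying assumptions, and the function name is hypothetical.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1               # provisionally noise
            continue
        cluster += 1                     # i is a core object: start a cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # border point previously marked noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbours(j)
            if len(nj) >= min_pts:       # j is also a core object: expand
                queue.extend(nj)
    return labels
```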

Place Identification (from coordinate data)
A place is a symbolic/semantic representation of location: physical locations linked with activities and semantics. This is consistent with the way people refer to location information.
Places can be detected from coordinate data:
Spatial clustering is used to identify regions where a person spends a significant amount of time.
Pre- or post-processing is used to remove areas/points that are unlikely to be significant, e.g., using temporal or spatial constraints.
Additionally, a labeling step assigns semantics to the identified places.

Movement Statistics
Area of influence: the geographical area within which a user spends most of her time doing daily activities.
Diameter: maximum distance between two cell towers (BTS). Characterizes the size of the area of influence.
Radius of gyration: average distance that the person typically travels.
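Both statistics can be computed directly from a list of visited positions. The sketch below assumes planar coordinates in meters and uses the common definition of radius of gyration as the root-mean-square distance from the center of mass of the visited points. That definition is an assumption, since the slide only describes the quantity informally.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their centre of mass."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


def diameter(points):
    """Maximum distance between any two visited points (e.g., cell towers)."""
    return max(math.dist(p, q) for p in points for q in points)
```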

Mobility Model
Mathematical characterization of how people move. A model can capture:
1. Movement between places (location)
2. Arrival times, i.e., when a person arrives at places
3. Duration of stay in places
Context independent vs. dependent: whether or not the model depends on factors such as time-of-day, weekend/weekday, type of location etc.
Model order: specifies how much historical information needs to be considered to make predictions about future behavior.

Location Transition Modeling: Markov Predictor
Example transition matrix (each row gives the probabilities of moving from one place to each of the places):
0.97 0.02 0.00 0.01 0.01
0.00 0.99 0.01 0.00 0.00
0.67 0.00 0.33 0.00 0.00
0.04 0.00 0.00 0.96 0.00
0.00 0.00 0.00 0.00 1.00
Stochastic state machine: the state of the system is assumed to evolve over time according to a probabilistic model $p(x_k \mid x_{1:k-1})$.
The transition probabilities $p(x_k \mid x_{1:k-1})$ determine how likely it is that a person moves from one place to another.
The probability of the current state is assumed to depend on the previous q values, where q is the order of the model.
In practice: use place identification to learn the places of interest for an individual, then use location measurements to estimate how the person moves from one location to another.
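A first-order (q = 1) Markov predictor can be estimated by counting observed transitions between identified places. The sketch below is illustrative: the place labels and function names are made up, and no smoothing is applied to unseen transitions.

```python
from collections import Counter, defaultdict


def fit_transitions(places):
    """Estimate p(next | previous) from a sequence of visited places."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(places, places[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


def predict_next(transitions, current):
    """Most likely next place, or None if the current place was never left."""
    row = transitions.get(current)
    return max(row, key=row.get) if row else None
```

For example, `fit_transitions(["home", "work", "home", "work", "gym", "home"])` gives a probability of 0.5 for both work to home and work to gym.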

Lempel-Ziv (LZ) Predictor
The sequence of visited locations is represented as a string.
The string is recursively split into parts to maximize the compressibility of the input. The splitting of the string can be represented using a so-called LZ tree.
The next place can be predicted by examining how often a particular location has followed the current location history.
(Figure: example LZ tree with branch probabilities.)

Stationary Detection
Determining whether a person is moving or not is the first subtask in transportation behavior monitoring.
Sensor values tend to contain more variation when the user is moving compared to stationary periods:
Variance or intensity of accelerometer values.
Rate of change in the signal environment: similarity of WiFi access points seen over consecutive time windows, number of unique cell towers observed within a window.

GPS Locomotion Detection
Variations in movement trajectories can be used for determining the transportation mode.
Heading change rate (HCR): frequency with which people change heading direction within a unit distance, $HCR = P_c / d$.
Stop rate (SR): frequency with which people stop or move at slow speed within a unit distance, $SR = P_s / d$.
Velocity change rate (VCR): frequency of observing significant changes in velocity over a unit distance, $VCR = P_v / d$.
Other possible features: statistical features (mean, variance, maximum etc.) characterizing velocity and acceleration, calculated over specific road segments.

Trajectory
The location of an object over time: a continuous function f(t) that returns the 2D or 3D position of an object at time t.
Location measurements can only be collected at discrete intervals. The sensed trajectory is a piece-wise continuous function $f(t_k)$ that returns the position of the object at sampling interval $t_k$.
(Figure: sensed trajectory through measurements $s_1, \dots, s_8 = s_R$.)

Trajectory Simplification
Simplification refers to approximating the original trajectory with a simpler form.
Reduces storage space.
Provides savings in power consumption.
Figure legend: location measurements, sensed trajectory, simplified trajectory, measurements retained.

Simplification Error
The error is measured as the difference between the original trajectory and the simplified trajectory.
Perpendicular distance: maximum distance between any measurement and the simplified line.
Angular distance: accumulated difference in angle.
Orthogonal distance: difference in distance.

Trajectory Simplification: Douglas-Peucker
Iterative trajectory simplification algorithm:
Generate a segment between the first and the last measurement.
Identify the point that is furthest from the segment and, if it is further away than a threshold, include the point.
Apply the algorithm recursively on the two segments defined by the first point, the last point and the added point.
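The recursion above can be sketched as follows for 2D points. The distance used here is the distance from a point to the segment (clamped to the segment end points), which is one common variant. The function names are illustrative.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    # Projection parameter of p onto the line, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def douglas_peucker(points, threshold):
    """Return the simplified trajectory as a list of retained points."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    idx, dmax = max(
        ((i, point_segment_distance(points[i], first, last))
         for i in range(1, len(points) - 1)),
        key=lambda item: item[1],
    )
    if dmax <= threshold:
        return [first, last]             # everything in between is dropped
    left = douglas_peucker(points[: idx + 1], threshold)
    right = douglas_peucker(points[idx:], threshold)
    return left[:-1] + right             # avoid duplicating the split point
```

A larger threshold retains fewer points, so storage and reporting costs fall while the simplification error grows.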

Trajectory Simplification: Minimum Description Length
The simplified trajectory is considered a hypothesis.
The hypothesis that provides the best balance between preciseness (accuracy) and conciseness (small number of points) is the optimal simplification.
(Figure: candidate points $s_{i+1}, \dots, s_{i+4} = s_j$ between $s_i$ and $s_j$, each marked True or False.)

Trajectory Similarity Measures
Euclidean: compares measurements at the same time instance t. Time offsets, dilation and noise have a significant impact on the resulting values.
Dynamic time warping: dynamic programming approach to time series, similar to edit distance but differences carry a dynamic penalty.
Longest common subsequence (LCS): similarity measured as the total length of common subsequences.
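Dynamic time warping can be sketched with the classic dynamic programming table. The sketch assumes trajectories given as lists of 2D points and uses the Euclidean distance between points as the local cost. Both are illustrative choices.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two point sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Unlike the point-by-point Euclidean comparison, a trajectory that lingers at one location for an extra sample still matches its faster counterpart at zero cost.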

Trajectory Clustering: K-Medoids
Partitioning-based clustering algorithm where clusters are represented using the most centrally located measurement/object (the medoid).
(Figure: scatter plots of clustered points with the medoid of each cluster marked.)

Trajectory Clustering: TRACLUS
1. Segment trajectories
2. Calculate weighted sum of distances between trajectory segments
3. Apply DBSCAN
4. Extract representative trajectory

Location-Based Services
Computer applications that deliver information depending on the location of the device and user.
Numerous different categories of services: emergency services (E911 and E112), mobile advertising, location-based games, (mobile) augmented reality.
Main challenges: privacy, energy-efficiency, indoor positioning, lack of standards.

Energy-Efficiency
Minimizing the power consumption of location sensing, tracking and reporting.
Trade-off between position accuracy and the amount of savings that can be obtained: define an error threshold E and optimize energy so that the position is accurate within E most of the time.
Three main techniques for reducing power:
Duty cycling: reducing the sampling frequency of sensors.
Sensor management: use less accurate but more energy-efficient sensors whenever possible.
Data uploading: intelligent schemes for reducing location reporting frequency.

Location Privacy
Ability to prevent other parties from learning one's current or past location.
Countermeasures:
Anonymity: replace the associated name with a pseudonym or other untraceable identifier. Examples: K-Anonymity, Mix-Zones.
Obfuscation: reduce the quality of information to hide sensitive details. Examples: spatial and/or temporal degradation, cloaking.

Home Exam

Home Exam
Exam period: 22/2 16:00 to 7/3 16:00.
Similar to the exercises: do not provide only results but also how they were derived (including code).
Solutions to the exam must be submitted by email to the lecturer AND the teaching assistants before the hard cut-off (16:00 on the 7th of March).
Course grading takes into account both exercise activity and success in the exam.