Map overlay as modeling of spatial phenomena
Kirsi Virrantaus, GIS-E1060 Spatial Analytics, 8.11.2016

Topics
- Map overlay (textbook, Chapter 10)
- Case study: cross-country (terrain) mobility analysis

1. What is map overlay?
- First formalized by McHarg (1969).
- The idea is older. Historical background: significant features were traced from maps onto transparencies (sketch paper), the transparencies were laid on top of each other, and the area was analysed on the basis of several map layers.
- The overlay procedure creates a new map data layer as a function of two or more source data layers.
- Compare with the multicriteria decision making problem (Malczewski, 1999).
- Can be performed on any geometrical data type, see Table 10.1.

More discussion on map overlay
- The textbook authors use the term sieve mapping.
- The basic form is based on binary (Boolean) logic: does a location have a property or not?
- Suitability for some use is analysed by logical reasoning: each data layer gives a (binary) value to every location, and the result follows from the logic applied to these values.
- The criteria are often collected from experts (knowledge driven). They could also be data driven: any available data could be analysed in order to find the criteria.
- Typically implemented as a raster operation; possible, but more complicated, to compute for vector data.
- Vector overlay produces slivers, i.e. insignificantly small intersection areas (remember the intersection algorithms presented in the Introduction course).

Problems with map overlay
- Input data sets are often in different coordinate systems.
- They are often originally in different scales.
- (Scanned) maps have often been generalized, and some objects take more space than in reality (for example roads).
- Data has often been interpolated (for example DEMs).
- If the uncertainty of the data sets is not known, the results are of little or no value.
- If serious decisions are made on the basis of the results, knowledge about their reliability is vital.

Weaknesses in simple Boolean overlay
- It is assumed that relationships are Boolean.
- The two-valued (Y/N) logic in sieve mapping creates spatial discontinuities that do not reflect the natural situation. Example: if 30 degrees is used as a threshold, a 29-degree slope is classified as not at risk of landslide but a 31-degree slope is.
- It is assumed that
  - any interval or ratio scale attributes are known without significant measurement error,
  - any categorical attribute data are known exactly, without uncertainty,
  - the boundaries of discrete objects are certain (crisp).

It should be remembered that using map overlay = modeling
- Map overlay must always be well designed:
  - Which data are relevant for the problem?
  - How are the various data layers weighted?
  - How do the various operations work on the data?
  - How are the decision rules defined?
- The entire decision must be well modeled and understood.
- "Clear understanding instead of fancy computations" (Clemen).

Map Algebra (Tomlin, 1990)
- Map algebra is in itself quite a simple tool.
- It gives a tool for implementing, for example, map overlays.
- Different map overlay functions are available as Local operations, for example LocalProduct, LocalDifference and LocalSum.
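The Local operations named on this slide work cell by cell on layers that share one grid. The sketch below illustrates the idea with NumPy arrays; the two layers and the variable names are made up for illustration and are not taken from Tomlin's own notation.

```python
import numpy as np

# Two hypothetical raster layers on the same grid
# (same cell size and orientation).
a = np.array([[1, 0, 1],
              [1, 1, 0]])
b = np.array([[1, 1, 0],
              [0, 1, 0]])

# A local operation combines the values found at the same cell in each layer.
local_sum = a + b          # corresponds to LocalSum
local_difference = a - b   # corresponds to LocalDifference
local_product = a * b      # corresponds to LocalProduct

print(local_product)
```

For binary layers the local product is 1 only where both layers are 1, which is why it can implement a Boolean overlay.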

Map overlay definitions given by Malczewski (1999)
- The overlay operation can be based on arithmetic, algebraic, logical, stochastic or fuzzy operations:
  - addition, subtraction, multiplication, division
  - average, power, order, minimum, maximum
  - intersection (logical AND), union (logical OR), complement (logical NOT)
  - probabilistic and fuzzy definitions of intersection, union and complement
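The slide does not say which fuzzy or probabilistic definitions are meant. As a hedged illustration, the sketch below uses two common choices: minimum/maximum for the fuzzy operators, and the product rule for probabilities under an assumed independence of the layers. The input values are invented.

```python
import numpy as np

# Hypothetical membership grades / probabilities in [0, 1] for two layers.
a = np.array([0.2, 0.5, 1.0])
b = np.array([0.5, 0.5, 0.0])

# Fuzzy definitions (min/max form, one common choice).
fuzzy_and = np.minimum(a, b)   # intersection
fuzzy_or = np.maximum(a, b)    # union
fuzzy_not = 1.0 - a            # complement

# Probabilistic definitions, ASSUMING the two layers are independent.
prob_and = a * b               # P(A and B)
prob_or = a + b - a * b        # P(A or B)
prob_not = 1.0 - a             # P(not A)
```

With inputs restricted to 0 and 1, both families reduce to the ordinary logical AND, OR and NOT.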

Towards a generic model
- O'Sullivan and Unwin propose a general model for map overlay based on the concept of a favorability function.
- (Boolean) map overlay evaluates the favorability of the subareas for some activity.
- Favorability can be evaluated with a simple mathematical function at every location.
- Explained in the textbook on pages 304–311.

Simple form: the favorability function can be written

$$F(s) = \prod_{M=1}^{m} X_M(s)$$

- F(s) = favorability, for example cross-country mobility; takes the value 0 or 1 at every location; s refers to location
- m source data layers X, all of equal importance
- X_M(s) is the value of source layer M in pixel s, either 0 or 1
- $\prod$ (capital pi) indicates multiplication, so the result is also binary
- compare with the Map Algebra LocalProduct
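A minimal sketch of the binary favorability function, assuming all layers are NumPy arrays on the same grid. The function name `boolean_favorability` and the three example layers are hypothetical.

```python
import numpy as np

def boolean_favorability(layers):
    """F(s) = product over M of X_M(s) for binary (0/1) layers on one grid."""
    stack = np.stack(layers)        # shape: (m, rows, cols)
    return np.prod(stack, axis=0)   # 1 only where every layer is 1

# Hypothetical binary criteria layers (1 = criterion satisfied).
slope_ok = np.array([[1, 1], [0, 1]])
soil_ok = np.array([[1, 0], [1, 1]])
snow_ok = np.array([[1, 1], [1, 0]])

F = boolean_favorability([slope_ok, soil_ok, snow_ok])
print(F)
```

Only the upper-left cell satisfies all three criteria, so it is the only cell where F(s) = 1.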

Improvements to the basic model
- The favorability function can take values on a more graduated scale than binary, for example ordinal (low-medium-high) or even ratio, such as a continuous scale 0–1.
- Criteria can be coded on the scales mentioned.
- Criteria can be weighted according to their relative importance.
- Weights can reflect the knowledge and values of experts.
- Instead of multiplication some other function can be used, for example adding the scores.
- Boolean overlay is a special case of the general function $F = f(w_1 X_1, \ldots, w_m X_m)$.

Indexed overlay
- Malczewski calls this weighted linear combination or simple additive weighting.
- All layers use a single ordinal scale, as in the cross-country mobility analysis (1–7).
- Each layer can be weighted according to its importance.
- The weighted scores are summed and normalized, so the result also takes values on the scale 1–M.
- Multiplication has been replaced by addition.
- Perhaps the most typical way of using map overlay in practice.
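The indexed overlay can be sketched as a weighted sum divided by the sum of the weights, which is one possible normalization; the slide does not specify the one actually used. The function name, layer scores and weights below are illustrative assumptions.

```python
import numpy as np

def indexed_overlay(layers, weights):
    """Weighted sum of scored layers, normalized by the sum of the weights."""
    stack = np.stack(layers).astype(float)   # shape: (m, rows, cols)
    w = np.asarray(weights, dtype=float)     # shape: (m,)
    return np.tensordot(w, stack, axes=1) / w.sum()

# Hypothetical layers scored on the same ordinal scale 1-7.
soil = np.array([[7, 3], [1, 5]])
slope = np.array([[5, 3], [1, 7]])

# Assumed weights: soil counts three times as much as slope.
score = indexed_overlay([soil, slope], weights=[3, 1])
print(score)
```

Because the weights are normalized, the result stays within the range of the input scale, so it can be read on the same 1-7 scale as the layers.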

Modeling dependent variables in map overlay, WOE
- WOE = weights of evidence.
- The method is based on conditional probability: the probability of A when we know that B has already occurred, P(A:B).
- B either increases or decreases the probability of A.
- Compare with the joint probability of independent events, where the probabilities of the events do not affect each other.
  - Two coin flips are independent events: P(H ∩ H) = P(H) × P(H) = 0.5 × 0.5 = 0.25.
  - Rain today and rain yesterday are not totally independent.

Example
- The probability that it rains today when we know that it rained yesterday.
- In most climates, if it rains today it is more probable that it also rains tomorrow. This is called autocorrelation; compare with spatial autocorrelation, a concept adopted from time series.
- This is worth keeping in mind when we later study spatial autoregressive models: conditional probability in spatial data.

Conditional probability, the formula
- When we know that the other event has occurred, we write P(A:B), the probability of A given B.
- This is not the same as P(A ∩ B): we know that B has already occurred, and this either reduces or increases the chance of A. B gives evidence about the chance of A.
- P(A:B) = P(A) × (P(B:A) / P(B))
- The last term is the weight of evidence. If the ratio is above 1, the occurrence of B increases the probability of A; if it is below 1, it reduces it.

Deriving the formula of the previous slide
- P(A) is the probability of event A, and P(A ∩ B) = P(A:B) P(B). For independent events P(A:B) = P(A), so this reduces to the product P(A) P(B).
- Likewise, P(B) is the probability of B, and P(B ∩ A) = P(B:A) P(A).
- Since P(A ∩ B) = P(B ∩ A), we get P(A:B) P(B) = P(B:A) P(A),
- and further P(A:B) = P(A) P(B:A) / P(B).
- The term P(B:A) / P(B) is the so-called weight of evidence: when it is > 1, the occurrence of B increases the probability of A; when it is < 1, it reduces it.

Weights of evidence: probability-based overlay
- A Bayesian approach to map overlay.
- It uses the conditional probability of event A given that another event B is known to have occurred.
- The fact that B has already occurred provides additional evidence about the probability of A.
- Applying Bayes to map overlay means that the weight of evidence is taken into account when computing the result.

Landslide probabilities in map overlay
- In a 10 000 km² region 100 landslides have been identified. The baseline probability of a landslide in a 1 km² area is defined as P(landslide) = 100 / 10 000 = 0.01.
- 75 of the 100 slides occurred in areas with a slope steeper than 30 degrees, so the probability that a landslide that has happened lies in a steep area is P(slope>30 : landslide) = 0.75.
- 1000 km² (10 %) of the entire region is steeper than 30 degrees, so P(slope>30) = 0.1.
- A steep slope clearly increases the probability of a landslide and can be used as the weight of evidence in the conditional probability calculation.
- In map overlay we have a landslide layer and a slope layer; the region is divided into pixels, and slope is treated as the event that has already occurred.
- The probability of a landslide given a steep slope is
  P(landslide : slope>30) = P(landslide) × P(slope>30 : landslide) / P(slope>30) = 0.01 × (0.75 / 0.1) = 0.075 (see textbook page 308).
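The landslide calculation can be reproduced in a few lines. The function name `posterior` is only an illustrative choice; the numbers are those given on the slide.

```python
def posterior(prior, likelihood, evidence):
    """Return P(A:B) = P(A) * P(B:A) / P(B) and the weight of evidence."""
    weight = likelihood / evidence   # P(B:A) / P(B)
    return prior * weight, weight

p_landslide = 100 / 10_000            # baseline probability per square km
p_steep_given_landslide = 75 / 100    # share of slides on slopes > 30 degrees
p_steep = 1_000 / 10_000              # share of the region steeper than 30 degrees

p, weight = posterior(p_landslide, p_steep_given_landslide, p_steep)
print(f"weight of evidence = {weight:.2f}, P(landslide : slope>30) = {p:.3f}")
```

The weight of evidence is about 7.5, well above 1, so knowing that a cell is steep raises the landslide probability from 0.01 to about 0.075.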

Use of regression analysis
- If both input and output data are available, the model can be calibrated with a regression model (least squares).
- The starting point is the weighted linear combination model with an added intercept constant and error term.
- Categorical variables cause problems, although the model can also be formulated to fit them.
- Spatial autocorrelation can be taken into account with spatial autoregression and geographically weighted regression.
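As a rough illustration of calibrating the layer weights by least squares, the sketch below generates synthetic input and output data and recovers the weights and the intercept. All data, the "true" weights and the noise level are invented; spatial autocorrelation is ignored here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                    # number of sampled cells (assumed)

# Two hypothetical layer values per cell and a synthetic observed outcome.
X = rng.uniform(0, 1, size=(n, 2))
true_w = np.array([2.0, -1.0])             # weights used to generate the data
y = 0.5 + X @ true_w + rng.normal(0, 0.05, n)

# Weighted linear combination plus intercept: y = b0 + w1*X1 + w2*X2 + error.
A = np.column_stack([np.ones(n), X])       # column of ones for the intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, weights = coef[0], coef[1:]

print(intercept, weights)
```

The estimated weights play the role of the w terms in the weighted linear combination.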

Case: Terrain mobility analysis

1. The problem: reliability of the cross-country mobility analysis
- The cross-country analysis model was developed at the Finnish Defence Forces / Engineering School.
- Problem of the analysis: how difficult is it to advance in the terrain?
- Result of the analysis: a map showing 7 classes of mobility in 7 colours (1 = no-go ... 7 = go; blue = water or built area, not included in the analysis).

Solution: map overlay
- The cross-country mobility analysis is based on:
  - soil type (Quaternary deposit map)
  - elevation model
  - amount of vegetation
  - thickness of snow
  - depth of frost
- The model is of map overlay type.
- All layers have a grid structure with equal pixel size and equal orientation.
- The model type is indexed overlay: the layers get weights.

[Figure: map labelled 20Q2D4]

2. Our research goals
1) To analyse the reliability of the previous result map:
   - How uncertain is the result map in a specified pixel location?
   - What is the effect of the uncertainties of the different source data types on the uncertainty of the results in a specified pixel location?
2) To present the results in a way that users can use and interpret in decision making, together with topographic maps.

3. Soil map uncertainty
- This presentation deals with the uncertainty model of the soil map, because soil class is the primary variable in the analysis. The other data sets are snow, frost, vegetation and slope.
- The soil map is an interesting data set:
  - it is categorical and imprecise data,
  - it is manually produced, and no metadata (no quality data) is available for the soil maps.
- Quality information must be collected afterwards. In this case it was collected from geologists (knowledge-based information).
- The data was converted into a misclassification matrix.

4. Monte Carlo simulation
- Monte Carlo simulation was applied to the data: all source data sets were simulated using the uncertainty information available.
- The analysis was computed with the simulated data.
- In the evaluation of the results, the simulated realizations (their mean) were used as the real data and the original data as the estimate; the uncertainty of the estimate was then evaluated.
- In our earlier research we had no model of spatial dependency.
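The sketch below shows one way such a simulation could look for a categorical soil map, with every pixel drawn independently, i.e. without any model of spatial dependency. The matrix values, the convention that row i holds the probabilities of the true class given mapped class i, and the function name `simulate_soil` are assumptions made for illustration; they are not the values or code used in the study.

```python
import numpy as np

def simulate_soil(soil_map, misclass, n_sim, rng):
    """Draw n_sim independent realizations of a categorical map.

    misclass[i, j] = assumed probability that a cell mapped as class i
    is really class j (each row sums to 1).
    """
    n_classes = misclass.shape[0]
    cum = np.cumsum(misclass, axis=1)             # cumulative row probabilities
    u = rng.random((n_sim,) + soil_map.shape)     # independent uniform draws
    # Count how many cumulative limits each draw exceeds: that is the class index.
    sims = (u[..., None] > cum[soil_map]).sum(axis=-1)
    return np.minimum(sims, n_classes - 1)        # guard against rounding in cumsum

rng = np.random.default_rng(42)
misclass = np.array([[0.7, 0.2, 0.1],             # hypothetical 3-class matrix
                     [0.1, 0.8, 0.1],
                     [0.0, 0.0, 1.0]])
soil_map = np.array([[0, 1],
                     [2, 2]])

sims = simulate_soil(soil_map, misclass, n_sim=1000, rng=rng)
# Per-pixel uncertainty: share of realizations that differ from the mapped class.
uncertainty = (sims != soil_map).mean(axis=0)
print(uncertainty)
```

Cells mapped as class 2 never change because that class is assumed to be 100 % correct, while cells of class 0 change in roughly 30 % of the realizations.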

[Figure: maps labelled 20Q2D4 and 21N4A1; legend values 0–6]

5. Spatial autoregressive process (according to Goodchild et al., 1992)
- To add spatial dependency, the 4-neighbourhood is taken into account by giving equal weight to all 4 neighbours.
- A spatially dependent random field for the error in selecting the soil type is created by solving X in $X = \rho W X + \varepsilon$, based on the spatial correlation parameter ρ and the probabilities from the misclassification matrix. W is the adjacency matrix of pixels of equal soil type; ε represents normally distributed error.
- 21 different values were used for the parameter ρ (the level of spatial dependency), with 100 simulations.
- In the simulations the vector X was used together with the misclassification matrix to get the simulated soil values.
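Solving the equation for X gives X = (I − ρW)⁻¹ ε. The sketch below builds such a field on a small grid. It is a simplification: W here is a row-standardized 4-neighbour matrix over all pixels, not restricted to pixels of equal soil type as in the study, and the step that combines X with the misclassification matrix is not reproduced. Function names are hypothetical, and the grid must contain at least two cells.

```python
import numpy as np

def four_neighbour_weights(rows, cols):
    """Row-standardized adjacency matrix: equal weight for each 4-neighbour."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    W[i, rr * cols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def sar_field(rows, cols, rho, rng):
    """Solve X = rho*W*X + eps, i.e. (I - rho*W) X = eps, for |rho| < 1."""
    n = rows * cols
    W = four_neighbour_weights(rows, cols)
    eps = rng.standard_normal(n)                  # normally distributed error
    x = np.linalg.solve(np.eye(n) - rho * W, eps)
    return x.reshape(rows, cols)

rng = np.random.default_rng(1)
field = sar_field(20, 20, rho=0.9, rng=rng)
print(field.shape)
```

With ρ = 0 the field is pure noise; as ρ approaches 1, neighbouring pixels take increasingly similar values.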

6. Evaluation methods
- Misclassification matrices only give the uncertainty of each soil class. Using the matrix and PCC values we can therefore only compare two test areas; we cannot evaluate the uncertainty in each pixel, either for the source data or for the results.
- An analysis with a simple regression model was made, but the cross-country mobility model seemed too complicated to be analysed statistically.

7. Visualization of the evaluation results
- Visual analysis seemed to be the most powerful, and the only, tool for analysing and interpreting the spatially dependent results for the users, and especially by the users themselves.
- The following two examples show the possibilities.

Example 1: Visual analysis of the uncertainty of soil types
- In area 1 the yellow silt and pink sandy heath have high uncertainty (in the misclassification matrix, MM, they have a low percentage of correctly classified).
- In the upper right corner of area 2 there is a marsh polygon with very low uncertainty (in the MM marsh has 100 % correct classification).
- The darker the value, the higher the uncertainty.
- The effect of increasing ρ can be seen.
- The user can easily compare the source map and the uncertainty map layer associated with it.

Uncertainty of classification of soil maps

Example 2: Visual interpretation of the results
- On the left side is the cross-country analysis result; the 4 lower rows give results in different seasons for test area 2.
- The other columns show the uncertainty of the cross-country analysis results with increasing spatial dependency value ρ.
- The darker the value, the higher the uncertainty.
- The user can easily find the spatially changing uncertainty by comparing the maps.

Uncertainty of the analysis results

8. Conclusions: Visualizations are perfect tools
- The visualizations of the uncertainties of the source data sets can be compared with the original result map in specific locations.
- The visualizations of the uncertainty of the result map can be compared with the original result map in specific locations.
- We can also generate maps that show the risk of having a wrong class in the results (for example ±1 class) in a specified pixel.

Conclusions: Spatial uncertainty model is needed
- Visual analysis cannot be made without a spatially dependent uncertainty model.
- The quality of imprecise geographic data (like the soil map in this case) cannot be described by traditional quality measures.
- Each imprecise data set should be provided with a spatially dependent uncertainty layer, which describes some features of the quality (like spatial and thematic accuracy in our case) in a very user-friendly way.

9. Future: Developing the simple model
- The parameter ρ: different values for each soil type can perhaps be found and added to the model.
- Using the membership vector of a fuzzy soil model in the simulation, instead of the probability vector from the misclassification matrix, gives more local uncertainty information.
- Kriging together with the fuzzy model, in order to get a better model for boundary areas.

Literature
- O'Sullivan & Unwin, Geographical Information Analysis, Chapter 10.
- Horttanainen, P., Virrantaus, K., Uncertainty evaluation by simulation and visualization, Geoinformatics 2004, Gävle, 7.-9.6.2004.