UEF Statistics Teaching Bulletin, Spring 2017

The minor subject of statistics offers methodological courses to all students of the university. In spring 2017, we offer the following basic courses in Finnish:

Regressiotekniikat (Regression Techniques), 4 cr, Joensuu
Tilastollinen ohjelmistokurssi (Statistical Software Course), 2 cr, Kuopio
Tilastotieteen johdantokurssi (Introductory Course in Statistics), 5 cr, Kuopio
Todennäköisyysmallit, päättely ja epäparametriset menetelmät (Probability Models, Inference and Nonparametric Methods), 2-5 cr, Kuopio and Joensuu

For students with sufficient basic knowledge, we also offer intermediate and advanced courses. The courses Introduction to Statistical Inference 1 and 2 run every year, and the other intermediate and advanced courses run every second year. These courses are suitable as methodological Ph.D. studies in many fields. In spring 2017, we offer the following courses in Joensuu and Kuopio:

Bayesian Inference 2, 5 cr
Generalized linear models, 5 cr
Introduction to Statistical Inference 1, 5 cr
Linear mixed-effects models, 4 cr, web course

A tentative long-term schedule of upcoming courses can be found at http://www.uef.fi/web/stat/opetus

We also provide statistical consulting for PhD students and researchers of the university. For more details, see https://elomake.uef.fi/lomakkeet/4612/lomake.html

The UEF statistics teaching bulletin provides timely information on the available statistics courses to the students of UEF. The bulletin is published at the beginning of each semester and posted to the mailing list statistics-info@uef.fi; instructions for joining this list can be found at JoiningStatistics-infoList.docx or http://www.uef.fi/web/stat/opetus

3622213 Regressiotekniikat (Regression Techniques), 4 cr, Joensuu
28 hours of contact teaching, including both lectures and exercises
Timing: 4th period. First lecture on 21.03.17.
Teacher: esko.valtonen@uef.fi
Content: In addition to the ordinary linear regression model with one and several explanatory variables, the course covers the logistic regression model with one and several explanatory variables. In addition to classical one-way and two-way analysis of variance, the course also introduces the nonparametric counterpart of the former.
Prerequisites: The courses Kuvaileva tilastotiede ja aineiston hankinta (Descriptive Statistics and Data Collection), Tilastolliset mallit ja testaus (Statistical Models and Testing), and Todennäköisyysmallit, päättely ja epäparametriset menetelmät (Probability Models, Inference and Nonparametric Methods).

3622232 Tilastollinen ohjelmistokurssi (Statistical Software Course), 2 cr, Kuopio
(2 cr) 20 h of small-group teaching
Timing: The 3rd period starts on 10.01.17 and the 4th period starts on 20.03.17.
Teacher: Matti Estola, matti.estola@uef.fi
Content: The SPSS user environment and basic concepts; creating, editing and describing a data set with summary statistics and graphs; variable transformations; hypothesis testing (e.g. t-tests for means); regression analysis and analysis of variance.
Prerequisites: Tilastotieteen johdantokurssi (Introductory Course in Statistics) and Tilastotieteen peruskurssi (Basic Course in Statistics), or equivalent knowledge. Completing the course does not require previous experience with SPSS.

3622230 Tilastotieteen johdantokurssi (Introductory Course in Statistics), 5 cr, Kuopio
(5 cr) 34 h of lectures and 14 h of exercises
Timing: 3rd period. First lecture on 10.01.17.
Teacher: Juho Kettunen, juho.kettunen@uef.fi
Content: Students learn to describe a data set using summary statistics and graphs, and to examine the dependence between two variables using cross-tabulations and scatter plots. With regard to data collection, the course also introduces different study designs, the principles of statistical experimental design, and the basic methods of sample surveys. In addition, the course covers the basics of probability calculus and probability distributions.

3622212 Todennäköisyysmallit, päättely ja epäparametriset menetelmät (Probability Models, Inference and Nonparametric Methods), 2-5 cr, Joensuu and Kuopio
(5 cr) 28 h of lectures + 14 h of exercises
(2 cr) 28 h of lectures + exercise assignments (Joensuu only)
Timing: 3rd period. First lecture in Joensuu on 17.01.17, first lecture in Kuopio on 12.01.17.
Teacher: In Joensuu esko.valtonen@uef.fi, in Kuopio mika.hujo@uef.fi
Content: The course introduces the central concepts of mathematical statistics and presents results that are applied repeatedly in statistical inference. The aim is both to provide a foundation for the inference methods covered in earlier basic studies in statistics (such as the t-test) and to introduce techniques needed in applied courses (such as least squares and maximum likelihood estimation). The course also covers a few commonly used nonparametric tests. Although the presentation is formal, the main emphasis is not on strictly exact and detailed proofs of the results but primarily on presenting them and explaining their content.
Prerequisites: In Joensuu, the courses Kuvaileva tilastotiede ja aineiston hankinta (Descriptive Statistics and Data Collection) and Tilastolliset mallit ja testaus (Statistical Models and Testing); in Kuopio, the courses Tilastotieteen johdantokurssi (Introductory Course in Statistics) and Tilastotieteen peruskurssi (Basic Course in Statistics).

3622348 Bayesian Inference 2 (5 credits)
Teacher: Ville Hautamäki, Senior researcher, School of computing, villeh@cs.uef.fi
Timing: Spring 2017
Teaching language: English
Passing the course: Graded exercises.

Modern machine learning leans very heavily towards Bayesian inference and especially probabilistic modeling using graphical models. There are two basic strands in machine learning: modeling via directed graphs (where causality is explicitly modeled) and via undirected graphs. Undirected graphs are used especially in the celebrated Boltzmann machine and its practical variant, the restricted Boltzmann machine (RBM). The development of the RBM made the explosion of deep learning research possible. In addition to studying graphical models, we need to study how to do inference on those models. In this course we concentrate on approximate inference (as exact inference is typically intractable). In machine learning, this type of inference is called algorithmic inference or just deterministic inference. Specifically, we study expectation maximization (EM), the Laplace method, variational Bayes (VB) and stochastic variational inference (SVI).

The course builds on Bayesian Inference 1, but passing it is not a prerequisite. Required concepts will be quickly reviewed.

Literature:
C. Bishop: Pattern Recognition and Machine Learning, Springer, 2006.
I. Goodfellow et al.: Deep Learning, MIT Press, 2016.

3622346 Generalized linear models (5 credits)
Responsible teacher (lectures and course material): Mika Hujo, Lecturer in statistics, School of computing, mika.hujo@uef.fi
Timing: 4th period (the first lecture on 20.03.17). For complete information on timing and locations, see WebOodi.

The familiar linear regression assumes that the response is on an interval scale and follows a normal distribution. In practice, many measurements have a non-normal distribution. We may have discrete count data or a 0-1 response (absence or presence), or a continuous response can be skewed. This means that linear regression models are not applicable. In the generalized linear model framework, we may apply the Poisson (Poisson regression), Bernoulli (logistic regression) and gamma distributions to these cases.

As is evident from the name, generalized linear models are a generalization of linear models. As special cases, generalized linear models include linear regression, analysis of variance models, logistic regression, Poisson regression, etc. This means that all of these models can be studied under the same theoretical framework. For example, there is a common method for computing parameter estimates.

The aim is that after the course, the student understands the basic properties of generalized linear models as an extension of linear models and is able to do statistical modeling (with R software), interpret and utilize the models, and explore the validity of the inherent assumptions.

The recommended background knowledge includes some basic courses in statistics and basic knowledge of regression analysis. In addition, programming skills and knowledge of the basics of R software are of great benefit. Completing the self-study R course sections 1-4 (1 credit) prior to or in parallel with the course is recommended (see http://moodle.uef.fi/course/view.php?id=3749).

The course includes 28 hours of lectures and 14 hours of demonstrations where the solutions of the weekly exercises are presented.
As additional reading, one could use, e.g., the following books:
P. McCullagh and J. Nelder. Generalized linear models.
R. Myers, D. Montgomery, G. Vining and T. Robinson. Generalized linear models with applications in engineering and the sciences.

3622321 Johdatus tilastolliseen päättelyyn 1 (5 op) / Introduction to statistical inference 1 (5 credits)
Responsible teacher (lectures and course material): Lauri Mehtätalo, Associate professor in applied statistics, School of computing, lauri.mehtatalo@uef.fi
Timing: 4th period (the first lecture on 22.03.17). For complete information on timing and locations, see WebOodi.
Teaching language: English

When studying a specific intermediate course or topic in statistics, such as regression analysis, linear mixed models, generalized linear models, spatial statistics, sampling, multivariate analysis, or Bayesian inference, a student encounters a certain amount of theoretical knowledge that should have been mastered beforehand. The courses on statistical inference (Introduction to statistical inference 1 and 2) cover these basics so that further studies are facilitated. In particular, we try to focus on the general understanding of concepts and ideas, not so much on the mathematical proofs and technical details. In general, many of the results will not be formally proven, but they will be demonstrated and justified with example calculations and computer simulations using R.

The course Introduction to statistical inference 1 will start with univariate random variables, their description using probability distributions, and summaries of their essential properties using expected value and variance. Thereafter we will introduce the matrix algebra necessary for the treatment of multivariate random variables. The third part will cover multivariate random variables and the related distributions (joint distribution, conditional distribution and marginal distribution) as well as their summaries (mean, variance and covariance). The course Introduction to statistical inference 2, which will be given in fall 2017, will continue with the theory of parameter estimation, hypothesis testing, confidence intervals and important large-sample results.
The course is highly recommended for all students who are going to study statistics beyond the introductory level, especially if you plan an academic career in a field where statistical methods are used as standard tools. The course should be taken right after, or even in parallel with, the basic courses. Together with the second part (Introduction to statistical inference 2), it is a mandatory course for intermediate studies in statistics (tilastotieteen aineopinnot). Completing the self-study R course sections 1-4 (1 credit) prior to or in parallel with the course is recommended (see http://moodle.uef.fi/course/view.php?id=3749).

The course includes 32 hours of lectures and 16 hours of demonstrations where the solutions of the weekly exercises are presented.

As additional reading, one could use, e.g.:
G. Casella and R. L. Berger, Statistical inference, 2002.
DeGroot and Schervish, Probability and statistics, 2012.

3622314 Linear mixed-effects models (4 credits)
Teacher: Lauri Mehtätalo, Associate professor in applied statistics, School of computing, lauri.mehtatalo@uef.fi
Timing: Spring 2017. Registration starts on 25.12.16. For complete information on timing and locations, see WebOodi.
Teaching language: English

Linear mixed-effects models extend linear models to grouped datasets in which the groups constitute a sample from a large population of groups and the observations within a group are dependent. Examples of such datasets are pupils within a school class, trees on a forest sample plot, and repeated measurements of persons. Essentially, the groups (school classes, sample plots, persons) are a sample from a population of groups. Mixed-effects models are widely used in several fields, including ecology, forestry, medicine and social sciences.

The course starts with a short summary of the linear model. Thereafter we formulate the linear mixed-effects model for a dataset with a single level of hierarchy. At the end of the course, we will discuss extensions to more complicated groupings, such as nested groupings (repeated measurements for trees within sample plots) and crossed groupings (e.g. calendar years for repeated measurements with different starting years). Real-data examples and exercises using R software are included.

The recommended background knowledge includes some basic courses in statistics, linear regression, matrix algebra and statistical inference. Programming skills and knowledge of the basics of R software are of great benefit. Completing the self-study R course prior to the course is highly recommended (see http://moodle.uef.fi/course/view.php?id=3749).

The course includes 24 hours of lectures and weekly exercises, which are submitted in Moodle and evaluated. The course is organized as a web course in Moodle using the videotaped lectures given in 2015. Registration for the course ends on 13.1.
Registered students are informed about practical details by email after registration closes. Practical details (e.g. the deadlines of the 6 sets of exercises) can be negotiated with the participating students.

Literature: Course videos, lecture notes.

Additional reading:
(1) Pinheiro and Bates 2000. Mixed-effects models in S and S-Plus. Springer. (Available at the UEF library in electronic form.)
(2) Demidenko, E. 2013. Mixed models, theory and applications with R, second edition. Wiley.
(3) Galecki and Burzykowski 2013. Linear mixed-effects models using R. A step by step approach.
(4) McCulloch, C.E., Searle, S.R. and Neuhaus, J.M. 2008. Generalized, linear, and mixed models, second edition. Wiley.