CS-E4230 Transaction Management in DB, Early Spring 2017, Tutorial No 2

[0]
[a] Why is it a good idea to keep the DB log on a separate disk?

[b] Writing one log record at a time to disk is slow. How can batching of log records be made to work in practice?

[c] What is the difference between a modified page and a dirty page?

[d] Answer the following briefly. Can an unmodified page contain dirty data items? Why can all the data items in a modified page (also called a dirty page by some authors) be clean (that is, nondirty)?

[e] Joe wanted to explain buffer management to his classmates using the analogy below. Did Joe get these analogies right?

Consider eight scientists working together on a joint book containing twelve chapters with a total of 200 pages. It is agreed that each page of any chapter will be written by one scientist and then reviewed and corrected by another scientist within the next day. The scientists prefer to work on a printed version of the book, so once a page has been written and printed, it is placed in one of the free slots of a wooden pigeonhole (see left) consisting of 40 free holes (= slots).

When the author of a page p wants the page to be reviewed, s/he puts the printed page p in an empty slot together with a small sticker on which the page number is written, to indicate that the page should be reviewed. Another scientist then comes and picks up page p, leaving the slot empty except for the sticker, which indicates that the corrected page p should be returned to this same slot (i.e. the slot is taken). No *other* page can be inserted into a slot containing a sticker.
Eventually, the scientist who picked up page p will return the page, marked with her/his corrections, to the same slot and remove the sticker. The page will then remain in the slot until another scientist decides to review it or the chapter is ready (see below).

When all the pages comprising a chapter c have been reviewed by at least one scientist, an assistant collects all the pages belonging to chapter c, scans them and stores all the notes for the updates on the chapters on a special disk. The chapter is then considered closed.

Assume that a chapter represents a transaction, that a page of the book is equivalent to a data page, that each pigeonhole slot is equivalent to a buffer frame and that a closed chapter is equivalent to a completed transaction.

If all of the 40 slots become full, a new page to be reviewed can be inserted into a slot containing another page (one that has been reviewed and thus has no sticker) as follows: the existing page is removed from the slot, scanned and written to disk, and the new page is inserted into the now free slot.

Here are the actual analogies:

The STEAL analogy: If there is no room (all the slots are taken) for a page that needs to be reviewed, then one of the pages whose chapter is not yet closed can be removed from its slot (as long as there is no sticker on the slot), scanned and written to disk.

The DIRTY-PAGE analogy: A page is considered dirty if it has been reviewed by a scientist but the chapter to which the page belongs is not yet closed.

The FORCE-policy analogy: Before a chapter is closed (considered complete), all the handwritten corrections for the page have to be scanned and stored on a data disk.

[1]
[a] Define the before image and the after image of a data item.

[b] If all of a transaction's modified pages are always written to disk at commit time (before the commit record), what do we call such a policy? Why is a latch placed on a page? What does a No-Steal policy mean? Give a precise definition of the Steal policy.

[c] Combine the WAL policy and the Commit Protocol (Section 3.8) so that we can define a committed transaction with respect to its log records. Also define a fuzzy (or light) checkpoint.

[d] Steal and Force both write modified pages to disk. What is the difference between the two policies
(i) with regard to the number of pages written under each policy?
(ii) with regard to evicting a page (removing a page from the buffer)?

[e] Why do we need the UNDO and REDO operations at recovery? When do we need the UNDO and REDO operations?

[f] What is the main idea of recovery? What is a so-called loser transaction? Identify the loser transactions in the figure. What is the state of T1? What happens to transactions T4 and T5?

[g] What is the modified pages table (sometimes also called the dirty pages table) and where is it kept? What about the transaction table?

[h] If a transaction is committing or aborting, what gets written to the log? What is physiological logging?

[i] What operation could the physiological log record represent?

[j] What is wrong with the following statement?

    With the Force policy, a page is removed from the buffer pool and written to disk at commit time.

How would the statement change if No-Force were used instead, along with the WAL policy?

[2]
[a] With physiological logging and using the key-range model, what log records are produced by the following transaction? You can assume that there are no structure modifications and no concurrent transactions.

    insert into r values (x1, v1);
    set savepoint P1;
    insert into r values (x2, v2);
    set savepoint P2;
    insert into r values (x3, v3);
    rollback to savepoint P2;
    insert into r values (x4, v4);
    rollback to savepoint P1;
    insert into r values (x5, v5);
    rollback.

[b] Assume that the database contains relations R(X, V) and S(X, V). Relation R contains only the tuple (5, 10) and is stored in page p1. Relation S contains only the tuple (15, 2) and is stored in page p2.
We execute the following transaction:

    begin transaction;
    update R set V=18 where X=5;
    update S set V=20 where X=15;
    commit

What log entries are produced by the transaction? Assume that the next free LSN is 101.
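
The LSN bookkeeping behind question [2][b] can be sketched in Python. This is only an illustration under stated assumptions: the record layout (transaction, type code, page, key, before image, after image, previous LSN), the type codes "B", "W" and "C", and the names LogRecord and Logger are all hypothetical and are not the notation of the course text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LogRecord:
    lsn: int
    txn: str
    kind: str                        # assumed codes: "B" begin, "W" update, "C" commit
    page: Optional[str] = None
    key: Optional[int] = None
    before: Optional[int] = None     # before image of V
    after: Optional[int] = None      # after image of V
    prev_lsn: Optional[int] = None   # previous record of the same transaction


class Logger:
    """Hands out consecutive LSNs and chains each transaction's records backwards."""

    def __init__(self, next_lsn: int) -> None:
        self.next_lsn = next_lsn
        self.records: List[LogRecord] = []
        self.last_lsn: Dict[str, int] = {}

    def append(self, txn: str, kind: str, **fields) -> LogRecord:
        rec = LogRecord(self.next_lsn, txn, kind,
                        prev_lsn=self.last_lsn.get(txn), **fields)
        self.records.append(rec)
        self.last_lsn[txn] = rec.lsn
        self.next_lsn += 1
        return rec


# The transaction from the question, with the next free LSN set to 101.
log = Logger(next_lsn=101)
log.append("T", "B")
log.append("T", "W", page="p1", key=5, before=10, after=18)
log.append("T", "W", page="p2", key=15, before=2, after=20)
log.append("T", "C")

for rec in log.records:
    print(rec)
```

The before images (10 and 2) come from the tuples given in the question and the after images (18 and 20) from the update statements. Whether begin and commit records carry a backward pointer depends on the log format in use; chaining every record here is a simplification.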

[3] At the highest level of abstraction, logical logging would mean writing the original SQL statements into the log, such as the single log record below, written for transaction T. Are there any advantages to this approach? What are the disadvantages? Does this work at all?

    SQL statement: update r set V = V + 1 where X > 50
    Log record:    <T, update r set V = V + 1 where X > 50>

[4] (Problem 2.1 from your textbook) Assume that the size of the database buffer pool is 1 000 buffer frames for database pages of size 4 kilobytes (4k). The tuples of relation r occupy 10 000 data pages, and the tuples of relation s occupy 500 data pages. In the beginning, there are no pages in the buffer pool, and no transactions are active. The first transaction to be run is

    T1: update r set A = A + 100; commit.

[a] How many data pages of r are fetched from disk into the buffer pool, and how many data pages are flushed from the buffer pool onto disk while T1 is running? How does the disk version of the database differ from the current version of the database at the time T1 commits?

[b] Immediately after the commit of T1, another transaction T2 is run:

    T2: update s set B = B + 200; commit.

How many data pages of r and s are fetched from disk, and how many data pages are flushed onto disk while T2 is running? How does the disk version of the database differ from the current version of the database at the time T2 commits? No checkpoints are taken.
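
One way to check the page counts in problem [4] is a small simulation. The sketch below makes assumptions that the problem statement does not fix: LRU replacement, a steal and no-force buffer policy, and updates that touch each page of a relation exactly once in sequence. The class BufferPool and its fields are hypothetical names.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer pool: LRU replacement, steal, no-force (all assumed)."""

    def __init__(self, frames: int) -> None:
        self.frames = frames
        self.pages = OrderedDict()   # page id -> modified flag, oldest first
        self.fetched = 0
        self.flushed = 0

    def update(self, page_id) -> None:
        if page_id in self.pages:
            self.pages.move_to_end(page_id)          # mark as most recently used
        else:
            if len(self.pages) == self.frames:
                _, modified = self.pages.popitem(last=False)  # evict the LRU page
                if modified:
                    self.flushed += 1                # a modified page is written out
            self.fetched += 1
        self.pages[page_id] = True                   # the update modifies the page


pool = BufferPool(frames=1000)

# T1: update every page of r (10 000 pages).
for i in range(10_000):
    pool.update(("r", i))
t1 = (pool.fetched, pool.flushed)

# T2: update every page of s (500 pages), right after T1 commits.
for i in range(500):
    pool.update(("s", i))
t2 = (pool.fetched - t1[0], pool.flushed - t1[1])

modified_r = sum(1 for (rel, _), m in pool.pages.items() if m and rel == "r")
modified_s = sum(1 for (rel, _), m in pool.pages.items() if m and rel == "s")
print("T1 (fetched, flushed):", t1)
print("T2 (fetched, flushed):", t2)
print("modified pages still only in the buffer:", modified_r, "of r,", modified_s, "of s")
```

A different replacement policy, or a force policy at commit, would give different numbers, so the printed counts hold only under the assumptions listed above.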

[5] Transaction T consists of the following SQL statements:

    begin transaction;
    insert into EMPLOYEE values (1, 'John Doe');
    insert into DEPARTMENT values (100, 'Research');
    commit;

We unrealistically assume that there is room for only one database page in the buffer pool and that the relations EMPLOYEE and DEPARTMENT are not on the same page. The keys of the relations are the first elements of the tuples.

What log entries are written while executing transaction T when physiological logging is applied? When are the log entries and the buffer page written to disk if the buffer management policy is:
a) Steal and Force?
b) Steal and No-Force?
c) No-Steal and No-Force?
d) No-Steal and Force?
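
The interplay of the four policy combinations with a single buffer frame can be explored with a rough event trace. The sketch below is one possible reading, not a model answer: the page names e and d, the abbreviated record strings, the event wording, and the decision to stop with a "blocked" event under No-Steal are all assumptions made for illustration.

```python
def run(steal: bool, force: bool):
    """Trace transaction T with one buffer frame under the given policies."""
    events = []
    frame = None        # page currently held in the single frame
    modified = False    # does that page hold updates not yet on disk?
    unflushed = []      # log records still only in the log buffer

    def log(rec):
        unflushed.append(rec)
        events.append(f"log {rec}")

    def flush_log():
        if unflushed:
            events.append("flush log: " + " ".join(unflushed))
            unflushed.clear()

    def write_page():
        nonlocal modified
        flush_log()                       # WAL: log records reach disk before the page
        events.append(f"write page {frame}")
        modified = False

    def fix(page):
        nonlocal frame
        if frame == page:
            return True
        if frame is not None and modified:
            if not steal:                 # No-Steal: T's uncommitted page must stay
                events.append(f"blocked: cannot evict {frame}")
                return False
            write_page()                  # Steal: write the uncommitted page out
        frame = page
        events.append(f"fetch page {page}")
        return True

    log("<T,B>")
    for page, rec in [("e", "<T,I,e,1>"), ("d", "<T,I,d,100>")]:
        if not fix(page):
            return events
        log(rec)
        modified = True
    if force and modified:
        write_page()                      # Force: modified pages go to disk before commit
    log("<T,C>")
    flush_log()                           # the commit record must reach the log disk
    events.append("committed")
    return events


for steal in (True, False):
    for force in (True, False):
        print(f"steal={steal}, force={force}")
        for ev in run(steal, force):
            print("   ", ev)
```

Under the No-Steal settings this sketch simply reports that the EMPLOYEE page cannot be evicted; how a real system would react in that situation is left open here.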

[6] ARIES: Introduction

The contents of the log on disk at the time of a system crash are the following:

    101: <begin-checkpoint>
    102: <transaction-table, {}>
    103: <page-table, {}>
    104: <end-checkpoint>
    105: <T1, B>
    106: <T2, B>
    107: <T1, I, p, 12, 2, 105>
    108: <T2, I, p, 14, 6, 106>
    109: <T1, I, p, 21, 5, 107>
    110: <T2, D, p, 17, 3, 108>
    111: <T2, A>
    112: <T2, D^-1, p, 17, 3, 108>

[a] What is, briefly stated, the purpose of the Analysis phase?

[b] What do we get as a result of the Analysis phase?

[c] Your book mentions that we can be certain that the log tail (= log buffer) is flushed to the log disk before the transaction commits. Here the log has been written to the log disk (otherwise it would have been lost in the crash) even though there is no commit entry. What do you think caused the log tail to be written to the log disk?
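
For questions [6][a] and [6][b], the forward scan performed by an analysis pass can be sketched as follows. This is a simplified illustration, not the full ARIES algorithm: the tuple encoding of the log, the state names, the field name undo_next, and the rule that a begin record sets undo_next to its own LSN are assumptions. The scan starts after the checkpoint, whose two tables are empty in this log.

```python
# Log records after the checkpoint, encoded as tuples:
# (lsn, txn, op) or (lsn, txn, op, page, key, value, pointer)
LOG = [
    (105, "T1", "B"),
    (106, "T2", "B"),
    (107, "T1", "I", "p", 12, 2, 105),
    (108, "T2", "I", "p", 14, 6, 106),
    (109, "T1", "I", "p", 21, 5, 107),
    (110, "T2", "D", "p", 17, 3, 108),
    (111, "T2", "A"),
    (112, "T2", "D^-1", "p", 17, 3, 108),
]


def analysis(log, txn_table=None, page_table=None):
    """Rebuild the transaction table and the modified pages table from the log."""
    txns = dict(txn_table or {})     # txn -> {"state": ..., "undo_next": ...}
    pages = dict(page_table or {})   # page -> LSN of the first unflushed update
    for lsn, txn, op, *rest in log:
        if op == "B":
            txns[txn] = {"state": "forward-rolling", "undo_next": lsn}
        elif op == "A":
            txns[txn]["state"] = "backward-rolling"
        elif op == "C":
            del txns[txn]            # a committed transaction needs no undo
        else:
            page, _key, _value, pointer = rest
            pages.setdefault(page, lsn)   # keep the earliest LSN seen for the page
            if op.endswith("^-1"):
                # Undo record: its pointer names the next record to undo.
                txns[txn]["undo_next"] = pointer
            else:
                # Forward update: this record itself is the next one to undo.
                txns[txn]["undo_next"] = lsn
    return txns, pages


txns, pages = analysis(LOG)
print(txns)
print(pages)
```

The two returned dictionaries play the roles of the transaction table and the modified pages table, which later passes would use to decide where redo starts and which transactions to roll back.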