Tapahtumanhallinnan pulmakohtia ja ratkaisuja ITK E54 - Kehittämismenetelmät ja arkkitehtuurit liiketoiminnassa kevät 2005 Ville Seppänen <rissepp@cc.jyu.fi> Jakamattomuus (Atomicity) Tapahtuman järjestelmään tekemät muutokset vahvistetaan (commit) ainoastaan mikäli kaikki t:n operaariot onnistuvat. Muussa tapauksessa t keskeytetään (abort) Virheen sattuessa järjestelmään mahdollisesti tehdyt muutokset peruutetaan (roll-back, backward error recovery) siten, että järjestelmän alkuperäinen tila palautetaan lokitietojen perusteella Johdonmukaisuus (Consistency) T:n järjestelmän tilaan tekemien muutosten tulee vastata kuvatussa ilmiössä tapahtuvia muutoksia ja muutosten tulee tapahtua järjestelmän eheysvaatimusten (integrity constraints) rajoissa Useimmissa tapauksissa järjestelmä on väistämättä epäjohdonmukaisessa tilassa jossain t:n välivaiheessa. Eheys tarkastetaan tavallisesti vasta t:n pyytäessä vahvistusta (deferred integrity constraints) Eriytyvyys (Isolation) Järjestelmän samanaikaisesti suorittamat tapahtumat täytyy eristää toisistaan; ts. ne eivät saa päästä käsiksi epäjohdonmukaisessa tilassa oleviin resursseihin (vrt. edellinen kalvo) Virheelliset tulokset Ketjuuntuvat peruutukset (cascading rollback) Sarjallistuvuusvaatimus Eriytyvyys (Isolation) Pysyvyys (Durability) Hallintakeinoina mm. optimistinen samanaikaisuuden hallinta, aikaleimaus, lukitusmenetelmät Sopiva menetelmä tapauskohtaisesti Tapahtuman järjestelmään tekemät muutokset eivät saa kadota tahattomasti tai tapaturmaisesti
Two-phase commit protocol Distributed transaction management Atomic commitment protocol An algorithm that ensures that all the processes involved in a distributed transaction either commit or abort Coordinator process Responsible for controlling the overall atomicity Controls the voting A number of participating processes Execute parts of a distributed transaction Two-phase commit protocol 2 Voting procedure - Phase One The coordinator sends vote requests to all participants When a participant receives the vote request, it replies by voting either Yes or No, according to whether it is able to carry out the task Participants that voted Yes start waiting for a comfirmation message from the coordinator. Participants that voted No can unilateraly abort Two-phase commit protocol 3 Voting procedure - Part Two The coordinator collects all vote messages If all the participants voted Yes the coordinator decides to commit and sends Commit messages to all participants Otherwise, the coordinator decides to abort and sends Abort messages to all participants that voted Yes According to the received message, a participant decides to commit or abort Two-phase commit protocol 4 It is possible that messages may not arrive due to a failure and processes may be waiting forever: A timeout mechanism must be able to interrupt the waiting period In addition, the coordinator may attach the list of participants to the vote request message and thereby let the participants to know each other: Cooperative termination protocol Cooperative Termination Protocol If participant p comes across the timeout while waiting for the Commit or Abort message from the coordinator, it can request this from the participant q If q has already decided to commit (or abort) it sends a Commit (or Abort) to p In the case that q has not voted yet it can decide to abort and send Abort to p If q has voted Yes but has not received the final Commit or Abort request from the coordinator it cannot help p in making the decision
Nested transactions 1 A mechanism to facilitate transactions in distributed systems (Moss 1981) A tree-like model with parents, children, top-level (root), and leaves Nested transactions 2 Rules A parent can spawn any number of children Any number of children may be active concurrently Parent can t access data when its children are alive A child can inherit a lock held by any ancestor On child commit, its locks are inherited (anti-inheritance) by parent Nested transactions 3 Nested transactions 4 Rules continued... Commit dependency: parent can commit only after all its children terminate (commit/abort) Abort dependency: On parent abort, even updates of committed children are undone Updates persist only if all ancestors commit Nested transactions 5 Intra-transactional parallelism Safe concurrency Reduced response time Intra-transactional recovery control Finer control of error handling Improves availability System modularity Composition of separately developed modules Save points Can be seen as a check point in a transaction that forces the system to save the state of the running application and return a save point identifier for future reference Instead of removing an entire transaction after a failure of single operation (sub-process) backward recovery can return the last valid state of the transaction saved in the save point reference
Compensation & Sagas 1 Long-living and distributed transactions are not reasonable to implement as a single ACID transaction Keep the resources locked for the long periods of time, which can significantly delay the commit of other simultaneously executed transactions Deadlock frequency grows with the fourth power of the transaction size Compensation & Sagas 2 Atomic commitment protocols ease the implementation of ACID transactions in distributed systems but futher transactions to become long living Backward recovery is expensive and difficult to implement The idea of compensating transactions (processes, operations) was introduced to simulate the transactional properties in such applications Compensation & Sagas 3 In Saga model (Garcia-Molina & Salem 1987) a long living transaction is broken up into a collection of subtransactions that interleave with other transactions The results of a subtransaction can be made immediately visible after it is committed Compensation & Sagas 4 Saga subtransactions can be executed independently but they must finally form an atomic unit Should any of subtransactions fail the failed transaction is aborted and already committed subtransactions are undone Removing the committed transaction means that results of subsequent transactions may become inconsistent and cause cascading rollbacks Instead of removing the results of failed transaction by using rollback the idea of compensating transaction is introduced Compensation & Sagas 5 Each Saga subtransaction Ti is provided with a compensating transaction Ci If the compensation is not needed saga T1, T2,..., Tn will execute as a sequence of transactions In case of compensation the sequence would be T1, T2,..., Tj, Cj,..., C2, C1 where 0 <= j < n, and Cj is a predefined compensating transaction of Ti A compensating transaction doesn t necessarily restore the state that prevailed before the execution of Ti but rather undoes operation(s) of Ti from a semantic point of view Compensation & Sagas 6 Contrary to traditional backward recovery mechanism, which is based on compensating operations automatically deduced from the schedule, semantically compensating transactions may not be automatically generated Even relatively simple actions (such as subtraction) must follow the integrity constraints Everything cannot be undone for example real actions with real consequences : impossible to undo but it may be possible to override the effect with a new action Committed acceptable & aborted acceptable termination states: business considerations
Forward recovery 1 Forward recovery 2 Combining the save points, compensation and recovery of failed transaction Transaction that caused the failure is aborted using the conventional rollback Committed transactions are undone in reversed order using their compensating counterparts until the save point is found (backward recovery) Finally, the transaction is restarted from the location of found save point (forward recovery) Forward recovery 3 When recovering, it s not always feasible or even possible to complete the transaction by trying to re-execute the original steps again However, it s possible that the same objective can be achieved with different ways. Alternative methods of forward recovery should be considered during the process design phase Transactional processes 1 Typical scenario: a top-level process consists of subcomponents that are logically connected to the higherlevel process but are executed virtually independently A global transaction a collage of all separate transactions that are included in the process often long-living are distributed Each global transaction can be divided into a set of local transactions can be represented as a tree of hierarchically ordered tasks, which may also include manual operations Transactional processes 2 The lowest level consists of DB ACID transactions Transactional processes 3
Transactional processes 4