Tapahtumanhallinnan pulmakohtia ja ratkaisuja

Tapahtumanhallinnan pulmakohtia ja ratkaisuja TJTSE54 - Kehittämismenetelmät ja arkkitehtuurit liiketoiminnassa, kevät 2007 Ville Seppänen <rissepp@jyu.fi>

Kertausta Transaktio eli tapahtuma Tietokantatapahtumien tulee täyttää nk. ACIDvaatimukset Atomicity = Jakamattomuus Consistency = Johdonmukaisuus Isolation = Erityvyys Durability = Pysyvyys Ihanneoloissa samat vaatimukset voitaisiin asettaa kaikenlaisille tapahtumille, mutta käytännössä sitä ei aina voida/kannata toteuttaa

Jakamattomuus Tapahtuman (T) järjestelmään tekemät muutokset vahvistetaan (commit) ainoastaan mikäli kaikki T:n operaatiot onnistuvat. Muussa tapauksessa T keskeytetään (abort) Virheen tapahtuessa järjestelmään mahdollisesti jo tehdyt muutokset perutaan (rollback, backward error recovery) siten, että järjestelmän alkuperäinen tila palautetaan (esim. lokitietojen perusteella)

Tapahtuma ja operaatio? Tapahtuma T = operaatiot t 1,..., t n T = Siirretään 100 tililtä A tilille B t 1 Luetaan A:n saldo AS t 2 Kirjoitetaan AS = AS - 100 t 3 Luetaan B:n saldo BS t 4 Kirjoitetaan BS = BS + 100 T = Myydään tavaraa t 1 Vastaanotetaan tilaus t 2 Käsitellään tilaus t 3 Vähennetään tuotteen varastomäärää t n...

Johdonmukaisuus T:n järjestelmään tekemien muutosten tulee vastata kuvatussa ilmiössä tapahtuvia muutoksia ja muutosten tulee tapahtua järjestelmän eheysvaatimusten (integrity constraints) puitteissa Useimmissa tapauksissa järjestelmä on väistämättä epäjohdonmukaisessa tilassa jossain T:n välivaiheessa. Eheys tarkistetaan tavallisesti vasta T:n pyytäessä vahvistusta (deferred integrity constraints)

Johdonmukaisuus State n ACID State n+1 ACID State n+2 Read(A) Write(A, (A-10)) Read(B) Write(B, (B+10)) Temporary inconsistency

Erityvyys Järjestelmän samanaikaisesti suorittamat tapahtumat täytyy eristää toisistaan; ts. ne eivät saa päästä käsiksi epäjohdonmukaisessa tilassa oleviin resursseihin (vrt. edellinen) Virheelliset tulokset Ketjuuntuvat peruutukset (cascading rollback) Eriytyvyyden suhteen useita eri lähestymistapoja mm. optimistinen samanaikaisuuden hallinta, aikaleimaus, lukitusmenetelmät Sopiva menetelmä tapauskohtaisesti

Eriytyvyys read(a) write(a, (A+10)) commit read(a) write(a, (A+10)) commit read(a) write(a, (A+10)) abort read(a) write(a, (A+10)) commit update loss dirty read read(a) read(a) commit read(a) write(a, (A+10)) commit inconsistent read

Pysyvyys Tapahtuman järjestelmään tekemät muutokset eivät saa kadota tahattomasti tai tapaturmaisesti

Long living transaction (long running, pitkäkestoinen) Has a long duration compared to the majority of other transactions either because It accesses many database objects It has lenghty computations It pauses for inputs from the users (or processes) Or a combination of these factors About the lenght of time of execution, not size of a transaction

Distributed Transaction mngnt Hajautettu tapahtumanhallinta tx_being; order_part; withdraw_part; payment; tx_commit; Transaction coordinator Subtransactions withdraw_part return_part stock_level Inventory application Local apps Resource mngrs order_part billing Billing application DB Site 1 DBMS DB Site 2

Kaksivaiheinen vahvistus 2-phase commit protocol Atomic commitment protocol An algorithm that ensures that all the processes involved in a distributed transaction either commit or abort Coordinator process Responsible for controlling the overall atomicity Controls the 'voting' A number of participating processes Execute parts of a distributed transaction

Kaksivaiheinen vahvistus 2-phase commit protocol Voting procedure phase one The coordinator sends vote requests to all participants When a participant receives the vote request it replies by voting either Yes or No, according to whether it is able to carry out the task Participants that voted Yes start waiting for a confirmation message from the coordinator Participants that voted No can unilateraly abort

Kaksivaiheinen vahvistus 2-phase commit protocol Voting procedure phase two The coordinator collects all the vote messages If all the participants voted Yes the coordinator decides to commit and sends Commit messages to all participants Otherwise, the coordinator decides to abort and sends Abort messages to all participants that voted Yes According to the received message, a participants decides to commit or abort

Kaksivaiheinen vahvistus 2-phase commit protocol It is possible that messages may not arrive due to a failure and processes may be waiting forever: a timeout mechanism must be able to interrupt the waiting period In addition, the coordinator may attach the list of participants to the vote request message and thereby let the participants to know each other: Cooperative termination protocol

Cooperative termination protocol If participant p comes across the timeout while waiting for the Commit or Abort message from the coordinator, it can requrest this from the participant q If q has already decided to commit (or abort) it sends a Commit (or Abort) to p In the ase that q has not voted yet it can decide to abort and send Abort to p If q has voted Yes but has not received the final Commit or Abort request from the coordinator it cannot help p in making the decision

Nested transactions A mechanism to facilitate transactions in distributed systems A tree-like model with parents, children, toplevel (root), and leaves Top-level T0 Parent Child T1 T2 Leaf T3 T4

Nested transactions Rules A parent can spawn any number of children Any number of children may be active concurrently Parent can't access data when its children are alive A child can inhereit a lock held by any ancestor On child commit, its lock are inherited (antiinheritance) by parent

Nested transactions Rules, continued Commit dependency: parent can commit only after all its children terminate (commit/abort) Abort dependency: on parent abort, updates of committed children are undone Updates persist only if all ancestors commit Kotitehtävä: perustele säännöt

Nested transactions void send_all_salaries() { Transaction tx = new Transaction(); tx.begin(); for(int i=0; i<emplcount; i++) { send_salary(employeraccount, employeeaccount[i], amount[i]); } tx.commit(); } void send_salary(from, to, amount) { Transaction tx = new Transaction(); tx.begin(); from.remove(amount); to.add(amount); tx.commit(); }

Nested transactions Intra-transactional parallelism Safe concurrency Reduced response time Intra-transactional recovery control Finer control on error handling Improves availability System modularity Composition of separately developed modules

Save points A check point in a transaction that forces system to save the state of the running application and return a save-point identifier for future references Instead of removing the entire transaction after a failure of single operation (sub-process) backward recovery can return the last valid state of the transaction saved in the save-point reference

Compensation & Saga transactions Long-living and distribued transactions are not reasonable to implement as single ACID transactions Keep resources locked for the long periods of time, which can significantly delay the commit of other simultaneously executed transactions Deadlock frequency grows with the fourth power of the transaction size

Compensation & Saga transactions Atomic commitment protocols ease the implementation of ACID transactions in distributed systems but further transactions to become long living Backward recovery is expensive and difficult to implement The idea of compensating transactions (or processes, operations) was introduced to simulate the transactional properties in such applications

Compensation & Saga transactions In Saga transaction model a long living transaction is broken up into a collection of subtransactions that interleave with other transactions The results of a subtransaction can be made immediately visible once it has committed

Compensation & Saga transactions Saga subtransactions can be executed independently but they must finally form an atomic unit Should any of subtransactions fail the failed transaction is aborted and already committed subtransactions are undone Removing the committed transaction means that results of subsequent transactions may become inconsistent and cause cascading rollbacks Instead of removing the results of failed transaction by using rollback, the idea of compensating transaction is intruduced

Compensation & Saga transactions Each substransaction is a real transaction in the sense that it preserver DB consistency However, they are related to each other and any partial executions of the saga are undesirable; if they occur the compensation takes place

Compensation & Saga transactions Each saga substrabsactions T i is provided with a compensating transaction C i If the compensation is not needed saga T 1, T 2,..., T n will executed as a sequence of transactions The compensating transaction undoes, from a semantic point of view, any of the actions performed by the transaction but doesn't necessarily return the database to the state that existed when the executing of transaction began

Compensation & Saga transactions For the transaction T 1, T 2,..., T n the compensating transaction C 1, C 2, C n-1 is defined In case of compensation the sequence executed is T 1, T 2,..., T j, C j,..., C 2, C 1 ; where 0 <= j < n

Compensation & Saga transactions Contrary to traditional backward recovery mechanism, which is based on compensating operations automatically deduced from the schedule, semantically compensating transactions may not be automatically generated Even simple compensating actions (such as + & - operations) must follow the integrity constraints Everything cannot be undone For example, 'real actions' with 'real consequences': impossible to undo but it may be possible to override the effect with a new action Committed acceptable & aborted acceptable termination states: 'business considerations'

Forward recovery Combining the save-points, compensation and recovery of failed transaction Transaction that caused the failure is aborted using the conventional rollback Committed transactions are undone in reversed order using their compensating counterparts until the save-point is found (backward recovery) Finally, the transaction is restarted from the location of save-point (forward recovery)

Forward recovery C F G A B D E H I Execution graph (arrow) and compensation graph (dashed arrow)

Forward recovery When recovering, it's not always feasible or even possible to complete the transaction by reexecuting the original steps again However, it's possible that the same objective can be achieved with different ways. Alternative methods of forward recovery should be considered during the process design

Transactional processes Typical scenario: a top-level process consists of subcomponents that are logically connected to the higher level process but execute virtually independently A global transaction A collage of all separate transactions that are included in the process Often long-living and distributed Each global transaction can be divided in a set of local transactions Can be represented as a tree of hierarchically ordered tasks (or a graph), which may also include manual operations

Transactional processes Global transaction structure (top-level process) globally ACID (atomic commitment), distributed, long-living Sales Book Invoice Payment Send documents Prepare documents Cancel Write booking Booking app DB Transportation Accommodation Booking app DB Subprocesses Local transactions locally ACID

Transactional processes Apply save-points on top-level process workflow. Provide with compensating processes. Globally ACID Sales Book Invoice Payment Prepare documents Send documents Sub-processes Cancel may execute nested saga or some other model depending on the task Transportation Write booking Accommodation Recovery mechanism required also on lower levels Booking DB app Booking app DB Local transactions Atomic commitment DB-level transactions Managed by DBMS

Kysyttävää?