Skip to content
  • Jeremy Daer's avatar
    8fd123f7
    Fix corrupt transaction state caused by `before_commit` exceptions · 8fd123f7
    Jeremy Daer authored
    When a `before_commit` callback raises, the database is rolled back but
    AR's record of the current transaction is not, leaving the connection in
    a perpetually broken state that affects all future users of the
    connection: subsequent requests, jobs, etc. They'll think a transaction
    is active when none is, so they won't BEGIN on their own. This manifests
    as missing `after_commit` callbacks and broken ROLLBACKs.
    
    This happens because `before_commit` callbacks fire before the current
    transaction is popped from the stack, but the exception-handling path
    they hit assumes that the current transaction was already popped. So the
    database ROLLBACK is issued, but the transaction stack is left intact.
    
    Common cause: deadlocked `#touch`, which is now implemented with
    `before_commit` callbacks.
    
    What's next:
    * We shouldn't allow active transaction state when checking in or out
      from the connection pool. Verify that conns are clean.
    * Closer review of txn manager sad paths. Are we missing other spots
      where we'd end up with incorrect txn state? What's the worst that can
      happen if txn state drifts? How can we guarantee it doesn't and
      contain the fallout if it does?
    
    Thanks for @tomafro for expert diagnosis!
    8fd123f7
    Fix corrupt transaction state caused by `before_commit` exceptions
    Jeremy Daer authored
    When a `before_commit` callback raises, the database is rolled back but
    AR's record of the current transaction is not, leaving the connection in
    a perpetually broken state that affects all future users of the
    connection: subsequent requests, jobs, etc. They'll think a transaction
    is active when none is, so they won't BEGIN on their own. This manifests
    as missing `after_commit` callbacks and broken ROLLBACKs.
    
    This happens because `before_commit` callbacks fire before the current
    transaction is popped from the stack, but the exception-handling path
    they hit assumes that the current transaction was already popped. So the
    database ROLLBACK is issued, but the transaction stack is left intact.
    
    Common cause: deadlocked `#touch`, which is now implemented with
    `before_commit` callbacks.
    
    What's next:
    * We shouldn't allow active transaction state when checking in or out
      from the connection pool. Verify that conns are clean.
    * Closer review of txn manager sad paths. Are we missing other spots
      where we'd end up with incorrect txn state? What's the worst that can
      happen if txn state drifts? How can we guarantee it doesn't and
      contain the fallout if it does?
    
    Thanks for @tomafro for expert diagnosis!
Loading