Affairs - ShuWa

What are the characteristics of transactions?#

The MyISAM engine does not support transactions. Transactions must adhere to four characteristics:

Atomicity: All operations within a transaction must either be completed or none of them should be completed. Transactions should not end in the middle of an operation. If an error occurs during the execution of a transaction, it should be rolled back to the state before the transaction started.
Consistency: Before and after transaction operations, the data should satisfy integrity constraints, and the database should maintain a consistent state.
Isolation: The database allows multiple concurrent transactions to read, write, and modify data. Isolation prevents data inconsistency caused by concurrent execution of multiple transactions. When multiple transactions use the same data simultaneously, they do not interfere with each other. Each transaction has a separate data space and is isolated from other concurrent transactions.
Durability: After the transaction processing is completed, the modifications to the data are permanent and will not be lost even in the event of a system failure.

How does the InnoDB engine ensure these four characteristics of transactions?

Durability is ensured through redo logs.
Atomicity is ensured through undo logs.
Isolation is ensured through MVCC (Multi-Version Concurrency Control) or locking mechanisms.
Consistency is ensured through durability, atomicity, and isolation.

What problems can parallel transactions cause?#

When processing multiple transactions simultaneously, dirty reads, non-repeatable reads, and phantom reads may occur.

Dirty read: If a transaction "reads" data that has been modified by another "uncommitted transaction," it means that a "dirty read" has occurred.
Non-repeatable read: When a transaction reads the same data multiple times within a transaction, if the data read in the first and second instances is different, it means that a "non-repeatable read" has occurred.
Phantom read: When a transaction queries the "number of records" that meet certain criteria multiple times within a transaction, if the number of records queried in the first and second instances is different, it means that a "phantom read" has occurred.

What are the isolation levels of transactions?#

The SQL standard proposes four isolation levels to avoid these phenomena. The higher the isolation level, the lower the performance efficiency. The four isolation levels are as follows:

Read uncommitted: When a transaction has not been committed, the changes made by the transaction can be seen by other transactions.
Read committed: After a transaction is committed, the changes made by the transaction can be seen by other transactions.
Repeatable read: The data seen during the execution of a transaction remains consistent with the data seen when the transaction started. This is the default isolation level for MySQL InnoDB engine.
Serializable: Records are locked for both read and write operations. If there is a read-write conflict between multiple transactions for a record, the later transaction must wait for the previous transaction to complete before continuing.

Although the default isolation level for MySQL InnoDB engine is "repeatable read," it largely avoids the occurrence of phantom reads (although it does not completely solve them). There are two solutions to address phantom reads:

For snapshot reads (ordinary select statements), phantom reads are addressed through MVCC. Under the repeatable read isolation level, the data seen during the transaction execution remains consistent with the data seen when the transaction started. Even if another transaction inserts a record during this period, it cannot be queried, effectively avoiding phantom reads.
For current reads (select ... for update statements, etc.), phantom reads are addressed through next-key locks (record locks + gap locks). When executing a select ... for update statement, a next-key lock is added. If another transaction inserts a record within the range of the next-key lock, the insert statement will be blocked and cannot be successfully executed, effectively avoiding phantom reads.

Examples of scenarios where phantom reads occur

First example: For snapshot reads, MVCC does not completely avoid phantom reads. If transaction A updates a record inserted by transaction B, the number of records queried by transaction A before and after the update will be different, resulting in a phantom read.

Second example: For current reads, if a transaction does not immediately execute a current read after starting the transaction, but instead performs a snapshot read first, and another transaction inserts a record during this period, the number of records queried by the transaction using a current read will be different, resulting in a phantom read.

Therefore, the repeatable read isolation level in MySQL does not completely solve phantom reads. It largely avoids the occurrence of phantom reads, but not entirely.

To avoid phantom reads in these special scenarios, it is recommended to immediately execute current reads (e.g., select ... for update statements) after starting a transaction. This adds a next-key lock to the record, preventing other transactions from inserting a new record.