There are two very different strategies for dealing with updates and deletes in a relational database. Update-in-place semantics will directly overwrite data in-situ with a new version of that data; while append semantics will simply append the new version to a chronological log and leave the old data untouched. The update-in-place approach was extensively used in early database architectures but more recent database systems tend to opt for append semantics. Let’s look at the reasons why that may be.
When we append updates rather than overwrite data we are able to keep the old and consistent version of the data while the update is in progress and this affords a robust recovery mechanism should the update fail. We are able to easily watermark the updates and we can choose to move that watermark only when we know that the updates have been completed and successfully persisted in storage. Whereas with update-in-place semantics we would have to write and retain the old version in a separate log before the update took place to be able to guarantee recovery from a failed update (assuming that a full recovery from a database backup copy is a little too rudimentary for most people).
The use of an append semantics allows multiple updates to be appended to the same area of storage rather than scattered at random locations across the database. This affords a huge performance benefit to updates because large numbers of small updates become coalesced into a smaller number of larger writes and this hugely reduces the load on fsync() operations – which are essential for durability in write back cache arrangements and for avoiding anomalies arising from out-of-order write behaviour in modern disk systems.
Even on flash storage this type of large append write pattern has huge benefits over small random writes because much less of an erase block is wasted with each write. With a small random update I/O, a flash device has to copy the entirety of an updated block to a new empty location and remap the block’s location. Once a flash device fills, unused blocks must be explicitly erased before they can be re-used and on a busy device this garbage collection activity can impact performance even where it happens asynchronously ahead of time.
Append semantics show significant performance gains in other areas too, such as in transaction isolation. With update-in-place semantics either an explicit MVCC (multi-version consistency control) scheme with separate before-update images must be managed; or worse still, read activity must lock-out update activity on the same data to ensure consistent reads. While the use of read locks is a death knell to performance with concurrent users and is tantamount to eating small children in a public place; the overhead of managing before and after images also places a huge burden on throughput. However, none of this band aid architecture is needed with append semantics because versioning is implicit and watermarks demarcate the boundaries between transactions.
There’s also a slight deception to the term update-in-place. What happens when the version we are attempting to overwrite is smaller than the updated version? This usually leads to schemes such as row chaining where database rows are split to accommodate increasing row length. These chained row fragments may not even reside in the same storage block as the original row and this ultimately leads to further performance horror at query time.
Moreover, contemporary data management requirements commonly require a history of transactions to be retained for management, analysis and legal reasons. So why go out of your way to destroy that very history by overwriting it?
That all sounds great for append semantics, but are there any good reasons for choosing to update in place? The usual response here is that append semantics will require more storage space than update-in-place semantics because we are retaining a complete history of the updates. This is indeed true, but is it important?
On any very large database system dealing with machine generated or web data, the huge majority of data is derived from immutable inserts that never get updated and any data that can be updated (such as customer or account data) is insignificant by comparison. Besides, customer and account data is the very type of data we want to maintain a history for because these are the significant real-world entities on which the business depends.
At this point, someone will helpfully suggest that they might use a table which contains a single row which gets frequently updated for the sole purpose of maintaining a sequence number. (Suddenly, a silence falls and a distant church bell can be heard to slowly toll…) Enough said.
Therefore, either the volume of updates is unlikely to significantly impact storage space on a large system where space is critical; or the system is too small for it to matter at today’s storage prices. Indeed, storage bandwidth is a lot costlier than storage capacity and update-in-place offers no benefit there at all.
Neither is query performance improved by update-in-place semantics since with append semantics, all versions of a row can be equally accessible and resolvable through whatever structures are use to locate an original row. Indeed query performance is likely to suffer more with update-in-place semantics because of row chaining.
So why do some database vendors continue to use update-in-place semantics? Good question.
A precedent embalms a principle. Benjamin Disraeli.