Let’s draw a picture…
Consider three metrics that govern the performance of a CPU.
- Instruction execution rate
- Maximum data transfer rate (between CPU and memory)
- Maximum data transfer request rate (between CPU and memory)
These represent the performance capacity of this hardware and are indicative of its potential throughput. Note that the data transfer request rate is the inverse of the more conventional transfer latency figure and we are using this metric because we want measures that are directly proportional to performance for diagrammatic purposes.
We can consider these metrics as largely orthogonal dimensions in a 3 dimensional space and can construct a cuboid to represent the performance capacity, like so
Clearly the CPU may operate at any point within the space bounded by the polyhedron but we cannot exceed its boundaries. In the diagram above, the point labelled H represents the point of optimal throughput for the hardware in question.
It may be the case that the metrics are not completely independent and the limits of capacity are represented by a more complex surface, but for clarity of argument let us just consider a simple cuboid.
That’s hardware. But what about software?
Clearly, an algorithm utilises the available capacity of the hardware to some extent, but an algorithm has no fixed boundary – the faster the hardware, the faster an algorithm can operate. But interestingly, an algorithm does have a shape – this makes sense when you consider that for each iteration of an algorithm there will be a demand for given number of CPU cycles, transfer requests and transfer volume between the CPU and memory.
Hence we can see that software also forms a polyhedron of a specific shape (within the hardware polyhedron) which can expand in size until it meets a boundary created by the hardware. Clearly, the software can expand to a greater extent if its shape is more closely matched to that of the enclosing hardware polyhedron.
For example, consider an algorithm that requires a lot of memory transfer activity but little in the way of instruction execution. This is illustrated below. The software polyhedron expands until it hits one of the memory transfer boundaries, but retains its shape. The best throughput achieved by this software on this hardware is denoted by the point labelled S.
Now, let’s take a step back to the start of the relational era (1970’s). Again, we can create a cuboid for the hardware characteristics of the time – but let us also normalize the dimensions such that we achieve a perfect cube of unitary dimensions. It would be reasonable to assume that the software of that time would fit this cube reasonably well; otherwise the implication is that the algorithms of that time were grossly inefficient and I don’t think that is the case. In the 1970’s we therefore have a picture like so…
So far, so good. No real surprises there. But lets expand the dimensions of the hardware polyhedron to reflect the performance improvements we have already seen in hardware and detailed again here.
We arrive at a polyhedron of a very different shape that looks something like the diagram below where the hardware polyhedron now resembles a flat plane rather than its original cube. When we expand the software cube up to the boundaries of the hardware cuboid we see that a lot of the hardware potential remains unused.
The hardware performance polyhedron has changed shape but the algorithms still being used on that hardware have not and this illustrates the extent of mismatch we see in 20th century data structures with 21st century hardware and the resulting waste of performance potential.