Dennard, Amdahl, and Moore: Identifying Limitations to Forecasting Laws

Will Moore’s Law Hold Up?

As the hallmark law of technology forecasting (and often the only one people are familiar with), Moore’s Law sits at the center of a raging debate about its validity: will it hold true? Will it fail? Will it plateau and then see breakthroughs? The fact of the matter is that all of these are true statements, depending on exactly which metric you measure. How to choose that metric will be the focus of a later post, but for now I’d like to address what most people mean when they say Moore’s Law, and what they expect from it: computers seeing drastic gains in raw speed at the processor level (disregarding improvements from other parts of the system, such as the speed gains from solid-state drives).

If you go by that metric, Moore’s Law has failed to keep up. There are no two ways about it. I’m not saying the sky is falling, and I’m certainly not saying this won’t change. All I’m saying is that, for now, the raw speed improvements in computers have failed to keep pace. Why is that?

Well, there’s a corollary of Moore’s Law called ‘Dennard Scaling’. Simply put, Dennard Scaling states that as transistors get smaller, their power density stays constant: the power each transistor draws shrinks in proportion to its area. This means that if you cut the linear dimensions in half in two directions, each transistor draws a quarter of the power, and the power density of the chip stays the same. If this weren’t the case, 3 Moore’s Law doubling cycles (i.e. an 8x increase in the number of transistors in a given area) would mean an 8x higher power density.
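To make that arithmetic concrete, here is a minimal sketch in Python with hypothetical starting numbers (the transistor count, power figure, and die area are placeholders for illustration, not real chip data):

```python
# Toy illustration of Dennard Scaling vs. no scaling (hypothetical numbers).
# Under Dennard Scaling, per-transistor power shrinks with transistor area,
# so power density (total power / chip area) stays constant as counts grow.

chip_area = 1.0        # arbitrary units; the die size stays fixed
transistors = 1_000    # starting transistor count (made up for the example)
power_each = 1.0       # arbitrary units; starting power per transistor

for cycle in range(4):
    count = transistors * (2 ** cycle)              # Moore's Law: count doubles each cycle
    scaled_power_each = power_each / (2 ** cycle)   # Dennard: power tracks shrinking area
    with_dennard = count * scaled_power_each / chip_area
    without_dennard = count * power_each / chip_area
    print(f"after {cycle} doubling cycles: density {with_dennard:.0f} with scaling, "
          f"{without_dennard:.0f} without")
```

After 3 doubling cycles the density with scaling is unchanged, while without scaling it is 8x higher, which is exactly the problem described above.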

Dennard Scaling is what has broken down. The details run deep, but the gist of it is that the smaller transistors get, the more power they leak even when they aren’t switching (static power loss). The more static power loss there is, the more the chip heats up, which leads to even more static power loss, a self-reinforcing cycle called thermal runaway. Another problem occurs when leakage and noise become comparable to the gate’s threshold voltage, leading to errant activation of transistors, meaning faulty operation.

To avoid this, manufacturers began producing multicore chips (as you may have noticed over the last few years). This is a valid approach, and it also drove the push toward parallelized code. However, while many of the architectural issues here are above my head, there is one important fact about building a multicore system instead of a single-core one. What is it?

 

The Problem

For a multicore system to work, a task has to be distributed to different cores and the results gathered back together. This is a drastic simplification, but it works for the purpose of this argument. Say you have a program comprised of 100 tasks, 40 of which can be parallelized and 60 of which can’t, and you run it on a single-core processor that does 1 task per ‘tick’ (a generic unit of time). It will take you 100 ticks to finish the operation. Now, what changes if you replace your single-core processor with a quad-core one? The 40 parallelizable tasks can be sent off to your four cores, taking 10 ticks (40 ticks / 4 cores). That leaves you with 60 tasks that have to be done in sequence, taking 60 ticks. So even though you might have 4 times the number of transistors in your system, it will still take you 70 ticks to finish the operation, as the sketch below shows.
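Here is a minimal sketch of that tick arithmetic (the tick model and task counts are just the toy numbers from above, not a real scheduler):

```python
def ticks_to_finish(total_tasks, parallel_tasks, cores):
    """Toy tick count: parallel tasks split evenly across cores,
    serial tasks run one after another on a single core."""
    serial_tasks = total_tasks - parallel_tasks
    parallel_ticks = parallel_tasks / cores
    return parallel_ticks + serial_tasks

print(ticks_to_finish(100, 40, cores=1))   # 100.0 ticks on a single core
print(ticks_to_finish(100, 40, cores=4))   # 70.0 ticks on a quad core
print(ticks_to_finish(100, 40, cores=40))  # 61.0 ticks even with 40 cores
```

Notice that piling on cores helps less and less: the serial 60 ticks never go away.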

This is captured by a general law called Amdahl’s Law. Amdahl’s Law states that the time T(n) an algorithm takes to finish when executed on n threads of execution, with a fraction B of the algorithm that is strictly serial, corresponds to:

$$ T(n) = T(1)\left(B + \frac{1}{n}(1 - B)\right) $$

[Figure: Amdahl’s Law at 50%, 75%, 90%, and 95% parallelizable code. Source: http://en.wikipedia.org/wiki/Amdahl’s_law#mediaviewer/File:AmdahlsLaw.svg]

As can be seen in the graph, even if your code is 95% parallelizable, as n approaches infinity (an infinite number of processors) you only get a 20x speedup, or just over 4 Moore’s Law doubling cycles (8-10 years).
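To see where that 20x comes from, here is a minimal sketch of Amdahl’s Law expressed as a speedup (speedup is T(1)/T(n), and as n grows without bound it approaches 1/B):

```python
def amdahl_speedup(n, serial_fraction):
    """Speedup T(1)/T(n) from Amdahl's Law for n threads,
    where serial_fraction is B, the strictly serial share of the work."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

B = 0.05  # 95% parallelizable code
for n in (4, 16, 256, 4096, 1_000_000):
    print(f"{n:>9} threads: {amdahl_speedup(n, B):6.2f}x speedup")

print(f"limit as n -> infinity: {1 / B:.0f}x")  # the 20x ceiling from the graph
```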

This article isn’t meant to convince you that these issues won’t be solved. In fact, for what it’s worth, I’m strongly of the opinion that they will be: new computing architectures and substrates mean we will likely resume some form of rapid growth before long (that may be colored by a degree of hope, but enough alternatives are being explored that I find it fairly likely). While it’s an interesting problem in its own right, I think it’s more useful as an example of how every technology forecasting law has associated theorems and roadblocks, and of why finding these is important to a forecast.

Associated Laws and Roadblocks

Forecasting laws have associated laws. That’s a pretty simple sentence with a lot of meaning, so what exactly is it saying? Exactly this: for every statement you make about a capability changing over time (transistor counts, laser capabilities, etc.), there are associated laws relating to associated capabilities. Dennard Scaling associates with Moore’s Law: it’s the observation that power density stays the same, meaning power requirements per transistor must be dropping, which is what allows Moore’s Law to continue. There are any number of these, and in some ways you might even consider different versions of a forecasting law (such as which forecasting method you’re using) to be very closely associated laws.

Every technology (that we know of) has roadblocks as well. A roadblock is what I call ‘any obstacle to progress in the development of a technology’. There are a variety of types, and they can impact forecasting accuracy (macro) or simply describe problems that need to be, or will be, overcome in the pursuit of development (micro). Amdahl’s Law follows from mathematical axioms and is thus what I would call an ‘Axiomatic Roadblock’. This ties in with the impossibilities mentioned in “Possible, Probable, Personal“, specifically the axiomatic impossibility: the limitation is in place for mathematical reasons more than physical laws (a semantic distinction that dissolves if looked at closely enough, but useful for identification purposes).

While identifying the issues with a naive Moore’s Law forecast is important in itself, and I hope I’ve clarified things somewhat for my readers, it’s just as important as an example of how associated laws that might otherwise be passed over can lead to new limitations. I personally think these issues will be overcome and that Moore’s Law will continue (or will need to be reformulated if a different substrate has different research patterns associated with it). All the same, being able to identify when axiomatic and physical impossibilities and roadblocks will arise is absolutely necessary for judging the validity of a forecast.
