Comments on the Larry Roberts and Caspian Networks study of Internet traffic growth

Andrew Odlyzko
University of Minnesota
odlyzko@umn.edu
http://www.dtc.umn.edu/~odlyzko

1. Introduction

The study of Internet traffic growth by Larry Roberts and his team from Caspian Networks, released in the form of a PowerPoint presentation, claims that backbone Internet traffic in the United States is growing at an annual rate of about 300%. If correct, this would be extremely significant for the future of the Internet and even the entire high tech industry. However, I have serious doubts whether this claim is valid.

2. Technicalities

The Larry Roberts study claims that Internet traffic is growing at 4x per year, meaning that if we take his measure of traffic for the current month, and divide by that for the corresponding month a year ago, we get a factor of about 4. It may seem pedantic to go through such obvious definitions, but it is apparently necessary, as one news report claimed the study established the annual growth rate to be 400%, while another claimed it was 4% per year. In fact, 4x annual growth corresponds to a growth rate of 300% per year.

In general, there is an astonishing degree of innumeracy around, which often leads to statements that are incorrect, sometimes to the degree that they are absurd, in other cases plausible enough that they are misleading. Many examples are cited in my papers, and the news reports mentioned above are two more instances. To cite yet another example, to show that similar problems crop up among financial analysts, the June 27, 2001 report from CIBC, entitled "Switching gears - Photons dominate next generation networks," coauthored by R. Schafer, J. Jungjohann, A. Bezoza, and K. Gallagher, says on the title page (and in several places further on) that "Internet traffic is still doubling every 100 days." Traffic doubling every 100 days corresponds to a growth rate of over 12x per year. Yet in the body of this report, Exhibit 4 on p. 8 and Exhibit 11 on p. 16 both show Internet traffic growing only slightly faster than 2x per year! As yet another example, the news story quoted below on the growth rate of Cable & Wireless IP traffic confused bits and bytes (as indirect correspondence with the company showed), with the result that the actual traffic carried by this ISP was understated by a factor of 8.

There are numerous ways to go astray in counting traffic. One issue that crops up occasionally is that telecom links are offered as symmetric ones, with the same capacity in both directions, and one has to worry whether measurements cover just one or both directions. (Utilization figures are often misleading because they do not take this factor into account.) For maximal clarity, I have always preferred to work with traffic measured in bytes either entering a network or exiting the network. (Those two are seldom identical, because of issues such as packet losses and multicasting, but tend to be close.)

The 95-th percentile figures used by Larry Roberts are much more ambiguous. In principle, if we look at one direction of some link, it might show ordinary utilization (in terms of bytes carried divided by maximal capacity) of 6%, but its 95-th percentile traffic figure could be anyplace between 1.05% and 100%, depending on the traffic profile. In practice, on large backbone links, this is not a problem, as the 95-th percentile traffic figure appears to be typically around 1.5 times the ordinary one, but this is a concern. There are also other problems with using 95-th percentile traffic estimates, related to questions about frequency of measurements, how the data for the two directions are combined into the 95-th percentile figure, etc. Hence I prefer to use straight byte traffic figures, as those are relatively unambiguous. The data Larry Roberts works with, though, is that of 95-th percentile traffic estimates.

3. The Larry Roberts and Caspian Networks study

This study is based on data obtained from what it says are the 19 largest ISPs in the United States. However, it does not name them. (Further, how would one know which are the largest 19 without having measurements for a much larger sample of carriers?) More seriously, as far as I know not a single ISP out of these 19 has come forward to say anything about the quality of the measurements they have provided to Larry Roberts. (This is important, since although the study claims on slide #3 that this is the first real measurement of Internet traffic since 1996, it is not a systematic measurement with a uniform methodology, but a compilation of measurements provided by unknown sources.) Thus it is hard to judge how far one can trust the final results.

The Larry Roberts study claims that the Internet traffic growth rate was at an annual rate of 3.9x in the half-year from April to October of 2000, and accelerated to 4x in the following half-year ending in April 2001. This estimate goes counter to the estimates that Kerry Coffman and I have made. We first observed back in 1997/8 (in the paper "The size and growth rate of the Internet," published in the Oct. 1998 issue of "First Monday") that Internet traffic was growing only about 2x per year (by which we explicitly said we meant growth rates of between 1.7x and 2.5x per year, since our data did not allow greater precision). This was extremely controversial, as it went counter to the almost-universally accepted "Internet traffic doubling every three or four months," which corresponds to rates of between 8x and 16x per year.

We further predicted that this was likely to be the natural rate of growth in the future. This prediction was based on observation of numerous institutions that exhibited such growth rates for extended periods of time, even when they had plentiful bandwidth.
This phenomenon suggested that traffic growth was not governed by available capacity (as the mantra "build it and they will come" suggested), but rather by the rate of adoption of new applications.

A followup study by Kerry Coffman and myself, entitled "Internet growth: Is there a 'Moore's Law' for data traffic?," was released in the summer of 2000, and updated earlier this year. It is about to be published, and is available, along with our other papers, at . It confirmed our earlier prediction that Internet traffic would grow about 2x a year, at least through the middle of 2000, and produced yet more evidence that this was likely to be the growth rate in the future.

Although the Roberts study also comes in with growth rates far lower than the 8x or 16x per year that had been widely assumed, it makes a huge difference whether the true growth rate is 2x (as Kerry Coffman and I estimate) or 4x per year (as Larry Roberts claims). If 4x is the right growth path, then the Internet industry will soon see a huge increase in revenues, and this will feed back into huge increases of sales in equipment. However, if the growth rate is 2x per year, then the picture is much less cheerful, as technological progress, combined with competition and the overbuilding during the past couple of years, would keep revenue growth to modest levels for quite a while. In particular, which of our estimates is correct is likely to become very evident within a year or two, from watching what happens to the carriers' revenues.

I am skeptical of Larry Roberts' estimate of 4x annual growth. All the data I have seen is far more consistent with 2x growth rates. (Moreover, the recent cutbacks in capital spending plans by many major carriers for the next couple of years suggest that they do not see huge growth rates in traffic on their networks.)
I will not go into the details, but my papers with Kerry Coffman are fully documented, and based primarily on publicly available information, or else information where the sources are explicitly named. While we did not obtain systematic statistics about a large sample of ISPs, as Larry Roberts says he has done, we did get data for a few, which are listed in the papers (together with sources of data). Furthermore, all the evidence that has been accumulating since our work goes to support our estimate of annual 2x growth. (For example, the plot of Genuity's traffic, recently made available at , shows a very steady 2.2x annual growth rate for that carrier for the last 3.5 years. Also, the recent growth rate for the Cable & Wireless backbone, presented in the June 25, 2001 story in "The Standard," , shows their growth rate rebounding to this range after a period of very low growth, caused by the problems they had in taking over internetMCI.)

One may ask how this 2x estimate fits with the growth rate of AT&T's Internet traffic, which was recently revealed by an official spokesperson to be around 3x per year (see the August issue of "The Cook Report on the Internet"). Well, Kerry Coffman and I knew the AT&T growth rate (although we did not have permission to publish it), but we also had heard, for example, that AT&T's share of peering traffic of some other carriers was growing, which confirmed our view that AT&T was growing faster than the industry average. Thus this evidence was also consistent with our 2x estimate.

Why would Larry Roberts obtain a higher estimate of Internet traffic growth rate? I do not know, since his data is not available for inspection and analysis. Thus it is hard to say how carefully it was assembled and processed. It could be that he is measuring capacity growth more than traffic growth (and it has generally been true that capacity has grown faster than traffic, as Kerry Coffman and I have pointed out many times).
It all depends on what kind of data was collected in the first place. Our papers contain some amusing (or appalling, depending on one's view) examples where authoritative people cited traffic figures that turned out to be wrong.

A major reason I am concerned about the quality of the data and analysis that Larry Roberts relied on is that there seem to be numerous small problems evident in his PowerPoint presentation. Let me just cite a few points. They may seem trivial, but they are bothersome.

(A) On slide #3, it is claimed that this is the first thorough study of Internet traffic since 1996, when the NSFNET statistics ceased to be available. Well, that is not correct, as is easy to check. NSFNET statistics collection ceased in April 1995, when the NSFNET was phased out. Even the late-1994 and early-1995 NSFNET statistics cannot be regarded as giving a representative view of Internet growth, as traffic started shifting away from NSFNET by the end of 1994. Thus we have not had a thorough study of Internet traffic since 1994. (The NSFNET traffic statistics and reports are easily available on the Web, at .)

(B) On slide #9, it is claimed that in other countries, Internet traffic is growing 2.8x per year, and that this was the historical (pre-2000) growth rate in the US. What is the source of this claim? Has Larry Roberts obtained data from foreign carriers? (Their growth rates vary widely, from about 2x per year for Australia to about 4x for international Internet bandwidth for China.) Also, what evidence is there that 2.8x was the annual growth rate in the US before 2000? The NSFNET statistics cited above show a remarkably regular 2x annual growth rate in the early 1990s. Further, in the paper "Beyond Moore's Law: Internet growth trends," published on pages 117-119 in the Jan. 2000 issue (vol. 33, no. 1) of "IEEE Computer," and available in a version entitled "Internet growth trends" at , Larry Roberts claimed (without providing any references) that Internet traffic started growing at 4x per year in 1998. (Growth rates of 4x from 1998 through today would have produced absurdly high traffic volumes by now.)

(C) Slide #10 claims that the four largest ISPs carry comparable traffic volumes. This appears to be counter to the general belief (supported by some data about peering volumes) that UUNET is still by far the largest ISP. (The U.S. Department of Justice and the European antitrust authorities do have precise data for early 2000, collected in connection with the proposed and eventually aborted takeover of Sprint by MCI WorldCom, but unfortunately this data is not available to the public.)

(D) On slide #11, we have a graph showing growth in Internet traffic, starting in 1970. It appears to show that some measure of traffic (it is not clear whether it is the 95-th percentile or some other one) grew from about 20 bps in 1970 to about 300 Mbps in 1995. Well, growing by a factor of 15 million (300 Mbps divided by 20 bps) in 25 years corresponds to a growth rate of 1.94x per year. Thus the long-term growth rate has been close to 2x.

This slide suggests relatively smooth transitions from one growth rate to another. However, Kerry Coffman and I found that there was a period of abnormally rapid growth in 1995 and 1996. The traffic through the NSFNET backbone at the end of 1994 was about 15 TB/month (measured in straight bytes entering or leaving the network). Readings from the traffic statistics for the public NAPs (which were initially publicly available on their Web sites) showed that traffic through them was around 500 TB/month in mid-1996, and around 1,200 TB/month at year-end 1997.
Under the reasonable assumption that little traffic was transiting more than a single NAP, this provides a firm lower bound on Internet traffic, and does show there was a growth spurt in 1995 and 1996. (We do not have detailed data about the time distribution of this spurt. Also, we estimated total U.S. Internet backbone traffic to have been around 1,500 TB/month at year-end 1996, but this estimate is less solid.)

(E) On slide #13, it is estimated that prices are dropping in half each year, while traffic is growing 4x, which implies revenues should be growing 2x per year. Yet we do not see such rapid growth in revenues. Generally, carriers' IP revenues from business customers seem to be growing at 30-50% per year. (See, for example, the May 11, 2001 report entitled "IP!," from J.P. Morgan and McKinsey.) Furthermore, since prices are not in general dropping in half annually (with the exception of some major routes where there is a lot of fiber and heavy competition), this claim seems very questionable.

(F) The claims about utilization of backbone links are questionable. Utilizations are low, and are likely to stay low for a variety of reasons. As an example, one can examine the AboveNet network, which makes its detailed traffic statistics publicly available at .

The above points contribute to my doubts about the soundness of Larry Roberts' conclusions.

4. Concluding remarks

While I am skeptical of Larry Roberts' 4x growth rate estimate, it could conceivably be correct. If it is, then something very interesting must be going on. Since we do know that the bandwidth of local connections to the Internet leased by the ILECs is not growing very fast, this would most likely mean that the fiber provided by CLECs is finally beginning to carry significant traffic, and new applications are being deployed much faster than before.

What makes the 4x growth rate feasible is that the Internet is not all that large, certainly not when compared to the amount of fiber in the ground. For example, with very few exceptions ISPs typically have just a single OC48 or at most OC192 link along their major routes. Yet the facilities-based carriers typically have between 40 and 800 fibers along each route, and each fiber is usually capable (with current DWDM technology) of carrying 80 OC48 or OC192 wavelengths. Thus only a small fraction of the fiber capacity is currently used for Internet traffic. (Another way to look at this is to note that the estimate on slide #6 of Larry Roberts' PowerPoint deck of just 1,200 Gb/s sum of edge ports for the 19 ISPs corresponds to just about 120 OC192 router interfaces.)

Although I am skeptical of the 4x growth rate that Larry Roberts estimates, I agree with him on several important points. One is that it is business traffic that is dominant on the Internet. (Too many projections for the future of the Internet seem to be based on the behavior of residential customers only.) Another is that there is no sign of a serious slowdown in the rate of Internet traffic growth. There have been recent estimates from financial analysts that Internet traffic growth is on a rapidly decelerating path, and even one claim, by the head of Nortel, that not just the rate of growth, but Internet traffic itself, has gone down. In that view, the telecom crash was caused by users abandoning their former ways, and tempering their appetite for bandwidth. Neither Larry Roberts nor I see any evidence of that. I would not go as far as he does in calling Internet traffic "anti-recessionary," but there are factors (such as plummeting prices of transmission and switching capacity) that could stimulate traffic growth even in a recession. The tragedy of Sept. 11, 2001 can only spur the search for redundant paths, duplication of databases, etc., which will fuel growth.
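The comparison made above between ISP backbone links and the fiber along a route can be put in rough numbers. The Python sketch below is only an illustration: the SONET line rates are the standard nominal values rather than figures taken from the text, the function names are hypothetical, and it assumes every fiber on a route could be fully lit, which is an upper bound and not a description of any actual carrier.

```python
# Standard SONET line rates in Gb/s. These are an assumption added for
# illustration; the text names the link types but not their rates.
OC48_GBPS = 2.48832
OC192_GBPS = 9.95328


def route_capacity_gbps(fibers, wavelengths_per_fiber, gbps_per_wavelength):
    """Upper bound on one route's capacity if every fiber were fully lit."""
    return fibers * wavelengths_per_fiber * gbps_per_wavelength


def share_of_route(link_gbps, fibers, wavelengths_per_fiber, gbps_per_wavelength):
    """Fraction of a route's potential capacity taken by a single ISP link."""
    capacity = route_capacity_gbps(fibers, wavelengths_per_fiber, gbps_per_wavelength)
    return link_gbps / capacity


def equivalent_oc192_ports(total_gbps):
    """Number of OC192 interfaces whose line rates sum to total_gbps."""
    return total_gbps / OC192_GBPS


if __name__ == "__main__":
    # Smallest case in the text: 40 fibers, 80 OC48 wavelengths each,
    # against a single OC48 ISP link on that route.
    small = share_of_route(OC48_GBPS, 40, 80, OC48_GBPS)
    # Largest case in the text: 800 fibers, 80 OC192 wavelengths each,
    # against a single OC192 ISP link.
    large = share_of_route(OC192_GBPS, 800, 80, OC192_GBPS)
    print(f"single link share, small route: {small:.4%}")
    print(f"single link share, large route: {large:.5%}")
    # Slide #6 figure: 1,200 Gb/s of edge ports across the 19 ISPs.
    print(f"OC192-equivalent ports: {equivalent_oc192_ports(1200):.1f}")
```

Even in the smallest case, a single link is well under a tenth of a percent of what the route could in principle carry, which is the sense in which a 4x growth rate is physically feasible.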
Internet traffic is growing rapidly (even 2x per year is rapid by any standard measure), and the crash was caused by the collision of reality with unrealistic expectations. Business plans made on the basis of assumptions of 8x or 16x growth per year, or on the basis of the associated myth of "Internet time," could not survive in an environment of 2x growth per year.
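Since much of the confusion discussed in these comments comes from mixing up multipliers, percentage rates, and doubling times, the conversions can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative and do not come from the Roberts study or from the papers cited, and the 95-th percentile bounds assume the common definition in which the top 5% of utilization samples are discarded and each sample lies between 0 and full capacity.

```python
def percent_growth(multiplier):
    """Convert an annual multiplier (4x) to a percentage growth rate (300%)."""
    return (multiplier - 1.0) * 100.0


def annual_factor_from_doubling(period, periods_per_year):
    """Annual multiplier implied by traffic doubling once every `period`."""
    return 2.0 ** (periods_per_year / period)


def annual_factor_from_endpoints(start, end, years):
    """Constant annual multiplier that takes `start` to `end` in `years`."""
    return (end / start) ** (1.0 / years)


def p95_bounds(mean_utilization, tail=0.05):
    """Range of possible 95-th percentile utilization for a given mean.

    Lower bound: the discarded top 5% of samples run at full capacity,
    and the rest of the traffic is spread evenly over the other 95%.
    Upper bound: all traffic is packed into the busiest 5% of samples,
    capped at full capacity.
    """
    lower = max(0.0, (mean_utilization - tail) / (1.0 - tail))
    upper = min(1.0, mean_utilization / tail)
    return lower, upper


if __name__ == "__main__":
    print(percent_growth(4.0))                        # 4x per year as a percentage
    print(annual_factor_from_doubling(100, 365))      # doubling every 100 days
    print(annual_factor_from_doubling(3, 12))         # doubling every 3 months
    print(annual_factor_from_doubling(4, 12))         # doubling every 4 months
    print(annual_factor_from_endpoints(20, 300e6, 25))  # 20 bps to 300 Mbps, 1970-1995
    print(p95_bounds(0.06))                           # 6% average utilization
```

Under these assumptions the sketch reproduces the figures quoted in the text: 300% for 4x growth, over 12x for doubling every 100 days, 16x and 8x for doubling every three or four months, about 1.94x per year for slide #11, and a 95-th percentile range of roughly 1.05% to 100% for a link with 6% average utilization.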