Lee Gomes' Aug. 2 WSJ column does make a valid point that hits
still matter. But at the same time the evidence it cites provides
strong support for a quantitative form of the "long tail" hypothesis.
This quantitative form suggests where long tails are likely to be
most important, and where their influence might be slight.
The quantitative form of the "long tail" hypothesis arises from the
ubiquitous Zipf's Law, which says that the k-th most popular item is
1/k times as popular as the most popular one. The share of the top k
items out of a total of n is then (1 + 1/2 + ... + 1/k) divided by
(1 + 1/2 + ... + 1/n), and if we approximate 1 + 1/2 + 1/3 + ... + 1/k
by log(k), the most popular k items should account for a fraction
log(k) / log(n)
of everything bought, viewed, and so on. Now let's look at the numbers
in Lee's column:
(a) For Amazon, Lee cites estimates that the top 100,000 sellers
account for 60% to 80% of all sales. Since Amazon is supposed to
list 3.7 million books, the rule above suggests that the top 100,000
should account for
log(100000) / log(3700000) = 0.761...,
or 76%, right in the range of estimates we have.
(b) Netflix: 50 out of 60,000 titles account for 30% of rentals
according to Lee's column. The rule above predicts
log(50) / log(60000) = 0.355...
which is more than the 30% actually observed (and so the "long tail" is
even bigger than the rule predicts).
(c) YouTube: Top 10% of 5.1 million videos account for 79% of plays,
and top 20% for 89%. The rule above predicts
log(510000) / log(5100000) = 0.8509...
and
log(1020000) / log(5100000) = 0.8957...,
respectively, so that in the first case the "long tail" is again bigger
than predicted, while in the second case it is almost exactly on target.
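The three comparisons above can be reproduced with a few lines of Python. This is only a minimal sketch of the log(k) / log(n) rule as stated here; the function name log_rule_share and the layout of the cases list are illustrative choices, and the cited shares are simply the figures quoted from Lee's column.

```python
import math


def log_rule_share(k: int, n: int) -> float:
    """Share of demand captured by the top k of n items under log(k)/log(n)."""
    if n < 2 or not 1 <= k <= n:
        raise ValueError("need n >= 2 and 1 <= k <= n")
    # The ratio does not depend on the base of the logarithm.
    return math.log(k) / math.log(n)


# (description, k, n, share cited in the column)
cases = [
    ("Amazon, top 100,000 of 3.7 million", 100_000, 3_700_000, "60% to 80%"),
    ("Netflix, top 50 of 60,000", 50, 60_000, "30%"),
    ("YouTube, top 10% of 5.1 million", 510_000, 5_100_000, "79%"),
    ("YouTube, top 20% of 5.1 million", 1_020_000, 5_100_000, "89%"),
]

for description, k, n, cited in cases:
    predicted = log_rule_share(k, n)
    print(f"{description}: predicted {predicted:.1%}, cited {cited}")
```

Because only a ratio of logarithms is involved, natural logs and base-10 logs give the same answers.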
So the conclusion is that yes, the "long tail" is definitely there, and
the numbers Lee cites show striking agreement with the quantitative form
of Chris Anderson's hypothesis. But this form also suggests that the long
tail may often not matter too much, and so Lee may often be right. The
key question is just how long the long tail is, and whether it is likely
to get longer.
Consider the Amazon example. With the current 3.7 million
titles, the top 100,000 should account (according to the logarithmic ratio
rule) for 76% of sales. But how much larger can the 3.7 million figure
grow? Books are not easy to write, and so even if every would-be author
who manages to write a complete manuscript gets "published" in some form,
we are unlikely to increase the total number of books by more than a factor
of 10, say. So suppose that Amazon goes to 37 million books from 3.7 million.
Then the quantitative rule would suggest that the top 100,000 titles would
account for 66% of the sales. That is a noticeable drop from the 76% today,
but hardly earth-shattering.
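One way to see how slowly the head loses ground is to turn the rule around: if the top k items are to hold a given share s, then log(n) = log(k) / s, so n = k^(1/s). The sketch below assumes nothing beyond the logarithmic ratio rule itself; the function names are made up for illustration.

```python
import math


def log_rule_share(k: int, n: float) -> float:
    """Share of the top k items out of n under the log(k)/log(n) rule."""
    return math.log(k) / math.log(n)


def catalog_size_for_share(k: int, share: float) -> float:
    """Catalog size n at which the top k items hold exactly `share` of demand.

    Obtained by solving share = log(k) / log(n) for n.
    """
    if k < 2 or not 0.0 < share <= 1.0:
        raise ValueError("need k >= 2 and 0 < share <= 1")
    return k ** (1.0 / share)


top = 100_000
print(f"share at 37 million titles: {log_rule_share(top, 37_000_000):.1%}")
for target in (0.76, 0.66, 0.50):
    n = catalog_size_for_share(top, target)
    print(f"top {top:,} titles hold {target:.0%} when n is about {n:,.0f}")
```

Under this rule the top 100,000 titles would fall to half of all sales only when the catalog reaches 100,000 squared, or 10 billion titles, which is far beyond the tenfold growth considered above.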
On the other hand, the difference can be substantial in other settings.
For example, if historical patterns repeat, then home-made videos will become
key to the growth in penetration of broadband. And with improved cameras,
editing tools, and high-speed connectivity, it is easy to imagine billions
of videos available on the Net. Let's assume we end up with a relatively
modest figure of 6 billion videos (we already have over 5 million on YouTube).
Then the top 50 titles on Netflix might drop from the 36% predicted by the rule
for today to 17%, and the entire current inventory of 60,000 titles might
account for just
log(60000) / log(6000000000) = 0.488...
or 49% of the total. That would be a major change.
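The effect of a growing pool of videos can be tabulated directly. The following is a rough sketch under the same log(k) / log(n) rule; the 6 billion total is the hypothetical figure from the paragraph above, and the intermediate totals are arbitrary illustrative steps, not forecasts.

```python
import math


def log_rule_share(k: int, n: int) -> float:
    """Share of the top k items out of n under the log(k)/log(n) rule."""
    if n < 2 or not 1 <= k <= n:
        raise ValueError("need n >= 2 and 1 <= k <= n")
    return math.log(k) / math.log(n)


TOP_TITLES = 50           # the hits cited for Netflix
CURRENT_CATALOG = 60_000  # the Netflix inventory cited in the column

# Hypothetical totals; only the first and last appear in the text above.
for total in (60_000, 6_000_000, 600_000_000, 6_000_000_000):
    hits = log_rule_share(TOP_TITLES, total)
    catalog = log_rule_share(CURRENT_CATALOG, total)
    print(f"n = {total:>13,}: top 50 hold {hits:.1%}, "
          f"current 60,000 titles hold {catalog:.1%}")
```

Each hundredfold increase in the total adds the same amount to log(n), so the shares shrink steadily rather than collapsing all at once.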
The quantitative version of the "long tail" hypothesis is developed in my paper
with Ben Tilly, "A refutation of Metcalfe's Law and a better estimate for the
value of networks and network interconnections,"
http://www.dtc.umn.edu/~odlyzko/doc/metcalfe.pdf
(which also gives references for Zipf's Law and related issues),
and in a shorter form in the paper with Bob Briscoe and Ben Tilly,
"Metcalfe's Law is wrong," which appeared in the July 2006 issue of IEEE
Spectrum,
http://www.spectrum.ieee.org/jul06/4109
It can also be used to provide a quantitative justification for the
observation that connectivity has traditionally been valued more highly
than content, as was shown in my Feb. 2001 paper "Content is not king,"
http://firstmonday.org/issues/issue6_2/odlyzko/
Basically the huge mass of trivial communications (such as your making a
dinner reservation), mostly of very little importance to anyone beyond
the two people involved, and so at the extreme tail of the long tail,
outweighs the blockbusters. (Ordinary voice telephony in the US, wired
and wireless, still produces well over $300 billion a year in revenues,
while Hollywood brings in something like $80 billion, and much of that
from overseas.)