Flaky?

From Michael Arrington's post on Digg acquisition rumors:

One point of controversy was around Digg’s claim of 20 million unique monthly visitors and steep monthly growth, whereas the Comscore’s most recent September report shows only 1.3 million monthly unique visitors and flat growth since April (see chart below). Comscore is notoriously flaky, and these numbers are for U.S. households only. Comscore is almost certainly significantly under-reporting Digg traffic.

Michael is one of the best bloggers ever and I read Techcrunch every day. But I think he got this one wrong. Comscore is not "flaky". They are a third party measurement service. They don't always get everything right. None of the third party measurement services do. But they are the best of the lot in my opinion. Now I am biased as I have been an investor in Comscore since 1999 and have been on the board since then.

I do not think Wall Street, Madison Avenue, and all of the major Internet companies would spend millions of dollars a year on data that is "flaky". Comscore's data is always being reconciled, debated, improved, discussed, and analyzed and it gets better and better every year.

Now let's talk about Digg's Comscore numbers.

1 - Comscore is a "consumer panel". It measures mainstream web users. It is not a "leading edge" panel and it will almost certainly undercount "geek" services like Delicious and Digg. But it won't be off by 20x. It probably won't even be off by 2x.

2 - Comscore counts real viewers in its panel, not cookies. Cookies get deleted by spyware removal software. If you remove your cookies once a week, you'll look like four users every month to someone using cookies as a basis for UVs. The more sophisticated a user base is, the more likely they use cookie removal. And that results in significant UV overcounting.

3 - If you visit a service from multiple computers, you will be counted as multiple users by most analytics programs. I suspect a decent subset of Digg's user base does that.

4 - Comscore has a US panel and an international panel. The 1.3mm monthly uniques is US data. Comscore's worldwide number for Digg's UVs in September is 3.1mm.
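
The overcounting in points 2 and 3 can be put into rough numbers. This is only a back-of-the-envelope sketch, not how Comscore or any analytics package actually works. The function name and its parameters are made up for illustration, and it assumes a person comes back after every cookie deletion and visits from every machine they own.

```python
def cookie_based_uniques(real_people, cookies_per_device=1, devices_per_person=1):
    """Rough count of 'unique visitors' a cookie-counting tool would report.

    cookies_per_device: distinct cookies one device presents in a month
        (4 for someone who wipes cookies weekly, per point 2 above).
    devices_per_person: machines or browsers one person visits from (point 3).
    Both are illustrative assumptions, not measured values.
    """
    return real_people * cookies_per_device * devices_per_person


# One real person who clears cookies once a week looks like four users.
weekly_clearer = cookie_based_uniques(1, cookies_per_device=4)

# One real person visiting from home and work looks like two users.
two_machines = cookie_based_uniques(1, devices_per_person=2)

# The two effects multiply for someone who does both.
both = cookie_based_uniques(1, cookies_per_device=4, devices_per_person=2)

print(weekly_clearer, two_machines, both)  # 4 2 8
```

The point of the sketch is that the effects multiply, so even a modest share of users who delete cookies and use several machines can inflate a cookie-based count well beyond the number of real people.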

So let's look at Digg's claim that they have 20 million UVs. Do you believe that? As Bryce said in his comments on Techcrunch:

Wow, if 20 million visitors is true - that’s a lot - the entire population of Australia.

YouTube had 20mm unique visitors in September. Do you think that Digg has as many uniques as YouTube? I don't.

My guess is that Digg has something like 5mm monthly unique visitors worldwide. Not 20mm. The difference probably results from cookie counting, multiple browsers, and a few other factors.

And I'd like to encourage everyone out there to sit down and understand third party measurement services before calling them "flaky". My bet is they are more accurate than internal analytics numbers a lot of the time.

Comments

Fred,

This is a great post. Can you also explain what you think might have happened with the del.icio.us/comscore issue from a couple of months ago? It looked like less of an issue of how far off comscore was, and more of an issue of them suggesting a dramatic trend that Yahoo completely denied. The post is here - http://www.techcrunch.com/2006/08/04/more-stats-on-delicious-this-time-positive/

Michael,

i love the dog avatar you are using. i wish more people would use icons/avatars in mybloglog.

Comscore's sept numbers show del.icio.us has grown to about 800k US unique visitors about 4x what they had at the start of 2006.

i would guess that when you include international users, that number would at least double, so maybe 1.6mm to 2mm uniques worldwide.

the trick to using comscore to compare with internal weblogs is to use the worldwide numbers and then to realize that internal weblogs count cookies, not real users.

Excellent post Fred!

There are many other causes for a disconnect in website server logs to the actual facts. I would rely on comScore or NetRatings in 99% of cases when trying to understand the reach and value of a website. Especially if my money is on the table.

Yes, the main issue is probably the cookie deletion problem that web analytics companies are suffering with. It is compounded on sites like Digg, because of the highly technical user base that has a high rate of cookie removal. Additionally, these users are hyper-connected, with many points of connection to the web.

Yes – visit with FireFox and IE and you are seen as two people in server log, comScore counts only one person. Visit from your laptop and then your desktop, another two in your server logs, comScore counts only one. Then visit from your blackberry and internet café, another two in the logs. Visit from work then home, another two in the logs. Wow! This is going to be a really big issue as more people become ‘hyper-connected’ to the web.

This all adds up exponentially. Then pile on the number of spiders and bots that are coming to visit Digg to pull down their content, and you have an exaggeration of 12x reality.

Even FM Publishing has the estimate of total worldwide traffic at 8MM. When have you known a marketing kit to under estimate reality?

Both del.icio.us and digg.com have a fairly high number of technical users. These users would block comScore's method of data retrieval; add the international users on top of that and I can see comScore easily being out of whack. Even with that in mind, I believe Digg's numbers are a bit high, because technical users also attract more screen scrapers, bots, access from many computers for the same user and so on. Now if only I stayed in touch with Owen...

For whatever it's worth, Alexa figures do not support this Digg claim either.

On a relative basis, Digg reach is less than 1/6th that of youtube, and is just slightly higher than delicious.

http://www.alexaholic.com/youtube.com+icio.us+digg.com

Triangulating this information puts Digg figures at about 1 mill in the US.

Is there any evidence that supports Digg's 20mm claim?

It seems like a self interested and unsubstantiated claim. It is irresponsible reporting to indict comScore in this situation.

This is a quote from Comscore's website...

"These members, representing a cross section of the Internet population, give comScore explicit permission to confidentially monitor their online activities in return for valuable benefits such as server-based virus protection, sweepstakes prizes, and the opportunity to help shape the future of the Internet."

Tell me with a straight face that even 5% of Digg's user base is going to participate in the services offered there. This is a new generation of traffic and monetization... I don't think even Alexa's numbers are close to right for sites such as Digg.

Michael,

You could argue the opposite. Digg users have a high rate of trial on new technology and services. comScore recruits their panel based on many different methods and offerings, including Random Digit Dial (RDD), which is the most statistically significant approach around today.

But beyond the recruiting methods, it doesn't matter if the Digg user base is under represented by the panel, because they use solid weighting and projections to ensure the panel is representative of the total US online population.

There is no possible situation that would explain a disconnect of 20MM vs. 1MM. I looked up some of the other ratings services (some of which don't recruit a panel, and use more passive methods to monitor) and the numbers were similar.

Check out the numbers: http://larrison.blogspot.com/2006/10/out-of-whack.html

They are not all apples to apples, but they give you the general idea of the situation. I think Digg should release access to their server data.

PS - even FM Publishing reports 8MM WW visits.

Let's evaluate this using information from Digg's site. They have a "Top Users" section where they list all registered Digg users who have Dugg at least one story.

There are 11,397 pages and a total of 341,908 users.

http://www.digg.com/topusers/page11397

Clearly the digg user base is built from a much larger group of people that do not register and simply consume the content.

If we assume that 90% of Digg's visitor base never register or digg items, we land at an estimate of 3.4 million users worldwide.

In order to get to 20mm, the registered user base would have to account for no more than 1.7% of its total user base.

This is not impossible, but highly improbable.
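
The arithmetic behind that estimate can be checked in a few lines. This is just a sketch of the commenter's reasoning: the 341,908 figure comes from the Top Users count quoted above, the 10% registered share is the commenter's assumption, and the variable and function names are invented for illustration.

```python
REGISTERED_USERS = 341_908      # users listed on Digg's "Top Users" pages (from the comment)
CLAIMED_UNIQUES = 20_000_000    # Digg's claimed monthly unique visitors


def total_visitors(registered, registered_share):
    """Scale registered users up to total visitors, given the share who register."""
    return registered / registered_share


# If 90% of visitors never register or digg, registered users are 10% of the total.
estimate = total_visitors(REGISTERED_USERS, 0.10)

# Share of visitors that registered users would represent if the 20mm claim held.
implied_share = REGISTERED_USERS / CLAIMED_UNIQUES

print(f"{estimate:,.0f}")       # 3,419,080
print(f"{implied_share:.1%}")   # 1.7%
```

The result is only as good as the assumed registered share: moving it from 10% to 5% doubles the estimate to roughly 6.8 million, which is still far short of 20 million.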


I agree they are a bit flaky, however, I'm glad we at least have it.

My post earlier this month:

MySpace and Facebook primarily composed of older users? Yeah, right.

comScore recently reported that half of MySpace's users are now age 35 or older. What makes me question this entire report is the fact that they also claim 48% of Facebook's userbase to be between the ages of 12-24, as of August 2006. Facebook didn't open to public registrations until late September. In August, Facebook did have various private companies listed in their system, which if you had an email account with them, you could join Facebook -- however I find it very hard to believe that more than half of Facebook was comprised of users 25 and older (undergraduate college students must out-populate graduate students).

I have 7 Digg accounts. Everyone in our office has at least 5 Digg accounts. Why? Because gaming Digg works, and you can set up a Digg account with free email addresses.

The implications of the "gaming of Digg" go far beyond the validity of their news rankings, and directly to the heart of the question: what should you count as a "USER"?

Oh, and I forgot to mention that "gaming Digg" typically requires a quick erase of cookies to log in as another alias, thus giving the appearance of more "unique" visitors to many tracking programs.

"If you visit a service from multiple computers, you will be counted as multiple users by most analytics programs. I suspect a decent subset of Digg's user base does that."

Excellent point. I bet half of Digg users access Digg from home and work computers or even have multiple work computers.

Not to mention multiple Digg accounts.

fred, i wildly agree that a consumer panel is the only way to go. and i hope comscore (or folks like them) win out and soon in the publisher and popular and especially media imaginations because the boasting, hyperbole and blarney these days vis a vis web traffic is as thick as the late 1990s. maybe worse -- i would think many more folks these days know how unreliable and open to manipulation server logs and cookie-counting algorithms can be.

the old maxim in the TV world was, Nielsen stinks, but at least it stiffs everybody the same. i think we need that kind of accepted (albeit hated) standard on the web

If Digg has 20mm users, I'll take Lindsay Campbell out to lunch. :)

By your own estimates, Digg probably has 5 MM unique users. Comscore reports 3.1 million unique users. That means Comscore is underreporting by about 40%.

This is for a top 100 web site. 40% off. And that estimate from an investor.
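
For anyone who wants to check that figure, here is a small sketch of the calculation. It takes the 5 million guess from the post as the "true" value, which is itself only an estimate, and the function name is made up for illustration.

```python
def underreport_fraction(estimated_actual, reported):
    """Fraction by which a reported figure falls short of the estimated actual."""
    return (estimated_actual - reported) / estimated_actual


# Figures in millions of monthly unique visitors, worldwide.
gap = underreport_fraction(estimated_actual=5.0, reported=3.1)

print(f"{gap:.0%}")  # 38%
```

So "40%" is a rounded version of 38%, and the number moves a lot depending on which guess you take as the actual traffic.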

I suppose the word 'flaky' is open for interpretation. If Comscore didn't describe what they do as 'measurement', I might agree that 'flaky' was too strong of a word. Read more on sample bias: http://www.naffziger.net/blog/index.php/2006/10/11/sample-size-vs-sample-bias/

The best way to bridge the gap will be through an audit. We used PWC's Business Assurance group (New media) at Quova to audit our IP geography accuracy claims. I found them to be incredibly thorough, smart and reasonable. They do extensive work with the IAB and if they aren't already involved in resolving this, I'm sure they're being considered.

How Google can overnight dethrone Alexa's influence and over time compete with NNR and comScore:

1. Open Google Analytics to everyone;
2. Make it free,
3. Let publishers/site owners make traffic public on their sites
4. Compile rankings of largest sites using these numbers.

If the public (press, investors, ad buyers) could rely on those site-specific numbers in lieu of a panel-based number, what do you think they would trust more?

For more:
http://www.watchmojo.com/web/blog/?p=630

It can be tough to reconcile other tracking numbers with comscore.

I know of one large (millions of uniques/month) site that gets close to 3x difference in uniques with comscore much much lower vs omniture...

I don't think anyone is "right" in any provable sense, we've just agreed as an industry to accept comscore numbers, right or wrong.

Compete, (www.compete.com), is showing US uniques at over 2.26 million with a nice spike in the last month.

An interesting analysis on Digg.com was posted on the Compete blog today. Link: http://blog.compete.com

For trended data, see http://snapshot.compete.com/digg.com/

Disclosure: Compete is my Employer

I think that it is fair to say that digg isn't a news site and by relying upon digg as your source of news you are giving up your own free will. ok that was a bit much but digg is an entertainment site and if you choose to get your news there you are relying upon (even without anyone gaming it) conventional wisdom to tell you what is news. the fact that somewhere along digg's evolution as a service 'news' happened to be what people perceived as digg's premise doesn't change the fact that the whole notion of conventional wisdom is frightening. i'm not saying that is the case with digg but it sure could be... i haven't seen anyone question the premise of others telling you what is news or what you should read and I certainly have seen others screaming about the people who game the system. this isn't about news, it's about traffic and money. people aren't pissed because the news isn't right, they are pissed because they think someone is gaming the system which may result in traffic being 'stolen' from them.

I have been reading your blog since I took a trip across Europe for several months in 2004. I learn more from your blog, or am directed by it to the places where I do, than from any other resource in an industry I am not directly a part of. Following your blog and the information shared has become one of my several avocations.

I do not comment very often, (other than to GG's blog for dinner recommendations when we are in NYC) and believe my last one to you may have been during the tsunami in December of '04.

I was struck by your comment that you "do not think Wall Street, Madison Avenue, and all of the major Internet companies would spend millions of dollars a year on data that is "flaky"."

I have worked on Wall Street (West Coast mostly) since 1983. I have many friends that work on Madison Avenue. I began as a financial advisor, working my way up to a portfolio manager, manager of RIA's, F.A.'s private bankers, serving on the committees of senior executive officers, regional chairmen, presidents of large investment banking and asset management institutions and most recently, overseeing a region of states for a mutual fund company owned by a rather prominent banking institution. My first-hand experience has been to witness that millions of dollars actually are wasted by Wall Street and Madison Avenue from time to time on failed ad campaigns, product launches and management programs based on data that has been outdated or does not fit doing business in the real world, hence my definition of flaky.

I am sorry if I am mincing words but as a right brain spirit that has worked in a left-brain career most of my professional life...words are a language to me as numbers are a language to others. Do you feel your statement may have been too broad of a brush stroke?

For those of us in the industry, Comscore data is frustratingly inaccurate. At my company, we have very robust internal reporting tools (yes, we can tell a user who visits us with two browsers) and we routinely see Comscore under-report visitor numbers by 20x. Routinely.

When we do contact them to try to understand their techniques, it becomes apparent that they have more invested in preserving the status quo than in achieving any change.

While I make no claims about what digg's traffic is or isn't, having watched Comscore screw over companies both big and small with their reports, I truly hope this pushes the company to improve their service, and simultaneously inspires audiences to cast a very wary eye on Comscore.

While media planners use comscore and nnr to justify their media plans, the fact is that neither comscore nor nnr is used in the buying stage: clients ask publishers to submit plans based on internal numbers and then audit these with their own ad servers during the campaign.

Does that not show that panels are off?

Read more here:
http://www.watchmojo.com/web/blog/?p=643
