Web Analytics Blogs

Eric T. Peterson has been working in web analytics for over ten years and has built up an incredibly rich body of knowledge about the subject, knowledge Mr. Peterson works to share every week here in his Web Analytics Demystified weblog. Whether you're new to the subject or the most experienced practitioner, you should join the thousands of people around the globe already subscribing to Peterson's blog and start reading today.

Archive for 'Web 2.0'

Free white paper on measuring multimedia on the Internet

This morning the fine folks at Nedstat in Holland published a white paper that Michiel Berger and I co-wrote titled Measuring Multimedia Content in a Web 2.0 World. This free white paper explores the emerging direct measurement model for multimedia content by examining several common business cases for deploying video, and it provides a new set of definitions and key performance indicators (KPIs) designed to help companies effectively track their investment in video-based content.

The timing is somewhat ironic because Judah has been writing a fair amount about Video Analytics over in his blog — I guess great minds think alike!

While video measurement has been around for a while, the new social media certainly increases the complexity associated with determining the efficacy of video from a business perspective. The folks at Nedstat are committed to helping their customers resolve these issues, and are generously making our white paper available without registration requirements.

You can read the press release about the paper’s availability or download your own copy right away.

Please attend my webinar on Web Analytics 2.0 and the Web Site Optimization Ecosystem

Thanks to Tealeaf I’m excited to be able to present a free webinar on December 11th titled “Who, What, Where, When, and Why: Understanding Visitor Interactions on the Internet.” I’ll be presenting my thoughts on Web Analytics 2.0 and discussing the Web Site Optimization Ecosystem fundamental to helping companies effectively measure and manage visitor and customer experiences in a Web 2.0 world. Plus, everyone who registers will get a copy of a white paper I recently published, sponsored by Tealeaf, titled Customer Experience Management and Web Analytics: From KPIs to Customer Transactions.

When: December 11th at 9 AM Pacific / Noon Eastern
Register at: The Tealeaf web site

If you’ve ever wondered about Tealeaf and how their technology is best integrated with your existing web analytics practice I’d encourage you to attend this free seminar.

Example uses of the visitor engagement metric

My post last week on measuring visitor engagement was pretty long by the time I outlined the calculation, so I put off publishing examples of how the metric could be used until now. I’m excited to see that this topic has generated so much interest, both in terms of comments and emails sent to me directly.

My goal for this post is to provide a few examples and explanations to show how the metric can be used to supplement our otherwise already-rich set of web analytics data. Since so many folks have been willing to explore the engagement metric, I have embedded a bunch of questions in this post in italics that I’d love your feedback on.

Distribution of engagement scores and segmentation. Here is the distribution of engagement scores for about six months at Web Analytics Demystified by percent of visitors. As you can see, these scores are right-skewed (most visitors cluster at the low end, and the distribution tails off as the score increases), showing that nearly half (47.6%) of visitors to my site are “poorly engaged”. When I look at this distribution it makes perfect sense to me. What do you think?

I have created segments to group visitors by their engagement score: “well engaged” visitors have engagement scores over 30%, “moderately engaged” visitors are those between 10% and 30%, and “poorly engaged” visitors score less than 10%. These segments can then be used to explore how the behavior of visitors in each engagement group differs by looking at my page and referring source dimensions (page, content group, referring domain, campaign, search phrase, etc.).
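As a rough sketch of the segmentation just described, the thresholds can be expressed in a few lines of Python. The function names and the sample scores below are made up for illustration, and how scores of exactly 10% or 30% are bucketed is an assumption, since the definitions above do not say.

```python
from collections import Counter

def engagement_segment(score: float) -> str:
    """Map an engagement score (0.0 to 1.0) to a segment name.

    Thresholds follow the definitions in the post; treating exactly
    10% and exactly 30% as "moderately engaged" is an assumption.
    """
    if score > 0.30:
        return "well engaged"
    if score >= 0.10:
        return "moderately engaged"
    return "poorly engaged"

def segment_shares(scores):
    """Return the fraction of visitors that falls into each segment."""
    counts = Counter(engagement_segment(s) for s in scores)
    total = len(scores)
    return {segment: count / total for segment, count in counts.items()}

# Invented scores, purely for illustration (not data from the site)
sample_scores = [0.02, 0.05, 0.08, 0.09, 0.12, 0.25, 0.31, 0.55]
shares = segment_shares(sample_scores)
```

With per-visitor scores in hand, the same segment label could be attached to any other dimension (referring domain, search phrase, country) to reproduce the kinds of breakdowns discussed in this post.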

Identify relationships that might otherwise not be found. At the top of this report you can see the pronounced difference in visitor engagement (and traditional metrics) for “branded” and unbranded searches (“None”) bringing visitors to my site. Now, because branded searches are a component of the calculation (Brand Index), you definitely expect to see a difference between the two engagement scores. What is interesting is that while other metrics (duration, sessions per visitor, page views per session) show a slight difference, visitor engagement and conversion are both three times higher for branded searches. I think this difference observed in all the metrics is further evidence that brand-driven searches are bringing more engaged visitors. What do you think?

In the middle table you can see search phrases bringing visitors to my site, showing visitor engagement, page views per session, and sessions per visitor. Here three phrases stand out to me:

  1. “web analytics book” and “web analytics process”, neither of which is particularly distinguished from other search phrases based on page views per session or sessions per visitor, but both of which have visitor engagement scores over double my site-wide average of 8.8%. This is important to me because these are un-branded search terms that are critically important to my business.
  2. “vendor discovery tool”, which would appear to be pretty important based on traditional metrics but only stands out slightly using the visitor engagement score (at 13.6%). I spend a lot of time trying to figure out how to drive folks using the vendor discovery tool to take other actions (buy books, inquire about consulting) and this data suggests that there is an unrealized opportunity.
  3. “performance indicators”, which shows that the visitor engagement metric is useful to identify terms that you’d think are important to the site but aren’t attracting the right audience (average engagement score for these visitors is only 5.6%).

I think this level of information is actually pretty helpful for identifying search marketing opportunities — what do you think?

Engagement-derivative metrics like “Percent Highly Engaged Visitors” are useful. Here you can see a select group of referring domains showing the percent of highly and percent moderately engaged visitors they’re sending my way (with conversion to show that engagement and conversion are in fact different!) Avinash Kaushik is sending me a few (0.2%) highly engaged visitors (thanks!) but Ian Thomas is sending me a bunch (70.4%) of moderately engaged visitors, many of whom are purchasing books (1.2% conversion rate.)

By looking at traffic from Avinash’s site over time (bar graph) I can see peaks and valleys in overall engagement from folks coming from his site. It would be useful to back into those peaks to try to determine what other bloggers’ readers are reacting to when they exhibit highly engaged behavior on my site (see late August and early September). Given that Clint proved that conversion is a poor measure of success when trying to evaluate traffic from other bloggers, I think visitor engagement is useful for examining the non-revenue value of referring sources. What do you think?

Those of you who are looking for correlation between engagement and conversion, have a look at the data for Mr. Jim Sterne’s wonderful site emetrics.org —  5.6% of the folks coming from Jim’s site are highly engaged, 66.2% moderately engaged, and man-oh-man does Jim help sell some copies of Web Analytics Demystified.  You’re the man, Jim!

Visitor engagement is globally useful. At least in Visual Sciences Visual Site you can apply engagement metrics and segments to pretty much any dimension tracked. Here I’m looking at the percentage of “highly” engaged visitors (50% or more) in my “well engaged” segment broken down by country. Now, this is certainly more interesting in light of the total volume of traffic coming from each geographic location, and as I think about localizing my books and planning future trips around the world this information becomes very helpful.

There is more, including some of the more granular visitor-level stuff I talked about in the first series of posts on the subject, but I want to be sensitive to protecting the identity of individual users on my site. If you’re interested in helping me collect some “ground truth” regarding the engagement calculation, write me and I’ll explain how you can help.

So what do you think? Do the screen-shots help you understand the calculation better? Or do they still make it look super-complicated and scary? Is there something specific you’d like to see me demonstrate with the calculation? Or do you think you could come up with these same insights using more traditional metrics?

Web Analytics 2.0? I am more worried about Web Analytics 3.0!

If you’re reading the web analytics blogs, you’ve probably already heard about the recent presentations I’ve given on the subject of “Web Analytics 2.0”. The future of web analytics and the relationship between Web 2.0 technology and measurement is something I’ve been talking about for over six months (I actually have a Web Analytics 2.0 workshop that I regularly give that you can read about under Analytics Consulting on my site), but given that it is “conference season” it is no wonder that this subject is getting attention from other folks in the industry. I have given my presentation at Web Analytics Day in Brussels and SEMphonic X Change in Napa, and will be giving a variation on the same at Jim Sterne’s Marketing Optimization Summit in October.

Due to demand, you can download a PDF of the presentation from the white papers section of my site. If you’re interested in learning more about Web Analytics 2.0, please give me a call and I’d be happy to discuss it with you.

Strangely enough, the slides that are generating the most interest and commentary are not those about the Web Site Optimization Ecosystem, the integration of quantitative and qualitative data, or the Web Analytics Demystified RAMP, but rather the few slides I included outlining my thoughts about Web 3.0 and what I am calling Web Analytics 3.0.

What the heck is Web Analytics 3.0?!

Before I can tell you what Web Analytics 3.0 is, I need to tell you what I think Web 3.0 is going to be. The good old Wikipedia basically dodges this by saying:

Web 3.0 is a term that has been coined with different meanings to describe the evolution of Web usage and interaction along several separate paths. These include transforming the Web into a database, a move towards making content accessible by multiple non-browser applications, the leveraging of artificial intelligence technologies, the Semantic web, the Geospatial Web, or the 3D web.

While I know that Judah is all hopped up on the notion of the semantic web, after having traveled to Tokyo and Europe in the past month, I find myself absolutely convinced that the next technology era will be characterized by our collective ability to access the Internet anyplace, anytime, using so many devices we begin to look back on computers much the same way young people do television today — as something nice to use when YouTube is unavailable. Rolf Skyberg, a disruptive innovator from eBay who I met in Rotterdam a few weeks back, called it “digital ubiquity” — the point where we forget that the Internet actually exists and take our ability to access information completely for granted.

Given so many sexy alternatives — 3D web, transforming the Internet into a database, artificial intelligence, and the such — why am I so convinced that in the next three years we’ll be talking about Web 3.0 when we talk about mobile phones and non-traditional browsers?

Easy. The financial opportunity available via the mobile Internet makes the billions transacted today look like pocket change.

Think about it:

Just think for a minute about how your browsing experience might change if the web sites you visited remembered you and delivered a tailored experience based on your demographic profile (theoretically available via your phone number), your browsing history (accurate because you’re not deleting your phone number) and your specific geographic location when you make the request?

Now think about how the advertising buying experience would change if the same were true, not to mention behavioral targeting. I mean, given GPS and demographic data, the behavior being tracked could be “works downtown during the day, checks Facebook on his phone often, lives in the suburbs, surfs sports scores from his neighborhood bar.” The Starbucks web site could have a link at the top with a coupon to save $1 on my double-tall non-fat latte in stores 1 block, 2 blocks, and 5 blocks from my current location; the Best Buy web site could have an in-store promotion for the store I am standing in, targeted to my age and gender; and my search engine could disambiguate my searches based on my demographic profile, my geographic location, and my recent search history to serve me paid search ads designed to influence my geo-spatial movement, not just my likelihood to click.

Jeepers, huh?

Sure there are privacy issues, but given the intensely personal relationship most people have with their cell phones, and the fact that far more people in the world have mobile phones than computers (Gartner estimates 271 million units sold to end-users by Q2 2007) it is easy to make a convincing case for mobile computing and digital ubiquity defining the next technology era, much like social networking, AJAX, XML, and mashed-up business models define the current Web 2.0 era we’re living in today.

Okay, mobile is the future. So what the heck is Web Analytics 3.0?

If Web Analytics 1.0 was all about measuring page views to generate reports and define key performance indicators, and if Web Analytics 2.0 is about measuring events and integrating qualitative and quantitative data, then Web Analytics 3.0 is about measuring real people and optimizing the flow of information to individuals as they interact with the world around them.

Your log file analyzer can do that, right?

The current state of mobile measurement isn’t about Omniture and Visual Sciences, it isn’t about JavaScript and cookies, and it isn’t about page views, visits, and visitors. Web Analytics 3.0 is going to be something completely different, and it will depend on completely new technology. Anil Batra and I talked about a project he did a few years back while he was at digiMine: he hacked together WAP gateway logs into a pseudo-log file, using the phone number in place of a cookie. Brilliant, and the fact that Anil has this experience propels him to very near the head of the class for Web Analytics 3.0 analysts.
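To make the pseudo-log idea a little more concrete, here is a minimal sketch in Python. It is not a reconstruction of what Anil actually built: the record layout, the sample phone numbers, and the function names are all hypothetical, real WAP gateway logs vary by vendor, and hashing the phone number rather than storing it raw is an added assumption made with the privacy issues in mind.

```python
import hashlib
from datetime import datetime

# Hypothetical gateway records: (timestamp, phone number, requested URL).
# The layout and values are invented for illustration only.
gateway_records = [
    ("2007-09-01T08:15:00", "+15035550101", "/news"),
    ("2007-09-01T08:16:30", "+15035550101", "/sports"),
    ("2007-09-01T09:02:10", "+15035550102", "/news"),
]

def visitor_id(phone_number: str) -> str:
    """Derive a stable ID from the phone number, standing in for a cookie."""
    return hashlib.sha256(phone_number.encode("utf-8")).hexdigest()[:16]

def to_pseudo_log(records):
    """Turn gateway records into log-style lines keyed by visitor ID."""
    lines = []
    for timestamp, phone, url in records:
        when = datetime.fromisoformat(timestamp)
        lines.append(f"{when:%Y-%m-%d %H:%M:%S} {visitor_id(phone)} GET {url}")
    return lines

pseudo_log = to_pseudo_log(gateway_records)

# Fields per line: date, time, visitor ID, method, URL
unique_visitor_count = len({line.split()[2] for line in pseudo_log})
```

Because the ID is derived from the phone number, it stays the same across sessions in a way a deletable cookie does not, which is the property that makes the approach interesting.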

In theory, the mobile Internet has many of the same measurements as the hard-wired Internet. But as the information the platform and device providers make available changes, something I very much believe will happen, the quality and volume of information at our disposal will increase and improve. The W3C document on “Mobile Best Practices 1.0” already exists but, surprisingly enough, doesn’t have a section about logging requests or measuring user interaction. M:Metrics is out there providing analyst reports, but the service is more similar to comScore and Nielsen than WebTrends and ClickTracks.

This post is already extremely long but I wanted to start the conversation. In future posts, as time allows, I’ll expand on some of what I believe is possible and how. In the interim, let me know what you think! Am I wrong? Is Web 3.0 bigger than mobile? Or do you already have a handle on measuring your mobile content, even without GPS and phone numbers as unique IDs? Do you personally have experience doing analysis on mobile content? If so, I’d love to hear about your experience.

As usual, I very much welcome your comments here and am also happy to receive them directly via email. Also, if you’re a mobile service provider or device manufacturer concerned with how advertisers and marketers will measure their success through your platform, application, or device, I would love to talk to you about the Web Analytics Demystified vision for Web Analytics 3.0.

Congratulations to the WAA Standards Committee!

I wanted to say congratulations to Jason Burby, Angie Brown, and everyone on the Web Analytics Association’s Standards Committee for publishing their standards document last week. Given the number of web analytics terms they defined (26) and the somewhat slow process the Association has for getting documents approved, this effort is a huge milestone for the organization, one that Jason and Angie deserve great praise for indeed!

If you haven’t already downloaded and read the definitions, check them out here (PDF download).

While the PDF document says that the final product is “Web Analytics Definitions - Version 4.0” this is clearly a “Web Analytics 1.0” document. The committee relegated all of the really wonderful Web 2.0 stuff like AJAX, RSS, XML, and the such to the same confusing obscurity they exist in today with the comment “certain technologies including (but not limited to) Flash, AJAX, media files, downloads, documents, and PDFs do not follow the typical page paradigm but may be definable as pages in specific tools.”

Given the last year’s push towards measuring Web 2.0 the right way and some great, insightful work from folks like Ian Houston and Judah Phillips, it is kind of a shame that this document doesn’t address event-based measurement architecture more directly. The group does define “event” but only does so under the header of “Conversion Metrics”, stating that an event is “any logged or recorded action that has a specific date and time assigned to it by either the browser or server.”

Sounds like the definition of a Web 2.0 event to me, but I’m not sure why this is relegated to conversion metrics.

Regardless, this is great and valuable and useful work on the part of these hard-working volunteers. But the definition of standards raises one particularly important question: Given the definition of standards, what the hell do web analytics practitioners do with them?

The Fundamental Problem

The fundamental problem with these definitions (and any standard definitions IMHO) is that without an enforcement mechanism they are unlikely to provide any real benefit to the folks in the trenches. As long as smart folks like Eric Enge at Stone Temple Consulting continue to uncover as much as a 154% difference in the measured number of visitors and a 161% difference in the measured number of page views between concurrently deployed solutions, the average web analytics end user should not be comforted by the existence of standards.

Put another way, it is not the definition of standards that makes a difference, it is the adherence to standards by technology vendors that will provide the portability of skills, knowledge, and solutions so desired by many in our industry. Jason Burby sagely points this out in his ClickZ article on his volunteer work when he says:

“Companies often switch metrics tools and subsequently change the terms they use to discuss analytics. One tool will call something one name, while another tool calls it by a different name or applies different meanings to a very similar name. When people switch tools and bring data with them, they don’t get an apples-to-apples comparisons. As a result, companies lose the important year-over-year view.

Though the new standards won’t instantly take care of that issue, they provide a step in the right direction.”

The Barrier to the Adoption of Standards

The problem as I see it is this: For many web analytics vendors, the way they calculate some of the critical metrics in web analytics is the “secret sauce” in their solution. Consider the WAA’s definition of unique visitors which states that unique visitors are:

“The number of inferred individual people (filtered for spiders and robots), with a designated reporting timeframe, with activity consisting of one or more visits to a site. Each individual is counted only once in the unique visitor measure for the reporting period.”

This is perfectly reasonable, but the definition goes on to say that “a unique visitor count is always associated with a time period (most often a day, week, or month), and it is a non-additive metric.”
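The “non-additive” part of that definition is easy to see with a toy example. The sketch below uses an invented visit log (the days and visitor IDs are made up) to show that summing daily unique visitors overstates the unique visitors for the whole period, because anyone who returns on a second day is counted again.

```python
# Invented visit log: (day, visitor ID) pairs, for illustration only
visits = [
    ("Mon", "a"), ("Mon", "b"),
    ("Tue", "a"), ("Tue", "c"),
    ("Wed", "a"), ("Wed", "b"), ("Wed", "d"),
]

def unique_visitors(visits, days):
    """Count distinct visitor IDs seen on any of the given days."""
    return len({visitor for day, visitor in visits if day in days})

# Unique visitors for each day taken on its own
daily = {day: unique_visitors(visits, {day}) for day in ("Mon", "Tue", "Wed")}

# Adding the daily figures counts returning visitors more than once
sum_of_daily = sum(daily.values())

# Deduplicating over the whole period counts each visitor once
period_uniques = unique_visitors(visits, {"Mon", "Tue", "Wed"})
```

Here the three daily figures add up to 7, while only 4 distinct visitors appear across the period, which is why a unique visitor count has to be computed for the reporting period as a whole rather than built up from smaller periods.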

Do you wonder what the folks at Visual Sciences who have spent millions to perfect their “data wheels” technology that effectively removes the “time period” requirement would say to this? One of the major value propositions at Visual Sciences (at least during my brief tenure) was that time was irrelevant — if you wanted the number of unique visitors for the football season, you dragged your mouse across the calendar; if you wanted the number of unique visitors for a few hours during the day, you dragged your mouse; if you wanted the number of unique visitors to your site since recording began, you dragged your mouse.

You can make the case that this example more or less removes the time dependence associated with the WAA definition. But should all the vendors who don’t have this capability (anywhere you are forced to use metrics like “Daily Unique Visitors”) spend the R&D money necessary to eliminate the dependence on time? Or should Visual back this functionality out of their application?

When you start to think about these kinds of things, much less issues associated with data sampling and data roll-off that occurs for a litany of reasons, you can start to understand why I made this somewhat snide comment in a MediaShift article a while back:

“A friend of mine described it as the most beautiful fantasy…but it would never happen,” consultant Peterson said. “Omniture has a $1 billion market cap, and I don’t see Omniture tearing apart their technology to calculate unique visitors and page views differently because all their competitors have decided there’s a different way to do it. It’s hard to imagine. Not impossible. Fantasies sometimes come true.”

Ironically the cost isn’t the main problem: The impact on existing customers who would be forced to learn new definitions and suffer from potentially dramatic changes in data collection and reporting is the main problem. Do you want to be the person who has to tell a Fortune 500 customer that, because you’re adopting more standard definitions, their page view count will suddenly drop by 35% month-over-month?

I had to do that once. Trust me here, it wasn’t a fun conversation to have.

An Idea in the Absence of a Solution

Given that I think that the WAA has produced some incredibly valuable work, despite some potential barriers to the work’s adoption, I do have an idea that I would love to see the Association follow up on, one that would add a tremendous amount of value to this already great work.

I would love to see the Standards Committee create a matrix of standards compliance for each of the vendors in the marketplace today. Basically a checklist that details, on a term-by-term basis, which vendors are currently using the WAA definitions, so that companies looking for a solution can include that criterion in their assessment. Something that would let everyone quickly determine:

  1. How standards compliant a given solution is (and which solution today is “most compliant”)
  2. Which standard definitions are calculated out-of-box in each solution (for example, “Original Referrer” and “Bounce Rate”)
  3. Which currently available solutions dramatically differ from the norm in their use of standard terms
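To give a sense of how small such a matrix could be, here is a rough sketch in Python. Everything in it is a placeholder: the vendor names are generic, the True/False values are invented rather than real assessments, and the helper functions are hypothetical, not part of any WAA deliverable.

```python
# Placeholder vendors and values: NOT real compliance assessments.
# Each entry records whether a vendor uses the standard definition of a term.
compliance = {
    "Vendor A": {"Unique Visitors": True, "Bounce Rate": True, "Original Referrer": False},
    "Vendor B": {"Unique Visitors": False, "Bounce Rate": True, "Original Referrer": False},
    "Vendor C": {"Unique Visitors": True, "Bounce Rate": True, "Original Referrer": True},
}

def compliance_score(terms):
    """Fraction of standard terms a vendor implements as defined."""
    return sum(terms.values()) / len(terms)

def most_compliant(matrix):
    """Vendor with the highest share of standard-compliant terms."""
    return max(matrix, key=lambda vendor: compliance_score(matrix[vendor]))

def vendors_supporting(matrix, term):
    """Vendors that use the standard definition of one term."""
    return sorted(vendor for vendor, terms in matrix.items() if terms.get(term))

best = most_compliant(compliance)
original_referrer_vendors = vendors_supporting(compliance, "Original Referrer")
```

A score per vendor speaks to the first question, a per-term lookup to the second, and the vendors with the lowest scores are candidates for the third, though a real matrix would need documentation behind each cell.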

Something like this would probably have to be backed up with some documentation or examples as proof points, just for reference. And yeah, this is kind of a lot of work, but if you think about it all you really need is for one WAA member per solution to poke around in their documentation and then someone (Jason and Angie maybe) to collate the results and write it up. I would be happy to contribute the matrix assessment for the web analytics solution I’m using now if that would help!

Who knows, maybe we’d discover that all the vendors are already standards compliant and there really isn’t a problem with definitions!

What Do You Think?

I’d love to hear what all of you think about the new standards and my concerns about how they’ll be used (or not used). Am I missing something? Were you disappointed to not see something that spoke more clearly to your concerns about Web 2.0 technology? Or are you just pleased that the WAA published these definitions and see them as a small-but-important first step?
