Web Analytics Blogs

Eric T. Peterson has been working in web analytics for over ten years and has built up an incredibly rich body of knowledge about the subject, knowledge Mr. Peterson works to share every week here in his Web Analytics Demystified weblog. Whether you're new to the subject or the most experienced practitioner, you should join the thousands of people around the globe already subscribing to Peterson's blog and start reading today.

Archive for 'Web Analytics 2.0'

How to measure visitor engagement, redux

Back in December of last year when I first posted on measuring visitor engagement, I hardly imagined how much interest the topic would generate. Shortly after the first post, I commented that my definition of engagement was as follows:

Engagement is an estimate of the degree and depth of visitor interaction on the site against a clearly defined set of goals.

I then went and wrote over a dozen posts, publishing feedback from some incredibly bright people and demonstrating the utility of a well-defined measure for engagement. Since that time, however, some have questioned the value of such a metric, which prompted me to update and publish the following calculation for visitor engagement:

I presented this calculation to a completely full room last week at Emetrics but wanted to provide an update to all my patient readers who were not able to make the event. You can download my entire Emetrics presentation on “Web Analytics 2.0”, which includes the slides on measuring visitor engagement, from the White Papers and Presentations section of my site.

I very much believe that engagement is a metric, not an excuse, and that the metric described in this post provides a powerful measurement framework for sites looking for new ways to examine and evaluate visitor interaction. I know that for my own site, the use of simple measures like “bounce rate”, “conversion rate” and “average time spent” is simply insufficient for selling anything other than my books. But I’m now in the business of selling consulting, a complex and sometimes time-consuming sale, and so I’m always on the hunt for any web analytics measure that will give me an edge and help identify truly qualified opportunities.

I believe this metric is exactly that.

This post is an extension of the work I did in late 2006 and early 2007 and was written to clarify my position, update my thinking in the context of “Web Analytics 2.0”, and reiterate my desire to have an open and honest conversation with my peers and other interested parties regarding the measurement of visitor engagement. Web analytics is hard but not impossible; the same is true regarding the calculation and use of robust measures of visitor behavior.

I believe the visitor engagement measurement to be perhaps the most important of all “Web Analytics 2.0” measurements. Given that this model fully supports both quantitative and qualitative data, and given that the model is built as much around the measurement of “events” as around page views, sessions, and visitors, I (perhaps haughtily) believe this calculation to be prototypical of the types of measurements we will see as we continue to explore the boundaries of “Web Analytics 2.0” (download my presentation from SEMphonic X Change).

The Web Analytics Demystified Visitor Engagement Calculation

The latest version of my visitor engagement metric, with notes about its calculation and use, is as follows. If you’re too busy to read this entire post but would like to learn more about this measure, please write me directly and we can set up a time to discuss it.

This is a model, not an absolute calculation for all sites. I agree with other analysts and bloggers who insightfully say that there is no single calculation of engagement useful for all sites, but I do believe my model is robust and useful with only slight modification across a wide range of sites. The modification comes in the thresholds for individual indices, the qualitative component, and the measured events (see below); otherwise I believe that any site capable of making this calculation can do so without having to rethink the entire model.

The calculation needs to be made over the lifetime of visitor sessions to the site and also accommodate different time spans. This means that to calculate “percent of sessions having more than 5 page views” you need to examine all of the visitor’s sessions during the time-frame under examination and determine which had more than five page views. If the calculation is unbounded by time, you would examine all of the visitor’s sessions in the available dataset; if the calculation was bounded by the last 90 days, you would only examine sessions during the past 90 days.
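The time-bounding rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the session layout (a `start` datetime and a `page_views` count per session), the function names, and the sample dates are all hypothetical, not part of any particular analytics tool.

```python
from datetime import datetime, timedelta

def sessions_in_window(sessions, now, days=None):
    """Return the sessions inside the time frame under examination.

    sessions: list of dicts with a "start" datetime (hypothetical layout).
    days: None means unbounded, so every session in the dataset is used.
    """
    if days is None:
        return list(sessions)
    cutoff = now - timedelta(days=days)
    return [s for s in sessions if s["start"] >= cutoff]

def pct_sessions_over(sessions, page_views=5):
    """Fraction of the given sessions with more than `page_views` page views."""
    if not sessions:
        return 0.0
    deep = sum(1 for s in sessions if s["page_views"] > page_views)
    return deep / len(sessions)

# Made-up sessions for a single visitor.
now = datetime(2007, 10, 22)
visitor_sessions = [
    {"start": datetime(2007, 3, 1), "page_views": 9},
    {"start": datetime(2007, 9, 30), "page_views": 2},
    {"start": datetime(2007, 10, 15), "page_views": 7},
    {"start": datetime(2007, 10, 20), "page_views": 3},
]

unbounded = pct_sessions_over(sessions_in_window(visitor_sessions, now))
last_90 = pct_sessions_over(sessions_in_window(visitor_sessions, now, days=90))
print(unbounded)  # 2 of 4 sessions are deep
print(last_90)    # 1 of the 3 sessions in the last 90 days is deep
```

The same visitor scores differently depending on the window, which is why the time frame has to be fixed before any of the indices are compared.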

The individual session-based indices are defined as follows (and these are slightly updated from past posts on the subject):

  • Click-Depth Index (Ci) is the percent of sessions having more than “n” page views divided by all sessions.
  • Recency Index (Ri) is the percent of sessions having more than “n” page views that occurred in the past “n” weeks divided by all sessions. The Recency Index captures recent sessions that were also deep enough to be measured in the Click-Depth Index.
  • Duration Index (Di) is the percent of sessions longer than “n” minutes divided by all sessions.
  • Brand Index (Bi) is the percent of sessions that either begin directly (i.e., have no referring URL) or are initiated by an external search for a “branded” term divided by all sessions (see additional explanation below)
  • Feedback Index (Fi) is the percent of sessions where the visitor gave direct feedback via a Voice of Customer technology like ForeSee Results or OpinionLab divided by all sessions (see additional explanation below)
  • Interaction Index (Ii) is the percent of sessions where the visitor completed one of any specific, tracked events divided by all sessions (see additional explanation below)
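As a rough illustration of the six session-based indices, here is a minimal Python sketch. The session fields (`page_views`, `minutes`, `referrer`, `search_term`, `gave_feedback`, `events`), the function name, and the sample data are assumptions made for the example; the default thresholds mirror the "n" values quoted later in this post for my own site.

```python
from datetime import datetime, timedelta

# Two of the branded terms named in this post; extend for your own site.
BRANDED_TERMS = {"web analytics demystified", "eric t. peterson"}

def session_indices(sessions, now, n_pages=5, n_weeks=3, n_minutes=5):
    """Compute Ci, Ri, Di, Bi, Fi and Ii as shares of all sessions."""
    total = len(sessions)
    if total == 0:
        return dict.fromkeys(("Ci", "Ri", "Di", "Bi", "Fi", "Ii"), 0.0)
    recent_cutoff = now - timedelta(weeks=n_weeks)

    def share(predicate):
        return sum(1 for s in sessions if predicate(s)) / total

    def is_branded(s):
        term = (s.get("search_term") or "").lower()
        return s["referrer"] is None or term in BRANDED_TERMS

    return {
        "Ci": share(lambda s: s["page_views"] > n_pages),
        "Ri": share(lambda s: s["page_views"] > n_pages
                    and s["start"] >= recent_cutoff),
        "Di": share(lambda s: s["minutes"] > n_minutes),
        "Bi": share(is_branded),
        "Fi": share(lambda s: s["gave_feedback"]),
        "Ii": share(lambda s: s["events"] > 0),
    }

# Made-up sessions for one visitor.
now = datetime(2007, 10, 22)
sessions = [
    {"start": datetime(2007, 10, 20), "page_views": 10, "minutes": 7,
     "referrer": "google.com", "search_term": "web analytics demystified",
     "gave_feedback": False, "events": 1},
    {"start": datetime(2007, 10, 21), "page_views": 2, "minutes": 2,
     "referrer": None, "search_term": None,
     "gave_feedback": False, "events": 0},
    {"start": datetime(2007, 6, 1), "page_views": 8, "minutes": 3,
     "referrer": "someblog.example", "search_term": None,
     "gave_feedback": True, "events": 0},
    {"start": datetime(2007, 10, 1), "page_views": 1, "minutes": 6,
     "referrer": "google.com", "search_term": "engagement metrics",
     "gave_feedback": False, "events": 0},
]
indices = session_indices(sessions, now)
print(indices)
```

Every index is a share of the same denominator (all sessions in the window), so each lands between 0 and 1 and can be summed with the others.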

In addition to the session-based indices, I have added two small, binary weighting factors based on visitor behavior:

  • Loyalty Index (Li) is scored as “1” if the visitor has come to the site more than “n” times during the time-frame under examination (and otherwise scored “0”)
  • Subscription Index (Si) is scored as “1” if the visitor is a known content subscriber (i.e., subscribed to my blog) during the time-frame under examination (and otherwise scored “0”)

You take the value of each of the component indices, sum them, and then divide by “8” (the total number of indices in my model) to get a very clean value between “0” and “1” that is easily converted to a percentage. Given sufficiently robust technology, you can then segment against the calculated value, build super-useful KPIs like “percent highly-engaged visitors” and add the engagement metric to the reports you’re already running.
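The summing step, the two binary indices, and a KPI like "percent highly-engaged visitors" can be sketched as follows. The function names and sample values are hypothetical, and the 0.5 cut-off for "highly engaged" is purely an assumption for the example; pick a threshold that suits your own site's distribution.

```python
INDEX_NAMES = ("Ci", "Ri", "Di", "Bi", "Fi", "Ii", "Li", "Si")

def loyalty_index(visit_count, n_visits=5):
    """1 if the visitor came more than n_visits times in the window, else 0."""
    return 1 if visit_count > n_visits else 0

def subscription_index(is_subscriber):
    """1 if the visitor is a known content subscriber, else 0."""
    return 1 if is_subscriber else 0

def engagement_score(indices):
    """Sum the eight component indices and divide by eight."""
    return sum(indices[name] for name in INDEX_NAMES) / len(INDEX_NAMES)

def pct_highly_engaged(scores, threshold=0.5):
    """Share of visitors at or above an assumed 'highly engaged' threshold."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s >= threshold) / len(scores)

# Made-up index values for one visitor.
visitor = {
    "Ci": 0.5, "Ri": 0.25, "Di": 0.5, "Bi": 0.5, "Fi": 0.25, "Ii": 0.25,
    "Li": loyalty_index(visit_count=7),
    "Si": subscription_index(is_subscriber=False),
}
score = engagement_score(visitor)
print(score)  # 3.25 / 8 = 0.40625, or about 41%

kpi = pct_highly_engaged([score, 0.75, 0.1, 0.5])
print(kpi)    # 2 of the 4 visitors are at or above 0.5
```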

The Visitor Engagement Calculation in Detail

The Click-Depth, Recency, and Duration indices are all pretty straightforward and are more-or-less the traditional indicators that most people (incorrectly) call “measures of engagement”. Each of these is very important to the overall calculation, but none of them alone is sufficiently robust to describe “engaged” visitors. I set the “n” values for my site’s calculation based on the average value for each and this seems to work pretty well (meaning my Ci looks for sessions more than “5 page views” in depth, my Ri looks for sessions more than “5 page views” that occurred in the “past three weeks” and my Di looks for sessions longer than about “5 minutes” in length).

Brand Index is a little more complicated. Here I have made a list of all the terms I believe to be “branded” for my site and business, terms like eric t. peterson, web analytics demystified, web site measurement hacks, web analytics wednesday, and the big book of key performance indicators. Whenever a session begins either with no referring domain or comes from a search engine with one of these terms attached, I count this as a “branded session” and score appropriately. While this index perhaps unfairly weights towards search engines, I firmly believe that if you’re starting your session with either my branded URL, my name, or the name of one of my books that you are already engaged.
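One way to classify a session as "branded" from its referring URL is sketched below. The branded terms are the ones listed above; everything else is an assumption for illustration: the function name, the list of query-string parameters that might carry the search phrase, and the choice to match a branded term anywhere inside the phrase.

```python
from urllib.parse import urlparse, parse_qs

BRANDED_TERMS = (
    "eric t. peterson",
    "web analytics demystified",
    "web site measurement hacks",
    "web analytics wednesday",
    "the big book of key performance indicators",
)

# Assumed names of the query parameters search engines use for the phrase.
SEARCH_QUERY_PARAMS = ("q", "p", "query")

def is_branded_session(referrer):
    """True for direct sessions and for searches containing a branded term."""
    if not referrer:
        return True  # no referring URL: the session began directly
    query = parse_qs(urlparse(referrer).query)
    for param in SEARCH_QUERY_PARAMS:
        for phrase in query.get(param, []):
            lowered = phrase.lower()
            if any(term in lowered for term in BRANDED_TERMS):
                return True
    return False

print(is_branded_session(None))
print(is_branded_session("http://www.google.com/search?q=Web+Analytics+Demystified"))
print(is_branded_session("http://www.google.com/search?q=bounce+rate"))
```

A stricter version would first check that the referring domain is a known search engine, since this sketch would also match a branded phrase in a `q` parameter on any other site.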

Feedback Index is the sole qualitative input to this model but it can easily be expanded if necessary. Here I am simply scoring sessions based on whether visitors are providing qualitative feedback via the OpinionLab “O” present throughout my web site or writing me directly by clicking a “mailto:” link. I’m not looking at whether the feedback is positive or negative, only whether feedback was given, operating under the belief that anyone willing to provide direct feedback is engaged.

The Feedback Index could easily be expanded by scoring based on the answer to direct questions posed to the visitor, questions like “do you find the content on this site valuable?”, “do you plan on calling Web Analytics Demystified about consulting?” and “would you describe yourself as engaged with this site?” Given a sufficiently robust mechanism for making the calculation, the Feedback Index can provide a tremendously powerful input to the visitor engagement model.

The Interaction Index captures sessions in which specific “engaged events” occur other than the site’s primary conversion event — events like downloading a white paper, providing an email address, requesting a presentation or PDF, commenting on a blog post, Digging a post, emailing content to a friend, printing a page, etc. The Interaction Index is designed to capture a small weighting from those measurable goals on your site you believe to be indicative of engagement.

The Interaction Index specifically does not examine commerce transactions and other conversion events of fundamental import to the site. While I have debated this in the past, here is the rationale for recommending the exclusion of primary conversion events:

  1. These events already have their own key performance indicator: conversion. Given that conversion is likely already defined for most transactional sites and tracked in great detail, adding conversion to the visitor engagement calculation is superfluous in my opinion.
  2. The visitor engagement metric is designed to provide information about the large number of visitors who do not convert. Given relatively low conversion rates online, having visitor engagement be decoupled from conversion provides a cleaner measure for use in exploring non-purchaser behavior, including looking for independent correlation between the two measures.
  3. By excluding conversion, the two metrics can be used side-by-side to look for visitor behaviors that may not be obvious otherwise. Given the lifetime of possible visitor behaviors, having a way to look for well-engaged visitors who have not completed a transaction online or have completed a transaction outside of the available data set provides a critical view not otherwise readily attained.

The Loyalty Index is a reflection of my belief that repeat visitation behavior is perhaps the best measure of engagement available. Based on the distribution of visitor loyalty data at Web Analytics Demystified, I score “1” when visitors have come to the site more than five times in the past 12 months.

The Subscription Index is a reflection that truly engaged visitors are able to self-identify by subscribing to our blogs or newsletters; if you have taken the time to subscribe to one of the Web Analytics Demystified blogs I believe you to be engaged. If your site does not have some type of XML-based content subscription you can either drop this index or (perhaps better) look for an opportunity to develop a subscription service, thus giving your visitors another good engagement point.

How Does This All Work in Practice?

Careful readers will likely have already figured out that as visitors come to your site over time, their cumulative “lifetime engagement score” changes as they satisfy the criteria of each individual index. So someone coming from a Google search for “web analytics demystified” who looks at 10 pages over the course of 7 minutes, downloads a white paper and then returns to my site the next day will have a higher visitor engagement value than someone coming from a blog post who looks at 2 pages and leaves 2 minutes later, never to return.
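To make the comparison concrete, here is a worked example scored at the end of each visitor's first session only. It assumes my site's thresholds (more than 5 page views, longer than 5 minutes, within the past three weeks), that neither visitor left feedback or subscribed, and that neither has yet visited more than five times. The labels E_A and E_B are just names for the two visitors' scores.

```latex
\begin{align*}
E_A &= \frac{C_i + R_i + D_i + B_i + F_i + I_i + L_i + S_i}{8}
     = \frac{1 + 1 + 1 + 1 + 0 + 1 + 0 + 0}{8} = 0.625 \\
E_B &= \frac{0 + 0 + 0 + 0 + 0 + 0 + 0 + 0}{8} = 0
\end{align*}
```

The first visitor's score will move once the return visit is counted, because every session-based index is a share of all sessions: a shallow second session would pull Click-Depth and Duration down, while further visits push the visitor toward the Loyalty threshold.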

If you think about it for just a bit, and consider the components in the full calculation, the visitor engagement metric starts to make an awful lot of sense. Consider the following:

  • A visitor can quickly move through a lot of pages, getting exactly what they need, and still be scored usefully through the Click-Depth Index
  • A visitor can slowly and methodically read a few pages and be scored usefully through the Duration Index
  • A visitor can come to the site frequently and do little more than read a single page of content and be usefully scored through the Recency and Loyalty Indices
  • A visitor can come to the site once, subscribe to the blog, return later and download a presentation, and be usefully scored through the Subscription and Interaction Indices
  • A visitor can come to the site, click on dozens of pages but fail to find what they are looking for, then tell me so using my feedback mechanisms and be usefully scored through the Click-Depth and Feedback Indices

The power of the metric is appreciated when you apply it to the commonly measured dimensions found in web analytics: referring domain/URL, search engine/phrase, campaign/placement/creative, content group and page, browser/operating system, etc. Suddenly instead of looking at simple measures, you’re examining the potential of visitors coming from or going to each element in the dimension. To see the metric in action, I encourage you to read my post on the gradual building of context, at least until I’m able to publish new screenshots later this week.
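Applying the metric to a dimension amounts to grouping visitors by that dimension and summarizing their scores. The sketch below averages engagement per referring domain; the record layout, function name, and figures are made up for illustration, and a real analytics package would do this inside its own segmentation engine.

```python
from collections import defaultdict

def engagement_by_dimension(visitors, dimension):
    """Average engagement score for each value of the chosen dimension."""
    totals = defaultdict(lambda: [0.0, 0])  # value -> [sum of scores, count]
    for visitor in visitors:
        bucket = totals[visitor[dimension]]
        bucket[0] += visitor["engagement"]
        bucket[1] += 1
    return {value: total / count for value, (total, count) in totals.items()}

# Made-up visitor records with precomputed engagement scores.
visitors = [
    {"referring_domain": "google.com", "engagement": 0.625},
    {"referring_domain": "google.com", "engagement": 0.375},
    {"referring_domain": "someblog.example", "engagement": 0.125},
]
by_domain = engagement_by_dimension(visitors, "referring_domain")
print(by_domain)  # {'google.com': 0.5, 'someblog.example': 0.125}
```

The same function works for any other dimension (search phrase, campaign, content group) as long as each visitor record carries that field.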

Some Parting Thoughts about Measuring Visitor Engagement

Some folks have complained that this metric is “not immediately useful”, that nobody will understand it, and that it is impossible to calculate. Perhaps, but I would argue that A) no metric is truly immediately useful and B) most people don’t understand web analytics because web analytics is hard. The assumption that a diverse organization is going to be more successful using “bounce rate” because it can be glibly explained by saying “your content sucks” is just wrong — all of this stuff needs to be explained regardless of the complexity of the metrics involved.

Regarding the metric being impossible to calculate, it fully depends on which application you’re using. If you’re trying to get by using free tools then yes, you’re out of luck. But if you’re using robust tools like the high-end offerings from Unica, IndexTools, Visual Sciences, and WebTrends then you should have little trouble using the metric I describe in this post.

I personally believe that Web Analytics 2.0 both requires and allows us to be more creative and thoughtful in our use of metrics. Why not use a robust indicator if one is warranted? Especially if you’re not selling anything online, or if you’re selling high-consideration items, my visitor engagement metric can be shown to be an extremely powerful measurement.

Given the assertion that some consultants are apparently charging $200,000 USD for complex “engagement index” work, and given that someone working for Google is in the process of trying to patent a much simpler version of this equation, I am happy to give my work away to the entire industry in an effort to promote the use of more meaningful metrics to be brought to bear on increasingly complex measurement problems.

What do you think? Did you see my Emetrics presentation and still have questions? Did you read every word of my series on engagement and still not believe me? Do you need to see engagement in action before you’re willing to say it’s not just an excuse? Or are you chomping at the bit to have a robust measure like this for use on your own site?

Especially on this subject I relish your feedback, either via comments or via email, your choice! I find the subject fascinating and welcome the opportunity to discuss it with you, my (hopefully) engaged readers.

Is engagement an excuse?

Blogger Avinash Kaushik kicked off a little debate in the blogosphere a few weeks ago when he declared:

“Engagement is not a metric that anyone understands and even when used it rarely drives the action / improvement on the website.

Why?

Because it is not really a metric, it is an excuse.”

Suffice it to say, some pretty bright folks disagreed with Avinash, openly and vocally. Anil Jasra has a good summary of a panel from WebTrends Engage where Gary Angel, Andy Beal, Manoj Jasra, Jim Novo and Jim Sterne all apparently voiced their opinion that engagement is a metric, not an excuse.

Perhaps ironically, in an interview with Eric Enge from February of this year, Enge asked Kaushik about my long series of posts on measuring engagement (emphasis mine):

Eric Enge: Another thing I read about recently was Eric Peterson’s notion of an engagement metric. Can you comment on that?

Avinash Kaushik: Sure. You know that Eric is obviously a leader in the industry. We are all following the trail that Eric has blazed. He is just an awesome guy and a really great thinker. And, in terms of the specific post that you are referring for engagement, I think Eric’s initial proposal for the methodology is a very good one, and it does extend the conversation in terms of what it is possible for us to measure, because Eric obviously has access to some pretty good tools that allow for deeper analysis. But my preference is to ask a random sampling of people, or every single person who comes to website, are you engaged, here is my definition of engagement, do you like this site or product, are you going to recommend it, or whatever is the case.

Now, to be fair, I agree with part of Avinash’s argument — qualitative data is a valuable input into measuring visitor engagement — I just don’t think qualitative data is the only input. Nor do I think that it is “nearly impossible to define engagement”. For over a year I have been calculating visitor engagement on my site using the following equation:

Looks complicated, huh? It is. But if you’re running a site like mine where the major outcome you’re trying to create is simply not measurable online, wouldn’t you like to have some reasonable proxy that would help you identify where your best leads are coming from, what those leads are looking at, and who your highest quality leads actually are?!

I know I do.

Obviously the equation above doesn’t tell you very much. If you want to hear the rest of the story, you have two options:

  1. Come to my Web Analytics 2.0 presentation next Wednesday at 1:30 PM in the Blue Ballroom at Emetrics
  2. Wait until next Thursday and download my updated Web Analytics 2.0 presentation from my web site

Ironically this little debate prompted me to stick the long-awaited explanation of how to measure and use visitor engagement into my Web Analytics 2.0 presentation. Thanks to Avinash for kicking off a nice (if a bit lopsided) debate!

See you in Washington!

Stephane Hamel on Web Analytics 2.0 and 3.0

Stephane at immeria has a blurb about Avinash Kaushik’s video on Web Analytics 2.0 and my post this week on Web Analytics 3.0 that I started responding to in a comment. But, as is typical of me, the comment got really long, so I will just publish it here and link it to Hamel’s blog.

Stephane, good point that I didn’t explicitly define Web Analytics 3.0 … something for a follow-up post to be sure.

To your point:

“The Web and Internet ecosystem encompass quantitative and qualitative elements, physical and virtual organisms, online and offline interactions that are functioning together within legal, ethical and technological constraints. From that angle, things like a website, competition or location can’t, by themselves, explain the complexity of what’s going on. They can merely improve the science of analysis that will eventually lead to better insight.”

While it is difficult to disagree with you, I think you’re making the same argument that Charlene Li of Forrester made regarding her definition of engagement — she commented that engagement can be indicated at a minute level, such as when a flashy print ad catches your eye. Sure, but how the hell do you MEASURE someone noticing Charlene’s flashy print ad? And how do you MEASURE your legal, ethical, and technological constraints?

Kaushik and I are in near complete agreement about Web Analytics 2.0, and I thought he did a pretty good job explaining it. A lot of people have been saying the same thing as Avinash and me for over a year (Larry Freed pops to mind). An important distinction is that both the Web Analytics 2.0 and Web Analytics 3.0 paradigms are focused on tangible, measurable aspects of our (online) lives. And, in my humble opinion, the measures we take should be practical to make.

So I agree with you, it’s not about “e” business but rather about simply doing business, you’re spot on there. But here is the problem:

Web Analytics 1.0 was a full-on after-thought … not just for companies like yours but for the entire Internet. First we had web sites then later (more or less in 1995 if you believe most time-lines) we had measurement tools built to hack web server log files (poorly) and to try and cobble together some semblance of visitor behavior. A ton of R&D and money has gone into refining Web Analytics 1.0 and today we have JavaScript page tags and sophisticated applications that are basically still an after-thought for most companies.

Web Analytics 2.0 is also an after-thought, at least for the most part. I mean, we’ve had the qualitative data in systems like ForeSee Results and Tealeaf for years, so why is it only now that we’re actively talking about combining these data into a more holistic view of the visitor? We’ve had multivariate testing systems like Offermatica and SiteSpect for years, so why is it only now that we’re actively talking about using the combination of qualitative and quantitative data to drive action? (FYI, you can download my Web Analytics 2.0 presentation from my web site if you’re interested in more of my views on the subject …)

So I guess what I’m getting at by talking about Web Analytics 3.0 at this early stage is this:

Wouldn’t it be nice if the global solution to measuring the inevitable state of “digital ubiquity” wasn’t another after-thought?

Wouldn’t it be sweet if the platform providers and device manufacturers, the standards bodies and the compliance police, all came together now instead of 10 years from now and asked “How in the world will we measure all of this?” Personally, I think so, that’s why I’m starting the conversation more-or-less five years ahead of time, so that this time we’re not all standing around trying to figure out how to answer good business questions using incomplete and inaccurate data.

Call me crazy …

So yeah, I am probably still right and wrong. And yes, you make a good point — Kaushik and I were both caught navel-gazing (again!) But if in 5 years you and I are banging around in the Yahoo! group asking people whether the “Nokia X5150J Revolution” accepts cookies and JavaScript I am going to be awfully put out, aren’t you?

Thanks very much Stephane for offering up an opinion other than “Eric and Avinash are both brilliant!” The ego stroking is great but this kind of stuff needs to be debated, openly and honestly in my humble opinion. Beers are on me in D.C.

Web Analytics 2.0? I am more worried about Web Analytics 3.0!

If you’re reading the web analytics blogs, you’ve probably already heard about the recent presentations I’ve given on the subject of “Web Analytics 2.0”. The future of web analytics and the relationship between Web 2.0 technology and measurement is something I’ve been talking about for over six months (I actually have a Web Analytics 2.0 workshop that I regularly give that you can read about under Analytics Consulting on my site), but given that it is “conference season” it is no wonder that this subject is getting attention from other folks in the industry. I have given my presentation at Web Analytics Day in Brussels, SEMphonic X Change in Napa, and will be giving a variation on same at Jim Sterne’s Marketing Optimization Summit in October.

Due to demand, you can download a PDF of the presentation from the white papers section of my site. If you’re interested in learning more about Web Analytics 2.0, please give me a call and I’d be happy to discuss it with you.

Strangely enough, the slides that are generating the most interest and commentary are not those about the Web Site Optimization Ecosystem, the integration of quantitative and qualitative data, or the Web Analytics Demystified RAMP, but rather the few slides I included outlining my thoughts about Web 3.0 and what I am calling Web Analytics 3.0.

What the heck is Web Analytics 3.0?!

Before I can tell you what Web Analytics 3.0 is, I need to tell you what I think Web 3.0 is going to be. The good old Wikipedia basically dodges this by saying:

Web 3.0 is a term that has been coined with different meanings to describe the evolution of Web usage and interaction along several separate paths. These include transforming the Web into a database, a move towards making content accessible by multiple non-browser applications, the leveraging of artificial intelligence technologies, the Semantic web, the Geospatial Web, or the 3D web.

While I know that Judah is all hopped up on the notion of the semantic web, after having traveled to Tokyo and Europe in the past month, I find myself absolutely convinced that the next technology era will be characterized by our collective ability to access the Internet anyplace, anytime, using so many devices we begin to look back on computers much the same way young people do television today — as something nice to use when YouTube is unavailable. Rolf Skyberg, a disruptive innovator from eBay who I met in Rotterdam a few weeks back, called it “digital ubiquity” — the point where we forget that the Internet actually exists and take our ability to access information completely for granted.

Given so many sexy alternatives — 3D web, transforming the Internet into a database, artificial intelligence, and the such — why am I so convinced that in the next three years we’ll be talking about Web 3.0 when we talk about mobile phones and non-traditional browsers?

Easy. The financial opportunity available via the mobile Internet makes the billions transacted today look like pocket change.

Think about it:

Just think for a minute about how your browsing experience might change if the web sites you visited remembered you and delivered a tailored experience based on your demographic profile (theoretically available via your phone number), your browsing history (accurate because you’re not deleting your phone number) and your specific geographic location when you make the request?

Now think about how the advertising buying experience would change if the same were true, not to mention behavioral targeting. I mean, given GPS and demographic data, the behavior being tracked could be “works downtown during the day, checks Facebook on his phone often, lives in the suburbs, surfs sports scores from his neighborhood bar.” The Starbucks web site could have a link at the top with a coupon to save $1 on my double-tall non-fat latte in stores 1 block, 2 blocks, and 5 blocks from my current location; the Best Buy web site could have an in-store promotion for the store I am standing in, targeted to my age and gender; and my search engine could disambiguate my searches based on my demographic profile, my geographic location, and my recent search history to serve me paid search ads designed to influence my geo-spatial movement, not just my likelihood to click.

Jeepers, huh?

Sure there are privacy issues, but given the intensely personal relationship most people have with their cell phones, and the fact that far more people in the world have mobile phones than computers (Gartner estimates 271 million units sold to end-users by Q2 2007) it is easy to make a convincing case for mobile computing and digital ubiquity defining the next technology era, much like social networking, AJAX, XML, and mashed-up business models define the current Web 2.0 era we’re living in today.

Okay, mobile is the future. So what the heck is Web Analytics 3.0?

If Web Analytics 1.0 was all about measuring page views to generate reports and define key performance indicators, and if Web Analytics 2.0 is about measuring events and integrating qualitative and quantitative data, then Web Analytics 3.0 is about measuring real people and optimizing the flow of information to individuals as they interact with the world around them.

Your log file analyzer can do that, right?

The current state of mobile measurement isn’t about Omniture and Visual Sciences, it isn’t about JavaScript and cookies, and it isn’t about page views, visits, and visitors. Web Analytics 3.0 is going to be something completely different, and it will depend on completely new technology. Anil Batra and I talked about a project he did a few years back while he was at digiMine: he hacked together WAP gateway logs into a pseudo-log file, using the phone number in place of a cookie. Brilliant, and the fact that Anil has this experience propels him to very near the head of the class for Web Analytics 3.0 analysts.
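The general idea of using a phone number where a cookie would normally go can be sketched as follows. To be clear, this is not a reconstruction of Anil's project: the record layout, the 30-minute session timeout, the function name, and the sample data are all assumptions, and real WAP gateway logs will look different.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cut-off

def sessionize(records):
    """Group gateway hits into sessions, keyed by phone number.

    records: iterable of (phone_number, timestamp, url) tuples, a
    hypothetical layout standing in for parsed gateway log lines.
    Returns {phone_number: [session, ...]}, where each session is a
    list of (timestamp, url) hits in time order.
    """
    by_visitor = defaultdict(list)
    for phone, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        visitor_sessions = by_visitor[phone]
        if visitor_sessions and ts - visitor_sessions[-1][-1][0] <= SESSION_TIMEOUT:
            visitor_sessions[-1].append((ts, url))  # continue the open session
        else:
            visitor_sessions.append([(ts, url)])    # start a new session
    return dict(by_visitor)

# Made-up hits using fictional phone numbers.
records = [
    ("555-0100", datetime(2007, 10, 22, 10, 0), "/home"),
    ("555-0101", datetime(2007, 10, 22, 10, 5), "/home"),
    ("555-0100", datetime(2007, 10, 22, 10, 10), "/scores"),
    ("555-0100", datetime(2007, 10, 22, 11, 30), "/home"),
]
sessions = sessionize(records)
print({phone: len(s) for phone, s in sessions.items()})
# {'555-0100': 2, '555-0101': 1}
```

Once hits are grouped this way, the familiar visit and visitor counts fall out of the data without any cookie or JavaScript support on the handset.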

In theory, the mobile Internet has many of the same measurements as the hard-wired Internet. But as the information the platform and device providers make available changes, something I very much believe will happen, the quality and volume of information at our disposal will increase and improve. The W3C document on “Mobile Best Practices 1.0” already exists but surprisingly enough doesn’t have a section about logging requests or measuring user interaction. M:Metrics is out there providing analyst reports, but the service is more similar to comScore and Nielsen than WebTrends and ClickTracks.

This post is already extremely long but I wanted to start the conversation. In future posts, as time allows, I’ll expand on some of what I believe is possible and how. In the interim, let me know what you think! Am I wrong? Is Web 3.0 bigger than mobile? Or do you already have a handle on measuring your mobile content, even without GPS and phone numbers as unique IDs? Do you personally have experience doing analysis on mobile content? If so, I’d love to hear about your experience.

As usual, I very much welcome your comments but am also happy to receive them directly via email. Also, if you’re a mobile service provider or device manufacturer concerned with how advertisers and marketers will measure their success through your platform, application, or device, I would love to talk to you about the Web Analytics Demystified vision for Web Analytics 3.0.
