Web Analytics Blogs

Eric T. Peterson has been working in web analytics for over ten years and has built up an incredibly rich body of knowledge about the subject, knowledge Mr. Peterson works to share every week here in his Web Analytics Demystified weblog. Whether you're new to the subject or the most experienced practitioner, you should join the thousands of people around the globe already subscribing to Peterson's blog and start reading today.


Archive for 'Engagement'


Measuring Online Engagement: Step One

Following up on my post from Monday of this week announcing that Joseph Carrabis of NextStage Evolution will be joining “The Engagement Project” and bringing his mathematical expertise to the table, Mr. Carrabis has summarized what he’ll initially be doing for the chef in all of us.

According to Mr. Carrabis:

“Eric’s already posted that I’ll be working with him to make the formula more applicable to a wider variety of interfaces with greater general use features. I also know that I can always use help and have repeatedly and publicly stated that I don’t know web analytics.

So, first steps? A semantically exact statement of what we’re hoping to measure. I suggest this step because it’s much easier to know if your variables will result in the desired solution if you are exact in what the solution looks like and what you have to put into that solution.

Think of it this way: you want to make some chicken soup and you use your grandmother’s recipe. I want to make some chicken soup and I use my grandmother’s recipe. But your grandmother is Irish and mine is Italian. I’ll bet we’d use different spices, different vegetables, different noodles (if indeed we both did).

But I’d bet we both use chicken stock as a base. And is your chicken stock from the leftovers of a roast chicken? What spices did you use there? Or is your stock from bouillon?

So the first step is to decide what we all mean by “chicken soup”. One of my mentors was a genius of an author who used to write “speculative fiction”. I would ask, “What is speculative fiction?” and he’d reply “It’s what I’m pointing at when I say it.” This is a great anecdote and an indefensible statement (except in cultural anthropology). If one person “owns” the definition of “speculative fiction”, “chicken soup” or “engagement” then that definition is only valid so long as there exists a market for that definition.

However, a definition that says something like “Basic Chicken Soup”, that is something I can start with to make “Italian Chicken Soup” and allows my Irish friend to extend it to “Irish Chicken Soup”? Now that’s a good definition.

I snuck the concept of “extendable” into the above. “Extendable” means the definition accommodates special cases (Italian, Irish, etc). Think of a recipe for Italian Chicken Soup that begins “Step 1: Make the Basic Chicken Soup. Step 2: Now add garlic, oregano, …” That “Step 2” part means that the original definition isn’t limited, that it can be extended to incorporate specific features to make it unique to a given environment (Italian, Irish, …).

The concept of “extensible” has two parts. First, you can substitute one thing for another if they share some basic properties. For example, you can substitute a glass of wine for a glass of water in the recipe because they’re both liquids. You can’t substitute a lamb chop for a glass of water, though. Mathematically, this means that if we want to include “clickthroughs” we can use whatever product A calls clickthroughs, whatever product B calls clickthroughs, etc., so long as they all meet some definition of “clickthroughs” (I’ll let the WAA worry about things like that).

Second, “extensible” means new spices, new vegetables, new types of noodles, etc., can be used to make the chicken soup better. This means that you can add a new spice to your recipe in addition to the existing spices already in it. Extensible (in this sense) means you’re doing what you already do to make your style of chicken soup and now you’ve discovered something more you can add to it to make it even more “your style”. You’re not watering it down or adding more vegetables to make the soup go further. That’s scalability and the equation should be scalable without needing to define it as such.

The sum of these two concepts of “extensible” translates to “the equation is valid across all interfaces including those we haven’t thought of yet.” Mathematically extendability and extensibility form the axes of a very rich solution space.”

Joseph says “Basic Chicken Soup” and I say “a measure of the depth and degree of visitor engagement online” … clearly he and I both have our work cut out for us. If you’d like to join us in our quest for a better measure of visitor engagement online, please let me know.

Measuring Engagement Online: The Next Stage

In the last few months there has been a tremendous surge in interest in my framework for measuring engagement online. Lately, some of the largest and best-known companies in the world have approached me about working with them to bridge the gap between the metrics they have today and something similar to the composite metric I first described back in December 2006.

While I am tremendously flattered that I have somehow become the focal point for this conversation, I have been thinking lately about how the framework has been developed and how it might end up being used by the measurement industry in general. And while early tests using the framework I’ve described are very encouraging, the calculation in its current state was meant to move the discussion along and get more people to “think different” about how engagement could be calculated online.

Given that interest in the framework has clearly increased, one primary concern comes up again and again: the need to apply mathematical rigor to the framework and calculation so that A) the result is repeatable, reliable, and trustworthy and B) when naysayers inevitably emerge to criticize this small side project of mine, I have a suitable response to their criticism, regardless of where it comes from or why.

I believe that the need for “A” is obvious. The need to address “B” is perhaps less obvious, but I believe that I owe it to those of you who are investing your time, energy, and money into this framework. Especially as the stakes seem to increase exponentially with every presentation, every conversation, and every high-visibility blog post on the subject, I believe now is the time to approach the engagement framework not just as a hobby but as a serious project with committed resources.

To this end, I am extraordinarily happy to say that the single smartest person I know, Joseph Carrabis, the Founder and Chief Research Officer of NextStage Evolution and NextStage Global, has offered to bring mathematical rigor and analytical precision to what I am officially dubbing “The Engagement Project.” Those of you not familiar with Joseph and his work are advised to A) meet him in person at the upcoming Emetrics Summit in San Francisco or B) read some of his recent work at iMedia Connection.

Joseph will be working to make the formula universally applicable and universally defensible. Suffice to say I can think of nobody better to bring mathematical and scientific rigor to the framework I have been evolving over the past year. Watch this blog and Joseph’s blog at BizMediaScience over the next week or so for a more complete analysis of the framework in its current state, something we’ve agreed is the first step towards creating a true function capable of practically describing the degree and depth of engagement a visitor is displaying towards a web site over time.

At the end of the day, without regard to my framework, Joseph’s analysis, or any person or group’s particular position on the use of the word “engagement”, my goal is to solve one problem and one problem only:

If you’re interested in working with Joseph and me on The Engagement Project please feel free to contact me directly.

Example uses of the visitor engagement metric

My post last week on measuring visitor engagement was pretty long by the time I outlined the calculation, so I put off publishing examples of how the metric could be used until now. I’m excited to see that this topic has generated so much interest, both in terms of comments and emails sent to me directly.

My goal for this post is to provide a few examples and explanations to show how the metric can be used to supplement our otherwise already-rich set of web analytics data. Since so many folks have been willing to explore the engagement metric, I have embedded a bunch of questions in this post in italics that I’d love your feedback on.

Distribution of engagement scores and segmentation. Here is the distribution of engagement scores for about six months at Web Analytics Demystified by percent of visitors. As you can see, these scores are concentrated at the low end and tail off as the score increases, showing that nearly half (47.6%) of visitors to my site are “poorly engaged”. When I look at this distribution it makes perfect sense to me — what do you think?

I have created segments to group visitors by their engagement score: “Well engaged” visitors have engagement scores over 30%, “moderately engaged” visitors are those between 10% and 30%, and “poorly engaged” visitors score less than 10%. These segments can then be used to explore how the behavior of visitors in each engagement group differs by looking at my page and referring source dimensions (page, content group, referring domain, campaign, search phrase, etc.).
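As a rough illustration of the segmentation just described, here is a minimal Python sketch. The thresholds (over 30%, 10% to 30%, under 10%) come from the paragraph above; the function names, the handling of scores that land exactly on 10% or 30%, and the sample scores are assumptions made for this example and are not taken from my actual data.

```python
from collections import Counter


def engagement_segment(score):
    """Map an engagement score (a percentage, 0-100) to a segment name.

    Assumption: scores of exactly 10% or 30% count as "moderately engaged".
    """
    if score > 30:
        return "well engaged"
    if score >= 10:
        return "moderately engaged"
    return "poorly engaged"


def segment_shares(scores):
    """Return the percent of visitors that fall into each segment."""
    counts = Counter(engagement_segment(s) for s in scores)
    total = len(scores)
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical scores for eight visitors, for illustration only.
example_scores = [2.0, 4.5, 8.8, 9.9, 10.0, 22.0, 30.0, 45.0]
shares = segment_shares(example_scores)
print(shares)
```

Once every visitor carries a segment label like this, the label can be treated as just another dimension alongside page, referring domain, or search phrase.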

Identify relationships that might otherwise not be found. At the top of this report you can see the pronounced difference in visitor engagement (and traditional metrics) for “branded” and unbranded (“None”) searches bringing visitors to my site. Now, because branded searches are a component of the calculation (Brand Index), you definitely expect to see a difference between the two engagement scores. What is interesting is that while other metrics (duration, sessions per visitor, page views per session) show a slight difference, visitor engagement and conversion are each three times higher for branded searches. I think this difference observed in all the metrics is further evidence that brand-driven searches are bringing more engaged visitors — what do you think?

In the middle table you can see search phrases bringing visitors to my site, showing visitor engagement, page views per session, and sessions per visitor. Here three phrases stand out to me:

  1. “web analytics book” and “web analytics process”, neither of which is particularly distinguished from other search phrases based on page views per session or sessions per visitor but both of which have visitor engagement scores over double my site-wide average of 8.8%. This is important to me because these are un-branded search terms that are critically important to my business.
  2. “vendor discovery tool”, which would appear to be pretty important based on traditional metrics but only stands out slightly using the visitor engagement score (at 13.6%). I spend a lot of time trying to figure out how to drive folks using the vendor discovery tool to take other actions (buy books, inquire about consulting) and this data suggests that there is an unrealized opportunity.
  3. “performance indicators”, which shows that the visitor engagement metric is useful to identify terms that you’d think are important to the site but aren’t attracting the right audience (average engagement score for these visitors is only 5.6%).

I think this level of information is actually pretty helpful for identifying search marketing opportunities — what do you think?
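The comparison behind the three examples above boils down to dividing each phrase's engagement score by the site-wide average. Here is a small Python sketch of that arithmetic. The 8.8%, 13.6% and 5.6% figures are the ones quoted above, and the "over double" cut-off mirrors the wording of the first example; the function names and the category labels are hypothetical and exist only for this illustration.

```python
SITE_AVERAGE = 8.8  # site-wide average engagement score (percent), quoted above

# Scores quoted in the list above; other phrases would be added the same way.
phrase_scores = {
    "vendor discovery tool": 13.6,
    "performance indicators": 5.6,
}


def relative_to_average(score, average=SITE_AVERAGE):
    """Return the phrase's engagement score as a multiple of the average."""
    return score / average


def classify(score, average=SITE_AVERAGE, standout=2.0):
    """Label a score; 'standout' is the 'over double the average' cut-off."""
    ratio = relative_to_average(score, average)
    if ratio > standout:
        return "stands out"
    if ratio >= 1.0:
        return "above average"
    return "below average"


for phrase, score in phrase_scores.items():
    print(phrase, round(relative_to_average(score), 2), classify(score))
```

Sorting phrases by this ratio rather than by page views per session is one simple way to surface the terms that traditional metrics hide.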

Engagement-derivative metrics like “Percent Highly Engaged Visitors” are useful. Here you can see a select group of referring domains showing the percent of highly and percent moderately engaged visitors they’re sending my way (with conversion to show that engagement and conversion are in fact different!) Avinash Kaushik is sending me a few (0.2%) highly engaged visitors (thanks!) but Ian Thomas is sending me a bunch (70.4%) of moderately engaged visitors, many of whom are purchasing books (1.2% conversion rate.)

By looking at traffic from Avinash’s site over time (bar graph) I can see peaks and valleys in overall engagement from folks coming from his site. It would be useful to back into those peaks to try and determine what other bloggers’ readers might be reacting to when they’re exhibiting highly-engaged behavior on my site (see late August and early September). Given that Clint proved that conversion is a poor measure of success when trying to evaluate traffic from other bloggers, I think visitor engagement is useful for examining the non-revenue value of referring sources — what do you think?

Those of you who are looking for correlation between engagement and conversion, have a look at the data for Mr. Jim Sterne’s wonderful site emetrics.org — 5.6% of the folks coming from Jim’s site are highly engaged, 66.2% moderately engaged, and man-oh-man does Jim help sell some copies of Web Analytics Demystified. You’re the man, Jim!
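An engagement-derivative metric such as “Percent Highly Engaged Visitors” by referring domain is just a per-domain share of visitors in each segment. A minimal Python sketch follows. The domain names, segment labels, and visitor records are made up for illustration and do not reproduce the figures in my report, and the function name is an assumption of this example.

```python
from collections import Counter, defaultdict


def percent_by_segment(visitors):
    """Compute, per referring domain, the percent of visitors in each segment.

    visitors: iterable of (referring_domain, segment) pairs, one per visitor.
    """
    counts = defaultdict(Counter)
    for domain, segment in visitors:
        counts[domain][segment] += 1

    report = {}
    for domain, segment_counts in counts.items():
        total = sum(segment_counts.values())
        report[domain] = {
            segment: 100.0 * n / total for segment, n in segment_counts.items()
        }
    return report


# Hypothetical visitor records, for illustration only.
visitors = [
    ("blog-a.example", "highly"),
    ("blog-a.example", "moderately"),
    ("blog-a.example", "moderately"),
    ("blog-a.example", "poorly"),
    ("blog-b.example", "poorly"),
    ("blog-b.example", "moderately"),
]
report = percent_by_segment(visitors)
print(report)
```

Conversion rate can be computed per domain in the same pass, which makes it easy to see that engagement and conversion do not always move together.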

Visitor engagement is globally useful. At least in Visual Sciences Visual Site you can apply engagement metrics and segments to pretty much any dimension tracked. Here I’m looking at the percentage of “highly” engaged visitors (50% or more) in my “well engaged” segment broken down by country. Now, this is certainly more interesting in light of the total volume of traffic coming from each geographic location, and as I think about localizing my books and planning future trips around the world this information becomes very helpful.

There is more, including some of the more granular visitor-level stuff I talked about in the first series of posts on the subject, but I want to be sensitive to protecting the identity of individual users on my site. If you’re interested in helping me collect some “ground truth” regarding the engagement calculation, write me and I’ll explain how you can help.

So what do you think? Do the screen-shots help you understand the calculation better? Or do they still make it look super-complicated and scary? Is there something specific you’d like to see me demonstrate with the calculation? Or do you think you could come up with these same insights using more traditional metrics?

Nick Arnett challenges my visitor engagement calculation

Nick Arnett from MCC Media (and one of the creators of Buzzmetrics) posted a very well thought-out and moderately critical assessment of the visitor engagement calculation I wrote about earlier this week. Nick makes some great points and I thought it was worth addressing them while I prepare the follow-up post that shows off some of what the metric can do. My comments are preceded by ETP and Nick’s statements are in italics.

Definitely thought-provoking, Eric… I’m deep into this issue, although focused specifically on community sites. Overall, your approach doesn’t work for me on two main counts — it is too complicated (and thus unlikely to become any sort of standard) and doesn’t generate a metric that allows different sites to be compared. The latter is arguable, since standardized weightings could yield comparable numbers, but I think that’s excess complication also.

ETP: I’m sorry the calculation doesn’t work for you but I do appreciate your thoughts on the subject. Regarding it being too complicated, compared to what? Compared to “simple” metrics like bounce rate and average page views per session? Or compared to the technology you built to power Buzzmetrics? I guess I separate the complexity of making the calculation from one’s ability to actually explain the calculation.

ETP: Regarding using this metric to compare different sites … as I mentioned in the post, I don’t think there is “one” measure of visitor engagement and thusly trying to compare sites is probably a futile effort at best. I suppose you could remove the Brand, Feedback, Subscription and Interaction indices and come up with a standard set of threshold values for specific vertical markets, but I’m not sure that is really the best use of this calculation.

Is there any ground truth behind this? In case that isn’t clear, do you have any sort of primary market data for engagement that correlates with the output of your engagement metric?

ETP: Hmmm, here I’m not sure what you mean. What kind of primary market data is actually able to identify “engaged” visitors? Because I am able to see individuals interacting with my web site, I did talk to a handful of people based on their engagement scores when I was doing the original work on this metric, and some of their feedback was critical to tweaking the metric and inputs to its current state. But other than that I’d love to see the primary data you’re talking about if you’re able to share it!

As I’ve dug into the issues and our data (about five dozen communities ranging from very large to very small), I keep coming back to two main indicators of engagement — return rates and proactive behavior. If visitors don’t visit regularly and do something other than passive page viewing, I have a tough time including them in any measurement of community engagement.

ETP: Exactly why the Recency Index and Interaction Index are included in the calculation, but I disagree with your assessment that these are the only measures of engagement. I’m not sure exactly how I would determine that someone was only “passively” viewing pages, and again this metric is not designed to be a measure of “community engagement” but rather visitor engagement more broadly considered.

Some point-by-point thoughts…

Click-depth index — this is a place where ground truth really matters, I think. I’m not comfortable with the assumption that more clicks per session means greater engagement. Do we know enough about browsing behavior to know that this is true? And of course there’s the old problem of bad design resulting in more clicks… but when I consider that issue, I tend to think that if people show willingness to click through a bad design, maybe that means they really are engaged! Perhaps we should all include some known bad design… ;-)

ETP: I haven’t seen anything that says that more clicks means less engagement but I agree that confused people might generate more clicks. But I think it’s unlikely that confused and frustrated people would return, complete defined events, subscribe to blogs, etc. so despite your assertion that the metric is complex, multiple inputs are designed to mitigate those that may be confusing.

ETP: You do, however, make an excellent argument for not using something as simple as “click-depth” or “average depth of visit” as your sole measure of engagement.

I have pretty much the same questions about duration. Is there good, objective evidence that session duration correlates to engagement? There are visitors with long-duration visits who don’t visit regularly and don’t do anything proactive… I can’t see including them in any measurement of engagement.

ETP: It sorta depends on your definition of engagement, doesn’t it? But see my comment above about why a single measure like duration (as in Nielsen’s Time Spent ranking system) is perhaps inappropriate on its own to determine engagement.

Recency makes perfect sense to me — the fact that engaged visitors return often is practically a tautology. I would be very skeptical of calling anybody engaged if they aren’t returning regularly.

ETP: What about first-time visitors? Are you saying you can’t be engaged on the first visit to a site? I agree that regular returns are a good indicator of engagement, but in my analysis the metric I’ve defined is able to resolve first-time visitors into several engagement segments which I personally have found quite useful.

Your Brand Index is a great piece of data, but I don’t believe it works in a metric intended to compare sites. Language is too subtle and ambiguous to infer engagement from search terms. I spent years in the search engine and related markets, which gave me a great appreciation for the fact that what sometimes seems obvious about language isn’t. When people search on brand-related terms, it indicates *reach* to me, not engagement. I’m unwilling to assume anything more than brand awareness. People search on things they dislike, but that doesn’t mean they’re engaged with the subject they’re searching. And my data shows that visitors who show many other indications of engagement actually search *less* often.

ETP: Same comment about this metric not being specifically designed for comparing sites. I know that is the uber-goal for lots of folks in the world, it’s just not necessarily my goal or the best use for my engagement calculation.

ETP: Doesn’t “reach” plus “action” equal engagement? I haven’t spent years in search and related markets, but I struggle to believe that people searching on brands they dislike are not somehow engaged. Again, maybe this is a semantic issue arising from conflicting definitions of engagement.

ETP: Because the calculation is designed to be made over the lifetime of visitor sessions, searching less often is not a problem. I guess I more-or-less expect that the “direct” component of the Brand Index will be more important over time with truly engaged visitors (who wouldn’t be as likely to go back to Google and search on a branded term.)

Counting brand-related searches makes sense if we’re measuring *brand* engagement.
Counting direct (non-referred) visits makes sense if we’re measuring *site* engagement.

Counting both in the same metric doesn’t make sense to me. I don’t think we should even be talking here about ways to measure brand engagement… because I believe that’s well beyond the scope of site analytics. It requires massive monitoring systems along the lines of BuzzMetrics. (I’m the primary inventor of one of their systems.)

ETP: I’m not differentiating *brand* and *site* engagement since I’m trying to calculate an operational measure of ongoing *visitor* engagement. Brand is just a component, and the site is the measurement point. I think I understand your desire to differentiate the two given your background with Nielsen but I’m not trying to do the same thing.

One more problem with the Brand Index — people will argue all day long about what terms are appropriate to include… and there’s a strong incentive for site owners to err on the side of too many terms if their success is being measured by this metric. For example, you included “web site measurement hacks” in your list… but that could be a generic term. Is “Web Analytics Wednesday” really your brand? Or is it the WAA’s? I don’t want to argue which it is, just point out the kind of ambiguity that is inevitable.

ETP: Here I agree with you, coming up with a reasonable list is not easy, but web analytics is hard so at some point you have to make some tough decisions. “Web Site Measurement Hacks” is a book title and a branded term but could be a generic phrase. “Web Analytics Wednesday” is a branded Web Analytics Demystified term and has nothing to do with the Web Analytics Association. I don’t think there is that much ambiguity at the site level, at least in my experience.

Your Feedback Index is a specific instance of what I think of as the general principle of tracking proactive behaviors — what you seem to be getting at in your Interaction Index. In communities, visitors have many such opportunities — posting, editing, tagging, voting and so forth. I decided very early in this work to just give people one point for each such proactive action, despite the temptation to weight them (which would violate the need to keep things simple). These are the behaviors that make a community work; sites that aren’t based on user-generated content can exist without them.

ETP: Same comment about this calculation perhaps not being what you’re looking for vis-a-vis communities.

Your session focus really got me thinking. Does it make more sense to count the number of sessions in which visitors signal engagement or the number of actual such signals? I think it’s close to a toss-up, but so far, our ground truth suggests the latter — the number of proactive actions correlates better to our subjective estimates of engagement… but among our future tasks is to establish better ground truth. So far, I’m just using our community managers’ collective subjective scoring… but it correlates quite well to all but our smallest communities.

ETP: I agree, it’s probably a toss-up but if you think about the calculation all it does is count the number of signals. Long sessions are a signal, deep sessions are a signal, frequent sessions are a signal, etc. I know you don’t like anything but recency and interaction but we can agree to disagree on this point. I’d love to hear about your “ground truthing” efforts and I’ll try and keep you apprised of mine.

The subscriber index doesn’t work for me because we want to be able to compare communities regardless of whether or not visitors are able to subscribe, join, become members or what-have-you. Some of our clients — e.g., a large professional sports organization — allow full participation without any need to sign up. Also, as I’ll explain below, I’ve found a strong negative correlation between highly active visitors and RSS subscribers.

ETP: Again, not designed for comparison (and at this point no wonder you don’t like my calculation!) I’d love to see the negative correlation data for RSS and yes, if you don’t have a subscription it doesn’t make sense to assign a negative penalty.

Finally, I guess I’ll toss out one of the ideas that I’m working with — segmenting visitors by proactivity.

In several ways, communities (and most web sites, I suspect) have a bimodal distribution of users. There’s typically a relatively large “Core” group that visits often, looks at lots of pages and does a lot of proactive stuff. There’s a middle ground, which I’m calling “Lingerers,” of people who fall into the 10th to 80th percentiles of such activities. Third and last, there’s a large contingent in the 0th percentile, people who might have one or two activities in a given time period, which I call the “Drive-bys.” In our communities, the Drive-bys are the largest group, but the Core usually is a bigger group than the Lingerers. What this says to me is that people tend to engage a lot or hardly at all — there isn’t much middle ground. I’ve been focusing on the Core’s relationship to the whole community for my engagement measurements. That’s what seems to correlate best to what little ground truth we have.

ETP: I am seeing a more normal distribution, especially as visitors return a third time, but it is definitely skewed towards lower levels of engagement. I’ll try and highlight this when I show some data that highlights the calculation in action. And since I’m not working on a community proper, I’ve found myself focusing on my middle group (“Moderately Engaged”) and trying to determine what I might be able to do to shift them up to “Highly Engaged”.

Overall, I’ve found that the Drive-bys and Lingerers exhibit fairly similar behavior, but the Core is different. The Core visitors post more, search less and use RSS far less (so much for “subscribing” to RSS as a positive indicator of engagement!)

ETP: Your assessment of RSS being a poor indicator of engagement runs contrary to popular opinion (why would you subscribe to an RSS feed or email newsletter if you weren’t engaged??!) Perhaps this result is uncovering a flaw in your engagement calculation?

This post is getting long… so I’ll wrap it up (but ready to discuss further, of course) by repeating myself. I think any sort of engagement metric has to be backed up by demonstrating correlation to some kind of ground truth. Otherwise, it’s a mental exercise that runs the risk of having little relevance to the marketplace.

ETP: You keep coming back to the notion of “ground truth” but surely you recognize that this is A) extraordinarily difficult to come by and B) if we had it easily available we wouldn’t need a measure of engagement. I would love to see your “ground truth” data and talk about how you’re generating that, but unless I’m missing something it sounds a little impractical for widespread use. Still, I appreciate your feedback and very thoughtful comments and will endeavor to demonstrate the correlation between my calculation and “truly engaged” visitors.


Man, talk about a long post! What do you think? Is Nick more right than wrong? Are you focusing on communities and have the same concerns? Do you have similar concerns about your site? The conversation is almost as interesting as the metric and resulting analysis in my opinion so please, comment away!

How to measure visitor engagement, redux

Back in December of last year when I first posted on measuring visitor engagement, I hardly imagined how much interest the topic would generate. Shortly after the first post, I commented that my definition of engagement was as follows:

Engagement is an estimate of the degree and depth of visitor interaction on the site against a clearly defined set of goals.

I then went and wrote over a dozen posts, publishing feedback from some incredibly bright people and demonstrating the utility of a well-defined measure for engagement. Since that time, however, some have questioned the value of such a metric and thusly prompted me to update and publish the following calculation for visitor engagement:

I presented this calculation to a completely full room last week at Emetrics but wanted to provide an update to all my patient readers who were not able to make the event. You can download my entire Emetrics presentation on “Web Analytics 2.0”, which includes the slides on measuring visitor engagement, from the White Papers and Presentations section of my site.

I very much believe that engagement is a metric, not an excuse, and that the metric described in this post provides a powerful measurement framework for sites looking for new ways to examine and evaluate visitor interaction. I know that for my own site, the use of simple measures like “bounce rate”, “conversion rate” and “average time spent” is simply insufficient for selling anything other than my books. But I’m now in the business of selling consulting, a complex and sometimes time-consuming sale, and so I’m always on the hunt for any web analytics measure that will give me an edge and help identify truly qualified opportunities.

I believe this metric is exactly that.

This post is an extension of the work I did in late 2006 and early 2007 and was written to clarify my position, update my thinking in the context of “Web Analytics 2.0”, and reiterate my desire to have an open and honest conversation with my peers and other interested parties regarding the measurement of visitor engagement. Web analytics is hard but not impossible; the same is true regarding the calculation and use of robust measures of visitor behavior.

I believe the visitor engagement measurement to be perhaps the most important of all “Web Analytics 2.0” measurements. Given that this model fully supports both quantitative and qualitative data, and given that the model is built as much around the measurement of “events” as around page views, sessions, and visitors, I (perhaps haughtily) believe this calculation to be prototypical of the types of measurements we will see as we continue to explore the boundaries of “Web Analytics 2.0” (download my presentation from SEMphonic X Change).

The Web Analytics Demystified Visitor Engagement Calculation

The latest version of my visitor engagement metric, with notes about its calculation and use, is as follows. If you’re too busy to read this entire post but would like to learn more about this measure, please write me directly and we can set up a time to discuss it.

This is a model, not an absolute calculation for all sites. I agree with other analysts and bloggers who insightfully say that there is no single calculation of engagement useful for all sites, but I do believe my model is robust and useful with only slight modification across a wide range of sites. The modification comes in the thresholds for individual indices, the qualitative component, and the measured events (see below); otherwise I believe that any site capable of making this calculation can do so without having to rethink the entire model.

The calculation needs to be made over the lifetime of visitor sessions to the site and also accommodate different time spans. This means that to calculate “percent of sessions having more than 5 page views” you need to examine all of the visitor’s sessions during the time-frame under examination and determine which had more than five page views. If the calculation is unbounded by time, you would examine all of the visitor’s sessions in the available dataset; if the calculation is bounded by the last 90 days, you would only examine sessions during the past 90 days.
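To make the time-bounding concrete, here is a minimal Python sketch. The session records, their field names (`start`, `page_views`), the sample dates, and both function names are assumptions made for illustration; they do not come from any particular analytics tool.

```python
from datetime import datetime, timedelta

def sessions_in_window(sessions, now, days=None):
    """Pick the sessions to examine; days=None means unbounded by time."""
    if days is None:
        return list(sessions)
    cutoff = now - timedelta(days=days)
    return [s for s in sessions if s["start"] >= cutoff]

def percent_deep_sessions(sessions, min_page_views=5):
    """Fraction (0.0 to 1.0) of sessions with more than min_page_views page views."""
    if not sessions:
        return 0.0
    deep = sum(1 for s in sessions if s["page_views"] > min_page_views)
    return deep / len(sessions)

# Hypothetical session history for one visitor.
now = datetime(2007, 10, 1)
visitor_sessions = [
    {"start": datetime(2007, 1, 15), "page_views": 8},
    {"start": datetime(2007, 8, 20), "page_views": 3},
    {"start": datetime(2007, 9, 25), "page_views": 7},
]

# Unbounded: all three sessions are examined, two of them are deep.
lifetime = percent_deep_sessions(sessions_in_window(visitor_sessions, now))

# Bounded to 90 days: the January session drops out, one of two is deep.
last_90 = percent_deep_sessions(sessions_in_window(visitor_sessions, now, days=90))
```

The same visitor gets a different value depending on the window, which is why the time-frame has to be stated alongside any reported number.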

The individual session-based indices are defined as follows (and these are slightly updated from past posts on the subject):

  • Click-Depth Index (Ci) is the percent of sessions having more than “n” page views divided by all sessions.
  • Recency Index (Ri) is the percent of sessions having more than “n” page views that occurred in the past “n” weeks divided by all sessions. The Recency Index captures recent sessions that were also deep enough to be measured in the Click-Depth Index.
  • Duration Index (Di) is the percent of sessions longer than “n” minutes divided by all sessions.
  • Brand Index (Bi) is the percent of sessions that either begin directly (i.e., have no referring URL) or are initiated by an external search for a “branded” term divided by all sessions (see additional explanation below)
  • Feedback Index (Fi) is the percent of sessions where the visitor gave direct feedback via a Voice of Customer technology like ForeSee Results or OpinionLab divided by all sessions (see additional explanation below)
  • Interaction Index (Ii) is the percent of sessions where the visitor completed one of any specific, tracked events divided by all sessions (see additional explanation below)
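The six session-based indices above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a reference implementation: the session fields (`page_views`, `minutes`, `weeks_ago`, `branded`, `gave_feedback`, `tracked_event`) are hypothetical, the threshold defaults are placeholders to tune per site, and treating a session exactly "n" weeks old as recent is a choice the definitions leave open.

```python
def session_indices(sessions, n_pages=5, n_minutes=5, n_weeks=3):
    """Return the six session-based indices as fractions between 0 and 1."""
    names = ("Ci", "Ri", "Di", "Bi", "Fi", "Ii")
    total = len(sessions)
    if total == 0:
        return {name: 0.0 for name in names}

    def share(predicate):
        # Fraction of all sessions that satisfy the predicate.
        return sum(1 for s in sessions if predicate(s)) / total

    return {
        # Click-Depth: sessions deeper than n_pages page views.
        "Ci": share(lambda s: s["page_views"] > n_pages),
        # Recency: deep sessions that also fall within the past n_weeks.
        "Ri": share(lambda s: s["page_views"] > n_pages
                    and s["weeks_ago"] <= n_weeks),
        # Duration: sessions longer than n_minutes.
        "Di": share(lambda s: s["minutes"] > n_minutes),
        # Brand: direct sessions or branded-search sessions (pre-classified).
        "Bi": share(lambda s: s["branded"]),
        # Feedback: sessions in which the visitor gave direct feedback.
        "Fi": share(lambda s: s["gave_feedback"]),
        # Interaction: sessions containing at least one tracked event.
        "Ii": share(lambda s: s["tracked_event"]),
    }

# Hypothetical sessions for a single visitor.
example_sessions = [
    {"page_views": 10, "minutes": 7, "weeks_ago": 0,
     "branded": True, "gave_feedback": False, "tracked_event": True},
    {"page_views": 2, "minutes": 2, "weeks_ago": 1,
     "branded": False, "gave_feedback": False, "tracked_event": False},
    {"page_views": 6, "minutes": 3, "weeks_ago": 10,
     "branded": True, "gave_feedback": True, "tracked_event": False},
    {"page_views": 4, "minutes": 9, "weeks_ago": 2,
     "branded": False, "gave_feedback": False, "tracked_event": False},
]
indices = session_indices(example_sessions)
```

Note that the third session counts toward Ci but not Ri: it is deep enough, but too old.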

In addition to the session-based indices, I have added two small, binary weighting factors based on visitor behavior:

  • Loyalty Index (Li) is scored as “1” if the visitor has come to the site more than “n” times during the time-frame under examination (and otherwise scored “0”)
  • Subscription Index (Si) is scored as “1” if the visitor is a known content subscriber (i.e., subscribed to my blog) during the time-frame under examination (and otherwise scored “0”)

You take the value of each of the component indices, sum them, and then divide by “8” (the total number of indices in my model) to get a very clean value between “0” and “1” that is easily converted to a percentage. Given sufficiently robust technology, you can then segment against the calculated value, build super-useful KPIs like “percent highly-engaged visitors” and add the engagement metric to the reports you’re already running.
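As a hedged illustration of that final step, the sketch below averages the eight indices and derives a "percent highly-engaged visitors" figure. The function names and the 0.5 cut-off for "highly engaged" are assumptions for the example; the post does not prescribe a threshold.

```python
INDEX_NAMES = ("Ci", "Ri", "Di", "Bi", "Fi", "Ii", "Li", "Si")

def engagement_score(indices):
    """Sum the eight component indices and divide by 8 (result is 0.0 to 1.0)."""
    missing = [name for name in INDEX_NAMES if name not in indices]
    if missing:
        raise ValueError(f"missing indices: {missing}")
    return sum(indices[name] for name in INDEX_NAMES) / len(INDEX_NAMES)

def percent_highly_engaged(scores, threshold=0.5):
    """Share of visitors whose score meets a (hypothetical) threshold."""
    if not scores:
        return 0.0
    return sum(1 for score in scores if score >= threshold) / len(scores)

# Hypothetical index values for one visitor.
one_visitor = {"Ci": 0.5, "Ri": 0.25, "Di": 0.5, "Bi": 0.5,
               "Fi": 0.25, "Ii": 0.25, "Li": 1, "Si": 0}
score = engagement_score(one_visitor)  # 3.25 / 8

# Hypothetical scores for four visitors; two of them reach the threshold.
kpi = percent_highly_engaged([score, 0.75, 0.1, 0.6])
```

Multiplying the score by 100 gives the percentage form mentioned above.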

The Visitor Engagement Calculation in Detail

The Click-Depth, Recency, and Duration indices are all pretty straightforward and are more-or-less the traditional indicators that most people (incorrectly) call “measures of engagement”. Each of these is very important to the overall calculation, but none of them alone is sufficiently robust to describe “engaged” visitors. I set the “n” values for my site’s calculation based on the average value for each and this seems to work pretty well (meaning my Ci looks for sessions more than “5 page views” in depth, my Ri looks for sessions more than “5 page views” that occurred in the “past three weeks” and my Di is looking for sessions longer than about “5 minutes” in length).

Brand Index is a little more complicated. Here I have made a list of all the terms I believe to be “branded” for my site and business, terms like eric t. peterson, web analytics demystified, web site measurement hacks, web analytics wednesday, and the big book of key performance indicators. Whenever a session begins either with no referring domain or comes from a search engine with one of these terms attached, I count this as a “branded session” and score appropriately. While this index perhaps unfairly weights towards search engines, I firmly believe that if you’re starting your session with either my branded URL, my name, or the name of one of my books that you are already engaged.
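One way to classify a "branded session" from its referring URL is sketched below. The branded terms are the ones listed above; everything else is an assumption for illustration, including the function name, the set of query-string parameters checked (`q`, `p`, `query`), and the simplification that any referrer carrying a branded phrase in one of those parameters is treated as a search engine.

```python
from urllib.parse import urlparse, parse_qs

BRANDED_TERMS = (
    "eric t. peterson",
    "web analytics demystified",
    "web site measurement hacks",
    "web analytics wednesday",
    "the big book of key performance indicators",
)

# Assumed query-string parameters that carry the search phrase.
SEARCH_PARAMS = ("q", "p", "query")

def is_branded_session(referrer):
    """True for direct sessions or searches containing a branded term."""
    if not referrer:
        return True  # no referring URL: the session began directly
    query = parse_qs(urlparse(referrer).query)
    for param in SEARCH_PARAMS:
        for phrase in query.get(param, []):
            if any(term in phrase.lower() for term in BRANDED_TERMS):
                return True
    return False

# Hypothetical referrers.
direct = is_branded_session("")
branded_search = is_branded_session(
    "http://www.google.com/search?q=web+analytics+demystified")
generic_search = is_branded_session("http://www.google.com/search?q=bounce+rate")
blog_link = is_branded_session("http://example.com/blog/some-post")
```

A production version would also want to verify the referring host against a list of known search engines before trusting the query string.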

Feedback Index is the sole qualitative input to this model but it can easily be expanded if necessary. Here I am simply scoring sessions based on whether visitors are providing qualitative feedback via the OpinionLab “O” present throughout my web site or writing me directly by clicking a “mailto:” link. I’m not looking at whether the feedback is positive or negative, only whether feedback was given, operating under the belief that anyone willing to provide direct feedback is engaged.

The Feedback Index could easily be expanded by scoring based on the answer to direct questions posed to the visitor, questions like “do you find the content on this site valuable?”, “do you plan on calling Web Analytics Demystified about consulting?” and “would you describe yourself as engaged with this site?” Given a sufficiently robust mechanism for making the calculation, the Feedback Index can provide a tremendously powerful input to the visitor engagement model.

The Interaction Index captures sessions in which specific “engaged events” occur other than the site’s primary conversion event — events like downloading a white paper, providing an email address, requesting a presentation or PDF, commenting on a blog post, Digging a post, emailing content to a friend, printing a page, etc. The Interaction Index is designed to capture a small weighting from those measurable goals on your site you believe to be indicative of engagement.

The Interaction Index specifically does not examine commerce transactions and other conversion events of fundamental import to the site. While I have debated this in the past, here is the rationale for recommending the exclusion of primary conversion events:

  1. These events already have their own key performance indicator: conversion. Given that conversion is likely already defined for most transactional sites and tracked in great detail, adding conversion to the visitor engagement calculation is superfluous in my opinion.
  2. The visitor engagement metric is designed to provide information about the large number of visitors who do not convert. Given relatively low conversion rates online, having visitor engagement be decoupled from conversion provides a cleaner measure for use in exploring non-purchaser behavior, including looking for independent correlation between the two measures.
  3. By excluding conversion, the two metrics can be used side-by-side to look for visitor behaviors that may not be obvious otherwise. Given the lifetime of possible visitor behaviors, having a way to look for well-engaged visitors who have not completed a transaction online or have completed a transaction outside of the available data set provides a critical view not otherwise readily attained.
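The separation argued for above might look like this in code. The event names, the two event sets, and the sample sessions are hypothetical; the point is only that the Interaction Index and the conversion rate are computed from disjoint event lists so they can be compared side by side.

```python
# Hypothetical "engaged events" that count toward the Interaction Index.
ENGAGED_EVENTS = {"whitepaper_download", "email_provided",
                  "blog_comment", "print_page"}

# Hypothetical primary conversion events, deliberately kept out of the index.
PRIMARY_CONVERSIONS = {"purchase"}

def interaction_index(sessions):
    """Fraction of sessions with at least one engaged (non-conversion) event."""
    if not sessions:
        return 0.0
    hits = sum(1 for events in sessions if ENGAGED_EVENTS & set(events))
    return hits / len(sessions)

def conversion_rate(sessions):
    """Fraction of sessions with at least one primary conversion event."""
    if not sessions:
        return 0.0
    hits = sum(1 for events in sessions if PRIMARY_CONVERSIONS & set(events))
    return hits / len(sessions)

# Each inner list holds the events recorded during one session.
example_sessions = [
    ["purchase"],                         # converts, no engaged event
    ["whitepaper_download", "purchase"],  # both
    [],                                   # neither
    ["blog_comment"],                     # engaged, no conversion
    ["print_page"],                       # engaged, no conversion
]
ii = interaction_index(example_sessions)
cr = conversion_rate(example_sessions)
```

Here the last two sessions are exactly the non-converting but engaged behavior that a conversion-only view would miss.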

The Loyalty Index is a reflection of my belief that repeat visitation behavior is perhaps the best measure of engagement available. Based on the distribution of visitor loyalty data at Web Analytics Demystified, I score “1” when visitors have come to the site more than five times in the past 12 months.

The Subscription Index reflects my belief that truly engaged visitors are able to self-identify by subscribing to our blogs or newsletters; if you have taken the time to subscribe to one of the Web Analytics Demystified blogs I believe you to be engaged. If your site does not have some type of XML-based content subscription you can either drop this index or (perhaps better) look for an opportunity to develop a subscription service, thus giving your visitors another good engagement point.

How Does This All Work in Practice?

Careful readers will likely have already figured out that as visitors come to your site over time, their cumulative “lifetime engagement score” changes as they satisfy the criteria of each individual index. So someone coming from a Google search for “web analytics demystified” who looks at 10 pages over the course of 7 minutes, downloads a white paper and then returns to my site the next day will have a higher visitor engagement value than someone coming from a blog post who looks at 2 pages and leaves 2 minutes later, never to return.
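The comparison above can be worked through numerically. The sketch below uses the thresholds given earlier for this site (more than 5 page views, past three weeks, more than 5 minutes, more than five visits) and fills in details the example leaves open with loudly labeled assumptions: the return visit is assumed to be a direct, one-page, one-minute session with no events, and neither visitor is assumed to be a subscriber or to have left feedback.

```python
def score(ci, ri, di, bi, fi, ii, li, si):
    """Average of the eight indices."""
    return (ci + ri + di + bi + fi + ii + li + si) / 8

# Visitor A: two sessions (the branded search visit plus the assumed return visit).
visitor_a = score(
    ci=0.5,  # 1 of 2 sessions has more than 5 page views
    ri=0.5,  # that deep session is also recent
    di=0.5,  # 1 of 2 sessions is longer than 5 minutes
    bi=1.0,  # branded search, then an (assumed) direct visit
    fi=0.0,  # no feedback given (assumption)
    ii=0.5,  # white paper download in 1 of 2 sessions
    li=0,    # only 2 visits, not more than five
    si=0,    # not a subscriber (assumption)
)

# Visitor B: one session from a blog post, 2 pages, 2 minutes, never returns.
visitor_b = score(ci=0.0, ri=0.0, di=0.0, bi=0.0, fi=0.0, ii=0.0, li=0, si=0)
```

Under these assumptions Visitor A scores 0.375 (37.5%) and Visitor B scores 0, and Visitor A's score would keep moving as further sessions accumulate.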

If you think about it for just a bit, and consider the components in the full calculation, the visitor engagement metric starts to make an awful lot of sense. Consider the following:

  • A visitor can quickly move through a lot of pages, getting exactly what they need, and still be scored usefully through the Click-Depth Index
  • A visitor can slowly and methodically read a few pages and be scored usefully through the Duration Index
  • A visitor can come to the site frequently and do little more than read a single page of content and be usefully scored through the Recency and Loyalty Indices
  • A visitor can come to the site once, subscribe to the blog, return later and download a presentation, and be usefully scored through the Subscription and Interaction Indices
  • A visitor can come to the site, click on dozens of pages but fail to find what they are looking for, then tell me so using my feedback mechanisms and be usefully scored through the Click-Depth and Feedback Indices

The power of the metric is appreciated when you apply it to the commonly measured dimensions found in web analytics: referring domain/URL, search engine/phrase, campaign/placement/creative, content group and page, browser/operating system, etc. Suddenly instead of looking at simple measures, you’re examining the potential of visitors coming from or going to each element in the dimension. To see the metric in action, I encourage you to read my post on the gradual building of context, at least until I’m able to publish new screenshots later this week.
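Applying the metric to a dimension can be as simple as averaging per-visitor scores within each value of that dimension. The following is a minimal sketch; the record layout (`referrer`, `engagement`), the function name, and the sample values are assumptions, and a real analytics tool would do this through its own segmentation features.

```python
from collections import defaultdict

def mean_engagement_by(visitors, dimension):
    """Average engagement score for each value of the given dimension."""
    totals = defaultdict(lambda: [0.0, 0])  # value -> [sum of scores, count]
    for visitor in visitors:
        bucket = totals[visitor[dimension]]
        bucket[0] += visitor["engagement"]
        bucket[1] += 1
    return {value: total / count for value, (total, count) in totals.items()}

# Hypothetical per-visitor records with precomputed engagement scores.
visitors = [
    {"referrer": "google.com", "engagement": 0.5},
    {"referrer": "google.com", "engagement": 0.25},
    {"referrer": "example-blog.com", "engagement": 0.125},
]
by_referrer = mean_engagement_by(visitors, "referrer")
```

The same function works for any other dimension stored on the record, such as campaign or content group.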

Some Parting Thoughts about Measuring Visitor Engagement

Some folks have complained that this metric is “not immediately useful”, that nobody will understand it, and that it is impossible to calculate. Perhaps, but I would argue that A) no metric is truly immediately useful and B) most people don’t understand web analytics because web analytics is hard. The assumption that a diverse organization is going to be more successful using “bounce rate” because it can be glibly explained by saying “your content sucks” is just wrong — all of this stuff needs to be explained regardless of the complexity of the metrics involved.

Regarding the metric being impossible to calculate, it fully depends on which application you’re using. If you’re trying to get by using free tools then yes, you’re out of luck. But if you’re using robust tools like the high-end offerings from Unica, IndexTools, Visual Sciences, and WebTrends then you should have little trouble using the metric I describe in this post.

I personally believe that Web Analytics 2.0 both requires and allows us to be more creative and thoughtful in our use of metrics. Why not use a robust indicator if one is warranted? Especially if you’re not selling anything online, or if you’re selling high-consideration items, my visitor engagement metric can be shown to be an extremely powerful measurement.

Given the assertion that some consultants are apparently charging $200,000 USD for complex “engagement index” work, and given that someone working for Google is in the process of trying to patent a much simpler version of this equation, I am happy to give my work away to the entire industry in an effort to promote more meaningful metrics that can be brought to bear on increasingly complex measurement problems.

What do you think? Did you see my Emetrics presentation and still have questions? Did you read every word of my series on engagement and still not believe me? Do you need to see engagement in action before you’re willing to say it’s not just an excuse? Or are you chomping at the bit to have a robust measure like this for use on your own site?

Especially on this subject I relish your feedback, either via comments or via email, your choice! I find the subject fascinating and welcome the opportunity to discuss it with you, my (hopefully) engaged readers.
