Web Analytics Demystified

Archive for 'White Papers'

Does your data quality still suck?

Years ago Google’s Analytics Evangelist Avinash Kaushik told everyone “data quality sucks, get over it” which at the time was quite the funny and controversial thing to say.  Among other things Mr. Kaushik encouraged his readers to “resist the urge to dig deep” to understand data-related problems, to “assume a level of comfort with the data” and to focus more on trends and less on absolutes.

At the time this advice seemed good. Any number of companies were in the midst of switching vendors back in 2006 (a trend that has noticeably declined) and so guidance to not stress out over the differences observed between old system “A” and new system “B” was good, as was his encouragement to spend more time focusing on data quality in key areas (checkout, carts, etc.).

Unfortunately times have changed.

Since 2006 we have seen a slow but steady increase in the prominence that digitally collected data has within businesses of all sizes. Now in 2010, more senior managers, Vice Presidents, and CEOs than ever before are incorporating both qualitative and quantitative data collected from web, mobile, and social sites. Among our clients we have seen a profound shift from “nice to have” to “critical” when it comes to data flowing through Omniture, Coremetrics, Unica, Google Analytics and other systems, and slowly web analytics is becoming an embedded component of business decision making.

While this shift has far-reaching implications, lately at Web Analytics Demystified we have been looking more closely at how we can help our clients not “get over” the “suckiness” of data quality and actually do something about it. We are doing this for one simple reason: senior leadership doesn’t want a glib response to data quality issues; they want as high a level of accuracy as possible and concrete answers for why that accuracy isn’t forthcoming.

Don’t believe me? The next time your boss asks about the quality of the numbers you produce look them squarely in the eye and repeat Mr. Kaushik’s words, “Well Bob the data quality sucks and so you should just get over it, okay?”

When you’re done you can call my friend Corry Prohens to help you find a new job.

The alternative is, of course, to actually pay attention to your data’s quality and work diligently to incrementally improve data collection processes. Rather than be lazy about the very foundation of all of your valuable work (and the high-quality analysis you’re working to drive into the business) you can do a few simple things designed to make your data “less sucky” and thusly more valuable.

And what are those things? Thanks to our friends at ObservePoint we have authored a short white paper on this exact subject! Titled “When More is Not Better: Page Tags” and subtitled “The Dramatic Proliferation of Script-based Tagging and the Resulting Need for a Chief Data Officer” (okay, not my best title, I admit it) the paper outlines the business processes and technologies required to develop a little more trust and faith in your digitally collected data.

The paper is a free download from ObservePoint but you will need to trade some information with them. I can assure you Rob Seolas and his team are fine folks and given that they have a tendency to send out sweet USB devices to prospects there are worse things than having someone from ObservePoint call.

Get your copy of “When More is Not Better: Page Tags” at the ObservePoint web site today!

If you’ve read the paper I welcome your comments. While we recognize that few companies are going to appoint a Chief Data Officer to manage their digital data quality, we hope that our readers understand that the point is not the job title but rather the associated work. Our thesis is that as the push towards digital continues, those companies who have (and can communicate) a high level of trust in their data will gain a competitive advantage, and in a world where the competition is always only a click away, who doesn’t want every advantage they can create?

My Interview with Adobe Chief Privacy Officer

Those of you paying close attention to issues regarding consumer privacy on the Internet are probably at least a little familiar by now with Flash Local Shared Objects (also called Flash “Cookies” by some). I wrote a white paper on the subject of Flash objects’ use in web analytics on behalf of BPA Worldwide back in February and had to update the blog post I wrote when I noticed that Adobe had wisely written a letter to the Federal Trade Commission regarding the use of Flash to reset browser cookies.

After writing that update I got in contact with Adobe’s Chief Privacy Officer, MeMe Rasmussen, who politely agreed to answer a few questions that I had about their letter and Adobe’s position on the use of Flash as a back-up strategy for cookies.  Given that Scout Analytics is now reporting that Flash “Cookies” are increasingly being deleted by privacy-concerned Internet users I figured it was a good time to publish my questions and MeMe’s responses.

The following are my questions (in bold) and Mrs. Rasmussen’s responses verbatim.

Flash Local Shared Objects (LSOs) have been around for a long time and I have been aware of their use as a “backup” for browser cookies for reset and other calculations for a few years. What made you write your letter to the FTC now? Was there a specific event or occurrence?

The topic of respawning browser cookies using Flash local storage was publicized after research conducted by UC Berkeley on the subject was published in August 2009.  The topic was also raised at the FTC’s First Privacy Roundtable in December, so when the FTC announced that its Second Roundtable would focus on Technology and Privacy, we felt it was the appropriate opportunity for Adobe to describe the problem and state our position on the practice.

While I believe the position you outlined in your letter to the FTC is the correct one, you have put many of your customers in an uncomfortable position by condemning an act that they have been using for quite some time — essentially issuing negative guidance where none had been previously issued (to my knowledge.)  What has the response to this been if I may ask?

We have not received any comments or concerns from customers about our Comment Letter to the FTC.  Adobe’s position specifically condemns the practice of using Flash local storage to back up browser cookies for the purpose of restoring them after they have been deleted by the user without the user’s knowledge and express consent.  We believe companies should follow responsible privacy practices for their products and services, regardless of the technologies they choose to use.

On page 8 of your response to the FTC you discuss Adobe’s commitment to research the extent of this (mis)use of Flash LSOs.  Given the extent to which LSOs are being used perhaps “not as designed” and the sheer popularity of Flash on the web this seems quite a task.  Can you describe how you have started going about this effort?

We are currently in the process of defining the research project and are working with a well-respected consumer advocacy group and university professor.  At this time, the specific details of the project have not yet been finalized.

Within the web analytics community many have commented that your position on Flash LSOs may impact some of what Mr. Narayen and Mr. James have said about the integration of Omniture and Adobe products like Flash. Specifically some of the commentary suggests a tight integration of Omniture’s tracking and Flash. Does your position on LSOs as a tracking device change the guidance the company has issued to common customers?

No, the position we outlined in the FTC Comment on condemning the misuse of local storage, was specific to the practice of restoring browser cookies without user knowledge and express consent.  We believe that there are opportunities to provide value to our customers by combining Omniture solutions with Flash technology while honoring consumers’ privacy expectations.

One of the suggestions I made in the white paper with BPA Worldwide that you cited was to use Flash LSO as a back-up tracking mechanism but NOT to use it to re-spawn cookies.  From a measurement perspective there are a handful of good reasons to do this … does Adobe have a position on that strategy that you can outline?

The point we made in our FTC Comment was that we considered the practice of using Flash local storage to respawn HTML cookies without user consent or knowledge to be an inappropriate privacy practice. In your white paper, you identified some uses of Flash local storage whereby browser cookies are reset but the user is given clear notice and an opportunity to consent. We believe that technology should be used responsibly and in ways that are consistent with user expectations. The example you presented in your white paper was an example of a Web site that, by giving notice and control to the user, implemented our technology in what appeared to be a responsible manner.

(Thanks again to MeMe and the team at Adobe for getting these responses back to me! As always I welcome your comments and questions.)

Flash Cookies and Consumer Privacy

Update: I should apologize to Adobe since I knew they had written to the FTC but didn’t mention it when I originally published this post. If you’re interested in this topic you should definitely download and read Adobe’s letter to the Secretary of the FTC regarding the use of Flash Local Shared Objects to re-spawn cookies. They cite my BPA white paper and do a great job outlining the company’s position on this particular use of their technology. I am writing to Adobe now to see if I can get someone on the phone to discuss in greater depth but if you know anyone there please ask them to email me directly.

A few weeks back we published a white paper with our client BPA Worldwide on the use of Flash Local Shared Objects in web analytics practices. The paper, titled “Flash LSOs: Is Your Privacy at Risk?” is available for download at BPA Worldwide and does require a tiny bit of information (name, company, email.) We wrote the paper with BPA Worldwide because we are seeing a resurgence in the use of Flash LSO as a back-up mechanism for browser cookies and frankly I personally worry about the practice.

Cookie deletion is what it is, and nothing anyone has done in the past five years has seemed to do anything to lessen (or worsen) the rate at which consumers clear cookie and history files. And yes, cookie deletion has a confounding effect on a variety of metrics web analytics professionals consider important. We’ve covered this more or less ad nauseam, although I certainly wonder how comScore’s recent reversal on the value of cookies will play out across combined web analytics + audience measurement efforts.

My concern is that companies are increasingly using Flash LSOs to override consumer preferences regarding cookie deletion. As documented by Soltani, et al. in their paper “Flash Cookies and Privacy”, companies are actively using Flash LSOs, which are much more difficult to block and delete than their browser-based counterparts, to essentially “reset” browser cookie values and thusly “remember” information that consumers are either implicitly or explicitly asking the web browser to forget.

If you’re doing this, or even considering this, I would encourage you to download the white paper as we provide what I believe to be sound guidance regarding the use of Flash LSO in a measurement practice.  You might also want to check out this post over at the Adobe web site which details how Adobe Flash 10.1 will begin to support the “private browsing” feature in most browsers. While I don’t blame Adobe particularly for how companies are using LSO in digital measurement practices, this update is an excellent response from the company and shows their commitment to consumer privacy.

As always your thoughts and feedback are welcome.

Google Analytics Intelligence Feature is Brilliant!

Long-time blog readers are likely aware that I’m not prone to writing about individual technologies or product features unless I have the opportunity to break the news about something new and cool (or not, as the case is from time to time). But once in a while a single feature comes along that in my mind is so compelling and cool I need to bend my own rules; Google Analytics’ new “Intelligence” offering is exactly that feature.

Just in case you’ve been living under a rock for the past month and haven’t already heard about “Intelligence” have a quick watch of the following video pulled from the Google Analytics blog:

Pretty awesome, huh? What’s more, now that I’ve had a few weeks to play with the feature and think about it in the context of my published views on the Coming Revolution in Web Analytics, I think that “Intelligence” is one of the most important advances in web analytics since the JavaScript page tag.

While Google is certainly not the first vendor to apply some level of statistical and mathematical rigor to web analytics data, an honor that would likely go to Technology Leaders for their Dynamic Alert product or Yahoo for their use of confidence intervals when exposing demographic data in Yahoo Web Analytics, in my humble opinion Google has done the best possible job making statistical analysis of web analytics data accessible, useful, and valuable.

Some things I really like:

  • An approachable way to determine confidence intervals via their “Alert Sensitivity” slider. While the implementation doesn’t necessarily impart the level of detail some folks would like, the slider mitigates the prevalent concern that “people won’t understand confidence intervals.”
  • Great visual cues for alerts, especially when statistically relevant changes are not obvious based on traffic patterns. Sometimes traffic patterns just look like hills and valleys, even when something important is happening — for example, the next figure shows two alerts at the lowest threshold setting on September 16th that, upon exploration, turned out to be great news (that I might have missed otherwise.)

  • Good visual cues regarding the statistical relevance of the insight being communicated. This is tough since Google is trying to present moderately complex information regarding the underlying calculations and how much emphasis you should be putting on the insight. By showing a relative scale for “significance” I think Google has more or less nailed it.

  • Google Analytics finally starts communicating about web analytics data in terms of “expectations” instead of absolutes. All of us (present company included) have a tendency to get wrapped up in whole numbers, hard counts, and complete data sets. But we also know that Internet-based data collection just isn’t that accurate, and so any push to get us to start thinking in terms of predicted ranges and estimates is a step in the right direction. For example, I love knowing that on a given day Google Analytics “expects” between 311 and 388 people to come to my site from the UK!

  • Lots more, including the ability to pivot the views and look from a “metric-centric” and “dimension-centric” perspective, the ability to aggregate on day, week, and month, and the ability to add your own custom alerts based on changes in traffic patterns. Perhaps ironically this last functionality (“Custom Alerts”) is how we’ve all historically thought about “Intelligence” in reporting, and while useful seems somewhat weak compared to Google’s stats-based implementation.
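To make the idea of “expectations” instead of absolutes a little more concrete, here is a minimal sketch of how an expected range and an alert might be computed. To be clear, this is not how Google calculates anything in “Intelligence” (those details are not published in this post). It simply assumes daily counts are roughly normally distributed around their recent average, and every name and number in it (`expected_range`, `is_alert`, `uk_visits`, the `z` multiplier standing in for the sensitivity slider) is hypothetical.

```python
from statistics import mean, stdev


def expected_range(history, z=1.96):
    """Return (low, high) bounds for the next daily count.

    Assumes the counts in `history` are roughly normal; `z` is how many
    standard deviations wide the band is on each side of the average.
    A smaller `z` gives a narrower band and therefore more alerts.
    """
    average = mean(history)
    spread = stdev(history)  # sample standard deviation
    return average - z * spread, average + z * spread


def is_alert(observed, history, z=1.96):
    """True when the observed count falls outside the expected range."""
    low, high = expected_range(history, z)
    return observed < low or observed > high


# Hypothetical daily visits from one country over the past two weeks.
uk_visits = [352, 340, 361, 347, 355, 338, 349, 358, 344, 351, 346, 357, 342, 353]

low, high = expected_range(uk_visits)
print(f"Expecting between {low:.0f} and {high:.0f} visits tomorrow")
print("Alert on 420 visits?", is_alert(420, uk_visits))
print("Alert on 350 visits?", is_alert(350, uk_visits))
```

Sliding `z` down is a rough stand-in for turning the sensitivity up: the band tightens and more days get flagged. A real implementation would also need to account for things like day-of-week seasonality, which this sketch ignores.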

While awesome in its first instantiation there are some obvious things that the Great GOOG could improve in the feature. Some ideas include:

  • More dimensions and metrics, although I believe both Nick and Avinash have commented that they are already working on adding intelligence to other data collected.
  • Some way to expose confidence intervals and p-values would be useful (perhaps as a mouse-over) so that the increasing number of analysts with experience in statistics could have that data in their back pocket when they went to present results.
  • Email alerts for the automatically generated insights, for example when “Intelligence” determines that five or more alerts have been generated it would be cool to get an email/SMS/Tweet/Wave notification.
  • The ability to generate alerts against defined segments, so that I could see the same analysis for different audiences that I’m tracking.

Mostly ticky-tack stuff, but again I’m pretty damn impressed with their freshman effort. I suppose I shouldn’t be surprised since evangelist Avinash has been talking about the need for statistics in web analytics for an awfully long time, but given that so many in our industry have balked at bringing more mathematical rigor to our work (including said evangelist, oh well) it’s encouraging to see Google move in this direction.

What do you think? Are you using “Intelligence”? Is it helping you make better decisions? Do you like the implementation as much as I do? I’d love to hear your thoughts and comments.

Are You Ready for the Coming Revolution?

Few would argue that the past few years in web analytics have been, well, intense. We have seen the emergence of Yahoo Web Analytics, multiple management shake-ups at WebTrends, Adobe’s acquisition of Omniture following Omniture’s acquisition of Visual Sciences, WebSideStory, Offermatica, Instadia, and TouchClarity, and the continued push into the Enterprise from Google Analytics. From where I sit we have seen more changes in the last 24 months than we had in the entire 12 years previous (my tenure in the sector) combined.

When I think about these changes, I find myself coming to the undeniable conclusion that our industry is undergoing a radical transformation. More companies than ever are paying attention to digital measurement, and despite my disbelief in Forrester’s numbers, an increasing number of these companies are forging a smart, focused digital measurement strategy. At the X Change, at Emetrics, and at Web Analytics Wednesday events around the world there is more and more evidence that this wonderful sector I call “home” is really starting to grow up.

And we’re just getting started.

If you pay close attention to the marketing you see from Omniture, WebTrends, Unica, Coremetrics, and the other “for fee” vendors you’ve surely noticed a dramatic change recently. Nobody is talking about web analytics anymore; the entire focus has become one of systems integration, multichannel data analysis, and cross-channel analytics.

All of a sudden web analytics is starting to sound like, gasp, business and customer intelligence.

Eek.

Since it’s late and since this post will be over-shadowed by the hype around Google Analytics releasing more “stuff” on Tuesday I’ll cut right to the chase: I believe that we are (finally) on the cusp of a profound revolution in web analytics and that the availability of third-generation web analytics technologies will finally get digital measurement the seat at the table we’ve been fighting to get for years.

Statistics, people … statistics and modeling, predictive analytics based on web data, true forecasting, and true analytical competition for the online channel. Yahoo’s use of confidence intervals when presenting demographic data and the application of statistical models in Google’s new “Analytics Intelligence” feature are just the beginning. As an industry it’s time to stop fearing math and embrace analytical sciences that have been around for longer than many of us have been alive. It’s time to stop grousing about how bad the data is and actually do something about it.

Do I have your attention? Good.

Thanks to the generosity of the kind folks at SAS I have a nicely formatted white paper that is now available for download titled “The Coming Revolution in Web Analytics.” Just so you can see if you might be interested here is the Executive Summary from the document:

“Forrester Research estimates the market for web analytics will be roughly US $431 million in the U.S. in 2009, growing at a rate of 17% between now and 2014. Gartner reports that the global market for analytics applications, performance management, and business intelligence solutions was US $8.7 billion in 2008, roughly 20 times the global investment in web analytics. Among their three top corporate initiatives, most companies are focusing their efforts online, expanding their digital efforts to increase the organization’s presence in the least expensive, fastest growing channel.

Today, a majority of companies are dramatically under-invested in analyzing data flowing from digital channels.  Even when business managers have committed money to measurement technology, they usually fail to apply commensurate resources and effort to make the technology work for their business.  Instead, most organizations focus too much on generating reports and too little on producing true insights and recommendations, opting for what is easy, not for what is valuable to the business.

Web Analytics Demystified believes this situation is exacerbated by the inherent limitations found in first- and second-generation digital measurement and optimization solutions, which are provided by a host of companies primarily focused on short-term gains in the digital realm, not long-term opportunities for the whole business and their customers. Historically these companies worked to differentiate themselves from traditional business and customer intelligence, focusing on the needs of digital marketers. Unfortunately, as the need for whole business analysis increases, many of these vendors are playing catch-up and are forced to bolt on data collection and processing technology as an afterthought.

The current state of digital analytics is untenable over time, and Web Analytics Demystified believes that companies that persist in treating online and offline as “separate and different” will begin to cede ground to competitors who are willing to invest in the creation and use of a strategic, whole-business data asset.  These organizations are using third-generation digital analytics tools to effectively blur the lines between online and offline data—tools that bridge the gap between historical direct marketing and market research techniques and Internet generated data, affording their users unprecedented visibility into insights and opportunities.

This white paper describes the impending revolution in digital analytics, one that has the potential to change both the web analytics and business intelligence fields forever.  We make the case for a new approach towards customer intelligence that leverages all available data, not just that data which is most convenient given the available tools.  We make this case not because we believe there is anything wrong with today’s tools when used appropriately, but because we believe digital analytics should take a greater role in business decision making in the future.”

Since I pride myself on the quality of my readership I sincerely hope that each of you will download this document and take the time to read it. More importantly I’d love you to share it with your co-workers, friends, and followers on Twitter. I believe we are at a critical juncture in our practice’s history where the skills that have served us all along are not going to serve us for much longer, but I am always willing to admit that I’m wrong and more than anything I love a spirited debate.

Are you ready for the revolution?

 
COPYRIGHT © 2011 WEB ANALYTICS DEMYSTIFIED, INC. ALL RIGHTS RESERVED.