The 2015 International CES is just around the corner. If you walk the show floors (there will be several this year), one of the first things you will notice about the devices on display is how much data they will generate. What will we do with the data? How will we mine it? What will we change with the knowledge? Has anyone ever had a big data problem that was this… “big”?
As it turns out, full-time haberdasher and part-time proto-demographer John Graunt described London circa 1662 by meticulously analyzing every single death record… by hand! In his study, Natural and Political Observations Made upon the Bills of Mortality, Graunt concludes: death by plague was de minimis; however, if you were under the age of 16, you had about a 1 in 3 chance of surviving to adulthood. (London in the 1640s was a tough town.)
What would motivate such an endeavor? Fear of the plague. (But I digress.) Graunt’s work was academically rigorous, scientific for its day and an impressive demonstration of data munging and data mining. From it, we learned we could combine demography and mathematics to yield actionable results. Importantly, Graunt was not using the Bills of Mortality as a proxy for his audience… he was collecting and mining data that “was” his audience.
The similarities between Graunt’s work and modern demographics are striking. We still deduce the composition of audience cohorts and we still make natural and political observations. Thankfully, we don’t need to rely on Bills of Mortality to make them. What we do rely on is a seemingly overwhelming amount of data, gathered from a consortium of the devices we use every day.
What can we learn from the data we already have? Can traditional demography be adapted to describe the world as we observe it? Are there tools that could help us better understand consumer behavior? What does it mean to sell an actual audience as opposed to selling a proxy for it?
My friends at Time Warner Cable (TWC) took me through an in-depth analysis of set-top box data that allowed them to create a new way to segment customers. They call the segments “TV Tribes.” At first glance, I thought this was simply marketing jargon. But the combination of data, demographics and mathematics got my undivided attention.
According to TWC, “TV Tribes are informed by both content and context: where they live, what they watch and how they consume media. This combination of factors improves upon analysis based solely on demographics, and the subsequent data transcends Nielsen ratings by finding out if particular channels are watched in specific areas of the country at specific time periods.”
The data has helped TWC identify dozens of distinct TV Tribes, such as:
- News: Whether it’s local or national, news channels are the main story within this Tribe. These consumers’ favorite channels are MSNBC, CNN and Fox News.
- Kids & Family: Families with young children tend to have a strong bias toward kids’ programming, and channels like Cartoon Network and Disney XD.
- Tailgaters: Sports fans who live and breathe all season long with their favorite teams. Channels like ESPN and the NBC Sports Network are “king” to this Tribe.
- True Stories: This Tribe prefers channels like A&E and WE TV, which air documentaries and real history shows about the past, present and future.
- Fast & Furious: This Tribe is obsessed with action programming, movies and fast-paced comedies on networks like Syfy, AMC and SPIKE TV.
Within each of these tribes, though, the data can be broken down even further. Marketers can see, for example, that only 3% of the market in Cleveland is made up of the Fast & Furious Tribe, while that number grows to 5.4% in Los Angeles or 7.1% in NYC. The data can also be broken down by income level, giving marketers an even better idea of exactly whom they’re selling to.
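For the curious, here is roughly what that kind of breakdown looks like in code. This is a minimal sketch, not TWC's method: the household records, the income bands and the `tribe_share` helper are all invented for illustration, and the real analysis runs on set-top box data I have not seen.

```python
from collections import Counter, defaultdict

# Hypothetical household records: (market, tribe, income_band).
# Invented for illustration only; not TWC data or TWC's schema.
HOUSEHOLDS = [
    ("Cleveland", "News", "mid"),
    ("Cleveland", "Tailgaters", "mid"),
    ("Cleveland", "Tailgaters", "high"),
    ("Cleveland", "Fast & Furious", "low"),
    ("Los Angeles", "Fast & Furious", "high"),
    ("Los Angeles", "Fast & Furious", "mid"),
    ("Los Angeles", "News", "high"),
    ("Los Angeles", "Kids & Family", "mid"),
]


def tribe_share(records, key_fields=(0,)):
    """Return {group_key: {tribe: share}}; shares within a group sum to 1.

    key_fields picks which tuple positions define a group,
    e.g. (0,) for market or (0, 2) for market plus income band.
    """
    counts = defaultdict(Counter)
    for rec in records:
        key = tuple(rec[i] for i in key_fields)
        counts[key][rec[1]] += 1  # rec[1] is the tribe label

    shares = {}
    for key, tribe_counts in counts.items():
        total = sum(tribe_counts.values())
        shares[key] = {tribe: n / total for tribe, n in tribe_counts.items()}
    return shares


# Share of each Tribe per market, then per market and income band.
by_market = tribe_share(HOUSEHOLDS)
by_market_income = tribe_share(HOUSEHOLDS, key_fields=(0, 2))

print(by_market[("Cleveland",)]["Fast & Furious"])
print(by_market_income[("Los Angeles", "high")])
```

The same grouping works for any extra dimension you can attach to a household, which is the point: once the Tribe label exists, slicing by market or income is just counting.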
I’m fascinated by the granularity and targeting capability that “TV Tribes” accomplishes using the “blunt force instrument” we call television. It makes me wonder about the quest to conquer big data (defined here as every bit of information generated by everything). Probabilistic data vs. deterministic data… is there a middle ground? Someplace between, to paraphrase Einstein, “…counting absolutely everything that can be counted and counting only what counts?”
Whether it’s putting ad dollars to work or striving to deliver the most relevant content, we are likely to need a human-understandable narrative and context to augment our nascent automated pattern matching and machine learning tools. TV Tribes seem like a positive evolutionary step. Graunt would have liked them. Seriously, check out some of his writing; he literally describes his version of “Tribes” in the book.
If you’ve read this far, and you know me, you are undoubtedly wondering when I’m going to start talking up the value of online and on-demand video – how well it can be measured and how easily targetable it is. All true. But. Just this week Nielsen announced that the average American still watched more than 141 hours of live television a month—more than four hours a day. How much online video do they watch? 11 hours a month—around 22 minutes a day. While it is undeniable that live television viewing is trending down and streaming video viewing is trending up, the best ROI will be a result of increasing the efficacy of the bigger number first.
By the way, if you needed to do all five years of Graunt’s data wrangling, munging and analysis today, it would take you less time than you spent reading this article. Importantly, this would not make you a better demographer than John Graunt, only a faster one. See you at CES.