On Analytics

Christopher Reid Ad Ops

Before we get going here, I'm going to provide some context around how I view digital publishing. I'm an engineer, and Sortable got into the publishing industry by launching a number of product comparison websites, powered by technology we created. Digital publishing is full of inefficiencies, and we were fortunate to have the engineering resources internally to develop solutions through technology and, as we pivoted, start offering these solutions to others. Publishing is both an art and a science, and publishers need to give as much attention to the science (engineering) as they do to the art (content and marketing). I believe that solving engineering problems, or focusing on the science, of this industry improves the ecosystem for everyone and that building technical solutions to common challenges enables the artists to do their best work.

So, that being said, let's start with the engineering powerhouses that are Facebook and Google, currently the second and eighth largest companies in the world. They are both online publishers. Facebook, which publishes photos of your friends and family, and Google, which publishes links to your websites and hosts your videos, dominate the web. There are others for sure: Washington Post, Reddit, BuzzFeed, Zillow, WebMD and many more are all examples of incredible online businesses providing real value for their readers. Google and Facebook, however, are instructive: they dominate the digital landscape, and while they have much larger ambitions (cloud, consumer, internet, VR, transportation, etc.), the bulk of their profit stems from advertisements surrounding the content they publish. It's interesting to note that neither of them creates original content. They are, instead, engineering powerhouses that use data and computer science to give users and advertisers what they want by leveraging the content of others.

More and more advertising spend accrues to Google and Facebook for very good reason. Their advertising ecosystems perform better for advertisers, they're accepted by users, they have massive scale, and their audience data is second to none. Modern marketers and brands can go to them, sans agencies, and reach the right people in the right context at scale. The open web ecosystem is quite poor in comparison: too many participants, too little performance. Brands are taking note. Marc Pritchard, Chief Brand Officer at Procter & Gamble, has been vocal about the challenges of the open web ecosystem. "We realize there is no sustainable advantage in a complicated, non-transparent, inefficient, and fraudulent media supply chain," Pritchard told AdExchanger recently. Talk to a publisher doing audience extension and user acquisition and they'll choose Facebook over their fellow publishers' sites because it's what works. If publishers ever hope to shift the tide, it's instructive to ask why Google and Facebook perform so much better. Why are CTRs on Facebook an order of magnitude better? Why is search such a compelling product for advertisers?

As I mentioned above, both Facebook and Google enable, rather than stifle, content creation and marketing efforts. These two companies perform better because they invest heavily in a controlled ecosystem that is optimized in every manner to satisfy users, advertisers, and their own bottom lines. They build everything themselves; there is no need for 101 vendors that can't get along.

In contrast, the open web is currently completely and totally unprepared to match the engineering prowess of the world's prominent technology companies. What publisher can possibly afford to employ 500 top-rate engineers at $500k per year, let alone thousands of engineers? Contextual advertising, user engagement, optimizing click-through rates, full purchase path attribution, and content recommendation are all problems that the big guys have spent hundreds of millions of dollars optimizing and are now refining with machine learning and other AI techniques.

Building a better publishing business is not solely a technology problem - remember, publishing is just as much an art. Editors still need to produce incredible journalism, but content quality is not all that matters. Artists still need good data and insights to continue to create better experiences for their audience. Quality and performance must extend much more broadly to every aspect of the experience and the only way to do that is to dig into data, analyze it, and apply the insights. Publishers are thirsty for data that can help them make better decisions. They're not only looking for data to help them monetize better, but data that can inform the overall strategy and help optimize the non-yield yield - the refinements and improvements to content and marketing that build loyal audiences.

Publishers understand that data and analytics are critical to helping them measure ROI on their content and marketing efforts, build better experiences for users, provide better ad buying and audience targeting for advertisers, understand their user base to produce more engaging and personalized content, build repeat traffic, and do more with less. "Our goal at Cracked is to use data to create content that our audience loves while creating value for the business. And in this rapidly changing advertising landscape, it is becoming increasingly important to understand the true ROI of content," says Mandy Rusin, GM & VP at Cracked-E.W. Scripps Media. Sounds great, but building out a data pipeline is a significant, resource-intensive endeavour, so how will publishers ever catch up? For every Cracked (or BuzzFeed, which has been making inroads with a content and engineering team that uses data and science to drive content strategy, helping them build huge brands like Tasty), there are hundreds of publishers who still don't even know how much a given piece of content earns them. "One of my biggest challenges in my previous roles, including at VICE Canada, was demonstrating tangible, meaningful benefit to our content and marketing efforts," says Ryan Fuss, who joined Sortable from VICE Canada. "Sortable is building something amazing. 'Analytics platform' doesn't sound sexy, but don't underestimate how elusive actionable data has been for digital publishing teams."

This is where publishers find themselves these days: short on resources, no insight into their users, short on profit, and surrounded by an ad tech and pub tech ecosystem that's confusing, expensive, and much worse for advertisers. This plight is the reason I started Sortable - my last venture was a small publishing startup, and it opened my eyes to how incredibly difficult a position publishers are in.

Getting Unstuck
Getting unstuck starts with trying new things and iterating quickly. I hear so many good questions and ideas coming from content, marketing, and monetization teams at progressive publishers. These are the teams that are going to set the model for others because they're hungry for change. They have different challenges than traditional ad ops teams: understanding the lifetime value of a customer so that you can work harder to acquire them; figuring out how long it takes to develop a dedicated reader, app install base, or email subscriber, and then building look-alike audiences to acquire them; constantly testing which pagination models are preferred by users, generate more session revenue, and provide better KPIs for advertisers; developing interesting PMP-based page executions for programmatic buyers; tailoring your content strategy around the content your users are already showing interest in; understanding which headlines will generate more shares or click-throughs on social media; and building unique data sets and offering them to advertisers outside of your site.

The best basis for getting unstuck is for publishers to focus on four immediate goals:

  1. Collect all the data about their site, users, content, and revenue
  2. Organize and normalize it together in one spot
  3. Understand their data across marketing, revenue, and content teams
  4. Act on their data rapidly

Collecting Data
Collecting data means logging a lot of information about each user: what they are doing and how they behave, whether they are high-risk or NHT (non-human traffic) users, whether they are a repeat user, whether they came from a specific marketing campaign, how they browsed the page, whether they have an ad blocker installed, which advertisers bid on them, whether they are worth a lot, whether you set pricing floors accurately for them, and so on. You need to normalize browser-side data with DFP, SSP log-level data, finalized revenue data, fraud data, viewability data, and many other data sets. Ideally, you're doing this at the impression, page, and session level so you can query against it in the most granular way. This is a lot of disparate data to gather. Just doing the metrics logging well is not a trivial undertaking, but it is worth it if you want extreme insight into everything from how your content is loading to which ads are serving.
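To make the normalization step a little more concrete, here is a minimal Python sketch of joining browser-side impression logs with finalized revenue data on a shared impression ID, then rolling the result up to the session level. The field names (`impression_id`, `session_id`, `page_url`, `bidder`, `viewable`) and both helper functions are hypothetical, chosen for illustration; they are not a Sortable, DFP, or SSP schema.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Impression:
    """One browser-side ad impression (hypothetical schema)."""
    impression_id: str
    session_id: str
    page_url: str
    bidder: str
    viewable: bool
    revenue_usd: float = 0.0  # filled in later from finalized revenue data


def join_revenue(impressions: List[Impression],
                 revenue_by_id: Dict[str, float]) -> List[Impression]:
    """Attach finalized revenue to each logged impression by its ID.

    Impressions with no matching revenue row keep revenue_usd == 0.0.
    """
    return [
        replace(imp, revenue_usd=revenue_by_id.get(imp.impression_id, 0.0))
        for imp in impressions
    ]


def revenue_by_session(impressions: List[Impression]) -> Dict[str, float]:
    """Roll impression-level revenue up to the session level."""
    totals: Dict[str, float] = {}
    for imp in impressions:
        totals[imp.session_id] = totals.get(imp.session_id, 0.0) + imp.revenue_usd
    return totals
```

Keeping the impression as the base record means page-level and session-level views can both be derived from the same rows, rather than being logged separately and reconciled later.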

Organizing Data
Having terabytes of data is not something to be proud of unless you can derive powerful insights from it. You need to take all that data and be able to use it in a productive way. That often means putting it in a normalized form by doing a lot of post-processing on it. Then you need to build out multiple roll-ups on top of it in order to make querying it for insights fast (we've all gone through DFP hell when trying to get big queries back). Plus, you probably want the ability to process it with Spark or another big data cluster so you can do interesting analysis, like using machine learning to discover flooring models. Organizing the data is an extremely important effort: the right organization depends on your industry and use cases, and if you want it to be fast and to work well, you need to get this part right.
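As a rough illustration of what a roll-up is, the Python sketch below pre-aggregates raw rows into one total per (date, author) pair, so a question such as "how much did this author earn yesterday" becomes a single lookup instead of a scan over every impression. The row layout and the `build_rollup` name are assumptions made for this example; a production system would typically do the same thing inside a data warehouse or a cluster job.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each raw row: (date, author, impressions, revenue_usd). Hypothetical layout.
Row = Tuple[str, str, int, float]


def build_rollup(rows: Iterable[Row]) -> Dict[Tuple[str, str], Tuple[int, float]]:
    """Pre-aggregate raw rows into (impressions, revenue) totals per (date, author)."""
    totals = defaultdict(lambda: [0, 0.0])
    for date, author, impressions, revenue in rows:
        bucket = totals[(date, author)]
        bucket[0] += impressions
        bucket[1] += revenue
    return {key: (imps, rev) for key, (imps, rev) in totals.items()}
```

The choice of roll-up keys is the design decision that matters: each combination of dimensions you expect to query often (date, author, bidder, device, and so on) usually gets its own pre-computed table.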

Understanding Your Data
If you have your data well-organized (and very few do), then understanding it actually becomes fairly easy. Most of the heavy lifting has been done and you can now start looking at really interesting insights and asking tough questions. When do auto manufacturers start ramping spend? Which of my authors makes me the most money? How do RPM values vary by session depth? How do bidder participation rates change over a session? How does page load affect viewability? Do any of my responsive breakpoints have drastically different viewability or session RPMs? What is driving true-ups?
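One of these questions, how RPM varies by session depth, can be sketched in a few lines of Python once the data is organized. The example assumes each pageview has already been reduced to a (depth, revenue) pair, where depth is the position of the pageview within its session; that input shape and the `rpm_by_depth` name are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def rpm_by_depth(pageviews: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Return page RPM (revenue per 1,000 pageviews) keyed by session depth.

    Each input item is (depth, revenue_usd), where depth is the position of
    the pageview within its session (1 = landing page).
    """
    counts = defaultdict(int)
    revenue = defaultdict(float)
    for depth, rev in pageviews:
        counts[depth] += 1
        revenue[depth] += rev
    return {d: 1000.0 * revenue[d] / counts[d] for d in sorted(counts)}
```

For example, two landing-page views earning $0.010 and $0.006 give a depth-1 RPM of about $8, because $0.016 over 2 pageviews scales to $8 per 1,000.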

Understanding your data is the fun part. You can build default dashboards, automated email alerts, and notifications into Slack or another collaboration platform. You can cluster your data, run automated anomaly detection, and raise flags when bidders go down. Almost anything is possible if you have the right framework built and your data is well-organized.
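As one simple way to raise a flag when a bidder goes down, the Python sketch below applies a z-score check to a bidder's recent bid-response counts. This is only one basic anomaly-detection technique, picked for illustration; the function name, the threshold of 3 standard deviations, and the idea of feeding it hourly counts are all assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(history: Sequence[float], latest: float,
                 threshold: float = 3.0) -> bool:
    """Flag `latest` if it is more than `threshold` standard deviations
    away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return latest != mu  # flat history: any change is unusual
    return abs(latest - mu) / sigma > threshold
```

With a history of roughly 1,000 responses per hour, a sudden reading of 0 is flagged, while a reading of 1,005 is not. A flag like this could then feed an email or chat alert.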

Acting on Your Data
This can mean many things:

  1. Running and changing experiments. For example, it should be really easy to test things like refresh strategies, lazy-load dynamics, ad placement, timeouts, bidder participation, etc.
  2. Monitoring automatically, so that your staff are alerted to critical issues in real time, such as huge partner discrepancies, ads not serving, libraries not loading, or critical infrastructure elements going down.
  3. Updating how your site runs. There are many things that can be modified, like updating discrepancy factors for partners as revenue data gets loaded in, setting flooring levels based on historic moving averages and real-time data, and moving traffic towards winning strategies while prospecting for new winners.
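The last idea in the list, moving traffic towards winning strategies while still prospecting for new winners, resembles what is commonly called an epsilon-greedy approach. The Python sketch below shows that technique under stated assumptions: strategies are scored by a mean RPM you have already measured, and `pick_strategy`, `epsilon`, and the strategy names are hypothetical. It is not a description of how any particular platform allocates traffic.

```python
import random
from typing import Dict, Optional


def pick_strategy(mean_rpm: Dict[str, float], epsilon: float = 0.1,
                  rng: Optional[random.Random] = None) -> str:
    """Choose a strategy for the next session.

    With probability `epsilon`, pick a strategy uniformly at random
    (prospecting); otherwise pick the one with the highest measured RPM.
    """
    if not mean_rpm:
        raise ValueError("no strategies to choose from")
    rng = rng or random.Random()
    if rng.random() < epsilon:
        return rng.choice(sorted(mean_rpm))   # prospect for new winners
    return max(mean_rpm, key=mean_rpm.get)    # exploit the current winner
```

A small `epsilon` sends most sessions to the current winner while keeping a trickle of traffic on the alternatives, so their measurements stay fresh and a new winner can emerge.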

Sounds Impossible
If you're like most publishers, digesting all this work in any reasonable time frame is unrealistic, which kind of negates my advice that publishers iterate quickly in order to get unstuck.

Doing this is admittedly very hard. You have to make sense of the vendor data spaghetti: Google Analytics, CrazyEgg, AdJuster, Moat, Optimizely, White Ops, GeoEdge, SSP revenue and bid logs, DFP API, flooring strategies, data analytics, YieldX, Facebook, and a myriad of client-side data such as header bidder performance, page load, and page metadata.

Then you need to find the developer resources to process and organize it in the right kind of data stores, leaving your engineering team with the prospect of building out a non-trivial cloud computing and data environment.

Putting an analytics package on top of all this and then building out a client/server platform and appropriate vendor integrations to make the actioning easy just further complicates the problem.

You're now looking at a multi-year effort. Heck, many third-party vendor analytics integrations alone take a year to complete, only to leave publishers disappointed.

Sortable Analytics
Recap: building out the framework for analytics is necessary, but it's also hard. Really hard. Because of this, we've spent the last 18 months working on Sortable Analytics, which we're just launching this week. We've built a publisher-first content, marketing, and monetization platform that was designed from the ground up to address publishers' need to understand their businesses and move quickly. For too long, I've been hearing about how hard this industry is, how frustratingly slowly it moves, and how non-publisher-centric it is. By adding this on top of our industry-leading monetization stack, we're helping publishers solve even more problems using a single tool.

Right now content, marketing, and rev ops teams largely sit in separate silos, each with their own tools, staff, and access to engineering resources. We don't think siloing the three most important areas of your publishing business makes sense, and so the first thing we want to do is help publishers break down those barriers.

We're taking the position that publishers need world-class engineering services and software to enable them to make better business decisions, and they need to be the ones in control. Analytics comes free to all our publishers and it focuses on helping them in three key areas: Content, Monetization, and Traffic Acquisition. "After being on the digital publishing side of the business for almost 20 years, it's so exciting to now be a part of Sortable, working on innovative ways to improve publisher operations and bring all aspects of the business - revenue, marketing, and content - together," says Fuss.

Best of all, it just works - no horribly painful 6-12 month projects. Everything from viewability analysis, fraud detection, bad ad monitoring, header-bidding, server-to-server, super in-depth analytics, experiments, and more comes built-in. Every publisher now gets a world-class stack. And they're reaping the rewards.

"Access to Sortable's analytics has helped educate our content team to focus on the performance metrics that actually drive revenue," says Sam Weltman, Managing Director - Pulse Division at Gateway Media. "After only a few weeks using Sortable, our content writers were able to identify the monetization trends tied to various content categories. If a content writer has an article idea, they can use the data available to them to search how a similar article performed before writing it. This has also allowed our content team to hone in on the content that our audience loves so they keep coming back for more."

Unifying teams through data and insights is a strong first step towards a much larger plan to return power to publishers. This has always been our goal, and we'll continue to bulk up our engineering capacity and build products that put publishers back in the driver's seat. If you're a publisher - CRO, ad ops specialist, editor, content marketer, or sole entrepreneur - and you like the idea of using data and world-class software to power your business, reach out and we'll see if we can help.
