Listen Now

In this episode of the Ecommerce Playbook, Taylor Holiday and Richard Gaffin dig into the Measurement Roadmap—CTC’s framework for deciding what to test, when to test it, and why it matters. They unpack why incrementality testing is so crucial, the massive swings it can reveal in platform reporting, and how brands can use repeated tests to progressively “shrink the error bars” around decision-making.

From Meta’s 20% under-reporting benchmark to surprising Amazon lift results, Taylor shares concrete examples of how measurement gaps can change behavior, creative strategy, and even channel prioritization. The conversation highlights a hard but liberating truth: marketers don’t need perfect certainty—they need enough clarity to make confident decisions anchored in contribution margin.
Show Notes:

Watch on YouTube

[00:00:00] Taylor Holiday: That's really important as a thinker in this space, is that your job is not to get to perfect information. It's to get to information that gives you enough confidence to make a decision.

[00:00:08] Richard Gaffin: Yeah.

[00:00:09] Taylor Holiday: and I think we're trying to help people, 'cause what I'll tell you is this is hard for people to work through. They really, really want certainty and get

[00:00:16] Richard Gaffin: Mm-hmm.

[00:00:16] Taylor Holiday: very frustrated when we can't offer certainty back to them. But there's like an intellectual honesty to representing things that are probabilistically true, or true today and might be different tomorrow. And so these are ideas that I think as marketers we have to embrace and use as directional signals. And it's why we would still say, in the hierarchy of metrics,

[00:00:38] Richard Gaffin: Mm-hmm.

[00:00:38] Taylor Holiday: our view that these are all still subordinate to the financial reality of what's happening on the field

This episode of the E-Commerce Playbook is brought to you by AllStars. Whether you're making your first international hire or building a full department, AllStars can help you hire the right people without the usual headaches. They've done 450-plus specialized placements for e-commerce sellers in the last three years.

Their team of expert recruiters makes it hassle-free, so you're not spending dozens of hours looking at resumes and doing interviews only to end up hiring the wrong person. They've created proprietary, role-specific assessments for the most common e-commerce roles, which improve the evaluation beyond interviews alone and allow you to see how candidates actually work, without the guesswork.

Just go to hireallstars.com or check the show notes for more information.

[00:01:28] Richard Gaffin: Hey folks. Welcome to the eCommerce Playbook podcast. I'm your host, Richard Gaffin, Director of Digital Product Strategy here at Common Thread Collective. It's been a little while, but I'm joined this week by Mr. Taylor Holiday, our CEO here at Common Thread. Taylor, what's going on, man?

[00:01:43] Taylor Holiday: I am not sure. Oh, whoa. Excuse me. I went to talk and that's what happens when you were yelling on stage all day yesterday at the

[00:01:48] Richard Gaffin: Of course.

[00:01:49] Taylor Holiday: round table, you know? So my voice is a little faded on me.

[00:01:53] Richard Gaffin: All right. Well, we'll get back into the swing of, of the podcasting here. But basically what we wanted to talk about this week is kind of part of a series that we've been doing for a little while now on specific tooling that we're using here at Common Thread Collective to think through things like measurement, testing, incrementality.

Creative volume, all that type of thing. And so what we wanted to talk about today is something that we're calling the Measurement Roadmap. And essentially what this is, is a tool that allows you to understand what you should be testing and why. So I think I'll just turn it right over to you, Taylor, to dig in a little bit more about what this is and how it's useful to us.


[00:02:29] Taylor Holiday: So incrementality is obviously a buzzword and a hot topic for people, and when we have brands come to us, what we wanted was a logical framework for deciding what we should test first and why, and what it represents in terms of potential impact to the business. One of the benefits of running a lot of tests is that you begin to get a population, a dataset, of those test results for different platforms. And what that begins to represent is the upper and lower bound of potential incrementality in a channel. So let's just use Meta as an example. When we have run 50, a hundred, 150 incrementality tests on Meta, we begin to see what all those results represent in terms of how much the platform could be over-reporting or how much the platform could be under-reporting. And that allows us to see, okay, what are the bounds of potential? What is the most likely outcome? Both of which are important data points in our workflow, and I'll explain how. The median, the most likely result of all those tests, is what we use as the benchmark factor: the starting point that, without a test result, we would assume the channel to be. And for Meta, interestingly enough, the factor that we use is 1.2 against seven-day click. So if you are using seven-day click revenue, we would use a benchmark starting factor that says the channel is under-reporting the revenue by about 20%. And oddly enough, this is the exact same factor that Haus published in

[00:04:12] Richard Gaffin: Hmm.

[00:04:12] Taylor Holiday: of their test results too.

So that's like sort of a reinforcement that it's a really strong point: our tests have aggregated there, their tests have aggregated there. There seems to be good confidence that, in the absence of your individualized test result, that's a pretty good number. But okay, we also want to be able to represent: that may be the median result, but we've had some cases where Meta is wildly over-reporting and we've had some cases where it's wildly under-reporting that same view, and we wanna be able to say to clients, that represents the total upside or total downside of your current view. So I'm gonna share my screen real fast to show you an example of how this is illustrated.
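To make the benchmark factor concrete, here's a minimal sketch of the arithmetic (the dollar figure is hypothetical; the 1.2 factor is the median benchmark discussed above):

```python
# Minimal sketch: applying the benchmark incrementality factor to
# platform-reported seven-day click revenue when no brand-specific
# test exists yet. The revenue figure below is hypothetical.

platform_7dc_revenue = 3_000_000   # Meta-reported 7-day click revenue ($)
benchmark_factor = 1.2             # median factor: channel under-reports by ~20%

assumed_incremental_revenue = platform_7dc_revenue * benchmark_factor

print(f"Platform-reported (7-day click): ${platform_7dc_revenue:,.0f}")
print(f"Assumed incremental (x{benchmark_factor}): ${assumed_incremental_revenue:,.0f}")
```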

[00:04:52] Speaker: Hey, everyone, wanted to take a quick moment to let you know that you can now book one-on-one consultation calls with CTC's paid media experts directly. If you've been listening to this pod for a while, you know that CTC's ad buyers are among the best in the world, and they're now just a click away. Go to youradmission.co/bookctc to get started, or check the link in the show notes.

That's youradmission, A-D-M-I-S-S-I-O-N, .co/bookctc to get started.

[00:05:20] Taylor Holiday: So if you're following at home on just audio, I'm gonna talk through this. Hopefully you're gonna be able to see, though. This is the test planning section of Statlas, where we can take a brand and we can say, okay, the

priority channel for them is Meta, because their current spend is the largest in that channel. And if we were to take the platform-reported revenue, second column, then we apply a worst case and a best case. So that's

[00:05:49] Richard Gaffin: Hmm.

[00:05:49] Taylor Holiday: the outer bounds of the potential incrementality results we have seen. Okay, so the worst case and the best case relative to the platform reported revenue. So again, if it's 3 million is what the platform's reporting on seven day click, we've seen the worst seven day click incrementality.

This looks like about 60% of the platform-reported number, and the best case is like 180%. So you can see the bounds of the outcomes run from about 0.6 to about 1.8 in terms of the factors we see on seven-day click. And you can understand then, if you're using the platform-reported number, which is a 1.76, and you were to look at the total upside: in the best case scenario, if you had the best incrementality result of the entire dataset we've ever seen, that would actually mean that you had $8 million of incremental revenue and were actually getting a 2.61. Well, if that

[00:06:41] Richard Gaffin: Hmm.

[00:06:41] Taylor Holiday: were true, it would fundamentally change your behavior. Now, also on the downside, if I go all the way to the downside, meaning I take the worst case scenario, the lowest bound of potential impact. only be 2.66 versus the 3 0 6. Right? So, so now I'm looking and going, all right, well represents the amount of uncertainty that exists in this channel, right? And so how important is it that we answer this question versus say the question of YouTube? We're only spending $30,000 on YouTube. So the unknown, the variable risk there, and we actually already have a test result. You can see these are the channels where we actually have tests. You can open the actual test results and you can see the

[00:07:25] Richard Gaffin: Hmm.

[00:07:25] Taylor Holiday: result. this then helps us to prioritize the roadmap and to quantify for our customers the amount of delta that could currently exist in their present measurement modality.

[00:07:38] Richard Gaffin: So in, so in this, because that's such an enormous swing, right? I mean, obviously for this particular brand, we're going to run our own test and have some more concrete answer to what the actual incrementality factor is for the specific brand. And obviously making decisions from there becomes a little easier.

But let's say in the scenario where you don't have a test, like, what is the set of choices that you have, given that there's, you know, almost a $3 million swing potentially?

[00:08:06] Taylor Holiday: Well, I think this is why incrementality is so important: the amount of delta, or I guess you could call it alpha, but basically the variation between what the platform is currently telling you and what is actually true, is very wide. This is the key point, and this is why this has become such an important topic: because you need to get to the best available truth. And what this is illustrating to you is that if you are just using the platform-reported number, there's a pretty wide range of error around the truth. And so the way we talk about this is that our job is to get to a progressively better truth. There is no singular right answer. It

[00:08:49] Richard Gaffin: Mm-hmm.

[00:08:50] Taylor Holiday: But our goal is to narrow those error bars. So exactly what you're describing, which is how wide the best and worst case is, represents the potential error in your current behavior. And so our job is to shrink the error bars. We're never gonna get to a definitive singular number, but we want to tighten the bands.

And this is why you test and retest and retest and retest, 'cause each one of those individual data points allows you to get to your own measurement dataset, where your own error bars exist, versus having to use the entire population. And so all of this helps us get closer and closer and closer to better truth.
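One way to picture the shrinking error bars: as a brand accumulates its own test results, it can compute its own median factor and band instead of leaning on the population-wide bounds. A hedged sketch with invented test values:

```python
# Sketch: a brand's own repeated incrementality tests tighten the band
# around its likely factor. The test values are invented for illustration;
# in practice they come from repeated geo holdout tests over time.
from statistics import median, quantiles

population_bounds = (0.6, 1.8)                # population-wide worst/best factors
brand_tests = [1.05, 1.25, 1.15, 1.30, 1.20]  # this brand's own results so far

q1, _, q3 = quantiles(brand_tests, n=4)       # interquartile-style band
print(f"Population bounds: {population_bounds}")
print(f"Brand band after {len(brand_tests)} tests: "
      f"{q1:.2f} - {q3:.2f} (median {median(brand_tests):.2f})")
```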

[00:09:29] Richard Gaffin: Right. So ultimately the usefulness of this is primarily, like we said at the beginning, prioritizing what to test, what to run these incrementality tests on specifically. And then also, yeah, I guess it was what to test and why, the why obviously being 'cause there's such an enormous potential outcome swing.

[00:09:46] Taylor Holiday: That's

[00:09:47] Richard Gaffin: So at this point then basically what we do is run the test and then as you're saying, again, the gap between worst and best case should shrink, shrink, shrink, shrink, shrink till you have a little bit of a better sense of, of exactly what the outcome might be.

[00:09:59] Taylor Holiday: The next step from there is, as you retest platforms, you actually get your own representative range of potential value. That, also, I think, is the most honest way to assess what's currently happening: rather than saying definitively it's this, it's, we have seen Meta report 0.6, 0.8, 4.7, 1.7, 3.74, like, over the course of years, and so the range of incrementality looks like this. Now you can be more aggressive in how you're interpreting the present range against what's currently happening. But it allows you to, I think, be clearer about the way you're assessing this. And this is where I think there's a willingness to hold ambiguity that's really important as a thinker in this space: your job is not to get to perfect information. It's to get to information that gives you enough confidence to make a decision.

[00:10:55] Richard Gaffin: Yeah.

[00:10:56] Taylor Holiday: and I think we're trying to help people, 'cause what I'll tell you is this is hard for people to work through. They really, really want certainty and get

[00:11:03] Richard Gaffin: Mm-hmm.

[00:11:03] Taylor Holiday: very frustrated when we can't offer certainty back to them. But there's like an intellectual honesty to representing things that are probabilistically true, or true today and might be different tomorrow. And so these are ideas that I think as marketers we have to embrace and use as directional signals. And it's why we would still say, in the hierarchy of metrics,

[00:11:25] Richard Gaffin: Mm-hmm.

[00:11:25] Taylor Holiday: our view that these are all still subordinate to the financial reality of what's happening on the field

[00:11:31] Richard Gaffin: Mm-hmm.

[00:11:32] Taylor Holiday: We have to anchor ourselves in contribution margin, and then we have to use these channel-level measurement signals as indicators of how we affect that number, but not as the overarching rule of law in terms of what we're trying to accomplish.
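Since contribution margin is the anchor here, a tiny sketch of that arithmetic (hypothetical figures): channel-level factors explain how spend moved the number, but the number itself is the ground truth.

```python
# Sketch: contribution margin as the financial reality that channel-level
# measurement signals stay subordinate to. All figures are illustrative.

revenue = 1_000_000           # total revenue for the period ($)
variable_costs = 420_000      # COGS, shipping, fees ($)
ad_spend = 300_000            # total paid media ($)

contribution_margin = revenue - variable_costs - ad_spend
print(f"Contribution margin: ${contribution_margin:,.0f}")
# Channel factors (like the 1.2 Meta benchmark) help explain *how* spend
# moved this number; the number itself is what ends up in the bank.
```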

[00:11:45] Richard Gaffin: So even within the ambiguity that exists in, let's say, the incrementality of Facebook as a channel or whatever, you still have the more concrete or, or the more certain number of like bottom line, what's actually happening, right? Like what your contribution margin turned out to be.

[00:11:59] Taylor Holiday: right. What, what is the actual money in the bank account

[00:12:01] Richard Gaffin: Mm-hmm.

[00:12:02] Taylor Holiday: period of time is the ultimate truth.

[00:12:05] Richard Gaffin: Yeah.

[00:12:06] Taylor Holiday: The thing that we're often wrestling with is: if you get these numbers wrong, in other words, if you're underrepresenting the opportunity dramatically in some way, what you've lost is money that would not show up in the bank account, right?

So,

[00:12:19] Richard Gaffin: Mm-hmm.

[00:12:20] Taylor Holiday: like, it, it, it's that opportunity cost that is very hard to see.

[00:12:24] Richard Gaffin: Yeah.

[00:12:25] Taylor Holiday: That's what we're often trying to extract: are we underrepresenting the impact of this ad spend such that we are leaving massive opportunity on the table? I'll give you an example of something that I was just having a Slack conversation about. One of the areas where we're really exploring this is the impact of Meta on Amazon, beyond just your incrementality factor. And so we have a big brand that spends a lot of money on Meta right now that just got a test back, and 43% of the

[00:13:01] Richard Gaffin: Hmm.

[00:13:01] Taylor Holiday: lift of their ad spend comes from Amazon. So if you think about it, like, you're running your Meta account and your target is a two and you're sitting there at a 1.4, but then all of a sudden you discover that an incremental 43% of the total impact of this ad spend is showing up in this way that you're currently not accounting for. That changes all of your behavior, right? That goes from a signal that things aren't working to a signal that things are, right? And think about this from even like a creative evaluation standpoint. If you're in that ad account and you don't have that view, you have this feedback cycle that tells you all this creative concepting is bad,

[00:13:39] Richard Gaffin: Mm-hmm.

[00:13:40] Taylor Holiday: it's not delivering the result, right? And so you would run off and change it and try and do different things when in reality it's just functionally a measurement problem.

[00:13:48] Richard Gaffin: Yeah. Yeah. Yeah.

[00:13:49] Taylor Holiday: this is why this is so important to go after.

This episode of the E-Commerce Playbook is brought to you by AllStars. Whether you're making your first international hire or building a full department, AllStars can help you hire the right people without the usual headaches. They've done 450-plus specialized placements for e-commerce sellers in the last three years.

Their team of expert recruiters makes it hassle-free, so you're not spending dozens of hours looking at resumes and doing interviews only to end up hiring the wrong person. They've created proprietary, role-specific assessments for the most common e-commerce roles, which improve the evaluation beyond interviews alone and allow you to see how candidates actually work, without the guesswork.

Just go to hireallstars.com or check the show notes for more information.

[00:14:35] Richard Gaffin: Yeah. Well, no, I mean, I am struck by just, like, the complexity of the ambiguity, I guess. Like we're dealing in ambiguities, we're dealing in things that are unseen. We're dealing in a lot of known unknowns, unknown unknowns. It's kind of like existing in that area. So, like, walk us through, what's the decision-making

process for, like you were mentioning before, living in that ambiguity. Let's say you're at a place where you don't have a clean incrementality test on, let's say, Meta acquisition, and you have this enormous swing. What's the set of decisions you make there before you get to a point where you've actually narrowed in a little bit more clearly on what the incrementality of that channel is?

[00:15:13] Taylor Holiday: I think you have to decide when it's worth pursuing this knowledge for the sake of some change you're trying to create. And I think there are these things that create impetus for this, right? Like I just told you one: if I was diversifying my distribution, launching onto Amazon, launching on walmart.com, I would use that as a moment to say, okay, hold on, this is a clear period where we have to reassess our measurement system,

[00:15:39] Richard Gaffin: Mm-hmm.

[00:15:40] Taylor Holiday: where we have to look at it and go, there's something really distinct about this. that we have to get an answer to. The next would be like similarly like channel diversification.

If you're getting ready to run on AppLovin, which everybody I think should consider in Q4, you should in October run an incrementality test on AppLovin, so that as you go into a key moment where you could potentially scale spend a lot, you have visibility into the impact of that channel.

[00:16:03] Richard Gaffin: Mm-hmm.

[00:16:04] Taylor Holiday: so I think there's these moments.

The other thing would be, like, you're really struggling in a channel or you've seen performance dip. Or, in some cases, like we get brands that come to us, another good example is: you have a lot of organic demand. Maybe you're a business that has built an organic audience for most of your lifetime, and all of a sudden now you're venturing into paid.

And so you have really high ROAS numbers in Meta and you're just uncertain about the actual causal impact of that ad spend versus what's organic. I think it's really important for people to be principled thinkers about in what scenario is this information useful and in what scenario is it not. And what I've found, like, you know, I was listening to the CMO of Cozy Earth, who we work with, and he was on the Marketing Operators podcast this past week. And he and I have had a lot of conversations. We're even thinking about doing a series about their incrementality journey 'cause it's so complex. But one of the things I've started to notice,

[00:17:02] Richard Gaffin: Hmm.

[00:17:02] Taylor Holiday: I was listening to him and Connor and Connor talk and. like what they kept saying was like, yeah, we get this test result, but we don't really believe it.

[00:17:11] Richard Gaffin: Hmm.

[00:17:12] Taylor Holiday: this is, this is the inherent complexity of this, is that there's this, there's this tension in the model and the trust.

And what I would just say is that if you design a test and, before you launch it, you know you are going to inherently distrust the outcome, then we should probably take a step back and redesign the test, 'cause it's probably not a good use of time and energy. Right? And this is the hard thing: often when the test result confirms our biases, we tend to place higher trust in it than when it refutes

[00:17:41] Richard Gaffin: Yeah.

[00:17:42] Taylor Holiday: it when it's at conflict with what we believed before. And this is, this is a challenging sort of general premise of how we as people is that when we. new information that's at, that's in conflict with our previous belief. It's harder to accept.

[00:17:56] Richard Gaffin: Mm-hmm.

[00:17:56] Taylor Holiday: there's like very human psychology tied up in all of this work too, that you just have to,

[00:18:01] Richard Gaffin: Yeah.

[00:18:01] Taylor Holiday: figure out.

[00:18:03] Richard Gaffin: Okay. Well, so I'm curious then, as a skeptical man yourself, what is the thing that, like, gives you trust to feel like these types of incrementality tests are useful

[00:18:13] Taylor Holiday: right.

[00:18:13] Richard Gaffin: and worth listening to.

[00:18:14] Taylor Holiday: The reason I care so much about the aggregate work is because what I want to see is some indication that there are reasons that underpin the outcomes, reasons that are repeatable within multiple brands. I'll give you an example of a hypothesis that would be confirmed or rejected with this kind of information: categorical search in a competitive environment is more incremental than when you're the only provider. So, as an example, there are categories where there is a single brand leader that is almost synonymous with the category. And in those scenarios, categorical search acts more like branded search. My hypothesis would be that in those scenarios, incrementality would be lower for spending on categorical search, versus something like swimsuits, where it's a bloodbath of many, many brands competing for all those clicks and winning them is super important, and someone who searches for women's swimsuits isn't likely looking for your specific brand in that search.

And so those clicks are highly incremental. That's like an underlying hypothesis that would make sense to me to align with the data. And so when I can understand the organization of the information and those scenarios play out consistently over time, it affirms the quality of the information. When the results are all random in ways that are hard to understand, it doesn't mean they're wrong. It just means that I know less what to do with it, and so it becomes less actionable. Another thing I like is when multiple sources align, so like the Haus example I gave with Meta, where it is my experiential belief that seven-day click under-reports the impact of Meta.

[00:20:04] Richard Gaffin: Mm-hmm.

[00:20:04] Taylor Holiday: It seems to me that seven days is insufficient, and that view conversions do have some level of effect on people; it's not zero. And so those two things would lead me to believe that seven-day click would under-represent the total impact. Then we run a bunch of tests and we find that yes, that's the case, by 20%.

And then a separate party, Haus, who also runs a bunch of tests, finds a confirmatory result.

[00:20:28] Richard Gaffin: Mm-hmm.

[00:20:28] Taylor Holiday: Now I have this alignment between experience, my own data set, and a third-party data set, in a way that I go, my confidence increases in the information. I look for things like that, where the amount of trust I put into the information depends on an alignment of all of those things. When they're out of alignment, like if they had come back and said seven-day click actually over-reports by 50%, I'd be like, whoa, now I feel uncertain and I'm less

[00:20:56] Richard Gaffin: Mm-hmm.

[00:20:56] Taylor Holiday: likely to action it.

So there's, there's some sort of alignment across those experience, multiple data sets, repeated results of testing, experiments, things like that.

[00:21:06] Richard Gaffin: Well, so I mean, yeah, I think it's interesting to dig into this a little bit, because the examples that you're giving are where a presupposition was affirmed by the test and then backed up, let's say, by multiple sources, whatever. Is there an example where a presupposition you had was actually

[00:21:21] Taylor Holiday: all

[00:21:21] Richard Gaffin: disconfirmed or whatever by it?

And how were you able to, to kind of get into trusting that?

[00:21:26] Taylor Holiday: So I'll give you an example of one. YouTube is one for me, where the results have run really weird to me, and I don't yet have a lot of confidence. So I think the general hypothesis would be that YouTube struggles within platform reporting because there's no direct click path,

[00:21:47] Richard Gaffin: Mm-hmm.

[00:21:47] Taylor Holiday: it's primarily view based. A lot of it's watched on tv and so it's highly likely that the platform under reports, well, that's not always the case. Like sometimes I see, like even the result we were just looking at where actually the incremental revenue of YouTube was less than what was reported by the channel and, and you're like, oh man, like. I don't understand yet. And so I think there's these spaces of, of these more novel channels, especially when you get into these higher funnel actions that the results are really disparate across things. And I don't yet have a clear idea in my head about how to interpret it. And so I really sort of place higher value on the individual results of like, okay, well in this case, this is what happened. I have to change the inputs. I also, the creative strategy is a lot less clear to me of exactly what wins and how to do it. It's newer, so I think about it like, you know, you and I have sort of talked about this as it relates to how I would try to teach my kids to believe information or even process religion.

It's like, well,

[00:22:47] Richard Gaffin: Yeah.

[00:22:47] Taylor Holiday: you need to have this relationship between the consistent, repetitive confirmation of a set of information from multiple places and multiple sources and your own experience, such that as those things compound, your trust meter goes from like 0% to one, to two, to three, and it kind of grows over time, and then something might set it back for a bit, but it's not binary.

You don't just go like a yes or no, you know? And I think that's asked of you in some spaces, but I don't think that that's my experience of how I now sit in the idea of belief. It's that meter that's kind of going up and down for each channel, relative to the confirmation of the experiments in multiple cases, individualized experiences, lived experiences.

All these things build a robust picture of belief.

[00:23:32] Richard Gaffin: Yeah. Yeah. Well, that's an interesting thing to dig into all on its own, but I guess I'm curious about... oh my God, what was I gonna ask? My brain's gone completely blank on this. Oh, oh yeah. Okay. Sorry. So one important element of, like, the foundational level of trust here, I would imagine, is: do you trust the actual mechanics of the test itself? Right? So we've talked about how these work like a million times, but I think it might be worthwhile to say, like, at least at some level, obviously 60 incrementality tests are gonna be more valid than one or whatever.

But why is that? The sort of mechanic, this sort of geo holdout thing, why do you trust that fundamentally?

[00:24:11] Taylor Holiday: I think in its purest form, the idea of the absence of an effect and the presence of an effect, and assessing the value of the effect, is

[00:24:27] Richard Gaffin: Mm-hmm.

[00:24:27] Taylor Holiday: structurally the best kind of test design that you could have

[00:24:31] Richard Gaffin: Mm-hmm.

[00:24:32] Taylor Holiday: Sort of the classic double-blind placebo idea, right? Where there are two parties, and one doesn't know that they're receiving a placebo.

One receives the actual pill, and they don't know which is which. And then the effects are present or not present in each group, and you can determine the value or the efficacy of said drug.

[00:24:54] Richard Gaffin: Mm-hmm.

[00:24:56] Taylor Holiday: Now, so I think from an experiment design, test design standpoint, when people talk about the idea of a gold standard, it's similar to a scientific setting, right? And the challenge is, so I'm telling you why it's important and then I'm gonna undermine it too. So

[00:25:11] Richard Gaffin: Sure.

[00:25:11] Taylor Holiday: it's really hard to isolate the variables in our world. And you're dealing with two things. You're dealing with like a, a synthetic set of data. So you're actually like trying to model out what the revenue would've been without the spend.

Like, so you

[00:25:22] Richard Gaffin: Mm-hmm.

[00:25:23] Taylor Holiday: design these effects a little bit. To try to understand what the baseline would've been in each case. And you're trying to find like matched markets, right? So you're trying to find, this state looks like this state, and so we're gonna hold out this state and compare it to this state it's not truly. The same isolated group receiving it and not receiving it at the same time, you can't really replicate it in that way. Now meta does that on like a user level. They do conversion lift studies at a user level. There's a separate problem there related to the the data set that they're assessing, which is the attributed revenue versus the actual revenue.

So I wish they would do user-matched studies versus geo-matched studies, where they find like users and then they hold them out and isolate their feeds. But we're doing our best to replicate the process of sort of that double-blind experiment with these geo holdouts, where, of two similar regions that have similar buying behaviors over time, suddenly one is no longer receiving ads and one continues to, and you see the difference, the impact of the absence of the advertising. So in theory, you should see more revenue lift out of one group than the other, and that's the sort of lift percentage that you'll see as a comparison. Or in some cases, if it's a net new channel, like we're going into AppLovin, then we're gonna say, hey, right from the start, this region's gonna get none and this region's gonna get some, and we're gonna see the lift.

So that should give you the strongest sense of the impact of this isolated variable over time. Of course, the variables are actually not isolated. The weather, the news, the experience of people in that town all have some impact here, so it's not totally perfect,

[00:27:03] Richard Gaffin: Mm-hmm.

[00:27:03] Taylor Holiday: it's the best available. That's why I think this phrase best available truth of all the ways you could measure it, it's currently the best option to get closest to the causal impact of the advertising.

[00:27:16] Richard Gaffin: Yeah. And it seems, too, like you're mentioning, sort of the confirmation of the Haus result with our result, there at least seems to be, on aggregate, having done these tests many, many times, some consistency there, which is an indicator that it's useful in some way.

[00:27:32] Taylor Holiday: Well, in some places there's consistency; in other places it seems inconsistent right now.

[00:27:36] Richard Gaffin: Mm-hmm.

[00:27:36] Taylor Holiday: And that goes back to, like, where do I assign more trust? And this just goes back to, same thing with scientific studies: if a bunch of third parties can replicate them and they get the same outcome, then it's like, oh, okay, well, this feels like maybe there's more confidence here. The more people drop an

[00:27:51] Richard Gaffin: Mm-hmm.

[00:27:51] Taylor Holiday: apple and it falls to the ground, the more trust we have in gravity. If that only happened in then they'd be like, well, that's confusing, you know? You'd

[00:27:59] Richard Gaffin: Mm-hmm.

[00:27:59] Taylor Holiday: alternative effect besides sort of a, a, a globally shared gravitational principle.

And so I think that's what we're looking for here: is the experience of Meta seven-day click the same? Does it have a range? Does it have a likely outcome? And that also gets us to a place where, in theory, at some point you might be able to define enough attributes to not need to test everybody.

This is another thing

[00:28:24] Richard Gaffin: Hmm.

[00:28:24] Taylor Holiday: that is ultimately part of my ambition here: imagine you ran a thousand tests and you had this really tight concentration of results around a very similar endpoint. Then you could actually use sort of a nearest-neighbor principle, which is to say, this brand is like these other brands in these ways, and you could probably get pretty good at predicting the likelihood of incrementality.

And I think that's some of what the future's gonna hold for this: it's not actually gonna be necessary to test everyone all the time in every place, but you're gonna be able to use the collection of results to infer likely outcomes that you can be confident in, at a level that allows people to make decisions faster.
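Taylor's nearest-neighbor idea could, in its simplest form, look something like the sketch below: describe each tested brand by a few attributes, find the most similar already-tested brands, and use their observed factors as a prior for an untested brand. The attributes, distances, and factors are all invented, and this is not CTC's actual model.

```python
# Sketch of a nearest-neighbor prior for incrementality, per the idea above.
# Brand attributes and observed factors are invented; in practice you'd
# normalize attribute scales before measuring distance.
from math import dist
from statistics import median

# (monthly Meta spend $k, branded-search share, AOV $) -> observed factor
tested_brands = [
    ((250, 0.30, 80),  1.15),
    ((400, 0.10, 120), 1.35),
    ((60,  0.55, 45),  0.85),
    ((300, 0.20, 95),  1.25),
]

def predict_factor(attrs, k=2):
    """Median factor of the k most similar already-tested brands."""
    neighbors = sorted(tested_brands, key=lambda b: dist(attrs, b[0]))[:k]
    return median(f for _, f in neighbors)

new_brand = (280, 0.25, 90)   # attributes of a hypothetical untested brand
print(f"Predicted incrementality factor: {predict_factor(new_brand):.2f}")
```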

[00:29:04] Richard Gaffin: Yeah. All right. Well, let's bring it back to our measurement roadmap then. And so, as we're kind of laying out here, getting to the absolute truth is impossible, but getting closer and closer and narrowing in on it is possible, and prioritizing what you should measure, and why, and when, is exactly what this is for: getting a little bit clearer on what the outcome might be.

So tell us, Taylor, is this rolled out right now? Are we doing this with clients? What's the skinny there?

[00:29:28] Taylor Holiday: So we have the testing roadmap rolled out to customers, and our growth strategists are gonna be using it in conversation to represent potential value. There's a piece of the puzzle that we're trying to wrestle with, which is, how do you assess the potential impact of spend that doesn't yet exist?

So like, why AppLovin? Well, we could put in, if you put in this much budget, this is the potential range of impact that we've seen. So there's almost like a media planning part of it that I

[00:29:53] Richard Gaffin: Mm-hmm.

[00:29:54] Taylor Holiday: component we're gonna keep improving on. is a complicated thing. We're trying to equip our people with tools to help educate the customers to understand why this is valuable and what can they can expect from it.

So we're gonna keep improving this tool. It's something we wanna keep helping to design and help our people think through, to make sure that people are confident in why we're suggesting a testing roadmap, and to also reinforce the idea that this is a journey. It's not a

[00:30:17] Richard Gaffin: Yeah.

[00:30:17] Taylor Holiday: time solves all the problems.

It's that we're going on a truth seeking mission together for the next few years, and our hope is that we get closer to something we feel confident in. The longer that we go and the more tests that we run.

[00:30:30] Richard Gaffin: That's right. And if you're interested in going on a truth-seeking mission with us, you know where to find us: commonthreadco.com. Hit that Hire Us button, let us know that you're interested. We would love to chat with you if you're an eight- or nine-figure brand. All right folks, well, I think that's gonna do it for us for this week.

Until next time, we'll see you.