How far can we trust research that is generated without asking a single human being?
In this episode, I sat down with Jordan Harper from Qualtrics to unpack one of the most talked-about developments at the Qualtrics X4 Summit: synthetic research. It is a topic that sparks curiosity, excitement, and skepticism in equal measure. And honestly, that tension is exactly why this conversation matters.
Jordan brings a rare mix of scientific thinking and real-world technology experience, which makes him well placed to cut through the hype. We explored what synthetic panels actually are, and just as importantly, what they are not.

While many assume this is simply about asking a large language model for answers, the reality is far more nuanced. The approach Jordan and his team are building is grounded in how humans respond to surveys, trained on vast datasets to reflect the inconsistencies, biases, and unpredictability that make human insight valuable in the first place.
What stood out throughout our conversation was the idea that synthetic research should be seen as additive rather than a replacement. It offers speed, flexibility, and the ability to test ideas quickly, but it does not replace the depth and lived experience that only real people can provide. In fact, some of the most interesting insights come from comparing synthetic responses with human ones, revealing patterns, biases, and even blind spots in traditional research methods.
We also got into the practical side of things. From controlling for issues like survey fatigue and social desirability bias, to experimenting with question design in ways that would be difficult with human respondents, synthetic research opens up new ways of working. At the same time, it raises important questions about validation, trust, and where to draw the line when decisions carry real-world consequences.
For me, this episode is about perspective. In a world where AI is accelerating everything, it can be tempting to look for shortcuts. But as Jordan explains, the real value comes from using these tools thoughtfully, alongside human insight rather than in place of it.
So as this technology continues to evolve, how should researchers and business leaders strike that balance? And where could synthetic research help you ask better questions before you make your next big decision?
Useful Links
Learn more about the Qualtrics X4 Summit
Learn more about Synthetic Panels here
[00:00:03] Welcome back to the Tech Talks Daily Podcast, where once again I'm recording live at the Qualtrics X4 Summit in Seattle. And today I want to dive into one of the most talked about and often misunderstood areas in AI right now. I'm talking about synthetic research. And my guest is Jordan Harper.
[00:00:27] He's the principal AI thought leader at Qualtrics, where he works within the Edge Centre of Excellence, helping shape how researchers, academics and insight teams all think about synthetic audiences and what they actually can do in the real world. And one of the many things I enjoyed about my conversation with Jordan today is he brings both technical depth and a very practical lens to the discussion.
[00:00:57] And his background stretches from physics and astrophysics through to digital technology leadership. And I think that this mix gives him such a useful and valuable perspective on why AI needs to be tested, validated and understood, rather than simply admired from a distance. So today we're going to unpack what synthetic panels really are.
[00:01:22] We're going to demystify that today and talk about how they differ from simply asking a general purpose AI model a question. And why this matters so much for researchers that are trying to make better decisions faster. And yeah, we'll also get into that messy reality of human responses, the hidden biases inside traditional surveys,
[00:01:45] and explore why synthetic research is opening up some very interesting possibilities without replacing the human side of research altogether. So if you've been hearing more and more about synthetic panels and wondering whether it's just hype, a shortcut, or something that is genuinely useful to your business, I'm hoping today's episode will help you make sense of where the opportunity really is. But enough from me.
[00:02:14] Let me beam your ears all the way to Seattle, where you can sit down with myself and Jordan right now. So thank you for joining me here today at the X4 Summit in Seattle. Can you tell everyone listening a little about who you are and what you do? Sure. Thanks for having me, Neil. I am Jordan Harper. I'm the Senior Principal Thought Leader for the Edge Centre of Excellence at Qualtrics. So that is our market research centre of excellence.
[00:02:40] So working with research teams, researchers and academics on a synthetic research platform, primarily Edge Audiences. And the question I've got to ask you, because I was doing a little research on you before you joined me today: you've got a background in science and tech. So tell me more about your origin story and what it was that lit that spark that would lead to you bumping into me today after a session, you know? Sure. So as you probably saw from my background, I studied physics at university. I did a master's in astrophysics and ended up spending a couple of years working in nuclear engineering consultancy.
[00:03:11] Wow. That was sort of my choice after university, where I was like, I'm not ready to give up the fact that I just did a science degree. Like, let's see if I can enjoy this as a career as well. And it was around the early 2000s, when digital technologies, the web, the internet and everything were taking off, and that was a sort of side passion of mine. Throughout university, I'd earned my beer money from building websites for people rather than working behind bars. So I made the decision to lean into that as a full-time career,
[00:03:40] moved to London and started working in new media agencies, as we called them back then, as a developer, then a dev team manager, tech director, and ended up CTO for an integrated marketing agency, where I spent most of my time working with C-suite clients to help them understand what's coming next and how to structure their teams and technology strategies around the future of technology. So that was all through the mobile era, as that kind of changed everything.
[00:04:09] I was lucky enough to get in on the ground floor with all of those huge, market-shifting changes in mobile. And I feel like AI is now that same smartphone-era tier of shift in the way the world works and how businesses need to think about technology. So that's how I ended up here, working with Qualtrics on AI. It really has, because now you've been beamed right into the heart of absolutely everything.
[00:04:37] Before joining me today, you were in a session about synthetic panels, and we hear a lot about agentic AI at tech conferences, but this one in particular, synthetic panels, is something I'm hearing more and more about. But for people new to the topic, can you just offer an overview of what synthetic panels are? Yeah, so the Qualtrics synthetic panel, I mean, I suppose maybe the first thing to start out with is, it can mean many things.
[00:05:03] And it is interpreted in several different ways, depending on who's selling you the synthetic panel and who's talking about the technology. The majority of platforms out there that sell synthetic panels or synthetic research are essentially just wrappers around an LLM: you ask a question, it sends it off to the LLM, takes the answer back and returns you a result. With that type of approach to synthetic research, and I'll put some air quotes around "research" in that one,
[00:05:29] you essentially get very stereotypical persona-based responses that mean-revert quite easily. They don't really provide the same kind of diversity or the same kind of distributions that you get when you ask humans questions, because humans are a little messy and not quite as logically perfect as LLMs, even when LLMs are hallucinating and behaving a bit strangely. The other style is the so-called RAG approach, where
[00:05:54] a company will say, give me all of your data, I'll upload it into the context of an LLM, and then you can ask it questions and it'll answer more like your customers, rather than like the generic LLM that mean-reverts to stereotypes. Unfortunately, what happens there is it mean-reverts to something different: it reverts to what your customers have already told you. So you don't really learn anything new, and it's not really an opportunity to ask different types of audiences questions. So we don't think that's the right approach either. So what we've built is a foundational-model LLM.
[00:06:22] Essentially, we've built our own version of ChatGPT or Gemini, our own LLM that's trained on millions of rows of customer survey responses. So what our LLM really understands is how humans answer surveys, how people respond to surveys. And what that enables you to do is build a survey, just like you normally would in the Qualtrics survey builder tool. You load it into our Edge Audiences product, and then you can choose to field it with humans or field it with synthetic or both. Yeah.
[00:06:52] And set your demographic targeting, send that survey out. You'll get rows of data back from humans as normal and rows of data back from synthetic respondents. And each one of those synthetic respondents is a unique instance of that LLM that gets spun up. It takes the survey, it remembers its answers as it goes through, returns the responses and then turns itself off. And then a new one is spun up to answer the next row, and so on and so forth.
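To make that lifecycle concrete, here is a minimal sketch of one-fresh-model-instance-per-respondent fielding. It is an illustration under stated assumptions, not the Qualtrics implementation: `query_model` is a hypothetical stand-in for the fine-tuned survey LLM, and the names and structures are invented for the example.

```python
import random
from dataclasses import dataclass, field

def query_model(prompt: dict) -> str:
    """Hypothetical stand-in for the fine-tuned survey LLM; here it
    just returns a random 1-5 rating so the sketch is runnable."""
    return random.choice(["1", "2", "3", "4", "5"])

@dataclass
class SyntheticRespondent:
    """One simulated respondent: a fresh model context per survey."""
    demographics: dict
    history: list = field(default_factory=list)  # answers so far

    def answer(self, question: str) -> str:
        # The respondent "remembers" earlier answers by passing them
        # back as context with every new question.
        response = query_model({
            "profile": self.demographics,
            "previous_answers": self.history,
            "question": question,
        })
        self.history.append((question, response))
        return response

def field_survey(questions: list, profiles: list) -> list:
    rows = []
    for profile in profiles:
        respondent = SyntheticRespondent(demographics=profile)  # spun up
        rows.append([respondent.answer(q) for q in questions])  # one row of data
        # respondent falls out of scope here: it "turns itself off"
    return rows

if __name__ == "__main__":
    questions = ["How satisfied are you with your phone? (1-5)",
                 "How likely are you to upgrade this year? (1-5)"]
    profiles = [{"age": "25-34", "country": "UK"},
                {"age": "45-54", "country": "US"}]
    for row in field_survey(questions, profiles):
        print(row)
```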
[00:07:16] So what you get is a pretty accurate simulation of how humans respond to surveys based on the demographic targeting and the questions you're asking. It's fascinating, isn't it? Are there any parallels between the two, the human responses and the synthetic? Yeah. So a huge part of us feeling comfortable offering this as a product that people pay for is that we have validated the responses against human responses. So we're confident that it provides a really accurate reflection of the way humans respond to surveys.
[00:07:45] In the Centre of Excellence we're constantly retesting and revalidating. And I do mean constantly: every week, every month, to ensure that it's staying on track with how humans respond to things. We also see some interesting differences sometimes. And that's where I think it gets really interesting. It's very easy to look at the difference and say, oh, well, that's where it's obviously getting it wrong. Yeah.
[00:08:08] But what we also find is that because of the unique nature of a synthetic respondent, we're sometimes getting signal that humans aren't giving us, because humans are not perfect survey responders. I don't know about you, but I know I've definitely not always told the truth when ticking the boxes on a survey. I might have ticked a five when it's really a three, or ticked a one when it's really a two. Or I might just not have answered a question at all.
[00:08:31] And the interesting thing about synthetics, and this is from our experimentation rather than speculation, we've actually seen these results, is that they don't tend to suffer from social desirability bias. They don't worry about how the person looking at those survey results is going to think about them as a human being, which we do. They don't suffer from things like priming, where questions asked earlier in the survey might subconsciously influence the way they answer questions later in the survey.
[00:09:01] We did an experiment where the question at the end of the survey was: do you think smartphones are net good or net bad for the world? Do they do more harm or more good? On a scale of one to five. Earlier in the survey, we split the respondents into three cohorts. For one of them, we didn't have any other questions in the survey that touched on smartphones.
[00:09:21] For another cohort, we had some questions earlier in the survey like: do you think smartphones help you stay in touch with friends and family, or navigate the world a little easier with maps, things like that? And for another cohort, we asked questions earlier in the survey like: do you worry about the impact of social media on your children? Or are data privacy concerns around the photographs on your phone a problem? That kind of thing.
[00:09:42] What you see with humans is exactly what you would expect to see: when you ask that more-harm-than-good question, the ones who've seen the negatively primed questions earlier in the survey tend to answer a little more negatively than the unprimed cohort, and the ones who've seen the positive ones answer more positively. What you see with synthetic is a fuzzy variation across the whole thing. So no real significant deviation, and certainly no obvious pattern that one priming is sending them in one direction versus another.
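A rough sketch of that experiment design, assuming a hypothetical `field_cohort` call in place of actually sending the survey to a human or synthetic panel; the priming wording, cohort size, and the random scores are illustrative placeholders, not Qualtrics data.

```python
import random
import statistics

FINAL_QUESTION = "Are smartphones net good or net bad for the world? (1-5)"

COHORT_PRIMES = {
    "unprimed": [],
    "positive": ["Do smartphones help you stay in touch with family?",
                 "Do maps on your phone make navigating easier?"],
    "negative": ["Do you worry about social media's impact on your children?",
                 "Are data privacy concerns around your photos a problem?"],
}

def field_cohort(survey: list, n: int = 500) -> list:
    """Hypothetical fielding call; returns n scores in 1..5.
    A real run would send the survey to humans or synthetic."""
    return [random.randint(1, 5) for _ in range(n)]

means = {}
for name, primes in COHORT_PRIMES.items():
    survey = primes + [FINAL_QUESTION]   # priming questions come first
    scores = field_cohort(survey)
    means[name] = statistics.mean(scores)

# With humans, the reported pattern is negative-primed mean < unprimed
# < positive-primed; with synthetic respondents the three cohort means
# show no significant separation.
for name, mean in means.items():
    print(f"{name:>9}: {mean:.2f}")
```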
[00:10:11] So the ability to use synthetic to control for that kind of thing, or test for that kind of thing, is really interesting. So it also shows the power of suggestion to a certain extent, doesn't it? And the behavioral science behind that, and how you can sway people, in the wrong hands maybe. Yeah, absolutely. I think probably one of the trickiest things for researchers to tackle with synthetic is going to be almost unpicking all of the stuff that they've spent years having to do because of human imperfections.
[00:10:39] There's so much processing and work that researchers have to do with human-generated data to be confident in saying: OK, we've stripped out all of the potential priming, we've stripped out all of the potential social desirability skewing, we've stripped out all of this stuff, and this is what the results are. With synthetic audiences, you don't really have to think about that. Another thing like that is the length of the survey, a very sensitive topic for anyone asking a survey, or anyone responding to one for that matter.
[00:11:06] You know, how many questions are you going to make me answer before I get to the end of this? Synthetic respondents don't get bored. They don't answer questions later in the survey with less thought and consideration than they give questions at the start. So you can ask a 200-question survey if you want. I probably still wouldn't recommend that, even with synthetic. But you don't have to control for that kind of thing in the way that you do with humans.
[00:11:29] And so I think that is really interesting, and probably quite a challenge for career researchers to unpick in the way they think about asking and building the questions they want to ask. And apologies for asking this, but I'd imagine you get this question a lot. People listening might think: when we're talking about synthetic, why can't I just ask Gemini or ChatGPT? Of course, it's not as simple as that, and you've got access to so much more data. But is it a question you get a lot, and how do you answer it? Yeah, absolutely.
[00:11:58] I mean, I touched on it a little bit earlier, but you can absolutely do it with ChatGPT. And with our platform, we've built the capability, not for customers but for our own testing, to switch out our foundational LLM for ChatGPT or Gemini and see how they respond to those questions. And what you see is that mean-reverting, stereotypical response. One of the examples we used was asking about climate change: is climate change a problem?
[00:12:25] What you see from humans is not 100% of people saying yes. You see a distribution, because that question is very loaded for humans; it comes with political connotations and all kinds of stuff. It's not the perfect logical answer. If you ask ChatGPT that question a thousand times, what we saw when we did the testing was 100% yes. Not 99% and a couple of outliers. It was 100% yes, climate change is a problem for humanity.
[00:12:53] When we asked our fine-tuned model, because it knows how humans answer surveys, because it has that messiness and inconsistency that humans exhibit built in, it actually showed a very similar distribution to humans: a majority yes, but with a split between no, don't know and not sure that was very, very similar to the human distribution.
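A minimal sketch of that kind of validation check: field the same question to both panels and compare the answer distributions. The two `field_to_*` functions and their weights are made-up placeholders standing in for the real panels, not Qualtrics results.

```python
from collections import Counter
import random

OPTIONS = ["yes", "no", "don't know"]
QUESTION = "Is climate change a problem?"

def field_to_humans(question: str, n: int = 1000) -> list:
    # Placeholder: humans show a spread, not a unanimous answer.
    return random.choices(OPTIONS, weights=[78, 12, 10], k=n)

def field_to_synthetic(question: str, n: int = 1000) -> list:
    # A bare LLM wrapper reportedly returns "yes" ~100% of the time;
    # the fine-tuned survey model reproduces a human-like spread.
    return random.choices(OPTIONS, weights=[77, 13, 10], k=n)

human = Counter(field_to_humans(QUESTION))
synthetic = Counter(field_to_synthetic(QUESTION))

for option in OPTIONS:
    print(f"{option:>10}: human {human[option]:4d}  synthetic {synthetic[option]:4d}")
```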
[00:13:14] And I think one of the biggest dangers with synthetic is thinking that asking ChatGPT, or using a model that's essentially just a wrapper around ChatGPT or Gemini or Claude or whatever, is going to be a substitute for research, because it's not. That's not going to give you usable data. It's not validated.
[00:13:33] And there's lots of qualitative examples where it's a conversation, and LLMs are very good at sounding very confident and very eloquent in the responses they give you. And that can be a bit of a trap when it comes to research, because it might sound like a great response. Oh, this sounds exactly like my customer. I'm talking to this persona. But there's no real way of validating that or testing it, which is why we went quantitative research first, because we know we can validate quantitative research.
[00:14:03] There's processes, there's methodologies, there's math behind quant research. And so that's why our platform is built around that rather than the conversational. And your session here, you've delivered it several times across two different days, and it's been hugely popular. And I'm curious, what kind of questions did people ask in the sessions? Were there any differences between sessions? Any trends in the kind of things people were asking? Not too dissimilar to the questions you're asking here, to be honest. The most common one is actually B2B.
[00:14:32] Qualtrics has a lot of B2B customers, and B2B is a really important part of our customer base, particularly when it comes to research and market research. And the answer that I gave there is: we know that B2B is really important, and we're working on a B2B-specific model that is able to understand the firmographic targeting that you would use with B2B. We also know that B2B is not monolithic.
[00:14:55] There's a very big difference between healthcare B2B, IT-decision-maker B2B and financial services B2B. There are different types of targeting, different types of questions that get asked, and different types of questions that we need to train a model on. So we're currently working through exactly how we bring that all together and get access to the highest-quality data we can in order to train that model. It's interesting you say that, because I think one of the big things I've noticed here is the words context and nuance.
[00:15:22] And that seems to be a real big part of what you're going for here, isn't it? It's not just one-size-fits-all, where the model will sort this and fix all your problems. You need that context, don't you? Yeah. It's important even in the way you write the questions for synthetic research. One of the things that I talk about in the session is how you have to formulate the questions a little bit differently, because LLMs interpret words in slightly different ways.
[00:15:51] So a good example I use in the deck is where we ask a question and say, tick all that apply. The question was: what triggers you to buy new clothes? Tick all that apply. Humans on average tick 2.8 things when you say all that apply. When you say all that apply to LLMs, the average is one thing. So you get a much lower selection rate, much less data, much less useful. If you say select exactly three, humans still select 2.8.
[00:16:18] Interestingly, humans don't quite follow the "exactly three" instruction. And even more interestingly, the LLM averages 2.8 in that scenario as well, with a very similar distribution to the humans. So that context, that extra framing: LLMs are rule followers, so you have to give them the rules and the context to be able to answer useful questions. And that's another mindset shift for researchers that I think might be important to get to grips with.
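To illustrate the framing point, here is what the two phrasings of the same multi-select question might look like when sent as prompts; the exact wording and the option list are illustrative guesses, not the Qualtrics question.

```python
OPTIONS = [
    "a new season starting",
    "old clothes wearing out",
    "a special occasion",
    "a sale or discount",
    "seeing an ad",
    "treating myself",
]

# Loose framing: humans average ~2.8 selections, but LLM respondents
# reportedly average only ~1 selection with this phrasing.
loose = ("What triggers you to buy new clothes? Tick all that apply:\n"
         + "\n".join(f"- {o}" for o in OPTIONS))

# Explicit rule: with "select exactly three", the LLM average moves to
# ~2.8, matching the humans, because LLMs are rule followers.
explicit = ("What triggers you to buy new clothes? Select exactly three:\n"
            + "\n".join(f"- {o}" for o in OPTIONS))

print(loose, explicit, sep="\n\n")
```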
[00:16:48] And as someone right in the heart of this space, you must see and read and hear a lot about synthetic panels. Any myths or misconceptions out there that might frustrate you, that we can just lay to rest today? Is there anything that springs to mind? One of the things I hear a lot, and it's a very fair bit of skepticism and pushback from a lot of people, is that the data generated by synthetic research, by synthetic panels, is not real data.
[00:17:17] It's not data about humans, and it's not actual data that should be used to train anything else. And people tend to extrapolate that to: as a result, it's not data that you should pay attention to or do anything with. And I kind of think about weather predictions when I think about that, right? A weather prediction that it is going to rain this afternoon is not data that it rained this afternoon. You don't get the data until this afternoon.
[00:17:40] But knowing, based on complicated models and predictive data, that it is likely to rain this afternoon is pretty useful information if you're going to go fishing this afternoon or play a round of golf, or whatever. Weather predictions are very, very useful. And they're not data. And I see the research that comes out of synthetic panels very similarly. It is predictive data about how humans might respond to surveys, might respond to the questions that you're asking.
[00:18:09] And that can be incredibly useful for a business if it's actionable, if you're asking the right questions and it's giving you the opportunity to make a decision in advance. Going back to the question earlier, being able to do more before rather than after a decision is made is key. So for any business leaders that are listening today, fascinated by what we're talking about, how confident should they be in synthetic panels when it comes to high-stakes decisions? Where do you draw the line between speed and trust, where it works and where it doesn't?
[00:18:39] Where it works and where it doesn't is probably going to be very different for almost every individual business. And the most important thing for businesses to take forward is that this is an area where they should be experimenting. I think about when I used to work in the agency and the amount of pitching that we used to do, for instance. A lot of the time you'd want to back the ideas in your pitch up with some research.
[00:19:03] But the nature of pitching a creative idea is you've got a week before you have to go and present it to the client. You haven't got the time to do real research. If I'd had access to a tool like Edge Audiences to get some quick quantitative feedback, even if it was imperfect, even if it was 80% good, that would have been incredibly valuable and incredibly useful. But I'd want to test it. I'd want to evaluate it against things that we knew, against some real research that we'd done on concepts,
[00:19:31] to know that I could rely on it and was comfortable putting it into a client presentation. And I would absolutely be doing that testing and evaluation now. I'd say for almost any business: go and start doing that. Don't replace your research team or your human research, but do this alongside it to understand how this new technology compares to the traditional research that you've done.
[00:19:57] And then use that information to make decisions about how you balance your investment in synthetic versus human in the future, and what more you can unlock with synthetic. What questions did you never ask your customers because you were worried about what they'd think if you asked? Can we now start asking synthetic those questions and get some kind of signal, some kind of idea of what we could do with that information? What questions are you cutting out of your surveys because you want to make them shorter?
[00:20:22] You can go and ask synthetic those questions in the future and maybe get something useful, if you've done that validation process first. And I'm curious, if we were to go right back before the conference, when you were working on this 45-minute session for an event with 10,000 people attending, what was the main message that you wanted to deliver? What did you want people to walk away with when they're thinking about synthetic panels? I think the main message, you know, we're speaking to researchers here.
[00:20:48] We're speaking to people steeped in this, rightful sceptics who don't just want to embrace some new technology because it's new and exciting and everybody's talking about it, who want to be rigorous and who know that being right is important. So my message was really to them: this isn't something that anyone should be pitching as coming to replace researchers, or to replace research, or to stop asking humans questions.
[00:21:14] It's a new tool in the toolbox for researchers, enabling them to do more, to do different things, to almost rediscover an old mode of research where you could be more scientific, where you could experiment, where you didn't know whether this was the right way to ask a question or not, and you could go and ask a question in 15 different ways and see which one gets the right kind of answers. We know we can't do that with humans now, because humans are fatigued from being asked too many questions.
[00:21:42] So you do all of that work before you actually go and collect any data. Synthetic is this new tool that allows researchers to do more, to become more of a scientist again, to be a more active, agile researcher, and really expand what they're able to do rather than reduce it. And I was in another panel a couple of days ago, I think it was Booking.com. Although it is very early days, there are companies that are starting to experiment, like you said.
[00:22:11] And Booking.com, for one, have had great success, haven't they? Absolutely. It's been one of, if not the, fastest-ever-growing new products at Qualtrics in terms of getting customers on board. There's a clear demand for it. The more people are using it and the more they're experimenting with it, the more value they're seeing from it. And so, yeah, it's been really exciting to see amazing customers like Booking.com really digging into it and becoming a partner in learning where it's working and where it isn't,
[00:22:39] so we can better advise everybody going forward. And I suspect for everybody listening, we've set off a few light-bulb moments. Everyone's frantically Googling synthetic panels now. I'll include a link in the show notes to your LinkedIn for anyone that wants to reach out there. Anywhere else you'd like me to point everyone listening? Yeah, we're constantly publishing. We're very transparent about the work that we're doing in the Centre of Excellence and the work that we're doing with synthetic. We're constantly publishing new white papers and data on the Qualtrics Edge Audiences blog.
[00:23:09] So check out the Qualtrics website and search for Edge Audiences and find all the amazing stuff that my team and I are putting out there. Awesome. Well, I'll include links to everything that you said. I urge everyone listening to go and check that out. But more than anything, thank you for sharing your story today. Thank you for having me. It's been a pleasure. What I found especially interesting in my conversation with Jordan was that he didn't position synthetic panels as some kind of grand replacement for human research.
[00:23:37] Instead, he framed them as a new tool for researchers, one that can help them test more ideas, move more quickly and ask the questions that they never had the time, budget or confidence to explore before. And that alone feels like a much more useful and realistic way to think about this space. I also thought his point about prediction was particularly strong, because synthetic data is not the same as human data.
[00:24:06] But that doesn't mean that it's useless. In the right context, it can be a valuable signal, much like a forecast helps people before the event itself happens. And it's this shift in thinking where I suspect many business leaders and research teams will begin to see the real value. And maybe the biggest takeaway is that the companies getting the most out of this technology are not blindly trusting it. And they're not dismissing it either. It's not as binary as that.
[00:24:36] Scratch beneath the surface and you will see them experimenting, validating, comparing it to real-world results and learning where it helps. So I'll include links in the show notes to Jordan's LinkedIn and the Qualtrics Edge Audiences resources that he mentioned, so you can dig a little deeper into some of that research and see how this space is evolving. But as always, I'd love to hear your thoughts.
[00:25:02] Would you trust synthetic panels as part of your decision-making process? Or do you still see too many unanswered questions? Let me know, as always, at techtalksnetwork.com. I'd love to hear from you all. But that's it for today. Thank you for listening, and thank you for joining me in today's conversation. I hope you found it as valuable, intriguing, interesting and fascinating as I did. And I'll hope to deliver on all those things again tomorrow.
[00:25:32] Speak with you then. Bye for now.

