
The AI Transparency Problem: Why Reproducibility Matters in Insights
By AddMaple
There is a question I keep coming back to when I look at how AI is being used in research and analytics tools. It is not whether the AI is clever enough, or fast enough, or whether it can summarise findings in a way that sounds credible. The question is: can you work backwards?
That may sound simple, but it gets to the heart of something the insights industry has not yet fully reckoned with.
Learn more by watching or listening to Ange on the Founders and Leaders Series podcast here:
Episode 11: Ange Taylor, Founder & CEO, AddMaple
What Our Job Actually Is
If we go back to first principles as insight professionals, the job is this: take a business question, weigh it against the other questions on the table according to urgency and risk, go and do the research, and bring an insight back. That is the process. It sounds straightforward, but there is a step in there that is easy to overlook – the step where you are confident enough in your findings to present them to whoever needs them.
That confidence depends on being able to trace your work. It depends on being able to say: here is the data, here is what I did with it, here is how I got to this conclusion. If something gets questioned – and it will get questioned – you need to be able to go back through the chain and show your working.
If you are working with an interface that does not allow that, you have a problem. If the AI tells you something is correct, but you cannot work backwards from that, you cannot give a confident answer to whoever is asking for your insight. And if you cannot do that, then what is the point?
The Prompt Ping Pong Problem
I see this play out in a very specific way with generative AI tools. You ask a question, and the AI gives you an answer. You are not sure whether that answer is right. So you ask again, slightly differently. You get a slightly different answer. You go back and forth, refining and re-prompting, trying to get to something you trust.
This is what I call prompt ping pong. And the honest question is: is this actually saving time? There is a parallel here with coding tools. These tools have significantly reduced the time required to generate code. But the feedback from many teams now is that the time spent validating, verifying, querying, and refining is almost as long – if not longer – than it would have been to do it the established way. The same risk applies to data analysis.
When you lose the ability to reproduce a result, you also lose confidence. And in our industry, confidence in findings is not a nice-to-have – it is the entire basis on which we ask product and marketing teams, and senior leaders, to make decisions.
The Difference Between Calculation and Interpretation
This is why I think it matters to be clear about what AI should and should not be doing within a research tool. There is a real and important distinction between calculation and interpretation.
Calculation needs to be deterministic. Every time you load the same data set, you should get the same output. The count should be the same. The statistical test should produce the same result. That consistency is what makes it possible to say, with confidence, that these themes are linked to these scores, or that this subgroup behaves differently from that one. It is also what makes it possible for someone else to check your work.
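To make the point concrete, here is a minimal sketch of what a deterministic calculation looks like in practice. The function name and the toy data are illustrative, not from any particular tool: the property that matters is that re-running the same code over the same data yields byte-identical output, which is exactly what lets a colleague check your work.

```python
from collections import Counter

def theme_counts(responses):
    """Deterministic calculation: the same data set always yields
    the same counts, in the same order, so anyone can re-run it
    and verify the numbers."""
    return dict(sorted(Counter(responses).items()))

responses = ["price", "support", "price", "features", "support", "price"]

# Two independent runs over the same data produce identical output.
first = theme_counts(responses)
second = theme_counts(responses)
assert first == second
print(first)  # {'features': 1, 'price': 3, 'support': 2}
```

There is no model, no temperature, no phrasing of the question involved – only the data and the arithmetic, which is what makes the result reproducible.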
Interpretation is different. AI can sit on top of a stats engine and help you understand what the results might mean. It can surface patterns, generate summaries, and flag things worth looking at. That is genuinely useful. But the AI should interpret results calculated by a deterministic engine – not run the calculations itself.
The reason this matters is that generative AI, by its nature, does not always give you the same answer twice. The interpretation might vary slightly depending on how the question is phrased or on how the model behaves on a given day. That is fine for interpretation, not for the numbers underneath.
If AI also generates those numbers, you have a situation where you cannot reproduce your findings. And that is a confidence problem and a time risk.
What This Means in Practice
When AI generates a chart, you often cannot edit it. You cannot relabel it, switch the columns and the rows, or make the kinds of small adjustments that are a normal part of presenting data to a real audience. You either use what the AI gives you or go back into the prompt and try to get something closer to what you need. That is frustrating in any context. In our industry, where people need to present and defend their findings, this is a real problem.
The tools that are getting this right are the ones that keep AI in its proper lane. The stats engine does the calculations. The chart engine does the visualisation. And the AI sits on top of all of that as an interpretation layer. It can trigger an analysis – key driver analysis, clustering, whatever is appropriate for the data type – but it does not run the numbers itself. That way, every time you load the same data set, you get the same output. The AI might phrase its summary slightly differently, but the answer is the same right down to the count.
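The separation described above can be sketched in a few lines. This is a hypothetical illustration of the architecture, not any vendor's actual implementation: the `stats_engine` function stands in for the deterministic layer that owns every number, and `interpretation_layer` stands in for the AI layer, which here is stubbed with a template – the point is that it only reads results, it never computes them.

```python
from collections import Counter

def stats_engine(responses):
    """Deterministic layer: owns every number. Same data in,
    same counts and shares out, every time."""
    counts = Counter(responses)
    total = len(responses)
    return {theme: {"count": n, "share": n / total}
            for theme, n in sorted(counts.items())}

def interpretation_layer(results):
    """AI-shaped layer (stubbed as a template here). In a real tool
    this is where a model might phrase its summary differently from
    run to run - but it only reads numbers the engine computed,
    it never calculates them itself."""
    top = max(results, key=lambda t: results[t]["count"])
    return f"'{top}' is the most common theme ({results[top]['count']} mentions)."

results = stats_engine(["price", "support", "price", "features"])
print(interpretation_layer(results))  # 'price' is the most common theme (2 mentions).
```

Under this split, the wording of the summary can vary without ever touching the counts underneath – which is the reproducibility guarantee the article is arguing for.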
That is what reproducibility looks like. And reproducibility is what makes it possible to do the actual job: take a business question, find an answer, and give someone the confidence to act on it.
I am not saying AI has no place in research tools. It clearly does. However, we need to be more honest about where it adds value and where it introduces risk. Using AI as an interpretation layer on top of rigorous, deterministic analysis is genuinely useful. Using it to run the calculations, and then asking people to trust outputs they cannot trace back or reproduce, is a problem we have not yet solved.