In today's data-driven world, businesses rely heavily on tools like Tableau and Power BI to make sense of their data. But why do these tools, designed to make analytics simple, often involve such complex and resource-intensive backends? This tension, known as the BI paradox, is what Nicolas Korchia, CEO of IndeXima, is on a mission to resolve.
In this episode, Nicolas shares his journey from data geek to entrepreneur, tackling one of the biggest challenges in business intelligence: the balance between user-friendly dashboards and the intricate data pipelines that power them. We explore how IndeXima leverages AI and machine learning to create dynamic tables, accelerating data analytics, improving dashboard performance, and reducing the cost and environmental impact of data queries. Nicolas explains how his team's solutions are empowering businesses to streamline decision-making while addressing sustainability—a key concern in today's IT landscape.
Beyond the technology, Nicolas provides real-world use cases, including how IndeXima transforms retail analytics by optimizing Snowflake queries to deliver near-instantaneous results. We also dive into the future of business intelligence, discussing emerging trends such as the role of large language models (LLMs) in generating SQL queries and how automation is shaping the way businesses handle data.
If your organization is struggling with slow dashboards, skyrocketing query costs, or inefficiencies in BI processes, this conversation will spark ideas and provide insights into optimizing your data strategies. Could the solution to your BI challenges be simpler than you think?
Tune in to find out—and let us know your thoughts on the BI paradox and where you see data analytics heading in the near future.
[00:00:04] Have you ever wondered why business intelligence tools, despite their promise of simplicity and efficiency, can sometimes feel like a double-edged sword?
[00:00:14] Well, today I'm going to be addressing the fascinating BI paradox: the tension between the simplicity that end users desire and the complex, often frustrating back-end systems that data teams wrestle with on a daily basis.
[00:00:31] Well, my guest today is a self-proclaimed data geek and founder of a company called IndeXima, a French company revolutionising the way businesses handle data analytics.
[00:00:44] And I want to learn more about how he and his team are helping businesses streamline decision-making processes and making sense of their data in ways that are fast, efficient, practical and sustainable.
[00:00:59] So, how can businesses better bridge the gap between simplicity for the user and efficiency for the back-end?
[00:01:08] Let's find out together, because it's time to introduce you to today's guest.
[00:01:13] So, a massive warm welcome to the show. Can you tell everyone listening a little about who you are and what you do?
[00:01:20] Okay, nice to meet you all. Thanks, Neil, for the invitation. I'm Nico, Nicolas Korchia from IndeXima, a French-based company founded in 2017.
[00:01:31] I'm a data geek, and I've been working with data for years, a lot in BI tools like Tableau and Power BI, for sure.
[00:01:42] And now I'm the founder of IndeXima.
[00:01:44] Well, it's a pleasure to have you join me today.
[00:01:47] Every day on this podcast, I try to take a different area or a different challenge and demystify it,
[00:01:52] putting it into language that any business leader or stakeholder from around the business can understand.
[00:01:56] And today, one of the things I'd love to tackle with you is this concept of the BI paradox or business intelligence paradox.
[00:02:04] It almost highlights a tension between simplicity of output and complexity of operation.
[00:02:10] So, can you tell me a bit more about how IndeXima addresses this challenge and ultimately helps streamline decision-making for businesses?
[00:02:19] Because that is where the magic happens and that's where businesses need to get to.
[00:02:23] Yeah, yeah, for sure. I can explain to you.
[00:02:26] In fact, that BI paradox comes from two things.
[00:02:30] For the end user — the user of the BI tool, the user of the data viz — data viz is simple.
[00:02:36] It's just about clicking somewhere on a very fancy chart that goes very fast, or should go very fast.
[00:02:45] Yeah.
[00:02:45] That simple aspect has a contradiction, because behind the scenes it can be very complex, very difficult, and very long to set up
[00:02:56] because of data complexity, and because the data modeling is always changing as the end user wants something new.
[00:03:07] And when we are talking about that data mess, you have a lot of sources of data and a lot of different aggregation layers to build on top of the data so that it can be usable in a data viz.
[00:03:22] So, the data pipeline can be really complex to set up.
[00:03:27] And this is the BI paradox.
[00:03:29] I mean, on the left, it's very simple to use.
[00:03:32] On the right, it's very difficult to set up for data engineers.
[00:03:36] And in fact, it leads to very slow projects.
[00:03:41] I mean, months for a new chart.
[00:03:46] Yeah.
[00:03:47] And very expensive data viz when we are talking about direct query —
[00:03:52] I mean, plug your Power BI or Tableau directly into Snowflake, for example, and you have live querying.
[00:04:00] Sometimes those queries, which are very complex to execute and to compute, can be very expensive.
[00:04:07] And finally, no one is really happy with that.
[00:04:10] And what we do at Indexima is try to solve that dilemma — to bring agility into that BI stack in order to speed up the projects and make everyone happy, even the person who ends up paying the Snowflake bill.
[00:04:28] I absolutely love that.
[00:04:30] And as you said at the very beginning of our conversation, you are a data geek.
[00:04:34] So, can you explain a little how the creation of dynamic tables through AI and machine learning, how that can help accelerate data analytics and improve dashboard performance?
[00:04:45] So, one of the reasons I ask that is there's so much hype around AI and machine learning, but a lot of businesses are trying to work out, what does this mean for me?
[00:04:51] How can it improve my day-to-day operation?
[00:04:56] So, I'd love to dig a little bit deeper on how this can improve dashboard performance, data analytics, because it's a huge topic in business right now.
[00:05:04] Yeah.
[00:05:04] In fact, you know, Neil, when we are speaking about data viz, at the very end, this is data that will be used by a human.
[00:05:12] So, that human cannot deal with a lot of different points.
[00:05:16] I mean, in a data viz, you do not show millions or billions of points,
[00:05:20] even if your data is coming from a huge number of rows — billions or even tens of billions of rows.
[00:05:29] At the end, in your data viz, the data that is exposed to the end user is small.
[00:05:35] It's small data.
[00:05:40] Whatever the data size, the data you show — the data you use — in your data viz is small data.
[00:05:49] Yeah.
[00:05:51] Based on that, it means that when you have a huge amount of data, you can set up an aggregation layer in between your data and your data viz:
[00:06:00] small data that will be the data exposed to the data viz.
[00:06:04] That small data has existed for years.
[00:06:07] It was called a cube, or an extract, or an in-memory cube, or a data mart, and so on.
[00:06:14] Different names for always the same thing: storing somewhere the computation of the different combinations of data you will have to use in your data viz.
[00:06:26] And that data, that layer is small.
[00:06:29] This is the aggregation layer.
[00:06:31] People are used to build that aggregation layer themselves.
[00:06:35] I mean, with data engineering and ETL and that kind of stuff.
[00:06:39] What we do at Indexima is read the dashboard — or let's say the dashboard usage — and generate that aggregation layer automatically. Automagically.
[00:06:52] So what we do is to build that small data you will need inside your data viz, thanks to AI.
[00:06:59] And that's where we are.
[00:07:01] AI in order to help BI.
[00:07:03] So we have a very simple algorithm, based on a clustering algorithm.
[00:07:09] We cluster the different SQL queries that are sent by the data viz to the data.
[00:07:15] And thanks to that, we can say, yeah, there are three groups, three patterns of queries.
[00:07:21] And based on that, let's create three cubes.
[00:07:26] I call them cubes, but in fact, in Snowflake, we are using dynamic tables to store that.
[00:07:31] Let's create three dynamic tables that will be the best layer to answer those three patterns of queries.
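The clustering step Nicolas describes can be sketched in miniature. This is a toy illustration, not Indexima's actual algorithm (which is not public): it fingerprints each SQL query by the dimensions it groups by, and queries sharing a fingerprint form one pattern, each suggesting one candidate dynamic table. All table and column names here are invented.

```python
import re
from collections import defaultdict

def fingerprint(sql: str) -> frozenset:
    """Reduce a query to the set of dimension columns it groups by.

    A toy stand-in for query-pattern clustering: queries that
    aggregate over the same dimensions belong to the same pattern.
    """
    m = re.search(r"GROUP BY\s+(.+?)(?:\s+ORDER BY|\s*;|$)", sql, re.I | re.S)
    if not m:
        return frozenset()
    return frozenset(c.strip().lower() for c in m.group(1).split(","))

def cluster_queries(queries):
    """Group queries by fingerprint; each cluster suggests one aggregate table."""
    clusters = defaultdict(list)
    for q in queries:
        clusters[fingerprint(q)].append(q)
    return clusters

queries = [
    "SELECT store, SUM(sales) FROM facts GROUP BY store",
    "SELECT store, AVG(margin) FROM facts GROUP BY store",
    "SELECT region, month, SUM(sales) FROM facts GROUP BY region, month",
]
clusters = cluster_queries(queries)
print(len(clusters))  # 2 patterns -> 2 candidate dynamic tables
```

The real system would cluster on much richer features (filters, measures, scan volume), but the core idea — one pre-aggregated table per family of look-alike queries — is the same.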
[00:07:39] And one of the things that stood out for me with your presentation here today at the IT Press Tour is your emphasis on sustainability alongside performance.
[00:07:49] I think anybody working in BI is very familiar with complex queries that run against millions or even billions of actual rows.
[00:07:59] So how does your solution help organizations optimize computing resources while reducing environmental impact?
[00:08:07] Because we've all waited a long time for those queries to come through and for those reports to generate.
[00:08:12] But there is this extra emphasis on sustainability now, isn't there?
[00:08:16] Yeah, this is one point I'm really fond of.
[00:08:19] How can Indexima help in terms of sustainability?
[00:08:24] When we speak about BI, in fact, BI tools and dashboards and reports get clicked a lot.
[00:08:31] There are a lot of interactions between the end user and the BI tool.
[00:08:36] And each of those interactions generates from one to 50 queries to the database,
[00:08:43] meaning that each time you interact with the dashboard, you have the feeling of doing only one click.
[00:08:49] But in fact, you generate a lot of compute inside your database.
[00:08:53] When your database is already optimized, those clicks and those queries can be very fast, because you've already indexed your data quite well,
[00:09:08] Or you've already partitioned your data just the way it will be used.
[00:09:14] But that's not perfect.
[00:09:15] In fact, in most cases, there are a lot of queries that generate a lot of bytes scanned.
[00:09:24] And this is where we are.
[00:09:27] We try to reduce the bytes scanned for each query.
[00:09:32] That's why we use AI in order to find patterns inside the different queries sent by the BI tool.
[00:09:40] And we identify not only the patterns of queries that look alike, but moreover the patterns that scan the most data.
[00:09:53] We try to work more on that kind of query — queries that scan a lot.
[00:09:59] So we create that object, that dynamic table, which will have fewer bytes to scan in order to answer that kind of query.
[00:10:10] And this is the way we help to reduce the carbon footprint.
[00:10:14] In fact, when you have billions of rows and hundreds of gigabytes of data that have to be scanned in order to answer one query,
[00:10:24] and you create a dynamic table that is small enough that it's no longer hundreds of gigabytes being scanned, but only one gigabyte — or let's say even one megabyte —
[00:10:36] then you really help the planet.
[00:10:40] You certainly help the query go faster, and you help the end user pay less inside Snowflake.
[00:10:46] But at the very end, to my mind, we really help the planet by having fewer compute engines working on that kind of query.
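To put rough numbers on that claim, here is a back-of-envelope sketch. Every figure is an assumption chosen for the sake of the arithmetic — not a measured number from Indexima or Snowflake.

```python
# Back-of-envelope illustration of the bytes-scanned savings described
# above. All figures are assumptions for the arithmetic, not real data.

fact_table_bytes = 200 * 10**9   # ~200 GB scanned per query, hitting the fact table directly
aggregate_bytes  = 1 * 10**9     # ~1 GB per query once an aggregation layer exists
queries_per_day  = 10_000        # a busy fleet of dashboards

direct_scan = fact_table_bytes * queries_per_day
agg_scan    = aggregate_bytes * queries_per_day

reduction = direct_scan / agg_scan
print(f"scan reduction: {reduction:.0f}x")
print(f"bytes saved per day: {(direct_scan - agg_scan) / 1e12:.0f} TB")
```

Since bytes scanned drive both compute time and warehouse credits, the same ratio feeds through — roughly — to cost and energy, which is the link Nicolas draws to the carbon footprint.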
[00:10:55] And that person who hits "generate report" — what kind of time difference would they notice going from querying a billion rows the old-fashioned way to what you've created here?
[00:11:07] Just to help people understand the difference it could make to their lives too.
[00:11:11] Yeah, it's a good question.
[00:11:12] You're right because you've not seen a demo or you've not seen the product in real life.
[00:11:17] We are speaking about reducing the response time by a factor of 100.
[00:11:21] Wow.
[00:11:22] So it's coming from minutes to seconds or from tens of seconds to tens of milliseconds.
[00:11:30] This is what we do.
[00:11:32] Customers call us when they have a dashboard that refreshes in something like 20 seconds, 30 seconds, one minute, which is slow.
[00:11:42] It's too slow for them.
[00:11:43] And they call us, and after they put Indexima in between their BI tool and Snowflake, they find that the BI tool interactions take less than one second each time.
[00:11:56] Wow.
[00:11:58] I suppose this is a great opportunity to introduce people to your new SaaS product, which is called Snowflake Autopilot, which also promises to redefine BI on Snowflake.
[00:12:08] So what specific features set it apart, would you say?
[00:12:12] And how does it improve that overall user experience for customers?
[00:12:16] In fact, Snowflake Autopilot is made by Indexima — the name is also Indexima for Snowflake.
[00:12:21] This is the AI that helps your BI by automatically building dynamic tables inside your Snowflake.
[00:12:30] In a few words, you have a dashboard in Tableau, Omni, Power BI or whatever that is directly connected to your Snowflake.
[00:12:39] Okay.
[00:12:40] Indexima will come, will have a look at your queries, and will send commands to create dynamic tables inside your Snowflake.
[00:12:50] Yeah.
[00:12:50] So the data does not move from your Snowflake.
[00:12:52] And the access rights are still the same inside your Snowflake.
[00:12:56] And the freshness of your dynamic table is ensured by your Snowflake.
[00:13:02] Everything is on your side.
[00:13:04] And what is left to Indexima is to find the best aggregation layer, stored inside dynamic tables.
[00:13:11] And after that, when a new query comes to Indexima and that query is eligible — able to be served by one of those dynamic tables — Indexima will rewrite the query to use that very small dynamic table instead of the very huge table below.
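A deliberately naive sketch of that transparent rewrite: if a query's GROUP BY dimensions are covered by a known dynamic table, point its FROM clause at the small table. A real rewriter parses the SQL properly and must also remap measures (e.g. re-aggregating pre-summed columns); here a plain string substitution and invented table names keep the idea visible.

```python
import re

# Map of dimension sets to the invented dynamic tables that cover them.
AGGREGATES = {
    frozenset({"store"}):           "dt_sales_by_store",
    frozenset({"region", "month"}): "dt_sales_by_region_month",
}

def rewrite(sql: str) -> str:
    """Retarget an eligible query at its pre-aggregated dynamic table."""
    m = re.search(r"GROUP BY\s+(.+)$", sql, re.I)
    dims = frozenset(c.strip().lower() for c in m.group(1).split(",")) if m else frozenset()
    target = AGGREGATES.get(dims)
    if target is None:
        return sql  # not eligible: the query falls through to the big table
    return re.sub(r"\bFROM\s+\w+", f"FROM {target}", sql, flags=re.I)

print(rewrite("SELECT store, SUM(sales) FROM sales_facts GROUP BY store"))
```

The point of doing this between the BI tool and the warehouse is that neither side changes: the dashboard keeps sending the same SQL, and only the table actually scanned is swapped.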
[00:13:31] And automation in BI is such a huge topic right now.
[00:13:35] It's ultimately about making our lives easier wherever you are.
[00:13:38] So for any data engineers listening, how does your technology minimize some of those manual processes, the manual workload, while also maintaining things like governance and data security and all those important IT concerns?
[00:13:53] Yeah.
[00:13:53] For the data engineering team, there are two pain points.
[00:13:57] The first one is all those projects where a new BI usage is emerging and they need to create a new pipeline to build the new exposure model — a data mart, or extract, or cube somewhere — to answer it.
[00:14:19] And these are often very long projects, with a lot of back and forth between the BI team — let's say the analytics team — and the data engineers.
[00:14:29] I was a data engineer in my past.
[00:14:31] When I was asked to do that kind of thing, I had to build the next cube.
[00:14:36] I'd already built 150 cubes, and you'd ask me to build the 151st cube, and the 152nd cube, and so on.
[00:14:47] This is not fair.
[00:14:48] This is not the kind of thing I'd love to do.
[00:14:52] And the second part is when you've built those 153 cubes and this has been done.
[00:15:03] Now you have to explain to the BI users how to use them, how to expose them, how to change the dashboard, and so on.
[00:15:12] And these are the two points we've automatically optimized inside Indexima.
[00:15:18] The generation of the cube is automatic, thanks to reading the SQL, and the usage of the cube is also automatic.
[00:15:26] The query is sent to the table, but in the meanwhile Indexima will rewrite the query on the fly, so that instead of using the huge amount of data inside the table, it will use the aggregation layer.
[00:15:45] That dynamic table we've built automatically thanks to Indexima.
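The "generation" half can be sketched as turning one detected query pattern into Snowflake dynamic-table DDL. The `CREATE DYNAMIC TABLE ... TARGET_LAG ... WAREHOUSE ... AS SELECT` syntax is Snowflake's own, with freshness then maintained by Snowflake itself; the table, column, and warehouse names and the five-minute lag are invented placeholders.

```python
# Sketch: compile one query pattern (dimensions + measures) into
# Snowflake dynamic-table DDL. All identifiers are placeholders.
def dynamic_table_ddl(name, dims, measures, source):
    cols = ", ".join(dims)
    aggs = ", ".join(f"SUM({m}) AS sum_{m}" for m in measures)
    return (
        f"CREATE OR REPLACE DYNAMIC TABLE {name}\n"
        f"  TARGET_LAG = '5 minutes'\n"
        f"  WAREHOUSE = transform_wh\n"
        f"AS SELECT {cols}, {aggs}\n"
        f"   FROM {source}\n"
        f"   GROUP BY {cols};"
    )

ddl = dynamic_table_ddl("dt_sales_by_store", ["store"], ["sales", "margin"], "sales_facts")
print(ddl)
```

This is the piece a data engineer would otherwise hand-write per cube — the 151st, the 152nd, and so on — which is exactly the toil being automated.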
[00:15:49] And when I was doing a little research on you before you came on the podcast today, I was reading how you've supported so many different companies across various sectors.
[00:15:58] And I'm talking things like finance, marketing, logistics and so many others.
[00:16:03] So are there any real world examples or use cases you can share of where your solutions might have significantly boosted performance and cost efficiency just to bring to life what we're talking about here?
[00:16:15] Yeah, I have a very, very easy to understand use case.
[00:16:19] It was a Tableau use case.
[00:16:21] Tableau on top of retail data.
[00:16:25] That customer was — and still is — analyzing the performance of its stores.
[00:16:34] So what is the store with the best margin?
[00:16:39] What is the store with the worst margin?
[00:16:42] And so on.
[00:16:44] At first, before using Indexima, they were struggling with extracts inside Tableau.
[00:16:52] So with Hyper technology, they were extracting the data.
[00:16:57] And since that BI usage was getting more and more complex, the extract was getting bigger and bigger, taking too long to build.
[00:17:10] And at the very end, the extract itself was slow inside Tableau.
[00:17:17] We arrived.
[00:17:18] We said to them, you just have to remove your extract and send your queries to Snowflake with live querying.
[00:17:26] And the customer said to us, but it will cost me a lot.
[00:17:29] Yes, you're right.
[00:17:30] If you do not do anything else and just move your data usage to live querying, it will cost a lot.
[00:17:38] But trust us: after minutes, after hours, Indexima will work out the best layer — like the best Hyper extract on top of your data.
[00:17:49] But instead of making one big Hyper, one big extract, each day, it builds different small extracts inside Snowflake.
[00:18:00] Yeah.
[00:18:01] And after the first day, the usage was faster.
[00:18:05] After one day, the end user experience was very fast,
[00:18:11] with live querying and no more need to extract the data each night.
[00:18:16] The customer is really happy right now.
[00:18:19] And you mentioned cost there.
[00:18:21] And there are so many different BI tools available at the moment.
[00:18:24] So I've got to ask, how does it compare with other BI tools in terms of things like speed, cost, ease of implementation?
[00:18:31] And what is it that makes your solutions unique in such a crowded market out there?
[00:18:37] Yeah, the market is really crowded.
[00:18:39] And there are a lot of different actors.
[00:18:42] There are data viz tools, semantic layers.
[00:18:45] There are data warehouses.
[00:18:47] There are Snowflake optimization tools.
[00:18:51] And there are also specialists in aggregations, like us.
[00:18:55] It's very difficult to make up your mind and decide what kind of tools you should use, or which ones you should scale up.
[00:19:06] I mean, you can just scale up your Snowflake and your queries will go faster.
[00:19:10] But you will have to pay a lot.
[00:19:12] Yeah.
[00:19:13] You can scale up your Power BI product and it will be okay.
[00:19:18] You can deal with a huge amount of data, but you will have to pay a lot.
[00:19:24] Yeah.
[00:19:25] And what we do is automatically work the same magic that any DBA, any database administrator, would do inside your Snowflake.
[00:19:37] We optimize your queries automatically, thanks to rewriting, thanks to the aggregation layer we've talked about a lot.
[00:19:43] But what we do is to do it automatically thanks to AI.
[00:19:48] And I would say this is the thing that is different than the others.
[00:19:55] We use AI for a very specific use case: finding the best aggregation layer to sit on top of your data inside your Snowflake.
[00:20:08] Fantastic.
[00:20:09] And of course, we're just a few weeks away from life in 2025.
[00:20:13] So are there any trends that you see shaping the future of business intelligence, particularly in leveraging cloud platforms like Snowflake, AI, etc.
[00:20:23] So much going on, but any trends that you're noticing?
[00:20:25] Yeah, there is one trend I've really been looking at for the last few weeks.
[00:20:30] It's what could be done with LLM.
[00:20:34] I mean, you can do it with ChatGPT right now.
[00:20:38] You give ChatGPT, or Claude, or whatever model,
[00:20:43] a query that is not optimized.
[00:20:48] Yeah.
[00:20:48] And you ask that model to rewrite the query and optimize it.
[00:20:53] And you could be really astonished that ChatGPT or Claude are able, right now, to tell you: just build that materialized view or that dynamic table, and then rewrite your query like that.
[00:21:05] And it seems magic.
[00:21:07] And for us, it was a huge fear.
[00:21:10] It was a huge fear before understanding that a lot of those queries seem to be okay.
[00:21:16] And in fact, they are not okay.
[00:21:19] Because right now, generating SQL queries with an LLM is not trustworthy.
[00:21:25] And what we are looking at right now is how to use those models — models we would not train ourselves, but just use as they are —
[00:21:37] and how they could generate, instead of a query, objects — JSON — that we would send to a semantic layer.
[00:21:48] And then we trust the semantic layer to translate that object into a query.
[00:21:57] And this is where we are.
[00:21:59] And this is a trend you should all really follow: having a look at how LLMs can be leveraged to speed up your product.
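The pattern Nicolas sketches — constrain the LLM to emit a structured object and let a deterministic semantic layer compile the SQL — might look like this. The JSON shape, the allow-list, and the compiler are invented for illustration; real semantic layers (dbt's, Cube, and others) define their own schemas.

```python
import json

# Instead of trusting an LLM to emit SQL directly, have it emit a
# constrained JSON object and compile the query deterministically.
llm_output = json.loads("""{
  "metric": "sales",
  "aggregation": "SUM",
  "dimensions": ["store"],
  "source": "sales_facts"
}""")

ALLOWED_AGGS = {"SUM", "AVG", "COUNT"}

def compile_to_sql(obj):
    # Validation step the raw-SQL path lacks: reject anything off-schema.
    if obj["aggregation"] not in ALLOWED_AGGS:
        raise ValueError("unsupported aggregation")
    dims = ", ".join(obj["dimensions"])
    return (f"SELECT {dims}, {obj['aggregation']}({obj['metric']}) "
            f"FROM {obj['source']} GROUP BY {dims}")

print(compile_to_sql(llm_output))
```

The appeal is that a malformed or hallucinated object fails validation loudly, whereas hallucinated SQL can run and silently return wrong numbers.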
[00:22:09] So I've got to ask, what's next for you guys?
[00:22:12] Is there anything you can share about the road ahead, what you're working on, strategic goals?
[00:22:16] You probably locked down and can't reveal too much, but any teasers you can leave us with?
[00:22:21] For sure.
[00:22:22] What we've done to speed up your Snowflake queries and reduce their carbon footprint will be replicated on, let's say, Databricks — for sure, for the next one.
[00:22:34] But also, one thing we are working on and studying right now is that question of LLMs generating SQL.
[00:22:44] Well, instead of generating SQL, we are working on our technology, our algorithms, to generate semantic layer objects.
[00:22:56] Maybe dbt, maybe others.
[00:22:59] We don't know that yet.
[00:23:00] But for sure, we will replicate what we've done and what we generate on top of Snowflake on different technologies and different kinds of objects.
[00:23:12] Fantastic. Exciting times ahead.
[00:23:14] And for anyone listening wanting to dig a little bit deeper on some of the things we talked about today, there is so much rich information out there and it can get quite complex.
[00:23:23] So where would you like to point everyone listening?
[00:23:24] Everyone can just go to indexima.com and click the Try Now button, and let's see the magic. In less than 15 minutes, you will have your BI going faster. That's it.
[00:23:39] Well, there is a challenge for everybody listening. I urge you to check that out. I'll add links to make it nice and easy. We covered so much in a short amount of time, but thank you for sharing your story today.
[00:23:49] Thanks a lot, Neil.
[00:23:50] I think as we wrap up today's episode, it's clear that tackling the BI paradox isn't just about simplifying dashboards or speeding up analytics.
[00:24:00] It's about creating harmony between end users and the often unseen complexities of data systems.
[00:24:08] And my guest today has shown us how innovative approaches like those of Indexima can not only resolve some of these challenges, but also empower businesses to unlock the potential of their data.
[00:24:20] But what do you think? How does your organization approach this BI paradox and where do you see room for improvement?
[00:24:28] I'd love to hear your thoughts on this one. So please join the conversation. Email me at techblogwriter@outlook.com, or find me on LinkedIn, X, and Instagram — just search Neil C. Hughes. Let me know your thoughts and let's keep exploring the ways that technology can solve these real-world challenges together.
[00:24:47] But until next time, stay curious, stay connected and make sure you join me again tomorrow for another guest. I will speak with you all then. Bye for now.

