From 1.16 Billion Reactive Logs A Day To Proactive Insight: Storio Group And Dynatrace
Tech Talks Daily · February 22, 2026
25:41 · 23.51 MB


How do you protect millions in revenue during your busiest hour of the year when your entire business depends on digital performance?

At Perform 2026, I caught up with Alex Hibbitt, Engineering Director responsible for the customer platform at Storio Group, to unpack what happens when observability moves from an engineering afterthought to a board-level priority. Storio Group was formed from the merger of Photobox and Albelli, bringing together multiple brands and five separate e-commerce platforms into one unified customer journey. That consolidation created opportunity, but it also exposed risk, especially during peak trading from Black Friday through Black Sunday and into the Christmas rush.

Alex shared what it really looks like when downtime is non-negotiable. At peak, Storio's platform can generate up to 1.5 million euros per hour. A single poorly timed incident is not simply a technical problem; it is a direct threat to revenue and customer trust. Before partnering with Dynatrace, the team was relying heavily on centralized logging, processing over a billion log lines a day and depending on engineers to manually interpret signals. It was reactive, labor intensive, and left too much to chance.

What stood out for me was how cultural change led the transformation. Rather than imposing a new tool from the top down, Alex and his team built a maturity model engineers could relate to, created internal champions, and framed observability as risk management and business protection. The result was a reported 65 to 70 percent reduction in log costs, a 50 percent drop in mean time to detect overall, and up to 90 percent improvement for the most severe incidents.

We also explored how unifying logs, metrics, and traces into a single AI-driven platform helped Storio move from reactive firefighting to proactive detection. During one Black Sunday alone, three major issues were identified early enough to avoid an estimated 4.5 million euros in potential impact.

This conversation goes beyond tooling. It is about protecting customer experience, safeguarding revenue during peak demand, and building an engineering culture that embraces change. If your organization is wrestling with cloud costs, fragmented monitoring, or the pressure to deliver flawless digital performance under load, there are some powerful lessons here.


[00:00:04] What does observability really mean when your busiest trading day can bring in over a million euros an hour and a single blind spot could quietly undo years of customer trust? Well, today's conversation comes from the show floor at Dynatrace Perform. And today I want to home in on a story that will feel uncomfortably familiar to many digital leaders.

[00:00:28] Rapid growth, multiple platforms all stitched together through mergers, proud engineering teams, and an e-commerce business where peak season is both the biggest opportunity and the greatest source of risk. So my guest today is Alex Hibbitt. He's the engineering director who's responsible for the customer platform at Storio Group.

[00:00:52] And if you've never heard of Storio Group before, you almost certainly know the kind of moments they power. From personalised photo books, gifts and keepsakes that turn some of those digital memories that we all have into something physical, emotional and lasting. All of which makes downtime much more than a technical failure, as it could quickly become a broken promise at exactly the wrong moment, at the busiest time of year.

[00:01:19] So Alex today will share how Storio Group came together through a merger that left them with five different e-commerce platforms. Not to mention fragmented tooling, a monitoring approach that relied heavily on humans staring at logs during peak periods. And as Black Friday rolled into their even more intense Black Sunday, the gaps became impossible to ignore. So the conversation today is about moving from reactive firefighting to proactive insight.

[00:01:50] But it is equally a story about culture. And Alex will talk openly about pushback from engineers and the challenges of changing deeply held habits in an organisation. So the big question is how do they go from that to unlocking real-time business visibility, sharper incident detection and measurable cost savings without losing sight of that customer experience.

[00:02:15] So if you are responsible for a consumer platform where peak demand is unforgiving, where cloud costs are under scrutiny and where trust is earned one interaction at a time, you'll understand the value of being able to see problems before your customers do. But enough from me. Let me officially introduce you to Alex now. So a massive warm welcome to the show.

[00:02:43] Can you tell everyone listening a little about who you are and what you do? Absolutely. My name is Alex Hibbitt. I'm an engineering director responsible for the customer platform at Storio Group. And for people who might be hearing about Storio Group for the very first time, can you tell me a bit more about the scale of your digital platform? And also, as we're here at Dynatrace Perform today, why observability became such a pressing issue for your business? Absolutely. So I'll tell you a little bit about Storio Group.

[00:03:12] We were formed three years ago when a company called Photobox, who were big in the UK and France and a few other countries around Europe, merged with another big company called Albelli, who were operating in Europe as well. When we merged together, we had five different e-commerce platforms powering our customer journey. A huge selection of different technologies providing the customer experiences for all of our different brands.

[00:03:37] That was super inefficient because whenever we introduced a change, it was really, really difficult to roll that out. We had to do it five times. So as soon as we merged, we clearly had a call to action: move on to one platform. As we went on that journey, we had to make a load of compromises to stay laser-focused on that ambition. And one of them was in the observability space.

[00:04:01] We knew what good observability looked like, but we knew that we couldn't deliver an observability change at the same time as driving to one singular platform. That brought us to Dynatrace. And you made it sound very, very easy there, but there'll be a lot of people listening that have been on similar journeys. What were your biggest challenges along the way there? Oh, there were so many. First of all, I think it was identifying the scale of the problem.

[00:04:26] We as a business are very heavily driven by the holiday time period. So from Black Friday through to Christmas, and in Spain out to Reyes, that's a huge trading time for us. We had just migrated; we were now on one singular platform, but we did not have the observability capabilities that we needed to be effective at making sure that that platform was always giving a great experience to our customers.

[00:04:54] That translates through to business risk. Crystallizing that was the first big challenge. Then came explaining that to an engineering community who were really proud of what they had built, and who didn't necessarily know that the observability space had moved on. So bringing that together as a message and driving a cultural change was the next big challenge. And then finally, delivering on that while also trying to deliver everything else we needed to do.

[00:05:22] And during that holiday season for a lot of businesses, it sounds like you're very similar here. Most of your income will come over that two, three month period. Downtime is non-negotiable. Absolutely. Absolutely. We had a look at what order rates look like on a Black Sunday weekend, which tends to be our biggest day of the year. At points, we're hitting 1.5 million euros per hour across the entirety of the platform.

[00:05:48] That is a huge number if you think about an incident taking you down: even a quick-to-resolve incident means a minimum of 20 minutes to half an hour to mount a response, through to an hour and a half or two hours for a big one. The wrong incident at the wrong time can be absolutely crippling for our organization, not to mention the impact it has on trust from our customers and the precious memories that they're crafting on our platform.

[00:06:16] So if we go back before you implemented this, before working with Dynatrace and people like Amazon Web Services, what did your monitoring and log management setup look like then? Because you must have been on quite a journey. And where was it letting teams down day to day before? Absolutely. Without going too techie, we were really relying on the human element. We had followed a pattern where everything that happened on our platform emitted a log.

[00:06:43] And that resulted in, on a busy day, producing something like 1.16 billion log lines. They were all centralized into a cluster based on OpenSearch, AWS's fork of Elasticsearch, which allowed our engineers to read all of those logs. But the solution was not more sophisticated than that.

[00:07:06] What we really relied on was engineers being able to count the log events that they thought were important and set thresholds around them to trigger alerts. And for things that they didn't think about in advance, we were relying on engineers sifting through those logs and finding those problems. What that meant was half of our engineers across the peak period were sat there staring at logs or looking at graphs, looking for things going wrong. It was completely and absolutely reactive.
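To make the old pattern concrete, here is a minimal sketch of threshold-based alerting on log counts, the reactive approach Alex describes. The pattern, window, and threshold here are illustrative assumptions, not Storio's actual configuration.

```python
# Minimal sketch of threshold-based alerting on log counts -- the
# reactive pattern described above. The pattern, window and threshold
# are illustrative assumptions, not Storio's actual configuration.
import re
import time
from collections import deque

ERROR_PATTERN = re.compile(r"ERROR|\b5\d\d\b")  # events an engineer deemed "important"
WINDOW_SECONDS = 300                            # 5-minute sliding window
THRESHOLD = 50                                  # fixed limit, chosen in advance

events: deque[float] = deque()  # timestamps of matching log lines

def ingest(line: str) -> None:
    """Count a matching log line and alert once the window exceeds the threshold."""
    now = time.time()
    if ERROR_PATTERN.search(line):
        events.append(now)
    # Evict timestamps that have aged out of the sliding window.
    while events and now - events[0] > WINDOW_SECONDS:
        events.popleft()
    if len(events) > THRESHOLD:
        print(f"ALERT: {len(events)} error events in the last {WINDOW_SECONDS}s")
```

The weakness is built in: every failure mode needs a pattern and a threshold chosen in advance, so anything the engineers did not anticipate goes undetected until someone spots it in a graph.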

[00:07:33] All the time that we were spending doing that was time that we weren't spending innovating, and it was fundamentally a massive anchor on the organization, as well as not providing the customer experience we wanted, nor letting us rest. And you've been on quite a journey here, and you probably don't even realize just how far you've come. But there will be people listening who are where you were before.

[00:07:57] And many organizations, yes, they will talk about moving from reactive to proactive observability, but they do struggle to make it real. So what were the first concrete changes that Storio Group made to break out of fragmented and siloed monitoring? Because there will be people listening thinking, this is the path that we need to follow. But what happened there? The first thing we did was we framed the problem in a way that our engineering community could actually engage with.

[00:08:23] We went out, we looked at what industry best practice looked like. We looked at what we were doing internally. We framed that in a way that our engineers could understand. We built a maturity model. We didn't want to just pick one off the shelf. There are plenty out there. They're all fantastic. We wanted something that spoke to our own engineers. That then gave us an idea of what good could look like from this proactive position we wanted to go to.

[00:08:49] Once we had that, it became: how do we turn that into a meaningful reality? So we have a very convenient, hard deadline, the Black Friday weekend. We could work backwards from that and say what needs to be true for us to be the first to know, comfortably, by the time we hit that point. That then translated into a set of actions. We had core teams that we needed to make sure were on the platform and were brought up to speed with what good observability looked like.

[00:09:18] We could embed evangelists that we had built within the organization to work directly with those teams, help them understand what good looked like, and then start doing things that we couldn't do before. A great example is that we wanted a singular business metric to know how well we were doing on a day-to-day basis. Simply: how much have we taken in revenue on that particular day? We couldn't get that in real time from our other business intelligence systems in the organization.

[00:09:47] But once we'd gone to a point where we had Dynatrace across the entirety of our checkout pipelines, we suddenly were able to get that data in near real-time, like 10 seconds delay or something like that. For us, that really was the moment that our entire engineering community went, I can see the power in this migration. And those little moments, those little eye-opening moments for us were what created a momentum of delivery. And what I love about your approach here is it's not even about the technology.
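As an illustration of what that checkout signal might look like, here is a hedged sketch using the OpenTelemetry Python SDK, whose output Dynatrace can ingest. The metric name, attributes, and ten-second export interval are assumptions for the example; the transcript does not describe Storio's actual instrumentation.

```python
# One plausible shape for a near-real-time revenue metric emitted from a
# checkout pipeline. Names and the export interval are illustrative
# assumptions, not Storio's actual schema.
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)

# Export aggregated metrics every 10 seconds; in real use you would swap
# the console exporter for an OTLP exporter pointed at your backend.
reader = PeriodicExportingMetricReader(
    ConsoleMetricExporter(), export_interval_millis=10_000
)
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

meter = metrics.get_meter("checkout")
revenue = meter.create_counter(
    "checkout.revenue",
    unit="EUR",
    description="Gross order value captured at checkout",
)

def on_order_completed(order_total_eur: float, brand: str) -> None:
    # Called once per successful checkout. Summing this counter over the
    # day yields the single business metric discussed above.
    revenue.add(order_total_eur, {"brand": brand})
```

Because the counter is aggregated in-process and exported on a short interval, "how much have we taken today" becomes a query over one metric rather than a batch job in a business intelligence system.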

[00:10:17] Right from the outset, you've mentioned it wasn't a leadership team saying, hey, we're going to do tech this way now, this is what we're going to do. We're getting the engineers involved right from the outset. And I think bringing that culture together is as important as the technology. Is that something you were passionate about from the outset? I absolutely agree with your analysis there. Being culturally led is how we were able to deliver this piece. It was also a necessity for us.

[00:10:42] We have a group of engineers across our organization who are quite rightly proud of what they've built. They do not want an idiot like me wandering in and going, I think you should do it like this. That is not going to work in any way, shape or form. For us to be effective, we had to help everybody understand where there was an opportunity to improve and tap into that passion and pride that our engineering community had in their platform.

[00:11:10] So they wanted to be the people to improve it. If I'd have just sat there in my ivory tower and gone, do it like this, they'd have just gone, oh, he's being a wally again, and ignored me. Yeah. And I would imagine in approaching it that way, you get better buy-in. Or was there any pushback at all? There was absolutely pushback. I think many of our engineering teams were still really attached to what they had built in terms of a logging solution.

[00:11:38] They couldn't immediately see the benefit. We had real struggles in some parts of our organization getting over the hump of getting people on board with the plan. And some people didn't really accept it until they truly saw the benefits.

[00:11:56] For us, we had a really interesting experience on Black Sunday itself, where three separate problems were spotted by Dynatrace as a platform that we would have been completely blind to if we had not had Dynatrace rolled out, or had not invested the effort in building a proactive baseline for our observability platform.

[00:12:20] That was the moment that the detractors in the organization went, actually, I can see the benefits here. We estimated this saved us over four and a half million euros in downtime that would have happened if we hadn't had a solution like Dynatrace in place. That was the true moment that our detractors started becoming actually involved and really interested in the solution. And we're continuing to build from there. And I've learned something here.

[00:12:47] Obviously, I've heard of Black Friday and Cyber Monday, but Black Sunday, that is the one. That's where... Our customer journey is different compared to a lot of e-commerce partners. Building a photo gift takes time. And what it means is somebody might be starting the experience on Black Friday or before, but they're usually finishing it at least a couple of days later, sometimes weeks later. So for us, Black Friday is a really important day. Black Sunday is even more important. Oh, wow. Okay.

[00:13:16] Now, of course, you achieved reported log cost savings of around 65 to 70%. And I think that figure alone is incredibly important at a time when ROI on any tech project is front of mind. So what decisions or trade-offs were most responsible for those savings? And again, any advice you'd give to teams trying to rein in those spiraling observability costs?

[00:13:39] Yeah. So for us, the core driver of costs on our old observability solution was the sheer volume of signals we were producing. And signals is a really important word because there is more than just one signal. In the traditional world of monitoring, we relied heavily on logs as the one singular signal that you would produce. In the observability world, there are at least three accepted signals. You could argue that there are five kicking around right now.

[00:14:09] We saw those savings not because of the efficiency of Dynatrace as a platform, but because Dynatrace unlocked an ability for us to start choosing the right signal for the telemetry we were emitting. So once we had our 1.16 billion logs a day going into the Dynatrace platform, we started being able to say, this is the wrong signal bearer. We're using a really inefficient signal bearer here.

[00:14:37] In fact, we can change that to be a much more efficient signal bearer. Say, choose a metric or a trace rather than a log. That let us chop our 1.16 billion logs easily in half. And that then drove costs down. Amazing. There's a lot of talk, and a lot of hype, around AI right now. And we'll talk about the ROI again.
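To pin down the signal swap Alex describes, here is a small sketch of moving a routine, high-volume event from a log line to a counter metric. All names are illustrative, and the meter provider setup is assumed to be as in the earlier sketch.

```python
# A sketch of the signal swap described above: replace a per-event log
# line with an aggregated metric. Names are illustrative, not Storio's
# actual telemetry; meter provider setup is as in the previous sketch.
import logging

from opentelemetry import metrics

logger = logging.getLogger("image-pipeline")
meter = metrics.get_meter("image-pipeline")

# One counter instrument replaces millions of near-identical log lines.
renders = meter.create_counter(
    "photobook.pages_rendered", description="Completed page renders"
)

def on_page_rendered(page_id: str, ok: bool) -> None:
    if ok:
        # Before: logger.info("rendered page %s", page_id) -- one line
        # per event, which is what pushes daily volume past a billion.
        # After: a counter increment, aggregated in-process and exported
        # periodically, so cost no longer scales with event volume.
        renders.add(1)
    else:
        # Keep a real log for the rare, high-context failure, where a
        # searchable line with detail earns its storage cost.
        logger.error("render failed for page %s", page_id)
```

Since a counter's cost is fixed per export interval rather than per event, migrating the noisiest routine events is what lets log volume fall by half while rare failures keep their full log lines.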

[00:15:00] How did unifying logs, metrics and traces into this single AI-driven platform change how quickly your teams could get to root cause during incidents? Because this is where the meat is on the bone, so to speak. Absolutely. I had the pleasure of being able to stand up in our final war room of the peak trading period and share some of these numbers, because they truly are, at least to me, quite impressive.

[00:15:26] So we saw a 50% reduction in our mean time to detect, which was the core metric we were looking at with this particular implementation. We wanted to get to a point where we were shaving down our mean time to detect across all our different incident types. We'd originally targeted a 10% reduction. 50% is massively over delivering, but it gets better than that. When we start looking at our really important incidents, our SEV1s and SEV2s, that number increases up to 90%.

[00:15:54] We dropped down from a mean time to detect of over two hours to about 14 minutes. And that is really spectacular. That 14 minutes we can still improve on; it was driven by systems that were outside of our direct control and other bits and bobs. But it's a huge gain compared to where we were. And that was unlocked because it was no longer us having to think, how do we set a threshold for this particular scenario? How do we set a threshold for another particular scenario?

[00:16:22] But by relying on a causal, predictive AI, we started finding problems that we'd never considered would occur. And that really drove a massive reduction in the MTTD for us. And we've talked about the importance of the holiday season to you guys. So from an operational perspective, how has proactive, AI-first observability affected your ability to protect revenue during those peak demand periods, like seasonal gifting, for example?

[00:16:52] As we spoke about earlier, our customer journey is much longer. The first thing that we've been able to do is ensure that that customer journey is of high quality for our customers all the way through that process. We've also, when we're under massive amounts of load, been able to proactively solve problems before they escalate. We spoke a little while ago about the three problems we saw on Black Sunday.

[00:17:18] The customer impact of those three problems was absolutely minuscule. Fewer than 100 customers, across the thousands that were coming to our platform on that particular day, saw any impact at all. And most importantly, they did not see the website break. They just saw a slight slowdown. That sort of proactive problem detection, allowing us to mitigate before a problem becomes a big problem, has really enabled a better experience. And with that, revenue protection.

[00:17:46] If we provide a bad experience to our customers on a day like that, they're simply going to find another vendor who will provide a better experience. So that really helps us hang on to those revenues, hang on to a great customer experience, and allow customers to create memories. Now, you and I both love our technology, but unfortunately in some businesses, observability will be seen as, or feel like, a purely engineering concern, a cost center rather than a value add.

[00:18:15] So how did you communicate the value of these changes to maybe some of the non-technical stakeholders across the group? That was a difficult one. At times, I suspect my chief product officer was quite cross at how much I was going on about observability and distracting all of the engineers from delivering value. The lens we tried to apply was deliberately a risk lens. We wanted to say, if we are not able to observe our platform, we know what the impact will be.

[00:18:44] We had seen, in previous parts of the year, the challenges of identifying incidents, and we had data to prove that. And that allowed us to have a data-driven conversation: if this plays out, we cannot guarantee peak on these particular days, and the revenue impact could look like this. It's very easy to produce some very scary numbers when you have that sort of situation. So that helped me get the buy-in, but then came the value proposition piece.

[00:19:12] Dynatrace for us is not just about risk, although we use that lens to focus in on a particular space in order to create the conversations to bolster us into moving forward. Now we have Dynatrace there as a platform, we're beginning to explore the value piece, understanding how our customers behave, understanding business metrics in real time, as we were talking about earlier. It's really powerful to our product partners and the rest of the organization.

[00:19:40] And they're now really excited about the observability platform that is going to unlock possibilities for them to know our customers better. And have you spoken to them since now? Are you on the other side of this? Do they say, I get it now? They absolutely say, I get it now. And they are asking me, how can we use Dynatrace more? Which for me is a great one. And looking back on your journey so far, is there a particular lesson that you'd share with any consumer facing digital business leader that might be listening to our conversation today?

[00:20:10] They also want to simplify operations, control cloud spend and still improve customer experience at the same time. It seems overwhelming, but you've kind of done it. You've been there and done it. So any advice there? Yeah, I think we've touched on a couple of points. I'm going to cheat a little bit and give two pieces of advice if I can get away with it. First of all, double down on your cultural investment. It is by far the biggest challenge.

[00:20:36] Bringing people on the journey and getting them to understand unlocks so many possibilities and will make the end result so much richer. And second of all, it's really important to find a partner who wants to work with you and wants to understand your business. Not one that's just looking for an opportunity to sell a bad-fit solution into your organization, but one that really wants to understand the problems that you have.

[00:21:05] Dynatrace are constantly asking me, how can we do better? And that is a really, really important thing to have in a partner. And while you've been here, you've been hosting breakout sessions. You've been speaking to people all around the world on the show floor and the many events here. I'm curious, when you take that long 10 hour flight home and reflect on everything you've seen and heard, what are you going to be thinking about on that journey? Anything you're going to be taking away from this event? I think there are a couple of pieces.

[00:21:35] There is a huge excitement in our industry right now about the possibilities of AI, but we're still in a relatively early position when it comes to the AI adoption journey. I've seen some great things in terms of being able to unleash the potential of agentic AI on the observability space here at Perform. There's been some really exciting developments coming out of the Dynatrace platform.

[00:22:00] I'll be thinking about how I can start using those to help continue to drive down our MTTD, and to continue to get better insights from Dynatrace in terms of how our customers are behaving. And then the other thing I'm going to be thinking about is how we as a community of technologists share more success stories like this. It's been really nice to hear other people's perspectives, hear their journeys and be able to share my own.

[00:22:27] So it's been great to be able to take part in this podcast and hopefully share a little bit of our experience. Well, thank you so much for joining me today. And for anyone listening, maybe they want to keep up to speed with further announcements throughout the year or see more about what you're doing at Storio Group, both as a consumer and as a business insider. Where would you like to point everyone listening? We, as everybody does, we love to have a good social media presence.

[00:22:51] You will find us on LinkedIn. We love to tell the industry about how we are trying to evolve the organization. And I'm certain that those of us involved in the tech community will see me pop up telling more stories in the future. Awesome. Well, I'll add links to everything that you mentioned there. And also, I'm going to be checking out Storio Group. I've got a few birthdays coming up, so I'll be looking at a personalized gift there. But thank you for joining me today. My absolute pleasure. Have a good rest of your day.

[00:23:19] I think when you step back from this conversation, what really stands out is how observability stopped being an engineering side project and became almost a shared business language. And Alex's story today is not about chasing the latest idea or copying a reference architecture. It's about recognizing risk early, being honest about blind spots and bringing engineers, product leaders and executives into the same conversation.

[00:23:49] Using data that they can all understand. And the numbers matter, from the dramatic reductions in detection time to the cost savings. But I think the deeper shift in everything we talked about today is cultural. Engineers move from watching dashboards in fear to trusting a platform that would surface what actually mattered. Leaders move from seeing observability as just another cost to seeing it as a source of confidence.

[00:24:18] And I think there's something quietly powerful in how Storio Group reframed success: preventing full outages during peak season, protecting revenue before it disappeared, and ensuring customers experienced only a slight slowdown instead of a broken journey. Those moments rarely make the headlines, but they define whether a digital business earns loyalty or loses it forever.

[00:24:45] And if this conversation today sparks any ideas about your own observability journey, maybe the real question is this. What would change in your organization today if you became the first to know when something was going wrong instead of the last to find out? As always, pop over to techtalksnetwork.com. Let me know your thoughts, experiences, insights, stories, good and bad. I want to hear them all.

[00:25:12] But before I go, just a big thank you to Alex for taking the time to chat with me on the show floor at Dynatrace Perform. And a big thank you to each and every one of you, of course, for tuning into this podcast. So I'll be back again tomorrow with another story. But thank you for listening as always. And I'll speak to you then. Bye for now.