Good morning, afternoon, evening. I see somebody from Nigeria. Thank you for your time. I hope this is useful for you. I am Gerardo Dada, the field CTO for Catchpoint Systems. I've been involved with IT, web technologies, the internet, and the cloud in general for about 30 years, since 1995.
Today we're going to talk about how we can make IT the enterprise innovation engine, which is something that most IT departments want to be, and it's a big challenge. This is not going to be an ad for a company or anything like that. I'm hoping the information I'll share is going to be useful for you.
There are many stats on this. This one is from Deloitte, and the point is that IT wants to be in the innovation game.
Innovation is oftentimes done through technology, and at the same time, a lot of IT departments are struggling with innovating because they spend 80% of their time just keeping the lights on, managing stuff.
Let's look at some history. I started in IT a long, long time ago.
In the early days, I remember my first job in corporate. I was at a company where you had a local e-mail server, probably Exchange on a small business server. You had a file share, a print server, and a couple of applications. All of them were local.
And people worked in the office, right? Connectivity was local only, and IT was the team that deployed some of these applications. There were only a handful of them, and they were the guardians of uptime, meaning nobody cared about IT, unless something went wrong.
Unfortunately, that's still how sometimes it is perceived.
It sucks because applications have obviously become more complex, things go down more often, and we have more and more applications all the time. IT is not getting credit for all the hard work it does. People only think of it as a call center.
Today, every company is a software company, meaning everything we do in a company is supported by a digital experience or is digital in itself. All the business processes are digital.
When I was at Rackspace back in 2010, and we were in the early days of cloud, we were thinking about what the most useful use cases for cloud were, and we made a long list. Surprisingly, accounting software was one of the top. eCommerce was right at the top of that because of the elasticity of the cloud, etcetera. I thought restaurants would be very much at the bottom because why would you need the cloud if you had a restaurant—everything happens offline.
But today, you cannot find a restaurant on Yelp if your Internet is not working, you cannot get to the restaurant using Google Maps. If you get to the restaurant, you cannot scan the QR code. You cannot place the order in the restaurant management system. You cannot present the bill, and you cannot charge for the meal. If there's no connectivity, no Internet, no digital systems, you might as well close the restaurant.
If you think of any “offline” process, like, I don't know, boarding a plane, checking in a restaurant, filing an insurance claim—everything is digital nowadays. That means everything is software-based, everything is supported by digital technology. It's critical for everything.
So in the past, keeping the lights on meant focusing on operational excellence, basically uptime. It was very reactive. You only called IT when you hired somebody, fired somebody, or had a problem to solve. Therefore, IT was a cost center, meaning you needed to minimize it. There was no glory. There was no strategic contribution. The only thing they did was keep things working, and you only thought about IT when things went bad.
But today, with everything being digital, IT is in the business of innovation.
IT needs to support, enable, and even drive business strategy, because it supports the user experience, and users can be both customers and employees. IT needs to ensure digital processes work and enable innovation.
This sounds easy and logical, but the fact is that over 50% of the workforce today, the digital workforce that works in front of a computer, suffers from what Gartner calls digital friction. Meaning some things don't work sometimes.
You've heard things like, “Teams is not working for me,” “Webex is not launching,” “I can't log into the system,” or “My Wi-Fi connection is bad.” One common example is when renting a car—there's often that moment where the staff says, “Sorry, my computer is slow.” This seems to happen almost every time you interact with someone in a service role.
IT has the opportunity to be the driver, the enabler, but also to educate the business on how to enhance the overall strategy.
I want to talk about four things that are happening in IT.
The first one is complexity explosion, right? IT was supposed to become simpler, and there were already many technologies 20 and 25 years ago. Then, there was component-based development, which was supposed to make application development easier. After that, automation was expected to simplify IT further. Finally, the cloud was supposed to make everything self-service and happen automatically.
Now AI is supposed to make everything better automatically.
But the reality is that every time we introduce one of these technologies, it just raises the bar of what's possible with IT and adds to the complexity.
Today, we have Kubernetes, multi-cloud, hybrid, 5G, SASE technology, security, and a bunch of other stuff. So that's the first thing people need to work with.
The second one is AIOps potential. I call it a buzzword at the moment because there's a lot of talk about what AI can and cannot do, while there is still relatively limited real-world benefit from all the AI investment.
I'm not saying it's not there, or it’s not going to be there. But the point here, with the word "potential," is that there’s a very large expectation for IT to save money, improve uptime, and improve processes with AI.
So the business is going to expect IT to step up to the plate and to see these benefits quickly.
Third, obviously, there's remote and hybrid work. It's a big challenge because people are not in the office. This means you cannot take a look at their PCs. The help desk is getting more complex. People are accessing applications from all over the planet. Companies have gone completely distributed.
Catchpoint is one of those companies. We started as a company over two years ago. We closed our office in New York City, and people moved out. One of our founders lives in LA, another one lives in Florida, and I'm in Austin, TX. We get together every now and then, but we're completely distributed. We don't have an office.
And then last, the pressure to optimize, which is always there. The pressure to optimize comes from multiple places. One, there's a lot of legacy technology that needs to be modernized, right? Like there are companies that are still using SiteScope and mainframes, and those things need to be modernized and moved out. There's also a lot of pressure for consolidation. IT has too many tools, and there’s pressure to deliver savings, like every company is trying to save money in every business unit.
These are the four main things we see as creating pressure for IT to deliver more and deliver better. So how can they do that? How can IT change from being that reactive cost center to really becoming a business partner?
In this presentation, I'm going to offer four steps. I'm going to cover them super quickly right now, and then we'll go into each one of them.
The first one is a focus on resilience and trust, right? Resilience is the ability to provide that uptime we’ve talked about and the ability to withstand all the challenges IT faces today, delivering systems that are always up.
We’ll explore more about what resilience means and how you can actually deliver trust not only internally but also for all your stakeholders, your customers, your partners, etc.
The second one is to align the business and IT using an application tiering methodology. The third one is embracing digital experience maturity models since we agree that digital experience is what drives the business. There's a process we’ve been developing over multiple years that is, again, not vendor-specific but helps you move from being reactive to being business-led, then proactive, and finally, building a digital operation center.
You see, there’s a lot of focus on digital because, at the end of the day, a fundamental concept in this thinking is that the goal of IT is not to run a server, right? A server is just a means to an end. The goal of IT is not to reduce CPU utilization below 60% or to deploy databases in the next phase. No, the goal of IT is for the applications that the business requires to run, to run well, to perform, and to be available. That’s what the concept of resilience is about.
So, part of the challenge here is that many organizations, when we talk about resilience, still have too many war rooms. You should have a question on your screen asking, "Do you still have too many war rooms? Is it still a massive disruption for the IT team?" Is it something like, "Yeah, we have a few, but it’s OK," or is it not a problem, or do you not know?
I’d appreciate it if you could take a second and click on your answer because I think it'll be interesting to see how bad it is. In general, most people we talk to say they have too many war rooms. Obviously, the ideal number is zero.
Can we see the results from the poll? All right, so I’m moving the results here, and you can see that 58% of people have some war room problems, 16% don't know, and for 20% of people, it’s not a problem.
OK, let’s move on. Most IT departments are still struggling with outages, right? That’s the most disruptive form of digital disruption.
I have here a few companies: Cloudflare, Fastly, and Akamai. I picked these three CDNs to show that even the companies powering the Internet, the ones that understand the Internet best and have a lot of engineers and very smart people, are still having some of these incidents.
So the question is not if it's going to happen to you, but when is it going to happen to you? How long is it going to last? How often are you going to see some of these incidents, and how bad are they going to be? What's going to be the impact? And will somebody get fired? Who's going to be accountable?
Just in the last week, we've seen incidents from Microsoft Azure, from ServiceNow, and from a bunch of other companies. We have a site, catchpoint.com/outages, where you can see all the outages in real time, and I've never seen it be blank, meaning there's always an outage going on.
Part of the challenge is caused by what we call the complexity of the Internet stack. Just like you have an application stack with multiple components—code, web front end, databases, servers, networks, storage systems—that’s well understood and has plenty of monitoring technologies, there’s also an Internet stack that supports applications and services.
Today, everything we do is distributed. It's multi-cloud, it’s hybrid. Whether people are connected from multiple places, or your applications are hosted in different places, with components of those applications distributed in various locations, corporate applications are often made up of multiple components that are spread across different clouds and locations.
This means what seems like a single application from the user's perspective might actually depend on multiple clouds, multiple APIs, multiple DNS systems, and even BGP, the Border Gateway Protocol, which many people don't even know exists. Facebook learned a hard lesson three years ago when everything (Instagram, WhatsApp, and Facebook) went down for eight hours because their BGP system went down. They couldn't even get into their data centers.
Of course, there are also protocols and network technologies, both inside and outside your network, like SASE, WAN backbone, and new technologies like ECN, L4S, QUIC, and MQTT workers for IoT applications.
My point here is that many Internet technologies have a significant impact on systems today. Despite having auto-scaling groups and self-healing applications, the systems are so complex, built in so many different places, and deployed from multiple locations, that it’s critical to understand what the Internet stack is supporting in each of these applications.
The impact of not doing that is the impact to trust, right? The impact to trust comes from a few key areas, one of them being brand reputation. Obviously, when a site is down, or when you have an incident that impacts the business, there’s an impact on revenue—both from the inability to charge money in the moment and from the long-term loss of customers. There's also an operational cost, like when you have a war room, pull in your smartest people, and disrupt their day by stopping everything they’re working on. It’s costly, and when you wake people up, that operational cost is even higher.
Gartner research has highlighted the risk impacts coming from brand reputation, financial loss, legal issues, and safety, especially for life-critical applications like in a hospital. End-user experience is also key.
We call digital resilience—or Internet resilience—the ability to withstand all this complexity, manage the risks, and deliver the right digital experience for users. There are four components to resilience.
It’s useful to think of it like a store, right? First, can you actually get to the store? If the store is closed, it doesn’t work. Second, is it performing well? Is it fast, or is it slow? If there’s a long line of people and it’s not moving, then it’s as good as not working. Third, is everything in the store available and functional? You might be able to go to the store, but if the cash registers aren’t working, it doesn’t do you any good. Lastly, reliability: can you trust that the store will be there every time you go? You want your coffee shop to be open every morning to power your day. If you go twice in a week and it’s closed, you’ll probably start thinking about finding another coffee shop.
The thing is, in the digital world, loyalty disappears very quickly. So we think about resilience in these four ways. This framework helps you evaluate the resilience you’re delivering to the business and for the IT systems that power it.
That covers resilience, which is the first step: focusing on resilience and trust, both internally with your users and externally with your customers and partners.
The second step to becoming a strategic business partner is to align application tiering with IT business priorities.
And here we have another poll: how many applications does your IT department manage across your enterprise? This is just to give us an idea of the complexity.
If you can take a minute to click quickly, we can show the results. Let's show the results now.
As you can see, it's hundreds. Well, if you add it up, 19 percent is over 1,000, right? Managing 1,000 applications is very complex. If you need to think about monitoring, reporting, updating them constantly, consolidation, etc., that’s a daunting task.
So how do you do that?
And so there's this concept called application tiering. That basically means you can pick 3 tiers, 4 tiers, or 7 tiers. It doesn’t really matter, right? Here’s a model that shows four different tiers, just grouping your applications based on their level of criticality to your business.
This is a best practice, right? Tier one applications can be revenue-generating and critical for business operations. Tier 4 applications are mostly internal; they support productivity and are important but not critical.
The key to putting them in the right tier is understanding the impact of an outage, right? What happens if a Tier 1 application goes down? You start getting calls. You call a war room, and you get a call from your CEO. In the case of Facebook, when all their applications went down, it wasn’t just a tier one incident. The board of directors was calling Zuckerberg, asking not only what the problem was but also how he was going to make sure it wouldn’t happen again.
For Tier 4, it may not be that impactful if the application goes down for a couple of hours. Here’s an example: If you’re in an e-commerce business, your e-commerce platform is probably mission-critical. If you’re in B2B, Salesforce is going to be business-critical, but maybe not mission-critical.
You might have an HR application, or employee benefits systems, which can be deferrable for the 4th tier, right? For example, if a website supporting your holiday party goes down for a couple of hours or even a day or two, it’s going to be OK.
To take this thinking about application tiers further, let's do a quick poll: Are you familiar with the concepts of RPO and RTO? These concepts are usually associated with storage systems but apply more broadly.
If you're familiar with it, are you applying it, or are you using a similar set of metrics for your business? Please take a second to answer, and we’ll see the results.
So, 33% have a similar system, and 25% are familiar with it. Most people are familiar, but 6% to 9% are not familiar or unsure if they’re familiar with how it’s being used.
All right, let’s cover that topic quickly.
RTO, or Recovery Time Objective, is again mostly used for disaster recovery, usually for standby systems and storage. It is essentially the maximum acceptable downtime for a system, right? So, how long between the time a disaster happens and when you actually recover and are back online?
That's RTO. RPO, or Recovery Point Objective, refers to how much data loss is acceptable, usually expressed as a window of time, meaning how much of the most recent data can be lost or needs to be recovered. This varies by application, right? Most companies that have a tier system might have RTO and RPO objectives defined by application or by tier.
If we go back to that table, you can see that resilience requirements range from low to high. Then you have both an SLA and an RTO. For example, a Tier 4 application might be able to go down for two hours, or maybe even eight hours in your case. A Tier 1 application, though, might only have a five-minute RTO.
Some applications may even require a recovery time shorter than that. Unfortunately, it's really hard to recover in five minutes. That means you need to identify the problem, find the root cause, fix it, and validate that the fix is working, all within five minutes. It’s a daunting challenge, but this model helps break down the thousands of applications into groups. Maybe you have 20 Tier 1 applications, 100 business-critical applications, and many more in the lower tiers.
This approach helps you focus and report to the business how you're doing in terms of SLAs and how you're maintaining uptime for those applications.
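To make the tiering model concrete, here is a minimal sketch in Python of how tiers and RTO targets might be recorded and checked. It is an illustration under stated assumptions, not a prescribed implementation: the five-minute Tier 1 and two-hour Tier 4 targets come from the examples above, while the Tier 2 and Tier 3 targets, the Tier 3 label, and all application names are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Tier:
    level: int
    label: str
    rto: timedelta  # maximum acceptable downtime for this tier


# Tier 1 and Tier 4 RTOs follow the talk; Tier 2 and Tier 3 values are assumed.
TIERS = {
    1: Tier(1, "mission-critical", timedelta(minutes=5)),
    2: Tier(2, "business-critical", timedelta(minutes=30)),  # assumed RTO
    3: Tier(3, "business-operational", timedelta(hours=1)),  # assumed label and RTO
    4: Tier(4, "deferrable", timedelta(hours=2)),
}

# Hypothetical application inventory mapped to tier levels.
APP_TIERS = {
    "ecommerce-storefront": 1,
    "crm": 2,
    "hr-benefits": 4,
    "holiday-party-site": 4,
}


def met_rto(app: str, downtime: timedelta) -> bool:
    """Return True if the outage was resolved within the app's tier RTO."""
    tier = TIERS[APP_TIERS[app]]
    return downtime <= tier.rto


def rto_report(outages: list[tuple[str, timedelta]]) -> dict[int, tuple[int, int]]:
    """Per tier level, count (outages resolved within RTO, total outages)."""
    report: dict[int, tuple[int, int]] = {}
    for app, downtime in outages:
        level = APP_TIERS[app]
        within, total = report.get(level, (0, 0))
        report[level] = (within + int(met_rto(app, downtime)), total + 1)
    return report
```

A report like this, grouped by tier, is one simple way to tell the business how the few Tier 1 applications are doing without drowning that signal in hundreds of lower-tier ones.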
All right, so we talked about resilience, trust, and tiering. Now, the third point is about the digital experience maturity model.
When you're trying to build this kind of system to achieve five-minute resilience or even four nines (99.99%) or five nines (99.999%) availability, companies fall into different stages of maturity, right?
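For a sense of scale, an availability target can be converted into a yearly downtime budget with simple arithmetic. The sketch below assumes a 365-day year and counts every minute of the year equally, which real SLAs may not do.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} min/year")
```

Under these assumptions, four nines allows roughly 52.6 minutes of downtime a year and five nines roughly 5.3 minutes, so a single incident that runs past a five-minute RTO would use up nearly the whole five-nines budget.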
The first stage is where you're not monitoring—you’re not actively tracking the system. This can vary by application. The second stage is more reactive: you have some ability to detect and restore issues. Unfortunately, this is where most organizations exist. This stage is very focused on infrastructure systems.
When you're on the left side of the spectrum, you're closer to being a traditional, reactive IT department that’s seen more as a service provider. As you move to the right side of the spectrum, things get more interesting.
In stage 3, you are proactive. You have system-level objectives, you track results, and you work in what's called a "blame-free" environment where you don’t have the network team and application team pointing fingers at each other. Instead, you focus on figuring out what happened and how to prevent it from happening again.
Being proactive means you're trying to prevent problems from occurring, and you're catching problems before users complain, minimizing the business impact.
Stage 4 is where you get real value. You have Internet performance monitoring during the design phase, AI overlays, and a transparent culture where all data is open for everyone. There’s a culture of continuous improvement. Operations are measured not by processes or by what’s urgent, but by the outcomes for the business.
You have unified operations, DevOps, SecOps, and similar frameworks, and your value and outcomes drive the business. So, it’s no longer about uptime or fixing problems or lowering costs. It’s about asking, "How do I, as an IT department, maximize revenue? How do I optimize the user experience for both customers and internal users? How do I protect the brand? How do I achieve resilience by reducing service impact? How do I improve the productivity of the workforce that depends on these applications?" And ultimately, how do I reduce digital friction for my users?
To conduct the maturity model assessment, you typically assess the ecosystem, look at implementation, and continuously reassess. You keep improving and closing gaps, identifying business challenges, and consistently trying to move further along the maturity model.
You’re not necessarily in stage 3; you might be at stage 3.15. Then, next month, you could move to stage 3.2, and so on. It’s an ongoing process of reassessing where each part of your IT department is within the maturity model so that you can continue to move forward.
If you look at the details behind the maturity model, there are multiple perspectives you can examine: the application perspective, Internet service delivery perspective, infrastructure perspective, network experience perspective, and workforce experience.
All of this data and information is centralized in a system with a correlation engine, dashboards, alerts, processes, analytics, SLO tracking, and SLA tracking, based on your budgets but also on experience level objectives (XLOs).
This data is used to drive business decisions and to show how you are driving the business. You capture multiple metrics, but you’re more focused on the value that the IT systems bring to the business by enabling digital experiences.
The maturity score is a combination of different factors—technology use cases, networks, applications, websites, etc. By the way, "application experience" refers to APIs, web services, and third-party components, among others.
With what we call the resiliency pillar—reachability, functional availability, performance, and reliability—there's a saying that's been around for maybe 10 years: "Slow is the new down," right?
A website that does not respond within 7 seconds is considered just as bad as one that doesn't work at all. When you combine all these scores across the resilience pillars by each functional area, you end up with a score. You have different maturity levels based on each technology or resilience aspect, which gives you an overall score that you can work on improving.
Hopefully, that identifies where your performance is suffering the most. In this case, it’s likely that performance is the area where you need to focus the most. In terms of technology, you have application and website experience to focus on to bring your maturity to the next level.
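As a rough illustration of how such a score could be put together, here is a minimal Python sketch. The talk does not say how the scores are combined, so the plain averaging below is an assumption, as are the functional areas and the stage values (1 for unmonitored through 4 for value-driven), which are made up for the example.

```python
from statistics import mean

PILLARS = ("reachability", "availability", "performance", "reliability")

# Hypothetical maturity stage (1 to 4) per functional area and resilience pillar.
SCORES = {
    "application": {"reachability": 3, "availability": 3, "performance": 2, "reliability": 3},
    "website": {"reachability": 3, "availability": 2, "performance": 2, "reliability": 3},
    "network": {"reachability": 4, "availability": 3, "performance": 3, "reliability": 3},
}


def pillar_averages(scores):
    """Average each resilience pillar across all functional areas."""
    return {p: mean(area[p] for area in scores.values()) for p in PILLARS}


def area_averages(scores):
    """Average each functional area across all resilience pillars."""
    return {name: mean(list(area.values())) for name, area in scores.items()}


def overall_score(scores):
    """Single maturity score: the plain average of every cell (an assumption)."""
    return mean(value for area in scores.values() for value in area.values())


def weakest(averages):
    """Name of the lowest-scoring entry, i.e. where to focus first."""
    return min(averages, key=averages.get)


print("overall:", round(overall_score(SCORES), 2))
print("weakest pillar:", weakest(pillar_averages(SCORES)))
print("weakest area:", weakest(area_averages(SCORES)))
```

With these made-up numbers, performance comes out as the weakest pillar and the website as the weakest area. A fractional overall score is also how you end up somewhere between stages rather than exactly on one.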
The goal is to move from passive monitoring, where you're barely able to react and you learn about downtime through Twitter or from user complaints, to a more proactive approach.
You'd be surprised by the number of large companies that have unmonitored mission-critical systems. I was talking to the CIO of one of the largest pharmacy chains in the United States, and I asked him, "How would you know if all your stores here in Texas lost their connectivity to your insurance carrier, so they couldn't fill any prescriptions?"
He said, "I wouldn't know. I would only know based on complaints, maybe a day or two later."
They're unmonitored, passive systems. And it’s surprising that in 2024, we still have critical systems in that situation.
When you're proactive, you're able to catch things as soon as an error happens. You catch it almost instantaneously, and you fix it. Then, there’s value.
That means your system, in this case for the pharmacies, gives you confidence that your users are not only able to fill prescriptions but also to pay and access all the digital capabilities they need.
Then IT is brought to the table as a way to improve those processes. What other capabilities can IT drive in terms of customer experience? What kind of efficiencies, connectivity improvements, and partnerships with third parties and technology partners can we bring to enhance that process overall?
So, that’s the maturity model.
So, we talked about resilience, tiering, and the maturity model, and the last part is building a digital operations center.
The idea is that once you've implemented all this technology and understand how it drives the business, your dashboard is no longer focused on CPU utilization, throughput, cloud connectivity, or alerts about what’s broken right now. Instead, imagine a digital operations center that shows you everything happening in your company, from both a technology perspective and a digital experience perspective.
Because, again, if a user in Texas can't fill their prescription and they get upset, you won’t make money, and you're at risk of losing that customer forever. There are plenty of options, right? There’s another pharmacy across the street, there’s Capsule, Amazon Pharmacy, and more. Users have so many options, and if you don’t deliver a perfect experience, they can easily go elsewhere.
And if IT responds by saying, “Well, my server shows everything is green” or “My alerts show no problem,” it highlights the issue: IT is not even aware of the customer experience challenges.
The concept of a digital operations center is similar to a NOC (network operations center) or SOC (security operations center). If we agree that digital operations, processes, and software are the lifeblood of a modern company, wouldn’t it make sense to have a DOC (digital operations center) where you can see how your internal applications are enabling your workforce?
How is the health of IT in general? Can all your employees use the software they need? Are your branches and pharmacies connected properly? Are your websites performing quickly? Are your connectivity and APIs to third-party systems (such as pharmacies or transaction processing systems) working properly?
We looked at CNN as an example. The CNN website has hundreds of digital components to load, from tag management to loading fonts from one place, to a CDN for some of the graphics and videos, etc. It’s surprising that the website can load in five seconds when you think about the hundreds of different things coming from various parts of the Internet.
We had a blog post on our website about an incident we caught in December, where Adobe Tag Manager—a very common tool for websites—went down. The companies and organizations that had a digital operations center in place were able to catch it as a dependency immediately and turn it off so it wouldn’t impact their business, allowing them to continue operating smoothly.
However, the organizations that weren’t paying attention to third-party dependencies and didn’t have the right monitoring in place weren’t thinking about the user experience. They began getting complaints and seeing the impact of the downtime on their website.
We’re talking about multi-billion-dollar companies. In this case, they called a war room. Adobe wasn’t even publishing updates about this on their website, Twitter feed, or status page. These companies had to spend an hour or two just identifying the problem and turning off the dependency. As you know, if you’re in IT, 90% of the game is finding the root cause, right?
It's about reducing the mean time to innocence or the mean time to identification.
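One small step toward catching this kind of incident faster is simply knowing which third parties a page depends on. The Python sketch below is a minimal illustration, not how any particular product does it: given the URLs a page requests, it counts the ones served from hosts outside your own domain. The URLs and the example.com domain are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def third_party_hosts(resource_urls, first_party_domain):
    """Count page resources per host, skipping the site's own domain."""
    counts = Counter()
    for url in resource_urls:
        host = urlsplit(url).hostname
        if not host:
            continue  # relative or malformed URL: treated as first-party
        if host == first_party_domain or host.endswith("." + first_party_domain):
            continue
        counts[host] += 1
    return counts


# Hypothetical resources requested while loading one page.
PAGE_RESOURCES = [
    "https://www.example.com/index.html",
    "https://static.example.com/app.js",
    "https://tags.example.net/loader.js",
    "https://fonts.example.org/sans.woff2",
    "https://cdn.example.org/hero.jpg",
    "https://tags.example.net/rules.json",
]

for host, count in third_party_hosts(PAGE_RESOURCES, "example.com").most_common():
    print(host, count)
```

An inventory like this tells you which external services to watch, so that when one of them fails you can identify it as the dependency and turn it off instead of spending an hour or two in a war room.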
So, what we built at Catchpoint—and this is the only kind of commercial slide like this—is a system that captures everything about the Internet, everything about end-user experience, API experience, and digital experience across the Internet through thousands of vantage points around the world. Different types of telemetry are collected and analyzed.
We’re not the only company doing this; there are one or two others that offer something similar. What we’ve built is presented in a way that shows a global map, status updates on applications, SLA and XLO forecasts and monitoring, and an Internet stack map that shows all the dependencies impacting a business.
The goal of this system is not just to show the performance of an individual server. It’s not about collecting logs and storing terabytes of data, hoping to find the root cause later. The goal is resilience—ensuring reachability, availability, performance, and reliability. And to the extent that it’s possible, automation plays a role as well.
The idea is that with this digital operations center—whether from Catchpoint or any other technology—you can identify issues because you’re looking at them from an end-user perspective.
You can identify the cost of friction, the digital friction, and the things that are upsetting customers and end users. It’s not only about the big failures; it’s 100% actionable because you can see everything impacting the user, not just from their perspective, but also as the traffic traverses the Internet—from ISPs and CDNs to backbone connectivity, front-end connectivity, your internal network, and even down to your code traces.
This way, there’s less finger-pointing, and there’s no guesswork or false positives. It’s all proactive because you're monitoring everything.
In the case of synthetics, you're simulating users and applications. In the case of real-user monitoring (RUM), you're seeing real users. Then, you have these experience scores, which use AI to show you the ideal experience you need to deliver to your customers, for instance, in London, or with your ServiceNow or SAP Concur applications.
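To show the idea behind a synthetic check in its simplest form, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not how a commercial platform works: it probes a URL once from wherever the script runs and applies the "slow is the new down" rule with the seven-second figure mentioned earlier. The function names, the ten-second timeout, and the choice to treat any HTTP status of 400 or above as down are assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Optional

SLOW_THRESHOLD_S = 7.0  # the "slow is the new down" cutoff cited in the talk


def classify(status: Optional[int], elapsed_s: float) -> str:
    """Map one probe result to 'down', 'slow', or 'ok'."""
    if status is None or status >= 400:
        return "down"
    if elapsed_s > SLOW_THRESHOLD_S:
        return "slow"
    return "ok"


def probe(url: str, timeout_s: float = 10.0) -> str:
    """Fetch url once, as a simulated user would, and classify the result."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()  # include body download time in the measurement
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with an error status
    except OSError:
        status = None  # DNS failure, refused connection, timeout, and similar
    return classify(status, time.monotonic() - start)
```

A single probe from one location only tells you what that one vantage point sees. Running the same check on a schedule from many locations is what starts to separate a local network problem from a real outage.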
So, where does your IT need to focus? What are the things you need to pay attention to, and what requires immediate attention from your IT department?
Again, it’s less about focusing on services and systems, and more about focusing on the experience score. At the end of the day, the goal is migrating from a "keeping the lights on" system to an IT innovation system focused on the digital user experience. This ensures that all digital processes and business operations supported by these digital systems are working—and working well.
They’re resilient, performing, available, and reliable. You’re enabling innovation because once you do all these things, and you agree with the business on what your top applications are, what your SLAs are, and what your experience level objectives are, you’re delivering on those.
You deliver a digital operations center that helps the business understand how information and data are flowing, and how the systems are enabling the business. That’s how you get a seat at the table to innovate.
A few weeks ago, I was looking at a report from Ernst and Young, and they were talking about the vital role sales are going to play with technology, focusing on the end-user and future innovation.
This was closely related to the presentation I was preparing today. They talked about six actions.
First action: focus on the customer and client experience. Again, it’s less about the system and more about the experience.
Second: use analytics and AI for deeper visibility and insights. We’ve talked about that today as well.
Third: modernize IT with cloud services for efficiency, with the keyword being resilience.
Fourth: increase flexibility by nurturing a digital ecosystem, which speaks to the complexity and interdependency of systems. It’s not just about managing your internal systems—IT needs to care about all the external dependencies that impact the business.
Fifth: develop and secure an advanced talent pool. It’s critical, but it’s also about using systems that are smart enough so you don’t need the smartest people in the world to find and fix things.
And finally: manage technology risk and maintain trust. This is tied to the trust factor that we’ve been discussing.
I found it very interesting that Ernst and Young was talking about some of the exact same topics I was referring to.
So, that covers the presentation for today. If you're interested in learning more about Catchpoint's IPM platform, how we can help your organization align with the business, or some of the technology we shared today, let us know. If not, that's absolutely fine. If you just want the slides, we’ll make sure you get a copy. We’re not going to share the results of the polls today, but we can follow up on them later.
With that, we can open up to any questions and answers from the audience.
First, again, thank you for your time. I appreciate that all the people who were here when we started the presentation are still here. Hopefully, this was useful. I tried to make this useful for everyone, and I wasn’t trying to sell you anything—at least, not a lot.
If you have any questions, you can type them in the chat, and I'll be happy to answer them to the best of my ability.
Haley, do we have any questions?
Yes, we do have a couple in the Q&A. The first one is: "What are some examples of companies that are doing this today?"
Great. We work with a team from IKEA, the Swedish furniture maker. They don't even call it monitoring anymore; it's called operational insights. Monitoring typically means just looking at the internals, while operational insights focus on understanding how the business works.
This team uses the concept of digital experience to understand everything, from the performance of their website to the resilience of in-store applications that allow customers to design cabinets or kitchens. All those applications are monitored with the same principles.
For example, they monitor the warehouse application that processes customer pickup orders. If a customer shows up at an IKEA store, the system sends information to the warehouse, and the customer receives their furniture. They monitor not just the server but the entire customer experience, asking: How many times does the customer have an issue? What is the wait time from the moment the customer says, "I’m here waiting for my furniture," to when they receive it?
They also monitor every workforce team, from office staff using CRM systems to the DevOps teams using tools like GitLab, Dynatrace, or New Relic. These systems, often hosted in the cloud, are critical for thousands of developers to remain productive. If they go down, there’s no visibility into what's going on.
IKEA monitors all these aspects from an end-user perspective, with the philosophy that their goal as an IT department isn’t to just fix systems but to deliver a seamless experience for technology users and to empower IKEA to continue being the world's largest furniture store.
We also see this approach with electronics manufacturers and banks, although many are still catching up. E-commerce, of course, is probably the most natural area for this type of monitoring.
Another question: "How would IT report resilience scores to the business?"
A central digital operations center would be a great place to have joint visibility across the business. If you're enabling in-store payments or other critical processes, everyone in the business should care about that. A central monitoring system should show how the business is doing because everything is interconnected.
You can establish SLAs (which are typically vendor-centric) or experience level objectives (XLOs). For example, you could set an objective that your website should respond in less than three seconds for users in every region. If you're an American company, you could monitor all 50 states. If you're a pan-European company, you could monitor all 40 countries where you operate, ensuring that every customer can access your website within three seconds, and that it’s 99.99% available for them.
Your mobile applications should also allow customers to complete their tasks, whether it’s buying something, filling a prescription, or transferring money in a bank. You would monitor these XLOs in real-time, connecting IT performance to business units and the overall business.
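To make the idea of an XLO concrete, here is a minimal Python sketch of how the example objective above (under three seconds, 99.99% available, in every region) could be checked against monitoring results. Everything named here is hypothetical and for illustration only: the `ExperienceObjective` class, the `evaluate_region` and `failing_regions` functions, and the sample data are assumptions, not part of any Catchpoint product or API. It also assumes the strict reading of "every customer", so a region passes only if its slowest successful check is under the threshold.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ExperienceObjective:
    """Hypothetical XLO; default thresholds come from the example in the talk."""
    max_response_seconds: float = 3.0
    min_availability: float = 0.9999


def evaluate_region(samples: List[Optional[float]],
                    xlo: ExperienceObjective) -> dict:
    """Evaluate one region's checks against the objective.

    Each sample is a response time in seconds, or None if the check failed.
    """
    if not samples:
        raise ValueError("no samples for region")
    successes = [s for s in samples if s is not None]
    availability = len(successes) / len(samples)
    slowest = max(successes) if successes else None
    # Assumption: "every customer" means even the slowest response must pass.
    fast_enough = slowest is not None and slowest < xlo.max_response_seconds
    return {
        "availability": availability,
        "slowest_seconds": slowest,
        "meets_xlo": fast_enough and availability >= xlo.min_availability,
    }


def failing_regions(results_by_region: Dict[str, List[Optional[float]]],
                    xlo: ExperienceObjective) -> List[str]:
    """Return the regions that miss the objective, sorted by name."""
    return sorted(
        region
        for region, samples in results_by_region.items()
        if not evaluate_region(samples, xlo)["meets_xlo"]
    )


if __name__ == "__main__":
    # Made-up sample data: response times in seconds, None = failed check.
    checks = {
        "London": [1.2, 0.9, 2.1],
        "Lagos": [1.0, None, 1.4],
        "Austin": [3.5, 1.0],
    }
    print(failing_regions(checks, ExperienceObjective()))
```

In practice, a team might prefer a percentile such as p95 over the slowest response, since a single outlier would otherwise fail a whole region. That choice is part of what IT would agree on with the business.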
Another question is: "Monitoring is already in place and services are stabilized, so how do you convince the business to reinvest savings back into IT resources?"
That's a great question. In every department, you'll find some savings, and the finance team may say, "Thank you very much; we’ll add that to our profit line." The opportunity here is to stop thinking as just an IT person and to think like the business. Go back to the triangle I showed earlier, and explain that by investing in the right monitoring systems and technologies, along with the right innovations, you can impact revenue, experience, and trust.
If you make this investment—say, $1,000,000 in IT systems—and you deliver an improved experience to your customers, you might see gains in customer loyalty, repeat purchases, etc., resulting in an additional $5,000,000 in revenue. If you propose investing 1 to get 5, any CFO will go for it as long as your numbers are reasonable.
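The "invest 1 to get 5" arithmetic can be written down as a small Python sketch. The function names `return_multiple` and `net_gain` are hypothetical, and the figures are the illustrative ones from the answer above, not real results. Like the talk, the sketch treats the added revenue as the return on the investment.

```python
def return_multiple(investment: float, added_revenue: float) -> float:
    """How many dollars of added revenue come back per dollar invested."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return added_revenue / investment


def net_gain(investment: float, added_revenue: float) -> float:
    """Added revenue minus the amount invested."""
    return added_revenue - investment


if __name__ == "__main__":
    investment = 1_000_000      # illustrative figure from the talk
    added_revenue = 5_000_000   # illustrative figure from the talk
    print(return_multiple(investment, added_revenue))  # 5.0
    print(net_gain(investment, added_revenue))         # 4000000
```

The point of putting numbers like these in front of a CFO is that the assumptions behind the added revenue, such as loyalty and repeat purchases, need to be reasonable and defensible.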
The last question I see is from Marcus: "We use Google 360. Is that data useful for further analysis here?"
Well, Google 360 is just one application, right? Yes, you can monitor the productivity of your users using Google 360. But that’s just one of the many applications we discussed. Most customers have hundreds of applications.
So, any application your team uses can be monitored. This is really application-agnostic. It doesn’t matter which tools each group is using. You can have a dashboard where you can see how finance is doing versus how the productivity team is doing, or you can monitor by application group. For example, how are people using ERP compared to those using office applications like Google, Teams, or Microsoft?
So, to the extent that your business relies on digital processes and software to operate, the concept of user experience in a digital operations center is absolutely applicable.
And I think that’s all the questions we have for today.
With that, I want to thank the VIP team and every attendee for your time. I sincerely hope this was useful, and we're happy to help you in any way we can.
Have a fantastic rest of your week, and thank you so much.