How to improve website performance with multi-dimensional data
Metrics, metrics everywhere... a gauge here, a counter there... milliseconds, percentages... a list of variables running into pages... what is fast, what is slow...? How on earth is one to know...?
Today we are surrounded by all manner of variables, of differing importance, each serving its own purpose in the measurement of web performance. Some of these are atomic, independent metrics, whereas others are aggregated, dependent ones. An important example of the latter is “Time to First Byte”.
TTFB
Time to First Byte, more commonly called TTFB, is “the length of time between a client making an HTTP request and the client receiving the first byte of the request’s response.” In other words, TTFB is the price the end user must pay to load each resource on the page, especially the first resource on a new domain. A higher TTFB translates to higher latency, which is bad for user experience, and that is what makes this a key metric. Google’s recommendation for TTFB is 200ms at most.
TTFB is a duration that spans several stages in the lifecycle of a resource request. This is particularly important for the very first request of the page, which loads the HTML itself (no HTML code = no page to process and display). A good TTFB means every one of those stages was fast.
A poor TTFB can have several causes, and usually points to a weakness on the infrastructure side. Let’s now split TTFB into the atomic metrics it is composed of, to get at those root cause(s):
Diving a little deeper into each of these metrics:
- DNS resolving time: Time to convert a domain name, for instance www.catchpoint.com, to its IP address: 64.79.149.76.
- Connect time: Time for the web browser to establish a connection to the server.
- SSL time: TLS/SSL certificate negotiation (secured HTTP; effectively mandatory for HTTP/2, since browsers only support it over TLS). Usually, SSL time is included in Connect time.
- Send time: Time to send the request to the web server.
- Wait time: Processing time on the server side, before responding to the web browser.
NB: Steps 1 to 3 only occur for the first request to a new domain.
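The decomposition above can be sketched as a simple calculation. A minimal Python sketch, where the phase values are hypothetical milliseconds chosen for illustration, not real measurements:

```python
# Illustrative breakdown of TTFB into its atomic phases.
# All values are hypothetical milliseconds.
phases_ms = {
    "dns": 25,      # DNS resolving time
    "connect": 30,  # TCP connection establishment
    "ssl": 40,      # TLS negotiation (first request to a domain only)
    "send": 5,      # sending the request to the server
    "wait": 310,    # server-side processing before the first byte
}

ttfb_ms = sum(phases_ms.values())
slowest = max(phases_ms, key=phases_ms.get)

print(f"TTFB: {ttfb_ms} ms")         # TTFB: 410 ms
print(f"Dominant phase: {slowest}")  # Dominant phase: wait
```

In this (made-up) breakdown the Wait phase dwarfs everything else, which is exactly the pattern the next section investigates.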
The above example highlights something interesting: Wait Time is relatively large, which raises the question, why?
Wait Time encompasses all the steps that must take place on the server before it has what it needs to send back that first byte of data to the user. While Wait Time can involve many things, it can largely be broken down using Server Timing.
Server Timing
Server Timing is a set of headers that can be added to your application’s responses to expose timing data in the browser’s developer tools, revealing what makes up the Wait Time. It can include things like CPU time, database time, file server access time, and anything else you choose.
For example:
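The header uses the comma-separated entry syntax defined in the W3C Server Timing specification. A minimal Python sketch that parses such a header; the metric names (`db`, `app`, `cache`) are hypothetical:

```python
def parse_server_timing(header: str) -> dict:
    """Parse a Server-Timing header into {metric: {"dur": float, "desc": str}}."""
    metrics = {}
    for entry in header.split(","):
        parts = [p.strip() for p in entry.strip().split(";")]
        name, attrs = parts[0], {}
        for attr in parts[1:]:
            key, _, value = attr.partition("=")
            value = value.strip('"')
            attrs[key] = float(value) if key == "dur" else value
        metrics[name] = attrs
    return metrics

# Hypothetical header: database lookup, app logic, and a cache read.
header = 'db;dur=53, app;dur=47.2, cache;desc="Cache Read";dur=23.2'
timings = parse_server_timing(header)
print(timings["db"]["dur"])      # 53.0
print(timings["cache"]["desc"])  # Cache Read
```

Each entry is a metric name followed by optional `dur` (duration in milliseconds) and `desc` (human-readable description) parameters, which is what the browser’s developer tools render in the timing panel.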
When it comes to performance, especially over the web, a large part of the server time is due to CDN behaviour.
For example, some CDNs provide very useful Edge- and Origin-related timings, alongside Cache Hit/Miss information. So if the Wait Time is relatively long, was it due to a Cache Miss (meaning the request had to go back to the origin)? If so, how long did it take to reach the Edge and then travel on to the Origin?
This is how this looks in the headers:
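The exact metric names vary by CDN, but a response carrying cache status and Edge/Origin timings might look something like this (a hypothetical fragment, not any specific CDN’s output):

```
HTTP/2 200
content-type: text/html
server-timing: cdn-cache; desc=MISS, edge; dur=12, origin; dur=284
```

Here a single glance tells you the request missed the cache and spent most of its Wait Time at the origin.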
And this is how that looks in developer tools:
Imagine being able to compare and contrast this wide and varied set of information within your observability platform. Catchpoint’s Insights feature enables our customers to gather exactly this kind of performance data for rich analysis. Let's break down some examples to see why it matters.
Multi-dimensional analysis
Utilising Catchpoint’s Insights feature, it is possible to extract information from headers, browser APIs, HTML, JSON, text, etc., store it in Catchpoint’s database, and use it like any other piece of data. This includes access from your dashboards, alerts, long-term trending and deep-dive analysis.
Here is a table on CDN data showing default metrics alongside the additional Insights in a single view:
These insights are not restricted to the Server Timing header. Other useful information provided here includes, for instance, which Edge server handled the request. This is gold, since it lets you quickly ascertain whether performance is slow because the Edge server in use is located too far from where the request was made.
Take a look at the following Catchpoint dashboard widget, which shows performance time alongside the edge PoP location. In this case, performance from Seattle was over 1 second, very likely due to the fact that the edge server handling the request was located in Perth, Australia!
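To see why edge distance matters, a back-of-the-envelope calculation: light in fibre travels at roughly two-thirds the speed of light in a vacuum (about 200,000 km/s), so a Seattle-to-Perth round trip has a hard physical floor, and TCP plus TLS setup costs several such round trips before the first byte even arrives. A sketch using approximate city coordinates:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in km (Earth radius 6371 km)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Approximate coordinates of Seattle and Perth.
seattle = (47.61, -122.33)
perth = (-31.95, 115.86)

distance = haversine_km(*seattle, *perth)
rtt_floor_ms = 2 * distance / 200_000 * 1000  # fibre ≈ 200,000 km/s

print(f"Distance: {distance:.0f} km")                 # ≈ 14,900 km
print(f"RTT floor: {rtt_floor_ms:.0f} ms per round trip")  # ≈ 149 ms
```

With a theoretical floor of roughly 150 ms per round trip, and DNS, TCP and TLS each costing at least one, a one-second-plus response from a Perth edge server to a Seattle user is entirely plausible even before the server does any work.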
Enable better and faster triage
The opportunities with Insights are as broad as your needs: pull data from external APIs… even the PerfScore from a Meraki device!
Not to mention Prometheus data, your own server names, custom metrics, navigation API data, SNMP polling data, etc. Knowledge is power.
You can also pull in additional data from RUM, such as Revenue and Conversions, allowing you to approximate the net revenue lost due to a specific performance degradation.
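A sketch of how such an approximation might work. The figures and the simple linear model below are entirely hypothetical (real conversion behaviour is rarely linear), but they illustrate the arithmetic:

```python
def estimated_revenue_loss(sessions: int,
                           baseline_conversion: float,
                           degraded_conversion: float,
                           avg_order_value: float) -> float:
    """Rough revenue impact of a conversion-rate drop during a slowdown."""
    lost_conversions = sessions * (baseline_conversion - degraded_conversion)
    return lost_conversions * avg_order_value

# Hypothetical figures: 50,000 sessions during the degradation window,
# conversion falls from 2.4% to 2.1%, average order value $80.
loss = estimated_revenue_loss(50_000, 0.024, 0.021, 80.0)
print(f"Estimated loss: ${loss:,.0f}")  # Estimated loss: $12,000
```

Feeding real RUM conversion and revenue figures into this kind of calculation is what turns a latency chart into a business conversation.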
The data world is your oyster
The data world is your oyster with Insights: bring in additional data to enable better and faster triage and make empowered, insight-driven decisions to improve overall digital experience for your users.
Why not try out Insights for yourself? Sign up for a free 14-day trial of Catchpoint here.