Start your free 14-day ContainIQ trial

How Speedscale Improves Cluster Health And Catches Errors As They Happen Using ContainIQ

March 24, 2022

In this case study, we highlight Speedscale, a venture-backed startup that used ContainIQ to improve cluster health and catch previously missed errors.

Nate Matherson
Co-founder
“I like using ContainIQ because in a single portal I can see Kubernetes infrastructure health alongside the application health all from the same package. Instead of installing separate logging, monitoring, and APM agents, everything is all in one place. In the first week, we found an error in the logs that we didn’t know was happening.” Ken Ahrens, co-founder and CEO of Speedscale.

Background on Speedscale

Speedscale is a fast-growing SaaS technology vendor that helps companies stress test their APIs with real-world scenarios. Speedscale allows its customers to collect and replay traffic without scripting. By simulating load and measuring performance, Speedscale helps companies like Chick-fil-A and Nylas mock third-party endpoints at a scale unavailable anywhere else but production. Speedscale is a venture-backed company with investors including Y Combinator, Sierra Ventures, Soma Capital, and Tech Square Ventures.

Challenges

Speedscale has the fortunate problem of a growing number of new enterprise customers, along with expectations from an engaged user base that relies on Speedscale for regular load testing. The Speedscale team deploys numerous cloud-based Kubernetes clusters to ensure its platform can scale to the size of its largest customers, and it was looking for a monitoring and observability platform built to scale with that growing infrastructure. Speedscale’s growing infrastructure and customer demands introduced a number of challenges:

  • Error creep often went unnoticed.
  • The company had outgrown open-source tooling and needed a single tool for core metrics, logging, events, and tracing.
  • Each Kubernetes cluster was monitored separately, and errors would fall through the cracks.
  • Instrumenting new microservices took time, and reducing the engineering time spent maintaining monitoring became a priority.
  • Maintaining a low-latency, performant experience for end-users was challenging as the company scaled its infrastructure.
  • The monitoring solution needed to remain cost-efficient and affordable.

Solution & Benefits

After learning about ContainIQ, a monitoring and tracing platform, Speedscale decided to deploy ContainIQ across the company’s Kubernetes infrastructure. With ContainIQ, it is easy to correlate metrics, logs, events, latencies, and traces. Using ContainIQ’s self-service onboarding, Speedscale installed ContainIQ on its own throughout the company’s multiple Kubernetes environments without having to install application packages or middleware, or configure each dashboard manually. It was a one-line install.

ContainIQ delivered five core features to Speedscale:

(1) Metrics: CPU and memory for pods and nodes; view limits and capacity, correlate to events, and alert on changes.

(2) Events: Kubernetes events dashboard, correlation to logs, and alerting (e.g., crash loops and evictions).

(3) Latency: monitor RPS and p95 and p99 latencies by microservice, including by URL path, with alerts.

(4) Logs: container-level log storage and search.

(5) Tracing: view all incoming and outgoing HTTP requests alongside metadata (e.g., the status code of the response, the latency of the request, and the pods and services involved in the request).
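
To make the latency metrics in (3) concrete, here is a minimal sketch of how RPS, p95, and p99 can be computed per URL path from raw request samples. It is an illustration only and does not represent ContainIQ's implementation or API: the function names, the `(path, latency_ms)` sample format, and the threshold are all hypothetical, and the nearest-rank percentile method is just one common choice.

```python
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of a sorted, non-empty list (pct is an int, 1-100)."""
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(sorted_values) + 99) // 100
    return sorted_values[rank - 1]


def latency_summary(samples, window_seconds):
    """Group (path, latency_ms) samples by path and report RPS, p95, and p99."""
    by_path = defaultdict(list)
    for path, latency_ms in samples:
        by_path[path].append(latency_ms)

    summary = {}
    for path, latencies in by_path.items():
        latencies.sort()
        summary[path] = {
            "rps": len(latencies) / window_seconds,
            "p95": percentile(latencies, 95),
            "p99": percentile(latencies, 99),
        }
    return summary


def slow_paths(summary, p95_threshold_ms):
    """Return the paths whose p95 latency exceeds a (hypothetical) alert threshold."""
    return sorted(p for p, stats in summary.items() if stats["p95"] > p95_threshold_ms)


if __name__ == "__main__":
    # Synthetic data: 100 requests to /checkout and 20 to /health in a 10 s window.
    samples = [("/checkout", ms) for ms in range(1, 101)] + [("/health", 5)] * 20
    summary = latency_summary(samples, window_seconds=10)
    print(summary)
    print(slow_paths(summary, p95_threshold_ms=50))
```

Tail percentiles such as p95 and p99 are used here rather than averages because a small share of slow requests on one path can be invisible in a mean while still affecting end users.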

Within the first day of using ContainIQ, Ken and his team caught their first error that would previously have been missed. Using ContainIQ’s logging dashboard, including its structured search, Speedscale caught a gRPC error that would have impacted end-user performance. It had not been caught before because this type of error only occurs after environments have been running for several days.

Using ContainIQ’s Kubernetes Events dashboard, Speedscale has been able to find issues, including OOMs, that had been previously missed by other tools.

And by leveraging ContainIQ’s latency dashboard, Speedscale has been able to monitor and alert on slowdowns on given paths and endpoints, again improving the end-user experience.

Finally, ContainIQ’s straightforward pricing model, based on node count and GB of log ingest, gave the entire team confidence about what monitoring would cost today and over the long term.

Why ContainIQ

Speedscale required an out-of-the-box solution that supports its unique environment: multiple Kubernetes environments with high demands for reliability and performance.

Having previously tried a number of open-source tools, including Prometheus, Speedscale knew that it needed a solution that could scale across all of its environments in a predictable and affordable way. It also needed a solution that could deliver application-level insights from the kernel and OS layer, helping to reduce the time spent on initial instrumentation and ongoing maintenance.

With ContainIQ, Speedscale is able to monitor core metrics, cluster-level and application logs, and HTTP latency and traces across all of its microservices. ContainIQ’s intelligent alerting features help Speedscale identify issues as they happen and enact fixes before end-user performance is impacted.

And because ContainIQ delivers an incredible amount of value at a predictable price point, Speedscale was able to confidently deploy ContainIQ across its entire environment, including clusters that had previously been ignored because of the unpredictable costs associated with other monitoring solutions.

Getting Started

To get started with ContainIQ, you can sign up directly on our website using the self-service flow without talking to anyone. You can also book a demo with a member of our team to learn more about how ContainIQ can help improve cluster health and observability at your company.

Speedscale is a paying customer of ContainIQ, and ContainIQ is a user of Speedscale's load testing services. We encourage you to explore using ContainIQ and Speedscale together.


Nate Matherson is the Co-founder & CEO of ContainIQ. An experienced entrepreneur and technologist, he has founded multiple venture-backed companies and is a two-time Y Combinator Alum. Nate is also an active angel investor.