Always Running – How Zignal’s engineering team maintains our applications 24/7
Getting immediate visibility into problems and bottlenecks in real-time applications can be a challenge. At Zignal we’ve embraced observability and DevOps culture to ensure we’re both preventing and reacting to problems well before our customers see them.
The role of software engineers has changed significantly in recent years with the explosion in cloud computing and containerization. Customers now demand real-time products, and the resulting move to continuous delivery has fostered a DevOps culture among software engineers. We’re no longer just responsible for writing code and unit tests; we’re committed to owning what we build, from the inception of an idea through to supporting it in production.
We use a combination of metrics, health check scripts running continuously in AWS Lambda functions, Datadog monitoring and PagerDuty to ensure we’re informed the moment we detect a problem that could affect our customers in any way.
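To make the idea of a health check script concrete, here is a minimal sketch of what one such check could look like as a Python Lambda handler. It is an illustration rather than our production code: the URL, the timeout, the result fields and the injectable `opener` parameter are all assumptions made for the example, and the step that forwards the result to a monitoring or paging service is deliberately left out.

```python
import json
import time
import urllib.error
import urllib.request

# Hypothetical endpoint and timeout, chosen only for this sketch.
HEALTH_URL = "https://example.com/health"
TIMEOUT_SECONDS = 5


def check_endpoint(url, timeout=TIMEOUT_SECONDS, opener=urllib.request.urlopen):
    """Issue one HTTP GET and return (healthy, latency_ms, detail)."""
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
            healthy = 200 <= status < 300
            detail = f"HTTP {status}"
    except (urllib.error.URLError, OSError) as exc:
        # Covers HTTP error statuses, DNS failures, refused connections
        # and timeouts.
        healthy = False
        detail = f"request failed: {exc}"
    latency_ms = (time.monotonic() - started) * 1000
    return healthy, latency_ms, detail


def lambda_handler(event, context, opener=urllib.request.urlopen):
    """Lambda entry point: run one check and return a structured result.

    The `opener` argument is not part of the Lambda contract; it exists
    so the check can be exercised without network access.
    """
    url = (event or {}).get("url", HEALTH_URL)
    healthy, latency_ms, detail = check_endpoint(url, opener=opener)
    result = {
        "url": url,
        "healthy": healthy,
        "latency_ms": round(latency_ms, 1),
        "detail": detail,
    }
    # One JSON line per run, so the outcome is easy to search in logs.
    print(json.dumps(result))
    return result
```

Invoked on a schedule, a function like this produces one structured record per run. A monitor can then alert on a failed check or on rising latency, which is what lets a page go out before customers notice anything.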
Getting paged by an automated system at 2 am can be daunting, especially when the message says a critical piece of the application, or a third-party web service, isn’t responding. Will I be able to fix it? How will I fix it? Can I reach out to others for help if I need it? To ensure that no one is ever overwhelmed by the challenge at hand, we place multiple individuals on call at the same time, each with expertise in a different functional area of the system. We also have engineering managers on call to help with communication and triage and to provide additional support if needed.
Over time we’ve realized just how empowering it is for the engineering team to own the app right through to production. We all understand much more of our system, and we build with great tests, metrics and monitors top of mind, so we’re armed with the information we need to avoid outages and minimize disruption for ourselves, our coworkers and, above all, our customers.
The Zignal Labs platform engineering team powers the collection, enrichment and indexing of billions of news stories, videos, social media posts and broadcast clips using cutting-edge big data technology. We’re proud that within a fraction of a second of a news story or social media update being published, it is streamed to our Command Center, newsroom and dashboards, empowering our customers to discover trends and gain insights the second they occur.