In this brave new world of distributed systems, we are entrusted with keeping the infrastructure up and running. The challenge lies in monitoring both the services themselves and the space between them. We face non-determinism: sometimes we can’t tell whether our system is up, down, or partially working, and every failure is a task…

Continue reading “Building resilient Distributed Systems at scale”