Tracing requests in distributed systems
Skelia’s Technology Architect Serhii Hromovyi took part in the AWS User Group Ukraine Meetup and shared his experience with distributed systems and X-Ray. We’ll highlight the most important points of his presentation.
Distributed systems are a collection of independent apps in a single coherent system, and they look like one system in the eyes of end users.
Why have they emerged and gained popularity?
- Powerful yet affordable microprocessors allow for low-cost and high-reliability systems
- Advancements in communication technology made it easier for systems distanced from each other to connect
Advantages of distributed systems
- Cost-effectiveness compared to monolith apps. Initial costs are higher, but at scale, it’s way cheaper to work with distributed systems.
- High reliability. When one component fails to work, the system will still partially function.
- Horizontal scalability. You can increase the number of components, as well as scale individual components separately, whereas in monolith systems, you’ll have to scale everything.
- Improved performance. Distributed computing and distributed load between the components allow for better performance and faster response times.
Disadvantages of distributed systems
- Security concerns. It’s harder to secure multiple components than it is with a single app.
- Initial costs. As mentioned above, distributed systems require a solid initial investment.
- Complexity. There might be dozens or even hundreds of components, which makes it harder to manage them and detect errors. Plus, if there’s no documentation, then it’s a developer’s nightmare.
- Network issues. The network isn’t stable all the time, which results in additional developer work.
The challenges of distributed systems
- Heterogeneity. Components may be developed in different languages and run on different operating systems.
- Open nature. Distributed systems should be open to changes.
- Security. The more components, the more vulnerabilities.
- Scalability. The architecture should support scalability; otherwise, there’s no point in building a distributed system. (You might have seen microservice architectures that appear to be just distributed monoliths.)
- Error handling. It’s harder to predict and detect errors because they are somewhere in one or several components while others function as usual. Plus, the analysis of one component is not sufficient for detecting a problem.
- Concurrency. It’s hard to control when concurrent requests modify data or use a common resource.
- Reliability. The more components there are, the higher the chances there will be some issues. There should be several copies of components in case a component fails to respond. Also, if there’s no response, we should ensure that data isn’t lost and have several database replicas.
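The reliability point above (keep several copies of a component in case one fails to respond) can be illustrated with a minimal failover sketch. It is not taken from the presentation: the replica functions, the `call_with_failover` helper, and the sample request are hypothetical names used only for illustration, and a real system would add timeouts and backoff.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica returned a response."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            # This copy did not respond; remember the error and try the next one.
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replica(s) failed for {request!r}")


# Hypothetical replicas: the first one is down, the second one is healthy.
def broken_replica(request: str) -> str:
    raise ConnectionError("replica down")


def healthy_replica(request: str) -> str:
    return f"handled: {request}"


result = call_with_failover([broken_replica, healthy_replica], "GET /orders/42")
print(result)
```

The caller sees a normal response even though one copy failed, which is the partial-failure behavior the list describes.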
Tracing is the method of tracking requests in distributed systems. For example, we can detect which part of the system is the most loaded thanks to tracing.
The benefits of the distributed tracing method:
- Productivity. You don’t have to deal with logs from different services and analyze which of them are relevant to the error.
- Improved cross-team communication. Since different teams are responsible for different components, it can be hard to manage who will fix the issue. Tracing tools can solve this problem.
- Flexible implementation. There are lots of frameworks that ease the process.
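A correlation ID is an identifier that is attached to a request and passed along to every component that handles it, so the request’s path can be followed across the system. X-Ray is the app performance monitoring system that we use.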
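To make the correlation ID idea concrete, here is a minimal sketch that does not use X-Ray at all. The header name `X-Correlation-ID` is a common convention rather than something from the talk, and the `order_service` and `payment_service` functions are hypothetical stand-ins for real components.

```python
import uuid

# Assumed header name; teams pick their own convention.
CORRELATION_HEADER = "X-Correlation-ID"


def ensure_correlation_id(headers: dict) -> dict:
    """Return a copy of the headers that carries a correlation ID.

    An incoming ID is reused; a new one is generated only at the entry point.
    """
    out = dict(headers)
    out.setdefault(CORRELATION_HEADER, str(uuid.uuid4()))
    return out


def log_line(service: str, headers: dict, message: str) -> str:
    """Prefix every log line with the correlation ID so logs can be joined later."""
    return f"[{headers[CORRELATION_HEADER]}] {service}: {message}"


def payment_service(headers: dict) -> list:
    headers = ensure_correlation_id(headers)
    return [log_line("payments", headers, "charging card")]


def order_service(headers: dict) -> list:
    headers = ensure_correlation_id(headers)
    lines = [log_line("orders", headers, "received request")]
    # Pass the same headers downstream so the ID travels with the request.
    lines += payment_service(headers)
    return lines


lines = order_service({})
for line in lines:
    print(line)
```

Both log lines start with the same ID, so filtering logs by that value yields everything that happened for one request, whichever service wrote it.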
The benefits of AWS X-Ray
- It’s already integrated into AWS
- It supports several programming languages
- It offers a convenient service map
- It’s a SaaS, which means that you don’t need to waste time on DevOps and support
- It’s easy to adopt
- It has a powerful profiler
X-Ray key concepts
- Segments and subsegments. Each segment is the result of request processing and contains one or more subsegments with data.
- Traces. The system traces requests and aggregates the segments created when processing each request.
- Service graphs. The system builds a graph of how the services that produce segments communicate with each other. This graph is a JSON document, and it is visualized in the service map.
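The concepts above can be sketched as plain data, without the X-Ray SDK. The sketch assumes the documented trace ID layout (version `1`, eight hex digits of epoch time, 24 random hex digits) and only a handful of segment document fields; real segment documents carry more. The service names `orders-api`, `payments-api`, and `DynamoDB` are made up for illustration.

```python
import json
import os
import time
from collections import defaultdict


def new_trace_id(now=None) -> str:
    """Build a trace ID: version, epoch seconds in hex, 96 random bits in hex."""
    epoch = int(time.time() if now is None else now)
    return f"1-{epoch:08x}-{os.urandom(12).hex()}"


def new_id() -> str:
    """Segment and subsegment IDs are 16 hex digits."""
    return os.urandom(8).hex()


def make_subsegment(name: str, start: float, end: float) -> dict:
    return {"name": name, "id": new_id(), "start_time": start, "end_time": end}


def make_segment(name, trace_id, start, end, subsegments=()) -> dict:
    """A segment describes the work one service did for one request."""
    return {
        "name": name,
        "id": new_id(),
        "trace_id": trace_id,
        "start_time": start,
        "end_time": end,
        "subsegments": list(subsegments),
    }


def group_into_traces(segments) -> dict:
    """A trace is the set of segments that share a trace ID."""
    traces = defaultdict(list)
    for seg in segments:
        traces[seg["trace_id"]].append(seg)
    return dict(traces)


trace_id = new_trace_id()
start = time.time()
segment = make_segment(
    "orders-api", trace_id, start, start + 0.12,
    [make_subsegment("DynamoDB", start + 0.01, start + 0.05)],
)
downstream = make_segment("payments-api", trace_id, start + 0.05, start + 0.11)
traces = group_into_traces([segment, downstream])
print(json.dumps(segment, indent=2))
```

Grouping by `trace_id` is a simplified picture of what the list describes: segments reported separately by different services are collected into one trace per request.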
The downsides of AWS X-Ray
- Trace ID is immutable in Lambda
- Not all services and languages are supported (it’s been five years but C# and PHP still don’t have the SDK)
- No SNS to SQS integration: X-Ray won’t create a trace in the SQS queue with the Lambda function (Amazon promises to solve this by the end of the year)
* You can look at the sample app by following the link: