Correlated Telemetry-Driven Throughput Analysis in Distributed Pipelines
Abstract
Throughput degradation remains a persistent challenge in distributed data pipelines, particularly as systems scale across multiple nodes and operate under dynamic workload conditions. Modern distributed pipelines rely on parallel execution and shared input/output (I/O) resources, making throughput sensitive to coordination overhead, resource contention, and variability in stage-level execution behavior. Existing performance analysis approaches commonly depend on isolated monitoring mechanisms, analyzing metrics, logs, or traces independently. Although these mechanisms provide partial visibility, their separation limits the ability to systematically relate throughput variations to execution flow and resource interactions. A major limitation of current approaches is the absence of correlated telemetry across pipeline stages: metrics capture aggregated system behavior without execution context, logs record discrete events without continuity, and traces expose execution paths without sufficient insight into resource-level activity. As cluster size increases, coordination patterns and resource dependencies evolve across nodes, further complicating throughput analysis based on isolated signals. This work addresses these limitations by examining a correlated telemetry-driven approach to throughput analysis in distributed pipelines. The proposed direction focuses on the structured integration of metrics, logs, and traces using shared execution identifiers and temporal alignment. Through experimental analysis across varying cluster sizes, correlated telemetry is used to examine how execution flow, coordination behavior, and I/O activity collectively influence throughput. The objective is to establish an analysis framework that supports empirical characterization of throughput dynamics based on observed execution behavior, enabling systematic evaluation of scalability challenges in distributed data pipelines.
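The integration strategy described above — joining metrics, logs, and traces on a shared execution identifier and aligning them in time — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the record schema, field names (`exec_id`, `timestamp`), and the fixed alignment window are assumptions introduced for the example.

```python
from dataclasses import dataclass

@dataclass
class Record:
    """One telemetry record from any of the three signals."""
    exec_id: str      # shared execution identifier propagated across signals (assumed field)
    timestamp: float  # seconds since epoch
    payload: dict     # signal-specific fields (metric sample, log event, span data)

def correlate(metrics, logs, traces, window=0.5):
    """Group metric and log records with the trace span that shares their
    exec_id, keeping only records whose timestamps fall within `window`
    seconds of the span (a simple form of temporal alignment)."""
    spans = {t.exec_id: t for t in traces}   # one span per execution id, for this sketch
    correlated = {}
    for rec in metrics + logs:
        span = spans.get(rec.exec_id)
        if span is None:
            continue  # no execution context for this record; cannot correlate
        if abs(rec.timestamp - span.timestamp) <= window:
            entry = correlated.setdefault(rec.exec_id, {"span": span, "signals": []})
            entry["signals"].append(rec)
    return correlated
```

In a real pipeline the identifier would typically be injected at the entry stage and propagated with each task, so that stage-level I/O metrics and coordination events can be attributed to a specific execution path rather than aggregated across the cluster.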