Quality in simultaneous interpreting is reframed not as maximal cognitive effort, but as a regulated equilibrium between concurrent processes — operationalised through a second-order Integrated Cognitive Balance Index (ICBI).
Simultaneous interpreting is intensive bilingual language use in which comprehension and production occur at once and stay tightly interconnected. Quality, the article argues, cannot be reduced to processing speed or task complexity — it arises from the interpreter's ability to sustain a dynamic equilibrium of cognitive load across competing tasks.
Within Interpreting Studies, focus has shifted from cumulative load to moment-to-moment control.
Early effort-based models treated interpreting as the coordination of listening, memory, production and monitoring efforts drawing on a shared pool of cognitive resources. Later research showed that performance failures rarely stem from excessive strain on a single component; they emerge from transient imbalances in effort, triggered by fluctuations in speech rate, information density or discourse structure.
Temporal indicators such as the ear–voice span (EVS), pausing behaviour and segmentation are informative because they reveal how interpreters align comprehension and production under time pressure. Yet comparable EVS values can accompany markedly different accuracy, fluency and pragmatic adequacy — temporal coordination alone does not explain variation in quality.
Multidimensional accounts converge on a shared conclusion: interpreting quality depends not on how much effort is expended, but on how cognitive resources are managed. Anticipatory processing and semantic compression sustain performance under high informational density, while inhibition and monitoring promote cognitive economy — preventing overexplicitation, disfluency and pragmatic drift.
Emotional regulation, once treated as secondary, is now recognised to directly affect attentional stability and monitoring efficiency, particularly in politically or ethically sensitive settings. For diplomatic, humanitarian or crisis work, keeping emotion and cognition in balance is part of managing load, not an optional extra.
Despite these advances, quality assessment remains largely outcome-focused (error typologies, accuracy metrics and expert evaluation) and is slow to make the underlying cognitive processes visible.
Load is reframed from a static limit on processing capacity to a dynamic, linguistically mediated and context-sensitive phenomenon: interpreters structure, redistribute and regulate effort during real-time mediation.
Setton's cognitive-pragmatic approach highlights the interpreter's capacity to generate predictive hypotheses about discourse structure and communicative intent. Anticipation is less a conscious skill than a basic mechanism that stabilises temporal coordination and reduces monitoring effort — especially in syntactically predictable or rhetorically organised discourse.
Ear–voice span is redefined as a linguistically grounded measure of equilibrium, not merely delay. Its variability reflects how interpreters coordinate syntactic projection, semantic integration and output planning. Shorter EVS is not inherently superior — excessive compression can signal strain rather than efficiency, particularly in remote interpreting.
Expert interpreters minimise informational redundancy via semantic compression, paraphrasing and syntactic restructuring — reducing load while preserving pragmatic adequacy. This is bound up with inhibition: suppressing conflicting lexical, syntactic and pragmatic representations to keep the target text coherent, observable in fewer false starts, self-repairs and over-explicitation.
When emotional salience rises, temporal coordination becomes less stable and monitoring more effortful. Experienced interpreters draw on mitigation, modalisation and shifts in perspective as stabilising mechanisms — sustaining pragmatic adequacy while limiting the disruptive impact of affective load. Emotional regulation is integral to balance, not external to it.
Each index captures a specific dimension of interpreting cognition while remaining grounded in observable linguistic behaviour. They are not independent predictors competing with one another — what matters for quality is the balance between them. Formulas are reproduced exactly as in the article.
EVS is the temporal distance (in seconds) between the onset of a source-language segment and the corresponding target output, computed from time-aligned transcripts in ELAN. It indexes attentional coordination and anticipation: consistent values signal proficient temporal regulation, whereas heightened variability signals instability or elevated monitoring cost.
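As a rough illustration of the EVS definition above, the following sketch derives per-segment spans and their mean and standard deviation from paired onset times. The function names and the onset values are hypothetical; the sketch assumes the alignment has already been exported from ELAN as two parallel lists of onsets in seconds.

```python
from statistics import mean, stdev

def ear_voice_spans(source_onsets, target_onsets):
    """Per-segment EVS in seconds: target onset minus source onset."""
    if len(source_onsets) != len(target_onsets):
        raise ValueError("onset lists must be aligned segment by segment")
    return [t - s for s, t in zip(source_onsets, target_onsets)]

def evs_summary(source_onsets, target_onsets):
    """Mean EVS and its sample standard deviation (a simple variability measure)."""
    spans = ear_voice_spans(source_onsets, target_onsets)
    return {"mean": mean(spans), "sd": stdev(spans)}

# Hypothetical onsets (seconds) for four aligned segments.
source = [0.0, 4.0, 9.0, 15.0]
target = [2.5, 7.0, 12.0, 17.5]
summary = evs_summary(source, target)
print(summary)
```

A low standard deviation here would correspond to the "consistent values" the definition associates with proficient temporal regulation.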
BKN measures the consistency of processing under varying demands, consolidating EVS variability, pause density and self-repair frequency into a single composite. Lower BKN indicates a more balanced distribution of cognitive resources; higher values reflect instability and overload. The index thereby operationalises the notion of dynamic equilibrium.
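The article's BKN formula is not reproduced in this summary, so the sketch below is only one possible way to consolidate the three named inputs. It assumes, purely for illustration, an equal-weight mean of z-scores computed across performances; the function names and data are hypothetical.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardise values across performances (population SD)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def instability_composite(evs_sd, pause_density, repair_freq):
    """Equal-weight mean of z-scores per performance.

    ASSUMPTION: this weighting is illustrative and is not the article's
    BKN formula. Higher values mean less stable processing.
    """
    columns = [zscores(evs_sd), zscores(pause_density), zscores(repair_freq)]
    return [mean(row) for row in zip(*columns)]

# Hypothetical measurements for three performances.
evs_sd = [0.4, 0.7, 1.0]          # EVS variability (s)
pause_density = [2.0, 4.0, 6.0]   # pauses per minute
repair_freq = [1.0, 2.0, 3.0]     # self-repairs per minute
scores = instability_composite(evs_sd, pause_density, repair_freq)
print(scores)
```

Standardising first keeps inputs measured in different units from dominating the composite, which is why z-scores were chosen for this sketch.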
KCE reflects the interpreter's ability to reduce cognitive cost through semantic compression and efficient reformulation. High KCE signifies efficient suppression of redundancy and regulated paraphrase, sustaining communicative adequacy while cutting superfluous processing work.
ECE measures the ability to regulate emotional load without destabilising cognitive processing, combining EVS stability with expert-coded emotional-modulation markers (intonation shifts, attenuation, mitigation). Higher ECE corresponds to greater emotional–cognitive stability, especially in emotionally charged discourse.
The contribution is not yet another isolated micro-measure, but the way familiar indicators are assembled into a balance-oriented instrument. ICBI gains its explanatory value from the joint behaviour of its components and cannot be reduced to any single parameter — including EVS. Weights w₁–w₄ were determined empirically through regression analysis, and the index was related to the Quality Index (Q) to test the predictive link between cognitive balance and interpreting quality.
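The summary does not give the regression specification behind w₁–w₄. One plausible reading, assumed here, is an ordinary least-squares fit of Q on the four indices. The sketch below solves the normal equations in plain Python on synthetic data with known coefficients, so the recovered weights can be checked; all names and values are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_weights(rows, q):
    """OLS fit of q on (EVS, BKN, KCE, ECE); returns [intercept, w1, w2, w3, w4]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, q)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic performances: (EVS, BKN, KCE, ECE). Values are made up.
rows = [(3.0, 0.5, 0.6, 0.7), (2.5, 0.4, 0.7, 0.8), (3.5, 0.8, 0.4, 0.5),
        (2.8, 0.3, 0.8, 0.6), (3.2, 0.6, 0.5, 0.9), (2.6, 0.7, 0.9, 0.4)]
# Synthetic Q built from known coefficients so the fit can be verified.
q = [5.0 - 0.5 * e - 1.0 * b + 2.0 * k + 1.0 * c for e, b, k, c in rows]

weights = fit_weights(rows, q)
# ASSUMPTION: the index is the weighted sum of the four components.
icbi = [sum(w * x for w, x in zip(weights[1:], row)) for row in rows]
print(weights)
```

In this sketch the signs of the weights come out of the data rather than being fixed in advance, so indices that depress quality receive negative weights.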
The model links observable linguistic and temporal patterns in the output to underlying regulatory processes. Quality is an emergent outcome of regulated interaction — not a static result of linguistic transfer.
Fig. 1 — The model represents quality as an emergent outcome of regulated interaction between temporal coordination, processing stability, cognitive economy and emotional–cognitive regulation, integrated in the ICBI and examined against expert-based quality assessment.
Quantitative measurement is integrated with qualitative expert evaluation, combining a large corpus of authentic performances with a controlled longitudinal training experiment — so the balance construct is tested under both naturalistic variation and controlled instructional conditions.
Corpus. 255 annotated transcripts of English–Ukrainian and Ukrainian–English simultaneous interpreting from international institutional settings (UN, EU, NATO, OSCE), diplomatic and humanitarian events, economic forums and academic conferences (2014–2025) — roughly 800k word tokens spanning diverse discourse types, speech rates and emotional loads. All recordings were publicly available and legally accessible.
Trainees. 32 MA students of translation and interpreting at Taras Shevchenko University of Kyiv, aged 20–22 (M = 21.0, SD = 0.7; 27 female, 5 male). All were native Ukrainian speakers with English as their primary working foreign language (B2–C2). This training component lets the model be examined as a trainable cognitive system, not a static construct.
Materials. English source speeches across institutional, diplomatic, humanitarian and war-related briefings, mixing neutral segments with higher-affect passages; controlled delivery (~130 wpm; ~7-minute segments). Baseline and endline used parallel texts outside the weekly set.
Sessions. Four days/week, ~2 hours/day across two academic semesters; 15-minute SI blocks with fixed breaks, warm-up and post-block self-monitoring notes, recorded under a consistent headset-based setup.
Quality. Two independent experts rated each performance on EMT-aligned 10-point scales — semantic accuracy, terminological adequacy, pragmatic equivalence, fluency & delivery — combined into a composite Quality Index (Q). Inter-rater agreement: Cohen's κ > 0.80.
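For readers unfamiliar with the agreement statistic, the sketch below computes unweighted Cohen's κ for two raters. The summary does not say whether the article used a weighted variant for its 10-point scales, so the unweighted form is an assumption, and the ratings are invented.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal distribution.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical scores from two experts on ten performances.
expert_1 = [8, 7, 9, 8, 6, 7, 8, 9, 7, 8]
expert_2 = [8, 7, 9, 8, 6, 7, 8, 9, 8, 8]
kappa = cohens_kappa(expert_1, expert_2)
print(round(kappa, 3))
```

The function is undefined when chance agreement equals 1 (both raters use a single identical category throughout); a production version would need to handle that case.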
E was coded directly in the interpreter's output from aligned transcript and audio by the same two expert raters. Affective disruption markers were defined as observable departures from stable professional delivery that cluster around emotionally salient content and plausibly increase monitoring demands. Three recurrent patterns were coded — prosodic spikes (abrupt tension, raised pitch, unmotivated emphatic stress); affect-linked hesitation/repair (long pauses > 2 s, clustered self-repairs, audible non-lexical tokens next to affect-loaded wording); and affect-driven pragmatic drift (unmotivated intensification or over-mitigation). Each segment received an intensity score: 0 (none), 1 (mild/transient), 2 (marked). E is the normalised density of these disruptions relative to segment length — higher E means more frequent or stronger affective interference; lower E means steadier emotional–cognitive regulation.
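The coding scheme above can be turned into a density figure along the following lines. The normalisation unit is not stated in the summary, so this sketch assumes summed intensity per 100 words of output; the function name and the segment data are hypothetical.

```python
def affective_density(segments, per_words=100):
    """Summed intensity scores (0, 1 or 2) per `per_words` words of output.

    ASSUMPTION: normalising by word count is illustrative; the article may
    normalise by duration or another measure of segment length.
    """
    for score, _ in segments:
        if score not in (0, 1, 2):
            raise ValueError("intensity must be 0 (none), 1 (mild) or 2 (marked)")
    total_words = sum(words for _, words in segments)
    if total_words == 0:
        raise ValueError("segments contain no words")
    total_intensity = sum(score for score, _ in segments)
    return per_words * total_intensity / total_words

# Hypothetical coded segments: (intensity score, segment length in words).
coded = [(0, 120), (1, 80), (2, 100), (0, 100)]
e_value = affective_density(coded)
print(e_value)
```

Under this assumption a higher value means more frequent or stronger affective interference, matching the direction described for E.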
Across the dataset, EVS clustered in a narrow range (mean 2.9 s, SD 0.7), BKN showed moderate variability, KCE a right-skewed distribution, and ECE the greatest dispersion — particularly in crisis-related and humanitarian discourse.
Pearson correlation coefficients between each cognitive indicator and the composite Quality Index.
Correlations with Q were negative for EVS and BKN (excessive delay and processing instability depress ratings), strongly positive for KCE (semantic compression as a central feature of expert behaviour) and positive for ECE (emotional–cognitive equilibrium yields greater pragmatic stability). No single indicator is a sufficient proxy for overall regulation.
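A minimal sketch of the correlation step, using invented values rather than the article's data, shows how the sign of Pearson's r separates indicators that rise with Q from those that fall with it. The function and variable names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 items")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented values for five performances (not the article's data).
quality = [6.0, 6.5, 7.5, 8.0, 9.0]
kce = [0.4, 0.5, 0.6, 0.7, 0.8]   # rises with quality
bkn = [0.9, 0.7, 0.6, 0.4, 0.3]   # falls as quality rises

r_kce = pearson_r(kce, quality)
r_bkn = pearson_r(bkn, quality)
print(r_kce, r_bkn)
```

The function divides by zero if either sample is constant, which a fuller implementation would need to guard against.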
Mean Quality Index (Q) across successive ICBI ranges (Fig. 2).
Mean Q rises monotonically with ICBI. The gains are not driven by outliers but reflect a stable tendency associated with progressively more even regulation, in which ICBI rises and at least three of the four indicators shift in the expected direction.
Standardised β coefficients from the regression model (Fig. 3).
EVS and KCE emerged as the strongest predictors, followed by BKN and ECE. No single variable achieved comparable explanatory power alone — quality emerges from the interaction and equilibrium of multiple processes.
Mean values of EVS, KCE and ECE across discourse types (Fig. 4).
Diplomatic/institutional discourse showed moderate EVS with low instability and regulated delivery; war-related discourse showed diminished EVS and elevated KCE (expedited processing via formulaic language); humanitarian discourse showed greater EVS variability and lower ECE (the destabilising effect of affective salience); media discourse showed the lowest mean EVS and highest KCE (high automatisation). Interpreters who sustained higher ECE in emotionally charged genres reached quality levels comparable to those observed in less demanding genres.
Percentage change in each index for the experimental group across two semesters.
All indices improved significantly: EVS −14%, BKN −18%, KCE +12%, and most pronounced of all, ECE +23% (enhanced emotional regulation). The control group showed only marginal change, attributable to task familiarity rather than systematic cognitive adaptation. Cognitive load balance is a trainable configuration of processes, not a static trait.
The findings converge: interpreting quality is determined not by the absolute level of cognitive effort, but by the interpreter's ability to regulate and distribute resources across concurrent processes — supporting process-oriented approaches that emphasise dynamic regulation over static capacity limits.
Similar EVS values can correspond to markedly different quality outcomes. EVS should be read together with processing stability, cognitive economy and emotional regulation — not as a standalone gauge of performance.
Extending descriptive traditions in Translation Studies, cognitive balance is framed as a norm-regulated dimension of professional interpreting — visible in consistent patterns of timing, compression and pragmatic adaptation rather than as a hidden psychological state.
The non-dominant yet significant role of emotional regulation underscores that affective control is inseparable from cognitive performance in high-stakes settings: it is a stabilising factor, not a secondary skill.
Quality is best understood as the outcome of how interpreters coordinate several cognitive processes at once.
Temporal control, cognitive economy, processing stability and emotional regulation do not operate independently; their interaction shapes performance in a systematic way. Quality is associated not with maximal cognitive effort, but with the ability to keep competing demands in relative equilibrium.
The novelty is not one more standalone metric, but a balance logic testable against real assessment practice — four interacting components whose joint configuration is the primary explanatory unit, summarised by ICBI as a second-order construct.
Aggregated measures of cognitive balance capture relatively stable patterns of performance rather than short-lived fluctuations. ICBI does not replace expert judgement — it shows which configuration of regulation tends to produce that judgement, and why it differs across discourse conditions.
The longitudinal data indicate that balance can be developed through instruction, especially emotional–cognitive regulation. Integrating process-sensitive indicators into training and assessment may support more transparent evaluation and help interpreters cope with high-pressure, emotionally demanding assignments.
The framework offers a coherent basis for further research across additional language pairs, interpreting modes and technological settings — including remote and AI-assisted interpreting.