ORIGINAL ARTICLE · UDC 81'255.4 · DOI 10.32342/anuJPh.2026.31.22
SIMULTANEOUS INTERPRETING · PROCESS-ORIENTED ANALYSIS

Cognitive Load Balance Index in Simultaneous Interpreting Quality Assessment

Serhii Skrylnyk, Taras Shevchenko National University of Kyiv (Ukraine)

Quality in simultaneous interpreting is reframed not as maximal cognitive effort, but as a regulated equilibrium between concurrent processes — operationalised through a second-order Integrated Cognitive Balance Index (ICBI).


255 · annotated SI transcripts (2014–2025)
800k · word tokens in the corpus
32 · MA interpreting trainees
2.9 s · mean ear–voice span (SD 0.7)
0.58 · adjusted R² (ICBI → quality)
>0.80 · Cohen's κ inter-rater agreement

01 — Introduction

From outcome scores to the regulation behind them

Simultaneous interpreting is intensive bilingual language use in which comprehension and production occur at once and stay tightly interconnected. Quality, the article argues, cannot be reduced to processing speed or task complexity — it arises from the interpreter's ability to sustain a dynamic equilibrium of cognitive load across competing tasks.

Within Interpreting Studies, focus has shifted from cumulative load to moment-to-moment control.

Early effort-based models treated interpreting as the coordination of listening, memory, production and monitoring efforts drawing on a shared pool of cognitive resources. Later research showed that performance failures rarely stem from excessive strain on a single component; they emerge from transient imbalances in effort, triggered by fluctuations in speech rate, information density or discourse structure.

Temporal indicators such as the ear–voice span (EVS), pausing behaviour and segmentation are informative because they reveal how interpreters align comprehension and production under time pressure. Yet comparable EVS values can accompany markedly different accuracy, fluency and pragmatic adequacy — temporal coordination alone does not explain variation in quality.

Multidimensional accounts converge on a shared conclusion: interpreting quality depends not on how much effort is expended, but on how cognitive resources are managed. Anticipatory processing and semantic compression sustain performance under high informational density, while inhibition and monitoring promote cognitive economy — preventing overexplicitation, disfluency and pragmatic drift.

Emotional regulation, once treated as secondary, is now recognised to directly affect attentional stability and monitoring efficiency, particularly in politically or ethically sensitive settings. For diplomatic, humanitarian or crisis work, keeping emotion and cognition in balance is part of managing load, not an optional extra.

Despite these advances, quality assessment remains largely outcome-focused, relying on error typologies, accuracy metrics and expert evaluation, and is slow to make the underlying cognitive processes visible.

Central hypothesis. Expert-rated simultaneous interpreting quality is more strongly and stably associated with the overall balance of cognitive regulation — captured by the Integrated Cognitive Balance Index (ICBI) — than with any single indicator such as EVS considered in isolation. Higher-quality performances are expected to show a jointly favourable pattern across temporal coordination, processing stability, cognitive economy and emotional–cognitive regulation, rather than isolated improvement in one dimension.

02 — Theoretical Framework

Cognitive load as a regulated balance

Load is reframed from a static limit on processing capacity to a dynamic, linguistically mediated and context-sensitive phenomenon: interpreters structure, redistribute and regulate effort during real-time mediation.

Anticipation & meaning construction

Setton's cognitive-pragmatic approach highlights the interpreter's capacity to generate predictive hypotheses about discourse structure and communicative intent. Anticipation is less a conscious skill than a basic mechanism that stabilises temporal coordination and reduces monitoring effort — especially in syntactically predictable or rhetorically organised discourse.

EVS as processing equilibrium

Ear–voice span is redefined as a linguistically grounded measure of equilibrium, not merely delay. Its variability reflects how interpreters coordinate syntactic projection, semantic integration and output planning. Shorter EVS is not inherently superior — excessive compression can signal strain rather than efficiency, particularly in remote interpreting.

Cognitive economy & inhibition

Expert interpreters minimise informational redundancy via semantic compression, paraphrasing and syntactic restructuring — reducing load while preserving pragmatic adequacy. This is bound up with inhibition: suppressing conflicting lexical, syntactic and pragmatic representations to keep the target text coherent, observable in fewer false starts, self-repairs and over-explicitation.

Emotional–cognitive regulation

When emotional salience rises, temporal coordination becomes less stable and monitoring more effortful. Experienced interpreters draw on mitigation, modalisation and shifts in perspective as stabilising mechanisms — sustaining pragmatic adequacy while limiting the disruptive impact of affective load. Emotional regulation is integral to balance, not external to it.

A situated view. Following Risku and Rogl, interpreting is treated as distributed cognition — shaped by the interpreter, the communicative situation and the tools involved. Genre conventions and technological constraints (latency, audio quality in remote settings) can affect cognitive stability and increase emotional strain, so the model defines quality as the outcome of cognitive load balance: a dynamic equilibrium among temporal coordination, cognitive economy, inhibition and emotional regulation.

03 — Operationalising Balance

Four indices, one balance construct

Each index captures a specific dimension of interpreting cognition while remaining grounded in observable linguistic behaviour. They are not independent predictors competing with one another — what matters for quality is the balance between them. Formulas are reproduced exactly as in the article.

EVS · Temporal coordination

Ear–Voice Span

The temporal distance (in seconds) between the onset of a source-language segment and the corresponding target output, computed from time-aligned transcripts in ELAN. It indexes attentional coordination and anticipation — consistent values signal proficient temporal regulation; heightened variability signals instability or elevated monitoring cost.

EVS = t_output − t_input

r = −0.62 · correlation with Quality (Q), p < 0.01; non-linear: extremely short EVS does not mean higher quality
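The EVS definition above can be sketched in a few lines. This is a minimal illustration, assuming onset times have already been exported from the time-aligned transcript as two parallel lists; the function name `ear_voice_spans` and the sample onsets are hypothetical, not taken from the article.

```python
from statistics import mean, stdev


def ear_voice_spans(source_onsets, target_onsets):
    """Return per-segment EVS = t_output - t_input, in seconds."""
    if len(source_onsets) != len(target_onsets):
        raise ValueError("onset lists must be aligned one-to-one")
    return [t_out - t_in for t_in, t_out in zip(source_onsets, target_onsets)]


# Hypothetical onsets (seconds) for four aligned segments.
source = [0.0, 4.0, 9.0, 15.0]
target = [2.5, 7.0, 12.5, 17.5]

evs = ear_voice_spans(source, target)
print(evs)                      # [2.5, 3.0, 3.5, 2.5]
print(mean(evs), stdev(evs))    # central tendency and variability
```

The standard deviation matters as much as the mean here, since the text treats variability rather than absolute lag as the signal of instability.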

BKN · Processing stability

Balance of Cognitive Load Index

Measures the consistency of processing under varying demands, consolidating EVS variability, pause density and self-repair frequency into a single composite. Lower BKN indicates a more balanced distribution of cognitive resources; higher values reflect instability and overload — operationalising the notion of dynamic equilibrium.

BKN = σ_EVS + P + R/N

σ_EVS: standard deviation of EVS values
P: proportion of pauses exceeding 2 seconds
R: number of self-repairs
N: number of analysed segments

r = −0.68 · correlation with Quality (Q), p < 0.01; instability, not absolute effort, predicts lower quality
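A rough sketch of the BKN composite follows, reading the formula as σ_EVS + P + R/N. It assumes the three terms are summed without rescaling, as the formula is printed; the function name `bkn`, its parameters and the sample values are hypothetical.

```python
from statistics import stdev


def bkn(evs_values, pause_durations, self_repairs, n_segments, long_pause=2.0):
    """Sum EVS variability, the share of long pauses and the self-repair rate."""
    sigma_evs = stdev(evs_values)  # sigma_EVS, in seconds
    if pause_durations:
        # P: proportion of pauses longer than the 2-second threshold
        p = sum(d > long_pause for d in pause_durations) / len(pause_durations)
    else:
        p = 0.0
    repair_rate = self_repairs / n_segments  # R / N
    return sigma_evs + p + repair_rate


# Hypothetical performance: 3 EVS values, 4 pauses, 2 self-repairs in 4 segments.
score = bkn([2.0, 3.0, 4.0], [0.5, 2.5, 1.0, 3.0], self_repairs=2, n_segments=4)
print(score)  # 1.0 + 0.5 + 0.5
```

Because the terms carry different units (seconds, a proportion, a rate), a study replicating the index may want to standardise them first; the summary does not say whether the article does so.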

KCE · Cognitive economy

Cognitive Economy Coefficient

Reflects the interpreter's ability to reduce cognitive cost through semantic compression and efficient reformulation. High KCE signifies efficient suppression of redundancy and regulated paraphrase — sustaining communicative adequacy while cutting superfluous processing work.

KCE = SU / LU

r = 0.71 · correlation with Quality (Q), p < 0.01; the strongest positive predictor of pragmatic adequacy & fluency

ECE · Emotional regulation

Emotional–Cognitive Equilibrium Index

Measures the ability to regulate emotional load without destabilising cognitive processing, combining EVS stability with expert-coded emotional-modulation markers (intonation shifts, attenuation, mitigation). Higher ECE corresponds to greater emotional–cognitive stability, especially in emotionally charged discourse.

ECE = (EVS_stable / EVS_total) × (1 − E)

E: density of affective disruption markers

r = 0.59 · correlation with Quality (Q), p < 0.05; strongest in emotionally loaded discourse
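The ECE formula can be illustrated as follows. The summary does not state how EVS_stable is counted, so this sketch assumes it is the number of segments whose EVS lies within a tolerance band around a reference value, with EVS_total the number of all segments. The names `ece`, `center` and `tolerance` are hypothetical.

```python
def ece(evs_values, e_density, center, tolerance):
    """Share of EVS-stable segments, discounted by affective disruption density E."""
    if not 0.0 <= e_density <= 1.0:
        raise ValueError("E is expected to be a normalised density in [0, 1]")
    # Assumption: a segment counts as stable if its EVS is within the band.
    stable = sum(abs(v - center) <= tolerance for v in evs_values)
    return (stable / len(evs_values)) * (1.0 - e_density)


# Hypothetical data: 3 of 4 segments fall within 3.0 s +/- 0.5 s, and E = 0.2.
value = ece([2.5, 3.0, 3.5, 5.0], e_density=0.2, center=3.0, tolerance=0.5)
print(value)  # close to 0.75 * 0.8
```

Under this reading ECE reaches 1 only when every segment is temporally stable and no affective disruption is coded.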

The second-order construct

Integrated Cognitive Balance Index

The contribution is not yet another isolated micro-measure, but the way familiar indicators are assembled into a balance-oriented instrument. ICBI gains its explanatory value from the joint behaviour of its components and cannot be reduced to any single parameter — including EVS. Weights w₁–w₄ were determined empirically through regression analysis, and the index was related to the Quality Index (Q) to test the predictive link between cognitive balance and interpreting quality.

ICBI = w₁·EVS + w₂·BKN + w₃·KCE + w₄·ECE
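As a sketch of the weighted sum above: the weights below are placeholders, not the regression-derived values from the article, and the components are assumed to be standardised (z-scored) so that they share a scale. Giving EVS and BKN negative weights is also an assumption, chosen so that a higher ICBI corresponds to better balance.

```python
COMPONENTS = ("EVS", "BKN", "KCE", "ECE")


def icbi(components, weights):
    """Weighted sum w1*EVS + w2*BKN + w3*KCE + w4*ECE."""
    return sum(weights[name] * components[name] for name in COMPONENTS)


# Hypothetical weights; the article estimates w1..w4 by regression.
weights = {"EVS": -0.30, "BKN": -0.30, "KCE": 0.25, "ECE": 0.15}

# Hypothetical z-scores for one performance.
performance = {"EVS": 1.0, "BKN": -1.0, "KCE": 2.0, "ECE": 0.0}

print(icbi(performance, weights))  # -0.3 + 0.3 + 0.5 + 0.0
```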

04 — Conceptual Model

How the components hold together

The model links observable linguistic and temporal patterns in the output to underlying regulatory processes. Quality is an emergent outcome of regulated interaction — not a static result of linguistic transfer.

Linguistic-cognitive Model of Simultaneous Interpreting Quality Assessment

Effort models · Anticipation · Cognitive economy · Inhibition & monitoring · Emotional regulation
↓
COGNITIVE LOAD BALANCE: dynamic equilibrium of cognitive processes
↓
Temporal coordination (Ear–Voice Span, EVS) · Processing stability (Instability, BKN) · Cognitive economy (Coefficient, KCE) · Emotional regulation (Equilibrium, ECE)
↓
INTEGRATED COGNITIVE BALANCE INDEX (ICBI)
↓
Interpreting Quality (Q): accuracy · fluency · pragmatic adequacy

Fig. 1 — The model represents quality as an emergent outcome of regulated interaction between temporal coordination, processing stability, cognitive economy and emotional–cognitive regulation, integrated in the ICBI and examined against expert-based quality assessment.

05 — Methodology & Data

A mixed-methods, process-oriented design

Quantitative measurement is integrated with qualitative expert evaluation, combining a large corpus of authentic performances with a controlled longitudinal training experiment — so the balance construct is tested under both naturalistic variation and controlled instructional conditions.

Data & participants

Corpus. 255 annotated transcripts of English–Ukrainian and Ukrainian–English simultaneous interpreting from international institutional settings (UN, EU, NATO, OSCE), diplomatic and humanitarian events, economic forums and academic conferences (2014–2025) — roughly 800k word tokens spanning diverse discourse types, speech rates and emotional loads. All recordings were publicly available and legally accessible.

Trainees. 32 MA students of translation and interpreting at Taras Shevchenko National University of Kyiv, aged 20–22 (M = 21.0, SD = 0.7; 27 female, 5 male). All were native Ukrainian speakers with English as the primary working foreign language (B2–C2). This experimental component lets the model be examined as a trainable cognitive system, not a static construct.

Materials, sessions & quality scoring

Materials. English source speeches across institutional, diplomatic, humanitarian and war-related briefings, mixing neutral segments with higher-affect passages; controlled delivery (~130 wpm; ~7-minute segments). Baseline and endline used parallel texts outside the weekly set.

Sessions. Four days/week, ~2 hours/day across two academic semesters; 15-minute SI blocks with fixed breaks, warm-up and post-block self-monitoring notes, recorded under a consistent headset-based setup.

Quality. Two independent experts rated each performance on EMT-aligned 10-point scales — semantic accuracy, terminological adequacy, pragmatic equivalence, fluency & delivery — combined into a composite Quality Index (Q). Inter-rater agreement: Cohen's κ > 0.80.
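For readers who want to reproduce the agreement check, Cohen's κ can be computed directly from two raters' scores. This sketch uses the unweighted form, which treats each scale point as a category; the summary does not say whether a weighted variant was used, and the sample ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of four performances by two experts.
kappa = cohens_kappa([1, 1, 2, 2], [1, 1, 2, 1])
print(kappa)  # (0.75 - 0.5) / (1 - 0.5) = 0.5
```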

Coding the affective component (E in ECE)

E was coded directly in the interpreter's output from aligned transcript and audio by the same two expert raters. Affective disruption markers were defined as observable departures from stable professional delivery that cluster around emotionally salient content and plausibly increase monitoring demands. Three recurrent patterns were coded — prosodic spikes (abrupt tension, raised pitch, unmotivated emphatic stress); affect-linked hesitation/repair (long pauses > 2 s, clustered self-repairs, audible non-lexical tokens next to affect-loaded wording); and affect-driven pragmatic drift (unmotivated intensification or over-mitigation). Each segment received an intensity score: 0 (none), 1 (mild/transient), 2 (marked). E is the normalised density of these disruptions relative to segment length — higher E means more frequent or stronger affective interference; lower E means steadier emotional–cognitive regulation.

Marked disruption · score = 2
EN "Thousands of civilians were killed overnight."
→ UKR "Тисячі… [pause 2.6 s] … мирних людей загинули…"
— a long pause and tense delivery occur around a high-affect noun phrase.
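One way to turn the 0/1/2 intensity scores into a normalised density is sketched below. The exact normalisation is not given in this summary, so the sketch assumes a length-weighted mean of segment scores divided by the maximum score, which keeps E between 0 and 1. The name `affective_density` and the sample segments are hypothetical.

```python
def affective_density(segments, max_score=2):
    """E from (segment_length_seconds, intensity_score) pairs.

    Assumed normalisation: length-weighted mean score divided by max_score.
    """
    total_length = sum(length for length, _ in segments)
    if total_length <= 0:
        raise ValueError("segments must have positive total length")
    weighted = sum(length * score for length, score in segments)
    return weighted / (max_score * total_length)


# Hypothetical passage: a neutral segment, a marked one, a mild one.
segments = [(10.0, 0), (5.0, 2), (5.0, 1)]
e_value = affective_density(segments)
print(e_value)  # (0 + 10 + 5) / (2 * 20) = 0.375
```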

Triangulation. Corpus data, experimental data and expert evaluation together strengthen internal validity. The model does not claim to exhaust interpreting cognition; it offers a theoretically robust, empirically verifiable framework extendable to other language pairs and modalities.

Stated limitations. Some participant variables were not modelled explicitly — gender was recorded but the sample does not support reliable subgroup inference — and platform-specific technical factors (audio quality, latency, recording conditions) cannot be fully normalised and may add variance in monitoring demands.

06 — Results

Quality tracks balance, not effort

Across the dataset, EVS clustered in a narrow range (mean 2.9 s, SD 0.7), BKN showed moderate variability, KCE a right-skewed distribution, and ECE the greatest dispersion — particularly in crisis-related and humanitarian discourse.

Correlations with expert quality (Q)

Pearson correlation coefficients between each cognitive indicator and the composite Quality Index.

Correlations are negative for EVS (r = −0.62) and BKN (r = −0.68), where excessive delay and processing instability depress ratings; strongly positive for KCE (r = 0.71), which marks semantic compression as a central feature of expert behaviour; and positive for ECE (r = 0.59), where emotional–cognitive equilibrium yields greater pragmatic stability. No single indicator is a sufficient proxy for overall regulation.
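The coefficients reported here are ordinary Pearson correlations between an indicator and Q. A minimal, dependency-free sketch of that computation follows; the data are invented for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical indicator values and quality scores with a perfect inverse relation.
indicator = [1.0, 2.0, 3.0, 4.0]
quality = [8.0, 6.0, 4.0, 2.0]
print(pearson_r(indicator, quality))  # -1.0 up to floating-point rounding
```

Pearson's r only captures linear association, which is why the non-linear behaviour noted for EVS needs to be inspected separately.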

Quality rises monotonically with balance

Mean Quality Index (Q) across successive ICBI ranges (Fig. 2).

A clear monotonic pattern: gains are not driven by outliers but reflect a stable tendency associated with progressively more even regulation — where ICBI rises and at least three of the four indicators shift in the expected direction.

Contribution of components to quality

Standardised β coefficients from the regression model (Fig. 3).

EVS and KCE emerged as the strongest predictors, followed by BKN and ECE. No single variable achieved comparable explanatory power alone — quality emerges from the interaction and equilibrium of multiple processes.

Regression model. The model was statistically significant — F(4, 1965) = 12.34, p < 0.001 — with an adjusted R² of 0.58, meaning more than half of the variance in expert quality assessments is explained by the combined effect of the cognitive balance indicators.
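For reference, adjusted R² corrects the raw R² for the number of predictors. The sketch below shows the standard formula; the input values are illustrative and are not the article's data. If the reported residual degrees of freedom follow the usual n − k − 1 convention, 1965 with four predictors would correspond to 1970 observations, which is an inference rather than a figure stated in the text.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / residual_df


# Illustrative values only: R^2 = 0.5 with 11 observations and 2 predictors.
print(adjusted_r2(0.5, 11, 2))  # 0.375
```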

Genre-based variation in cognitive load balance

Mean values of EVS, KCE and ECE across discourse types (Fig. 4).

Diplomatic/institutional discourse showed moderate EVS with low instability and regulated delivery; war-related discourse showed diminished EVS and elevated KCE (expedited processing via formulaic language); humanitarian discourse showed greater EVS variability and lower ECE (the destabilising effect of affective salience); media discourse showed the lowest mean EVS and highest KCE (high automatisation). Interpreters who sustained higher ECE in charged genres reached quality levels comparable to less demanding ones.

Balance is trainable: pre- vs post-training change

Percentage change in each index for the experimental group across two semesters.

All indices improved significantly: EVS −14%, BKN −18%, KCE +12%, and most pronounced of all, ECE +23% (enhanced emotional regulation). The control group showed only marginal change, attributable to task familiarity rather than systematic cognitive adaptation. Cognitive load balance is a trainable configuration of processes, not a static trait.

07 — Discussion

A bridge between a score and the regulation beneath it

The findings converge: interpreting quality is determined not by the absolute level of cognitive effort, but by the interpreter's ability to regulate and distribute resources across concurrent processes — supporting process-oriented approaches that emphasise dynamic regulation over static capacity limits.

EVS is not a linear proxy

Similar EVS values can correspond to markedly different quality outcomes. EVS should be read together with processing stability, cognitive economy and emotional regulation — not as a standalone gauge of performance.

Balance as a norm

Extending descriptive traditions in Translation Studies, cognitive balance is framed as a norm-regulated dimension of professional interpreting — visible in consistent patterns of timing, compression and pragmatic adaptation rather than as a hidden psychological state.

Emotion as a stabiliser

The non-dominant yet significant role of emotional regulation underscores that affective control is inseparable from cognitive performance in high-stakes settings: it is a stabilising factor, not a secondary skill.

Diagnostic value. Used alongside expert scoring, ICBI helps distinguish "fast but brittle" output from performance that is stable, economical and pragmatically controlled — explaining why two performances with similar fluency, or even similar EVS, can diverge in accuracy and pragmatic adequacy once instability, economy and affective pressure are taken into account. This is especially relevant in high-stakes and emotionally loaded discourse.

08 — Conclusions

Making "quality" methodologically legible

Quality is best understood as the outcome of how interpreters coordinate several cognitive processes at once.

Temporal control, cognitive economy, processing stability and emotional regulation do not operate independently; their interaction shapes performance in a systematic way. Quality is associated not with maximal cognitive effort, but with the ability to keep competing demands in relative equilibrium.

The novelty is not one more standalone metric, but a balance logic testable against real assessment practice — four interacting components whose joint configuration is the primary explanatory unit, summarised by ICBI as a second-order construct.

Aggregated measures of cognitive balance capture relatively stable patterns of performance rather than short-lived fluctuations. ICBI does not replace expert judgement — it shows which configuration of regulation tends to produce that judgement, and why it differs across discourse conditions.

The longitudinal data indicate that balance can be developed through instruction, especially emotional–cognitive regulation. Integrating process-sensitive indicators into training and assessment may support more transparent evaluation and help interpreters cope with high-pressure, emotionally demanding assignments.

The framework offers a coherent basis for further research across additional language pairs, interpreting modes and technological settings — including remote and AI-assisted interpreting.

−14% · EVS after training
−18% · BKN (more stable processing)
+12% · KCE (cognitive economy)
+23% · ECE (emotional regulation)

References

Works cited

Arumi Ribas, M. (2010). Gile, Daniel (2009). Basic Concepts and Models for Interpreter and Translator training. The Journal of Specialised Translation, 14, 263–265.
Chen, S. (2017). The construct of cognitive load in interpreting and its measurement. Perspectives: Studies in Translatology, 25(5), 640–657.
Chesterman, A. (1997). Memes of Translation: The Spread of Ideas in Translation Theory. Amsterdam, Philadelphia: John Benjamins.
Davitti, E., Braun, S. (2020). Analysing interactional phenomena in video remote interpreting in collaborative settings. The Interpreter and Translator Trainer, 14(3), 279–302.
Dong, Y., Lin, J. (2013). Parallel processing in simultaneous interpreting: Evidence from eye-tracking. Bilingualism: Language and Cognition, 16(3), 682–692.
Dong, Y., Xie, Z. (2014). Contributions of second language proficiency and interpreting experience to cognitive control differences among young adult bilinguals. Journal of Cognitive Psychology, 26(5), 506–519.
Fantinuoli, C. (2018). Interpreting and technology: The upcoming technological turn. In C. Fantinuoli (Ed.), Interpreting and technology (pp. 1–12). Berlin: Language Science Press.
Froeliger, N., Krause, A., Salmi, L. (Eds.). (2022). EMT Competence Framework—2022. European Commission.
Gile, D. (1995). Basic Concepts and Models for Interpreter and Translator Training. Amsterdam, Philadelphia: John Benjamins.
Green, D.W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1(2), 67–81.
Green, D.W., Abutalebi, J. (2013). Language control in bilinguals: The adaptive control hypothesis. Journal of Cognitive Psychology, 25(5), 515–530.
Holmes, J.S. (1988). Translated!: Papers on literary translation and translation studies. Leiden: Brill.
Hvelplund, K.T. (2022). Institutional translation and the translation process. In T. Svoboda, Ł. Biel, V. Sosoni (Eds.), Institutional Translator Training (pp. 92–110). New York: Routledge.
Hvelplund, K.T. (2024). Experimental Translation Studies. In A. Lange, D. Monticelli, C. Rundle (Eds.), The Routledge Handbook of the History of Translation Studies (pp. 309–323).
Janikowski, P., Chmiel, A. (2025). Ear–voice span in simultaneous interpreting. Interpreting: International Journal of Research and Practice in Interpreting, 27(1), 28–51.
Kaczmarek, Ł. (2011). Modelling competence in community interpreting: Expectancies, impressions and implications for accreditation. New Voices in Translation Studies, 7(1).
Korpal, P. (2021). Stress and emotion in conference interpreting. In M. Albl-Mikasa, E. Tiselius (Eds.), The Routledge Handbook of Conference Interpreting (pp. 401–413). London: Routledge.
Korpal, P., Mellinger, C.D. (2022). Self-care strategies of professional community interpreters. Translation Cognition & Behavior, 5(2), 275–299.
Landis, J.R., Koch, G.G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.
Lee, T. (2002). Ear Voice Span in English into Korean Simultaneous Interpreting. Meta: Journal des traducteurs, 47(4), 596–606.
Mellinger, C., Hanson, T. (2016). Quantitative Research Methods in Translation and Interpreting Studies. London: Routledge.
Miyake, A., Friedman, N.P., Emerson, M.J., Witzki, A.H., Howerter, A., Wagner, T.D. (2000). The unity and diversity of executive functions and their contributions to complex "frontal lobe" tasks: A latent variable analysis. Cognitive Psychology, 41(1), 49–100.
Moser-Mercer, B. (2010). The search for neuro-physiological correlates of expertise in interpreting. In G.M. Shreve, E. Angelone (Eds.), Translation and Cognition (pp. 263–287). Amsterdam, Philadelphia: John Benjamins.
Napier, J. (2021). Review of Salaets & Brône (2020): Linking up with Video. Interpreting: International Journal of Research and Practice in Interpreting, 24(1), 147–154.
Paas, F., van Merriënboer, J.J.G. (1994). Instructional control of cognitive load in the training of complex cognitive tasks. Educational Psychology Review, 6(4), 351–371.
Pavlenko, A. (Ed.). (2023). Multilingualism and History. Cambridge: Cambridge University Press.
Pöchhacker, F. (2016). Introducing Interpreting Studies. London: Routledge.
Pradas Macías, E.M., Zwischenberger, C. (2021). Quality and norms in conference interpreting. In M. Albl-Mikasa, E. Tiselius (Eds.), The Routledge Handbook of Conference Interpreting (pp. 243–257). London: Routledge.
Prandi, B. (2018). An exploratory study on CAI tools in simultaneous interpreting: Theoretical framework and stimulus validation. In C. Fantinuoli (Ed.), Interpreting and Technology (pp. 29–59). Berlin: Language Science Press.
Risku, H., Rogl, R. (2020). Translation and situated, embodied, distributed, embedded, and extended cognition. In F. Alves, A. Jakobsen (Eds.), The Routledge Handbook of Translation and Cognition (pp. 478–499). London: Routledge.
Salaets, H., Brône, G. (2023). "Working at a distance from everybody": Challenges (and some advantages) in working with video-based interpreting platforms. The Interpreters' Newsletter, 28, 189–209.
Seeber, K.G. (2011). Cognitive load in simultaneous interpreting: Existing theories – new models. Interpreting, 13(2), 176–204.
Seeber, K.G. (Ed.). (2021). 100 Years of Conference Interpreting: A Legacy. London: Cambridge Scholars Publishing.
Seeber, K.G., Keller, L., Hervais-Adelman, A. (2020). When the ear leads the eye – the use of text during simultaneous interpreting. Language, Cognition and Neuroscience, 35(10), 1480–1494.
Seeber, K.G., Kerzel, D. (2012). Cognitive load in simultaneous interpreting: Model meets data. International Journal of Bilingualism, 16(2), 228–242.
Setton, R. (2006). Simultaneous Interpreting. A Cognitive-pragmatic Analysis. Journal of Literary Semantics, 30(3), 210–214.
Setton, R., Dawrant, A. (2016). Conference Interpreting – A Complete Course. Amsterdam, Philadelphia: Benjamins Translation Library.
Sweller, J., van Merriënboer, J.J.G., Paas, F. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10(3), 251–291.
Toury, G. (1995). Descriptive Translation Studies and Beyond. Amsterdam, Philadelphia: John Benjamins.
Walker, C., Hvelplund, K.T., Lei, V. (2025). Screen eyetracking. In A.M. Rojo López & R. Muñoz Martín (Eds.), Research Methods in Cognitive Translation and Interpreting Studies (pp. 213–234). Amsterdam, Philadelphia: John Benjamins.