01 [Evaluation] Static ensemble weights fail in non-stationary environments, and coherence between models carries the signal you’re missing
Traditional ensembles assign fixed weights to constituent models, or learn those weights offline. Both approaches assume the environment is stable enough for historical performance rankings to stay valid. In sequential decision-making, that assumption breaks constantly: task distributions shift, some models degrade on specific sub-domains, and yesterday’s best model may be today’s worst.
EARCP (Ensemble Auto-Régulé par Cohérence et Performance) updates model weights online after every decision, combining two signals: individual model accuracy and inter-model coherence (how much a given model agrees with the consensus of the ensemble). The coherence term acts as a regularizer — when a model diverges from the group, its weight is suppressed even if its recent point accuracy looks acceptable. The update rule derives from multiplicative weight (exponentiated gradient) algorithms, which carry formal regret bounds guaranteeing that cumulative loss approaches that of the best fixed-weight combination in hindsight. The coherence regularization is the novel addition: it penalizes models that drift from ensemble consensus, reducing variance in non-stationary regimes without sacrificing the theoretical guarantees.
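To make the mechanism concrete, here is a minimal sketch of a coherence-regularized multiplicative-weights update. The paper's exact rule isn't specified in the abstract, so the learning rate `eta`, the regularizer strength `lam`, and the squared-error losses below are illustrative assumptions, not EARCP's actual formulation:

```python
import math

class CoherenceWeightedEnsemble:
    """Sketch of an online, coherence-regularized multiplicative-weights
    ensemble in the spirit of EARCP. Hyperparameters and loss choices
    are assumptions made for illustration."""

    def __init__(self, n_models, eta=0.5, lam=0.3):
        self.w = [1.0 / n_models] * n_models  # start from uniform weights
        self.eta = eta   # step size for the combined loss
        self.lam = lam   # strength of the coherence regularizer

    def combine(self, preds):
        # Weighted consensus prediction of the ensemble.
        return sum(wi * pi for wi, pi in zip(self.w, preds))

    def update(self, preds, target):
        consensus = self.combine(preds)
        new_w = []
        for wi, pi in zip(self.w, preds):
            acc_loss = (pi - target) ** 2     # individual accuracy term
            coh_loss = (pi - consensus) ** 2  # divergence from consensus
            # Exponentiated-gradient step on the combined loss: a model
            # that drifts from the group loses weight even when its
            # point accuracy alone would keep it competitive.
            new_w.append(wi * math.exp(-self.eta * (acc_loss + self.lam * coh_loss)))
        total = sum(new_w)
        self.w = [wi / total for wi in new_w]  # renormalize to a distribution
        return consensus
```

Feeding rounds where one member consistently diverges (e.g. predictions `[1.0, 1.0, 5.0]` against a target of `1.0`) shifts weight away from the outlier after a handful of updates, while the weights always remain a valid distribution.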
One honest limitation: regret bounds hold under the theoretical framework’s assumptions, and the paper’s empirical validation scope isn’t specified in the abstract. Real-world performance on a specific task distribution will require direct benchmarking. For teams running heterogeneous model ensembles in production pipelines where input distribution shifts over time (recommendation systems, adaptive agents, multi-step planning), this is a principled replacement for static weighting with a lower risk of silent degradation.
Key takeaways:
- EARCP combines per-model accuracy tracking with inter-model coherence scoring to reweight ensemble members online; models that drift from group consensus lose influence even when their local accuracy appears stable
- The theoretical grounding via multiplicative weight update algorithms provides regret bounds — the ensemble's cumulative loss provably approaches that of the best fixed combination in hindsight, a guarantee static or offline-learned ensembles cannot make in non-stationary settings
- Teams running multi-model ensembles in production environments with shifting input distributions should treat static weight assignment as a known liability; EARCP’s online reweighting mechanism is a direct architectural replacement worth evaluating
Source: EARCP: Self-Regulating Coherence-Aware Ensemble Architecture for Sequential Decision Making