Preprint
Filtered not mixed: stochastic filtering-based online gating for mixture of large language models
- Abstract:
- We propose MoE-F, a formalized mechanism for combining N pre-trained expert Large Language Models (LLMs) in online time-series prediction tasks. MoE-F adaptively forecasts the optimal weighting of LLM predictions at each time step by leveraging the conditional information in each expert's running performance, enabling the best combination of experts for the next-step prediction. Diverging from static (learned) Mixture of Experts (MoE) methods, our approach employs time-adaptive stochastic filtering techniques to combine experts. By framing the expert selection problem as a finite state-space, continuous-time Hidden Markov model (HMM), we can leverage the Wonham-Shiryaev filter. Our approach first constructs N parallel filters, one for each of the N individual LLMs. Each filter proposes its best combination of LLMs, given the information that it has access to. Subsequently, the N filter outputs are optimally aggregated to maximize their robust predictive power, and this update is computed efficiently via a closed-form expression, thus generating our ensemble predictor. Our contributions are: (I) the MoE-F algorithm, deployable as a plug-and-play filtering harness over any heterogeneous mixture of LLMs or specialized models, (II) theoretical optimality guarantees of the proposed filtering-based gating algorithm (via optimality guarantees for its parallel Bayesian filtering and its robust aggregation steps), and (III) empirical evaluation and ablative results using state-of-the-art foundational and MoE LLMs on a real-world Financial Market Movement task based on streaming news, where MoE-F attains a 17% absolute and 48.5% relative F1-score improvement over the best-performing individual LLM expert. Further, we provide empirical evidence of substantial performance gains with MoE-F over specialized models in the long-horizon time-series forecasting domain using electricity-grid datasets.
- Publication status:
- Published
- Peer review status:
- Not peer reviewed
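The filtering-based gating idea in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the MoE-F algorithm and not the continuous-time Wonham-Shiryaev filter. It is a single discrete-time HMM filter (predict, then Bayesian update) over which of N experts is currently "best". The function name `filter_step`, the parameters `stay_prob` and `temperature`, the uniform switching model, and the exponential loss likelihood are all assumptions made for illustration.

```python
import math

def filter_step(weights, losses, stay_prob=0.9, temperature=1.0):
    """One predict/update step of a discrete-time HMM filter over N experts.

    weights: current probabilities that each expert is the best one (sum to 1).
    losses:  each expert's observed loss at this time step (lower is better).
    All modelling choices here are illustrative assumptions.
    """
    n = len(weights)
    # Predict: the hidden "best expert" stays put with probability stay_prob,
    # otherwise it jumps uniformly to one of the other n - 1 experts.
    switch = (1.0 - stay_prob) / (n - 1) if n > 1 else 0.0
    predicted = [stay_prob * w + switch * (1.0 - w) for w in weights]
    # Update: reweight by an assumed likelihood that decays with the loss.
    unnorm = [p * math.exp(-loss / temperature)
              for p, loss in zip(predicted, losses)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Usage: three hypothetical experts, starting from uniform weights.
weights = [1 / 3, 1 / 3, 1 / 3]
for losses in [[0.2, 1.0, 0.9], [0.1, 1.2, 0.8]]:
    weights = filter_step(weights, losses)
```

After the two steps, the weights still sum to one and the expert with the consistently lowest loss carries the largest weight. The switching term keeps the other experts from collapsing to zero, which is what lets a filter of this kind re-adapt when a different expert starts performing better.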
- Preprint server copy:
- 10.48550/arxiv.2406.02969
Authors
- Funder:
- Natural Sciences and Engineering Research Council of Canada
- Funder identifier:
- https://ror.org/01h531d29
- Funding agency for:
- Kratsios, A
- Saqur, R
- Grant:
- RGPIN-2023-04482
- Preprint server:
- arXiv
- Publication date:
- 2024-06-05
- Language:
- English
- Pubs id:
- 2282236
- UUID:
- uuid_9a561c7d-f272-4f09-b61d-8376859fd6ae
- Local pid:
- pubs:2282236
- Source identifiers:
- W4399447660
- Deposit date:
- 2026-01-23
Terms of use
- Copyright holder:
- Saqur et al.
- Copyright date:
- 2024
- Rights statement:
- © The Author(s) 2024.