Conference item
Orthogonal sequential fusion in multimodal learning
- Abstract:
- The integration of data from multiple modalities is a fundamental challenge in machine learning, encompassing applications from image captioning to text-to-image generation. Traditional fusion methods typically combine all inputs concurrently, which can lead to an uneven representation of the modalities and restricted control over their integration. In this paper, we introduce a new fusion paradigm called Orthogonal Sequential Fusion (OSF), which sequentially merges inputs and permits selective weighting of modalities. This stepwise process also enables the promotion of orthogonal representations, thereby extracting complementary information for each additional modality. We demonstrate the effectiveness of our approach across various applications, and show that OSF outperforms existing fusion techniques. Our approach represents a promising alternative to established fusion techniques and offers a sophisticated way of combining modalities for a wide range of applications, including integration into any complex multimodal model that relies on information fusion.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
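To illustrate the stepwise fusion idea the abstract describes, here is a minimal sketch. It is not the paper's architecture. It assumes fixed toy vectors in place of learned modality embeddings, hand-picked weights in place of learned ones, and a Gram-Schmidt-style projection as a stand-in for whatever mechanism OSF uses to promote orthogonal representations. The names `orthogonal_component` and `sequential_fusion` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_component(new, fused):
    """Return the part of `new` that is orthogonal to `fused`.

    Assumption: a plain projection step stands in for the learned
    orthogonality that the abstract mentions.
    """
    denom = dot(fused, fused)
    if denom == 0.0:
        return list(new)
    scale = dot(new, fused) / denom
    return [n - scale * f for n, f in zip(new, fused)]


def sequential_fusion(modalities, weights):
    """Merge modality vectors one at a time, in the order given.

    Each later modality contributes only its component orthogonal to
    the running fused vector, scaled by its own weight.
    """
    if not modalities or len(modalities) != len(weights):
        raise ValueError("need one weight per modality and at least one modality")
    fused = [weights[0] * x for x in modalities[0]]
    for vec, w in zip(modalities[1:], weights[1:]):
        extra = orthogonal_component(vec, fused)
        fused = [f + w * e for f, e in zip(fused, extra)]
    return fused


# Toy example: three hypothetical modality embeddings and weights.
image = [1.0, 0.0, 0.0]
text = [1.0, 1.0, 0.0]
audio = [0.0, 1.0, 1.0]
fused = sequential_fusion([image, text, audio], [1.0, 0.5, 0.25])
print(fused)
```

In this sketch the fusion order and the per-step weights are explicit arguments, which mirrors the selective weighting of modalities described in the abstract. A trained model would learn both the embeddings and the fusion layers rather than use fixed vectors.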
- Publisher copy:
- 10.23919/FUSION65864.2025.11124022
- Publisher:
- IEEE
- Host title:
- 2025 28th International Conference on Information Fusion (FUSION)
- Publication date:
- 2025-08-26
- Acceptance date:
- 2025-04-30
- Event title:
- 28th International Conference on Information Fusion (ISIF 2025)
- Event location:
- Rio de Janeiro, Brazil
- Event website:
- https://isif.org/event/conference/28th-international-conference-information-fusion
- Event start date:
- 2025-07-07
- Event end date:
- 2025-07-10
- EISBN:
- 9781037056239
- ISBN:
- 9798331503505
- Language:
- English
- Pubs id:
- 2133067
- Local pid:
- pubs:2133067
- Deposit date:
- 2025-06-27
Terms of use
- Copyright holder:
- IEEE
- Copyright date:
- 2025
- Rights statement:
- © IEEE 2025
- Notes:
- This paper was presented at the 28th International Conference on Information Fusion (ISIF 2025), 7th-10th July 2025, Rio de Janeiro, Brazil. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript (AAM) version arising from this submission.