MMBind
Multimodal sensing systems are increasingly prevalent in real-world applications. Most existing multimodal learning approaches rely heavily on training with large amounts of synchronized, complete multimodal data. Such a setting is impractical in real-world IoT sensing applications, however, where data is typically collected by distributed nodes with heterogeneous data modalities and is rarely labeled. In this paper, we propose MMBind, a new data binding approach for multimodal learning on distributed and heterogeneous IoT data. The key idea of MMBind is to construct a pseudo-paired multimodal dataset for model training by binding data from disparate sources and incomplete modalities through a sufficiently descriptive shared modality. We also propose a weighted contrastive learning approach to handle domain shifts among the disparate data, coupled with an adaptive multimodal learning architecture capable of training models with heterogeneous modality combinations. Evaluations on ten real-world multimodal datasets show that MMBind outperforms state-of-the-art baselines under varying degrees of data incompleteness and domain shift, and holds promise for advancing multimodal foundation model training in IoT applications.
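The sketch below illustrates the two ideas described above: binding incomplete-modality datasets into pseudo-pairs through a shared modality, and training on those pairs with a similarity-weighted contrastive objective. It is only an illustrative outline, not the MMBind implementation: the encoder outputs are stood in for by random tensors, and the nearest-neighbour matching and similarity-based weighting are assumptions chosen for clarity.

```python
# Illustrative sketch only; tensor shapes, matching rule, and weighting are assumptions.
import torch
import torch.nn.functional as F

def bind_by_shared_modality(shared_a, shared_b):
    """Pair each sample in set A with its nearest neighbour in set B,
    using cosine similarity of shared-modality embeddings."""
    a = F.normalize(shared_a, dim=-1)          # (Na, d)
    b = F.normalize(shared_b, dim=-1)          # (Nb, d)
    sim = a @ b.T                              # (Na, Nb) cosine similarities
    match_sim, match_idx = sim.max(dim=1)      # best B match for every A sample
    return match_idx, match_sim

def weighted_contrastive_loss(z_x, z_y, pair_weight, temperature=0.1):
    """InfoNCE-style loss over pseudo-pairs, down-weighting pairs whose
    shared-modality similarity is low (a crude proxy for domain shift)."""
    z_x = F.normalize(z_x, dim=-1)
    z_y = F.normalize(z_y, dim=-1)
    logits = z_x @ z_y.T / temperature         # (N, N) pairwise logits
    targets = torch.arange(z_x.size(0))        # matched pairs sit on the diagonal
    per_pair = F.cross_entropy(logits, targets, reduction="none")
    return (pair_weight * per_pair).mean()

# Toy stand-ins: dataset A holds (shared, modality X), dataset B holds (shared, modality Y).
Na, Nb, d = 32, 48, 16
shared_a, shared_b = torch.randn(Na, d), torch.randn(Nb, d)
feat_x, feat_y = torch.randn(Na, d), torch.randn(Nb, d)

# 1) Bind: build a pseudo-paired (X, Y) dataset through the shared modality.
idx, sim = bind_by_shared_modality(shared_a, shared_b)
paired_y = feat_y[idx]

# 2) Train: similarity-weighted contrastive objective on the pseudo-pairs.
loss = weighted_contrastive_loss(feat_x, paired_y, pair_weight=sim.clamp(min=0))
print(loss.item())
```

In this toy version, the per-pair weight simply reuses the shared-modality similarity, so pseudo-pairs bound across a large domain gap contribute less to the contrastive objective.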
Details & Specifications
Adaptive Learning
Machine Learning
Multimodal Learning
- Xiaomin Ouyang (HKUST)
 - Jason Wu (UCLA)
 - Tomoyoshi Kimura (UIUC)
 - Yihan Lin (UCLA)
 - Gunjan Verma (US Army ARL)
 - Tarek Abdelzaher (UIUC)
 - Mani Srivastava (UCLA)
 