We tried out Google’s new family of multimodal models, which includes variants compact enough to run on local devices. They work well.
Abstract: Infrared-visible image fusion (IVF) is an essential task in multimodal image processing that integrates the infrared and visible modalities to enrich the information content of the fused image.