Qwen3.5-Omni Technical Report
Qwen3.5-Omni is a multimodal large language model that handles text, image, audio, and video within a single unified architecture. It performs strongly across a range of benchmarks while keeping inference efficient through optimized tokenization and training techniques.