MEVZU N°128 · ISTANBUL · YEAR I, VOL. III
MEVZU N° TAG / VOL. 102
#multimodal
0 blog · 0 news · 3 wiki
§03
03 Wiki
§01 Glossary
VLM — Vision-Language Model
A model that jointly understands images and text and produces text responses.
- EN: VLM (Vision-Language Model)
- TR: VLM (Görü-Dil Modeli)
§02 Glossary
Multimodal
Models capable of understanding or producing more than one type of data (modality), such as text, image, audio, or video.
- EN: Multimodal
- TR: Çok-Modlu
§03 Glossary
MLLM — Multimodal LLM
A large language model that, in addition to text, also processes other modalities such as images, audio, or video.
- EN: MLLM (Multimodal LLM)
- TR: MLLM (Çok-Modlu LLM)