02 [RAG] Training Remote Sensing VLMs with OpenStreetMap
This paper presents a new method to train Vision-Language Models for satellite imagery using free OpenStreetMap data. It overcomes the typical reliance on expensive manually labeled data or costly large AI models for training these specialized systems. This approach makes developing AI for tasks like environmental monitoring or urban planning more accessible and scalable.
03 [Speech] PARSA-Bench: First Persian Audio-Language AI Benchmark
PARSA-Bench is the first benchmark designed to evaluate large audio-language models specifically on Persian language and culture. It addresses unique challenges of Persian audio, like classical poetry and code-switching, through 16 tasks and over 8,000 samples. This benchmark will accelerate the development of more accurate and culturally aware AI models for millions of Persian speakers worldwide.