Klára Janoušková
PhD Student · Computer Vision & Machine Learning · CTU Prague
I am a third-year computer vision and machine learning PhD student at the Czech Technical University in Prague (CTU), supervised by Professor Jiří Matas.
My current research focuses on spatial understanding for vision-language models, ImageNet-scale recognition benchmarks and reannotation, and reinforcement learning for video object segmentation. Previously, I worked on fine-grained species classification and biodiversity benchmarks, test-time adaptation for segmentation, AI-assisted labelling for civil infrastructure inspection, and scene text detection and recognition.
I teach labs for the Machine Learning and Pattern Recognition course at CTU and co-supervise several BSc/MSc students.
During my undergraduate studies I interned at the Technion (Chaim Baskin, Alex Bronstein), IBM Research Zurich (Mattia Rigotti, Ioana Giurgiu, Cristiano Malossi), and the CVC at UAB (Dimosthenis Karatzas, Lluis Gomez).
- Mar 2026 Our project on Efficient Vision-Language Models received funding from Toyota Motor Europe.
- Feb 2026 Our paper 'Multimodal Large Language Models as Image Classifiers' was accepted to CVPR 2026 Findings!
- Jul 2025 Our paper 'SAM2RL: Towards Reinforcement Learning Memory Control in Segment Anything Model 2' was accepted to RL4RS@RLC'25.
- Jun 2025 See you at CVPR! I will be presenting our paper 'FungiTastic' at the workshop poster sessions. You can also find me at the FGVC workshop as one of the organizers.
- May 2025 The FungiCLEF challenge I am co-organizing has ended with more than 70 submissions! The overview report is now published.
I am now interested in spatial understanding for VLMs: improving spatial representations in VLMs and building efficient VLMs. This builds on our prior work on context-aware object recognition and closed-form adaptation (Koo-Fu CLIP).
The aim is a complete reannotation of the validation set, an ongoing project that will be published soon. It builds on our analysis of ImageNet flaws and VLM-based recognition methods.
Investigating reinforcement learning for learned memory control in the Segment Anything Model 2 (SAM2), with the goal of improving long-form video object segmentation by dynamically managing the memory bank.
All projects (including past work) on the Projects page · Publications on the Publications page.