Applied Vision
Authors: Donghwan Lee (Yonsei University), Wooju Kim (Yonsei University)
As medical image databases expand, precise Content-Based Medical Image Retrieval (CBMIR) techniques are increasingly required to support case-based reasoning, clinical education, and data-driven decision-making. Recent deep learning–based CBMIR approaches typically rely on global embeddings to enhance retrieval performance. However, such image-level representations often dilute localized anatomical features and fail to capture clinically relevant organ-specific details. To address this limitation, we propose a region-based CBMIR framework that integrates organ-level information into both representation learning and retrieval. The ROI Embedding Selector extracts patch-level embeddings from user-specified regions of interest (ROIs). The Region-aware Organ Attention (ROA) module then learns structured organ representations through cross-attention between image patches and dedicated organ tokens. During inference, a visibility-weighted aggregation strategy guided by Organ Visibility Recognition incorporates query-relevant organs, enabling anatomically targeted and clinically meaningful retrieval. Experiments on the TotalSegmentator dataset demonstrate that the proposed framework consistently outperforms global embedding–based vision foundation models, particularly in region query settings.
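The two core mechanisms described above — organ tokens attending over patch embeddings, and visibility-weighted aggregation of the resulting organ representations — can be sketched as follows. This is a minimal illustration under assumed shapes and names (`organ_cross_attention`, `visibility_weighted_embedding`, single-head attention, no learned projections), not the paper's actual ROA implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def organ_cross_attention(organ_tokens, patch_embeddings):
    """Cross-attention: dedicated organ tokens (queries) attend over
    patch-level embeddings (keys/values) to form organ representations.

    organ_tokens:     (num_organs, d)  one learned token per organ (assumed)
    patch_embeddings: (num_patches, d) patch features from the ROI
    returns:          (num_organs, d)  structured organ representations
    """
    d = organ_tokens.shape[-1]
    scores = organ_tokens @ patch_embeddings.T / np.sqrt(d)  # (organs, patches)
    attn = softmax(scores, axis=-1)                          # rows sum to 1
    return attn @ patch_embeddings

def visibility_weighted_embedding(organ_reprs, visibility):
    """Aggregate organ representations into one retrieval embedding,
    weighting each organ by its predicted visibility score."""
    w = visibility / (visibility.sum() + 1e-8)
    return w @ organ_reprs

# Toy example with random features (shapes are illustrative only).
rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))    # 16 ROI patch embeddings, dim 8
tokens = rng.normal(size=(4, 8))      # 4 organ tokens
organ_reprs = organ_cross_attention(tokens, patches)
vis = np.array([0.9, 0.1, 0.0, 0.7])  # e.g. from Organ Visibility Recognition
query_emb = visibility_weighted_embedding(organ_reprs, vis)
print(organ_reprs.shape, query_emb.shape)
```

Retrieval would then rank database images by similarity between `query_emb` and their stored embeddings; organs with zero visibility contribute nothing to the query representation.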
How to Cite: Lee, D. & Kim, W. (2026) “Organ Level Representation Learning for Region Based Medical Image Retrieval”, Proceedings of the Austrian Symposium on AI, Robotics, and Vision. 3(1).