In this episode, we discuss SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
by Pinaki Nath Chowdhury, Ayan Kumar Bhunia, Aneeshan Sain, Subhadeep Koley, Tao Xiang, Yi-Zhe Song. The paper extends scene understanding to include human sketch as a modality, completing a trilogy of scene representations drawn from three diverse modalities: sketch, photo, and text. Its focus is on learning a joint embedding that is flexible in two ways: any combination of modalities can serve as the query for downstream tasks such as retrieval, and the same embedding can be used for either discriminative or generative tasks. The proposed embedding accommodates a variety of scene-related tasks without any task-specific modifications.