In this episode we discuss "Visual Programming: Compositional visual reasoning without training" by Tanmay Gupta and Aniruddha Kembhavi of PRIOR @ Allen Institute for AI. The paper introduces VISPROG, a neuro-symbolic approach to solving complex visual tasks from natural language instructions. The system generates Python-like modular programs, which are then executed to produce both the solution and a comprehensive rationale. The approach avoids task-specific training and instead relies on the in-context learning ability of large language models. The paper demonstrates the flexibility of VISPROG on four diverse tasks, including image editing and factual knowledge object tagging, and argues that approaches like this can expand the range of complex tasks AI systems are able to perform.
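To make the idea of an executable modular program concrete, here is a minimal sketch of a step-by-step interpreter. It is not the VISPROG implementation: the module names `Select` and `Count`, the `run_program` helper, and the tuple-based program format are all hypothetical, the modules are toy stand-ins for vision models, and the program is written by hand where the paper would have a language model generate it.

```python
from typing import Any, Callable, Dict, List, Tuple

# One program step: (output variable, module name, keyword arguments).
# This tuple format is an assumption made for illustration only.
Step = Tuple[str, str, Dict[str, Any]]


def select_by_label(items: List[str], label: str) -> List[str]:
    """Toy stand-in for a vision module: keep items matching a label."""
    return [item for item in items if item == label]


def count_items(items: List[str]) -> int:
    """Toy stand-in for a counting module."""
    return len(items)


# Hypothetical module registry mapping names used in programs to callables.
MODULES: Dict[str, Callable[..., Any]] = {
    "Select": select_by_label,
    "Count": count_items,
}


def run_program(steps: List[Step], inputs: Dict[str, Any]):
    """Execute steps in order, returning the final state and a trace.

    A string argument that matches an existing variable name is replaced
    by that variable's value; any other argument is passed as a literal.
    """
    state = dict(inputs)
    trace = []
    for out_var, module_name, kwargs in steps:
        module = MODULES[module_name]
        resolved = {
            key: state[value] if isinstance(value, str) and value in state else value
            for key, value in kwargs.items()
        }
        state[out_var] = module(**resolved)
        # The trace of intermediate results plays the role of a rationale.
        trace.append(f"{out_var} = {module_name}({kwargs}) -> {state[out_var]!r}")
    return state, trace


# Hand-written program for the instruction "count the cats".
program: List[Step] = [
    ("CATS", "Select", {"items": "OBJECTS", "label": "cat"}),
    ("ANSWER", "Count", {"items": "CATS"}),
]
state, trace = run_program(program, {"OBJECTS": ["cat", "dog", "cat"]})
```

After running, `state["ANSWER"]` holds the final result and `trace` lists each intermediate step, a rough analogue of the solution and the rationale described above.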