LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Internal Consistency and Self-Feedback in Large Language Models: A Survey
On the Diagram of Thought
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
On the limits of agency in agent-based models
Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization
PuLID: Pure and Lightning ID Customization via Contrastive Alignment
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
GeoCalib: Learning Single-image Calibration with Geometric Optimization
Artificial Immune System of Secure Face Recognition Against Adversarial Attacks
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
rerankers: A Lightweight Python Library to Unify Ranking Methods
Automated Design of Agentic Systems
Text2SQL is Not Enough: Unifying AI and Databases with TAG
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Sapiens: Foundation for Human Vision Models