AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models
FinanceBench: A New Benchmark for Financial Question Answering
Stable-Hair: Real-World Hair Transfer via Diffusion Model
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Patch-Level Training for Large Language Models
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
IMAGDressing-v1: Customizable Virtual Dressing
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
SEED-Story: Multimodal Long Story Generation with Large Language Model
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
Agentless: Demystifying LLM-based Software Engineering Agents
Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?