FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces

¹Harbin Institute of Technology (Shenzhen), ²Tsinghua University

Abstract

This paper introduces FilmAgent, a novel LLM-based multi-agent collaborative framework for end-to-end film automation in 3D virtual spaces that we constructed. FilmAgent simulates key crew roles (directors, screenwriters, actors, and cinematographers) and integrates efficient human workflows within a sandbox environment. A team of agents collaborates through iterative feedback and revisions, thereby verifying intermediate scripts and reducing hallucinations. In human evaluation, FilmAgent outperforms all baselines across all aspects and scores 3.98 out of 5 on average, demonstrating the feasibility of multi-agent collaboration in filmmaking. Further analysis reveals that FilmAgent, despite using the less advanced GPT-4o model, surpasses the single-agent o1, which highlights the advantage of a well-coordinated multi-agent system. Lastly, we discuss the complementary strengths and weaknesses of OpenAI's text-to-video model Sora and FilmAgent in filmmaking.
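The iterative feedback-and-revision idea mentioned in the abstract can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not FilmAgent's actual implementation: the names `director_review`, `screenwriter_revise`, `critique_revise`, the shot dictionary layout, and the fallback location are all hypothetical, and the two agents are deterministic stubs standing in for LLM calls. The critic flags shots set in locations that the sandbox does not contain (a simple stand-in for a hallucination), and the reviser repairs one flagged shot per round.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical subset of sandbox locations, borrowed from the environment list on this page.
KNOWN_LOCATIONS = {"Office", "Dining room", "Roadside"}

Script = List[Dict[str, str]]   # each shot: {"location": ..., "line": ...}
Feedback = Tuple[int, str]      # (index of the offending shot, message)


def director_review(script: Script) -> Optional[Feedback]:
    """Stub critic: flag the first shot set in a location the sandbox lacks."""
    for i, shot in enumerate(script):
        if shot["location"] not in KNOWN_LOCATIONS:
            return i, f"unknown location {shot['location']!r}"
    return None  # nothing left to fix


def screenwriter_revise(script: Script, feedback: Feedback) -> Script:
    """Stub reviser: move the flagged shot to a fallback location (an assumption)."""
    index, _message = feedback
    revised = [dict(shot) for shot in script]  # copy so the input is not mutated
    revised[index]["location"] = "Office"
    return revised


def critique_revise(
    script: Script,
    critic: Callable[[Script], Optional[Feedback]],
    reviser: Callable[[Script, Feedback], Script],
    max_rounds: int = 3,
) -> Tuple[Script, List[str]]:
    """Alternate critique and revision until the critic approves or rounds run out."""
    log: List[str] = []
    for _ in range(max_rounds):
        feedback = critic(script)
        if feedback is None:
            break
        log.append(feedback[1])
        script = reviser(script, feedback)
    return script, log


draft = [
    {"location": "Rooftop", "line": "We need to talk."},
    {"location": "Dining room", "line": "Not here."},
    {"location": "Spaceship", "line": "Then where?"},
]
final_script, review_log = critique_revise(draft, director_review, screenwriter_revise)
```

In this toy run the loop needs two revision rounds before the critic approves the script. The `max_rounds` cap is a design choice of the sketch that keeps the exchange bounded even if the critic never approves.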

Unity Environment

Apartment kitchen
Apartment living room
Beverage room
Billiard room
Dining room
Gaming room
Large kitchen
Meeting room
Office
Reception room
Relaxing room
Roadside
Sofa corner
Storehouse
Work room

Videos