Abstract
This paper introduces FilmAgent, a novel LLM-based multi-agent collaborative framework for end-to-end film automation in our constructed 3D virtual spaces.
FilmAgent simulates key crew roles (directors, screenwriters, actors, and cinematographers) and integrates efficient human workflows within a sandbox environment. A team of agents collaborates through iterative feedback and revisions, thereby verifying intermediate scripts and reducing hallucinations.
Human evaluation shows that FilmAgent outperforms all baselines across all aspects and scores 3.98 out of 5 on average, demonstrating the feasibility of multi-agent collaboration in filmmaking.
Further analysis reveals that FilmAgent, despite using the less advanced GPT-4o model, surpasses the single-agent o1, highlighting the advantage of a well-coordinated multi-agent system.
Lastly, we discuss the complementary strengths and weaknesses of OpenAI's text-to-video model Sora and our FilmAgent in filmmaking.