Virtual film production requires intricate decision-making processes, including scriptwriting, virtual cinematography, and precise actor positioning and actions.
Recent progress in automated decision-making has drawn on agent societies powered by large language models (LLMs).
This paper introduces FilmAgent, a novel LLM-based multi-agent collaborative framework designed to automate and streamline the film production process.
FilmAgent simulates key crew roles (directors, screenwriters, actors, and cinematographers) and integrates efficient human workflows within a sandbox environment.
The process is divided into three stages: planning, scriptwriting, and cinematography.
Each stage engages a team of film crew members who provide iterative feedback, thereby verifying intermediate results and reducing errors.
Our evaluation of generated videos shows that the collaborative FilmAgent significantly outperforms individual efforts in line consistency, script coherence, character actions, and camera settings.
Further analysis highlights the importance of feedback and verification in reducing hallucinations, enhancing script quality, and improving camera choices.
We also explore the strengths and limitations of FilmAgent and suggest directions for future research on integrating LLMs into creative multimedia tasks.
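
The staged workflow with iterative feedback described above can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not FilmAgent's actual implementation: the names `Stage`, `run_stage`, `run_pipeline`, `make_author`, and `strict_reviewer` are hypothetical, and the agents are plain stub functions standing in for LLM calls. Each stage has one authoring agent and a set of reviewers, and the draft is revised until every reviewer approves or a round limit is reached.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical agent signatures (assumptions for illustration only):
# an author drafts or revises given optional feedback; a reviewer
# returns None to approve or a critique string to request changes.
Author = Callable[[str, Optional[str]], str]
Reviewer = Callable[[str], Optional[str]]


@dataclass
class Stage:
    name: str
    author: Author
    reviewers: List[Reviewer]
    max_rounds: int = 3  # upper bound on revision rounds


def run_stage(stage: Stage, task: str) -> str:
    """Draft, collect critiques, and revise until all reviewers approve."""
    draft = stage.author(task, None)
    for _ in range(stage.max_rounds):
        critiques = [r(draft) for r in stage.reviewers]
        critiques = [c for c in critiques if c is not None]
        if not critiques:
            break  # every reviewer approved the draft
        draft = stage.author(task, "; ".join(critiques))
    return draft


def run_pipeline(stages: List[Stage], idea: str) -> str:
    """Feed each stage's verified output into the next stage."""
    artifact = idea
    for stage in stages:
        artifact = run_stage(stage, artifact)
    return artifact


def make_author(label: str) -> Author:
    """Stub author: wraps the task and marks the draft once revised."""
    def author(task: str, feedback: Optional[str]) -> str:
        suffix = "" if feedback is None else " [revised]"
        return f"{label}({task}){suffix}"
    return author


def strict_reviewer(draft: str) -> Optional[str]:
    """Stub reviewer: approves only drafts that were revised at least once."""
    return None if draft.endswith("[revised]") else "needs revision"


stages = [
    Stage("planning", make_author("plan"), [strict_reviewer]),
    Stage("scriptwriting", make_author("script"), [strict_reviewer]),
    Stage("cinematography", make_author("shots"), [strict_reviewer]),
]
result = run_pipeline(stages, "idea")
print(result)
```

In this sketch the round limit keeps a stage from looping forever when reviewers never approve; in that case the last draft is passed on unverified, which is a design choice of the sketch rather than something the abstract specifies.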