How AI Artists Generate Images and Videos from Scratch: From Prompt to Screen
    Artificial Intelligence (AI)

    Gianro Compagno
    2026-03-29
    5 min read
Audiovisual creation with artificial intelligence has radically changed how creatives approach their projects, and it goes far beyond writing a simple prompt. The process requires creative vision, detailed planning, visual references, and constant experimentation, and it is redefining the boundaries of traditional production. Clara López, AI Artist at Freepik, sums it up: “With this tool, I can do things for which I don’t have the means in real life.” The point is not to replace teams or to achieve absolute democratization, but to materialize ideas that previously existed only in the imagination, without relying on large budgets or infrastructure.

Today, it is common for a creative to imagine a music video or a campaign that cannot be shot because of technical or economic limitations. Where there were once barriers, AI opens up a new process: describing, iterating, and adjusting. Contrary to the belief that writing a prompt is enough, however, working with AI requires a precise methodology in which every decision affects the final result.

The process begins long before any interaction with the tool. “People cannot see what is in our heads,” explains López, who sees AI as a bridge between the idea and its representation. Building that bridge requires planning, visual culture, and a structured workflow that combines creativity and technique.

The starting point is mental: before writing any text, it is essential to imagine precisely what one wants to create. For López, this phase is almost cinematic: she visualizes shots, composition, and color clearly. She does not need detailed sketches or complex mood boards, only an internal visualization that guides the entire process.

Translating that mental image into a prompt is an exercise in precise description: shot, lighting, and key elements. López prefers brief, direct prompts, which make iteration easier and avoid overloading the model with unnecessary information.
The key is to choose the right words for each generation. The prompt is rarely final. The same instruction can produce multiple variations, and this is where the creator steps in to select, discard, and refine. Part of the work consists of launching an initial batch of results to identify the best visual direction and adjusting the instructions as needed. Understanding how AI interprets concepts such as shots, lighting, or styles is essential, but one must also accept that the result will never be identical to what was imagined. The prompt thus becomes a dialogue between the creator and the tool. An example of this process is the music video for “Ni borracho” by Quevedo, where López applied this approach.

The initial idea does not come from nowhere. “What am I going to need?” López asks herself. Reference images, art direction, and a script are essential elements, supported by the creator's training and experience. References help mark out the visual terrain: images, color palettes, framing, and textures guide the AI toward a specific aesthetic and allow decisions to be validated before anything is physically produced, saving time and resources. The script, even in short formats, structures the process and prevents a chaotic succession of tests. Art direction unifies all the elements, defining tone, visual coherence, and intention. Here, creative background is fundamental: AI cannot replace visual culture and professional judgment.

A common mistake is to think that all AI tools work the same way. Each model has its own strengths, limitations, and visual personality. The choice depends on the objective: generating an image from scratch is not the same as editing an existing one, and seeking a realistic finish is not the same as seeking an artistic one. López notes that some models are better suited to certain styles or movements, and she often combines several tools depending on the phase of the process. The criteria are both technical and aesthetic.
Some tools excel at specific textures or styles, and the choice of model also depends on practical factors such as formats, clip duration, or file size. Anticipating these details makes the workflow more efficient and avoids bottlenecks later on. In this context, the tool ceases to be a simple means and becomes an extension of the creative process. Choosing well does not mean using the most popular platform but understanding what each one contributes, since not all AI tools generate the same results or respond to instructions in the same way.

Artificial intelligence has imposed new rules on audiovisual creation: it forces creators to make ideas concrete, to make decisions constantly, and to think more precisely. Technology accelerates execution, but it does not replace the judgment or perspective of the creator, who remains the differentiating factor behind each work.
    Gianro Compagno

    CTO

    Gianro brings extensive experience in managing technology projects in multinational environments. His technical expertise, combined with an MBA and a master's degree in Investigative Psychology, creates a unique approach to technology solutions. As an AI and Automation Expert, he applies psychological insights to design more intuitive, human-centered systems. His detail-oriented approach and positive mindset ensure that our solutions are not only innovative and reliable but also aligned with how people naturally think and work.