WEBVTT 00:00:05.836 --> 00:00:10.286 AI Basics AI Video Production Process - Planning 00:00:25.574 --> 00:00:26.544 Hello 00:00:26.824 --> 00:00:29.784 I will be leading this introductory course on creating videos with AI 00:00:29.784 --> 00:00:32.174 and I am Shim Dongyup 00:00:32.634 --> 00:00:35.120 To produce a video with AI 00:00:35.220 --> 00:00:40.020 you need to go through several steps before you can complete the final output 00:00:40.480 --> 00:00:45.780 Even though AI has improved enough to skip many parts of the process 00:00:46.200 --> 00:00:49.940 there are still essential steps required when generating AI videos 00:00:50.360 --> 00:00:55.239 So in this lecture, I’ll guide you through the essential workflow for AI video production 00:00:55.279 --> 00:01:00.300 using theory, hands-on practice, and example cases 00:01:00.460 --> 00:01:05.660 so you can create AI videos yourself 00:01:06.000 --> 00:01:08.299 This is a beginner course 00:01:08.559 --> 00:01:12.400 but we’re not just learning how to use tools 00:01:12.760 --> 00:01:16.439 We’re training how to think creatively with AI 00:01:16.959 --> 00:01:19.940 In our first session we’ll look at 00:01:20.120 --> 00:01:25.579 the full AI video production process from planning to editing 00:01:25.919 --> 00:01:30.680 and then practice how to use AI 00:01:30.680 --> 00:01:33.240 specifically in the planning stage 00:01:34.480 --> 00:01:38.159 I’ll explain everything from the basics 00:01:38.259 --> 00:01:40.138 so feel free to follow along comfortably 00:01:40.438 --> 00:01:45.088 AI Video Production Process 00:01:45.919 --> 00:01:49.239 First the AI video production process 00:01:50.279 --> 00:01:53.919 You may have seen ads or short films 00:01:53.919 --> 00:01:58.239 with a note at the bottom saying 00:01:58.599 --> 00:02:01.119 this video was created 100 percent with AI 00:02:01.599 --> 00:02:06.699 For a while people were more surprised that those were 00:02:06.919 --> 00:02:09.100 made 100% with AI 
than impressed by their quality 00:02:09.320 --> 00:02:13.420 but now what matters is 00:02:13.620 --> 00:02:18.259 the final output itself and AI-generated videos are 00:02:18.559 --> 00:02:21.860 reaching a very high level of quality 00:02:22.600 --> 00:02:26.039 AI-generated video content can be divided into two types 00:02:26.219 --> 00:02:31.259 100% AI and Hybrid AI 00:02:32.039 --> 00:02:37.080 The difference is when everything from image generation 00:02:37.080 --> 00:02:40.240 to the final video is made only with prompts 00:02:40.360 --> 00:02:44.199 we call it 100 percent AI 00:02:44.459 --> 00:02:48.600 When AI is used together with real footage 00:02:48.600 --> 00:02:51.199 or images shot in advance 00:02:51.199 --> 00:02:54.140 we call that hybrid AI 00:02:55.160 --> 00:02:59.720 So when we make a video with 100% AI 00:02:59.720 --> 00:03:01.980 what process do we follow? 00:03:02.440 --> 00:03:07.240 We can break it down into six main steps 00:03:07.839 --> 00:03:10.080 First is idea development 00:03:10.440 --> 00:03:12.679 Second is text planning 00:03:12.919 --> 00:03:14.859 Third is image generation 00:03:14.879 --> 00:03:16.920 Fourth is video generation 00:03:17.120 --> 00:03:19.660 Fifth is music and voice creation 00:03:19.800 --> 00:03:23.140 Sixth is final editing and completion 00:03:24.100 --> 00:03:27.680 Steps 1 and 2 are the planning stage or pre-production 00:03:27.960 --> 00:03:30.280 Steps 3 to 5 are the production stage 00:03:30.700 --> 00:03:34.479 Step 6 is the post-production stage 00:03:35.119 --> 00:03:38.260 Now let’s look at each stage 00:03:39.360 --> 00:03:41.080 First is the planning stage 00:03:42.240 --> 00:03:44.060 This is the idea development stage 00:03:44.240 --> 00:03:48.720 The first step is expanding your initial idea based on keywords 00:03:49.600 --> 00:03:52.999 A video plan can start from a single word 00:03:53.279 --> 00:03:56.720 Using an LLM based on the word 00:03:56.720 --> 00:03:59.900 you can expand that word into
a wide range of ideas 00:04:00.940 --> 00:04:05.020 LLM stands for large language model 00:04:05.160 --> 00:04:08.399 and refers to a massive deep learning model 00:04:08.399 --> 00:04:10.100 trained on an enormous amount of data 00:04:10.400 --> 00:04:14.460 The most well-known LLMs are OpenAI’s GPT 00:04:14.740 --> 00:04:16.399 and Google’s Gemini 00:04:17.119 --> 00:04:20.659 An LLM can generate many ideas 00:04:20.839 --> 00:04:25.000 but you need to refine your prompts 00:04:25.000 --> 00:04:28.160 to fit the content you want to create 00:04:28.160 --> 00:04:30.359 so you can get results that match your final goal 00:04:30.939 --> 00:04:34.419 By asking clear questions this way 00:04:34.599 --> 00:04:38.500 you can shape the output and you can also customize 00:04:38.760 --> 00:04:41.259 the planning ideas to match your own style 00:04:41.859 --> 00:04:46.640 For example if you enter keywords like perfume commercial or sci-fi thriller 00:04:46.760 --> 00:04:50.760 the LLM will suggest a variety of concepts 00:04:50.760 --> 00:04:53.900 and storylines that fit the theme 00:04:54.600 --> 00:04:57.240 And if you give more detailed conditions 00:04:57.440 --> 00:05:01.580 like a floral daily perfume commercial 00:05:01.760 --> 00:05:04.279 targeting women in their 20s 00:05:04.519 --> 00:05:07.580 You’ll get results that fit your needs even better 00:05:07.880 --> 00:05:11.639 And when you customize it like with GPTs 00:05:11.839 --> 00:05:15.180 even if you only enter 00:05:15.480 --> 00:05:18.440 something like a perfume commercial 00:05:18.440 --> 00:05:20.920 it will automatically generate 00:05:20.920 --> 00:05:23.640 a 30-second emotional ad idea 00:05:23.640 --> 00:05:27.179 with a kitschy yet romantic mood 00:05:27.279 --> 00:05:28.980 because it follows the style you preset 00:05:29.600 --> 00:05:33.180 When used well, an LLM can 00:05:33.320 --> 00:05:36.360 take you beyond simple ideas and give you 00:05:36.500 --> 00:05:39.460 detailed and customized 
planning concepts 00:05:39.880 --> 00:05:43.379 Second is text planning and prompt writing 00:05:43.799 --> 00:05:47.320 If the first step gave you a broad set of 00:05:47.320 --> 00:05:49.600 ideas for the video 00:05:49.820 --> 00:05:53.720 The second step is where you define the purpose of the video 00:05:53.720 --> 00:05:58.460 and write specific plans and prompts that match that purpose 00:05:58.880 --> 00:06:03.360 Depending on whether it’s an ad a film a short-form clip or an educational video 00:06:03.480 --> 00:06:07.880 the direction tone and structure of the plan will be completely different 00:06:08.040 --> 00:06:10.399 And the prompts you create 00:06:10.399 --> 00:06:13.199 for image generation will change accordingly 00:06:14.079 --> 00:06:18.340 For example an ad is copy-focused a film is plot-focused 00:06:18.480 --> 00:06:23.340 and a short-form clip needs strong impact in the first three seconds 00:06:23.480 --> 00:06:26.719 So you plan the video based on these points 00:06:26.959 --> 00:06:31.179 and then write prompts for image generation 00:06:32.239 --> 00:06:36.159 So once the basic ideas come out in part one 00:06:36.359 --> 00:06:40.279 Step two expands them into a concrete plan 00:06:40.579 --> 00:06:43.419 and then moves on to writing prompts 00:06:43.559 --> 00:06:47.380 that visualize that plan through image generation 00:06:48.160 --> 00:06:52.600 When you ask the LLM to generate images 00:06:52.760 --> 00:06:55.640 you can say something like 00:06:55.640 --> 00:07:00.680 create Midjourney prompts for a cinematic 30-cut commercial 00:07:00.880 --> 00:07:06.519 based on the attached ad concept 00:07:06.519 --> 00:07:08.720 and it will generate the prompts you need 00:07:09.760 --> 00:07:14.380 Once the initial idea is shaped into a concrete plan 00:07:14.480 --> 00:07:20.279 and all the prompts for actual production are ready 00:07:20.359 --> 00:07:23.080 We move on to the real production stage 00:07:23.760 --> 00:07:28.640 The third 
step in the AI video process is image generation 00:07:29.880 --> 00:07:33.699 This is where the visual tone and mood of the video are decided 00:07:34.559 --> 00:07:40.160 For image generation you can use tools like 00:07:40.160 --> 00:07:42.359 Midjourney, Whisk, Dreamina, and ImageFX 00:07:42.919 --> 00:07:48.539 Each one has its own style and strengths 00:07:48.839 --> 00:07:54.760 so it’s important to try different platforms to find the image style you want 00:07:55.160 --> 00:07:59.820 When generating images it’s important to keep 00:07:59.820 --> 00:08:04.359 the character style, tone and manner, and brand identity consistent throughout the whole video 00:08:05.119 --> 00:08:09.039 To maintain this consistency 00:08:09.279 --> 00:08:13.619 it’s best to stick to one image-generation platform 00:08:13.679 --> 00:08:15.659 whenever possible 00:08:15.839 --> 00:08:18.759 To keep the tone and manner 00:08:18.979 --> 00:08:21.219 of the final video consistent 00:08:21.399 --> 00:08:27.629 using shared prompt words like romantic cinematic or neon cyberpunk 00:08:27.629 --> 00:08:29.339 is very helpful 00:08:29.519 --> 00:08:34.439 for setting the overall mood 00:08:35.239 --> 00:08:40.959 And if you want your own brand identity reflected in the video 00:08:41.159 --> 00:08:45.399 it’s best to apply that from the very first image creation 00:08:45.719 --> 00:08:50.889 To do that it helps to practice building your own 00:08:50.909 --> 00:08:53.160 composition, color schemes, props, and overall style 00:08:53.640 --> 00:08:57.810 It’s also important to create a storyboard 00:08:57.890 --> 00:09:01.519 because it helps you check the overall consistency of the images 00:09:02.159 --> 00:09:06.579 A storyboard lets you see the story visually at a glance 00:09:06.719 --> 00:09:11.759 and you can add or remove parts that feel empty or excessive 00:09:11.839 --> 00:09:15.720 making the content more solid 00:09:30.440 --> 00:09:31.810 Next is video generation 00:09:32.350 --> 00:09:35.710 Once all the images are ready 00:09:35.990 --> 00:09:39.640 you use them to create the video clips 00:09:40.520 --> 00:09:42.410 Now for video generation 00:09:42.730 --> 00:09:47.720 there are many ways to create AI videos 00:09:47.840 --> 00:09:52.960 and new methods keep coming out 00:09:53.140 --> 00:09:57.614 but let’s look at the most commonly used types of AI video generation 00:09:58.407 --> 00:10:02.217 First is T2V text-to-video 00:10:02.977 --> 00:10:07.427 You describe the video you want through a prompt 00:10:07.547 --> 00:10:10.240 and the system generates the video directly 00:10:10.880 --> 00:10:16.260 You can generate it with natural language and in many different languages 00:10:16.420 --> 00:10:21.979 These days people also use human-readable structured formats 00:10:22.095 --> 00:10:25.695 like JSON or YAML quite often 00:10:26.095 --> 00:10:29.495 Next is I2V image-to-video 00:10:29.695 --> 00:10:33.775 This method uses an image 00:10:34.000 --> 00:10:36.180 as the base to create the video 00:10:36.340 --> 00:10:40.500 It’s usually used together with a prompt 00:10:40.860 --> 00:10:44.880 and since you already have a reference image it produces more stable results 00:10:44.960 --> 00:10:50.000 and makes it easier to match the style you want compared to T2V 00:10:50.140 --> 00:10:55.740 where the outcome can be less predictable 00:10:56.240 --> 00:10:59.750 However if you don’t know how to create the right reference image 00:10:59.830 --> 00:11:01.920 this method can be difficult to use 00:11:02.400 --> 00:11:05.800 Next is V2V video-to-video 00:11:06.120 --> 00:11:09.070 This method uses an existing video or a generated video 00:11:09.190 --> 00:11:11.779 to
create a new one 00:11:12.079 --> 00:11:15.069 Many people use the restyle method 00:11:15.209 --> 00:11:19.219 when they want to change an existing video 00:11:19.419 --> 00:11:22.239 or when they want to change or remove 00:11:22.319 --> 00:11:26.120 certain elements inside the video 00:11:27.020 --> 00:11:30.799 Next is E2V elements-to-video 00:11:31.279 --> 00:11:34.689 This method takes multiple elements 00:11:34.809 --> 00:11:39.640 or images and combines them into a single video 00:11:40.160 --> 00:11:44.559 You can make it play randomly with different clips 00:11:44.719 --> 00:11:49.320 or you can use prompts to control the movements you want 00:11:49.600 --> 00:11:53.919 It’s very useful for people who find it hard 00:11:54.079 --> 00:11:57.499 to combine multiple separate elements into 00:11:57.599 --> 00:11:59.880 one coherent image or scene 00:12:00.300 --> 00:12:03.180 Now let’s look at video-generation platforms 00:12:03.740 --> 00:12:09.040 There are already many platforms that support AI video creation 00:12:09.180 --> 00:12:12.719 and new ones are continuing to appear 00:12:13.419 --> 00:12:16.720 Some of the most talked-about platforms right now are 00:12:16.860 --> 00:12:21.080 Runway, Veo3, Kling, and Higgsfield 00:12:21.260 --> 00:12:23.410 And because each of these platforms 00:12:23.410 --> 00:12:29.360 has its own strengths and unique features 00:12:29.440 --> 00:12:33.020 you need to understand and use those advantages well 00:12:33.200 --> 00:12:35.359 to create a good video 00:12:35.859 --> 00:12:38.160 Now let’s look at Veo3 00:12:38.280 --> 00:12:41.020 It’s a video generation platform by Google 00:12:41.120 --> 00:12:43.960 and if you add the audio and dialogue to the prompt 00:12:44.120 --> 00:12:48.760 it can create a finished video with lip sync 00:12:49.720 --> 00:12:52.880 But its generation rules are quite strict 00:12:53.060 --> 00:12:58.159 and it doesn’t offer many extra features beyond video generation itself 00:12:59.259 -->
00:13:00.439 Next is Kling 00:13:00.639 --> 00:13:05.679 Kling offers a wide range of features from image generation to video editing 00:13:05.679 --> 00:13:07.919 adding clips and applying various effects 00:13:08.019 --> 00:13:09.539 and it is known for 00:13:09.539 --> 00:13:14.400 producing very stable high-quality results 00:13:14.980 --> 00:13:17.759 However its credits are relatively expensive 00:13:17.839 --> 00:13:20.959 and it sometimes fails to 00:13:20.999 --> 00:13:23.120 fully understand the prompt 00:13:23.640 --> 00:13:24.700 Next is Runway 00:13:24.960 --> 00:13:29.919 Runway is an early video platform that maintains shapes well 00:13:30.019 --> 00:13:34.160 and has strong V2V features like Act 2 and RF 00:13:34.340 --> 00:13:39.000 However it has the drawback that movements often don’t match what you want 00:13:39.600 --> 00:13:40.939 Next is Higgsfield 00:13:40.999 --> 00:13:46.279 Higgsfield creates realistic styles that work well for social media 00:13:46.459 --> 00:13:52.080 and it can generate many VFX-style videos that were hard to make with prompts 00:13:52.320 --> 00:13:56.459 using simple one-click tools 00:13:56.639 --> 00:14:00.080 which makes it very effective for diverse video creation 00:14:00.320 --> 00:14:05.440 However the overall video quality still leaves something to be desired 00:14:06.240 --> 00:14:09.719 The third step is storyboard-based video generation 00:14:10.799 --> 00:14:15.040 Creating each video clip 00:14:15.040 --> 00:14:18.000 according to the storyboard 00:14:18.160 --> 00:14:20.320 lets you check the story as you go 00:14:20.520 --> 00:14:23.719 and it helps you understand the overall flow 00:14:23.779 --> 00:14:26.359 when producing a story-driven video 00:14:26.559 --> 00:14:30.220 It also makes it easier to see 00:14:30.220 --> 00:14:32.680 which scene you need to create next 00:14:32.860 --> 00:14:35.920 So working this way is very effective 00:14:36.440 --> 00:14:39.540 The fourth step is creating 
cinematic direction 00:14:40.280 --> 00:14:44.800 Using various VFX effects 00:14:44.900 --> 00:14:49.679 or applying cinematic and advertising-style composition 00:14:49.799 --> 00:14:52.319 and framing 00:14:52.499 --> 00:14:55.119 or incorporating unique objects 00:14:55.239 --> 00:15:00.959 can help create rich and distinctive visuals that are unique 00:15:01.099 --> 00:15:02.479 to AI-generated videos 00:15:03.119 --> 00:15:06.479 The fifth step is generating music and voice 00:15:07.419 --> 00:15:11.240 Images play a big role in a video 00:15:11.340 --> 00:15:15.140 but without sound effects, music, or voice 00:15:15.260 --> 00:15:18.159 the overall quality drops significantly 00:15:18.339 --> 00:15:22.900 So this is a crucial element you should never skip 00:15:23.820 --> 00:15:26.399 In the past music and sound were 00:15:26.439 --> 00:15:29.780 hard to access unless you were a professional 00:15:29.900 --> 00:15:32.679 but now you can create 00:15:32.899 --> 00:15:37.399 the music sound effects and dialogue you want through various platforms 00:15:38.179 --> 00:15:42.239 One of the most effective platforms for creating BGM 00:15:42.239 --> 00:15:44.759 is Suno 00:15:44.959 --> 00:15:49.440 where you can easily generate music 00:15:49.540 --> 00:15:51.640 by entering the lyrics and the mood or genre you want 00:15:52.260 --> 00:15:54.650 and it also offers professional features 00:15:54.650 --> 00:15:56.999 for detailed editing 00:15:57.639 --> 00:16:01.640 There are also platforms like MMaudio 00:16:01.700 --> 00:16:05.459 that generate sound effects automatically 00:16:05.459 --> 00:16:07.520 based on the video you upload 00:16:08.460 --> 00:16:10.880 BGM and sound effects 00:16:11.000 --> 00:16:14.740 make the video feel much richer 00:16:14.860 --> 00:16:18.720 So it’s important to remember that they are essential elements 00:16:19.220 --> 00:16:24.239 Another important element is the voice used in videos with dialogue 00:16:24.639 --> 00:16:30.040 The 
choice of voice tone intonation and emotion 00:16:30.260 --> 00:16:34.239 shapes the overall tone and mood of the video 00:16:34.379 --> 00:16:35.920 So it is a very important factor 00:16:36.780 --> 00:16:40.000 In the past AI voices were easy to notice 00:16:40.120 --> 00:16:44.479 but now platforms like ElevenLabs and Korean tools like Typecast 00:16:44.698 --> 00:16:47.279 and Supertone 00:16:47.364 --> 00:16:52.346 can create dialogue that sounds very realistic and expressive 00:16:52.808 --> 00:16:54.997 The sixth step is the editing stage 00:16:56.079 --> 00:17:01.780 After creating all the AI video clips through these steps 00:17:02.010 --> 00:17:05.034 The final and most important stage is editing 00:17:05.746 --> 00:17:08.376 How you edit can even change 00:17:08.377 --> 00:17:12.655 the entire story of the video so it is an extremely important part of the process 00:17:13.119 --> 00:17:19.430 First you upload all the video clips music and voice tracks 00:17:19.516 --> 00:17:21.767 and sync them together 00:17:22.131 --> 00:17:26.487 Then you arrange the cuts according to the storyboard 00:17:26.605 --> 00:17:30.088 and adjust them as needed to build a stronger final video 00:17:30.568 --> 00:17:35.962 Once the rough cut is created by placing all the videos 00:17:35.965 --> 00:17:36.883 and audio together 00:17:37.348 --> 00:17:41.802 You review the whole piece 00:17:41.996 --> 00:17:44.092 and strengthen the story 00:17:44.269 --> 00:17:48.160 by adding what’s missing and removing anything excessive 00:17:48.160 --> 00:17:49.945 or unnecessary 00:17:50.723 --> 00:17:55.839 There are many editing tools now including AI-based platforms 00:17:55.839 --> 00:18:00.186 One of the most widely used is CapCut which is intuitive 00:18:00.244 --> 00:18:03.920 and offers various video effects and audio options 00:18:04.000 --> 00:18:09.460 making it a very convenient tool especially for beginners 00:18:09.852 --> 00:18:14.702 Planning Stage Practice 00:18:14.959 --> 
00:18:17.642 Now we’ll take some time to apply 00:18:17.642 --> 00:18:21.211 what we learned and create something in practice 00:18:21.693 --> 00:18:23.864 Before we begin the planning practice 00:18:24.091 --> 00:18:27.192 remember that there is no single correct platform 00:18:27.573 --> 00:18:32.480 So it’s important to explore more tools and production methods 00:18:32.693 --> 00:18:36.382 beyond the ones I introduce 00:18:36.582 --> 00:18:40.319 Also since this explanation is given in a short time 00:18:40.532 --> 00:18:45.171 please keep in mind that 00:18:45.171 --> 00:18:47.548 we can’t cover every detail 00:18:48.320 --> 00:18:55.351 Now let’s move from theory to practice 00:18:55.368 --> 00:18:57.809 and create a short AI video together 00:18:58.705 --> 00:19:03.269 We’ll go through each step while keeping 00:19:03.464 --> 00:19:05.973 the six-step process in mind 00:19:06.877 --> 00:19:10.427 So the first step is idea development 00:19:10.427 --> 00:19:13.976 Let’s look at how to use GPT for this step 00:19:14.491 --> 00:19:17.234 Every video begins with an idea 00:19:17.280 --> 00:19:20.534 And a good idea starts with a good question 00:19:20.790 --> 00:19:22.940 As we saw earlier 00:19:22.940 --> 00:19:26.303 A clear and specific initial prompt is essential 00:19:26.559 --> 00:19:30.464 So first we’ll practice 00:19:30.464 --> 00:19:35.379 how to refine an idea by using step-by-step requests 00:19:35.599 --> 00:19:40.559 Instead of asking for five synopses at once 00:19:40.559 --> 00:19:43.948 it’s better to approach it step by step like this 00:19:44.159 --> 00:19:50.343 First ask it to suggest core themes for the content you want to create 00:19:50.559 --> 00:19:55.513 For example you can request five core themes 00:19:55.513 --> 00:19:58.358 for a short film starring a cat 00:19:58.719 --> 00:20:03.891 Once you run it you’ll see the suggested themes like this 00:20:04.320 --> 00:20:09.338 Then simply choose the one you like the most from the list 
00:20:09.643 --> 00:20:13.794 Now let’s develop the first suggested theme 00:20:14.047 --> 00:20:16.715 from the list 00:20:16.715 --> 00:20:23.191 The next step is to ask for a more detailed explanation of the idea you selected 00:20:23.191 --> 00:20:25.480 Like this, you can request 00:20:25.654 --> 00:20:29.319 a detailed description of the main character, the cat’s personality, 00:20:29.319 --> 00:20:34.497 and the owner’s traits based on the chosen idea 00:20:35.186 --> 00:20:41.196 Then you’ll receive a detailed explanation like this 00:20:41.516 --> 00:20:44.710 With this level of detail 00:20:44.928 --> 00:20:48.753 you can start building your synopsis 00:20:49.039 --> 00:20:53.581 Next you ask it to create a three-scene synopsis 00:20:53.681 --> 00:20:58.523 based on the characters and story elements you developed above 00:20:58.766 --> 00:21:03.723 When you run it you’ll see that it generates 00:21:03.730 --> 00:21:07.131 a scenario made up of three scenes 00:21:07.320 --> 00:21:13.459 From here you can refine each scene by adding details 00:21:13.674 --> 00:21:20.924 such as camera movement, lighting, and other techniques to make it more complete 00:21:21.334 --> 00:21:25.969 The next step is to translate the scenario 00:21:25.969 --> 00:21:30.119 into visual language that Midjourney can understand 00:21:30.119 --> 00:21:33.615 meaning into prompts 00:21:34.559 --> 00:21:38.987 From this point the quality of the results can vary significantly 00:21:39.359 --> 00:21:43.309 When writing a prompt the main focus 00:21:43.309 --> 00:21:45.767 is deciding what you want it to draw 00:21:46.400 --> 00:21:52.157 Here, we won’t generate every shot from each scene 00:21:52.586 --> 00:21:58.095 We’ll create only the key image for each one 00:21:58.400 --> 00:22:01.820 So I’ll give a command like this 00:22:02.336 --> 00:22:08.411 Next, you choose three key shots that represent the scenario 00:22:08.636 --> 00:22:13.652 and ask it to create Midjourney image prompts for
them 00:22:14.391 --> 00:22:16.641 Once you run the command 00:22:16.641 --> 00:22:22.941 you’ll see prompts created for 00:22:22.941 --> 00:22:27.736 the three representative cuts from the scenario 00:22:28.200 --> 00:22:32.420 Of course if you want to show a more detailed story 00:22:32.449 --> 00:22:35.949 you can adjust the number of cuts 00:22:36.176 --> 00:22:38.626 in the command 00:22:38.626 --> 00:22:43.277 and generate as many images as you need 00:22:43.689 --> 00:22:46.127 Another important point is that 00:22:46.127 --> 00:22:51.833 the style of the cuts is 00:22:51.833 --> 00:22:53.719 not specified here 00:22:53.719 --> 00:23:01.425 So the images may come out in very different tones 00:23:01.425 --> 00:23:04.119 such as cartoon-like or fully realistic 00:23:04.119 --> 00:23:10.559 To avoid this it’s better to provide more specific details in the prompt 00:23:10.559 --> 00:23:15.204 If you choose three key shots that represent the scenario 00:23:15.440 --> 00:23:21.814 and specify a broad style such as Japanese animation 00:23:22.119 --> 00:23:25.557 then ask it to create the prompts 00:23:25.869 --> 00:23:30.307 You’ll get images with consistent style 00:23:30.563 --> 00:23:35.195 So it’s important to keep this in mind when you work on this step 00:23:35.448 --> 00:23:38.504 When you follow this process you’ll see that 00:23:38.504 --> 00:23:42.259 the prompt includes descriptions related to Japanese animation styles 00:23:42.494 --> 00:23:48.523 and references to artists associated with that style 00:23:49.039 --> 00:23:54.516 So far we’ve looked at the overall process of creating an AI video 00:23:54.516 --> 00:23:57.316 and practiced 00:23:57.316 --> 00:24:01.368 how AI can be used in the planning stage 00:24:01.679 --> 00:24:05.104 As you can see, creating an AI video 00:24:05.169 --> 00:24:07.907 requires many steps 00:24:08.031 --> 00:24:15.545 and it’s not something that comes out with a single click 00:24:16.000 --> 00:24:18.481 To use AI 
video effectively 00:24:18.489 --> 00:24:23.359 it’s helpful to take advantage of what sets it apart from traditional video production 00:24:23.359 --> 00:24:25.844 Using virtual models or virtual environments 00:24:25.844 --> 00:24:31.881 that are difficult to film in real life is one approach 00:24:32.119 --> 00:24:37.008 But as we learned today, using AI in the planning stage 00:24:37.015 --> 00:24:39.662 also helps expand your ideas 00:24:39.667 --> 00:24:44.055 beyond what’s in your head 00:24:44.719 --> 00:24:50.927 So as a creator it’s good to keep exploring and using AI in many different ways 00:24:51.400 --> 00:24:56.337 I hope this session helped you do that and this concludes the lecture 00:24:56.982 --> 00:24:57.546 Thank you 00:24:58.791 --> 00:25:00.151 Summary AI Video Production Process 00:25:00.198 --> 00:25:01.918 Types of AI Video Production 100% AI Production: A method where everything, from image generation to the final video, is created entirely through prompts 00:25:01.918 --> 00:25:03.328 Hybrid AI Production: A method that uses real images or video sources along with AI to create the final output 00:25:03.328 --> 00:25:04.828 Steps for 100% AI Production 00:25:04.828 --> 00:25:06.088 1. Idea Development: Using an LLM based on keywords 2. Text Planning and Prompt Writing 00:25:06.088 --> 00:25:07.588 3. Image Generation: Using tools like Midjourney, Whisk, Dreamina, and ImageFX 4. Video Generation: Using platforms like Veo3, Kling, Runway, and Higgsfield 00:25:07.588 --> 00:25:08.908 5. Music and Voice Generation: Using tools like Suno, MMaudio, ElevenLabs, Typecast, and Supertone 6.
Editing 00:25:09.388 --> 00:25:11.718 Planning Stage Practice: Ideation Using GPT, Refining Ideas Through a Step-by-Step Approach 00:25:11.768 --> 00:25:14.988 Planning Steps: Select the core theme; create a synopsis made of three scenes 00:25:14.988 --> 00:25:17.988 generate English prompts that Midjourney can understand; create key images for each scene 00:25:18.852 --> 00:25:20.112 Prompt Guide Sample 00:25:20.163 --> 00:25:21.163 Idea Development (Example) [For ChatGPT] 00:25:21.183 --> 00:25:23.543 Create five synopsis ideas in Korean for a 30-second short film related to a cat with a dramatic and witty cinematic mood, formatted as three 10-second scenes 00:25:23.563 --> 00:25:24.863 Text Planning (Example) [For ChatGPT] The Cat’s Revenge 00:25:24.863 --> 00:25:25.803 Scene 1 (10s): The owner secretly eats a late-night snack under the kitchen light while the cat glares from under the table. 00:25:25.803 --> 00:25:26.743 Scene 2 (10s): The cat suddenly jumps, knocks over a cup, and the snack spills onto the floor. The owner panics. 00:25:26.743 --> 00:25:28.043 Scene 3 (10s): The cat sits proudly on the spilled food and stares into the camera as if saying "it’s mine." 00:25:28.043 --> 00:25:29.003 Create an English Midjourney prompt to generate the preceding scenes in a cute animation style
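The video-generation section of the lecture mentioned that T2V prompts are often written in human-readable structured formats such as JSON or YAML. As a sketch only, here is what a structured prompt for Scene 1 of the sample plan might look like, built and serialized in Python; the field names are illustrative assumptions, not any platform's actual schema:

```python
import json

# Hypothetical structured T2V prompt for Scene 1 of "The Cat's Revenge".
# Field names are illustrative; each platform defines its own expected keys.
scene1_prompt = {
    "scene": ("The owner secretly eats a late-night snack under the "
              "kitchen light while the cat glares from under the table"),
    "style": "cute animation, dramatic and witty cinematic mood",
    "camera": "low angle from under the table, slow push-in",
    "lighting": "single warm overhead kitchen light, dark surroundings",
    "duration_seconds": 10,
    "audio": "quiet, suspenseful background music",
}

# Serialize to the JSON text you would paste into a T2V prompt field.
print(json.dumps(scene1_prompt, indent=2))
```

Reusing the same field set for Scenes 2 and 3, and keeping the style value identical across them, is one way to apply the lecture's advice about holding the tone and manner consistent from cut to cut.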