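Encoding and Decoding

0 Introduction

Among FFmpeg's libraries, libavformat handles container muxing and demuxing, and libavcodec handles encoding and decoding. Together they make it possible to convert raw video data (.yuv) into a container file (.mp4) and back. This post records the encoding and decoding workflow along with the corresponding code.

First we need a yuv file and an mp4 file. Both can be generated with the following commands (ffmpeg must be installed):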

ffmpeg -f lavfi -i testsrc=duration=10:size=1280x720:rate=60 -pix_fmt yuv420p input.mp4
ffmpeg -f lavfi -i testsrc=duration=10:size=1280x720:rate=60 -pix_fmt yuv420p input.yuv

As an aside, the size difference between the raw data and the compressed, muxed video file is huge:

-rw-r--r--    1 cyanwoods  staff   196K Sep 19 17:02 input.mp4
-rw-r--r--    1 cyanwoods  staff   791M Sep 19 17:03 input.yuv
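
The size of the raw file can be checked by hand. yuv420p stores a full-resolution Y plane plus U and V planes at half resolution in each direction, which works out to 1.5 bytes per pixel. The following is a minimal sketch of that arithmetic; the function names are made up for illustration and are not part of FFmpeg.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one tightly packed yuv420p frame: a full-size Y plane plus
 * U and V planes subsampled by 2 in both directions (rounded up). */
static int64_t yuv420p_frame_size(int width, int height) {
    assert(width > 0 && height > 0);
    int64_t luma = (int64_t)width * height;
    int64_t chroma = (int64_t)((width + 1) / 2) * ((height + 1) / 2);
    return luma + 2 * chroma;
}

/* Bytes in a raw file holding `seconds` seconds of video at `fps` frames per second. */
static int64_t yuv420p_file_size(int width, int height, int fps, int seconds) {
    assert(fps > 0 && seconds > 0);
    return yuv420p_frame_size(width, height) * fps * seconds;
}
```

For the test clip, yuv420p_file_size(1280, 720, 60, 10) gives 829,440,000 bytes, roughly 791 MiB, which matches the size of input.yuv in the listing above.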

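0.1 Basic concepts

[Figure: 编解码.drawio]

This is how I understand the contents of a media container: several video streams, several audio streams, several subtitle streams, and metadata, plus a file header and a file trailer. In this experiment there is only one video stream.

A frame holds one picture of decoded (raw) data. A packet usually holds the encoded data of one frame, and packets are what get written into the media container.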

1 yuv => mp4

1.1 Overall flow

A raw video file has no header, so the resolution and frame rate have to be supplied by hand.
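
Because these values come from the user, it helps to validate them before handing them to the encoder. Below is a minimal sketch that parses the `<width>x<height>@<fps>` argument format used by the program in section 1.2. The helper name parse_video_spec is hypothetical, and sscanf is just one way to do it.

```c
#include <assert.h>
#include <stdio.h>

/* Parse a "<width>x<height>@<fps>" string such as "1280x720@60".
 * Returns 0 on success, -1 if the string is malformed or a value is not positive. */
static int parse_video_spec(const char *spec, int *width, int *height, int *fps) {
    assert(spec && width && height && fps);
    int w, h, f;
    char trailing;
    /* The extra %c catches trailing garbage: exactly three conversions must succeed. */
    if (sscanf(spec, "%dx%d@%d%c", &w, &h, &f, &trailing) != 3)
        return -1;
    if (w <= 0 || h <= 0 || f <= 0)
        return -1;
    *width = w;
    *height = h;
    *fps = f;
    return 0;
}
```

Compared with chaining strtok() and atoi(), this rejects input such as "1280x720" (missing frame rate) instead of dereferencing a NULL token.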

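Roughly, the steps are:

1. Open the yuv file.
2. Obtain the resolution (width and height) and the frame rate of the yuv file. These three values, together with the pixel format, are required for encoding.
3. Set the path of the output file.
4. Choose a codec (H.264, H.265, etc.), then find the matching encoder and create an encoder context for it.
5. Set the basic video parameters (resolution, frame rate, pixel format) on the encoder context, then open it.
6. Create and initialize the muxer (the format context).
7. Set the codec parameters of the media stream. At least four fields are needed (this comes from my own testing; other encoders may need more, and avcodec_parameters_from_context() can copy all of them from the encoder context at once):
  1. stream type
  2. codec ID
  3. video width
  4. video height
8. Open the output file.
9. Write the file header.
10. Create the frame and set:
  1. pixel format
  2. video width
  3. video height
  After setting these, call av_frame_get_buffer() to allocate the frame's buffers.
11. Initialize the packet. An uninitialized AVPacket leads to crashes such as a bus error (learned the hard way), so it has to be set up with av_new_packet() or, preferably, allocated with av_packet_alloc().
12. Loop over the yuv file, reading the Y, U and V planes of one frame per iteration and incrementing the frame counter. Whenever a frame is read:
  1. Send the frame to the encoder.
  2. Receive encoded data from the encoder into the packet (one packet holds one encoded video frame). Whenever a packet is returned:
    1. convert the time base
    2. pass the packet to the muxer
    3. unref the packet
  3. Go back to step 12. The encoder may buffer frames, so once the file is exhausted a NULL frame should be sent to flush the packets it is still holding.
13. Write the file trailer.
14. Free the encoder and muxer structures.
15. Close the raw video file.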
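
The "convert the time base" step deserves a closer look. The frame pts is counted in units of the encoder time base (1/fps here), while the muxer may pick a different time base for the stream, so each timestamp has to be rescaled. In FFmpeg this is what av_packet_rescale_ts() does. The sketch below only illustrates the underlying arithmetic; it is not FFmpeg's implementation, the Rational type and rescale_ts name are stand-ins, and it assumes non-negative timestamps that do not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* A rational number num/den, standing in for FFmpeg's AVRational in this sketch. */
typedef struct {
    int num;
    int den;
} Rational;

/* Convert timestamp `ts` from time base `src` to time base `dst`, rounding to
 * the nearest integer: ts * (src.num / src.den) / (dst.num / dst.den).
 * Simplified: assumes ts >= 0 and that the products fit in int64_t. */
static int64_t rescale_ts(int64_t ts, Rational src, Rational dst) {
    assert(ts >= 0 && src.num > 0 && src.den > 0 && dst.num > 0 && dst.den > 0);
    int64_t num = ts * src.num * dst.den;
    int64_t den = (int64_t)src.den * dst.num;
    return (num + den / 2) / den;
}
```

As an example, with an encoder time base of 1/60 and an assumed stream time base of 1/15360, frame number 1 lands on timestamp 256 and frame number 60 on 15360, which is exactly one second in both time bases.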

1.2 Code

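// Convert a raw .yuv (yuv420p) file into an .mp4 file
#include <stdio.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/imgutils.h>

/* Read one tightly packed plane from the file into a frame plane.
 * Rows in an AVFrame are linesize bytes apart, which may exceed the width. */
static int read_plane(FILE *fp, uint8_t *dst, int linesize, int w, int h) {
    for (int y = 0; y < h; y++) {
        if (fread(dst + (size_t)y * linesize, 1, w, fp) != (size_t)w)
            return -1; // EOF or short read
    }
    return 0;
}

/* Send one frame to the encoder (NULL flushes it) and mux every packet it returns. */
static int encode_and_write(AVCodecContext *enc, AVFrame *frame, AVPacket *pkt,
                            AVFormatContext *fmt, AVStream *st) {
    int ret = avcodec_send_frame(enc, frame);
    if (ret < 0)
        return ret;
    while ((ret = avcodec_receive_packet(enc, pkt)) == 0) {
        av_packet_rescale_ts(pkt, enc->time_base, st->time_base); // convert the time base
        pkt->stream_index = st->index;
        ret = av_interleaved_write_frame(fmt, pkt); // hand the packet to the muxer
        av_packet_unref(pkt);                       // release the packet data
        if (ret < 0)
            return ret;
    }
    /* EAGAIN: the encoder wants more input; EOF: it is fully flushed */
    return (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) ? 0 : ret;
}

int main(int argc, char *argv[]) {
    /* Read and split the arguments (width and height are assumed to be even) */
    int width, height, fps;
    if (argc != 3 || sscanf(argv[2], "%dx%d@%d", &width, &height, &fps) != 3 ||
        width <= 0 || height <= 0 || fps <= 0) {
        fprintf(stderr, "Invalid arguments\n"
                        "Usage: ./encode.out <INPUT.yuv> <width>x<height>@<fps>\n");
        return 1;
    }
    const char *file_path = argv[1];
    const char *outputpath = "output.mp4";
    printf(" =====>input file path: %s\n"
           " =====> %dx%d@%d\n"
           " =====>output file path: %s\n",
           file_path, width, height, fps, outputpath);

    /* Open the yuv file */
    FILE *yuv_fd = fopen(file_path, "rb");
    if (!yuv_fd) {
        perror("fopen");
        return 1;
    }

    /* Set up the encoder */
    const AVCodec *pCodec = avcodec_find_encoder(AV_CODEC_ID_H264);
    if (!pCodec) {
        fprintf(stderr, "No H.264 encoder available in this FFmpeg build\n");
        return 1;
    }
    AVCodecContext *pCodecCtx = avcodec_alloc_context3(pCodec);
    if (!pCodecCtx)
        return 1;
    pCodecCtx->width = width;
    pCodecCtx->height = height;
    pCodecCtx->time_base = (AVRational){1, fps};
    pCodecCtx->framerate = (AVRational){fps, 1};
    pCodecCtx->pix_fmt = AV_PIX_FMT_YUV420P;
    pCodecCtx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER; // MP4 keeps the codec headers in the container
    if (avcodec_open2(pCodecCtx, pCodec, NULL) < 0) {
        fprintf(stderr, "Cannot open encoder\n");
        return 1;
    }
    printf(" =====encoder ready=====>\n");

    /* Set up the muxer; the container format is guessed from the file name */
    AVFormatContext *pFormatCtx = NULL;
    if (avformat_alloc_output_context2(&pFormatCtx, NULL, NULL, outputpath) < 0) {
        fprintf(stderr, "Cannot create output context\n");
        return 1;
    }

    /* Describe the video stream: copies codec type, codec id, width, height
     * and the encoder's extradata into the stream parameters */
    AVStream *vStream = avformat_new_stream(pFormatCtx, NULL);
    if (!vStream || avcodec_parameters_from_context(vStream->codecpar, pCodecCtx) < 0) {
        fprintf(stderr, "Cannot set up video stream\n");
        return 1;
    }
    vStream->time_base = pCodecCtx->time_base; // a hint; the muxer may change it
    printf(" =====muxer and stream ready=====>\n");

    /* Open the output file; the I/O context is stored in pFormatCtx->pb */
    if (avio_open(&pFormatCtx->pb, outputpath, AVIO_FLAG_WRITE) < 0) {
        fprintf(stderr, "Cannot open %s\n", outputpath);
        return 1;
    }

    /* Write the file header */
    if (avformat_write_header(pFormatCtx, NULL) < 0) {
        fprintf(stderr, "Error in writing header\n");
        return 1;
    }
    printf(" =====header written=====>\n");

    /* Set up the frame */
    AVFrame *frame = av_frame_alloc();
    if (!frame)
        return 1;
    frame->format = AV_PIX_FMT_YUV420P;
    frame->width = width;
    frame->height = height;
    if (av_frame_get_buffer(frame, 0) < 0) {
        fprintf(stderr, "Cannot allocate frame buffers\n");
        return 1;
    }

    /* Set up the packet; av_packet_alloc() returns a properly initialized packet */
    AVPacket *packet = av_packet_alloc();
    if (!packet)
        return 1;
    printf(" =====frame and packet ready=====>\n");

    /* Read frames from the file and feed them to the encoder */
    int64_t frame_count = 0;
    while (1) {
        /* The encoder may still reference the previous buffers */
        if (av_frame_make_writable(frame) < 0)
            break;
        if (read_plane(yuv_fd, frame->data[0], frame->linesize[0], width, height) < 0 ||
            read_plane(yuv_fd, frame->data[1], frame->linesize[1], width / 2, height / 2) < 0 ||
            read_plane(yuv_fd, frame->data[2], frame->linesize[2], width / 2, height / 2) < 0) {
            break; // no more data: leave the loop
        }
        frame->pts = frame_count++;
        if (encode_and_write(pCodecCtx, frame, packet, pFormatCtx, vStream) < 0) {
            fprintf(stderr, "Error while encoding frame %lld\n", (long long)frame->pts);
            break;
        }
    }
    /* Flush the frames the encoder is still holding */
    encode_and_write(pCodecCtx, NULL, packet, pFormatCtx, vStream);

    av_write_trailer(pFormatCtx);
    av_packet_free(&packet);
    av_frame_free(&frame);
    avcodec_free_context(&pCodecCtx);
    avio_closep(&pFormatCtx->pb);
    avformat_free_context(pFormatCtx);
    fclose(yuv_fd);
    return 0;
}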

2 mp4 => yuv

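2.1 Overall flow

As mentioned earlier, a container may hold several streams, so when reading an mp4 file we first have to find out which stream is the video stream, and then use that stream's codec parameters to decode the video.

1. Open the mp4 file; the container information is stored in the format context.
2. Pick the video stream out of the format context (check whether each stream's codec type is video).
3. Use the stream's codecpar to find the decoder and create a decoder context, then open it with avcodec_open2().
4. Create the output file.
5. Create the frame and the packet.
6. Read packets from the format context one at a time. For each packet:
  1. Send the packet to the decoder.
  2. Receive decoded frames from the decoder. For each frame received:
    1. write the raw data of each plane to the output file
  3. Unref the packet.
7. Close the output file.
8. Free the frame.
9. Free the format and decoder contexts.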
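
One detail when writing the planes: in an AVFrame, the rows of plane i are linesize[i] bytes apart, and linesize can be larger than the visible width because of alignment padding. Writing width * height bytes straight from data[i] would then mix padding into the output. The following is a minimal sketch of copying a padded plane into a tightly packed buffer, row by row. The function name pack_plane is hypothetical and the code uses plain buffers rather than FFmpeg types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy a plane whose rows are `linesize` bytes apart into a tightly packed
 * buffer with exactly `width` bytes per row. Returns the number of bytes written. */
static size_t pack_plane(uint8_t *dst, const uint8_t *src,
                         int linesize, int width, int height) {
    assert(dst && src && width > 0 && height > 0 && linesize >= width);
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * width, src + (size_t)y * linesize, (size_t)width);
    return (size_t)width * height;
}
```

For a yuv420p frame the same copy would be applied three times: once for Y with the full width and height, and once each for U and V with half the width and half the height.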

2.2 Code

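// Convert an .mp4 file into a raw .yuv file (assumes the video decodes to yuv420p)
#include <stdio.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>

/* Write one plane row by row so that linesize padding is not written out. */
static void write_plane(FILE *fp, const uint8_t *src, int linesize, int w, int h) {
    for (int y = 0; y < h; y++)
        fwrite(src + (size_t)y * linesize, 1, w, fp);
}

/* Send one packet to the decoder (NULL flushes it) and write every frame it returns. */
static int decode_and_write(AVCodecContext *dec, const AVPacket *pkt, AVFrame *frame, FILE *fp) {
    int ret = avcodec_send_packet(dec, pkt);
    if (ret < 0)
        return ret;
    while ((ret = avcodec_receive_frame(dec, frame)) == 0) {
        int w = frame->width, h = frame->height;
        write_plane(fp, frame->data[0], frame->linesize[0], w, h);         // Y plane
        write_plane(fp, frame->data[1], frame->linesize[1], w / 2, h / 2); // U plane
        write_plane(fp, frame->data[2], frame->linesize[2], w / 2, h / 2); // V plane
        av_frame_unref(frame);
    }
    /* EAGAIN: the decoder wants more input; EOF: it is fully flushed */
    return (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) ? 0 : ret;
}

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <INPUT.mp4>\n", argv[0]);
        return 1;
    }

    /* Open the container and read its stream information */
    AVFormatContext *pFormatCtx = NULL;
    if (avformat_open_input(&pFormatCtx, argv[1], NULL, NULL) < 0 ||
        avformat_find_stream_info(pFormatCtx, NULL) < 0) {
        fprintf(stderr, "Cannot open %s\n", argv[1]);
        return 1;
    }

    /* Find the first video stream */
    int videoStream = -1;
    for (unsigned int i = 0; i < pFormatCtx->nb_streams; i++) {
        if (pFormatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
            videoStream = (int)i;
            break;
        }
    }
    if (videoStream == -1) {
        return -1;
    }

    /* Create and open the decoder from the stream's codec parameters */
    AVCodecParameters *pCodecPar = pFormatCtx->streams[videoStream]->codecpar;
    const AVCodec *pCodec = avcodec_find_decoder(pCodecPar->codec_id);
    AVCodecContext *pCodecCtx = pCodec ? avcodec_alloc_context3(pCodec) : NULL;
    if (!pCodecCtx || avcodec_parameters_to_context(pCodecCtx, pCodecPar) < 0 ||
        avcodec_open2(pCodecCtx, pCodec, NULL) < 0) {
        fprintf(stderr, "Cannot open decoder\n");
        return 1;
    }

    FILE *pFile = fopen("output.yuv", "wb");
    AVFrame *pFrame = av_frame_alloc();
    AVPacket *packet = av_packet_alloc();
    if (!pFile || !pFrame || !packet)
        return 1;

    /* Read packets from the container; only video packets go to the decoder */
    while (av_read_frame(pFormatCtx, packet) >= 0) {
        if (packet->stream_index == videoStream)
            decode_and_write(pCodecCtx, packet, pFrame, pFile);
        av_packet_unref(packet);
    }
    /* Flush the frames the decoder is still holding */
    decode_and_write(pCodecCtx, NULL, pFrame, pFile);

    fclose(pFile);
    av_packet_free(&packet);
    av_frame_free(&pFrame);
    avcodec_free_context(&pCodecCtx);
    avformat_close_input(&pFormatCtx);

    return 0;
}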

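3 Wrap-up

Looking at both directions, raw data to mp4 and mp4 to raw data, the core of encoding, decoding, muxing and demuxing comes down to these steps:

[Figure: 编解码-第 2 页.drawio]

By origin:

In decoding, the packet comes from the format context (formatctx).
In encoding, the packet comes from the encoder.
In decoding, the frame comes from the decoder context (CodecCtx).
In encoding, the frame is built from the raw data.

By destination:

In decoding, the packet is sent to the decoder.
In encoding, the packet is sent to the format context for muxing.
In decoding, the frame is written to the raw file.
In encoding, the frame is sent to the encoder.

Decoding flow:

Get the format context.
Read packets from the format context.
Send each packet to the decoder to get frames.
Write the frame data to the raw file.

Encoding flow:

Build frames from the raw data.
Send each frame to the encoder to get packets.
Send the packets to the format context.
The format context writes them out in the desired container format.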