X265 码率控制算法

参考资料:https://www.cnblogs.com/lakeone/p/5436481.html 关于算法部分写的很清晰

1-PASS ABR

rateControlStart()

这个函数相当于码率控制的入口,会在每一帧被编码前调用,根据码率控制设置编码参数。

1
2
double q = x265_qScale2qp(rateEstimateQscale(curFrame, rce));
...

rateEstimateQscale(curFrame, rce)

ABR 码率控制模式的主要函数,设定当前帧的 QScale

  • B 帧没有单独进行码率控制,其 QP 是通过相邻两帧 P 帧的 QP 平均值加上一个偏移量来估计的
  • 对于其他他帧首先维护 SATD 的 moving average,cplxsum 相当于一个归一化系数(不除以他的话初始几帧算出的均值会偏小)。 blurredComplexity 就是帧的平均复杂度
1
2
3
4
5
m_shortTermCplxSum *= 0.5;
m_shortTermCplxCount *= 0.5;
m_shortTermCplxSum += m_currentSatd / (CLIP_DURATION(m_frameDuration) / BASE_FRAME_DURATION);
m_shortTermCplxCount++;
rce->blurredComplexity = m_shortTermCplxSum / m_shortTermCplxCount;
  • 对 QScale 进行初始估计 (See getQScale())
  • 对 QScale 进行二次调整 (See tuneAbrQScaleFromFeedback())
  • 预测当前帧所需要的 bits 数 (See predictSize())
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
...
if (!m_param->rc.bStatRead)
    checkAndResetABR(rce, false);
double initialQScale = getQScale(rce, m_wantedBitsWindow / m_cplxrSum);//给出QScale的第一次估计
...
double tunedQScale = tuneAbrQScaleFromFeedback(initialQScale);//根据实际码率与目标码率的偏差调整QScale
...
q = x265_clip3(lqmin, lqmax, q);
if (m_2pass)
    rce->frameSizePlanned = qScale2bits(rce, q);
else
    rce->frameSizePlanned = predictSize(&m_pred[m_predType], q, (double)m_currentSatd);

getQScale(rce, rateFactor)

传入历史信息,以及 rateFactor = m_wantedBitsWindow / m_cplxrSum,给出对 QS 的估计 qScale=blurredComplexity^(1-qCompress)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
    if (m_param->rc.cuTree && !m_param->rc.hevcAq)
    {
        // Scale and units are obtained from rateNum and rateDenom for videos with fixed frame rates.
        double timescale = (double)m_param->fpsDenom / (2 * m_param->fpsNum);
        q = pow(BASE_FRAME_DURATION / CLIP_DURATION(2 * timescale), 1 - m_param->rc.qCompress);
    }
    else
        q = pow(rce->blurredComplexity, 1 - m_param->rc.qCompress);

    // avoid NaN's in the Rceq
    if (rce->coeffBits + rce->mvBits == 0)
        q = m_lastQScaleFor[rce->sliceType];
    else
    {
        m_lastRceq = q;
        q /= rateFactor;
    }
    return q;

tuneAbrQScaleFromFeedback(qScale)

对 qScale 进行二次调整:

  • 首先计算已经编码完成的时间 timeDone 以及应该使用的 bit 数 wantedBits
  • 拿到实际已经使用的 bit 数 encodedBits
  • 根据两者之差及 abrBuffer (正比于 $\sqrt{timeDone}$,这里的平方根也很神秘,感性理解:如果每个页面的bit数iid分布,则其和(正态分布)与期望值的差异正比于 $\sqrt{t}$),计算出一个调整系数 qScale (限制在 $[0.5, 2]$) 调整当前的 qScale
1
2
3
abrBuffer *= X265_MAX(1, sqrt(timeDone));
overflow = x265_clip3(.5, 2.0, 1.0 + (encodedBits - wantedBits) / abrBuffer);
qScale *= overflow;

函数完整实现:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
double RateControl::tuneAbrQScaleFromFeedback(double qScale)
{
    double abrBuffer = 2 * m_rateTolerance * m_bitrate;
    /* use framesDone instead of POC as poc count is not serial with bframes enabled */
    double overflow = 1.0;
    double timeDone = (double)(m_framesDone - m_param->frameNumThreads + 1) * m_frameDuration;
    double wantedBits = timeDone * m_bitrate;
    int64_t encodedBits = m_totalBits;
    if (m_param->totalFrames && m_param->totalFrames <= 2 * m_fps)
    {
        abrBuffer = m_param->totalFrames * (m_bitrate / m_fps);
        encodedBits = m_encodedBits;
    }

    if (wantedBits > 0 && encodedBits > 0 && (!m_partialResidualFrames || 
        m_param->rc.bStrictCbr || m_isGrainEnabled))
    {
        abrBuffer *= X265_MAX(1, sqrt(timeDone));
        overflow = x265_clip3(.5, 2.0, 1.0 + (encodedBits - wantedBits) / abrBuffer);
        qScale *= overflow;
    }
    return qScale;
}

从量化参数预测 bits 数

似乎有两套预测的函数,有时候用第一个,有时候用第二个。

class:Predictor

存储一个一次函数的系数,用于通过当前帧的 satd 和 qp 估计最终 bit 数(predictSize)。(对于I/B/P-frame 各有一套)

$$bit_p=\frac{coeff \times satd + offset}{q \times count}$$

其中 $count$ 看起来是对于系数的归一化参数。

会根据如下法则更新,一个类似 moving average 状物:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
void RateControl::updatePredictor(Predictor *p, double q, double var, double bits)
{
    if (var < 10)
        return;
    const double range = 2;
    double old_coeff = p->coeff / p->count;
    double old_offset = p->offset / p->count;
    double new_coeff = X265_MAX((bits * q - old_offset) / var, p->coeffMin );
    double new_coeff_clipped = x265_clip3(old_coeff / range, old_coeff * range, new_coeff);
    double new_offset = bits * q - new_coeff_clipped * var;
    if (new_offset >= 0)
        new_coeff = new_coeff_clipped;
    else
        new_offset = 0;
    p->count  *= p->decay;
    p->coeff  *= p->decay;
    p->offset *= p->decay;
    p->count++;
    p->coeff  += new_coeff;
    p->offset += new_offset;
}

qScale2bits(rce, qScale)

rce 中存储历史平均的 bit 数及 qScale 值,传入第二个参数为当前帧的 qScale,用于预测当前帧的 bit 数。 看起来是个经验公式,不太理解为什么这里 mvBits 跟 qpScale 是根号的关系?

1
2
3
4
5
6
7
8
inline double qScale2bits(RateControlEntry *rce, double qScale)
{
    if (qScale < 0.1)
        qScale = 0.1;
    return (rce->coeffBits + .1) * pow(rce->qScale / qScale, 1.1)
           + rce->mvBits * pow(X265_MAX(rce->qScale, 1) / X265_MAX(qScale, 1), 0.5)
           + rce->miscBits;
}

CRF

CRF 与 ABR 几乎相同,除了少了二次调整 QScale 的步骤,传入 getQScaleratefactor 也改为了: m_rateFactorConstant = pow(baseCplx, 1 - m_qCompress) / x265_qp2qScale(m_param->rc.rfConstant + mbtree_offset);

VBV (TODO)

See clipQscale(), tuneQScaleForZone()