Optimizing Hot Statements in C++

Latest recommended article published on 2024-02-01 18:17:54

Achilles.Wang

Article link: https://blog.csdn.net/andrewgithub/article/details/117432430

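This article looks at how precomputing a string's length and choosing an efficient loop structure affect code performance. The comparison shows that precomputing the string length significantly reduces execution time, while for and do-while loops perform almost identically under a modern compiler. Enabling compiler optimization options improves execution speed further.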

Optimizing Hot Statements

Precompute Fixed Values

First, look at the following benchmark code:

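#include <benchmark/benchmark.h>
#include <cstring>

// Baseline: strlen(s) is re-evaluated on every test of the loop condition.
static void find_blank(benchmark::State& state) {
    for (auto _ : state) {
        char s[] = "This string has many space (0x20) chars. ";

        for (size_t i = 0; i < strlen(s); ++i)
            if (s[i] == ' ')
                s[i] = '*';
    }
}
BENCHMARK(find_blank);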

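This code evaluates the loop condition i < strlen(s) once for every character in the string. The call to strlen() is expensive: it walks the whole argument string to count its characters, which turns the algorithm from O(n) into O(n^2). This is a typical example of a loop hidden inside a library function.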
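To make the hidden loop visible, the sketch below swaps strlen() for an instrumented stand-in that counts every character it inspects. The names counting_strlen, g_chars_scanned, and scanned_by_naive_loop are hypothetical and exist only for this illustration. They are not part of the article's benchmarks.

```cpp
#include <cstddef>

// Total number of characters inspected by counting_strlen() so far.
static std::size_t g_chars_scanned = 0;

// Hypothetical stand-in for strlen() that records how much work it does.
std::size_t counting_strlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
        ++g_chars_scanned;
    }
    return n;
}

// Same loop shape as find_blank: the length is recomputed on every test
// of the loop condition. Returns the total number of characters scanned
// by the length computation alone.
std::size_t scanned_by_naive_loop(char* s) {
    g_chars_scanned = 0;
    for (std::size_t i = 0; i < counting_strlen(s); ++i)
        if (s[i] == ' ')
            s[i] = '*';
    return g_chars_scanned;
}
```

For a string of n characters the condition is tested n + 1 times and each test scans n characters, so the length computation alone inspects n(n + 1) characters. For the 41-character test string used here that is 41 × 42 = 1722 inspections, where a single pass needs 41.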

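Every strlen() call triggers a full traversal, and its result does not change while the function runs. So we can compute the length once, store it, and reuse the stored value instead of recomputing it on every iteration: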

// strlen(s) is evaluated once in the loop initializer and cached in len.
static void find_blank_init_length(benchmark::State& state) {
    for (auto _ : state) {
        char s[] = "This string has many space (0x20) chars. ";

        for (size_t i = 0, len = strlen(s); i < len; ++i)
            if (s[i] == ' ')
                s[i] = '*';
    }
}

BENCHMARK(find_blank_init_length);

The results are as follows:

-----------------------------------------------------------------
Benchmark                       Time             CPU   Iterations
-----------------------------------------------------------------
find_blank                    191 ns          191 ns      3431752
find_blank_init_length       72.4 ns         72.4 ns      9635766

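These results were obtained with compiler optimization disabled (O0). Hoisting the strlen() call out of the loop condition cuts the time from 191 ns to 72.4 ns, which makes the function about 2.6 times faster.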
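A related variant, which is not one of the benchmarks measured in this article, avoids computing the length at all: it walks the buffer with a pointer and stops at the terminating NUL. The helper name replace_blanks is hypothetical, and the sketch assumes the buffer is a writable, NUL-terminated C string. No timing is claimed for it here.

```cpp
#include <cstddef>

// Hypothetical helper: replaces every space in a NUL-terminated buffer
// with '*' and returns how many characters were replaced.
std::size_t replace_blanks(char* s) {
    std::size_t replaced = 0;
    // Test for the terminator directly: one pass, no strlen() call.
    for (char* p = s; *p != '\0'; ++p) {
        if (*p == ' ') {
            *p = '*';
            ++replaced;
        }
    }
    return replaced;
}
```

Like the cached-length loop, this touches each character once, so the work stays linear in the length of the string.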

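Use More Efficient Loops

A for loop is typically compiled into code like this:

	init-expression;
L1: if (!condition) goto L2;
	statement;
	continue-expression;
	goto L1;
L2:

A do-while loop, on the other hand, generally compiles to:

L1: statement;
	if (condition) goto L1;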

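Different compilers may of course generate different code. Going by the analysis above, do-while should be faster than for, because each iteration executes one conditional jump instead of a conditional jump plus an unconditional one. In practice, however, when tested on Ubuntu 20.04 the for loop ran at essentially the same speed as the do-while loop. One possible reason is that for loops are so common that compiler developers have optimized them specifically.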
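The two lowerings described above can be written out by hand in C++ with explicit labels and goto. This is only a sketch of the control flow shown in the listings, with hypothetical function names. It does not claim to be what any particular compiler emits.

```cpp
#include <cstddef>

// The for loop lowered as in the first listing: the test sits at the top
// and an unconditional jump at the bottom returns to it.
void replace_for_lowered(char* s, std::size_t len) {
    std::size_t i = 0;              // init-expression
top:
    if (!(i < len)) goto done;      // if (!condition) goto L2
    if (s[i] == ' ') s[i] = '*';    // statement
    ++i;                            // continue-expression
    goto top;                       // goto L1
done:
    return;
}

// The do-while loop lowered as in the second listing: a single
// conditional jump at the bottom. The body runs once even when len is 0,
// so s must point to at least one readable character.
void replace_do_while_lowered(char* s, std::size_t len) {
    std::size_t i = 0;
top:
    if (s[i] == ' ') s[i] = '*';    // statement
    ++i;
    if (i < len) goto top;          // if (condition) goto L1
}
```

Both functions produce the same result for a non-empty string. The only structural difference is the extra unconditional jump per iteration in the for version.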

The do-while implementation:

// Same work as find_blank_init_length, with the test at the bottom of the loop.
static void find_blank_do_while(benchmark::State& state) {
    for (auto _ : state) {
        char s[] = "This string has many space (0x20) chars. ";
        size_t i = 0, len = strlen(s);
        do {
            if (s[i] == ' ')
                s[i] = '*';
            ++i;
        } while (i < len);
    }
}

BENCHMARK(find_blank_do_while);

Measured results:

-----------------------------------------------------------------
Benchmark                       Time             CPU   Iterations
-----------------------------------------------------------------
find_blank                    191 ns          191 ns      3431752
find_blank_init_length       72.4 ns         72.4 ns      9635766
find_blank_do_while          71.6 ns         71.6 ns      9629498

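Let the Compiler Optimize

Without changing the code, you can change the optimization option to tell the compiler that it may optimize the code. After switching the compiler option from O0 to O3, the results are as follows: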

-----------------------------------------------------------------
Benchmark                       Time             CPU   Iterations
-----------------------------------------------------------------
find_blank                   60.4 ns         60.4 ns     10536782
find_blank_init_length       34.6 ns         34.6 ns     20298248
find_blank_do_while          34.4 ns         34.4 ns     20249507

-----------------------------------------------------------------
Benchmark                       Time             CPU   Iterations
-----------------------------------------------------------------
find_blank                   60.2 ns         60.2 ns      9566706
find_blank_init_length       34.4 ns         34.4 ns     20362644
find_blank_do_while          34.6 ns         34.6 ns     20355712
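When comparing tables like the O0 and O3 results above, it helps to confirm which kind of build actually produced a given set of numbers. The sketch below relies on the assumption that the compiler is GCC or Clang, which define the __OPTIMIZE__ macro whenever optimization is enabled. Other compilers may not define it, in which case the function reports false. The names built_with_optimization and build_label are hypothetical.

```cpp
// Reports whether this translation unit was compiled with optimization.
// Assumption: GCC and Clang define __OPTIMIZE__ at -O1 and above; on
// compilers that do not define it, this always returns false.
constexpr bool built_with_optimization() {
#ifdef __OPTIMIZE__
    return true;
#else
    return false;
#endif
}

// Short label that can be printed next to benchmark output.
inline const char* build_label() {
    return built_with_optimization() ? "optimized" : "unoptimized";
}
```

Printing build_label() alongside the benchmark output is one way to keep unoptimized and optimized runs from being mixed up.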