
[Pushing the Limits] Efficiency of converting a BYTE array to an unsigned integer

Date: 2011-11-08

Source: Internet

unsigned char buf[] = {1,2,3,4};
int res;
// Method 1
__asm
{
movzx eax,byte ptr [buf]
movzx ebx,byte ptr [buf+1]
movzx ecx,byte ptr [buf+2]
movzx edx,byte ptr [buf+3]
shl eax,18h
shl ebx,10h
shl ecx,8
or eax,ebx
or eax,ecx
or eax,edx
mov dword ptr [res],eax
}
// res now holds 0x01020304

// Method 2
__asm
{
mov eax, DWORD PTR [buf]
rol ax, 8
rol eax, 16
rol ax, 8
mov res, eax
}
// res now holds 0x01020304

// Method 3
__asm
{
mov eax, DWORD PTR [buf]
xchg al, ah
rol eax, 16
xchg al, ah
mov res, eax
}
// res now holds 0x01020304

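Methods 1 and 2 perform about the same; method 3 is the slowest, by roughly 25%.

Question: what makes the third method the slowest? Which approach is the best? Are there any other, better approaches?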
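
For cross-checking the three listings, here is a minimal sketch in portable C that emulates them step by step. The function names (`load_be32_shift`, `load_le32`, `swap_low_bytes`, `rol32_16`, `load_be32_rotate`, `check_methods`) are made up for this illustration and do not come from the listings. The little-endian load is written with explicit shifts, so the sketch does not depend on the byte order of the machine it runs on. It only demonstrates that the sequences produce the same value; it says nothing about their relative speed.

```c
#include <assert.h>
#include <stdint.h>

/* Method 1: zero-extend each byte, shift it into place, OR the parts. */
uint32_t load_be32_shift(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The value `mov eax, DWORD PTR [buf]` produces on x86 (little-endian). */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Emulates `rol ax, 8` and `xchg al, ah`: swap the two low bytes,
   leave the upper 16 bits untouched. */
uint32_t swap_low_bytes(uint32_t v)
{
    return (v & 0xFFFF0000u) | ((v & 0x00FFu) << 8) | ((v & 0xFF00u) >> 8);
}

/* Emulates `rol eax, 16`: exchange the upper and lower 16-bit halves. */
uint32_t rol32_16(uint32_t v)
{
    return (v << 16) | (v >> 16);
}

/* Methods 2 and 3: load, swap low bytes, rotate by 16, swap low bytes. */
uint32_t load_be32_rotate(const unsigned char *p)
{
    uint32_t v = load_le32(p);  /* {1,2,3,4} gives 0x04030201 */
    v = swap_low_bytes(v);      /* 0x04030102 */
    v = rol32_16(v);            /* 0x01020403 */
    v = swap_low_bytes(v);      /* 0x01020304 */
    return v;
}

/* Checks both variants against the buffer used in the listings. */
void check_methods(void)
{
    const unsigned char buf[] = {1, 2, 3, 4};
    assert(load_be32_shift(buf) == 0x01020304u);
    assert(load_be32_rotate(buf) == 0x01020304u);
}
```

Calling `check_methods()` from a test harness confirms that both variants yield 0x01020304 for `{1,2,3,4}`.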

Author: leechiyang  Posted: 2011-11-08

How about trying it this way:
movzx eax,byte ptr [buf]
movzx ebx,byte ptr [buf+1]
shl eax,18h
shl ebx,10h
movzx ecx,byte ptr [buf+2]
or eax,ebx
shl ecx,8
or eax,ecx
movzx edx,byte ptr [buf+3]
or eax,edx
mov dword ptr [res],eax

Author: Areslee  Posted: 2011-11-08

A correction, it should be this:
movzx eax,byte ptr [buf]
movzx ebx,byte ptr [buf+1]
shl eax,18h
shl ebx,10h
movzx ecx,byte ptr [buf+2]
or eax,ebx
shl ecx,8
movzx edx,byte ptr [buf+3]
or eax,ecx
or eax,edx
mov dword ptr [res],eax

Author: Areslee  Posted: 2011-11-08

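Measured speed is no different from method 1: 0xC0000000 loop iterations take 11 seconds.
Is the point of spacing out instructions of the same kind to make fuller use of the CPU's multiple pipelines?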

Author: leechiyang  Posted: 2011-11-08

A book on P5 optimization that I read a while back said it is best to interleave the use of registers.

Author: Areslee  Posted: 2011-11-08

Here is another one; it probably performs about the same.
Assembly code

mov eax,dword ptr [buf]
mov edx,eax
shr eax,16
rol dx ,8
rol ax ,8
shl edx,16
or  eax,edx         
mov res,eax

xchg hurts performance; it is used more where consistency (synchronization) matters.
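
The eight-instruction sequence in this post splits the loaded dword into two 16-bit halves, byte-swaps each half in its own register, and recombines them. A rough C sketch of the same data flow follows; `swap16`, `bswap32_split` and `check_split` are names invented for this illustration, and the comments map each step to the instruction it is assumed to mirror. It checks the result only and makes no claim about timing.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value (what `rol r16, 8` does). */
uint16_t swap16(uint16_t h)
{
    return (uint16_t)((h << 8) | (h >> 8));
}

/* v is the little-endian dword loaded from the buffer. */
uint32_t bswap32_split(uint32_t v)
{
    uint16_t hi = (uint16_t)(v >> 16);  /* shr eax,16 : upper half moves to ax */
    uint16_t lo = (uint16_t)v;          /* low half of the copy, i.e. dx       */
    hi = swap16(hi);                    /* rol ax,8                            */
    lo = swap16(lo);                    /* rol dx,8                            */
    /* shl edx,16 ; or eax,edx : swapped low half becomes the new upper half */
    return ((uint32_t)lo << 16) | hi;
}

/* {1,2,3,4} loaded little-endian is 0x04030201. */
void check_split(void)
{
    assert(bswap32_split(0x04030201u) == 0x01020304u);
}
```

The two `swap16` calls do not depend on each other, which is the property the two-register layout is meant to expose.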

Author: G_Spider  Posted: 2011-11-08

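Reusing the same register makes the instructions too strongly dependent on one another, which hinders parallel execution and hurts performance.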

Author: masmaster  Posted: 2011-11-08
