Where is SHA1_Init() implemented in the OpenSSL source?

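This started when I tried to build a small program of my own, which calls the OpenSSL API, directly against the OpenSSL source. The link failed in the init() function of m_sha1.c with "undefined reference to `SHA1_Init'". (I had not added every OpenSSL source file to my project; the idea was to use only the files my program actually needs.) I immediately searched the source for the implementation of SHA1_Init() and could not find it at all. For the record, the tools I used were Microsoft Visual Studio Express 2013 for Windows Desktop and the find command on Linux (mainly "find . -name '*.c' -exec grep 'pattern' /dev/null '{}' \+", where *.c can be changed to other file types). Both approaches turned up only the declaration of SHA1_Init(); the implementation was nowhere to be found. So I searched the web and, fortunately, found two clues: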

Clue 1 --

SHA1_Init is indeed defined in sha_locl.h as HASH_INIT, whereas
SHA1_Update and SHA1_Final are defined in md32_common.h (under crypto
folder) as HASH_UPDATE and HASH_FINAL respectively.


Clue 2 --

>     Many OpenSSL functions are implemented in non-traditional manners,
>     such as
>     via macro's, assembly language, etc.
>
>     If you are on a Unix/Linux platform, using something like "nm -o
>     libcrypto.a | grep SHA1_Update" will get you started.  In this case,
>     sha1dgst.c.
Most of these can be found in the corresponding crypto\ subdirectories,
e.g. crypto\md5 and crypto\sha,  in the source dump (I assume you mean
the source code archives available at openssl.org; if not, what are you
looking at?).

Looking for functions is simple once you take into account that some of
them are specified through the use of macros to aid in code reuse
(Forget about that assembly stuff Richard mentioned for now; it's not
mandatory to have a fully functional SSL library, but only helps to
provide additional speed improvements for selected platforms, where
available):

grepping for MD5_Update for instance (a.k.a.: 'find in files' in your
editor of choice, e.g. UltraEdit32) will turn up the line

#define HASH_UPDATE      MD5_Update

used in crypto\md5\ code to instantiate the MD5 implementation of a
common hash routine, which, can be found as the generic
HASH_UPDATE(...) routine written in md32_common.h

When you look for SHA1_Update you'll find that crypto/sha will apply the
same mechanism:

#define HASH_UPDATE      SHA1_Update

and one more #include of md32_common.h, thus actually reusing the code
in md32_common.h . You will find these practices throughout OpenSSL to
help in preventing a particularly nasty type of copy&paste software bugs.
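
The reuse mechanism described in the quoted reply can be sketched in a single file. This is only an illustration under stated assumptions: TOY_CTX, TOY_Update and toy_demo are hypothetical names, the "hash" is a plain byte sum, and the generic body that OpenSSL keeps in md32_common.h is written inline here so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context type, standing in for SHA_CTX or MD5_CTX. */
typedef struct {
    unsigned long total;
} TOY_CTX;

/* Instantiation site: the algorithm-specific file picks the real names
 * before pulling in the shared code. */
#define HASH_CTX    TOY_CTX
#define HASH_UPDATE TOY_Update

/* Shared part: in OpenSSL this kind of generic body lives in a common
 * header; here it is inlined. After preprocessing, the function below
 * is named TOY_Update and takes a TOY_CTX pointer. */
int HASH_UPDATE(HASH_CTX *c, const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    size_t i;

    assert(c != NULL);
    for (i = 0; i < len; i++)
        c->total += p[i];   /* toy "digest": sum of the input bytes */
    return 1;
}

/* Calls the function by its expanded name, the one a linker would see. */
unsigned long toy_demo(void)
{
    TOY_CTX ctx = { 0 };

    TOY_Update(&ctx, "abc", 3);
    return ctx.total;       /* 'a' + 'b' + 'c' = 97 + 98 + 99 */
}
```

Searching this file for the text "TOY_Update(" followed by a parameter list finds no definition, only the #define and the call, which is the same situation a plain text search runs into with SHA1_Update.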


With these two clues in hand, I first ran "nm -o libcrypto.a | grep SHA1_Update" (libcrypto.a is the static library produced by building OpenSSL). The output was:

libcrypto.a:sha1dgst.o:0000000000000000 T SHA1_Update
libcrypto.a:sha1_one.o:                 U SHA1_Update
libcrypto.a:eng_openssl.o:                 U SHA1_Update
libcrypto.a:m_sha1.o:                 U SHA1_Update
libcrypto.a:m_dss.o:                 U SHA1_Update
libcrypto.a:m_dss1.o:                 U SHA1_Update
libcrypto.a:m_ecdsa.o:                 U SHA1_Update
libcrypto.a:e_aes_cbc_hmac_sha1.o:                 U SHA1_Update

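According to nm's output format, the entry marked T is the object file that actually contains the code, so the next stop is crypto/sha/sha1dgst.c. The file is very short: it mainly defines the SHA_1 macro and does #include "sha_locl.h". Note that it also carries a key comment, "/* The implementation is in ../md32_common.h */". Next, crypto/sha/sha_locl.h defines HASH_UPDATE, HASH_TRANSFORM, HASH_FINAL and HASH_INIT as SHA1_Update, SHA1_Transform, SHA1_Final and SHA1_Init respectively, and then does #include "md32_common.h". Moving on to crypto/md32_common.h, it does contain the definitions of the HASH_UPDATE, HASH_TRANSFORM and HASH_FINAL functions (after macro expansion, these are the definitions of SHA1_xxx). Three found at last, but where is the definition of SHA1_Init()? Still missing. Back in sha_locl.h (keep in mind that the compilation unit is sha1dgst.c, which includes sha_locl.h, which in turn includes md32_common.h), the only things after the #include "md32_common.h" directive that look related to Init are fips_md_init(SHA) and fips_md_init_ctx(SHA1, SHA), as shown here: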

#ifdef SHA_0
fips_md_init(SHA)
#else
fips_md_init_ctx(SHA1, SHA)
#endif
    {
    memset (c,0,sizeof(*c));
    c->h0=INIT_DATA_h0;
    c->h1=INIT_DATA_h1;
    c->h2=INIT_DATA_h2;
    c->h3=INIT_DATA_h3;
    c->h4=INIT_DATA_h4;
    return 1;
    }

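Neither of these is defined anywhere in sha_locl.h itself, so they most likely come from an included file. md32_common.h does not define them either. A broader search shows the relevant definitions in crypto/crypto.h: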

#define fips_md_init(alg) fips_md_init_ctx(alg, alg)

#define fips_md_init_ctx(alg, cx) \
    int alg##_Init(cx##_CTX *c)

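So fips_md_init_ctx(SHA1, SHA) expands to int SHA1_Init(SHA_CTX *c). (We tend to think of macro substitution as something that happens in .c files, but it works just as well in headers.) And there, finally, is the definition of SHA1_Init()! This coding style is brutal for code analyzers; it defeats the code navigation of most editors and IDEs.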
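
To see the token pasting in isolation, here is a minimal sketch that copies the shape of the two macros quoted above. The names toy_md_init, toy_md_init_ctx, TOY_CTX, TOY1_Init and toy_demo are hypothetical and exist only for this illustration; the two constants are the first two SHA-1 initial hash values, used here purely as sample data.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical context type with just two state words. */
typedef struct {
    unsigned int h0, h1;
} TOY_CTX;

/* Same shape as the fips_md_init / fips_md_init_ctx macros quoted above:
 * ## pastes the macro argument onto _Init and _CTX. */
#define toy_md_init(alg) toy_md_init_ctx(alg, alg)

#define toy_md_init_ctx(alg, cx) \
    int alg##_Init(cx##_CTX *c)

/* After preprocessing, the next line reads: int TOY1_Init(TOY_CTX *c)
 * so the block that follows is an ordinary function body. */
toy_md_init_ctx(TOY1, TOY)
{
    memset(c, 0, sizeof(*c));
    c->h0 = 0x67452301U;
    c->h1 = 0xefcdab89U;
    return 1;
}

/* Calls the function by the name that only exists after expansion. */
unsigned int toy_demo(void)
{
    TOY_CTX ctx;
    int ok = TOY1_Init(&ctx);

    assert(ok == 1);
    return ctx.h0;
}
```

The identifier TOY1_Init never appears as a definition in the source text; it only comes into being once the preprocessor has pasted TOY1 and _Init together, which is why a text search for the definition comes up empty.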

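Since the definition was hidden by macro expansion, it would have been better to preprocess sha1dgst.c at the very start and inspect the output. To sum up: when a function definition cannot be found, think in terms of compilation units (if a definition exists, it must appear in some compilation unit) and take two steps:

1. Run "nm -o <library or object file> | grep <function-name>" on the library or the object files and find the object file marked T;

2. Preprocess that compilation unit's .c file and search the preprocessed output for the function.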

Taking SHA1_Init() as an example:

1. nm -o libcrypto.a | grep SHA1_Init identifies the compilation unit as sha1dgst.c

2. Go into crypto/sha/ and run gcc -E sha1dgst.c -I../../include -I../ -o cpptmp

3. The definition of SHA1_Init() can then be found in the cpptmp file.
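
When a full preprocessing run is inconvenient, a single macro can also be inspected from inside C with the two-level stringification idiom. This is a side sketch and not part of the procedure above: toy_md_init_ctx mirrors the shape of the macro quoted from crypto/crypto.h, and STR, XSTR, toy_expanded_prototype and toy_prototype_mentions are hypothetical helper names.

```c
#include <assert.h>
#include <string.h>

/* Same shape as the fips_md_init_ctx macro quoted earlier. */
#define toy_md_init_ctx(alg, cx) \
    int alg##_Init(cx##_CTX *c)

/* Two levels are needed: XSTR expands its argument first,
 * then STR turns the expanded tokens into a string literal. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Returns the text that toy_md_init_ctx(TOY1, TOY) expands to. */
const char *toy_expanded_prototype(void)
{
    return XSTR(toy_md_init_ctx(TOY1, TOY));
}

/* Returns 1 if the expanded prototype contains the given identifier. */
int toy_prototype_mentions(const char *name)
{
    assert(name != NULL);
    return strstr(toy_expanded_prototype(), name) != NULL;
}
```

Printing toy_expanded_prototype() with puts() shows the generated function prototype. For a whole compilation unit, gcc -E as in step 2 remains the more practical route.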



