c - Some questions about a single-instance array in a typedef

Question

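I'm reading some code that uses arbitrary-length integers via the GNU Multi-Precision (GMP) library. The type for an MP integer, mpz_t, is defined in the gmp.h header file.

However, I have some questions about the low-level definition of this library-defined mpz_t type. In the header code: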

/* THIS IS FROM THE GNU MP LIBRARY gmp.h HEADER FILE */
typedef struct
{
    /* SOME OTHER STUFF HERE */
} __mpz_struct;

typedef __mpz_struct mpz_t[1];

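First question: Does the [1] associate with __mpz_struct? In other words, is the typedef defining the mpz_t type as a __mpz_struct array with one occurrence?

Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?

Third question (perhaps indirectly related to the second): The GMP documentation for the mpz_init_set_ui(mpz_t, unsigned long int) function says to call it with what looks like pass-by-value syntax, although one would assume the function modifies its argument inside the callee (and would therefore need pass-by-reference syntax). Refer to my code: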

/* FROM MY CODE */
mpz_t fact_val;                /* declaration */
mpz_init_set_ui(fact_val, 1);  /* Initialize fact_val */

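Does the single-occurrence array enable pass-by-reference automatically (because arrays decay to pointers in C)? I freely admit I'm over-analyzing this a bit, but I'd certainly love any discussion on it. Thanks!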

score 5 · Accepted Answer

This does not appear to be a struct hack in the sense described on C2. It appears that they want mpz_t to have pointer semantics (presumably, they want people to use it like an opaque pointer). Consider the syntactic difference between the following snippets:

__mpz_struct data[1];

(&data[0])->access = 1;
gmp_func(data, ...);

And

mpz_t data;

data->access = 1;
gmp_func(data, ...);

Because C arrays decay into pointers, this also allows for automatic pass by reference for the mpz_t type.

It also allows you to use a pointer-like type without needing to malloc or free it.
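To make the decay behaviour concrete, here is a minimal sketch of the same typedef pattern. The names num_struct, num_t, num_set_ui and num_get_ui are made-up stand-ins for illustration, and the struct layout is an assumption, not GMP's real definition:

```c
#include <stdio.h>

/* Hypothetical stand-in for __mpz_struct; NOT GMP's real layout. */
typedef struct {
    int sign;                 /* 0 for zero, 1 for positive */
    unsigned long magnitude;  /* the stored value */
} num_struct;

/* One-element array type, mirroring "typedef __mpz_struct mpz_t[1];". */
typedef num_struct num_t[1];

/* A parameter of type num_t is adjusted to "num_struct *", so the
   callee writes straight into the caller's object. */
void num_set_ui(num_t rop, unsigned long value)
{
    rop->sign = (value != 0);
    rop->magnitude = value;
}

/* "const num_t" becomes "const num_struct *": a read-only operand. */
unsigned long num_get_ui(const num_t op)
{
    return op->magnitude;
}
```

A caller would then write "num_t x; num_set_ui(x, 42);" with no & at the call site and no malloc or free, since x is an ordinary object with automatic storage whose name decays to a pointer when passed.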

score 3 · Accepted Answer

"First question: Does the [1] associate with __mpz_struct? In other words, is the typedef defining the mpz_t type as a __mpz_struct array with one occurrence?"

Yes.

"Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?"

Beats me. Don't know, but one possibility is that the author wanted to make an object that was passed by reference automatically, or, "yes", possibly the struct hack. If you ever see an mpz_t object as the last member of a struct, then "almost certainly" it's the struct hack. An allocation looking like

malloc(sizeof(struct whatever) + sizeof(mpz_t) * some_number)

would be a dead giveaway.
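For reference, here is a minimal sketch of what that allocation pattern looks like. It uses a C99 flexible array member rather than the older one-element trailing array, and struct packet and packet_new are hypothetical names for illustration, nothing from GMP:

```c
#include <stdlib.h>

/* Hypothetical record with a variable-length tail. */
struct packet {
    size_t count;
    int items[];   /* flexible array member; must be the last member */
};

/* Allocate the header and `count` trailing items in one block.
   Returns NULL if the allocation fails. */
struct packet *packet_new(size_t count)
{
    struct packet *p = malloc(sizeof *p + count * sizeof p->items[0]);
    if (p == NULL)
        return NULL;
    p->count = count;
    for (size_t i = 0; i < count; i++)
        p->items[i] = 0;
    return p;
}
```

The point of comparison: here the array is the trailing member and its length is chosen at malloc time, whereas mpz_t is a standalone one-element array type with a fixed size.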

"Does the single-occurrence array enable pass-by-reference automatically...?"

Aha, you figured it out too. "Yes", one possible reason is to simplify pass-by-reference at the expense of more complex references.

I suppose another possibility is that something changed in the data model or the algorithm, and the author wanted to find every reference and change it in some way. A change in type like this would leave the program with the same base type but error-out every unconverted reference.

score 3 · Accepted Answer

The reason for this comes from the implementation of mpn. Specifically, if you're mathematically inclined you'll realise N is the set of natural numbers (1,2,3,4...) whereas Z is the set of integers (...,-2,-1,0,1,2,...).

Implementing a bignum library for Z is equivalent to doing so for N and taking into account some special rules for sign operations, i.e. keeping track of whether you need to do an addition or a subtraction and what the result is.
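As a rough sketch of those sign rules, assuming a single 64-bit word as the magnitude instead of a real multi-limb array (signed_num and signed_add are hypothetical names, and overflow of the magnitude is deliberately ignored):

```c
#include <stdint.h>

typedef struct {
    int negative;        /* 1 if the value is below zero, else 0 */
    uint64_t magnitude;  /* |value|; a real library stores an array of limbs */
} signed_num;

/* Signed addition built only from unsigned add, subtract and compare. */
signed_num signed_add(signed_num a, signed_num b)
{
    signed_num r;
    if (a.negative == b.negative) {
        /* Same sign: add magnitudes, keep the sign. */
        r.negative = a.negative;
        r.magnitude = a.magnitude + b.magnitude;  /* overflow ignored in this sketch */
    } else if (a.magnitude >= b.magnitude) {
        /* Opposite signs: subtract the smaller magnitude from the larger. */
        r.negative = a.negative;
        r.magnitude = a.magnitude - b.magnitude;
    } else {
        r.negative = b.negative;
        r.magnitude = b.magnitude - a.magnitude;
    }
    if (r.magnitude == 0)
        r.negative = 0;   /* normalise so zero is never negative */
    return r;
}
```

The natural-number layer only ever sees non-negative operands; the integer layer decides which unsigned operation to call and which sign to attach to the result.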

Now, as for how a bignum library is implemented... here's a line to give you a clue:

typedef unsigned int        mp_limb_t;
typedef mp_limb_t *     mp_ptr;

And now let's look at a function signature operating on that:

__GMP_DECLSPEC mp_limb_t mpn_add __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_srcptr,mp_size_t));

Basically, what it comes down to is that a "limb" is an integer field representing the bits of a number and the whole number is represented as a huge array. The clever part is that gmp does all this in a very efficient, well optimised manner.
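To illustrate the limb idea, here is a minimal sketch of schoolbook addition over limb arrays, least significant limb first. This is not GMP's implementation; limb_t and limbs_add_n are hypothetical names, and the 32-bit limb width is an assumption (GMP's mp_limb_t depends on the platform):

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical limb type for this sketch */

/* Add two n-limb numbers, least significant limb first.
   Writes n limbs to out and returns the final carry (0 or 1). */
limb_t limbs_add_n(limb_t *out, const limb_t *a, const limb_t *b, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t sum = a[i] + carry;
        limb_t c1 = sum < carry;   /* wrapped around when adding the carry */
        sum += b[i];
        limb_t c2 = sum < b[i];    /* wrapped around when adding b[i] */
        out[i] = sum;
        carry = c1 | c2;           /* at most one of the two can be set */
    }
    return carry;
}
```

Note the shape of the signature: a writable output pointer first, then const source pointers, then a length, which is the same output-then-inputs layout as the mpn_add prototype quoted above.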

Anyway, back to the discussion. The only way to pass arrays around in C is, as you know, to pass pointers to them, which effectively gives you pass by reference. To keep track of what's going on, two types are defined: mp_ptr, a pointer to an array of mp_limb_t big enough to store your number, and mp_srcptr, a const version of that, so that you cannot accidentally alter the bits of the source bignums you are operating on. The basic idea is that most of the functions follow this pattern:

func(ptr output, src in1, src in2)

etc. Thus, I suspect the mpz_* functions follow this convention simply to be consistent, because that is how the authors think about the problem.

Short version: Because of how you have to implement a bignum lib, this is necessary.
