algorithm - How to master in-place array modification algorithms?

Question

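I am preparing for a software job interview, and I am having trouble with in-place array modifications.

For example, in the out-shuffle problem you interleave the two halves of an array so that 1 2 3 4 5 6 7 8 becomes 1 5 2 6 3 7 4 8. The problem asks for a constant-memory solution (and linear time, although I'm not sure that is even possible).

At first I thought a linear algorithm was trivial, but then I couldn't work it out. I did eventually find a simple O(n^2) algorithm, but it took me a long time, and I still haven't found a faster solution.

I remember also having trouble with a similar problem from Column 2 of Bentley's Programming Pearls:

Rotate an array left by i positions (e.g. abcde rotated by 2 becomes cdeab), in O(n) time and with just a couple of bytes of extra space.

Does anyone have tips to help me wrap my head around such problems?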

score 17 · Accepted Answer

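About an O(n) time, O(1) space algorithm for out-shuffle

Doing an out-shuffle in O(n) time and O(1) space is possible, but it is hard. I'm not sure why people think it is easy and suggest that you try something else.

The following paper has an O(n) time and O(1) space solution (it is for the in-shuffle, but once you can do the in-shuffle, the out-shuffle is trivial):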

http://arxiv.org/PS_cache/arxiv/pdf/0805/0805.1598v1.pdf

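About a method to tackle in-place array modification algorithms

In-place modification algorithms can become very hard to handle.

Consider a couple:

In-place out-shuffle in linear time. Uses number theory.
In-place merge sort, which was open for a few years. An algorithm came, but it was too complicated to be practical. Uses very complicated bookkeeping.

Sorry if this sounds discouraging, but there is no magic elixir that will solve all in-place algorithm problems for you. You need to work with the problem, figure out its properties, and try to exploit them (as is the case with most algorithms).

That said, for array modifications where the result is a permutation of the original array, you can try the method of following the cycles of the permutation. Basically, any permutation can be written as a disjoint set of cycles (see John's answer too). For instance the permutation: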

1 4 2 5 3 6

of 1 2 3 4 5 6 can be written as

1 -> 1
2 -> 3 -> 5 -> 4 -> 2
6 -> 6.

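You can read the arrow as "goes to".

So to permute the array 1 2 3 4 5 6 you follow the three cycles:

1 goes to 1.

6 goes to 6.

2 goes to 3, 3 goes to 5, 5 goes to 4, and 4 goes to 2.

To follow this long cycle, you can use just one temp variable. Store 3 in it. Put 2 where 3 was. Now put 3 where 5 was and store 5 in the temp, and so on. Since you only use constant extra temp space to follow a particular cycle, you are doing an in-place modification of the array for that cycle.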
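As a rough illustration of following one cycle with a single temp variable, here is a minimal Python sketch. The names `follow_cycle`, `dest` and `dest_table` are made up for this example; `dest(i)` is assumed to return the 0-based index that the element currently at index `i` must move to. The lookup table is only there to drive the demo and is not part of the in-place idea.

```python
def follow_cycle(arr, start, dest):
    """Move every element on the cycle containing `start` to its destination.

    `dest(i)` is a hypothetical callback: the index the element at i moves to.
    Only one extra variable (`temp`) holds array data at any time.
    """
    temp = arr[start]                  # element waiting to be placed
    i = start
    while True:
        j = dest(i)                    # where the held element belongs
        arr[j], temp = temp, arr[j]    # drop it there, pick up the displaced one
        i = j
        if i == start:                 # back at the start: cycle complete
            break


# The permutation from the text with 0-based indices:
# 1 2 3 4 5 6 -> 1 4 2 5 3 6, i.e. index 1 -> 2 -> 4 -> 3 -> 1.
dest_table = [0, 2, 4, 1, 3, 5]
data = [1, 2, 3, 4, 5, 6]
follow_cycle(data, 1, lambda i: dest_table[i])
print(data)  # [1, 4, 2, 5, 3, 6]
```

The two fixed points (1 and 6) need no work, so following the single long cycle is enough for this particular permutation.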

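Now, if I gave you a formula for computing where an element goes to, all you need is the set of starting elements of each cycle.

A judicious choice of the starting points of the cycles can make the algorithm easy. If you can come up with the starting points in O(1) space, you have a complete in-place algorithm. This is where you might actually have to get familiar with the problem and exploit its properties.

Even if you don't know how to compute the starting points of the cycles, but have a formula to compute the next element, you can use this method to get an O(n) time in-place algorithm in some special cases.

For instance: if you knew that the integer array (of a signed type) held only positive integers.

You can now follow the cycles, but negate the numbers in them as an indicator of "visited" elements. Walk the array, pick the first positive number you come across, and follow its cycle, making the elements of the cycle negative. Then continue to look for untouched elements. In the end you just make all the elements positive again to get the resulting permutation.

You get an O(n) time and O(1) space algorithm! Of course, we kind of "cheated" by using the sign bits of the array integers as our personal "visited" bitmap.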
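A minimal Python sketch of the sign-marking idea, assuming every value is a strictly positive integer and that `dest(i)` (a hypothetical callback, not something from the paper) returns the index the element at `i` moves to. The out-shuffle formula in the usage lines is likewise just one possible `dest`.

```python
def permute_with_sign_marks(arr, dest):
    """Permute `arr` in place; `arr` must hold only positive integers.

    A negative value means "this slot already holds its final element".
    """
    n = len(arr)
    for start in range(n):
        if arr[start] < 0:
            continue                      # placed while following an earlier cycle
        temp = arr[start]
        i = start
        while True:
            j = dest(i)
            arr[j], temp = -temp, arr[j]  # place the element, negated as the mark
            i = j
            if i == start:
                break
    for i in range(n):
        arr[i] = -arr[i]                  # restore the original signs


# Out-shuffle of 8 elements (0-based): i -> 2*i mod 7, the last element stays.
n = 8
cards = [1, 2, 3, 4, 5, 6, 7, 8]
permute_with_sign_marks(cards, lambda i: i if i == n - 1 else (2 * i) % (n - 1))
print(cards)  # [1, 5, 2, 6, 3, 7, 4, 8]
```

Every slot is written once during cycle following and once during sign restoration, so the work is linear as long as `dest` runs in constant time.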

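Even if the array does not necessarily hold integers, this method (of following the cycles, not the sign-bit hack :-)) can actually be used to tackle the two problems you state:

The in-shuffle (or out-shuffle) problem: When 2n+1 is a power of 3, it can be shown (using number theory) that 1, 3, 3^2, etc. are in different cycles and that all the cycles are covered using those. Combine this with the fact that the in-shuffle is susceptible to divide and conquer, and you get an O(n) time, O(1) space algorithm (the formula is i -> 2*i modulo 2n+1). Refer to the above paper for more details.
The cyclic shift of an array problem: Cyclically shifting an array of size n by k also gives a permutation of the array (given by the formula i goes to i+k modulo n), and can also be solved in linear time and in place using the following-the-cycles method. In fact, in terms of the number of element exchanges, the following-the-cycles method is better than the 3-reversals algorithm. Of course, following the cycles can kill the cache because of the access patterns, and in practice the 3-reversals algorithm might actually fare better.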
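Both items can be sketched in Python under some assumptions. The function names are made up for illustration. `rotate_left_cycles` relies on the standard fact that this permutation has gcd(n, k) cycles, starting at indices 0 .. gcd(n, k) - 1. `in_shuffle_pow3` covers only the special case where the array length plus one is a power of 3; the divide-and-conquer step needed for general lengths is described in the paper and is not shown here.

```python
from math import gcd


def rotate_left_cycles(arr, k):
    """Rotate `arr` left by k positions by following cycles."""
    n = len(arr)
    if n == 0 or k % n == 0:
        return
    k %= n
    for start in range(gcd(n, k)):     # one starting index per cycle
        temp = arr[start]
        i = start
        while True:
            j = (i + k) % n            # index whose element moves into slot i
            if j == start:
                break
            arr[i] = arr[j]
            i = j
        arr[i] = temp


def in_shuffle_pow3(arr):
    """In-shuffle for the special case len(arr) + 1 == 3**k only."""
    m = len(arr) + 1                   # this is 2n+1 in the text
    p = 1
    while p < m:
        p *= 3
    if p != m:
        raise ValueError("len(arr) + 1 must be a power of 3")
    leader = 1                         # cycle starts: 1, 3, 9, ... (1-based)
    while leader < m:
        temp = arr[leader - 1]
        i = leader
        while True:
            j = (2 * i) % m            # 1-based position i goes to 2*i mod (2n+1)
            arr[j - 1], temp = temp, arr[j - 1]
            i = j
            if i == leader:
                break
        leader *= 3


letters = list("abcde")
rotate_left_cycles(letters, 2)
print("".join(letters))  # cdeab

deck = [1, 2, 3, 4, 5, 6, 7, 8]
in_shuffle_pow3(deck)
print(deck)  # [5, 1, 6, 2, 7, 3, 8, 4]
```

Note that the rotation sketch pulls elements into place (slot i receives the element from i+k), which is the left rotation asked for in the question.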

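As for interviews, if the interviewer is a reasonable person, they will be looking at how you think and approach the problem, not at whether you actually solve it. So even if you don't solve a problem, I think you should not be discouraged.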

score 4 · Accepted Answer

The basic strategy with in-place algorithms is to figure out the rule for moving an entry from slot N to slot M.

So, take your shuffle, for instance. If A and B are card positions and N is the number of cards, the rule for the first half of the deck is different from the rule for the second half of the deck:

 // A is the current location, B is the new location.
 // this math assumes that the first card is card 0
 if (A < N/2)
    B = A * 2;
 else
    B = (A - N/2) * 2 + 1;

Now that we know the rule, we just have to move each card. Each time we move a card, we calculate its new location B, take out the card that is currently in slot B, and place card A in slot B. Then we let B be A and loop back to the top of the algorithm. Each card moved displaces another card, which becomes the next card to be moved.

I think the analysis is easier if we are 0 based rather than 1 based, so

 0 1 2 3 4 5 6 7  // before
 0 4 1 5 2 6 3 7  // after

So we want to move 1->2 2->4 4->1 and that completes a cycle then move 3->6 6->5 5->3 and that completes a cycle and we are done.

Now we know that card 0 and card N-1 don't move, so we can ignore those, which means we only need to move N-2 cards in total. The only sticky bit is that there are 2 cycles, 1,2,4,1 and 3,6,5,3. When we get to card 1 the second time, we need to move on to card 3.

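 int N = 8;
 card ary[N];     // Our array of cards
 int A = 1;       // slot the card in hand came from (start of the first cycle)
 int B;           // slot the card in hand moves to
 card a = ary[A]; // the card in hand

 // Cards 0 and N-1 never move, so exactly N-2 cards have to be placed.
 for (int i = 0; i < N - 2; ++i)
 {
     if (A < N/2)
        B = A * 2;
     else
        B = (A - N/2) * 2 + 1;

     card b = ary[B]; // card displaced from slot B
     ary[B] = a;
     a = b;
     A = B;

     // Hard-coded for N = 8: cycle 1,2,4 is done, start cycle 3,6,5.
     if (A == 1)
     {
        A = 3;
        a = ary[A];
     }
 }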

Now this code only works for the 8 card example, because of that if test that moves us from 1 to 3 when we finish the first cycle. What we really need is a general rule to recognize the end of the cycle, and where to go to start the next one.

That rule could be mathematical if you can think of a way, or you could keep track of which places you had visited in a separate array, and when A is back to a visited place, you could then scan forward in your array looking for the first non-visited place.
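A minimal Python sketch of the second option (a separate "visited" array plus a forward scan), using the same 0-based rule as above. The name `out_shuffle_visited` is made up for this example, and the `visited` list costs O(n) extra memory, so this is a stepping stone rather than a true in-place solution.

```python
def out_shuffle_visited(cards):
    """Out-shuffle an even-length list by following cycles with a visited list."""
    n = len(cards)
    if n % 2 != 0:
        raise ValueError("out-shuffle needs an even number of cards")
    half = n // 2
    visited = [False] * n                  # O(n) extra memory
    for start in range(n):                 # scan forward for an unvisited slot
        if visited[start]:
            continue
        a = start
        held = cards[a]                    # the card in hand
        while True:
            # Same rule as in the answer: where does the card from slot a go?
            b = a * 2 if a < half else (a - half) * 2 + 1
            cards[b], held = held, cards[b]
            visited[b] = True
            a = b
            if a == start:                 # the cycle is closed
                break


deck = [0, 1, 2, 3, 4, 5, 6, 7]
out_shuffle_visited(deck)
print(deck)  # [0, 4, 1, 5, 2, 6, 3, 7]
```

Unlike the hard-coded version for 8 cards, this handles any even deck size, because the end of a cycle is detected by returning to its starting slot.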

For your in-place algorithm to be O(n) with only O(1) extra space, the solution will need to be mathematical.

I hope this breakdown of the thinking process is helpful to you. If I was interviewing you, I would expect to see something like this on the whiteboard.

Note: As Moron points out, this doesn't work for all values of N, it's just an example of the sort of analysis that an interviewer is looking for.

score 1 · Accepted Answer

For the first one, let's assume n is even. You have:

first half: 1 2 3 4
second : 5 6 7 8

Let x1 = first[1], x2 = second[1].

Now, you have to print one from the first half, one from the second, one from the first, one from the second...

Meaning first[1], second[1], first[2], second[2], ...
Obviously, you don't keep two halves in memory, as that will be O(n) memory. You keep pointers to the two halves. Do you see how you'd do that?

The second is a bit harder. Consider:

12345
abcde
..cde
.....ab
..cdeab
cdeab

Do you notice anything? You should notice that the question basically asks you to move the first i characters to the end of your string, without affording you the luxury of copying the last n - i characters into a buffer, appending the first i, and then returning the buffer. You need to do it with O(1) memory.

To figure how to do this you basically need a lot of practice with these kinds of problems, as with anything else. Practice makes perfect basically. If you've never done these kinds of problems before, it's unlikely you'll figure it out. If you have, then you have to think about how you can manipulate the substrings and or indices such that you solve your problem under the given constraints. The general rule is to work and learn as much as possible so you'll figure out the solutions to these problems very fast when you see them. But the solution differs quite a bit from problem to problem. There's no clear recipe for success I'm afraid. Just read a lot and understand the stuff you read before you move on.

The logic for the second problem is this: what happens if we reverse the substring [1, 2], the substring [3, 5] and then concatenate them and reverse that? We have, in general:

1, 2, 3, 4, ..., i, i + 1, i + 2, ..., N

reverse [1, i] =>
i, i - 1, ..., 4, 3, 2, 1, i + 1, i + 2, ..., N

reverse [i + 1, N] =>
i, i - 1, ..., 4, 3, 2, 1, N, ..., i + 1

reverse [1, N] =>
i + 1, ..., N, 1, 2, 3, 4, ..., i - 1, i

which is what you wanted. Writing the reverse function using O(1) memory should be trivial.
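The three reversals can be sketched in Python as follows. The function names are illustrative, and indices are 0-based here, whereas the derivation above is 1-based.

```python
def reverse_range(arr, lo, hi):
    """Reverse arr[lo..hi] (inclusive) in place using O(1) extra memory."""
    while lo < hi:
        arr[lo], arr[hi] = arr[hi], arr[lo]
        lo += 1
        hi -= 1


def rotate_left_reversal(arr, i):
    """Rotate `arr` left by i positions with three reversals."""
    n = len(arr)
    if n == 0:
        return
    i %= n
    reverse_range(arr, 0, i - 1)   # reverse the first i elements
    reverse_range(arr, i, n - 1)   # reverse the remaining n - i elements
    reverse_range(arr, 0, n - 1)   # reverse the whole array


s = list("abcde")
rotate_left_reversal(s, 2)
print("".join(s))  # cdeab
```

Each element is swapped at most twice, and the accesses are sequential, which is why this version tends to be cache-friendly.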

score 1 · Accepted Answer

Frank,

For programming with loops and arrays, nothing beats David Gries's textbook The Science of Programming. I studied it over 20 years ago, and there are ideas that I still use every day. It is very mathematical and will require real effort to master, but that effort will repay you many times over for your whole career.

score 1 · Accepted Answer

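Adding to Aryabhatta's answer:

There is a general method to "follow the cycles" even without knowing the starting position of each cycle or using memory to record visited cycles. This is especially useful if you need O(1) memory.

For each position i in the array, follow the cycle without moving any data yet, until you reach...

the starting position i: end of the cycle. This is a new cycle: follow it again, this time moving the data.
a position lower than i: this cycle was already visited, so there is nothing to do for it.

Of course this has a time overhead (O(n^2), I believe) and has the cache problems of the general "following cycles" method.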
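A minimal Python sketch of this check, assuming `dest(i)` (a hypothetical callback) returns the index the element at `i` moves to and depends only on the index, not on the data. The function name and the out-shuffle formula in the usage lines are illustrative choices.

```python
def permute_in_place(arr, dest):
    """Apply the permutation `dest` to `arr` using O(1) extra memory."""
    n = len(arr)
    for start in range(n):
        # Dry run: walk the cycle without moving data until we return to
        # `start` or hit a lower index.
        j = dest(start)
        while j > start:
            j = dest(j)
        if j < start:
            continue                   # cycle was handled from a lower index
        # `start` is the lowest index on its cycle: walk it again, moving data.
        temp = arr[start]
        i = start
        while True:
            j = dest(i)
            arr[j], temp = temp, arr[j]
            i = j
            if i == start:
                break


n = 8
deck = [1, 2, 3, 4, 5, 6, 7, 8]
permute_in_place(deck, lambda i: i if i == n - 1 else (2 * i) % (n - 1))
print(deck)  # [1, 5, 2, 6, 3, 7, 4, 8]
```

The dry run is what costs the extra time: each position may walk a large part of its cycle before finding a lower index.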

score 0 · Accepted Answer

Generally speaking, the idea is to loop through the array once, while

storing the value at the position you are at in a temporary variable
finding the correct value for that position and writing it
either move on to the next value, or figure out what to do with your temporary value before continuing.

score 0 · Accepted Answer

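The general approach is as follows:

Construct a positions array int[] pos, such that pos[i] refers to the position (index) of a[i] in the shuffled array.

Rearrange the original array int[] a according to this positions array pos.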

/** Shuffle the array a. */
void shuffle(int[] a) {
    // Step 1
    int[] pos = contructRearrangementArray(a);
    // Step 2
    rearrange(a, pos);
}

/**
 * Rearrange the given array a according to the positions array pos.
 */
private static void rearrange(int[] a, int[] pos)
{
    // By definition 'pos' should not contain any duplicates, otherwise rearrange() can run forever.
    // Do the above sanity check.
    for (int i = 0; i < pos.length; i++) {
        while (i != pos[i]) {
            // This while loop completes one cycle in the array
            swap(a, i, pos[i]);
            swap(pos, i, pos[i]);
        }
    }
}

/** Swap ith element in a with jth element. */
public static void swap(int[] a, int i, int j) 
{
    int temp = a[i];
    a[i] = a[j];
    a[j] = temp;
}

For example, for the outShuffle case, the following is an implementation of contructRearrangementArray().

/**
 * array     : 1 2 3 4 5 6 7 8
 * pos       : 0 2 4 6 1 3 5 7
 * outshuffle: 1 5 2 6 3 7 4 8 (outer boundaries remain same)
 */
public int[] contructRearrangementArray(int[] a)
{
    if (a.length % 2 != 0) {
        throw new IllegalArgumentException("Cannot outshuffle odd sized array");
    }
    if (a.length == 0) {
        return new int[0]; // nothing to shuffle; avoids writing pos[-1] below
    }
    int[] pos = new int[a.length];
    for (int i = 0; i < pos.length; i++) {
        pos[i] = i * 2 % (pos.length - 1);
    }
    pos[a.length - 1] = a.length - 1;
    return pos;
}
