我将提出@trincot 的好答案的变体。就像那个一样,它不关心一行中可能有多少异常值,但与那个不同的是,它也不关心一行中有多少不是异常值。
基本思想只是让每个序列元素“投票”决定第一个序列元素“应该是”什么。得票最多者获胜。通过构造,这最大化了保持不变的元素数量:在 1-liner 循环结束后,votes[i]
如果i
选择作为起点,则保持不变的元素数量。
def correct(numbers, mod=None):
# this part copied from @trincot's program
if mod is None: # if argument is not provided:
# Make a guess what the range is of the values
mod = max(numbers) + 1
votes = [0] * mod
for i, x in enumerate(numbers):
# which initial number would make x correct?
votes[(x - i) % mod] += 1
winning_count = max(votes)
winning_numbers = [i for i, v in enumerate(votes)
if v == winning_count]
if len(winning_numbers) > 1:
raise ValueError("ambiguous!", winning_numbers)
winning_number = winning_numbers[0]
for i in range(len(numbers)):
numbers[i] = (winning_number + i) % mod
return numbers
然后,例如,
>>> correct([9,4,5,6,7,8,9,0,1,2,3,4,8,6,7,0,9,0,1,2,3,4,1,6,7,8,9,0,1,6])
[3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
但
>>> correct([1, 5, 3, 7, 5, 9])
...
ValueError: ('ambiguous!', [1, 4])
也就是说,无法猜测您是否[1, 2, 3, 4, 5, 6]
想要[4, 5, 6, 7, 8, 9]
. 他们都有 3 个“正确”的数字,尽管在这两种情况下都没有两个相邻的异常值。