python - Array problem in Python

Question

I have this code:

import numpy as np
import tables as tb

ndim = 50000
h5in = tb.openFile('data.h5','r')
dist = h5in.root.x  # on-disk array of distances

h5out = tb.openFile('testout.h5', mode='w', title="argsort distances")
root = h5out.root
x = h5out.createCArray(root,'x',tb.Int16Atom(),shape=(ndim,ndim))

for i in xrange(ndim):
    x[:,i] = np.argsort(dist[i,:])

It takes forever to run. Is there any way to speed it up?

Note: it must be x[:,i], not x[i,:]

score 1 · Accepted Answer

Replace the for loop with:

x[:,:] = np.argsort(dist, axis=1).T
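As a quick sanity check, here is a minimal sketch that compares the per-row loop with the single vectorized call on a small in-memory array. The random 6x6 matrix is a hypothetical stand-in for the on-disk data, and the sketch uses Python 3 (`range` rather than `xrange`).

```python
import numpy as np

# Small stand-in for the on-disk distance matrix (hypothetical data).
rng = np.random.default_rng(0)
n = 6
dist = rng.random((n, n))

# Loop version: the sorted indices of row i become column i of the output.
x_loop = np.empty((n, n), dtype=np.int64)
for i in range(n):
    x_loop[:, i] = np.argsort(dist[i, :])

# Vectorized version: sort every row at once, then transpose.
x_vec = np.argsort(dist, axis=1).T

print(np.array_equal(x_loop, x_vec))
```

The vectorized call builds the full result in memory before anything is written, so it only helps when an ndim x ndim index array fits in RAM.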

Update: if that is too large to fit in memory, try to find a compromise on the slice size:

slice_size = 100 # or 1000 if it fits into your memory
for i in xrange(0, ndim, slice_size):
    # argsort gives shape (slice_size, ndim); transpose to match x[:,i:i+slice_size]
    x[:,i:i+slice_size] = np.argsort(dist[i:i+slice_size,:], axis=1).T
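A rough sketch of the sliced approach, using plain NumPy arrays as stand-ins for the HDF5 nodes (this assumes the on-disk arrays accept the same slice syntax, as the question's code already does). The helper name `argsort_in_slices` is made up for illustration. Note the transpose, which makes each block's shape match the target slice. Also, with ndim = 50000 the indices run up to 49999, which is beyond the signed 16-bit maximum of 32767, so an unsigned 16-bit or a 32-bit type is probably the safer choice for the output.

```python
import numpy as np

def argsort_in_slices(dist, out, slice_size):
    """Write the argsort of each row of dist into the matching column of out."""
    ndim = dist.shape[0]
    for start in range(0, ndim, slice_size):
        stop = min(start + slice_size, ndim)  # last slice may be shorter
        # block has shape (stop - start, ndim)
        block = np.argsort(dist[start:stop, :], axis=1)
        # transpose to (ndim, stop - start) so it fits the column slice
        out[:, start:stop] = block.T

# Hypothetical small example; 10 is deliberately not a multiple of 4.
rng = np.random.default_rng(1)
n = 10
dist = rng.random((n, n))
out = np.zeros((n, n), dtype=np.uint16)  # uint16 holds indices up to 65535
argsort_in_slices(dist, out, slice_size=4)

expected = np.argsort(dist, axis=1).T
print(np.array_equal(out, expected))
```

A larger slice size means fewer, bigger writes but more memory per step, so the best value depends on how much RAM is available.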

score 0 · Answer

Are you trying to load a file whose data is stored in rows?

If so, try np.loadtxt.

answered 2011-12-30T04:32:13.160
