7

I want to use Python's csv writer to produce a fixed-width, space-delimited, minimally quoted CSV file. Example of the desired output:

item1           item2  
"next item1"    "next item2"
anotheritem1    anotheritem2  

If I use

writer.writerow( ("{0:15s}".format(item1), "{0:15s}".format(item2)) )
...

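then, with a space delimiter, the layout is broken: the trailing spaces produced by the item formatting cause quotes or escapes to be added (depending on the csv.QUOTE_* constant):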

"item1          " "item2          "
"next item1     " "next item2     "
"anotheritem1   " "anotheritem2   "

Of course, I could format everything myself:

writer.writerow( ("{0:15s}{1:15s}".format(item1, item2)) )

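but then there is not much point in using the csv writer. I would also have to sort out by hand the cases where spaces are embedded in an item and quoting/escaping is required. In other words, I seem to need a (nonexistent) "QUOTE_ABSOLUTELYMINIMAL" csv constant that behaves like "QUOTE_MINIMAL" but also ignores trailing spaces.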

Is there a way to get "QUOTE_ABSOLUTELYMINIMAL" behavior, or some other way to produce fixed-width, space-delimited CSV output with Python's csv module?

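The reason I want fixed width in a CSV file is readability. The file would still be read and written as CSV, but the column layout makes it easier on the eyes. Reading is not a problem, because the csv skipinitialspace option ignores the extra spaces. To my surprise, writing seems to be the problem...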

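Edit: I have concluded that this cannot be done with the current csv module. It is not a built-in option, and I cannot see any reasonable way to implement it by hand, because there seems to be no way to make Python's csv writer emit extra delimiters without quoting or escaping them. So I will probably have to write my own csv writer.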

4

3 Answers

8

The basic problem you are running into is that csv and fixed-format are opposing views of data storage, and making them work together is not common practice. Also, if you only have quotes on the items with spaces in them, it will throw off the alignment on those rows:

testing     "rather hmm "
strange     "ways to    "
"store some " "csv data   "
testing     testing    

Reading that data back in also gives wrong results:

'testing' 'rather hmm '
'strange' 'ways to    '
'store some ' 'csv data   '
'testing' 'testing' ''

Notice the extra field at the end of the last row. Given these problems, I would go with your example of

"item1          " "item2          "
"next item1     " "next item2     "
"anotheritem1   " "anotheritem2   "

which I find very readable, is easy to generate with the existing csv library, and gets correctly parsed when read back in. Here's the code I used to generate it:

import csv

class SpaceCsv(csv.Dialect):
    "csv format for exporting tables"
    delimiter = ' '          # a dialect needs a one-character delimiter
    doublequote = True
    escapechar = None
    lineterminator = '\n'
    quotechar = '"'
    skipinitialspace = True
    quoting = csv.QUOTE_MINIMAL
csv.register_dialect('space', SpaceCsv)

# every field is already padded to the same width
data = (
        ('testing    ', 'rather hmm '),
        ('strange    ', 'ways to    '),
        ('store some ', 'csv data   '),
        ('testing    ', 'testing    '),
        )

# newline='' leaves line endings to the csv module (Python 3)
with open(r'c:\tmp\fixed.csv', 'w', newline='') as temp:
    writer = csv.writer(temp, dialect='space')
    for row in data:
        writer.writerow(row)

You will, of course, need to have all your data padded to the same length, either before getting to the function that does all this, or in the function itself. Oh, and if you have numeric data you'll have to make padding allowances for that as well.
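
As a rough illustration of doing the padding "in the function itself", here is a minimal sketch. The helper names `write_padded` and `read_padded` and the width of 15 are made up for this example, and it writes to an in-memory buffer rather than a file. Numbers are passed through `str()` so they get padded like text.

```python
import csv
import io

def write_padded(rows, out, width=15):
    """Pad every field to `width` and write it with a space delimiter."""
    writer = csv.writer(out, delimiter=' ', quoting=csv.QUOTE_MINIMAL,
                        lineterminator='\n')
    for row in rows:
        # str() lets numeric values be padded the same way as text
        writer.writerow([str(value).ljust(width) for value in row])

def read_padded(source):
    """Read the rows back and strip the padding from each field."""
    reader = csv.reader(source, delimiter=' ', skipinitialspace=True)
    return [[field.rstrip() for field in row] for row in reader]

rows = [('item1', 'item2'), ('next item1', 'next item2'), ('anotheritem1', 42)]

buffer = io.StringIO()
write_padded(rows, buffer)
text = buffer.getvalue()       # every padded field ends up quoted

parsed = read_padded(io.StringIO(text))
print(text)
print(parsed)
```

Note that the values come back as strings, so the 42 would need converting again after reading.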

answered 2011-08-04T23:51:31.493
2

How does this work for you? I think you are really just missing the csv.QUOTE_NONE constant.

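import csv
csv.register_dialect('spacedelimitedfixedwidth', delimiter=' ', quoting=csv.QUOTE_NONE)
# newline='' is what the csv module expects for files in Python 3
with open('crappymainframe.out', newline='') as f:
    reader = csv.reader(f, 'spacedelimitedfixedwidth')
    for row in reader:
        print(row)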

This is a modification of the unixpwd dialect example at the bottom of the csv module documentation.
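
Since the question is about writing rather than reading, it may help to see what QUOTE_NONE does on the writer side. The sketch below is only an illustration (the helper `write_row` is a made-up name): with a space delimiter and no escapechar, the writer refuses a field that contains a space, and with an escapechar it escapes every embedded space instead of quoting.

```python
import csv
import io

def write_row(row, **fmt):
    """Write one row with a space delimiter and QUOTE_NONE; return the text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=' ', quoting=csv.QUOTE_NONE,
                        lineterminator='\n', **fmt)
    writer.writerow(row)
    return buffer.getvalue()

# Fields without spaces are written as-is
plain = write_row(['item1', 'item2'])

# A field containing the delimiter cannot be written without an escapechar
try:
    write_row(['next item1', 'next item2'])
    failed = False
except csv.Error:
    failed = True

# With an escapechar, each embedded space is escaped rather than quoted
escaped = write_row(['next item1', 'next item2'], escapechar='\\')

print(repr(plain), failed, repr(escaped))
```

So QUOTE_NONE on its own does not give padded, unquoted columns on output; padded fields would either raise an error or come out with escaped spaces.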

answered 2011-04-12T20:41:05.530
0

This ActiveState recipe shows how to output tabular data in Python: http://code.activestate.com/recipes/267662-table-indentation/

You can probably glean enough from that example to do what you want.
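
In the same hand-formatting spirit, here is a rough sketch (not the recipe's code) of one way to approximate the layout asked for in the question. It assumes a space delimiter and a column width of 15; `quote_minimal` and `format_row` are hypothetical helper names. Each field is quoted on its own by the csv module, and the padding is added afterwards, outside the quotes.

```python
import csv
import io

def quote_minimal(value):
    """Return one field as csv.QUOTE_MINIMAL would write it on its own."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=' ', quoting=csv.QUOTE_MINIMAL,
                        lineterminator='\n')
    writer.writerow([value])
    return buffer.getvalue()[:-1]   # drop the line terminator

def format_row(row, width=15):
    """Quote each field first, then pad the quoted text to the column width."""
    cells = [quote_minimal(str(value)).ljust(width) for value in row]
    # rstrip() avoids a trailing delimiter that would read back as an empty field
    return ' '.join(cells).rstrip() + '\n'

rows = [('item1', 'item2'),
        ('next item1', 'next item2'),
        ('anotheritem1', 'anotheritem2')]

text = ''.join(format_row(row) for row in rows)
print(text)

# Reading back relies on skipinitialspace to swallow the padding
parsed = list(csv.reader(io.StringIO(text), delimiter=' ', skipinitialspace=True))
print(parsed)
```

Fields longer than the width are not truncated, so they push the following column to the right on that row.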

answered 2011-04-12T16:23:54.203