python - tarfile extraction error: IOError: [Errno 22] invalid mode ('wb') or filename

Question

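I am using tarfile to extract files. Unfortunately, the archive comes from a Linux server and contains several files whose names include a character that is illegal on Windows (':').

I am using the following: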

extract = tarfile.open(file)
extract.extractall(path=new_path)
extract.close()

I get the following error: IOError: [Errno 22] invalid mode ('wb') or filename: ... "file::ext"

So I tried to pass over the error like this:

try:
    extract = tarfile.open(file)
    extract.extractall(path=new_path)
    extract.close()
except IOError:
    pass

This does suppress the error, but the extraction does not continue. It simply stops at the first failure.
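
The extraction stops because extractall() raises at the first member it cannot write, and the exception leaves the try block before the remaining members are processed. A minimal sketch of the "skip and continue" idea is to extract one member at a time and catch the error per member. The function name extract_skipping_errors and its return value are illustrative assumptions, and the sketch assumes the failure surfaces as IOError/OSError, as in the traceback above:

```python
import tarfile


def extract_skipping_errors(archive_path, dest):
    """Extract members one at a time; return the names that could not be written.

    Hypothetical helper for illustration, not part of the tarfile module.
    """
    skipped = []
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            try:
                archive.extract(member, path=dest)
            except (IOError, OSError):
                # On Python 3, IOError is an alias of OSError.
                # Record the member and carry on with the next one.
                skipped.append(member.name)
    return skipped
```

Calling extract_skipping_errors(file, new_path) would then return the list of member names that were skipped, so they can be logged or handled separately.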

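When I unpack the archive with WinRAR, the file is automatically renamed to "file__ext".

Is there a WinRAR extension for Python? Or perhaps a way to skip the error and continue extracting? Or a way to rename the file automatically, as WinRAR does? I don't mind if the file is skipped.

I have seen several posts about this error, but they all concern compression, not extraction.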

score 2 · Accepted Answer

import re
import tarfile

extract = tarfile.open(file)
for f in extract:
    # rename each member in memory before extracting;
    # add other unsavory characters in the brackets
    f.name = re.sub(r'[:]', '_', f.name)
extract.extractall(path=new_path)
extract.close()

(The changes are not saved to the original archive because we open it in read mode by default.)
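
To cover more than the colon, the same idea can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the names sanitize_member_name and extract_sanitized are made up for illustration, and the character set assumes the usual list of characters Windows rejects in file names (< > : " | ? * and ASCII control characters). Path separators are left untouched so that directories inside the archive are preserved.

```python
import re
import tarfile

# Assumed set of characters Windows rejects in file names, plus ASCII
# control characters. "/" and "\" are deliberately not included.
_ILLEGAL = re.compile(r'[<>:"|?*\x00-\x1f]')


def sanitize_member_name(name, replacement="_"):
    """Return name with each Windows-illegal character replaced."""
    return _ILLEGAL.sub(replacement, name)


def extract_sanitized(archive_path, dest):
    """Extract archive_path into dest, renaming members in memory first."""
    with tarfile.open(archive_path) as archive:
        members = archive.getmembers()
        for member in members:
            # Only the in-memory TarInfo changes; the archive on disk does not.
            member.name = sanitize_member_name(member.name)
        archive.extractall(path=dest, members=members)
```

With this sketch, a member named "file::ext" would be written as "file__ext", matching what WinRAR does. Two members that differ only in the replaced characters would end up with the same name, so one would overwrite the other.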

score 0 · Accepted Answer

If the main goal is to batch-process these jobs, you can call WinRAR from the command line:

import subprocess
subprocess.call(['winRAR.exe', 'x', 'file.rar', 'PathToExtractTo'], shell=True)

I have not tested the code above, but hopefully it gives you some ideas.
