python - Remove blocks of lines from a file that start with () and end with ()

Question

Here is a sample of my file, blah.log:

Y:\TH2020-0600_1P00392G01_02\1P00392G01_02.obc[30-SEP-20 10:42:47

@30-SEP-20 10:42:51

yhjubad7

q28ed7qai

aiuwdh8

"30-SEP-20 10:43:06

@30-SEP-20 10:43:39 nkdjaw aibw

阿克乌德纳维克德

/30-SEP-20 10:43:52 @30-SEP-20 10:43:52 ahuwsd8

2dhaiubd 98wha98 "30-SEP-20 10:49:39

]30-SEP-20 11:29:03

Y:\TH2020-0600_1P00392G01_02\1P00392G01_02.obc[01-OCT-20 11:19:08]01-OCT-20 11:26:29

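There are many more lines (thousands), but this is just a short excerpt.

I want to delete the blocks of lines that start with '@' and end with '/' (some of them end with '?' or '!' instead).

Here is my code: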

file = "cpcpk/1P00392G01_02.LOG"
newfile="cpcpk/New_1P00392G01_02.LOG"
new=open(newfile,'w')
with open(file) as input_data:
    # loops through the whole file
    for line in input_data:
        # reset data
        data=[]
        if line.startswith('@'):
            # Skips text before the beginning of the interesting block
            for line in input_data:
                if line.startswith('@'):
                    #write test log in a new file
                    data.append(line)
                    break
            # Reads text until the end of the block:
            for line in input_data:  
                if line.startswith('"'):
                    data.append(line)
                    new.writelines(data)
                    break
                elif line.startswith('/'):
                    break
                elif line.startswith('?'):
                    break
                elif line.startswith('!'):
                    break
                data.append(line)
                
new.close()

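First problem

When I run it, the unwanted blocks are deleted, but some lines I want to keep are deleted as well.

Second problem

With the code written this way, the first line and the last few lines are not written either.

This is the output I want: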

Y:\TH2020-0600_1P00392G01_02\1P00392G01_02.obc[30-SEP-20 10:42:47

@30-SEP-20 10:42:51

yhjubad7

q28ed7qai

aiuwdh8

"30-SEP-20 10:43:06 @30-SEP-20 10:43:52 ahuwsd8

2dhaiubd

98wha98

"30-SEP-20 10:49:39

]30-SEP-20 11:29:03

Y:\TH2020-0600_1P00392G01_02\1P00392G01_02.obc[01-OCT-20 11:19:08

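What I mean is this: when a line starts with @, for example @30-SEP-20 10:42:51, the code starts collecting lines into a list. As the for loop continues, when it reaches a line that starts with ", for example "30-SEP-20 10:43:06, it stops looping and writes the list to the new file. But if it reaches a line that starts with /, for example /30-SEP-20 10:43:06, it stops looping, resets the list, and starts over. That is how I coded it. You can see I have three loops: the second and third loops find what I want and what I don't want, and the first loop repeats the second and third.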

score 0 · Accepted Answer

You can use a regular expression to specify a pattern that detects the lines to delete:

import re

with open(file) as input_file, open(newfile, 'w') as new:
    new.write(re.sub(r"@[^@]+/.*\n", "", input_file.read()))

Regex demo
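To see what this pattern actually removes, it can help to apply it to a small in-memory string first. The sample text below is made up for illustration (it only mimics the log format from the question); the pattern is the one from the answer above.

```python
import re

# Pattern from the answer: an '@', then any run of non-'@' characters
# (newlines included), then a '/' and the rest of that line.
BLOCK = re.compile(r"@[^@]+/.*\n")

# Hypothetical sample, loosely modelled on the log in the question.
sample = (
    "header[30-SEP-20 10:42:47\n"
    "@30-SEP-20 10:42:51\n"
    "keep me\n"
    '"30-SEP-20 10:43:06\n'
    "@30-SEP-20 10:43:39 drop me\n"
    "drop me too\n"
    "/30-SEP-20 10:43:52\n"
    "]30-SEP-20 11:29:03\n"
)

# Remove every span the pattern matches and show what is left.
cleaned = BLOCK.sub("", sample)
print(cleaned)
```

Because `[^@]+` is greedy, a match runs up to the last `/` found before the next `@` (or the end of the text), so any other `/` in between would extend the deleted span. The `?` and `!` endings mentioned in the question are not covered by this pattern.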

If the file is too large to read all at once, you can do something like this:

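buff = []
with open(file) as input_file, open(newfile, 'w') as new:
    for line in input_file:
        if line.startswith('@'):
            # a new block starts: flush what was buffered so far
            new.writelines(buff)
            buff = []
        if line.startswith('/'):
            # the block ended with '/': discard it
            buff = []
            continue
        buff.append(line)
    # write whatever is still buffered at the end of the file
    new.writelines(buff)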

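This keeps a buffer of lines: each time a line starting with @ is encountered, the buffer is flushed to the file. When a line starting with / is encountered, on the other hand, the buffer is reset. It may need some work for edge cases, but it should give you the idea.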
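The same buffering idea can be sketched as a small generator that also treats '?' and '!' as block endings, as the question mentions. This is only one possible reading of the requirements: the function name `filter_blocks`, the constants, and the sample lines are hypothetical, and the choice to keep a block that never reaches an end marker is an assumption.

```python
START = "@"
KEEP_END = '"'
DROP_ENDS = ("/", "?", "!")


def filter_blocks(lines):
    """Yield lines, dropping every '@' block that ends with '/', '?' or '!'."""
    block = None  # None means we are not inside an '@' block
    for line in lines:
        if block is None:
            if line.startswith(START):
                block = [line]    # open a new block and start buffering
            else:
                yield line        # outside any block: pass the line through
        elif line.startswith(DROP_ENDS):
            block = None          # unwanted block: discard buffer and end line
        elif line.startswith(KEEP_END):
            yield from block      # wanted block: emit the buffer...
            yield line            # ...and the closing line
            block = None
        elif line.startswith(START):
            yield from block      # assumption: an unterminated block is kept
            block = [line]
        else:
            block.append(line)
    if block is not None:
        yield from block          # assumption: keep a block cut off by EOF


# Hypothetical sample, loosely modelled on the log in the question.
sample = [
    "header[30-SEP-20 10:42:47\n",
    "@30-SEP-20 10:42:51\n",
    "keep me\n",
    '"30-SEP-20 10:43:06\n',
    "@30-SEP-20 10:43:39 drop me\n",
    "drop me too\n",
    "/30-SEP-20 10:43:52\n",
    "]30-SEP-20 11:29:03\n",
]
result = list(filter_blocks(sample))
print("".join(result))
```

Since the generator only looks at one line at a time, an open file object can be passed in place of the list and the result handed to `writelines`, so the whole file never has to be in memory.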

score 0 · Accepted Answer

Are you looking for something like this?

file = "a.log"
data=[]
with open(file) as input_data:
    # loops through the whole file
    for line in input_data:
        line2= line.rstrip('\n')
        if line2.startswith('@') and line2.endswith('/'):
            pass
        else:
            print(line)
            data.append(line)
print(data)

score 0 · Accepted Answer

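In your question you say you want to delete lines that start with '@' and end with '/', but your file doesn't contain any such lines? If that is what you want to do, the code below should work: it creates a new file with every line that meets the condition (i.e. starts with '@' and ends with '/') removed.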

file = 'thefile.txt'
newfile = 'thenewfile.txt'
data = []

with open(file, 'r') as rf, open(newfile, 'w') as wf:
    for line in rf:
        line = line.rstrip()
        # keep the line unless it both starts with '@' and ends with '/'
        if not (line.startswith('@') and line.endswith('/')):
            data.append(line)
    for line in data:
        wf.write(line + '\n')

score 0 · Accepted Answer

Something like the code below:

with open('blah.log') as f:
  lines = [l.strip() for l in f.readlines()]
  with open('blah1.log','w') as f1:
    for line in lines:
      if len(line) > 0 and line[0] == '@' and line[-1] == '/':
        continue
      else:
        f1.write(line + '\n')
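The single-line test used in these answers can be pulled out into a small predicate, which makes it easy to check against a few lines before running it on a real file. This is a minimal sketch: the name `is_removable` and the sample lines are hypothetical. Note that the negation of "starts with '@' and ends with '/'" is `not (A and B)`; writing `not A and not B` would also drop lines that satisfy only one of the two conditions.

```python
def is_removable(line, start="@", end="/"):
    """Return True if the line, ignoring trailing whitespace,
    starts with `start` and ends with `end`."""
    stripped = line.rstrip()
    return stripped.startswith(start) and stripped.endswith(end)


# Hypothetical sample lines; only the first one meets both conditions.
sample = ["@drop this/\n", "@keep this\n", "keep this too/\n", "\n"]
kept = [line for line in sample if not is_removable(line)]
print(kept)
```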
