Author: 采蘑菇的雨天戴草帽_412_715 | Source: Internet | 2023-06-08 10:51
I have .srt files in the following format:
0
1
00:00:01,830 --> 00:00:04,740
corresponding text
1
2
00:00:05,280 --> 00:00:10,280
corresponding text
2
3
00:00:10,740 --> 00:00:14,640
corresponding text
3
4
00:00:15,510 --> 00:00:19,260
corresponding text
4
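The extra line carrying the row number runs all the way through the subtitles (row 5, row 6 ... row 540). I tried the command sed '/^[0-9]/ s/.//', and it replaced all the numbers as expected, but I don't know how to make it replace only the second occurrence of each number in the range.
The expected result is: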
0
1
00:00:01,830 --> 00:00:04,740
corresponding text
2
00:00:05,280 --> 00:00:10,280
corresponding text
3
00:00:10,740 --> 00:00:14,640
corresponding text
4
00:00:15,510 --> 00:00:19,260
corresponding text
How can I do this with sed, awk, or any other tool that can run in batch, since several files have the same problem?
Thanks!
Answer
$ awk 'BEGIN{FS=OFS=RS;RS=""} {$NF=""}1' file
0
1
00:00:01,830 --> 00:00:04,740
corresponding text
2
00:00:05,280 --> 00:00:10,280
corresponding text
3
00:00:10,740 --> 00:00:14,640
corresponding text
4
00:00:15,510 --> 00:00:19,260
corresponding text
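Since the question asks about several files, the one-liner above can be wrapped in a loop. The following is only a sketch under stated assumptions: the function names strip_trailing_index and clean_srt_files are hypothetical, and it assumes each subtitle block is followed by a blank line (the listing above shows none, but RS="" only splits records on blank lines). It keeps a .bak copy of every file instead of editing in place.

```shell
#!/bin/sh
# Hypothetical helper: print file $1 with the last line of every
# blank-line-separated block removed (same awk program as the answer).
strip_trailing_index() {
    awk 'BEGIN { FS = OFS = RS; RS = "" } { $NF = "" } 1' "$1"
}

# Hypothetical batch wrapper: process every file passed as an argument.
# Each original is first copied to FILE.bak, then FILE is rewritten from it.
clean_srt_files() {
    for f in "$@"; do
        [ -f "$f" ] || continue            # skip an unmatched glob or a directory
        cp -- "$f" "$f.bak" || return 1    # keep the untouched original
        strip_trailing_index "$f.bak" > "$f" || return 1
    done
}
```

A call such as clean_srt_files ./*.srt would then handle every .srt file in the current directory. One caveat: if the files use CRLF line endings, the separator lines contain a carriage return rather than being empty, so awk would not treat them as block boundaries; the line endings would need converting first.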
@RavinderSingh13 - yes, that makes sense. I had snapped to most of that but did not catch the effect of `$NF=""` -- thinking about it in terms of paragraph and nuking the last field makes perfect sense.
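To see the "paragraph and last field" view described in this comment before deleting anything, a small inspection helper can print what awk considers the last field of each block. This is a sketch only: show_blocks is a hypothetical name, and it assumes the blocks are separated by blank lines, as the RS="" setting requires.

```shell
#!/bin/sh
# Hypothetical helper: with RS="" each blank-line-separated block is one record,
# and with FS set to a newline each line of that block is one field.
# Prints the line count and the last line of every block; modifies nothing.
show_blocks() {
    awk 'BEGIN { FS = OFS = RS; RS = "" }
         { printf "block %d: %d lines, last line = [%s]\n", NR, NF, $NF }' "$@"
}
```

Running show_blocks file first confirms that the bracketed value on each line is the stray counter that $NF="" would blank out, and not a line of subtitle text.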