Author: 刻骨铭心2502914183_610 | Source: Internet | 2023-06-07 13:45
I'm working in bash and I have a large file from which I want to remove all the lines that do not match a certain regex, probably with something like $ grep -e "" > output.txt
What I want to keep is any line that contains a specified character exactly x times. For example, in the binary sequence
0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
I would like to keep only the strings that contain exactly two 1s, leaving me with
0011, 0101, 0110, 1001, 1010, 1100
I would then use a bash variable to vary the count I need (always exactly half of the length; all strings have the same length). I'm literally looking for lines that are half 0s and half 1s.
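For illustration only, here is a minimal sketch of how the grep idea mentioned above could look with the count taken from a variable. The function name keep_exact_ones is hypothetical, and the sketch assumes every line consists solely of 0s and 1s. The pattern allows leading zeros, then exactly N groups of "a 1 followed by any number of 0s", so a line matches only when it holds exactly N ones.

```shell
# keep_exact_ones N: hypothetical helper (not from the original post).
# Reads lines of 0s and 1s on stdin and prints only those that contain
# exactly N ones. Assumes the lines contain no other characters.
keep_exact_ones() {
    # ^0*       any leading zeros
    # (10*){N}  exactly N groups of "a one followed by any zeros"
    # $         end of line, so no further ones can follow
    grep -E "^0*(10*){$1}\$"
}

# Demo on the 4-bit example: keep the strings with exactly two 1s.
printf '%s\n' 0000 0001 0010 0011 0100 0101 0110 0111 \
              1000 1001 1010 1011 1100 1101 1110 1111 | keep_exact_ones 2
```

Because all strings have the same length, passing half of that length as N keeps exactly the lines that are half 0s and half 1s.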
I have this right now. It's not using regex. It works, but is very slow:
($1 is the length of every string, $d is just a directory)
sed -e 's/\(.\)/\1 /g' <$d/input.txt > $d/spaces.txt
awk '{c=0;for(i=1;i<=NF;++i){c+=$i};print c}' $d/spaces.txt > $d/sums.txt
grep -n "$(($1/2))" $d/sums.txt | cut -f1 -d: > $d/linenums.txt
for i in $(cat $d/linenums.txt)
do
sed "${i}q;d" $d/input.txt
done > $d/valids.txt
In case you wonder, this puts a space between every digit, turning 1010 into 1 0 1 0, then adds the values together and saves the results in sums.txt, greps for length/2 and saves only the line numbers in linenums.txt, then reads linenums.txt and outputs the corresponding lines from input.txt to valids.txt.
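As a point of comparison, the same counting could be done in one pass over the input, without intermediate files and without rereading input.txt once per matching line. The following is only a sketch under the assumption that every line consists solely of 0s and 1s; the function name filter_half_ones is hypothetical. It relies on awk's gsub returning the number of substitutions it made, which here equals the number of 1s in the line.

```shell
# filter_half_ones LEN: hypothetical helper (not from the original post).
# Reads lines on stdin and prints those whose number of 1s equals LEN/2.
filter_half_ones() {
    awk -v half="$(( $1 / 2 ))" '{
        s = $0                      # work on a copy so $0 stays intact
        # gsub returns how many 1s it replaced in the copy
        if (gsub(/1/, "", s) == half) print
    }'
}

# Demo on a few 4-bit strings (LEN = 4, so two 1s are required).
printf '%s\n' 0000 0011 0111 1010 1111 | filter_half_ones 4
```

With the variables from the script above, a call might look like filter_half_ones "$1" < "$d/input.txt" > "$d/valids.txt".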
I need something quicker; the for loop is what takes far too long.
Thanks for your time and for sharing your knowledge with me.
1 solution