regex - Use grep to find either of two strings without changing the order of the lines?
I'm sure this has been asked before, but I can't find it, so apologies for the redundancy.
I want to use grep or egrep to find every line that has either ' p ' or ' ca ' in it, and pipe those lines to a new file. I can do one or the other using:
egrep ' ca ' all.pdb > ca.pdb
or
egrep ' p ' all.pdb > p.pdb
I'm new to regex, so I'm not sure what the syntax for "or" is.
Update: the order of the output lines is important, i.e. I do not want the output to sort the lines by which string was matched. Here is an example of the first lines of one file:
atom 1 n thr u 27 -68.535 88.128 -17.857 1.00 0.00 1h5 n
atom 2 ht1 thr u 27 -69.437 88.216 -17.434 0.00 0.00 1h5 h
atom 3 ht2 thr u 27 -68.270 87.165 -17.902 0.00 0.00 1h5 h
atom 4 ht3 thr u 27 -68.551 88.520 -18.777 0.00 0.00 1h5 h
atom 5 ca lys b 122 -116.643 85.931-103.890 1.00 0.00 2h2b c
atom 6 p thy j 2 -73.656 70.884 -7.805 1.00 0.00 dna2 p
atom 8 hb thr u 27 -68.543 88.566 -15.171 0.00 0.00 1h5 h
atom 9 ca lys b 122 -116.643 85.931-103.890 1.00 0.00 2h2b c
atom 10 p thy j 2 -73.656 70.884 -7.805 1.00 0.00 dna2 p
atom 11 hb thr u 27 -68.543 88.566 -15.171 0.00 0.00 1h5 h
atom 12 c ser d 2 -73.656 70.884 -7.805 1.00 0.00 dna2 c
atom 13 op1 ser d 2 -73.656 70.884 -7.805 1.00 0.00 dna2 o
And I want the result file for this example to be:
atom 5 ca lys b 122 -116.643 85.931-103.890 1.00 0.00 2h2b c
atom 6 p thy j 2 -73.656 70.884 -7.805 1.00 0.00 dna2 p
atom 9 ca lys b 122 -116.643 85.931-103.890 1.00 0.00 2h2b c
atom 10 p thy j 2 -73.656 70.884 -7.805 1.00 0.00 dna2 p
You can use grep like this:
grep ' p \| ca ' file > new_file
The \| expression indicates "or". You have to escape the | in order to tell grep that it has a special meaning.
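To see what the escaping changes, here is a small sketch. It assumes GNU grep, where \| is an extension to the basic syntax; the three-line sample and the names demo_file, escaped and unescaped are made up for illustration.

```shell
# Hypothetical sample: two ordinary lines and one containing a literal "|".
demo_file=$(mktemp)
printf '%s\n' 'x p y' 'x ca y' 'x p | ca y' > "$demo_file"

# Escaped: \| means "or", so any line with ' p ' or ' ca ' is counted.
escaped=$(grep -c ' p \| ca ' "$demo_file")

# Unescaped: | is an ordinary character in basic syntax, so only a line
# containing the literal text " p | ca " is counted.
unescaped=$(grep -c ' p | ca ' "$demo_file")

echo "escaped=$escaped unescaped=$unescaped"   # escaped=3 unescaped=1
rm -f "$demo_file"
```

Without the backslash, grep quietly searches for the literal text, which is why an unescaped pattern usually returns nothing on real data.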
You can avoid the escaping by using the fancier extended grep (note the capital -E):
grep -E ' (p|ca) ' file > new_file
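As a quick check that the output keeps the input order, here is a sketch using shortened, illustrative rows rather than the full records from the question. It also shows an equivalent spelling with several lowercase -e options, each introducing one pattern; a line is printed if any of them matches. The variable names are placeholders.

```shell
# Illustrative rows only: truncated versions of the records in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
atom 4 ht3 thr u 27
atom 5 ca lys b 122
atom 6 p thy j 2
atom 8 hb thr u 27
atom 9 ca lys b 122
atom 10 p thy j 2
EOF

# Extended syntax: one pattern with alternation.
with_E=$(grep -E ' (p|ca) ' "$sample")

# Same selection written as two separate patterns.
with_e=$(grep -e ' p ' -e ' ca ' "$sample")

# grep reads the file once, top to bottom, so matches come out in
# input order (atoms 5, 6, 9, 10) regardless of which pattern matched.
[ "$with_E" = "$with_e" ] && echo "same output"
printf '%s\n' "$with_E"
rm -f "$sample"
```

Running egrep once per string and concatenating the results would group the lines by pattern instead, which is exactly the reordering the question wants to avoid.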
In general, I prefer the awk syntax, since it is clearer and easier to extend:
awk '/ p / || / ca /' file
Or, given your sample input, you can use awk to check whether the match is in the 3rd column:
$ awk '$3=="ca" || $3=="p"' file
atom 5 ca lys b 122 -116.643 85.931-103.890 1.00 0.00 2h2b c
atom 6 p thy j 2 -73.656 70.884 -7.805 1.00 0.00 dna2 p
atom 9 ca lys b 122 -116.643 85.931-103.890 1.00 0.00 2h2b c
atom 10 p thy j 2 -73.656 70.884 -7.805 1.00 0.00 dna2 p
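The column test can be stricter than the substring search. The sketch below assumes a hypothetical third row in which "p" shows up in the 5th column instead of the 3rd; that row is not from the question's data, and the variable names are placeholders.

```shell
# Two shortened rows from the question plus one invented row (atom 7)
# where " p " appears in a different column.
rows=$(mktemp)
cat > "$rows" <<'EOF'
atom 5 ca lys b 122
atom 6 p thy j 2
atom 7 n thr p 27
EOF

# Substring search: counts every line containing ' p ' or ' ca ' anywhere.
substring_hits=$(grep -c -E ' (p|ca) ' "$rows")

# Field test: counts only lines whose 3rd whitespace-separated field
# is exactly "ca" or "p".
column_hits=$(awk '$3 == "ca" || $3 == "p" { n++ } END { print n + 0 }' "$rows")

echo "substring=$substring_hits column=$column_hits"   # substring=3 column=2
rm -f "$rows"
```

Whether this matters depends on the file: if the strings can only ever appear in the 3rd column, both approaches select the same lines.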
Test
$ cat file
hello p here , ca ca appears
nothing here
p ca
$ grep ' p \| ca ' file
hello p here , ca ca appears
$ grep -E ' (p|ca) ' file
hello p here , ca ca appears
$ awk '/ p / || / ca /' file
hello p here , ca ca appears