regex - Use grep to find either of two strings without changing the order of the lines?

- May 15, 2012

I'm sure this has been asked before, but I can't find it, so apologies for any redundancy.

I want to use grep or egrep to find every line that has either ' p ' or ' ca ' in it, and pipe those lines to a new file. I can do one or the other using:

egrep ' ca ' all.pdb > ca.pdb

egrep ' p ' all.pdb > p.pdb

I'm new to regex, so I'm not sure of the syntax for "or".

Update: the order of the output lines is important, i.e. I do not want the output to sort the lines by which string was matched. Here is an example of the first few lines of one file:

atom      1 n    thr u  27     -68.535  88.128 -17.857  1.00  0.00      1h5  n
atom      2 ht1  thr u  27     -69.437  88.216 -17.434  0.00  0.00      1h5  h
atom      3 ht2  thr u  27     -68.270  87.165 -17.902  0.00  0.00      1h5  h
atom      4 ht3  thr u  27     -68.551  88.520 -18.777  0.00  0.00      1h5  h
atom      5 ca   lys b 122    -116.643  85.931-103.890  1.00  0.00      2h2b c
atom      6 p    thy j   2     -73.656  70.884  -7.805  1.00  0.00      dna2 p
atom      8 hb   thr u  27     -68.543  88.566 -15.171  0.00  0.00      1h5  h
atom      9 ca   lys b 122    -116.643  85.931-103.890  1.00  0.00      2h2b c
atom     10 p    thy j   2     -73.656  70.884  -7.805  1.00  0.00      dna2 p
atom     11 hb   thr u  27     -68.543  88.566 -15.171  0.00  0.00      1h5  h
atom     12 c    ser d   2     -73.656  70.884  -7.805  1.00  0.00      dna2 c
atom     13 op1  ser d   2     -73.656  70.884  -7.805  1.00  0.00      dna2 o

and I want the result file for this example to be:

atom      5 ca   lys b 122    -116.643  85.931-103.890  1.00  0.00      2h2b c
atom      6 p    thy j   2     -73.656  70.884  -7.805  1.00  0.00      dna2 p
atom      9 ca   lys b 122    -116.643  85.931-103.890  1.00  0.00      2h2b c
atom     10 p    thy j   2     -73.656  70.884  -7.805  1.00  0.00      dna2 p

You can use grep like this:

grep ' p \| ca ' file > new_file

The \| expression indicates "or". You have to escape the | in order to tell grep that it has a special meaning.

You can avoid the escaping by using the fancier extended grep:

grep -E ' (p|ca) ' file > new_file
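As a small self-contained check of the "or" syntax, here is a sketch that builds a throwaway input and runs the same search three ways. The file name sample.txt and its contents are made up for illustration; the third form, one -e option per pattern, is an alternative not mentioned above. Since grep reads the input once from top to bottom, the matches come out in their original order whichever pattern they hit.

```shell
# Throwaway input; the file name and contents are made up for illustration.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
atom 1 n thr
atom 2 ca lys
atom 3 p thy
atom 4 hb thr
atom 5 ca lys
EOF

# Basic regex: the alternation has to be written \| (a GNU grep extension).
bre=$(grep ' p \| ca ' "$tmpdir/sample.txt")

# Extended regex (-E, capital E): | and ( ) are special without escaping.
ere=$(grep -E ' (p|ca) ' "$tmpdir/sample.txt")

# One -e (lowercase) per pattern: a line is printed if any pattern matches.
multi=$(grep -e ' p ' -e ' ca ' "$tmpdir/sample.txt")

# Lines 2, 3 and 5 are printed, in input order.
printf '%s\n' "$ere"

rm -r "$tmpdir"
```

Note the difference between -E (use extended regex) and -e (the next argument is a pattern): with -e alone, ' (p|ca) ' is read as a basic regex in which the parentheses and the bar are literal characters.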

In general, I prefer the awk syntax, since it is clearer and easier to extend:

awk '/ p / || / ca /' file
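To give an idea of what "easier to extend" can mean, here is a minimal sketch that adds a third pattern and prefixes each match with its input line number (NR). The sample file and the extra ' n ' pattern are assumptions chosen only for illustration; they are not part of the question.

```shell
# Throwaway input; the file name and contents are made up for illustration.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
atom 1 n thr
atom 2 ca lys
atom 3 hb thr
atom 4 p thy
EOF

# Adding a condition is just another "|| /.../"; the action block
# replaces the default print and prefixes the input line number (NR).
numbered=$(awk '/ p / || / ca / || / n / { print NR ": " $0 }' "$tmpdir/sample.txt")

printf '%s\n' "$numbered"

rm -r "$tmpdir"
```

The matched lines still come out in input order, so the line numbers in the output are strictly increasing.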

Or, given your sample input, you can use awk to check whether the 3rd column is exactly one of the two strings:

$ awk '$3=="ca" || $3=="p"' file
atom      5 ca   lys b 122    -116.643  85.931-103.890  1.00  0.00      2h2b c
atom      6 p    thy j   2     -73.656  70.884  -7.805  1.00  0.00      dna2 p
atom      9 ca   lys b 122    -116.643  85.931-103.890  1.00  0.00      2h2b c
atom     10 p    thy j   2     -73.656  70.884  -7.805  1.00  0.00      dna2 p
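The column test and the regex test are not always interchangeable. The sketch below uses a made-up input in which "ca" also shows up in the 4th column of one line (a hypothetical case, not taken from the sample above) to show the difference: the regex matches ' ca ' anywhere in the line, while the field comparison only looks at the 3rd whitespace-separated field.

```shell
# Throwaway input; the second line is a hypothetical record with "ca"
# in the 4th column rather than the 3rd.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
atom 1 ca lys b
atom 2 n ca b
atom 3 p thy j
EOF

# Regex match: true if ' p ' or ' ca ' occurs anywhere in the line.
by_regex=$(awk '/ p / || / ca /' "$tmpdir/sample.txt")

# Field match: true only if the 3rd field is exactly "ca" or "p".
# awk splits on runs of whitespace by default, so padding does not matter.
by_field=$(awk '$3 == "ca" || $3 == "p"' "$tmpdir/sample.txt")

printf 'regex match:\n%s\n\nfield match:\n%s\n' "$by_regex" "$by_field"

rm -r "$tmpdir"
```

Here the regex version keeps all three lines, while the field version drops the second one. Which behaviour is wanted depends on the data.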

Test

$ cat file
hello p is here
and ca also
but ca appears
nothing here
p ca
$ grep ' p \| ca ' file
hello p is here
and ca also
but ca appears
$ grep -E ' (p|ca) ' file
hello p is here
and ca also
but ca appears
$ awk '/ p / || / ca /' file
hello p is here
and ca also
but ca appears
