------------------------------------------------------------------------
Lesson 2: Advanced grep
------------------------------------------------------------------------

The syntax of the grep command is

$ grep -flag1 value1 -flag2 value2 pattern input_file

where pattern is what you are searching for in the input_file. The
primary focus of this lesson is to showcase and learn the key options.
We start off, though, with the rules for quoting.

------------------------------------------------------------------------
I. QUOTING RULES:
------------------------------------------------------------------------

If the pattern is simple -- not containing characters that have special
meaning to the shell (e.g. space) -- then all three options are
available: no quoting, double quotes or single quotes.

A. A single quote is the strictest quote. Nothing inside single quotes
is "interpreted" by the shell, that is, the entire pattern enclosed in
single quotes is passed to grep unchanged.

B. If the pattern contains a space then you must quote. The reason is
that a space is the parameter separator for the shell, and quoting
suppresses this usual interpretation. As noted in IIB of Lesson 1, both
types of quotes can be used.

C. If you want to use a variable known to the shell then you MUST use
double quotes. Single quotes will simply pass the entire pattern, "$"
and variable name included, to grep. The most common variables are
shell variables. For example, $LOGNAME is the name of your account. In
my case, it is "srk". You can always define new variables (e.g.
a="Hello Kitty"; echo $a). You can also get new values from command
substitution, e.g. $(whoami) returns my user id (which is the same as
$LOGNAME). Thus if you want to search for your user name in a file you
would do the following:

$ grep "$LOGNAME" filename     #search for "srk"
$ grep "$(whoami)" filename    #does the same as the previous command

On the other hand:

$ grep '$LOGNAME' filename     #no match for "srk": the shell does not
                               #expand the variable, so grep searches
                               #for the literal text $LOGNAME
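The quoting rules above can be tried on a throwaway file. This is only a
minimal sketch: the variable demo_user (standing in for $LOGNAME), the file
name quoting_demo.txt and its three lines are all made up for illustration.
The "-n" flag is used so that the matching line number is visible.

```shell
#!/bin/sh
# Hypothetical stand-ins: demo_user plays the role of $LOGNAME,
# quoting_demo.txt is a scratch input file.
demo_user="srk"
tmpdir=$(mktemp -d)
infile="$tmpdir/quoting_demo.txt"
printf '%s\n' \
  'login by srk' \
  'Hello Kitty was here' \
  'the text $demo_user appears verbatim' > "$infile"

# Double quotes: the shell expands $demo_user, so grep searches for "srk".
double=$(grep -n "$demo_user" "$infile")

# Single quotes: no expansion, so grep searches for the literal $demo_user.
single=$(grep -n '$demo_user' "$infile")

# A pattern containing a space must be quoted; either quote type works.
spaced=$(grep -n 'Hello Kitty' "$infile")

printf '%s\n' "$double" "$single" "$spaced"
rm -r "$tmpdir"
```

The double-quoted search reports line 1, the single-quoted search reports
line 3 (the only line containing the literal dollar sign and name), and the
pattern with a space reports line 2.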
So the single-quoted $LOGNAME is never replaced by "srk".

There are times when you HAVE to use a quote character inside the
pattern itself. In that case enclose the pattern in double quotes and
escape each double quote inside it with a backslash. The pattern below
matches text enclosed in a pair of identical quotes (double or single):

$ grep "\([\"']\).*\1" filename

------------------------------------------------------------------------
II. OPTIONS:
------------------------------------------------------------------------

We have already seen the "-n" flag

#prefixes each matched line with its line number in the input
$ grep -n pattern filename

#The flag "-v" displays all lines NOT matched by pattern
#It is surprisingly useful
$ grep -v "^ *$" filename      #display all non-blank lines

#The flag "-i" ignores the case of letters
$ grep -i "kulkarni" filename  #search for Kulkarni or kulkarni

#The flag "-w" matches whole words only.
#A note: a word is a run of letters, digits and underscores. The match
#must start at the beginning of a line or after a non-word character,
#and end at the end of a line or before a non-word character (blank,
#tab, or punctuation such as .,;?"').
$ grep -w pattern infile       #search for WORD given by pattern

#matching whole lines specified by pattern
$ grep -x pattern filename
#however note that the command below does the same thing
$ grep "^pattern$" filename

#These options allow you to view the local context of matched lines
#setting n=0 results in showing only the matched line
$ grep -A n pattern filename   #matched line and n lines succeeding it
$ grep -B n pattern filename   #matched line and n lines preceding it
$ grep -C n pattern filename   #matched line and +/-n lines centered on it

#counting the number of matched lines
$ grep -c pattern filename
#the command below does the same thing but is slower
$ grep pattern filename | wc -l

#instead of presenting the whole line, presents only the matched text
$ grep -o pattern filename

#counting the number of lines that do NOT match
$ grep -v -c pattern filename

------------------------------------------------------------------------
IIB. Multiple files
------------------------------------------------------------------------

grep is designed to work with multiple files.
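As a rough illustration of what grep does when given more than one file,
the sketch below builds two scratch files and searches both. The file names
notes.a and notes.b are made up. The flags "-h" and "-l" are not covered
in this lesson; they are included here on the assumption that your grep
supports them (GNU and BSD grep both do).

```shell
#!/bin/sh
# Two hypothetical scratch files in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta'  > "$tmpdir/notes.a"
printf '%s\n' 'beta'  'gamma' > "$tmpdir/notes.b"

# With several input files, each matched line is prefixed by its file name.
with_names=$(cd "$tmpdir" && grep beta notes.*)

# -h suppresses the file-name prefix.
no_names=$(cd "$tmpdir" && grep -h beta notes.*)

# -l lists only the names of the files that contain a match.
only_files=$(cd "$tmpdir" && grep -l gamma notes.*)

printf '%s\n' "$with_names" "$no_names" "$only_files"
rm -r "$tmpdir"
```

The shell expands notes.* before grep runs, so grep simply sees a list of
file names.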
$ grep pattern files           #search each of the named files
$ grep pattern files.*         #search every file matched by files.*
$ grep

------------------------------------------------------------------------
III. Examples
------------------------------------------------------------------------

The notes for the previous lesson (first lesson on grep) can be found
in the file "Lesson1_grep.txt". In that file, all input files had the
extension ".dat".

$ ln -s ../GREP1/Lesson1_grep.txt .    #soft link set up
$ grep '\.dat' Lesson1_grep.txt

which produces 50 lines (try it). I wanted to find out the names of
these files. Many of the file names were reported several times. I did
not want the lines, just the file names.

$ grep -o '[[:alnum:]][[:alnum:]]*\.dat' Lesson1_grep.txt | sort -u

lists the files, each one once. (uniq by itself removes only adjacent
duplicates, hence sort -u.) [I then piped it to rename ".dat" to
".txt", all on the fly!]
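The file-name extraction above can be sketched on a small stand-in file.
Everything here is an assumption made for illustration: the scratch file
lesson_sample.txt, its contents, and the sed command, which is only one
possible way to do the "rename on the fly" step (the notes do not say
which command was actually used). Because uniq collapses only adjacent
repeats, the sketch uses sort -u to remove duplicates.

```shell
#!/bin/sh
# Hypothetical stand-in for Lesson1_grep.txt.
tmpdir=$(mktemp -d)
notes="$tmpdir/lesson_sample.txt"
printf '%s\n' \
  '$ grep Kitty pets.dat' \
  '$ grep -n srk users.dat' \
  '$ grep -v Kitty pets.dat' \
  'no data file on this line' > "$notes"

# -o prints each match on its own line; pets.dat shows up twice,
# and the two occurrences are not adjacent.
raw=$(grep -o '[[:alnum:]][[:alnum:]]*\.dat' "$notes")

# sort -u sorts the names and drops every repeat.
names=$(printf '%s\n' "$raw" | sort -u)

# One possible rename: rewrite the trailing .dat as .txt with sed.
renamed=$(printf '%s\n' "$names" | sed 's/\.dat$/.txt/')

printf '%s\n' "$names" "$renamed"
rm -r "$tmpdir"
```

Only the printed names change; no file on disk is renamed by this
pipeline.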