--------------------------------------------------------------------------------
Notes on SED
--------------------------------------------------------------------------------

The discussion here applies to GNU sed, or "gsed" on a Mac. It has more
features than BSD sed (the default on a Mac).

The organization is as follows:

I.   Quoting rules
     This is boring stuff. I recommend that you skip it on the first reading.
     However, this issue will bite you sooner or later, and then you will
     have to come back and read this!

II.  Gotcha with "*" in regular expressions
     This is also boring stuff. It will get you now and then. Come back and
     read it then.

III. SED's quirks
     This is also boring stuff. It will get you now and then. Come back and
     read it then.

IV.  Passing external variables to SED
     Let $a be an external (shell) variable. You refer to this variable as
     ${a} in the SED script. You should also put the script in double quotes
     (so that the shell interprets the $ properly). For single-digit
     positional parameters you can use $1 directly.

------------------------------------------------------------------------
II. Careful with "*" in regular expressions
------------------------------------------------------------------------

The following four examples are instructive. Because "*" also matches zero
occurrences, [0-9]* matches the empty string at the start of "a".

$ echo "1" | sed -n 's/[0-9]*/&/p'
1
$ echo "1" | sed -n 's/[0-9]*/&.&/p'
1.1
$ echo "a" | sed -n 's/[0-9]*/&/p'
a
$ echo "a" | sed -n 's/[0-9]*/&.&/p'
.a

------------------------------------------------------------------------
III. SED's quirky syntax
------------------------------------------------------------------------

Summary:
1. In most cases spaces do not matter
2. If you are using {} to group commands, make sure that the last command
   ends with a ";"
3. Inside the quotes you can continue a script on to the next line without
   the need for any continuation character (Bourne-style shells such as bash)
4.
   Following commands have very specific syntax: a, c, i, b, r, w

#EXAMPLE: print lines 8 through 10
$ sed -n '8,10p' input.txt

#spaces are allowed (and help readability)
$ sed -n '8,10 p' input.txt
$ sed -n '8,10 p ' input.txt

#{} allows ganging up of commands but the syntax is confusing
#All these fail in BSD sed ("extra characters at the end of p command").
#GNU sed accepts them as an extension, but do not rely on that.
$ sed -n '8,10{p}' input.txt
$ sed -n '8,10 {p}' input.txt
$ sed -n '8,10 {p }' input.txt

#These will work. Conclusion: terminate with ";"
$ sed -n '8,10{p;}' input.txt
$ sed -n '8,10 {p;}' input.txt
$ sed -n '8,10 {p; }' input.txt

#Line break. These will work.
$ sed -n '8,10 p' input.txt
$ sed -n '8,10{p;
}' input.txt
$ sed -n '8,10 {p;
}' input.txt

------------------------------------------------------------------------
SED commands
------------------------------------------------------------------------

=        display the line number of the current input line (on a line by
         itself)

l n      show all characters including end-of-line, tabs and so on.
         Line wraps at n characters. If n is set to 0 then no wrap; if n
         is not specified the default wrap length applies (70, or the
         value given with the -l option) (gsed).

q        prints pattern space (if -n flag is not set) and quits.
         Preceded by at most one address.
         [gsed allows you to return a custom exit code].

Q        quits without printing pattern space (regardless of whether the
         -n flag is set or not). [gsed]

p        prints pattern space. This should be used in conjunction with
         the -n flag (otherwise the pattern space will be printed twice).

n        prints pattern space (if -n flag is not set), then reads the next
         line into pattern space. If there is no next line, sed quits
         without running the rest of the script.

d        deletes the pattern space and starts the next cycle: the next
         input line is read and control goes to the first line of the
         script.

z        clears the pattern space. More efficient and hardier than s/.*//
         (works even when the input contains invalid multibyte sequences).
         [gsed]

s        substitute command (more later)
         s/regexp/replace/flags
         Flags: number, g, p, w, e, I (or i)
         No flag is required.
         Flags can be combined in any combination.

         n            replace the nth match
         g            global replacement
                      3g means a global substitution starting with the
                      3rd match (gsed)
         w filename   write the pattern space into filename
                      note: a single space separates "w" and filename;
                      no characters other than a newline may follow the
                      filename
         i            ignore case when comparing (can use "I" also)

h        copies pattern space to "hold" (erasing existing hold)
g        retrieves "hold" into pattern space (erasing existing pattern)
x        exchanges contents of pattern and hold
H        appends a newline to the contents of hold and then appends the
         pattern space
G        appends a newline to the pattern space and then appends the hold
         contents to pattern space

N        Add a newline to the pattern space, then append the next line of
         input to the pattern space. If there is no more input then sed
         exits without processing any more commands.

D        If pattern space contains no newline, start a normal new cycle
         as if the d command was issued. Otherwise, delete text in the
         pattern space up to the first newline, and restart the cycle with
         the resultant pattern space, without reading a new line of input.

------------------------------------------------------------------------
Under what circumstance does sed read in a new line?
------------------------------------------------------------------------

yes | head -5 | nl -s" " | sed 's/ y//' > InputFile

gsed -n ':ABranch; p; b' InputFile
1
2
3
4
5

gsed -n ':ABranch; p; b ABranch' InputFile
results in endless copies of the first line of InputFile
1
1
1
(terminated by control-C)

Summary: To start a new read cycle, control must fall to the bottom of the
script. A bare "b" branches there; "b ABranch" never gets there.
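
The two cases above can be tried without InputFile. This is only a sketch
under assumptions: "sed" on the PATH is taken to be GNU sed (use gsed on a
Mac), seq stands in for the input file, and the label "top" and the variable
names are made up for illustration. head cuts the endless loop off after
three lines.

```shell
# Assumption: "sed" is GNU sed (on a Mac, substitute gsed).

# Bare "b" branches to the end of the script, so each cycle ends
# and sed reads the next input line.
fall_through=$(seq 3 | sed -n ':top; p; b')

# "b top" jumps back to the label, so the cycle never ends and
# line 1 is printed again and again; head stops it after 3 lines.
loop_back=$(seq 3 | sed -n ':top; p; b top' | head -n 3)

printf 'fall through:\n%s\n' "$fall_through"
printf 'loop back:\n%s\n' "$loop_back"
```

The first variable should hold the three input lines, the second three copies
of line 1.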

InputFile
---------
#!/bin/sh
gsed -n '
:ABranch
p
b
'

------------------------------------------------------------------------
Peculiarity of "n" & "N" command
------------------------------------------------------------------------

$ seq 3 | sed -n 'N;p'      #easy to understand why last line is missing
1
2

$ seq 4 | sed -n 'N;p'      #in contrast
1
2
3
4

#It helps to use gsed and the "l" feature
$ seq 4 | gsed -n 'N;p;l'
1
2
1\n2$