This file is indexed.

/usr/share/bibledit-gtk/site/gtk/reference/menu/menu-preferences/filters/sed.html is in bibledit-gtk-data 4.9-1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <link href="../../../../../bibledit.css" rel="stylesheet" type="text/css" /><!-- 

Copyright (©) 2003-2011 Teus Benschop and Contributors to the Wiki.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
Texts.  A copy of the license is included in the section entitled "GNU
Free Documentation License" in the file FDL.

-->
    <title></title>
  </head>
  <body>
    <div id="menu">
      <ul>
        <li>
          <a href="../../../../../home.html">1 Bibledit</a>
        </li>
        <li>
          <a href="../filters.html">Filters</a>
        </li>
        <li style="list-style: none; display: inline">
          <hr />
        </li>
        <li>
          <a href="tec.html">TECkit</a>
        </li>
        <li>
          <a href="regex.html">regex</a>
        </li>
        <li>sed
        </li>
      </ul>
    </div>
    <div id="content">
      <h1>
        sed
      </h1>
      <p>
        This reference was assembled while closely looking at the information provided in <a href="http://www.grymoire.com/Unix/Sed.html" rel="nofollow">Sed - An Introduction and Tutorial</a>. In March 2008, the author, Bruce Barnett, gave permission to use that information for the reference.
      </p>
      <h3>
        <a name="theessentialcommandsforsubstitution" href="" id="theessentialcommandsforsubstitution"></a>The essential command: s for substitution
      </h3>
      <p>
        The substitute command is <i>s</i>. It replaces text that matches a regular expression with another value; by default only the first match on each line is changed. A simple example changes "day" to "night":
      </p>
      <pre>
s/day/night/
</pre>
      <p>
        This substitute command consists of four parts.<br />
      </p>
      <pre>
s         substitute command<br />/../../   delimiters<br />day       regular expression pattern string<br />night     replacement string
</pre>
      <p>
        The <a href="https://sites.google.com/site/bibledit/gtk/reference/menu/menu-preferences/filters/regex">regular expressions</a> are covered in detail in another reference.
      </p>
      <h3>
        <a name="theslashasadelimiter" href="" id="theslashasadelimiter"></a>The slash as a delimiter
      </h3>
      <p>
        The character after the <i>s</i> is the delimiter. Conventionally it is a slash, but it can be anything that you want.
      </p>
      <p>
        For example, it can be an underscore instead of the slash:
      </p>
      <pre>
s_day_night_
</pre>
      <p>
        Or colons:
      </p>
      <pre>
s:day:night:
</pre>
      <p>
        Or you can use the "|" character.
      </p>
      <pre>
s|day|night|
</pre>
      <p>
        Select the one that you want. As long as it is not in the string that you are looking for, anything works. Also remember that three delimiters are needed. If you get an "Unterminated 's' command" error, it is because one of them is missing.
      </p>
      <h3>
        <a name="usingampasthematchedstring" href="" id="usingampasthematchedstring"></a>Using &amp; as the matched string
      </h3>
      <p>
        At times one may want to search for a pattern and add some characters, like parentheses, around or near the pattern that was found. It is easy to do this if looking for a particular string:
      </p>
      <pre>
s/abc/(abc)/
</pre>
      <p>
        This won't work if you don't know exactly what you will find. How can you put the string you found in the replacement string if you don't know what it is?
      </p>
      <p>
        The solution requires the special character "&amp;." It corresponds to the pattern found.
      </p>
      <pre>
s/[a-z]*/(&amp;)/
</pre>
      <p>
        There can be any number of "&amp;" in the replacement string. You could also double a pattern, e.g. the first number of a line:
      </p>
      <pre>
s/[0-9]*/&amp; &amp;/
</pre>
      <p>
        If the input for this rule is "123 abc", then the output will be "123 123 abc".
      </p>
      <p>
        Let's slightly amend this example. Sed will match the first string, and make it as greedy as possible. The first match for '[0-9]*' is at the first character on the line, as this pattern matches zero or more digits. So if the input was "abc 123" the output would be unchanged, except for a space before the letters. A better way to duplicate the number is to make sure it matches at least one digit:
      </p>
      <pre>
s/[0-9][0-9]*/&amp; &amp;/
</pre>
      <p>
        The input of "123 abc" becomes "123 123 abc".
      </p>
      <h3>
        <a name="using1tokeeppartofthepattern" href="" id="using1tokeeppartofthepattern"></a>Using \1 to keep part of the pattern
      </h3>
      <p>
        The use of "\(", "\)" and "\1" has been described in the reference on <a href="https://sites.google.com/site/bibledit/gtk/reference/menu/menu-preferences/filters/regex">regular expressions.</a> To review, the escaped parentheses (that is, parentheses with backslashes before them) remember portions of the regular expression. The "\1" is the first remembered pattern, and the "\2" is the second remembered pattern. If you wanted to keep the first word of a line, and delete the rest of the line, mark the important part with the parentheses:
      </p>
      <pre>
s/\([a-z]*\).*/\1/
</pre>
      <p>
        If you want to switch two words around, you can remember two patterns and change the order around:
      </p>
      <pre>
s/\([a-z]*\) \([a-z]*\)/\2 \1/<br />
</pre>
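      <p>
        The word swap can be tried from a shell. This is only a sketch: it assumes a POSIX shell with <i>sed</i> on the PATH, and the sample input "hello world" is made up for illustration.
      </p>

```shell
# Swap the first two lower-case words on the line using \1 and \2.
result=$(printf 'hello world\n' | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')
printf '%s\n' "$result"
```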
      <p>
        The "\1" doesn't have to be in the replacement string. It can be in the pattern you are searching for. If you want to eliminate duplicated words, you can try:
      </p>
      <pre>
s/\([a-z]*\) \1/\1/<br />
</pre>
      <p>
        You can have up to nine values: "\1" through "\9."
      </p>
      <h3>
        <a name="substituteflags" href="" id="substituteflags"></a>Substitute Flags
      </h3>
      <p>
        You can add additional flags after the last delimiter. These flags can specify what happens when there is more than one occurrence of a pattern on a single line, and what to do if a substitution is found.
      </p>
      <h3>
        <a name="gglobalreplacement" href="" id="gglobalreplacement"></a>/g - Global replacement
      </h3>
      <p>
        Most Unix utilities work on files, reading a line at a time. <i>Sed</i>, by default, works the same way. If you tell it to change a word, it will only change the first occurrence of the word on a line. You may want to make the change on every word on the line instead of only the first. For example, let's place parentheses around words on a line. Instead of using a pattern like "[A-Za-z]*" which won't match words like "won't," we will use a pattern, "[^ ]*," that matches everything except a space. However, this will also match the empty string, because "*" means <b>zero or more</b>. To work around a possible bug in sed, you must avoid matching the null string when using the "g" flag to <i>sed</i>. A work-around example is: "[^ ][^ ]*." The following will put parentheses around the first word:
      </p>
      <pre>
s/[^ ]*/(&amp;)/<br />
</pre>
      <p>
        If you want it to make changes for every word, add a "g" after the last delimiter and use the work-around:
      </p>
      <pre>
s/[^ ][^ ]*/(&amp;)/g<br />
</pre>
      <h3>
        <a name="issedrecursive" href="" id="issedrecursive"></a>Is sed recursive?
      </h3>
      <p>
        <i>Sed</i> only operates on patterns found in the incoming data. That is, the input line is read, and when a pattern is matched, the modified output is generated, and the <b>rest</b> of the input line is scanned. The "s" command will not scan the newly created output. That is, you don't have to worry about expressions like:
      </p>
      <pre>
s/loop/loop the loop/g<br />
</pre>
      <p>
        This will not cause an infinite loop. If a second "s" command is executed, it could modify the results of a previous command. I will show you how to execute multiple commands later.
      </p>
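      <p>
        A quick check from a shell shows that the replacement text is not scanned again. This sketch assumes a POSIX shell with <i>sed</i> on the PATH; the one-word input is an arbitrary sample.
      </p>

```shell
# The replacement contains "loop" twice, but sed does not rescan its own output,
# so the command terminates after one substitution per original match.
result=$(printf 'loop\n' | sed 's/loop/loop the loop/g')
printf '%s\n' "$result"
```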
      <h3>
        <a name="12etcspecifyingwhichoccurrence" href=""></a>/1, /2, etc. Specifying which occurrence
      </h3>
      <p>
        With no flags, the first pattern is changed. With the "g" option, all patterns are changed. If you want to modify a particular pattern that is not the first one on the line, you could use "\(" and "\)" to mark each pattern, and use "\1" to put the first pattern back unchanged. This next example keeps the first word on the line but deletes the second:
      </p>
      <pre>
s/\([a-zA-Z]*\) \([a-zA-Z]*\) /\1 /<br />
</pre>
      <p>
        This rule is ugly! There is an easier way to do this. You can add a number after the substitution command to indicate you only want to match that particular pattern. Example:
      </p>
      <pre>
s/[a-zA-Z]* //2<br />
</pre>
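      <p>
        The effect of the number flag can be seen on a short line. This is an illustrative sketch, assuming a POSIX shell with <i>sed</i> on the PATH; the input words are invented.
      </p>

```shell
# Delete only the second "word followed by a space" on the line.
result=$(printf 'one two three four\n' | sed 's/[a-zA-Z]* //2')
printf '%s\n' "$result"
```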
      <p>
        You can combine a number with the g (global) flag. For instance, if you want to leave the first word alone, but change the second, third, etc. to DELETED, use /2g:
      </p>
      <pre>
s/[a-zA-Z]* /DELETED /2g<br />
</pre>
      <p>
        Don't get /2 and \2 confused. The /2 is used at the end. The \2 is used inside the replacement field.
      </p>
      <p>
        Note the space after the "*" character. Without the space, <i>sed</i> will run a long, long time. (Note: this bug is probably fixed by now.) This is because the number flag and the "g" flag have the same bug. You should also be able to use the pattern
      </p>
      <pre>
s/[^ ]*//2<br />
</pre>
      <p>
        but this also eats CPU.<br />
      </p>
      <p>
        The number flag is not restricted to a single digit. It can be any number from 1 to 512. If you wanted to add a colon after the 80th character in each line, you could type:
      </p>
      <pre>
s/./&amp;:/80<br />
</pre>
      <h3>
        <a name="combiningsubstitutionflags" href="" id="combiningsubstitutionflags"></a>Combining substitution flags
      </h3>
      <p>
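        You can combine flags when it makes sense. Be careful when combining a number with the "g" flag: GNU <i>sed</i> accepts it, as shown above, but the POSIX standard leaves the result unspecified, so other versions of <i>sed</i> may behave differently.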
      </p>
      <h3>
        <a name="scripts" href="" id="scripts"></a>Scripts
      </h3>
      <p>
        If you have a large number of <i>sed</i> rules, you can put them into a script.
      </p>
      <p>
        A <i>sedscript</i> could look like this:
      </p>
      <pre>
# sed comment - This script changes lower case vowels to upper case<br />s/a/A/g<br />s/e/E/g<br />s/i/I/g<br />s/o/O/g<br />s/u/U/g<br />
</pre>
      <p>
        When there are several commands in one file, each command must be on a separate line.
      </p>
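      <p>
        Outside of Bibledit, such a script can be tested with the <i>-f</i> option of <i>sed</i>. The following is a sketch under assumptions: a POSIX shell, <i>sed</i> and <i>mktemp</i> on the PATH, a temporary file chosen by <i>mktemp</i>, and a made-up input line.
      </p>

```shell
# Write the vowel script to a temporary file, one command per line.
script=$(mktemp)
printf '%s\n' \
  '# change lower case vowels to upper case' \
  's/a/A/g' \
  's/e/E/g' \
  's/i/I/g' \
  's/o/O/g' \
  's/u/U/g' > "$script"

# Run every command in the script against one line of sample input.
result=$(printf 'hello world\n' | sed -f "$script")
rm -f "$script"
printf '%s\n' "$result"
```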
      <h3>
        <a name="comments" href="" id="comments"></a>Comments
      </h3>
      <p>
        <i>Sed</i> comments are lines where the first non-white character is a "#." On many systems, <i>sed</i> can have only one comment, and it must be the first line of the script. On modern systems, you can have several comment lines anywhere in the script.
      </p>
      <h3>
        <a name="multiplerulesandorderofexecution" href="" id="multiplerulesandorderofexecution"></a>Multiple rules and order of execution
      </h3>
      <p>
        As we explore more of the commands of <i>sed</i>, the commands will become complex, and the actual sequence can be confusing. It's really quite simple. Each line is read in. The various commands, in the order specified by the user, each have a chance to operate on the input line. After one command has made its substitutions, the next command has a chance to operate on the same line, which may have been modified by earlier commands. If you ever have a question, the best way to learn what will happen is to just try it out. If a complex command doesn't work, make it simpler. If you are having problems getting a complex script working, break it up into two smaller scripts and do them one by one.
      </p>
      <h3>
        <a name="addressesandrangesoftext" href="" id="addressesandrangesoftext"></a>Addresses and Ranges of Text
      </h3>
      <p>
        You have only learned one command, and you can see how powerful <i>sed</i> is. However, all it is doing is a search and substitute. That is, the substitute command is treating each line by itself, without caring about nearby lines. What would be useful is the ability to restrict the operation to certain lines. Some useful restrictions might be:
      </p>
      <p>
        - Specifying a line by its number.
      </p>
      <p>
        - Specifying a range of lines by number.
      </p>
      <p>
        - All lines containing a pattern.
      </p>
      <p>
        - All lines from the beginning of a file to a regular expression.
      </p>
      <p>
        - All lines from a regular expression to the end of the file.
      </p>
      <p>
        - All lines between two regular expressions.<br />
      </p>
      <p>
        <i>Sed</i> can do all that and more. Every command in <i>sed</i> can be assigned an address, range or restriction like the above examples. The restriction or address immediately precedes the command:
      </p>
      <pre>
<i>restriction</i> <i>command</i><br />
</pre>
      <h3>
        <a name="restrictingtoalinenumber" href="" id="restrictingtoalinenumber"></a>Restricting to a line number
      </h3>
      <p>
        The simplest restriction is a line number. If you wanted to delete the first number on line 3, just add a "3" before the command:
      </p>
      <pre>
3 s/[0-9][0-9]*//<br />
</pre>
      <h3>
        <a name="patterns" href="" id="patterns"></a>Patterns
      </h3>
      <p>
        Many Unix utilities use a slash to search for a regular expression. <i>Sed</i> uses the same convention, provided you terminate the expression with a slash. To delete the first number on all lines that start with a "#," use:
      </p>
      <pre>
/^#/ s/[0-9][0-9]*//<br />
</pre>
      <p>
        I placed a space after the "/<i>expression</i>/" so it is easier to read. It isn't necessary, but without it the command is harder to fathom. <i>Sed</i> does provide a few extra options when specifying regular expressions. But I'll discuss those later. If the expression starts with a backslash, the next character is the delimiter. To use a comma instead of a slash, use:
      </p>
      <pre>
\,^#, s/[0-9][0-9]*//<br />
</pre>
      <p>
        The main advantage of this feature is searching for slashes. Suppose you wanted to search for the string "/usr/local/bin" and you wanted to change it for "/common/all/bin." You could use the backslash to escape the slash:
      </p>
      <pre>
/\/usr\/local\/bin/ s/\/usr\/local/\/common\/all/<br />
</pre>
      <p>
        It would be easier to follow if you used an underline instead of a slash as a search. This example uses the underline in both the search command and the substitute command:
      </p>
      <pre>
\_/usr/local/bin_ s_/usr/local_/common/all_<br />
</pre>
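      <p>
        The underscore form can be checked from a shell. This is a sketch only, assuming a POSIX shell with <i>sed</i> on the PATH; the two sample paths are invented for the test.
      </p>

```shell
# Only the line matching the \_..._ address is changed; the other line passes through.
result=$(printf '/usr/local/bin\n/opt/bin\n' | sed '\_/usr/local/bin_ s_/usr/local_/common/all_')
printf '%s\n' "$result"
```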
      <p>
        This illustrates why <i>sed</i> scripts get the reputation for obscurity. I could be perverse and show you the example that will search for all lines that start with a "g," and change each "g" on that line to an "s:"
      </p>
      <pre>
/^g/s/g/s/g<br />
</pre>
      <p>
        Adding a space and using an underscore after the substitute command makes this <b>much</b> easier to read:
      </p>
      <pre>
/^g/ s_g_s_g<br />
</pre>
      <p>
        Er, I take that back. It's hopeless. There is a lesson here: Use comments liberally in a <i>sed</i> script. Comments are a Good Thing. You may have understood the script perfectly when you wrote it. But six months from now it could look like a confusing thing.
      </p>
      <h3>
        <a name="rangesbylinenumber" href="" id="rangesbylinenumber"></a>Ranges by line number
      </h3>
      <p>
        You can specify a range of line numbers by inserting a comma between the numbers. To restrict a substitution to the first 100 lines, you can use:
      </p>
      <pre>
1,100 s/A/a/<br />
</pre>
      <p>
        If you know exactly how many lines are in a file, you can explicitly state that number to perform the substitution on the rest of the file. In this case, assume that there are 532 lines in the file:
      </p>
      <pre>
101,532 s/A/a/<br />
</pre>
      <p>
        An easier way is to use the special character "$," which means the last line in the file.
      </p>
      <pre>
101,$ s/A/a/<br />
</pre>
      <p>
        The "$" is one of those conventions that mean "last" in several utilities.
      </p>
      <h3>
        <a name="rangesbypatterns" href="" id="rangesbypatterns"></a>Ranges by patterns
      </h3>
      <p>
        You can specify two regular expressions as the range. Assuming a "#" starts a comment, you can search for a keyword, remove all comments until you see the second keyword. In this case the two keywords are "start" and "stop:"
      </p>
      <pre>
/start/,/stop/ s/#.*//<br />
</pre>
      <p>
        The first pattern turns on a flag that tells <i>sed</i> to perform the substitute command on every line. The second pattern turns off the flag. If the "start" and "stop" pattern occurs twice, the substitution is done both times. If the "stop" pattern is missing, the flag is never turned off, and the substitution will be performed on every line until the end of the file.
      </p>
      <p>
        You should know that if the "start" pattern is found, the substitution occurs on the same line that contains "start." This turns on a switch, which is line oriented. That is, the next line is read and the substitution is tried on it. If that line contains "stop", the switch is turned off after the command has run, so the line containing "stop" is still processed. Switches are line oriented, and not word oriented.
      </p>
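      <p>
        The behaviour of a pattern range, including the two boundary lines, can be observed on a five-line sample. This sketch assumes a POSIX shell with <i>sed</i> on the PATH; the input lines are invented.
      </p>

```shell
# Comments are stripped from the "start" line through the "stop" line, inclusive.
result=$(printf 'a#1\nstart#2\nb#3\nstop#4\nc#5\n' | sed '/start/,/stop/ s/#.*//')
printf '%s\n' "$result"
```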
      <p>
        You can combine line numbers and regular expressions. This example will remove comments from the beginning of the file until it finds the keyword "start:"
      </p>
      <pre>
1,/start/ s/#.*//<br />
</pre>
      <p>
        This example will remove comments everywhere except the lines <b>between</b> the two keywords:
      </p>
      <pre>
1,/start/ s/#.*//<br />/stop/,$ s/#.*//<br />
</pre>
      <p>
        The last example has a range that overlaps the "/start/,/stop/" range, as both ranges operate on the lines that contain the keywords. I will show you later how to restrict a command up to, <b>but not including</b> the line containing the specified pattern.
      </p>
      <p>
        Before I start discussing the various commands, you should know that some commands cannot operate on a range of lines. I will let you know when I mention the commands. I will describe three commands in this section. One of the three cannot operate on a range.
      </p>
      <h3>
        <a name="deletewithd" href="" id="deletewithd"></a>Delete with d
      </h3>
      <p>
        Using ranges can be confusing, so you should expect to do some experimentation when you are trying out a new script. A useful command deletes every line that matches the restriction: "d." If you want to look at the first 10 lines of a file, you can use:
      </p>
      <pre>
11,$ d<br />
</pre>
      <p>
        which is similar in function to the <i>head</i> command. If you want to chop off the header of a mail message, which is everything up to the first blank line, use:
      </p>
      <pre>
1,/^$/ d<br />
</pre>
      <p>
        The range for deletions can be a pair of regular expressions that mark the beginning and end of the operation. Or it can be a single regular expression. Deleting all lines that start with a "#" is easy:
      </p>
      <pre>
/^#/ d<br />
</pre>
      <p>
        Removing comments and blank lines takes two commands. The first removes every character from the "#" to the end of the line, and the second deletes all blank lines:
      </p>
      <pre>
s/#.*//<br />/^$/ d<br />
</pre>
      <p>
        A third one should be added to remove all blanks and tabs immediately before the end of line:
      </p>
      <pre>
s/#.*//<br />s/[ ^I]*$//<br />/^$/ d<br />
</pre>
      <p>
        The character "^I" is a <i>CTRL-I</i> or tab character. You would have to explicitly type in the tab. Note the order of operations above, which is in that order for a very good reason. Comments might start in the middle of a line, with white space characters before them. Therefore comments are first removed from a line, potentially leaving white space characters that were before the comment. The second command removes all trailing blanks, so that lines that are now blank are converted to empty lines. The last command deletes empty lines. Together, the three commands remove all lines containing only comments, tabs or spaces.
      </p>
      <p>
        This demonstrates the pattern space <i>sed</i> uses to operate on a line. The actual operation <i>sed</i> uses is:
      </p>
      <p>
        - Copy the input line into the pattern space.
      </p>
      <p>
        - Apply the first <i>sed</i> command on the pattern space, if the address restriction is true.
      </p>
      <p>
        - Repeat with the next sed expression, again operating on the pattern space.
      </p>
      <p>
        - When the last operation is performed, write out the pattern space and read in the next line from the input file.<br />
      </p>
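      <p>
        The cycle described above can be followed with the three comment-stripping commands. This is a sketch under assumptions: a POSIX shell with <i>sed</i> on the PATH, a shell variable named <i>tab</i> (a name chosen here) that holds a literal tab, and made-up input lines.
      </p>

```shell
# Hold a literal tab in a variable so the script does not depend on \t support.
tab=$(printf '\t')

# Each input line goes through the three commands in order:
# strip the comment, strip trailing blanks, then delete the line if it is empty.
result=$(printf 'x = 1  # set x\n# only a comment\n\ny = 2\n' |
  sed -e 's/#.*//' -e "s/[ $tab]*\$//" -e '/^$/ d')
printf '%s\n' "$result"
```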
      <h3>
        <a name="reversingtherestrictionwith" href="" id="reversingtherestrictionwith"></a>Reversing the restriction with !
      </h3>
      <p>
        Sometimes you need to perform an action on every line except those that match a regular expression, or those outside of a range of addresses. The "!" character, which often means not in Unix utilities, inverts the address restriction.
      </p>
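      <p>
        As an illustration, the inverted address below deletes every line that does <b>not</b> start with a "#". It is a sketch that assumes a POSIX shell with <i>sed</i> on the PATH; the three input lines are invented.
      </p>

```shell
# "!" inverts the address: delete all lines except those starting with "#".
result=$(printf 'keep\n#comment\nalso keep\n' | sed '/^#/ !d')
printf '%s\n' "$result"
```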
      <h3>
        <a name="theqorquitcommand" href="" id="theqorquitcommand"></a>The q or quit command
      </h3>
      <p>
        There is one more simple command that can restrict the changes to a set of lines. It is the "q" command: quit. Another way to duplicate the head command is:
      </p>
      <pre>
10 q<br />
</pre>
      <p>
        which prints the first ten lines and then quits, because <i>sed</i> writes out the current line before it exits. This command is most useful when you wish to abort the editing after some condition is reached.
      </p>
      <p>
        The "q" command is the one command that does not take a range of addresses. Obviously the command
      </p>
      <pre>
1,10 q<br />
</pre>
      <p>
        cannot quit 10 times. Instead
      </p>
      <pre>
1 q<br />
</pre>
      <p>
        or
      </p>
      <pre>
10 q
</pre>
      <p>
        is correct.
      </p>
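      <p>
        The line on which "q" fires is still written out before <i>sed</i> stops. A short sketch, assuming a POSIX shell with <i>sed</i> on the PATH and a made-up three-line input:
      </p>

```shell
# Quit at line 2: lines 1 and 2 are printed, line 3 is never read.
result=$(printf '1\n2\n3\n' | sed '2 q')
printf '%s\n' "$result"
```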
      <h3>
        <a name="groupingwithand" href="" id="groupingwithand"></a>Grouping with { and }
      </h3>
      <p>
        The curly braces, "{" and "}," are used to group the commands.
      </p>
      <p>
        The solution is just matching squiggles. Well, there is one complication. Since each <i>sed</i> command must start on its own line, the curly braces and the nested <i>sed</i> commands must be on separate lines.
      </p>
      <p>
        Previously, I showed you how to remove comments starting with a "#." If you wanted to restrict the removal to lines between special "begin" and "end" key words, you could use:<br />
      </p>
      <pre>
/begin/,/end/ {<br />s/#.*//<br />s/[ ^I]*$//<br />/^$/ d<br />p<br />}
</pre>
      <p>
        These braces can be nested, which allows you to combine address ranges. You could perform the same action as before, but limit the change to the first 100 lines:<br />
      </p>
      <pre>
1,100 {<br />/begin/,/end/ {<br />s/#.*//<br />s/[ ^I]*$//<br />/^$/ d<br />p<br />}<br />}
</pre>
      <p>
        You can place a "!" before a set of curly braces. This inverts the address, which removes comments from all lines <b>except</b> those between the two reserved words:<br />
      </p>
      <pre>
/begin/,/end/ !{<br />s/#.*//<br />s/[ ^I]*$//<br />/^$/ d<br />p<br />}
</pre>
      <h3>
        <a name="thecommentcommand" href="" id="thecommentcommand"></a>The # Comment Command
      </h3>
      <p>
        As we dig deeper into <i>sed</i>, comments will make the commands easier to follow. Older versions of <i>sed</i> only allow one line as a comment, and it must be the first line. Recent versions allow more than one comment, and these comments don't have to be first.
      </p>
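      <p>
        A commented script might look like the sketch below. It assumes a recent <i>sed</i> that allows comments anywhere, run from a POSIX shell; the two substitutions and the input line are invented for illustration.
      </p>

```shell
# A multi-line sed script with a comment before each command.
result=$(printf 'day one\n' | sed '
# change day to night
s/day/night/
# then capitalize the first n on the line
s/n/N/
')
printf '%s\n' "$result"
```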
      <h3>
        <a name="addingchanginginsertingnewlines" href="" id="addingchanginginsertingnewlines"></a>Adding, Changing, Inserting new lines
      </h3>
      <p>
        <i>Sed</i> has three commands used to add new lines to the output stream. Because an entire line is added, the new text must be placed on a line by itself; there is no option for this. If you are familiar with many Unix utilities, you will recognize the convention <i>sed</i> uses: lines are continued by ending the previous line with a "\". The syntax of these commands is finicky, like that of the "r" and "w" commands.
      </p>
      <h4>
        <a name="appendalinewitha" href="" id="appendalinewitha"></a>Append a line with 'a'
      </h4>
      <p>
        The "a" command appends a line after the range or pattern. This example will add a line after every line with "WORD":
      </p>
      <pre>
/WORD/ a\<br />Add this line after every line with WORD
</pre>
      <h4>
        <a name="insertalinewithi" href="" id="insertalinewithi"></a>Insert a line with 'i'
      </h4>
      <p>
        You can insert a new line before the pattern with the "i" command:
      </p>
      <pre>
/WORD/ i\<br />Add this line before every line with WORD
</pre>
      <h4>
        <a name="changealinewithc" href="" id="changealinewithc"></a>Change a line with 'c'
      </h4>
      <p>
        You can change the current line with a new line.
      </p>
      <pre>
/WORD/ c\<br />Replace the current line with the line
</pre>
      <p>
        A "d" command followed by an "a" command won't work, because the "d" command terminates the current actions. You can combine all three actions using curly braces:<br />
      </p>
      <pre>
/WORD/ {<br />i\<br />Add this line before<br />a\<br />Add this line after<br />c\<br />Change the line to this one<br />}
</pre>
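      <p>
        The insert and append commands can be tried together from a shell. This is a sketch that assumes a POSIX shell with GNU <i>sed</i> on the PATH; the inserted texts and the three input lines are made up, and the "c" command is left out to keep the output easy to follow.
      </p>

```shell
# For the line containing WORD, print one line before it and one line after it.
result=$(printf 'first\nWORD here\nlast\n' | sed '/WORD/ {
i\
Add this line before
a\
Add this line after
}')
printf '%s\n' "$result"
```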
      <h3>
        <a name="leadingtabsandspacesinasedscript" href="" id="leadingtabsandspacesinasedscript"></a>Leading tabs and spaces in a sed script
      </h3>
      <p>
        <i>Sed</i> ignores leading tabs and spaces in all commands. However these white space characters may or may not be ignored if they start the text following an "a," "c" or "i" command. If you want to keep leading spaces, and not care about which version of <i>sed</i> you are using, put a "\" as the first character of the line:
      </p>
      <pre>
a\<br />\        This line starts with a tab
</pre>
      <h3>
        <a name="addingmorethanoneline" href="" id="addingmorethanoneline"></a>Adding more than one line
      </h3>
      <p>
        All three commands will allow you to add more than one line. Just end each line except the last with a "\":
      </p>
      <pre>
/WORD/ a\<br />Add this line\<br />This line\<br />And this line
</pre>
      <h3>
        <a name="addinglinesandthepatternspace" href="" id="addinglinesandthepatternspace"></a>Adding lines and the pattern space
      </h3>
      <p>
        I have mentioned the pattern space before. Most commands operate on the pattern space, and subsequent commands may act on the results of the last modification. The three previous commands, like the read file command, add the new lines to the output stream, bypassing the pattern space.
      </p>
      <h3>
        <a name="addressrangesandtheabovecommands" href="" id="addressrangesandtheabovecommands"></a>Address ranges and the above commands
      </h3>
      <p>
        Some commands can take a range of lines, and others cannot. To be precise, the commands "a," "i," "r," and "q" will not take a range like "1,100" or "/begin/,/end/." The "c" or change command allows this, and it will let you change several lines into one:
      </p>
      <pre>
/begin/,/end/ c\<br />***DELETED***<br />
</pre>
      <p>
        If you need to apply one of the other commands to a range of lines, you can use the curly braces, as that will let you perform the operation on every line:
      </p>
      <pre>
1,$ {
a\

}
</pre>
      <h3>
        <a name="multilinepatterns" href="" id="multilinepatterns"></a>Multi-Line Patterns
      </h3>
      <p>
        Most UNIX utilities are line oriented. Regular expressions are line oriented. Searching for patterns that cover more than one line is not an easy task. (Hint: It will be very shortly.)
      </p>
      <p>
        <i>Sed</i> reads in a line of text, performs commands which may modify the line, and outputs the modified line if desired. The main loop of a <i>sed</i> script looks like this:
      </p>
      <p>
        1. The next line is read from the input file and placed in the pattern space. If the end of file is found, and if there are additional files to read, the current file is closed, the next file is opened, and the first line of the new file is placed into the pattern space.
      </p>
      <p>
        2. The line count is incremented by one.
      </p>
      <p>
        3. Each <i>sed</i> command is examined. If there is a restriction placed on the command, and the current line in the pattern space meets that restriction, the command is executed. Some commands, like "d," cause <i>sed</i> to go to the top of the loop, while "n" reads the next input line and continues with the following command. The "q" command causes <i>sed</i> to stop. Otherwise the next command is examined.
      </p>
      <p>
        4. After all of the commands are examined, the pattern space is output.
      </p>
      <p>
        The restriction before the command determines if the command is executed. If the restriction is a pattern, and the operation is the delete command, then the following will delete all lines that have the pattern:
      </p>
      <pre>
/PATTERN/ d<br />
</pre>
      <p>
        If the restriction is a pair of numbers, then the deletion will happen if the line number is greater than or equal to the first number and less than or equal to the last number:
      </p>
      <pre>
10,20 d<br />
</pre>
      <p>
        If the restriction is a pair of patterns, there is a variable that is kept for each of these pairs. If the variable is false and the first pattern is found, the variable is made true. If the variable is true, the command is executed. If the variable is true, and the last pattern is on the line, after the command is executed the variable is turned off:
      </p>
      <pre>
/begin/,/end/ d<br />
</pre>
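      <p>
        The two kinds of range can be tried on a few lines of input. This is only a sketch; the function names and the sample data are assumptions, and I use a shorter numeric range than the one above so the input stays small:
      </p>
      <pre>
# Both function names are hypothetical.
# Delete lines 2 through 4 by line number.
delete_by_number() {
  sed '2,4 d'
}

# Delete everything from a "begin" line through the next "end" line.
delete_by_pattern() {
  sed '/begin/,/end/ d'
}

printf '1\n2\n3\n4\n5\n' | delete_by_number
printf 'a\nbegin\nb\nend\nc\n' | delete_by_pattern
</pre>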
      <p>
        Whew! That was a mouthful. Hey, I said it was a review. If you have read the previous lessons, you should have breezed through this. You may want to refer back to this review, because I covered several subtle points.
      </p>
      <h3>
        <a name="transformwithy" href="" id="transformwithy"></a>Transform with y
      </h3>
      <p>
        If you wanted to change a word from lower case to upper case, you could write 26 character substitutions, converting "a" to "A," etc. <i>Sed</i> has a command that operates like the <i>tr</i> program. It is called the "y" command. For instance, to change the letters "a" through "f" into their upper case form, use:
      </p>
      <pre>
y/abcdef/ABCDEF/<br />
</pre>
      <p>
        I could have used an example that converted all 26 letters into upper case, and while this column covers a broad range of topics, the "column" prefers a narrower format.
      </p>
      <p>
        If you wanted to convert a line that contained a hexadecimal number (e.g. 0x1aff) to upper case (0x1AFF), you could use:
      </p>
      <pre>
/0x[0-9a-zA-Z]*/ y/abcdef/ABCDEF/<br />
</pre>
      <p>
        This works fine if there are only numbers in the file. If you wanted to change the second word in a line to upper case, you are out of luck - unless you use multi-line editing.
      </p>
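      <p>
        A short sketch shows both the conversion and the limitation. The function name "upper_hex" and the second input line are made up for the example. Note that "y" transforms every matching letter on the line, not just the ones inside the number:
      </p>
      <pre>
# upper_hex is a hypothetical helper name.
# On lines containing a hexadecimal number, map a-f to A-F everywhere.
upper_hex() {
  sed '/0x[0-9a-zA-Z]*/ y/abcdef/ABCDEF/'
}

# The second line shows the word "add" being converted as well.
printf '0x1aff\nadd 0x1aff\n' | upper_hex
</pre>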
      <h3>
        <a name="displayingcontrolcharacterswithal" href="" id="displayingcontrolcharacterswithal"></a>Displaying control characters with a l
      </h3>
      <p>
        The "l" command prints the current pattern space. It is therefore useful in debugging <i>sed</i> scripts. It also converts unprintable characters into printing characters by outputting the value in octal preceded by a "\" character.
      </p>
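      <p>
        As a sketch, here is a line with a control character (octal 001) embedded in it. The function name "show_controls" is an assumption. The "-n" option suppresses the normal output so that only the result of "l" is shown, and "l" also marks the end of the line with a "$":
      </p>
      <pre>
# show_controls is a hypothetical helper name.
# Print each line in the unambiguous form produced by the l command.
show_controls() {
  sed -n 'l'
}

# The control character is shown as a backslash followed by its octal value.
printf 'a\001b\n' | show_controls
</pre>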
      <h3>
        <a name="theholdbuffer" href="" id="theholdbuffer"></a>The Hold Buffer
      </h3>
      <p>
        So far we have talked about three concepts of <i>sed</i>: (1) The input stream or data before it is modified, (2) the output stream or data after it has been modified, and (3) the pattern space, or buffer containing characters that can be modified and sent to the output stream.
      </p>
      <p>
        There is one more "location" to be covered: the <i>hold buffer</i> or <i>hold space</i>. Think of it as a spare pattern buffer. It can be used to "copy" or "remember" the data in the pattern space for later. There are five commands that use the hold buffer.
      </p>
      <h3>
        <a name="exchangewithx" href="" id="exchangewithx"></a>Exchange with x
      </h3>
      <p>
        The "x" command eXchanges the pattern space with the hold buffer. By itself, the command isn't useful. Executing the <i>sed</i> command
      </p>
      <pre>
x<br />
</pre>
      <p>
        as a filter adds a blank line in the front, and deletes the last line. It looks like it didn't change the input stream significantly, but the <i>sed</i> command is modifying every line.
      </p>
      <p>
        The hold buffer starts out containing a blank line. When the "x" command modifies the first line, line 1 is saved in the hold buffer, and the blank line takes the place of the first line. The second "x" command exchanges the second line with the hold buffer, which contains the first line. Each subsequent line is exchanged with the preceding line. The last line is placed in the hold buffer, and is not exchanged a second time, so it remains in the hold buffer when the program terminates, and never gets printed. This illustrates that care must be taken when storing data in the hold buffer, because it won't be output unless you explicitly request it.
      </p>
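      <p>
        You can watch this happen with three lines of input. This is just a sketch, and the function name "shift_lines" is made up:
      </p>
      <pre>
# shift_lines is a hypothetical helper name.
# Swap every line with the hold buffer, which starts out empty.
shift_lines() {
  sed 'x'
}

# Output is a blank line, then "one" and "two"; "three" is never printed.
printf 'one\ntwo\nthree\n' | shift_lines
</pre>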
      <h3>
        <a name="holdwithhorh" href="" id="holdwithhorh"></a>Hold with h or H
      </h3>
      <p>
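        The "x" command exchanges the hold buffer and the pattern buffer. Both are changed. The "h" command copies the pattern buffer into the hold buffer. The pattern buffer is unchanged. The "H" command appends the pattern buffer to the hold buffer, separated by a newline, instead of overwriting it.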
      </p>
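      <p>
        Here is a minimal sketch that combines "h" with the "x" command from the previous section. It remembers the first line and prints it when the last line is reached. The function name "first_at_end" is an assumption, and "-n" is used so nothing else is printed:
      </p>
      <pre>
# first_at_end is a hypothetical helper name.
# Copy line 1 into the hold buffer; on the last line, swap it back and print it.
first_at_end() {
  sed -n '1 h
$ {
x
p
}'
}

# Prints only "one".
printf 'one\ntwo\nthree\n' | first_at_end
</pre>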
      <h3>
        <a name="flowcontrol" href="" id="flowcontrol"></a>Flow Control
      </h3>
      <p>
        As you learn about <i>sed</i> you realize that it has its own programming language. It is true that it's a very specialized and simple language. What language would be complete without a method of changing the flow control? There are three commands <i>sed</i> uses for this. You can specify a label with a text string preceded by a colon. The "b" command branches to the label. The label follows the command. If no label is there, branch to the end of the script. The "t" command is used to test conditions.
      </p>
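      <p>
        As a sketch of the "b" command without a label, the following skips the rest of the script for lines that start with "#". The function name "skip_comments" and the sample input are made up for illustration:
      </p>
      <pre>
# skip_comments is a hypothetical helper name.
# Lines starting with # branch to the end of the script and are left alone.
skip_comments() {
  sed '/^#/ b
s/foo/bar/'
}

# The first line is untouched; the second becomes "bar".
printf '# foo\nfoo\n' | skip_comments
</pre>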
      <h3>
        <a name="testingwitht" href="" id="testingwitht"></a>Testing with t
      </h3>
      <p>
        You can execute a branch if a pattern is found. You may want to execute a branch only if a substitution is made. The command "t label" will branch to the label if the last substitute command modified the pattern space.
      </p>
      <p>
        One use for this is recursive patterns. Suppose you wanted to remove white space inside parentheses. These parentheses might be nested. That is, you would want to delete a string that looked like "( ( ( ())) )." The <i>sed</i> expression
      </p>
      <pre>
s/([ ^I]*)//g<br />
</pre>
      <p>
        would only remove the innermost set. You would have to pipe the data through the script four times to remove each set of parentheses. You could use the regular expression
      </p>
      <pre>
s/([ ^I()]*)//g<br />
</pre>
      <p>
        but that would delete non-matching sets of parentheses. The "t" command would solve this:
      </p>
      <pre>
:again<br />s/([ ^I]*)//g<br />t again
</pre>
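      <p>
        To try the loop from a shell script, something like the following sketch should work. The function name "strip_empty_parens" and the sample input are assumptions. Since "^I" stands for a tab, the sketch builds a real tab character with <i>printf</i> and uses double quotes so the shell can insert it into the script:
      </p>
      <pre>
# strip_empty_parens is a hypothetical helper name.
# Remove "()" pairs that hold only spaces or tabs, repeating until none are left.
strip_empty_parens() {
  tab=$(printf '\t')
  sed ":again
s/([ $tab]*)//g
t again"
}

# Each pass removes one level of nesting; the surrounding text is kept.
printf 'keep( ( ( ())) )this\n' | strip_empty_parens
</pre>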
      <h3>
        <a name="passingregularexpressionsasarguments" href="" id="passingregularexpressionsasarguments"></a>Passing regular expressions as arguments
      </h3>
      <p>
        In the earlier scripts, I mentioned that you would have problems if you passed an argument to the script that had a slash in it. In fact, any regular expression might cause you problems. A script like the following is asking to be broken some day:
      </p>
      <pre>
s/'"$1"'//g
</pre>If the argument contains any of these characters in it you may get a broken script: "/\.*[]^$." One solution is to have the user put a backslash before any of these characters when they pass it as an argument. However, the user has to know which characters are special.
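      <p>
        Another possibility is to let the script add the backslashes itself. The following is only a sketch under my own assumptions: the function name "escape_regex" is made up, it handles just the characters listed above, and it uses ":" as the delimiter so the slash needs no special treatment inside the bracket expression:
      </p>
      <pre>
# escape_regex is a hypothetical helper name.
# Put a backslash in front of each of the characters / \ . * [ ] ^ $
escape_regex() {
  printf '%s\n' "$1" | sed 's:\([][\\/.*^$]\):\\\1:g'
}

# The escaped argument can now be dropped into the substitute command.
escaped=$(escape_regex 'a/b.c*')
printf 'x a/b.c* y\n' | sed 's/'"$escaped"'//g'
</pre>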
    </div>
  </body>
</html>