Friday, September 30, 2011

Using SED on Windows


The other day I needed to trawl through a rather unappealing log file: over 18,000 lines and over 12,000 occurrences of the word "error". A quick scan of the log showed that the vast majority of these were more a warning than a significant error. However, I knew the error I was looking for was in there somewhere, so I decided to weed out the non-error lines before I even started to tackle it. I remembered that the stream editor (SED) utility in UNIX was ideal for this sort of thing, so I had a look to see how I could apply it in Windows.



The first step was to find out whether you could get SED for Windows. I had a good idea that it did exist, and a quick Google search turned up a GNU version of SED that runs on Windows. You can download a copy of SED for Windows from SourceForge. I selected the "Complete package, except sources" and installed it to the default location in Program Files.



I then spent a while looking through my book on SED & AWK and various web sites to gather the pieces for the script that I wanted to run. I'd really recommend the O'Reilly SED & AWK book by Dale Dougherty & Arnold Robbins, but there are plenty of resources on the web and in the manuals that come as part of the installation package. The "Useful one-line scripts for SED" collection was particularly helpful for getting the line numbering right in the script I eventually used.



I wanted my script to run through the log file, pick out every line containing the word "ERROR" (in upper case), and prefix it with the line number from its position in the original log.



So I set about creating a simple DOS batch file to call three SED commands in succession. At each stage I produced an intermediate file which I could check as I was creating the script to make sure it was working as I'd intended.



So the first SED command simply adds line numbers to each line from the original log file.


SED = "{input file path}" > "{output file path}"
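
To see what this step produces, here is a minimal sketch run against a made-up three-line log. It assumes a POSIX-style shell with GNU sed (Git Bash or a UNIX terminal, for example); the file names and log contents are invented for illustration, and only the sed call itself comes from the script above.

```shell
# Create a hypothetical three-line log (contents invented for illustration)
printf 'INFO start\nERROR disk full\nINFO done\n' > sample.log

# "=" prints the current line number on a line of its own before each input line
sed = sample.log > numbered.log

# Show the result: numbers and log lines alternate
cat numbered.log
```

Each number lands on its own line above the log line it belongs to, so the three-line sample becomes six lines. That is why a second pass is needed to bring each number and its line together.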


The second SED command takes this new file, in which each line number sits on its own line above the log line it belongs to, and joins every number to its log line with a tab between them. The numbers then sit in a 'reserved space' at the beginning of the line and the log lines all line up.


SED "N;s/\n/\t/" "{input file path}" > "{output file path}"
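
As a rough illustration of the join, the sketch below feeds a small hand-written file in the alternating number/line layout through the same command. It assumes a POSIX-style shell with GNU sed (which understands \t in the replacement); the file names and contents are hypothetical.

```shell
# Hypothetical output of the numbering step: number and log line on alternating lines
printf '1\nINFO start\n2\nERROR disk full\n' > step1.log

# N appends the next input line to the pattern space (separated by a newline);
# s/\n/\t/ then replaces that embedded newline with a tab, giving one line per pair
sed "N;s/\n/\t/" step1.log > step2.log

cat step2.log
```

The four input lines come out as two, each starting with its line number followed by a tab.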


Finally the third SED command processes the file and outputs only lines containing the word "ERROR".


SED -n "/ERROR/p" "{input file path}" > "{output file path}"
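
The match is case-sensitive, which is what keeps lower-case mentions of "error" out of the result. A small sketch, assuming a POSIX-style shell with GNU sed and an invented numbered log:

```shell
# Hypothetical numbered log; the third line mentions "error" only in lower case
printf '1\tINFO start\n2\tERROR disk full\n3\tWARN error count reset\n' > numbered_tab.log

# -n turns off the default printing of every line;
# /ERROR/p prints only the lines that match ERROR exactly as written
sed -n "/ERROR/p" numbered_tab.log > errors_only.log

cat errors_only.log
```

Only line 2 survives; the lower-case "error" on line 3 does not match.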


The batch file simply sets the current directory at the start to be the location of the SED executable and finally tidies up the temporary files. A temporary folder on the root of the C drive was used as the location for the original log file, the temporary files and the final processed log file.


@ECHO off
ECHO Set the current directory to the folder in which SED was installed
C:
CD "C:\Program Files\GnuWin32\bin"
ECHO Add line numbers
SED = "C:\temp\original.log" > "C:\temp\temp1.log"
ECHO Format line numbers
SED "N;s/\n/\t/" "C:\temp\temp1.log" > "C:\temp\temp2.log"
ECHO Output only the lines containing the word 'ERROR'
SED -n "/ERROR/p" "C:\temp\temp2.log" > "C:\temp\processed.log"
ECHO Remove temporary files
DEL "C:\temp\temp1.log"
DEL "C:\temp\temp2.log"
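
Once the intermediate files are no longer needed for checking, the three commands can be chained with pipes so there is nothing to delete afterwards. The sketch below assumes a POSIX-style shell with GNU sed and uses made-up sample data and file names. The pipe character is also available at the Windows command prompt, so the same chain should carry over to the batch file with the C:\temp paths, though that variant is not shown here.

```shell
# Hypothetical sample log (contents invented for illustration)
printf 'INFO start\nERROR disk full\nINFO retry\nERROR timeout\n' > original.log

# Number the lines, join each number to its line with a tab,
# then keep only the lines containing ERROR
sed = original.log | sed "N;s/\n/\t/" | sed -n "/ERROR/p" > processed.log

cat processed.log
```

For this sample the result is the two ERROR lines, prefixed with their original line numbers 2 and 4.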


Given how prevalent these log files are in the work I'm doing, I have a feeling that I may need to re-use and modify this script in the future. SED is a great tool and blindingly fast at processing even large files. If you haven't already got it, I'd recommend downloading this open source utility and adding it to your Windows toolbox today.


Reader Comments (4)

Great post - read about SED for windows and knew it was what I need but without scripting experience was stuck. With your example, 2 hrs and copy and pasting my required code together via google searches I was done!

August 26, 2012 | Unregistered Commenter tobi

Hi, I have downloaded the "Complete Package, except sources" from SourceForge and installed it on my Windows 7 machine. But how do I set up the SED environment to start working? I don't see any file that opens the SED editor. I even logged in to DOS, changed the directory to that GNU32/sed folder and submitted a few sed commands, but they were not recognized by DOS. Could you please help me?

July 1, 2014 | Unregistered Commenter Baskar

Baskar.

I've never used SED in any sort of interactive editor mode, only as a utility to process a file called in a batch sequence ... so off the top of my head I'm not sure if there's an option here to use it how you are expecting. Perhaps a simple example of how I might use it in batch will help?

Let's assume I have installed (copied the files for) SED into a folder. In that folder I create a new text file called file1.txt. It contains 1 line of text:

foo bar

I now open a command prompt and navigate to this folder. I can run SED and give it a command to run against file1.txt:

sed s/foo/bar/ file1.txt > file2.txt

When run I should get a new file called file2.txt. This file should be the same as file1.txt in terms of content, except that the first instance of "foo" on each line should have been converted to "bar" (adding the g flag, as in s/foo/bar/g, would convert every instance). The content of file2.txt should therefore be:

bar bar

Does that help? I'm assuming the only difference between using SED at the DOS prompt and using it in an interactive editor is that you need to prefix the commands each time with "sed ". So hopefully not too onerous for you.

Regards,

Stephen.

July 31, 2014 | Registered Commenter Stephen Millard

I just happened on this topic while looking for something else.

Baskar - sed is rarely, if ever, used interactively ("open sed editor"). sed ("stream editor") is a filter, like tr or grep, not an interactive editor like Word or Vim. As Stephen explains, sed automatically and quickly processes input to output, based on a "sed script" (e.g., "s/foo/bar/") that you specify. sed is normally run from a shell script or batch file, but can also be run from the command line.

Stephen - Some of the tasks you describe seem perhaps more easily and efficiently carried out with grep, perhaps in combination with the nl command. Of course, the main thing is that the script works correctly, and runs fast enough. So there is nothing wrong with your examples, it's an excellent post, and the way you use the temporary files is exactly right.

If I may, I would like to mention another resource for learning and reviewing sed - "Definitive Guide to sed: Tutorial and Reference" - a book that I wrote and published. The website for "Definitive Guide" is www.sed-book.com if anyone wants to take a look.

Thank you,
Daniel Goldman

November 8, 2014 | Unregistered Commenter Daniel Goldman
