A VBScript to remove tabs and retain positional formatting25 Jul 2010
I’ve been working with colleagues in our finance department recently to streamline and automate some of their processes. An interesting little technical problem that cropped up was taking some output from the accounts system and pushing it into an Excel spreadsheet. The problem was that the formatting was done with varying numbers of tabs and spaces. My solution was a little bit of VBScript.
The output from this particular system was such that there was no easy export to anything like CSV or any way to extract the data directly from the database (without having to work out and do all of the calculations the system was doing). The output it could give however was a plain text report on screen. The text was selectable, but formatted into columns through the judicious use of spaces and tabs. The tabs here were not just taking the place of a fixed number of spaces, but were quite rightly references to tab stops - i.e. a particular distance along a line. Thankfully each tab stop was equal in length else I would have been into a more complicated solution.
I’d already selected VBScript as the tool of choice due to its ease of integration with Excel so I ended up writing a little function to do the necessary calculations for me. The only thing that took a little working out was the actual size of the tab stop, but if you find you have trouble working this out, just try incrementing the size of the tab stop by 1 until you get one that works.
The function I wrote is shown below. In essence it works its way through a string (make sure you pass it a line at a time) and uses the length of the string prior to the tab as the basis of working out how many spaces should replace the tab to take it to the next tab stop distance. It the appends the next bit of the string up to the next tab and repeats the process until all tabs have been accounted for.
Function DeTabString(p_strInput) Const TAB_SIZE = 8 Const FILLER = " " Dim strInput, intCounter 'Initialise strinput = p_strInput DeTabString = "" if instr(1,strInput, vbTab) > 0 then While instr(1,strInput, vbTab) > 0 'Take everything up to the tab DeTabString = DeTabString & left(strInput, instr(1,strInput, vbTab) - 1) 'Now pad the new string we're building to get it up to the tab stop position intCounter = (Int(Len(DeTabString) / TAB_SIZE) + 1) * TAB_SIZE - Len(DeTabString) While intCounter > 0 DeTabString = DeTabString & FILLER intCounter = intCounter - 1 Wend 'Take everything after the tab and set that to be processed next strInput = right(strInput, len(strInput) - instr(1,strInput, vbTab)) Wend 'Now add the remainder of the original string DeTabString = DeTabString & strInput else DeTabString = strInput end if End Function
To illustrate how this function works or might be used here’s a little bit of test code:
strOutput = "123456781234567812345678" strOutput = strOutput & vbCrLf & DeTabString("Hello" & vbTab & "there" & vbTab & "World") strOutput = strOutput & vbCrLf & DeTabString("Hi" & vbTab & vbTab & "Earth") strOutput = strOutput & vbCrLf & DeTabString("Hiya" & vbTab & "my " & vbTab & "Planet") WriteOutputFile strOutput Sub WriteOutputFile(p_strInput) Dim objFileSystem, objOutputFile Set objFileSystem = CreateObject("Scripting.fileSystemObject") Set objOutputFile = objFileSystem.CreateTextFile("C:\TabOutput.txt", TRUE) objOutputFile.Write(p_strInput) objOutputFile.Close Set objFileSystem = Nothing End Sub
So this is taking three strings, processing them and outputting them to a file. The first string consists of three words separated by tabs and is our base line test for the formatting. The words being used are all less than 8 characters (which was the tab stop used in the default function set-up above. The second line is missing a middle word and so has two tabs in a row. Finally the third line has some extra spaces included with the middle word of the string.
The first line of the output isn’t processed by the DeTabString function, but it of course could be. The purpose of this line is to give you a visual guide as to where the repeating units of eight (as per the tab stop) appear along a line. For the purposes of highlighting the effects I also changed the FILLER constant to a period rather than a space. Here’s what the output file looks like this.
123456781234567812345678 Hello...there...World Hi..............Earth Hiya....my ....Planet
So there you have it. A useful little function to have in your scripting toolbox. I’ve made use of it a few times now dealing with this particular finance system, but I’m sure that others might find similar opportunities to make use of this.