I’m writing this up to help others, and so I can refer to it again.
Long story, but at work I was left with a comma-delimited text file in which each line had:
- Order ID
- Order Date (YYYYMMDD)
- Customer Info (name, full address, phone etc.)
Example lines:
23456,20081108,Fred Thomas, 44 Pine…
66666,20081201,Jill Harvey, 234 Thomas Ave…
(The ellipsis […] is just a placeholder for the additional customer info, plus line feed.)
I wanted to do a quick transformation on this list and extract a list of Order IDs only. I was in vi (I had grepped this list out of a bigger list; long story that I’m not going to bore you with).
Now, the first field on each line was the order ID, and the second field started the same way on every line (,2008[whatever]).
So, it seemed trivial to do a REGEX to find this repeating element in every line, and remove it and all that follows to end of line.
Seems simple, but it took three of us a half-hour of monkeying around and googling before we found the solution, which is:
:%s/,2008\p*//
At the command prompt (:), run substitute against all lines (%s).
Find the repeating item (,2008) and replace it, along with all (*) the printable characters (\p) that follow it (\p*), with nothing (//).
You end up with a list of order IDs, as desired:
23456
66666
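If you want to try the same idea outside the editor, here is a minimal Python sketch of it. It is not what we ran; it just mirrors the substitution. Python's re module has no \p class, so I'm assuming that a plain .* (any characters up to end of line) is close enough for this data. The sample lines are the two examples from above, with the ellipsis left in as a placeholder.

```python
import re

# The two example lines from above; the ellipsis stands in for the
# rest of the customer info, exactly as in the examples.
lines = [
    "23456,20081108,Fred Thomas, 44 Pine…",
    "66666,20081201,Jill Harvey, 234 Thomas Ave…",
]

# Delete the first ",2008" on each line and everything after it.
# ".*" is swapped in for the "\p*" used in the editor command.
order_ids = [re.sub(r",2008.*", "", line) for line in lines]

for order_id in order_ids:
    print(order_id)
```

Because the match starts at the first ",2008" on the line and runs to the end, it does not matter whether the same text shows up again later in the customer info.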
Trivial now; just had to find the correct REGEX to replace all that follows that pattern.
This has been a Public Service Announcement from Stupid vi Tricks….