Remove duplicate lines from a text file and split a reference number
807589 Jun 23 2008, edited Jul 9 2008
Hi,
This is my first post.
I am trying to remove duplicate lines from a text file. To make things difficult, the duplicates are not identical: each one has a different timestamp, but they share the same reference number. Some reference numbers have as many as 10 duplicate lines, whereas others have only 2.
1. Here are some examples of duplicate lines, in the format <timestamp>,<reference>,<error message>:
08:47:22,95847170050,Problem inputting data.
08:53:28, 96672540040, More problems inputting data.
08:47:29,95847170050,Problem inputting data.
08:53:35, 96672540040, More problems inputting data.
08:47:35,95847170050,Problem inputting data.
08:53:41, 96672540040, More problems inputting data.
For each reference number, I want to delete all but the most recent line.
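To make the requirement concrete, this is roughly the behaviour I am after, sketched in Python. It is only an illustration, not a finished solution: keep_latest is a made-up name, and I am assuming that every non-blank line has the three comma-separated fields shown above, that the timestamps are HH:MM:SS within a single day, and that "most recent" means the latest timestamp for a given reference number.

```python
def keep_latest(lines):
    """Return one line per reference number: the one with the latest timestamp."""
    latest = {}  # reference -> (timestamp, original line)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Assumed layout: timestamp,reference,message.
        # Split at most twice so commas inside the message are left alone.
        timestamp, reference, _message = (p.strip() for p in line.split(",", 2))
        seen = latest.get(reference)
        # Zero-padded HH:MM:SS strings sort in time order, so a plain
        # string comparison is enough within a single day.
        if seen is None or timestamp >= seen[0]:
            latest[reference] = (timestamp, line)
    # A dict keeps keys in first-insertion order, so the output follows
    # the order in which each reference first appeared in the input.
    return [line for _timestamp, line in latest.values()]


sample = [
    "08:47:22,95847170050,Problem inputting data.",
    "08:53:28, 96672540040, More problems inputting data.",
    "08:47:29,95847170050,Problem inputting data.",
    "08:53:35, 96672540040, More problems inputting data.",
    "08:47:35,95847170050,Problem inputting data.",
    "08:53:41, 96672540040, More problems inputting data.",
]
for kept in keep_latest(sample):
    print(kept)
```

With the six example lines, only the 08:47:35 and 08:53:41 lines should remain. If the log can span midnight, the timestamp comparison would not be reliable, and keeping the last occurrence in file order might be a safer rule.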
2. The reference number is a series of 11 digits, which I need to split into two numbers (the first 7 digits and the last 4) separated by a comma.
Before: 96672540040
After: 9667254,0040
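In case it helps, the split I have in mind would look something like this in Python. Again, this is just a sketch: split_reference is a hypothetical name, and I am assuming the reference is always exactly 11 digits (anything else is rejected rather than guessed at). The parts are kept as text so that leading zeros such as 0040 are not lost.

```python
def split_reference(reference):
    """Split an 11-digit reference into its first 7 and last 4 digits."""
    reference = reference.strip()  # some lines have a space after the comma
    if len(reference) != 11 or not reference.isdigit():
        raise ValueError("expected an 11-digit reference, got %r" % reference)
    # Slice the string instead of converting to int, to preserve leading zeros.
    return reference[:7] + "," + reference[7:]


print(split_reference("96672540040"))  # 9667254,0040
```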
Appreciate all the help in advance.
Thanks