Just thought I'd add some thoughts to what a few posters have already touched on.

One issue I see is that your example data is not a properly formatted CSV file. When I create CSV files, I enclose my free text fields in double quotes "like this". This tells most software that everything between the quotes is part of one column, regardless of what it contains. The one catch is a free text field that itself contains double quotes: unless they are escaped (the usual CSV convention is to double them, ""like this""), software will cut off your data prematurely.

One option is to change how the source data is generated. Instead of using a comma, use a set of characters as your delimiter which is less likely to appear in your free text field, such as #*#*# . You then split your data on that delimiter as opposed to splitting on a comma.

You could also try an approach similar to pradeepalankar's. I personally like this approach. Get your first two and last two fields. Everything else is your third field.

It sounds like you have flexibility for what your final data set will look like. With the code already suggested, you could easily aim for this data set:

Field1 | Field2 | Field3 | Field4 | Field5
001 | 12A | cards | UK | 001,12A, Tues to Friday John rotates with Team, Perm 10800-1645 Sat Perm 8am starts Tue to Sat as of 22/02, cards, UK
002 | 12B | HL | UK | 002,12B, Mon to Wed Marry rotates with Team, Perm 0800-1645 Sat Perm 8am starts Tue to Sat as of 22/02 Works in shift, HL, UK
003 | 12c | HL | UK | 003,12c, Sat&Sun Paul rotates with Team, Perm 19000-1645 Sat,HL, UK
004 | 12D | CC | UK | 004,12D, All day Joe rotates with PL Team, Perm 10800-1645 Sat 8am starts Tue to Sat as of 24/02 Works in shift, CC, UK
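
In case it helps, here is a rough Python sketch of the "first two, last two, everything else in the middle" idea. It is only an illustration, not the code posted earlier in the thread: the function name split_free_text is made up, and it assumes every record has at least five comma-separated pieces and that only the middle free text field can contain extra commas.

```python
def split_free_text(line):
    """Split a record into [first, second, free_text, second_last, last].

    Assumption (not from the original thread): only the middle field
    may contain extra commas; the two leading and two trailing fields
    never do.
    """
    # Peel the first two fields off the left; the remainder stays intact.
    head = line.split(",", 2)
    if len(head) < 3:
        raise ValueError("expected at least five comma-separated fields")
    first, second, rest = head

    # Peel the last two fields off the right; what is left is the free text.
    tail = rest.rsplit(",", 2)
    if len(tail) < 3:
        raise ValueError("expected at least five comma-separated fields")
    free_text, second_last, last = tail

    return [part.strip() for part in (first, second, free_text, second_last, last)]


record = ("001,12A, Tues to Friday John rotates with Team, "
          "Perm 10800-1645 Sat Perm 8am starts Tue to Sat as of 22/02, cards, UK")
print(split_free_text(record))
```

The returned list keeps the free text as the third item; reorder it if you want a different column layout in your final data set.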