I have a comma delimited data file that has one column that contains an unwanted comma in the form: lastname, firstname. The source data does not have quotation marks around each entry. Does anybody know how to rid this data of the unwanted comma, while still keeping the rest of the commas as the delimiters?
Use the FIND function in a DATA step (pre-process your file with ) with a negative value for the second argument. But what if there isn't a comma present?
Or how about inputting two variables LNAME FNAME (along with the MISSOVER option on the INFILE stmt) and then you may or may not need to concatenate the two variables to reconstruct your FULL_NAME variable.
Without sample data, I'd say there is no fool-proof answer to your question.
An approach however that comes to mind:
a) do an "input @;" then count the commas in _infile_
b) do an "input var1 var2 var3 @;" and stop just ahead of the variable that may contain the unwanted comma
c) do the next input statement according to the number of commas counted (ie. read one or two fields) then @ stop again
d) do an input for the remaining variables
This might work if what you described is the only oddity that occurs in the data. Reality however is often far more complex, so take this advice with the proverbial grain of salt.
Another idea (much in Scott Barry's vein): if the immediately succeeding field (with respect to the field in question) is easily identifiable (by being either a constant or numeric value) then the SCAN function might come in handy.