11-19-2017 05:25 PM
I've been working my way thru Cody's Learning SAS by Example. Problem 15 in Chapter 3 is giving me some trouble.
The input file contains:
123,"Harold Wilson", Acct,01/15/1989,$78,123.
128,"Julia Child",Food, 08/29/1988,$89,123
The input is comma delimited, but the last field, SALARY, contains a comma.
When I run the following script, SALARY is truncated to the portion before its embedded comma. Section 3.14 in the book discusses using the ampersand "&" modifier, but that involves altering the input.
I could read SALARY in pieces (i.e., before and after the comma), do some arithmetic, and re-assemble. I could alter the input so that SALARY is enclosed in quotes. Seems like there should be a more straight forward solution.
Any ideas? Thanks in advance SAS release: 9.04.01M4P11092016
data EMPLOY ;
infile '/home/mytest/sasuser.v94/cody/Employee1.csv' dsd missover;
informat EmpID $3.
11-19-2017 09:08 PM
Read the whole line as one variable and use scan function to put values into first 4 variables,
then use index function to point at the $ signe, substring the rest of line and input value to salary
using best informat or comma informat.
Try make your code and come back in case of any issue.
11-20-2017 02:04 AM - edited 11-20-2017 02:17 AM
Your CSV file is malformed. If the value has a delimiter (or a quote) in it then it should be quoted. If this is a real data file and they are sending you more of them tell them to fix their process so that it generates a validly formatted file.
Fortunately the problem is in the last field in the line, so you can work around the error. Since it is the last field you can use list mode input to read the beginning columns and then use formatted mode for the last column.
NOTE: This is an example where using the obsolete MISSOVER option on the INFILE statement instead of the newer (it is probably only 30 years old) TRUNCOVER option could cause a real problem. If you tell SAS to read 8 characters for SALARY and the line only has 7 characters left then with MISSOVER you will get a missing value, but with TRUNCOVER SAS will just use the number of characters that are available.
data EMPLOY ; infile '/home/mytest/sasuser.v94/cody/Employee1.csv' dsd truncover; length EmpID $3 EmpName $20 Dept $8 DateHire 8 Salary 8 ; input empid empname dept datehire salary comma32. ; informat datehire mmddyy. ; format datehire yymmdd10. salary dollar12.; run;
Also don't define your variables by relying on the side effect of using an INFORMAT statement as the first place you reference the variable. Go ahead and explicitly define them using a LENGTH or ATTRIB statement. Save the INFORMAT statement for when you actually need to attach an INFORMAT to a variable. Note that SAS does not need informats (or formats) for most variable, especially character variables.