I'm at my wit's end. I'm trying to read in a .csv file in a SAS data step. The last column of the file, a numeric variable, is showing up as all missing values.
The .csv file was created in a Windows environment. I'm working in Windows on SAS Enterprise Guide 7.15. Here're a few lines of the SAS file:
monyear | patcounty | count | tot_mme | highmme |
16-Jan | A | 70253 | 64753813 | 6369 |
16-Jan | B | 25 | 26797.5 | 6827 |
16-Jan | C | 3471 | 4238297 | 388 |
16-Jan | D | 25693 | 33292887 | 3006 |
16-Jan | E | 4505 | 6134380 | 550 |
Here's my code:
data work.ucd2 (keep=monyear year patcounty count tot_mme highmme);
retain monyear year patcounty count tot_mme highmme;
infile "G:\SAPB\Projects\OD2A\Evaluation\UCD Evaluation\Data\ucd_cures2.csv"
dlm = ","
firstobs=2
truncover
termstr=crlf;
input monyear $
patcounty $
count
tot_mme
highmme;
year2 = "20"||trim(substr(monyear,1,2));
year = input(year2, 4.);
put year=;
run;
And here's what the code spits out:
I've tried different values in the "termstr" option ("cr", "lf", "crlf"); used or omitted, variously, the "truncover", "missover", and "dsd" options; and tried this solution: https://documentation.sas.com/doc/en/vdmmlcdc/8.1/lestmtsref/n1rill4udj0tfun1fvce3j401plo.htm
I've searched high and low, including in SAS user forums and StackExchange, but nothing works.
FWIW, PROC IMPORT reads the .csv file in just fine. But there must be some way to do this in a data step.
Your INPUT statement contains 5 variables, but the log shows 9 comma-separated columns, the 5th of which (which would be highmme) you blacked out.
PROC IMPORT uses DATA step code behind the scenes. Check your SAS log to see if it shows up or not. If not try this:
options source;
proc import etc....;
run;
Many thanks, SASKiwi. The DATA step did show up in the log. I copied and pasted the relevant lines, adding the LRECL=32767 option and putting in the INFORMAT, FORMAT, and INPUT statements as indicated in the log.
All to no avail. I'm still getting the missing values.
Here are a few lines from the log after running the DATA step (with county names redacted):
Somehow, some extraneous numbers are added to the end of each line. For example, record #2 has the value 6369 in the variable "highmme", but adds "68" to the value. Similarly, record #3 has the value of 5 for "highmme", but adds "48" at the end.
Any idea what's going on? Is it related to the problem?
Many thanks.
Your INPUT statement contains 5 variables, but the log shows 9 comma-separated columns, the 5th of which (which would be highmme) you blacked out.
Bingo! And ... duh.
Thanks to both of you, SASKiwi and Kurt_Bremser, for taking the time to respond. SASKiwi: I'm sorry I can only select one response as the solution. Looking in the log at what INFILE does with the behind-the-scenes DATA step was fantastic information, but Kurt_Bremer's solution more directly aided me in solving my problem.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.