Hello everyone,
I recently started a class in college for SAS programming and so far myself and my classmates have been experiencing difficulties. Without going into too much detail, the reason I am reaching out is because the professor is not very present for this class and the material being taught does not run correctly within SAS. The specific issue we are facing is that we are unable to find suitable code for pointing to specific locations with columns in formatted input. We are supposed to have an input that is identical to the image I have attached, but our examples that are being taught are incorrect and do not work. I have tried many different methods and this is the one I am currently using, but it is still not producing the output we need. I've included a screenshot of the incorrect output we are currently generating below. Any assistance would be greatly appreciated!
The current line of code we are using.
DATA class_info; INFILE DATALINES ; INPUT Course $8. +1 Days $2. +1 BeginDate mmddyy10. +2 EndDate mmddyy10. +2 Credits 3. +2 Tuition dollar4.; DATALINES; DSCI 200 08/26/2019 10/29/2019 3 3000 DSCI 307 TT 08/26/2019 12/12/2019 3 3000 MATH 371 MW 08/26/2019 12/11/2019 3 3000 MATH 372 MW 08/27/2019 .......... 3 3000 ; RUN; PROC PRINT DATA=class_info; TITLE 'Reading in Data using Pointers/Informats'; FORMAT BeginDate mmddyy10. EndDate mmddyy10. Tuition dollar4.; RUN;
What we are suppose to produce
Incorrect output
Any help would be greatly appreciate!
DATA class_info;
INFILE DATALINES ;
INPUT Course $8. Days $4. BeginDate : ?? mmddyy10. EndDate : ?? mmddyy10. Credits Tuition ;
DATALINES;
DSCI 200 08/26/2019 10/29/2019 3 3000
DSCI 307 TT 08/26/2019 12/12/2019 3 3000
MATH 371 MW 08/26/2019 12/11/2019 3 3000
MATH 372 MW 08/27/2019 .......... 3 3000
;
RUN;
PROC PRINT DATA=class_info;
TITLE 'Reading in Data using Pointers/Informats';
FORMAT BeginDate mmddyy10. EndDate mmddyy10. Tuition dollar14.;
RUN;
DATA class_info;
INFILE DATALINES ;
INPUT Course $8. Days $4. BeginDate : ?? mmddyy10. EndDate : ?? mmddyy10. Credits Tuition ;
DATALINES;
DSCI 200 08/26/2019 10/29/2019 3 3000
DSCI 307 TT 08/26/2019 12/12/2019 3 3000
MATH 371 MW 08/26/2019 12/11/2019 3 3000
MATH 372 MW 08/27/2019 .......... 3 3000
;
RUN;
PROC PRINT DATA=class_info;
TITLE 'Reading in Data using Pointers/Informats';
FORMAT BeginDate mmddyy10. EndDate mmddyy10. Tuition dollar14.;
RUN;
big thanks for helping us out with my with this problem. I really appreciate the time you took to answer my question.
I am going to take your code as an opportunity to decipher some of the messages you got in the log and how to address them.
This is part of the log:
201 DATA class_info; 202 INFILE DATALINES ; 203 INPUT Course $8. +1 Days $2. +1 BeginDate mmddyy10. +2 EndDate mmddyy10. +2 Credits 3. +2 203! Tuition dollar4.; 204 205 DATALINES; NOTE: Invalid data for BeginDate in line 206 13-22. NOTE: Invalid data for EndDate in line 206 25-34. NOTE: Invalid data for Credits in line 206 37-39. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-- 206 DSCI 200 08/26/2019 10/29/2019 3 3000 Course=DSCI 200 Days= BeginDate=. EndDate=. Credits=. Tuition=0 _ERROR_=1 _N_=1
The Rule: indicates the actual positions the program code sees for your positions. The + symbol are the multiples of 5 such as 5, 15, 25. The 1, 2, 3 are the 10, 20, 30 columns in the data. So that helps understand the invalid data messages.
The one for Begindate says that it was trying to read at columns 13 to 22. (The line 206 is the Log code line and will vary constantly). So if you look at the rule above the data you see that the value starts in column 14 (1 before the 15 or +).
So starting at column 13 the value starts with a space, and ending at column 22 the 1 in 2019 is the last character. So the instructions provided in the input line are starting to read one character too soon. The value of " 08/26/201" is not valid input to the MMDDYY10 informat so results in "invalid data". Similar behavior for Endate, starts one character too soon.
The behavior for Credits is different. It starts reading on the correct column. However the INPUT specified 3., which means it will read 3 characters and try to treat them as a number. But the third column 39 (just before the 4 on the rule) is the start of another value with a space inbetween. So the value attempted for Credits was "3 3" which is not a valid numeric value and the informat won't read that.
So you can "fix" the columns to start reading by increasing +1 or decreasing the column shifts as needed and using a different informat for Credits
INPUT Course $8. +1 Days $2. +2 BeginDate mmddyy10. +2 EndDate mmddyy10. +1 Credits 1. +1 Tuition dollar4.;
You still get invalid data because of that .......... for one of the dates. You can provide a MODIFIER to the format ?? to remove that warning (caution: you want to know the reason for all of your invalid data before doing such)
INPUT Course $8. +1 Days $2. +2 BeginDate ?? mmddyy10. +2 EndDate ??mmddyy10. +1 Credits 1. +1 Tuition dollar4.;
Note there are quite often multiple ways to read the same data. Instead of using offsets from where the LAST value was read, which is the +1 or +2, you can specify an exact column to read from using the @ followed by a column number. The editor should show the column number if the data is nice.
INPUT Course $8. @10 Days $2. @14 BeginDate ?? mmddyy10. @26 EndDate ??mmddyy10. @37 Credits 1. @39 Tuition dollar4.;
You can mix these options as well, some as + and some @ and even list input if there aren't missing values.
INPUT Course $8. @10 Days $2. @14 BeginDate ?? mmddyy10. @26 EndDate ??mmddyy10. Credits Tuition ;
The above works if there are no missing values for Credits. The informat Dollar isn't needed unless you have special characters like currency or commas in the value text.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.