data Pumpkin2;
infile 'H:\stats301\A2\Pumpkin.dat';
input name $16. age 3. +1 category $1 date date11. (score1-score5) (5*4.1);
run;
proc print;
title 'Pumpkin carving competition';
format date date9.;
run;
I wanted to create some variables in my data set, but it doesn't work for the "date" variable.
I'm not sure what the problem with my code is.
The log says:
163 data Pumpkin2;
164 infile 'H:\stats301\A2\Pumpkin.dat';
165 input name $16. age 3. +1 category $1 date date11. (score1-score5) (5*4.1);
166 run;
NOTE: The infile 'H:\stats301\A2\Pumpkin.dat' is:
Filename=H:\stats301\A2\Pumpkin.dat,
RECFM=V,LRECL=32767,File Size (bytes)=376,
Last Modified=17 August 2017 11:34:02,
Create Time=17 August 2017 11:34:02
NOTE: Invalid data for date in line 1 2-12.
NOTE: Invalid data for score1 in line 1 13-16.
NOTE: Invalid data for score3 in line 1 21-24.
NOTE: Invalid data for score4 in line 1 25-28.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
1 Alicia Grossman 13 c 10-28-1999 7.8 6.5 7.2 8.0 7.9 52
name=Alicia Grossman age=13 category=A date=. score1=. score2=1.3 score3=. score4=. score5=199.9
_ERROR_=1 _N_=1
NOTE: Invalid data for date in line 2 2-12.
NOTE: Invalid data for score3 in line 2 21-24.
NOTE: Invalid data for score4 in line 2 25-28.
2 Matthew Lee 9 D 10-30-1999 6.5 5.9 6.8 6.0 8.1 52
name=Matthew Lee age=9 category=M date=. score1=. score2=0.9 score3=. score4=. score5=199.9
_ERROR_=1 _N_=2
NOTE: Invalid data for date in line 3 2-12.
NOTE: Invalid data for score1 in line 3 13-16.
NOTE: Invalid data for score3 in line 3 21-24.
NOTE: Invalid data for score4 in line 3 25-28.
3 Elizabeth Garcia 10 C 10-29-1999 8.9 7.9 8.5 9.0 8.8 52
name=Elizabeth Garcia age=10 category=E date=. score1=. score2=1 score3=. score4=. score5=199.9
_ERROR_=1 _N_=3
NOTE: Invalid data for date in line 4 2-12.
NOTE: Invalid data for score1 in line 4 13-16.
NOTE: Invalid data for score3 in line 4 21-24.
NOTE: Invalid data for score4 in line 4 25-28.
4 Lori Newcombe 6 D 10-30-1999 6.7 5.6 4.9 5.2 6.1 52
name=Lori Newcombe age=6 category=L date=. score1=. score2=0.6 score3=. score4=. score5=199.9
_ERROR_=1 _N_=4
NOTE: Invalid data for date in line 5 2-12.
NOTE: Invalid data for score1 in line 5 13-16.
NOTE: Invalid data for score3 in line 5 21-24.
NOTE: Invalid data for score4 in line 5 25-28.
5 Jose Martinez 7 d 10-31-1999 8.9 9.510.0 9.7 9.0 52
name=Jose Martinez age=7 category=J date=. score1=. score2=0.7 score3=. score4=. score5=199.9
_ERROR_=1 _N_=5
NOTE: Invalid data for date in line 6 2-12.
NOTE: Invalid data for score1 in line 6 13-16.
NOTE: Invalid data for score3 in line 6 21-24.
NOTE: Invalid data for score4 in line 6 25-28.
6 Brian Williams 11 C 10-29-1999 7.8 8.4 8.5 7.9 8.0 52
name=Brian Williams age=11 category=B date=. score1=. score2=1.1 score3=. score4=. score5=199.9
_ERROR_=1 _N_=6
NOTE: Invalid data for date in line 7 2-12.
NOTE: Invalid data for score1 in line 7 13-16.
NOTE: Invalid data for score3 in line 7 21-24.
NOTE: Invalid data for score4 in line 7 25-28.
7 Andrew Balemi 13 D 10-07-1997 1.3 2.1 0.7 1.5 2.1 52
name=Andrew Balemi age=13 category=A date=. score1=. score2=1.3 score3=. score4=. score5=199.7
_ERROR_=1 _N_=7
NOTE: 7 records were read from the infile 'H:\stats301\A2\Pumpkin.dat'.
The minimum record length was 52.
The maximum record length was 52.
NOTE: The data set WORK.PUMPKIN2 has 7 observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
167
168 proc print;
169 title 'Pumpkin carving competition';
170 format date date9.;
171 run;
NOTE: There were 7 observations read from the data set WORK.PUMPKIN2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.08 seconds
cpu time 0.00 seconds
What is the problem?
Your input statement
input name $16. age 3. +1 category $1 date mmddyy10. (score1-score5) (5*4.1);
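is missing the period in the informat for category. Without the period, SAS treats $1 as column input (a character variable in column 1), so it reads category from column 1 of each record. That is why your category values are wrong: they are the first letter of each name. It also throws off the position of everything read after it.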
So
input name $16. age 3. +1 category $1. date mmddyy10. (score1-score5) (5*4.1);
has a chance of working.
Note this bit from your log:
*NOTE: Invalid data for date in line 1 2-11.* <= SAS is reading date starting in column 2, i.e. the middle of the name.
*NOTE: Invalid data for score1 in line 1 12-15.*
*NOTE: Invalid data for score3 in line 1 20-23.*
*NOTE: Invalid data for score4 in line 1 24-27.*
The values shown in the log, such as 10-28-1999, appear to be in mmddyy10. layout (month-day-year).
The date11. informat would instead read values like 28OCT1999.
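Looking at the log from the question more closely, the missing period may not be the only thing to adjust. In that log, score2 comes out as 1.3 for age 13 and score5 as 199.9 for a 1999 date, which suggests that age sits in columns 17-19 and the year in columns 29-32. If so, the date occupies columns 23-32, with a blank in column 22 after category. That layout is inferred from the log, not from the file itself, so treat the following as a sketch to check against your data. It adds +1 before date to skip the assumed blank column.

```sas
data Pumpkin2;
  infile 'H:\stats301\A2\Pumpkin.dat';
  /* Assumed layout (inferred from the log, not confirmed):
     name 1-16, age 17-19, blank 20, category 21, blank 22,
     date 23-32, five scores of width 4 in 33-52 */
  input name $16. age 3. +1 category $1. +1 date mmddyy10.
        (score1-score5) (5*4.1);
run;

proc print data=Pumpkin2;
  title 'Pumpkin carving competition';
  format date date9.;
run;
```

If the columns are as assumed, the widths add up to 16 + 3 + 1 + 1 + 1 + 10 + 20 = 52, which matches the record length of 52 reported in the log. If the log still shows "Invalid data" notes, compare the column ranges in those notes against the RULE line to see where the pointer is off.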