Help using Base SAS procedures

how can i format this variables?

Contributor
Posts: 21

data Pumpkin2;
infile 'H:\stats301\A2\Pumpkin.dat';
input name $16. age 3. +1 category $1 date date11. (score1-score5) (5*4.1);
run;

proc print;
title 'Pumpkin carving competition';
format date date9.;
run;

I wanted to create some variables from my data, but it doesn't work for the "date" variable.

I'm not sure what the problem with my code is.

The log says:

 

163 data Pumpkin2;
164 infile 'H:\stats301\A2\Pumpkin.dat';
165 input name $16. age 3. +1 category $1 date date11. (score1-score5) (5*4.1);
166 run;

NOTE: The infile 'H:\stats301\A2\Pumpkin.dat' is:
Filename=H:\stats301\A2\Pumpkin.dat,
RECFM=V,LRECL=32767,File Size (bytes)=376,
Last Modified=17 August 2017 11:34:02,
Create Time=17 August 2017 11:34:02

NOTE: Invalid data for date in line 1 2-12.
NOTE: Invalid data for score1 in line 1 13-16.
NOTE: Invalid data for score3 in line 1 21-24.
NOTE: Invalid data for score4 in line 1 25-28.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
1 Alicia Grossman 13 c 10-28-1999 7.8 6.5 7.2 8.0 7.9 52
name=Alicia Grossman age=13 category=A date=. score1=. score2=1.3 score3=. score4=. score5=199.9
_ERROR_=1 _N_=1
NOTE: Invalid data for date in line 2 2-12.
NOTE: Invalid data for score3 in line 2 21-24.
NOTE: Invalid data for score4 in line 2 25-28.
2 Matthew Lee 9 D 10-30-1999 6.5 5.9 6.8 6.0 8.1 52
name=Matthew Lee age=9 category=M date=. score1=. score2=0.9 score3=. score4=. score5=199.9
_ERROR_=1 _N_=2
NOTE: Invalid data for date in line 3 2-12.
NOTE: Invalid data for score1 in line 3 13-16.
NOTE: Invalid data for score3 in line 3 21-24.
NOTE: Invalid data for score4 in line 3 25-28.
3 Elizabeth Garcia 10 C 10-29-1999 8.9 7.9 8.5 9.0 8.8 52
name=Elizabeth Garcia age=10 category=E date=. score1=. score2=1 score3=. score4=. score5=199.9
_ERROR_=1 _N_=3
NOTE: Invalid data for date in line 4 2-12.
NOTE: Invalid data for score1 in line 4 13-16.
NOTE: Invalid data for score3 in line 4 21-24.
NOTE: Invalid data for score4 in line 4 25-28.
4 Lori Newcombe 6 D 10-30-1999 6.7 5.6 4.9 5.2 6.1 52
name=Lori Newcombe age=6 category=L date=. score1=. score2=0.6 score3=. score4=. score5=199.9
_ERROR_=1 _N_=4
NOTE: Invalid data for date in line 5 2-12.
NOTE: Invalid data for score1 in line 5 13-16.
NOTE: Invalid data for score3 in line 5 21-24.
NOTE: Invalid data for score4 in line 5 25-28.
5 Jose Martinez 7 d 10-31-1999 8.9 9.510.0 9.7 9.0 52
name=Jose Martinez age=7 category=J date=. score1=. score2=0.7 score3=. score4=. score5=199.9
_ERROR_=1 _N_=5
NOTE: Invalid data for date in line 6 2-12.
NOTE: Invalid data for score1 in line 6 13-16.
NOTE: Invalid data for score3 in line 6 21-24.
NOTE: Invalid data for score4 in line 6 25-28.
6 Brian Williams 11 C 10-29-1999 7.8 8.4 8.5 7.9 8.0 52
name=Brian Williams age=11 category=B date=. score1=. score2=1.1 score3=. score4=. score5=199.9
_ERROR_=1 _N_=6
NOTE: Invalid data for date in line 7 2-12.
NOTE: Invalid data for score1 in line 7 13-16.
NOTE: Invalid data for score3 in line 7 21-24.
NOTE: Invalid data for score4 in line 7 25-28.
7 Andrew Balemi 13 D 10-07-1997 1.3 2.1 0.7 1.5 2.1 52
name=Andrew Balemi age=13 category=A date=. score1=. score2=1.3 score3=. score4=. score5=199.7
_ERROR_=1 _N_=7
NOTE: 7 records were read from the infile 'H:\stats301\A2\Pumpkin.dat'.
The minimum record length was 52.
The maximum record length was 52.
NOTE: The data set WORK.PUMPKIN2 has 7 observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


167
168 proc print;
169 title 'Pumpkin carving competition';
170 format date date9.;
171 run;

NOTE: There were 7 observations read from the data set WORK.PUMPKIN2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.08 seconds
cpu time 0.00 seconds

 

What is the problem?

 


Accepted Solutions
Solution
‎08-18-2017 10:54 PM
Super User
Posts: 13,574

Re: how can i format this variables?

Your input statement

input name $16. age 3. +1 category $1 date mmddyy10. (score1-score5) (5*4.1);

is missing the period in the informat for category. Without the period, SAS treats $1 as column input and reads category as a character value from column 1. That is why your category values are wrong (each one is the first letter of the name), and it throws off everything that follows.

So

input name $16. age 3. +1 category $1. date mmddyy10. (score1-score5) (5*4.1);

has a chance.

Note this bit from your log:

NOTE: Invalid data for date in line 1 2-11.  <= SAS is reading date starting in column 2, i.e. the middle of the name.
NOTE: Invalid data for score1 in line 1 12-15.
NOTE: Invalid data for score3 in line 1 20-23.
NOTE: Invalid data for score4 in line 1 24-27.
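For anyone following along, here is a minimal sketch of a DATA step with the period restored. It is not the poster's file: the three records are re-typed from the log into datalines, and their spacing is an assumption, because the forum display collapsed repeated blanks. The extra +1 before date is also an assumption. It is inferred from the first log, where score5 (columns 29-32 by the 4-wide pattern) came out as 199.9, which suggests the year ends in column 32 and the date starts in column 23, one column after category.

```sas
/* Sketch only: records re-typed from the log, spacing assumed.        */
/* $1. (with the period) is formatted input at the current column;     */
/* $1 without the period is column input and would read column 1.      */
/* The second +1 assumes the date occupies columns 23-32.              */
data Pumpkin2;
    input name $16. age 3. +1 category $1.
          +1 date mmddyy10.
          (score1-score5) (5*4.1);
    datalines;
Alicia Grossman  13 c 10-28-1999 7.8 6.5 7.2 8.0 7.9
Elizabeth Garcia 10 C 10-29-1999 8.9 7.9 8.5 9.0 8.8
Jose Martinez     7 d 10-31-1999 8.9 9.510.0 9.7 9.0
;
run;

proc print data=Pumpkin2;
    title 'Pumpkin carving competition';
    format date date9.;
run;
```

If the real file has the date directly after category, drop the second +1. A column pointer such as @23 would be another way to pin the date to a fixed position.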


All Replies
Super User
Posts: 23,760

Re: how can i format this variables?

Umm...fix the errors in the first step. Your data is not being read in correctly. That should be your first concern.
Contributor
Posts: 21

Re: how can i format this variables?

I'm sorry, but I don't see which line you're talking about. Could you tell me which error line you mean?
Super User
Posts: 13,574

Re: how can i format this variables?

The values the log is showing, like 10-28-1999, are apparently in mmddyy10. form (month-day-year).

The date11. informat would read things like 28OCT1999.
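A quick way to see the difference is to convert one literal with each informat. This is only an illustrative sketch: the variable names d1 and d2 are made up for the example, and date9. is used for the second value simply because the literal is nine characters wide.

```sas
/* Convert the same calendar date from two different text layouts. */
data _null_;
    d1 = input('10-28-1999', mmddyy10.);  /* month-day-year with separators  */
    d2 = input('28OCT1999', date9.);      /* day, month abbreviation, year   */
    put d1= date9. d2= date9.;            /* write both to the log as DATE9. */
run;
```

Both values should be written as 28OCT1999, since each informat produces the same underlying SAS date value. Pointing a DATE-style informat at text like 10-28-1999 is what gives the "Invalid data" notes.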

 

Contributor
Posts: 21

Re: how can i format this variables?

Hi, I tried a different informat, but it's still not working :/

203 data Pumpkin2;
204 infile 'H:\stats301\A2\Pumpkin.dat';
205 input name $16. age 3. +1 category $1 date mmddyy10. (score1-score5) (5*4.1);
206 run;

NOTE: The infile 'H:\stats301\A2\Pumpkin.dat' is:
Filename=H:\stats301\A2\Pumpkin.dat,
RECFM=V,LRECL=32767,File Size (bytes)=376,
Last Modified=17 August 2017 11:34:02,
Create Time=17 August 2017 11:34:02

NOTE: Invalid data for date in line 1 2-11.
NOTE: Invalid data for score1 in line 1 12-15.
NOTE: Invalid data for score3 in line 1 20-23.
NOTE: Invalid data for score4 in line 1 24-27.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
1 Alicia Grossman 13 c 10-28-1999 7.8 6.5 7.2 8.0 7.9 52
name=Alicia Grossman age=13 category=A date=. score1=. score2=1.3 score3=. score4=. score5=-19.9
_ERROR_=1 _N_=1
NOTE: Invalid data for date in line 2 2-11.
NOTE: Invalid data for score3 in line 2 20-23.
NOTE: Invalid data for score4 in line 2 24-27.
2 Matthew Lee 9 D 10-30-1999 6.5 5.9 6.8 6.0 8.1 52
name=Matthew Lee age=9 category=M date=. score1=. score2=0.9 score3=. score4=. score5=-19.9
_ERROR_=1 _N_=2
NOTE: Invalid data for date in line 3 2-11.
NOTE: Invalid data for score1 in line 3 12-15.
NOTE: Invalid data for score2 in line 3 16-19.
NOTE: Invalid data for score3 in line 3 20-23.
NOTE: Invalid data for score4 in line 3 24-27.
3 Elizabeth Garcia 10 C 10-29-1999 8.9 7.9 8.5 9.0 8.8 52
name=Elizabeth Garcia age=10 category=E date=. score1=. score2=. score3=. score4=. score5=-19.9
_ERROR_=1 _N_=3
NOTE: Invalid data for date in line 4 2-11.
NOTE: Invalid data for score1 in line 4 12-15.
NOTE: Invalid data for score3 in line 4 20-23.
NOTE: Invalid data for score4 in line 4 24-27.
4 Lori Newcombe 6 D 10-30-1999 6.7 5.6 4.9 5.2 6.1 52
name=Lori Newcombe age=6 category=L date=. score1=. score2=0.6 score3=. score4=. score5=-19.9
_ERROR_=1 _N_=4
NOTE: Invalid data for date in line 5 2-11.
NOTE: Invalid data for score1 in line 5 12-15.
NOTE: Invalid data for score3 in line 5 20-23.
NOTE: Invalid data for score4 in line 5 24-27.
5 Jose Martinez 7 d 10-31-1999 8.9 9.510.0 9.7 9.0 52
name=Jose Martinez age=7 category=J date=. score1=. score2=0.7 score3=. score4=. score5=-19.9
_ERROR_=1 _N_=5
NOTE: Invalid data for date in line 6 2-11.
NOTE: Invalid data for score1 in line 6 12-15.
NOTE: Invalid data for score3 in line 6 20-23.
NOTE: Invalid data for score4 in line 6 24-27.
6 Brian Williams 11 C 10-29-1999 7.8 8.4 8.5 7.9 8.0 52
name=Brian Williams age=11 category=B date=. score1=. score2=1.1 score3=. score4=. score5=-19.9
_ERROR_=1 _N_=6
NOTE: Invalid data for date in line 7 2-11.
NOTE: Invalid data for score1 in line 7 12-15.
NOTE: Invalid data for score3 in line 7 20-23.
NOTE: Invalid data for score4 in line 7 24-27.
7 Andrew Balemi 13 D 10-07-1997 1.3 2.1 0.7 1.5 2.1 52
name=Andrew Balemi age=13 category=A date=. score1=. score2=1.3 score3=. score4=. score5=-19.9
_ERROR_=1 _N_=7
NOTE: 7 records were read from the infile 'H:\stats301\A2\Pumpkin.dat'.
The minimum record length was 52.
The maximum record length was 52.
NOTE: The data set WORK.PUMPKIN2 has 7 observations and 9 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds


207
208 proc print;
209 title 'Pumpkin carving competition';
210 format date date9.;
211 run;

NOTE: There were 7 observations read from the data set WORK.PUMPKIN2.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.08 seconds
cpu time 0.01 seconds

Super User
Posts: 23,760

Re: how can i format this variables?

Why not just read the first and last names separately? All your records show only two name parts.

I would suggest using PROC IMPORT to import the file, then checking the log for the generated code to see how it differs from your program.
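As a rough illustration of getting separate name variables, the sketch below splits an already-read name with the SCAN function. This is a swapped-in technique, not the approach suggested above: it keeps the fixed-width read of name and splits afterwards, because plain list input would likely stumble on records where two scores touch, such as 9.510.0 in the log. The data set name names and the variables first and last are hypothetical.

```sas
/* Split a fixed-width name into two parts after it has been read. */
data names;
    length name first last $16;
    name  = 'Alicia Grossman';     /* stands in for a value read with $16. */
    first = scan(name, 1, ' ');    /* first blank-delimited word           */
    last  = scan(name, 2, ' ');    /* second blank-delimited word          */
    put first= last=;              /* show the two parts in the log        */
run;
```

This only covers names with exactly two parts, which is what the seven records in the log show.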

☑ This topic is solved.
