DATA Step, Macro, Functions and more

print only data with invalid dates

Reply
Contributor
Posts: 25

print only data with invalid dates

Trying to solve this question:

  1. Show the listing of observations with SSN, “corrected” numeric date-of-birth and “invalid” character date-of-birth variables for observations with invalid dates of birth.

 

This is the code I tried:

 

proc print data=hi.student;
where Char_DOB = .;
var SSN Total_DOB Char_DOB;
run;

 

Char_DOB is a character variable and Total_DOB is a numeric.

 

Since the empty fields for char_dobs are .'s i assumed they were missing but i guess they are the invalid dates and i have no idea how to make a where statement for that.

 

Super User
Posts: 19,772

Re: print only data with invalid dates

[ Edited ]

The answer was in previous post. 

 

Where char_dob = '.';

 

It's likely you made a mistake in conversion so missing was converted to a single character, a period. 

 


larkjr18 wrote:

Trying to solve this question:

  1. Show the listing of observations with SSN, “corrected” numeric date-of-birth and “invalid” character date-of-birth variables for observations with invalid dates of birth.

 

This is the code I tried:

 

proc print data=hi.student;
where Char_DOB = .;
var SSN Total_DOB Char_DOB;
run;

 

Char_DOB is a character variable and Total_DOB is a numeric.

 

Since the empty fields for char_dobs are .'s i assumed they were missing but i guess they are the invalid dates and i have no idea how to make a where statement for that.

 


 

Contributor
Posts: 25

Re: print only data with invalid dates

I'm getting this in the SAS log when I run it

NOTE: No observations were selected from data set HI.STUDENT.
NOTE: There were 0 observations read from the data set HI.STUDENT.
WHERE Char_DOB='.';
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

Super User
Posts: 19,772

Re: print only data with invalid dates

Well, then I repeat my previous last answer - Please show a proc contents for the dataset, an image of your invalid data and the code and log. 

 

FYI - reposting the same question isn't helpful in the long run. I'll merge the threads once this is answered. 

Contributor
Posts: 25

Re: print only data with invalid dates

[ Edited ]

Here is my code: I also added pics of the data set and proc contents

 

libname hi
'C:\Users\larkj\Desktop';
proc format library = hi;
value season
1='Fall' 2='Spring' 3='Summer';
value $sx
'M'='Male' 'F'='Female';
value race
.='Missing' 2='Hispanic' 3='Puerto Rican'
4='Mexican American' 5='Asian American'
6='Caucasian' 7='American Indian' 8='African American'
9='Other';
;
run;
options fmtsearch=(hi)

data hi.student (drop=i);
infile 'C:\Users\larkj\Desktop\STUDENT.DAT';
input ssn 1-9 sex $ 11 term 17 type $ 19 div $ 21-24
major 26-30 ed $ 32 state $ 34-35 sat 37-40 satm 42-44
satv 46-48 gpa 50-53 rank 55-58 act 60-63 toefl 65-68
honor $ 70 ethnic 72 year_dob 101-102 month_dob 103-104
day_dob 105-106
;
Char_DOB = input(catx("/", month_dob, day_dob, year_dob),?? mmddyy8.);
Total_DOB = input(catx("/", month_dob, day_dob, year_dob),?? mmddyy8.);
if ((month_dob = 02) AND (day_dob= 29)) then Total_DOB = mdy(03,01,year_dob)-1;
else Total_DOB = mdy(month_dob,15,year_dob);
format Total_DOB mmddyy8.
term Season.
sex $Sx.
ethnic Race.;
array change (6) gpa sat satm satv act toefl;
do i = 1 TO 6;
if change(i) = 0 then change(i) = .;
end;
Char_DOBc = put(Char_DOB,mmddyy8.);
drop Char_DOB;
rename Char_DOBc = Char_DOB;
run;
PROC CONTENTS DATA=hi.student;
RUN;
proc print data=hi.student;
format term Season. sex $Sx. ethnic Race. Total_DOB mmddyy8. ;
run;
proc print data=hi.student;
where Char_DOB = '.';
var SSN Total_DOB Char_DOB;
run;


2017-03-29 (1).png2017-03-29 (2).png2017-03-29 (3).png2017-03-29 (4).png2017-03-29 (5).png
Super User
Posts: 19,772

Re: print only data with invalid dates

Ok...you can't change a variable type in a data step once it's been assigned. So your rename/drop conversions won't do what you want. 

 Get the basics working and then try and simplify it. 

 

The solution is:

Rather than INPUT, use MDY to calculate the numeric date. Then use PUT() to convert it to a character field. 

 

 

Your IF statement is recoding TOTAL_DOB to the 15th as well, after you've created it initially which is why the dates don't match. That's why all your days are 15 in the dates and why they don't match up between numeric/character. 

 

Lines15 in data step if you start with Data as line 1. 

PROC Star
Posts: 7,468

Re: print only data with invalid dates

[ Edited ]

Your problem is with your char_dob field. The ones you want to separate from the rest are the ones with a value of '0'

 

I'd create a new numeric date field (e.g., num_dob) with code something like:

if char_dob eq '0' then call missing(num_dob);
else num_dob=input(char_dob,mmddyy10. ??);

 

Then, simply print out any records that have a missing num_dob.

 

Art, CEO, AnalystFinder.com

 

Contributor
Posts: 25

Re: print only data with invalid dates

Do I need to create the num_dob var before I can run your code?

 

I assumed that since there is a var=put statement in your code it would create it but when I try to run the proc print it says it can't find num_dob in the set.

PROC Star
Posts: 7,468

Re: print only data with invalid dates

You have to post your log in order for anyone to see what the problem might be.

 

One possibility is that num_dob is being assigned as a character variable. If so, just add a line before your set statement:

 

length num_dob 8;

 

Art, CEO, AnalystFinder.com

Contributor
Posts: 25

Re: print only data with invalid dates

Hmm the proc print statement is still tell me num_dob is not on file in hi.student.

Super User
Posts: 19,772

Re: print only data with invalid dates


larkjr18 wrote:

Hmm the proc print statement is still tell me num_dob is not on file in hi.student.


Post your code and log if you need further help. I'm assuming it's changed by now.

 

FYI - Your statement above doesn't tell us much. Take the time to detail your questions. 

Contributor
Posts: 25

Re: print only data with invalid dates

Sorry about that, here is my relevant code right now:

 

libname hi
'C:\Users\larkj\Desktop';
proc format library = hi;
value season
1='Fall' 2='Spring' 3='Summer';
value $sx
'M'='Male' 'F'='Female';
value race
.='Missing' 2='Hispanic' 3='Puerto Rican'
4='Mexican American' 5='Asian American'
6='Caucasian' 7='American Indian' 8='African American'
9='Other';
;
run;
options fmtsearch=(hi);
data hi.student (drop=i);
infile 'C:\Users\larkj\Desktop\STUDENT.DAT';
length num_dob 8;
input ssn 1-9 sex $ 11 term 17 type $ 19 div $ 21-24
major 26-30 ed $ 32 state $ 34-35 sat 37-40 satm 42-44
satv 46-48 gpa 50-53 rank 55-58 act 60-63 toefl 65-68
honor $ 70 ethnic 72 year_dob 101-102 month_dob 103-104
day_dob 105-106
;
num_dob= input(char_dob,mmddyy10.);
if char_dob eq '0' then call missing(num_dob);
else num_dob=input(char_dob,mmddyy10. ??);
Char_DOB = input(catx("/", month_dob, day_dob, year_dob),?? mmddyy8.);
Total_DOB = input(catx("/", month_dob, day_dob, year_dob),?? mmddyy8.);
if ((month_dob = 02) AND (day_dob= 29)) then Total_DOB = mdy(03,01,year_dob)-1;
else Total_DOB = mdy(month_dob,15,year_dob);
format Total_DOB mmddyy8.
term Season.
sex $Sx.
ethnic Race.;
array change (6) gpa sat satm satv act toefl;
do i = 1 TO 6;
if change(i) = 0 then change(i) = .;
end;
Char_DOBc = put(Char_DOB,mmddyy8.);
drop Char_DOB;
rename Char_DOBc = Char_DOB;
run;
PROC CONTENTS DATA=hi.student;
RUN;
proc print data=hi.student;
format term Season. sex $Sx. ethnic Race. Total_DOB mmddyy8. ;
run;
proc print data=hi.student;
where num_dob=.;
var SSN Total_DOB;
run;

 

Here is my problem that the log is saying:


2359 proc print data=hi.student;
2360 where num_dob=.;
ERROR: Variable num_dob is not on file HI.STUDENT.
2361 var SSN Total_DOB;
2362 run;

 

Super User
Posts: 19,772

Re: print only data with invalid dates

Yes, that would be the correct result given your code. 

 

If I may make a suggestion - comment every single line of the code above. 

It's a good thing to do while learning.

I bet you'll see the error pretty quickly. 

Hint:  order of operations. 

Contributor
Posts: 25

Re: print only data with invalid dates

Sorry there's just alot to do with little time so I've been racing through the code. I guess I don't truley know the order of operations. Do you think you could expand on your hint?

Super User
Posts: 19,772

Re: print only data with invalid dates

If it doesn't exist yet you can't use it.

Ask a Question
Discussion stats
  • 16 replies
  • 266 views
  • 0 likes
  • 4 in conversation