Help using Base SAS procedures

checking missing date

Accepted Solution Solved
Reply
Regular Contributor
Posts: 190
Accepted Solution

checking missing date

here am writing d code to check missing day month year from the input below  & display the correct format according to the given values below. my input dates are 

-MMM-YYYY 
--YYYY 
DD--YYYY 
DD-MMM- 

the values to display according to the coressponding input value in the output are-->

-MMM-YYYY = 01-MMM-YYYY
--YYYY = 01-JAN-YYYY
DD--YYYY = DD-JAN-YYYY
DD-MMM- = DD-MMM-0000

 

 

here is my code to take string & display it but it gives a missing output what can i do to read the input then see wats missing then display the respective outputs

 

data date;
str ='-jan-1988';
date1= input(str,anydtdte11.);
format date1 mmddyyd10.
run;

 


Accepted Solutions
Solution
‎02-06-2016 04:26 AM
Super User
Posts: 11,343

Re: checking missing date

Part of your issue is you haven't broken the code up to identify all of your possible cases. 1) all values missing, 2) all but day missing, 3) all but month missing, 4) all but year missing, 5) ony missing day, 6) only missing month, 7) missing only year and finally 8) missing no values.

 

For instance with your third record and the value of 02--1978

None of this code gets executed:

if day eq "-" and month eq "-" then year=substr(birthdate,3,4);
else if day eq "-" and month ne "-" then year=substr(birthdate,6,4);
else if year eq "-" then year="0000";
else if month eq "-" then year=substr(birthdate,5,4);

because "day" = 0, "month"=2 and "year"=7

 

And for the case where day is ne - and month ne - you never re-read the year so you get that single digit.

 

I'm answering this away from my SAS install and have an idea I can't test at this time that may be much simpler over all.

 

 

 

 

View solution in original post


All Replies
Super User
Super User
Posts: 7,955

Re: checking missing date

What is your rules for what gets replaced in the missing parts.  SAS dates can only have all parts presetn, so you need rules like:

If day is missing then assume 01

If month is missing then assume Jan

If Year is missing then assume 2015

 

Then you code those rules in:

data want;
  set have;
  day=substr(date,1,2);
  month=substr(date,3,3);
  year=substr(date,6,4);
  if day="--" then day="01";
  if month="--" then month="JAN";
  if year="--" then year="2015";
  converted_data=input(cats(day,month,year),date9.);
  format converted_date date9.;
run;

The above is to show the working.

Regular Contributor
Posts: 190

Re: checking missing date

hello RW9 here my 4 inputs are having missing day month year & day month.. so u say i create a data set with my inputs & then use ur logic & can i use it as macro..
Super User
Super User
Posts: 7,955

Re: checking missing date

Yes, what you want to do is to check each part of the string provided in date.  You do this by using:

if <condition> then <result>;

Setup (or similar select when).  

Converting the string to date will just create missings where the complete date is not present, as SAS does not have a "default" for missing data.

 

As for "can i use it as macro..", not directly, this code is written in Base SAS, for use with datasets - i.e. the language used to process data.  Macro is a text generation facility, it is not advisable to start doing data related processing in a text generation language, otherwise you will end up with messy obfuscated code.  The most important thing to learn about any technology is when to use it, and more impoartantly when not to use it.

Regular Contributor
Posts: 190

Re: checking missing date

@RW9 am using a input dataset have still there error exist
Regular Contributor
Posts: 190

Re: checking missing date

@ rw9 here's my log with the code & errors...

57 data have;
58 input birthdate $;
59 cards;

NOTE: The data set WORK.HAVE has 4 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.02 seconds

64 run;

65
66 data want;
67 set have;
68 day=substr(birthdate,1,2);
69 month=substr(birthdate,5,7);
70 year=substr(birthdate,8,6);
71 if day="-" then day="01";
72 if month="--" then month="JAN";
73 if year="-" then year="0000";
74 converted_date=input(cats(day,month,year),date9.);
75 format converted_date date9.;
76 run;

NOTE: Invalid third argument to function SUBSTR at line 69 column 9.
NOTE: Invalid third argument to function SUBSTR at line 70 column 8.
NOTE: Invalid argument to function INPUT at line 74 column 18.
birthdate=-Jan-197 day=-J month=-197 year=7 converted_date=. _ERROR_=1 _N_=1
NOTE: Invalid third argument to function SUBSTR at line 69 column 9.
NOTE: Invalid third argument to function SUBSTR at line 70 column 8.
NOTE: Invalid argument to function INPUT at line 74 column 18.
birthdate=--1977 day=-- month=77 year= converted_date=. _ERROR_=1 _N_=2
NOTE: Invalid third argument to function SUBSTR at line 69 column 9.
NOTE: Invalid third argument to function SUBSTR at line 70 column 8.
NOTE: Invalid argument to function INPUT at line 74 column 18.
birthdate=02--1978 day=02 month=1978 year=8 converted_date=. _ERROR_=1 _N_=3
NOTE: Invalid third argument to function SUBSTR at line 69 column 9.
NOTE: Invalid third argument to function SUBSTR at line 70 column 8.
NOTE: Invalid argument to function INPUT at line 74 column 18.
birthdate=03-Jan- day=03 month=an- year= converted_date=. _ERROR_=1 _N_=4
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to
missing values.
Each place is given by: (Number of times) at (Line)Smiley SadColumn).
4 at 74:18
NOTE: There were 4 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 4 observations and 5 variables.
Regular Contributor
Posts: 190

Re: checking missing date

[ Edited ]

@RW9here my input datastep

data have;
input birthdate $;
cards;
-Jan-1975
--1977
02--1978
03-Jan-
run;

& its log

birthdate
1 -Jan-197
2 --1977
3 02--1978

Super User
Super User
Posts: 7,955

Re: checking missing date

[ Edited ]

So what is your logic here, your data doesn't seem to match.  Take two of the examples:

--1977
02--1978

In the first, month is one single "-", in the second month is two "-".  It shiould either be one - for missing, or 3 - missing -.

What are the rules to identify the three parts from the date?

 

I have done what I can with the data, its really not good to accept un-logical data like this though:

data have;
  length day $2 month $3 year $4 birthdate $11;
  input birthdate $;
cards;
-Jan-1975
--1977
02--1978
03-Jan- 
run;

data want;
  set have;
  length day $2 month $3 year $4;
  month=compress(birthdate," ","ka");
  birthdate=tranwrd(compress(tranwrd(birthdate,month,"")," "),"--","-");
  day=substr(birthdate,1,index(birthdate,"-"));
  birthdate=substr(birthdate,index(birthdate,"-")+1);
  year=birthdate;
run;
Regular Contributor
Posts: 190

Re: checking missing date

for example see this
01-Jan-1975
05-jan-1977
02-jan-1978
03-Jan-0000
this is the actual format the dash is between day-month-year u get it so in the 2 & 3 card line its '--' '--'
Super User
Super User
Posts: 7,955

Re: checking missing date

Sorry, I don't see any "--" in that exampe data.  From that example data I would just use scan, but this doesn't show missings.  Post smoe test data which shows your data, exactly, with each of the combinations.

Regular Contributor
Posts: 190

Re: checking missing date

-Jan-1975
--1977
02--1978
03-Jan- this are the dates i have.
so for example --1977 so its converted date format is 05-jan-1977
Super User
Super User
Posts: 7,955

Re: checking missing date

Sorry, we are back to my previous point then:

So what is your logic here, your data doesn't seem to match.  Take two of the examples:

--1977
02--1978

In the first, month is one single "-", in the second month is two "-".  It shiould either be one - for missing, or 3 - missing -.

What are the rules to identify the three parts from the date?

 

There's no logic provided which states in one record month could be --, in another record it could be -.

 

To add to that, how do you know that --1977 should be 05-jan-1977?  What are your rules on the data, how are you going to clean it.  The code I provided should give you a good start in splitting the string up, you now need to work out what your logic and rules will be on missing data.

Regular Contributor
Posts: 190

Re: checking missing date

@RW9 can i use ur first code logic against my new inputs -Jan-1975
--1977
02--1978
03-Jan- ?? can we change the if the conditions or the substr condition..?
Regular Contributor
Posts: 190

Re: checking missing date

my original post
-MMM-YYYY = 01-MMM-YYYY
--YYYY = 01-JAN-YYYY
DD--YYYY = DD-JAN-YYYY
DD-MMM- = DD-MMM-0000

inputs to the left & the required output needed to the right after '=' so....my input dataset i created based on the values in the left & want to display the values to the right as our first logic example u provided.. can we do it to the input code
Super User
Super User
Posts: 7,955

Re: checking missing date

Yes, so as mentioned before its two part.  First separate out the individual parts.  You are ignoring my point with regards to logical assignment of -- versus -, so I will just make assumptions.  The second part is to assign the required replacements for missing values, and then input the result.  

In the datastep below, the first part of the datastep separates out the components, the second adds in your rules.

data have;
  length day $2 month $3 year $4 birthdate old $11;
  input birthdate $;
  old=birthdate;
cards;
-Jan-1975
--1977
02--1978
03-Jan- 
run;

data want;
  set have;
  length day $2 month $3 year $4;
  month=compress(birthdate," ","ka");
  birthdate=tranwrd(compress(tranwrd(birthdate,month,"")," "),"--","-");
  day=tranwrd(substr(birthdate,1,index(birthdate,"-")),"-","");
  birthdate=substr(birthdate,index(birthdate,"-")+1);
  year=birthdate;
  /* Your rules here */
  if day="" and month ne "" and year ne "" then new_date=input(cats("01",month,year),date9.);
  else if day="" and month="" and year ne "" then new_date=input(cats("01JAN",year),date9.);
  else if day ne "" and month="" and year ne "" then new_date=input(cats(day,"JAN",year),date9.);
  else if day ne "" and month ne "" and year="" then new_date=input(cats(day,month,"2000"),date9.); 
  format new_date date9.;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 17 replies
  • 1671 views
  • 0 likes
  • 3 in conversation