Hi,
I have a dataset short of 7 mio entries and two datetime.18 entries LST_UPDT_DT and CLOSE_APRV_DT
I want to delete entries where LST_UPDT_DT is greater than CLOSE_APRV_DT (but set first a new variable (treatment) so that I can check what is happening, before any entries are deleted)
A lot of the CLOSE_APRV_DT are missing, and in this case I do not want to delete the entry. Instead I wanted to call them "missing"
Because and LST_UPDT_DT is greater than any empty CLOSE_APRV_DT SAS 9.4 seems to labels them all as = "DELETE";
What must I do that empty datetimes are treated as empty and not as a date that is smaller than LST_UPDT_DT ?
Data main2;
Set merge;
if CLOSE_APRV_DT = " " then TREATMENT = "MISSING";
If LST_UPDT_DT GT CLOSE_APRV_DT then TREATMENT = "DELETE";
If LST_UPDT_DT LT CLOSE_APRV_DT then TREATMENT = "LEAVE";
run;
Many thanks
Is this closer to what you want?
Data main
Set merge;
if CLOSE_APRV_DT = . then GREATER = "MISSING";
else If LST_UPDT_DT GT CLOSE_APRV_DT then GREATER = "DELETE";
else If LST_UPDT_DT LT CLOSE_APRV_DT then GREATER = "LEAVE";
run;
if you can show very small respresentative data of what you have and what you want then Someone can easily help you.
Data main
Set merge;
if CLOSE_APRV_DT = " " then GREATER = "MISSING";
If LST_UPDT_DT GT CLOSE_APRV_DT then GREATER = "DELETE";
If LST_UPDT_DT LT CLOSE_APRV_DT then GREATER = "LEAVE";
run;
What do I need to do to have all 23Jan01 entries appear as GREATER = "missing" - INSTEAD OF GREATER = "delete"?
reason being in the next sep I want to delete all entries with GREATER = "delete" ?
many thanks
may be something like this. But somehow I feel that I might have not understood your requirement completly
Proc sql; select a.*, case when LST_UPDT_DT is missing and CLOSE_APRV_DT is missing then 'Missing'/* this first when is
not needed as per your requirement*/ when LST_UPDT_DT is missing then 'Missing' when LST_UPDT_DT GT CLOSE_APRV_DT then 'Delete' when LST_UPDT_DT LT CLOSE_APRV_DT then 'Leave' end as GREATER from yourtable a ; quit;
Is this closer to what you want?
Data main
Set merge;
if CLOSE_APRV_DT = . then GREATER = "MISSING";
else If LST_UPDT_DT GT CLOSE_APRV_DT then GREATER = "DELETE";
else If LST_UPDT_DT LT CLOSE_APRV_DT then GREATER = "LEAVE";
run;
You need an ELSE statement so that once a condition is TRUE you don't execute any subsequent IF statements ( @SASKiwi posted already the code for this).
In your code there isn't an ELSE and though all IF statements get executed.
In a logical condition a missing numeric value gets treated as minus eternal.
In case variable CLOSE_APRV_DT is missing the 1st and and the 2nd IF statement will be TRUE. The first IF statement will populate variable GREATER with "MISSING" but then comes the 2nd IF statement which is also TRUE and it will replace the value of GREATER with "DELETE" ....and though you always end up with DELETE. Using if...then...ELSE will avoid this issue and once a condition is TRUE no further tests will be performed.
Also: You've missed the case where LST_UPDT_DT = CLOSE_APRV_DT.
You can cover this by either implementing a 4th and last statement: ELSE GREATER="..." or you're using in either the 2nd or 3rd if an operator with E (...GT -> GE or LT -> LE) to include the equal case.
From a "coding good practice" perspective:
Always try to cover all cases with explicit conditions and then have as last statement a ELSE <variable>='<undefined>'; to capture everything that you've missed.
This allows you then to clearly identify cases you've missed and deal with them - or may be they are really not relevant to you so you just make it clear that they aren't (i.e. with assigning a value of "na").
Valis points Patrick, thank you very much. And nicely explained how to conduct the statements in a logical way. Thank you very much!
Anna
From the data you displayed, I infer that the datetime values are not strings, but real numeric SAS datetime values.
So you should not compare with an empty string, but with the dot, which is the correct representation for a missing numerical value.
The fact that SAS does automatic conversions for you should not keep you from programming clean code.
Since a missing value is always considered to be smaller than any other value, just a slight reorder of your statements might do the trick:
Data main2;
Set merge;
If LST_UPDT_DT GT CLOSE_APRV_DT then TREATMENT = "DELETE";
If LST_UPDT_DT LT CLOSE_APRV_DT then TREATMENT = "LEAVE";
if CLOSE_APRV_DT = . then TREATMENT = "MISSING";
run;
Could be a good example for the binary ifn/ifc functions, that way the if is evaluated only once:
data main2; set merge; treatment=ifc(lst_updt gt close_aprv_dt,"DELETE","LEAVE"); if close_aprv=. then treatment="MISSING"; run;
OP, as a tip, its not a good idea to call your datasets things like "merge" which are SAS keywords. It really doesn't sound right doing:
set merge;
Thank you Kurt, yes datetime18.0 is numeric, and I was trying to find information about whether I had to convert this into another timeframe or not. But best to leave it. Saying this. I was not able to display age from DOB datetime.18, but had to conert it.
Many thanks. Anna
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.