BookmarkSubscribeRSS Feed
monday89
Fluorite | Level 6

I have the following dataset.

What I would like to do is

  • if group = 2 and (stop_date_time_1 is NOT MISSING and occurs on or before final_stop_date_time) then keep = 1
  • else if group = 2 and (stop_date_time_1 is missing) and start_date_time_1 occurs AFTER final_stop_date_time) then keep = 0
SubjectID final_stop_date_time start_date_time_1 stop_date_time_1 group
11 12FEB2020:10:59:00 11FEB2020:15:30:00 11FEB2020:16:08:00 2
11 12FEB2020:10:59:00 11FEB2020:16:08:00 12FEB2020:10:59:00 2
11 12FEB2020:10:59:00 12FEB2020:21:40:00 . 2

Want: 

SubjectID final_stop_date_time start_date_time_1 stop_date_time_1 group keep
11 12FEB2020:10:59:00 11FEB2020:15:30:00 11FEB2020:16:08:00 2 1
11 12FEB2020:10:59:00 11FEB2020:16:08:00 12FEB2020:10:59:00 2 1
11 12FEB2020:10:59:00 12FEB2020:21:40:00 . 2 0

I tried the following and doesn't work

data want; set have; 

if group = '2' and 
(stop_date_time_1 <= final_stop_date_time) then keep = 1 ;
else if 
(group = '2' and ((start_date_time_1 > final_stop_date_time ) and missing(stop_date_time_1)) OR 
(group = '2' and (stop_date_time_2 > final_stop_date_time)) 
then keep = 0;
run;

4 REPLIES 4
ballardw
Super User

Doesn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the <> to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the <> icon or attached as text to show exactly what you have and that we can test code against.

 

I'm going to go out on a limb and guess that part of the problem is you get Keep=1 for missing values of the Stop_date_time.

if group = '2' and 
not missing(stop_date_time) and
(stop_date_time_1 <= final_stop_date_time) then keep = 1 ;

Missing values are always less than any given value. So you need to explicitly limit that part.

Alternate could be to provide a lower limit instead of the missing function:

if group = '2' and 
( 0 <= stop_date_time_1 <= final_stop_date_time) then keep = 1 ;

but that approach may not be appropriate if you do not know what your minimum actual value of the stop_date_time should be.

I am assuming that your datetime values are SAS datetimes valued. If they are character then any comparison with < or > is almost always a very bad idea. 'January' comes after 'April' in terms of inequalities.

monday89
Fluorite | Level 6

There is no error but this how final dataset looks. The last record should keep = 0 because stop_date_time_2 is null and start_date_time_1 occurs after final_stop_date_time

 

SubjectID final_stop_date_time start_date_time_1 stop_date_time_2 group keep
11 12FEB2020:10:59:00 11FEB2020:15:30:00 11FEB2020:16:08:00 2 1
11 12FEB2020:10:59:00 11FEB2020:16:08:00 12FEB2020:10:59:00 2 1
11 12FEB2020:10:59:00 12FEB2020:21:40:00 . 2 1

 

ballardw
Super User

Count the ( and ) in this:

(group = '2' and ((start_date_time_1 > final_stop_date_time ) and missing(stop_date_time_1)) OR 
(group = '2' and (stop_date_time_2 > final_stop_date_time)) 

you have an unbalanced () somewhere.

 

Not nice to include variables in code without example data.

 

Since every single one of your conditions involves group='2' a good style choice is to factor out the common element:

If group='2' then do;
   <condition one>
   <next condition>
   <etc>
end;

which will make it easier to read and understand that this block all involves group='2'. And sometimes you can introduce logic errors with minor typos for repeated code.

smantha
Lapis Lazuli | Level 10
data new;
informat final_stop_date_time	start_date_time_1	stop_date_time_1 datetime19.;
input SubjectID	final_stop_date_time	start_date_time_1	stop_date_time_1	group $;
format final_stop_date_time	start_date_time_1	stop_date_time_1 datetime19.;
if group = '2' and 
(stop_date_time_1 <= final_stop_date_time) then keep = 1 ;
if 
(group = '2' and ((start_date_time_1 >= final_stop_date_time ) and missing(stop_date_time_1))) OR 
(group = '2' and (stop_date_time_1 > final_stop_date_time)) 
then keep = 0;
cards;;;
11	12FEB2020:10:59:00	11FEB2020:15:30:00	11FEB2020:16:08:00	2
11	12FEB2020:10:59:00	11FEB2020:16:08:00	12FEB2020:10:59:00	2
11	12FEB2020:10:59:00	12FEB2020:21:40:00	.	2
;;;

run;

reasons why it did not work before:

1. (stop_date_time_2 > final_stop_date_time)  stop_date_time_2 does not exist

2. It did not go into else condition as when the date column is missing first if condition is always true

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 517 views
  • 0 likes
  • 3 in conversation