BookmarkSubscribeRSS Feed
rebelde52
Fluorite | Level 6

Hello, 

 

I have an example subset of 1 patient that where I created 2 separate dates. That is "Date" and "Pending_Date".  I also created a variable where each row is either Pending='YES' or Pending='NO'. Below is my table of what I have and what I want. How can I code if Date is less than Pending_Date then exclude='yes' then I would drop my exclusions. 

I tried it using an if then statement but it will only compare dates right next to each other (ex 3/29/2023 > blank date) but i want it to take into consideration the other dates as well. For example Obs 3 is a pending date and not greater Obs 1,2 4,5, and 8 so that wont get excluded. However, Obs 6 is a pending date and is greater than Obs 1,2, and 4 which is how I have my "Want" table.  

 

Again I want my exclusion to be " Pending_Date > Date then Exclude='YES' " but I want all the "Dates" to be taken into consideration. 

 

Whats the best way to approach this?

 

Thank you in advance! 

Have

Obs Patient_IDPendingPending_DateDate
1AB1NO 3/29/22
2AB1NO 3/29/22
3AB1YES3/29/22 
4AB1NO 3/29/22
5AB1NO 4/5/22
6AB1YES4/5/22 
7AB1YES4/5/22 
8AB1NO 4/5/22

 

Want

ObsPatient_IDPendingPending_DateDate
3AB1YES3/29/22 
5AB1NO 4/5/22
6AB1YES4/5/22 
6AB1YES4/5/22 
8AB1NO 4/5/22
6 REPLIES 6
mkeintz
PROC Star

It's not clear what PENDING date you are using as a criterion.  Are all dates compared to the highest PENDING date found, or just to the closest upcoming pending date?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
rebelde52
Fluorite | Level 6
All dates are compared to the highest Pending date found.
Oligolas
Barite | Level 11

I'd use SQL for this:

data have;
infile datalines dsd dlm=' ';
input Obs Patient_ID $ Pending  $ Pending_Date :mmddyy8. Date :mmddyy8.;
format Pending_Date Date mmddyy8.;
datalines;
1 AB1 NO  3/29/22
2 AB1 NO  3/29/22
3 AB1 YES 3/29/22
4 AB1 NO  3/29/22
5 AB1 NO  4/5/22
6 AB1 YES 4/5/22
7 AB1 YES 4/5/22
8 AB1 NO  4/5/22
;
run;

PROC SQL;
   CREATE TABLE excludes AS
      SELECT DISTINCT a.*
      FROM have a, have b
      WHERE a.Patient_ID eq b.Patient_ID 
      AND b.pending_date IS NOT NULL
      AND a.date IS NOT NULL
      AND b.pending_date > a.date
      ORDER BY a.obs
   ;
   CREATE TABLE want AS
      SELECT *
      FROM have
      EXCEPT
      SELECT *
      FROM excludes
   ;
   DROP TABLE excludes;
QUIT;
________________________

- Cheers -

mkeintz
PROC Star

Pass through each ID twice.  In the first pass, read only the PENDING='YES' observations, and retain the last such observation with variable _MAX_PENDING_DATE.  On the second pass output only cases with PENDING='YES' or DATE not preceding _MAX_PENDING_DATE:

data have;
infile datalines truncover;
input Obs Patient_ID $ Pending  $ Pending_Date :mmddyy8. Date :mmddyy8.;
format Pending_Date Date mmddyy8.;
datalines;
1 AB1 NO  .       3/29/22
2 AB1 NO  .       3/29/22
3 AB1 YES 3/29/22
4 AB1 NO  .       3/29/22
5 AB1 NO  .       4/5/22
6 AB1 YES 4/5/22
7 AB1 YES 4/5/22
8 AB1 NO  .       4/5/22
run;

data want (drop=_:);
  set have (in=firstpass where=(pending='YES')) 
      have (in=secondpass);
  by patient_id;

  retain _max_pending_date ;
  if firstpass=1  then _max_pending_date=pending_date;
  if secondpass=1 and (pending='YES' or date>=_max_pending_date);
run;

This program assumes the (1) data are sorted by patient_id, (2) that every patient_id has at least one pending_date observation, and (3) the last pending_date value within a patient_id is always the maximum pending_date.  It makes no assumptions about the order of the other DATE values.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Oligolas
Barite | Level 11

oh yes good point, it's also possible to simply compare to the max since earlier observations must be dropped anyway

PROC SQL;
   CREATE TABLE want AS
      SELECT *
      FROM have
      GROUP BY Patient_ID
      HAVING pending='YES' or Date>=max(Pending_Date) 
      ORDER BY obs
   ;
QUIT;
________________________

- Cheers -

mkeintz
PROC Star

@Oligolas wrote:

oh yes good point, it's also possible to simply compare to the max since earlier observations must be dropped anyway

PROC SQL;
   CREATE TABLE want AS
      SELECT *
      FROM have
      GROUP BY Patient_ID
      HAVING pending='YES' or Date>=max(Pending_Date) 
      ORDER BY obs
   ;
QUIT;

"Comparing to the max" actually means reading all the data twice, once to get the max, and once to compare.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2148 views
  • 1 like
  • 3 in conversation