BookmarkSubscribeRSS Feed
rebelde52
Fluorite | Level 6

Hello, 

 

I have an example subset of 1 patient that where I created 2 separate dates. That is "Date" and "Pending_Date".  I also created a variable where each row is either Pending='YES' or Pending='NO'. Below is my table of what I have and what I want. How can I code if Date is less than Pending_Date then exclude='yes' then I would drop my exclusions. 

I tried it using an if then statement but it will only compare dates right next to each other (ex 3/29/2023 > blank date) but i want it to take into consideration the other dates as well. For example Obs 3 is a pending date and not greater Obs 1,2 4,5, and 8 so that wont get excluded. However, Obs 6 is a pending date and is greater than Obs 1,2, and 4 which is how I have my "Want" table.  

 

Again I want my exclusion to be " Pending_Date > Date then Exclude='YES' " but I want all the "Dates" to be taken into consideration. 

 

Whats the best way to approach this?

 

Thank you in advance! 

Have

Obs Patient_IDPendingPending_DateDate
1AB1NO 3/29/22
2AB1NO 3/29/22
3AB1YES3/29/22 
4AB1NO 3/29/22
5AB1NO 4/5/22
6AB1YES4/5/22 
7AB1YES4/5/22 
8AB1NO 4/5/22

 

Want

ObsPatient_IDPendingPending_DateDate
3AB1YES3/29/22 
5AB1NO 4/5/22
6AB1YES4/5/22 
6AB1YES4/5/22 
8AB1NO 4/5/22
6 REPLIES 6
mkeintz
PROC Star

It's not clear what PENDING date you are using as a criterion.  Are all dates compared to the highest PENDING date found, or just to the closest upcoming pending date?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
rebelde52
Fluorite | Level 6
All dates are compared to the highest Pending date found.
Oligolas
Barite | Level 11

I'd use SQL for this:

data have;
infile datalines dsd dlm=' ';
input Obs Patient_ID $ Pending  $ Pending_Date :mmddyy8. Date :mmddyy8.;
format Pending_Date Date mmddyy8.;
datalines;
1 AB1 NO  3/29/22
2 AB1 NO  3/29/22
3 AB1 YES 3/29/22
4 AB1 NO  3/29/22
5 AB1 NO  4/5/22
6 AB1 YES 4/5/22
7 AB1 YES 4/5/22
8 AB1 NO  4/5/22
;
run;

PROC SQL;
   CREATE TABLE excludes AS
      SELECT DISTINCT a.*
      FROM have a, have b
      WHERE a.Patient_ID eq b.Patient_ID 
      AND b.pending_date IS NOT NULL
      AND a.date IS NOT NULL
      AND b.pending_date > a.date
      ORDER BY a.obs
   ;
   CREATE TABLE want AS
      SELECT *
      FROM have
      EXCEPT
      SELECT *
      FROM excludes
   ;
   DROP TABLE excludes;
QUIT;
________________________

- Cheers -

mkeintz
PROC Star

Pass through each ID twice.  In the first pass, read only the PENDING='YES' observations, and retain the last such observation with variable _MAX_PENDING_DATE.  On the second pass output only cases with PENDING='YES' or DATE not preceding _MAX_PENDING_DATE:

data have;
infile datalines truncover;
input Obs Patient_ID $ Pending  $ Pending_Date :mmddyy8. Date :mmddyy8.;
format Pending_Date Date mmddyy8.;
datalines;
1 AB1 NO  .       3/29/22
2 AB1 NO  .       3/29/22
3 AB1 YES 3/29/22
4 AB1 NO  .       3/29/22
5 AB1 NO  .       4/5/22
6 AB1 YES 4/5/22
7 AB1 YES 4/5/22
8 AB1 NO  .       4/5/22
run;

data want (drop=_:);
  set have (in=firstpass where=(pending='YES')) 
      have (in=secondpass);
  by patient_id;

  retain _max_pending_date ;
  if firstpass=1  then _max_pending_date=pending_date;
  if secondpass=1 and (pending='YES' or date>=_max_pending_date);
run;

This program assumes the (1) data are sorted by patient_id, (2) that every patient_id has at least one pending_date observation, and (3) the last pending_date value within a patient_id is always the maximum pending_date.  It makes no assumptions about the order of the other DATE values.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Oligolas
Barite | Level 11

oh yes good point, it's also possible to simply compare to the max since earlier observations must be dropped anyway

PROC SQL;
   CREATE TABLE want AS
      SELECT *
      FROM have
      GROUP BY Patient_ID
      HAVING pending='YES' or Date>=max(Pending_Date) 
      ORDER BY obs
   ;
QUIT;
________________________

- Cheers -

mkeintz
PROC Star

@Oligolas wrote:

oh yes good point, it's also possible to simply compare to the max since earlier observations must be dropped anyway

PROC SQL;
   CREATE TABLE want AS
      SELECT *
      FROM have
      GROUP BY Patient_ID
      HAVING pending='YES' or Date>=max(Pending_Date) 
      ORDER BY obs
   ;
QUIT;

"Comparing to the max" actually means reading all the data twice, once to get the max, and once to compare.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1287 views
  • 1 like
  • 3 in conversation