nerdy2703
Fluorite | Level 6

I have the dataset below. I am trying to build two subsets: 1) patients (Patient_No) with any prior positive outcome and any subsequent negative outcome; 2) patients with at least one prior positive outcome, at least one subsequent negative outcome, at least 90 days between the first low positive or negative outcome and the most recent negative outcome, and a negative most recent outcome with no subsequent positive outcomes. I have tried using PROC SQL and first./last. processing to select the patients, but the code is getting too long and messy.

 

Data Diagnose;
Input @1 Patient_No $2.
@3 Date MMDDYY10.
@14 Visit_No $2.
@16 Outcome $12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
;

Accepted Solutions
PGStats
Opal | Level 21

I hope this implements all of your rules:

 


data sub1 sub2;
/* Read every visit for one patient, then apply the rules once per patient */
do until(last.patient_no);
    set diagnose; by patient_no;
    select (Outcome);
        when ("Positive") do;
            if missing(firstPos) then firstPos = date;
            end;
        when ("Negative") do;
            if missing(firstNeg) then firstNeg = date;
            recentNeg = date;
            end;
        when("Low positive") do;
            if missing(firstLow) then firstLow = date;
            end;
        otherwise;
        end;
    end;

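/* Subset 1 :  any prior positive outcome and any subsequent 
   negative outcome. A missing value compares lower than any date,
   so firstPos is checked explicitly. */
if not missing(firstPos) and firstPos < firstNeg 
then output sub1;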

/* Subset 2 */
if
    /* at least 1  prior positive outcome */
    not missing(firstPos) and
    /* at least 1 subsequent negative outcome */
    firstPos < firstNeg and
    /* at least 90 days between the first low positive 
       or negative outcome and most recent negative outcome */
    intck("day", min(firstLow, firstNeg), recentNeg) >= 90 and
    /* most recent outcome should be negative and there are no 
       subsequent positive outcomes */
    Outcome = "Negative" 
then output sub2;

keep patient_No; /* Applies to both output datasets */
run;
PG
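
The DATA step above uses BY-group processing and takes the first and most recent dates in the order the rows are read, so it assumes that diagnose is sorted by patient and visit date. The thread does not show that step; a minimal sketch, assuming the dataset and variable names from the question, might look like this (the sample data already happen to be in order):

```sas
/* Sort visits chronologically within each patient before BY-group processing */
proc sort data=diagnose;
    by patient_no date;
run;
```

Patient_No is a character variable, so the patients sort in character order rather than numeric order. That does not affect the grouping itself.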

5 REPLIES 5
error_prone
Barite | Level 11
Can you post the expected result dataset? This allows us to verify a suggestion before posting it.

Your requirements are complex, so don't expect a single seven-statement solution.
Reeza
Super User

Please post what you've tried, otherwise we may make suggestions that will not work for you.

nerdy2703
Fluorite | Level 6
Yes, it does. Thank you
nerdy2703
Fluorite | Level 6

@PGStats, the code works fine, but I noticed that it fails to apply the rules when the first visit is negative. For example, in the data set below, Patient 4 meets the rules but has a negative first visit.

Data Diagnose;
Input @1 Patient_No $2.
@3 Date MMDDYY10.
@14 Visit_No $2.
@16 Outcome $12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
4 10/21/2000 1 Negative
4 11/25/2001 2 Positive
4 12/01/2001 3 Positive
4 06/28/2002 4 Negative
4 10/26/2002 5 Negative
;
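
The thread ends without a reply to this follow-up. One possible adjustment, offered as a sketch rather than a confirmed fix, is to ignore any negative or low positive result dated before the patient's first positive, so that "subsequent" is measured from that first positive. The output names sub1_alt and sub2_alt and the variable firstLowNeg are hypothetical, and the reading of the 90-day rule (counted from the first low positive or negative result after the first positive) is an assumption about the intended requirement.

```sas
/* Sketch only: assumes diagnose is sorted by patient_no and date */
data sub1_alt sub2_alt;
do until(last.patient_no);
    set diagnose; by patient_no;
    select (Outcome);
        when ("Positive") do;
            if missing(firstPos) then firstPos = date;
            end;
        when ("Negative", "Low positive") do;
            /* Assumption: results before the first positive are ignored */
            if not missing(firstPos) then do;
                if missing(firstLowNeg) then firstLowNeg = date;
                if Outcome = "Negative" then recentNeg = date;
                end;
            end;
        otherwise;
        end;
    end;

/* Subset 1: a positive followed by a later negative */
if not missing(recentNeg) and firstPos < recentNeg
then output sub1_alt;

/* Subset 2: same, plus at least 90 days from the first low positive or
   negative after the first positive to the most recent negative, and
   the patient's last recorded outcome is negative */
if not missing(recentNeg) and firstPos < recentNeg and
    intck("day", firstLowNeg, recentNeg) >= 90 and
    Outcome = "Negative"
then output sub2_alt;

keep patient_No;
run;
```

Tracing the sample data by hand, this should place patients 1 to 4 in sub1_alt and patients 1, 2 and 4 in sub2_alt. Patient 3 is excluded from the second subset because the last visit is positive.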
