Sequence or first and last variable

10-07-2017 10:29 PM

I have the dataset below and have been trying to subset it into two groups: 1) patients (Patient_No) with any prior positive outcome and any subsequent negative outcome; 2) patients with at least 1 prior positive outcome, at least 1 subsequent negative outcome, at least 90 days between the first low positive or negative outcome and the most recent negative outcome, and a most recent outcome that is negative, with no subsequent positive outcomes. I have tried using PROC SQL and first/last variables to select the patients, but the code is getting too long and messy.

```
Data Diagnose;
Input @1 Patient_No $2.
@3 Date MMDDYY10.
@14 Visit_No $2.
@16 Outcome $12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
;
```

Accepted Solutions

Solution

10-08-2017 12:02 AM

Posted in reply to nerdy2703

10-07-2017 11:52 PM

I hope this implements all of your rules:

```
data sub1 sub2;
    /* DOW loop: read every row for one patient before testing the rules.
       Assumes diagnose is sorted by patient_no and, within patient, by date. */
    do until(last.patient_no);
        set diagnose; by patient_no;
        select (Outcome);
            when ("Positive") do;
                if missing(firstPos) then firstPos = date;
            end;
            when ("Negative") do;
                if missing(firstNeg) then firstNeg = date;
                recentNeg = date;
            end;
            when ("Low positive") do;
                if missing(firstLow) then firstLow = date;
            end;
            otherwise;
        end;
    end;
    /* Subset 1 : any prior positive outcome and any subsequent
       negative outcome. The missing() guard is needed because a
       missing value compares lower than any date in SAS. */
    if not missing(firstPos) and firstPos < firstNeg
        then output sub1;
    /* Subset 2 */
    if
        /* at least 1 prior positive outcome */
        not missing(firstPos) and
        /* at least 1 subsequent negative outcome */
        firstPos < firstNeg and
        /* at least 90 days between the first low positive
           or negative outcome and most recent negative outcome */
        intck("day", min(firstLow, firstNeg), recentNeg) >= 90 and
        /* most recent outcome should be negative and there are no
           subsequent positive outcomes (Outcome holds the last row) */
        Outcome = "Negative"
        then output sub2;
    keep patient_No; /* Applies to both output datasets */
run;
```

PG

All Replies

Posted in reply to nerdy2703

10-07-2017 11:10 PM

Can you post the expected result dataset? This allows us to verify a suggestion before posting it.

Your requirements are complex, so don't expect a single seven-statement solution.

Posted in reply to nerdy2703

10-07-2017 11:13 PM

Please post what you've tried; otherwise we may make suggestions that will not work for you.

Posted in reply to PGStats

10-08-2017 12:03 AM

Yes, it does. Thank you

Posted in reply to PGStats

11-12-2017 10:34 PM

@PGStats, the code works fine, but I noticed that if a patient's first visit is negative, the code does not select that patient even when the rules are met. For example, in the data set below, Patient 4 meets the rules but has a negative first visit.

```
Data Diagnose;
Input @1 Patient_No $2.
@3 Date MMDDYY10.
@14 Visit_No $2.
@16 Outcome $12.;
Format Date MMDDYY10.;
Datalines;
1 10/21/2000 1 Positive
1 10/25/2000 2 Positive
1 11/01/2000 3 Negative
1 05/28/2001 4 Negative
2 11/22/2000 1 Positive
2 11/29/2000 2 Positive
2 12/28/2000 3 Positive
2 06/28/2001 4 Low positive
2 10/29/2001 5 Negative
3 12/12/2000 1 Positive
3 12/29/2000 2 Positive
3 02/21/2001 3 Positive
3 07/12/2001 4 Negative
3 08/29/2001 5 Positive
4 10/21/2000 1 Negative
4 11/25/2001 2 Positive
4 12/01/2001 3 Positive
4 06/28/2002 4 Negative
4 10/26/2002 5 Negative
;
```
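
One possible way to handle Patient 4 is sketched below. It is not PGStats's code, only a variation on it under one assumption: results recorded before a patient's first positive are ignored, so the first low positive or negative outcome and the most recent negative outcome are only tracked once a positive has been seen. The variable `firstNonPos` is a name introduced here for illustration. As in the accepted solution, `diagnose` is assumed to be sorted by `patient_no` and, within patient, by date.

```
data sub1 sub2;
    /* Read all rows of one patient, then test the rules once. */
    do until (last.patient_no);
        set diagnose;
        by patient_no;
        select (Outcome);
            when ("Positive") do;
                if missing(firstPos) then firstPos = date;
            end;
            when ("Negative", "Low positive") do;
                /* Assumption: only results after the first positive count. */
                if not missing(firstPos) then do;
                    /* firstNonPos (hypothetical name): first low positive
                       or negative result after the first positive */
                    if missing(firstNonPos) then firstNonPos = date;
                    if Outcome = "Negative" then recentNeg = date;
                end;
            end;
            otherwise;
        end;
    end;
    /* recentNeg is only set after a positive, so a non-missing value
       means: a positive followed by at least one negative. */
    if not missing(firstPos) and not missing(recentNeg) then do;
        output sub1;
        /* Outcome still holds the patient's last row here. */
        if intck("day", firstNonPos, recentNeg) >= 90
           and Outcome = "Negative"
            then output sub2;
    end;
    keep patient_no;
run;
```

Traced by hand against the data set above, this should put patients 1, 2, 3 and 4 in `sub1` and patients 1, 2 and 4 in `sub2`. For Patient 4 the 90-day window runs from 06/28/2002 to 10/26/2002 (120 days). If the earlier negative on 10/21/2000 should count as the start of that window instead, the `not missing(firstPos)` condition on `firstNonPos` would need to be dropped.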