Hello everyone,
I have a dataset with subjects (ID) that have the same number of follow-ups (Follow-up) with a result in a test (Test with 0 being negative and 1 being positive).
I want to create a new variable (Test_Positive) with last available follow-up in Test variable. I want the Test_Positive variable to have all follow-ups as 0 or 1 according to last available Test value. The simulation below may be more informative.
Please see below:
HAVE:
ID | Follow-up | Test |
1 | 1 | 0 |
1 | 2 | . |
1 | 3 | 1 |
1 | 4 | 1 |
2 | 1 | 0 |
2 | 2 | 0 |
2 | 3 | 0 |
2 | 4 | 0 |
3 | 1 | 1 |
3 | 2 | 1 |
3 | 3 | . |
3 | 4 | . |
WANT:
ID | Follow-up | Test | Test_Positive |
1 | 1 | 0 | 1 |
1 | 2 | . | 1 |
1 | 3 | 1 | 1 |
1 | 4 | 1 | 1 |
2 | 1 | 0 | 0 |
2 | 2 | 0 | 0 |
2 | 3 | 0 | 0 |
2 | 4 | 0 | 0 |
3 | 1 | 1 | 1 |
3 | 2 | 1 | 1 |
3 | 3 | . | 1 |
3 | 4 | . | 1 |
I have tried the following code without success (see output at after code):
proc sort data = have;
by Follow-up;
run;
data want;
set have;
by Follow-up;
Test_Positive = . ;
if last.Test then Test_Positive = last.Test;
if last.Test = . then Test_Positive = .;
else Test_Positive = coalesce(Test,Test_Positive);
retain Test_Positive;
run;
OUTPUT (the error is in 2 first rows where Test_Positive should be 1):
ID | Follow-up | Test | Test_Positive |
1 | 1 | 0 | 0 |
1 | 2 | . | 0 |
1 | 3 | 1 | 1 |
1 | 4 | 1 | 1 |
2 | 1 | 0 | 0 |
2 | 2 | 0 | 0 |
2 | 3 | 0 | 0 |
2 | 4 | 0 | 0 |
3 | 1 | 1 | 1 |
3 | 2 | 1 | 1 |
3 | 3 | . | 1 |
3 | 4 | . | 1 |
Thank you very much for any help!
JJG
You need to pass through each ID twice. In the firstpass find the test_positive value. In the secondpass keep that value and output:
data have;
input ID Followup Test;
datalines;
1 1 0
1 2 .
1 3 1
1 4 1
2 1 0
2 2 0
2 3 0
2 4 0
3 1 1
3 2 1
3 3 .
3 4 .
run;
data want;
set have (in=firstpass)
have (in=secondpass) ;
by id;
retain test_positive;
if first.id then test_positive=.;
if firstpass=1 then test_positive=coalesce(test,test_positive);
if secondpass=1;
run;
You need to pass through each ID twice. In the firstpass find the test_positive value. In the secondpass keep that value and output:
data have;
input ID Followup Test;
datalines;
1 1 0
1 2 .
1 3 1
1 4 1
2 1 0
2 2 0
2 3 0
2 4 0
3 1 1
3 2 1
3 3 .
3 4 .
run;
data want;
set have (in=firstpass)
have (in=secondpass) ;
by id;
retain test_positive;
if first.id then test_positive=.;
if firstpass=1 then test_positive=coalesce(test,test_positive);
if secondpass=1;
run;
That worked out beautifully!
Much appreciated mkeintz!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.