Hello everyone,
I have a dataset with subjects (ID) that have the same number of follow-ups (Follow-up) with a result in a test (Test with 0 being negative and 1 being positive).
I want to create a new variable (Test_Positive) with last available follow-up in Test variable. I want the Test_Positive variable to have all follow-ups as 0 or 1 according to last available Test value. The simulation below may be more informative.
Please see below:
HAVE:
ID | Follow-up | Test |
1 | 1 | 0 |
1 | 2 | . |
1 | 3 | 1 |
1 | 4 | 1 |
2 | 1 | 0 |
2 | 2 | 0 |
2 | 3 | 0 |
2 | 4 | 0 |
3 | 1 | 1 |
3 | 2 | 1 |
3 | 3 | . |
3 | 4 | . |
WANT:
ID | Follow-up | Test | Test_Positive |
1 | 1 | 0 | 1 |
1 | 2 | . | 1 |
1 | 3 | 1 | 1 |
1 | 4 | 1 | 1 |
2 | 1 | 0 | 0 |
2 | 2 | 0 | 0 |
2 | 3 | 0 | 0 |
2 | 4 | 0 | 0 |
3 | 1 | 1 | 1 |
3 | 2 | 1 | 1 |
3 | 3 | . | 1 |
3 | 4 | . | 1 |
I have tried the following code without success (see output at after code):
proc sort data = have;
by Follow-up;
run;
data want;
set have;
by Follow-up;
Test_Positive = . ;
if last.Test then Test_Positive = last.Test;
if last.Test = . then Test_Positive = .;
else Test_Positive = coalesce(Test,Test_Positive);
retain Test_Positive;
run;
OUTPUT (the error is in 2 first rows where Test_Positive should be 1):
ID | Follow-up | Test | Test_Positive |
1 | 1 | 0 | 0 |
1 | 2 | . | 0 |
1 | 3 | 1 | 1 |
1 | 4 | 1 | 1 |
2 | 1 | 0 | 0 |
2 | 2 | 0 | 0 |
2 | 3 | 0 | 0 |
2 | 4 | 0 | 0 |
3 | 1 | 1 | 1 |
3 | 2 | 1 | 1 |
3 | 3 | . | 1 |
3 | 4 | . | 1 |
Thank you very much for any help!
JJG
You need to pass through each ID twice. In the firstpass find the test_positive value. In the secondpass keep that value and output:
data have;
input ID Followup Test;
datalines;
1 1 0
1 2 .
1 3 1
1 4 1
2 1 0
2 2 0
2 3 0
2 4 0
3 1 1
3 2 1
3 3 .
3 4 .
run;
data want;
set have (in=firstpass)
have (in=secondpass) ;
by id;
retain test_positive;
if first.id then test_positive=.;
if firstpass=1 then test_positive=coalesce(test,test_positive);
if secondpass=1;
run;
You need to pass through each ID twice. In the firstpass find the test_positive value. In the secondpass keep that value and output:
data have;
input ID Followup Test;
datalines;
1 1 0
1 2 .
1 3 1
1 4 1
2 1 0
2 2 0
2 3 0
2 4 0
3 1 1
3 2 1
3 3 .
3 4 .
run;
data want;
set have (in=firstpass)
have (in=secondpass) ;
by id;
retain test_positive;
if first.id then test_positive=.;
if firstpass=1 then test_positive=coalesce(test,test_positive);
if secondpass=1;
run;
That worked out beautifully!
Much appreciated mkeintz!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.