Hello everyone,
I have a dataset with subjects (ID) that have the same number of follow-ups (Follow-up) with a result in a test (Test with 0 being negative and 1 being positive).
I want to create a new variable (Test_Positive) with last available follow-up in Test variable. I want the Test_Positive variable to have all follow-ups as 0 or 1 according to last available Test value. The simulation below may be more informative.
Please see below:
HAVE:
ID | Follow-up | Test |
1 | 1 | 0 |
1 | 2 | . |
1 | 3 | 1 |
1 | 4 | 1 |
2 | 1 | 0 |
2 | 2 | 0 |
2 | 3 | 0 |
2 | 4 | 0 |
3 | 1 | 1 |
3 | 2 | 1 |
3 | 3 | . |
3 | 4 | . |
WANT:
ID | Follow-up | Test | Test_Positive |
1 | 1 | 0 | 1 |
1 | 2 | . | 1 |
1 | 3 | 1 | 1 |
1 | 4 | 1 | 1 |
2 | 1 | 0 | 0 |
2 | 2 | 0 | 0 |
2 | 3 | 0 | 0 |
2 | 4 | 0 | 0 |
3 | 1 | 1 | 1 |
3 | 2 | 1 | 1 |
3 | 3 | . | 1 |
3 | 4 | . | 1 |
I have tried the following code without success (see output at after code):
proc sort data = have;
by Follow-up;
run;
data want;
set have;
by Follow-up;
Test_Positive = . ;
if last.Test then Test_Positive = last.Test;
if last.Test = . then Test_Positive = .;
else Test_Positive = coalesce(Test,Test_Positive);
retain Test_Positive;
run;
OUTPUT (the error is in 2 first rows where Test_Positive should be 1):
ID | Follow-up | Test | Test_Positive |
1 | 1 | 0 | 0 |
1 | 2 | . | 0 |
1 | 3 | 1 | 1 |
1 | 4 | 1 | 1 |
2 | 1 | 0 | 0 |
2 | 2 | 0 | 0 |
2 | 3 | 0 | 0 |
2 | 4 | 0 | 0 |
3 | 1 | 1 | 1 |
3 | 2 | 1 | 1 |
3 | 3 | . | 1 |
3 | 4 | . | 1 |
Thank you very much for any help!
JJG
You need to pass through each ID twice. In the firstpass find the test_positive value. In the secondpass keep that value and output:
data have;
input ID Followup Test;
datalines;
1 1 0
1 2 .
1 3 1
1 4 1
2 1 0
2 2 0
2 3 0
2 4 0
3 1 1
3 2 1
3 3 .
3 4 .
run;
data want;
set have (in=firstpass)
have (in=secondpass) ;
by id;
retain test_positive;
if first.id then test_positive=.;
if firstpass=1 then test_positive=coalesce(test,test_positive);
if secondpass=1;
run;
You need to pass through each ID twice. In the firstpass find the test_positive value. In the secondpass keep that value and output:
data have;
input ID Followup Test;
datalines;
1 1 0
1 2 .
1 3 1
1 4 1
2 1 0
2 2 0
2 3 0
2 4 0
3 1 1
3 2 1
3 3 .
3 4 .
run;
data want;
set have (in=firstpass)
have (in=secondpass) ;
by id;
retain test_positive;
if first.id then test_positive=.;
if firstpass=1 then test_positive=coalesce(test,test_positive);
if secondpass=1;
run;
That worked out beautifully!
Much appreciated mkeintz!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.