Dear Friend,
I am new to SAS and can’t find way around some code; please help if you can.
I am working on data set (see first rows below of the dataset below) and would like to pick patients with two GFR=1 and date between them (GFR_D) between them of at least 90 days (≥90 days). In clinical settings this is how patients diagnosed with chronic kidney disease (CKD). Patients must have at least two subsequent GFR measures that are more than 90 days apart before they are labeled CKD patients. The abnormal measures (GFR=1) must not have a normal value between them ( e.g. patient 8 doesn’t have CKD!)
Data I have.
PATIENT | GFR_D | GFR |
1 | 2011-01-31 | 0 |
1 | 2011-06-16 | 1 |
1 | 2011-10-26 | 1 |
1 | 2012-03-13 | 0 |
1 | 2012-07-19 | 0 |
1 | 2012-11-22 | 1 |
1 | 2013-02-27 | 1 |
1 | 2013-06-28 | 1 |
1 | 2013-11-08 | 0 |
1 | 2014-03-11 | 0 |
1 | 2014-07-02 | 1 |
2 | 2011-04-20 | 0 |
2 | 2011-09-22 | 0 |
2 | 2012-01-16 | 0 |
2 | 2012-04-19 | 0 |
3 | 2012-07-26 | 1 |
3 | 2012-11-15 | 0 |
3 | 2013-03-21 | 1 |
4 | 2013-07-09 | 1 |
4 | 2013-10-29 | 1 |
4 | 2015-06-17 | 1 |
5 | 2011-02-01 | 0 |
5 | 2011-05-02 | 0 |
6 | 2011-07-31 | 1 |
6 | 2011-10-29 | 0 |
7 | 2011-06-16 | 1 |
7 | 2011-10-26 | 0 |
7 | 2012-03-13 | 1 |
7 | 2012-07-19 | 0 |
7 | 2012-11-22 | 1 |
7 | 2013-02-27 | 1 |
7 | 2013-06-28 | 1 |
8 | 2011-06-16 | 1 |
8 | 2011-10-26 | 0 |
8 | 2012-03-13 | 1 |
8 | 2012-07-19 | 0 |
8 | 2012-11-22 | 1 |
8 | 2013-11-27 | 0 |
9 | 2011-06-16 | 1 |
9 | 2011-10-26 | 0 |
9 | 2012-03-13 | 1 |
9 | 2012-07-19 | 0 |
9 | 2012-11-22 | 1 |
9 | 2013-11-27 | 1 |
9 | 2013-12-28 | 1 |
10 | 2013-11-27 | 1 |
10 | 2016-05-28 | 1 |
The data I want should contain only patients who have GFR=1 at any two dates that are at least 90 days apart. In would also like the date when the patient became diagnosed with CKD (That is the first date when GFR was 1 and the subsequent GFR was more than 90 days after and = 1)
Data I want
PATIENT | GFR_D |
1 | 2011-06-16 |
4 | 2013-07-09 |
7 | 2012-11-22 |
10 | 2013-11-27 |
Thank you so much.
Dathan Byonanebye
That will be easier in a data step. Use RETAIN to remember if/when the earlier test was and whether you found any with difference larger than 90 days.
data want ;
set have ;
by patient gfr_d ;
retain first_date diff found;
format first_date yymmdd10.;
if first.patient then call missing(first_date,diff,found);
if (not found) then do;
if gfr then do;
if missing(first_date) then first_date=gfr_d ;
else do;
diff = gfr_d - first_date;
if diff >= 90 then found=1;
end;
end;
else call missing(first_date);
end;
if last.patient and found then output want;
keep patient first_date diff ;
run;
Obs PATIENT first_date diff 1 1 2011-06-16 132 2 4 2013-07-09 112 3 7 2012-11-22 97 4 9 2012-11-22 370 5 10 2013-11-27 913
Why isn't patient 3 on the list? It has dates of 2012-07-26 and 2013-03-21 .
data have;
input PATIENT GFR_D GFR ;
informat gfr_d yymmdd.;
format gfr_d yymmdd10.;
put patient gfr_d gfr ;
cards;
1 2011-01-31 0
1 2011-06-16 1
1 2011-10-26 1
1 2012-03-13 0
1 2012-07-19 0
1 2012-11-22 1
1 2013-02-27 1
1 2013-06-28 1
1 2013-11-08 0
1 2014-03-11 0
1 2014-07-02 1
2 2011-04-20 0
2 2011-09-22 0
2 2012-01-16 0
2 2012-04-19 0
3 2012-07-26 1
3 2012-11-15 0
3 2013-03-21 1
4 2013-07-09 1
4 2013-10-29 1
4 2015-06-17 1
5 2011-02-01 0
5 2011-05-02 0
6 2011-07-31 1
6 2011-10-29 0
7 2011-06-16 1
7 2011-10-26 0
7 2012-03-13 1
7 2012-07-19 0
7 2012-11-22 1
7 2013-02-27 1
7 2013-06-28 1
8 2011-06-16 1
8 2011-10-26 0
8 2012-03-13 1
8 2012-07-19 0
8 2012-11-22 1
8 2013-11-27 0
9 2011-06-16 1
9 2011-10-26 0
9 2012-03-13 1
9 2012-07-19 0
9 2012-11-22 1
9 2013-11-27 1
9 2013-12-28 1
10 2013-11-27 1
10 2016-05-28 1
;
proc sql;
select patient
, min(gfr_d) as first_date format=yymmdd10.
, max(gfr_d) as last_date format=yymmdd10.
, range(gfr_d) as range
from have
where gfr=1
group by patient
having range(gfr_d) >= 90
;
quit;
PATIENT first_date last_date range ------------------------------------------ 1 2011-06-16 2014-07-02 1112 3 2012-07-26 2013-03-21 238 4 2013-07-09 2015-06-17 708 7 2011-06-16 2013-06-28 743 8 2011-06-16 2012-11-22 525 9 2011-06-16 2013-12-28 926 10 2013-11-27 2016-05-28 913
Shouldn't patient 9 be included as well?
9 2011-06-16 1
9 2011-10-26 0
9 2012-03-13 1
9 2012-07-19 0
9 2012-11-22 1
9 2013-11-27 1 ->365+ days apart?
9 2013-12-28 1
If so, this works for me. You'll need to unduplicate the list but this lets you track the logic to ensure it's correct.
data want;
set have;
by patient gfr notsorted;
x=dif(gfr_d);
if first.gfr then x=.;
if gfr=1 and not first.gfr and x>90 then flag=1;
else flag=0;
run;
Unique List
data temp;
set have;
by patient gfr notsorted;
x=dif(gfr_d);
if first.gfr then x=.;
if gfr=1 and not first.gfr and x>90 then output;
run;
proc sql;
create table want as
select distinct patient
from temp;
quit;
Thank you Reeza
That will be easier in a data step. Use RETAIN to remember if/when the earlier test was and whether you found any with difference larger than 90 days.
data want ;
set have ;
by patient gfr_d ;
retain first_date diff found;
format first_date yymmdd10.;
if first.patient then call missing(first_date,diff,found);
if (not found) then do;
if gfr then do;
if missing(first_date) then first_date=gfr_d ;
else do;
diff = gfr_d - first_date;
if diff >= 90 then found=1;
end;
end;
else call missing(first_date);
end;
if last.patient and found then output want;
keep patient first_date diff ;
run;
Obs PATIENT first_date diff 1 1 2011-06-16 132 2 4 2013-07-09 112 3 7 2012-11-22 97 4 9 2012-11-22 370 5 10 2013-11-27 913
Thank you Tom
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.