Hello,
This is a follow-up question to a solution posted here. I'm trying to calculate cumulative exposure history before disease diagnosis, given age of exposure onset (age), age of disease diagnosis (dxAge), and exposure type (Extype).
My program looks like this:
data have;
format id age dxAge Extype 3.;
input id age dxAge Extype;
* The last data line for each id is the current age;
cards;
1 18 48 1
1 25 48 1
1 48 48 1
1 50 48 1
1 52 48 .
;
run;
proc sort data=have;
by id descending age;
run;
data new;
    set have;
    by id;
    retain found lastAge;
    if first.id then do;
        found = 0;
        lastAge = 0;
    end;
    if lastAge gt 0 then
        time = lastAge - age;
    else
        time = 0;
    if dxAge gt 0 then do; *Lag exposure time to one year before diagnosis;
        if not(found) then do;
            if dxAge - age gt 1 then do;
                found = 1;
                lagTime = (dxAge - age) - 1;
            end;
            else
                lagTime = 0;
        end;
        else
            lagTime = time;
    end;
    else
        lagTime = 0;
    lastAge = age;
run;
proc sort data=new;
by id age;
run;
However, in some instances I have multiple lines of exposure time (e.g., multiple exposure types) for the same age:
data have;
format id age dxAge Extype 3.;
input id age dxAge Extype;
cards;
1 18 48 1
1 18 48 2
1 18 48 3
1 25 48 1
1 48 48 1
1 50 48 1
1 52 48 .
;
run;
My desired output looks like this:
ID Age DxAge Extype Lagtime
1 18 48 1 7
1 18 48 2 7
1 18 48 3 7
1 25 48 1 22
1 48 48 1 0
1 50 48 1 0
1 52 48 . 0
Running the program above only counts the first line of exposure time (25-18=7 years) and sets the next lines to 0. The cumulative time from age 18 to 25 should be 7*3=21 years. Could anyone please suggest a fix for this? Thank you.
@TJ87 wrote:
Running the program above only counts the first line of exposure time (25-18=7 years) and sets the next lines to 0. The cumulative time from age 18 to 25 should be 7*3=21 years. Could anyone please suggest a fix for this? Thank you.
I am wondering about your definition of exposure, or at least what your input variables represent. My first thought on seeing your example data was that something was accidentally repeated; my second, that something was measured repeatedly at age 18. But nothing convinces me that you get 21 years of exposure between a first exposure at age 18 and age 25.
If there is a changing exposure rate, such as radiation measured in rad/hr (or whatever unit is used now), then accumulate the actual radiation based on the measurements, but I don't see anything of that sort in the data presented.
Hello,
Exposure was measured repeatedly at age 18 in this scenario, so we can take the sum of exposure time over the number of exposures. Thanks.
I think the trick is that the program needs to take the difference between age and the next highest age per ID (in this example, from 18 to 25), but I don't know how to code that.
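One possible way to code that idea, offered as a minimal sketch rather than a tested solution: keep the descending sort, add age to the BY statement in the DATA step so that first.age is available, and recompute time and lagTime only on the first row of each distinct age. Rows that repeat the same age then keep the retained values instead of being compared with themselves. The sketch assumes that dxAge is constant within an id and that every row sharing an age should receive the same lagTime; adding time and lagTime to the RETAIN statement and sorting the result by Extype are my additions, not part of the original program.

```sas
proc sort data=have;
    by id descending age;
run;

data new;
    set have;
    by id descending age;
    /* time and lagTime are retained so that rows repeating an age
       inherit the values computed on the first row of that age */
    retain found lastAge time lagTime;

    if first.id then do;
        found = 0;
        lastAge = 0;
    end;

    /* Recompute only once per distinct age within an id */
    if first.age then do;
        if lastAge gt 0 then time = lastAge - age;
        else time = 0;

        if dxAge gt 0 then do;
            /* Lag exposure time to one year before diagnosis */
            if not found then do;
                if dxAge - age gt 1 then do;
                    found = 1;
                    lagTime = (dxAge - age) - 1;
                end;
                else lagTime = 0;
            end;
            else lagTime = time;
        end;
        else lagTime = 0;

        /* Next-highest age, used by the following (lower) age group */
        lastAge = age;
    end;
run;

proc sort data=new;
    by id age Extype;
run;
```

With the second example data set, the three age-18 rows should each get lagTime = 7, the age-25 row 22, and the remaining rows 0, which matches the desired output. Summing lagTime within an id afterwards (for example with PROC MEANS or PROC SQL) would then give the cumulative exposure.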