Hello,
I have a data set with several pulmonary function test dates for each patient (i.e. pftsdate1 pftsdate2...pftsdate20) where patients had gotten FEV and FVC taken at each date (recorded as fevdate1 fvcdate1...fevdate20 fvcdate20). These measurements were pulled from patients charts and so could have taken place many years before the patients were enrolled in the study that the data was collected for. I also have a variable of when a patient was enrolled in the study (enrolldate). I am currently creating a table that averages the FEV measurements for the pulmonary function test measurements taken closest to enrollment, but not after enrollment. This means, for all patients, I have to find the pftsdate that corresponds to the most recent one prior to that patient's enroll date, then create new variables called "tablefev" and "tablefvc" that pulls the fevdate information from that pftsdate.
I am in the middle of creating a long code that hopefully gives me this, but I feel like there has to be a 10 times easier loop way of getting a column of these numbers.
Any help? Thanks.
Here's the sort of code that could go into a DATA step. If I understand correctly, the FEV and FVC variables contain the actual measurements, even though the variable names contain the word "DATE".
array pftsdate {20};
array fevdate {20};
array fvcdate {20};
do _n_=1 to 20 until (pftsdate{_n_} > enrolldate);
if pftsdate{_n_} < enrolldate then do;
if fevdate{_n_} > . then closest_fev = fevdate{_n_};
if fvcdate{_n_} > . then closest_fvc = fvcdate{_n_};
end;
end;
See if this works for what you need. Good luck.
Here's the sort of code that could go into a DATA step. If I understand correctly, the FEV and FVC variables contain the actual measurements, even though the variable names contain the word "DATE".
array pftsdate {20};
array fevdate {20};
array fvcdate {20};
do _n_=1 to 20 until (pftsdate{_n_} > enrolldate);
if pftsdate{_n_} < enrolldate then do;
if fevdate{_n_} > . then closest_fev = fevdate{_n_};
if fvcdate{_n_} > . then closest_fvc = fvcdate{_n_};
end;
end;
See if this works for what you need. Good luck.
Thanks! Yes, pftsdate has the date while the other two are the actual measurements? How do I incorporate enrolldate into this? By that I mean how do I ensure that the pftsdate I am geting the most recent prior to enroll date? The only way I can think of is taking the difference from the dates (enrolldate-pftsdate{_n_ ) and writing a command that finds the smallest negative number?
That part is already built in.
I'm making an assumption that I hope is valid, that the PFTSDATEs are in order from earliest to latest. So as the loop goes from 1 to 20 it always moves to a later date. At the same time, inside the loop the logic compares the PFTSDATE to ENROLLDATE. So as long as the dates are in order, you shouldn't need any further comparisons with ENROLLDATE.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.