Hi all,
I have a long dataset with one row for each day an individual participated in a study, and multiple rows for individuals with multiple entry and exit dates, as well as multiple infections with dengue. I want to calculate person-days, but I do not know how to account for individuals who had dengue cases as they should be excluded from contributing person time for one month following illness.
I tried this code:
%MACRO lag (num);
DATA dengue.expand2;
set dengue.expand;
lag_dengue&num. = lag&num.(resfinaldengue1);
if codigo ne lag&num.(codigo) then lag_dengue&num. = .;
output;
run;
%MEND;
%lag(1);
%lag(2);
%lag(3);
%lag(4);
%lag(5);
%lag(6);
%lag(7);
%lag(8);
%lag(9);
%lag(10);
%lag(11);
%lag(12);
%lag(13);
%lag(14);
%lag(15);
%lag(16);
%lag(17);
%lag(18);
%lag(19);
%lag(20);
%lag(21);
%lag(22);
%lag(23);
%lag(24);
%lag(25);
%lag(26);
%lag(27);
%lag(28);
%lag(29);
%lag(30);
*Adjusting the number of person-days contributed by codigo based on if they had Dengue;
DATA dengue.expand3;
set dengue.expand2;
if (lag_dengue30='.' AND resfinaldengue1=1) then p_day=0;
else p_day = 1;
run;
Where:
codigo = participant ID
resfinaldengue1 = 1 if positive for dengue, 0 if no dengue
p_day = person-day
I also have variables for entry and exit dates, infection start date, and infection start date #2 and #3 if multiple infections.
This code made it so that for individuals who did have dengue, the first 30 days under their codigo would be person day = 0. I was wondering if anyone had other suggestions for how to code for this, since this code does not account for individuals who had more than one dengue infection. (I had multiple rows for participants with multiple infections, however, some of those infections occurred within the same entry and exit date).
Please show us a portion of your actual data in data set dengue.expand, so we can see what you are talking about. Some people ignore this next request ... do not ignore this next request. We need data provided as working SAS data step code. You can type it in yourself or follow these instructions. Do not provide data as Excel or as screen captures.
Your current code replaces data set dengue.expand2 every time you run that macro, because each call reads dengue.expand again rather than the previous result. So running %lag(1) through %lag(29) accomplishes nothing: the only result you will have is that from %lag(30).
The basic generic approach to your question would be to
1) identify the persons with dengue (or other characteristic) and then
2) remove all of their records from the data set.
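A minimal sketch of those two steps, assuming resfinaldengue1 = 1 marks a positive participant as described in the question. The output name work.no_dengue is only an illustration. Note that this drops every record for anyone who was ever positive, which is stricter than excluding one month after illness.

```sas
/* Step 1: the subquery lists every codigo with at least one positive record. */
/* Step 2: the outer query keeps only rows for participants not in that list. */
proc sql;
  create table work.no_dengue as
  select *
  from dengue.expand
  where codigo not in
    (select distinct codigo
     from dengue.expand
     where resfinaldengue1 = 1);
quit;
```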
How well did you test that approach to working with your data before writing that macro? I am asking because code like this involving the lag function seldom works as desired:
if codigo ne lag&num.(codigo) then lag_dengue&num. = .;
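because each LAG call keeps its own queue, which is updated only when that call executes. A LAG that runs conditionally (for example, inside a THEN clause) returns the value from the last time it ran, not the value from the previous observation.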
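Here is a small sketch of that behavior on made-up data (the data set toy and its variables day and value are hypothetical, not from your study). Compare prev_ok and prev_cond on the second participant's rows.

```sas
/* Hypothetical toy data: two participants, three days each */
data toy;
  input codigo day value;
  datalines;
1 1 10
1 2 11
1 3 12
2 1 20
2 2 21
2 3 22
;
run;

data lag_demo;
  set toy;
  /* Unconditional call: the queue is updated on every observation, */
  /* so prev_ok always holds the value from the previous row.       */
  prev_ok = lag(value);
  /* Conditional call: the queue is updated only when day > 1.      */
  /* On participant 2, day 2, this returns 12 (the last value seen  */
  /* when the call ran, from participant 1), not 20.                */
  if day > 1 then prev_cond = lag(value);
run;

proc print data=lag_demo;
run;
```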
Instead of reading the data (and replacing the output data set) 30 times, you might want to consider creating all 30 variables in one data step. I am not sure exactly what you expect, but with Lag_dengue1 through Lag_dengue30 on each observation you have all of the values available at once.
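One way to do that, as a sketch only: move the loop inside a single data step so the data set is read once. The macro name all_lags and its max= parameter are made up for illustration, and this assumes dengue.expand is already sorted by codigo and day.

```sas
%macro all_lags(max=30);
  data dengue.expand2;
    set dengue.expand;
    %do i = 1 %to &max;
      /* Each LAGn call runs on every observation, so its queue stays in step */
      lag_dengue&i = lag&i(resfinaldengue1);
      /* Blank the lagged value when it came from a different participant */
      if codigo ne lag&i(codigo) then lag_dengue&i = .;
    %end;
  run;
%mend all_lags;

%all_lags(max=30)
```

The LAG calls here sit in the assignment and in the IF condition, both of which execute on every row, so none of them is skipped.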
If Codigo is an identification variable, it may be that you want to use a BY Codigo statement and perhaps FIRST./LAST. processing to identify groups.
Perhaps this link would be of interest for multiple lag values and BY-group processing: https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/lefunctionsref/n0l66p5oqex1f2n1quuopdvtcjqb.h...
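Since you mention having infection start dates, a sketch along these lines might avoid LAG altogether. Everything named here is an assumption: study_day stands in for the date variable on each row, inf_date1-inf_date3 for your three infection start dates, and "one month" is taken to mean the infection date plus the following 30 days. Adjust to your real variable names and definition.

```sas
proc sort data=dengue.expand out=work.expand_sorted;
  by codigo study_day;              /* study_day: hypothetical row date */
run;

data work.person_days;
  set work.expand_sorted;
  by codigo;
  array inf{3} inf_date1-inf_date3; /* hypothetical infection start dates */

  /* Each row counts as one person-day unless it falls in an exclusion window */
  p_day = 1;
  do k = 1 to 3;
    if not missing(inf{k}) and inf{k} <= study_day <= inf{k} + 30 then p_day = 0;
  end;

  /* Running total per participant, reset at the first row of each codigo */
  if first.codigo then total_days = 0;
  total_days + p_day;

  drop k;
run;
```

Because every infection date is checked on every row, a second or third infection within the same entry and exit period is handled the same way as the first. The row where last.codigo is true carries each participant's total person-days.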