How can I randomly assign index dates to unexposed patients, by incidence density sampling from the distribution of index dates in the exposed cohort? Any help is appreciated.
The following is the distribution of the exposed group per year, according to their index date:
ID_YEAR | Frequency | Percent |
2000 | 337 | 1.60 |
2001 | 445 | 2.11 |
2002 | 542 | 2.57 |
2003 | 715 | 3.39 |
2004 | 764 | 3.63 |
2005 | 873 | 4.14 |
2006 | 1099 | 5.22 |
2007 | 1080 | 5.13 |
2008 | 1274 | 6.05 |
2009 | 1483 | 7.04 |
2010 | 1608 | 7.63 |
2011 | 1729 | 8.21 |
2012 | 1749 | 8.30 |
2013 | 1989 | 9.44 |
2014 | 1834 | 8.71 |
2015 | 1647 | 7.82 |
2016 | 1214 | 5.76 |
2017 | 685 | 3.25 |
The exposed group has an index date, while the control group does not, and I want to assign one to each control patient.
Thanks
Use the RAND function with the "table" distribution to generate the assignments:
data pct;
input ID_YEAR Frequency Percent;
datalines;
2000 337 1.60
2001 445 2.11
2002 542 2.57
2003 715 3.39
2004 764 3.63
2005 873 4.14
2006 1099 5.22
2007 1080 5.13
2008 1274 6.05
2009 1483 7.04
2010 1608 7.63
2011 1729 8.21
2012 1749 8.30
2013 1989 9.44
2014 1834 8.71
2015 1647 7.82
2016 1214 5.76
2017 685 3.25
;
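/* Build the comma-separated probability list and the base year.
   ORDER BY guarantees the probabilities are listed in year order,
   NOPRINT suppresses the listing output. */
proc sql noprint;
    select percent/100 into :pct separated by ","
    from pct
    order by id_year;
    select min(id_year) - 1 into :baseYear
    from pct;
quit;
/* fake data */
data unexposed;
    do id = 1 to 1000;
        output;
    end;
run;
data assigned;
    call streaminit(97987976);
    set unexposed;
    /* rand("table") returns 1, 2, ..., so adding the base year gives 2000, 2001, ... */
    assigned_id_year = &baseYear. + rand("table", &pct.);
run;
/* check unexposed frequencies */
proc freq data = assigned;
    table assigned_id_year;
run;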
The method above could be adapted to dates instead of years but might become impractical if the number of dates is too large. I don't know how many arguments can be handled by the RAND function. How many distinct dates would you have?
Actually, I used the same method with dates, but it only assigns dates up to May 2013!
I have 6479 dates from 1 Jan 2000 to 26 Sep 2017.
Hummm... The pct macro variable is limited to 32k characters. Now 32k / 6479 is less than 6 characters per probability value, including a comma and a decimal point... I suspect that the list of probability values was truncated, which is why you didn't get the full range of dates.
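One way to sidestep the long probability list altogether is to draw index dates directly from the exposed cohort, with replacement. What follows is only a sketch: it assumes a dataset named exposed with a SAS date variable index_date (both names are hypothetical, the thread does not give them) and reuses the unexposed dataset from the example above. Sampling rows with replacement reproduces the empirical distribution of index dates without ever building a list of probabilities.

```sas
/* Assumption: work.exposed has one row per exposed patient and a
   SAS date variable index_date (hypothetical names). */

/* Number of dates to draw = number of unexposed patients */
proc sql noprint;
    select count(*) into :n_unexposed trimmed
    from unexposed;
quit;

/* Unrestricted random sampling (with replacement);
   OUTHITS writes one output row per draw */
proc surveyselect data = exposed(keep = index_date)
                  out = draws(keep = index_date)
                  method = urs
                  sampsize = &n_unexposed.
                  outhits
                  seed = 97987976
                  noprint;
run;

/* Shuffle the draws so that their row order carries no information */
data draws;
    call streaminit(97987976);
    set draws;
    sort_key = rand("uniform");
run;

proc sort data = draws;
    by sort_key;
run;

/* One-to-one merge (no BY statement): row i of unexposed gets draw i */
data assigned;
    merge unexposed
          draws(keep = index_date rename = (index_date = assigned_index_date));
    format assigned_index_date date9.;
run;
```

A quick check is to compare proc freq output (with a year4. format) on assigned_index_date against the exposed cohort's yearly percentages.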
Would it be acceptable to reduce the time resolution a bit, to say, weeks, months or even quarters? Hint: use functions intck and intnx to deal with such time intervals.
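As a rough illustration of the month-resolution idea, the sketch below counts exposed patients per calendar month, samples a month with rand("table"), and then picks a day uniformly within that month. It assumes a dataset exposed with a SAS date variable index_date (hypothetical names) and the unexposed dataset from the earlier example. With about 213 months between January 2000 and September 2017, the probability list stays far below any macro variable length limit.

```sas
/* Assumption: work.exposed has a SAS date variable index_date (hypothetical names) */
proc sql noprint;
    select min(index_date) into :first_date trimmed
    from exposed;
quit;

/* Months elapsed since the first index date: 0, 1, 2, ... */
data exposed_months;
    set exposed;
    month_index = intck("month", &first_date., index_date);
run;

/* COUNT and PERCENT per observed month */
proc freq data = exposed_months noprint;
    tables month_index / out = month_pct;
run;

/* Probabilities and their month offsets, listed in the same order */
proc sql noprint;
    select percent / 100 format = 12.10, month_index
        into :pct separated by ",", :months separated by " "
    from month_pct
    order by month_index;
    %let n_months = &sqlobs.;
quit;

data assigned;
    call streaminit(97987976);
    /* lookup from table position to month offset (months with no patients are skipped) */
    array m[&n_months.] _temporary_ (&months.);
    set unexposed;
    /* rand("table") can return n + 1 when the rounded probabilities sum to less than 1 */
    k = min(rand("table", &pct.), &n_months.);
    month_start = intnx("month", &first_date., m[k], "beginning");
    month_end = intnx("month", month_start, 0, "end");
    /* uniform day within the sampled month */
    assigned_index_date = month_start
        + floor(rand("uniform") * (month_end - month_start + 1));
    format assigned_index_date date9.;
    drop k month_start month_end;
run;
```

Note that the uniform day can fall after the last observed index date in the final month (26 Sep 2017 in this thread), so a cap at the maximum exposed date may be needed. Swapping "month" for "week" or "qtr" in the intck and intnx calls changes the resolution.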