Hi everyone,
I have some individual-level prescription redemption data and I would like to calculate the point prevalence of drug use, defined as the percentage of the population being treated at monthly intervals one year before and after to the initiation of an intervention (intervention starts at different time points for each individual, but every individual has 1 year-follow-up in both directions).
My current data looks something like this:
Table 1
id |
drug_atc |
date_of_redemption |
grace_period |
days_supply |
last_day_of_supply |
intervention_date |
cov_by_last |
intervention |
1 |
AA10A |
23/12/2020 |
30 |
30 |
21/02/2021 |
01/02/2021 |
- |
0 |
1 |
AA10A |
31/01/2021 |
30 |
30 |
01/04/2021 |
01/02/2021 |
1 |
0 |
1 |
AA10A |
04/02/2021 |
30 |
30 |
05/04/2021 |
01/02/2021 |
1 |
1 |
1 |
AA10A |
07/03/2021 |
30 |
30 |
06/05/2021 |
01/02/2021 |
1 |
1 |
1 |
AA10A |
15/05/2021 |
30 |
30 |
14/07/2021 |
01/02/2021 |
0 |
1 |
2 |
AA10A |
02/04/2021 |
30 |
60 |
01/07/2021 |
01/06/2021 |
- |
0 |
2 |
AA10A |
05/08/2021 |
30 |
60 |
03/11/2021 |
01/06/2021 |
0 |
1 |
2 |
AA10A |
06/09/2021 |
30 |
30 |
05/11/2021 |
01/06/2021 |
1 |
1 |
Last_day_of_supply is the days_supply + grace_period.
Cov_by_last denotes if the prescription was filled before the last_day_of_supply of the previous prescription (yes=1, no=0).
Intervention denotes if the prescription was filled before (0) or after (1) the intervention.
I also have a shell table with each id, the date of intervention and date of the monthly intervals since intervention:
Table 2
ID |
drug_atc |
month_since_int |
intervention_date |
intervention |
date_since_inv |
1 |
AA10A |
-12 |
01/02/2021 |
0 |
01/02/2020 |
1 |
AA10A |
-11 |
01/02/2021 |
0 |
01/03/2020 |
1 |
AA10A |
-10 |
01/02/2021 |
0 |
01/04/2020 |
1 |
AA10A |
-9 |
01/02/2021 |
0 |
01/05/2020 |
1 |
AA10A |
-8 |
01/02/2021 |
0 |
01/06/2020 |
1 |
AA10A |
-7 |
01/02/2021 |
0 |
01/07/2020 |
1 |
AA10A |
-6 |
01/02/2021 |
0 |
01/08/2020 |
1 |
AA10A |
-5 |
01/02/2021 |
0 |
01/09/2020 |
1 |
AA10A |
-4 |
01/02/2021 |
0 |
01/10/2020 |
1 |
AA10A |
-3 |
01/02/2021 |
0 |
01/11/2020 |
1 |
AA10A |
-2 |
01/02/2021 |
0 |
01/12/2020 |
1 |
AA10A |
-1 |
01/02/2021 |
0 |
01/01/2021 |
1 |
AA10A |
0 |
01/02/2021 |
1 |
01/02/2021 |
1 |
AA10A |
1 |
01/02/2021 |
1 |
01/03/2021 |
1 |
AA10A |
2 |
01/02/2021 |
1 |
01/04/2021 |
1 |
AA10A |
3 |
01/02/2021 |
1 |
01/05/2021 |
1 |
AA10A |
4 |
01/02/2021 |
1 |
01/06/2021 |
1 |
AA10A |
5 |
01/02/2021 |
1 |
01/07/2021 |
1 |
AA10A |
6 |
01/02/2021 |
1 |
01/08/2021 |
1 |
AA10A |
7 |
01/02/2021 |
1 |
01/09/2021 |
1 |
AA10A |
8 |
01/02/2021 |
1 |
01/10/2021 |
1 |
AA10A |
9 |
01/02/2021 |
1 |
01/11/2021 |
1 |
AA10A |
10 |
01/02/2021 |
1 |
01/12/2021 |
1 |
AA10A |
11 |
01/02/2021 |
1 |
01/01/2022 |
1 |
AA10A |
12 |
01/02/2021 |
1 |
01/02/2022 |
2 |
AA10A |
-12 |
01/06/2021 |
0 |
01/06/2020 |
2 |
AA10A |
-11 |
01/06/2021 |
0 |
01/07/2020 |
2 |
AA10A |
-10 |
01/06/2021 |
0 |
01/08/2020 |
I would like to create a table like this,
table 3
ID | drug_atc | month_since_int | in_treatment | intervention |
1 | AA10A | -12 | 1 | 0 |
1 | AA10A | -11 | 1 | 0 |
1 | AA10A | -10 | 1 | 0 |
1 | AA10A | -9 | 1 | 0 |
1 | AA10A | -8 | 1 | 0 |
1 | AA10A | -7 | 1 | 0 |
1 | AA10A | -6 | 1 | 0 |
1 | AA10A | -5 | 1 | 0 |
1 | AA10A | -4 | 1 | 0 |
1 | AA10A | -3 | 1 | 0 |
1 | AA10A | -2 | 1 | 0 |
1 | AA10A | -1 | 1 | 0 |
1 | AA10A | 0 | 1 | 1 |
1 | AA10A | 1 | 1 | 1 |
1 | AA10A | 2 | 1 | 1 |
1 | AA10A | 3 | 0 | 1 |
1 | AA10A | 4 | 1 | 1 |
1 | AA10A | 5 | 0 | 1 |
1 | AA10A | 6 | 1 | 1 |
1 | AA10A | 7 | 0 | 1 |
1 | AA10A | 8 | 1 | 1 |
1 | AA10A | 9 | 0 | 1 |
1 | AA10A | 10 | 1 | 1 |
1 | AA10A | 11 | 0 | 1 |
1 | AA10A | 12 | 0 | 1 |
2 | AA10A | -12 | 1 | 0 |
2 | AA10A | -11 | 1 | 0 |
2 | AA10A | -10 | 0 | 0 |
where in_treatment denotes whether the individual was covered by a days_supply of the drug on the day of the start of the month_since_int.
Eventually, the final table should look something like this, which contains the percentage of individuals in treatment at each month in relation to the intervention.
month_since_int |
drug_atc |
in_treat_percent |
-12 |
AA10A |
70.0% |
-11 |
AA10A |
60.0% |
-10 |
AA10A |
80.0% |
-9 |
AA10A |
70.0% |
-8 |
AA10A |
60.0% |
-7 |
AA10A |
60.0% |
-6 |
AA10A |
70.0% |
-5 |
AA10A |
70.0% |
-4 |
AA10A |
60.0% |
-3 |
AA10A |
80.0% |
-2 |
AA10A |
60.0% |
-1 |
AA10A |
70.0% |
0 |
AA10A |
60.0% |
1 |
AA10A |
60.0% |
2 |
AA10A |
50.0% |
3 |
AA10A |
40.0% |
4 |
AA10A |
50.0% |
5 |
AA10A |
30.0% |
6 |
AA10A |
30.0% |
7 |
AA10A |
20.0% |
8 |
AA10A |
20.0% |
9 |
AA10A |
30.0% |
10 |
AA10A |
20.0% |
The key step I am struggling with is creating table 3, but I happy to receive input to any of the steps, it does not have to be the total solution.
Thanks
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.