I have a dataset with drug claims (data drugclaims). Each row represents one drug claim and includes a variable that specifies the total dose dispensed (cum_dose). I need to create a new variable for each patient showing the total cumulative dose of the drug per distinct patient within the dataset (i.e. sum up cum_dose of the same e_vp_id). Finally this needs to be weighted by the variable hrf_b. Below is an example of the dataset I have (drugclaims):
Obs | e_vp_id | hrf_b | cum_dose |
1 | 671692534 | 5.80214515 | 25’000 |
2 | 671693823 | 5.2234011 | 25’000 |
3 | 671787871 | 12.4227981 | 10’000 |
4 | 671787871 | 12.4227981 | 25’000 |
5 | 671787871 | 12.4227981 | 5’000 |
6 | 671898982 | 5.10388071 | 25’000 |
7 | 671898982 | 5.10388071 | 25’000 |
8 | 672010043 | 12.0596382 | 25’000 |
9 | 672010043 | 12.0596382 | 15’000 |
10 | 672010067 | 10.2165108 | 5’000 |
This is the new dataset I need with the new variable (dose_total):
Obs | e_vp_id | hrf_b | cum_dose | dose total |
1 | 671692534 | 5.80214515 | 25’000 | 2’500 |
2 | 671693823 | 5.2234011 | 25’000 | 2’500 |
3 | 671787871 | 12.4227981 | 10’000 | 40’000 |
4 | 671787871 | 12.4227981 | 25’000 | 40’000 |
5 | 671787871 | 12.4227981 | 5’000 | 40’000 |
6 | 671898982 | 5.10388071 | 25’000 | 50’000 |
7 | 671898982 | 5.10388071 | 25’000 | 50’000 |
8 | 672010043 | 12.0596382 | 25’000 | 35’000 |
9 | 672010043 | 12.0596382 | 15’000 | 35’000 |
10 | 672010067 | 10.2165108 | 5’000 | 5’000 |
Any tips are welcome. Thank you in advance!
Hi @gutst A simple double pass aka Auto remerge functionality of Proc SQL?
proc sql;
create table want as
select *, sum(cum_dose) as dose_total
from your_dataset
group by e_vp_id;
quit;
Hi @gutst A simple double pass aka Auto remerge functionality of Proc SQL?
proc sql;
create table want as
select *, sum(cum_dose) as dose_total
from your_dataset
group by e_vp_id;
quit;
@gutst wrote:
I have a dataset with drug claims (data drugclaims). Each row represents one drug claim and includes a variable that specifies the total dose dispensed (cum_dose). I need to create a new variable for each patient showing the total cumulative dose of the drug per distinct patient within the dataset (i.e. sum up cum_dose of the same e_vp_id). Finally this needs to be weighted by the variable hrf_b. Below is an example of the dataset I have (drugclaims):
Obs
e_vp_id
hrf_b
cum_dose
1
671692534
5.80214515
25’000
2
671693823
5.2234011
25’000
3
671787871
12.4227981
10’000
4
671787871
12.4227981
25’000
5
671787871
12.4227981
5’000
6
671898982
5.10388071
25’000
7
671898982
5.10388071
25’000
8
672010043
12.0596382
25’000
9
672010043
12.0596382
15’000
10
672010067
10.2165108
5’000
This is the new dataset I need with the new variable (dose_total):
Obs
e_vp_id
hrf_b
cum_dose
dose total
1
671692534
5.80214515
25’000
2’500
2
671693823
5.2234011
25’000
2’500
3
671787871
12.4227981
10’000
40’000
4
671787871
12.4227981
25’000
40’000
5
671787871
12.4227981
5’000
40’000
6
671898982
5.10388071
25’000
50’000
7
671898982
5.10388071
25’000
50’000
8
672010043
12.0596382
25’000
35’000
9
672010043
12.0596382
15’000
35’000
10
672010067
10.2165108
5’000
5’000
Any tips are welcome. Thank you in advance!
You say the dose total should be weighted by HRF_B, but in your output data set it is not weighted. Please explain.
Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.
Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.