I have a dataset with drug claims (data drugclaims). Each row represents one drug claim and includes a variable that specifies the total dose dispensed (cum_dose). I need to create a new variable for each patient showing the total cumulative dose of the drug per distinct patient within the dataset (i.e. sum up cum_dose of the same e_vp_id). Finally this needs to be weighted by the variable hrf_b. Below is an example of the dataset I have (drugclaims):
Obs | e_vp_id | hrf_b | cum_dose |
1 | 671692534 | 5.80214515 | 25’000 |
2 | 671693823 | 5.2234011 | 25’000 |
3 | 671787871 | 12.4227981 | 10’000 |
4 | 671787871 | 12.4227981 | 25’000 |
5 | 671787871 | 12.4227981 | 5’000 |
6 | 671898982 | 5.10388071 | 25’000 |
7 | 671898982 | 5.10388071 | 25’000 |
8 | 672010043 | 12.0596382 | 25’000 |
9 | 672010043 | 12.0596382 | 15’000 |
10 | 672010067 | 10.2165108 | 5’000 |
This is the new dataset I need with the new variable (dose_total):
Obs | e_vp_id | hrf_b | cum_dose | dose total |
1 | 671692534 | 5.80214515 | 25’000 | 2’500 |
2 | 671693823 | 5.2234011 | 25’000 | 2’500 |
3 | 671787871 | 12.4227981 | 10’000 | 40’000 |
4 | 671787871 | 12.4227981 | 25’000 | 40’000 |
5 | 671787871 | 12.4227981 | 5’000 | 40’000 |
6 | 671898982 | 5.10388071 | 25’000 | 50’000 |
7 | 671898982 | 5.10388071 | 25’000 | 50’000 |
8 | 672010043 | 12.0596382 | 25’000 | 35’000 |
9 | 672010043 | 12.0596382 | 15’000 | 35’000 |
10 | 672010067 | 10.2165108 | 5’000 | 5’000 |
Any tips are welcome. Thank you in advance!
Hi @gutst A simple double pass aka Auto remerge functionality of Proc SQL?
proc sql;
create table want as
select *, sum(cum_dose) as dose_total
from your_dataset
group by e_vp_id;
quit;
Hi @gutst A simple double pass aka Auto remerge functionality of Proc SQL?
proc sql;
create table want as
select *, sum(cum_dose) as dose_total
from your_dataset
group by e_vp_id;
quit;
@gutst wrote:
I have a dataset with drug claims (data drugclaims). Each row represents one drug claim and includes a variable that specifies the total dose dispensed (cum_dose). I need to create a new variable for each patient showing the total cumulative dose of the drug per distinct patient within the dataset (i.e. sum up cum_dose of the same e_vp_id). Finally this needs to be weighted by the variable hrf_b. Below is an example of the dataset I have (drugclaims):
Obs
e_vp_id
hrf_b
cum_dose
1
671692534
5.80214515
25’000
2
671693823
5.2234011
25’000
3
671787871
12.4227981
10’000
4
671787871
12.4227981
25’000
5
671787871
12.4227981
5’000
6
671898982
5.10388071
25’000
7
671898982
5.10388071
25’000
8
672010043
12.0596382
25’000
9
672010043
12.0596382
15’000
10
672010067
10.2165108
5’000
This is the new dataset I need with the new variable (dose_total):
Obs
e_vp_id
hrf_b
cum_dose
dose total
1
671692534
5.80214515
25’000
2’500
2
671693823
5.2234011
25’000
2’500
3
671787871
12.4227981
10’000
40’000
4
671787871
12.4227981
25’000
40’000
5
671787871
12.4227981
5’000
40’000
6
671898982
5.10388071
25’000
50’000
7
671898982
5.10388071
25’000
50’000
8
672010043
12.0596382
25’000
35’000
9
672010043
12.0596382
15’000
35’000
10
672010067
10.2165108
5’000
5’000
Any tips are welcome. Thank you in advance!
You say the dose total should be weighted by HRF_B, but in your output data set it is not weighted. Please explain.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.