I have a dataset with drug claims (data drugclaims). Each row represents one drug claim and includes a variable that specifies the total dose dispensed (cum_dose). I need to create a new variable for each patient showing the total cumulative dose of the drug per distinct patient within the dataset (i.e. sum up cum_dose of the same e_vp_id). Finally this needs to be weighted by the variable hrf_b. Below is an example of the dataset I have (drugclaims):
| Obs | e_vp_id | hrf_b | cum_dose | 
| 1 | 671692534 | 5.80214515 | 25’000 | 
| 2 | 671693823 | 5.2234011 | 25’000 | 
| 3 | 671787871 | 12.4227981 | 10’000 | 
| 4 | 671787871 | 12.4227981 | 25’000 | 
| 5 | 671787871 | 12.4227981 | 5’000 | 
| 6 | 671898982 | 5.10388071 | 25’000 | 
| 7 | 671898982 | 5.10388071 | 25’000 | 
| 8 | 672010043 | 12.0596382 | 25’000 | 
| 9 | 672010043 | 12.0596382 | 15’000 | 
| 10 | 672010067 | 10.2165108 | 5’000 | 
This is the new dataset I need with the new variable (dose_total):
| Obs | e_vp_id | hrf_b | cum_dose | dose total | 
| 1 | 671692534 | 5.80214515 | 25’000 | 2’500 | 
| 2 | 671693823 | 5.2234011 | 25’000 | 2’500 | 
| 3 | 671787871 | 12.4227981 | 10’000 | 40’000 | 
| 4 | 671787871 | 12.4227981 | 25’000 | 40’000 | 
| 5 | 671787871 | 12.4227981 | 5’000 | 40’000 | 
| 6 | 671898982 | 5.10388071 | 25’000 | 50’000 | 
| 7 | 671898982 | 5.10388071 | 25’000 | 50’000 | 
| 8 | 672010043 | 12.0596382 | 25’000 | 35’000 | 
| 9 | 672010043 | 12.0596382 | 15’000 | 35’000 | 
| 10 | 672010067 | 10.2165108 | 5’000 | 5’000 | 
Any tips are welcome. Thank you in advance!
Hi @gutst A simple double pass aka Auto remerge functionality of Proc SQL?
proc sql;
 create table want as
 select *, sum(cum_dose) as dose_total
 from your_dataset
 group by e_vp_id;
quit;Hi @gutst A simple double pass aka Auto remerge functionality of Proc SQL?
proc sql;
 create table want as
 select *, sum(cum_dose) as dose_total
 from your_dataset
 group by e_vp_id;
quit;
@gutst wrote:
I have a dataset with drug claims (data drugclaims). Each row represents one drug claim and includes a variable that specifies the total dose dispensed (cum_dose). I need to create a new variable for each patient showing the total cumulative dose of the drug per distinct patient within the dataset (i.e. sum up cum_dose of the same e_vp_id). Finally this needs to be weighted by the variable hrf_b. Below is an example of the dataset I have (drugclaims):
Obs
e_vp_id
hrf_b
cum_dose
1
671692534
5.80214515
25’000
2
671693823
5.2234011
25’000
3
671787871
12.4227981
10’000
4
671787871
12.4227981
25’000
5
671787871
12.4227981
5’000
6
671898982
5.10388071
25’000
7
671898982
5.10388071
25’000
8
672010043
12.0596382
25’000
9
672010043
12.0596382
15’000
10
672010067
10.2165108
5’000
This is the new dataset I need with the new variable (dose_total):
Obs
e_vp_id
hrf_b
cum_dose
dose total
1
671692534
5.80214515
25’000
2’500
2
671693823
5.2234011
25’000
2’500
3
671787871
12.4227981
10’000
40’000
4
671787871
12.4227981
25’000
40’000
5
671787871
12.4227981
5’000
40’000
6
671898982
5.10388071
25’000
50’000
7
671898982
5.10388071
25’000
50’000
8
672010043
12.0596382
25’000
35’000
9
672010043
12.0596382
15’000
35’000
10
672010067
10.2165108
5’000
5’000
Any tips are welcome. Thank you in advance!
You say the dose total should be weighted by HRF_B, but in your output data set it is not weighted. Please explain.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
