- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I'm working on a database describing medicines reimbursement. I'm counting the number of reimbursement by year, month and by ATC (a categorical variable describing "classes" of medicines).
proc sql noprint;
create table remb_month as
select month, year, ATC, count(*) as nb_remb
from remb
group by year, month, ATC
quit;
I obtain something like that :
month year ATC nb_remb
1 2006 1 1234
1 2006 2 3456
1 2006 3 7890
1 2006 5 9876
You can see there is no reimbursement of medicines of the ATC class 4 during this month. I would like to obtain a additionnal line when there are no reimbursement of a certain ATC during a given month :
month year ATC nb_remb
1 2006 1 1234
1 2006 2 3456
1 2006 3 7890
1 2006 4 0
1 2006 5 9876
Any idea of how I could do that ?
Thank you
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I also regularly encounter this in my data. And I use the Double Transpose method.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
My solution would be like this:
data defaults;
do month = 1 to 12;
do year = 2000 to 2015;
do ATC = 1 to &max_atc;
output;
end;
end;
end;
run;
proc sql;
create table want as
select
a.month,
a.year,
a.ATC,
case
when b.nb_remb is missing
then 0
else b.nb_remb
end as nb_remb
from
defaults as a left join
remb_month as b
on a.month = b.month and
a.year = b.year and
a.ATC = b.ATC
;
quit;
(untested)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Nice Kurt, basically the answer is you are knowing that dates should measurements are to be expected, than:
1/ build a template for all the levels that should be there (your datastep)
2/ merge that one with the results that are being delivered.
3/ Correct / analyse them NULL/missing results as being information of the join what should be there and what is there. (only the levels, info like counts excluded)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You may consider a non-SQL approach. The MEANS procedure (and her sister SUMMARY for that matter) have the PRELOADFMT option. This assumes you have a format value for each existing value of your classification variable. In combination with COMPLETETYPES you can create an output dataset that has a row for each value in the format, if its in the data or not. The other way around it also allows you to exclude values may you wish so.
Some nice uses of the PRELOADFMT option (which is not limited to MEANS/SUMMARY) can be found in this paper: http://support.sas.com/resources/papers/proceedings11/239-2011.pdf
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I also regularly encounter this in my data. And I use the Double Transpose method.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please try
data have;
input month year ATC nb_remb;
cards;
1 2006 1 1234
1 2006 2 3456
1 2006 3 7890
1 2006 5 9876
;
proc sort data=have out=dummy(keep=month year ) nodupkey;
by month year;
run;
proc sort data=have ;
by month year atc;
run;
data dummy_;
set dummy;
do atc=1 to 5;
output;
end;
run;
proc sort data=dummy_;
by month year atc;
run;
data want;
merge dummy_(in=b) have(in=a);
by month year atc;
if b;
if nb_remb=. then nb_remb=0;
run;
proc sort data=want;
by month year atc;
run;
Thanks,
Jag
Jag
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
try this , if your data is not already sorted then you may have to sort it:
data have;
input month year ATC nb_remb;
cards;
1 2006 1 1234
1 2006 2 3456
1 2006 3 7890
1 2006 5 9876
;
data want;
set have;
by atc;
if dif(atc)>1 then do;
month=month;year=year;atc=atc-1;nb_remb=0;
output;
set have point=_N_;output;
end;
else output;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for all your answers.
I didnt know the double transpose method, it's very useful in such cases !