sum by group variable and merge it back to original dataset

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 89
Accepted Solution

sum by group variable and merge it back to original dataset

Hello,

I have a dataset like this;

key          date                 indicator1                

1         05/01/13            1                                 

  2        05/01/13            0                                

  3       06/01/13            1                              

  4       06/01/13             .                            

5        06/02/13             0

6       06/03/13             1

in dataset   indicator 1 = present ( qualify as numerator) 0=(qualify as denominator)  missing(.) = not applicable.

now I want to find respective numerator and denominator for the given month. my output should look like

key          date               indicator1     rate        

1         05/01/13            1                     1/2         

2        05/01/13            0                    1/2         

3       06/01/13            1                    2/3                  

4        06/02/13             0                    2/3

5       06/03/13             1                    2/3

I am wondering if somebody can help me on this.

Thanks


Accepted Solutions
Solution
‎07-03-2013 11:28 AM
Super User
Posts: 17,907

Re: sum by group variable and merge it back to original dataset

You can take the average of indicator1 then by month to get what you want, and use the fract8. format to get the fraction rather than a decimal.

It doesn't matter if you filter out the missing because they're not included in the average calculation anyways, but I did exclude them using a where clause.

data have;

informat date mmddyy10.;

format date date9.;

input key date mmddyy10. indicator1;             

cards;

1 05/01/13  1

2 05/01/13  0

3 06/01/13  1

4 06/01/13  .

5 06/02/13  0

6 06/03/13  1

;

run;

proc sql;

create table want as

select *, put(date, monyy7.) as month_category, mean(indicator1) as rate format=fract8.

from have

where indicator1 ne .

group by put(date, monyy7.);

quit;

View solution in original post


All Replies
Super User
Posts: 17,907

Re: sum by group variable and merge it back to original dataset

You data doesn't match your criteria:

1 = present ( qualify as numerator) 0=(qualify as denominator)

I think you mean, 1 or 0 is a denominator?

Is your date mmddyy or ddmmyy? Or something else?

Frequent Contributor
Posts: 89

Re: sum by group variable and merge it back to original dataset

Hello Reeza,

you are right. 0 is qualify for numerator and denominator both. and my date is SAS date;

Thank you for your correction

Solution
‎07-03-2013 11:28 AM
Super User
Posts: 17,907

Re: sum by group variable and merge it back to original dataset

You can take the average of indicator1 then by month to get what you want, and use the fract8. format to get the fraction rather than a decimal.

It doesn't matter if you filter out the missing because they're not included in the average calculation anyways, but I did exclude them using a where clause.

data have;

informat date mmddyy10.;

format date date9.;

input key date mmddyy10. indicator1;             

cards;

1 05/01/13  1

2 05/01/13  0

3 06/01/13  1

4 06/01/13  .

5 06/02/13  0

6 06/03/13  1

;

run;

proc sql;

create table want as

select *, put(date, monyy7.) as month_category, mean(indicator1) as rate format=fract8.

from have

where indicator1 ne .

group by put(date, monyy7.);

quit;

Super User
Posts: 9,687

Re: sum by group variable and merge it back to original dataset

If I understood what you mean.

data have;
input key          date   : mmddyy10.           indicator1 ;  
format date    mmddyy10.   ;
cards;
1         05/01/13            1                                 
  2        05/01/13            0                                
  3       06/01/13            1                              
  4       06/01/13             .                            
5        06/02/13             0
6       06/03/13             1
;
run;
proc sql;
create table want as
 select *,sum(indicator1=1)/count(*) as rate        
  from have
   where  indicator1 is not missing
    group by year(date),month(date);
quit;


Ksharp

Message was edited by: xia keshan

Frequent Contributor
Posts: 89

Re: sum by group variable and merge it back to original dataset

thank you @Reeza and . Both of these works.One more concern. If I have more than indicators do I have to create mean statement for 10 different indicators. Is there a shortcut or I have to create rate_1 ,rate_2 ....rate_10 for given indicator.

Thank you again

Super User
Posts: 9,687

Re: sum by group variable and merge it back to original dataset

Yes, you need to write by your hand,or if you are familliar with macro, you can make a macro variable to wrap them all together.

select *,sum(indicator1=1)/count(*) as rate    ,

select *,sum(indicator2=1)/count(*) as rate    ,

select *,sum(indicator3=1)/count(*) as rate     ,

select *,sum(indicator4=1)/count(*) as rate     ,

................

Frequent Contributor
Posts: 89

Re: sum by group variable and merge it back to original dataset

@ksharp awesome. I am familiar with SAS macro. I see your point. Thank you so much

Frequent Contributor
Posts: 89

Re: sum by group variable and merge it back to original dataset

Hello   I did as of your suggestion and I got my output.Now I am going more deeper. I want to find 3 month rolling sum same as monthly summary.Is there a way to do that.

Super User
Posts: 17,907

Re: sum by group variable and merge it back to original dataset

By 3 month rolling do you mean:

Jan-Feb-Mar and then Feb-Mar-Apr?

As this question is answered, you should post a new question and link back to the old one if you think its relevant.

Frequent Contributor
Posts: 89

Re: sum by group variable and merge it back to original dataset

yes. I am hoping to run that way

Super User
Posts: 9,687

Re: sum by group variable and merge it back to original dataset

You need sub-query.

Not tested code:

proc sql;

create table want as

select * ,

( select sum(indicator1=1)/count(*)      

  from have

   where  indicator1 is not missing and date between intnx('month',-3,a.date,'s') and a.date ) as rate

from have as a;

quit;

Ksharp

Frequent Contributor
Posts: 89

Re: sum by group variable and merge it back to original dataset

Hello , and Fareeza Khurshed

Thank you for your input. . I tried input but it was unable to validate my data . Is it because there is no group by variable.

thanks

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 460 views
  • 3 likes
  • 3 in conversation