Solved: Re: How to count frequency from Dataset B to populate a new column in ...

Luciferene · Posted 09-13-2018 11:45 AM

Hello community,

I'm very new to SAS and I've been struggling to efficiently do this.

So following are examples of the datasets:

Dataset A:

A 0.86

B 1.25

C 1.01

D 1.35

Dataset B (Columns: Category 1 Category 2):

A A

A B

B B

A A

C C

A D

D B

A D

Then I want to Dataset A to look like, or create a new Dataset C that looks like (With columns being Classes, rates, #Cat 1 from Dataset A, #Cat 2 from Dataset B):

A 0.86 5 2

B 1.25 1 3

C 1.01 1 1

D 1.35 1 2

Currently I'm using One Way Freq to do the count for Cat 1 and Cat 2 for Dataset B. And combining frequency column from those two with Data A and it's very messy.

Thank you in advance for help!

novinosrin · Posted 09-13-2018 01:44 PM

@Luciferene wrote:

I'm only "counting from 2 categories" but dataset b does have many other columns that are not being counted.

data A;
input Cat1 $ n ;
cards;
A 0.86
B 1.25
C 1.01
D 1.35
;

data b;
input  Cat1 $ Cat2 $;
cards;
A A
A B
B B
A A
C C
A D
D B
A D
;

data _null_;
if _n_=1 then do;
if 0 then set A;
  dcl hash H (ordered:'y') ;
   h.definekey  ("Cat1") ;
   h.definedata ("Cat1","n","freq1","freq2") ;
   h.definedone () ;
end;
call missing(freq1,freq2);
do until(lr);
set a end=lr;
rc=h.add();
end;
lr=0;
do until(lr);
set b end=lr;
if h.find(key:Cat1) ne 0 then freq1=1;else freq1=sum(freq1,1);
h.replace();
if h.find(key:Cat2) ne 0 then freq2=1;else freq2=sum(freq2,1);
h.replace();
end;
h.output(dataset:'want');
stop;
run;

View solution in original post

novinosrin · Posted 09-13-2018 12:27 PM

Is it just 2 categories in dataset b or the number of categories in b could vary?

Luciferene · Posted 09-13-2018 01:27 PM

I'm only counting from 2 categories, but dataset b does have many other columns that are not being counted.

novinosrin · Posted 09-13-2018 01:44 PM

@Luciferene wrote:

I'm only "counting from 2 categories" but dataset b does have many other columns that are not being counted.

data A;
input Cat1 $ n ;
cards;
A 0.86
B 1.25
C 1.01
D 1.35
;

data b;
input  Cat1 $ Cat2 $;
cards;
A A
A B
B B
A A
C C
A D
D B
A D
;

data _null_;
if _n_=1 then do;
if 0 then set A;
  dcl hash H (ordered:'y') ;
   h.definekey  ("Cat1") ;
   h.definedata ("Cat1","n","freq1","freq2") ;
   h.definedone () ;
end;
call missing(freq1,freq2);
do until(lr);
set a end=lr;
rc=h.add();
end;
lr=0;
do until(lr);
set b end=lr;
if h.find(key:Cat1) ne 0 then freq1=1;else freq1=sum(freq1,1);
h.replace();
if h.find(key:Cat2) ne 0 then freq2=1;else freq2=sum(freq2,1);
h.replace();
end;
h.output(dataset:'want');
stop;
run;

novinosrin · Posted 09-13-2018 01:47 PM

Hi @Luciferene Btw, welcome to the forum as it appears this is just your 1st thread. If there are many categories, that approach can be tweaked to do some gymnastics using arrays, however me being extremely lazy I haven't done that until needed.

Luciferene · Posted 09-13-2018 02:21 PM

Thanks a lot @novinosrin for the welcome and the help, this is what I was looking for!

Few (probably very basic) questions:

How would I go about reordering the rows? For example if I want them in B C D A, etc.

If I wanted to operate on something like SUM(n * freq1) / SUM(freq1), how would I do that? It doesn't seem to let me operate on the values like matrix.

If you can give me any guidance or point me in the direction to learn that would be great.

novinosrin · Posted 09-13-2018 02:37 PM

Can you please elaborate a bit more. As in, this SUM(n * freq1) / SUM(freq1), do you want to apply this on the output dataset?

May be it would help if you show your desired using a sample like you did in your question to think through technically

And for reordering rows, please let know the logic for the order you want.

I am going for lunch(Chicago time) with my mates, and there might be delay in my responses. Kindly bear with me

Luciferene · Posted 09-13-2018 02:48 PM

Yes I'm working with the output. So if the output comes out to be (Category, n , freq1, freq2 for columns):

A 0.86 5 2

B 1.25 1 3

C 1.01 1 1

D 1.35 1 2

I want SUM(n * freq1) / SUM(freq1) = (0.86 * 5 + 1.25 * 1 + ...) / (0.86 + 1.25 + ...). I know how to do this if it was say matrix, but my attempt at doing it for dataset keeps giving me errors.

For the reordering, when the output data orders everything alphabetically where as the categories are set in certain order.

For example if my Categories were initially laid out as:

Standard

Preferred

Declined

The output spits out in the order of:

Declined

Preferred

Standard

It's more or less for the summary report.

Of course, enjoy your lunch! Thanks for your prompt responses.

Luciferene · Posted 09-13-2018 02:58 PM

Just adding on, I would ideally like the result to be added onto the output data.
For example:
A 0.86 5 2
B 1.25 1 3
C 1.01 1 1
D 1.35 1 2
Total, (blank), 8, 8
Weighted AVG, (blank), For freq 1, For freq 2

Where the weighted avg is SUM(n * freq1) / SUM(freq1). My apologise for messy alignment.

Thanks!

novinosrin · Posted 09-13-2018 03:10 PM

Can you clarify whether the following result is what you want that needs to be appended to the prev output which is in have?

data have;
input Cat1 $ n freq1 freq2;
cards;
A 0.86 5 2
B 1.25 1 3
C 1.01 1 1
D 1.35 1 2
;


proc sql;
create table want as
select SUM(n * freq1)/SUM(freq1)  as wgt_avg_for_freq1, SUM(n * freq2)/SUM(freq2) as wgt_avg_for_freq2,sum(freq1) as sum_freq1,sum(freq1) as sum_freq2
from have;
quit;

Luciferene · Posted 09-13-2018 03:17 PM

Yep that's correct.

novinosrin · Posted 09-13-2018 04:38 PM

Thank you, but I am afraid that result won't align correctly when we stack with the output from the previous.

Or now that you have both the results, can you stack and show the desired one to be precise. It also sounds to me that you might want a report from a prco tabulate or proc report

Luciferene · Posted 09-13-2018 04:48 PM

I'm not exactly sure if I understood you correctly, but this is basically the final form that I desire:

A           0.86    5              2
B           1.25    1              3
C           1.01    1              1
D           1.35    1              2
Total               8              8
WA                  0.98875        1.1475

Where the 2nd column (n) is blank. Basically the final two rows are summaries.

novinosrin · Posted 09-13-2018 04:49 PM

Aah, you understood well. Thank you & brb

novinosrin · Posted 09-13-2018 04:59 PM

/*This WANT is the output from the previous, 
just recreating from demo purpose*/
data want;
input Cat1 $ n freq1 freq2;
cards;
A 0.86 5 2
B 1.25 1 3
C 1.01 1 1
D 1.35 1 2
;

/*Final want*/
proc sql;
create table final_want as
select * from want
union 
select 'Total' as Cat1,. as n, sum(freq1) as sum_freq1,sum(freq1) as sum_freq2 from want
union 
select 'WA' as Cat1,. as n,SUM(n * freq1)/SUM(freq1)  as freq1,SUM(n * freq2)/SUM(freq2) as freq2 from want;
quit;

How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Re: How to count frequency from Dataset B to populate a new column in Dataset A in SAS EG 7.1

Registration is open