Hi all,
I have a data set like this:
V1 V2 v3
1 1 2
1 2 1
1 2 5
1 1 2
1 2 2
2 2 3
2 1 6
2 1 6
3 2 1
3 4 3
3 4 3
.
What I want is:
For each v1 in each v2 : I want to find distinct v3 and then count them ( say we have 4 distinct value in V3 and each of them has different frequency) Then square each frequency and then sum up them
Was wondering if the below code is right.
Proc sql :
create table X as select *, sum ( count ( distinct v3))*2 ) as want
group by V1 V2;
run;
Thank you for your help
data have;
input V1 V2 v3;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14
;;;;
run;
proc sql;
create table fama_want as
select a.*,want
from have a, (select v1,v2, sum(c) as want from (select *, count(v3)**2 as c from have group by v1, v2,v3 )
group by v1,v2) b
where a.v1=b.v1 and a.v2=b.v2;
quit;
I might be away for supper and stuff. Let us know if you need some clarification
Did you try running it? What happened?
But to answer your question, the code is not correct.
@fama wrote:
Hi all,
I have a data set like this:
V1 V2 v3
1 1 2
1 2 1
1 2 5
1 1 2
1 2 2
2 2 3
2 1 6
2 1 6
3 2 1
3 4 3
3 4 3
.
What I want is:
For each v1 in each v2 : I want to find distinct v3 and then count them ( say we have 4 distinct value in V3 and each of them has different frequency) Then square each frequency and then sum up them
Was wondering if the below code is right.
Proc sql :
create table X as select *, sum ( count ( distinct v3))*2 ) as want
group by V1 V2;
run;
Thank you for your help
If the data is too big, try playing around with a subset instead.
Here's the code to get a subset if you need that:
data have;
input V1 V2 v3;
cards;
1 1 2
1 2 1
1 2 5
1 1 2
1 2 2
2 2 3
2 1 6
2 1 6
3 2 1
3 4 3
3 4 3
;;;;
run;
or:
data temp;
set have (obs=100);
run;
Post exactly you'd expect to see if your actual data was what you posted initially (my have data set) and I'll see what I can help you with.
@fama wrote:
I did not try. The data is quite big and it takes time to be ran.
I wanted to make sure if it is correct.
What is the correct one then?
Thanks
run your code using your sample and look at the output and see If it matches your expected result,
If you want us to provide you a solution, plz post the expected output for a prospective input
Ok, how did you get the 10 and 14. What am i missing?
@fama wrote:
10=( 3*2)+ (1*2) where 3 is the frequency of 1 in V3 and 1 is the frequency of 2 in V3
14= (1*2) +(2*2)+(3*2) where 1 is the frequency of 6 in V3 and 2 is the frequency of 3 in V3 and 3 is the frequency of 4
@fama Thank you & yes clear. One last question, do you really want this in proc sql?
data have;
input V1 V2 v3;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14
;;;;
run;
proc sql;
create table fama_want as
select a.*,want
from have a, (select v1,v2, sum(c) as want from (select *, count(v3)**2 as c from have group by v1, v2,v3 )
group by v1,v2) b
where a.v1=b.v1 and a.v2=b.v2;
quit;
I might be away for supper and stuff. Let us know if you need some clarification
I'm glad. It's almost midnight here. I'll leave some notes in the morning. Please don't mind. I promise i'll soon as I wake up
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.