- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
I have a data set like this:
V1 V2 v3
1 1 2
1 2 1
1 2 5
1 1 2
1 2 2
2 2 3
2 1 6
2 1 6
3 2 1
3 4 3
3 4 3
.
What I want is:
For each v1 in each v2 : I want to find distinct v3 and then count them ( say we have 4 distinct value in V3 and each of them has different frequency) Then square each frequency and then sum up them
Was wondering if the below code is right.
Proc sql :
create table X as select *, sum ( count ( distinct v3))*2 ) as want
group by V1 V2;
run;
Thank you for your help
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input V1 V2 v3;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14
;;;;
run;
proc sql;
create table fama_want as
select a.*,want
from have a, (select v1,v2, sum(c) as want from (select *, count(v3)**2 as c from have group by v1, v2,v3 )
group by v1,v2) b
where a.v1=b.v1 and a.v2=b.v2;
quit;
I might be away for supper and stuff. Let us know if you need some clarification
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Did you try running it? What happened?
But to answer your question, the code is not correct.
@fama wrote:
Hi all,
I have a data set like this:
V1 V2 v3
1 1 2
1 2 1
1 2 5
1 1 2
1 2 2
2 2 3
2 1 6
2 1 6
3 2 1
3 4 3
3 4 3
.
What I want is:
For each v1 in each v2 : I want to find distinct v3 and then count them ( say we have 4 distinct value in V3 and each of them has different frequency) Then square each frequency and then sum up them
Was wondering if the below code is right.
Proc sql :
create table X as select *, sum ( count ( distinct v3))*2 ) as want
group by V1 V2;
run;
Thank you for your help
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I wanted to make sure if it is correct.
What is the correct one then?
Thanks
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If the data is too big, try playing around with a subset instead.
Here's the code to get a subset if you need that:
data have;
input V1 V2 v3;
cards;
1 1 2
1 2 1
1 2 5
1 1 2
1 2 2
2 2 3
2 1 6
2 1 6
3 2 1
3 4 3
3 4 3
;;;;
run;
or:
data temp;
set have (obs=100);
run;
Post exactly you'd expect to see if your actual data was what you posted initially (my have data set) and I'll see what I can help you with.
@fama wrote:
I did not try. The data is quite big and it takes time to be ran.
I wanted to make sure if it is correct.
What is the correct one then?
Thanks
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Even finding the subset takes hours..
I will try thanks..
But what is wrong with the code?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
run your code using your sample and look at the output and see If it matches your expected result,
If you want us to provide you a solution, plz post the expected output for a prospective input
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here is the data and expected output.
data have;
input V1 V2 v3 want;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14
;;;;
run;
10=( 3*2)+ (1*2) where 3 is the frequency of 1 in V3 and 1 is the frequency of 2 in V3
14= (1*2) +(2*2)+(3*2) where 1 is the frequency of 6 in V3 and 2 is the frequency of 3 in V3 and 3 is the frequency of 4
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ok, how did you get the 10 and 14. What am i missing?
@fama wrote:
10=( 3*2)+ (1*2) where 3 is the frequency of 1 in V3 and 1 is the frequency of 2 in V3
14= (1*2) +(2*2)+(3*2) where 1 is the frequency of 6 in V3 and 2 is the frequency of 3 in V3 and 3 is the frequency of 4
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
When V2 =1 , V3 has 1 (frequency =3) and 2(frequency =1).
then we need to obtain the square of each frequency that means 3*2 and 1*2 and sum up the results that means 9+1= 10
Hope it is clear
Thanks
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@fama Thank you & yes clear. One last question, do you really want this in proc sql?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes I want to create a table using SQL in SAS
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input V1 V2 v3;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14
;;;;
run;
proc sql;
create table fama_want as
select a.*,want
from have a, (select v1,v2, sum(c) as want from (select *, count(v3)**2 as c from have group by v1, v2,v3 )
group by v1,v2) b
where a.v1=b.v1 and a.v2=b.v2;
quit;
I might be away for supper and stuff. Let us know if you need some clarification
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I really appreciate your help.
The code works well.
Could you please clarify it for me. really like how it woks and learn...
Thanks again
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm glad. It's almost midnight here. I'll leave some notes in the morning. Please don't mind. I promise i'll soon as I wake up
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content