BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fama
Fluorite | Level 6

Hi all,

I have a data set like this:

 

V1  V2  v3

1      1    2

1      2    1

1      2    5

1      1    2

1      2    2

2     2     3

2     1    6

2     1    6

3     2    1

3     4    3

3     4    3

.

 

What I want is:

For each v1  in each v2 : I want to find distinct  v3 and then count them ( say we have 4 distinct value in V3 and each of them has different frequency) Then square each frequency and then sum up them

 

Was wondering if the below code is right.

 

 

Proc sql :

create table X as select *,     sum ( count ( distinct v3))*2 ) as want

group by V1 V2;

run;

 

Thank you for your help

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input V1 V2 v3;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14
;;;;
run;

proc sql;
create table fama_want as
select a.*,want
from have a, (select v1,v2, sum(c) as want from (select *, count(v3)**2 as c from have group by v1, v2,v3 )
group by v1,v2) b
where a.v1=b.v1 and a.v2=b.v2;
quit;

I might be away for supper and stuff. Let us know if you need some clarification

View solution in original post

27 REPLIES 27
Reeza
Super User

Did you try running it? What happened?

 

But to answer your question, the code is not correct. 


@fama wrote:

Hi all,

I have a data set like this:

 

V1  V2  v3

1      1    2

1      2    1

1      2    5

1      1    2

1      2    2

2     2     3

2     1    6

2     1    6

3     2    1

3     4    3

3     4    3

.

 

What I want is:

For each v1  in each v2 : I want to find distinct  v3 and then count them ( say we have 4 distinct value in V3 and each of them has different frequency) Then square each frequency and then sum up them

 

Was wondering if the below code is right.

 

 

Proc sql :

create table X as select *,     sum ( count ( distinct v3))*2 ) as want

group by V1 V2;

run;

 

Thank you for your help

 


 

fama
Fluorite | Level 6
I did not try. The data is quite big and it takes time to be ran.
I wanted to make sure if it is correct.
What is the correct one then?
Thanks
Reeza
Super User

If the data is too big, try playing around with a subset instead. 

Here's the code to get a subset if you need that:

 

data have;
input V1  V2  v3;
cards;
1      1    2
1      2    1
1      2    5
1      1    2
1      2    2
2     2     3
2     1    6
2     1    6
3     2    1
3     4    3
3     4    3
;;;;
run;

 

or:

 

data temp;
set have (obs=100);
run;

Post exactly you'd expect to see if your actual data was what you posted initially (my have data set) and I'll see what I can help you with. 

 


@fama wrote:
I did not try. The data is quite big and it takes time to be ran.
I wanted to make sure if it is correct.
What is the correct one then?
Thanks

 

fama
Fluorite | Level 6
The data is 170 GB!
Even finding the subset takes hours..
I will try thanks..
But what is wrong with the code?
novinosrin
Tourmaline | Level 20

run your code using your sample and look at the output and see If it matches your expected result, 

 

If you want us to provide you a solution, plz post the expected output for a prospective input

fama
Fluorite | Level 6
Thanks for your reply:
Here is the data and expected output.

data have;
input V1 V2 v3 want;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14

;;;;
run;
10=( 3*2)+ (1*2) where 3 is the frequency of 1 in V3 and 1 is the frequency of 2 in V3
14= (1*2) +(2*2)+(3*2) where 1 is the frequency of 6 in V3 and 2 is the frequency of 3 in V3 and 3 is the frequency of 4
novinosrin
Tourmaline | Level 20

 

Ok, how did you get the 10 and 14. What am i missing? 

 


@fama wrote:


10=( 3*2)+ (1*2) where 3 is the frequency of 1 in V3 and 1 is the frequency of 2 in V3
14= (1*2) +(2*2)+(3*2) where 1 is the frequency of 6 in V3 and 2 is the frequency of 3 in V3 and 3 is the frequency of 4

 

fama
Fluorite | Level 6
we need to first look at V2. for each distinct value say 1 we count the distinct values of V3.
When V2 =1 , V3 has 1 (frequency =3) and 2(frequency =1).
then we need to obtain the square of each frequency that means 3*2 and 1*2 and sum up the results that means 9+1= 10
Hope it is clear
Thanks
novinosrin
Tourmaline | Level 20

@fama   Thank you & yes clear. One last question, do you really want this in proc sql?

fama
Fluorite | Level 6
Sorry to get Square I should have used **.

Yes I want to create a table using SQL in SAS
novinosrin
Tourmaline | Level 20
data have;
input V1 V2 v3;
cards;
1 1 1 10
1 1 1 10
1 1 1 10
1 1 2 10
1 2 6 14
1 2 3 14
1 2 3 14
1 2 4 14
1 2 4 14
1 2 4 14
;;;;
run;

proc sql;
create table fama_want as
select a.*,want
from have a, (select v1,v2, sum(c) as want from (select *, count(v3)**2 as c from have group by v1, v2,v3 )
group by v1,v2) b
where a.v1=b.v1 and a.v2=b.v2;
quit;

I might be away for supper and stuff. Let us know if you need some clarification

fama
Fluorite | Level 6
Thank you very much Novinosrin.
I really appreciate your help.
The code works well.
Could you please clarify it for me. really like how it woks and learn...
Thanks again
novinosrin
Tourmaline | Level 20

I'm glad. It's almost midnight here. I'll leave some notes in the morning. Please don't mind. I promise i'll soon as I wake up

fama
Fluorite | Level 6
I really appreciate it!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 27 replies
  • 1686 views
  • 1 like
  • 5 in conversation