BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
telc24
Obsidian | Level 7

Hello friends!

I was hoping someone could help me think through this programming question. I have a dataset that looks like this:

 

data my_data;
input id option1 option2 option3 option 4;
cards;
A 1 0 1 1

B 1 1 1 1

C 0 1 1 1 

D 0 1 0 1

E 1 1 1 1

F 0 0 1 1;
run;

 

What I need to do is calculate a specific proportion for EACH pairwise combination. The proportion would essentially be the: (# of individuals in BOTH options) / (# of individuals in the first option). For example, I would need the calculation for Option1_Option2 to be: # option 1 AND option 2 / # option 1 = 1/3. Similarly, Option2_Option1 would be: # option 1 AND option 2 / # option 2 = 1/4. And so on...

 

I need the resulting table output to look like this:

Option1_Option2 2/3

Option1_Option3  3/3

Option1_Option4 3/3

Option2_Option1 2/4

Option2_Option3 3/4

Option2_Option4 4/4

Option3_Option1 3/5

Option3_Option2 3/5

Option3_Option4 5/5

Option4_Option1 3/6

Option4_Option2 4/6

Option4_Option3 5/6

 

Where I get tripped up in the coding is the pairwise combinations that I need to look at. Any advice or help would be hugely appreciated! Thank you in advance!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

This should do it:

 

data my_data;
input id $ option1 option2 option3 option4;
cards;
A 1 0 1 1
B 1 1 1 1
C 0 1 1 1 
D 0 1 0 1
E 1 1 1 1
F 0 0 1 1
;

proc transpose data=my_data out=have;
var option:;
by id;
run;

proc sql;
create table want as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1*b.col1) / coalesce(sum(a.col1), 1) as proportion
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;
PG

View solution in original post

9 REPLIES 9
PGStats
Opal | Level 21

A well formulated question.

 

However, it would seem to me that Option1_Option2 should be 2/3  (AE)/(ABE) right?

PG
telc24
Obsidian | Level 7

Yes! Sorry about that! I played around with the mock dataset a few times and didn't catch that change!

PGStats
Opal | Level 21

This should do it:

 

data my_data;
input id $ option1 option2 option3 option4;
cards;
A 1 0 1 1
B 1 1 1 1
C 0 1 1 1 
D 0 1 0 1
E 1 1 1 1
F 0 0 1 1
;

proc transpose data=my_data out=have;
var option:;
by id;
run;

proc sql;
create table want as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1*b.col1) / coalesce(sum(a.col1), 1) as proportion
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;
PG
telc24
Obsidian | Level 7

@PGStats this is brilliant!! Thank you so much! You just saved me so much time 🙂 

telc24
Obsidian | Level 7

Hi @PGStats , thank you again for your help! I have one more question that would help me automate things. I also have one more identifier that I want to account for (the "type" variable below). 

 

data my_data;

input id $ type option1 option2 option3 option4;

cards;

A 1 1 0 1 1

B 2 1 1 1 1

C 1 0 1 1 1

D 2 0 1 0 1

E 1 1 1 1 1

F 2 0 0 1 1

;

 

Currently, the code you generously provided includes the proportion calculation for types 1 AND 2. Is there an easy way to additionally add a column for the proportion calculation for type 1 only? And then a column for the proportion calculation for type 2 only?

 

I'm not very familiar with using proc sql to generate tables, so any general resources to help me learn would also be greatly appreciated!

 

Thank you, again!

 

 

 

Jagadishkatam
Amethyst | Level 16

You can try the below code , i updated the @PGStats code

 

proc sql;
create table want as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1*b.col1) / coalesce(sum(a.col1), 1) as proportion, sum(a.col1*b.col1) as type1, coalesce(sum(a.col1), 1) as type2
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;
Thanks,
Jag
telc24
Obsidian | Level 7

Thanks @Jagadishkatam .Currently, your code is giving me additional columns for the numerator of the "combination" variable (variable=type1) and the denominator of the "combination" variable (variable=type2).


What I'm looking for is a similar "combination" calculation variable for instances for the individuals who are type=1. And then a similar "combination" calculation variable for the individuals who are type=2. Is there a way to add an if/then or when statement to proc sql to do so? For example, the table output for the dataset shown below would look like this:

 

Dataset:

data my_data;

input id $ type option1 option2 option3 option4;

cards;

A 1 1 0 1 1

B 2 1 1 1 1

C 1 0 1 1 1

D 2 0 1 0 1

E 1 1 1 1 1

F 2 0 0 1 1

;

 

Output table:

Option1-Option2 2/3 1/2 1/1

Option1-Option3 3/3 2/2 1/1

Option1-Option4 3/3 2/2 1/1

Option2-Option1 2/4 1/2 1/2

etc.

 

PGStats
Opal | Level 21

Here are two methods for obtaining proportions by type, assuming that ids are unique keys:

 

data my_data;
input id $ type option1 option2 option3 option4;
cards;
A 1 1 0 1 1
B 2 1 1 1 1
C 1 0 1 1 1 
D 2 0 1 0 1
E 1 1 1 1 1
F 2 0 0 1 1
;

proc transpose data=my_data out=have;
var option:;
by id type;
run;

proc sql;
create table want1 as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportion,
    sum(a.type=1 and a.col1 and b.col1) / coalesce(sum(a.type=1 and a.col1), 1) as proportionType1,
    sum(a.type=2 and a.col1 and b.col1) / coalesce(sum(a.type=2 and a.col1), 1) as proportionType2
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;


proc sql;
create table want2 as
select aa.combination, proportion, proportionType1, proportionType2 from

   (select 
        catx("-", a._name_, b._name_) as combination,
        sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportion
    from 
        have as a inner join
        have as b on a.id=b.id and a._name_ ne b._name_
    group by a._name_, b._name_) as aa left join

   (select 
        catx("-", a._name_, b._name_) as combination,
        sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportionType1
    from 
        have as a inner join
        have as b on a.id=b.id and a._name_ ne b._name_
    where a.type = 1
    group by a._name_, b._name_) as bb on aa.combination = bb.combination left join

   (select 
        catx("-", a._name_, b._name_) as combination,
        sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportionType2
    from 
        have as a inner join
        have as b on a.id=b.id and a._name_ ne b._name_
    where a.type = 2
    group by a._name_, b._name_) as cc on aa.combination = cc.combination
    ;
quit;

To scale this up to, say, 20 types, you would need a different approach involving an extra transposition.

PG
Ksharp
Super User
data my_data;
input id $ option1 option2 option3 option4;
cards;
A 1 0 1 1
B 1 1 1 1
C 0 1 1 1 
D 0 1 0 1
E 1 1 1 1
F 0 0 1 1
;

proc corr data=my_data out=have(where=(_type_='SSCP'))  sscp noprint;
run;
proc transpose data=have(where=(_name_ ne 'Intercept')) out=have2(index=(name)) name=name;
by _name_;
var option: ;
run;

proc summary data=my_data ;
var option: ;
output out=total sum=;
run;
proc transpose data=total out=total2 name=name ;
var option: ;
run;

data want;
 merge have2 total2(rename=(col1=col2));
 by name;
 want=col1/col2;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2139 views
  • 6 likes
  • 4 in conversation