BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
telc24
Obsidian | Level 7

Hello friends!

I was hoping someone could help me think through this programming question. I have a dataset that looks like this:

 

data my_data;
input id option1 option2 option3 option 4;
cards;
A 1 0 1 1

B 1 1 1 1

C 0 1 1 1 

D 0 1 0 1

E 1 1 1 1

F 0 0 1 1;
run;

 

What I need to do is calculate a specific proportion for EACH pairwise combination. The proportion would essentially be the: (# of individuals in BOTH options) / (# of individuals in the first option). For example, I would need the calculation for Option1_Option2 to be: # option 1 AND option 2 / # option 1 = 1/3. Similarly, Option2_Option1 would be: # option 1 AND option 2 / # option 2 = 1/4. And so on...

 

I need the resulting table output to look like this:

Option1_Option2 2/3

Option1_Option3  3/3

Option1_Option4 3/3

Option2_Option1 2/4

Option2_Option3 3/4

Option2_Option4 4/4

Option3_Option1 3/5

Option3_Option2 3/5

Option3_Option4 5/5

Option4_Option1 3/6

Option4_Option2 4/6

Option4_Option3 5/6

 

Where I get tripped up in the coding is the pairwise combinations that I need to look at. Any advice or help would be hugely appreciated! Thank you in advance!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

This should do it:

 

data my_data;
input id $ option1 option2 option3 option4;
cards;
A 1 0 1 1
B 1 1 1 1
C 0 1 1 1 
D 0 1 0 1
E 1 1 1 1
F 0 0 1 1
;

proc transpose data=my_data out=have;
var option:;
by id;
run;

proc sql;
create table want as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1*b.col1) / coalesce(sum(a.col1), 1) as proportion
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;
PG

View solution in original post

9 REPLIES 9
PGStats
Opal | Level 21

A well formulated question.

 

However, it would seem to me that Option1_Option2 should be 2/3  (AE)/(ABE) right?

PG
telc24
Obsidian | Level 7

Yes! Sorry about that! I played around with the mock dataset a few times and didn't catch that change!

PGStats
Opal | Level 21

This should do it:

 

data my_data;
input id $ option1 option2 option3 option4;
cards;
A 1 0 1 1
B 1 1 1 1
C 0 1 1 1 
D 0 1 0 1
E 1 1 1 1
F 0 0 1 1
;

proc transpose data=my_data out=have;
var option:;
by id;
run;

proc sql;
create table want as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1*b.col1) / coalesce(sum(a.col1), 1) as proportion
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;
PG
telc24
Obsidian | Level 7

@PGStats this is brilliant!! Thank you so much! You just saved me so much time 🙂 

telc24
Obsidian | Level 7

Hi @PGStats , thank you again for your help! I have one more question that would help me automate things. I also have one more identifier that I want to account for (the "type" variable below). 

 

data my_data;

input id $ type option1 option2 option3 option4;

cards;

A 1 1 0 1 1

B 2 1 1 1 1

C 1 0 1 1 1

D 2 0 1 0 1

E 1 1 1 1 1

F 2 0 0 1 1

;

 

Currently, the code you generously provided includes the proportion calculation for types 1 AND 2. Is there an easy way to additionally add a column for the proportion calculation for type 1 only? And then a column for the proportion calculation for type 2 only?

 

I'm not very familiar with using proc sql to generate tables, so any general resources to help me learn would also be greatly appreciated!

 

Thank you, again!

 

 

 

Jagadishkatam
Amethyst | Level 16

You can try the below code , i updated the @PGStats code

 

proc sql;
create table want as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1*b.col1) / coalesce(sum(a.col1), 1) as proportion, sum(a.col1*b.col1) as type1, coalesce(sum(a.col1), 1) as type2
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;
Thanks,
Jag
telc24
Obsidian | Level 7

Thanks @Jagadishkatam .Currently, your code is giving me additional columns for the numerator of the "combination" variable (variable=type1) and the denominator of the "combination" variable (variable=type2).


What I'm looking for is a similar "combination" calculation variable for instances for the individuals who are type=1. And then a similar "combination" calculation variable for the individuals who are type=2. Is there a way to add an if/then or when statement to proc sql to do so? For example, the table output for the dataset shown below would look like this:

 

Dataset:

data my_data;

input id $ type option1 option2 option3 option4;

cards;

A 1 1 0 1 1

B 2 1 1 1 1

C 1 0 1 1 1

D 2 0 1 0 1

E 1 1 1 1 1

F 2 0 0 1 1

;

 

Output table:

Option1-Option2 2/3 1/2 1/1

Option1-Option3 3/3 2/2 1/1

Option1-Option4 3/3 2/2 1/1

Option2-Option1 2/4 1/2 1/2

etc.

 

PGStats
Opal | Level 21

Here are two methods for obtaining proportions by type, assuming that ids are unique keys:

 

data my_data;
input id $ type option1 option2 option3 option4;
cards;
A 1 1 0 1 1
B 2 1 1 1 1
C 1 0 1 1 1 
D 2 0 1 0 1
E 1 1 1 1 1
F 2 0 0 1 1
;

proc transpose data=my_data out=have;
var option:;
by id type;
run;

proc sql;
create table want1 as
select 
    catx("-", a._name_, b._name_) as combination,
    sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportion,
    sum(a.type=1 and a.col1 and b.col1) / coalesce(sum(a.type=1 and a.col1), 1) as proportionType1,
    sum(a.type=2 and a.col1 and b.col1) / coalesce(sum(a.type=2 and a.col1), 1) as proportionType2
from 
    have as a inner join
    have as b on a.id=b.id and a._name_ ne b._name_
group by a._name_, b._name_;
quit;


proc sql;
create table want2 as
select aa.combination, proportion, proportionType1, proportionType2 from

   (select 
        catx("-", a._name_, b._name_) as combination,
        sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportion
    from 
        have as a inner join
        have as b on a.id=b.id and a._name_ ne b._name_
    group by a._name_, b._name_) as aa left join

   (select 
        catx("-", a._name_, b._name_) as combination,
        sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportionType1
    from 
        have as a inner join
        have as b on a.id=b.id and a._name_ ne b._name_
    where a.type = 1
    group by a._name_, b._name_) as bb on aa.combination = bb.combination left join

   (select 
        catx("-", a._name_, b._name_) as combination,
        sum(a.col1 and b.col1) / coalesce(sum(a.col1), 1) as proportionType2
    from 
        have as a inner join
        have as b on a.id=b.id and a._name_ ne b._name_
    where a.type = 2
    group by a._name_, b._name_) as cc on aa.combination = cc.combination
    ;
quit;

To scale this up to, say, 20 types, you would need a different approach involving an extra transposition.

PG
Ksharp
Super User
data my_data;
input id $ option1 option2 option3 option4;
cards;
A 1 0 1 1
B 1 1 1 1
C 0 1 1 1 
D 0 1 0 1
E 1 1 1 1
F 0 0 1 1
;

proc corr data=my_data out=have(where=(_type_='SSCP'))  sscp noprint;
run;
proc transpose data=have(where=(_name_ ne 'Intercept')) out=have2(index=(name)) name=name;
by _name_;
var option: ;
run;

proc summary data=my_data ;
var option: ;
output out=total sum=;
run;
proc transpose data=total out=total2 name=name ;
var option: ;
run;

data want;
 merge have2 total2(rename=(col1=col2));
 by name;
 want=col1/col2;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2210 views
  • 6 likes
  • 4 in conversation