I have a person-level dataset that has the individual's preferred store (1,2, or 3), and binary indicators if the individual likes eating apples, oranges, bananas, pineapple:
data have_person;
input id store apple orange banana pineapple;
cards;
1 2 1 0 1 1
2 3 1 1 1 1
3 1 0 1 1 1
4 3 0 1 0 1
5 2 1 1 1 1
6 1 0 0 1 1
;
I also have a store-level dataset that tells me if the store SELLS apples, oranges, bananas, or pineapples.
(Note that individuals' preferred store may NOT sell the fruit that they like to eat (example: person 1 prefers store 2 and likes eating apples, bananas, pineapples. However, store 2 does not sell bananas or pineapples (only apples and oranges). Also, stores may sell fruit that people who shop at the store do not like eating.)
data have_store;
input store apple orange banana pineapple;
cards;
1 1 1 1 0
2 1 1 0 0
3 0 0 1 1
;
So what I need to calculate using data from these 2 datasets is: a specific proportion for EACH pairwise combination of fruit. There are two specific details that need to be taken into account for the calculation:
I need a calculation for all the pairwise combinations, but the trick is that the pairwise combinations that need calculations are defined by the STORE dataset and not the person-level dataset (individuals could like eating fruit that is unavailable at the store, and I don't care about that).
This is what the resulting dataset needs to look like:
Store Combination Numerator Denominator Proportion
1 apple_orange 0 0 0
1 apple_banana 0 0 0
1 orange_banana 1 1 1
1 orange_apple 0 1 0
1 banana_apple 0 2 0
1 banana_orange 1 2 0.5
2 apple_orange 1 2 0.5
2 orange_apple 1 1 1
3 banana_pineapple 1 1 1
3 pineapple_banana 1 2 0.5
If it makes it easier to rename the fruit to fruit1, fruit2, fruit3, fruit4, that's fine. I think this needs some type of sql statement, but I'm honestly just not familiar with sql enough to figure out all the details that this type of dataset requires.
Any help is greatly appreciated!!
You would need to transpose your data to get this done with SQL:
proc transpose data=have_person out=person(rename=col1=like) name=fruit;
by id store notsorted;
var apple orange banana pineapple;
run;
proc transpose data=have_store out=store(rename=col1=sells) name=fruit;
by store notsorted;
var apple orange banana pineapple;
run;
proc sql;
create table proportions as
select
a.store,
catx("_", a.fruit, b.fruit) as fruits,
sum(c.like and d.like) as numerator,
sum(c.like) as denominator,
coalesce(calculated numerator / calculated denominator, 0) as proportion
from
store as a inner join
store as b on a.store=b.store and a.fruit ne b.fruit inner join
person as c on a.store=c.store and a.fruit=c.fruit inner join
person as d on c.id=d.id and a.store=d.store and b.fruit=d.fruit
where a.sells and b.sells
group by a.store, a.fruit, b.fruit;
quit;
@PGStats You previously helped me with a somewhat similar coding question. This adds an additional layer of information on top of it. Any insight you may have would be greatly, greatly appreciated! Thanks again for your help previously!
You would need to transpose your data to get this done with SQL:
proc transpose data=have_person out=person(rename=col1=like) name=fruit;
by id store notsorted;
var apple orange banana pineapple;
run;
proc transpose data=have_store out=store(rename=col1=sells) name=fruit;
by store notsorted;
var apple orange banana pineapple;
run;
proc sql;
create table proportions as
select
a.store,
catx("_", a.fruit, b.fruit) as fruits,
sum(c.like and d.like) as numerator,
sum(c.like) as denominator,
coalesce(calculated numerator / calculated denominator, 0) as proportion
from
store as a inner join
store as b on a.store=b.store and a.fruit ne b.fruit inner join
person as c on a.store=c.store and a.fruit=c.fruit inner join
person as d on c.id=d.id and a.store=d.store and b.fruit=d.fruit
where a.sells and b.sells
group by a.store, a.fruit, b.fruit;
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.