Hi! I am having the worst time of my life right now with what should be a super simple procedure. I have three tables. The first table (A) shows all the possible combinations of Group and Bucket. The second table (B) has values for random combinations of Group and Bucket, but also has another field, Type, that comes into play. And the third table (C) has Avg_Values for each combination of Bucket and Type. What I want is the fourth table (D), where it brings in all possible combinations of Group and Bucket for both types, and matches them with their value from table (B). But if (B) has no matching value, then grab the Avg_Value for that Bucket and Type from table (C).
(A)
Group | Bucket |
Red | a |
Red | b |
Red | c |
Blue | a |
Blue | b |
Blue | c |
Green | a |
Green | b |
Green | c |
(B)
Type | Group | Bucket | Value |
First | Red | a | 2 |
First | Red | b | 9 |
First | Red | c | 1 |
First | Blue | a | 3 |
First | Blue | c | 8 |
First | Green | b | 4 |
First | Green | c | 5 |
Second | Red | a | 4 |
Second | Red | c | 7 |
Second | Blue | a | 9 |
Second | Blue | b | 8 |
Second | Green | a | 10 |
Second | Green | b | 2 |
Second | Green | c | 1 |
(C)
Type | Bucket | Avg_Value |
First | a | 99 |
First | b | 89 |
First | c | 79 |
Second | a | 69 |
Second | b | 59 |
Second | c | 49 |
(D)
Type | Group | Bucket | New_Value |
First | Red | a | 2 |
First | Red | b | 9 |
First | Red | c | 79 |
First | Blue | a | 3 |
First | Blue | b | 89 |
First | Blue | c | 8 |
First | Green | a | 99 |
First | Green | b | 4 |
First | Green | c | 5 |
Second | Red | a | 4 |
Second | Red | b | 59 |
Second | Red | c | 7 |
Second | Blue | a | 9 |
Second | Blue | b | 8 |
Second | Blue | c | 49 |
Second | Green | a | 10 |
Second | Green | b | 2 |
Second | Green | c | 1 |
It really should be simple but can't for the life of me figure it out. Can anyone here help? Obviously I am in serious distress since its a Saturday before a holiday and I'm working on this lol. Thanks in advance!
The following code does what you want:
data a; /*This will be the base data set (the one that has all the cases)*/
length Type $8 Group $8 Bucket $8;
set a;
type="First";
output;
type="Second";
output;
run;
proc sql;
create table d as
select a.type ,a.group ,a.bucket ,coalesce(b.Value,c.Avg_Value) as New_Value
from a left join b on (a.type=b.type and a.group=b.group and a.bucket=b.bucket) /*In this join we take the values from b*/
left join c on (a.type=c.type and a.bucket=c.bucket) /*In this join we take the avg values from c*/
/*the coalesce function takes the first non missing value from its arguments*/
order by a.type, a.group desc, a.bucket;
quit;
Note: in your example you had
data set b
First | Red | c | 1 |
data set d
First | Red | c | 79 |
, I assumed it was a mistake.
The following code does what you want:
data a; /*This will be the base data set (the one that has all the cases)*/
length Type $8 Group $8 Bucket $8;
set a;
type="First";
output;
type="Second";
output;
run;
proc sql;
create table d as
select a.type ,a.group ,a.bucket ,coalesce(b.Value,c.Avg_Value) as New_Value
from a left join b on (a.type=b.type and a.group=b.group and a.bucket=b.bucket) /*In this join we take the values from b*/
left join c on (a.type=c.type and a.bucket=c.bucket) /*In this join we take the avg values from c*/
/*the coalesce function takes the first non missing value from its arguments*/
order by a.type, a.group desc, a.bucket;
quit;
Note: in your example you had
data set b
First | Red | c | 1 |
data set d
First | Red | c | 79 |
, I assumed it was a mistake.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.