Hello. I have two datasets- what I would like to do is that combine two dataset by group and month_year but also include the ones that do not overlap with dates:
Dataset A
month_year | expected_rate |
1/1/2017 | 0 |
2/1/2017 | 5 |
3/1/2017 | 10 |
4/1/2017 | 15 |
5/1/2017 | 35 |
6/1/2017 | 45 |
7/1/2017 | 51 |
8/1/2017 | 53 |
9/1/2017 | 59 |
10/1/2017 | 65 |
11/1/2017 | 95 |
12/1/2017 | 100 |
Dataset B
group | actual_rate | rmd_my |
A | 2 | 2/1/2017 |
A | 7 | 3/1/2017 |
A | 51 | 10/1/2017 |
A | 67 | 11/1/2017 |
B | 0 | 1/1/2017 |
B | 4 | 2/1/2017 |
B | 10 | 4/1/2017 |
B | 25 | 5/1/2017 |
B | 65 | 12/1/2017 |
WANT:
group | month_year | expected_rate | actual_rate |
A | 1/1/2017 | 0 | |
A | 2/1/2017 | 5 | 2 |
A | 3/1/2017 | 10 | 7 |
A | 4/1/2017 | 15 | |
A | 5/1/2017 | 35 | |
A | 6/1/2017 | 45 | |
A | 7/1/2017 | 51 | |
A | 8/1/2017 | 53 | |
A | 9/1/2017 | 59 | |
A | 10/1/2017 | 65 | 51 |
A | 11/1/2017 | 95 | 67 |
A | 12/1/2017 | 100 | |
B | 1/1/2017 | 0 | 0 |
B | 2/1/2017 | 5 | 4 |
B | 3/1/2017 | 10 | |
B | 4/1/2017 | 15 | 10 |
B | 5/1/2017 | 35 | 25 |
B | 6/1/2017 | 45 | |
B | 7/1/2017 | 51 | |
B | 8/1/2017 | 53 | |
B | 9/1/2017 | 59 | |
B | 10/1/2017 | 65 | |
B | 11/1/2017 | 95 | |
B | 12/1/2017 | 100 | 65 |
I tried the following proc sql command:
Proc sql;
SELECT * FROM dataset_a
FULL JOIN dataset_b
ON dataset_b.month_year= dataset_a.month_year
I also tried full outer join, right join. None of them seem to work
I believe this will give you what you want...
data sample_a ;
input month_year:ddmmyy10. expected_rate ;
format month_year ddmmyy10. ;
datalines ;
1/1/2017 0
2/1/2017 5
3/1/2017 10
4/1/2017 15
5/1/2017 35
6/1/2017 45
7/1/2017 51
8/1/2017 53
9/1/2017 59
10/1/2017 65
11/1/2017 95
12/1/2017 100
; run ;
data sample_b ;
input group $ actual_rate rmd_my:ddmmyy8. ;
format rmd_my ddmmyy10. ;
datalines ;
A 2 2/1/2017
A 7 3/1/2017
A 51 10/1/2017
A 67 11/1/2017
B 0 1/1/2017
B 4 2/1/2017
B 10 4/1/2017
B 25 5/1/2017
B 65 12/1/2017
; run ;
proc sql ;
create table results as
select 'A' as group, sample_a.month_year, sample_a.expected_rate, sample_b.actual_rate
from sample_a
left join sample_b
on sample_a.month_year = sample_b.rmd_my and sample_b.group = 'A' /* Gets all the data for Group A */
outer union corresponding /* This is a concatenation of the results set before and after */
select 'B' as group, sample_a.month_year, sample_a.expected_rate, sample_b.actual_rate
from sample_a
left join sample_b
on sample_a.month_year = sample_b.rmd_my and sample_b.group = 'B' /* Gets all the data for Group B */ ;
quit ;
but if you need something a little more dynamic that does not need you to know the values of Group, try this - its harder to follow but looks like the results are good
proc sql ;
create table results as
select c.group, c.month_year, c.expected_rate, d.actual_rate
from (select a.month_year, a.expected_rate, b.group
from sample_a a,
(select distinct b.group from sample_b b) ) c /* this gives you all the combinations of month_year and group, along with expected_rate */
left join sample_b d
on c.month_year = d.rmd_my and c.group=d.group /* and this gives you the actual rates */
order by c.group, c.month_year ;
quit ;
Thank you! Just a quick question - I tried the second block of the code, but I am getting error at " b) ) c" . it says syntax error but I tried to remove ) and add it else where but still doesn't work
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.