Hello. I have two datasets- what I would like to do is that combine two dataset by group and month_year but also include the ones that do not overlap with dates:
Dataset A
month_year | expected_rate |
1/1/2017 | 0 |
2/1/2017 | 5 |
3/1/2017 | 10 |
4/1/2017 | 15 |
5/1/2017 | 35 |
6/1/2017 | 45 |
7/1/2017 | 51 |
8/1/2017 | 53 |
9/1/2017 | 59 |
10/1/2017 | 65 |
11/1/2017 | 95 |
12/1/2017 | 100 |
Dataset B
group | actual_rate | rmd_my |
A | 2 | 2/1/2017 |
A | 7 | 3/1/2017 |
A | 51 | 10/1/2017 |
A | 67 | 11/1/2017 |
B | 0 | 1/1/2017 |
B | 4 | 2/1/2017 |
B | 10 | 4/1/2017 |
B | 25 | 5/1/2017 |
B | 65 | 12/1/2017 |
WANT:
group | month_year | expected_rate | actual_rate |
A | 1/1/2017 | 0 | |
A | 2/1/2017 | 5 | 2 |
A | 3/1/2017 | 10 | 7 |
A | 4/1/2017 | 15 | |
A | 5/1/2017 | 35 | |
A | 6/1/2017 | 45 | |
A | 7/1/2017 | 51 | |
A | 8/1/2017 | 53 | |
A | 9/1/2017 | 59 | |
A | 10/1/2017 | 65 | 51 |
A | 11/1/2017 | 95 | 67 |
A | 12/1/2017 | 100 | |
B | 1/1/2017 | 0 | 0 |
B | 2/1/2017 | 5 | 4 |
B | 3/1/2017 | 10 | |
B | 4/1/2017 | 15 | 10 |
B | 5/1/2017 | 35 | 25 |
B | 6/1/2017 | 45 | |
B | 7/1/2017 | 51 | |
B | 8/1/2017 | 53 | |
B | 9/1/2017 | 59 | |
B | 10/1/2017 | 65 | |
B | 11/1/2017 | 95 | |
B | 12/1/2017 | 100 | 65 |
I tried the following proc sql command:
Proc sql;
SELECT * FROM dataset_a
FULL JOIN dataset_b
ON dataset_b.month_year= dataset_a.month_year
I also tried full outer join, right join. None of them seem to work
I believe this will give you what you want...
data sample_a ;
input month_year:ddmmyy10. expected_rate ;
format month_year ddmmyy10. ;
datalines ;
1/1/2017 0
2/1/2017 5
3/1/2017 10
4/1/2017 15
5/1/2017 35
6/1/2017 45
7/1/2017 51
8/1/2017 53
9/1/2017 59
10/1/2017 65
11/1/2017 95
12/1/2017 100
; run ;
data sample_b ;
input group $ actual_rate rmd_my:ddmmyy8. ;
format rmd_my ddmmyy10. ;
datalines ;
A 2 2/1/2017
A 7 3/1/2017
A 51 10/1/2017
A 67 11/1/2017
B 0 1/1/2017
B 4 2/1/2017
B 10 4/1/2017
B 25 5/1/2017
B 65 12/1/2017
; run ;
proc sql ;
create table results as
select 'A' as group, sample_a.month_year, sample_a.expected_rate, sample_b.actual_rate
from sample_a
left join sample_b
on sample_a.month_year = sample_b.rmd_my and sample_b.group = 'A' /* Gets all the data for Group A */
outer union corresponding /* This is a concatenation of the results set before and after */
select 'B' as group, sample_a.month_year, sample_a.expected_rate, sample_b.actual_rate
from sample_a
left join sample_b
on sample_a.month_year = sample_b.rmd_my and sample_b.group = 'B' /* Gets all the data for Group B */ ;
quit ;
but if you need something a little more dynamic that does not need you to know the values of Group, try this - its harder to follow but looks like the results are good
proc sql ;
create table results as
select c.group, c.month_year, c.expected_rate, d.actual_rate
from (select a.month_year, a.expected_rate, b.group
from sample_a a,
(select distinct b.group from sample_b b) ) c /* this gives you all the combinations of month_year and group, along with expected_rate */
left join sample_b d
on c.month_year = d.rmd_my and c.group=d.group /* and this gives you the actual rates */
order by c.group, c.month_year ;
quit ;
Thank you! Just a quick question - I tried the second block of the code, but I am getting error at " b) ) c" . it says syntax error but I tried to remove ) and add it else where but still doesn't work
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: