Hello. I have two datasets- what I would like to do is that combine two dataset by group and month_year but also include the ones that do not overlap with dates:
Dataset A
| month_year | expected_rate |
| 1/1/2017 | 0 |
| 2/1/2017 | 5 |
| 3/1/2017 | 10 |
| 4/1/2017 | 15 |
| 5/1/2017 | 35 |
| 6/1/2017 | 45 |
| 7/1/2017 | 51 |
| 8/1/2017 | 53 |
| 9/1/2017 | 59 |
| 10/1/2017 | 65 |
| 11/1/2017 | 95 |
| 12/1/2017 | 100 |
Dataset B
| group | actual_rate | rmd_my |
| A | 2 | 2/1/2017 |
| A | 7 | 3/1/2017 |
| A | 51 | 10/1/2017 |
| A | 67 | 11/1/2017 |
| B | 0 | 1/1/2017 |
| B | 4 | 2/1/2017 |
| B | 10 | 4/1/2017 |
| B | 25 | 5/1/2017 |
| B | 65 | 12/1/2017 |
WANT:
| group | month_year | expected_rate | actual_rate |
| A | 1/1/2017 | 0 | |
| A | 2/1/2017 | 5 | 2 |
| A | 3/1/2017 | 10 | 7 |
| A | 4/1/2017 | 15 | |
| A | 5/1/2017 | 35 | |
| A | 6/1/2017 | 45 | |
| A | 7/1/2017 | 51 | |
| A | 8/1/2017 | 53 | |
| A | 9/1/2017 | 59 | |
| A | 10/1/2017 | 65 | 51 |
| A | 11/1/2017 | 95 | 67 |
| A | 12/1/2017 | 100 | |
| B | 1/1/2017 | 0 | 0 |
| B | 2/1/2017 | 5 | 4 |
| B | 3/1/2017 | 10 | |
| B | 4/1/2017 | 15 | 10 |
| B | 5/1/2017 | 35 | 25 |
| B | 6/1/2017 | 45 | |
| B | 7/1/2017 | 51 | |
| B | 8/1/2017 | 53 | |
| B | 9/1/2017 | 59 | |
| B | 10/1/2017 | 65 | |
| B | 11/1/2017 | 95 | |
| B | 12/1/2017 | 100 | 65 |
I tried the following proc sql command:
Proc sql;
SELECT * FROM dataset_a
FULL JOIN dataset_b
ON dataset_b.month_year= dataset_a.month_year
I also tried full outer join, right join. None of them seem to work
I believe this will give you what you want...
data sample_a ;
input month_year:ddmmyy10. expected_rate ;
format month_year ddmmyy10. ;
datalines ;
1/1/2017 0
2/1/2017 5
3/1/2017 10
4/1/2017 15
5/1/2017 35
6/1/2017 45
7/1/2017 51
8/1/2017 53
9/1/2017 59
10/1/2017 65
11/1/2017 95
12/1/2017 100
; run ;
data sample_b ;
input group $ actual_rate rmd_my:ddmmyy8. ;
format rmd_my ddmmyy10. ;
datalines ;
A 2 2/1/2017
A 7 3/1/2017
A 51 10/1/2017
A 67 11/1/2017
B 0 1/1/2017
B 4 2/1/2017
B 10 4/1/2017
B 25 5/1/2017
B 65 12/1/2017
; run ;
proc sql ;
create table results as
select 'A' as group, sample_a.month_year, sample_a.expected_rate, sample_b.actual_rate
from sample_a
left join sample_b
on sample_a.month_year = sample_b.rmd_my and sample_b.group = 'A' /* Gets all the data for Group A */
outer union corresponding /* This is a concatenation of the results set before and after */
select 'B' as group, sample_a.month_year, sample_a.expected_rate, sample_b.actual_rate
from sample_a
left join sample_b
on sample_a.month_year = sample_b.rmd_my and sample_b.group = 'B' /* Gets all the data for Group B */ ;
quit ;
but if you need something a little more dynamic that does not need you to know the values of Group, try this - its harder to follow but looks like the results are good
proc sql ;
create table results as
select c.group, c.month_year, c.expected_rate, d.actual_rate
from (select a.month_year, a.expected_rate, b.group
from sample_a a,
(select distinct b.group from sample_b b) ) c /* this gives you all the combinations of month_year and group, along with expected_rate */
left join sample_b d
on c.month_year = d.rmd_my and c.group=d.group /* and this gives you the actual rates */
order by c.group, c.month_year ;
quit ;
Thank you! Just a quick question - I tried the second block of the code, but I am getting error at " b) ) c" . it says syntax error but I tried to remove ) and add it else where but still doesn't work
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.