Hello,
I'm trying to get a certain output through a join.
Code for creating Input datasets:
data rank;
infile datalines;
input rmg $4. acc;
datalines;
123 1
124 2
125 3
126 4
127 5
;
run;
data rank1;
infile datalines;
input rmg $4. acc;
datalines;
123 1
124 2
125 3
123 7
124 8
125 9
136 10
138 11
;
run;
What I'm looking to do is get all the data from table rank and get only that data from table rank1 where rank.rmg=rank1.rmg but rank.acc is not in rank1.acc to avoid duplication. So for eg, for rmg 123 in rank, the code should include 123 from rank, but in addition should also pull 123 from rank1 where acc is not 1.
So this is the output that I'm looking to get:
rmg | acc |
123 | 1 |
123 | 7 |
124 | 2 |
124 | 8 |
125 | 3 |
125 | 9 |
126 | 4 |
127 | 5 |
When I try this code:
proc sql;
create table new as
select rank.*, rank1.rmg, rank1.acc from rank left join rank1
on rank.rmg=rank1.rmg
where rank1.acc not in (select distinct acc from rank);
quit;
I get this output:
rmg | acc |
123 | 1 |
123 | 1 |
124 | 2 |
124 | 2 |
125 | 3 |
125 | 3 |
126 | 4 |
127 | 5 |
Then when I try this code:
proc sql;
create table new as
select rank.*, rank1.rmg as rmg1, rank1.acc as acc1 from rank left join rank1
on rank.rmg=rank1.rmg
where rank1.acc not in (select distinct acc from rank);
quit;
I get this output:
rmg | acc | rmg1 | acc1 |
123 | 1 | 123 | 7 |
124 | 2 | 124 | 8 |
125 | 3 | 125 | 9 |
126 | 4 | ||
127 | 5 |
What should I be changing to get the desired output? Appreciate any suggestions.
Thank you!
Methinks that judging from your input/output data logic, what you're looking for is:
data rank ;
input rmg :$3. acc ;
cards ;
123 1
124 2
125 3
126 4
127 5
;
run ;
data rank1;
input rmg :$3. acc ;
cards ;
123 1
124 2
125 3
123 7
124 8
125 9
136 10
138 11
;
run ;
proc sql ;
create table want as
select * from rank
union
select * from rank1 where rmg in (select rmg from rank)
;
quit ;
UNION (without ALL) unduplicates the rows with the same RMG and ACC.
Kind regards
Paul D.
Try OUTER UNION CORRESPONDING instead of UNION in @hashman 's example then.
thank you for your reply, but outer union corresponding would create additional columns for me.
So to recap, I made slight changes to my input dataset to accomodate more columns, and change their names too:
data rank;/*base*/
infile datalines;
input check $2. rmgid $4. accno blah $2.;
datalines;
y 123 1 y
n 124 2 n
y 125 3 y
n 126 4 n
y 127 5 y
;
run;
data rank1;
infile datalines;
input rmg $4. acc;
datalines;
123 1
124 2
125 3
123 7
124 8
125 9
136 10
138 11
;
run;
and the output I'm trying for is:
check | rmgid | accno | blah |
Y | 123 | 1 | Y |
N | 124 | 2 | N |
Y | 125 | 3 | Y |
N | 126 | 4 | N |
Y | 127 | 5 | Y |
| 123 | 7 |
|
| 124 | 8 |
|
| 125 | 9 |
|
Basically trying to union them such that I get all the data from table rank and get only that data from table rank1 where rank.rmgid=rank1.rmg but rank.accno is not in rank1.acc to avoid duplication.
Does SELECT DISTINCT instead of just SELECT give you the result you want?
Cheating.....
proc sql ;
create table want as
select * from rank
union
select ' ' as check
,rmgid
,accno
,' ' as blah
from rank1
where rmgid in (select rmgid from rank)
and accno not in (select accno from rank)
order by accno
;
quit ;
Works fine for me with this complete program:
data rank;
infile datalines;
input check $2. rmgid $4. accno blah $2.;
datalines;
y 123 1 y
n 124 2 n
y 125 3 y
n 126 4 n
y 127 5 y
;
run;
data rank1;
infile datalines;
input rmgid $4. accno;
datalines;
123 1
124 2
125 3
123 7
124 8
125 9
136 10
138 11
;
run;
proc sql ;
create table want as
select * from rank
union
select ' ' as check
,rmgid
,accno
,' ' as blah
from rank1
where rmgid in (select rmgid from rank)
and accno not in (select accno from rank)
order by accno
;
quit ;
If you want to scale this solution then I'd probably start looking at using DATA steps as they are a whole lot more flexible for stacking tables.
>If you want to scale this solution then I'd probably start looking at using DATA steps as they are a whole lot more flexible for stacking tables.<
You snatched it off the tip of my proverbial online tongue ;).
And thank you for joining (no pun intended) the thread with your astute responses.
Kind regards
Paul D.
@hashman - Thanks for providing the original UNION solution - I just started playing with it...
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.