Hi,
I have one dataset (dataset 1) with 10,000 observations and another dataset (dataset 2) with 150,000 observations. What code could be used so that two output datasets are created when comparing account_id between two datasets?
1. In one output dataset (account_match) to be created, there is a list of all of the account_id (and also select all of the corresponding variables associated with that account_id in the output dataset) from dataset 1 that are also in dataset 2 (i.e. where there the account_id in dataset 1 is also in dataset 2)?
2. The second output dataset to be created, account_no_match, would output all of the account_id in dataset 1 that are not seen in dataset 2 (and also select all of the corresponding variables associated with that account_id in the output dataset) .
As a result, number of accounts in account_match + number of accounts in account_no_match = 10,000 observations (i.e. dataset 1)
Code:
data account_match account_no_match;
[insert code here]
run;
data account_1;
input account_id $;
datalines;
1
2
3
4
5
6
7
8
;
data account_2;
input account_id $;
datalines;
14
5
1
2
99
98
93
;
proc sort data=account_1;
by account_id;
run;
proc sort data=account_2;
by account_id;
run;
data account_match account_no_match;
merge account_1 (in=a)
account_2 (in=b);
by account_id;
if a and b then output account_match;
else if a and not b then output account_no_match;
run;
Not sure if this is what you're trying to get at.
data account_1;
input account_id $;
datalines;
1
2
3
4
5
6
7
8
;
data account_2;
input account_id $;
datalines;
14
5
1
2
99
98
93
;
proc sort data=account_1;
by account_id;
run;
proc sort data=account_2;
by account_id;
run;
data account_match account_no_match;
merge account_1 (in=a)
account_2 (in=b);
by account_id;
if a and b then output account_match;
else if a and not b then output account_no_match;
run;
Not sure if this is what you're trying to get at.
If I only wanted variables from dataset "account_1" to be included in both output datasets (account_match and account_no_match), how could I change your code?
data account_match account_no_match;
merge account_1 (in=a)
account_2 (in=b);
by account_id;
if a and b then output account_match;
else if a and not b then output account_no_match;
run;
You want the same results for both data sets? That's how I'm reading your question. If so:
data account_match account_no_match;
merge account_1 (in=a)
account_2 (in=b);
by account_id;
if a and b then output account_match account_no_match;
run;
I don't think this is what you're trying to convey though. You should post a snapshot of your data.
@jeremy4 wrote:
If I only wanted variables from dataset "account_1" to be included in both output datasets (account_match and account_no_match), how could I change your code?
data account_match account_no_match; merge account_1 (in=a) account_2 (in=b); by account_id; if a and b then output account_match; else if a and not b then output account_no_match; run;
Only read account_id from table account_2.
....
account_2 (in=b keep=account_id);
Here two other options how to do this.
data account_1;
input account_id $;
some_other_var=_n_;
datalines;
1
2
3
4
5
6
7
8
;
data account_2;
input account_id $;
datalines;
14
5
1
2
99
98
93
;
/* options 1 */
data match no_match;
if _n_=1 then
do;
dcl hash h1(dataset:'account_2(keep=account_id)');
h1.defineKey('account_id');
h1.defineDone();
end;
set account_1;
if h1.check()=0 then output match;
else output no_match;
run;
/* option 2 */
proc sql;
create table match_noMatch as
select
t1.*,
case when missing(t2.account_id) then '0' else '1' end as match_flg
from
account_1 as t1
left join
(select distinct account_id from account_2) as t2
on t1.account_id=t2.account_id
;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.