BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vnreddy
Quartz | Level 8

Hi All,

 

I have two datasets A and B both are having more than 16 million records

I am trying to create a Flag based on the account_numbers e.g., if dataset A Account_number matches with Account_number from B dataset then the Flag should be 1 if not matches then Flag = 0

Matches Flag = 1

Mismatch Flag = 0

 

I want to see the percentage by Flag 1 & 0

for. eg., if i have total 10 records, out of which 8 matches in Dataset A and B then Flag 1 should be 80% and Flag 0 should be 20% 

 

I am looking for outputs like Output 1 & Outpu 2

 

Dataset A

Account_number Billing_Period
12345678 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345684 201611
12345685 201611
12345686 201611
12345687 201611

 

Dataset B

Account_number Billing_Period
12345672 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345687 201611
12345685 201611
12345686 201611
12345687 201611

 

Expected Output 1:

Account_number Account_number Billing_Period Flag
12345678 12345672 201611 0
12345679 12345679 201611 1
12345680 12345680 201611 1
12345681 12345681 201611 1
12345682 12345682 201611 1
12345683 12345683 201611 1
12345684 12345688 201611 0
12345685 12345685 201611 1
12345686 12345686 201611 1
12345687 12345687 201611 1

 

Expected output 2:

Billing_Period Flag Percentage
201611 1 80
201611 0 20

 

Thanks

nreddy

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User
data ds_a;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345678  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345684  201611
12345685  201611
12345686  201611
12345687  201611
;

data ds_b;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345672  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345687  201611
12345685  201611
12345686  201611
12345687  201611
;

data want;
set ds_a;
if _n_ = 1
then do;
  declare hash b (dataset:"ds_b");
  b.definekey("account_number","billing_period");
  b.definedone();
end;
flag = (b.check() = 0);
run;

proc freq data=want;
tables billing_period*flag / out=want2;
run;

View solution in original post

2 REPLIES 2
PeterClemmensen
Tourmaline | Level 20

Here is the flag part. Do you want the second output as a SAS data set or a report?

 

data a;
input Account_number $ Billing_Period $;
datalines;
12345678 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345684 201611
12345685 201611
12345686 201611
12345687 201611
;

data b;
input Account_number $ Billing_Period $;
datalines;
12345672 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345687 201611
12345685 201611
12345686 201611
12345687 201611
;

data want;
   if _N_ = 1 then do;
      dcl hash h(dataset : "b");
      h.definekey("Account_number");
      h.definedone();
   end;

   set a;

   flag = (h.find() = 0);
run;

Kurt_Bremser
Super User
data ds_a;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345678  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345684  201611
12345685  201611
12345686  201611
12345687  201611
;

data ds_b;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345672  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345687  201611
12345685  201611
12345686  201611
12345687  201611
;

data want;
set ds_a;
if _n_ = 1
then do;
  declare hash b (dataset:"ds_b");
  b.definekey("account_number","billing_period");
  b.definedone();
end;
flag = (b.check() = 0);
run;

proc freq data=want;
tables billing_period*flag / out=want2;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1028 views
  • 0 likes
  • 3 in conversation