BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vnreddy
Quartz | Level 8

Hi All,

 

I have two datasets A and B both are having more than 16 million records

I am trying to create a Flag based on the account_numbers e.g., if dataset A Account_number matches with Account_number from B dataset then the Flag should be 1 if not matches then Flag = 0

Matches Flag = 1

Mismatch Flag = 0

 

I want to see the percentage by Flag 1 & 0

for. eg., if i have total 10 records, out of which 8 matches in Dataset A and B then Flag 1 should be 80% and Flag 0 should be 20% 

 

I am looking for outputs like Output 1 & Outpu 2

 

Dataset A

Account_number Billing_Period
12345678 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345684 201611
12345685 201611
12345686 201611
12345687 201611

 

Dataset B

Account_number Billing_Period
12345672 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345687 201611
12345685 201611
12345686 201611
12345687 201611

 

Expected Output 1:

Account_number Account_number Billing_Period Flag
12345678 12345672 201611 0
12345679 12345679 201611 1
12345680 12345680 201611 1
12345681 12345681 201611 1
12345682 12345682 201611 1
12345683 12345683 201611 1
12345684 12345688 201611 0
12345685 12345685 201611 1
12345686 12345686 201611 1
12345687 12345687 201611 1

 

Expected output 2:

Billing_Period Flag Percentage
201611 1 80
201611 0 20

 

Thanks

nreddy

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User
data ds_a;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345678  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345684  201611
12345685  201611
12345686  201611
12345687  201611
;

data ds_b;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345672  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345687  201611
12345685  201611
12345686  201611
12345687  201611
;

data want;
set ds_a;
if _n_ = 1
then do;
  declare hash b (dataset:"ds_b");
  b.definekey("account_number","billing_period");
  b.definedone();
end;
flag = (b.check() = 0);
run;

proc freq data=want;
tables billing_period*flag / out=want2;
run;

View solution in original post

2 REPLIES 2
PeterClemmensen
Tourmaline | Level 20

Here is the flag part. Do you want the second output as a SAS data set or a report?

 

data a;
input Account_number $ Billing_Period $;
datalines;
12345678 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345684 201611
12345685 201611
12345686 201611
12345687 201611
;

data b;
input Account_number $ Billing_Period $;
datalines;
12345672 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345687 201611
12345685 201611
12345686 201611
12345687 201611
;

data want;
   if _N_ = 1 then do;
      dcl hash h(dataset : "b");
      h.definekey("Account_number");
      h.definedone();
   end;

   set a;

   flag = (h.find() = 0);
run;

Kurt_Bremser
Super User
data ds_a;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345678  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345684  201611
12345685  201611
12345686  201611
12345687  201611
;

data ds_b;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345672  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345687  201611
12345685  201611
12345686  201611
12345687  201611
;

data want;
set ds_a;
if _n_ = 1
then do;
  declare hash b (dataset:"ds_b");
  b.definekey("account_number","billing_period");
  b.definedone();
end;
flag = (b.check() = 0);
run;

proc freq data=want;
tables billing_period*flag / out=want2;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 984 views
  • 0 likes
  • 3 in conversation