Solved: creating flag and percentage calculation

vnreddy · Posted 02-09-2021 06:23 AM

Hi All,

I have two datasets A and B both are having more than 16 million records

I am trying to create a Flag based on the account_numbers e.g., if dataset A Account_number matches with Account_number from B dataset then the Flag should be 1 if not matches then Flag = 0

Matches Flag = 1

Mismatch Flag = 0

I want to see the percentage by Flag 1 & 0

for. eg., if i have total 10 records, out of which 8 matches in Dataset A and B then Flag 1 should be 80% and Flag 0 should be 20%

I am looking for outputs like Output 1 & Outpu 2

Dataset A

Account_number	Billing_Period
12345678	201611
12345679	201611
12345680	201611
12345681	201611
12345682	201611
12345683	201611
12345684	201611
12345685	201611
12345686	201611
12345687	201611

Dataset B

Account_number	Billing_Period
12345672	201611
12345679	201611
12345680	201611
12345681	201611
12345682	201611
12345683	201611
12345687	201611
12345685	201611
12345686	201611
12345687	201611

Expected Output 1:

Account_number	Account_number	Billing_Period	Flag
12345678	12345672	201611	0
12345679	12345679	201611	1
12345680	12345680	201611	1
12345681	12345681	201611	1
12345682	12345682	201611	1
12345683	12345683	201611	1
12345684	12345688	201611	0
12345685	12345685	201611	1
12345686	12345686	201611	1
12345687	12345687	201611	1

Expected output 2:

Billing_Period	Flag	Percentage
201611	1	80
201611	0	20

Thanks

nreddy

Kurt_Bremser · Posted 02-09-2021 06:45 AM

data ds_a;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345678  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345684  201611
12345685  201611
12345686  201611
12345687  201611
;

data ds_b;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345672  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345687  201611
12345685  201611
12345686  201611
12345687  201611
;

data want;
set ds_a;
if _n_ = 1
then do;
  declare hash b (dataset:"ds_b");
  b.definekey("account_number","billing_period");
  b.definedone();
end;
flag = (b.check() = 0);
run;

proc freq data=want;
tables billing_period*flag / out=want2;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

PeterClemmensen · Posted 02-09-2021 06:31 AM

Here is the flag part. Do you want the second output as a SAS data set or a report?

data a;
input Account_number $ Billing_Period $;
datalines;
12345678 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345684 201611
12345685 201611
12345686 201611
12345687 201611
;

data b;
input Account_number $ Billing_Period $;
datalines;
12345672 201611
12345679 201611
12345680 201611
12345681 201611
12345682 201611
12345683 201611
12345687 201611
12345685 201611
12345686 201611
12345687 201611
;

data want;
   if _N_ = 1 then do;
      dcl hash h(dataset : "b");
      h.definekey("Account_number");
      h.definedone();
   end;

   set a;

   flag = (h.find() = 0);
run;

.

The DATA to DATA Step Macro
Blog: SASnrd

Kurt_Bremser · Posted 02-09-2021 06:45 AM

data ds_a;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345678  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345684  201611
12345685  201611
12345686  201611
12345687  201611
;

data ds_b;
input account_number $ billing_period :yymmn6.;
format billing_period yymmn6.;
datalines;
12345672  201611
12345679  201611
12345680  201611
12345681  201611
12345682  201611
12345683  201611
12345687  201611
12345685  201611
12345686  201611
12345687  201611
;

data want;
set ds_a;
if _n_ = 1
then do;
  declare hash b (dataset:"ds_b");
  b.definekey("account_number","billing_period");
  b.definedone();
end;
flag = (b.check() = 0);
run;

proc freq data=want;
tables billing_period*flag / out=want2;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

creating flag and percentage calculation

Re: creating flag and percentage calculation

Re: creating flag and percentage calculation

Re: creating flag and percentage calculation

Registration is open

Call for Content EXTENDED

creating flag and percentage calculation

Re: creating flag and percentage calculation

Re: creating flag and percentage calculation

Re: creating flag and percentage calculation

Registration is open

Call for Content EXTENDED

SAS Training: Just a Click Away