Desktop productivity for business analysts and programmers

Unique count for

Accepted Solution Solved
Reply
Contributor
Posts: 48
Accepted Solution

Unique count for

Hi All,

 

Sample data below

 

RefInvoiceNumberInvoiceTypeReason
123A1NAAIncome
123A2NAAIncome
123A3DPPIncome
123A4PPACare

 

I would like to do a unique count combing Ref and InvoiceType and a unique count combing Ref and Reason. Output should be as below.

 

RefInvoiceNumberInvoiceTypeReasonUnqiueRefUniqueReasonUniqueInvoiceType
123A1NAAIncome111
123A2NAAIncome   
123A3DPPIncome  1
123A4PPACare 11

 

I had thought of concatenating Ref and InvoiceType (new field UniqueInvoiceType) and also Ref and Reason (new field UniqueReason) and doing an if first on both thos fields.

 

Is this the best/most efficient way of achiveing this?

 

Cheers

Haydn


Accepted Solutions
Solution
‎09-20-2017 11:16 PM
Super User
Posts: 10,623

Re: Unique count for

I love this question. What if there are more than one Ref ?


data have;
input Ref $ 1-3 InvoiceNumber $ 5-6	InvoiceType $ 8-10	Reason $ 12-18;
datalines;
123	A1	NAA	Income
123	A2	NAA	Income
123	A3	DPP	Income
123	A4	PPA	Care
223	A1	NAA	Income
223	A2	NAA	Income
223	A3	DPP	Income
223	A4	PPA	Care
;

data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h1(dataset:'have');
  h1.definekey('ref');
  h1.definedone();

  declare hash h2(dataset:'have');
  h2.definekey('ref','InvoiceType');
  h2.definedone();

  declare hash h3(dataset:'have');
  h3.definekey('ref','Reason');
  h3.definedone();
 end;
set have;
if h1.check()=0 then do;UnqiueRef=1;h1.remove();end;
if h2.check()=0 then do;UniqueReason=1;h2.remove();end;
if h3.check()=0 then do;UniqueInvoiceType=1;h3.remove();end;
run;

View solution in original post


All Replies
PROC Star
Posts: 500

Re: Unique count for

I am very sure, there will be much easier and cleaner solution then what I have written. if your combination of ref and invoicenumber are unique then you can do sperately each first.group and then join them back as shown below. 

data have;
input Ref $ 1-3 InvoiceNumber $ 5-6	InvoiceType $ 8-10	Reason $ 12-18;
datalines;
123	A1	NAA	Income
123	A2	NAA	Income
123	A3	DPP	Income
123	A4	PPA	Care
223	A1	NAA	Income
223	A2	NAA	Income
223	A3	DPP	Income
223	A4	PPA	Care
;

proc sort data = have out=have1;
by Ref invoicenumber;
run;

data have1a;
set have1;
by Ref invoicenumber;
if first.ref then UnqiueRef = 1; else UnqiueRef =.;
 run;
 
 proc sort data = have out=have2;
by Ref  InvoiceType;
run;
 
 data have2a;
set have2;
by Ref  InvoiceType;
if first.invoicetype = 1 then UniqueInvoiceType =1 ; 
 else UniqueInvoiceType =.;
 run;
 
  proc sort data = have out=have3;
by Ref reason;
run;
 
 data have3a;
set have3;
by Ref  reason;
if first.reason = 1 then Uniquereason =1 ; 
 else Uniquereason =.;
 run;
 
 
 proc sql;
 create table want as
 select a.Ref,
        a.invoicenumber,
        a.invoicetype,
        a.reason,
        a.UnqiueRef,
        b.UniqueInvoiceType,
        c.Uniquereason
 from have1a a, have2a b, have3a c
 where  a.Ref = b.ref
 and    a.ref=c.ref
 and   a.InvoiceNumber =b.InvoiceNumber
 and    a.InvoiceNumber =c.InvoiceNumber
 order by a.Ref, a.InvoiceNumber;
 quit;
Solution
‎09-20-2017 11:16 PM
Super User
Posts: 10,623

Re: Unique count for

I love this question. What if there are more than one Ref ?


data have;
input Ref $ 1-3 InvoiceNumber $ 5-6	InvoiceType $ 8-10	Reason $ 12-18;
datalines;
123	A1	NAA	Income
123	A2	NAA	Income
123	A3	DPP	Income
123	A4	PPA	Care
223	A1	NAA	Income
223	A2	NAA	Income
223	A3	DPP	Income
223	A4	PPA	Care
;

data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h1(dataset:'have');
  h1.definekey('ref');
  h1.definedone();

  declare hash h2(dataset:'have');
  h2.definekey('ref','InvoiceType');
  h2.definedone();

  declare hash h3(dataset:'have');
  h3.definekey('ref','Reason');
  h3.definedone();
 end;
set have;
if h1.check()=0 then do;UnqiueRef=1;h1.remove();end;
if h2.check()=0 then do;UniqueReason=1;h2.remove();end;
if h3.check()=0 then do;UniqueInvoiceType=1;h3.remove();end;
run;

Contributor
Posts: 48

Re: Unique count for

Thanks for your help on this.

There will definitely be more than 1 ref in the dataset (it has 2 million rows). I'm running this now and will post back here on how it went.

 

Cheers

Haydn

Contributor
Posts: 48

Re: Unique count for

Thanks for you help, your code worked like a charm
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 312 views
  • 2 likes
  • 3 in conversation