Hi @andreas_lds, I'm afraid that if the requirement is a count of distinct industries for each account number, your code would yield incorrect results. Take, for example, this revised "have", in which some Accountnumber/Industry pairs repeat:
data have;
  input Accountnumber $ Industry $;
  datalines;
148060 IN
148060 CL
148060 CL
148060 CC
148060 HW
148060 FC
92865 PL
92865 PL
150021 PL
150021 MB
150021 NL
;
Of course, a simple proc sort with nodupkey fixes the problem, because it leaves one row per Accountnumber/Industry pair before counting:
/* Keep one row per Accountnumber/Industry pair */
proc sort data=have out=_have nodupkey;
  by Accountnumber industry;
run;

data want;
  set _have(keep=AccountNumber);
  by AccountNumber notsorted;
  attrib noi length=8 label='number of industries';
  retain noi;
  if first.AccountNumber then do;
    noi = 0;
  end;
  noi = noi + 1;
  if last.AccountNumber then do;
    output;
  end;
run;
/* Or, for fun, a DOW loop */
proc sort data=have out=_have nodupkey;
  by Accountnumber industry;
run;

data want;
  /* The loop index doubles as the row counter for each account */
  do number_of_industries = 1 by 1 until(last.Accountnumber);
    set _have;
    by Accountnumber;
  end;
  drop industry;
run;
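As a cross-check on either data step, the same distinct count can also be sketched in PROC SQL, which needs no prior sort or de-duplication. This is only an alternative for comparison, not a replacement for the approaches above. The output name want_sql is made up for illustration. With the revised "have" it should give 5 for 148060, 1 for 92865 and 3 for 150021.

```sas
/* Sketch: count distinct industries per account straight from "have". */
/* want_sql is a hypothetical output name used only for this example.  */
proc sql;
  create table want_sql as
  select Accountnumber,
         count(distinct Industry) as number_of_industries
  from have
  group by Accountnumber;
quit;
```
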