BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lipty
Obsidian | Level 7

Dear all,

 

I have the following data set and would like to count the unique number of industries of a firm. 

 

Original data set:

 

gvkeyyearsicsic1sic2
48392018371137113714
48392018371161416411
48392018371152335622
48392017371137113714
48392017371137117379

 

This is the final data I would like to get

gvkeyyearnumberofsic
483920186
483920173

 

Thank you very much~~~~

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hello @Lipty 

 

Yes as @Reeza  suggests narrow and long works best for SAS, but FWIW

 

data have;
input gvkey	year	sic	sic1	sic2;
cards;
4839	2018	3711	3711	3714
4839	2018	3711	6141	6411
4839	2018	3711	5233	5622
4839	2017	3711	3711	3714
4839	2017	3711	3711	7379
;

data want;
do until(last.year);
 set have;
 by gvkey descending year;
 array t sic:;
 array j(9999) _temporary_;
 do over t;
  if t not in j then do;
  i=sum(i,1);
  j(i)=t;
  end;
 end;
end;
numberofsic=n(of j(*));
call missing(of j(*));
drop i sic:;
run;

View solution in original post

10 REPLIES 10
Reeza
Super User
Change your data structure to a long format via PROC TRANSPOSE and then use SQL.

proc sql;
create table want as
select gvkey, year, count(distinct sic) as numberofsic
from longfile
group by gvkey, year
order by 1, 1;
quit;
novinosrin
Tourmaline | Level 20

Hello @Lipty 

 

Yes as @Reeza  suggests narrow and long works best for SAS, but FWIW

 

data have;
input gvkey	year	sic	sic1	sic2;
cards;
4839	2018	3711	3711	3714
4839	2018	3711	6141	6411
4839	2018	3711	5233	5622
4839	2017	3711	3711	3714
4839	2017	3711	3711	7379
;

data want;
do until(last.year);
 set have;
 by gvkey descending year;
 array t sic:;
 array j(9999) _temporary_;
 do over t;
  if t not in j then do;
  i=sum(i,1);
  j(i)=t;
  end;
 end;
end;
numberofsic=n(of j(*));
call missing(of j(*));
drop i sic:;
run;
Lipty
Obsidian | Level 7

Thank you very much @Reeza But my original data has 168344 observations. When I transpose it, the data is wired. : )

 

Thank you very much, @novinosrin . I will try your method now. Many thanks. : )

Reeza
Super User

But my original data has 168344 observations. When I transpose it, the data is wired. : )

No idea why that's relevant. Long data doesn't matter to SAS and is more efficient overall.

 

Lipty
Obsidian | Level 7

Thank you very much. : ) It works~~~~ ^_^

novinosrin
Tourmaline | Level 20

You are welcome! 

novinosrin
Tourmaline | Level 20

Sorry I am feeling bored and dull. So for some entertainment, here is a solution that uses Hash method

 

data have;
input gvkey	year	sic	sic1	sic2;
cards;
4839	2018	3711	3711	3714
4839	2018	3711	6141	6411
4839	2018	3711	5233	5622
4839	2017	3711	3711	3714
4839	2017	3711	3711	7379
;

data want ;
if _n_=1 then do;
   dcl hash H () ;
   h.definekey  ("_n_") ;
   h.definedone () ;
end;
do until(last.year);
 set have;
 by gvkey descending year;
 array t sic:;
 do over t;
  if not missing(t) then _n_=t;
  _rc=h.add();
 end;
end;
numberofsic=h.num_items;
h.clear();
drop _: sic:;
run;

 

 

Lipty
Obsidian | Level 7

Thanks~~~~

 

I tried both and the results are consistent. : )

novinosrin
Tourmaline | Level 20

I am glad. Some fun that made me feel so stupid in retrospect. Did we realize that I was silly(1st solution) in using

numberofsic=n(of j(*));

where we could have simply used

numberofsic=i;

LOL. That proves I was sleeping. It's very trivial but attention to detail is very important in achieving immaculate code. 🙂

 

 

 

Lipty
Obsidian | Level 7

LOL. You are so funny. Thank you very much. I might bother you next time. ^_^

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1102 views
  • 0 likes
  • 3 in conversation