DATA Step, Macro, Functions and more

Subset Counts

Reply
Occasional Contributor
Posts: 14

Subset Counts

I am working with a transaction type data set where there are multiple observations for individual children nested within families.  Each child has a unique" child_id" and an associated "familynumber." I would like to generate a count of the number of children in each family.  There are 746,000 observations from 76,000 families.  I have attempted to use the following code, but each of the 76,000 subsets (one for each family) takes about 4 seconds to process.  Does anyone know of a more efficient way to go about generating this count variable?  Thank you for your help.

proc sql noprint;

select left(put(count(distinct(familynumber)),15.0)) into :fmcount work.time;

select distinct(familynumber) into :family1 - :family&fmcount from work.time;

quit;

%macro family_count;

%do i = 1 %to &fmcount;

proc sql noprint;

select count(distinct(child_id)) from work.time

where familynumber = &&family&i;

quit;

%end;

%mend family_count;

%family_count;

Respected Advisor
Posts: 4,920

Re: Subset Counts

Posted in reply to PhillipSherlock

As far as I can see, you don't need macro programming at all to do this.

proc sql;

create table familyCounts as

select familyNumber, count(distinct child_id) as childCount

from time

group by familyNumber;

quit;

PG

PG
Super User
Posts: 19,772

Re: Subset Counts

Posted in reply to PhillipSherlock

My fav is the double proc freq, one because it will give you a list of family/child list, and the second output will be the number of children per family.

proc freq data=time noprint;

table familynumber*child_id/out=family_child_list;

run;

proc freq data=family_child_list no print;

table familynumber/out=family_child_count;

run;

Super User
Posts: 10,020

Re: Subset Counts

No need for double proc freq , just add an option NLEVELS . if I am not making a mistake .

data class;set sashelp.class;run;
proc sort data=class;by sex;run;
ods select NLevels;
ods output NLevels=want;
proc freq data=class nlevels;
by sex;
tables age;
run;


Xia Keshan

Ask a Question
Discussion stats
  • 3 replies
  • 241 views
  • 6 likes
  • 4 in conversation