Help using Base SAS procedures

Creating cumulative counts over period of time

Reply
Regular Contributor
Posts: 188

Creating cumulative counts over period of time

Hi I have a dataset like this:

Month  Geo_id

1           A

1           B

1           C

2           A

2           D

2           E

I want to select cumulative counts like:

Month   Count

1           3

2           5

Where 3 is the count of distinct geo_ids for 1st month, 5 is the count of distinct geo_ids for month 1,2 and so on for other months.

I tried this unsuccessfully:

proc sql;

create table temp as select month,count(distinct geo_id) as cnt from active_phy1 where min(month) ge month le month

group by month;quit;

Frequent Contributor
Posts: 115

Re: Creating cumulative counts over period of time

data have;
input (month geo_id) ($);
datalines;
1           A
1           B
1           C
2           A
2           D
2           E
;

proc sort data=have out=have1 nodupkey;
by geo_id;
run;

data want;
do until(last.month);
set have1;
by month;
count+1;
end;
drop geo_id;
run;

Regards,

Naveen Srinivasan

L&T Infotech

Regular Contributor
Posts: 188

Re: Creating cumulative counts over period of time

Thanks,

Anyway we can do this by sql?

Frequent Contributor
Posts: 115

Re: Creating cumulative counts over period of time

has given you the sql answer, I hope you noted. Thanks

Occasional Contributor
Posts: 17

Re: Creating cumulative counts over period of time

Hi,

Your code will not work I guess in case of dataset like

data have;

  input month geo_id$;

  datalines;

1 A

1 B

1 C

2 A

2 D

2 E

3 A

3 C

3 D

;

run;

Yogesh

Super User
Super User
Posts: 7,401

Re: Creating cumulative counts over period of time

Hi,

Two ways jump to mind, first ave a sub-query do the count:

proc sql;

  create table WANT as

  select  A.MONTH,

          (select count(distinct GEO_ID) from HAVE where MONTH <= A.MONTH) as CNT

  from    (select distinct MONTH from HAVE) A;

quit;

Second, proc sort nodupkey before the SQL:

proc sort data=have nodupkey;

  by geo_id;

run;

proc sql;

  create table WANT as

  select  MONTH,

          count(DISTINCT GEO_ID) as CNT

  from    HAVE

  group by MONTH;

quit;

Super User
Posts: 9,676

Re: Creating cumulative counts over period of time

It is not good for SQL , low efficient .

data have;
input (month geo_id) ($);
datalines;
1           A
1           B
1           C
2           A
2           D
2           E
;
run;

proc sql;
 create table want as
  select month,(select count(distinct geo_id) from have where month le a.month )  as count
   from (select distinct month from have) as a ;
run;


It is good for data step.Fast .

data want1;
 set have;
 by month;
 array x{999999} $ _temporary_ ;
 if geo_id not in x then do;n+1;x{n}=geo_id;end;
 if last.month;
 drop geo_id;
run;

Xia Keshan

Respected Advisor
Posts: 3,124

Re: Creating cumulative counts over period of time

I like the full dynamic Hash approach better, no waste of memory, no worry about busted array.

data have;

     input (month geo_id) ($);

     datalines;

1           A

1           B

1           C

2           A

2           D

2           E

;

run;

data want;

     if _n_=1 then

           do;

                declare hash h();

                h.definekey('geo_id');

                h.definedone();

           end;

     do until (last.month);

           set have;

           by month;

           rc=h.add();

     end;

     count=h.num_items;

     keep month count;

run;

Super Contributor
Posts: 305

Re: Creating cumulative counts over period of time

hello,

data have;
input Month  Geo_id $;
datalines;
1 A
1 B
1 C
2 A
2 D
2 E
;

proc sort data=have;
by month;
run;


data want;

if _N_=1 then do;
declare hash g(ordered:'Y');
g.definekey('gr');
g.definedone();
end;

set have;
by month;

gr=Geo_id;

if g.find(key:Geo_id) ne 0 then do;g.add();end;

if last.month then do;totalitems = g.num_items;output;end;

keep month totalitems;
run;

Ask a Question
Discussion stats
  • 8 replies
  • 665 views
  • 7 likes
  • 7 in conversation