DATA Step, Macro, Functions and more

Counts by id by month

Reply
Contributor
Posts: 27

Counts by id by month

Hi,

I have a dataset with id's and InviteDate's (DDMMYY)

id InviteDate
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
....


I'm trying to get a report to determine the average # of invitations a person would get in a month.

so i think the first step is to get the number of invites per person per month, so the data would look like:

id MonYY Invites
1 Jan10 2
1 May10 1
2 Jan10 1
2 May10 1
2 Jun10 1
2 Jul10 1
....


Thanks for your help.
Super Contributor
Super Contributor
Posts: 3,174

Re: Counts by id by month

You can use a DATA step to assign a "month" start date variable (presuming you are working with a SAS DATE type numeric variable for InviteDate), using the INTNX function; also using the appropriate SAS output FORMAT to display the date as you show in your post. Then use PROC SUMMARY / MEANS to generate your statistics output and PROC PRINT to display.

Scott Barry
SBBWorks, Inc.
Super Contributor
Super Contributor
Posts: 365

Re: Counts by id by month

Hello Kdang,

This is one of many possible solutions:
[pre]
/* Input and create MonYY */;
data a;
input ID $ 1-1 d $ 3-12;
m=SUBSTR(d,4,2);
y=SUBSTR(d,9,2);
if m="01" then mt="Jan";
else if m="02" then mt="Feb";
else if m="03" then mt="Mar";
else if m="04" then mt="Apr";
else if m="05" then mt="May";
else if m="06" then mt="Jun";
else if m="07" then mt="Jul";
else if m="08" then mt="Aug";
else if m="09" then mt="Sep";
else if m="10" then mt="Oct";
else if m="11" then mt="Nov";
else if m="12" then mt="Dec";
else put "Error in month m=" m;
MonYY=mt||y;
keep ID MonYY;
datalines;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
/* Counting Invitations */;
proc SQL;
create table r as
select ID, MonYY, Count(*) as Invites
from a
group by ID, MonYY
;quit;
[/pre]
Sincerely,
SPR
Super User
Posts: 9,691

Re: Counts by id by month

Emmmm.Maybe the founction is your best companion.


[pre]
data temp;
input id invitedate ddmmyy10.;
monyy=substr(put(invitedate,date7.),notdigit(put(invitedate,date7.)));
cards;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
proc freq data=temp;
tables id*monyy /list out=result(drop=percent);
run;
proc print noobs label;
var id monyy count;
label count='invites';
run;

[/pre]



Ksharp
Contributor
Posts: 27

Re: Counts by id by month

Thanks for all the help, I ended up using the intnx solution.
Super Contributor
Super Contributor
Posts: 3,174

Re: Counts by id by month

Excellent choice - mostly because of the unacceptable sort-order consideration when attempting to string together "date related" character text and expect SAS to know how it is to be interpreted with a report and/or axis value. That's why SAS provides these powerful functions to operate effectively and efficiently, with minimal coding, most of the time.

Scott Barry
SBBWorks, Inc.
Super Contributor
Super Contributor
Posts: 365

Re: Counts by id by month

In the spirit of the Sbb post, this is one more variant using different SAS functions:
[pre]
/* Input and create MonYY */;
data a;
input ID InviteDate ddmmyy10.;
MonYY=MDY(Month(invitedate),01,Year(invitedate));
format InviteDate MonYY date7.;
datalines;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
/* Counting Invitations */;
proc SQL;
create table r as
select ID, MonYY, Count(*) as Invites
from a group by ID, MonYY;
quit;
[/pre]
SPR
Super User
Posts: 9,691

Re: Counts by id by month

Hi.
I also forget the order of date is very important.
Emmmm.I think that would be more simple. Smiley Happy


[pre]
data temp;
input id invitedate ddmmyy10.;
cards;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
proc freq data=temp;
tables id*invitedate /list out=result(drop=percent);
format invitedate monyy5.;
run;
proc print noobs label;
label count='invites';
run;
[/pre]



Ksharp
Regular Contributor
Posts: 233

Re: Counts by id by month

PROC SQL;

SELECT ID, MonYY AS MONTH, COUNT(ID) AS INVITEES

FROM TABLENAME

GROUP BY ID;

QUIT;

Respected Advisor
Posts: 3,124

Re: Counts by id by month

Proc SQL actually does provide a native approach for this problem, however, I got lost in your code and I am not sure how it works. it seems to me it would need more work like this:

data have;

infile cards;

input id $ InviteDate :ddmmyy10.;

format invitedate date9.;

cards;

1 01/01/2010

1 12/01/2010

1 15/05/2010

2 01/01/2010

2 15/05/2010

2 19/06/2010

2 30/07/2010

;

proc sql;

create table want as

select id, put(invitedate, monyy5.) as my, count(*) as ct

from have

group by id, calculated my

;quit;

proc print;run;

Regards,

Haikuo

Regular Contributor
Posts: 233

Re: Counts by id by month

When I ran your code the results are as follows.


                                    Obs    id     my      ct

                                     1     1     JAN10     2
                                     2     1     MAY10     1
                                     3     2     JAN10     1
                                     4     2     JUL10     1
                                     5     2     JUN10     1
                                     6     2     MAY10     1

Here are the results from my code

                                    Obs    id     my      ct

                                     1     1     JAN10     2
                                     2     1     MAY10     1
                                     3     2     JAN10     1
                                     4     2     JUL10     1
                                     5     2     JUN10     1
                                     6     2     MAY10     1

I dont see anything different in the results.

Respected Advisor
Posts: 3,124

Re: Counts by id by month

What is your platform? Are you using SAS?  I am running SAS 9.2 on winxp sp3 32bit. Your code does not work on my machine, and it seems to me it does not have valid syntax. Please try again or provide your full running code.

SAS Employee
Posts: 416

Re: Counts by id by month

Hello -

I realize that you might not have access to SAS/ETS software - however, in case you do, you might want to consider running PROC TIMESERIES (which creates some nice summary stats as well - see OUTSUM data set) - after modifying your source data slightly.

If you don't care about month a person was not contacted, simply get rid of the setmissing option.

Regards,

Udo

data have;

infile cards;

input id $ InviteDate :ddmmyy10. ct;

format invitedate date9.;

cards;

1 01/01/2010 1

1 12/01/2010 1

1 15/05/2010 1

2 01/01/2010 1

2 15/05/2010 1

2 19/06/2010 1

2 30/07/2010 1

;

proc timeseries data=have out=want outsum=outsum;

id invitedate interval=month accumulate=total setmissing=0;

var ct;

by id;

run;

Regular Contributor
Posts: 233

Re: Counts by id by month

Good catch Mr.Hai.kuo. Not sure how I overlooked at it. Please ignore my previous code. This should work fine now.

Code:

PROC SQL;
CREATE TABLE HAVE1 AS
SELECT ID, invitedate AS MONTH FORMAT MMYYS., COUNT(ID) AS INVITEES
FROM have
QUIT;

PROC FREQ DATA = HAVE1;
TABLES ID * MONTH /LIST MISSING NOROW NOCOL NOPERCENT NOCUM;
RUN;

Output:

The FREQ Procedure

id MONTH Frequency

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

1 01/2010 2

1 05/2010 1

2 01/2010 1

2 05/2010 1

2 06/2010 1

2 07/2010 1

LOG:

628 PROC SQL;

629 CREATE TABLE HAVE1 AS

630 SELECT ID, invitedate AS MONTH FORMAT MMYYS., COUNT(ID) AS INVITEES

631 FROM have

632 QUIT;

NOTE: The query requires remerging summary statistics back with the original data.

NOTE: Table WORK.HAVE1 created, with 7 rows and 3 columns.

633

NOTE: PROCEDURE SQL used (Total process time):

real time 0.01 seconds

cpu time 0.01 seconds

634 PROC FREQ DATA = HAVE1;

635 TABLES ID * MONTH /LIST MISSING NOROW NOCOL NOPERCENT NOCUM;

636 RUN;

NOTE: There were 7 observations read from the data set WORK.HAVE1.

NOTE: PROCEDURE FREQ used (Total process time):

real time 0.03 seconds

cpu time 0.01 seconds

Ask a Question
Discussion stats
  • 13 replies
  • 344 views
  • 0 likes
  • 7 in conversation