BookmarkSubscribeRSS Feed
KDang
Fluorite | Level 6
Hi,

I have a dataset with id's and InviteDate's (DDMMYY)

id InviteDate
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
....


I'm trying to get a report to determine the average # of invitations a person would get in a month.

so i think the first step is to get the number of invites per person per month, so the data would look like:

id MonYY Invites
1 Jan10 2
1 May10 1
2 Jan10 1
2 May10 1
2 Jun10 1
2 Jul10 1
....


Thanks for your help.
13 REPLIES 13
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
You can use a DATA step to assign a "month" start date variable (presuming you are working with a SAS DATE type numeric variable for InviteDate), using the INTNX function; also using the appropriate SAS output FORMAT to display the date as you show in your post. Then use PROC SUMMARY / MEANS to generate your statistics output and PROC PRINT to display.

Scott Barry
SBBWorks, Inc.
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello Kdang,

This is one of many possible solutions:
[pre]
/* Input and create MonYY */;
data a;
input ID $ 1-1 d $ 3-12;
m=SUBSTR(d,4,2);
y=SUBSTR(d,9,2);
if m="01" then mt="Jan";
else if m="02" then mt="Feb";
else if m="03" then mt="Mar";
else if m="04" then mt="Apr";
else if m="05" then mt="May";
else if m="06" then mt="Jun";
else if m="07" then mt="Jul";
else if m="08" then mt="Aug";
else if m="09" then mt="Sep";
else if m="10" then mt="Oct";
else if m="11" then mt="Nov";
else if m="12" then mt="Dec";
else put "Error in month m=" m;
MonYY=mt||y;
keep ID MonYY;
datalines;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
/* Counting Invitations */;
proc SQL;
create table r as
select ID, MonYY, Count(*) as Invites
from a
group by ID, MonYY
;quit;
[/pre]
Sincerely,
SPR
Ksharp
Super User
Emmmm.Maybe the founction is your best companion.


[pre]
data temp;
input id invitedate ddmmyy10.;
monyy=substr(put(invitedate,date7.),notdigit(put(invitedate,date7.)));
cards;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
proc freq data=temp;
tables id*monyy /list out=result(drop=percent);
run;
proc print noobs label;
var id monyy count;
label count='invites';
run;

[/pre]



Ksharp
KDang
Fluorite | Level 6
Thanks for all the help, I ended up using the intnx solution.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Excellent choice - mostly because of the unacceptable sort-order consideration when attempting to string together "date related" character text and expect SAS to know how it is to be interpreted with a report and/or axis value. That's why SAS provides these powerful functions to operate effectively and efficiently, with minimal coding, most of the time.

Scott Barry
SBBWorks, Inc.
SPR
Quartz | Level 8 SPR
Quartz | Level 8
In the spirit of the Sbb post, this is one more variant using different SAS functions:
[pre]
/* Input and create MonYY */;
data a;
input ID InviteDate ddmmyy10.;
MonYY=MDY(Month(invitedate),01,Year(invitedate));
format InviteDate MonYY date7.;
datalines;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
/* Counting Invitations */;
proc SQL;
create table r as
select ID, MonYY, Count(*) as Invites
from a group by ID, MonYY;
quit;
[/pre]
SPR
Ksharp
Super User
Hi.
I also forget the order of date is very important.
Emmmm.I think that would be more simple. 🙂


[pre]
data temp;
input id invitedate ddmmyy10.;
cards;
1 01/01/2010
1 12/01/2010
1 15/05/2010
2 01/01/2010
2 15/05/2010
2 19/06/2010
2 30/07/2010
;
run;
proc freq data=temp;
tables id*invitedate /list out=result(drop=percent);
format invitedate monyy5.;
run;
proc print noobs label;
label count='invites';
run;
[/pre]



Ksharp
Hima
Obsidian | Level 7

PROC SQL;

SELECT ID, MonYY AS MONTH, COUNT(ID) AS INVITEES

FROM TABLENAME

GROUP BY ID;

QUIT;

Haikuo
Onyx | Level 15

Proc SQL actually does provide a native approach for this problem, however, I got lost in your code and I am not sure how it works. it seems to me it would need more work like this:

data have;

infile cards;

input id $ InviteDate :ddmmyy10.;

format invitedate date9.;

cards;

1 01/01/2010

1 12/01/2010

1 15/05/2010

2 01/01/2010

2 15/05/2010

2 19/06/2010

2 30/07/2010

;

proc sql;

create table want as

select id, put(invitedate, monyy5.) as my, count(*) as ct

from have

group by id, calculated my

;quit;

proc print;run;

Regards,

Haikuo

Hima
Obsidian | Level 7

When I ran your code the results are as follows.


                                    Obs    id     my      ct

                                     1     1     JAN10     2
                                     2     1     MAY10     1
                                     3     2     JAN10     1
                                     4     2     JUL10     1
                                     5     2     JUN10     1
                                     6     2     MAY10     1

Here are the results from my code

                                    Obs    id     my      ct

                                     1     1     JAN10     2
                                     2     1     MAY10     1
                                     3     2     JAN10     1
                                     4     2     JUL10     1
                                     5     2     JUN10     1
                                     6     2     MAY10     1

I dont see anything different in the results.

Haikuo
Onyx | Level 15

What is your platform? Are you using SAS?  I am running SAS 9.2 on winxp sp3 32bit. Your code does not work on my machine, and it seems to me it does not have valid syntax. Please try again or provide your full running code.

udo_sas
SAS Employee

Hello -

I realize that you might not have access to SAS/ETS software - however, in case you do, you might want to consider running PROC TIMESERIES (which creates some nice summary stats as well - see OUTSUM data set) - after modifying your source data slightly.

If you don't care about month a person was not contacted, simply get rid of the setmissing option.

Regards,

Udo

data have;

infile cards;

input id $ InviteDate :ddmmyy10. ct;

format invitedate date9.;

cards;

1 01/01/2010 1

1 12/01/2010 1

1 15/05/2010 1

2 01/01/2010 1

2 15/05/2010 1

2 19/06/2010 1

2 30/07/2010 1

;

proc timeseries data=have out=want outsum=outsum;

id invitedate interval=month accumulate=total setmissing=0;

var ct;

by id;

run;

Hima
Obsidian | Level 7

Good catch Mr.Hai.kuo. Not sure how I overlooked at it. Please ignore my previous code. This should work fine now.

Code:

PROC SQL;
CREATE TABLE HAVE1 AS
SELECT ID, invitedate AS MONTH FORMAT MMYYS., COUNT(ID) AS INVITEES
FROM have
QUIT;

PROC FREQ DATA = HAVE1;
TABLES ID * MONTH /LIST MISSING NOROW NOCOL NOPERCENT NOCUM;
RUN;

Output:

The FREQ Procedure

id MONTH Frequency

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

1 01/2010 2

1 05/2010 1

2 01/2010 1

2 05/2010 1

2 06/2010 1

2 07/2010 1

LOG:

628 PROC SQL;

629 CREATE TABLE HAVE1 AS

630 SELECT ID, invitedate AS MONTH FORMAT MMYYS., COUNT(ID) AS INVITEES

631 FROM have

632 QUIT;

NOTE: The query requires remerging summary statistics back with the original data.

NOTE: Table WORK.HAVE1 created, with 7 rows and 3 columns.

633

NOTE: PROCEDURE SQL used (Total process time):

real time 0.01 seconds

cpu time 0.01 seconds

634 PROC FREQ DATA = HAVE1;

635 TABLES ID * MONTH /LIST MISSING NOROW NOCOL NOPERCENT NOCUM;

636 RUN;

NOTE: There were 7 observations read from the data set WORK.HAVE1.

NOTE: PROCEDURE FREQ used (Total process time):

real time 0.03 seconds

cpu time 0.01 seconds

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 3929 views
  • 0 likes
  • 7 in conversation