BookmarkSubscribeRSS Feed
PharmlyDoc
Quartz | Level 8

I have a dataset called record with patients who activated an online account.

I have a column of dates as follows and I want to count the frequencies that each date occurs such that I can graph the number of activations (as a line over time) that occur every 2 weeks. 

 

How should I go about doing this? 

 

Activation_Date
2020/10/29:08:15:09 AM
2020/10/23:12:31:33 PM
2020/10/04:02:45:36 PM
2020/10/23:12:46:17 PM
2020/10/04:02:24:47 PM
2020/10/23:01:41:20 PM
2020/10/04:08:20:47 AM
2020/10/23:11:24:18 AM
2020/09/13:10:11:59 AM
2020/09/04:09:15:59 AM
2020/09/23:07:07:17 AM
2020/09/04:02:28:07 PM
2020/10/23:12:19:19 PM

 

So for example, in Excel, it would look like this: 

 

Screen Shot 2021-06-16 at 8.48.00 AM.png

 

 

4 REPLIES 4
PaigeMiller
Diamond | Level 26

You can turn your date time values into the half months via this data step command. From there, you can do whatever counting and plotting you want.

 

halfmonth=intnx('semimonth',datepart(activation_date),0,'b');

 

 

--
Paige Miller
PharmlyDoc
Quartz | Level 8

Okay,  thanks. So I used the data step and created the halfmonth variable. Then I created a new table that counts the number of times each date (frequency) appeared in the halfmonth column. Then I created a line chart in proc sgplot. 

 

data record_2;
set record;
halfmonth=intnx('semimonth',datepart(activation_date),0,'b');
format halfmonth MMDDYY10.;
run;

/* count the number of time a certain date appears in the halfmonth column    */
proc sql;
create table record_3  as 
select halfmonth,count(*) as count
FROM record_2 
group by halfmonth;
quit;

proc sgplot data=record_3;
	vline halfmonth / response=count;
	yaxis grid;
run;

 Obs Activation_Date halfmonth count12345678910111213

04SEP20:09:15:5909/01/20203
04SEP20:14:28:0709/01/20203
13SEP20:10:11:5909/01/20203
23SEP20:07:07:1709/16/20201
04OCT20:08:20:4710/01/20203
04OCT20:14:24:4710/01/20203
04OCT20:14:45:3610/01/20203
23OCT20:13:41:2010/16/20206
23OCT20:12:19:1910/16/20206
29OCT20:08:15:0910/16/20206
23OCT20:12:31:3310/16/20206
23OCT20:12:46:1710/16/20206
23OCT20:11:24:1810/16/20206

 

 Obs halfmonth count1234

09/01/20203
09/16/20201
10/01/20203
10/16/20206

 

 

 

Screen Shot 2021-06-16 at 11.30.18 AM.png

 

Sajid01
Meteorite | Level 14

In this examples, the periods and the counts are show.

data test (keep=bdate Activation_date period);
length period $ 21;
informat Activation_Date anydtdtm30. ;
format Activation_date datetime23.0 bdate date9.;
input Activation_Date;
bdate=intnx('semimonth',datepart(activation_date),0,'b');
period_b=put(intnx('semimonth',datepart(activation_date),0,'b'),date9.);
period_e=put(intnx('semimonth',datepart(activation_date),0,'e'),date9.);
period=period_b||" - "||period_e;
datalines;
2020/10/29:08:15:09 AM
2020/10/23:12:31:33 PM
2020/10/04:02:45:36 PM
2020/10/23:12:46:17 PM
2020/10/04:02:24:47 PM
2020/10/23:01:41:20 PM
2020/10/04:08:20:47 AM
2020/10/23:11:24:18 AM
2020/09/13:10:11:59 AM
2020/09/04:09:15:59 AM
2020/09/23:07:07:17 AM
2020/09/04:02:28:07 PM
2020/10/23:12:19:19 PM
;
run;
proc sql;
create table test2 (keep=period count) as
select  distinct period, count(period) as count
from test
group by period
order by bdate;
quit;
proc sgplot data=test2;
	series x=period y=count /markers ;
	
run;

The plot will have period on the x-axis and count on the x-axis
The output will be like this

Sajid01_0-1623862028068.png

The plot will be like this

Sajid01_1-1623862313583.png

 

PaigeMiller
Diamond | Level 26

By making the horizontal axis of your plot be a variable that is a text string, if a half-month has zero count in the original data set, then there is no spot on the horizontal axis for that half-month (your tick marks could be non-consecutive half-months). If you use the numeric variable BDATE as the horizontal axis variable, then all half-months between the earliest and latest will appear on the horizontal axis. In either case, you might want to handle the situation where a half-month has zero observations and arrange the plot so that a zero will appear for that half-month.

 

In general, for almost all problems, I recommend that if a variable represents a date, it should be numeric and not character.

--
Paige Miller

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 795 views
  • 1 like
  • 3 in conversation