BookmarkSubscribeRSS Feed
PharmlyDoc
Quartz | Level 8

I have a dataset called record with patients who activated an online account.

I have a column of dates as follows and I want to count the frequencies that each date occurs such that I can graph the number of activations (as a line over time) that occur every 2 weeks. 

 

How should I go about doing this? 

 

Activation_Date
2020/10/29:08:15:09 AM
2020/10/23:12:31:33 PM
2020/10/04:02:45:36 PM
2020/10/23:12:46:17 PM
2020/10/04:02:24:47 PM
2020/10/23:01:41:20 PM
2020/10/04:08:20:47 AM
2020/10/23:11:24:18 AM
2020/09/13:10:11:59 AM
2020/09/04:09:15:59 AM
2020/09/23:07:07:17 AM
2020/09/04:02:28:07 PM
2020/10/23:12:19:19 PM

 

So for example, in Excel, it would look like this: 

 

Screen Shot 2021-06-16 at 8.48.00 AM.png

 

 

4 REPLIES 4
PaigeMiller
Diamond | Level 26

You can turn your date time values into the half months via this data step command. From there, you can do whatever counting and plotting you want.

 

halfmonth=intnx('semimonth',datepart(activation_date),0,'b');

 

 

--
Paige Miller
PharmlyDoc
Quartz | Level 8

Okay,  thanks. So I used the data step and created the halfmonth variable. Then I created a new table that counts the number of times each date (frequency) appeared in the halfmonth column. Then I created a line chart in proc sgplot. 

 

data record_2;
set record;
halfmonth=intnx('semimonth',datepart(activation_date),0,'b');
format halfmonth MMDDYY10.;
run;

/* count the number of time a certain date appears in the halfmonth column    */
proc sql;
create table record_3  as 
select halfmonth,count(*) as count
FROM record_2 
group by halfmonth;
quit;

proc sgplot data=record_3;
	vline halfmonth / response=count;
	yaxis grid;
run;

 Obs Activation_Date halfmonth count12345678910111213

04SEP20:09:15:5909/01/20203
04SEP20:14:28:0709/01/20203
13SEP20:10:11:5909/01/20203
23SEP20:07:07:1709/16/20201
04OCT20:08:20:4710/01/20203
04OCT20:14:24:4710/01/20203
04OCT20:14:45:3610/01/20203
23OCT20:13:41:2010/16/20206
23OCT20:12:19:1910/16/20206
29OCT20:08:15:0910/16/20206
23OCT20:12:31:3310/16/20206
23OCT20:12:46:1710/16/20206
23OCT20:11:24:1810/16/20206

 

 Obs halfmonth count1234

09/01/20203
09/16/20201
10/01/20203
10/16/20206

 

 

 

Screen Shot 2021-06-16 at 11.30.18 AM.png

 

Sajid01
Meteorite | Level 14

In this examples, the periods and the counts are show.

data test (keep=bdate Activation_date period);
length period $ 21;
informat Activation_Date anydtdtm30. ;
format Activation_date datetime23.0 bdate date9.;
input Activation_Date;
bdate=intnx('semimonth',datepart(activation_date),0,'b');
period_b=put(intnx('semimonth',datepart(activation_date),0,'b'),date9.);
period_e=put(intnx('semimonth',datepart(activation_date),0,'e'),date9.);
period=period_b||" - "||period_e;
datalines;
2020/10/29:08:15:09 AM
2020/10/23:12:31:33 PM
2020/10/04:02:45:36 PM
2020/10/23:12:46:17 PM
2020/10/04:02:24:47 PM
2020/10/23:01:41:20 PM
2020/10/04:08:20:47 AM
2020/10/23:11:24:18 AM
2020/09/13:10:11:59 AM
2020/09/04:09:15:59 AM
2020/09/23:07:07:17 AM
2020/09/04:02:28:07 PM
2020/10/23:12:19:19 PM
;
run;
proc sql;
create table test2 (keep=period count) as
select  distinct period, count(period) as count
from test
group by period
order by bdate;
quit;
proc sgplot data=test2;
	series x=period y=count /markers ;
	
run;

The plot will have period on the x-axis and count on the x-axis
The output will be like this

Sajid01_0-1623862028068.png

The plot will be like this

Sajid01_1-1623862313583.png

 

PaigeMiller
Diamond | Level 26

By making the horizontal axis of your plot be a variable that is a text string, if a half-month has zero count in the original data set, then there is no spot on the horizontal axis for that half-month (your tick marks could be non-consecutive half-months). If you use the numeric variable BDATE as the horizontal axis variable, then all half-months between the earliest and latest will appear on the horizontal axis. In either case, you might want to handle the situation where a half-month has zero observations and arrange the plot so that a zero will appear for that half-month.

 

In general, for almost all problems, I recommend that if a variable represents a date, it should be numeric and not character.

--
Paige Miller

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1125 views
  • 1 like
  • 3 in conversation