Quartz | Level 8

Count frequencies of dates in a column and then graph them by Half-month intervals

I have a dataset called record with patients who activated an online account.

I have a column of dates as follows and I want to count the frequencies that each date occurs such that I can graph the number of activations (as a line over time) that occur every 2 weeks.

How should I go about doing this?

 Activation_Date 2020/10/29:08:15:09 AM 2020/10/23:12:31:33 PM 2020/10/04:02:45:36 PM 2020/10/23:12:46:17 PM 2020/10/04:02:24:47 PM 2020/10/23:01:41:20 PM 2020/10/04:08:20:47 AM 2020/10/23:11:24:18 AM 2020/09/13:10:11:59 AM 2020/09/04:09:15:59 AM 2020/09/23:07:07:17 AM 2020/09/04:02:28:07 PM 2020/10/23:12:19:19 PM

So for example, in Excel, it would look like this:

4 REPLIES 4
Diamond | Level 26

Re: Count frequencies of dates in a column and then graph them by Half-month intervals

You can turn your date time values into the half months via this data step command. From there, you can do whatever counting and plotting you want.

``````halfmonth=intnx('semimonth',datepart(activation_date),0,'b');
``````

--
Paige Miller
Quartz | Level 8

Re: Count frequencies of dates in a column and then graph them by Half-month intervals

Okay,  thanks. So I used the data step and created the halfmonth variable. Then I created a new table that counts the number of times each date (frequency) appeared in the halfmonth column. Then I created a line chart in proc sgplot.

``````data record_2;
set record;
halfmonth=intnx('semimonth',datepart(activation_date),0,'b');
format halfmonth MMDDYY10.;
run;

/* count the number of time a certain date appears in the halfmonth column    */
proc sql;
create table record_3  as
select halfmonth,count(*) as count
FROM record_2
group by halfmonth;
quit;

proc sgplot data=record_3;
vline halfmonth / response=count;
yaxis grid;
run;``````

Obs Activation_Date halfmonth count12345678910111213

 04SEP20:09:15:59 09/01/2020 3 04SEP20:14:28:07 09/01/2020 3 13SEP20:10:11:59 09/01/2020 3 23SEP20:07:07:17 09/16/2020 1 04OCT20:08:20:47 10/01/2020 3 04OCT20:14:24:47 10/01/2020 3 04OCT20:14:45:36 10/01/2020 3 23OCT20:13:41:20 10/16/2020 6 23OCT20:12:19:19 10/16/2020 6 29OCT20:08:15:09 10/16/2020 6 23OCT20:12:31:33 10/16/2020 6 23OCT20:12:46:17 10/16/2020 6 23OCT20:11:24:18 10/16/2020 6

Obs halfmonth count1234

 09/01/2020 3 09/16/2020 1 10/01/2020 3 10/16/2020 6

Meteorite | Level 14

Re: Count frequencies of dates in a column and then graph them by Half-month intervals

In this examples, the periods and the counts are show.

``````data test (keep=bdate Activation_date period);
length period \$ 21;
informat Activation_Date anydtdtm30. ;
format Activation_date datetime23.0 bdate date9.;
input Activation_Date;
bdate=intnx('semimonth',datepart(activation_date),0,'b');
period_b=put(intnx('semimonth',datepart(activation_date),0,'b'),date9.);
period_e=put(intnx('semimonth',datepart(activation_date),0,'e'),date9.);
period=period_b||" - "||period_e;
datalines;
2020/10/29:08:15:09 AM
2020/10/23:12:31:33 PM
2020/10/04:02:45:36 PM
2020/10/23:12:46:17 PM
2020/10/04:02:24:47 PM
2020/10/23:01:41:20 PM
2020/10/04:08:20:47 AM
2020/10/23:11:24:18 AM
2020/09/13:10:11:59 AM
2020/09/04:09:15:59 AM
2020/09/23:07:07:17 AM
2020/09/04:02:28:07 PM
2020/10/23:12:19:19 PM
;
run;
proc sql;
create table test2 (keep=period count) as
select  distinct period, count(period) as count
from test
group by period
order by bdate;
quit;
proc sgplot data=test2;
series x=period y=count /markers ;

run;``````

The plot will have period on the x-axis and count on the x-axis
The output will be like this

The plot will be like this

Diamond | Level 26

Re: Count frequencies of dates in a column and then graph them by Half-month intervals

By making the horizontal axis of your plot be a variable that is a text string, if a half-month has zero count in the original data set, then there is no spot on the horizontal axis for that half-month (your tick marks could be non-consecutive half-months). If you use the numeric variable BDATE as the horizontal axis variable, then all half-months between the earliest and latest will appear on the horizontal axis. In either case, you might want to handle the situation where a half-month has zero observations and arrange the plot so that a zero will appear for that half-month.

In general, for almost all problems, I recommend that if a variable represents a date, it should be numeric and not character.

--
Paige Miller
Discussion stats
• 4 replies
• 616 views
• 1 like
• 3 in conversation