I have a dataset called record with patients who activated an online account.
I have a column of dates as follows and I want to count the frequencies that each date occurs such that I can graph the number of activations (as a line over time) that occur every 2 weeks.
How should I go about doing this?
Activation_Date |
2020/10/29:08:15:09 AM |
2020/10/23:12:31:33 PM |
2020/10/04:02:45:36 PM |
2020/10/23:12:46:17 PM |
2020/10/04:02:24:47 PM |
2020/10/23:01:41:20 PM |
2020/10/04:08:20:47 AM |
2020/10/23:11:24:18 AM |
2020/09/13:10:11:59 AM |
2020/09/04:09:15:59 AM |
2020/09/23:07:07:17 AM |
2020/09/04:02:28:07 PM |
2020/10/23:12:19:19 PM |
So for example, in Excel, it would look like this:
You can turn your date time values into the half months via this data step command. From there, you can do whatever counting and plotting you want.
halfmonth=intnx('semimonth',datepart(activation_date),0,'b');
Okay, thanks. So I used the data step and created the halfmonth variable. Then I created a new table that counts the number of times each date (frequency) appeared in the halfmonth column. Then I created a line chart in proc sgplot.
data record_2;
set record;
halfmonth=intnx('semimonth',datepart(activation_date),0,'b');
format halfmonth MMDDYY10.;
run;
/* count the number of time a certain date appears in the halfmonth column */
proc sql;
create table record_3 as
select halfmonth,count(*) as count
FROM record_2
group by halfmonth;
quit;
proc sgplot data=record_3;
vline halfmonth / response=count;
yaxis grid;
run;
Obs Activation_Date halfmonth count12345678910111213
04SEP20:09:15:59 | 09/01/2020 | 3 |
04SEP20:14:28:07 | 09/01/2020 | 3 |
13SEP20:10:11:59 | 09/01/2020 | 3 |
23SEP20:07:07:17 | 09/16/2020 | 1 |
04OCT20:08:20:47 | 10/01/2020 | 3 |
04OCT20:14:24:47 | 10/01/2020 | 3 |
04OCT20:14:45:36 | 10/01/2020 | 3 |
23OCT20:13:41:20 | 10/16/2020 | 6 |
23OCT20:12:19:19 | 10/16/2020 | 6 |
29OCT20:08:15:09 | 10/16/2020 | 6 |
23OCT20:12:31:33 | 10/16/2020 | 6 |
23OCT20:12:46:17 | 10/16/2020 | 6 |
23OCT20:11:24:18 | 10/16/2020 | 6 |
Obs halfmonth count1234
09/01/2020 | 3 |
09/16/2020 | 1 |
10/01/2020 | 3 |
10/16/2020 | 6 |
In this examples, the periods and the counts are show.
data test (keep=bdate Activation_date period);
length period $ 21;
informat Activation_Date anydtdtm30. ;
format Activation_date datetime23.0 bdate date9.;
input Activation_Date;
bdate=intnx('semimonth',datepart(activation_date),0,'b');
period_b=put(intnx('semimonth',datepart(activation_date),0,'b'),date9.);
period_e=put(intnx('semimonth',datepart(activation_date),0,'e'),date9.);
period=period_b||" - "||period_e;
datalines;
2020/10/29:08:15:09 AM
2020/10/23:12:31:33 PM
2020/10/04:02:45:36 PM
2020/10/23:12:46:17 PM
2020/10/04:02:24:47 PM
2020/10/23:01:41:20 PM
2020/10/04:08:20:47 AM
2020/10/23:11:24:18 AM
2020/09/13:10:11:59 AM
2020/09/04:09:15:59 AM
2020/09/23:07:07:17 AM
2020/09/04:02:28:07 PM
2020/10/23:12:19:19 PM
;
run;
proc sql;
create table test2 (keep=period count) as
select distinct period, count(period) as count
from test
group by period
order by bdate;
quit;
proc sgplot data=test2;
series x=period y=count /markers ;
run;
The plot will have period on the x-axis and count on the x-axis
The output will be like this
The plot will be like this
By making the horizontal axis of your plot be a variable that is a text string, if a half-month has zero count in the original data set, then there is no spot on the horizontal axis for that half-month (your tick marks could be non-consecutive half-months). If you use the numeric variable BDATE as the horizontal axis variable, then all half-months between the earliest and latest will appear on the horizontal axis. In either case, you might want to handle the situation where a half-month has zero observations and arrange the plot so that a zero will appear for that half-month.
In general, for almost all problems, I recommend that if a variable represents a date, it should be numeric and not character.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.