I am using PROC TIMESERIES to wrap data into a timeseries format. There is one variable in my data for which I would like to store the most frequent value. However, I could'nt find a way to do this easily in PROC TIMESERIES nor any smart work around. I was wondering whether I missed something because it appears as a rather usual problem.
data have; infile datalines delimiter=' '; input customerid : 8. date : date9. variable : 3. ; datalines; 2123780 11APR2017 0 2123780 13APR2017 0 2123780 16APR2017 1 2123780 16APR2017 0 2123780 16APR2017 2 2123780 16APR2017 0 2123780 21APR2017 0 2123780 23APR2017 0 2123780 23APR2017 0 2123780 26APR2017 0 2123780 29APR2017 0 2123780 29APR2017 3 2123780 01MAY2017 3 2123780 01MAY2017 2 2123780 01MAY2017 5 2123780 05MAY2017 1 2123780 07MAY2017 2 2123780 09MAY2017 2 2123780 11MAY2017 3 2123780 13MAY2017 3 2123780 14MAY2017 0 2123780 16MAY2017 2 2123780 16MAY2017 2 ; run;
PROC TIMESERIES data=have out=want;
by customerID;
id date interval=day accumulate=???;
var variable;
run;
Thank you in advance for any hints!
Unfortunately, there is no simple function that can do this that I can see. I checked around and I think these references may help. One uses SQL. If you are uncomfortable with that you will need to use something like PROC FREQ in separate operation. I suppose some other procs (Summary, Univariate, Tablulate) could work as well, but I think PROC FREQ is the best choice.
https://communities.sas.com/t5/SAS-Procedures/Most-frequent-observation/td-p/63799 SQL
https://www.lexjansen.com/nesug/nesug10/cc/cc16.pdf Using Proc Freq to find most common value.
Thank you for your research you put in this topic! Since both of your links contain breaks which results in a 404 error, I am adding the correct links for other readers:
Does anyone know why there is no such option implemented in most SAS procedures? In many cases, a mode seems to be a desirable information to have. I know, it has its difficulties if different values occur equally often but a siimple option in a formula should solve this problem and keep the researcher aware of what is happening.
This is not exactly a solution to my original question but I want to share it anyway.
After performing a PROC TIMESERIES to my data and randomly keeping the first non missing observation of a day, the following code performs a moving mode for a specific time window.
%let window=7;
/*Bestimme den Mode von kategorialen Variablen*/
data want;
set have;
by customerID;
/*Create an array as long as the time window*/
array var {&window};
retain var:;
/* Fill the array with missings for a new ID*/
if first.customerID then
do;
do i = 1 to &window - 1;
var{i} = .;
end;
/*The current observation is assigned at the last position of the array*/
var{dim(var)} = variable;
end;
else
do;
/*If the observation is NOT then first of a new ID, then fill each array value with the one from the succeding position.*/
do i = 1 to dim(var)-1;
var{i} = var{i+1};
end;
/*The current observation is assigned at the last position of the array*/
var{dim(var)} = variable;
/*An array to count the most frequent value*/
array nums {&window} var:;
/* Loop through the array and search the most frequent observation*/
do _n_=1 to &window;
count=0;
do _i_ = _n_ to &window;
/*Missing values are not considered*/
if nums{_n_} = nums{_i_} and not missing( nums{_n_}) then
count + 1;
end;
/*A mode-value is replaced if another value occured more often. Consequently, a new mode has to beat the last mode by at least one.*/
if count > maxcount then
do;
mode = nums{_n_};
maxcount = count;
end;
end;
end;
run;
It is inspired by this and this reply on different questions and can be seen as a substitute for a PROC EXPAND moving mode transformation.
PROC EXPAND DATA=have;
CONVERT variable / TRANSFORM=(MODE &window);
RUN;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.