I am using PROC TIMESERIES to wrap data into a timeseries format. There is one variable in my data for which I would like to store the most frequent value. However, I could'nt find a way to do this easily in PROC TIMESERIES nor any smart work around. I was wondering whether I missed something because it appears as a rather usual problem.
data have; infile datalines delimiter=' '; input customerid : 8. date : date9. variable : 3. ; datalines; 2123780 11APR2017 0 2123780 13APR2017 0 2123780 16APR2017 1 2123780 16APR2017 0 2123780 16APR2017 2 2123780 16APR2017 0 2123780 21APR2017 0 2123780 23APR2017 0 2123780 23APR2017 0 2123780 26APR2017 0 2123780 29APR2017 0 2123780 29APR2017 3 2123780 01MAY2017 3 2123780 01MAY2017 2 2123780 01MAY2017 5 2123780 05MAY2017 1 2123780 07MAY2017 2 2123780 09MAY2017 2 2123780 11MAY2017 3 2123780 13MAY2017 3 2123780 14MAY2017 0 2123780 16MAY2017 2 2123780 16MAY2017 2 ; run;
PROC TIMESERIES data=have out=want;
by customerID;
id date interval=day accumulate=???;
var variable;
run;
Thank you in advance for any hints!
Unfortunately, there is no simple function that can do this that I can see. I checked around and I think these references may help. One uses SQL. If you are uncomfortable with that you will need to use something like PROC FREQ in separate operation. I suppose some other procs (Summary, Univariate, Tablulate) could work as well, but I think PROC FREQ is the best choice.
https://communities.sas.com/t5/SAS-Procedures/Most-frequent-observation/td-p/63799 SQL
https://www.lexjansen.com/nesug/nesug10/cc/cc16.pdf Using Proc Freq to find most common value.
Thank you for your research you put in this topic! Since both of your links contain breaks which results in a 404 error, I am adding the correct links for other readers:
Does anyone know why there is no such option implemented in most SAS procedures? In many cases, a mode seems to be a desirable information to have. I know, it has its difficulties if different values occur equally often but a siimple option in a formula should solve this problem and keep the researcher aware of what is happening.
This is not exactly a solution to my original question but I want to share it anyway.
After performing a PROC TIMESERIES to my data and randomly keeping the first non missing observation of a day, the following code performs a moving mode for a specific time window.
%let window=7;
/*Bestimme den Mode von kategorialen Variablen*/
data want;
set have;
by customerID;
/*Create an array as long as the time window*/
array var {&window};
retain var:;
/* Fill the array with missings for a new ID*/
if first.customerID then
do;
do i = 1 to &window - 1;
var{i} = .;
end;
/*The current observation is assigned at the last position of the array*/
var{dim(var)} = variable;
end;
else
do;
/*If the observation is NOT then first of a new ID, then fill each array value with the one from the succeding position.*/
do i = 1 to dim(var)-1;
var{i} = var{i+1};
end;
/*The current observation is assigned at the last position of the array*/
var{dim(var)} = variable;
/*An array to count the most frequent value*/
array nums {&window} var:;
/* Loop through the array and search the most frequent observation*/
do _n_=1 to &window;
count=0;
do _i_ = _n_ to &window;
/*Missing values are not considered*/
if nums{_n_} = nums{_i_} and not missing( nums{_n_}) then
count + 1;
end;
/*A mode-value is replaced if another value occured more often. Consequently, a new mode has to beat the last mode by at least one.*/
if count > maxcount then
do;
mode = nums{_n_};
maxcount = count;
end;
end;
end;
run;
It is inspired by this and this reply on different questions and can be seen as a substitute for a PROC EXPAND moving mode transformation.
PROC EXPAND DATA=have;
CONVERT variable / TRANSFORM=(MODE &window);
RUN;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.