BookmarkSubscribeRSS Feed
mat_n
Obsidian | Level 7

I am using PROC TIMESERIES to wrap data into a timeseries format. There is one variable in my data for which I would like to store the most frequent value. However, I could'nt find a way to do this easily in PROC TIMESERIES nor any smart work around. I was wondering whether I missed something because it appears as a rather usual problem.

 

data have;
	infile datalines delimiter='	';
	input customerid : 8.
		date : date9.
		variable : 3.
	;
	datalines;
2123780	11APR2017	0
2123780	13APR2017	0
2123780	16APR2017	1
2123780	16APR2017	0
2123780	16APR2017	2
2123780	16APR2017	0
2123780	21APR2017	0
2123780	23APR2017	0
2123780	23APR2017	0
2123780	26APR2017	0
2123780	29APR2017	0
2123780	29APR2017	3
2123780	01MAY2017	3
2123780	01MAY2017	2
2123780	01MAY2017	5
2123780	05MAY2017	1
2123780	07MAY2017	2
2123780	09MAY2017	2
2123780	11MAY2017	3
2123780	13MAY2017	3
2123780	14MAY2017	0
2123780	16MAY2017	2
2123780	16MAY2017	2
;
run;

PROC TIMESERIES data=have out=want;
by customerID;
id date interval=day accumulate=???;
var variable;
run;

 

 

Thank you in advance for any hints!

 

3 REPLIES 3
Kow
Obsidian | Level 7 Kow
Obsidian | Level 7

 

Unfortunately, there is no simple function that can do this that I can see. I checked around and I think these references may help. One uses SQL. If you are uncomfortable with that you will need to use something like PROC FREQ in separate operation. I suppose some other procs (Summary, Univariate, Tablulate) could work as well, but I think PROC FREQ is the best choice.  

 

https://communities.sas.com/t5/SAS-Procedures/Most-frequent-observation/td-p/63799   SQL

 

https://www.lexjansen.com/nesug/nesug10/cc/cc16.pdf    Using Proc Freq to find most common value.

mat_n
Obsidian | Level 7

Thank you for your research you put in this topic! Since both of your links contain breaks which results in a 404 error, I am adding the correct links for other readers:

Most frequent observation

Ordering PROC FREQ Around

 

Does anyone know why there is no such option implemented in most SAS procedures? In many cases, a mode seems to be a desirable information to have. I know, it has its difficulties if different values occur equally often but a siimple option in a formula should solve this problem and keep the researcher aware of what is happening.

 

 

 

mat_n
Obsidian | Level 7

This is not exactly a solution to my original question but I want to share it anyway.

After performing a PROC TIMESERIES to my data and randomly keeping the first non missing observation of a day, the following code performs a moving mode for a specific time window.

 

 

%let window=7;
/*Bestimme den Mode von kategorialen Variablen*/
data want;
	set have;
	by customerID;

	/*Create an array as long as the time window*/
	array var {&window};
	retain var:;

	/*	Fill the array with missings for a new ID*/
	if first.customerID then
		do;
			do i = 1 to &window - 1;
				var{i} = .;
			end;

			/*The current observation is assigned at the last position of the array*/
			var{dim(var)} = variable;
		end;
	else
		do;
			/*If the observation is NOT then first of a new ID, then fill each array value with the one from the succeding position.*/
			do i = 1 to dim(var)-1;
				var{i} = var{i+1};
			end;

			/*The current observation is assigned at the last position of the array*/
			var{dim(var)} = variable;

			/*An array to count the most frequent value*/
			array nums {&window} var:;

			/*	Loop through the array and search the most frequent observation*/
			do _n_=1 to &window;
				count=0;

				do _i_ = _n_ to &window;
					/*Missing values are not considered*/
					if nums{_n_} = nums{_i_} and not  missing( nums{_n_}) then
						count + 1;
				end;

				/*A mode-value is replaced if another value occured more often. Consequently, a new mode has to beat the last mode by at least one.*/
				if count > maxcount then
					do;
						mode = nums{_n_};
						maxcount = count;
					end;
			end;
		end;
run;

 

It is inspired by this and this reply on different questions and can be seen as a substitute for a PROC EXPAND moving mode transformation.

 

PROC EXPAND DATA=have;

CONVERT variable / TRANSFORM=(MODE &window);

RUN;

 

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1477 views
  • 1 like
  • 2 in conversation