BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Barite | Level 11

Hi Everyone,

 

I have a time series data and I would like to perform the following rank:

for each row, look back 199 rows (so total 200 rows).

Rank value within 200 rows, so the rank runs from 1 to 200.

Identify the top 150,180, 190, 196 values.

 

I use the percentile concept but it doesn't work 100% as expected.

My code is below.

Can anyone please help?

Thank you,

HHC

 

*calculate Moving percentile for each window based on this record and previous 199 record;
data PRC; set PRC;
	drop aa1-aa200;
	array a{0:199} aa1-aa200;
	retain aa:;
		pct_low=pctl(75,of a{*}); 
		pct_mid=pctl(90,of a{*});	
		pct_high=pctl(95,of a{*});	
		pct_top=pctl(98,of a{*});

	a{mod(_n_,200)}=ROC1;
run;

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

For one thing, you should insert the current value into the array before calculating percentiles

 

/* Calculate moving percentiles for each window based on this record and previous 199 records */
data PRC; 
    set PRC;
	array a{0:199} _temporary_;
	a{mod(_n_,200)}=ROC1;
	if _n_ >= 200 then do;
		pct_low = pctl(75, of a{*}); 
		pct_mid = pctl(90, of a{*});	
		pct_high = pctl(95, of a{*});	
		pct_top = pctl(98, of a{*});
	end;
run;

(untested)

 

As for what you expect... you should tell us.

PG

View solution in original post

3 REPLIES 3
ballardw
Super User

What do you expect to do with ties? The PCTL function has 5 variations with the default being 5. Look up the effect of QNTLDEF to see potential effect.

 

Please describe exactly what was wrong with the percentile approach?

FreelanceReinh
Jade | Level 19

Hi @hhchenfx,

 

Since you know the ranks in advance, the ORDINAL function or the SMALLEST function -- depending on how you want to handle missing values -- will be easier to use. The SMALLEST function could also be replaced by the LARGEST function. For example, the 196th smallest of 200 non-missing values is the 5th largest. Moreover, you can save the DROP and RETAIN statements as well as the variable names aa1-aa200 by using a temporary array (EDIT: unless aa1-aa200 are already contained in dataset PRC):

array a{0:199} _temporary_;

 

PGStats
Opal | Level 21

For one thing, you should insert the current value into the array before calculating percentiles

 

/* Calculate moving percentiles for each window based on this record and previous 199 records */
data PRC; 
    set PRC;
	array a{0:199} _temporary_;
	a{mod(_n_,200)}=ROC1;
	if _n_ >= 200 then do;
		pct_low = pctl(75, of a{*}); 
		pct_mid = pctl(90, of a{*});	
		pct_high = pctl(95, of a{*});	
		pct_top = pctl(98, of a{*});
	end;
run;

(untested)

 

As for what you expect... you should tell us.

PG

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 672 views
  • 1 like
  • 4 in conversation