BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Rhodochrosite | Level 12

Hi Everyone,

 

I have a time series data and I would like to perform the following rank:

for each row, look back 199 rows (so total 200 rows).

Rank value within 200 rows, so the rank runs from 1 to 200.

Identify the top 150,180, 190, 196 values.

 

I use the percentile concept but it doesn't work 100% as expected.

My code is below.

Can anyone please help?

Thank you,

HHC

 

*calculate Moving percentile for each window based on this record and previous 199 record;
data PRC; set PRC;
	drop aa1-aa200;
	array a{0:199} aa1-aa200;
	retain aa:;
		pct_low=pctl(75,of a{*}); 
		pct_mid=pctl(90,of a{*});	
		pct_high=pctl(95,of a{*});	
		pct_top=pctl(98,of a{*});

	a{mod(_n_,200)}=ROC1;
run;

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

For one thing, you should insert the current value into the array before calculating percentiles

 

/* Calculate moving percentiles for each window based on this record and previous 199 records */
data PRC; 
    set PRC;
	array a{0:199} _temporary_;
	a{mod(_n_,200)}=ROC1;
	if _n_ >= 200 then do;
		pct_low = pctl(75, of a{*}); 
		pct_mid = pctl(90, of a{*});	
		pct_high = pctl(95, of a{*});	
		pct_top = pctl(98, of a{*});
	end;
run;

(untested)

 

As for what you expect... you should tell us.

PG

View solution in original post

3 REPLIES 3
ballardw
Super User

What do you expect to do with ties? The PCTL function has 5 variations with the default being 5. Look up the effect of QNTLDEF to see potential effect.

 

Please describe exactly what was wrong with the percentile approach?

FreelanceReinh
Jade | Level 19

Hi @hhchenfx,

 

Since you know the ranks in advance, the ORDINAL function or the SMALLEST function -- depending on how you want to handle missing values -- will be easier to use. The SMALLEST function could also be replaced by the LARGEST function. For example, the 196th smallest of 200 non-missing values is the 5th largest. Moreover, you can save the DROP and RETAIN statements as well as the variable names aa1-aa200 by using a temporary array (EDIT: unless aa1-aa200 are already contained in dataset PRC):

array a{0:199} _temporary_;

 

PGStats
Opal | Level 21

For one thing, you should insert the current value into the array before calculating percentiles

 

/* Calculate moving percentiles for each window based on this record and previous 199 records */
data PRC; 
    set PRC;
	array a{0:199} _temporary_;
	a{mod(_n_,200)}=ROC1;
	if _n_ >= 200 then do;
		pct_low = pctl(75, of a{*}); 
		pct_mid = pctl(90, of a{*});	
		pct_high = pctl(95, of a{*});	
		pct_top = pctl(98, of a{*});
	end;
run;

(untested)

 

As for what you expect... you should tell us.

PG

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1043 views
  • 1 like
  • 4 in conversation