BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Barite | Level 11

Hi Everyone,

 

I have a time series data and I would like to perform the following rank:

for each row, look back 199 rows (so total 200 rows).

Rank value within 200 rows, so the rank runs from 1 to 200.

Identify the top 150,180, 190, 196 values.

 

I use the percentile concept but it doesn't work 100% as expected.

My code is below.

Can anyone please help?

Thank you,

HHC

 

*calculate Moving percentile for each window based on this record and previous 199 record;
data PRC; set PRC;
	drop aa1-aa200;
	array a{0:199} aa1-aa200;
	retain aa:;
		pct_low=pctl(75,of a{*}); 
		pct_mid=pctl(90,of a{*});	
		pct_high=pctl(95,of a{*});	
		pct_top=pctl(98,of a{*});

	a{mod(_n_,200)}=ROC1;
run;

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

For one thing, you should insert the current value into the array before calculating percentiles

 

/* Calculate moving percentiles for each window based on this record and previous 199 records */
data PRC; 
    set PRC;
	array a{0:199} _temporary_;
	a{mod(_n_,200)}=ROC1;
	if _n_ >= 200 then do;
		pct_low = pctl(75, of a{*}); 
		pct_mid = pctl(90, of a{*});	
		pct_high = pctl(95, of a{*});	
		pct_top = pctl(98, of a{*});
	end;
run;

(untested)

 

As for what you expect... you should tell us.

PG

View solution in original post

3 REPLIES 3
ballardw
Super User

What do you expect to do with ties? The PCTL function has 5 variations with the default being 5. Look up the effect of QNTLDEF to see potential effect.

 

Please describe exactly what was wrong with the percentile approach?

FreelanceReinh
Jade | Level 19

Hi @hhchenfx,

 

Since you know the ranks in advance, the ORDINAL function or the SMALLEST function -- depending on how you want to handle missing values -- will be easier to use. The SMALLEST function could also be replaced by the LARGEST function. For example, the 196th smallest of 200 non-missing values is the 5th largest. Moreover, you can save the DROP and RETAIN statements as well as the variable names aa1-aa200 by using a temporary array (EDIT: unless aa1-aa200 are already contained in dataset PRC):

array a{0:199} _temporary_;

 

PGStats
Opal | Level 21

For one thing, you should insert the current value into the array before calculating percentiles

 

/* Calculate moving percentiles for each window based on this record and previous 199 records */
data PRC; 
    set PRC;
	array a{0:199} _temporary_;
	a{mod(_n_,200)}=ROC1;
	if _n_ >= 200 then do;
		pct_low = pctl(75, of a{*}); 
		pct_mid = pctl(90, of a{*});	
		pct_high = pctl(95, of a{*});	
		pct_top = pctl(98, of a{*});
	end;
run;

(untested)

 

As for what you expect... you should tell us.

PG

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 621 views
  • 1 like
  • 4 in conversation