Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- Problem with PROC RANK

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 04-06-2021 03:09 PM
(1227 views)

Hello,

In the “allvar” dataset, variables divi, rd, and sin take values of either 0 or 1; variable divo takes values -1 or 0. I am using this data set to create portfolios for each date (newdatadate in my case).

I am using PROC RANK and group them into 5 before creating portfolios.

My Code:

OPTIONS MPRINT SYMBOLGEN;

%LET avars = divi divo rd sin ; * Note: (I have more than 60 variables here);

%LET n_avars=%SYSFUNC(countw(&avars));

%PUT &n_avars;

%LET grp = 5;

**PROC** **SORT** DATA = allvar;

BY newdatadate;

**RUN**;

**PROC** **RANK** DATA=allvar OUT=allvar_ranked group=&grp;

BY newdatadate; VAR &avars;

**RUN**;

I think, in the allvar_ranked dataset, variables divi, divo, rd, and sin should have only two different ranked values since they already had taken two values before they were ranked. However, it’s not the case.

__When I run the following (before running proc rank):__

**proc** **sql**; select distinct divi from allvar; **quit**; * Values: ., 0, 1 ;

**proc** **sql**; select distinct divo from allvar; **quit**; * Values: ., 0,-1 ;

**proc** **sql**; select distinct rd from allvar; **quit**; * Values: 0, 1 ;

**proc** **sql**; select distinct sin from allvar; **quit**; * Values: ., 0, 1 ;

__When I run the same in the ranked dataset:__

**proc** **sql**; select distinct divi from allvar_ranked; **quit**; * Ranked Values: ., 0, 2, 3, 4 ;

**proc** **sql**; select distinct divo from allvar_ranked; **quit**; * Ranked Values: ., 0, 1, 2, 3 ;

**proc** **sql**; select distinct rd from allvar_ranked; **quit**; * Ranked Values: 1, 2, 4 ;

**proc** **sql**; select distinct sin from allvar_ranked; **quit**; * Ranked Values: ., 2, 4 ;

I was using this command previously and have never encountered this problem.

How can I fix this?

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

My goal is to create portfolios for each data date based on these rankings/groups so I will be able to compare the best group and the worst group for each variable.

Hence, for each date, I will assign 5 ranks to each variable separately.

I just wanted to handle the ranking procedure in one step, but I guess I should run proc rank for the rest of my variables with 5 groups.

Then, I can use proc rank again on the ranked dataset with two groups for only divi, divo, rd, and sin.

7 REPLIES 7

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I don't get it. How would you want proc rank to generate 5 groups out of two values?

PG

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi,

I have other variables along with these four. These are the only variables that take two values.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@elif wrote:

Hi,

I have other variables along with these four. These are the only variables that take two values.

I think that you may want to include your BY variable in the summary because that effects things;

Example of some data ranked with and without by variable.

data junk; do i= 1 to 50; byvar = mod(i,10); x= rand('integer',2) -1; y= rand('integer',2) -1; z= rand('integer',2) -2; r= rand('integer',2) -2; output; end; run; proc sort data=junk; by byvar; run; proc freq data=junk; title "Before ranking"; tables x y z r; run; proc rank data=junk out=ranked1 group=5; var x y z r; run; proc freq data=ranked1; title "After ranking without by variable"; tables x y z r; run; proc rank data=junk out=ranked2 group=5; by byvar; var x y z r; run; proc freq data=ranked2; title "After ranking with by variable"; tables byvar *(x y z r) / list; run;title;

Snap shot result: Without BY variable the ranks are two values for each variable.

With By variable the ranks vary across the entire data set BUT **within each by group** there are only two values but the actual rank values can vary for different values of the by variable.

From the same starting data.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

But proc rank treats each variable separately. It doesn't combine variables to create groups.

What is your goal? Maybe you should be looking for clustering procedures.

PG

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

My goal is to create portfolios for each data date based on these rankings/groups so I will be able to compare the best group and the worst group for each variable.

Hence, for each date, I will assign 5 ranks to each variable separately.

I just wanted to handle the ranking procedure in one step, but I guess I should run proc rank for the rest of my variables with 5 groups.

Then, I can use proc rank again on the ranked dataset with two groups for only divi, divo, rd, and sin.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Suppose that variable X is your performance criteria for determining which item (stock?) belongs to the best group and the worst group, seems to me that X is the variable that should be ranked

```
proc rank data=myData out=myRanks fraction;
by date;
var X;
ranks Xrank;
run;
```

then you could explore the relationship between Xrank and your explanatory variables. Proc hpsplit could help in that exploration, too determine which variable values are associated with low and high ranks.

hth

PG

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Update:

I have solved the problem after learning more about this procedure via your help.

Thank you so much!

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. **Registration is now open through August 30th**. Visit the SAS Hackathon homepage.

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.