- Find the students that are ranked top 10% in thei...

07-31-2014 01:14 PM

Hi,

I have a file of student data with the following variables:

| Name | Type |
|---|---|
| Roll_Id | num |
| gender | char |
| Age | num |
| flight | char |
| play | char |
| vote | char |
| HSGPA | num |
| credit | num |
| Height | num |

i) How can I get the students who are ranked in the top 10%?

ii) How can I do a comparative analysis of student height by Gender and Play?

Posted in reply to shankarchavan

07-31-2014 02:24 PM

I did something similar to this. If you could specify what each variable is I can help. For example what is credit num? What is flight char?

Posted in reply to shankarchavan

07-31-2014 05:48 PM

How is the rank to be calculated? Is it based on a single variable or on something using multiple variables? Is the "top 10%" going to be the largest or the smallest values of the ranking?

Posted in reply to shankarchavan

07-31-2014 09:41 PM

Dear Shan,

First, you have to identify which variable(s) you need to rank, and then how you would like them ranked: from small to large or from large to small.

You may use the 'By' statement to get it done.

Thanks.

Posted in reply to shankarchavan

08-01-2014 01:46 AM

@jagoma12: In the question above, "credit num" means credit (the variable) and num (the type of the variable). All the other variables are listed the same way.

I think the ranking is based on the HSGPA (high school grade point average) variable.

I think it must be ranked from large to small, since we want the top 10% by HSGPA.

Thanks.

Posted in reply to shankarchavan

08-01-2014 08:34 AM

@shankarchavan If HSGPA is on a 4-point or even a 6-point scale, it won't help too much. The credit variable may help if the credits are weighted, for example if a regular class is 4 points, honors is 5 points, and AP is 6 points. In that case you can sort by credit so that the highest values come first:

/* Descending puts the highest Credit values at the top */

Proc Sort Data=Dataset Out=Work.Sorted;

By Descending Credit;

Run;

Proc Print Data=Work.Sorted;

Run;

Posted in reply to jogoma12

08-01-2014 10:08 AM

Thanks jogoma12, but how do I get the top 10% and do the comparative analysis? Will proc anova help with the comparative analysis?

Posted in reply to shankarchavan

08-01-2014 10:16 AM

How many students are there? I don't know Proc ANOVA, so I cannot guide you there. However, you can work out how many students make up 10% and then keep only that many rows from the sorted data. You can try this one; it should work, but no guarantees since it is not debugged. Just change "Original Data Set" to the name of the data set (already sorted so that the top students come first) and "TotalNumber of students" to the total in the first two lines of code.

%Let Data = Original Data Set;

%Let Students = TotalNumber of students;

/* %Eval only does integer arithmetic, so use %Sysevalf and round up */

%Let Ten = %Sysevalf(&Students / 10, ceil);

Data Top_Ten;

/* Keep only the first &Ten rows of the sorted data */

Set &Data (Obs=&Ten);

Run;
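The student count does not have to be typed in by hand. The following is a minimal sketch, not a tested solution, that derives it with PROC SQL and then keeps the first 10% of rows after sorting. The data set name `have` is an assumption, and ranking by HSGPA follows the original poster's guess about the ranking variable.

```sas
/* Sketch only: "have" is an assumed data set name */
proc sql noprint;
  /* Number of rows that make up 10% of the students, rounded up */
  select ceil(count(*) / 10) into :ten trimmed
  from have;
quit;

/* Highest HSGPA first, so the top students are at the start */
proc sort data=have out=sorted;
  by descending HSGPA;
run;

/* Keep only the first &ten rows */
data top_ten;
  set sorted(obs=&ten);
run;
```

Note that a fixed row count cuts through ties: students with the same HSGPA as the last row kept may be left out.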

Posted in reply to shankarchavan

08-01-2014 10:12 AM

1) proc rank

2) ANOVA - proc anova or proc glm
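For the second part of the question (height by gender and play), one possible sketch is shown below. The data set name `students` is an assumption, and the choice of a two-way model with an interaction term is only one reasonable reading of "comparative analysis". PROC GLM is used rather than PROC ANOVA because PROC ANOVA is intended for balanced designs, and the gender-by-play cell counts in survey-style data are rarely equal.

```sas
/* Sketch only: "students" is an assumed data set name */

/* Descriptive comparison: height summary for each gender/play cell */
proc means data=students n mean std min max;
  class gender play;
  var Height;
run;

/* Two-way ANOVA of height on gender, play, and their interaction */
proc glm data=students;
  class gender play;
  model Height = gender play gender*play;
  lsmeans gender play / pdiff;  /* pairwise differences of adjusted means */
run;
quit;
```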

Xia Keshan

Posted in reply to shankarchavan

08-01-2014 10:38 AM

First use proc rank to create 10 groups, each containing 10% of the students. Then use proc anova to test differences among the means for HSGPA, taking HSGPARank as an independent variable.

proc rank data=have out=want descending groups=10 ties=high;

var HSGPA;

ranks HSGPARank;

run;

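**HSGPARank=0 will be the top 10%** (with GROUPS=10, proc rank numbers the groups from 0 to 9, and DESCENDING puts the largest HSGPA values in group 0).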
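To pull out the top group after the proc rank step, a short sketch could look like this. It assumes the step above has already created `want` with the `HSGPARank` variable; the output name `top10` is made up for illustration. Because GROUPS=10 numbers the groups from 0 to 9 and DESCENDING is in effect, the top decile is group 0.

```sas
/* Assumes WANT was created by the PROC RANK step above */
data top10;
  set want;
  where HSGPARank = 0;  /* group 0 = highest HSGPA values */
run;

proc print data=top10;
  var Roll_Id HSGPA HSGPARank;
run;
```

Because tied HSGPA values all receive the same rank, the top group may hold slightly more or fewer than exactly 10% of the students.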