DATA Step, Macro, Functions and more

proc sql question

Accepted Solution Solved
Reply
Contributor
Posts: 27
Accepted Solution

proc sql question

Below is the input DS

 

 

data a;
input name $ sal ;
cards;
a 100
a 400
a 200
b 100
b 100
b 300
c 300
c 300
;
run;

I want output like below using proc sql

 

 

data a;
input name $ sal  rank;
cards;
a 100 1
a 400 2
a 200 3
b 100 1
b 100 2
b 300 3
c 300 1
c 300 2
;
run;

 

 

Second question :

 

I want to pull every second highest using proc sql.

 

Thanks in advance


Accepted Solutions
Solution
‎05-29-2017 10:15 AM
PROC Star
Posts: 7,360

Re: proc sql question

I agree with @Reeza. The monotonic function will work as desired (if you use a group by clause), but there are no guarantees that it will always work that way since its not supported:

data a;
  input name $ sal  rank;
  cards;
a 100 1
a 400 2
a 200 3
b 100 1
b 100 2
b 300 3
c 300 1
c 300 2
;
run;

proc sql noprint;
  create table want as
    select *, monotonic() as rank
      from a
        group by name
  ;
quit;

Art, CEO, AnalystFinder.com

View solution in original post


All Replies
Trusted Advisor
Posts: 1,369

Re: proc sql question

Assuming data is in the right order, you can do it in one step, no need SQL at all:

data a;
input name $ sal ;
retain rank 0  old_name ' ';
If name ne old_name then do;
   rank=0 old_name=name;
end;
rank+1;
cards;
a 100
a 400
a 200
b 100
b 100
b 300
c 300
c 300
;
run;
Super User
Posts: 17,784

Re: proc sql question

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

 

If you really want SQL look at the monotonic() function though I'm not sure how that works with BY groups. SQL doesn't respect row order though so you need to be careful and test it. 

Solution
‎05-29-2017 10:15 AM
PROC Star
Posts: 7,360

Re: proc sql question

I agree with @Reeza. The monotonic function will work as desired (if you use a group by clause), but there are no guarantees that it will always work that way since its not supported:

data a;
  input name $ sal  rank;
  cards;
a 100 1
a 400 2
a 200 3
b 100 1
b 100 2
b 300 3
c 300 1
c 300 2
;
run;

proc sql noprint;
  create table want as
    select *, monotonic() as rank
      from a
        group by name
  ;
quit;

Art, CEO, AnalystFinder.com

PROC Star
Posts: 63

Re: proc sql question

What you want is, by definition, not really doable with SQL. What you want is a variable named RANK which is just the arbitrary order of data within your table, not the order as defined by some other variables.

It is possible to get the results you want in SQL in most SAS editions, provided that your data is created like your example dataset:

proc sql;
  select name,sal,r-min(r)+1 as rank from(
  select name,sal,monotonic() as r
  from a)
  group by name
  order by 1,3
  ;

But this depends on using the order of the data in the input table - by definition, a SQL table is not ordered and does not have to be delivered to a query in physical order when using SQL. This means that if you do the above, you may get the result you want when doing the query against a simple SAS table; if you use another type of libname, be it MS-SQL, Oracle, Hadoop or some SAS server, there is no guarantee that you will get the data in that order. Your query is by definition indeterministic, meaning that you are not guaranteed to get the same results every time.

On the other hand, if you wanted the RANK variable to mirror the ranking of SAL within NAME, it is possible:

proc sql;
  select name,sal,r-min(r)+1 as rank from(
  select name,sal,monotonic() as r
  from a order by name,sal)
  group by name
  order by 1,3
  ;

- the difference being the ORDER BY clause in the inner query. But this means that the lowest SAL for a NAME would get RANK=1, etc (use DESC if you want the highest SAL having RANK=1), not the data you showed us as you desired output.

 

The MONOTONIC() function is still (I think) not documented in SAS SQL, but it will probably not change.

 

In other SQL dialects, such as Oracle or MS, you would use a RANK summary function or similar to get the rank, something like:

  select name,sal,rank over (partition by name order by sal) as rank 
  order by 1,3
  ;

 but SAS SQL still has not gotten around to that. And note that you will have to have an ORDER BY in the definition of RANK, as SQL tables are by definition not ordered beforehand.

 

To be sure that you get what you want, you will have to use a SAS datastep:

Data b;
  do rank=1 by 1 until(last.name);
    set a;
by name; output; end; run; proc sql; select name,sal,rank from b; quit;

Although I would advise against using the form at all, you can get the records with RANK=2 from the first, indeterministic(!) query by using a HAVING clause:

proc sql;
  select name,sal,r-min(r)+1 as rank from(
  select name,sal,monotonic() as r
  from a)
  group by name
  having rank=2
  order by 1
  ;

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 198 views
  • 2 likes
  • 5 in conversation