DATA Step, Macro, Functions and more

Proc sql Language

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Proc sql Language

My question is about the select statement and its usage. 

Following is the code used to calculate average, i copied it from base sas codes avaiable online to understand function but can't understand why does it have "e. and d.". and what does it mean to say "class e class d",

Also need to understand the calcuation highlighted [in bold].

 

Step 1  : Creating Data set

data class;
set sashelp.class;
obs = _n_;
run;

 

Step 2 : Proc Sql

proc sql;
select avg(weight) as Median
from (select e.weight
from class e, class d
group by e.weight
having sum(case when e.weight = d.weight then 1 else 0 end)
>= abs(sum(sign(e.weight - d.weight))));
quit;

 

 

Thanks !


Accepted Solutions
Solution
3 weeks ago
Regular Contributor
Posts: 178

Re: Proc sql Language

Firstly e and d are aliases i.e. they are ways of referring to different variables in different data sets. In this particular case e and d both refer to the same data set and what is being produced is a cartesian product i.e. a data set which contains all possible combinations of weights (the table is being joined with itself).

 

The highlighted code is basically a complicated way of finding the median (i.e. the value with the same number of observations which are greater than it and which are less than it) by using the sum of differences. If you're using SAS 9.4 you can simply do this instead

 

proc sql;
	select median(weight)
	from class;
quit;

View solution in original post


All Replies
PROC Star
Posts: 547

Re: Proc sql Language

It may contribute to your understanding to know that you could just as well write

 

from class as e, class as d

e and d are simply the names you are using to refference you datasets, so you can refference them by

 

 

select avg(weight) as Median
from (select e.weight
Solution
3 weeks ago
Regular Contributor
Posts: 178

Re: Proc sql Language

Firstly e and d are aliases i.e. they are ways of referring to different variables in different data sets. In this particular case e and d both refer to the same data set and what is being produced is a cartesian product i.e. a data set which contains all possible combinations of weights (the table is being joined with itself).

 

The highlighted code is basically a complicated way of finding the median (i.e. the value with the same number of observations which are greater than it and which are less than it) by using the sum of differences. If you're using SAS 9.4 you can simply do this instead

 

proc sql;
	select median(weight)
	from class;
quit;
New Contributor
Posts: 3

Re: Proc sql Language

Thanks Smiley Happy

It worked when I used SAS 9.4.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 116 views
  • 1 like
  • 3 in conversation