Help using Base SAS procedures

how to use proc sql to do some special calculation

Reply
Frequent Contributor
Posts: 122

how to use proc sql to do some special calculation

Hi, I want to use proc sql to calculate sum of some observations group by certain variable, but subject to some constraint. Below is the detail.

 

I have some stocks and their returns for each day. I would like to calculate the mean of daily returns for each stock ID, but only the first few days, for example, the first two days. 

 

ID          Day             Return

1               1                 0.2

1               2                 0.2

1               3                 0.3

....

2               1                 0.2

2               2                 0.1

2               3                 0.2

.....

3               1                 0.2

3               2                 0.6

3               3                 0.2

 

The program below will give full average of daily returns for each ID, but not the first two days for each ID. How to revise it?

proc sql;

create table want as

select *, mean(return) as avg_ret

from from

group by ID;

quit;

 

Thanks.

 

Super Contributor
Posts: 308

Re: how to use proc sql to do some special calculation

Hello,

 

Assuming variable Day has values 1 and 2 this will do it:

 


proc sql;
create table want as
select t1.*, t2.avg_ret
from have t1 left join
(select ID, mean(return) as avg_ret from have where day le 2 group by ID) t2
on t1.id=t2.id;

quit;

Super User
Posts: 7,780

Re: how to use proc sql to do some special calculation

Simple?

proc sql;
create table want as
select *, mean(return) as avg_ret
from from
group by ID
where day in (1,2)
;
quit;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Trusted Advisor
Posts: 1,117

Re: how to use proc sql to do some special calculation

Posted in reply to KurtBremser

@KurtBremser: I think the WHERE clause must precede the GROUP BY clause.

Super User
Posts: 7,780

Re: how to use proc sql to do some special calculation

Posted in reply to FreelanceReinhard

FreelanceReinhard wrote:

@KurtBremser: I think the WHERE clause must precede the GROUP BY clause.


Correct.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 122

Re: how to use proc sql to do some special calculation

Posted in reply to KurtBremser

day 1 and 2 are just examples. In reality, the variable day is in terms of date and it's not necessarily continuous. I only want to use the first few observations, say, the first 10 observations in each group (ID). How can that be done? 

Trusted Advisor
Posts: 1,117

Re: how to use proc sql to do some special calculation

I think it would be easier (and more efficient) to select the first n observations per ID using a data step:

 

/* Create test data */
data have;
input ID Day Return;
cards;
1 1 0.2
1 2 0.2
1 3 0.3
2 1 0.2
2 2 0.1
2 3 0.2
3 1 0.2
3 2 0.6
3 3 0.2
;

%let n=2;

/* Select first (i.e. earliest) n obs. per ID */
data temp / view=temp;
set have;
by id day;
if first.id then c=1;
else c+1;
if c<=&n;
drop c;
run;

Then you could apply your original PROC SQL step to TEMP rather than HAVE (what you called "FROM").

 

 

Alternatively, here is a pure PROC SQL approach:

 

proc sql;
create table want as
select *, mean(return) as avg_ret
from (select * from have a
      where (select count(*) from have where id=a.id & .<day<=a.day)<=&n)
group by id
order by id, day;
quit;

Here, TEMP is replaced by an inline view (which, in addition, involves a subquery in its WHERE clause).

 

It is assumed that the order of DAY values corresponds to chronological order. Unlike the data step, the PROC SQL step does not require dataset HAVE to be sorted (by ID, DAY). Furthermore, there should be no duplicate DAY values within an ID.

PROC Star
Posts: 1,760

Re: how to use proc sql to do some special calculation

Posted in reply to FreelanceReinhard

You might as well do it in one go.

 

data HAVE;
input ID DAY RETURN;
cards;
1 1 0.2
1 2 0.2
1 3 0.3
2 10 0.2
2 12 0.1
2 12 0.2
3 11 0.2
3 12 0.6
3 13 0.2
run;

data WANT ;
  set HAVE;
  by ID;
  PREV_RETURN=lag(RETURN);
  if lag(first.ID);
  MEAN=mean(PREV_RETURN, RETURN);
run;

proc print;
  var ID MEAN;
run;

 

Obs ID MEAN
1 1 0.20
2 2 0.15
3 3 0.40
PROC Star
Posts: 1,760

Re: how to use proc sql to do some special calculation

For an arbitrary nb of obs:

 

data HAVE;
input ID DAY RETURN;
cards;
1 1 0.2
1 2 0.2
1 3 0.3
2 10 0.2
2 12 0.1
2 12 0.2
3 11 0.2
3 12 0.6
3 13 0.2
run;

%let nb_obs=2;

data WANT ;
  set HAVE;
  by ID;
  if first.ID then call missing(SUM,OBS);
  SUM+RETURN;
  OBS+1;
  if OBS=&nb_obs. or (OBS < &nb_obs. & last.ID) then do;
    MEAN=divide(SUM, OBS);
    output;
  end;
run;

proc print;
  var ID MEAN;
run;
PROC Star
Posts: 1,760

Re: how to use proc sql to do some special calculation

[ Edited ]

If you really must use SQL (but SQL is not suited to processing rows in a certain order), something as ugly as this works in SAS:

 


%let nb_obs=2;

proc sql nothreads;

create table WANT(keep=ID MEAN) as
select 
      a.ID
     , monotonic()    as ROW
     , mean(a.RETURN) as MEAN 
from HAVE              a
    ,( select ID
            , min(monotonic()) as GRP_START  
       from HAVE 
       group by ID 
      )                b
where a.ID=b.ID 
  and calculated ROW<b.GRP_START+&nb_obs.
group by a.ID 
;

quit;

 

ID MEAN
1 0.20
2 0.15
3 0.40

 

 

Ask a Question
Discussion stats
  • 9 replies
  • 511 views
  • 1 like
  • 5 in conversation