BookmarkSubscribeRSS Feed
thdang
Calcite | Level 5

Hallo, I would like to do some calculation using the lag(1) value. I have something like this, but lag function did not work here

Proc sql noprint;

    Create table want as

    Select date, id, sum(value1) as value1,

        sum(value2*weight)/sum(weight) as value2,

        sum(lag1(weight)*return)/sum(lag1(weight)) as return

    from have group by id, date;

   

quit;

How can I make this work? Thank you

15 REPLIES 15
art297
Opal | Level 21

You can't use the lag function with proc sql.

PGStats
Opal | Level 21

The lag function is not available in proc SQL. Proc SQL performs operations on unordered sets of data.

By LAG1 did you mean the previous observation or the previous date?

PG

PG
thdang
Calcite | Level 5

yes, by lag I mean the previous date (t-1).

PGStats
Opal | Level 21

If you want to match the value with the return on the previous day then try this (untested) :

Proc sql;

Create table want as

select t.id, t.date, t.value1, t.value2, r.return

from

    (Select id, date,

       sum(value1) as value1,

       sum(weight*value2)/sum(weight) as value2

    from have group by id, date) as t

left join

   (Select id, date,

      sum(weight*return)/sum(weight) as return

    from have group by id, date) as r

on t.id=r.id and t.date=intnx('DAY',r.date,1);

quit;

This assumes that date is a SAS date. If date is a datetime variable then replace 'DAY' by 'DTDAY'.

PG

PG
thdang
Calcite | Level 5

the left join - part did not work well. because the weight and return are all from previous day. The data I have are monthly data I should say previous obs above I think.

The return is this month return and the weight is last month weight of different assets

return (t) = sum(weight (t-1)*return(t))/sum(weight(t-1))

art297
Opal | Level 21

Post some sample data and the result you want from that data.

thdang
Calcite | Level 5

The data look like this.

     ID         date               value              return     portfolio

I want to group them into the portfolio level so the monthly return of the portfolio would be the weighted return of the individual asset.

The result table would be:

portfolio     date               value                    return

200075     01/31/1997     138.449                    .

200075   02/28/1997    138.249     (0.149*0.005199 + 138.3*0.006061)/(138.3+0.149).

...

art297
Opal | Level 21

And

1. where does the 'weight' variable come from?

2. is the one value for return, R, a missing value?

3. will those always be the two ids?

4. will there always be two and only two ids?

thdang
Calcite | Level 5

1. the weight is the value column

2. the R is the missing return, missing value.

3. No there are more different ids (about 15000 of them)

4. and the number of ids varies per portfolio. there are about 2500 different portfolios 

art297
Opal | Level 21

Then I'm confused.  Your formula indicates variables value1, value2, weight and return.

From your example, value1 and value2 are the two records for a given date.  The formula takes the sum of those, divided by another variable labeled weight.  I don't understand the difference between value and weight.  Similarly, each value has a separate return.

Please reshow your example using the variable names from the file, namely: id, date, value, return and portfolio

thdang
Calcite | Level 5

You refers to the first post?

it would be this:

Proc sql noprint;

    Create table want as

    Select date, portfolio, sum(value) as value,

        sum(lag1(value)*return)/sum(lag1(value)) as return

    from have group by portfolio, date;

quit;

return (t) = sum(value(t-1)*return(t))/sum(value(t-1))


so if I don't take the lag grouping those individual assets into portfolio works fine. But only for the return I have to take the lag ( previous month) value 

art297
Opal | Level 21

Still not sure I correctly understand what you want, but the following appears to match what you want.  Someone more proficient than me with proc sql can consolidate the code if it is in the right direction:

data have;

  informat date mmddyy10.;

  format date date9.;

  input id date value return portfolio;

  cards;

029946 1/31/1997 0.149 . 200075

029947 1/31/1997 138.300 0.010499 200075

029946 2/28/1997 0.276 0.005199 200075

029947 2/28/1997 137.973 0.006061 200075

029946 3/31/1997 0.376 -0.029741 200075

029947 3/31/1997 132.154 -0.030120 200075

029946 4/30/1997 0.481 0.012467 200075

029947 4/30/1997 133.955 0.014311 200075

029946 5/30/1997 0.595 0.054529 200075

029947 5/30/1997 141.063 0.054674 200075

029946 6/30/1997 0.693 0.030025 200075

029947 6/30/1997 143.641 0.030518 200075

;

proc sql;

  create table need as

    select l.portfolio,l.id, l.date, l.return as lreturn,

           l.value as lvalue, r.return as rreturn,

        r.value as rvalue

      from

        (select portfolio,id, date,return,value

           from have ) as l

          left join

             (select portfolio,id, date, value,return

                from have) as r

               on l.portfolio=r.portfolio and

                   l.id=r.id and

                  intnx('month',r.date,1,'b')<=l.date<=

                  intnx('month',r.date,1,'e')

           order by portfolio,date,id

  ;

  create table want as

    select portfolio, date,

           sum(rvalue*lreturn)/sum(rvalue) as return

      from need

        group by portfolio,date

  ;

quit;

Tom
Super User Tom
Super User

To make the calculation you want you need to get the basis value (what you were calling weight) onto the record with the return value. SQL is not a good tool for this since it does not natively support the concept of ordered records.  Assuming that the ID variable identifies particular asset you need to process you data sorted by PORTFOLIO ID DATE to be able to take the previous value of the asset.

data returns ;

set have;

by portfolio id date ;

basis = lag(value);

if first.id then basis = .;

run;

Now you can do calculations using variables that are on the same observation.

art297
Opal | Level 21

With the example data your code results in the same values as the first of the two proc sql runs I suggested.  Specifically, my lreturn=your return, my rvalue=your basis, and my lvalue= your value.

I agree with you that SQL shouldn't be used if one needs to know the order of one's data but, in this case, I don't think order wasn't necessary .. only the values of date, id and porfolio.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 5157 views
  • 6 likes
  • 4 in conversation