Desktop productivity for business analysts and programmers

Lag equivalent in PROC SQL

Accepted Solution Solved
Reply
Contributor
Posts: 27
Accepted Solution

Lag equivalent in PROC SQL

Hi all,

is there an equivalent to the lag function in a PROC SQL step?

Thank you,

Marco

EDIT: re-posting this as I posted it in the wrong section earlier


Accepted Solutions
Solution
‎04-07-2014 04:38 AM
Super User
Super User
Posts: 7,725

Re: Lag equivalent in PROC SQL

You would need to sort your data, otherwise monotonic seems to work ok.  Its not ideal though, you will notice that it throws some weird results when using where/order by's etc. so I would generally avoid it.  An example below.  What I would look at is what you are trying to achieve with using a lag function, and how the problem could be solved using SQL rather than trying to translate a specific function between two different languages.

data have;

  attrib name format=$20. qtr result format=best.;

  infile datalines;

  input name $ qtr result;

datalines;

Terry 3 12

Terry 4 13

Rob   1 90

Rob   3 91

Terry 1 11

Terry 2 15

Rob   4 50

Rob   3 51

;

run;

proc sort data=have;

  by name qtr;

run;

proc sql;

  create table WORK.WANT as

  select  BASE.*,

          PREV.QTR as PREV_QTR,

          PREV.RESULT as PREV_RESULT

  from    (select *,MONOTONIC() as IND from WORK.HAVE) BASE

  left join (select *,MONOTONIC() as IND from WORK.HAVE) PREV

  on      BASE.NAME=PREV.NAME

  and     BASE.IND=PREV.IND+1;

quit;

View solution in original post


All Replies
Super Contributor
Posts: 578

Re: Lag equivalent in PROC SQL

Generally, SQL doesn't consider sets to be ordered, so you have to force order on them by adding some sort of variable to match back to the dataset.  For example, if you have a dataset HAVE like this:

Business_Date date

Val1 num

Val2 num

you could do this:

proc sql;

select

     t1.business_date,

     t1.val1,

     t1.val2,

     t2.val1 as Prior_Vall1,

     t2.val2 as Prior_Val2

from

     have t1

     left outer join have t2

          on t1.business_date-1=t2.business_date;

quit;

Super User
Posts: 5,391

Re: Lag equivalent in PROC SQL

If you don't have a natural row id variable (such as business date in 's example), you can enforce row numbers, either by using the unsupported and fairly undocumented function monotonic(), or by storing the row numbers explicitly in the source table ("have").

But, at this point, it is quite clear that both programming and performance wise, this is not a good idea. This is easier to code in a data step, and joins always comes with a higher cost than a table scan.

Data never sleeps
Trusted Advisor
Posts: 3,208

Re: Lag equivalent in PROC SQL

As said the SQL approach has a complete different set of axioma-s eg data is considered not being ordered/sorted as opposed in other language e.g. the SAS-datastep where the data is assumed to be ordered.

The result is you cannot do some things in Proc SQL that you can do in the SAS-datastep as of this kind of axioma-s.

Answer: no there  is not  something like the lag function in SQL  (ansi Sql).

Sometimes there are aggregate functions that will do your original functional request, that is only a subset of all possibilities you could solve within a data-step or 3GL language.

Before SAS 9  within SAS proc sql was solved in a sequential/single process byt that monotonic could work.

Do not trust any ordering with proc sql anymore as it was changed to be multi-threading, that is several sub-processes can run on selected parts of the data. It will using the axioma of not having a defined ordering.          

---->-- ja karman --<-----
Solution
‎04-07-2014 04:38 AM
Super User
Super User
Posts: 7,725

Re: Lag equivalent in PROC SQL

You would need to sort your data, otherwise monotonic seems to work ok.  Its not ideal though, you will notice that it throws some weird results when using where/order by's etc. so I would generally avoid it.  An example below.  What I would look at is what you are trying to achieve with using a lag function, and how the problem could be solved using SQL rather than trying to translate a specific function between two different languages.

data have;

  attrib name format=$20. qtr result format=best.;

  infile datalines;

  input name $ qtr result;

datalines;

Terry 3 12

Terry 4 13

Rob   1 90

Rob   3 91

Terry 1 11

Terry 2 15

Rob   4 50

Rob   3 51

;

run;

proc sort data=have;

  by name qtr;

run;

proc sql;

  create table WORK.WANT as

  select  BASE.*,

          PREV.QTR as PREV_QTR,

          PREV.RESULT as PREV_RESULT

  from    (select *,MONOTONIC() as IND from WORK.HAVE) BASE

  left join (select *,MONOTONIC() as IND from WORK.HAVE) PREV

  on      BASE.NAME=PREV.NAME

  and     BASE.IND=PREV.IND+1;

quit;

Trusted Advisor
Posts: 3,208

Re: Lag equivalent in PROC SQL

@rw9, Not all problems can be solved with SQL.  There is always a search for the right tool for a problem.   

The acceptance of a limitation should be part of a learning process. That is the why behind mathematics.

---->-- ja karman --<-----
Super User
Posts: 9,878

Re: Lag equivalent in PROC SQL

To be honest. I will not recommend to use RW9's code. MONOTONIC() is undocumented function which means you are going to suffer the risk from it .Sometime you are going to wrong result.I will use ODS to get that index row.


ods listing close;
ods output sql_results=have;
proc sql number;
 select name,age from sashelp.class;
quit;
ods listing;
proc sql;
create table want as
 select *,(select age  from have where row=a.row-1) as lag_age
  from have as a;
quit;

Ksharp

Respected Advisor
Posts: 3,156

Re: Lag equivalent in PROC SQL

Very nice, Ksharp! Great to have you back.

Haikuo

Super User
Posts: 9,878

Re: Lag equivalent in PROC SQL

HaiKuo,

Hope we have a meet in near future.

Xia Ke Shan

Contributor
Posts: 44

Re: Lag equivalent in PROC SQL

hi,

good answer. can you help me the similar as below

Empid   year    salary

1         2011    11

1         2012    13

1         2013     20

2         2011     44      

2         2012     22

2         2013     88

Requirement:    Need a 4th column representing percentage of salary change for an empid  by years.

                                             datastep->lag function-> we can do

                                              proc sql-> how to do this?

thanks in advance.

Super User
Super User
Posts: 7,725

Re: Lag equivalent in PROC SQL

Should start a new topic for this as this post is closed.  You can do the same as above without the need for monotonic as you can use year.

proc sql;

     create table WANT as

     select     A.EMPID,

                   A.YEAR,

                    A.SALARY,

                    (A.SALARY / B.SALARY)  * 100 as SAL_PCENT    

     from        HAVE A

     left join    HAVE B

     on          A.EMPID=B.EMPID

     and        A.YEAR=B.YEAR+1;

quit;

Contributor
Posts: 44

Re: Lag equivalent in PROC SQL

Hi,

thanks a lot. I understood.

Super User
Posts: 9,878

Re: Lag equivalent in PROC SQL

OK. Assuming the data as simple as it is :

data have;
input Empid   year    salary ;
cards;
1         2011    11
1         2012    13
1         2013     20
2         2011     44      
2         2012     22
2         2013     88
;
run;
proc sql;
create table want as 
 select *,(salary-(select salary from have where Empid=a.Empid and year=a.year-1) )/salary as per format=percentn8.2
  from have as a;
  quit;

Xia Keshan

Contributor
Posts: 44

Re: Lag equivalent in PROC SQL

hi,

thanks a lot. i understood.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 10448 views
  • 12 likes
  • 8 in conversation