BookmarkSubscribeRSS Feed
Agent1592
Pyrite | Level 9

 

 
data test; set test;
	   by code;
	   leaddate=lag(date);
	   if first.code then leaddate=intnx('month',date,3,'sameday');
	   *if sunday move back by 2 days, if saturday move back by 1 day;
	   if weekday(date)=1 then date1=intnx('day',date,-2); else
	   if weekday(date)=7 then date1=intnx('day',date,-1); else date1=date;
	   if weekday(leaddate)=1 then leaddate1=intnx('day',leaddate,2); else
	   if weekday(leaddate)=7 then leaddate1=intnx('day',leaddate,3); else
	   if weekday(leaddate)=6 then leaddate1=intnx('day',leaddate,3);else
	   leaddate1=intnx('day',leaddate,1);
	   if leaddate=date then delete;
	   format date1 leaddate1 date9.;
	run;

 

I have a question regarding the code above. I am using a sample code but I don't know why I have to use the first function. Does the leaddate in the 4th line of the code replace the leaddate in the 3rd line. One is a leading variable and the other calculates 3 months ahead. Why do I need to keep this if first.statement in the code. Is this code correct?

 

9 REPLIES 9
Astounding
PROC Star

Whether the code is correct or not ... depends on what it is supposed to do.  But you do require that first function.

 

LEADDATE needs to be set on every observation.  The later statement that works with FIRST.CODE sets LEADDATE for those observations only, not for all the other observations that are not the first one for a CODE.

Cynthia_sas
SAS Super FREQ

Hi:

  Without seeing data or understanding the program logic, it is hard to comment on whether you need the first INTNX function call. You need to do SOMETHING at the change of the CODE group. Because I assume that at the change, you do not want the leaddate value from the previous group to carry forward? It looks like you want to advance the date on the change of the CODE group. Have you examined the output? Are the dates as you expect them to be.

 

  For proof of concept about how LAG works and why you would want/need to do SOMETHING  at the first of the group,  For example, you did not post any data. so consider this fake data. Each group based on the CODE variable has 4 observations, as shown below:

data test;
  infile datalines;
  input code date : date. name $;
datalines;
1 1nov17 albert
1 2nov17 bob
1 3nov17 carl
1 4nov17 dave
2 1dec17 alice
2 2dec17 barb
2 3dec17 cathy
2 4dec17 diane
;
run;

data newtest1;
  set test; by code;
  leaddate = lag(date);
  format date leaddate date9.;
run;

proc print data=newtest1;
  title 'newtest1 no test for first.code';
run;


data newtest2;
  set test; by code;
  leaddate = lag(date);
  if first.code then leaddate=.;
  format date leaddate date9.;
run;

proc print data=newtest2;
  title 'newtest2 has test for first.code and sets leaddate to missing';
run;

  One assumes that you do not want the date from dave (observation #4) to be lagged to alice (observation #5 and the first.code value for code=2).

 

  So the output from NEWTEST1 shows what happens when you do not perform any test for first.code. In the output for NEWTEST2, it shows that you can reset the value for LEADDATE to something (in my program to missing) for the first observation's value in every BY group.

 

  Whether INTNX is doing what you want or need is a question that only you can answer. Here's the output from the above programs.

 

cynthia

lag_leaddate.png

 

Reeza
Super User

It means you're grouping your analysis by the CODE grouping variable and that for the first record of every unique CODE value it does something specific, in this case calculate the LEADDATE that is then used for further calculations. If it's not the first value of CODE then it uses the lagged value of the variable DATE. 

 

 *get date previous date value;
    leaddate=lag(date);

    *if first record for CODE then calculate a new leaddate;
    if first.code then
        leaddate=intnx('month', date, 3, 'sameday');

 

Correct code and correct logic are not the same thing, in this case your code may be 100% correct, but the logic can be entirely incorrect. 

 

Something that would help with understanding the code would be comments and proper formatting. 

 

data test;
    set test; *using the same name over and over is a bad idea;
    by code; *set to use first date;
    
    *get date previous date value;
    leaddate=lag(date);

    *if first record for CODE then calculate a new leaddate;
    if first.code then
        leaddate=intnx('month', date, 3, 'sameday');
    
    *if sunday move back by 2 days, if saturday move back by 1 day;
    if weekday(date)=1 then
        date1=intnx('day', date, -2);
    else if weekday(date)=7 then
        date1=intnx('day', date, -1);
    else
        date1=date;

    *Not sure what you're doing here;
    if weekday(leaddate)=1 then
        leaddate1=intnx('day', leaddate, 2);
    else if weekday(leaddate)=7 then
        leaddate1=intnx('day', leaddate, 3);
    else if weekday(leaddate)=6 then
        leaddate1=intnx('day', leaddate, 3);
    else
        leaddate1=intnx('day', leaddate, 1);
  
    *remove records if certain condition met...?;
    if leaddate=date then
        delete;
        
    format date1 leaddate1 date9.;
run;

 

mkeintz
PROC Star

The lag function will cross by-group boundaries, which contaminates the first observation of the current by-group with data from the last obs of the preceding group.  The "if first." statements tests whether the record-in-hand is at the beginning of a by-group, and therefore needs to be de-contaminated.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PGStats
Opal | Level 21

At first sight, it might look like the statements

 

    leaddate=lag(date);
    if first.code then leaddate=intnx('month',date,3,'sameday');

could be replaced with the better looking

 

    if first.code then leaddate=intnx('month',date,3,'sameday');
    else leaddate=lag(date);

 

but that wouldn't work. To work properly, the lag() function must be called for every observation. Calling the lag() function conditionnally is (almost) never a good idea.

PG
Agent1592
Pyrite | Level 9
Thanks the dates are company reporting days and if the earnings reporting day is Sunday the idea is to move by 2 days. Is there a better code using PROC SQL?
Reeza
Super User

SQL is harder because it doesn't have the row concept and the LAG function is harder to achieve in SQL 

 

PGStats
Opal | Level 21

date1 is the most recent weekday relative to date, simple enough. But what is leaddate1; you move Saturday and Sunday to the next Tuesday and Friday to the next Monday. What's the logic?

PG
Agent1592
Pyrite | Level 9

These are trading days. So the idea is to move weekend days to trading week days.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1134 views
  • 3 likes
  • 6 in conversation