trungcva112
Obsidian | Level 7
Please. Anything could help
mkeintz
PROC Star

If you drop quarters with greater than a 75% change in cogsq or sale, then how do you define lag(cogsq) and lag(sale) for the subsequent quarter?

 

If Q3 has an 80% change vs Q2, then what values will you use as the lags of cogsq and sale in Q4?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
trungcva112
Obsidian | Level 7
Good question. For example, if cogs in Q3 has an 80% change vs Q2, then the lag of cogs in Q4 would be cogs in Q2 (or, equivalently, cogs in the nearest quarter whose cogs growth rate is within (-75%, +75%)).
mkeintz
PROC Star

So does this mean that the concept of lagged values of cogsq and sale can represent varying time spans? Let's say you have successive changes in sales of +80%, -78%, and +90%, for a highly seasonal company. Then your lagged values of cogsq and sale will represent one-year-old values.

 

And more generally, I don't get this notion of dropping dramatic changes anyway, since it is likely to introduce even more dramatic changes in your analysis data. Let's say a company is in a growth spurt, with successive changes in sales of 78% (for Q2) and 20% (for Q3). By dropping Q2 and inserting Q1 sales as the lagged value for Q3, you will have introduced a "change" of about 114% (1.78 x 1.20 = 2.136) between the current and "preceding" value of the variable. That would increase the beta coefficient for lagged sale.

 

What you MIGHT want to do instead is insert a dummy variable indicating records with large absolute proportional changes, and keep those records.
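
As a rough illustration of the dummy-variable idea, here is a minimal sketch. It assumes a data set HAVE sorted by gvkey and quarter; the output name WANT and the flag name big_chg_sale are hypothetical, and the 75% cutoff is taken from the discussion above.

```sas
/* Sketch only: WANT and big_chg_sale are made-up names.           */
/* Assumes HAVE is sorted by gvkey and, within gvkey, by quarter.  */
data want;
  set have;
  by gvkey;
  lag_sale = lag(sale);                /* call LAG on every record      */
  if first.gvkey then lag_sale = .;    /* no lag across company borders */
  /* 1 if the quarterly change is at least 75% in either direction */
  if lag_sale > 0 then
    big_chg_sale = (abs(sale/lag_sale - 1) >= 0.75);
run;
```

The flagged records stay in the data, so the lag of the following quarter is still the immediately preceding quarter.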

 

 

trungcva112
Obsidian | Level 7
Thanks for your advice. But how can I insert this dummy variable and other conditions in the rolling regression?
trungcva112
Obsidian | Level 7
Does anyone have any idea?
mkeintz
PROC Star

This program should give you the needed infrastructure:

 

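data need (keep=gvkey window date cogsq sale lag_: outlier_:) / view=need;

  /* Pass 1: read one gvkey, build lags and outlier flags, store every row */
  /* DATE is assumed to be the name of the quarter date variable in HAVE   */
  do n=1 by 1 until (last.gvkey);
    set have;
    by gvkey;
    lag_cogsq=lag(cogsq);
    lag_sale=lag(sale);
    if n=1 then call missing(of lag_:);  /* first quarter of a firm has no lag */
    else do;
      /* 0 if the ratio to the prior quarter is inside (0.25, 1.75), else 1 */
      outlier_cogsq=1-(0.25<cogsq/lag_cogsq<1.75);
      outlier_sale=1-(0.25<sale/lag_sale<1.75);
    end;

    array vars {*} date cogsq sale lag_cogsq lag_sale outlier_cogsq outlier_sale;
    array hist {100,7}; /* Up to 100 historic records for 7 vars */

    do v=1 to 7; hist{n,v}=vars{v}; end;
  end;

  /* Pass 2: one window per ending row; row 1 is skipped (no lag).       */
  /* Each window holds 10 to 20 quarters (20 quarters = 5 years).        */
  if n>10 then do endrow=11 to n;
    begrow=max(2,endrow-19);
    window=endrow-11;
    do row=begrow to endrow;
      do v=1 to 7; vars{v}=hist{row,v}; end;
      output;
    end;
  end;
run;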

proc reg data=need;
 ......
 run;
quit;

 

 

Remember data set NEED will be roughly 20 times the size of HAVE (you want 5-year rolling quarterly data). So I made NEED a data set VIEW instead of a data set FILE.  It's only activated when a subsequent PROC calls for NEED, and the data is streamed directly to the proc instead of a disk file.
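
The PROC REG step above is left open on purpose. As a hedged sketch of how the windows could be fed to it, the following assumes NEED contains both gvkey and window (add gvkey to the KEEP= list if it is not there) and that HAVE is sorted by gvkey. The MODEL statement and the output name EST are purely hypothetical placeholders, not the model from the original question.

```sas
/* Sketch only: the model and the EST data set name are assumptions. */
/* One regression is run per gvkey and per rolling window.           */
proc reg data=need noprint outest=est;
  by gvkey window;
  model cogsq = lag_cogsq lag_sale outlier_cogsq outlier_sale;
run;
quit;
```

EST then holds one row of coefficient estimates per gvkey and window.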

trungcva112
Obsidian | Level 7
Hi everyone.

I have described the data in more detail. Does anyone have an idea? The earlier programs do not work.
mkeintz
PROC Star
  1. The phrase "do not work" is not a useful description of the problem. What results did you get, and what did you expect? Please show those results.
     
  2. Test this with a sample data set with two gvkeys and just a couple of years of data. And if you are testing my suggestion, change the data set view NEED to a data set file. Then you can examine the intermediate data set to confirm it is constructed as intended.
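
A minimal sketch of such a test, with entirely made-up values: two hypothetical gvkeys, 12 quarters each. It assumes the view is named NEED and was defined as above, and that the date variable is called date. Note that this overwrites WORK.HAVE, so run it in a fresh session.

```sas
/* Synthetic test data: gvkeys, dates and amounts are invented. */
data have;
  do gvkey = 1001, 1002;
    do q = 0 to 11;
      date  = intnx('qtr', '01jan2015'd, q, 'end');  /* quarter-end date */
      sale  = 100 + 10*q + 5*mod(q, 4);
      cogsq = 0.6*sale;
      output;
    end;
  end;
  format date yymmdd10.;
  drop q;
run;

/* Copy the view into a data set file so it can be inspected */
data need_file;
  set need;
run;

proc print data=need_file;
run;
```

With 12 quarters per firm, each firm should yield two windows, one of 10 rows and one of 11 rows, if the view is built as intended.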
