Dear all,
I have a dataset with following variables:
Gvkey = company identification code
fyearq= fiscal year
fqtr=fiscal quarter
sale_dec= indicator value = 1 if sales decrease; else it is equals to 0
saledec= value of decrease
saleinc = value of increase
I would like to find:
1. the most recent quarter with a sales decrease and its valueof decrease
2. the most recent quarter with a sales increase and its value of increase
for each fiscal year. The selection of the most recent quarter is based on the last 7 quarters (q-7) plus the current quarter (q).
Thank you and hope to receive a solution soon.
mspak
OK. Mspak.
So you want a rolling window which'length is eight quarter and output one obs for each id and year ?
Assuming there is no gap(missing quarter in a year) in your data. Otherwise you need some more code to handle this.
data want;
set test;
by GVKEY FYEARQ;
array d{8} _temporary_;
array i{8} _temporary_;
if first.gvkey then call missing(of d{*},of i{*});
do k=8 to 2 by -1;
d{k}=d{k-1};
i{k}=i{k-1};
end;
d{1}=SALEDEC;
i{1}=SALEINC;
if last.fyearq then do;
last_decrease=coalesce(of d{*});
last_increase=coalesce(of i{*});
output;
end;
keep GVKEY FYEARQ last_: ;
run;
Below should point you into the right direction.
libname temp 'C:\temp';
proc sort data=temp.test out=inter;
where sale_dec=1;
by gvkey fyearq fqtr;
run;
/*the most recent quarter with a sales decrease and its valueof decrease per company identification code*/
data want1;
set inter;
by gvkey fyearq fqtr;
if last.gvkey;
run;
@mspak ,
plz post your data at this forum, not XLS, ZIP file.
No one would like to download these files from website due to virus risk.
And do't forget to post the output you want see too.
It is hard to understand what you mean by most recent quarter.
Assuming I understand what you mean.
data last_decrease last_increase;
do i=1 by 1 until(last.gvkey);
set test;
by gvkey;
if sale_dec then idx_dec=i;
if not missing(saleinc) then idx_inc=i;
end;
do j=1 by 1 until(last.gvkey);
set test;
by gvkey;
if j=idx_dec then output last_decrease;
if j=idx_inc then output last_increase;
end;
drop i j;
run;
Hi,
The dataset is as follows:
GVKEY FYEARQ FQTR DATACQTR SALE_DEC SALEDEC SALEINC
001004 2011 1 2011Q3 1 4.0787026566 .
001004 2011 2 2011Q4 1 . .
001004 2011 3 2012Q1 0 . -0.017301069
001004 2011 4 2012Q2 0 . 0.2664100989
001004 2012 1 2012Q3 1 1.8067205327 .
001004 2012 2 2012Q4 1 . .
001004 2012 3 2013Q1 0 . 1.9264163544
001004 2012 4 2013Q2 0 . -0.188650692
With the above dataset, I would like to find the most recent quarter's sales decrease and most recent quarter's sales increase.
For example, the most recent quarter with a sales decrease for year 2011 and 2012 would be 4.01787...as this is the only sales decrease over the last 8 quarters.
The most recent sales increase for year 2011 would be 0.2664100989 as this is the latest value for sales increase over the last 8 quarters for year 2011. The value for the latest sales increase for 2012 would be -0.18865.
The most recent "sales decrease" event not neccessarily happen in the quarter 4 as the firm might experience the last "sales decrease" event during quarter 1 of previous year.
I wish the output as follows:
GVKEY FYEARQ SALEDEC SALEINC
001004 2011 4.0787026566 0.2664100989
001004 2012 4.0787026566 -0.188650692
I hope that this make it clearer.
Thank you for any suggestions.
Regards,
MSPAK
OK. Mspak.
So you want a rolling window which'length is eight quarter and output one obs for each id and year ?
Assuming there is no gap(missing quarter in a year) in your data. Otherwise you need some more code to handle this.
data want;
set test;
by GVKEY FYEARQ;
array d{8} _temporary_;
array i{8} _temporary_;
if first.gvkey then call missing(of d{*},of i{*});
do k=8 to 2 by -1;
d{k}=d{k-1};
i{k}=i{k-1};
end;
d{1}=SALEDEC;
i{1}=SALEINC;
if last.fyearq then do;
last_decrease=coalesce(of d{*});
last_increase=coalesce(of i{*});
output;
end;
keep GVKEY FYEARQ last_: ;
run;
Thank you Ksharp for helps. This code is really useful. Have a nice day!
Hi Ksharp,
I have some questions regarding the understanding of your code.
when you write:
if first.gvkey then call missing(of d{*},of i{*});
do k=8 to 2 by -1;
d{k}=d{k-1};
i{k}=i{k-1};
end;
From what I understand, whenever we encounter the first observation of a certain gvkey, you make all the values of the arrays to be missing, and then you make them all be the same - but they are already all missing in the first place?
Yes. they are all missing. It doesn't matter. it is just one more useless operator for the first.gvkey. it wouldn't change anything about result .
Ok, but now I have another question!
I realize that you are doing rolling windows, but can't understand it from the code.
So for a given observation, d{1} is the saledec for that period, and d{2} is the saledec for the previous perid etc.
But how do you put the saledec of the previous period into d{2}? You don't do something like d{2} = lag{saledec}, so how is it done?
Thanks!
Yes. I do.
The array D is temporary, which means it is retained during all the data step time.
so when do
do k=8 to 2 by -1;
d{k}=d{k-1};
i{k}=i{k-1};
end;
assume d{8}=1 2 3 4 5 6 7 8
then the first time d{8}=current_obs 1 2 3 4 5 6 7
NOTE, 8 is removed from d{*}.
next time when run
d{8}=next_obs current_obs 1 2 3 4 5 6
NOTE, 7 is removed from d{*}.
Hi mspak,
Very nice question, I am also working a lot with financial data.
I looked at your data and noticed the following:
1) saleinc has both positive and negative values - shouldn't they all be of the same sign?
2) there are observations when the sale_dec is either 0 or 1, but there is no value for the saledec or saleinc for these observations -is it because this information is missing in the original database?
Thanks
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.