Dear altruists,
I want to create a new dataset based on a specific calendar date as follows:
My reference date is 13 April 2009.
I want to retain observations containing only Fiscal_Quarter 2 which immediately precedes the first Fiscal_Quarter 4 before 13 April 2009.
So for instance, for Company_ID 1210, the first Fiscal_Quarter 4 before 13 April 2009 is on 31 March 2009.
I want to retain the Fiscal_Quarter 2 value on 30 September 2008.
I have the following dataset (I have uploaded the SAS dataset in the end):
Company_ID | Date | Fiscal_Quarter | Return |
1210 | 20080630 | 1 | 0.243333 |
1210 | 20080930 | 2 | 0.443333 |
1210 | 20081231 | 3 | 0.483333 |
1210 | 20090331 | 4 | 0.363333 |
1210 | 20090630 | 1 | 0.163333 |
1210 | 20090930 | 2 | 0.203333 |
1094 | 20070930 | 1 | 0.243333 |
1094 | 20071231 | 2 | 0.283333 |
1094 | 20080331 | 3 | 0.323333 |
1094 | 20080630 | 4 | 0.363333 |
1094 | 20080930 | 1 | 0.403333 |
1094 | 20081231 | 2 | 0.443333 |
1094 | 20090331 | 3 | 0.483333 |
1094 | 20090630 | 4 | 0.363333 |
1327 | 20071231 | 1 | 0.163333 |
1327 | 20080331 | 2 | 0.123333 |
1327 | 20080630 | 3 | 0.027333 |
1327 | 20080930 | 4 | -0.06867 |
1327 | 20081231 | 1 | -0.16467 |
1327 | 20090331 | 2 | -0.26067 |
1327 | 20090630 | 3 | -0.35667 |
1327 | 20090930 | 4 | -0.45267 |
1045 | 20080331 | 1 | -0.54867 |
1045 | 20080630 | 2 | -0.64467 |
1045 | 20080930 | 3 | -0.74067 |
1045 | 20081231 | 4 | -0.83667 |
1045 | 20090331 | 1 | -0.93267 |
1045 | 20090630 | 2 | -1.02867 |
I am expecting the following output:
Company_ID | Date | Fiscal_Quarter | Return |
1210 | 20080930 | 2 | 0.443333 |
1094 | 20071231 | 2 | 0.283333 |
1327 | 20080331 | 2 | 0.123333 |
1045 | 20080630 | 2 | -0.64467 |
You would notice that for Company_ID 1327, I do not want the row for Fiscal_Quarter 2 on 31 March 2009 since the first value for Fiscal_Quarter 4 is on 30 September 2008.
Thank you in advance for all your support!
Assuming that "Date" is a sas-date:
proc sort data=work.have out= sorted;
where Date < '13Apr2009'd and Fiscal_Quarter in (2, 4);
by Company_ID descending Date;
run;
data work.result;
set work.sorted;
by Company_ID descending Date;
retain output_next 0;
if fiscal_quarter = 4 then output_next = 1;
if fiscal_quarter = 2 and output_next = 1 then do;
output;
output_next = 0;
end;
drop output_next;
run;
proc sort data=work.result out=work.want;
by descending return;
run;
I want to retain observations containing only Fiscal_Quarter 2 which immediately precedes the first Fiscal_Quarter 4 before 13 April 2009.
Here is UNTESTED CODE. If you want tested code, please provide the data as SAS data step code (instructions).
/* Find fiscal quarter 4 before 13APR2009 */
proc sql;
create last_fiscal_quarter4 as select * from have(where=(fiscal_quarter=4 and date<'13APR2009'd))
group by company_id
having date=max(date);
run;
/* Find fiscal quarter 2 before the above found date */
proc sql;
create want as select h.*
from have(where=(fiscal_quarter=2)) as h left join last_fiscal_quarter4 as l
on h.company_id=l.company_id
where h.fiscal_quarter<l.date
group by company_id
having h.date=max(h.date);
quit;
Thank you for your kind response.
Unfortunately, I could not convert it into datastep since I have never done it before 😞
I am working on it.
Just to let you know that I received the following error message when I ran your untested code:
I will upload the file again as a datastep again in my query if hopefully I can figure it out.
Thanks again!
Fixed (I think). Silly error on my part.
/* Find fiscal quarter 4 before 13APR2009 */
proc sql;
create table last_fiscal_quarter4 as select * from have(where=(fiscal_quarter=4 and date<'13APR2009'd))
group by company_id
having date=max(date);
run;
/* Find fiscal quarter 2 before the above found date */
proc sql;
create table want as select h.*
from have(where=(fiscal_quarter=2)) as h left join last_fiscal_quarter4 as l
on h.company_id=l.company_id
where h.fiscal_quarter<l.date
group by company_id
having h.date=max(h.date);
quit;
Thanks a lot again Andreas!
Unfortunately, there was still a small error in the log which I could not fix.
Nonetheless, I have got the solution, thanks to Andreas 🙂
I really appreciate your time!
Assuming that "Date" is a sas-date:
proc sort data=work.have out= sorted;
where Date < '13Apr2009'd and Fiscal_Quarter in (2, 4);
by Company_ID descending Date;
run;
data work.result;
set work.sorted;
by Company_ID descending Date;
retain output_next 0;
if fiscal_quarter = 4 then output_next = 1;
if fiscal_quarter = 2 and output_next = 1 then do;
output;
output_next = 0;
end;
drop output_next;
run;
proc sort data=work.result out=work.want;
by descending return;
run;
Hello Andreas,
Thank you for your reply the other day.
When I am trying to run the program on a larger dataset, unfortunately the results are not coming as expected.
For instance, the result for GVKEY 001209, GVKEY 004321.
Would be grateful if you could kindly give me any suggestions as to how I can fix this issue.
I am attaching the larger dataset here.
Since I have modified the names of the columns a bit, I am copying your same code with the names a bit modified here for your convenience.
proc sort data=work.have out= sorted1;
where Datadate < '13Apr2009'd and FQTR in (2, 4);
by GVKEY descending Datadate;
run;
data work.result1;
set work.sorted1;
by GVKEY descending Datadate;
retain output_next 0;
if FQTR = 4 then output_next = 1;
if FQTR = 2 and output_next = 1 then do;
output;
output_next = 0;
end;
drop output_next;
run;
proc sort data=work.result1 out=work.want1;
by descending GVKEY Datadate;
run;
Sorry, but i can't download any binary files. Could you post the data of GVKEY=001209 and GVKEY=004321 as data step?
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.