Hello SAS Support Communities,
I'm not going to open zip files from an unknown source in this day and age and I doubt anyone else will either. Post some data in a data step and you will likely get a better response.
Hello HB,
Thanks for the response. The problem is that SAS Support Communities does not let me post a SAS dataset unless it is in a zip file. If it helps, I have attached what I am looking for in Excel (combining the Compustat1 and Restatements1 tabs to get the Compustat2 tab).
God bless you and your family, best regards, and take care,
Jadallah
Don't post all your data. No one has the time to peruse full data sets anyway. Give us a 20-row example of what you want to achieve.
Hi Chris,
Thank you for the message. I only posted an example. One dataset has two observations in it and the other two each have nine observations.
God bless, best regards, and take care,
Jadallah
Also that I can understand where the guys are coming from, I feel they are a bit harsh on you (@ChrisNZ, @HB).
The best way to post data is to provide a SAS data step which creates sample data. There is a SAS macro available which converts a SAS table to a SAS data step creating the data.
https://blogs.sas.com/content/sastraining/2016/03/11/jedi-sas-tricks-data-to-data-step-macro/
I've downloaded and unzipped your data and then used this macro via below code.
libname sample 'C:\temp\sampledata';
%data2datastep(compustat1, sample, ,C:\temp\sampledata\compustat1.sas);
%data2datastep(compustat2, sample, ,C:\temp\sampledata\compustat2.sas);
%data2datastep(restatements1, sample, ,C:\temp\sampledata\restatements1.sas);
Here the result creating the data which now can be used by everyone in this forum.
data WORK.COMPUSTAT1;
infile datalines dsd truncover;
input CIK_Code:32. fyear:32. fyr:32. datadate:32.;
label CIK_Code="CIK_Code" fyear="fyear" fyr="fyr" datadate="datadate";
datalines4;
1042173,1998,3,19980331
1042173,1999,3,19990331
1042173,2000,3,20000331
1042173,2001,3,20010331
1042173,2002,3,20020331
1042173,2003,3,20030331
1042173,2004,3,20040331
1042173,2005,3,20050331
1042173,2006,3,20060331
;;;;
data WORK.COMPUSTAT2;
infile datalines dsd truncover;
input CIK_Code:32. fyear:32. fyr:32. datadate:32. AO:32. FO:32. AD:32.;
label CIK_Code="CIK_Code" fyear="fyear" fyr="fyr" datadate="datadate" AO="AO" FO="FO" AD="AD";
datalines4;
1042173,1998,3,19980331,0,0,0
1042173,1999,3,19990331,0,0,0
1042173,2000,3,20000331,1,1,0
1042173,2001,3,20010331,1,0,0
1042173,2002,3,20020331,1,1,0
1042173,2003,3,20030331,1,0,0
1042173,2004,3,20040331,1,0,0
1042173,2005,3,20050331,1,0,1
1042173,2006,3,20060331,0,0,0
;;;;
data WORK.RESTATEMENTS1;
infile datalines dsd truncover;
input CIK_Code:32. Restatement_Key:32. Restated_Period_Begin:DATE9. Restated_Period_Ended:DATE9. Disclosure_Date:DATE9.;
format Restated_Period_Begin DATE9. Restated_Period_Ended DATE9. Disclosure_Date DATE9.;
label CIK_Code="CIK_Code" Restatement_Key="Restatement_Key" Restated_Period_Begin="Restated_Period_Begin" Restated_Period_Ended="Restated_Period_Ended" Disclosure_Date="Disclosure_Date";
datalines4;
1042173,11335,01JAN2000,30JUN2004,19OCT2004
1042173,12789,01FEB2002,30JUN2003,01MAY2004
;;;;
And on a side note:
Browsing through your previous posts it looks to me you often only post the problem without demonstrating your own effort. Please try to first solve the problem on your own and post questions only when you get stuck and need direction (eventually also posting some of your not yet working code). If you don't do this then you risk that over time more and more people will start to ignore your questions as this forum is not about doing your work for you (unless you pay for it) but to help and support you in doing your own work.
Thank you for the help Patrick. That's a good point regarding effort. The following code works for my variables (AO, FO, and AD) IF the FYR and FYEAR variables from compustat1 can be joined in with restatements1:
data Etr.restatement1;
set Etr.restatement;
rpb = restated_period_begin;
rpe = restated_period_ended;
RstateDays = (Datdif (rpb, rpe, 'act/act'))+1;
if month(restated_period_begin) le FYR then y1 = year(restated_period_begin);
else y1 = year(restated_period_begin) + 1;
if month(restated_period_ended) le FYR then y2 = year(restated_period_ended);
else y2 = year(restated_period_ended) + 1;
if month(disclosure_date) le FYR then y3 = year(disclosure_date);
else y3 = year(disclosure_date) + 1;
run;
data Etr.restatement1a;
set Etr.restatement1;
If RstateDays LT 350 then delete;
run;
data Etr.restatement2;
set Etr.restatement1a;
drop rpb;
drop rpe;
run;
proc sort data=Etr.restatement2 out=Etr.restatement2a;
by cik_code;
run;
/*create observation for each year restated [include begin year and end year] */
data Etr.restatement3;
set Etr.restatement2a;
by cik_code;
if cik_code then do;
do fyear=y1 to y3;
output;
end;
end;
run;
proc sort data= Etr.restatement3 out= Etr.restatement4 noduplicate;
by restatement_key cik fyear;
run;
data Etr.restatement5;
set Etr.restatement4;
if y1 = fyear then FO=1;
else FO=0;
run;
data Etr.restatement6;
set Etr.restatement5;
if y3 = fyear then AD=1;
else AD=0;
run;
data Etr.restatement7;
set Etr.restatement6;
if y1 le fyear le y2 then AO=1;
else AO=0;
run;
The problem is that a simple merge by cik_code does not work.
God bless, best regards, and thanks a lot,
Jadallah
Hello Patrick,
I made significant progress with the following code:
data etr1.restatements2;
set etr1.restatements1;
y1 = year(restated_period_begin);
y1a = year(restated_period_begin)+1;
y2 = year(restated_period_ended);
y2a = year(restated_period_ended)+1;
y3 = year(disclosure_date);
y3a = year(disclosure_date)+1;
m1 = month(restated_period_begin);
m2 = month(restated_period_ended);
m3 = month(disclosure_date);
run;
proc sort data=etr1.restatements2 out=etr1.restatements2a;
by cik_code;
run;
/*create observation for each year restated [include begin year and end year] */
data etr1.restatements3;
set etr1.restatements2a;
by cik_code;
if cik_code then do;
do fyear=y1 to y3a;
output;
end;
end;
run;
proc sort data=etr1.restatements3 out=etr1.restatements4 noduplicate;
by cik_code fyear;
run;
proc sort data=etr1.compustat1 out=etr1.compustat1a noduplicate;
by cik_code fyear;
run;
data etr1.combined;
merge etr1.compustat1a etr1.restatements4;
by cik_code fyear;
run;
data etr1.combined1;
set etr1.combined;
if m1 le fyr then fyear1 = y1;
if m1 gt fyr then fyear1 = y1a;
if m2 le fyr then fyear2 = y2;
if m2 gt fyr then fyear2 = y2a;
if m3 le fyr then fyear3 = y3;
if m3 gt fyr then fyear3 = y3a;
run;
proc sort data=etr1.combined1 out=etr1.combined2 noduplicate;
by restatement_key;
run;
data etr1.combined3;
set etr1.combined2;
if fyear = fyear1 then FO = 1;
else FO = 0;
run;
data etr1.combined4;
set etr1.combined3;
if fyear1 le fyear le fyear2 then AO=1;
else AO=0;
run;
data etr1.combined5;
set etr1.combined4;
if fyear = fyear3 then AD = 1;
else AD = 0;
run;
proc sort data=etr1.combined5 out=etr1.combined6 noduplicate;
by cik_code fyear;
run;
Now all I need is the following:
There should only be 1 observation per cik_code fyear combination. Some cik_code fyear combinations are included in multiple observations. They need to be condensed as follows:
AO, AD, and FO need to keep maximum values. For example, look at observations 11 and 12. These should be condensed into 1 observation with FO = 0, AO = 1, and AD = 1. Does that make sense? It may be an array code.
God bless you and your family, best regards, and thank you for your help,
Jadallah
Hello Kim,
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.