Dear All,
I am a new SAS learner, and I`ve been having some difficulties solving this problem:
I have two data sets; The first data set includes 2452 companies’ bankruptcy related data from 1980 to 2018. The second data set includes that 2452 companies` monthly stock price from 1975 till 2018 (so basically 43*12=516 monthly stock price for each company). I want to test if one of the filed companies in the first data set is public or private during its filing period. So, in order to decide this, I want to check if the company has stock prices two years before the filing date(taking two years backward for each company according to its filing date). And, if it is possible, I want to use “if statement” to decide if the company is public or not. If the average price of the 24 months stock prices can be calculated, then I consider this company as a public company during its bankruptcy filing date. Otherwise, this company is considered as a private company. The two data sets are organized in the following format. Is it possible to create a loop to get the final results?
Data set 1 (four columns, 2452 rows)
Company GVKEY CIK bankruptcy file data (in “mmddyy” format)
1 123456 23445 10/19/2018
2 789654 898980 01/29/1990
…….
2452 678950 383994 05/9/2000
Data set 2 (341350 rows )
Company CIK GVKEY stock price date(“yymmdd” format) stock price
1 456734 090908 19800131 15.75
1 456734 090908 19800230 17.75
1 456734 090908 19800331 6.25
1 456734 090908 19800430 10.05
1 456734 090908 19800531 15.75
……
…….
…….
1 456734 090908 19801130 12.00
1 456734 090908 19801231 17.75
…..
……till
1 456734 090908 20181130 15.75
1 456734 090908 20181031 17.75
then it starts the second company
..... until it covers all the 2452 companies
I already have merged the two data sets according to the unique identifier GVKEY, and in the merged data file, SAS just repeated one bankruptcy file date to fill up the 40 rows for each company in the merged data file. I am also not sure about how to delete those repeated file dates for each company.
Thanks,
There is no data attached. Here is one last shot at this. Your explanation makes me think that what you really need is something like:
proc sql;
create table ccc as
select
aa.cik,
aa.bankrupcyDate,
mean(bb_num.prccm) as MEANPRICE
from aa left join bb_num
on aa.cik = bb_num.cik and
bb_num.priceDate between intnx('month', aa.bankrupcyDate, -23, 'Beginning') and
intnx('month', aa.bankrupcyDate, 0, 'End')
group by aa.cik, aa.bankrupcyDate ;
quit;
if aa contains the bankrupcy dates and bb_num contains the stock prices.
This should get you started
data BANKRUPTCY;
input COMPANY GVKEY CIK BDATE : mmddyy10.;
format BDATE date9.;
cards;
1 123456 23445 10/19/2018
2 789654 898980 01/29/1990
2452 678950 383994 05/9/2000
run;
data COMPANIES;
input COMPANY CIK GVKEY SDATE : yymmdd10. STOCKPRICE ;
format SDATE date9.;
cards;
1 456734 090908 19800131 15.75
1 456734 090908 19800230 17.75
1 456734 090908 19800331 6.25
1 456734 090908 19800430 10.05
1 456734 090908 19800531 15.75
1 456734 090908 20181130 15.75
1 456734 090908 20181031 17.75
2 456734 090908 19800131 15.75
2 456734 090908 19800230 17.75
2 456734 090908 19800331 6.25
2 456734 090908 19800430 10.05
2 456734 090908 19800531 15.75
run;
proc sql;
select BANKRUPTCY.COMPANY, BDATE, mean(STOCKPRICE) as MEANPRICE
from BANKRUPTCY left join COMPANIES
on BANKRUPTCY.COMPANY=COMPANIES.COMPANY
and intck('month',BDATE,SDATE) between 0 and -24
group by 1;
quit;
COMPANY | BDATE | MEANPRICE |
---|---|---|
1 | 19OCT2018 | 17.75 |
2 | 29JAN1990 | . |
2452 | 09MAY2000 | . |
I would suggest, instead :
data BANKRUPTCY;
input COMPANY GVKEY CIK BDATE : mmddyy10.;
format BDATE date9.;
cards;
1 123456 23445 10/19/2018
2 789654 898980 01/29/1990
2452 678950 383994 05/9/2000
;
data COMPANIES;
input COMPANY CIK GVKEY SDATE : yymmdd10. STOCKPRICE ;
format SDATE date9.;
cards;
1 456734 090908 19800131 15.75
1 456734 090908 19800229 17.75
1 456734 090908 19800331 6.25
1 456734 090908 19800430 10.05
1 456734 090908 19800531 15.75
1 456734 090908 20171130 15.75
1 456734 090908 20171031 17.75
2 456734 090908 19800131 15.75
2 456734 090908 19800229 17.75
2 456734 090908 19800331 6.25
2 456734 090908 19800430 10.05
2 456734 090908 19800531 15.75
;
proc sql;
select BANKRUPTCY.COMPANY, BDATE, mean(STOCKPRICE) as MEANPRICE
from BANKRUPTCY left join COMPANIES
on BANKRUPTCY.COMPANY=COMPANIES.COMPANY
and SDATE between intnx('year', BDATE, -2, "same") and BDATE
group by BANKRUPTCY.COMPANY, BDATE;
quit;
Note: I changed the available dates for company 1 to be within the two years prior to bankruptcy.
COMPANY BDATE MEANPRICE ------------------------------ 1 19OCT2018 16.75 2 29JAN1990 . 2452 09MAY2000 .
hey, I already merged these two data sets by GVKEY, and if I want to work on that merged date set, what changes should I make on the codes that you`ve provided? one more thing, after I merged, I noticed that the bankruptcy file date for each company was repeated as two data sets dont share the same number of rows. Is this affect the calculation of the average stock price?
If your merged data is the way I think it should be - namely, every row with a given GVKEY have the same value for COMPANY and BDATE - (but your last question makes me doubt that), this would do:
proc sql;
select COMPANY, BDATE, mean(STOCKPRICE) as MEANPRICE
from MERGEDDATA
where SDATE between intnx('year', BDATE, -2, "same") and BDATE
group by COMPANY, BDATE;
quit;
I ended up using your first code, and it did give me some numerical results. but I wanted know if the final results are calculated the way I wanted them to be, so I also manually checked some companies` two years average stock price just before the their bankruptcy. I got results that are not the same.
I tried to change the "year" to "month", and put -24 instead of -2. but they two gave me the same results. Could you please tell me how SAS calculated the value by using your code. Does it take month into consideration ?
for example, if company A went bankrupt at year 2018/11/09, I want the average two year stock price of company A calculate by including all the monthly stock prices from 2016/11- 2018/11, not just from 2016/01-2018/01 or 2016/12-2-18/12.
Thank you in advance.
Ziba
Please show an example of data and code for which the results are not as expected.
Hey PG,
I have uploaded the screenshot of excel workbook to explain my question better. What I am not sure is how SAS calculates the mean price of monthly stock prices. I have tried two ways of calculating the average stock price in Excel. In excel, the first and second rows are the monthly closing date and corresponding monthly stock price. This company went bankrupt at year of 1989/3/10, the first calculation( colored as green) takes the 24 months stock prices before the bankruptcy date(from 1987/3/31-1989/3/31), and the average price is 6.525. The second approach(colored in red) includes date 1987/01/31-1989/01/31 (also 24 months data), but I got 7.015 as the average price. However, SAS generated 10.19 as the mean price. That`s why I am confused how the calculations are processed in sas. I used the following code(I merged by cik).
proc sql;
create table ccc
as select aa.cik, merged_file_date, mean(prccm) as MEANPRICE
from aa left join bb_num
on aa.cik = bb_num.cik
and prccm between intck('year', merged_file_date, -2, "same") and merged_file_date
group by aa.cik, merged_file_date ;
quit;
I want the mean price that`s calculated by SAS equals to 6.525(not 10.19) for this company( by selecting dates colored in green).
What changes should I make to this code?
Thank you soooo much.
Ziba
There is no data attached. Here is one last shot at this. Your explanation makes me think that what you really need is something like:
proc sql;
create table ccc as
select
aa.cik,
aa.bankrupcyDate,
mean(bb_num.prccm) as MEANPRICE
from aa left join bb_num
on aa.cik = bb_num.cik and
bb_num.priceDate between intnx('month', aa.bankrupcyDate, -23, 'Beginning') and
intnx('month', aa.bankrupcyDate, 0, 'End')
group by aa.cik, aa.bankrupcyDate ;
quit;
if aa contains the bankrupcy dates and bb_num contains the stock prices.
Hey PG,
I am sorry I didn`t upload the file, but the code you provided worked perfectly. I got the results that I wanted. Thank you soooooo much. I really appreciated your help.
Ziba
What do you want the result data set to look like?
Hey mkeintz,
as a final result, I`d like to create a new column that indicates the type of the company. so if the average stock price can be calculated, then define that company as a public company. and if we have average stock price that`s NA, then categorize it as private company.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.