BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ziba
Calcite | Level 5

Dear All, 

I am a new SAS learner, and I`ve been having some difficulties solving this problem:

 

I have two data sets; The first data set includes 2452 companies’ bankruptcy related data from 1980 to 2018. The second data set includes that 2452 companies` monthly stock price from 1975 till 2018 (so basically 43*12=516 monthly stock price for each company). I want to test if one of the filed companies in the first data set is public or private during its filing period. So, in order to decide this, I want to check if the company has stock prices two years before the filing date(taking two years backward for each company according to its filing date). And, if it is possible, I want to use “if statement” to decide if the company is public or not. If the average price of the 24 months stock prices can be calculated, then I consider this company as a public company during its bankruptcy filing date.  Otherwise, this company is considered as a private company. The two data sets are organized in the following format. Is it possible to create a loop to get the final results?

 

 Data set 1 (four columns, 2452 rows)

Company   GVKEY        CIK               bankruptcy file data (in “mmddyy” format)

1                   123456       23445       10/19/2018

2                   789654         898980     01/29/1990

…….

2452            678950         383994      05/9/2000

 

 

 

Data set 2 (341350 rows )

Company     CIK             GVKEY         stock price date(“yymmdd” format)         stock price

1                     456734    090908        19800131                                                     15.75

1                    456734     090908        19800230                                                     17.75

1                     456734    090908        19800331                                                      6.25

1                    456734     090908        19800430                                                      10.05

1                     456734    090908        19800531                                                         15.75

……

…….

…….

1                     456734    090908        19801130                                                            12.00

1                    456734     090908        19801231                                                            17.75

…..

……till

1                     456734    090908        20181130                                                                 15.75

1                    456734     090908        20181031                                                                 17.75

then it starts the second company 

..... until it covers all the 2452 companies 

 

I already have merged the two data sets according to the unique identifier GVKEY, and in the merged data file, SAS just repeated one bankruptcy file date to fill up the 40 rows for each company in the merged data file. I am also not sure about how to delete those repeated file dates for each company. 

Thanks, 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

There is no data attached. Here is one last shot at this. Your explanation makes me think that what you really need is something like:

 

proc sql;
create table ccc as 
select 
    aa.cik, 
    aa.bankrupcyDate, 
    mean(bb_num.prccm) as MEANPRICE
from aa left join bb_num
  on  aa.cik = bb_num.cik and
      bb_num.priceDate between intnx('month', aa.bankrupcyDate, -23, 'Beginning') and 
      intnx('month', aa.bankrupcyDate, 0, 'End')
group by aa.cik, aa.bankrupcyDate ;
quit;

if aa contains the bankrupcy dates and bb_num contains the stock prices.

 

PG

View solution in original post

11 REPLIES 11
ChrisNZ
Tourmaline | Level 20

This should get you started


data BANKRUPTCY;
  input COMPANY GVKEY CIK  BDATE : mmddyy10.;
  format  BDATE date9.;
cards;
1     123456   23445    10/19/2018
2     789654  898980    01/29/1990
2452  678950  383994    05/9/2000
run;

data COMPANIES;
  input COMPANY  CIK  GVKEY  SDATE : yymmdd10. STOCKPRICE ;
  format  SDATE date9.;
cards;
1    456734    090908        19800131           15.75
1    456734     090908        19800230           17.75
1    456734    090908        19800331            6.25
1    456734     090908        19800430            10.05
1    456734    090908        19800531               15.75
1    456734    090908        20181130                       15.75
1    456734     090908        20181031                       17.75
2    456734    090908        19800131           15.75
2    456734     090908        19800230           17.75
2    456734    090908        19800331            6.25
2    456734     090908        19800430            10.05
2    456734    090908        19800531               15.75
run;

proc sql;
  select BANKRUPTCY.COMPANY, BDATE, mean(STOCKPRICE) as MEANPRICE
  from BANKRUPTCY left join COMPANIES
  on  BANKRUPTCY.COMPANY=COMPANIES.COMPANY  
  and intck('month',BDATE,SDATE) between 0 and -24
  group by 1;
quit;

 

COMPANY BDATE MEANPRICE
1 19OCT2018 17.75
2 29JAN1990 .
2452 09MAY2000 .

 

 

 

PGStats
Opal | Level 21

I would suggest, instead :

 

data BANKRUPTCY;
  input COMPANY GVKEY CIK  BDATE : mmddyy10.;
  format  BDATE date9.;
cards;
1    123456 23445  10/19/2018
2    789654 898980 01/29/1990
2452 678950 383994 05/9/2000
;

data COMPANIES;
  input COMPANY  CIK  GVKEY  SDATE : yymmdd10. STOCKPRICE ;
  format  SDATE date9.;
cards;
1 456734 090908 19800131 15.75
1 456734 090908 19800229 17.75
1 456734 090908 19800331 6.25
1 456734 090908 19800430 10.05
1 456734 090908 19800531 15.75
1 456734 090908 20171130 15.75
1 456734 090908 20171031 17.75
2 456734 090908 19800131 15.75
2 456734 090908 19800229 17.75
2 456734 090908 19800331 6.25
2 456734 090908 19800430 10.05
2 456734 090908 19800531 15.75
;

proc sql;
  select BANKRUPTCY.COMPANY, BDATE, mean(STOCKPRICE) as MEANPRICE
  from BANKRUPTCY left join COMPANIES
  on  BANKRUPTCY.COMPANY=COMPANIES.COMPANY  
  and SDATE between intnx('year', BDATE, -2, "same") and BDATE
  group by BANKRUPTCY.COMPANY, BDATE;
quit;

Note: I changed the available dates for company 1 to be within the two years prior to bankruptcy.

 

                            COMPANY      BDATE  MEANPRICE
                           ------------------------------
                                  1  19OCT2018      16.75
                                  2  29JAN1990          .
                               2452  09MAY2000          .
PG
Ziba
Calcite | Level 5

hey, I already merged these two data sets by GVKEY, and  if I want to work on that merged date set, what changes should I make on the codes that you`ve provided? one more thing, after I merged, I noticed that the bankruptcy file date for each company was repeated as two data sets dont share the same number of rows. Is this affect the calculation of the average stock price? 

 

 

PGStats
Opal | Level 21

If your merged data is the way I think it should be - namely, every row with a given GVKEY have the same value for COMPANY and BDATE - (but your last question makes me doubt that), this would do:

 

proc sql;
  select COMPANY, BDATE, mean(STOCKPRICE) as MEANPRICE
  from MERGEDDATA  
  where SDATE between intnx('year', BDATE, -2, "same") and BDATE
  group by COMPANY, BDATE;
quit;
PG
Ziba
Calcite | Level 5

I ended up using your first code, and it did give me some numerical results. but I wanted know if the final results are calculated the way I wanted them to be, so  I also manually checked some companies` two years average stock price just before the their bankruptcy.  I got results that are not the same. 

I tried to change the "year" to "month", and put -24 instead of -2. but they two gave me the same results. Could you please tell me how SAS calculated the value by using your code. Does it take month into consideration ?

 

for example, if company A went bankrupt at year 2018/11/09, I want  the average two year stock price of company A  calculate by including all the monthly stock prices from 2016/11- 2018/11, not just from 2016/01-2018/01 or 2016/12-2-18/12.

 

Thank you in advance. 

Ziba

 

 

 

 

PGStats
Opal | Level 21

Please show an example of data and code for which the results are not as expected.

PG
Ziba
Calcite | Level 5

Hey PG, 

I have uploaded the screenshot of excel workbook to explain my question better. What I am not sure is how SAS calculates the mean price of monthly stock prices. I have tried two ways of calculating the average stock price in Excel. In excel, the first and second rows are the monthly closing date and corresponding monthly stock price. This company went bankrupt at year of  1989/3/10, the first calculation( colored as green) takes the 24 months stock prices before the bankruptcy date(from 1987/3/31-1989/3/31), and the average price is 6.525. The second approach(colored in red) includes date 1987/01/31-1989/01/31 (also 24 months data), but I got 7.015 as the average price. However, SAS generated 10.19 as the mean price. That`s why I am confused how the calculations are processed in sas. I used the following code(I merged by cik). 

proc sql;
create table ccc 
as select aa.cik, merged_file_date, mean(prccm) as MEANPRICE
  from aa left join bb_num
  on aa.cik = bb_num.cik 
  and prccm between intck('year', merged_file_date, -2, "same") and merged_file_date
  group by aa.cik, merged_file_date ;
quit;

I want the mean price that`s calculated by SAS  equals to 6.525(not 10.19) for this company( by selecting dates colored in green).

What changes should I make to this code?

Thank you soooo much.

Ziba 

 

PGStats
Opal | Level 21

There is no data attached. Here is one last shot at this. Your explanation makes me think that what you really need is something like:

 

proc sql;
create table ccc as 
select 
    aa.cik, 
    aa.bankrupcyDate, 
    mean(bb_num.prccm) as MEANPRICE
from aa left join bb_num
  on  aa.cik = bb_num.cik and
      bb_num.priceDate between intnx('month', aa.bankrupcyDate, -23, 'Beginning') and 
      intnx('month', aa.bankrupcyDate, 0, 'End')
group by aa.cik, aa.bankrupcyDate ;
quit;

if aa contains the bankrupcy dates and bb_num contains the stock prices.

 

PG
Ziba
Calcite | Level 5

Hey PG, 

I am sorry I didn`t  upload the file, but the code you provided worked perfectly. I got the results that I wanted. Thank you soooooo much. I really appreciated your help.

Ziba 

Smiley Very Happy

mkeintz
PROC Star

What do you want the result data set to look like?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ziba
Calcite | Level 5

Hey 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2506 views
  • 1 like
  • 4 in conversation