BookmarkSubscribeRSS Feed
Ziba
Calcite | Level 5

hey all, 

I have another question that`s similar to what I had before.

I have a dataset called AA, and it has these following variables:

 Data set AA (four columns, 1642 rows-companies), quarter represents which quarter that company filed for bankruptcy(this variable is very important), 

 

 

Company Name
RAYTECH CORP
MACY'S INC
UNITED AIRLINES INC
TWIN STAR PRODUCTIONS INC
CIK
797917
794367
319687
847908

 

Bankruptcy date Quarter
3/10/1989 1989Q1
1/15/1990 1990Q1
12/3/1990 1990Q4
6/6/1991 1991Q2

 

in the second data set, lets call it FF, I have all the 1642 companies quarterly financial data from 1985 till 2018 (75111 rows),like total assets, total liabilities, net income, R&D expenditure, etc. 

 What I want to do is to find out each companies` last quarterly(but within 12 months) filed financial data of total assets, total liabilities, net income and R&D expenditure before their bankruptcy filings, and take those values to create  new columns. 

For example, in data set AA,  "RAYTECH CORP" went bankrupt on 3/10/1989, which is in 1989Q1( first quarter), I firstly want to  merge these two data sets by variable "CIK"(to locate the company in data set FF from data set AA). When I matched the company in dataset FF, I want to extract the financial values for this company in that bankruptcy filed quarter (1989Q1) or if we don`t have data keep looking for (1988Q4), (1988Q3), (1988Q2), (1988Q1) , and   stop after one full circle(full 12 months, or 4 quarters). as a final results I want to have company names, CIK, and total assets, total liabilities, net income and R&D expenditure. 

I attached this company`s financial data, and the area colored in yellow is the values that I want to extract (as we have the financial data on the bankruptcy filed quarter, so we  dont have to keep looking)

 

I am not sure how to proceed as the interval is quarter( not a number of months), and it is more like moving the rows in the data set FF. 

Thanks in advance, 

 

7 REPLIES 7
ChrisNZ
Tourmaline | Level 20

Like this?

data HAVE;
  input CPY $ CIK $ BK_DTE : mmddyy10. BK_QTR $ ;
  cards;
UNITED 319687 12/3/1990 1990Q4
MACY 794367 1/15/1990 1990Q1
RAYTECH 797917 3/10/1989 1989Q1
run;

data FF;
  CIK='319687'; QTR='1989Q4'; ASSET=1; output;
  CIK='319687'; QTR='1988Q3'; ASSET=1; output;
  CIK='794367'; QTR='1989Q3'; ASSET=1; output;
  CIK='794367'; QTR='1988Q2'; ASSET=1; output;
  CIK='797917'; QTR='1989Q1'; ASSET=1; output;
  CIK='797917'; QTR='1988Q4'; ASSET=1; output;
run;

data WANT;
  merge HAVE FF;
  by CIK;
  DIST= intck('quarter',input(BK_QTR,yyq6.),input(QTR,yyq6.));
  if first.CIK ;          /* The first match is always the best, i.e. the closest date */
  if DIST < -3 then call missing(QTR, ASSET); /* If date too far back then nullify the match*/ 
run;

 

CPY CIK BK_DTE BK_QTR QTR ASSET
UNITED 319687 11294 1990Q4   .
MACY 794367 10972 1990Q1 1989Q3 1
RAYTECH 797917 10661 1989Q1 1989Q1 1

 

 

 

Ziba
Calcite | Level 5

Hey Kris,

Thanks for your efforts, but as a final result, I want to get each company`s financial data(i attached a document, you can have a alook), not the quarter. 

 

ChrisNZ
Tourmaline | Level 20

Do not provide information in MS Office documents.

Many people including myself cannot/will not download them due to the risk these documents present.

The format for presenting the input data and the desired solution is well defined. See:

How to convert datasets to data steps
The macro for direct download
How to post code

Please stick to it.

 

 My solution provides the company's data, reduced to variable ASSET in my example.

mkeintz
PROC Star

You want the most recent quarterly FF quarterly record prior to or equal to the quarter of bankruptcy in dataset AA - as long as the FF record is no more than 4 quarters preceding the bankruptcy.

 

I see you have a date variable in AA.  Do you also have a date variables (not a quarter identifier) in FF?

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ziba
Calcite | Level 5

In dataset AA I only have bankruptcy date as a date variable, the quarter identifier that I have in AA is a numeric value(that I filled manually). I believe in FF, that quarter identifier is numeric as well. BUT in FF, I also have a date variable, which indicates the quarterly end month, like 19890331(Q1),19890630(Q2),199890930,(Q3)19891230 (Q4). I was wondering if it is possible first to determine the quarterly end month for each company by looking at bankruptcy dates, and then proceed with the same procedure. 

thank you 

 

mkeintz
PROC Star

Let's say you have

  1. The bankruptcy data set AA, with one record per CIK, sorted by CIK, and with bankruptcy date BKRDATE
  2. The quarterly data set FF, sorted by CIK/DATADATE, with never more than one record per quarter.  DATADATE is always the last date of the quarter.

You want to produce one record per bankrupt CIK, with the same-quarter FF data, or most recent FF data prior to BKRDATE.  One of the issues is that BKRDATE may be 15feb2011, but the DATADATE for that quarter is 31mar2011.  So, depending on whether there is a quarterly FF observation for the AA bankruptcy quarter, vs a prior FF quarter, you'll can end up with an FF record with DATADATE either preceding (by up to 4 quarters) or after BKRDATE (same quarter).

 

Below is a synthetic sample for CIK's 1,2,3 and 4, all with BKRDATE=15feb2011 in data set AA. 

 

But in FF  (with * next to the proper selection)

  1. CIK 1 ends on the same quarter (datadate=31mar2011)
  2. CIK 2 ends in the prior quarter (31dec2010),
  3. CIK 3 ends 3 quarters prior (30jun2010),
  4. CIK 4 ends too early (31dec2009)
  5. CIK 5 has no bankruptcy record in AA

 

data aa;
  input cik bkrdate :date9.;
  format bkrdate yymmddn8.;
datalines;
1 15feb2011
2 15feb2011
3 15feb2011
4 15feb2011
run;

data ff;
  input cik datadate :date9.; format datadate yymmddn8.;
datalines;
 1 31dec2009
 1 31mar2010
 1 30jun2010
 1 30sep2010
 1 31dec2010
 1 31mar2011  *
 2 31dec2009
 2 31mar2010
 2 30jun2010
 2 30sep2010
 2 31dec2010  *
 3 31dec2009
 3 31mar2010
 3 30jun2010  *
 4 31dec2009  None
 5 31dec2009  None
 5 31mar2010
 5 30jun2010
 5 30sep2010
 5 31dec2010
 5 31mar2011
 5 30jun2011
run;

data want (drop=_:);
  do until (last.cik);
    set aa (keep=cik in=inaa) ff (keep=cik);  /*Get CIK's only */
    by cik;
    if inaa then set aa;                      /* Get BKRDATE*/
    else do;
      merge ff
            ff (firstobs=2 keep=datadate rename=(datadate=_nxtdate));
      if last.cik then _nxtdate='31dec9999'd;
      if bkrdate=. then continue;                                 /*No need to test dates*/
      if not (0<=intck('qtr',datadate,bkrdate)<=4) then continue; /*Get another FF*/
      if 0<=intck('qtr',_nxtdate,bkrdate)<=4 then continue;       /*Get another FF*/
      output;
    end;
  end;
  bkrdate=.;
run;

 

 

For each CIK:

  1. The program reads the AA record prior to all the FF records.
  2. If the record-in-hand is a CIK from AA, then read the BKRDATE from AA.  This will be automatically retained through all the FF records until then next AA record is encountered (for the next CIK).
  3. If not an AA record (i.e. record-in-hand is a CIK from FF) then
    1. read the full FF record and also the next FF datadate (renamed to _nxtdate).
    2. if the record-in-hand is last for this CIK, then _nxtdate is for the following CIK and is set to 31dec9999.
    3. If the BKRDATE is missing (i.e. no AA record for this CIK) then "continue" the do group - i.e. go to top of loop and read more FF data.
    4. if FF datadate for record in hand is NOT between zero quarters and 4 quarters prior to bkrdate then "continue"  (go to top of loop and get another record).
    5. If _nxtdate following IS between zero and 4 quarters prior to bkrdate then "continue" - go to top of loop and get that record, so that DATADATE will pass the test/
    6. The only record that can pass #4 and #5 above is the best possible FF record, so it is OUTPUT.  Note that while there can be multiple records with satisfactory DATADATEs, the test on _NXTDATE will force another record to be read if _NXTDATE also qualifies, so you won't output multiple records per CIK, because eventually _NXTDATE will no longer qualify.

Note if there is an AA record, but no FF records, the is no output, just as if there were FF records, but none qualified.  Same if there are FF records, but no AA record - no output.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ziba
Calcite | Level 5

Thank you so much for your help, the code that you provided worked perfectly. I really appreciated your help. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2506 views
  • 0 likes
  • 3 in conversation