BookmarkSubscribeRSS Feed
yelena
Fluorite | Level 6

Dear all,

I am cleaning the data and after merge I have found that I have two different values of the same variable over the same time period. It happened since the company switched its fiscal years. HOw can I get around this problem?

 

X_variable is reported from Compustat based on Fiscal Periods. DATADATE = end of fiscal period. FYR = month(DATADATE). BEGFYR = beginning of fiscal period.

 

As_of_Date is taken from another dataset and it seems that it falls into two different fiscal periods because this company (ID = 123456) changed its reporting date from December (FYR = 12) until september (FYR=09) and then used FYR = 09 until today. So this swtich happened once. What should I do in this case? If the company switched from 12 (up until 2009) to 9(from 2010), how can I say that for the

 

For 2009 we have a long year (from 2009-01-01 until 2009-12-31) So if As_of_Date falls between these two dates, we report X_variable = 472

 

For 2010 we will have a short year (from 2010-01-01 until 2010-09-30). So As_of_Date won't fall between these two dates, so we will omit the duplicate values. So that the fiscal year periods will look like:

 

2009-01-01 until 2009-12-31 (12 months)

2010-01-01 until 2010-09-30 (10 months)

2010-10-01 until 2011-09-30 (12 months)

 

I think I need to change smth in the code in the definition of  fiscal years:

 

data sql.NEW;

set sql. OLD;

format begfyr endfyr yymmdd10.;

fyr = month(DATADATE);

endfyr = DATADATE;

begfyr = intnx('month', endfyr, -11, 'beg');

 

run;

 

proc sql;

create table sql.merge

as select *

from sql. loan as a, sql.compustat as b

where a.ID = b.ID and

a.As_of_Date >= b.begfyr and a.As_of_Date <= b.endfyr

;

quit;

 

 

Original DATA

 

ID                      As_of_Date                         DATADATE             FYR         BEGFYR                           X_variable

123456                2009-10-30                       2009-12-31              12               2009-01-01                                 472

123456               2009-11-30                         2009-12-31             12               2009-01-01                                 472

123456               2009-12-31                         2009-12-31               12            2009-01-01                                 472

123456                2009-10-30                        2010-09-30                09          2009-10-01                                    434

123456                2009-11-30                        2010-09-30                09            2009-10-01                                   434

123456                 2009-12-31                       2010-09-30                09             2009-10-01                                  434

9 REPLIES 9
Shmuel
Garnet | Level 18

It is not clear to me:

1) is "Original data" your input ?

2) what output do you expect ? can you figure it?
    what differences are relating to your code output ?

 

yelena
Fluorite | Level 6

Thank you for your questions. Let me clarify my question. 

 

I have to correctly define the variable "BEGFYR" = beginning of fiscal year. At the moment in the original data we have DATADATE = ENDFYR = end of fiscal year. If 12-31-2000 is the ENDFYR, then 01-01-2000 will be BEGFYR. But sometimes companies switch fiscal years. At first they may report in December but then start reporting in September or any other months. So I have to take care of this situation. 

 

For example, in the original data I have ENDFYR and FYR (=month(DATADATE)). I have to construct BEGFYR (but need to take into account that some fiscal years may be short and some years may be missing). Also, this changes from one firm to another firm. 

 

ID                              ENDFYR                 FYR                   BEGFYR

 123456                   12-31-2000              12                   01-01-2000

123456                    12-31-2001              12                   01-01-2001

123456                    9-30-2002                 9                    01-01-2002 (short year)

123456                    9-30-2003                 9                    10-01-2002

123456                    9-30-2006                 9                     10-01-2005

123456                    7-31-2008                 7                     08-01-2007  

 

654321                    12-31-2000            12                      01-01-2000

654321                     5-31-2001             5                         01-01-2001 (short year)

654321                     8-31-2001             8                          06-01-2001 (short year)

 

So how should I define BEGFYR?

Thank you for your help.

Yelena                                  

Shmuel
Garnet | Level 18

Assuming records are sorted by ID ENDFYR  and the day after is the BEGFYR of next record

then you can calculate BEGFYR fot the 2nd record on:

 

data want;

 set have;

  by id;

      retain prev_endfyr;

      if first.id then do;

         BEGFYR = ... ;  /* make the best you can to define it on first ID record */

         prev_begfyr = endfyr;

     end; 

     else begfyr = prev_begfyr +1;

     drop prev_endfyr;

run;

 

So you are independent of FYR.

yelena
Fluorite | Level 6

Thank you very much! Should I then define:

begfyr = intnx('month', endfyr, -fyr-1, 'beg');

 

 

Shmuel
Garnet | Level 18

The best way to check a code is to run it on few recods and check results;

  data test;

    set want(obs=10);

      .... enter your code to test ....

 run;

Shmuel
Garnet | Level 18
change to set have(obs=10) or any amount to check
yelena
Fluorite | Level 6

Hello,

I have tried to run the code but it does not work. I only get the observations for two observations:

 

ID                       ENDFYR                  FYR                      begfyr                                  prev_begfyr  

123456               2000-12-31              12                             2000-01-01                          2000-12-31

123456               2001-12-31              12                                      .                                        .

123456               2002-09-30                9                                      .                                        .

123456               2003-09-30                9                                         .                                     .

123456              2006-09-30               9                                        .                                        .

123456             2008-07-31                7                                       .                                        .

654321             2000-12-31                12                              2000-01-01                           2000-12-31

654321             2001-05-31                  5                               .                                                 .

 

I am not sure how to populate missing values. Any suggestions? Thank you for your help.

Shmuel
Garnet | Level 18

Sorry, I have missed something - each line is a new period - then try:

 

data want;

 set have;

  by id;

      retain prev_endfyr;

      if first.id then do;

         BEGFYR = ... ;  /* make the best you can to define it on first ID record */

         prev_begfyr = endfyr;

     end; 

    begfyr = prev_begfyr +1;    /* else - canceled */

    prev_begfyr = endfyr;          /* line added */

     drop prev_endfyr;

run;

 

yelena
Fluorite | Level 6

Hmm, thank you so much for your help. But still something is weird with the code as there are only first observations that are defined for begfyr.

 

I have figured out how to write the code. It works now. I'll post it later.

 

 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 3433 views
  • 0 likes
  • 2 in conversation