BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Deepak_More
Calcite | Level 5

Hello everyone.

I have a dataset with certain variables for each firm. each firm has multiple year observations i.e. 1990, 1993, 1995,1998, 2000, 2002 and 2004.

I want to create the missing values not only for the missing years between these years but also want to add 2005 to these years. the values of other variables for the years created using SAS needs to be the value of previous years. For example 1990 exists originally in the dataset so the missing years which i create i.e. 1991 and 1992 should have same observations as 1990.

1994 should have same observations as 1993.

1996 and 1997 should have same observations as 1995

and so on until 2005 where the observations should be same as 2004.

If the firm observation ends before year 2004 lets say at year 2000, then still I would like to create the observations until 2005 where the variables will have values of 2000 in 2001 to 2005.

I am providing tables for examples

Have                                                                         

YearCusipIndexDual class dummy 
    
1990XXXYY1231
1993XXXYY1251
1995XXXYY1271
1998XXXYY1291
2000XXXYY1261
2002XXXYY1251
2004XXXYY1241
1993ABCD22020
1995ABCD22080
1998ABCD22060
2000ABCD22070
1995DECO92170
1998DECO92180
2000DECO92190
2002DECO92150

 

 

want 

YearCusipIndexDual class dummy 
1990XXXYY1231
1991XXXYY1231
1992XXXYY1231
1993XXXYY1251
1994XXXYY1251
1995XXXYY1271
1996XXXYY1271
1997XXXYY1271
1998XXXYY1291
1999XXXYY1291
2000XXXYY1261
2001XXXYY1261
2002XXXYY1251
2003XXXYY1251
2004XXXYY1241
2005XXXYY1241
1993ABCD22020
1994ABCD22020
1995ABCD22080
1996ABCD22080
1997ABCD22080
1998ABCD22060
1999ABCD22060
2000ABCD22060
2001ABCD22060
2002ABCD22060
2003ABCD22060
2004ABCD22060
2005ABCD22060

 

 

Thanking you in anticipation

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

I will provide UNTESTED CODE to get you most of the way there. If you want TESTED CODE, or code which does all of what you want, you must provide the input data according to these instructions and not using any other method.

 

/* UNTESTED CODE */
data years;
     do year=1990 to 2005;
        output;
     end;
run;
proc sql;
    create table cartesian as select a.cusip,b.year from (select distinct cusip from have) as a,years as b
    order by a.cusip, b.year;
quit;
data want;
    merge have cartesian;
    by cusip year;
    retain index1 dual_class_dummy1;
    if missing(index) then index1=index;
    if missing(dual_class_dummy) then dual_class_dummy1=dual_class_dummy;
run;

 

--
Paige Miller

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

I will provide UNTESTED CODE to get you most of the way there. If you want TESTED CODE, or code which does all of what you want, you must provide the input data according to these instructions and not using any other method.

 

/* UNTESTED CODE */
data years;
     do year=1990 to 2005;
        output;
     end;
run;
proc sql;
    create table cartesian as select a.cusip,b.year from (select distinct cusip from have) as a,years as b
    order by a.cusip, b.year;
quit;
data want;
    merge have cartesian;
    by cusip year;
    retain index1 dual_class_dummy1;
    if missing(index) then index1=index;
    if missing(dual_class_dummy) then dual_class_dummy1=dual_class_dummy;
run;

 

--
Paige Miller
Deepak_More
Calcite | Level 5

Hello PaigeMiller,

Thank you for your response.

First of all my apologies for the format of the data provided. I am new to the community and this was my first post. However I will be careful about this next time and thank you for the link provided.

 

The codes worked well in creating the missing years and the range from 1990 to 2005. However the values of the variables except Cusip could not be created as previous observations and instead they were blank for every observation in the new variable created.

But because the years are created by Cusip, I think I might be able to fill the missing values by the previous observation by some other codes.

PaigeMiller
Diamond | Level 26

@Deepak_More wrote:

 

The codes worked well in creating the missing years and the range from 1990 to 2005. However the values of the variables except Cusip could not be created as previous observations and instead they were blank for every observation in the new variable created.

But because the years are created by Cusip, I think I might be able to fill the missing values by the previous observation by some other codes.


I don't know what this means, please show us. In any event, please provide the data in the requested format.

--
Paige Miller

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 903 views
  • 1 like
  • 2 in conversation