Below is a sample data and SAS program (I use SAS 9.4) that illustrates a challenge I have with processing the replacement of missing credit ratings within each gvkey (identifier) for a range of ym (months within a year). In other words, the procedure to replace missing ratings needs to reset with each individual gvkey. gvkey = a unique number for each corporation ym = a unique month and year rating = credit rating issued for a gvkey at a specific ym “.” = missing rating for specific ym within a gvkey Two gvkeys, 001078 and 002504, are used in the data below, though there are over one thousand in the actual study. The ym’s repeat for each gvkey as they represent the months for one year. The coding follows the data and the SAS results follow the coding. I am trying to replace missing ratings with the last or most recent rating. The program works for the most part as shown in column “rating1.” It worked perfectly for gvkey 001078. However, I need the program to process the replacement within each gvkey and not have the next gvkey read from the previous gvkey. The first gvkey, 1078, is shown in observations 1 -12. Observation 13 is the start of a new gvkey, 2504. The problem is that gvkey 2504 is reading the last rating from observation 11 that is from the previous and different gvkey, 1078. The replacement process should start over again for the next gvkey, 2504, and should start by reading observation 13 as the first rating or data line, which is a missing rating, and should report a missing rating in the “rating1” column until observation 15 when a new rating for gvkey 2504 appears. If the first two or more observations within a gvkey is a missing rating then the replacement rating should also be a missing rating until a new rating appears. That is the correct treatment of the data. How do I have the replacement process begin and end within each gvkey? Feel free to edit the code below if you have any suggestions. Thank you, Doug data a; input gvkey ym rating$; cards; 001078 24216 . 001078 24217 . 001078 24218 . 001078 24219 BBB 001078 24220 BBB 001078 24221 BBB+ 001078 24222 . 001078 24223 . 001078 24224 . 001078 24225 . 001078 24226 B 001078 24227 . 002504 24216 . 002504 24217 . 002504 24218 A 002504 24219 A- 002504 24220 . 002504 24221 . 002504 24222 CC 002504 24223 CC 002504 24224 CC 002504 24225 CC+ 002504 24226 . 002504 24227 . proc sort; by gvkey ym; run; data b; set a; retain rating1; if rating > "." then rating1=rating; by gvkey; /* this step did not seem to help */ run; proc print; var gvkey ym rating rating1; run; RESULTS: The SAS System Obs gvkey ym rating rating1 1 1078 24216 ← The first three observations in the 2 1078 24217 ← rating1 column worked as intended as 3 1078 24218 ← they should be missing ratings. 4 1078 24219 BBB BBB 5 1078 24220 BBB BBB 6 1078 24221 BBB+ BBB+ 7 1078 24222 BBB+ 8 1078 24223 BBB+ 9 1078 24224 BBB+ 10 1078 24225 BBB+ 11 1078 24226 B B 12 1078 24227 B 13 2504 24216 B ← Observations 13 and 14 in the rating1 14 2504 24217 B ← column should still be missing ratings. 15 2504 24218 A A 16 2504 24219 A- A- 17 2504 24220 A- 18 2504 24221 A- 19 2504 24222 CC CC 20 2504 24223 CC CC 21 2504 24224 CC CC 22 2504 24225 CC+ CC+ 23 2504 24226 CC+ 24 2504 24227 CC+
... View more