Good point, @Reeza. I assumed (based on elwayfan446's posts) that only 'WTD' could be missing.
@elwayfan446: Would the requested code need to fill in missing 'MTD', 'YTD' or 'LTD' records as well? What if all four were missing? How would we know the ID (if there is one)?
I think the number and names of the additional variables are less important. We could name them field2, field3, etc. for the time being without loss of generality.
The only important question is: Do we have an "ID" (and blocks of 3 to 4, or maybe 1 to 4, observations per ID), or are we talking about a HAVE dataset with at most 4 observations?
@FreelanceReinh While unlikely, I suppose MTD, YTD, and LTD could be missing as well. However, if WTD is not missing, then it spills over into the others. That being said, having code to check for that, just in case, might not be a bad idea.
I don't have an ID specific to the observations other than the TIMEFRAME variable. The HAVE dataset will only have a maximum of 4 observations (WTD, MTD, YTD, and LTD).
Thanks @elwayfan446 for the clarification. Then I would agree with those who have suggested using a "master list".
/* Create a dataset containing all time frame codes */
data timeframes;
  length timeframe $3; /* Please adapt $3 to the length of TIMEFRAME in HAVE! */
  input timeframe;
  cards;
LTD
MTD
WTD
YTD
;
/* Prepare the HAVE dataset, if it is not sorted by TIMEFRAME */
proc sort data=have;
  by timeframe;
run;
/* Insert dummy observations as desired */
data want;
  merge have
        timeframes;
  by timeframe;
run;
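For the "check, just in case" idea raised earlier, here is a minimal sketch of the same master-list merge with a flag that marks which rows were filled in. The sample values, the variable names field2 and field3, and the flag name ADDED are assumptions for illustration; IN= is the standard data set option for this purpose.

```sas
/* Hypothetical sample data: WTD and LTD are absent */
data have;
   length timeframe $3;
   input timeframe $ field2 field3;
   datalines;
MTD 11 101
YTD 21 201
;

/* Master list of all expected time frame codes, entered in sorted order */
data timeframes;
   length timeframe $3;
   input timeframe $;
   datalines;
LTD
MTD
WTD
YTD
;

proc sort data=have;
   by timeframe;
run;

/* IN_HAVE is 1 when HAVE contributes to the current BY group, 0 otherwise */
data want;
   merge have(in=in_have) timeframes;
   by timeframe;
   added = not in_have;   /* 1 = dummy row that came only from the master list */
run;

proc print data=want noobs;
run;
```

With this sample, WANT should contain four rows, with ADDED=1 and missing field2/field3 for LTD and WTD, and ADDED=0 for MTD and YTD.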
If you do have a key variable such as ID and blocks of 3 or 4 observations (the number possibly varying from one ID to the next), you can still follow the simple one-data-step approach:
data have;
  input id timeframe $ field2 field3;
  cards;
1 MTD 11 101
1 YTD 21 201
1 LTD 31 301
2 MTD 12 102
2 YTD 22 202
2 LTD 32 302
2 WTD 42 402
;
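data want;
  set have;
  by id;
  /* WTDEX counts the WTD observations within the current ID */
  if first.id then wtdex=0;
  wtdex+(timeframe='WTD');
  output;
  /* After the last observation of an ID without WTD, add a dummy observation */
  if last.id & not wtdex then do;
    /* Clear the analysis variables only: CALL MISSING(OF _ALL_) would also set ID to missing */
    call missing(of field:);
    timeframe='WTD';
    output;
  end;
  drop wtdex;
run;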
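If MTD, YTD, or LTD could also be absent for some IDs, one possible extension is to combine the master list with the ID variable. The sketch below builds every ID/TIMEFRAME combination and merges it with the data. The dataset names SKELETON, HAVE_SORTED, and WANT_ALL and the sample values are hypothetical.

```sas
/* Hypothetical sample data: ID 1 lacks both WTD and YTD */
data have;
   input id timeframe $ field2 field3;
   datalines;
1 MTD 11 101
1 LTD 31 301
2 MTD 12 102
2 YTD 22 202
2 LTD 32 302
2 WTD 42 402
;

/* Master list of all expected time frame codes */
data timeframes;
   length timeframe $3;
   input timeframe $;
   datalines;
LTD
MTD
WTD
YTD
;

/* Every combination of existing ID and expected TIMEFRAME */
proc sql;
   create table skeleton as
   select i.id, t.timeframe
   from (select distinct id from have) as i,
        timeframes as t
   order by i.id, t.timeframe;
quit;

proc sort data=have out=have_sorted;
   by id timeframe;
run;

/* HAVE_SORTED is listed first so that its length of TIMEFRAME is used */
data want_all;
   merge have_sorted skeleton;
   by id timeframe;
run;
```

Each ID should end up with four observations; combinations that exist only in SKELETON get missing values for field2 and field3. The log may note that the query involves a Cartesian product, which is intended here.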
I am not familiar with the cards; piece of this, so I am having a little trouble understanding it. Maybe if I list the remaining variables in the datasets (HAVE and WANT), you can help me translate that a little bit. Here are all the variables:
TIMEFRAME
_TYPE_
_PAGE_
_TABLE_
INVESTOR_LOAN_ID_N
NOTE_BALANCE_Sum
NOTE_BALANCE_Mean
WGT_AVG_20YR_Mean
LTV_Mean
FICO_Mean
SRP_RATE_Mean
SRP_AMT_Mean
SRP_AMT_Sum
PCT_TOTAL_UPB_PURCHASE_Mean
PCT_TOTAL_UPB_REFI_Mean
PCT_TOTAL_CNT_ESCROW_Mean
PCT_TOTAL_CNT_NONESCROW_Mean
CARDS is used to generate sample data to test code. Providing a variable list does not help; providing sample data that mimics your issue does.
You can read here how to produce a Minimal, Complete, and Verifiable example.
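As a rough sketch of what such sample data could look like, the step below uses three of the variable names listed above. All values are invented purely for illustration. DATALINES is used here; CARDS is an alias for it.

```sas
/* Hypothetical values: only the variable names come from the list above */
data have;
   length TIMEFRAME $3;
   input TIMEFRAME $ NOTE_BALANCE_Sum FICO_Mean LTV_Mean;
   datalines;
MTD 1250000 742 78.5
YTD 9800000 738 80.1
LTD 45600000 735 79.3
;

proc print data=have noobs;
run;
```

Three or four variables are usually enough: the sample only needs to reproduce the missing WTD row, not the full layout of the real dataset.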
@Reeza Thank you for this reference. This is great to know going forward.
I wanted to thank everyone for the help this morning. I have learned a lot in this thread.