I have a date, e.g. 06/01/1989, and I want a new variable with just the year in it (1989 in this case). However, when I use the SUBSTR function, it returns odd values.
I checked the column description for the date variable and it says
The SUBSTR function is meant to work with character strings or character variables, so using it with a numeric variable gives misleading results. Numbers are not stored as a sequence of characters, so there is no position 7 or position 4 of a number to go to; only character strings have fixed positions. So for the variable ALPHA, a character variable of 26 characters:
newvar = substr(alpha,7,1);
would "substring", or extract, the 7th character (the G) from the value of the alpha variable.
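As a minimal sketch of that statement in a complete step: the value assigned to alpha here (the uppercase alphabet) is an assumption, inferred from the 26-character length and the G in 7th position.

```sas
data _null_;
  length alpha $26;
  alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';  /* assumed value of ALPHA */
  newvar = substr(alpha, 7, 1);           /* start at position 7, take 1 character */
  put newvar=;                            /* write the result to the log */
run;
```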
If date is a character string (such as you show), then SUBSTR will work. But if date is a numeric value, as in the program below, the results will not be correct. The OP said his date value was numeric, and SUBSTR on a numeric variable forces a conversion of the internally stored number from numeric to character before SUBSTR can work (since it works ONLY on CHARACTER variables or text strings).
See the log and program below, where the internally stored date for 06/01/1989 is the number 10744 (Jun 1, 1989 is 10744 days from Jan 1, 1960). When 10744 was converted to a text string, it was converted with the BEST12. format, which right-aligns the number in 12 columns and gives the string xxxxxxx10744 (where every x is a space). SUBSTR(numdate,7,4) therefore gets x107, so newvar2 holds the string ' 107' with a leading space. (The PUT statement does not show leading spaces, nor do most SAS procedures.)
1289 data one;
1292 put chardate= newvariable=;
1294 numdate = '01jun1989'd;
1295 newvar2 = substr(numdate,7,4);
1296 put "formatted val for " numdate= mmddyy10. " Internal val for " numdate= newvar2=;
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
formatted val for numdate=06/01/1989 Internal val for numdate=10744 newvar2=107
NOTE: The data set WORK.ONE has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
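To get the year itself, one possible approach (a sketch, not necessarily what the OP ended up using) is the YEAR function for a numeric date, with SUBSTR reserved for a date held as text. The variable names astext, piece, year_num, year_char and year_conv are made up for this illustration; the PUT call with BEST12. only makes the automatic conversion described above explicit.

```sas
data _null_;
  numdate  = '01jun1989'd;   /* numeric SAS date, stored internally as 10744 */
  chardate = '06/01/1989';   /* the same date held as a character string */

  /* Spell out what the automatic conversion does before SUBSTR runs */
  astext = put(numdate, best12.);   /* number right-aligned in 12 columns */
  piece  = substr(astext, 7, 4);    /* one space followed by 107, not the year */

  year_num  = year(numdate);                     /* numeric date: use YEAR */
  year_char = substr(chardate, 7, 4);            /* character date: SUBSTR is fine */
  year_conv = year(input(chardate, mmddyy10.));  /* or read the text as a date first */

  put year_num= year_char= year_conv=;
run;
```

Here year_num and year_conv are numeric and year_char is character; all three should show 1989 in the log.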