Hello,
I am a new user of SAS.
I have a variable named Date of Birth. But in SAS the DOB appears as 23822, 22867, 19530, and so on. It is a character variable with $8. I would like to convert this to mmddyy10.
I have tried
data new;
set have;
dob1 =input(dob, $8.);
format dob $8.;
run;
Thanks for the help
I strongly wonder where you managed to get those values as CHARACTER. They look somewhat like the numeric value of SAS dates.
Try formatting DOB1 as mmddyy10. and see if you are happy with the results.
Or show us the LOG from running that example code. Copy the data step from the log with all the notes or warnings and paste into a code box opened on the forum with the </> icon to preserve formatting.
This is how DOB looks like
DATA CANCER.corr_DOS1;
SET CANCER.BC;
dob1= input (dob, yymmdd6.);
format dob1 MMDDYY10.;
RUN;
I didn't get any output. It is missing
Here is the log
proc print data=CANCER.bc(obs=10);
var dob;
format dob mmddyy10.;
ERROR: You are trying to use the numeric format MMDDYY with the character variable DOB in data
set CANCER.BC.
run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Hello,
This was in excel format and I have imported it into SAS.
The original DOB is
However, after using the code the new DOB1 is
@anjanamohan1611 wrote:
Hello,
This was in excel format and I have imported it into SAS.
The original DOB is
However, after using the code the new DOB1 is
So now you should show the code used to read the Excel file, the actual other code that you ran from the LOG along with notes and warnings, and quite possibly a copy of the Excel file.
So the issue is that your Excel file has text in at least one of the cells in the column with the date values. So when SAS tried to figure out how to convert the column in a variable (rather than a loose collection of cells like a spreadsheet) it had to make a character variable instead of numeric variable. When that happens for any actual date values in the column in Excel you get the raw number of days (as Excel stores them) as a character string of digits.
So first convert to a number and then correct for the difference in starting date.
183 data test; 184 excelstr='23822'; 185 exceldt=input(excelstr,32.); 186 sasdt=exceldt+'30DEC1899'd ; 187 put excelstr= $quote. / (2*exceldt 2*sasdt) (=comma. =yymmdd10. / ); 188 run; excelstr="23822" exceldt=23,822 exceldt=2025-03-22 sasdt=1,906 sasdt=1965-03-21
I suspect it may be time to actually show what your data set looks like.
Please run
Proc contents data=CANCER.BC;
run;
and show us the result.
If you try to input a value like 23822 with a yymmdd6 informat it would result in missing values because 1) you don't have 6 characters and 2) it would try to treat 82 as the month. I bet you have lots of invalid data messages in your LOG. You do read the log don't you? Something like:
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 8:11
which means that the value can't be treated as a date with the value shown.
And if the "year" was accepted as two digits the result would likely be 2023, which I suspect is not valid for your data. Example:
data junk; x = '230823'; date = input(x,yymmdd6.); put date= mmddyy10.; format date mmddyy10.; run;
08 above is valid for a month value so this "works" but the year is 2023.
If the values of the variable are actually character then I think you need to create a numeric version such as you tried before: Dob1 =input(dob,best.); or similar. and then apply a format to the DOB1 variable. If the value of DOB is actually numeric then try just assigning the format.
proc import datafile="C:\CANCER\MEANADH\BC.xlsx" out=cancer.BC
DBMS= xlsx REPLACE;
sheet="sheet_1";
getnames=yes;
run;
This is how the data looks in excel
DOB |
3/21/1965 |
8/9/1962 |
6/20/1953 |
2/4/1962 |
1/8/1965 |
Yes, it is a one time project. I will try fixing the excel
And in SAS the properties of this variable is
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.