Hi SAS users,
I have read almost all the forum threads about converting a character string to a SAS datetime, but none of them seem to work for me. I have multiple date variables that need to be converted. For example, one variable is called 'discharge_dod'; it is a character variable, and its format and informat are both $500. When printed, it reads as
2018-05-04 13:30
To check the chronological order of dates, I need to convert it to a SAS format that has both the date and the time. How would I do this for this variable and all my other variables that are in the same character format?
Thank you so much.
Please do not post pictures of errors. Copy the code of the data step or procedure, plus any associated messages, notes or errors, from the log, then paste the whole thing into a text box opened on the forum with the </> icon.
The text box is important because the message windows on this forum will reformat text and the diagnostic characters that SAS often supplies will not appear in correct position relative to the code. The text is also preferred because we can copy and suggest changes or highlight typos and such that are not easy or possible at all with images.
A small log file might look like:
99   data junk;
100     x="2018-05-04 13:30";
101     y=input(x,anydtdtm.);
102     format y datetime20.;
103  run;

NOTE: The data set WORK.JUNK has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds

Please notice that this starts with the DATA statement and ends with the messages associated with the data step.
The Array statement syntax is: the word "array", the name of the array (cannot be the name of an existing variable) then the number of elements and then the variable names. The order is critical.
Your new array with the values you show would be:
array newdt {39} <list the names>;
NO new {39} at the end.
The Format statement I used assigns a format to variables named Newdt1 Newdt2 Newdt3 etc. (that is what the newdt: list matches). Since you created new variable names of Var1new etc., there are no variables of that name.
SAS Large economy sized hint: SAS will use lists of names like Var1-Var39 to reference sequentially numbered variables. So your first array statement can be much shorter. With that, you will see that it is much easier to use NewVar1-NewVar39 for the second array:
array newdt {39} NewVar1 - NewVar39;
You likely got an error because you gave the "new" variable names just as a list, without telling SAS, in the correct position, how many elements there were. The Old array works because it lists existing variables: SAS treats the first name as the array name and the following ones as the elements of the array. When new variables are created, you must specify how many.
Alternatively:
array NewVar{39};
Will 1) name the array and 2) create 39 sequentially numbered variables NewVar1 to NewVar39. If you use this remember to replace the Newdt[i] with NewVar[i].
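Putting the pieces above together, a complete step might look like the sketch below. It assumes the 39 existing character variables really are named Var1-Var39, that the names NewVar1-NewVar39 are not already in use, and that the data sets are called have and want; adjust all of those to your data.

```sas
data want;
   set have;
   /* existing character variables, referenced with a numbered range list */
   array old {39} Var1-Var39;
   /* names the array and creates numeric variables NewVar1-NewVar39 */
   array NewVar {39};
   do i = 1 to dim(old);
      NewVar[i] = input(old[i], anydtdtm.);
   end;
   /* the format list must match the names the array created */
   format NewVar1-NewVar39 datetime20.;
   drop i;   /* the loop counter is not needed in the output */
run;
```

Using dim(old) as the loop limit means the loop keeps working if the size of the array changes later.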
My code had a typo and should read
newdt[i]=input(old[i],anydtdtm.);
to reference the new array elements. But that would be unlikely to result in an error, just a new variable with the last element.
The most common array errors for new users are 1) using an index that does not match the number of array elements, which typically results in "ERROR: Array subscript out of range" or similar wording, or 2) mixing variable types in the array. SAS allows only elements of one type in an array, either all character or all numeric variables. If that is the problem, we will likely need a lot more detail.
Another possibility is that you are not getting an actual error but an "invalid data" message (for the INPUT function the log note reads "Invalid argument to function INPUT"), which means the value the INPUT function encountered does not look like what the informat expects. Possibly there are missing values, which will result in missing new variables, or data with seriously different content. If the value is actually only a time without a date, or is missing any of month, day of month or year, then it is likely invalid and will produce those invalid data messages. But we need to see the messages.
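One way to find the values behind such messages is to list the rows where the text is not blank but the conversion came back missing. This is only a sketch: it uses the discharge_dod variable from the question, and the data set names have and check are placeholders. The ?? modifier in the INPUT function suppresses the invalid-argument notes so the log stays readable.

```sas
data check;
   set have;
   /* ?? suppresses the invalid-argument notes and leaves _ERROR_ at 0 */
   dt = input(discharge_dod, ?? anydtdtm.);
   format dt datetime20.;
   /* keep only rows where there was text but it did not convert */
   if missing(dt) and not missing(discharge_dod);
run;

proc print data=check;
   var discharge_dod dt;
run;
```

If check comes back empty, every non-blank value converted and any missing datetimes come from blank input.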
To check the chronological order of dates, I need to convert it to a SAS format that has both the date and the time.
Assuming this is always YYYY-MM-DD HH:MM, with zero-padded parts and a 24-hour clock, then an alphabetical sort will put the values into the right order. But to convert it to a numeric SAS datetime, you can use the INPUT function, like this in a DATA step:
data have;
chardate='2018-05-04 13:30';
run;
data want;
set have;
numdate=input(chardate,anydtdtm.);
format numdate datetime16.;
run;
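To illustrate the point about alphabetical order, here is a small made-up example (the data set names demo and sorted and the three values are invented for illustration). Sorting the character variable directly puts the rows in chronological order because every part of the value is zero-padded and runs from largest unit to smallest.

```sas
data demo;
   /* $16. reads the full 16 columns, including the embedded blank */
   input chardate $16.;
   datalines;
2018-05-04 13:30
2017-12-31 23:59
2018-05-04 09:05
;

proc sort data=demo out=sorted;
   by chardate;   /* plain character sort */
run;

proc print data=sorted;
run;
```

This stops working as soon as any value deviates from the pattern (for example a single-digit month or an AM/PM time), which is one reason to convert to a numeric datetime anyway.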
If that is the only type of value then consider this code:
data junk;
   x="2018-05-04 13:30";
   y=input(x,anydtdtm.);
   format y datetime20.;
run;
You may need to use the LEFT function prior to the Input function if there are leading spaces. This would look like:
data junk;
   x="2018-05-04 13:30";
   y=input(left(x),anydtdtm.);
   format y datetime20.;
run;
Multiple variables would probably use a couple of arrays to loop over all of them.
data want;
   set have;
   array old <list of current character variables>;
   /* n should be the number of variables on the OLD list */
   /* creates variables named newdt1 newdt2 ..., or provide desired names
      for the new variables in the order they appear in old */
   array newdt {n};
   do i = 1 to dim(old);
      newdt[i] = input(old[i], anydtdtm.);
   end;
   /* make sure your format assignment matches the names of the variables */
   format newdt: datetime20.;
run;
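Once the values are numeric datetimes, checking chronological order is a plain comparison, since a SAS datetime is a count of seconds. The sketch below assumes, purely for illustration, that newdt1 is supposed to come before newdt2; out_of_order, hours_between and order_check are made-up names.

```sas
data order_check;
   set want;
   /* compare only when both values converted successfully */
   if not missing(newdt1) and not missing(newdt2) then do;
      out_of_order  = (newdt1 > newdt2);          /* 1 if the order is wrong */
      hours_between = (newdt2 - newdt1) / 3600;   /* datetimes are in seconds */
   end;
run;
```

Rows where either datetime is missing are left with missing flags, so they can be reviewed separately rather than being counted as in order.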