This seems like an easy question, but the answer has eluded me for some time.
I have a column of DOB values in a SAS output dataset expressed a text ike
DOB
38407
29738
42385
37471
.... and so on. I want to express these in data format like 08/14/2020 for example.
I approached it with this code
Data Arias.Merge_data_E_18 (drop = DOB_1); Set Arias.Merge_data_E_18; DOB_1 = input (DOB, mmddyy8.); run;
But the log came back
304 Data Arias.Merge_data_E_18 (drop = DOB_1); 305 Set Arias.Merge_data_E_18; 306 DOB_1 = input (DOB, mmddyy8.); 307 run; NOTE: Invalid argument to function INPUT at line 306 column 12. County=Linn Jurisdiction=Linn CHD Contact_number=000118e5-a3d5-ea11-a813-001dd8309200 _Do_Not_Modify__Row_Checksum=aF2JuLzWGLlUSpF32cWQi4+R2awa7Yzn3d6BiH+2JnyF6rREvdofoaNKFfl+EOEce30rHjSO2 <<repeated>> City=Warm Springs Preferred_Method_of_Contact=Phone Email= Mobile_Phone=541-460-2541 End_of_Quarantine_Period=08/28/2020 Language= DOB_1=. _ERROR_=1 _N_=35 NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values. Each place is given by: (Number of times) at (Line):(Column). 23839 at 306:12 NOTE: There were 41280 observations read from the data set ARIAS.MERGE_DATA_E_18. NOTE: The data set ARIAS.MERGE_DATA_E_18 has 41280 observations and 29 variables. NOTE: Compressing data set ARIAS.MERGE_DATA_E_18 decreased size by 47.20 percent. Compressed is 245 pages; un-compressed would require 464 pages. NOTE: DATA statement used (Total process time): real time 0.17 seconds cpu time 0.18 seconds
I would appreciate your help again (not the first time today).
wlierman
Here's a quick way to do it:
Data Arias.Merge_data_E_18 ;
Set Arias.Merge_data_E_18;
FORMAT DOB mmddyys10.;
run;
After you run that code, you should see dates showing in your output data set in the format mm/dd/ccyy, for example: 08/14/2020.
Results using the date values you supplied:
SAS shows you the dates in the format you want, but the data has not changed. This is just a format. What you see is NOT the actual value of the data but rather just a formatted display of the underlying value.
If you actually need a new variable, you could code the following:
Data Arias.Merge_data_E_18 ;
Set Arias.Merge_data_E_18;
New_DOB = PUT(DOB, mmddyys10.);
run;
Results:
Notice the difference in the icons for each column.
General rules for PUT and INPUT:
When you want to change or show a numeric variable in a character format, use PUT.
When you want to change or show a character variable in a numeric format, use INPUT.
Your original example was using INPUT on a numeric variable. The general rule is to use PUT on numeric variables and INPUT on character variables.
Jim
Here's a quick way to do it:
Data Arias.Merge_data_E_18 ;
Set Arias.Merge_data_E_18;
FORMAT DOB mmddyys10.;
run;
After you run that code, you should see dates showing in your output data set in the format mm/dd/ccyy, for example: 08/14/2020.
Results using the date values you supplied:
SAS shows you the dates in the format you want, but the data has not changed. This is just a format. What you see is NOT the actual value of the data but rather just a formatted display of the underlying value.
If you actually need a new variable, you could code the following:
Data Arias.Merge_data_E_18 ;
Set Arias.Merge_data_E_18;
New_DOB = PUT(DOB, mmddyys10.);
run;
Results:
Notice the difference in the icons for each column.
General rules for PUT and INPUT:
When you want to change or show a numeric variable in a character format, use PUT.
When you want to change or show a character variable in a numeric format, use INPUT.
Your original example was using INPUT on a numeric variable. The general rule is to use PUT on numeric variables and INPUT on character variables.
Jim
Hi Jim,
Thank you for the spot-on solution to this problem.
I appreciate your quick response and you taking the time to answer.
Take care.
wlierman
Hi Jim,
I am confounded now! I coded the data step like this
Data Arias.Merge_data_E_18; Set Arias.Merge_data_E_18; New_DOB = PUT(DOB, MMDDYYS10.); run;
I thought it was working but then I got the log back
324 Data Arias.Merge_data_E_18; 325 Set Arias.Merge_data_E_18; 326 New_DOB = PUT(DOB, mmddyys10.); ---------- 48 ERROR 48-59: The format $MMDDYYS was not found or could not be loaded. 327 run; NOTE: The SAS System stopped processing this step because of errors.
Okay so I looked in the Step-by_Step Base SAS manual
read how to assign the MMDDYY10. format as
format DepartureDate mmddyy10.; (using the var name form the book)
I tried it as
Data Arias.Merge_data_E_18; Set Arias.Merge_data_E_18; Format DOB mmddyys10.; *New_DOB = PUT(DOB, MMDDYYS10.); run;
The log returned
Data Arias.Merge_data_E_18; 356 Set Arias.Merge_data_E_18; 357 Format DOB mmddyys10.; ---------- 48 ERROR 48-59: The format $MMDDYYS was not found or could not be loaded. 358 *New_DOB = PUT(DOB, MMDDYYS10.); 359 run;
So I am stumped again.
Appreciate your assistance.
wlierman
SAS is trying to apply a character format. I think that DOB may not be defined as numeric even if it contains numbers.
Can you run the below and post the results?
PROC CONTENTS DATA=Arias.Merge_data_E_18;
RUN;
If DOB is a number but the variable is defined as character, then you'd have to change your code as follows:
Data Arias.Merge_data_E_18;
Set Arias.Merge_data_E_18;
New_DOB = PUT(INPUT(DOB, 5.), MMDDYYS10.);
run;
I think 5. should be an appropriate Informat, but try it and see if it works.
Jim
It's a little hard to read that, but I think that my surmise was correct that for whatever reason, DOB is being stored as character.
Please try this code:
Data Arias.Merge_data_E_18;
Set Arias.Merge_data_E_18;
New_DOB = PUT(INPUT(DOB, 5.), MMDDYYS10.);
run;
The INPUT function will read in the value stored as character and change it to numeric storage. The PUT function will then format the data stored numeric as numeric into your desired format.
Jim
I ran the second code block
Data Arias.Merge_data_E_18; Set Arias.Merge_data_E_18; New_DOB = PUT(INPUT(DOB, 5.), MMDDYYS10.); run;
and received the following log
Data Arias.Merge_data_E_18; 363 Set Arias.Merge_data_E_18; 364 New_DOB = PUT(INPUT(DOB, 5.), MMDDYYS10.); 365 run; NOTE: Invalid argument to function INPUT at line 364 column 17. County=Jackson Jurisdiction=Jackson CHD Contact_number=dbac2a5f-0cd4-ea11-a813-001dd83096fe _Do_Not_Modify__Row_Checksum=MvbQ4ucFSa06yXIUQuV9uanIv0tnSeKcliEG0lVvJeQwDITgs0d6E2l4rdktOUpsBptzwsZXA R6WWLZisKos1w== _Do_Not_Modify__Modified_On=04SEP20:01:06:04 Person_ID=956838 Contact_ID=5028693 Case_ID=774942 _Full_Name=Josh Charleboix First_Name=Josh Middle_Name= Last_Name=Charleboix Contact_Attempt_Successful=Yes Created_On=01AUG20:08:33:54 Date_of_Last_Exposure=07/23/2020 _1st_Contact_Attempt=01AUG20:15:30:00 _2nd_Contact_Attempt=02AUG20:13:30:00 _3rd_Contact_Attempt=8 Owner=Natasha Ching DOB=04/03/1861 00:00:00 Monitoring_Status=Lost to Follow-up Status=Inactive Address_1=411 berrydale Ave #9 MEDFORD, or 97501 City=MEDFORD Preferred_Method_of_Contact=Text Email=joshcharleboix@gmail.com Mobile_Phone=541-973-5460 End_of_Quarantine_Period=08/06/2020 Language=English New_DOB=. _ERROR_=1 _N_=35443 NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values. Each place is given by: (Number of times) at (Line):(Column). 1 at 364:17 NOTE: There were 41280 observations read from the data set ARIAS.MERGE_DATA_E_18. NOTE: The data set ARIAS.MERGE_DATA_E_18 has 41280 observations and 30 variables. NOTE: Compressing data set ARIAS.MERGE_DATA_E_18 decreased size by 46.38 percent. Compressed is 252 pages; un-compressed would require 470 pages. NOTE: DATA statement used (Total process time): real time 0.17 seconds cpu time 0.12 seconds
Which was only 1 time for that error. Why did that occur? I can probably live with one line.
Well, looking at the data dump that SAS gave us, I see:
DOB=04/03/1861 00:00:00
That's not a numeric date value; that's a character date-time value. One cannot read a character date-time value with a numeric Informat. like the "5." I suggested.
So, question: Are your DOB values numeric dates or are they character date-time values? Or do most DOB values consist of numeric dates and this one record has a character date-time value rather than a numeric date?
Jim
I'd say that the DOB in 1861 is bad data. Someone born in 1861 would be about 160 years old today, which I'm thinking is probably not the case. It's probably a mis-key of 1961, but that's just a guess. If the data is bad, there isn't all that much you can do about it. SAS will capture it as a missing value (".") which is just as it should be.
Are the majority (except for bad data like our 1861 example) of the records now showing the date the way you want them to? If so, I'd say we've found our solution.
Jim
Hi Jim,
Unfortunately the DOB is still in the char format. Rather than copy a a **bleep** of the output dataset these are the first 10 values for DOB:
DOB
38407
29738
42385
37471
34608
26080
39334
37308
38671
27413
@wlierman wrote:
I could just drop DOB, but I am really curious now as to how to solve this.
Do you mean the 1861 issue? I think the 1861 is bad data in which case even if we read it in, it will still be wrong. However, not all date issues will be bad data, so let's have a look:
We have this so far:
New_DOB = PUT(INPUT(DOB, 5.), MMDDYYS10.);
Which I believe is working fine 99% of the time -- except of course New_DOB sometimes comes up "missing." After the above code, we can do the following:
IF STRIP(New_DOB) = '.' THEN
New_DOB = PUT(INPUT(SUBSTR(DOB, 1, (INDEX(DOB, ' '))), MMDDYY10.), MMDDYYS10.);
We'd then have the following results:
The above code may seem a little complicated, but it does protect us from truly invalid dates, for example 04/32/1861. A truly invalid date will result in a '.' which we may want to translate to a space since a space represents missing for character fields.
Jim
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.