Hi,
I have a character birthdate variable and I am trying to convert it to numeric but getting errors in my input function. Thanks
Here is my code:
data want;
set have;
birthday_num= input(strip(birthdate), anydtdte.);
format birthday_num mmddyy10.;
run;
If you don't include the full context the error messages are not as useful. We do not know what statement is referencing.
I suspect the issue is that you did not tell INPUT to read the full string. The default width for ANYDTDTE is 9. So any string with 2 digit month and 2 digit day will be too long.
Example:
9 data test;
10 birthdate='07/14/1942';
11 birthday_num= input(strip(birthdate), anydtdte.);
12 format birthday_num mmddyy10.;
13 run;
NOTE: Invalid argument to function INPUT at line 11 column 17.
birthdate=07/14/1942 birthday_num=. _ERROR_=1 _N_=1
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to
missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 11:17
NOTE: The data set WORK.TEST has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
Note that the INPUT() function does NOT care if the string being read is SHORTER than the WIDTH on the INFORMAT specification being used. So unless there is some reason you want to ignore part of the string just use the MAXIMUM width that the informat supports.
The maximum width supported by MMDDYY is 10. The maximum width supported by ANYDTDTE is 60.
26 data test; 27 birthdate='07/14/1942'; 28 birthday_num1 = input(birthdate, anydtdte60.); 29 birthday_num2 = input(birthdate, mmddyy10.); 30 format birthday_num: mmddyy10.; 31 run; NOTE: The data set WORK.TEST has 1 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
Result
So you mean the strings look something like what the MMDDYY format would produce from an actual date value?
Then why didn't you use the MMDDYY informat in the INPUT() function call? Using the ANYDT... series of informats forces SAS to GUESS what style each string is using to represent a date. It might well assume they are in DMY order and convert April first in the fourth of January.
What errors are you getting? If it is data errors (the string could not be converted) then you should see the invalid values printed in the SAS log.
Make it like this:
data have;
birthdate='05-17-2024';
run;
proc print;
run;
options DATESTYLE=MDY;
data want;
set have;
birthday_num= input(strip(birthdate), anydtdte10.); /* <- ADD 10 HERE */
format birthday_num mmddyy10.;
run;
proc print;
run;
Default length for the informt is 9, since you have "mmddyy10 " string it's length is 10...
#Maxim1: Read the documentation.
Bart
@Rou wrote:
I am getting an error in the input statement.
We need to know what the error is, according to the SAS log. Just saying you have an error doesn't tell us anything useful.
We also need to know what the date in question looks like as a character string, which was also requested earlier by @Patrick . Without this information, its hard to know how to proceed.
birthdate 7/14/1942 but stored as a character variable. Error code: NOTE: Invalid argument to function INPUT at line 448 column 15.
Show us the entire log for this DATA step.
Does PROC CONTENTS say this variable that contains the dates is character/text or numeric?
Proc contents shows that it is a char var.
Here is the log:
NOTE: Invalid argument to function INPUT at line 16 column 15.
empi=0 hup_mrn=472646611 pmc_mrn=. pah_mrn=. cch_mrn=. lastname=WINFIELD firstname=SUSIE
gender_code= birthdate= imageyr=. event_category= i4_mri_screen= proc_code= proc_name=
osfilm= mammo_laterality= screenep=. screendate=9/20/2022 pat_last_name= pat_first_name=
hispanic= screenyear=2022 last_mammo_date= lastmammo_year=. mammo_baseline_yn= proc_id=.
screentype= birads_overall= density_overall=. density_r=. density_l=. finalizing_prov_id=
finalizingprovname= callback= i1_screening= i2_diagnostic= i3_ultrasound= i4_mri= i5_biopsy=
i6_other= numimagingrecs=. architecturaldistortion=. asymmetry=. calcification= mass=. lymphnode=.
firstpathdate= maxoflesioncancer= outcome= interesting_study_c=. interestingstudy=
risk_menarcheage= risk_fstlivebirth= risk_fstdegrel= risk_numbiopsy= risk_atyphypls=
gail5year=. gaillifetime=. weight_lbs=. heightininches=. bmi=. cancerreg_dxdate= daysfromscreen=.
invasive= minofhistobehavioricdo3desc= maxofhistobehavioricdo3desc= maxofcstumorsize=.
maxofcssitespecificfactor1=. maxofcssitespecificfactor2=. maxofcssitespecificfactor15=.
maxofcssitespecificfactor22=. maxofcssitespecificfactor23=. nextscreendate= episodecount=. mqsa=.
DateSigned= SigningPhysicians= StudyLevelOutcome= ComponentLevelOutcome=
OverallBreastComposition= Patient= Procedure= accessionnumber=46446630 datasource=mqsa age=.
pmbb_consented=. pmbb_id= brca= brca_num_visits= brca_start_date= ovarian_ca=
ovarian_ca_num_visits= ovarian_ca_start_date= endometriosis= endometriosis_num_visits=
endometriosis_start_date= pmbb_sample=. reg_casestatflag= reg_dxdate= reg_facilityname=
reg_laterality=. reg_histobehavior=. reg_gradedifferentiation= reg_cstumorsize=.
reg_regionalnodespositive=. reg_regionalnodesexam=. reg_cssitespecificfactor1=. reg_ersummary=.
reg_cssitespecificfactor2=. reg_prsummary=. reg_cssitespecificfactor15=. reg_her2overallsumm=.
reg_ajccedition=. reg_summarystage= reg_stage_recode= reg_erpos=. reg_prpos=. reg_her2pos=.
reg_ajcc8pathprogstage= reg_ajcc8advanced=. reg_sourceregistry= reg_ajcc7= reg_filledstage=.
reg_subtype4=. casestatflag= bc_ever=0 bc_prior=0 bc_after=0 mammolocation=Fern Hill hp=.
breast_density=. all_brca1=. all_brca2=. date_of_death= nbiop_cat=. agemencat=. agefstlivecat=.
n_rels=. gail_score=. gail_score_lifetime=. birads_density=3 AvgBreastAreasqcm_2d=.
AvgBreastDensity_2d=. AvgDenseAreasqcm_2d=. hispanic_latino=0 race_eth=9 tomo=1 old_vbd_request=.
PatientName= Race_1= VitalStatus= DeathDate= ZipCode= PreferredLanguage= AnyTamoxifenYn=.
TamoxifenFirstUphsOrderDate= AnyRaloxifeneYn=. RaloxifeneFirstUphsOrderDate=
BrcaDxPresentInChartYn=. BrcaFirstDocumentedDate= EpicOrderId=. ProcName= MammogramProcType=
OrderDate= ExamDate= OrderPriority= BreastOverallBreastComp= BreastImagingOutcome=
OrderingProvider= PerformingDepartmentId=. PerformingDepartment= NearestBmiEncounterDate=
NearestBmi=. SecondNearestBmiEncounterDate= SecondNearestBmi=. InsurancePayorName1=
InsuranceFinancialClass1= InsurancePayorName2= InsuranceFinancialClass2= EncounterCsn=.
SubmissionDate= AgeFirstPeriod=. agefstperiod=. NumberLiveBirths=. AgeAtMenopause=. agemeno=.
BreastProblem= HistoryRadiationTherapy= LastMenstrualPeriod= OtherMedicalHistory=
AshkenaziJewish= MessageId=. Prompt1= PatientResponse1= PatienttResponseDate1=
PatientResponseComment1= Prompt2= PatientResponse2= PatienttResponseDate2=
PatientResponseComment2= Prompt3= PatientResponse3= PatienttResponseDate3=
PatientResponseComment3= Prompt4= PatientResponse4= PatienttResponseDate4=
PatientResponseComment4= Prompt5= PatientResponse5= PatienttResponseDate5=
PatientResponseComment5= Prompt6= PatientResponse6= PatienttResponseDate6=
PatientResponseComment6= Prompt7= PatientResponse7= PatienttResponseDate7=
PatientResponseComment7= Prompt8= PatientResponse8= PatienttResponseDate8=
PatientResponseComment8= Prompt9= PatientResponse9= PatienttResponseDate9=
PatientResponseComment9= Prompt10= PatientResponse10= PatienttResponseDate10=
PatientResponseComment10= vbd_available=0 uniqueid_new=229876 pat_mrn_id=. sex= race_pds=
facility= risk_age=. risk_race= birads_assessment=0 birads_assessment_detail=0
mammo_recommendation=Normal interval follow-up,
Additional Projections EncounterDate=.
EncounterId=. BcraScore5Yr=. BcraAge=. BcraAgeAtMenarcheCode=. BcraAgeAtMenarcheDesc=
BcraFirstLiveBirthCode=. BcraFirstLiveBirthDesc= BcraFamHist=. BcraFamHistDesc=
BcraNumberBreastBiopsiesCode=. BcraNumberBreastBiopsiesDesc= BcraAtypicalHyperplasiaCode=.
BcraAtypicalHyperplasiaDesc= BcraRaceCode=. BcraRaceDesc= BcraScoreLifetime=. datelastperiod=.
agelastperiod=. StudyLevelAssessment= ComponentLevelAssessment= Age_1= ApptDate= Dept=
EthnicBackground= EthnicGroup= site= screenyr=. ageatscreen=. birads_r= birads_l= maxbirads=
lastbirads= datesigned_1=09/20/2022 signingphysicians_1=BARAK, ELIZABETH RICHARDSON
orderingprovider_1=KEAH, JENNIFER HILLIARD [17057152] studyleveloutcome_1=
componentleveloutcome_1= studylevelassessment_1=0: Additional images for evaluation
componentlevelassessment_1=Need Additional Imaging Evaluation [0] recommendationsresolved=No
overallbreastcomposition_1=Heterogeneously dense [3] patient_1=Winfield, Susie apptdate_1=9/20/22
procedure_1=MAMMO SCREENING BILATERAL TOMOSYNTHESIS dept_1=FERNMAMMO [6156]
ethnicbackground_1=American/United States [12] ethnicgroup_1=Not Hispanic or Latino [30]
apptdate2=22908 mrn_1=472646611 dummyaccession=. Recommendation= RecommendationsResolved_1= seq=.
mammo_outcome= max_recommendation=3 max_biopsy_rec=0 consented=. pmbb_dna_sample_2023=.
pmbb_plasma_edta_2023=. pmbb_genotyped_2023=. pmbb_status= dna_in_inventory=. genotyped=.
min_dx_img_time=2 max_birads_date=22908 max_birads=0 max_birads_source=Diagnostic Imaging Linkage
last_birads_date=0 last_birads=2 last_birads_source=Diagnostic Imaging Linkage
mammo_recommendation_num=3 new_vbd_request=0 sent_for_vbd=. gghe=0 outcome_dx_90=
outcome_dx_90_2yr= max_outcome_1yr=. max_outcome_2yr=. acs_fn=0 npv_encounterdate=
npv_parenthospital= npv_codestandard= npv_dxcode= npv_dxdescription= bc_date=. bc_npv=.
timetobc=. min_timetobc=. vbd_grant=0 any_vbd_request=0 birthday_num=. _ERROR_=1 _N_=788895
NOTE: Mathematical operations could not be performed at the following places. The results of the
operations have been set to missing values.
Each place is given by: (Number of times) at (Line):(Column).
4 at 16:15
NOTE: There were 789597 observations read from the data set WORK.SCMAMO_DROP.
NOTE: The data set WORK.SCMAMO_CONVERTED has 789597 observations and 335 variables.
NOTE: DATA statement used (Total process time):
real time 6.13 seconds
cpu time 2.65 seconds
That is NOT an ERROR. It is a NOTE. You still not NOT include the CONTEXT needed to make complete sense of the NOTE. You did not show the part of the LOG that shows the CODE being run so we don't know what statement was on line 16 and column 15.
But assuming the issue is with converting BIRTHDATE to a number the value of BIRTHDATE displayed in the log is helpful. It is showing the it is empty, so that is definitely does not represent a valid date.
You can prevent that note in two ways.
You could first test if BIRTHDATE is empty and only execute the INPUT() function when it is not.
if not missing(birthdate) then birthdate_num=input(birthdate,mmddyy10.);
Or you could suppress ALL of the notes about invalid values by including the ?? modifier before the informat. Then even if the value of BIRTHDATE is 'dawn of time' you won't get an invalid value NOTE.
birthdate_num=input(birthdate,??mmddyy10.)
If you don't include the full context the error messages are not as useful. We do not know what statement is referencing.
I suspect the issue is that you did not tell INPUT to read the full string. The default width for ANYDTDTE is 9. So any string with 2 digit month and 2 digit day will be too long.
Example:
9 data test;
10 birthdate='07/14/1942';
11 birthday_num= input(strip(birthdate), anydtdte.);
12 format birthday_num mmddyy10.;
13 run;
NOTE: Invalid argument to function INPUT at line 11 column 17.
birthdate=07/14/1942 birthday_num=. _ERROR_=1 _N_=1
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to
missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 11:17
NOTE: The data set WORK.TEST has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
Note that the INPUT() function does NOT care if the string being read is SHORTER than the WIDTH on the INFORMAT specification being used. So unless there is some reason you want to ignore part of the string just use the MAXIMUM width that the informat supports.
The maximum width supported by MMDDYY is 10. The maximum width supported by ANYDTDTE is 60.
26 data test; 27 birthdate='07/14/1942'; 28 birthday_num1 = input(birthdate, anydtdte60.); 29 birthday_num2 = input(birthdate, mmddyy10.); 30 format birthday_num: mmddyy10.; 31 run; NOTE: The data set WORK.TEST has 1 observations and 3 variables. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
Result
Thank you all for the help.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.