Hi I am trying to compare 2 explanatory variables and the age of receiving a test.
one set (Alaska) would have received the test at their birth and is put into a column called "BD"
the other set (Washington) may have received the test at birth or later in life, I have their birthdate (WashBD) and the date they received the test (TestD)
overall I want to make a new variable describing their age that all sets received the test
the dates are also a bit of a mess and are categorized differently
BD = YYYY/MM/DD
WashBD = YYYY
TestD = DD-MM-YYYY
can someone help?
thanks!
@Cooksam13 wrote:
51531
WashBD Num 8 YEAR. BEST32. TestD Num 8 YEAR. YYMMDD10. BD Char 10 $EAR. $10. BD so this is what it is coming up as and I am having trouble converting BD to a date
WASHBD is a numeric variable that is being displayed using the YEAR format and has 32. informat attached to it. (BEST is the name of a FORMAT if you use it as in INFORMAT then it is treated as an alias for the normal numeric informat).
TESTD is a numeric variable that is being displayed using the YEAR format and has the YYMMDD10. informat attached to it.
BD is a character variable that can hold up to 10 bytes. Someone tried to attach the numeric format YEAR to it and that is getting displayed by PROC CONTENTS as $EAR because all character formats should start with a $. It also has the $10. informat attached to it.
So to convert BD into a DATE value use the INPUT() function with an appropriate informat. But you will need to store the results into a numeric variable and BD is already being used as the name of a character variable. So if it has strings like '23-12-2020' then you want to use the DDMMYY informat.
bd_date = input(bd,ddmmyy10.);
format bd_date yymmdd10.;
Not sure what kind of values WASHBD has. You should print it using different formats and see what the values look like. If they are date values for people's birthdays then they should be integer values from about -21,000 to about 23,000.
461 data _null_; 462 start=intnx('year',today(),-120,'b'); 463 end=today(); 464 put start= :comma10. start yymmdd10.; 465 put end= :comma10. +2 end yymmdd10.; 466 run; start=-21,184 1902-01-01 end=22,744 2022-04-09
If they are YEAR values then they should be integers between 1900 and 2021. In which case WASHBD should NOT have the YEAR format attached.
Similarly for TESTD, but since it has an informat that is appropriate for a date value attached to it then it is more likely that it has actual date values in it.
this is the code I am working with
the first data step i get what i want which is just the year showing up but when I do the second data step it shows warnings and reverts the dates in the HBT to SAS inputs for dates and shows this error "Invalid numeric data, BD='2010/02/11' , at line 73 column 41" in multiple areas
data want;
set have;
Format WashBD TestD BD Year.;
run;
run;
data want;
set want;
if not missing(WashBD) then DOB = WashBD;
if not missing(BD) then DOB = BD;
if not missing(BD) then HBT = BD;
if not missing(TestD) then HBT = TestD;
Format WashBD TestD BD Year.;
run;
error "Invalid numeric data, BD='2010/02/11' indicates that your variable BD is of type character and not numeric. If true then already in your first data step format bd year.; shouldn't work and you should get an error similar to below.
Ideally you provide representative sample data created via a fully tested SAS data step that you post here.
What you already could do: Run below code and share with use the result. This will show us the data types and formats of your source variables.
proc contents data=work.have;
run;quit;
51531
WashBD | Num | 8 | YEAR. | BEST32. | |
TestD | Num | 8 | YEAR. | YYMMDD10. | |
BD | Char | 10 | $EAR. | $10. | BD |
so this is what it is coming up as and I am having trouble converting BD to a date
@Cooksam13 wrote:
51531
WashBD Num 8 YEAR. BEST32. TestD Num 8 YEAR. YYMMDD10. BD Char 10 $EAR. $10. BD so this is what it is coming up as and I am having trouble converting BD to a date
WASHBD is a numeric variable that is being displayed using the YEAR format and has 32. informat attached to it. (BEST is the name of a FORMAT if you use it as in INFORMAT then it is treated as an alias for the normal numeric informat).
TESTD is a numeric variable that is being displayed using the YEAR format and has the YYMMDD10. informat attached to it.
BD is a character variable that can hold up to 10 bytes. Someone tried to attach the numeric format YEAR to it and that is getting displayed by PROC CONTENTS as $EAR because all character formats should start with a $. It also has the $10. informat attached to it.
So to convert BD into a DATE value use the INPUT() function with an appropriate informat. But you will need to store the results into a numeric variable and BD is already being used as the name of a character variable. So if it has strings like '23-12-2020' then you want to use the DDMMYY informat.
bd_date = input(bd,ddmmyy10.);
format bd_date yymmdd10.;
Not sure what kind of values WASHBD has. You should print it using different formats and see what the values look like. If they are date values for people's birthdays then they should be integer values from about -21,000 to about 23,000.
461 data _null_; 462 start=intnx('year',today(),-120,'b'); 463 end=today(); 464 put start= :comma10. start yymmdd10.; 465 put end= :comma10. +2 end yymmdd10.; 466 run; start=-21,184 1902-01-01 end=22,744 2022-04-09
If they are YEAR values then they should be integers between 1900 and 2021. In which case WASHBD should NOT have the YEAR format attached.
Similarly for TESTD, but since it has an informat that is appropriate for a date value attached to it then it is more likely that it has actual date values in it.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.