I have a dataset that has a diagnosis date variable, stored as a character variable. I would like to use it to calculate the duration since diagnosis (randomization date minus diagnosis date). Unfortunately, the variable is stored in three different ways across subjects: about 40 subjects have only a year, 50 subjects have just a month and year, and the remainder have a month, day and year. I want to split the dataset by the way the diagnosis date is reported and then impute the missing parts of the date. After the imputation, I plan to merge all the data back together.
The data is currently stored as yyyy (40 subjects), mmmyyyy (50 subjects) and ddmmmyyyy (remainder).
I am expecting to have three datasets after the split.
I have used the following code in SAS 9.4 and it did not work (error message also included):
113 data diagfix1;
114 set disease;
115 if diagdate=diagdate year4.;
------
388
201
76
ERROR 388-185: Expecting an arithmetic operator.
ERROR 201-322: The option is not recognized and will be ignored.
ERROR 76-322: Syntax error, statement will be ignored.
116 run;
Any help on how to split this data based on the date format would be appreciated.
Thanks.
It would help to see some example data. It appears that all 3 types are in the same variable. Are they SAS dates or character?
Art, CEO, AnalystFinder.com
You can use the following code:
len = length(strip(date_var));
select (len);
when (4) date = mdy(01,01,input(date_var,4.));
when (7) date = input('01'||date_var, date9.);
when (9) date = input(date_var,date9.);
otherwise put 'Check obs ' _N_ date_var=;
end;
Then continue with DATE as a SAS date variable.
In
len = length(strip(date_var));
I'm counting the number of characters in the input date variable.
If the variable contains a year only, its length is 4.
If it contains a month and year, its length is 7.
If it is a full date in ddmmmyyyy format, its length is 9.
For each kind of input I fill in the missing part as day=01 and, if needed, month=JAN;
finally I convert it to a SAS date variable.
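The steps described above can be tried on a few values before running them on the real data. This is only a minimal sketch: the dataset name demo and the three sample values are made up for illustration, and date_var stands in for your own variable.

```sas
/* Sketch only: demo and the sample values below are hypothetical */
data demo;
  input date_var $9.;
  format date date9.;
  /* number of characters, ignoring leading and trailing blanks */
  len = length(strip(date_var));
  select (len);
    when (4) date = mdy(1, 1, input(date_var, 4.));       /* year only: assume 01JAN */
    when (7) date = input('01' || date_var, date9.);      /* month and year: assume day 01 */
    when (9) date = input(date_var, date9.);              /* full date */
    otherwise put 'Check obs ' _N_ date_var=;             /* anything else is flagged in the log */
  end;
  datalines;
1980
APR1993
01APR1993
;
run;
```

Printing demo should show one SAS date per row, with the year-only and month-year rows filled in from the first day of the period.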
when (7) date = input('01'||date_var, date9.);
The date9. informat accepts input as DDMMMYYYY,
where DD is the day of the month (as a number).
I entered 01 in order to have a valid date format (that is, the 1st day of the month).
date_var is a character field, therefore I concatenate '01' as a character string.
If you concatenate a number instead, SAS converts it to character with leading blanks, so the informat cannot read the result; that is why you got a missing value.
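To see the difference between concatenating a number and a character string, a small test step like the one below may help. It is a sketch only: the value APR1993 and the variable names from_number and from_char are made up for illustration.

```sas
/* Sketch only: APR1993 and the variable names are hypothetical */
data _null_;
  date_var = 'APR1993';
  /* numeric 1 is converted to character automatically, padded with leading blanks */
  from_number = input(1 || date_var, date9.);
  /* character '01' produces the string 01APR1993, which date9. can read */
  from_char = input('01' || date_var, date9.);
  put from_number= from_char= date9.;
run;
```

The log should also carry a note that numeric values were converted to character values, which is a useful thing to search for when a date comes out missing.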
Have you checked your log?
It's difficult to know why you got year 2019.
Please post your full code again, plus a sample of the input rows that are causing trouble.
Please find the output with the 2019 diagnosis date, as well as the negative diagnosis time, in the attached Excel document. Apart from diagdate and diagtime, the variables shown are the input variables, with mock data. The code I used is below.
data merge;
merge bchist demog;
by subject;
where strip(trt) ne "001";
len = length(strip(date02));
select (len);
when (4) diagdate = mdy(06,15,input(date02,4.));
when (7) diagdate = input("15"||date02, date9.);
when (9) diagdate = input(date02,date9.);
otherwise put 'Check obs ' _N_ date02=;
end;
format diagdate date9.;
diagtime=intck('year',diagdate,daterand);
run;
Please let me know if you need any further information. Thanks for your help.
@PaulaC: You have embedded spaces in your data. The following should correct for that:
data year mmyyyy mdy;
input date02 $12.;
format diagdate date9.;
date02=strip(date02);
select (length(strip(date02)));
when (4) do;
diagdate = mdy(06,05,input(date02,4.));
output year;
end;
when (7) do;
diagdate = input('15'||date02, date9.);
output mmyyyy;
end;
when (9) do;
diagdate = input(date02,date9.);
output mdy;
end;
otherwise put 'Check obs ' _N_ date02=;
end;
cards;
1963
1964
1970
1972
198
1980
1981
APR1993
DEC1991
FEB1995
JUL1980
MAR1993
MAY1995
NOV1973
NOV1994
OCT1979
01APR1993
01APR1997
01AUG1991
OCT1991
OCT1991
OCT1991
OCT1991
NOV1979
NOV1979
NOV1979
NOV1979
NOV1979
NOV1979
JAN1993
JAN1993
JAN1993
JAN1993
JUN1980
JUN1980
JUN1980
JUN1980
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
DEC1973
DEC1973
DEC1973
DEC1973
DEC1973
DEC1973
DEC1973
MAY1993
MAY1993
MAY1993
MAY1993
MAY1993
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
MAR1995
OCT1994
OCT1994
OCT1994
OCT1994
;
HTH,
Art, CEO, AnalystFinder.com
Pay attention:
@art297 added the following line to the code, just before the select statement:
date02=strip(date02);
You can also use the compress() or left() function instead of strip().