I have a dataset that has a diagnosis date variable. This variable is a character variable. I would like to use this variable to calculate the duration since diagnosis (randomization date - diagnosis date). Unfortunately, the variable is stored in three different ways across subjects. I have about 40 subjects with only a year, 50 subjects with just a month and year, and the remainder have a month, day and year. I want to split the dataset by the way the diagnosis date is reported and then impute the missing parts of the date. After the imputation, I am planning on merging all the data back together.
The data is currently stored as yyyy (40 subjects), mmmyyyy (50 subjects) and ddmmmyyyy (remainder).
I am expecting to have three datasets after the split.
I have used the following code in SAS 9.4 and it did not work (error message also included):
113 data diagfix1;
114 set disease;
115 if diagdate=diagdate year4.;
                                   ------
                                  388
                                  201
                                  76
ERROR 388-185: Expecting an arithmetic operator.
ERROR 201-322: The option is not recognized and will be ignored.
ERROR 76-322: Syntax error, statement will be ignored.
116 run;
Any help on how to split this data based on the date format would be appreciated.
Thanks.
It would help to see some example data. It appears that all 3 types are in the same variable. Are they SAS dates or character?
Art, CEO, AnalystFinder.com
You can use the following code:
    len = length(strip(date_var));
    select (len);
       when (4) date = mdy(01,01,input(date_var,4.));
       when (7) date = input('01'||date_var, date9.);
       when (9) date = input(date_var,date9.);
       otherwise put 'Check obs ' _N_ date_var=;
    end;
Then continue with DATE as a SAS date variable.
In
len = length(strip(date_var));
I'm counting the number of characters in the input date variable.
If the variable contains a year only, its length is 4.
If it contains a month and year, its length is 7.
If it is a full date in the form ddmmmyyyy, its length is 9.
For each kind of input I fill in the missing part with day=01 and, if needed, month=JAN;
finally I convert it to a SAS date variable.
 when (7) date = input('01'||date_var, date9.);
The informat date9. accepts input as DDMMMYYYY,
where DD is the day of the month (as a number).
I have prepended 01 in order to have a valid date form (that is, the 1st day of the month).
date_var is a character field, therefore I concatenate '01' as a character string.
Concatenating a number to a character value does not give the string you expect (SAS converts the number to text with leading blanks), and that is the reason you are getting a missing value.
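A small sketch of the difference, assuming a hypothetical character variable date_var that holds APR1993 (the names good and bad are made up for illustration). In the second assignment SAS converts the numeric 15 to text, normally right-aligned with leading blanks, so the first columns handed to date9. should be blank:

```sas
data _null_;
  length date_var $12;
  date_var = 'APR1993';

  /* character literal: the combined string starts with 15APR1993 */
  good = input('15' || date_var, date9.);

  /* numeric literal: SAS converts 15 to text with leading blanks, */
  /* so the first 9 columns read by date9. are expected to be blank */
  bad = input(15 || date_var, date9.);

  put good= date9. bad= date9.;
run;
```

The log should also show a note that numeric values have been converted to character values, which is a useful hint when a date unexpectedly comes out missing.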
Have you checked your log?
It's difficult to know why you got the year 2019.
Please post your full code again, plus a sample of the input rows that are causing you trouble.
Please find the output with the 2019 diagnosis date as well as the negative diagnosis time in the attached Excel document. Other than diagdate and diagtime, the variables shown are the input variables with mock data. The code I used is below.
data merge;
merge bchist demog;
by subject;
where strip(trt) ne "001";
len = length(strip(date02));
select (len);
when (4) diagdate = mdy(06,15,input(date02,4.));
when (7) diagdate = input("15"||date02, date9.);
when (9) diagdate = input(date02,date9.);
otherwise put 'Check obs ' _N_ date02=;
end;
format diagdate date9.;
diagtime=intck('year',diagdate,daterand);
run;
Please let me know if you need any further information. Thanks for your help.
@PaulaC: You have embedded spaces in your data. The following should correct for that:
data year mmyyyy mdy;
  input date02 $12.;
  format diagdate date9.;
  date02=strip(date02);
  select (length(strip(date02)));
      when (4) do;
                 diagdate = mdy(06,15,input(date02,4.));
                 output year;
               end;
      when (7) do;
                 diagdate = input('15'||date02, date9.);
                 output mmyyyy;
               end;
      when (9) do;
                 diagdate = input(date02,date9.);
                 output mdy;
               end;
      otherwise put 'Check obs ' _N_ date02=;
  end;
  cards;
1963
1964
1970
1972
198
1980
1981
APR1993
DEC1991
FEB1995
JUL1980
MAR1993
MAY1995
NOV1973
NOV1994
OCT1979
01APR1993
01APR1997
01AUG1991
  OCT1991
  OCT1991
  OCT1991
  OCT1991
  NOV1979
  NOV1979
  NOV1979
  NOV1979
  NOV1979
  NOV1979
  JAN1993
  JAN1993
  JAN1993
  JAN1993
  JUN1980
  JUN1980
  JUN1980
  JUN1980
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 MAY1993
 MAY1993
 MAY1993
 MAY1993
 MAY1993
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  OCT1994
  OCT1994
  OCT1994
  OCT1994
;
HTH,
Art, CEO, AnalystFinder.com
Pay attention:
@art297 added the following line to the code, just before the select statement:
date02=strip(date02);
You can also use the function compress() or left() instead of strip().
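To see why the leading blanks matter, here is a minimal sketch, assuming a value with two leading blanks like the ones in the posted data (the names unstripped and stripped are made up for illustration). length(strip(date02)) still returns 7, so the when (7) branch runs, but date9. only reads the first 9 columns of the concatenated string:

```sas
data _null_;
  length date02 $12;
  date02 = '  OCT1991';            /* two leading blanks, as in the posted data */
  len = length(strip(date02));     /* 7, because strip() ignores the blanks */

  /* date9. reads only the first 9 columns: '15  OCT19' */
  unstripped = input('15' || date02, date9.);

  /* removing the leading blanks first gives '15OCT1991' */
  stripped = input('15' || strip(date02), date9.);

  put len= unstripped= date9. stripped= date9.;
run;
```

With the default YEARCUTOFF setting, the truncated two-digit year 19 would likely be interpreted as 2019, which may be where the 2019 diagnosis dates and the negative diagnosis times came from.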
