Splitting data based on date format type

Contributor
Posts: 43

I have a dataset with a diagnosis date variable, which is stored as a character variable. I would like to use it to calculate the duration since diagnosis (randomization date minus diagnosis date). Unfortunately, the variable is stored in three different ways across subjects: about 40 subjects have only a year, 50 subjects have a month and year, and the remainder have a day, month and year. I want to split the dataset by the way the diagnosis date is reported and then impute the missing parts of the date. After the imputation, I plan to merge all the data back together.

 

The values are currently stored as yyyy (40 subjects), mmmyyyy (50 subjects) and ddmmmyyyy (remainder).

 

I am expecting to have three datasets after the split.

 

I have used the following code in SAS 9.4 and it did not work (error message also included):

113 data diagfix1;
114 set disease;
115 if diagdate=diagdate year4.;
                                   ------
                                  388
                                  201
                                  76
ERROR 388-185: Expecting an arithmetic operator.

ERROR 201-322: The option is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

116 run;

 

Any help on how to split this data based on the date format would be appreciated.

 

Thanks.

 


All Replies
PROC Star
Posts: 7,360

Re: Splitting data based on date format type

It would help to see some example data. It appears that all 3 types are in the same variable. Are they SAS dates or character?

 

Art, CEO, AnalystFinder.com

 

Contributor
Posts: 43

Re: Splitting data based on date format type

The variable diagdate is a character variable (as mentioned in the original post). Some examples of the data are as follows:
1963
1964
1970
1972
198
1980
1981
APR1993
DEC1991
FEB1995
JUL1980
MAR1993
MAY1995
NOV1973
NOV1994
OCT1979
01APR1993
01APR1997
01AUG1991

Thanks.
Trusted Advisor
Posts: 1,369

Re: Splitting data based on date format type

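You can use the following code:

    len = length(strip(date_var));   /* characters present, ignoring leading/trailing blanks */
    select (len);
       when (4) date = mdy(01,01,input(date_var,4.));    /* yyyy: impute 01JAN */
       when (7) date = input('01'||date_var, date9.);    /* mmmyyyy: impute day 01 */
       when (9) date = input(date_var,date9.);           /* ddmmmyyyy: complete date */
       otherwise put 'Check obs ' _N_ date_var=;         /* any other length goes to the log */
    end;

Then continue with DATE as a SAS date variable.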

 

Contributor
Posts: 43

Re: Splitting data based on date format type

Thank you. I am trying this now. Would you mind explaining what this code is doing so that I will know for the future? Will I be getting three separate datasets?
Trusted Advisor
Posts: 1,369

Re: Splitting data based on date format type

In

len = length(strip(date_var));

I'm counting the number of characters in the input date variable.

If the variable contains only a year, its length is 4.

If it contains a month and year, its length is 7.

If it's a full date in the format ddmmmyyyy, its length is 9.

 

For each kind of input I fill in the missing part, using day=01 and, if needed, month=JAN.

Finally, I convert it to a SAS date variable.
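
As a minimal sketch of the steps described above, the conversion can run in a single DATA step, which produces one dataset rather than three. The dataset name dates_fixed, the variable kind and the four sample values are illustrative assumptions, not taken from the poster's real data.

```sas
/* Sketch only: dates_fixed, kind and the sample rows are illustrative. */
data dates_fixed;
  input date_var $9.;
  length kind $9;
  format date date9.;
  len = length(strip(date_var));        /* characters actually present */
  select (len);
    when (4) do;                        /* yyyy: impute 01JAN */
      kind = 'yyyy';
      date = mdy(1, 1, input(date_var, 4.));
    end;
    when (7) do;                        /* mmmyyyy: impute day 01 */
      kind = 'mmmyyyy';
      date = input('01' || date_var, date9.);
    end;
    when (9) do;                        /* ddmmmyyyy: already complete */
      kind = 'ddmmmyyyy';
      date = input(date_var, date9.);
    end;
    otherwise put 'Check obs ' _n_ date_var=;   /* unexpected length */
  end;
  datalines;
1972
APR1993
01AUG1991
198
;

proc print data=dates_fixed;
run;
```

The row with the value 198 matches none of the expected lengths, so it is reported in the log and keeps a missing date. The kind variable records how each date was originally reported, in case the rows need to be separated later.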

Contributor
Posts: 43

Re: Splitting data based on date format type

I have a question for you regarding the length of 7 putting the character "01" since only the month and year were provided. I ended up with a diagnosis date of 2019 and a duration of -22, but not sure why. When I made it the character "01" a numeric 01, the value became missing. Do you have any explanations for this?
Trusted Advisor
Posts: 1,369

Re: Splitting data based on date format type

 when (7) date = input('01'||date_var, date9.);

The informat date9. accepts input as DDMMMYYYY,

where DD is the day of the month (as a number).

I entered 01 in order to have a valid date format (that is, the 1st day of the month).

 

date_var is a character field, therefore I concatenate '01' as a character string.

If you concatenate a numeric 01 instead, SAS first converts the number to a character value that is right-aligned with leading blanks, so date9. reads only blanks; that is the reason for the missing value.

Have you checked your log?
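
A small sketch comparing the two concatenations may make the difference clearer. It assumes a sample value of APR1993; all variable names are illustrative. When a numeric value is used with the concatenation operator, SAS converts it automatically with the BEST12. format and writes a note about the conversion to the log.

```sas
/* Sketch only: the sample value and variable names are illustrative. */
data _null_;
  date_var = 'APR1993';
  char_day = '01' || date_var;          /* character literal: '01APR1993' */
  num_day  = 1 || date_var;             /* numeric 1 becomes 11 blanks plus '1' */
  date_ok  = input(char_day, date9.);
  date_bad = input(num_day, date9.);    /* date9. sees only the leading blanks */
  n_lead   = length(num_day) - length(left(num_day));   /* count of leading blanks */
  put date_ok= date9. date_bad= n_lead=;
run;
```

Because a field of blanks is a valid way to enter a missing value, the second INPUT call returns missing without raising an error, which would fit the poster's later report of seeing no error messages.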

 

 

Contributor
Posts: 43

Re: Splitting data based on date format type

Thank you for the explanation. I don't recall seeing anything in the log, but I will check again. Doesn't the input command change the variable from character to numeric?
Contributor
Posts: 43

Re: Splitting data based on date format type

I just checked the log and it did not give me any error messages. When I looked at the diagnosis date that was created, for 50 of the patients it is giving me a diagnosis date of 01mmm2019. Why 2019? I end up with a negative time from diagnosis. The years provided in the date variable range from 1973-1995. I am not sure why the code gives a year of 2019 for these year ranges.
Trusted Advisor
Posts: 1,369

Re: Splitting data based on date format type

It's difficult to know why you got the year 2019.

Please post your full code again, plus a sample of the input rows that cause the trouble.

Contributor
Posts: 43

Re: Splitting data based on date format type

Please find the output with the 2019 diagnosis date as well as the negative diagnosis time in the attached excel document.  Other than the diagdate and diagtime, the other variables are the input variables with mock data.  The code I used is below.

data merge;
merge bchist demog;
by subject;
where strip(trt) ne "001";
len = length(strip(date02));
select (len);
when (4) diagdate = mdy(06,15,input(date02,4.));
when (7) diagdate = input("15"||date02, date9.);
when (9) diagdate = input(date02,date9.);
otherwise put 'Check obs ' _N_ date02=;
end;
format diagdate date9.;
diagtime=intck('year',diagdate,daterand);
run;

 

Please let me know if you need any further information.  Thanks for your help.

PROC Star
Posts: 7,360

Re: Splitting data based on date format type

@PaulaC: You have embedded spaces in your data. The following should correct for that:

 

data year mmyyyy mdy;
  input date02 $12.;
  format diagdate date9.;
  date02=strip(date02);
  select (length(strip(date02)));
      when (4) do;
                 diagdate = mdy(06,05,input(date02,4.));
                 output year;
               end;
      when (7) do;
                 diagdate = input('15'||date02, date9.);
                 output mmyyyy;
               end;
      when (9) do;
                 diagdate = input(date02,date9.);
                 output mdy;
               end;
      otherwise put 'Check obs ' _N_ date02=;
  end;
  cards;
1963
1964
1970
1972
198
1980
1981
APR1993
DEC1991
FEB1995
JUL1980
MAR1993
MAY1995
NOV1973
NOV1994
OCT1979
01APR1993
01APR1997
01AUG1991
  OCT1991
  OCT1991
  OCT1991
  OCT1991
  NOV1979
  NOV1979
  NOV1979
  NOV1979
  NOV1979
  NOV1979
  JAN1993
  JAN1993
  JAN1993
  JAN1993
  JUN1980
  JUN1980
  JUN1980
  JUN1980
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 DEC1973
 MAY1993
 MAY1993
 MAY1993
 MAY1993
 MAY1993
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  MAR1995
  OCT1994
  OCT1994
  OCT1994
  OCT1994
;

HTH,

Art, CEO, AnalystFinder.com

 

Solution
‎01-23-2017 04:38 PM
Trusted Advisor
Posts: 1,369

Re: Splitting data based on date format type

Pay attention:

 

 @art297 added the following line to the code, just before the select statement:

date02=strip(date02);

 You can also use the function compress() or left() instead of strip().
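
The thread never states why the year came out as 2019. One plausible explanation, consistent with the reported symptoms, is that leading blanks push the last two digits of the year outside the nine characters that date9. reads, so the informat sees only a two-digit year. The sketch below assumes a sample value with two leading blanks and a default YEARCUTOFF setting; the variable names are illustrative.

```sas
/* Sketch only: raw is an illustrative value with two leading blanks. */
data _null_;
  length raw $12;
  raw = '  OCT1991';
  len    = length(strip(raw));                  /* 7, so the when (7) branch is taken */
  before = input('15' || raw, date9.);          /* date9. reads '15  OCT19' */
  after  = input('15' || strip(raw), date9.);   /* date9. reads '15OCT1991' */
  put len= before= date9. after= date9.;
run;
```

If this explanation holds, before should show a year in the 2000s while after shows 1991, and removing the leading blanks before the concatenation is what fixes the result.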

 

 

Contributor
Posts: 43

Re: Splitting data based on date format type

Thank you for your help.