BookmarkSubscribeRSS Feed
knveraraju91
Barite | Level 11

Dear,

 

My data contains date variable with the following values. I am using the following code to get the output. I got the output except for a few OBS containing 'jun' as the month because I am trying to remove 'un' charcters using tranwrd function which removes 'un' in 'jun' month. Please help me simply the code. Thanks

 

 

date

 

//2011
00/UNK/UNK
01/Dec/2014
02/UNK/2014
18/jun/2008
UN/mar/2023
Un/apr/2013
uN/sep/2011

un/oct/2009
Un/UNK/2003
UN/UNK/2002
uN/UNK/2015
Un/UNK/2006

 

 

code;

date1=tranwrd(date_raw,'/','');
date2=tranwrd(date1,'UNK','');
date3=tranwrd(date2,'un','');
date4=tranwrd(date3,'UN','');
date5=tranwrd(date4,'Un','');
date6=compress(tranwrd(date5,'uN',''));

 

if length(date6)=9 then STDTC2= put(input(date6,date9.),is8601da.);

if length(date6)=7 then STDTC3= put(input(date6,monyy7.),yymmd7.);
if length(date6)=4 then STDTC4= put(input(date6,best.),10.);

 

if stdtc2 ^='' then isdate=stdtc2;

else if stdtc3 ^='' then isdate=stdtc3;

else if stdtc4 ^='' then isdate=stdtc4;

 

output needed;

2011

blank

2014-12-01
2014--02
2008-06-18
2023-03
2013-04
2011-09

2009-10
2003
2002
2015
2006

9 REPLIES 9
Shmuel
Garnet | Level 18

Add to your code at the beginning:

 

 if input(datex, ??  ddmmyy10.)  ne . 

   then date_out = put(input(datex, ddmmyy10.),  yymmdd10.);

else do;

   ... your code ...

end;

 

The only "problem" that may arouse is that date_out will be in dormat of: yyyy/mm/dd instead yyyy-mm-dd

and that can be corrected by date_out = translate(date_out, '-' , '/' );

knveraraju91
Barite | Level 11

Thank you for help. If I use my code where you mentioned to use. Does it work for values 'Un/jun/2012' . My code removes 'un' in jun. Please help.  

Shmuel
Garnet | Level 18

The expression  input(datex, ??  ddmmyy10.) checks for valid date and

assigns a number that is the days passed since 01-01-1960 otherwise it

assingns a missing value.

 

As  'Un/jun/2012' is not a valid date, it will execute the ELSE part of the IF statement.

In oredr to overcome JUN month, I would prefer a differnt code than yours assuming that

date_raw is always in a format of day/month/year .

I havn't run this code. You may try it:

 

%LET MM =  'JAN' ' FEB' 'MAR' 'APR' 'MAY' 'JUN' 'JUL' 'AUG' 'SEP' 'OCT' 'NOV' 'DEC' ;

 

DATA WANT;

  SET HAVE;

 

 array mx $ &mm;

 

 if input(datex, ??  ddmmyy10.)  ne .                         /* valid date given */

   then date_out = put(input(datex, ddmmyy10.),  yymmdd10.);

else do;

    month = scan(date_raw,2);

    year = scan(date_raw,3);

    if   length(trim(month)) = 2 and                    /* 2 digits month */

         1 le input(month, best2.) le 12 and

         input(year, best4.) ge 1960                    /* check for reasonable year */

    then date_out = catx( '-', year,month);   

    else if  length(trim(month)) = 3 and           /* 3 characters month */

           upcase(month) in (&mm) and 

           input(year, best4.) ge 1960  

           then do;

                 do i=1 to 12;

                     if mx(i) = upcase(trim(month)) then do;

                       month = put(i , z2.);     /* convert month mmm to its number, 2 digits */

                       date_out = catx( '-', year, trim(month)); 

                        leave;

                    end; 

                end; end;

   else if  input(year, best4.) ge 1960  

   then date_out = trim(year);

end;  

run;

 

 

 

 

Patrick
Opal | Level 21

The mdy() function is quite helpful here. See code below.

data have;
infile datalines truncover;
input have_str $11.;
datalines;
//2011
00/UNK/UNK
01/Dec/2014
02/UNK/2014
18/jun/2008
UN/mar/2023
Un/apr/2013
uN/sep/2011
un/oct/2009
Un/UNK/2003
UN/UNK/2002
uN/UNK/2015
Un/UNK/2006
  
Un/##/2006
;
run;

options missing=' ';
data want;
  set have;
  length want_str $11.;

  /* valid day, month and year? */
  want_str=put(input(have_str, ?? date11.),yymmdd10.);

  if missing(want_str) and not missing(have_str) then
    do;
      length _day $2 _month $3 _year $4;
      _day  =scan(have_str,1,'/','m');
      _month=put(month(input(scan(have_str,2,'/','m')||'2000', ?? monyy.)),z2.);
      _year =scan(have_str,3,'/','m');

      /* valid day and year? */
      if mdy('01',_day,_year) then want_str=_year||'--'||_day;

      /* valid month and year? */
      else if mdy(_month,'01',_year) then want_str=_year||'-'||_month;

      /* valid year? */
      else if mdy('01','01',_year) then want_str=_year;

    end;
run;

 

knveraraju91
Barite | Level 11

Thank you for the help. I got the output I need. But I am getting warning message as "invalid argument  for OBS where _day='un' 'UN' and _year='UNK'. I added a few additional lines to the code. Is there any other option to simplyfy it. Thanks.

 

code;

options missing=' ';
data want;
  set have;
  length want_str $11.;

  /* valid day, month and year? */
  want_str1=put(input(have_str, ?? date11.),yymmdd10.);

  if missing(want_str) and not missing(have_str) then
    do;
      length _day $2 _month $3 _year $4;
      _day  =scan(have_str,1,'/','m');
      _month=put(month(input(scan(have_str,2,'/','m')||'2000', ?? monyy.)),z2.);
      _year =scan(have_str,3,'/','m');

/* valid day and year? */
if _day not in ('un' 'UN' 'uN' 'Un') and _year ^='UNK' then do;
if mdy('01',_day,_year) then want_str2=_year||'--'||_day;
end;

/* valid month and year? */
if _year ^='UNK' then do;

if mdy(_month,'01',_year) then want_str3=_year||'-'||_month;
end;
/* valid year? */
if _year^='UNK' then do;
if mdy('01','01',_year) then want_str4=_year;
end;

end;

if want_str1 ^='' then wantstr=want_str1;

else if want_str2 ^='' then wantstr=want_str2;

else if want_str3 ^='' then wantstr=want_str3;

else if want_str4 ^='' then wantstr=want_str4;

 


run;

Patrick
Opal | Level 21

I should have checked the log.... mdy() requires numerical variables. Below a code version which creates a bit less Notes in the log about conversion from character to numerical.

data have;
infile datalines truncover;
input have_str $11.;
datalines;
//2011
00/UNK/UNK
01/Dec/2014
02/UNK/2014
18/jun/2008
UN/mar/2023
Un/apr/2013
uN/sep/2011
un/oct/2009
Un/uNk/2003
 
Un/##/2006
99/999/2006
;
run;

options missing=' ';
data want;
  set have;
  length want_str $11.;

  /* valid day, month and year? */
  want_str=put(input(have_str, ?? date11.),yymmdd10.);

  if missing(want_str) and not missing(have_str) then
    do;
      length _day _month _year 8;
      _day  =input(scan(have_str,1,'/','m'), ?? best32.);
      _month=month(input(scan(have_str,2,'/','m')||'2000', ?? monyy.));
      _year =input(scan(have_str,3,'/','m'),?? best32.);

      if _day<1 or _day>31 then call missing(_day);
      if _month<1 or _month>12 then call missing(_month);

      /* valid day and year? */
      if mdy(01,_day,_year) then want_str=put(_year,z4.)||'--'||put(_day,z2.);

      /* valid month and year? */
      else if mdy(_month,01,_year) then want_str=put(_year,z4.)||'-'||put(_month,z2.);

      /* valid year? */
      else if mdy(01,01,_year) then want_str=put(_year,z4.);

    end;
run;

If you're using my code version then you don't need to add all these checks like if _day not in ('un' 'UN' 'uN' 'Un') and _year ^='UNK' then do; The code I've posted works fine without it and _day will either contain a number or a missing.

 

If the number in _day is a valid day then the mdy() function will return a value (if the 2 other parameters are valid as well), else it will return a missing. So that's the check.

 

As you can see this code version doesn't only deal with "UN/UNK" but with any string which doesn't represent a valid day, month or year.

knveraraju91
Barite | Level 11

Dear,

 

My data contains date variable with the following values. I am using the following the code. I am not able to run the code. Please help. It giving me error message.Thanks

 

 

datex

 

//2011
00/UNK/UNK
01/Dec/2014
02/UNK/2014
18/jun/2008
UN/mar/2023
Un/apr/2013
uN/sep/2011

un/oct/2009
Un/UNK/2003
UN/UNK/2002
uN/UNK/2015
Un/UNK/2006

un/jun/2002

 

output needed;

2011

blank

2014-12-01
2014--02
2008-06-18
2023-03
2013-04
2011-09

2009-10
2003
2002
2015
2006

 

code;(Some one posted this code);

 

%LET MM =  'JAN' ' FEB' 'MAR' 'APR' 'MAY' 'JUN' 'JUL' 'AUG' 'SEP' 'OCT' 'NOV' 'DEC' ;

 

DATA WANT;

  SET HAVE;

 

 array mx $ &mm;

 

 if input(datex, ??  ddmmyy10.)  ne .                         /* valid date given */

   then date_out = put(input(datex, ddmmyy10.),  yymmdd10.);

else do;

    month = scan(date_raw,2);

    year = scan(date_raw,3);

    if   length(trim(month)) = 2 and                    /* 2 digits month */

         1 le input(month, best2.) le 12 and

         input(year, best4.) ge 1960                    /* check for reasonable year */

    then date_out = catx( '-', year,month);   

    else if  length(trim(month)) = 3 and           /* 3 characters month */

           upcase(month) in (&mm) and 

           input(year, best4.) ge 1960  

           then do;

                 do i=1 to 12;

                     if mx(i) = upcase(trim(month)) then do;

                       month = put(i , z2.);     /* convert month mmm to its number, 2 digits */

                       date_out = catx( '-', year, trim(month)); 

                        leave;

                    end; 

                end; end;

   else if  input(year, best4.) ge 1960  

   then date_out = trim(year);

end;  

run;

 

 

Patrick
Opal | Level 21

Please don't post the same question twice.

 

Here the link to the thread with replies: 

https://communities.sas.com/t5/Base-SAS-Programming/How-to-convert-date-to-character-dates-when-sour...

 

 

Shmuel
Garnet | Level 18

After hard work I simplified the code as:

 

data have;
infile datalines;
input datex $11. ;
datalines;
//2011
00/UNK/UNK
01/Dec/2014
02/UNK/2014
18/jun/2008
UN/mar/2023
Un/apr/2013
uN/sep/2011
un/oct/2009
Un/UNK/2003
UN/UNK/2002
uN/UNK/2015
Un/UNK/2006
; run;


%LET MM = 'JAN' ' FEB' 'MAR' 'APR' 'MAY' 'JUN' 'JUL' 'AUG' 'SEP' 'OCT' 'NOV' 'DEC' ;

 

DATA WANT(keep=datex date_out day month year );   /*drop unrelevant variables */
         length date_out $10 day $2 month $3 year $4;
  SET HAVE;
       array mx {12} $ ( &mm );

      day = scan(datex,1,'/','m');
      month = scan(datex,2,'/','m');
      year = scan(datex,3,'/','m');

if length(strip(month)) = 3 and                /* 3 characters month */
     upcase(month) in (&mm) and
     input(year,?? best4.) ge 1960
then do;
    do i=1 to 12;
        if mx(i) = upcase(trim(month)) then do;
           month = put(i , z2.);                  /* convert month mmm to its number, 2 digits */
          leave;
      end;
end; end;

lday     =  compress(day,'0123456789');
lmonth = compress(month,'0123456789');
lyear    = compress(year,'0123456789');

if lday=' ' and lmonth=' ' and lyear=' '
then date_out = catx('-',year,month); else
if lday=' ' and lyear=' ' and lmonth ne ' '
then date_out = cat(year,'--',day); else
if lday ne ' ' and lmonth=' ' and lyear=' '
then date_out = catx('-',year,month); else
if lday ne ' ' and lmonth ne ' ' and lyear=' '
then date_out = year; else date_out = ' ';
run;





SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 4988 views
  • 5 likes
  • 3 in conversation