BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RobertNYC
Obsidian | Level 7

 

Hi, 

I have two character date variables which I need to change to numeric variables. I have a basic program but the issue is that sometimes the data is imported in different formats – DD-MM with no year  and the year should be the current year.  My question is how do I take this into account in my data step? See y current program and sample data below. The "TBD" and "on going" values should be set to missing.  Any help will be greatly apericated.  Thanks! 

 

 

 

data test_date; 
set master; 
start_date_new= input(substr(strip(start_date),1,10),MMDDYY10.); 
end_date_new=input(substr(strip(end_date),1,10),MMDDYY10.); 
format start_date_new end_date_new date9.; 
run; 

 

Start_Date End_Date
6/23/2015 7/20/2015
7/14/2015 9/5/2015
8/4/2015 TBD
8/20/2015 11/2/2015
8/27/2015 TBD
9-Feb 1-Jul
15-Jan 1-Jul
15-Jan 1-Jul
3-Feb 1-Jul
9-Feb 1-Jul
1-Jul TBD
1-Jul TBD
8/27/2015 TBD
10/6/2015 TBD
10/13/2015 11/2/2015
10/13/2015 TBD
3/31/2015 5/19/2015
5/21/2015 On Going
6/9/2015 7/6/2015
7/8/2015 TBD
10/20/2015 11/2/2015
8/5/2015 TBD
7/30/2015 11/2/2015
8/11/2015 TBD
8/18/2015 11/2/2015
8/25/2015 9/29/2015
9/10/2015 10/5/2015
18-Sep 1-Oct
18-Sep 1-Oct
9/29/2015 11/2/2015
8/4/2015 9/29/2015
8/11/2015 TBD
8/13/2015 TBD
7/8/2015 10/6/2015
8/5/2015 ongoing
10/1/2015 11/2/2015
1 ACCEPTED SOLUTION

Accepted Solutions
MikeZdeb
Rhodochrosite | Level 12

Hi, using a portion of your data (you could drop S and E) ...

 

data x;
input (s e) (:$20.);
if findc(s,'-') then s=catt(s,'-15');
if findc(e,'-') then e=catt(e,'-15');
start = input(s,anydtdte10.);
end   = input(e,anydtdte10.);
format st: end mmddyy10.;
datalines;
6/23/2015 7/20/2015
7/14/2015 9/5/2015
8/4/2015 TBD
8/20/2015 11/2/2015
8/27/2015 TBD
9-Feb 1-Jul
15-Jan 1-Jul
15-Jan 1-Jul
3-Feb 1-Jul
9-Feb 1-Jul
1-Jul TBD
1-Jul TBD
8/27/2015 TBD
10/6/2015 TBD
10/13/2015 11/2/2015
;

 

data set X ...

Obs        s         e                 start           end

  1    6/23/2015     7/20/2015    06/23/2015    07/20/2015
  2    7/14/2015     9/5/2015     07/14/2015    09/05/2015
  3    8/4/2015      TBD          08/04/2015             .
  4    8/20/2015     11/2/2015    08/20/2015    11/02/2015
  5    8/27/2015     TBD          08/27/2015             .
  6    9-Feb-15      1-Jul-15     02/09/2015    07/01/2015
  7    15-Jan-15     1-Jul-15     01/15/2015    07/01/2015
  8    15-Jan-15     1-Jul-15     01/15/2015    07/01/2015
  9    3-Feb-15      1-Jul-15     02/03/2015    07/01/2015
 10    9-Feb-15      1-Jul-15     02/09/2015    07/01/2015
 11    1-Jul-15      TBD          07/01/2015             .
 12    1-Jul-15      TBD          07/01/2015             .
 13    8/27/2015     TBD          08/27/2015             .
 14    10/6/2015     TBD          10/06/2015             .
 15    10/13/2015    11/2/2015    10/13/2015    11/02/2015

 

 

View solution in original post

3 REPLIES 3
MikeZdeb
Rhodochrosite | Level 12

Hi, using a portion of your data (you could drop S and E) ...

 

data x;
input (s e) (:$20.);
if findc(s,'-') then s=catt(s,'-15');
if findc(e,'-') then e=catt(e,'-15');
start = input(s,anydtdte10.);
end   = input(e,anydtdte10.);
format st: end mmddyy10.;
datalines;
6/23/2015 7/20/2015
7/14/2015 9/5/2015
8/4/2015 TBD
8/20/2015 11/2/2015
8/27/2015 TBD
9-Feb 1-Jul
15-Jan 1-Jul
15-Jan 1-Jul
3-Feb 1-Jul
9-Feb 1-Jul
1-Jul TBD
1-Jul TBD
8/27/2015 TBD
10/6/2015 TBD
10/13/2015 11/2/2015
;

 

data set X ...

Obs        s         e                 start           end

  1    6/23/2015     7/20/2015    06/23/2015    07/20/2015
  2    7/14/2015     9/5/2015     07/14/2015    09/05/2015
  3    8/4/2015      TBD          08/04/2015             .
  4    8/20/2015     11/2/2015    08/20/2015    11/02/2015
  5    8/27/2015     TBD          08/27/2015             .
  6    9-Feb-15      1-Jul-15     02/09/2015    07/01/2015
  7    15-Jan-15     1-Jul-15     01/15/2015    07/01/2015
  8    15-Jan-15     1-Jul-15     01/15/2015    07/01/2015
  9    3-Feb-15      1-Jul-15     02/03/2015    07/01/2015
 10    9-Feb-15      1-Jul-15     02/09/2015    07/01/2015
 11    1-Jul-15      TBD          07/01/2015             .
 12    1-Jul-15      TBD          07/01/2015             .
 13    8/27/2015     TBD          08/27/2015             .
 14    10/6/2015     TBD          10/06/2015             .
 15    10/13/2015    11/2/2015    10/13/2015    11/02/2015

 

 

dsbihill
Obsidian | Level 7

Im sure there are more efficent ways then what i have done here, but this was a fun challenge to try out.

 

here is what i did

data have;
	  length start_date end_date $ 12.;
      input Start_Date $ End_Date $;
      datalines ;


6/23/2015 7/20/2015 
7/14/2015 9/5/2015 
8/4/2015 TBD 
8/20/2015 11/2/2015 
8/27/2015 TBD 
9-Feb 1-Jul 
15-Jan 1-Jul 
15-Jan 1-Jul 
3-Feb 1-Jul 
9-Feb 1-Jul 
1-Jul TBD 
1-Jul TBD 
8/27/2015 TBD 
10/6/2015 TBD 
10/13/2015 11/2/2015 
10/13/2015 TBD 
3/31/2015 5/19/2015 
5/21/2015 On Going 
6/9/2015 7/6/2015 
7/8/2015 TBD 
10/20/2015 11/2/2015 
8/5/2015 TBD 
7/30/2015 11/2/2015
8/11/2015 TBD
8/18/2015 11/2/2015
8/25/2015 9/29/2015
9/10/2015 10/5/2015
18-Sep 1-Oct
18-Sep 1-Oct
9/29/2015 11/2/2015
8/4/2015 9/29/2015
8/11/2015 TBD
8/13/2015 TBD
7/8/2015 10/6/2015
8/5/2015 ongoing
10/1/2015 11/2/2015
;


data want;
	set have;
	format new_start_date new_end_date mmddyy10.;
	
	if notdigit(start_date) ne 1 then do;
		if length(trim(left(start_date))) <= 6 then
			new_start_date = input(compress(start_date || "-" || year(date())), date11.);
		if length(trim(left(start_date))) > 6 then
			new_start_date = input(start_date, mmddyy10.);
	end;
	
	if notdigit(end_date) ne 1 then do;
		if length(trim(left(end_date))) <= 6 then
			new_end_date = input(compress(end_date || "-" || year(date())), date11.);
		if length(trim(left(end_date))) > 6 then
			new_end_date = input(end_date, mmddyy10.);
	end;
	else status = end_date;

	drop start_Date end_Date;

run;
ballardw
Super User

If your data is coming from Excel then specify the entire column to have the same format before importing it into SAS.

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1555 views
  • 1 like
  • 4 in conversation