Hi,
I have two character date variables which I need to change to numeric variables. I have a basic program but the issue is that sometimes the data is imported in different formats – DD-MM with no year and the year should be the current year. My question is how do I take this into account in my data step? See y current program and sample data below. The "TBD" and "on going" values should be set to missing. Any help will be greatly apericated. Thanks!
data test_date;
set master;
start_date_new= input(substr(strip(start_date),1,10),MMDDYY10.);
end_date_new=input(substr(strip(end_date),1,10),MMDDYY10.);
format start_date_new end_date_new date9.;
run;
Start_Date | End_Date |
6/23/2015 | 7/20/2015 |
7/14/2015 | 9/5/2015 |
8/4/2015 | TBD |
8/20/2015 | 11/2/2015 |
8/27/2015 | TBD |
9-Feb | 1-Jul |
15-Jan | 1-Jul |
15-Jan | 1-Jul |
3-Feb | 1-Jul |
9-Feb | 1-Jul |
1-Jul | TBD |
1-Jul | TBD |
8/27/2015 | TBD |
10/6/2015 | TBD |
10/13/2015 | 11/2/2015 |
10/13/2015 | TBD |
3/31/2015 | 5/19/2015 |
5/21/2015 | On Going |
6/9/2015 | 7/6/2015 |
7/8/2015 | TBD |
10/20/2015 | 11/2/2015 |
8/5/2015 | TBD |
7/30/2015 | 11/2/2015 |
8/11/2015 | TBD |
8/18/2015 | 11/2/2015 |
8/25/2015 | 9/29/2015 |
9/10/2015 | 10/5/2015 |
18-Sep | 1-Oct |
18-Sep | 1-Oct |
9/29/2015 | 11/2/2015 |
8/4/2015 | 9/29/2015 |
8/11/2015 | TBD |
8/13/2015 | TBD |
7/8/2015 | 10/6/2015 |
8/5/2015 | ongoing |
10/1/2015 | 11/2/2015 |
Hi, using a portion of your data (you could drop S and E) ...
data x;
input (s e) (:$20.);
if findc(s,'-') then s=catt(s,'-15');
if findc(e,'-') then e=catt(e,'-15');
start = input(s,anydtdte10.);
end = input(e,anydtdte10.);
format st: end mmddyy10.;
datalines;
6/23/2015 7/20/2015
7/14/2015 9/5/2015
8/4/2015 TBD
8/20/2015 11/2/2015
8/27/2015 TBD
9-Feb 1-Jul
15-Jan 1-Jul
15-Jan 1-Jul
3-Feb 1-Jul
9-Feb 1-Jul
1-Jul TBD
1-Jul TBD
8/27/2015 TBD
10/6/2015 TBD
10/13/2015 11/2/2015
;
data set X ...
Obs s e start end
1 6/23/2015 7/20/2015 06/23/2015 07/20/2015
2 7/14/2015 9/5/2015 07/14/2015 09/05/2015
3 8/4/2015 TBD 08/04/2015 .
4 8/20/2015 11/2/2015 08/20/2015 11/02/2015
5 8/27/2015 TBD 08/27/2015 .
6 9-Feb-15 1-Jul-15 02/09/2015 07/01/2015
7 15-Jan-15 1-Jul-15 01/15/2015 07/01/2015
8 15-Jan-15 1-Jul-15 01/15/2015 07/01/2015
9 3-Feb-15 1-Jul-15 02/03/2015 07/01/2015
10 9-Feb-15 1-Jul-15 02/09/2015 07/01/2015
11 1-Jul-15 TBD 07/01/2015 .
12 1-Jul-15 TBD 07/01/2015 .
13 8/27/2015 TBD 08/27/2015 .
14 10/6/2015 TBD 10/06/2015 .
15 10/13/2015 11/2/2015 10/13/2015 11/02/2015
Hi, using a portion of your data (you could drop S and E) ...
data x;
input (s e) (:$20.);
if findc(s,'-') then s=catt(s,'-15');
if findc(e,'-') then e=catt(e,'-15');
start = input(s,anydtdte10.);
end = input(e,anydtdte10.);
format st: end mmddyy10.;
datalines;
6/23/2015 7/20/2015
7/14/2015 9/5/2015
8/4/2015 TBD
8/20/2015 11/2/2015
8/27/2015 TBD
9-Feb 1-Jul
15-Jan 1-Jul
15-Jan 1-Jul
3-Feb 1-Jul
9-Feb 1-Jul
1-Jul TBD
1-Jul TBD
8/27/2015 TBD
10/6/2015 TBD
10/13/2015 11/2/2015
;
data set X ...
Obs s e start end
1 6/23/2015 7/20/2015 06/23/2015 07/20/2015
2 7/14/2015 9/5/2015 07/14/2015 09/05/2015
3 8/4/2015 TBD 08/04/2015 .
4 8/20/2015 11/2/2015 08/20/2015 11/02/2015
5 8/27/2015 TBD 08/27/2015 .
6 9-Feb-15 1-Jul-15 02/09/2015 07/01/2015
7 15-Jan-15 1-Jul-15 01/15/2015 07/01/2015
8 15-Jan-15 1-Jul-15 01/15/2015 07/01/2015
9 3-Feb-15 1-Jul-15 02/03/2015 07/01/2015
10 9-Feb-15 1-Jul-15 02/09/2015 07/01/2015
11 1-Jul-15 TBD 07/01/2015 .
12 1-Jul-15 TBD 07/01/2015 .
13 8/27/2015 TBD 08/27/2015 .
14 10/6/2015 TBD 10/06/2015 .
15 10/13/2015 11/2/2015 10/13/2015 11/02/2015
Im sure there are more efficent ways then what i have done here, but this was a fun challenge to try out.
here is what i did
data have;
length start_date end_date $ 12.;
input Start_Date $ End_Date $;
datalines ;
6/23/2015 7/20/2015
7/14/2015 9/5/2015
8/4/2015 TBD
8/20/2015 11/2/2015
8/27/2015 TBD
9-Feb 1-Jul
15-Jan 1-Jul
15-Jan 1-Jul
3-Feb 1-Jul
9-Feb 1-Jul
1-Jul TBD
1-Jul TBD
8/27/2015 TBD
10/6/2015 TBD
10/13/2015 11/2/2015
10/13/2015 TBD
3/31/2015 5/19/2015
5/21/2015 On Going
6/9/2015 7/6/2015
7/8/2015 TBD
10/20/2015 11/2/2015
8/5/2015 TBD
7/30/2015 11/2/2015
8/11/2015 TBD
8/18/2015 11/2/2015
8/25/2015 9/29/2015
9/10/2015 10/5/2015
18-Sep 1-Oct
18-Sep 1-Oct
9/29/2015 11/2/2015
8/4/2015 9/29/2015
8/11/2015 TBD
8/13/2015 TBD
7/8/2015 10/6/2015
8/5/2015 ongoing
10/1/2015 11/2/2015
;
data want;
set have;
format new_start_date new_end_date mmddyy10.;
if notdigit(start_date) ne 1 then do;
if length(trim(left(start_date))) <= 6 then
new_start_date = input(compress(start_date || "-" || year(date())), date11.);
if length(trim(left(start_date))) > 6 then
new_start_date = input(start_date, mmddyy10.);
end;
if notdigit(end_date) ne 1 then do;
if length(trim(left(end_date))) <= 6 then
new_end_date = input(compress(end_date || "-" || year(date())), date11.);
if length(trim(left(end_date))) > 6 then
new_end_date = input(end_date, mmddyy10.);
end;
else status = end_date;
drop start_Date end_Date;
run;
If your data is coming from Excel then specify the entire column to have the same format before importing it into SAS.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.