BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
desireatem
Pyrite | Level 9

Dear Madam?Sir,

 

Your help is much appreciated.

 

I have a very large data (marked bold below)  that I wish to compute the difference between the two dates below. However, the variable "Arrival_data_time" is a character while LKW is numeric as below:

Arrival_date_time Char 18 $18. $18. Arrival date_time

 

LKW Num 8 DATETIME16.   LKW

 

I used the code but the results looks wrong:

data bio;
set bio;
LKWH=Arrival_date_time-LKW;
run;

proc means data=Bio median p25 p75 ;
var LKWH;
run;

OUTPUT

Analysis Variable : LKWH
Median 25th Pctl 75th Pctl
1968891295 1967083516 1970649275

 

Arrival_date_time LKW 
3/23/22 9:51 3/23/2022 9:00
4/28/2022 17:39 4/28/2022 9:00
5/3/2022 11:55 5/2/2022 16:30
5/9/2022 21:39 5/9/2022 17:20
5/15/2022 12:47 5/15/2022 12:20
6/1/2022 6:39 5/31/2022 17:00
6/8/2022 21:44 6/8/2022 21:00
6/13/2022 10:33 6/12/2022 23:00
. 6/27/2022 19:30
7/18/22 10:37 7/17/2022 23:00
8/8/2022 15:09 8/8/2022 6:00
1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

This code seems to work for me. The first data step reproduces your data. The second data step does the differencing. If the second data step does not produce the same results for you that I am getting, then we still do not have a good description of your data.

 

data have;
input Arrival_date_time : $16. lkw :anydtdtm.;
infile cards delimiter='|';
format lkw datetime16.;
cards;
3/23/22 9:51    |	3/23/2022 9:00
4/28/2022 17:39 |   4/28/2022 9:00
5/3/2022 11:55  |   5/2/2022 16:30
5/9/2022 21:39  |   5/9/2022 17:20
5/15/2022 12:47 |   5/15/2022 12:20
6/1/2022 6:39   |   5/31/2022 17:00
6/8/2022 21:44  |   6/8/2022 21:00
6/13/2022 10:33 |   6/12/2022 23:00
.               |   6/27/2022 19:30
7/18/22 10:37   |   7/17/2022 23:00
8/8/2022 15:09  |   8/8/2022 6:00
;
data want;
    set have;
    arrival_date_time_Num=input(arrival_date_time,anydtdtm.);
    difference = arrival_date_time_num - lkw;
    format arrival_date_time_num datetime16. difference time.;;
run;

PaigeMiller_0-1673542843476.png

 

--
Paige Miller

View solution in original post

21 REPLIES 21
PaigeMiller
Diamond | Level 26

First, you will have to create a numeric variable containing the date/time value from the character variable ARRIVAL_DATE_TIME, then do the subtraction, using a DATA step.

 

 

data want;
    set have;
    /* Create numeric date/time variable */
    arrival_date_time_num=input(arrival_date_time,anydtdtm.);
    /* Do subtraction */
    difference=arrival_date_time_num-lkw;
    format difference time.;
run;

 

--
Paige Miller
desireatem
Pyrite | Level 9

It does not work, the answer is negative. I wish to get the difference in minutes.

 

data Bio;
Set Bio;
arrival_date_time_num=input(arrival_date_time,anydtdtm.);
difference=arrival_date_time_num-lkw;
format difference time.;
run;

 

proc means data=Bio median p25 p75 ;
var difference;
run;

 

Analysis Variable : difference
Median 25th Pctl 75th Pctl
-1810558500 -1810558500 -1810558500
PaigeMiller
Diamond | Level 26

The numbers for variable DIFFERENCE are in seconds, and I use the proper format, so the values of DIFFERENCE should appear as hours:minutes:seconds. SHOW US a screen capture of (a portion of) the data set BIO so we can see the value of DIFFERENCE for ourselves (not the screen capture from PROC MEANS).

 

If the numbers are negative, you can change the calculation formula to obtain positive numbers.

 

 

 

 

--
Paige Miller
desireatem
Pyrite | Level 9

It didnt work:

 

401 data Bio;
402 Set Bio;
403 arrival_date_time_num=input(arrival_date_time,anydtdtm.);
404 difference=arrival_date_time_num-lkw;
405 format difference time.;
406 run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
10 at 404:36
NOTE: There were 11 observations read from the data set WORK.BIO.
NOTE: The data set WORK.BIO has 11 observations and 58 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

 

proc print data=bio;
var difference;
run;

 


Obs difference
1 .
2 .
3 .
4 .
5 .
6 .
7 .
8 .
9 .
10 .
11 -502932
PaigeMiller
Diamond | Level 26

Okay, show us all the relevant columns of data set BIO (this would be variable ARRIVAL_DATE_TIME, LKW and ARRIVAL_DATE_TIME_NUM). Something is not as you described it, that is why you are getting missing values.

--
Paige Miller
desireatem
Pyrite | Level 9
The SAS System

Obs arrival_date_time_num LKW difference
1 . 23MAR22:09:00:00 .
2 . 28APR22:09:00:00 .
3 . 02MAY22:16:30:00 .
4 . 09MAY22:17:20:00 .
5 . 15MAY22:12:20:00 .
6 . 31MAY22:17:00:00 .
7 . 08JUN22:21:00:00 .
8 . 12JUN22:23:00:00 .
9 . 27JUN22:19:30:00 .
10 . 17JUL22:23:00:00 .
11 164999100 08AUG22:06:00:00 -502932
PaigeMiller
Diamond | Level 26

This does not use the code that I provided, otherwise the DIFFERENCE would appear as hour:minute:seconds.

 

Your data is not as you represent it, ARRIVAL_DATE_TIME_NUM is missing for most rows, and you need to also show us the data including ARRIVAL_DATE_TIME

--
Paige Miller
Tom
Super User Tom
Super User

You are getting a negative number because 1965 is much earlier than 2022.

1908  data _null_;
1909    dt=164999100;
1910    put dt comma17. dt datetime19.;
1911    dt='08AUG22:06:00:00'dt;
1912    put dt comma17. dt datetime19.;
1913  run;

      164,999,100 24MAR1965:17:05:00
    1,975,557,600 08AUG2022:06:00:00
desireatem
Pyrite | Level 9

A good portion of the data comes out as missing only one convertion is done from char to num

Kurt_Bremser
Super User

You will need to do the conversion in multiple steps:

arrival_date_time_num = dhms(input(scan(arrival_date_time,1),mmddyy10.),0,0,input(scan(arrival_date_time,2),time5.));

because somebody chose a completely non-standard format for the datetime value. YYYY-MM-DD HH:MM works with the E8601DT informat, and the "ANY" formats are dependent on your locale setting (MDY dates are only used on the American continent, everybody else uses DMY or (best) YMD).

desireatem
Pyrite | Level 9

Thank you, I tried but it did not work:

 

 

data Bio;
Set Bio;
*arrival_date_time_num=input(arrival_date_time,anydtdtm.);
arrival_date_time_num = dhms(input(scan(arrival_date_time,1),mmddyy10.),0,0,input(scan(arrival_date_time,2),time5.));
difference=arrival_date_time_num-lkw;
format difference time.;
run;

 

 

 

 

424 data Bio;
425 Set Bio;
426 *arrival_date_time_num=input(arrival_date_time,anydtdtm.);
427 arrival_date_time_num =
427! dhms(input(scan(arrival_date_time,1),mmddyy10.),0,0,input(scan(arrival_date_time,2),time5.));
428 difference=arrival_date_time_num-lkw;
429 format difference time.;
430 run;

NOTE: Invalid argument to function INPUT at line 427 column 30.
NOTE: Invalid argument to function INPUT at line 427 column 77.

Kurt_Bremser
Super User

We need to explicitly set the delimiter for SCAN to the blank:

data have;
input arrival_date_time $20.;
datalines;
3/23/22 9:51
4/28/2022 17:39
5/3/2022 11:55
5/9/2022 21:39
5/15/2022 12:47
6/1/2022 6:39
6/8/2022 21:44
6/13/2022 10:33
6/27/2022 19:30
7/18/22 10:37
8/8/2022 15:09
;

data want;
set have;
arrival_date_time_num = dhms(input(scan(arrival_date_time,1," "),mmddyy10.),0,0,input(scan(arrival_date_time,2," "),time5.));
format arrival_date_time_num e8601dt19.;
run;

proc print data=want noobs;
var arrival_date_time_num;
run;

Result:

arrival_date_time_num
2022-03-23T09:51:00
2022-04-28T17:39:00
2022-05-03T11:55:00
2022-05-09T21:39:00
2022-05-15T12:47:00
2022-06-01T06:39:00
2022-06-08T21:44:00
2022-06-13T10:33:00
2022-06-27T19:30:00
2022-07-18T10:37:00
2022-08-08T15:09:00
desireatem
Pyrite | Level 9

data Bio1;
Set Bio;
*arrival_date_time_num=input(arrival_date_time,anydtdtm.);
arrival_date_time_num = dhms(input(scan(arrival_date_time,1," "),mmddyy10.),0,0,input(scan(arrival_date_time,2," "),time5.));
format arrival_date_time_num e8601dt19.;
run;

proc print data=bio1;
var arrival_date_time_num ;
run;

 

The SAS System

Obs arrival_date_time_num
1 .
2 .
3 .
4 .
5 .
6 .
7 .
8 .
9 .
10 .
11 .
Tom
Super User Tom
Super User

It does not help to show the results without showing the inputs.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 21 replies
  • 2013 views
  • 0 likes
  • 5 in conversation