BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SAS_Novice22
Quartz | Level 8

Hi SAS Community,

 

I am looking for advise on an issue I am having trying to create a duration variable from two start and end dates/times. I would like to derive this variable and pull basic descriptive data such as mean (SD), median, min max etc. I was previously doing this in excel and did not have any problems (cell D - cell C), however I am experiencing a hard time in SAS given I am a very new user and I am teaching myself (or at least trying to).

 

Here is my SAS code:

*/ creating new variable for DISEASE attack duration  - NOT WORKING /*;
TITLE "DISEASE attack duration - new variable";
DATA WORK.DISEASE_ERT;
	SET DISEASE_ERT;
	DURATION=DISEASEAD2-DISEASEAD1;  /*evaluate attack duration and derive a new variable*/
RUN;

Here is what my data look like:

SAS_Novice22_0-1654618177661.png

The SAS Log is really long so I will avoid including it here and also because it has patients numbers listed. But generally it states this multiple times:

NOTE: Invalid numeric data, DISEASEAD2='04APR2020 22:00' , at line 950 column 14.
NOTE: Invalid numeric data, DISEASEAD1='03APR2020 19:00' , at line 950 column 21.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.

NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 669 at 950:20 NOTE: There were 669 observations read from the data set WORK.DISEASE_ERT. NOTE: The data set WORK.DISEASE_ERT has 669 observations and 56 variables. NOTE: DATA statement used (Total process time): real time 0.17 seconds cpu time 0.14 seconds

Any assistance would be greatly appreciated. I have a feeling this is because time and date are combined the dataset but I am not sure how to make these separate and still retrieve a duration variable?

 

Thank you in advance.

T.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

To bring data from an existing data set you should have something like:

data new;
   set old;

So the name of your existing data set should be in the place of OLD in the above.

You would not include my examples of converting two existing values.

So maybe:

data new;
set DISEASE_ERT; dt1= input(DISEASEAD1,datetime19.); dt2= input(DISEASEAD2,datetime19.); duration = dt2 - dt1; format dt1 dt2 datetime19. duration time12.; put (_all_) (=/); run;

 You really want to create a new data set so that if you have a logic problem you do not replace existing values,or delete variables or records  accidentally. With a new data set you still have the old one for base values when you find that you need to modify something.

View solution in original post

8 REPLIES 8
Tom
Super User Tom
Super User

If you want to work with DATETIME values then they need to be NUMBERS and not CHARACTER strings.

You can use the INPUT() function with the DATETIME informat to convert those strings into numbers that represent the number of seconds since 1960.  Then if you subtract them you get the number of seconds between the two datetime values.  You can use the TIME format to display the seconds in a way that humans will recognize as hours minutes and seconds.

320  data test;
321    DISEASEAD2='04APR2020 22:00';
322    DISEASEAD1='03APR2020 19:00';
323    dt1= input(DISEASEAD1,datetime19.);
324    dt2= input(DISEASEAD2,datetime19.);
325    duration = dt2 - dt1;
326    format dt1 dt2 datetime19. duration time12.;
327    put (_all_) (=/);
328  run;


DISEASEAD2=04APR2020 22:00
DISEASEAD1=03APR2020 19:00
dt1=03APR2020:19:00:00
dt2=04APR2020:22:00:00
duration=27:00:00

PS Avoid overwriting your input dataset when running a data step.  If the code is messed up you will have to go back and re-run whatever code created the original dataset.

ballardw
Super User

The invalid data messages are because your variables are not date, time or datetime values but character. So the first step is going to be to create datetime values.

 

An important question is what unit do you want your duration to be reported in? By default SAS datetime values will be numbers of seconds and the subtraction approach you use would result in seconds. Do you want seconds for units?

 

data example;
   DISEASEAD2='04APR2020 22:00';
   DISEASEAD1='03APR2020 19:00';
   /* SAS datetime values*/
   dd2 = input(DISEASEAD2,anydtdtm30.);
   dd1 = input(DISEASEAD1,anydtdtm30.);
   /* format so human readable*/
   format dd2 dd1 datetime20.;
   seconds = dd2-dd1;
minutes = intck('minute',dd1,dd2,'c'); hours = intck('hour',dd1,dd2,'c'); days = intck('dtday',dd1,dd2,'c'); run;

Since Excel does datetime values with the time as a fractional day you possibly want to convert one of the units such as seconds, minutes or hours into fractional days to get comparable numbers.

The INTCK function returns number of time interval boundaries crossed.

https://communities.sas.com/t5/SAS-Communities-Library/Working-with-Dates-and-Times-in-SAS-Tutorial/... has a PDF with much information about SAS date, time and datetime.

SAS_Novice22
Quartz | Level 8

Hi @Tom and @ballardw ,

 

Thank you for the information and your replies. I have tried both codes and I am still doing something wrong but I am not sure exactly what.

Here is the SAS codes I have tried:

data DISEASE_ERT;
   /* SAS datetime values*/
   dd2 = input(DISEASEAD2,anydtdtm30.);
   dd1 = input(DISEASEAD1,anydtdtm30.);
   /* format so human readable*/
   format dd2 dd1 datetime20.;
   seconds = dd2-dd1;
   minutes = intck('minute',dd1,dd2,'c');
   hours = intck('hour',dd1,dd2,'c');
   days  = intck('dtday',dd1,dd2,'c');
run;

and also ...

data DISEASE_ERT;
dt1= input(DISEASEAD1,datetime19.);
dt2= input(DISEASEAD2,datetime19.);
duration = dt2 - dt1;
format dt1 dt2 datetime19. duration time12.;
put (_all_) (=/);
run;

Here is my SAS Log:

1124  data DISEASE_ERT;
1125     /* SAS datetime values*/
1126     dd2 = input(DISEASEAD2,anydtdtm30.);
1127     dd1 = input(DISEASEAD1,anydtdtm30.);
1128     /* format so human readable*/
1129     format dd2 dd1 datetime20.;
1130     seconds = dd2-dd1;
1131     minutes = intck('minute',dd1,dd2,'c');
1132     hours = intck('hour',dd1,dd2,'c');
1133     days  = intck('dtday',dd1,dd2,'c');
1134  run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
      1126:16   1127:16
NOTE: Variable DISEASEAD2 is uninitialized.
NOTE: Variable DISEASEAD1 is uninitialized.
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      1 at 1130:17   1 at 1131:14   1 at 1132:12   1 at 1133:12
NOTE: The data set WORK.DISEASE_ERT has 1 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

Apologies, I am on a huge learning curve and I am trying not to give up.

 

Thanks in advance,

T.

Tom
Super User Tom
Super User

Make sure to add the SET statement to read in your existing dataset.  Since you did not read in your current data when you tried to reference DISEASEAD1 without having even mentioned it before SAS defined it as numeric.  That explains the notes.

 

Also if your stings are consistently in a style that the DATETIME informat can read it is better to use that than one of the ANYDT... informats as those will try to GUESS what style your strings are in.

 

 

SAS_Novice22
Quartz | Level 8
Thanks Tom, that was exactly what I was missing. Added it in and the code worked. Tried out both examples provided and they work.
Thank you again for being super helpful!
T.
ballardw
Super User

To bring data from an existing data set you should have something like:

data new;
   set old;

So the name of your existing data set should be in the place of OLD in the above.

You would not include my examples of converting two existing values.

So maybe:

data new;
set DISEASE_ERT; dt1= input(DISEASEAD1,datetime19.); dt2= input(DISEASEAD2,datetime19.); duration = dt2 - dt1; format dt1 dt2 datetime19. duration time12.; put (_all_) (=/); run;

 You really want to create a new data set so that if you have a logic problem you do not replace existing values,or delete variables or records  accidentally. With a new data set you still have the old one for base values when you find that you need to modify something.

SAS_Novice22
Quartz | Level 8
Worked like a charm, I was missing the SET! Thank you for your help. Reading this is appears that the output is in hours and I can modify to fraction of days using a formatting code or deriving a new variable like hours/24 hours to get days?

Thanks again!
T.
tarheel13
Rhodochrosite | Level 12

datetime is number of seconds in SAS. to get days, you need to divide by 86,400

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 840 views
  • 5 likes
  • 4 in conversation