BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
AyoSho
Obsidian | Level 7

Hello everyone,

I am working with a dataset and carried out difference in recorded dates using the intck function as below:

dif = intck ('day’, startdate, enddate, 'DISCRETE');

The sample result is shown below:

dif

frequency

percent

-6

18

0.03

-5

15

0.03

-4

20

0.04

-3

26

0.05

-2

28

0.05

-1

87

0.16

0

2078

3.85

1

4204

7.8

2

3690

6.84

3

3097

5.74

4

2707

5.02

5

2207

4.09

 

How do I interpret the negative results? I was not expecting negative values since startdate was placed before enddate in the syntax. Can it be an issue with how the dates were formatted? If so, how do I fix the issue regarding negative values since I intend to find average value for dif variable? An example of how startdate is formatted is 01JAN1960 while an example of how enddate is formatted is 04/06/2019. I will appreciate any suggestions/recommendations on how to make the code work. Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
SASKiwi
PROC Star

Dif is only negative for a small number of rows in your data. If appears you have a small data quality problem where around 50 rows have an ENDDATE that is earlier than STARTDATE. View those rows to understand what might be causing this. It's possible that they are simply errors but without understanding how you got your data it is hard to advise further. You could just remove those rows if they are genuinely bad data.

View solution in original post

11 REPLIES 11
Reeza
Super User

Formats shouldn't matter as long as both are date types. Both have a date format?

 

1960 versus 2019 makes me wonder if one has a datetime format versus a date?

 

 

 

 

 

AyoSho
Obsidian | Level 7
Hi Reeza,
Thanks for your reply.

startdate is formatted as DATE9.

enddate is formatted as MMDDYY10. and also has informat DATETIME22.3 noted
when I run proc content.

PaigeMiller
Diamond | Level 26

enddate is formatted as MMDDYY10. and also has informat DATETIME22.3 noted

 

Please look at the values of enddate with your own eyes. Do they look like dates around 1960? Do they look like the dates you are expecting?

--
Paige Miller
PaigeMiller
Diamond | Level 26

This is not a format problem. SAS always uses the un-formatted values to do this math, and so whatever formats are used are irrelevant. The interpretation is that some of your enddate values are before the corresponding startdate values.

--
Paige Miller
Tom
Super User Tom
Super User

The FORMAT attached to a variable are just the instructions for how to convert the value to text strings for printing.

 

What you need to be concerned about is the types of VALUES that the variables contain.  If the values look right with displayed with the DATE or MMDDYY (or DDMMYY) format then they contain DATE values, which are number of days since 1960.

 

In which case you can just use subtraction to get the number of days between two dates.

 

If you are seeing negative differences then the ENDDATE is BEFORE the STARTDATE.

 

Perhaps the different display format are confusing the humans that are trying to check if the date values were entered properly?  That is especially true if you are displaying values like  04/06/2019.  Is that April Sixth or the Fourth of June?

AyoSho
Obsidian | Level 7
Hi everyone,
I was hoping to get some help with interpreting the results from INTCK function. Below is the sample code I used for an analysis:

dif = INTCK('month', startdate, enddate, 'DISCRETE');
 
The sample result is shown here:
 
difFrequencyPercent
-620
-520
-420
-340.01
-2120.02
-1330.06
02220341.17
11316424.41
237446.94
329055.39
419473.61

Within the raw dataset, enddate for example is formatted as 01/10/2019 while startdate for example is formatted as 15MAR2018

How do I interpret the negative results as shown above?. I would expect that the results would be positive since enddate was not placed before startdate in the data step?.... 
 
What adjustments do I need to make to the data step? Can it be an issue with how the dates were formatted? 
 
I would appreciate any help/suggestions. Thanks everyone.
SASKiwi
PROC Star

Dif is only negative for a small number of rows in your data. If appears you have a small data quality problem where around 50 rows have an ENDDATE that is earlier than STARTDATE. View those rows to understand what might be causing this. It's possible that they are simply errors but without understanding how you got your data it is hard to advise further. You could just remove those rows if they are genuinely bad data.

AyoSho
Obsidian | Level 7
Thank you for your reply Saskiwi.
The dataset is from a national database with 53,000 observations. The
illustration which I posted in the question was a snippet of a section from
the results. The number of rows with negative results is 61 in total and a
visual inspection does show that enddate for some rows is earlier than the
startdate (example shown below):

enddate startdate
10/18/1985 6-Mar-18
Thanks for your help!
Patrick
Opal | Level 21

@AyoSho SAS has a concept of SAS Date values and SAS DateTime values.

SAS Date values are the count of days since 1/1/1960

SAS DateTime values are the count of seconds since 1/1/1960

 

Both of these values are stored in a SAS numerical data type. SAS Formats just make these number human readable but they don't have any effect on calculations/logic which uses the internal values (the counts).

 

A SAS INFORMAT is used to convert an external value - a text string - to the appropriate value for storing in a SAS variable.

What feels wrong: Informat DATETIME is used to convert a date string with a time component to a SAS DateTime value (=count of seconds) but format MMDDYY10. is used to make a SAS Date value (count of days) human readable. If you apply MMDDYY10. on a SAS DateTime value then you should see a line of ************ when printing.

 

And for any calculations/comparisons like with intck(): If you mix variable with Date and DateTime values then you certainly won't get correct results. You can convert a SAS variable that stores a SAS DateTime value to a SAS Date value using the DATEPART() function.

 

Hope these explanations will help you to investigate and resolve the challenge you're facing.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1869 views
  • 14 likes
  • 7 in conversation