BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JianShen
Obsidian | Level 7

Hi All,

 

I have a simple question I think, yet I just can't seem to find the answer when I try to google it (I'm sure it's because I just can't find the correct search terms).

 

I'm currently trying to calculate the difference in years within a group. Example of my current data:

 

Group    Date

 

A            1990

A            1991

A            1992

B            2001

B            2002

 

 

I've noticed that, if I calculate the difference in years, I get the difference in days (I think). For example, if the difference is 2002 - 2001, then I get 365 as a result. What I would like is to get the difference in years, so 2002 - 2001 = 1.

 

This is the code I currently have:

 

proc sql;

create table work.test_date_diff as

select  (max(year) - min(year)) as diffyear format year4.

from work.test_year

group by group;

quit;

As such, I was wondering just what it is that I'm missing.

 

I hope someone can tell me what I'm doing wrong or link me to an example!

 

Many thanks in advance!

 

Jian

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Are your date variable value in years?

If yes, are you after this?

Plus I am not sure why do you want to FORMAT the difference in years with a YEAR4 format that is used to display year component of a SAS date



data have;
input group $ date;
cards;
A            1990
A            1991
A            1992
B            2001
B            2002
;

data want;
 set have;
 by group;
 dif_years=dif(date);
 if first.group then call missing(dif_years);
run;

 

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

Are your date variable value in years?

If yes, are you after this?

Plus I am not sure why do you want to FORMAT the difference in years with a YEAR4 format that is used to display year component of a SAS date



data have;
input group $ date;
cards;
A            1990
A            1991
A            1992
B            2001
B            2002
;

data want;
 set have;
 by group;
 dif_years=dif(date);
 if first.group then call missing(dif_years);
run;

 

novinosrin
Tourmaline | Level 20

Hello again @JianShen   I have a feeling the simple correction in your code would be is to remove the format year4.

proc sql;

create table work.test_date_diff as

select  (max(year) - min(year)) as diffyear 

from work.test_year

group by group;

quit;
ed_sas_member
Meteorite | Level 14

Hi @JianShen 

Assuming the variable YEAR is a SAS date, you can use the INTCK function:

proc sql;
create table work.test_date_diff as
select  intck('year',min(year),max(year)) as diffyear
from work.test_year
group by group;
quit;

 If it is just a numeric value (not a SAS date), the solution with dif(date) should work.

Kurt_Bremser
Super User

If you have date values (as it appears), use the intck() function:

intck('year',min(year),max(year)) as diffyear

Full code with example data:

data have;
input year :yymmdd10.;
format year year4.;
datalines;
2019-12-01
2018-10-13
2005-05-03
;

proc sql;
create table want as
select intck('year',min(year),max(year)) as diffyear
from have;
quit;

If that does not do it, we need to know what your data looks like. Post a data step with datalines that reflects your dataset.

Tom
Super User Tom
Super User

If you see that difference between two values as 365 then perhaps the values are dates instead of years? Note that attaching a format to variable does NOT change the value stored in that variable. So if you had date values and attached the YEAR format to the variable then only the year number will display, but that does not change what is stored.

 

Are your date values always for the same day within the year? Like January first? If not then what do you want the difference in years to be when the day of the year is not the same between the two dates being compared?  The INTCK() function will count the number of boundaries crossed.

 

But perhaps you just want to take the difference between the year number, independent of when in the year the date was.  Make sure to include you grouping variable in the output dataset, otherwise the data will be difficult to interpret. Why not keep that actual first and last day as variables also.

proc sql;
create table work.test_date_diff as
  select group
       , min(year) as first_date format=date9.
       , max(year) as last_date format=date9.
       , (year(max(year)) - year(min(year))) as year_range format=4.
  from work.test_year
  group by group
;
quit;

 

PaigeMiller
Diamond | Level 26

I've noticed that, if I calculate the difference in years, I get the difference in days (I think). For example, if the difference is 2002 - 2001, then I get 365 as a result. What I would like is to get the difference in years, so 2002 - 2001 = 1.

 

You could divide by 365 and take the integer value. Or use the INTCK function or the YRDIF function to account for leap years.

--
Paige Miller
Astounding
PROC Star

As others have noted, you have a variable named YEAR, but it does not contain year information.  It contains specific date information.  So you could divide the difference by 365 (or possibly by 365.25) to convert number of days to number of years.  But if you want an integer as the output for number of years, you will need to make decisions about some cases, such as how many years do these cases represent:

 

Min:  December 30, 2018

Max:  January 2, 2019

 

Min:  December 30, 2018

Max:  December 29, 2019

 

Min:  December 30, 2018

Max:  March 1, 2019

 

Min:  December 30, 2018

Max:  November 1, 2019

 

Min:  December 30, 2018

Max:  December 31, 2019

 

You will need the rules spelled out, or else any and all program suggestions are just guesses.

JianShen
Obsidian | Level 7

Hi All,

 

Thank you for all of the quick and helpful replies! 

 

It is indeed as @novinosrin and @ed_sas_member noticed and mentioned. My "dates" aren't really dates, but are numeric values and the dif(date) option is working wonders! I made it too complicated for myself by trying to make it into SAS dates and using that (I feel like hitting myself now...).

 

Many thanks to all of you for helping and clarifying this! The other options also helped with a different year question that I was also looking at!

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3839 views
  • 0 likes
  • 7 in conversation