mvhoya
Obsidian | Level 7

Hi SAS Communities,

 

I would like to create a new variable called duration that gives the total number of months an individual (ID) was in a health program, based on a variable called visit_date (a numeric variable stored as a SAS date). My data is currently in long format, so each ID has one row per visit, each with a value for visit_date.

 

My dataset is quite large and each ID has a varying # of visits, but the variables mentioned above are structured as follows:

ID1    3/15/2020
ID1    6/27/2020
ID1    01/25/2021
ID2    3/02/2020
ID2    09/15/2020

 

I've used arrays in the past to calculate duration of time, but I've only done that with data in wide format. How would I go about accomplishing this task? Thank you in advance. 

ballardw
Super User

It would help to show the output you expect for the given input, and to say exactly which duration you mean: between consecutive records, or overall.

 

One of the nice things about SAS dates, once you understand that the units are days, is that the statistical functions can do some interesting things.

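Proc summary data=have nway;
   class id;          /* CLASS (not ID) produces one output row per id */
   var visit_date;    /* the SAS date variable */
   output out=work.summary range=daterange;   /* range = max - min, in days */
run;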

The above calculates the range of values for each ID, that is, the largest (latest) date minus the smallest (earliest) date. The value is in days. For "months" you would have to provide a definition, since the number of days in a month varies. You will get 0 for any ID with only one record.
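As a rough sketch of the above, the range in days can be turned into months once a definition is chosen. The example below assumes one month equals 365.25 / 12 days, which is only one possible convention; the dataset `have`, the output names, and the variable `approx_months` are hypothetical and simply mirror the layout in the question.

```sas
/* Hypothetical sample data matching the layout in the question */
data have;
   input id $ visit_date :mmddyy10.;
   format visit_date mmddyy10.;
   datalines;
ID1 3/15/2020
ID1 6/27/2020
ID1 01/25/2021
ID2 3/02/2020
ID2 09/15/2020
;
run;

/* Range (latest minus earliest visit_date) per ID, in days */
proc summary data=have nway;
   class id;
   var visit_date;
   output out=work.summary (drop=_type_ _freq_) range=daterange;
run;

/* Assumed definition: one month = 365.25 / 12 days (about 30.44) */
data work.summary_months;
   set work.summary;
   approx_months = daterange / (365.25 / 12);
run;
```

This gives a fractional month count. If whole calendar months are wanted instead, INTCK is probably the better tool.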

 

Or you could use a data step with BY-group processing on the ID variable: retain the date from the first record of each ID and, when you reach the last record of that ID, use the INTCK function to compute the duration.

mvhoya
Obsidian | Level 7
For the output, I expect to have the variable 'duration' show the number of months for each ID. Using a data step is more aligned with what I would like to achieve since I would like to use the duration variable later in my analysis. Would you kindly show an example of code of the data step explanation you provided at the end of your response?
Kurt_Bremser
Super User

Use a retained variable:

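/* Assumes have is sorted by id, with dates in chronological order within each id */
data want;
set have;
by id;
retain start;                                  /* carry the first date across rows */
if first.id then start = visit_date;           /* earliest visit for this id */
if last.id;                                    /* keep only the last row per id */
duration = intck('month',start,visit_date);    /* month boundaries crossed */
keep id duration;
run;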
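One thing worth knowing about the step above: by default INTCK counts the month boundaries crossed, not full months elapsed. The sketch below, which uses hypothetical sample data mirroring the question, computes both the default (discrete) count and the continuous count obtained with the optional fourth argument 'C'. The variable names `months_discrete` and `months_continuous` are illustrative only.

```sas
/* Hypothetical sample data matching the layout in the question */
data have;
   input id $ visit_date :mmddyy10.;
   format visit_date mmddyy10.;
   datalines;
ID1 3/15/2020
ID1 6/27/2020
ID1 01/25/2021
ID2 3/02/2020
ID2 09/15/2020
;
run;

/* BY-group processing needs sorted input */
proc sort data=have;
   by id visit_date;
run;

data want;
   set have;
   by id;
   retain start;
   if first.id then start = visit_date;
   if last.id;
   /* default (discrete) method: number of month boundaries crossed */
   months_discrete   = intck('month', start, visit_date);
   /* continuous method: number of full months elapsed since start */
   months_continuous = intck('month', start, visit_date, 'c');
   keep id months_discrete months_continuous;
run;

/* The two methods can differ: 31JAN2020 to 01FEB2020 crosses one
   month boundary but spans less than one full month */
data _null_;
   d = intck('month', '31JAN2020'd, '01FEB2020'd);
   c = intck('month', '31JAN2020'd, '01FEB2020'd, 'c');
   put d= c=;   /* writes d=1 c=0 to the log */
run;
```

For the sample rows both methods agree (10 months for ID1, 6 for ID2). Which one fits depends on how "months in the program" is defined for the analysis.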
ChrisNZ
Tourmaline | Level 20

So you just want to use the minimum and maximum date for each ID?

mvhoya
Obsidian | Level 7
I want to calculate the duration in months that each ID spent in the program. Since the dates are in chronological order for each ID, I can use the first and last date for each ID.
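If the chronological order could not be relied on, the minimum and maximum date per ID could be taken directly instead. A minimal sketch with PROC SQL follows, assuming the same hypothetical `have` layout (rows deliberately out of order here); `want_sql` is an illustrative name.

```sas
/* Hypothetical sample data, intentionally not in chronological order */
data have;
   input id $ visit_date :mmddyy10.;
   format visit_date mmddyy10.;
   datalines;
ID1 01/25/2021
ID1 3/15/2020
ID1 6/27/2020
ID2 09/15/2020
ID2 3/02/2020
;
run;

/* One row per ID: months between the earliest and latest visit */
proc sql;
   create table want_sql as
   select id,
          intck('month', min(visit_date), max(visit_date)) as duration
   from have
   group by id;
quit;
```

This does not require sorting and returns one row per ID, so the result can be merged back onto the long data if duration is needed on every visit row.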
