Hi all, I have a longitudinal data set where the above dates are nested by household and person. The above are dates for one person and one household. I would like to compute the difference between two subsequent dates without creating a lagged variable. Is this possible? Thanks and all help is appreciated.
It's easy, but you are missing some of the description. When should the variable not be calculated? When beginning a new household? When beginning a new member of the household?
Here's an approach that assumes you should get no differences whenever the member of the household changes:
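proc sort data=have;
  by household member date;
run;

data want;
  set have;
  by household member;
  /* DIF runs on every observation so its queue stays in step with the data */
  days_difference = dif(date);
  /* The first row of each member has no earlier date to compare against */
  if first.member then days_difference = .;
run;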
Similar to the LAG function, the DIF function must execute on every observation in order to get the right answer.
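To illustrate why DIF has to run on every observation, here is a minimal sketch on made-up data (the values and the names COMPARE, RIGHT and WRONG are only for illustration). It puts the unconditional call next to a conditional one:

```sas
/* Made-up sample: two members of one household */
data have;
  input household member date :date9.;
  format date date9.;
  datalines;
1 1 01JAN2005
1 1 11JAN2005
1 1 31JAN2005
1 2 01MAR2005
1 2 06MAR2005
;
run;

data compare;
  set have;
  by household member;
  /* Unconditional call, then blank out the first row of each member */
  right = dif(date);
  if first.member then right = .;
  /* Conditional call: the queue never sees the first row of a member */
  if not first.member then wrong = dif(date);
run;

proc print data=compare;
run;
```

On this sample RIGHT should come out as ., 10, 20, ., 5. WRONG should come out as ., ., 20, ., 34, because the skipped rows never enter the queue, so the last row is compared against the previous member's last date.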
Here's more of the data set. I would like to compute the difference for each unique basebrgy (household) and basewman (person) combination. And if there is no date, I would like it to not compute anything.
Then the code I gave you is the right approach. You just need to change the variable names.
Also, DATE cannot be a character string. DATE has to be a numeric variable stored on SAS's usual scale for storing dates.
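If the dates did arrive as character strings, a conversion along these lines might work. This is only a sketch: DATE_CHAR is a hypothetical name for the character variable, and the DATE9. informat assumes values that look like 15MAR2005, so adjust the informat to match the actual text.

```sas
data have_numeric;
  set have;
  /* date_char is hypothetical; date9. assumes text such as 15MAR2005 */
  date = input(date_char, date9.);
  /* Display the numeric date in a readable form */
  format date date9.;
run;
```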
I have a different question, but it relates to the same dataset. I am trying to create a single ID variable that identifies each household and person combination. This is the code I have currently, which sort of gets at what I am trying to do:
data subset2;
set subset2;
by basebrgy basewman;
ID + 1;
if first.basebrgy OR first.basewman then ID = 1;
run;
Here is what the output looks like:
For the ID variable, I'd like it to be a column of 1's for basebrgy = 1/basewman = 12 and a column of 2's for basebrgy = 1/basewman = 14. Hope I was clear enough. Thanks for any help.
Not sure I understand, but wouldn't this simply be:
data subset2;
set subset2;
if basebrgy = 1 and basewman = 12 then ID=1;
else if basebrgy = 1 and basewman = 14 then ID=2;
else;
run;
This works but subset2 is a subset of the much larger dataset. I would like to automate the process instead of doing it manually.
You're playing with the right tools, but you need a slightly different combination:
data want;
  set have;
  by basebrgy basewman;
  /* Sum statement: ID is retained across rows and goes up once per new person */
  if first.basewman then ID + 1;
run;
This is definitely a topic worth spending time on to understand. Many DATA steps rely on BY-group processing and the FIRST. and LAST. variables it creates.
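For anyone who wants to try this on a small scale, here is a self-contained sketch with made-up rows (the sample values are assumptions, not the real data):

```sas
/* Made-up sample: two persons in household 1, one person in household 2 */
data have;
  input basebrgy basewman;
  datalines;
1 12
1 12
1 14
1 14
2 12
;
run;

data want;
  set have;                 /* rows are already sorted by basebrgy basewman */
  by basebrgy basewman;
  /* first.basewman is also 1 when basebrgy changes, so every new
     household/person combination bumps the counter */
  if first.basewman then ID + 1;
run;

proc print data=want;
run;
```

On this sample ID should be 1, 1, 2, 2, 3. The sum statement (ID + 1) matters: it retains ID from row to row and starts it at 0, whereas a plain assignment such as ID = 1 would be reset to missing on every row where the condition is false.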
I tried your code but it only produced 1 for the first observations of each combination. I would like 2 for the second combination. Any ideas?
The code worked just fine for me. Can you post the log from your attempt?
1867 data subset2;
1868 set subset2;
1869 by basebrgy basewman;
1870
1871 if first.basewman then ID = 1;
1872 run;
NOTE: There were 18 observations read from the data set WORK.SUBSET2.
NOTE: The data set WORK.SUBSET2 has 18 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Here's the output:
Never mind, I figured it out. I had written ID = 1 instead of ID + 1.