Hi! I have a question regarding how to indicate a value is repeated in a data that's long format.
Example:
ID name_of_former_variable ageofonset numberoftraumas
1 ageonsettrauma14 12 1
1 ageonsettrauma7 13 2
2 ageonsettrauma6 8 1
2 ageonsettrauma7 8 2
2 ageonsettrauma16 8 3
2 ageonsettrauma18 8 4
2 ageonsettrauma20 9 5
3 ageonsettrauma4 5 1
3 ageonsettrauma18 5 2
So basically, numberoftraumas is a counter variable to see the number of traumas each ID endorsed.
I am interested in earliest age of onset (so for ID # 1, it is 12, for ID #2 it is 8). I also want to know
the number of traumas endorsed (So for ID #1 it is 2, for ID #2 it is 5). I can get these by splitting these
into two data sets and merging the largest "numberoftraumas" variable with the smallest "ageofonset"
THen I was able to use a substring function to indicate which trauma # is associated with the earliest
age of onset. But I'd like to indicate if there is a tie. So for example, for ID #2, the earliest age of
onset is 8. However, there are multiple traumas endorsed at age 8. I want to create a variable which
indicates if there are ties, specifically, the first two traumas (because I want to indicate that they had
multiple traumas at that earliest age of onset, if that makes sense).
I can't just use the largest "numberoftraumas' to indicate this because, as you can see with ID #1,
even though they had two traumas, they occurred at two different ages.
So if subject has multiple traumas occurring at the earliest age of trauma, then their 'firsttrauma'=99
Otherwise, it is the number of the trauma.
So for ID1, their earliest age of trauma was trauma#14 and occured at age 12.
No other trauma was endorsed at age 12, so their firsttrauma variable is 14.
But for ID2, they had 4 traumas at age 8, and one trauma at age 9, so
I don't care which traumas they experienced at age 8, I just want to indicate
that there was more than one experienced at age 8, so their firsttrauma=99.
ID3 had 2 traumas at age 5, so their firsttrauma=99.
So this is what I will see:
ID name_of_former_variable ageofonset numberoftraumas firsttrauma
1 ageonsettrauma14 12 1 14
2 ageonsettrauma20 8 5 99
3 ageonsettrauma18 5 2 99
But ultimately, I don't care about the name_of_former_variable so I'd delete that and want to see this:
ID ageofonset numberoftraumas firsttrauma
1 12 1 14
2 8 5 99
3 5 2 99
Thank you so much!! Sorry for the long explanation. I wasn't sure how to find this online , I couldn't find anything in my searches ![]()
Last time I asked a question someone requested for more details so I thought I'd provide more. ![]()
I'd guess there was an easier way to do this with the original data or by merging it in, but I'll provide a data step solution based on your data.
My conversion to number may be off, I always get put/input wrong.
Untested.
data want;
set have;
by id;
retain first_age firsttrauma;
if first.id then do;
firsttrauma=input(compress(name, 'kd'), 2.);
first_age=ageofonset;
end;
if not first.id and ageofonset=first_age then firsttrauma=99;
if last.id then output;
end;
I'd guess there was an easier way to do this with the original data or by merging it in, but I'll provide a data step solution based on your data.
My conversion to number may be off, I always get put/input wrong.
Untested.
data want;
set have;
by id;
retain first_age firsttrauma;
if first.id then do;
firsttrauma=input(compress(name, 'kd'), 2.);
first_age=ageofonset;
end;
if not first.id and ageofonset=first_age then firsttrauma=99;
if last.id then output;
end;
Hi Reeza!
Thank you so much. The code worked
I just had to change input to put. You always are so fast to respond and help me, which I really appreciate. I am, unfortunately, not the best programmer, but try to google things before I come here and ask you guys. If you have any advice on how I can learn little tricks like the one you showed me, I'd really appreciate it! Just simple things like using the retain, first.variable types of statements.
Thanks again!!!!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.