SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Hi All, 

 

I have this longitudinal dataset with multiple observations per studyid: 

 

data have;
input studyID FSA $3. Exposure Age ;
cards;
1 B2Y 384 17
1 B2Y 384 17
1 B2Y 384 18
2 BgT 1000 15
3 M6D 400 14
3 M6D 400  15
3 M6D 400 16
run;

What I want to do is create a new variable called "Age_start" which is the lowest value for age for each indivdual in the study, so the "want" dataset would look like this: 

 

data want;
input studyID FSA $3. Exposure Age age_start ;
cards;
1 B2Y 384 17 17
1 B2Y 384 17 17
1 B2Y 384 18 17
2 BgT 1000 15 15
3 M6D 400 14 14
3 M6D 400  15 14
3 M6D 400 16 14
run;

As per usual, any thoughts would be hugely appreciated. I've been playing with it for a couple of hours using proc sort and then running variations of a data step that looks like 

data want;
set have;
age_start = first.age;
run; 

but all I get are zeros. 

 

Any thoughts are much appreciated. 

 

Thank you. 

 

Rightcoast

1 ACCEPTED SOLUTION

Accepted Solutions
kiranv_
Rhodochrosite | Level 12
data want;
set have;
by studyid;
retain age_start;
if first.studyID then 
age_start = age;
run;

View solution in original post

4 REPLIES 4
kiranv_
Rhodochrosite | Level 12
data want;
set have;
by studyid;
retain age_start;
if first.studyID then 
age_start = age;
run;
novinosrin
Tourmaline | Level 20
data have;
input studyID FSA $3. Exposure Age ;
cards;
1 B2Y 384 17
1 B2Y 384 17
1 B2Y 384 18
2 BgT 1000 15
3 M6D 400 14
3 M6D 400  15
3 M6D 400 16
run;


proc sql;
create table want as
select *, min(age) as age_start 
from have
group by studyid
order by studyid;
quit;
Tom
Super User Tom
Super User

FIRST. and LAST. variables are binary valued numbers that are generated to indicate if the current observation is the first or last in the current group defined by the values of the BY values upto and including the one you listed.

 

Looks like you have found a feature (A.K.A. bug) in SAS. The reference to FIRST.AGE should have generated an error since not only is AGE not in your list of BY variables but you don't even have any BY statement.  Instead it looks like SAS just treated it as always FALSE. Hence the constant value of zero.

 

Easy to do in SQL code since SAS will automatically remerge summary statistics for you.

proc sql;
  create table want as select *,min(age) as first_age from have group by studyid;
quit;

Or if the data is sorted (either by AGE or by DATE and your AGE variables are correct) within STUDYID then you can use real BY variable processing in a data step.

data want;
  set have;
  by studyid ;
  if first.studyid then first_age=age;
  retain first_age;
run;

Now if you have some missing values of AGE then there is a possibility that FIRST_AGE will be missing.  You could add this statement so that the first non-missing value is taken.

first_age=coalesce(first_age,age);

But then the earlier values will still have missing values.

You could also add what are called DOW loops.  Basically placing the SET/MERGE statement inside of DO loop.  In this case two DO loops. One to find the MIN and another to actually write the values.

data want;
do until (last.studyid);
  set have;
  by studyid;
  first_age=min(first_age,age);
end;
do until (last.studyid);
  set have;
  by studyid;
  output;
end;
run;

 

righcoastmike
Quartz | Level 8

Hi Tom, 

 

Thanks so much for your thoughtful response. Thankfully I don't have any missing data, but I'll file it away because I'm sure it will come up in the future. The clarification on the .first/.last variables was especially helpful because I thought they were a product of proc sort generally, not of the BY statement. I'm sure that will help my programming going forward. 

 

As always I'm super humbled and impressed by all the people willing to help out on here. 

 

Thanks so much, hopefully one day I can get skilled enough to pay it forward and help out someone else. 

 

Rightcoast

 

 

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2140 views
  • 3 likes
  • 4 in conversation