Hi there
Fairly new to SAS and would really appreciate help with a particular problem.
I have a dataset comprising multiple rows per person. For each person, I would like to transfer the maximum value from variable 1 (var1) onto all the other rows for that person and into a new variable (newvar).
e.g.
ID Var1
1 4
1 5
1 2
1 2
--
2 6
2 9
2 1
2 0
....would become.....
ID Var1 Newvar
1 4 5
1 5 5
1 2 5
1 2 5
--
2 6 9
2 9 9
2 1 9
2 0 9
Any advice much appreciated!
Thank you
This is easy to do in PROC SQL because SAS will automatically remerge summary statistics for you.
proc sql ;
create table want as
select *,max(var1) as newvar
from have
group by id
;
quit;
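You could also do it in a data step using a technique known as a DOW loop. The data must be sorted by ID.
data want ;
  /* First pass over the BY group: accumulate the maximum of var1. */
  /* newvar starts as missing for each group; MAX ignores missing arguments. */
  do until (last.id);
    set have;
    by id;
    newvar=max(newvar,var1);
  end;
  /* Second pass over the same group: write each row with newvar attached. */
  do until (last.id);
    set have;
    by id;
    output;
  end;
run;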
Hi,
This selects all the data, then left joins the max value per ID group.
proc sql;
  create table WANT as
  select A.*, B.MAX_VALUE
  from HAVE A
  left join (select ID, max(VAR1) as MAX_VALUE
             from HAVE
             group by ID) B
  on A.ID=B.ID;
quit;
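/* Sort so that the largest var1 comes first within each id. */
/* Note that this changes the row order within each id. */
proc sort data=have;
  by id descending var1;
run;
data want;
  set have;
  retain newvar;
  by id descending var1;
  /* The first row of each id holds the maximum; RETAIN carries it down. */
  if first.id then do;
    newvar=var1;
  end;
run;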
Thank you so much for your solutions. Your expertise is much appreciated!
I meant to say that I have hundreds of millions of observations so an efficient command will be the most useful. I will try these and see....
Thanks again
If you have hundreds of millions of observations, then perhaps you should be thinking about the big picture. It will take a long time to read every observation, find the max, and write out all the data again with the max attached. Then what? Will you read all those hundreds of millions of observations back in again, and to do what? Programs that are adequate for most applications with thousands of observations may not be adequate when you have hundreds of millions.
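One possible way to avoid rewriting the full table, offered only as a sketch: keep the per-ID maximum in a small summary table and attach it through a data step view, so the merge is evaluated only when a later step actually reads the data. This assumes HAVE is already sorted by ID. The names max_by_id and want_v are made up for illustration and do not come from the posts above. Whether this is faster in practice depends on what the downstream steps do, so it is worth timing on a subset first.

```sas
/* Assumption: HAVE is already sorted by id. */
/* Step 1: one row per id holding the maximum of var1 (small table). */
proc summary data=have;
  by id;
  var var1;
  output out=max_by_id (drop=_type_ _freq_) max=newvar;
run;

/* Step 2: a view, so nothing is rewritten to disk here. */
/* The merge runs only when want_v is read by a later step. */
data want_v / view=want_v;
  merge have max_by_id;
  by id;
run;
```

A later procedure can then read want_v as if it were a dataset, for example with data=want_v, and newvar will be attached to each row as it is read.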