Hi there
Fairly new to SAS and would really appreciate help with a particular problem.
I have a dataset comprising multiple rows per person. For each person, I would like to transfer the maximum value from variable 1 (var1) onto all the other rows for that person and into a new variable (newvar).
e.g.
ID Var1
1 4
1 5
1 2
1 2
--
2 6
2 9
2 1
2 0
....would become.....
ID Var1 Newvar
1 4 5
1 5 5
1 2 5
1 2 5
--
2 6 9
2 9 9
2 1 9
2 0 9
Any advice much appreciated!
Thank you
This is easy to do in PROC SQL because SAS will automatically remerge summary statistics for you.
proc sql;
  create table want as
  select *, max(var1) as newvar
  from have
  group by id;
quit;
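You could also do it in a data step using a technique known as a DOW loop. The data must be sorted by ID.
data want;
  /* First pass over the ID group: find the maximum of var1 */
  do until (last.id);
    set have;
    by id;
    newvar = max(newvar, var1);
  end;
  /* Second pass over the same group: write each row with newvar attached */
  do until (last.id);
    set have;
    by id;
    output;
  end;
run;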
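To try either approach on the example from the question, a small test dataset could be built along these lines. This is only a sketch: it assumes the input dataset is called HAVE, as in the code above, and the values are the sample rows from the question.

```sas
/* Sample rows from the question, loaded into a dataset named HAVE */
data have;
  input id var1;
  datalines;
1 4
1 5
1 2
1 2
2 6
2 9
2 1
2 0
;
run;
```

The rows are already in ID order here, so the DOW loop version can be run on it without sorting first.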
Hi,
This selects all the data, then left joins the max value per ID group.
proc sql;
  create table WANT as
  select A.*,
         B.MAX_VALUE
  from HAVE A
  left join (select ID,
                    max(VAR1) as MAX_VALUE
             from HAVE
             group by ID) B
  on A.ID=B.ID;
quit;
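/* Sort so the largest var1 comes first within each ID (this reorders HAVE in place) */
proc sort data=have;
  by id descending var1;
run;
data want;
  set have;
  by id descending var1;
  retain newvar;
  /* The first row of each ID holds the maximum; retain carries it down the group */
  if first.id then newvar=var1;
run;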
Thank you so much for your solutions. Your expertise is much appreciated!
I meant to say that I have hundreds of millions of observations so an efficient command will be the most useful. I will try these and see....
Thanks again
If you have hundreds of millions of observations, then perhaps you should be thinking about the big picture. It will take a long time to read every observation, find the max, and write out all the data again with the max attached. Then what? Will you read all those hundreds of millions of observations back in again, and to do what? Programs that are adequate for most applications with thousands of observations may not be adequate when you have hundreds of millions.