Hi guys,
suppose to have the following:
data DB;
input ID :$20. Admission :date09. Discharge :date09. Index Variable1 Variable2;
format Admission date9. Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 0 0
166 18MAR2019 25MAR2019 0 1 1
166 12APR2020 02JUN2020 0 0 0
170 22FEB2017 07MAR2017 1 0 0
170 22FEB2017 07MAR2017 0 1 1
170 30JAN2019 04MAR2019 0 0 0
313 03MAR2016 10MAR2016 1 0 0
313 03MAR2016 10MAR2016 0 1 1
313 12DEC2019 15DEC2019 0 0 0
215 22DEC2014 25DEC2014 1 1 1
;
I would like to "move" all values = 1 of Variable1 and Variable2 where Index = 1 to get the following:
data DB1;
input ID :$20. Admission :date09. Discharge :date09. Index Variable1 Variable2;
format Admission date9. Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 1 1
166 18MAR2019 25MAR2019 0 0 0
166 12APR2020 02JUN2020 0 0 0
170 22FEB2017 07MAR2017 1 1 1
170 22FEB2017 07MAR2017 0 0 0
170 30JAN2019 04MAR2019 0 0 0
313 03MAR2016 10MAR2016 1 1 1
313 03MAR2016 10MAR2016 0 0 0
313 12DEC2019 15DEC2019 0 0 0
215 22DEC2014 25DEC2014 1 1 1
;
To do this for one variable I do:
data db_temp;
retain max_index;
set DB;
by ID;
if first.ID then max_index=0;
max_index=max(max_index,Variable1);
if last.ID;
keep ID max_index;
run;
data DB1;
merge DB db_temp;
by ID;
if Index=1 then Variable1=max_index;
else if Variable1=max_index then Variable1=0;
run;
How if I need to run also on Variable2 and potentially VariablesN instead of one variable? How the code should be modified?
Thank you in advance
There needs to be a concrete rule before you can program it.
It sounds like perhaps you want the MAX values per ID to be attached to the observation where INDEX=1? If so just take the max and merge it back on. Drop the original variables so you can have the other values be zero.
proc sort data=db;
by id descending index;
run;
proc summary data=db ;
by id;
output out=summary(drop=_type_ _freq_) max(index variable1 variable2)=;
run;
data want ;
merge db(drop=variable1 variable2) summary;
by id descending index;
variable1+0;
variable2+0;
run;
If you want the cases that do not have an index record left alone then split the data up before merging it back together.
data db_index(drop=variable1 variable2) db_other;
merge db summary(keep=id index rename=(index=index2) where=(index2=1)) ;
by id;
if index2=1 then output db_index;
else output db_other;
drop index2;
run;
data want ;
merge db_other db_index(where=(index=1)) summary(where=(index=1));
by id descending index;
variable1+0;
variable2+0;
run;
But it sounds like a silly thing to do. Is there a reason that motivates this desire? Perhaps there is another way to do what you actually want to do if you could describe it.
There needs to be a concrete rule before you can program it.
It sounds like perhaps you want the MAX values per ID to be attached to the observation where INDEX=1? If so just take the max and merge it back on. Drop the original variables so you can have the other values be zero.
proc sort data=db;
by id descending index;
run;
proc summary data=db ;
by id;
output out=summary(drop=_type_ _freq_) max(index variable1 variable2)=;
run;
data want ;
merge db(drop=variable1 variable2) summary;
by id descending index;
variable1+0;
variable2+0;
run;
If you want the cases that do not have an index record left alone then split the data up before merging it back together.
data db_index(drop=variable1 variable2) db_other;
merge db summary(keep=id index rename=(index=index2) where=(index2=1)) ;
by id;
if index2=1 then output db_index;
else output db_other;
drop index2;
run;
data want ;
merge db_other db_index(where=(index=1)) summary(where=(index=1));
by id descending index;
variable1+0;
variable2+0;
run;
But it sounds like a silly thing to do. Is there a reason that motivates this desire? Perhaps there is another way to do what you actually want to do if you could describe it.
Calculate the max with PROC SUMMARY, load the result into a hash, and use it for the first observation.
Note that the names of the variables need to be determined first for use in the definition of the hash.
proc summary data=db nway;
by id notsorted;
var var:;
output out=maxed (drop=_type_ _freq_) max()=;
run;
proc sql noprint;
select quote(name) into :vars separated by ","
from dictionary.columns
where libname = "WORK" and memname = "DB" and upcase(name) like 'VAR%';
quit;
data want;
set db;
by id notsorted;
if _n_ = 1
then do;
declare hash m (dataset:"maxed");
m.definekey("id");
m.definedata(&vars.);
m.definedone();
end;
array var {*} var:;
if first.id
then i = m.find();
else do i = 1 to dim(var);
var{i} = 0;
end;
drop i;
run;
Assuming I understood what you mean.
data DB;
input ID :$20. Admission :date09. Discharge :date09. Index Variable1 Variable2;
format Admission date9. Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 0 0
166 18MAR2019 25MAR2019 0 1 1
166 12APR2020 02JUN2020 0 0 0
170 22FEB2017 07MAR2017 1 0 0
170 22FEB2017 07MAR2017 0 1 1
170 30JAN2019 04MAR2019 0 0 0
313 03MAR2016 10MAR2016 1 0 0
313 03MAR2016 10MAR2016 0 1 1
313 12DEC2019 15DEC2019 0 0 0
315 22DEC2014 25DEC2014 1 1 1
;
data want;
do until(last.id);
set DB;
by id;
if Variable1=1 then v1=1;
if Variable2=1 then v2=1;
end;
do until(last.id);
set DB;
by id;
if Index=1 then do;Variable1=coalesce(v1,0);Variable2=coalesce(v2,0);end;
else do;Variable1=0;Variable2=0;end;
output;
end;
drop v1 v2;
run;
Just based on your sample data and desired outcome the code could be as simple as below to generate the desired result.
data have;
input ID :$20. Admission :date09. Discharge :date09. Index Variable1 Variable2;
format Admission date9. Discharge date9.;
cards;
166 16FEB2019 26FEB2019 1 0 0
166 18MAR2019 25MAR2019 0 1 1
166 12APR2020 02JUN2020 0 0 0
170 22FEB2017 07MAR2017 1 0 0
170 22FEB2017 07MAR2017 0 1 1
170 30JAN2019 04MAR2019 0 0 0
313 03MAR2016 10MAR2016 1 0 0
313 03MAR2016 10MAR2016 0 1 1
313 12DEC2019 15DEC2019 0 0 0
215 22DEC2014 25DEC2014 1 1 1
;
data want_1;
set have;
array vars{*} Variable1 Variable2;
do i=1 to dim(vars);
vars[i]=index;
end;
drop i;
run;
proc print data=want_1;
run;
But then looking at your code it appears your real data also contains other cases that require additional logic. Something like below could work.
data want_2;
if _n_=1 then
do;
dcl hash h1();
h1.defineKey('id');
h1.defineDone();
do until(last);
set have end=last;
array vars[*] Variable1 Variable2;
if h1.check() ne 0 and max(of vars[*])=1 then h1.add();
end;
end;
set have;
do i=1 to dim(vars);
if index=1 then vars[i]= (h1.check()=0);
else vars[i]=0;
end;
drop i;
run;
proc print data=want_2;
run;
If none of the solutions provided work for your actual data then I suggest you provide amended sample data that includes the cases where things aren't working.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.