Hi,
I am struggling to retain a value of "Y" for all observations by ID if the value "Y" exists for any record within the same ID:
For example, my data looks like this:
ID | indicator1 | indicator2 | indicator3 | indicator4 | indicator5 |
001 | Y | ||||
001 | Y | ||||
001 | Y | ||||
001 | Y | ||||
002 | Y | ||||
002 | Y | ||||
002 | Y | ||||
002 | |||||
002 | Y |
but I want it to look like this within the indicatorXnew fields:
ID | indicator1 | indicator2 | indicator3 | indicator4 | indicator5 | indicator1new | indicator2new | indicator3new | indicator4new | indicator5new |
001 | Y | Y | Y | Y | Y | |||||
001 | Y | Y | Y | Y | Y | |||||
001 | Y | Y | Y | Y | Y | |||||
001 | Y | Y | Y | Y | Y | |||||
002 | Y | Y | Y | Y | Y | |||||
002 | Y | Y | Y | Y | Y | |||||
002 | Y | Y | Y | Y | Y | |||||
002 | Y | Y | Y | Y | ||||||
002 | Y | Y | Y | Y | Y |
I have some familiarity with RETAIN but only for summing numeric values. I would share code but nothing I try works correctly, but here is the code to generate data set HAVE:
data HAVE;
infile datalines dlm='|' truncover;
input ID$ indicator1$ indicator2$ indicator3$ indicator4$ indicator5$;
datalines;
001|| ||Y|| || || ||
001||Y|| || || || ||
001|| || ||Y|| || ||
001|| || || || || Y
002||Y|| || || || ||
002|| || ||Y|| || ||
002|| ||Y|| || || ||
002|| || || || || ||
002|| || || ||Y|| ||
;run;
I would appreciate any help, thank you
Here's a variation that is such an odd duck, you might not see it again:
data want;
do until (last.id);
update have (obs=0) have;
by id;
end;
indicator1new = indicator1;
indicator2new = indicator2;
indicator3new = indicator3;
indicator4new = indicator4;
indicator5new = indicator5;
do until (last.id);
set have;
by id;
output;
end;
run;
Here you go.
data have;
infile datalines dlm='|' truncover;
input ID $ (indicator1 indicator2 indicator3 indicator4 indicator5) (:$1.);
datalines;
001|| ||Y|| || || ||
001||Y|| || || || ||
001|| || ||Y|| || ||
001|| || || || || Y
002||Y|| || || || ||
002|| || ||Y|| || ||
002|| ||Y|| || || ||
002|| || || || || ||
002|| || || ||Y|| ||
;
proc sort data=have;
by id;
run;
data inter;
set have;
by id;
array ind{*} indicator1-indicator5;
array new_ind{*} $1 new_indicator1-new_indicator5;
retain new_ind;
keep id new_indicator1-new_indicator5;
do i=1 to dim(new_ind);
if ind[i]='Y' then new_ind[i]='Y';
end;
if last.id then
do;
output;
call missing(of new_ind[*]);
end;
run;
data want;
merge have inter;
by id;
run;
And here an alternative coding option.
data want2;
set have;
by id;
array ind{*} indicator1-indicator5;
array new_ind{*} $1 new_indicator1-new_indicator5;
retain new_ind;
if first.id then
do;
call missing(of new_ind[*]);
do until(last.id);
set have;
by id;
do i=1 to dim(new_ind);
if ind[i]='Y' then new_ind[i]='Y';
end;
end;
end;
drop i;
run;
Collapse to one record per BY group. If your variables are really coded as Y or missing then you could use UPDATE statement to do the collapsing. The UPDATE statement will ignore the missing values when applying the transaction observations.
data summary;
update have(obs=0) have;
by id;
run;
If you then need those variable renamed and remerged onto the detailed records use a MERGE statement. If you have a series of variables with a numeric counter it works much better to have the counter as a suffix so that you can use a variable list.
data want;
merge have summary(rename=(indicator1-indicator5=new_indicator1-new_indicator5));
by id;
run;
data have; infile datalines dlm='|' truncover; input ID $ (indicator1 indicator2 indicator3 indicator4 indicator5) (:$1.); datalines; 001|| ||Y|| || || || 001||Y|| || || || || 001|| || ||Y|| || || 001|| || || || || Y 002||Y|| || || || || 002|| || ||Y|| || || 002|| ||Y|| || || || 002|| || || || || || 002|| || || ||Y|| || ; proc sql; create table want as select *, max(indicator1) as new1, max(indicator2) as new2, max(indicator3) as new3, max(indicator4) as new4, max(indicator5) as new5 from have group by id; quit;
Here's a variation that is such an odd duck, you might not see it again:
data want;
do until (last.id);
update have (obs=0) have;
by id;
end;
indicator1new = indicator1;
indicator2new = indicator2;
indicator3new = indicator3;
indicator4new = indicator4;
indicator5new = indicator5;
do until (last.id);
set have;
by id;
output;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.