I’m going to use a simple example to ask my question because I'm not sure how to articulate what I'm trying to do. If I have small groups within a dataset such as the ID column below. I want to check to see if any record within an ID group has the color red. If it does, I want to make all the records in that ID group red. So that all 3 records for ID 1 and 3 would change to red and ID 2 would be unchanged. How would I do that in SAS?
ID Color
1 Blue
1 Green
1 Red
2 Blue
2 Yellow
3 Green
3 Red
3 Yellow
Thanks in advance for any suggestion.
data have;
input ID colour $;
cards;
1 Blue
1 Green
1 Red
2 Blue
2 Yellow
3 Green
3 Red
3 Yellow
;;;;
run;
proc sort data=have;
by id colour;
run;
data red_list;
set have;
where colour="Red";
by ID;
if first.ID;
rename colour=master_colour;
run;
data want;
merge have red_list;
by ID;
final_colour = coalescec(master_colour, colour);
run;
@ballardw solution but a data step approach instead. These are probably the best approaches given your question. It's possible to do this in a single step using DoW if efficiency becomes an issue but that's a more complex programming approach.
Is this what you're looking for?
data have;
input id color :$7.;
datalines;
1 Blue
1 Green
1 Red
2 Blue
2 Yellow
3 Green
3 Red
3 Yellow
;
data want;
merge have (in = a)
have (in = b rename = (color = color_red) where = (color_red = "Red"));
by id;
if a;
run;
I don't overwrite the original variable, but you could do that if you wanted. I just don't like doing that personally.
Obs id color color_red 1 1 Blue Red 2 1 Green Red 3 1 Red Red 4 2 Blue 5 2 Yellow 6 3 Green Red 7 3 Red Red 8 3 Yellow Red
@BillSut wrote:
I’m going to use a simple example to ask my question because I'm not sure how to articulate what I'm trying to do. If I have small groups within a dataset such as the ID column below. I want to check to see if any record within an ID group has the color red. If it does, I want to make all the records in that ID group red. So that all 3 records for ID 1 and 3 would change to red and ID 2 would be unchanged. How would I do that in SAS?
ID Color
1 Blue
1 Green
1 Red
2 Blue
2 Yellow
3 Green
3 Red
3 Yellow
Thanks in advance for any suggestion.
One way, note the data step to provide an actual data set that can be used with code.
data have; input ID $ Color $; datalines; 1 Blue 1 Green 1 Red 2 Blue 2 Yellow 3 Green 3 Red 3 Yellow ; Proc sql; create table want as select a.id, coalescec(b.color,a.color) as color from have as a left join (select distinct id,color from have where color='Red') as b on a.id=b.id ; quit;
The Sql part selects the Id's that have Red somewhere in the color variable using the (select distinct). The Join then matches those id to the basic data. The Coalescec function then returns the first value encountered in the two subsets. In the case where the ID did not have "red" then the B.color is missing and the A.color (from the base set) is the result.
If you have more than one value you are looking for then you would need to provide a more complicated example of the data - as a data step- and the expected results.
data have;
input ID colour $;
cards;
1 Blue
1 Green
1 Red
2 Blue
2 Yellow
3 Green
3 Red
3 Yellow
;;;;
run;
proc sort data=have;
by id colour;
run;
data red_list;
set have;
where colour="Red";
by ID;
if first.ID;
rename colour=master_colour;
run;
data want;
merge have red_list;
by ID;
final_colour = coalescec(master_colour, colour);
run;
@ballardw solution but a data step approach instead. These are probably the best approaches given your question. It's possible to do this in a single step using DoW if efficiency becomes an issue but that's a more complex programming approach.
A hash object could be used:
data want;
set have;
if _n_ = 1 then do;
declare hash h(dataset: 'have(where= (Color = "Red"))');
h.defineKey('id');
h.defineData('color');
h.defineDone();
end;
rc = h.find();
drop rc;
run;
Maybe a bit to hefty for the problem 😉
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.