I'm trying to create a new column for each code I'm looking for, then flag each row that contains those codes.
The code works for 1 row and then terminates. How do I keep looping through each row of my data set?
data have;
input id$ code1$ code2$ code3$ code4$;
cards;
aa 03f 06x 01a 05a
bb 05n 02b 01a .
cc 02b . . .
dd 01a 02b . .
;
run;
data top_codes;
input top_code1$ top_code2$ top_code3$;
cards;
01d 01a 02b
;
run;
data want (drop= i k);
if _n_ eq 1 then;
do;
set top_codes;
retain _all;
array tc(*) top_code1--top_code3;
end;
set have;
array _vars(*) code1--code4;
array code_{3};
do i=1 to dim(_vars);
do k = 1 to dim(code_);
if _vars(i) = tc(k) then code_{k} = 1 ;
end;
end;
run;
This is an image of what I'm getting. I'm expecting to get a line for each row of my data.
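You have an extra semicolon. In "if _n_ eq 1 then;" the semicolon ends the IF statement with a null THEN clause, so the DO block that follows runs on every iteration, not just the first. On the second iteration SET TOP_CODES has no more observations to read, and the data step stops.
Yet another reason to dislike that indentation style. Keep the DO on the end of the IF line and align the END with the IF.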
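A small sketch of the effect, separate from the data above (the variable n and the message text are just for illustration). The null statement after THEN means the DO block is no longer conditional:

```sas
data _null_;
  do n = 1 to 3;
    if n eq 1 then;   /* null statement: the IF ends right here */
    do;               /* so this block is unconditional */
      put 'Block ran for ' n=;
    end;
  end;
run;
```

The log should show the message for n=1, n=2 and n=3, not just for n=1.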
Try this instead:
data want (drop= i k);
  if _n_ eq 1 then do;
    set top_codes;
    retain _all;
    array tc(*) top_code1--top_code3;
  end;
  set have;
  array _vars(*) code1--code4;
  array code_{3};
  do i=1 to dim(_vars);
    do k = 1 to dim(code_);
      if _vars(i) = tc(k) then code_{k} = 1 ;
    end;
  end;
run;
You don't even need a DO/END block at that point because the SET statement is the only executable statement.
if _n_ eq 1 then set top_codes;
retain _all;
array tc(*) top_code1--top_code3;
I am not sure what the RETAIN statement is doing. You are not creating any variable named _ALL. And if there is a variable named _ALL coming in from one of those two datasets it would already be retained so no need to list it in a RETAIN statement. If you meant to use the _ALL_ variable list then that also would do nothing as the only variables defined at that point are the ones read from TOP_CODES and those are already set to be retained.
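Putting those points together, a minimal sketch of the step without the DO/END block and without the RETAIN statement might look like this. Setting the flags to 0 when there is no match and dropping the TOP_CODE variables are assumptions on my part; the question does not say what the non-matching rows should contain.

```sas
data want (drop= i k top_code1-top_code3);
  /* Read the single row of target codes once; variables read by SET are retained */
  if _n_ eq 1 then set top_codes;
  array tc(*) top_code1--top_code3;
  set have;
  array _vars(*) code1--code4;
  array code_{3};
  do k = 1 to dim(code_);
    code_{k} = 0;   /* assumption: 0 rather than missing when there is no match */
    do i = 1 to dim(_vars);
      if _vars(i) = tc(k) then code_{k} = 1;
    end;
  end;
run;
```

With the sample data this should give four rows: code_1 stays 0 everywhere (01d never appears), code_2 is 1 for aa, bb and dd, and code_3 is 1 for bb, cc and dd.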
How about providing a more complete example of what you expect? I am not at all sure what you want.
Note: You are getting one row of data for each line of data in the Top_codes data set. Multiple SET statements are a complex subject.
If your goal is to have every record from Have matched to the Top_codes data set, then you want a Cartesian product such as:
proc sql;
  create table want as
  select a.*, b.*
  from top_codes as a, have as b;
quit;
If your Top_codes had two rows of data, you would have every row of Top_codes joined with every row of Have, yielding 2 times as many records as Have.
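To illustrate the row count, here is a sketch with a hypothetical two-row version of the lookup table. TOP_CODES2, WANT2 and the second row of codes are made up for the example; HAVE is the data set from the question.

```sas
/* Hypothetical lookup table with two rows instead of one */
data top_codes2;
  input top_code1$ top_code2$ top_code3$;
  cards;
01d 01a 02b
03f 05n 06x
;
run;

/* No join condition, so every row of TOP_CODES2 is paired with every row of HAVE */
proc sql;
  create table want2 as
  select a.*, b.*
  from top_codes2 as a, have as b;
quit;
```

With the four rows in HAVE this should produce 8 rows, and SAS normally writes a note to the log that the query involves a Cartesian product.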