I'm trying to create a new column for each code I'm looking for, then flag each row that contains those codes.
The code works for 1 row then terminates. How do I keep looping through each row of my data set?
data have;
  input id$ code1$ code2$ code3$ code4$;
  cards;
aa 03f 06x 01a 05a
bb 05n 02b 01a .
cc 02b . . .
dd 01a 02b . .
;
run;
data top_codes;
  input top_code1$ top_code2$ top_code3$;
  cards;
01d 01a 02b
;
run;
data want (drop= i k);
	
	if _n_ eq 1 then;
		do;
			set top_codes;
			retain _all;
			array tc(*) top_code1--top_code3;
		end;
	set have;
	array _vars(*) code1--code4;
	array code_{3}; 
 	
	do i=1 to dim(_vars);
		do k = 1 to dim(code_);
			if _vars(i) = tc(k) then code_{k} = 1 ;
		end;
	end;
run;
This is an image of what I'm getting. I'm expecting to get a line for each row of my data.
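You have an extra semi-colon after THEN. The line "if _n_ eq 1 then;" is a complete statement that does nothing, so the DO block after it runs on every iteration. The step stops on the second iteration, when SET TOP_CODES tries to read past its only observation.
Yet another reason to dislike that indentation style. Keep the DO on the end of the IF line and align the END with the IF.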
Try this instead:
data want (drop= i k);
  if _n_ eq 1 then do;
    set top_codes;
    retain _all;
    array tc(*) top_code1--top_code3;
  end;
  set have;
  array _vars(*) code1--code4;
  array code_{3}; 
  do i=1 to dim(_vars);
    do k = 1 to dim(code_);
      if _vars(i) = tc(k) then code_{k} = 1 ;
    end;
  end;
run;
You don't even need a DO/END block at that point because the SET statement is the only executable statement.
  if _n_ eq 1 then set top_codes;
  retain _all;
  array tc(*) top_code1--top_code3;
I am not sure what the RETAIN statement is doing. You are not creating any variable named _ALL. And if there is a variable named _ALL coming in from one of those two datasets it would already be retained so no need to list it in a RETAIN statement. If you meant to use the _ALL_ variable list then that also would do nothing as the only variables defined at that point are the ones read from TOP_CODES and those are already set to be retained.
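Putting those points together, here is a minimal sketch of the whole step without the RETAIN statement. It assumes you want each flag to be 0 rather than missing when there is no match, which the original code does not do, and it uses single-dash numbered lists (code1-code4) in the ARRAY statements so that the result does not depend on variable order.

```sas
data want (drop= i k);
  /* read the single row of top codes once; variables read by SET are retained */
  if _n_ eq 1 then set top_codes;
  set have;
  array tc(*) top_code1-top_code3;
  array _vars(*) code1-code4;
  array code_{3};
  do k = 1 to dim(code_);
    /* assumption: start each flag at 0 instead of leaving it missing */
    code_{k} = 0;
    do i = 1 to dim(_vars);
      if _vars(i) = tc(k) then code_{k} = 1;
    end;
  end;
run;
```

With the sample data in the question this should write four rows, with code_2 set to 1 for aa, bb and dd and code_3 set to 1 for bb, cc and dd.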
How about providing a more complete example of what you expect? I am not at all sure of what you want.
Note: You are getting one row of data for each row of data in the Top_codes data set. Multiple SET statements are a complex subject.
If your goal is to have every record from Have matched to the Top_codes data set then you want a Cartesian product such as
proc sql;
  create table want as
  select a.*, b.*
  from top_codes as a, have as b;
quit;
If your top_codes had two rows of data, every row of top_codes would be joined with every row of Have, yielding 2 times as many records as Have.
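To see that doubling for yourself, here is a small sketch. The data set names top_codes2 and want2 and the second row of codes are made up for illustration; HAVE is the four-row data set from the question.

```sas
/* hypothetical version of top_codes with two rows */
data top_codes2;
  input top_code1$ top_code2$ top_code3$;
  cards;
01d 01a 02b
03f 05a 06x
;
run;

/* no join condition, so every row of top_codes2 pairs with every row of have */
proc sql;
  create table want2 as
  select a.*, b.*
  from top_codes2 as a, have as b;
quit;
```

With 2 rows in top_codes2 and 4 rows in HAVE, want2 should contain 8 rows.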
