BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasquestions
Calcite | Level 5

Hello, I have a dataset test which has 7000 variables. I have a task to find all the variables that have .B and .C in it. For example

 

data test;

input K1 K2 K3 K4 K5;

cards;

1  0  .B  2  1

0  2   2  .C 0

2  1   1  1   0

;

run;

 

The output should give me K3,K4 vars. I have tried creating an array and do loop as below

 

data tmp1;
set test;
array formats[*] _all_;

do i = 1 to dim(formats);
if formats[i] in (.B,.C)  ;
end;

run;

 

The above code is giving me the observations that have .B and .C . Can anyone help me in directing the right way to get ONLY the list of variables that have .B and .C. Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Here are two different ways. The first uses a data step and loops through the array, Every time it encounters a missing value it outputs that variable name to a data set and then you can remove duplicates after. 

 

data singleStep;
	set test;
	array K(*) k1-k5;

	do i=1 to dim(k);

		if k(i) in (.B, .C) then
			do;
				variableName=vname(k(i));
				output;
			end;
	end;
	keep variableName;
run;

proc sort data=singleStep nodupkey;
	by variableName;
run;

This is more of a dynamic approach, that uses a format to map the values of interest to one group and everything else to another group. Then you summarize and see which variables have the values of interest. The method above is probably simpler but this works as well and puts the list of variables into a comma separated macro variable at the end.

 

proc format;
	value find_fmt
.B, .C='Found' other='Not of Interest';
run;

*summarize into missing/not missing/
ods select none;
ods output onewayfreqs=summary;

proc freq data=test;
	table _all_ / missing;
	format _numeric_ find_fmt.;
run;

ods select all;
*Format output - probably more than is needed by code that I already have;

data want;
	length variable $32. variable_value $50.;
	set summary;
	Variable=scan(table, 2);
	Variable_Value=strip(trim(vvaluex(variable)));
	keep variable variable_value frequency percent cum:;
	label variable='Variable' variable_value='Variable Value';
run;

proc sql;
	select distinct variable Into :var_list separated by ", " from want where 
		variable_value="Found";
quit;

 

View solution in original post

7 REPLIES 7
maguiremq
SAS Super FREQ

Is this what you want?

data want;
	set test;
	array vars [*] k:;
		do i = 1 to dim(vars);
			if vars[i] in (".B", ".C") then flag = 1;
		end;
	drop i;
run;
sasquestions
Calcite | Level 5

I want the variables that have the particular value. The above code gives me the observations. Is there a way to find the list of variables ? Thanks.

maguiremq
SAS Super FREQ

 

data want;
	set test;
	array vars [*] k:;
		do i = 1 to dim(vars);
			if vars[i] in (".B", ".C") then var_of_interest = vname(vars[i]);
		end;
	drop i;
run;
Obs K1 K2 K3 K4 K5 var_of_interest 
1 1 0 B 2 1 K3 
2 0 2 2 C 0 K4 
3 2 1 1 1 0   

Is that what you mean?

sasquestions
Calcite | Level 5
I will Try this way. Thanks.
Reeza
Super User

Here are two different ways. The first uses a data step and loops through the array, Every time it encounters a missing value it outputs that variable name to a data set and then you can remove duplicates after. 

 

data singleStep;
	set test;
	array K(*) k1-k5;

	do i=1 to dim(k);

		if k(i) in (.B, .C) then
			do;
				variableName=vname(k(i));
				output;
			end;
	end;
	keep variableName;
run;

proc sort data=singleStep nodupkey;
	by variableName;
run;

This is more of a dynamic approach, that uses a format to map the values of interest to one group and everything else to another group. Then you summarize and see which variables have the values of interest. The method above is probably simpler but this works as well and puts the list of variables into a comma separated macro variable at the end.

 

proc format;
	value find_fmt
.B, .C='Found' other='Not of Interest';
run;

*summarize into missing/not missing/
ods select none;
ods output onewayfreqs=summary;

proc freq data=test;
	table _all_ / missing;
	format _numeric_ find_fmt.;
run;

ods select all;
*Format output - probably more than is needed by code that I already have;

data want;
	length variable $32. variable_value $50.;
	set summary;
	Variable=scan(table, 2);
	Variable_Value=strip(trim(vvaluex(variable)));
	keep variable variable_value frequency percent cum:;
	label variable='Variable' variable_value='Variable Value';
run;

proc sql;
	select distinct variable Into :var_list separated by ", " from want where 
		variable_value="Found";
quit;

 

sasquestions
Calcite | Level 5

Thank you for the explanation. It worked.

Tom
Super User Tom
Super User

There is nothing in @Reeza code that is limiting what numeric variables it considers.  The only reference to K1-K6 was in the data step she used to make some sample data

 

PS If you provide your own example data as a data step in your questions then people trying to help you won't need to make their own.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2131 views
  • 2 likes
  • 4 in conversation