Solved: Finding all the variables that have a particular value in the dataset

sasquestions · Posted 02-08-2021 12:20 PM

Hello, I have a dataset test which has 7000 variables. I have a task to find all the variables that have .B and .C in it. For example

data test;

input K1 K2 K3 K4 K5;

cards;

1 0 .B 2 1

0 2 2 .C 0

2 1 1 1 0

;

run;

The output should give me K3,K4 vars. I have tried creating an array and do loop as below

data tmp1;
set test;
array formats[*] _all_;

do i = 1 to dim(formats);
if formats[i] in (.B,.C) ;
end;

run;

The above code is giving me the observations that have .B and .C . Can anyone help me in directing the right way to get ONLY the list of variables that have .B and .C. Thanks.

Reeza · Posted 02-08-2021 12:51 PM

Here are two different ways. The first uses a data step and loops through the array, Every time it encounters a missing value it outputs that variable name to a data set and then you can remove duplicates after.

data singleStep;
	set test;
	array K(*) k1-k5;

	do i=1 to dim(k);

		if k(i) in (.B, .C) then
			do;
				variableName=vname(k(i));
				output;
			end;
	end;
	keep variableName;
run;

proc sort data=singleStep nodupkey;
	by variableName;
run;

This is more of a dynamic approach, that uses a format to map the values of interest to one group and everything else to another group. Then you summarize and see which variables have the values of interest. The method above is probably simpler but this works as well and puts the list of variables into a comma separated macro variable at the end.

proc format;
	value find_fmt
.B, .C='Found' other='Not of Interest';
run;

*summarize into missing/not missing/
ods select none;
ods output onewayfreqs=summary;

proc freq data=test;
	table _all_ / missing;
	format _numeric_ find_fmt.;
run;

ods select all;
*Format output - probably more than is needed by code that I already have;

data want;
	length variable $32. variable_value $50.;
	set summary;
	Variable=scan(table, 2);
	Variable_Value=strip(trim(vvaluex(variable)));
	keep variable variable_value frequency percent cum:;
	label variable='Variable' variable_value='Variable Value';
run;

proc sql;
	select distinct variable Into :var_list separated by ", " from want where 
		variable_value="Found";
quit;

View solution in original post

maguiremq · Posted 02-08-2021 12:30 PM

Is this what you want?

data want;
	set test;
	array vars [*] k:;
		do i = 1 to dim(vars);
			if vars[i] in (".B", ".C") then flag = 1;
		end;
	drop i;
run;

sasquestions · Posted 02-08-2021 12:42 PM

I want the variables that have the particular value. The above code gives me the observations. Is there a way to find the list of variables ? Thanks.

maguiremq · Posted 02-08-2021 12:49 PM

data want;
	set test;
	array vars [*] k:;
		do i = 1 to dim(vars);
			if vars[i] in (".B", ".C") then var_of_interest = vname(vars[i]);
		end;
	drop i;
run;

Obs K1 K2 K3 K4 K5 var_of_interest 
1 1 0 B 2 1 K3 
2 0 2 2 C 0 K4 
3 2 1 1 1 0

Is that what you mean?

sasquestions · Posted 02-08-2021 12:52 PM

I will Try this way. Thanks.

Reeza · Posted 02-08-2021 12:51 PM

Here are two different ways. The first uses a data step and loops through the array, Every time it encounters a missing value it outputs that variable name to a data set and then you can remove duplicates after.

data singleStep;
	set test;
	array K(*) k1-k5;

	do i=1 to dim(k);

		if k(i) in (.B, .C) then
			do;
				variableName=vname(k(i));
				output;
			end;
	end;
	keep variableName;
run;

proc sort data=singleStep nodupkey;
	by variableName;
run;

This is more of a dynamic approach, that uses a format to map the values of interest to one group and everything else to another group. Then you summarize and see which variables have the values of interest. The method above is probably simpler but this works as well and puts the list of variables into a comma separated macro variable at the end.

proc format;
	value find_fmt
.B, .C='Found' other='Not of Interest';
run;

*summarize into missing/not missing/
ods select none;
ods output onewayfreqs=summary;

proc freq data=test;
	table _all_ / missing;
	format _numeric_ find_fmt.;
run;

ods select all;
*Format output - probably more than is needed by code that I already have;

data want;
	length variable $32. variable_value $50.;
	set summary;
	Variable=scan(table, 2);
	Variable_Value=strip(trim(vvaluex(variable)));
	keep variable variable_value frequency percent cum:;
	label variable='Variable' variable_value='Variable Value';
run;

proc sql;
	select distinct variable Into :var_list separated by ", " from want where 
		variable_value="Found";
quit;

sasquestions · Posted 02-08-2021 02:59 PM

Thank you for the explanation. It worked.

Tom · Posted 02-08-2021 03:03 PM

There is nothing in @Reeza code that is limiting what numeric variables it considers. The only reference to K1-K6 was in the data step she used to make some sample data

PS If you provide your own example data as a data step in your questions then people trying to help you won't need to make their own.

Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Re: Finding all the variables that have a particular value in the dataset

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away