Hello,
I have a table with hundreds of columns. I would like to create one dataset for each column with the observations that match a simple condition.
For examble here is my table (I take just 3 variables for the example but I have hundreds)
col coll colll
----------------
1 3 5
2 5 2
10 3 1
8 0 -10
my condition : >=5
I want in return three datasets dcol dcoll and dcolll with :
dcol :
col coll colll
10 3 1
8 0 -10
dcoll:
col coll colll
2 5 2
and dcolll
col coll colll
1 3 5
I thought about try to create an array with the name of my variables and then doing a loop with a datastep inside but I don't know how to do this and if it's what to do.
Thanks for your help!
Céline
data have;
input col coll colll;
datalines;
1 3 5
2 5 2
10 3 1
8 0 -10
;
data _null_;
set sashelp.vcolumn(where=(libname='WORK' and memname='HAVE'));
call execute(catt('data d_',name,';set have(where=(',name,'>=5));run;'));
run;
I don't usually advise splitting data sets up like this. When you need to find the rows where COL >= 5 you can do this:
proc something data=have (where=(col>=5));
... more lines of code ...
run;
If the rows are mutually exclusive, so they wind up having >=5 in only one column, you could just create a column indicator and then use BY processing.
If the rows are not mutually exclusive, they can wind up haveing >=5 in many columns, then a macro could handle it. Lets assume you have col1-col50
%macro dothis;
%do i=1 %to 50;
title "Analysis where COL&i >= 5";
proc something data=have(where=(col&i>=5));
/* more code here */
run;
%end;
%mend;
%dothis
What exactly will you be doing with those "100's" of data sets?
You can identify if any of your numeric variables have a specific value with
whichn(5,_numeric_)
or a list such as
whichn(5,varname1 -- varname25)
the -- would indicate adjacent variables to search.
which returns the column order number of the first numeric variable with the value of 5 or a 0 otherwise.
Yes, that's a great question from @ballardw: what are you going to do with all of this once it is done? @celine9602 please tell us, depending on your answer we can tailor the results to your specific need.
@celine9602 wrote:
Each of these datasets would be exported in a corresponding sheet in an Excel file (empty datsets wont be exported). I need to keep the complete observations.
And then what? Store and ignore the file like many government reports? Print? Actually read and do something? What would reading those sheets do?
@celine9602 wrote:
In fact I need to report the observations and the corresponding column name that match the condition.
Lets start over a bit so we can follow your actual need.
Provide a small data set, maybe 5 variables and 5 observations, and 2 different "condition" values.
Then for that very small data set provide an example of the REPORT that needs to be generated.
Here is an example of how to make a small data set:
data have; input thisvar thatvar somevar var4 varmore; datalines; 1 2 3 4 5 12 13 14 15 16 21 31 41 51 61 55 44 33 22 11 101 102 103 104 105 ;
If you want to use different values or variable names fine, just provide an example like this.
I intentionally made data that doesn't duplicate values by row or variable so the "report" should be pretty short made manually.
ONE example of searching a bunch of variables and returning the name of the variable that matches when found:
data want; set have; array v (*) thisvar -- varmore; if whichn(13,of v(*))>0 then matchvar = vname(v[whichn(13,of v(*))]); run;
If you need to indicate MULTIPLE variables in a single observation then you need to state that as a requirement, provide example data and what the report format would look like.
In fact I need to report the observations and the corresponding column name that match the condition.
Stop talking about the code, and stop talking about the computer tasks such as reading or finding records that meet a certain condition or writing Excel files. Explain the purpose of this work, as if you were talking to your company president or university president, without discussing code, without discussing individual tasks, what will be gained by doing this?
data have;
input col coll colll;
datalines;
1 3 5
2 5 2
10 3 1
8 0 -10
;
data _null_;
set sashelp.vcolumn(where=(libname='WORK' and memname='HAVE'));
call execute(catt('data d_',name,';set have(where=(',name,'>=5));run;'));
run;
Do you have SAS/IML ?
data have;
input col coll colll;
datalines;
1 3 5
2 5 2
10 3 1
8 0 -10
;
proc iml;
use have;
read all var _num_ into x[c=vname];
want=(x>=5)[+,];
create want from want[c=vname];
append from want;
close;
quit;
If you don't have SAS/IML ,Try PROC SQL :
data have;
input col coll colll;
datalines;
1 3 5
2 5 2
10 3 1
8 0 -10
;
proc sql;
create table want as
select sum(col>=5) as col,sum(coll>=5) as coll,sum(colll>=5) as colll
from have;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.