BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Liamb
Obsidian | Level 7
Hello,
I have a table with 7000 variables coded from 0 to 3 and I want to select only those containing value 1.
How to do it? (dataset with array or proc sql)?
Thanks
Best regards
1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

Hi,

 

It's more helpful to post data as a DATA step with cards, so that people can easily create your sample dataset.

 

But as penance for my earlier typo, I read in your sample data and ran it through my corrected code:

 

data have ;
  input ID : $5. K20_G V21_I K22_K Q23_H P25_A E28_A E29_E K30_del I31_R K32_A ;
  cards ;
RET1  1 0 0 0 0 0 0 0 0 0
RET2  0 1 0 0 0 0 0 0 . 0
RET3  0 0 0 1 0 0 0 0 0 0
RET4  0 0 0 0 0 1 0 0 0 0
RET5  1 0 0 0 0 0 0 0 0 0
RET6  0 . 0 0 0 0 0 1 0 0
RET7  0 0 0 0 0 . 0 0 0 0
RET8  0 0 0 0 0 0 0 0 0 0
RET9  0 0 0 0 0 0 0 0 0 0
RET10 0 . 0 0 0 0 0 0 0 1
;
run ;

proc sort data=have ;
  by id ;
run ;

proc transpose data=have out=vert;
  var _numeric_ ;
  by id ;
run ;

proc sql noprint;
  select distinct _name_ into :keeplist separated by " "
    from vert 
    where col1=1
  ;
quit ;

%put >>&keeplist<< ;

data want ;
  set have(keep=&keeplist) ;
run ;
BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

10 REPLIES 10
Reeza
Super User
More details.
Do you want if any variable in a row is 1?
Do the variables have any naming system? Are all the variables side by side in the data set? Are all the variables the same type (num/char)?
Liamb
Obsidian | Level 7
I want to keep any variable that has a row equal to 1. All my variables are
named differently, e.g. K20I, K20R, D21C, C35D,..... They are all numeric
Patrick
Opal | Level 21

@Liamb wrote:
I want to keep any variable that has a row equal to 1. All my variables are
named differently, e.g. K20I, K20R, D21C, C35D,..... They are all numeric

To not have to type the names of 7000 variables we need some naming patterns to select them.

Depending on your data we could also try the other way round and exclude variables: Select all variables that are numeric but don't have a certain name/naming pattern.

 

Please let us know which way round could work for your data.

 

Once we can generate the list of variables code like below will work. 

data want;
  set sashelp.class;
  if whichn(of _numeric_,15)>0 then output;
run;

Above code selects all rows where at least one of the variables in list _numeric_ has a value of 15.

Liamb
Obsidian | Level 7

I want to select all the variables that have an observation equal to 1 or delete the variables that have no observation equal to 1. Here are an example of data in attached

ballardw
Super User

Suggestion: provide a small example, maybe 10 variables and 10 rows similar to your data.

Then show us the result you want given that example data.

Your requirement " I want to select only those containing value 1" isn't very clear. Variables that contain 1 somewhere in the data set? For every observation in the data set? Only for some observations.

 

We may ask questions about why you selected something.

 

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

With the amount of words that SQL requires I shudder to think what any of the code that may possibly work would look like with 7000 (!) variables.

 

 

 

 

Liamb
Obsidian | Level 7

I want to check each of 7000 variables, and if a variable has the value 1 for any record, I want to keep that variable

ID

K20_G

V21_I

K22_K

Q23_H

P25_A

E28_A

E29_E

K30_del

I31_R

K32_A

RET1

1

0

0

0

0

0

0

0

0

0

RET2

0

1

0

0

0

0

0

0

.

0

RET3

0

0

0

1

0

0

0

0

0

0

RET4

0

0

0

0

0

1

0

0

0

0

RET5

1

0

0

0

0

0

0

0

0

0

RET6

0

.

0

0

0

0

0

1

0

0

RET7

0

0

0

0

0

.

0

0

0

0

RET8

0

0

0

0

0

0

0

0

0

0

RET9

0

0

0

0

0

0

0

0

0

0

RET10

0

.

0

0

0

0

0

0

0

1

 

Quentin
Super User

Hi,

 

It's more helpful to post data as a DATA step with cards, so that people can easily create your sample dataset.

 

But as penance for my earlier typo, I read in your sample data and ran it through my corrected code:

 

data have ;
  input ID : $5. K20_G V21_I K22_K Q23_H P25_A E28_A E29_E K30_del I31_R K32_A ;
  cards ;
RET1  1 0 0 0 0 0 0 0 0 0
RET2  0 1 0 0 0 0 0 0 . 0
RET3  0 0 0 1 0 0 0 0 0 0
RET4  0 0 0 0 0 1 0 0 0 0
RET5  1 0 0 0 0 0 0 0 0 0
RET6  0 . 0 0 0 0 0 1 0 0
RET7  0 0 0 0 0 . 0 0 0 0
RET8  0 0 0 0 0 0 0 0 0 0
RET9  0 0 0 0 0 0 0 0 0 0
RET10 0 . 0 0 0 0 0 0 0 1
;
run ;

proc sort data=have ;
  by id ;
run ;

proc transpose data=have out=vert;
  var _numeric_ ;
  by id ;
run ;

proc sql noprint;
  select distinct _name_ into :keeplist separated by " "
    from vert 
    where col1=1
  ;
quit ;

%put >>&keeplist<< ;

data want ;
  set have(keep=&keeplist) ;
run ;
BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Kurt_Bremser
Super User

Maxim 19: Long Beats Wide.

proc transpose data=have out=long;
by id;
var _numeric_;
run;

data want;
set long;
if _n_ = 1
then do;
  declare hash _1 (dataset:"long (where=(col1 = 1))");
  _1.definekey("_name_");
  _1.definedone();
end;
if _1.check() = 0;
run;

The hash is just my tool of choice (no prior sorting needed), you could also do SQL joins or a DATA step MERGE.

Quentin
Super User

Hi,

 

Do you mean you want to check each of 7000 variables, and if a variable has the value 3 for any record, you want to keep that variable?

 

One way to do this is with PROC TRANSPOSE.  You can transpose your dataset to make a vertical dataset with one record per value. So it's basically name-value pairs.  Once you have that, you can use PROC SQL (or whatever) to make a list of all the variables that have a certain value.  Something like:

 

proc transpose data=sashelp.class out=vert;
  var _numeric_ ;
  by name ;
run ;

proc print data=vert ;
run ;

proc sql ;
  select distinct _name_ into :keeplist
    from vert 
    where col1=14
  ;
quit ;

data want ;
  set sashelp.class(keep=&keeplist) ;
run ;
BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Quentin
Super User
I forgot separated by “ “ in the SQL step to make the macro variable KeepList a list of variables.
BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1500 views
  • 7 likes
  • 6 in conversation