## Subset a Matrix

Solved
Occasional Contributor
Posts: 7

# Subset a Matrix

I create a matrix that is 45x45 fairly using my a data step.

However, I need to parse it down to 30x30 and the output from the procedure is nonsensical for me (it does not matter how the input data file is sorted).

Thus, I need to delete the same rows and columns from the matrix:

to_delete=(1,4,20,23,26,26,30,31,32,33,34,35,36,37,38)

So the final matrix would only contain row 2,3,5,6...etc of the original matrix and be square(nxn)

I have been stumped thus far.

I have been trying variations of this

data output;set input;

keep _COL_ =: to_delete;run;quit;

Accepted Solutions
Solution
‎01-08-2013 06:03 PM
PROC Star
Posts: 8,164

## Re: Subset a Matrix

Will this do what you want?

data to_delete;

set large_matrix;

keep _COL_;

if missing (Female);

run;

proc sql noprint;

create table keep_small_matrix

as Select name

from dictionary.columns

where libname="WORK" and

memname="LARGE_MATRIX" and

name ne "_COL_"

;

DELETE from keep_small_matrix

WHERE name in (select * from work.to_delete)

;

select name

into : keeps

separated by ","

from keep_small_matrix

;

create table small_matrix as Select _COL_,&keeps.

from large_matrix

;

DELETE from small_matrix WHERE _COL_ in (select * from work.to_delete)

;

QUIT;

All Replies
PROC Star
Posts: 8,164

## Re: Subset a Matrix

Not sure what you mean.  Can you provide a small example (as a datastep) that represents, say, a 5x5 matrix and a second datastep showing the result you want to achieve?

Posts: 3,852

## Re: Subset a Matrix

The "problem" is that rows can be easily referenced by number or other ID and can be DELETED with an executable statement.  The columns however have to be referenced by name and cannot be DROPPED with and executable statement.  One way to approach the problem would be transpose the data so that you can reference the rows and columns in the same way.

data mat;
input row \$ a b c d e f;
cards;
a 1 2 3 4 5 6
b 1 2 3 4 5 6
c 1 2 3 4 5 6
d 1 2 3 4 5 6
e 1 2 3 4 5 6
f 1 2 3 4 5 6
;;;;
run;
proc transpose name=col data=mat out=mat2 prefix=m;
by row;
run;

data remove;
set mat2;
if row in('a','c') then delete;
if col in('a','c') then delete;
run;
proc sort;

by row col;
run;
proc transpose data=remove out=mat3;
by row;
id col;
run;
Occasional Contributor
Posts: 7

## Re: Subset a Matrix

data remove;
set mat2;
if row in('a','c') then delete;
if col in('a','c') then delete;
run;

I need to delete something like ~1000 rows and columns from a 10,000 x 10,000 matrix, do I need to put the columns into an array then somehow reference the array in the data step?

Posts: 3,852

## Re: Subset a Matrix

Put the row/column ID into a SAS data set and transpose that to an ARRAY(see below).  Then use the array in the IN operator.

You may find that you can generate the list with a DO loop if you use numbered row/column ID.

do delete=4 to 10, 32, 56, 100 to 200......

data mat;
input row \$ a b c d e f;
cards;
a 1 2 3 4 5 6
b 1 2 3 4 5 6
c 1 2 3 4 5 6
d 1 2 3 4 5 6
e 1 2 3 4 5 6
f 1 2 3 4 5 6
;;;;
run;
proc transpose name=col data=mat out=mat2 prefix=m;
by row;
run;
data delete;
input del :\$1. @@;
cards;
a c
;;;;
run;
proc transpose data=delete out=delete prefix=del;
var del;
run;
data remove;
set mat2;
if _n_ eq 1 then set delete;
array del

• del:;
•    if row in del or col in del then delete;
run;
proc sort;

by row col;
run;
proc transpose data=remove out=mat3;
by row;
id col;
run;

PROC Star
Posts: 8,164

## Re: Subset a Matrix

If DN correctly interpreted what you want to do, here is a way to do it in a data step:

data have;

input a b c d e f g h i j;

cards;

1 1 1 1 1 1 1 1 1 1

2 2 2 2 2 2 2 2 2 2

3 3 3 3 3 3 3 3 3 3

4 4 4 4 4 4 4 4 4 4

5 5 5 5 5 5 5 5 5 5

6 6 6 6 6 6 6 6 6 6

7 7 7 7 7 7 7 7 7 7

8 8 8 8 8 8 8 8 8 8

9 9 9 9 9 9 9 9 9 9

0 0 0 0 0 0 0 0 0 0

;

%let to_delete=(1,4,6,8);

proc sql noprint;

select name into :deletes

separated by " "

from dictionary.columns

where libname="WORK" and

memname="HAVE" and

varnum in &to_delete

;

quit;

data want (drop=&deletes.);

set have;

if _n_ not in &to_delete.;

run;

Occasional Contributor
Posts: 7

## Re: Subset a Matrix

I feel as though I can do this in PROC SQL but havent been successful today.

I'm very close. I can delete the rows, but am stuck on deleting the columns

Again, my "large matrix" has both rows and columns with the same variable names (COL1, COL2....COLi) so i merely create a dataset that has a list of these based off of some critera:

data to_delete; set large_matrix;

keep _COL_;

if missing (Female);run;quit;

to_delete contains random columns from which I want to delete both the row and column

This successfully deletes the proper rows (not difficult). Deleting the proper columns is stumping though.

proc sql;

create table small_matrix as Select * from large_matrix;

DELETE from small_matrix WHERE _COL_ in (select * from work.to_delete);

/* alter table work.kjk drop column  in (work.tod) Does not work to remove columns */

;

QUIT;

Thank you to all for your time.

PROC Star
Posts: 8,164

## Re: Subset a Matrix

Can you show a SMALL example (say for 3 columns/rows) of what large_matrix looks like?

Occasional Contributor
Posts: 7

## Re: Subset a Matrix

 Female Male _COL_ COL1 COL2 COL3 COL4 COL5 COL6 COL7 COL8 COL9 A B COL1 0.716692 0.820369 0.319965 0.537561 0.009048 0.623179 0.329801 0.501619 0.461064 C D COL2 0.47814 0.568633 0.377889 0.322076 0.848357 0.585695 0.190736 0.666417 0.753265 COL3 0.33489 0.464657 0.19093 0.712932 0.309871 0.298905 0.408627 0.931734 0.623899 E F COL4 0.182391 0.123417 0.184646 0.260479 0.347146 0.428899 0.070591 0.21143 0.833228 COL5 0.048757 0.780064 0.238505 0.923092 0.527054 0.210349 0.51993 0.463315 0.781508 G H COL6 0.564873 0.769461 0.520583 0.604096 0.174694 0.395051 0.149006 0.923724 0.669723 COL7 0.077545 0.799448 0.857925 0.023725 0.2919 0.145091 0.180948 0.425452 0.333822 I J COL8 0.40907 0.508158 0.975613 0.828987 0.009638 0.137749 0.039158 0.358044 0.816533 COL9 0.280801 0.596491 0.265516 0.568593 0.066118 0.680901 0.915639 0.379575 0.277307

In this case, I want to delete row/col 3,5,7,9 (which are in the to_delete dataset at this point)

Solution
‎01-08-2013 06:03 PM
PROC Star
Posts: 8,164

## Re: Subset a Matrix

Will this do what you want?

data to_delete;

set large_matrix;

keep _COL_;

if missing (Female);

run;

proc sql noprint;

create table keep_small_matrix

as Select name

from dictionary.columns

where libname="WORK" and

memname="LARGE_MATRIX" and

name ne "_COL_"

;

DELETE from keep_small_matrix

WHERE name in (select * from work.to_delete)

;

select name

into : keeps

separated by ","

from keep_small_matrix

;

create table small_matrix as Select _COL_,&keeps.

from large_matrix

;

DELETE from small_matrix WHERE _COL_ in (select * from work.to_delete)

;

QUIT;

Occasional Contributor
Posts: 7

## Re: Subset a Matrix

That does work Art.

Thank you

SAS Super FREQ
Posts: 4,241

## Re: Subset a Matrix

If you have SAS/IML, do it this way:

proc iml;

use have; read all var _NUM_ into x[c=Names]; close have;
DeleteIdx = {1 4 6 8};

/* subset/extract */
KeepIdx = setdif( 1:ncol(x), DeleteIdx);
y = x[ , KeepIdx];
NewNames = Names[KeepIdx];

/* write new data */

create want from y[c=NewNames]; append from y; close want;

quit;

🔒 This topic is solved and locked.