turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-04-2018 11:33 AM - edited 01-04-2018 11:43 AM

Hi all,

I have a nxn matrix (for simplicity assuming n=3) like following:

```
proc iml;
A = {0.1 0.2 0.5, 0.2 0.4 0.6, 0.2 0.3 0.8};
```

I have another table - a reference table, which specifies which values from this nxn (n=3) matrix needs to be written to a new data table. that table is following:

data output_table;

infile datalines dsd;

length FieldName $20 FieldSourceTable $20;

input Matrix_Id Row_Order Column_Order IsZero FieldName FieldSourceTable;

datalines;

3, 1, 1, 1, ., .

3, 1, 2, 0, result1, table2

3, 1, 3, 0, result1, table2

3, 2, 1, 1, ., .

3, 2, 2, 0, sum1, table2

3, 2, 3, 0, sum2, table2

3, 3, 1, 0, prod1, table2

3, 3, 2, 0, prod2, table2

3, 3, 3, 1, ., .

;

Assume the matrix A is my Matrix_ID 3. So, I need to create a new table called table 2 with the names specified in 'FieldName' column in the reference table and write specific values from the matrix to the table (e.g. (2,3)th element to the column 'sum2'). The following code works for the purpose:

```
proc sort data= output_table out= output_table;
by Row_Order Column_Order;
quit;
proc iml;
use output_table; /* parameterization table for matrix 3 (output table)*/
read all var {Column_Order Row_Order FieldName};
close;
idx = loc(FieldName ^= " ");
refNames = FieldName[idx];
A = {0.1 0.2 0.5, 0.2 0.4 0.6, 0.2 0.3 0.8}; /*A is the full matrix to write*/
C = A[idx];
C = t(C);
mattrib C colname=refNames;
print C;
```

The ouput matrix C looks like following:

result1 | result1 | sum1 | sum2 | prod1 | prod2 |

0.2 | 0.5 | 0.4 | 0.6 | 0.2 | 0.3 |

What I want to do is, whenever I have two or more elements of the matrix to be written to the same row, I want them to be summed up. So, my ideal output in this case would be to sum up 0.2 and 0.5 and have 0.7 as result1 as the following output matrix (let's call it D):

result1 | sum1 | sum2 | prod1 | prod2 |

0.7 | 0.4 | 0.6 | 0.2 | 0.3 |

Then I'd write the matrix D to a data table:

create data_to_write from D[colname = refNames];

append from D; /** create data set and append in each subsequent steps**/

end;

Not being able to sum the elements with same column name to create the matrix D for all cases of same column name. Could you please help?

Accepted Solutions

Solution

01-04-2018
12:01 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to snoopy369

01-04-2018 11:51 AM

Yep, that seems to work nicely. Not sure if there's a better non-loop way, maybe Rick will have a better suggestion. I think in R there's a simpler way (without the loop), if I recall correctly, involving matrix multiplication, maybe that's possible in IML as well.

Just identify the unique refNames, then loop over that and use the loc technique again to identify the matching rows in refNames, then grab the corresponding rows in C and sum them.

```
proc iml;
use output_table; /* parameterization table for matrix 3 (output table)*/
read all var {Column_Order Row_Order FieldName};
close;
idx = loc(FieldName ^= " ");
refNames = FieldName[idx];
uNames = unique(refNames); *vector of unique names in refNames;
A = {0.1 0.2 0.5,
0.2 0.4 0.6,
0.2 0.3 0.8}; /*A is the full matrix to write*/
C = A[idx];
cFin = j(1,ncol(uNames)); *target vector to populate, already the right structure;
do i = 1 to ncol(uNames); *iterate over uNames (the unique names);
uIdx = loc(refNames=uNames[i]); *find all of the rows from the refNames vector that have that unique name;
cFin[i] = sum(C[uIdx]); *now sum the rows from C that match those refNames rows and put into target vector;
end;
mattrib cFin colname=uNames;
print C idx refNames cFin;
quit;
```

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ss59

01-04-2018 11:44 AM

I think UNIQUE-LOC is the way to go - https://blogs.sas.com/content/iml/2011/11/01/the-unique-loc-trick-a-real-treat.html

I'll see if I can whip up an example.

I'll see if I can whip up an example.

Solution

01-04-2018
12:01 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to snoopy369

01-04-2018 11:51 AM

Yep, that seems to work nicely. Not sure if there's a better non-loop way, maybe Rick will have a better suggestion. I think in R there's a simpler way (without the loop), if I recall correctly, involving matrix multiplication, maybe that's possible in IML as well.

Just identify the unique refNames, then loop over that and use the loc technique again to identify the matching rows in refNames, then grab the corresponding rows in C and sum them.

```
proc iml;
use output_table; /* parameterization table for matrix 3 (output table)*/
read all var {Column_Order Row_Order FieldName};
close;
idx = loc(FieldName ^= " ");
refNames = FieldName[idx];
uNames = unique(refNames); *vector of unique names in refNames;
A = {0.1 0.2 0.5,
0.2 0.4 0.6,
0.2 0.3 0.8}; /*A is the full matrix to write*/
C = A[idx];
cFin = j(1,ncol(uNames)); *target vector to populate, already the right structure;
do i = 1 to ncol(uNames); *iterate over uNames (the unique names);
uIdx = loc(refNames=uNames[i]); *find all of the rows from the refNames vector that have that unique name;
cFin[i] = sum(C[uIdx]); *now sum the rows from C that match those refNames rows and put into target vector;
end;
mattrib cFin colname=uNames;
print C idx refNames cFin;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to snoopy369

01-04-2018 12:02 PM

That is quite neat. Thanks!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to snoopy369

01-04-2018 12:07 PM - edited 01-04-2018 12:08 PM

Recalled the matrix multiplication solution. I think it wouldn't help here. You make a NxM matrix (or is it MxN?) where N is the rows in C and M is the columns in cFin (the uniques), and populate [n,m] with a 1 if that row in C corresponds to that column of cFin, and 0 if not, then matrix multiplication will give you the result you want - but in this case to make that matrix would be just as hard as doing the unique-loc loop anyway, I think, so wouldn't save any time.

If A is really huge, and you're not taking all that many values from it, then you can probably make that matrix though more easily from refNames and use matrix multiplication twice to get from A to the result.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ss59

01-04-2018 12:16 PM

Use the UNIQUE-LOC trick to loop over the unique elements of the refNames. See

An efficient alternative to the UNIQUE-LOC trick

```
uNames = unique(refNames); /* unique names (sorted) */
newRefNames = uNames; /* initialize character array to set the length */
D = j(1, ncol(uNames), .);
do j = 1 to ncol(uNames);
jdx = loc( refNames = uNames[j]);
D[j] = sum( C[jdx] );
newRefNames[j] = refNames[jdx[1]];
end;
print D[c=uNames];
```

IMHO, you could save a lot of time if you clean and organize the data. Combining duplicate fields and eliminating/recoding missing values are part of the data cleansing phase. IML is typically used for the analysis phase.