Hello,
I have an observation with two text fields that contain listings of error keys.
I need to convert these fields into two vectors that have a zero/one indicator for each error key. I then need to be able to perform some basic matrix algebra on the vectors and store the result (a numeric value) in a third field.
EXAMPLE
Field 1: “112 1454 122 342”
Field 2: “122 1343 32”
Key for Vector Element: 112 1454 122 342 1343 32
Field 1 Vector:         1   1    1   1   0    0
Field 2 Vector:         0   0    1   0   1    1
This is essentially a numerical application of the Salton, Wong, and Yang (1975) vector space model.
Does anyone have any code handy to do this, or can anyone point me to resources where I can learn it myself? I've been struggling to find stuff.
Thank you all!
You didn't provide what your data look like, so I'll just have to guess. Try the following:
Sample code:
data have;
   length s $100;
   input s & $; /* special character '&' reads until 2 or more blanks */
   Field = _N_;
   cnt = countw(s, ' ');
   do i = 1 to cnt;
      key = scan(s, i, ' ');
      output;
   end;
datalines;
112 1454 122 342
122 1343 32
;
proc iml;
use Have;
read all var {"Field" "Key"};
close;
Fields = unique(Field);   /* distinct field numbers (row vector) */
Keys = unique(Key);       /* distinct error keys, sorted as character values */
Result = j(ncol(Fields), ncol(Keys), 0);
/* http://blogs.sas.com/content/iml/2011/11/07/an-efficient-alternative-to-the-unique-loc-technique.html */
b = uniqueby(Field, 1);   /* b[i] = beginning of i_th category */
b = b // (nrow(Field)+1); /* trick: append (n+1) to end of b */
do i = 1 to nrow(b)-1;    /* For each level... */
   idx = b[i]:(b[i+1]-1); /* Find observations in level */
   /* http://blogs.sas.com/content/iml/2014/03/17/finding-elements-in-one-vector-that-are-not-in-another-vector.html */
   Result[i,] = element(Keys, Key[idx]); /* 1 where the key occurs in this field, else 0 */
end;
F = char(Fields);
print Result[rowname=F colname=Keys];
quit;
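For the "basic matrix algebra" part of the question, here is a minimal sketch, assuming the numeric value you want is the cosine similarity between the two indicator vectors (the question does not say which measure is needed). The vectors are typed in directly from the example in the question, and the names x, y, shared, cosSim, and Sim are made up for illustration.

```sas
proc iml;
/* Indicator vectors copied from the example in the question
   (key order: 112 1454 122 342 1343 32) */
x = {1 1 1 1 0 0};
y = {0 0 1 0 1 1};

/* Inner product of two 0/1 vectors = number of keys the fields share */
shared = x * t(y);

/* Cosine similarity (an assumed choice of measure): inner product
   divided by the product of the Euclidean norms */
cosSim = shared / (sqrt(ssq(x)) * sqrt(ssq(y)));
print shared cosSim;

/* Store the numeric results in a data set (hypothetical name Sim) */
create Sim var {"shared" "cosSim"};
append;
close Sim;
quit;
```

With the example vectors, shared is 1 (only key 122 appears in both fields) and cosSim is 1/(2*sqrt(3)), roughly 0.2887. The same expressions should work on rows of the Result matrix, for example Result[1,] and Result[2,], if they are run before the PROC IML session ends.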