Re: If matrix row is zero, then set column to zero

jl1005 · Posted 04-21-2017 05:25 PM

Hi,

I'm trying to create a set of matrices where if an entire row contains only zeros, then I'd like to fill the corresponding column with 0's as well.

For example, if row 7 were to contain only 0's, I'd like to change all the values in column 7 to 0's.

Here's the code that I've tried:

do i = 1 to ncol(dsNames);
	use (dsNames[i]);
	read all var _ALL_ into X;
	tmp = X;
	X = j(nrow(X) + 8, ncol(X), 0);
	rows = setdif(1:nrow(X), do(4,32,4));
	X[rows, ] = tmp;
	do j = 1 to nrow(X);
		if X[j, ] = 0 & j ^= 4 & j ^= 8 & j ^= 12 & j ^= 16 & j ^= 20 & j ^= 24 & j ^= 28 & j ^= 32 then X[ , j] = 0;
	end;
	print x;
	call valset(MatNames[i], X);
	close (dsNames[i]);
end;

The second do loop is the one that checks for 0's. Also, I'd like to leave rows 4, 8, 12, ..., 32 alone.

The code doesn't throw any errors, but I've checked the output matrices, and the columns aren't being rewritten properly.

The purpose of this is to create a square, 32x32, transition matrix. The issue this is meant to fix is that even though we end in a state, we don't necessarily have a way to transition out of that state. For example, suppose we end in state 25 and set that as our new start state. In the data set, there are no observations in which we start in that state, so we have no way to transition out, which is a problem.

But then suppose we do set a column to 0, which changes the values in each row. Is it an issue if the rows no longer add up to one when using SAS's table distribution with randfun to generate our next state?

Is there a better way to go about this?

Thank you.

Ksharp · Posted 04-21-2017 11:33 PM

Please post some data and the output you want.

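proc iml;
x={1 2 3 4,
   0 0 0 0,
   3 4 0 0,
   0 0 0 0};
/* indices of the rows in which every element is 0 */
idx=loc(((x=0)[,+])=ncol(x));
/* zero the matching columns; skip when no all-zero row exists */
if ncol(idx)>0 then x[,idx]=0;
print x;
quit;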

IanWakeling · Posted 04-22-2017 03:25 AM

I can't see any errors looking at the code, so, as Ksharp has said, you need to provide an example with data so that we can understand the problem better.

I have one suggestion: rather than using the complex IF statement, why not restrict the loop to the rows you are interested in, which you have conveniently stored in the matrix rows?

do j = 1 to ncol(rows);
  if X[rows[j], ] = 0  then X[ , rows[j]] = 0;
end;

Rick_SAS · Posted 04-22-2017 06:14 AM

You won't get a transition matrix if you have a row of zeros. By definition, each row of a transition matrix sums to 1. A transition matrix shouldn't have any rows that are all zero, so I'm not sure why you are asking this question.

I'd also like to mention that you can relabel your "states" in the transition matrix. Instead of special handling for the rows for which mod(row,4)=0, just relabel those states to be 1-8. Then your transition matrix has a simple block structure.
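
As a minimal sketch of the row-sum condition described above, the following checks whether a matrix qualifies as a transition matrix. The matrix values are taken from the example later in this thread; the names rowSum, isStochastic, and badRows and the tolerance 1e-8 are assumptions chosen for illustration.

```sas
proc iml;
/* example matrix; row 3 has no observed transitions */
A = {0.22 0.65 0.08 0.05,
     0.16 0.26 0.38 0.20,
     0    0    0    0,
     0.06 0.32 0.56 0.06};

rowSum = A[ ,+];                          /* column vector of row sums */
tol = 1e-8;                               /* assumed tolerance for rounding error */

/* 1 if every entry is nonnegative and every row sums to 1, else 0 */
isStochastic = all(A >= 0) & all(abs(rowSum - 1) < tol);

badRows = loc(abs(rowSum - 1) >= tol);    /* rows that violate the condition */
print rowSum, isStochastic;
if ncol(badRows) > 0 then print badRows;
quit;
```

Running a check like this on each matrix before simulating makes it clear which rows would cause trouble.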

jl1005 · Posted 04-22-2017 06:52 PM

Here's some code that should be a reasonably accurate representation:

proc iml;
	A = {	0.22	0.65	0.08	0.05,
			0.16	0.26	0.38	0.2,
			0	0	0	0,
			0.06	0.32	0.56	0.06};
	print A;
	do j = 1 to nrow(A);
		if A[j, ]=0 & j^=2 & j^=4 then A[ ,j] = 0; 
	end;
	print A;


pts =	{	5	4	3	0,
		5	2	4	2,
		1	3	3	2,
		5	1	1	2};

num_games = 1;
%LET end_games = 50; 
results = j(&end_games., 2, 0);

do while(num_games <= &end_games);
	start_state = 1;
	points = 0;	

	do while( start_state^=2 & start_state^=4);
		end_state = randfun(1, 'Table', A[start_state, ]);
		points = points + pts[start_state, end_state];
		start_state = end_state;
	end;

	results[num_games, 1] = num_games;
	results[num_games, 2] = points;
	num_games = num_games + 1;
end;
print results;

It will run a few iterations without issue, but setting end_games to a value such as 500 will more likely than not throw an error.

I don't think it fully fits the transition matrix definition. Instead of transitioning from one state to any other, we start with row 1, then randomly pick a column in that row, then set that as our next start state, or row to use. For example, start in row 1, select column 2, then row 2 is our next start state, and select a column from that row based on the probabilities contained in that row.

The issue is that since the data has been split up, some datasets do not contain some of the start state and end state combinations, hence the rows of 0's. So if we were to select that row as a start state, we'd never be able to transition out of that state. By setting the corresponding columns to 0, we would ideally never choose that column, as the probability of selecting it is 0.

jl1005 · Posted 04-22-2017 07:19 PM

The code throws an error when the end state is greater than 4, which is outside the points matrix. I would guess that it's selecting a state 5 because the row probabilities don't add up to one; I may be completely wrong though.

Although probably not the best solution, a do until loop solves the issue, as shown below:

do while(num_games <= &end_games);
	start_state = 1;
	points = 0;	

	do while( start_state^=2 & start_state^=4);
		do until(end_state <= 4);
			end_state = randfun(1, 'Table', A[start_state, ]);
		end;
		points = points + pts[start_state, end_state];
		start_state = end_state;
	end;

	results[num_games, 1] = num_games;
	results[num_games, 2] = points;
	num_games = num_games + 1;
end;

Rick_SAS · Posted 04-22-2017 08:16 PM

It seems like the third row should be

0	0	1	0,

which means, "when you enter this state, you never leave it." These are the "Hotel California states" in your system.

The problem occurs because you are specifying a probability vector that does not sum to one. As explained in the doc for the RAND function, the function assumes that there is an (n+1)th category that receives the remaining probability, 1 - sum(p). For a row of zeros that probability is 1, so the return value in your example will be 5, which leads to the "subscript out of range" error.

proc iml;
p = {0 0 0 0};  /* 4 elements; sum(p) < 1 */
item = randfun(1, "table", p);
print item;     /* always 5   */
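
One way to apply the suggestion above is to turn every all-zero row into an absorbing row by putting a 1 on the diagonal. This is only a sketch under the assumption that staying in the state forever is an acceptable model; the names zeroRows, k, and r are hypothetical.

```sas
proc iml;
A = {0.22 0.65 0.08 0.05,
     0.16 0.26 0.38 0.20,
     0    0    0    0,
     0.06 0.32 0.56 0.06};

/* entries are nonnegative, so a row sum of 0 means the whole row is 0 */
zeroRows = loc(A[ ,+] = 0);

/* make each such state absorbing: once entered, it is never left */
do k = 1 to ncol(zeroRows);
   r = zeroRows[k];
   A[r, r] = 1;
end;
print A;
quit;
```

After this change every row sums to 1, so the "Table" distribution can no longer return the extra category 5. Note that a simulation loop that enters an absorbing state needs its own stopping rule, otherwise it never ends.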

jl1005 · Posted 04-23-2017 01:38 AM

I think I'm going about this the wrong way; or at least my solution is wrong.

Suppose we have three players, each with a separate transition matrix, shown below as A, B, and C.

	A = {	0.22	0.65	0.08	0.05,
		0.16	0.26	0.38	0.2,
		0.23	0.17	0.36	0.24,
		0.06	0.32	0.56	0.06};

	B = {	0.65	0.18	0.08	0.09,
		0	0	0	0,
		0.36	0.14	0.43	0.07,
		0	0	0	0};
	C = {	0	0	0	0,
		0.09	0.28	0.33	0.30,
		0.26	0.21	0.41	0.12,
		0.43	0.25	0.30	0.02};

So, as I explained above, the column you select in your starting row determines your next starting row, for the next player.

Suppose that we start in row 1 in matrix A, and we select column 2. Then we move to B and use row 2 as our starting state. But there's no way to transition out of that state.

Is there a way to generate your end state, then look at the next matrix, and see if it only contains 0's, and if it does, generate a new end state?
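
A possible alternative to redrawing, sketched here under the assumption that dead-end states should simply be excluded: zero the columns of the current matrix that lead to all-zero rows of the next matrix, then rescale each row so that it sums to 1 again. In distribution this matches redrawing until a usable state comes up. The names deadEnds, P, rowSum, and ok are hypothetical.

```sas
proc iml;
A = {0.22 0.65 0.08 0.05,
     0.16 0.26 0.38 0.20,
     0.23 0.17 0.36 0.24,
     0.06 0.32 0.56 0.06};
B = {0.65 0.18 0.08 0.09,
     0    0    0    0,
     0.36 0.14 0.43 0.07,
     0    0    0    0};

deadEnds = loc(B[ ,+] = 0);      /* states that the next player cannot leave */

P = A;                           /* work on a copy of the current matrix */
if ncol(deadEnds) > 0 then P[ , deadEnds] = 0;

rowSum = P[ ,+];
ok = loc(rowSum > 0);            /* rows that still have somewhere to go */
if ncol(ok) > 0 then P[ok, ] = P[ok, ] / rowSum[ok, ];   /* rescale rows */
print P;
quit;
```

The adjusted matrix P can then be passed to randfun in place of A when the next player uses B. A row of A whose only nonzero entries point to dead ends would remain all zero and would still need separate handling.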

jl1005 · Posted 04-23-2017 02:04 AM

It's clearly not the cleanest, but I think this works:

proc iml;
	A = {	0.22	0.65	0.08	0.05,
		0.16	0.26	0.38	0.2,
		0.23	0.17	0.36	0.24,
		0.06	0.32	0.56	0.06};

	B = {	0.65	0.18	0.08	0.09,
		0	0	0	0,
		0.36	0.14	0.43	0.07,
		0	0	0	0};
	C = {	0	0	0	0,
		0.09	0.28	0.33	0.30,
		0.26	0.21	0.41	0.12,
		0.43	0.25	0.30	0.02};

	pts =	{0	4	1	0,
		4	0	2	2,
		3	4	1	0,
		4	4	0	0};

print A;
print B;
print C;
print pts;



%LET batter_1 = A;	/*Set-up batting order */
%LET batter_2 = B;
%LET batter_3 = C;

condition= 0;
num_games = 1;

do while(num_games <= 10);

	points		= 0;
	at_bats		= 1;
	start_state = 1;

	do while(start_state ^= 3);		/*Assume that state 3 represents 3 outs */
		batter = mod(at_bats, 3);	/*Since we only have 3 batters, mod 3*/
		next_batter = mod(at_bats+1,3);
		print batter;
		print next_batter;

		if batter = 1 then current_bat = &batter_1;
		if batter = 2 then current_bat = &batter_2;
		if batter = 0 then current_bat = &batter_3;

		if next_batter = 1 then next_bat = &batter_1;
		if next_batter = 2 then next_bat = &batter_2;
		if next_batter = 0 then next_bat = &batter_3;

		print current_bat;
		print next_bat;

			do until(condition = 1);
				end_state = randfun(1, 'table', current_bat[start_state, ]);
				print end_state;
				/* accept the state if the next batter's row has any nonzero entry */
				if any(next_bat[end_state, ] ^= 0) then condition = 1;
				else print 'generate new end_state';
			end;

		points = points + pts[start_state, end_state];
		at_bats = at_bats + 1;
		start_state = end_state;
		condition = 0;
	end;
	print points;
num_games = num_games +1;
end;

Rick_SAS · Posted 04-23-2017 06:46 AM

In your example, only matrix A is a transition matrix. I think you should review Markov chains, how they are defined, and how to use them. The article "Markov transition matrices in SAS/IML" has a reference to an online textbook (Grinstead and Snell) that is clear and has many examples.

Good luck.
