BookmarkSubscribeRSS Feed
jl1005
Obsidian | Level 7

Hi,

 

I've read over Rick's articles that describe how to loop through mutiple datasets in order to output multiple different matricies, and I think I've followed along and should be close, but my code is still throwing errors.

 

I have data sets labeled class_01_data to class_15_data, and would like to put those into matricies titled t01 to t15.

 

Here's what I have so far.

proc iml;
dsNames = {01 02 03 04 05 06 07 08 09 10 11 12 13 14 15};
	do i = 01 to ncol(dsNames);
		ds = 'class_' + (dsNames[i]) + '_data'; *Should be the name of the input dataset;
		MatNames = 't01':('t'+strip(char(ncol(dsNames))));  *Want to be the name of the output matrix;
		use (ds);
			read all var _ALL_ into X;
			call valset(MatNames, X);
			close (ds);
			print(MatNames);
	end;

Thank you.

 

11 REPLIES 11
PeterClemmensen
Tourmaline | Level 20

What kind of matrix computation do you want to apply to the matrices titled t01 to t15?

 

In this type of problem it is better coding practice to read a dataset into a matrix, X do some computation on X and store the result in a vector/matrix, instead of having 15 matrices read in at the same time. Below I have outlined what I think your program should look like along with a few suggestions.

 

data class_1_data;
input var1 var2 var3;
datalines;
100 100 200
600 300 400
600 200 200
;

data class_2_data;
input var1 var2 var3;
datalines;
200 100 200
900 300 400
300 200 400
;

data class_3_data;
input var1 var2 var3;
datalines;
100 200 400
300 400 200
200 600 500
;

proc iml;
dsNames = cats("class_1":"class_3","_data");
sums = j(1,ncol(dsNames),0);

do i = 1 to ncol(dsNames);
   use (dsNames[i]);
      read all var _NUM_ into X;
   close (dsNames[i]);
   sums[i] = X[+];
end;

print sums;

quit;

 

Regards

jl1005
Obsidian | Level 7

Once a dataset is read in, I'd like to insert 8 rows of zeros in certain spots:

	tmp=insert(t2,z,4);
	tmp=insert(tmp,z,8);
	tmp=insert(tmp,z,12);
	tmp=insert(tmp,z,16);
	tmp=insert(tmp,z,20);
	tmp=insert(tmp,z,24);
	tmp=insert(tmp,z,28);
	t2=insert(tmp,z,32);
	print t2;

There's probably a more efficient way to do that, but that all I'm doing, for now.

Rick_SAS
SAS Super FREQ

Regarding inserting 8 rows of zeros, I would 

1. Allocate a new zero matrix that has eight more rows than your data

2. Use the SETDIF function to exclude the rows that you wish to be nonzero.

3. Copy the data into those rows

 

proc iml;
t2 = j(40, 5, 1);  /* example data */
tmp = t2;          /* save copy of data */
t2 = j(nrow(t2) + 8, ncol(t2), 0); /* allocate */
rows = setdif(1:nrow(t2), do(4,32,4)); /* 1,2,3, 5,6,7, 9,... */
t2[rows, ] = tmp;  /* copy the data into the new matrix */
PeterClemmensen
Tourmaline | Level 20

Though @Rick_SAS's method is probably more efficient, you can use the insert function as you suggest like in this small example

 

data testdata(drop = i);
do i=1 to 40;
   x = i;
   y = 2*x;
   output;
end;
run;

proc iml;
use testdata;
   read all var _ALL_ into X;
close testdata;

zeros = j(1,ncol(X),0);
rows = do(4,32,4);

do i = 1 to ncol(rows);
   X = insert(X,zeros,rows[i]);
end;

print X;
quit;

 

Rick_SAS
SAS Super FREQ

For a discussion of draycut's technique, see the article "read data sets that are specified by an array of names."

jl1005
Obsidian | Level 7

Combining what's been suggested has yielded this:

 

proc iml;
dsNames = cats('class_01':'class_15','_data');
	do i = 01 to ncol(dsNames);
		use (dsNames[i]);
			read all var _ALL_ into X;
		close (dsNames[i]);
			tmp = X;
			X = j(nrow(X) + 8, ncol(X), 0);
			rows = setdif(1:nrow(t01), do(4,32,4));
			X[rows, ] = tmp;
end;

But it's throwing an error that class_01_data.data does not exist. It seems like its the .data part thats throwing it off.

 

So here's what I think I'd like the program to do for just one table, but want to expand it to fit however many datasets so the code doesn't have to be repeated 15 or more times:

proc iml;
use class_01_data;
	read all var _ALL_ into t01;
	close class_01_data;
		tmp = t01;
		t01=j(nrow(t01) + 8, ncol(t01), 0);
		rows = setdif(1:nrow(t01), do(4,32,4));
		t01[rows, ] = tmp;
		print t01;
PeterClemmensen
Tourmaline | Level 20

A few things 🙂

 

1. You should start your loop at i = 1 and not at 01.

2. Defining your dataset names in this way as I suggested above gives you class_1_data, class_2_data and so on without the zero following the underscore (I missed this zero at a first glance in my example datasets above, sorry about that). But this zero complicates things a bit, so if I were you I would rename my datasets or simply set

 

dsNames = {'class_01_data' 'class_02_data' 'class_03_data'
           'class_04_data' 'class_05_data' 'class_06_data'
           'class_07_data' 'class_08_data' 'class_09_data'
           'class_10_data' 'class_11_data' 'class_12_data'
           'class_13_data' 'class_14_data' 'class_15_data'};

, though @Rick_SAS probably has some clever solution to this 🙂

 

3. This line

 

rows = setdif(1:nrow(t01), do(4,32,4));

should be

 

rows = setdif(1:nrow(X), do(4,32,4));

as you are refferencing t01 which has never been set here.

 

 

Rick_SAS
SAS Super FREQ

If you need to use 01, 02, 03,..., you can manufacture the data set names by using the Z format:

 

dsNames = cats('class_', putn(1:15,"Z2."), '_data');
print dsNames;

 

 

jl1005
Obsidian | Level 7

Rick,

 

I'm not sure if this is enough to delegate a whole new post to it, but I just had one, hopefully, quick question. Given this code:

 

do while(num_games <= 20);
	start_state = 1;
	at_bats     = 1;
	points      = 0;
	batter      = 0;
	current_bat = 0;
	do while(
				start_state ^=  4 & start_state ^= 8  & start_state ^= 12 &
				start_state ^= 16 & start_state ^= 20 & start_state ^= 24 &
				start_state ^= 28 & start_state ^= 32
			);
		batter = mod(at_bats, 9);
		*print batter;
		if batter = 1  then current_bat = &b1 ;
		else if batter = 2  then current_bat = &b2 ;
		else if batter = 3  then current_bat = &b3 ;
		else if batter = 4  then current_bat = &b4 ;
		else if batter = 5  then current_bat = &b5 ;
		else if batter = 6  then current_bat = &b6 ;
		else if batter = 7  then current_bat = &b7 ;
		else if batter = 8  then current_bat = &b8 ;
		else if batter = 9  then current_bat = &b9 ;
		*print current_bat;
		end_state=randfun(1, 'Table',current_bat[start_state, ]); 
		points = points + runs[start_state, end_state];
		at_bats = at_bats + 1;
		start_state = end_state;
	end;
	print points at_bats start_state;
	print 'End of Game';
num_games = num_games + 1;
end;

I'd like to store each number of points and at bats from each game in a new matrix to then analyze. 

 

So if the number of games is 20, I'd create a 2x20 matrix, but then how do I have sas read each iteration of the loop as a seperate row?

 

Thank you.

Rick_SAS
SAS Super FREQ

Untested, but should give you the idea:

 

results = j(num_games, 2, 0);

do while(num_games <= 20);

   ...

   do while( mod(start_state, 4) ^= 0 );

      ...

      results[num_games,1] = points;

      results[num_games,2] = at_bats;

   end;

end;

print results[colname={"Points" "AtBats"}];

 

 

Also, if I may suggest, get rid of the long IF-THEN/ELSE statement  with macro variables, or at least move the macros outside the loop like this 

b = {&b1 &b2  &b3 &b4 &b5 &b6 &b7 &b8 &b9};  /* top of program */

...

current_bat = b[batter];   /* inside loop. Replaces IF/THEN logic */

jl1005
Obsidian | Level 7

Hi draycut,

 

I took your advice and renamed all the datasets to just 1,2,..,15 and removed the leading 0, which seems to have fixed it. So here's the updated code, which is fully functional:

proc iml;
dsNames = cats('class_1':'class_15','_data');
MatNames = dsNames;
do i = 1 to ncol(dsNames);
		use (dsNames[i]);
			read all var _ALL_ into X;		
			tmp = X;
			X = j(nrow(X) + 8, ncol(X), 0);
			rows = setdif(1:nrow(X), do(4,32,4));
			X[rows, ] = tmp;
			print X;
			call valset(MatNames[i], X);
			close (dsNames[i]);
end;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 11 replies
  • 1315 views
  • 7 likes
  • 3 in conversation