DATA Step, Macro, Functions and more

Looping through Columns of Dataset

Reply
Occasional Contributor
Posts: 9

Looping through Columns of Dataset

Hi, I am coming from a background in R and am wondering how SAS handles arrays. I have a dataset with more than two columns and want to write a loop that allows me to compare the values of an entire column to the those of another column. For example,

 

Column 1          Column 2

1                       12

2                       13

3                       14 

4                       15

 

I want to compare 12 to 1, 13 to 2, 14 to 3, 15 to 4, etc. Is this possible to do in SAS 9.4? Thank you so much.

Super User
Posts: 17,750

Re: Looping through Columns of Dataset

SAS processes each data set line by line. You need to not think of it as an array/matrix unless you're in IML.

 

It's a dataset and each line is going to be processed one at a time.

 

What type of comparison do you want? If you want the difference it would be something as follows:

 

data want;
set have;

diff = col1 - col2;

run;

 

Occasional Contributor
Posts: 9

Re: Looping through Columns of Dataset

Hi Reeza,

 

Thank you for the reply. So, currently, I've written a do-statement and want to iterate through the elements of the array, comparing the values of, say, the second element in the array, to the first and assign a new variable something if that condition is satisfied. So, for example,

 

Column 1    Column 2

1                  12

2                  13

3                  14

4                  15

 

 

If the first entry of column 2 (12) is equal to the first entry of column 1 (1), then test_var = something. Then, skip to the next entry of column 2 and compare it to the next entry of column 1, etc. 

 

Thanks again! I hope what I'm asking is clear.

This is my code (it's wrong):

data work.test2;
	set work.test;
	array areverse {9} x1 x2 x3 x4 x5 x6 x7 x8 x9;
	if (areverse[1] = 0) then test_var = 1;
	do i=1 to 8;
		if (areverse[1] > 0 and areverse[i+1] = 0) then test_var = i+1;
	end;
run;
Super User
Posts: 17,750

Re: Looping through Columns of Dataset

Your logic doesn't make sense in terms of SAS. Rather than code can you provide logic in sample expected output.

 

Arrays in SAS are short cuts to reference a variable value in a specific row. Like I mentioned, you can not use the concepts of an array and matrix the way you're used to in R in SAS. Think of it as a row wise process. 

You look at the first row of data and do something. Then you move to the next row and so forth. To reference a variable in a data step means to reference it in each row.

 

Here's an example you can run.

data sample;
set sashelp.class;

age2 = age*2;

run;

Proc print data=sample;
var age age2;
run;


SAS has a free intro course for R programmers if you're interested.

 

Occasional Contributor
Posts: 9

Re: Looping through Columns of Dataset

[ Edited ]

Thank you for clarifying that for me. It's hard to switch from R to SAS, haha!

Essentially, I have 9 elements of an array. I want to first check if the first element ("variable") of the array is equal 0; if it is, then test_var will get the value 1.

Then, I wish to iterate through the other elements of the array ("variables") and check whether the values of each variable is equal to 0. If the condition is met, the test_var gets assigned the number corresponding to the variable's index in the array; otherwise it should be missing. test_var also does not change after each iteration (if the condition has already been met). If test_var = 2, it will stay that way. test_var will basically be missing until the "variable" satisfies the condition.

x2 test_var
1 .
2 .
0 2
1 .
2 .

x3 test_var
1 .
0 3
2 2
1 .
2 .

Repeat for x4, x5, etc.

Super User
Posts: 17,750

Re: Looping through Columns of Dataset

Try phrasing your question without using the word array. At all. 

 

Also, look at the WHICHN function. 

 

test_var = whichn(0, of x1-xn);

 

 

Occasional Contributor
Posts: 9

Re: Looping through Columns of Dataset

[ Edited ]

Okay, another attempt (my apologies, I'm just trying to find a way to do this in SAS--need to do this for future projects, as the company only uses SAS :-P)

 

Total of 9 columns: x1 to x9.

 

Search through all the values of x1, row by row. If the value in row 9 is 0, then test_var in row 9 will be 0 and missing on the previous 8.

 

Then, search through all the values of x2, x3, etc., row by row, conditioned on the fact that the corresponding row of x1 is NOT 0 AND the value of x2, x3, etc. is 0. Same idea as above, basically.

 

 

I think you alluded that you can do this in SAS IML? I have access to that, too. Thanks!

 

Super User
Posts: 10,466

Re: Looping through Columns of Dataset

How about providing some actual input data and the desired output?

If you have a data set you can use these instructions: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create data step code to provide us values we can work with.

 

Since Base SAS basically looks at one ROW at a time it does provide a function to look at values in the lines above, LAG, which has some tricks to using. But not to set values in lines above.

 

If you are routinely going to do matrix operations then you need to look into add the SAS/IML module.

Or rethink the data layouts/operations.

 

I am kind of interested to see what you do with row 10 in your data, if it exists.

 

Super User
Posts: 17,750

Re: Looping through Columns of Dataset

I'd almost want to flip your data to process it... it really does depend on your next steps though.

 

 

Super User
Posts: 9,662

Re: Looping through Columns of Dataset

SAS/IML can handle matrix , but it is another language other than data step.
SAS/IML is very like R, if you are familiar with R, you could easily master SAS/IML .

For your question , you need post some more data and the output to clarify your questions.


data have;
input c1 c2;
cards;
1                       12
2                       13
3                       14 
4                       15
;
run;

proc iml;
use have;
read all var _num_ into x;
close;

yes_no=(x[,1]=x[,2]) ;
print yes_no;

index=loc(yes_no);
print index;

quit;

Occasional Contributor
Posts: 9

Re: Looping through Columns of Dataset

Thanks for all the replies, everyone! I can get the desired output if I use if-then-else statements, but my main concern is: What if I have a lot of variables in my dataset? This is my if-then-else statement. How can I do this more efficiently when I have many more variables?

 

data work.test2;
	set work.test;
	if v1 = 0 then test = 1;
	else if v2 = 0 then test = 2;
	else if v3 = 0 then test = 3;
	else if v4 = 0 then test = 4;
	else if v5 = 0 then test = 5;
	else if v6 = 0 then test = 6;
	else if v7 = 0 then test = 7;
	else if v8 = 0 then test = 8;
	else if v9 = 0 then test = 9;
run;
Super User
Posts: 17,750

Re: Looping through Columns of Dataset

If this is your code, the WHICHN statement I suggested earlier will work. 

Occasional Contributor
Posts: 9

Re: Looping through Columns of Dataset

Hi Reeza,

 

Not sure if I completely understand the usage of whichn despite reading some documentation about it, but this code does not yield the same results as the if-then-else statement.

 

data work.test2;
	set work.test;
	array areverse {9} v1 v2 v3 v4 v5 v6 v7 v8 v9;
	test_var = whichn(0, of areverse{*});
run;
Super User
Posts: 17,750

Re: Looping through Columns of Dataset

Post an example of where the results don't match. 

Occasional Contributor
Posts: 9

Re: Looping through Columns of Dataset

After examining the output more carefully, I see that it's setting all the missing values to 0. How can I fix this?

 

Desired:

Screen Shot 2016-09-29 at 1.52.00 PM.png

 

What I'm getting:

Screen Shot 2016-09-29 at 1.52.47 PM.png

 

Thanks so much for all your help.

Ask a Question
Discussion stats
  • 18 replies
  • 1089 views
  • 3 likes
  • 4 in conversation