Solved
Contributor
Posts: 40

# Looping through columns in groups of 4, combining columns and reassigning values

Hi,

I have some data which looks like below. The columns are actually in groups of 4 which just represent a 4 point scale. i.e. columns a, b, c & d represent the same variable where a = 4, b = 3, c = 2 and d = 1. Similarly e = 4, f = 3, g = 2 and h = 1.

The value 1 in 'a' means , 'a' has been selected by respondent and the value to be assigned for columns a, b, c and d for record 1 is 4 on the 4-point scale. The value 1 in column g means for the group of e, f, g & h, g has been chosen as the correct option by respondent and a 2 should be assigned for variable represented by group e, f, g & h.

HAVE:

abcdefgh
10000010
00101000
00010100
01000001

WANT:

VAR1 (a, b, c, d)VAR2 (e, f, g, h)
42
24
13
31

How can i achieve this?

Accepted Solutions
Solution
‎04-20-2015 11:38 AM
Super User
Posts: 13,522

## Re: Looping through columns in groups of 4, combining columns and reassigning values

There are a number of approaches available depending upon your level of coding comfort.

The easiest might be an array but the variables would need to be presented in the correct order for an array statement:

array group a b c d <etc>

The trickier part is creating the names for the NEW variables.

You can get a list of all of the variable names  and count of variables into a macro variable by:

proc sql noprint;

select name, count(*)/4 into : varlist separated by ' ' : varcount

from dictionary.columns where libname='LIB'

and memname='DATA';

quit;

Put the name of your library and dataset in place of LIB and DATA IN CAPITAL LETTERS;

You can use %put to see the value in the log. If only variables you want appear, and are in the correct order in the Varlist variable you can use &varlist.

data want;

set have;

array group &varlist;

array newvar  Var1 - Var&varcount;

do I = 1 to &varcount;

base = (I-1)*4;

newvar =  input(cats(group[base+1],group[base+2],group[base+3],group[base+4]),groupfour.);

end;

drop I base;

All Replies
Super User
Posts: 23,700

Super User
Posts: 13,522

## Re: Looping through columns in groups of 4, combining columns and reassigning values

Here are two ways, one an assignment approach and the other with and informat. If you have lots of these variables I would recommend the format approach. Or go back to the survey software that created these and look at the export options.

proc format;

invalue groupfour

'1000'=4

'0100'=3

'0010'=2

'0001'=1

;

run;

data work.test;

input a b c d e f g h ;

var1 = a*4 + b*3 + c*2 + d;

var2 = input(cats(e,f,g,h),groupfour.);

datalines;

1 0 0 0 0 0 1 0

0 0 1 0 1 0 0 0

0 0 0 1 0 1 0 0

0 1 0 0 0 0 0 1

;

run;

Contributor
Posts: 40

## Re: Looping through columns in groups of 4, combining columns and reassigning values

Hi BallardW,

the format approach is really cool . I have one question through - my original dataset has a lot of columns. Eventually the column names run into AB, AC , AD up to FV.

I don't want to type:

input(cats(e,f,g,h), groupfour.);

input(cats(i,j,k,l), groupfour.);

Isn't there someway I can ask SAS to apply groupfour. to successive 4 columns and keep creating new variable names?

thanks

Super User
Posts: 10,770

## Re: Looping through columns in groups of 4, combining columns and reassigning values

```

data work.test;
input a b c d e f g h ;
array x{*} d c b a;
array y{*} h g f e;
v1=whichn(1,of x{*});
v2=whichn(1,of y{*});
datalines;
1 0 0 0 0 0 1 0
0 0 1 0 1 0 0 0
0 0 0 1 0 1 0 0
0 1 0 0 0 0 0 1
;
run;

```

Xia Keshan

Contributor
Posts: 40

## Re: Looping through columns in groups of 4, combining columns and reassigning values

Thanks Xia, very good approach - I didn;t know the whichn function.

Is there a way we can save the keystrokes when creating array ? There are many columns in my dataset - Can we code SAS to go through 4 columns and make arrays out of them?

Solution
‎04-20-2015 11:38 AM
Super User
Posts: 13,522

## Re: Looping through columns in groups of 4, combining columns and reassigning values

There are a number of approaches available depending upon your level of coding comfort.

The easiest might be an array but the variables would need to be presented in the correct order for an array statement:

array group a b c d <etc>

The trickier part is creating the names for the NEW variables.

You can get a list of all of the variable names  and count of variables into a macro variable by:

proc sql noprint;

select name, count(*)/4 into : varlist separated by ' ' : varcount

from dictionary.columns where libname='LIB'

and memname='DATA';

quit;

Put the name of your library and dataset in place of LIB and DATA IN CAPITAL LETTERS;

You can use %put to see the value in the log. If only variables you want appear, and are in the correct order in the Varlist variable you can use &varlist.

data want;

set have;

array group &varlist;

array newvar  Var1 - Var&varcount;

do I = 1 to &varcount;

base = (I-1)*4;

newvar =  input(cats(group[base+1],group[base+2],group[base+3],group[base+4]),groupfour.);

end;

drop I base;

Contributor
Posts: 40

## Re: Looping through columns in groups of 4, combining columns and reassigning values

I get this error:

NOTE: Line generated by the macro variable "VARCOUNT".

1     Var       2

-

22

200

ERROR: Missing numeric suffix on a numbered variable list (Var1-Var).

WARNING: Defining an array with zero elements.

Though the macro variable &varcount is recognized, a numeric suffix is required it says.

Super User
Posts: 13,522

## Re: Looping through columns in groups of 4, combining columns and reassigning values

I missed a comma retyping into the forum:

varlist separated by ' ' , : varcount

Contributor
Posts: 40

## Re: Looping through columns in groups of 4, combining columns and reassigning values

Oh no I corrected that. The error is occurring in the next block -

data want;

set test;

array group &varlist;

array newvar  Var1 - Var&varcount;

do I = 1 to &varcount;

base = (I-1)*4;

newvar =  input(cats(group[base+1],group[base+2],group[base+3],group[base+4]),groupfour.);

end;

drop I base;

OTE: Line generated by the macro variable "VARCOUNT".

1     Var       2

-

22

200

ERROR: Missing numeric suffix on a numbered variable list (Var1-Var).

WARNING: Defining an array with zero elements.

ERROR 22-322: Syntax error, expecting one of the following: a name, (, ;, _ALL_, _CHARACTER_,

_CHAR_, _NUMERIC_.

ERROR 200-322: The symbol is not recognized and will be ignored.

Super User
Posts: 13,522

## Re: Looping through columns in groups of 4, combining columns and reassigning values

Before the data step after creating varcount:

%let varcount=%left(&varcount);

The SQL is placing leading spaces.

Contributor
Posts: 40

## Re: Looping through columns in groups of 4, combining columns and reassigning values

Trying another approach:

data test;

input a b c d e f g h i j k l;

datalines;

1 0 0 0 0 1 0 0 1 0 0 0

0 1 0 0 0 0 1 0 0 1 0 0

0 0 1 0 1 0 0 0 1 0 0 0

0 0 0 1 0 0 0 1 0 0 1 0

;

run;

proc transpose data = test out = test_t;

run;

data test_t;

set test_t;

retain block;

if _n_ = 1 then block = 1;

if mod(_n_/4,1) = 0.25 and _n_ gt 1 then block +1;

run;

This gives me:

 Name Col1 Col2 Col3 Col4 block a 1 0 0 0 1 b 0 1 0 0 1 c 0 0 1 0 1 d 0 0 0 1 1 e 0 0 1 0 2 f 1 0 0 0 2 g 0 1 0 0 2 h 0 0 0 1 2 i 1 0 1 0 3 j 0 1 0 0 3 k 0 0 0 1 3 l 0 0 0 0 3

Is there a way I can group this by "block" to get something like :

 block col1 col2 col3 col4 1 1000 0100 0010 0001 2 0100 0010 1000 0001 3 1000 0100 1000 0010

thanks

Super User
Posts: 8,084

## Re: Looping through columns in groups of 4, combining columns and reassigning values

You can use the WHICHN() function to figure our which one of the four has a 1 in it.  But that assumes clean data with only one 1 in any group (or if more than 1 is set then that the highest one "wins").

data have;

input a b c d e f g h i j k l m n o p q r s t ;

cards;

1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0

;;;;

data want;

set have ;

array x a -- t ;

array value (5) ;

do row=1 to dim(x)/4 ;

value(row)=whichn(1,x(4*(row-1)+4),x(4*(row-1)+3),x(4*(row-1)+2),x(4*(row-1)+1)) ;

end;

put (x(*)) (2. 2. 2. 2. +2)

/ (value(*)) (2. +8)

;

run;

1 0 0 0   0 1 0 0   0 0 1 0   0 0 0 1   0 0 0 0

4         3         2         1         0

🔒 This topic is solved and locked.