Help using Base SAS procedures

Looping through columns in groups of 4, combining columns and reassigning values

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

Looping through columns in groups of 4, combining columns and reassigning values

Hi,

I have some data which looks like below. The columns are actually in groups of 4 which just represent a 4 point scale. i.e. columns a, b, c & d represent the same variable where a = 4, b = 3, c = 2 and d = 1. Similarly e = 4, f = 3, g = 2 and h = 1.

The value 1 in 'a' means , 'a' has been selected by respondent and the value to be assigned for columns a, b, c and d for record 1 is 4 on the 4-point scale. The value 1 in column g means for the group of e, f, g & h, g has been chosen as the correct option by respondent and a 2 should be assigned for variable represented by group e, f, g & h.

HAVE:

abcdefgh
10000010
00101000
00010100
01000001

WANT:

VAR1 (a, b, c, d)VAR2 (e, f, g, h)
42
24
13
31

How can i achieve this?


Accepted Solutions
Solution
‎04-20-2015 11:38 AM
Super User
Posts: 10,500

Re: Looping through columns in groups of 4, combining columns and reassigning values

There are a number of approaches available depending upon your level of coding comfort.

The easiest might be an array but the variables would need to be presented in the correct order for an array statement:

array group a b c d <etc>

The trickier part is creating the names for the NEW variables.

You can get a list of all of the variable names  and count of variables into a macro variable by:

proc sql noprint;

     select name, count(*)/4 into : varlist separated by ' ' : varcount

     from dictionary.columns where libname='LIB'

          and memname='DATA';

quit;

Put the name of your library and dataset in place of LIB and DATA IN CAPITAL LETTERS;

You can use %put to see the value in the log. If only variables you want appear, and are in the correct order in the Varlist variable you can use &varlist.

data want;

set have;

array group &varlist;

array newvar  Var1 - Var&varcount;

do I = 1 to &varcount;

     base = (I-1)*4;

     newvar =  input(cats(group[base+1],group[base+2],group[base+3],group[base+4]),groupfour.);

end;

drop I base;

View solution in original post


All Replies
Super User
Posts: 17,829

Re: Looping through columns in groups of 4, combining columns and reassigning values

SASPic1.jpg

Super User
Posts: 10,500

Re: Looping through columns in groups of 4, combining columns and reassigning values

Here are two ways, one an assignment approach and the other with and informat. If you have lots of these variables I would recommend the format approach. Or go back to the survey software that created these and look at the export options.

proc format;

invalue groupfour

'1000'=4

'0100'=3

'0010'=2

'0001'=1

;

run;

data work.test;

input a b c d e f g h ;

var1 = a*4 + b*3 + c*2 + d;

var2 = input(cats(e,f,g,h),groupfour.);

datalines;

1 0 0 0 0 0 1 0

0 0 1 0 1 0 0 0

0 0 0 1 0 1 0 0

0 1 0 0 0 0 0 1

;

run;

Contributor
Posts: 40

Re: Looping through columns in groups of 4, combining columns and reassigning values

Hi BallardW,

the format approach is really cool . I have one question through - my original dataset has a lot of columns. Eventually the column names run into AB, AC , AD up to FV.

I don't want to type:

input(cats(e,f,g,h), groupfour.);

input(cats(i,j,k,l), groupfour.);

Isn't there someway I can ask SAS to apply groupfour. to successive 4 columns and keep creating new variable names?

thanks

Super User
Posts: 9,681

Re: Looping through columns in groups of 4, combining columns and reassigning values



data work.test;
input a b c d e f g h ;
array x{*} d c b a;
array y{*} h g f e;
v1=whichn(1,of x{*});
v2=whichn(1,of y{*});
datalines;
1 0 0 0 0 0 1 0
0 0 1 0 1 0 0 0
0 0 0 1 0 1 0 0
0 1 0 0 0 0 0 1
;
run;

Xia Keshan

Contributor
Posts: 40

Re: Looping through columns in groups of 4, combining columns and reassigning values

Thanks Xia, very good approach - I didn;t know the whichn function.

Is there a way we can save the keystrokes when creating array ? There are many columns in my dataset - Can we code SAS to go through 4 columns and make arrays out of them?

Solution
‎04-20-2015 11:38 AM
Super User
Posts: 10,500

Re: Looping through columns in groups of 4, combining columns and reassigning values

There are a number of approaches available depending upon your level of coding comfort.

The easiest might be an array but the variables would need to be presented in the correct order for an array statement:

array group a b c d <etc>

The trickier part is creating the names for the NEW variables.

You can get a list of all of the variable names  and count of variables into a macro variable by:

proc sql noprint;

     select name, count(*)/4 into : varlist separated by ' ' : varcount

     from dictionary.columns where libname='LIB'

          and memname='DATA';

quit;

Put the name of your library and dataset in place of LIB and DATA IN CAPITAL LETTERS;

You can use %put to see the value in the log. If only variables you want appear, and are in the correct order in the Varlist variable you can use &varlist.

data want;

set have;

array group &varlist;

array newvar  Var1 - Var&varcount;

do I = 1 to &varcount;

     base = (I-1)*4;

     newvar =  input(cats(group[base+1],group[base+2],group[base+3],group[base+4]),groupfour.);

end;

drop I base;

Contributor
Posts: 40

Re: Looping through columns in groups of 4, combining columns and reassigning values

I get this error:

NOTE: Line generated by the macro variable "VARCOUNT".

1     Var       2

                -

                22

                200

ERROR: Missing numeric suffix on a numbered variable list (Var1-Var).

WARNING: Defining an array with zero elements.

Though the macro variable &varcount is recognized, a numeric suffix is required it says.

Super User
Posts: 10,500

Re: Looping through columns in groups of 4, combining columns and reassigning values

I missed a comma retyping into the forum:

varlist separated by ' ' , : varcount

Contributor
Posts: 40

Re: Looping through columns in groups of 4, combining columns and reassigning values

Oh no I corrected that. The error is occurring in the next block -

data want;

set test;

array group &varlist;

array newvar  Var1 - Var&varcount;

do I = 1 to &varcount;

     base = (I-1)*4;

     newvar =  input(cats(group[base+1],group[base+2],group[base+3],group[base+4]),groupfour.);

end;

drop I base;

OTE: Line generated by the macro variable "VARCOUNT".

1     Var       2

                -

                22

                200

ERROR: Missing numeric suffix on a numbered variable list (Var1-Var).

WARNING: Defining an array with zero elements.

ERROR 22-322: Syntax error, expecting one of the following: a name, (, ;, _ALL_, _CHARACTER_,

              _CHAR_, _NUMERIC_.

ERROR 200-322: The symbol is not recognized and will be ignored.

Super User
Posts: 10,500

Re: Looping through columns in groups of 4, combining columns and reassigning values

Before the data step after creating varcount:

%let varcount=%left(&varcount);

The SQL is placing leading spaces.

Contributor
Posts: 40

Re: Looping through columns in groups of 4, combining columns and reassigning values

Trying another approach:

data test;

   input a b c d e f g h i j k l;

   datalines;

1 0 0 0 0 1 0 0 1 0 0 0

0 1 0 0 0 0 1 0 0 1 0 0

0 0 1 0 1 0 0 0 1 0 0 0

0 0 0 1 0 0 0 1 0 0 1 0

;

run;

proc transpose data = test out = test_t;

run;

data test_t;

set test_t;

retain block;

if _n_ = 1 then block = 1;

if mod(_n_/4,1) = 0.25 and _n_ gt 1 then block +1;

run;

This gives me:

NameCol1Col2Col3Col4block
a10001
b01001
c00101
d00011
e00102
f10002
g01002
h00012
i10103
j01003
k00013
l00003

Is there a way I can group this by "block" to get something like :

blockcol1col2col3col4
11000010000100001
20100001010000001
31000010010000010

thanks

Super User
Super User
Posts: 6,500

Re: Looping through columns in groups of 4, combining columns and reassigning values

You can use the WHICHN() function to figure our which one of the four has a 1 in it.  But that assumes clean data with only one 1 in any group (or if more than 1 is set then that the highest one "wins").

data have;

  input a b c d e f g h i j k l m n o p q r s t ;

cards;

1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0

;;;;

data want;

  set have ;

  array x a -- t ;

  array value (5) ;

  do row=1 to dim(x)/4 ;

   value(row)=whichn(1,x(4*(row-1)+4),x(4*(row-1)+3),x(4*(row-1)+2),x(4*(row-1)+1)) ;

  end;

  put (x(*)) (2. 2. 2. 2. +2)

    / (value(*)) (2. +8)

  ;

run;

1 0 0 0   0 1 0 0   0 0 1 0   0 0 0 1   0 0 0 0

4         3         2         1         0


☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 465 views
  • 7 likes
  • 5 in conversation