DATA Step, Macro, Functions and more

Changing the Column name with the Row Value

Accepted Solution Solved
Reply
Super Contributor
Posts: 381
Accepted Solution

Changing the Column name with the Row Value

[ Edited ]

I have a table which looks like the one below. I also have another table which has "Q"s that went through a filter. What I want to do is to replace the names of the"Q"s that are as rows in the second table with the ones that match with the Q's that are the column names in te first table and also type the matching ones as '_Q' or  'Q_' . Actually ı could achive the process transposing the table three times. What I want to know is whether there is another shorter way of doing this.
Thank you.

can.png

 

 

can.png

 

Screenshot_2.png

 

data table;
length ID $ 7 Q1 8 Q2 8 Q3 8 Q4 8 Q5 8 Q6 8;
infile datalines missover dlm=",";
input ID Q1 Q2 Q3 Q4 Q5 Q6 ;
datalines;
RefID1,0.90,0.80,0.00,0.90,0.00,0.70
RefID2,0.100,0.100,0.00,0.70,0.00,60
RefID3,0.40,0.80,0.00,0.90,0.00,0.50
RefID4,0.55,0.80,0.05,0.90,0.00,0.69
RefID5,0.00,0.80,0.60,0.90,0.20,0.90
RefID6,0.96,0.00,0.40,0.90,0.00,0.95
RefID7,0.00,0.80,0.90,0.90,0.00,0.99
RefID8,0.56,0.80,0.55,0.90,0.00,0.93
RefID9,0.99,0.80,0.99,0.90,0.00,0.70
RefID10,0.89,0.88,0.56,0.90,0.00,0.00
;
%let colCnt=6;
data out_data(keep=Percentage Variable);
	format pcntage percent7.2;
    array t t1-t&colCnt.;
	retain t1-t&colCnt. 0;  * retain some total counts;
	set table end = eof;
		array Q Q1-Q&colCnt.;
		do i=1 to &colcnt;
		    t(i)=t(i)+(Q(i)ne 0);  * add each non-missing to the total count ;
		end;
		if eof then do i=1 to &colcnt.;
			Variable='Q'||strip(put(i,best.));  * create column name ;
			Percentage=t(i)/_n_; * calculate %age ;
			if Percentage>0.70 then output; * output one row for each of the analyzed columns;
		end;
	run;
PROC TRANSPOSE DATA=WORK.table
	OUT=WORK.TransposedTABLE
	PREFIX=Q
	NAME=QKolon ;
	VAR Q:;
	RUN;
	QUIT;
PROC SQL;
create table tableJoin2 as
select *, Case
			When j.Variable =t.QKolon then left(trim(j.Variable)||"_")
			else t.QKolon
			end as newColumn
 from TransposedTABLE as T
left join out_data as J
on(J.Variable=T.QKolon);
quit;
PROC TRANSPOSE DATA=WORK.TABLEJOIN2
	OUT=WORK.TABLEJOIN3(drop=_:) ;
	ID newColumn;
	VAR Q1-Q10;
	RUN;
	QUIT;

Accepted Solutions
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 9,687

Re: Changing the Column name with the Row Value

Assuming I understand what you mean.

 

data table;
length ID $ 7 Q1 8 Q2 8 Q3 8 Q4 8 Q5 8 Q6 8;
infile datalines missover dlm=",";
input ID Q1 Q2 Q3 Q4 Q5 Q6 ;
datalines;
RefID1,0.90,0.80,0.00,0.90,0.00,0.70
RefID2,0.100,0.100,0.00,0.70,0.00,60
RefID3,0.40,0.80,0.00,0.90,0.00,0.50
RefID4,0.55,0.80,0.05,0.90,0.00,0.69
RefID5,0.00,0.80,0.60,0.90,0.20,0.90
RefID6,0.96,0.00,0.40,0.90,0.00,0.95
RefID7,0.00,0.80,0.90,0.90,0.00,0.99
RefID8,0.56,0.80,0.55,0.90,0.00,0.93
RefID9,0.99,0.80,0.99,0.90,0.00,0.70
RefID10,0.89,0.88,0.56,0.90,0.00,0.00
;
run;
data x;
input variable $ percentage;
cards;
Q1 0.9
Q2 0.8
Q4 0.2
Q6 o.5
;
run;
proc sql;
 select cats(variable,'=',variable,'_') into : list separated by ' '
  from x;
quit;
proc datasets library=work nodetails nolist;
 modify table;
 rename &list ;
quit;

View solution in original post


All Replies
Solution
‎09-25-2015 06:23 AM
Super User
Posts: 9,687

Re: Changing the Column name with the Row Value

Assuming I understand what you mean.

 

data table;
length ID $ 7 Q1 8 Q2 8 Q3 8 Q4 8 Q5 8 Q6 8;
infile datalines missover dlm=",";
input ID Q1 Q2 Q3 Q4 Q5 Q6 ;
datalines;
RefID1,0.90,0.80,0.00,0.90,0.00,0.70
RefID2,0.100,0.100,0.00,0.70,0.00,60
RefID3,0.40,0.80,0.00,0.90,0.00,0.50
RefID4,0.55,0.80,0.05,0.90,0.00,0.69
RefID5,0.00,0.80,0.60,0.90,0.20,0.90
RefID6,0.96,0.00,0.40,0.90,0.00,0.95
RefID7,0.00,0.80,0.90,0.90,0.00,0.99
RefID8,0.56,0.80,0.55,0.90,0.00,0.93
RefID9,0.99,0.80,0.99,0.90,0.00,0.70
RefID10,0.89,0.88,0.56,0.90,0.00,0.00
;
run;
data x;
input variable $ percentage;
cards;
Q1 0.9
Q2 0.8
Q4 0.2
Q6 o.5
;
run;
proc sql;
 select cats(variable,'=',variable,'_') into : list separated by ' '
  from x;
quit;
proc datasets library=work nodetails nolist;
 modify table;
 rename &list ;
quit;
Super Contributor
Posts: 381

Re: Changing the Column name with the Row Value

Thank you, yes you understood rightly Smiley Happy

Super User
Super User
Posts: 6,502

Re: Changing the Column name with the Row Value

You can use varaible lists with RENAME.

rename q1-q6 = _q1-_q6 ;

 

Super User
Super User
Posts: 6,502

Re: Changing the Column name with the Row Value

Your statement of the problem and the code you provided do not seem linked. The code you provided is doing some type of transformation on the values, not just transposing the tables.

If you want your WIDE table converted to match your NARROW table then use PROC TRANSPOSE in this way.

proc transpose data=table1 out=narrow1 name=variable;
  by id ;
  var q1-q6 ;
run;

If you want your NARROW table converted to match your WIDE table then use PROC TRANSPOSE in this way.

proc transpose data=table2 out=wide2 ;
  id variable;
  var percentage;
run;

If you want to rename one set of variables so that you can have all 12 variables in one data set then just use a rename option.

data both ;
   if _n_=1 then set wide2 (rename=(q1-q6 = _q1-_q6)) ;
   set table1 ;
run;

 

Super Contributor
Posts: 381

Re: Changing the Column name with the Row Value

Thank you Tom. But i'm not sure it works either. I used the code below. I got error that "Variable q3 is not on file WORK.WIDE2.". Did i made something wrong.I think if we write q1-q6, rename option wants to see all member of q's.It didn't see the q3 and q5.

Super User
Super User
Posts: 6,502

Re: Changing the Column name with the Row Value

[ Edited ]

So that is caused by your data not containing an example of all possible values.  PROC TRANSPOSE will not create variables for combinations that do not exist.

There are two ways to insure that you have all possible variables.

One is to add a data step after the transpose.

data want ;

    length q1-q6 8 ;

    set wide2 ;

    rename q1-q6 = _q1-_q6 ;

run;

 

Another is to add data to your input table to insure that all possible varaibles are present.  Usually you would do this by adding a group with a clearly invalid value of the id variable that can then be dropped after the transpose.  If the data is large you can improve performance by doing with a view that you feed to proc transpose.  Something like this should work.

 

data extra / view=extra ;

    if _n_=1 then do ;

       id=-1 ;

       do i=1 to 6 ; variable = cats('Q',i); output; end;

   end;

   set table2 ;

   output;

   drop i;

run;

proc transpose data=extra out=wide2 (where=(id ne -1) rename=(q1-q6=_q1-_q6) );

   by id notsorted;

   id variable;

   var percentage ;

run;

 

   

 

Super Contributor
Posts: 381

Re: Changing the Column name with the Row Value

Hello Tom,

Thank you for detailed information Smiley Happy

Actually i'm a little bit  confused, i tried your code but it didn't get the right solution. I made some changes but didn't achieve.Columns name are changed but i didn't see any data and all of them are 'q_' now. I guess i didn't understand you.

 

Super User
Super User
Posts: 6,502

Re: Changing the Column name with the Row Value

If you tried the data step code I just posted that attempted to add extra rows please try it again.  The first time I posted it I left out an important OUTPUT statement.  The result would have been that only the extra data was kept and none of the real data.

Super Contributor
Posts: 381

Re: Changing the Column name with the Row Value

Firstly i got "ERROR: The ID value "Q1" occurs twice in the same BY group." error  then add "let" after rename it worked but it's not returned desired solution.. I think i'm doing something wrong or missing.

proc sort data=table1;
by id;
run;
data extra / view=extra ;
    if _n_=1 then do ;
       id=-1 ;
       do i=1 to 6 ; variable = cats('Q',i); output; end;
   end;
   set table2 ;
   output;
   drop i;
run;
proc transpose data=extra out=wide2 (where=(id ne -1) rename=(q1-q6=_q1-_q6) ) let;
   by id notsorted;
   id variable;
   var percentage ;
run;

 

Super User
Super User
Posts: 6,502

Re: Changing the Column name with the Row Value

So the problem is that your original TABLE2 does not already have any identifying variables to represent the groups.  You only have one group.  What is happening in that case with this data step view is that the variable ID is staying = to -1 when the first real data is read from TABLE2 (since there is no ID variable in that table).  If you just add a statement after the DO I= loop to set ID to missing then that will fix the problem.

 

    if _n_=1 then do ;
       id=-1 ;
       do i=1 to 6 ; variable = cats('q',i); output; end;
       id=.;
   end;

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 518 views
  • 0 likes
  • 3 in conversation