Solved: Re: Changing the Column name with the Row Value

turcay · Posted 09-20-2015 09:53 AM

I have a table which looks like the one below. I also have another table which has "Q"s that went through a filter. What I want to do is to replace the names of the"Q"s that are as rows in the second table with the ones that match with the Q's that are the column names in te first table and also type the matching ones as '_Q' or 'Q_' . Actually ı could achive the process transposing the table three times. What I want to know is whether there is another shorter way of doing this.
Thank you.

data table;
length ID $ 7 Q1 8 Q2 8 Q3 8 Q4 8 Q5 8 Q6 8;
infile datalines missover dlm=",";
input ID Q1 Q2 Q3 Q4 Q5 Q6 ;
datalines;
RefID1,0.90,0.80,0.00,0.90,0.00,0.70
RefID2,0.100,0.100,0.00,0.70,0.00,60
RefID3,0.40,0.80,0.00,0.90,0.00,0.50
RefID4,0.55,0.80,0.05,0.90,0.00,0.69
RefID5,0.00,0.80,0.60,0.90,0.20,0.90
RefID6,0.96,0.00,0.40,0.90,0.00,0.95
RefID7,0.00,0.80,0.90,0.90,0.00,0.99
RefID8,0.56,0.80,0.55,0.90,0.00,0.93
RefID9,0.99,0.80,0.99,0.90,0.00,0.70
RefID10,0.89,0.88,0.56,0.90,0.00,0.00
;
%let colCnt=6;
data out_data(keep=Percentage Variable);
	format pcntage percent7.2;
    array t t1-t&colCnt.;
	retain t1-t&colCnt. 0;  * retain some total counts;
	set table end = eof;
		array Q Q1-Q&colCnt.;
		do i=1 to &colcnt;
		    t(i)=t(i)+(Q(i)ne 0);  * add each non-missing to the total count ;
		end;
		if eof then do i=1 to &colcnt.;
			Variable='Q'||strip(put(i,best.));  * create column name ;
			Percentage=t(i)/_n_; * calculate %age ;
			if Percentage>0.70 then output; * output one row for each of the analyzed columns;
		end;
	run;
PROC TRANSPOSE DATA=WORK.table
	OUT=WORK.TransposedTABLE
	PREFIX=Q
	NAME=QKolon ;
	VAR Q:;
	RUN;
	QUIT;
PROC SQL;
create table tableJoin2 as
select *, Case
			When j.Variable =t.QKolon then left(trim(j.Variable)||"_")
			else t.QKolon
			end as newColumn
 from TransposedTABLE as T
left join out_data as J
on(J.Variable=T.QKolon);
quit;
PROC TRANSPOSE DATA=WORK.TABLEJOIN2
	OUT=WORK.TABLEJOIN3(drop=_:) ;
	ID newColumn;
	VAR Q1-Q10;
	RUN;
	QUIT;

Ksharp · Posted 09-20-2015 10:12 AM

Assuming I understand what you mean.

data table;
length ID $ 7 Q1 8 Q2 8 Q3 8 Q4 8 Q5 8 Q6 8;
infile datalines missover dlm=",";
input ID Q1 Q2 Q3 Q4 Q5 Q6 ;
datalines;
RefID1,0.90,0.80,0.00,0.90,0.00,0.70
RefID2,0.100,0.100,0.00,0.70,0.00,60
RefID3,0.40,0.80,0.00,0.90,0.00,0.50
RefID4,0.55,0.80,0.05,0.90,0.00,0.69
RefID5,0.00,0.80,0.60,0.90,0.20,0.90
RefID6,0.96,0.00,0.40,0.90,0.00,0.95
RefID7,0.00,0.80,0.90,0.90,0.00,0.99
RefID8,0.56,0.80,0.55,0.90,0.00,0.93
RefID9,0.99,0.80,0.99,0.90,0.00,0.70
RefID10,0.89,0.88,0.56,0.90,0.00,0.00
;
run;
data x;
input variable $ percentage;
cards;
Q1 0.9
Q2 0.8
Q4 0.2
Q6 o.5
;
run;
proc sql;
 select cats(variable,'=',variable,'_') into : list separated by ' '
  from x;
quit;
proc datasets library=work nodetails nolist;
 modify table;
 rename &list ;
quit;

View solution in original post

Ksharp · Posted 09-20-2015 10:12 AM

Assuming I understand what you mean.

data table;
length ID $ 7 Q1 8 Q2 8 Q3 8 Q4 8 Q5 8 Q6 8;
infile datalines missover dlm=",";
input ID Q1 Q2 Q3 Q4 Q5 Q6 ;
datalines;
RefID1,0.90,0.80,0.00,0.90,0.00,0.70
RefID2,0.100,0.100,0.00,0.70,0.00,60
RefID3,0.40,0.80,0.00,0.90,0.00,0.50
RefID4,0.55,0.80,0.05,0.90,0.00,0.69
RefID5,0.00,0.80,0.60,0.90,0.20,0.90
RefID6,0.96,0.00,0.40,0.90,0.00,0.95
RefID7,0.00,0.80,0.90,0.90,0.00,0.99
RefID8,0.56,0.80,0.55,0.90,0.00,0.93
RefID9,0.99,0.80,0.99,0.90,0.00,0.70
RefID10,0.89,0.88,0.56,0.90,0.00,0.00
;
run;
data x;
input variable $ percentage;
cards;
Q1 0.9
Q2 0.8
Q4 0.2
Q6 o.5
;
run;
proc sql;
 select cats(variable,'=',variable,'_') into : list separated by ' '
  from x;
quit;
proc datasets library=work nodetails nolist;
 modify table;
 rename &list ;
quit;

turcay · Posted 09-20-2015 10:25 AM

Thank you, yes you understood rightly 🙂

Tom · Posted 09-20-2015 10:30 AM

You can use varaible lists with RENAME.

rename q1-q6 = _q1-_q6 ;

Tom · Posted 09-20-2015 10:15 AM

Your statement of the problem and the code you provided do not seem linked. The code you provided is doing some type of transformation on the values, not just transposing the tables.

If you want your WIDE table converted to match your NARROW table then use PROC TRANSPOSE in this way.

proc transpose data=table1 out=narrow1 name=variable;
  by id ;
  var q1-q6 ;
run;

If you want your NARROW table converted to match your WIDE table then use PROC TRANSPOSE in this way.

proc transpose data=table2 out=wide2 ;
  id variable;
  var percentage;
run;

If you want to rename one set of variables so that you can have all 12 variables in one data set then just use a rename option.

data both ;
   if _n_=1 then set wide2 (rename=(q1-q6 = _q1-_q6)) ;
   set table1 ;
run;

turcay · Posted 09-20-2015 10:32 AM

Thank you Tom. But i'm not sure it works either. I used the code below. I got error that "Variable q3 is not on file WORK.WIDE2.". Did i made something wrong.I think if we write q1-q6, rename option wants to see all member of q's.It didn't see the q3 and q5.

Tom · Posted 09-20-2015 10:41 AM

So that is caused by your data not containing an example of all possible values. PROC TRANSPOSE will not create variables for combinations that do not exist.

There are two ways to insure that you have all possible variables.

One is to add a data step after the transpose.

data want ;

length q1-q6 8 ;

set wide2 ;

rename q1-q6 = _q1-_q6 ;

run;

Another is to add data to your input table to insure that all possible varaibles are present. Usually you would do this by adding a group with a clearly invalid value of the id variable that can then be dropped after the transpose. If the data is large you can improve performance by doing with a view that you feed to proc transpose. Something like this should work.

data extra / view=extra ;

if _n_=1 then do ;

id=-1 ;

do i=1 to 6 ; variable = cats('Q',i); output; end;

end;

set table2 ;

output;

drop i;

run;

proc transpose data=extra out=wide2 (where=(id ne -1) rename=(q1-q6=_q1-_q6) );

by id notsorted;

id variable;

var percentage ;

run;

turcay · Posted 09-20-2015 10:58 AM

Hello Tom,

Thank you for detailed information 🙂

Actually i'm a little bit confused, i tried your code but it didn't get the right solution. I made some changes but didn't achieve.Columns name are changed but i didn't see any data and all of them are 'q_' now. I guess i didn't understand you.

Tom · Posted 09-20-2015 11:29 AM

If you tried the data step code I just posted that attempted to add extra rows please try it again. The first time I posted it I left out an important OUTPUT statement. The result would have been that only the extra data was kept and none of the real data.

turcay · Posted 09-20-2015 11:38 AM

Firstly i got "ERROR: The ID value "Q1" occurs twice in the same BY group." error then add "let" after rename it worked but it's not returned desired solution.. I think i'm doing something wrong or missing.

proc sort data=table1;
by id;
run;
data extra / view=extra ;
    if _n_=1 then do ;
       id=-1 ;
       do i=1 to 6 ; variable = cats('Q',i); output; end;
   end;
   set table2 ;
   output;
   drop i;
run;
proc transpose data=extra out=wide2 (where=(id ne -1) rename=(q1-q6=_q1-_q6) ) let;
   by id notsorted;
   id variable;
   var percentage ;
run;

Tom · Posted 09-20-2015 12:11 PM

So the problem is that your original TABLE2 does not already have any identifying variables to represent the groups. You only have one group. What is happening in that case with this data step view is that the variable ID is staying = to -1 when the first real data is read from TABLE2 (since there is no ID variable in that table). If you just add a statement after the DO I= loop to set ID to missing then that will fix the problem.

    if _n_=1 then do ;
       id=-1 ;
       do i=1 to 6 ; variable = cats('q',i); output; end;
       id=.;
   end;

Classroom Training Available!