How to remove columns with same entries

Reply
Contributor
Posts: 33

How to remove columns with same entries

Hi there,

I wonder if someone can help me.

I have a data set with several columns and would like to get rid of those columns with duplicated entries like the example below

 

I have a data set:          
id a b c d e
1 blue blue blue blue red
2 white   white yellow red
3 grey grey black black red
4 orange orange orange    
5 brown orange brown white  brown
           
We would llike to remove repeated columns by patient so the data set will look like:          
id a b c d e
1 blue red      
2 white yellow red    
3 grey black red    
4 orange        
5 brown orange white    

 

all your help will be appreciated

 

Thanks

Eduardo.

Super User
Posts: 17,837

Re: How to remove columns with same entries

Transpose, remove duplicates via proc sort and re-transpose.

Contributor
Posts: 33

Re: How to remove columns with same entries

Problem is that I have lot more variables,
Super User
Posts: 17,837

Re: How to remove columns with same entries

Transpose, Remove Duplicates, Transpose again, use a data step to sort in appropriate order. 

 

data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;

proc transpose data=have out=temp1(rename=col1=var_a_e);
by id;
var a b c d e;
run;

proc sort data=temp1 nodupkey out=temp2 (where=(var_a_e ne ""));
by id var_a_e;
run;


proc transpose data=temp2 out=temp3;
by id;
id _name_;
var var_a_e;
run;

data want;
retain id a b c d e;
set temp3;
array lett(5) $ a b c d e;
array rev(5) e d c b a;
call sortc(of rev(*));

drop _name_;
run;

proc print;run;
Respected Advisor
Posts: 3,124

Re: How to remove columns with same entries

You can also resort to the almighty data step:

data want;
	array t(5) $10 _temporary_;
	set have;
	array ae(5) a--e;
	call missing (of t(*));

	do i=1 to 5;
		if ae(i) not in t then
			do;
				j=sum(j,1);
				t(j)=ae(i);
			end;
	end;

	do i=1 to 5;
		ae(i)=t(i);
	end;

	drop i j;
run;
Contributor
Posts: 33

Re: How to remove columns with same entries

Thanks Haikuo,
I have 36 columns not only 5, I ran what you suggested but didn't work
Super User
Posts: 9,681

Re: How to remove columns with same entries

Or simple IML code :

 

data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;
proc iml;
use have;
read all var {id};
read all var _char_ into x[c=vnames];
y=j(nrow(x),ncol(x),blankstr(nleng(x)));
do i=1 to nrow(x);
 temp=unique(x[i,]); 
 idx=loc(temp^=' ');
 y[i,1:ncol(idx)]=temp[,idx];
end;
print y;
run;

 

 

Contributor
Posts: 33

Re: How to remove columns with same entries

Thanks Ksharp, my SAS licence does not include the iml procedure
Super User
Posts: 9,681

Re: How to remove columns with same entries

data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;


%let columns=5;  /* <---- */


data want;
	array t(&columns) $ 32 _temporary_;
	set have;
	array ae(*) $ _character_;
	call missing (of t(*));

	do i=1 to dim(ae);
		if ae(i) not in t then
			do;
				j=sum(j,1);
				t(j)=ae(i);
			end;
	end;

	do i=1 to dim(ae);
		ae(i)=t(i);
	end;

	drop i j;
run;
Ask a Question
Discussion stats
  • 8 replies
  • 428 views
  • 1 like
  • 4 in conversation