BookmarkSubscribeRSS Feed
lalohg
Quartz | Level 8

Hi there,

I wonder if someone can help me.

I have a data set with several columns and would like to get rid of those columns with duplicated entries like the example below

 

I have a data set:          
id a b c d e
1 blue blue blue blue red
2 white   white yellow red
3 grey grey black black red
4 orange orange orange    
5 brown orange brown white  brown
           
We would llike to remove repeated columns by patient so the data set will look like:          
id a b c d e
1 blue red      
2 white yellow red    
3 grey black red    
4 orange        
5 brown orange white    

 

all your help will be appreciated

 

Thanks

Eduardo.

8 REPLIES 8
Reeza
Super User
Transpose, remove duplicates via proc sort and re-transpose.

lalohg
Quartz | Level 8
Problem is that I have lot more variables,
Reeza
Super User

Transpose, Remove Duplicates, Transpose again, use a data step to sort in appropriate order. 

 

data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;

proc transpose data=have out=temp1(rename=col1=var_a_e);
by id;
var a b c d e;
run;

proc sort data=temp1 nodupkey out=temp2 (where=(var_a_e ne ""));
by id var_a_e;
run;


proc transpose data=temp2 out=temp3;
by id;
id _name_;
var var_a_e;
run;

data want;
retain id a b c d e;
set temp3;
array lett(5) $ a b c d e;
array rev(5) e d c b a;
call sortc(of rev(*));

drop _name_;
run;

proc print;run;
Haikuo
Onyx | Level 15

You can also resort to the almighty data step:

data want;
	array t(5) $10 _temporary_;
	set have;
	array ae(5) a--e;
	call missing (of t(*));

	do i=1 to 5;
		if ae(i) not in t then
			do;
				j=sum(j,1);
				t(j)=ae(i);
			end;
	end;

	do i=1 to 5;
		ae(i)=t(i);
	end;

	drop i j;
run;
lalohg
Quartz | Level 8
Thanks Haikuo,
I have 36 columns not only 5, I ran what you suggested but didn't work
Ksharp
Super User

Or simple IML code :

 

data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;
proc iml;
use have;
read all var {id};
read all var _char_ into x[c=vnames];
y=j(nrow(x),ncol(x),blankstr(nleng(x)));
do i=1 to nrow(x);
 temp=unique(x[i,]); 
 idx=loc(temp^=' ');
 y[i,1:ncol(idx)]=temp[,idx];
end;
print y;
run;

 

 

lalohg
Quartz | Level 8
Thanks Ksharp, my SAS licence does not include the iml procedure
Ksharp
Super User
data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;


%let columns=5;  /* <---- */


data want;
	array t(&columns) $ 32 _temporary_;
	set have;
	array ae(*) $ _character_;
	call missing (of t(*));

	do i=1 to dim(ae);
		if ae(i) not in t then
			do;
				j=sum(j,1);
				t(j)=ae(i);
			end;
	end;

	do i=1 to dim(ae);
		ae(i)=t(i);
	end;

	drop i j;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1582 views
  • 1 like
  • 4 in conversation