BookmarkSubscribeRSS Feed
lalohg
Quartz | Level 8

Hi there,

I wonder if someone can help me.

I have a data set with several columns and would like to get rid of those columns with duplicated entries like the example below

 

I have a data set:          
id a b c d e
1 blue blue blue blue red
2 white   white yellow red
3 grey grey black black red
4 orange orange orange    
5 brown orange brown white  brown
           
We would llike to remove repeated columns by patient so the data set will look like:          
id a b c d e
1 blue red      
2 white yellow red    
3 grey black red    
4 orange        
5 brown orange white    

 

all your help will be appreciated

 

Thanks

Eduardo.

8 REPLIES 8
Reeza
Super User
Transpose, remove duplicates via proc sort and re-transpose.

lalohg
Quartz | Level 8
Problem is that I have lot more variables,
Reeza
Super User

Transpose, Remove Duplicates, Transpose again, use a data step to sort in appropriate order. 

 

data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;

proc transpose data=have out=temp1(rename=col1=var_a_e);
by id;
var a b c d e;
run;

proc sort data=temp1 nodupkey out=temp2 (where=(var_a_e ne ""));
by id var_a_e;
run;


proc transpose data=temp2 out=temp3;
by id;
id _name_;
var var_a_e;
run;

data want;
retain id a b c d e;
set temp3;
array lett(5) $ a b c d e;
array rev(5) e d c b a;
call sortc(of rev(*));

drop _name_;
run;

proc print;run;
Haikuo
Onyx | Level 15

You can also resort to the almighty data step:

data want;
	array t(5) $10 _temporary_;
	set have;
	array ae(5) a--e;
	call missing (of t(*));

	do i=1 to 5;
		if ae(i) not in t then
			do;
				j=sum(j,1);
				t(j)=ae(i);
			end;
	end;

	do i=1 to 5;
		ae(i)=t(i);
	end;

	drop i j;
run;
lalohg
Quartz | Level 8
Thanks Haikuo,
I have 36 columns not only 5, I ran what you suggested but didn't work
Ksharp
Super User

Or simple IML code :

 

data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;
proc iml;
use have;
read all var {id};
read all var _char_ into x[c=vnames];
y=j(nrow(x),ncol(x),blankstr(nleng(x)));
do i=1 to nrow(x);
 temp=unique(x[i,]); 
 idx=loc(temp^=' ');
 y[i,1:ncol(idx)]=temp[,idx];
end;
print y;
run;

 

 

lalohg
Quartz | Level 8
Thanks Ksharp, my SAS licence does not include the iml procedure
Ksharp
Super User
data have;
infile cards expandtabs truncover;
input id	(a	b	c	d	e) ($);
cards;
1	blue	blue	blue	blue	red
2	white	. 	white	yellow	red
3	grey	grey	black	black	red
4	orange	orange	orange	 	. .
5	brown	orange	brown	white 	brown
;
run;


%let columns=5;  /* <---- */


data want;
	array t(&columns) $ 32 _temporary_;
	set have;
	array ae(*) $ _character_;
	call missing (of t(*));

	do i=1 to dim(ae);
		if ae(i) not in t then
			do;
				j=sum(j,1);
				t(j)=ae(i);
			end;
	end;

	do i=1 to dim(ae);
		ae(i)=t(i);
	end;

	drop i j;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1057 views
  • 1 like
  • 4 in conversation