Help using Base SAS procedures

drop the duplicate columns with sas

Reply
Frequent Contributor
Posts: 75

drop the duplicate columns with sas

I'm dropping duplcated columns from my tables with this code.


Step by step explanation:


I'm creating the table toto through a select on the table dicoat.


Once created, I'm having a datastep


However the while loop is not recognized. for the datastep. The rule should be if there are duplicate columns, then drop the duplicate columns

and do that until the counting is superior or equal to one.


The issue is that the while loop is not recognized. Edited: Is it allowed to put a proc sql within a loop like that?


Why?


Am I doing something wrong?


Thanks



proc sql;
create table toto
as select min(nomvar) as nomvar,count(title) as counting
from dicoat
group by title
having count(title) > 1;


data work
.toto;

set toto;
  do
while(counting>=1);
  
proc sql;
delete from dicoat where nomvar in (select nomvar from toto);
  
insert into toto
select min(nomvar) as nomvar,count(title) as counting from dicoat
group by title
having count(title) > 1;
end;
run
;

data _null_
;
file tempf;
set toto end=lastobs;
if _n_=1 then put "data aat;set aat (drop=";
put var
/;
if lastobs then put ");run;";
run
;

%inc tempf;

filename tempf clear
;

Super User
Posts: 5,430

Re: drop the duplicate columns with sas

Not really seeing what you try to achieve, some sample data can be of help - along with a more extensive description of your problem.

And no, you can't mix SQL and data step in that way. You could call SQL from within a data step using call execute, but I can't tell if that's the right way to go for you. My gut feeling is that there is a simpler way to do this.

Data never sleeps
Frequent Contributor
Posts: 75

Re: drop the duplicate columns with sas

Ok LinusH.

I have a table called dicoat.

It will be something like this

data dicoat;

input title$ nomvar$;

cards;

aty var12

zat var13

zat var52

aty var55

zat var98

att var100

;

As you can see, I'm trying to remove the duplicate from dicoat with my code recursively until dicoat has only one nomvar and one title. In the same time, I need to put all the duplcate into a macro variable as I need to use that macrovariable

to drop the columns in the table aat as I've put in my code above.

aat before transformation

var12 var13 var52 var55  var98 var100


aat after columns drop


var12 var13 var100

I hope it is clearer. If not, let me know.

Super User
Posts: 5,430

Re: drop the duplicate columns with sas

Why don't you just build a macro variable with all variables that should be dropped? And then just rewrite you table using a accumulated drop statement.

proc sort data=dicoat;

    by title;

run;

data _null_;

     set dicoat end=last;

     length dropvar $1000;

     retain dropvar;

     by title;

     if not first.title then dropvar = catx(' ',dropvar,nomvar);

     if last then call symput('dropvar',trim(dropvar));

run;

options symbolgen;

data aat;

     set aat(drop=&DROPVAR.);

run;

Data never sleeps
Frequent Contributor
Posts: 75

Re: drop the duplicate columns with sas

Thanks LinusH.

Let me have a look.

Contributor
Posts: 64

Re: drop the duplicate columns with sas

Andy,

In the example you showed -  I see zat  appears twice.

You'd like to only see zat once, correct? In that case, how do you decide which row to delete? var13 or var52?

Frequent Contributor
Posts: 75

Re: drop the duplicate columns with sas

Hi Jberry,

It does not matter , which variables to delete. However, as I have many , I need one to recursively delete them until I have only one.

Does it make sense?

Contributor
Posts: 64

Re: drop the duplicate columns with sas

If you don't care which row appears, try

proc sql;

  create table no_dupes as

  select *

  from (select *,monotonic() as rn from dicoat) AS rs

  group by title

  having rn=min(rn);

quit;

Ask a Question
Discussion stats
  • 7 replies
  • 366 views
  • 0 likes
  • 3 in conversation