BookmarkSubscribeRSS Feed
Mastanvali
Calcite | Level 5

Hi Team, 

 

I have columns in data1 id,col1,col2,col3.

Now I need like

select distinct id, Strip(col1)||strip(col2)||strip(col3) from data1

But now what I need is in future I have columns like col1,col2,col3....col10

at this position i need to automate in select statement at the point of strip function 

select distinct id,strip(col1)||strip(col2)||....||strip(col10)

11 REPLIES 11
Kurt_Bremser
Super User

Extracte the names from dictionary.columns, wrap them into strip(), and store them in a macro variable for later use:

proc sql;
select "strip(" || trim(name) || ")" into :names separated by '||'
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE' and index(upcase(name),'COL') = 1;
create table want as
select distinct id, &names.
from have;
quit;

 

Edit: changed the literal comma separator to the double pipe characters. Note that both codes should result in the same number of rows.

Mastanvali
Calcite | Level 5
any body help me out of the issue.
It's very priority query
Jagadishkatam
Amethyst | Level 16

Please try, the new variable generated will have the expected output.

 

proc transpose data=sashelp.vcolumn out=trans (drop=_name_ _label_);
where libname='WORK';
by memname;
id varnum;
var name;
run;

data want;
length variables $2000.;
set trans;
variables=catx(',',of _:);
run;
Thanks,
Jag
Astounding
PROC Star

Isn't this a bit silly?  Why would you want to treat these two combinations as being the same?

 

col1='ab'   col2='cd'  col3='e'

 

col1='a'  col2='bc'  col3='de'

 

Why not just get rid of the STRIP function entirely, and try to get the equivalent of:

 

distinct (id col1 col2 col3)

Mastanvali
Calcite | Level 5
did you get my concern?

My concern is automatically concatenate similar variables.
Astounding
PROC Star

Then you are barking up the wrong tree using SQL.  A DATA step can handle variable lists, while SQL cannot:

 

data joined / view=joined;

set have;

keyvar = cats(id, of col: );

run;

 

proc freq data=joined;

tables keyvar / noprint out=want;

run;

 

But my comment is still true.  Removing blanks means you could be counting two combinations as being the same when they are different.

ballardw
Super User

@Mastanvali wrote:
did you get my concern?

My concern is automatically concatenate similar variables.

1) Input data set example (as data step code)

2) Output data set example (as data step code)

 

3) tell us how to determine variables are "similar".

 

Showing (incomplete and data less) code that does not work does not describe a problem very well.

 

And almost any problem that involves an unknown number of variables tends to indicate poorly structured data to start with.

Mastanvali
Calcite | Level 5
I have variables id col1col2 col3 in datasetname(data1)
now im writing sql query
Create table t as
Select distinct id,strip(col1)||strip(col2)||strip(col3)

Now at present in my data1 dataset I have variables I'd col1 col2
col3....col8
Now i need all query like above query only but while new data coming at
that time col1 col2 like similar variables are getting more or less so we
need to insert that column names manually right.
For that I need to write automatically it will take how many similar
variables we are getting find out that and automatically concatenate

this my concern
Tom
Super User Tom
Super User

One way to think of it is to consider the code you want to generate as text. If you can generate the text then you could put that text into a macro variable and use it at the place you need it in your code.

So for the simple case you have presented where you have variables that a literally named COL1, COL2, ... COLn you could do something like this:

data _null_;
  length code $2000;
  do i=1 to 6 ;
    code=catx('||',code,cats('strip(col',i,')'));
  end;
  call symputx('code',code);
run;

Then in your SQL code just reference the macro variable CODE.

... id,&code ...

But I suspect that your real problem is more complex. The names of the variables are probably not literally COL1, COL2, ...

Also the code you are generating is most likely not doing what you actually want.  If you can explain your actual problem (using actual examples) then someone could provide a solution.  For example it might be that your actual problem is that you need to get the distinct values of ALL of the data in the input dataset. In that case you don't need to know the names of ANY of the variables.

* Use PROC SORT ;
proc sort data=have out=want nodupkey;
  by _all_;
run;
* Use PROC SQL ;
proc sql;
create table want as select distinct * from have ;
quit;

If you did only want the distinct values of some of the variables then perhaps they use a common prefix on their names?

proc sort data=have(keep=id col:) out=want nodupkey;
by _all_;
run;

 

Mastanvali
Calcite | Level 5
In the above code you gave like

do i=1 to 6 ;
code=catx('||',code,cats('strip(col',i,')'));
end;

you enter manually but i need like

do i=1 to &cnt ;
code=catx('||',code,cats('strip(col',i,')'));
end;

How to find out total number of variables in a dataset?
Tom
Super User Tom
Super User

@Mastanvali wrote:
...


How to find out total number of variables in a dataset?

You have two answers in this thread already that have provided answers using the metadata views.

You could also just use PROC CONTENTS.

proc contents data=have noprint out=contents; run;
proc sql noprint;
select name into :varlist separated by ' '
from contents
where upcase(name) like 'COL%'
;
%let nvars=&sqlobs;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2266 views
  • 4 likes
  • 6 in conversation