Hi Team,
I have columns in data1 id,col1,col2,col3.
Now I need like
select distinct id, Strip(col1)||strip(col2)||strip(col3) from data1
But now what I need is in future I have columns like col1,col2,col3....col10
at this position i need to automate in select statement at the point of strip function
select distinct id,strip(col1)||strip(col2)||....||strip(col10)
Extracte the names from dictionary.columns, wrap them into strip(), and store them in a macro variable for later use:
proc sql;
select "strip(" || trim(name) || ")" into :names separated by '||'
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE' and index(upcase(name),'COL') = 1;
create table want as
select distinct id, &names.
from have;
quit;
Edit: changed the literal comma separator to the double pipe characters. Note that both codes should result in the same number of rows.
Please try, the new variable generated will have the expected output.
proc transpose data=sashelp.vcolumn out=trans (drop=_name_ _label_);
where libname='WORK';
by memname;
id varnum;
var name;
run;
data want;
length variables $2000.;
set trans;
variables=catx(',',of _:);
run;
Isn't this a bit silly? Why would you want to treat these two combinations as being the same?
col1='ab' col2='cd' col3='e'
col1='a' col2='bc' col3='de'
Why not just get rid of the STRIP function entirely, and try to get the equivalent of:
distinct (id col1 col2 col3)
Then you are barking up the wrong tree using SQL. A DATA step can handle variable lists, while SQL cannot:
data joined / view=joined;
set have;
keyvar = cats(id, of col: );
run;
proc freq data=joined;
tables keyvar / noprint out=want;
run;
But my comment is still true. Removing blanks means you could be counting two combinations as being the same when they are different.
@Mastanvali wrote:
did you get my concern?
My concern is automatically concatenate similar variables.
1) Input data set example (as data step code)
2) Output data set example (as data step code)
3) tell us how to determine variables are "similar".
Showing (incomplete and data less) code that does not work does not describe a problem very well.
And almost any problem that involves an unknown number of variables tends to indicate poorly structured data to start with.
One way to think of it is to consider the code you want to generate as text. If you can generate the text then you could put that text into a macro variable and use it at the place you need it in your code.
So for the simple case you have presented where you have variables that a literally named COL1, COL2, ... COLn you could do something like this:
data _null_;
length code $2000;
do i=1 to 6 ;
code=catx('||',code,cats('strip(col',i,')'));
end;
call symputx('code',code);
run;
Then in your SQL code just reference the macro variable CODE.
... id,&code ...
But I suspect that your real problem is more complex. The names of the variables are probably not literally COL1, COL2, ...
Also the code you are generating is most likely not doing what you actually want. If you can explain your actual problem (using actual examples) then someone could provide a solution. For example it might be that your actual problem is that you need to get the distinct values of ALL of the data in the input dataset. In that case you don't need to know the names of ANY of the variables.
* Use PROC SORT ;
proc sort data=have out=want nodupkey;
by _all_;
run;
* Use PROC SQL ;
proc sql;
create table want as select distinct * from have ;
quit;
If you did only want the distinct values of some of the variables then perhaps they use a common prefix on their names?
proc sort data=have(keep=id col:) out=want nodupkey;
by _all_;
run;
@Mastanvali wrote:
...
How to find out total number of variables in a dataset?
You have two answers in this thread already that have provided answers using the metadata views.
You could also just use PROC CONTENTS.
proc contents data=have noprint out=contents; run;
proc sql noprint;
select name into :varlist separated by ' '
from contents
where upcase(name) like 'COL%'
;
%let nvars=&sqlobs;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.