Hi,
data have;
length name1 name2 name3 name4 $20;
name1='B'; name2= ''; name3= 'A'; name4='A';
output;
name1=' '; name2= 'B'; name3= ''; name4='';
output;
name1='A'; name2= 'A'; name3= 'A'; name4='';
output;
run;
Want output as
row DistinctName DistinctN
1 A, B 2
2 B 1
3 A 1
Thanks in advance!
Not pretty or elegant but should be robust and scalable if you actually have more variables:
data have;
length name1 name2 name3 name4 $20;
input name1 name2 name3 name4;
row+1;
datalines;
B . A A
. B . .
A A A .
;
proc transpose data=have out=trans (where=( not missing(col1)) drop=_name_);
by row;
var name1-name4;
run;
proc sql;
create table temp as
select distinct row,col1
from trans;
quit;
data want;
set temp;
by row;
/* need to set length of a target to string
to at least longest possible so 20*number of variables+ number of variables
minus 1 for commas.
*/
length DistinctName $85;
retain DistinctName;
if first.row then DistinctName=col1;
else DistinctName= catx(',',DistinctName,col1);
distinctn= countw(DistinctName,',');
if last.row;
drop col1;
run;
Time to learn how to write a data step. That list of variables and output statements gets cumbersome quickly.
Also use the code box on the forum opened with the {I} or "running man" icons for posting code.
And just for curiosity how will that comma delimited list of "names" actually be used?
is this what you're after?
proc sql;
create table cat_count as
select cats(name1,name2,name3,name4) as ConcatenatedString
,count(*) as Volume
from have
group by ConcatenatedString
order by ConcatenatedString;
quit;
Not pretty or elegant but should be robust and scalable if you actually have more variables:
data have;
length name1 name2 name3 name4 $20;
input name1 name2 name3 name4;
row+1;
datalines;
B . A A
. B . .
A A A .
;
proc transpose data=have out=trans (where=( not missing(col1)) drop=_name_);
by row;
var name1-name4;
run;
proc sql;
create table temp as
select distinct row,col1
from trans;
quit;
data want;
set temp;
by row;
/* need to set length of a target to string
to at least longest possible so 20*number of variables+ number of variables
minus 1 for commas.
*/
length DistinctName $85;
retain DistinctName;
if first.row then DistinctName=col1;
else DistinctName= catx(',',DistinctName,col1);
distinctn= countw(DistinctName,',');
if last.row;
drop col1;
run;
Time to learn how to write a data step. That list of variables and output statements gets cumbersome quickly.
Also use the code box on the forum opened with the {I} or "running man" icons for posting code.
And just for curiosity how will that comma delimited list of "names" actually be used?
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.