Here's a rough version, for the final I would add some macros so you don't have to know how many cities their are, macro variables should be able to take care of that and shorten the code: data have; infile cards dsd dlm='*'; informat group $1. cities $100.; input group$ cities$; cards; A*Houston, Houston, New York, Los Angeles, Los Angeles B*Chicago, Boston, Boston C*Miami, Worcester, Worcester, Springfield, Springfield, Atlanta, Atlanta D*Oakland, Oakland, Oakland, Dallas, Dallas E*Lancaster, Boston, Madison, Madison, Madison, Madison ; data start; set have; comma=countc(cities,',')+1; scan1=strip(scan(cities,comma-0,',')); scan2=strip(scan(cities,comma-1,',')); scan3=strip(scan(cities,comma-2,',')); scan4=strip(scan(cities,comma-3,',')); scan5=strip(scan(cities,comma-4,',')); scan6=strip(scan(cities,comma-5,',')); scan7=strip(scan(cities,comma-6,',')); run; proc transpose data=start out=tran(drop=_NAME_);by group;var scan:; proc sort data=tran nodup;by group col1;where not missing(col1); proc transpose data=tran out=tran2(drop=_NAME_);by group;var col1; data want(keep=group unique); set tran2; unique=catx(',',of col1-col4); run;
... View more