Hi SAS Community,
I have a dataset where there are multiple observations per category, and I need to reduce this dataset to a single observation per category.
Here is an example of my data...can someone help me code this up?
data have;
type | brand |
car | bmw |
car | lexus |
car | toyota |
car | honda |
car | maserati |
car | ford |
truck | ford |
truck | chevy |
truck | toyota |
truck | dodge |
motorcycle | harleydavidson |
motorcycle | honda |
data want;
type | brand |
car | bmw; lexus; toyota;honda; maserati; ford |
truck | ford; chevy; toyota; dodge |
motorcycle | harleydavidson; honda |
THanks
Here are two methods:
data have;
input type:$20. brand:$20.;
datalines;
car bmw
car lexus
car toyota
car honda
car maserati
car ford
truck ford
truck chevy
truck toyota
truck dodge
motorcycle harleydavidson
motorcycle honda
;
run;
proc sort data=have;
by type;
run;
/* Method 1 */
data want(drop=brand_);
format brand $100.;
do until(last.type);
set have(rename=(brand=brand_));
by type;
brand=strip(brand)||";"||strip(brand_);
end;
run;
/* Method 2 */
proc transpose data=have out=trn(drop=_name_);
by type;
var brand;
run;
data want(drop=COL:);
set trn;
brand=CATX(";",of col:);
run;
data have;
infile cards expandtabs truncover;
input type $ brand $;
cards;
car bmw
car lexus
car toyota
car honda
car maserati
car ford
truck ford
truck chevy
truck toyota
truck dodge
;
run;
data want;
length want $ 200;
do until(last.type);
set have;
by type;
want=catx(';',want,brand);
end;
drop brand;
run;
@belboy,
Can you try something like this ?
ods noproctitle;
proc freq data=SASHELP.CARS;
tables (Type) *(Make) / chisq nopercent norow nocol nocum
plots(only)=(freqplot mosaicplot);
run;
Here are two methods:
data have;
input type:$20. brand:$20.;
datalines;
car bmw
car lexus
car toyota
car honda
car maserati
car ford
truck ford
truck chevy
truck toyota
truck dodge
motorcycle harleydavidson
motorcycle honda
;
run;
proc sort data=have;
by type;
run;
/* Method 1 */
data want(drop=brand_);
format brand $100.;
do until(last.type);
set have(rename=(brand=brand_));
by type;
brand=strip(brand)||";"||strip(brand_);
end;
run;
/* Method 2 */
proc transpose data=have out=trn(drop=_name_);
by type;
var brand;
run;
data want(drop=COL:);
set trn;
brand=CATX(";",of col:);
run;
worked like a charm! thanx
one more way
data want;
length all_brand $100.;
retain all_brand;
set have;
by type;
if first.type then all_brand = brand;
else all_brand= catx(';', brand, all_brand);
if last.type then output;
drop brand;
run;
@belboy wrote:
Hi SAS Community,
I have a dataset where there are multiple observations per category, and I need to reduce this dataset to a single observation per category.
What will you be doing with the resulting data? It may be harder to work with that form than you expect.
For example, in your expected want output that ford and Toyota appear in different order and positions. Anything where you need the individual names later is going to be a pain to work with. Also you example is not quite consistent about separating the names with a semicolon or semicolon and space.
data want;
type brand car bmw; lexus; toyota;honda; maserati; ford truck ford; chevy; toyota; dodge motorcycle harleydavidson; honda
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.