BookmarkSubscribeRSS Feed
Mariloud
Obsidian | Level 7
I am struck on this dataset needing help with combining observations of category variable so that I can get the count of distinct values by group. Thanks for your help!
6 REPLIES 6
ballardw
Super User

An example of data and what you mean by "combine".

Possibly SQL is not the best approach depending on the data and what you mean by "combined". Proc freq and a custom format may be easier depending on what/how combining things.

Mariloud
Obsidian | Level 7

For example the dataset contains the following values:

 

Category   Category_count

Animals         85

culture           37

education       33

Animals      1653

Culture        1501

Education    1278

so on...............

 

I want to be able to count distinct value by category group like this👇

 

Category   Category_count

Animals           1738

Culture            1538

Education        1311

 

 

 

 

 

 

 

ballardw
Super User

Do your values actually have differences of spelling such as "Education" and "education", "Culture" and "culture" or is that just bad typing?

 

If your need is to combine the spelling then Propcase(category) instead of just category may be the key.

 

Since you said "distinct" did you expect an SQL solution?

 

 

Mariloud
Obsidian | Level 7

They actually do have differences in spelling such a "Animals" and "animals"

andreas_lds
Jade | Level 19

@Mariloud wrote:

They actually do have differences in spelling such a "Animals" and "animals"


You need to fix that problem first

data cleaned;
   set have;
   Category = propcase(Category);
run;

proc summary data=cleaned nway;
   class Category;
   var Category_Count;
   output out=want(drop= _type_ _freq_) sum=;
run;

Instead of propcase using upcase or lowcase works, too.

s_lassen
Meteorite | Level 14

Please post your example data as a data step, e.g. like this:

data have;
  input Category : $12. Category_count;
cards;
Animals         85
culture           37
education       33
Animals      1653
Culture        1501
Education    1278
;run;

You can get the sums using PROC SUMMARY:

proc summary data=have nway;
  class Category;
  format Category $upcase12.;
  var Category_count;
  output out=want sum=;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1303 views
  • 0 likes
  • 4 in conversation