Rose2
Calcite | Level 5

Hello!

 

I am trying to merge two categories of a single variable. The variable is new_cat, and it currently has the levels "Low", "moderate", and "high". I want to group "moderate" and "high" together because there aren't many values in the "high" group. All of the data is in one dataset, which I've named work.newdt. I am unsure whether I should create a new variable with only two levels or merge the two levels in the existing variable.

 

Thank you!

 

 

 

4 REPLIES
HarrySnart
SAS Employee

Hi @Rose2, you can do this within a DATA step using an IF-THEN/ELSE statement:

 

 

data work.cars_short;
set sashelp.cars;
if make in ('Audi','BMW','Mercedes-Benz') then make_short = 'Luxury German';
else make_short = make;
run;

You could also use a CASE expression in PROC SQL.
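A minimal sketch of the PROC SQL route, using the same sashelp.cars example as the DATA step above. The output table name work.cars_short is reused here only for illustration:

```sas
/* Same recode as the DATA step, written as a CASE expression. */
proc sql;
  create table work.cars_short as
  select *,
         case
           when make in ('Audi', 'BMW', 'Mercedes-Benz') then 'Luxury German'
           else make
         end as make_short
  from sashelp.cars;
quit;
```

As in the DATA step version, the comparison is made against the stored values of make, not against any formatted display text.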


Hope that helps

Thanks

Harry

Rose2
Calcite | Level 5

I attempted 

if new_dt_categories in ('Moderate Distress', 'High‎/Extreme Distress') then new_dt_cat2= 'Moderate/Extreme';
else new_dt_categories2=new_dt_categories;
run;

 

but when I did proc freq, it still had separate categories, and now they were numbered instead of being named in the table. 

Tom
Super User

What you describe suggests that your original variable is numeric (or character, but holding digit strings) and is displayed with a user-defined format that converts those codes into the text you used in the IF statement.

 

Try using PUT() in your IF condition. Or, even easier, use the VVALUE() function, since then you don't need to know which format is attached to the variable. Also make sure to define the new variable long enough to hold the longest possible value.

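* Define the new variable first so its values are not truncated;
length new_dt_cat2 $200;
* Compare the formatted text, and assign the same new variable in both branches;
if vvalue(new_dt_categories) in ('Moderate Distress', 'High/Extreme Distress') then new_dt_cat2 = 'Moderate/Extreme';
else new_dt_cat2 = vvalue(new_dt_categories);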

Note you could also just make a new format that collapses those categories into one.  That might be enough for what you are doing.
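A sketch of the format approach, under loudly stated assumptions: it assumes new_dt_categories is numeric with the codes 1, 2, and 3 for low, moderate, and high. Those codes, the label 'Low', and the format name distgrp are hypothetical; check the real stored values before adapting this.

```sas
/* ASSUMPTION: new_dt_categories stores 1 = low, 2 = moderate, 3 = high. */
/* The format name distgrp is made up for this example.                  */
proc format;
  value distgrp
    1    = 'Low'
    2, 3 = 'Moderate/Extreme';
run;

/* Apply the format only for this step; the stored data is unchanged. */
proc freq data=work.newdt;
  tables new_dt_categories;
  format new_dt_categories distgrp.;
run;
```

Because PROC FREQ groups by formatted value, the two codes are counted as one category without creating a new variable.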

Kurt_Bremser
Super User

Maxim 3: Know Your Data.

Run PROC CONTENTS on your dataset to see the attributes of your variable(s). Your DATA step code has to work with the raw, unformatted values.
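For example, a quick check along these lines (using the dataset and variable names mentioned earlier in the thread) shows the variable's type and attached format, and then the raw values behind the labels:

```sas
/* Show type, length, and any attached format for each variable. */
proc contents data=work.newdt;
run;

/* A FORMAT statement with no format name removes the format  */
/* for this step, so the table shows the raw stored values.   */
proc freq data=work.newdt;
  tables new_dt_categories;
  format new_dt_categories;
run;
```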

If the values are formatted, a simple change of the format allows statistical procedures to use another grouping.
