Hello!
I am trying to merge two categories within a single variable. The variable is new_cat, and it currently has the levels "Low", "moderate", and "high". I want to group "moderate" and "high" together because there aren't many values in the "high" group. All of the information is in the same dataset, which I've named work.newdt. I am unsure whether I should create a new variable with only two levels or merge the two levels in the existing variable.
Thank you!
Hi @Rose2, you can do this within a DATA step using an IF statement:
data work.cars_short;
  set sashelp.cars;
  /* Collapse three makes into one group; keep every other make unchanged */
  if make in ('Audi','BMW','Mercedes-Benz') then make_short = 'Luxury German';
  else make_short = make;
run;
You could also use a CASE statement via PROC SQL.
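As a rough sketch of the PROC SQL route, the same regrouping of sashelp.cars could look like the following. The output table name work.cars_short_sql and the explicit length of 20 are illustrative choices, not part of the original suggestion.

```sas
/* Same regrouping as the DATA step above, written as a CASE expression */
proc sql;
  create table work.cars_short_sql as   /* hypothetical output table name */
  select *,
         case
           when make in ('Audi', 'BMW', 'Mercedes-Benz') then 'Luxury German'
           else make
         end as make_short length=20    /* long enough for either branch */
  from sashelp.cars;
quit;

/* Check that the three makes now fall under one level */
proc freq data=work.cars_short_sql;
  tables make_short;
run;
```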
Hope that helps
Thanks
Harry
I attempted
if new_dt_categories in ('Moderate Distress', 'High‎/Extreme Distress') then new_dt_cat2= 'Moderate/Extreme';
else new_dt_categories2=new_dt_categories;
run;
but when I did proc freq, it still had separate categories, and now they were numbered instead of being named in the table.
What you described makes it sound like your original variable is numeric (or, if it is character, that it holds digit strings) and is displayed using a user-defined format that converts those values into the text you used in the IF statement.
Try using PUT() in your IF. Or, even easier, use the VVALUE() function, since then you don't need to know which format is attached to the variable. Also make sure to define the new variable long enough to hold the longest possible value.
length new_dt_cat2 $200;
/* Assign both branches to the same new variable */
if vvalue(new_dt_categories) in ('Moderate Distress', 'High/Extreme Distress') then new_dt_cat2 = 'Moderate/Extreme';
else new_dt_cat2 = vvalue(new_dt_categories);
Note you could also just make a new format that collapses those categories into one. That might be enough for what you are doing.
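A minimal sketch of the format approach follows. It assumes new_dt_categories is numeric with codes 1 = low, 2 = moderate, 3 = high/extreme. That coding is a guess for illustration only, and the format name dtgrp and its labels are hypothetical, so check the real stored values first.

```sas
/* ASSUMPTION: new_dt_categories is numeric, coded 1 = low, 2 = moderate,
   3 = high/extreme. Replace these codes with the real stored values. */
proc format;
  value dtgrp                 /* hypothetical format name */
    1    = 'Low'
    2, 3 = 'Moderate/Extreme';
run;

/* PROC FREQ groups by formatted value, so codes 2 and 3 are counted together */
proc freq data=work.newdt;
  tables new_dt_categories;
  format new_dt_categories dtgrp.;
run;
```

Because the format is applied only inside the PROC FREQ step, the stored values in work.newdt are left unchanged.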
Maxim 3: Know Your Data.
Run PROC CONTENTS on your dataset to see the attributes of your variable(s). Your data step code has to deal with the raw, unformatted values to work.
If the values are formatted, a simple change of the format allows statistical procedures to use another grouping.
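One way to carry out that check, using the dataset and variable names from the question, might be:

```sas
/* List each variable's type, length, and attached format */
proc contents data=work.newdt;
run;

/* Tabulate the raw stored values: a FORMAT statement with no format name
   removes the format for this step only */
proc freq data=work.newdt;
  tables new_dt_categories;
  format new_dt_categories;
run;
```

The second step shows the values that an IF statement actually compares against, which may differ from the labels shown in formatted output.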