Hi,
I am looking for a Macro to catagorize a numeric variable into a catagorical variable by forming groups.
For example, I have a column called price and I want to divied it in 10 catagories.
Do someone knows such a Macro (Code) ?
thanks
Ehsan
Assuming you want to categorize all columns of your dataset, otherwise you have to
refine the SQL query.
I use Ksharp's solution, proc rank, which I didnt know of. It is only slightly adapted to
automatically generate the variables and ranks names.
Here the source dataset's name is "prices"
proc sql;
SELECT NAME, cats("RANK_",NAME)
INTO :cols SEPARATED BY ' ',
:ranks SEPARATED BY ' '
FROM dictionary.columns
WHERE LIBNAME="WORK" AND MEMNAME="PRICES" /* AND other restrictions */; /* /!\ use upper case here */
quit;
proc rank data=prices out=want groups=10;
var &cols.;
ranks &ranks.;
run;
Why do you need a macro. This is data manipulation, should be done in a datastep which is what that is for. Macro is not for this task, it doesn't have the data types or conding structures. If you post example test data (as a datastep) and what the output should look like, I can provide code, but:
if <> then
or
select() when()
Could be used, or in fact formats with ranges:
proc format;
value fmt
1-10="First"
...
;
hello,
You can use proc format
data prices;
input item $ price;
cards;
a 2
b 50
c 123
;
run;
proc format;
value price low-<10="cheap"
10-<50="ok"
50-high="expensive";
run;
proc print data=prices;
format price price.;
var item price;
run;
It depend on the rule how you category this price .
Check PROC RANK + groups=10 option .
proc rank data=have out=want groups=10;
var price;
ranks rank_price;
run;
Actually, I am looking for an Automated (Macro) solution of this problem.
I habe more than 400 numeric columns with numeric values having different ranges (min max mean etc) .
So I need a Macro that takes as Input a Dataset and variable name and automatically make N groups(catagories) out of this column without puting manual effort as you guys have already suggested above.
So do someone knows an automated solution of this problem ?
Ehsan
Assuming you want to categorize all columns of your dataset, otherwise you have to
refine the SQL query.
I use Ksharp's solution, proc rank, which I didnt know of. It is only slightly adapted to
automatically generate the variables and ranks names.
Here the source dataset's name is "prices"
proc sql;
SELECT NAME, cats("RANK_",NAME)
INTO :cols SEPARATED BY ' ',
:ranks SEPARATED BY ' '
FROM dictionary.columns
WHERE LIBNAME="WORK" AND MEMNAME="PRICES" /* AND other restrictions */; /* /!\ use upper case here */
quit;
proc rank data=prices out=want groups=10;
var &cols.;
ranks &ranks.;
run;
Actually I was looking for ready-made solution. Its sounds like that there is now such known solution available. So I have to put the Ideas given by you guys in a code.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.