Hi all,
Can someone explain in which situations we should choose:
1) BY instead of CLASS
2) CLASS instead of BY
in a PROC step?
Please let me know your thoughts.
class: the dataset does not need to be sorted; SAS creates a structure in memory to hold the necessary sums/counters for each class value
by: dataset needs to be sorted (unless you use the "notsorted" option, which may create multiple stats for the same value if it repeats with other values interspersed), and only the values for the current by group are kept in memory.
Bottom line: a few distinct values are better handled with class; many distinct values (memory consumption!) will require sorting and "by" processing. By processing scales "indefinitely", because only one group is held in memory at a time.
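To illustrate the "notsorted" caveat from the reply above, here is a minimal sketch. It assumes the sample table sashelp.class in its shipped order (sorted by name, so the values of sex are interspersed); the choice of height as the analysis variable is only for illustration.

```sas
/* BY with NOTSORTED: every run of consecutive identical values  */
/* is treated as its own BY group, so F and M each show up in    */
/* several separate groups instead of one.                       */
proc means data=sashelp.class n mean;
  by sex notsorted;
  var height;
run;

/* CLASS on the same unsorted data: one set of statistics per    */
/* distinct value of sex, no sort required.                      */
proc means data=sashelp.class n mean;
  class sex;
  var height;
run;
```

Comparing the two listings should show repeated F and M groups in the first step and exactly one row per sex in the second.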
Which proc are you talking about? If it is a statistical proc like proc glm or proc logistic, that would be a whole different concept.
In proc means.
Kurt gave you a good explanation. One more thing: BY is faster than CLASS.
The output, both the dataset and the displayed results, also differs between CLASS and BY when you have multiple variables, and sometimes one is preferable over the other. With BY you get statistics only for each combination of the variables. With CLASS the output dataset also contains rows for the overall total and for each variable on its own (the _TYPE_ variable identifies them). So the dataset sizes differ: in the example here, out_class has 100 rows and out_by has 55 rows.
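/* BY processing needs sorted input */
proc sort data=sashelp.class out=class;
  by sex age;
run;
/* CLASS: rows for the overall total, each variable alone, and each sex*age combination */
proc means data=class;
  class sex age;
  output out=out_class;
run;
/* BY: rows for each sex*age combination only */
proc means data=class;
  by sex age;
  output out=out_by;
run;
proc print data=out_class;
run;
proc print data=out_by;
run;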
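If you want CLASS processing but only the rows for the full sex*age combinations, the NWAY option on the PROC MEANS statement is one way to get there. This is a sketch, not part of the original reply; the dataset name out_nway is made up for the example.

```sas
/* NWAY keeps only the rows with the highest _TYPE_ value,       */
/* i.e. the combinations of all CLASS variables.                 */
/* No prior sort is needed because CLASS is used.                */
proc means data=sashelp.class nway noprint;
  class sex age;
  output out=out_nway;
run;

proc print data=out_nway;
run;
```

The result should have the same number of rows as out_by, without the extra summary levels that appear in out_class.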