BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

May anyone explain the code

avg_weight=symget(cats('avg_weight_',sex));

Why are we using cats here?

I don't see any concatenating here so why using cats?

IS it practical code or just in books?

PROC SQL noprint;
Select avg(case when sex='F' then weight else . end ),
	   avg(case when sex='M' then weight else . end )
into : avg_weight_m trimmed,
	 : avg_weight_f trimmed
from sashelp.class;
quit;

data class_way1;
set sashelp.class;
avg_weight=symget(cats('avg_weight_',sex));
Run;

data class_way2;
set sashelp.class;
IF sex='M' then avg_weight=&avg_weight_m.;
else IF sex='F' then avg_weight=&avg_weight_f.;
Run;
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Of course there's concatenating done, the string 'avg_weight_' is concatenated with the contents of variable sex. CATS will automatically strip all arguments of leading and trailing blanks.

The first code shows how to replace a IF/THEN/ELSE with functions.

 

My code would be

proc sql;
create table class as
  select
    cl.*,
    avg(weight) as average_weight
  from sashelp.class cl
  group by sex
  order by name
;
quit;

Such code does not need to make any assumptions about the possible contents of sex.

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Of course there's concatenating done, the string 'avg_weight_' is concatenated with the contents of variable sex. CATS will automatically strip all arguments of leading and trailing blanks.

The first code shows how to replace a IF/THEN/ELSE with functions.

 

My code would be

proc sql;
create table class as
  select
    cl.*,
    avg(weight) as average_weight
  from sashelp.class cl
  group by sex
  order by name
;
quit;

Such code does not need to make any assumptions about the possible contents of sex.

RichardDeVen
Barite | Level 11

The first bit computes the average weight of each sex and stores the printable representation in a macro variable whose name contains the sex value.  So you have

%put &=avg_weight_m;
%put &=avg_weight_f;
---- LOG ----
AVG_WEIGHT_M=90.11111
AVG_WEIGHT_F=108.95

Storing computation results in the macro symbol table is generally not the best approach.

 

Way 1 is just bad coding.  Bad awful coding.

 

The result of 

cats('avg_weight_', sex)

will be one of 

'avg_weight_M'
'avg_weight_F'

which are the names of macro variables.

avg_weight = symget(cats('avg_weight_',sex));

moves the value of a macro variable to the value of a DATA step variable.  However, avg_weight will be $200 because SYMGET returns a character value.

 

Way 2 is not much better.

IF sex='M' then avg_weight=&avg_weight_m.; else
IF sex='F' then avg_weight=&avg_weight_f.;

After macro resolution becomes

 

IF sex='M' then avg_weight=90.11111; else
IF sex='F' then avg_weight=108.95;

 

In this case the resolved macro variables are interpreted in the context of DATA Step source code, so the assigned variable ave_weight will take on a numeric type.

 

Typically you want the computation to be made and saved directly (per @Kurt_Bremser).  

Ronein
Meteorite | Level 14
Sorry for miss-understanding.
When we compare the two ways:
Way1:
avg_weight = symget(cats('avg_weight_',sex));
Way2:
IF sex='M' then avg_weight=&avg_weight_m.; else
IF sex='F' then avg_weight=&avg_weight_f.;

Why way1 is better than way2????
Kurt_Bremser
Super User

The function call method is more versatile (no literal coding of values needed), but will probably be less performant because of the repeated function calls to retrieve macro variable values, which might come into play when large datasets are processed.

My SQL is the simplest and easiest to understand code, but might be outperformed by a PROC MEANS and following DATA step with a hash object built from the means. This method avoids any conventional sorting and reads the datasets in sequential order.

Tom
Super User Tom
Super User

Your example is poor because you stored numbers into macro variables and because the two methods actually produce different results (a character variable versus a numeric variable) so some respondents are concentrating on that.

 

Let just assume the macro variables already exist and concentrate on the final data step that is using the values of those macro variables. The reason the first method using SYMGET() is better than the second is because you do not need to know the number of possible values of the variable being used to generate the macro variable name (SEX in your example) when writing the code.  To make it even more robust you could use the SYMEXIST() function to make sure the macro variable actually exists.

%let gender_f = Female;
data class_way1;
  set sashelp.class;
  length mvar $32 gender $6 ;
  mvar=cats('gender_',sex);
  if symexist(mvar) then gender=symget(mvar);
run;

proc freq ;
 tables sex*gender / list missing;
run;
                                          Cumulative    Cumulative
Sex    gender    Frequency     Percent     Frequency      Percent
------------------------------------------------------------------
F      Female           9       47.37             9        47.37
M                      10       52.63            19       100.00
PaigeMiller
Diamond | Level 26

Usually, the only reason to create a data set with the mean by group as a variable is so you can subtract the mean from the actual value to get the difference between the actual value and the mean.

 

If that's the real reason for doing this, there's no need to create macro variables at all. PROC STDIZE does this subtraction in one step, and would be the simplest to code.

 

I agree with @RichardDeVen, the two solutions presented are very poor coding.

--
Paige Miller

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1254 views
  • 2 likes
  • 5 in conversation