Phil_NZ
Barite | Level 11

Hi all SAS Users,

Today I read a discussion about shortening a function call that takes many variables at once.

options mprint;
data quiz_summary;
	set pg2.class_quiz;
	Name=upcase(Name);
	Mean1=mean(Quiz1, Quiz2, Quiz3, Quiz4, Quiz5);
	/* Numbered Range: col1-coln where n is a sequential number */ 
	*Mean2=mean(of Quiz--Quiz);
	/* Name Prefix: all columns that begin with the specified character string */ 
	Mean3=mean(of Q:);
run;

I have some questions as below:

1. Will the statement Mean3=mean(of Q:); be expanded into a space-delimited list or a comma-delimited list? To clarify:

Mean3=mean(of Q:);

/*explain1 : equal to*/
Mean3=mean(Q1 Q2 Q3);
/*explain2: or */
Mean3=mean(Q1, Q2, Q3);

I know explain1 is wrong as ordinary syntax, but I am not sure whether SAS gives functions some special treatment here. I used options mprint and putlog but could not see the expanded form of Mean3=mean(of Q:);. Is there any way to do that?

2. We all know that a dash can be used to reference a range of variables in the PDV, as below:

Phil_NZ_0-1617400642497.png

In this case, they group the range from Quiz1 to AvgQuiz by the double dash.

So I applied this to my code and it went wrong:

data quiz_summary;
	set pg2.class_quiz;
	Name=upcase(Name);
	Mean1=mean(Quiz1, Quiz2, Quiz3, Quiz4, Quiz5);
	Mean2=mean(of Quiz--Quiz);
run;

The log is

49         data quiz_summary;
50         	set pg2.class_quiz;
51         	Name=upcase(Name);
52         	Mean1=mean(Quiz1, Quiz2, Quiz3, Quiz4, Quiz5);
53         	/* Numbered Range: col1-coln where n is a sequential number */
54         	Mean2=mean(of Quiz--Quiz);
                  ____
                  71
ERROR: Variable Quiz cannot be found on the list of previously defined variables.
ERROR 71-185: The MEAN function call does not have enough arguments.

The point is this: if the expanded form of Mean2=mean(of Quiz--Quiz); is Mean2=mean(Quiz1 Quiz2 ... Quiz5 Mean1), I accept that I am wrong, but I would expect SAS to report the error below rather than the one above:

ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, *, **, +, ',', -, /, 
              <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT, IN, LE, LT, MAX, MIN, NE, NG, NL, 
              NOTIN, OR, [, ^=, {, |, ||, ~=.  

3. Apart from that, can you tell me how to fix Mean2=mean(of Quiz--Quiz); so that the double dash can be used to apply a function to variables with analogous names?

 

Warmest regards,

Phil.

 

 

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Accepted Solutions
ballardw
Super User

Variable list rules

A : after part of a variable name means "use all variables that start with the common characters".

 

Q: means try to use every variable whose name starts with Q. This causes invalid-data messages with functions that expect only numeric variables, such as MEAN or SUM, if any of the variables are character.
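
To see which variables a prefix list actually resolves to, one option is to load the list into an array and print each member's name with VNAME. This is only a sketch: Q1, Q2, Q3 and their values are made up for illustration, and the array name members and the variable vn are hypothetical.

```sas
data _null_;
  /* Hypothetical sample data: three numeric variables sharing the prefix Q */
  Q1 = 10; Q2 = 20; Q3 = 30;

  /* The prefix list is resolved when the step is compiled */
  array members {*} Q:;
  do i = 1 to dim(members);
    vn = vname(members[i]);   /* name of the i-th variable in the list */
    put i= vn=;
  end;

  /* The OF list behaves like the comma-delimited call */
  Mean3 = mean(of Q:);
  Check = mean(Q1, Q2, Q3);
  put Mean3= Check=;
run;
```

MPRINT does not help here because it only echoes code generated by macros, whereas a variable list is resolved by the DATA step compiler.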

 

The -- means variables that are consecutive in the variable order list (adjacent columns). The list starts with the first variable named and goes to the last. Again, if any of the variables in the list are of different types, expect issues. And the NAME has to be complete.

Your example with Quiz -- Quiz fails because you have no variable named Quiz at all. SAS quite often stops at the first error of certain types. I doubt that any changes will be made on that account.

You would have to use Quiz1 -- Quiz5 , or if the variables are not actually in column order you can use the single dash Quiz1 - Quiz5 which requires sequentially numbered variable names but does not care about position in the PDV. There cannot be gaps in the sequence though.
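
A minimal sketch of both forms follows. The class_quiz table here is a hypothetical stand-in for pg2.class_quiz with made-up names and scores, and it assumes Quiz1 through Quiz5 are stored next to each other in that order.

```sas
/* Hypothetical stand-in for pg2.class_quiz; names and values are made up */
data class_quiz;
  input Name $ Quiz1 Quiz2 Quiz3 Quiz4 Quiz5;
  datalines;
Alfred 8 9 7 10 6
Alice 10 9 9 8 10
;
run;

data quiz_summary;
  set class_quiz;
  Mean1 = mean(Quiz1, Quiz2, Quiz3, Quiz4, Quiz5); /* explicit list */
  Mean2 = mean(of Quiz1--Quiz5); /* positional: Quiz1 through Quiz5 in PDV order */
  Mean3 = mean(of Quiz1-Quiz5);  /* numbered range: Quiz1, Quiz2, ..., Quiz5 by name */
  put Name= Mean1= Mean2= Mean3=;
run;
```

With this layout the three results agree. They would differ if some other numeric variable were stored between Quiz1 and Quiz5, because the double dash would pick it up as well.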

 

The inline statistical functions require a comma delimiter separating the arguments, each of which can be a valid variable, a literal value, a valid list structure, or a combination of them. This is syntactically correct as long as the variables exist, are in the correct order for b -- var2 to make sense, and all variables whose names start with a are numeric:

 

x = sum( a,b,c, of Var1-Var3, of b -- var2, of a:);

Keep in mind, of course, that some variables will be counted more than once.
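
To make the duplication concrete, here is a minimal sketch with made-up variables a, b, and c (not from the original data). The positional list a--c names the same three variables again, so each value enters the sum twice.

```sas
data _null_;
  a = 1; b = 2; c = 3;              /* hypothetical values; PDV order is a, b, c */
  once  = sum(a, b, c);             /* each variable counted once */
  twice = sum(a, b, c, of a--c);    /* a--c repeats a, b, c, so each is counted twice */
  put once= twice=;
run;
```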

 

I am afraid that you will have to describe what you mean by analogous names in "with the use of double dash to calculate the function of variables with analogous names." The double dash has only one meaning: sequential columns, that is, PDV order, and nothing else. If the variables are not adjacent you cannot use a single -- to get all of them. You can use several ranges if the sub-groups are each adjacent.

 

Not so much in functions, but in array definitions, I have used things like:

 

array v (*) q1-q18 q21a--q21z q25 q27 q30 q34-q80;

That defines a list of variables with similar characteristics that I needed to work with. Note the mix of list types: sequentially named, PDV order, and individually named.

Patrick
Opal | Level 21

You can reference a list of variables in the mean function using any of the notations in the example below, as documented here.

Be careful when using the double dash notation. Double dash returns the variables according to their place in the pdv (see example below).

I assume your syntax (of quiz--quiz) doesn't work because this is not a list but the same variable repeated.

data have;
  array quiz {*} 8 q1 q2 var q3 q4 q5 (1,2,100,3,4,5);
  output;
  stop;
run;

data test;
  set have;
  array a1 {*} q:;
  x=mean(of q:);
  y=mean(of a1[*]);
  output;
  array a2 {*} q1-q5;
  x=mean(of q1-q5);
  y=mean(of a2[*]);
  output;
  array a3 {*} q1--q5;
  x=mean(of q1--q5);
  y=mean(of a3[*]);
  output;
run;

proc print data=test;
run;

Patrick_0-1617403841848.png

 

 

 

Phil_NZ
Barite | Level 11

Hi @Patrick 

 

Thank you for your explanation, I am aware of what you mentioned. I rarely use a double dash in calculations, mainly when importing data from CSV, as @ballardw suggested previously. Now, when using a double dash, I need to open the debugger to see what is inside the PDV before confidently running the calculation. But yes, this is how I learn SAS: by putting myself into many situations to find the answer. Even if it is not the optimal way to do things, I learn to approach a problem from many angles.

 

Warm regards.

Patrick
Opal | Level 21

@Phil_NZ 

I've updated my previous post. I assume your syntax didn't work because you're repeating the variable so it's not really a list.

You can always issue a proc contents with order=varnum to quickly check the order of vars in a table.
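
A minimal sketch of that check, using the sample table sashelp.class as a stand-in for your own table:

```sas
/* List variables in creation (PDV) order instead of alphabetically */
proc contents data=sashelp.class order=varnum;
run;
```

The variable numbers in the output show which columns a double-dash range such as first--last would cover.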

Tom
Super User

You left out the positional lists that select only one type of variable.

first-numeric-last
first-character-last

Example:

1174  data test;
1175   set sashelp.class ;
1176   array _c name-character-weight ;
1177   array _n name-numeric-weight ;
1178   put (_c(*)) (=);
1179   put (_n(*)) (=);
1180   stop;
1181  run;

Name=Alfred Sex=M
Age=14 Height=69 Weight=112.5
SASKiwi
PROC Star

Personally I avoid use of double dashes and variable shortening (Q:) because the results can be unpredictable AND if you are collaborating with other SAS users it is much better to be explicit in your code so they can see exactly what variables are involved. I'm OK with variable lists though.

Phil_NZ
Barite | Level 11

Hi @Patrick @SASKiwi  @ballardw 

Too many great things for me to enjoy in one discussion. Thank you for making my weekend.

Many thanks and warmest regards.

