Solved: Re: What does "_" in front of variable mean?

jackblack1222 · Posted 09-12-2024 04:12 PM

I was reading the solution here:
https://communities.sas.com/t5/SAS-Programming/How-to-find-the-rows-for-which-consecutive-columns-ha...

In his answer, there is "_count". What is the functionality of adding "_" in front of count here?

data have;
infile cards expandtabs truncover;
input Customer	$ M1	M2	M3	M4	M5	M6	M7	M8	M9	M10	M11	M12;
cards;
A	.	94	63	106	424	252	499	356	435	469	200	423
B	.	.	.	.	.	.	.	13	137	440	75	99
C	67	118	364	.	.	.	.	.	.	156	40	415
D	430	423	.	.	.	.	54	165	26	477	129	411
;

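data want;
  set have;
  array x{*} M: ;
  count=0;
  do i=1 to dim(x);
    /* extend the current run of consecutive missing months */
    if missing(x{i}) then count+1;
    else do;
      count=0;
      /* _count still holds the run length as of the previous month */
      if _count>5 then month=vname(x{i});
    end;
    /* remember this month's run length for the next iteration */
    _count=count;
  end;
  drop count _count i;
run;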

Quentin · Posted 09-12-2024 04:27 PM

There is no meaning. "_" is an acceptable character to start a variable name, just like a letter.

Some people have a style preference for using an underscore as the first character of a temporary variable which they plan to drop. But it's just a style thing, like deciding whether to use lowercase, UPPERCASE, or camelCase for variable names. In the SAS language, there is no meaning attached to a variable name that starts with an underscore.

In the code you have shown, there are two different variables, named Count and _Count.

The same two variables could have been named Count1 and Count2, or Count0 and Count1, or CountA and CountB, or Fred and Ginger, or ...
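A minimal sketch of that point, assuming a throwaway dataset name (demo) that is not part of the original thread. Both variables are created, stored, and listed in exactly the same way:

```sas
/* demo is a hypothetical dataset name used only for illustration */
data demo;
  count  = 3;
  _count = count * 2;   /* the leading underscore triggers no special behavior */
run;

/* Both count and _count appear in the variable list as ordinary numeric variables */
proc contents data=demo;
run;
```

Because there is no DROP statement here, _count is written to the output dataset just like count.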

The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.

SASJedi · Posted 09-13-2024 08:07 AM

I like to use an underscore as the first character of variable names I intend to drop. You can then use a statement like this to get rid of all of them with very little code:

DROP _:;

This is an example of using the (very handy) variable name prefix list.
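As a sketch of how that could look, here is the DATA step from the question with the working variables renamed so that they all start with an underscore. The names _i and _prev are my own choices for illustration, and the dataset have is assumed to be the one from the question:

```sas
data want;
  set have;
  array x{*} M: ;
  _count=0;
  do _i=1 to dim(x);
    if missing(x{_i}) then _count+1;
    else do;
      _count=0;
      /* _prev holds the run length as of the previous month */
      if _prev>5 then month=vname(x{_i});
    end;
    _prev=_count;
  end;
  drop _: ;   /* drops _count, _i and _prev in one go */
run;
```

One caveat with this style: the prefix list also matches any variable from the input dataset whose name starts with an underscore, so those would be dropped as well.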

Check out my Jedi SAS Tricks for SAS Users

jackblack1222 · Posted 09-16-2024 01:42 PM

Thank you. I had wrongly assumed that "_" in front of a variable name has some special functionality, like variable+":".

FreelanceReinh · Posted 09-13-2024 10:41 AM

Another purpose of leading underscores in variable names (not in your example) is the avoidance of name conflicts. For example, consider a SAS macro which uses a DATA step to work with a user-supplied dataset. The developer of such a macro doesn't know the user's variable names in advance. But assuming that they don't start with an underscore, or even two underscores, he or she might use variable names like __i so as to make name conflicts unlikely. With the same rationale I sometimes use such names when I suggest code here in this forum.

Similarly, some SAS procedures (e.g., PROC SUMMARY and PROC TRANSPOSE) create variable names with leading and trailing underscores. And there are the automatic variables _N_, _ERROR_, _IORC_, etc.

(Anecdote: I remember a company's standard reporting macro which surprisingly didn't work well when a user applied it to a dataset containing temperature data in a variable TEMP. Why? The macro developer had carelessly used that same name for a variable to store some intermediate results temporarily.)
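A rough sketch of that idea, not taken from any real macro: the macro name count_missing, its parameters, the array name __num and the output variable n_missing are all hypothetical. It assumes the input dataset has at least one numeric variable and no variables whose names start with two underscores:

```sas
%macro count_missing(data=, out=);
  data &out;
    set &data;
    /* numeric variables read from the input dataset */
    array __num{*} _numeric_;
    n_missing = 0;
    /* __i is unlikely to collide with a variable in the user's data */
    do __i = 1 to dim(__num);
      if missing(__num{__i}) then n_missing = n_missing + 1;
    end;
    drop __i;
  run;
%mend count_missing;

/* Example call, using the dataset from the question */
%count_missing(data=have, out=have_counts)
```

Had the loop index been named i or temp instead, a user's variable of the same name would be silently overwritten and then dropped, which is essentially the TEMP anecdote above.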

Quentin · Posted 09-13-2024 11:05 AM

Yes, I was taught to use the __ prefix for data set variable names in macros for that reason.

I've seen some people take collision avoidance further by using __macroname_ as a prefix, or even generating random variable names. I don't go that far, but I think I did once have the problem of having two different utility macros that both created a __temp variable. So that did encourage me to put a little bit more entropy into my dataset variable names.

I find it troubling when macro developers start using _ prefixes in the names of macro variables, particularly when they do that instead of defining them as %local. It's one of the signs that trigger me to wonder whether the macro developer understands the macro language. That would actually be a fun paper for me to try to write some time, "Things I've seen in macro code that make me doubt the entire macro (and macro author)."
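To illustrate why %local matters more than any naming prefix, here is a small sketch with two hypothetical macros, inner and outer, that both use a macro variable named i:

```sas
%macro inner;
  %local i;   /* this i exists only while inner is running */
  %do i = 1 %to 3;
    %put inner: i=&i;
  %end;
%mend inner;

%macro outer;
  %local i;   /* a separate i that belongs to outer */
  %do i = 1 %to 2;
    %inner
    %put outer: i=&i;
  %end;
%mend outer;

%outer
```

Without the %local statement in inner, its loop would update the i that belongs to outer, and the outer loop would stop after its first pass. Declaring the variable %local removes the collision no matter what it is called.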

Tom · Posted 09-13-2024 11:57 AM

If you really want to create a data step that uses variables that will not conflict with any variables in the input datasets, then look into using temporary arrays.

You can use _NUMERIC_, _CHAR_ and _CHARACTER_ as names for arrays, but they cannot be used as names for variables.

For an example of this in use see this macro

https://github.com/sasutils/macros/blob/master/xport2sas.sas
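As a rough sketch of the temporary-array idea (my own adaptation, not code from the linked macro), the DATA step from the question could keep its counters in a temporary array and use the automatic variable _N_ as the loop index. The array name __t is a hypothetical choice:

```sas
data want;
  set have;
  array x{*} M: ;
  /* __t{1}: current run of missing months, __t{2}: run length as of the previous month */
  array __t{2} _temporary_;
  /* temporary array elements are retained, so reset them for every observation */
  __t{1}=0;
  __t{2}=0;
  do _n_=1 to dim(x);
    if missing(x{_n_}) then __t{1}=__t{1}+1;
    else do;
      __t{1}=0;
      if __t{2}>5 then month=vname(x{_n_});
    end;
    __t{2}=__t{1};
  end;
run;
```

Neither the temporary array elements nor _N_ are written to the output dataset, so no DROP statement is needed and no new dataset variables are created apart from month.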
