Ronein
Meteorite | Level 14

Hello

I want to use an array to apply the same formula to several variables at once.

I don't get the desired result: the calculated variables _x _y _z _r _q _g all get missing values.

When I calculate each variable manually (without an array), the calculation is fine.

What is wrong with my array?


proc format ;
VALUE Ratio_Till1_Fmt
-9997 ='(1) Ratio un-defined'
0='(2) 0'
0-<0.1='(3) (0,0.1]'
0.1-<0.2='(4) [0.1,0.2)'
0.2-<0.3='(5) [0.2,0.3)'
0.3-<0.4='(6) [0.3,0.4)'
0.4-<0.5='(7) [0.4,0.5)'
0.5-<0.6='(8) [0.5,0.6)'
0.6-<0.7='(9) [0.6,0.7)'
0.7-<0.8='(10) [0.7,0.8)'
0.8-<0.9='(11) [0.8,0.9)'
0.9-<1.0='(12) [0.9,1.0)'
1.0='(13) 1'
;
Run;
Data have;
input X y Z R q g;
cards;
0 0.1 0.8 0.1 1 -9997
0.2 0.3 0.4 0 0 0.4
;
Run;
data want;
set have;
array _vars(*) x y z r q g;
array _Bvars(*) _x _y _z _r _q _g;
do i=1 to dim(_vars);
_Bvars(i)=put(_vars(i),Ratio_Till1_Fmt.);
end;
drop i;
/*calc_x=put(x,Ratio_Till1_Fmt.);*/
/*calc_y=put(y,Ratio_Till1_Fmt.);*/
/*calc_z=put(z,Ratio_Till1_Fmt.);*/
/*calc_r=put(r,Ratio_Till1_Fmt.);*/
/*calc_q=put(q,Ratio_Till1_Fmt.);*/
/*calc_g=put(g,Ratio_Till1_Fmt.);*/
Run;
1 ACCEPTED SOLUTION

PeterClemmensen
Tourmaline | Level 20

You have to define the _Bvars array as character with an appropriate length. Without the $ and a length, the ARRAY statement creates the new variables as numeric.

 

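data want;
   set have;
   /* source variables (numeric) */
   array _vars(*) x y z r q g;
   /* new variables: $20 makes them character, long enough for the longest format label */
   array _Bvars(*) $20 _x _y _z _r _q _g;

   do i=1 to dim(_vars);
      /* PUT returns the formatted value as a character string */
      _Bvars(i)=put(_vars(i),Ratio_Till1_Fmt.);
   end;

   drop i;
run;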
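
As an alternative sketch (not part of the accepted answer), the new variables can be declared as character with a LENGTH statement before the ARRAY statement. An array that lists variables that already exist takes its type from them, so no $ is needed on the array itself. This assumes the have data set and the Ratio_Till1_Fmt format from the question are available.

```sas
data want;
   set have;
   /* declare the new variables as character first */
   length _x _y _z _r _q _g $20;
   array _vars(*) x y z r q g;
   /* the array picks up the character type from the variables defined above */
   array _Bvars(*) _x _y _z _r _q _g;

   do i=1 to dim(_vars);
      _Bvars(i)=put(_vars(i),Ratio_Till1_Fmt.);
   end;

   drop i;
run;
```

Either form should remove the "Invalid numeric data" notes from the log, since the target variables are now character.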

4 REPLIES
Quentin
Super User

The clue to the problem was the notes in the log about implicit conversion from character values to numeric values:

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      30:1
NOTE: Invalid numeric data, '(2) 0' , at line 30 column 11.
NOTE: Invalid numeric data, '(4) [0.1,0.2)' , at line 30 column 11.
NOTE: Invalid numeric data, '(11) [0.8,0.9)' , at line 30 column 11.
NOTE: Invalid numeric data, '(4) [0.1,0.2)' , at line 30 column 11.
NOTE: Invalid numeric data, '(13) 1' , at line 30 column 11.
NOTE: Invalid numeric data, '(1) Ratio un-defined' , at line 30 column 11.
X=0 y=0.1 Z=0.8 R=0.1 q=1 g=-9997 _x=. _y=. _z=. _r=. _q=. _g=. i=7 _ERROR_=1 _N_=1
NOTE: Invalid numeric data, '(5) [0.2,0.3)' , at line 30 column 11.
NOTE: Invalid numeric data, '(6) [0.3,0.4)' , at line 30 column 11.
NOTE: Invalid numeric data, '(7) [0.4,0.5)' , at line 30 column 11.
NOTE: Invalid numeric data, '(2) 0' , at line 30 column 11.
NOTE: Invalid numeric data, '(2) 0' , at line 30 column 11.
NOTE: Invalid numeric data, '(7) [0.4,0.5)' , at line 30 column 11.
X=0.2 y=0.3 Z=0.4 R=0 q=0 g=0.4 _x=. _y=. _z=. _r=. _q=. _g=. i=7 _ERROR_=1 _N_=2
NOTE: There were 2 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 2 observations and 12 variables.

I think of those notes as errors.

 

SAS variables are strongly typed, in the sense that a variable is either numeric or character.

 

But in the DATA step language, you are not required to explicitly define the type of each variable.  If you don't define the type, the compiler will decide the type for you, based on rules for the statements/functions/etc used to create the variable.

 

When you use an array statement to create new variables, and do not explicitly define the variable type, the variables are created as numeric variables.

Then in your assignment statement, you are assigning a character value to a numeric variable:

_Bvars(i)=put(_vars(i),Ratio_Till1_Fmt.);

SAS will try to do an implicit character-to-numeric conversion for you (which causes the first NOTE, which I think should be an error), and when that conversion fails it will return a missing value (and generate the remaining notes, which I think should be errors too).

 

When you don't use an array, the compiler sees:

calc_x=put(x,Ratio_Till1_Fmt.);

And says "oh, this statement is creating a new variable calc_x, it should be a character variable because Ronein is assigning a character value to it, so I'll make calc_x a character variable.  And I'll give calc_x a length of $20, because that is the longest length that can be returned by the format Ratio_Till1_Fmt."
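
One way to see these rules in action is to ask SAS for the type and length it chose. The following is only a sketch: the names n1, n2, c1, c2 and the t_ and l_ variables are made up for illustration, and it assumes the PROC FORMAT step from the question has already been run.

```sas
data _null_;
   x = 0.15;
   array a_num(*) n1 n2;        /* no type given: elements are created as numeric */
   array a_chr(*) $20 c1 c2;    /* explicit character type with length 20 */
   calc_x = put(x, Ratio_Till1_Fmt.);   /* type inferred from the PUT function */

   t_num  = vtype(n1);          /* VTYPE returns 'N' for numeric, 'C' for character */
   t_chr  = vtype(c1);
   t_calc = vtype(calc_x);
   l_calc = vlength(calc_x);    /* length the compiler assigned to calc_x */

   /* write the results to the log */
   put t_num= t_chr= t_calc= l_calc=;
run;
```

The log line shows which type the compiler picked for each variable and the length it gave calc_x. The notes about n1, n2, c1 and c2 being uninitialized are expected here.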

PaigeMiller
Diamond | Level 26

Why do you need new character variables to do this? Why not just apply the format to the numeric variables in array _vars and use their formatted values?

 

Character variables sort alphabetically; numeric variables sort numerically. With character variables, you need the (5) to get things to sort properly; with numeric variables the results can be sorted into the desired numerical order easily.

 

Also, I think it is really poor design and unprofessional to have categories (5) [0.2,0.3) instead of [0.2,0.3). No one cares that you have to use (5) to get things to sort properly, it just confuses people viewing your results, not a good thing. I hate seeing presentations where the months in the table are (1) Jan, (2) Feb and so on. There are many other ways to get things to sort properly. If you leave the values as numeric, which is the best practice for numbers, and don't stick (5) in front of the categories in the format, many procedures have an option ORDER=INTERNAL which forces your values to sort properly; plus other methods as well.
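
A minimal sketch of that idea, assuming the have data set and the Ratio_Till1_Fmt format from the question: the variables stay numeric and the format is only attached to them. PROC FREQ is used here just as one example of a procedure that groups by formatted value and accepts ORDER=INTERNAL.

```sas
data want;
   set have;
   /* attach the format; the stored values stay numeric */
   format x y z r q g Ratio_Till1_Fmt.;
run;

/* rows are grouped by formatted value and ordered by the underlying numeric value */
proc freq data=want order=internal;
   tables x y z r q g;
run;
```

No array and no new variables are needed in this version.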

 

 

--
Paige Miller
Ronein
Meteorite | Level 14
Thanks.
I need to run a logistic regression.
I was told that group 1 is the worst group, group 2 is better, and so on.
I was also told that people want the category (group) descriptions to show the group number (1, 2, 3, 4, 5), because it tells them immediately whether a group is the worst, the best, or in between.
I agree that the best approach is to create a numeric variable that holds the group number (1, 2, 3, 4, 5) and attach a format to it (as I mentioned, the format needs to show the group number too). I agree that creating a character variable is not the clever way.
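
A rough sketch of that numeric-group idea, reusing the labels of Ratio_Till1_Fmt: the group number is read back from the text between the first pair of parentheses in each label. The grp_ variable names are hypothetical, and this assumes every value falls in one of the format's ranges (a value outside them would come back as its own number rather than a group).

```sas
data want;
   set have;
   array _vars(*) x y z r q g;
   /* hypothetical names; no $ here, so these are numeric */
   array _grp(*) grp_x grp_y grp_z grp_r grp_q grp_g;

   do i=1 to dim(_vars);
      /* a label looks like '(5) [0.2,0.3)'; the first token between ( and ) is the group number */
      _grp(i)=input(scan(put(_vars(i),Ratio_Till1_Fmt.),1,'()'),8.);
   end;

   drop i;
run;
```

A separate format that maps 1, 2, 3 and so on to the full descriptions could then be attached to the grp_ variables.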
