Ronein
Meteorite | Level 14

Hello

I want to use an array to apply the same formula to multiple variables at once.

I don't get the desired result: the calculated fields _x _y _z _r _q _g get missing values.

When I do the calculation manually (without an array), it works fine.

What is wrong with my array?


proc format ;
VALUE Ratio_Till1_Fmt
-9997 ='(1) Ratio un-defined'
0='(2) 0'
0<-<0.1='(3) (0,0.1)'
0.1-<0.2='(4) [0.1,0.2)'
0.2-<0.3='(5) [0.2,0.3)'
0.3-<0.4='(6) [0.3,0.4)'
0.4-<0.5='(7) [0.4,0.5)'
0.5-<0.6='(8) [0.5,0.6)'
0.6-<0.7='(9) [0.6,0.7)'
0.7-<0.8='(10) [0.7,0.8)'
0.8-<0.9='(11) [0.8,0.9)'
0.9-<1.0='(12) [0.9,1.0)'
1.0='(13) 1'
;
Run;
Data have;
input X y Z R q g;
cards;
0 0.1 0.8 0.1 1 -9997
0.2 0.3 0.4 0 0 0.4
;
Run;
data want;
set have;
array _vars(*) x y z r q g;
array _Bvars(*) _x _y _z _r _q _g;
do i=1 to dim(_vars);
_Bvars(i)=put(_vars(i),Ratio_Till1_Fmt.);
end;
drop i;
/*calc_x=put(x,Ratio_Till1_Fmt.);*/
/*calc_y=put(y,Ratio_Till1_Fmt.);*/
/*calc_z=put(z,Ratio_Till1_Fmt.);*/
/*calc_r=put(r,Ratio_Till1_Fmt.);*/
/*calc_q=put(q,Ratio_Till1_Fmt.);*/
/*calc_g=put(g,Ratio_Till1_Fmt.);*/
Run;

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

You have to define the _Bvars array as character with an appropriate length. 

 

data want;
set have;
array _vars(*) x y z r q g;
array _Bvars(*) $20 _x _y _z _r _q _g;

do i=1 to dim(_vars);
   _Bvars(i)=put(_vars(i),Ratio_Till1_Fmt.);
end;

drop i;
Run;
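
To illustrate why the length matters, here is a minimal sketch (the variable names short and full are hypothetical, not from the original post). The longest label in the format, '(1) Ratio un-defined', is 20 characters, so a shorter declared length would silently truncate it:

```sas
data _null_;
   /* short and full are hypothetical names used only for this illustration */
   length short $10 full $20;
   short = '(1) Ratio un-defined';  /* only the first 10 characters are kept */
   full  = '(1) Ratio un-defined';  /* all 20 characters fit */
   put short= full=;                /* writes both values to the log */
run;
```

SAS does not warn about this kind of truncation, so it is worth checking the longest label whenever the format changes.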

Quentin
Super User

The clue to the problem was the notes in the log about implicit conversion from character values to numeric values:

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      30:1
NOTE: Invalid numeric data, '(2) 0' , at line 30 column 11.
NOTE: Invalid numeric data, '(4) [0.1,0.2)' , at line 30 column 11.
NOTE: Invalid numeric data, '(11) [0.8,0.9)' , at line 30 column 11.
NOTE: Invalid numeric data, '(4) [0.1,0.2)' , at line 30 column 11.
NOTE: Invalid numeric data, '(13) 1' , at line 30 column 11.
NOTE: Invalid numeric data, '(1) Ratio un-defined' , at line 30 column 11.
X=0 y=0.1 Z=0.8 R=0.1 q=1 g=-9997 _x=. _y=. _z=. _r=. _q=. _g=. i=7 _ERROR_=1 _N_=1
NOTE: Invalid numeric data, '(5) [0.2,0.3)' , at line 30 column 11.
NOTE: Invalid numeric data, '(6) [0.3,0.4)' , at line 30 column 11.
NOTE: Invalid numeric data, '(7) [0.4,0.5)' , at line 30 column 11.
NOTE: Invalid numeric data, '(2) 0' , at line 30 column 11.
NOTE: Invalid numeric data, '(2) 0' , at line 30 column 11.
NOTE: Invalid numeric data, '(7) [0.4,0.5)' , at line 30 column 11.
X=0.2 y=0.3 Z=0.4 R=0 q=0 g=0.4 _x=. _y=. _z=. _r=. _q=. _g=. i=7 _ERROR_=1 _N_=2
NOTE: There were 2 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 2 observations and 12 variables.

I think of those notes as errors.

 

SAS variables are strongly typed, in the sense that a variable is either numeric or character.

 

But in the DATA step language, you are not required to explicitly define the type of each variable.  If you don't define the type, the compiler will decide the type for you, based on rules for the statements/functions/etc used to create the variable.

 

When you use an array statement to create new variables, and do not explicitly define the variable type, the variables are created as numeric variables.
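
A minimal sketch of that rule, using hypothetical variable names (a to f) and the VTYPE and VLENGTH functions to report what the compiler decided:

```sas
data _null_;
   array nums(*) a b c;          /* no $ : a, b, c are created as numeric */
   array chars(*) $20 d e f;     /* $20  : d, e, f are created as character, length 20 */
   type_a = vtype(a);            /* 'N' = numeric */
   type_d = vtype(d);            /* 'C' = character */
   len_d  = vlength(d);          /* declared length of d */
   put type_a= type_d= len_d=;   /* writes the results to the log */
run;
```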

 

Then in your assignment statement, you are assigning a character value to a numeric variable:

_Bvars(i)=put(_vars(i),Ratio_Till1_Fmt.);

SAS will try to do an implicit character-to-numeric conversion for you (which causes the first NOTE, which I think should be an error). When that conversion fails, it returns a missing value and generates the remaining notes, which I also think should be errors.
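
The same behavior can be reproduced in isolation. This is a small sketch with a hypothetical variable n, not the original program:

```sas
data _null_;
   length n 8;     /* n is explicitly numeric */
   n = '(2) 0';    /* character value assigned to a numeric variable */
   put n=;         /* n is missing because '(2) 0' cannot be read as a number */
run;
```

The log for this step should contain the same two kinds of notes: one about character values being converted to numeric, and one about invalid numeric data.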

 

When you don't use an array, the compiler sees:

calc_x=put(x,Ratio_Till1_Fmt.);

And says "oh, this statement is creating a new variable calc_x, it should be a character variable because Ronein is assigning a character value to it, so I'll make calc_x a character variable.  And I'll give calc_x a length of $20, because that is the longest length that can be returned by the format Ratio_Till1_Fmt."
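
One way to check what the compiler decided is sketched below. It assumes the Ratio_Till1_Fmt format from the question has already been created; the names type_calc_x and len_calc_x are hypothetical.

```sas
/* Assumes PROC FORMAT from the question has been run in this session. */
data _null_;
   x = 0.25;
   calc_x = put(x, Ratio_Till1_Fmt.);   /* first use of calc_x, so its type is set here */
   type_calc_x = vtype(calc_x);         /* 'C' for character, 'N' for numeric */
   len_calc_x  = vlength(calc_x);       /* length fixed at compile time */
   put calc_x= type_calc_x= len_calc_x=;
run;
```

If the description above holds, the log should show a character variable whose length equals the longest label in the format.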

PaigeMiller
Diamond | Level 26

Why do you need new character variables to do this? Why not just use the numeric formatted values in array  _vars?

 

Character variables sort alphabetically; numeric variables sort numerically. With character variables, you need the (5) to get things to sort properly; with numeric variables, the results can easily be sorted into the desired numerical order.

 

Also, I think it is really poor design and unprofessional to have categories (5) [0.2,0.3) instead of [0.2,0.3). No one cares that you have to use (5) to get things to sort properly, it just confuses people viewing your results, not a good thing. I hate seeing presentations where the months in the table are (1) Jan, (2) Feb and so on. There are many other ways to get things to sort properly. If you leave the values as numeric, which is the best practice for numbers, and don't stick (5) in front of the categories in the format, many procedures have an option ORDER=INTERNAL which forces your values to sort properly; plus other methods as well.
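
As a rough sketch of that approach, the variables can stay numeric and simply have the format attached in the procedure. This assumes WORK.HAVE and Ratio_Till1_Fmt from the question already exist; PROC FREQ is just one example of a procedure that accepts ORDER=INTERNAL.

```sas
/* Assumes data set HAVE and format Ratio_Till1_Fmt from the question. */
proc freq data=have order=internal;     /* order levels by the underlying numeric values */
   tables x y z r q g;
   format x y z r q g Ratio_Till1_Fmt.; /* group and display by formatted value */
run;
```

Because the ordering comes from the stored numbers rather than the label text, the labels would not need a sorting prefix for the table to come out in numeric order.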

 

 

--
Paige Miller
Ronein
Meteorite | Level 14
Thanks.
I need to run a logistic regression.
It was defined for me that the worst group is 1, group 2 is better, and so on.
People also want the category (group) descriptions to show whether it is group 1, 2, 3, 4, or 5, because that immediately tells them whether it is the worst group, the best group, or somewhere in between.
I agree that the best approach is to create a numeric variable that takes the values 1, 2, 3, 4, 5 (the group number) and attach a format to it (as I mentioned, the format needs to include the group number too). I agree that creating a character variable is not the clever way.
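
One possible way to build such numeric group variables is sketched below. It is an assumption-laden illustration, not the method used in the thread: the format name grp_fmt, the data set name want_grp, and the grp_ variable names are hypothetical, and it relies on every label in Ratio_Till1_Fmt starting with the group number in parentheses.

```sas
proc format;
   value grp_fmt                     /* hypothetical format keyed on group number */
      1  = '(1) Ratio un-defined'
      2  = '(2) 0'
      3  = '(3) (0,0.1)'
      4  = '(4) [0.1,0.2)'
      5  = '(5) [0.2,0.3)'
      6  = '(6) [0.3,0.4)'
      7  = '(7) [0.4,0.5)'
      8  = '(8) [0.5,0.6)'
      9  = '(9) [0.6,0.7)'
      10 = '(10) [0.7,0.8)'
      11 = '(11) [0.8,0.9)'
      12 = '(12) [0.9,1.0)'
      13 = '(13) 1';
run;

/* Assumes data set HAVE and format Ratio_Till1_Fmt from the question. */
data want_grp;
   set have;
   array _vars(*) x y z r q g;
   array _grp(*) grp_x grp_y grp_z grp_r grp_q grp_g;   /* numeric by default */
   do i = 1 to dim(_vars);
      /* take the text between the first parentheses of the label and read it as a number */
      _grp(i) = input(scan(put(_vars(i), Ratio_Till1_Fmt.), 1, '()'), 8.);
   end;
   format grp_: grp_fmt.;            /* display the group number with its description */
   drop i;
run;
```

Values that fall outside every range of Ratio_Till1_Fmt have no such prefix, so they would need separate handling before this is used for modeling.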
