@EvansHsieh wrote:
I have a data set, created as follows:
---------------------------------------------------------------------------
data temp_;
  input x y z;
  cards;
1 2 10
2 5 10
3 6 10
7 1 2
run;
---------------------------------------------------------------------------
Now I want to classify each column using the same rules:
(1) variable < 4 then 'First Cluster'
(2) 4 <= variable < 7 then 'Second Cluster'
(3) 7 <= variable then 'Third Cluster'.
Here is my attempt:
---------------------------------------------------------------------------
data test_;
  set temp_;
  array level_ x -- z;
  do over level_;
    if level_ < 4 then level_ = 'First_Cluster';
    else if 4 <= level_ < 7 then level_ = 'Second_Cluster';
    else if 7 <= level_ then level_ = 'Third_Cluster';
    else level_ = 'Other';
  end;
run;
---------------------------------------------------------------------------
But every value in the output data set is missing. It seems that an `array` element can't be assigned a string. Is there no way to classify numeric variables with an `array`? Any suggestions?
Thank you for all your help !!
Did you read the LOG at all?
Here is the log from the second data step shown above:
406 data test_;
407 set temp_;
408 array level_ x -- z;
409 do over level_;
410 if level_ < 4 then level_ = 'First_Cluster';
411 else if 4 <= level_ < 7 then level_ = 'Second_Cluster';
412 else if 7 <= level_ then level_ = 'Third_Cluster';
413 else level_ = 'Other';
414 end;
415 run;
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
410:24 411:34 412:30 413:10
NOTE: Invalid numeric data, 'First_Cluster' , at line 410 column 33.
NOTE: Invalid numeric data, 'First_Cluster' , at line 410 column 33.
NOTE: Invalid numeric data, 'Third_Cluster' , at line 412 column 39.
x=. y=. z=. _I_=4 _ERROR_=1 _N_=1
NOTE: Invalid numeric data, 'First_Cluster' , at line 410 column 33.
NOTE: Invalid numeric data, 'Second_Cluster' , at line 411 column 43.
NOTE: Invalid numeric data, 'Third_Cluster' , at line 412 column 39.
x=. y=. z=. _I_=4 _ERROR_=1 _N_=2
NOTE: Invalid numeric data, 'First_Cluster' , at line 410 column 33.
NOTE: Invalid numeric data, 'Second_Cluster' , at line 411 column 43.
NOTE: Invalid numeric data, 'Third_Cluster' , at line 412 column 39.
x=. y=. z=. _I_=4 _ERROR_=1 _N_=3
NOTE: Invalid numeric data, 'Third_Cluster' , at line 412 column 39.
NOTE: Invalid numeric data, 'First_Cluster' , at line 410 column 33.
NOTE: Invalid numeric data, 'First_Cluster' , at line 410 column 33.
x=. y=. z=. _I_=4 _ERROR_=1 _N_=4
NOTE: There were 4 observations read from the data set WORK.TEMP_.
NOTE: The data set WORK.TEST_ has 4 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Does the phrase "Invalid numeric data, 'First_Cluster'" not make sense?
Or the "Character values have been converted to numeric values at the places given by:" ?
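The log tells you that you are attempting to convert character values to numeric and failing. X, Y and Z are numeric, so the array LEVEL_ is numeric and cannot store a string such as 'First_Cluster'. SAS tries to convert the string to a number, fails, and sets the variable to missing.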
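If the goal is a data set, one possible fix is to leave the numeric variables alone and write the labels into new character variables through a second array. This is only a sketch: the names VALS, X_LEVEL, Y_LEVEL, Z_LEVEL and I are made up for illustration, and sending missing values to 'Other' is an assumption (in the original logic a missing value compares as less than 4, so the 'Other' branch is never reached).

```sas
data test_;
  set temp_;
  /* existing numeric variables from TEMP_ */
  array vals{*} x y z;
  /* hypothetical new character variables, length 14 fits 'Second_Cluster' */
  array level_{3} $ 14 x_level y_level z_level;
  do i = 1 to dim(vals);
    /* assumption: missing values go to 'Other' */
    if missing(vals{i}) then level_{i} = 'Other';
    else if vals{i} < 4 then level_{i} = 'First_Cluster';
    else if vals{i} < 7 then level_{i} = 'Second_Cluster';
    else level_{i} = 'Third_Cluster';
  end;
  drop i;
run;
```

The original X, Y and Z stay numeric, so they remain available for further calculations alongside the labels.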
It is a good idea to show what you expect for output, and whether the result should be a data set for further manipulation or a report that people will read.
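If the result only needs to be a report, a user-defined format may be enough, with no new variables at all. A minimal sketch, assuming the three rules from the question; the format name CLUSTER is made up for illustration:

```sas
proc format;
  /* hypothetical format name; ranges follow the rules in the question */
  value cluster
    low -< 4 = 'First_Cluster'
    4 -< 7   = 'Second_Cluster'
    7 - high = 'Third_Cluster'
    other    = 'Other';   /* missing values land here */
run;

proc print data=temp_;
  /* values stay numeric; only the displayed text changes */
  format x y z cluster.;
run;
```

The same format can also be used in procedures such as PROC FREQ to group the values without changing the stored data.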