Hi all suppose to have the following two cases:
dataset1:
ID Condition N_data_exp N_data_dis N_data_reg
A 2 23 54 65
B 1 27 23 67
C 5 32 43 78
D 3 2 23 70
E 8 87 1 4
.. .. .. .. ..
I would replace all values < 5 in columns starting with "N_" with 999. I used the
following code but it does not work. Of course I'm doing something wrong:
data output;
set dataset1;
if N_: < 5 then N_: = 999;
run;
I cannot use the condition "numeric" because there is the variable "Condition" that must not change.
Then the second case is the following:
dataset2:
ID Condition N_data_exp N_data_dis N_data_reg data_exp_res data_dis_res ....
A 2 23 54 65 345 34
B 1 27 23 67 52 987
C 5 32 43 78 675 321
D 3 2 23 70 0.2 41
E 8 87 1 4 34 0.001
.. .. .. .. .. .. ..
I would replace all values < 5 in columns starting with "N_" with 999 in corresponding columns ending with "_res".
Can anyone help me please? I have to manage the two cases separately.
Really thank you in advance
@NewUsrStat wrote:
Yes, so the condition should be verified in columns starting with N_ while the replacement should be done in columns (corresponding!!!!) ending with _res
So you want to make NEW variables with names that are derived from the names of other variables?
You can use an SQL query to help with that.
So get a list of the variables in your dataset. The easiest way is to use PROC CONTENTS. So if your existing dataset is named HAVE you can make a dataset named CONTENTS with the names (and other metadata) for the variables.
proc contents noprint data=HAVE out=CONTENTS;
run;
Now you can use that dataset to find the variables you want and also generate their names into macro variables by using PROC SQL.
proc sql noprint;
select nliteral(name)
, nliteral(cats(substr(name,3),'_res'))
into :old separated by ' '
, :new separated by ' '
from contents
where upcase(name) eqt 'N_'
and type='num'
;
quit;
Now you can use those two macro variables to define arrays to allow you to run the same code for many variable pairs by indexing into the arrays.
You can then use a DO loop to iterate an index variable over the number of variables in the array and do what ever calculations you need.
It looks like you are trying to HIDE the values of the NEW list when the values in the OLD list are less than 5.
So you might do something like this to replace such values with special missing .B (B for blinded).
data want;
set have;
array old &old;
array new &new;
do index=1 to dim(old);
if 0 < old[index] < 5 then new[index]=.B ;
end;
run;
Are these numeric variables? If so, they cannot have the value "Outlier", that is character. Cannot be done.
But you can DISPLAY values less than 5 as some text string ... their value will still be a number, but it is displayed as Outlier. Would that work for you? The only reason I can think of for putting Outlier into these fields is to display the table for people.
proc format;
value outlf low-5='Outlier';
run;
proc print data=have;
var id condition n_:;
format n_: outlf.;
run;
Note: if you do arithmetic on these variables, such as computing a sum, a number less than 5 will still be that number.
@NewUsrStat wrote:
I will edit the post. Instead of "outlier" also 999 is ok
NOOOOOO! NOOOOOO! NOOOOOO! NOOOOOO!
NOOOOOO! NOOOOOO! NOOOOOO! NOOOOOO!
You should never put a potentially valid number into a dataset to indicate outlier. Then any sum or average or other computation will not be correct.
Please discuss the purpose of doing this. Is it to display a table for people? Is it for some other reason? If so, what?
What is wrong with the PROC FORMAT solution I gave?
Nothing wrong with proc format. No computations will be done on the tables. They result from computations already. Will be only displayed.
Then PROC FORMAT is the solution I would recommend.
So, if I am understanding you properly, you want only to apply this format to variables in DATASET2 whose name ends with a particular suffix.
Thus, you need a list of those variables. PROC SQL can create that list (as the macro variable &names, by finding names that end in _RES)
proc sql noprint;
select name into :names separated by ' ' from dictionary.columns
where reverse(trim(name)) eqt 'ser_' and libname='WORK' and memname='DATASET2';
quit;
%put &=names;
Then
proc print data=dataset2;
format &names outf.;
run;
@NewUsrStat wrote:
Yes, so the condition should be verified in columns starting with N_ while the replacement should be done in columns (corresponding!!!!) ending with _res
So you want to make NEW variables with names that are derived from the names of other variables?
You can use an SQL query to help with that.
So get a list of the variables in your dataset. The easiest way is to use PROC CONTENTS. So if your existing dataset is named HAVE you can make a dataset named CONTENTS with the names (and other metadata) for the variables.
proc contents noprint data=HAVE out=CONTENTS;
run;
Now you can use that dataset to find the variables you want and also generate their names into macro variables by using PROC SQL.
proc sql noprint;
select nliteral(name)
, nliteral(cats(substr(name,3),'_res'))
into :old separated by ' '
, :new separated by ' '
from contents
where upcase(name) eqt 'N_'
and type='num'
;
quit;
Now you can use those two macro variables to define arrays to allow you to run the same code for many variable pairs by indexing into the arrays.
You can then use a DO loop to iterate an index variable over the number of variables in the array and do what ever calculations you need.
It looks like you are trying to HIDE the values of the NEW list when the values in the OLD list are less than 5.
So you might do something like this to replace such values with special missing .B (B for blinded).
data want;
set have;
array old &old;
array new &new;
do index=1 to dim(old);
if 0 < old[index] < 5 then new[index]=.B ;
end;
run;
Never say "it doesn't work" and provide no further information. We don't know what you did; we don't know what happened.
Always show us the code you used. Always show us (a portion of) the data set as WORKING data step code (examples and instructions). If there are errors in the log, provide the ENTIRE log for this PROC, not just the error messages.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.