BookmarkSubscribeRSS Feed
kai_cody
Fluorite | Level 6

Hello, can someone please advise me on the best way to perform the following (whether through macro, array, and/or do loop)?

 

For each variable, I want to search through all observations in the column and see if there is a "<" symbol. For variables with such a symbol, I want to completely ignore the number following the "<" and instead replace this entire observation with the minimum observation (smallest value) in the entire column. For variables without this symbol, I don't need to do anything and want to leave the data as is.

 

ID             VAR1        VAR2

1               853.1         <12.1 (replace this with the smallest value in the entire VAR2 column, ignoring 12.1 and any other observations that have "<")

2               173.4         888.3

3               481.6         385.2

 

I attempted a macro, but it is not working, and now I wondering if an array or do loop would work better?

The issue is that the variables WITH "<" are registered as characters by SAS, while the variables WITHOUT "<" are seen as numeric. And I have many variables, so I'm trying to figure out a way to efficiently apply code to all variables, leaving the variables without "<" alone.

 

INCORRECT MACRO

%macro imputation (var);



data tab1_&var; set data (keep = ID &var);

if not (find(&var,'<')) then do;
&var_good = 111111; end; /* essentially don't need to do anything here, but I didn't know how to get SAS to "do nothing" */

else if find(&var,'<') then do;
&var = 999999; end;



&var_ = &var * 1; /* convert to numeric */
drop &var &var_good; run;


proc sql;
create table tab2_&var as
select *, min(&var_) as min_&var   /* find the min observation in entire column, create a new column with this value, and attach to the previous dataset*/
from tab1_&var;
quit;


data tab3_&var; set tab2_&var;

if &var_ = 999999 then &var = min_&var; /* only assign the minimum observation to cells that had a "<" */
else &var = &var_; run;


data table_&var; set tab3_&var (keep = ID &var); run;



proc sort data=table_&var;
by ID ;
run;

%mend;
run;



%imputation(var1);
%imputation(var2);
%imputation(var3);

 

(The final step will be horizontally merge all the sorted datasets by ID... so that all data is now correct and complete).

 

Any guidance would be greatly appreciated! Thank you!!!

2 REPLIES 2
ballardw
Super User

You might have somewhat better luck by making sure the variable is numeric. "Minimum" value of character variables is a problematic concept as with character values '1111111' is "less than" '9' because the comparisons are done character by character until there is a difference. So the first '1' is "less than" '9'.

 

If the variable is numeric, which means the "<12" would have been created as missing then the minimum function will return the correct non-missing value.

 

One way after you make or place the variables into numeric values.

data have;
   input id var1  var2 var3;
datalines;
2 10   123.32  .4
3 .    0.19    .
5 10   .       1.2
1 25   44      2.3
7 .    55      0
;

Proc summary data=have;
   var var1 var2 var3;
   output out=work.impute min= /autoname
   ;
run;

Proc sql;
   create table imputed as
   select a.id, coalesce(a.var1,b.var1_min) as var1
          ,coalesce(a.var2,b.var2_min) as var2
          ,coalesce(a.var3,b.var3_min) as var3
   from have as a, work.impute as b
   ;
quit;

The coalesce function returns the first non-missing value in the parameters as examined from left to right.

 

To create numeric values from your apparent character values (otherwise the likely wouldn't be any <) then either reread the raw data with a proper numeric informat or in a data step use something like:   newvar = input(oldvar, 8.);

Reeza
Super User
You may find this post informative, especially if you want to go beyond just replacing with minimum. It covers your use case as well.
https://sasexamplecode.com/replace-missing-values-in-sas/

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 364 views
  • 0 likes
  • 3 in conversation