BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Polina_UH
Obsidian | Level 7

Dear all,

 

I have created a macro variable for over 200 metabolites &biomarker.

 

I have standardized &biomarker variable and by running proc univariate I can see that  half of the biomarkers have outliers.

 

I tried this code to set the outliers as missing values:

 

data met; set met;
%if &biomarker<-5 %then %let &biomarker=.;

%else %if &biomarker>5 %then %let &biomarker=.;

run;

 

I got these error messages

 

ERROR: The %IF statement is not valid in open code.

ERROR: The %ELSE statement is not valid in open code.

 

Even though I saw the posts saying that it was possible to run these statements in the open code.

 

Then I tried to run it within a  macro

 

%macro clean;
data met; set met;
%if &biomarker<-5 %then %let &biomarker=.;
%else %if &biomarker>5 %then %let &biomarker=.;
run;
%mend clean;

%clean

 

ERROR: Expecting a variable name after %LET.
ERROR: Symbolic variable name XXLVLDLP XXLVLDLL XXLVLDLPL XXLVL must be 32 or fewer characters
long.
ERROR: The macro CLEAN will stop executing.

 

Then I tried to condition with just one condition at the time

 

%macro clean;
data met; set met;
%if &biomarker<-5 %then %let &biomarker=.; run;
%mend clean;

 

I did not get any error messages. But when I run proc univariate with &biomarker I can see that all values below -5 are still in the data,

 

Please help me to modify my code to execute data cleaning.

 

Thank you very much in advance!

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

My bad. I wrote the code wrong. THis should fix it

 

data want;
    set have;
    array x XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE
        XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL XLVLDLPL XLVLDLC;
    do i=1 to dim(x);
        if x(i)>5 or x<(i)-5 then x(i)=.;
    end;
    drop i;
run;
--
Paige Miller

View solution in original post

14 REPLIES 14
Astounding
PROC Star

DATA steps do not run macro language %IF %THEN statements.

 

You can still refer to a macro variable using &biomarker, but get rid of all the percent signs.

 

In addition, what does &biomarker resolve to?  That will be necessary to come up with right DATA step logic to replace what you have now.

Polina_UH
Obsidian | Level 7

Thank you for your answer!

 

Running the code without % signs gives this error:

 

 

NOTE: Line generated by the macro variable "BIOMARKER".
1 XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL
--------
388
76
1 ! XLVLDLPL XLVLDLC

ERROR 388-185: Expecting an arithmetic operator.

ERROR 76-322: Syntax error, statement will be ignored.

 

&biomarker is the set of 200+ numeric variables representing metabolites

Kurt_Bremser
Super User

Forget macro programming for the moment, learn to use the data step first.

 

What does

%put &biomarker.;

reveal in the log? Please post that log, using the {i} button.

 

How did you create the macro variable? You are MOST LIKELY better off by keeping that data in a dataset and manipulate it with data step means.

Polina_UH
Obsidian | Level 7

Hi, if macro if not the optimal solution to my problem, I would really appreciate any advice of what may be.

 

I will reduce macro variable to just a few items to make it easier.

 

This is how I created a macro variable:

 

%let biomarker = XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL
XLVLDLPL XLVLDLC;

 

here is the log you requested

 

1651 %put &biomarker;
XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL XLVLDLPL
XLVLDLC

PaigeMiller
Diamond | Level 26
data want;
    set have;
    array x XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE
        XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL XLVLDLPL XLVLDLC;
    do i=1 to dim(x);
        if x>5 or x<-5 then x=.;
    end;
    drop i;
run;
--
Paige Miller
Polina_UH
Obsidian | Level 7

Thank you so much! 

 

Unfortunately, the applying this code creates the array which includes all of the variables in my dataset including id and covariates.

 

Here is the log:

 

2249 data new; set met;
2250 array x XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE
2251 XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL XLVLDLPL XLVLDLC;
2252 do i=1 to dim(x);
2253 if x>5 or x<-5 then x=.;
2254 end;
2255 drop i;
2256 run;

ERROR: Array subscript out of range at line 2253 column 12.

PaigeMiller
Diamond | Level 26

My bad. I wrote the code wrong. THis should fix it

 

data want;
    set have;
    array x XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE
        XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL XLVLDLPL XLVLDLC;
    do i=1 to dim(x);
        if x(i)>5 or x<(i)-5 then x(i)=.;
    end;
    drop i;
run;
--
Paige Miller
Kurt_Bremser
Super User

@PaigeMiller forgot to add the array references in the code:

data want;
    set have;
    array x XXLVLDLP XXLVLDLL XXLVLDLPL XXLVLDLC XXLVLDLCE
        XXLVLDLFC XXLVLDLTG XLVLDLP XLVLDLL XLVLDLPL XLVLDLC;
    do i=1 to dim(x);
        if x{i}>5 or x{i}<-5 then x{i}=.;
    end;
    drop i;
run;
Polina_UH
Obsidian | Level 7

Dear @Kurt_Bremser and @PaigeMiller ,

 

Thank you so much! The code worked and my problem is solved!

 

Thank you so very much again!

PaigeMiller
Diamond | Level 26

@Polina_UH wrote:

I got these error messages

 

ERROR: The %IF statement is not valid in open code.

ERROR: The %ELSE statement is not valid in open code.

 

Even though I saw the posts saying that it was possible to run these statements in the open code.

 


These macro commands can be run in open code starting in SAS 9.4 Maintenance Release 5, but not in earlier versions of SAS.

 

%macro clean;
data met; set met;
%if &biomarker<-5 %then %let &biomarker=.;
%else %if &biomarker>5 %then %let &biomarker=.;
run;
%mend clean;

%clean

 

ERROR: Expecting a variable name after %LET.

 

The syntax of the %LET command usually does not include an ampersand (although it can include an ampersand, I don't believe you want to use the & in this situation).

 

But as stated by @Astounding this probably is not the proper way to do things.

 

Taking a big step backwards, if you are trying to eliminate outliers, I'm not seeing a need for macros at all. If you standardize each of your 200 variables somehow (it seems like you did this part already) and then you want any value of the standardized variable less then -5 or greater than 5 to be set to missing, this is an example where ARRAY statements work perfectly, and then no macros (and so no complication from macros) is needed.

--
Paige Miller
Polina_UH
Obsidian | Level 7

Hi, thank you for you answer. 

 

I standardized them just by using this code and it worked:

 

proc standard data=met mean=0 std=1 out=met;
var &biomarker;
run;

 

I'm not sure how to create an array: my variables have very different names and no consecutive numbers...

 

PaigeMiller
Diamond | Level 26

@Polina_UH wrote:

Hi, thank you for you answer. 

 

I standardized them just by using this code and it worked:

 

proc standard data=met mean=0 std=1 out=met;
var &biomarker;
run;

 

I'm not sure how to create an array: my variables have very different names and no consecutive numbers...

 


I added array code to Message #7 in this thread.

--
Paige Miller

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 2847 views
  • 2 likes
  • 4 in conversation