ybz12003
Rhodochrosite | Level 12

Hello,

I would like to create an error-check table using the logic described below. Please use PROC SQL and help me complete the WHERE clause. Thanks.

If a treatment (meds) = 1 (Yes), I'm looking for any subgroups (steroids/vaso/immunemods/monoclonal/antiviral/othertx) that are either missing (.) or No (0).

 

proc sql;
create table want as
select meds, steroids, vaso, immunemods, monoclonal, antiviral, othertx
from have
where meds=1 and (???)
order by site;

 

1 ACCEPTED SOLUTION

Tom
Super User

So your variables are Boolean (1=TRUE and 0=FALSE) with some missing values?

If so, your test is the condition that it is NOT true that ALL of them are TRUE. So test whether the SUM() of them is not equal to the number of variables. (Note that you cannot just test whether the MIN() is equal to 1, since MIN() ignores missing values.)

where meds and (6 ne sum(steroids, vaso, immunemods, monoclonal, antiviral, othertx))

But perhaps you just didn't describe what you want clearly? It would make more sense to me to look for the observations where MEDS is TRUE but all of the other variables are FALSE, as that seems to indicate an inconsistency. MAX() will be TRUE if ANY of them is TRUE, so in that case the problem records are those with:

where meds and not max(steroids, vaso, immunemods, monoclonal, antiviral, othertx)
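For anyone who wants to try both conditions side by side, here is a minimal sketch. The rows in HAVE are made up for illustration, and the table names WANT_ANY and WANT_ALL are hypothetical. It assumes every variable only ever holds 1, 0, or missing; the SUM() test would need rethinking if other codes can appear.

```sas
/* Made-up test data: one row per scenario (assumption, not the poster's data) */
data have;
    input site meds steroids vaso immunemods monoclonal antiviral othertx;
    datalines;
1 1 1 1 1 1 1 1
2 1 1 0 1 1 1 1
3 1 1 . 1 1 1 1
4 1 0 0 0 0 0 0
5 1 . . . . . .
6 0 0 0 0 0 0 0
;
run;

proc sql;
    /* "ANY" reading: meds is Yes and at least one subgroup is 0 or missing. */
    /* With more than one argument, SUM() here is the row-wise SAS function. */
    create table want_any as
    select site, meds, steroids, vaso, immunemods, monoclonal, antiviral, othertx
    from have
    where meds and (6 ne sum(steroids, vaso, immunemods, monoclonal, antiviral, othertx))
    order by site;

    /* "ALL" reading: meds is Yes and no subgroup is 1 (every one is 0 or missing). */
    create table want_all as
    select site, meds, steroids, vaso, immunemods, monoclonal, antiviral, othertx
    from have
    where meds and not max(steroids, vaso, immunemods, monoclonal, antiviral, othertx)
    order by site;
quit;
```

With these rows, WANT_ANY should keep sites 2 through 5 and WANT_ALL should keep sites 4 and 5. Site 5 qualifies in both cases because SUM() and MAX() return missing when every argument is missing, and SAS treats missing as FALSE.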

 


5 REPLIES
Reeza
Super User
What should NMISS, N, SUM be for this set of variables if your condition is true?

PaigeMiller
Diamond | Level 26

I don't think I grasp the problem. When you say

 

I'm look for any subgroups (steroids/vaso/immunemods/monoclonal/ antiviral/othertx) are either missing (.) or No (0).

 

What do you mean by subgroup?

 

Your title says you want ALL subgroup treatments, but your text says you want ANY subgroups. Which is it?

 

Can you show us (or make up) a small amount of data, along with the desired output?

 

--
Paige Miller
ballardw
Super User

Just a style note. This appears to be an extremely simple query, BUT you may find it easier to use a data step in the long run for things like this, especially if you end up needing to use the same list of variables multiple times.

The data step would let you use either an array or possibly a double-dash variable list, if all of your steroids to othertx variables are adjacent in the data set. SQL won't allow either shortcut.

Consider this, which also answers @Reeza's question about the N, NMISS and SUM of the variables.

data want;
    set have;
    array v(*) steroids vaso immunemods monoclonal antiviral othertx;
    listn     = n(of v(*));
    listnmiss = nmiss(of v(*));
    listsum   = sum(of v(*));
    /* maybe coding your statement relatively directly: keep rows where a subgroup is 0 or missing */
    if meds=1 and (min(of v(*))=0 or listnmiss>0);
    /* or alternatively, since the sum is less than 6 if at least one of the variables is 0 or missing:
    if meds=1 and sum(of v(*)) < 6;
    */
run;

With SQL you have to explicitly type the names of all of the variables into the function parameters.

 

If the data set is "large", the data step will likely run quicker as well.
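The double-dash list mentioned above could look something like this. It is only a sketch: the rows in HAVE are made up, and it assumes the six subgroup variables sit next to each other in the data set, from steroids through othertx, with nothing else in between.

```sas
/* Made-up data; the subgroup variables are created adjacent, in this order (assumption) */
data have;
    input site meds steroids vaso immunemods monoclonal antiviral othertx;
    datalines;
1 1 1 1 1 1 1 1
2 1 1 0 1 1 1 1
3 1 1 . 1 1 1 1
4 1 0 0 0 0 0 0
5 1 . . . . . .
6 0 0 0 0 0 0 0
;
run;

data want;
    set have;
    /* steroids--othertx means every variable positioned from steroids through othertx */
    /* Subsetting IF: keep rows where meds is Yes and some subgroup is missing or 0 */
    if meds = 1 and (nmiss(of steroids--othertx) > 0 or min(of steroids--othertx) = 0);
run;
```

With these rows, WANT should keep sites 2 through 5. The catch is that a double-dash list depends on variable order, so a variable later added between steroids and othertx would silently join the check; the array version names each variable explicitly.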

ybz12003
Rhodochrosite | Level 12

Thank you so much for your help.  I got it.
