Keep only certain rows in a variable

Reply
New Contributor
Posts: 3

Keep only certain rows in a variable

I have a dataset of several hospitals. There are several variables. One of the variables is MEASURE ID. I want to keep only those rows that have MEASURE ID= PSI_90 and drop the rest of the rows. How can I do that? I am a beginner so please excuse me if the question is too basic. Thanks in advance.

 

SU

Trusted Advisor
Posts: 1,584

Re: Keep only certain rows in a variable

Use next code:

data want;
    set have;
        where MEASURE ID= PSI_90;
run;
New Contributor
Posts: 3

Re: Keep only certain rows in a variable

Hi Schmuel:

 

Thanks for your reply. I tried that but all I get is a new dataset which is empty. Has 0 observations. Could you please advise. My original dataset looks like the one in the screen shot attached.

 

SU


VirtualBox_windows10_14_03_2017_00_31_51.png
Super User
Posts: 11,343

Re: Keep only certain rows in a variable

Since you did not provide any data in the original post @Shmuel had to guess that you meant when one variable was equal to another not a text value.

 

data want;
    set have;
    where MEASURE ID= "PSI_90";
run;

might work, but your posted picture shows a value of PSI_90_SAFETY. If that is the value you are looking for then that is the string to place between the quotes.

 

 

Spelling counts in programming.

Trusted Advisor
Posts: 1,584

Re: Keep only certain rows in a variable

I see your issue:

You are not looking for 

MEASURE ID= PSI_90;

as there is no variabled named PSI_90.

NOR for

MEASURE ID= "PSI_90";

as yvalues are in a format of:  PSI_nn_xxxxx

 

So the right code should be:

 

data want;
    set have;
    where upcase(substr(MEASURE ID,1,6)) = "PSI_90";
run;

 

New Contributor
Posts: 3

Re: Keep only certain rows in a variable

I am looking for scores for only those rows in which Measure ID= PSI_90_SAFETY. So, I want the first two variables in my resulting dataset to look like: ProviderID, PSI_90_Score. The second variable PSI_90_Score is renamed from the variable Score. And I am looking for scores only for PSI_90_SAFETY and drop everything else. Please help!!

Trusted Advisor
Posts: 1,584

Re: Keep only certain rows in a variable

Do you use any documentation of sas ?

 

You can compare any character variable - either to another variable like:

     if var1 = var2   |    if var1 > var2   etc.

or you can compare it to a literal, exact equation like:

    if var = "any string"     |  if var = 'other string'   etc.

or you can compare to an expression as I have done:    

where upcase(substr(MEASURE ID,1,6)) = "PSI_90";

 the difference between IF and WHERE is the timing of compare.

 WHERE is executed towards the input buffer while the IF is executed after bringing the record to work area.

I have used two functtions:

1) the inner SUBSTRING with 3 arguments: variable name (string to check), start position, end position 

2) the outer UPCASE in order to be sure that I comapr to upper case as in "PSI..." literal.

 

With this information I expect that you can solve yourself your last question - just change the literal you want to compare to.

PROC Star
Posts: 7,492

Re: Keep only certain rows in a variable

Since your example showed PSI_90 as being part of the value PSI_90 safety, I'd think you're looking for something like:

 

data want;
  set have (where=(measure_id contains 'PSI_90'));
run;

Art, CEO, AnalystFinder.com

 

Ask a Question
Discussion stats
  • 7 replies
  • 464 views
  • 1 like
  • 4 in conversation