DATA Step, Macro, Functions and more

Data subsetting.

Reply
Occasional Contributor
Posts: 5

Data subsetting.

Data Set name is PLN200

SSN              PLAN_ID    CV_CAT

001                9000             21
001                9020            847 

 

My code is not giving me expected result:

DATA PLN200;                                      
SET PLN200;                                       
BY SSN;                                           
IF PLAN_ID = 9000 AND PLAN_ID = 9020 THEN DO;     
IF CV_CAT IN(26,55,65,817,827,847) THEN DELETE;   
END; RUN;                                         

 

I am expecting  only "001                9000             21 "   in result.

 

Please suggest.

Super User
Posts: 6,629

Re: Data subsetting.

Posted in reply to msharma1788

It is not possible for PLAN_ID to be equal to both 9000 and 9020.

 

Use OR instead of AND, or better yet:

 

IF PLAN_ID in (9000, 9020) THEN DO;    

 

 

Occasional Contributor
Posts: 5

Re: Data subsetting.

Posted in reply to Astounding

I need the data to be deleted only when the SSN is in both the plans 9000 and 9020 , If its in both the plans then it should check the CV_CAT ids mentioned. and should delete that row.

Super User
Posts: 23,262

Re: Data subsetting.

Posted in reply to msharma1788

So you need to check multiple rows first to see if that SSN has both 9000 and 9020, does that sound correct?

 

Since all the data is not in one row your logic needs to be significantly different.

 

Super User
Posts: 6,629

Re: Data subsetting.

Posted in reply to msharma1788

A DATA step is limited to processing one observation at a time.  Here's a way to program around that for your purposes:

 

data want;

merge

pln200 (where=(plan_id=9000 in=in_9000))

pln200 (where=(plan_id=9020 n=in_9020))

pln200;

by ssn;

if in_9000 and in_9020 and CV_CAT IN(26,55,65,817,827,847) THEN DELETE;

run;

 

The order of the data sets in the MERGE statement is important.  The PLN200 with no WHERE condition must be last.

Respected Advisor
Posts: 3,167

Re: Data subsetting.

Posted in reply to msharma1788

Simply,

IF PLAN_ID = 9000 OR PLAN_ID = 9020 THEN DO;   

 

You also can check the condition at the same time:

DATA PLN200;                                      
SET PLN200;                                       
BY SSN;                                           
IF not ( PLAN_ID in ( 9000, 9020) and   
CV_CAT IN(26,55,65,817,827,847)) ;   
RUN;                

 

Valued Guide
Posts: 559

Re: Data subsetting.

Posted in reply to msharma1788

Hi,

 

Check this one using CALL SYMPUT

 

DATA HAVE;
INFILE DATALINES dlm=" ";
INPUT SSN PLAN_ID CV_CAT;
datalines;
001 9000 21
001 9020 847
;
run;
proc sort data=HAVE;
by SSN PLAN_ID;
run;

DATA WANT;
SET HAVE;
BY SSN PLAN_ID;
IF FIRST.SSN THEN CALL SYMPUT("Hold"," ");
IF PLAN_ID=9000 THEN CALL SYMPUT("Hold","YES");
IF PLAN_ID=9020 and SYMGET("Hold")="YES" and CV_CAT IN(26,55,65,817,827,847) THEN DELETE;
RUN;
Thanks,
Suryakiran
Ask a Question
Discussion stats
  • 6 replies
  • 87 views
  • 0 likes
  • 5 in conversation