BookmarkSubscribeRSS Feed
cho16
Obsidian | Level 7

Hi ..

 

Need a help on this.

I have a dataset with the following attributes: id, typ(XN, IN), and sysdat.
If an ID contains both XN and IN, only XN records should be retrieved.
If the ID contains  XN records, only XN records must be retrieved.
If the ID conatins  IN records , only IN records must be retrieved.

 

Input :

ID    typ    sysdat

123   IN    2210128

123   IN    2210128

123  XN   2410128

124   XN  2210128

125   IN  2210128

125  IN 2210128

 

Ouput :

ID    typ    sysdat

123   IN    2210128

124   XN  2210128

125   IN  2210128

 

Thanks in Advance..

 

 

10 REPLIES 10
Tom
Super User Tom
Super User

Your rules don't make sense. The first IF is a subset of the last IF.  Perhaps you meant to include ONLY in the last one like you did for the middle one?

 

But also your expected results do not match the rules.

So 123 has both so only the XN observation should be kept, but your output is showing that one of the two IN observations was kept instead.

And 125 has only IN observations, but you only selected one of them.  How did you decide which one to keep?

cho16
Obsidian | Level 7

Sorry typo mistake

If an ID contains both XN and IN, only IN records should be retrieved.

cho16
Obsidian | Level 7
Sorry typo mistake...If an ID contains both XN and IN, only XN records should be retrieved.
Kurt_Bremser
Super User

@cho16 wrote:
Sorry typo mistake...If an ID contains both XN and IN, only XN records should be retrieved.

Then your output for 123 is wrong.

Please review your requirements and expected output so they make sense. In particular, define a selection rule when multiple observations meet the condition.

PeterClemmensen
Tourmaline | Level 20

I'm as confused about the rules here as @Tom, so this is somewhat a shot.

 

data have;
input ID typ $ sysdat;
datalines;
123 IN 2210128 
123 IN 2210128 
123 XN 2410128 
124 XN 2210128 
125 IN 2210128 
125 IN 2210128 
;

proc sql;
   create table want as
   select distinct * from have 
   group by ID
   having typ = min(typ)
   ;
quit;

 

Result:

 

ID   typ  sysdat
123  IN   2210128  
124  XN   2210128 
125  IN   2210128 

 

cho16
Obsidian | Level 7

Hi ..

 

Need a help on this.

I have a dataset with the following attributes: id, typ(XN, IN), and sysdat.
If an ID contains both XN and IN, only IN records should be retrieved.
If the ID contains  XN records, only XN records must be retrieved.
If the ID conatins  IN records , only IN records must be retrieved.

 

Input :

ID    typ    sysdat

123   IN    2210128

123   IN    2210128

123  XN   2410128

124   XN  2210128

125   IN  2210128

125  IN 2210128

 

Ouput :

ID    typ    sysdat

123   IN    2210128

124   XN  2210128

125   IN  2210128

 

Thanks in Advance..

 

PeterClemmensen
Tourmaline | Level 20

I'm as confused about the rules here as @Tom in the other thread. 

 

However, try this.

 

data have;
input ID typ $ sysdat;
datalines;
123 IN 2210128 
123 IN 2210128 
123 XN 2410128 
124 XN 2210128 
125 IN 2210128 
125 IN 2210128 
;

proc sql;
   create table want as
   select distinct * from have 
   group by ID
   having typ = min(typ)
   ;
quit;

 

Result:

 

ID   typ  sysdat
123  IN   2210128  
124  XN   2210128 
125  IN   2210128 

 

mkeintz
PROC Star

For each ID, interleave all the observations such that any "IN" records will precede any "XN" records.  Then just keep the first individual record for each ID:

 

data have;
input ID typ $ sysdat;
datalines;
123 IN 2210128 
123 IN 2210128 
123 XN 2410128 
124 XN 2210128 
125 IN 2210128 
125 IN 2210128 
run;
data want;
  set have (where=(typ='IN'))
      have (where=(typ='XN'));
  by id;
  if first.id;
run;

This assumes dataset HAVE is sorted by ID.  But it doesn't matter what the order is within each ID.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
andreas_lds
Jade | Level 19

I have merged your posts. Please don't double-post question.

Ksharp
Super User
data have;
input ID typ $ sysdat;
datalines;
123 IN 2210128 
123 IN 2210128 
123 XN 2410128 
124 XN 2210128 
125 IN 2210128 
125 IN 2210128 
;

proc sql;
create table want as
select distinct *,case when count(distinct typ)>1 and typ='IN' then 1
when count(distinct typ)=1 then 1 else 0 end as flag
 from have
  group by id
   having calculated flag=1
;
quit;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 980 views
  • 1 like
  • 7 in conversation