Hi ..
Need a help on this.
I have a dataset with the following attributes: id, typ(XN, IN), and sysdat.
If an ID contains both XN and IN, only XN records should be retrieved.
If the ID contains XN records, only XN records must be retrieved.
If the ID conatins IN records , only IN records must be retrieved.
Input :
ID typ sysdat
123 IN 2210128
123 IN 2210128
123 XN 2410128
124 XN 2210128
125 IN 2210128
125 IN 2210128
Ouput :
ID typ sysdat
123 IN 2210128
124 XN 2210128
125 IN 2210128
Thanks in Advance..
Your rules don't make sense. The first IF is a subset of the last IF. Perhaps you meant to include ONLY in the last one like you did for the middle one?
But also your expected results do not match the rules.
So 123 has both so only the XN observation should be kept, but your output is showing that one of the two IN observations was kept instead.
And 125 has only IN observations, but you only selected one of them. How did you decide which one to keep?
Sorry typo mistake
If an ID contains both XN and IN, only IN records should be retrieved.
@cho16 wrote:
Sorry typo mistake...If an ID contains both XN and IN, only XN records should be retrieved.
Then your output for 123 is wrong.
Please review your requirements and expected output so they make sense. In particular, define a selection rule when multiple observations meet the condition.
I'm as confused about the rules here as @Tom, so this is somewhat a shot.
data have;
input ID typ $ sysdat;
datalines;
123 IN 2210128
123 IN 2210128
123 XN 2410128
124 XN 2210128
125 IN 2210128
125 IN 2210128
;
proc sql;
create table want as
select distinct * from have
group by ID
having typ = min(typ)
;
quit;
Result:
ID typ sysdat 123 IN 2210128 124 XN 2210128 125 IN 2210128
Hi ..
Need a help on this.
I have a dataset with the following attributes: id, typ(XN, IN), and sysdat.
If an ID contains both XN and IN, only IN records should be retrieved.
If the ID contains XN records, only XN records must be retrieved.
If the ID conatins IN records , only IN records must be retrieved.
Input :
ID typ sysdat
123 IN 2210128
123 IN 2210128
123 XN 2410128
124 XN 2210128
125 IN 2210128
125 IN 2210128
Ouput :
ID typ sysdat
123 IN 2210128
124 XN 2210128
125 IN 2210128
Thanks in Advance..
I'm as confused about the rules here as @Tom in the other thread.
However, try this.
data have;
input ID typ $ sysdat;
datalines;
123 IN 2210128
123 IN 2210128
123 XN 2410128
124 XN 2210128
125 IN 2210128
125 IN 2210128
;
proc sql;
create table want as
select distinct * from have
group by ID
having typ = min(typ)
;
quit;
Result:
ID typ sysdat 123 IN 2210128 124 XN 2210128 125 IN 2210128
For each ID, interleave all the observations such that any "IN" records will precede any "XN" records. Then just keep the first individual record for each ID:
data have;
input ID typ $ sysdat;
datalines;
123 IN 2210128
123 IN 2210128
123 XN 2410128
124 XN 2210128
125 IN 2210128
125 IN 2210128
run;
data want;
set have (where=(typ='IN'))
have (where=(typ='XN'));
by id;
if first.id;
run;
This assumes dataset HAVE is sorted by ID. But it doesn't matter what the order is within each ID.
I have merged your posts. Please don't double-post question.
data have; input ID typ $ sysdat; datalines; 123 IN 2210128 123 IN 2210128 123 XN 2410128 124 XN 2210128 125 IN 2210128 125 IN 2210128 ; proc sql; create table want as select distinct *,case when count(distinct typ)>1 and typ='IN' then 1 when count(distinct typ)=1 then 1 else 0 end as flag from have group by id having calculated flag=1 ; quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.