Solved: Re: how to simply my steps

xiangpang · Posted 03-13-2018 03:49 PM

data eq;
input ID y dm z w;
cards;
1 1 . 40 8
1 0 . 29 37
1 1 30 . 25
1 1 38 38 23
2 1 30 45 19
2 0 30 20 .
2 1 67 . .
2 1 . 27 .
3 0 33 23 46
3 1 . 12 56
3 0 . . 34
3 1 . 45 . 
4 1 56 45 23
4 0 67 13 67  
4 0 . 35 13
4 1 48 35 56 
;
run;

data eq1;
input ID y dm z w;
cards;
1 1 30 . 25
3 1 . 45 . 
4 1 48 25 56 
;
run;

data c;
set eq;
if dm=. then missindex=0;
else missindex=1;
if first.id then sumindex = missindex;
else sumindex+missindex;
by id;
run;

data d;
set c;
by id;
if first.id and missindex=0 then output;
if last.id and sumindex=1 then output;
run;

proc sql;
    create table want as
  select * from eq where id not in (select distinct id  from d) ;
  quit;

proc sql;
create table eq4 as
  select * from want where id in (select distinct id  from eq1)
quit;

proc sql;
create table eq5 as
  select * from eq1 where id in (select distinct id  from eq4)
quit;

data eq6;
set eq5 eq4;
run;

What I want to do is delete the ID with first DM missing and delete the ID with all the rest DM missing except first DM in eq. Then, select the common ID to merge with eq1. So the result will be only ID 4 with 5 observations.

My code works. But it seems redundant. Could anyone help me to simplify it into one or two steps by sql and data step? Thanks

novinosrin · Posted 03-13-2018 05:39 PM

data eq;
input ID y dm z w;
cards;
1 1 . 40 8
1 0 . 29 37
1 1 30 . 25
1 1 38 38 23
2 1 30 45 19
2 0 30 20 .
2 1 67 . .
2 1 . 27 .
3 0 33 23 46
3 1 . 12 56
3 0 . . 34
3 1 . 45 . 
4 1 56 45 23
4 0 67 13 67  
4 0 . 35 13
4 1 48 35 56 
;
run;

data eq1;
input ID y dm z w;
cards;
1 1 30 . 25
3 1 . 45 . 
4 1 48 25 56 
;
run;

data want;
if _N_ = 1 then do;
   declare hash h(dataset:'eq1');
   h.defineKey('id');
   h.defineData(all:'y');
   h.defineDone();
end;
_flag=0;__flag=0;
do until(last.id);
set eq;
by id;
if first.id and missing(dm) then _flag=1;
else if first.id and not missing(dm) then __flag=1;
else if  __flag and  missing(dm) then __flag=1;else __flag=0;
end;
do until(last.id);
set eq end=l;
by id;
if not(_flag or __flag) then do;
rc=h.check();
if rc=0 then output;
if rc=0 and last.id then do; h.find();output;end;
end;
end;
drop rc _:;
run;

View solution in original post

ballardw · Posted 03-13-2018 04:11 PM

I don't know if this much improvement though it involves fewer intermediate data sets. The order of the result is also different if that matters.

data work.c;
   set eq;
   by id;
   if first.id and dm ne .;
run;
data work.d;
   set eq1;
   by id;
   if first.id and dm ne .;
run;

proc sql;
   create table want as
   select b.*
   from (select Id from (select id from work.c intersect 
                         select id from work.d ) 
        )
        as a
        left join
        ( select * from eq union all
          select * from eq1) as b
      on a.id = b.id;
quit;

xiangpang · Posted 03-13-2018 04:21 PM

Thanks for your reply. I think the sql part is better than mine. But it still has room for more simplification.

Thanks again.

ballardw · Posted 03-13-2018 04:29 PM

@xiangpang wrote:

Thanks for your reply. I think the sql part is better than mine. But it still has room for more simplification.

Thanks again.

Perhaps but when you have a requirement for order of appearance (first of ID in two separate data sources and the value of a second variable) I suspect you will find it takes awhile to get much simpler code.

xiangpang · Posted 03-13-2018 05:38 PM

Yeah, it is possible. I am just a beginner I am happy to learn and thanks for your help.

novinosrin · Posted 03-13-2018 05:39 PM

data eq;
input ID y dm z w;
cards;
1 1 . 40 8
1 0 . 29 37
1 1 30 . 25
1 1 38 38 23
2 1 30 45 19
2 0 30 20 .
2 1 67 . .
2 1 . 27 .
3 0 33 23 46
3 1 . 12 56
3 0 . . 34
3 1 . 45 . 
4 1 56 45 23
4 0 67 13 67  
4 0 . 35 13
4 1 48 35 56 
;
run;

data eq1;
input ID y dm z w;
cards;
1 1 30 . 25
3 1 . 45 . 
4 1 48 25 56 
;
run;

data want;
if _N_ = 1 then do;
   declare hash h(dataset:'eq1');
   h.defineKey('id');
   h.defineData(all:'y');
   h.defineDone();
end;
_flag=0;__flag=0;
do until(last.id);
set eq;
by id;
if first.id and missing(dm) then _flag=1;
else if first.id and not missing(dm) then __flag=1;
else if  __flag and  missing(dm) then __flag=1;else __flag=0;
end;
do until(last.id);
set eq end=l;
by id;
if not(_flag or __flag) then do;
rc=h.check();
if rc=0 then output;
if rc=0 and last.id then do; h.find();output;end;
end;
end;
drop rc _:;
run;

xiangpang · Posted 03-13-2018 05:58 PM

Thanks for your reply. I have to admit I did not write the codes in this way before. Could you provide a link about the use of hash and h. ()? I would like to learn about it.

novinosrin · Posted 03-13-2018 06:12 PM

@xiangpang There is not one link. Google, ask.com, tutorials, sugi articles, books,, learning from here, University publishing like ucla, cambrdige, NC state and many more and counting. Have fun!

xiangpang · Posted 03-13-2018 06:57 PM

Thanks

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away