data eq;
input ID y dm z w;
cards;
1 1 . 40 8
1 0 . 29 37
1 1 30 . 25
1 1 38 38 23
2 1 30 45 19
2 0 30 20 .
2 1 67 . .
2 1 . 27 .
3 0 33 23 46
3 1 . 12 56
3 0 . . 34
3 1 . 45 .
4 1 56 45 23
4 0 67 13 67
4 0 . 35 13
4 1 48 35 56
;
run;
data eq1;
input ID y dm z w;
cards;
1 1 30 . 25
3 1 . 45 .
4 1 48 25 56
;
run;
data c;
set eq;
if dm=. then missindex=0;
else missindex=1;
if first.id then sumindex = missindex;
else sumindex+missindex;
by id;
run;
data d;
set c;
by id;
if first.id and missindex=0 then output;
if last.id and sumindex=1 then output;
run;
proc sql;
create table want as
select * from eq where id not in (select distinct id from d) ;
quit;
proc sql;
create table eq4 as
select * from want where id in (select distinct id from eq1)
quit;
proc sql;
create table eq5 as
select * from eq1 where id in (select distinct id from eq4)
quit;
data eq6;
set eq5 eq4;
run;
What I want to do is delete the ID with first DM missing and delete the ID with all the rest DM missing except first DM in eq. Then, select the common ID to merge with eq1. So the result will be only ID 4 with 5 observations.
My code works. But it seems redundant. Could anyone help me to simplify it into one or two steps by sql and data step? Thanks
data eq;
input ID y dm z w;
cards;
1 1 . 40 8
1 0 . 29 37
1 1 30 . 25
1 1 38 38 23
2 1 30 45 19
2 0 30 20 .
2 1 67 . .
2 1 . 27 .
3 0 33 23 46
3 1 . 12 56
3 0 . . 34
3 1 . 45 .
4 1 56 45 23
4 0 67 13 67
4 0 . 35 13
4 1 48 35 56
;
run;
data eq1;
input ID y dm z w;
cards;
1 1 30 . 25
3 1 . 45 .
4 1 48 25 56
;
run;
data want;
if _N_ = 1 then do;
declare hash h(dataset:'eq1');
h.defineKey('id');
h.defineData(all:'y');
h.defineDone();
end;
_flag=0;__flag=0;
do until(last.id);
set eq;
by id;
if first.id and missing(dm) then _flag=1;
else if first.id and not missing(dm) then __flag=1;
else if __flag and missing(dm) then __flag=1;else __flag=0;
end;
do until(last.id);
set eq end=l;
by id;
if not(_flag or __flag) then do;
rc=h.check();
if rc=0 then output;
if rc=0 and last.id then do; h.find();output;end;
end;
end;
drop rc _:;
run;
I don't know if this much improvement though it involves fewer intermediate data sets. The order of the result is also different if that matters.
data work.c; set eq; by id; if first.id and dm ne .; run; data work.d; set eq1; by id; if first.id and dm ne .; run; proc sql; create table want as select b.* from (select Id from (select id from work.c intersect select id from work.d ) ) as a left join ( select * from eq union all select * from eq1) as b on a.id = b.id; quit;
Thanks for your reply. I think the sql part is better than mine. But it still has room for more simplification.
Thanks again.
@xiangpang wrote:
Thanks for your reply. I think the sql part is better than mine. But it still has room for more simplification.
Thanks again.
Perhaps but when you have a requirement for order of appearance (first of ID in two separate data sources and the value of a second variable) I suspect you will find it takes awhile to get much simpler code.
Yeah, it is possible. I am just a beginner I am happy to learn and thanks for your help.
data eq;
input ID y dm z w;
cards;
1 1 . 40 8
1 0 . 29 37
1 1 30 . 25
1 1 38 38 23
2 1 30 45 19
2 0 30 20 .
2 1 67 . .
2 1 . 27 .
3 0 33 23 46
3 1 . 12 56
3 0 . . 34
3 1 . 45 .
4 1 56 45 23
4 0 67 13 67
4 0 . 35 13
4 1 48 35 56
;
run;
data eq1;
input ID y dm z w;
cards;
1 1 30 . 25
3 1 . 45 .
4 1 48 25 56
;
run;
data want;
if _N_ = 1 then do;
declare hash h(dataset:'eq1');
h.defineKey('id');
h.defineData(all:'y');
h.defineDone();
end;
_flag=0;__flag=0;
do until(last.id);
set eq;
by id;
if first.id and missing(dm) then _flag=1;
else if first.id and not missing(dm) then __flag=1;
else if __flag and missing(dm) then __flag=1;else __flag=0;
end;
do until(last.id);
set eq end=l;
by id;
if not(_flag or __flag) then do;
rc=h.check();
if rc=0 then output;
if rc=0 and last.id then do; h.find();output;end;
end;
end;
drop rc _:;
run;
Thanks for your reply. I have to admit I did not write the codes in this way before. Could you provide a link about the use of hash and h. ()? I would like to learn about it.
@xiangpang There is not one link. Google, ask.com, tutorials, sugi articles, books,, learning from here, University publishing like ucla, cambrdige, NC state and many more and counting. Have fun!
Thanks
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.