Hi guys, suppose I have the following:
data DB;
input ID ID2;
cards;
0001 .
0001 .
0001 .
0001 .
0002 .
0002 0002
0002 0002
0003 0003
0003 .
...;
Is there a way to get the following?
data DB1;
input ID ID2;
cards;
0002 .
0002 0002
0002 0002
0003 0003
0003 .
...;
In other words, if an ID is absent from column ID2, delete its rows. I tried an IF statement without success, maybe because there are other missing values in ID2. Can anyone help me please?
Accepted Solutions
Hash Tables can help here:
data DB;
input ID ID2;
cards;
0001 .
0001 .
0001 .
0001 .
0002 .
0002 0002
0002 0002
0003 0003
0003 .
;
run;
proc print;run;
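data want;
  /* load the non-missing ID2 values into a hash table keyed on ID2 */
  declare hash H(dataset:"DB(keep=ID2 where=(ID2 is not null))");
  H.defineKey('ID2');
  H.defineDone();
  do until (EOF);
    set db end=EOF;
    /* check() returns 0 when ID is found among the stored ID2 values */
    if 0 = H.check(key:ID) then output;
  end;
  stop;
run;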
proc print;run;
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
Since you are treating 0002 as numeric, simply counting will get you the answer. Did you really want them to be treated as numeric?
proc sql;
create table want as select * from DB
group by id
having count(id2)>0 ;
quit;
OR
proc summary data=db nway;
class id;
var id2;
output out=stats n=n_not_missing;
run;
data want;
merge db stats;
by id;
if n_not_missing>0;
drop _type_ _freq_ n_not_missing;
run;
If you want to preserve the order in the original data set, the SQL doesn't do that but the PROC SUMMARY solution does.
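To see why counting works: COUNT(id2) counts only the non-missing values of ID2 within each ID group, while COUNT(*) counts every row. A minimal sketch, assuming DB holds exactly the nine rows posted in the accepted solution (the column aliases n_rows and n_id2_not_missing are made up for illustration):

```sas
proc sql;
  /* n_id2_not_missing is 0 only for IDs whose ID2 is missing on every row */
  select id,
         count(*)   as n_rows,
         count(id2) as n_id2_not_missing
  from DB
  group by id;
quit;
```

With that data, ID 1 has 4 rows and 0 non-missing values, ID 2 has 3 and 2, and ID 3 has 2 and 1, so HAVING count(id2)>0 drops only ID 1.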
Paige Miller
proc sql;
create table want as select * from DB
where id in (select id from DB where ID2 is not missing) ;
quit;
Sounds like you mean that you are keeping those observations because the value of ID appears at some point in the value of ID2.
proc sql;
create table want as select a.*
from have a
where a.id in (select b.id2 from have b where b.id2 is not null)
;
quit;
But the example data you showed also supports a more restrictive condition: you keep those observations because, among the observations with that same value of ID, at least one has ID2 equal to ID.
proc sql;
create table want2 as select a.*
from have a
where a.id in (select b.id2 from have b where b.id=b.id2 and b.id2 is not null)
;
quit;
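The two conditions give the same result on the posted data, so here is a small sketch where they differ. The three rows below are hypothetical test data, not from the thread, chosen so that ID 2 appears in ID2 only on a row belonging to ID 1:

```sas
/* Hypothetical test data (not from the thread) */
data have;
  input ID ID2;
  cards;
1 2
2 .
3 3
;
run;

proc sql;
  /* ID appears somewhere in ID2: keeps the rows for ID 2 and ID 3 */
  create table want as select a.*
  from have a
  where a.id in (select b.id2 from have b where b.id2 is not null)
  ;
  /* some row has ID2 equal to its own ID: keeps the row for ID 3 only */
  create table want2 as select a.*
  from have a
  where a.id in (select b.id2 from have b where b.id=b.id2 and b.id2 is not null)
  ;
quit;
```

Note that on this data the per-group counting approach would behave differently again: it would keep ID 1 (one non-missing ID2) and drop ID 2, so it is worth deciding which rule you actually mean before picking a solution.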