Solved: Re: Conditionally remove missing values

NewUsrStat · Posted 06-03-2024 11:15 AM

Hi guys, suppose I have the following:

data DB;
  input ID ID2;
cards;
0001   .
0001   .
0001   .
0001   .
0002   .
0002  0002
0002  0002
0003  0003
0003   .
...;

Is there a way to get the following?

data DB1;
  input ID ID2;
cards;
0002   .
0002  0002
0002  0002
0003  0003
0003   .
...;

In other words, if an ID is absent from column ID2, then delete its rows. I tried an IF statement without success, maybe because there are other missing values in ID2. Can anyone help me, please?

yabwon · Posted 06-03-2024 11:22 AM

Hash Tables can help here:

data DB;
  input ID ID2;
cards;
0001   .
0001   .
0001   .
0001   .
0002   .
0002  0002
0002  0002
0003  0003
0003   .
;
run;
proc print;run;

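data want;
  /* Load every non-missing ID2 value into a hash table as a lookup key */
  declare hash H(dataset:"DB(keep=ID2 where=(ID2 is not null))");
  H.defineKey('ID2');
  H.defineDone();

  /* Read DB once; keep a row when its ID is found among the ID2 keys */
  do until (EOF);
    set DB end=EOF;
    if 0 = H.check(key:ID) then output;
  end;

  stop;
run;
proc print;run;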

Bart


PaigeMiller · Posted 06-03-2024 11:23 AM

Since you are treating 0002 as numeric, simply counting will get you the answer. Did you really want them to be treated as numeric?

proc sql;
    create table want as select * from DB
    group by id
    having count(id2)>0 ;
quit;

OR

proc summary data=db nway;
    class id;
    var id2;
    output out=stats n=n_not_missing;
run;
data want;
    merge db stats;
    by id;
    if n_not_missing>0;
    drop _type_ _freq_ n_not_missing;
run;

If you want to preserve the order in the original data set, the SQL doesn't do that but the PROC SUMMARY solution does.
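If the original row order matters and DB is already sorted by ID (as in the sample data), a single DATA step can do the filtering as well. What follows is only a sketch of a double DoW loop, which is not one of the methods posted in this thread; the flag name keep_group is made up for illustration.

```sas
data want;
  /* First pass over one ID group: note whether any ID2 is non-missing. */
  /* keep_group is a hypothetical helper variable; it is reset to       */
  /* missing at the start of every DATA step iteration.                 */
  do until (last.id);
    set DB;
    by id;
    if not missing(id2) then keep_group = 1;
  end;

  /* Second pass over the same ID group: write its rows only if flagged */
  do until (last.id);
    set DB;
    by id;
    if keep_group then output;
  end;

  drop keep_group;
run;
```

Because each group is read twice in its stored order, the kept rows come out in the same order as in DB. The BY statement assumes DB is sorted by ID; if it is not, sort it first.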

--
Paige Miller

Ksharp · Posted 06-03-2024 10:07 PM

proc sql;
create table want as select * from DB
where id in (select id from DB where ID2 is not missing) ;
quit;

Tom · Posted 06-04-2024 12:25 PM

Sounds like you mean that you are keeping those observations because the value of ID appears somewhere in the ID2 column.

proc sql;
create table want as select a.*
from DB a
where a.id in (select b.id2 from DB b where b.id2 is not null)
;
quit;

But the example data you showed also supports a more restrictive condition: you keep those observations because, among the observations with that same value of ID, there is at least one where ID2 matches ID.

proc sql;
create table want2 as select a.*
from DB a
where a.id in (select b.id2 from DB b where b.id=b.id2 and b.id2 is not null)
;
quit;
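
The sample data in the question gives the same result under every reading discussed in this thread, so it cannot tell them apart. As a rough sketch, here is a small made-up data set (the name demo and its values are hypothetical, not from the original post) where the readings disagree: ID 4 has a non-missing ID2 that points to a different ID, and ID 5 has only missing ID2 values.

```sas
/* Hypothetical data, not from the original post */
data demo;
  input ID ID2;
cards;
2 .
2 2
4 5
5 .
;
run;

proc sql;
  /* Reading A: the ID group has at least one non-missing ID2.  */
  /* Keeps the rows for IDs 2 and 4.                            */
  create table want_a as
    select * from demo
    where id in (select id from demo where id2 is not missing);

  /* Reading B: the ID value occurs somewhere in column ID2.    */
  /* Keeps the rows for IDs 2 and 5.                            */
  create table want_b as
    select * from demo
    where id in (select id2 from demo where id2 is not missing);

  /* Reading C: some row of the ID group has ID2 equal to ID.   */
  /* Keeps the rows for ID 2 only.                              */
  create table want_c as
    select * from demo
    where id in (select id2 from demo where id = id2);
quit;
```

Running a check like this against a few rows of the real data should show which reading matches the intended rule before picking one of the solutions.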
