product altid cc_code
abc 1234 A1
abc 1234
abc 1234 B1
abc 2345 A1
abc 2345 A1
abc 2345 B1
abc 3456
abc 3456
abc 4567 A1
abc 4567 A1
From the above dataset
How to get a dataset which has the following records
product altid cc_code
abc 1234 A1
abc 1234 B1
abc 2345 A1
abc 2345 B1
abc 3456
abc 4567 A1
Try this one:
data have;
infile cards truncover;
input (product altid cc_code) (:$10.);
cards;
abc 1234 A1
abc 1234
abc 1234 B1
abc 2345 A1
abc 2345 A1
abc 2345 B1
abc 3456
abc 3456
abc 4567 A1
abc 4567 A1
;
proc sql;
create table want as
select distinct a.product, a.altid, b.cc_code from have a
left join
(select distinct * from have where not missing (cc_code))b
on a.product=b.product and
a.altid=b.altid
order by a.product, a.altid, b.cc_code;
quit;
proc print;run;
Regards,
Haikuo
or, using proc sort and a datastep:
data have;
input product $ altid cc_code $;
cards;
abc 1234 A1
abc 1234 .
abc 1234 B1
abc 2345 A1
abc 2345 A1
abc 2345 B1
abc 3456 .
abc 3456 .
abc 4567 A1
abc 4567 A1
;
proc sort data=have out=want nodupkey;
by product altid cc_code;
run;
data want;
set want;
by product altid;
if (last.altid and missing(cc_code)) or
not missing(cc_code);
run;
small change in data structure
data have;
infile cards truncover;
input (product altid cc_code dxden rank) (:$10.);
cards;
abc 1234 A1 42390 2
abc 1234 49392
abc 1234 B1 79902 1
abc 1234 B1 79902 2
abc 2345 A1 42390 2
abc 2345 A1 42390 1
abc 2345 B1 79902 1
abc 3456 49392
abc 3456 49393
abc 4567 A1 42390 1
abc 4567 A1 42390 1
;
this is expected:
abc 1234 A1 42390 2
abc 1234 B1 79902 1
abc 2345 A1 42390 1
abc 3456 49392
abc 4567 A1 42390 1
Why isn't the following identified record included in the desired output?
abc 1234 A1 42390 2
abc 1234 49392
abc 1234 B1 79902 1
abc 1234 B1 79902 2
abc 2345 A1 42390 2
abc 2345 A1 42390 1
--------------> abc 2345 B1 79902 1
abc 3456 49392
abc 3456 49393
abc 4567 A1 42390 1
abc 4567 A1 42390 1
If that was just an oversight, using the proc sort method would still accomplish the task by adding one additional sort (but I can't test it at the moment). i.e.:
proc sort data=have out=want;
by product altid cc_code dxden rank;
run;
proc sort data=want nodupkey;
by product altid cc_code;
run;
data want;
set want;
by product altid;
if (last.altid and missing(cc_code)) or
not missing(cc_code);
run;
Agree with Art, the rule is not consistent. A tweak on my original code will do as well, you need another join though.
data have;
infile cards truncover;
input (product altid cc_code dxden rank) (:$10.);
cards;
abc 1234 A1 42390 2
abc 1234 . 49392
abc 1234 B1 79902 1
abc 1234 B1 79902 2
abc 2345 A1 42390 2
abc 2345 A1 42390 1
abc 2345 B1 79902 1
abc 3456 . 49392
abc 3456 . 49393
abc 4567 A1 42390 1
abc 4567 A1 42390 1
;
proc sql;
create table want as
select distinct a.product, a.altid, b.cc_code, coalescec(b.dxden,c.dxden) as deden, b.rank from have a
left join
(select product, altid, cc_code, dxden ,min(rank) as rank from have where not missing (cc_code)
group by product, altid ,cc_code, dxden)b
on a.product=b.product and
a.altid=b.altid
left join (select product, altid, min(dxden) as dxden from have
group by product, altid) c
on a.product=c.product and
a.altid=c.altid
order by a.product, a.altid, b.cc_code;
quit;
proc print;run;
data have;
infile cards missover;
input product$ :altid $ : cc_code$ ;
cards;
abc 1234 A1
abc 1234
abc 1234 B1
abc 2345 A1
abc 2345 A1
abc 2345 B1
abc 3456
abc 3456
abc 4567 A1
abc 4567 A1
;
data have;
set have;
by product altid;
if first.altid or last.altid;
proc sql;
create table want as select distinct * from have;
quit;
proc print;run;
obs product altid cc_code
1 abc 1234 A1
2 abc 1234 B1
3 abc 2345 A1
4 abc 2345 B1
5 abc 3456
6 abc 4567 A1
I have the same question Art described for you.
data have; infile cards truncover; input (product altid cc_code dxden rank) (:$10.); cards; abc 1234 A1 42390 2 abc 1234 . 49392 abc 1234 B1 79902 1 abc 1234 B1 79902 2 abc 2345 A1 42390 2 abc 2345 A1 42390 1 abc 2345 B1 79902 1 abc 3456 . 49392 abc 3456 . 49393 abc 4567 A1 42390 1 abc 4567 A1 42390 1 ; run; proc sort data=have ;by product altid cc_code;run;; data x; set have; by product altid cc_code; if first.cc_code and last.cc_code and missing(cc_code) then delete; run; proc sort data=x out=want nodupkey;by product altid cc_code;run;
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.