Hi everyone!
I'm trying to solve the problem below...
proc sort data = HAVE out = WANT nodupkey;
by CLASSIFICATION;
run;
| CUSTOMER | CLASSIFICATION |
|---|---|
| CUSTOMER A | MEDIUM |
| CUSTOMER A | LOW |
| CUSTOMER B | HIGH |
| CUSTOMER B | LOW |
| CUSTOMER C | MEDIUM |
Thanks and regards from Brazil!
You need encode these classification firstly.
data have;
input c $ v $;
cards;
a medium
a low
b high
b low
c medium
;
run;
data have;
set have;
select (v);
when ("high") r=1;
when ("medium") r=2;
when ("low") r=3;
otherwise;
end;
run;
proc sort data=have;by c r;run;
/* one way */
proc sort data=have out=want1 nodupkey;by c;run;
/*another way*/
data want2;
set have;
by c;
if first.c;
run;
Xia Keshan
Use a small auxiliary table containing the required classification order:
data have;
length customer $12 classification $8;
input customer & classification;
datalines;
CUSTOMER A MEDIUM
CUSTOMER A LOW
CUSTOMER B HIGH
CUSTOMER B LOW
CUSTOMER C MEDIUM
;
data order;
length classification $8;
do classification = "LOW", "MEDIUM", "HIGH"; order+1; output; end;
run;
proc sql;
create table want as
select unique h.*
from have as h inner join order as o on h.classification=o.classification
group by customer
having order = max(order);
select * from want;
quit;
PG
Alternatively, you could sort by a sql-expression and keep only the last record :
proc sql;
create view have2 as
select * from have
order by customer,
case classification
when "LOW" then 1
when "MEDIUM" then 2
else 3
end;
quit;
data want;
set have2; by customer;
if last.customer;
run;
PG
PG,
You can do it in a whole SQL statement.
data have; length customer $12 classification $8; input customer & classification; datalines; CUSTOMER A MEDIUM CUSTOMER A LOW CUSTOMER B HIGH CUSTOMER B LOW CUSTOMER C MEDIUM ; proc sql; create table want as select unique h.*, case when classification="LOW" then 1 when classification="MEDIUM" then 2 else 3 end as r from have as h group by customer having r = max(r); quit;
Xia Keshan
But for the utmost efficiency, assuming dataset have is already sorted by customer:
data want;
array c{3} $ _temporary_ ("LOW", "MEDIUM", "HIGH");
do until(last.customer);
set have; by customer;
i = max(i, whichc(classification, of c{*}));
end;
classification = c{max(i,1)};
drop i;
run;
PG
PG,
I would not use your code, if there are some other variables out there, and it usually does.
Xia Keshan
I would like to thank you guys, the solutions worked fine and I could achive the goal keeping the strongest one.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.