Hi
I have following data
Customer group money
1 Group1 100
1 Group2 200
2 Group2 100
2 Group3 250
2 Group4 275
I need to create two new variables
the first one looking at group variable taking the first 1 group once sorted applied to records by Customer number
Customer group money new variable 1
1 Group1 100 Group1
1 Group2 200 Group1
2 Group2 100 Group2
2 Group3 250 Group2
2 Group4 275 Group2
The second one is again to look at the groups but take the group based on the last money field once sorted
Customer group money new variable 1 New Varible2
1 Group1 100 Group1 Group 2
1 Group2 200 Group1 Group 2
2 Group2 100 Group2 Group4
2 Group3 250 Group2 Group4
2 Group4 275 Group2 Group4
I have tried using Proc Sort then using first function but I am not sure how it works?
Any help would be greatly received.
Thanks
Simon
Try this in SQL;
proc sql;
create table want as
select
a.customer, a.group, a.money,
min(a.group) as new_var_1,
(select group from have b where b.customer = a.customer group by b.customer having money = max(money)) as new_var_2
from have a
group by customer;
quit;
For a data step/sort solution, you would need two sorts:
proc sort data=have;
by customer group;
run;
data int1 (keep=customer new_var_1);
set have;
by customer;
if first.customer;
new_var_1 = group;
run;
proc sort data=have;
by customer money;
run;
data int2 (keep=customer new_var_2);
set have;
by customer;
if last.customer;
new_var_2 = group;
run;
data want;
merge
have
int1
int2
;
by customer;
run;
Testing will reveal which of the solutions is better, performancewise, for larger datasets.
Try this in SQL;
proc sql;
create table want as
select
a.customer, a.group, a.money,
min(a.group) as new_var_1,
(select group from have b where b.customer = a.customer group by b.customer having money = max(money)) as new_var_2
from have a
group by customer;
quit;
For a data step/sort solution, you would need two sorts:
proc sort data=have;
by customer group;
run;
data int1 (keep=customer new_var_1);
set have;
by customer;
if first.customer;
new_var_1 = group;
run;
proc sort data=have;
by customer money;
run;
data int2 (keep=customer new_var_2);
set have;
by customer;
if last.customer;
new_var_2 = group;
run;
data want;
merge
have
int1
int2
;
by customer;
run;
Testing will reveal which of the solutions is better, performancewise, for larger datasets.
Thanks I found this one did the job thanks for your help
Hi,
If the data is sorted in the order presented then you could try:
data have_last;
set have;
by customer;
if last.customer;
run;
data want;
merge have
have_last (rename = group = new_var_2);
by customer;
retain new_var_1;
if first.customer then
new_var_1 = group;
run;
Regards,
Amir.
Good morning and Hi @SimonJ If your dataset is sorted, i see the solution being very straight forward
data have;
input Customer group $ money;
cards;
1 Group1 100
1 Group2 200
2 Group2 100
2 Group3 250
2 Group4 275
;
data want;
set have(in=a) have(in=b);
by customer ;
retain new_var1 new_var2;
if a and first.customer then new_var1=group;
else if a then new_var2=group;
if b;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.