Hi
I have following data
Customer group money
1 Group1 100
1 Group2 200
2 Group2 100
2 Group3 250
2 Group4 275
I need to create two new variables
the first one looking at group variable taking the first 1 group once sorted applied to records by Customer number
Customer group money new variable 1
1 Group1 100 Group1
1 Group2 200 Group1
2 Group2 100 Group2
2 Group3 250 Group2
2 Group4 275 Group2
The second one is again to look at the groups but take the group based on the last money field once sorted
Customer group money new variable 1 New Varible2
1 Group1 100 Group1 Group 2
1 Group2 200 Group1 Group 2
2 Group2 100 Group2 Group4
2 Group3 250 Group2 Group4
2 Group4 275 Group2 Group4
I have tried using Proc Sort then using first function but I am not sure how it works?
Any help would be greatly received.
Thanks
Simon
Try this in SQL;
proc sql;
create table want as
select
a.customer, a.group, a.money,
min(a.group) as new_var_1,
(select group from have b where b.customer = a.customer group by b.customer having money = max(money)) as new_var_2
from have a
group by customer;
quit;
For a data step/sort solution, you would need two sorts:
proc sort data=have;
by customer group;
run;
data int1 (keep=customer new_var_1);
set have;
by customer;
if first.customer;
new_var_1 = group;
run;
proc sort data=have;
by customer money;
run;
data int2 (keep=customer new_var_2);
set have;
by customer;
if last.customer;
new_var_2 = group;
run;
data want;
merge
have
int1
int2
;
by customer;
run;
Testing will reveal which of the solutions is better, performancewise, for larger datasets.
Try this in SQL;
proc sql;
create table want as
select
a.customer, a.group, a.money,
min(a.group) as new_var_1,
(select group from have b where b.customer = a.customer group by b.customer having money = max(money)) as new_var_2
from have a
group by customer;
quit;
For a data step/sort solution, you would need two sorts:
proc sort data=have;
by customer group;
run;
data int1 (keep=customer new_var_1);
set have;
by customer;
if first.customer;
new_var_1 = group;
run;
proc sort data=have;
by customer money;
run;
data int2 (keep=customer new_var_2);
set have;
by customer;
if last.customer;
new_var_2 = group;
run;
data want;
merge
have
int1
int2
;
by customer;
run;
Testing will reveal which of the solutions is better, performancewise, for larger datasets.
Thanks I found this one did the job thanks for your help
Hi,
If the data is sorted in the order presented then you could try:
data have_last;
set have;
by customer;
if last.customer;
run;
data want;
merge have
have_last (rename = group = new_var_2);
by customer;
retain new_var_1;
if first.customer then
new_var_1 = group;
run;
Regards,
Amir.
Good morning and Hi @SimonJ If your dataset is sorted, i see the solution being very straight forward
data have;
input Customer group $ money;
cards;
1 Group1 100
1 Group2 200
2 Group2 100
2 Group3 250
2 Group4 275
;
data want;
set have(in=a) have(in=b);
by customer ;
retain new_var1 new_var2;
if a and first.customer then new_var1=group;
else if a then new_var2=group;
if b;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.