I have this big dataset were I want to select the first entry of the each customer by date in to the register so that I can put them in a new table/dataset were each customer is represented once.
My dataset looks like this:
Data Have;
Customer_ID$ ent_date Code$
1 1/1/2018 N06
1 1/2/2018 A10
2 3/5/2018 B20
2 4/20/2018 B20
3 5/27/2018 Y20
4 9/1/2018 N06AA
4 9/16/2018 A10B
4 10/12/2018 B20.
Data Want;
Customer_ID$ ent_date Code$
1 1/1/2018 N06
2 3/5/2018 B20
3 5/27/2018 Y20
4 9/1/2018 N06AA
Thanks!
data have;
input Customer_ID$ ent_date :mmddyy10. Code$ ;
format ent_date mmddyy10.;
cards;
1 1/1/2018 N06
1 1/2/2018 A10
2 3/5/2018 B20
2 4/20/2018 B20
3 5/27/2018 Y20
4 9/1/2018 N06AA
4 9/16/2018 A10B
4 10/12/2018 B20
;
data want;
set have;
by customer_id;
if first.customer_id;
run;
/*or*/
data have;
input Customer_ID$ ent_date :mmddyy10. Code$ ;
format ent_date mmddyy10.;
cards;
1 1/1/2018 N06
1 1/2/2018 A10
2 3/5/2018 B20
2 4/20/2018 B20
3 5/27/2018 Y20
4 9/1/2018 N06AA
4 9/16/2018 A10B
4 10/12/2018 B20
;
proc sql;
create table want as
select *
from have
group by customer_id
having min(ent_date)=ent_date
order by customer_id, ent_date;
quit;
data have;
input Customer_ID$ ent_date :mmddyy10. Code$ ;
format ent_date mmddyy10.;
cards;
1 1/1/2018 N06
1 1/2/2018 A10
2 3/5/2018 B20
2 4/20/2018 B20
3 5/27/2018 Y20
4 9/1/2018 N06AA
4 9/16/2018 A10B
4 10/12/2018 B20
;
data want;
set have;
by customer_id;
if first.customer_id;
run;
/*or*/
data have;
input Customer_ID$ ent_date :mmddyy10. Code$ ;
format ent_date mmddyy10.;
cards;
1 1/1/2018 N06
1 1/2/2018 A10
2 3/5/2018 B20
2 4/20/2018 B20
3 5/27/2018 Y20
4 9/1/2018 N06AA
4 9/16/2018 A10B
4 10/12/2018 B20
;
proc sql;
create table want as
select *
from have
group by customer_id
having min(ent_date)=ent_date
order by customer_id, ent_date;
quit;
In addition to @novinosrin you can use the nodupkey option on a proc sort step. You just have to sort it depending on the values you want. The data was already sorted here.
proc sort data=have nodupkey; by customer_id; run;
In case your data is already sorted by customer_id end_date then use @novinosrin solution
otherwise you need to sort data by both variables before sub-setting the data in order to have the first in date of each customer.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.