Hello all,
I need to exclude every ID that has more than one sale on its first sold date. The condition applies only to the first date for each unique ID; multiple sales on any later date are fine and those IDs should be kept. For example, in the data below there are multiple observations for each ID. I want to delete only the IDs that have more than one observation on their first date.
Id | sales | Date sold |
1 | car | 1/1/2001 |
1 | car | 1/1/2001 |
1 | truck | 1/3/2001 |
2 | motorcycle | 1/5/2001 |
2 | truck | 1/8/2001 |
3 | bike | 1/4/2003 |
3 | motorcycle | 1/5/2003 |
3 | truck | 1/6/2003 |
3 | bike | 1/6/2003 |
The output should look like this:
ID | sales | date |
2 | motorcycle | 1/5/2001 |
2 | truck | 1/8/2001 |
3 | bike | 1/4/2003 |
3 | motorcycle | 1/5/2003 |
3 | truck | 1/6/2003 |
3 | bike | 1/6/2003 |
ID 1 is deleted because it had more than one sale (of the same or a different sales type) on its first date. ID 3 also had more than one sale on a single date, but it is not deleted because those sales were not on its first date.
OK. Assuming the data has been sorted as you posted.
data have;
  infile cards expandtabs truncover;
  input Id (sales Datesold) (:$40.);
  cards;
1 car 1/1/2001
1 car 1/1/2001
1 truck 1/3/2001
2 motorcycle 1/5/2001
2 truck 1/8/2001
3 bike 1/4/2003
3 motorcycle 1/5/2003
3 truck 1/6/2003
3 bike 1/6/2003
;
run;
data want;
  n=0;count=0;
  do until(last.id);
    set have;
    by id Datesold;
    if first.Datesold then n+1;
    if n=1 then count+1;
  end;
  do until(last.id);
    set have;
    by id Datesold;
    if count=1 then output;
  end;
  drop n count;
run;
It's not 100% clear what you are asking for, but it seems this is what you are after:
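/* Order the rows so each ID's earliest date comes first */
proc sort data=have;
  by id date_sold;
run;
data want;
  set have;
  by id date_sold;
  /* On the first row of an ID: if it is not also the last row for that
     date, the first date has more than one sale, so flag the whole ID */
  if first.id then do;
    if last.date_sold=0 then delete_me='Y';
    else delete_me='N';
  end;
  /* Carry the flag across all remaining rows of the same ID */
  retain delete_me;
  if delete_me='Y' then delete;
  /* The flag is only a helper, so keep it out of the output */
  drop delete_me;
run;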
Even if this isn't exactly the result you need, BY-group processing with the FIRST. and LAST. variables is likely the right tool for getting to a solution.
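One thing to watch with either approach: sorting by the date only gives chronological order if the date is stored as a real SAS date rather than as text like '1/1/2001'. Here is a minimal end-to-end sketch of the FIRST./LAST. idea under that assumption. The names have, want, date_sold and delete_me are just illustrative, and reading the dates with the MMDDYY10. informat assumes they are month/day/year.

```sas
/* Sample data from the question; date_sold is read as a numeric SAS date
   (assumption: the values are month/day/year) */
data have;
  infile cards truncover;
  input id sales :$40. date_sold :mmddyy10.;
  format date_sold mmddyy10.;
  cards;
1 car 1/1/2001
1 car 1/1/2001
1 truck 1/3/2001
2 motorcycle 1/5/2001
2 truck 1/8/2001
3 bike 1/4/2003
3 motorcycle 1/5/2003
3 truck 1/6/2003
3 bike 1/6/2003
;
run;

/* With a numeric date, BY order is chronological */
proc sort data=have;
  by id date_sold;
run;

data want;
  set have;
  by id date_sold;
  retain delete_me;
  /* First row of an ID: flag the ID when that row is not the only one
     on its date, i.e. the first date has more than one sale */
  if first.id then delete_me = (not last.date_sold);
  if delete_me then delete;
  drop delete_me;
run;
```

On the sample data this should drop ID 1 and keep all rows for IDs 2 and 3, since only ID 1 has two sales on its first date.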