Hello all,
I need to exclude every ID that has more than one sale on its first sold date. The condition applies only to the first date for each unique ID; multiple sales on the same later date are fine and those IDs should be kept. For example, in the data below there are multiple observations for each ID. I want to delete only those IDs that have duplicate observations on their first date.
Id | sales | Date sold |
1 | car | 1/1/2001 |
1 | car | 1/1/2001 |
1 | truck | 1/3/2001 |
2 | motorcycle | 1/5/2001 |
2 | truck | 1/8/2001 |
3 | bike | 1/4/2003 |
3 | motorcycle | 1/5/2003 |
3 | truck | 1/6/2003 |
3 | bike | 1/6/2003 |
The output should look like this:
ID | sales | date |
2 | motorcycle | 1/5/2001 |
2 | truck | 1/8/2001 |
3 | bike | 1/4/2003 |
3 | motorcycle | 1/5/2003 |
3 | truck | 1/6/2003 |
3 | bike | 1/6/2003 |
'ID-1' is deleted because it had more than one sale (same or different sales type) on its first date. Although 'ID-3' also had more than one sale on a single date, it is not deleted because those sales were not on its first date.
OK. Assuming the data has been sorted as you posted.
data have;
infile cards expandtabs truncover;
input Id (sales Datesold) (:$40.);
cards;
1 car 1/1/2001
1 car 1/1/2001
1 truck 1/3/2001
2 motorcycle 1/5/2001
2 truck 1/8/2001
3 bike 1/4/2003
3 motorcycle 1/5/2003
3 truck 1/6/2003
3 bike 1/6/2003
;
run;
data want;
n=0;count=0;
/* First pass over one ID: n numbers the distinct dates, count tallies the rows on the first date */
do until(last.id);
set have;
by id Datesold;
if first.Datesold then n+1;
if n=1 then count+1;
end;
/* Second pass over the same ID: keep its rows only if the first date had exactly one row */
do until(last.id);
set have;
by id Datesold;
if count=1 then output;
end;
drop n count;
run;
It's not 100% clear what you are asking for, but it seems this is what you are after:
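proc sort data=have;
by id date_sold;
run;
data want;
set have;
by id date_sold;
/* On the first row of each ID: if it is not also the last row for that date, the first date has duplicates */
if first.id then do;
if last.date_sold=0 then delete_me='Y';
else delete_me='N';
end;
/* Carry the flag across all rows of the ID */
retain delete_me;
if delete_me='Y' then delete;
run;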
Even if I didn't figure out the proper result here, these would likely be the right tools to be playing with to get a solution.
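For comparison, the same rule can also be sketched in PROC SQL. This is only an illustration under some assumptions, not something posted in the thread: the variable is named date_sold, it is read as a numeric SAS date with the mmddyy10. informat (so that MIN and sorting follow calendar order rather than text order), and the helper table first_dates is a made-up name.

```sas
/* Sample data from the question; date_sold is read as a numeric SAS date (assumption) */
data have;
  input id sales :$40. date_sold :mmddyy10.;
  format date_sold mmddyy10.;
  datalines;
1 car 1/1/2001
1 car 1/1/2001
1 truck 1/3/2001
2 motorcycle 1/5/2001
2 truck 1/8/2001
3 bike 1/4/2003
3 motorcycle 1/5/2003
3 truck 1/6/2003
3 bike 1/6/2003
;
run;

proc sql;
  /* Hypothetical helper table: the earliest sold date for each ID */
  create table first_dates as
  select id, min(date_sold) as first_date
  from have
  group by id;

  /* Keep every row of an ID unless that ID has more than one row on its first date */
  create table want as
  select h.*
  from have as h
  where h.id not in (
    select f.id
    from have as x inner join first_dates as f
      on x.id = f.id and x.date_sold = f.first_date
    group by f.id
    having count(*) > 1
  )
  order by h.id, h.date_sold;
quit;
```

With the sample data this should drop ID 1 and keep all rows for IDs 2 and 3. The ORDER BY does not guarantee the original order of rows that share the same ID and date, so the DATA step approaches are the safer choice if that order matters.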