Hello, I have a first table with the customer number and the registration date. Then a second with the order numbers and all the order dates.
I would like to create a table that adds to the A table the date of the first order.
A | B | ||||
Number | DATE_RECORD | Number | DATE_ORDER | ||
234 | 01/02/1995 | 2345 | 11/04/2013 | ||
890 | 03/06/2016 | 2345 | 04/11/2017 | ||
2345 | 04/06/2012 | 2345 | 04/06/2015 | ||
896 | 29/11/2001 | 896 | 21/02/2002 | ||
8765 | 06/03/2002 | 896 | 01/11/2008 | ||
234 | 13/09/1995 | ||||
WANT | |||||
Number | DATE_RECORD | FIRST_ORDER | |||
234 | 01/02/1995 | 13/09/1995 | |||
890 | 03/06/2016 | ||||
2345 | 04/06/2012 | 11/04/2013 | |||
896 | 29/11/2001 | 21/02/2002 | |||
8765 | 06/03/2002 |
Thanks for your help
If keeping the order of dataset a is of importance, consider a hash approach:
proc sort data=b;
by number date_order;
run;
data want;
set a;
if _n_ = 1
then do;
format first_order ddmmyy10.;
declare hash b (dataset:"b (rename=(date_order=first_order))");
b.definekey("number");
b.definedata("first_order");
b.definedone();
end;
if b.find() ne 0 then first_order = .;
run;
Since only the first observation for a key is kept when loading the table into the hash (unless a MULTIDATA:"Y" option is used), the preceding sort makes sure that the minimum date for a number goes into the hash.
PROC SQL;
CREATE TABLE want AS
SELECT a.*,b.number as numberB, b.DATE_ORDER
FROM A
FULL JOIN B
ON A.number eq b.number
ORDER BY a.number,a.date_record, b.DATE_ORDER
;
QUIT;
DATA want;
set want;
by number date_record date_order;
if missing(date_record) or missing(date_order) then do;
if missing(date_record) and missing(date_order) then put 'ERROR: no registration and no order date found ' number=;
else if missing(date_order) then put 'ERROR: no order date found ' number= date_record=;
else if missing(date_record) then put 'ERROR: no registration date found ' number= date_order=;
end;
if first.number;
RUN;
- Cheers -
data a; infile cards expandtabs; input Number DATE_RECORD :ddmmyy10.; format date_record ddmmyy10.; cards; 234 01/02/1995 890 03/06/2016 2345 04/06/2012 896 29/11/2001 8765 06/03/2002 ; data b; infile cards expandtabs; input Number DATE_RECORD :ddmmyy10.; format date_record ddmmyy10.; cards; 2345 11/04/2013 2345 04/11/2017 2345 04/06/2015 896 21/02/2002 896 01/11/2008 234 13/09/1995 ; proc sort data=a;by Number DATE_RECORD;run; proc sort data=b;by Number DATE_RECORD;run; data want; merge a b(rename=( DATE_RECORD=First_Date)); by Number; if first.Number; run;
Should be doable in a simple SQL:
proc sql;
create table want as
select
a.number,
a.date_record,
min(b.date_order) as first_order format=ddmmyy10.
from a left join b on a.number = b.number
group by a.number, a.date_record
;
quit;
If keeping the order of dataset a is of importance, consider a hash approach:
proc sort data=b;
by number date_order;
run;
data want;
set a;
if _n_ = 1
then do;
format first_order ddmmyy10.;
declare hash b (dataset:"b (rename=(date_order=first_order))");
b.definekey("number");
b.definedata("first_order");
b.definedone();
end;
if b.find() ne 0 then first_order = .;
run;
Since only the first observation for a key is kept when loading the table into the hash (unless a MULTIDATA:"Y" option is used), the preceding sort makes sure that the minimum date for a number goes into the hash.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.