Dear All,
I have the following data (two datasets):
Company Date Company Date Value
1 199601 1 199606 10
1 199602 1 199706 11
1 199603 1 199806 12
1 199604
1 199605
1 199606
1 199607
1 199608
1 199609
1 199610
1 199611
1 199612
1 199701
1 199702
1 199703
1 199704
1 199705
1 199706
1 199707
1 199708
1 199709
1 199710
1 199711
1 199712
1 199801
1 199802
1 199803
1 199804
1 199805
1 199806
1 199807
1 199808
1 199809
1 199810
1 199811
1 199812
I need to obtain the following dataset:
Company Date Value
1 199601 10
1 199602 10
1 199603 10
1 199604 10
1 199605 10
1 199606 11
1 199607 11
1 199608 11
1 199609 11
1 199610 11
1 199611 11
1 199612 11
1 199701 11
1 199702 11
1 199703 11
1 199704 11
1 199705 11
1 199706 12
1 199707 12
1 199708 12
1 199709 12
1 199710 12
1 199711 12
1 199712 12
1 199801 12
1 199802 12
1 199803 12
1 199804 12
1 199805 12
1 199806 .
1 199807 .
1 199808 .
1 199809 .
1 199810 .
1 199811 .
1 199812 .
So, for each company I need to add "value" column from the second to the first dataset. The criteria how to add is the following. Add 10 if first date in the second dataset (199606) is greater (but not equal!!) than the dates in the first dataset. Then, add 11 if the second date in the second dataset (199706) is greater than the dates in the first dataset, but less or equal than 199606. Finally, add 12 if the third date in the second dataset (199806) is greater than the dates in the first dataset, but less or equal than 199706. Please note that I should have "value" column missing if date is greater or equal than 199806.
I look forward to getting your reply.
Kind regards,
Ruslan
Change the position of two table A and B in the first data step.
data a;
input Company Date ;
cards;
1 199601
1 199602
1 199603
1 199604
1 199605
1 199606
1 199607
1 199608
1 199609
1 199610
1 199611
1 199612
1 199701
1 199702
1 199703
1 199704
1 199705
1 199706
1 199707
1 199708
1 199709
1 199710
1 199711
1 199712
1 199801
1 199802
1 199803
1 199804
1 199805
1 199806
1 199807
1 199808
1 199809
1 199810
1 199811
1 199812
;
run;
data b;
input Company Date Value;
cards;
1 199606 10
1 199706 11
1 199806 12
;
run;
data temp;
set a b(in=inb);
by Company Date;
in=inb;
run;
data want;
do until(last.Company or not missing(Value));
set temp;
by Company;
end;
v=Value;
do until(last.Company or not missing(Value));
set temp;
by Company;
if not in then output;
end;
drop Value in;
run;
data a;
input Company Date ;
cards;
1 199601
1 199602
1 199603
1 199604
1 199605
1 199606
1 199607
1 199608
1 199609
1 199610
1 199611
1 199612
1 199701
1 199702
1 199703
1 199704
1 199705
1 199706
1 199707
1 199708
1 199709
1 199710
1 199711
1 199712
1 199801
1 199802
1 199803
1 199804
1 199805
1 199806
1 199807
1 199808
1 199809
1 199810
1 199811
1 199812
;
run;
data b;
input Company Date Value;
cards;
1 199606 10
1 199706 11
1 199806 12
;
run;
data temp;
set b(in=inb) a;
by Company Date;
in=inb;
run;
data want;
do until(last.Company or not missing(Value));
set temp;
by Company;
end;
v=Value;
do until(last.Company or not missing(Value));
set temp;
by Company;
if not in then output;
end;
drop Value in;
run;
Thank you very much for your help. Your code works perfectly.
However, I also need to get table like this with minor correction. Could you please let me know what I need to change in your code? I am a bit struggled.
I need to get value 10 for 199601, value 11 for 199701 and so on. I will hugely appreciate your help.
Company Date Value
1 199601 10
1 199602 10
1 199603 10
1 199604 10
1 199605 10
1 199606 10
1 199607 11
1 199608 11
1 199609 11
1 199610 11
1 199611 11
1 199612 11
1 199701 11
1 199702 11
1 199703 11
1 199704 11
1 199705 11
1 199706 11
1 199707 12
1 199708 12
1 199709 12
1 199710 12
1 199711 12
1 199712 12
1 199801 12
1 199802 12
1 199803 12
1 199804 12
1 199805 12
1 199806 12
1 199807 .
1 199808 .
1 199809 .
1 199810 .
1 199811 .
1 199812 .
Kind regards,
Ruslan
Change the position of two table A and B in the first data step.
data a;
input Company Date ;
cards;
1 199601
1 199602
1 199603
1 199604
1 199605
1 199606
1 199607
1 199608
1 199609
1 199610
1 199611
1 199612
1 199701
1 199702
1 199703
1 199704
1 199705
1 199706
1 199707
1 199708
1 199709
1 199710
1 199711
1 199712
1 199801
1 199802
1 199803
1 199804
1 199805
1 199806
1 199807
1 199808
1 199809
1 199810
1 199811
1 199812
;
run;
data b;
input Company Date Value;
cards;
1 199606 10
1 199706 11
1 199806 12
;
run;
data temp;
set a b(in=inb);
by Company Date;
in=inb;
run;
data want;
do until(last.Company or not missing(Value));
set temp;
by Company;
end;
v=Value;
do until(last.Company or not missing(Value));
set temp;
by Company;
if not in then output;
end;
drop Value in;
run;
Perfect!
Thanks a lot, Xia!
Please try the proc sql approach
proc sql;
create table test as select a.* , b.value from a left join b on
a.company=b.company and a.date<b.date order by a.date,b.value;
quit;
data test2;
set test;
by company date value;
if first.date;
run;
Thanks,
Jag
Hi Ruslan, Just curios to know if the code i provided was helpful. Please let me know if it need ant changes.
Hi Jagadishkatam,
I have tried your code, but it did not work since it gave me for each "value" from dataset 2 the same values of date from dataset 1. Could you please change Xia Keshan code to account for minor corrections mentioned above?
Thanks.
Kind regards,
Ruslan
Yes the code is creating a Cartesian product and we need to consider the first record for which i wrote the below code. Could you please try.
data test2;
set test;
by company date value;
if first.date;
run;
i did not try Xia Keshan code. i will check this code as well and try to help you
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.