Hi All,
I have two datasets that look as follows:
Dataset1
Date Company
19950131 1
19950228 1
19950331 1
19950428 1
19950531 1
19950630 1
19950731 1
19950831 1
19950929 1
19951031 1
19951130 1
19951229 1
19960131 1
19960229 1
19960329 1
19960430 1
19960531 1
19960628 1
19960731 1
19960830 1
19960930 1
19961031 1
19961129 1
19961231 1
19970131 1
19970228 1
19970331 1
19970430 1
19970530 1
19970630 1
19970731 1
19970829 1
19970930 1
19971031 1
19971128 1
19971231 1
Dataset2
Date Company Var
30Jun1995 1 10
30Dec1996 1 11
The required dataset should look like this:
Date Company Var
19950131 1 .
19950228 1 .
19950331 1 .
19950428 1 .
19950531 1 .
19950630 1 .
19950731 1 .
19950831 1 .
19950929 1 .
19951031 1 .
19951130 1 .
19951229 1 .
19960131 1 .
19960229 1 .
19960329 1 .
19960430 1 .
19960531 1 .
19960628 1 .
19960731 1 10
19960830 1 10
19960930 1 10
19961031 1 10
19961129 1 10
19961231 1 10
19970131 1 10
19970228 1 10
19970331 1 10
19970430 1 10
19970530 1 10
19970630 1 10
19970731 1 11
19970829 1 11
19970930 1 11
19971031 1 11
19971128 1 11
19971231 1 11
As you can see, for each company, I need to merge dataset1 and dataset2 such that Var from dataset2 (Var data can be available at any month in a year t i.e. not only June or Dec) is added to July of year t+1 in dataset1 and this Var value is repeated until the July of year t+2 and so on. I have data until 2015 and I have a lot of companies. So I need to do this merge for all dates and for each company.
Any help will be hugely appreciated.
Kind regards,
Ruslan
There can be many ways to do this.
One way I can think of immediately is
1. In Dataset1, create two new columns YEAR , MONTH to store value of year and month respectively
2. In Dataset2, create two new columns YEAR , MONTH. Here store value of year in YEAR column and value 7 in month column. Hard code month value in dataset2.
3. Dataset 1 left join dataset2 on YEAR, MONTH columns. Till this point value would come only in month 7. This would be
dataset 3.
4. Sort your dataset 3 by MONTH and year. Then if month =7 has any value in var column then this Var value would be reatained until next month 7 is found.
Hi,
I tried to do these steps, but I do not know how to merge with July of year t+1. I can easily merge with July of year t, but do not know how to merge it with July of year t+1.
So, any piece of code will be highly appreciated.
Kind regards,
Ruslan
Honestly, Your logic is not clear to me.
Assuming I know what you mean.
data Dataset1;
input Date : yymmdd10. Company ;
id=year(date)*100+month(date);
format Date yymmdd10.;
cards;
19950131 1
19950228 1
19950331 1
19950428 1
19950531 1
19950630 1
19950731 1
19950831 1
19950929 1
19951031 1
19951130 1
19951229 1
19960131 1
19960229 1
19960329 1
19960430 1
19960531 1
19960628 1
19960731 1
19960830 1
19960930 1
19961031 1
19961129 1
19961231 1
19970131 1
19970228 1
19970331 1
19970430 1
19970530 1
19970630 1
19970731 1
19970829 1
19970930 1
19971031 1
19971128 1
19971231 1
;
run;
data Dataset2;
input Date : date9. Company Var;
format Date yymmdd10.;
cards;
30Jun1995 1 10
30Dec1996 1 11
;
run;
data key;
set Dataset2;
start=mdy(month(lag(date)),1,year(lag(date))+1);
stop=mdy(month(date),1,year(date)+1);
n=coalesce(intck('month',start,stop),0);
do i=13 to n;
new_date=intnx('month',start,i);
id=year(new_date)*100+month(new_date);
output;
end;
do i=13 to 24;
new_date=intnx('month',date,i);
id=year(new_date)*100+month(new_date);
output;
end;
keep company id var ;
run;
data want;
merge Dataset1(in=ina) key;
by company id;
if ina;
run;
Hi Xia,
Your code works fine, however it does not take into account the fact that in dataset2 I may have the Var value on 31Dec or 30Apr. So it does not matter on which month and year t I have the Var value available (so it may be available in June, Nov or March), this Var value should be merged with July of next year t+1 in dataset1 and repeated throughout the whole year until next June of year t+2.
Please please help me to sort out this problem.
Kind regards,
Ruslan
Sorry. As I said , I don't understand your logic clearly . I konw what you said these .
Show me an example to prove where is my code wrong .
Hi All,
Different problem now!!!
I have the dataset that looks like this:
Date Company Var
199601 1 .
199602 1 .
199603 1 .
199604 1 .
199605 1 .
199606 1 10
199607 1 .
199608 1 .
199609 1 .
199610 1 .
199611 1 .
199612 1 .
199701 1 .
199702 1 .
199703 1 .
199704 1 .
199705 1 .
199706 1 11
199707 1 .
199708 1 .
199709 1 .
199710 1 .
199711 1 .
199712 1 .
The required dataset should look like this:
Date Company Var
199601 1 .
199602 1 .
199603 1 .
199604 1 .
199605 1 .
199606 1 .
199607 1 10
199608 1 10
199609 1 10
199610 1 10
199611 1 10
199612 1 10
199701 1 10
199702 1 10
199703 1 10
199704 1 10
199705 1 10
199706 1 10
199707 1 11
199708 1 11
199709 1 11
199710 1 11
199711 1 11
199712 1 11
So what I need to do is for each company, to repeat the same value that is available in June, year t, for next months within a year until June, year t+1. Please note that I need to use Var value in June, but to start from July, year t, and repeat the value until June, year t+1. I have many months and many companies.
Any help will be hugely appriciated.
Kind regards,
Ruslan
@Reeza has already post this question for you, I have already post the new code. Find it in this forum on your own.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.