BookmarkSubscribeRSS Feed
Ruslan
Calcite | Level 5

Hi All,

 

I have two datasets that look as follows:

 

Dataset1

Date          Company 

19950131        1              

19950228        1

19950331        1

19950428        1

19950531        1

19950630        1

19950731        1

19950831        1

19950929        1

19951031        1

19951130        1

19951229        1

19960131        1

19960229        1

19960329        1

19960430        1

19960531        1

19960628        1

19960731        1

19960830        1

19960930        1

19961031        1

19961129        1

19961231        1

19970131        1

19970228        1

19970331        1

19970430        1

19970530        1

19970630        1

19970731        1

19970829        1

19970930        1

19971031        1

19971128        1

19971231        1

 

Dataset2

Date             Company          Var

30Jun1995          1                 10

30Dec1996          1                 11

 

The required dataset should look like this:

 

Date          Company    Var

19950131        1             .      

19950228        1             .

19950331        1             .

19950428        1             .

19950531        1             .

19950630        1             .

19950731        1             .

19950831        1             .

19950929        1             . 

19951031        1             .

19951130        1             .

19951229        1             .

19960131        1             .

19960229        1             .

19960329        1             .

19960430        1             .

19960531        1             .

19960628        1             .

19960731        1             10

19960830        1             10

19960930        1             10

19961031        1             10

19961129        1             10 

19961231        1             10

19970131        1             10

19970228        1             10

19970331        1             10

19970430        1             10

19970530        1             10

19970630        1             10

19970731        1             11

19970829        1             11

19970930        1             11

19971031        1             11

19971128        1             11

19971231        1             11

 

As you can see, for each company, I need to merge dataset1 and dataset2 such that Var from dataset2 (Var data can be available at any month in a year t i.e. not only June or Dec) is added to July of year t+1 in dataset1 and this Var value is repeated until the July of year t+2 and so on. I have data until 2015 and I have a lot of companies. So I need to do this merge for all dates and for each company.

 

Any help will be hugely appreciated.

 

Kind regards,

Ruslan

7 REPLIES 7
RahulG
Barite | Level 11

There can be many ways to do this.

 

One way I can think of immediately is

 

1. In Dataset1, create two new columns YEAR , MONTH to store value of year and month respectively

 

2. In Dataset2, create two new columns YEAR , MONTH. Here store value of year in YEAR column and value 7 in month column. Hard code month value in dataset2.

 

3.  Dataset 1 left join dataset2 on YEAR, MONTH columns. Till this point value would come only in month 7. This would be

dataset 3.

 

4. Sort your dataset 3 by MONTH  and year. Then if month =7 has any value in var column then this Var value would be reatained until next month 7 is found. 

Ruslan
Calcite | Level 5

Hi,

 

I tried to do these steps, but I do not know how to merge with July of year t+1. I can easily merge with July of year t, but do not know how to merge it with July of year t+1.

 

So, any piece of code will be highly appreciated.

 

Kind regards,

Ruslan

Ksharp
Super User

Honestly, Your logic is not clear to me.

Assuming I know what you mean.

 

 

 

 

data Dataset1;
input Date : yymmdd10.         Company ;
id=year(date)*100+month(date);
format Date  yymmdd10.;
cards;
19950131        1              
19950228        1
19950331        1
19950428        1
19950531        1
19950630        1
19950731        1
19950831        1
19950929        1
19951031        1
19951130        1
19951229        1
19960131        1
19960229        1
19960329        1
19960430        1
19960531        1
19960628        1
19960731        1
19960830        1
19960930        1
19961031        1
19961129        1
19961231        1
19970131        1
19970228        1
19970331        1
19970430        1
19970530        1
19970630        1
19970731        1
19970829        1
19970930        1
19971031        1
19971128        1
19971231        1
;
run;

data Dataset2;
input Date : date9. Company          Var;
format Date  yymmdd10.;
cards;
30Jun1995          1                 10
30Dec1996          1                 11
;
run;
data key;
 set Dataset2;
 
 start=mdy(month(lag(date)),1,year(lag(date))+1);
 stop=mdy(month(date),1,year(date)+1);
 n=coalesce(intck('month',start,stop),0);
 do i=13 to n;
  new_date=intnx('month',start,i);
  id=year(new_date)*100+month(new_date);
  output;
 end;

 do i=13 to 24;
  new_date=intnx('month',date,i);
  id=year(new_date)*100+month(new_date);
  output;
 end;
keep company id var ;
run;

data want;
 merge Dataset1(in=ina) key;
 by company id;
 if ina;
run;
Ruslan
Calcite | Level 5

Hi Xia,

 

Your code works fine, however it does not take into account the fact that in dataset2 I may have the Var value on 31Dec or 30Apr. So it does not matter on which month and year t I have the Var value available (so it may be available in June, Nov or March), this Var value should be merged with July of next year t+1 in dataset1 and repeated throughout the whole year until next June of year t+2.

 

Please please help me to sort out this problem. 

 

Kind regards,

Ruslan

Ksharp
Super User

Sorry. As I said , I don't understand your logic clearly . I konw what you said these .

Show me an example to prove where is my code wrong .

Ruslan
Calcite | Level 5

Hi All,

Different problem now!!!

 

I have the dataset that looks like this:

 

Date      Company    Var

 

199601        1             .

199602        1             .

199603        1             .

199604        1             .

199605        1             .

199606        1             10

199607        1             .

199608        1             .

199609        1             .

199610        1             .

199611        1             . 

199612        1             .

199701        1             .

199702        1             .

199703        1             .

199704        1             .

199705        1             .

199706        1             11

199707        1             .

199708        1             .

199709        1             .

199710        1             .

199711        1             .

199712        1             .

 

The required dataset should look like this:

 

Date      Company    Var

 

199601        1             .

199602        1             .

199603        1             .

199604        1             .

199605        1             .

199606        1             .

199607        1             10

199608        1             10

199609        1             10

199610        1             10

199611        1             10

199612        1             10

199701        1             10

199702        1             10

199703        1             10

199704        1             10

199705        1             10

199706        1             10

199707        1             11

199708        1             11

199709        1             11

199710        1             11

199711        1             11

199712        1             11

 

So what I need to do is for each company, to repeat the same value that is available in June, year t, for next months within a year until June, year t+1. Please note that I need to use Var value in June, but to start from July, year t, and repeat the value until June, year t+1. I have many months and many companies.

 

Any help will be hugely appriciated.

 

Kind regards,

Ruslan

Ksharp
Super User
@Reeza has already post this question for you, I have already post the new code. Find it in this forum on your own.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1174 views
  • 1 like
  • 3 in conversation