BookmarkSubscribeRSS Feed
jkim197
Obsidian | Level 7

Hello all!

 

I want to convert mixed quarterly and annual frequency data to monthly data in panel data setting.

 

I have 

 

summary_period2 : represents frequency, (A: annual, Q: quarterly, AQ:quarterly)

crsp_fundno : entity id variable

caldt : calendar date

end_dt : ending calendar date for each entitiy

 

summary_period2crsp_fundnoexp_ratiocaldtend_dt
A10.00251997123120000731
A10.00191998123120000731
Q10.00191999033120000731
Q10.00191999063020000731
Q10.00421999093020000731
AQ10.00421999123120000731
Q10.00422000033120000731
Q10.00422000063020000731
A201997123120000831
A201998123120000831
Q201999033120000831
Q201999063020000831
Q201999093020000831
AQ201999123120000831
Q202000033120000831
Q202000063020000831

 

 

data have;
input  summary_period2 $	crsp_fundno	exp_ratio	caldt	end_dt;
datalines;
A	1	0.0025	19971231	20000731
A	1	0.0019	19981231	20000731
Q	1	0.0019	19990331	20000731
Q	1	0.0019	19990630	20000731
Q	1	0.0042	19990930	20000731
AQ	1	0.0042	19991231	20000731
Q	1	0.0042	20000331	20000731
Q	1	0.0042	20000630	20000731
A	2	0	19971231	20000831
A	2	0	19981231	20000831
Q	2	0	19990331	20000831
Q	2	0	19990630	20000831
Q	2	0	19990930	20000831
AQ	2	0	19991231	20000831
Q	2	0	20000331	20000831
Q	2	0	20000630	20000831
;
run;

and want

 

crsp_fundnoexp_ratiocaldt
10.002519970131
10.002519970228
10.002519970331
10.002519970430
10.002519970531
10.002519970630
10.002519970731
10.002519970830
10.002519970931
10.002519971030
10.002519971131
10.002519971231
10.001919980131
10.001919980228
10.001919980331
10.001919980430
10.001919980531
10.001919980630
10.001919980731
10.001919980831
10.001919980930
10.001919981031
10.001919981130
10.001919981231
10.001919990131
10.001919990228
10.001919990331
10.001919990430
10.001919990531
10.001919990630
10.004219990731
10.004219990831
10.004219990930
10.004219991031
10.004219991130
10.004219991231
10.004220000131
10.004220000229
10.004220000331
10.004220000430
10.004220000531
10.004220000630
10.004220000731
2019970131
2019970228
2019970331
2019970430
2019970531
2019970630
2019970731
2019970831
2019970930
2019971031
2019971130
2019971231
2019980131
2019980228
2019980331
2019980430
2019980531
2019980630
2019980731
2019980831
2019980930
2019981031
2019981130
2019981231
2019990331
2019990630
2019990930
2019991231
2020000331
2020000630
2020000731
2020000831

 

 

In this situation how do I get the dataset I wanted?

7 REPLIES 7
Reeza
Super User

If you have SAS ETS, try PROC TIMESERIES. 

If you don't, a manual data step with a start/end date and a loop works:

 

data want;
set have;

do date= start_date to end_date;
output;
end;

run;

@jkim197 wrote:

Hello all!

 

I want to convert mixed quarterly and annual frequency data to monthly data in panel data setting.

 

I have 

 

summary_period2 : represents frequency, (A: annual, Q: quarterly, AQ:quarterly)

crsp_fundno : entity id variable

caldt : calendar date

end_dt : ending calendar date for each entitiy

 

summary_period2 crsp_fundno exp_ratio caldt end_dt
A 1 0.0025 19971231 20000731
A 1 0.0019 19981231 20000731
Q 1 0.0019 19990331 20000731
Q 1 0.0019 19990630 20000731
Q 1 0.0042 19990930 20000731
AQ 1 0.0042 19991231 20000731
Q 1 0.0042 20000331 20000731
Q 1 0.0042 20000630 20000731
A 2 0 19971231 20000831
A 2 0 19981231 20000831
Q 2 0 19990331 20000831
Q 2 0 19990630 20000831
Q 2 0 19990930 20000831
AQ 2 0 19991231 20000831
Q 2 0 20000331 20000831
Q 2 0 20000630 20000831

 

 

data have;
input  summary_period2 $	crsp_fundno	exp_ratio	caldt	end_dt;
datalines;
A	1	0.0025	19971231	20000731
A	1	0.0019	19981231	20000731
Q	1	0.0019	19990331	20000731
Q	1	0.0019	19990630	20000731
Q	1	0.0042	19990930	20000731
AQ	1	0.0042	19991231	20000731
Q	1	0.0042	20000331	20000731
Q	1	0.0042	20000630	20000731
A	2	0	19971231	20000831
A	2	0	19981231	20000831
Q	2	0	19990331	20000831
Q	2	0	19990630	20000831
Q	2	0	19990930	20000831
AQ	2	0	19991231	20000831
Q	2	0	20000331	20000831
Q	2	0	20000630	20000831
;
run;

and want

 

crsp_fundno exp_ratio caldt
1 0.0025 19970131
1 0.0025 19970228
1 0.0025 19970331
1 0.0025 19970430
1 0.0025 19970531
1 0.0025 19970630
1 0.0025 19970731
1 0.0025 19970830
1 0.0025 19970931
1 0.0025 19971030
1 0.0025 19971131
1 0.0025 19971231
1 0.0019 19980131
1 0.0019 19980228
1 0.0019 19980331
1 0.0019 19980430
1 0.0019 19980531
1 0.0019 19980630
1 0.0019 19980731
1 0.0019 19980831
1 0.0019 19980930
1 0.0019 19981031
1 0.0019 19981130
1 0.0019 19981231
1 0.0019 19990131
1 0.0019 19990228
1 0.0019 19990331
1 0.0019 19990430
1 0.0019 19990531
1 0.0019 19990630
1 0.0042 19990731
1 0.0042 19990831
1 0.0042 19990930
1 0.0042 19991031
1 0.0042 19991130
1 0.0042 19991231
1 0.0042 20000131
1 0.0042 20000229
1 0.0042 20000331
1 0.0042 20000430
1 0.0042 20000531
1 0.0042 20000630
1 0.0042 20000731
2 0 19970131
2 0 19970228
2 0 19970331
2 0 19970430
2 0 19970531
2 0 19970630
2 0 19970731
2 0 19970831
2 0 19970930
2 0 19971031
2 0 19971130
2 0 19971231
2 0 19980131
2 0 19980228
2 0 19980331
2 0 19980430
2 0 19980531
2 0 19980630
2 0 19980731
2 0 19980831
2 0 19980930
2 0 19981031
2 0 19981130
2 0 19981231
2 0 19990331
2 0 19990630
2 0 19990930
2 0 19991231
2 0 20000331
2 0 20000630
2 0 20000731
2 0 20000831

 

 

In this situation how do I get the dataset I wanted?


 

Shmuel
Garnet | Level 18

@Reeza, the line 

do date= start_date to end_date;

will loop a day by day.

 

If I understand correctly, the loop should be by month, the the code should be:

 

Shmuel
Garnet | Level 18

(the code was droped from my previos post)

data want;
  set have(rename=(caldt = start_dt));
        caldt = start_dt;
        drop start_dt end_dt;
        do while (caldt le end_dt);
             output;
             caldt = intnx('month', caldt, 1 ,'end');
       end;
run;
novinosrin
Tourmaline | Level 20
data have;
input  summary_period2 $	crsp_fundno	exp_ratio	caldt : yymmdd10.	end_dt : yymmdd10.;
format caldt yymmdd10.	end_dt yymmdd10.;
datalines;
A	1	0.0025	19971231	20000731
A	1	0.0019	19981231	20000731
Q	1	0.0019	19990331	20000731
Q	1	0.0019	19990630	20000731
Q	1	0.0042	19990930	20000731
AQ	1	0.0042	19991231	20000731
Q	1	0.0042	20000331	20000731
Q	1	0.0042	20000630	20000731
A	2	0	19971231	20000831
A	2	0	19981231	20000831
Q	2	0	19990331	20000831
Q	2	0	19990630	20000831
Q	2	0	19990930	20000831
AQ	2	0	19991231	20000831
Q	2	0	20000331	20000831
Q	2	0	20000630	20000831
;
run;
data want(drop=caldt _n rename=(n=caldt));
set have;
by crsp_fundno;
retain _n;
if first.crsp_fundno then do;
call missing(n);
n=intnx('month',intnx('year',caldt,0,'b')-1,0,'e');
end;
else n=_n;
do until(n=caldt);
 n=intnx('month',n,1,'e');
output;
end;
if last.crsp_fundno then do;n=end_dt;output;end;
_n=n;
format n _n yymmdd10.;
run;

@jkim197I need some in help in understanding your expected results for crsp_fundno  =2

2019991231
2020000331
2020000630

Seems like i am missing something as I am not spotting the month sequence above 

jkim197
Obsidian | Level 7
if I dont want to interpolate the exp_ratio so that
I stop at last caldt, what should I do?

1 0.0042 20000131
1 0.0042 20000229
1 0.0042 20000331
1 0.0042 20000430
1 0.0042 20000531
1 0.0042 20000630
* stop at here not 20000731

novinosrin
Tourmaline | Level 20

@jkim197wrote:
if I dont want to interpolate the exp_ratio so that
I stop at last caldt, what should I do?

1 0.0042 20000131
1 0.0042 20000229
1 0.0042 20000331
1 0.0042 20000430
1 0.0042 20000531
1 0.0042 20000630
* stop at here not 20000731


For that all you need to is comment out --->if last.crsp_fundno then do;n=end_dt;output;end;

 

data have;
input  summary_period2 $	crsp_fundno	exp_ratio	caldt : yymmdd10.	end_dt : yymmdd10.;
format caldt yymmdd10.	end_dt yymmdd10.;
datalines;
A	1	0.0025	19971231	20000731
A	1	0.0019	19981231	20000731
Q	1	0.0019	19990331	20000731
Q	1	0.0019	19990630	20000731
Q	1	0.0042	19990930	20000731
AQ	1	0.0042	19991231	20000731
Q	1	0.0042	20000331	20000731
Q	1	0.0042	20000630	20000731
A	2	0	19971231	20000831
A	2	0	19981231	20000831
Q	2	0	19990331	20000831
Q	2	0	19990630	20000831
Q	2	0	19990930	20000831
AQ	2	0	19991231	20000831
Q	2	0	20000331	20000831
Q	2	0	20000630	20000831
;
run;
data want(drop=caldt _n rename=(n=caldt));
set have;
by crsp_fundno;
retain _n;
if first.crsp_fundno then do;
call missing(_n);
n=intnx('month',intnx('year',caldt,0,'b')-1,0,'e');
end;
else n=_n;
do until(n=caldt);
 n=intnx('month',n,1,'e');
output;
end;
/*if last.crsp_fundno then do;n=end_dt;output;end;*/
_n=n;
format n _n yymmdd10.;
run;

 

I'll look forward to your feedback and see if you require any further changes/edit. Very interesting question indeed! Thank you!

 

jkim197
Obsidian | Level 7
Oh..just a mistake..
It should be stretched to monthly as same as below

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1446 views
  • 0 likes
  • 4 in conversation