BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
endofline
Obsidian | Level 7

Hi All, 

 

I'm working with some price data that is summed up by the year and the fiscal quarter. However, not all observations have purchases in all quarters/years. I want to make those missing quarters/year appear as a 0. I have a 3 year bubble around each of my observations where I have data on what they spent.

 

Currently this is how my data looks 

 

 

ID   Year    Qtr   Spent

1    2010   1       50

1    2010   2      100 

1    2010   4        25

1    2011   2        100

1    2011   4        200

 

 

What I want is:

 

ID   Year    Qtr   Spent

1    2010   1       50

1    2010   2      100 

1    2010   3        0

1    2010   4        25

1    2011   1         0

1    2011   2        100

1    2011   3         0

1    2011   4        200

1   2012   1          0

1   2012   2          0

1    2012  3          0

1    2012   4         0

 

Thanks in advance! 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

If you have to do it yourself without ETS, it's not that lengthy.  First, create a data set with all combinations of ID, YEAR, and QTR:

 

proc sql noprint;

   create table year_list as select distinct year from have;

   create table id_list as select distinct id from have;

   create table qtr_list as select distinct qtr from have;

   create table all_combinations as select * from year_list, id_list, qtr_list;

quit;

 

Then sort, merge, fill in the zeros:

 

proc sort data=all_combinations;

   by id year qtr;

run;

 

proc sort data=have;

   by id year qtr;

run;

 

data want;

   merge all_combinations have (in=have_data);

   by id year qtr;

   if have_data=0 then spent=0;

run;

View solution in original post

19 REPLIES 19
Reeza
Super User

If you have SAS/ETS look at PROC TIMESERIES

endofline
Obsidian | Level 7

I believe that I have proc timeseries, but have never used it. How would you use it to create what I need?

Reeza
Super User

You specify the time interval, the start and end period and it fills it in. 

Make sure to include your BY statement as well.

 

Here's a fully worked example with fake data:

 

 

/*this is an example of time series data. 
1. Create a time series data set with missing intervals (IBM)
2. Add back missing entries using PROC TIMESERIES (IBM_NO_MISSING)
3. Calculate moving average - 12 month average
*/

/*1*/
data ibm;
	set sashelp.stocks;
	where stock='IBM';

	if month(date)=7 then
		delete;
run;

proc sort data=ibm;
	by date;
run;

/*2*/
proc timeseries data=ibm out=ibm_no_missing;
	id date interval=month start='01Aug1986'd end='01Dec2005'd;
	var open;
run;

 

endofline
Obsidian | Level 7

For the example you posted where it has 

 

start='01Aug1986'd end='01Dec2005'd;

 

Is this the overall global start and end date?

 

The way I get the data is that I have a 3 year bubble around each ID as they enter into my data. However, not all of them are entering in at the same time, some might enter in 2010 and others in 2013. 

Reeza
Super User

Ah...then this method won't work, you'll need to use the data step approach. If you have a  data set that defines the bubbles it's relatively straightforward to loop the dates to get an empty table with all the records and then merge it in with the original table to get the results.

endofline
Obsidian | Level 7

OK, how might you go about that then? I was thinking about something with loops but could not think of a solution. 

Reeza
Super User

What data do you have? If all you have is what you've shown above, how do you know what the start and end of the loop will be?

 

endofline
Obsidian | Level 7

I have the variables I showed, ID, Yr, Qtr and Spent plus the start date of when they entered so:

 

 

ID   Year    Qtr   Spent  Date of Entry

1    2010   1       50         Jan 2011

1    2010   2      100       Jan 2011

1    2010   4        25      Jan 2011

1    2011   2        100   Jan 2011

1    2011   4        200    Jan 2011

 

 

So I have a year of lookback, the year during and the year after the entry date. 

Astounding
PROC Star

If you have to do it yourself without ETS, it's not that lengthy.  First, create a data set with all combinations of ID, YEAR, and QTR:

 

proc sql noprint;

   create table year_list as select distinct year from have;

   create table id_list as select distinct id from have;

   create table qtr_list as select distinct qtr from have;

   create table all_combinations as select * from year_list, id_list, qtr_list;

quit;

 

Then sort, merge, fill in the zeros:

 

proc sort data=all_combinations;

   by id year qtr;

run;

 

proc sort data=have;

   by id year qtr;

run;

 

data want;

   merge all_combinations have (in=have_data);

   by id year qtr;

   if have_data=0 then spent=0;

run;

endofline
Obsidian | Level 7

So this will work but it does still have some issues. My data goes from 2009 to 2015. but not all IDs are coming in during the same time frame. Some are coming in 2010 and others in 2013 and I just want to be able to capture the 3 year bubble around them. I tried this solution but it gives me the full 2009 to 2015 time frame around all of the IDs. Any thoughts on how to get it to just be around the 3 year bubble? 

Reeza
Super User

@endofline wrote:

 Any thoughts on how to get it to just be around the 3 year bubble? 


What's a 3 year bubble? As long as you can define it, you're fine, so far you haven't offered a definition. 

endofline
Obsidian | Level 7

ID   Year    Qtr   Spent  Date of Entry

1    2010   1       50         Jan 2011

1    2010   2      100       Jan 2011

1    2010   4        25      Jan 2011

1    2011   2        100   Jan 2011

1    2011   4        200    Jan 2011

 

 

So for the 3 year bubble I have a year of lookback, the year during and the year after the entry date.And I have this for each observation, with each obersvation having a different date of entry.  

Astounding
PROC Star

You'll have to describe the date ranges a little more.  (And of course the program gets more complex.) 

 

Could a company begin in Q3 of one year, and end in Q2 of another year?  How do you know when the window begins and ends for each company?  Does a window have to be exactly 12 quarters, or does it depend on the data?

 

I'll be able to compare my questions to your (just-posted) description later tonight.

endofline
Obsidian | Level 7

I know when they enter in the data set for sure by the date entered variable and that we're for sure suppose to be getting 1 year of lookback data from that entry date, the year during and the year after.

 

It does happen that someone does enter in the dataset in say April or October. However, it should be 12 qtrs. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 19 replies
  • 1937 views
  • 6 likes
  • 6 in conversation