Help using Base SAS procedures

Creating observations for missing values

Accepted Solution Solved
Reply
Contributor
Posts: 27
Accepted Solution

Creating observations for missing values

Hi All, 

 

I'm working with some price data that is summed up by the year and the fiscal quarter. However, not all observations have purchases in all quarters/years. I want to make those missing quarters/year appear as a 0. I have a 3 year bubble around each of my observations where I have data on what they spent.

 

Currently this is how my data looks 

 

 

ID   Year    Qtr   Spent

1    2010   1       50

1    2010   2      100 

1    2010   4        25

1    2011   2        100

1    2011   4        200

 

 

What I want is:

 

ID   Year    Qtr   Spent

1    2010   1       50

1    2010   2      100 

1    2010   3        0

1    2010   4        25

1    2011   1         0

1    2011   2        100

1    2011   3         0

1    2011   4        200

1   2012   1          0

1   2012   2          0

1    2012  3          0

1    2012   4         0

 

Thanks in advance! 

 

 

 


Accepted Solutions
Solution
4 weeks ago
Super User
Posts: 5,258

Re: Creating observations for missing values

If you have to do it yourself without ETS, it's not that lengthy.  First, create a data set with all combinations of ID, YEAR, and QTR:

 

proc sql noprint;

   create table year_list as select distinct year from have;

   create table id_list as select distinct id from have;

   create table qtr_list as select distinct qtr from have;

   create table all_combinations as select * from year_list, id_list, qtr_list;

quit;

 

Then sort, merge, fill in the zeros:

 

proc sort data=all_combinations;

   by id year qtr;

run;

 

proc sort data=have;

   by id year qtr;

run;

 

data want;

   merge all_combinations have (in=have_data);

   by id year qtr;

   if have_data=0 then spent=0;

run;

View solution in original post


All Replies
Super User
Posts: 18,588

Re: Creating observations for missing values

If you have SAS/ETS look at PROC TIMESERIES

Contributor
Posts: 27

Re: Creating observations for missing values

I believe that I have proc timeseries, but have never used it. How would you use it to create what I need?

Super User
Posts: 18,588

Re: Creating observations for missing values

You specify the time interval, the start and end period and it fills it in. 

Make sure to include your BY statement as well.

 

Here's a fully worked example with fake data:

 

 

/*this is an example of time series data. 
1. Create a time series data set with missing intervals (IBM)
2. Add back missing entries using PROC TIMESERIES (IBM_NO_MISSING)
3. Calculate moving average - 12 month average
*/

/*1*/
data ibm;
	set sashelp.stocks;
	where stock='IBM';

	if month(date)=7 then
		delete;
run;

proc sort data=ibm;
	by date;
run;

/*2*/
proc timeseries data=ibm out=ibm_no_missing;
	id date interval=month start='01Aug1986'd end='01Dec2005'd;
	var open;
run;

 

Contributor
Posts: 27

Re: Creating observations for missing values

For the example you posted where it has 

 

start='01Aug1986'd end='01Dec2005'd;

 

Is this the overall global start and end date?

 

The way I get the data is that I have a 3 year bubble around each ID as they enter into my data. However, not all of them are entering in at the same time, some might enter in 2010 and others in 2013. 

Super User
Posts: 18,588

Re: Creating observations for missing values

Ah...then this method won't work, you'll need to use the data step approach. If you have a  data set that defines the bubbles it's relatively straightforward to loop the dates to get an empty table with all the records and then merge it in with the original table to get the results.

Contributor
Posts: 27

Re: Creating observations for missing values

OK, how might you go about that then? I was thinking about something with loops but could not think of a solution. 

Super User
Posts: 18,588

Re: Creating observations for missing values

What data do you have? If all you have is what you've shown above, how do you know what the start and end of the loop will be?

 

Contributor
Posts: 27

Re: Creating observations for missing values

I have the variables I showed, ID, Yr, Qtr and Spent plus the start date of when they entered so:

 

 

ID   Year    Qtr   Spent  Date of Entry

1    2010   1       50         Jan 2011

1    2010   2      100       Jan 2011

1    2010   4        25      Jan 2011

1    2011   2        100   Jan 2011

1    2011   4        200    Jan 2011

 

 

So I have a year of lookback, the year during and the year after the entry date. 

Solution
4 weeks ago
Super User
Posts: 5,258

Re: Creating observations for missing values

If you have to do it yourself without ETS, it's not that lengthy.  First, create a data set with all combinations of ID, YEAR, and QTR:

 

proc sql noprint;

   create table year_list as select distinct year from have;

   create table id_list as select distinct id from have;

   create table qtr_list as select distinct qtr from have;

   create table all_combinations as select * from year_list, id_list, qtr_list;

quit;

 

Then sort, merge, fill in the zeros:

 

proc sort data=all_combinations;

   by id year qtr;

run;

 

proc sort data=have;

   by id year qtr;

run;

 

data want;

   merge all_combinations have (in=have_data);

   by id year qtr;

   if have_data=0 then spent=0;

run;

Contributor
Posts: 27

Re: Creating observations for missing values

So this will work but it does still have some issues. My data goes from 2009 to 2015. but not all IDs are coming in during the same time frame. Some are coming in 2010 and others in 2013 and I just want to be able to capture the 3 year bubble around them. I tried this solution but it gives me the full 2009 to 2015 time frame around all of the IDs. Any thoughts on how to get it to just be around the 3 year bubble? 

Super User
Posts: 18,588

Re: Creating observations for missing values


endofline wrote:

 Any thoughts on how to get it to just be around the 3 year bubble? 


What's a 3 year bubble? As long as you can define it, you're fine, so far you haven't offered a definition. 

Contributor
Posts: 27

Re: Creating observations for missing values

ID   Year    Qtr   Spent  Date of Entry

1    2010   1       50         Jan 2011

1    2010   2      100       Jan 2011

1    2010   4        25      Jan 2011

1    2011   2        100   Jan 2011

1    2011   4        200    Jan 2011

 

 

So for the 3 year bubble I have a year of lookback, the year during and the year after the entry date.And I have this for each observation, with each obersvation having a different date of entry.  

Super User
Posts: 5,258

Re: Creating observations for missing values

[ Edited ]

You'll have to describe the date ranges a little more.  (And of course the program gets more complex.) 

 

Could a company begin in Q3 of one year, and end in Q2 of another year?  How do you know when the window begins and ends for each company?  Does a window have to be exactly 12 quarters, or does it depend on the data?

 

I'll be able to compare my questions to your (just-posted) description later tonight.

Contributor
Posts: 27

Re: Creating observations for missing values

I know when they enter in the data set for sure by the date entered variable and that we're for sure suppose to be getting 1 year of lookback data from that entry date, the year during and the year after.

 

It does happen that someone does enter in the dataset in say April or October. However, it should be 12 qtrs. 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 19 replies
  • 277 views
  • 6 likes
  • 6 in conversation