Solved: Re: Pulling data from previous observations

AllSoEasy · Posted 06-11-2013 02:05 PM

I have a data set with dates, and a product category, and I have a number of rows in the data set which have dates in the future, and so the product column is blank for those, but i'd like to essentially replicate the product data throughout those rows. i.e. sample of my data looks like this now:

12/31/2012	product1
12/31/2012	product2
12/31/2012	product3
12/31/2012	product4
12/31/2012	product5
12/31/2012	product6
12/31/2012	product7
12/31/2012	product8
12/31/2012	product9
12/31/2012	product10
12/31/2012	product11
12/31/2012	product12
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013

And I would like it to look like this:

12/31/2012	product1
12/31/2012	product2
12/31/2012	product3
12/31/2012	product4
12/31/2012	product5
12/31/2012	product6
12/31/2012	product7
12/31/2012	product8
12/31/2012	product9
12/31/2012	product10
12/31/2012	product11
12/31/2012	product12
01/31/2013	product1
01/31/2013	product2
01/31/2013	product3
01/31/2013	product4
01/31/2013	product5
01/31/2013	product6
01/31/2013	product7
01/31/2013	product8
01/31/2013	product9
01/31/2013	product10
01/31/2013	product11
01/31/2013	product12
02/28/2013	product1
02/28/2013	product2
02/28/2013	product3
02/28/2013	product4
02/28/2013	product5
02/28/2013	product6
02/28/2013	product7
02/28/2013	product8
02/28/2013	product9
02/28/2013	product10
02/28/2013	product11
02/28/2013	product12

etc.

I thought I could use the lag statement, in a way such as this:

data want;

set have;

if missing(prodcat) then prodcat = lag12(prodcat);

run;

Since the product categories repeat every 12 observations, but this data step has absolutely no effect on the data set, if i replace the lag function with just a constant string, I see that string populated throughout the rows -- so why doesn't the lag function put the data in properly?

Thanks!

-Ryan

Astounding · Posted 06-11-2013 02:51 PM

Yes, that's correct. Very tricky ... LAG processes the ORIGINAL value in the "stack", not the final changed value. Here's another approach:

proc sql noprint;

create table dates as select distinct date from have;

create table products as select distinct product from have where product > ' ';

create table want as select * from dates, products order by date, product;

quit;

Good luck.

View solution in original post

Tom · Posted 06-11-2013 02:36 PM

You normally do NOT want to reference LAG() conditionally. It does not "look back", instead it is popping values off a stack generated by the previous EXECUTIONS of the LAG() function.

lag12=lag12(prodcat);

if missing(prodcat) then prodcat = lag12;

AllSoEasy · Posted 06-11-2013 02:44 PM

Thanks so much! However it only partially worked...With your code modification it now populates the following 12 rows (01/31/2013) months, but none after that. I tried a retain statement with no luck...Any suggestions?

Astounding · Posted 06-11-2013 02:51 PM

Yes, that's correct. Very tricky ... LAG processes the ORIGINAL value in the "stack", not the final changed value. Here's another approach:

proc sql noprint;

create table dates as select distinct date from have;

create table products as select distinct product from have where product > ' ';

create table want as select * from dates, products order by date, product;

quit;

Good luck.

AllSoEasy · Posted 06-11-2013 02:59 PM

Thank you so much Astounding! That works perfectly and seems to be a very elegant solution.

Many thanks!!

Tom · Posted 06-11-2013 03:07 PM

It sounds like your actual problem is not related to LAG() at all.

Do you want to assign all possible productions to all possible dates? then the SQL solution proposed looks good.

If you really did want to "cycle" the non-missing values you might be able to use POINT=.

data products;

set have;

where not missing(product);

keep product;

run;

data want;

set have ;

if product = ' ' then do ;

p+1;

if p > nobs then p=1;

set products point=p nobs=nobs;

end;

run;

Reeza · Posted 06-11-2013 03:08 PM

You could always try a cross join instead:

proc sql;

create table want as

select * from (

select distinct date from have

cross join

select distinct product from have)

order by date, product;

quit;

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away