DATA Step, Macro, Functions and more

Pulling data from previous observations

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 105
Accepted Solution

Pulling data from previous observations

I have a data set with dates, and a product category, and I have a number of rows in the data set which have dates in the future, and so the product column is blank for those, but i'd like to essentially replicate the product data throughout those rows. i.e. sample of my data looks like this now:

12/31/2012product1
12/31/2012product2
12/31/2012product3
12/31/2012product4
12/31/2012product5
12/31/2012product6
12/31/2012product7
12/31/2012product8
12/31/2012product9
12/31/2012product10
12/31/2012product11
12/31/2012product12
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
01/31/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013
02/28/2013

And I would like it to look like this:

12/31/2012product1
12/31/2012product2
12/31/2012product3
12/31/2012product4
12/31/2012product5
12/31/2012product6
12/31/2012product7
12/31/2012product8
12/31/2012product9
12/31/2012product10
12/31/2012product11
12/31/2012product12
01/31/2013product1
01/31/2013product2
01/31/2013product3
01/31/2013product4
01/31/2013product5
01/31/2013product6
01/31/2013product7
01/31/2013product8
01/31/2013product9
01/31/2013product10
01/31/2013product11
01/31/2013product12
02/28/2013product1
02/28/2013product2
02/28/2013product3
02/28/2013product4
02/28/2013product5
02/28/2013product6
02/28/2013product7
02/28/2013product8
02/28/2013product9
02/28/2013product10
02/28/2013product11
02/28/2013product12

etc.

I thought I could use the lag statement, in a way such as this:

data want;

set have;

if missing(prodcat) then prodcat = lag12(prodcat);

run;

Since the product categories repeat every 12 observations, but this data step has absolutely no effect on the data set, if i replace the lag function with just a constant string, I see that string populated throughout the rows -- so why doesn't the lag function put the data in properly?

Thanks!

-Ryan


Accepted Solutions
Solution
‎06-11-2013 02:51 PM
Super User
Posts: 5,083

Re: Pulling data from previous observations

Yes, that's correct.  Very tricky ... LAG processes the ORIGINAL value in the "stack", not the final changed value.  Here's another approach:

proc sql noprint;

   create table dates as select distinct date from have;

   create table products as select distinct product from have where product > ' ';

   create table want as select * from dates, products order by date, product;

quit;

Good luck.

View solution in original post


All Replies
Super User
Super User
Posts: 6,500

Re: Pulling data from previous observations

You normally do NOT want to reference LAG() conditionally.  It does not "look back", instead it is popping values off a stack generated by the previous EXECUTIONS of the LAG() function.

lag12=lag12(prodcat);

if missing(prodcat) then prodcat = lag12;

Frequent Contributor
Posts: 105

Re: Pulling data from previous observations

Thanks so much! However it only partially worked...With your code modification it now populates the following 12 rows (01/31/2013) months, but none after that. I tried a retain statement with no luck...Any suggestions?

Solution
‎06-11-2013 02:51 PM
Super User
Posts: 5,083

Re: Pulling data from previous observations

Yes, that's correct.  Very tricky ... LAG processes the ORIGINAL value in the "stack", not the final changed value.  Here's another approach:

proc sql noprint;

   create table dates as select distinct date from have;

   create table products as select distinct product from have where product > ' ';

   create table want as select * from dates, products order by date, product;

quit;

Good luck.

Frequent Contributor
Posts: 105

Re: Pulling data from previous observations

Thank you so much Astounding! That works perfectly and seems to be a very elegant solution.

Many thanks!!

Super User
Super User
Posts: 6,500

Re: Pulling data from previous observations

It sounds like your actual problem is not related to LAG() at all.

Do you want to assign all possible productions to all possible dates? then the SQL solution proposed looks good.

If you really did want to "cycle" the non-missing values you might be able to use POINT=.

data products;

  set have;

  where not missing(product);

  keep product;

run;

data want;

  set have ;

  if product = ' ' then do ;

    p+1;

    if p > nobs then p=1;

    set products point=p nobs=nobs;

end;

run;

Super User
Posts: 17,829

Re: Pulling data from previous observations

You could always try a cross join instead:

proc sql;

create table want as

select * from (

select distinct date from have

cross join

select distinct product from have)

order by date, product;

quit;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 238 views
  • 3 likes
  • 4 in conversation