Hi, I need to complete missing values from a column with the previous value if there is any and by group. I'll try to explain it:
Obs. Price Name
1 3.5 A
2 3.6 A
3 . A
4 3.2 A
5 . A
6 . A
7 3.4 A
8 . B
9 3.6 B
10 . B
11 3.2 B
12 . B
13 . B
After the process, the table should be:
Obs. Price Name
1 3.5 A
2 3.6 A
3 3.6 A
4 3.2 A
5 3.2 A
6 3.2 A
7 3.4 A
8 . B
9 3.6 B
10 3.6 B
11 3.2 B
12 3.2 B
13 3.2 B
Obs. N° 8 must be still missing because for its group B, there is no previous value to copy.
This is my code:
DATA PRICES;
SET PRICESMISSING;
BY NAME;
DO;
IF PRICE <> . THEN A=PRICE;
IF MISSING(PRICE) THEN PRICE=LAG(A);
END;
RUN;
My idea is store a aux variable to save no missing values from price and when the process find a missing value use the lag value from the aux variable but when I run it, I got the same column without any replace.
Could you help me? Any kind of suggestion is welcome! Thank you!
data have;
input obs price name $;
cards;
1 3.5 A
2 3.6 A
3 . A
4 3.2 A
5 . A
6 . A
7 3.4 A
8 . B
9 3.6 B
10 . B
11 3.2 B
12 . B
13 . B
;
data want(rename=(new=price));
retain new;
set have;
by name notsorted;
new=ifn((first.name=0 and price ne .),price,new);
if first.name then new=price;
drop price obs;
proc print;run;
Hi, the use of LAG should be "unconditional" (in other words, using LAG inside an IF is not recommended, per this Tech Support note 24665 - Use the LAG function to conditionally carry information down a data set ) So I would recommend that you look at the note and use the technique in the program (the technique that creates variable C) ...you are using a technique such as the one that creates variable A (the WRONG result) in the example program.
cynthia
LAG is not the proper tool for step interpolation or as it is often commonly referred to as LOCF.
data have;
input obs price name $;
cards;
1 3.5 A
2 3.6 A
3 . A
4 3.2 A
5 . A
6 . A
7 3.4 A
8 . B
9 3.6 B
10 . B
11 3.2 B
12 . B
13 . B
;
data want(rename=(new=price));
retain new;
set have;
by name notsorted;
new=ifn((first.name=0 and price ne .),price,new);
if first.name then new=price;
drop price obs;
proc print;run;
Thank you very much!! I almost cry when it worked!
data want(rename=(new=price));
retain new;
set have;
by name notsorted;
if first.name then new=price;
else new=coalesce(price, new);
drop price;
run;
It's really the same approach as Linlin but I dislike multiple ifs with similar conditionnal arrangements so I hid "and price ne ." into the coalesce function. Slight improvement to code readability.
As Cynthia pointed out, LAG is a pile or queue and not an actual previous row addressing so conditioning on it is nearly guaranteed to fail your desired logic. You could use an array _temporary_ to store previous over data step iterations to achieve a logic similar to the one you've coded in your original post.
Vince
Edit @Data _Null_ that's neat. I didn't know of an update statement for data step.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.