Hi
I have to update missing values in a in both coulmns using value of adjacent coulmn as refernce for search. refer to below example
Coulmn 1 | Coulmn 2 |
---|---|
A | 1 |
1 | |
A | |
B | 2 |
2 | |
B |
As in ROW1 values are available for both coulmn, i want to update missing values in ROW2 & ROW3 , please guide How ?
Look up the "lag()" function and "SAS by processing" (first and last);
assuming that the first record of each by group is never missing i did this coding and still the table does not update.you know why,just curious?
data one;
input Cou1 $ Cou2;
cards;
A 1
. 1
A .
B 2
. 2
B .
;
data two;
set one;
by cou1 notsorted;
lag_cou1=lag(cou1);
lag_cou2=lag(cou2);
if not(first.cou1) then do;
if cou1=' ' then cou1=lag_cou1;
if cou2=. then cou2=lag_cou2; end;
proc print;run;
When you encounter a missing value for COU1, you are also encountering FIRST.COU1=1. So your IF THEN group will not execute.
but cou1 is never missing when first.cou1=1.At least that what the table above shows?
Lag is not appropiate for this problem. But there are easy ways to get the desired result.
Slick! There are also ways get it simply done in one step, inspired by the name you used "coalesced",
data a;
input C1 :$1. C2;
cards;
A 1
. 1
A .
B 2
. 2
B .
;;;;
data b ;
set a;
retain _1 ' ';
retain _2;
c1=coalescec(c1,_1);
c2=coalesce(c2,_2);
_1=c1;_2=c2;
drop _:;
run;
Haikuo
Yes but you to write more code: rename and assignments, etc. The MISSING CHECK feature of the UPDATE is a powerfull alternative applied to all the variables.
Yes, DN! And I have to use Array() if more variables get involved, the code will be increasingly bulky. I was thinking of using Update, but didn't go far enough to introduce a dummy id variable. Another trick I learned from you!
Haikuo
sorry,what does this line below exactly do?
update av(obs=0) av;
Have you looked at the documentation regarding the UPDATE statement?
WRT the line in question, it reads observations from the transaction data using the UPDATE statement. The UPDATE statement requires exactly two data sets as input a master and transaction. This technique uses a dummy master av(obs=0).
the av(obs=0) was actually confusing me.Merging a table with rows with a table with no rows and still achieving the results
It's not merge. The transactions are "overlayed" with the missing check not allowing a missing value to update a nonmissing value.
i thought the reason you created dummyby was to merge the 2 sets
The BY statement is required when using an UPDATE statement so that the, potentially multiple, transactions can be assigned to the proper record in the master. You are allowed to add new records to the MASTER via UPDATE, hence the use of a MASTER table with no observations.
Data _null_'s twist on the standard UPDATE that makes it work for this type of Last Observation Carried Forward application is to add the OUTPUT statement. So rather than collapsing the whole dataset into one row per BY group the data step emits an observation after each record read from the transaction table. So we end up with the same number of observations as we started with.
Missing values in the transaction table do NOT overwrite the values already in the variables, hence the previous value is "carried forward".
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.