Hello everyone,
In the following table I want to fill the missing value of MSA with non-missing values of those observations whose CBSA values are equal, so that the column MSA_filled be created. I tried retain but it didn't work. Do I need to use coalesce?
I appreciate your suggestions.
CBSA | MSA | MSA_filled |
---|---|---|
111 | . | 45 |
111 | 45 | 45 |
111 | . | 45 |
111 | . | 45 |
222 | 67 | 67 |
222 | . | 67 |
333 | 89 | 89 |
333 | . | 89 |
333 | 89 | 89 |
555 | 88 | 88 |
555 | 88 | 88 |
555 | . | 88 |
555 | . | 88 |
How about :
data have; input CBSA MSA ; cards; 111 . 111 45 111 . 111 . 222 67 222 . 333 89 333 . 333 89 555 88 555 88 555 . 555 . ; run; data want; retain MSA_filled _CBSA; do until(not missing(MSA) or last.CBSA); set have; by CBSA; end; if CBSA ne _CBSA then call missing(MSA_filled); if not missing(MSA) then MSA_filled=MSA ; do until(not missing(MSA) or last.CBSA); set have; by CBSA; output; end; _CBSA=CBSA; drop _CBSA; run;
Xia Keshan
Retain won't work on as there is already a field.
Sort descending on MSA, create a new variable that is retained and fill down from there.
The problem is that if there isn't any non-missing value for a specific CBSA then the cell will be filled with the previous values, but I want it to be missing, like the table below.
CBSA | MSA | MSA_filled |
333 | 89 | 89 |
333 | . | 89 |
333 | 89 | 89 |
555 | 88 | 88 |
555 | 88 | 88 |
555 | . | 88 |
555 | . | 88 |
222 | . | . |
222 | . | . |
111 | . | 45 |
111 | 45 | 45 |
111 | . | 45 |
111 | . | 45 |
Use By Group processing on CBSA - if first.CBSA then set it to missing
There are plenty of ways to achieve this. However, the one that seems most intuitive to me is the following:
*Create a master list;
data Master; set Table;
if missing(MSA) then delete;
run;
*Make sure Master has just one entry for each CBSA;
proc sort data=Master nodupkey; by CBSA; run;
*Create new table as needed;
proc sql;
create table Table2 as
select t1.*, t2.MSA as MSA_filled
from Table as t1
left join Master as t2
on t1.CBSA = t2.CBSA
;quit;
How about :
data have; input CBSA MSA ; cards; 111 . 111 45 111 . 111 . 222 67 222 . 333 89 333 . 333 89 555 88 555 88 555 . 555 . ; run; data want; retain MSA_filled _CBSA; do until(not missing(MSA) or last.CBSA); set have; by CBSA; end; if CBSA ne _CBSA then call missing(MSA_filled); if not missing(MSA) then MSA_filled=MSA ; do until(not missing(MSA) or last.CBSA); set have; by CBSA; output; end; _CBSA=CBSA; drop _CBSA; run;
Xia Keshan
Actually IMHO the simplest approach would be like this,
proc sql;
create table want as
select cbsa, max(msa,max(msa)) /*min(msa)*/ as msa
from have
group by cbsa;
quit;
min or max doesn't matter since they both ignore missing values.
So, what's the point of using retain?
If there is only one unique MSA for each group . You can do this .
data have;
input CBSA MSA ;
cards;
111 .
111 45
111 .
111 .
222 67
222 .
333 89
333 .
333 89
555 88
555 88
555 .
555 .
;
run;
data want;
merge have have(keep=CBSA MSA rename=(MSA=MSA_filled) where=(MSA_filled is not missing));
by CBSA;
run;
I don't quite understand your question. 'RETAIN' is a concept for Date Step, where data is processed Sequentially. SQL does NOT use/need 'RETAIN'. As you may wonder in my code: max(msa, max(msa)), the second max() is the summary function to get the max of the group, it can be replaced by min(), which is also a summary function. The first max() is not a summary function, it is to compare msa of certain row with the group max, it can also be replaced by min() and coalesce().
Thank you. Actually I wanted to know what's the main difference between SQL code and RETAIN. It seems that in my case both methods work.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.