# missing data

Hi All

How could I get dataset WANT from HAVE?  Thank you!

data have;
input id $ count var1;
cards;
aa 1 20
aa 2 .
aa 3 30
bb 1 10
bb 2 .
bb 3 .
bb 4 20
cc 1 10
cc 2 .
cc 3 30
cc 4 .
cc 5 50
;

data want;

id count var1
aa 1 20
aa 2 25
aa 3 30
bb 1 10
bb 2 15
bb 3 15
bb 4 20
cc 1 10
cc 2 20
cc 3 30
cc 4 40
cc 5 50

Accepted Solutions
Solution
‎11-04-2015 03:35 PM

## Re: missing data

A data step solution:

data have;
input id $ count var1;
cards;
aa 1 20
aa 2 .
aa 3 30
bb 1 10
bb 2 .
bb 3 .
bb 4 20
cc 1 10
cc 2 .
cc 3 30
cc 4 .
cc 5 50
;

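/* Pass 1 (ascending count): for each missing var1, store the most recent
   non-missing value within the same id in prevVar1. */
data want0;
    retain lastVar1;
    set have; by id;
    if first.id then call missing(lastVar1);
    if missing(var1) then prevVar1 = lastVar1;
    else lastVar1 = var1;
    drop lastVar1;
run;

/* Reverse the order within each id so the next pass meets later rows first. */
proc sort data=want0; by id descending count; run;

/* Pass 2 (descending count): lastVar1 now holds the next non-missing value,
   so each gap becomes the mean of its nearest neighbours on either side.
   MEAN ignores missing arguments, so a gap at the start or end of an id
   takes the single neighbour that exists. */
data want;
    retain lastVar1;
    set want0; by id;
    if first.id then call missing(lastVar1);
    if missing(var1) then var1 = mean(prevVar1, lastVar1);
    else lastVar1 = var1;
    drop lastVar1 prevVar1;
run;

/* Restore the original order. */
proc sort data=want; by id count; run;

proc print data=want noobs; run;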
PG

All Replies

## Re: missing data

Long time no see! I suppose a DATA step solution would be a lot more efficient and elegant, but I can't figure one out right now, so here is an ugly PROC SQL version.

data have;
input id $ count var1;
cards;
aa 1 20
aa 2 .
aa 3 30
bb 1 10
bb 2 .
bb 3 .
bb 4 20
cc 1 10
cc 2 .
cc 3 30
cc 4 .
cc 5 50
;

proc sql;
create table want as
select id, count,
       case when not missing(var1) then var1
            else mean(
                 /* nearest non-missing value before this row */
                 (select var1 from have where id=a.id and count < a.count
                     and not missing(var1) having count=max(count)),
                 /* nearest non-missing value after this row */
                 (select var1 from have where id=a.id and count > a.count
                     and not missing(var1) having count=min(count)))
       end as var1
from have a;
quit;

## Re: missing data

Also, long time, no see. Missed you!

You appear to want to use two different methods for the 2nd and 3rd IDs. Is that really what you want, or something else?

If you want to use the same method, the following seems to come close:

PROC STDIZE data=have out=want method=mean missing=midrange reponly;
var var1;
by id;
run;

Art, CEO, AnalystFinder.com
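
For reference, as far as I understand the option, `MISSING=MIDRANGE` combined with `REPONLY` fills every gap in a BY group with that group's midrange, (min + max) / 2, which is why the PROC STDIZE result only comes close to WANT. A minimal sketch, assuming the HAVE dataset from the question, that computes the per-id midrange directly (the column names `min_var1`, `max_var1`, and `midrange` are illustrative, not from the thread):

```sas
/* Per-id midrange of the non-missing var1 values: (min + max) / 2. */
proc sql;
    select id,
           min(var1) as min_var1,
           max(var1) as max_var1,
           (min(var1) + max(var1)) / 2 as midrange
    from have
    group by id;
quit;
```

For id cc the midrange is (10 + 50) / 2 = 30, so both missing rows would receive 30, whereas WANT has 20 and 40. For aa and bb the midranges (25 and 15) match WANT.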


## Re: missing data


Hi Haikuo, Art, and PG,

Thank you very much for your great help!! I miss you guys too.

Best wishes!

Linlin

🔒 This topic is solved and locked.