Lapis Lazuli | Level 10

missing data

Hi All

How could I get dataset WANT from HAVE?  Thank you!

data have;
input id $ count var1;
cards;
aa 1 20
aa 2 .
aa 3 30
bb 1 10
bb 2 .
bb 3 .
bb 4 20
cc 1 10
cc 2 .
cc 3 30
cc 4 .
cc 5 50
;

data want;

id count var1
aa 1 20
aa 2 25
aa 3 30
bb 1 10
bb 2 15
bb 3 15
bb 4 20
cc 1 10
cc 2 20
cc 3 30
cc 4 40
cc 5 50

Accepted Solutions
Opal | Level 21

Re: missing data

A data step solution:

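```
data have;
input id $ count var1;
cards;
aa 1 20
aa 2 .
aa 3 30
bb 1 10
bb 2 .
bb 3 .
bb 4 20
cc 1 10
cc 2 .
cc 3 30
cc 4 .
cc 5 50
;

/* Pass 1 (ascending count): for each missing var1, store the nearest
   earlier non-missing value within the same id in prevVar1 */
data want0;
retain lastVar1;
set have; by id;
if first.id then call missing(lastVar1);
if missing(var1) then prevVar1 = lastVar1;
else lastVar1 = var1;
drop lastVar1;
run;

/* Reverse the order within each id so the next pass sees later rows first */
proc sort data=want0; by id descending count; run;

/* Pass 2 (descending count): lastVar1 now holds the nearest later
   non-missing value; fill each gap with the mean of both neighbours */
data want;
retain lastVar1;
set want0; by id;
if first.id then call missing(lastVar1);
if missing(var1) then var1 = mean(prevVar1, lastVar1);
else lastVar1 = var1;
drop lastVar1 prevVar1;
run;

/* Restore the original order */
proc sort data=want; by id count; run;

proc print data=want noobs; run;
```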
PG
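
One way to confirm that the result matches the requested output is to key in the expected values and compare the two tables. This is only a sketch: the dataset name `expected` is made up for illustration, and it assumes `want` was just created by the code above and is sorted by id and count.

```sas
/* Hypothetical check table holding the values requested in the question */
data expected;
input id $ count var1;
cards;
aa 1 20
aa 2 25
aa 3 30
bb 1 10
bb 2 15
bb 3 15
bb 4 20
cc 1 10
cc 2 20
cc 3 30
cc 4 40
cc 5 50
;

/* Compare observation by observation; any differing values are listed in the report */
proc compare base=expected compare=want;
run;
```

If the two tables agree, the PROC COMPARE report should show no unequal values.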
Onyx | Level 15

Re: missing data

Long time no see! I suppose a data step solution would be a lot more efficient and elegant, but I can't figure one out right now, so here is the ugly version.

```
data have;
input id $ count var1;
cards;
aa 1 20
aa 2 .
aa 3 30
bb 1 10
bb 2 .
bb 3 .
bb 4 20
cc 1 10
cc 2 .
cc 3 30
cc 4 .
cc 5 50
;

proc sql;
create table want as
select id, count,
       case when not missing(var1) then var1
            /* mean of the nearest non-missing values before and after this row */
            else mean(
                (select var1 from have
                 where id=a.id and count < a.count and not missing(var1)
                 having count=max(count)),
                (select var1 from have
                 where id=a.id and count > a.count and not missing(var1)
                 having count=min(count)))
       end as var1
from have a
;
quit;
```
Opal | Level 21

Re: missing data

Also, long time, no see. Missed you!

You appear to want to use two different methods for the 2nd and 3rd IDs. Is that really what you want, or something else?

If you want to use the same method, the following seems to come close:

PROC STDIZE data=have out=want method=mean missing=midrange reponly;
var var1;
by id;
run;

Art, CEO, AnalystFinder.com
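
To see where the midrange approach differs from the requested output, the per-id midrange can be computed directly. This is a rough sketch that assumes `missing=midrange` fills each gap with (min + max) / 2 of the non-missing values in the BY group; the dataset and variable names `midrange_by_id`, `min_var1`, `max_var1`, and `midrange` are invented for illustration.

```sas
/* Minimum and maximum of the non-missing var1 values for each id */
proc means data=have noprint nway;
class id;
var var1;
output out=midrange_by_id(drop=_type_ _freq_) min=min_var1 max=max_var1;
run;

/* Midrange = halfway between the group minimum and maximum */
data midrange_by_id;
set midrange_by_id;
midrange = (min_var1 + max_var1) / 2;
run;

proc print data=midrange_by_id noobs; run;
```

For this data the midranges work out to 25 for aa, 15 for bb, and 30 for cc. That matches the requested values for aa and bb, but for cc both missing rows would get 30, where the requested output has 20 and 40.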

Lapis Lazuli | Level 10

Re: missing data

Hi Haikuo, Art, and PG,

Thank you very much for your great help!! I miss you guys too.

Best wishes!

Linlin
