- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
hi everybody,
I know that every time you find me the right solution, so I do not delay to contact you with my best regards.
I have a dataset as:
ID P NC MG MP
M1 1 1 . .
M1 1 2 2.3 .
M1 1 3 2.6 3.2
M1 1 4 3.5 4.1
M1 2 1 . 1.2
M1 2 2 . 2.3
M1 2 3 4.3 3.5
M2 1 1 . .
M2 1 2 . .
M2 1 3 3.5 3.2
M2 1 4 4.1 3.3
M2 1 5 4.0 3.3
M2 2 1 . .
M2 2 2 3.6 4.1
M2 2 3 3.4 4.0
M2 2 4 3.4 3.9
M2 3 1 4.2 4.2
M2 3 2 4.3 4.2
M2 3 3 4.2 4.1
M3 2 1 4.6 .
M3 2 2 4.8 .
M3 2 3 4.7 .
M3 2 4 4.3 .
ID represents an identifier,
P = an exercise number, can be 1, 2, 3 or 4.
NC: an order number of each test for the same ID and the same exercise number.
MG and MP are observations.
What I want is to replace the missing values of MG or MP with the next value if it exists or even with the next value after if the next value is also missing. Nothing changes if all the observations during the same exercise are missing, either for MG or MP.
Note that missing values If they exist are only observed in the two first observations for each ID at each P.
The desired result is as follows:
ID P NC MG MP
M1 1 1 2.3 3.2
M1 1 2 2.3 3.2
M1 1 3 2.6 3.2
M1 1 4 3.5 4.1
M1 2 1 4.3 1.2
M1 2 2 4.3 2.3
M1 2 3 4.3 3.5
M2 1 1 3.5 3.2
M2 1 2 3.5 3.2
M2 1 3 3.5 3.2
M2 1 4 4.1 3.3
M2 1 5 4.0 3.3
M2 2 1 3.6 4.1
M2 2 2 3.6 4.1
M2 2 3 3.4 4.0
M2 2 4 3.4 3.9
M2 3 1 4.2 4.2
M2 3 2 4.3 4.2
M2 3 3 4.2 4.1
M3 2 1 4.6 .
M3 2 2 4.8 .
M3 2 3 4.7 .
M3 2 4 4.3 .
I count on your help, thank you.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Easiest would probably be to sort the data set in reverse order, and bring forward the most recent nonmissing value. For example:
proc sort data=have;
by id p descending nc;
run;
data want;
set have;
by id p;
retain new_mg new_mp;
if first.p then do;
new_mg=mg;
new_mp=mp;
end;
if mg > . then new_mg = mg;
else mg = new_mg;
if mp > . then new_mp = mp;
else mp = new_mp;
drop new_mg new_mp;
run;
You can always put the observations back into their original order:
proc sort data=want;
by id p nc;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Easiest would probably be to sort the data set in reverse order, and bring forward the most recent nonmissing value. For example:
proc sort data=have;
by id p descending nc;
run;
data want;
set have;
by id p;
retain new_mg new_mp;
if first.p then do;
new_mg=mg;
new_mp=mp;
end;
if mg > . then new_mg = mg;
else mg = new_mg;
if mp > . then new_mp = mp;
else mp = new_mp;
drop new_mg new_mp;
run;
You can always put the observations back into their original order:
proc sort data=want;
by id p nc;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you very much, it works very well.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have; input ID $ P NC MG MP; cards; M1 1 1 . . M1 1 2 2.3 . M1 1 3 2.6 3.2 M1 1 4 3.5 4.1 M1 2 1 . 1.2 M1 2 2 . 2.3 M1 2 3 4.3 3.5 M2 1 1 . . M2 1 2 . . M2 1 3 3.5 3.2 M2 1 4 4.1 3.3 M2 1 5 4.0 3.3 M2 2 1 . . M2 2 2 3.6 4.1 M2 2 3 3.4 4.0 M2 2 4 3.4 3.9 M2 3 1 4.2 4.2 M2 3 2 4.3 4.2 M2 3 3 4.2 4.1 M3 2 1 4.6 . M3 2 2 4.8 . M3 2 3 4.7 . M3 2 4 4.3 . ; run; proc sort data=have; by ID P NC; run; data temp; set have; by ID P ; retain _MG _MP; if first.P then call missing(_MG,_MP); if not missing(MG) then _MG=MG; if not missing(MP) then _MP=MP; drop MG MP; run; proc sort data=temp; by ID P descending NC; run; data want; set temp; by ID P ; retain MG MP; if first.P then call missing(MG,MP); if not missing(_MG) then MG=_MG; if not missing(_MP) then MP=_MP; drop _:; run; proc sort data=want; by ID P NC; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It works well too, thanks to you and to Astounding for your help.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Oops I am late to the part,Just for fun and nothing more:
data _null_;
if _n_=1 then do;
if 0 then set have;
declare hash myhash(multidata:'YES',ordered:'a');
myhash.definekey('id','p');
myhash.definedata('id','p','nc','mg','mp');
myhash.definedone( );
end;
set have end=last;
by id p;
if ((missing(mp) and missing(mg)) or (missing(mp) or missing(mg))) and myhash.check() ne 0 then rc=myhash.add();
else if myhash.check() ne 0 and not missing(mp) and not missing(mg) then rc=myhash.add();
else if myhash.check()=0 and (missing(mp) and missing(mg)) then rc=myhash.add();
else if myhash.check()=0 and ((not missing(mp) and missing(mg)) or (missing(mp) and not missing(mg))) then
do;
_mg=mg;
_mp=mp;
_nc=nc;
rc=myhash.find();
if missing(mg) then mg=_mg;
if missing(mp) then mp=_mp;
myhash.replacedup();
rc=myhash.find_next();
do while(rc=0);
if missing(mg) then mg=_mg;
if missing(mp) then mp=_mp;
myhash.replacedup();
rc=myhash.find_next();
end;
mg=_mg;
mp=_mp;
nc=_nc;
myhash.add();
end;
else if myhash.check()=0 and not missing(mp) and not missing(mg) then do;
_mg=mg;
_mp=mp;
_nc=nc;
rc=myhash.find();
if missing(mg) then mg=_mg;
if missing(mp) then mp=_mp;
myhash.replacedup();
rc=myhash.find_next();
do while(rc=0);
if missing(mg) then mg=_mg;
if missing(mp) then mp=_mp;
myhash.replacedup();
rc=myhash.find_next();
end;
mg=_mg;
mp=_mp;
nc=_nc;
myhash.add();
end;
if last then rc=myhash.output(dataset:'want');
run;
proc sort data=want;
by id p nc;
run;
Regards,
Naveen Srinivasan