BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
soumri
Quartz | Level 8

hi everybody,

I know that every time you find me the right solution, so I do not delay to contact you with my best regards.

I have a dataset as:

 

ID     P  NC MG MP

M1    1  1    .       .

M1    1  2    2.3   .

M1    1  3    2.6   3.2

M1    1  4    3.5   4.1

M1    2  1    .       1.2

M1    2  2    .       2.3

M1    2  3    4.3   3.5

M2    1  1    .       .

M2    1  2    .       .

M2    1  3    3.5   3.2

M2    1  4    4.1   3.3

M2    1  5    4.0   3.3

M2    2  1    .       .

M2    2  2    3.6   4.1

M2    2  3    3.4   4.0

M2    2  4    3.4   3.9

M2    3  1    4.2   4.2

M2    3  2    4.3   4.2

M2    3  3    4.2   4.1

M3    2  1    4.6   .

M3    2  2    4.8   .

M3    2  3    4.7   .

M3    2  4    4.3   .

ID represents an identifier,
P = an exercise number, can be 1, 2, 3 or 4.
NC: an order number of each test for the same ID and the same exercise number.
MG and MP are observations.
What I want is to replace the missing values of MG or MP with the next value if it exists or even with the next value after if the next value is also missing. Nothing changes if all the observations during the same exercise are missing, either for MG or MP.

Note that missing values If they exist are only observed in the two first observations for each ID at each P.
The desired result is as follows:

 

ID     P  NC MG MP

M1    1  1    2.3   3.2

M1    1  2    2.3   3.2

M1    1  3    2.6   3.2

M1    1  4    3.5   4.1

M1    2  1    4.3   1.2

M1    2  2    4.3   2.3

M1    2  3    4.3   3.5

M2    1  1    3.5   3.2

M2    1  2    3.5   3.2

M2    1  3    3.5   3.2

M2    1  4    4.1   3.3

M2    1  5    4.0   3.3

M2    2  1    3.6   4.1

M2    2  2    3.6   4.1

M2    2  3    3.4   4.0

M2    2  4    3.4   3.9

M2    3  1    4.2   4.2

M2    3  2    4.3   4.2

M2    3  3    4.2   4.1

M3    2  1    4.6   .

M3    2  2    4.8   .

M3    2  3    4.7   .

M3    2  4    4.3   .

I count on your help, thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Easiest would probably be to sort the data set in reverse order, and bring forward the most recent nonmissing value.  For example:

 

proc sort data=have;

by id p descending nc;

run;

 

data want;

set have;

by id p;

retain new_mg new_mp;

if first.p then do;

   new_mg=mg;

   new_mp=mp;

end;

if mg > . then new_mg = mg;

else mg = new_mg;

if mp > . then new_mp = mp;

else mp = new_mp;

drop new_mg new_mp;

run;

 

You can always put the observations back into their original order:

 

proc sort data=want;

by id p nc;

run;

View solution in original post

5 REPLIES 5
Astounding
PROC Star

Easiest would probably be to sort the data set in reverse order, and bring forward the most recent nonmissing value.  For example:

 

proc sort data=have;

by id p descending nc;

run;

 

data want;

set have;

by id p;

retain new_mg new_mp;

if first.p then do;

   new_mg=mg;

   new_mp=mp;

end;

if mg > . then new_mg = mg;

else mg = new_mg;

if mp > . then new_mp = mp;

else mp = new_mp;

drop new_mg new_mp;

run;

 

You can always put the observations back into their original order:

 

proc sort data=want;

by id p nc;

run;

soumri
Quartz | Level 8

Thank you very much, it works very well.

Ksharp
Super User
data have;
input ID  $   P  NC MG MP;
cards;
M1    1  1    .       .
M1    1  2    2.3   .
M1    1  3    2.6   3.2
M1    1  4    3.5   4.1
M1    2  1    .       1.2
M1    2  2    .       2.3
M1    2  3    4.3   3.5
M2    1  1    .       .
M2    1  2    .       .
M2    1  3    3.5   3.2
M2    1  4    4.1   3.3
M2    1  5    4.0   3.3
M2    2  1    .       .
M2    2  2    3.6   4.1
M2    2  3    3.4   4.0
M2    2  4    3.4   3.9
M2    3  1    4.2   4.2
M2    3  2    4.3   4.2
M2    3  3    4.2   4.1
M3    2  1    4.6   .
M3    2  2    4.8   .
M3    2  3    4.7   .
M3    2  4    4.3   .
;
run;

proc sort data=have;
 by ID   P  NC;
run;
data temp;
 set have;
 by ID     P ;
 retain _MG _MP;
 if first.P then call missing(_MG,_MP);
 if not missing(MG) then _MG=MG;
 if not missing(MP) then _MP=MP;
 drop MG MP;
run;

proc sort data=temp;
 by ID   P  descending NC;
run;
data want;
 set temp;
 by ID     P ;
 retain MG MP;
 if first.P then call missing(MG,MP);
 if not missing(_MG) then MG=_MG;
 if not missing(_MP) then MP=_MP;
 drop _:;
run;

proc sort data=want;
 by ID   P  NC;
run;


soumri
Quartz | Level 8

It works well too, thanks to you and to Astounding for your help.

novinosrin
Tourmaline | Level 20

Oops I am late to the part,Just for fun and nothing more: 

 

data _null_;

if _n_=1 then do;

if 0 then set have;

declare hash myhash(multidata:'YES',ordered:'a');

myhash.definekey('id','p');

myhash.definedata('id','p','nc','mg','mp');

myhash.definedone( );

end;

set have end=last;

by id p;

if ((missing(mp) and missing(mg)) or (missing(mp) or missing(mg))) and myhash.check() ne 0 then rc=myhash.add();

else if myhash.check() ne 0 and not missing(mp) and not missing(mg) then rc=myhash.add();

else if myhash.check()=0 and (missing(mp) and missing(mg)) then rc=myhash.add();

else if myhash.check()=0 and ((not missing(mp) and missing(mg)) or (missing(mp) and not missing(mg))) then

do;

_mg=mg;

_mp=mp;

_nc=nc;

rc=myhash.find();

if missing(mg) then mg=_mg;

if missing(mp) then mp=_mp;

myhash.replacedup();

rc=myhash.find_next();

do while(rc=0);

if missing(mg) then mg=_mg;

if missing(mp) then mp=_mp;

myhash.replacedup();

rc=myhash.find_next();

end;

mg=_mg;

mp=_mp;

nc=_nc;

myhash.add();

end;

else if myhash.check()=0 and not missing(mp) and not missing(mg) then do;

_mg=mg;

_mp=mp;

_nc=nc;

rc=myhash.find();

if missing(mg) then mg=_mg;

if missing(mp) then mp=_mp;

myhash.replacedup();

rc=myhash.find_next();

do while(rc=0);

if missing(mg) then mg=_mg;

if missing(mp) then mp=_mp;

myhash.replacedup();

rc=myhash.find_next();

end;

mg=_mg;

mp=_mp;

nc=_nc;

myhash.add();

end;

if last then rc=myhash.output(dataset:'want');

run;

 

proc sort data=want;

by   id p nc;

run;

 

Regards,

Naveen Srinivasan

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 997 views
  • 4 likes
  • 4 in conversation