Solved: Re: Dataset taking account previous record

michtka · Posted 04-30-2012 07:41 AM

Hi everyone, I got the next dataset:

subjid cpevent weight

100 screening 40

100 day1 .

101 screening 20

101 day1 20

102 day1 30

103 day1 .

The idea is to obtain the next dataset, taking account the previous record of day 1 (screening) if it is missing (example:subjid 100),

and excluding the subjid with day 1 missing and not screening record (example:subjid 103):

Dataset I want:

subjid cpevent weight

100 day1 40

101 day1 20

102 day1 30

Thanks in advance:

Vaset

Haikuo · Posted 04-30-2012 08:47 AM

data want (drop=_:);

set have;

by subjid;

/*_w=ifn(first.subjid,.,lag(weight));*/

_w1=lag(weight);

if first.subjid then _w=.; else _w=_w1;

put subjid= cpevent= weight= _w=;

/*weight=ifn(cpevent='day1' and missing(weight), _w,weight);*/

if cpevent='day1' and missing (weight) then weight=_w;

if cpevent='day1' and not missing(weight) then output;

run;

Haikuo

View solution in original post

Haikuo · Posted 04-30-2012 08:06 AM

Try this:

data have;

input subjid$ cpevent :$10. weight;

cards;

100 screening 40

100 day1 .

101 screening 20

101 day1 20

102 day1 30

103 day1 .

;

data want (drop=_:);

set have;

by subjid;

_w=ifn(first.subjid,.,lag(weight));

weight=ifn(cpevent='day1' and missing(weight), _w,weight);

if cpevent='day1' and not missing(weight) then output;

run;

proc print;run;

Regards,

Haikuo

michtka · Posted 04-30-2012 08:22 AM

Thanks Hi, but could make it without the ifn function?

Thanks.

Haikuo · Posted 04-30-2012 08:47 AM

data want (drop=_:);

set have;

by subjid;

/*_w=ifn(first.subjid,.,lag(weight));*/

_w1=lag(weight);

if first.subjid then _w=.; else _w=_w1;

put subjid= cpevent= weight= _w=;

/*weight=ifn(cpevent='day1' and missing(weight), _w,weight);*/

if cpevent='day1' and missing (weight) then weight=_w;

if cpevent='day1' and not missing(weight) then output;

run;

Haikuo

michtka · Posted 04-30-2012 09:14 AM

Thanks Hai, it works.

Regards.

V.

michtka · Posted 04-30-2012 12:37 PM

Hi Hai,

what about if i have:

subjid cpevent text value

100 screening weight 40

100 day1 weight .

100 day1 pulse .

and i want to get:

100 day1 weight 40

100 day1 pulse .

I mean, recording only the previous value when baseline (day1) is missing, keeping the missing value

if not previous value are recording.

Cheers,

Haikuo · Posted 04-30-2012 01:15 PM

I have just replied you in another thread newly initiated by you on the same topic.

Haikuo

michtka · Posted 04-30-2012 01:44 PM

Sorry Hai, but I try to be clear with this problem...I got the next dataset:

Obs subjid cpevent text value visit

1 100 day1 weight . 2

2 100 screening weight 40 1

3 100 day1 pulse . 2

4 100 day1 high 20 2

5 101 day1 weight . 2

6 101 day1 pulse 20 2

7 101 day1 high . 2

8 101 screening high 10 1

I sorting out the dataset by subjid, visit and text, and I aply your code:

Aplying your code:

data want (drop=_:);

set have;

by subjid;

/*_w=ifn(first.subjid,.,lag(weight));*/

_w1=lag(weight);

if first.subjid then _w=.; else _w=_w1;

put subjid= cpevent= weight= _w=;

/*weight=ifn(cpevent='day1' and missing(weight), _w,weight);*/

if cpevent='day1' and missing (weight) then weight=_w;

if cpevent='day1' and not missing(weight) then output;

run;

I would like to obtain this dataset:

100 day1 weight 40

100 day1 pulse .

100 day1 high 20

101 day1 weight .

101 day 1 pulse 20

101 day1 high 10

I asssume that I need to use a modification of your code, but i dont get how at the moment.

Could ypou help me to get this final dataset with your code?

Thanks a lot.

michtka · Posted 04-30-2012 01:59 PM

to be more clear:...if I got this code sorting out by subjid, visit and text:

Obs subjid cpevent text value visit

1 100 screening weight . 1

2 100 day1 high 20 2

3 100 day1 pulse . 2

4 100 day1 weight 40 2

5 101 screening high . 1

6 101 day1 high 10 2

7 101 day1 pulse 20 2

8 101 day1 weight . 2

aplying your code, I obtain:

Obs subjid cpevent text value visit _w1 _w

1 100 screening weight . 1 . .

2 100 day1 high 20 2 . .

3 100 day1 pulse 20 2 20 20

4 100 day1 weight 40 2 . .

5 101 screening high . 1 40 .

6 101 day1 high 10 2 . .

7 101 day1 pulse 20 2 10 10

8 101 day1 weight 20 2 20 20

And you can see , that this is not like I want,

For example, text=weight for subjid 101 should be blank and not 20 because the idea is:

get the baseline value of the previous record, and leave the baseline blank is not previous record

is recording. In other words, i would like the next dataset:

100 day1 weight 40

100 day1 pulse .

100 day1 high 20

101 day1 weight .

101 day 1 pulse 20

101 day1 high 10

Haikuo · Posted 04-30-2012 02:23 PM

Talking about the 'final dataset', and I surely hope it will be the 'final'. I can see your underlined rational to solve your problem: make small achievement one step at a time, that is why you unfolded your 'final dataset' just as right now. However, it is usually helpful for us to have the big picture from the beginning, as the way you have visioned as an approach may not be the optimal path. Just like one, your final dataset requires more complex technique, yet, it will be different from those methods applied to your previous small steps, and can not be grown from them. finally I hope the following is what you want, it takes 2XDOW:

data have;

input (Obs subjid cpevent text) (:$10.) value visit;

cards;

1 100 day1 weight . 2

2 100 screening weight 40 1

3 100 day1 pulse . 2

4 100 day1 high 20 2

5 101 day1 weight . 2

6 101 day1 pulse 20 2

7 101 day1 high . 2

8 101 screening high 10 1

;

data want (drop=_:);

do until (last.subjid);

set have;

by subjid;

if cpevent='screening' and text='weight' then _w=value;

if cpevent='screening' and text='high' then _h=value;

end;

do until (last.subjid);

set have;

by subjid;

if cpevent='day1' and text='weight' and missing (value) then value=_w;

if cpevent='day1' and text='high' and missing (value) then value=_h;

if cpevent ne 'screening' then output;

end;

run;

proc print;run;

Regards,

Haikuo

BTW, please call me 'Haikuo' if you prefer to calling my name. 'Hai' is only part of my name, just like you don't call 'Mike' as 'Mi'.

michtka · Posted 04-30-2012 02:45 PM

Hi Haikuo, thank you for your time and for solving my problem.

Yes, I am agree that getting an small dataset from the more complex dataset was not a good idea.

Now, with your last code I have got a perfect picture of my problem.

Thank you very much.

Regards,

V.

Haikuo · Posted 04-30-2012 02:52 PM

Hi V.

Please don't take it wrong. I don't mind answering any questions I am capable of answering. I was just saying that your original approach could construct a perfect learning experience, while if your aim is to solve your pressing problem, for the sake of efficiency, posting your problem once for all could be a better option.

Regards,

Haikuo

michtka · Posted 04-30-2012 03:00 PM

Yes, Thank you.

Totally understood. Next time I will approach the issue with my real problem, because if not, we will get confuse each other

and the rest of the people want to learn from my issue.

Thank you very much again.

V

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away