BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sasuser123123
Quartz | Level 8
Hello!
I've data where all lab results along with the 8 visits , So some subjects were not visits all 8 visits.So I need to retain the previous visit observation when visit is missing..like

Data new;
Input Id $ visit val1 val2;
Datalines;
ABC123 1 50 54
ABC123 2 33 33
ABC123 3 21 44
ABC123 4 33 64
ABC121 1 90 34
ABC121 2 32 39
ABC121 3 51 24
ABC122 1 73 83
ABC122 2 10 14
ABC122 3 53 77
ABC124 1 50 54
ABC124 2 32 33
ABC124 3 51 94
ABC124 4 33 44
;
Run;


ID ABC121 and ABC122 have only three visits so I need to add 4th visit to those IDs with previous observation values. Could you please help me out how to do this one.

Thank you!
Regards..
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Use by-group processing, and add a loop when last.id is reached:

data have;
input id $ visit val1 val2;
datalines;
ABC123 1 50 54
ABC123 2 33 33
ABC123 3 21 44
ABC123 4 33 64
ABC121 1 90 34
ABC121 2 32 39
ABC121 3 51 24
ABC122 1 73 83
ABC122 2 10 14
ABC122 3 53 77
ABC124 1 50 54
ABC124 2 32 33
ABC124 3 51 94
ABC124 4 33 44
;

data want;
set have;
by id notsorted;
output;
if last.id
then do visit = visit + 1 to 4;
  output;
end;
run;

proc print data=want noobs;
run;

Result:

  id      visit    val1    val2

ABC123      1       50      54 
ABC123      2       33      33 
ABC123      3       21      44 
ABC123      4       33      64 
ABC121      1       90      34 
ABC121      2       32      39 
ABC121      3       51      24 
ABC121      4       51      24 
ABC122      1       73      83 
ABC122      2       10      14 
ABC122      3       53      77 
ABC122      4       53      77 
ABC124      1       50      54 
ABC124      2       32      33 
ABC124      3       51      94 
ABC124      4       33      44 

View solution in original post

15 REPLIES 15
PeterClemmensen
Tourmaline | Level 20

What does your desired output look like?

Kurt_Bremser
Super User

Use by-group processing, and add a loop when last.id is reached:

data have;
input id $ visit val1 val2;
datalines;
ABC123 1 50 54
ABC123 2 33 33
ABC123 3 21 44
ABC123 4 33 64
ABC121 1 90 34
ABC121 2 32 39
ABC121 3 51 24
ABC122 1 73 83
ABC122 2 10 14
ABC122 3 53 77
ABC124 1 50 54
ABC124 2 32 33
ABC124 3 51 94
ABC124 4 33 44
;

data want;
set have;
by id notsorted;
output;
if last.id
then do visit = visit + 1 to 4;
  output;
end;
run;

proc print data=want noobs;
run;

Result:

  id      visit    val1    val2

ABC123      1       50      54 
ABC123      2       33      33 
ABC123      3       21      44 
ABC123      4       33      64 
ABC121      1       90      34 
ABC121      2       32      39 
ABC121      3       51      24 
ABC121      4       51      24 
ABC122      1       73      83 
ABC122      2       10      14 
ABC122      3       53      77 
ABC122      4       53      77 
ABC124      1       50      54 
ABC124      2       32      33 
ABC124      3       51      94 
ABC124      4       33      44 
sasuser123123
Quartz | Level 8
Exactly like this..and it's perfectly working.
Thank you so much for your help..
And could you please explain why we used last.
Kurt_Bremser
Super User

By-group processing is initiated with the by statement; for every variable in the by statement, automatic boolean variables first. and last. are created, which signal when a group starts and when it ends.

My code runs a do loop when the last observation of a group is reached, and the do loop is specified in a way that it only runs when the last observation of a group has a visit number smaller than 4.

sasuser123123
Quartz | Level 8
And also I forget to mention that the variable visit is in character format so what I did was firstly I converted it into numeric and then apply condition then again converted it into character. was this process is correct are is there any alternative......
Thank you!
Kurt_Bremser
Super User

If you only have values 1 to 4, it makes sense to save space by storing it as $1 (numeric variables need at least 3 bytes).

Still, if you do not have space issues, storing such values as numeric might be better.

sasuser123123
Quartz | Level 8
Thank you so much !
sasuser123123
Quartz | Level 8
Hello sir!
I've a doubt...
So if the second observation or third observation is missing instead of fourth observation per id in our raw data how to do locf for that condition like..

data have;
input id $ visit val1 val2;
datalines;
ABC123 1 50 54
ABC123 2 33 33
ABC123 4 33 64
ABC121 1 90 34
ABC121 3 51 24
ABC122 1 73 83
ABC122 2 10 14
ABC122 3 53 77
ABC124 1 50 54
ABC124 2 32 33
ABC124 3 51 94
ABC124 4 33 44
;


Kurt_Bremser
Super User

For this, you use the "look-ahead" technique:

data have;
input id $ visit val1 val2;
datalines;
ABC123 1 50 54
ABC123 2 33 33
ABC123 4 33 64
ABC121 1 90 34
ABC121 3 51 24
ABC122 1 73 83
ABC122 2 10 14
ABC122 3 53 77
ABC124 1 50 54
ABC124 2 32 33
ABC124 3 51 94
ABC124 4 33 44
;

data want;
merge
  have
  have (
    firstobs=2
    keep=id visit
    rename=(id=_id visit=_visit)
  )
;
output;
if id ne _id then _visit = 5;
do visit = visit + 1 to _visit - 1;
  output;
end;
drop _:;
run;
sasuser123123
Quartz | Level 8
It's very difficult to understand this program...
Kurt_Bremser
Super User

I do a merge without a by; the firstobs=2 causes the second read in the merge to always be one observation ahead of the first read.

I only keep two variables and rename them, so I have two additional variables in the PDV, "future" versions of the originals.

One of these is used to detect if we're still in the same group (id), and the other gives me the next visit number.

The do loop is structured in a way that it writes the current observation (one do loop iteration if next visit = current visit + 1), or multiple observations if there is a "hole". At the end of the group, I set a virtual 5th visit, so the loop works right up to visit 4.

 

This code was created mostly by applying Maxim 4: start with a simple idea (the look-ahead), run it, look where the result differs from the expectations, adapt and run again. Rinse, lather, repeat.

sasuser123123
Quartz | Level 8
Anyway thank you so much..
sasuser123123
Quartz | Level 8
@Kurt_Bremser Thank you so much for your explanation....
sasuser123123
Quartz | Level 8
And I'm unable to understand that why we used first obs option=2

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 1434 views
  • 5 likes
  • 3 in conversation