BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mdavidson
Quartz | Level 8

Sorry for the poor subject line. 

 

I'm reading in a data set that is very odd and has a lot of missing data. I'll try to illustrate below:

 


data have;

   input ID val comment $;

   datalines;                     

1        6    start     

.         12  adjust	

.         15  adjust	

.         6   final     

2        12   start     

2        15   final     

3        6    start     

.         12  adjust	

3         15  final		
;


data want;

   input ID val comment $;

   datalines;                     

1        6    start     

1         6   final     

2        12   start     

2        15   final     

3        6    start     

3         15  final		
;

In order to get my "want" data set, I would normally sort by ID and use first/last processing to extract the records needed. However, in this case I don't always have a value in the ID field for the records that I want to capture. I'm not exactly sure if a retain or a lag statement would work in this case.

 

Any ideas on an approach?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Is the idea to replace the missing ID values with the earlier non-missing values?

The same method can be used but without using BY statement or FIRST. and LAST. variables.

data want ;
   set have;
   if missing(id) then id=newid;
   else newid=id;
   retain newid ;
   drop newid;
run;

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

Is the idea to replace the missing ID values with the earlier non-missing values?

The same method can be used but without using BY statement or FIRST. and LAST. variables.

data want ;
   set have;
   if missing(id) then id=newid;
   else newid=id;
   retain newid ;
   drop newid;
run;
mdavidson
Quartz | Level 8
Tom -- thanks so much, brilliant!
hashman
Ammonite | Level 13

@mdavidson :

If you can rely on the variable COMMENT (i.e. if in your sample input, it's not just for show) and the "start" row always has a non-missing ID value (as it appears from your sample input), then one variant could be:

data have ;                           
  input ID val comment $ ;            
   cards ;                            
1   6  start                          
.  12  adjust                         
.  15  adjust                         
.   6  final                          
2  12  start                          
2  15  final                          
3   6  start                          
.  12  adjust                         
3  15  final                          
run ;                                 
                                      
data want (drop = _:) ;               
  retain ID ;                         
  set have (rename = (id = _id)) ;    
  if comment = "start" then ID = _id ;
  if comment in ("start", "final") ;  
run ;                                 

Kind regards

Paul D.

mdavidson
Quartz | Level 8
Thanks Paul, this also works great!
hashman
Ammonite | Level 13

@mdavidson:

You're welcome, of course, but your "also" gets me a bit puzzled because @Tom's program and mine generate different outputs. Namely, mine outputs only the "start" and "final" rows (as in your sample output), while @Tom's keeps all the rows, including "adjust". Neither is right or wrong, it just makes me wonder what it is that you really want.

Kind regards

Paul D.    

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1827 views
  • 2 likes
  • 3 in conversation