BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mdavidson
Quartz | Level 8

Sorry for the poor subject line. 

 

I'm reading in a data set that is very odd and has a lot of missing data. I'll try to illustrate below:

 


data have;

   input ID val comment $;

   datalines;                     

1        6    start     

.         12  adjust	

.         15  adjust	

.         6   final     

2        12   start     

2        15   final     

3        6    start     

.         12  adjust	

3         15  final		
;


data want;

   input ID val comment $;

   datalines;                     

1        6    start     

1         6   final     

2        12   start     

2        15   final     

3        6    start     

3         15  final		
;

In order to get my "want" data set, I would normally sort by ID and use first/last processing to extract the records needed. However, in this case I don't always have a value in the ID field for the records that I want to capture. I'm not exactly sure if a retain or a lag statement would work in this case.

 

Any ideas on an approach?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Is the idea to replace the missing ID values with the earlier non-missing values?

The same method can be used but without using BY statement or FIRST. and LAST. variables.

data want ;
   set have;
   if missing(id) then id=newid;
   else newid=id;
   retain newid ;
   drop newid;
run;

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

Is the idea to replace the missing ID values with the earlier non-missing values?

The same method can be used but without using BY statement or FIRST. and LAST. variables.

data want ;
   set have;
   if missing(id) then id=newid;
   else newid=id;
   retain newid ;
   drop newid;
run;
mdavidson
Quartz | Level 8
Tom -- thanks so much, brilliant!
hashman
Ammonite | Level 13

@mdavidson :

If you can rely on the variable COMMENT (i.e. if in your sample input, it's not just for show) and the "start" row always has a non-missing ID value (as it appears from your sample input), then one variant could be:

data have ;                           
  input ID val comment $ ;            
   cards ;                            
1   6  start                          
.  12  adjust                         
.  15  adjust                         
.   6  final                          
2  12  start                          
2  15  final                          
3   6  start                          
.  12  adjust                         
3  15  final                          
run ;                                 
                                      
data want (drop = _:) ;               
  retain ID ;                         
  set have (rename = (id = _id)) ;    
  if comment = "start" then ID = _id ;
  if comment in ("start", "final") ;  
run ;                                 

Kind regards

Paul D.

mdavidson
Quartz | Level 8
Thanks Paul, this also works great!
hashman
Ammonite | Level 13

@mdavidson:

You're welcome, of course, but your "also" gets me a bit puzzled because @Tom's program and mine generate different outputs. Namely, mine outputs only the "start" and "final" rows (as in your sample output), while @Tom's keeps all the rows, including "adjust". Neither is right or wrong, it just makes me wonder what it is that you really want.

Kind regards

Paul D.    

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 731 views
  • 2 likes
  • 3 in conversation