BookmarkSubscribeRSS Feed
lh50
Fluorite | Level 6

I have a dataset where participants were sampled from multiple locations that includes several location-specific variables in addition to participant-specific test results. After merging with another results dataset, some participant IDs from the second dataset have no link in the first so they are missing the location information. However, it is obvious which location they correspond to based on their ID value. For example, ABC1002 in the table below comes from the same school/village/region as ABC1001 and 1003.

 

idresultresult2villageschool_idschool_namegeopoint_latgeopoint_lngregion
ABC1001NEGATIVE2A5555School A99999915
ABC1002        
ABC1003NEGATIVE2A5555School A99999915

 

My question: is there an easy way to impute the location-specific variables for participants like this but *not* other variables like 'result' and 'result2' that are specific to each participant? I could obviously do this one at a time, but hoping there is a quicker fix. 

2 REPLIES 2
ballardw
Super User

As long as none of the first ID in a sequence are missing the values something like this:

Data have;
  input id $	result $	result2	village $	 ;
datalines;
ABC1001	NEGATIVE	2	A	
ABC1002	. 	 	 	. 	. 	 	 	 
ABC1003	NEGATIVE	2	A	
;

data want;
   set have;
   village=coalescec(village,lag(village));
run;
 	 	 	 	 	 	 	 

The Coalescec (for character) and Coalsesce (for numeric) returns the first value that is not missing. The LAG function, when used carefully, returns the value of a variable on the previous record. Actually you can look back quite a number of records but leave that to an actual problem.

Do use the Coalesce function if your variable is numeric. You didn't provide enough information to tell so I create a small example using just a few variables.

lh50
Fluorite | Level 6

Thanks, this worked to an extent but only for the first instance of a participant with missing values. Where there are multiple in a row (ex, ABC1002 and ABC1003 are both missing in the example I gave) it only imputed values for ABC1002 but not ABC1003. Is there a way to correct this?

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 366 views
  • 0 likes
  • 2 in conversation