09-30-2013 12:40 AM
I have a data set of 434 observation (named as selection) and another data-set with about 17000 observations (named as nodup). The first data-set is my main analysis group (study group) and I want to select one control match from the second data-set for each of the observations in the study group. The matching variables are SIC, FYEAR, SALE. the first two should be exact matches, while SALE should be between a range, defined by lowsale and highsale variables.
If I run this PROC SQL, I will get multiple matches for some of the observation in my study group, while for some others, there is only one match.
create table controls_id as
select * from selection, nodup
where ((nodup.xSALE between selection.lowsale and selection.highsale) and
and selection.fyear=nodup.xfyear );
When I have multiple matches, I want to select the closest match for my observation. the closeness is defined as ((sale-xsale)^2). In other words, I want to select the match which has the smallest difference in the sale variable.
Would you please let me know how I can perform this step?
Thanks for reading this post
Need further help from the community? Please ask a new question.