About AndreasKirk

AndreasKirk · 11-11-2015

Thanks for the very thorough answer! I'm running SAS 9.3, so I don't have PROC OPTNET. Any idea how to get around this?

AndreasKirk · 11-11-2015

My problem relates to the literature on matching imperfectly on continuous variables; however, I have not been able to find anybody experiencing the same distinct problem as I have. The problem is as follows:

I have two datasets, one with test subjects and one with control subjects. I need to match the two datasets based on one variable: income. There are more control subjects than test subjects, hence I need to pick only the best matches.

My first approach was to use PROC FASTCLUS, using the test subjects as the centers of the clusters and picking only the best match for each cluster. However, as I have some groups with relatively few individuals, this approach does not give me exactly what I was looking for. My problem is that PROC FASTCLUS does not give me the best match, considering ALL matches in the dataset. Let me give an example:

data cases;
input ID $ wage;
datalines;
1 800
2 1000
;
run;

data candidates;
input ID $ wage;
datalines;
5 700
6 600
8 2000
;
run;

/* Finding number of observations in cases */
data _null_;
if 0 then set cases nobs=n;
call symput('numobs',n);
stop;
run;
%let n_cases=&numobs;

/* Making clusters */
proc sort data=cases; by wage; run;

data cases;
set cases;
cluster+1;
run;

proc sort data=candidates; by wage; run;

proc fastclus data=candidates out=donor maxclusters=&n_cases. seed=cases maxiter=0 noprint;
var wage;
run;

proc sort data=donor; by cluster distance; run;

/* Finding donors */
data donor candidates (drop=cluster distance);
set donor;
by cluster;
if first.cluster then output donor;
run;

This program gives me the following matches:

ID wage
5 700
8 2000

However, looking at the data, the best matches are

ID wage
5 700
6 600

as these would minimize the TOTAL difference between ALL matches (a total of 500 instead of 1,100).

My problem is thus that I need to pick the best matches, taking ALL matches into consideration, i.e. minimize the TOTAL distance between test and control subjects. Does anybody have an idea how to do this?
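The objective described above (each test subject gets exactly one control, each control is used at most once, and the summed wage difference is as small as possible) is a linear assignment problem. One possible way to express it on SAS 9.3 is PROC OPTMODEL, sketched below. This assumes SAS/OR is licensed; it is not necessarily the approach suggested in the thread. The names Assign, TotalDistance, OnePerCase, AtMostOnce and the output dataset matches are illustrative choices, not taken from the original post.

```sas
/* Sketch only: assumes SAS/OR (PROC OPTMODEL) is available and that   */
/* the datasets cases and candidates from the post above exist.        */
proc optmodel;
   /* Index sets and wages, read from the two datasets */
   set <str> CASES;
   set <str> CANDS;
   num wage_case {CASES};
   num wage_cand {CANDS};
   read data cases      into CASES=[ID] wage_case=wage;
   read data candidates into CANDS=[ID] wage_cand=wage;

   /* Assign[i,j] = 1 if candidate j is matched to case i */
   var Assign {CASES, CANDS} binary;

   /* Minimize the total absolute wage difference over all matches */
   min TotalDistance =
      sum {i in CASES, j in CANDS}
         abs(wage_case[i] - wage_cand[j]) * Assign[i,j];

   /* Every case gets exactly one candidate */
   con OnePerCase {i in CASES}: sum {j in CANDS} Assign[i,j] = 1;

   /* Every candidate is used at most once */
   con AtMostOnce {j in CANDS}: sum {i in CASES} Assign[i,j] <= 1;

   solve with milp;

   /* Key columns i (case ID) and j (candidate ID) identify each match */
   create data matches
      from [i j]={i in CASES, j in CANDS: Assign[i,j].sol > 0.5}
      case_wage=wage_case[i] cand_wage=wage_cand[j];
quit;
```

For the example data, the smallest achievable total is 500 and it uses candidates 5 and 6. Two pairings reach that total, so which candidate goes to which case may vary. The number of binary variables is cases times candidates, so very large datasets may need a restriction on which pairs are allowed.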

AndreasKirk · 09-11-2013

Is it possible to use the first. and last. statement within certain groups? E.g. I have a variable with social security numbers and a variable with years. I have multiple person records each year. I would like to use the first. statement to identify the first observation of each social security number, but for each year. Is this possible?
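A minimal sketch of how this can work: FIRST. and LAST. are automatic variables that the DATA step creates for every variable named in the BY statement, so listing both variables gives a flag per person and per year within person. The dataset name records and the variable names ssn and year are assumptions made for illustration.

```sas
/* Sketch only: records, ssn and year are hypothetical names. */

/* BY-group processing needs the data sorted by the BY variables */
proc sort data=records;
   by ssn year;
run;

data first_per_year;
   set records;
   by ssn year;

   /* first.ssn  = 1 on the first record of each person               */
   /* first.year = 1 on the first record of each year within a person */
   /* (it is also 1 whenever first.ssn is 1)                           */
   first_in_year = first.year;
   last_in_year  = last.year;

   /* Keep only the first record of each ssn-year combination */
   if first.year;
run;
```

Because year comes after ssn in the BY statement, first.year restarts for every new ssn, even if two consecutive persons have records in the same year.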

Online Status	Offline
Date Last Visited	‎11-12-2015 03:51 AM

Re: Matching on continuous variable, minimizing total distance between...

Matching on continuous variable, minimizing total distance between all...

first. and last. statement within groups
