Sorted indeed ;).
As tentatively promised in the earlier reply, here's your sorted list approach encapsulated using a hash table. Note that _N_ below is used merely as a container for W-values in the table.
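data want (keep = w: mean) ;
  if _n_ = 1 then do ;
    /* ordered hash with duplicate keys allowed: it acts as a sorted list */
    dcl hash h (multidata:"Y", ordered:"A") ;
    h.definekey ("_n_") ;
    h.definedone () ;
    dcl hiter i ("h") ;
  end ;
  set have ;
  array w w: ;
  /* load the non-missing weights; the table keeps them in ascending order */
  do over w ;
    if N (w) then h.add(key:w, data:w) ;
  end ;
  /* starting bound: the gap between two positive weights is below the largest weight */
  _dm = max (of w:) ;
  /* walk the sorted values and keep the adjacent pair with the smallest gap */
  do _q = 1 by 1 while (i.next() = 0) ;
    if _q > 1 then do ;
      _d = _n_ - _pn ;
      if _d < _dm then do ;
        _dm = _d ;
        mean = mean (_n_, _pn) ;
      end ;
    end ;
    _pn = _n_ ;
  end ;
  /* empty the table before the next observation */
  h.clear() ;
run ;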
I wish, though, that in addition to the hash object we had a simpler linear dynamic structure (sort of like a list in Python). Of course, the hash object can play the role, but a list would provide for simpler and terser coding and less overhead in situations like the case at hand.
Best
Paul D.
1. How are you going to handle the missing values in defining the "closest"? I assume you're ignoring them.
2. How are you going to handle the ties between two or more pairs of values that are equally close? I assume that you pick the first ones from left to right.
Your sample data aren't rich enough to test an algorithmic concept. So, try the following input first:
data have ;
call streaminit (7) ;
array w weight1-weight5 ;
do _n_ = 1 to 5 ;
do over w ;
w = rand ("integer", 23) ;
if rand ("uniform") < .3 then w = . ;
end ;
output ;
end ;
run ;
Now, the program:
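data want (keep = w: mean closest:) ;
  set have ;
  array w [*] w: ;
  /* start with an impossibly large distance */
  diff = constant ("big") ;
  do i = 1 to dim (w) - 1 ;
    if nmiss (w[i]) then continue ;
    do j = i + 1 to dim (w) ;
      if nmiss (w[j]) then continue ;
      _diff = abs (w[i] - w[j]) ;
      /* skip pairs that are farther apart than the best pair so far */
      if _diff > diff then continue ;
      diff = _diff ;
      _i = i ;
      _j = j ;
    end ;
  end ;
  /* _i stays missing when a row has fewer than two non-missing weights */
  if not missing (_i) then do ;
    closest1 = w[_i] ;
    closest2 = w[_j] ;
    mean = mean (of closest:) ;
  end ;
run ;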
Note that it's not much different in principle from what @novinosrin has offered, save for a couple of generalizations. It generates the following output:
weight1 | weight2 | weight3 | weight4 | weight5 | closest1 | closest2 | mean |
19 | . | 8 | 22 | 4 | 19 | 22 | 20.5 |
4 | . | 19 | 21 | 7 | 19 | 21 | 20.0 |
10 | 9 | 15 | 23 | 11 | 10 | 11 | 10.5 |
3 | . | . | . | 17 | 3 | 17 | 10.0 |
7 | . | 10 | . | 2 | 7 | 10 | 8.5 |
Now that we can see that the concept works, we can move on to your sample data set:
data have ;
input weight1-weight3 ;
obs = _n_ ;
cards ;
54 54.7 53
48 47.5 .
67.4 68 69
. 48 48
;
run ;
Running the program above against it, we get:
weight1 | weight2 | weight3 | closest1 | closest2 | mean |
54.0 | 54.7 | 53 | 54.0 | 54.7 | 54.35 |
48.0 | 47.5 | . | 48.0 | 47.5 | 47.75 |
67.4 | 68.0 | 69 | 67.4 | 68.0 | 67.70 |
. | 48.0 | 48 | 48.0 | 48.0 | 48.00 |
If you can't discern the idea from the code: it compares each weight with every other weight pairwise. If the distance between the two is no longer than the shortest distance found so far (initially set to an impossibly big number), the respective array indices are stored, and the process repeats until the pairs are exhausted. Because the test is _DIFF > DIFF, a pair that ties the current shortest distance replaces it, so the very last of the equally close pairs wins; that is why the third row of the test output shows 10 and 11 rather than 10 and 9. If you want the very first pair instead, change the inequality sign from > to >=.
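To see how the comparison operator decides ties, here is a minimal sketch run on the third row of the test data (10 9 15 23 11), where 10 and 9 are exactly as close as 10 and 11. It is an illustration only: the _TEMPORARY_ array and the names C1 and C2 are assumptions made for the example, not part of the program above.

```sas
data _null_ ;
  /* third row of the test data, held in a temporary array for the demo */
  array w [5] _temporary_ (10 9 15 23 11) ;
  diff = constant ("big") ;
  do i = 1 to dim (w) - 1 ;
    do j = i + 1 to dim (w) ;
      _diff = abs (w[i] - w[j]) ;
      /* >= skips later ties, so the first equally close pair is kept */
      if _diff >= diff then continue ;
      diff = _diff ;
      _i = i ;
      _j = j ;
    end ;
  end ;
  c1 = w[_i] ;
  c2 = w[_j] ;
  put c1= c2= ;   /* writes c1=10 c2=9 to the log */
run ;
```

With > in place of >=, the same step ends on the later pair, 10 and 11, which matches the posted output.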
HTH
Paul D.
Thank you so much for looking at this. You are right, I'm ignoring the missing values.
What you and @novinosrin are proposing is more complex than what I just saw from @ChrisNZ, but the latter seems to work just fine. I don't know what it does when there are ties, my database is really small (153 observations) and I'm only comparing three variables.
I can't discern the idea from the code (yet) but I'll certainly look at those alternatives to get more familiar with SAS. Thanks again!
If you really want to learn SAS, you have to look beyond the data you have at hand and ask yourself how your code will work and whether it will work when the data change - because it WILL change. Coding for three variables like 12, 13, 23 means putting data (in this case, how many variables you have and what their combinations are for the task at hand) in your program. A properly designed program is only a set of instructions to process data; it shouldn't contain any data itself. That means, in particular:
- No IF-THEN-ELSE "wallpaper" code. The hard coded values in the would-be wallpaper should be stored in a table outside the program, and the program should look them up.
- No hard coded constants. These, too, belong to a control file outside the program. The program should read the control file and get the values from it.
- No hard coding relying on a specific dimension, such as the number of variables. The latter, again, belongs to a control file or, much less preferably, macro variables or macro parameters.
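The first bullet can be sketched roughly as follows. This is only an illustration under assumptions: the data set names CONTROL, HAVE_CODES and DECODED and the variables CODE and LABEL are made up for the example. The idea is that the code-to-label pairs live in a table and the program merely looks them up.

```sas
/* hypothetical control table: the pairs are data, not IF-THEN-ELSE logic */
data control ;
  input code $ label :$10. ;
  cards ;
A Alpha
B Beta
C Gamma
;
run ;

/* hypothetical input to be decoded */
data have_codes ;
  input code $ ;
  cards ;
B
C
X
;
run ;

data decoded ;
  if _n_ = 1 then do ;
    if 0 then set control ;   /* defines CODE and LABEL in the PDV */
    dcl hash lookup (dataset:"control") ;
    lookup.definekey ("code") ;
    lookup.definedata ("label") ;
    lookup.definedone () ;
  end ;
  set have_codes ;
  /* a code absent from the control table gets a missing label */
  if lookup.find() ne 0 then call missing (label) ;
run ;
```

Adding or changing a code then means editing the control table only; the DATA step that does the lookup stays untouched.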
Any aspiring SAS programmer should dutifully absorb Ian Whitlock's paper "Code or Data?":
http://www2.sas.com/proceedings/sugi24/Advtutor/p40-24.pdf
to understand these issues and also understand that a poor program design may serve one's immediate purpose but bite one in the derriere when it's least expected. I'm eternally grateful to Ian for his savage bluntness on SAS-L of yore when it came to this (and other) kind of matters. There's a saying that a good physicist sees analogies but a great one sees analogies between analogies. With SAS programming, it's no different.
Paul D.
"I'm eternally grateful to Ian for his savage bluntness on SAS-L of yore when it came to this (and other) kind of matters"
Wow, a great person's gesture towards a perhaps "equal", which is so amazing, humble, eloquent, modest and respectful. I don't know Ian Whitlock, nor do I have a SAS-L account, but I have heard a lot about him through Art Tabachneck, so I'm guessing.
That's a note a lot of people like me and others should take seriously in all walks of life. It speaks a lot from the heart and soul. Thank you!
If and when the day comes that I become a PD (or, with your blessing, go even beyond PD), I will echo a similar note to somebody else.