In my head I had it right ... I meant data _null_;
set have end=end_of_have;
length value $10 freq 8;
if _n_=1 then do;
declare hash h (ordered:'a');
h.definekey('value');
h.definedata('value','freq');
h.definedone();
end;
array month {12} f1-f12;
do m=1 to 12;
value=month{m};
if h.find()^=0 then freq=1;
else freq=freq+1;
h.replace();
end;
if end_of_have then h.output(dataset:'want');
run;
... View more
One thing that is a little weird: somehow this approach loses 1,800 person-months. But that's out of 818,600,000 million person-months, so I think I can live with the results in aggregate 🙂
... View more
Agreed that counties don't change very often ... Connecticut was the last big one of which I am aware. Thanks for pointing out that ZIP file -- always good to have something like that on tap.
... View more
Thanks! I like the elegance of this ... but given that county FIPS code sets change from one year to the next, the code line array t(7) $ 1 _temporary_ ('A','B','C','D','E','F','G'); could get cumbersome really fast. I'll save this, though, for a time when there are fewer possible field values!
... View more
Thanks! That seems to be the most elemental solution -- one that first occurred to me and that runs overnight. It has definite curb appeal. I have a wealth of options from which to choose!
... View more
Thanks to both of you for your comments; I agree that long beats wide and that tidy beats messy. And last night as I read my own post, I realized that macros are executed at compile-time execution and not at object-time. However, I fear that I did not explain my problem very well. I was given a large file: 60-some million person-records, each with their county of residence in each of the 12 months of the year. My task is to count up person-months for each county. So transposing the existing dataset would not be a good thing. The hit-it-with-a-stick approach would be to write out 12 records for each person and then summarize over counties; but I do not relish wrangling 720 million records. Another approach might be to create a monster SELECT(f{i}) "snippet" running from county 01000 to 99999, outputting at EOF and then transposing the results. I was going for some kind of elegant run-time solution that points directly to the county counter that needs to be increased ... maybe a CALL function or something. But I also don't mind hitting something with a stick if it looks to be a one-time thing.
... View more
I have on the order of 60 million records, each with 12 monthly fields. I would like to accumulate counts across these fields, but their contents vary from record to record -- for example id f1 f2 f3 ... 1 A B C 2 E G A 3 C C C etc. my hope was to create a series of variables called n_A, n_B, n_c, n_e, etc and then do something like data counts; set big_honker (keep=f1-f12) end=eof; array f {12} f1-f12; do i=1 to 12; %let vbl=%magic(f{i}); &vbl+1; end; if eof then output; drop f1-f12; run; where the %magic takes the value x in each of those monthly fields and creates the appropriate accumulator n_x. But I'll be darned if I can figure it out. Help, please!
... View more