Dear all,
I think I understand why I get a missing TOTAL_SALE alongside non-missing SALE figures. Only the ZIP codes that the zipcitydistance function recognizes were loaded into the hash object. If I want to make sure all the zipcodes (in SAS) are up to date, I have to download the latest version of the zipcode database from the official SAS website.
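One way to see which zipcodes are affected is to compare the data against SASHELP.ZIPCODE, the lookup table that zipcitydistance reads. This is only a sketch: it assumes zip in x.Geo_shr is numeric, like the ZIP variable in SASHELP.ZIPCODE, and the table name invalid_zips is made up for illustration.

```sas
/* Sketch: list the zipcodes in the data that SASHELP.ZIPCODE does not contain. */
/* Assumption: x.Geo_shr.zip is numeric, matching SASHELP.ZIPCODE.ZIP.          */
/* invalid_zips is a hypothetical output table name.                            */
proc sql;
  create table invalid_zips as
  select distinct g.zip
  from x.Geo_shr as g
  where g.zip not in (select z.zip from sashelp.zipcode as z);
quit;
```

If zip is stored as character in the data, it would need to be converted (for example with input(zip, 5.)) before the comparison.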
Thank you for your help, especially Ksharp and Authur Tabaneck.
Regards,
mspak
mspak,
As ArthurT pointed out, there are lots of invalid zipcodes in your dataset, and the function zipcitydistance(a.zip , b.zip) will generate missing values for them. The condition .< zipcitydistance(a.zip , b.zip) <=60 excludes these missing values, so those zipcodes are never pushed into the hash table, which makes Total_Sale missing.
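A small test step can show how the condition above treats a missing distance. It is only a sketch: the ZIP values are illustrative, and it assumes '10001' and '10002' exist in SASHELP.ZIPCODE while '00000' does not.

```sas
/* Sketch: how the range condition behaves for a valid and an invalid pair. */
/* Assumption: '10001' and '10002' are in SASHELP.ZIPCODE, '00000' is not.  */
data _null_;
  d_valid   = zipcitydistance('10001', '10002');
  d_invalid = zipcitydistance('10001', '00000');
  /* A missing value is never greater than ".", so the test is false for it. */
  keep_valid   = (. < d_valid   <= 60);
  keep_invalid = (. < d_invalid <= 60);
  put d_valid= d_invalid= keep_valid= keep_invalid=;
run;
```

Whenever the distance comes back missing, the comparison evaluates to 0, so that pair is never considered and contributes nothing to Total_Sale.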
Ksharp
Hi Ksharp,
Thank you for your clarification. I understand that observations without a valid zipcode will not be included in the calculation of total_sale. But the final step, which combines total_sale (from the total_sale dataset) with sale (from the original data, Geo_shr), will still include the SALE figures of the firms with invalid zipcodes. Anyway, this is fine: I will not include the observations without a valid zipcode in my analysis (as my study is based on valid US zipcodes).
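data want;
  if _n_ eq 1 then do;
    /* Define total_sale in the PDV without reading any rows */
    if 0 then set total_sale;
    /* Load total_sale into a hash keyed by year, industry and zip */
    declare hash h(dataset:'total_sale');
    h.definekey('fyear','sic3','zip');
    h.definedata('total_sale');
    h.definedone();
  end;
  set x.Geo_shr;
  /* Reset so a failed lookup leaves total_sale missing */
  call missing(total_sale);
  rc=h.find();
  /* divide() returns missing when total_sale is missing */
  percent=divide(sale,total_sale);
  drop rc;
run;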
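If the observations with no matching total_sale should be dropped rather than kept with a missing percent, the return code of find() could be used as a filter. This is a sketch of one possible variant, not a required change; the output name want_valid is made up for illustration.

```sas
/* Sketch: same lookup, but keep only rows whose key is found in the hash. */
/* want_valid is a hypothetical output dataset name.                       */
data want_valid;
  if _n_ eq 1 then do;
    if 0 then set total_sale;
    declare hash h(dataset:'total_sale');
    h.definekey('fyear','sic3','zip');
    h.definedata('total_sale');
    h.definedone();
  end;
  set x.Geo_shr;
  call missing(total_sale);
  rc = h.find();
  /* find() returns 0 on success; the subsetting IF drops unmatched rows */
  if rc = 0;
  percent = divide(sale, total_sale);
  drop rc;
run;
```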
Thank you for your valuable suggestions.
Regards,
mspak