<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Density of industrial firms in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28260#M6566</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi again,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The observations are unique for each firm-year. Each firm is possible to have same zipcode for every year. I couldn't remain only to have unique zip, tic and sic3 as firms might change their headquarters to another area which has different zipcodes, as such, you might see them repeated. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you and I hope my explanation is clear.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;mspak&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Sun, 11 Mar 2012 10:44:39 GMT</pubDate>
    <dc:creator>mspak</dc:creator>
    <dc:date>2012-03-11T10:44:39Z</dc:date>
    <item>
      <title>Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28257#M6563</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Dear all,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have file with the follwoign variables:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;tic: company identification code&lt;/P&gt;&lt;P&gt;fyear: financial year&lt;/P&gt;&lt;P&gt;sic3: 3-digit SIC Industry (ie. indistry classification code with the first 3 digits)&lt;/P&gt;&lt;P&gt;zip: US 5-digit zipcode&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I wish to calculate a variable, density of industrial firms, which is defined as log of one plus the number of firms in the same 3-digit SIC industry within 60-mile radius of the firms headquarters (tic).&amp;nbsp;&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;NUM_INDUSTRY = Number of firms in the same 3-digit SIC industry within 60-mile radius of the firms headquarters&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;DENSITY_INDUSTRY = LOG (1 + NUM_INDUSTRY)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;How should I can pairs of firms with the same 3-digit SIC code? I know the way to calculate distances and number of firms within 60-mile radius. However, before, I can calculate the distance, I must have a list of firms with the same 3-digit SIC code for each firm-year (ie. thsi does mean for each firm every year).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note: the SAS datafile attached.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;mspak &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 09:43:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28257#M6563</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-11T09:43:41Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28258#M6564</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;What identifies a headquarter?&lt;/P&gt;&lt;P&gt;From your narrative I would expect to get a variable which identifies the headquarter (so only populated for records which are headquarter), a clustering variable (eg. TIC) and then a firm identifier (eg. SIC).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The data you provide don't give me this headquarter identifier. On the other hand there are a lot of duplicates in it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt;&amp;nbsp; select count(*) as NObs, *&lt;BR /&gt;&amp;nbsp; from mspak.comp&lt;BR /&gt;&amp;nbsp; group by fyear,tic&lt;BR /&gt;&amp;nbsp; having count(*)&amp;gt;1&lt;BR /&gt;&amp;nbsp; ;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;NObs&amp;nbsp; zip&amp;nbsp;&amp;nbsp;&amp;nbsp; tic&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; fyear SIC3&lt;BR /&gt;2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 08543&amp;nbsp; 0388B&amp;nbsp;&amp;nbsp;&amp;nbsp; 2003&amp;nbsp; 284&lt;BR /&gt;2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 08543&amp;nbsp; 0388B&amp;nbsp;&amp;nbsp;&amp;nbsp; 2003&amp;nbsp; 284&lt;BR /&gt;...&lt;BR /&gt;5&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 94105&amp;nbsp; 3DOVRA&amp;nbsp;&amp;nbsp; 2003&amp;nbsp; 153&lt;BR /&gt;5&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 94105&amp;nbsp; 3DOVRA&amp;nbsp;&amp;nbsp; 2003&amp;nbsp; 153&lt;BR /&gt;5&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 94105&amp;nbsp; 3DOVRA&amp;nbsp;&amp;nbsp; 2003&amp;nbsp; 153&lt;BR /&gt;5&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 94105&amp;nbsp; 3DOVRA&amp;nbsp;&amp;nbsp; 2003&amp;nbsp; 153&lt;BR /&gt;5&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 94105&amp;nbsp; 3DOVRA&amp;nbsp;&amp;nbsp; 2003&amp;nbsp; 153&lt;/P&gt;&lt;P&gt;.....&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Are you sure your source data is as it should be? And if yes could you explain how to identify the headquarter and why there are duplicates?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;DIV class="mcePaste" id="_mcePaste" style="position: absolute; width: 1px; height: 1px; overflow: hidden; top: 0px; left: -10000px;"&gt;﻿&lt;/DIV&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 10:26:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28258#M6564</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2012-03-11T10:26:25Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28259#M6565</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Patrick,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The datafile contains all firms with their hardquarters' zipcode in US. Let say, for example the following:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;TIC FYEAR SIC ZIP&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;AA&amp;nbsp; 2003&amp;nbsp;&amp;nbsp; 246&amp;nbsp; XXXXX&lt;/P&gt;&lt;P&gt;BB&amp;nbsp; 2003&amp;nbsp;&amp;nbsp; 246&amp;nbsp; XXXXX&lt;/P&gt;&lt;P&gt;CC&amp;nbsp; 2003&amp;nbsp;&amp;nbsp; 246&amp;nbsp; XXXXX&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We assume that firms with SIC code of 246 only 3 firms. Then I have to calculate the distance between:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;AA - BB&lt;/P&gt;&lt;P&gt;AA - CC&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If the distance between AA-BB and AA-CC are within 60 mile-radius, then industry density for firm AA = log (2+1)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;For firm BB,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have to calculate distance between:&lt;/P&gt;&lt;P&gt;BB-AA&lt;/P&gt;&lt;P&gt;BB-CC&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If only distance between BB-AA are within 60 mile-radius, the density = LOG (1+1)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;AND SO FORTH.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 10:39:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28259#M6565</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-11T10:39:00Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28260#M6566</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi again,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The observations are unique for each firm-year. Each firm is possible to have same zipcode for every year. I couldn't remain only to have unique zip, tic and sic3 as firms might change their headquarters to another area which has different zipcodes, as such, you might see them repeated. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you and I hope my explanation is clear.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;mspak&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 10:44:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28260#M6566</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-11T10:44:39Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28261#M6567</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Duplicates to be eliminated.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The uniqueness to be identified by&lt;/P&gt;&lt;P&gt;1. tic&lt;/P&gt;&lt;P&gt;2. fyear&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 10:50:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28261#M6567</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-11T10:50:32Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28262#M6568</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Looks to me like a hash iter object approach could solve this in one data step. I'll post some code as soon as I have something working.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 11:13:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28262#M6568</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2012-03-11T11:13:57Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28263#M6569</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks in advance for your helps. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Firm-year's zipcodes are origin zipcodes and the destination zipcode are other firms (other than origin) which are in the same SIC3. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Firm-years have chance to be origin and destination among themselves for the same SIC3.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;mspak&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 11:20:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28263#M6569</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-11T11:20:26Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28264#M6570</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Can't provide you the final solution yet but need to stop for today. Thought I share with what I could come up so far. May be this helps someone else to come up with a solution.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I decided to go for a hash iter approach over a SQL approach because I believe it possibly will perform better.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;First issue I had was an outdated sashelp.zipcode table. Function zipcitydistance() doesn't like it too much if it's fed with zipcodes which are not in this table.&lt;/P&gt;&lt;P&gt;A updated table can be obtained here: &lt;A href="http://support.sas.com/rnd/datavisualization/mapsonline/html/misc.html"&gt;http://support.sas.com/rnd/datavisualization/mapsonline/html/misc.html&lt;/A&gt; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In order to continue working I excluded all zipcodes in your source data not in sashelp.zipcode (that's the where clause in the SQL which needs to be removed for a final version).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Where I'm currently stuck is how the iter object behaves. In my understanding there should be at least one match for every iteration of the data step (that's where a row matches with itself, so AA-AA). But in the 4th input row I get no match (variable count ends up to be zero - and therefore the log() falls over). Not sure yet why this happens and if it is because I'm missunderstanding how the iter object works - or if it is a bug (I was a bit lazy in applying patches so it is a possibility).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The code below should run without syntax errors and show the issue I'm currently having with the 4th obs from the input data set.&lt;/P&gt;&lt;P&gt;There is some comment missing in the code.... I haven't finished yet. Hope it's anyway of some use.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;options ls=200;&lt;BR /&gt;libname mspak 'C:\Users\ssapam\Documents\My SAS Files\9.2';&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt;&amp;nbsp; create table comp_2 as&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; select distinct fyear,sic3,tic,zip&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; from mspak.comp&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; where zip in (select zip from sashelp.zipcode)&amp;nbsp; /* zipcitydistance() only works if zipcode exists in sashelp.zipcode */&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; order by fyear,sic3&lt;BR /&gt;&amp;nbsp; ;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want;&lt;BR /&gt;&amp;nbsp; set comp_2(rename=(tic=ThisTic zip=ThisZip) obs=4 firstobs=4 ) &lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; comp_2(obs=0) /*needed for variable mapping for hash object */&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ;&lt;BR /&gt;&amp;nbsp; by fyear sic3;&lt;BR /&gt;&amp;nbsp; if _n_=1 then&lt;BR /&gt;&amp;nbsp; do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; declare hash h1 (dataset:'work.comp_2',ordered:'a');&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineKey('fyear','sic3');&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineData('tic','zip');&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h1.defineDone();&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; declare hiter hit1('h1');&lt;BR /&gt;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&amp;nbsp; count=0;&lt;BR /&gt;&amp;nbsp; ThisFyear=fyear;&lt;BR /&gt;&amp;nbsp; ThisSic3=sic3;&lt;BR /&gt;&amp;nbsp; _rc=hit1.first();&lt;BR /&gt;&amp;nbsp; _rc=h1.find();&lt;/P&gt;&lt;P&gt;&amp;nbsp; do while(_rc=0);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; count+ (zipcitydistance(ThisZip, zip)&amp;lt;=60);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; distance=zipcitydistance(ThisZip, zip);&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; /* for debugging. only uncomment together with obs=2 in set statement! */&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; put (fyear sic3 tic zip) (=) &lt;A&gt;/*@60&lt;/A&gt; (ThisFyear ThisSic3 ThisTic ThisZip count distance) (=)*/; &lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=hit1.next();&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; if ThisFyear ne fyear or ThisSic3 ne sic3 then leave;&lt;BR /&gt;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&amp;nbsp; mutual_density = log (count);&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 13:55:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28264#M6570</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2012-03-11T13:55:18Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28265#M6571</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks Patrick,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I need time to digest your program suggested as I am new to SQL. I will test run the program, I am not sure will this take long time for my notebook to process it? &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Good nite (11 march 2012, 11.09pm now)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Mspak.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 15:08:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28265#M6571</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-11T15:08:08Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28266#M6572</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Patrick's direction is definitely applaudable. I have tried SQL first, and stop it after 30-mins of running. Here is my hash approach, I think we may not need to use hiter, as both table are identical.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;After making sure that all of the zipcodes are included in the sashelp.zipcodes,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;407&amp;nbsp; data want (drop=_:);&lt;/P&gt;&lt;P&gt;408&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if _n_=1 then do;&lt;/P&gt;&lt;P&gt;409&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set comp( rename=(tic=_tic zip=_zip) obs=1);&lt;/P&gt;&lt;P&gt;410&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dcl hash h(dataset: 'comp (rename=(tic=_tic zip=_zip))', multidata: 'yes');&lt;/P&gt;&lt;P&gt;411&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; h.definekey('fyear','sic3');&lt;/P&gt;&lt;P&gt;412&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; h.definedata('_tic','_zip');&lt;/P&gt;&lt;P&gt;413&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; h.definedone();&lt;/P&gt;&lt;P&gt;414&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;415&lt;/P&gt;&lt;P&gt;416&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set comp;&lt;/P&gt;&lt;P&gt;417&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _ct=0;&lt;/P&gt;&lt;P&gt;418&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h.find();&lt;/P&gt;&lt;P&gt;419&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; do while (_rc=0);&lt;/P&gt;&lt;P&gt;420&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if .&amp;lt;zipcitydistance(_zip,zip) &amp;lt;=60 then _ct+1;&lt;/P&gt;&lt;P&gt;421&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; h.has_next(result: _r);&lt;/P&gt;&lt;P&gt;422&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if _r ne 0 then _rc=h.find_next();&lt;/P&gt;&lt;P&gt;423&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; else leave;&lt;/P&gt;&lt;P&gt;424&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;425&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DENSITY_INDUSTRY=log(_ct);&lt;/P&gt;&lt;P&gt;426&lt;/P&gt;&lt;P&gt;427&amp;nbsp; run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;NOTE: There were 66452 observations read from the data set WORK.COMP.&lt;/P&gt;&lt;P&gt;NOTE: There were 1 observations read from the data set WORK.COMP.&lt;/P&gt;&lt;P&gt;NOTE: There were 66452 observations read from the data set WORK.COMP.&lt;/P&gt;&lt;P&gt;NOTE: The data set WORK.WANT has 66452 observations and 5 variables.&lt;/P&gt;&lt;P&gt;NOTE: DATA statement used (Total process time):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; real time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 5:26.81&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; cpu time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 5:17.99&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Less than 6mins, Not too shady. BTW, the has_next() method is not a must-have here. You can just use:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want (drop=_:);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; if _n_=1 then do;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set comp( rename=(tic=_tic zip=_zip) obs=1);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dcl hash h(dataset: 'comp (rename=(tic=_tic zip=_zip))', multidata: 'yes');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; h.definekey('fyear','sic3');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; h.definedata('_tic','_zip');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; h.definedone();&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; dcl hiter hi('h');&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set comp ;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _ct=0;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h.find();&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; do while (_rc=0);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if .&amp;lt;zipcitydistance(_zip,zip) &amp;lt;=60 then _ct+1;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; _rc=h.find_next();&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; DENSITY_INDUSTRY=log(_ct);&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I was hoping has_next will help the efficiency, well, it turned out no difference.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Haikuo&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 11 Mar 2012 18:25:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28266#M6572</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2012-03-11T18:25:55Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28267#M6573</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks Hai.kuo!&lt;/P&gt;&lt;P&gt;I wasn't aware of the possibilities given by 'multidata' together with find_next() &lt;/P&gt;&lt;P&gt;Learned something &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;@mspak&lt;/P&gt;&lt;P&gt;Go for Hai.kuo's second option. It looks pretty neat to me.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 12 Mar 2012 08:24:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28267#M6573</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2012-03-12T08:24:35Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28268#M6574</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks HaiKuo and Patrick,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Both hiter and hash objects are new to me. Thanks for sharing. I learned something new here. &lt;/P&gt;&lt;P&gt;The hash object provides a fast, easy way to perform lookups without sorting or indexing in data step. I think I have to take time to digest the program again by further reading on this method. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Running the program in data step within 6 minutes is great for me!!! I always took a few hours to run the program in SQL. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Will update in this discussion forum again in future if it proves successful. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;mspak&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 12 Mar 2012 14:01:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28268#M6574</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-12T14:01:14Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28269#M6575</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Hai.Kuo,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I ran the program and it is efficient. As the definition of industry density is log(1+ number of firms located within 60-mile radius in the same industry, should I change the DENSITY_INDUSTRY as log (1+ct) instead of log(_ct)?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;mspak&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 13 Mar 2012 05:11:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28269#M6575</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-13T05:11:50Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28270#M6576</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Not by your definitions. _ct has already included the self-match, which by your definition is 1+count of match to others, so you don't need to use 1+_Ct to replace _ct in this context.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Haikuo&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 13 Mar 2012 11:01:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28270#M6576</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2012-03-13T11:01:52Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28271#M6577</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you very much &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 13 Mar 2012 11:12:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28271#M6577</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-13T11:12:58Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28272#M6578</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;I reworked an alternate solution using you provided data on a personal machine.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;/* remove a few duplicates that existed using your statement that records should be unique by tic and fyear */&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;proc sort data=in.comp out=comp nodupkey; by tic fyear; run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;/* make sure you update your zipcode dataset using the information provided by Patrick to reduce the number of rejected records */&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;proc sql;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;create table foobar as&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;select a.zip,&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; a.tic,&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; a.fyear,&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; a.sic3,&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; sum(zipcitydistance(a.zip,c.zip)&amp;lt;60) as num_industry,&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; log(1 + calculated num_industry) as density_industry&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp; from comp a,&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; sashelp.zipcode b,&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; comp c&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp; where a.zip=b.zip and c.zip=b.zip and a.tic ne c.tic and a.fyear=c.fyear and a.sic3=c.sic3&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp; group by a.zip,a.tic,a.fyear,a.sic3&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;&amp;nbsp; order by a.tic,a.fyear;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;/* bring back the rejected records, leave the num_industry and density_industry as null or set to a desired other value */ &lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;data comp;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;merge comp(in=b) foobar(in=a);&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;by tic fyear;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;if a or (b and not a) then output;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P style="background-color: #ffffff; font-size: 12px; font-family: Arial, Helvetica, sans-serif;"&gt;Full program runtime 2.87 seconds on my system.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 13 Mar 2012 22:23:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28272#M6578</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2012-03-13T22:23:56Z</dc:date>
    </item>
    <item>
      <title>Re: Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28273#M6579</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks FriedEgg,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am thinking what is the purpose to have the source tables of "comp" 2 times (ie. comp a and comp c)? Is it for the matching purpose. I can see you set the where clause - a.tic ne c.tic &amp;amp; a.sic3=c.sic3. Does the "a.tic ne c.tic" indirectly assume the "comp a" as the list of firms that need matching and "com c" as the firms to be matched.&amp;nbsp; Then, matching criteria can be indirectly identified by the where clause "a.sic3=c.sic3".&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Processing time can be saved if we can set the criteria in the SQL wisely. SAS Hash object is also a high technology in SAS program. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks for sharing. I learnt something from you too &lt;img id="smileyhappy" class="emoticon emoticon-smileyhappy" src="https://communities.sas.com/i/smilies/16x16_smiley-happy.png" alt="Smiley Happy" title="Smiley Happy" /&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;MSPAK&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 14 Mar 2012 23:59:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28273#M6579</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-14T23:59:26Z</dc:date>
    </item>
    <item>
      <title>Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28274#M6580</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The comp dataset is referenced twice to perform a self join, as you assumed.&amp;nbsp; 'comp a' as you say will be the set of reference here and 'comp c' will be applied in similar fashion to a left outer join on year and sic3.&amp;nbsp; The reason for the tic condition was to avoid matching a record to itself.&amp;nbsp; In this situation SQL is much faster than the proposed hash solution due to the order of processing and the removal of the invalid zips before the density calculation.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 15 Mar 2012 16:39:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28274#M6580</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2012-03-15T16:39:35Z</dc:date>
    </item>
    <item>
      <title>Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28275#M6581</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks FriedEgg for confirmation.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am new to SAS and have used the program for merely 2 - 3 months. I have another question on the WHERE clause:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Do you mean that if the WHERE clause of "a.tic ne c.tic" is not available, then the firms (identified by tic) will be self-matched? If I never indicated them in the WHERE clause as "a.tic=c.tic" would this also generate the same result? &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;As per the following WHERE clause:&lt;/P&gt;&lt;P&gt;a.zip=b.zip and c.zip=b.zip and you never indicate that a.zip=c.zip, does it indirectly indicate a.zip ne c.zip?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In short, my question is that any difference if the "ne" indicated in WHERE clause compared to have no such a clause? &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Once again, thank you very much for helps.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;mspak&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 16 Mar 2012 07:30:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28275#M6581</guid>
      <dc:creator>mspak</dc:creator>
      <dc:date>2012-03-16T07:30:03Z</dc:date>
    </item>
    <item>
      <title>Density of industrial firms</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28276#M6582</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;a.tic ne c.tic -- match all values of tic other than itself&lt;/P&gt;&lt;P&gt;a.tic eq c.tic -- match only itself&lt;/P&gt;&lt;P&gt;not mentioning a.tic or c.tic -- match all values of tic to all values of tic&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;a.zip=b.zip -- this is used to validate the zipcodes in 'comp a' if a.zip is not present in 'sashelp.zipcode b' then the record is removed&lt;/P&gt;&lt;P&gt;b.zip=c.zip -- this is used to validate the zipcodes in 'comp c' if a.zip is not present in 'sashelp.zipcode b' then the record is removed&lt;/P&gt;&lt;P&gt;a.zip=c.zip -- this would only match firms in the same zip to each other for density calculation, which would be wrong.&amp;nbsp; It does not indirectly imply a.zip ne c.zip which would match all firms except those in the same zip code to eachother, also wrong.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 16 Mar 2012 08:09:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Density-of-industrial-firms/m-p/28276#M6582</guid>
      <dc:creator>FriedEgg</dc:creator>
      <dc:date>2012-03-16T08:09:41Z</dc:date>
    </item>
  </channel>
</rss>

