Thanks FriedEggs again, From your first part of SQL: proc sql; create table foo as select ticker, year, sum(missing(postcode)) as total_miss, count(iddir) as total_dir, count(distinct iddir) as total_same, (calculated total_miss/calculated total_dir) as percent_miss from in.misingcode group by ticker,year; The count(iddir) as total_dir includes those duplicate IDDIR; but the count(distinct iddir) as total_same, eliminate the duplicates and provides the total of directors. I think your total_same should be the total_dir by my definition. I wish to eliminate duplicates IDDIR. My total_same definition is that total directors which have the same zipcode for a given firm-year, but this should excludes those have missing zipcodes. In other words, directors who have missing postcodes do not counted as same postcodes. Change_following is seems incorect: change_dir for year 2007 = Change_following for year 2006. I would like to see the number of changes for the current year compared to the following years. Perhaps for this information, I can present them as time-series data for years 2001 - 2010 in terms of total number of unique directors (by IDDIR) reported. Then I should detect the significant changes for a given year. The year with most significant change should be identified and then the number of changes should be reported only for the year with the highest changes. Is there any quick way or special function that deal with missing values by grouping them according to the category, such as firm-year as in my case? The dataset that I provided - the incomplete should be more in earlier years than later years. Therefore, my purpose to detect significant change of total number of directors - to imply that in same firms, the incomplete data might be large in earlier years before the significant change of number of directors. I am digesting your second part of the SQL in which 3 tables was used as source tables. I think in SQL, the lagging and leading figures can be written as year+1 and year-1. In DATA step, I used PROC EXPAND to do this. I am imagining the tables after left joins by the criteria that you set. SQL do need creativity in designing the program. Thank you for your helps. Note: How long did you take to master this skill? I am just 3 months old in terms of SAS program, and having attended SAS Prog1 and prog2 trainings. Regadrs, mspak
... View more