DID Tracker
https://communities.sas.com/kntur85557/tracker
Sun, 23 Jun 2024 21:51:49 GMT

Weighting (proc glimmix), using weights correctly for DID analysis
https://communities.sas.com/t5/Statistical-Procedures/Weighting-proc-glimmix-using-weights-correctly-for-DID-analysis/m-p/850209#M42098
<P><SPAN>I'm conducting a difference-in-difference analysis to assess the impact of public housing demolition on violent crime. I'm using a multilevel, negative binomial model coded with proc glimmix. My intervention group is census tracts where public housing was completely demolished, and my control group is census tracts where public housing underwent routine maintenance. There are two key pieces of data that could affect my analyses: 1) the public housing developments that were demolished are as large as 1,700 units, and as small as 400 units; 2) since I am using census data from the 2000 and 2010 censuses, my census tract population data abruptly shift, sometimes dropping by as many as 1,400 people between censuses.</SPAN></P><P> </P><P><SPAN>Question 1: Since the drop in the number of housing units would correlate with the population decline in a given census tract over time, I believe what I need to do is somehow weight the data analysis to account for there being larger public housing developments in some tracts and smaller developments in others prior to demolition. How do I go about this? Is it as simple as adding WEIGHT=number_units_at_start to the RANDOM statement? Most of the articles I've found are on survey weighting, and aren't helpful. Others are indecipherable, since I'm proficient but not fluent in biostatistics.</SPAN></P><P> </P><P><SPAN>Question 2: I've also entertained the idea of adjusting for the number of public housing units at the start of my timeframe. The control tracts lost housing units as well, but to routine maintenance, not wholesale demolition. What would the impact of this be vs. adding some kind of weighting variable? </SPAN></P><P> </P><P><SPAN>Here are examples of my code and data (there are 176 rows of data from 8 intervention and 8 control tracts in the full table; housing unit data are just examples since that variable isn't in my dataset at the moment). 
Thank you in advance for any advice.</SPAN></P><P> </P><P><SPAN>proc glimmix data=main.toph_es_demo PLOTS=pearsonpanel(marginal);<BR />class tract couplet exposed (ref="0") timeline (ref="-1") c00;<BR />model totcrime = timeline*pre timeline*post exposed timeline*pre*exposed timeline*post*exposed c00/ <BR /> solution</SPAN></P><P><SPAN> dist = negbin<BR /> link = log<BR /> offset = logpopyrs<BR /> cl;<BR />random int/subject=couplet(tract) type=un cl s;<BR />covtest 'var(couplet(tract))=0' 0 .;<BR />output out=residmodel1_negbin_TOPH_des pred(noblup)=predicted<BR /> pearson(noblup)=pearson;<BR />run;</SPAN></P><P> </P><TABLE><TBODY><TR><TD>Tract</TD><TD>Year</TD><TD>Timeline</TD><TD>Tract_pop</TD><TD>Crime_count</TD><TD>pre</TD><TD>post</TD><TD>exposed</TD><TD>log_population_yrs</TD><TD>violent_crime_rate</TD><TD>number_units_at_start</TD><TD>couplet</TD></TR><TR><TD>I</TD><TD>1995</TD><TD>-5</TD><TD>1205</TD><TD>81</TD><TD>1</TD><TD>0</TD><TD>1</TD><TD>7.09</TD><TD>67.22</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>1996</TD><TD>-4</TD><TD>1205</TD><TD>59</TD><TD>1</TD><TD>0</TD><TD>1</TD><TD>7.09</TD><TD>48.96</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>1997</TD><TD>-3</TD><TD>1205</TD><TD>26</TD><TD>1</TD><TD>0</TD><TD>1</TD><TD>7.09</TD><TD>21.58</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>1998</TD><TD>-2</TD><TD>1205</TD><TD>30</TD><TD>1</TD><TD>0</TD><TD>1</TD><TD>7.09</TD><TD>24.90</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>1999</TD><TD>-1</TD><TD>1205</TD><TD>24</TD><TD>0</TD><TD>0</TD><TD>1</TD><TD>7.09</TD><TD>19.92</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>2000</TD><TD>0</TD><TD>1205</TD><TD>26</TD><TD>0</TD><TD>1</TD><TD>1</TD><TD>7.09</TD><TD>21.58</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>2001</TD><TD>1</TD><TD>1205</TD><TD>18</TD><TD>0</TD><TD>1</TD><TD>1</TD><TD>7.09</TD><TD>14.94</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>2002</TD><TD>2</TD><TD>1205</TD><TD>10</TD><TD>0</TD><TD>1</TD><TD>1</TD><TD>7.09</
TD><TD>8.30</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>2003</TD><TD>3</TD><TD>1205</TD><TD>13</TD><TD>0</TD><TD>1</TD><TD>1</TD><TD>7.09</TD><TD>10.79</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>2004</TD><TD>4</TD><TD>1205</TD><TD>11</TD><TD>0</TD><TD>1</TD><TD>1</TD><TD>7.09</TD><TD>9.13</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>I</TD><TD>2005</TD><TD>5</TD><TD>1991</TD><TD>17</TD><TD>0</TD><TD>1</TD><TD>1</TD><TD>7.60</TD><TD>8.54</TD><TD>1700</TD><TD>1</TD></TR><TR><TD>C</TD><TD>1995</TD><TD>-5</TD><TD>290</TD><TD>9</TD><TD>1</TD><TD>0</TD><TD>0</TD><TD>5.67</TD><TD>31.03</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>1996</TD><TD>-4</TD><TD>290</TD><TD>17</TD><TD>1</TD><TD>0</TD><TD>0</TD><TD>5.67</TD><TD>58.62</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>1997</TD><TD>-3</TD><TD>290</TD><TD>7</TD><TD>1</TD><TD>0</TD><TD>0</TD><TD>5.67</TD><TD>24.14</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>1998</TD><TD>-2</TD><TD>290</TD><TD>8</TD><TD>1</TD><TD>0</TD><TD>0</TD><TD>5.67</TD><TD>27.59</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>1999</TD><TD>-1</TD><TD>290</TD><TD>7</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>5.67</TD><TD>24.14</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>2000</TD><TD>0</TD><TD>290</TD><TD>7</TD><TD>0</TD><TD>1</TD><TD>0</TD><TD>5.67</TD><TD>24.14</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>2001</TD><TD>1</TD><TD>290</TD><TD>6</TD><TD>0</TD><TD>1</TD><TD>0</TD><TD>5.67</TD><TD>20.69</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>2002</TD><TD>2</TD><TD>290</TD><TD>4</TD><TD>0</TD><TD>1</TD><TD>0</TD><TD>5.67</TD><TD>13.79</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>2003</TD><TD>3</TD><TD>290</TD><TD>13</TD><TD>0</TD><TD>1</TD><TD>0</TD><TD>5.67</TD><TD>44.83</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>2004</TD><TD>4</TD><TD>290</TD><TD>6</TD><TD>0</TD><TD>1</TD><TD>0</TD><TD>5.67</TD><TD>20.69</TD><TD>850</TD><TD>1</TD></TR><TR><TD>C</TD><TD>2005</TD><TD>5</TD><TD>244</TD><TD>8</TD><TD>0</TD><TD>1</TD><TD>
0</TD><TD>5.50</TD><TD>32.79</TD><TD>850</TD><TD>1</TD></TR></TBODY></TABLE>
Posted by DID, Sat, 17 Dec 2022 17:52:16 GMT

Re: proc glimmix for count data model fit assessment
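A note on Question 1 in the weighting post above: in PROC GLIMMIX, observation weights are supplied with a standalone WEIGHT statement, not as an option on the RANDOM statement. A minimal sketch, reusing the variable names from the posted code; whether size-based weights are substantively appropriate for this DID design is a separate question worth taking to a statistician:

```sas
proc glimmix data=main.toph_es_demo;
   class tract couplet exposed (ref="0") timeline (ref="-1") c00;
   model totcrime = timeline*pre timeline*post exposed
         timeline*pre*exposed timeline*post*exposed c00 /
         solution dist=negbin link=log offset=logpopyrs cl;
   /* WEIGHT is its own statement; using the baseline development
      size as the weight variable is illustrative, not a
      recommendation */
   weight number_units_at_start;
   random int / subject=couplet(tract) type=un cl s;
run;
```

Adjusting for number_units_at_start instead (Question 2) would mean adding it as a covariate on the MODEL statement rather than weighting; the two approaches answer different questions.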
https://communities.sas.com/t5/Statistical-Procedures/proc-glimmix-for-count-data-model-fit-assessment/m-p/839019#M41541
<P>This has all been very helpful, Steve! Thank you!</P>
Posted by DID, Mon, 17 Oct 2022 16:24:20 GMT

Re: proc glimmix for count data model fit assessment
https://communities.sas.com/t5/Statistical-Procedures/proc-glimmix-for-count-data-model-fit-assessment/m-p/838722#M41528
<P>Thank you, Steve. This is helpful.</P><P> </P><P>When I used proc genmod for another project I used the code:</P><P> </P><P>output out=residuals<BR />stdreschi = Stdreschi;<BR />Proc plot data=residuals;<BR />plot stdreschi*time_yrs_t;<BR />run;</P><P> </P><P>This code doesn't seem to be available for/work in proc glimmix. Could you share code that will enable proc glimmix to output the residuals and predicted values so that I can plot them?</P><P> </P>
Posted by DID, Fri, 14 Oct 2022 20:32:31 GMT

proc glimmix for count data model fit assessment
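On the residual-plotting question above: PROC GLIMMIX does not support the GENMOD-style STDRESCHI= keyword, but its OUTPUT statement can write predicted values and Pearson residuals to a dataset, which can then be plotted with PROC SGPLOT. A sketch, assuming the negative binomial model posted in this thread (the dataset name and the dropped "covariates" placeholder are assumptions):

```sas
proc glimmix data=main.simple_model;
   class tract post (ref='0') exposed (ref='0');
   model totcrime = post exposed post*exposed /
         solution dist=negbin link=log offset=logpopyrs cl;
   random int / subject=tract type=un s;
   /* write marginal predictions and Pearson residuals to a dataset */
   output out=residuals pred(noblup)=predicted
          pearson(noblup)=pearson;
run;

proc sgplot data=residuals;
   scatter x=predicted y=pearson;
   refline 0 / axis=y;
run;
```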
https://communities.sas.com/t5/Statistical-Procedures/proc-glimmix-for-count-data-model-fit-assessment/m-p/838414#M41513
<P>I'm conducting a difference-in-difference analysis to assess the impact of an intervention on violent crime. My outcome is violent crime, inputted as a count that is converted to a rate via an offset (ln of population years) to account for a changing population over time.</P><P> </P><P>I want to assess whether my count outcome data are overdispersed to confirm that I am appropriately using a negative binomial distribution. I'm using proc glimmix (SAS 9.4). I read about using the standard "Gener. Chi-Square/DF" output to determine whether data are overdispersed, with a value closer to 1.0 showing that the data are not overdispersed; but I was cautioned not to use this calculation to determine whether I should use a negative binomial model instead of a Poisson model.</P><P> </P><P>I have been advised to assess whether my outcome data are overdispersed by dividing the residual deviance by the predicted mean. Is there SAS code I can add to my glimmix model to calculate this? Here is my present code for a difference-in-difference model with a negative binomial distribution.</P><P> </P><P>proc glimmix data=main.simple_model;<BR />class tract post (ref='0') exposed (ref='0');<BR />model totcrime = post exposed post*exposed covariates/<BR />solution<BR />dist = negbin<BR />link = log<BR />offset = logpopyrs<BR />cl;<BR />random int/subject=tract type=un s;<BR />covtest 'var(tract)=0' . 0;<BR />run;</P><P> </P><P>Would another possibility be to run the model as a Poisson model and then a negative binomial model and do a likelihood ratio test?</P><P> </P><P>Thank you.</P>
Posted by DID, Thu, 13 Oct 2022 14:56:01 GMT

Re: Am I using the RANDOM statement properly? (PROC GLIMMIX, event study, clustered data)
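On the likelihood-ratio-test idea in the overdispersion post above: with random effects in the model, GLIMMIX's default pseudo-likelihood estimation does not produce a true log likelihood, so a Poisson-vs-negative-binomial comparison needs METHOD=LAPLACE (or METHOD=QUAD). A sketch under that assumption; because the negative binomial scale parameter is tested on the boundary of its parameter space, the chi-square p-value is conventionally halved:

```sas
/* Fit the Poisson model with a true (Laplace) likelihood */
proc glimmix data=main.simple_model method=laplace;
   class tract post (ref='0') exposed (ref='0');
   model totcrime = post exposed post*exposed /
         solution dist=poisson link=log offset=logpopyrs cl;
   random int / subject=tract;
   ods output FitStatistics=fit_poisson;
run;

/* Refit with dist=negbin, saving FitStatistics=fit_negbin, then
   compare: LR = (-2 log L, Poisson) - (-2 log L, negbin),
   referred to a chi-square with 1 df, with the p-value halved. */
```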
https://communities.sas.com/t5/Statistical-Procedures/Am-I-using-the-RANDOM-statement-properly-PROC-GLIMMIX-event/m-p/816090#M40288
<P>I think I found an answer for my second question here: <A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_mixed_examples05.htm" target="_blank">https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.4/statug/statug_mixed_examples05.htm</A></P>
Posted by DID, Wed, 01 Jun 2022 19:09:30 GMT

Re: Am I using the RANDOM statement properly? (PROC GLIMMIX, event study, clustered data)
https://communities.sas.com/t5/Statistical-Procedures/Am-I-using-the-RANDOM-statement-properly-PROC-GLIMMIX-event/m-p/815735#M40264
Thank you for your reply, Steve. As I've understood things so far, I would exponentiate the fixed effects estimates to obtain a ratio of rate ratios. So exponentiating a negative fixed effect estimate (e^Beta) provided by SAS would give a percentage decrease in my intervention group vs. the control group. Is that how you understand it? I've wondered if there is an option to ask SAS to exponentiate the fixed effects estimates for me. Do you know of one?<BR /><BR />Additionally, I once asked on a forum about modeling the violent crime rates directly rather than using the log link, and I was told that doing so would risk imprecision in the error terms, and that it was better to use the log link. Interested in your thoughts. Thank you.
Posted by DID, Mon, 30 May 2022 19:26:44 GMT

Re: Am I using the RANDOM statement properly? (PROC GLIMMIX, event study, clustered data)
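On the exponentiation question above: the ESTIMATE statement in PROC GLIMMIX has EXP and CL options that report e^estimate with confidence limits, so SAS can do the exponentiation directly. A sketch; the label and contrast coefficients are illustrative and would need to match the actual CLASS level ordering in the output:

```sas
/* inside the PROC GLIMMIX step, after the MODEL statement */
estimate 'exposed vs. unexposed (rate ratio)' exposed 1 -1 / exp cl;
```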
https://communities.sas.com/t5/Statistical-Procedures/Am-I-using-the-RANDOM-statement-properly-PROC-GLIMMIX-event/m-p/815549#M40260
<a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/181543">@svh</a>: Thank you for the suggestion of adding "s" to the random statements. Two follow-up questions, if you don't mind. Please let me know if it would be easier to answer these questions with screenshots provided.<BR /><BR />Using both of these lines of code:<BR />random int/subject=tract s;<BR />random int/subject=couplet(tract) type=un s;<BR /><BR />followed by these:<BR />covtest 'var(tract)=0' . 0;<BR />covtest 'var(couplet(tract))=0' 0 .;<BR /><BR />I get output for the covtests saying that the random effects are not significant (p=1.000 for both, note: MI); however, if I run a model using each random statement individually and perform the covtest for the respective random statement, the output says each random effect is significant (p<.0001, MI). Also, the parameters provided by "s" are the same in the models where the random statements are added in individually. If I use both random statements in the model, the estimates provided as a result of each random statement (tract and couplet(tract)) are different.<BR /><BR />1. Does this mean I should use just the "random int/subject=couplet(tract) type=un s;" statement, since it indicates to SAS that the model is hierarchical? Whether I use one random statement or both, the fixed effects parameters stay the same.<BR /><BR />2. Since the output for the random effects estimates provides intercepts, then after exponentiating the estimates, the interpretation is the baseline level of violent crime/1000 person-years for each census tract, correct?<BR /><BR />Thanks very much.
Posted by DID, Sat, 28 May 2022 16:16:18 GMT

Re: Am I using the RANDOM statement properly? (PROC GLIMMIX, event study, clustered data)
https://communities.sas.com/t5/Statistical-Procedures/Am-I-using-the-RANDOM-statement-properly-PROC-GLIMMIX-event/m-p/815108#M40243
<P>Thank you very much for your response.</P>
Posted by DID, Wed, 25 May 2022 20:45:24 GMT

Am I using the RANDOM statement properly? (PROC GLIMMIX, event study, clustered data)
https://communities.sas.com/t5/Statistical-Procedures/Am-I-using-the-RANDOM-statement-properly-PROC-GLIMMIX-event/m-p/812814#M40091
<P>Hello. I hope I've posted to the correct location. I'm using SAS version 9.4. I'm not a novice SAS user, but I'm not an expert. I use SAS in the context of applied public health, and any assistance using straightforward language, in addition to/instead of technical jargon, would be greatly appreciated.</P><P> </P><P>I'm working on a difference-in-difference (DID) analysis of the impact of public housing demolition on violent crime at the census tract level (data sample attached). I am using annual data, and my timeline is from five years prior to the intervention (demolition) to five years post-intervention. My intervention group is a set of 8 "target" census tracts where public housing was demolished. My comparison group is a group of 8 "other public housing (OPH)" census tracts where public housing underwent routine maintenance. I have already conducted a couple of DID analyses using pooled data. My question stems from the third analysis--an event study. The demolitions in the target tracts occurred in different years, so in order to use the OPH tracts as a comparison group I matched the target and OPH tracts based on the % males ages 15-34, and then assigned the 5-year pre-post timing of the target tract to its OPH tract pair. Therefore I have a variable for each tract, as well as a "couplet" variable for each target/OPH pair. The variable C00 accounts for the fact that there are data from two censuses in the model.</P><P> </P><P>My questions are:</P><P>1) I think I have a multilevel model here. 
I'm looking at census tracts and couplets of census tracts, but this isn't quite the same as looking at, for example, appointments of patients in clinics as in this SAS guide (<A href="http://support.sas.com/kb/40/724.html" target="_blank" rel="noopener">http://support.sas.com/kb/40/724.html</A>). I used this code first:</P><P>random int/subject=tract;<BR />random int/subject=couplet(tract) type=un;</P><P>thinking that the tract level was the larger level and the couplet was the smaller one. The model converges in 13 iterations, with no error messages. Then someone told me they saw my data the other way around--that "couplet" was the larger level and tract was the smaller level. So I switched the code to reflect that. The results were the same as before, but I had to use the "nloptions" command to increase the number of iterations so the model would converge, and I received an error message ("<SPAN>Estimated G matrix is not positive definite."). Was my first crack at the code correct then?<BR /></SPAN></P><DIV><SPAN> </SPAN></DIV><P>2) Am I using the "random" code correctly? I'm not sure I need both lines. And maybe I just need the second line (random int/subject=couplet(tract) type=un; ). I've attempted to use the covtest code to test this, but I'm not completely understanding how to order the "0 ." and what's going on behind the scenes in SAS so that I can interpret the output.</P><P> </P><P>3) One last, model-building oriented question: None of my covariates were statistically significantly contributing to the model. This makes sense given the similarity of the target and OPH tract demographics. Can I then exclude C00 from this model, since C00 is meant to tell the model that the census data (which I'm not including) came from two separate censuses?</P><P> </P><P>I included all of the event study code and side notes I have at present to show my thought process. Thank you very much for reading all of this, and for your feedback. 
I hope I've described my process clearly.</P><P> </P><P>proc glimmix data=main.dataset;<BR />class tract couplet exposed (ref="0") timeline (ref="-1") c00;<BR />model totcrime = timeline*pre timeline*post exposed timeline*pre*exposed timeline*post*exposed c00/<BR />solution<BR />dist = negbin<BR />link = log<BR />offset = logpopyrs<BR />cl;</P><P>random int/subject=tract; *Original code-- the model converges in 13 iterations;<BR />random int/subject=couplet(tract) type=un; *Using this line of code by itself and with the one above comes up with the same results;</P><P> </P><P>/*random int/subject=couplet type=un; *edited code--as suggested by someone else;*/<BR />/*random int/subject=tract(couplet) type=un; *edited code--as suggested by someone else;*/</P><P> </P><P>/*random int/subject=couplet type=un; *tested this most recently. Results don't make sense based on background knowledge.;*/</P><P><BR />nloptions maxiter=120;<BR />/*covtest 'No random couplet effect' zeroG; *P<.001 says the couplet effect (first random statement) is necessary in the model (<A href="http://support.sas.com/kb/40/724.html" target="_blank" rel="noopener">http://support.sas.com/kb/40/724.html</A>);*/</P><P>/*covtest 'var(couplet)=0' 0 ./estimates;*/<BR />covtest 'var(tract)=0' . 0;<BR />/*covtest 'var(couplet(tract))=0' 0 .;*/<BR />run;</P>
Posted by DID, Wed, 11 May 2022 21:33:42 GMT

Re: Time-series data with time gaps that need to be filled
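On the COVTEST ordering question in the post above: in a COVTEST statement, the test values are listed in the order the covariance parameters appear in the "Covariance Parameter Estimates" table; a number fixes that parameter at the stated value under the null hypothesis, and a "." leaves it free. With both RANDOM statements plus the negative binomial scale parameter, a sketch of the two separate boundary tests might look like this (the parameter order here is an assumption to verify against your own output):

```sas
/* Assumed parameter order: var(tract), var(couplet(tract)),
   negbin scale. Each COVTEST fixes one variance at 0 under the
   null and leaves the remaining parameters free. */
covtest 'var(tract)=0'          0 . .;
covtest 'var(couplet(tract))=0' . 0 .;
```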
https://communities.sas.com/t5/SAS-Programming/Time-series-data-with-time-gaps-that-need-to-be-filled/m-p/743948#M233010
Thank you, Jim. How do I know which day SAS is seeing, so I know which "same day" SAS is using? When I used the nodupkey option to drop duplicate categorical month/days in order to get the dataset the sample comes from, I don't know which of the days it dropped from among the numerical month/yrs. Here's some of the original code, if that helps.<BR /><BR />*Turn o_moyr into a character variable to aid in grouping crime count;<BR />data crime.crime3;<BR />set crime.crime2;<BR />o_monyrc = put(o_year, dtmonyy7.);<BR />run;<BR /><BR />(lots of code...)<BR /><BR />data main.dataset5_a ;<BR />set main.dataset5;<BR />where tract in (305, 402,403,501,506,708,902,1005,1017,1106,1113,1114,1115,1203,<BR />1204,1207,1301,1302,1304, 1404, 1405, 1603, 1607, 2807, 2902);<BR />run;<BR />*19043 observations, 57 variables;<BR /><BR />proc sort data=main.dataset5_a out=main.dataset5_a2 nodupkey;<BR />by tract o_monyrc;<BR />run;<BR />*5704 observations, 57 variables;<BR /><BR />proc sort data=main.dataset5_a2;<BR />by tract o_moyr;<BR />run;
Posted by DID, Wed, 26 May 2021 16:37:38 GMT

Time-series data with time gaps that need to be filled
https://communities.sas.com/t5/SAS-Programming/Time-series-data-with-time-gaps-that-need-to-be-filled/m-p/743940#M233008
<DIV class="branch"><DIV><DIV align="left"><P>Hello. I've been researching a solution to this question for a couple of days and I haven't found anything as specific as what I need. I'm not a succinct coder, and I am somewhere between novice and intermediate. I'd really appreciate help with the following:</P><P> </P><P>I have a dataset that should have 7,800 observations (25 census tracts x 26 years x 12 month/years).</P><P>It has 5,704 observations because there are month/years where--in the original crime dataset--the crime count for that month/yr and tract was 0. (The counts here are total crime including different UCR hierarchy codes.) The sample below has 7 sample rows. Everything to the right of crimecount is a demographic variable from the US census.</P><P> </P><P>The census data is from three decennial censuses--1990 population data (trctpop) is applied from 1990-1994, 2000 data is applied from 1995-2004, and 2010 data is applied from 2005-2015.</P><P> </P><P>I want to fill in the missing month/years with all of the identical census population data and demographic information for the appropriate month/year and census tract, and add a 0 for crimecount. So for example, between March and May 1990 below, a new observation would be added, identical to these two observations, except the month/year would say April1990 and there would be a 0 in the crimecount column.</P><P> </P><P>As I began coding the dataset below, I wanted month and year, so o_moyr in the table is formatted as dtmonyy7. The problem is that SAS "sees" the <U>day</U> in the date, and I couldn't drop duplicate month/yrs because SAS wasn't just looking at the month and year, but also the day. So I created o_monyrc as a categorical version of o_moyr and used the nodupkey option to get SAS to see beyond the day and drop duplicate month/years. The problem, of course, is that SAS orders o_monyrc alphabetically, starting with April, so I have to sort by o_moyr to keep the dataset ordered properly. 
As I've tried to figure out adding in the missing month/years and 0's for crimecount, the date formatting issue continues to be a problem because SAS still sees the day in o_moyr. I'd also appreciate suggestions for how to deal with that. Perhaps I could extract the month and the year into different columns from the get go and drop o_monyrc?</P><P> </P><P>Thank you for any and all help.</P><P> </P><TABLE cellspacing="0" cellpadding="5"><TBODY><TR><TD>o_monyrc</TD><TD>o_moyr</TD><TD>tract</TD><TD>trctpop</TD><TD><FONT color="#800000">crimecount</FONT></TD><TD>shrblk</TD><TD>numhhs</TD><TD>fmcn</TD><TD>ffhn</TD><TD>ffhd</TD><TD>educ8</TD><TD>educ11</TD></TR><TR><TD>JAN1990</TD><TD>JAN1990</TD><TD>305</TD><TD>2583</TD><TD>8</TD><TD>2493</TD><TD>1133</TD><TD>44</TD><TD>191</TD><TD>257</TD><TD>295</TD><TD>541</TD></TR><TR><TD>FEB1990</TD><TD>FEB1990</TD><TD>305</TD><TD>2583</TD><TD>4</TD><TD>2493</TD><TD>1133</TD><TD>44</TD><TD>191</TD><TD>257</TD><TD>295</TD><TD>541</TD></TR><TR><TD><FONT color="#800000">MAR1990</FONT></TD><TD><FONT color="#800000">MAR1990</FONT></TD><TD><FONT color="#800000">305</FONT></TD><TD><FONT color="#800000">2583</FONT></TD><TD><FONT color="#800000">1</FONT></TD><TD><FONT color="#800000">2493</FONT></TD><TD><FONT color="#800000">1133</FONT></TD><TD><FONT color="#800000">44</FONT></TD><TD><FONT color="#800000">191</FONT></TD><TD><FONT color="#800000">257</FONT></TD><TD><FONT color="#800000">295</FONT></TD><TD><FONT color="#800000">541</FONT></TD></TR><TR><TD><FONT color="#800000">MAY1990</FONT></TD><TD><FONT color="#800000">MAY1990</FONT></TD><TD><FONT color="#800000">305</FONT></TD><TD><FONT color="#800000">2583</FONT></TD><TD><FONT color="#800000">3</FONT></TD><TD><FONT color="#800000">2493</FONT></TD><TD><FONT color="#800000">1133</FONT></TD><TD><FONT color="#800000">44</FONT></TD><TD><FONT color="#800000">191</FONT></TD><TD><FONT color="#800000">257</FONT></TD><TD><FONT color="#800000">295</FONT></TD><TD><FONT 
color="#800000">541</FONT></TD></TR><TR><TD>JUN1990</TD><TD>JUN1990</TD><TD>305</TD><TD>2583</TD><TD>11</TD><TD>2493</TD><TD>1133</TD><TD>44</TD><TD>191</TD><TD>257</TD><TD>295</TD><TD>541</TD></TR><TR><TD>JUL1990</TD><TD>JUL1990</TD><TD>305</TD><TD>2583</TD><TD>7</TD><TD>2493</TD><TD>1133</TD><TD>44</TD><TD>191</TD><TD>257</TD><TD>295</TD><TD>541</TD></TR><TR><TD>AUG1990</TD><TD>AUG1990</TD><TD>305</TD><TD>2583</TD><TD>7</TD><TD>2493</TD><TD>1133</TD><TD>44</TD><TD>191</TD><TD>257</TD><TD>295</TD><TD>541</TD></TR></TBODY></TABLE></DIV></DIV></DIV>
Posted by DID, Wed, 26 May 2021 16:15:19 GMT
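A sketch of one way to fill the missing month/years described above: collapse o_moyr to a true month value (dropping the day with DATEPART and INTNX), build the complete tract-by-month frame with PROC SQL, then left-join the crime counts, defaulting missing months to 0. Dataset and variable names follow the post; the month list and the remerge of the census variables (constant within tract and census era) are assumptions to adapt:

```sas
/* 1) Collapse the datetime to a month-start date so the stray day
      can no longer create false duplicates or bad sort orders */
data have;
   set main.dataset5_a;
   month = intnx('month', datepart(o_moyr), 0, 'b');
   format month monyy7.;
run;

/* 2) Complete tract x month frame, then fill gaps with crimecount=0 */
proc sql;
   create table frame as
   select t.tract, m.month
   from (select distinct tract from have) as t
        cross join
        (select distinct month from have) as m;

   create table filled as
   select f.tract, f.month,
          coalesce(h.crimecount, 0) as crimecount
   from frame as f
        left join have as h
        on f.tract = h.tract and f.month = h.month
   order by f.tract, f.month;
quit;
```

Note that "select distinct month from have" misses any month absent from every tract; for the full 1990-2015 range, generating the calendar with a DATA step DO loop over INTNX would be safer. The census demographics can then be remerged by tract and census era.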