04-17-2016 05:41 PM
Thanks so much on your assistance on the following...
I am using SAS 9.3 ( The Basic version... NOT Enterprise, etc.). I have a list of 500 people who each traveled to one city on a specific date (Visit) . What I would like to know is what is the range in days between the 1st and 2nd visit per city, then calculate the mean of these ranges ( based on cities with 2 or more visitis). What I did was sort the data by City & then by Visit (date variable). When I did this, found there is at least one visit per city. For some cities, there were only 2 visits, and for other cities there were > 2 visits. Figured out how to find range of first and last visit per city (in days) , and easily calculated mean by hand. (see code below)...
proc tabulate data = a;
table city*visit, n nmiss (min max mean)*f=mmddyy10. range;
But not sure how to calculate range per city based on 1st and 2nd visit only. Finding the mean of these ranges ( 2 or more visits) I can do by hand pretty quickly, but if possible to code easily, then I would love to know as well. I sincerely appreciate the help :-).
Here is sample data
Licoln Heights 25-Oct-15
New Mexico 17-Sep-15
Palm Springs 04-Aug-15
Palm Springs 23-Aug-15
Palm Springs 09-Sep-15
Palm Springs 11-Oct-15
San Diego 21-Aug-15
San Diego 18-Sep-15
San Diego 29-Sep-15
04-17-2016 06:11 PM
So you want:
-if 2 visits, calculate the range
-if more than 2 visits, calculate the mean of the ranges of subsequent visits?
So if you have
San Diego 18-Sep-15
San Diego 27-Sep-15 range=9
San Diego 28-Sep-15 range=1
San Diego 30-Sep-15 range=2 mean=12/3=4
Is that right?
04-17-2016 06:19 PM
I would only want to calculate the range between 1st and 2nd visit for cities with 2 or more visits. So for this San Diego example, I would just want range = 9, then let's say for the next city Palm Springs has 3 visits, and the range between the first & second visit = 7, and then the next city, Temecula only has 1 visit, so range = 0. The average would be ( (9+7)/2) = 8.
04-17-2016 06:56 PM
data WANT; if LASTOBS then do; %* Output mean; MEAN=SUM/N; output; end; set HAVE end=LASTOBS; by CITY; RANGE_FIRST2 = dif(VISIT); %* Derive range; if lag(first.CITY) and not first.CITY; %* Only keep 2nd city record; SUM+RANGE_FIRST2; %* Sum the ranges; N+1; %* Count the ranges; run; proc print; var MEAN; run;
04-17-2016 06:12 PM
Assuming you have SAS dates use the dif function in conjunction with by processing. You can add a counter that resets at each city to count visits and only keep second occurrence of desired. I've commented it out below.
Data want; Set have; By id city; Days_between = dif(date); If first.city then days_between = .;
If first.city then visit=1;
*if visit ne 2 then delete;
04-17-2016 07:03 PM
Thanks for response.
I ran the suggested code and got the following output:
Visit = 01JAN60 for all rows. Days_between and date = . (missing)
Log states 'date is uinitialized'
Also I'm wondering the code line --> dif(date); It doesn't highlight in blue as yours does....
Also is it not working b/c they're not SAS dates?
04-17-2016 09:11 PM
data have; input City & $40. Visit : date11.; format visit date9.; cards; Atlanta 22-Aug-15 Franklin 11-Nov-15 Licoln Heights 25-Oct-15 New Mexico 17-Sep-15 Palm Springs 04-Aug-15 Palm Springs 23-Aug-15 Palm Springs 09-Sep-15 Palm Springs 11-Oct-15 San Diego 21-Aug-15 San Diego 18-Sep-15 San Diego 29-Sep-15 Temecula 25-Sep-15 ; run; proc sort data=have;by city visit;run; data want; set have; by city ; if first.city then n=0; n+1; dif=dif(visit); if n=2; run; proc means data=want mean; var dif; run;
04-17-2016 11:33 PM - edited 04-17-2016 11:48 PM
The more, the merrier!
data meanLapse; set have end=done; by City; wasFirstCity = lag(first.city); InitialVisit = lag(visit); if wasFirstCity and not first.city then do; sumLapse + intck("DAY", InitialVisit, visit); nLapse + 1; end; if done then do; meanLapse = sumLapse / nLapse; output; end; keep nLapse meanLapse; run; proc print data=meanLapse noobs; run;
Edit: Sorry @ChrisNZ, I hadn't realized that this was almost identical to your solution.