Good afternoon,
Thanks so much on your assistance on the following...
I am using SAS 9.3 ( The Basic version... NOT Enterprise, etc.). I have a list of 500 people who each traveled to one city on a specific date (Visit) . What I would like to know is what is the range in days between the 1st and 2nd visit per city, then calculate the mean of these ranges ( based on cities with 2 or more visitis). What I did was sort the data by City & then by Visit (date variable). When I did this, found there is at least one visit per city. For some cities, there were only 2 visits, and for other cities there were > 2 visits. Figured out how to find range of first and last visit per city (in days) , and easily calculated mean by hand. (see code below)...
proc tabulate data = a;
class city;
var visit;
table city*visit, n nmiss (min max mean)*f=mmddyy10. range;
run;
But not sure how to calculate range per city based on 1st and 2nd visit only. Finding the mean of these ranges ( 2 or more visits) I can do by hand pretty quickly, but if possible to code easily, then I would love to know as well. I sincerely appreciate the help :-).
Here is sample data
City Visit
Atlanta 22-Aug-15
Franklin 11-Nov-15
Licoln Heights 25-Oct-15
New Mexico 17-Sep-15
Palm Springs 04-Aug-15
Palm Springs 23-Aug-15
Palm Springs 09-Sep-15
Palm Springs 11-Oct-15
San Diego 21-Aug-15
San Diego 18-Sep-15
San Diego 29-Sep-15
Temecula 25-Sep-15
Thank you
So you want:
-if 2 visits, calculate the range
-if more than 2 visits, calculate the mean of the ranges of subsequent visits?
So if you have
San Diego 18-Sep-15
San Diego 27-Sep-15 range=9
San Diego 28-Sep-15 range=1
San Diego 30-Sep-15 range=2 mean=12/3=4
Is that right?
Hi there,
I would only want to calculate the range between 1st and 2nd visit for cities with 2 or more visits. So for this San Diego example, I would just want range = 9, then let's say for the next city Palm Springs has 3 visits, and the range between the first & second visit = 7, and then the next city, Temecula only has 1 visit, so range = 0. The average would be ( (9+7)/2) = 8.
Thank you
Like this?
data WANT;
if LASTOBS then do; %* Output mean;
MEAN=SUM/N;
output;
end;
set HAVE end=LASTOBS;
by CITY;
RANGE_FIRST2 = dif(VISIT); %* Derive range;
if lag(first.CITY) and not first.CITY; %* Only keep 2nd city record;
SUM+RANGE_FIRST2; %* Sum the ranges;
N+1; %* Count the ranges;
run;
proc print;
var MEAN;
run;
Obs | MEAN |
---|---|
1 | 23.5 |
Yes! This worked! Thank you so much!
Assuming you have SAS dates use the dif function in conjunction with by processing. You can add a counter that resets at each city to count visits and only keep second occurrence of desired. I've commented it out below.
Data want;
Set have;
By id city;
Days_between = dif(date);
If first.city then days_between = .;
If first.city then visit=1;
else visit+1;
*if visit ne 2 then delete;
Run;
Thanks for response.
I ran the suggested code and got the following output:
Visit = 01JAN60 for all rows. Days_between and date = . (missing)
Log states 'date is uinitialized'
Also I'm wondering the code line --> dif(date); It doesn't highlight in blue as yours does....
Also is it not working b/c they're not SAS dates?
data have;
input City & $40. Visit : date11.;
format visit date9.;
cards;
Atlanta 22-Aug-15
Franklin 11-Nov-15
Licoln Heights 25-Oct-15
New Mexico 17-Sep-15
Palm Springs 04-Aug-15
Palm Springs 23-Aug-15
Palm Springs 09-Sep-15
Palm Springs 11-Oct-15
San Diego 21-Aug-15
San Diego 18-Sep-15
San Diego 29-Sep-15
Temecula 25-Sep-15
;
run;
proc sort data=have;by city visit;run;
data want;
set have;
by city ;
if first.city then n=0;
n+1;
dif=dif(visit);
if n=2;
run;
proc means data=want mean;
var dif;
run;
The more, the merrier!
data meanLapse;
set have end=done; by City;
wasFirstCity = lag(first.city);
InitialVisit = lag(visit);
if wasFirstCity and not first.city then do;
sumLapse + intck("DAY", InitialVisit, visit);
nLapse + 1;
end;
if done then do;
meanLapse = sumLapse / nLapse;
output;
end;
keep nLapse meanLapse;
run;
proc print data=meanLapse noobs; run;
Edit: Sorry @ChrisNZ, I hadn't realized that this was almost identical to your solution.
Thanks so much! I appreciate it 🙂
@PGStats No worries. Great minds think alike. 🙂
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.