BookmarkSubscribeRSS Feed
soulful63
Calcite | Level 5

Good afternoon, 

Thanks so much on your assistance on the following... 

 

I am using SAS 9.3 ( The Basic version... NOT Enterprise, etc.).  I have a list of 500 people who each traveled to one city on a specific date (Visit) .  What I would like to know is what is the range in days between the 1st and 2nd visit per city, then calculate the mean of these ranges ( based on cities with 2 or more visitis).   What I did was sort the data by City & then by Visit (date variable).  When I did this,  found there is at least one visit per city. For some cities, there were only 2 visits,  and for other cities there were >  2 visits.  Figured out how to find range of first and last visit per city (in days) , and easily calculated mean by hand.  (see code below)... 

 

proc tabulate data = a;
class city;
var visit;
table city*visit, n nmiss (min max mean)*f=mmddyy10. range;
run;

 

But not sure how to calculate range per city based on 1st and 2nd visit only.  Finding the mean of these ranges ( 2 or more visits) I can do by hand pretty quickly, but if possible to code easily, then I would love to know as well. I sincerely appreciate the help :-).  

 

Here is sample data

 

 

City                     Visit

Atlanta                22-Aug-15

Franklin              11-Nov-15

Licoln Heights    25-Oct-15

New Mexico      17-Sep-15

Palm Springs     04-Aug-15

Palm Springs     23-Aug-15

Palm Springs     09-Sep-15

Palm Springs     11-Oct-15

 

San Diego          21-Aug-15

San Diego         18-Sep-15

San Diego         29-Sep-15

Temecula          25-Sep-15

 

Thank you 

11 REPLIES 11
ChrisNZ
Tourmaline | Level 20

So you want:

-if 2 visits, calculate the range

-if more than 2 visits, calculate the mean of the ranges of subsequent visits?

So if you have

San Diego         18-Sep-15

San Diego         27-Sep-15  range=9

San Diego         28-Sep-15  range=1   

San Diego         30-Sep-15  range=2    mean=12/3=4

Is that right?

 

soulful63
Calcite | Level 5

Hi there, 

 

I would only want to calculate the range between 1st and 2nd visit for cities with 2 or more visits.  So for this San Diego example,  I would just want range = 9,  then let's say for the next city Palm Springs has 3 visits,  and the range between the first & second visit =  7, and then the next city, Temecula only has 1 visit, so range = 0.  The average would be  ( (9+7)/2)  =  8.  

 

Thank you 

ChrisNZ
Tourmaline | Level 20

Like this?

 


data WANT;
  if LASTOBS then do;                     %* Output mean;
    MEAN=SUM/N;
    output; 
  end;
  set HAVE end=LASTOBS;
  by CITY;
  RANGE_FIRST2 = dif(VISIT);              %* Derive range;
  if lag(first.CITY) and not first.CITY;  %* Only keep 2nd city record;
  SUM+RANGE_FIRST2;                       %* Sum the ranges;
  N+1;                                    %* Count the ranges;
run;

proc print;
  var MEAN;
run;

 

Obs MEAN
1 23.5
soulful63
Calcite | Level 5

Yes!  This worked!  Thank you so much!  

Reeza
Super User

Assuming you have SAS dates use the dif function in conjunction with by processing. You can add a counter that resets at each city to count visits and only keep second occurrence of desired. I've commented it out below. 

 

Data want;
Set have;
By id city;

Days_between = dif(date);
If first.city then days_between = .;

If first.city then visit=1;
else visit+1;

*if visit ne 2 then delete;
Run;
soulful63
Calcite | Level 5

Thanks for response. 

 

I ran the suggested code and got the following output:  

 

Visit =  01JAN60 for all rows.  Days_between and date =  . (missing) 

Log states  'date is uinitialized' 

 

Also I'm wondering the code line -->  dif(date);  It doesn't highlight in blue as yours does.... 

 

Also is it not working b/c they're not SAS dates?  

Ksharp
Super User
data have;
input City  & $40.                 Visit : date11.;
format visit date9.;
cards;
Atlanta                22-Aug-15
Franklin              11-Nov-15
Licoln Heights    25-Oct-15
New Mexico      17-Sep-15
Palm Springs     04-Aug-15
Palm Springs     23-Aug-15
Palm Springs     09-Sep-15
Palm Springs     11-Oct-15
San Diego          21-Aug-15
San Diego         18-Sep-15
San Diego         29-Sep-15
Temecula          25-Sep-15
;
run;
proc sort data=have;by city visit;run;
data want;
 set have;
 by city ;
 if first.city then n=0;
 n+1;
 dif=dif(visit);
 if n=2;
run;
proc means data=want mean;
 var dif;
run;
soulful63
Calcite | Level 5
Yes, this works! I'm so appreciative for this!
Thank you!
PGStats
Opal | Level 21

The more, the merrier!

 

data meanLapse;
set have end=done; by City;
wasFirstCity = lag(first.city);
InitialVisit = lag(visit);
if wasFirstCity and not first.city then do;
    sumLapse + intck("DAY", InitialVisit, visit);
    nLapse + 1;
    end;
if done then do;
    meanLapse = sumLapse / nLapse;
    output;
    end;
keep nLapse meanLapse;
run;

proc print data=meanLapse noobs; run;

Edit: Sorry @ChrisNZ, I hadn't realized that this was almost identical to your solution. Smiley Embarassed  

PG
soulful63
Calcite | Level 5

Thanks so much!  I appreciate it 🙂 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1310 views
  • 0 likes
  • 5 in conversation