Hi everyone,
I have a dataset that has 3 columns:
Date - Date of the event
Car - the number of the car
Client - The person who is riding in the car
If on a given date, a car has more than one client, they have two rows. I would like to make the data wide so I can get the average per car over a period of time.
The data that I have is this:
| Date | Car | Client |
| 01/01/18 | 1 | Ente |
| 01/02/18 | 1 | Patterson |
| 01/02/18 | 1 | Jeter |
| 01/02/18 | 2 | Davidson |
| 01/03/18 | 1 | Steve |
| 01/03/18 | 1 | Gonzalez |
| 01/03/18 | 2 | York |
| 01/03/18 | 2 | Camara |
| 01/03/18 | 2 | Camara |
| 01/04/18 | 1 | Waters |
| 01/04/18 | 1 | Ventre |
| 01/04/18 | 2 | Perez |
| 01/04/18 | 3 | Rodriguez |
| 01/04/18 | 3 | Jeter |
| 01/04/18 | 3 | Jiminez |
The data that I would like is this:
| Date | CAR1_Count | Car2_Count | Car3_Count | TotalCount |
| 1/1/2018 | 1 | 0 | 0 | 1 |
| 1/2/2018 | 2 | 1 | 0 | 3 |
| 1/3/2018 | 2 | 3 | 0 | 5 |
| 1/4/2018 | 2 | 1 | 3 | 6 |
Any suggestions would be greatly appreciated.
Thank you!
data have;
infile cards missover dlm=' ';
input Date $ Car Client $;
cards;
01/01/18 1 Ente
01/02/18 1 Patterson
01/02/18 1 Jeter
01/02/18 2 Davidson
01/03/18 1 Steve
01/03/18 1 Gonzalez
01/03/18 2 York
01/03/18 2 Camara
01/03/18 2 Camara
01/04/18 1 Waters
01/04/18 1 Ventre
01/04/18 2 Perez
01/04/18 3 Rodriguez
01/04/18 3 Jeter
01/04/18 3 Jiminez
;
proc freq data=have ;
tables date*car / out=want;
run;
proc sql;
create table want2 as
select date, count(date) as TotalCount
from have
group by date;
quit;
proc sort data=want;
by date car;
run;
data want3;
merge want2 want;
by date;
run;
proc transpose data=want3(drop=percent) out=want4(drop=_name_ _label_) prefix=Count_car ;
id car;
by date totalcount;
run;
PROC TABULATE should be able to do that easily. Untested but definitely in the right neighborhood:
proc tabulate data=have;
class date car;
tables date, (car all)*n=' ';
run;
This uses Proc means to display the mean number of riders per car per day that the car was used.
data have; infile cards missover dlm=' '; input Date $ Car Client $; cards; 01/01/18 1 Ente 01/02/18 1 Patterson 01/02/18 1 Jeter 01/02/18 2 Davidson 01/03/18 1 Steve 01/03/18 1 Gonzalez 01/03/18 2 York 01/03/18 2 Camara 01/03/18 2 Camara 01/04/18 1 Waters 01/04/18 1 Ventre 01/04/18 2 Perez 01/04/18 3 Rodriguez 01/04/18 3 Jeter 01/04/18 3 Jiminez ; proc summary data=have nway; class date car; output out=havesum (drop=_type_); run; proc means data=havesum (rename=(_freq_=count)) mean; class car; var count; run;
If you actually need a wider period of time you may need to include the days with 0 riders depending on what you mean by average per car over a period of time.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.