Hello lads,
I'am quite new to SAS EG so the question I will ask might be simple but I can't get it done. My current dataset looks like this (example, because in reality its much bigger):
Location - VehicleType - TotalVehicles - Date
2646 - AAA - 06 - 10December2019.
2646 - GGG - 13 - 10December2019.
2646 - AAA - 02 - 10December2019.
2646 - FFF - 11 - 10December2019.
2646 - GGG - 02 - 10December2019.
2646 - CCC- 11 - 10December2019.
I would like to have the total vehicles that drove on that location per month. So as an example, I want the following:
Location - VehicleType - TotalVehicles - Date
2646 - AAA- 20 - December2019.
2646 - TTT - 78 - December2019
2646 - GGG- 230- December2019.
2646 - BBB- 780 - December2019
2646 - HHH- 20 - December2019.
2646 - NNN- 78 - December2019
I actually want the total per month per vehicle type. That means each vehicletype must have just 1 date variable.
Can somebody help me with this? Appreciate it.
In your expected result, you have three observations for 2646-AAA-December2019. That does not make sense.
@AK100 wrote:
Hello lads,
I'am quite new to SAS EG so the question I will ask might be simple but I can't get it done. My current dataset looks like this (example, because in reality its much bigger):
Location - VehicleType - TotalVehicles - Date
2646 - AAA - 06 - 10December2019.
2646 - GGG - 13 - 10December2019.
2646 - AAA - 02 - 10December2019.
2646 - FFF - 11 - 10December2019.
2646 - GGG - 02 - 10December2019.
2646 - CCC- 11 - 10December2019.
I would like to have the total vehicles that drove on that location per month. So as an example, I want the following:
Location - VehicleType - TotalVehicles - Date
2646 - AAA- 20 - December2019.
2646 - AAA - 78 - December2019
2646 - GGG- 230- December2019.
2646 - BBB- 780 - December2019
2646 - AAA- 20 - December2019.
2646 - BBB - 78 - December2019
I actually want the total per month per vehicle type. That means each vehicletype must have just 1 date variable.
Can somebody help me with this? Appreciate it.
The simplest way is just a quick SQL query:
proc sql;
create table want as
select
location,
vehicletype,
intnx('month',date,0,'b') as date format=yymmd7.,
sum(totalvehicles) as totalvehicles
from have
group by location, vehicletype, calculated date;
quit;
Assuming your desired result should actually look like...
...and you want to use the EG query builder and not code:
The INTNX function is used to increment date/time/datetime values. 0 means to stay in the same month (no increment), and 'b' means to align the result to the beginning. The way we used the function means that we always get the first day of the month as a result, allowing us to use the "date" variable as a month designator for grouping.
Okay guy's that makes sense, its seems clear. However, I forgot to tell 1 thing. My Totalvehicle column is actually a COUNT of the VehicleType column.
My first intention was to SUM the VehicleType, but thats not possible ofcourse as those are not numbers but just letters.
Does the COUNT VehicleType I'am using give me the same result as the SUM vehicletype you are using @Patrick ?
The final step is to calculate percentages so I can say for example: Okay vehicletype X drove past that location 560 times in the month X, that is a percentage of 21% of all the types that drove there that month.
If every observation in the source dataset represents ONE vehicle, then a coun(*) would do it instead of the sum.
Then this will do it:
data have;
input Location $ VehicleType $ Date :date9.;
format date yymmddd10.;
datalines;
2646 AAA 10Dec2019
2646 GGG 10Dec2019
2646 AAA 10Dec2019
2646 FFF 10Dec2019
2646 GGG 10Dec2019
2646 CCC 10Dec2019
;
proc sql;
create table want as
select
location,
vehicletype,
intnx('month',date,0,'b') as date format=yymmd7.,
count(*) as totalvehicles
from have
group by location, vehicletype, calculated date;
quit;
Providing the data in a data step with datalines enables everybody to immediately rerun the code for proof or testing.
Yeah! That's why it's important to take your time and provide representative sample data posted as a SAS Data step and then clearly show how the desired result should look like using the sample data. Such an approach removes a lot of ambiguity. You will have to spend a bit more time asking the initial question but then you'll get better answers faster.
Okay Patrick, thank you.
That's the syntax of the SAS intnx() function documented here.
This is a very powerful SAS function which lets you shift SAS Date values. What I'm doing here: Take a date and align it to the first day of the month the date is in. This way I then can summarize by month as now all dates are aligned to the first of the month.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.