BookmarkSubscribeRSS Feed
AK100
Pyrite | Level 9

Hello lads,

 

I'am quite new to SAS EG so the question I will ask might be simple but I can't get it done. My current dataset looks like this (example, because in reality its much bigger):

Location - VehicleType - TotalVehicles - Date

2646 -  AAA -  06 - 10December2019.

2646 - GGG - 13 - 10December2019.

2646 -  AAA -  02 - 10December2019.

2646 -  FFF - 11 - 10December2019.

2646 - GGG -  02 - 10December2019.

2646 - CCC- 11 - 10December2019.

 

I would like to have the total vehicles that drove on that location per month. So as an example, I want the following:

Location - VehicleType - TotalVehicles - Date

2646 -  AAA-  20 - December2019.

2646 - TTT - 78 - December2019

2646 -  GGG-  230- December2019.

2646 - BBB- 780 - December2019

2646 -  HHH-  20 - December2019.

2646 - NNN- 78 - December2019

 

I actually want the total per month per vehicle type. That means each vehicletype must have just 1 date variable. 

 

Can somebody help me with this? Appreciate it.

16 REPLIES 16
Kurt_Bremser
Super User

In your expected result, you have three observations for 2646-AAA-December2019. That does not make sense.

 


@AK100 wrote:

Hello lads,

 

I'am quite new to SAS EG so the question I will ask might be simple but I can't get it done. My current dataset looks like this (example, because in reality its much bigger):

Location - VehicleType - TotalVehicles - Date

2646 -  AAA -  06 - 10December2019.

2646 - GGG - 13 - 10December2019.

2646 -  AAA -  02 - 10December2019.

2646 -  FFF - 11 - 10December2019.

2646 - GGG -  02 - 10December2019.

2646 - CCC- 11 - 10December2019.

 

I would like to have the total vehicles that drove on that location per month. So as an example, I want the following:

Location - VehicleType - TotalVehicles - Date

2646 -  AAA-  20 - December2019.

2646 - AAA - 78 - December2019

2646 -  GGG-  230- December2019.

2646 - BBB- 780 - December2019

2646 -  AAA-  20 - December2019.

2646 - BBB - 78 - December2019

 

I actually want the total per month per vehicle type. That means each vehicletype must have just 1 date variable. 

 

Can somebody help me with this? Appreciate it.


 

AK100
Pyrite | Level 9
That is indeed true, I've changed it. Thx.
Kurt_Bremser
Super User

The simplest way is just a quick SQL query:

proc sql;
create table want as
select
  location,
  vehicletype,
  intnx('month',date,0,'b') as date format=yymmd7.,
  sum(totalvehicles) as totalvehicles
from have
group by location, vehicletype, calculated date;
quit;
AK100
Pyrite | Level 9
This worked fine for me Thank you!!
Patrick
Opal | Level 21

Assuming your desired result should actually look like...

Capture3.JPG

 

...and you want to use the EG query builder and not code:

Capture.JPG

Capture.JPG

AK100
Pyrite | Level 9
This looks okay, but I dont get your last part of the computed month value. Could you show that? Why do you add ,0.'b' after t.1.date?
Kurt_Bremser
Super User

The INTNX function is used to increment date/time/datetime values. 0 means to stay in the same month (no increment), and 'b' means to align the result to the beginning. The way we used the function means that we always get the first day of the month as a result, allowing us to use the "date" variable as a month designator for grouping.

AK100
Pyrite | Level 9

@Patrick @Kurt_Bremser 

 

Okay guy's that makes sense, its seems clear. However, I forgot to tell 1 thing. My Totalvehicle column is actually a COUNT of the VehicleType column.


My first intention was to SUM the VehicleType, but thats not possible ofcourse as those are not numbers but just letters. 

 

Does the COUNT VehicleType  I'am using give me the same result as the SUM vehicletype you are using @Patrick ?

 

The final step is to calculate percentages so I can say for example: Okay vehicletype X drove past that location 560 times in the month X, that is a percentage of 21% of all the types that drove there that month.

AK100
Pyrite | Level 9
It represents one measurement, so that means one vehicle type and vehicle can drive past a specific location more that once.
Kurt_Bremser
Super User

Then this will do it:

data have;
input Location $ VehicleType $ Date :date9.;
format date yymmddd10.;
datalines;
2646 AAA 10Dec2019
2646 GGG 10Dec2019
2646 AAA 10Dec2019
2646 FFF 10Dec2019
2646 GGG 10Dec2019
2646 CCC 10Dec2019
;

proc sql;
create table want as
select
  location,
  vehicletype,
  intnx('month',date,0,'b') as date format=yymmd7.,
  count(*) as totalvehicles
from have
group by location, vehicletype, calculated date;
quit;

Providing the data in a data step with datalines enables everybody to immediately rerun the code for proof or testing.

Patrick
Opal | Level 21

Yeah! That's why it's important to take your time and provide representative sample data posted as a SAS Data step and then clearly show how the desired result should look like using the sample data. Such an approach removes a lot of ambiguity. You will have to spend a bit more time asking the initial question but then you'll get better answers faster. 

AK100
Pyrite | Level 9

Okay Patrick, thank you. 

Patrick
Opal | Level 21

That's the syntax of the SAS intnx() function documented here.

This is a very powerful SAS function which lets you shift SAS Date values. What I'm doing here: Take a date and align it to the first day of the month the date is in. This way I then can summarize by month as now all dates are aligned to the first of the month.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 16 replies
  • 1052 views
  • 2 likes
  • 4 in conversation