BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jolly
Calcite | Level 5


I have a data set that is set up something like:

Billing Month     Product                    Gross Sales (Quantity)

01/12               FordF150                         12

01/12               FordF250                            16

01/12               FordFocus                         8

01/12               Dodge1500                         11

01/12               Dodge2500                             18

01/12               DodgeAvenger                    13

02/12               FordF150                              15

02/12               FordF250                              7

02/12               FordFocus                              9

02/12               Dodge1500                              11

02/12               Dodge2500                              5

02/12               DodgeAvenger                         20

Except my data set is a few hundred times bigger with more product names.

What I want to do is to combine rows within the same billing month by a common keyword.

Example:  I want the data to look like:

Billing Month          Product          Gross Sales

01/12                    Ford                   36

01/12                    Dodge               42

02/12                    Ford                    31

02/12                    Dodge                   36

How can this be done in SAS?  I have access to SAS 9.2 and 9.3

1 ACCEPTED SOLUTION

Accepted Solutions
stat_sas
Ammonite | Level 13

data want;
set have;
if find(product,'Ford')>0 then prod=substr(product,1,4);
else if find(product,'Dodge')>0 then prod=substr(product,1,5);
run;

proc sql;
select month,prod,sum(sales) as Gross_sales
from want
group by month,prod
order by month,prod desc;
quit;

View solution in original post

6 REPLIES 6
ballardw
Super User

How many product lines are you dealing with? With your examples are you dealing with any and all car makes or ONLY Dodge and Ford?

Also is this a one time thing or will this be recurring? If recurring I would look at the raw data and see if there was a way to get the maker and model separate. OR build a custom format that would likely need to be updated occasionaly.

If only two makers a simple brute force or a few:

data want;

     set have;

     length Maker $ 8; /* make this at least as long as the longest maker name*/

     if  index(upcase(product),'FORD') >0 then Maker = 'Ford';

     else if index(upcase(product,'DODGE'>0 then Maker='Dodge';

run;

then many report procedures can summarize.

proc means data=want sum;

     class billingmonth maker;

     var sales;

run;

I have added the variable Maker so you can verify the results. Caveat: If any model ends up with part or another makers name this won't work. Unlikely but .... Which is why I recommend going back to the raw data and how it is brough into SAS and have the make and model as diffferent variables. Also, you may get a request for pickups, sedans, coupes, SUV and such. That will much easier with the model separate.

I do not use variable names with spaces in them, if yours have them adjust code as needed. And I could have done this as far back as 5.18 and I'm sure much earlier than that.

djbateman
Lapis Lazuli | Level 10

It will take a lot of subset functions, but I would try something like this:

firstupper=anyupper(substr(product,2));

firstdigit=anydigit(substr(product,2));

if firstupper=0 then firstupper=.;

if firstdigit=0 then firstdigit=.;

split=min(firstupper,firstdigit);

car_make=substr(product,1,split);

FIRSTUPPER finds the first uppercase letter (not counting the first letter of the product name).

FIRSTDIGIT finds the first digit in the product name.

If an uppercase letter or a digit are not found, a 0 is returned.  I force all 0's to missing so they are not captured when I find the smallest value.

SPLIT finds the smallest value of FIRSTUPPER and FIRSTDIGIT to find out where to split the product name.

CAR_MAKE subsets the product name from the beginning of the string to the start of the first digit or next uppercase letter.

stat_sas
Ammonite | Level 13

data want;
set have;
if find(product,'Ford')>0 then prod=substr(product,1,4);
else if find(product,'Dodge')>0 then prod=substr(product,1,5);
run;

proc sql;
select month,prod,sum(sales) as Gross_sales
from want
group by month,prod
order by month,prod desc;
quit;

PGStats
Opal | Level 21

Use a small auxiliary table (Brands) and the truncated string comparison operator EQT :

data Sales;

length Product $16;

input Month :anydtdte. Product Sales;

format Month mmyys5.;

datalines;

01/12               FordF150                         12

01/12               FordF250                            16

01/12               FordFocus                         8

01/12               Dodge1500                         11

01/12               Dodge2500                             18

01/12               DodgeAvenger                    13

02/12               FordF150                              15

02/12               FordF250                              7

02/12               FordFocus                              9

02/12               Dodge1500                              11

02/12               Dodge2500                              5

02/12               DodgeAvenger                         20

;

data Brands;

length Brand $16;

input Brand;

datalines;

Ford

Dodge

;

proc sql;

select Month, Brand, sum(sales) as BrandSales

from Sales inner join Brands on Product EQT Brand

group by Month, Brand;

quit;

PG

PG
Jagadishkatam
Amethyst | Level 16

by proc sql in a single step

proc sql;

create table want as select distinct BillingMonth, sum(grosssales) as sum, case when index(lowcase(product),'ford')>0 then 'Ford'

when index(lowcase(product),'dodge')>0 then 'Dodge'

else '' end as prod from have group by BillingMonth, prod;

quit;

Thanks,

Jag

Thanks,
Jag
Ksharp
Super User

A simple way is using format .

data Sales;
length Product $16;
input Month $ Product Sales;
datalines;
01/12               FordF150                         12
01/12               FordF250                            16
01/12               FordFocus                         8
01/12               Dodge1500                         11
01/12               Dodge2500                             18
01/12               DodgeAvenger                    13
02/12               FordF150                              15
02/12               FordF250                              7
02/12               FordFocus                              9
02/12               Dodge1500                              11
02/12               Dodge2500                              5
02/12               DodgeAvenger                         20
;
run;
proc summary data=Sales nway ;
class month product;
format product $4.;
var Sales;
output out=want sum=;
run;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 584 views
  • 6 likes
  • 7 in conversation