How to combine rows by common string?

Accepted Solution Solved
Reply
Contributor
Posts: 45
Accepted Solution

How to combine rows by common string?


I have a data set that is set up something like:

Billing Month     Product                    Gross Sales (Quantity)

01/12               FordF150                         12

01/12               FordF250                            16

01/12               FordFocus                         8

01/12               Dodge1500                         11

01/12               Dodge2500                             18

01/12               DodgeAvenger                    13

02/12               FordF150                              15

02/12               FordF250                              7

02/12               FordFocus                              9

02/12               Dodge1500                              11

02/12               Dodge2500                              5

02/12               DodgeAvenger                         20

Except my data set is a few hundred times bigger with more product names.

What I want to do is to combine rows within the same billing month by a common keyword.

Example:  I want the data to look like:

Billing Month          Product          Gross Sales

01/12                    Ford                   36

01/12                    Dodge               42

02/12                    Ford                    31

02/12                    Dodge                   36

How can this be done in SAS?  I have access to SAS 9.2 and 9.3


Accepted Solutions
Solution
‎07-25-2014 11:27 AM
Trusted Advisor
Posts: 1,204

Re: How to combine rows by common string?

data want;
set have;
if find(product,'Ford')>0 then prod=substr(product,1,4);
else if find(product,'Dodge')>0 then prod=substr(product,1,5);
run;

proc sql;
select month,prod,sum(sales) as Gross_sales
from want
group by month,prod
order by month,prod desc;
quit;

View solution in original post


All Replies
Super User
Posts: 10,500

Re: How to combine rows by common string?

How many product lines are you dealing with? With your examples are you dealing with any and all car makes or ONLY Dodge and Ford?

Also is this a one time thing or will this be recurring? If recurring I would look at the raw data and see if there was a way to get the maker and model separate. OR build a custom format that would likely need to be updated occasionaly.

If only two makers a simple brute force or a few:

data want;

     set have;

     length Maker $ 8; /* make this at least as long as the longest maker name*/

     if  index(upcase(product),'FORD') >0 then Maker = 'Ford';

     else if index(upcase(product,'DODGE'>0 then Maker='Dodge';

run;

then many report procedures can summarize.

proc means data=want sum;

     class billingmonth maker;

     var sales;

run;

I have added the variable Maker so you can verify the results. Caveat: If any model ends up with part or another makers name this won't work. Unlikely but .... Which is why I recommend going back to the raw data and how it is brough into SAS and have the make and model as diffferent variables. Also, you may get a request for pickups, sedans, coupes, SUV and such. That will much easier with the model separate.

I do not use variable names with spaces in them, if yours have them adjust code as needed. And I could have done this as far back as 5.18 and I'm sure much earlier than that.

Regular Contributor
Posts: 220

Re: How to combine rows by common string?

It will take a lot of subset functions, but I would try something like this:

firstupper=anyupper(substr(product,2));

firstdigit=anydigit(substr(product,2));

if firstupper=0 then firstupper=.;

if firstdigit=0 then firstdigit=.;

split=min(firstupper,firstdigit);

car_make=substr(product,1,split);

FIRSTUPPER finds the first uppercase letter (not counting the first letter of the product name).

FIRSTDIGIT finds the first digit in the product name.

If an uppercase letter or a digit are not found, a 0 is returned.  I force all 0's to missing so they are not captured when I find the smallest value.

SPLIT finds the smallest value of FIRSTUPPER and FIRSTDIGIT to find out where to split the product name.

CAR_MAKE subsets the product name from the beginning of the string to the start of the first digit or next uppercase letter.

Solution
‎07-25-2014 11:27 AM
Trusted Advisor
Posts: 1,204

Re: How to combine rows by common string?

data want;
set have;
if find(product,'Ford')>0 then prod=substr(product,1,4);
else if find(product,'Dodge')>0 then prod=substr(product,1,5);
run;

proc sql;
select month,prod,sum(sales) as Gross_sales
from want
group by month,prod
order by month,prod desc;
quit;

Respected Advisor
Posts: 4,646

Re: How to combine rows by common string?

Use a small auxiliary table (Brands) and the truncated string comparison operator EQT :

data Sales;

length Product $16;

input Month :anydtdte. Product Sales;

format Month mmyys5.;

datalines;

01/12               FordF150                         12

01/12               FordF250                            16

01/12               FordFocus                         8

01/12               Dodge1500                         11

01/12               Dodge2500                             18

01/12               DodgeAvenger                    13

02/12               FordF150                              15

02/12               FordF250                              7

02/12               FordFocus                              9

02/12               Dodge1500                              11

02/12               Dodge2500                              5

02/12               DodgeAvenger                         20

;

data Brands;

length Brand $16;

input Brand;

datalines;

Ford

Dodge

;

proc sql;

select Month, Brand, sum(sales) as BrandSales

from Sales inner join Brands on Product EQT Brand

group by Month, Brand;

quit;

PG

PG
Trusted Advisor
Posts: 1,128

Re: How to combine rows by common string?

by proc sql in a single step

proc sql;

create table want as select distinct BillingMonth, sum(grosssales) as sum, case when index(lowcase(product),'ford')>0 then 'Ford'

when index(lowcase(product),'dodge')>0 then 'Dodge'

else '' end as prod from have group by BillingMonth, prod;

quit;

Thanks,

Jag

Thanks,
Jag
Super User
Posts: 9,681

Re: How to combine rows by common string?

A simple way is using format .

data Sales;
length Product $16;
input Month $ Product Sales;
datalines;
01/12               FordF150                         12
01/12               FordF250                            16
01/12               FordFocus                         8
01/12               Dodge1500                         11
01/12               Dodge2500                             18
01/12               DodgeAvenger                    13
02/12               FordF150                              15
02/12               FordF250                              7
02/12               FordFocus                              9
02/12               Dodge1500                              11
02/12               Dodge2500                              5
02/12               DodgeAvenger                         20
;
run;
proc summary data=Sales nway ;
class month product;
format product $4.;
var Sales;
output out=want sum=;
run;

Xia Keshan

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 230 views
  • 6 likes
  • 7 in conversation