Contains & sum

Reply
Contributor
Posts: 21

Contains & sum

Hi Guys 

 

I have a situation that I am not sure how to go about sloving for.

 

I have a data set that contains "Supplier_name" this is transposed and has "supplier_name1 - suppllier_name82"

supplier_name contains character infomation such as names for eg "LOANS" "CREDIT" , & the second varaible is balance and has "balance1 - balance82. I need to group FOR eg. all loans and the sum of all loans.

 

you help will be greatly appreciated.

Regular Contributor
Posts: 181

Re: Contains & sum

Hi Allistair

 

If you could post an example in the form of Have and Want files it'll be easier for us to give you an answer.

 

Chris

Contributor
Posts: 21

Re: Contains & sum

sample data :

ID Supplier_name1  OPEN_BALANCE1 Supplier_name2 OPEN_BALANCE2 Supplier_name3  OPEN_BALANCE3 Supplier_name4  OPEN_BALANCE5 Supplier_name5  OPEN_BALANCE6 Supplier_name6  OPEN_BALANCE7
1 PERSONAL LOANS 12254 LOAN 12258 PERSONAL LOANS 12260 PERSONAL LOANS 12258 LOAN 12256 LOAN 12256
2 CREDIT CARD 12255 VEHICHLE 12259 CREDIT CARD 12257 CREDIT CARD 12259 PERSONAL LOANS 12257 PERSONAL LOANS 12257
3 HOME LOAN 12256 HOME LOAN 12260         CREDIT CARD 12258 CREDIT CARD 12258
4 PERSONAL LOANS 12257 LOAN 12257 PERSONAL LOANS 12261 HOME LOAN 12261        
5 CREDIT CARD 12258 PERSONAL LOANS 12258 CREDIT CARD 12262 PERSONAL LOANS 12262 HOME LOAN 12260    
6 HOME LOAN 12259 PERSONAL LOANS 12259     CREDIT CARD 12263 PERSONAL LOANS 12261    
7 LOAN 12260 CREDIT CARD 12260             CREDIT CARD 12259
8 PERSONAL LOANS 12261                 HOME LOAN 12258
9 CREDIT CARD 12262                    
10 HOME LOAN 12263                    

 

desired output :

 

ID   TOTAL PERSONAL LOANS       SUM

1                        4                          R67000

Super User
Posts: 5,081

Re: Contains & sum

It's probably easier to use a DATA step to "untranspose" your data:

 

data want;

set have;

array supplier_name {82};

array open_balance {82};

do _n_=1 to 82;

   if open_balance{_n_} > 0 then do;

       supplier = supplier_name{_n_};

       balance = open_balance{_n_};

       output;

   end;

end;

keep id supplier balance;

run;

 

Then there are loads of ways to summarize.  For example:

 

proc summary data=want nway;

   class id supplier;

   var balance;

   output out=final_summary (keep=id supplier total_balance _freq_) sum=total_balance;

run;

 

Or perhaps as a report:

 

proc tabulate data=want;

class id supplier;

var balance;

tables id * supplier, balance * (sum n);

run;

Contributor
Posts: 21

Re: Contains & sum

Thank you so much for the reply - will run that now and advise.

 

how can I create new tables using the contains function or similar ?

 

supplier_name contains the info for e.g  "SAS personal laon" & supplier_name2 "data personal laons"

 

how do I group by the institute of the loan ?

 

sample view :

 

id  institute1 - 82   total_loan_value:

1        SAS                       R5000

2        DATA                     R1000

 

 

 

 

PROC Star
Posts: 551

Re: Contains & sum

Post example data in the form of a datastep and what you want your desired dataset to look like.

Contributor
Posts: 21

Re: Contains & sum

sample data :
ID Supplier_name1 OPEN_BALANCE1 Supplier_name2 OPEN_BALANCE2 Supplier_name3 OPEN_BALANCE3 Supplier_name4 OPEN_BALANCE5 Supplier_name5 OPEN_BALANCE6 Supplier_name6 OPEN_BALANCE7
1 PERSONAL LOANS 12254 LOAN 12258 PERSONAL LOANS 12260 PERSONAL LOANS 12258 LOAN 12256 LOAN 12256
2 CREDIT CARD 12255 VEHICHLE 12259 CREDIT CARD 12257 CREDIT CARD 12259 PERSONAL LOANS 12257 PERSONAL LOANS 12257
3 HOME LOAN 12256 HOME LOAN 12260 CREDIT CARD 12258 CREDIT CARD 12258
4 PERSONAL LOANS 12257 LOAN 12257 PERSONAL LOANS 12261 HOME LOAN 12261
5 CREDIT CARD 12258 PERSONAL LOANS 12258 CREDIT CARD 12262 PERSONAL LOANS 12262 HOME LOAN 12260
6 HOME LOAN 12259 PERSONAL LOANS 12259 CREDIT CARD 12263 PERSONAL LOANS 12261
7 LOAN 12260 CREDIT CARD 12260 CREDIT CARD 12259
8 PERSONAL LOANS 12261 HOME LOAN 12258
9 CREDIT CARD 12262
10 HOME LOAN 12263

desired output :

ID TOTAL PERSONAL LOANS SUM
1 4 R67000
Super User
Super User
Posts: 7,401

Re: Contains & sum

Neither of your posts contain a datastep with test data, so with nothing to run against, I would suggest do a transpose:

proc transpose data=have out=inter;
  by id;
  var _all_;
run;

Then do you calculations:

proc sql;
  create table WANT as
  select  ID,
             count(*)/2 as TOTAL,
             sum(...) as LOANS
  from    INTER
  group by ID;
quit;

You need to put whatever the output from the transpose step variable(s) are.  As I have not test data to run with I can only guess.

Contributor
Posts: 21

Re: Contains & sum

Hi , Will send through the data step. thanks for the reply.

Contributor
Posts: 21

Re: Contains & sum

sample data :

ID Supplier_name1  OPEN_BALANCE1 Supplier_name2 OPEN_BALANCE2 Supplier_name3  OPEN_BALANCE3 Supplier_name4  OPEN_BALANCE5 Supplier_name5  OPEN_BALANCE6 Supplier_name6  OPEN_BALANCE7
1 PERSONAL LOANS 12254 LOAN 12258 PERSONAL LOANS 12260 PERSONAL LOANS 12258 LOAN 12256 LOAN 12256
2 CREDIT CARD 12255 VEHICHLE 12259 CREDIT CARD 12257 CREDIT CARD 12259 PERSONAL LOANS 12257 PERSONAL LOANS 12257
3 HOME LOAN 12256 HOME LOAN 12260         CREDIT CARD 12258 CREDIT CARD 12258
4 PERSONAL LOANS 12257 LOAN 12257 PERSONAL LOANS 12261 HOME LOAN 12261        
5 CREDIT CARD 12258 PERSONAL LOANS 12258 CREDIT CARD 12262 PERSONAL LOANS 12262 HOME LOAN 12260    
6 HOME LOAN 12259 PERSONAL LOANS 12259     CREDIT CARD 12263 PERSONAL LOANS 12261    
7 LOAN 12260 CREDIT CARD 12260             CREDIT CARD 12259
8 PERSONAL LOANS 12261                 HOME LOAN 12258
9 CREDIT CARD 12262                    
10 HOME LOAN 12263                    

 

desired output :

 

ID   TOTAL PERSONAL LOANS       SUM

1                        4                          R67000

Super User
Super User
Posts: 7,401

Re: Contains & sum

Please post test data in the form of a datastep:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

And what the output should look like.  I would suspect, if you want to be grouping, that sticking with the data in the normalised form (i.e. data going down the page rather than across) would be simplest.

Contributor
Posts: 21

Re: Contains & sum

 

 

supplier_name contains the info for e.g  "SAS personal laon" & supplier_name2 "data personal laons"

 

how do I group by the institute of the loan ?

 

sample view :

 

id  institute1 - 82   total_loan_value:

1        SAS                       R5000

2        DATA                     R1000

 

Super User
Posts: 5,081

Re: Contains & sum

There are various text processing functions in SAS.  You will have to pick the tools that do what you need.  For example:

 

source = scan(institute1, 1, ' ');

 

This will pick out the first word of INSTITUTE1.

 

Or perhaps:

 

if index(institute1, 'SAS') > 0 then source='SAS';

 

This determines whether SAS appears anywhere within INSTITUTE1.

 

The original posted solution used arrays to move through all the possible sources.  It's probably easier to "untranspose" the data first, so you only have to apply the searching logic to a single variable.  But if you want to pick out specific pieces of information that don't follow a pattern (such as being the first word), you may need to know something about what is in the data and program something very customized for what you are searching for.

Contributor
Posts: 21

Re: Contains & sum

Thank you !

Ask a Question
Discussion stats
  • 13 replies
  • 133 views
  • 0 likes
  • 5 in conversation