BookmarkSubscribeRSS Feed
EMC9
Obsidian | Level 7

Hi Guys 

 

I have a situation that I am not sure how to go about sloving for.

 

I have a data set that contains "Supplier_name" this is transposed and has "supplier_name1 - suppllier_name82"

supplier_name contains character infomation such as names for eg "LOANS" "CREDIT" , & the second varaible is balance and has "balance1 - balance82. I need to group FOR eg. all loans and the sum of all loans.

 

you help will be greatly appreciated.

13 REPLIES 13
ChrisBrooks
Ammonite | Level 13

Hi Allistair

 

If you could post an example in the form of Have and Want files it'll be easier for us to give you an answer.

 

Chris

EMC9
Obsidian | Level 7

sample data :

ID Supplier_name1  OPEN_BALANCE1 Supplier_name2 OPEN_BALANCE2 Supplier_name3  OPEN_BALANCE3 Supplier_name4  OPEN_BALANCE5 Supplier_name5  OPEN_BALANCE6 Supplier_name6  OPEN_BALANCE7
1 PERSONAL LOANS 12254 LOAN 12258 PERSONAL LOANS 12260 PERSONAL LOANS 12258 LOAN 12256 LOAN 12256
2 CREDIT CARD 12255 VEHICHLE 12259 CREDIT CARD 12257 CREDIT CARD 12259 PERSONAL LOANS 12257 PERSONAL LOANS 12257
3 HOME LOAN 12256 HOME LOAN 12260         CREDIT CARD 12258 CREDIT CARD 12258
4 PERSONAL LOANS 12257 LOAN 12257 PERSONAL LOANS 12261 HOME LOAN 12261        
5 CREDIT CARD 12258 PERSONAL LOANS 12258 CREDIT CARD 12262 PERSONAL LOANS 12262 HOME LOAN 12260    
6 HOME LOAN 12259 PERSONAL LOANS 12259     CREDIT CARD 12263 PERSONAL LOANS 12261    
7 LOAN 12260 CREDIT CARD 12260             CREDIT CARD 12259
8 PERSONAL LOANS 12261                 HOME LOAN 12258
9 CREDIT CARD 12262                    
10 HOME LOAN 12263                    

 

desired output :

 

ID   TOTAL PERSONAL LOANS       SUM

1                        4                          R67000

Astounding
PROC Star

It's probably easier to use a DATA step to "untranspose" your data:

 

data want;

set have;

array supplier_name {82};

array open_balance {82};

do _n_=1 to 82;

   if open_balance{_n_} > 0 then do;

       supplier = supplier_name{_n_};

       balance = open_balance{_n_};

       output;

   end;

end;

keep id supplier balance;

run;

 

Then there are loads of ways to summarize.  For example:

 

proc summary data=want nway;

   class id supplier;

   var balance;

   output out=final_summary (keep=id supplier total_balance _freq_) sum=total_balance;

run;

 

Or perhaps as a report:

 

proc tabulate data=want;

class id supplier;

var balance;

tables id * supplier, balance * (sum n);

run;

EMC9
Obsidian | Level 7

Thank you so much for the reply - will run that now and advise.

 

how can I create new tables using the contains function or similar ?

 

supplier_name contains the info for e.g  "SAS personal laon" & supplier_name2 "data personal laons"

 

how do I group by the institute of the loan ?

 

sample view :

 

id  institute1 - 82   total_loan_value:

1        SAS                       R5000

2        DATA                     R1000

 

 

 

 

PeterClemmensen
Tourmaline | Level 20

Post example data in the form of a datastep and what you want your desired dataset to look like.

EMC9
Obsidian | Level 7
sample data :
ID Supplier_name1 OPEN_BALANCE1 Supplier_name2 OPEN_BALANCE2 Supplier_name3 OPEN_BALANCE3 Supplier_name4 OPEN_BALANCE5 Supplier_name5 OPEN_BALANCE6 Supplier_name6 OPEN_BALANCE7
1 PERSONAL LOANS 12254 LOAN 12258 PERSONAL LOANS 12260 PERSONAL LOANS 12258 LOAN 12256 LOAN 12256
2 CREDIT CARD 12255 VEHICHLE 12259 CREDIT CARD 12257 CREDIT CARD 12259 PERSONAL LOANS 12257 PERSONAL LOANS 12257
3 HOME LOAN 12256 HOME LOAN 12260 CREDIT CARD 12258 CREDIT CARD 12258
4 PERSONAL LOANS 12257 LOAN 12257 PERSONAL LOANS 12261 HOME LOAN 12261
5 CREDIT CARD 12258 PERSONAL LOANS 12258 CREDIT CARD 12262 PERSONAL LOANS 12262 HOME LOAN 12260
6 HOME LOAN 12259 PERSONAL LOANS 12259 CREDIT CARD 12263 PERSONAL LOANS 12261
7 LOAN 12260 CREDIT CARD 12260 CREDIT CARD 12259
8 PERSONAL LOANS 12261 HOME LOAN 12258
9 CREDIT CARD 12262
10 HOME LOAN 12263

desired output :

ID TOTAL PERSONAL LOANS SUM
1 4 R67000
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Neither of your posts contain a datastep with test data, so with nothing to run against, I would suggest do a transpose:

proc transpose data=have out=inter;
  by id;
  var _all_;
run;

Then do you calculations:

proc sql;
  create table WANT as
  select  ID,
             count(*)/2 as TOTAL,
             sum(...) as LOANS
  from    INTER
  group by ID;
quit;

You need to put whatever the output from the transpose step variable(s) are.  As I have not test data to run with I can only guess.

EMC9
Obsidian | Level 7

Hi , Will send through the data step. thanks for the reply.

EMC9
Obsidian | Level 7

sample data :

ID Supplier_name1  OPEN_BALANCE1 Supplier_name2 OPEN_BALANCE2 Supplier_name3  OPEN_BALANCE3 Supplier_name4  OPEN_BALANCE5 Supplier_name5  OPEN_BALANCE6 Supplier_name6  OPEN_BALANCE7
1 PERSONAL LOANS 12254 LOAN 12258 PERSONAL LOANS 12260 PERSONAL LOANS 12258 LOAN 12256 LOAN 12256
2 CREDIT CARD 12255 VEHICHLE 12259 CREDIT CARD 12257 CREDIT CARD 12259 PERSONAL LOANS 12257 PERSONAL LOANS 12257
3 HOME LOAN 12256 HOME LOAN 12260         CREDIT CARD 12258 CREDIT CARD 12258
4 PERSONAL LOANS 12257 LOAN 12257 PERSONAL LOANS 12261 HOME LOAN 12261        
5 CREDIT CARD 12258 PERSONAL LOANS 12258 CREDIT CARD 12262 PERSONAL LOANS 12262 HOME LOAN 12260    
6 HOME LOAN 12259 PERSONAL LOANS 12259     CREDIT CARD 12263 PERSONAL LOANS 12261    
7 LOAN 12260 CREDIT CARD 12260             CREDIT CARD 12259
8 PERSONAL LOANS 12261                 HOME LOAN 12258
9 CREDIT CARD 12262                    
10 HOME LOAN 12263                    

 

desired output :

 

ID   TOTAL PERSONAL LOANS       SUM

1                        4                          R67000

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Please post test data in the form of a datastep:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

And what the output should look like.  I would suspect, if you want to be grouping, that sticking with the data in the normalised form (i.e. data going down the page rather than across) would be simplest.

EMC9
Obsidian | Level 7

 

 

supplier_name contains the info for e.g  "SAS personal laon" & supplier_name2 "data personal laons"

 

how do I group by the institute of the loan ?

 

sample view :

 

id  institute1 - 82   total_loan_value:

1        SAS                       R5000

2        DATA                     R1000

 

Astounding
PROC Star

There are various text processing functions in SAS.  You will have to pick the tools that do what you need.  For example:

 

source = scan(institute1, 1, ' ');

 

This will pick out the first word of INSTITUTE1.

 

Or perhaps:

 

if index(institute1, 'SAS') > 0 then source='SAS';

 

This determines whether SAS appears anywhere within INSTITUTE1.

 

The original posted solution used arrays to move through all the possible sources.  It's probably easier to "untranspose" the data first, so you only have to apply the searching logic to a single variable.  But if you want to pick out specific pieces of information that don't follow a pattern (such as being the first word), you may need to know something about what is in the data and program something very customized for what you are searching for.

EMC9
Obsidian | Level 7

Thank you !

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1126 views
  • 0 likes
  • 5 in conversation