Hi Guys
I have a situation that I am not sure how to go about sloving for.
I have a data set that contains "Supplier_name" this is transposed and has "supplier_name1 - suppllier_name82"
supplier_name contains character infomation such as names for eg "LOANS" "CREDIT" , & the second varaible is balance and has "balance1 - balance82. I need to group FOR eg. all loans and the sum of all loans.
you help will be greatly appreciated.
Hi Allistair
If you could post an example in the form of Have and Want files it'll be easier for us to give you an answer.
Chris
sample data :
ID | Supplier_name1 | OPEN_BALANCE1 | Supplier_name2 | OPEN_BALANCE2 | Supplier_name3 | OPEN_BALANCE3 | Supplier_name4 | OPEN_BALANCE5 | Supplier_name5 | OPEN_BALANCE6 | Supplier_name6 | OPEN_BALANCE7 |
1 | PERSONAL LOANS | 12254 | LOAN | 12258 | PERSONAL LOANS | 12260 | PERSONAL LOANS | 12258 | LOAN | 12256 | LOAN | 12256 |
2 | CREDIT CARD | 12255 | VEHICHLE | 12259 | CREDIT CARD | 12257 | CREDIT CARD | 12259 | PERSONAL LOANS | 12257 | PERSONAL LOANS | 12257 |
3 | HOME LOAN | 12256 | HOME LOAN | 12260 | CREDIT CARD | 12258 | CREDIT CARD | 12258 | ||||
4 | PERSONAL LOANS | 12257 | LOAN | 12257 | PERSONAL LOANS | 12261 | HOME LOAN | 12261 | ||||
5 | CREDIT CARD | 12258 | PERSONAL LOANS | 12258 | CREDIT CARD | 12262 | PERSONAL LOANS | 12262 | HOME LOAN | 12260 | ||
6 | HOME LOAN | 12259 | PERSONAL LOANS | 12259 | CREDIT CARD | 12263 | PERSONAL LOANS | 12261 | ||||
7 | LOAN | 12260 | CREDIT CARD | 12260 | CREDIT CARD | 12259 | ||||||
8 | PERSONAL LOANS | 12261 | HOME LOAN | 12258 | ||||||||
9 | CREDIT CARD | 12262 | ||||||||||
10 | HOME LOAN | 12263 |
desired output :
ID TOTAL PERSONAL LOANS SUM
1 4 R67000
It's probably easier to use a DATA step to "untranspose" your data:
data want;
set have;
array supplier_name {82};
array open_balance {82};
do _n_=1 to 82;
if open_balance{_n_} > 0 then do;
supplier = supplier_name{_n_};
balance = open_balance{_n_};
output;
end;
end;
keep id supplier balance;
run;
Then there are loads of ways to summarize. For example:
proc summary data=want nway;
class id supplier;
var balance;
output out=final_summary (keep=id supplier total_balance _freq_) sum=total_balance;
run;
Or perhaps as a report:
proc tabulate data=want;
class id supplier;
var balance;
tables id * supplier, balance * (sum n);
run;
Thank you so much for the reply - will run that now and advise.
how can I create new tables using the contains function or similar ?
supplier_name contains the info for e.g "SAS personal laon" & supplier_name2 "data personal laons"
how do I group by the institute of the loan ?
sample view :
id institute1 - 82 total_loan_value:
1 SAS R5000
2 DATA R1000
Post example data in the form of a datastep and what you want your desired dataset to look like.
Neither of your posts contain a datastep with test data, so with nothing to run against, I would suggest do a transpose:
proc transpose data=have out=inter; by id; var _all_; run;
Then do you calculations:
proc sql; create table WANT as select ID, count(*)/2 as TOTAL, sum(...) as LOANS from INTER group by ID; quit;
You need to put whatever the output from the transpose step variable(s) are. As I have not test data to run with I can only guess.
Hi , Will send through the data step. thanks for the reply.
sample data :
ID | Supplier_name1 | OPEN_BALANCE1 | Supplier_name2 | OPEN_BALANCE2 | Supplier_name3 | OPEN_BALANCE3 | Supplier_name4 | OPEN_BALANCE5 | Supplier_name5 | OPEN_BALANCE6 | Supplier_name6 | OPEN_BALANCE7 |
1 | PERSONAL LOANS | 12254 | LOAN | 12258 | PERSONAL LOANS | 12260 | PERSONAL LOANS | 12258 | LOAN | 12256 | LOAN | 12256 |
2 | CREDIT CARD | 12255 | VEHICHLE | 12259 | CREDIT CARD | 12257 | CREDIT CARD | 12259 | PERSONAL LOANS | 12257 | PERSONAL LOANS | 12257 |
3 | HOME LOAN | 12256 | HOME LOAN | 12260 | CREDIT CARD | 12258 | CREDIT CARD | 12258 | ||||
4 | PERSONAL LOANS | 12257 | LOAN | 12257 | PERSONAL LOANS | 12261 | HOME LOAN | 12261 | ||||
5 | CREDIT CARD | 12258 | PERSONAL LOANS | 12258 | CREDIT CARD | 12262 | PERSONAL LOANS | 12262 | HOME LOAN | 12260 | ||
6 | HOME LOAN | 12259 | PERSONAL LOANS | 12259 | CREDIT CARD | 12263 | PERSONAL LOANS | 12261 | ||||
7 | LOAN | 12260 | CREDIT CARD | 12260 | CREDIT CARD | 12259 | ||||||
8 | PERSONAL LOANS | 12261 | HOME LOAN | 12258 | ||||||||
9 | CREDIT CARD | 12262 | ||||||||||
10 | HOME LOAN | 12263 |
desired output :
ID TOTAL PERSONAL LOANS SUM
1 4 R67000
Please post test data in the form of a datastep:
And what the output should look like. I would suspect, if you want to be grouping, that sticking with the data in the normalised form (i.e. data going down the page rather than across) would be simplest.
supplier_name contains the info for e.g "SAS personal laon" & supplier_name2 "data personal laons"
how do I group by the institute of the loan ?
sample view :
id institute1 - 82 total_loan_value:
1 SAS R5000
2 DATA R1000
There are various text processing functions in SAS. You will have to pick the tools that do what you need. For example:
source = scan(institute1, 1, ' ');
This will pick out the first word of INSTITUTE1.
Or perhaps:
if index(institute1, 'SAS') > 0 then source='SAS';
This determines whether SAS appears anywhere within INSTITUTE1.
The original posted solution used arrays to move through all the possible sources. It's probably easier to "untranspose" the data first, so you only have to apply the searching logic to a single variable. But if you want to pick out specific pieces of information that don't follow a pattern (such as being the first word), you may need to know something about what is in the data and program something very customized for what you are searching for.
Thank you !
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.