BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bbb_NG
Fluorite | Level 6

Hi,

I have a data

IDNameTypeLength
1branch_codeCharacter7
2NationCMPCharacter3
3VersionCharacter6
4ZoneCMPCharacter9
5cmpCharacter31
6NoCountNumeric8

which is designed for excel pivot table, for field 1-4 are already-grouped-by field,

field 5 cmp has a lot of values to be distinguished in Pivot table as Row,

for eg. with value 'Male','Female'  as Sex classification and 'Commercial','Production' as Occupation classification,

'saving account','Term deposit' as account type.

and field 6 nocount are the count/sum respective to field cmp, say

cmp=male, nocount=9,

cmp=saving account, nocount=15235.76

This format serves my purpose to generate Excel pivot, but it is vertically duplicated.

I want to use transpose to re-write the format as follows, but with half day try failed, please give a hand.

branch_codeNationCMPVersionZoneCMPMaleFemalesaving accountTerm depositCommercialProduction

for field 1-4 no transpose, to set values for cmp to the field name, to set respective values for nocount as value under the field.

Thus to avoid duplication.

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
shivas
Pyrite | Level 9

Hi ...

use LET option in transpose..

eg

proc transpose data=b out=c let;

   by branch_code PFSCustSegmentCode LoanCMP OutstandingCMP NationCMP Version NewCustInd OccupationCMP TRBLevel ZoneCMP;

   id cmp ;

   var  NoCount;

   run;

View solution in original post

18 REPLIES 18
art297
Opal | Level 21

Are those six values, in the mixed cases as shown, the only six values in the variable cmp?

bbb_NG
Fluorite | Level 6

Dear Art,

     In fact I'm shiftless.

     I meant that field1-4 are grouped-by fields, infact there are 10 grouped-by fields.

     and in cmp there are about 60 values.

     the Descartes made the data so big but with duplicated records in  10 grouped-by fields.

     I have 4M records in this format,  I suppose after transpose it will shrink to 10k-100k records but with more fields but no duplication.

Thanks.

Tom
Super User Tom
Super User

Can you post some actual records and how you want them transposed?

I think it sounds like you have data where one variable has the variable name (Male or Female) and another variable has the value (Count?).  This is perfect for PROC TRANSPOSE.

proc transpose data=have out=want ;

   by grp var ;

   var count ;

   name zone;

run;

You probably have data for many variables so that you might get a diagonal matrix.

Some thing like.

GRP  Var    Male Female Savings

1  Gender   10     20      .

1  Savetype  .      .    1002

2  Gender   12     18      .

2  Savetype  .      .    2345

You should be able to use the fact that the off terms are all missing to use the UPDATE statement to fix that.

data fix;

   update want(obs=0) want ;

   by grp ;

    drop var;

run;

bbb_NG
Fluorite | Level 6

Tom,

Thanks for your reply.

I'm now digesting it.

For your question ,I have data

branch_codePFS Cust Segment CodeLoanCMPOutstandingCMPNationCMPVersionNew Cust IndOccupationCMPTRBLevelZoneCMPcmpNoCount
AFUNORMALNNCN201201NCommercialTRB1:0_10KZone 1Male1
AFUNORMALNNCN201201NCommercialTRB1:0_10KZone 2Female2
AFUNORMALNNCN201201NCommercialTRB1:0_10KZone 1Commercial3
AFUNORMALNNCN201201NCommercialTRB1:0_10KZone 1Production4
AFUNORMALNNCN201201NCommercialTRB1:0_10KZone 1Saving Account Balance15,763.67


field branch_code  to ZoneCmp are all grouped already.

and I want the output

branch_codePFS Cust Segment CodeLoanCMPOutstandingCMPNationCMPVersionNew Cust IndOccupationCMPTRBLevelZoneCMPMaleFemaleCommercialProductionSaving Account Balance
AFUNORMALNNCN201201NCommercialTRB1:0_10KZone 1123415,763.67


It's the real data.

Thanks

art297
Opal | Level 21

I think that the following does what you want:

proc summary data=have nway;

  class branch_code nationcmp

     /*and any other grouping fields*/

     cmp;

  var NoCount;

  output out=temp (drop=_:) sum=;

run;

proc transpose data=temp

      out=want (drop=_:);

  by branch_code nationcmp

     /*and any other grouping fields*/;

  id cmp;

  var NoCount;

run;

shivas
Pyrite | Level 9

Hi ,

Is this what your are expecting,

data a;

input branch_code $          PFSCustSegmentCode $ LoanCMP $ OutstandingCMP $          NationCMP $          Version $ NewCustInd $          OccupationCMP $ 13.          TRBLevel $ 13.          ZoneCMP $ 7.          cmp $ 22.          NoCount;

cards;

AFU          NORMAL          N          N          CN          201201          N          Commercial          TRB1:0_10K          Zone 1          Male                              1

AFU          NORMAL          N          N          CN          201201          N          Commercial          TRB1:0_10K          Zone 2          Female                              2

AFU          NORMAL          N          N          CN          201201          N          Commercial          TRB1:0_10K          Zone 1          Commercial                          3

AFU          NORMAL          N          N          CN          201201          N          Commercial          TRB1:0_10K          Zone 1          Production                           4

AFU          NORMAL          N          N          CN          201201          N          Commercial          TRB1:0_10K          Zone 1          SavingAccountBalance          15763.67

;

run;

proc sort data=a out=b;by branch_code PFSCustSegmentCode LoanCMP OutstandingCMP NationCMP Version NewCustInd OccupationCMP TRBLevel ZoneCMP;run;

proc transpose data=b out=c ;

   by branch_code PFSCustSegmentCode LoanCMP OutstandingCMP NationCMP Version NewCustInd OccupationCMP TRBLevel ZoneCMP;

   id cmp ;

   var  NoCount;

   run;

bbb_NG
Fluorite | Level 6

Dear All,

Thank you all for spending time looking at my question and giving an answer.

Very thank you.

bbb_NG
Fluorite | Level 6

Dear all,

     I've write the code as you mentioned above,

first sort, then transpose.

but error occurs

ERROR: ID value“'TRB9_4:300k_1M'n” appears twice in the same By group.

Note, that point to the following by group;

BRANCH CODE=AFU 'PFS Cust Segment Code'n=PSE LoanCMP=N OutstandingCMP=N NationCMP=CN Version=201201 New Cust Ind=N.

Then I tried debug, using filters as "BRANCH CODE=AFU 'PFS Cust Segment Code'n=PSE LoanCMP=N OutstandingCMP=N NationCMP=CN Version=201201 New Cust Ind=N", and only to find that there's no records for CMP=''TRB9_4:300k_1M'n', which means that the error might attributed to that

I've listed values for cmp ,which accounts for 135 items.

and for the already-grouped records, may only have 100 or even less of the item in vertical duplicate format.

Is the missing values caused the faliure of this transpose function?

bbb_NG
Fluorite | Level 6

By the way,anyone knows how to let the log using english only, my SAS EG 4.2 is with english panel, but the log is combined with english and chinese, thus I have to translate the error, not the original ones.

Tom
Super User Tom
Super User

Unless you have mistyped the value in the second row you should not include the variable ZONECMP in the proc transpose call.   Your example input has two values for this variable and your desired output only has one row.

Based on your example rows there should not be rows of data where CMP is like TRB.....  Those values look like the values that are in the variable TRBLevel.   Check and make sure that you are really dealing with a single range of values in your source data.  One of the problems when trying to deal with Excel files is that it is totally free form and so the values in the sheets might not form nice rectangular tables that SAS or a database system can handle.  You might need to treat the sheet as being composed of more than one block of data.

The other issue about why you might not be able to find a match for the string that you see in the error message is that SAS might remove leading spaces before the value.  You might try searching for just pieces of the values listed in the log. For example:  where index(cmp,'TRB') and index(cmp,'4:300').

bbb_NG
Fluorite | Level 6

Tom,

Thanks for your high level instruction. I 'm checking my mistakes under your  instruction.

shivas
Pyrite | Level 9

Hi ...

use LET option in transpose..

eg

proc transpose data=b out=c let;

   by branch_code PFSCustSegmentCode LoanCMP OutstandingCMP NationCMP Version NewCustInd OccupationCMP TRBLevel ZoneCMP;

   id cmp ;

   var  NoCount;

   run;

bbb_NG
Fluorite | Level 6

Shivas,

     Thanks a lot.

     After using Let, the error appears no more.

     I'm not working on the checking procedure.

     If nothing wrong, I will mark it as correct.

     Very thanks.

bbb_NG
Fluorite | Level 6

Shivas,

     Thanks,

     a little "Let" can do a so great work.

thanks.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 1444 views
  • 6 likes
  • 4 in conversation