BookmarkSubscribeRSS Feed
BCNAV
Quartz | Level 8

Hi all, I have an issue with adjusting the look of my data. I am using EG but can also program reasonably well.

 

I have data ordered as:

 

Customer_ID     Invoice_num     Customer_name     Chrg     Amt

1121                        1                    Bank1                    Fee1       1

                                                                                    Fee2        2

                                                                                    Fee3        3

1122                        2                    Bank 2                   Fee4       12

                                                                                    Fee7       15

 

And so on. There are up to 9 Fee Chrgs. (fee is a description of the fee type)

 

I need to flatten this so I get

 

Customer_ID     Invoice_num    Customer_name    Chrg1 Chrg2 Chrg3 Chrg4 (up to 9)  Amt1 Amt2 Amt3 (up to 9) and a total

1121                          1                    Bank1                Fee1  Fee2   Fee3  Fee4                     1       0       5     

1122                          6                    Bank2                Fee1  Fee2   Fee3  Fee4                     1       0       5     

 

All Fee description types (9) must be listed horizontally, if a client does not have that fee type on their invoice, it would be 0

 

I am a bit stuck with this.  Thanks

15 REPLIES 15
error_prone
Barite | Level 11

Please pos the data as datastep and explain the logic. I don't understand why Invoice_num is "6" for Customer_ID "1122".

RW9
Diamond | Level 26 RW9
Diamond | Level 26

You have a couple of options, which, if you had provided some test data in the form of a datastep using the {i} above the post area I would provide code against.  As such I will just say:

1)  Proc transpose twice, once for fee, once for amt, then merge the two resulting datasets

2)  Array the two sets of data in a datastep by the given variables, output on last of the by group.

 

There are any number of posts and resources on this type of thing out there, a good term is re-shaping data.  E.g

http://support.sas.com/resources/papers/proceedings13/082-2013.pdf

 

Note that programming with such a structure will make your coding harder, transposed is really only useful in output documents.

ballardw
Super User

It helps to have example data in the form of data step code. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

If you actual data does not have Customer_ID     Invoice_num     Customer_name     on each record as implied by the data example then likely the first thing to do would be to make sure those variables are populated for each row.

The data would be better structured to have a variable Fee with values of 1, 2, 3

 

Here's one example. The id variables end up on the right but that could addressed in any output or by adding length statements for them prior to the array in the last data step.

 

data have;
   input id $ inv $ cust $ Fee Amt;
datalines;
1121 1   Bank1  1 1
1121 1   Bank1  2 2
1121 1   Bank1  4 3
1121 1   Bank1  5 4
1121 2   Bank2  3 10
1121 2   Bank2  7 15
;
run;

/* following step will use BY
   processing requiring sort if not already
*/
proc sort data=have;
   by id inv cust;
run;

proc transpose data=have out=temp
  prefix=amt;
   by id inv cust;
   id fee;
   var amt;
run;

data want;
   array chrg {9} $4 ;
   array amt {9}  ;
   set temp;
   do i=1 to dim(chrg);
      chrg[i] = cats('Fee',i);
      if missing(amt[i]) then amt[i]=0;
   end;
   drop i;
run;
art297
Opal | Level 21

If you download and run the %transpose macro (http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset), the task requires minimal coding:

 

data have;
  input Customer_ID Invoice_num Customer_name $ Chrg $ Amt;
  cards;
1121                      1          Bank1     Fee1     1
1121                      1          Bank1     Fee2     2
1121                      1          Bank1     Fee3     3
1122                      2          Bank2     Fee4    12
1122                      2          Bank2     Fee7    15
;

%transpose(data=have, out=want, by=Customer_ID, var=Chrg Amt, sort=YES)

Art, CEO, AnalystFinder.com

 

BCNAV
Quartz | Level 8

I have been trying to upload data, but when I run the data2datastep macro I get:

 

ERROR: Insufficient authorization to access C:\WINDOWS\system32\create_EGTASK_HOLD_data.sas

 

How can I direct output?

 

 

BCNAV
Quartz | Level 8

Sorry, I didn't realize that the data macro needed a full filename

 

here is datastep data

data EGTASK.HOLD;
  infile datalines dsd truncover;
  input Customer_Number:$6. Invoice_Number:BEST12. Customer_Name:$8. Description:$23. Total_Amt:BEST12.;
datalines4;
N00006,300002,Airline1,Enroute/En route,31
,,,Terminal/Terminaux,22
,,,NAT/Atlantique Nord,16
,,,Overflight/Survol,21
,,,I.Comm./Comm.int.,34
,,,GST / TPS (#897815452),12
,,,HST / TVH (#897815452),0
,,,QST / TVQ (#1018137751),0
N00008,300003,Airline2,Enroute/En route,-21
,,,Terminal/Terminaux,16
,,,NAT/Atlantique Nord,21
,,,Overflight/Survol,500
,,,I.Comm./Comm.int.,25
,,,GST / TPS (#897815452),16
,,,HST / TVH (#897815452),0
,,,QST / TVQ (#1018137751),0
N00012,300004,Airline3,Enroute/En route,0
,,,Terminal/Terminaux,0
,,,NAT/Atlantique Nord,12
,,,Overflight/Survol,14
,,,I.Comm./Comm.int.,21
,,,GST / TPS (#897815452),0
,,,HST / TVH (#897815452),0
,,,QST / TVQ (#1018137751),0
N00017,300005,Airline4,Daily/Quotidienne,-66
,,,Enroute/En route,-25
,,,Terminal/Terminaux,16
,,,I.Comm./Comm.int.,14
,,,GST / TPS (#897815452),13
,,,HST / TVH (#897815452),-15
,,,QST / TVQ (#1018137751),0
N00021,300006,Airline5,NAT/Atlantique Nord,2.24
,,,Overflight/Survol,-6.27
,,,I.Comm./Comm.int.,1.24
,,,GST / TPS (#897815452),0
,,,HST / TVH (#897815452),0
,,,QST / TVQ (#1018137751),11
;;;;

thanks for your patience

 

BCNAV
Quartz | Level 8

The output should be one line per customer number (not multiple lines):

 

Customer_Number   Invoice_Number  Customer_Name  Line1_Descr (9 in total for the charge description) Line1 (to 9) for the $ amounts relating to the respective descriptions and a total

ballardw
Super User

@BCNAV wrote:

The output should be one line per customer number (not multiple lines):

 

Customer_Number   Invoice_Number  Customer_Name  Line1_Descr (9 in total for the charge description) Line1 (to 9) for the $ amounts relating to the respective descriptions and a total


Is something like this what the above refers to? Note I used different libraries as I do not feel like creating a local library just to match your code.

 

data work.temp;
   set work.hold;
   length TCustomer_Number $ 6 TCustomer_Name $ 8;
   retain TCustomer_Number  TInvoice_Number  TCustomer_Name;
   if not missing(Customer_Number) then do;
      TCustomer_Number =  Customer_Number;
      TInvoice_Number  =  Invoice_Number;
      TCustomer_Name   =  Customer_Name;
   end;
   else do;
      Customer_Number = TCustomer_Number  ;
      Invoice_Number  = TInvoice_Number   ;   
      Customer_Name   = TCustomer_Name    ;
   end;
   drop TCustomer_Number  TInvoice_Number  TCustomer_Name;
run;

 

And since "description" looks much different in content than the "chrg" in your initial question, which value corresponds to "fee1" "fee2" and such?

If this were my project I would likely have a custom informat to read the text of description into a numeric 1 to 9 value so I could control order for later bits, and have a custom format to display the text of the description values.

Tom
Super User Tom
Super User

Why do most of your records have missing CUSTOMER_NUMBER?

BCNAV
Quartz | Level 8

This is how it is supplied to us from billing. We get one customer number, then a bunch of charges...then a new customer number, then more charges related to that customer. They can't provide:

 

Customer_number     Description    Amount

121                                 Fee1              6

121                                 Fee3              9

121                                 Fee7              10

 

Instead they can do:

 

Customer_number     Description    Amount

121                                 Fee1              6

                                       Fee3              9

                                       Fee7              1

 

Do you think it is an easier issue to solve with the customer_number filled in?

 

BCNAV
Quartz | Level 8

@ballardw

 

Thanks...your output is almost there.  It provides:

 

N00006 300002 Airline1 Enroute/En route 31
N00006 300002 Airline1 Terminal/Terminaux 22
N00006 300002 Airline1 NAT/Atlantique Nord 16
N00006 300002 Airline1 Overflight/Survol 21
N00006 300002 Airline1 I.Comm./Comm.int. 34
N00006 300002 Airline1 GST / TPS (#897815452) 12
N00006 300002 Airline1 HST / TVH (#897815452) 0
N00006 300002 Airline1 QST / TVQ (#1018137751) 0

 

As you can see, there are multiple rows with the same client ID. I need to get it so there are multiple description columns and multiple amount columns so that each customer_number only occupies one row.

ballardw
Super User

I provided one that works better with a numeric value in the "description" place.

@art297 provided another that I think requires the data as just modified to work.

BCNAV
Quartz | Level 8

The only thing missing now is the descriptions. I need to be able to get, in one line, the amounts and the descriptions of the amounts.  I think this is pretty close, but I am not sure if I can use art297's to do both amount AND description to get all in one row.

ballardw
Super User

@BCNAV wrote:

The only thing missing now is the descriptions. I need to be able to get, in one line, the amounts and the descriptions of the amounts.  I think this is pretty close, but I am not sure if I can use art297's to do both amount AND description to get all in one row.


Try it.

 

Here's one way to put the pieces together that I provided. Note the use of the informat/format pair to control order. You have not mentioned that anything related to the original order of appearance was to be maintained.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 990 views
  • 0 likes
  • 6 in conversation