We’re smarter together. Learn from this collection of community knowledge and add your expertise.

How to pivot data in SAS Data Management Studio

by SAS Employee audrey on ‎02-01-2016 11:43 AM - edited on ‎02-01-2016 03:08 PM by Community Manager (937 Views)

Sometimes, you get data in a format where you need to pivot from column to row. For example, your customers have several phone numers, kept as MOBILE, OFFICE, HOME, etc. on the same record and you'd like to pivot that table.

PivotPhones.png

 

 

Such functionality doesn't exist in SAS Data Management Studio, but can be achieved by some simple Expression code. In the Pre-Processing Expression code (before I start to read my input records), I will need to declare my variables first and set some values:

 

//New array that will contain my field list
hidden string array fieldlist

//New string that will contain indexes for fields I want to pivot
hidden string indexlist

//New integer for the number of fields that I want to pivot
hidden integer dim

//Loop integers
hidden integer i
hidden integer j

//New fields created
string PhoneTypestring PhoneNumber

//Indexes for the fields I want to pivot
indexlist = '2;3;4'

//Number of fields to pivot 
dim = aparse(indexlist,';',fieldlist)

 

In the Expression code (what will be executed for each record), I will loop on all existing fields, then set the value for my new columns:

 

//Loop on all fields
for i = 1 to fieldcount()
Begin
  for j = 1 to dim
  Begin
    if fieldname(i) == fieldname(fieldlist.get(j)) then
    Begin
      PhoneType = fieldname(i)
      PhoneNumber = fieldvalue(i)
      pushrow()
    End
  End
End

return false

 

Notes:

  • The fieldname(integer) function returns a string, representing the name of a specific field output from the parent node.
  • The fieldvalue(integer) function returns a string, representing the value of a specific field output from the parent node.
  • The <array_name>.get(integer) function retrieves the value of the specified item within an array. The returned value is in the type of the array.
  • The pushrow function pushes the current values of all symbols (this includes both field values for the current row and defined symbols in the code) to a stack. It returns a boolean (true = success ; false = error).

 

At this point, I have duplicated my Phone Numbers into 2 new columns, and I just need to use a Field Layout node to remove the unwanted columns.

 

duplicatedPhones.png

  

A Data Job build with Data Management Studio 2.6 is attached to this article, that contains the code to pivot data.

Attachment
Contributors
Your turn
Sign In!

Want to write an article? Sign in with your profile.