07-13-2017 12:48 PM - edited 07-13-2017 01:10 PM
As part of my MS in Analytics program, I had an opportunity to discuss the basics of data restructuring. I wanted to share it with the SAS community.
SAS functions help SAS programmers transform values. A numeric function can, for example, round a number with decimals to the nearest integer (the ROUND function turns 100.7 into 101), and a character function can break down or reshape a string of characters. If we need to capitalize only the first letter of each word in MURALI SASTRY, the PROPCASE function returns Murali Sastry.
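The two functions mentioned above can be tried in a short DATA step. This is only a minimal sketch; the variable names (rounded, proper) are my own choices for illustration.

```sas
/* Apply ROUND and PROPCASE to literal values and write the results to the log. */
data _null_;
    rounded = round(100.7);               /* no rounding unit given: rounds to the nearest integer */
    proper  = propcase('MURALI SASTRY');  /* upper-cases the first letter of each word, lower-cases the rest */
    put rounded= proper=;                 /* log shows: rounded=101 proper=Murali Sastry */
run;
```

Using data _null_ runs the statements without creating an output dataset, which is handy for quick checks like this.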
These are just two examples of SAS functions. There are several hundred of these functions for transforming data from its “current state” to a “desired state”. As data analytics professionals, we need to review data from as many perspectives and dimensions as feasible. In addition to the numeric and character functions described above, there are statistical, financial, and other functions that assist in data transformation or data manipulation without changing the raw data structure.
Tasks Needed to Manipulate and Restructure Data:
SAS functions perform the tasks needed to manipulate and restructure data into the desired state. If data has been entered inconsistently in a database, it makes sense to standardize and structure it, and SAS functions come in handy for this. Examples of such inconsistencies include names entered in all capital letters, contact telephone numbers where some have parentheses around the area code and some do not, and values where some have leading or trailing blanks and some do not. In such cases, data manipulation is necessary to ensure data quality before working with the data to understand it.
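The kinds of cleanup described above could look something like the sketch below. The dataset names (contacts_raw, contacts_clean), the variable names, and the sample values are all made up for illustration, and keeping only the digits of a phone number is just one possible way to standardize it.

```sas
/* Hypothetical raw data with inconsistent names and phone formats. */
data contacts_raw;
    length name $30 phone $20;
    name = 'JOHN SMITH';     phone = '(919) 555-0101'; output;
    name = '  mary jones  '; phone = '919-555-0102';   output;
run;

/* Standardize the values without touching the raw dataset. */
data contacts_clean;
    set contacts_raw;
    length name_clean $30 phone_digits $10;
    name_clean   = propcase(strip(name));    /* drop leading/trailing blanks, then proper-case */
    phone_digits = compress(phone, , 'kd');  /* 'kd' modifier: keep digits only */
run;

proc print data=contacts_clean;
run;
```

Writing the cleaned values to new variables in a new dataset leaves the raw data as entered, so the original can always be compared against the standardized version.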
Objects in SAS to enable tasks to be performed:
To organize data manipulation activities, the SAS Enterprise Guide user interface offers a project tree (which keeps the phases of a project together), a process flow (which advances the project from understanding the data through transforming it to provide the needed insights), and a task status display that shows progress while a task is executing (for example, in ‘run’ mode the status might read ‘status 1 of 3, running PROC CONTENTS’). Like any Windows application, SAS Enterprise Guide also has File, Edit, View, Tasks, Program, Tools, and Help pull-down menus.
SAS Functions to be applied to Datasets for Summarization:
The SUM function can be used in both the DATA step and PROC steps (for example, as a summary function in PROC SQL) to summarize data. Other options for summarizing data include PROC SUMMARY and PROC MEANS.
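To make the difference concrete, here is a rough sketch with a made-up dataset (sales, with hypothetical variables region, q1, q2, q3). In the DATA step, SUM adds values across variables within each row; PROC MEANS and PROC SQL add values down a column across rows.

```sas
/* Hypothetical quarterly sales; a period marks a missing value. */
data sales;
    input region $ q1 q2 q3;
    total = sum(q1, q2, q3);   /* row total; SUM ignores missing values, unlike q1 + q2 + q3 */
    datalines;
East 100 150 .
West 200 . 50
East 75 25 10
;
run;

/* Column total by region with PROC MEANS. */
proc means data=sales sum;
    class region;
    var total;
run;

/* The same summary with the SUM summary function in PROC SQL. */
proc sql;
    select region, sum(total) as region_total
    from sales
    group by region;
quit;
```

Note that total = q1 + q2 + q3 would return a missing value for the first two rows, which is one reason the SUM function is often preferred for row totals.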
Matlapudi, Anjan, and Knapp, Daniel (2012) Ways to Summarize Data using SUM function in SAS
Cody, Ron (2007) Learning SAS by Example: A Programmer’s Guide. 2013 Cary, NC: SAS Institute, Inc.