SAS expertise delivered to your desktop -- on-demand and free!

Join Now

Filtering Every Which Way in SAS Enterprise Guide

by SAS Employee megak8 on ‎07-12-2017 05:31 PM - edited Monday by Community Manager (1,984 Views)

Did you miss the Ask the Expert session on Filtering Every Which Way in SAS Enterprise Guide? Not to worry, you can catch it on-demand at your leisure.

 

Watch the webinar

 

By the end of this hour, you will know the secrets to exploiting key data management, integration and filtering capabilities of SAS Enterprise Guide. We’ll cover:

  • Finding the tasks that conduct data management.
  • Four ways to conduct filtering.
  • Transposing data – restructuring observations and deriving new variables.
  • Using the Query Builder’s advanced expression editor. 

 The data management tasks are primarily found under Tasks -> DataThe data management tasks are primarily found under Tasks -> Data

Here are some highlighted questions from the Q&A segment held at the end of the session for ease of reference. 

­

 

What are the new variables that were created in the split column example?­ What is the best way to drop those?­

­

 

The new variable names (actual, forecast, residual) were derived from the values of the column that was split (type); they are now included in your new output data. 

 

Can you show one more time how to get to advanced expression?­

 

From the Query Builder, click computed columns, New -> Advanced expression­.

 

I don't see "Excel" as an option in Tools, Options, Results General.  Is that because I'm using only SAS EG 6.1?­

­

 

Yes, that feature became available in Enterprise Guide version 7.12 ­- What's new in SAS Enterprise Guide?

 

If you stopped and ended SAS right now, ­would the code be embedded? Is it embedded in the .egp file then?

 

The code that performs the subset is added to the code that is generated.­ ­­There is no code stored by default; the query builder will generate the code when it is run again.­ If the code is part of the project, it will be stored as part of the .egp file.­ ­You can verify that it is part of the project by looking for the icon that identifies code. 

 

To save the code generated, you'd need to create a task template then? ­

­

 

No, you can choose the code tab and save the code from there.

 

The join -- it checks for variable name and type but does it check for length too? ­ 

 

The default for joins is a common name/type.  I do not believe it checks for length.  It will take the length of the first variable that it encounters at compile time.­

 

What is common variable between files? ­

 

The commonly named variables with the same data type­ – usually the primary and foreign keys.

 

I noticed that in the variables presented, the variable names are different from file to file. ­

 

The query builder does not require commonly named variables; that’s the default for joins. It uses SQL code behind the scenes. (SELECT).

 

Do you need to specify paired quote marks around character variables then?

 

No, you shouldn't need to do that if you are comparing the values of the variables and not text strings.

 

In the filter & sort utility, you can query for data in a list.  Can you do the same thing in the query builder?­

 

Yes, you can do this from the filter tab in the Query Builder.­ This operation was shown during the first 30 minutes of the webinar.

 

­How would you be able to see two or more dataset windows side by side?­ View - Side by Side only is allowing to see the Process Flow Window and One dataset. How do I display side by side 2 datasets?­

 

From the top margin, view -> workspace layout -> side by side. Double click on the 1st data set. Choose side by side, then go to the empty tile and select the next data set that you want to see.­ You may need version 7.1 or later.

 

What is the difference between an .egp file and a task template then? ­Is a task template smaller than an .egp project file?

 

Task templates allow you to use the same settings for a particular task. When you run a task, such as the Bar Chart task, you might want to frequently use the same settings. Instead of opening the task multiple times to specify these settings for different input data sources, task templates enable you to save your settings for a specific task to a template. You can then run that template with any input data source and achieve more consistent results for that task. .egp files are Enterprise Guide projects – a holistic recipe of ingredients (data) and assembly instructions (tasks and code) for achieving repeatable results/outputs. They contain process flows, programs, tasks, and all the instructions for taking your data from raw and operational to intelligent and analytical.

 

Say you needed to save the transpose task as a separate .egp file  - how would you do that?­

 

Make a copy of the .egp file and remove other tasks.

 

Please address WHERE vs IF for subsetting within the DATA step.­

 

The WHERE acts like a pre-processor and goes to the SAS data to perform the subset before any other processing happens.  The IF requires that the data be loaded into memory and can then be subset.  IF is appropriate for raw data as well as SAS data.  W­HERE is only appropriate for SAS data sets.­

 

Every time I run a program, SAS automatically generates links to working data sets under Process Flow, how to remove them?­

 

SAS Technical Support (support@sas.com) is likely a better resource to answer this question. In the query builder, and most other tasks that generate data outputs, you can specify where to save outputs. You’ll need write permissions to create permanent outputs.

 

Is the underlying SAS code for all the tasks done with the wizards available to view to the user?­

 

Yes, you can select the code tab and access/edit the code from there.­

 

What's the difference between PROC TRANSPOSE and the SPLIT COLUMNS function you just used?­

 

They both use PROC TRANSPOSE behind the scenes. The SPLIT COLUMNS task generates most options for PROC TRANSPOSE; coding PROC TRANSPOSE yourself or revising the generated code give you more control.­

 

If you write a program in SAS Enterprise Guide, does SAS Enterprise Guide create a Process Flow for you, even if you didn't use the wizard and just used the code?­

­

 

There is still a process flow, though you do not have to use that as part of your visual.  You can simply use the program editor window to process the code.­

 

We have a column that is delimited that looks like this­: actual_arrival|<null>|01/19/2017 0700^actual_departure|<null>|01/19/2017 1000^status|A|D­. Is there something that can easily parse, then make this column transposed,  the column goes field name, pipe, from value, field name, to value carrot to next field­?

 

If the value is a character value, then you could possibly use the SCAN function in order to separate the values into different components. You would have to specify the separator­. If you create separate variables as the result of SCAN, then you could use TRANSPOSE. Please experiment to make sure these functions give you the output that you want.­

 

­In query builder, the replace function, like create a new column where 1 will be replaced with 01; I know how to do this one at a time.  Is there a way to copy columns of data in the replace function? ie 1, 2, 3 with 01, 02, 03.­

 

DATA step (if), formats and/or lookups are likely good fits for this operation. You could also use the dta editor in Enterprise Guide if you have write permissions.

 

What is the macro looking portion (%let ...) used for when the code for a function is generated? (most of my programs only use code to pull from Oracle and DB2, so I only use the "meat­ of the code?”)

 

Enterprise Guide often adds extra code that we might not see if we wrote the code ourselves – all code generator technologies do this to some extent because they retrieve pre-defined specifications to achieve consistency.  If you are capturing the code for subsequent use, you may want to remove the %let and other statements that have been added by Enterprise Guide.­

 

Do you recommend SAS programmers to switch to EG as programmer interface? What are the pros and cons? Thanks!­

 

Enterprise Guide gives you a modern, sophisticated Windows programming interface for server environments. It has dozens of additional amenities, such as the process flow and hierarchical views for creating repeatable, scalable processes, visual documentation, scheduling, syntax help, grid-enablement, editing data, and more such as Ten SAS Enterprise Guide program editor tricks .

 

How could I use the OH and IN query to create a data file that had one row per customer, with the last date of purchase and the total quantity purchased? Thank you!­

 

This topic was addressed during the second half of the webinar. Recency, frequency and monetary analysis is likely a good fit too. (Tasks -> Data Mining -> Recency, Frequency, and Monetary Analysis).

 

Can you show how query builder will combine/create new variables?

 

This topic was addressed during the second half of the webinar.

 

Is there an easy way to combine multiple reports onto a single result page when you build a custom report in Enterprise Guide? We're using E-Guide to build some simple dashboards - 5 or 6 reports. It would be nice to build all the reports onto a single output page rather than stringing a bunch of pages together. I realize not really the topic of this session, but a link to some sample code or sample output or other webinar that tackles this topic would be really helpful. We do not have SAS Visual Analytics, just Enterprise Guide 7.1.

 

The report builder in Enterprise Guide will likely be a good fit. It’s demonstrated in the on-demand recording here and also here, in Enterprise Guide’s help

 

I often have issues with case sensitivities when joining data. What is the most efficient way to overcome this.?

 

Case should only be an issue if you are comparing text values. You could choose to use the UPCASE or LOWCASE functions to help make consistent comparisons to text.

Can you give me a method to eliminate duplicate records? I have tried dupkey and it was taking out records that should have been kept. ­

 

There is a DISTINCT keyword that is available with SQL. ­In the Query Builder, click the Select Data tab, and select the Select distinct rows only check box.

Do you know if, when outputting to PowerPoint, can you then edit the slide from within PowerPoint?­

 

Yes, you can edit just as you would other content in PowerPoint.­

What are the new variables that were created in the split column example?­ What is the best way to drop those?­

 

The new variable names were derived from the values of the column that is being split.­ That is now part of your new data, so I wonder why you would want to drop them?

If you show the output again, you'll see that the far right 3 columns are all blank with obscure names­. ­I would not want them as part of the new dataset­.

 

Those are values such as confidence level values L95, U95 that are included in forecasts.

Can you show one more time how to get to advanced expression?­

 

From the Query Builder, select computed columns, New -> Advanced expression­.

 

Recommended Resources

Course: SAS Enterprise Guide 1: Querying and Reporting
Course: Creating Reports and Graphs with SAS Enterprise Guide
Book: The Little SAS® Enterprise Guide® Book

 

Want more tips? Be sure to subscribe to the Ask the Expert Community Library to receive follow up Q/A, slides and recordings from other SAS Ask the Expert webinars. From the Ask the Expert Library, just click Subscribe from the orange bar underneath the list of the recent articles.

NOTE: For best results when opening the attached slides, click on the “download” icon.

Contributors