paulrockliffe
Obsidian | Level 7

 

Pretty much every dataset I produce has some common columns containing the Sector, Region and Office that each line of data is linked to.  Data is often distributed to a Sector, a Region or an Office, and usually this is done by dumping the entire dataset to an Excel spreadsheet and using slicers etc. to let individuals decide which data cut they want.  This is all fairly new as a way of working, and I'd like to improve things, because having a full data cut often causes security concerns, or simply means there's too much data for antiquated machines to run comfortably.

 

I'd like to add a Code Node to my Enterprise Guide process flows that creates multiple tables, one for each value of a <parameter>, with each table filtered to that value.  The parameter would be Sector, Region or Office; for example, if I used Region, I would end up with 8 output tables, one for each Region.

 

If I could do that, that would be amazing.  But if it's relatively simple, I'd like to be able to use multiple parameters, as sometimes it would be very useful to break the data down further, for example a Region's data broken down by Sector.

 

In most cases I would want to export the results to Excel; I don't have access to any of the services around that other than right-click, 'export results as a step in the process'.  So code that incorporated the export would be really helpful too.  Obviously I'd need to specify a location in the Code Node, and the export needs to be .csv, as the way Enterprise Guide builds .xls files (the referencing between the different .xml files is either relative or the opposite, I can't remember which) means they can't be used as Data Connections in Excel.  Data Connections are really useful for us, as they allow us to build process flows that automatically update the Excel-based products that we deliver.

 

Hope that all makes sense, but if any part isn't clear, just let me know.

 

Thanks in advance!

23 REPLIES
Kurt_Bremser
Super User

To me, that would be a real case for stored processes. You can set prompts for the parameters (pre-populated with variable names) and then create the tables accordingly.

Since prompts are also available in EG (and you can create a stored process from a node), that is the first path I'd follow.

 

Do you have a STP server?

 

Basic code for splitting along parameters:

%let indata=sashelp.cars;
%let outlib=work;
%let param=origin;

/* Build a lookup table holding each distinct value of the parameter */
proc sort
  data=&indata (keep=&param)
  out=lookup
  nodupkey
;
by &param;
run;

data _null_;
/* Assemble the DATA statement listing one output dataset per value */
call execute ('data ');
do until (eof1);
  set lookup end=eof1;
  call execute("&outlib.." !! trim(&param) !! ' ');
end;
call execute ("; set &indata.;");
/* Route each observation to the dataset matching its parameter value */
do until(eof2);
  set lookup end=eof2;
  call execute("if &param = '" !! trim(&param) !! "' then output &outlib.." !! trim(&param) !! ';');
end;
call execute('run;');
stop;
run;

At the end you'll have three datasets in work, dynamically created from sashelp.cars. All parameters are supplied as macro variables, so you can easily define prompts for that. You could also use the same lookup dataset to create a single export step for every origin, using call execute and a where dataset condition in the proc export.
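As a sketch of that last idea (the &outpath location is a placeholder you'd set yourself), one export step per value can be generated the same way, with a where= dataset option doing the filtering:

```sas
%let outpath=/some/target/dir/;   /* hypothetical export location */

/* Generate one PROC EXPORT per distinct parameter value */
data _null_;
set lookup;
call execute("proc export data=&indata. (where=(&param = '" !! trim(&param) !! "')) " !!
             "outfile='&outpath." !! trim(&param) !! ".csv' dbms=csv replace; run;");
run;
```

The replace option lets the flow be rerun without errors when the .csv files already exist.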

paulrockliffe
Obsidian | Level 7

Thanks, that's very helpful!

 

I'm making this up as I go along really, so I don't know if I can store a process, but I'll look into that.  What's an STP Server?

 

Thanks again, really appreciate the help.

paulrockliffe
Obsidian | Level 7

Oh, in your code example I'm specifying the input data table.  Is there any way to have that pick up whatever table the Code Node is linked to?  

paulrockliffe
Obsidian | Level 7

Having a look around, I can open Stored Processes, and I've found a load in a folder, so it looks like that might be an option, as long as I can save to that folder.  That'll depend on permissions, but also whether I'm allowed to use the folder.  I'll investigate.  In the meantime, if I get the code to work, can I easily convert that into a Stored Process?

 

Kurt_Bremser
Super User

@paulrockliffe wrote:

Having a look around, I can open Stored Processes, and I've found a load in a folder, so it looks like that might be an option, as long as I can save to that folder.  That'll depend on permissions, but also whether I'm allowed to use the folder.  I'll investigate.  In the meantime, if I get the code to work, can I easily convert that into a Stored Process?

 


There are several steps involved:

- identify which nodes in an EG project will have to go into one STP

- get the code from these nodes and combine them into one code node, or have them in one process flow

- define the user prompts and use the results of the prompts in the nodes or the code

- once that works, create the stored process from that

 

Requirements

- you need a physical location (a "stored process repository") on the server where the stored process code will be stored (you and sassrv need read/write/execute permissions - OS level - there)

- you need a folder in metadata where the stored process definition will be stored. My Folder can be used, but if you want to share your STP, a folder in "Shared Data" is needed. The necessary metadata permissions are required there.

 

For most of the tasks, Enterprise Guide is a good helper. Start small (think "hello world"), and move to more complex STPs from there.

 

Since a BI server usually includes a web component, trying

http://your_sas_server:7980/SASStoredProcess

should lead you to the web interface for stored processes.

7980 is the default port for the SAS web server as delivered with SAS 9.4.

Kurt_Bremser
Super User

A code node does not have dedicated "input data" or "output data" parameters like other nodes, because there is no way for EG to know beforehand whether your code actually needs or creates data.

For this you will have to rely on defined user prompts.

A quick introduction to user prompts can be found here: http://support.sas.com/resources/papers/proceedings13/028-2013.pdf
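Once a prompt is defined on the code node, EG hands its value to the code as a macro variable with the prompt's name. A minimal sketch (the prompt names indata, outlib and param are assumptions matching the earlier code):

```sas
/* When run with prompts, EG sets &indata., &outlib. and &param.
   from the prompt values, replacing the %let statements used for testing. */
%put NOTE: splitting &indata. by &param. into library &outlib.;
```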

Kurt_Bremser
Super User

In a BI Server environment (meaning you have a metadata server, application servers and a web interface to them), one of the elements is a stored process server. This means you have one or more SAS processes waiting, and when a request is made, the code for the STP is fetched and executed. Output from that process can be an output stream (e.g. HTML). Through the web interface, people can request STPs for execution.

 

A good starting point is here: https://support.sas.com/documentation/cdl/en/stpug/68399/HTML/default/viewer.htm#p0ct3m137hug02n1lqu...

paulrockliffe
Obsidian | Level 7

OK, I've made a start with getting this to work within my setup and with a table that includes my population.

 

I've got it working with Office as the parameter, but where the Office has two words in its name (e.g. Castle House), it creates two identical tables, one named Castle, the other House.  Is that an easy fix?  It causes problems dealing with the data downstream.  I've also got two Regions with London in the name; this results in the code stopping once it's created the first table called "London".

 

 

Is it possible for the code to start by deleting any tables it's already created?  Running it a second time causes errors because the tables already exist: ERROR: Data set WORK.LONDON is already open for output.  I've tried deleting the tables that are visible on the Process Flow, but they remain in WORK, so that doesn't remove the error.  

 

Thanks again!

Kurt_Bremser
Super User

When you prepare the lookup table, you can create two variables: one for the actual contents of your parameter variable, and the other with a value derived from it that is a valid SAS name (no blanks or special characters).

If it's not that easy to create valid SAS names, consider just using &param.001, &param.002 and so on (the name of the parameter plus a sequential number), as these will surely be valid.

The problem with the locks should be fixed if you set your Enterprise Guide to not automatically open resulting datasets for display.

paulrockliffe
Obsidian | Level 7

Thanks, sorry, do you mind explaining how I add the second column to the lookup table created by the PROC SORT?

 

I'm very new to SQL and can follow what you've done, but I don't know how to add another variable, since your code isn't adding the first variable, just keeping something that was already there.

Kurt_Bremser
Super User

How you solve the problem of creating valid SAS names depends on the data you have, so you have to apply a hefty dose of Maxim 3 (Know your data) to the problem. Do a PROC FREQ on the variables in question, and see if a simple replacement of blanks with underscores solves it without creating unwanted similar values.

A first attempt might look like this:

%let indata=sashelp.baseball;
%let outlib=work;
%let param=team;

proc sort
  data=&indata (keep=&param)
  out=lookup
  nodupkey
;
by &param;
run;

data lookup;
set lookup;
/* derive a dataset name by replacing blanks with underscores */
vname = translate(trim(&param),'_',' ');
run;

data _null_;
call execute ('data ');
do until (eof1);
  set lookup end=eof1;
  call execute("&outlib.." !! trim(vname) !! ' ');
end;
call execute ("; set &indata.;");
do until(eof2);
  set lookup end=eof2;
  call execute("if &param = '" !! trim(&param) !! "' then output &outlib.." !! trim(vname) !! ';');
end;
call execute('run;');
stop;
run;

If it's hard to create valid dataset names (special characters etc.), the intermediate step for preparing lookup might be changed to this:

data lookup;
set lookup;
vname = "&param" !! put(_n_,z4.);   /* e.g. team0001, team0002, ... */
run;

Now the datasets will be numbered, so you will have to rely on the lookup dataset to tell you which dataset belongs where.

paulrockliffe
Obsidian | Level 7

Thanks, I'll take a look at that.  

 

The data is only going to be split by three columns, all containing only names of either a Region, Office or Sector.  There are something like 10 offices, 8 regions and 10 sectors, and most are single words, though there are some with multiple words separated by a space and a few with an ampersand in there.  To get a valid SAS name I'd need to transform the contents by deleting ampersands, removing double spaces and then replacing spaces with underscores.  That would cover every possible eventuality for the data I have.  
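Something like this is what I have in mind (a sketch of the steps above, built on the lookup dataset from earlier; the $32 length also keeps the name within the dataset-name limit):

```sas
data lookup;
set lookup;
length vname $32;   /* character assignment truncates silently at 32 */
/* '&' -> blank, collapse the doubled blanks, trim, then blanks -> underscores */
vname = translate(trim(compbl(tranwrd(&param, '&', ' '))), '_', ' ');
run;
```

So for example "Arts & Media" would become Arts_Media rather than Arts__Media.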

 

Or I could get the chap who builds the table I'm using to modify it so that it only contains valid results; that would probably be easiest!

paulrockliffe
Obsidian | Level 7

Right, got that working I think, though it was creating some dataset names longer than 32 characters, so I've had to use substr to knock the ends off the names.  I can't work out how to remove an ampersand without ending up with two underscores next to each other, but that's a minor problem.

 

Do you know how I go about getting each output table to automatically export to a folder as a .csv file?

 

I presume that if I use prompts to control the code node, the only way to easily apply the code in a new project is to use a Stored Process?

 

Thanks again, this has been a great help!

Kurt_Bremser
Super User

The lookup file created in my last version already contains the name of the dataset, which can also be used as a valid filename for the OS.

So you can use it to dynamically create the export steps:

%let outpath=/somewhere/subdir/;

/* One export step per dataset; replace allows re-running the flow */
data _null_;
set lookup;
call execute("proc export data=&outlib.." !! trim(vname) !! " outfile='&outpath." !! trim(vname) !! ".csv' dbms=csv replace; run;");
run;


Discussion stats
  • 23 replies
  • 2413 views
  • 1 like
  • 2 in conversation