ScottBass
Rhodochrosite | Level 12

Hi,

Say I have a Stored Process (STP) with a Data Source and Data Source Item prompt.

The generated macro variables will look something like this:

DATASOURCE=/Shared Data/CLASS(Table)
DATASOURCE_TYPE=1
COLUMN=Name
COLUMN0=5
COLUMN1=Name
COLUMN2=Sex
COLUMN3=Age
COLUMN4=Height
COLUMN5=Weight
COLUMN_COUNT=5
COLUMN_PATH=/Shared Data/CLASS
COLUMN_PATH1=/Shared Data/CLASS
COLUMN_PATH2=/Shared Data/CLASS
COLUMN_PATH3=/Shared Data/CLASS
COLUMN_PATH4=/Shared Data/CLASS
COLUMN_PATH5=/Shared Data/CLASS
COLUMN_SOURCE_TYPE=1
COLUMN_SOURCE_TYPE1=1
COLUMN_SOURCE_TYPE2=1
COLUMN_SOURCE_TYPE3=1
COLUMN_SOURCE_TYPE4=1
COLUMN_SOURCE_TYPE5=1
COLUMN_TYPE=1
COLUMN_TYPE1=1
COLUMN_TYPE2=1
COLUMN_TYPE3=2
COLUMN_TYPE4=2
COLUMN_TYPE5=2
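For reference, the numbered variables are straightforward to consume in macro code. A minimal sketch, assuming the data source item prompt is named COLUMN as above, that assembles the selections into a comma-separated list for a later SELECT clause:

%macro column_list;
  %local i list;
  %do i=1 %to &COLUMN_COUNT;
    %* append each selected column, comma-separated ;
    %if &i = 1 %then %let list=&&COLUMN&i;
    %else %let list=&list, &&COLUMN&i;
  %end;
  &list
%mend column_list;

%put The SELECT clause columns are: %column_list;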

Minor question, just curious:  If I look at the properties of the SASHELP.CLASS table object via SMC while in the Data Library manager, under the Physical Storage tab I see:  DBMS: SAS; Library: SASHELP; Name: CLASS.  If I look at the properties of the SASHELP.CLASS table object while in the Folder containing the object itself, I don't get the Physical Storage tab.  Why would the properties window be different based on where I view the properties from?

Other table objects generated from our production libraries have Library entries like:  My Favourite Library and Some Library With a User-Friendly Name (i.e. library names as the user sees them in EG).

Given that the table object appears to store the Library name as part of its metadata:

1) Given the metadata path to the table object name (i.e. DATASOURCE=/Shared Data/CLASS(Table) ), is there any way I can derive the libref of that table object in my stored process code?

2) Can anyone enlighten me why SAS returns DATASOURCE=/Shared Data/CLASS(Table) and COLUMN_PATH1=/Shared Data/CLASS for Data Source and Data Source Items?  IMO SAS should return DATASOURCE=SASHELP.CLASS and COLUMN_PATH1=SASHELP.CLASS, or perhaps COLUMN1=SASHELP.CLASS.Name instead, and ditch COLUMN_PATH1 altogether.  Is there any reason why a stored process developer would be interested in the metadata item path rather than the "location" (i.e. libname.memname) that the metadata table item represents?  Am I missing something here?

3) The end-user GUI displays the Data Source metadata path for the Data Source Item, i.e. a table with Data Source Item and Data Source columns.  Can anyone envision a scenario where this information would be of interest to the end user?  IMO my end users will find this confusing - they couldn't care less where the table object is stored in the metadata.

4) I'd like my end user to select a Data Source, and then have the downstream Data Source Item prompts be dependent on the chosen Data Source.  Is there any way to set up this dependency?  (I assume No, but perhaps a little birdie will share the undocumented way to implement this).

5) (Minor):  What's the difference between COLUMN0 and COLUMN_COUNT?  Why have both?

Thanks,

Scott


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
1 ACCEPTED SOLUTION

Accepted Solutions
Vince_SAS
Rhodochrosite | Level 12

This may help you when working with a data source prompt; I don't know of a similar macro to deal with data source items.

Usage Note 37386: New SAS Stored Process prompts in SAS 9.2: Data source and Data source item - Usage Example

http://support.sas.com/kb/37/386.html

Vince DelGobbo

SAS R&D


16 REPLIES
Cynthia_sas
SAS Super FREQ

Hi:

  Here's my .02 on your questions:

1) I believe you have to look in the metadata for the LIBREF; I am not sure what you mean by "derive". The SASHELP datasets are probably not a good choice for an example because, I think, they are all registered under Shared Data folders, so they are inherently different from your own tables and data sources.

2) DATASOURCE is the METADATA location -- that is NOT a physical disk drive location, nor is it the lib.data name. DATASOURCE is only something that "lives" in the metadata. DATASOURCE_ITEM is the equivalent of the column name in a table; in your example, they are the same. But since the DATASOURCE could be an information map, the DATASOURCE_ITEM might not be a physical column in the data -- it might be something that comes from the information map. The DATASOURCE prompt can only see into a SAS Folder metadata location to select a cube, information map, or table. If DATASOURCE_TYPE=1, then the data source is a table; 2 = OLAP cube; 4 = information map based on relational data; 8 = information map based on an OLAP cube. DATASOURCEITEM allows the user to select data items from either tables or information maps that are based on relational tables.

3) I do not normally surface these to end users, but sometimes use them for my own testing as hidden prompts or prompts whose values I set during testing with %LET.

4) The rule for dependent prompts is that the prompt type MUST be text, numeric or date.

5) COLUMN0 (as an indicator of how many selections the user made) is compatible with the original "workings" of the Stored Process Server (which was compatible with the SAS/Intrnet Application Dispatcher server). COLUMN_COUNT is the updated way of showing number of prompt selections. The numbers should be the same. In some instances, prompt0 will not be created, while prompt_COUNT will always be created.
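A small defensive sketch (the prompt name COLUMN is assumed, as in the original post) that works whichever of the two count variables exists:

%macro prompt_count(prefix);
  %* prefer the newer _COUNT variable, fall back to the legacy 0-suffixed one ;
  %local n;
  %let n=0;
  %if %symexist(&prefix._COUNT) %then %let n=&&&prefix._COUNT;
  %else %if %symexist(&prefix.0) %then %let n=&&&prefix.0;
  &n
%mend prompt_count;

%put Number of selections: %prompt_count(COLUMN);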

  You can always open a track with Tech Support if you are having issues with a specific stored process and prompt.

cynthia

jakarman
Barite | Level 11

Scott, as I read your questions they are not technical questions but questions about generic concepts.

If that is correct, then the Tech Support direction points into a dark labyrinth.  (Sorry, Cynthia.)


Most of what exists today has a long evolutionary history.  SAS has existed since 1976, which within IT is something of a pre-historic era.
There is much positive to say about that, but it also implies, at first sight, some weird conventions.

-- The metadata approach is relatively new to SAS and is still in a build-up phase of its life cycle. This approach gets its power from the GUI-clickable building concept: DI Studio, Information Maps, and far more to come with SAS Visual Analytics (SAS® Visual Analytics | SAS). The first step in delivering web-based presentation was SAS/IntrNet, and you can still find SAS/IntrNet concepts such as sessions in the SAS(R) 9.3 Stored Processes: Developer's Guide.

-- The original SAS language (SAS(R) 9.4 Language Reference: Concepts) is still there, but it is being encapsulated by the newer metadata approach.

-- With sashelp.class you are using SAS in the old way, as described by the language concepts. Using the DMS you can see a lot of datasets, catalogs, and objects, and every type of object can be associated with different types of actions. This is how the object-oriented approach was standardized. The object approach peaked with SAS/AF and SCL (SAS Component Language), now a more or less forgotten part.

Portal menus (web reports) rely heavily on the SAS metadata, moving away from hand coding and instead generating things from metadata. So in that approach you will see just metadata names -- not something users are used to, as it hides the familiar physical locations.

The mix-up of those two can be very confusing.

Metadata is booming again as a generic approach. There are standards for it (Metadata standards - Wikipedia) such as the OMG CWM formal specification, which is used to provide exchange options between different suppliers.

Metadata was originally the domain of a DBA maintaining all definitions for a specific RDBMS. I remember the IDD (Integrated Data Dictionary) of IDMS implementing many of the same concepts as SAS metadata -- only 30 years earlier.

Within the SAS concepts you find a lot of limitations in the SAS language, e.g.:

- librefs/filerefs limited to a name length of 8 -- the old 8.8.8 naming and I/O control blocks of System/360.

- the 32KB length limit on variables -- the signed 2-byte limitation, quite common in the era of machines like the Commodore 64.

Many high-level languages such as C and Java, on which almost everything is built, have those kinds of limitations.
 

---->-- ja karman --<-----
ScottBass
Rhodochrosite | Level 12

Hi,

Thanks for the replies so far.  Perhaps it will help to explain what I'm trying to do...

I'm trying to develop a generic approach to creating an SQL query against a dataset using a stored process.  In most cases, it will be a straight query against a fact table.  In some instances, I will then join dimension data to the fact table subset using a hash object join.  For now, assume that the subsetting (where clause) only occurs against the fact table.

In some instances, the end user will select the libref and dataset.  In others, I will simply set default values for these parameters, then make the prompts hidden.

So, in the user interface (prompt manager), the end user selects:  the data source (libref and dataset) (FROM clause), desired output columns (SELECT clause), subset criteria (WHERE clause), and output sort order (ORDER BY clause - for now only ascending sort order is supported).

The best (albeit klunky) design I've been able to come up with so far is:

libref:  text, dynamic list, single value

dataset:  text, dynamic list, single value

columns:  text, dynamic list, multiple ordered values

filter name/value pairs (5):  name:  text, dynamic list, single value  value:  text, user enters values, multiple values  (5 pairs of prompts)

additional filtering:  text, user enters values, single value

order:  text, dynamic list, multiple ordered values

The dataset list is dependent on the libref selection, and all downstream column selections are dependent on the libref and dataset selections.

The source datasets for the dynamic lists are SASHELP.VLIBNAM, SASHELP.VTABLE, and SASHELP.VCOLUMN.  I created the metadata for these tables by creating a pre-assigned library definition for SASHELP, then importing selected tables and writing the objects to \Shared Data.

The user selects the desired libref, then the desired table, then the desired output columns.  For the filter, there are 5 "built-in" prompts where the user selects the desired variable, then enters the desired subsetting values.  Some of these variables have about 1.5 - 3M distinct values, so "Get Values" from a table simply won't perform in an acceptable manner - the user will have to type or paste the values into the prompt.  The variable data type is checked in the stored process, and the data is quoted as appropriate.  In almost all cases, 5 prompts (variables) are sufficient.  Otherwise, an advanced user can augment the filter (where clause) using the additional filtering prompt.  If 5 prompts is more than enough, I can simply hide the unnecessary prompts without having to change the underlying STP code.  If I want to "bind" a particular variable to a prompt, I can set a default value and either protect or hide the variable selection dropdown.  This approach (hopefully) allows flexibility for the end user and fast stored process development for the developer.
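As a rough illustration of that type check and quoting step (the macro name and parameters below are hypothetical, not the actual STP code, and there is no error handling):

%macro quote_values(lib=, mem=, var=, values=);
  %local dsid vnum vtype i v out;

  %* look up the variable type (C or N) in the selected table ;
  %let dsid=%sysfunc(open(&lib..&mem));
  %let vnum=%sysfunc(varnum(&dsid,&var));
  %let vtype=%sysfunc(vartype(&dsid,&vnum));
  %let dsid=%sysfunc(close(&dsid));

  %* quote each space-delimited value only when the variable is character ;
  %do i=1 %to %sysfunc(countw(&values,%str( )));
    %let v=%scan(&values,&i,%str( ));
    %if &vtype=C %then %let v=%sysfunc(quote(&v));
    %let out=&out &v;
  %end;
  &out
%mend quote_values;

%* e.g. generates "Alfred" "Alice" for a character column, 11 12 for a numeric one ;
%put where Name in (%quote_values(lib=sashelp, mem=class, var=Name, values=Alfred Alice));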

This all works fine...EXCEPT it performs horribly -- like 1-3 minutes for the prompt manager to display the UI.  So I tried making a "snapshot" of VLIBNAM, VTABLE, and VCOLUMN for a single library, but still got very poor performance.  My guess is that all these dynamic, dependent prompts, plus having to query the workspace server, are the cause of the poor UI performance.

So, I switched to using the datasource and datasource item prompts.  Which performs great - it's reading from metadata instead of dynamic views on the workspace server - but these prompts IMO set really stupid values for the macro variables!!!

1) What I meant by "derive" is, can I use PROC METADATA or the metadata data step interfaces to derive the libref from a metadata table object, if I have the path to the object?  The SASHELP table metadata objects are in Shared Data only because I wrote them there.
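One possible route is the metadata DATA step functions; a sketch follows. The METADATA_PATHOBJ call mirrors the one that appears elsewhere in this thread, but the association name (TablePackage), attribute name (Libref), and URI construction are my assumptions about the metadata model, so treat this as an untested starting point:

data _null_;
  length type id uri liburi libref $ 256;
  call missing(type, id, uri, liburi, libref);

  /* resolve the metadata folder path to an object type and id */
  rc = metadata_pathobj('', "/Shared Data/CLASS(Table)", '', type, id);

  if rc > 0 then do;
    /* ASSUMPTION: a table object is associated to its SASLibrary via TablePackage, */
    /* and the library's libref is stored in the Libref attribute                   */
    uri = cats('omsobj:', type, '\', id);
    if metadata_getnasn(uri, 'TablePackage', 1, liburi) > 0 then do;
      rc2 = metadata_getattr(liburi, 'Libref', libref);
      put 'Derived libref: ' libref;
    end;
  end;
run;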

2) I understand that DATASOURCE is the metadata location - that was in fact my point.  And I wasn't just "griping" - most (all?) of my stored processes are "old school" - I'm not doing much with say PROC METADATA, PROC STP, etc.  I truly don't understand whether any stored process developer, or stored process end user, would be interested in the metadata path over the libref.dataset name of the table object?  I'm only working with Tables, not say Information Maps.  Would the metadata path be of interest for an Information Map datasource?  If anyone can provide an example of a stored process where the metadata path is useful to the stored process, I'd love to see it; I'm sure I would learn something from the example.

<quote>DATASOURCEITEM allows user to select data items from either tables</quote>

I disagree, since the DATASOURCEITEM does not contain sufficient information to derive the source libref and dataset.  If I have SASHELP.SHOES, SASUSER.SHOES, and SOMELIB.SHOES, which SHOES dataset did the end user select?  The generated macro variables don't give me that information.

3) I can't hide the DATASOURCEITEM GUI from the user, since I need it to be visible so they can select the desired columns.  What I want to suppress is the DATASOURCE path in the GUI, since it is of no interest (and in fact confusing) to my end users.

4) IMO this is a bit short sighted.  Setting the available list of downstream DATASOURCEITEMS based on the selection of an upstream DATASOURCE is obvious, isn't it?  Surely R&D would get this???  We were doing this in SAS/AF for decades!  In DMS, select Tools --> Query to see what I mean!

5) OK, thanks, COLUMN0 is deprecated then, but still around due to legacy functionality.

I would open a TS track, but I think this is more of stating a desired use case and request for feedback from the community.  If others are in agreement I'll formalize this as an enhancement request (although IMO a few of these are not enhancements but design bugs worthy of a hotfix).

I understand the evolutionary history of SAS, but the subject matter in this post is very recent code/functionality in SAS. Like I said, maybe I'm missing something, but I think the prompt macro variables should be returning the libref.tablename.columnname rather than a metadata path. I wonder if the prompt manager got much usability testing by end users before being released to the market?


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
Vince_SAS
Rhodochrosite | Level 12

This may help you when working with a data source prompt; I don't know of a similar macro to deal with data source items.

Usage Note 37386: New SAS Stored Process prompts in SAS 9.2: Data source and Data source item - Usage Example

http://support.sas.com/kb/37/386.html
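For what it's worth, a minimal usage sketch against the DATASOURCE prompt from the original post might look like the following (the parameter names are the ones shown in the usage note and in the wrapper further down this thread; verify the exact behavior of outvar= and assignlib= against the note):

%* DATASOURCE resolves to /Shared Data/CLASS(Table) and DATASOURCE_TYPE to 1 (table) ;
%metadata_getDatafileFromPath(
  path=&DATASOURCE,
  type=&DATASOURCE_TYPE,
  outvar=datasource_table,
  assignlib=N,
  showver=N,
  debug=
);

%* datasource_table should now hold the libref.memname of the selected table ;
%put &=datasource_table;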

Vince DelGobbo

SAS R&D

ScottBass
Rhodochrosite | Level 12

Hi

Thanks a lot for your reply.  I had a look at the Usage Note, then Googled "%metaauto", which led me to this useful SGF paper:  http://support.sas.com/resources/papers/proceedings11/309-2011.pdf.

I had a look at the %metaauto macro (C:\Program Files\SASHome\SASFoundation\9.3\core\sasmacro\metaauto.sas on my machine), as well as all the internal macros, esp. metadata_getDatafileFromPath.

However, this raises a few questions:

1)  Why the difference between the generated metadata item path between a DATASOURCE and DATASOURCEITEM prompt???  (Note: my datasource prompt is named DATASOURCE; my datasourceitem prompt is named COLUMN).

A DATASOURCE prompt generates macro variables like:

DATASOURCE=/Shared Data/CLASS(Table)
DATASOURCE_TYPE=1

while a DATASOURCEITEM prompt generates macro variables like:

COLUMN1=Name
COLUMN_COUNT=5
COLUMN_PATH1=/Shared Data/CLASS
COLUMN_SOURCE_TYPE1=1  (a Table)
COLUMN_TYPE1=1  (a Char variable)

I'm guessing (pure conjecture) that (Table) is part of the metadata item path?  After all, I could have metadata objects CLASS(Table), CLASS(Cube), CLASS(Infomap), CLASS(OLAP Infomap) all stored in /Shared Data.  I'm assuming that (Table) is a bit analogous to a file extension, and that /Path/ObjectName(ObjectType) forms a complete path?

Regardless of whether this is technically accurate, in reviewing the code for metadata_getDatafileFromPath:

  rc = metadata_pathobj('', "&PATH", '', pathtype, pathid);

only works for /Shared Data/Class(Table), NOT /Shared Data/Class. 

But, if COLUMN_PATH1=/Shared Data/CLASS is meant to be "the path to the metadata item object from which the datasourceitem is derived", then why specify an incomplete (or, as your macro puts it, a "malformed") path???

2) Ok, that's likely someone else's code with the design bug.  But, if TYPE= is a required parameter to your macro, then why don't you work around this design bug in the prompt manager code?  I would think something like this would work:

%macro test(path=, type=);

  %* Strip leading and trailing spaces ;
  %let path=%sysfunc(strip(&path));

  %* Folders can contain parentheses, so use defensive programming ;
  %* Assume "#" is not in the path, otherwise use some other delimiter ;
  %let rx=%sysfunc(prxparse(#^(.*)(\(Table\)|\(Cube\)|\(InfoMap\))$#));
  %if %sysfunc(prxmatch(&rx,&path)) %then %let path=%sysfunc(prxposn(&rx,1,&path));
  %syscall prxfree(rx);

  %if (&type eq 1) %then %let path=&path(Table);
  %else
  %if (&type eq 2) %then %let path=&path(Cube);
  %else
  %if (&type eq 4) %then %let path=&path(InfoMap);

  %put &=path;

%mend;

%test(path=/Shared Data/Class(Table),       type=1);
%test(path=/Shared Data/Class(Cube),        type=2);
%test(path=/Shared Data/Class(InfoMap),     type=4);
%test(path=/Shared Data/Class(Cube),        type=1);  * note mismatch on (type) and type= ;
%test(path=/Shared Data/Class,              type=1);
%test(path=/Shared Data/Class,              type=2);
%test(path=/Shared Data/Class,              type=4);
%test(path=/Funky/(Path) (With) (Table)/Class,type=1);  * yes, I created this folder using SMC! ;

(We don't use OLAP or Information Maps, so I'm just guessing re: the (object type ) specification - edit as required)

3) If code similar to the above was in the %metadata_getDatafileFromPath macro, then it would also work with DATASOURCEITEM prompts (although it would be better if the prompt manager design bug was fixed so you didn't have to).  Here is an example where I "wrap" your macro:

%macro my_metadata_getDatafileFromPath(
  path=,
  type=,
  outvar=,
  assignlib=N,
  showver=N,
  debug=
);

  %* Strip leading and trailing spaces ;
  %let path=%sysfunc(strip(&path));

  %* Folders can contain parentheses, so use defensive programming ;
  %* Assume "#" is not in the path, otherwise use some other delimiter ;
  %let rx=%sysfunc(prxparse(#^(.*)(\(Table\)|\(Cube\)|\(InfoMap\))$#));
  %if %sysfunc(prxmatch(&rx,&path)) %then %let path=%sysfunc(prxposn(&rx,1,&path));
  %syscall prxfree(rx);

  %if (&type eq 1) %then %let path=&path(Table);
  %else
  %if (&type eq 2) %then %let path=&path(Cube);
  %else
  %if (&type eq 4) %then %let path=&path(InfoMap);

  %metadata_getDatafileFromPath(
    path=&path,
    type=&type,
    outvar=&outvar,
    assignlib=&assignlib,
    showver=&showver,
    debug=&debug
  );

%mend;

%macro loop;
  %do i=1 %to &column_count;
    %my_metadata_getDatafileFromPath(
      path=&&COLUMN_PATH&i,
      type=&&COLUMN_SOURCE_TYPE&i,
      assignlib=n,
      outvar=COLUMN_TABLE&i,
      debug=
    );
    %put COLUMN_TABLE&i=&&column_table&i;
  %end;
%mend;

%loop;

4) Minor:  if debug=y means "debug", then debug=n should mean "don't debug".  IMO debug=<any non blank value> meaning "don't debug" is confusing, and I have to troll through the macro code to work it out.

I was a bit worried about the performance impact of having to call the macro, but it's really fast.  Your macro goes a long way to helping me solve my programming issue.

I still wish:  1) the source libname.dataset name were part of the macro variables created by the prompt manager, 2) I could hide the unwanted information from the datasource and datasourceitem user-interface components, and 3) I could create dynamic prompting, setting a dependency on datasourceitem on the upstream datasource selection.

I suspect this isn't even your code, but perhaps you could pass my opinions to the appropriate persons...

Regards,

Scott


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
Quentin
Super User

Hi Scott (et al),

I think I agree with the main thrust of your point.  But I think it's because when it comes to BI, I'm still (after a year or so) straddling the new metadata-centric approach and good old-fashioned SAS.  It feels like a happy straddling for now, but I suspect maybe not for long...

While I'm happy writing stored processes, I tend to hand write the source code for them.  And in that code I rarely use a metadata library.  I tend to use base engine libraries or whatever engine I need for a data source.  Many of my datasets are never registered in the metadata.  Some downsides of this approach are that I don't have all the security benefits of libraries/datasets defined in the metadata.  (But I'm okay with that, since I can still control access to the stored process itself, and I haven't been working with sensitive data.)  And if one day we move servers, updating the metadata libraries won't point my stored process toward the new physical locations of the data.  (But I'm okay because most of the locations/dataset names in my code are dynamically derived from macros or other project data.)  But the BI framework is obviously centered around metadata, with lots of benefits beyond the two I've mentioned.

So as an old SAS school programmer, if I had a &data prompt, I (like you) would want a simple prompt to select a libname (or physical path), data set name, etc. Like SAS/AF or whatever.  And were I to have a problem like yours, I would likely approach it as you did.  (Except that sashelp.vcolumn is so big, I'd probably have a separate nightly job to make small datasets for these dynamic prompts).
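Something like this minimal nightly-job sketch is what I have in mind (the PROMPTS library and the MYLIB filter are made-up names):

/* nightly refresh of a small lookup table that feeds the dynamic prompts */
proc sql;
  create table prompts.column_list as
  select libname, memname, name, type, label
  from sashelp.vcolumn
  where libname = 'MYLIB'        /* only the library the prompts care about */
  order by memname, name;
quit;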

But since the entire BI framework is about metadata, it's not surprising to me that when you use the data prompts, you get metadata information.  The BI approach is trying to get folks to do more with metadata, and less with libnames/data set names written in code.  See for example recent addition of metadata-bound libraries.  http://www.notecolon.info/2012/11/note-metadata-bound-libraries.html   I guess the mantra from that perspective would be: more metadata usage means more security and easier server administration and easier usage for business analysts.

As for your request for an example where knowing metadata source path for a dataset would be useful to a stored process developer, I guess it depends on how the metadata paths are set up.  For example, if metadata paths were informative, such as Data/Dev/DataX vs Data/Prod/DataX, then I can see how I might want to know the metadata path for the dataset selected by a user, so that I could determine whether they are analyzing /Dev data or /Prod data.
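For example, something as simple as this sketch (folder names hypothetical) would tell the stored process which environment the selected item came from:

%macro which_env(path);
  %* crude environment check based on the metadata folder path ;
  %local env;
  %let env=UNKNOWN;
  %if %index(%upcase(&path),/DEV/) %then %let env=DEV;
  %else %if %index(%upcase(&path),/PROD/) %then %let env=PROD;
  &env
%mend which_env;

%put Selected data comes from the %which_env(&COLUMN_PATH1) environment;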

Regards,

--Q.

jakarman
Barite | Level 11

Quentin, indeed you see a lot moving to metadata. But your assumption that it means easier administration and better security is too optimistic. Metadata-bound libraries cover only SAS data libraries, so a lot of information/data is not part of that. Setting up well-defined host security cannot be ignored; this is documented by SAS in the Security Administration Guide.

---->-- ja karman --<-----
Quentin
Super User

Agreed, Jaap.  It all starts with the host security.  I guess by "more security" from using metadata I was referring to more granular security, where for a single table you can (I think) limit access to certain variables and rows within that table.  Can't do that at the OS level.  And by "easier administration" I meant that if e.g. all libnames are registered in metadata (not coded as paths in libname statements), then an admin could switch host servers or whatever, and everything would magically work after updating the metadata, without any need to update code.

But as I said, I haven't actually played much with the metadata.  Being happy with the level of security and functionality I can get from my OS and base SAS / macros....

jakarman
Barite | Level 11

Very well stated, Quentin.

I am seeing a lot of people who have obviously forgotten the basics of software architecture/engineering. The goal you described (metadata) belongs to that. Achieving it can also be done the old way. You know the goals; I wish more people did.

---->-- ja karman --<-----
ScottBass
Rhodochrosite | Level 12

As a quick aside, and drifting a bit off topic (hijacking my own thread ;-) )...

I'm certainly no security expert, esp. with respect to metadata, but:

1) AFAIK, currently you CANNOT have a completely secure environment without host operating system security, i.e. NTFS permissions.  IOW, if someone can "connect" to the server, knows the physical path to the library, and the library and its contents have NTFS Everyone: Read permission, and the datasets don't have a password on them, then a knowledgeable end user can get access to the data.

2) I can't help feeling that fine grained access control would be better implemented at a "lower level" than the metadata server, i.e. via the engine and dataset header.  Without too much detail:

* For each engine (eg. BASE, SPDE, etc), the engine determines whether the particular path can be allocated.

* For each dataset, the dataset header determines whether the table can be accessed, which columns can be seen, and which rows can be seen.

Perhaps the engine and dataset headers communicate with the metadata server, or perhaps they read security information some other way, say some table that is updated via some GUI. 

I would like to see more fine grained security available in just an old-fashioned Base SAS install, without the need for O/S level security.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
ScottBass
Rhodochrosite | Level 12

Hi ,

While I'm happy writing stored processes, I tend to hand write the source code for them.  And in that code I rarely use a metadata library.  I tend to use base engine libraries or whatever engine I need for a data source.  Many of my datasets are never registered in the metadata.  Some downsides of this approach are that I don't have all the security benefits of libraries/datasets defined in the metadata.  (But I'm okay with that, since I can still control access to the stored process itself, and I haven't been working with sensitive data.)  And if one day we move servers, updating the metadata libraries won't point my stored process toward the new physical locations of the data.  (But I'm okay because most of the locations/dataset names in my code are dynamically derived from macros or other project data.)  But the BI framework is obviously centered around metadata, with lots of benefits beyond the two I've mentioned.

All our libraries are pre-assigned (no, I didn't come up with this policy).  And our batch jobs use the -metaautoresources "SASApp" option.  So, all our libraries are assigned, whether in EG (Workspace Server), Stored Process Server, or batch job.  We do use security to restrict access (allocation) to particular user groups.  I'm happy to use a metadata library allocation, i.e. the meta engine, but there is no need.
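For anyone unfamiliar with that option, the relevant startup options look roughly like this -- they can sit on the batch command line or in the config file; the server name, port, and credentials below are placeholders:

/* metadata connection + auto-assignment of the SASApp pre-assigned libraries */
/* (placeholders -- substitute your own metadata server, port, and identity)  */
-metaserver        "meta.example.com"
-metaport          8561
-metauser          "batch_user"
-metapass          "{SAS002}encoded-password-here"
-metarepository    "Foundation"
-metaautoresources "SASApp"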

So as an old SAS school programmer, if I had a &data prompt, I (like you) would want a simple prompt to select a libname (or physical path), data set name, etc. Like SAS/AF or whatever.  And were I to have a problem like yours, I would likely approach it as you did.  (Except that sashelp.vcolumn is so big, I'd probably have a separate nightly job to make small datasets for these dynamic prompts).

I made a snapshot of sashelp.vcolumn for a single library.  It reduced the number of columns from say 40,000 to 4,000.  Still a lot, but filtering a list of 4,000 columns should be quite quick (depending on the server specs).  It still had very poor performance with respect to the initial GUI display.

After the user has selected the datasource and datasource items, my STP code needs to know which datasource (SAS dataset) and datasource item (SAS dataset column) the user selected.  The default SAS macro variables created by the prompt manager do not provide this information (but it can be derived via Vince's macro).

But since the entire BI framework is about metadata, it's not surprising to me that when you use the data prompts, you get metadata information.  The BI approach is trying to get folks to do more with metadata, and less with libnames/data set names written in code.  See for example recent addition of metadata-bound libraries.  http://www.notecolon.info/2012/11/note-metadata-bound-libraries.html   I guess the mantra from that perspective would be: more metadata usage means more security and easier server administration and easier usage for business analysts.

Maybe.  I still assert that displaying the metadata path to the end user via the baked-in GUI elements is not good, esp. without giving the STP developer the chance to hide the path.  I just want the user to pick a column!  S/he couldn't care less about the path to the metadata table object.

As for your request for an example where knowing metadata source path for a dataset would be useful to a stored process developer, I guess it depends on how the metadata paths are set up.  For example, if metadata paths were informative, such as Data/Dev/DataX vs Data/Prod/DataX, then I can see how I might want to know the metadata path for the dataset selected by a user, so that I could determine whether they are analyzing /Dev data or /Prod data.

Our metadata paths do show this information.  I still don't want to display the path to the end user.  The user will know the environment s/he has selected by the library from which they selected the datasource.  And the environment (Dev or Prod) is not dependent on the metadata path.  If I move the table objects to path foo and bar, it will still be Dev or Prod metadata.  So I don't like the idea of binding the environment to the metadata path.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
jakarman
Barite | Level 11

Scott,

At first sight you seem to be building a "query builder" -- is there a reason for that?

It is standard functionality in Enterprise Guide and AMO. With SAS v8, SAS/IntrNet had a sample doing just that. If that is what you want to build, it should be possible.

    

The old-days codings

  • Counters/macros
    The naming convention for the automatic macro variables is as described. With that approach it is easy to construct %DO loops in the SAS macro language.
    By the way, it is similar to the REXX language.
  • Using sashelp.vlibname/vtable is terribly slow. It was already that slow in 8.2, and it is not advisable to use them when good response time is needed. With SAS 9.4, "proc delete" has had a revival: http://support.sas.com/resources/papers/proceedings13/022-2013.pdf. PROC DATASETS already performs much better than the sashelp.v- approach.

      

What are your options?

  - You could use a "libname _all_ list" with the log redirected, plus "proc datasets", to improve speed (see the sketch below).

  - You could set up more or less permanent libraries containing the libname/table information. It would make sense to use a structure like the sashelp.v- members.

  - If the expected tables are more or less stable, you could make your own copies of the sashelp.v- members in a sort of caching approach.
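To illustrate the first option, a rough sketch (the log file path and the MYLIB library are placeholders):

/* 1) capture the "libname _all_ list" output by redirecting the log */
proc printto log="C:\temp\libnames.log" new;
run;
libname _all_ list;
proc printto;  /* restore the normal log */
run;

/* 2) get member and column information from PROC DATASETS/CONTENTS     */
/*    instead of querying sashelp.vcolumn across every assigned library */
proc datasets library=MYLIB memtype=data nolist;
  contents data=_all_ noprint
           out=work.mylib_columns(keep=libname memname name type label);
quit;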

    

Using the prompting framework you can use table input when the tables that feed the prompts are registered in the metadata. That asks for a more stable table layout describing them.

Using the prompting framework with table input, and having many app servers, or wanting to use an SP server, can give you some surprises. These are:

  • The prompting menu will always open a workspace server, even if your process is associated with a stored process server or another logical server.
  • When more app servers are available for the user and libname, the prompting framework will not necessarily use the same app server that you defined your process to run on. It will pick a technically "first" one from an internal, invisible list.

These kinds of illogical, undocumented behaviors can make your solution unsolvable. Only in the case where exactly one view is possible does it function reliably.

I classify this as a design/implementation failure, although SAS won't admit that. Sorry Vince, I could not resist this opportunity, but I guess you already expected that. The reference you gave earlier is a good one, of course...

Jaap

---->-- ja karman --<-----
ScottBass
Rhodochrosite | Level 12

Hi

At first sight you seem to be building a "query builder" -- is there a reason for that?

It is standard functionality in Enterprise Guide and AMO. With SAS v8, SAS/IntrNet had a sample doing just that. If that is what you want to build, it should be possible.

I'm more than happy for a better way of doing what I'm trying to do...

We have 3 fact tables, the largest about 30M records.  The entire library is a SPDE library, with "appropriate indexes".  We have numerous dimension tables varying in size from 50 to 1M records, with the appropriate business and surrogate keys in the fact tables to join the dimension data.  Our server is of "moderate" specifications, not the fastest in the world, but not a dog either.

A query against the fact table, especially when using the "provider" variable (a particular variable with low cardinality) is surprisingly fast (5-10 seconds).  The end users are almost always interested in a particular provider.  A typical query will be "provider = '12345' and date=<year to date | month to date>".

My end users need to get to the non-summarized, detail level data very fast.  They also need the data in Excel for further processing, so I will be using AMO.  The end users are very smart, but not programmers, not the most SAS literate, and AMO rightly does not contain a query builder for this category of end user (runaway query), only the Filter and Sort functionality against a single table.

If all they wanted to do was query the fact table, the built-in "Filter and Sort" capability of AMO would do fine.  But, they may want "provider = '12345' and date = <month to date> and state = 'NSW'".  In that scenario, I want to filter the fact table, then use a hash object join for the address dimension data, then use a subsetting IF for state = 'NSW'.  To do this, I will use a stored process (STP).
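As a concrete illustration of the code I intend to generate (the fact/dimension/column names here are made up):

/* sketch of the generated query: filter the fact table via its indexes, */
/* hash-join the address dimension, then apply the subsetting IF         */
data work.result;
  if _n_ = 1 then do;
    length state $ 3;                            /* hash data variable must exist in the PDV */
    declare hash addr(dataset: 'dims.address');  /* hypothetical address dimension table     */
    addr.defineKey('address_key');
    addr.defineData('state');
    addr.defineDone();
    call missing(state);
  end;

  /* index-friendly WHERE against the SPDE fact table (month to date) */
  set spde.fact(where=(provider = '12345' and date >= intnx('month', today(), 0)));

  if addr.find() = 0;      /* hash-object join: keep rows with a matching dimension record */
  if state = 'NSW';        /* subsetting IF on the joined dimension column                 */
run;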

Our end users often complain that our "canned" STP's either contain too many or too few pre-selected columns.  So, I want to give them the option to select their desired output columns.

I further would like the end user to be able to paste subsetting information into Excel cells (esp. the provider), then use VBA to set the STP prompt and refresh.  It's unfortunate that the VBA interface to AMO is so limited (only single value text prompts), but I may be able to use hidden prompts and some fancy parsing to work around this shortcoming.
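The "fancy parsing" could be as simple as packing several values into one hidden text prompt with a delimiter and splitting it apart in the STP (the prompt name FILTERSPEC and the delimiter are hypothetical):

%* VBA sets a single hidden text prompt, e.g. FILTERSPEC=12345|NSW|MTD ;
%let provider = %scan(&FILTERSPEC, 1, |);
%let state    = %scan(&FILTERSPEC, 2, |);
%let period   = %scan(&FILTERSPEC, 3, |);
%put &=provider &=state &=period;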

Again, this needs to be FAST, both in the user interface and in query execution.  We cannot create a completely joined fact and dimension uber-table (which would be fast); it will be too huge, take too long to build, and take too much disk space.  I can't use a view (which would be slow), as the where clause passed to the view does not take advantage of the indexes.  I need to generate the code on the fly for the best performance.

Ok, so with all that painful detail out of the way...if there is a better approach, I'm all ears.

The old-days codings

  • Counters/macros
    The naming convention for the automatic macro variables is as described. With that approach it is easy to construct %DO loops in the SAS macro language.
    By the way, it is similar to the REXX language.

I'm quite familiar with macro programming.  In my previous post I had attached my current code - you can have a look at it if you wish.

  • Using sashelp.vlibname/vtable is terribly slow. It was already that slow in 8.2, and it is not advisable to use them when good response time is needed.

See my previous reply to Quentin.  Even with a small, static table, using data retrieved from the workspace or stored process server was really slow, vs. data retrieved from the metadata server (i.e. datasource and datasource item).

With SAS 9.4, "proc delete" has had a revival: http://support.sas.com/resources/papers/proceedings13/022-2013.pdf.

Sure, but I'm not sure what this has to do with the sashelp.v* tables' slow performance???

PROC DATASETS already performs much better than the sashelp.v- approach.

Yes, I usually use PROC CONTENTS, but this won't work for the datasource for a dynamic text prompt, which displays BEFORE the stored process code runs.

What are your options? - You could use a "libname _all_ list" with the log redirected, plus "proc datasets", to improve speed. - You could set up more or less permanent libraries containing the libname/table information. It would make sense to use a structure like the sashelp.v- members. - If the expected tables are more or less stable, you could make your own copies of the sashelp.v- members in a sort of caching approach. Using the prompting framework you can use table input when the tables that feed the prompts are registered in the metadata. That asks for a more stable table layout describing them.


Only if I do this BEFORE the STP runs, not as part of the STP.  Which is OK, since the dataset attributes would rarely change.  But see previous reply to Quentin (as well as my second post) - this approach still ran slowly for me.


Using the prompting framework with table input, and having many app servers, or wanting to use an SP server, can give you some surprises. These are:

  • The prompting menu will always open a workspace server, even if your process is associated with a stored process server or another logical server.

Hmmm, well that's a bummer.  Who designed that?  If I say I want my STP to run on the STP server, why open a WS server?  This could be part of the reason for my poor performance.

  • When more app servers are available for the user and libname, the prompting framework will not necessarily use the same app server that you defined your process to run on. It will pick a technically "first" one from an internal, invisible list.

Thanks for the info.  Not applicable in my case, but useful to know.

These kinds of illogical, undocumented behaviors can make your solution unsolvable.

Yep...although I'm leaning toward the datasource and datasource item approach for UI performance reasons.

Only in the case where exactly one view is possible does it function reliably.

I classify this as a design/implementation failure, although SAS won't admit that.

I agree.  SAS R&D are incredibly brilliant but 1) I don't think they always see things from the end user perspective, 2) often do similar things in an inconsistent manner (see my reply to Vince), and 3) sometimes create solutions where you can work wonders...IF you can figure out the hoops to jump through to get there.  I believe some analyst reports ranking SAS as really useful, but difficult to use, have some basis in fact (this from someone who has worked with SAS for 25+ years, and has based most of his career on SAS programming).  Perhaps it gives some of us job security ;-)


Sorry Vince, I could not resist this opportunity, but I guess you already expected that. The reference you gave earlier is a good one, of course...

Vince's input was incredibly useful - see my reply above.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
gtcox76
SAS Employee

Not sure if this helps, but you can query the dictionary.libnames table via PROC SQL; pathname is the variable containing the physical path of the assigned library.  There is also a view of this table, sashelp.vlibnam, that you can query in a DATA step.
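For example (the DESCRIBE TABLE statement writes the exact column names to the log, in case the physical-path column is named differently in your release):

proc sql;
  describe table dictionary.libnames;   /* column definitions go to the log */
  create table work.lib_paths as
  select *
  from dictionary.libnames
  where libname = 'SASHELP';
quit;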

