Hi!
I have several large tables. From each table I want to get values from specific columns.
The column names (headings) are all different, but parts of the column names are the same.
For example the word “Index” is in all of the column names.
Is it possible to write something to get values from all columns with the word “index” in the column name?
=>Not specifying each "Index" column name from tables I want something from
lca
proc sql;
select name from sashelp.vcolumn where upcase(memname)='<your dataset>'
and name like '%index%';
quit;
run;
You may need to convert the "name" column to uppercase to make it more consistent or adjust your search string if you know you're searching for a mixed-case string.
proc sql;
select name from sashelp.vcolumn where upcase(memname)='<your dataset>'
and name like '%index%';
quit;
run;
You may need to convert the "name" column to uppercase to make it more consistent or adjust your search string if you know you're searching for a mixed-case string.
Almost the same as Jason's suggestion, but I'd use dictionary.columns. e.g.:
data have;
input x index_y z other_index;
cards;
1 2 3
;
proc sql;
select name from dictionary.columns
where libname='WORK' and
memname='HAVE' and
upcase(name) like '%INDEX%'
;
quit;
Art
(apart from the "success-factor" of upper-casing the "%index%") how would your proposal perform faster?
Some inside info comes from _method and _tree.
I understood that the performance penalty in sashelp.vcolumn only comes when used in a data step (has that changed?)
Peter, No speed improvement was considered, just preference and personal familiarity. The main intended improvements were upcasing the name variable and adding a libname in the where clause. I've only used the sashelp views in a datastep and have always used the dictionary. files in sql. The performance penalty you mention interests me, as it must be compared with something. Can you elaborate?
Art
the performance issue is over collecting data through a VIEW (like sashelp.V{anything}.SQL can pass the where clause attributes to the underlying platform, because it is an sql view, but a data step does not (probably for some reason that it cannot?).
Easily demonstrated with a simple filter.
In a new SAS session there will be no tables in WORK, so create one
data sashelp_views_demo ;
retain a b ' ' c 3 ;
run ;
Now use both SQL and a DATA step to collect SASHELP.VCOLUMN entries for this demo table;
Rather than just code, I have pasted the SASlog below because it offers run and cpu time comparisons.
PROC SQL filling a VIEW <1 second real time
DATA STEP filling VIEW 54.9 seconds real time
Using 4.8 secs system cpu
10 option fullstimer ;
11 proc sql ;
12 create table proc_sql_view_filling as
13 select *
14 from sashelp.vcolumn
15 where libname = 'WORK'
16 and memname = 'SASHELP_VIEWS_DEMO'
17 ;
NOTE: Table WORK.PROC_SQL_VIEW_FILLING created, with 3 rows and 18 columns.
18 quit ;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.84 seconds
user cpu time 0.00 seconds
system cpu time 0.03 seconds
memory 207.35k
OS Memory 9584.00k
Timestamp 28/04/2012 18:19:30 o'clock
19
20 data data_step_view_filling ;
21 set sashelp.vcolumn ;
22 where libname = 'WORK'
23 and memname = 'SASHELP_VIEWS_DEMO'
24 ;
25 run ;
NOTE: The map data sets in library MAPSGFK are based on the digital maps from GfK GeoMarketing and are covered by their Copyright. For
additional information, see http://support.sas.com/mapsonline/gfklicense.
NOTE: There were 3 observations read from the data set SASHELP.VCOLUMN.
WHERE (libname='WORK') and (memname='SASHELP_VIEWS_DEMO');
NOTE: The data set WORK.DATA_STEP_VIEW_FILLING has 3 observations and 18 variables.
NOTE: DATA statement used (Total process time):
real time 54.94 seconds
user cpu time 0.40 seconds
system cpu time 4.80 seconds
memory 794.32k
OS Memory 9584.00k
Timestamp 28/04/2012 18:20:25 o'clock
It is clear that the DATA STEP opens data set headers not opened when PROC SQL fills the VIEW :- because of the message about "library MAPSGFK".
I also pasted the log for running PROC CONTENTS to create column information for the data set because that procedure is my preferred method of collecting table attributes, for its simplicity, brevity and that it appears to handle the libname.memname insensitive to upper or lower case (although I’m not sure about tables accessed through SAS/Access into a dbms which is sensitive to the case of the table name).
26 proc contents data= work.sashelp_views_demo noprint out= conts ;
NOTE: Writing HTML Body file: sashtml.htm
27 run ;
NOTE: The data set WORK.CONTS has 3 observations and 40 variables.
NOTE: PROCEDURE CONTENTS used (Total process time):
real time 5.57 seconds
user cpu time 1.01 seconds
system cpu time 0.78 seconds
memory 2882.66k
OS Memory 11120.00k
Timestamp 28/04/2012 18:22:13 o'clock
Peter: Much appreciated! Remind me to NEVER opt for sashelp.vcolumn in a datastep over either sashelp.vcolumn or dictionary.columns with PROC SQL.
Worked. Thanks
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.