BookmarkSubscribeRSS Feed
dstuder
Obsidian | Level 7

Hi everybody

In order to fill out a questionnaire I need to do some (around 300) queries in our database and extract the number of cases where certain conditions are true. My SAS-Code is created dynamically by a PHP-Script and looks like this (it's actually a loop over all queries):

 

 

/******************************************
Question "Q1"
******************************************/
DATA want (KEEP = var1 var2 var3);
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var1, var2, var3;
RUN;

PROC SQL;
SELECT "Q1" AS question, COUNT(*) AS count FROM want WHERE var1='xy';
QUIT;

/******************************************
Question "Q2"
******************************************/
DATA want (KEEP = var4 var5 var6);
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var4, var5, var6;
RUN;

PROC SQL;
SELECT "Q2" AS question, COUNT(*) AS count FROM want WHERE var4='y';
QUIT;


... and so forth

Now, I need to summarise these results somehow, so that I get a table like this:

Question | Result
---------------------- 
Q1 120
Q2 240
Q3 1

My idea is that at the beginning I create an empty table `results` with two columns (name of the query and result). Then in each code-block (Q1, Q2, ... Qx) I append one line to the results-table (name of the query and the result itself).

 

But how can I do this? How can I save the result of each PROC SQL in an existing table?

Do you see any better solution?

 

10 REPLIES 10
ed_sas_member
Meteorite | Level 14

Hi @dstuder 

I suggest that you adapt the code as below:

 

/******************************************
Question "Q1"
******************************************/
DATA want (KEEP = var1 var2 var3);
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var1 var2 var3; 	/* <---- REMOVE COMMAS */
RUN;

PROC SQL;
   CREATE TABLE Q1 AS /* <---- ADD THIS */
   SELECT "Q1" AS question, COUNT(*) AS count FROM want WHERE var1='xy';
QUIT;

/****************************************** 
Question "Q2" 
******************************************/ 
DATA want (KEEP = var4 var5 var6); 
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var4 var5 var6; /* <---- REMOVE COMMAS */
RUN;

PROC SQL;
	CREATE TABLE Q2 AS /* <---- ADD THIS */
	SELECT "Q2" AS question, COUNT(*) AS count FROM want WHERE var4='y';
QUIT;

/**
	....
**/

/* APPEND */

DATA FINAL;
	LENGTH question $ 100;
	SET Q:;
	RENAME count = result;
RUN;

Best,

dstuder
Obsidian | Level 7

Sorry for the bad code. It was more of a pseudo-code in order to show the problem and I didn't actually run it.

 

DATA FINAL;
	LENGTH question $ 100;
	SET Q:;
	RENAME count = result;
RUN;

So does this code actually summarize all the tables Q1, Q2, ... Q300 (created by PROC SQL) and put the table names and number of cases in a table `final`? I am also asking myself whether it wouldn't be better to overwrite a single table "Q" for each question instead of creating hundreds of tables called Q1, Q2 etc.?

ed_sas_member
Meteorite | Level 14

Hi @dstuder 

The CREATE TABLE xx AS statement just before the SELECT clause in PROC SQL put the results of the query in a dataset xx.

-> So in the code provided, it will create datasets Q1, Q2, ...

 

In the FINAL dataset, SAS append all tables beginning with the letter Q (-> Q:).

 

Definitely, you can avoid creating hundreds of intermediate datasets.

Here is a way to proceed:

/******************************************
Question "Q1"
******************************************/
DATA want1 (KEEP = var1 var2 var3); /* <--- RENAME WANT*/
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var1 var2 var3; 	/* <---- REMOVE COMMAS */
RUN;

/****************************************** 
Question "Q2" 
******************************************/ 
DATA want2 (KEEP = var4 var5 var6);  /* <--- RENAME WANT*/
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var4 var5 var6; /* <---- REMOVE COMMAS */
RUN;


/* APPEND */

PROC SQL;
   CREATE TABLE FINAL AS /* <---- ADD THIS */
   SELECT "Q1" AS question, COUNT(*) AS count FROM want1 WHERE var1='xy' /* ADAPT WANT1*/
	UNION /* <---- ADD THIS */
   SELECT "Q2" AS question, COUNT(*) AS count FROM want2 WHERE var4='y' /* ADAPT WANT2*/
    UNION
   /*...*/
	;
QUIT;

 

 

 

Kurt_Bremser
Super User

Is the WHERE condition always based on the first variable in the sort nodupkey?

 

Anyway, your code can be optimized:

%let question=Q1;
%let varlist=var1 var2 var3;
%let value=xy;

proc sort
  data=mydatabase (
    keep=&varlist
    where=(%scan(&varlist,1) = "&value")
  )
  out=sorted
  nodupkey
;
by &varlist;
run;

data single_result;
question = "&question";
count = nobs;
set sorted nobs=nobs;
stop;
keep question count;
run;

proc append data=single_result base=want force;
run;

If my assumptions were correct, this can form the basis of a macro with three parameters. Create a dataset with those three parameters, so you can run the macro in an automated way. Before running that, initialize the want dataset.

dstuder
Obsidian | Level 7

Hello Kurt,
actually no. The where-condition is random and depends entirely on what the specific question in the questionnaire asks for. It can contain one or several variables.

Kurt_Bremser
Super User

"Any kind of code" parameters in macros are quite tricky to handle. In this case I suggest you use call execute (or dosubl) from within a data step to create the code.

But the basic structure of my code is still valid: use the where condition and keep= in the first step that reads data and does the sort, and then get the number of observations.

Note that my code overwrites the temporary dataset single_result in every iteration and appends it to the final table, so you'll only have two datasets in the end, one of which can safely be discarded.

dstuder
Obsidian | Level 7

Thanks! I'll have to go through your answer carfully as I am still a newbie in SAS.

A short follow-up question: Does it make any difference in terms of performance whether I use a macro or if I just repeat all the code for each question? I mean if it's just a question of readability I dont really care because I am creating the SAS-code automatically using a PHP-script.

Kurt_Bremser
Super User

Creating the code from a php script will probably mean you write a program, run it in batch, rewrite it, run it again, and so on. Why not use the tools that SAS provides for creating code dynamically?

RichardDeVen
Barite | Level 11

The following SQL construct can be used to insert (i.e. append) a query result into an existing table

INSERT INTO <TABLE>
SELECT ... ;

Example:

* blank base table for results;
proc sql; create table work.results (Question CHAR(20), Result NUM); * Dynamic 1 .... ; data dynamic1; set sashelp.class; run;
* Append query results for Q1 to results table; proc sql; insert into results select 'Q1', count(*) from dynamic1; * Dynamic 2 .... ; data dynamic2; set sashelp.class; where age < 14; run; * Append query results for Q2 to results table;
proc sql; insert into results select 'Q2', count(*) from dynamic2; %let syslast = results;
ballardw
Super User

This looks a lot like skip pattern verification. If your current data is one record per respondent I would use a data step instead of hundreds of queries in SQL.

My basic approach when I have done this is to create a bunch of indicator variables 1 (for correct/desired/expected value of the variable with conditions) and 0 else. Then use a summary or report step to sum the 1 values to get one output table.

 

More dummy /pseudo code:

data want;
   set have;
   Q1 = (var1 ='xy');
   Q2 = (var4 ='y');
   /* skip pattern check: expect some value of Var5 
      when var3='y'
   */
   Q3 = (not missing(var5) and var3='y');

run;

proc summary data=want ;
   var Q1 - Q3;
   output out=work.summary sum=;   /* a mean would be percent of records
                                      with the response*/
run;

/* or a report procedure*/
proc tabulate data=want;
   var Q1 -Q3;
   table Q1 -Q3,
         sum
   ;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 899 views
  • 1 like
  • 5 in conversation