BookmarkSubscribeRSS Feed
dstuder
Obsidian | Level 7

Hi everybody

In order to fill out a questionnaire I need to do some (around 300) queries in our database and extract the number of cases where certain conditions are true. My SAS-Code is created dynamically by a PHP-Script and looks like this (it's actually a loop over all queries):

 

 

/******************************************
Question "Q1"
******************************************/
DATA want (KEEP = var1 var2 var3);
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var1, var2, var3;
RUN;

PROC SQL;
SELECT "Q1" AS question, COUNT(*) AS count FROM want WHERE var1='xy';
QUIT;

/******************************************
Question "Q2"
******************************************/
DATA want (KEEP = var4 var5 var6);
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var4, var5, var6;
RUN;

PROC SQL;
SELECT "Q2" AS question, COUNT(*) AS count FROM want WHERE var4='y';
QUIT;


... and so forth

Now, I need to summarise these results somehow, so that I get a table like this:

Question | Result
---------------------- 
Q1 120
Q2 240
Q3 1

My idea is that at the beginning I create an empty table `results` with two columns (name of the query and result). Then in each code-block (Q1, Q2, ... Qx) I append one line to the results-table (name of the query and the result itself).

 

But how can I do this? How can I save the result of each PROC SQL in an existing table?

Do you see any better solution?

 

10 REPLIES 10
ed_sas_member
Meteorite | Level 14

Hi @dstuder 

I suggest that you adapt the code as below:

 

/******************************************
Question "Q1"
******************************************/
DATA want (KEEP = var1 var2 var3);
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var1 var2 var3; 	/* <---- REMOVE COMMAS */
RUN;

PROC SQL;
   CREATE TABLE Q1 AS /* <---- ADD THIS */
   SELECT "Q1" AS question, COUNT(*) AS count FROM want WHERE var1='xy';
QUIT;

/****************************************** 
Question "Q2" 
******************************************/ 
DATA want (KEEP = var4 var5 var6); 
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var4 var5 var6; /* <---- REMOVE COMMAS */
RUN;

PROC SQL;
	CREATE TABLE Q2 AS /* <---- ADD THIS */
	SELECT "Q2" AS question, COUNT(*) AS count FROM want WHERE var4='y';
QUIT;

/**
	....
**/

/* APPEND */

DATA FINAL;
	LENGTH question $ 100;
	SET Q:;
	RENAME count = result;
RUN;

Best,

dstuder
Obsidian | Level 7

Sorry for the bad code. It was more of a pseudo-code in order to show the problem and I didn't actually run it.

 

DATA FINAL;
	LENGTH question $ 100;
	SET Q:;
	RENAME count = result;
RUN;

So does this code actually summarize all the tables Q1, Q2, ... Q300 (created by PROC SQL) and put the table names and number of cases in a table `final`? I am also asking myself whether it wouldn't be better to overwrite a single table "Q" for each question instead of creating hundreds of tables called Q1, Q2 etc.?

ed_sas_member
Meteorite | Level 14

Hi @dstuder 

The CREATE TABLE xx AS statement just before the SELECT clause in PROC SQL put the results of the query in a dataset xx.

-> So in the code provided, it will create datasets Q1, Q2, ...

 

In the FINAL dataset, SAS append all tables beginning with the letter Q (-> Q:).

 

Definitely, you can avoid creating hundreds of intermediate datasets.

Here is a way to proceed:

/******************************************
Question "Q1"
******************************************/
DATA want1 (KEEP = var1 var2 var3); /* <--- RENAME WANT*/
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var1 var2 var3; 	/* <---- REMOVE COMMAS */
RUN;

/****************************************** 
Question "Q2" 
******************************************/ 
DATA want2 (KEEP = var4 var5 var6);  /* <--- RENAME WANT*/
SET myDatabase;
RUN;

PROC SORT NODUPKEY;
BY var4 var5 var6; /* <---- REMOVE COMMAS */
RUN;


/* APPEND */

PROC SQL;
   CREATE TABLE FINAL AS /* <---- ADD THIS */
   SELECT "Q1" AS question, COUNT(*) AS count FROM want1 WHERE var1='xy' /* ADAPT WANT1*/
	UNION /* <---- ADD THIS */
   SELECT "Q2" AS question, COUNT(*) AS count FROM want2 WHERE var4='y' /* ADAPT WANT2*/
    UNION
   /*...*/
	;
QUIT;

 

 

 

Kurt_Bremser
Super User

Is the WHERE condition always based on the first variable in the sort nodupkey?

 

Anyway, your code can be optimized:

%let question=Q1;
%let varlist=var1 var2 var3;
%let value=xy;

proc sort
  data=mydatabase (
    keep=&varlist
    where=(%scan(&varlist,1) = "&value")
  )
  out=sorted
  nodupkey
;
by &varlist;
run;

data single_result;
question = "&question";
count = nobs;
set sorted nobs=nobs;
stop;
keep question count;
run;

proc append data=single_result base=want force;
run;

If my assumptions were correct, this can form the basis of a macro with three parameters. Create a dataset with those three parameters, so you can run the macro in an automated way. Before running that, initialize the want dataset.

dstuder
Obsidian | Level 7

Hello Kurt,
actually no. The where-condition is random and depends entirely on what the specific question in the questionnaire asks for. It can contain one or several variables.

Kurt_Bremser
Super User

"Any kind of code" parameters in macros are quite tricky to handle. In this case I suggest you use call execute (or dosubl) from within a data step to create the code.

But the basic structure of my code is still valid: use the where condition and keep= in the first step that reads data and does the sort, and then get the number of observations.

Note that my code overwrites the temporary dataset single_result in every iteration and appends it to the final table, so you'll only have two datasets in the end, one of which can safely be discarded.

dstuder
Obsidian | Level 7

Thanks! I'll have to go through your answer carfully as I am still a newbie in SAS.

A short follow-up question: Does it make any difference in terms of performance whether I use a macro or if I just repeat all the code for each question? I mean if it's just a question of readability I dont really care because I am creating the SAS-code automatically using a PHP-script.

Kurt_Bremser
Super User

Creating the code from a php script will probably mean you write a program, run it in batch, rewrite it, run it again, and so on. Why not use the tools that SAS provides for creating code dynamically?

RichardDeVen
Barite | Level 11

The following SQL construct can be used to insert (i.e. append) a query result into an existing table

INSERT INTO <TABLE>
SELECT ... ;

Example:

* blank base table for results;
proc sql; create table work.results (Question CHAR(20), Result NUM); * Dynamic 1 .... ; data dynamic1; set sashelp.class; run;
* Append query results for Q1 to results table; proc sql; insert into results select 'Q1', count(*) from dynamic1; * Dynamic 2 .... ; data dynamic2; set sashelp.class; where age < 14; run; * Append query results for Q2 to results table;
proc sql; insert into results select 'Q2', count(*) from dynamic2; %let syslast = results;
ballardw
Super User

This looks a lot like skip pattern verification. If your current data is one record per respondent I would use a data step instead of hundreds of queries in SQL.

My basic approach when I have done this is to create a bunch of indicator variables 1 (for correct/desired/expected value of the variable with conditions) and 0 else. Then use a summary or report step to sum the 1 values to get one output table.

 

More dummy /pseudo code:

data want;
   set have;
   Q1 = (var1 ='xy');
   Q2 = (var4 ='y');
   /* skip pattern check: expect some value of Var5 
      when var3='y'
   */
   Q3 = (not missing(var5) and var3='y');

run;

proc summary data=want ;
   var Q1 - Q3;
   output out=work.summary sum=;   /* a mean would be percent of records
                                      with the response*/
run;

/* or a report procedure*/
proc tabulate data=want;
   var Q1 -Q3;
   table Q1 -Q3,
         sum
   ;
run;

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1872 views
  • 1 like
  • 5 in conversation