- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I want to create multiple tables from one data set using proc sql. First thought is using loop with macro variables.
The large data set likes (call it large_table):
names var1
aaa 23
aaa 33
bbb 44
bbb 44
bbb 45
ccc 55
..... ....
want to create tables with all the names, e.g.
proc sql;
create table aaa_table as
select *
from large_table
where names = 'aaa'
;
quit;
proc sql;
create table bbb_table as
select *
from large_table
where names = 'bbb'
;
quit;
want to use loop, but cannot find out a better way to loop through the values. so far, tried to add index for names, then try to loop using macro variables to create tables. something like below:
/* get index, may not be necessary if there is other ways to do this */
proc sql;
select names, monotonic() as index
from (select distinct names from large_table)
;
quit;
/* then somehow to use the index as looping start and end */
/* e.g.
%macro names (...)
then add one proc sql to create tables inside the loop
*/
sorry for the long story, any help would be great!
Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @sasecn Maybe you are after this?
data have;
input names $ var1;
cards;
aaa 23
aaa 33
bbb 44
bbb 44
bbb 45
ccc 55
;
dm log 'clear';
proc sql ;
select distinct names,count(distinct names) into :names1-,:c
from have;
quit;
%macro t;
proc sql;
%do i =1 %to &c;
create table &&names&i as
select *
from have
where names="&&names&i";
%end;
quit;
%mend t;
%t
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It would be really helpful if you told us (or better yet, showed us examples of) the output you want to achieve. It would also be very helpful if you explained what you are doing, and why you are doing it.
Breaking up data sets like this is almost never a good idea, and there are almost always better ways to get to your final destination. Assuming I know what you are trying to do, and I'm not sure I actually do know what you want to do. But after you break up this data, what next? What is the next step of the program? Analysis? Plotting or reporting? Something else?
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It is for reporting purpose. I have a large table, for simplicity just think as it has two vars: names; var1. The variable names has more values: aaa, bbb, ccc, .... My final output is to break down the large table into multiple tables by the values of "names". So output tables are: aaa_table; bbb_table; ccc_table...... each is a subset of the large table broken down by the values of variable "names".
i am not good at explaining thing. hope this can make it clear.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sasecn wrote:
It is for reporting purpose. I have a large table, for simplicity just think as it has two vars: names; var1. The variable names has more values: aaa, bbb, ccc, .... My final output is to break down the large table into multiple tables by the values of "names". So output tables are: aaa_table; bbb_table; ccc_table...... each is a subset of the large table broken down by the values of variable "names".
i am not good at explaining thing. hope this can make it clear.
Why do you need separate tables to make reports? Why not just make the reports from the original table and filter by the value of the variable?
proc freq data=BIG;
where names='aaa';
tables var1;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Make your life easy. Do not split up the data set.
Use one large data set, and do the reporting in wahtever reporting procedure works for you (PROC PRINT, PROC TABULATE, PROC REPORT) using a BY statement to get different reports for different names. Or if you need different reports for different names, then do what @Tom said.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input names $ var1;
cards;
aaa 23
aaa 33
bbb 44
bbb 44
bbb 45
ccc 55
;
data _null_;
set have end=z;
by names;
if _n_=1 then call execute('proc sql;');
if first.names then do;
call execute(catx(' ','create table ', names ,' as '));
call execute(catt(' select * from ',' have ',' where ',' names=','"',names,'"',';'));
end;
if z;
call execute(' quit; ');
run;
514 data _null_;
515 set have end=z;
516 by names;
517 if _n_=1 then call execute('proc sql;');
518 if first.names then do;
519 call execute(catx(' ','create table ', names ,' as '));
520 call execute(catt(' select * from ',' have ',' where ',' names=','"',names,'"',';'));
521 end;
522 if z;
523 call execute(' quit; ');
524 run;
NOTE: There were 6 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
NOTE: CALL EXECUTE generated line.
1 + proc sql;
2 + create table aaa as
3 + select * from have where names="aaa";
NOTE: Table WORK.AAA created, with 2 rows and 2 columns.
4 + create table bbb as
5 + select * from have where names="bbb";
NOTE: Table WORK.BBB created, with 3 rows and 2 columns.
6 + create table ccc as
7 + select * from have where names="ccc";
NOTE: Table WORK.CCC created, with 1 rows and 2 columns.
8 + quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your help. Well, this code is out of my knowledge. It does work on my data though. I need to learn more about this.
One thing noticed that it takes longer time to run and generate a message says "this takes longer time, use lots of resources ...." I tested that it did use more time than if i manually create each table using single proc sql for each table. I am using EG 5.1, not sure if it is related or not.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @sasecn Maybe you are after this?
data have;
input names $ var1;
cards;
aaa 23
aaa 33
bbb 44
bbb 44
bbb 45
ccc 55
;
dm log 'clear';
proc sql ;
select distinct names,count(distinct names) into :names1-,:c
from have;
quit;
%macro t;
proc sql;
%do i =1 %to &c;
create table &&names&i as
select *
from have
where names="&&names&i";
%end;
quit;
%mend t;
%t
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, that is what I was looking for. Thanks for the help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I believe you need to take a step back and look at your problem with a bit more distance.
Many SAS Procs allow for By Group processing and you can create a table per by group. Many Procs also offer a CLASS statement and you can get statistics/reports per distinct value of a class variable.
So... What do you have? And what do you need? If you can provide simple sample data (via a working SAS data step creating such data) and then show us how the report needs to look like then I'm sure someone can come up with a solution which requires less advanced coding and is easier to understand for you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It is a good idea to use SQL to create the names, but actually a lot easier if you let SQL create all the code:
proc sql noprint;
select distinct catx(' ','create table',cats(names,'_table'),'as select * from large_table where names =',
quote(names)) into :sqlcode separated by ';' from large_table;
;
&sqlcode;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Great, thank you!