- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I am trying to figure out a way to create a new SAS dataset with an if/then statement or case statement in proc sql.
What I am looking at is a dataset that doesn't have any observations in it at all. I have some code that will give me a zero in a temp dataset if no observations are found in the original set. What I need to be able to do is either:
1. Create a dataset with 3 observations where the first variable is populated with something I hard code and the rest of the variables 0's if the main dataset has no observations or the temp dataset is showing 0 indicating there were no observations.
2. If there are observations in the main dataset, then I want to create the new dataset based off of those observations.
I have researched but haven't really found a way yet. At least not one that is recognizable. I found this paper but I am not sure if it would help me here. http://www.lexjansen.com/phuse/2014/cc/CC06.pdf
Any ideas would be greatly appreciated.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
Ok, so let me explain. Basically we are using basic datastep and logic to generate strings which are inserted into the compilation phase after the datastep has finished - this is what call execute does. So to break down:
/* Create empty results table */ proc sql; create table RESULTS (A num,B num); quit; data _null_; set sashelp.vtable (where=(libname="WORK" and memname="ABC")); /* The above takes SAS metadata about table work.abc */ /* Based on that if there are zero observations from work.abc we send */ /* this text out at the end of the step */ if nobs=0 then do; call execute('proc sql; insert into RESULTS set A=0,B=0; quit;'); end; /* This text is called if there are observations */ else do; call execute('proc sql; insert into RESULTS (select sum(VARA) as A,SUM(VARB) as B from WORK.ABC; quit;'); end; run;
Now I have overcomplicated it a bit to show the method. Basically if from the metadata zero observations are found in WORK.ABC - which we will assume is your data, then the text string:
proc sql; insert into RESULTS set A=0,B=0; quit;
Gets pushed out to the compiler and that is executed. Otherwise the sum sql statement is pushed out and executed.
If you want exact code, post some test data in the form of a datastep.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I think this will get you there. One detail: is the variable you hard-code part of the main data set or not? It might make a difference in adding a DROP statement or in setting a length for that variable.
data want;
if 0 then set have end=done;
if done=0 then do until (done2);
set have end=done2;
output;
array nums {*} _numeric_;
end;
else do;
do _n_=1 to dim(nums);
nums{_n_}=0;
end;
something_i_hard_code='Some Value';
do _n_=1 to 3;
output;
end;
end;
stop;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, it would be part of the original dataset if an observation existed.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
With SQL something like below should work.
data have_with_obs;
var1='a'; var2='b'; var3=1; var4=2;
do i=1 to 5;
output;
end;
run;
data have_zero_obs;
stop;
set have_with_obs;
run;
data filler;
var1='0'; var2='0';
do i=1 to 3;
output;
end;
stop;
set have;
run;
proc sql;
create table want_hasObs as
select *
from have_with_obs
union all corr
select *
from filler
where (select count(*) from have_with_obs)<1
;
quit;
proc sql;
create table want_Filler as
select *
from have_zero_obs
union all corr
select *
from filler
where (select count(*) from have_zero_obs)<1
;
quit;
Instead of the "select(*)...." which adds a full pass through the table, you could also query dictionary.tables and retrieve the number of observations from there.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Well, the simplest method I can think is a process thing:
1) create an empty dataset containing the variables you expect in the output
2) run your program and set the output from that program with the template from step 1
3) if there are no observations in that dataset then run some code to populate with default text
So as an example:
proc sql; create table WANT ( A char(200), B num, C num ); quit; /* run your code */ data want; set want results; run; data _null_; set sashelp.vtable (where=(libname="WORK" and memname="WANT" and nobs=0)); /* If nothing is found in the where, then this code will never get called */ call execute('proc sql; insert into WANT set A="No obs",B=0,C=0; quit;'); run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@RW9 Thanks for the help. I am trying to get this to work because it is the answer in this thread that I feel more comfortable with what is going on. However, I have a couple of questions.
First, I can't figure out which piece of this code will populate my new dataset with data from the orignal dataset if there are observations in it. Also, I can't determine what the "results" set is supposed to represent your example.
Here is the flow.
Original Dataset -> Either has observations or doesn't -> If it has observations, I am summing totals from the observations and inserting them into a new dataset. -> If it doesn't have observations, I want to insert the records for 0 amounts into the summary dataset.
I hope that makes sense. I feel like this is close to what I need but I am missing something.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
Ok, so let me explain. Basically we are using basic datastep and logic to generate strings which are inserted into the compilation phase after the datastep has finished - this is what call execute does. So to break down:
/* Create empty results table */ proc sql; create table RESULTS (A num,B num); quit; data _null_; set sashelp.vtable (where=(libname="WORK" and memname="ABC")); /* The above takes SAS metadata about table work.abc */ /* Based on that if there are zero observations from work.abc we send */ /* this text out at the end of the step */ if nobs=0 then do; call execute('proc sql; insert into RESULTS set A=0,B=0; quit;'); end; /* This text is called if there are observations */ else do; call execute('proc sql; insert into RESULTS (select sum(VARA) as A,SUM(VARB) as B from WORK.ABC; quit;'); end; run;
Now I have overcomplicated it a bit to show the method. Basically if from the metadata zero observations are found in WORK.ABC - which we will assume is your data, then the text string:
proc sql; insert into RESULTS set A=0,B=0; quit;
Gets pushed out to the compiler and that is executed. Otherwise the sum sql statement is pushed out and executed.
If you want exact code, post some test data in the form of a datastep.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@RW9 ok, I think I am with you. I believe I have that working, however, I am wondering if I am able to do select statements with unions in order to get 3 seperate records inserted into the new dataset if there are no records in the original? Otherwise, is there a way to modify the set statement example you gave me?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I was able to do what I needed by using 3 seperate insert statements to get my 3 observations. It seems like there would be a more efficient way. If you have any suggestions @RW9 let me know. At least this is working. I really appreciate your help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
No probs. If you want to post your code I will try to take a look.