Hi all,
I am trying to automate an existing PROC SURVEYSELECT procedure. Every quarter the business gives us a sample size for each of several groups.
We update the sample sizes in the macro code and then run the program. Sometimes a given sample size does not match the data actually present in the input table, so the run fails with an error. We then update the sample size to match the actual data and run it again. This is the first thing I have to automate.
Secondly, we use the STRATA statement in this code as well. Sometimes a group is not present in the input table, and in that case we have to comment out the missing group in the code before running. You can see this in the attached code.
Can this be automated so that we avoid these scenarios? Instead of us updating it manually, the code should check the data and do the needful.
For reference, please find the attached code.
Thanks in advance.
You should provide some example of what "given sample size does not match the actual data present in input" means. As in, how does it not match? If you ask for 25 records and there are only 10 for a specific stratum, what do you want to happen?
Note that there is an option, SELECTALL, to select all records of a stratum if there are not as many as you desire.
Since you have what appears at first glance to be about 30 macro variables and no description of how any of them are assigned, that is something else to consider.
If this were my project I might be tempted to see if SAMPRATE instead of SAMPSIZE were more appropriate. I could set the rate to a common percentage, possibly with an NMAX option.
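As a rough sketch of the SAMPRATE idea, assuming the input is work.claima with strata variable group as in the thread: the 10% rate, the cap of 500, the seed, and the data set name work.claima_sorted are made-up values for illustration only.

```sas
/* Sketch only: rate, cap, seed and work.claima_sorted are illustrative. */
proc sort data=work.claima out=work.claima_sorted;
    by group;                      /* STRATA needs the input sorted by group */
run;

proc surveyselect data=work.claima_sorted
                  method=srs
                  samprate=0.10    /* same sampling rate in every stratum */
                  nmax=500         /* upper limit on each stratum's sample size */
                  seed=12345       /* fixed seed so the run is repeatable */
                  out=work.samplea;
    strata group;
run;
```

With a rate, a stratum that shrinks from one quarter to the next simply yields a smaller sample, so there is no per-group size to correct by hand.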
Something else to consider is the use of control data sets. You could likely build either a SAMPSIZE= or SAMPRATE= data set that has the strata identification and the desired size or rate as a value of _NSIZE_ or _RATE_, instead of a laundry list of macro variables.
Your SAMPSIZE data set could look like
Group (the strata variable)    _nsize_
1                              25
2                              18
3                              12
4                              135
...
until you have all of the values of group in your current input data set. Just make sure that you do not have any values of group in the SAMPSIZE set that do not appear in the data. _nsize_ has the typical rules for size variables.
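A minimal sketch of such a control data set, using the four rows listed above. It assumes that group is a numeric variable in work.claima; the names work.sizes and work.claima_sorted and the seed are hypothetical.

```sas
/* Control data set: one row per stratum, in ascending order of group. */
data work.sizes;
    input group _nsize_;
    datalines;
1 25
2 18
3 12
4 135
;
run;

/* The input must be sorted by the strata variable, in the same order
   as the rows of the control data set. */
proc sort data=work.claima out=work.claima_sorted;
    by group;
run;

proc surveyselect data=work.claima_sorted
                  method=srs
                  sampsize=work.sizes   /* sizes are read from _nsize_ */
                  selectall             /* take the whole stratum if it is too small */
                  seed=12345
                  out=work.samplea;
    strata group;
run;
```

The sizes then live in a table that can be loaded from wherever the business delivers them each quarter, rather than in edited code.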
Hi,
Suppose I have been asked to run the code with a sample size of 20, but the input table contains only 10 records. How can I automate this without changing the macro variable value to 10? If the input table contains fewer records than requested, we have to correct the macro variable; if it contains more, we have to use the value given to us.
As for the macro variables, all of them are assigned. The only issue is that the group behind a macro variable might not physically be present in the input table; in that case we have to check and comment it out, as shown in the attachment.
@Suvendu_Lenka wrote:
"Suppose I have been asked to run the code with a sample size of 20, but the input table contains only 10 records. How can I automate this without changing the macro variable value to 10? If the input table contains fewer records than requested, we have to correct the macro variable; if it contains more, we have to use the value given to us."
Did you look at the SELECTALL option for selecting all records if the available sample is less than sampsize?
The second part, if I understand, is going to be basic data manipulation to count levels of a variable in a data set. Proc Freq, Proc SQL or data step to count the number of actually selected records and reassign a value to a macro variable.
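For illustration, one way the counting step could look, here applied to the input table. It assumes group is numeric and that the value 2 identifies the group behind rcanum2; both are assumptions, and work.group_counts is a made-up name.

```sas
%let rcanum2 = 4776;               /* requested size, as given by the business */

/* Count the records per group in the input table. */
proc freq data=work.claima noprint;
    tables group / out=work.group_counts(keep=group count);
run;

/* Cap the requested size at the number of records available.
   If group 2 is absent, no row is read and rcanum2 stays unchanged. */
data _null_;
    set work.group_counts;
    where group = 2;               /* hypothetical code for this group */
    call symputx('rcanum2', min(&rcanum2., count));
run;

%put NOTE: rcanum2 is now &rcanum2.;
```

When SELECTALL already gives the behavior you want, this step is not needed; it only matters if the macro variable itself has to carry the corrected value.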
"As for the macro variables, all of them are assigned. The only issue is that the group behind a macro variable might not physically be present in the input table; in that case we have to check and comment it out, as shown in the attachment."
Provide a very concrete example that includes some input data, the values of the strata variable, the values of the macro variables, and how the macro variables are assigned.
If the groups behind some of the macro variables might not be present in the input table, then why do you set them in the first place? From what you have shown, how are we to know which ones aren't needed?
Did you at least look at the documentation for the SAMPSIZE= or SAMPRATE= data set option?
You need to describe in some detail where these macro variables come from and why.
Hi ballardw,
Thank you so much. The SELECTALL option is working for me: it takes the number of records available in the data set when that is less than the sample size given in the program. Thanks.
But I am still struggling with the strata groups. If you look at my code, I have commented out many strata groups because they are sometimes not available in the input data set, but not always. That means that on the next run I might have to uncomment some of them.
Basically, the question is:
"How can I sample from some of the strata in my data while ignoring others?"
For example:
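/* Requested sample size for each stratum, listed in sorted order of group */
%let rcanum1 = 5180;
%let rcanum2 = 4776;
%let rcanum3 = 3240;
%let rcanum4 = 5064;
proc surveyselect data = work.claima
    sampsize = (&rcanum1., &rcanum2., &rcanum3., &rcanum4.)
    method = srs
    out = work.samplea selectall;   /* take the whole stratum when it is smaller than requested */
    strata group;                   /* work.claima must be sorted by group */
run;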
On the next run, "&rcanum2" is not available, like this:
%let rcanum1= 5180;
%let rcanum3= 3240;
%let rcanum4= 5064;
I don't want to go and correct the code by deleting the value from the sample size list and then run it again. The code should check automatically and give me output for the 3 groups without warnings or errors. Is there any option for this?
Hope you understand! Thanks in advance.
Cheers,
I moved the question to the Statistical Procedures community, where it belongs.
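One possible way to handle the missing strata, building on the control data set suggestion earlier in the thread, is to keep the requested sizes in a table and match it against the groups that actually occur in the input before sampling. The sketch below is not a tested solution for the attached code. It assumes group is numeric with codes 1 to 4 matching rcanum1 to rcanum4; the data set names work.requested, work.present, work.sizes, work.claima_sorted and work.claima_use, and the seed, are all hypothetical.

```sas
/* Requested sizes per group (values from the example above). */
data work.requested;
    input group _nsize_;
    datalines;
1 5180
2 4776
3 3240
4 5064
;
run;

proc sort data=work.claima out=work.claima_sorted;
    by group;
run;

/* Groups that actually occur in this quarter's input. */
proc freq data=work.claima_sorted noprint;
    tables group / out=work.present(keep=group);
run;

/* Keep a requested size only when its group is present in the input. */
data work.sizes;
    merge work.requested(in=inreq) work.present(in=inpres);
    by group;
    if inreq and inpres;
run;

/* Drop input rows whose group has no requested size. */
data work.claima_use;
    merge work.claima_sorted(in=inclaim)
          work.requested(in=inreq keep=group);
    by group;
    if inclaim and inreq;
run;

proc surveyselect data=work.claima_use
                  method=srs
                  sampsize=work.sizes   /* one _nsize_ per stratum present */
                  selectall             /* small strata are taken whole */
                  seed=12345
                  out=work.samplea;
    strata group;
run;
```

If group 2 is missing in a given quarter, work.sizes ends up with three rows and the procedure samples the three remaining strata, with no code to comment out or restore between runs.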