This week’s Custom Task Tuesday task lets the user analyze “itsy bitsy” data. In statistics, the analyses you can run on large datasets are not always appropriate for small ones, because small samples often fail to satisfy the assumptions those methods rely on.
One test that works well for tiny datasets is Fisher’s exact test, which determines whether there is a relationship between two categorical variables. It takes the place of the chi-squared test, which would be used for larger datasets.
The task we are constructing today lets the user select a dataset and two variables on which to perform Fisher’s exact test. The results include a printout of the two selected variables along with the results of the test.
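To see why small data needs special handling, here is a minimal sketch you can run in SAS Studio before building the task. The dataset name WORK.ITSY and the variables treated and improved are made up for illustration; they are not part of the task itself. With only eight observations, every cell of the 2x2 table has an expected count of 2, which is the kind of situation where the chi-squared approximation is usually considered unreliable and Fisher’s exact test is preferred.

```sas
/* Hypothetical tiny dataset: 8 observations, two numeric 0/1 variables. */
/* The names itsy, treated, and improved are illustrative assumptions.   */
data work.itsy;
  input treated improved;
  datalines;
1 1
1 1
1 1
1 0
0 1
0 0
0 0
0 0
;
run;

/* CHISQ requests the chi-squared test for comparison;      */
/* FISHER requests Fisher's exact test on the same table.   */
proc freq data=work.itsy;
  tables treated*improved / chisq fisher;
run;
```

In the output, compare the chi-squared p-value with the Fisher’s exact test p-value. SAS should also print a warning under the chi-squared results when expected cell counts are this small.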
This is what the task will look like when we are finished, along with some sample output:
In SAS Studio, under the Tasks and Utilities section, open a “New Task” as well as the “Sample Task.” We will copy and paste the necessary Velocity Template code from the Sample Task into our task.
At the top of the VTL code for your New Task, you will need to fill in the Name and Description portions as shown below:
Name: Itsy Bitsy Data Analysis
Description: Analyze itsy bitsy data using Fisher’s exact test.
After you’ve done that, you should save this task to your My Tasks folder, so you don’t lose it. Click the button in the upper left corner of the task to bring up this option screen:
The easiest way to create the CTM portion of the task is to steal VTL code from the Sample Task. From looking at our “finished product,” you can see that we are going to use one dataset selector and two role selectors. Find the code that corresponds to these objects in the Metadata section of the Sample Task, and copy and paste it into the same place in your task. Edit the code you copied to match what we want in our finished product. For each role selector, set both minVars and maxVars to 1: each role then accepts exactly one variable, which gives Fisher’s exact test the two variables it requires.
Your finished metadata portion should look something like this:
<Metadata>
  <DataSources>
    <DataSource name="DATASOURCE">
      <Roles>
        <Role type="N" maxVars="1" order="true" minVars="1" name="OPTNVAR1" exclude="VAR">Variable 1 for Fisher's Exact Test:</Role>
        <Role type="N" maxVars="1" order="true" minVars="1" name="OPTNVAR2" exclude="VAR">Variable 2 for Fisher's Exact Test:</Role>
      </Roles>
    </DataSource>
  </DataSources>
  <Options>
    <Option name="DATATAB" inputType="string">DATA</Option>
    <Option name="DATAGROUP" inputType="string">DATA</Option>
    <Option inputType="string" name="label">COPY dataset is created in Work library that is a copy of your input dataset.</Option>
    <Option name="ROLESGROUP" inputType="string">FISHER'S EXACT TEST</Option>
  </Options>
</Metadata>
Recall: each object in the metadata portion has corresponding code in the UI portion. The UI portion determines the order in which the task displays its controls. Just like before, find the code that corresponds to the objects we need (one dataset selector, two role selectors) in the UI section of the Sample Task, and copy and paste it into the same place in your task. Edit the code you copied to match what we want in our finished product.
This is what your finished UI portion should look like:
<UI>
  <Container option="DATATAB">
    <Group option="DATAGROUP" open="true">
      <OptionItem option="label"/>
      <DataItem data="DATASOURCE"/>
    </Group>
    <Group option="ROLESGROUP" open="true">
      <RoleItem role="OPTNVAR1"/>
      <RoleItem role="OPTNVAR2"/>
    </Group>
  </Container>
</UI>
In your SAS code, you will reference the Velocity variables defined in the metadata above ($DATASOURCE, $OPTNVAR1, $OPTNVAR2). This is how the interface controls your SAS code, so this is the most important part. The SAS code copies the input dataset to a COPY dataset, sorts it by the two selected variables, prints those variables, and performs Fisher's exact test on them.
This is what your final Code Template portion should look like:
data COPY;
  set $DATASOURCE;
run;

proc sort data = COPY;
  by
    #if( $OPTNVAR1.size() > 0 )#foreach( $item in $OPTNVAR1 )$item #end #end
    #if( $OPTNVAR2.size() > 0 )#foreach( $item in $OPTNVAR2 )$item #end #end
  ;
run;

proc print data = COPY;
  var
    #if( $OPTNVAR1.size() > 0 )#foreach( $item in $OPTNVAR1 )$item #end #end
    #if( $OPTNVAR2.size() > 0 )#foreach( $item in $OPTNVAR2 )$item #end #end
  ;
run;

proc freq data = COPY;
  tables
    #if( $OPTNVAR1.size() > 0 )#foreach( $item in $OPTNVAR1 )$item #end #end *
    #if( $OPTNVAR2.size() > 0 )#foreach( $item in $OPTNVAR2 )$item #end #end
    / fisher noprint;
run;
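To make the template less abstract, here is roughly what the generated SAS code might look like after Velocity substitutes your selections. This sketch assumes a hypothetical input dataset WORK.ITSY, with numeric variables named treated and improved chosen for the two roles. Those names are placeholders, and the exact whitespace in the real generated code will differ.

```sas
/* Approximate rendering of the code template, assuming:          */
/*   $DATASOURCE -> WORK.ITSY   (hypothetical dataset)            */
/*   $OPTNVAR1   -> treated     (hypothetical numeric variable)   */
/*   $OPTNVAR2   -> improved    (hypothetical numeric variable)   */

/* Copy the input so the original dataset is never modified. */
data COPY;
  set WORK.ITSY;
run;

/* Sort the copy by the two selected variables. */
proc sort data = COPY;
  by treated improved;
run;

/* Print only the two selected variables. */
proc print data = COPY;
  var treated improved;
run;

/* Cross-tabulate the two variables and request Fisher's exact test. */
proc freq data = COPY;
  tables treated * improved / fisher noprint;
run;
```

Because NOPRINT sits on the TABLES statement, PROC FREQ should suppress the crosstabulation table itself while still displaying the requested Fisher’s exact test results.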
You’re finished! You created a custom user interface for analyzing itsy bitsy data (a printout of the selected variables plus Fisher’s exact test). Click the button to save, then click the button to open the task. Make your selections, then click again to watch it run!
Want to try it yourself?
Get the code from the zip file at the end of this article or from GitHub.