BookmarkSubscribeRSS Feed

Statistical Modeling with SAS for Microsoft 365

Started ‎03-03-2025 by
Modified ‎03-03-2025 by
Views 1,180

In today's data-driven world, organizations need tools that combine powerful analytics with accessibility. SAS enables users to build sophisticated statistical models, while Microsoft Excel remains a staple for data organization and reporting. By using SAS for Microsoft 365, users can seamlessly integrate SAS’s modeling capabilities with Excel’s familiar interface, leveraging the strengths of both platforms. In this post, we'll leverage SAS for Microsoft 365 to analyze factors contributing to a Congenital Heart Disease (CHD) diagnosis using a logistic regression model in SAS, and then highlight our findings in Excel.

 

Framingham Heart Study

 

For this analysis, we will be using the SASHELP.HEART table which provides results from the Framingham Heart Study which originates from a long-term cardiovascular study conducted in Framingham, Massachusetts, beginning in 1948. This dataset is widely used in epidemiological research to analyze risk factors for heart disease. It contains demographic, lifestyle, and clinical variables such as age, sex, weight, cholesterol levels, blood pressure, and smoking status, among others. The dataset has been instrumental in identifying key predictors of cardiovascular disease and remains a valuable resource for statistical modeling and machine learning applications in healthcare.

 

Inserting SAS Data into Excel

 

We'll start by creating a new workbook in Microsoft Excel, opening the SAS Viya pane, and navigating to the Data tab. From there, users can select a cataloged table or choose one from the CAS or SAS Compute Servers. Since SASHELP.HEART is located on the SAS Compute Server, we'll select All Tables from the dropdown menu, then navigate to SAS Studio Compute Context > SASHELP > HEART to open the table.

 

01_ST_heart1-2.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

After double-clicking the SASHELP.HEART table, the Results tab opens, allowing us to modify the table before inserting it into the workbook. To add all rows to the worksheet, we select All rows from the Insert rows dropdown menu. In the Table Options section, we check Display column labels and leave the remaining options unchecked.

 

02_ST_heart2-1.png

 

In the Columns tab, we can choose which columns to insert into the workbook. We select the following:

 

Status: Alive or Dead

 

AgeCHDdiag: Age at CHD Diagnosis

 

Sex: Male or Female

 

AgeAtStart: Age at the start of the study

 

Chol_Status: Cholesterol Status: Desirable, Borderline, High

 

BP_Status: Blood Pressure Status: Optimal, Normal, High

 

Weight_Status: Weight Status: Underweight, Normal, Overweight

 

Smoking_Status: Smoking Status: Non-smoker, Light, Moderate, Heavy, Very Heavy

 

We will reorder the columns by selecting the Reorder icon 01_ST_reordericon_heart6.png, selecting the AgeCHDdiag column and using the arrows on the side to move it below AgeAtStart. If the arrows do not appear, try dragging the left-hand side of the SAS Viya pane to make it larger.

 

03_ST_heart7.gif

 

The Filter tab contains the Graphical and Expression builders to filter data by rows. The Graphical Builder provides a user-friendly, drag-and-drop interface for creating queries, making it easier for users to build and manipulate data without writing code. In contrast, the Expression Builder allows users to create custom expressions, calculations, and conditional logic using functions and operators, offering more flexibility for advanced data transformations.

 

We are not going to filter our data, so let's go to the Sort tab and sort the table by AgeAtStart in ascending order by selecting Add sort, choosing AgeAtStart from the dropdown menu and ensuring the Sort icon is set to ascending 02_ST_resortascendicon_heart3.png.

 

04_ST_heart8.png

 

Select the Insert in document icon 03_ST_insertindocicon_heart4.pngto insert the table into the Excel worksheet and label the worksheet tab Data. Notice in the data that some study participants have a value for Age CHD Diagnosed and others do not. We will run a logistic regression analysis to see what factors lead to participants being diagnosed with CHD.

 

05_ST_heart16.png

 

Logistic Regression Analysis

 

Let's create another Excel worksheet called Analysis of CHD Diagnosis.

 

06_ST_heart11.png

 

Navigating to the Programs tab of the SAS Viya pane, we'll use a SAS program to run a logistic regression analysis of factors contributing to CHD diagnosis.

 

In this SAS program, we are creating a new variable called CHDYes that is equal to 1 if the participant has a value for Age CHD Diagnosed and 0 if missing. Then, we use PROC LOGISTIC to run the a regression model with Sex, Age at Start, Cholesterol Status, Blood Pressure Status, Weight Status, and Smoking Status as explanatory variables for CHD Diagnosis.

 

/* Create a new dataset with CHDYes variable */
data heart_data; 
    set sashelp.heart;
    if missing(AgeCHDDiag) then CHDYes = 0;
    else CHDYes = 1;
run;

/* Perform logistic regression with specific output */
ods graphics on;
ods select ParameterEstimates OddsRatios ORPlot;
proc logistic data=heart_data descending plots=oddsratio;
    class Sex Chol_Status(ref = "Desirable") BP_Status (ref = "Optimal") 
    Weight_Status (ref = "Normal") Smoking_Status (ref="Non-smoker") / param=ref;
    model CHDYes = Sex AgeAtStart Chol_Status BP_Status Weight_Status Smoking_Status;
run;
ods graphics off;

 

Making sure the top left cell of the Excel worksheet is selected, click Insert in document to add the results of the regression analysis to the worksheet.

 

07_ST_heart12.gif

 

The regression analysis gave us a table of Maximum Likelihood Estimates along with a table and graph of Odds Ratio Estimates. To make it easier for viewers to hone in on statistically significant factors, we can use Excel to highlight the statistically significant factors in both tables.

 

08_ST_heart13.png

 

From the analysis we see there are several statistically significant factors affecting the likelihood of a CHD diagnosis. For example, those who are classified as Overweight or Very Heavy smokers are more at risk of a CHD Diagnosis. We can use conditional formatting in Excel to highlight the cells with these values.

 

09_ST_heart14.gif

 

Males and those with a blood pressure status other than Optimal are more likely to be diagnosed with CHD. We can use Excel to create a new sheet called Blood Pressure Status by Sex and include a pivot table and pivot chart to understand the distribution of these two variables.

 

10_ST_heart15.gif

 

Conclusion

From this analysis, we were able to use the statistical modeling capabilities of SAS with the familiar interface of Microsoft Excel, all while never leaving the Excel app. To learn more about SAS for Microsoft 365 visit SAS for Microsoft 365 | SAS Support or sign up for the new SAS for Microsoft 365 course to become an expert user: Course: SAS for Microsoft 365.

 

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Cool article!

Contributors
Version history
Last update:
‎03-03-2025 06:05 PM
Updated by:

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

SAS AI and Machine Learning Courses

The rapid growth of AI technologies is driving an AI skills gap and demand for AI talent. Ready to grow your AI literacy? SAS offers free ways to get started for beginners, business leaders, and analytics professionals of all skill levels. Your future self will thank you.

Get started