About WWD

WWD · ‎08-19-2021

Please disregard my previous post. A posting was just added to my course page that stated that Virtual Lab 1 is down.

WWD · ‎08-19-2021

I am working in: Course = AI and Machine Learning Professional; Module = Machine Learning Specialist. For this course and module, students are supposed to use Virtual Lab 1. When I go to my course page, Virtual Lab 1 is not available. Have I burned up all of my time on Virtual Lab 1, or is the system down? Thank you, Bill Donaldson

WWD · ‎08-14-2021

I have a question that falls within the following area: Course = AI and Machine Learning Professional; Module = Machine Learning Specialist; Lesson = Lesson 5 Support Vector Machines; Subsection = Improving the SVM model; Video time = 21-second mark. In the video, the narrator mentions the fact that there is no iterating of the SVM model. The statistician may make changes to the hyperparameters, but the model is completely determined and there is not a sequence of models (that statement is vague, which is probably why I'm asking this question). After running the SVM node and looking at the Output table under the Assessment tab, the reader will notice a report on the number of iterations used to build the model. My question is: does the number of iterations pertain to the number of steps that were required to find a solution to the minimization problem that must be solved to identify H, where H = {x : <w,x> + b = 0}?
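For reference, the minimization problem mentioned in the question is usually written in textbooks in the hard-margin form below. This is a sketch under assumptions that are not in the original post: the data are linearly separable, and the labels are coded as y_i in {-1, +1}. An iterative solver takes several steps toward a single pair (w, b). Whether the iteration count in the Output table counts those steps is exactly what the question asks, and it is not settled here.

```latex
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i\left(\langle w, x_i\rangle + b\right) \ge 1,\quad i = 1,\dots,n,
\qquad
H = \{\, x : \langle w, x\rangle + b = 0 \,\}
```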

WWD · ‎08-11-2021

Ari: Thank you for answering these questions plus the previous questions that you answered. Bill

WWD · ‎08-11-2021

My question pertains to the following subject matter: Course = AI and Machine Learning Professional; Module = Machine Learning Specialist; Lesson = Lesson 3 Decision Trees and Ensembles of Trees.

The first question: When scoring data using an ensemble of trees, is the entire validation data set scored by each of the individual trees in the ensemble?

The second question: If my training dataset has 1000 points, will each "bagging" sample (sampling done with replacement) used to build a tree in the ensemble contain 1000 data points, or is this a hyperparameter that the statistician can set within Model Studio? A follow-up question then becomes: if my original dataset contains 400 data points, may a bagging sample drawn from the original 400 contain more than 400 points and still be mathematically defensible?

The third question: Is the only difference between bagging and boosting how the sample is selected for each tree in the ensemble? For bagging, the sampling is done with replacement; for boosting, the sampling is based on weights. But in the end, will each method, when applied to the same original dataset of size 500, produce samples of size 500?

Thank you, Bill Donaldson
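The sampling step in the second question can be sketched in plain Python. This is a generic illustration of sampling with replacement, not a description of what Model Studio does internally. The function name `bootstrap_sample` and its `sample_size` parameter are hypothetical, introduced only for this sketch.

```python
import random

def bootstrap_sample(data, sample_size=None, seed=None):
    """Draw a sample with replacement; by default it is as large as the data."""
    rng = random.Random(seed)
    n = len(data) if sample_size is None else sample_size
    return [rng.choice(data) for _ in range(n)]

# Stand-in for a training set of 1000 rows (row indices only).
training = list(range(1000))
sample = bootstrap_sample(training, seed=0)

# Fraction of distinct original rows that appear in the sample.
distinct_fraction = len(set(sample)) / len(training)

# Sampling with replacement can also return more draws than there are rows,
# but every draw is still one of the original 400 rows.
oversized = bootstrap_sample(list(range(400)), sample_size=600, seed=0)
```

With the default size, a sample drawn this way has as many rows as the original data, yet it typically contains only around 63% of the distinct original rows because some rows are drawn more than once. An oversized sample repeats rows further; it adds no information beyond the original 400 points.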

WWD · ‎08-09-2021

This is a follow-up to a previously posted question. I needed to add some information to help the reader locate the topic that I have a question about. Module: AI&ML Professional; Class/Title: Machine Learning Specialist; Lesson: Lesson 4 Neural Networks; Section: Build a Neural Network using Default Settings; Video, Demo or Practice: at the 1 minute and 50 second mark of the video. Subject: The neural network is being used to approximate the logit function. In the Output Table, there is a row titled "output nodes". For this example, SAS is reporting two (2) output nodes. When you look at the neural-network diagram within the results, there is a single output node. Will someone please explain to me what the two outputs represent and where those nodes appear in the diagram? Thank you, Bill Donaldson
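One possible reading, offered only as an assumption and not as confirmed SAS behavior, is that a binary target gets one output node per class level, with the two outputs normalized by a softmax. The sketch below shows that two softmax outputs carry the same information as a single logistic output applied to the difference of the two scores. The scores used are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def logistic(z):
    """Standard logistic (inverse logit) function."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical raw scores for the non-event and event output nodes.
z_nonevent, z_event = 0.3, 1.7

p_nonevent, p_event = softmax([z_nonevent, z_event])

# A single logistic output on the score difference gives the same event probability.
p_single = logistic(z_event - z_nonevent)
```

Under that assumption, a diagram that draws one output node and a table that counts two output nodes would be describing the same model.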

WWD · ‎08-08-2021

Hey Cynthia: All of the questions that I’ve submitted pertain to the module “Machine Learning Specialist” in the curriculum titled “Machine Learning Using SAS Viya”. One of my questions pertains to SVMs, which is Chapter 5 of this curriculum. Another question dealt with neural nets, which is Chapter 4 within this same module. Thank you, Bill

WWD · ‎08-08-2021

In Section 5 of AI and Machine Learning, the program introduces Support Vector Machines (SVM). Within this module, there is an extended discussion of how to interpret the SVM results. Assume that the data are perfectly separable by a hyperplane. Would the two clusters be the “1” cluster (the event occurred) and the “0” cluster (the event did not occur)? When linear relationships are fit for individual variables, what is the target value for the “non-event” group (all the Y values are 0)? I know this is a tough question to describe and will be tougher to interpret. If possible, may I make a phone appointment with an instructor to discuss the problem, please? Thank you, Bill Donaldson

WWD · ‎08-07-2021

In the demo within the fourth chapter, "Neural Networks", section 1, titled "Build a Neural Network using Default Settings", at time 1 minute 50 seconds, the presenter is going over the Output Table. The neural network is being used to approximate the logit function. In the Output Table, there is a row titled "output nodes". For this example, SAS is reporting two (2) output nodes. When you look at the neural network diagram within the results, there is a single output node. Will someone please explain to me what the two outputs represent and where those nodes appear in the diagram? Thank you, Bill Donaldson

WWD · ‎07-31-2021

In the AI and Machine Learning module, there is a document in Lesson 2, “Data Preparation and Model Selection”, titled “Singular Value Decomposition”. In the contained example, three SVDs are calculated. The values of the SVDs are: 1.63, 0.49, and 1.45. The author of the document says that the SVDs are ordered by the magnitude of their associated SVD values. In the example, SVD2 (value = 0.49) is said to be more important than SVD3 (value = 1.45). Ranking SVD2 above SVD3 is contrary to the method of assigning importance by the magnitude of the SVD value. My question is: why was SVD2 selected over SVD3, if the magnitude of the SVD value is used to select (or remove) an SVD? Thank you, Bill Donaldson
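The ordering that the question expects can be checked with a few lines of Python. The three values are the ones quoted in the post. The label SVD1 for the value 1.63 is an assumption based on the order in which the post lists the values.

```python
# Values quoted in the post; the SVD1 label for 1.63 is assumed from listing order.
svd_values = {"SVD1": 1.63, "SVD2": 0.49, "SVD3": 1.45}

# Rank the components by magnitude, largest first.
ranked = sorted(svd_values.items(), key=lambda item: abs(item[1]), reverse=True)
```

Ranking purely by magnitude puts SVD3 (1.45) ahead of SVD2 (0.49), which is the discrepancy the question points out.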

WWD · ‎07-30-2019

When using a data-monitoring node, must ALL tables that are to be monitored be pre-defined? In particular, during the demonstrations, the instructor identified “Customers” and “Employees” as tables that might be analyzed. When going through the properties wizard for the data-monitoring node, the end-user is required to identify the source table. Let’s say that there was a third table titled “Management” and this table was to be monitored. Before we used the methodology that was used for “Customers” and “Employees”, would the table “Management” have to be pre-defined outside of the data-monitoring node? Thank you, Bill Donaldson

WWD · ‎06-28-2019

Marc: I did a little poking around on the internet and found a discussion about what happens when one group within a category variable is not statistically different from the reference group. In the discussion, the author said that a category variable is “all or nothing”. By that, the author was saying that it is not statistically defensible to remove one dummy variable, which would mean collapsing that category into the reference group. If you refit the model using the “new” category variable with one fewer group, you’d probably be OK, I think. (In the advanced modeling module, there are techniques for collapsing categorical variables.) I don’t know enough at this point in time to even say if what I found is equivalent to what you wrote. I have some thinking to do. Thank you for looking into my question, Bill

WWD · ‎06-25-2019

In my previous question, the discussion was about when interaction terms are not statistically significant. In particular, winter-summer and spring-summer were the only season pairs that interacted with sale price given heating units. There were 4 other season pairs that were seen to be not statistically significant. If the model that was developed were to be put in production to make predictions, is it statistically defensible to create an indicator variable to indicate the only pairings that should be included? For example, there are six different season pairs: (winter, spring), (winter, summer), (winter, fall), (spring, summer), (spring, fall), (summer, fall). If one of these pairs produced a difference, for example (winter, spring), would you create an indicator variable for a winter-spring difference? If there were two combos that produced a difference, would you then create two indicator variables, one for each interaction? This is totally cryptic. Please don’t hesitate to e-mail for extra description. Bill Donaldson
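The count of six season pairs in the post can be reproduced by enumerating all unordered pairs of the four seasons. This sketch only confirms the enumeration. It says nothing about whether an indicator variable for any particular pair is statistically defensible.

```python
from itertools import combinations

seasons = ["winter", "spring", "summer", "fall"]

# All unordered pairs of distinct seasons: C(4, 2) = 6.
pairs = list(combinations(seasons, 2))
```

The enumeration yields the same six pairs, in the same order, as the list in the post.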

WWD · ‎06-25-2019

Thank you for the quick turn-around!!! Bill Donaldson

WWD · ‎06-25-2019

On page 3-23 within the Course Notes for Module 1, Statistics 1 Introduction to ANOVA, Regression ... the author is analyzing the statistical significance of an interaction term on Sale Price. The author states that there is a significant statistical difference in Sale Price when looking at Winter vs Spring. When I look at the immediately preceding table and graph, it looks like the difference is between Spring and Summer. The t-value for Spring versus Summer is 3.34. The t-value for Winter versus Spring is -1.89. What data and results did the author reference when they made their statement about significance on page 3-34? Bill Donaldson

Re: _uniqueid_ and required variables

Identifying the number of hidden nodes

_uniqueid_ and required variables

Re: Category counts

Category counts

Scoring categorical class data

Re: Text Processing

Re: Degeneration of an ARMA or MA model over time

Degeneration of an ARMA or MA model over time

Text Processing

Re: Module 1, “Preparing Data for Analysis and Reporting

Re: Where is Virtual Lab 1

Where is Virtual Lab 1

Interpreting the SAS Viya output for Support Vector Machines

Re: Three questions about Training and Validation Data Sets

Three questions about Training and Validation Data Sets

Clarification and additional background information concerning the use...

Re: Interpreting a Support Vector Machine Model

Interpreting a Support Vector Machine Model

AI and Machine Learning Module

SVD selection

Module 1, “Preparing Data for Analysis and Reporting

Re: Follow-up to previous question of Module 1 Chapter 3 ANOVA, Regre...

Follow-up to previous question of Module 1 Chapter 3 ANOVA, Regressio...

Re: Module 1 Statistics 1 Introduction to ANOVA, Regression ...

Module 1 Statistics 1 Introduction to ANOVA, Regression ...