About ycenycute

ycenycute · ‎09-29-2023

I successfully ran the Variable Selection node and got the figure below in the results. What does sequential R-square on the vertical axis mean here?

ycenycute · ‎09-15-2023

I am unable to find a point-and-click option for this in Enterprise Guide.

ycenycute · ‎09-15-2023

Is there a way to draw a boxplot in Enterprise Guide without specifying the horizontal axis? I want a single boxplot for one interval variable, without splitting it into groups along the horizontal axis.

ycenycute · ‎11-03-2022

I know I need to set the data role to Transaction when building an association rule model. But what does that mean? Does SAS treat a transaction data set differently from a raw data set? If so, how?

ycenycute · ‎10-28-2022

I see. This is super helpful. One more question: if I select Centroid, how is the optimal K selected?

ycenycute · ‎10-28-2022

Hi, I am not familiar with SAS code, so I don't know the difference between PROC FASTCLUS and PROC CLUSTER. I use SAS EM. I dragged a Cluster node from the Explore tab into the diagram and connected it to my data node. When I select the Cluster node, the property panel on the left lists Ward, Centroid and other options under the selection criterion. My question is: if I would like to use K-means, should I pick Centroid as the selection criterion? I don't think Ward is related to the K-means algorithm. Or do they both apply to K-means? Sorry, my question was moved here from the new users forum. I am not sure if I can get help here.

ycenycute · ‎10-28-2022

Okay. Are you suggesting that if I drag a Cluster node into the diagram, it does not matter whether I choose Ward or Centroid in the property panel on the left? I am able to choose Ward or Centroid when I select the Cluster node (I don't think it is a hierarchical clustering node). Are you suggesting these two methods will give the same results?

ycenycute · ‎10-28-2022

In Enterprise Miner there is a selection criterion property. What is the difference between Ward and Centroid? Do they both use the K-means algorithm? Centroid sounds like K-means, because K-means is based on the distance between each centroid and the data points.

ycenycute · ‎10-26-2022

Does the Cluster node in EM use the K-means algorithm? I know K-means relies on distances to cluster centroids. So should I change the clustering method to Centroid to enable K-means?

ycenycute · ‎10-15-2022

Thanks for the detailed explanation. If we train a model on the training data and select a model using the validation data, what is the purpose of holding out separate test data? Also, back to the cumulative lift question: is cumulative lift not a metric for comparing different models? Should we use ROC instead?

ycenycute · ‎10-15-2022

Interesting. Is this setup specific to SAS? I usually just split the data into training and test sets and evaluate model performance on the test data. As far as I know, validation data comes up in cross-validation, where we need to select optimal hyperparameters, so we further split the training set into training and validation. But we still evaluate model performance on the test data. Precisely because the observations in the test data never enter the training process, the test data can be used to evaluate model performance. In the end, we judge whether a prediction is good based on new data, not on old or historical data. So I am confused about the SAS convention for setting up training, validation and test data.
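The three roles described in this post can be illustrated with a minimal Python sketch. This is not SAS code and not how Enterprise Miner partitions data internally; the function name `three_way_split` and the 60/20/20 fractions are assumptions chosen only for illustration.

```python
import random


def three_way_split(rows, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle rows and cut them into train / validation / test partitions.

    train: used to fit candidate models
    valid: used to choose among models or hyperparameters
    test:  touched only once, to estimate performance on unseen data
    """
    if train_frac <= 0 or valid_frac <= 0 or train_frac + valid_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains
    return train, valid, test


# Hypothetical data: 100 row identifiers.
train, valid, test = three_way_split(range(100))
print(len(train), len(valid), len(test))
```

Because the test partition plays no part in fitting or model selection, its error estimate is not biased by those choices, which is the point made in the post.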

ycenycute · ‎10-13-2022

When I compare different models on the test data, how can I determine which model is better? Do I also look at the area under a curve, such as ROC?

ycenycute · ‎10-13-2022

This is super helpful. Thanks. But when I run a decision tree on a data set without splitting it into training, validation or test sets, what do baseline cumulative lift and best cumulative lift mean? How are these two curves obtained?

ycenycute · ‎10-05-2022

I think I get it. The index identifies the leaf nodes, ranked from the highest percentage of 1s in the node downward.
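The ranking described here can be sketched in a few lines of Python. This is only an illustration of the idea in the post, not the code SAS EM runs; the leaf labels, the target values and the function name `rank_leaves` are made up.

```python
def rank_leaves(leaves):
    """Rank leaves by the share of target == 1, highest first.

    leaves maps a leaf label to the list of 0/1 target values that fall in it.
    Returns (index, leaf label, share of 1s) tuples, with index starting at 1.
    """
    shares = {label: sum(targets) / len(targets) for label, targets in leaves.items()}
    ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    return [(i + 1, label, share) for i, (label, share) in enumerate(ordered)]


# Hypothetical leaves from a small tree.
leaves = {"A": [1, 0, 0, 0], "B": [1, 1, 1, 0], "C": [1, 1, 0, 0]}
ranked = rank_leaves(leaves)
for index, label, share in ranked:
    print(index, label, share)
```

In this toy example leaf B gets index 1 because three of its four observations have target 1.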

ycenycute · ‎10-05-2022

This is the figure I got.

Online Status	Offline
Date Last Visited	‎09-29-2023 07:55 AM

What is sequential R square in variable selection node in the output o...

Re: Boxplot without specifying horizontal axis

Boxplot without specifying horizontal axis

What does it mean by setting the data role to transaction?

Re: SAS EM: is cluster node using K means?

Re: SAS EM: is cluster node using K means?

Re: SAS EM: is cluster node using K means?

Re: SAS EM: is cluster node using K means?

SAS EM: is cluster node using K means?

Re: How to interpret lift / gain plot in score rankings in SAS EM: dec...

Re: How to interpret lift / gain plot in score rankings in SAS EM: dec...

Re: SAS EM: Association

Re: Why do I get Type 3 Analysis of Effects in linear regression in SA...

Re: Filter does not exclude the data

Re: How to interpret lift / gain plot in score rankings in SAS EM: dec...

Re: How to interpret lift / gain plot in score rankings in SAS EM: dec...

Re: How to interpret leaf statistics plot in SAS EM: decision tree

Re: How to interpret leaf statistics plot in SAS EM: decision tree