About M_Maldonado

M_Maldonado · ‎06-10-2015

Hi Ivan, This is supported for neural networks using the AutoNeural node. For other models the only alternative that comes to mind is to use macro programming on base SAS to do multiple calls of a procedure with different parameters. I hope this helps, Miguel
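The "multiple calls with different parameters" idea is just a loop over a parameter grid. Here is a minimal sketch of that pattern, shown in Python for illustration rather than as SAS macro code. The function `toy_score` and the parameter names `depth` and `leaf_size` are hypothetical stand-ins for "run the procedure once and return a validation metric".

```python
from itertools import product

def grid_search(fit_and_score, grid):
    """Call fit_and_score once per parameter combination.

    Returns the (params, score) pair with the highest score.
    """
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = fit_and_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical stand-in for one procedure call that returns a metric
# (higher is better); a real run would train and validate a model here.
def toy_score(depth, leaf_size):
    return -((depth - 4) ** 2) - ((leaf_size - 10) ** 2)

best, score = grid_search(toy_score, {"depth": [2, 4, 6], "leaf_size": [5, 10, 20]})
print(best, score)
```

In a SAS macro the outer structure would be the same: nested %DO loops over the parameter values, one procedure call per iteration, and a data set that collects the fit statistics.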

M_Maldonado · ‎06-02-2015

Are you creating a project from scratch, or do you get this error when you copy a project to another location? I am not sure what is going on. If you are just creating a new project, what happens when you give it a different name?

M_Maldonado · ‎06-01-2015

In this article the authors use the Segment Profile node to interpret the segments that the SOM/K node outputs. I hope this helps, Miguel http://www.iasri.res.in/sscnars/data_mining/8-som%20with%20e-miner.pdf

M_Maldonado · ‎05-29-2015

There is a data mining approach for rare events that is often used to flag fraud. Try it without transforming or rejecting variables just yet. Cluster your data and, if you have a few flagged or confirmed fraud cases, train a predictive model for each cluster. You are hoping that your fraudsters have different patterns than the rest of your customers, so that you get a higher concentration of fraudsters in certain clusters. Make sure your clusters make sense and decide whether you need to standardize or tweak your clustering. For your 300 binary variables you do not need to standardize, but do standardize if you have other inputs on really different scales. This approach was presented at SAS Global Forum 2015. Take a look at the paper SAS® Does Data Science: How to Succeed in a Data Science Competition http://support.sas.com/resources/papers/proceedings15/SAS2520-2015.pdf Compare this approach to Reeza's and Xia's suggestions. Good luck, Miguel
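To check whether the clustering actually concentrates the flagged cases, you can compare the event rate in each cluster against the overall rate. This is a minimal sketch in Python, assuming you already have one cluster ID and one 0/1 fraud flag per record; the function name and the toy data are hypothetical.

```python
from collections import defaultdict

def event_rate_by_cluster(cluster_ids, flags):
    """Return {cluster_id: share of records flagged as events}."""
    counts = defaultdict(lambda: [0, 0])  # cluster -> [n_records, n_events]
    for cluster, flag in zip(cluster_ids, flags):
        counts[cluster][0] += 1
        counts[cluster][1] += int(flag)
    return {cluster: events / n for cluster, (n, events) in counts.items()}

# Toy data: six customers in two clusters, two confirmed fraud cases.
clusters = [0, 0, 0, 1, 1, 1]
fraud = [0, 0, 0, 1, 1, 0]

rates = event_rate_by_cluster(clusters, fraud)
overall = sum(fraud) / len(fraud)
for cluster, rate in sorted(rates.items()):
    print(cluster, round(rate, 3), "vs overall", round(overall, 3))
```

Clusters whose rate is well above the overall rate are the ones where a per-cluster predictive model is most likely to pay off.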

M_Maldonado · ‎05-28-2015

I haven't used the Interactive Binning node for a while. But to get 4 optimally binned groups, the Transform Variables node is the way to go. Add a Transform Variables node and specify Optimal Binning as the Default Method for your inputs. Run! (The number of groups is 4 by default.) A much better alternative is the Interactive Grouping node, if you have licensed Credit Scoring for SAS Enterprise Miner. Good luck, Miguel

M_Maldonado · ‎05-26-2015

Credit scorecards have been the standard model for credit scoring because they are easy to interpret and enable you to easily score new data, that is, calculate a credit score for new customers. This tip walks you through the basic steps to build a credit scorecard using Credit Scoring for SAS® Enterprise Miner™ and is the first in a series of tips that I will be posting on credit scoring.

Building a Scorecard

The nodes in the basic flow diagram to build a credit scorecard are: Input Data Source, Data Partition, Interactive Grouping, and Scorecard. For this example you can use the German Credit data set available in the Help menu of SAS Enterprise Miner. Click Help -> Generate Sample Data Source -> German Credit. This data set has a binary target good_bad that indicates whether a customer defaulted on his monthly payments (designated with the value 'BAD'), as well as several other variables related to demographics and credit bureau data that serve as inputs, or characteristics.

Interactive Grouping Node

In a nutshell, the Interactive Grouping node is a very flexible tool for binning or grouping your variables. This node:

- bins your input variables using options you can easily tweak
- calculates the weight of evidence of the bins for each input variable
- calculates Gini and Information Value, and rejects input variables with a low value of these statistics

The procedures running behind the scenes find the optimal binning of the inputs with respect to the target, subject to certain constraints that you can easily customize. Make sure you use the interactive application of the node to visually confirm that the event counts and weight of evidence trend make sense for your binning. If necessary, you can merge bins, create new groups, or manually adjust the weight of evidence.

Manually adjusting the Weight of Evidence

For certain input variables you might need to manually adjust the weight of evidence (WOE). For example, the variable employed summarizes the number of years that a credit applicant has been employed at his current job. In general, years at current job tends to be inversely related to credit default. The fact that the weight of evidence does not decrease monotonically for groups 1 through 5 on this data set can be due to a number of reasons. For example, this data set might be sample-biased because many applications with employed<2 were hand selected or "cherry-picked", and their good behavior is reflected in a low event count and low weight of evidence.

To prevent this sample bias from affecting your scorecard you can use the Manual WOE column on the Coarse Detail view of the Groupings tab in the interactive application. Change the WOE from 0.1283 to 0.7 for group 1 and from -0.13131 to -0.5 for group 2. Notice that the new weight of evidence is plotted as New WOE and the information value is re-calculated as New Information Value.

Scorecard Node

Once you are satisfied with the bins or groups you found with the Interactive Grouping node, run the Scorecard node to fit a logistic regression using your grouped inputs. The node then applies a linear transformation to the predicted log odds to produce scorepoints for each input group, or attribute, which are much easier to interpret. By default, a difference of 20 scorepoints corresponds to a doubling of the odds. The event you are modeling is payment default, which means that, for example, an application scored at 130 points has double the odds of defaulting compared to an application with a score of 150. In the results, there are several useful plots and tables including the scorecard, the score distribution, the KS plot, the trade-off plot, and many others.

Output variables and Adverse Characteristics

Notice from the exported data sets that the Scorecard node creates several variables. The variables with prefix SCR_ are the scorecard points for each variable in the scorecard, and SCORECARD_POINTS is the total points for each application. When you specify the Scorecard property Generate Report=Yes to output the Adverse Characteristics, your results will also include the variables that decreased the scorepoints the most for each observation. You can select up to 5 adverse characteristics. As an example of how to interpret these columns, for the first observation in the exported data set, 14 scorepoints were deducted because the purpose of the loan was labeled either 1, 3, 8, missing, or unknown.

Recommended reading

- SAS Enterprise Miner Reference Help: SAS Credit Scoring
- Siddiqi, Naeem. Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. Cary, NC: SAS Press, 2005.
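For readers who want to see the arithmetic behind the statistics mentioned in this tip, here is a minimal sketch in Python. It is not what the nodes run internally. It assumes the common convention WOE = ln(share of non-events in the bin / share of events in the bin) and the usual "points to double the odds" scaling. The 20 points come from the text above, while the reference score of 200 at reference odds of 50 and the toy counts are illustrative assumptions.

```python
import math

def woe_and_iv(bins):
    """bins: list of (non_event_count, event_count), one pair per group.

    Returns (list of WOE values, information value).
    Every count must be greater than zero, or the logarithm is undefined.
    """
    total_non = sum(non for non, _ in bins)
    total_evt = sum(evt for _, evt in bins)
    woes, iv = [], 0.0
    for non, evt in bins:
        p_non = non / total_non   # share of all non-events in this bin
        p_evt = evt / total_evt   # share of all events in this bin
        woe = math.log(p_non / p_evt)
        woes.append(woe)
        iv += (p_non - p_evt) * woe
    return woes, iv

def scorepoints(odds, pdo=20.0, ref_score=200.0, ref_odds=50.0):
    """Map non-event:event odds to scorepoints with a log-linear scaling.

    pdo is the number of points that doubles the odds; ref_score and
    ref_odds are assumed anchor values chosen for illustration.
    """
    factor = pdo / math.log(2)
    offset = ref_score - factor * math.log(ref_odds)
    return offset + factor * math.log(odds)

woes, iv = woe_and_iv([(80, 20), (20, 80)])
print([round(w, 4) for w in woes], round(iv, 4))
print(round(scorepoints(100) - scorepoints(50), 6))
```

With this scaling, an application that scores 20 points lower has half the non-event:event odds, which is the same as double the odds of default.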

M_Maldonado · ‎05-20-2015

Logan, Enterprise Guide supports importing Access databases. Have you tried File->Import Data? Also try posting your question on the Data Management and Base SAS procedures communities. Good luck! -M

M_Maldonado · ‎05-18-2015

Hey MB, I don't think there is a way to transpose your data automatically in Enterprise Miner. Same as Jaap, I would prefer to do it in base SAS or Enterprise Guide. How complex is the SAS transpose program that we need to write? Do you have a finite number of policy_type values (house, vehicle, life), or does your SAS program have to detect them? Are your records in order (by customer id)? To get the transposed data set you want I would use a SAS data step with arrays. This paper will get you started: "Sharpening your skills in reshaping data: proc transpose vs array processing". If you run into trouble, the guys on this community are the bomb: SAS support community: SAS Macro, Data Step, and SAS Language Elements. This book is a fantastic read to start writing complex, efficient SAS programs in no time: Carpenter's Complete Guide to the SAS Macro Language. Good luck! -Miguel
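To make the target layout concrete, here is a minimal sketch of the long-to-wide reshaping, written in Python rather than as a SAS data step. It assumes the list of policy types is known up front and that each record is a (customer id, policy type, amount) triple; the field names and toy values are hypothetical.

```python
def pivot_policies(records, policy_types):
    """Turn (customer_id, policy_type, amount) rows into one row per customer.

    Returns {customer_id: {policy_type: total amount}}, with 0 for
    policy types the customer does not hold.
    """
    wide = {}
    for customer_id, policy_type, amount in records:
        if policy_type not in policy_types:
            raise ValueError(f"unexpected policy_type: {policy_type!r}")
        row = wide.setdefault(customer_id, {p: 0 for p in policy_types})
        row[policy_type] += amount
    return wide

# Toy input: records do not need to be sorted by customer id here.
records = [
    (2, "vehicle", 30),
    (1, "house", 100),
    (1, "life", 50),
]
wide = pivot_policies(records, ["house", "vehicle", "life"])
for customer_id in sorted(wide):
    print(customer_id, wide[customer_id])
```

The dictionary plays the role that the array plays in the data step version: one slot per policy type, filled as the records for a customer are read. Unlike BY-group processing in a data step, this version does not rely on the input order.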

M_Maldonado · ‎05-15-2015

Aditya, Take a look at this article by a professor at UT Dallas. Definitely worth reading. http://www.utdallas.edu/~nkumar/FactorExample.PDF

M_Maldonado · ‎05-14-2015

When I want to convert my Excel reports to a SAS proc I use proc report. It is kind of old school and has a bit of a steep learning curve, but I have found it really useful. Proc report with a custom format (through proc format) can do 90% of the things you have on that report. For the confidence interval you might have to use a stats proc like univariate or freq. Google proc report examples and see if that suits your report needs. Here is a paper that seems like a good start: Proc Report beyond the Basics. Good luck!

M_Maldonado · ‎05-14-2015

Several procs will give you results similar to that. I would start with proc freq or proc univariate. Personally I like univariate because you can output the results and then print them in a convenient way. Curious, did you design that output? Are you converting it from some report, or what are you up to? Thanks, M

M_Maldonado · ‎05-13-2015

Hey Liban, More info please. When you say you want to show the trend, do you mean you want to model the trend or explain the trend? Or is this just a visualization exercise? What are you trying to do? Thanks, Miguel

M_Maldonado · ‎05-11-2015

Hi Aditya, What are you using to calculate KMO, proc factor? I think PCA is the most common factor analysis for data miners, but you might be trying to do something beyond variable reduction using KMO. If you really want to do exploratory factor analysis using proc factor or something similar you might get better input from SAS statistical procedures community or SAS procedures support community. Still, share a code example of what you are using right now and we will give you suggestions on how to iterate through your data. Just curious, dichotomized variables are nominal variables with 2 levels, right? Is there any reason I cannot treat them as binary variables, or am I completely lost here? Thanks, Miguel

M_Maldonado · ‎05-07-2015

Thanga, I don't think UTILLOC would help sort your data faster. The THREADS option is probably already enabled on your system. See: SAS(R) 9.2 THREADS

M_Maldonado · ‎05-05-2015

Hey Hank, There are a lot of videos and papers that will help you climb the learning curve really fast. Please describe your data and what you are trying to visualize and people will have more specific suggestions. Right now I am not completely sure if you want to use EM to visualize your clusters or if you want to do link analysis. Below is how you do link analysis for the first time, plus an alternative that I really like for visualizing networks.

If this is your very first EM project, spend some minutes watching Getting Started with SAS Enterprise Miner: Setting Up an Enterprise Miner Project. You can skip the part where they create a Data Source because you can use the File Import node to import your csv. I personally prefer creating data sources the way they do it in the video, but the File Import node will get you started faster.

Steps for Link Analysis

1. Add a File Import node to your diagram (find it on the "Sample" tab). Then click the ellipsis for Import File to select your csv file. Right after that, click the ellipsis for Variables to make sure all the roles and levels are set the way you need them.

2. Connect a Link Analysis node to your File Import node. If you were analyzing Facebook posts the way Falko does in his post, I would use only two nominal inputs: the ID of the conversation and the name of the participant. You can use the Variables ellipsis to specify this.

Alternative: Visualize Networks with the Association node

I like the Association node to visualize networks because the results are more flexible and the rules are easy to interpret in terms of stats like support and frequency. To do this you will need to set the role of your data set to Transaction (instead of Raw or Train) when you import it, or change it in your diagram directly if you have your data as a Data Source. Then run the Association node with an ID variable and an input for the name of the participant.

For example, here I analyzed an online community similar to SAS online communities. The name of the post was an ID variable and the name of the participant a nominal input. I had to manually re-arrange the network a little bit to clearly show that this network had 4 sub-networks, one of them really active and collaborative, as well as a very influential participant that connected all four. I hope it helps, Miguel
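If it helps to see what the transaction layout is feeding into, here is a minimal sketch in Python of how participant links and their support can be derived from (post ID, participant) pairs. It is not what the Association node runs internally. The function name and toy data are hypothetical, and support is taken here to mean the share of posts in which both participants appear.

```python
from collections import Counter
from itertools import combinations

def participant_links(transactions):
    """transactions: iterable of (post_id, participant) pairs.

    Returns {(participant_a, participant_b): support}, where support is
    the fraction of posts that contain both participants.
    """
    by_post = {}
    for post_id, participant in transactions:
        by_post.setdefault(post_id, set()).add(participant)

    pair_counts = Counter()
    for members in by_post.values():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(members), 2):
            pair_counts[(a, b)] += 1

    n_posts = len(by_post)
    return {pair: count / n_posts for pair, count in pair_counts.items()}

# Toy data: three posts, three participants.
rows = [
    ("post1", "ana"), ("post1", "ben"),
    ("post2", "ana"), ("post2", "ben"), ("post2", "cy"),
    ("post3", "cy"),
]
links = participant_links(rows)
for pair, support in sorted(links.items()):
    print(pair, round(support, 3))
```

Each pair with nonzero support becomes an edge in the network plot, and a participant who shows up in many pairs is the kind of influential connector described above.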


Re: Unbalanced data - miner

Re: SAS EM only: How to use parameter estimates in the next node?

Re: StatExplore Node

Re: How many leaves and nodes should a tree

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

Re: Export scoring code for Cross Validation in SAS Enterprise Miner

Re: run time error ensemble model

Re: run time error ensemble model

Re: help with hash table

Re: help with hash table

Re: StatExplore Node

Re: How to access Variable importance in neural network in EM?

Re: Grouping variables to create new variables SAS Enterprise Miner

Re: Error when running market basket node in SAS EM

Re: Seed Initialization Method for Hierarchical Clustering

Re: How can I use the Tobit's Model in SAS?

Re: Using cross-validation in Enterprise Miner;

Re: How come no Segment Profile after I set "Cluster Variable Role" = ...

Re: Confusion matrix in Enterprise Miner

Re: How can we export dataset from enterprise Miner as a csv file or t...

Credit Scoring by Example in SAS® Enterprise Miner™

Tip: How to model a rare target using an oversample approach in SAS® ...

Tip: How to interpret your SAS® Rapid Predictive Modeler results

Tip: Use the Cutoff Node in SAS® Enterprise Miner™ to Consume the Post...

Tip: How to build a scorecard using Credit Scoring for SAS® Enterprise...

Re: Grid search to optimize parameters?

Re: text mining

Re: Interpreting the results of SOM/Kohonen nodes

Re: How can i treat 300 binary product variables in classification cas...

Re: How to create force bumber of groups to 4 in EM Interactive Binnin...

Tip: How to build a scorecard using Credit Scoring for SAS® Enterprise...

Re: Data import from Access 2013 to SAS 9.3

Re: how to pivot data in enterprise miner?

Re: Questions on exploratory factor analysis..

Re: Dynamic Summary Table

Re: Dynamic Summary Table

Re: Numerical Trends

Re: Questions on exploratory factor analysis..

Re: UTILLOC option in Enterprise Miner

Re: How to do Link Network Node in EM?