Hi Miguel! Thank you for continuing the conversation. I created this test data set to compare different implementations of "optimal binning," and I designed it specifically to make it tricky to find the best four bins. In practice, if you're looking in detail at only a few predictors, you wouldn't restrict yourself so severely, and if there were six perfect bins, as in this case, you'd find them. But when you're dealing with hundreds or thousands of predictors, as you might in a typical data mining exercise, it's not at all uncommon to settle for something like the best four or five bins, and you'd like to feel confident that your algorithm can locate them.

Right now I feel a bit queasy about Enterprise Miner. If I allowed the Transform Variables node to search for 25 bins with my example data set, it found the six perfect ones. But if I limited it to 17 bins, it found only four of the uniform bins and split the two smallest ones across three pieces, one of which had only 92 points, less than 1% of the data. A bin that small could be significant, or it could be a blip. In my opinion, that's another weakness of the Transform Variables node: you can't specify a minimum bin size.

The R package smbinning has the opposite constraint: you can specify a minimum bin size but not a maximum number of bins. Incidentally, the smbinning function has a default minimum bin size of 5%, which is why it finds five bins for this data set with its default setting. If you submit

bt3_xg.t3 <- smbinning(bt3_xg.data, y="bt3_xg", x="ipxg", p=0.04)

it will find the six uniform bins (because the two smallest bins each hold 4% of the data).

About three years ago Ivan Oliveira of the Enterprise Miner development team described to me some improvements in optimal binning that were being added to EM 7.1. It strikes me now that he was talking only about the Interactive Grouping node in the Credit Scoring application, but at the time I thought he was referring to the Transform Variables node. I wish SAS would bring the optimal binning in the Transform Variables node up to speed with the Interactive Grouping node. Okay, I promise to stop ranting now (or at least in the not too distant future).

Chi-square is a useful measure of dependence/association that goes back to Karl Pearson and the early days of statistics in the late nineteenth and early twentieth centuries. One of its best features is that its distribution is well understood, so you can generate p-values and perform significance tests on your results. I don't know whether anyone has ever bothered to work that out for the distributions of Gini or entropy values. The original automated decision tree algorithms eventually coalesced (about forty years ago?) into CHAID, which uses chi-square as the measure of association in its splitting rule.

One useful property of chi-square is the way it scales fractally: if you break a big group into k identical subgroups, the sum of the subgroup chi-squares equals the big group's chi-square. That's actually a bit of a drawback for optimal binning, though, because if a bin can split into two identical sub-bins, you'd rather just keep the original large bin, and chi-square gives you no incentive to do so. Most of the association measures I've seen (Gini, entropy, information value, weight of evidence, within-group sum of squares, ...) have the property that they improve with increased granularity, so if you removed all restrictions on the number or size of bins, they'd give you as many bins as data points; even with that drawback, chi-square will at least favor some lumpiness over complete granularity. And if you remove the chi-square denominators, so that you take the sum of squared differences between the actual and expected number of hits in each group, you naturally seek out larger bins.

Thanks!
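P.S. Here's a quick numeric sketch of that last point in R, using made-up counts (one bin of 1,000 points containing 300 hits, an overall hit rate of 0.2, split into k = 4 identical sub-bins); the helper functions are just mine for illustration, not anything from smbinning:

chisq_bin <- function(n, h, p) {
  # chi-square contribution of one bin: n points, h hits, overall hit rate p
  exp_hit  <- n * p
  exp_miss <- n * (1 - p)
  (h - exp_hit)^2 / exp_hit + ((n - h) - exp_miss)^2 / exp_miss
}

n <- 1000; h <- 300; p <- 0.2; k <- 4

whole <- chisq_bin(n, h, p)                # the big bin
parts <- k * chisq_bin(n / k, h / k, p)    # sum over k identical sub-bins
c(whole = whole, parts = parts)            # equal, so splitting earns nothing
pchisq(whole, df = 1, lower.tail = FALSE)  # and the known distribution gives a p-value

# Drop the denominators (plain squared differences from expected) and the
# identical split is now strictly worse, so larger bins win out:
ssq_bin <- function(n, h, p) (h - n * p)^2
c(whole = ssq_bin(n, h, p), parts = k * ssq_bin(n / k, h / k, p))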