About Rick_SAS

Rick_SAS · ‎05-31-2025

Yes, happy to. Please see Attrs, attrs, everywhere: The interaction between ATTRPRIORITY, CYCLEATTRS, and STYLEATTRS in ODS graphics - The DO Loop and the links therein.

Rick_SAS · ‎05-30-2025

No worries. Glad your problem is solved!

Rick_SAS · ‎05-30-2025

> I am not getting different symbols for groups.

First, you might want to read an article that describes the interaction between ATTRPRIORITY, CYCLEATTRS, and STYLEATTRS in ODS graphics. It might be that what you are seeing (or what you want to see) requires that you understand how these options interact. It sounds like you might want to use ATTRPRIORITY=NONE, which enables the symbol to automatically change for different groups.

Since we don't have your data, let's use data that we all have. The following creates some data for this example that (hopefully!) is similar to yours:

   data Out;
      set sashelp.stocks(keep=Stock Date Open where=(Date<'01JAN1991'd));
      rename date=Time Open=mean;
   run;

   proc sort data=Out;
      by Time Stock;
   run;

   /* write spline predictor to OUT2 data set in the PRED variable */
   proc glmselect data=Out;
      effect spl = spline(Time / basis=bspline knotmethod=equal(9));
      class Stock;
      model mean = spl | stock / selection=none;
      output out=out2 pred=pred;
   run;

Now let's graph the data. The following sets ATTRPRIORITY=NONE and lets the symbols, colors, and line styles change according to the ODS style.

   ods graphics / AttrPriority=NONE;
   proc sgplot data=Out2;
      scatter y=mean x=time / group=Stock;
      series y=pred x=time / group=Stock;
   run;

The output is below. Note that the colors, symbols, and line patterns are different for all groups.

If you want to override the default colors or symbols or patterns, you can use the STYLEATTRS statement, as Paige suggested. You can override one, two, or all three attributes. In the following, I set all line patterns to SOLID, but specify the colors for the markers and lines and the symbols for the markers:

   ods graphics / AttrPriority=NONE;
   proc sgplot data=Out2;
      styleattrs datalinepatterns=(solid)
                 datacontrastcolors=(SteelBlue DarkGreen DarkRed)
                 datasymbols=(CircleFilled TriangleFilled X);
      scatter y=mean x=time / group=Stock markerattrs=( size=6 );
      series y=pred x=time / group=Stock;
   run;

Please read the article and study the examples. Hope this helps!

Rick_SAS · ‎05-28-2025

The vector for the initial guess is called PD_init in the program, but you are passing initPD (which is undefined) as an argument to CALL NLPQN. That information (which matrix was not set to a value) is part of the log, so please copy/paste the entire error message in future posts.
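As a minimal sketch of the fix, the following SAS/IML program defines the initial guess and passes the same name to CALL NLPQN. The objective function ObjFun and its quadratic form are hypothetical stand-ins, since the original program is not shown; only the name PD_init comes from the post.

```sas
proc iml;
/* hypothetical objective function: a quadratic with its minimum at (1, 2) */
start ObjFun(p);
   return( (p[1]-1)##2 + (p[2]-2)##2 );
finish;

PD_init = {0 0};   /* initial guess, assigned a value BEFORE the call */
opt = {0 0};       /* opt[1]=0 requests minimization; opt[2]=0 suppresses printed output */

/* pass PD_init, the matrix that was defined, not an undefined name such as initPD */
call nlpqn(rc, result, "ObjFun", PD_init, opt);
print rc result;
quit;
```

If the fourth argument names a matrix that was never assigned, the call fails and the log identifies that matrix.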

Rick_SAS · ‎05-20-2025

I think the task you propose is hard and it's not clear what information it would provide. Again, what do you intend to do with these thousands of numbers? What is the scientific result you want to find? The reason I ask is that there might be a better way to conduct your analysis. For example, if your goal is to reduce the number of variables in a model by eliminating highly correlated variables, there are better ways to implement that task.

Another question: You say, "I have continuous, ordinal, binary and nominal variables." The statistic you need to use depends on the type of the variable. Is there some naming convention that enables you to determine the type of the variable? For example, are the variables named CONT1-CONT20, ORD1-ORD25, and NOM1-NOM10?

In general, you would use a different measure of association for the various combinations of variables:

Cont-Cont : usually Pearson correlation
Cont-Ord : Recode Ord as 1,2,...,k and use Kendall tau-b
Cont-Nom : This is tough. You have to recode the levels and use something called the point-biserial correlation. I've never done this.
Ord-Ord : Kendall tau-b
Ord-Nom : rank-biserial correlation. I've never done this.
Nom-Nom : PROC FREQ provides several statistics of association

I do not think many people try to compute these statistics all at once because you cannot easily compare one method to another. For example, a Kendall tau-b score of 0.5 does not equate to a Pearson correlation of 0.5. I would not know how to compare the various statistics to each other.
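To make a few of these cases concrete, here is a rough sketch that uses the Sashelp.Cars data. That data set and the roles I assign to its variables are my assumptions for illustration: MPG_City and Weight are treated as continuous, Cylinders as ordinal, and Origin and Type as nominal.

```sas
/* Cont-Cont (Pearson) and Cont-Ord or Ord-Ord (Kendall tau-b) */
proc corr data=sashelp.cars pearson kendall;
   var MPG_City Weight Cylinders;
run;

/* Nom-Nom: the CHISQ option reports chi-square tests and
   measures such as the phi coefficient and Cramer's V */
proc freq data=sashelp.cars;
   tables Origin*Type / chisq;
run;
```

Notice that the two procedures report statistics on different scales, which is why the results are hard to compare across variable types.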

Rick_SAS · ‎05-20-2025

> I am performing simple Pearson, Spearman and Kendall correlations between many variables using PROC CORR.

How many variables do you have? For N variables, there are N*(N-1)/2 pairwise correlations, most of which will be significant. For example, if you have N=100 variables, it is likely you will encounter as many as ~5000 significant correlations. On the other hand, if you have N=10 or N=15, then you can visualize the correlations. So, how many variables and what do you want to do with the information about correlations?
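For a small number of variables, one possible way to visualize the correlations is a scatter plot matrix from PROC CORR. The sketch below assumes the Sashelp.Iris data, which has N=4 numeric variables and therefore 4*3/2 = 6 pairwise correlations; substitute your own data set and variables.

```sas
ods graphics on;
/* compute all three correlation types and display a scatter plot matrix */
proc corr data=sashelp.iris pearson spearman kendall
          plots=matrix(histogram);
   var SepalLength SepalWidth PetalLength PetalWidth;
run;
```

By default the matrix plot displays only a handful of variables, so this approach is practical only when N is small.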

Rick_SAS · ‎05-06-2025

@rbettinger wrote: Thank you for replying. In the interests of using my time well, I am going to avoid trying to solve this problem by rewriting the code that produces it. I have read your posting several times. I do not understand what you think the problem is. Please give ONE example that shows the problem and tell us what you think the correct answer should be.

Rick_SAS · ‎05-05-2025

I looked closely at your program, which is very well written and documented. I tried a few alternatives, but was unable to beat "Method 2", which is both easy to understand and easy to implement. In SAS 9.4, I think that is as fast as I can get the computation.

Rick_SAS · ‎05-05-2025

Can you explain what you think is wrong or problematic about the following code that you posted? The computations look okay to me, so I guess I do not understand your concerns:

   proc iml;
   a = 8.372E-26;
   b = 4.63E-103;
   p = .01;
   c = ( a ## p + b ## p ) ## (1/p);
   print c;
   quit;

Rick_SAS · ‎05-05-2025

> am stymied by my inability to make the expression ndx = loc( vector= max( vector )) return only one value instead of > 1 . When, for example, vector = {8.372E-26 4.63E-103}, the variable ndx will contain {0 0} because the two values in vector are smaller than constant('maceps'), which is 2.2e-16.

Can you explain your first sentence? It cannot be correct because LOC returns either an empty matrix or a set of positive integers. It will never return a zero. Here's what I see, which looks correct:

   proc iml;
   vector = {8.372E-26 4.63E-103};
   ndx = loc( vector= max( vector ));
   print ndx;   /* ndx 1 */
   quit;

I will think about the second half of your question.

Rick_SAS · ‎04-23-2025

If I understand your question, the answer is that you can always rescale, but rescaling a variable does NOT change its significance (as measured by p-values) in the model. If your original model is
   Y = X1 X2;
and then you define Z1=W1*X1 and Z2=W2*X2 for any nonzero values W1 and W2, the new model
   Y = Z1 Z2;
will have different regression coefficient estimates, but the tests for significance (the p-values) will be the same. This is easily seen if you use standardized estimates. See https://blogs.sas.com/content/iml/2018/08/22/standardized-regression-coefficients.html

For example:

   data class;
      set sashelp.class;
      X1 = Height;
      X2 = Weight;
      Z1 = 0.0254*X1;       /* measure height in meters */
      Z2 = 0.45359237*X2;   /* measure weight in kilos */
   run;

   title "Original Model: Inches and Pounds";
   proc logistic data=class;
      model Sex = X1 X2;
      ods select ParameterEstimates;
   run;

   title "Rescaled Model: Meters and Kilos";
   proc logistic data=class;
      model Sex = Z1 Z2;
      ods select ParameterEstimates;
   run;

Rick_SAS · ‎04-02-2025

Right. You can see this effect even for two classes, and even for sampling with replacement. Flip a fair coin 4 times and look at the ratios of heads to tails.

The probability of 0:4 is 2 / 16 because HHHH and TTTT are possible.
The probability of 1:3 is very high at 8 / 16. It includes possibilities such as HTTT and THTT.
The probability of 2:2 is "only" 6 / 16. It includes possibilities such as HTTH and THHT.

So, even in that simple probability space, the probability of an unequal ratio is higher than you might initially think.
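If you want to check those three probabilities, here is a small SAS/IML sketch that uses the binomial PDF for 4 flips of a fair coin. The variable names are only for illustration.

```sas
proc iml;
k = 0:4;                          /* possible number of heads in 4 flips */
p = pdf("Binomial", k, 0.5, 4);   /* P(k heads) for a fair coin */
ratio04 = p[1] + p[5];            /* all tails or all heads: 2/16 */
ratio13 = p[2] + p[4];            /* one head or one tail:   8/16 */
ratio22 = p[3];                   /* two heads and two tails: 6/16 */
print ratio04 ratio13 ratio22;    /* 0.125  0.5  0.375 */
quit;
```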

Rick_SAS · ‎04-01-2025

The key is that you are pulling WITHOUT REPLACEMENT, so the conditional probabilities change as you pull. The expected ratio is 4:4:4 only if you pull with replacement. You can do a simpler analysis of 9 balls (3R, 3B, 3Y) and pulling 3 balls to see that the ratios are not even. If I did the analysis correctly, in that situation there is about a 64% chance that the ratio is 2:1:0, about a 32% chance that the ratio is 1:1:1, and about a 4% chance that all three balls are the same color.

I think you can perform the simulation more concisely if you use the SAMPLE function to pull the samples and use the TABULATE subroutine to analyze the ratio:

   proc iml;
   call randseed(123);
   /* simulate 1000000 times */
   n = 1000000;
   /* create binary variable equal to 1 if the ratio is 3:4:5 */
   is_345 = j(n,1,0);
   set = repeat('R',8) // repeat('G',8) // repeat('Y',8);
   draws = sample(set, 12 // n, "NoReplace");   /* draw n samples of size 12 */
   do i = 1 to n;
      call tabulate(labels, freq, draws[i,]);
      if ncol(freq)=3 then
         is_345[i] = all(freq={3 4 5}) | all(freq={3 5 4}) |
                     all(freq={4 3 5}) | all(freq={5 3 4}) |
                     all(freq={4 5 3}) | all(freq={5 4 3});
   end;
   /* estimate proportion of draws that result in 3:4:5 */
   prop_345 = mean(is_345);
   print prop_345;
   quit;

In this simulation, about 0.4867 of the draws have ratio 3:4:5. Since this is less than 50%, the magician must have been using "magic" (that is, cheating) to win money.
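As a cross-check on the simulation, you can compute the exact probability by counting. The sketch below assumes the same setup as the simulation (24 balls, 8 of each color, draw 12 without replacement) and uses the COMB function; the variable names are mine.

```sas
proc iml;
/* choose 3 of one color, 4 of another, and 5 of the third;
   there are 6 ways to assign the counts {3 4 5} to the three colors */
nWays = 6 * comb(8,3) * comb(8,4) * comb(8,5);
total = comb(24,12);         /* number of ways to draw 12 of the 24 balls */
exact_345 = nWays / total;   /* 1317120 / 2704156 */
print exact_345;             /* approximately 0.4871 */
quit;
```

The exact value agrees with the simulation estimate to about three decimal places.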

Rick_SAS · ‎03-24-2025

Perhaps it would be helpful to understand how Python performs the scaling?

Rick_SAS · ‎03-22-2025

Hi @Werner_69, I used ODA, pasted your DATA step example, and copied/pasted the data until the DATALINES block was very large. By using the steps you described, I was able to successfully print the Code window. I generated examples that printed up to 100 pages. All were successful. I was using the Chrome browser, not Firefox. So, unfortunately, I am unable to reproduce your problem. I have no ideas other than the previous suggestion to try another browser to determine whether this is a browser-dependent issue.
