About JBerry

JBerry · ‎07-15-2016

I answered my own question. A different part of the HELP section has this description:

"In linear models, statisticians routinely use the mean squared error (MSE) as the main measure of fit. The MSE is the sum of squared errors (SSE) divided by the degrees of freedom for error. (DFE is the number of cases less the number of weights in the model.) This process yields an unbiased estimate of the population noise variance under the usual assumptions. For neural networks and decision trees, there is no known unbiased estimator. Furthermore, the DFE is often negative for neural networks. There exist approximations for the effective degrees of freedom, but these are often prohibitively expensive and are based on assumptions that might not hold. Hence, the MSE is not nearly as useful for neural networks as it is for linear models. One common solution is to divide the SSE by the number of cases N, not the DFE. This quantity, SSE/N, is referred to as the average squared error (ASE)."
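For reference, the two statistics described in the HELP text can be written side by side. This is only a sketch in my own notation, not SAS's: N is the number of cases, p is the number of weights in the model, y_i is the target value and \hat{y}_i is the model's prediction for case i.

```latex
\mathrm{SSE} = \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2, \qquad
\mathrm{MSE} = \frac{\mathrm{SSE}}{\mathrm{DFE}} = \frac{\mathrm{SSE}}{N - p}, \qquad
\mathrm{ASE} = \frac{\mathrm{SSE}}{N}
```

The two differ only in the denominator, so MSE = ASE * N / (N - p). They are nearly identical when N is much larger than p, and the MSE stops being meaningful once p reaches or exceeds N, which is the negative-DFE case the HELP text mentions for neural networks.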

JBerry · ‎07-15-2016

In E-Miner, I see 2 selections in the Model Comparison node:
- Average Squared Error
- Mean Squared Error

What is the difference? Searching the HELP does not yield any decent description:

----->8 snip from HELP file ------------------------------
The Selection Statistic choices are as follows:
Default: The default selection uses different statistics based on the type of target variable and whether a profit/loss matrix has been defined. If a profit/loss matrix is defined for a categorical target, the average profit or average loss is used. If no profit/loss matrix is defined for a categorical target, the misclassification rate is used. If the target variable is interval, the average squared error is used.
Akaike's Information Criterion: chooses the model with the smallest Akaike's Information Criterion value.
Average Squared Error: chooses the model with the smallest average squared error value.
Mean Squared Error: chooses the model with the smallest mean squared error value.
----->8 snip from HELP file ------------------------------

JBerry · ‎07-07-2016

If you can safely convert the codes to a number and use a numeric check, look into removing the letters with COMPRESS. The "kd" modifiers keep only the digits. To also make the result a number, nest it inside INPUT like this:

Num_DX=input(compress(DX,,"kd"),best.);

Here's my untested attempt:

data want;
  set have;
  /* the array needs 15 elements to match dx1-dx15 */
  array diagn {15} dx1-dx15;
  /* set the flag once, before the loop, so a later code cannot reset it */
  HYPER='0';
  do i = 1 to 15;
    num_dx = input(compress(diagn{i},,"kd"),best.);
    if num_dx >= 40100 and num_dx < 40590 then HYPER='1';
  end;
run;

JBerry · ‎07-07-2016

I have no idea what you're trying to do, but you're referencing variables that don't exist. In AB=AB||B, the program doesn't know what AB is. Since you set B to '1' but never told it what AB was supposed to be, it automatically converts your number into a numeric 1, which results in a NOTE: in your log. Next, you do something similar with B||BA, except B actually exists (but as a string), so it concatenates as '1 .'

So, check your log. The answer is often there. Since I don't know what you're doing, I can't help very much from here. But my big tips are:
(1) use length and input statements to correctly define the variable you're adding to, to avoid surprises
(2) check your log often, even when you think things went correctly

JBerry · ‎07-07-2016

I love SAS for its arrays. I use them often to make imputations like this (change missing to the value 75):

data want;
  set have;
  array change [*] x1-x999;
  /* explicit arrays need an index; DO OVER only works with implicit arrays */
  do i = 1 to dim(change);
    if change[i]=. then change[i]=75;
  end;
  drop i;
run;

But what if, instead of changing to 75, I wanted to impute the minimum value of x? Thinking about this hurts my brain because I know the array is moving "sideways" and I'm looking for a whole-dataset aggregation to obtain the minimum. I'm sure I could hack something together, but I'm really worried about efficiency due to my dataset size.

JBerry · ‎07-07-2016

I think this is a little too big for a forum post. Also you posted it twice in 2 different forums.

JBerry · ‎07-07-2016

Impute only refers to missing values, so if nothing is missing then nothing needs to be imputed. If the variable is highly skewed, you might address that with the Variable Transformation node, which lets you try different things to create new, "transformed" versions of the variable. Hope that helps.

JBerry · ‎07-07-2016

Sometimes you are dealing with two different sets of underlying drivers, so something that might work is to see if you can identify those who spend less than $50 using a binary regression model first. If you can predict those (meaning you are getting a strong model), simply run the binary model first, and then run a separate GLM for each. I'll bet you can see which variables are different by comparing the GLM results.

JBerry · ‎07-07-2016

Adding comment to close thread

JBerry · ‎05-25-2016

I've read the papers but have been a little stuck on some minor syntax issues. I'm trying to figure out how to take an existing data step and thread it using DS2.

First, some fake data:

data have;
input IdNumber Name $ 6-20 Team $ 22-27 StartWeight EndWeight;
datalines;
1023 David Shaw      red    189 165
1049 Amelia Serrano  yellow 145 124
1219 Alan Nance      red    210 192
1246 Ravi Sinha      yellow 194 177
1078 Ashley McKnight red    127 118
1221 Jim Brown       yellow 220   .
;
run;

Here is the data step code I'm trying to migrate into a threaded process using DS2:

data want;
set have;
if Name = 'David' then David_Flag=1;
else David_Flag=0;
WeightDiff=EndWeight-StartWeight;
run;

JBerry · ‎03-15-2016

Thanks, but this was just a simplified made-up example to illustrate my problem. In reality, what I'm doing involves many macros passing variables back and forth and doing a variety of data tasks, so it's quite complicated to post here.

JBerry · ‎03-14-2016

I am writing a macro with a rather complicated IF statement. However, I noticed that when I put comments inside, I get this annoying error:

ERROR: There is no matching %IF statement for the %ELSE.

I really don't have the option to *not* write comments. Is there some other way I can do this?

%global testvar;
%global check;
%let testvar=5;

%MACRO printme(check);
%IF &testvar=1 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %IF &testvar=2 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %IF &testvar=3 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %IF &testvar=4 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %IF &testvar=5 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %IF &testvar=6 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %IF &testvar=7 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %IF &testvar=8 %THEN %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
* comment ;
%ELSE %DO;
  %let check = 1;
  %put testvar is &testvar and check was &check;
%END;
%MEND printme;

%printme;

JBerry · ‎03-04-2016

THANKS! Mprint showed the issue - it was a silly extraneous character being inserted into the value for xvar from a prior process. Your tip saved me my sanity. Thanks!

JBerry · ‎03-04-2016

Yeah, if I run the code alone, it prints right to the HTML output tab in E.G. I also searched for ods statements in the program and there are none. I attempted to wrap the macro in ods HTML file= and ods close; but still nothing. I'll turn on MPRINT and see if there are any clues...

JBerry · ‎03-04-2016

I have a rather complicated program that iterates through a list of variables and does certain things. One of the things I want it to do is create a scatterplot, so that when I'm all done I can open up my HTML results and see a whole list of the scatterplots it created. For some reason, my code runs fine but the scatterplots never appear. In fact, I get no HTML results whatsoever (HTML is the default in my case).

options source NOnotes NOserror;

%macro plotting(datasetname,xvar,yvar);
  proc sgscatter data=&datasetname;
    title "&xvar versus &yvar";
    plot (&xvar)*(&yvar) / pbspline;
  run;
%mend plotting;

%macro main;
  * do stuff;
  %plotting(&datasetname,&xvar,&yvar);
%mend main;

Date Last Visited	‎02-23-2018 03:36 PM

Re: Mean Squared Error vs Average Squared Error

Mean Squared Error vs Average Squared Error

Re: Array with a Character Variable (Using a Do-Until Loop)

Re: Concatenation

Array Imputation with Aggregate

Re: Binning a set of Continuous Variables using Percentiles for WOE Tr...

Re: Impute function in SAS Miner

Re: GLM not predicting lower values

Re: Transitioning to DS2

Transitioning to DS2

Metadata: More space please

Re: Increase variable name length from 32 to 128 characters

Increase variable name length from 32 to 128 characters

Re: Increase variable name length from 32 to 128 characters

Re: Using a Teradata UDF in SAS Implicit Sql Pass Thru

Re: Mean Squared Error vs Average Squared Error

Mean Squared Error vs Average Squared Error

Re: GLM not predicting lower values

Re: Concatenation

Re: VIF values with int vs. noint REG model

Re: Comments inside macro %IF statements?

Comments inside macro %IF statements?

Re: Why Won't Scatter Plots Print

Re: Why Won't Scatter Plots Print

Why Won't Scatter Plots Print