About supp

supp · ‎08-31-2024

Is it possible to call an Azure OpenAI API from base SAS and chat with an LLM (i.e. ChatGPT-4)? Here is an example of what I am thinking. Prompt ChatGPT-4 with "Hello chat gpt" and capture the response in JSON. This just gives me a 404 error. Has anyone gotten something like this to work?

    %let api_key= xxxxxxxxxxxxxxxxxxxxxxxxx;
    %let question = %str(%"Hello chat gpt%");

    /* Body of the POST request */
    filename in temp;
    data _null_;
      file in;
      put;
      put "{";
      put '"model": "gpt-4", "messages": [{"role": "user", "content": '"&question }]";
      put "}";
    run;

    /* reference that file as IN= parm in PROC HTTP POST */
    filename resp "%sysfunc(getoption(WORK))/echo.json";
    proc http
      method="POST"
      url="<End Point URL>/openai/deployments/gpt-4/chat/completions?api-version=2024-02-01"
      ct="application/json"
      in=in
      out=resp;
      headers "Authorization" = "Bearer &api_key.";
    run;
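For comparison, here is a minimal Python sketch (not SAS) of how the pieces of such a request fit together. It only builds the URL, headers, and JSON body and does not send anything. Two points may be worth checking against the 404: Azure OpenAI key authentication uses an `api-key` header, and the path segment after `/deployments/` is the deployment name chosen in the Azure resource, which need not equal the model name. The resource name `my-resource`, the deployment name `my-gpt4-deployment`, and the helper `build_chat_request` are hypothetical placeholders.

```python
import json


def build_chat_request(endpoint, deployment, api_key, question,
                       api_version="2024-02-01"):
    """Assemble URL, headers, and JSON body for a chat completions call.

    Nothing is sent; this only shows how the parts are put together.
    """
    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    # Key-based authentication goes in an "api-key" header.
    headers = {"Content-Type": "application/json", "api-key": api_key}
    # json.dumps handles quoting and escaping of the question text.
    body = json.dumps({"messages": [{"role": "user", "content": question}]})
    return url, headers, body


# Hypothetical resource and deployment names, for illustration only.
url, headers, body = build_chat_request(
    "https://my-resource.openai.azure.com",
    "my-gpt4-deployment",
    "xxxx",
    "Hello chat gpt",
)
print(url)
print(body)
```

In PROC HTTP terms, the same idea would mean sending the key in an `api-key` header and putting the actual deployment name in the URL, assuming key-based authentication is what the resource uses.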

supp · ‎05-31-2024

Interest: I am looking for a team
Track: any
Skills: SAS, SAS Viya, SQL
Profile: Laine Suppes | LinkedIn

supp · ‎05-20-2024

The formatting you suggested might be the solution. I put a space after the term, then no other spaces. I now see the term 'dont know' in my outterms CAS table! I will examine the results a bit more but this is progress!

supp · ‎05-20-2024

Thanks for the reply! I am using SAS Viya 3.5. I do not get any errors or messages. My results are unchanged when I use multiterm. For example, I put the term "dont know" in a CAS table "mycas._multiterm" as follows. Note that this is a common bigram in my data, found by using pointwise mutual information. Here is my code for topic discovery:

    proc textmine data= mycas.data_file;
      doc_id id;
      var text;
      parse
        termwgt= none
        cellwgt= none
        reducef= 30
        entities= std
        multiterm= mycas._multiterm
        outparent= mycas.outparent
        outterms= mycas.outterms
        outpos= mycas.outpos
        outchild= mycas.outchild;
      svd
        k= 100
        outdocpro= mycas.outdocpro
        keepvariables= (id)
        outtopics= mycas.outtopics
        svdu= mycas.outsvdu;
    run;

My primary indication that it is not working is that the term 'dont know' is not in the mycas.outterms table. My secondary indication is that I can't get my topic assignments to change when I specify multiterm phrases.

    proc sql;
      select * from mycas.outterms where term = 'dont know';
    quit;
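As a side note on how a bigram such as "dont know" can be flagged by pointwise mutual information, here is a small Python sketch (not SAS). It uses a plain whitespace tokenizer and the textbook definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The function name `bigram_pmi` and the toy comments are made up for illustration and are not the data from this thread.

```python
import math
from collections import Counter


def bigram_pmi(docs):
    """Return {(w1, w2): PMI} for adjacent word pairs in a list of strings."""
    unigrams = Counter()
    bigrams = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi          # probability of the pair
        p_x = unigrams[w1] / n_uni   # probability of the first word
        p_y = unigrams[w2] / n_uni   # probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy comments, invented for this example.
comments = [
    "dont know what to say",
    "i dont know",
    "training was fine",
    "dont know about training",
]
scores = bigram_pmi(comments)
for pair, pmi in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(pmi, 3))
```

Pairs that occur only once can get very high PMI, so in practice a minimum count filter is usually applied before picking candidate phrases.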

supp · ‎05-19-2024

I am attempting to extract topics from a collection of customer comments, and I would like to be able to parse common phrases. I read the documentation for the multiterm option of the PARSE statement, but I cannot get it to work. How can I pass the phrases "I don't know", "improve training", or "everything is fine"?

supp · ‎08-23-2023

I should mention I don't have access to IML

supp · ‎08-22-2023

Has anyone implemented the BEST approach to compare two group means and their difference? I am interested in a Bayesian approach to a two-sample t test. I found this paper, which describes an approach referred to as BEST: Kruschke2013JEPG.pdf (iu.edu). From the paper, here is a visual of the model used:

Here is my attempt to apply the BEST approach described in the paper to the "Behrens-Fisher Problem" described in the documentation for the SAS MCMC procedure. The data:

    data behrens;
      input y ind @@;
      datalines;
    121 1  94 1 119 1 122 1 142 1 168 1 116 1
    172 1 155 1 107 1 180 1 119 1 157 1 101 1
    145 1 148 1 120 1 147 1 125 1 126 2 125 2
    130 2 130 2 122 2 118 2 118 2 111 2 123 2
    126 2 127 2 111 2 112 2 121 2
    ;

My attempt to apply the BEST method. It seems to me the main difference is the use of the t distribution as the likelihood function (as opposed to the normal distribution used in the SAS documentation).

    proc sql;
      select mean(y) into :mean_y from behrens;
    quit;

    /** Get pooled data **/
    proc glm data= behrens;
      class ind;
      model y = ind;
    run;

    * Root MSE = 19.32394 ;
    %let low_pooled_std = 19.32394 / 1000;
    %put &=low_pooled_std. ;
    %let high_pooled_std = 19.32394 * 1000;
    %put &=high_pooled_std. ;

    proc mcmc data=behrens outpost=postout2 seed=123 nmc=40000
              monitor=(_parms_ mudif) statistics(alpha=0.01);
      ods select PostSumInt;
      parms mu1 0 mu2 0;
      parms sig21 1;
      parms sig22 1;
      parms nu 1;
      prior mu: ~ N(&mean_y., sd= &high_pooled_std.);
      * prior assumes pooled mean and normal distribution ;
      prior sig2: ~ uniform(&low_pooled_std., &high_pooled_std.);
      prior nu: ~ expon(scale= 29);
      * From Kruschke paper, exponential distribution spreads prior credibility fairly evenly over nearly normal and heavy tailed data ;
      mudif = mu1 - mu2;
      if ind = 1 then do;
        mu = mu1;
        s2 = sig21;
      end;
      else do;
        mu = mu2;
        s2 = sig22;
      end;
      model y ~ t(mu, var=s2, nu);
      /* model y ~ n(mu, var=s2); Use this if a normal distribution is desired. The t distribution should handle outliers better */
    run;

Here are the estimates. These closely match the example in the SAS documentation:

    proc sql;
      select 'Probability difference of means is greater than 0',
             sum(mudif > 0) / count(*) as probability
      from postout2;
    quit;

The priors are subjective, but I just used the approach described in the paper. These should be modified to fit the analysis being conducted. If anyone has attempted to implement the BEST approach, I would appreciate any feedback on my approach.
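Written out as equations, the model that the PROC MCMC code above specifies is roughly the following. This is a transcription of the code as posted, not of the paper: $\bar{y}$ stands for the pooled mean stored in `&mean_y.`, $s = 19.32394$ is the Root MSE from PROC GLM, and $k$ indexes the two groups. The symbols are my own labels.

```latex
\begin{aligned}
y_i \mid (\mathrm{ind}_i = k) &\sim t_{\nu}\!\left(\mu_k,\ \sigma_k^2\right), \qquad k \in \{1, 2\} \\
\mu_k &\sim \mathcal{N}\!\left(\bar{y},\ (1000\,s)^2\right) \\
\sigma_k^2 &\sim \mathrm{Uniform}\!\left(s/1000,\ 1000\,s\right) \\
\nu &\sim \mathrm{Exponential}(\text{scale} = 29) \\
\delta &= \mu_1 - \mu_2
\end{aligned}
```

If I am reading the paper correctly, it places the uniform prior on the standard deviation $\sigma_k$ rather than on $\sigma_k^2$, and the exponential prior on $\nu - 1$ rather than on $\nu$, so those two lines may be worth a second look when comparing against the paper.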

supp · ‎12-23-2022

To figure this out, here is simulated data. The outcome is score. We will try to estimate the coefficient parameter value for group using a linear regression.

    /** Simulate data **/
    data _sim1;
      array test [4] _temporary_ (.03184 .08917 .19745 .68152);
      array control [4] _temporary_ (.02376 .10239 .20877 .66508);
      do i= 1 to 500;
        test_dist= rand('tabled', of test[*]);
        group = 1;
        if test_dist= 1 then score= 0;
        if test_dist= 2 then score= 33.33;
        if test_dist= 3 then score= 66.67;
        if test_dist= 4 then score= 100;
        output;
      end;
      do i= 1 to 10000;
        group = 2;
        control_dist= rand('tabled', of control[*]);
        if control_dist= 1 then score= 0;
        if control_dist= 2 then score= 33.33;
        if control_dist= 3 then score= 66.67;
        if control_dist= 4 then score= 100;
        output;
      end;
    run;

Using PROC GENMOD to estimate the parameter values. Not sure if it is better to treat group as a class variable or just numeric.

    proc genmod data= _sim1;
      /* class group; */
      model score = group;
      bayes seed= 1 coeffprior=normal nbi= 1000 nmc= 100000 thin= 10 out= posterior diagnostics=all;
    run;

I think the diagnostics look good? The final analysis seems reasonable. Using the HPD interval, the impact on score of being in the control group is likely somewhere between -3.0138 and 1.55543, with a maximum likelihood point estimate of -0.7640.

I tried to recreate something similar using PROC MCMC, but it doesn't seem to be working as well. Can you see what I am doing wrong?

    /** MCMC approach **/
    proc mcmc data= _sim1 seed=1 nbi=1000 nmc=10000 outpost= simout thin=10;
      parms b0 0 b1 0 s2 1;
      prior b: ~ normal(0, var= 100);
      prior s2 ~ igamma(1, scale= 1);
      mu = b0 + (b1 * group);
      model score ~ normal(mu, var= s2);
    run;

The effective sample size is very low, and I seem to be getting almost a reversed impact of being in the test group.
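As a sanity check on what the regression should recover, here is a short Python sketch (not SAS) that computes the expected score in each group straight from the simulation probabilities. With group coded 1 = test and 2 = control and a model score = b0 + b1 * group, the slope b1 is the control mean minus the test mean, and b0 is the fitted value extrapolated to group = 0. The variable names are my own.

```python
# Score levels and the probabilities used in the simulation above.
SCORES = [0.0, 33.33, 66.67, 100.0]
TEST_PROBS = [0.03184, 0.08917, 0.19745, 0.68152]
CONTROL_PROBS = [0.02376, 0.10239, 0.20877, 0.66508]


def expected_score(probs):
    """Expected value of score under a tabled distribution."""
    return sum(p * s for p, s in zip(probs, SCORES))


test_mean = expected_score(TEST_PROBS)        # group = 1
control_mean = expected_score(CONTROL_PROBS)  # group = 2

# score = b0 + b1 * group, so the two group means pin down both terms.
slope = control_mean - test_mean
intercept = test_mean - slope

print(f"test mean      = {test_mean:.4f}")
print(f"control mean   = {control_mean:.4f}")
print(f"true slope     = {slope:.4f}")
print(f"true intercept = {intercept:.4f}")
```

The true slope is small and negative, which is in line with the GENMOD interval. The true intercept is in the mid 80s, which may be worth comparing against the prior placed on b0 in the PROC MCMC step.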

supp · ‎12-15-2022

Is there a way to use a Bayesian approach to compare the means of two independent samples? Can PROC MCMC handle something like this? For example, I have two independent samples from the same underlying population:

Sample 1: n = 500, mean = 83.8, standard deviation = 24.7
Sample 2: n = 1000, mean = 86.1, standard deviation = 25.8

Is there a way I can model the difference between the means using PROC MCMC without the underlying data? My only thought is to create a posterior distribution like this:

    /* Compute the posterior distributions for the population means */
    data post;
      do i = 1 to 10000;
        do x= 0 to 100 by .1;
          sample1 = pdf('Normal', x, 83.8, 24.7);
          sample2 = pdf('Normal', x, 86.1, 25.8);
          output;
        end;
      end;
    run;

@Rick_SAS
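One possible shortcut that needs only the summary statistics, sketched here in Python rather than SAS: with flat priors and samples this large, the posterior of each mean is approximately normal with standard deviation sd / sqrt(n), so the difference of the means is approximately normal too. This is a large-sample approximation and an assumption on my part, not a substitute for a full PROC MCMC model. The function name is hypothetical.

```python
import math


def diff_of_means_posterior(n1, mean1, sd1, n2, mean2, sd2):
    """Normal approximation to the posterior of (mean2 - mean1).

    Assumes flat priors and large samples, so each mean is treated as
    Normal(sample mean, sd**2 / n).
    """
    diff = mean2 - mean1
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = diff / se
    # Standard normal CDF via the error function.
    prob_positive = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return diff, se, prob_positive


diff, se, prob = diff_of_means_posterior(500, 83.8, 24.7, 1000, 86.1, 25.8)
print(f"posterior mean of difference = {diff:.2f}")
print(f"posterior sd of difference   = {se:.4f}")
print(f"P(mean2 > mean1)             = {prob:.4f}")
```

Note that the standard deviation of each posterior shrinks with the sample size, which is different from using the raw sample standard deviations of 24.7 and 25.8 as the spread of the means.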

supp · ‎06-29-2022

Thanks @Ksharp, the code you provided gets a posterior distribution without using PROC IML. It is also a little cleaner than my code in the original post.

supp · ‎06-27-2022

I looked into PROC MCMC. I can't figure out what the MODEL statement would be. Maybe this is too simple of an example?

supp · ‎06-27-2022

I spent some time summarizing the results from the SAS method. I now think the approach I used and the results are valid. I still think there is a more elegant way to achieve the same results. Here are some summaries I made from sampling the posterior.

    /** Add a cumulative proportion column to find confidence intervals **/
    data _sample3;
      set _sample2;
      retain cumulative_prop;
      if _n_ = 1 then cumulative_prop = proportion;
      else cumulative_prop = cumulative_prop + proportion;
    run;

    proc sgplot data= _sample3;
      series x= p_grid y= proportion;
    run;

    proc sql;
      /** Add up posterior probability where p < 0.5 **/
      select sum(posterior_s) as sum_standard_post from _sample3 where p_grid < 0.5;
      select sum(proportion) as sum_proportion from _sample3 where p_grid < 0.5;

      /** What proportion of water is compatible with greater than 0.5 and less than .75 probability? **/
      select sum(proportion) as sum_proportion from _sample3 where 0.5 < p_grid < 0.75;

      /** What proportion of water is compatible with the lower 80% probability? **/
      select max(p_grid) as p_grid from _sample3 where cumulative_prop < .8;

      /** What proportion of water is compatible with the middle 80% probability? **/
      select min(p_grid) as lower_parameter, max(p_grid) as upper_parameter
      from _sample3 where 0.1 < cumulative_prop < .9;

      /** Point estimate of highest probability **/
      select p_grid as point_estimate, proportion, cumulative_prop
      from _sample3 having max(proportion) = proportion;
    quit;

    /**
    Proportion of water less than .5
      Posterior --> 0.166 probability
      Sample    --> 0.178 probability
    Proportion of water greater than .5 and less than .75
      Sample    --> .596 probability
    Lower 80% posterior probability when proportion of water is .76
    Middle 80% posterior probability when proportion of water is between .445 and .812
    Point estimate for most likely proportion of water --> .625 (about .4% probability)
    **/
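For cross-checking summaries like these, here is a Python sketch (not SAS) that computes the same quantities directly from the normalized grid posterior, with no sampling noise. It assumes the setup from the original post (6 water in 9 tosses, flat prior) and a 1000-point grid from 0 to 1 as in the R code. The names are my own.

```python
from math import comb


def grid_posterior(water=6, tosses=9, n_grid=1000):
    """Normalized grid posterior for the proportion of water (flat prior)."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    unnorm = [comb(tosses, water) * p ** water * (1 - p) ** (tosses - water)
              for p in grid]
    total = sum(unnorm)
    return grid, [u / total for u in unnorm]


grid, post = grid_posterior()

# Posterior mass below 0.5 and between 0.5 and 0.75.
below_half = sum(w for p, w in zip(grid, post) if p < 0.5)
between = sum(w for p, w in zip(grid, post) if 0.5 < p < 0.75)

# Smallest grid value whose cumulative posterior mass reaches 80%.
cumulative = 0.0
q80 = None
for p, w in zip(grid, post):
    cumulative += w
    if cumulative >= 0.8:
        q80 = p
        break

# Grid value with the highest posterior mass.
mode = grid[max(range(len(post)), key=post.__getitem__)]

print(f"P(p < 0.5)        = {below_half:.4f}")
print(f"P(0.5 < p < 0.75) = {between:.4f}")
print(f"80th percentile   = {q80:.3f}")
print(f"posterior mode    = {mode:.3f}")
```

Summaries taken from 10,000 draws will wobble around these values, and a point estimate based on the most frequently drawn grid value is especially noisy because each grid point receives only a handful of draws.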

supp · ‎06-27-2022

Thanks for the reply. When I try running PROC IML I get a message suggesting my shop does not have a license for it. It seems PROC IML lets us approach the problem very similar to R using vectors.

supp · ‎06-26-2022

I am working through the book "Statistical Rethinking" by Richard McElreath. The following is a toy example from the book that uses grid approximation to get a posterior distribution and then samples from the posterior. My question is: what is an equivalent way to obtain the same posterior and a similar sample in SAS? Since I am more familiar with SAS, I am doing this for understanding and fun.

The setup is that we toss a globe and catch it 9 times. The tip of our right index finger will land on either land (L) or water (W). We observe L = 3 and W = 6. We are trying to determine the proportion of the globe that is water using a Bayesian approach. The R code:

    # create a grid of possible parameter values (proportion of globe that is W)
    p_grid <- seq(from= 0, to= 1, length.out= 1000)
    p_grid[1000]
    # Set prior
    prob_p <- rep(1, length(p_grid))
    # Likelihood function for grid of possible parameter values
    prob_data <- dbinom(6, size = 9, p_grid)
    # combine prior with likelihood to get posterior
    posterior <- prob_data * prob_p
    sum(posterior)
    # Standardize posterior
    posterior <- posterior/sum(posterior)
    # Sample from posterior
    samples <- sample(p_grid, prob = posterior, size = 1e4, replace = TRUE)
    # plot samples
    plot(samples)
    plot(p_grid, posterior)

Here is what the resulting posterior distribution looks like from the R script.

Here is my attempt to recreate the process in SAS. The tricky part seems to be how to sample (or run simulations) based on the posterior. I used PROC SURVEYSELECT.

    /** Bayes practice **/
    data _test1;
      p_grid= 0; /** Set initial parameter value **/
      do i= 1 to 1000;
        /* p_grid = rand('uniform') ; */ /** Alternative approach is to just randomly generate a bunch of parameter values **/
        prob_p = 1 ; /** Set a prior **/
        prob_data = pdf('binomial', 6, p_grid, 9); /** Likelihood function, likelihood of data given parameter value **/
        posterior = prob_data * prob_p;
        output;
        p_grid + .001; /** update p_grid **/
      end;
      drop i;
    run;

    proc sql;
      select sum(posterior) into: sum_posterior from _test1; /* Sum posterior to standardize */
      create table _test2 as
        select *, posterior/(&sum_posterior.) as posterior_s
        from _test1
        order by p_grid;
    quit;

    /* create macro to sample from dataset using standard posterior as the size. */
    %macro sample(i);
      proc surveyselect data= _test2 method= pps_wr out= _sample1 n= 1 outhits noprint;
        size posterior_s;
      run;
      proc append base= all_simulations data= _sample1;
      run;
    %mend sample;

    proc delete data= all_simulations;
    run;

    data _null_;
      %let sim_size = 10000;
      do i = 1 to &sim_size;
        call execute("%sample("||i||");");
      end;
    run;

    proc sql;
      create table _sim_sum1 as
        select p_grid, sum(numberhits) as freq, sum(numberhits) / &sim_size. as proportion
        from all_simulations
        group by p_grid;
      select sum(proportion) as sum_proportion from _sim_sum1;
    quit;

    proc sgplot data= _sim_sum1;
      series x= p_grid y= proportion;
    run;

Here is the resulting distribution from the plot: (It doesn't look right)

I acknowledge I may be way off base with this approach. I am trying to understand this material and would appreciate someone showing me a proper way to do this in SAS.
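For readers who want the grid approximation and the sampling step side by side in one place, here is a compact Python sketch (neither R nor SAS) of the same procedure: build the grid, multiply the flat prior by the binomial likelihood, normalize, then draw grid values with probability proportional to the posterior. The function name and the fixed seed are my own choices for illustration.

```python
import random
from math import comb


def sample_grid_posterior(water=6, tosses=9, n_grid=1000,
                          n_samples=10_000, seed=1):
    """Grid-approximate the posterior and draw samples from it."""
    # Grid of candidate proportions of water, 0 to 1 inclusive.
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    prior = [1.0] * n_grid
    likelihood = [comb(tosses, water) * p ** water * (1 - p) ** (tosses - water)
                  for p in grid]
    unnorm = [lk * pr for lk, pr in zip(likelihood, prior)]
    total = sum(unnorm)
    posterior = [u / total for u in unnorm]
    # Weighted sampling with replacement, one call for all draws.
    rng = random.Random(seed)
    samples = rng.choices(grid, weights=posterior, k=n_samples)
    return grid, posterior, samples


grid, posterior, samples = sample_grid_posterior()
sample_mean = sum(samples) / len(samples)
print(f"number of samples = {len(samples)}")
print(f"sample mean       = {sample_mean:.3f}")
```

The design point is that all draws come from a single weighted-sampling call, which is the role `sample()` plays in the R code. In SAS the analogous move would be one weighted-sampling step that requests all 10,000 draws at once rather than 10,000 separate one-draw steps, assuming the procedure in use supports that.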

supp · ‎03-07-2022

Yeah, I over thought this one. The expected value of a bad outcome is just n*p or 100*.22 = 22 bad outcomes out of 100. The variance is n(p)(1-p) or 100 * .22 * .78 = 17.16 Standard deviation is sqrt(17.16) = 4.14
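The same arithmetic as a tiny Python sketch (not SAS), using the standard binomial formulas mean = n*p and variance = n*p*(1-p). The function name is just for illustration.

```python
import math


def binomial_summary(n, p):
    """Mean, variance, and standard deviation of a Binomial(n, p) count."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, math.sqrt(variance)


mean, variance, sd = binomial_summary(100, 0.22)
print(f"expected bad outcomes = {mean:.2f}")
print(f"variance              = {variance:.2f}")
print(f"standard deviation    = {sd:.2f}")
```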

Online Status	Offline
Date Last Visited	‎09-01-2024 10:46 AM

Calling Azure OpenAi from base SAS

Re: Interested in joining a team? Need teammates?

Re: Specify multi word phrases in proc textmine

Re: Specify multi word phrases in proc textmine

Specify multi word phrases in proc textmine

Re: How to implement Bayesian Estimation Supersedes the t Test (BEST)...

How to implement Bayesian Estimation Supersedes the t Test (BEST) in ...

Re: Bayesian approach to a two sample t test

Bayesian approach to a two sample t test

Re: How to sample from a posterior distribution

Interested in joining a team? Need teammates?

Re: How to sample from a posterior distribution

Re: How to sample from a posterior distribution

Re: How to sample from a posterior distribution

Re: How to sample from a posterior distribution

Re: Interested in joining a team? Need teammates?

Re: How many times out of 100 will event occur?

Re: Creating a Datalines Table with Proper Date Format

Re: Getting SAS script to run run from Python using SASPy

Find the name of the owner's favorite monkey

SAS Hacker's Hub

SAS Inner Circle Panel

SAS Analytics Explorers