Could someone who knows both R and SAS help me? I need to run the code below programmatically in SAS. It was developed in R with Spark on Databricks. The goal is to run it in SAS through PROCs, but I know little about SAS. Any suggestions would be appreciated.
for(i in 1:nrow(distinct_cluster)){
  # Pivot the 2018 rows: one row per DESATICLI, one column per Calculation
  variaveis_clusters <- dados %>%
    filter(DATIPRNOTFSC < '2019-01-01', DATIPRNOTFSC >= '2018-01-01') %>%
    group_by(DESATICLI, Calculation) %>%
    sdf_pivot(DESATICLI ~ Calculation, fun.aggregate = list(qtde = "sum"))
  # Mean, min and max of every pivoted column
  stats <- variaveis_clusters %>% select(-DESATICLI) %>% summarise_all(funs(avg, min, max)) %>% collect()
  cols <- variaveis_clusters %>% select(-DESATICLI) %>% colnames()
  avgs <- stats %>% select(ends_with("avg")) %>% unlist
  mins <- stats %>% select(ends_with("min")) %>% unlist
  maxs <- stats %>% select(ends_with("max")) %>% unlist
  # One scaling expression per column: (x - mean) / (max - min)
  exprs <- glue("(`{cols}` - {avgs}) / ({maxs} - {mins})") %>%
    setNames(cols) %>%
    lapply(rlang::parse_expr)  # assumed completion: the posted line ended in a dangling %>%
  variaveis_escaladas <- variaveis_clusters %>% mutate(!!! exprs)
  # Replace missing values with 0 and register the result as a Spark table
  variaveis_clusters_tratado <- na.replace(variaveis_escaladas, 0)
  variaveis_clusters_tratado <- sdf_copy_to(sc, variaveis_clusters_tratado,
    name = "variaveis_clusters_tratado", overwrite = TRUE)
  # k-means with 6 clusters, then label predictions 0..5 as A..F
  cluster <- ml_kmeans(variaveis_clusters_tratado, DESATICLI ~ ., k = 6)
  clusters <- ml_predict(cluster, variaveis_clusters_tratado)
  clusters <- clusters %>% mutate(grupos = case_when(
    prediction == 0 ~ "A",
    prediction == 1 ~ "B",
    prediction == 2 ~ "C",
    prediction == 3 ~ "D",
    prediction == 4 ~ "E",
    prediction == 5 ~ "F"))
}
Could you explain what the code does?
Have you looked into the R-interface in SAS which is available through IML/Studio?
Get it running for one set and then we can help you convert it to run for all groups.
There are generally several ways to do this; two of the most common are BY-group processing and macros.
I'd suggest reading up a bit on BY-group processing in SAS. It's like group_by in R, but think of it as being available to all your packages (procedures, in SAS terms), including the cluster proc.
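To make the BY-group idea concrete, here is a minimal sketch, not a tested translation of the R code. It assumes a hypothetical data set work.vars_wide that is already pivoted (one row per DESATICLI), with a hypothetical character grouping column GRP and hypothetical numeric columns x1-x3. Note that METHOD=RANGE in PROC STDIZE computes (x - min) / (max - min), which is close to but not the same as the (x - mean) / (max - min) scaling in the R code, and that PROC FASTCLUS numbers clusters from 1 rather than 0.

```sas
/* BY-group processing needs the data sorted by the BY variable. */
proc sort data=work.vars_wide out=work.vars_sorted;
    by grp;
run;

/* Range-scale each variable separately within every group.      */
/* Swapped-in technique: (x - min) / (max - min), not mean-based. */
proc stdize data=work.vars_sorted out=work.vars_scaled method=range;
    by grp;
    var x1-x3;
run;

/* k-means with up to 6 clusters, fitted once per BY group.       */
/* OUT= adds the variables CLUSTER and DISTANCE to each row.      */
proc fastclus data=work.vars_scaled maxclusters=6 maxiter=100
              out=work.clusters;
    by grp;
    var x1-x3;
    id desaticli;
run;

/* Map cluster numbers 1..6 to the letters A..F. */
data work.clusters_labeled;
    set work.clusters;
    length grupos $1;
    if cluster in (1:6) then grupos = substr('ABCDEF', cluster, 1);
run;
```

The same BY statement drives every step, so no explicit loop is needed. Missing values are not replaced with 0 here as they are in the R code; that step would have to be added if it matters for your data.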
The other is macros, which are similar to functions, except that you should assume they return nothing and just perform a specific set of tasks.
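As a rough illustration of the macro route, the sketch below repeats one clustering step n times with a %DO loop, which is the closest SAS counterpart to the R for loop. The macro name cluster_groups, the data set work.vars_scaled, the numeric group index GRP_ID and the columns x1-x3 are all hypothetical names chosen for the example.

```sas
%macro cluster_groups(n_groups=6);
    %local i;
    %do i = 1 %to &n_groups;
        /* Cluster only the rows that belong to group &i and     */
        /* write the result to its own data set, work.clusters_1 */
        /* through work.clusters_&n_groups.                      */
        proc fastclus data=work.vars_scaled(where=(grp_id = &i))
                      maxclusters=6 out=work.clusters_&i noprint;
            var x1-x3;
            id desaticli;
        run;
    %end;
%mend cluster_groups;

/* Run the block 6 times, once per group. */
%cluster_groups(n_groups=6)
```

A macro loop like this is usually slower and harder to debug than a BY statement, so it tends to be worth it only when the steps differ from group to group.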
@EloarL wrote:
My question about how often to run the code in SAS refers to the command used at the beginning of the code. That command runs the code 6 times, once for each of 6 distinct groups. In other words, I don't know which command to use in SAS to repeat a block n times. This is my difficulty.
It may help to provide a small data set as SAS data step code and what the desired output would be from the given R code for that example data set. The data set needs to be just big enough or complex enough to exercise all of the options.
Instructions here: will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.