Hi,
I'm currently waiting to get access to SAS Viya, but I have a test scenario that compares SAS Enterprise Guide (EG) with SAS Viya, and I want to verify the SAS Viya code. For bootstrapping in SAS EG I have this code:
proc surveyselect data=data
out=BootSamples noprint
seed=25
reps=2000
method=urs
samprate=1
outhits;
run;
Am I right that this would be the equivalent code in SAS Viya, or am I missing something?
/* Start CAS session and load data into CAS */
cas mysess sessopts=(caslib='casuser');
libname mylib cas sessref=mysess;
/* Load example data */
proc casutil;
load data=data casout="sample" replace;
run;
/* Perform bootstrap resampling using the sampling action set */
proc cas;
action sampling.srs result=r /
table={caslib='casuser', name='sample'}
output={casout={caslib='casuser', name='BootSamples', replace=true}}
samppct=100 /* Sampling rate of 100% for bootstrap */
method='URS' /* Unrestricted random sampling with replacement */
seed=25 /* Seed for reproducibility */
reps=2000; /* Number of bootstrap replicates */
selection={name='Freq', includeFreq=true}; /* Include frequency counts in the output */
quit;
/* Fetch and display some of the bootstrap samples (Optional) */
proc cas;
table.fetch / table={caslib='casuser', name='BootSamples'} to=10;
quit;
/* End CAS session */
cas mysess terminate;
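For comparison, the resampling that the PROC SURVEYSELECT call above requests (METHOD=URS with SAMPRATE=1, that is, n draws with replacement per replicate) can be sketched as a plain DATA step. This is only an illustration and not a CAS action equivalent: `sashelp.class` and the small replicate count are stand-ins for the real table and for `reps=2000`, and the random number stream differs from SURVEYSELECT's, so the same seed will not reproduce the same samples.

```sas
/* Sketch: bootstrap resampling with replacement in a DATA step.      */
/* sashelp.class stands in for the real input table (assumption).     */
%let nreps = 5;   /* set to 2000 to mirror reps=2000 */

data work.BootSamples;
   call streaminit(25);                      /* seed for reproducibility  */
   do Replicate = 1 to &nreps;
      do _i = 1 to _n;                       /* n draws per replicate     */
         _p = ceil(_n * rand('uniform'));    /* random row number 1.._n   */
         set sashelp.class point=_p nobs=_n; /* direct-access read        */
         output;                             /* one output row per draw   */
      end;
   end;
   stop;                                     /* required when using POINT= */
   drop _i;
run;
```

Each replicate ends up with as many rows as the input table, with repeated rows written out individually, which is roughly what OUTHITS does in the SURVEYSELECT version.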
Thanks to a little math help from StackExchange, here's a SAS ODS Graphics Happy Father's Day greeting!
* Fun w/SAS ODS Graphics: Happy Father's Day! (Scatter + Polygon + Text Plots)
Star vertices algorithm from math.stackexchange.com/questions/3582342/coordinates-of-the-vertices-of-a-five-pointed-star;
data star; * Generate points for stars;
retain id 0 r1 6 r2 2.5 dad "DAD" xT 0 yT 0; * Star outer radius is 6, inner radius is 2.5;
pi=constant("pi");
do pt=1 to 600; * Points for 600 little Unicode stars;
xS=-6.25+12.5*ranuni(123); yS=-5.75+12.5*ranuni(456); output;
end;
xS=.; yS=.;
do k=0 to 4; * Points for 1 big polygon star;
x=r1*cos(2*pi*k/5+pi/2); y=r1*sin(2*pi*k/5+pi/2); output;
x=r2*cos(2*pi*k/5+pi/2+2*pi/10); y=r2*sin(2*pi*k/5+pi/2+2*pi/10); output;
end;
run;
ods graphics / reset width=5in height=5in noborder; * Make Dad a star!;
proc sgplot noautolegend aspect=1 noborder nowall pad=0;
styleattrs backcolor=navy;
symbolchar name=uniStar char='2605'x; * Unicode value for 5-pointed star;
scatter x=xS y=yS / markerattrs=(symbol=unistar color=White size=24pt); * Plot little unicode stars;
polygon x=x y=y id=id / fill fillattrs=(color=cxd9d9d9) dataskin=crisp; * Plot big polygon star;
text x=xT y=yT text=dad / contributeoffsets=none textattrs=(size=48pt color=navy weight=bold); * "DAD";
xaxis display=none values=(-6.25 6.25) offsetmin=.01 offsetmax=.01; * Hide axes;
yaxis display=none values=(-5.75 6.75) offsetmin=.01 offsetmax=.01;
run;
Good afternoon, I have two tables and would like to generate/export a single file in which each table is placed on its own Excel sheet. How can I do this?
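One common way to get one workbook with one worksheet per table is the ODS EXCEL destination. The sketch below assumes a SAS 9.4 release where ODS EXCEL is available; `sashelp.class` and `sashelp.cars` stand in for the two tables, and the sheet names and the output path (the WORK directory) are placeholders.

```sas
/* Sketch: one workbook, one worksheet per table, via ODS EXCEL.     */
/* Table names, sheet names, and the output path are placeholders.   */
ods excel file="%sysfunc(pathname(work))/two_tables.xlsx"
          options(sheet_name="Table1");

proc print data=sashelp.class noobs;
run;

/* The next output starts a new worksheet with this name */
ods excel options(sheet_name="Table2");

proc print data=sashelp.cars noobs;
run;

ods excel close;   /* closes the destination and writes the file */
```

If SAS/ACCESS Interface to PC Files is licensed, two PROC EXPORT steps with DBMS=XLSX and different SHEET= values writing to the same file are another option.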
Hey SAS Community, I am new to SAS and would appreciate any advice on this topic. For a university project, I need to calculate the expected sales value for the upcoming months after my dataset runs out. The dataset includes Total_amt, which contains the transaction values, and Tran_date, which specifies the dates of the transactions.
data TransactionsWithSasDate;
set Transactions;
Tran_date = mdy(Month, Day, Year);
format Tran_date date9.;
run;
proc sql;
create table MonthlySales as
select
intnx('Month', Tran_date, 0, 'Beginning') as Month format=date9.,
sum(Total_amt) as MonthlySalesValue
from TransactionsWithSasDate
group by calculated Month;
quit;
proc arima data=MonthlySales;
identify var=MonthlySalesValue(12);
estimate p=1 q=1;
forecast lead=12 id=Month interval=Month out=ForecastedSalesValue;
run;
proc sgplot data=ForecastedSalesValue;
series x=Month y=MonthlySalesValue / lineattrs=(color=blue) legendlabel="Actual";
series x=Month y=Forecast / lineattrs=(color=red) legendlabel="Forecast";
xaxis label='Month';
yaxis label='Monthly sales value';
title 'Monthly Sales Trend and Forecast';
run;
I double-checked my code, but I am not sure if it is correct because the output graph looks a little off. Any advice on this topic would be highly appreciated! Greetings, Johannes
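The OUT= data set written by the FORECAST statement in PROC ARIMA also carries the confidence limits L95 and U95 next to FORECAST, so one way to judge whether a forecast plot "looks off" is to draw that band behind the two series. The sketch below uses `sashelp.air` (variables DATE and AIR) as a stand-in for MonthlySales and assumes SAS/ETS is licensed; the model orders are copied from the post, not recommended.

```sas
/* Sketch: forecast plot with 95% confidence band.                   */
/* sashelp.air stands in for the MonthlySales table (assumption).    */
proc arima data=sashelp.air;
   identify var=air(12) noprint;          /* seasonal difference, lag 12 */
   estimate p=1 q=1 noprint;
   forecast lead=12 id=date interval=month
            out=work.AirForecast noprint; /* FORECAST, L95, U95, ...     */
run;
quit;                                     /* PROC ARIMA is interactive   */

proc sgplot data=work.AirForecast;
   band x=date lower=l95 upper=u95 / transparency=0.6 legendlabel="95% limits";
   series x=date y=air      / lineattrs=(color=blue) legendlabel="Actual";
   series x=date y=forecast / lineattrs=(color=red)  legendlabel="Forecast";
   xaxis label='Month';
   yaxis label='Monthly value';
run;
```

The FORECAST column covers the historical period as well as the 12 lead months, so the red line overlapping the blue one over the history is expected rather than a sign of an error.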
Hi everyone,
data tab1;
input gr $ experiment var1;
datalines;
A 1 0.58
A 2 0.74
A 3 1.17
B 1 0.73
B 2 0.75
B 3 1.52
C 1 1.09
C 2 1.06
C 3 1.60
;
run;
proc npar1way data=tab1 wilcoxon dscf;
class gr;
var var1;
run;
I have 3 subjects (A, B and C). For each subject, I repeat an experiment 3 times, so I have 3 quantitative measurements per subject. Looking at the data, I see that in each experiment (1, 2 and 3), A < B < C. However, the quantitative variable does not have the same order of magnitude from one experiment to the next, because of manipulation errors. Because of the small sample, I wanted to use a Kruskal-Wallis test to compare the quantitative variable across my 3 groups. But I don't know how to account for the inter-experimental variability, since one and only one CLASS variable must be specified. And is the Kruskal-Wallis test the best solution? Thank you.
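One rank-based option sometimes used when each experiment acts as a block is Friedman's test, which can be requested in SAS through PROC FREQ with the CMH2 and SCORES=RANK options. Note that this replaces Kruskal-Wallis with a different, blocked test; it is only a sketch using the data from the post, and whether it suits the study depends on the design.

```sas
/* Sketch: Friedman's test with experiment as the blocking factor.   */
/* Data are the nine values from the post.                           */
data tab1;
   input gr $ experiment var1 @@;
   datalines;
A 1 0.58  A 2 0.74  A 3 1.17
B 1 0.73  B 2 0.75  B 3 1.52
C 1 1.09  C 2 1.06  C 3 1.60
;
run;

proc freq data=tab1;
   /* strata = experiment, rows = gr, columns = var1 */
   tables experiment*gr*var1 / cmh2 scores=rank noprint;
run;
```

In the CMH output, the "Row Mean Scores Differ" statistic is the one that corresponds to Friedman's chi-square, because the ranks are computed within each experiment.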