About PGStats

PGStats · 09-23-2022

Look at the STREAM routine:

%let seed=1234;

data test1;
   call streaminit(&seed.);
   call stream(1);
   do i = 1 to 4;
      a = rand('Uniform');
      output;
   end;
run;

proc print data=test1; run;

data test2;
   call streaminit(&seed.);
   call stream(2);
   do i = 1 to 4;
      b = rand('Uniform');
      output;
   end;
run;

proc print data=test2; run;

Both streams are controlled by the same seed but consist of independent sequences.

PGStats · 09-22-2022

Specifying option nlag=1 is loosely like requesting:

model measurements = measurements_1 preintervention intervention postintervention sin cos / method=ml dwprob covb;

where measurements_1 is the measurement value from the preceding observation. That is, you model the measurements as depending not only on the independent variables but also on the previous state of the system.
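To make the analogy concrete, here is a rough sketch of both specifications side by side. The dataset name series and the assumption that it is sorted in time order are mine, not from the thread; the two models are related but not equivalent (an AR(1) error term versus a lagged dependent variable), which is why the comparison is only a loose one.

```sas
/* Hypothetical input: dataset SERIES, one row per time point, in time order */
data series_lagged;
   set series;
   measurements_1 = lag(measurements);   /* value from the preceding observation */
run;

/* What NLAG=1 requests: regression with first-order autoregressive errors */
proc autoreg data=series;
   model measurements = preintervention intervention postintervention sin cos
         / nlag=1 method=ml dwprob covb;
run;

/* The loose analogue: the previous measurement entered as a regressor */
proc autoreg data=series_lagged;
   model measurements = measurements_1 preintervention intervention
                        postintervention sin cos / method=ml dwprob covb;
run;
```

The first observation has a missing measurements_1, so the second model is fitted on one observation fewer.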

PGStats · 09-21-2022

Or simply:

prxchange('s/\b(\d+)(st|nd|rd|th)\b/\1/i', 1, Address1)

\b matches any word boundary
\d+ matches one or more digits
\1 brings back the match from the first set of parentheses (i.e. the digits)
the i suffix makes the match case insensitive
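A small data step to try the expression on a few addresses. The sample addresses and the variable name cleaned are made up for illustration; the regular expression is the one above, unchanged.

```sas
data addresses;
   length Address1 cleaned $40;
   input Address1 $40.;
   /* Strip the ordinal suffix from the first number that carries one */
   cleaned = prxchange('s/\b(\d+)(st|nd|rd|th)\b/\1/i', 1, Address1);
   datalines;
123 5th Avenue
42 West 3RD St
7 Main Street
;
run;

proc print data=addresses; run;
```

Since the second argument is 1, only the first ordinal in each address is changed; use -1 there if every occurrence should be replaced.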

PGStats · 09-21-2022

Or if your SAS times refer to consecutive days, use the expression:

sleepSeconds = endTime - startTime + "24:00:00"t;

Note: "24:00:00"t is a SAS time literal that represents the number of seconds in 24 hours.
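A quick check of the arithmetic, with made-up bedtime and wake-up values:

```sas
data sleep;
   startTime = "22:30:00"t;   /* bedtime on day 1 (illustrative value) */
   endTime   = "06:15:00"t;   /* wake-up on day 2 (illustrative value) */
   /* 22500 - 81000 + 86400 = 27900 seconds */
   sleepSeconds = endTime - startTime + "24:00:00"t;
   format startTime endTime sleepSeconds time8.;
   put sleepSeconds=;         /* writes the duration to the log as 7:45:00 */
run;
```

This only applies when endTime falls on the day after startTime; if both times are on the same day, the 24-hour term must be left out.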

PGStats · 09-19-2022

Is age supposed to be a sequence, or is it supposed to be from a certain distribution?

PGStats · 09-17-2022

Using tricks is fun but can be obfuscating...

/* Easier to see what this thing does... */
data want;
   do until(last.case_nbr);
      set W;
      by case_nbr;
      if not missing(reason) then started = 1;
      if started then output;
   end;
   drop started;
run;

PGStats · 09-13-2022

Hi Nick,

The performance of such a simple* operation will usually depend on
1) The size of the tables
2) The location of the tables (local or remote)
3) In which table(s) ID1 and ID2 are primary keys, if any.

* or maybe your query is not as simple as that.

PGStats · 09-12-2022

What would your counter be counting?

PGStats · 09-04-2022

@FreelanceReinh wrote:
It's true that the sequences of random numbers differ between RandSQL1 and RandSQL2 in this example, but the difference is still "systematic" as the order is just reversed: If the second ORDER BY clause is changed to

order by cnt desc;

the differences regarding variables U and N vanish (at least on my computer) even when MYDATA1 was extended from 10 to 10 million observations.

When the join order is changed and the sort order is changed, the original random sequence is recovered, but the random values are no longer associated with the same cnt values.

I get exactly the same results as you describe when I run your many-to-many join example on a SODA (SAS OnDemand for Academics) server.

To anybody considering calling STREAMINIT in proc SQL, I say: DON'T!

PGStats · 09-03-2022

Great discussion! Here is a simple example showing the dangers of trusting the seed setting mechanism to get reproducible random sequences in SQL. Note that the only difference between the two queries is the order of the datasets mentioned in the inner join:

data mydata1;
   do cnt = 1 to 10;
      output;
   end;
run;

proc sort data=mydata1 out=mydata2;
   by descending cnt;
run;

%let seed = 27182818;

proc sql;
   create table RandSQL1(drop=dumb) as
   select mydata1.cnt,
          streaminit(&seed) as dumb,
          rand('uniform') as u,
          rand('normal') as n
   from mydata1 inner join mydata2
        on mydata1.cnt=mydata2.cnt
   order by cnt;
quit;

proc sql;
   create table RandSQL2(drop=dumb) as
   select mydata1.cnt,
          streaminit(&seed) as dumb,
          rand('uniform') as u,
          rand('normal') as n
   from mydata2 inner join mydata1
        on mydata1.cnt=mydata2.cnt
   order by cnt;
quit;

proc sql;
   select a.cnt, a.u=b.u, a.n=b.n
   from randsql1 as a, randsql2 as b
   where a.cnt=b.cnt;
quit;

PGStats · 09-03-2022

Using basic tools:

data have;
   do col1 = 1 to 3;
      col2 = byte(rank("A") + col1 - 1);
      output;
   end;
run;

proc sort data=have;
   by descending col1;
run;

data temp;
   set have;
   length col3 $16;
   retain col3;
   col3 = cats(col2, col3);
run;

proc sort data=temp out=want(keep=col3);
   by col1;
run;

proc print data=want; run;

PGStats · 09-02-2022

ANOVA compares observations, not variables. So you need to transpose your data first. Something like:

data have;
   input Student $ Grad y2016 y2017 y2018 y2019 y2020;
   datalines;
student1 2018 58 83 108 . .
student2 2018 60 86 103 . .
student3 2018 63 80 110 . .
studenta 2018 59 82 106 . .
student4 2019 . 57 84 113 .
student5 2019 . 61 79 117 .
student6 2019 . 64 82 109 .
studentb 2019 . 60 81 115 .
student7 2020 . . 55 80 117
student8 2020 . . 61 82 121
student9 2020 . . 62 87 116
studentc 2020 . . 57 85 118
;

proc transpose data=have out=temp1 prefix=yr;
   by student grad notsorted;
run;

data temp2;
   set temp1;
   if yr1;   /* keep only rows with a non-missing (and non-zero) value */
   year = input(substr(_name_,2), 4.);
   aYear = year - grad + 3;   /* student academic year */
   keep student grad year aYear yr1;
   rename yr1 = labHours;
run;

proc glm data=temp2;
   class aYear year;
   model labHours = aYear year / solution;
   lsmeans aYear;
   lsmeans year;
run;

Note: this is testing for main effects only. Interactions between year and academic year are not estimable.

PGStats · 08-31-2022

2) If I understand correctly, the nearest deadwood is the one found inside the circle, but it could also be closer, lying just outside the circle. So what you have is interval-censored data. The true shortest distance lies in the interval (nearest distance to the circle, measured distance). Ask your professor about the appropriateness of censored data analysis for your data.

PGStats · 08-31-2022

1) For nonparametric univariate tests, look at proc NPAR1WAY. The Wilcoxon rank-sum test for example is based on ranks and does not assume normality.
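A minimal sketch of such a call. The dataset name deadwood and the variables habitat and distance are placeholders of mine, not names from the thread.

```sas
/* Hypothetical data: one row per sampling point */
proc npar1way data=deadwood wilcoxon;
   class habitat;    /* assumed grouping variable */
   var distance;     /* assumed response: distance to nearest deadwood */
run;
```

With two habitat levels this gives the Wilcoxon rank-sum test; with more than two levels the same option gives the Kruskal-Wallis test.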

PGStats · 08-29-2022

a) The quasi-Poisson regression is requested with the RANDOM _RESIDUAL_ statement, as explained here. b) IMHO, extrapolation using splines is nearly impossible, since the spline is fitted segment by segment (between knots), i.e. there is not even a function defined beyond the last knot.
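For point a), a bare-bones sketch of what the request can look like in proc GLIMMIX. The dataset counts and the variables y and time are assumed names; adapt the model to your own series.

```sas
/* Hypothetical data: Y is a count observed at each TIME */
proc glimmix data=counts;
   model y = time / dist=poisson link=log solution;
   random _residual_;   /* adds a multiplicative overdispersion parameter */
run;
```

The regression coefficients match those of the ordinary Poisson fit; the estimated scale parameter inflates their standard errors to account for overdispersion.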


Re: Randomly select 3 rows with condition


Re: Requête inner join conditional

Re: How to change order of variables in the graph produced by lines in...

Re: Duplicate value within ID

Re: Notation scientifique dans une variable de type alfanumerique $cha...

Re: Make new variables using if statements in proc sql

Re: How to use GROUP BY to concatenate strings in SAS proc SQL?

Re: Need to remove the exponential from the number

Re: how can get help about "proc split" online ?

Re: Heteroskedasticity

Re: Create a table with increasing integers from 1 to N in SQL


Re: How do i parse specific number combinations from a field in a tabl...

Re: Add semi-annual variable to existing table

Re: Proc Transpose help - "Name of Former Variable" column not recogni...

Re: Using PROC SQL and need to do YEAR(DATE) AS BID_YEAR

Re: Applying Results of Principal Component Analysis on New Data

Re: how can I create a variable that indicates the percentile that eac...

Geometric mean for zero values

Convert LAT-LONG to UTM, and back

Finding all single linked chains: the allChains macro

How to find all connected components in a graph

Finding a complete sub matrix (aka finding maximal bicliques)

Re: Seeding random number generator in multiple data steps

Re: What does nlag= in the autoreg procedure do?

Re: I can match with prxmatch, need help with prxchange replacment?

Re: calculate hours between 2 times

Re: How to remove outlier from the variable age?

Re: How does this work? Weird DO loop.

Re: Proc sql left join

Re: Proc SQL

Re: how to set the seed in random number generation in SQL

Re: how to set the seed in random number generation in SQL

Re: Concatenating variable value vertically

Re: comparing variables whose observations are semi-exclusive

Re: Alternative to PROC GLM for not normally distributed data + circle...

Re: Alternative to PROC GLM for not normally distributed data + circle...

Re: Modeling time series with Quasi-Poisson reggression