Hi guys,
Could you please help me pick a random year for each firm from the list of firm-years below (a panel dataset)?
Firm Years
A 2010
A 2011
A 2012
A 2013
A 2014
B 2009
B 2010
B 2011
B 2012
B 2013
B 2014
B 2015
C 2013
C 2014
C 2015
For each firm, I want to draw a random year from that firm's own years. For instance, the random year for firm A should be between 2010 and 2014. Likewise, for firm B, it should be between 2009 and 2015.
Regards
Hi @amanjot_42,
PROC SURVEYSELECT (requires SAS/STAT) is the natural choice for this task:
data have;
input Firm $ Years;
cards;
A 2010
A 2011
A 2012
A 2013
A 2014
B 2009
B 2010
B 2011
B 2012
B 2013
B 2014
B 2015
C 2013
C 2014
C 2015
;
proc surveyselect data=have method=srs n=1
                  seed=1618 out=want(keep=firm years);
   strata firm;
run;
The above step selects n=1 observation at random (METHOD=SRS means simple random sampling) from each of the strata (groups) defined by the values of FIRM, so dataset HAVE should be sorted by FIRM. The arbitrary positive integer 1618 serves as the seed that initializes the random number generator. Without the KEEP= option, dataset WANT would contain two additional variables holding stratum sizes and selection probabilities.
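If the input is not already in FIRM order, a sort can be placed in front of the sampling step. This is only a sketch: the dataset name HAVE_SORTED is made up for illustration, and the NOPRINT option is added merely to suppress the printed summary.

```sas
/* Sort a copy of HAVE by the stratum variable                  */
/* (HAVE_SORTED is a hypothetical name, not from the thread)    */
proc sort data=have out=have_sorted;
   by firm;
run;

/* Same sampling step as above, now reading the sorted copy */
proc surveyselect data=have_sorted method=srs n=1
                  seed=1618 out=want(keep=firm years) noprint;
   strata firm;
run;
```

Keeping the same SEED= value makes the selection reproducible across runs; changing it gives a different draw.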
Hi @amanjot_42
If you don't have SAS/STAT licensed, you can simulate PROC SURVEYSELECT by adding a random key, sorting on Firm and the random key, and keeping the first occurrence of each firm. The input is the data set provided by @FreelanceReinh.
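/* Attach a uniform random key to every firm-year (5 is the seed) */
data temp1;
   set have;
   rkey = ranuni(5);
run;
/* Put the years of each firm into random order */
proc sort data=temp1;
   by Firm rkey;
run;
/* Keep the first (randomly ordered) year of each firm */
data want(drop=rkey);
   set temp1;
   by Firm;
   if first.Firm;
run;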
If you don't have SAS/STAT licensed (but a recent version of Base SAS, e.g. 9.4M5) and you don't want to create or sort additional datasets, you can use this approach:
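data want(drop=i);
   call streaminit(3141);          /* seed for the RAND function       */
   array y[50] _temporary_;        /* holds the years of one firm      */
   do i=1 by 1 until(last.firm);   /* read all observations of a firm  */
      set have;
      by firm;
      y[i]=years;
   end;
   years=y[rand('integer',i)];     /* i = number of years of this firm */
run;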
This assumes that you have at most 50 observations (i.e. years) per firm. Increase the array dimension if that is not enough.
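If the maximum number of years per firm is not known in advance, the array can be avoided altogether. The following is only a sketch of a different technique (single-pass reservoir sampling with a reservoir of size one), not the approach shown above. The variable names _k and _pick are made up for illustration, and HAVE is again assumed to be sorted by FIRM.

```sas
data want(drop=_k _pick);
   call streaminit(3141);                /* seed for the RAND function */
   do _k=1 by 1 until(last.firm);        /* one pass over each firm    */
      set have;
      by firm;
      /* Replace the current pick with probability 1/_k.               */
      /* For _k=1 the condition is always true, so _pick is never      */
      /* left missing.                                                 */
      if rand('uniform') < 1/_k then _pick=years;
   end;
   years=_pick;                          /* one output row per firm    */
run;
```

With this replacement rule each of the n years of a firm ends up selected with probability 1/n, so no upper bound on the group size has to be declared.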