Hi guys,
Can you please help me draw a random year for each firm from the list of firm-years below (a panel dataset)?
Firm Years
A 2010
A 2011
A 2012
A 2013
A 2014
B 2009
B 2010
B 2011
B 2012
B 2013
B 2014
B 2015
C 2013
C 2014
C 2015
For each firm, I want to pick a random year from that firm's own list of years. For instance, the random year for firm A should fall between 2010 and 2014. Likewise, for firm B it should fall between 2009 and 2015.
Regards
Hi @amanjot_42,
PROC SURVEYSELECT (requires SAS/STAT) is the natural choice for this task:
data have;
input Firm $ Years;
cards;
A 2010
A 2011
A 2012
A 2013
A 2014
B 2009
B 2010
B 2011
B 2012
B 2013
B 2014
B 2015
C 2013
C 2014
C 2015
;
proc surveyselect data=have method=srs n=1
seed=1618 out=want(keep=firm years);
strata firm;
run;
The above step selects n=1 observation at random (SRS = simple random sampling) from each of the strata (groups) defined by the values of FIRM, using the arbitrary positive integer 1618 as the seed that initializes the random number generator. Dataset HAVE must be sorted by FIRM for the STRATA statement to work. Without the KEEP= option, dataset WANT would contain two additional variables containing stratum sizes and selection probabilities.
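The sample data is already in FIRM order, but real panel data may not be. A minimal sketch of the preliminary step, assuming your dataset is also named HAVE and is not yet sorted:

```sas
/* Assumption: HAVE is not yet in FIRM order.                      */
/* The STRATA statement expects the input to be sorted by FIRM,    */
/* so sort in place before calling PROC SURVEYSELECT.              */
proc sort data=have;
  by firm;
run;
```

Run this once before the PROC SURVEYSELECT step. It does not change which years can be drawn for a firm.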
Hi @amanjot_42
If you don't have SAS/STAT licensed, you can simulate what PROC SURVEYSELECT does by adding a random key, sorting on Firm + random key, and keeping the first occurrence of each firm. The input is the dataset HAVE provided by @FreelanceReinh.
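data temp1;
  set have;
  rkey = ranuni(5);  /* uniform random sort key; 5 is an arbitrary seed */
run;
proc sort data=temp1;
  by Firm rkey;
run;
data want(drop=rkey);
  set temp1;
  by Firm;
  if first.Firm;     /* keep the first row of each firm in random order */
run;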
If you don't have SAS/STAT licensed but do have a recent version of Base SAS (e.g. 9.4M5), and you don't want to create or sort additional datasets, you can use this approach:
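data want(drop=i);
  call streaminit(3141);          /* arbitrary seed for RAND */
  array y[50] _temporary_;        /* holds the years of the current firm */
  do i=1 by 1 until(last.firm);   /* read all observations of one firm */
    set have;
    by firm;
    y[i]=years;
  end;
  years=y[rand('integer',i)];     /* i = number of years read; pick one at random */
run;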
This assumes that you have at most 50 observations (i.e. years) per firm. Increase the array dimension if that is not enough.
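If you would rather not guess an upper bound for the array, one possible alternative is single-pass reservoir sampling: the k-th year read for a firm replaces the current pick with probability 1/k, which leaves every year of the firm equally likely. This is only a sketch and not part of the replies above. The names K and _Y are made up for illustration, and HAVE is again assumed to be sorted by FIRM.

```sas
data want(keep=firm years);
  call streaminit(3141);                /* arbitrary seed for RAND */
  do k=1 by 1 until(last.firm);         /* one pass over the current firm */
    set have(rename=(years=_y));        /* _y = year just read (hypothetical name) */
    by firm;
    /* Replace the current pick with probability 1/k.            */
    /* For k=1 the condition always holds, so YEARS is never     */
    /* left missing.                                             */
    if rand('uniform') < 1/k then years=_y;
  end;
run;
```

Like the array version, this writes one observation per firm, and it works regardless of how many years a firm has.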