amanjot_42
Fluorite | Level 6

Hi guys,

Could you please help me pick a random year for each firm from the list of firm-years below (panel dataset)?

 

Firm Years
A 2010
A 2011
A 2012
A 2013
A 2014
B 2009
B 2010
B 2011
B 2012
B 2013
B 2014
B 2015
C 2013
C 2014
C 2015

 

For each firm, I want to pick a random year from that firm's own list of years. For instance, the random year for firm A should be between 2010 and 2014; likewise, for firm B it should be between 2009 and 2015.

 

Regards

Accepted Solution
FreelanceReinh
Jade | Level 19

Hi @amanjot_42,

 

PROC SURVEYSELECT (requires SAS/STAT) is the natural choice for this task:

data have;
input Firm $ Years;
cards;
A 2010
A 2011
A 2012
A 2013
A 2014
B 2009
B 2010
B 2011
B 2012
B 2013
B 2014
B 2015
C 2013
C 2014
C 2015
;

proc surveyselect data=have method=srs n=1
                  seed=1618 out=want(keep=firm years);
strata firm;
run;

The above step selects n=1 observation at random (SRS = simple random sampling) from each of the strata (groups) defined by the values of FIRM, using the arbitrary positive integer 1618 as a seed to initialize the random number generator. Dataset HAVE should be sorted by FIRM. Without the KEEP= option, dataset WANT would contain two additional variables holding stratum sizes and selection probabilities.
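Since the STRATA statement expects the input to be sorted by the strata variable, a minimal sketch of the preparatory step could look like this, assuming HAVE might not already be in FIRM order. The dataset name HAVE_SORTED is only an illustrative choice, not something from the original post:

```sas
/* Sort by the strata variable before sampling.
   HAVE_SORTED is a hypothetical dataset name used for illustration. */
proc sort data=have out=have_sorted;
  by firm;
run;

/* Same sampling step as above, now reading the sorted copy */
proc surveyselect data=have_sorted method=srs n=1
                  seed=1618 out=want(keep=firm years);
  strata firm;
run;
```

With the sample data shown here the sort changes nothing, because the rows are already in FIRM order.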

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @amanjot_42 

 

If you don't have SAS/STAT licensed, you can emulate what PROC SURVEYSELECT does by adding a random key, sorting by Firm and the random key, and keeping the first occurrence of each firm. The input is the data set HAVE provided by @FreelanceReinh.

 

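/* Attach a uniform random key to every observation (5 is the seed) */
data temp1;
	set have;
	rkey = ranuni(5);
run;

/* Within each firm, put the observations in random order */
proc sort data=temp1;
	by Firm rkey;
run;

/* Keep the first (i.e. randomly chosen) observation per firm */
data want(drop=rkey);
	set temp1;
	by Firm;
	if first.Firm;
run;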
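The same idea can also be written with the newer RAND function in place of RANUNI. This is only a sketch of a variant, not what was posted: the seed value 5 is carried over purely for illustration, and the random numbers (hence the selected years) will generally differ from the RANUNI version.

```sas
/* Variant: attach a uniform random key using RAND instead of RANUNI */
data temp1;
  call streaminit(5);      /* seed for RAND; value chosen arbitrarily */
  set have;
  rkey = rand('uniform');
run;

/* Random order within each firm */
proc sort data=temp1;
  by Firm rkey;
run;

/* First observation per firm is the random pick */
data want(drop=rkey);
  set temp1;
  by Firm;
  if first.Firm;
run;
```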
amanjot_42
Fluorite | Level 6
Thanks, it worked well!
FreelanceReinh
Jade | Level 19

If you don't have SAS/STAT licensed (but do have a recent version of Base SAS, e.g. 9.4M5) and you don't want to create or sort additional datasets, you can use this approach:

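data want(drop=i);
call streaminit(3141);            /* seed for the RAND function */
array y[50] _temporary_;          /* holds the years of the current firm */
/* Read all observations of one firm per data step iteration */
do i=1 by 1 until(last.firm);
  set have;
  by firm;
  y[i]=years;
end;
/* i now equals the number of years read for this firm */
years=y[rand('integer',i)];       /* pick one of them at random */
run;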

This assumes that you have at most 50 observations (i.e. years) per firm. Increase the array dimension if that is not enough.
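On releases older than 9.4M5, where the 'integer' distribution of RAND is presumably not available, one possible workaround is to derive the array index from a uniform draw. The following is a sketch under that assumption; it also uses a larger array bound (200, an arbitrary choice) to illustrate the adjustment mentioned above:

```sas
data want(drop=i);
  call streaminit(3141);
  array y[200] _temporary_;          /* 200 is an arbitrary upper bound */
  do i=1 by 1 until(last.firm);
    set have;
    by firm;
    y[i]=years;
  end;
  /* rand('uniform') lies strictly between 0 and 1,
     so ceil(i*u) is an integer from 1 to i */
  years=y[ceil(i*rand('uniform'))];
run;
```

The selected years will generally differ from those of the rand('integer', i) version even with the same seed.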
