BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rfarmenta
Obsidian | Level 7

I have a dataset with 200K observations from surveys that were collected between 2002 and 2014. The data is in wide format and there are 4 date variables, date1-4. I need to first subset to include surveys after January 01, 2010. I tired this code but it gives me zero observations:

data date_subset;

set allparts;

if surveydate_w1 < '01JAN2010'd then delete;

if surveydate_w2 < '01JAN2010'd then delete;

if surveydate_w3 < '01JAN2010'd then delete;

if surveydate_w4 < '01JAN2010'd then delete;

run;

After that I need to select a random sample that has a mean age of 35 and is 25% female and 75% male. I think I can use the proc surveyselect or proc sql for this but have never done it. Any help with these procedure would be appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

It appears that you have no records with ALL of the dates later than 01JAN2010.

Maybe what you want is:

if surveydate_w1 ge '01JAN2010'd OR surveydate_w2 ge '01JAN2010'd or surveydate_w3 ge '01JAN2010'd or surveydate_w4 ge '01JAN2010'd then OUTPUT;

Lots of luck trying to specify a MEAN age with a RANDOM selection.

You might be better off specifying an age range that includes 35 such as a dataset option (where = ( 25 le age le 45)).

With suveyselect you can specify STRATA level sample rates such as :

proc surveyselect data=have out=want

     samprate=(.25 .75); /* to use this the input data should be sorted by the STRATA variable and assumes that the value for male comes before female when sorted. Reverse the order of the values if that is no the case*/

Strata Gender;

run;

View solution in original post

3 REPLIES 3
LinusH
Tourmaline | Level 20

First, narrow data is usually better suited for analysis and reporting. And easier to transform. So a general advice would be to transpose your data set.

Then I'm not sure what your subset should do.  Now your logic keeps records where all dates are prior to 2010. Is that correct?

Are you sure that your dates are in SAS date format?

Data never sleeps
ballardw
Super User

It appears that you have no records with ALL of the dates later than 01JAN2010.

Maybe what you want is:

if surveydate_w1 ge '01JAN2010'd OR surveydate_w2 ge '01JAN2010'd or surveydate_w3 ge '01JAN2010'd or surveydate_w4 ge '01JAN2010'd then OUTPUT;

Lots of luck trying to specify a MEAN age with a RANDOM selection.

You might be better off specifying an age range that includes 35 such as a dataset option (where = ( 25 le age le 45)).

With suveyselect you can specify STRATA level sample rates such as :

proc surveyselect data=have out=want

     samprate=(.25 .75); /* to use this the input data should be sorted by the STRATA variable and assumes that the value for male comes before female when sorted. Reverse the order of the values if that is no the case*/

Strata Gender;

run;

rfarmenta
Obsidian | Level 7

Thank you, you are right I was specifying incorrectly and don't have anyone with all of the dates that meet that criteria. The modification you suggested worked.

I am doing a comparison between two datasets with different outcomes so I will just use the same age range and see what other restrictions might be best. Thank you for your help!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 455 views
  • 0 likes
  • 3 in conversation