Creating a subset of dataset in new output file

Occasional Contributor kvc
Posts: 13

I'm trying to create a subset of my dataset with just 6 variables. That is, I want to send these 6 variables, intact without performing any analyses on them, to an output file (I plan to merge these 6 variables with another group of mean-level variables that I created using the aggregate code that allowed me to send those mean-level variables to an output file). I found the following code:

PROC UNIVARIATE DATA=SASdataset  options;
           options:PLOT
   VAR variable(s);
   BY variable(s);
   OUTPUT OUT=SASdataset  keyword=variablename ... ;

Linear models

However, it's confusing to me and I'm not sure how to use it.


Accepted Solutions
Solution
‎10-31-2011 04:15 PM
PROC Star
Posts: 7,363

You have to be a bit more specific regarding what you want the output file to contain.  Here is an example that might help:

data test;
  set sashelp.class;
run;

proc sort data=test;
  by sex;
run;

PROC UNIVARIATE DATA=test;
   VAR age height weight;
   by sex;
   OUTPUT OUT=want mean=mean_age mean_height mean_weight
                   std=sd_age sd_height sd_weight;
run;


All Replies

Occasional Contributor kvc
Posts: 13

I tried your code but the output wasn't what I was looking for.

There are about 1500 participants who have data for these six variables. If I could copy the six variables with their data (not the means of the data) and paste them into a new SAS file, that would be what I am trying to do.

Does that make sense? I just want a subset of the actual variables with their original data.

PROC Star
Posts: 7,363

Can't you just merge the resulting file with your original data?  Alternatively, PROC UNIVARIATE has an IDOUT option that will probably do what you want.  Some procs have a copy option that will also bring in other fields.
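
As a rough sketch of the merge suggested here, assuming the TEST and WANT data sets from the PROC UNIVARIATE example earlier in the thread (TEST sorted by sex, WANT with one row per sex), a match-merge would attach the group statistics to every original row. The output name COMBINED is made up for illustration.

```sas
/* Assumes TEST is already sorted by SEX and WANT has one row per SEX,
   as produced by the PROC UNIVARIATE example earlier in the thread. */
data combined;          /* COMBINED is a hypothetical output name */
  merge test want;      /* each row of TEST picks up its group's statistics */
  by sex;
run;
```

Each observation in COMBINED would then carry its original values plus the mean and standard deviation columns for its sex group.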

Occasional Contributor kvc
Posts: 13

I know I'm not explaining this right. I'll try again. I am working with a very large dataset that includes individuals who are reporting on many different things. In this dataset are also variables about the schools these individuals attend. I am trying to extricate the school-level variables from the dataset so I can run some separate analyses on just the school level data. I don't want to merge these variables back into the dataset. They are already in the dataset. I want to take them out of the dataset and have them in a separate file. Does this make sense?

PROC Star
Posts: 7,363

No! To me, that conflicts with what you've said and asked before.  It would help if you show a very simple example with, say, 3 people from each of two schools, and only one measure.

What would be helpful, in such an example, is what the data look like coming in and what you would want the resulting file to look like.
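
For illustration only, the kind of small example being asked for might look like the sketch below. Every name and value in it (HAVE, SCHOOLS, student_id, school_id, enrollment, score) is invented, since the poster's real data are not shown. It assumes the goal is one row per school containing only the school-level variables.

```sas
/* Hypothetical input: 3 students from each of 2 schools, with one
   student-level measure (score) and one school-level variable (enrollment). */
data have;
  input student_id school_id enrollment score;
  datalines;
1 101 450 78
2 101 450 85
3 101 450 91
4 202 620 66
5 202 620 73
6 202 620 88
;
run;

/* Keep only the school-level variables, then reduce to one row per school. */
proc sort data=have(keep=school_id enrollment) out=schools nodupkey;
  by school_id;
run;
```

With this made-up input, SCHOOLS would hold two rows (one per school_id) and two columns, leaving HAVE itself unchanged.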

Super User
Posts: 6,500

What do you mean by subset?

"I just want a subset of the actual variables with their original data."

Usually a subset means a reduction in the number of OBSERVATIONS:

data seniors;
   set have;
   where 65 <= age ;
run;

But it appears that you are talking about a subset of the VARIABLES:

data core;
  set have;
  keep id var1-var6;
run;

Either way there is no need to copy and paste anything.
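
A variation on the variable subset, sketched with SASHELP.CLASS so it runs without the poster's data: the KEEP= data set option on the SET statement reads in only the listed variables, rather than dropping the others when the output is written. The data set name CORE2 and the choice of variables are only for illustration.

```sas
/* Read just three variables from the input; all observations are kept. */
data core2;                                /* CORE2 is a hypothetical name */
  set sashelp.class(keep=name age height);
run;

/* List the variables in the result to confirm only the three remain. */
proc contents data=core2;
run;
```

For a plain copy like this, the KEEP statement and the KEEP= option give the same result; the option form mainly matters when the input has many variables that are never needed.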

What do you mean by "with their original data"?  A subset by definition would be a copy of the original data.

Did you want to do some analysis?  Perhaps calculate some statistic on the data and combine it some way with the original data?

Did you want to calculate statistics across observations?  Such as a mean or a sum?

proc sql noprint;
  create table want as
    select id,var1,mean(var1) as mean_var1
    from have
  ;
quit;
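
If the mean is wanted within groups rather than over the whole table, a GROUP BY clause could be added. The sketch below uses SASHELP.CLASS with sex as a stand-in grouping variable (an assumption, since the thread's own data are not shown). SAS remerges the group mean onto each original row and writes a note to the log saying so.

```sas
proc sql;
  create table want_by_group as            /* hypothetical table name */
    select name, sex, height,
           mean(height) as mean_height     /* mean within each SEX group */
    from sashelp.class
    group by sex
  ;
quit;
```

Every row keeps its own height and gains the mean height for its group, which is the SQL counterpart of merging PROC UNIVARIATE output back onto the original data.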

Did you want to calculate a statistic across variables?

data want;
  set have;
  mean = mean(of var1-var6);
run;

Occasional Contributor kvc
Posts: 13

Of what you sent me, this is the code I was looking for:

data core;
  set have;
  keep id var1-var6;
run;

However, the PROC UNIVARIATE code (see one of the responses above), which gave me the means, also worked for what I needed. But I thank you for also sending me this.

Occasional Contributor kvc
Posts: 13

Okay, I just realized that above you did answer my question. I just didn't need the SDs, and I got confused by the output. I think I understand what I didn't before. So thank you for your help here.

☑ This topic is SOLVED.
