kvc
Calcite | Level 5

I'm trying to create a subset of my dataset containing just 6 variables. That is, I want to send these 6 variables, intact and without performing any analyses on them, to an output file. (I plan to merge them with another group of mean-level variables that I created using aggregate code, which let me send those mean-level variables to an output file.) I found the following code:

PROC UNIVARIATE DATA=SASdataset options;
   (options: PLOT)
   VAR variable(s);
   BY variable(s);
   OUTPUT OUT=SASdataset keyword=variablename ... ;

However, it's confusing to me and I'm not sure how to use it.

Accepted Solution
art297
Opal | Level 21

You have to be a bit more specific regarding what you want the output file to contain.  Here is an example that might help:

data test;
  set sashelp.class;
run;

proc sort data=test;
  by sex;
run;

PROC UNIVARIATE DATA=test;
   VAR age height weight;
   by sex;
   OUTPUT OUT=want mean=mean_age mean_height mean_weight
                   std=sd_age sd_height sd_weight;
run;

kvc
Calcite | Level 5

I tried your code but the output wasn't what I was looking for.

About 1500 participants have data for these six variables. What I am trying to do is the equivalent of copying the six variables with their data (not the means of the data) and pasting them into a new SAS file.

Does that make sense? I just want a subset of the actual variables with their original data.

art297
Opal | Level 21

Can't you just merge the resulting file with your original data? Alternatively, proc univariate has an IDOUT option that will probably do what you want. Some procs have a copy option that will also bring in other fields.
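
As a minimal sketch of the merge route, assuming the TEST and WANT data sets from my earlier example still exist (the output name test_with_means is made up for illustration):

```sas
/* Sketch: attach the group statistics in WANT to every row of TEST.  */
/* Both data sets must be in SEX order: TEST was sorted earlier, and  */
/* PROC UNIVARIATE writes WANT with one row per BY group, in order.   */
data test_with_means;   /* hypothetical output name */
  merge test want;
  by sex;
run;
```

Each row of the result keeps the original age, height, and weight and gains the mean_ and sd_ columns for that row's sex.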

kvc
Calcite | Level 5

I know I'm not explaining this right. I'll try again. I am working with a very large dataset that includes individuals who are reporting on many different things. In this dataset are also variables about the schools these individuals attend. I am trying to extricate the school-level variables from the dataset so I can run some separate analyses on just the school level data. I don't want to merge these variables back into the dataset. They are already in the dataset. I want to take them out of the dataset and have them in a separate file. Does this make sense?

art297
Opal | Level 21

No! To me, that conflicts with what you've said and asked before.  It would help if you show a very simple example with, say, 3 people from each of two schools, and only one measure.

What would be helpful, in such an example, is what the data look like coming in and what you would want the resulting file to look like.

Tom
Super User

What do you mean by subset?

"I just want a subset of the actual variables with their original data."

Usually a subset means a reduction in the number of OBSERVATIONS:

data seniors;
   set have;
   where 65 <= age ;
run;

But it appears that you are talking about a subset of the VARIABLES:

data core;
  set have;
  keep id var1-var6;
run;

Either way there is no need to copy and paste anything.
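
For what it's worth, the same variable subset can also be written with the KEEP= data set option on the SET statement. A minimal sketch, still assuming the placeholder names have, id, and var1-var6 used above:

```sas
/* Sketch: same result as the KEEP statement, but the selection is  */
/* applied to the input data set, so only these variables are read. */
data core;
  set have(keep=id var1-var6);
run;
```

With KEEP= on the input, the other variables are never read into the step, which can matter for a very wide file.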

What do you mean by "with their original data"?  A subset by definition would be a copy of the original data.

Did you want to do some analysis?  Perhaps calculate some statistic on the data and combine it some way with the original data?

Did you want to calculate statistics across observations?  Such as a mean or a sum?

proc sql noprint;
  create table want as
    select id,var1,mean(var1) as mean_var1
    from have
  ;
quit;
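
If the mean should be per group rather than over the whole file, a GROUP BY clause is one way to sketch it. The grouping variable school_id and the table name want_by_school are hypothetical (neither appears in the examples here). SAS remerges the group mean onto each original row and writes a note about the remerge to the log:

```sas
/* Sketch: per-group mean carried on every row of that group. */
proc sql;
  create table want_by_school as     /* hypothetical output name */
    select id, school_id, var1,
           mean(var1) as mean_var1   /* mean of var1 within each school_id */
    from have
    group by school_id               /* hypothetical grouping variable */
  ;
quit;
```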

Did you want to calculate a statistic across variables? 

data want;
  set have;
  mean = mean(of var1-var6);
run;

kvc
Calcite | Level 5

Of what you sent me, this is the code I was looking for:

data core;
  set have;
  keep id var1-var6;
run;

However, the PROC UNIVARIATE code (see one of the responses above), which gave me the means, also worked for what I needed. But I thank you for also sending me this.

kvc
Calcite | Level 5

Okay, I just realized that above you did answer my question. I just didn't need the SDs, and I got confused by the output. I think I understand what I didn't before. So thank you for your help here.
