BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Whitlea
Obsidian | Level 7

Hello, 

I have a character variable that reflect education level.  I would like to convert this variable to numeric and then tell SAS I want to highest level of education per person.  As of now, all levels of education are showing up per person and are character i.e

 

ID          Education

0000      High School

              Associates 

              Bachelors

0001      High School

              Some College  ...

 

What is the best way to go about this?  Is there a way to do this via formatting?

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

First find the maximum per person. Then summarize.

proc summary data=have nway ;
  class person;
  var ednum;
  output out=want max=highest_ednum;
run;
proc freq data=want;
  tables highest_ednum;
run;

View solution in original post

10 REPLIES 10
Kurt_Bremser
Super User

Best way is to create a custom informat. Make a comprehensive list of values in "Education", bring them into order, and assign numeric values in your proc informat code.

Alternatively, you can create a dataset that can be used as CNTLIN dataset in proc format. Use TYPE "I" for this.

PGStats
Opal | Level 21

Yes, use an informat and a format. Example:

 

proc format;
invalue edLevelIn (JUST UPCASE)
"HIGH SCHOOL" = 1
"ASSOCIATES" = 2
"BACHELORS" = 3
other = 0;
value edLevel
1 = "High School"
2 = "Associates"
3 = "Bachelors"
other = "Unknown";
run;

data test;
input ID  Education & :edLevelIn.;
format Education edLevel. id z4.0;
datalines;
0000   High School
0000    Associates 
0000    Bachelors
0001      High School
0001      Some College
;

proc print data=test noobs; run;
PG
ballardw
Super User

You have a couple of issues. First the character to numeric.

I suspect we did not see all of the character values so this is incomplete but code similar to:

data want;
   set have;
   select (education);
	 when ('High School') EdNum=1;
	 when ('Some College') EdNum=2;
	 when ('Associates') EdNum=3;
	 when ('Bachelors') EdNum=4;
	 otherwise put "WARNING: Unexpected value of education: " education;
   end;
run;

Select is an option to writing a whole bunch of If/Then/Else assignments. The comparison value must be exactly the same, case and all, to the values of your variable. The numeric value to assign is up to you.

You can sort the data based on ID and the numeric value to get things in increasing (sort of) education level.

 

The next part depends on what your actual data looks like and are you looking for a report or a data set or what. Reports are what people read, data sets are for further manipulation. Also, if you any other other variables involved what needs to be done with them will affect the choices at this point.

So if there are other variables involved you need to provide examples and what the final result should be.

If the only thing involved is the ID and education level you can use any of the tools that involve the MAX of the numeric variable:

 

Such as:

Proc means data=want max;
   class id;
   var ednum;
run;

but an example of what you expect the output to look like is a good idea.

 

Whitlea
Obsidian | Level 7

For now I am just trying to create a data set where I can pull the highest level of education for each individual.  This was helpful and quick!

Thanks!

Whitlea
Obsidian | Level 7

Now that I have ednum the way that I want it, I would like the n and percentage of each level but only using the highest level of education per person.  Is there a way to do this in tabulate or freq? I have attached an example of what I would like to do, only this table includes all education levels. Hope this makes sense!

Tom
Super User Tom
Super User

First find the maximum per person. Then summarize.

proc summary data=have nway ;
  class person;
  var ednum;
  output out=want max=highest_ednum;
run;
proc freq data=want;
  tables highest_ednum;
run;
Whitlea
Obsidian | Level 7

This is exactly what I needed! Thank you!

Whitlea
Obsidian | Level 7

@ballardw's data step in the beginning of this post helped me to convert and order my data and also is a solution to this post.

ballardw
Super User

@PGStats wrote:

@Whitlea, the name is @ballardw, not edwards .


In a not quite strange twist of fate @Whitlea and I worked in the same shop so she knows my real name😧 and the w is for edward...

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 934 views
  • 0 likes
  • 5 in conversation