BookmarkSubscribeRSS Feed
Calcite | Level 5



I'm trying to do some research with a professor and I've been tasked with learning SAS. I'm trying to use the proc sort command to sort some of the data and would like some help doing this:


1) Sorting data between years, i.e. from 1985-2005

2) Sorting data so it filters out/eliminates data with a certain character in the first slot of a cell. I.e., for SIC codes, I want to filter out any that start with 6 (6123, 6149, etc)

3) I want to add a column that shows how many firms are in a SIC 1-digit code. For example, I want to count the firms that start with 5 (5124, 5655), then sort them by Year. So an example would look like this:


1) A 5100 2

2) B 5101 2

3) C 4231 1


Thank you! If anyone can help me find some beginner tutorials/videos too, that would be great.

Tourmaline | Level 20

As you suggest yourself, as a beginner, you need some training.

If you don't have the possibility to attend a physical training location, there are e-learning options. You can start with the free tutorials, and see what that brings you:

There are probably a lot of stuff on youtube as well.

If you find SAS documentation hard to navigate, try to read SAS literature, like The Little SAS Book, or SAS for Dummies:


Proc Sort does what the name implies, sorts data, and not much more.

You can filter data there using the almost global WHERE statement.

For manipulating data you could use the Data Step, or SQL.

For doing statistical summaries, use (again) SQL, or some other SAS Procedures (Summary, Tabulate, Report).

Data never sleeps
Super User

1) Proc Sort data=have; by year; run;
2) You would have to specify more clearly what you want - what does your input look like and what output do you expect?
3) Create a new variable using Substring() function and then use a PROC FREQ to summarize your data.
Calcite | Level 5

1) So it would only show the data from 1985-2005, for this, would I use proc sort or would I use data conditional;? For this, I've posted what I've figured out so far below.


data conditional;

if fyear GT 1985 and fyear LT 2005 then;


I know this might not be correct, but could someone nudge me in the right direction? I don't want it to be an if statement, I just don't know how to filter so it ONLY shows between 1985 and 2005.


2) One column of data is SIC codes, so...







I'm trying to eliminate any observations that begin with a "6" (as they're more complicated) so after running my sort or filter, it would only show the "7043" data row.


Thanks everyone!

Super User

PROC FREQ, one of the basic summarization procedures, will by default generate output in order, numeric or alphabetic and the data does not need to be sorted first.


Proc freq data= have;

   tables year*siccode/ list;



Will generate a single output table with year first and then the SIC codes. You can add a where clause to filter data without sorting.


proc freq data=have;

   where substr(siccode,1,1) = '5'; /*if the sic code is character, which it really should be unless you are doing arithmetic with it*/

   where floor(siccode/1000) = 5; /* if numeric*/

   tables year*siccode/ list;


Calcite | Level 5

Thank you so much! Modified your tables year*sic/list code and got my sort from 1985-2005. If I wanted to add more of my data columns to that table, would I just add them like this?




Like that?

Super User

Yes adding the additional variables with the * says to show the combinations in the data. Note that if a variable has missing data then that does not appear in the output unless you include an instruction MISSING in the option list after the / separator.


Also if you have many levels for each variable you might exceed the display output or it will take a long time to produce. If you have 5 variables with 20 levels each you are asking for 3,200,000 rows of output. So be careful. I have seen an errant Proc freq table output generate more than 5 stack feet of not-quite-as-intended output from a line printer.




The early bird rate has been extended! Register by March 18 for just $695 - $100 off the standard rate.


Check out the agenda and get ready for a jam-packed event featuring workshops, super demos, breakout sessions, roundtables, inspiring keynotes and incredible networking events. 


Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 4 in conversation