Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- Proc Sort - Help!

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 09-20-2015 11:42 PM
(910 views)

Hi!

I'm trying to do some research with a professor and I've been tasked with learning SAS. I'm trying to use the proc sort command to sort some of the data and would like some help doing this:

1) Sorting data between years, i.e. from 1985-2005

2) Sorting data so it filters out/eliminates data with a certain character in the first slot of a cell. I.e., for SIC codes, I want to filter out any that start with 6 (6123, 6149, etc)

3) I want to add a column that shows how many firms are in a SIC 1-digit code. For example, I want to count the firms that start with 5 (5124, 5655), then sort them by Year. So an example would look like this:

1) A 5100 2

2) B 5101 2

3) C 4231 1

Thank you! If anyone can help me find some beginner tutorials/videos too, that would be great.

6 REPLIES 6

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

As you suggest yourself, as a beginner, you need some training.

If you don't have the possibility to attend a physical training location, there are e-learning options. You can start with the free tutorials, and see what that brings you:

https://support.sas.com/training/tutorial/index.html

There are probably a lot of stuff on youtube as well.

If you find SAS documentation hard to navigate, try to read SAS literature, like The Little SAS Book, or SAS for Dummies:

https://support.sas.com/publishing/index.html

Proc Sort does what the name implies, sorts data, and not much more.

You can filter data there using the almost global WHERE statement.

For manipulating data you could use the Data Step, or SQL.

For doing statistical summaries, use (again) SQL, or some other SAS Procedures (Summary, Tabulate, Report).

Data never sleeps

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

http://www.ats.ucla.edu/stat/sas/modules/default.htm

1) Proc Sort data=have; by year; run;

2) You would have to specify more clearly what you want - what does your input look like and what output do you expect?

3) Create a new variable using Substring() function and then use a PROC FREQ to summarize your data.

1) Proc Sort data=have; by year; run;

2) You would have to specify more clearly what you want - what does your input look like and what output do you expect?

3) Create a new variable using Substring() function and then use a PROC FREQ to summarize your data.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

1) So it would only show the data from 1985-2005, for this, would I use proc sort or would I use data conditional;? For this, I've posted what I've figured out so far below.

data conditional;

if fyear GT 1985 and fyear LT 2005 then;

run;

I know this might not be correct, but could someone nudge me in the right direction? I don't want it to be an if statement, I just don't know how to filter so it ONLY shows between 1985 and 2005.

2) One column of data is SIC codes, so...

6127

6843

6921

7043

I'm trying to eliminate any observations that begin with a "6" (as they're more complicated) so after running my sort or filter, it would only show the "7043" data row.

Thanks everyone!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

PROC FREQ, one of the basic summarization procedures, will by default generate output in order, numeric or alphabetic and the data does not need to be sorted first.

Proc freq data= have;

tables year*siccode/ list;

run;

Will generate a single output table with year first and then the SIC codes. You can add a where clause to filter data without sorting.

proc freq data=have;

where substr(siccode,1,1) = '5'; /*if the sic code is character, which it really should be unless you are doing arithmetic with it*/

where floor(siccode/1000) = 5; /* if numeric*/

tables year*siccode/ list;

run;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you so much! Modified your tables year*sic/list code and got my sort from 1985-2005. If I wanted to add more of my data columns to that table, would I just add them like this?

fyear*sic*datafamount*etc

Like that?

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Yes adding the additional variables with the * says to show the combinations in the data. Note that if a variable has missing data then that does not appear in the output unless you include an instruction MISSING in the option list after the / separator.

Also if you have many levels for each variable you might exceed the display output or it will take a long time to produce. If you have 5 variables with 20 levels each you are asking for 3,200,000 rows of output. So be careful. I have seen an errant Proc freq table output generate more than 5 stack feet of not-quite-as-intended output from a line printer.

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.