DATA Step, Macro, Functions and more

Is SAS dataset Okay to work with?

Reply
Occasional Contributor
Posts: 8

Is SAS dataset Okay to work with?

Hey Everyone,

I have downloaded a dataset XLSX file from the Bureau Department of Labor Statistics website (https://www.bls.gov/cps/tables.htm#weekearn) that talks about median weekly income of full-time workers between different races, ages, etc., but I have a question if I were to import it SAS program, could I use some the data that it has in it to create a SAS program. Here is the sample data that I am talking about.project understanding.PNG

Super Contributor
Posts: 441

Re: Is SAS dataset Okay to work with?

The short answer is "Yes but you'd have some issues...."

 

The problem with many Excel files is that they contain header rows, blank rows and totals along with the disaggregates. All of these appear to be present in this screenshot. If, as it appears, it's quite a small file you may find it easier to remove these before attempting to import it into SAS

Occasional Contributor
Posts: 8

Re: Is SAS dataset Okay to work with?

Posted in reply to ChrisBrooks

So I would move up rows (with data) to the rows that don't have any data in them and get rid of blank rows?

Super Contributor
Posts: 441

Re: Is SAS dataset Okay to work with?

Yes, also drop the header rows (except for the one which will hold your variable names). I'd also drop the rows containing totals for the categories eg "Men, 16 years and over" as it's bad practice to hold aggregates in the same file as disaggregates and you'll find the file hard/impossible to work with.

Super User
Posts: 7,846

Re: Is SAS dataset Okay to work with?

First of all, don't call this a dataset; it's a spreadsheet. Per definition, spreadsheets are unstructured accumulations of cells ordered by rows and columns; there are no fixed attributes for columns.

A SAS dataset is a database-like table with a fixed set of attributes for each column (and a fixed number of columns!) across all rows.

From what I see, there are at least two logical tables in that spreadsheet: distrubution along sex and age, and distribution along ethnicity and sex.

These tables should end up in individual datasets, as they will require different columns, or you'll have to always filter for specific columns to be missing/non-missing when analyzing for one of the distributions.

I also advise to remove data (the years) from structure and transpose to a long format with individual observations for years.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 11,343

Re: Is SAS dataset Okay to work with?

You would be much better off looking for the "raw" data that would have one row per person.

The example data would not allow you to ever get anything like median income for males 21 to 35.

 

Or possible start here: https://www.census.gov/cps/data/cpstablecreator.html

Which is a census bureau table generator from the same source that will let you create your own tables from the same data source.

 

Note that table generator runs SAS code in the background.

Occasional Contributor
Posts: 8

Re: Is SAS dataset Okay to work with?

Occasional Contributor
Posts: 8

Re: Is SAS dataset Okay to work with?

Here is what I did to get the data table: 

 

screenshot.PNG2017.PNG

Occasional Contributor
Posts: 8

Re: Is SAS dataset Okay to work with?

But how to I export this data?
Occasional Contributor
Posts: 8

Re: Is SAS dataset Okay to work with?

[ Edited ]

@KurtBremser, I went back and made some suggestions based on what told me. I can now import the two different xlsx files into SAS right? Can you take a look one which is attached as a pic and the other as file?

 

Distribution along sex and age: 

 

 Ethnicity&SexData.PNG

Ask a Question
Discussion stats
  • 9 replies
  • 118 views
  • 0 likes
  • 4 in conversation