BookmarkSubscribeRSS Feed
dseals23
Fluorite | Level 6

Hey Everyone,

I have downloaded a dataset XLSX file from the Bureau Department of Labor Statistics website (https://www.bls.gov/cps/tables.htm#weekearn) that talks about median weekly income of full-time workers between different races, ages, etc., but I have a question if I were to import it SAS program, could I use some the data that it has in it to create a SAS program. Here is the sample data that I am talking about.project understanding.PNG

9 REPLIES 9
ChrisBrooks
Ammonite | Level 13

The short answer is "Yes but you'd have some issues...."

 

The problem with many Excel files is that they contain header rows, blank rows and totals along with the disaggregates. All of these appear to be present in this screenshot. If, as it appears, it's quite a small file you may find it easier to remove these before attempting to import it into SAS

dseals23
Fluorite | Level 6

So I would move up rows (with data) to the rows that don't have any data in them and get rid of blank rows?

ChrisBrooks
Ammonite | Level 13

Yes, also drop the header rows (except for the one which will hold your variable names). I'd also drop the rows containing totals for the categories eg "Men, 16 years and over" as it's bad practice to hold aggregates in the same file as disaggregates and you'll find the file hard/impossible to work with.

Kurt_Bremser
Super User

First of all, don't call this a dataset; it's a spreadsheet. Per definition, spreadsheets are unstructured accumulations of cells ordered by rows and columns; there are no fixed attributes for columns.

A SAS dataset is a database-like table with a fixed set of attributes for each column (and a fixed number of columns!) across all rows.

From what I see, there are at least two logical tables in that spreadsheet: distrubution along sex and age, and distribution along ethnicity and sex.

These tables should end up in individual datasets, as they will require different columns, or you'll have to always filter for specific columns to be missing/non-missing when analyzing for one of the distributions.

I also advise to remove data (the years) from structure and transpose to a long format with individual observations for years.

ballardw
Super User

You would be much better off looking for the "raw" data that would have one row per person.

The example data would not allow you to ever get anything like median income for males 21 to 35.

 

Or possible start here: https://www.census.gov/cps/data/cpstablecreator.html

Which is a census bureau table generator from the same source that will let you create your own tables from the same data source.

 

Note that table generator runs SAS code in the background.

dseals23
Fluorite | Level 6

dseals23
Fluorite | Level 6

Here is what I did to get the data table: 

 

screenshot.PNG2017.PNG

dseals23
Fluorite | Level 6
But how to I export this data?
dseals23
Fluorite | Level 6

@Kurt_Bremser, I went back and made some suggestions based on what told me. I can now import the two different xlsx files into SAS right? Can you take a look one which is attached as a pic and the other as file?

 

Distribution along sex and age: 

 

 Ethnicity&SexData.PNG

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1395 views
  • 0 likes
  • 4 in conversation