Hi folks,
I'm trying to work with some of the US Mortality Multiple Cause files that are available for public use on the web but I'm unfamiliar with the file format. When I extract the files the dataset is in a DUSMCPUB file format. I can open it with Notepad++ and it looks like a text file, but I'm not sure. The codebooks say the variables are defined by tape positions.
Do you know how I can import these files? I'm unsure what the DBMS format should be using proc import and I don't see the file type in the wizard.
Thanks in advance.
What files exactly are you talking about? This site? https://www.nber.org/research/data/mortality-data-vital-statistics-nchs-multiple-cause-death-data
They have zipped SAS7BDAT files to download.
No, the ones I have are from the National Center for Health Statistics.
https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm#Mortality_Multiple
I'll take a closer look at the NBER files. They may be the same. I didn't expect to see health data from the National Bureau of Economic Research...
I don't see anything there talking about tapes. It just looks like a normal description of fixed length data records.
For example here is the description of one variable.
So it is saying that MRACE6 is the mother's race coded into one of 6 categories. It starts at the 107th character on the line and uses two characters (why it uses 2 I have no idea; maybe they wanted an extra space in case the coding later changed to require 2 characters).
So to read it with an INPUT statement all you need to do is:
input
...
mrace6 107-108
...
;
or possibly
input
...
@107 mrace6 2.
...
;
You might also want to create a FORMAT that can convert the value 1 to the string 'White (only)' so it will be easier to read the tables you produce.
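Putting those pieces together, a minimal sketch of a complete step might look like the following. The file path, the data set name, and the format name mrace6f are placeholders I made up; only the column position of MRACE6 and the label for code 1 come from the discussion above, so take the remaining variables and value labels from the codebook.

```sas
/* Hypothetical path: point this at the extracted fixed-width text file. */
filename mort "C:\data\mortality_file.DUSMCPUB";

proc format;
  /* Only the label for code 1 is taken from this thread;
     add the other codes from the codebook. */
  value mrace6f
    1 = 'White (only)';
run;

data work.mort_raw;
  /* TRUNCOVER keeps short records from spilling onto the next line.
     LRECL= is set generously here; use the record length in the codebook. */
  infile mort lrecl=32767 truncover;
  input
    @107 mrace6 2.   /* columns 107-108 of each record */
  ;
  format mrace6 mrace6f.;
run;
```

No PROC IMPORT or DBMS= value is involved here: a DATA step with INFILE and INPUT reads the text file directly, whatever its extension.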
Tape position? I have to see this. I haven't read tape with SAS since 1987.
@ballardw - If it was punched card position I'd be even more impressed....given SAS has a CARDS statement 😊
When I started programming in FORTRAN it was on punch-cards.
Same!
Right? I suspect it's just a legacy jargon situation.
@Beanpot wrote:
Right? I suspect it's just a legacy jargon situation.
One certainly hopes so. Unless the data relates to measurements taken with a tape measure...
Did you get this figured out? I am using the same data (multiple causes of death) and have not figured out how to write a program to merge multiple datasets across time in SAS 9.4. I would greatly appreciate any guidance or statements you may have used.
You might be better off starting your own thread and pointing to this one (copy the URL) to ask new questions.
Since a SAS "merge" is a side-by-side record combination, I suspect you actually want to append the sets (stack them vertically, year after year, for example), as I can't see any use for a side-by-side merge.
If you have SAS data sets for each year then you could use Proc Append to create a combined file:
Proc append base=want data=year2011;
run;
Proc append base=want data=year2012;
run;
Proc Append base=want data=year2013;
run;
Repeat as needed. Want is the name you choose for the combined data set. Include a library as desired. I picked some arbitrary year-named files as examples.
Append will run faster than data step combinations because it does not reread the base data set. By default it also warns about differences in variable names or types and does not allow the append to occur, giving you a chance to address any data issues.
Append expects all the variables to be of the same name and type in all the data sets.
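If you also need to know which yearly file each record came from, a data step is an alternative to Proc Append. This is only a sketch using the same made-up data set names as above; the variable name source is my own choice, and the INDSNAME= option needs SAS 9.2 or later.

```sas
/* Stack the yearly data sets and record the source of each row. */
data want;
  length source $41;            /* room for LIBREF.MEMBERNAME */
  set year2011 year2012 year2013 indsname=dsn;
  source = dsn;                 /* e.g. WORK.YEAR2011 */
run;
```

Unlike Proc Append, this rereads every input data set, so it is slower on large files, and it will stop with an error if a variable is character in one year and numeric in another.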