BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
alc
Calcite | Level 5 alc
Calcite | Level 5

Hi

I am a non SAS user who has been "using" SAS to open some dbat files files to copy and past annual data from 1985-2010 into Excel.  These are very large files but I am working with a small amount of data.  This approach has worked well given my lack of skills with SAS.  I found, however, that the data elements that I am copying were not extracted from the CSV files from 2007-2010.  The organization from which I got the data keeps its programs and dbat files in the public domain so they pointed me to the data and their public domain programs saying that I should extract the data myself.  They are a small shop and have been very helpful to my work.  The problem is, however, I really don't know what to do with the program which is pasted below:

*options obs=55;

options nocenter;

*by Jean Roth, jroth@nber.org, 2007-11-09 ;

*NOTE:  This program is distributed under the GNU GPL. See end of

*this file and http://www.gnu.org/licenses/ for details. ;

*  The following line should contain the directory

   where the SAS file is to be stored  ;

libname library "~/bulk/hcris";

*  The following line should contain

   the complete path and name of the raw data file.

   On a PC, use backslashes in paths as in C:\  ;

%macro loop;

%do year=1995 %to 2007;

proc printto log  ="./read_hosp_rpt_alphnmrc&year..log";

proc printto print="./read_hosp_rpt_alphnmrc&year..lst";

FILENAME datafile "/homes/data/hcris/fy/hosp_&year._ALPHA.CSV";

*  The following line should contain the name of the SAS dataset ;

%let dataset=hosp_alpha&year._long ;

data library.&dataset.;

*Variable           Maximum

---------           -------

RPT_REC_NUM         64331

WKSHT_CD            "S89R000"

LINE_NUM            "09919"  (min="00000". "00100" I think means "1". "00101" can mean "101" or "1.01", depending on the worksheet)

CLMN_NUM            "0600" (means "6", I think.   Values are 0000, 0100, 0200, 0300, 0400, 0500, and 0600)

;

length wksht_cd $7. line_num $5. clmn_num $4. alphnmrc_itm_txt $45. default = 4  ;

* '2C' is hexadecimal for decimal 44 which represents ',' ;

* '0D' is hexadecimal for decimal 13 which represents '\r', which is the carriage return character;

infile datafile dsd delimiter='2C0D'x  ;

INPUT

rpt_rec_num

wksht_cd $

line_num $

clmn_num $

alphnmrc_itm_txt $

;

proc means data=library.&dataset. max n;

title "Inspect maximums:  Four bytes will retain six significant digits" ;

title2 "&dataset.";

proc freq data=library.&dataset.;

title "";

tables WKSHT_CD LINE_NUM CLMN_NUM ;

proc contents data=library.&dataset.;

data &dataset.;

set library.&dataset.;

proc sort data=&dataset. nodupkey;

by wksht_cd clmn_num line_num  ;

proc print data=&dataset. noobs;

var wksht_cd clmn_num line_num;

%end;

%mend;

%loop;

/*

Copyright 2007 shared by Jean Roth and the National Bureau of Economic Research

National Bureau of Economic Research.

1050 Massachusetts Avenue

Cambridge, MA 02138

jroth@nber.org

This program and all programs referenced in it are free software. You

can redistribute the program or modify it under the terms of the GNU

General Public License as published by the Free Software Foundation;

either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,

but WITHOUT ANY WARRANTY; without even the implied warranty of

MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

GNU General Public License for more details.

You should have received a copy of the GNU General Public License

along with this program; if not, write to the Free Software

Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307

USA.

I am wondering if this is difficult to do?  Can you recommend a reference that will step me through using this program to complete my work.

I hope this question is within the scope of this site.

Thank you

Al

1 ACCEPTED SOLUTION

Accepted Solutions
alc
Calcite | Level 5 alc
Calcite | Level 5

Thank you for taking the time to answer my question.  I will see if I can make it work tomorrow.  I uses the SAS programs at a local University.

Thanks again.

Al

View solution in original post

4 REPLIES 4
art297
Opal | Level 21

If you remove the comments, it is easier to see what you have to change.  I put my own extra comments in CAPS.

YOU CAN RUN THE CODE BY COPYING AND PASTING IT INTO THE SAS ENHANCED EDITOR AND THEN CLICKING ON THE RUN ICON

/* CHANGE THE FOLLOWING LINE TO REFLECT WHERE YOU WANT THE SAS FILES TO BE CREATED */

/* IF YOU WANT THEM PUT AT THE ROOT OF YOUR C: DRIVE THE FOLLOWING LINE IS OK AS IS */

libname library "C:\";

%macro loop;

  /* THE FOLLOWING LINE CAUSE THE CODE TO RUN IN A LOOP FOR EACH YEAR */

  %do year=1995 %to 2007;

       /* THE NEXT TWO LINES HAVE TO CHANGED IF YOU DON'T WANT THE LOG AND PRINTOUT*/

       /* PUT IN THE ROOT OF YOUR C: DRIVE */

       proc printto log  ="c:\read_hosp_rpt_alphnmrc&year..log";

       proc printto print="c:\read_hosp_rpt_alphnmrc&year..lst";

       /* THE NEXT LINE ASSUMES THAT YOUR DOWNLOADED CSV FILES WILL BE FOUND AT THE */

       /* ROOT OF YOUR C: DRIVE */

       FILENAME datafile "c:\fy/hosp_&year._ALPHA.CSV";

       /* THE FOLLOWING LINE WILL CREATE THE SAS FILES LABELING THEM WITH */

       /* hosp_alpha  FOLLOWED BY THE YEAR   FOLLOWED BY THE STRING _long */

      %let dataset=hosp_alpha&year._long ;

      /* THE FOLLOWING LINES CREATE THE FILES IN YOUR SPECIFIED DIRECTORY */

     data library.&dataset.;

        length wksht_cd $7. line_num $5. clmn_num $4. alphnmrc_itm_txt $45. default = 4  ;

        infile datafile dsd delimiter='2C0D'x  ;

        INPUT rpt_rec_num

                     wksht_cd $

                     line_num $

                    clmn_num $

                    alphnmrc_itm_txt $;

       run;

       /* THE FOLLOWING LINES WILL PRINT THE MAX AND N OF THE NUMERIC VARIABLES */

       proc means data=library.&dataset. max n;

         title "Inspect maximums:  Four bytes will retain six significant digits" ;

         title2 "&dataset.";

       run;

       /* THE FOLLOWING LINES PRINT FREQUENCY DISTRIBUTIONS OF THE SPECIFIED FIELDS */

     proc freq data=library.&dataset.;

          title "";

          tables WKSHT_CD LINE_NUM CLMN_NUM ;

       run;

       /* THE FOLLOWING LINES PRINT A SUMMARY OF THE FILE */

     proc contents data=library.&dataset.;

      run;

      /* THE FOLLOWING CREATE A WORK DIRECTORY COPY OF THE FILE */

     data &dataset.;

        set library.&dataset.;

      run;

     /* THE FOLLOWING LINES SORT THE WORK DIRECTORY COPY OF THE FILE */

    proc sort data=&dataset. nodupkey;

       by wksht_cd clmn_num line_num  ;

     run;

     /* THE FOLLOWING LINES PRINT THE SPECIFIED DATA */

    proc print data=&dataset. noobs;

        var wksht_cd clmn_num line_num;

     run;

  /* THE FOLLOWING LINE ENDS THE LOOP */

  %end;

  /* THE FOLLOWING LINE ENDS THE CODE (CALLED A MACRO) */

%mend;

/* THE FOLLOWING LINE RUNS THE ABOVE MACRO */

%loop

alc
Calcite | Level 5 alc
Calcite | Level 5

Thank you for taking the time to answer my question.  I will see if I can make it work tomorrow.  I uses the SAS programs at a local University.

Thanks again.

Al

Tom
Super User Tom
Super User

If you ran this program, the reason it stopped at 2007 is that the DO loop only runs from 1995 to 2007.  If you have more years data then you can just change the limits of the %DO YEAR= statement.  So if you downloaded data for 2007 to 2010 then you probably want the statement to be:

%do year=2007 %to 2010 ;

alc
Calcite | Level 5 alc
Calcite | Level 5

Hi

Thank you for your comments.  I may have selected the wrong program from www.nber.org/data/hcris.html to illustrate my question.  I open the SAS files of annual dta and copy/paste into Excel where I create very simply metrics e.x., rates of increase, profit margins, etc.  I found all the data I wanted from 1996-2006.  From 2007 forward, they created annual files but excluded the 5 or so columns that I have been working with.  Hence the need to extract the data for 2007 to 2010.

I will be happy when I am over this hump, one way or another.

Al

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 998 views
  • 0 likes
  • 3 in conversation