Desktop productivity for business analysts and programmers

logically importing massive data in saseg 5.1

Reply
Regular Contributor
Posts: 220

logically importing massive data in saseg 5.1

/*Hi All, I really appreciate help on this one....*/

/*

This is about census data.

I am planning to get census data for each state from sf1 (summary file1) file. from here http://www2.census.gov/census_2010/04-Summary_File_1/. this one has a lot of detail regarding each state.

under each state they have 47 files which we can save in .txt on machine and can pull the data into the SAS. I have variable details for all 47 files for each state.

Question

can anyone please tell me how i can import 47 different files data which has different no. of obeservation and variables (has some common related variables in all 47 data files) at same time rather than pulling data seperately for each file?

I can import each data file simply using import process but it will take long for 47 files for each state. I am looking for some MACRO or any other kind of logical code which could bring in all 47 files at same time. so i don't have to do it seperately for 47 files.

Thanks all

*/


Regular Contributor
Posts: 220

Re: logically importing massive data in saseg 5.1

/*

forgot to add

will make one final dataset after importing all 47 files...

thanks

*/

Occasional Contributor
Posts: 14

Re: logically importing massive data in saseg 5.1

Sandy,  you look fantastic !!!   I'm think about a simple process where you can list all the states in a macro variable then parse the macro variable with another macro called %word_cnt.    Word count just counts the number of words in a macro variable is list format.  I have 2 %let statements one for the state names/file names and one for the result dataset(not necessary since you will use only the master set, _allstates).  The read_txtFiles will read all the txt files listed, I used proc import for the read and force on the append incase of attribute mismatches.  I have a process similar reading in multiple spreadsheets from a workbook and multiple files where I do a "%sysexec dir /OSmiley Very Happy "&_dir1\*.log" > "&_dir3\wc_file.txt" ;" to read in the directory where the files are located.


%macro _wordcntC(list);
%global wrd_cnt;
%let wrd_cnt=%sysfunc(countw(%nrquote(&list),%str(,)));
%put "# of words in list are  &wrd_cnt";
%mend _wordcntC;

******  keep what is above  ****************;
%let _states="Illinois, NewYork, New Mexico, California";
%let _dsnlst=Illinois, New York, New Mexico, California;
%_wordcntC(&_states);


%macro read_txtFiles(states);
%local state;

%do j = 1 %to &wrd_cnt;
  %let word=%qscan(%nrquote(&states),&j,%str(,));
  %let _dsn=%qscan(%nrquote(&_dsnlst),&j,%str(,));

  PROC IMPORT OUT=&dsn
   DATAFILE=&_dsn..txt;
   SCANTEXT=YES;
   USEDATE=YES;
   SCANTIME=YES;
  RUN;

proc append base=_allstates  data=&syslast force;
run;

proc datasets lib=WORK;
    delete &dsn;
quit;

    filename _all_ clear;
    run;
   
%end;
%mend read_txtFiles;

%read_txtFiles(&_states);

Regular Contributor
Posts: 220

Re: logically importing massive data in saseg 5.1

Thank you sir, i will consider this and will let you know soon how it goes...


Regular Contributor
Posts: 220

Re: logically importing massive data in saseg 5.1

hello sir, i may need your further help. i am not that familiar with macro that much. so if you can give me value to put in -  would appreciate your time...my detail is as below...

i have 2 folder and each contains 47 txt files. for now i am trying to pull up data for 2 states only. rhode island (ri) and delaware (de) for testing purpose...

ri - "C:\Documents and Settings\xyz\Desktop"

de - "C:\Documents and Settings\xyz\Desktop"

can you please correct my value i ahve put if its not correct...

Thanks.

/*before i go ahead and run the code i wants to make sure i am putting correct value in the code that you have provided -Thanks*/

%macro _wordcntC(ri,de); /*Here ri and de are the fodlers and each foldes has 47 .txt file*/
%global wrd_cnt;
%let wrd_cnt=%sysfunc(countw(%nrquote(&ri,de),%str(,)));
%put "# of words in list are  &wrd_cnt";
%mend _wordcntC;

******  keep what is above  ****************;


%let _states="ri,de";
%let _dsnlst=ri,de;
%_wordcntC(&_states);


%macro read_txtFiles(ri,de);
%local state;

%do j = 1 %to &wrd_cnt;
  %let word=%qscan(%nrquote(&states),&j,%str(,));
  %let _dsn=%qscan(%nrquote(&_dsnlst),&j,%str(,));

  PROC IMPORT OUT=&dsn
   DATAFILE=&_dsn..txt;
   SCANTEXT=YES;
   USEDATE=YES;
   SCANTIME=YES;
  RUN;

proc append base=_allstates  data=&syslast force;
run;

proc datasets lib=WORK;
    delete &dsn;
quit;

    filename _all_ clear;
    run;
   
%end;
%mend read_txtFiles;

%read_txtFiles(&_states);

Occasional Contributor
Posts: 14

Re: logically importing massive data in saseg 5.1

Close - the code is two macros, one to count the words in any macro variable separated by commas and a space.  The result is just the number of words in the list, in the example it should be two.  Your SAS log will show this.

The second macro is %read_txtFiles.  It will do just that.  It will run twice for de and ri, or to the count of &wrd_cnt.  parse the _states macro and read in each state file.  The proc import can be changed to any metod to import the txt file.

All you really need to change is the %let _states list with the list of states to read, originally 47.  change the %let _dsn to simple state names, can be the same as _states.

I have the options statement to show how the macro is resolving

%macro _wordcntC(list);
%global wrd_cnt;
%let wrd_cnt=%sysfunc(countw(%nrquote(&list),%str(,)));
%put "# of words in list are  &wrd_cnt";
%mend _wordcntC;
******  keep what is above  ****************;

******  my example of the state list  *****************;
%let _states="Illinois, NewYork, New Mexico, California";
******  single sas dataset output from macro  *********;
%let _dsnlst=Illinois, New York, New Mexico, California;

*****  call the macro _wordcntC to get the number of states in the %let _states list *****;
*****  for your example this should be 2, make sure you have spaces between the states like below ***;
%let _states="ri, de";
%let _dsnlst=ri, de;

%_wordcntC(&_states);

options mprint mlogic symbolgen;

%macro read_txtFiles(states);
%local state;

***                                                           ***;
***  read each state in from the _states list macro variable  ***;
***  assign it to a new macro variable word                   ***;
***  the proc import will import the file from the desktop    ***;
***  the result sas dsn will be appended to _allstates        ***;

%do j = 1 %to &wrd_cnt;
  %let word=%qscan(%nrquote(&states),&j,%str(,));
  %let _dsn=%qscan(%nrquote(&_dsnlst),&j,%str(,));

  PROC IMPORT OUT=&_dsn
   DATAFILE="C:\Documents and Settings\xyz\Desktop\&word..txt";
   SCANTEXT=YES;
   USEDATE=YES;
   SCANTIME=YES;
  RUN;

proc append base=_allstates  data=&syslast force;
run;

proc datasets lib=WORK;
    delete &dsn;
quit;

    filename _all_ clear;
    run;
   
%end;
%mend read_txtFiles;

%read_txtFiles(&_states);

Ask a Question
Discussion stats
  • 5 replies
  • 359 views
  • 0 likes
  • 2 in conversation