BookmarkSubscribeRSS Feed
owenwqp1
Obsidian | Level 7
Hi everyone, I am dealing with 3 digit sic industry data by states. My data in excel are with names like Florida_101, Texas_469 and so on. When I import with macro do loop like "%do i=101 %to 469", it works although for each state the 3 digit sic industry are not consecutive from 101 to 469, because physical excel datasets for those non-sic 3 digit industry code numbers are missing. But when I use data step to deal with variables in the imported datasets(with do loop in macro: %do i=101 %to 469), I got many empty datasets. How could I solve this problem? macro like %data(i,j) are not acceptable because there will be a very long list of %data(i,j). Thanks!
9 REPLIES 9
Astounding
PROC Star

Take a look at the EXIST and FEXIST functions.  They tell you whether a SAS data set exists (EXIST) or a file exists (FEXIST). 

 

Inside your %DO loop,  you can add an interior %DO group that basically says, "If the file I'm looking for exists, then %DO."

Tom
Super User Tom
Super User

You need to provide more information if you want help. Why would a loop generating data step code be any different than a loop generating proc import code?

 

Why are you looping from a to b if every number in between a and b does not exist?  Why not set it up to loop over a list of valid values?

%let valid_list=Florida_101 Texas_469 ;
%do i=1 %to %sysfunc(coutw(&valid_list,%str ));
  %let next=%scan(&valid_list,&i,%str( ));
  ....
%end;
owenwqp1
Obsidian | Level 7
@Tom Thank you Tom. To be more concrete, I have sic3 data of states. For example, I have Florida_101, Florida_102, Florida_103, Florida_105, Florida_201, Florida_202, Florida_203, Florida_204, Florida_209, Florida_301, Florida_302, Florida_303,Florida_304, Florida_306; and for Texas, I also have Texas_101, Texas_102, Texas_103, Texas_105, Texas_201, Texas_202, Texas_203, Texas_204, Texas_209, Texas_301, Texas_302, Texas_303,Texas_304, Texas_306. These original data are in excel format, I imported into sas with "%do i=101 %to306". I got only Florida_101, Florida_102, Florida_103, Florida_105, Florida_201, Florida_202, Florida_203, Florida_204, Florida_209, Florida_301, Florida_302, Florida_303,Florida_304, Florida_306 and Texas_101, Texas_102, Texas_103, Texas_105, Texas_201, Texas_202, Texas_203, Texas_204, Texas_209, Texas_301, Texas_302, Texas_303,Texas_304, Texas_306 in sas format. But when I use data step, I got Florida_101.......... to ..........Florida_306, more than 200 datasets in sas. I want to know how to get only Florida_101, Florida_102, Florida_103, Florida_105, Florida_201, Florida_202, Florida_203, Florida_204, Florida_209, Florida_301, Florida_302, Florida_303,Florida_304, Florida_306 when I use data step.Thanks!
Tom
Super User Tom
Super User

Still not sure if I understand.  If you imported them into WORK library and you want all of the FLORIDA datasets then why not just use wild cards.

 

data all_florida ;
  set florida_: ;
run;
owenwqp1
Obsidian | Level 7
@Tom Thank you Tom. The key point of my problem is that, for some states and even the same state,the same variables in different years may be of different type because in some years variable X1 may be numeric and in another year X1 becomes a character variable when imported to sas due to lack of data in that year. So when I use wild cards, it does not work.
Tom
Super User Tom
Super User

@owenwqp1 wrote:
@Tom Thank you Tom. The key point of my problem is that, for some states and even the same state,the same variables in different years may be of different type because in some years variable X1 may be numeric and in another year X1 becomes a character variable when imported to sas due to lack of data in that year. So when I use wild cards, it does not work.

So finally you have stated an actual problem.  This is caused by using PROC IMPORT to guess about what is in your source files.  If you can convert the source files to text files (like CSV files for example) then you can read them in consistently and you will no longer have this problem.

 

If your problem is just with empty variables that got created as numeric instead of character then the solution is to just drop them. (if they are empty you will lose nothing). 

 

Automating this might be more work than its worth.  Personally I would just get the list of files.  For example you could run a simple step like this 

data _null_;
  set sashelp.vtable ;
  where libname = 'WORK' and memname like 'FLORIDA_%' ;
  put memname '(drop=)';
run;

and then  copy the list from the log into a new data step and add in any variables that need to be dropped.

data all_florida ;
  set
FLORIDA_101 (drop=)
FLORIDA_102 (drop=X1)
  ;
run;

If you really need to automate then you should create a dataset to use as a template and query the metadata for your new files and compare the the expected result.  Then you can use the result of that comparison to generate appropriate code for each dataset that will drop , rename , transform etc any variables that got defined in the wrong way.

 

As to the question in your title the example data step above is probably a good place to start for the "looping".  So instead of looping in macro code make the normal data step iteration your loop.  For example if you created a macro named %FIX() that  could fix up the variables for one of the files then you could use CALL EXECUTE in a data step to call it for all of the datasets.

data _null_;
  set sashelp.vtable ;
  where libname='WORK' and memname like 'FLORIDA_%' ;
  call execute(cats('%nrstr(%fix)(',memname,')'));
run;

 

owenwqp1
Obsidian | Level 7
@Kurt_Bremser Thank you KurtBremser! But I have no idea about the call execute routine,could you kindly give me the sample code about my problem so that I can start with? Thanks a lot!
Kurt_Bremser
Super User

@owenwqp1 wrote:
@Kurt_Bremser Thank you KurtBremser! But I have no idea about the call execute routine,could you kindly give me the sample code about my problem so that I can start with? Thanks a lot!

I see that @Tom has already given you a nice example in the meantime.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1911 views
  • 0 likes
  • 4 in conversation