BookmarkSubscribeRSS Feed
romartins
Fluorite | Level 6

Hello,

I have an FTP where csv files are made available monthly and I need to import these csv files with a ";" as a separator, from FTP by SAS code.

 

The problem is that I need to check if these files were generated in a standardized way. Some necessary standards:
- File encoded as utf-8 so as not to lose the encoding of special characters;
- Have standardized file nomenclature and extension;
- Follow standardization in the fields, that is, order and nomenclature of variables. Always checking a naming pattern without accents or special characters, without spaces and with up to 32 characters. Through a pre-defined order and nomenclature.

 

So I have a few questions:
1 - How do I do these checks and import files via SAS code from FTP?
2 - Is there any way to generate the file in a standard way through Prepare Data in SAS VIYA and upload it to FTP or directly to a SASData folder?

 

Note: I need to generate a table that serves as control. To store the file name, extraction date, if there is any observation (default failure) and a flag informing whether the file has already been read or not. This table will be used to send emails periodically informing if any file has been made available outside of the standard and requires correction.

 

Fictional examples attached.
1 - ARQUIVO_2024_01.csv: File OK;
2 - ARQUIVO_2024_02.csv: File error;
3 - PADRAO_LAYOUT.csv: Columns standard.

 

Tested code to import file:

/*Import file from FTP*/
filename arq_raw FTP "&arquivo." user=&user. pass=&pass. host=&host. cd=&path. debug PASSIVE encoding="utf-8";
	
proc import datafile=arq_raw
	out=WORK.TMP_DADOS_FTP
	dbms=csv replace;
	delimiter=";";
	guessingrows=MAX;
run;

/*Releases connection to FTP*/
filename arq_raw clear;

 

I appreciate the colaboration of everyone.

2 REPLIES 2
Kurt_Bremser
Super User
Never use PROC IMPORT for CSV files. Always write the data step yourself.
This way you have full control over column names and attributes; for a check, you can read the header line in a separate step and check its contents against a reference.
Tom
Super User Tom
Super User

For the first problem make sure you are using a SAS session that is running with ENCODING='UTF-8'.  You can use the GETOPTION() function to check the setting of the ENCODING option.  You can add the ENCODING= option to the INFILE statement.

 

For the second problem get the filenames as data.   The FTP filename engine supports the ls command for FTP to get a directory listing. So you can use it to read the filenames into a dataset..  Then it is easy to check and make sure the filenames follow the pattern.

 

The final problem is solved by NOT using PROC IMPORT.  Instead since you seem to know exactly what variables should be in the file just write your own data step to read the file.  You can add a step before that to just read the header line and make sure that the file has the right columns.  It could even be part of the same data step.

For example two of your files seem to have the same columns.  So for those the step to read them might look like this:

data tmp_dados_ftp;
  infile arq_raw dsd dlm=';' truncover ;
  if _n_=1 then do;
     input ;
    if _infile_ ne 'ID;DATA;REGIAO;NOME' then do;
      put 'ERROR: The header line does not match.' / _infile_ ;
      stop;
    end;
  end;
  length ID DATA 8 REGIAO $10 NOME $10;
  format DATA date9.;
  informat DATA mmddyy. ;
  input ID -- NOME ;
run;

These files actually demonstrate one of the issues with using PROC IMPORT to guess how to read the CSV file.  The second column seems to be a date variable.  But the strings that are present could be interpreted as either MDY or DMY order (there are not day values larger than 12).  So PROC IMPORT might not guess to use the right one.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 265 views
  • 2 likes
  • 3 in conversation