BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
azertyuiop
Quartz | Level 8

Hello , Good morning ,

I own 5 Excel with the same scheme and structure .

The columns are here :


"N° Police"n > text format
Application > text format
"Nature événement"n text format
"Cause événement"n > text format
"Anomalie(s)"n > text format
"Date rejet initial"n > numeric format
Acteur > text format
Statut > text format
"Analyse Gestion"n  > text format
"Analyse MOE"  > text format
"Date correction"n  > text format (if the column under excel is empty) , date format (if there is one or more date in the column under excel) .              
Lead > text format


One time by day I import the group of 5 files Excel in my SAS program. I don't by advance if the column : "Date correction"n is empty or contain value.

Due to this case I want use a switch :

If the column is empty (the in format is text) , I want execute this query to convert a column in date format :

call execute (' proc sql ; create table '!!tp_1!!'2 as select "N° Police"n , Application , "Nature événement"n , "Cause événement"n , "Anomalie(s)"n , "Date rejet initial"n , acteur , statut , "Analyse Gestion"n , "Analyse MOE"n , '!!' case when "Date correction"n like '' then ''d end as dtcorr '!!' , Lead from work.'!!tp_1!!' ; run ; ');


Else I want execute the standard query for the date format

call execute (' proc sql ; create table '!!tp_1!!'2 as select "N° Police"n , Application , "Nature événement"n , "Cause événement"n , "Anomalie(s)"n , "Date rejet initial"n , acteur , statut , "Analyse Gestion"n , "Analyse MOE"n , '!!' case when "Date correction"n < 0 then "Date correction"n  else ''d end as dtcorr '!!' , Lead from work.'!!tp_1!!' ; run ; ');


My program is like this :

options compress=yes sortsize=max;

/* macro d'import des données de Excel */

%impxlsx2('\\serveur\underfile\INPUT_test\8X_DECES-SUIVI_ANO_FCT.xlsx', huitX_DECES_imp , 'Anomalies à traiter');
%impxlsx2('\\serveur\underfile\INPUT_test\8X_SUIVI_ANO_FCT.xlsx', huitX_imp , 'Anomalies à traiter');
%impxlsx2('\\serveur\underfile\INPUT_test\OBSEQUES_SUIVI_ANO_FCT.xlsx', OBSEQUES_imp , 'Anomalies à traiter');
%impxlsx2('\\serveur\underfile\INPUT_test\PEGASE_DECES-SUIVI_ANO_FCT.xlsx', PEGASE_DECES_imp , 'Anomalies à traiter');
%impxlsx2('\\serveur\underfile\INPUT_test\PEGASE_SUIVI_ANO_FCT.xlsx', PEGASE_imp , 'Anomalies à traiter');

/* conversion du champ date de correction en champ date vide lorsque celui ci n'est pas renseigné */

%let tp1='huitX_DECES_imp';
%let tp2='huitX_imp';
%let tp3='OBSEQUES_imp';
%let tp4='PEGASE_DECES_imp';
%let tp5='PEGASE_imp';

data _null_ ;

tp_1=&tp1;
tp_2=&tp2;
tp_3=&tp3;
tp_4=&tp4;
tp_5=&tp5;

/* a function with switch when this case is in error go automatically to the other case , and SAS doesn't stop the program */
/* it's similary to iferror in SAS or try catch in java */

/* if " call excuete 1 " >> generate an error */
/* " call execute 2 " */
/* else */
/* " call execute 1 " */

run;



It's possible to place a switch or try catch in the present case with th call execute ?

Thanks for your help

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@azertyuiop wrote:

Excel files in entrance in SAS program can own a great number of value , between 1000 to 10000 lines.

 

I don't know by advance if the column "Date correction"n is empty or contain data. If the column "Date correction"n is empty SAS generates a text format , when sas detectes a value the format is date.

 

In a second step I must union all table in one big table . In the present case "Date correction"n  can own two format if the data is presented in the column.

 

I want constrain SAS to own a only format > date format.

 

Due to a specificities and technicals constraints I must use proc import , not a libname. The files excel in entrance are use by others programm or by several user.


The engines that read Excel directly are pretty limited in terms of how many records to examine to guess the variable type.

If you are going to deal with a number of files that are supposed to be the same structure then a more reliable approach is:

1) save one of the files to CSV

2) use Proc import with are large value  of the GUESSINGROWS option to examine many rows to "guess" the variable types.

3) Note that Proc Import for CSV will create data step code that you can copy from the log (or recall from submission) and edit.

   This is the time to examine if the defaults make sense in terms of long enough text variables for expected values, names of variable, ensuring character vs numeric types, informats such as dates or even custom informats if desired, assigned formats, add labels and possibly even initial data quality checks or addition of calculated variables. It is usually very easy to set the variable properties if you have a description document of the file contents.

4) Saving that import generated code with any options or changes as needed then you can run the code against a different CSV file of the same format by changing the Infile statement for the source and the Data set name for the destination.

 

View solution in original post

8 REPLIES 8
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Errm why?  One of the key points about programming in SAS is knowing your data, working with fixed data structures.  If you start importing data in different ways each time, then each step in your program then has to account for different data types and thus be longer and more unstable.  Does it really matter that the column is missing nueric or text - no, not at all.  There is no benefit to changin the type of a variable based on whether it is missing or not.  As for how to do this, it seems like your using proc import (bad idea straight away as it gueses the file), but basically you would read the file, decide what to do, then convert it if needed.  Its a lot of hassel for no benefit.

 

 

azertyuiop
Quartz | Level 8

Excel files in entrance in SAS program can own a great number of value , between 1000 to 10000 lines.

 

I don't know by advance if the column "Date correction"n is empty or contain data. If the column "Date correction"n is empty SAS generates a text format , when sas detectes a value the format is date.

 

In a second step I must union all table in one big table . In the present case "Date correction"n  can own two format if the data is presented in the column.

 

I want constrain SAS to own a only format > date format.

 

Due to a specificities and technicals constraints I must use proc import , not a libname. The files excel in entrance are use by others programm or by several user.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, I can't understand most of your post.  If you have to use proc import, and have to only use Excel (i.e. you can't just save the file somewhere else as csv then write a datastep import), then you have no options - Excel is a really poor data medium, and proc import is a guessing procedure = you will have problems.  These are the simple facts of life.

Thus you will need to do some post processing on the imported datafile, check the column in sashelp.vcolumns to see if it is date or character, then convert it in a datastep before setting these data together.  Again, this is a solution to a problem of your own restrictions.

ballardw
Super User

@azertyuiop wrote:

Excel files in entrance in SAS program can own a great number of value , between 1000 to 10000 lines.

 

I don't know by advance if the column "Date correction"n is empty or contain data. If the column "Date correction"n is empty SAS generates a text format , when sas detectes a value the format is date.

 

In a second step I must union all table in one big table . In the present case "Date correction"n  can own two format if the data is presented in the column.

 

I want constrain SAS to own a only format > date format.

 

Due to a specificities and technicals constraints I must use proc import , not a libname. The files excel in entrance are use by others programm or by several user.


The engines that read Excel directly are pretty limited in terms of how many records to examine to guess the variable type.

If you are going to deal with a number of files that are supposed to be the same structure then a more reliable approach is:

1) save one of the files to CSV

2) use Proc import with are large value  of the GUESSINGROWS option to examine many rows to "guess" the variable types.

3) Note that Proc Import for CSV will create data step code that you can copy from the log (or recall from submission) and edit.

   This is the time to examine if the defaults make sense in terms of long enough text variables for expected values, names of variable, ensuring character vs numeric types, informats such as dates or even custom informats if desired, assigned formats, add labels and possibly even initial data quality checks or addition of calculated variables. It is usually very easy to set the variable properties if you have a description document of the file contents.

4) Saving that import generated code with any options or changes as needed then you can run the code against a different CSV file of the same format by changing the Infile statement for the source and the Data set name for the destination.

 

azertyuiop
Quartz | Level 8

Hello / Good morning ,

 

I take a note for the format in CSV. It's better than Excel to import and control the quality of data.

 

I know the option "guessingrows".

 

I my case if I have used xls/xlsx format it's due to a command of severals departements in my company . In this departement the reference to work it's Excel and they refuse to use CSV/TXT format.

 

You have reason when you say that the security to control with csv format is better en easier .

 

Kurt_Bremser
Super User

Get rid of the Excel crap, use a proper file format and a fixed data step, period. Everything else is plain stupid and a total waste of time.

 

Just look at the silly hoops you have to jump through just because some pea-brain is too dumb to hit "save as csv". Disgusting.

user24feb
Barite | Level 11
Don't try to transfer habits to from other programming languages. Code it the SAS-way. (IMO Excel-imports are always messy, regardless of the language.)
azertyuiop
Quartz | Level 8

Hello / Good morning ,

 

It's not  good idea to find a similar solution which is like in an other langage ?

 

After severals hours I propose this solution :

 

data _null_ ;

do i=1 to 5;

    call execute (' data work.tp'!!cats(i)!!' ; ');

    call execute (' set work.tp'!!cats(i)!!' ; ');

    call execute (' length dtcorr2 8. ; ');
    call execute (' format dtcorr2 date9. ; ');
    call execute (' informat dtcorr2 date9. ; ');
    call execute (' dtcorr2=input("Date correction"n,12.) ; ');
    call execute (' length nopol2 coapp2 lbnateve2 lbcaueve2 anomalie2 lbact2 lbsta2 anages2 anamoe2 Lid2 $1500. ; ');
    call execute (' format nopol2 coapp2 lbnateve2 lbcaueve2 anomalie2 lbact2 lbsta2 anages2 anamoe2 Lid2 $1500. ; ');
    call execute (' informat nopol2 coapp2 lbnateve2 lbcaueve2 anomalie2 lbact2 lbsta2 anages2 anamoe2 Lid2 $1500. ; ');
    call execute (' nopol2="N° Police"n ; ') ;
    call execute (' coapp2=Application  ; ') ;
    call execute (' lbnateve2="Nature événement"n  ; ') ;
    call execute (' lbcaueve2="Cause événement"n ; ') ;
    call execute (' anomalie2="Anomalie(s)"n ; ') ;
    call execute (' lbact2=Acteur ; ') ;
    call execute (' lbsta2=Statut ; ') ;
    call execute (' anages2="Analyse Gestion"n ; ') ;
    call execute (' anamoe2="Analyse MOE"n ; ') ;
    call execute (' Lid2=Lead ; ') ;
    call execute (' drop "Date correction"n ; ');
 
    call execute ('run;');

end;

run;

data cinq_fichiers;

length fichier $100;

set
work.tp1 (in=a)
work.tp2 (in=b)
work.tp3 (in=c)
work.tp4 (in=d)
work.tp5 (in=e)
/*indsname=inds*/;

rename
nopol2=nopol
coapp2=coapp  
lbnateve2=lbnateve
lbcaueve2=lbcaueve
anomalie2=anomalie  
"Date rejet initial"n=dtcrerej
lbact2=lbact
lbsta2=lbsta
anages2=anages  
anamoe2=anamoe
dtcorr2=dtcorr
Lid2=Lid;

/*fichier = inds;*/
if a then fichier="huitX_DECES_imp";
if b then fichier="huitX_imp";
if c then fichier="OBSEQUES_imp";
if d then fichier="PEGASE_DECES_imp";
if e then fichier="PEGASE_imp";

run;

You can see a call execute with a loop because there are 5 tables and it's necessary to repeat the operation in my case .

 

You want propose an other solution or you think that my program owns a risk ? You can give you solution 😄

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 4954 views
  • 6 likes
  • 5 in conversation