Hello , Good morning ,
I own 5 Excel with the same scheme and structure .
The columns are here :
"N° Police"n > text format
Application > text format
"Nature événement"n text format
"Cause événement"n > text format
"Anomalie(s)"n > text format
"Date rejet initial"n > numeric format
Acteur > text format
Statut > text format
"Analyse Gestion"n > text format
"Analyse MOE" > text format
"Date correction"n > text format (if the column under excel is empty) , date format (if there is one or more date in the column under excel) .
Lead > text format
One time by day I import the group of 5 files Excel in my SAS program. I don't by advance if the column : "Date correction"n is empty or contain value.
Due to this case I want use a switch :
If the column is empty (the in format is text) , I want execute this query to convert a column in date format :
call execute (' proc sql ; create table '!!tp_1!!'2 as select "N° Police"n , Application , "Nature événement"n , "Cause événement"n , "Anomalie(s)"n , "Date rejet initial"n , acteur , statut , "Analyse Gestion"n , "Analyse MOE"n , '!!' case when "Date correction"n like '' then ''d end as dtcorr '!!' , Lead from work.'!!tp_1!!' ; run ; ');
Else I want execute the standard query for the date format
call execute (' proc sql ; create table '!!tp_1!!'2 as select "N° Police"n , Application , "Nature événement"n , "Cause événement"n , "Anomalie(s)"n , "Date rejet initial"n , acteur , statut , "Analyse Gestion"n , "Analyse MOE"n , '!!' case when "Date correction"n < 0 then "Date correction"n else ''d end as dtcorr '!!' , Lead from work.'!!tp_1!!' ; run ; ');
My program is like this :
options compress=yes sortsize=max; /* macro d'import des données de Excel */ %impxlsx2('\\serveur\underfile\INPUT_test\8X_DECES-SUIVI_ANO_FCT.xlsx', huitX_DECES_imp , 'Anomalies à traiter'); %impxlsx2('\\serveur\underfile\INPUT_test\8X_SUIVI_ANO_FCT.xlsx', huitX_imp , 'Anomalies à traiter'); %impxlsx2('\\serveur\underfile\INPUT_test\OBSEQUES_SUIVI_ANO_FCT.xlsx', OBSEQUES_imp , 'Anomalies à traiter'); %impxlsx2('\\serveur\underfile\INPUT_test\PEGASE_DECES-SUIVI_ANO_FCT.xlsx', PEGASE_DECES_imp , 'Anomalies à traiter'); %impxlsx2('\\serveur\underfile\INPUT_test\PEGASE_SUIVI_ANO_FCT.xlsx', PEGASE_imp , 'Anomalies à traiter'); /* conversion du champ date de correction en champ date vide lorsque celui ci n'est pas renseigné */ %let tp1='huitX_DECES_imp'; %let tp2='huitX_imp'; %let tp3='OBSEQUES_imp'; %let tp4='PEGASE_DECES_imp'; %let tp5='PEGASE_imp'; data _null_ ; tp_1=&tp1; tp_2=&tp2; tp_3=&tp3; tp_4=&tp4; tp_5=&tp5; /* a function with switch when this case is in error go automatically to the other case , and SAS doesn't stop the program */ /* it's similary to iferror in SAS or try catch in java */ /* if " call excuete 1 " >> generate an error */ /* " call execute 2 " */ /* else */ /* " call execute 1 " */ run;
It's possible to place a switch or try catch in the present case with th call execute ?
Thanks for your help
@azertyuiop wrote:
Excel files in entrance in SAS program can own a great number of value , between 1000 to 10000 lines.
I don't know by advance if the column "Date correction"n is empty or contain data. If the column "Date correction"n is empty SAS generates a text format , when sas detectes a value the format is date.
In a second step I must union all table in one big table . In the present case "Date correction"n can own two format if the data is presented in the column.
I want constrain SAS to own a only format > date format.
Due to a specificities and technicals constraints I must use proc import , not a libname. The files excel in entrance are use by others programm or by several user.
The engines that read Excel directly are pretty limited in terms of how many records to examine to guess the variable type.
If you are going to deal with a number of files that are supposed to be the same structure then a more reliable approach is:
1) save one of the files to CSV
2) use Proc import with are large value of the GUESSINGROWS option to examine many rows to "guess" the variable types.
3) Note that Proc Import for CSV will create data step code that you can copy from the log (or recall from submission) and edit.
This is the time to examine if the defaults make sense in terms of long enough text variables for expected values, names of variable, ensuring character vs numeric types, informats such as dates or even custom informats if desired, assigned formats, add labels and possibly even initial data quality checks or addition of calculated variables. It is usually very easy to set the variable properties if you have a description document of the file contents.
4) Saving that import generated code with any options or changes as needed then you can run the code against a different CSV file of the same format by changing the Infile statement for the source and the Data set name for the destination.
Errm why? One of the key points about programming in SAS is knowing your data, working with fixed data structures. If you start importing data in different ways each time, then each step in your program then has to account for different data types and thus be longer and more unstable. Does it really matter that the column is missing nueric or text - no, not at all. There is no benefit to changin the type of a variable based on whether it is missing or not. As for how to do this, it seems like your using proc import (bad idea straight away as it gueses the file), but basically you would read the file, decide what to do, then convert it if needed. Its a lot of hassel for no benefit.
Excel files in entrance in SAS program can own a great number of value , between 1000 to 10000 lines.
I don't know by advance if the column "Date correction"n is empty or contain data. If the column "Date correction"n is empty SAS generates a text format , when sas detectes a value the format is date.
In a second step I must union all table in one big table . In the present case "Date correction"n can own two format if the data is presented in the column.
I want constrain SAS to own a only format > date format.
Due to a specificities and technicals constraints I must use proc import , not a libname. The files excel in entrance are use by others programm or by several user.
Sorry, I can't understand most of your post. If you have to use proc import, and have to only use Excel (i.e. you can't just save the file somewhere else as csv then write a datastep import), then you have no options - Excel is a really poor data medium, and proc import is a guessing procedure = you will have problems. These are the simple facts of life.
Thus you will need to do some post processing on the imported datafile, check the column in sashelp.vcolumns to see if it is date or character, then convert it in a datastep before setting these data together. Again, this is a solution to a problem of your own restrictions.
@azertyuiop wrote:
Excel files in entrance in SAS program can own a great number of value , between 1000 to 10000 lines.
I don't know by advance if the column "Date correction"n is empty or contain data. If the column "Date correction"n is empty SAS generates a text format , when sas detectes a value the format is date.
In a second step I must union all table in one big table . In the present case "Date correction"n can own two format if the data is presented in the column.
I want constrain SAS to own a only format > date format.
Due to a specificities and technicals constraints I must use proc import , not a libname. The files excel in entrance are use by others programm or by several user.
The engines that read Excel directly are pretty limited in terms of how many records to examine to guess the variable type.
If you are going to deal with a number of files that are supposed to be the same structure then a more reliable approach is:
1) save one of the files to CSV
2) use Proc import with are large value of the GUESSINGROWS option to examine many rows to "guess" the variable types.
3) Note that Proc Import for CSV will create data step code that you can copy from the log (or recall from submission) and edit.
This is the time to examine if the defaults make sense in terms of long enough text variables for expected values, names of variable, ensuring character vs numeric types, informats such as dates or even custom informats if desired, assigned formats, add labels and possibly even initial data quality checks or addition of calculated variables. It is usually very easy to set the variable properties if you have a description document of the file contents.
4) Saving that import generated code with any options or changes as needed then you can run the code against a different CSV file of the same format by changing the Infile statement for the source and the Data set name for the destination.
Hello / Good morning ,
I take a note for the format in CSV. It's better than Excel to import and control the quality of data.
I know the option "guessingrows".
I my case if I have used xls/xlsx format it's due to a command of severals departements in my company . In this departement the reference to work it's Excel and they refuse to use CSV/TXT format.
You have reason when you say that the security to control with csv format is better en easier .
Get rid of the Excel crap, use a proper file format and a fixed data step, period. Everything else is plain stupid and a total waste of time.
Just look at the silly hoops you have to jump through just because some pea-brain is too dumb to hit "save as csv". Disgusting.
Hello / Good morning ,
It's not good idea to find a similar solution which is like in an other langage ?
After severals hours I propose this solution :
data _null_ ; do i=1 to 5; call execute (' data work.tp'!!cats(i)!!' ; '); call execute (' set work.tp'!!cats(i)!!' ; '); call execute (' length dtcorr2 8. ; '); call execute (' format dtcorr2 date9. ; '); call execute (' informat dtcorr2 date9. ; '); call execute (' dtcorr2=input("Date correction"n,12.) ; '); call execute (' length nopol2 coapp2 lbnateve2 lbcaueve2 anomalie2 lbact2 lbsta2 anages2 anamoe2 Lid2 $1500. ; '); call execute (' format nopol2 coapp2 lbnateve2 lbcaueve2 anomalie2 lbact2 lbsta2 anages2 anamoe2 Lid2 $1500. ; '); call execute (' informat nopol2 coapp2 lbnateve2 lbcaueve2 anomalie2 lbact2 lbsta2 anages2 anamoe2 Lid2 $1500. ; '); call execute (' nopol2="N° Police"n ; ') ; call execute (' coapp2=Application ; ') ; call execute (' lbnateve2="Nature événement"n ; ') ; call execute (' lbcaueve2="Cause événement"n ; ') ; call execute (' anomalie2="Anomalie(s)"n ; ') ; call execute (' lbact2=Acteur ; ') ; call execute (' lbsta2=Statut ; ') ; call execute (' anages2="Analyse Gestion"n ; ') ; call execute (' anamoe2="Analyse MOE"n ; ') ; call execute (' Lid2=Lead ; ') ; call execute (' drop "Date correction"n ; '); call execute ('run;'); end; run; data cinq_fichiers; length fichier $100; set work.tp1 (in=a) work.tp2 (in=b) work.tp3 (in=c) work.tp4 (in=d) work.tp5 (in=e) /*indsname=inds*/; rename nopol2=nopol coapp2=coapp lbnateve2=lbnateve lbcaueve2=lbcaueve anomalie2=anomalie "Date rejet initial"n=dtcrerej lbact2=lbact lbsta2=lbsta anages2=anages anamoe2=anamoe dtcorr2=dtcorr Lid2=Lid; /*fichier = inds;*/ if a then fichier="huitX_DECES_imp"; if b then fichier="huitX_imp"; if c then fichier="OBSEQUES_imp"; if d then fichier="PEGASE_DECES_imp"; if e then fichier="PEGASE_imp"; run;
You can see a call execute with a loop because there are 5 tables and it's necessary to repeat the operation in my case .
You want propose an other solution or you think that my program owns a risk ? You can give you solution 😄
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.