Hi,
I m importing an Excel sheet and convert it to csv. So far so good, it works and I can see the data.
Now I want to read it into a data set.
This is what my data looks like:
Valid From: 01/01/2015
Valid to: 01/01/2015
Time from: 00:00:00
Time to: 23:59:00
Now I do this:
DATA WORK.test3; LENGTH 'Valid From'n $ 10 'Valid To'n $ 10 'Time from'n 8 'Time to'n 8 FORMAT 'Valid From'n $CHAR10. 'Valid To'n $CHAR10. 'Time from'n TIME8. 'Time to'n TIME8. INFORMAT 'Valid From'n $CHAR10. 'Valid To'n $CHAR10. 'Time from'n 8. 'Time to'n 8. INFILE tmpout LRECL=142 ENCODING="WLATIN1" TERMSTR=CRLF DLM=';' MISSOVER DSD; INPUT 'Valid From'n : $CHAR10. 'Valid To'n : $CHAR10. 'Time from'n : BEST32. 'Time to'n : BEST32.
RUN;
This gives me this errors:
NOTE: Invalid data for 'Time from'n in line 1 101-107.
NOTE: Invalid data for 'Time to'n in line 1 109-116.
Valid_from and Valid_to are not quite right either I guess, but at least I dont get an error.
What I want is to have proper date and time values. I tried every possible combination of HHMM5., HHMM8, TIME5., TIME8. etc but none of them work.
Thanks a lot!
Dirk
If you have set the informat with the informat statement then you do not want to use in on the INPUT statement. Actually use of the informat on the input statement is pretty much designed for fixed column files.
There are likely diagnostics displayed in the log that show the values for variables up to the error. If you look I suspect you will find that 'Valid To'n has the first character or two of the 'Time from'n variable.
I strongly recommend not using those variable names and change them to something like "ValidTo" or "Valid_to" as you will get very tired of fixing errors related to misspelled variables. Use a Label to get "pretty" column/row headings and titles.
When importing from CSV you do not need to, nor would you want to, use named literals - that 'var'n type format. Also, you are reading rows of data here, so you need to proces each line of data out into your variables:
data want; length text $200; infile "s:\temp\rob\a.csv"; input; text=scan(_infile_,1,":"); if substr(text,1,5)="Valid" then date=scan(_infile_,2,":"); else time=input(strip(scan(_infile_,2,":")),hhmmss.); format time time5.; run;
Note that this reads the file, and creates 4 rows, one for each row in your test data. It splits out the text part, then the date part, then the time part. (note I don't format the date).
I m not exactly reading a csv.
This is the whole script I am using:
Libname Excel1 XLSX "D:\DWH\Import\Startrax\sameday unlimited Handling Steuertabelle.xlsx"; OPTIONS VALIDVARNAME = ANY; filename tmpout temp lrecl=142; data _tmp; set Excel1."Tabelle1"n (FIRSTOBS=1); FORMAT Export BEST12. 'Security Flat'n BEST12. 'Minimum Labelling'n BEST12. 'Labelling per item'n BEST12. 'Minimum Security'n BEST12. 'Maximum Security'n BEST12. 'Security Flat'n BEST12. 'Security per item'n BEST12. 'Security per kg'n BEST12. 'Ramp per drive Import'n BEST12. 'Other Export Fee'n BEST12. 'Import Flat'n BEST12. 'Import per kg'n BEST12. 'THC Minimum'n BEST12. 'THC Maximum'n BEST12. 'Doc Charge'n BEST12. 'Ramp per drive Export'n BEST12. Transit BEST12.; run; proc datasets lib=work memtype=data nolist; modify _tmp; run; proc export data=_tmp outfile= tmpout dbms=csv replace; putnames=no; delimiter=';'; run; DATA WORK.test3; LENGTH Name $ 26 Station $ 3 PartnerID $ 9 Platform $ 3 Type $ 9 'From Weight'n 8 'To Weight'n 8 Export 8 'Minimum Labelling'n 8 'Labelling per item'n 8 'Minimum Security'n 8 'Maximum Security'n 8 'Security Flat'n 8 'Security per item'n 8 'Security per kg'n 8 'Ramp per drive Import'n 8 'Other Export Fee'n 8 'Import Flat'n 8 'Import per kg'n 8 'THC Minimum'n 8 'THC Maximum'n 8 'Doc Charge'n 8 'Ramp per drive Export'n 8 Transit 8 Currency $ 3 'Valid From'n $ 10 'Valid To'n $ 10 'Time from'n 8 'Time to'n 8 Days 8 'Last updated'n 8 'Updated by'n $ 11 ; FORMAT Name $CHAR26. Station $CHAR3. PartnerID $CHAR9. Platform $CHAR3. Type $CHAR9. 'From Weight'n BEST12. 'To Weight'n BEST12. Export COMMA12.2 'Minimum Labelling'n COMMA12.2 'Labelling per item'n COMMA12.2 'Minimum Security'n COMMA12.2 'Maximum Security'n COMMA12.2 'Security Flat'n COMMA12.2 'Security per item'n COMMA12.2 'Security per kg'n COMMA12.2 'Ramp per drive Import'n COMMA12.2 'Other Export Fee'n BEST12. 'Import Flat'n COMMA12.2 'Import per kg'n COMMA12.2 'THC Minimum'n COMMA12.2 'THC Maximum'n COMMA12.2 'Doc Charge'n COMMA12.2 'Ramp per drive Export'n COMMA12.2 Transit COMMA12.2 Currency $CHAR3. 'Valid From'n $CHAR10. 'Valid To'n $CHAR10. 'Time from'n TIME8. 'Time to'n TIME8. Days BEST12. 'Last updated'n DATE9. 'Updated by'n $CHAR11. ; INFORMAT Name $CHAR26. Station $CHAR3. PartnerID $CHAR9. Platform $CHAR3. Type $CHAR9. 'From Weight'n BEST12. 'To Weight'n BEST12. Export COMMA12. 'Minimum Labelling'n COMMA12. 'Labelling per item'n COMMA12. 'Minimum Security'n COMMA12. 'Maximum Security'n COMMA12. 'Security Flat'n COMMA12. 'Security per item'n COMMA12. 'Security per kg'n COMMA12. 'Ramp per drive Import'n COMMA12. 'Other Export Fee'n BEST12. 'Import Flat'n COMMA12. 'Import per kg'n COMMA12. 'THC Minimum'n COMMA12. 'THC Maximum'n COMMA12. 'Doc Charge'n COMMA12. 'Ramp per drive Export'n COMMA12. Transit COMMA12. Currency $CHAR3. 'Valid From'n $CHAR10. 'Valid To'n $CHAR10. 'Time from'n 8. 'Time to'n 8. Days BEST12. 'Last updated'n DATE9. 'Updated by'n $CHAR11. ; INFILE tmpout LRECL=142 ENCODING="WLATIN1" TERMSTR=CRLF DLM=';' MISSOVER DSD; INPUT Name : $CHAR26. Station : $CHAR3. PartnerID : $CHAR9. Platform : $CHAR3. Type : $CHAR9. 'From Weight'n : BEST32. 'To Weight'n : BEST32. Export : BEST32. 'Minimum Labelling'n : BEST32. 'Labelling per item'n : BEST32. 'Minimum Security'n : BEST32. 'Maximum Security'n : BEST32. 'Security Flat'n : BEST32. 'Security per item'n : BEST32. 'Security per kg'n : BEST32. 'Ramp per drive Import'n : BEST32. 'Other Export Fee'n : BEST32. 'Import Flat'n : BEST32. 'Import per kg'n : BEST32. 'THC Minimum'n : BEST32. 'THC Maximum'n : BEST32. 'Doc Charge'n : BEST32. 'Ramp per drive Export'n : BEST32. Transit : BEST32. Currency : $CHAR3. 'Valid From'n : $CHAR10. 'Valid To'n : $CHAR10. 'Time from'n : BEST32. 'Time to'n : BEST32. Days : BEST32. 'Last updated'n : BEST32. 'Updated by'n : $CHAR11. ; RUN; Libname Excel4;
Since you have one row per variable (name/value pair) in the input file, you can't use a simple input.
Instead
- retain all your variables
- input the line
- identify when a new object starts
- if yes, and you're not in _n_ = 1, output; and reset variables to missing/empty
- scan the line and set the according variable to the value
- at the end, do one additional output;, as you won't be able to detect another object change
Oh yes, and start using only valid SAS variable names three days before yesterday.
Kurt,
my data looks like this:
1/1/2015;31/122017;00:00:00;23:59:59
1/1/2015;31/122017;00:00:00;23:59:59
1/1/2015;31/122017;00:00:00;23:59:59
1/1/2015;31/122017;00:00:00;23:59:59
I was just writing it a bit confusing.
Anyways,
I ve got it partially to work by doing this:
Libname Excel1 XLSX "D:\TimeMattersDWH\Import\Startrax\sameday unlimited Handling Steuertabelle.xlsx"; OPTIONS VALIDVARNAME = ANY; filename tmpout temp lrecl=142; data _tmp; set Excel1."Tabelle1"n (FIRSTOBS=1); FORMAT Export BEST12. 'Security Flat'n BEST12. 'Minimum Labelling'n BEST12. 'Labelling per item'n BEST12. 'Minimum Security'n BEST12. 'Maximum Security'n BEST12. 'Security Flat'n BEST12. 'Security per item'n BEST12. 'Security per kg'n BEST12. 'Ramp per drive Import'n BEST12. 'Other Export Fee'n BEST12. 'Import Flat'n BEST12. 'Import per kg'n BEST12. 'THC Minimum'n BEST12. 'THC Maximum'n BEST12. 'Doc Charge'n BEST12. 'Ramp per drive Export'n BEST12. Transit BEST12.; run; proc datasets lib=work memtype=data nolist; modify _tmp; run; proc export data=_tmp outfile= tmpout dbms=csv replace; putnames=no; delimiter=';'; run; DATA &_OUTPUT. ; LENGTH Name $ 26 Station $ 3 PartnerID $ 9 Platform $ 3 Type $ 9 'FromWeight'n 8 'ToWeight'n 8 Export 8 'MinimumLabelling'n 8 'Labellingperitem'n 8 'MinimumSecurity'n 8 'MaximumSecurity'n 8 'SecurityFlat'n 8 'Securityperitem'n 8 'Securityperkg'n 8 'RampperdriveImport'n 8 'OtherExportFee'n 8 'ImportFlat'n 8 'Importperkg'n 8 'THCMinimum'n 8 'THCMaximum'n 8 'DocCharge'n 8 'RampperdriveExport'n 8 Transit 8 Currency $ 3 'ValidFrom'n 8 'ValidTo'n 8 'Timefrom'n 8 'Timeto'n 8 Days 8 'Lastupdated'n 8 'Updatedby'n $ 50 ; FORMAT Name $CHAR26. Station $CHAR3. PartnerID $CHAR9. Platform $CHAR3. Type $CHAR9. 'FromWeight'n BEST12. 'ToWeight'n BEST12. Export COMMA12.2 'MinimumLabelling'n COMMA12.2 'Labellingperitem'n COMMA12.2 'MinimumSecurity'n COMMA12.2 'MaximumSecurity'n COMMA12.2 'SecurityFlat'n COMMA12.2 'Securityperitem'n COMMA12.2 'Securityperkg'n COMMA12.2 'RampperdriveImport'n COMMA12.2 'OtherExportFee'n BEST12. 'ImportFlat'n COMMA12.2 'Importperkg'n COMMA12.2 'THCMinimum'n COMMA12.2 'THCMaximum'n COMMA12.2 'DocCharge'n COMMA12.2 'RampperdriveExport'n COMMA12.2 Transit COMMA12.2 Currency $CHAR3. 'ValidFrom'n DDMMYY10. 'ValidTo'n DDMMYY10. 'Timefrom'n TIME8. 'Timeto'n TIME8. Days BEST12. 'Lastupdated'n DDMMYY10. 'Updatedby'n $CHAR11. ; INFILE tmpout LRECL=142 ENCODING="WLATIN1" TERMSTR=CRLF DLM=';' MISSOVER DSD; INPUT Name : $CHAR26. Station : $CHAR3. PartnerID : $CHAR9. Platform : $CHAR3. Type : $CHAR9. 'FromWeight'n : BEST32. 'ToWeight'n : BEST32. Export : BEST32. 'MinimumLabelling'n : BEST32. 'Labellingperitem'n : BEST32. 'MinimumSecurity'n : BEST32. 'MaximumSecurity'n : BEST32. 'SecurityFlat'n : BEST32. 'Securityperitem'n : BEST32. 'Securityperkg'n : BEST32. 'RampperdriveImport'n : BEST32. 'OtherExportFee'n : BEST32. 'ImportFlat'n : BEST32. 'Importper kg'n : BEST32. 'THCMinimum'n : BEST32. 'THCMaximum'n : BEST32. 'DocCharge'n : BEST32. 'RampperdriveExport'n : BEST32. Transit : BEST32. Currency : $CHAR3. 'ValidFrom'n : MMDDYY10. 'ValidTo'n : MMDDYY10. 'Timefrom'n : TIME8. 'Timeto'n : TIME8. Days : BEST32. 'Lastupdated'n : MMDDYY10. 'Updatedby'n : $CHAR11. ; RUN; Libname Excel1;
I can use it in DIS now, but the file reader gives me an error when I try to open the output table.
The log says its all good:
NOTE: Variable Importperkg ist nicht initialisiert. NOTE: The infile TMPOUT is: Dateiname=D:\SASWork\_TD7232_NI-SASB1_\#LN00026, RECFM=V,LRECL=142,Dateigröße (Byte)=22008, Zuletzt geändert=10Jan2017:17:41:42, Erstellungsuhrzeit=10Jan2017:17:41:42 NOTE: 160 records were read from the infile TMPOUT. The minimum record length was 108. The maximum record length was 142. NOTE: The data set WORK.W4GNSW5 has 160 observations and 33 variables. NOTE: PROCEDURE| _DISARM| STOP| _DISARM| 2017-01-10T17:41:42,360+01:00| _DISARM| WorkspaceServer| _DISARM| SAS| _DISARM| | _DISARM| 20533248| _DISARM| 16080896| _DISARM| 11| _DISARM| 11| _DISARM| 90108| _DISARM| 3715043| _DISARM| 0.031250| _DISARM| 0.031000| _DISARM| 1799685702.329000| _DISARM| 1799685702.360000| _DISARM| 0.015625| _DISARM| | _ENDDISARM
Now if I try to open it, I get this error:
Unable to execute query: SQL-Passthru-Ausdruck enthielt folgende Fehler: ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, ',', -, /, <, <=, <>, =, >, >=, ?, AND, AS, CONTAINS, EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.... ERROR 22-322: Syntax error, expecting one of the following: eine Zeichenkette in Hochkommata, !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, AS, BETWEEN, CONTAINS, EQ, EQT, FORMAT, FROM, GE, GET, GT, GTT, IN, INFORMAT, INTO, IS, LABEL, LE, LEN, LENGTH, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, TRANSCODE, ^, ^=, |, ||, ~, ~=.... ERROR 76-322: Syntax error, statement will be ignored.
But if I look in my SASWork Folder, open the table through "libname a "D:\SASWork\_TD7232_NI-SASB1_\Prc2";,
I can see that the data is properly read. I think I could even use the data in further transformations. But I cant look at it.
Sorry, your posts are getting more and more confused. First off:
"my data looks like this:
1/1/2015;31/122017;00:00:00;23:59:59..."
No, it may do after processing, but you are starting with an Excel file. Lets step through what you are doing:
1) You are using libname to XLSX.
2) There is then a proc datasets call for some unknown reason as it does nothing.
3) You then use proc export to create a delimited file and send it to reference tmpout - which is never specified, so I assume defaults to some SAS temporary area - not a good idea. Also not a good idea, is to use dbms=csv, and specify delimter as ;. CSV standsd for Comma Separated Variable file and is a well structured and documented file format.
4) You then create a dataset, unknown what it is called as you have used &output - never defined, and why would import data be called output??
5) In this import of a text file you are refering to named literals e.g. 'abc'n. This is not a good, and is not needed when dealing with text files. Named literals are used when dealing with the office engine, so that you can stipulate invalid SAS names. As you are importing plain text data, specify proper SAS names.
Now, you are getting issue when trying to open a file, now you have not specified any SQL in your code here, so I am assuming you opening the xlsx file data, in the libname created in step 1, or failing that something in your setup.
Really go back and look at what you are doing, most of the code given is not needed. If you need a dataset, and a simple proc import does not work, then simply open the XLSX file, and goto SaveAs and save the file as CSV. Then write a simple data step import, similar to what you already have - however, you can run proc import on the CSV file to get basic structure and to save you typing time. There is no benefit to opening an Excel file in SAS, then exporting to delimited file, then reading it back in again.
RW9,
yes, because I am confused and I don't know what to do.
1) Yes. Thats what my predecessor did.
2) Yes. Thats what my predecessor did.
3) Not in Germany. We use semicolon all the time, because comma is our decimal point. A number in Germany looks like this: 1,99. Therefore everyone uses semicolon in CSV files.
4) I am using Data Integration Studio. To redirect your output, you have to use something like %_OUTPUT., %_OUTPUT1., %_OUTPUT2. and so forth, depending on the number of work tables your transformation node has.
5) I had spaces in my column names. I removed the spaces and forgot to delete the ''n as well.
What is the proper way to import an Excel File into DIS? I dont know much about SAS and I need it for a project. I m a Pentaho guy, which is completely different from SAS. I m trying to replicate what my previous colleague did with a little help from google and forums.
Hi, I'm coming late to this party, so I'm only just catching up with all of this.
First, you've fixed your original problem by using the TIME8. informats for 'Timefrom'n and 'Timeto'n in the INPUT statement in your DATA &_OUTPUT. step. What you did is the correct way to input these variables, so that's finished.
Now, in terms of seeing the output, SAS is a little odd. It's intended for heavy-duty data processing jobs, so it tends to created dedicated "server" (software only) instances in a number of its products, like DIS. When it creates one of these server instances, that server gets its own dedicated WORK library, to prevent the different servers interfering with each other.
You're likely trying to look at your data from a different server than it was created on. That's why you can see it when you use a physical file reference (D:\SASWork...) you can see it.
The best solution to this is to add some code to your DIS run to copy the data to a permanenet library to which you have access, and then you won't have any problem looking at it.
Tom
I understand. SAS is an incredibly powerful beast, but like many of them it takes some getting used to.
I don't have DIS in front of me at this time, but I don't think you'll be able to register a dataset in the WORK library. They're meant to be temporary.
Your idea should work...tack on a step afterwards to sent the data to a permanent library, you should then be able to see it. Or, change the destination in your DATA step (that might be more challenging, given your lack of experience).
Which tool are you using to try to look at the data?
Tom
Tom,
I think SAS is not the right tool for us, but that is a different story.
In DIS, I do it all the time. I redirect the output of the work tables into libraries and then I register the tables to make them permanent.
Everything is working now btw. I overlooked a blank in one of the column names.
This now the final code. It might not be the most elegant way but it works.
Libname Excel1 XLSX "D:\TimeMattersDWH\Import\Startrax\sameday unlimited Handling Steuertabelle.xlsx"; OPTIONS VALIDVARNAME = ANY; filename tmpout temp lrecl=142; data _tmp; set Excel1."Tabelle1"n (FIRSTOBS=1); FORMAT Export BEST12. 'Security Flat'n BEST12. 'Minimum Labelling'n BEST12. 'Labelling per item'n BEST12. 'Minimum Security'n BEST12. 'Maximum Security'n BEST12. 'Security Flat'n BEST12. 'Security per item'n BEST12. 'Security per kg'n BEST12. 'Ramp per drive Import'n BEST12. 'Other Export Fee'n BEST12. 'Import Flat'n BEST12. 'Import per kg'n BEST12. 'THC Minimum'n BEST12. 'THC Maximum'n BEST12. 'Doc Charge'n BEST12. 'Ramp per drive Export'n BEST12. Transit BEST12.; run; proc datasets lib=work memtype=data nolist; modify _tmp; run; proc export data=_tmp outfile= tmpout dbms=csv replace; putnames=no; delimiter=';'; run; DATA work.test ; LENGTH Name $ 26 Station $ 3 PartnerID $ 9 Platform $ 3 Type $ 9 FromWeight 8 ToWeight 8 Export 8 MinimumLabelling 8 Labellingperitem 8 MinimumSecurity 8 MaximumSecurity 8 SecurityFlat 8 Securityperitem 8 Securityperkg 8 RampperdriveImport 8 OtherExportFee 8 ImportFlat 8 Importperkg 8 THCMinimum 8 THCMaximum 8 DocCharge 8 RampperdriveExport 8 Transit 8 Currency $ 3 ValidFrom 8 ValidTo 8 Timefrom 8 Timeto 8 Days 8 Lastupdated 8 Updatedby $ 50 ; FORMAT Name $CHAR26. Station $CHAR3. PartnerID $CHAR9. Platform $CHAR3. Type $CHAR9. FromWeight BEST12. ToWeight BEST12. Export COMMA12.2 MinimumLabelling COMMA12.2 Labellingperitem COMMA12.2 MinimumSecurity COMMA12.2 MaximumSecurity COMMA12.2 SecurityFlat COMMA12.2 Securityperitem COMMA12.2 Securityperkg COMMA12.2 RampperdriveImport COMMA12.2 OtherExportFee BEST12. ImportFlat COMMA12.2 Importperkg COMMA12.2 THCMinimum COMMA12.2 THCMaximum COMMA12.2 DocCharge COMMA12.2 RampperdriveExport COMMA12.2 Transit COMMA12.2 Currency $CHAR3. ValidFrom DDMMYY10. ValidTo DDMMYY10. Timefrom TIME8. Timeto TIME8. Days BEST12. Lastupdated DDMMYY10. Updatedby $CHAR11. ; INFILE tmpout LRECL=142 ENCODING="WLATIN1" TERMSTR=CRLF DLM=';' MISSOVER DSD; INPUT Name : $CHAR26. Station : $CHAR3. PartnerID : $CHAR9. Platform : $CHAR3. Type : $CHAR9. FromWeight : BEST32. ToWeight : BEST32. Export : BEST32. MinimumLabelling : BEST32. Labellingperitem : BEST32. MinimumSecurity : BEST32. MaximumSecurity : BEST32. SecurityFlat : BEST32. Securityperitem : BEST32. Securityperkg : BEST32. RampperdriveImport : BEST32. OtherExportFee : BEST32. ImportFlat : BEST32. Importperkg : BEST32. THCMinimum : BEST32. THCMaximum : BEST32. DocCharge : BEST32. RampperdriveExport : BEST32. Transit : BEST32. Currency : $CHAR3. ValidFrom : MMDDYY10. ValidTo : MMDDYY10. Timefrom : TIME8. Timeto : TIME8. Days : BEST32. Lastupdated : MMDDYY10. Updatedby : $CHAR11. ; RUN;
I m using Data Integration Studio and the SAS Display Manager to test my code. Sometimes I use the Enterprise Guide, to create PROC SQL code. 98% of my DIS jobs are PROC SQL;'s in User Written Code transformations.
Thanks everyone in this thread for trying to help me out! I used all of your suggestions and I finally got it to work 🙂
Having to work with variable names which don't comply with SAS naming standards can become very annoying. For this reason I'd use
option validvarname=v7; as this instructs SAS to convert the variable names to something compliant.
With DIS and when dealing with Excel I'm doing the same like you: Read the Excel and write the data straight out to a text file so that I get full control of how I actually read the data (external file metadata).
Unlike you I'm using a combination of Proc Import / Proc Export. I'm using Proc Import instead of a Libname because there I can also define a Range in an Excel sheet. If the first row in the Excel contains column names then I'd read the data only from row 2 to begin with.
As others already pointed out the Proc Dataset block in the code you've posted does nothing and you could remove it without any impact.
proc datasets lib=work memtype=data nolist; modify _tmp; run;
Patrick,
can you please post an example of your Proc Import / Export code?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.