shellp55
Quartz | Level 8

Hi

I am using SAS 9.3 (64 bit) on a Windows 7 computer.

I am trying to use a pipe and wzunzip to read text files directly from a zip file. Based on the information that came with the wzunzip add-on, I used the syntax below, but it isn't extracting the contents of the text file. The log doesn't indicate any errors, just that "0 records were read from the infile testing". I am also curious whether I need to name the text file, because there could be many, or whether I could use wildcards, because they would all follow the same naming convention, i.e. 1234A, 1234B, 1234C, etc.

Can someone please advise why this isn't working? Any and all assistance is greatly appreciated.

filename testing pipe '"C:\Program Files\WinZip\wzunzip.exe" -vb F:\myfolder\Test.zip test.txt';


data work.import_test;
    infile testing truncover LRECL=5000 firstobs=10;

    input @1  Prov        $1.
          @2  Inst        $4.
          @6  Fyear       $4.
          @10 Period      $2.
          @12 Batch       $2.
          @14 AbsNo       $3.
          @18 Coder       $2.
          @20 ChartNo     $10.
          @30 RegNo       $7.
          @37 SecChartNo  $10.
          @47 MatNBChart  $10.
          @57 HCN         $12.
          @69 Postal      $6.
          @75 Rescode     $7.
          @82 Sex         $1.
          @83 ProvHCN     $2.
          @85 RFP         $2.;
run;

ACCEPTED SOLUTION
shellp55
Quartz | Level 8

Hi

Okay, I don't know why it's working this time but I revisited all the previous information I had including the blurb from the SAS knowledgebase site.  I used what they had like I did before but this time it worked.

I also found the syntax for supplying a password: use -s immediately followed by the password (in the example below my password is 1234). So the following works, and I thank all who tried to help me!

filename testing pipe '"C:\Program Files\WinZip\wzunzip.exe" -s1234 -o -c  F:\myfolder\test.zip' ;

data test_import;
    infile testing firstobs=10 truncover LRECL=5000;

    input @1  Prov     $1.
          @2  Inst     $4.
          @6  Fyear    $4.
          @10 Period   $2.
          @12 Batch    $2.
          @14 AbsNo    $3.
          @18 Coder    $2.
          @20 ChartNo  $10.;
run;

Note that, as was alluded to by DBailey (and the SAS knowledgebase), you have to specify firstobs=10 to skip the banner lines that WinZip writes before the data. I think another issue for me was that I wasn't putting the firstobs option first. Note that I don't have to name the files within the zip, which is great because I have lots. If I wanted to exclude some I could use the -e syntax, and I imagine I could limit by file extension rather than name if I wanted to.
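
One caveat for archives holding several files: judging by the wzunzip -c log posted in this thread, each member is probably preceded by its own "Unzipping ... to con" line and a rule of dashes, so firstobs=10 would skip only the first banner. The sketch below filters those lines by content instead. It is untested against a multi-file archive and assumes the banner text matches the log in this thread (the prefixes may differ between WinZip versions) and that the data files have no header row of their own.

```sas
/* Sketch only: banner prefixes are copied from the wzunzip -c log in this thread */
filename testing pipe '"C:\Program Files\WinZip\wzunzip.exe" -s1234 -o -c F:\myfolder\test.zip';

data test_import;
    infile testing truncover LRECL=5000;

    input @;    /* load the record and hold it for inspection */

    /* Discard wzunzip chatter: blank lines, rules made of = or -,
       and lines starting with a known banner prefix */
    if missing(_infile_)
       or verify(trim(_infile_), '=-') = 0
       or _infile_ =: 'WinZip(R)'
       or _infile_ =: 'Copyright (c)'
       or _infile_ =: 'Zip file:'
       or _infile_ =: 'Unzipping '
    then delete;

    /* Re-read the held record with the fixed-column layout */
    input @1  Prov     $1.
          @2  Inst     $4.
          @6  Fyear    $4.
          @10 Period   $2.
          @12 Batch    $2.
          @14 AbsNo    $3.
          @18 Coder    $2.
          @20 ChartNo  $10.;
run;
```

A data record that happens to begin with one of these prefixes would also be dropped, so it is worth comparing record counts against a manual extract once.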

I am so very pleased to get this darn thing working!  Thanks again to anyone who responded!

12 REPLIES
Stig_Eide
Fluorite | Level 6

So, test.txt is inside Test.zip?

In that case, you could maybe do this:

filename testing pipe '"C:\Program Files\WinZip\wzunzip.exe" -vb F:\myfolder\Test.zip test.txt & type test.txt';

This would put the content of Test.txt to STDOUT and then the program would get the content of the file.

I guess, if you need to delete the file afterwards, you could add the delete command after the type command.

Hope this helps!

Stig

HenrikDorf
SAS Employee

Hi

The -vb option tells WZUNZIP to list the file names in the zip archive, which is not what you want.

Change -vb to -c and the contents of the archived file will be written to the output, unfortunately surrounded by extraneous information from the wzunzip utility, as the log below shows.

I think that WZUNZIP does not support piping.

197  filename nziip pipe 'wzunzip -c  C:\TEMP\TESTZIP.zip test.txt' ;
198
199  data _null_ ;
200      infile nziip ;
201      input ;
202      put _infile_ ;
203  run;

NOTE: The infile NZIIP is:
      Unnamed Pipe Access Device,
      PROCESS=wzunzip -c  C:\TEMP\TESTZIP.zip test.txt,
      RECFM=V,LRECL=256

WinZip(R) Command Line Support Add-On Version 3.1 (Build 8519)
Copyright (c) 1991-2009 WinZip International LLC - All Rights Reserved
Zip file: C:\TEMP\TESTZIP.zip
============================
Unzipping test.txt to con
-------------------------
1:,Count,Date,Time,,c:\pagefile.sys,C:\temp2,C:\temp
2:,0,21042012,202204,0,12667846656,0,273826763
2:,1,21042012,202205,0,12667846656,0,273826811
2:,2,21042012,202206,0,12667846656,0,273826859
2:,3,21042012,202207,0,12667846656,0,273826907
2:,4,21042012,202208,0,12667846656,0,273826955
Stderr output:
Searching...
NOTE: 14 records were read from the infile NZIIP.
      The minimum record length was 0.
      The maximum record length was 70.
NOTE: DATA statement used (Total process time):
      real time           0.13 seconds
      cpu time            0.00 seconds

If you use another unzip utility that supports piping, the result will be what you want:

205  filename zipfile pipe 'unzip -p  C:\TEMP\TESTZIP.zip test.txt' ;
206
207  data _null_ ;
208      infile zipfile ;
209      input ;
210      put _infile_ ;
211  run;

NOTE: The infile ZIPFILE is:
      Unnamed Pipe Access Device,
      PROCESS=unzip -p  C:\TEMP\TESTZIP.zip test.txt,
      RECFM=V,LRECL=256

1:,Count,Date,Time,,c:\pagefile.sys,C:\temp2,C:\temp
2:,0,21042012,202204,0,12667846656,0,273826763
2:,1,21042012,202205,0,12667846656,0,273826811
2:,2,21042012,202206,0,12667846656,0,273826859
2:,3,21042012,202207,0,12667846656,0,273826907
2:,4,21042012,202208,0,12667846656,0,273826955
NOTE: 6 records were read from the infile ZIPFILE.
      The minimum record length was 46.
      The maximum record length was 52.
NOTE: DATA statement used (Total process time):
      real time           0.10 seconds
      cpu time            0.00 seconds

UNZIP.exe can be found at various open source places on the internet - just make sure that PIPING (-p) is supported.

Cheers

Henrik Dorf

Denmark
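
On the wildcard question from the original post: Info-ZIP's unzip does its own wildcard matching on member names, so the unzip -p approach above can in principle stream several members through one pipe. A minimal sketch, assuming an Info-ZIP unzip.exe is on the PATH and that the members are named along the lines of 1234A, 1234B (the pattern and the archive path below are illustrative only):

```sas
/* Sketch only: the member pattern "1234*" and the archive path are illustrative */
filename zipfile pipe 'unzip -p F:\myfolder\Test.zip "1234*"';

data work.import_test;
    infile zipfile truncover LRECL=5000;

    /* Matching members arrive back to back with no banner lines,
       so no firstobs= is needed here */
    input @1  Prov     $1.
          @2  Inst     $4.
          @6  Fyear    $4.
          @10 Period   $2.
          @12 Batch    $2.
          @14 AbsNo    $3.
          @18 Coder    $2.
          @20 ChartNo  $10.;
run;
```

Because the members are simply concatenated, any header row inside an individual file would come through as data and would need to be filtered out separately.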

shellp55
Quartz | Level 8

Hi

Thanks for the replies, but it is still not working: I get the message 0 files read but no error messages. Henrik, according to the SAS knowledgebase, pipe and wzunzip are supposed to work together, even though one of the messages showing up is "unnamed pipe access device".

HenrikDorf
SAS Employee

Hi

When I say that WZUNZIP does not support piping, I mean that it is not able to keep its own messages and trademark notices out of the output.

I can see that the INFILE option firstobs=10 will skip the first 8 lines of information from WZUNZIP and the first data line, which presumably contains variable names.

Personally I prefer the clean input from UNZIP, with no messages from the unzip utility.

"Unnamed pipe access device" is a SAS concept, since you can define named and unnamed pipes in FILENAME statements.

And finally, for solving the problem:

The LRECL=5000 seems to have an impact in my test, so I would suggest that you remove this parameter to see if it helps you too.

Henrik

HenrikDorf
SAS Employee

Hi

I just realized that truncover will handle the problems that could occur if LRECL does not match the true record length.

Sorry ... this will not solve anything.

It is strange that you get the message "0 files read".

To see the messages from WZUNZIP you could write the program like this:

data work.import_test;
infile testing truncover LRECL = 5000 ;

IF _N_<11 then do;
     input ;
     putlog _infile_;
     return;
end;

input @1 Prov   $1.
   @2 Inst   $4.
   @6 Fyear   $4........

Henrik

DBailey
Lapis Lazuli | Level 10

If your system administrators allow it, you might be able to perform the extract via the X command and then process the file:

X '"C:\Program Files\WinZip\wzunzip.exe" -vb F:\myfolder\Test.zip test.txt';

filename testing "F:\myfolder\test.txt" LRECL = 5000;

shellp55
Quartz | Level 8

Hi

Thanks for the responses.

Henrik:  there was a message of "NOTE: DATA STEP stopped due to looping" so can't it just be an if statement?

DBailey:  I assume that this is to unzip the file to the folder and from there I can input it?  If so, that didn't work because I get the error message "ERROR: Physical file does not exist, F:\myfolder\test.txt."

What I really want is to read and manipulate the data while keeping it within the zip file. Is that not possible? Alternatively, could I extract it, manipulate it, and then delete the extracted copy (leaving the zip file as is)? Also, I haven't even gotten to password protection yet, which I was hoping to include; without it there is no point in this exercise.

I got this to work to show the output in the log:

filename testing pipe '"C:\Program Files\WinZip\wzunzip.exe" -c F:\myfolder\Test.zip test.txt';

data _null_ ;
infile testing;
input ;
put _infile_ ;
run;

Thanks for any and all assistance!
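
On the extract, process, then delete question above, one possible shape is sketched here. It rests on several assumptions that should be checked first: that the X command is permitted in the session, and that wzunzip extracts when given the zip file, a destination folder and a member name in that order (this argument order is an assumption to confirm against the add-on's own help). The fileref name extract is made up for the example.

```sas
/* Sketch only: the wzunzip argument order is an assumption, see the note above */
options noxwait xsync;   /* no prompt when the command ends; wait for it to finish */

/* Extract test.txt from the archive into F:\myfolder, overwriting without prompting */
x '"C:\Program Files\WinZip\wzunzip.exe" -o F:\myfolder\Test.zip F:\myfolder\ test.txt';

filename extract "F:\myfolder\test.txt" LRECL=5000;

data work.import_test;
    infile extract truncover;
    input @1  Prov     $1.
          @2  Inst     $4.
          @6  Fyear    $4.
          @10 Period   $2.
          @12 Batch    $2.
          @14 AbsNo    $3.
          @18 Coder    $2.
          @20 ChartNo  $10.;
run;

/* Delete the extracted copy; the zip file itself is left as is */
data _null_;
    rc = fdelete('extract');
    if rc ne 0 then putlog 'WARNING: could not delete the extracted file, rc=' rc;
run;

filename extract clear;
```

Note that the unzipped file sits on disk in the clear while the DATA step runs, which may matter if the archive is password protected for a reason.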

DBailey
Lapis Lazuli | Level 10

So that suggests either you can't use the X commands (unlikely if you didn't get an error) or the unzip syntax is incorrect.  If you just run the x command and then go look in the F folder, does the file exist?

shellp55
Quartz | Level 8

Hi

Thanks for sticking with me!

Stig:  thanks, yes I've looked at this and can't get that to work either.  I read in another post where someone contacted the author of the paper and found he didn't test all of his code.

DBailey:  I am the only person on my computer but I have an admin and user profile.  I ran the program from my admin profile just in case there were issues with my not being able to use the x command and there was the error message about no physical file existing.  So, using either profile, no file shows up in the F folder suggesting the syntax for unzip is incorrect.

Any other thoughts?

DBailey
Lapis Lazuli | Level 10

What happens if you open a command window and run the command outside of SAS?

"C:\Program Files\WinZip\wzunzip.exe" -vb F:\myfolder\Test.zip test.txt

Stig_Eide
Fluorite | Level 6

There is a SUGI paper that discusses different strategies to read files inside compressed archives:

http://www2.sas.com/proceedings/sugi31/155-31.pdf

maybe that can help?

Stig
