Complex File Import

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 19
Accepted Solution

Complex File Import

I have a somewhat complex file I need to import and have found about 90% of the answer just need a little nudge to get me over the hump. The file has fixed positions for the data and the data is in different rows,see below for an example:

 

//**********************************************************************

//* SYSTEM: SYS01 Need to create variable/column SYSTEM from "//*SYSTEM:" - WORKS

                                       Need the value after "//*SYSTEM:" put into the SYSTEM variable - WORKS

//**********************************************************************

ID: RESOURCE: CLASS: ACCESS: Need to create variables/columns from this line - WORKS

------------------------------------------------------------------------

@Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ Need to populate dataset with these values - WORKS

 

The issue i am faced with is that there are entries that have no data related to the varialbes created however the code below is populating the last known or recorded value for each variable for the remainder of the entries. I have included the full code below and sample data representative of the file.

 

Thanks in advance for any all assistance!

 

/*Sample data below along with column counts for variables
123456789|123456789|123456789|123456789|123456789|123456789|123456789|12

ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
*/

%*	---------------------------------------------------------------------------------------------------- ;
%*	Used code in the link below as a beginning reference ;
%*	https://communities.sas.com/t5/Base-SAS-Programming/Complex-Flat-file-Reading-Importing-with-Different-Delimiter/m-p/120156#M24720 ;

/*-------1---------2---------3---------4---------5---------6---------7---------8---------9---------0---*/
%* Start Complex Import File ;
data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
	retain	SYSTEM ID RESOURCE CLASS ACCESS ;

*Delete any blank lines in the file ;
	if char(RecType, 1) = " " then do ; 
		delete ; 
	end ;
*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	prevType = lag(recType) ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;

*drop	recType prevType ;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;

Accepted Solutions
Solution
‎01-25-2017 11:54 AM
Grand Advisor
Posts: 10,211

Re: Complex File Import

[ Edited ]

See if this does as needed. Look for the comments with ##### to see the added lines. If there is ever a case when you won't have Id (or possibly resource) for the first system value but need it for output then you need to be very explicit in describing how we are going to know that case has been met.

data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
	retain	SYSTEM  ;
   Retain syscount 0; /* #######*/

*Delete any blank lines in the file ;
	if char(RecType, 1) = " " then do ; 
		delete ; 
	end ;
*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	prevType = lag(recType) ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
      syscount+1; /*#######*/
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;
   if syscount=1 and missing(id) then delete; /*#######*/

drop	recType prevType syscount;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;

 

 

View solution in original post


All Replies
Grand Advisor
Posts: 10,211

Re: Complex File Import

Can you provide something that shows what the output should look like?

 

Or at least which variables are missing, what the values should be and some way to identify which records they are?

 

Your retain statements are what would hold previous values if you don't clear them out when no longer wanted. Perhaps instead of Delete you want an explicit output so that the values can be cleared after output??

 

Occasional Contributor
Posts: 19

Re: Complex File Import

I went back in and removed the RETAIN statement and that gets me 99% there, below is the edited version of the code. Thank you for pointing out the RETAIN statment! Attached are two files the new resulting output from the code below and the required end result.

 

/*Sample data below along with column counts for variables
123456789|123456789|123456789|123456789|123456789|123456789|123456789|12

ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
*/

%*	---------------------------------------------------------------------------------------------------- ;
%*	Used code in the link below as a beginning reference ;
%*	https://communities.sas.com/t5/Base-SAS-Programming/Complex-Flat-file-Reading-Importing-with-Different-Delimiter/m-p/120156#M24720 ;

/*-------1---------2---------3---------4---------5---------6---------7---------8---------9---------0---*/
%* Start Complex Import File ;
data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
*	retain	SYSTEM ID RESOURCE CLASS ACCESS ;

	*Delete any blank lines in the file ;
	if char(recType, 1) = " " then do ; 
		delete ; 
	end ;

*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;

	if recType = '//* SYSTEM:%' then do ; 
		delete ; 
	end ;

drop	recType ;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;
Grand Advisor
Posts: 10,211

Re: Complex File Import

If I change the retain statement to:

retain SYSTEM ;

 

I get what looks like what you want.

Occasional Contributor
Posts: 19

Re: Complex File Import

Thank you! This gets me to 99.9% now all I need to do is research removing the first blank line from the file.

SYSTEM ID RESOURCE CLASS ACCESS
SYS01  Need to remove this line  
SYS01 @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ
SYS01 SVSQ##PG RESRV99.VSVYRDVJ.** DATASET ALTER
SYS01 XLAGSUAV RESRV99.VSVYRDVJ.** DATASET READ
SYS01 NLNA RESRV99.VSVYRDVJ.** DATASET READ
SYS01 SYSPAS01 RESRV99.VSVYRDVJ.** DATASET ALTER
SYS02    
SYS03    
SYS04    
SYS05    
SYS06    

Grand Advisor
Posts: 10,211

Re: Complex File Import

 

Does

 

If Id='SYS01' and missing(resource) then delete;

 

added after your other delete work? Or does the first value of ID not always equal SYS01?

Occasional Contributor
Posts: 19

Re: Complex File Import

Yes, that works just not an ideal solution as it involves hard coding the SYSTEM value. I have been trying to use the LAG function and FIRST. and LAST. to look back and look ahead to solve the problem. The code below works, output attached.

 

if SYSTEM = 'SYS01' and missing(ID) then delete ;

The output may not always be SYS01 with the missing information so I will need to keep digging and find a solution that works for not only the first record but any record in the middle or at the end of the file. So far my solutions are removing the relevant data at the end of the file were missing observations are valid. I am fairly certain I have tackled this problem in the past trying to find an example in my old code that I can reuse will definately post a response if I find the code.

Super User
Super User
Posts: 6,351

Re: Complex File Import

Looks pretty simple to me.  You just need keep track of whether you have output any rows for the current SYSTEM to know if you need to output an empty row for that system.

 

data want ;
  infile cards truncover ;
  length ROW 8 SYSTEM $6 ID $10 RESOURCE $44 CLASS $10 ACCESS $7 ;
  retain system ;
  retain row .;
  input @;
  if _infile_=:'//* SYSTEM' then do;
    if row=0 then output;
    system = scan(_infile_,-1,' ');
    row=0;
  end;
  if _infile_ in: ('//*','---','ID:') then delete;
  input id resource class access ;
  row+1;
  output;
cards4;
Occasional Contributor
Posts: 19

Re: Complex File Import

Thank you Tom! This worked perfectly.

Valued Guide
Posts: 2,174

Re: Complex File Import

This looks like a veey old fashioned problem. Would the modern way be just to get the application which produces the report to provide its input files. (looks like system management info - from a serious app that should be able to provide all the report data through a better interface API than scraping a text report)
Occasional Contributor
Posts: 19

Re: Complex File Import

I agree an old fashioned problem unfortuneately this is our only output from this legacy system and our department has no control over the output.

Solution
‎01-25-2017 11:54 AM
Grand Advisor
Posts: 10,211

Re: Complex File Import

[ Edited ]

See if this does as needed. Look for the comments with ##### to see the added lines. If there is ever a case when you won't have Id (or possibly resource) for the first system value but need it for output then you need to be very explicit in describing how we are going to know that case has been met.

data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
	retain	SYSTEM  ;
   Retain syscount 0; /* #######*/

*Delete any blank lines in the file ;
	if char(RecType, 1) = " " then do ; 
		delete ; 
	end ;
*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	prevType = lag(recType) ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
      syscount+1; /*#######*/
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;
   if syscount=1 and missing(id) then delete; /*#######*/

drop	recType prevType syscount;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;

 

 

Occasional Contributor
Posts: 19

Re: Complex File Import

@ballardw and @Tom Wanted to thank both of you for your solutions both worked great! Thank you for all your assistance!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 193 views
  • 0 likes
  • 4 in conversation