BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GriffT
Obsidian | Level 7

I have a somewhat complex file I need to import and have found about 90% of the answer just need a little nudge to get me over the hump. The file has fixed positions for the data and the data is in different rows,see below for an example:

 

//**********************************************************************

//* SYSTEM: SYS01 Need to create variable/column SYSTEM from "//*SYSTEM:" - WORKS

                                       Need the value after "//*SYSTEM:" put into the SYSTEM variable - WORKS

//**********************************************************************

ID: RESOURCE: CLASS: ACCESS: Need to create variables/columns from this line - WORKS

------------------------------------------------------------------------

@Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ Need to populate dataset with these values - WORKS

 

The issue i am faced with is that there are entries that have no data related to the varialbes created however the code below is populating the last known or recorded value for each variable for the remainder of the entries. I have included the full code below and sample data representative of the file.

 

Thanks in advance for any all assistance!

 

/*Sample data below along with column counts for variables
123456789|123456789|123456789|123456789|123456789|123456789|123456789|12

ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
*/

%*	---------------------------------------------------------------------------------------------------- ;
%*	Used code in the link below as a beginning reference ;
%*	https://communities.sas.com/t5/Base-SAS-Programming/Complex-Flat-file-Reading-Importing-with-Different-Delimiter/m-p/120156#M24720 ;

/*-------1---------2---------3---------4---------5---------6---------7---------8---------9---------0---*/
%* Start Complex Import File ;
data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
	retain	SYSTEM ID RESOURCE CLASS ACCESS ;

*Delete any blank lines in the file ;
	if char(RecType, 1) = " " then do ; 
		delete ; 
	end ;
*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	prevType = lag(recType) ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;

*drop	recType prevType ;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

See if this does as needed. Look for the comments with ##### to see the added lines. If there is ever a case when you won't have Id (or possibly resource) for the first system value but need it for output then you need to be very explicit in describing how we are going to know that case has been met.

data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
	retain	SYSTEM  ;
   Retain syscount 0; /* #######*/

*Delete any blank lines in the file ;
	if char(RecType, 1) = " " then do ; 
		delete ; 
	end ;
*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	prevType = lag(recType) ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
      syscount+1; /*#######*/
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;
   if syscount=1 and missing(id) then delete; /*#######*/

drop	recType prevType syscount;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;

 

 

View solution in original post

12 REPLIES 12
ballardw
Super User

Can you provide something that shows what the output should look like?

 

Or at least which variables are missing, what the values should be and some way to identify which records they are?

 

Your retain statements are what would hold previous values if you don't clear them out when no longer wanted. Perhaps instead of Delete you want an explicit output so that the values can be cleared after output??

 

GriffT
Obsidian | Level 7

I went back in and removed the RETAIN statement and that gets me 99% there, below is the edited version of the code. Thank you for pointing out the RETAIN statment! Attached are two files the new resulting output from the code below and the required end result.

 

/*Sample data below along with column counts for variables
123456789|123456789|123456789|123456789|123456789|123456789|123456789|12

ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
*/

%*	---------------------------------------------------------------------------------------------------- ;
%*	Used code in the link below as a beginning reference ;
%*	https://communities.sas.com/t5/Base-SAS-Programming/Complex-Flat-file-Reading-Importing-with-Different-Delimiter/m-p/120156#M24720 ;

/*-------1---------2---------3---------4---------5---------6---------7---------8---------9---------0---*/
%* Start Complex Import File ;
data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
*	retain	SYSTEM ID RESOURCE CLASS ACCESS ;

	*Delete any blank lines in the file ;
	if char(recType, 1) = " " then do ; 
		delete ; 
	end ;

*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;

	if recType = '//* SYSTEM:%' then do ; 
		delete ; 
	end ;

drop	recType ;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;
ballardw
Super User

If I change the retain statement to:

retain SYSTEM ;

 

I get what looks like what you want.

GriffT
Obsidian | Level 7

Thank you! This gets me to 99.9% now all I need to do is research removing the first blank line from the file.

SYSTEM ID RESOURCE CLASS ACCESS
SYS01  Need to remove this line  
SYS01 @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ
SYS01 SVSQ##PG RESRV99.VSVYRDVJ.** DATASET ALTER
SYS01 XLAGSUAV RESRV99.VSVYRDVJ.** DATASET READ
SYS01 NLNA RESRV99.VSVYRDVJ.** DATASET READ
SYS01 SYSPAS01 RESRV99.VSVYRDVJ.** DATASET ALTER
SYS02    
SYS03    
SYS04    
SYS05    
SYS06    

ballardw
Super User

 

Does

 

If Id='SYS01' and missing(resource) then delete;

 

added after your other delete work? Or does the first value of ID not always equal SYS01?

GriffT
Obsidian | Level 7

Yes, that works just not an ideal solution as it involves hard coding the SYSTEM value. I have been trying to use the LAG function and FIRST. and LAST. to look back and look ahead to solve the problem. The code below works, output attached.

 

if SYSTEM = 'SYS01' and missing(ID) then delete ;

The output may not always be SYS01 with the missing information so I will need to keep digging and find a solution that works for not only the first record but any record in the middle or at the end of the file. So far my solutions are removing the relevant data at the end of the file were missing observations are valid. I am fairly certain I have tackled this problem in the past trying to find an example in my old code that I can reuse will definately post a response if I find the code.

Tom
Super User Tom
Super User

Looks pretty simple to me.  You just need keep track of whether you have output any rows for the current SYSTEM to know if you need to output an empty row for that system.

 

data want ;
  infile cards truncover ;
  length ROW 8 SYSTEM $6 ID $10 RESOURCE $44 CLASS $10 ACCESS $7 ;
  retain system ;
  retain row .;
  input @;
  if _infile_=:'//* SYSTEM' then do;
    if row=0 then output;
    system = scan(_infile_,-1,' ');
    row=0;
  end;
  if _infile_ in: ('//*','---','ID:') then delete;
  input id resource class access ;
  row+1;
  output;
cards4;
GriffT
Obsidian | Level 7

Thank you Tom! This worked perfectly.

Peter_C
Rhodochrosite | Level 12
This looks like a veey old fashioned problem. Would the modern way be just to get the application which produces the report to provide its input files. (looks like system management info - from a serious app that should be able to provide all the report data through a better interface API than scraping a text report)
GriffT
Obsidian | Level 7

I agree an old fashioned problem unfortuneately this is our only output from this legacy system and our department has no control over the output.

ballardw
Super User

See if this does as needed. Look for the comments with ##### to see the added lines. If there is ever a case when you won't have Id (or possibly resource) for the first system value but need it for output then you need to be very explicit in describing how we are going to know that case has been met.

data test ;
	infile cards truncover ;
	format	SYSTEM	$6.
				ID			$10.
				RESOURCE	$44.
				CLASS		$10.
				ACCESS	$7. ;
*Create a record for each row in the file and retain ;
	input	@1	recType	$72. @ ;
*Define the variables for the header values assume a fixed width ;
	retain	SYSTEM  ;
   Retain syscount 0; /* #######*/

*Delete any blank lines in the file ;
	if char(RecType, 1) = " " then do ; 
		delete ; 
	end ;
*Remove extraneous lines between the values to extract from the file ;
	if recType in ("//**********************************************************************"
						,"ID:       RESOURCE:                                    CLASS:    ACCESS:"
						,"------------------------------------------------------------------------")
			then do ;
		delete ; 
	end ;

*Start of the extract of values from the file ;
	prevType = lag(recType) ;
	if recType =: "//* SYSTEM:" then do ; 
		input @14 SYSTEM $6. ; 
      syscount+1; /*#######*/
	end ; 
	else do ; 
		if recType ne "//* SYSTEM:" 
			then do ; 
			input	@1		ID			$10.
					@11	RESOURCE	$44.
					@56	CLASS		$10.
					@66	ACCESS	$7. ;
			end ;
	end ;
   if syscount=1 and missing(id) then delete; /*#######*/

drop	recType prevType syscount;
cards4 ;
//**********************************************************************
//* SYSTEM:  SYS01
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
@Y9EEIE   RESRV99.VSVYRDVJ.**                          DATASET   READ
SVSQ##PG  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
XLAGSUAV  RESRV99.VSVYRDVJ.**                          DATASET   READ
NLNA      RESRV99.VSVYRDVJ.**                          DATASET   READ
SYSPAS01  RESRV99.VSVYRDVJ.**                          DATASET   ALTER
//**********************************************************************
//* SYSTEM:  SYS02
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS03
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS04
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS05
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
//**********************************************************************
//* SYSTEM:  SYS06
//**********************************************************************
ID:       RESOURCE:                                    CLASS:    ACCESS:
------------------------------------------------------------------------
;;;;

 

 

GriffT
Obsidian | Level 7

@ballardw and @Tom Wanted to thank both of you for your solutions both worked great! Thank you for all your assistance!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 2011 views
  • 0 likes
  • 4 in conversation