I have a somewhat complex file I need to import and have found about 90% of the answer just need a little nudge to get me over the hump. The file has fixed positions for the data and the data is in different rows,see below for an example:
//**********************************************************************
//* SYSTEM: SYS01 Need to create variable/column SYSTEM from "//*SYSTEM:" - WORKS
Need the value after "//*SYSTEM:" put into the SYSTEM variable - WORKS
//**********************************************************************
ID: RESOURCE: CLASS: ACCESS: Need to create variables/columns from this line - WORKS
------------------------------------------------------------------------
@Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ Need to populate dataset with these values - WORKS
The issue i am faced with is that there are entries that have no data related to the varialbes created however the code below is populating the last known or recorded value for each variable for the remainder of the entries. I have included the full code below and sample data representative of the file.
Thanks in advance for any all assistance!
/*Sample data below along with column counts for variables 123456789|123456789|123456789|123456789|123456789|123456789|123456789|12 ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ */ %* ---------------------------------------------------------------------------------------------------- ; %* Used code in the link below as a beginning reference ; %* https://communities.sas.com/t5/Base-SAS-Programming/Complex-Flat-file-Reading-Importing-with-Different-Delimiter/m-p/120156#M24720 ; /*-------1---------2---------3---------4---------5---------6---------7---------8---------9---------0---*/ %* Start Complex Import File ; data test ; infile cards truncover ; format SYSTEM $6. ID $10. RESOURCE $44. CLASS $10. ACCESS $7. ; *Create a record for each row in the file and retain ; input @1 recType $72. @ ; *Define the variables for the header values assume a fixed width ; retain SYSTEM ID RESOURCE CLASS ACCESS ; *Delete any blank lines in the file ; if char(RecType, 1) = " " then do ; delete ; end ; *Remove extraneous lines between the values to extract from the file ; if recType in ("//**********************************************************************" ,"ID: RESOURCE: CLASS: ACCESS:" ,"------------------------------------------------------------------------") then do ; delete ; end ; *Start of the extract of values from the file ; prevType = lag(recType) ; if recType =: "//* SYSTEM:" then do ; input @14 SYSTEM $6. ; end ; else do ; if recType ne "//* SYSTEM:" then do ; input @1 ID $10. @11 RESOURCE $44. @56 CLASS $10. @66 ACCESS $7. ; end ; end ; *drop recType prevType ; cards4 ; //********************************************************************** //* SYSTEM: SYS01 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ SVSQ##PG RESRV99.VSVYRDVJ.** DATASET ALTER XLAGSUAV RESRV99.VSVYRDVJ.** DATASET READ NLNA RESRV99.VSVYRDVJ.** DATASET READ SYSPAS01 RESRV99.VSVYRDVJ.** DATASET ALTER //********************************************************************** //* SYSTEM: SYS02 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS03 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS04 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS05 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS06 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ ;;;;
See if this does as needed. Look for the comments with ##### to see the added lines. If there is ever a case when you won't have Id (or possibly resource) for the first system value but need it for output then you need to be very explicit in describing how we are going to know that case has been met.
data test ; infile cards truncover ; format SYSTEM $6. ID $10. RESOURCE $44. CLASS $10. ACCESS $7. ; *Create a record for each row in the file and retain ; input @1 recType $72. @ ; *Define the variables for the header values assume a fixed width ; retain SYSTEM ; Retain syscount 0; /* #######*/ *Delete any blank lines in the file ; if char(RecType, 1) = " " then do ; delete ; end ; *Remove extraneous lines between the values to extract from the file ; if recType in ("//**********************************************************************" ,"ID: RESOURCE: CLASS: ACCESS:" ,"------------------------------------------------------------------------") then do ; delete ; end ; *Start of the extract of values from the file ; prevType = lag(recType) ; if recType =: "//* SYSTEM:" then do ; input @14 SYSTEM $6. ; syscount+1; /*#######*/ end ; else do ; if recType ne "//* SYSTEM:" then do ; input @1 ID $10. @11 RESOURCE $44. @56 CLASS $10. @66 ACCESS $7. ; end ; end ; if syscount=1 and missing(id) then delete; /*#######*/ drop recType prevType syscount; cards4 ; //********************************************************************** //* SYSTEM: SYS01 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ SVSQ##PG RESRV99.VSVYRDVJ.** DATASET ALTER XLAGSUAV RESRV99.VSVYRDVJ.** DATASET READ NLNA RESRV99.VSVYRDVJ.** DATASET READ SYSPAS01 RESRV99.VSVYRDVJ.** DATASET ALTER //********************************************************************** //* SYSTEM: SYS02 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS03 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS04 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS05 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS06 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ ;;;;
Can you provide something that shows what the output should look like?
Or at least which variables are missing, what the values should be and some way to identify which records they are?
Your retain statements are what would hold previous values if you don't clear them out when no longer wanted. Perhaps instead of Delete you want an explicit output so that the values can be cleared after output??
I went back in and removed the RETAIN statement and that gets me 99% there, below is the edited version of the code. Thank you for pointing out the RETAIN statment! Attached are two files the new resulting output from the code below and the required end result.
/*Sample data below along with column counts for variables 123456789|123456789|123456789|123456789|123456789|123456789|123456789|12 ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ */ %* ---------------------------------------------------------------------------------------------------- ; %* Used code in the link below as a beginning reference ; %* https://communities.sas.com/t5/Base-SAS-Programming/Complex-Flat-file-Reading-Importing-with-Different-Delimiter/m-p/120156#M24720 ; /*-------1---------2---------3---------4---------5---------6---------7---------8---------9---------0---*/ %* Start Complex Import File ; data test ; infile cards truncover ; format SYSTEM $6. ID $10. RESOURCE $44. CLASS $10. ACCESS $7. ; *Create a record for each row in the file and retain ; input @1 recType $72. @ ; *Define the variables for the header values assume a fixed width ; * retain SYSTEM ID RESOURCE CLASS ACCESS ; *Delete any blank lines in the file ; if char(recType, 1) = " " then do ; delete ; end ; *Remove extraneous lines between the values to extract from the file ; if recType in ("//**********************************************************************" ,"ID: RESOURCE: CLASS: ACCESS:" ,"------------------------------------------------------------------------") then do ; delete ; end ; *Start of the extract of values from the file ; if recType =: "//* SYSTEM:" then do ; input @14 SYSTEM $6. ; end ; else do ; if recType ne "//* SYSTEM:" then do ; input @1 ID $10. @11 RESOURCE $44. @56 CLASS $10. @66 ACCESS $7. ; end ; end ; if recType = '//* SYSTEM:%' then do ; delete ; end ; drop recType ; cards4 ; //********************************************************************** //* SYSTEM: SYS01 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ SVSQ##PG RESRV99.VSVYRDVJ.** DATASET ALTER XLAGSUAV RESRV99.VSVYRDVJ.** DATASET READ NLNA RESRV99.VSVYRDVJ.** DATASET READ SYSPAS01 RESRV99.VSVYRDVJ.** DATASET ALTER //********************************************************************** //* SYSTEM: SYS02 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS03 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS04 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS05 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS06 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ ;;;;
If I change the retain statement to:
retain SYSTEM ;
I get what looks like what you want.
Thank you! This gets me to 99.9% now all I need to do is research removing the first blank line from the file.
SYSTEM ID RESOURCE CLASS ACCESS
SYS01 Need to remove this line
SYS01 @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ
SYS01 SVSQ##PG RESRV99.VSVYRDVJ.** DATASET ALTER
SYS01 XLAGSUAV RESRV99.VSVYRDVJ.** DATASET READ
SYS01 NLNA RESRV99.VSVYRDVJ.** DATASET READ
SYS01 SYSPAS01 RESRV99.VSVYRDVJ.** DATASET ALTER
SYS02
SYS03
SYS04
SYS05
SYS06
Does
If Id='SYS01' and missing(resource) then delete;
added after your other delete work? Or does the first value of ID not always equal SYS01?
Yes, that works just not an ideal solution as it involves hard coding the SYSTEM value. I have been trying to use the LAG function and FIRST. and LAST. to look back and look ahead to solve the problem. The code below works, output attached.
if SYSTEM = 'SYS01' and missing(ID) then delete ;
The output may not always be SYS01 with the missing information so I will need to keep digging and find a solution that works for not only the first record but any record in the middle or at the end of the file. So far my solutions are removing the relevant data at the end of the file were missing observations are valid. I am fairly certain I have tackled this problem in the past trying to find an example in my old code that I can reuse will definately post a response if I find the code.
Looks pretty simple to me. You just need keep track of whether you have output any rows for the current SYSTEM to know if you need to output an empty row for that system.
data want ;
infile cards truncover ;
length ROW 8 SYSTEM $6 ID $10 RESOURCE $44 CLASS $10 ACCESS $7 ;
retain system ;
retain row .;
input @;
if _infile_=:'//* SYSTEM' then do;
if row=0 then output;
system = scan(_infile_,-1,' ');
row=0;
end;
if _infile_ in: ('//*','---','ID:') then delete;
input id resource class access ;
row+1;
output;
cards4;
Thank you Tom! This worked perfectly.
I agree an old fashioned problem unfortuneately this is our only output from this legacy system and our department has no control over the output.
See if this does as needed. Look for the comments with ##### to see the added lines. If there is ever a case when you won't have Id (or possibly resource) for the first system value but need it for output then you need to be very explicit in describing how we are going to know that case has been met.
data test ; infile cards truncover ; format SYSTEM $6. ID $10. RESOURCE $44. CLASS $10. ACCESS $7. ; *Create a record for each row in the file and retain ; input @1 recType $72. @ ; *Define the variables for the header values assume a fixed width ; retain SYSTEM ; Retain syscount 0; /* #######*/ *Delete any blank lines in the file ; if char(RecType, 1) = " " then do ; delete ; end ; *Remove extraneous lines between the values to extract from the file ; if recType in ("//**********************************************************************" ,"ID: RESOURCE: CLASS: ACCESS:" ,"------------------------------------------------------------------------") then do ; delete ; end ; *Start of the extract of values from the file ; prevType = lag(recType) ; if recType =: "//* SYSTEM:" then do ; input @14 SYSTEM $6. ; syscount+1; /*#######*/ end ; else do ; if recType ne "//* SYSTEM:" then do ; input @1 ID $10. @11 RESOURCE $44. @56 CLASS $10. @66 ACCESS $7. ; end ; end ; if syscount=1 and missing(id) then delete; /*#######*/ drop recType prevType syscount; cards4 ; //********************************************************************** //* SYSTEM: SYS01 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ @Y9EEIE RESRV99.VSVYRDVJ.** DATASET READ SVSQ##PG RESRV99.VSVYRDVJ.** DATASET ALTER XLAGSUAV RESRV99.VSVYRDVJ.** DATASET READ NLNA RESRV99.VSVYRDVJ.** DATASET READ SYSPAS01 RESRV99.VSVYRDVJ.** DATASET ALTER //********************************************************************** //* SYSTEM: SYS02 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS03 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS04 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS05 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ //********************************************************************** //* SYSTEM: SYS06 //********************************************************************** ID: RESOURCE: CLASS: ACCESS: ------------------------------------------------------------------------ ;;;;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.