BookmarkSubscribeRSS Feed
yymissing
Calcite | Level 5

Hi All, I have a txt file that has three types of informaiton: name, address, and transations. There is one row per customer. The data has position defined for name and address, but not for transation. Transation has several  types and each type has its own length.  For example TRAN has length of 9, while SMET has length of 8. And each row can have any combination of TRAN and SMET.  Is there a way that I only read in transtions start with "TRAN" and keep each TRAN as a seperate row? 

 

raw data look like:

AMY   205ABB street NY   TRAN90890   TRAN89789   SMET7485   SMET7485
TED   205BBC street CA   SMET7885   SMET9885   TRAN23890   TRAN89129   SMET7655

 

I want the sas data look like:

 

AMY   205ABB street NY   TRAN90890 

AMY   205ABB street NY   TRAN89789  

TED   205BBC street CA   TRAN23890  

TED   205BBC street CA   TRAN89129

 

4 REPLIES 4
Astounding
PROC Star

Sort of.  What you will end up doing is reading in all of the possibilities, and only outputting the correct ones.  This example assumes you will replace the columns for the first two variables with the correct columns:

 

data want;

infile rawdata truncover;

length TRAN $ 9;

input name $ 1-5  address $ 6-25 @ ;

do until (tran=' ');

   input tran $ @ ;

   if tran =: 'TRAN' then output;

end;

run;

art297
Opal | Level 21

Here is one way you can do it:

 

data want;
  infile 'c:\art\textfile.txt' lrecl=200 truncover;
  informat name $20.;
  informat address $50.;
  informat tran $9.;
  input name address & @;
  do until (missing(tran));
    input @'TRAN' tran @;
    if not missing(tran) then do;
      tran=catt('TRAN',tran);
      output;
    end;
  end;
  input;
run;

Art, CEO, AnalystFinder.com

 

novinosrin
Tourmaline | Level 20

 

data have;

infile datalines truncover;

input  name $ address & $20. @;

i=1;

do until(scan(_infile_ ,i)=' ');

input dummy :$10.  @;

if dummy=:'TRAN' then output;

i+1;

end;

drop i;

datalines;

AMY   205ABB street NY   TRAN90890   TRAN89789   SMET7485   SMET7485

TED   205BBC street CA   SMET7885   SMET9885   TRAN23890   TRAN89129   SMET7655

;

 

Regards,

Naveen Srinivasan

 

Tom
Super User Tom
Super User

Your raw data would work better if it was delimited (like comma, tab, pipe '|', etc).  Otherwise it will be hard to find the end of the address.  Let's first look at a solution that assumes it is easy to read the values.  For example if the values where tab delimited.

data want ;
  infile 'myfile.txt' dsd dlm='09'x truncover ;
  length name $20 address $100 transaction $10 ;
  input name address transaction @ ;
  do col=1 by 1 until(transaction=' ');
     if col=1 or transaction ne ' ' then output;
     input transaction @;
  end;
run;

Another possibility is that the values are space delimited, but there are always two spaces after the address and the addresses do not contain embedded double spaces. Then you can use the & modifier.

  infile 'myfile.txt' truncover ;
  length name $20 address $100 transaction $10 ;
  input name address & transaction @ ;

If you don't have either of these then perhaps you can take advantage of the fact that your transactions always start with specific characters.  Probably better to use regular expression to search for the start of the transactions, but perhaps this would also work?

 

data want ;
  infile cards truncover ;
  length name $20 address $100 transaction $10 ;
  input @ ;
  name = scan(_infile_,1);
  loc1 = index(_infile_,'TRAN');
  loc2 = index(_infile_,'SMET');
  if loc1 and loc2 then loc=min(loc1,loc2);
  else loc=max(loc1,loc2);
  drop loc1 loc2 loc ;
  address = substr(substrn(_infile_,1,loc-1),length(name)+2);
  input @loc transaction @;
  do col=1 by 1 until(transaction=' ');
     if col=1 or transaction ne ' ' then output;
     input transaction @;
  end;
cards;
AMY   205ABB street NY   TRAN90890   TRAN89789   SMET7485   SMET7485
TED   205BBC street CA   SMET7885   SMET9885   TRAN23890   TRAN89129   SMET7655
;

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 950 views
  • 0 likes
  • 5 in conversation