BookmarkSubscribeRSS Feed
rvsidhu035
Quartz | Level 8

1 8001000004 S BUDATI CHIRUHA 6 07/07/1989 34XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
2 8001000081 MATANGI SAI SUNDAR KANTH 1 27/06/1995 3456721ZYWFGHMLJIACDEBQSPUVTRKON8
3 8001000102 KOLUSU VEERA VENKATA VAMSI KRI 6 05/08/1994 6L4357Z2YWKMJN81DEAQOPRSTUVBCFGHI
4 8001000106 INTURI SRINIVAS 27/06/1993 346725LWNYZ18MKIFGHCDEBAJOPQRSTUV
5 8001000134 BHUKYA VENKATESWARLU NAIK 2 08/04/1995 3456XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
6 8001000138 BURRI SWAPNA 1 01/07/1990 34612578ABCDEFGHIJKLMNOPQRSTUVWYZ

... i have like this text data 2000 observation how to separate

ser_no=1

reg_Id=8001000004

Name=S BUDATI CHIRUHA

Caste_Id=6

Dob=07/07/1989

i dont need remaing extra data like 34568xxxxxxxxxxxx

i am attached txt file below

 

thank you help me

 

6 REPLIES 6
Cynthia_sas
SAS Super FREQ
Hi: What code have you tried so far. You will need a DATA step program with an INFILE statement. How you construct your INPUT statement will be part of the challenge. It is too bad that you did not receive delimited file, like a tab delimited file or a comma delimited file, that would make your code much more simple. Is there a maximum number of name fields that you can have? That would be good information to know.

Cynthia
rvsidhu035
Quartz | Level 8
It's spaced dlm
novinosrin
Tourmaline | Level 20

data have; infile 'Your New Text Document.txt'; input ser_no reg_id : $20. @; call scan(_infile_, -3, position, length,' '); fwidth=position-anyalpha(_infile_); if length=1 then input @(anyalpha(_infile_)) name $varying40. fwidth caste_id dob :ddmmyy10.; format dob ddmmyy10.; drop fwidth position length; run;
/* this will get you close, and I challenge you to get it squeaky clean*//*specify the right path in the infile statement*/

 

Patrick
Opal | Level 21

Below will only work if you haven't got any missing values in the last 3 columns.

data sample(drop=_:);
  infile datalines dlm=' ' col=col truncover;

  input var1 :$3. var2 :$10. @;

  /* Sample 41355: Find the nth delimiter in a character string */
  /* http://support.sas.com/kb/41/355.html */
  call scan(strip(_infile_),-3,_pos,_len,' ');

  _fwidth=_pos -col -1; 
  input var3 $varying60. _fwidth  var4 :$1. var5 :ddmmyy10. var6 :$60.;
  format var5 date9.;


datalines;
1 8001000004 BUDATI CHIRUHAS 6 07/07/1989 34XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
6 8001000138 BURRI SWAPNA 1 01/07/1990 34612578ABCDEFGHIJKLMNOPQRSTUVWYZ
50 8001000562 SUGALI ARUN KUMAR NAIK 2 18/10/1994 345762FABCDEGHIJKLMNOPQRSTUVWYZ18
;
run;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

It is not space delimited in the correct sense:

1 8001000004 S BUDATI CHIRUHA 6 07/07/1989 34XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

 ^                    ^ ^             ^                ^                    ^

I indicate where spaces appear in the text, as you can see to import a space delimited would mean S and BUDATI, and CHIRUHA all go to different variables as that also contains spaces.  Its quite a bad file really, anything using a good standard, like Comma Separated Variable file, would make this import a simple problem.  As you have accepted a poor file format, likely with very little standardisation or documentation, then you will have problems importing it.  

 

You may be able to read the whole line in as one string and then process it as:

scan(1," ") to get first part, then substr from there to drop this

scan(2," ") to get second part, then substr from there to drop this

Next substr to findc(numbers)-1 to get name and then substr to drop this

Then scan each of the other two variables out

 

This is all caused by poor file format!

rvsidhu035
Quartz | Level 8
data mts(drop=file1 garb);
infile "C:\Documents and Settings\office\Desktop\New Text Document.txt";
input Ser_NO Roll_NO garb $43.;
file1=scan(garb,1,"/");
name=compress(file1," ","ka");
caste_no=scan(compress(file1," ","kd"),1);
dob=substr(garb,length(file1)-2,10);
/*if length(caste_no)=1 then output;else caste_no=0;*/
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 899 views
  • 0 likes
  • 5 in conversation