DATA Step, Macro, Functions and more

Reg:DATA LOADING

Reply
Regular Contributor
Posts: 229

Reg:DATA LOADING

Hi i am getting data in txt file with having Enter(not word it means going to next line)

i am having the data like this as i am having 72 variables and 80000 obs


ajay|ku
mar|2656|
muni|carr
om|455676|         
ajay|kumar|9890|vijay|kumar|589|user|flow|568

i want the output

Name  user    no
ajay    kumar    2656
muni   carrom  455676
ajay   kumar   9890
vijay   kumar   589
user   flow     568

PROC Star
Posts: 7,362

Reg:DATA LOADING

One possibility would be to run one datastep to rewrite the file, and then another to read it.  E.g.:

data _NULL_ ;

  FILE "c:\art\want.txt" RECFM=N ;

  infile cards;

  input;

  do i=1 to length(_infile_);

    dummy=substr(_infile_,i,1);

    IF dummy GT '13'X THEN do;

      PUT dummy $CHAR1. ;

    end;

  end;

  cards;

ajay|ku

mar|2656|

muni|carr

om|455676|         

ajay|kumar|9890|vijay|kumar|589|user|flow|568

;

data want;

  infile "c:\art\want.txt" dlm="|" dsd;

  informat first last $15.;

  input first last number @@;

run;

Regular Contributor
Posts: 229

Re: Reg:DATA LOADING

data temp;
infile cards missover truncover;
input sclurp $32000. ;
cards;
ajay|ku
mar|2656|34
      
muni|carr
om|455676|56
       
ajay|kumar|9890|78|vijay|kumar|589|90|
user|flow|568
    
|67
run;

data temp2;
/*input @;*/
set temp;
sclurp=cats(of sclurp );
retain sclurp '';
if not mod(countw(sclurp,'|'),4) then
  do;
   do i=1 to countw(sclurp,'|') by 4;
    name=input(scan(sclurp,i,'|'),$32.);
    user=input(scan(sclurp,i+1,'|'),$32.);
    no=input(scan(sclurp,i+2,'|'),8.);
    age=input(scan(sclurp,i+3,'|'),8.);
    output;
   end;
   sclurp='';
  end;
drop sclurp i;

run;
proc print;
run;

Tryed by this way it was getting errors

PROC Star
Posts: 7,362

Re: Reg:DATA LOADING

You have added two extra conditions.  Now, in addition to having an extra numeric field, sometime your records end with a pipe, and sometimes they don't.

I tested the following with your new set of test data:

data _NULL_ ;

  FILE "c:\want.txt" RECFM=N ;

  infile cards;

  input;

  do i=1 to length(_infile_);

    dummy=substr(_infile_,i,1);

    if dummy eq "|" then counter+1;

    if counter eq 3 and anyalpha(dummy) then do;

      put "|";

      counter+1;

    end;

    if counter eq 4 then counter=0;

    IF dummy GT '13'X THEN do;

      PUT dummy $CHAR1. ;

    end;

  end;

  cards;

ajay|ku

mar|2656|34

      

muni|carr

om|455676|56

       

ajay|kumar|9890|78|vijay|kumar|589|90|

user|flow|568

    

|67

;

data want;

  infile "c:\want.txt" dlm="|" dsd;

  informat first last $15.;

  input first last number1 number2 @@;

run;

Trusted Advisor
Posts: 1,300

Reg:DATA LOADING

I will not say this works for all possible cases but it works with the examples given:

data foo;

length slurp $32000;

retain slurp '';

input @;

slurp=cats(of slurp _infile_);

if not mod(countw(slurp,'|'),3) then

  do;

   do i=1 to countw(slurp,'|') by 3;

    name=input(scan(slurp,i,'|'),$32.);

    user=input(scan(slurp,i+1,'|'),$32.);

    no=input(scan(slurp,i+2,'|'),8.);

    output;

   end;

   slurp='';

  end;

drop slurp i;

cards;

ajay|ku

mar|2656

muni|carr

om|455676

ajay|kumar|9890|vijay|kumar|589|user|flow|568

;

run;

Regular Contributor
Posts: 229

Re: Reg:DATA LOADING

data temp;
infile cards missover truncover;
input sclurp $32000. ;
cards;
ajay|ku
mar|2656|34
      
muni|carr
om|455676|56
       
ajay|kumar|9890|78|vijay|kumar|589|90|
user|flow|568
    
|67
run;

data temp2;
/*input @;*/
set temp;
sclurp=cats(of sclurp );
retain sclurp '';
if not mod(countw(sclurp,'|'),4) then
  do;
   do i=1 to countw(sclurp,'|') by 4;
    name=input(scan(sclurp,i,'|'),$32.);
    user=input(scan(sclurp,i+1,'|'),$32.);
    no=input(scan(sclurp,i+2,'|'),8.);
    age=input(scan(sclurp,i+3,'|'),8.);
    output;
   end;
   sclurp='';
  end;
drop sclurp i;

run;
proc print;
run;

I am trying by this way i am not getting the output i wnat in this way i want the inital data in to a dataset as i cant give datalines as my data is huge

Trusted Advisor
Posts: 1,300

Reg:DATA LOADING

Looking at the above very quickly I see at least one issue.

sclurp=cats(of sclurp );

this is not collecting like you expect.

Trusted Advisor
Posts: 1,300

Reg:DATA LOADING

R_Win - Can you please let me know what the actual hex value for the 'ENTER' is?

Regular Contributor
Posts: 229

Re: Reg:DATA LOADING

it is '13'X i am thinking as i searched in net it has given as..

Trusted Advisor
Posts: 1,300

Reg:DATA LOADING

Did you get this from searching the internet or from actually looking at the file?  Does the file contain any proper EOL chars?

What OS are you running this job on?

Super User
Posts: 9,676

Reg:DATA LOADING

How about:

data want(keep=name user no);
infile datalines eof=last;
length name user no temp $ 40;
input a $1. @@;
retain temp name user no;
if a ne '|' then temp=cats(temp,a);
 else do;
         count+1;
         select(mod(count,3));
          when (1) name=temp;
          when (2) user=temp;
          when (0) do; no=temp;output;end;
         end;
         call missing(temp);
       end;
return;
last:  no=temp;output;
datalines;
ajay|ku
mar|2656|
muni|carr
om|455676|         
ajay|kumar|9890|vijay|kumar|589|user|flow|568
;
run;



Ksharp

Regular Contributor
Posts: 229

Reg:DATA LOADING

length name user no temp $ 40;

data want;

input temp $300.;

cards;

ajay|ku

mar|2656|

       

muni|carr

om|455676|        

ajay|kumar|9890|vijay|kumar|

        

589|user|flow|568

run;

data want1(keep=name user no);

set want;

length name user no temp $ 40;

input a $1. @@;

retain temp name user no;

if a ne '|' then temp=cats(temp,a);

else do;

         count+1;

         select(mod(count,3));

          when (1) name=temp;

          when (2) user=temp;

          when (0) do; no=temp;output;end;

         end;

         call missing(temp);

       end;

return;

last:  no=temp;output;

datalines;

ajay|ku

mar|2656|

       

muni|carr

om|455676|        

ajay|kumar|9890|vijay|kumar|

        

589|user|flow|568

;

run;

proc print;

run;

Actually this contains datalines reading from the same dataset actually i wnat to read the data intially to a dataset and from that dataset i want to read to another dataset as you have written this code by using datalines as my data is huge i cant keep the datalines...

Super User
Posts: 9,676

Re: Reg:DATA LOADING

But you said you have a txt file ,that is almost like this situation.

data want(keep=name user no);
infile  'c:\x.txt' eof=last;
length name user no temp $ 40;
input a $1. @@;
retain temp name user no;
if a ne '|' then temp=cats(temp,a);
 else do;
         count+1;
         select(mod(count,3));
          when (1) name=temp;
          when (2) user=temp;
          when (0) do; no=temp;output;end;
         end;
         call missing(temp);
       end;
return;
last: no=temp;output; 
run;




Ksharp

Regular Contributor
Posts: 229

Re: Reg:DATA LOADING

i have taken the data like this

Data taken in txt file:

---------------------------

nani|kumar
    |20

kumar|raju|45|arun|kumar|78
n
aresh|k|45

But i am getting wrong output:

----------------------------------------

Obs    name     user        no

                                                 1     nani     kumar       20kumar
                                                 2     raju     45          arun  
                                                 3     kumar    78naresh    k     
                                                 4     kumar    78naresh    45    

Desired output

----------------------------

nani       kumar  20

kumar   raju      45

arun      kumar   78

naresh  k          45

PROC Star
Posts: 7,362

Re: Reg:DATA LOADING

In your examples you have shown two quite different situations.  One has name1,name2 and number, and the other has name1,name2, number1 and number2.  Obviously, you need different input statements for those two very different situations.

Did you try the method that I had initially proposed?

Ask a Question
Discussion stats
  • 24 replies
  • 275 views
  • 0 likes
  • 6 in conversation