How to bring four observations into four different columns from a text...

alepage · Posted 11-22-2024 12:05 PM

Hello,

I have cleanup my log file to get something like that:

1         + %doit(/dwh_actuariat/sasdata/Data_Retention,2011,wu,prop,habi,fev,4V3098387)
the sas dataset wu_prop_prmfev2011 exist
this is an prop premium dataset
0 observations where found for the agreement number: 4V3098387 into the dataset wu_prop_prmfev2011
2         + %doit(/dwh_actuariat/sasdata/Data_Retention,2011,wu,prop,habi,fev,P34V3098387)
the sas dataset wu_prop_prmfev2011 exist
this is an prop premium dataset
1 observations where found for the agreement number: P34V3098387 into the dataset wu_prop_prmfev2011

so If we look at the above-mentioned example, it is the content of the variable text. What I would like to get is a dataset with two lines and four columns where in the first column we have the %doit statement, second column we have the sas dataset ... exits, third column this is a prop premium datasets and last column remarks = 1 observations where found for the agreement number: P34V3098387 into the dataset wu_prop_prmfev2011

How to do those manipulation ?

Kurt_Bremser · Posted 11-22-2024 12:48 PM

Use four INPUT statements:

data want;
infile "...." truncover;
length var1 var2 var3 var4 $80;
input var1 $80.;
input var2 $80.;
input var3 $80.;
input var4 $80.;
run;

Increase/decrease the lengths and informats as needed;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Tom · Posted 11-22-2024 12:58 PM

What "manipulation"? Just read it that way.

First let's create a file with the lines of text you posted so we have something we can program against.

filename text temp;
options parmcards=text;
parmcards4;
1         + %doit(/dwh_actuariat/sasdata/Data_Retention,2011,wu,prop,habi,fev,4V3098387)
the sas dataset wu_prop_prmfev2011 exist
this is an prop premium dataset
0 observations where found for the agreement number: 4V3098387 into the dataset wu_prop_prmfev2011
2         + %doit(/dwh_actuariat/sasdata/Data_Retention,2011,wu,prop,habi,fev,P34V3098387)
the sas dataset wu_prop_prmfev2011 exist
this is an prop premium dataset
1 observations where found for the agreement number: P34V3098387 into the dataset wu_prop_prmfev2011
;;;;

If you know the file consists of 4 lines per observation you could use the N= option of the INFILE statement and the # cursor motion operator of the INPUT statement. So perhaps something like this:

data want;
  infile text n=4 truncover ;
  input #1 @'+' doit_call $100.
        #2 @'dataset' memname :$32. status :$8. 
        #3 comment $100.
        #4 nobs @'agreement number:' agreement :$20.
  ;
run;

Result

alepage · Posted 11-22-2024 02:22 PM

How do we do that if I have already the variable text into a SAS dataset and if I have done some cleanup to make sure that we have a structure of 4 lines

Tom · Posted 11-22-2024 03:07 PM

@alepage wrote:
How do we do that if I have already the variable text into a SAS dataset and if I have done some cleanup to make sure that we have a structure of 4 lines

Depends what the dataset you have looks like. Share what you have in the data for a couple of observations. For example if you have a dataset named HAVE with four variables named LINE1 to LINE4 you could share example of what you have for one observations by creating a program like:

data have;
  length line1-line4 $100;
  line1=' + %xxx();';
  line2='other stuff';
  line3='third line stuff';
  line4='last line stuff';
  output;
run;

If you have those strings in four variables then you can use SCAN() to pull out words from the strings. You can use INPUT() function to convert strings that look like numbers into actual numbers. If you need to search the text for the location to pull out you can probably use FIND() or one of the PX... functions that supports the use for regular expressions.

alepage · Posted 11-22-2024 03:17 PM

Hello,

I have imported my log file into a sas dataset with only one variable, the variable text.

Then I did some cleanup into it in order to keep only the four observations sequence as below

1         + %doit(/dwh_actuariat/sasdata/Data_Retention,2011,wu,prop,habi,fev,4V3098387)
the sas dataset wu_prop_prmfev2011 exist
this is an prop premium dataset
0 observations where found for the agreement number: 4V3098387 into the dataset wu_prop_prmfev2011
2         + %doit(/dwh_actuariat/sasdata/Data_Retention,2011,wu,prop,habi,fev,P34V3098387)
the sas dataset wu_prop_prmfev2011 exist
this is an prop premium dataset
1 observations where found for the agreement number: P34V3098387 into the dataset wu_prop_prmfev2011

so for the first observation I have something starting with + %doit

second observation something starting with the sas dataset

third observation this is an

and last observation x observations where found...

this 4 observations sequence is repeted 311 times.

I want to create a dataset having 311 observations / lines with var1=first line var2=second line var3=thrid line and finally var4 = fourth line.

In summary, we start with a sas dataset having 1 variable = text with 1244 observations and I want to transform this information into a new dataset of 311 lines and 4 variable

How do we do that

Tom · Posted 11-22-2024 03:54 PM

To convert every four observations into one here are a couple of methods.

You could add a group variable and then transpose by that variable.

This will create a dataset with 5 variables. GROUP and TEXT1 to TEXT4.

data step1;
  set have;
  group + mod(_n_,4)=1 ;
run;
proc transpose data=step1 out=step2(drop=_name_) prefix=text;
  by group;
  var text;
run;

You could do the same thing in single data step by defining an array and putting a DO loop around the SET statement. So in this example we also add the GROUP variable. We also need to set the length of the new variables, which we can do when we define the array.

data step2;
   group+1;
   do i=1 to 4;
      array t $200 text1-text4;
      set have;
      t[i]=text;
   end;
  drop i text;
run;

Let's read in your example data and try it.

data have;
  infile text truncover;
  input text $char200.;
run;

Results:

How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Re: How to bring four observations into four different columns from a text file

Registration is open

SAS Training: Just a Click Away