Hello everybody,
I just received a flat file that I need to convert into a meaningful SAS data set.
The data looks like the sample below. It is just a snapshot.
-------------------------------------------------------------------------------------------------------------------
!Scenario=ACTUAL
!Year=2013
!Period=Dec
!View=YTD
!Entity=29xxx
!Value=USD Total
!PDR=No
GC14A03;[ICP Top];MGMT;EOP;AllCuxxx;GROSS;32.7802
GC14A03;[ICP Top];MGMT;EOP;CAxxx_TOT;GROSS;32.7802
<Note: One blank line is present between two sets of data>
!Scenario=ACTUAL
!Year=2013
!Period=Jan
!View=YTD
!Entity=2xx01
!Value=USD Total
!PDR=No
GC14A10;[ICP Top];MGMT;EOP;AllCuxxm3;GROSS;0
...
...
...
-----------------------------------------------------------------------------------------------
Consider the sample above. Each set has two parts: 1) a header block that starts with Scenario and ends with PDR, and 2) one or more data lines that start with GC14xxx and end with a numeric value. After each such pair, one blank line is present.
The problem is that I need the data in the layout below. Here are a few lines of the output records:
Scenario | Year | Period | View | Entity | Value | PDR | Var1 | Var2 | Var3 | Var4 | Var5 | Var6 | Var7 |
Actual | 2013 | Dec | YTD | 29xxx | USD | No | GC14A03 | [ICP Top] | MGMT | EOP | AllCuxxx | GROSS | 32.7802 |
Actual | 2013 | Dec | YTD | 29xxx | USD | No | GC14A03 | [ICP Top] | MGMT | EOP | CAxxx_TOT | GROSS | 32.7802 |
Actual | 2013 | Dec | YTD | 2xx01 | USD | No | GC14A10 | [ICP Top] | MGMT | EOP | AllCuxxm3 | GROSS | 0 |
I have no idea how to approach this. Any help or pointer would be very useful.
Let me know if you need more information or if something is not clear.
Accepted Solutions
Suvi104,
Have a look at the following code. It uses test data; the DATA step is written to read this kind of data:
* the DATA statement was not shown in the post, the data set name "want" is an assumption;
data want;
  infile cards dlm=";" truncover;
  *
  * check for the "record type" and hold the record with the trailing @
  *;
  input
    @1 recType $16. @
  ;
  *
  * define the vars for the header values, assume a fixed number
  *;
  retain scenario year period view entity value pdr;
  array myHeader{*} $ 12 scenario year period view entity value pdr;
  *
  * do not need the blank lines
  *;
  if char(recType, 1) = " " then do;
    delete;
  end;
  *
  * start of header values, we will read the fixed number of lines
  *;
  if recType =: "!Scenario" then do;
    do i = 1 to dim(myHeader);
      input @1 tempValue $32.;
      myHeader{i} = scan(tempValue, 2, "=");
    end;
  end;
  else do;
    *
    * read the "data" lines and write to the output data set
    *;
    input
      @1
      v1 : $32.
      v2 : $32.
      v3 : $32.
      v4 : $32.
      v5 : $32.
      v6 : $32.
      v7 : 8.
    ;
    output;
  end;
  *
  * we do not need these variables
  *;
  * drop recType i tempValue;
cards4;
!Scenario=ACTUAL
!Year=2013
!Period=Dec
!View=YTD
!Entity=29xxx
!Value=USD Total
!PDR=No
GC14A03;[ICP Top];MGMT;EOP;AllCuxxx;GROSS;32.7802
GC14A03;[ICP Top];MGMT;EOP;CAxxx_TOT;GROSS;32.7802
!Scenario=scenario2
!Year=year2
!Period=period2
!View=view2
!Entity=entity2
!Value=vaue2
!PDR=pdr2
r1_GC14A03;[ICP Top];MGMT;EOP;AllCuxxx;GROSS;32.7802
r2_GC14A03;[ICP Top];MGMT;EOP;AllCuxxx;GROSS;32.7802
!Scenario=ACTUAL
!Year=2013
!Period=Jan
!View=YTD
!Entity=2xx01
!Value=USD Total
!PDR=No
GC14A10;[ICP Top];MGMT;EOP;AllCuxxm3;GROSS;0
GC14A11;[ICP Top];MGMT;EOP;AllCuxxm3;GROSS;0
GC14A12;[ICP Top];MGMT;EOP;AllCuxxm3;GROSS;0
GC14A13;[ICP Top];MGMT;EOP;AllCuxxm3;GROSS;0
;;;;
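The step above relies on every set having exactly seven header lines in the same order. If that cannot be guaranteed, a variant that assigns each header by its name may be more tolerant. The following is only a minimal sketch under stated assumptions: the file path, the data set name `want2`, and the 200-character line buffer are hypothetical, and it assumes no data field is empty (by default `scan` treats consecutive delimiters as one).

```sas
data want2;
  /* hypothetical path, replace with the real file location */
  infile "C:\temp\flatfile.txt" truncover;
  length scenario year period view entity value pdr $12
         v1-v6 $32 line $200;
  /* header values carry forward to the data lines that follow them */
  retain scenario year period view entity value pdr;

  input line $char200.;

  /* skip the blank separator lines */
  if missing(line) then delete;

  /* header line of the form !Name=Value: store it by name, write nothing */
  if char(line, 1) = "!" then do;
    select (upcase(scan(line, 1, "!=")));
      when ("SCENARIO") scenario = scan(line, 2, "=");
      when ("YEAR")     year     = scan(line, 2, "=");
      when ("PERIOD")   period   = scan(line, 2, "=");
      when ("VIEW")     view     = scan(line, 2, "=");
      when ("ENTITY")   entity   = scan(line, 2, "=");
      when ("VALUE")    value    = scan(line, 2, "=");
      when ("PDR")      pdr      = scan(line, 2, "=");
      otherwise;  /* ignore header names not listed here */
    end;
    delete;
  end;

  /* data line: seven semicolon-separated fields, the last one numeric */
  v1 = scan(line, 1, ";");
  v2 = scan(line, 2, ";");
  v3 = scan(line, 3, ";");
  v4 = scan(line, 4, ";");
  v5 = scan(line, 5, ";");
  v6 = scan(line, 6, ";");
  v7 = input(scan(line, 7, ";"), best32.);

  drop line;
run;
```

Because header lines are matched by name, a set with a missing or reordered header line would still load; a header that is absent simply keeps the value retained from the previous set, which may or may not be what is wanted.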
Here is my try. What you can do is fix how many lines in your flat file carry the VAR1 to VAR7 information and, based on that, create as many variables as required while reading the flat file.
Then manipulate the data once it has been read into SAS.
The following code, which I have prepared, might be helpful:
data mydata;
infile "filelocation\test.txt" TRUNCOVER;
input scenario $16./
year $10./
period $11./
view $9. /
entity $13./
value $16./
pdr $7. /
var1 $50. /
var2 $50. /
var3 $50.;
run;
proc contents data = mydata noprint out = list(keep = name);
run;
proc sql noprint;
select count(*) into :max_val
from list
where name ? "var";
quit;
%macro trial;
data mydata1(where = (var5 NE " "));
set mydata;
scenario = scan(scenario,2,"=");
year = scan(year,2,"=");
period = scan(period,2,"=");
view = scan(view,2,"=");
entity = scan(entity,2,"=");
value = scan(value,2,"=");
pdr = scan(pdr,2,"=");
var1_new = scan(var1,1,";");
var2_new = scan(var1,2,";");
var3 = scan(var1,3,";");
var4 = scan(var1,4,";");
var5 = scan(var1,5,";");
var6 = scan(var1,6,";");
var7 = scan(var1,7,";");
output;
do until(&max_val.);
scenario = scenario;
year = year;
period = period;
view = view;
entity = entity;
value = value;
pdr = pdr;
var1_new = var1_new ;
var2_new = var2_new;
var3 = var3;
var4 = var4;
%do i = 2 %to %eval(&max_val.-1);
var5 = scan(var&i.,5,";");
output;
%end;
var6 = var6;
var7 = var7;
output;
end;
run;
%mend;
options mprint;
%trial
-Urvish
I understand your approach, but the lines are not consistent. For some sets there may be a single line with Var1-Var7 values, but for others there may be more. We have to consider a dynamic approach to deal with this problem.
Thanks for your help.
Regards,
Suvi
Hi suvi107
Did you find a solution for the task you had?
Thanks Bruno. It worked perfectly for me. I just didn't have the time to come back to the site and acknowledge it. Thanks so much.
Suvi