SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Adding line breaks with values based on conditions

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

Adding line breaks with values based on conditions

Hello,

 

I am trying to modify a dataset and change it's format. How do I add the appropriate line breaks described below?

 

Source data:

Trial Timestep Var1 Var2 Var3 Var4 Var5 Var6
1 0 2 2 2 3 4 4
1 1 2 2 2 3 3 4
1 2 2 2 3 3 4 4
1 3 3 3 3 4 4 5
1 4 3 3 3 4 4 5

 

This spans across 9 "varx" variables. For each trial, there are 601 (0-600) timesteps. After 600 timesteps trial is set to 2 and timestep is reset to 0.  

 

The data needs to be put in the following format (yes, I know the values for Var1-5 don't match here, no problem)

 

0001                  
1116 1 1 1 1 2 2 2 3 3
1216 1 1 1 2 2 2 2 3 3
0117 1 1 1 2 2 2 3 3 3
.                  
1166                  
END                  
0002                  
1116 1 1 2 2 3 3 3 4 5
1216 1 1 2 2 3 3 4 4 5
0117 1 1 2 2 3 3 4 4

5

 

The "0001" represents the first trial, below which is a date (mmyy) that goes through all 600 timesteps, at which point you see the "END" and "0002", and then repeats. 1116 would represent timestep 0, 1216 represents timestep 1, 0117 represents timestep 2, etc.

 

I've created a new indicator column which will represent the left most column in my second data set. I plan to add the  "0001" at the top after the rest of the "0xxx" are successfully generated. Currently, I'm able to add the "END" in the correct places. I'm able to place a "0002" in properly as well but a 0002 is being placed after ALL of the "END"s, rather than being increased monotonically. Heres the code for END and "0xxx" insertion.  I only have j running from 2 to 4 now as I'm testing with a subset.

 

data want;
set have;
output;
if timestep = 600;
  array allnums {*} _numeric_ ;
     array allchar {*} _character_ ;
     drop i;
     do i=1 to dim(allnums); allnums{i}=.; end;
     do i=1 to dim(allchar); allchar{i}='END'; end;
     output;  /* Output blank observation */
run;




data want1;
set want;
output;
do j = 2 to 4 by 1 while(indicator = 'END');
  		array allnums {*} _numeric_ ;
  		array allchar {*} _character_ ;
     	drop i;
	     	do i=1 to dim(allnums); allnums{i}=.; end;
    		do i=1 to dim(allchar); allchar{i}= "&j."; end;
	output;  /* Output blank observation */ 
j=j+1;
end;
run;

Any help would be greatly appreciated. Working on SAS EG5.1


Accepted Solutions
Solution
‎02-08-2017 01:03 PM
Occasional Contributor
Posts: 13

Re: Adding line breaks with values based on conditions

Posted in reply to AnnaBrown

Sure thing! Totally revamped the previous code

 

%Let YYMM = %sysfunc(mdy(11,01,2016)); *Start Date;
 
/*Add LineBreak*/
Data Work.LineBreak;
     length Indicator $20.;
     Set Have;
     By Trial;

     If First.Trial then do;
           Indicator = cats('Scenario',Trial);
           Output;
           Indicator = put(&YYMM,mmyyn4.);
           Output;
     End; 
	
     Else if not First.Trial and not Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
     End; 
     Else if Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
           Indicator = 'END';
           Output;
     End;
Run;

/*Clear LineBreaks*/
Data Work.ClearBreak(drop=Trial Timestep);
     Set Work.LineBreak;

     Array AllNumVars _numeric_;

     If substr(Indicator,1,8) = 'Scenario' or Indicator = 'END' then do;
           Do over AllNumVars;
                AllNumVars = .;
           End;
     End;

	 if substr(Indicator,1,8) = 'Scenario' then do;
		   indicator = substr(Indicator,9,4);
	 End;
	
	 if Indicator not = 'END' then do;
	 	if length(Indicator) lt 4 then Indicator = cats(repeat('0',4-1-length(Indicator)),Indicator);
     end;
run;

View solution in original post


All Replies
PROC Star
Posts: 1,167

Re: Adding line breaks with values based on conditions

Posted in reply to jbscholten

Is this the sort of thing you're after?

 

data have;

input Trial Timestep Var1 Var2 Var3 Var4 Var5 Var6;

cards;

1 0 2 2 2 3 4 4

1 1 2 2 2 3 3 4

1 2 2 2 3 3 4 4

1 3 3 3 3 4 4 5

1 4 3 3 3 4 4 5

2 0 2 2 2 3 4 4

2 1 2 2 2 3 3 4

2 2 2 2 3 3 4 4

2 3 3 3 3 4 4 5

2 4 3 3 3 4 4 5

3 0 2 2 2 3 4 4

3 1 2 2 2 3 3 4

3 2 2 2 3 3 4 4

3 3 3 3 3 4 4 5

3 4 3 3 3 4 4 5

run;

proc format;

value tsf

0 = "1116"

1 = "1216"

2 = "0117"

3 = "0217"

4 = "1166"

;

run;

data want;

length CharTrial $4;

set have;

by Trial;

drop Trial TimeStep Var1-Var6;

if first.Trial then do;

call missing(Out1, Out2, Out3, Out4, Out5, Out6);

CharTrial = put(Trial, z4.);

output;

end;

CharTrial = put(TimeStep, tsf.);

Out1 = Var1; Out2 = Var2; Out3 = Var3; Out4 = Var4; Out5 = Var5; Out6 = Var6;

output;

if last.Trial

then do;

CharTrial = "END";

call missing(Out1, Out2, Out3, Out4, Out5, Out6);

output;

end;

run;

Occasional Contributor
Posts: 13

Re: Adding line breaks with values based on conditions

Not sure we're on the same page, sorry if I wasn't clear. Taking the data you gave me and added a couple rows to show that timestep goes to 600;

Trial Timestep Var1 Var2 Var3 Var4 Var5 Var6
1 0 2 2 2 3 4 4
1 1 2 2 2 3 3 4
1 2 2 2 3 3 4 4
1 3 3 3 3 4 4 5
1 4 3 3 3 4 4 5
1
1
1 600 4 4 5 5 5 6
2 0 2 2 2 3 4 4
2 1 2 2 2 3 3 4
2 2 2 2 3 3 4 4
2 3 3 3 3 4 4 5
2 4 3 3 3 4 4 5
2
2
2 600 5 6 6 6 7 7
3 0 2 2 2 3 4 4
3 1 2 2 2 3 3 4
3 2 2 2 3 3 4 4
3 3 3 3 3 4 4 5
3 4 3 3 3 4 4 5

 

 

In the current case, 

0 = "1116"
1 = "1216"
2 = "0117"
3 = "0217"
4 = "1166"

is correct, but it needs to be flexible so that I can set some variable, e.g. "CurrentDate", to 1116, 0219, or any date in mmyy form and It will generate the appropriate date, increasing monthly, by timestep. Ideally, the program would also be flexible to take different values for number of trials and number of timesteps per trial. Final format for the above data would be:

 

0001            
1116 2 2 2 3 4 4
1216 2 2 2 3 3 4
0117 2 2 3 3 4 4
0217 3 3 3 4 4 5
0317 3 3 3 4 4 5
1166 4 4 5 5 5 6
END            
0002            
1116 2 2 2 3 4 4
1216 2 2 2 3 3 4
0117 2 2 3 3 4 4
0217 3 3 3 4 4 5
0317 3 3 3 4 4 5
1166 5 6 6 6 7 7
END            
0003            
1116 2 2 2 3 4 4
1216 2 2 2 3 3 4
0117 2 2 3 3 4 4
0217 3 3 3 4 4 5
0317 3 3 3 4 4 5
1166
END            

 

Thanks

Occasional Contributor
Posts: 13

Re: Adding line breaks with values based on conditions

Solved, thanks.

Community Manager
Posts: 564

Re: Adding line breaks with values based on conditions

Posted in reply to jbscholten

Great news that you solved the problem, jbscholten! Can you share your solution for other community members who may run into the same issue?

 

Thanks!

Anna

Solution
‎02-08-2017 01:03 PM
Occasional Contributor
Posts: 13

Re: Adding line breaks with values based on conditions

Posted in reply to AnnaBrown

Sure thing! Totally revamped the previous code

 

%Let YYMM = %sysfunc(mdy(11,01,2016)); *Start Date;
 
/*Add LineBreak*/
Data Work.LineBreak;
     length Indicator $20.;
     Set Have;
     By Trial;

     If First.Trial then do;
           Indicator = cats('Scenario',Trial);
           Output;
           Indicator = put(&YYMM,mmyyn4.);
           Output;
     End; 
	
     Else if not First.Trial and not Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
     End; 
     Else if Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
           Indicator = 'END';
           Output;
     End;
Run;

/*Clear LineBreaks*/
Data Work.ClearBreak(drop=Trial Timestep);
     Set Work.LineBreak;

     Array AllNumVars _numeric_;

     If substr(Indicator,1,8) = 'Scenario' or Indicator = 'END' then do;
           Do over AllNumVars;
                AllNumVars = .;
           End;
     End;

	 if substr(Indicator,1,8) = 'Scenario' then do;
		   indicator = substr(Indicator,9,4);
	 End;
	
	 if Indicator not = 'END' then do;
	 	if length(Indicator) lt 4 then Indicator = cats(repeat('0',4-1-length(Indicator)),Indicator);
     end;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 251 views
  • 1 like
  • 3 in conversation