BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jbscholten
Obsidian | Level 7

Hello,

 

I am trying to modify a dataset and change it's format. How do I add the appropriate line breaks described below?

 

Source data:

Trial Timestep Var1 Var2 Var3 Var4 Var5 Var6
1 0 2 2 2 3 4 4
1 1 2 2 2 3 3 4
1 2 2 2 3 3 4 4
1 3 3 3 3 4 4 5
1 4 3 3 3 4 4 5

 

This spans across 9 "varx" variables. For each trial, there are 601 (0-600) timesteps. After 600 timesteps trial is set to 2 and timestep is reset to 0.  

 

The data needs to be put in the following format (yes, I know the values for Var1-5 don't match here, no problem)

 

0001                  
1116 1 1 1 1 2 2 2 3 3
1216 1 1 1 2 2 2 2 3 3
0117 1 1 1 2 2 2 3 3 3
.                  
1166                  
END                  
0002                  
1116 1 1 2 2 3 3 3 4 5
1216 1 1 2 2 3 3 4 4 5
0117 1 1 2 2 3 3 4 4

5

 

The "0001" represents the first trial, below which is a date (mmyy) that goes through all 600 timesteps, at which point you see the "END" and "0002", and then repeats. 1116 would represent timestep 0, 1216 represents timestep 1, 0117 represents timestep 2, etc.

 

I've created a new indicator column which will represent the left most column in my second data set. I plan to add the  "0001" at the top after the rest of the "0xxx" are successfully generated. Currently, I'm able to add the "END" in the correct places. I'm able to place a "0002" in properly as well but a 0002 is being placed after ALL of the "END"s, rather than being increased monotonically. Heres the code for END and "0xxx" insertion.  I only have j running from 2 to 4 now as I'm testing with a subset.

 

data want;
set have;
output;
if timestep = 600;
  array allnums {*} _numeric_ ;
     array allchar {*} _character_ ;
     drop i;
     do i=1 to dim(allnums); allnums{i}=.; end;
     do i=1 to dim(allchar); allchar{i}='END'; end;
     output;  /* Output blank observation */
run;




data want1;
set want;
output;
do j = 2 to 4 by 1 while(indicator = 'END');
  		array allnums {*} _numeric_ ;
  		array allchar {*} _character_ ;
     	drop i;
	     	do i=1 to dim(allnums); allnums{i}=.; end;
    		do i=1 to dim(allchar); allchar{i}= "&j."; end;
	output;  /* Output blank observation */ 
j=j+1;
end;
run;

Any help would be greatly appreciated. Working on SAS EG5.1

1 ACCEPTED SOLUTION

Accepted Solutions
jbscholten
Obsidian | Level 7

Sure thing! Totally revamped the previous code

 

%Let YYMM = %sysfunc(mdy(11,01,2016)); *Start Date;
 
/*Add LineBreak*/
Data Work.LineBreak;
     length Indicator $20.;
     Set Have;
     By Trial;

     If First.Trial then do;
           Indicator = cats('Scenario',Trial);
           Output;
           Indicator = put(&YYMM,mmyyn4.);
           Output;
     End; 
	
     Else if not First.Trial and not Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
     End; 
     Else if Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
           Indicator = 'END';
           Output;
     End;
Run;

/*Clear LineBreaks*/
Data Work.ClearBreak(drop=Trial Timestep);
     Set Work.LineBreak;

     Array AllNumVars _numeric_;

     If substr(Indicator,1,8) = 'Scenario' or Indicator = 'END' then do;
           Do over AllNumVars;
                AllNumVars = .;
           End;
     End;

	 if substr(Indicator,1,8) = 'Scenario' then do;
		   indicator = substr(Indicator,9,4);
	 End;
	
	 if Indicator not = 'END' then do;
	 	if length(Indicator) lt 4 then Indicator = cats(repeat('0',4-1-length(Indicator)),Indicator);
     end;
run;

View solution in original post

5 REPLIES 5
TomKari
Onyx | Level 15

Is this the sort of thing you're after?

 

data have;

input Trial Timestep Var1 Var2 Var3 Var4 Var5 Var6;

cards;

1 0 2 2 2 3 4 4

1 1 2 2 2 3 3 4

1 2 2 2 3 3 4 4

1 3 3 3 3 4 4 5

1 4 3 3 3 4 4 5

2 0 2 2 2 3 4 4

2 1 2 2 2 3 3 4

2 2 2 2 3 3 4 4

2 3 3 3 3 4 4 5

2 4 3 3 3 4 4 5

3 0 2 2 2 3 4 4

3 1 2 2 2 3 3 4

3 2 2 2 3 3 4 4

3 3 3 3 3 4 4 5

3 4 3 3 3 4 4 5

run;

proc format;

value tsf

0 = "1116"

1 = "1216"

2 = "0117"

3 = "0217"

4 = "1166"

;

run;

data want;

length CharTrial $4;

set have;

by Trial;

drop Trial TimeStep Var1-Var6;

if first.Trial then do;

call missing(Out1, Out2, Out3, Out4, Out5, Out6);

CharTrial = put(Trial, z4.);

output;

end;

CharTrial = put(TimeStep, tsf.);

Out1 = Var1; Out2 = Var2; Out3 = Var3; Out4 = Var4; Out5 = Var5; Out6 = Var6;

output;

if last.Trial

then do;

CharTrial = "END";

call missing(Out1, Out2, Out3, Out4, Out5, Out6);

output;

end;

run;

jbscholten
Obsidian | Level 7

Not sure we're on the same page, sorry if I wasn't clear. Taking the data you gave me and added a couple rows to show that timestep goes to 600;

Trial Timestep Var1 Var2 Var3 Var4 Var5 Var6
1 0 2 2 2 3 4 4
1 1 2 2 2 3 3 4
1 2 2 2 3 3 4 4
1 3 3 3 3 4 4 5
1 4 3 3 3 4 4 5
1
1
1 600 4 4 5 5 5 6
2 0 2 2 2 3 4 4
2 1 2 2 2 3 3 4
2 2 2 2 3 3 4 4
2 3 3 3 3 4 4 5
2 4 3 3 3 4 4 5
2
2
2 600 5 6 6 6 7 7
3 0 2 2 2 3 4 4
3 1 2 2 2 3 3 4
3 2 2 2 3 3 4 4
3 3 3 3 3 4 4 5
3 4 3 3 3 4 4 5

 

 

In the current case, 

0 = "1116"
1 = "1216"
2 = "0117"
3 = "0217"
4 = "1166"

is correct, but it needs to be flexible so that I can set some variable, e.g. "CurrentDate", to 1116, 0219, or any date in mmyy form and It will generate the appropriate date, increasing monthly, by timestep. Ideally, the program would also be flexible to take different values for number of trials and number of timesteps per trial. Final format for the above data would be:

 

0001            
1116 2 2 2 3 4 4
1216 2 2 2 3 3 4
0117 2 2 3 3 4 4
0217 3 3 3 4 4 5
0317 3 3 3 4 4 5
1166 4 4 5 5 5 6
END            
0002            
1116 2 2 2 3 4 4
1216 2 2 2 3 3 4
0117 2 2 3 3 4 4
0217 3 3 3 4 4 5
0317 3 3 3 4 4 5
1166 5 6 6 6 7 7
END            
0003            
1116 2 2 2 3 4 4
1216 2 2 2 3 3 4
0117 2 2 3 3 4 4
0217 3 3 3 4 4 5
0317 3 3 3 4 4 5
1166
END            

 

Thanks

jbscholten
Obsidian | Level 7

Solved, thanks.

AnnaBrown
Community Manager

Great news that you solved the problem, jbscholten! Can you share your solution for other community members who may run into the same issue?

 

Thanks!

Anna

jbscholten
Obsidian | Level 7

Sure thing! Totally revamped the previous code

 

%Let YYMM = %sysfunc(mdy(11,01,2016)); *Start Date;
 
/*Add LineBreak*/
Data Work.LineBreak;
     length Indicator $20.;
     Set Have;
     By Trial;

     If First.Trial then do;
           Indicator = cats('Scenario',Trial);
           Output;
           Indicator = put(&YYMM,mmyyn4.);
           Output;
     End; 
	
     Else if not First.Trial and not Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
     End; 
     Else if Last.Trial then do;
           Indicator = put(intnx('month',&YYMM,Timestep),mmyyn4.);
           Output;
           Indicator = 'END';
           Output;
     End;
Run;

/*Clear LineBreaks*/
Data Work.ClearBreak(drop=Trial Timestep);
     Set Work.LineBreak;

     Array AllNumVars _numeric_;

     If substr(Indicator,1,8) = 'Scenario' or Indicator = 'END' then do;
           Do over AllNumVars;
                AllNumVars = .;
           End;
     End;

	 if substr(Indicator,1,8) = 'Scenario' then do;
		   indicator = substr(Indicator,9,4);
	 End;
	
	 if Indicator not = 'END' then do;
	 	if length(Indicator) lt 4 then Indicator = cats(repeat('0',4-1-length(Indicator)),Indicator);
     end;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 2251 views
  • 1 like
  • 3 in conversation