Dear all,
This is my first post here. I’ve been googling over the last hours for an answer but I cannot find it.
I have a table whose variables contain list of values separated by a comma. I would like to transform it into a total flat table. For example:
VAR1 VAR2 VAR3
r,f,e 3,5 S
j 2,4,6 D,J,N
should transform into:
VAR1 VAR2 VAR3
r 3 S
r 5 S
f 3 S
f 5 S
e 3 S
e 5 S
j 2 D
j 2 J
j 2 N
j 4 D
j 4 J
j 4 N
j 6 D
j 6 J
j 6 N
That may sound like a basic but I don't how to perform it!
Thanks a lot for your help.
ntro
Needed to reset the counters at the beginning of the loops, not change the output statement.
data want;
set have;
*initialize counters to 0;
i=1;
do while (scan(var1, i, ",") ne "");
new_var1=scan(var1, i, ",");
j=1;
do while (scan(var2, j, ",") ne "");
new_var2=scan(var2, j, ",");
k=1;
do while (scan(var3, k, ",") ne "");
new_var3=scan(var3, k, ",");
output;
k+1;
end;
j+1;
end;
i+1;
end;
run;
You're looking for the scan function with a do loop iteration and an explicit output statement.
This isn't tested/working code, you may have to play with the ends/counters to get it working, but hopefully gives you an idea.
data want;
set have;
*initialize counters to 0;
i=1; j=1; k=1;
do while (scan(var1, i, ",") ne "");
new_var1=scan(var1, i, ",");
do while (scan(var2, j, ",") ne "");
new_var2=scan(var2, j, ",");
do while (scan(var3, k, ",") ne "");
new_var3=scan(var3, k, ",");
output;
k+1;
end;
j+1;
end;
i+1;
end;
run;
Reeza,
Thank you very much for the prompt and useful answer. I have tested it and I think it is not far from the solution.
Here what I obtain:
var1 var2 var3 i j k new_var1 new_var2 new_var3
r,f,e 3,5 S 1 1 1 r 3 S
j 2,4,6 D,J,N 1 1 1 j 2 D
j 2,4,6 D,J,N 1 1 2 j 2 J
j 2,4,6 D,J,N 1 1 3 j 2 N
So the first record is treated only once (r 3 S) but then it jumps to last record, from which only 3 lines are produced.
It seems like only var3 is treated. Any idea why?
Thank again
Try adding output statements at the end of each loop, unfortunately I don't have time to test anything today.
Looks like the loops are just ending too early.
You could use iterative DO loop using COUNTW() command to set upper bound of loops.
...
do i=1 to countw(var1, ',' );
....
What do you want to happen when any of the three variables are missing?
Needed to reset the counters at the beginning of the loops, not change the output statement.
data want;
set have;
*initialize counters to 0;
i=1;
do while (scan(var1, i, ",") ne "");
new_var1=scan(var1, i, ",");
j=1;
do while (scan(var2, j, ",") ne "");
new_var2=scan(var2, j, ",");
k=1;
do while (scan(var3, k, ",") ne "");
new_var3=scan(var3, k, ",");
output;
k+1;
end;
j+1;
end;
i+1;
end;
run;
Dear all,
I really can't say how much I appreciate your help. I have used Reeza's code and it wroks perfectly.
Arthur, your code seems like a very nice and clever solution. However I have the following error (sorry it is in French):
98 +input (r,f,e 3,5 S ) ($);
-
22
76
ERREUR 22-322: Erreur de syntaxe ; syntaxe requise : [, {.
ERREUR 76-322: Erreur de syntaxe ; l'instruction sera ignorée.
It seems like the comma is not understood.
Thanks again
: Worked fine on my computer. Commas are among the default delimiters for both ASCII and EBCDIC but, since the main difference between Reeza's and my code was that she explicitly declared the comma as a modifier in her use of the scan function, it apparently is needed on your system.
Ah, but as I look at the error you received, I would guess that you really didn't have a first record that contained "var1 var2 var3", i.e., the variable names. The reason I was suggesting having SAS write and run a program was so that the variable names could be extracted from the first record. Since that appears not to be the case, Reesa's suggested code would definitely be the way to go.
Thank you Arthur.
I am using now Reeza's code but since my final table is 12M lines, it seems quicker to make x data steps rather than x loop in a single data step.
Have a nice week-end
One way to accomplish what I think you are trying to do is to use SAS to write, then run, a program that restructures the data. e.g.:
filename program temp;
data have;
informat x y z $10.;
infile cards end=eof;
input x y z;
cards;
VAR1 VAR2 VAR3
r,f,e 3,5 S
j 2,4,6 D,J,N
;
data _null_;
file program;
set have end=eof;
if _n_ eq 1 then do;
put "data want;";
put "input ("@;
put x y z @;
put ") ($);";
put "cards;";
end;
else do;
i=1;
do while (scan(x,i) ne "");
data1=scan(x,i);
i+1;
j=1;
do while (scan(y,j) ne "");
data2=scan(y,j);
j+1;
k=1;
do while (scan(z,k) ne "");
data3=scan(z,k);
put data1 data2 data3;
k+1;
end;
end;
end;
end;
if eof then put ";";
run;
%include program;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.