I have a data set that contains participant data for multiple months. The data for each month contains a string of data that I want to parse out into different variables. The data within the string changes from month to month.
ID Jan17 Feb17 Mar17
1 I - 2 - 0 - 056 - 0029 - 22 - 561 C - 4 - 0 - 056 - 0029 - 22 - 561 C - 4 - 1 - 056 - 0029 - 22 - 561
2 P - 4 - 23 - 056 - 0029 - 22 - 561 P - 4 - 23 - 056 - 0016 - 22 - 561 B - 1 - 23 - 056 - 0016 - 22 - 561
3 P - 4 - 33 - 007 - 0140 - 03 - 071 N - 4 - 33 - 007 - 0140 - 03 - 071 N - 4 - 33 - 007 - 0140 - 03 - 071
I wrote an array that will parse the string from a single month, but I can't figure a macro that will run for every month and create the output variables to include the month it is from.
array STRING(*) var1 var2 var3 var4 var5 var6 var7 _char_;
i=1;
do while(scan(Jan17, i, " - ") ne "");
STRING(i) =scan(Jan17, i, " - ");
i+1;
end;
Hopefully it would look something like this. I will have a ton of columns, but I can't think of a better way to do it.
ID Jan17var1 Jan17var2 Jan17var3 Jan17var4
1 I 2 0 056
2 P 4 23 056
3 P 4 33 007
Thanks!
I suggest transposing the dataset to make it long, then applying your code, and finally transposing it again to make it wide.
For the last step, I usually download and run the macro you can find at: https://communities.sas.com/t5/SAS-Communities-Library/A-better-way-to-FLIP-i-e-transpose-make-wide-...
Using that method, I think the following code will do what you want:
proc transpose data=have out=need name=month; by id; var jan17--mar17; run; data need (drop=COL1); set need; array STRING(*) $ var1-var7; date=input(month,monyy.); i=1; do while(scan(COL1, i, " - ") ne ""); STRING(i) =scan(COL1, i, " - "); i+1; end; output; run; %transpose(data=need,out=want,by=id,id=date,format=monyy5.,var=var:,var_first=no) /* or, if you want the months and variables separated by an underscore */ %transpose(data=need,out=want,by=id,id=date,format=monyy5.,var=var:,var_first=no,delimiter=_)
Art, CEO, AnalystFinder.com
Thoughts to consider ...
It would be better if the data had never been put together like that.
It would be better if you kept a separate observation for each ID/Month, instead of a single observation per ID. You would need to learn programming techniques to work with the data in that form but that is well worth doing.
SCAN is perfectly fine, but the ARRAY statement is suspect. It looks like you are trying to define the array variables as character by adding _char_ at the end. The right way to do that would be:
array STRING(*) $ var1 var2 var3 var4 var5 var6 var7;
I suggest transposing the dataset to make it long, then applying your code, and finally transposing it again to make it wide.
For the last step, I usually download and run the macro you can find at: https://communities.sas.com/t5/SAS-Communities-Library/A-better-way-to-FLIP-i-e-transpose-make-wide-...
Using that method, I think the following code will do what you want:
proc transpose data=have out=need name=month; by id; var jan17--mar17; run; data need (drop=COL1); set need; array STRING(*) $ var1-var7; date=input(month,monyy.); i=1; do while(scan(COL1, i, " - ") ne ""); STRING(i) =scan(COL1, i, " - "); i+1; end; output; run; %transpose(data=need,out=want,by=id,id=date,format=monyy5.,var=var:,var_first=no) /* or, if you want the months and variables separated by an underscore */ %transpose(data=need,out=want,by=id,id=date,format=monyy5.,var=var:,var_first=no,delimiter=_)
Art, CEO, AnalystFinder.com
Took some work, but this got me what I needed in the long run! Thanks!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.