BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mic35
Obsidian | Level 7

My aim is to split multiple values from rows to multiple vars (create a new var for each same value)

I have read various similar topics but all inclusive.

 

Each rows contain between 1 to X information separate by a comma.
For each variable I have to process, I have around 15 items plus the text typed by the users (free fields of a survey).

 

Currently I get  information by column but I didn't succeed to group each same value by column (see eg). I also tried to recode variables but, I have a lost of data...

 

As I have several hundred lines and several variables with so much information to process, would someone please advise me?

 

Eg:

 What I have

Id   Varname

1   Would like to get TV, Would like to view message

2   Would like to speak

3   Would like to speak, Would like to get TV

An idea of what I want

Id   Varname                                                                       TV         SPEAK       MSG

1   Would like to get TV, Would like to view message         1               0                1

2   Would like to speak                                                        0               1                0

3   Would like to speak, Would like to get TV                      1               1                0

Best

 

Mic

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

It will be hard to guaranty extraction all required words.

 

To get the last word of a string you can use scan function:  

 wanted = scan(<string>,-1);

In other case you want to extract the following word coming after "prefer" ?!    

if findw(<string>,'prefer') > 0 then
   wanted = scan(substr(<string>,findw(<string>,'prefer')),1);

You need check your data and gather rules to define which words to look for.
The program and its output should be checked often to assure optimal result.

 

Even after having the rules, you still need:

1) To check whether the extracted words are the expected ones and all required

2) I suggest to start with transposing the data  so each observation contains:

     ID, STRING (one string only per observation - assigned the maximum expected length).

 

Alternative way will be to scan the data and create a list of required words, then it is possible to check each word of the string if it is in the list either by array method or by a hashing method.

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

Seems simple enough.

data have ;
  input id varname $60. ;
cards;
1   Would like to get TV, Would like to view message
2   Would like to speak
3   Would like to speak, Would like to get TV
;

data want ;
  set have ;
  TV = 0 ne findw(varname,'Would like to get TV',',','sit');
  SPEAK = 0 ne findw(varname,'Would like to speak',',','sit');
  MSG = 0 ne findw(varname,'Would like to view message',',','sit');
run;

proc print;
run;
Obs    id    varname                                             TV    SPEAK    MSG

 1      1    Would like to get TV, Would like to view message     1      0       1
 2      2    Would like to speak                                  0      1       0
 3      3    Would like to speak, Would like to get TV            1      1       0


Mic35
Obsidian | Level 7

Yes but how to be sure I do not forget the elements written in free text ? and how can I get them in easy way ?

Shmuel
Garnet | Level 18

Do you know ahead the values you want to save as variables or you want to gather them dynamically according to input text? 

Are there rules which values to extract into the variables?

Mic35
Obsidian | Level 7

In the database I must analyze it is a mixture of known and unknown data.
For the known data Tom's method is perfect (and simple 🙂 )but for the free-text data it's more complicated.
In theory this free text is located at the end of the character string but not checked on all the data.

 

for example my cell can be of the form

ID varname
1 Would like to get TV, Would like to view message
2 Would like to speak
3 Would like to speak, Would like to get TV, I prefer ... but ....

 

I prefer ... but... = free text added by the participant

 

Shmuel
Garnet | Level 18

It will be hard to guaranty extraction all required words.

 

To get the last word of a string you can use scan function:  

 wanted = scan(<string>,-1);

In other case you want to extract the following word coming after "prefer" ?!    

if findw(<string>,'prefer') > 0 then
   wanted = scan(substr(<string>,findw(<string>,'prefer')),1);

You need check your data and gather rules to define which words to look for.
The program and its output should be checked often to assure optimal result.

 

Even after having the rules, you still need:

1) To check whether the extracted words are the expected ones and all required

2) I suggest to start with transposing the data  so each observation contains:

     ID, STRING (one string only per observation - assigned the maximum expected length).

 

Alternative way will be to scan the data and create a list of required words, then it is possible to check each word of the string if it is in the list either by array method or by a hashing method.

Mic35
Obsidian | Level 7

Thank you for your time and explanations.

 

By going through this method I seem to obtain the desired result.... remains more than just doing a last random check on all data.

In any case I thank you both for the advices on these SAS functionalities that I almost never use.
The ideal will always remain to participate in the design of the data collection in order to limit this kind of things.

 


data demo (keep=id varname wanted); set demo;
wanted = scan(varname,-1,","); run;

data demo ; set demo; 
  if   find(wanted, 'Would like to view message',"i") >0 OR
      find(wanted, 'Would like to speak',"i") >0  THEN wanted="";
run;

proc freq data=demo;
table varname wanted; run

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 855 views
  • 2 likes
  • 3 in conversation