BookmarkSubscribeRSS Feed
BrahmanandaRao
Lapis Lazuli | Level 10
data currency;
input ID Country $  ;
datalines;
1 USD
2 INR
3 USD,INR
4 USD,INR,EURU
;
run;
ID	Country
1	USD
2	INR
3	USD
3	INR
4	USD
4	INR
4	EURU
10 REPLIES 10
Kurt_Bremser
Super User
do i = 1 to countw(variable,',');
  newvar = scan(variable,i,',');
  output;
end;

Adapt variable names as needed. Add drop/keep statements as needed.

 

TEST your data step code before posting. Simple mistakes like a missing length statement make one look silly.

BrahmanandaRao
Lapis Lazuli | Level 10
not a proper output as above mention
BrahmanandaRao
Lapis Lazuli | Level 10
PLEASE SEE MY REQUIRIED OUTPUT
BrahmanandaRao
Lapis Lazuli | Level 10
data currency;
input ID Country $ 20.@ ;
datalines;
1 USD
2 INR
3 USD,INR
4 USD,INR,EURU
;
run;



data countw;
set currency;
do i = 1 to countw(Country,',');
newvar = scan(Country,i,',');
  output;
end;	
run;
Kurt_Bremser
Super User

@BrahmanandaRao wrote:
data currency;
input ID Country $ 20.@ ;
datalines;
1 USD
2 INR
3 USD,INR
4 USD,INR,EURU
;
run;



data countw;
set currency;
do i = 1 to countw(Country,',');
newvar = scan(Country,i,',');
  output;
end;	
run;

Running it with PROC PRINT, here's the log:

 73         data currency;
 74         input ID Country $ 20.@ ;
 75         datalines;
 
 NOTE: The data set WORK.CURRENCY has 4 observations and 2 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.00 seconds
       
 
 80         ;
 81         
 82         data countw;
 83         set currency;
 84         do i = 1 to countw(Country,',');
 85         newvar = scan(Country,i,',');
 86           output;
 87         end;
 88         run;
 
 NOTE: There were 4 observations read from the data set WORK.CURRENCY.
 NOTE: The data set WORK.COUNTW has 7 observations and 4 variables.
 NOTE:  Verwendet wurde: DATA statement - (Gesamtverarbeitungszeit):
       real time           0.00 seconds
       cpu time            0.00 seconds
       
 
 89         
 90         proc print data=countw noobs;
 91         run;
 
 NOTE: There were 7 observations read from the data set WORK.COUNTW.
 NOTE:  Verwendet wurde: PROZEDUR PRINT - (Gesamtverarbeitungszeit):
       real time           0.03 seconds
       cpu time            0.03 seconds

Result:

D	Country	i	newvar
1	USD	1	USD
2	INR	1	INR
3	USD,INR	1	USD
3	USD,INR	2	INR
4	USD,INR,EURU	1	USD
4	USD,INR,EURU	2	INR
4	USD,INR,EURU	3	EURU

(some columns shifted because of tab width here on the communities)

newvar has the split values as requested.

If you want to have it as "country", use DROP and RENAME statements, or the respective dataset options.

Tom
Super User Tom
Super User

@BrahmanandaRao wrote:
data currency;
input ID Country $  ;
datalines;
1 USD
2 INR
3 USD,INR
4 USD,INR,EURU
;
run;
ID	Country
1	USD
2	INR
3	USD
3	INR
4	USD
4	INR
4	EURU

Is there a question here?  

Do you want to read multiple values from the same line of text?  That is change your data step?

Do you want to convert an existing variable with multiple comma delimited values in one variable into multi observations?

BrahmanandaRao
Lapis Lazuli | Level 10
Please give solutiion code
Tom
Super User Tom
Super User

@BrahmanandaRao wrote:
Please give solutiion code

Ok.  Since you can't ask the question let me propose a couple of possible questions and post the answers to them.

Question Version 1: You have a text file with two fields separated by a space. The first is a number and the second is a list of words with comma as the delimiter.   So something like this:

 

1 USD
2 INR
3 USD,INR
4 USD,INR,EURU

Let's refer to this file using the filref TEXT.  You could read that using a data step like this:

data currency;
  dlm=' ';
  infile text dlm=dlm truncover;
  input ID @ ;
  dlm=',';
  do while(1=1);
    input Country :$8. @  ;
    if country=' ' then leave;
    else output;
  end;
run;

Basically it picks of the ID value using  space as the delimiter and then converts to using comma as the delimiter and reads values of COUNTRY until there are no more values on the line writing one observation for each non-blank value of COUNTRY.

 

 

Question Version 2:  You have a SAS dataset with two variables one named ID and second named STRING.  The variable string has comma delimited values of COUNTRY.  You want to generate a dataset that splits the values into multiple observations.  Let's assume the existing dataset is named HAVE.  

 

data currency;
  set have ;
  length country $8 ;
  country = scan(string,1,',');
  do index=2 by 1 until(country=' ');
    output;
    country=scan(string,index,',');
  end;
  drop index string;
run;

Note you could create the dataset HAVE from the file TEXT using a data step like this:

data have; 
  infile text truncover ;
  input id string $80. ;
run;

Question Version 3: This one is really the same as version 2 only the string variable is already named COUNTRY.  So you need to do some renaming to allow the new variable have the name of COUNTRY.

data currency;
  set have (rename=(country=string));
  length country $8 ;
  country = scan(string,1,',');
  do index=2 by 1 until(country=' ');
    output;
    country=scan(string,index,',');
  end;
  drop index string;
run;

 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1971 views
  • 0 likes
  • 3 in conversation