BookmarkSubscribeRSS Feed
pdhokriya
Pyrite | Level 9

Hi Reader,

 

Here is the data and code, facing issue while handelling semicolon and length , not getting correct output which is mentioned below in table.

 

Also I am not getting any error/warning in my program.

 

Here, if length LE 10 then i want to consider semicolon (like col1-1st obs) as it is else ignore semicolon (like col1-2nd obs).

 

Output required- 

 

OLD COL COL1 COL2
abcd;xyzxybav;charged; abcd; xyzxybav; charged;
ab;countrynew;zcc; ab; countrynew zcc;
ab;charged; ab;charged    
ab;cd;dr; ab;cd;dr;    
ab ab    

 

 

data abc ;
old = "abcd;xyzxybav;charged;"; output;
old = "ab;countrynew;zcc;"; output;
old = "ab;charged;"; output;
old = "ab;cd;dr;"; output;
old = "ab"; output;
run;

%let w = 10 ;                                                                                             
     
data have  (drop = _:);
 set abc;                                                                                                                                                                                                                                               
 array c $ &w col col1-col2 ;
/*if old^="" and length(strip(old)) le &w then col=old; */
_i_ = 1 ;
do _x = 1 to countw (old, "; ") ;
c = catx ("; ", c, scan (old, _x, "; ")) ;
if length (c) <= &w and length (c) + length (scan (old, _x + 1)) > &w or _x = countw (old, "; ") then _i_ + 1 ;
end ;
run ;

 

18 REPLIES 18
PaigeMiller
Diamond | Level 26

Are you getting ERRORs or WARNINGs in the log? If so, please show us the log (paste it into the window that appears when you click on the </> icon)

--
Paige Miller
pdhokriya
Pyrite | Level 9

No error in pgrogram, just program need updation as per requirement.

ballardw
Super User

You might bother to actually state what is "wrong" in the output.

If the issue is the presence or lack of a semicolon then you really in a lot of detail need to describe why "contrynew" in col1 does not have semicolon at the end but "xyzxybav;" does.

Same with why "ab;charged;" and "ab;cd;dr;" are not split up but other values are.

 

Providing code that does not do what you want does not explain what the rules involved are.

pdhokriya
Pyrite | Level 9

please see code.

PaigeMiller
Diamond | Level 26

@pdhokriya wrote:

please see code.


Your code has col1-col9 but your sample data does not have col1-col9 so we can't really use your code or your data as it is.

 

I agree with @ballardw , please explain what is wrong instead of assuming that we can figure out what is wrong.

--
Paige Miller
pdhokriya
Pyrite | Level 9

I have updated col1 -col2. please check the fyr.

 

also the code which i have given is not working correctly as i am not getting correct output (mentioned in table) so what shall i change in provided code so that will get correct ( table ) output.

ballardw
Super User

@pdhokriya wrote:

please see code.


I am not a mind reader.

If the code does not do what you want then there is no way the code can tell me what you want or expect.

 

 

pdhokriya
Pyrite | Level 9

ok, please ignore my code, and please refer this-

data abc ;
old = "abcd;xyzxybav;charged;"; output;
old = "ab;countrynew;zcc;"; output;
old = "ab;charged;"; output;
old = "ab;cd;dr;"; output;
old = "ab"; output;
run;

 and let me know how shall i get an output data.

Kurt_Bremser
Super User

This creates your intended result:

data abc ;
old = "abcd;xyzxybav;charged;"; output;
old = "ab;countrynew;zcc;"; output;
old = "ab;charged;"; output;
old = "ab;cd;dr;"; output;
old = "ab"; output;
run;

%let w = 10;

data want;
set abc;
length col col1 col2 $&w.;
if length (old) gt &w. + 1
then do;
  col = scan(old,1,";") !! ";";
  col1 = scan(old,2,";") !! ";";
  col2 = scan(old,3,";") !! ";";
end;
else col = old;
run;
pdhokriya
Pyrite | Level 9
Ya You understood correctly, but may be something is missing.
and Yes I can elaborate more on this , for example , if i take below mentioned data and &w = 200 then

data abc ;
old = "ASCORBIC ACID;CALCIUM CARBONATE;CALCIUM PHOSPHATE;CHROMIUM;CYANOCOBALAMIN;FOLIC ACID;MAGNESIUM CARBONATE;MAGNESIUM HYDROXIDE;MANGANESE;NICOTINIC ACID;PANTOTHENIC ACID;POTASSIUM BICARBONATE;POTASSIUM CARBONATE;POTASSIUM PHOSPHATE DIBASIC;PYRIDOXINE;QUERCETIN;RIBOFLAVIN;SODIUM BICARBONATE;SODIUM PHOSPHATE;THIAMINE;THIOCTIC ACID;ZINC ASCORBATE"; output;

old = "ASCORBIC ACID;BIOTIN;CALCIUM;CALCIUM PHOSPHATE;CHROMIUM;CUPRIC OXIDE;CYANOCOBALAMIN;FOLIC ACID;IODINE;IRON;MAGNESIUM OXIDE;MANGANESE;MOLYBDENUM;NICKEL;NICOTINAMIDE;PANTOTHENIC ACID;PHYTOMENADIONE;POTASSIUM;PYRIDOXINE;RETINOL;RIBOFLAVIN;SELENIUM;SODIUM METASILICATE;THIAMINE;TOCOPHEROL;VITAMIN D NOS;ZINC"; output;

old = "ab;countrynew;zcc;"; output;
old = "ab;charged;"; output;
old = "ab;cd;dr;"; output;
old = "ab"; output;
run;


output would be :

example : obs = 1

col = "ASCORBIC ACID;CALCIUM CARBONATE;CALCIUM PHOSPHATE;CHROMIUM;CYANOCOBALAMIN;FOLIC ACID;MAGNESIUM CARBONATE;MAGNESIUM HYDROXIDE;MANGANESE;NICOTINIC ACID;PANTOTHENIC ACID;POTASSIUM BICARBONATE;"
col1 = "POTASSIUM CARBONATE;POTASSIUM PHOSPHATE DIBASIC;PYRIDOXINE;QUERCETIN;RIBOFLAVIN;SODIUM BICARBONATE;SODIUM PHOSPHATE;THIAMINE;THIOCTIC ACID;ZINC ASCORBATE"

example : obs = 2

col= "ASCORBIC ACID;BIOTIN;CALCIUM;CALCIUM PHOSPHATE;CHROMIUM;CUPRIC OXIDE;CYANOCOBALAMIN;FOLIC ACID;IODINE;IRON;MAGNESIUM OXIDE;MANGANESE;MOLYBDENUM;NICKEL;NICOTINAMIDE;PANTOTHENIC ACID;PHYTOMENADIONE;"
col1= "POTASSIUM;PYRIDOXINE;RETINOL;RIBOFLAVIN;SELENIUM;SODIUM METASILICATE;THIAMINE;TOCOPHEROL;VITAMIN D NOS;ZINC"


Tom
Super User Tom
Super User

You are still missing the description of what you are trying to do.

Let's make another guess.  You have a really long string of words delimited by semi-colon and you want to break it into smaller strings, but without splitting individual words.  So essentially you are doing word wrapping like a word processor would to flow a paragraph onto a page.

So if the original string has a maximum length of 1000 and you want the new strings to each have a maximum length of 200 then you will probably need between a maximum of 5 to 10 of them to hold all of the possible words. (What would happen if every word was exactly 100 characters long?)

data want ;
  set abc ;
  array col [10] $200 ;
  target=1;
  do source=1 to countw(old,';');
     if col[target]=' ' then col[target]=scan(old,source,';');
     else if 200 < length(catx(';',col[target],scan(old,source,';'))) then do;
       target=target+1;
       col[target]=scan(old,source,';');
     end;
     else col[target]=catx(';',col[target],scan(old,source,';'));
  end;
run;

Or more simply generate the TALL structure instead. You can always add in a PROC TRANSPOSE step to convert it back into one observation with multiple columns.

data tall ;
  row+1;
  set abc ;
  length col 8 new $200;
  col=1;
  do source=1 to countw(old,';');
    if source>1 and vlength(new) < length(catx(';',new,scan(old,source,';'))) then do;
       output;
       col=col+1;
       new=' ';
    end;
    new=catx(';',new,scan(old,source,';'));
  end;
  if col=1 or new ne ' ' then output;
  drop source old;
run;

proc transpose data=tall out=want(drop=_name_) prefix=col;
  by row;
  id col;
  var new;
run;
pdhokriya
Pyrite | Level 9

Yes You are correct, 

 

Just 1 correction needed i.e. we cant ignore semicolon at the end which is a part of raw data. i.e.

 

col: ASCORBIC ACID;CALCIUM CARBONATE;CALCIUM PHOSPHATE;CHROMIUM;CYANOCOBALAMIN;FOLIC ACID;MAGNESIUM CARBONATE;MAGNESIUM HYDROXIDE;MANGANESE;NICOTINIC ACID;PANTOTHENIC ACID;POTASSIUM BICARBONATE;

 

col: ASCORBIC ACID;BIOTIN;CALCIUM;CALCIUM PHOSPHATE;CHROMIUM;CUPRIC OXIDE;CYANOCOBALAMIN;FOLIC ACID;IODINE;IRON;MAGNESIUM OXIDE;MANGANESE;MOLYBDENUM;NICKEL;NICOTINAMIDE;PANTOTHENIC ACID;PHYTOMENADIONE;

pdhokriya
Pyrite | Level 9


Raw data is impacting in this code :

data want ;
set abc ;
array col [10] $200 ;
target=1;
do source=1 to countw(old,';');
if col[target]=' ' then col[target]=scan(old,source,';');
else if 200 < length(catx(';',col[target],scan(old,source,';'))) then do;
target=target+1;
col[target]=scan(old,source,';');
end;
else col[target]=catx(';',col[target],scan(old,source,';'));
end;
run;


@Tom: Could you please help me with this.
Tom
Super User Tom
Super User

Do you want to append the superfluous semi-colon to the end of every split value?  Or just the final one?

Either way just use the CATS() function to add it back at the appropriate point in the logic. 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 1211 views
  • 1 like
  • 5 in conversation