BookmarkSubscribeRSS Feed
knveraraju91
Barite | Level 11
Dear, I am having issue with the following obs in data one. The actual data is data one from an excel file. The code worked for all values including this. But it also created a blank obs for this value which I am not able to delete with code "if term='' then delete". By running the code below i am getting two OBS in data four. One contains all the characters of the value with length=200 and second with blank value with length 2. I am not able to delete this blank obs. Please help how to delete this. Thank you

data one; input term $ 628.; datalines; Subject 00000000 baseline labs were collected on Day -0, 00AAA0000, instead of on Day -0 per protocol. The baseline labs should not have been drawn as the baseline visit was < 00 days from screening. ; data two; set one; n+1; do i=1 to countw(term,''); temp=scan(term,i,' ');len=length(temp)+1;output; end; run;
data three; set two; by n; retain sum; if first.n then sum=0; if last.n then len=len-1; sum+len; if sum gt 200 then do;group+1;sum=len;end; run;
data four; length dterm $ 200; do until(last.group); set dv2; by n group; dterm=catx(' ',dterm,temp); end; run;
7 REPLIES 7
art297
Opal | Level 21

Not sure what you're trying to do. First, when I ran your initial data step, no data was read. It worked a lot better with:

 

data one;
  informat term $628.;
input term &;
datalines;
Subject 00000000 baseline labs were collected on Day -0, 00AAA0000, instead of on Day -0 per protocol. The baseline labs should not have been drawn as the baseline visit was < 00 days from screening. 
;

 

Art, CEO, AnalystFinder.com

 

ballardw
Super User

Can you show where you attempted: "I am not able to delete with code "if term='' then delete". "

 

Since that does not appear in your code.

 

You are also referencing a data set, DV2, in data four with no description of the contents to us. And since data four also does not reference any of data one, two or three it is real hard to figure out why it was included in your post.

 

Running your code, with data one modified as per @art297 so there is something in it, I do not see blank observations in any of one, two or three.

 

Is the whole purpose of this excersize to break a long string into strings of 200 characters or less breaking at a word boundary?

knveraraju91
Barite | Level 11

Dear,

The  term variable value "Subject 00000000 baseline labs were collected on Day -0, 00AAA0000, instead of on Day -0 per protocol. The baseline labs should not have been drawn as the baseline visit was < 00 days from screening."

is in my data under a variable name term with length 628 from excel file.  I ran the code  below. I suppose to get only one obs for the value with my code as the length of the value is under 200. 

But I am getting two obs with one  all the characters of the value with length=200 and another with blank value and length=2.

 

I am trying to remove the blank value . But I could not with "if dterm='' then delete". Thank you for your time

 

 

 

data two;
set one;
	n+1;
 do i=1 to countw(term,'');
  temp=scan(term,i,' ');len=length(temp)+1;output;
 end;
	run;

	data three;
 set two;
 by n;
 retain sum;
 if first.n then sum=0;
 if last.n then len=len-1;
 sum+len;
 if sum gt 200 then do;group+1;sum=len;end;
run;

data four;
length dterm $ 200;
do until(last.group);
 set three;
 by n group;
 dterm=catx(' ',dterm,temp);
end;
if dterm='' then delete; run;
art297
Opal | Level 21

Where is the blank value you are trying to remove? Show us by stating what comes immediately before and after it.

 

Art, CEO, AnalystFinder.com

 

knveraraju91
Barite | Level 11

Thank you very much for your support. I am attaching a screen shot of my actual data for the values. 

 

The data is read from an excel file. 

The variable length (term) is 628. But when I used the code length=length(term). The value is 202 for this obs.

 

I ran the code in previous post which gave two obs with one containg all the characters of the value with length=199(length=length(dterm)). Length of dvterm set at 200.

The other with blank value with length =2(length=length(dterm)). 

art297
Opal | Level 21

Still not sure what you are trying to either keep or delete. If you just want to find the values you marked in your doc, then the following might help:

 

data one (keep=term val:);
  informat term $628.;
  input term &;
  loc=anydigit(term);
  value1=scan(substr(term,loc),1,', .');
  len=length(value1);
  temp=substr(term,loc+len);
  loc=indexc(temp,'<>-0123456789');
  value2=scan(substr(temp,loc),1,', .');
  len=length(value2);
  temp=substr(temp,loc+len);
  loc=indexc(temp,'<>-0123456789');
  value3=scan(substr(temp,loc),1,', .');
  len=length(value3);
  temp=substr(temp,loc+len);
  loc=indexc(temp,'<>-0123456789');
  value4=scan(substr(temp,loc),1,', .');
  if value4 in ('<','>') then value4=
   catt(value4,scan(substr(temp,loc),2,', .'));
  temp=substr(temp,loc+len);
  loc=indexc(temp,'<>-0123456789');
  value5=scan(substr(temp,loc),1,', .');
  if value5 in ('<','>') then value5=
   catt(value5,scan(substr(temp,loc),2,', .'));
datalines;
Subject 00000000 baseline labs were collected on Day -0, 00AAA0000, instead of on Day -0 per protocol. The baseline labs should not have been drawn as the baseline visit was < 00 days from screening. 
;

Art, CEO, AnalystFinder.com

 

knveraraju91
Barite | Level 11

Thanks for the help. I am attaching an excel sheet. I ran the followibg code. I got two OBS for varaible DTERM I need to remove the blank obs. Thank you

proc import datafile="C:\\Book1.xlsx" DBMS=xlsx out=DV;
                
run;
data two;
set dv;
	n+1;
 do i=1 to countw(term,'');
  temp=scan(term,i,' ');len=length(temp)+1;output;
 end;
	run;

	data three;
 set two;
 by n;
 retain sum;
 if first.n then sum=0;
 if last.n then len=len-1;
 sum+len;
 if sum gt 200 then do;group+1;sum=len;end;
run;

data four;
length dterm $ 200;
do until(last.group);
 set three;
 by n group;
 dterm=catx(' ',dterm,temp);
end;
if dterm='' then delete;
one=length(dterm);
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 988 views
  • 0 likes
  • 3 in conversation