- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I have an excel file with 3 columns. Say A,B and C. Now the C column as some values like PPA249394893,PPA3949343904, CPA239483984,DEF948594 etc.. Now i need to remove the value which as prefix PPA from Column C. How to do that in SAS EG ? i have imported the file as below. I need the records which i removed(PPA Prefix records) as well in separae dataset. Please help me to proceed further.
PROC IMPORT OUT =Test
datafile= "/Test/test_file.xlsx"
DBMS = XLSX
REPLACE;
GETNAMES = YES;
RUN;
Thanks
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
When you send the mail using filename email, simply do this in the data step:
data _null_;
filename emailout; /* or whatever file reference you used in filename email */
/* as one of the options in the infile statement, you can attach the want2 dataset, using its physical path */
put "&total_read observations were read.";
put "&total_output were accepted.";
put "&total_excluded were rejected.";
run;
As an alternative to attaching, you could store the want2 dataset in a library that is accessible to users, and simply add the complete dataset name as text to the email. That keeps mails small (to the pleasure of your mail administrators).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data want1 want2;
set test;
if substr(c,1,3) = 'PPA'
then output want2;
else output want1;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Abhi1212 wrote:
Thanks for your reply Kurt. Meanwhile, Control totals should also be captured including: Totals records read, total invalid records excluded, total records output. How that can be done?
Count during the step, and then save to macro variables for further use:
data want1 want2;
set test end=done;
retain
count_out 0
count_x 0
;
if substr(c,1,3) = 'PPA'
then do;
output want2;
count_ex + 1;
end;
else do;
output want1;
count_out + 1;
end;
if done
then do;
call symputx('total_read',put(_n_,best.));
call symputx('total_output',put(count_out,best.));
call symputx('total_excluded',put(count_ex,best.));
end;
drop count_out count_ex;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
ERROR 202-322: The option or parameter is not recognized and will be ignored.
I get this error
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please post the complete log (using the {i} icon to preserve formatting), so we can see to which code parts the messages point.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
________
22
202
ERROR 22-322: Syntax error, expecting one of the following: ;, CANCEL, PGM.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I guess you are still learning about the meaning of "complete".
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
AND USE THE {i} ICON TO POST CODE AND LOGS!!!
I hope I do not need to repeat myself once more.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry Kurt i udnerstood what you asked for. I rectified the error. I missed a ; in program that was the issue.
I saw the output i got an extra column named count_X with o values. But i need something as in the log.
I need to send the below part in an email with invalid records file attached to it.
NOTE: There were 3508 observations read from the data set WORK.TEST.
NOTE: The data set WORK.WANT1 has 3501 observations and 5 variables.
NOTE: The data set WORK.WANT2 has 7 observations and 5 variables.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
When you send the mail using filename email, simply do this in the data step:
data _null_;
filename emailout; /* or whatever file reference you used in filename email */
/* as one of the options in the infile statement, you can attach the want2 dataset, using its physical path */
put "&total_read observations were read.";
put "&total_output were accepted.";
put "&total_excluded were rejected.";
run;
As an alternative to attaching, you could store the want2 dataset in a library that is accessible to users, and simply add the complete dataset name as text to the email. That keeps mails small (to the pleasure of your mail administrators).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Is this solution you are looking?
data test;
input C$ A B;
datalines;
PPA249394893 12 34
CPA239483984 23 45
DEF948594 56 67
PPA249394893 90 10
;
run;
data ppa_test(drop=new_c) without_ppa(drop=c);
set test;
if upcase(substr(c,1,3))='PPA' then
do;
new_c=substr(c,3); output without_ppa;
end;
else output ppa_test;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Lakshmi,
My input values are given through excel. So first i should import and then i should play around. Also your code is seem to be more lengthy. Just IF statement will solve. Thanks for your response.