Help using Base SAS procedures

Count records that contains specific text


I have the following SAS code. Right now, the output data set contains every variable from the mylib.test table plus the variables count_gafi, count_sanction and count_person. How can I make it keep only the variables count_gafi, count_sanction and count_person?

Here is the code. Thank you for your help and time.

data work.ind14_15_16;
set mylib.test end=eof;
by objet;
retain count_gafi count_sanction count_person ;

switch = "n";

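if _n_ eq 1 then do;
/* Initialize the retained counters on the first observation */
count_gafi = 0;
count_sanction = 0;
count_person = 0;
end;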
if index(lowcase(strip(objet)),'bolivie')>0 or
index(lowcase(strip(objet)),'équateur')>0 or
index(lowcase(strip(objet)),'equateur')>0 or
index(lowcase(strip(objet)),'éthiopie')>0 or
index(lowcase(strip(objet)),'ethiopie')>0 or
index(lowcase(strip(objet)),'indonésie')>0 or
index(lowcase(strip(objet)),'indonesie')>0 or
index(lowcase(strip(objet)),'kenya')>0 or
index(lowcase(strip(objet)),'nigeria')>0 or
index(lowcase(strip(objet)),'nigéria')>0 or
index(lowcase(strip(objet)),'pakistan')>0 or
index(lowcase(strip(objet)),'sao')>0 or
index(lowcase(strip(objet)),'sri lanka')>0 or
index(lowcase(strip(objet)),'thailand')>0 or
index(lowcase(strip(objet)),'tailand')>0 or
index(lowcase(strip(objet)),'turquie')>0 or
index(lowcase(strip(objet)),'tanzanie')>0 or
index(lowcase(strip(objet)),'viet')>0 or
index(lowcase(strip(objet)),'yemen')>0 or
index(lowcase(strip(objet)),'yémen')>0  then do;
switch = "y";
count_gafi = count_gafi + 1;
end;

if index(lowcase(strip(objet)),'bélarus')>0 or
index(lowcase(strip(objet)),'belarus' )>0 or
index(lowcase(strip(objet)),'corée du nord' )>0 or
index(lowcase(strip(objet)),'rpdc' )>0 or
index(lowcase(strip(objet)),'coree du nord')>0 or
index(lowcase(strip(objet)),'ivoire'  )>0 or
index(lowcase(strip(objet)),'congo' )>0 or
index(lowcase(strip(objet)),'chine' )>0 or
index(lowcase(strip(objet)), 'cuba'  )>0 or
index(lowcase(strip(objet)),'érythrée' )>0 or
index(lowcase(strip(objet)), 'erythree')>0 or
index(lowcase(strip(objet)), 'iran'  )>0 or
index(lowcase(strip(objet)),'iraq' )>0 or
index(lowcase(strip(objet)), 'liberia'  )>0 or
index(lowcase(strip(objet)),'libéria' )>0 or
index(lowcase(strip(objet)),'libye'  )>0 or
index(lowcase(strip(objet)),'myanmar' )>0 or
index(lowcase(strip(objet)),'birmanie')>0 or
index(lowcase(strip(objet)), 'somalie'  )>0 or
index(lowcase(strip(objet)),'sierra' )>0 or
index(lowcase(strip(objet)), 'soudan')>0 or
index(lowcase(strip(objet)), 'syrie' )>0
then do;
switch = "y";
count_sanction = count_sanction + 1;
end;

if switch ="n" then do;
count_person = count_person + 1; /* If the record is not part of count_gafi or count_sanction, I want it to be part of count_person */
end;
if eof; /* Output only the last observation, which holds the final counts */
run;

Re: Count records that contains specific text


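Change

data work.ind14_15_16;

to

data work.ind14_15_16(keep=count_:);

The colon makes count_ a name prefix, so KEEP= retains every variable whose name starts with count_.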

Re: Count records that contains specific text

Thank you for the quick reply linlin.

Exactly what I needed.

Also, would you know a more efficient method to perform the count above?

Re: Count records that contains specific text

I don't know. Sorry!

Re: Count records that contains specific text

You could improve the speed by reducing the number of function calls.  Try adding this statement:

newvar = lowcase(strip(objet));

Then refer to NEWVAR inside all the INDEX functions.
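
A minimal sketch of that change, using only a few keywords from each list in the original post. The length of 200 for NEWVAR is an assumption and should be matched to the length of objet; the counters are initialized on the RETAIN statement, and the KEEP= option leaves NEWVAR and switch out of the result.

```sas
data work.ind14_15_16 (keep=count_:);
    set mylib.test end=eof;
    length newvar $200;   /* assumed length; match it to objet */
    retain count_gafi count_sanction count_person 0;

    /* Normalize the text once per observation instead of once per INDEX call */
    newvar = lowcase(strip(objet));
    switch = "n";

    /* Shortened keyword list; add the remaining names from the original code */
    if index(newvar, 'bolivie') > 0 or
       index(newvar, 'kenya') > 0 then do;
        switch = "y";
        count_gafi = count_gafi + 1;
    end;

    /* Shortened keyword list; add the remaining names from the original code */
    if index(newvar, 'cuba') > 0 or
       index(newvar, 'iran') > 0 then do;
        switch = "y";
        count_sanction = count_sanction + 1;
    end;

    if switch = "n" then count_person = count_person + 1;

    if eof;   /* keep only the last observation, which holds the totals */
run;
```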

You could restrict the variables you read in on the SET statement:

set mylib.test (keep=objet) end=eof;

A more complex option, and possibly beyond your skill level: it appears your data are sorted by objet. In that case, you could compute flags on the first observation for each value of objet. The flags would indicate whether count_gafi or count_sanction should be incremented for that value of objet (and whether switch should be y or n). All the remaining observations with the same value of objet could then reuse the flags, instead of recomputing them with a ton of function calls for every observation.
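
The flag approach could look roughly like the sketch below. It assumes mylib.test really is sorted by objet (the BY statement fails otherwise), shortens the keyword lists to two entries each, and introduces the hypothetical names flag_gafi and flag_sanction; the length of 200 for newvar is also an assumption.

```sas
data work.ind14_15_16 (keep=count_:);
    set mylib.test (keep=objet) end=eof;
    by objet;             /* requires the input to be sorted by objet */
    length newvar $200;   /* assumed length; match it to objet */
    retain count_gafi count_sanction count_person 0
           flag_gafi flag_sanction 0;

    /* Run the text searches only once per distinct value of objet */
    if first.objet then do;
        newvar = lowcase(strip(objet));
        /* A comparison evaluates to 1 (true) or 0 (false) */
        flag_gafi     = (index(newvar, 'bolivie') > 0 or index(newvar, 'kenya') > 0);
        flag_sanction = (index(newvar, 'cuba') > 0 or index(newvar, 'iran') > 0);
    end;

    /* Every observation in the BY group reuses the retained flags */
    count_gafi     = count_gafi + flag_gafi;
    count_sanction = count_sanction + flag_sanction;
    if not (flag_gafi or flag_sanction) then count_person = count_person + 1;

    if eof;   /* keep only the last observation, which holds the totals */
run;
```

The gain depends on how many observations share each value of objet: if nearly every value is unique, the searches still run on almost every row.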

Good luck.

Re: Count records that contains specific text

Thank you very much for the recommendation.

Best regards.

