DATA Step, Macro, Functions and more

Compare values in the same column

Reply
New Contributor
Posts: 3

Compare values in the same column

Hi there,

 

I have a problem, my problem is that I have a open text column with data that I have to find all distincts but the date has diferences in many rows, for example:

 

Description 

AMK 100 MG AMP CAT4000173
AMK 100 MG AMP REF4000173
AMK 500 MG AMP 2 MLREF 4000174
AMK 500 MG AMP 2 MLREF4000174
AMK 500 MGAMP 2 MLREF4000174
AMILASA EN SANGRE
AMILASA EN SANGRE P
AGUA PIRRIGAR 1000  BOTE PLASTICO  PISA P REF 4000
AGUA PIRRIGAR 1000  BOTE PLATICO  PISA P REF 40002
AGUA PIRRIGAR 500  BOTE PLASTICO  PISA REF400025
AGUA PIRRIGAR 500  BOTE PLASTICO  PISA REF4000255

 

and I need to a group all of them.

 

 

Super User
Posts: 11,343

Re: Compare values in the same column

Can you show how that data should be grouped?

Generally one approach would be to "extract" the values of interest into another variable or variables and then work with the extracted values.

You haven't provided much in the way to identify what would need to be extracted or what you would expect the result to look like when finished.

New Contributor
Posts: 3

Re: Compare values in the same column

I´m looking for something like this, the problem is that I have more than 60,000 of rows and I need to find all similars 

 

 

 

Description Description_grouped
AMK 100 MG AMP CAT4000173AMK 100 MG AMP 
AMK 100 MG AMP REF4000173
AMK 500 MG AMP 2 MLREF 4000174AMK 500 MG AMP 2 
AMK 500 MG AMP 2 MLREF4000174
AMK 500 MGAMP 2 MLREF4000174
AMILASA EN SANGREAMILASA EN SANGRE
AMILASA EN SANGRE P
AGUA PIRRIGAR 1000  BOTE PLASTICO  PISA P REF 4000AGUA PIRRIGAR 1000  BOTE PLASTICO  PISA 
AGUA PIRRIGAR 1000  BOTE PLATICO  PISA P REF 40002
AGUA PIRRIGAR 500  BOTE PLASTICO  PISA REF400025AGUA PIRRIGAR 500  BOTE PLASTICO  PISA 
AGUA PIRRIGAR 500  BOTE PLASTICO  PISA REF4000255

 

Thanks.

Super User
Posts: 11,343

Re: Compare values in the same column

Do you have any rules involved? Such as ignore everything after MLREF, CAT, REF or " P "?

New Contributor
Posts: 3

Re: Compare values in the same column

No, I don´t have any rul jey, I it will help, but not with those that miss an space in the meddile

Frequent Contributor
Posts: 100

Re: Compare values in the same column

[ Edited ]

data WANT;

  set HAE;

  length GROUP $50;

re=prxparse('s/REF/""/');

POS=prxmatch(RE,DESC);

  if POS=0 then do;

    re=prxparse('s/CAT/""/');

    POS=prxmatch(RE,DESC);

      if POS=0 then do;

        re=prxparse('s/sP/""/');

        POS=prxmatch(RE,DESC);

      end;

  end;

GROUP=sbsustr(DESC,sum(POS,-1));

 

if scan(GROUP,-1,"")='P' then GROUP=substr(GROUP,1,length(GROUP)-1);

run;

 

I bet some contributors can help and make this code even better.  But it does the work, for now.

Regular Contributor
Posts: 227

Re: Compare values in the same column

this program will give you a list of the words in your long text string;

the frequency listing will show you you how many occurrences.

 

DATA unique(keep = word);
     attrib word length = $32;*need to increase?;
     drop _:; *_temp vars;
set long_text_strings;
do _i = 1 to countw(text,' ');
   word = scan(text,_i);
   output;
   end;
run;
PROC freq data   = &syslast
          order  = freq;
          tables   word /list /* noprint */
             out = list_words;
run;

then make a list of the words you want to find.

make a variable for each word

 

run a similar loop where you indicate found in the variable

and then do a cross-tabulation.

 

a start, eh?

 

Ron Fehd search, find, xtab maven

Ask a Question
Discussion stats
  • 6 replies
  • 115 views
  • 1 like
  • 4 in conversation