BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cmoore
Obsidian | Level 7

I have a data set and have created new columns from an original column called "Africa". This cell contained several variables together that I wanted to isolate and place the new variables into new columns e.g. Africa1, Africa2, etc.

 

E.g Africa= "Corporate/M&A (Band 3).Dispute Resolution (Band 2)

      Africa1 = "Banking & Finance (Band 3)"

      Africa2 = "Corporate/M&A (Band 2)"

 

Once I have done this I want to create a unique column that merges together the region and sector: E,g Africa_BF (this searches for Africa and Banking and Finance variable) that reads all the columns in Africa (1,2,3..) to put them into the new column. I populate the new column with the band numeric and hence the scan function. When creating the new column my code looks like the below:

 

if index(upcase(africa_wide1),'BANKING & FINANCE')>0 then Africa_BF=scan(africa_wide1,-1);
if index(upcase(africa_wide2),'BANKING & FINANCE')>0 then Africa_BF=scan(africa_wide2,-1);
if index(upcase(africa_wide3),'BANKING & FINANCE')>0 then Africa_BF=scan(africa_wide3,-1);
if index(upcase(africa_wide4),'BANKING & FINANCE')>0 then Africa_BF=scan(africa_wide4,-1);
if index(upcase(africa_wide5),'BANKING & FINANCE')>0 then Africa_BF=scan(africa_wide5,-1);
if index(upcase(africa_wide6),'BANKING & FINANCE')>0 then Africa_BF=scan(africa_wide6,-1);
if index(upcase(africa_wide7),'BANKING & FINANCE')>0 then Africa_BF=scan(africa_wide7,-1);

if index(upcase(africa_wide1),'CORPORATE/M&A')>0 then Africa_Corp_MA=scan(africa_wide1,-1);
if index(upcase(africa_wide2),'CORPORATE/M&A')>0 then Africa_Corp_MA=scan(africa_wide2,-1);
if index(upcase(africa_wide3),'CORPORATE/M&A')>0 then Africa_Corp_MA=scan(africa_wide3,-1);
if index(upcase(africa_wide4),'CORPORATE/M&A')>0 then Africa_Corp_MA=scan(africa_wide4,-1);
if index(upcase(africa_wide5),'CORPORATE/M&A')>0 then Africa_Corp_MA=scan(africa_wide5,-1);
if index(upcase(africa_wide6),'CORPORATE/M&A')>0 then Africa_Corp_MA=scan(africa_wide6,-1);
if index(upcase(africa_wide7),'CORPORATE/M&A')>0 then Africa_Corp_MA=scan(africa_wide7,-1);

 

There are roughly 80 regions and 40 sector types that would mean I would have to write code for hundreds of columns. Is there a more efficient way to do this? For each region there would be 40 possible outcomes of Sector/Region.

 

Many Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
rogerjdeangelis
Barite | Level 11
Normalizing a fat variable into key/value pairs

see
https://communities.sas.com/t5/Base-SAS-Programming/SAS-Macros/m-p/349115

HAVE
====

Up to 40 obs WORK.HAVE total obs=2

 FIRM     AFRICA_WIDE

 Firm1    Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3).Projects & Energy: Mining & Minerals (Band 3)
 Firm2    Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3)

WANT (if you move the output statement inside the loop you will get the form you wanted or transpose want)


Up to 40 obs WORK.WANT total obs=7

Obs    FIRM      NAME      PROJECTS                                         BAND

 1     Firm1    AFRICA1    Corporate/M&A (Band 3)                            3
 2     Firm1    AFRICA2    Dispute Resolution (Band 2)                       2
 3     Firm1    AFRICA3    Projects & Energy (Band 3)                        3
 4     Firm1    AFRICA4    Projects & Energy: Mining & Minerals (Band 3)     3

 5     Firm2    AFRICA1    Corporate/M&A (Band 3)                            3
 6     Firm2    AFRICA2    Dispute Resolution (Band 2)                       2
 7     Firm2    AFRICA3    Projects & Energy (Band 3)                        3

proc transpose data=want out=wantxpo;
by firm;
var band;
id projects;
run;quit;

                          CORPORATE_      DISPUTE_      PROJECTS___     ENERGY__
                          M_A__BAND_    RESOLUTION__     ENERGY__       MINING___
Obs    FIRM     _NAME_        3_          BAND_2_         BAND_3_         MINE

 1     Firm1     BAND         3              2               3              3
 2     Firm2     BAND         3              2               3


WORKING CODE

     * dimension (permax) are the max number of projects acroos all firs;

      array africas[&permax] $64 africa1-africa&permax;
      array africasvals[&permax] $5 africasvals1-africasvals&permax;

FULL SOLUTION

*                _              _       _
 _ __ ___   __ _| | _____    __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \  / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/ | (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|  \__,_|\__,_|\__\__,_|

;

data have;
 firm             ='Firm1';
 africa_wide      ='Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3).Projects & Energy: Mining & Minerals (Band 3)';
 africa1          ='Corporate/M&A (Band 3)';
 africa2          ='Dispute Resolution (Band 2)';
 africa3          ='Projects & Energy (Band 3)';
 africa4          ='Projects & Energy: Mining & Minerals (Band 3)';
 africa_corp      =3;
 africa_resolut   =2;
 africa_nrgy_mine =3;
 africa_nrgy_mine =3;
run;quit;


data have;
 firm             ='Firm1';
 africa_wide      ='Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3).Projects & Energy: Mining & Minerals (Band 3)';
 output;
 firm             ='Firm2';
 africa_wide      ='Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3)';
 output;
run;quit;

*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

data want;
  * you need to know the maximum number of africas, ie africa1-africa9?;
  if _n_ = 0 then do;
    rc=%sysfunc(dosubl('
      proc sql;
         select max(percnt) into :permax separated by ""
         from (select countc(africa_wide,".") + 1 as percnt from have group by firm)
      ;quit;
    '));
  end;
  put "&&permax";
  length projects $64;
  set have;
  array africas[&permax] $64 africa1-africa&permax;
  array africasvals[&permax] $5 africasvals1-africasvals&permax;
  do i = 1 to  countc(africa_wide,'.') + 1;
    africas[i]=scan(africa_wide,i,'.');
    africasvals[i]=compress(africas[i],,'kd');
    name=vname(africas[i]);
    projects=africas[i];
    band=compress(africas[i],,'kd');
    output;
  end;
  keep firm name projects band;;
run;quit;


3686  data want;
3687    length project $64;
3688    * you need to know the maximum number of africas, ie africa1-africa9?;
3689    if _n_ = 0 then do;
3690      rc=%sysfunc(dosubl('
NOTE: A GROUP BY clause has been discarded because neither the SELECT clause nor the optional
      HAVING clause of the associated table-expression referenced a summary function.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.03 seconds
      user cpu time       0.00 seconds
      system cpu time     0.01 seconds
      memory              3727.00k
      OS Memory           20468.00k
      Timestamp           04/11/2017 04:01:10 PM
      Step Count                        338  Switch Count  0


3691        proc sql;
3692           select max(percnt) into :permax separated by ""
3693           from (select countc(africa_wide,".") + 1 as percnt from have group by firm)
3694        ;quit;
3695      '));
3696    end;
SYMBOLGEN:  && resolves to &.
SYMBOLGEN:  Macro variable PERMAX resolves to 4
3697    put "&&permax";
3698    set have;
3699    array africas[&permax] $64 africa1-africa&permax;
SYMBOLGEN:  Macro variable PERMAX resolves to 4
SYMBOLGEN:  Macro variable PERMAX resolves to 4
3700    array africasvals[&permax] $5 africasvals1-africasvals&permax;
SYMBOLGEN:  Macro variable PERMAX resolves to 4
SYMBOLGEN:  Macro variable PERMAX resolves to 4
3701    do i = 1 to  countc(africa_wide,'.') + 1;
3702      africas[i]=scan(africa_wide,i,'.');
3703      africasvals[i]=compress(africas[i],,'kd');
3704      name=vname(africas[i]);
3705      projects=africas[i];
3706      band=compress(africas[i],,'kd');
3707      output;
3708    end;
3709    keep firm name projects band;;
3710  run;

4
NOTE: There were 2 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 7 observations and 4 variables.
NOTE: DATA statement used (Total process time):




View solution in original post

7 REPLIES 7
RW9
Diamond | Level 26 RW9
Diamond | Level 26

First SAS Macro is not needed!

Look up array processing, something like:

data want;
  set have;
  array vals{*} africa_wide:;
  do i=1 to dim(vals);
    if vals{i}...;
  end;
run;

If you post test data in the form of a datastep then more acurate code can be provided.

cmoore
Obsidian | Level 7

Hi,

 

Please find attached test data. This is the desired output. In essence, rather than write out all of the IF statements to populate the new columns I wondered if there was a more efficient alternative? By creating the new columns like Africa_BF it would allow me to eye-ball the data across all of the the different firms in column A and compare. The problem I have is that for each region there would be 80 columns for each one.

 

Thank you

data_null__
Jade | Level 19

Post data as text not XL and not attachement.

cmoore
Obsidian | Level 7

Please see the text below as one example:

 

Firm NameAfrica-wideAfrica_wide1Africa_wide2Africa_wide3Africa-Wide - Corporate/MAAfrica-Wide - Dispute ResolutionAfrica-Wide - Projects & Energy
Firm1Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3).Projects & Energy: Mining & Minerals (Band 3)Corporate/M&A (Band 3)Dispute Resolution (Band 2)Projects & Energy (Band 3)323

 

Astounding
PROC Star

Here's a suggestion that I think will make life much easier. 

 

Change the structure of the data.  Instead of creating a variable named Africa_wide_1, create a variable named (something like) Africa_wide_Corporate_MA.  Put the data into the right variable to begin with.  Your subsequent processing will be (relatively) easy if there is only one variable that holds 'Corporate M&A/(Band #)'.

rogerjdeangelis
Barite | Level 11
Normalizing a fat variable into key/value pairs

see
https://communities.sas.com/t5/Base-SAS-Programming/SAS-Macros/m-p/349115

HAVE
====

Up to 40 obs WORK.HAVE total obs=2

 FIRM     AFRICA_WIDE

 Firm1    Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3).Projects & Energy: Mining & Minerals (Band 3)
 Firm2    Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3)

WANT (if you move the output statement inside the loop you will get the form you wanted or transpose want)


Up to 40 obs WORK.WANT total obs=7

Obs    FIRM      NAME      PROJECTS                                         BAND

 1     Firm1    AFRICA1    Corporate/M&A (Band 3)                            3
 2     Firm1    AFRICA2    Dispute Resolution (Band 2)                       2
 3     Firm1    AFRICA3    Projects & Energy (Band 3)                        3
 4     Firm1    AFRICA4    Projects & Energy: Mining & Minerals (Band 3)     3

 5     Firm2    AFRICA1    Corporate/M&A (Band 3)                            3
 6     Firm2    AFRICA2    Dispute Resolution (Band 2)                       2
 7     Firm2    AFRICA3    Projects & Energy (Band 3)                        3

proc transpose data=want out=wantxpo;
by firm;
var band;
id projects;
run;quit;

                          CORPORATE_      DISPUTE_      PROJECTS___     ENERGY__
                          M_A__BAND_    RESOLUTION__     ENERGY__       MINING___
Obs    FIRM     _NAME_        3_          BAND_2_         BAND_3_         MINE

 1     Firm1     BAND         3              2               3              3
 2     Firm2     BAND         3              2               3


WORKING CODE

     * dimension (permax) are the max number of projects acroos all firs;

      array africas[&permax] $64 africa1-africa&permax;
      array africasvals[&permax] $5 africasvals1-africasvals&permax;

FULL SOLUTION

*                _              _       _
 _ __ ___   __ _| | _____    __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \  / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/ | (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|  \__,_|\__,_|\__\__,_|

;

data have;
 firm             ='Firm1';
 africa_wide      ='Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3).Projects & Energy: Mining & Minerals (Band 3)';
 africa1          ='Corporate/M&A (Band 3)';
 africa2          ='Dispute Resolution (Band 2)';
 africa3          ='Projects & Energy (Band 3)';
 africa4          ='Projects & Energy: Mining & Minerals (Band 3)';
 africa_corp      =3;
 africa_resolut   =2;
 africa_nrgy_mine =3;
 africa_nrgy_mine =3;
run;quit;


data have;
 firm             ='Firm1';
 africa_wide      ='Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3).Projects & Energy: Mining & Minerals (Band 3)';
 output;
 firm             ='Firm2';
 africa_wide      ='Corporate/M&A (Band 3).Dispute Resolution (Band 2).Projects & Energy (Band 3)';
 output;
run;quit;

*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

data want;
  * you need to know the maximum number of africas, ie africa1-africa9?;
  if _n_ = 0 then do;
    rc=%sysfunc(dosubl('
      proc sql;
         select max(percnt) into :permax separated by ""
         from (select countc(africa_wide,".") + 1 as percnt from have group by firm)
      ;quit;
    '));
  end;
  put "&&permax";
  length projects $64;
  set have;
  array africas[&permax] $64 africa1-africa&permax;
  array africasvals[&permax] $5 africasvals1-africasvals&permax;
  do i = 1 to  countc(africa_wide,'.') + 1;
    africas[i]=scan(africa_wide,i,'.');
    africasvals[i]=compress(africas[i],,'kd');
    name=vname(africas[i]);
    projects=africas[i];
    band=compress(africas[i],,'kd');
    output;
  end;
  keep firm name projects band;;
run;quit;


3686  data want;
3687    length project $64;
3688    * you need to know the maximum number of africas, ie africa1-africa9?;
3689    if _n_ = 0 then do;
3690      rc=%sysfunc(dosubl('
NOTE: A GROUP BY clause has been discarded because neither the SELECT clause nor the optional
      HAVING clause of the associated table-expression referenced a summary function.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.03 seconds
      user cpu time       0.00 seconds
      system cpu time     0.01 seconds
      memory              3727.00k
      OS Memory           20468.00k
      Timestamp           04/11/2017 04:01:10 PM
      Step Count                        338  Switch Count  0


3691        proc sql;
3692           select max(percnt) into :permax separated by ""
3693           from (select countc(africa_wide,".") + 1 as percnt from have group by firm)
3694        ;quit;
3695      '));
3696    end;
SYMBOLGEN:  && resolves to &.
SYMBOLGEN:  Macro variable PERMAX resolves to 4
3697    put "&&permax";
3698    set have;
3699    array africas[&permax] $64 africa1-africa&permax;
SYMBOLGEN:  Macro variable PERMAX resolves to 4
SYMBOLGEN:  Macro variable PERMAX resolves to 4
3700    array africasvals[&permax] $5 africasvals1-africasvals&permax;
SYMBOLGEN:  Macro variable PERMAX resolves to 4
SYMBOLGEN:  Macro variable PERMAX resolves to 4
3701    do i = 1 to  countc(africa_wide,'.') + 1;
3702      africas[i]=scan(africa_wide,i,'.');
3703      africasvals[i]=compress(africas[i],,'kd');
3704      name=vname(africas[i]);
3705      projects=africas[i];
3706      band=compress(africas[i],,'kd');
3707      output;
3708    end;
3709    keep firm name projects band;;
3710  run;

4
NOTE: There were 2 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 7 observations and 4 variables.
NOTE: DATA statement used (Total process time):




cmoore
Obsidian | Level 7

Thanks very much for your help

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1655 views
  • 1 like
  • 5 in conversation