DATA Step, Macro, Functions and more

SUBSTRING

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

SUBSTRING

Hi, All

  I have been using SAS for just more than a month. First of all, I appreciate for who read my post and try to answer my question. Here is my question: how to locate row1 and column 1 and do substring on last 3 character on that cell(row1,column1) and insert the substring and fill out into new column? Should I need to use line and column pointer in this case? How ?


Accepted Solutions
Solution
‎03-22-2017 11:42 PM
Super User
Super User
Posts: 6,500

Re: SUBSTRING

[ Edited ]

SAS is not a spreadsheet. It is more like a database.

So instead of "columns" you have variables. You reference variables by name. In your photograph of the data it looks like your variable name is A.

Instead of "rows" you have observations.  Normally you don't select on observation by its position, but instead select based the values it has for the variables.  There are ways in SAS use the observation number if you want, but it does not look like you need (or want) to in this case.

 

Looking at the picture you posted of your data it seems that you want to extract a date (or dates) out of one of the strings.  So you could do this and create a dataset with the new STARTDT and ENDDT variables.  

data date_range ;
  set Source_data ;
  where a =: 'Dates Found:' ;
  startdt = input(scan(a,2,' '),mmddyy10.);
  enddt = input(scan(a,-1,' '),mmddyy10.);
  format startdt enddt date9.;
  keep startdt enddt ;
run;

Then if you want use this date (or dates) together with another (or the same) dataset you might do something like this:

data want ;
   set Source_data ;
   if _n_=1 then set date_range ;
run;

 Since the DATE_RANGE dataset is read only on the first observation the values of STARTDT and ENDDT will be carried forward (retained) onto all of the observations read from SOURCE_DATA.

 

View solution in original post


All Replies
Super User
Posts: 10,500

Re: SUBSTRING

Since row and column aren't really SAS methods of addressing variables you should provide an example of your input  data and what you would like the result to look like. Since this is a SAS question providing data in a form we can work with will help.

Best is to provide a data step program that would look like this:

data have;
  input a $ b $;
datalines;
Some word
another bit
question answer
;
run;

If you know how to run a macro there is one that will do this for an existing data set. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

 

A "row" in a data set can be referenced using the SAS automatic variable _n_ for many purposes. Columns are referred to as variables and we would need to know the name of the varaible involved and the text to replace it with.

Occasional Contributor
Posts: 5

Re: SUBSTRING

Hi, Ballardw
Thanks for your reply. But instead of using "datalines" statement. I want to read from imported Excel sheet. I have created a new table already. Next how can I achieve to locate column 1 and row 1 specifically and extract the substring on that cell. Finally put the substring value into new column on the new table. You understand the scenario now? Thanks

Super User
Posts: 10,500

Re: SUBSTRING


zidane wrote:

Hi, Ballardw
Thanks for your reply. But instead of using "datalines" statement. I want to read from imported Excel sheet. I have created a new table already. Next how can I achieve to locate column 1 and row 1 specifically and extract the substring on that cell. Finally put the substring value into new column on the new table. You understand the scenario now? Thanks


No data, no shown results so I do not understand.

The example was to show how to use the substring function in the manner requested. The assumption is that you know how to reference an existing data set in data step code using the SET statement and how to use the desired named varaible. The _n_ was used to show how pick a specific row.

 

And what goes into your second table exactly?

Valued Guide
Posts: 505

Re: SUBSTRING

This kind of problem is best done with IML or R.
You should be able to cut and paste my R code below  into IML/R

inspired by
https://goo.gl/60YJOr
https://communities.sas.com/t5/Base-SAS-Programming/SUBSTRING/m-p/343431


HAVE
====

Up to 40 obs SD1.HAVE total obs=12

Obs    ROW    COL          STR

  1     1      1     row-1-col-1-100
  2     1      2     row-1-col-2-105
  3     1      3     row-1-col-3-103
  4     2      1     row-2-col-1-106
  5     2      2     row-2-col-2-107
  6     2      3     row-2-col-3-109
  7     3      1     row-3-col-1-108
  8     3      2     row-3-col-2-101
  9     3      3     row-3-col-3-103
 10     4      1     row-4-col-1-104
 11     4      2     row-4-col-2-105
 12     4      3     row-4-col-3-103

WANT
====

Up to 40 obs from wantwps total obs=12

Obs    WANT

  1    100
  2    105
  3    103
  4    106
  5    107
  6    109
  7    108
  8    101
  9    103
 10    104
 11    105
 12    103

WORKING CODE
===========
     substr(have[1,3], 13,15) = '100'
     substr(have[,3],13,15) => output above

FULL SOLUTION
=============

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;

options validvarname=upcase;
libname sd1 "d:/sd1";
data sd1.have;
  do row=1 to 4;
    do col=1 to 3;
       str=catx('-','row',
          put(row,1.),'col',put(col,1.),
          put(100+int(10*uniform(5731)),z3.));
       output;
    end;
  end;
run;quit;

*_        ______  ____        ____
\ \      / /  _ \/ ___|      |  _ \
 \ \ /\ / /| |_) \___ \ _____| |_) |
  \ V  V / |  __/ ___) |_____|  _ <
   \_/\_/  |_|   |____/      |_| \_\

;
%utl_submit_wps64('
libname sd1 "d:/sd1";
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(haven);
have<-as.matrix(read_sas("d:/sd1/have.sas7bdat"));
want<-substr(have[,3], 13, 15);
endsubmit;
import r=want data=wrk.wantwps;
run;quit;
');
Valued Guide
Posts: 505

Re: SUBSTRING

 

This kind of problem is best done with IML or R.
You should be able to cut and paste my R code below  into IML/R

inspired by
https://goo.gl/60YJOr
https://communities.sas.com/t5/Base-SAS-Programming/SUBSTRING/m-p/343431


HAVE
====

Up to 40 obs SD1.HAVE total obs=12

Obs    ROW    COL          STR

  1     1      1     row-1-col-1-100
  2     1      2     row-1-col-2-105
  3     1      3     row-1-col-3-103
  4     2      1     row-2-col-1-106
  5     2      2     row-2-col-2-107
  6     2      3     row-2-col-3-109
  7     3      1     row-3-col-1-108
  8     3      2     row-3-col-2-101
  9     3      3     row-3-col-3-103
 10     4      1     row-4-col-1-104
 11     4      2     row-4-col-2-105
 12     4      3     row-4-col-3-103

WANT
====

Up to 40 obs from wantwps total obs=12

Obs    WANT

  1    100
  2    105
  3    103
  4    106
  5    107
  6    109
  7    108
  8    101
  9    103
 10    104
 11    105
 12    103

WORKING CODE
===========
     substr(have[1,3], 13,15) = '100'
     substr(have[,3],13,15) => output above

FULL SOLUTION
=============

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;

options validvarname=upcase;
libname sd1 "d:/sd1";
data sd1.have;
  do row=1 to 4;
    do col=1 to 3;
       str=catx('-','row',
          put(row,1.),'col',put(col,1.),
          put(100+int(10*uniform(5731)),z3.));
       output;
    end;
  end;
run;quit;

*_        ______  ____        ____
\ \      / /  _ \/ ___|      |  _ \
 \ \ /\ / /| |_) \___ \ _____| |_) |
  \ V  V / |  __/ ___) |_____|  _ <
   \_/\_/  |_|   |____/      |_| \_\

;
%utl_submit_wps64('
libname sd1 "d:/sd1";
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(haven);
have<-as.matrix(read_sas("d:/sd1/have.sas7bdat"));
want<-substr(have[,3], 13, 15);
endsubmit;
import r=want data=wrk.wantwps;
run;quit;
');

 

Occasional Contributor
Posts: 5

Re: SUBSTRING

Hi,

  Thanks for you reply. Currently I'm using SAS Enterprise Guide. First of all, I already imported an excel sheet called "Source_Data" to SAS. The next step I'm trying to wrtie codes to read this Source_Data into another new data set called "Refined_Data". My question is when I read this Source_Data, I want to locate a specific cell such as column 1 row3. Then take the substring on this specific cell such as last 10 character and insert the substring into new clolumn on data set "Refined_Data" Sample.PNG

Super User
Posts: 10,500

Re: SUBSTRING

Which variable contains the text you are looking for? Or is your requirement to search all of the variables to find some specific text to manipulate?

 

How do we know that you need the data from row 3? There should be some rule involved but you have not shared any of the rules other than to say row and column number. Do you already know which "row" and "column" number is involved? If not say so. If so then tell use the name of the variable that represents column whatever number.

 

In your output it isn't quite clear if you only want the "row" form the input data in the output. Or if the result is going to be a single record or multiple records or if the result is going to only have the modified value or anything else.

Occasional Contributor
Posts: 5

Re: SUBSTRING

Hi,

The "Source_Data" is imported excel sheet. I try to local var1(or column 1) and row 2 and take the subtring from last 2 characters. Create a new data set and insert substring into new column.

the current imported table "Source_Data" 

  var1            var2

ab sc ef    12 43 56

dt wg df    56 23 55

      ;                :

      :                :

The desired data set should like:

    var1          Var2       New_col

ab sc ef    12 43 56         df

dt wg df    56 23 55         df

     :                 :                df

 

Solution
‎03-22-2017 11:42 PM
Super User
Super User
Posts: 6,500

Re: SUBSTRING

[ Edited ]

SAS is not a spreadsheet. It is more like a database.

So instead of "columns" you have variables. You reference variables by name. In your photograph of the data it looks like your variable name is A.

Instead of "rows" you have observations.  Normally you don't select on observation by its position, but instead select based the values it has for the variables.  There are ways in SAS use the observation number if you want, but it does not look like you need (or want) to in this case.

 

Looking at the picture you posted of your data it seems that you want to extract a date (or dates) out of one of the strings.  So you could do this and create a dataset with the new STARTDT and ENDDT variables.  

data date_range ;
  set Source_data ;
  where a =: 'Dates Found:' ;
  startdt = input(scan(a,2,' '),mmddyy10.);
  enddt = input(scan(a,-1,' '),mmddyy10.);
  format startdt enddt date9.;
  keep startdt enddt ;
run;

Then if you want use this date (or dates) together with another (or the same) dataset you might do something like this:

data want ;
   set Source_data ;
   if _n_=1 then set date_range ;
run;

 Since the DATE_RANGE dataset is read only on the first observation the values of STARTDT and ENDDT will be carried forward (retained) onto all of the observations read from SOURCE_DATA.

 

Occasional Contributor
Posts: 5

Re: SUBSTRING

Hi, Tom

  Thanks for your sharing and replying. The codes work. I very appreciate your help. I'm new to SAS programming but have some C++ and java background. Do you have some websites or book that you recommend to read.

 

Sean

Contributor
Posts: 43

Re: SUBSTRING

For examle you have an excel file has 2 colums "a and b", and excel has 100 rows. You imported this sheet into sas as "tablehave". When you run this code it gives you only first row (obs=1). And "drop b" command deletes column b.finally you got only A1 cell as of excel mentallity.

Data tablewant:
Set tablehave (obs=1);
Sample=substr(a,1, 3);
Drop b;
Run;
Contributor
Posts: 43

Re: SUBSTRING

Second solution ;

proc sql noprint;
select *
into : first_variable1
from work.tablehave;
quit;

Data tablewant;
X=&first_value1;
Run;
Contributor
Posts: 43

Re: SUBSTRING

In sas EG you can write sas codes, from menu bar select "program "and proper section. I couldnt remember now.
Valued Guide
Posts: 505

Re: SUBSTRING

If you using a windows, laptop or workstation you can run the R code on your local machine. The free WPS express edition can call R and there is no limit on the size of the dataset WPS/R creates,

 

You will not regret leaning a little R and WPS is a very good clone of SAS.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 283 views
  • 3 likes
  • 5 in conversation