rk7
Obsidian | Level 7

Can we read data in datalines starting from the end?

* Using an appropriate input option, create a SAS data set named 'COLLEGE' using the data below

* Variables: name, title, tenure, number

* Data:
12345676890123456789012345678890
Stevenson Ph.D. Y 2
Smith Ph.D. N 3
Goldstein M.D. Y 1
George Stevenson Ph.D. Y 2
Fred Smith Ph.D. N 3
Alissa Goldstein M.D. Y 1

1 ACCEPTED SOLUTION
Patrick
Opal | Level 21

@rk7

And just for fun, here is an approach that uses your initial reverse idea:

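data have;
  infile datalines truncover dlm=' ';
  input @;                             /* load the record and hold it */
  _infile_=reverse(strip(_infile_));   /* number now comes first, name last */
  /* list input for the first three fields; $80. takes the rest as the reversed name */
  input number:best32. tenure:$1. title:$10. name $80.;
  title=reverse(strip(title));         /* restore the original character order */
  name=reverse(strip(name));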
  datalines;
Stevenson Ph.D. Y 2
Smith Ph.D. N 3
Goldstein M.D. Y 1
George Stevenson Ph.D. Y 2
Fred Smith Ph.D. N 3
Alissa Goldstein M.D. Y 1
;

proc print;
run;
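
To see why reading the reversed record works, it can help to write one reversed line to the log. The following is only a minimal sketch: the variables line and reversed are illustrative names, not part of the solution above, and the sample value is one of the irregularly spaced records from this thread.

```sas
data _null_;
  length line reversed $40;
  line = 'George Stevenson   Ph.D. Y 2';

  /* After reversal the number is the first word and the name is last */
  reversed = reverse(strip(line));
  put reversed=;

  /* Words are now counted from the left; a blank is the only delimiter,
     so the periods in the title are kept */
  number = input(scan(reversed, 1, ' '), ?? 8.);
  tenure = scan(reversed, 2, ' ');
  title  = reverse(strip(scan(reversed, 3, ' ')));
  put number= tenure= title=;
run;
```

Because the fixed-format fields sit at the end of each record, reversing puts them at the front, where list input can pick them up one by one and leave the variable-length name for last.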


14 REPLIES
Patrick
Opal | Level 21

Why would you want to do that? Is this a study question? I can't see any benefit in even attempting this.

rk7
Obsidian | Level 7

The actual data, with spaces:

Stevenson Ph.D.  Y 2
Smith Ph.D.   N    3
Goldstein   M.D.   Y  1
George Stevenson   Ph.D. Y 2
Fred Smith   Ph.D.    N    3
Alissa Goldstein  M.D.  Y  1

The irregular spacing makes the data tough to read.

We need to read the data with modifiers and other input options,

so I was trying to read the data based on the number. That is why I was asking whether we could read the data in reverse order.

art297
Opal | Level 21

Each non-standard case requires code that can extract what you need. In this case, for example, you could use something like:

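data have (drop=_:);
  input @;                     /* load the record into _infile_ and hold it */
  length name $25;
  length answer $1;
  /* locate the second-to-last word (the Y/N flag): start position and length */
  CALL SCAN(_infile_, -2, _position, _length);
  /* everything before the flag, so name still includes the title */
  name=substr(_infile_,1,_position-2);
  answer=substr(_infile_,_position,_length);
  /* whatever follows the flag is the number */
  type=input(substr(_infile_,_position+_length),8.);
  cards;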
Stevenson Ph.D.  Y 2
Smith Ph.D.   N    3
Goldstein   M.D.   Y  1
George Stevenson   Ph.D. Y 2
Fred Smith   Ph.D.    N    3
Alissa Goldstein  M.D.  Y  1
;
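
If the title should end up in its own variable, the same idea of counting words from the right can be taken one step further. What follows is a rough sketch rather than a tested solution: it assumes every record ends with title, tenure flag and number, in that order, and that blanks are the only delimiters. The data set name college and the variable names come from the original question; the variable lengths are guesses.

```sas
data college (drop=_:);
  infile datalines truncover;
  length name $40 title $10 tenure $1 number 8;
  input @;                                   /* hold the record in _infile_ */

  /* Count words from the right: -1 = number, -2 = tenure, -3 = title.
     A blank is given as the only delimiter so "Ph.D." stays one word. */
  number = input(scan(_infile_, -1, ' '), ?? 8.);
  tenure = scan(_infile_, -2, ' ');
  call scan(_infile_, -3, _position, _length, ' ');

  /* Guard against records with no name in front of the title */
  if _position > 1 then do;
    title = substr(_infile_, _position, _length);
    name  = strip(substr(_infile_, 1, _position - 1));
  end;
  datalines;
Stevenson Ph.D.  Y 2
Smith Ph.D.   N    3
Goldstein   M.D.   Y  1
George Stevenson   Ph.D. Y 2
Fred Smith   Ph.D.    N    3
Alissa Goldstein  M.D.  Y  1
;

proc print data=college;
run;
```

The explicit ' ' delimiter matters here because the default delimiter list of SCAN on ASCII systems includes the period, which would split a title such as Ph.D. into separate words.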

Art, CEO, AnalystFinder.com

 

 

rk7
Obsidian | Level 7
In the output, the name includes the title. Can you please also include a reference for this concept?
art297
Opal | Level 21

The reference for all data step programming is: http://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.3&docsetId=pgmmvaov&docsetTarget=pgms...

 

That is, learn how the data step works and all of the functions that are available to you. The same documentation applies to all of the solutions that have been proposed.

 

Art, CEO, AnalystFinder.com

 

rk7
Obsidian | Level 7
Thank you
PGStats
Opal | Level 21

Regular expressions are a very flexible tool for reading in almost anything.

 

data have;
infile datalines truncover;
length name $40 title $12 tenure $1 number 8;
if not prxId then 
    prxId + prxParse("/(.+?)\s+(\S+)\s+([YN])\s+(\d+)/i");
input line $100.;
if prxmatch(prxId, line) then do;
    name = prxPosn(prxId,1,line);
    title = prxPosn(prxId,2,line);
    tenure = prxPosn(prxId,3,line);
    number = input(prxPosn(prxId,4,line), ?? best.);
end;
drop prxid line;
datalines;
Stevenson Ph.D. Y 2
Smith Ph.D. N 3
Goldstein M.D. Y 1
George Stevenson Ph.D. Y 2
Fred Smith Ph.D. N 3
Alissa Goldstein M.D. Y 1
;

proc print; run;

                Obs    name                title    tenure    number

                 1     Stevenson           Ph.D.      Y          2
                 2     Smith               Ph.D.      N          3
                 3     Goldstein           M.D.       Y          1
                 4     George Stevenson    Ph.D.      Y          2
                 5     Fred Smith          Ph.D.      N          3
                 6     Alissa Goldstein    M.D.       Y          1

 

 

PG
rk7
Obsidian | Level 7
In the output, the name includes the title. Can you please also include a reference for this concept?
PGStats
Opal | Level 21

Simpler then...

 

data have;
infile datalines truncover;
length name $64 tenure $1 number 8;
if not prxId then 
    prxId + prxParse("/(.+?)\s+([YN])\s+(\d+)/i");
input line $100.;
if prxmatch(prxId, line) then do;
    name = prxPosn(prxId,1,line);
    tenure = prxPosn(prxId,2,line);
    number = input(prxPosn(prxId,3,line), ?? best.);
end;
drop prxid line;
datalines;
Stevenson Ph.D. Y 2
Smith Ph.D. N 3
Goldstein M.D. Y 1
George Stevenson Ph.D. Y 2
Fred Smith Ph.D. N 3
Alissa Goldstein M.D. Y 1
;

proc print; run;

                 Obs    name                      tenure    number

                  1     Stevenson Ph.D.             Y          2
                  2     Smith Ph.D.                 N          3
                  3     Goldstein M.D.              Y          1
                  4     George Stevenson Ph.D.      Y          2
                  5     Fred Smith Ph.D.            N          3
                  6     Alissa Goldstein M.D.       Y          1

 

 

PG
rk7
Obsidian | Level 7
I mean the name shouldn't include the title.
PGStats
Opal | Level 21

@rk7, I edited my posts to show the results.

PG
ballardw
Super User

@rk7 wrote:

can we read data in datalines starting from the end 

* Using appropriate input option and create a SAS data set named 'COLLEGE' using the data below

* Variables: name, title, tenure, number

* Data:
12345676890123456789012345678890
Stevenson Ph.D. Y 2
Smith Ph.D. N 3
Goldstein M.D. Y 1
George Stevenson Ph.D. Y 2
Fred Smith Ph.D. N 3
Alissa Goldstein M.D. Y 1


Please explain the role of

12345676890123456789012345678890

 

If it is not actually part of your data do not include it as part of your example.

rk7
Obsidian | Level 7
The numbers I used are there to determine the position of the data, i.e. the starting point and length of each variable.
