BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
epr
Calcite | Level 5 epr
Calcite | Level 5

Hello All,

Need some help reading data using scan

data a;

input info $1-50;

cards;

101 pencils 35

102 parker pensl21

103 dellstudioipod 04

104 dellstudioipod 01

run;

Here I separated columns using scan function.

data cd;

set j;

prid=input(scan(info,1,' '),3.);

prname=put(scan(info,2,' '),$20.);

prqty=input(scan(info,3,' '),3.);

drop info;

run;

But for this data set scan function is not woking as product name it self has sepated with spaces.

data a;

input info $1-50;

cards;

101 pencils 35

102 parker pens 21

103 apple ipod touch n shuffle 04

104 dell studio ipod 01

run;

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

If you have any control over the generation of the list you should request that they use a delimiter other than space.  Or if they can insure that there are not double spaces embedded in the PRNAME field and that are two spaces before the PRQTY field.  Or change the order of the fields so that the PRNAME field is at the end of the line. Then the INPUT statement could read it.

In your case it looks possible, as long as you can insure that all records have that field at the end.

data want ;

  set have;

   length prid 8 prname $20 prqty 8 ;

   length x1 x2 $8 ;

   x1=scan(info,1,' ');

   x2=scan(info,-1,' ');

  prid=input(X1,3.);

  prqty=input(X2,3.);

  prname=left(substr(info,length(X1)+1,length(info) - length(x1) - length(x2) -2 ));

   drop info x1 x2 ;

run;

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

If you have any control over the generation of the list you should request that they use a delimiter other than space.  Or if they can insure that there are not double spaces embedded in the PRNAME field and that are two spaces before the PRQTY field.  Or change the order of the fields so that the PRNAME field is at the end of the line. Then the INPUT statement could read it.

In your case it looks possible, as long as you can insure that all records have that field at the end.

data want ;

  set have;

   length prid 8 prname $20 prqty 8 ;

   length x1 x2 $8 ;

   x1=scan(info,1,' ');

   x2=scan(info,-1,' ');

  prid=input(X1,3.);

  prqty=input(X2,3.);

  prname=left(substr(info,length(X1)+1,length(info) - length(x1) - length(x2) -2 ));

   drop info x1 x2 ;

run;

data_null__
Jade | Level 19

Like Tom's only different and fixed now.

17         data _null_;
18            input info $1-50;
19            call scan(info,1 ,p1,l1,' ');
20            call scan(info,-1,p2,l2,' ');
21            w1 = substrn(info,p1,l1);
22            w2 = substrn(info,l1+2,p2-l1-2);
23            w3 = substrn(info,p2,l2);
24            put (w:)(=);
25            cards;

w1=
101 w2=pencils w3=35
w1=
101 w2=parker pens w3=214444444
w1=
103 w2=apple ipod touch n shuffle w3=0444
w1=
104 w2=dell studio ipod w3=01

Message was edited by: data _null_

Tom
Super User Tom
Super User

Length of second field does not depend on length of the last field.

It worked for the example because they were all length of 2.


w2 = substrn(info,l1+2,p2-l1-2);

data_null__
Jade | Level 19

OOPS

epr
Calcite | Level 5 epr
Calcite | Level 5

Thank you Tom and ballardw,

I Appreciate your help

Tom your code worked , I got right output.

ballardw
Super User

If I had this data, depending on the source, I would seriously consider rearring the input model. If data were entered in prid prqty and prname order then you wouldn't need to parse the string but just read the variables with suitable options.

With your example data I would input prid directly and then try to parse the remainder.

If none of your product names contain digits then you could search for the first digit that appears using the ANYDIGIT function and then use SUBSTR

input prid $ info $varying50. ;

then something like

prname = substr(info,anyalpha(info),anydigit(info)-1);

prqty = input((substr(info,anydigit(info)),3.);

ScottBass
Rhodochrosite | Level 12

I'm a regex fan, so here's another approach:

*  your first example ;

data a;

input info $1-50;

cards;

101 pencils 35

102 parker pensl21

103 dellstudioipod 04

104 dellstudioipod 01

run;

* your second example ;

data b;

input info $1-50;

cards;

101 pencils 35

102 parker pens 21

103 apple ipod touch n shuffle 04

104 dell studio ipod 01

run;

data want;

   set a;  * or set b ;

   rx=prxparse("/^(\d+)(\D*)(\d+)$/o");

   rc=prxmatch(rx,trim(info));

   prid=input(prxposn(rx,1,info),best.);

   prname=prxposn(rx,2,info);

   prqty=input(prxposn(rx,3,info),best.);

run;

This will assign:

1) prid = beginning of line, all digits up to

2) prname = all non-digits, up to

3) prqty = all digits at the end of line


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1000 views
  • 6 likes
  • 5 in conversation