BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GN0001
Barite | Level 11

Hello team,

Hello team,

I have placed scan function on a variable that contain this format100000*01. I need to keep the portion before ".".

scan(100000*01, 1, "*")

I didn't receive any errors in the log.

How can I view that characters are formatted correct? I have 4,000,000 million rows.

Please let me know.

Regards,

Blueblue

Blue Blue
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Your initial attempt did not work because you provided a number. Provide a character, which means just adding quotes.

 

scan("100000*01", 1, "*")

However, this is a useless check and made more work for you because it's not reflective of your real problem. 

Because you're dealing with a variable make a variable to use in the function to test it unless you're planning to type in 4,000,000 values somehow.

 

data check;
variable_to_split = "1111*01";
first_part = scan(variable_to_split, 1, "*");
second_part = scan(variable_to_split, 2, "*");
run;

View solution in original post

16 REPLIES 16
Kurt_Bremser
Super User
scan(100000*01, 1, "*")

Your first argument is numeric (the result of the calculation 100000 multiplied by 1), which is automatically converted to character with the BEST12. format before being fed to the SCAN function.

Reeza
Super User

@GN0001 wrote:

Hello team,

Hello team,

I have placed scan function on a variable that contain this format100000*01. I need to keep the portion before ".".

scan(100000*01, 1, "*")

I didn't receive any errors in the log.

How can I view that characters are formatted correct? I have 4,000,000 million rows.

Please let me know.

Regards,

Blueblue


 

 

You have a variable that has values (not formats) that look like "10000*01"  

 

However, you haven't specified the variable TYPE - character or numeric. 

 

From the SCAN() documentation:

 

SCAN(string, count, character-list, modifier)

 

string

specifies a character constant, variable, or expression.

 

Your usage provides a numeric value as the first argument not a character.

scan(100000*01, 1, "*")

 

GN0001
Barite | Level 11
Hello Reza,
Does this return 100000?

And since the data type of the field is character, then 100000 will be a character.
I am trying to make sure if my function is correct or not and then I asked if there is any way to view the result like what we do in proc freq.

Respectfully,
BlueBlue

Blue Blue
ballardw
Super User

No error but I bet you have a NOTE in the log that you need to consider. A simple example of code such as you used:

1    data example;
2       x=scan(100000*01, 1, "*");
3    run;

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      2:17

Which tells you that a value was converted. The 2:17 refers to the * inside your Scan function call.

SAS will try to guess what you intend and a value that is not character where it should be involves SAS applyiing a number of rules. One is if the value could be a number (no quotes here) then use the number and then convert to character for the SCAN function.

In this case the character you used, the * has a meaning with numbers: multiply so that step succeeds. If you had a different character, such as $, you might get a result like this:

data example;
   x=scan(100000$01, 1, "$");
run;

So any single character between two numbers that results in a valid numeric expression will yield unexpected result such as: + - = /  .  (yes the decimal character)

 

Can you tell without running the code what happens with this?

data example;
   x=scan(100000E01, 1, "E");
run;
GN0001
Barite | Level 11
Hello,
This is what I have inherited:
It is a column with values 1110*01 and it is all character. I want to extract all before *.
What is the best approach?
Regards,
BlueBlue
Blue Blue
Reeza
Super User

Your initial attempt did not work because you provided a number. Provide a character, which means just adding quotes.

 

scan("100000*01", 1, "*")

However, this is a useless check and made more work for you because it's not reflective of your real problem. 

Because you're dealing with a variable make a variable to use in the function to test it unless you're planning to type in 4,000,000 values somehow.

 

data check;
variable_to_split = "1111*01";
first_part = scan(variable_to_split, 1, "*");
second_part = scan(variable_to_split, 2, "*");
run;
GN0001
Barite | Level 11
Hello Reza,
I am trying to take off *01 from 1111*01. Using a function on a field which has character type in a dataset doesn't change the data type of the field. This is what I have understood the entire life.

1111 stays character in a dataset which is already set the data type for this column as character.
Respectfully,
BlueBlue
Blue Blue
ballardw
Super User

@GN0001 wrote:
Hello,
This is what I have inherited:
It is a column with values 1110*01 and it is all character. I want to extract all before *.
What is the best approach?
Regards,
BlueBlue

If it is a variable you code is basically correct:

Newvariable = scan(yourvariablename,1,'*');

The issue is that you posted literal values and not the variable.

 

Have you figured out why 10000E01 doesn't throw an error in my previous post?

GN0001
Barite | Level 11
Hello,
I have not figured out why it doesn't throw an error in the log. I eyeballed it. What I wanted is achieved and data type stays same as before.

Regards,
BlueBlue
Blue Blue
GN0001
Barite | Level 11
Hello team member,
Isn’t a column not called a variable in SAS?
Respectfully
Blue and black
Blue Blue
GN0001
Barite | Level 11
It keeps 100000.
Blue Blue
Reeza
Super User

Why?

 


@GN0001 wrote:
It keeps 100000.

 

GN0001
Barite | Level 11

Hi Reza & all,
What I asked was when we apply a function to a data set, for example a scan function on a values of a variable, how can we make sure our function has modified all the values?

I usually eyeball it and a little of log can help. Is there any other approach?

please advise me.

respectfully, 

blue blue

Blue Blue
Reeza
Super User

To test your code there are various ways. Unit test are the most common one, not really a SAS concept though, a programming methodology more so. 

 

*My* method is to run a PROC FREQ on the input variable. I look for all of the different unique cases that are possible and create small dataset with those values only. You can pipe your PROC FREQ output to a data set to help with this. Then you test it on the small data set where you can verify all data points. Once it's verified then you apply it to your big dataset. 

 

You can also run a PROC FREQ on the input variable and compare it to the output but if you have 4million unique values. 

 

proc freq data=want;
table inputVariable * outputVariable / out = check;
run;

 

 


@GN0001 wrote:

Hi Reza & all,
What I asked was when we apply a function to a data set, for example a scan function on a values of a variable, how can we make sure our function has modified all the values?

I usually eyeball it and a little of log can help. Is there any other approach?

please advise me.

respectfully, 

blue blue


 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 3310 views
  • 7 likes
  • 5 in conversation