BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
normanbg
Obsidian | Level 7

   I imported some data from an Excel file with 5 lines and two columns (proc import datafile): ID and V. Say that ID is 1 2 3 4 5 and V is 2 4 6 8 10. I want to define x as the third entry of V; that is 6. In Fortran, that would be x=V(3); other languages use [ ] or some such thing. What about SAS?

   I don't mean to select the case with ID=3, but to create a scalar variable with the single value 6. Seems simple enough, but I have looked everywhere (clearly, except for the right place), but without success.

   Any help would be greatly appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
normanbg
Obsidian | Level 7
Yes, I did want to reference the variable by its order in the data vector. Your solution (VAR1 VAR2 ... VAR20) is more or less what the workaround (mentioned in an earlier reply) did, but I was looking for what I thought would be a more elegant solution (one-line, without having to recast the Excel file). I guess there just isn't one.
Thank you for your reply.

View solution in original post

25 REPLIES 25
PaigeMiller
Diamond | Level 26

Are you talking about doing this in PROC IML, where vectors are a meaningful concept? Or are you talking about doing this in a DATA step, where there really are no such things as vectors? You can do this in either, but you really haven't indicated which you are using.

 

And what are you going to do with the value 6 once you have selected it?

--
Paige Miller
normanbg
Obsidian | Level 7
Sorry. What I had in mind was to define the x In the Data step that follows the proc import and then use it as I would as any other numerical variable (y=x-p in another Data step, say, or in a proc print).
PaigeMiller
Diamond | Level 26

So, if I understand properly

 

data constant;
    set have;
    if _n_=3 then do; 
        scalar=v;
        output;
    end;
    keep scalar;
run;
data want;
    if _n_=1 then set constant;
    set have;
    y=v-scalar;
run;
--
Paige Miller
normanbg
Obsidian | Level 7
Thank you for your quick reply.
When I try to run your script and then print scalar, all I get is a blank. Obviously, I'm missing something. How about if I just try to extract the third entry of v in the data step and then print it? That should result is simpler code that I may be able to get to work.
PaigeMiller
Diamond | Level 26

@normanbg wrote:
Thank you for your quick reply.
When I try to run your script and then print scalar, all I get is a blank. Obviously, I'm missing something. How about if I just try to extract the third entry of v in the data step and then print it? That should result is simpler code that I may be able to get to work.

We don't know what you did. Please show us the SAS log for the code you used, and the results you got.

--
Paige Miller
normanbg
Obsidian | Level 7

    What I did was to copy your script into a *.sas file and add a print. I included an additional line to check the suggestion offered by PGStats (Tourmaline); no joy in either case (see attached zip file, which contains the Excel file with the raw data and the Results tab in pdf format).

PaigeMiller
Diamond | Level 26

Most of us will not download attachments.

 

Show us the log of your code by including it in your reply. Copy the log as text, paste it into the window that appears here when you click on the </> icon.

--
Paige Miller
normanbg
Obsidian | Level 7

Here is a copy of the log file. (I didn't know about the reluctance to download attachments. Sorry. I'm a slow learner but I learn well.)

 

 
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         proc import datafile="/home/norman3/Norman/Yaacov/Norm.xlsx"
 74         dbms=xlsx out=have replace;
 75         run;
 
 NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with 
       options MSGLEVEL=I.
 NOTE: The import data set has 5 observations and 5 variables.
 NOTE: WORK.HAVE data set was successfully created.
 NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.01 seconds
       memory              2771.00k
       OS Memory           31400.00k
       Timestamp           01/07/2021 12:47:30 PM
       Step Count                        24  Switch Count  2
       Page Faults                       0
       Page Reclaims                     839
       Page Swaps                        0
       Voluntary Context Switches        18
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 
 76         data constant;
 77             set have;
 78             if _n_=3 then scalar=v;
 79          x = input(scan(V, 3), best.);
 80             keep x scalar;
 81         run;
 
 NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
       79:18   
 NOTE: There were 5 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.CONSTANT has 5 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              949.90k
       OS Memory           30124.00k
       Timestamp           01/07/2021 12:47:30 PM
       Step Count                        25  Switch Count  2
       Page Faults                       0
       Page Reclaims                     219
       Page Swaps                        0
       Voluntary Context Switches        9
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 
 82         data want;
 83             if _n_=1 then set constant;
 84             set have;
 85             y=v-scalar;
 86         run;
 
 NOTE: Missing values were generated as a result of performing an operation on missing values.
       Each place is given by: (Number of times) at (Line):(Column).
       5 at 85:8   
 NOTE: There were 1 observations read from the data set WORK.CONSTANT.
 NOTE: There were 5 observations read from the data set WORK.HAVE.
 NOTE: The data set WORK.WANT has 5 observations and 8 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              1307.03k
       OS Memory           30384.00k
       Timestamp           01/07/2021 12:47:30 PM
       Step Count                        26  Switch Count  2
       Page Faults                       0
       Page Reclaims                     166
       Page Swaps                        0
       Voluntary Context Switches        9
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           272
       
 
 87         proc print noobs data=want;
 88         var v scalar x y;
 89         run;
 
 NOTE: There were 5 observations read from the data set WORK.WANT.
 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.02 seconds
       user cpu time       0.02 seconds
       system cpu time     0.00 seconds
       memory              1943.40k
       OS Memory           31144.00k
       Timestamp           01/07/2021 12:47:30 PM
       Step Count                        27  Switch Count  0
       Page Faults                       0
       Page Reclaims                     675
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
       
 
 90         
 91         
 92         
 93         
 94         
 95         
 96         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 108        
Kurt_Bremser
Super User
 79          x = input(scan(V, 3), best.);
 80             keep x scalar;
 81         run;
 
 NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
       79:18   

This indicates that variable V is numeric; the automatic conversion to character can have at most two words (one before and one after the period), so the result of the SCAN function will be empty. This means that the result of the INPUT function will be a missing value.

For reference, run this short code:

data test;
input x;
x1 = scan(x,1);
x2 = scan(x,2);
x3 = scan(x,3);
datalines;
1
1.5
-1.5
;

Log from that:

 73         data test;
 74         input x;
 75         x1 = scan(x,1);
 76         x2 = scan(x,2);
 77         x3 = scan(x,3);
 78         datalines;
 
 NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
       75:11   76:11   77:11   

And you will see in the dataset that x3 is always empty.

normanbg
Obsidian | Level 7
I don't have character variables in the input and I don't want character variables in the output. Please see my reply to your next post..
Kurt_Bremser
Super User

From your initial post, I guess that your dataset looks like this:

data have;
input id v;
datalines;
1 2
2 4
3 6
4 8
5 10
;

So, what is the expected result from this dataset?

normanbg
Obsidian | Level 7
Yes, that is indeed the data set. What I want to extract is the third entry of v, namely 6; that is, I expect (hope) that when I am finished fiddling, I will get x=6 with no blanks: in fortran terms, a scalar.
Kurt_Bremser
Super User

If you need to keep that single value for a next step, store it into a macro variable, either by directly addressing the observation, or using a WHERE:

data _null_;
n = 3;
set have point=n;
call symputx("value",v);
stop;
run;

%put &=value;

data _null_;
set have;
where id = 3;
call symputx("value",v);
run;

%put &=value;
Tom
Super User Tom
Super User

@normanbg wrote:
Yes, that is indeed the data set. What I want to extract is the third entry of v, namely 6; that is, I expect (hope) that when I am finished fiddling, I will get x=6 with no blanks: in fortran terms, a scalar.

Your thinking is upside down here.  SAS works on DATASETS.  A "scalar" or "vector" independent of how the data step processes datasets does not even make any sense.

 

If you want find a specific value from your DATASET the best way is to know what values of the identifier variables identify that observation and the NAME of the variable your want to reference.  

 

It is possible to reference an observation by its position in the dataset.  You were shown one way using the _N_ automatic variable. You could also try the OBS= and FIRSTOBS= dataset options. Or the POINT= option on the SET statement.

 

If you wanted to reference the variable by its order in the data vector of your dataset that is harder. It is pretty easy do using ARRAY statement with a simple dataset structure (say only numeric variables) or a nice variable naming convention (like VAR1 VAR2 ... VAR20).

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 25 replies
  • 2619 views
  • 3 likes
  • 7 in conversation