I imported some data from an Excel file with 5 lines and two columns (proc import datafile): ID and V. Say that ID is 1 2 3 4 5 and V is 2 4 6 8 10. I want to define x as the third entry of V; that is 6. In Fortran, that would be x=V(3); other languages use [ ] or some such thing. What about SAS?
I don't mean to select the case with ID=3, but to create a scalar variable with the single value 6. Seems simple enough, but I have looked everywhere (clearly, except for the right place), but without success.
Any help would be greatly appreciated.
Are you talking about doing this in PROC IML, where vectors are a meaningful concept? Or are you talking about doing this in a DATA step, where there really are no such things as vectors? You can do this in either, but you really haven't indicated which you are using.
And what are you going to do with the value 6 once you have selected it?
So, if I understand properly
data constant;
set have;
if _n_=3 then do;
scalar=v;
output;
end;
keep scalar;
run;
data want;
if _n_=1 then set constant;
set have;
y=v-scalar;
run;
@normanbg wrote:
Thank you for your quick reply.
When I try to run your script and then print scalar, all I get is a blank. Obviously, I'm missing something. How about if I just try to extract the third entry of v in the data step and then print it? That should result is simpler code that I may be able to get to work.
We don't know what you did. Please show us the SAS log for the code you used, and the results you got.
What I did was to copy your script into a *.sas file and add a print. I included an additional line to check the suggestion offered by PGStats (Tourmaline); no joy in either case (see attached zip file, which contains the Excel file with the raw data and the Results tab in pdf format).
Most of us will not download attachments.
Show us the log of your code by including it in your reply. Copy the log as text, paste it into the window that appears here when you click on the </> icon.
Here is a copy of the log file. (I didn't know about the reluctance to download attachments. Sorry. I'm a slow learner but I learn well.)
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 72 73 proc import datafile="/home/norman3/Norman/Yaacov/Norm.xlsx" 74 dbms=xlsx out=have replace; 75 run; NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with options MSGLEVEL=I. NOTE: The import data set has 5 observations and 5 variables. NOTE: WORK.HAVE data set was successfully created. NOTE: PROCEDURE IMPORT used (Total process time): real time 0.00 seconds user cpu time 0.00 seconds system cpu time 0.01 seconds memory 2771.00k OS Memory 31400.00k Timestamp 01/07/2021 12:47:30 PM Step Count 24 Switch Count 2 Page Faults 0 Page Reclaims 839 Page Swaps 0 Voluntary Context Switches 18 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 264 76 data constant; 77 set have; 78 if _n_=3 then scalar=v; 79 x = input(scan(V, 3), best.); 80 keep x scalar; 81 run; NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column). 79:18 NOTE: There were 5 observations read from the data set WORK.HAVE. NOTE: The data set WORK.CONSTANT has 5 observations and 2 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 949.90k OS Memory 30124.00k Timestamp 01/07/2021 12:47:30 PM Step Count 25 Switch Count 2 Page Faults 0 Page Reclaims 219 Page Swaps 0 Voluntary Context Switches 9 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 264 82 data want; 83 if _n_=1 then set constant; 84 set have; 85 y=v-scalar; 86 run; NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 5 at 85:8 NOTE: There were 1 observations read from the data set WORK.CONSTANT. NOTE: There were 5 observations read from the data set WORK.HAVE. NOTE: The data set WORK.WANT has 5 observations and 8 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 1307.03k OS Memory 30384.00k Timestamp 01/07/2021 12:47:30 PM Step Count 26 Switch Count 2 Page Faults 0 Page Reclaims 166 Page Swaps 0 Voluntary Context Switches 9 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 272 87 proc print noobs data=want; 88 var v scalar x y; 89 run; NOTE: There were 5 observations read from the data set WORK.WANT. NOTE: PROCEDURE PRINT used (Total process time): real time 0.02 seconds user cpu time 0.02 seconds system cpu time 0.00 seconds memory 1943.40k OS Memory 31144.00k Timestamp 01/07/2021 12:47:30 PM Step Count 27 Switch Count 0 Page Faults 0 Page Reclaims 675 Page Swaps 0 Voluntary Context Switches 0 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 0 90 91 92 93 94 95 96 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 108
79 x = input(scan(V, 3), best.); 80 keep x scalar; 81 run; NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column). 79:18
This indicates that variable V is numeric; the automatic conversion to character can have at most two words (one before and one after the period), so the result of the SCAN function will be empty. This means that the result of the INPUT function will be a missing value.
For reference, run this short code:
data test;
input x;
x1 = scan(x,1);
x2 = scan(x,2);
x3 = scan(x,3);
datalines;
1
1.5
-1.5
;
Log from that:
73 data test; 74 input x; 75 x1 = scan(x,1); 76 x2 = scan(x,2); 77 x3 = scan(x,3); 78 datalines; NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column). 75:11 76:11 77:11
And you will see in the dataset that x3 is always empty.
From your initial post, I guess that your dataset looks like this:
data have;
input id v;
datalines;
1 2
2 4
3 6
4 8
5 10
;
So, what is the expected result from this dataset?
If you need to keep that single value for a next step, store it into a macro variable, either by directly addressing the observation, or using a WHERE:
data _null_;
n = 3;
set have point=n;
call symputx("value",v);
stop;
run;
%put &=value;
data _null_;
set have;
where id = 3;
call symputx("value",v);
run;
%put &=value;
@normanbg wrote:
Yes, that is indeed the data set. What I want to extract is the third entry of v, namely 6; that is, I expect (hope) that when I am finished fiddling, I will get x=6 with no blanks: in fortran terms, a scalar.
Your thinking is upside down here. SAS works on DATASETS. A "scalar" or "vector" independent of how the data step processes datasets does not even make any sense.
If you want find a specific value from your DATASET the best way is to know what values of the identifier variables identify that observation and the NAME of the variable your want to reference.
It is possible to reference an observation by its position in the dataset. You were shown one way using the _N_ automatic variable. You could also try the OBS= and FIRSTOBS= dataset options. Or the POINT= option on the SET statement.
If you wanted to reference the variable by its order in the data vector of your dataset that is harder. It is pretty easy do using ARRAY statement with a simple dataset structure (say only numeric variables) or a nice variable naming convention (like VAR1 VAR2 ... VAR20).
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.