Imported a .csv file but cannot delete the first observation
IF ID<2 THEN DELETE;
PROC PRINT;
RUN;
try this
If _N_ = 1 then delete;
That didn't work. Sorry.
got this error
ERROR 180-322: Statement is not valid or it is used out of proper order
If you are reading the file with INFILE and INPUT, use the option firstobs=2 in the INFILE statement.
Yeah, you cannot just place code anywhere. To remove records, the IF ... THEN DELETE statement needs to be inside a data step.
data want;
set sashelp.class;
*drop the first two observations (_N_ is the data step iteration counter);
if _n_ < 3 then delete;
run;
You also really shouldn't use a bare PROC PRINT. Add DATA= to say which data set to print; otherwise SAS prints the most recently created data set, which may not be the one you expect. I highly, highly suggest against coding in that style.
Assuming you have a data set already, you're looking for:
data want;
set have;
if ID < 2 then delete;
run;
proc print data=want;run;
Thanks Reeza. Our instructor is not the best at explaining the concepts, which is why we are struggling.
We were provided with a CSV file named '1A'. I imported it from my desktop using the Import Wizard. How do I start using that file?
I mean, should I code:
Data 1A;
if _n_ < 2 then delete;
?
*creates an output data set called WANT;
DATA want;
*use the data set have as your data source;
SET have;
*delete if value of ID is less than 2;
IF id < 2 THEN DELETE;
RUN;
*print results to view;
PROC PRINT DATA=want;RUN;
I added some comments to my code above that explain each line. The items in ALL CAPS are essentially 'control words': they are commands to SAS. The ones in lower case are the ones you should be changing; they're the inputs and calculations.
I generally do not recommend using the wizard to import files either. You can't script it, and remembering that you have to redo that step is difficult. Look at PROC IMPORT for now to import your data.
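As a rough sketch of what a scripted import could look like: the path and the output name HAVE below are placeholders I made up, not your actual file location. Note that under the default naming rules a SAS data set name cannot start with a digit, so something like "Data 1A;" will not work as written; pick a name that starts with a letter or underscore.

```sas
/* Sketch only: the file path and the data set name HAVE are assumptions. */
proc import datafile='c:\temp\1A.csv'
    out=work.have        /* valid SAS name: starts with a letter */
    dbms=csv
    replace;             /* overwrite WORK.HAVE if it already exists */
  getnames=yes;          /* take variable names from the first line */
  datarow=2;             /* data values start on line 2 */
  guessingrows=max;      /* scan all rows before deciding column types */
run;

/* Always name the data set you want to print. */
proc print data=work.have;
run;
```

Because this is code, you can rerun it whenever the file changes, which is the main advantage over the wizard.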
There are tons of tutorials on video.sas.com>How to Tutorials>SAS Analytics U for basic getting started videos that are short in length.
You want to read all the CSV data except for the first line. That can be done with the FIRSTOBS= option in the INFILE statement (you know, the statement that identifies your CSV data):
data want;
infile 'c:\temp\mydata.csv' dlm=',' dsd firstobs=2;
input id sex race doa :mmddyy10. .... status height weight;
format doa date9. ;
run;
I suggest the "firstobs=2" approach because the same tool is available when processing a SAS data set, except that there FIRSTOBS= is specified as a data set option in parentheses after the data set name, as in:
proc print data=have (firstobs=2);
run;
or
data want;
set have (firstobs=2);
run;
One way to just display without actually removing from the data:
Proc print data=<your data set name goes here> (firstobs=2);
run;
Or if you want to filter on the value of one or more variables use a WHERE statement:
proc print;
where id ge 2;
run;
That keeps the observations where the value of the VARIABLE id is 2 or more. If you do not have a variable named ID then it does not work and will generate errors.
IMPORTANT: Do not lie to computers. The file that was attached was in SYLK format according to EXCEL and has errors at that.
For plain text it is best to just paste into a code box opened with the forum's {I} icon:
ID,Sex,Race,DOA,LOS,Status,DX1,DX2,DX3,DX4,DX5,DOB,HxCVD,HxDM,HxHTN,HxCOPD,Height,Weight
1,1,1,11/11/1111,1,1,xxxxx,xxxxx,xxxxx,xxxxx,xxxxx,11/11/1111,1,1,1,1,11.1,111.11
2,2,1,12/2/1996,2,1,4359,,,,,9/7/1937,2,1,2,2,61.5,79.2
3,2,1,12/24/1996,2,1,4359,496,25000,4019,,9/29/1937,1,1,1,2,65.7,92.25
4,1,1,12/31/1998,6,1,4280,42732,25001,41400,4293,10/2/1939,1,2,1,1,69.8,79.9
5,2,1,7/18/1996,3,1,496,42800,25001,27880,40190,4/22/1937,2,1,2,2,60.2,79.3
6,2,1,3/23/1997,6,1,49121,27620,42800,25001,78057,12/23/1937,1,1,1,1,67.6,91.6
7,2,1,10/24/1997,4,1,49120,42800,25001,41400,24490,7/28/1938,2,1,1,1,68.7,65.5
8,2,1,3/9/1998,6,1,49121,25001,42800,51830,40190,12/9/1938,1,1,2,2,64.1,64.9
9,2,1,7/9/1998,7,1,49121,42800,25001,78652,51100,4/9/1939,2,1,1,1,66.2,94.75
10,2,1,6/25/1999,4,1,496,57400,57410,25001,42800,3/28/1940,2,1,2,2,70.6,
11,1,1,1/9/1997,3,1,49121,25063,59654,5363,3371,10/14/1937,2,2,2,2,66.5,78.8
12,1,1,4/15/1998,1,1,4431,44020,3051,,,1/20/1939,2,1,1,1,64.5,70.35
13,1,9,1/17/2000,4,1,49121,3051,4612,,,10/20/1940,2,1,1,1,62.3,79.1
14,1,9,2/21/2000,5,1,49121,2518,41401,,,11/23/1940,2,1,1,2,63.4,89.3
15,2,1,4/22/1998,6,1,496,42731,0549,7810,311,1/22/1939,2,1,1,2,64.2,80.45
16,2,2,3/3/2000,3,1,4280,4019,2500,,,12/6/1940,2,1,1,1,64.8,84.05
17,2,2,12/2/2000,3,1,45981,25042,58381,25052,36201,9/6/1941,2,1,2,2,60.4,75.25
I would NOT trust PROC IMPORT to read that data correctly if you have more fields with values like xxxxx or dates in the year 1111 (those will likely be set to missing). Note that values such as V1581 or V4581 could well end up missing if PROC IMPORT decides that column should be numeric.
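If you would rather control the types yourself, a data step along these lines is one option. This is only a sketch: the path is a placeholder, the variable list is taken from the header line of the posted text, and the length of 5 for the DX fields is my assumption based on the values shown. Reading DX1-DX5 as character keeps codes with leading zeros (such as 0549) or letters intact.

```sas
/* Sketch only: path is a placeholder; lengths and types are assumptions. */
data want;
  infile 'c:\temp\mydata.csv' dlm=',' dsd truncover firstobs=2;
  length dx1-dx5 $ 5;                /* diagnosis codes stay character */
  input id sex race
        doa :mmddyy10.               /* date of admission */
        los status
        dx1-dx5
        dob :mmddyy10.               /* date of birth */
        hxcvd hxdm hxhtn hxcopd
        height weight;
  format doa dob date9.;
run;

proc print data=want;
run;
```

Expect the log to complain about the placeholder row (ID 1), since dates like 11/11/1111 may not read as valid SAS dates. Check the log after the first run.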
Looking at the .csv you've posted, I believe that rather than skipping the first line during import (which is what setting the option firstobs=2 does), you'd want to instruct SAS to use the first line to name your variables.
Inspect the import wizard. There is an option where you can choose that the first line should be used to name your variables and that data starts on line 2. I believe you can find these options under Advanced.