sachin05t
Calcite | Level 5

Imported a .csv file but cannot delete the first observation

IF ID<2 THEN DELETE;
PROC PRINT;
RUN;
12 REPLIES
VDD
Ammonite | Level 13

Try this:

If _N_ = 1 then delete;

 

sachin05t
Calcite | Level 5

That didn't work. Sorry.

sachin05t
Calcite | Level 5

I got this error:

 

ERROR 180-322: Statement is not valid or it is used out of proper order

novinosrin
Tourmaline | Level 20

If you are reading the file with INFILE and INPUT, use the option firstobs=2 in the INFILE statement.

Reeza
Super User

Yeah, you cannot just place code anywhere. The IF statement needs to be within a data step to remove the records.

 

data want;
set sashelp.class;
if _n_ < 3 then delete;
run;

You also really shouldn't use a bare PROC PRINT. You need a DATA= option to indicate which data set to print; otherwise SAS prints the most recently created data set, which can lead to unexpected behaviour. I highly, highly recommend against coding in that style.

 

Assuming you have a data set already, you're looking for:

 

data want;
set have;

if ID < 2 then delete;
run;

proc print data=want;run;


sachin05t
Calcite | Level 5

Thanks Reeza. Our instructor is not the best at explaining the concepts, which is why we are struggling.

 

We were provided with a csv file that has name '1A'. I imported it from desktop using import wizard. How do I start using that file?

I mean, should I code:

Data 1A;

if _n_ < 2 then delete;

 

?

Reeza
Super User
*creates an output data set called WANT;
DATA want;

*use the data set have as your data source;
SET have;

*delete if value of ID is less than 2;
IF id < 2 THEN DELETE;


RUN;

*print results to view;
PROC PRINT DATA=want;RUN;

 

I added comments to my code above that explain each line. The items in ALL CAPS are essentially 'control words'; they are commands to SAS. The ones in lower case are the ones you should be changing; they're the inputs and calculations.

 

I generally do not recommend using the wizard to import files either: you can't script it, and it is hard to remember that you have to redo that step by hand. Look at PROC IMPORT for now to import your data.
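
A minimal PROC IMPORT sketch for a comma-delimited file might look like the following. The file path and the output name `have` are placeholders, not taken from the original post. Note that `1A` is not a valid SAS data set name under the default naming rules because it starts with a digit, so choose a name that begins with a letter or an underscore. GUESSINGROWS=MAX assumes a recent SAS 9.4 release; on older releases a number may be needed instead.

```sas
* Assumed path and data set name - change both to match your setup;
PROC IMPORT DATAFILE='c:\temp\mydata.csv'
            OUT=work.have
            DBMS=CSV
            REPLACE;
    GETNAMES=YES;      * use the first line of the file as variable names;
    DATAROW=2;         * data values start on line 2;
    GUESSINGROWS=MAX;  * scan all rows before deciding each column type;
RUN;
```

Because this is code rather than a wizard click, it can be saved in the program and rerun whenever the file changes.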

 

There are tons of short getting-started tutorials at video.sas.com > How to Tutorials > SAS Analytics U.



mkeintz
PROC Star

You want to read all the CSV data except for the first line. That can be done with the FIRSTOBS= option on the INFILE statement (you know, the statement that identifies your CSV data):

 

data want;
  infile 'c:\temp\mydata.csv' dlm=',' dsd firstobs=2;
  input id sex race doa :mmddyy10. .... status height weight;
  format doa date9. ;
run;

 

I suggest the "firstobs=2" approach because the same tool is available when processing a SAS data set, except that there FIRSTOBS= is specified as a data set option in parentheses after the data set name, as in:

 

proc print data=have (firstobs=2);
run;


or

data want;
  set have (firstobs=2);
run;
ballardw
Super User

One way to just display without actually removing from the data:

 

Proc print data=<your data set name goes here> (firstobs=2);

run;

 

Or, if you want to filter on the value of one or more variables, use a WHERE statement:

 

proc print;

   where id ge 2;

run;

That keeps the observations where the value of the VARIABLE id is 2 or more. If you do not have a variable named ID then it does not work and will generate errors.

IMPORTANT: Do not lie to computers. The file that was attached was in SYLK format according to EXCEL and has errors at that.

For plain text it is best to just paste it into a code box opened with the forum's {I} icon:

 

ID,Sex,Race,DOA,LOS,Status,DX1,DX2,DX3,DX4,DX5,DOB,HxCVD,HxDM,HxHTN,HxCOPD,Height,Weight
1,1,1,11/11/1111,1,1,xxxxx,xxxxx,xxxxx,xxxxx,xxxxx,11/11/1111,1,1,1,1,11.1,111.11
2,2,1,12/2/1996,2,1,4359,,,,,9/7/1937,2,1,2,2,61.5,79.2
3,2,1,12/24/1996,2,1,4359,496,25000,4019,,9/29/1937,1,1,1,2,65.7,92.25
4,1,1,12/31/1998,6,1,4280,42732,25001,41400,4293,10/2/1939,1,2,1,1,69.8,79.9
5,2,1,7/18/1996,3,1,496,42800,25001,27880,40190,4/22/1937,2,1,2,2,60.2,79.3
6,2,1,3/23/1997,6,1,49121,27620,42800,25001,78057,12/23/1937,1,1,1,1,67.6,91.6
7,2,1,10/24/1997,4,1,49120,42800,25001,41400,24490,7/28/1938,2,1,1,1,68.7,65.5
8,2,1,3/9/1998,6,1,49121,25001,42800,51830,40190,12/9/1938,1,1,2,2,64.1,64.9
9,2,1,7/9/1998,7,1,49121,42800,25001,78652,51100,4/9/1939,2,1,1,1,66.2,94.75
10,2,1,6/25/1999,4,1,496,57400,57410,25001,42800,3/28/1940,2,1,2,2,70.6,
11,1,1,1/9/1997,3,1,49121,25063,59654,5363,3371,10/14/1937,2,2,2,2,66.5,78.8
12,1,1,4/15/1998,1,1,4431,44020,3051,,,1/20/1939,2,1,1,1,64.5,70.35
13,1,9,1/17/2000,4,1,49121,3051,4612,,,10/20/1940,2,1,1,1,62.3,79.1
14,1,9,2/21/2000,5,1,49121,2518,41401,,,11/23/1940,2,1,1,2,63.4,89.3
15,2,1,4/22/1998,6,1,496,42731,0549,7810,311,1/22/1939,2,1,1,2,64.2,80.45
16,2,2,3/3/2000,3,1,4280,4019,2500,,,12/6/1940,2,1,1,1,64.8,84.05
17,2,2,12/2/2000,3,1,45981,25042,58381,25052,36201,9/6/1941,2,1,2,2,60.4,75.25

 

I would NOT trust PROC IMPORT to get that data correct if you have more fields with XXXXX or dates in the year 1111 (those will likely be missing). Note that values such as V1581 or V4581 could well be missing if PROC IMPORT determines that the column should be numeric.
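
One way to avoid the guessing is a DATA step that declares the types explicitly. This is only a sketch: the file path is a placeholder, and the variable names are taken from the header line posted above. DX1-DX5 are read as character so that codes containing letters or leading zeros (such as 0549) survive, and the dummy row with ID 1 is dropped after it is read.

```sas
data want;
  * placeholder path - point this at your own file;
  infile 'c:\temp\mydata.csv' dlm=',' dsd truncover firstobs=2;
  * declare types up front so nothing is guessed;
  informat DOA DOB mmddyy10. DX1-DX5 $5.;
  format DOA DOB date9.;
  input ID Sex Race DOA LOS Status DX1-DX5 DOB
        HxCVD HxDM HxHTN HxCOPD Height Weight;
  * drop the dummy row - its dates in year 1111 are outside the SAS date range;
  if ID < 2 then delete;
run;
```

The log will likely show invalid-data notes for the dummy row's dates, which is harmless here because that row is deleted. Because the INFORMAT statement comes first, the date and DX variables appear ahead of ID in the resulting data set.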

Soham0707
Obsidian | Level 7
If _N_=1 then delete; it works.
Thanks
VDD
Ammonite | Level 13

Please select and mark the solution to your request.

 

Patrick
Opal | Level 21

@Soham0707 

Looking at the .csv you've posted, I believe that rather than skipping the first line during import (which is done by setting the option firstobs=2), you'd want to instruct SAS to use the first line to name your variables.

Inspect the import wizard. There is an option where you can choose that the first line should be used to name your variables and that data starts on line 2. I believe you can find these options under Advanced.
