Selective importing from flat file to SQL table

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
I have a very large flat file that needs to be imported to a SQL table nightly (2.5M+ rows).

Much of this data I don't need.

I'd like to selectively limit which rows are actually imported based on the value of the "TRANS_DATE" column. Basically I don't care about anything prior to 2005... so something like "...WHERE TRANS_DATE >= '2005-01-01'" would be my normal query, but doing this via flat file import in SSIS, I'm not sure how to apply that limitation.

The idea here is both to speed up the import (even if only a little) and to speed up queries on the table (this will cut the table down to nearly half its size).


Thanks in advance.
 

Markbnj

Elite Member / Moderator Emeritus
Moderator
Sep 16, 2005
15,682
13
81
www.markbetz.net
You can use the import data wizard in SQL Server Management Studio to set up the import from the flat file, and then somewhere in the advanced tab you'll see an option to edit the query. You can open the query and add a where clause that will do what you want.
 

brandonbull

Diamond Member
May 3, 2005
6,330
1,203
126
I would import all the data into a staging table and then use SQL to move only the needed data over to your table. Truncate the staging table before each load.
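A minimal sketch of that two-step load, assuming a staging table `dbo.Staging_Trans` that mirrors the flat file layout and a final table `dbo.Trans` (all names and columns here are hypothetical):

```sql
-- Names and columns are hypothetical; adjust to your actual layout.
TRUNCATE TABLE dbo.Staging_Trans;

-- Load the entire flat file into the unindexed staging table first
-- (via SSIS, BULK INSERT, or the import wizard).

-- Then move only the rows you actually want into the real table.
INSERT INTO dbo.Trans (TRANS_ID, TRANS_DATE, AMOUNT)
SELECT TRANS_ID, TRANS_DATE, AMOUNT
FROM dbo.Staging_Trans
WHERE TRANS_DATE >= '20050101';
```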
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
I would import all the data into a staging table and then use SQL to move only the needed data over to your table. Truncate the staging table before each load.

This was actually my initial train of thought as well. Dump the entire flat file into a temp table, then via a simple query move from temp to real table.

My only hesitation with it was that it would actually increase the import time somewhat, which was part of (though admittedly not the main) reason I wanted to limit the import in the first place.
 

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
Trying to come up with a scheme to avoid the staging table could eat up much more time if this is a one-time import.

With a staging table you also have a chance to sanity-check the import before adding it to your existing table, though that doesn't matter too much since you have a good backup scheme and will do a fresh backup right before the import. Right?
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
I guess I should have pointed out this is updated EVERY NIGHT.
I guess I shouldn't use the word "updated" either as nightly the entire TABLE is truncated then "rebuilt" from the flat file.

However, since there are a couple of indexes on the final table, if I used the temp table (no indexes) and imported the entire flat file there, then moved from temp to final table with the limitation on what gets imported, I'd think the overall time would be reduced (since it would only be indexing what will amount to about half the number of rows once I filter the import).

Hmmm, decisions. That's what's so awesome about SQL (I am still learning it)... 100 ways to skin the same cat.
 

WannaFly

Platinum Member
Jan 14, 2003
2,811
1
0
If you are using SQL 2008 you can look into using MERGE. Otherwise, I think you'd also be best off using a staging table.

UNLESS you know for a fact the first 1M rows are all pre-2005; I believe on import you can tell it to skip a certain # of rows.
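If MERGE is available (SQL Server 2008+), a sketch of it keyed on a transaction ID; all table and column names here are made up:

```sql
-- Hypothetical names; assumes TRANS_ID uniquely identifies a row.
MERGE dbo.Trans AS tgt
USING (
    SELECT TRANS_ID, TRANS_DATE, AMOUNT
    FROM dbo.Staging_Trans
    WHERE TRANS_DATE >= '20050101'   -- drop pre-2005 rows up front
) AS src
ON tgt.TRANS_ID = src.TRANS_ID
WHEN MATCHED THEN
    UPDATE SET tgt.TRANS_DATE = src.TRANS_DATE,
               tgt.AMOUNT     = src.AMOUNT
WHEN NOT MATCHED BY TARGET THEN
    INSERT (TRANS_ID, TRANS_DATE, AMOUNT)
    VALUES (src.TRANS_ID, src.TRANS_DATE, src.AMOUNT);
```

Since the whole table is truncated and rebuilt nightly anyway, a plain INSERT...SELECT from staging would do the same job with less ceremony; MERGE mostly pays off if the truncate step goes away.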
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
You can define "skip X rows" but I just won't know how many rows to skip, and technically speaking it will be dynamic too. I think, when it comes down to it, I will be doing something like "...WHERE DATEDIFF(YEAR, GETDATE(), TABLE.DATE) >= -5" to capture only the last 5 years' worth of data.

I will look into the Merge function.
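A sketch of that rolling five-year window, computing the cutoff once so the comparison stays a simple range on the date column (staging table and column names are hypothetical):

```sql
-- Keep everything from five years ago forward; names are hypothetical.
DECLARE @cutoff date = DATEADD(YEAR, -5, GETDATE());

INSERT INTO dbo.Trans (TRANS_ID, TRANS_DATE, AMOUNT)
SELECT TRANS_ID, TRANS_DATE, AMOUNT
FROM dbo.Staging_Trans
WHERE TRANS_DATE >= @cutoff;
```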
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,284
3,905
75
If you can find that particular column with a regex, you could use either grep (in Linux) or findstr (in Windows) to find rows that do not match "^(some_data)200[01234](optional data to help specify row)" or something. Or maybe search for "200[5-9]|201[0-9]" anywhere in the string.
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
You can most definitely put actual SQL into the import and limit what gets imported, as Mark said.

WHERE YEAR(Table.Date) >= 2005

http://msdn.microsoft.com/en-us/library/ms186313.aspx

yeah but I guess I'm missing how I would do that in an SSIS package itself. I know how I can do it if I am manually importing the flat file via SSMS or something. Maybe I can do it there, copy the TSQL statement over to SSIS or something...
 

Markbnj

Elite Member / Moderator Emeritus
Moderator
Sep 16, 2005
15,682
13
81
www.markbetz.net
yeah but I guess I'm missing how I would do that in an SSIS package itself. I know how I can do it if I am manually importing the flat file via SSMS or something. Maybe I can do it there, copy the TSQL statement over to SSIS or something...

Yeah I don't work with repeatable SSIS packages much, but I know that at the end of the manual import wizard you have the option to save an SSIS package, so maybe that's the answer.
 

Homerboy

Lifer
Mar 1, 2000
30,856
4,974
126
Yeah I don't work with repeatable SSIS packages much, but I know that at the end of the manual import wizard you have the option to save an SSIS package, so maybe that's the answer.

Yeah. Within the SSIS package there is an option to provide a specific SQL statement. I think if I bastardize all this together, it may work... on some level.

 

LokutusofBorg

Golden Member
Mar 20, 2001
1,065
0
76
I haven't used SSIS/DTS heavily for years, but you can put SQL in there, most definitely. It lets you configure the SQL to pull from the source, and then you can make customizations to the transforms per column... you can do just about anything you need to do in there.
 

wrinkefreeon3

Junior Member
Nov 22, 2011
18
0
0
I would recommend writing a stored procedure to do the bulk insert from the CSV file. In the SP you would create the temp table to write the records into, query only the needed records into your final table, then dump all the staging data, etc. You could also rebuild any indexes you need in this SP. Here is a good article on using BULK INSERT for this task: http://blog.sqlauthority.com/2008/0...rt-load-comma-delimited-file-into-sql-server/

You would then schedule this job in SQL Server Agent. You will need admin rights on the SQL Server in order to do the bulk insert and to schedule the job.
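A rough sketch of such a procedure; the file path, table, and column names are placeholders, not anything from the thread:

```sql
-- Hypothetical nightly load procedure; adjust names, path, and format.
CREATE PROCEDURE dbo.usp_NightlyTransLoad
AS
BEGIN
    SET NOCOUNT ON;

    TRUNCATE TABLE dbo.Staging_Trans;

    -- Pull the whole flat file into the unindexed staging table.
    BULK INSERT dbo.Staging_Trans
    FROM 'D:\feeds\trans.csv'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

    -- Rebuild the final table from only the rows we care about.
    TRUNCATE TABLE dbo.Trans;

    INSERT INTO dbo.Trans (TRANS_ID, TRANS_DATE, AMOUNT)
    SELECT TRANS_ID, TRANS_DATE, AMOUNT
    FROM dbo.Staging_Trans
    WHERE TRANS_DATE >= DATEADD(YEAR, -5, GETDATE());

    -- Refresh the final table's indexes after the reload.
    ALTER INDEX ALL ON dbo.Trans REBUILD;

    TRUNCATE TABLE dbo.Staging_Trans;
END
```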
 

KentState

Diamond Member
Oct 19, 2001
8,397
393
126
In SSIS, you could use a Conditional Split and have the data you want go into the SQL table and the rest into space.
 