The truly terrible way to do it is to execute each INSERT
statement as its own batch:
Batch 1:
INSERT INTO Entries (id, name) VALUES (1, 'Ian Boyd);
Batch 2:
INSERT INTO Entries (id, name) VALUES (2, 'Bottlenecked);
Batch 3:
INSERT INTO Entries (id, name) VALUES (3, 'Marek Grzenkowicz);
Batch 4:
INSERT INTO Entries (id, name) VALUES (4, 'Giorgi);
Batch 5:
INSERT INTO Entries (id, name) VALUES (5, 'AMissico);
Note: Parameterization, error checking, and any other nit-picks elided for expoistory purposes.
This is truly, horrible, terrible way to do things. It gives truely awful performance, because you suffer the network round-trip-time every time.
A much better solution is to batch all the INSERT
statements into one batch:
Batch 1:
INSERT INTO Entries (id, name) VALUES (1, 'Ian Boyd');INSERT INTO Entries (id, name) VALUES (2, 'Bottlenecked');INSERT INTO Entries (id, name) VALUES (3, 'Marek Grzenkowicz');INSERT INTO Entries (id, name) VALUES (4, 'Giorgi');INSERT INTO Entries (id, name) VALUES (5, 'AMissico');
This way you only suffer one-round trip. This version has huge performance wins; on the order of 5x faster.
Even better is to use the VALUES
clause:
INSERT INTO Entries (id, name)VALUES (1, 'Ian Boyd'),(2, 'Bottlenecked'),(3, 'Marek Grzenkowicz'),(4, 'Giorgi'),(5, 'AMissico');
This gives you some performance improvements over the 5 separate INSERT
s version; it lets the server do what it's good at: operating on sets:
- each trigger only has to operate once
- foreign keys are only checked once
- unique constraints are only checked once
SQL Sever loves to operate on sets of data; it's where it's a viking!
Parameter limit
The above T-SQL examples have all the parameteriztion stuff removed for clarity. But in reality you want to parameterize queries
- Not to avoid SQL injection; because you're already a good developer who's using
QuotedString(firstName)
- Not so much for the performance bonus of saving the server from having to compile each T-SQL batch (Although, during a high-speed bulk-import, saving the parsing time really adds up)
- but to avoid flooding the server's query plan cache with gibibytes upon gibibytes of ad-hoc query plans. (I've seen SQL Server's working set, i.e. RAM usage, not memory usage, be 2 GB of just unparameterized SQL query plans)
But Bruno has an important point; SQL Server's driver only lets you include 2,100 parameters in a batch. The above query has two values:
@id, @name
If you import 1,051 rows in a single batch, that's 2,102 parameters - you'll get the error:
Too many parameters were provided in this RPC request
That is why i generally insert 5 or 10 rows at a time. Adding more rows per batch doesn't improve performance; there's diminishing returns.
It keeps the number of parameters low, so it doesn't get anywhere near the T-SQL batch size limit. There's also the fact that a VALUES
clause is limited to 1000 tuples anyway.
Implementing it
Your first approach is good, but you do have the issues of:
- parameter name collisions
- unbounded number of rows (possibly hitting the 2100 parameter limit)
So the goal is to generate a string such as:
INSERT INTO Entries (id, name) VALUES(@p1, @p2),(@p3, @p4),(@p5, @p6),(@p7, @p8),(@p9, @p10)
I'll change your code by the seat of my pants
IEnumerable<Entry> entries = GetStuffToInsert();SqlCommand cmd = new SqlCommand();StringBuilder sql = new StringBuilder();Int32 batchSize = 0; //how many rows we have build up so farInt32 p = 1; //the current paramter name (i.e. "@p1") we're going to useforeach(var entry in entries){ //Build the names of the parameters String pId = String.Format("@p{0}", p); //the "Id" parameter name (i.e. "p1") String pName = String.Format("@p{0}", p+1); //the "Name" parameter name (i.e. "p2") p += 2; //Build a single "(p1, p2)" row String row = String.Format("({0}, {1})", pId, pName); //a single values tuple //Add the row to our running SQL batch if (batchSize > 0) sb.AppendLine(","); sb.Append(row); batchSize += 1; //Add the parameter values for this row cmd.Parameters.Add(pID, System.Data.SqlDbType.Int ).Value = entry.Id; cmd.Parameters.Add(pName, System.Data.SqlDbType.String).Value = entry.Name; if (batchSize >= 5) { String sql = "INSERT INTO Entries (id, name) VALUES"+"\r\n"+ sb.ToString(); cmd.CommandText = sql; cmd.ExecuteNonQuery(); cmd.Parameters.Clear(); sb.Clear(); batchSize = 0; p = 1; }}//handle the last few stragglersif (batchSize > 0){ String sql = "INSERT INTO Entries (id, name) VALUES"+"\r\n"+ sb.ToString(); cmd.CommandText = sql; cmd.ExecuteNonQuery();}