Quantcast
Viewing all articles
Browse latest Browse all 12

Answer by Ian Boyd for How should I multiple insert multiple records?

The truly terrible way to do it is to execute each INSERT statement as its own batch:

Batch 1:

INSERT INTO Entries (id, name) VALUES (1, 'Ian Boyd);

Batch 2:

INSERT INTO Entries (id, name) VALUES (2, 'Bottlenecked);

Batch 3:

INSERT INTO Entries (id, name) VALUES (3, 'Marek Grzenkowicz);

Batch 4:

INSERT INTO Entries (id, name) VALUES (4, 'Giorgi);

Batch 5:

INSERT INTO Entries (id, name) VALUES (5, 'AMissico);

Note: Parameterization, error checking, and any other nit-picks elided for expoistory purposes.

This is truly, horrible, terrible way to do things. It gives truely awful performance, because you suffer the network round-trip-time every time.

A much better solution is to batch all the INSERT statements into one batch:

Batch 1:

INSERT INTO Entries (id, name) VALUES (1, 'Ian Boyd');INSERT INTO Entries (id, name) VALUES (2, 'Bottlenecked');INSERT INTO Entries (id, name) VALUES (3, 'Marek Grzenkowicz');INSERT INTO Entries (id, name) VALUES (4, 'Giorgi');INSERT INTO Entries (id, name) VALUES (5, 'AMissico');

This way you only suffer one-round trip. This version has huge performance wins; on the order of 5x faster.

Even better is to use the VALUES clause:

INSERT INTO Entries (id, name)VALUES (1, 'Ian Boyd'),(2, 'Bottlenecked'),(3, 'Marek Grzenkowicz'),(4, 'Giorgi'),(5, 'AMissico');

This gives you some performance improvements over the 5 separate INSERTs version; it lets the server do what it's good at: operating on sets:

  • each trigger only has to operate once
  • foreign keys are only checked once
  • unique constraints are only checked once

SQL Sever loves to operate on sets of data; it's where it's a viking!

Parameter limit

The above T-SQL examples have all the parameteriztion stuff removed for clarity. But in reality you want to parameterize queries

  • Not to avoid SQL injection; because you're already a good developer who's using QuotedString(firstName)
  • Not so much for the performance bonus of saving the server from having to compile each T-SQL batch (Although, during a high-speed bulk-import, saving the parsing time really adds up)
  • but to avoid flooding the server's query plan cache with gibibytes upon gibibytes of ad-hoc query plans. (I've seen SQL Server's working set, i.e. RAM usage, not memory usage, be 2 GB of just unparameterized SQL query plans)

But Bruno has an important point; SQL Server's driver only lets you include 2,100 parameters in a batch. The above query has two values:

@id, @name

If you import 1,051 rows in a single batch, that's 2,102 parameters - you'll get the error:

Too many parameters were provided in this RPC request

That is why i generally insert 5 or 10 rows at a time. Adding more rows per batch doesn't improve performance; there's diminishing returns.

It keeps the number of parameters low, so it doesn't get anywhere near the T-SQL batch size limit. There's also the fact that a VALUES clause is limited to 1000 tuples anyway.

Implementing it

Your first approach is good, but you do have the issues of:

  • parameter name collisions
  • unbounded number of rows (possibly hitting the 2100 parameter limit)

So the goal is to generate a string such as:

INSERT INTO Entries (id, name) VALUES(@p1, @p2),(@p3, @p4),(@p5, @p6),(@p7, @p8),(@p9, @p10)

I'll change your code by the seat of my pants

IEnumerable<Entry> entries = GetStuffToInsert();SqlCommand cmd = new SqlCommand();StringBuilder sql = new StringBuilder();Int32 batchSize = 0; //how many rows we have build up so farInt32 p = 1; //the current paramter name (i.e. "@p1") we're going to useforeach(var entry in entries){   //Build the names of the parameters   String pId =   String.Format("@p{0}", p);   //the "Id" parameter name (i.e. "p1")   String pName = String.Format("@p{0}", p+1); //the "Name" parameter name (i.e. "p2")   p += 2;   //Build a single "(p1, p2)" row   String row = String.Format("({0}, {1})", pId, pName); //a single values tuple   //Add the row to our running SQL batch   if (batchSize > 0)      sb.AppendLine(",");   sb.Append(row);   batchSize += 1;   //Add the parameter values for this row   cmd.Parameters.Add(pID,   System.Data.SqlDbType.Int   ).Value = entry.Id;   cmd.Parameters.Add(pName, System.Data.SqlDbType.String).Value = entry.Name;   if (batchSize >= 5)   {       String sql = "INSERT INTO Entries (id, name) VALUES"+"\r\n"+                    sb.ToString();       cmd.CommandText = sql;       cmd.ExecuteNonQuery();       cmd.Parameters.Clear();       sb.Clear();       batchSize = 0;       p = 1;   }}//handle the last few stragglersif (batchSize > 0){    String sql = "INSERT INTO Entries (id, name) VALUES"+"\r\n"+                 sb.ToString();    cmd.CommandText = sql;    cmd.ExecuteNonQuery();}

Viewing all articles
Browse latest Browse all 12

Trending Articles