Best sql-server questions in January 2011

SQL Server silently truncates varchar's in stored procedures

10 votes

According to this forum discussion, SQL Server (I'm using 2005 but I gather this also applies to 2000 and 2008) silently truncates any varchars you specify as stored procedure parameters to the length of the varchar, even if inserting that string directly using an INSERT would actually cause an error. eg. If I create this table:

CREATE TABLE testTable(
    [testStringField] [nvarchar](5) NOT NULL
)

then when I execute the following:

INSERT INTO testTable(testStringField) VALUES(N'string which is too long')

I get an error:

String or binary data would be truncated.
The statement has been terminated.

Great. Data integrity preserved, and the caller knows about it. Now let's define a stored procedure to insert that:

CREATE PROCEDURE spTestTableInsert
    @testStringField [nvarchar](5)
AS
    INSERT INTO testTable(testStringField) VALUES(@testStringField)
GO

and execute it:

EXEC spTestTableInsert @testStringField = N'string which is too long'

No errors, 1 row affected. A row is inserted into the table, with testStringField as 'strin'. SQL Server silently truncated the stored procedure's varchar parameter.

Now, this behaviour might be convenient at times but I gather there is NO WAY to turn it off. This is extremely annoying, as I want the thing to error if I pass too long a string to the stored procedure. There seem to be 2 ways to deal with this.

First, declare the stored proc's @testStringField parameter as size 6, and check whether its length is over 5. This seems like a bit of a hack and involves irritating amounts of boilerplate code.

Second, just declare ALL stored procedure varchar parameters to be varchar(max), and then let the INSERT statement within the stored procedure fail.

The latter seems to work fine, so my question is: is it a good idea to use varchar(max) ALWAYS for strings in SQL Server stored procedures, if I actually want the stored proc to fail when too long a string is passed? Could it even be best practice? The silent truncation that can't be disabled seems stupid to me.

It just is.

I've never noticed a problem though because one of my checks would be to ensure my parameters match my table column lengths. In the client code too. Personally, I'd expect SQL to never see data that is too long. If I did see truncated data, it'd be bleeding obvious what caused it.

If you do feel the need for varchar(max) beware a massive performance issue because of datatype precedence. varchar(max) has higher precedence than varchar(n) (longest is highest). So in this type of query you'll get a scan not a seek and every varchar(100) value is CAST to varchar(max)

UPDATE ...WHERE varchar100column = @varcharmaxvalue

Edit:

One could always raise a Connect issue for MS. At least they may explain this behaviour.

And it's probably worthy of inclusion in Erland Sommarkog's Strict settings

Edit 2, after Martins comment

DECLARE @sql VARCHAR(MAX), @nsql nVARCHAR(MAX)
SELECT @sql = 'B', @nsql = 'B'; 
select LEN(@sql), LEN(@nsql), DATALENGTH(@sql), DATALENGTH(@nsql)

declare @t table(c varchar(8000)) insert into @t values (replicate('A', 7500))

select LEN(c) from @t
select LEN(@sql + c), LEN(@nsql + c), DATALENGTH(@sql + c), DATALENGTH(@nsql + c) from @t

How do I average the difference between specific values in TSQL?

9 votes

Hey folks, sorry this is a bit of a longer question...

I have a table with the following columns:

[ChatID] [User] [LogID] [CreatedOn] [Text]

What I need to find is the average response time for a given user id, to another specific user id. So, if my data looks like:

[1] [john] [20] [1/1/11 3:00:00] [Hello]
[1] [john] [21] [1/1/11 3:00:23] [Anyone there?]
[1] [susan] [22] [1/1/11 3:00:43] [Hello!]
[1] [susan] [23] [1/1/11 3:00:53] [What's up?]
[1] [john] [24] [1/1/11 3:01:02] [Not much]
[1] [susan] [25] [1/1/11 3:01:08] [Cool]

...then I need to see that Susan has an average response time of (20 + 6) / 2 => 13 seconds to John, and John has an average of (9 / 1) => 9 seconds to Susan.

I'm not even sure this can be done in set-based logic, but if anyone has any ideas, they'd be much appreciated!

I don't have a PC to verify syntax or anything, but I think this should give you a starting place:

WITH ChatWithRownum AS (
    SELECT ChatID, User, LogID, CreatedOn, ROW_NUMBER() OVER(ORDER BY ChatID, CreatedOn) AS rownum
    FROM ChatLog
)
SELECT First.ChatID, Second.User,
    AVG(DATEDIFF(seconds, First.CreatedOn, Second.CreatedOn)) AS AvgElapsedTime
FROM ChatWithRownum First
    JOIN ChatWithRownum Second ON First.ChatID = Second.ChatID
        AND First.rownum = Second.rownum - 1
WHERE First.User != Second.User
GROUP BY First.ChatID, Second.User

Essentially, the idea is to add row numbers to the data so you can join one row to the next row (so you have a statement followed by its immediate response). Once you have the rows joined, you can get the time elapsed between the two entries, and then group the data by the ChatID (I'm assuming that times between separate chats aren't relevant) and the two users. Like I said though, this is just a starting place as I'm sure there may be some additional criteria and/or bugs in my query :)

SQL Server list of insert identities

9 votes

I have a table with an autoincrement id that I am doing a

INSERT INTO ( ... ) SELECT ... FROM ...

Is there a way for me to get the list of id's that have been inserted?

I was thinking I could get the max id before the insert then after and assuming everything in between is new, but then if a row gets inserted from somewhere else I could run into problems. Is there a proper way to do this?

I am using SQL Server 2005

Use the output clause.

DECLARE @InsertedIDs table(ID int);

INSERT INTO YourTable
    OUTPUT INSERTED.ID
        INTO @InsertedIDs 
    SELECT ...

Sql Server STUFF - is it supposed to work with position > 8000?

8 votes

Compare these two statements

select stuff(convert(varchar(max),replicate('a', 10000)),8001,1,'b')
select stuff(convert(varchar(max),replicate('a', 10000)),8000,1,'b')

Output

aaaaaaaaaaaaaaaaaaaaaaaa...
NULL

Books Online says start can be of type bigint. Why such a large range if it won't even work for 8001?

If the behaviour is different between 2005, 2008, 2008 R2 and Denali, then I would like to know the actual behaviour of each version.

REPLICATE ('a', 10000) will produce a 8000 characters string:

If string_expression is not of type varchar(max) or nvarchar(max), REPLICATE truncates the return value at 8,000 bytes. To return values greater than 8,000 bytes, string_expression must be explicitly cast to the appropriate large-value data type.

Try REPLICATE (cast('a' as varchar(max)), 10000).

To NOLOCK or NOT to NOLOCK, that is the question

7 votes

Hi all,

This is really more of a discussion than a specific question about nolock.

I took over an app recently that almost every query (and there are lots of them) has the nolock option on them. Now I am pretty new to SQL server (used Oracle for 10 years) but yet I find this pretty disturbing. So this weekend I was talking with one of my friends who runs a rather large ecommerce site (name will be withheld to protect the guilty) and he says he has to do this with all of his SQL servers cause he will always end in deadlocks.

Is this just a huge short fall with SQL server? Is this just a failure in the DB design (mine is not 3rd level, but its close) Is anybody out there running an SQL server app without nolocks? These are issues that Oracle handles better with more grandulare recordlocks.

Is SQL server just not able to handle big loads? Is there some better workaround than reading uncommited data? I would love to hear what people think.

Thanks

SQL Server has added snapshot isolation in SQL Server 2005, this will enable you to still read the latest correct value without having to wait for locks. StackOverflow is also using Snapshot Isolation. The Snapshot Isolation level is more or less the same that Oracle uses, this is why deadlocks are not very common on an Oracle box. Just be aware to have plenty of tempdb space if you do enable it

from Books On Line

When the READ_COMMITTED_SNAPSHOT database option is set ON, read committed isolation uses row versioning to provide statement-level read consistency. Read operations require only SCH-S table level locks and no page or row locks. When the READ_COMMITTED_SNAPSHOT database option is set OFF, which is the default setting, read committed isolation behaves as it did in earlier versions of SQL Server. Both implementations meet the ANSI definition of read committed isolation.

"If not exists" fails on SQL CE

6 votes

Hi

I've got an unexpected problem. I'm making a script to update the schema on a SQL CE database. This won't run:

if not exists
(
    Select column_name from information_schema.columns
    where column_name = 'TempTestField' 
        and table_name = 'Inventory_Master_File' 
)
    Alter table Inventory_Master_File
      add TempTestField nvarchar(10) null   

I think this is because I'm using stuff that isn't supported in SQL CE. Anyone know how to do this? I've tried rearranging the script and can't get anything to work. I tried "alter table ... where not exists ...".

Note that the "select" part runs fine and also the "alter" part. The problem is "if not exists". I know there are some other postings regarding problems like this using SQL CE but I couldn't find an answer to this particular problem.

Cheers
Mark

UPDATE:

I've spent over an hour looking for a solution. I've found many postings asking for help with similar problems but I've still got no idea how to fix it. I really don't want to do this in C# code. I need to do this in a SQL script. I can't believe something this basic is causing so much difficulty :(

It appears that SQL CE does not support any procedural extensions at all; only DDL and DML like from the 1970s.

You will either have to put procedural logic in your application (C# or whatever) or go to an embeddable database with procedural extensions, like MySQL.

How do I get 5 latest comments (SQL query for SQL Server ) for each user?

6 votes

I have a table that looks like this: comment_id, user_id, comment, last_updated.

Comment_id is a key here. Each user may have multiple comments.

How do I get 5 latest comments (SQL query for SQL Server ) for each user?

Output should be similar to the original table, just limit user's comments to 5 most recent for every user.

Assuming at least SQL Server 2005 so you can use the window function (row_number) and the CTE:

;with cteRowNumber as (
    select comment_id, user_id, comment, last_updated, ROW_NUMBER() over (partition by user_id order by last_updated desc) as RowNum
        from comments
)
select comment_id, user_id, comment, last_updated
    from cteRowNumber
    where RowNum <= 5
    order by user_id, last_updated desc

How to compare SQL timestamp in .NET?

6 votes

I have mapped Entity framework entities. Each table in SQL Server 2008 contains Timestamp column which is mapped as byte array. The length of array is always 8.

Now I need to compare timestamp values in .NET. I come with two solutions but I don't know which one is better?

  • Compare it as arrays. When first pair of bytes is different return false.
  • Convert byte array to long, compare longs.

Which solution is better? Or is there any other solution?

We do it by comparing them as byte arrays. Works fine for us.

SQL query to find all tables in a database that have a column with a specific name

6 votes

What query can I run on a database that will tell me which tables in that database have a column named "RCPTNMBR"?

Most databases support this:

select table_name from information_schema.columns where column_name = 'x'

Handling BLOBs in Entity Framework 4.0 in a stream-fashion

6 votes

Is it possible to handle (read and write) binary data to SQL Server using Entity Framework 4.0 using streams? (i.e.: not the whole content shipped in a byte array)

An example could be taken from Download and Upload images from SQL Server via ASP.Net MVC which illustrates the way to stream nicely data from SQL Server in an example available for ASP.NET MVC. However it requires direct access to the DB and I am curious whether this could be done using an ORM.

Unfortunately, this is not supported in Entity Framework 4. You'll need to work directly with the database.

I can't see it happening anytime soon either, since an ORM that didn't materialise objects fully in memory would have a lot of complications.

James

Entity Framework CTP5 - How to Call Stored Procedure?

6 votes

Hi Guys,

This may be a simple answer, but i can't see how to execute a stored procedure with EF CTP5.

In Entity Framework 4.0, we did this:

ExecuteFunction("ContainerName.StoredProcName", new ObjectParameter("Id", id)).

Which is a method on the ObjectContext.

But DbContext has no such method.

How do we call a stored proc? Is it not supported in EF CTP5?

EDIT:

I found this thread, which states you need to do this:

  var people = context.People.SqlQuery("EXECUTE [dbo].[GetAllPeople]");

This raises some concerns:

1) You are now calling a stored prodedure on the set, not the context. Stored procedures should be available context-wide, not tied to a particular entity set. Just like how they are under the "Database" in SQL Server, and not under the "Table".

2) What about complex types? I previously had a complex type being returned from a stored procedure. But now, it looks as though you have to map directly to an entity? That doesn't make any sense. I have many stored procs that return a type not directly represented by an ObjectSet/DBSet, which i can't see how i can pull over.

Hope someone can clear this up for me, because from what i understand so far, i won't be able to upgrade to CTP5.

You can execute database-wide sql statements like this

using(var context = new MyContext())
{
    // custum sql statement
    var c = context.Database.SqlQuery<int>("SELECT COUNT(*) FROM Employees");

    // returned entity type doesn't have to be represented by ObjectSet/DBSet
    var e = context.Database.SqlQuery<Employee>("SELECT * FROM Employees");

    // stored procedure
    var q = context.Database.SqlQuery<Employee>("GetEmployees");
}

Tools for visualising execution xml plans as HTML

6 votes

Are there any tools / XSLT style sheets around for transforming the XML execution plans returned by Microsoft Visual Studio into HTML?

Failing that, does anyone know of any techniques that can be used to display charts in HTML suitable for displaying execution plans?

I couldn't find one so I made one myself

http://code.google.com/p/html-query-plan/

It may need a little polishing, but its pretty much there and will certainly do as a basic query plan summary:

Screenshot

Its currently being used on the Stack Exchange Data Explorer

What is a good way to find gaps in a set of datespans?

5 votes

What is a way to find gaps in a set of date spans?

For example, I have these date spans:

1/ 1/11 - 1/10/11  
1/13/11 - 1/15/11  
1/20/11 - 1/30/11

Then I have a start and end date of 1/7/11 and 1/14/11.

I want to be able to tell that between 1/10/11 and 1/13/11 there is a gap so the start and end date is not possible. Or I want to return only the datespans up to the first gap encountered.

If this can be done in SQL server that would be good.

I was thinking to go through each date to find out if it lands in a datespan... if it does not then there's a gap on that day.

  • Jump to 2nd last code block for: *I want to be able to tell that between 1/10/11 and 1/13/11 there is a gap so the start and end date is* not possible.
  • Jump to last code block for: *I want to return only the datespans up to the first gap encountered.*

First of all, here's a virtual table to discuss

create table spans (date1 datetime, date2 datetime);
insert into spans select '20110101', '20110110';
insert into spans select '20110113', '20110115';
insert into spans select '20110120', '20110130';

This is a query that will list, individually, all the dates in the calendar

declare @startdate datetime, @enddate datetime
select @startdate = '20110107', @enddate = '20110114'
select distinct a.date1+v.number
from spans A
inner join master..spt_values v
  on v.type='P' and v.number between 0 and datediff(d, a.date1, a.date2)
-- we don't care about spans that don't intersect with our range
where A.date1 <= @enddate
  and @startdate <= A.date2

Armed with this query, we can now test to see if there are any gaps, by counting the days in the calendar against the expected number of days

declare @startdate datetime, @enddate datetime
select @startdate = '20110107', @enddate = '20110114'

select case when count(distinct a.date1+v.number)
    = datediff(d,@startdate, @enddate) + 1
    then 'No gaps' else 'Gap' end
from spans A
inner join master..spt_values v
  on v.type='P' and v.number between 0 and datediff(d, a.date1, a.date2)
-- we don't care about spans that don't intersect with our range
where A.date1 <= @enddate
  and @startdate <= A.date2
-- count only those dates within our range
   and a.date1 + v.number between @startdate and @enddate

Another way to do this is to just build the calendar from @start to @end up front and look to see if there is a span with this date

declare @startdate datetime, @enddate datetime
select @startdate = '20110107', @enddate = '20110114'
-- startdate+v.number is a day on the calendar
select @startdate + v.number
from master..spt_values v
where v.type='P' and v.number between 0
  and datediff(d, @startdate, @enddate)

-- run the part above this line alone to see the calendar
-- the condition checks for dates that are not in any span (gap)
  and not exists (
    select *
    from spans
    where @startdate + v.number between date1 and date2)

The query returns ALL dates that are gaps in the date range @start - @end A TOP 1 can be added to just see if there are gaps

To return all records that are before the gap, use the query as a derived table in a larger query

declare @startdate datetime, @enddate datetime
select @startdate = '20110107', @enddate = '20110114'
select *
from spans
where date1 <= @enddate and @startdate <= date2 -- overlaps
  and date2 < ( -- before the gap
    select top 1 @startdate + v.number
    from master..spt_values v
    where v.type='P' and v.number between 0
      and datediff(d, @startdate, @enddate)
      and not exists (
        select *
        from spans
        where @startdate + v.number between date1 and date2)
    order by 1 ASC
)